ARR as a highly available reverse proxy in Windows Azure

With the general availability of Windows Azure’s IaaS solution last year, we’ve seen a significant uptake in migration of legacy solutions to the Windows Azure platform. And with the even more recent announcement of our agreement with Oracle for them to support their products on Microsoft’s hypervisor technology, Hyper-V, we have a whole new category of apps we are being asked to help move to Windows Azure. One common pattern that’s been emerging is for the need for Linux/Apache/Java solutions to run in Azure at the same level of “density” that is available via traditional hosting providers. If you were an ISV (Independent Software Vendor) hosting solutions for individual customers, you may choose to accomplish this by giving each customer a unique URI and binding that to a specific Apache module, sometimes based on a unique IP address that is associated with a customer specific URL and a unique SSL certificate. This results in a scenario that requires multiple IP’s per server.

As you may have heard, the internet starting to run a bit short on IP addresses. So supporting multiple public IP’s per server is a difficult proposition for a cloud, as well as some traditional hosting providers. To that end we’ve seen new technologies emerge such as SNI (Server Name Indication) and use of more and more proxy and request routing solutions like HaProxy, FreeBSD, Microsoft’s Application Request Routing (ARR). This is also complicated by the need for delivery highly available, fault tolerant solutions that can load balancing client traffic. This isn’t a always an easy problem to solve, especially using just application centric approaches. They require intelligent, configurable proxies and/or load balancers. Precisely the kind of low level management the cloud is supposed to help us get away from.

But today, I’m here to share one solution I created for a customer that I think addresses some of this need. Using Microsoft’s ARR modules for IIS, hosted in Windows Azure’s IaaS service, as a reverse proxy for a high-density application hosting solution.

Disclaimer: This article assumes you are familiar with creating/provisioning virtual machines in Windows Azure and then remoting into them to further alter their configurations. Additionally, you will need a basic understanding of IIS and how to make changes to it via the IIS Manager console. I’m also aware of there being a myriad of ways to accomplish what we’re trying to do with this solution. This is simply one possible solution.

Overview of the Scenario and proposed solution

Here’s the outline of a potential customer’s scenario:

  • We have two or more virtual machines hosted in Windows Azure that are configured for high availability. Each of these virtual machines is identical, and hosts several web applications.
  • The web applications consist of two types:
    • Stateful web sites, accessed by end users via a web browser
    • Stateless APIs accessed by a “rich client” running natively on a mobile device
  • The “state” of the web sites is stored in an in-memory user session store that is specific to the machine on which the session was started. So all subsequent requests made during that session must be routed to the same server. This is referred to as ‘session affinity’ or ‘sticky sessions’.
  • All client requests will be over SSL (on port 443), to a unique URL specific to a given application/customer.
  • Each site/URL has its own unique SSL certificate
  • SSL Offloading (decryption of HTTPS traffic prior to its receipt by the web application) should be enabled to reduce the load on the web servers.

As you can guess based on the title of this article my intent is to solve this problem using Application Request Routing (aka ARR), a free plug-in for Windows Server IIS. ARR is an incredibly powerful utility that can be used to do many things, including acting as a reverse proxy to help route requests in a way that is completely transparent to the requestor. Combined with other features of IIS 8.0, it is able to meet the needs of the scenario we just outlined.

For my POC, I use four virtual machines within a single Windows Azure cloud service (a cloud service is simply a container that virtual machines can be placed into that provides a level of network isolation). On-premises we had the availability provided by the “titanium eggshell” that is robust hardware, but in the cloud we need to protect ourselves from potential outages by running multiple instances configured to help minimize downtimes. To be covered by Windows Azure’s 99.95% uptime SLA, I am required to run multiple virtual machine instances placed into an availability set. But since the Windows Azure Load Balancer doesn’t support sticky sessions, I need something in the mix to deliver this functionality.

The POC will consist of two layers, the ARR based Reverse Proxy layer, and the web servers. To get the Windows Azure SLA, each layer will have two virtual machines: two running ARR with public endpoints for SSL traffic (port 443) and two set up as our web servers, but since these will sit behind our reverse proxy, they will not have any public endpoints (outside of remote desktop to help with initial setup). Requests will come in from various clients (web browsers or devices) and arrive at the Windows Azure Load Balancer. The load balancer will then distribute the traffic equally across our two reserve proxy virtual machines where the requests are processed by IIS and ARR and routed based on the rules we will configure to the proper applications on the web servers, each running on a unique port. Optionally, ARR will also handle the routing of requests to a specific web server, ensuring that “session affinity” is maintained. The following diagram illustrates the solution.

The focus on this article in on how we can leverage ARR to fulfill the scenario in a way that’s “cloud friendly”. So while the original customer scenario called for Linux/Apache servers, I’m going to use Windows Server/IIS for this POC. This is purely a decision of convenience since it has been a LONG time since I set up a Linux/Apache web server. Additionally, while the original scenario called for multiple customers, each with their own web applications/modules (as shown in the diagram), I just need to demonstrate the URI to specific application routing. So as you’ll see in later in the article, I’m just going to set up a couple of web applications.

Note: While we can have more than two web servers, I’ve limited the POC to two for the sake of simplicity. If you want to run, 3, 10, or 25, it’s just a matter of creating the additional servers and adding them to the ARR web farms as we’ll be doing later in this article.

Setting up the Servers in Windows Azure

If you’re used to setting up Virtual Machines in Windows Azure, this is fairly straight forward. We start by creating a cloud service and two storage accounts. The reason for the two is that I really want to try and maximize the uptime of the solution. And if all the VM’s had their hard-drives in a single storage account and that account experienced a sustained service interruption, my entire solution could be taken-offline.

NOTE: The approach to use multiple storage accounts does not guarantee availability. This is a personal preference to help, even if in some small part, mitigate potential risk.

You can also go so far as to define a virtual network for the machines with separate subnets for the front and back end. However, this should not be required for the solution to work as the cloud service container gives us DNS resolution within its boundaries. However, the virtual network can be used to help manage visibility and security of the different virtual machine instances.

Once the storage accounts are created, I create the first of our two “front end” ARR servers by provisioning a new Windows Server 2012 virtual machine instance. I give it a meaningful name like “ARRFrontEnd01” and make sure that I also create an availability set and define a HTTPS endpoint on port 443. If you’re using the Management portal, be sure to select the “from gallery” option as opposed to ‘quick create’ as it will give you additional options when provisioning the VM instance and allow you to more easily set the cloud service, availability set, and storage account. After the first virtual machine is created, create a second, perhaps “ARRFrontEnd02”, and “attach” it to the first instance by associating it with the endpoint we created while provisioning the previous instance.

Once our “front end” machines are provisioned, we set up two more Windows Server 2012 instances for our web servers, “WebServer01” and “WebServer02”. However, since these machines will be behind our front end servers, we won’t declare any public endpoints for ports 80 or 443, just leave the defaults.

When complete, we should have four virtual machine instances, two that are load balanced via Windows Azure on port 433 and will act as our ARR front end servers and our two that will act as our web servers.

Now before we can really start setting things up, we’ll need to remote desktop into each of these servers and add a few roles. When we log on, we should see the Server Manager dashboard. Select “Add roles and features” from the “configure this local server” box.

In the “Add Roles and Features” wizard, skip over the “Before you Begin” (if you get it), and select the role-based installation type.

On the next page, we’ll select the current server from the server pool (the default) and proceed to adding the “Web Server (IIS)” server role.

This will pop-up another dialog confirming the features we want added. Namely the Management Tools and IIS Management Console. So take the defaults here and click “Add Features” to proceed.

The next page in the Wizard is “Select Features”. We’ve already selected what we needed when we added the role, so click on “Next” until you arrive at the “Select Role Services”. There are two optional role services here I’d recommend you consider adding. Health and Diagnostic Tracing will be helpful if we have to troubleshoot our ARR configuration later and The IIS Management Scripts and Tools will be essential if we want to automate the setup of any of this at a later date (but that’s another blog post for another day). Below is a composite image that shows these options selected.

It’s also a good idea to double-check here and make sure that the IIS Management Console is selected. It should be by default since it was part of the role features we included earlier. But it doesn’t hurt to be safe. J

With all this complete, go ahead and create several sites on the two web servers. We can leave the default site on port 80, but create two more HTTP sites. I used 8080 and 8090 for the two sites, but feel free to pick available ports that meet your needs. Just be sure to go into the firewall settings of that server enable inbound connections on these ports. I also went into the sites and changed the HTML so I could tell which server and which app I was getting results back from (something like “Web1 – SiteA” works fine).

Lastly, test the web sites from our two front end servers to make sure they can connect by logging into those servers and opening a web browser and enter in the proper address. This will be something like HTTP://<servername>:8080/iisstart.htm. The ‘servername’ parameter is simply the name we gave the virtual machine when it was provisioned. Make sure that you can hit both servers and all three apps from both of our proxy servers before proceeding. If these fail to connect, the most likely cause is an issue in the way the IIS site was defined, or an issue with the firewall configuration on the web server preventing the requests from being received.

Install ARR and setting up for HA

With our server environment now configured, and some basic web sites we can balance traffic against, it’s time to define our proxy servers. We start by installing ARR 3.0 (the latest version as of this writing and compatible with IIS 8.0. You can download it from here, or install it via the Web Platform Installer (WebPI). I would recommend this option, as WebPI will also install any dependencies and can be scripted. Fortunately, when you open up the IIS Manager for the first time and select the server, it will ask if you want to install the “Microsoft Web Platform” and open up a browser to allow you to download it. After a few adding a few web sites to the ‘trusted zone’ (and enabling file downloads when in the ‘internet’ zone), you’ll be able to download and install this helpful tool. Once installed, run it and enter “Application Request” into the search bar. We want to select version 3.0.

Now that ARR is installed (which we have to do on both of our proxy servers), let’s talk about setting this up for high availability. We hopefully placed both or proxy servers into an availability set and load balanced the 443 endpoint as mentioned above. This allows both servers to act as our proxy. But we have two possible challenges yet:

  1. How to maintain the ARR setup across two servers
  2. Ensure that session affinity (aka sticky sessions) works with multiple, load balanced ARR servers

Fortunately, there’s a couple of decent
blog posts on IIS.NET about this subject. Unfortunately, these appear to have been written by folks that are familiar with IIS, networking, pings and pipes, and a host of other items. But as always, I’m here to try and help cut through all that and put this stuff in terms that we can all relate too. And hopefully in such a way that we don’t lose any important details.

To leverage Windows Azure’s compute SLA, we will need to run two instances of our ARR machines and place them into an availability set. We set up both these servers earlier, and hopefully properly placed them into an availability set with a load balanced endpoint on port 443. This allows the Windows Azure fabric to load balanced traffic between the two instances. Also, should updates to the host server (where our VMs run) or the fabric components be necessary, we can minimize the risk of both ARR servers being taken offline at the same time.

This configuration leads us to the options highlighted in the blog post I linked previously, “Using Multiple Instances of Application Request Routing (AAR) Servers“. The article discusses using Shared Configuration and External Cache. A Shared Configuration allows two ARR servers to share their confiurations. By leveraging a shared configuration, changes made to one ARR server will automatically be leveraged by the other because both servers will share a single applicationhost.config file. The External Cache is used to allow both ARR servers to share affinity settings. So if a client’s first request is sent to a given back end web server, then all subsequent requests will be sent to that same back end server regardless of which ARR server receives the request.

For this POC, I decided not to use either option. Both require a shared network location. I could put this on either ARR server, but this creates a single point of failure. And since our objective is to ensure the solution remains as available as possible, I didn’t want to take a dependency that would ultimately reduce the potential availability of the overall solution. As for the external cache, for this POC I only wanted to have server affinity for one of the two web sites since the POC is mocking up both round-robin load balancing for requests that may be more like an API. For requests that are from a web browser, instead of using shared cache, we’ll use “client affinity”. This option returns a browser cookie that contains all the routing information needed by ARR to ensure that subsequent requests are sent to the same back end server. This is the same approach used by the Windows Azure Java SDK and Windows Azure Web Sites.

So to make a long story short, if we’ve properly set up our two ARR server in an availability set, with load balanced endpoints, there’s no additional high level configuration necessary to set up the options highlighted in the “multiple instances” article. We can get what we need within ARR itself.

Configure our ARR Web Farms

I realize I’ve been fairly high level with my setup instructions so far. But many of these steps have been fairly well documented and up until this point we’ve been painting with a fairly broad brush. But going forward I’m going to get more detailed since it’s important that we properly set this all up. Just remember, that each of the steps going forward will need to be executed on each of our ARR servers since we opted not to leverage the Shared Configuration.

The first step after our servers have been set up is to configure the web farms. Open the IIS Manager on one of our ARR servers and (provided our ARR 3.0 install was completed successfully), we should see the “Server Farm” node. Right-click on that node and select “Create Server Farm” from the pop-up menu as shown in the image at the right. A Server Farm is a collection of servers that we will have ARR route traffic to. It’s the definition of this farm that will control aspects like request affinity and load balancing behaviors as well as which servers will receive traffic.

The first step in setting up the farm is to add our web servers. Now in building my initial POC, this is the piece that caused me the most difficulty. Not because creating the server farm was difficult, but because there’s one thing that’s not apparent to those of us that aren’t intimately familiar with web hosting and server farms. Namely that we need to consider a server farm to be specific to one of our applications. It’s this understanding that helps us realize that we need the definition of the server farm to help us route requests coming to the ARR server on one port, to be routed to the proper port(s) on the destination back end servers. We’ll do this as we add each server to the farm using the following steps…

After clicking on “Create Server Farm”, provide a name for the farm. Something suitable of course…

After entering the farm name and clicking on the “Next” button, we’ll be presented with the “Add Server” dialog. In this box, we’ll enter in the name of each of our back end servers but more importantly we need to make sure we expand the “Advanced Settings” options so we can also specify the port on that server we want to target. In my case, I’m going to a ‘Web1’, the name of the server I want to add and I want to set ‘httpPort’ to 8080.

We’re able to do this because Windows Azure handles DNS resolution for the servers I added to the cloud service. And since they’re all in the same cloud service, we can address each server on any ports those servers will allow. There’s no need to define endpoints for connections between servers in the same cloud service. So we’ll complete the process by clicking on the ‘Add’ button and then doing the same for my second web server, ‘Web2’. We’ll receive a prompt about the creation of a default a rewrite rule, click on the “No” button to close the dialog.

It’s important to set the ‘httpPort’ when we add the servers. I’ve been unable to find a way to change this port via the IIS Manager UI once the server has been added. Yes you can change it via appcmd, powershell, or even directly editing the applicationhost.config, but that’s a topic for another day. J

Now to set the load balancing behavior and affinity we talked about earlier, we select the newly created server farm from the tree and we’ll see the icons presented below:

If we double-click on the Load Balance icon, it will open a dialog box that allows us to select from the available load balancing algorithms. For the needs of this POC, Least Recent Request and Weighted Round Robin would both work suitably. Select the algorithm you prefer and click on “Apply”. To set the cookie based client affinity I mentioned earlier, you can double click on the “Server Affinity” option and then check the box for “Client Affinity”.

The final item that we will make sure is enabled here is SSL Offloading. We can verify this by double-clicking on “Routing Rules” and verifying that “Enabled SSL Offloading” is checked which is should be by default.

Now it’s a matter of repeating this process for our second application (I put it on port 8090) as well as setting up the same two farms on the other ARR server.

Setting up the URL Rewrite Rule

The next step is to set up the URL rewrite rule that will tell ARR how to route requests for each of our applications to the proper web farm. But before we can do that, we need to make sure we have two unique URI’s, one for each of our applications. If you scroll up and refer to the diagram that provides the overview of our solution, you’ll see that an end user request to the solution are directed at custaweb.somedomian.com and device api calls are directed to custbweb.somedomain.com. So we will need to create an aliasing DNS entry for these names and alias them to the *.cloudapp.net URI that is the entry point of the cloud service where this solution resides. We can’t use just a forwarding address for this but need a true CNAME alias.

Presuming that has already been setup, we’re ready to create the URL rule for our re-write behavior.

We’ll start by selecting the web server itself in the IIS server manager and double clicking the URL Rewrite icon as shown below.

This will open the list of URL rewrite rules, and we’ll select “add rules…” form the action menu on the right. Select to create a blank inbound rule. Give the rule an appropriate name, and complete the sections as shown in the following images.

Matching URL

This section details what incoming request URI’s this rule should be applied too. I have set it up so that all inbound requests will be evaluated.

Conditions

Now as it stands, this rule would route nearly any request. So we need have to add a condition to the rule to associate it with a specific request URL. We need to expand the “Conditions” section and click on “Add…”. We specify “{HTTP_HOST}” as the input condition (what to check) and set the condition’s type is a simple pattern match. And for the pattern itself, I opted to use a regular expression that looks at the first part of the domain name and makes sure it contains the value “^custAweb.*” (as we highlighted in the diagram at the top). In this way we ensure that the rule will only be applied to one of the two URI’s in our sample.

Action

The final piece of the rule is to define the action. For our type, we’ll select “Route to Server Farm”, keep HTTP as the scheme, and specify the appropriate server farm. And for the path, we’ll leave the default value of “/{R:0}”. The final piece of this tells ARR to add any paths or parameters that were in the request URL to the forwarded request.

Lastly, we have the option of telling ARR that if we execute this rule, we should not process any subsequent rules. This can be checked or unchecked depending on your needs. You may desire to set up a “default” page for requests that don’t meet any of our other rules. In which case just make sure you don’t “stop processing of subsequent rules” and place that default rule at the bottom of the list.

This completes the basics of setting up of our ARR based reverse proxy. Only one more step remains.

Setting up SNI and SSL Offload

Now that we have the ARR URL Rewrite rules in place, we need to get all the messiness with the certificates out of the way. We’ll assume, for the sake of argument, that we’ve already created a certificate and added it to the proper local machine certificate store. If you’re unsure how to do this, you can find some instructions in this article.

We start by creating web site for the inbound URL. Select the server in the IIS Manager and right-click it to get the pop-up menu. This open the “Add Website” dialog which we will complete to set up the site.

Below you’ll find some settings I used. The site name is just a descriptive name that will appear in the IIS manager. For the physical path, I specified the same path as the “default” site that was created when we installed IIS. We could specify our own site, but that’s really not necessary unless you want to have a placeholder page in case something goes wrong with the ARR URL Rewrite rules. And since we’re doing SSL for this site, be sure to set the binding type to ‘https’ and specify the host name that matches the inbound URL that external clients will use (aka our CNAME). Finally, be sure to check “Require Server Name Indication” to make sure we support Server Name Indication (SNI).

And that’s really all there is to it. SSL offloading was already configured for us by default when we created the server farm (feel free to go back and look for the checkbox). So all we had to do was make sure we had a site defined in IIS that could be used to resolve the certificate. This will process the encryption duties, then ARR will pick up the request for processing against our rules.

Debugging ARR

So if we’ve done everything correctly, it should just work. But if it doesn’t, debugging ARR can be a bit of a challenge. You may recall that back when we installed ARR, I suggested also installing the tracing and logging features. If you did, these can be used to help troubleshoot some issue as outlined in this article from IIS.NET. While this is helpful, I also wanted to leave you with one other tip I ran across. If possible, use a browser on the server we’re configured ARR on to access the various web sites locally. While this won’t do any routing unless you set up some local DNS entries to help with resolving to the local machine, it will show you more than a stock “500” error. By accessing the local IIS server from within, we can get more detailed error messages that help us understand what may be wrong with our rules. It won’t allow you to fix everything, but could sometimes be helpful.

I wish I had more for you on this, but ARR is admittedly a HUGE topic, especially for something that’s a ‘free’ add-on to IIS. This blog post is the results of several days of experimentation and self-learning. And even with this time invested, I would never presume to call myself an expert on this subject. So please forgive if I didn’t get into enough depth.

With this, I’ll call this article to a close. I hope you find this information useful and I hope to revisit this topic again soon. One item I’m still keenly interested in is how to automate these tasks. Something that will be extremely useful for anyone that has to provision new ‘apps’ into our server farm on a regular basis. Until next time then!

Postscript

I started this post in October 2013 and apologize for the delay in getting it out. We were hoping to get it published as a full-fledge magazine article but it just didn’t work out. So I’m really happy to finally get this out “in the wild”. I’d also like to give props to Greg, Gil, David, and Ryan for helping do technical reviews. They were a great help but I’m solely responsible for any grammar or spelling issues contained here-in. If you see something, please call it out in the comments or email me and I’m happy to make corrections.

This will also hopefully be the first of a few ARR related posts/project I plan to share over the next few weeks/months. Enjoy!

5 Responses to ARR as a highly available reverse proxy in Windows Azure

  1. Pingback: Friday, February 7, 2014 on #WindowsAzure | Alexandre Brisebois

  2. Travis Shepherd says:

    How does this handle a scenario where Request 1 hits Reverse Proxy 1, and establishes a session with Host 1. Request 2 (from the same session) then hits Reverse Proxy 2. How does the second proxy know to route the request to Host 1?

    • Brent says:

      The way we’ve configured ARR in this blog post would result in a cookie being attached to the response from the first request. Assuming the requesting client supports cookies, this will be resent on the second request. The second proxy then uses this cookie to know to route the request to Host 1. There’s another option that can be leveraged that doesn’t require the cookie, but it does require that all the proxy hosts use a shared storage. This storage then in turn stores a lookup table that the proxies use when routing requests. At the time this post was written, this would have been problematic. But with the new Azure Files, a hosted SMB share, it may not be a valid option.

  3. Yaser Suhail says:

    Hello Brent,

    As per the ARR work flow shown the request first hits the ARR load balancer and the LB forwards the requests to the ARR proxy 1 and 2. My question here is: Is the Windows LB smart enough to identify if one of the proxy (either 1 or 2) are down? and does the LB smart enough to route the traffic to the available ARR?

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.