Remote PowerShell with Windows Azure

One of the great things about being part of the TED (Technical Evangelism and Development) organization is that on occasion, we get to engage in various internal projects. The intent of these is to allow us to get "hands on" with things we may not normally have the opportunity to work with. Of late, I've been doing exactly this. And the task I took on eventually required me to do some work with remote PowerShell on a Windows Azure Virtual Machine.

In April of 2013, shortly before we announced the general availability of Windows Azure Virtual Machines, we announced that we would enable remote PowerShell by default on all newly provisioned Windows Azure Virtual Machines. So I've been aware of it for some time, but hadn't yet really had the need to get my hands dirty with it. When the opportunity came up on one of these internal side-projects, I jumped at it. And since I just dropped my wife and daughter off so they can take part in the annual St. Patrick's "Get Lucky" half marathon run/walk, I find myself sitting in the newly refurbished St. Paul Union Depot with free wifi. So I figured, why not share what I've learned lately with all of you. :)

The Requirements

To leverage remote PowerShell, there are a few things we need:

  • The Remote PowerShell Certificate from the Azure hosted VM
  • User Credentials to execute with on the Azure hosted VM
  • The URL for our remote endpoint, <cloudservicename>.cloudapp.net:<port>

Now you could get these directly from the VM, but since we’re discussing PowerShell, why not make things easy and leverage the Management API and PowerShell to get this all set up.

Setting up the Management API

To access the management API, we need a management certificate. We can generate our own certificate, install it locally, and then install it into Windows Azure. But I prefer to use PowerShell to do this. To make life easy, let’s start by going to https://manage.windowsazure.com and logging into the portal using the LiveID that’s associated with the subscription we want to access via the management API.

That done, we will switch over to PowerShell and execute the cmdlet Get-AzurePublishSettingsFile. This command will actually launch a browser session to https://manage.windowsazure.com/publishsettings/ (you can also go to the URL manually), and using the profile that's already logged in (hence why I went to the portal previously), prompt you to download a publishing profile. What this page has actually done is generate a new x.509 v3 management certificate and associate it with the subscription for us. The page then generates a publishsettings file that contains the details of the subscription and a thumbprint for this certificate, and prompts us to download it.

Note: I would recommend you save this file in a secure location as Windows Azure currently allows a maximum of 100 certificates per subscription. So if you run this command often, you'll exhaust your available certificate slots and need to start removing older ones.

Once you have the file downloaded, you can import it into your local machine via the Import-AzurePublishSettingsFile cmdlet. This cmdlet uses the publishsettings file to create a certificate in your local certificate store that will be used to authenticate our Windows Azure Management API calls. We can even verify that it was imported successfully using the Get-AzureSubscription cmdlet.
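In practice that's just a couple of lines. Here's a minimal sketch; the path to the .publishsettings file is just a placeholder for wherever you saved yours:

# import the publish settings file we just downloaded (the path is a placeholder)
Import-AzurePublishSettingsFile "C:\secure\mysubscription.publishsettings"

# list the subscriptions now known to the Azure cmdlets to confirm the import worked
Get-AzureSubscription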

Remote PowerShell Certificate

Now the certificate we just installed will be used to sign and validate our Management API commands. For security purposes, remote PowerShell also requires a certificate. Windows Azure created one for us when the virtual machine was provisioned, but we need to get a copy of it. Fortunately, there's a code snippet from former Microsoft PowerShell guru Michael Washam that does an excellent job of detailing this. I'm just going to break this down a bit and perhaps add one additional step.

The first thing we need to do is make sure we set the Azure subscription we want to work with as our current one. If you only have one subscription, you may not miss this step, but for those of us that juggle multiple subscriptions, it's key to include this.

Select-AzureSubscription <subscriptionNameValue>

This will make sure all subsequent operations run against the subscription identified by the name we provided.

Next, we get a VM PowerShell object for the IaaS Machine we want to work with, and get the DefaultWinRMCertificateThumbprint from its extended properties. Since we already selected the target subscription, we now need to know the name of the Windows Azure Cloud Service that contains our VM, and of course the VM’s name.

We do this as follows:

$winRMCert = (Get-AzureVM -ServiceName $cloudServiceName -Name $virtualMachineName | select -ExpandProperty vm).DefaultWinRMCertificateThumbprint

After this command, we simply check the value of $winRMCert to make sure it is not null (meaning we found our VM and got the thumbprint).

if (!$winRMCert)
{
    write-Host ("**ERROR**: Unable to find WinRM Certificate for virtual machine '"+$virtualMachineName)
    $vm = Get-AzureVM -ServiceName $cloudServiceName -Name $virtualMachineName
    if (!$vm)
    {
        write-Host ("virtual machine "+$virtualMachineName+" not found in cloud service "+$cloudServiceName);
    }
    Exit
}

Using this thumbprint, we are able to call the management API again and extract the certificate that was associated with that virtual machine. We then save it locally…

$AzureX509cert = Get-AzureCertificate -ServiceName $cloudServiceName -Thumbprint $winRMCert -ThumbprintAlgorithm sha1

$certTempFile = [IO.Path]::GetTempFileName()
$AzureX509cert.Data | Out-File $certTempFile

And with the cert file saved locally, we just need to import it into our local machine and delete the file (we don't want to leave a security credential lying around, after all).

$CertToImport = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2 $certTempFile

$store = New-Object System.Security.Cryptography.X509Certificates.X509Store "Root", "LocalMachine"
$store.Certificates.Count
$store.Open([System.Security.Cryptography.X509Certificates.OpenFlags]::ReadWrite)
$store.Add($CertToImport)
$store.Close()

write-Host ("Cleanup cert file- "+[System.DateTime]::Now.ToString("hh:mm:ss"))
Remove-Item $certTempFile

And there we have it, the local machine is now all set up to start performing remote PowerShell commands against our Windows Azure Hosted Virtual Machine.

Executing the remote command

We're in the final stretch now. All that remains is to get the URL for the endpoint we need to send the PowerShell commands to, and fire them off. Getting the URI is fairly easy; we again leverage the cloud service and virtual machine names.

$uri = Get-AzureWinRMUri -ServiceName $cloudServiceName -Name $virtualMachineName 

But we'll also need the user that will be executing these commands within the remote machine. And for that, we'll create a PSCredential object. There are two ways to do this. If you don't want the credentials stored in the script, you can do as Michael's example shows and just:

$credential = Get-Credential

This option will cause a prompt to come up where you can enter the username and password for the remote user you want to connect as. You can even shortcut this a bit by providing the user name, and then only prompting for the password. This is an excellent "best practice", but there are times when you need to be able to do this in a way that's completely unattended. When this happens, you can use an approach like the following:

$secpwd = ConvertTo-SecureString $password -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential($adminUser, $secpwd)

Thanks to my friend Cory for turning me onto this trick. Using this you can embed the username/password right into the script, read it from a file, or perhaps even make it a parameter if your script is being called from something else.

Now at the top, I said we needed three things: a certificate, a URI, and some credentials. Well we have those now, so it’s just a matter of executing our command!

Invoke-Command -ConnectionUri $uri.ToString() -Credential $credential -ScriptBlock {
    if (Test-Path -Path $args[0])
    {
        Remove-Item -Path $args[0] -Recurse
    }
} -ArgumentList $path

Invoke-Command is doing the heavy work here, using our URI and Credentials. It will reach out to the VM, and execute the script block I’ve provided. In this case, an attempt to remove a registry item that’s identified by the variable $path.

TA-DA!

Crossing the finish line

My wife and daughter are still out there enjoying a brisk March day as they truck along towards their finish line. But for me this post is at an end. If you look at my previous posts on ARR, I think you can see how this could be used to remotely install and configure ARR. And while that wasn't the intent of the task I took on that allowed me to dig into remote PowerShell, it is a nice way to tie things back to my other work. I still have two posts in that series, so I figured if I'm going to interrupt that series, I might as well have some way to tie it all together. :)

So until next time!

Automating ARR Configuration

In the world of cloud, we have to become familiar with the concept of DevOps, and this means that often we need to code setups rather than write lengthy (nearly 5000 words) sets of instructions. In my last post, I walked you through manually setting up ARR. But what if we want to take this to the next level and start automating the setup?

Now there are several examples out there of automating the configuration of ARR in Windows Azure. I don't want to simply rehash those examples, but instead "teach you to fish" as it were. And while I don't have all the answers (I am not, and don't intend to be, an ARR "expert"), I do want to share with you some of the things I've learned on this subject lately.

Installing ARR “automatically”

In my last write-up, I talked about using the Web Platform Installer to download and install ARR and its dependencies. Fortunately, we can take this step and turn it into a fairly simple PowerShell script. The samples below are from a start-up script I created for a Windows Azure PaaS Cloud Service worker role.

First, I'm going to set up a couple of variables so we have most of the things we want to customize at the top of the script.

# temporary variables

$temppath = $env:roleroot + "\approot\startuptemp\"
$webpifile = "webpi.msi"
$tempwebpi = $temppath + $webpifile
$webplatformdownload = "http://download.microsoft.com/download/7/0/4/704CEB4C-9F42-4962-A2B0-5C84B0682C7A/WebPlatformInstaller_amd64_en-US.msi"

These four variables are, in order:

  • temppath: where we'll put the file when it's downloaded
  • webpifile: the name I'm going to give to the WebPI install file after we download it
  • tempwebpi: the full path with name that it will be saved as (make sure this isn't too long or we'll have issues)
  • webplatformdownload: the URL we are going to download the WebPI installer from

Next up, we need the code to actually create the temporary location and download the webPI install package to that location.

# if it doesn't already exist, create a temp location that we can place files in
Write-Host ("Testing Temporary Path: " + $temppath)
if((Test-Path -PathType Container $temppath) -eq $false)
{
    Write-Host ("Created WebPI directory: " + $temppath)
    New-Item -ItemType directory -Path $temppath
}

# if it doesn't already exist, download Web Platform Installer 4.6 to the temp location
if((Test-Path $tempwebpi) -eq $false)
{
    Write-Host "Downloading WebPI installer"
    $wc = New-Object System.Net.WebClient
    $wc.DownloadFile($webplatformdownload, $tempwebpi)
}

Ideally, we may want to wrap this in some re-try logic so we can handle any transient issues related to the download, but this will get us by for the moment.
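If you do want that retry behavior, a minimal sketch might look like the following; the attempt count and delay are arbitrary values I picked for illustration:

# simple retry wrapper around the download; attempt count and delay are arbitrary
$maxAttempts = 3
for ($attempt = 1; $attempt -le $maxAttempts; $attempt++)
{
    try
    {
        $wc = New-Object System.Net.WebClient
        $wc.DownloadFile($webplatformdownload, $tempwebpi)
        break
    }
    catch
    {
        Write-Host ("Download attempt " + $attempt + " failed: " + $_.Exception.Message)
        if ($attempt -eq $maxAttempts) { throw }
        Start-Sleep -Seconds 10
    }
}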

Now, we need to install the WebPI using “/quiet” or silent install mode.

#install Web Platform Installer
Write-Host "Install WebPI"
$tempMSIParameters = "/package " + $tempwebpi + " /quiet"
(Start-Process -FilePath "msiexec.exe" -ArgumentList $tempMSIParameters -Wait -Passthru).ExitCode

Please note that I'm not testing to ensure that this installed properly. So again, for due diligence, we should likely wrap this in some error handling code.
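A minimal sketch of that error handling might just capture the exit code and bail out if the install failed (0 means success, and 3010 means success but a reboot is required):

# capture the msiexec exit code and fail fast if the install didn't succeed
$exitCode = (Start-Process -FilePath "msiexec.exe" -ArgumentList $tempMSIParameters -Wait -Passthru).ExitCode
if (($exitCode -ne 0) -and ($exitCode -ne 3010))
{
    Write-Host ("WebPI install failed with exit code " + $exitCode)
    Exit $exitCode
}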

With all that done, all that remains is to use WebPI to install ARR.

#use WebPI to install ARR v3
Write-Host "Using WebPI to install ARR v3"
$tempPICmd = $env:programfiles + "\microsoft\web platform installer\webpicmd"
$tempPIParameters = "/install /accepteula /Products:ARRv3_0"
Write-Host $tempPICmd
(Start-Process -FilePath $tempPICmd -ArgumentList $tempPIParameters -Wait -Passthru).ExitCode

Now this is where we run into our first challenge. Note in the fourth line of this sample that I specify a product name, "ARRv3_0". This wasn't just some random guess. I needed to discover what the correct product ID was. For those that aren't familiar with the Web Platform Installer, it gets its list of products from an RSS feed. There are many feeds, but after some poking around, I found the 5.0 feed at http://www.microsoft.com/web/webpi/5.0/WebProductList.xml

I right clicked the page, viewed the source, and searched the result for "ARR", eventually finding the XML node for "Application Request Routing 3.0" (the version I'm after). In this node, you'll find the productID value that I needed for this step. Below is a picture of the RSS feed with this value highlighted.

Needless to say, tracking that down the first time took a bit of digging. :)
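As an aside, if you'd rather not dig through the feed XML by hand, I believe WebPICmd itself can list the available products and their IDs. Something along these lines (piping through findstr to narrow the output) should do the trick:

# list available WebPI products and filter the output down to the ARR entries
& ($env:programfiles + "\microsoft\web platform installer\webpicmd.exe") /List /ListOption:Available | findstr /i "Routing"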

When you put all the snippets above together, and run it, it should result in ARR being installed and running. Mind you, this assumes you installed the IIS server role, and that nothing goes wrong with the installs. But automating those two checks is a task for another day.
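If you do want a head start on the first of those checks, the ServerManager module that ships with Windows Server 2012 can tell you whether the Web Server role is present. A rough sketch:

# a sketch: verify the Web Server (IIS) role is installed before we try to configure it
Import-Module ServerManager
if (-not (Get-WindowsFeature -Name Web-Server).Installed)
{
    Write-Host "The Web Server (IIS) role is not installed; install it before running this script."
    Exit
}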

Generating scripts for changing our IIS configuration

So the next step is scripting our IIS configuration. If you search around, you'll find links on using appcmd, and maybe even a few on PowerShell. But the challenge is figuring out the right steps to take for your own unique situation when you don't have days (sometimes weeks) to dig through all the documentation. I started down this path, analyzing the options available and their parameters with the intent to then spend countless hours writing and debugging my own scripts. That is, until I found the IIS Configuration Editor.

When you load the IIS Manager UI, there’s an innocent looking icon in the management section labelled “Configuration Editor”. This will allow you to edit the IIS configuration, save/reject those changes, and even…. generate scripting!

Now there is a catch… this tool assumes you have an understanding of the monstrously complex schema that is applicationHost.config. When you launch the Configuration Editor, the first thing you'll need to do is specify what section of the configuration you want to work with. And unless you've digested the docs and have a deep understanding of the schema, this can be a real "needle in a haystack" proposition.

Fortunately for us, there's a workaround we can leverage, namely the applicationHost.config file itself. What I've taken to doing is to start by using the GUI to make the configuration changes I need, making note of the unique names I give items. Once you've done that, you can go to the folder "%SYSTEMROOT%\System32\inetsrv\config\" and there you will find the applicationHost.config XML file. Open that file in your favorite XML editor, and have your search button ready.

In my previous article, I set up a web farm and gave it a unique name, a name I can now search on. So using text search, I located the <webFarms><webFarm… node that described "App1Farm" (my unique name). Furthermore, this helped me identify that to set up the web farm, "webFarms" is the section I need to select in the Configuration Editor.

Once there, I can open up the collection listed, and I'll see any farms that have been configured. After a bit of trial and error I can even find out the specific settings needed to set up my server farm, separating my custom settings from the defaults. This is where the fun starts.

If you look at the previous screen shot, on the far right are the actions we can take: Apply, Cancel, and Generate Script. When you use this editor to start making changes, these options will be enabled. So assume I go in and add a Web Farm like I described in my last post. When I close the dialog where I edited the settings, before I click on Apply or Cancel, I instead click on Generate Script and get the following dialog box!

This shows me the code needed to make the change I just made. And I can do this via C#, JavaScript, the AppCmd utility, or PowerShell! Now the sample above just creates a farm with no details, but you can start to see where this goes. We can now use this utility to model the configuration changes we want to automate and generate template code that we can then incorporate into our solutions.

Note: after you've generated the code you want, be sure to click on Apply or Cancel as appropriate. Otherwise the Generate Script option continues to track the delta of your changes and will generate code for ALL the changes you've made.

Writing our first C# IIS Configuration Modifications

So with samples in hand, we’re ready to start writing some code. In my case, I’m going to do so with C#. So open up Visual Studio, and create a new project (a class library will do), and paste in your sample code.

The first thing you'll find is that you're missing a reference to Microsoft.Web.Administration. Provided your development machine has IIS with the administration tools installed, you can add a reference to %systemroot%/system32/inetsrv/Microsoft.Web.Administration.dll to your project and things should resolve nicely. If you can't find the file, then likely you will need to add these roles/components to your dev machine first. I cover how to do this with Windows Server 2012 in my last post, but for Windows 8 (or 7 for that matter), it's a matter of going to Programs and Features, and then turning Windows features on or off.

When you click on the highlighted option above, this will bring up the Windows Features dialog. Scroll down to “Internet Information Services” and make sure you have IIS Management Service installed, as well as any World Wide Web Services you think you may want.

With the mundane out of the way, the next step is to get back to the code. We'll start by looking at some code I generated to create a basic web farm like the one I used last time.

The first step is to get a ConfigurationSection object that contains the "webFarms" section (which we selected when we were editing the configuration, remember).

ServerManager serverManager = new ServerManager();

Configuration config = serverManager.GetApplicationHostConfiguration();

ConfigurationSection webFarmsSection = config.GetSection("webFarms");

ServerManager allows us to access the applicationHost.config file. We use that object to retrieve the configuration, and in turn pull the “webFarms” section into a ConfigurationSection object we can then manipulate.

Next up, we need to get the collection of web farms, and create a new element in that collection for our new farm.

ConfigurationElementCollection webFarmsCollection = webFarmsSection.GetCollection();

ConfigurationElement webFarmElement = webFarmsCollection.CreateElement("webFarm");

webFarmElement["name"] = @"sample";

The collection of farms is stored in a ConfigurationElementCollection object which is populated by doing a GetCollection on the section we retrieved previously. We then use the CreateElement method to create a new element of type "webFarm". Finally, we give that new element our name, in this case 'sample'. (Original, aren't I *grin*)

The next logical step is to identify the affinity settings for our new web farm. In my case, I change the default timeout from 30 to 10 minutes.

ConfigurationElement applicationRequestRoutingElement = webFarmElement.GetChildElement("applicationRequestRouting");

ConfigurationElement affinityElement = applicationRequestRoutingElement.GetChildElement("affinity");

affinityElement["timeout"] = TimeSpan.Parse("00:10:00");

Using the webFarmElement we created in the last snippet, we now retrieve a child element that contains the settings for application request routing. And using that element, we get the one that has the details on how affinity is set. In this case, setting "timeout" to a timespan of 10 minutes.

I also want to change the load balancing behavior. The default is least request, but I prefer round robin. This is done in the same manner, but we use the “loadBalancing” element instead of the “affinity” element of the same “applicationRequestRouting” element we just used.

ConfigurationElement loadBalancingElement = applicationRequestRoutingElement.GetChildElement("loadBalancing");

loadBalancingElement["algorithm"] = @"WeightedRoundRobin";

Now that we’re all done, it’s time to add the new web farm element back to the farms collection, and commit our changes to the applicationHost.config file.

webFarmsCollection.Add(webFarmElement);

serverManager.CommitChanges();

And there we have it! We’ve customized the IIS configuration via code!
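If you'd rather drive the same change from PowerShell, the Generate Script dialog will produce something along these lines using the WebAdministration module. This is just a sketch with the same sample values used above (farm name 'sample', a 10 minute affinity timeout, and weighted round robin):

Import-Module WebAdministration

# create the farm itself (the equivalent of CreateElement("webFarm") plus the Add call above)
Add-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' -Filter "webFarms" -Name "." -Value @{name='sample'}

# set the affinity timeout to 10 minutes
Set-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' -Filter "webFarms/webFarm[@name='sample']/applicationRequestRouting/affinity" -Name "timeout" -Value "00:10:00"

# switch the load balancing algorithm to weighted round robin
Set-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' -Filter "webFarms/webFarm[@name='sample']/applicationRequestRouting/loadBalancing" -Name "algorithm" -Value "WeightedRoundRobin"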

What next…

As you can likely guess, I'm working on a project that will pull these techniques together. Two actually. Admittedly there's no solid sample here, but then my intent was to share some of the learning I've managed to wring out of IIS.NET, MSDN, and TechNet. And as always, bring them to you in a way that's hopefully fairly easy to digest. While my focus has admittedly been on doing this with C#, you will hopefully be able to leverage the Configuration Editor to help you with any appcmd or PowerShell automation you're looking to pull together.

If all goes well over the next couple weeks, I hope to share my projects with you. These will hopefully add some nice, fairly turnkey capabilities to your Windows Azure projects, but more importantly bring all these learnings into clear focus. So bear with me a bit longer as I go back into hiding to help get the remaining work completed.

Until next time!

ARR as a highly available reverse proxy in Windows Azure

With the general availability of Windows Azure's IaaS solution last year, we've seen a significant uptake in migration of legacy solutions to the Windows Azure platform. And with the even more recent announcement of our agreement with Oracle for them to support their products on Microsoft's hypervisor technology, Hyper-V, we have a whole new category of apps we are being asked to help move to Windows Azure. One common pattern that's been emerging is the need for Linux/Apache/Java solutions to run in Azure at the same level of "density" that is available via traditional hosting providers. If you were an ISV (Independent Software Vendor) hosting solutions for individual customers, you may choose to accomplish this by giving each customer a unique URI and binding that to a specific Apache module, sometimes based on a unique IP address that is associated with a customer specific URL and a unique SSL certificate. This results in a scenario that requires multiple IPs per server.

As you may have heard, the internet is starting to run a bit short on IP addresses. So supporting multiple public IPs per server is a difficult proposition for a cloud, as well as for some traditional hosting providers. To that end we've seen new technologies emerge such as SNI (Server Name Indication) and the use of more and more proxy and request routing solutions like HAProxy, FreeBSD, and Microsoft's Application Request Routing (ARR). This is also complicated by the need to deliver highly available, fault tolerant solutions that can load balance client traffic. This isn't always an easy problem to solve, especially using just application centric approaches. They require intelligent, configurable proxies and/or load balancers. Precisely the kind of low level management the cloud is supposed to help us get away from.

But today, I’m here to share one solution I created for a customer that I think addresses some of this need. Using Microsoft’s ARR modules for IIS, hosted in Windows Azure’s IaaS service, as a reverse proxy for a high-density application hosting solution.

Disclaimer: This article assumes you are familiar with creating/provisioning virtual machines in Windows Azure and then remoting into them to further alter their configurations. Additionally, you will need a basic understanding of IIS and how to make changes to it via the IIS Manager console. I’m also aware of there being a myriad of ways to accomplish what we’re trying to do with this solution. This is simply one possible solution.

Overview of the Scenario and proposed solution

Here’s the outline of a potential customer’s scenario:

  • We have two or more virtual machines hosted in Windows Azure that are configured for high availability. Each of these virtual machines is identical, and hosts several web applications.
  • The web applications consist of two types:
    • Stateful web sites, accessed by end users via a web browser
    • Stateless APIs accessed by a “rich client” running natively on a mobile device
  • The “state” of the web sites is stored in an in-memory user session store that is specific to the machine on which the session was started. So all subsequent requests made during that session must be routed to the same server. This is referred to as ‘session affinity’ or ‘sticky sessions’.
  • All client requests will be over SSL (on port 443), to a unique URL specific to a given application/customer.
  • Each site/URL has its own unique SSL certificate
  • SSL Offloading (decryption of HTTPS traffic prior to its receipt by the web application) should be enabled to reduce the load on the web servers.

As you can guess based on the title of this article my intent is to solve this problem using Application Request Routing (aka ARR), a free plug-in for Windows Server IIS. ARR is an incredibly powerful utility that can be used to do many things, including acting as a reverse proxy to help route requests in a way that is completely transparent to the requestor. Combined with other features of IIS 8.0, it is able to meet the needs of the scenario we just outlined.

For my POC, I use four virtual machines within a single Windows Azure cloud service (a cloud service is simply a container that virtual machines can be placed into that provides a level of network isolation). On-premises we had the availability provided by the “titanium eggshell” that is robust hardware, but in the cloud we need to protect ourselves from potential outages by running multiple instances configured to help minimize downtimes. To be covered by Windows Azure’s 99.95% uptime SLA, I am required to run multiple virtual machine instances placed into an availability set. But since the Windows Azure Load Balancer doesn’t support sticky sessions, I need something in the mix to deliver this functionality.

The POC will consist of two layers, the ARR based reverse proxy layer and the web servers. To get the Windows Azure SLA, each layer will have two virtual machines: two running ARR with public endpoints for SSL traffic (port 443) and two set up as our web servers, but since these will sit behind our reverse proxy, they will not have any public endpoints (outside of remote desktop to help with initial setup). Requests will come in from various clients (web browsers or devices) and arrive at the Windows Azure Load Balancer. The load balancer will then distribute the traffic equally across our two reverse proxy virtual machines where the requests are processed by IIS and ARR and routed, based on the rules we will configure, to the proper applications on the web servers, each running on a unique port. Optionally, ARR will also handle the routing of requests to a specific web server, ensuring that "session affinity" is maintained. The following diagram illustrates the solution.

The focus of this article is on how we can leverage ARR to fulfill the scenario in a way that's "cloud friendly". So while the original customer scenario called for Linux/Apache servers, I'm going to use Windows Server/IIS for this POC. This is purely a decision of convenience since it has been a LONG time since I set up a Linux/Apache web server. Additionally, while the original scenario called for multiple customers, each with their own web applications/modules (as shown in the diagram), I just need to demonstrate the URI to specific application routing. So as you'll see later in the article, I'm just going to set up a couple of web applications.

Note: While we can have more than two web servers, I've limited the POC to two for the sake of simplicity. If you want to run 3, 10, or 25, it's just a matter of creating the additional servers and adding them to the ARR web farms as we'll be doing later in this article.

Setting up the Servers in Windows Azure

If you're used to setting up virtual machines in Windows Azure, this is fairly straightforward. We start by creating a cloud service and two storage accounts. The reason for the two is that I really want to try and maximize the uptime of the solution. If all the VMs had their hard drives in a single storage account and that account experienced a sustained service interruption, my entire solution could be taken offline.

NOTE: The approach to use multiple storage accounts does not guarantee availability. This is a personal preference to help, even if in some small part, mitigate potential risk.

You can also go so far as to define a virtual network for the machines with separate subnets for the front and back end. This should not be required for the solution to work, as the cloud service container gives us DNS resolution within its boundaries. That said, the virtual network can be used to help manage visibility and security of the different virtual machine instances.

Once the storage accounts are created, I create the first of our two "front end" ARR servers by provisioning a new Windows Server 2012 virtual machine instance. I give it a meaningful name like "ARRFrontEnd01" and make sure that I also create an availability set and define an HTTPS endpoint on port 443. If you're using the Management portal, be sure to select the "from gallery" option as opposed to 'quick create', as it will give you additional options when provisioning the VM instance and allow you to more easily set the cloud service, availability set, and storage account. After the first virtual machine is created, create a second, perhaps "ARRFrontEnd02", and "attach" it to the first instance by associating it with the endpoint we created while provisioning the previous instance.

Once our "front end" machines are provisioned, we set up two more Windows Server 2012 instances for our web servers, "WebServer01" and "WebServer02". However, since these machines will be behind our front end servers, we won't declare any public endpoints for ports 80 or 443; just leave the defaults.

When complete, we should have four virtual machine instances: two that are load balanced via Windows Azure on port 443 and will act as our ARR front end servers, and two that will act as our web servers.

Now before we can really start setting things up, we’ll need to remote desktop into each of these servers and add a few roles. When we log on, we should see the Server Manager dashboard. Select “Add roles and features” from the “configure this local server” box.

In the “Add Roles and Features” wizard, skip over the “Before you Begin” (if you get it), and select the role-based installation type.

On the next page, we’ll select the current server from the server pool (the default) and proceed to adding the “Web Server (IIS)” server role.

This will pop-up another dialog confirming the features we want added. Namely the Management Tools and IIS Management Console. So take the defaults here and click “Add Features” to proceed.

The next page in the Wizard is "Select Features". We've already selected what we needed when we added the role, so click on "Next" until you arrive at "Select Role Services". There are two optional role services here I'd recommend you consider adding. Health and Diagnostic Tracing will be helpful if we have to troubleshoot our ARR configuration later, and the IIS Management Scripts and Tools will be essential if we want to automate the setup of any of this at a later date (but that's another blog post for another day). Below is a composite image that shows these options selected.

It's also a good idea to double-check here and make sure that the IIS Management Console is selected. It should be by default since it was part of the role features we included earlier. But it doesn't hurt to be safe. :)

With all this complete, go ahead and create several sites on the two web servers. We can leave the default site on port 80, but create two more HTTP sites. I used 8080 and 8090 for the two sites, but feel free to pick available ports that meet your needs. Just be sure to go into the firewall settings of that server and enable inbound connections on these ports. I also went into the sites and changed the HTML so I could tell which server and which app I was getting results back from (something like "Web1 – SiteA" works fine).

Lastly, test the web sites from our two front end servers to make sure they can connect by logging into those servers, opening a web browser, and entering the proper address. This will be something like HTTP://<servername>:8080/iisstart.htm. The 'servername' parameter is simply the name we gave the virtual machine when it was provisioned. Make sure that you can hit both servers and all three apps from both of our proxy servers before proceeding. If these fail to connect, the most likely cause is an issue in the way the IIS site was defined, or an issue with the firewall configuration on the web server preventing the requests from being received.
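If you'd rather script that quick connectivity check than click around in a browser, something like the following, run from each front end server, should do. The server names and ports are just the sample values from this walkthrough, and Invoke-WebRequest requires PowerShell 3.0 (which Windows Server 2012 includes):

# quick connectivity test from an ARR front end to each web server and site
foreach ($server in @("WebServer01", "WebServer02"))
{
    foreach ($port in @(80, 8080, 8090))
    {
        $url = "http://" + $server + ":" + $port + "/iisstart.htm"
        try
        {
            $response = Invoke-WebRequest -Uri $url -UseBasicParsing -ErrorAction Stop
            Write-Host ($url + " -> " + $response.StatusCode)
        }
        catch
        {
            Write-Host ($url + " -> FAILED: " + $_.Exception.Message)
        }
    }
}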

Installing ARR and setting up for HA

With our server environment now configured, and some basic web sites we can balance traffic against, it's time to define our proxy servers. We start by installing ARR 3.0 (the latest version as of this writing, and compatible with IIS 8.0). You can download it from here, or install it via the Web Platform Installer (WebPI). I would recommend this option, as WebPI will also install any dependencies and can be scripted. Fortunately, when you open up the IIS Manager for the first time and select the server, it will ask if you want to install the "Microsoft Web Platform" and open up a browser to allow you to download it. After adding a few web sites to the 'trusted zone' (and enabling file downloads when in the 'internet' zone), you'll be able to download and install this helpful tool. Once installed, run it and enter "Application Request" into the search bar. We want to select version 3.0.

Now that ARR is installed (which we have to do on both of our proxy servers), let's talk about setting this up for high availability. We hopefully placed both of our proxy servers into an availability set and load balanced the 443 endpoint as mentioned above. This allows both servers to act as our proxy. But we still have two possible challenges:

  1. How to maintain the ARR setup across two servers
  2. Ensure that session affinity (aka sticky sessions) works with multiple, load balanced ARR servers

Fortunately, there are a couple of decent blog posts on IIS.NET about this subject. Unfortunately, these appear to have been written by folks that are familiar with IIS, networking, pings and pipes, and a host of other items. But as always, I'm here to try and help cut through all that and put this stuff in terms that we can all relate to. And hopefully in such a way that we don't lose any important details.

To leverage Windows Azure's compute SLA, we will need to run two instances of our ARR machines and place them into an availability set. We set up both these servers earlier, and hopefully properly placed them into an availability set with a load balanced endpoint on port 443. This allows the Windows Azure fabric to load balance traffic between the two instances. Also, should updates to the host server (where our VMs run) or the fabric components be necessary, we can minimize the risk of both ARR servers being taken offline at the same time.

This configuration leads us to the options highlighted in the blog post I linked previously, "Using Multiple Instances of Application Request Routing (AAR) Servers". The article discusses using a Shared Configuration and an External Cache. A Shared Configuration allows two ARR servers to share their configurations. By leveraging a shared configuration, changes made to one ARR server will automatically be leveraged by the other because both servers will share a single applicationhost.config file. The External Cache is used to allow both ARR servers to share affinity settings. So if a client's first request is sent to a given back end web server, then all subsequent requests will be sent to that same back end server regardless of which ARR server receives the request.

For this POC, I decided not to use either option. Both require a shared network location. I could put this on either ARR server, but this creates a single point of failure. And since our objective is to ensure the solution remains as available as possible, I didn't want to take a dependency that would ultimately reduce the potential availability of the overall solution. As for the external cache, for this POC I only wanted to have server affinity for one of the two web sites, since the POC is mocking up both round-robin load balancing for requests that behave more like an API, and "sticky" routing for requests that come from a web browser. For the browser requests, instead of using a shared cache, we'll use "client affinity". This option returns a browser cookie that contains all the routing information needed by ARR to ensure that subsequent requests are sent to the same back end server. This is the same approach used by the Windows Azure Java SDK and Windows Azure Web Sites.

So to make a long story short, if we've properly set up our two ARR servers in an availability set, with load balanced endpoints, there's no additional high level configuration necessary to set up the options highlighted in the "multiple instances" article. We can get what we need within ARR itself.

Configure our ARR Web Farms

I realize I’ve been fairly high level with my setup instructions so far. But many of these steps have been fairly well documented and up until this point we’ve been painting with a fairly broad brush. But going forward I’m going to get more detailed since it’s important that we properly set this all up. Just remember, that each of the steps going forward will need to be executed on each of our ARR servers since we opted not to leverage the Shared Configuration.

The first step after our servers have been set up is to configure the web farms. Open the IIS Manager on one of our ARR servers and (provided our ARR 3.0 install was completed successfully), we should see the “Server Farm” node. Right-click on that node and select “Create Server Farm” from the pop-up menu as shown in the image at the right. A Server Farm is a collection of servers that we will have ARR route traffic to. It’s the definition of this farm that will control aspects like request affinity and load balancing behaviors as well as which servers will receive traffic.

The first step in setting up the farm is to add our web servers. Now in building my initial POC, this is the piece that caused me the most difficulty. Not because creating the server farm was difficult, but because there's one thing that's not apparent to those of us that aren't intimately familiar with web hosting and server farms. Namely, that we need to consider a server farm to be specific to one of our applications. It's this understanding that helps us realize that the definition of the server farm is what routes requests arriving at the ARR server on one port to the proper port(s) on the destination back end servers. We'll do this as we add each server to the farm using the following steps…

After clicking on “Create Server Farm”, provide a name for the farm. Something suitable of course…

After entering the farm name and clicking on the "Next" button, we'll be presented with the "Add Server" dialog. In this box, we'll enter in the name of each of our back end servers, but more importantly we need to make sure we expand the "Advanced Settings" options so we can also specify the port on that server we want to target. In my case, I'm going to add 'Web1', the name of the server I want to add, and set its 'httpPort' to 8080.

We're able to do this because Windows Azure handles DNS resolution for the servers I added to the cloud service. And since they're all in the same cloud service, we can address each server on any ports those servers will allow. There's no need to define endpoints for connections between servers in the same cloud service. So we'll complete the process by clicking on the 'Add' button and then doing the same for my second web server, 'Web2'. We'll receive a prompt about the creation of a default rewrite rule; click on the "No" button to close the dialog.

It's important to set the 'httpPort' when we add the servers. I've been unable to find a way to change this port via the IIS Manager UI once the server has been added. Yes, you can change it via appcmd, PowerShell, or even by directly editing the applicationhost.config, but that's a topic for another day. :)
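That said, if you do find yourself needing to change the port later without re-adding the server, a rough sketch of the PowerShell approach looks like this (the farm name 'App1Farm', server 'Web1', and port are just the sample values from this walkthrough):

Import-Module WebAdministration

# update the httpPort on an existing server entry within the farm
Set-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' -Filter "webFarms/webFarm[@name='App1Farm']/server[@address='Web1']/applicationRequestRouting" -Name "httpPort" -Value 8080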

Now to set the load balancing behavior and affinity we talked about earlier, we select the newly created server farm from the tree and we’ll see the icons presented below:

If we double-click on the Load Balance icon, it will open a dialog box that allows us to select from the available load balancing algorithms. For the needs of this POC, Least Recent Request and Weighted Round Robin would both work suitably. Select the algorithm you prefer and click on “Apply”. To set the cookie based client affinity I mentioned earlier, you can double click on the “Server Affinity” option and then check the box for “Client Affinity”.

The final item that we will make sure is enabled here is SSL Offloading. We can verify this by double-clicking on "Routing Rules" and verifying that "Enabled SSL Offloading" is checked, which it should be by default.

Now it’s a matter of repeating this process for our second application (I put it on port 8090) as well as setting up the same two farms on the other ARR server.

Setting up the URL Rewrite Rule

The next step is to set up the URL rewrite rule that will tell ARR how to route requests for each of our applications to the proper web farm. But before we can do that, we need to make sure we have two unique URIs, one for each of our applications. If you scroll up and refer to the diagram that provides the overview of our solution, you'll see that end user requests to the solution are directed at custaweb.somedomain.com and device API calls are directed to custbweb.somedomain.com. So we will need to create aliasing DNS entries for these names and alias them to the *.cloudapp.net URI that is the entry point of the cloud service where this solution resides. We can't use just a forwarding address for this; we need a true CNAME alias.

Presuming that has already been setup, we’re ready to create the URL rule for our re-write behavior.

We’ll start by selecting the web server itself in the IIS server manager and double clicking the URL Rewrite icon as shown below.

This will open the list of URL rewrite rules, and we'll select "Add rules…" from the action menu on the right. Select to create a blank inbound rule. Give the rule an appropriate name, and complete the sections as shown in the following images.

Matching URL

This section details which incoming request URIs this rule should be applied to. I have set it up so that all inbound requests will be evaluated.

Conditions

Now as it stands, this rule would route nearly any request. So we have to add a condition to the rule to associate it with a specific request URL. We need to expand the "Conditions" section and click on "Add…". We specify "{HTTP_HOST}" as the input condition (what to check) and set the condition's type to a simple pattern match. And for the pattern itself, I opted to use a regular expression that looks at the first part of the domain name and makes sure it matches the value "^custAweb.*" (as we highlighted in the diagram at the top). In this way we ensure that the rule will only be applied to one of the two URIs in our sample.

Action

The final piece of the rule is to define the action. For our type, we’ll select “Route to Server Farm”, keep HTTP as the scheme, and specify the appropriate server farm. And for the path, we’ll leave the default value of “/{R:0}”. The final piece of this tells ARR to add any paths or parameters that were in the request URL to the forwarded request.

Lastly, we have the option of telling ARR that if we execute this rule, we should not process any subsequent rules. This can be checked or unchecked depending on your needs. You may desire to set up a “default” page for requests that don’t meet any of our other rules. In which case just make sure you don’t “stop processing of subsequent rules” and place that default rule at the bottom of the list.

This completes the basics of setting up of our ARR based reverse proxy. Only one more step remains.

Setting up SNI and SSL Offload

Now that we have the ARR URL Rewrite rules in place, we need to get all the messiness with the certificates out of the way. We’ll assume, for the sake of argument, that we’ve already created a certificate and added it to the proper local machine certificate store. If you’re unsure how to do this, you can find some instructions in this article.

We start by creating a web site for the inbound URL. Select the server in the IIS Manager and right-click it to get the pop-up menu. This opens the "Add Website" dialog, which we will complete to set up the site.

Below you’ll find some settings I used. The site name is just a descriptive name that will appear in the IIS manager. For the physical path, I specified the same path as the “default” site that was created when we installed IIS. We could specify our own site, but that’s really not necessary unless you want to have a placeholder page in case something goes wrong with the ARR URL Rewrite rules. And since we’re doing SSL for this site, be sure to set the binding type to ‘https’ and specify the host name that matches the inbound URL that external clients will use (aka our CNAME). Finally, be sure to check “Require Server Name Indication” to make sure we support Server Name Indication (SNI).

And that’s really all there is to it. SSL offloading was already configured for us by default when we created the server farm (feel free to go back and look for the checkbox). So all we had to do was make sure we had a site defined in IIS that could be used to resolve the certificate. This will process the encryption duties, then ARR will pick up the request for processing against our rules.

Debugging ARR

So if we've done everything correctly, it should just work. But if it doesn't, debugging ARR can be a bit of a challenge. You may recall that back when we installed ARR, I suggested also installing the tracing and logging features. If you did, these can be used to help troubleshoot some issues, as outlined in this article from IIS.NET. While this is helpful, I also wanted to leave you with one other tip I ran across. If possible, use a browser on the server we've configured ARR on to access the various web sites locally. While this won't do any routing unless you set up some local DNS entries to help with resolving to the local machine, it will show you more than a stock "500" error. By accessing the local IIS server from within, we can get more detailed error messages that help us understand what may be wrong with our rules. It won't allow you to fix everything, but can sometimes be helpful.
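If you do want local name resolution so the rewrite rules fire when you browse from the ARR server itself, a quick (and easily reverted) option is a couple of hosts file entries pointing the public host names back at the local machine. A sketch, using the sample host names from earlier:

# run from an elevated PowerShell prompt; points the public host names at the local machine
Add-Content -Path "$env:windir\System32\drivers\etc\hosts" -Value "127.0.0.1 custaweb.somedomain.com"
Add-Content -Path "$env:windir\System32\drivers\etc\hosts" -Value "127.0.0.1 custbweb.somedomain.com"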

I wish I had more for you on this, but ARR is admittedly a HUGE topic, especially for something that’s a ‘free’ add-on to IIS. This blog post is the results of several days of experimentation and self-learning. And even with this time invested, I would never presume to call myself an expert on this subject. So please forgive if I didn’t get into enough depth.

With this, I’ll call this article to a close. I hope you find this information useful and I hope to revisit this topic again soon. One item I’m still keenly interested in is how to automate these tasks. Something that will be extremely useful for anyone that has to provision new ‘apps’ into our server farm on a regular basis. Until next time then!

Postscript

I started this post in October 2013 and apologize for the delay in getting it out. We were hoping to get it published as a full-fledged magazine article but it just didn't work out. So I'm really happy to finally get this out "in the wild". I'd also like to give props to Greg, Gil, David, and Ryan for helping do technical reviews. They were a great help but I'm solely responsible for any grammar or spelling issues contained herein. If you see something, please call it out in the comments or email me and I'm happy to make corrections.

This will also hopefully be the first of a few ARR related posts/project I plan to share over the next few weeks/months. Enjoy!

Monitoring Windows Azure VM Disk Performance

Note: The performance data mentioned here is based on individual results during a limited testing period and should NOT be used as an indication of future performance or availability. For more information on Windows Azure storage performance and availability, please refer to the published SLA.

So I’ve had an interesting experience the last few days that I wanted to take a few minutes to share with the interwebs. Namely, monitoring some Windows Azure hosted virtual machines, not at the VM level, but the storage account that held the virtual machine disks.

The scenario I was facing was a customer that was attempting to benchmark the performance and availability of two Linux based virtual machines that were running an Oracle database. Both machines were extra-large VMs, with one running 10 disks (1 OS, 1 Oracle, 8 data disks) and the other with 16 disks (14 for data). The customer had been running an automated load against both machines and wanted to get a clear idea of how much they may or may not have been saturating the underlying Windows Azure Storage system, as well as what could be contributing to the highly variable Oracle IOPS levels they were seeing.

To support this effort, I dug into something I haven’t looked at in depth for quite some time. Windows Azure Storage Analytics (aka Logging and Metrics). Except this time with a focus on what happens at the storage account with regards to the VM disk activity.

Enable Storage Analytics Proactively

Before we go anywhere, I need to stress that if you want to be able to see what’s going on with Azure Storage and your virtual machine, you’ll need to enable this BEFORE a problem occurs. If you haven’t already enabled logging, the only option you have to try and go “back in time” and look at past behavior is to open up a support ticket. So if you plan to do this type of monitoring, please be certain to enable analytics!

For Windows Azure VM disk metrics, we need to enable analytics on the blob storage account. As the link I just shared will let you know, you will need to call the “Set Blob Service Properties” api to set this (or use your favorite Windows Azure storage utility). I happen to use the Azure Management Studio from Redgate and it allows me to set the properties you see in this screen shot:

With this, I tell Azure Storage that I want it to log all blob operations (Read/Write/Delete) and retain that information for up to two days. I also enable metrics and ask it to retain that data for two days as well.
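If you'd rather script this than use a storage utility, recent builds of the Azure PowerShell module can set the same properties. A sketch, assuming the module is installed and the account name and key below are replaced with your own:

# a sketch; replace the storage account name and key with your own values
$ctx = New-AzureStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey "<your key>"

# log read/write/delete operations against the blob service and keep the logs for 2 days
Set-AzureStorageServiceLoggingProperty -ServiceType Blob -LoggingOperations Read,Write,Delete -RetentionDays 2 -Context $ctx

# capture hourly metrics and keep them for 2 days as well
Set-AzureStorageServiceMetricsProperty -ServiceType Blob -MetricsType Hour -MetricsLevel ServiceAndApi -RetentionDays 2 -Context $ctx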

When I enable logging, Azure Storage will log all operations and persist that information into a series of blob files in a special container, called $logs, in the same storage account I am monitoring. Logs will be spread across multiple blob files as discussed in great detail in the MSDN article "About Storage Analytics Logging". A word of caution: if the storage account is active, logging will produce a LARGE amount of data. In my case, I was seeing a new 150mb log file approximately every 3 minutes. That's about 70gb per day, so I'll be storing about 140gb for my 2 days of retention, which is only about $6.70 per month. Given the cost of the VM itself, this was inconsequential. But if I had shifted my retention period to a month… this can start to get pricy. Additionally, the storage transactions needed to write the logs to blob storage count against the account limit of 20,000 transactions per second. To help reduce the risk of throttling coming into play too early, the virtual machines I'm monitoring have each been deployed into their own storage account.

The metrics are much more lightweight. These are written to a table and provide a per hour view of the storage account. These are the same values that get surfaced up in the Windows Azure Management portal storage account dashboard. I could easily retain these for a much longer period since it’s only a handful of rows being inserted per hour.

Storage Metrics – hourly summary

Now that we've enabled storage analytics and told it to capture the metrics, we can run our test and sit back and look for data to start coming in. After we've run testing for several hours, we can then look at the metrics. Metrics get thrown into a series of tables, but since I only care about the blob account, I'm going to look at $MetricsTransactionsBlob. We'll have multiple rows per hour and can filter based on the type of operation, or get the roll-up across all operations. For general trends, it's this latter figure I'm most interested in. So I apply a query against the table to get all user operations, "(RowKey eq 'user;All')". The resulting query gives me 1 row per hour that I can look at to help me get a general idea of the performance of the storage account.
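If you'd rather pull those rows down with PowerShell than a table browser, a rough sketch using the storage client library that ships with the Azure PowerShell module follows. The assembly path, account details, and the exact set of columns you surface are all assumptions to adjust for your environment:

# a sketch; the assembly path and account details are placeholders to adjust
Add-Type -Path "C:\Program Files (x86)\Microsoft SDKs\Windows Azure\PowerShell\Azure\Microsoft.WindowsAzure.Storage.dll"

$account = [Microsoft.WindowsAzure.Storage.CloudStorageAccount]::Parse("DefaultEndpointsProtocol=https;AccountName=mystorageaccount;AccountKey=<your key>")
$table = $account.CreateCloudTableClient().GetTableReference('$MetricsTransactionsBlob')

# one row per hour for the roll-up of all user operations
$query = New-Object Microsoft.WindowsAzure.Storage.Table.TableQuery
$query.FilterString = "(RowKey eq 'user;All')"

$table.ExecuteQuery($query) | ForEach-Object {
    New-Object PSObject -Property @{
        Hour          = $_.PartitionKey
        Availability  = $_.Properties["Availability"].DoubleValue
        AvgE2ELatency = $_.Properties["AverageE2ELatency"].DoubleValue
        TotalRequests = $_.Properties["TotalRequests"].Int64Value
    }
}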

You’ll remember that I opted to put my Linux/Oracle virtual machine into its own storage account. So this hour summary gives me a really good, high level overview of the performance of the virtual machine. Key factors I looked at are: Availability (we want to make sure that’s above the storage account 99.9% SLA), Average End to End Latency, and if we have less than 100% availability, what is the count of the number of errors we saw.

I won't bore you with specific numbers, but over a 24hr period the lowest availability I saw was 99.993%, with the most common errors being Server Timeouts, Client Errors, or Network Errors. Seeing these occasionally, as long as the storage account remains above 99.9% availability, should be considered normal 'noise'. Given the transient nature of the cloud, some errors are simply to be expected. We also kept an eye on average end to end latency, which during our testing was fairly consistent in the 19-29ms range.

You can learn more about all the data available in these various storage metrics by reviewing ‘Storage Analytics Metrics Table Schema‘ on MSDN.

When we saw numbers that appeared "unusual", we then took the next logical step and inspected the detailed storage logs.

Blob Storage Logs – the devil is in the details

Alright, so things get a bit messier here. First off, the logs are just delimited format files. And while the metrics can help tell us which period in time we want to look at, depending on the number of storage operations we may have several logs we need to slog through (in my case, I was getting about 20 150mb log files per hour). So the first step when digging into the logs is to download them. Either write up some code, grab your favorite utility, or perhaps just log into the management portal and download the files for the timeframe you want to take a closer look at. Once that's done, it's time for some Excel (yeah, that spreadsheet thing…. Really).

The log files are semi-colon delimited files. As such, the easiest way I found to do ad-hoc inspection of the files is to open them up in a spreadsheet application like Excel. I open up Excel, then do the whole "File -> Open" thing to select the log file I want to look at. I then tell Excel it's a delimited file with a semi-colon as the delimiter, and in a few seconds it will import the file all nice and spreadsheet-like for me. But before we start doing anything, let's talk about the log file format. Since the log file doesn't contain any headers, we either need to know what columns contain the data we want, or add some headers. For the sake of keeping things easy for you (and saving a copy for myself), I created my own Excel file that already has all the log file fields declared in it. So you can just copy and paste from this spreadsheet into your log file once it's loaded into Excel. For the remainder of this article, I'm going to assume this is what you've done.

With our log file headers, we can now start filtering the data. If we’re looking for errors, the first thing we’ll want to do is open up a log file and filter based on “request status”. To do this, select the “Data” tab and click on “filter”. This allows us to click on the various column headings and filter down what we’re looking at. The shot below shows a log that had a couple of errors in it. So I can easily remove the checkbox on “Success” to drill into those specific errors. This is handy if we want to know exactly what happened as the log also contains a “request-id-header” field. With that value, we can open up a support ticket and ask them to dig into the issue more deeply.

Now this is the first real caution I have. Between the metrics and the logs, we can get a really good idea of what types of errors are happening. But this doesn't mean that every error should be investigated. With cloud computing solutions, there's a certain amount of "transient" error that is simply to be expected. It's only if you see a prolonged or persistent issue that you'd really want to dig into the errors in any real depth. One key indicator is to look at the logging metrics and keep an eye on the availability. If it falls below 99.9%, that means there may have been an SLA violation for the storage account. In that case, I'd take a look at the logs for that period and see what types of errors we saw. As long as the issue wasn't caused by a spike in throttling (meaning we overloaded the system), there may be something worth having support look into. But if we're at 99.999%, with the occasional network failure, timeout, or 'client other', we're likely just seeing the "noise" one would expect from transient errors as the system adjusts and compensates for changes to its underlying fabric.

Now since we're doing benchmarking tests, there's one other key thing I look at: the number of operations occurring on the blobs that back the various disks mounted into our virtual machine. This is another task where Excel can help out, by adding subtotals. Adding subtotals requires column headings, so this is the part where you go "thank you Brent for making it so I just need to copy those in". You're welcome. :)

The field we want to look at in the logs for our subtotal is the "requested-object-key" field. This value is the specific object in the storage account that was being accessed (aka the blob file, which in our case is the disk). Going again to the Data tab in Excel, we'll select "subtotal" and complete the dialog box as shown at the left. This will create subtotals by object (disk) and allow us to see the count of operations against that object. So what we have is the operations performed on that disk during the time period covered by the log file. Using that value, we can then get a fairly good approximation of the "transactions per second" that the disk is generating against storage.
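And if you'd rather skip Excel entirely, the same subtotal is easy to approximate in code. This is a rough sketch only: it assumes the standard v1.0 Storage Analytics log format (semi-colon delimited, with requested-object-key as the 13th field), a hypothetical local folder holding the log files for the hour in question, and a naive split that ignores the possibility of semi-colons inside quoted fields.

using System;
using System.IO;
using System.Linq;

class LogSubtotals
{
    static void Main()
    {
        // hypothetical folder containing the downloaded log files for a single hour
        var lines = Directory.EnumerateFiles(@"C:\StorageLogs\Hour14", "*.log")
                             .SelectMany(File.ReadLines);

        // group the log entries by requested-object-key (field index 12), i.e. by disk blob
        var operationsPerDisk = lines
            .Select(line => line.Split(';'))
            .Where(fields => fields.Length > 12)
            .GroupBy(fields => fields[12])
            .OrderByDescending(g => g.Count());

        foreach (var disk in operationsPerDisk)
        {
            // operations over the hour divided by 3600 seconds approximates transactions per second
            Console.WriteLine("{0} : {1} operations (~{2:F1} tps)",
                disk.Key, disk.Count(), disk.Count() / 3600.0);
        }
    }
}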

So what did we learn?

If you are doing IO benchmarking of the virtual machine (as I was), you may notice something odd. We observed that our Linux/Oracle VM was reporting IOPS far above what we saw at the Windows Azure Storage level. This is to be expected because Oracle tries to buffer requests itself to increase performance. Add in any disk buffering we may have enabled, and the numbers could skew even further. Ultimately though, what we did establish during our testing was that we knew for certain when we were overloading the Windows Azure storage sub-system and contributing to server slowdowns that way. We were also able to observe several small instances where Oracle performance trailed off somewhat, and these were due to isolated incidents where we saw an increase in various errors or in end to end operation latency.

The net result here is that while virtual machine performance is related to the performance of the underlying storage subsystem, there's no easy 1-to-1 relation between errors in one and issues in the other. Additionally, as you watch these over time, you come to understand why virtual machine disk performance can vary over time and shouldn't be compared to the behaviors we've come to expect from a physical disk drive. We also learned what we need to do to more effectively monitor Windows Azure storage so that we can proactively take action to address potential customer-facing impacts.

I apologize for not going into more depth on this subject. I just wanted to get this all out before it faded into the fog of my memory. Hopefully you find it useful. So until next time!

Windows Azure TechEd Challenge – Wednesday Cheats (Part 1)!

Hey all! Sorry I missed yesterday's update of the cheats. The booth was busy enough that I ended up working double shifts, and then I had some technical issues (stupid failed power supply) that prevented me from getting the next update out. But I'm back this morning for another update, and I'm preparing it from my "meeting" device, aka a Surface RT. One item that did come up yesterday was the realization that you can complete the entirety of the IT Pro Windows Azure Challenge from a Surface RT device! Really! So I'm going to walk through IT Pro challenges 3-5 entirely on my Surface RT!

IT Pro Challenge 3 – Create an image from a VM and redeploy with HA

So last time we created and configured a Server 2012 virtual machine and installed IIS into it. This time we need to capture that server as an image and deploy a second, load balanced copy taking advantage of high availability.

Now the first step is to create an image from our running virtual machine. The detailed steps for this are outlined in this MSDN article. For this we'll need to RDP into the virtual machine we created in Challenge 2, then sysprep it. Since Surface RT includes a Remote Desktop app, I can do this without switching to a "normal workstation".

Once we’ve issued the request to sysprep the virtual machine, we can log off and just watch the management portal until the virtual machine enters a stopped state.

*coffee time*

Now that the machine has stopped, we can capture a copy of it by selecting it like we did yesterday (select the row, don't go into the detail view) and selecting "Capture" from the menu at the bottom of the page (which should now be enabled; if it isn't, make sure the machine has stopped). When you click the capture option, you'll get a pop-up like the one to the left. Fill it out and click the check mark to complete the operation.

You'll see the status of the virtual machine change to "Registering" as we capture the image. When complete, the virtual machine itself will disappear, and we'll now see it under the list of images with the name we provided.

Now we’ll start by deploying the first virtual machine from this base image. Using the toolbar at the bottom, click the “+” sign then select Compute -> Virtual Machine -> From Gallery as illustrated below:

This will result in a dialog box where we can select to create a VM from "My Images" and then define the size and location of our VM as we did when we initially deployed it. We'll select to deploy this as a 'stand alone' machine, but this time we want to make sure we create an availability set. Availability sets are a way to tell the Windows Azure fabric to distribute virtual machines in such a way as to minimize the risk of simultaneous downtime; think of it as spreading the machines out over multiple physical locations within the datacenter, for lack of a better explanation. You can enable remote powershell if you like (it's not necessary for the TechEd Challenge), but make sure you define the machine as part of an availability set.

Once the first machine has finished deploying, create a second, but this time instead of a "stand alone" machine, we need to attach this one to the first.

And make sure it’s part of the same availability set. The new machine will be provisioned, and we’re almost there.

Once both machines are running, we can now set up load balanced endpoints. Endpoints in Windows Azure are important because they control what ports the outside world can connect to our virtual machines on. If you select one of the virtual machines and view the details (what we’ve been avoiding so far), and then click on “Endpoints”, we’ll see one, perhaps two endpoints already declared: RDP and remote powershell.

By defining these endpoints, we’ve told Windows Azure to allow traffic directly to this virtual machine on these ports and over a specific protocol.

To create an endpoint on our first server, select the "Add" option in the bottom toolbar and follow the instructions in the dialog box. So let's create a new endpoint (port 80 for argument's sake) and make sure both machines have it and that Windows Azure knows to load balance the traffic. For the first server, I clicked "add endpoint", then created one called "webdefault" and used port 80 for both the public and private ports. Once created, we should be able to open up a browser window, enter the address of our service, and see the default IIS display screen.

Next we’ll select our second virtual machine, and create an endpoint on it. The only real difference is that this time we’ll select to “Load-Balance Traffic on an existing endpoint” as shown below.

Once both have been created, you've completed IT Pro Challenge 3 and are ready to receive your Windows Azure earbuds!

So I have to run to a session. But I’ll try to be back this afternoon with more cheats so you can get your Windows Azure swag. Until then!

Windows Azure TechEd Challenge – Monday Cheats!

For Microsoft's TechEd North America event, we're running a promotion at the booth. This promotion asks you to walk through a series of "tasks", and once the tasks have been validated by one of our booth staff members, you're eligible to win a prize. Now we've had some great interest in this, but sadly many folks are still somewhat… intimidated by all the options that exist in the platform.

Now maybe I'm just an enabler, but I don't like denying folks goodies. So to that end, I'm going to be producing a series of blog posts that walk through these tasks in clear (and hopefully easily repeatable) steps for even the newest Windows Azure explorers.

As the first post in this series, I'm going to cover the first three tasks for both Developers and IT Pros. As a self-proclaimed "code monkey", I'm going to let the developers go first.

Developer Challenge 1 – Activate your MSDN Benefit Trial

Of course, I can’t make this walkthrough as detailed as I would have liked because I only have one MSDN subscription and I’ve already activated the MSDN benefits for it. But hopefully this will be enough to get you your free Windows Azure Water Bottle!

Start by going to http://msdn.microsoft.com and logging in using your Microsoft Account (formerly “Live ID”). Once logged in, you’ll see the main landing page and off to the right, you’ll see a link for “Access Benefits”. I’ve highlighted this in the following picture.

Now if you’re paying attention, you’ll also see that you can click on the “Activate Now” in the big banner. Alternatively, if this is all too difficult, just shortcut all of this and head to: http://www.windowsazure.com/en-us/campaigns/car/
:) I can't make it much easier.

Once clicked, just follow the prompts to complete setting up your MSDN subscription. You'll be asked for a credit card (we put a spending limit on the account to prevent any surprise charges) and likely a phone number to help with activation (it's a fraud prevention measure, sorry). But if all goes well, this should be a fairly painless process. And in about 5 minutes, you'll hopefully have this task complete and be eligible for your first Windows Azure challenge prize! A well-deserved pat on the back.

Developer Challenge 2 – Create/deploy a Windows Server 2012 VM

Honestly, this one is super easy! And fortunately I can get really detailed on this one. Start by logging into the Windows Azure Management portal at https://manage.windowsazure.com/ using the Microsoft Account that has your subscription. Once logged on, scroll to the bottom of the page and click on the big “+ New” link.

Once clicked, a menu will "slide up" from the bottom, and you'll navigate through this menu as follows: Compute -> Virtual Machine -> Quick Create (see screen shot).

This will result in a dialog box that will ask you to select the type of machine you’re wanting to deploy. There’s a couple things you’ll want to know:

  • The DNS Name must be globally unique. This name is used by Windows Azure to help route requests directed at your machine to the proper datacenter, and then within the datacenter to the proper virtual machine. So it can't be a name that is already in use.
  • Make sure you select the "Windows Server 2012" image (that's what the task calls for).
  • For the Size, select "extra small". This size may not be very powerful, but it minimizes the amount of your MSDN benefit you'll use.
  • The username needs to be something besides "admin", so pick something that's easy for you to remember but also fairly secure (sorry, I've blurred mine; I'm paranoid that way).
  • Use a STRONG password. You don't want to try to validate your solution only to find it's been hacked because your password was "p4ssW0rd!".
  • Pick a Location that's convenient. Only locations that can host Windows Azure Virtual Machines will be visible.

That’s all there is to it!

Developer Challenge 3 – Deploy an ASP.NET application and database

Ok, a bit trickier this time. We're going to make you work for those earbuds! There are multiple steps that need to be done to make this one work properly. So be prepared. The outline is…

  • Install the Windows Azure SDK
  • Create the Windows Azure environment (where our Web Site will be hosted)
  • Create an application to be deployed (in Visual Studio)
  • Deploy the application (requires importing our Windows Azure publishing profile)
  • Add a database & data model
  • Create a data deployment script (used to update the cloud database)
  • Publish our changes from Visual Studio to Windows Azure

Now admittedly, these steps are enough for an entire article. And fortunately, there's already one written over on MSDN. You don't need to complete the "o-auth" portions of that article for this task. But if you have an extra 15-20 minutes, I'd highly recommend you make the investment.

IT Pro Challenge 1 – Activate a Free Trial

Now enough developer stuff for today. Let’s look at the IT Pro side and get you setup for a free 1-month trial!

Like claiming MSDN benefits, this is super easy! Head over to http://www.windowsazure.com and click on the "Free Trial" link in the upper right (see image). If that's too painful, then just click here! You'll sign in with a Microsoft Account, provide the appropriate information (credit card, phone number, etc…), and the account will be set up pretty quickly. Again, there's a spending cap in place, so you shouldn't have to worry about any surprise bills.

With this step complete, you’ve qualified for your “Cloud Pro” badge. :)

IT Pro Challenge 2 – Create/deploy a Windows Server Virtual Machine with IIS and a data disk

Ok, the developers had it easy on this one. They only had to do one of the three things you IT Pros are being asked for. But hey, they're "just devs" (just kidding, my fellow code monkeys). So let's start by creating the virtual machine just as I discussed in Developer Challenge 2. Get this provisioned, go catch a TechEd session, or maybe just get something to drink while you give your virtual machine 10-15 minutes to be built and started.

*insert commercial break here*

Hey there! Glad you could join us again! Time to remote into our virtual machine and get it customized. Let's get logged back into the management portal at http://manage.windowsazure.com and then select the "Virtual Machines" group found in the left hand column. Then select the virtual machine we just deployed. Now be sure to select it, don't open it (I know, this is a bit confusing). I find the easiest way to do this is to not click on the server name, but instead on its status. This selects the row for me, without opening the server details screen.

Now the reason we want the row is that when we select it, the "tool bar" across the bottom of the screen updates. What we're most interested in is the "CONNECT" option. Clicking this button will open a new browser window and prompt us to open an RDP (remote desktop) file so we can connect to our virtual machine.

So click away! We'll likely be prompted with an "unknown publisher" warning; go ahead and click through that. Then when prompted for login credentials, use the admin user we defined when we created the virtual machine. This may require you to click on "other user" so you can enter the machine name (as the domain) and userid for the administrator account you provided. After this, the connection should get secured, and you may get prompted with a certificate error. Feel free to click through that one as well.

Now the task calls for IIS, but in reality we're not validating this. But you're an IT pro, so I expect that once you're in an RDP session as an administrator, adding IIS should be a simple process. I'm a dev and I managed it. Here's the short of it (providing you used the Server 2012 quick create image):

  • Launch Server Manager by clicking on the icon in the bottom left corner
  • Click on "add roles and features"
  • Click "next" to get through the "before you begin" page
  • Select a "role-based or feature-based installation"
  • Select "server roles" in the left menu
  • Scroll down until you find "Web Server (IIS)" and select it
  • Confirm "add features" and click next (a few times)
  • Finally, click "install"

The installation will run for a few minutes (depending on the size of the VM). So feel free to kick this off and go get a beverage refill.

Now we get to add a data disk to our virtual machine. This allows us to put more disks into the VM and keep them separated from the OS disk. Like before, we need to select the virtual machine without going into its detail view. But this time, instead of clicking on "connect", we're going to select the "attach" option as highlighted below.

In particular, we’re going to select “Attach empty disk”. We’ll then select a storage account for that disk (by default, it will be placed into the same location as your OS disk), a size (up to 1 terabyte), and a cache preference. For the sake of your challenge, these options don’t much matter. So set a few options, and click the check mark to finalize this.

Now that we've created the disk and attached it to the VM, we just need to remote back into the machine and add the new disk via Server Manager. So once again we go back into Server Manager in our virtual machine, select File and Storage Services, then Disks, and select our virtual machine. We'll see our new drive in the list (I've highlighted mine in the screen shot below); right-click on the drive and select "initialize".

Add a volume if you so desire, but at this point we're just trying to make sure there's a disk so we can win our next piece of swag. Just be prepared to come by the booth with an RDP session open so you can show us that you have the disk attached to the VM. :)

BTW, even though I created a 500GB disk, I won't pay more than a few pennies, because the only thing stored on my new disk is the file system metadata. Just don't do a low level format or you'll be paying for the full 500GB.

Free Windows Azure water bottle unlocked!

Next time

So this concludes the Monday TechEd North America 2013 edition of our challenge. Tomorrow I’m going to try and publish another step for each of these so you can get your really sweet remote control mini-Coopers! But for now, you have no excuse not to get a water bottle, some earbuds, and a nice “atta boy”.

Enjoy!

IaaS – The changing face of Windows Azure

I need to preface this post by saying I should not be considered, by any stretch of the imagination, a "network guy". I know JUST enough to plug in an Ethernet cable, not to fall for the old "the token fell out of the network ring" gag, and to tracert my way through connectivity issues, thanks mainly to my past overindulgence in online role playing games.

In June of 2012, we announced that we would be adding Infrastructure as a Service (IaaS) features to the Windows Azure Platform. While many believe that Platform as a Service (PaaS) is still the ultimate "sweet spot" with regards to cost/benefit ratios, the reality is that PaaS adoption is… well… challenging. After 25+ years of buying, installing, configuring, and maintaining hardware, nearly everyone in the IT industry tends to think in terms of servers, both physical and virtual. So the idea of having applications and data float around within a datacenter, not tied to specific locations, is just alien to many. This created a barrier to the adoption of PaaS, a barrier that we are hoping our IaaS services will help bridge (I'm not sure about "bridging barriers" as a metaphor, since I always visualize barriers as those concrete fence things on the side of highway construction sites, but we'll just go with it).

Unfortunately, there's still a lot of confusion about what our IaaS solution is and how to work with it. Over the last few months, I've run into this several times with partners, so I wanted to pull together some of my learnings into a single blog post. As much for my own personal reference as to be able to easily share it with all of you.

Some terminology

So I’d like to start by explaining a few terms as they are used within the Windows Azure Platform…

Cloud Service – This is a collection of virtual machines (either PaaS role instances or IaaS virtual machines) representing an isolation boundary that contains computational workloads. A Cloud Service can contain either PaaS compute instances or IaaS Virtual Machines, but not both. (UPDATE 4/16/2013: An IaaS VM hosting Cloud Service will only appear in the cloud services tab of the management portal after a second VM has been added to it. Once visible, it will remain so until it is deleted.)

Availability Set – For PaaS solutions, the Windows Azure Fabric already knows to distribute the same workload across different physical hardware within the datacenter. But for IaaS, I need to tell it to do this with the specific virtual machines I’m creating. We do this by placing the virtual machines into an availability set.

Virtual Network – Because addressability to the PaaS or IaaS instances within Cloud Services is limited to only those ports that you declare (by configuring endpoints), it’s sometimes helpful to have a way to create bridges between those boundaries or even between them and on-premises networks. This is where Windows Azure Virtual Networks come into play.

The reason these items are important is that in Windows Azure you’re going to use them to define your solution. Each piece represents a way to group, or arrange resources and how they can be addressed.

You control the infrastructure, mostly…

Platform as a Service, or PaaS, handles a lot for you (no surprise, as that's part of the value proposition). But in Infrastructure as a Service, IaaS, you take on some of that responsibility. The problem is that we are used to taking care of traditional datacenter deployments and either a) don't understand what IaaS still does for us or b) just aren't sure how this cloud stuff is supposed to be used. So we, through no fault of our own, try to do things the way we always have. And who could really blame us?

So let's start with what Windows Azure IaaS still does for you. It obviously handles the physical hardware and hypervisor management. This includes provisioning the locations for our Virtual Machines, getting them deployed, and of course moving them around the datacenter in the case of a hardware failure or host OS (the server that's hosting our virtual machine) upgrades. The Azure Fabric, our secret sauce as it were, also controls basic datacenter firewall configuration (what ports are exposed to the internet), load balancing, and addressability/visibility isolation (that Cloud Service thing I was talking about). This covers everything right up to the virtual machine itself. But that's not where it stops. To help secure Windows Azure, we control how all the virtual machines talk to our network. This means that the Azure Fabric also has control of the virtual NIC that is installed into your VMs!

Now the reason this is important is that there are some things you'd normally do if you were creating a network in a traditional datacenter, like providing fixed IPs to the servers so you can easily do name resolution. Fixed IPs in a cloud environment are generally a bad idea, especially if that cloud is built on the concept of having the flexibility to move stuff around the datacenter for you when it needs to. And when this happens in Windows Azure, it's pretty much assured that the virtual NIC will get torn down and rebuilt, losing any customizations you made to it in the process. This is also a frequent cause of folks losing the ability to connect to their VMs (something that's usually fixable by re-sizing/kicking the VM via the management portal). It also highlights one key, but not often thought of, feature that Windows Azure provides for you: server name resolution.

Virtual Machine Name Resolution

The link I just dropped does a pretty good job of explaining what's available to you with Windows Azure. You can either let Windows Azure do it for you and leverage the names you provided for the virtual machines when you created them, or you can use Virtual Networking to bring your own DNS. Both work well, so it's really a matter of selecting the right option. The primary constraint is that the Windows Azure provided name resolution will only work for virtual machines (be they IaaS machines or PaaS role instances) hosted in Windows Azure. If you need to provide name resolution between cloud and on-premises, you're likely going to want to use your own DNS server.

The key here again is to not hardcode IP addresses. Pick the appropriate solution and let it do the work for you.

Load Balanced Servers

The next big task is how to load balance virtual machines in IaaS. For the most part, this isn't really any different than how you'd do it for PaaS Cloud Services: create the VM and "attach" it to an existing virtual machine (this places both virtual machines within the same cloud service). Then, as long as both machines are watching the same ports, the traffic will be balanced between the two by the Windows Azure Fabric.

If you’re using the portal to create the VM, you’ll need to make sure you use the “create from gallery” option and not quick create. Then as you progress through the wizard, you’ll hit the step where it asks you if you want to join the new virtual machine to an existing virtual machine or leave it as standalone.

Now once they are both part of the same cloud service, we simply edit the available endpoints. In the management portal, you'll select a Virtual Machine, and either add or edit the endpoint using the tools menu across the bottom. Then you set the endpoint attributes manually (if it's a new endpoint that's not already load balanced), or choose to load balance it with a previously defined endpoint. Easy-peasy. :)

High Availability

Now that we have load balanced endpoints, the next step is to make sure that if one of our load balanced virtual machines goes offline (say for a host OS upgrade or a hardware failure), the service doesn't become entirely unavailable. In Windows Azure Cloud Services, the Fabric automatically distributes the running instances across multiple fault domains. To put it simply, fault domains help ensure that workloads are spread across multiple pieces of hardware; this way, if there is a hardware failure on a 'rack', it won't take down both machines. When working with IaaS, we still have this as an option, but we need to tell the Azure Fabric that we want to take advantage of it by placing our virtual machines into an Availability Set so the Azure Fabric knows it should distribute them.

You can configure a virtual machine that's already deployed to join an Availability Set, or you can assign a new one to a set when you create/deploy it (providing you're not using Quick Create, which you hopefully aren't anyways because you can't place a quick-create VM into an existing cloud service). Both options work equally well, and we can create multiple Availability Sets within a Cloud Service.

Virtual Networks

So you might ask: this is all fine and dandy if the virtual machines are deployed as part of a single cloud service. But I can't combine PaaS and IaaS into a single cloud service, and I also can't do direct machine addressing if the machine I'm connecting to exists in another cloud service, or even on-premises. So how do I fix that? The answer is Windows Azure Virtual Networks.

In Windows Azure, the Cloud Service is an isolation boundary, fronted by a gatekeeper layer that serves as a combination load balancer and NAT. The items inside the cloud service can address each other directly, and any communication that comes in from outside of the cloud service boundary has to come through the gatekeeper. Think of the cloud service as a private network branch. This is good because it provides a certain level of network security, but bad in that we now have challenges if we're trying to communicate across the boundary.

Virtual Networks allow you to join resources across cloud service boundaries or, by leveraging an on-premises VPN gateway, to join cloud services and on-premises networks. They act as a bridge across the isolation boundaries, enabling direct addressability (providing there's appropriate name resolution) without the need to publicly expose the individual servers/instances to the internet.

Bringing it all together

So if we bring this all together, we now have a way to create complex solutions that mix and match different compute resources (we cannot currently join things like Service Bus, Azure Storage, etc… via Virtual Network). One such example might be the following diagram…

A single Windows Azure Virtual Network that combines an on-premises server, a PaaS Cloud Service, and both singular and load balanced virtual machines. Now I can't really speculate on where this could go next, but I think we have a fairly solid starting point for some exciting scenarios. And if we do for IaaS what we've done for the PaaS offering over the last few years (continuing to improve the tooling, expanding the feature set, and generally just making things more awesome), I think there's a very bright future here.

But enough chest thumping/flag waving. Like many topics here, I created this to help me better understand these capabilities and hopefully some of you may benefit from it as well. If not, I’ll at least share with you a few links I found handy:

Mike Washam – Windows Azure Virtual Machines

MSDN – Windows Azure Name Resolution

WindowsAzure.com – Load Balancing Virtual Machines

WindowsAzure.com – Manage the Availability of Virtual Machines

Until next time!

Windows Azure Web Sites – Quotas, Scaling, and Pricing

It hasn't been easy making the transition from consultant to someone who, for lack of a better explanation, is a cross between pre-sales and technical support. But I've come to love two aspects of this job. First off, I get to talk to many different people, and I'm constantly learning as much from their questions as I'm helping teach them about the platform. Secondly, when not talking with partners about the platform, I'm digging up answers to questions. This gives me the perfect excuse… er… reason to dig into some of the features and learn more about them. I had to do this as a consultant too, but the issue there is that since I'd be asked to do it by paying clients, they would own the results. Now that I do this work on behalf of Microsoft, it's much easier to share these findings with the community (providing it doesn't violate non-disclosure agreements, of course). And since this blog has always been a way for me to document things so I can refer back to them, it's a perfect opportunity to start sharing this research.

Today's topic is Windows Azure Web Sites quotas and pricing. Currently we (Microsoft) don't, IMHO, do a very good job of making this information really clear. Some of it is available over on the pricing page, but for the rest you've got to dig it out of blog posts or from the Web Site dashboard's usage overview in the management portal. So I decided it was time to consolidate a few things.

Usage Quotas

A key aspect of using any service is understanding its limits. And nowhere is this truer than in the often complex/confusing world of cloud computing services. But when someone slaps a "free" in front of a service, we tend to forget this. Well, here I am to remind you. Windows Azure Web Sites has several dials that we need to be aware of when selecting the level/scale of our sites (Free, Shared, or Reserved).

File System/Storage: This is the total amount of space you have to store your site and content. There’s no timeframe on this one. If you reach the quota limit, you simply can’t write any new content to the system.

Egress Bandwidth: This is the amount of content that is served up by your web site. If you exceed this quota, your site will be temporarily suspended (no further requests) until the quota timeframe (1 day) resets.

CPU Time: This is the amount of time that is spent processing requests for your web site. Like the bandwidth quota, if you exceed the quota, your site will be temporarily suspended until the quota timeframe resets. There are two quota timeframes, a 5 minute limit, and a daily limit.

Memory: This is the amount of RAM that the site can use at one time (there's no timeframe). If you exceed the quota, a long running or abusive process will be terminated. And if this occurs often enough, your site may be suspended. Which is pretty good encouragement to rethink that process.

Database: There's also up to 20MB of database support for your related database (MySQL or Windows Azure SQL Database currently). I can't find any details, but I'm hoping/guessing this will work much like the File Storage quota.

Now for the real meat of this. What are the quotas for each tier? For that I’ve created the following table.

File Storage – Free tier: 1024MB shared across all your free sites; Shared tier: 1024MB per web site; Reserved tier: 10GB (across up to 100 sites)

Egress Bandwidth – Free tier: 165MB/day per datacenter, 5GB per region; Shared and Reserved tiers: pay as you go, not included in the base price

CPU Time – Free tier: 1 hour/day, 2.5 minutes of every 5; Shared tier: 4 hours/day, 2.5 minutes of every 5; Reserved tier: N/A

Memory – Free tier: 1024MB/hr; Shared tier: 512MB/hr; Reserved tier: N/A

Database – Free tier: 20MB; Shared tier: 20MB; Reserved tier: N/A

Now there's an important but slightly confusing "but" to the free tier. At that level, you get a daily egress bandwidth quota per sub-region (aka datacenter), but there's also a regional (US, EU, Asia) limit of 5GB. The regional limit is the sum total of all the web sites you're hosting and is shared with any other services. So if you're also using Blob storage to serve up images for your site, that will count against your "free" 5GB. When you move to the shared/reserved tier, there's no limit, but you pay for every gigabyte that leaves the datacenter.

Monitoring Usage

Now the next logical question is how you monitor the resources your sites are using. Fortunately, the most recent update to the Windows Azure portal has a dashboard that provides a quick glance at how much of each quota you're using. This displays just below the usage grid on the "Dashboard" panel of the web site.

At a glance you can tell where you are on any quota, which also makes it convenient to predict your usage. Run some common scenarios, see what they do to your numbers, and extrapolate from there.

You can also configure the site for diagnostics (again via the management portal). This allows you to take the various performance indicators and save them to Windows Azure Storage. From there you can download the files and set up automated monitors to alert you to problems. Just keep in mind that turning this on will consume resources and incur additional charges.

Fortunately, there’s a pretty good article we’ve published on Monitoring Windows Azure Web Sites.

Scaling & Pricing

Now that we’ve covered your usage quotas and how to monitor your usage, it’s important to understand how we can scale the capacity of our web sites and the impact this has on pricing.

Scaling our web site is pretty straightforward. We can go from the Free tier, to Shared, to Reserved using the management portal. Select the web site, click on the level, and then save to "scale" your site. But before you do that, you will want to understand the pricing impacts.

At the Free tier, we get up to 10 web sites. When we move a web site to Shared, we will pay $0.02 per hour for each web site (at general availability). At this point, I can mix and match free (10 per sub-region/datacenter) and shared (100 per sub-region/datacenter) web sites. But things get a bit trickier when we move to Reserved. A reserved web site is a dedicated virtual machine for your needs. When you move a web site within a region to the reserved tier, all web sites in that same sub-region/datacenter (up to the limit of 100) will also be moved to reserved.

Now this might seem a bit confusing until you realize that at the reserved tier, you’re paying for the virtual machine and not an individual web site. So it makes sense to have all your sites hosted on that instance, maximizing your investment. Furthermore, if you are running enough shared tier web sites, it may be more cost effective to run them as reserved.

Back to scaling: if you scale back down to the free or shared tiers, the other sites will revert back to their old states. For example, let's assume you have two web sites: one at the free tier, one at the shared tier. I scale the free web site up to reserved, and now both sites are reserved. If I scale the original free tier site back to free, the other site returns to shared. If I opted to scale the original shared site back to shared or free, then the original free site returns to its previous free tier. So it's important when dealing with reserved sites that you remember what tier they were at previously.

The tiers are not our only option for scaling our web sites. We also have a slider labelled "instance count" if we are running a Shared or Reserved site. When running at the shared tier, this slider changes the number of processing threads that are servicing the web site, allowing us between 1 and 6 threads. Of course, if we increase the threads, there's a greater risk of hitting our CPU usage quota. But this adjustment could come in real handy if we're experiencing a short term spike in traffic. Running at the reserved tier, the slider increases the number of virtual machine instances we run (and subsequently our cost). This option allows us to run up to 10 reserved instances.

Also at the reserved tier, we can increase the size of our virtual machine. By default, our reserved instance will be a "small", giving us a single CPU core and 1.75GB of memory at a cost of $0.12/hr. We can increase the size to "Medium" and even "Large", with each size increase doubling our resources and the price per hour ($0.24 and $0.48 respectively). This cost is per virtual machine instance, so if I have opted to run 3 instances, I take my cost per hour for the size and multiply it by 3 (for example, three Medium instances would be 3 x $0.24, or $0.72 per hour).

So what’s next?

This pretty much hits the limits of what we can do with scaling web sites. But fortunately we're running on a platform that's built for scale. So it's just a hop, skip, and jump from Web Sites to Windows Azure Cloud Services (Platform as a Service) or Windows Azure Virtual Machines (Infrastructure as a Service). But that's an article for another day. :)

BUILD 2012 – Not just for Windows anymore

Last week marked the second BUILD conference. In 2011, BUILD replaced the Microsoft PDC conference in an event that was so heavily Windows 8 focused that it was even hosted at buildwindows.com. While the URL didn't change for 2012, the focus sure did, as this event also marked the latest round of big release news for Windows Azure. In this post (which I'm publishing directly from MS Word 2013 btw), I'm going to give a quick rundown of the Windows Azure related announcements. Think of this as your Cliff Notes version of the conference.

Windows Azure Service Bus for Windows Server – V1 Released

Previously released as a beta/preview back in June, this on-premise flavor of the Windows Azure Service bus is now fully released and available for download. Admittedly, it’s strictly for brokered messaging for now. But it’s still a substantial step towards providing feature parity between public and private cloud solutions. Now we just need to hope that shops that opt to run this will run it as internal SaaS and not set up multiple silos. Don’t get me wrong. It’s nice to know we have the flexibility to do silos, but I’m hoping we learn from what we’ve seen in the public cloud and don’t fall back to old patterns.

One thing to keep in mind with this… It's now possible for multiple versions of the Service Bus API to be running within an organization. To date, the public service has only had two major API versions. But going forward, we may need to be able to juggle even more. And while there will be a push to keep the hosted and on-premises versions at similar versions, there's nothing requiring someone hosting it on-premises to always upgrade to the latest version. So as solution developers/architects, we'll want to be prepared to be accommodating here.

Windows Azure Mobile Services – Windows Phone 8 Support

With Windows Phone 8 being formally launched the day before the BUILD conference, it only makes sense that we'd see related announcements. And a key one of those was the addition of Windows Phone 8 support to Windows Azure Mobile Services. This announcement makes Windows Phone 8 the third supported platform for Mobile Services (alongside Windows Store and iOS apps). This adds to an announcement earlier in the month which expanded support for items like sending email and using different identity providers. So the Mobile Services team is definitely burning the midnight oil to get new features out to this great platform.

New Windows Azure Storage Scalability Targets

New scale targets have been announced for storage accounts created after June 7th, 2012. This change has been enabled by the new "flat network" topology that's being deployed into the Windows Azure datacenters. In a nutshell, it allows the tps scale targets to be increased by 4x and the upper limit of a storage account to be raised to 200TB (2x). This new topology will continue to be rolled out through the end of the year but will only affect storage accounts created after that date, as mentioned above. These scale target improvements (which BTW are separate from the published Azure Storage SLA) will really help reduce the amount of 'sharding' that needs to be done for those with higher throughput requirements.

New 1.8 SDK – Windows Server 2012, .NET 4.5, and new Storage Client

BUILD also marked the launch of the new 1.8 Windows Azure SDK. This release is IMHO the most significant update to the SDK since the 1.3 version was launched almost 2 years ago. You could write a blog post on any one of the key features, but since they are all so closely related and this is supposed to be a highlight post, I'm going to bundle it up.

The new SDK introduces the new "OS Family 3" to Windows Azure Cloud Services, giving us support for Windows Server 2012. Now when you combine this with the added support for .NET 4.5 and IIS 8, we can start taking advantage of technology like Web Sockets. Unfortunately Web Sockets are not enabled by default, so there is some work you'll need to do to take advantage of them. You may also need to tweak the internal Windows Firewall. A few older Guest OS versions were also deprecated, so you may want to refer to the latest update of the compatibility matrix.

The single biggest, and subsequently most confusing, piece of this release has to do with the new 2.0 Storage Client. Now this update includes some great features, including support for a preview release of the storage client toolkit for Windows Runtime (Windows Store) apps. However, there are some SIGNIFICANT changes to the client, so I'd recommend you review the list of Breaking Changes and Known Issues before you decide to start converting over. Fortunately, all the new features are in a new set of namespaces (Microsoft.WindowsAzure.StorageClient has become simply Microsoft.WindowsAzure.Storage). So this does allow you to mix and match old functionality with the new. But forewarned is forearmed, as they say. So read up before you just dive into the new client headlong.

For more details on some of the known issues with this SDK and the workarounds, refer to the October 2012 release notes and you can learn about all the changes to the Visual Studio tools by checking out “What’s New in the Windows Azure Tools“.

HDInsight – Hadoop on Windows Azure

Technically, this was released the week before BUILD, but I’m going to touch on it none the less. A preview of HDInsight has been launched that allows you to help test out the new Apache™ Hadoop® on Windows Azure service. This will feature support for common frameworks such as Pig and Hive and it also includes a local developer installation of the HDInsight Server and SDK for writing jobs with .NET and Visual Studio.

It's exciting to see Microsoft embracing these highly popular open source initiatives. So if you're doing anything with big data, you may want to run over and check out the blog post for additional details.

Windows Azure – coming to China

Doug Hauger also announced that Microsoft has reached an agreement (Memorandum of Understanding, aka an agreement to start negotiations) which will license Windows Azure technologies to 21Vianet. This will in turn allow them to offer Windows Azure in China from local datacenters. While not yet a fully "done deal", it's a significant first step. So here's hoping the discussions are concluded quickly and that this is just the first of many such deals we'll see struck in the coming year. So all you Aussies, hold out hope! :)

Other news

This was just the beginning. The Windows Azure team ran down a slew of other slightly less high-profile but equally important announcements on the team blog. Items like a preview of the Windows Azure Store, GA (general availability) for the Windows Azure dedicated, distributed in-memory cache feature launched back in June with the 1.7 SDK, and finally the launch of the Visual Studio Team Foundation Service which has been in preview for the last year.

In closing…

All in all, it was a GREAT week in the cloud. Or as James Staten put it on ZDNet, “You’re running out of excuses to not try Microsoft Windows Azure“. And this has just been the highlights. If you’d like to learn more, I highly recommend you run over and check out the session recordings from BUILD 2012 or talk to your local Microsoft representative.

PS – Don’t forget to snag your own copy of the great new Windows Azure poster!

Local File Cache in Windows Azure

 

When creating a traditional on-premise application, it’s not uncommon to leverage the local file system as a place to store temporary files and thus increase system performance. But with Windows Azure Cloud Services, we’ve been taught that we shouldn’t write things to disk because the virtual machines that host our services aren’t durable. So we start going to remote durable storage for everything. This slows down our applications so we need to add back in some type of cache solution.

Previously, I discussed using the Windows Azure Caching Preview to create a distributed, in-memory cache. I love that we finally have a simple way to do this. But there are times when I think that caching something, for example an image file that doesn't change often, within a single instance would be fine, especially if I don't have to use up precious RAM on my virtual machines.

Well there is an option! Windows Azure Cloud Services all include, at no additional cost, an allocation of non-durable local disk space called, surprisingly enough, "Local Storage". For each core you get 250GB of essentially temporary disk space. And with a bit of investment, we can leverage that space as a local, file backed cache.

Extending System.Runtime.Caching

So .NET 4.0 introduced the System.Runtime.Caching namespace along with a template base class, ObjectCache, that can be extended to provide caching functionality with whatever storage system we want to use. Now this namespace also provides a concrete implementation called MemoryCache, but we want to use the file system. So we'll create our own implementation, a class called FileCache.

Note: There's already a codeplex project that provides a file based implementation of ObjectCache. But I still wanted to roll my own for the sake of explaining some of the challenges that will arise.

So I create a class library and add a reference to System.Runtime.Caching. Next up, let’s rename the default class “Class1.cs” to “FileCache.cs”. Lastly, inside of the FileCache class, I’ll add a using statement for the Caching namespace and make sure my new class inherits from ObjectCache.

Now if we tried to build the class library at this point, things wouldn't go very well because there are 18 different abstract members we need to implement. Fortunately I'm running the Visual Studio Power Tools, so it's just a matter of right-clicking on ObjectCache where I indicated I'm inheriting from it and selecting "Implement Abstract Class". This gives us shells for all 18 abstract members, but until we add some real implementation, our FileCache class won't even be minimally useful.

I’ll start by fleshing out the Get method and adding a public property, CacheRootPath, to the class that designates where our file cache will be kept.

public string CacheRootPath
{
    get { return cacheRoot.FullName; }
    set
    {
        cacheRoot = new DirectoryInfo(value);
        if (!cacheRoot.Exists) // create if it doesn't exist
            cacheRoot.Create();
    }
}

public override bool Contains(string key, string regionName = null)
{
    string fullFileName = GetItemFileName(key,regionName);
    FileInfo fileInfo = null;

    if (File.Exists(fullFileName))
    {
        fileInfo = new FileInfo(fullFileName);

        // if item has expired, don't return it
        //TODO: 
        return true;
    }
    else
        return false;
}

// return type is an object, but we'll always return a stream
public override object Get(string key, string regionName = null)
{
    if (Contains(key, regionName))
    {
        //TODO: wrap this in some exception handling
        MemoryStream memStream = new MemoryStream();
        FileStream fileStream = new FileStream(GetItemFileName(key, regionName), FileMode.Open);
        fileStream.CopyTo(memStream);
        fileStream.Close();

        return memStream;
    }
    else
        return null;
}

CacheRootPath is just a way for us to set the path to where our cache will be stored. The Contains method is a way to check and see if the file exists in the cache (and ideally should also be where we check to make sure the object isn’t expired), and the Get method leverages Contains to see if the item exists in the cache and retrieves it if it exists.

Now this is where I had my first real decision to make. Get must return an object, but what type of object should I return? In my case I opted to return a memory stream. Now I could have returned a file stream that was attached to the file on disk, but because this could lock access to the file, I wanted to have explicit control of that stream. Hence I opted to copy the file stream to a memory stream and return that to the caller.

You may also note that I left the expiration check alone. I did this for the demo because your needs for file expiration may differ. You could base this on FileInfo.CreationTimeUtc or FileInfo.LastAccessTimeUtc; both are valid, as may be any other metadata you need to base it on. I do recommend one thing: make a separate method that does the expiration check. We will use it later.

Note: I’m specifically calling out the use of UTC. When in Windows Azure, UTC is your friend. Try to use it whenever possible.
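By the way, you'll have noticed the methods above lean on a GetItemFileName helper that I haven't shown, and I just recommended a separate expiration check. Here's a minimal sketch of both; the one-hour, LastWriteTimeUtc-based expiration is purely an assumption for illustration, and a real implementation should sanitize or hash the key before using it as a file name.

// maps a cache key (and optional region) to a file under CacheRootPath
private string GetItemFileName(string key, string regionName = null)
{
    // NOTE: assumes the key is file-system safe; hash or sanitize it in real code
    string folder = (regionName == null)
        ? cacheRoot.FullName
        : Path.Combine(cacheRoot.FullName, regionName);

    return Path.Combine(folder, key);
}

// the separate expiration check mentioned above; swap in whatever policy fits your needs
private bool IsExpired(FileInfo fileInfo)
{
    return fileInfo.LastWriteTimeUtc < DateTime.UtcNow.AddHours(-1);
}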

Next up, we have to shell out the three overloaded versions of AddOrGetExisting. These methods are important because even though I won't be directly accessing them in my implementation, they are leveraged by the base class's Add method. And thus, these methods are how we add items into the cache. The first two overloads will call the lowest level implementation.

public override object AddOrGetExisting(string key, object value, CacheItemPolicy policy, string regionName = null)
{
    if (!(value is Stream))
        throw new ArgumentException("value parameter is not of type Stream");

    return this.AddOrGetExisting(key, value, policy.AbsoluteExpiration, regionName);
}

public override CacheItem AddOrGetExisting(CacheItem value, CacheItemPolicy policy)
{
    var tmpValue = this.AddOrGetExisting(value.Key, value.Value, policy.AbsoluteExpiration, value.RegionName);
    if (tmpValue != null)
        return new CacheItem(value.Key, (Stream)tmpValue);
    else
        return null;
}

The key item to note here is that in the first method, I do a check on the object to make sure I’m receiving a stream. Again, that was my design choice since I want to deal with the streams.

The final overload is where all the heavy work is…

public override object AddOrGetExisting(string key, object value, DateTimeOffset absoluteExpiration, string regionName = null)
{
    if (!(value is Stream))
        throw new ArgumentException("value parameter is not of type Stream");

    // if object exists, get it
    object tmpValue = this.Get(key, regionName);
    if (tmpValue != null)
        return tmpValue;
    else
    {
        //TODO: wrap this in some exception handling

        // create subfolder for region if it was specified
        if (regionName != null)
            cacheRoot.CreateSubdirectory(regionName);

        // add object to cache
        FileStream fileStream = File.Open(GetItemFileName(key, regionName), FileMode.Create);

        ((Stream)value).CopyTo(fileStream);
        fileStream.Flush();
        fileStream.Close();

        return null; // successfully added
    }
}

We start by checking to see if the object already exists and return it if found in the cache. Then we create a subdirectory if we have a region (region implementation isn't required). Finally, we copy the value passed in to our file and save it. There really should be some exception handling in here to make sure we're handling things in a way that's a little more thread safe (what if the file gets created between when we check for it and when we start the write?). And the Get should be checking to make sure the file isn't already open when doing its read. But I'm sure you can finish that out.
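Just to sketch what that might look like (and this is only a sketch, not a hardened implementation), the write portion of AddOrGetExisting could let the file system arbitrate the race and treat an IOException as "another request got there first":

// possible replacement for the write portion of AddOrGetExisting
try
{
    // FileMode.CreateNew fails if the file appeared between our Get() check and now
    using (var fileStream = new FileStream(GetItemFileName(key, regionName),
                                           FileMode.CreateNew, FileAccess.Write, FileShare.None))
    {
        ((Stream)value).CopyTo(fileStream);
        fileStream.Flush();
    }
}
catch (IOException)
{
    // someone else wrote the file first; hand back the existing cached copy instead
    return this.Get(key, regionName);
}

return null; // successfully added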

Now there’s still about a dozen other methods that need to be fleshed out eventually. But these give us our basic get and add functions. What’s still missing is handling evictions from the cache. For that we’re going to use a timer.

public FileCache() : base()
{
    System.Threading.TimerCallback TimerDelegate = new System.Threading.TimerCallback(TimerTask);

    // time values should be based on polling interval
    timerItem = new System.Threading.Timer(TimerDelegate, null, 2000, 2000);
}

private void TimerTask(object StateObj)
{
    // check file system for size and if over, remove older objects

    //TODO: check polling interval and update timer if it's changed
}

We'll update the FileCache constructor to create a delegate using our new TimerTask method and pass that into a Timer object. This will execute the TimerTask method at regular intervals on a separate thread. I'm using a hard-coded value, but we really should check to see if we have a specific polling interval set. Of course, we should also put some code into this method so it actually does things like check how much room we have in the cache and evict expired items (by checking via the private method I suggested earlier), etc…
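To make that a bit more concrete, here's a rough sketch of what the eviction pass could look like. The one-hour expiration (via the IsExpired helper from earlier) and the 100MB size cap are assumptions for illustration only, and I'm ignoring the locking/exception handling a real implementation would need:

private void TimerTask(object StateObj)
{
    // remove anything that has expired
    foreach (FileInfo file in cacheRoot.GetFiles("*", SearchOption.AllDirectories))
    {
        if (IsExpired(file))
            file.Delete();
    }

    // if we're still over our (assumed) 100MB cap, evict the least recently accessed items first
    long maxCacheBytes = 100 * 1024 * 1024;
    var files = cacheRoot.GetFiles("*", SearchOption.AllDirectories)
                         .OrderBy(f => f.LastAccessTimeUtc)
                         .ToList();

    long totalBytes = files.Sum(f => f.Length);
    foreach (FileInfo file in files)
    {
        if (totalBytes <= maxCacheBytes)
            break;

        totalBytes -= file.Length;
        file.Delete();
    }

    //TODO: check the polling interval setting and update the timer if it has changed
}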

The Implementation

With our custom caching class done (well, not done, but at least to a point where it's minimally functional), it's time to implement it. For this, I opted to set up an MVC Web Role that allows folks to upload an image file to Windows Azure Blob storage. Then, via a WCF/REST based service, it retrieves the image twice: the first retrieval without using caching, the second with caching. I won't bore you with all the details of this setup, so we'll focus on just the wiring up of our custom FileCache.

We start, appropriately enough, with the role’s Global.asax.cs file, where we add a public static field that represents our cache (so it’s available anywhere in the web application):

public static Caching.FileCache globalFileCache = new Caching.FileCache();

And then I update the Application_Start method to retrieve our LocalResource setting and use it to set the CacheRootPath property of our caching object.

protected void Application_Start()
{
    AreaRegistration.RegisterAllAreas();

    RegisterGlobalFilters(GlobalFilters.Filters);
    RegisterRoutes(RouteTable.Routes);

    Microsoft.WindowsAzure.CloudStorageAccount.SetConfigurationSettingPublisher(
        (configName, configSetter) =>
            configSetter(RoleEnvironment.GetConfigurationSettingValue(configName))
    );

    globalFileCache.CacheRootPath = RoleEnvironment.GetLocalResource("filecache").RootPath;
}

Now, ideally, we could make it so that CacheRootPath instead accepted the LocalResource object returned by GetLocalResource. That would also mean our FileCache could easily manage itself against the maximum size of the local storage resource. But I figured we’d keep any Windows Azure specific dependencies out of this base class and maybe later look at creating a WindowsAzureLocalResourceCache object. That’s a task for another day.
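For what it’s worth, that wrapper could end up being little more than a thin subclass that pulls both values off the LocalResource. This is just a sketch of the idea, not code from the sample, and the commented-out line assumes the base class would eventually expose a size limit property it doesn’t have yet.

using Microsoft.WindowsAzure.ServiceRuntime;

public class WindowsAzureLocalResourceCache : Caching.FileCache
{
    public WindowsAzureLocalResourceCache(LocalResource resource)
    {
        // point the cache at the role's local storage folder
        CacheRootPath = resource.RootPath;

        // the resource also knows its size, which could drive eviction
        // (assumes the base class grows something like MaxCacheSizeInBytes)
        //MaxCacheSizeInBytes = (long)resource.MaximumSizeInMegabytes * 1024 * 1024;
    }
}

// hypothetical usage in Application_Start:
// globalFileCache = new WindowsAzureLocalResourceCache(RoleEnvironment.GetLocalResource("filecache"));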

OK, now to wire the cache into the service that will retrieve the blobs. Let’s start with the basic implementation:

public Stream GetImage(string Name, string container, bool useCache)
{
    Stream tmpStream = null; // could end up being a filestream or a memory stream

    var account = CloudStorageAccount.FromConfigurationSetting("ImageStorage"); 
    CloudBlobClient blobStorage = account.CreateCloudBlobClient();
    CloudBlob blob = blobStorage.GetBlobReference(string.Format(@"{0}/{1}", container, Name));
    tmpStream = new MemoryStream();
    blob.DownloadToStream(tmpStream);

    WebOperationContext.Current.OutgoingResponse.ContentType = "image/jpeg";
    tmpStream.Seek(0, SeekOrigin.Begin); // make sure we start at the beginning
    return tmpStream;
}

This method takes the name of a blob and its container, as well as a useCache parameter (which we’ll implement in a moment). It uses the first two values to get the blob and download it to a stream which is then returned to the caller with a content type of “image/jpeg” so it can be rendered by the browser properly.

To implement our cache we just need to add a few things. Before we try to set up the CloudStorageAccount, we’ll add these lines:

// if we're using the cache, lets try to get the file from there
if (useCache)
    tmpStream = (Stream)MvcApplication.globalFileCache.Get(Name);

if (tmpStream == null)
{

This code uses the globalFileCache object we defined in the Global.asax.cs file to retrieve the blob from the cache if it exists, provided we called the method with useCache=true. If we couldn’t find the file (tmpStream == null), we fall into the block we had previously, which retrieves the blob image and returns it.

But we still have to add the code that puts the blob into the cache. We’ll do that right after the call to DownloadToStream:

    // "fork off" the adding of the object to the cache so we don't have to wait for this
    Task tsk = Task.Factory.StartNew(() =>
    {
        Stream saveStream = new MemoryStream();
        blob.DownloadToStream(saveStream);
        saveStream.Seek(0, SeekOrigin.Begin); // make sure we start at the beginning
        MvcApplication.globalFileCache.Add(Name, saveStream, new DateTimeOffset(DateTime.Now.AddHours(1)));
    });
}

This uses a Task to add the blob to the cache asynchronously, so we don’t block returning the blob to the requestor while the write to disk completes. We want this service to return the file as quickly as possible.
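Pulled together, the fully wired-up method ends up looking roughly like this. It’s just the snippets above assembled in one place, written against the older 1.x storage client the sample uses.

public Stream GetImage(string Name, string container, bool useCache)
{
    Stream tmpStream = null; // could end up being a FileStream or a MemoryStream

    // if we're using the cache, try to get the file from there first
    if (useCache)
        tmpStream = (Stream)MvcApplication.globalFileCache.Get(Name);

    if (tmpStream == null)
    {
        // cache miss (or cache bypassed): pull the blob down from storage
        var account = CloudStorageAccount.FromConfigurationSetting("ImageStorage");
        CloudBlobClient blobStorage = account.CreateCloudBlobClient();
        CloudBlob blob = blobStorage.GetBlobReference(string.Format(@"{0}/{1}", container, Name));
        tmpStream = new MemoryStream();
        blob.DownloadToStream(tmpStream);

        // "fork off" adding the object to the cache so the caller doesn't wait on the disk write
        Task tsk = Task.Factory.StartNew(() =>
        {
            Stream saveStream = new MemoryStream();
            blob.DownloadToStream(saveStream);
            saveStream.Seek(0, SeekOrigin.Begin);
            MvcApplication.globalFileCache.Add(Name, saveStream, new DateTimeOffset(DateTime.Now.AddHours(1)));
        });
    }

    WebOperationContext.Current.OutgoingResponse.ContentType = "image/jpeg";
    tmpStream.Seek(0, SeekOrigin.Begin); // make sure we start at the beginning
    return tmpStream;
}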

And that does it for our implementation. Now to testing it.

Fiddler is your friend

Earlier, you may have found yourself saying, “Self, why did he use a service for his implementation?” I did this because I wanted to use Fiddler to measure the performance of calls that retrieve the blob with and without caching. By putting it in a service and letting Fiddler monitor the response times, I didn’t have to write my own client and put timings around it.

To test my implementation, I fired up Fiddler and then launched the service. We should see calls in Fiddler to SimpleService.svc/GetImage, one with cache=false and one with cache=true. If we select those items and open the Statistics tab, we should see some significant differences in the “Overall Elapsed” times of the two calls. In my little tests, I was seeing anywhere from a 50–90% reduction in elapsed time.


In fact, if you run the tests several times by hitting refresh on the page, you may even notice that the first time you hit Windows Azure storage for a particular blob, there’s additional delay compared to subsequent calls. It’s only a guess, but we may be seeing Windows Azure storage doing some of its own internal caching there.

So hopefully I’ve described things well enough here that you can follow what we’ve done. But if not, I’m posting the code for you to reuse. Just make sure you update the storage account settings, and please, please, please finish the half-started implementation I’m providing you.

Here’s to speedy responses thanks to caching. Until next time.
