Azure Tenant Management

Hey everyone. I’m writing this post from Redmond, WA. I’m here on the Microsoft campus this week to meet with some of my colleagues and explore options for my first significant open source project effort. We’re here to spend three days trying to find ways to manage the allocation of, and monitor the billable usage of, Azure resources. There are two high level objectives for this effort:

  • Allow non-technical users to provision and access Azure resources without needing to understand Azure’s terminology, and without granting them direct access to the underlying Azure subscription
  • Monitor the billable usage of the allocated resources, and track against a “cap”

On the surface this seems pretty simple. That is, until you realize that not all Azure services expose individual resource usage, and of the ones that do, each does it in its own way. It gets even more complicated when you realize that you may need to do things like have users “share” cloud services and even storage accounts.

A couple of examples

So let’s dive into this a little more and explore a couple of the possible use cases I’ve heard from customers.

Journalism Students

Scenario: We have a series of journalism students that will be learning to provision and maintain an online content management system. They need someplace to host their website and upload content, and they need GIT integration so they can version any changes they make to their site’s code. The instructor for the course starts by creating an individual web site for each student with a basic CMS. The student will then access this workspace as they perform their coursework.

Challenges: Depending on the languages they are using, we may be able to get by with Azure Web Sites. But this only allows us up to 10 “free” web sites, so what happens to the other students? Additionally, students don’t know anything about the different SKUs available and just want things that work, so do we need to provide “warm-up” and auto-scaling? And since the instructor is setting up the web sites for them, we need a simple way for the instructor to get the resources provisioned and give the students access to them, without the instructor needing to even be aware of the Azure subscriptions. We also need to track compute and bandwidth usage on the individual web sites.

Software Testers

Scenario: A small company has remote testers that perform quality assurance work on software. These workers are distributed but need to remote into Windows VMs to run tests. Ideally, these VMs will be hosted “in the cloud”, and the company wants a simple façade whereby the workers can select which software they need to test and have a virtual machine provisioned for them. The projects being tested should be “billed back” for the resources used by the testers, and the testers work on multiple projects. Additionally, the testers should be able to focus on the work they have to do, not how to manage and provision Azure resources.

Challenges: This one will likely be Azure Virtual Machines. But we need to juggle not only compute/bandwidth, but track the impact on storage (transactions and total amount stored) as well. We also need to be able to provision VMs from a selection of customer gallery images and get them running for the testers, sometimes across subscriptions. Finally, we need to be aware of challenges with regards to VM endpoints and cloud services if we want to maximize the density of these VMs.

Business Intelligence

Scenario: Students are learning to use Oracle databases to analyze trends. The instructor is using the base Oracle database images from the Azure gallery but has added various tools and sample datasets to them for the students to use. The students will use the virtual machines for various labs over the duration of the course and each lab should only take a few hours.

Challenges: If these VMs were kept running 24×7, it would cost thousands of dollars per month per student. So we need to make sure we can automate the start and stop of the VMs to help control these costs. And since the Oracle licensing fees appear as a separate charge, we need to be able to predict these as well, based on current rates and the amount of time the VM was active.

So what are we going to do about it?

In short, my plan is to create a framework that we will release via open source to help fill some of these gaps: a simple, web based user interface for accessing your allocated resources, and some back end services that monitor resource usage and track it against quotas set by solution administrators. Underneath all that, a system that allows you to “tag” resources as associated with users or specific projects. If all goes well, I hope to have the first version of this framework published and available by the end of March 2015; it will focus on Azure Web Sites, IaaS VMs, and Azure Storage.

However, and this is the point of my little announcement post, we’re not going to make you wait until this is done. As this project progresses, I plan to regularly post here and in January we’ll hopefully have a GIT repository where you’ll be able to check out the work as we progress. Furthermore, I plan to actively work with organizations that want to use this solution so that our initial version will not be the only one.

So look for more on this in the next couple of weeks as we share our learnings and plans. But also, let me know via the comments if this is something you see value in and what scenario you may have. And oh, we still don’t have a name for the framework yet. So please post a comment or tweet @brentcodemonkey with your ideas. 🙂

Until next time!


Azure File Depot – The BlobWatcher

Recently I was taking a look at WebJobs, the new feature added to Windows Azure Web Sites that lets you run applications continuously, at intervals, or triggered by certain events (such as a new object in Azure storage). One of the questions that popped into my head was, how does the “binding to blobs” work? If I could find an answer to that, perhaps I could add that as a feature to the File Depot project.

After poking around a bit, I found that there’s not really anything magical to what WebJobs is doing. They are leveraging information that is available to you or me. In the case of Azure Storage blobs, when you create a blob binding for the job, WebJobs reaches into the storage account in question and turns on write logging for blobs in that account, giving those logs a 7 day retention period. It’s these logs that are scanned/monitored by WebJobs so that when new blobs arrive, it can trigger your job based on the bindings you’ve set up.

[Image: FileDepotBlobWatcher-EnableLogs]

 

So how are the logs scanned? Fortunately, Mike Stall has already published a great little write-up. In a nutshell, when a job is started, it does a full scan of the logs for all past data, then does incremental scans for new files. So armed with this information, I set out to create my own implementation of a blob detector, the Azure File Depot Blob Watcher!

Azure Storage Logs

Armed with the info from Mike’s post, the first step is to dig into the storage logs and figure out how they work. The storage team has a great post on using the logs and I recommend you take the time to give it a complete read. But here are the highlights…

When you enable logging, a new “$logs” container will be created. The blobs placed into this container are read only; you can read and delete them, but not alter them or their properties. The logs are buffered up internally and periodically ‘flushed’ into this container as individual blobs.

In Mike’s post, he mentions that there is latency (5-10 minutes) detecting blobs, and this is because Azure storage buffers the logs for up to 5 minutes or until the buffer hits 4MB in size. At that time, they are written out, and we are able to access them. Thus the latency.

Log files are only written when there are operations we’ve indicated we want to log. But the naming convention always follows the pattern: <service>/YYYY/MM/DD/HHmm/<sequence>.log
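For example, the blob service log for the 2:00 PM UTC hour of a given day would land under a prefix like “blob/YYYY/MM/DD/1400”. A one-line sketch for building that hourly prefix in code (the minutes portion is hard coded to zero per the convention):

// build the hourly prefix used to list blob service logs for a given UTC hour
// e.g. "blob/2014/07/01/1400" - the minutes portion is always "00" per the naming convention
string hourPrefix = string.Format(System.Globalization.CultureInfo.InvariantCulture,
    "blob/{0:yyyy/MM/dd/HH}00", DateTime.UtcNow);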

So we’ve already identified a couple of requirements for our solution…

  • Don’t scan for new log files more than every 5 minutes
  • Get a list of logs from the $logs container that start with “blob/”
  • Don’t reprocess log files we’ve already examined

Once we have the files, we then have to parse them. I wrote a post last fall that describes using Excel to parse the semi-colon delimited log entries. We’re going to need to do that in code, but fortunately it’s not that difficult. The logs are semi-colon delimited and use double-quotes to denote strings that include semi-colons that we won’t want to split/explode on. You could do this using a regular expression, but my own regex skills are so rusty that I opted to just parse the file via a bit of C# code.

int endDelim = 0;
int currentPos = 0;
while(currentPos <= logentry.Length-1)
{
 
    // if a quoted string... 
    if (logentry.Substring(currentPos,1).StartsWith("\""))
    {
        currentPos++; // skip opening quote
        endDelim = logentry.IndexOf("\";", currentPos);
        if (endDelim == -1) // if no delim, jump to end of string
            endDelim = logentry.Length - 1;
        properties.Add(logentry.Substring(currentPos, endDelim - currentPos));
        // skip ending quote and semicolon
        endDelim = endDelim + 2;
    }
    else // not quoted string
    {
        endDelim = logentry.IndexOf(';', currentPos);
        if (endDelim == -1) // if no delim, jump to end of string
            endDelim = logentry.Length - 1;
        properties.Add(logentry.Substring(currentPos, endDelim - currentPos));
        endDelim++;
    }
 
    currentPos = endDelim; // advance position
}

Not as elegant as a regex, I fully admit. But with my unpracticed skills (it’s been 10+ years since I had my fingers deep in that), it would have taken me 2-3 times longer to get that working than just brute forcing it.

The final step is knowing what we want out of the logs. There are two key values from the log that I’m after: the OperationType and the RequestURI. The request URI is self-explanatory enough, that’s the URI of the blob that we’re trying to detect. The OperationType is the action that was performed against Azure storage. There are only two values we’re going to monitor for, PutBlob and PutBlockList.

Now here is a bit of an issue. A small enough blob can be created, or even updated, using just the PutBlob call. So if we detect that operation, there is a chance that we may process the same file multiple times. We could resolve this by using a “receipt” pattern as is called out in the comments section of Mike’s post, or we could keep a list of processed blobs (perhaps in table storage). The approach really depends on your needs, so I’m going to leave it out of this implementation for now.
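If you did want that check, a minimal sketch of the “list of processed blobs” idea using table storage might look like the following. The table name, partition key, and ProcessBlob call are made up for illustration, and a storageAccount reference is assumed to be in scope; none of this is part of the project.

// illustrative only: record each processed blob URI in a table so we can skip duplicates
CloudTable processed = storageAccount.CreateCloudTableClient().GetTableReference("processedblobs");
processed.CreateIfNotExists();

string rowKey = Uri.EscapeDataString(blobUri); // the blob URI we just pulled from the log
bool alreadySeen = processed.Execute(
    TableOperation.Retrieve<TableEntity>("blobs", rowKey)).Result != null;

if (!alreadySeen)
{
    ProcessBlob(blobUri); // whatever action your solution takes on a new blob (hypothetical)
    processed.Execute(TableOperation.InsertOrReplace(new TableEntity("blobs", rowKey)));
}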

NOTE: Since we’re only looking for PutBlob or PutBlockList operations, we’re not going to be able to detect page blobs, and will catch (via PutBlob) updates to smaller page blobs. Fixing this is definitely on my list, but will need to wait for another day.

The solution

Now that we know how to get at the log information, it’s time to start creating a solution. The first decision I made was to separate detecting new log files from parsing them. So we’ll have a LogScanner and a LogParser. I also wanted to make parsing the log entries super easy, so I decided to create a LogEntry class that I can feed a log entry string into, and which exposes the values as properties.

But I still have two issues… It’s likely, especially under high volumes, that parsing the logs will take much longer than detecting them. And under most circumstances, I can get by with a single LogScanner. So I’m going to implement a “traffic cop” or “gatekeeper” pattern so that only one LogScanner can run at a time.

My second issue is how to ensure I only alert to a new log file once. I’ll be running scans every 5 minutes or so, and listing blobs doesn’t really have an option for “give me only the new ones”. Fortunately, since I’m already using a gatekeeper, I can have it store the name of the last log file I processed for me, making it pretty simple to keep track.

The final step of course is having both the LogScanner and LogParser use delegates so whoever is implementing them can create a method to handle when a log file is detected, or a new blob is found. Thus allowing them to control what actions are taken.
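Based on the sample code later in this post, those delegate signatures look roughly like this (FoundBlobDelegate is the name used in the parser code below; the scanner delegate name here is just my shorthand):

// raised by the LogParser when a log entry shows a new blob write
public delegate void FoundBlobDelegate(string newBlobUri);

// raised by the LogScanner when a new log file is detected (name is illustrative)
public delegate void LogFoundDelegate(CloudBlockBlob logBlob);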

I’ll wrap the whole thing up in a reference implementation via a console app. So the final solution looks like this:

[Image: FileDepotBlobWatcher-SolutionLayout]

The BlobLogEntry class exposes the individual fields of the blob log entry (see the parsing code above or Codeplex for all this really does), the Gatekeeper makes sure only one LogScanner is trying to detect new log entries, the LogScanner detects new log files, and the LogParser parses a log once it’s been found.

Gatekeeper

I’ve blogged about the gatekeeper pattern before. I’ve known this as a “traffic cop” since long before folks started publishing design patterns on the internet, so to me that’s what it will always be. Regardless of the name, the purpose is to make sure only one process can do something at a time. We’re going to accomplish this by using a lease on an Azure storage blob as our control switch.

The Gatekeeper object needs to be able to start, stop, and renew the underlying blob lease. And because I’m also going to use it to store the last log file processed, I’m going to add SetText and GetText methods to write and retrieve strings to the underlying blob.
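Just to make that concrete, a minimal sketch of what those two helpers might look like, assuming the 4.x storage client text helpers (the actual Codeplex code may differ):

// write a string into the gatekeeper blob; the active lease must be presented or the write is rejected
public void SetText(string value)
{
    myBlob.UploadText(value, null, AccessCondition.GenerateLeaseCondition(myLeaseID));
}

// read the current contents of the gatekeeper blob (reads don't require the lease)
public string GetText()
{
    return myBlob.DownloadText();
}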

This class is fairly simple, so I’m not going to detail code you can look at yourself on Codeplex. Instead I’ll just call out a few highlights…

My gatekeeper constructor accepts a CloudBlockBlob for the blob on which we’ll place a lease. This gives the calling process full control over where that blob lives. It then creates a lease on the blob good for up to 60 seconds (the maximum allowed value), and attempts to renew that lease every 45 seconds. This gives me 15 seconds, in the case of transient failures, to successfully renew the lease before I run the risk of another scanner taking over.
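In terms of the storage client library, that acquire/renew cycle boils down to something like this (a sketch with illustrative variable names, not the exact Codeplex code):

// grab a 60 second lease (the longest finite duration allowed)
myLeaseID = myBlob.AcquireLease(TimeSpan.FromSeconds(60), null);

// renew on a timer every 45 seconds, leaving ~15 seconds of slack for transient failures
renewalTimer = new System.Threading.Timer(
    _ => myBlob.RenewLease(AccessCondition.GenerateLeaseCondition(myLeaseID)),
    null, TimeSpan.FromSeconds(45), TimeSpan.FromSeconds(45));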

In a couple places, we trap for a Storage Exception that has a 409 error code. This indicates that our attempt to get the lease has failed because somebody else already has a lease on the blob in question (aka another scanner has taken over).

Implementing the Gatekeeper is simply a matter of creating the CloudBlockBlob object, handing it off to the class constructor, and then calling start when we want to gain control. We can check periodically to see if we have the lease, optionally getting it if we don’t.

The final bit is to make sure the starting and stopping of a timer to renew the lease is put into the appropriate spots.

Take a look at the gatekeeper code and if you have questions, please feel free to post them in the comments.

LogParser

Also pretty straightforward is the parser. It takes the CloudBlockBlob object (which would be a log file) as a parameter for its constructor, then we use the ParseFile method to inspect the log file.

public void ParseFile(FoundBlobDelegate callback)
{
    using (Stream stream = logFile.OpenRead())
    {
        // read the log file
        using (StreamReader reader = new StreamReader(stream))
        {
            string logEntry;
            while ((logEntry = reader.ReadLine()) != null)
            {
                // parse the log entry
                BlobLogEntry blobLog = new BlobLogEntry(logEntry);
 
                //NOTE: PutBlockList is the final write for a large block blob
                // PutBlob can also be used for small enough blobs, but also presents an overwrite of an existing one
                if (blobLog.OperationType.Equals("PutBlob") || blobLog.OperationType.Equals("PutBlockList"))
                    callback(blobLog.RequestUrl);
            }
        }
    }
}

This method opens a stream on the blob, and then reads through it line by line. Each line is parsed using the BlobLogEntry object, and if the OperationType is “PutBlob” or “PutBlockList”, the callback delegate is invoked with the blob’s request URL.

Now I could have put this method into the LogScanner, but as I pointed out earlier, it’s highly likely it will take longer to parse the logs than to detect them. So in a real world implementation, the LogScanner may simply notify a pool of parsers, possibly via a queue. So separating the implementations made a certain amount of sense. Especially when I look ahead to having to deal with larger page blobs.

LogScanner

This is where most of my time on the project was spent. It has a few parallels with the classes above in that we have a constructor that accepts some parameters (a CloudBlobClient and an instance of the Gatekeeper class), as well as Start and Stop methods.

Internally, the LogScanner object will be using the CloudBlobClient to create a CloudBlobContainer object that’s looking at the “$logs” container. We then use the gatekeeper to make sure that if I have multiple processes running log scans, only one of them can actually do the processing. Finally, it uses an internal timer object to make sure we’re scanning for new log files at a regular interval (which defaults to 5 minutes).

When we call the Start method, the LogScanner takes a delegate that the calling process can use to determine what action should be taken when a new log file is detected (such as using the LogParser to digest it). It then starts the gatekeeper process, and attempts to do an initial scan for logs (like Mike’s post said WebJobs does). Once that scan is complete, it will start the timer so we can do additional scans at the specified interval.

The Stop method just reverses these actions, stopping the scan timer and the gatekeeper. So the real meat of this class is what happens when we scan for log files. Let’s walk through this a bit before I show you the code.

The first thing we need to be able to do is get a list of blobs in the $logs container. We have two scenarios we have to support with this: get everything (for an initial scan), and get just the new stuff for incremental scans. The challenge is that Azure storage only supports getting a list of blobs based on a filter on the name, not on any metadata or properties. The initial scan is fairly simple, we set our filter criteria to “blob/”, which will get all blob service logs in the container.

So let’s say we’ve already done a scan and we stored the last log file we found in our Gatekeeper, so I know where I left off. But how do I pick back up again? I could just filter for all logs and iterate through until we get back to where we left off. Perfectly ok, but it doesn’t strike me as particularly efficient. So if we think back to how the logs are named, I can parse the last log I found to go back to the year, month, day, and hour for which that log was produced. So when I pick back up on scanning, I scan for that hour and all the hours between then and UTC now.

Note: You could alternatively scan by day, month, or year. Depending on the frequency of your scans and the production of logs, these options could be more efficient than my hourly approach.

We start by extracting the datetime values from the last log file name (uri in the sample below) we read from our gatekeeper…

int startPOS = uri.IndexOf("blob/") + 5;
int endPOS = uri.LastIndexOf('/');
 
return uri.Substring(startPOS, endPOS - startPOS);

We know all the URIs will have a “blob/” at the beginning since that’s the service we’re monitoring. Furthermore, the file names all end in a six digit sequence number with a ‘.log’ suffix. So if I find the position of the last ‘/’ character in the string, I can now extract the YYYY/MM/DD/HHmm portions from the URI. We can make all these assumptions because the log naming conventions are published and therefore somewhat immutable.

Note: Currently, the mm portion of log URI will always be zero per the published naming convention. This is a key assumption for our processing.

Next, we need to convert this substring to a datetime type:

DateTime tmpDT;
// convert prefix to datetime
DateTime.TryParseExact(fileprefix, "yyyy/MM/dd/HHmm", null,
                       DateTimeStyles.None, out tmpDT);
return tmpDT;

This takes our URI substring, and converts it into a DateTime, leaving us to simply calculate the delta between the current UTC datetime and this value to know how many hour periods we need to filter for.

ScanPasses = ((DateTime.UtcNow - PrefixToDateTime(startingPrefix)).TotalHours + 1);

So now we know that we will do one filtered list for each hour from the last hour we found a file to the current datetime. Ideally, this could be optimized so that the gatekeeper stores the last scanned period so we don’t have to scan past hours for which there was no traffic. But my assumption is that if we’re scanning the logs, we expect traffic at fairly regular intervals. So repetitive scans of empty “hours” shouldn’t happen often. And when you add up the cost of those scans versus the programmer time to optimize things, I could scan a few eons of empty logs before the cost would match the programmer cost to fine tune this.

Now that we’re armed with what we need to do the scans of the logs, let’s look at some of the code…

// get last log file value from gatekeeper
string lastLog = gatekeeper.GetText();
 
// calculate starting prefix
if (!lastLog.Equals("blob/")) // we had a "last log" from previous runs
{
    startingPrefix = getPrefixFromURI(lastLog); // use that prefix as our starting point
    ScanPasses = ((DateTime.UtcNow - PrefixToDateTime(startingPrefix)).TotalHours + 1);
    pastPreviousLog = false; // don't start raising "found log" events until we're past the last processed log
}

We start by getting the last log file we found from the gatekeeper. If that value is not “blob/”, then we’re doing a subsequent scan. We’ll get the date/time prefix from the log URI, and use that to calculate the number of scans we need to do. We also set a value that tells us we haven’t yet passed our previously found log file. We need this last part because subsequent scans will always resume in the same hour as the last log file we processed, and it’s possible that new log files have arrived.

Next we will enter into a loop that will execute once for each scan pass we calculated we need. If it’s a first time scan, we’ll only do one pass because our blob list filter will be all available logs.

// List the blobs using the prefix
IEnumerable<IListBlobItem> blobs = 
    logContainer.ListBlobs(string.Format("blob/{0}", startingPrefix), true, BlobListingDetails.Metadata, null);
 
// iterate the list of log files
foreach (IListBlobItem item in blobs)
{
    CloudBlockBlob log = item as CloudBlockBlob;
    if (log != null)
    {
        string LogURI = log.Uri.ToString();
        if (pastPreviousLog)
        {
            // call Delegate to act on log file
            this.callback(log);
 
            // update gatekeeper blob 
            gatekeeper.SetText(LogURI);
            lastLog = LogURI;
        }
        if (lastLog.Equals(LogURI, StringComparison.OrdinalIgnoreCase))
            pastPreviousLog = true;
    }
}

For each log file, we look at the URI. If we’re past the last log file (as recorded by the gatekeeper), we will call the callback method handed into our object, alerting a calling process that a new log file has been found. We then ask the gatekeeper to save that URI as our new starting point for the next scan. Lastly, in case we had a previous log file recorded, we need to check and see if we’re at it, so we can process the additional logs.

And as we exit the log listing loop, we increment our filter criteria (so we can scan the next available hour), and decrement the scanpasses value so we know how many scans remain.
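In code, that bottom-of-loop housekeeping is roughly the following (a sketch using the same helper and variable names as the snippets above):

// advance the filter prefix by one hour (the minutes are always "00" per the log naming convention)
DateTime nextHour = PrefixToDateTime(startingPrefix).AddHours(1);
startingPrefix = nextHour.ToString("yyyy/MM/dd/HHmm", System.Globalization.CultureInfo.InvariantCulture);
ScanPasses--; // one less pass remaining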

On either side of this, we also enable and disable the timer object. The only purpose of this is that on the off chance it takes us more than 5 minutes to scan the logs, we don’t double up on the scan operations.

Running the Sample

Hopefully you’ll find this solution pretty straightforward. With the classes in place, all that remains is to implement them, in this case as a sample console application.

LogScanner and LogParser need some delegate methods. For LogScanner, we’ll use this …

public static void LogFound(CloudBlockBlob LogBlob)
{
    Console.WriteLine(string.Format("Parsing Log File: {0}", LogBlob.Uri));
 
    // Parse the Log
    LogParser myParser = new LogParser(LogBlob);
    //HINT: we could drop the log file into a queue and process asynchronously
    myParser.ParseFile(FoundBlob);
 
    //Option: delete the log once its processed
}

When the LogScanner finds a new log file, it will call this delegate. For my sample I’ve chosen to write the event to the console output, and immediately parse the file via the LogParser. Just keep in mind that the current implementation is a synchronous blocking call, so in a real production situation, you likely won’t want to do this. Instead, write the event to a queue where subscribers can then pick it up and process it.
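Purely as an illustration (this isn’t part of the sample), the delegate could instead drop the log file’s URI onto a storage queue for a pool of parsers to pick up. The queue name is made up, and a storageAccount reference is assumed to be in scope:

// hypothetical alternative: hand the log file off to a queue instead of parsing it inline
CloudQueueClient queueClient = storageAccount.CreateCloudQueueClient();
CloudQueue logQueue = queueClient.GetQueueReference("logs-to-parse"); // illustrative name
logQueue.CreateIfNotExists();
logQueue.AddMessage(new CloudQueueMessage(LogBlob.Uri.ToString()));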

We follow this up with a delegate for LogParser that will be called as we parse the log files that were found, and locate what we believe to be a blob.

public static void FoundBlob(string newBlobUri)
{
    // filter however you like, by container, file name, etc... 
    if (!newBlobUri.Contains("gatekeeper")) // ignore gatekeeper updates
        Console.WriteLine(string.Format("Found new blob: {0}", newBlobUri));
}

You’ll notice that in this method, I’m doing a wee bit of filtering based on the BlobURI. In a real implementation, you may only want to watch a handful of containers. In my sample implementation, the blob object that’s at the heart of the gatekeeper object will have the name “gatekeeper”, so I went for the simple approach to make sure I ignore any operations related to it. I thought about putting filter criteria (such as container) as an attribute of the LogParser, but ultimately settled on this approach as being far more flexible.

The final step was to go into the console app and set things in motion…

// set up our private variables.
string storageAccountString = Properties.Settings.Default.AzureStorageConnection;
 
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(storageAccountString);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

We start by retrieving the Azure Storage Account’s connection string, and using that string to get a CloudStorageAccount object, with which we create a CloudBlobClient.

// make sure we have a gatekeeper in place
CloudBlobContainer gatekeeperContainer = blobClient.GetContainerReference("gatekeeper");
gatekeeperContainer.CreateIfNotExists(); // want to make sure the container is there... 
CloudBlockBlob gatekeeperBlob = gatekeeperContainer.GetBlockBlobReference("gatekeeper");
Gatekeeper mygatekeeper = new Gatekeeper(gatekeeperBlob, "blob/");

Using the CloudBlobClient, we create a container where our gatekeeper blob will go, then get a CloudBlockBlob that will be the gatekeeper blob (the blob we’ll put leases on). Finally, using that blob, we create the gatekeeper object which also initializes the contents.

Next, we initialize the LogScanner and tell it to start processing, calling the delegate we already defined.

LogScanner myScanner = new LogScanner(blobClient, mygatekeeper);
myScanner.ScanInterval = new TimeSpan(0, 5, 0);
myScanner.Start(LogFound);

After that, all that remains is to give myself a simple loop to run in while the LogScanner and LogParser do their work. I’ve put in one that will run for up to an hour. After the loop exits, it will stop the scanner, which will release the lease on the blob. If you stop the console app forcibly, just be aware that the gatekeeper lease will persist for up to 1 minute. So your initial scan upon launching the program likely won’t have any results unless you wait at least 1 minute before restarting.
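The loop itself is nothing fancy; something along these lines would do (a sketch):

// let the scanner run for up to an hour, then shut down cleanly so the lease gets released
DateTime stopAt = DateTime.UtcNow.AddHours(1);
while (DateTime.UtcNow < stopAt)
{
    System.Threading.Thread.Sleep(TimeSpan.FromSeconds(10));
}
myScanner.Stop();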

With the sample program complete, all that remains is to set the Azure Storage Account Connection string in the program’s application settings (using a storage account that has the Blob write logging enabled), then compile and run the solution. As it runs, you can upload blobs into it (perhaps using the Publishing Console project also located in the FileDepot Codeplex project), and within 5-10 minutes, you should start seeing files show up in the BlobWatcher console app.

Magic, no longer

So with this, I hope I’ve shed a bit of light on the Azure Storage logs and how they can be used. As I look back on creating this sample, I find that I almost spent more time digging into how storage logs work than I spent actually working on the code. The final product could use some fine tuning, as well as enhancement for page blob scenarios. But as a starting point, I’m fairly happy with it.

Admittedly, if all you really want to do is monitor for new blobs and act on them, your best approach is to use Azure WebJobs. That team has far more time and resources than I do, and as such, they can give you a solution that will be far more robust than my simple code sample. But replacing WebJobs was never my objective, I just wanted to help highlight how Azure storage logging can be used to do more than just track errors and capacity utilization.

Please do check out BlobWatcher at the Azure File Depot on Codeplex. And more importantly, leave feedback either here or there. I want to make sure the project is fulfilling some common needs and to that end, one can never have enough feedback.

Until next time!

 

Azure Files – Share Management

Note: If you are going to be using Azure Files from the same VM regularly, be sure to follow the instructions in this blog post to ensure that the connection is persistent.

We recently announced a new preview feature, Azure Files. This feature allows you to mount an SMB based file share into your Azure hosted PaaS and IaaS VMs. As this feature is in preview, the various related bits are also in preview state. And as with many previews, there’s some risk when you mix early release bits with current production bits that can cause difficulties.

So with this in mind, I decided that I’d do something I haven’t done in some time and that’s write some code that goes directly against the Storage API, to create, delete, and list shares created in Azure Files. And do this in a way that takes no dependencies on any “preview” bits.

We’ll start with the Create Share API. For this we’ll need our account name, one of the keys, and the name of the share we’re working on. We begin by creating the basic REST request. As described at the link above, we need to issue a “PUT” request against the ‘2014-02-14’ version of the API. I’m also going to set the content type to ‘application/xml’ and give it a content length of 0. We’ll do this with a HttpWebRequest object as follows:

var request = (HttpWebRequest)HttpWebRequest.Create(string.Format("https://{0}.file.core.windows.net/{1}?restype=share", 
    creds.AccountName, shareName));
request.Method = "PUT";
request.Headers["x-ms-version"] = "2014-02-14";
request.ContentType = "application/xml";
request.ContentLength = 0;

The variables used in this are the account name (creds.AccountName) and the name of the share we want to create (shareName).

Once we have the request, we then have to sign it. Now you could do this manually, building the string and doing the HMAC-SHA hashing… But since I can take a dependency on the existing Azure Storage SDK (v4.0.3), we can just use the SharedKeyLiteAuthenticationHandler class to sign the request for us. Big thanks to my colleague Kevin Williamson for pointing me at this critical piece which had changed since I last worked with the Storage REST API.

SharedKeyLiteAuthenticationHandler auth = new SharedKeyLiteAuthenticationHandler(SharedKeyLiteCanonicalizer.Instance, creds, creds.AccountName);
auth.SignRequest(request, null);

By leveraging this aspect of the Azure SDK, we save ourselves the hassle of having to manually generate the string to be signed (canonicalizing) and then actually doing the signature. If you’d like to learn more about Azure Storage authentication, I’d recommend checking out the MSDN article on the subject.

With the signed request created, we only have to execute the request, and trap for any errors.

// send the request
HttpWebResponse response = null;
try
{
    response = (HttpWebResponse)request.GetResponse();
    Console.WriteLine("Share successfully created!");
}
catch (WebException ex)
{
    Console.WriteLine(string.Format("Create failed, error message is: {0}", ex.Message));
}

And that’s all there really is to it. If the request fails, it will throw a WebException, and we can look at the failure for additional details. Now if you want to learn more about the Azure Files REST API, you can find a slew of great information already out on MSDN. This includes one extremely helpful page related to the naming and references.

Now what I did is take this and add in the Delete and List commands, and roll them up into a simple little console app. So with this app, you can now run a command like…

AzureFileShareHelper -create -acct:<accountname> -key:<accountkey> -share:myshare

This will create the share for you and even return the URL used to mount the share into an Azure VM. Just change the verb to -delete or -list if you want to leverage another operation. 🙂
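Under the hood, the -delete option is just the same REST request with a different verb. A rough sketch, reusing the credentials, share name, and signing handler from the snippets above:

// delete a share - same request shape as the create, but with the DELETE verb
var deleteRequest = (HttpWebRequest)HttpWebRequest.Create(string.Format("https://{0}.file.core.windows.net/{1}?restype=share",
    creds.AccountName, shareName));
deleteRequest.Method = "DELETE";
deleteRequest.Headers["x-ms-version"] = "2014-02-14";
auth.SignRequest(deleteRequest, null);

using (var response = (HttpWebResponse)deleteRequest.GetResponse())
{
    Console.WriteLine("Share deleted, status: {0}", response.StatusCode);
}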

 

Azure File Depot

It’s been a little while since my last update. BUILD 2014 has come and gone and my group, DPE (Developer Platform Evangelism), has been re-branded to DX (Developer Experience). Admittedly a name is a name, but this change comes at a time when my role is beginning to transform. For the last 18 months, my primary task has been to help organizations be successful in adopting the Azure platform. Going forward, that will still remain significant, but now we’re also tasked with helping demonstrate our technology and our skills through the creation of intellectual property (frameworks, code samples, etc…) that will assist customers. So less PowerPoint (YEAH!), more coding (DOUBLE YEAH!).

To that end, I’ve opted to tackle a fairly straightforward task that’s come up often, namely the need to move files from one place to another. It’s come up at least 3-4 times over the last few months, so it seems like a good first project under our changing direction. With that, I’d like to introduce you to my first open source effort, the Azure File Depot.

What is it?

In short, the Azure File Depot is an effort to provide sample implementations of various tasks related to using blob storage to move files around. It contains a series of simple classes that help demonstrate common tasks (such as creating a Shared Access Signature for a blob) and business challenges (the automated publication of files from on-premises to Azure and/or an Azure hosted VM).

Over time, it’s my hope that we may attract a community around this and evolve this little project into a true framework. But in the meantime, it’s a place for me to do what I do best and that’s take learnings and share them. At least I hope I’m good at it since it’s a significant portion of my job. 🙂

What isn’t it?

I need to stress that this project isn’t intended to be a production ready framework for solving common problems. The goal here is to create a collection of reference implementations that address some common challenges. While we do want to demonstrate solid coding practices, there will be places where the samples take less than elegant approaches.

I’m fully aware that in many cases there may be better implementations out there, in some cases perhaps even off the shelf solutions that are production ready. That’s not what this project is after. I have many good friends in Microsoft’s engineering teams that I know will groan and sigh at the code in this project. I only ask that you be kind and keep in mind, this is an educational effort and essentially one big collection of code snippets.

So what’s in it currently?

What we’ve included in this first release is the common ask I referred to above: the need to take files generated on-premises and push them to Azure, either letting them land in blob storage or having them eventually land in an Azure hosted VM.

Here’s a diagram that outlines the scenario.

[Image: FileDepotDiagram]

Step 1 – File placed in a folder on the network

The scenario begins with a file being created by some process and saved to the file system. This location could be a network file share or just as easily could be on the same server as the process itself.

Step 2 – The Location is monitored for new files

That location is in turn monitored by a “Publication Service”. Our reference implementation uses the C# FileSystemWatcher class, which allows the application to receive notifications of file change events from Windows.
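If you haven’t used FileSystemWatcher before, wiring it up only takes a few lines. A minimal sketch (the path and the PublishFile handler are illustrative, not the project’s actual names):

// watch a folder for newly created files and hand them off to the publishing logic
var watcher = new FileSystemWatcher(@"\\myserver\dropfolder"); // illustrative path
watcher.NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite;
watcher.Created += (sender, e) => PublishFile(e.FullPath); // PublishFile is a hypothetical upload routine
watcher.EnableRaisingEvents = true;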

Step 3 – Publication service detects file and uploads to blob storage

When the creation of a new file raises an event in the application, the publishing app waits to get an exclusive lock on the file (making sure nothing is still writing to the file), then uploads it to a blob container.

Step 4 – Notification message with SAS for blob is published

After the blob is uploaded, the publication service then generates a shared access signature and publishes a message to a “Messages” Service Bus topic so that interested processes can be alerted that there’s a new file to be downloaded.
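In code, that step boils down to roughly the following (a sketch; the SAS lifetime, connection string, and message body are illustrative assumptions rather than the project’s exact settings, though the “Messages” topic name comes from the scenario above):

// create a read-only SAS on the uploaded blob, good for a limited window
string sas = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy
{
    Permissions = SharedAccessBlobPermissions.Read,
    SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddHours(4) // illustrative lifetime
});

// publish a notification containing the full SAS URI to the "Messages" topic
TopicClient topicClient = TopicClient.CreateFromConnectionString(serviceBusConnectionString, "Messages");
topicClient.Send(new BrokeredMessage(blob.Uri + sas));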

Step 5 – Subscribers receive message

Processes that want to subscribe to these notifications create subscriptions on that topic so they can receive the alerts.

Step 6 – Download blob and save to local disk

The subscribing process then uses the shared access signature to download the blob, placing it in the local file system.

Now this process could be used to push files from any location (cloud or on-premises) to any possible receiver. I like the example because it demonstrates a few key points of cloud architecture:

  • Use of messaging to create temporal decoupling and load leveling
  • Shared Access Signatures to grant temporary access to secure, private blob storage for potentially insecure/anonymous clients
  • Use of Service Bus Topics to implement pub/sub message model

So in this one example, we have a usable implementation pattern (I’ve provided all the basic helper classes as well as sample implementations of both console and Windows Services applications). We also have a few reusable code snippets (create a blob SAS, interact with Service Bus Topics, upload/download files to/from block blobs).

At least one customer I’m working with today will find these samples helpful. I hope others will as well.

What’s next?

I have a few things I plan to do this with this in the near term. Namely make the samples a bit more robust: error handling, logging, and maybe even *gasp* unit tests! I also want to add in support for larger files by showing how to implement this with page blobs (which are cheaper if you’re using less than 500TB of storage).

We may also explore using this to do not just new file publication, but perhaps updates as well as adding some useful metadata properties to the blobs and messages.

I also want to look at including more scenarios. In fact, if you read this and have a scenario, you can fork the project and send us a pull request. Or if you’re wondering how to do something that you think could fit into this project, please drop me a line.

That’s all the time I have for today. Please look over the project as we’d love your feedback.

Monitoring Windows Azure VM Disk Performance

Note: The performance data mentioned here is based on individual results during a limited testing period and should NOT be used as an indication of future performance or availability. For more information on Windows Azure storage performance and availability, please refer to the published SLA.

So I’ve had an interesting experience the last few days that I wanted to take a few minutes to share with the interwebs. Namely, monitoring some Windows Azure hosted virtual machines, not at the VM level, but the storage account that held the virtual machine disks.

The scenario I was facing was a customer that was attempting to benchmark the performance and availability of two Linux based virtual machines that were running an Oracle database. Both machines were extra-large VMs with one running 10 disks (1 os, 1 Oracle, 8 data disks) and the other with 16 disks (14 for data). The customer has been running an automated load against both machines and wanted to get a clear idea of how much they may or may not have been saturating the underlying Windows Azure Storage system as well as what could be contributing to the highly variable Oracle IOPS levels they were seeing.

To support this effort, I dug into something I haven’t looked at in depth for quite some time. Windows Azure Storage Analytics (aka Logging and Metrics). Except this time with a focus on what happens at the storage account with regards to the VM disk activity.

Enable Storage Analytics Proactively

Before we go anywhere, I need to stress that if you want to be able to see what’s going on with Azure Storage and your virtual machine, you’ll need to enable this BEFORE a problem occurs. If you haven’t already enabled logging, the only option you have to try and go “back in time” and look at past behavior is to open up a support ticket. So if you plan to do this type of monitoring, please be certain to enable analytics!

For Windows Azure VM disk metrics, we need to enable analytics on the blob storage account. As the link I just shared will let you know, you will need to call the “Set Blob Service Properties” api to set this (or use your favorite Windows Azure storage utility). I happen to use the Azure Management Studio from Redgate and it allows me to set the properties you see in this screen shot:

With this, I tell Azure Storage that I want it to log all blob operations (Read/Write/Delete) and retain that information for up to two days. I also enable metrics and ask it to retain that data for two days as well.
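If you’d rather enable this from code than from a utility, the storage client library can do it as well. A rough sketch (property names can vary a bit between SDK versions, and storageAccount is assumed to be a CloudStorageAccount parsed from your connection string):

// turn on blob logging (all operations) and hourly metrics, each with a 2 day retention
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
ServiceProperties props = blobClient.GetServiceProperties();

props.Logging.LoggingOperations = LoggingOperations.All;
props.Logging.RetentionDays = 2;
props.HourMetrics.MetricsLevel = MetricsLevel.ServiceAndApi;
props.HourMetrics.RetentionDays = 2;

blobClient.SetServiceProperties(props);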

When I enable logging, Azure Storage will log all operations and persist that information into a series of blob files in a special container in the storage account called $logs. Logs will be spread across multiple blob files as discussed in great detail in the MSDN article “About Storage Analytics Logging“. A word of caution: if the storage account is active, logging will produce a LARGE amount of data. In my case, I was seeing a new 150mb log file approximately every 3 minutes. That’s about 70gb per day. In my case, I’ll be storing about 140gb for my 2 days of retention, which is only about $6.70 per month. Given the cost of the VM itself, this was inconsequential. But if I had shifted my retention period to a month… this can start to get pricey. Additionally, the storage transactions needed to write the logs to blob storage count against the account limit of 20,000 transactions per second. To help reduce the risk of throttling coming into play too early, the virtual machines I’m monitoring have each been deployed into their own storage account.

The metrics are much more lightweight. These are written to a table and provide a per hour view of the storage account. These are the same values that get surfaced up in the Windows Azure Management portal storage account dashboard. I could easily retain these for a much longer period since it’s only a handful of rows being inserted per hour.

Storage Metrics – hourly summary

Now that we’ve enabled storage analytics and told it to capture the metrics, we can run our test and sit back and look for data to start coming in. After we’ve run testing for several hours, we can then look at the metrics. Metrics get thrown into a series of tables, but since I only care about the blob account, I’m going to look at $MetricsTransactionsBlob. We’ll have multiple rows per hour and can filter based on the type of operation, or get the roll-up across all operations. For general trends, it’s this latter figure I’m most interested in. So I apply a query against the table to get all user operations, “(RowKey eq ‘user;All’)“. The resulting query gives me 1 row per hour that I can look at to help me get a general idea of the performance of the storage account.
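If you’d prefer to pull those hourly rows programmatically instead of through a utility, a quick sketch with the table client might look like this (Availability is one of the columns defined in the metrics table schema; storageAccount is again assumed):

// query the hourly blob metrics table for the "all user operations" roll-up rows
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
CloudTable metricsTable = tableClient.GetTableReference("$MetricsTransactionsBlob");

TableQuery<DynamicTableEntity> query = new TableQuery<DynamicTableEntity>()
    .Where(TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.Equal, "user;All"));

foreach (DynamicTableEntity row in metricsTable.ExecuteQuery(query))
{
    Console.WriteLine("{0} availability: {1}", row.PartitionKey,
        row.Properties["Availability"].DoubleValue);
}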

You’ll remember that I opted to put my Linux/Oracle virtual machine into its own storage account. So this hour summary gives me a really good, high level overview of the performance of the virtual machine. Key factors I looked at are: Availability (we want to make sure that’s above the storage account 99.9% SLA), Average End to End Latency, and if we have less than 100% availability, what is the count of the number of errors we saw.

I won’t bore you with specific numbers, but over a 24hr period the lowest availability I saw was 99.993%, with the most common errors being Server Timeouts, Client Errors, or Network Errors. Seeing these occasionally, as long as the storage account remains above 99.9% availability, should be considered normal ‘noise’. Given the transient nature of the cloud, some errors are simply to be expected. We also kept an eye on average end to end latency, which during our testing was fairly consistent in the 19-29ms range.

You can learn more about all the data available in these various storage metrics by reviewing ‘Storage Analytics Metrics Table Schema‘ on MSDN.

When we saw numbers that appeared “unusual”, we then took the next logical step and inspected the detailed storage logs.

Blob Storage Logs – the devil is in the details

Alright, so things get a bit messier here. First off, the logs are just delimited format files. And while the metrics can help tell us which period in time we want to look at, depending on the number of storage operations, we may have several logs we need to slog through (in my case, I was getting about 20 150mb log files per hour). So the first step when digging into the logs is to download them. Either write up some code, grab your favorite utility, or perhaps just log into the management portal and download the files for the timeframe you want to take a closer look at. Once that’s done, it’s time for some Excel (yeah, that spreadsheet thing…. Really).

The log files are semi-colon delimited files. As such, the easiest way I found to do ad-hoc inspection of the files is to open them up in a spreadsheet application like Excel. I open up Excel, then do the whole “File -> Open” thing to select the log file I want to look at. I then tell Excel it’s a delimited file with a semi-colon as the delimiter, and in a few seconds it will import the file all nice and spreadsheet-like for me. But before we start doing anything, let’s talk about the log file format. Since the log file doesn’t contain any headers, we either need to know which columns contain the data we want, or add some headers. For the sake of keeping things easy for you (and saving a copy for myself), I created my own Excel file that already has all the log file fields declared in it. So you can just copy and paste from this spreadsheet into your log file once it’s loaded into Excel. For the remainder of this article, I’m going to assume this is what you’ve done.

With our log file headers, we can now start filtering the data. If we’re looking for errors, the first thing we’ll want to do is open up a log file and filter based on “request status”. To do this, select the “Data” tab and click on “filter”. This allows us to click on the various column headings and filter down what we’re looking at. The shot below shows a log that had a couple of errors in it. So I can easily remove the checkbox on “Success” to drill into those specific errors. This is handy if we want to know exactly what happened as the log also contains a “request-id-header” field. With that value, we can open up a support ticket and ask them to dig into the issue more deeply.

Now this is the first real caution I have. Between the metrics and the logs, we can get a really good idea of what types of errors are happening. But this doesn’t mean that every error should be investigated. With cloud computing solutions, there’s a certain amount of “transient” error that is simply to be expected. It’s only if you see a prolonged or persistent issue that you’d really want to dig into the errors in any real depth. One key indicator is to look at the logging metrics and keep an eye on the availability. If it falls below 99.9%, that means there may have been an SLA violation for the storage account. In that case, I’d take a look at the logs for that period and see what types of errors we saw. As long as the issue wasn’t caused by a spike in throttling (meaning we overloaded the system), there may be something worth having support look into. But if we’re at 99.999%, with the occasional network failure, timeout, or ‘client other’, we’re likely just seeing the “noise” one would expect from transient errors as the system adjusts and compensates for changes to its underlying fabric.

Now since we’re doing benchmarking tests, there’s one other key thing I look at: the number of operations that are occurring on the blobs that are the various disks mounted into our virtual machine. This is another task where Excel can help out, by adding subtotals. Adding subtotals requires column headings, so this is the part where you go “thank you Brent for making it so I just need to copy those in”. You’re welcome. 🙂

The field we want to look at in the logs for our subtotal is the “requested-object-key” field. This value is the specific object in the storage account that was being accessed (aka the blob file or disk). Going again to the Data tab in Excel, we’ll select “subtotal” and complete the dialog box as shown at the left. This will create subtotals by object (disk) and allow us to see the count of operations against that object. So what we have is the operations performed on that disk during the time period covered by the log file. Using that value, we can then get a fairly good approximation of the “transactions per second” that the disk is generating against storage.
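If Excel isn’t your thing, the same subtotal is easy to produce in code once the log lines are split into fields. A small sketch (the entries collection and the column index constant are assumptions for illustration):

// approximate transactions-per-disk: count log entries per requested-object-key
// "entries" is assumed to be a list of parsed log lines, each a string[] of fields,
// and REQUESTED_OBJECT_KEY the index of that column (both hypothetical here)
var perDisk = entries
    .GroupBy(fields => fields[REQUESTED_OBJECT_KEY])
    .Select(g => new { Disk = g.Key, Operations = g.Count() })
    .OrderByDescending(x => x.Operations);

foreach (var disk in perDisk)
    Console.WriteLine("{0} : {1} operations", disk.Disk, disk.Operations);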

So what did we learn?

If you are doing IO benchmarking of the virtual machine (as I was), you may notice something odd. We observed that our Linux/Oracle VM was reporting IOPS far above what we saw at the Windows Azure Storage level. This is to be expected because Oracle is trying to help buffer requests itself to increase performance. Add in any disk buffering we may have enabled, and the numbers could skew even further. Ultimately though, what we did establish during our testing was that we knew for certain when we were overloading the Windows Azure storage sub-system and contributing to server slowdowns that way. We were also able to observe several small instances where Oracle performance trailed off somewhat, and determine that these were due to isolated incidents where we saw an increase in various errors or in end to end operation latency.

The net result here is that while virtual machine performance is related to the performance of the underlying storage subsystem, there’s no easy 1-to-1 relation between errors in one and issues in the other. Additionally, as you watch these over time, you understand why virtual machine disk performance can vary over time and shouldn’t be compared to the behaviors we’ve come to expect from a physical disk drive. We have also learned what we need to do to help us more effectively monitor Windows Azure storage so that we can proactively take action to address potential customer facing impacts.

I apologize for all the typos and for not going into more depth on this subject. I just wanted to get this all out there before it fell into the fog of my memory. Hopefully you find it useful. So until next time!

The “traffic cop” pattern

So I like design patterns but don’t follow them closely. Problem is that there are too many names and it’s just so darn hard to find them. But one “pattern” I keep seeing an ask for is the ability to have something that only runs once across a group of Windows Azure instances. This can surface as a one-time startup task, or it could be the need to have something that runs constantly and, if one instance fails, another can realize this and pick up the work.

This latter example is often referred to as a “self-electing controller”. At the root of this is a pattern I’ve taken to calling a “traffic cop”. This mini-pattern involves having a unique resource that can be locked, and the process that gets the lock has the right of way. Hence the term “traffic cop”. In the past, aka my “mainframe days”, I used this with systems where I might be processing work in parallel and needed to make sure that a sensitive block of code could prevent a parallel process from executing it while it was already in progress. Critical when you have apps that are doing things like self-incrementing unique keys.

In Windows Azure, the most common way to do this is to use a Windows Azure Storage blob lease. You’d think this comes up often enough that there’d be a post on how to do it already, but I’ve never really run across one. That is until today. Keep reading!

But before I dig into the meat of this, a couple footnotes… First is a shout out to my buddy Neil over at the Convective blog. I used Neil’s Azure Cookbook to help me with the blob leasing stuff. You can never have too many reference books in your Kindle library. Secondly, the Windows Azure Storage team is already working on some enhancements for the next Windows Azure .NET SDK that will give us some more ‘native’ ways of doing blob leases. These include taking advantage of the newest features of the 2012-02-12 storage version. So the leasing techniques I have below may change in an upcoming SDK.

Blob based Traffic Cop

Because I want to get something that works for Windows Azure Cloud Services, I’m going to implement my traffic cop using a blob. But if you wanted to do this on-premises, you could just as easily get an exclusive lock on a file on a shared drive. So we’ll start by creating a new Cloud Service, add a worker role to it, and then add a public class to the worker role called “BlobTrafficCop”.

Shell this class out with a constructor that takes a CloudPageBlob, a property that we can test to see if we have control, and methods to Start and Stop control. This shell should look kind of like this:

class BlobTrafficCop
{
    public BlobTrafficCop(CloudPageBlob blob)
    {
    }

    public bool HasControl
    {
        get
        {
            return true;
        }
    }

    public void Start(TimeSpan pollingInterval)
    {
    }

    public void Stop()
    {
    }
}

Note that I’m using a CloudPageBlob. I specifically chose this over a block blob because I wanted to call out something. We could create a 1tb page blob and won’t be charged for 1 byte of storage unless we put something into it. In this demo, we won’t be storing anything so I can create a million of these traffic cops and will only incur bandwidth and transaction charges. Now the amount I’m saving here isn’t even significant enough to be a rounding error. So just note this down as a piece of trivia you may want to use some day. It should also be noted that the size you set in the call to the Create method is arbitrary but MUST be a multiple of 512 (the size of a page). If you set it to anything that’s not a multiple of 512, you’ll receive an invalid argument exception.

I’ll start putting some guts into this by doing a null argument check in my constructor and also saving the parameter to a private variable. The real work starts when I create three private helper methods to work with the blob lease: GetLease, RenewLease, and ReleaseLease.

GetLease has two parts, setting up the blob, and then acquiring the lease. Here’s how I go about creating the blob using the CloudPageBlob object that was handed in:

try
{
    myBlob.Create(512);
}
catch (StorageClientException ex)
{
    // conditionfailed will occur if there's already a lease on the blob
    if (ex.ErrorCode != StorageErrorCode.ConditionFailed)
    {
        myLeaseID = string.Empty;
        throw ex; // re-throw exception
    }
}

Now admittedly, this does require another round trip to WAS, so as a general rule, I’d make sure the blob was created when I deploy the solution and not each time I try to get a lease on it. But this is a demo and we want to make running it as simple as possible. So we’re putting this in. I’m trapping for a StorageClientException with a specific error code of ConditionFailed. This is what you will see if you issue the Create method against a blob that has an active lease on it. So we’re handling that situation. I’ll get to myLeaseID here in a moment.

The next block creates a web request to lease the blob and tries to get that lease.

try
{
    HttpWebRequest getRequest = BlobRequest.Lease(myBlob.Uri, 30, LeaseAction.Acquire, null);
    myBlob.Container.ServiceClient.Credentials.SignRequest(getRequest);
    using (HttpWebResponse response = getRequest.GetResponse() as HttpWebResponse)
    {
        myLeaseID = response.Headers["x-ms-lease-id"];
    }
}
catch (System.Net.WebException)
{
    // this will be thrown by GetResponse if there's already a lease on the blob
    myLeaseID = string.Empty;
}

BlobRequest.Lease gives me a template HttpWebRequest for the lease operation. I then use the blob I received in the constructor to sign the request, and finally I execute the request and get its response. If things go well, the response will have a header containing the id for the lease, which I put into a private variable (the myLeaseID from earlier) so I can use it later when I need to renew the lease. I also trap a WebException, which will be thrown if my attempt to get a lease fails because there's already a lease on the blob.

RenewLease and ReleaseLease are both much simpler. Renew creates a request object, signs and executes it just like we did before. We’ve just changed the LeaseAction to Renew.

HttpWebRequest renewRequest = BlobRequest.Lease(myBlob.Uri, 30, LeaseAction.Renew, myLeaseID);
myBlob.Container.ServiceClient.Credentials.SignRequest(renewRequest);
using (HttpWebResponse response = renewRequest.GetResponse() as HttpWebResponse)
{
    myLeaseID = response.Headers["x-ms-lease-id"];
}

ReleaseLease is just a bit more complicated because we check the status code to make sure we released the lease properly. But again, it's mainly just creating the request and executing it, this time with the LeaseAction of Release.

HttpWebRequest releaseRequest = BlobRequest.Lease(myBlob.Uri, 30, LeaseAction.Release, myLeaseID);
myBlob.Container.ServiceClient.Credentials.SignRequest(releaseRequest);
using (HttpWebResponse response = releaseRequest.GetResponse() as HttpWebResponse)
{
    HttpStatusCode httpStatusCode = response.StatusCode;
    if (httpStatusCode == HttpStatusCode.OK)
        myLeaseID = string.Empty;
}

Ideally, I’d have liked to do a bit more testing of these to make sure there weren’t any additional exceptions I should handle. But I’m short on time so I’ll leave that for another day.

Starting and Stopping

Blob leases expire after an interval if they are not renewed. So it's important that I have a process that regularly renews the lease, and another that checks whether I can get the lease if I don't already have it. To that end, I'm going to use System.Threading.Timer objects with a single delegate called TimerTask. This delegate is fairly simple, so we'll start there.

private void TimerTask(object StateObj)
{
    // if we have control, renew the lease
    if (this.HasControl)
        RenewLease();
    else // we don't have control
        // try to get lease
        GetLease();

    // keep renewing every 45 seconds while we have control; stop renewing if we lose it
    renewalTimer.Change((this.HasControl ? TimeSpan.FromSeconds(45) : TimeSpan.FromMilliseconds(-1)), TimeSpan.FromSeconds(45));
    // keep polling at the caller's interval while we don't have control; stop polling once we do
    pollingTimer.Change((!this.HasControl ? myPollingInterval : TimeSpan.FromMilliseconds(-1)), myPollingInterval);
}

We start by checking that HasControl property we created in our shell. This property just checks to see if myLeaseID is a string with a length > 0.  If so, then we need to renew our lease. If not, then we need to try and acquire the lease. I then change the intervals on two System.Threading.Timer objects (we’ll set them up next), renewalTimer and pollingTimer. Both are private variables of our class.

If we have control, the renewal timer is set to fire again in 45 seconds (15 seconds before our one-minute lease expires), and to continue firing every 45 seconds after that. If we don't have control, the renewal timer stops firing. pollingTimer works in reverse: it polls while we don't have a lease and stops once we do. I'm using two separate timers because the renewal timer needs to fire well before the lease expires if I'm to keep control, while the process that's leveraging the traffic cop may want to control the interval at which we poll for control, so that belongs on its own timer.
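Filling in the HasControl property from our shell, it ends up as little more than a check on myLeaseID. A quick sketch based on the description above:

public bool HasControl
{
    get
    {
        // we "have control" only while we're holding a lease id
        return !string.IsNullOrEmpty(myLeaseID);
    }
}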

Now let's start our traffic cop:

public void Start(TimeSpan pollingInterval)
{
    if (this.IsRunning)
        throw new InvalidOperationException("This traffic cop is already active. You must call 'stop' first.");

    this.IsRunning = true;

    myPollingInterval = pollingInterval;

    System.Threading.TimerCallback TimerDelegate = new System.Threading.TimerCallback(TimerTask);

    // start polling immediately for control
    pollingTimer = new System.Threading.Timer(TimerDelegate, null, TimeSpan.FromMilliseconds(0), myPollingInterval);
    // don't do any renewal polling
    renewalTimer = new System.Threading.Timer(TimerDelegate, null, TimeSpan.FromMilliseconds(-1), TimeSpan.FromSeconds(45));
}

We do a quick check to make sure we’re not already running, then set a flag to say we are (just a private boolean flag). I save off the control polling interval that was passed in and set up a TimerDelegate using the TimerTask method we set up a moment before. Now it’s just a matter of creating our Timers.

The polling timer will start immediately and fire again at the interval the calling process specified. The renewal timer, since we're just starting our attempts to get control, won't fire yet, but is set up to check every 45 seconds so we're ready to renew the lease once we get it.

When we call the Start method, it essentially causes our polling timer to fire immediately (asynchronously). So when TimerTask is executed by that timer, HasControl will be false and we'll try to get a lease. If we succeed, the polling timer will be stopped and the renewal timer will be activated.

Now to stop traffic:

public void Stop()
{
    // stop lease renewal
    if (renewalTimer != null)
        renewalTimer.Change(TimeSpan.FromMilliseconds(-1), TimeSpan.FromSeconds(45));
    // stop polling for the lease as well
    if (pollingTimer != null)
        pollingTimer.Change(TimeSpan.FromMilliseconds(-1), myPollingInterval);

    // release a lease if we have one
    if (this.HasControl)
        ReleaseLease();

    this.IsRunning = false;
}

We stop both timers, release the lease if we hold one, and then reset our boolean IsRunning flag.

And that’s the basics of our TrafficCop class. Now for implementation….

Controlling the flow of traffic

Now, the point of this is to give us a way to control when completely unrelated processes can perform an action. So let's flip over to the WorkerRole.cs file and put some code into its Run method to leverage the traffic cop. We'll start by creating an instance of the CloudPageBlob object that will serve as our lockable object and be passed into our BlobTrafficCop class.

var account = CloudStorageAccount.FromConfigurationSetting("TrafficStorage");

// create blob client
CloudBlobClient blobStorage = account.CreateCloudBlobClient();
CloudBlobContainer container = blobStorage.GetContainerReference("trafficcopdemo");
container.CreateIfNotExist(); // adding this for safety

// use a page blob; if it's empty, there's no storage cost
CloudPageBlob pageBlob = container.GetPageBlobReference("singleton");

This creates an object, but doesn't actually create the blob. I made the conscious decision to go this route so the TrafficCop class never has to directly manage storage credentials or the container. Your individual needs may vary. The nice thing is that once this is done, starting the cop is a VERY simple process:

myTrafficCop = new BlobTrafficCop(pageBlob);
myTrafficCop.Start(TimeSpan.FromSeconds(60));

This tells the cop to use a blob called "singleton" in the blob container "trafficcopdemo" as our controlling object and to check for control every 60 seconds. But that's not really interesting. If we ran this with two instances, what we'd see is that one instance would get control and keep it until something went wrong with getting the lease. So I want to alter the infinite loop of this worker role so I can see the cop doing its job and also pass control back and forth.

So I’m going to alter the default loop so that it will sleep for 15 seconds every loop and each time through will write a message to the console that it either does or does not have control. Finally, I’ll use a counter so that if an instance has control, it will only keep control for 75 seconds then release it.

int controlcount = 0;
while (true)
{
    if (!myTrafficCop.IsRunning)
        myTrafficCop.Start(TimeSpan.FromSeconds(30));

    if (myTrafficCop.HasControl)
    {
        Trace.WriteLine(string.Format("Have Control: {0}", controlcount.ToString()), "TRAFFICCOP");
        controlcount++;
    }
    else
        Trace.WriteLine("Don't Have Control", "TRAFFICCOP");

    if (controlcount >= 4)
    {
        myTrafficCop.Stop();
        controlcount = 0;
        Thread.Sleep(TimeSpan.FromSeconds(15));
    }

    Thread.Sleep(TimeSpan.FromSeconds(15));
}

Not the prettiest code I’ve ever written, but it gets the job done.

Looking at the results

So to see the demo at work, we’re going to increase the instance count to 2, and I’m also going to disable diagnostics. Enabling diagnostics will just cause some extra messages in the console output that I want to avoid. Otherwise, you can leave it in there. Once that’s done, it’s just a matter of setting up the TrafficStorage configuration setting to point at a storage account and pressing F5 to run the demo. If everything goes well, the role should deploy, and we can see both instances running in the Windows Azure Compute Emulator UI (check the little blue flag in the tool tray to view the UI).
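One note if you're building this from scratch: with the 1.x SDK, CloudStorageAccount.FromConfigurationSetting only works after a configuration setting publisher has been registered, and "TrafficStorage" needs to be defined in the role's configuration as a storage connection string. A minimal sketch of the OnStart wiring, assuming the usual worker role template (which already has the Microsoft.WindowsAzure and ServiceRuntime usings):

public override bool OnStart()
{
    // lets FromConfigurationSetting("TrafficStorage") read the value from the .cscfg
    CloudStorageAccount.SetConfigurationSettingPublisher((configName, configSetter) =>
        configSetter(RoleEnvironment.GetConfigurationSettingValue(configName)));

    return base.OnStart();
}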

If everything is working as intended, you’ll see output sort of like this:

[Screenshot: compute emulator console output showing the two instances trading control back and forth]

Notice that the role instances are trading control back and forth, just as we'd hoped. You may also note that the first message was that we didn't have control. This is because our attempts to get control happen asynchronously on a separate thread. You can change that if you need to, but in our case it isn't necessary; I just wanted to point it out.

Now, as I mentioned, this is just a mini-pattern. So for my next post I hope to wrap this in another class that demonstrates the self-electing controller, again leveraging asynchronous processing to execute work for our role instance on a separate thread, but in a way where we don't need to monitor and manage what's happening ourselves. Meanwhile, I've uploaded the code, so please make use of it.

Until next time!

Long Running Queue Processing Part 2 (Year of Azure–Week 20)

So back in July I published a post on doing long running queue processing. In that post we put together a nice sample app that inserted some messages into a queue, read them one at a time, and took 30 seconds to process each message. It did the processing in a background thread so that we could monitor it.

This approach was all well and good, but it hinged on us knowing the maximum amount of time it would take to process a message. Fortunately, in the latest 1.6 version of the Azure Tools (aka the SDK), the storage client was updated to take advantage of the new "update message" functionality introduced to queues by an earlier service update. So I figured it was time to update my sample.

UpdateMessage

Fortunately for me, given the upcoming holiday (which doesn't leave much time for blogging, since my family lives in "the boonies" and hasn't yet opted for an internet connection, much less broadband), updating a message is SUPER simple.

myQueue.UpdateMessage(aMsg, new TimeSpan(0, 0, 30), MessageUpdateFields.Visibility);

All we need is the message we read (which contains the pop receipt the underlying API uses to update the invisible message), the new timespan, and finally a flag telling the API whether we're updating the message content/payload or its visibility. In the sample above we are, of course, setting its visibility.

Ok, time for turkey and dressing! Oh wait, you want the updated project?

QueueBackgroundProcess w/ UpdateMessage

Alright, so I took exactly the same code we used before. It inserts 5 messages into a queue, then reads and processes each individually. The outer processing loop looks like this:

while (true)
{
    // read messages from queue and process one at a time...
    CloudQueueMessage aMsg = myQueue.GetMessage(new TimeSpan(0, 0, 30)); // 30 second timeout
    // trap no message.
    if (aMsg != null)
    {
        Trace.WriteLine("got a message, '" + aMsg.AsString + "'", "Information");

        // start processing of message
        Work workerObject = new Work();
        workerObject.Msg = aMsg;
        Thread workerThread = new Thread(workerObject.DoWork);
        workerThread.Start();

        while (workerThread.IsAlive)
        {
            myQueue.UpdateMessage(aMsg, new TimeSpan(0, 0, 30), MessageUpdateFields.Visibility);
            Trace.WriteLine("Updating message expiry");
            Thread.Sleep(15000); // sleep for 15 seconds
        }

        if (workerObject.isFinished)
            myQueue.DeleteMessage(aMsg.Id, aMsg.PopReceipt); // I could just use the message, illustrating a point
        else
        {
            // here, we should check the queue count
            // and move the msg to a poison message queue
        }
    }
    else
        Trace.WriteLine("no message found", "Information");

    Thread.Sleep(1000);
    Trace.WriteLine("Working", "Information");
}

The while loop is the main processing loop of the worker role this all runs in. I decreased the initial visibility timeout from 2 minutes to 30 seconds, increased the interval at which we check on the background processing thread from every 1/10th of a second to every 15 seconds, and added the updating of the message visibility timeout.

The inner process was also upped from 30 seconds to 1 minute. Now here's where the example kicks in: since the original read only asked for a 30 second visibility timeout and my background process takes a full minute, it's important that I keep updating the visibility timeout or the message would fall back into view. So I'm extending it by another 30 seconds every 15 seconds, thus keeping it invisible.

Ta-da! Here’s the project if you want it.

So unfortunately that’s all I have time for this week. I hope all of you in the US enjoy your Thanksgiving holiday weekend (I’ll be spending it with family and not working thankfully). And we’ll see you next week!

Displaying a List of Blobs via your browser (Year of Azure Week 9)

Sorry folks, you get a short and simple one again this week. And with no planning whatsoever, it continues the theme of the last two posts: Azure Storage blobs.

So in the demo I did last week I showed how to get a list of blobs in a container via the storage client. Well today my inbox received the following message from a former colleague:

Hey Brent, do you know how to get a list of files that are stored in a container in Blob storage? I can’t seem to find any information on that.  At least any that works.

Well, I pointed out the line of code I used last week, container.ListBlobs(), and he said he was after an approach he'd seen where you could just point a browser at a URI and have it work. I realized then he was talking about the REST API.

Well, as it turns out, the REST API's List Blobs operation is just a simple GET, so we can execute it from any browser. We just need a URI that looks like this:

http://myaccount.blob.core.windows.net/mycontainer?restype=container&comp=list

All you need to do is replace the account and container names with your own. Well, almost all. If you try this with a browser (which makes an anonymous request), you'll also need to set the container level access policy to allow full public read access. If you don't, you may only be allowing public read access to the blobs in the container, in which case a browser hitting the URI above will fail.
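If your container wasn't created with that access level, you can set it from code using the managed storage client. A quick sketch, assuming a CloudBlobContainer reference like the ones used earlier in this series:

// allow anonymous clients to both list and read the blobs in this container
BlobContainerPermissions permissions = new BlobContainerPermissions
{
    PublicAccess = BlobContainerPublicAccessType.Container
};
container.SetPermissions(permissions);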

Now if you’re successful, your browser should display a nice little chunk of XML that you can show off to your friends. Something like this…

[Screenshot: the XML blob list returned by the List Blobs operation]

Unfortunately, that’s all I have time for this week. So until next time!

Page, after page, after page (Year of Azure Week 8)

To view the live demo of this solution, check out Episode 57 of Microsoft's Channel 9 Cloud Cover Show.

Last week, I blogged on writing page blobs one page at a time. I also said that I'd talk more this week about why I was doing this. So here we are with the second half of the demo, the receiver. It's going to download the page blob as it's being uploaded.

We'll skip over setting up the CloudStorageAccount, CloudBlobClient, and CloudBlobContainer (btw, I really need to write a reusable method that streamlines all this for me). This works exactly as it did for the transmitter part of our solution.

The first thing we need to do is pull a list of blobs and iterate through them. To do this we create a foreach loop using the following line of code:

foreach (CloudPageBlob pageBlob in container.ListBlobs().OfType<CloudPageBlob>())

Note the "linqy" OfType part. My buddy Neil Mackenzie shared this tip with me via his new Azure Development Cookbook. It allows me to make sure I'm only retrieving page blobs from storage, a nice trick to help ensure I don't accidentally throw an exception by trying to treat a block blob like a page blob.

Quick product plug… I highly recommend Neil’s book. Not because I helped edit it, but because Neil did an excellent job writing it. There’s a SLEW of great tips and tricks contained in its pages.

Ok, moving on…

Now I need to get the size metadata tag I added to the blob in the transmitter. While the line above does get me a reference to the page blob, it doesn't populate the metadata property. To get those values, I need to call pageBlob.FetchAttributes. I follow that up by creating a file name to save to and associating it with a file stream.

pageBlob.FetchAttributes(); // have to get the metadata
long totalBytesToWrite = int.Parse(pageBlob.Metadata["size"].ToString());

//string fileName = string.Format(@"D:\Personal Projects\{0}", Path.GetFileName(pageBlob.Attributes.Uri.LocalPath));
string fileName = Path.GetFileName(pageBlob.Attributes.Uri.LocalPath);
FileStream theFileStream = new FileStream(fileName, FileMode.OpenOrCreate);

Now we're ready to start receiving the data from the blob's populated pages. We use GetPageRanges to see where the blob has data, check that against the last endpoint we read, and sleep for 1 second if we've already read all the available information (waiting for more pages to be written). We keep doing that until we've written the total size of the blob.

long lastread = 0; // last byte read
while (totalBytesToWrite > 0)
{
    foreach (PageRange pageRange in pageBlob.GetPageRanges())
    {
        // if we have more to write...
        if (pageRange.EndOffset > lastread)
        {
            // hidden region to write pages to file
        }
    }
    Thread.Sleep(1000); // wait for more pages to be written
}

Ok, there's a couple of things I need to call out here. My sample assumes that the pages in the blob will be written in succession. It also assumes that we're only going to process the blobs that exist when my application starts (I only list the blobs in the container once). So if blobs get added after we have retrieved our list, or we restart the receiver, we will see some odd results. What I'm doing is STRICTLY for demonstration purposes. We'll talk more about that later in this post.

The last big chunk of code is associating the blob with a BlobStream and writing it to a file. We do this again, one page at a time…

BlobStream blobStream = pageBlob.OpenRead();

// Seek to the correct starting offset in the page blob stream
blobStream.Seek(lastread, SeekOrigin.Begin);

byte[] streambuffer = new byte[512];

long numBytesToRead = (pageRange.EndOffset + 1 - lastread);
while (numBytesToRead > 0)
{
    int n = blobStream.Read(streambuffer, 0, streambuffer.Length);
    if (n == 0)
        break;

    numBytesToRead -= n;
    int bytesToWrite = (int)(totalBytesToWrite - n > 0 ? n : totalBytesToWrite);
    lastread += n;
    totalBytesToWrite -= bytesToWrite;

    theFileStream.Write(streambuffer, 0, bytesToWrite);
    theFileStream.Flush(); // just for demo purposes
}

You'll notice that I'm using a stream to read the blob rather than retrieving the individual pages. If I wanted to do that, I'd need to go to the Azure Storage REST API, which lets me get a specific range of bytes from a blob using the Get Blob operation. And while that's fine, I can also demonstrate what I'm after using the stream, and since we've already established that I'm a bit of a lazy coder, we'll just use the managed client.
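For the curious, a range read via the REST API might look roughly like the sketch below. Treat it as an illustration only; I haven't wired it into the demo, and the version header value is simply one of the storage service versions that supports range reads.

// rough sketch: fetch a single 512-byte page using the REST API's range support
HttpWebRequest rangeRequest = (HttpWebRequest)WebRequest.Create(pageBlob.Uri);
rangeRequest.Method = "GET";
rangeRequest.Headers.Add("x-ms-version", "2011-08-18"); // storage service version
rangeRequest.Headers.Add("x-ms-range", "bytes=0-511");  // first page only
pageBlob.Container.ServiceClient.Credentials.SignRequest(rangeRequest);

using (HttpWebResponse rangeResponse = (HttpWebResponse)rangeRequest.GetResponse())
using (Stream body = rangeResponse.GetResponseStream())
{
    byte[] pageBuffer = new byte[512];
    int bytesRead = body.Read(pageBuffer, 0, pageBuffer.Length);
    // pageBuffer now holds up to bytesRead bytes of the requested range
}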

The rest of this code snippet consists of some fairly ugly counter/position management code that handles writing the blob to the file. The most important part is that we use bytesToWrite to decide whether we write the entire 512 byte buffer or only as much data as actually remains in the file. This is where my "size" attribute comes in: I use it to determine where the file stored in the series of 512 byte pages actually ends. Some file formats may be forgiving of the extra bytes, but some aren't, so if you're using page blobs you may need to manage this.

So why are we doing all this?

So if you put a breakpoint on the transmitter app, and write 3-4 pages, then put a breakpoint in the receiver app, you’ll see that it will read those pages, then keep hitting the Sleep command until we go back to the transmitter and write a few more pages. What we’re illustrating here is that unlike a block blob, I can actually read a page blob while it is being written.

You can imagine that this could come in handy if you need to push large files around, basically using page blobs as an intermediary buffer for streaming files between two endpoints. And with a bit more work we could start adding restart semantics to this demo.

Now, my demo just goes through the blob in sequential order (this is the "STRICTLY for demonstration" thing I mentioned above). But our buffers don't have to be 512 bytes; they can be up to 4 MB, and a 4 MB operation against Azure storage may take a few seconds to process. That gets us thinking about multi-threading the upload and download of the file, potentially realizing a big increase in throughput while also avoiding the delay of waiting for the upload to complete before starting the download.

So the end result here is that my demo has little practical application. But I hope it has made you think a bit about the oft-overlooked page blob. I'm just as guilty as you of this oversight. So in closing, I want to thank Sushma, one of my Azure trainees these past two weeks. Sushma, if you read this, know that your simple question helped teach me new respect for page blobs. And for that… thank you!

BTW, the complete code for this example is available here for anyone that wants it. Just remember to clear the blobs between runs.

Until next time!

A page at a time (Page Blobs–Year of Azure Week 7)

Going to make this quick. I'm sitting in SeaTac airport in Seattle enjoying the free wifi and need to knock this out quickly as we're going to start boarding in about 10-15 minutes. I've been here in Seattle doing some Azure training and working with a client that's trying to move to Azure when my topic for this week fell into my lap.

One of the attendees at my training wanted to see an example of working with page blobs. I poked around and couldn't find one that I liked, so I decided I'd come up with my own. Then, later in the afternoon, we had a discussion about an aspect of their application, and the random access abilities of page blobs came to mind. So while I haven't had a chance to fully prove out my idea yet, I do want to share the first part of it with you.

The sample below focuses on how to take a stream, divide it into chunks, and write those chunks to an Azure Storage page blob. In the sample I keep each write to storage at 512 bytes (the size of a page), but you could use any multiple of 512. I just wanted to be able to demonstrate the chunk/reassemble process.

We start off by setting up the account, and creating a file stream that we’ll write to Azure blob storage:

MemoryStream streams = new MemoryStream();

// create storage account
var account = CloudStorageAccount.DevelopmentStorageAccount;
// create blob client
CloudBlobClient blobStorage = account.CreateCloudBlobClient();
CloudBlobContainer container = blobStorage.GetContainerReference("guestbookpics");
container.CreateIfNotExist(); // adding this for safety

string uniqueBlobName = string.Format("image_{0}.jpg", Guid.NewGuid().ToString());

System.Drawing.Image imgs = System.Drawing.Image.FromFile("waLogo.jpg");
imgs.Save(streams, ImageFormat.Jpeg);

You may remember this code from the block blob samples I did a month or two back.

Next up, I need to create the page blob:

CloudPageBlob pageBlob = container.GetPageBlobReference(uniqueBlobName);
pageBlob.Properties.ContentType = "image/jpeg";
pageBlob.Metadata.Add("size", streams.Length.ToString());
pageBlob.Create(23552);

Notice that I’m setting it to a fixed size. This isn’t ideal, but in my case I know exactly what size the file I’m uploading is and this is about twice what I need. We’ll get to why I’ve done that later. The important part is that the size MUST be a multiple of 512. No partial pages allowed!
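If you'd rather not hard-code the size, you could round the stream's length up to the next page boundary instead. A small sketch of that calculation:

// round the stream length up to the nearest multiple of 512 (the page size)
long pageSize = 512;
long blobSize = ((streams.Length + pageSize - 1) / pageSize) * pageSize;
pageBlob.Create(blobSize);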

And finally, we start reading the file stream into a byte array buffer, convert that buffer into a memory stream (I know there's got to be a way to avoid this, but I was in a hurry to write the code for this update), and write each "page" to the page blob.

streams.Seek(0, SeekOrigin.Begin);
byte[] streambuffer = new byte[512];

int numBytesToRead = (int)streams.Length;
int numBytesRead = 0;
while (numBytesToRead > 0)
{
    // Read may return anything from 0 up to the buffer length.
    int n = streams.Read(streambuffer, 0, streambuffer.Length);
    // The end of the file is reached.
    if (n == 0)
        break;

    MemoryStream theMemStream = new MemoryStream();
    theMemStream.Write(streambuffer, 0, streambuffer.Length);
    theMemStream.Position = 0;
    pageBlob.WritePages(theMemStream, numBytesRead);

    numBytesRead += n;
    numBytesToRead -= n;
}

Simple enough, and it works pretty well to boot! You'll also notice that I'm doing this one 512 byte page at a time. That's just for demonstration purposes, since the maximum size you can write in a single call (per the REST API documentation) is 4 MB. But as part of my larger experiment, the one page at a time method means I can use smaller sample files.

The one piece we’re missing however is the ability to shrink the page blob down to the actual minimum size I need. For that, we’re going to use the code snippet below:

Uri requestUri = pageBlob.Uri;
if (blobStorage.Credentials.NeedsTransformUri)
    requestUri = new Uri(blobStorage.Credentials.TransformUri(requestUri.ToString()));

HttpWebRequest request = BlobRequest.SetProperties(requestUri, 200,
    pageBlob.Properties, null, 12288);
blobStorage.Credentials.SignRequest(request);
using (WebResponse response = request.GetResponse())
{
    // call succeeded
};

You'll notice this is being done via a REST request directly against blob storage; resizing a blob isn't supported by the storage client. I also need to give credit for this last snippet to the Azure Storage Team.

As I mentioned, I’m in a hurry and wanted to get this out before boarding. So you’ll need to wait until next week to see why I’m playing with this and hopefully the potential may excite you. Until then, I’ll try to refine the code a bit and get the entire solution posted online for you.

Until next time!