Monitoring Windows Azure VM Disk Performance

Note: The performance data mentioned here is based on individual results during a limited testing period and should NOT be used as an indication of future performance or availability. For more information on Windows Azure storage performance and availability, please refer to the published SLA.

So I’ve had an interesting experience the last few days that I wanted to take a few minutes to share with the interwebs. Namely, monitoring some Windows Azure hosted virtual machines, not at the VM level, but at the level of the storage account that held the virtual machine disks.

The scenario I was facing was a customer who was attempting to benchmark the performance and availability of two Linux-based virtual machines running an Oracle database. Both machines were extra-large VMs, one running 10 disks (1 OS, 1 Oracle, 8 data disks) and the other 16 disks (14 for data). The customer had been running an automated load against both machines and wanted to get a clear idea of how much they may or may not have been saturating the underlying Windows Azure Storage system, as well as what could be contributing to the highly variable Oracle IOPS levels they were seeing.

To support this effort, I dug into something I haven’t looked at in depth for quite some time: Windows Azure Storage Analytics (aka Logging and Metrics). Except this time with a focus on what happens at the storage account with regards to VM disk activity.

Enable Storage Analytics Proactively

Before we go anywhere, I need to stress that if you want to be able to see what’s going on with Azure Storage and your virtual machine, you’ll need to enable this BEFORE a problem occurs. If you haven’t already enabled logging, the only option you have to try and go “back in time” and look at past behavior is to open up a support ticket. So if you plan to do this type of monitoring, please be certain to enable analytics!

For Windows Azure VM disk metrics, we need to enable analytics on the blob storage account. As the link I just shared will tell you, you will need to call the “Set Blob Service Properties” API to set this (or use your favorite Windows Azure storage utility). I happen to use Azure Management Studio from Redgate, and it allows me to set the properties you see in this screenshot:

With this, I tell Azure Storage that I want it to log all blob operations (Read/Write/Delete) and retain that information for up to two days. I also enable metrics and ask it to retain that data for two days as well.
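If you’d rather script this than use a utility, newer releases of the storage client library expose the same “Set Blob Service Properties” call directly. Here’s a minimal sketch of setting the values described above; it assumes the 2.x Microsoft.WindowsAzure.Storage library (the 1.x client of this era required a raw REST call instead), and the connection string is a placeholder:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Shared.Protocol;

// assumes the 2.x storage client library; the connection string is a placeholder
var account = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...");
var blobClient = account.CreateCloudBlobClient();

// start from the current settings so we only change what we care about
ServiceProperties props = blobClient.GetServiceProperties();

// log all blob operations (read/write/delete) and keep the logs for two days
props.Logging.LoggingOperations = LoggingOperations.Read | LoggingOperations.Write | LoggingOperations.Delete;
props.Logging.RetentionDays = 2;

// capture hourly metrics (including per-API stats) and keep them for two days
props.Metrics.MetricsLevel = MetricsLevel.ServiceAndApi;
props.Metrics.RetentionDays = 2;

blobClient.SetServiceProperties(props);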

When I enable logging, Azure Storage will log all operations and persist that information into a series of blob files in a special container in the storage account called $logs. Logs will be spread across multiple blob files, as discussed in great detail in the MSDN article “About Storage Analytics Logging“. A word of caution: if the storage account is active, logging will produce a LARGE amount of data. In my case, I was seeing a new 150 MB log file approximately every 3 minutes. That’s about 70 GB per day, so I’ll be storing about 140 GB for my 2 days of retention, which is only about $6.70 per month. Given the cost of the VM itself, this was inconsequential. But if I had shifted my retention period to a month… this could start to get pricey. Additionally, the storage transactions needed to write the logs to blob storage count against the account limit of 20,000 transactions per second. To help reduce the risk of throttling coming into play too early, the virtual machines I’m monitoring have each been deployed into their own storage account.

The metrics are much more lightweight. They are written to a table and provide a per-hour view of the storage account. These are the same values that get surfaced in the Windows Azure Management portal storage account dashboard. I could easily retain these for a much longer period since it’s only a handful of rows being inserted per hour.

Storage Metrics – hourly summary

Now that we’ve enabled storage analytics and told it to capture the metrics, we can run our test, sit back, and wait for data to start coming in. After we’ve run testing for several hours, we can then look at the metrics. Metrics get written to a series of tables, but since I only care about the blob account, I’m going to look at $MetricsTransactionsBlob. We’ll have multiple rows per hour and can filter based on the type of operation, or get the roll-up across all operations. For general trends, it’s this latter figure I’m most interested in. So I apply a query against the table to get all user operations, “(RowKey eq ‘user;All’)“. The resulting query gives me 1 row per hour that I can look at to help me get a general idea of the performance of the storage account.
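If you want to pull those hourly rows programmatically rather than through a storage utility, a query like the following does the trick (a minimal sketch assuming the 2.x storage client library; the property names come from the metrics table schema article linked below):

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// assumes 'account' is the CloudStorageAccount for the storage account being monitored
var tableClient = account.CreateCloudTableClient();
CloudTable metricsTable = tableClient.GetTableReference("$MetricsTransactionsBlob");

// one roll-up row per hour covering all user operations
string filter = TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.Equal, "user;All");
var query = new TableQuery<DynamicTableEntity>().Where(filter);

foreach (DynamicTableEntity row in metricsTable.ExecuteQuery(query))
{
    Console.WriteLine("{0}  availability={1}%  avg E2E latency={2}ms",
        row.PartitionKey, // the hour this row summarizes
        row.Properties["Availability"].DoubleValue,
        row.Properties["AverageE2ELatency"].DoubleValue);
}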

You’ll remember that I opted to put my Linux/Oracle virtual machine into its own storage account. So this hourly summary gives me a really good, high-level overview of the performance of the virtual machine. Key factors I looked at were: availability (we want to make sure that’s above the storage account’s 99.9% SLA), average end-to-end latency, and, if we had less than 100% availability, the count of the number of errors we saw.

I won’t bore you with specific numbers, but over a 24-hour period the lowest availability I saw was 99.993%, with the most common errors being Server Timeouts, Client Errors, or Network Errors. Seeing these occasionally, as long as the storage account remains above 99.9% availability, should be considered normal ‘noise’. Given the transient nature of the cloud, some errors are simply to be expected. We also kept an eye on average end-to-end latency, which during our testing was fairly consistent in the 19-29ms range.

You can learn more about all the data available in these various storage metrics by reviewing ‘Storage Analytics Metrics Table Schema‘ on MSDN.

When we saw numbers that appeared “unusual”, we then took the next logical step and inspected the detailed storage logs.

Blob Storage Logs – the devil is in the details

Alright, so things get a bit messier here. First off, the logs are just delimited-format files. And while the metrics can help tell us which period in time we want to look at, depending on the number of storage operations, we may have several logs we need to slog through (in my case, I was getting about twenty 150 MB log files per hour). So the first step when digging into the logs is to download them. Either write up some code, grab your favorite utility, or perhaps just log into the management portal and download the files for the timeframe you want to take a closer look at. Once that’s done, it’s time for some Excel (yeah, that spreadsheet thing… really).
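As for the “write up some code” option, pulling everything out of $logs only takes a few lines. Here’s a minimal sketch, again assuming the 2.x storage client library; the local folder is a placeholder:

using System.IO;
using Microsoft.WindowsAzure.Storage.Blob;

// assumes 'blobClient' is a CloudBlobClient for the storage account being monitored
CloudBlobContainer logsContainer = blobClient.GetContainerReference("$logs");

// flat listing so we get the individual log blobs rather than the date "folders"
foreach (IListBlobItem item in logsContainer.ListBlobs(null, true))
{
    var logBlob = (CloudBlockBlob)item;
    string localPath = Path.Combine(@"C:\temp\storagelogs", logBlob.Name.Replace('/', '_'));

    using (FileStream fs = File.Create(localPath))
    {
        logBlob.DownloadToStream(fs);
    }
}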

The log files are semicolon-delimited files. As such, the easiest way I found to do ad-hoc inspection of the files is to open them up in a spreadsheet application like Excel. I open up Excel, then do the whole “File -> Open” thing to select the log file I want to look at. I then tell Excel it’s a delimited file with a semicolon as the delimiter, and in a few seconds it will import the file all nice and spreadsheet-like for me. But before we start doing anything, let’s talk about the log file format. Since the log file doesn’t contain any headers, we either need to know which columns contain the data we want, or add some headers. For the sake of keeping things easy for you (and saving a copy for myself), I created my own Excel file that already has all the log file fields declared in it. So you can just copy and paste from this spreadsheet into your log file once it’s loaded into Excel. For the remainder of this article, I’m going to assume this is what you’ve done.

With our log file headers, we can now start filtering the data. If we’re looking for errors, the first thing we’ll want to do is open up a log file and filter based on “request status”. To do this, select the “Data” tab and click on “filter”. This allows us to click on the various column headings and filter down what we’re looking at. The shot below shows a log that had a couple of errors in it. So I can easily remove the checkbox on “Success” to drill into those specific errors. This is handy if we want to know exactly what happened as the log also contains a “request-id-header” field. With that value, we can open up a support ticket and ask them to dig into the issue more deeply.

Now this is the first real caution I have. Between the metrics and the logs, we can get a really good idea of what types of errors are happening. But this doesn’t mean that every error should be investigated. With cloud computing solutions, there’s a certain amount of “transient” error that is simply to be expected. It’s only if you see a prolonged or persistent issue that you’d really want to dig into the errors in any real depth. One key indicator is to look at the metrics and keep an eye on availability. If it falls below 99.9%, that means there may have been an SLA violation for the storage account. In that case, I’d take a look at the logs for that period and see what types of errors we saw. As long as the issue wasn’t caused by a spike in throttling (meaning we overloaded the system), there may be something worth having support look into. But if we’re at 99.999%, with the occasional network failure, timeout, or ‘client other’ error, we’re likely just seeing the “noise” one would expect from transient errors as the system adjusts and compensates for changes to its underlying fabric.

Now since we’re doing benchmarking tests, there’s one other key thing I look at: the number of operations occurring on the blobs that are the various disks mounted into our virtual machine. This is another task where Excel can help out, by adding subtotals. Adding subtotals requires column headings, so this is the part where you go “thank you Brent for making it so I just need to copy those in”. You’re welcome.

The field we want to look at in the logs for our subtotal is the “requested-object-key” field. This value is the specific object in the storage account that was being accessed (aka the blob file, or disk). Going again to the Data tab in Excel, we’ll select “subtotal” and complete the dialog box as shown at the left. This will create subtotals by object (disk) and allow us to see the count of operations against that object. So what we have is the number of operations performed on that disk during the time period covered by the log file. Using that value, we can then get a fairly good approximation of the “transactions per second” that the disk is generating against storage.

So what did we learn?

If you are doing IO benchmarking of the virtual machine (as I was), you may notice something odd. We observed that our Linux/Oracle VM was reporting IOPS far above what we saw at the Windows Azure Storage level. This is to be expected because Oracle is buffering requests itself to increase performance. Add in any disk buffering we may have enabled, and the numbers can skew even further. Ultimately though, what we did establish during our testing was that we knew for certain when we were overloading the Windows Azure Storage subsystem and contributing to server slowdowns that way. We were also able to observe several small instances where Oracle performance trailed off somewhat, and that these were due to isolated incidents where we saw an increase in various errors or in end-to-end operation latency.

The net result here is that while virtual machine performance is related to the performance of the underlying storage subsystem, there’s no easy 1-to-1 relationship between errors in one and issues in the other. Additionally, as you watch these over time, you understand why virtual machine disk performance can vary over time and shouldn’t be compared to the behaviors we’ve come to expect from a physical disk drive. We also learned what we need to do to help us more effectively monitor Windows Azure Storage so that we can proactively take action to address potential customer-facing impacts.

I apologize for not going into more depth on this subject. I just wanted to get this all out there before it fell into the fog of my memory. Hopefully you find it useful. So until next time!

The “traffic cop” pattern

So I like design patterns, but I don’t follow them closely. The problem is that there are too many names, and it’s just so darn hard to find them. But one “pattern” I keep seeing asked for is the ability to have something that only runs once across a group of Windows Azure instances. This can surface as a one-time startup task, or it could be the need to have something that runs constantly where, if one instance fails, another can realize this and pick up the work.

This latter example is often referred to as a “self-electing controller”. At the root of it is a pattern I’ve taken to calling a “traffic cop”. This mini-pattern involves having a unique resource that can be locked; the process that gets the lock has the right of way, hence the term “traffic cop”. In the past, aka my “mainframe days”, I used this with systems where I might be processing work in parallel and needed to make sure that a sensitive block of code could prevent a parallel process from executing it while it was already in progress. This is critical when you have apps that are doing things like self-incrementing unique keys.

In Windows Azure, the most common way to do this is to use a Windows Azure Storage blob lease. You’d think this comes up often enough that there’d be a post on how to do it already, but I’ve never really run across one. That is until today. Keep reading!

But before I dig into the meat of this, a couple footnotes… First is a shout out to my buddy Neil over at the Convective blog. I used Neil’s Azure Cookbook to help me with the blob leasing stuff. You can never have too many reference books in your Kindle library. Secondly, the Windows Azure Storage team is already working on some enhancements for the next Windows Azure .NET SDK that will give us some more ‘native’ ways of doing blob leases, including taking advantage of the newest features of the 2012-02-12 storage service version. So the leasing techniques I show below may change in an upcoming SDK.

Blob based Traffic Cop

Because I want to get something that works for Windows Azure Cloud Services, I’m going to implement my traffic cop using a blob. But if you wanted to do this on-premises, you could just as easily get an exclusive lock on a file on a shared drive. So we’ll start by creating a new Cloud Service, add a worker role to it, and then add a public class to the worker role called “BlobTrafficCop”.

Shell this class out with a constructor that takes a CloudPageBlob, a property that we can test to see if we have control, and methods to Start and Stop control. This shell should look kind of like this:

class BlobTrafficCop
{
    public BlobTrafficCop(CloudPageBlob blob)
    {
    }

    public bool HasControl
    {
        get
        {
            return true;
        }
    }

    public void Start(TimeSpan pollingInterval)
    {
    }

    public void Stop()
    {
    }
}

Note that I’m using a CloudPageBlob. I specifically chose this over a block blob because I wanted to call something out. We could create a 1 TB page blob and not be charged for a single byte of storage unless we put something into it. In this demo, we won’t be storing anything, so I could create a million of these traffic cops and would only incur bandwidth and transaction charges. Now the amount I’m saving here isn’t even significant enough to be a rounding error, so just note this down as a piece of trivia you may want to use some day. It should also be noted that the size you set in the call to the Create method is arbitrary, but MUST be a multiple of 512 (the size of a page). If you set it to anything that’s not a multiple of 512, you’ll receive an invalid argument exception.

I’ll start putting some guts into this by doing a null argument check in my constructor and also saving the parameter to a private variable. The real work starts when I create three private helper methods to work with the blob lease: GetLease, RenewLease, and ReleaseLease.
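Before we get to those helpers, here’s a minimal sketch of what that constructor and the class’s private fields look like, based on the names used throughout the rest of this post:

private CloudPageBlob myBlob;             // the blob we lock via a lease
private string myLeaseID = string.Empty;  // non-empty only while we hold a lease
private TimeSpan myPollingInterval;
private System.Threading.Timer renewalTimer;
private System.Threading.Timer pollingTimer;

public bool IsRunning { get; private set; }

public BlobTrafficCop(CloudPageBlob blob)
{
    if (blob == null)
        throw new ArgumentNullException("blob");

    myBlob = blob;
}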

GetLease has two parts, setting up the blob, and then acquiring the lease. Here’s how I go about creating the blob using the CloudPageBlob object that was handed in:

try
{
    myBlob.Create(512);
}
catch (StorageClientException ex)
{
    // conditionfailed will occur if there's already a lease on the blob
    if (ex.ErrorCode != StorageErrorCode.ConditionFailed)
    {
        myLeaseID = string.Empty;
        throw; // re-throw, preserving the original stack trace
    }
}

Now admittedly, this does require another round trip to Windows Azure Storage, so as a general rule I’d make sure the blob was created when I deploy the solution and not each time I try to get a lease on it. But this is a demo and we want to make running it as simple as possible, so we’re putting this in. I’m trapping for a StorageClientException with a specific error code of ConditionFailed. This is what you will see if you issue the Create method against a blob that has an active lease on it, so we’re handling that situation. I’ll get to myLeaseID in a moment.

The next block creates a web request to lease the blob and tries to get that lease.

try
{
    HttpWebRequest getRequest = BlobRequest.Lease(myBlob.Uri, 30, LeaseAction.Acquire, null);
    myBlob.Container.ServiceClient.Credentials.SignRequest(getRequest);
    using (HttpWebResponse response = getRequest.GetResponse() as HttpWebResponse)
    {
        myLeaseID = response.Headers["x-ms-lease-id"];
    }
}
catch (System.Net.WebException)
{
    // this will be thrown by GetResponse if there's already a lease on the blob
    myLeaseID = string.Empty;
}

BlobRequest.Lease will give me a template HttpWebRequest for the lease. I then use the blob I received in the constructor to sign the request, and finally I execute the request and get its response. If things go well, I’ll get a response back and it will have a header with the ID for the lease, which I put into a private variable (the myLeaseID from earlier) that I can use later when I need to renew the lease. I also trap for a WebException, which will be thrown if my attempt to get a lease fails because there’s already a lease on the blob.

RenewLease and ReleaseLease are both much simpler. Renew creates a request object, signs and executes it just like we did before. We’ve just changed the LeaseAction to Renew.

HttpWebRequest renewRequest = BlobRequest.Lease(myBlob.Uri, 30, LeaseAction.Renew, myLeaseID);
myBlob.Container.ServiceClient.Credentials.SignRequest(renewRequest);
using (HttpWebResponse response = renewRequest.GetResponse() as HttpWebResponse)
{
    myLeaseID = response.Headers["x-ms-lease-id"];
}

ReleaseLease is just a bit more complicated because we check the status code to make sure we released the lease properly. But again it’s mainly just creating the request and executing it, this time with the LeaseAction of Release.

HttpWebRequest releaseRequest = BlobRequest.Lease(myBlob.Uri, 30, LeaseAction.Release, myLeaseID);
myBlob.Container.ServiceClient.Credentials.SignRequest(releaseRequest);
using (HttpWebResponse response = releaseRequest.GetResponse() as HttpWebResponse)
{
    HttpStatusCode httpStatusCode = response.StatusCode;
    if (httpStatusCode == HttpStatusCode.OK)
        myLeaseID = string.Empty;
}

Ideally, I’d have liked to do a bit more testing of these to make sure there weren’t any additional exceptions I should handle. But I’m short on time so I’ll leave that for another day.

Starting and Stopping

Blob leases expire after an interval if they are not renewed. So it’s important that I have a process that regularly renews the lease, and another that will check to see if I can get the lease if I don’t already have it. To that end, I’m going to use System.Threading.Timer objects with a single delegate called TimerTask. This delegate is fairly simple, so we’ll start there.

private void TimerTask(object StateObj)
{
    // if we have control, renew the lease
    if (this.HasControl)
        RenewLease();
    else // we don't have control
        // try to get lease
        GetLease();

    renewalTimer.Change((this.HasControl ? TimeSpan.FromSeconds(45) : TimeSpan.FromMilliseconds(-1)), TimeSpan.FromSeconds(45));
    pollingTimer.Change((!this.HasControl ? myPollingInterval : TimeSpan.FromMilliseconds(-1)), myPollingInterval);
}

We start by checking the HasControl property we created in our shell. This property just checks whether myLeaseID is a string with a length > 0. If so, then we need to renew our lease. If not, then we need to try to acquire the lease. I then change the intervals on two System.Threading.Timer objects (we’ll set them up next), renewalTimer and pollingTimer. Both are private variables of our class.
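In other words, the property body amounts to nothing more than this:

public bool HasControl
{
    // we only have a lease ID while we hold the lease
    get { return !string.IsNullOrEmpty(myLeaseID); }
}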

If we have control, then the renewal timer will be set to fire again in 45 seconds (15 seconds before our lease expires), and continue to fire every 45 seconds after that. If we don’t have control, renewal will stop checking. pollingTimer works in reverse, polling if we don’t have a lease and stopping when we do. I’m using two separate timers because the renewal timer needs to fire before the one-minute lease expires if I’m to keep control, but the process that’s leveraging this may want to control the interval at which we poll for control, so I want that on a separate timer.

Now let’s start our traffic cop:

public void Start(TimeSpan pollingInterval)
{
    if (this.IsRunning)
        throw new InvalidOperationException("This traffic cop is already active. You must call 'stop' first.");

    this.IsRunning = true;

    myPollingInterval = pollingInterval;

    System.Threading.TimerCallback TimerDelegate = new System.Threading.TimerCallback(TimerTask);

    // start polling immediately for control
    pollingTimer = new System.Threading.Timer(TimerDelegate, null, TimeSpan.FromMilliseconds(0), myPollingInterval);
    // don't do any renewal polling
    renewalTimer = new System.Threading.Timer(TimerDelegate, null, TimeSpan.FromMilliseconds(-1), TimeSpan.FromSeconds(45));
}

We do a quick check to make sure we’re not already running, then set a flag to say we are (just a private boolean flag). I save off the control polling interval that was passed in and set up a TimerCallback delegate using the TimerTask method we created a moment ago. Now it’s just a matter of creating our Timers.

The polling timer will start immediately and fire again at the interval the calling process set. The renewal timer, since we’re just starting our attempts to get control, will not start, but it will be set up to fire every 45 seconds so we’re ready to renew the lease once we get it.

When we call the Start method, it essentially causes our polling timer to fire immediately (asynchronously). So when TimerTask is executed by that timer, HasControl will be false and we’ll try to get a lease. If we succeed, the polling timer will be stopped and the renewal timer will be activated.

Now to stop traffic:

public void Stop()
{
    // stop lease renewal
    if (renewalTimer != null)
        renewalTimer.Change(TimeSpan.FromMilliseconds(-1), TimeSpan.FromSeconds(45));
    // stop polling for a new lease
    if (pollingTimer != null)
        pollingTimer.Change(TimeSpan.FromMilliseconds(-1), myPollingInterval);

    // release a lease if we have one
    if (this.HasControl)
        ReleaseLease();

    this.IsRunning = false;
}

We stop both timers, release the lease if we have one, and then reset our boolean “IsRunning” flag.

And that’s the basics of our TrafficCop class. Now for implementation….

Controlling the flow of traffic

Now the point of this is to give us a way to control when completely unrelated processes can perform an action. So let’s flip over to the WorkerRole.cs file and put some code to leverage the traffic cop into its Run method. We’ll start by creating an instance of the CloudPageBlob object that will be our lockable object and will be passed into our TrafficCop class.

var account = CloudStorageAccount.FromConfigurationSetting("TrafficStorage");

// create blob client
CloudBlobClient blobStorage = account.CreateCloudBlobClient();
CloudBlobContainer container = blobStorage.GetContainerReference("trafficcopdemo");
container.CreateIfNotExist(); // adding this for safety

// use a page blob; if it's empty, there are no storage costs
CloudPageBlob pageBlob = container.GetPageBlobReference("singleton");

This creates an object but doesn’t actually create the blob. I made the conscious decision to go this route to keep the TrafficCop class from having to directly manage storage credentials or the container. Your individual needs may vary. The nice thing is that once this is done, starting the cop is a VERY simple process:

myTrafficCop = new BlobTrafficCop(pageBlob);
myTrafficCop.Start(TimeSpan.FromSeconds(60));

So this will tell the cop to use a blob called “singleton” in the blob container “trafficcopdemo” as our controlling resource and to check for control every 60 seconds. But that’s not really interesting. If we ran this with two instances, what we’d see is that one instance would get control and keep it until something went wrong with getting the lease. So I want to alter the infinite loop of this worker role so I can see that the cop is doing its job and also so I can pass control back and forth.

So I’m going to alter the default loop so that it will sleep for 15 seconds every loop, and each time through it will write a message to the console saying that it either does or does not have control. Finally, I’ll use a counter so that if an instance has control, it will only keep control for about 75 seconds and then release it.

int controlcount = 0;
while (true)
{
    if (!myTrafficCop.IsRunning)
        myTrafficCop.Start(TimeSpan.FromSeconds(30));

    if (myTrafficCop.HasControl)
    {
        Trace.WriteLine(string.Format("Have Control: {0}", controlcount.ToString()), "TRAFFICCOP");
        controlcount++;
    }
    else
        Trace.WriteLine("Don't Have Control", "TRAFFICCOP");

    if (controlcount >= 4)
    {
        myTrafficCop.Stop();
        controlcount = 0;
        Thread.Sleep(TimeSpan.FromSeconds(15));
    }

    Thread.Sleep(TimeSpan.FromSeconds(15));
}

Not the prettiest code I’ve ever written, but it gets the job done.

Looking at the results

So to see the demo at work, we’re going to increase the instance count to 2, and I’m also going to disable diagnostics. Enabling diagnostics would just cause some extra messages in the console output that I want to avoid; otherwise, you can leave it in there. Once that’s done, it’s just a matter of setting the TrafficStorage configuration setting to point at a storage account and pressing F5 to run the demo. If everything goes well, the role should deploy, and we can see both instances running in the Windows Azure Compute Emulator UI (check the little blue flag in the tool tray to view the UI).

If everything is working as intended, you’ll see output sort of like this:

[screenshot: compute emulator console output showing the two instances trading control]

Notice that the instances are going back and forth with having control, just as we’d hoped. You may also note that the first message was that we didn’t have control. This is because our attempt to get control happens asynchronously in a separate thread. You can change that if you need to, but in our case this isn’t necessary. I just wanted to point it out.

Now as I mentioned, this is just a mini-pattern. So for my next post I hope to wrap this in another class that demonstrates the self-electing controller, again leveraging async processes to execute something for our role instance in a separate thread, but done in a way where we don’t need to monitor and manage what’s happening ourselves. Meanwhile, I’ve uploaded the code, so please make use of it.

Until next time!

Long Running Queue Processing Part 2 (Year of Azure–Week 20)

So back in July I published a post on doing long-running queue processing. In that post we put together a nice sample app that inserted some messages into a queue, read them one at a time, and took 30 seconds to process each message. It did the processing in a background thread so that we could monitor it.

This approach was all well and good, but it hinged on us knowing the maximum amount of time it would take to process a message. Fortunately for us, in the latest 1.6 version of the Azure Tools (aka the SDK), the storage client was updated to take advantage of the new “update message” functionality introduced to queues by an earlier service update. So I figured it was time to update my sample.

UpdateMessage

Fortunately for me, given the upcoming holiday (which doesn’t leave me much time for blogging, given that my family lives in “the boonies” and hasn’t yet opted for an internet connection, much less broadband), updating a message is SUPER simple.

myQueue.UpdateMessage(aMsg, new TimeSpan(0, 0, 30), MessageUpdateFields.Visibility);

All we need is the message we read (which contains the pop receipt the underlying API uses to update the invisible message), the new timespan, and finally a flag to tell the API whether we’re updating the message content/payload or its visibility. In the sample above we are, of course, setting its visibility.

Ok, time for turkey and dressing! Oh wait, you want the updated project?

QueueBackgroundProcess w/ UpdateMessage

Alright, so I took exactly the same code we used before. It inserts 5 messages into a queue, then reads and processes each individually. The outer processing loop looks like this:

while (true)
{
    // read messages from queue and process one at a time...
    CloudQueueMessage aMsg = myQueue.GetMessage(new TimeSpan(0, 0, 30)); // 30 second visibility timeout

    // trap no message
    if (aMsg != null)
    {
        Trace.WriteLine("got a message, '" + aMsg.AsString + "'", "Information");

        // start processing of message
        Work workerObject = new Work();
        workerObject.Msg = aMsg;
        Thread workerThread = new Thread(workerObject.DoWork);
        workerThread.Start();

        while (workerThread.IsAlive)
        {
            myQueue.UpdateMessage(aMsg, new TimeSpan(0, 0, 30), MessageUpdateFields.Visibility);
            Trace.WriteLine("Updating message expiry");
            Thread.Sleep(15000); // sleep for 15 seconds
        }

        if (workerObject.isFinished)
            myQueue.DeleteMessage(aMsg.Id, aMsg.PopReceipt); // I could just use the message, illustrating a point
        else
        {
            // here, we should check the message's dequeue count
            // and move the msg to a poison message queue
        }
    }
    else
        Trace.WriteLine("no message found", "Information");

    Thread.Sleep(1000);
    Trace.WriteLine("Working", "Information");
}

The while loop is the processing loop of the worker role that this all runs in. I decreased the initial visibility timeout from 2 minutes to 30 seconds, increased the monitoring interval of the background processing thread from every 1/10th of a second to every 15 seconds, and added the updating of the message visibility timeout.

The inner process was also upped from 30 seconds to 1 minute. Now here’s where the example kicks in! Since the original read only specified a 30-second visibility timeout, and my background process will take one minute, it’s important that I update the visibility timeout or the message would fall back into view. So I’m extending it by another 30 seconds every 15 seconds, thus keeping it invisible.
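The background Work class itself isn’t shown above, so here’s a minimal sketch of what it might look like, based purely on how the loop uses it (the Msg property, the DoWork method, and the isFinished flag); treat it as a stand-in, not the actual sample code:

// hedged sketch of the background worker the processing loop assumes
// (lives in the worker role project, which already references System.Threading and the storage client)
class Work
{
    public CloudQueueMessage Msg { get; set; }
    public bool isFinished = false;

    public void DoWork()
    {
        // stand-in for the real work; the updated sample's inner process takes one minute
        Thread.Sleep(TimeSpan.FromMinutes(1));

        // Msg is available here if the real processing needs the payload (Msg.AsString)
        isFinished = true;
    }
}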

Ta-da! Here’s the project if you want it.

So unfortunately that’s all I have time for this week. I hope all of you in the US enjoy your Thanksgiving holiday weekend (I’ll be spending it with family and not working thankfully). And we’ll see you next week!

Displaying a List of Blobs via your browser (Year of Azure Week 9)

Sorry folks, you get a short and simple one again this week. And with no planning whatsoever, it continues the theme of the last two items: Azure Storage blobs.

So in the demo I did last week I showed how to get a list of blobs in a container via the storage client. Well today my inbox received the following message from a former colleague:

Hey Brent, do you know how to get a list of files that are stored in a container in Blob storage? I can’t seem to find any information on that.  At least any that works.

Well, I pointed out the line of code I used last week, container.ListBlobs(), and he said he was after an approach he’d seen where you could just point a URI at the container and have it work. I realized then he was talking about the REST API.

Well, as it turns out, the REST API’s List Blobs operation just needs a simple GET request, so we can execute it from any browser. We just need a URI that looks like this:

http://myaccount.blob.core.windows.net/mycontainer?restype=container&comp=list

All you need to do is replace the underlined values. Well, almost all. If you try this with a browser (which makes an anonymous request), you’ll also need to set the container-level access policy to allow full public read access. If you don’t, you may only be allowing public read access for the blobs in the container, in which case a browser with the URI above will fail.
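If you’d rather set that policy from code than from a utility, it’s a one-liner against the container; a quick sketch assuming the 1.x StorageClient library of the time (the container name is a placeholder):

// assumes 'blobClient' is a CloudBlobClient created from your storage credentials
CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");

container.SetPermissions(new BlobContainerPermissions
{
    // Container = full public read: anonymous callers can list the container and read its blobs
    PublicAccess = BlobContainerPublicAccessType.Container
});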

Now if you’re successful, your browser should display a nice little chunk of XML that you can show off to your friends. Something like this…

[screenshot: the XML returned by the List Blobs operation]

Unfortunately, that’s all I have time for this week. So until next time!

Page, after page, after page (Year of Azure Week 8)

To view the live demo of this solution, check out Episode 57 of Microsoft’s Channel 9 Cloud Cover Show.

Last week, I blogged about writing page blobs one page at a time. I also said that I’d talk more this week about why I was doing this. So here we are with the second half of the demo, the receiver. It’s going to download the page blob as it’s being uploaded.

We’ll skip over setting up the CloudStorageAccount, CloudBlobClient, and CloudBlobContainer (btw, I really need to write a reusable method that streamlines all this for me). This works exactly as it did for the transmitter part of our solution.

The first thing we need to do is pull a list of blobs and iterate through them. To do this we create a foreach loop using the following line of code:

foreach (CloudPageBlob pageBlob in container.ListBlobs().OfType<CloudPageBlob>())

Note the “linqy” OfType part. My buddy Neil Mackenzie shared this tip with me via his new Azure Development Cookbook. It allows me to make sure I’m only retrieving page blobs from storage. A nice trick to help ensure I don’t accidentally throw an exception by trying to treat a block blob like a page blob.

Quick product plug… I highly recommend Neil’s book. Not because I helped edit it, but because Neil did an excellent job writing it. There’s a SLEW of great tips and tricks contained in its pages.

Ok, moving on…

Now I need to get the size metadata tag I added to the blob in the transmitter. While the line above does get me a reference to the page blob, it didn’t populate the metadata property. To get those values, I need to call pageBlob.FetchAttributes. I follow this up by creating a save name for the file and associating it with a file stream.

pageBlob.FetchAttributes(); // have to get the metadata
long totalBytesToWrite = int.Parse(pageBlob.Metadata["size"].ToString());

//string fileName = string.Format(@"D:\Personal Projects\{0}", Path.GetFileName(pageBlob.Attributes.Uri.LocalPath));
string fileName = Path.GetFileName(pageBlob.Attributes.Uri.LocalPath);
FileStream theFileStream = new FileStream(fileName, FileMode.OpenOrCreate);

Now we’re ready to start receiving the data from the blob’s populated pages. We use GetPageRanges to see where the blob has data, check that against the last endpoint we read, and sleep for 1 second if we’ve already read all the available information (waiting for more pages to be written). We’ll keep doing that until we’ve written the total size of the blob.

long lastread = 0; // last byte read
while (totalBytesToWrite > 0)
{
    foreach (PageRange pageRange in pageBlob.GetPageRanges())
    {
        // if we have more to write…
        if (pageRange.EndOffset > lastread)
        {
            // hidden region to write pages to file
        }
    }
    Thread.Sleep(1000); // wait for more stuff to write
}

Ok, there’s a couple things I need to call out here. My sample assumes that the pages in the blob will be written in succession. It also assumes that we’re only going to process the blobs that exist when my application starts (I only list the blobs in the container once). So if blobs get added after we have retrieved our list, or we restart the receiver, we will see some odd results. What I’m doing is STRICTLY for demonstration purposes. We’ll talk more about that later in this post.

The last big chunk of code associates the blob with a BlobStream and writes it to a file. We do this, again, one page at a time…

BlobStream blobStream = pageBlob.OpenRead();

// Seek to the correct starting offset in the page blob stream
blobStream.Seek(lastread, SeekOrigin.Begin);

byte[] streambuffer = new byte[512];

long numBytesToRead = (pageRange.EndOffset + 1 - lastread);
while (numBytesToRead > 0)
{
    int n = blobStream.Read(streambuffer, 0, streambuffer.Length);
    if (n == 0)
        break;

    numBytesToRead -= n;
    int bytesToWrite = (int)(totalBytesToWrite - n > 0 ? n : totalBytesToWrite);
    lastread += n;
    totalBytesToWrite -= bytesToWrite;

    theFileStream.Write(streambuffer, 0, bytesToWrite);
    theFileStream.Flush(); // just for demo purposes
}

You’ll notice that I’m using a stream to read the blob and not retrieving the individual pages. If I wanted to do that, I’d need to go to the Azure Storage REST API, which allows me to get a specific range of bytes from a blob using the Get Blob operation. And while that’s fine, I can also demonstrate what I’m after using the stream. And since we’ve already established that I’m a bit of a lazy coder, we’ll just use the managed client.

The rest of this code snippet consists of some fairly ugly counter/position management code that handles writing the blob to the file. The most important part is that we use bytesToWrite to decide whether we write the entire 512-byte buffer or only as much data as remains in our blob. This is where my “size” attribute comes in: I’ve used it to determine where the file stored in the series of 512-byte pages actually ends. Some files may be forgiving of the extra bytes, but some aren’t. So if you’re using page blobs, you may need to manage this.

So why are we doing all this?

So if you put a breakpoint on the transmitter app, and write 3-4 pages, then put a breakpoint in the receiver app, you’ll see that it will read those pages, then keep hitting the Sleep command until we go back to the transmitter and write a few more pages. What we’re illustrating here is that unlike a block blob, I can actually read a page blob while it is being written.

You can imagine that this could come in handy if you need to push large files around, basically using page blobs as an intermediary buffer for streaming files between two endpoints. And after a bit more work, we can start adding restart semantics to this demo.

Now, my demo just goes in sequential order through the blob (this is the “STRICTLY for demonstration” thing I mentioned above). But if we start thinking that our buffers don’t have to be 512 bytes and can instead be up to 4 MB, and that a 4 MB operation against Azure Storage may take a few seconds to process, we start thinking about multi-threading the upload/download of the file, potentially realizing a huge increase in throughput while also avoiding the delay of having to wait until the upload completes before starting the download.

So the end result here is that my demo has little practical application. But I hope what it has done is made you think a bit about the oft-overlooked page blob. I’m just as guilty as you of this oversight. So in closing, I want to thank Sushma, one of my Azure trainees these past two weeks. Sushma, if you read this, know that your simple question helped teach me new respect for page blobs. And for that… thank you!

BTW, the complete code for this example is available here for anyone that wants it. Just remember to clear the blobs between runs.

Until next time!

A page at a time (Page Blobs–Year of Azure Week 7)

Going to make this quick. I’m sitting in SeaTac airport in Seattle, enjoying the free wifi, and need to knock this out quickly as we’re going to start boarding in about 10-15 minutes. I’ve been here in Seattle doing some Azure training and working with a client that’s trying to move to Azure when my topic for this week fell in my lap.

One of the attendees at my training wanted to see an example of doing page blobs. I poked around and couldn’t find one that I liked, so I decided I’d come up with one. Then, later in the afternoon, we had a discussion about an aspect of their client, and the idea of the random-access abilities of page blobs came to mind. So while I haven’t had a chance to fully prove out my idea yet, I do want to share the first part of it with you.

The sample below focuses on how to take a stream, divide it into chunks, and write those to an Azure Storage page blob. Now, in the sample I keep each write to storage at 512 bytes (the size of a page), but you could use any multiple of 512. I just wanted to be able to demonstrate the chunk/reassemble process.

We start off by setting up the account, and creating a file stream that we’ll write to Azure blob storage:

MemoryStream streams = new MemoryStream();

// create storage account
var account = CloudStorageAccount.DevelopmentStorageAccount;

// create blob client
CloudBlobClient blobStorage = account.CreateCloudBlobClient();
CloudBlobContainer container = blobStorage.GetContainerReference("guestbookpics");
container.CreateIfNotExist(); // adding this for safety

string uniqueBlobName = string.Format("image_{0}.jpg", Guid.NewGuid().ToString());

System.Drawing.Image imgs = System.Drawing.Image.FromFile("waLogo.jpg");

imgs.Save(streams, ImageFormat.Jpeg);

You may remember this code from the block blob samples I did a month or two back.

Next up, I need to create the page blob:

CloudPageBlob pageBlob = container.GetPageBlobReference(uniqueBlobName);
pageBlob.Properties.ContentType = "image/jpeg";
pageBlob.Metadata.Add("size", streams.Length.ToString());
pageBlob.Create(23552);

Notice that I’m setting it to a fixed size. This isn’t ideal, but in my case I know exactly what size the file I’m uploading is, and this is about twice what I need. We’ll get to why I’ve done that later. The important part is that the size MUST be a multiple of 512. No partial pages allowed!
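If you don’t know the final size up front, you can always round whatever length you do have up to the next page boundary. A little helper like this (my own, not part of the sample) does the math:

// round a desired length up to the nearest multiple of 512 (the page size)
static long RoundUpToPageSize(long length)
{
    const long pageSize = 512;
    return ((length + pageSize - 1) / pageSize) * pageSize;
}

// e.g. pageBlob.Create(RoundUpToPageSize(streams.Length));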

And finally, we start reading my file stream into a byte array buffer, converting that buffer into a memory stream (I know there’s got to be a way to avoid this, but I was in a hurry to write the code for this update), and writing each “page” to the page blob.

streams.Seek(0, SeekOrigin.Begin);
byte[] streambuffer = new byte[512];

int numBytesToRead = (int)streams.Length;
int numBytesRead = 0;
while (numBytesToRead > 0)
{
    // Read may return anything from 0 to the buffer length.
    int n = streams.Read(streambuffer, 0, streambuffer.Length);
    // The end of the file is reached.
    if (n == 0)
        break;

    MemoryStream theMemStream = new MemoryStream();
    theMemStream.Write(streambuffer, 0, streambuffer.Length);
    theMemStream.Position = 0;
    pageBlob.WritePages(theMemStream, numBytesRead);

    numBytesRead += n;
    numBytesToRead -= n;
}

Simple enough, and it works pretty well to boot! You’ll also notice that I’m doing this one 512-byte page at a time. This is just for demonstration purposes, as the maximum size you can write (based on the REST API documentation) is 4 MB. But as part of my larger experiment, the one-page-at-a-time method means I can use smaller sample files.

The one piece we’re missing however is the ability to shrink the page blob down to the actual minimum size I need. For that, we’re going to use the code snippet below:

Uri requestUri = pageBlob.Uri;
if (blobStorage.Credentials.NeedsTransformUri)
    requestUri = new Uri(blobStorage.Credentials.TransformUri(requestUri.ToString()));

HttpWebRequest request = BlobRequest.SetProperties(requestUri, 200,
    pageBlob.Properties, null, 12288);

blobStorage.Credentials.SignRequest(request);
using (WebResponse response = request.GetResponse())
{
    // call succeeded
};

You’ll notice this is being done via a REST request directly to blob storage; resizing a blob isn’t supported via the storage client. I also need to give credit for this last snippet to the Azure Storage team.
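As a side note, newer releases of the storage client library added a Resize method directly on CloudPageBlob, so if you’re reading this on a more recent SDK you can likely skip the raw REST call entirely. A hedged sketch (check your SDK version; this is not the 1.x library used above):

// assumes a newer Microsoft.WindowsAzure.Storage client library, not the 1.x StorageClient used in this post
// 'newerContainer' is a CloudBlobContainer from that newer library (hypothetical here)
CloudPageBlob newerPageBlob = newerContainer.GetPageBlobReference(uniqueBlobName);
newerPageBlob.Resize(12288); // shrink to 24 pages of 512 bytes each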

As I mentioned, I’m in a hurry and wanted to get this out before boarding. So you’ll need to wait until next week to see why I’m playing with this; hopefully the potential may excite you. Until then, I’ll try to refine the code a bit and get the entire solution posted online for you.

Until next time!

Introduction to Azure Storage Analytics (YOA Week 6)

Since my update on Storage Analytics last week was so short, I really wanted to dive back into it this week. And fortunately, it’s new enough that there was some new ground to tread here. Which is great, because I hate just putting up another blog post that doesn’t really add anything new.

Steve Marx posted his sample app last week and gave us a couple of nice methods for updating the storage analytics settings. The Azure Storage team did two solid updates on working with both the metrics and the logging. However, neither of them dove deep into working with the API, and I wanted more meat on how to do exactly this. By digging through Steve’s code and the MSDN documentation on the API, I can hopefully shed some additional light on the subject.

Storage Service Properties (aka enabling logging and metrics)

So the first step is turning this on. Well, actually it’s understanding what we’re turning on and why, but we’ll get to that in a few. Steve posted a sample ‘Save’ method on his blog. This is an implementation of the Azure Storage Analytics API’s “Set Storage Service Properties” call. However, the key to that method is an XML document that contains the analytics settings. It looks something like this:

<?xml version="1.0" encoding="utf-8"?>
<StorageServiceProperties>
  <Logging>
    <Version>version-number</Version>
    <Delete>true|false</Delete>
    <Read>true|false</Read>
    <Write>true|false</Write>
    <RetentionPolicy>
      <Enabled>true|false</Enabled>
      <Days>number-of-days</Days>
    </RetentionPolicy>
  </Logging>
  <Metrics>
    <Version>version-number</Version>
    <Enabled>true|false</Enabled>
    <IncludeAPIs>true|false</IncludeAPIs>
    <RetentionPolicy>
      <Enabled>true|false</Enabled>
      <Days>number-of-days</Days>
    </RetentionPolicy>
  </Metrics>
</StorageServiceProperties>

Cool stuff, but what does it mean? Well, fortunately, it’s all explained in the API documentation. Also fortunately, I won’t make you click a link to look at it. I’m nice that way.

Version – the service version / interface number to help with service versioning later on; just use “1.0” for now.

Logging->Read/Write/Delete – these nodes determine if we’re going to log reads, writes, or deletes. So you can get just the granularity of logging you want.

Metrics->Enabled – turn metrics capture on/off

Metrics->IncludeAPIs – set to true if you want to include capture of statistics for your API operations (like saving/updating analytics settings). At least I think that’s what it does; I’m still playing with/researching this one.

RetentionPolicy – use this to enable/disable a retention policy and set the number of days to retain information. Without setting a policy, data will be retained FOREVER, or at least until your 20 TB analytics limit is reached. So I recommend you set a policy and leave it on at all times. The maximum value you can set is 365. To learn more about retention policies, check out the MSDN article on them.

Setting Service Properties

Now, Steve did a slick little piece of code, but given that I’m not what I’d call “MVC fluent” (I’ve been spending too much time doing middle/back-end services, I guess), it took a bit of deciphering, at least for me, to figure out what was happening. And I’ve done low-level Azure Storage REST operations before. So I figured I’d take a few minutes to explain what was happening in his “Save” method.

First off, Steve setup the HTTP request we’re going to send to Azure Storage:

var creds = new StorageCredentialsAccountAndKey(Request.Cookies["AccountName"].Value, Request.Cookies["AccountKey"].Value);
var req = (HttpWebRequest)WebRequest.Create(string.Format("http://{0}.{1}.core.windows.net/?restype=service&comp=properties", creds.AccountName, service));
req.Method = "PUT";
req.Headers["x-ms-version"] = "2009-09-19";

 

So this code snags the Azure Storage account credentials from the cookies (where they were stored when you entered them). They are then used to generate an HttpWebRequest object using the account name and the service (blob/table/queue) whose settings we want to update. Lastly, we set the method and x-ms-version properties for the request. Note: the service was posted to this method by the JavaScript on Steve’s MVC-based page.

Next up, we need to digitally sign our request using the account credentials and the length of our analytics configuration XML document.

            req.ContentLength = Request.InputStream.Length;
            if (service == "table")
                creds.SignRequestLite(req);
            else
                creds.SignRequest(req);

Now what’s happening here is that our XML document came to this method via the JavaScript/AJAX post, in Request.InputStream. We sign the request using the StorageCredentialsAccountAndKey object we created earlier, using SignRequestLite for a call to the table service, or SignRequest for the blob or queue service.

Next up, we need to copy our XML configuration settings to our request object…

            using (var stream = req.GetRequestStream())
            {
                Request.InputStream.CopyTo(stream);
                stream.Close();
            }

 

This chunk of code uses GetRequestStream to get the stream we’ll copy our payload to, copies it over, then closes the stream so we’re ready to send the request.

            try
            {
                req.GetResponse();
                return new EmptyResult();
            }
            catch (WebException e)
            {
                Response.StatusCode = 500;
                Response.TrySkipIisCustomErrors = true;
                return Content(new StreamReader(e.Response.GetResponseStream()).ReadToEnd());
            }

It’s that first line that we care about: req.GetResponse will send our request to the Azure Storage service. The rest of this snippet is really just about exception handling and returning results back to the AJAX code.

Where to Next

I had hoped to have time this week to create a nice little wrapper around the XML payload so you could just have an analytics configuration object that you could hand a connection to and set properties on, but I ran out of time (again). I hope to get to it and actually put something out before we get the official update to the StorageClient library. Meanwhile, I think you can see how easy it is to generate your own REST requests to get (which we didn’t cover here) and set (which we did) the Azure Storage Analytics settings.
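As a starting point for that wrapper, the payload itself is easy enough to generate with LINQ to XML. Here’s a minimal sketch of a helper (my own, not part of Steve’s sample) that builds the document the ‘Save’ method posts:

using System;
using System.Xml.Linq;

// hypothetical helper: builds the analytics settings payload shown earlier
static string BuildAnalyticsSettings(bool read, bool write, bool delete,
                                     bool metricsEnabled, bool includeApis, int retentionDays)
{
    Func<bool, string> flag = b => b ? "true" : "false";

    var doc = new XDocument(
        new XDeclaration("1.0", "utf-8", null),
        new XElement("StorageServiceProperties",
            new XElement("Logging",
                new XElement("Version", "1.0"),
                new XElement("Delete", flag(delete)),
                new XElement("Read", flag(read)),
                new XElement("Write", flag(write)),
                new XElement("RetentionPolicy",
                    new XElement("Enabled", "true"),
                    new XElement("Days", retentionDays))),
            new XElement("Metrics",
                new XElement("Version", "1.0"),
                new XElement("Enabled", flag(metricsEnabled)),
                new XElement("IncludeAPIs", flag(includeApis)),
                new XElement("RetentionPolicy",
                    new XElement("Enabled", "true"),
                    new XElement("Days", retentionDays)))));

    // XDocument.ToString() omits the declaration, so prepend it ourselves
    return doc.Declaration + Environment.NewLine + doc.ToString();
}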

For more information, be sure to check out Steve Marx’s sample project and the MSDN Storage Analytics API documentation.
