Shared Access Signatures with Azure Service Bus

Sorry for the silence folks. Microsoft's fiscal year ended June 30th and I started on a new team on July 1st. While July and August are usually great periods for folks like me to get some extra blogging done, I've had a few distractions that kept me from writing. Namely learning, learning, learning and trying to find my "voice" on the new team.

Going forward, I’m going to start having more of a focus on this “internet of things” fad that everyone’s talking about. And within that, I’m going to be sticking fairly close to home and working on the services side of things. Even more tightly focused, I’m going to go deeper on “build” vs “buy” scenarios and focus a goodly amount of my available time on one of the Azure product collections I’ve often felt didn’t get enough respect… Service Bus.

So in the coming weeks expect to see blog posts on Event Hub, possibly Hybrid Connections, and for starters, setting a few things straight about Service Bus in general.

SAS vs. ACS

So my first starting point is to call out a 'stealth update' that happened recently. As of sometime on/after August 21st 2014, if you create a new Service Bus root namespace via the Azure Management Portal, it will no longer include the associated Access Control Service namespace. This change is in keeping with guidance the product team has been giving for some time: namely, to use SAS over ACS whenever possible.

Note: You can still generate the associated ACS namespace automatically by using the New-AzureSBNamespace PowerShell cmdlet. Just be aware that a new version of this will be coming soon that will match the default behavior of the management portal. When the new cmdlet is available, you will need to append the "-useAcs true" parameter to the command if you still want to create the ACS namespace.

There are a few reasons for this guidance. The first is that, according to the data the team had available to them, many folks doing ACS authentication were using the "owner" credential. This identity had FULL control over the namespace, so using it was akin to having an app use the SA (system administrator) identity for a SQL database. Secondly, ACS requires two calls for the first operation: one to get the ACS token, and one to perform the requested Service Bus operation. Now the token had a default time to live of 3 hours, but some SDKs didn't cache the token, and as a result all operations against Service Bus were generating two calls, which increases the latency of the operation significantly. As if that weren't enough, ACS only supports about 40 authentications per second. So if your SDK didn't cache the token, your possible throughput on a single queue goes from somewhere near 2,000 messages a second down to 40 at best.

Now ACS has some benefits to be sure. In general, folks are much more familiar with username/password models than shared access signatures. You could create identities for specific publishers/consumers (within reason), as well as scope those identities and their permissions to specific paths, allowing a single identity to send/receive from multiple queues for example. It also had the ACS management namespace with a GUI to help manage things. And to shut down access, all one has to do is revoke the identity and access is cut off. But many of these needs can also be met using Shared Access Signatures if one knows how. And that is what I'd like to start helping you with in this post. :)

Shared Access Policies/Rules & Connection Information

Ok, first issue… If you use the management portal, you'll see the ability to create/manage Shared Access Policies, but in the SDK and API, these are referred to as a SharedAccessAuthorizationRule. For the sake of simplicity, and consistency with Azure Storage naming, I'll refer to these from now on simply as "policies".

In Service Bus terms, a policy (aka SharedAccessAuthorizationRule) is a named set of permissions associated with an entity. The entity can be the Service Bus’ root namespace (the name you gave it when it was created), a queue, a topic, or an event hub. Basically an addressable endpoint that has a name assigned to it. For each entity you can have up to twelve policies and each policy is allowed a mix of the same three permissions: manage, send, and listen. Each policy also has two access keys much like Azure Storage and for the same reason. So you can do key swaps periodically and ensure you always have at least one active, valid key available to your applications when an old one is regenerated.
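For illustration, here's a minimal sketch of creating a queue with its own send-only policy using the NamespaceManager from the Microsoft.ServiceBus NuGet package. The queue name "vendor-queue2" and policy name "vendor-send" are just placeholders, and the connection string used here needs Manage rights.

using System.Collections.Generic;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class PolicySample
{
    static void CreateQueueWithSendPolicy(string manageConnectionString)
    {
        // NamespaceManager operations require a connection string with Manage rights
        var namespaceManager = NamespaceManager.CreateFromConnectionString(manageConnectionString);

        var queueDescription = new QueueDescription("vendor-queue2"); // placeholder queue name

        // attach a policy (SharedAccessAuthorizationRule) that can only Send
        queueDescription.Authorization.Add(new SharedAccessAuthorizationRule(
            "vendor-send",                                     // policy/rule name (placeholder)
            SharedAccessAuthorizationRule.GenerateRandomKey(), // primary key
            new List<AccessRights> { AccessRights.Send }));

        namespaceManager.CreateQueue(queueDescription);
    }
}

The same Authorization collection approach works for TopicDescription and EventHubDescription, which is how you stay away from the root "manage everything" policy in application code.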

It’s these policies that are the “Connection Information” you access via the portal and see available as SAS connection strings. And it’s the connection strings that lead me to a bit of an issue I have with how many service bus demos are done.

Service Bus Clients

When you create your first service bus project using the .NET SDK and one of the tutorials, you’ll likely be asked at some point to add code that looks like the following:

// create a QueueClient for the queue
QueueClient client = QueueClient.Create("vendor-queue2");

// create the message
BrokeredMessage message = new BrokeredMessage("hello world!");

// send the message
client.Send(message);

Notice that the sample specifies a queue name that we want a client for, but no credentials. That’s because within the SDK, this method is overloaded to look for an application configuration setting by the name of “Microsoft.ServiceBus.ConnectionString”. The value of this string is the SAS connection string you can get from the portal. It gives the application access to the entity until such time as the policy/rule is removed. In other words, you can re-write this code to look like this:

// create a QueueClient using an explicit connection string
QueueClient client = QueueClient.CreateFromConnectionString("Endpoint=sb://<namespace>/;SharedAccessKeyName=<SharedAccessRuleName>;SharedAccessKey=<RuleKey>", "vendor-queue2");

// create the message
BrokeredMessage message = new BrokeredMessage("hello world!");

// send the message
client.Send(message);

By using CreateFromConnectionString in place of the simple Create, we can specify our own connection string. But again, this is permanent access until the policy/rule is removed. It also highlights the issue I mentioned earlier with the way the available samples/demos work. I bemoaned the use of the "owner" credential when doing ACS. Let's look at the default connection string that the Service Bus NuGet package puts into the application configuration file:

    <add key="Microsoft.ServiceBus.ConnectionString" value="Endpoint=sb://[your namespace].servicebus.windows.net;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=[your secret]" />

This sample refers to a policy/rule named “RootManageSharedAccessKey”. When you create a new namespace, this default policy has been created for you automatically with permission to listen, send, AND MANAGE the entire namespace. Please! For the love of all that is digital, NEVER… EVER use this credential for anything other than an application that needs to manage all aspects of a given namespace! Instead, configure your own policies with just the permissions that are needed. If you put the default policy into all your apps, we’re right back to the “owner” credential situation we had with ACS.

There’s another issue with this approach. Rules/policies must be associated with a specific service bus entity. This is where SAS comes into play.

Our First SAS

Entities in Service Bus are accessible by a path that lies directly under the root namespace. Together, the namespace and the entity name can be represented as a single path, like so…

brentsample-ns.servicebus.windows.net/my-queue
brentsample-ns.servicebus.windows.net/my-eventhub
brentsample-ns.servicebus.windows.net/your-topic

Now I could create a policy at the root namespace that has “send” permission. But using it as a connection string would give the sender access to send to everything in the namespace. Alternatively, I could create individual policies/rules on each of these entities. But then I need to juggle all those different connection strings.

If we opt to use a SAS, we have a simpler way to help restrict access, but also make management a touch easier by creating signatures that allow access to a partial path, namely something like entities that begin with "my-". Unfortunately, the management portal does not yet provide the ability to create a SAS. So we either need to write a bit of .NET code, or leverage 3rd party utilities. Fortunately the code is pretty simple. Using Visual Studio, you can create a new Console Application and then add the NuGet package for Azure Service Bus. Then all that remains is to populate some variables and use these two lines of code to generate our signature.

var serviceUri = ServiceBusEnvironment.CreateServiceUri("https", sbNamespace, sbPath).ToString().Trim('/');
string generatedSaS = SharedAccessSignatureTokenProvider.GetSharedAccessSignature(sbPolicy, sbKey, serviceUri, expiry);

The important variables in here are:

  • sbNamespace – the name of the Service Bus namespace. Don't include the ".servicebus.windows.net" part, just the name we gave it when it was created.
  • sbPath – the name, or partial name, of the entities we want to access. For this example, let's pretend it's "my-".
  • sbPolicy – the rule/policy that has the permissions we want the signature to include.
  • sbKey – one of the two secret keys of the rule/policy we're referencing.
  • expiry – a date/time of when the signature should expire.
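Putting those variables together, a complete sketch might look like the following. The namespace, policy, and key values are placeholders, and I'm using the TimeSpan-based overload of GetSharedAccessSignature, so the expiry here is expressed as a time-to-live rather than an absolute date.

using System;
using Microsoft.ServiceBus;

class SasSample
{
    static string CreateSignature()
    {
        string sbNamespace = "brentsample-ns";   // namespace name only, no ".servicebus.windows.net"
        string sbPath = "my-";                   // partial path the signature should cover
        string sbPolicy = "Publish";             // rule/policy whose permissions we're borrowing
        string sbKey = "<policy key goes here>"; // one of the policy's two keys
        TimeSpan expiry = TimeSpan.FromDays(30); // how long the signature stays valid

        // build the resource URI and trim the trailing slash, as in the post
        var serviceUri = ServiceBusEnvironment.CreateServiceUri("https", sbNamespace, sbPath)
                             .ToString().Trim('/');

        // generate the shared access signature
        return SharedAccessSignatureTokenProvider.GetSharedAccessSignature(
            sbPolicy, sbKey, serviceUri, expiry);
    }
}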

If we fill these in, we get a signature that looks something like:

SharedAccessSignature sr=https%3a%2f%2fbmssample-ns.servicebus.windows.net%2fmy-&sig=B9cy8OEuxum2UN5VjsC4JPhbVU7jwJi%2bq20qiaXk24s%3d&se=64953912194&skn=Publish

Now that we have this signature, we want to be able to use it to interact with one of our entities. There’s no “CreateFromSAS” option, but fortunately in the .NET SDK we can use this signature together with a MessagingFactory to create our entity client.

MessagingFactorySettings mfSettings = new MessagingFactorySettings();
mfSettings.TransportType = TransportType.NetMessaging;
mfSettings.TokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider("<signature we just created>");
MessagingFactory mf = MessagingFactory.Create("sb://thatconference-ns.servicebus.windows.net", mfSettings);

// Create Client
QueueClient client = mf.CreateQueueClient(queueName);

And in this case, the same signature will work for the queue 'my-queue' or the event hub 'my-eventhub' (although for the latter the TransportType needs to be Amqp). We can even take this same signature and put it into JavaScript for use, perhaps, in a Node.js application…

var xmlHttpRequest = new XMLHttpRequest();
xmlHttpRequest.open('POST', 'https://thatconference-ns.servicebus.windows.net/my-eventhub/publishers/vendorA-DeviceIDnnn/messages', true);
xmlHttpRequest.setRequestHeader('Content-Type', 'application/atom+xml;type=entry;charset=utf-8');
xmlHttpRequest.setRequestHeader('Authorization', '<SharedAccessSignature>');

In the case of Event Hub, where we'll have various publishers, we can do exactly the same thing, but using a rule/policy from the event hub to generate a signature that goes deeper, like "my-eventhub/publishers/vendorA-".

Policy Expiry and Renewal

So here's the $50,000 question. With ACS I could (at least to a certain scale) have a single identity for each client and remove it at any time. With SAS, I can remove a signature by removing its underlying policy at any time. But since I'm limited to twelve policies, how do I recreate the ability to revoke on demand? Simply put, you don't.

Now before you throw your hands up and state with exasperation that this approach "simply won't work", I do ask you to take a long hard look at this requirement. In my experience, I find that the vast majority of the time, allowing someone to publish to an entity is a matter of trust. You've entered into an agreement with them and aren't likely to revoke that access on a whim. This trust is rarely volatile or short term in nature. If it's a customer, they are usually engaged for the term of their subscription (monthly or annual are common). So you know when your trust will expire and need to be renewed.

In this situation, we are planning for an eventuality that rarely comes to pass, and one that has an alternative requiring just a small amount of work: implementing your own "credential broker".

If we look back at what the ACS did, you present it with credentials, and it issued you a token. You then used that token to access the appropriate service bus resources. That’s exactly what our credential broker would do. Periodically, the requesting client would call to a service you are hosting and present some credentials (username/pass, certificate, PSK, etc…). Your broker service validates this request (and likely logs it), then issues back a SAS ‘token’ with an appropriate expiry. The client then (hopefully) caches the SAS ‘token’, and uses it on subsequent requests until it expires and then goes back to the broker to get a new SAS ‘token’.
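As a rough illustration of that broker, here's a minimal sketch: a single method that validates a caller-supplied credential (the client lookup, policy name, key, and resource URI are all hypothetical placeholders) and, if it checks out, hands back a short-lived SAS generated the same way as earlier.

using System;
using Microsoft.ServiceBus;

class SasBroker
{
    // hypothetical placeholders - swap in your own policy, key, and resource
    const string QueuePolicy = "Publish";
    const string QueuePolicyKey = "<policy key goes here>";
    const string ResourceUri = "https://brentsample-ns.servicebus.windows.net/my-queue";

    // the client presents its own credentials; the broker returns a SAS 'token' with a short expiry
    public string IssueToken(string clientId, string presharedKey)
    {
        if (!IsKnownClient(clientId, presharedKey))
            throw new UnauthorizedAccessException("Unknown client");

        // a real broker would also log the issuance for auditing

        // short-lived token; the client is expected to cache it and come back when it expires
        return SharedAccessSignatureTokenProvider.GetSharedAccessSignature(
            QueuePolicy, QueuePolicyKey, ResourceUri, TimeSpan.FromHours(1));
    }

    private bool IsKnownClient(string clientId, string presharedKey)
    {
        // stand-in for a real lookup (database, certificate check, etc...)
        return !string.IsNullOrEmpty(clientId) && !string.IsNullOrEmpty(presharedKey);
    }
}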

Now if this isn't enough, we still have the ability to remove/disable the queue (or event hub). So in a way we get the best of both worlds. We have automatic expiry of tokens to ensure "key rotation" while also having the ability to revoke access immediately.

This is just one possible pattern. So instead of offering up alternatives, I’d love to hear from any of you via the comments on the patterns you have used to help manage shared access signatures.

Expiration Reached

In the coming weeks/months I'm going to generate a series of posts that will dive into various Service Bus related topics more deeply. If there's something specific you'd like to see, please let me know. Otherwise you can look forward to posts on accessing Service Bus from other languages/SDKs, programmatic management of namespaces/rules, and resilient architectures around Service Bus.

I hope this article has helped clear up some of the fog around the Azure Service Bus. So until next time!

New Role: focus on the Enterprise

For the last 18+ months, I've been fortunate enough to be part of a great team of professionals that have focused on helping ISVs (Independent Software Vendors) successfully adopt the Azure platform. I've enjoyed being a part of this team more than I can express, but the time has come for me to return to the Enterprise focused market where I've spent the majority of my 20+ year career.

Effective July 1st, I am transitioning to a new team within my current organization (TED) that will focus on Enterprise Architecture. I remain focused on the Azure platform, but will be bringing what I've learned from working with ISVs as well as my past experiences with Enterprise customers to try and help grow the platform even further. This new team, led by Barry Briggs, will also allow me to join forces with great minds such as Josh Holmes and my old friend David Makogon.

I'm extremely excited about this move. While a significant number of the Fortune 500 companies out there have already adopted Azure, I'm convinced there is still an incredible amount of untapped potential. And I look forward to working with this new team to help unlock new stories and exciting new solutions.

Attempting to define “IOT”

NOTE: the following represents my own opinions and should NOT be considered an official viewpoint for any person or organization. These opinions are also still in flux, so if you ask me again in a month, they're likely to have changed.

Jason, a friend and colleague, recently posted an update on his blog titled "What is Internet of Things". In his post, he calls out a couple great potential scenarios, but I called him out for not really defining IOT. He attempted to counter with what is, in my opinion, a marketing blurb. I like and respect Jason, so much of this back and forth was good natured colleague rib-poking. But I realized I shouldn't be poking at his attempt without making my own.

How do you define the undefinable?

To start with, I want to call out that attempting to define "IOT" is like attempting to define "the cloud". Over time, "cloud" has settled on a definition that revolves around a collection of attributes: scalable, self-service, pay for what you use, and internet accessible. These still fluctuate greatly, and to some degree depend on the viewpoint of the consumer of "cloud" based services. But it gives us a starting point.

Using that as an example, I think we could define IOT solutions as also requiring a set of attributes:

Things – A 'thing' is a specialized, autonomous piece of technology capable of performing an action, but one that does not possess a traditional human user interface. A phone or computer is a device. A security hand scanner, motion sensor, or even the GPS sensor in a phone could be considered a "thing".

Data – Our things, being highly specialized, will have limited capabilities. But one of these will be the ability to report on the information they gather so something else can process the data, turning it into information. They *may* also be able to receive commands to alter their function (i.e. changing your sampling rate from 1s to 5s). In many cases, we can likely expect both.

Connectivity – How that data moves in and out of "the things" requires some type of connection. This could be a cellular network, location specific wi-fi, etc… In part, this refers to how the "things" interact or the data is gathered. The connection can be persistent (always on), or transient (on/off as required), but if you have to do stuff like plug a device into it or move data via USB or SD, we're lacking the "internet" part of IOT. A big discussion point here is whether the "thing" is directly connected, or requires assistance to connect (like the GPS sensors mentioned above).

Management – This attribute is how we track the "things". Are they curated and highly managed (likely in a factory scenario)? Or are they anonymous and unmanaged (open sourced climate telemetry gathering)? This also covers concerns like how we identify a device and separate it from "rogue" devices.

So with a set of initial attributes identified, the next step is to use them to define some scenarios.

A Factory Scenario

So Jason's post calls out a manufacturing scenario. I'm careful to say "a" scenario, because our definition above allows for a nearly infinite set of scenarios. But here's one possibility: a single factory that makes widgets. For this scenario, we can identify our IOT attributes.

The things: each piece of machinery has three sensors (power monitoring, pieces made, and operating temperature) and one actuator (power control).

Data: The sensors each capture a specific measurement every second and report it to a local controller every 5 seconds while the machine is running.

Connectivity: The sensors are wired to a "controller" that's mounted on the machine. This in turn is connected to the factory's private wi-fi network. While the controller can take action on the data, it mostly just displays it and hands it back up to a central service on the network.

Management: When a machine is installed in the factory, there’s a step where the machine (and its sensors) are connected to the network, and “registered” with the factory’s services. Additionally, since the sensors are hard-wired to the machine “controller”, it has information about the sensors it can in turn share up to the factory services.

So we have a basic scenario that has all four attributes, and meets our basic criteria. The factory is capable of determining when the data is trending downward (perhaps power is increasing while pieces being produced is being reduced). It can then take action like telling the machine to shut down because a technician is on the way. We can also collect and trend the data, mining it over time.

But IOT also creates some common challenges.

Ingestion of telemetry: If I only have 100 machines, this isn't a big deal. But what if I'm in a scenario where I have several thousand or hundreds of thousands? How do I scale my factory services to ingest that many connections and messages?

Device Management: Sensors fail and have to get replaced. As machines fail, they may be parted out. So a sensor that fails may be replaced with a sensor that used to be on a different machine. So I constantly need to be able to track the relationships.

Connectivity: What happens if the factory wi-fi goes down? Or worse yet a visitor to the factory taps into the network and starts sending rogue messages that alter my factory data? What if they send commands to the machines that cause them to overheat?

It’s these challenges that all the vendors are racing to solve with their various IOT Solutions!

What are the solutions?

I recently saw slides from a session given by Alessandro Bassi at the M2M+ Industry Summary in Milano, Italy. In this session, he calls out that innovation for its own sake will usually fail. It needs to be supported by a good business model. So the vendors in this space are all attempting to offer their business solutions.

In some cases, these solutions are highly targeted. The tech startup NEST is a good example of this. It has a thing, the thing gathers data, it's connected and transmits the data, and it's managed (you register your device). In this case, the vendor is trying to solve a specific problem, driving down home energy costs, and marketing this solution to consumers is their business model.

In other cases, the solutions are industry focused, with the vendor providing a collection of services that work together in a more flexible way to help drive larger, more strategic initiatives. In the link I just shared, it's a collection of solutions targeting hardware and software to help with factory scenarios like the one I listed above. This solution is positioned to organizations in that industry vertical, to address the requirements of that industry.

And lastly, we have services like the ones I've been working with lately; ISS, Service Bus, Project Orleans, and HDInsight. These are more building block oriented. They aren't specific to an industry or even a set of scenarios. They are meant to allow higher level, more encompassing solutions to be created. These get marketed either to software vendors looking to build out commercial or consumer solutions, or to organizations that need to build out customized solutions for internal use.

Each approach has pros and cons, but then they are also targeting different business models. So it's about picking the approach that best addresses your needs. Meanwhile, the vendors are all looking to capitalize on the market and find a way to sell their solution to solve the "IOT problem".

Summary – have we accomplished anything?

This was the first time I’ve really put any of these thoughts down. Looking back over the typos and grammar errors, I have to ask if I’ve accomplished anything. I did set down a rough definition for what I think “IOT” is. This also allows me to call out some of the common challenges to related scenarios and ultimately even call out the types of solutions the industry is offering. So I guess you could say I have defined IOT.

But I can't help but think the practical reality is more difficult. The definition of IOT will continue to evolve and there will always be shades of gray. So it's important to keep in mind that it means different things to different people. With that in mind, I think I'll stick to my "remain calm and ask questions" approach, and when someone comes to me with "an IOT scenario", my first response will always be "so tell me about it". The scenario and its challenges will always trump an arbitrary terminology definition. And in the end, it's the solution and the business value it brings that really matters more than the industry buzzword. Isn't it?

Until next time!

Thinking about Effective Communication

Some time back, I drove to visit my parents; 6 hours alone in the car with only my thoughts. Admittedly, this trip wasn’t recreational so I was pretty introspective and the only distraction I had was random radio stations. At one point during the trip, I happened across a National Public Radio broadcast that focused on communication, “Spoken and Unspoken: TED Radio Hour”.

As I listened to the show, it struck me how communication is key to nearly everything we do. It could be communicating concepts to customers, sharing ideas with colleagues, sharing status with managers and account team associates. But at the crux of it all is how we express information and how it is received.

This program clued me in that texting is transforming our spoken communication. That if you’re talking with folks who have a different first language, some concepts (subjunctive?) simply won’t translate. That history can change what words mean. And lastly the importance of body language not just to my audience, but perhaps to the way we look at ourselves.

What we need to remember is how the way we communicate impacts the impression of the audience. If I smile at a person, it means I'm happy. If I smile at a dog, I'm aggressively baring my teeth! By understanding the audience I'm addressing, I can take the appropriate action to really help get my message across. This can extend to understanding how the English language is both globally unifying and segregating at the same time. By understanding the way communication is perceived by the audience (do I shake hands, or bow), I can make sure I'm communicating the correct message.

Growing up in a rural community in the central United States, the most exotic thing I was exposed to was tacos. So I really only knew the way Americans approached communication. As my career advanced, I started leading teams and eventually found myself in the position of having several team members from India. For a Midwestern farm boy, this was a culture shock. But fortunately I was given a tip… Always start a conversation with a personal greeting or question, never jump straight to business. This ran counter to the way I was always taught to “get to the point”, but I quickly found that my team members from India responded much more favorably when I opened discussions this way. And if you’ve IM’d with me, or even looked closely at my writing, you see this simple advice taken to heart.

The next time you prepare to address an audience, write an email, record a podcast, or author an article, always think about your audience. Know who they are, and try your best to communicate in a manner they will be most receptive to. But most importantly, realize that as much comes from what you say as from what you do.

The entire show is broken up into 5 different recordings, each of which is only 5-10 minutes in length. I encourage you to give them a listen and as you do, think about what they mean to you. Try to incorporate some aspect of them into yourself so that you can be a more effective communicator.

PS – Many thanks to Jeremiah Talkar for helping me proof/edit this editorial.

Azure File Depot – The BlobWatcher

Recently I was taking a look at WebJobs, the new feature added to Windows Azure Web Sites that lets you run applications continuously, at intervals, or triggered by certain events (such as a new object in Azure storage). One of the questions that popped into my head was, how does the “binding to blobs” work? If I could find an answer to that, perhaps I could add that as a feature to the File Depot project.

After poking around a bit, I found that there’s not really anything magical to what WebJobs was doing. They are leveraging the information that could be available to you or me. In the case of Azure Storage blobs, when you create a blob binding for the job, WebJobs is reaching into the storage account in question and turning on the write logs for blobs in that account and giving those logs a 7 day retention period. It’s these logs that are scanned/monitored by WebJobs so that when new blobs arrive, it can trigger your job based on the bindings you’ve set up.

FileDepotBlobWatcher-EnableLogs

 

So how are the logs scanned? Fortunately, Mike Stall has already published a great little write-up. In a nutshell, when a job is started, it does a full scan of the logs for all past data, then does incremental scans for new files. So armed with this information, I set out to create my own implementation of a blob detector, the Azure File Depot Blob Watcher!

Azure Storage Logs

Armed with the info from Mike’s post, the first step is to dig into the storage logs and figure out how they work. The storage team has a great post on using the logs and I recommend you take the time to give it a complete read. But here are the highlights…

When you enable logging, a new "$logs" container will be created. The blobs placed into this container are read only; you can read and delete them, but not alter them or their properties. The logs are buffered up internally and periodically 'flushed' into this container as individual blobs.

In Mike's post, he mentions that there is latency (5-10 minutes) in detecting blobs, and this is because Azure storage buffers the logs for up to 5 minutes or until the buffer hits 4MB in size. At that time, they are written out, and we are able to access them. Thus the latency.

Log files are only written when there are operations we’ve indicated we want to log. But the naming convention always follows the pattern: <service>/YYYY/MM/DD/HHmm/<sequence>.log

So we’ve already identified a couple of requirements for our solution…

  • Don’t scan for new log files more than every 5 minutes
  • Get a list of logs from the $logs container that start with “blob/”
  • Don’t reprocess log files we’ve already examined

Once we have the files, we then have to parse them. I wrote a post last fall that describes using Excel to parse the semi-colon delimited log entries. We're going to need to do that in code, but fortunately it's not that difficult. The logs are semi-colon delimited and use double-quotes to denote strings that include semi-colons that we won't want to split/explode on. You could do this using a regular expression, but my own regex skills are so rusty that I opted to just parse the file via a bit of C# code.

int endDelim = 0;
int currentPos = 0;
while(currentPos <= logentry.Length-1)
{
 
    // if a quoted string... 
    if (logentry.Substring(currentPos,1).StartsWith("\""))
    {
        currentPos++; // skip opening quote
        endDelim = logentry.IndexOf("\";", currentPos);
        if (endDelim == -1) // if no delim, jump to end of string
            endDelim = logentry.Length - 1;
        properties.Add(logentry.Substring(currentPos, endDelim - currentPos));
        // skip ending quote and semicolon
        endDelim = endDelim + 2;
    }
    else // not quoted string
    {
        endDelim = logentry.IndexOf(';', currentPos);
        if (endDelim == -1) // if no delim, jump to end of string
            endDelim = logentry.Length - 1;
        properties.Add(logentry.Substring(currentPos, endDelim - currentPos));
        endDelim++;
    }
 
    currentPos = endDelim; // advance position
}

Not as elegant as a regex, I fully admit. But with my unpracticed skills (it's been 10+ years since I had my fingers deep in that), it would have taken me 2-3 times longer to get that working than just brute forcing it.
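If you'd rather go the regular expression route, something like the following should be roughly equivalent: it splits on semicolons that sit outside of double-quoted values, then strips the surrounding quotes. Consider it a sketch; I haven't battle-tested it against every log entry shape.

using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class LogEntrySplitter
{
    // split on semicolons that are followed by an even number of double quotes
    // (i.e. semicolons sitting outside a quoted value)
    static readonly Regex FieldSplit = new Regex(";(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    public static List<string> Split(string logentry)
    {
        return FieldSplit.Split(logentry)
            .Select(field => field.Trim('"')) // drop the surrounding quotes on quoted values
            .ToList();
    }
}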

The final step is knowing what we want out of the logs. There are two key values from the log that I'm after: the OperationType and the RequestURI. The RequestURI is self-explanatory enough; that's the URI of the blob that we're trying to detect. The OperationType is the action that was performed against Azure storage. There are only two values we're going to monitor for: PutBlob and PutBlockList.

Now here is a bit of an issue. A small enough blob can be created, or UPDATED, using just the PutBlob call. So if we detect that operation, there is a chance that we may process the same file multiple times. We could resolve this by using a "receipt" pattern as is called out in the comments section of Mike's post, or we could keep a list of processed blobs (perhaps in table storage). The approach really depends on your needs, so I'm going to leave it out of this implementation for now.

NOTE: It should also be noted that since we're only looking for PutBlob or PutBlockList operations, we're not going to be able to detect page blobs and will catch (via PutBlob) updates to smaller page blobs. Fixing this is definitely on my list, but will need to wait for another day.

The solution

Now that we know how to get at the log information, it's time to start creating a solution. The first decision I made was to separate detecting new log files from parsing them. So we'll have a LogScanner and a LogParser. I also wanted to make parsing the log entries super easy, so I decided to create a LogEntry class that I can feed a log entry string into and have it expose the values as properties.

But I still have two issues… It's likely, especially under high volumes, that parsing the logs will take much longer than detecting them. Under most circumstances I can get by with a single LogScanner, so I'm going to implement a "traffic cop" or "gatekeeper" pattern so that only one LogScanner can run at a time.

My second issue is how to ensure I only alert to a new log file once. I'll be running scans every 5 minutes or so, and listing blobs doesn't really have an option for "give me only the new ones". Fortunately, since I'm already using a gatekeeper, I can have it store the name of the last log file I processed, making it pretty simple to keep track.

The final step of course is having both the LogScanner and LogParser use delegates so whoever is implementing them can create a method to handle when a log file is detected, or a new blob is found, thus allowing them to control what actions are taken.

I’ll wrap the whole thing up in a reference implementation via a console app. So the final solution looks like this:

FileDepotBlobWatcher-SolutionLayout

The BlobLogEntry class exposes the individual fields of the blob log entry (see the parsing code above or Codeplex for all this really does), the Gatekeeper makes sure only one LogScanner is trying to detect new log entries, the LogScanner detects new log files, and the LogParser parses a log once it's been found.

Gatekeeper

I’ve blogged about the gatekeeper pattern before. I’ve known this as a “traffic cop” since long before folks started publishing design patterns on the internet, so to me that’s what it will always be. Regardless of the name, the purpose is to make sure only one process can do something at a time. We’re going to accomplish this by using a lease on an Azure storage blob as our control switch.

The Gatekeeper object needs to be able to start, stop, and renew the underlying blob lease. And because I’m also going to use it to store the last log file processed, I’m going to add SetText and GetText methods to write and retrieve strings to the underlying blob.

This class is fairly simple, so I'm not going to detail code you can look at yourself on Codeplex. Instead I'll just call out a few highlights…

My gatekeeper constructor accepts a CloudBlockBlob for the blob on which we’ll place a lease. This gives the calling process full control over where that blob lives. It then creates a lease on the blob good for up to 60 seconds (the maximum allowed value), and attempts to renew that lease every 45 seconds. This gives me 15 seconds in the case of transient failures to successfully complete getting the lease before I run the risk of another scanner taking over.

In a couple places, we trap for a StorageException that has a 409 error code. This indicates that our attempt to get the lease has failed because somebody else already has a lease on the blob in question (aka another scanner has taken over).
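To make that concrete, here's a stripped-down sketch of the lease handling (not the exact Codeplex code): acquire a 60 second lease, renew it on a timer, and treat a 409 as "someone else owns it".

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class LeaseSketch
{
    private readonly CloudBlockBlob leaseBlob;
    private string leaseId;

    public LeaseSketch(CloudBlockBlob blob)
    {
        leaseBlob = blob; // the blob must already exist before a lease can be taken
    }

    public bool TryAcquire()
    {
        try
        {
            // 60 seconds is the maximum finite lease duration
            leaseId = leaseBlob.AcquireLease(TimeSpan.FromSeconds(60), null);
            return true;
        }
        catch (StorageException ex)
        {
            // 409 Conflict: another scanner already holds the lease
            if (ex.RequestInformation != null && ex.RequestInformation.HttpStatusCode == 409)
                return false;
            throw;
        }
    }

    // call this from a timer roughly every 45 seconds
    public void Renew()
    {
        leaseBlob.RenewLease(AccessCondition.GenerateLeaseCondition(leaseId));
    }
}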

Implementing the Gatekeeper is simply a matter of creating the CloudBlockBlob object, handing it off to the class constructor, and then calling start when we want to gain control. We can check periodically to see if we have the lease, optionally getting it if we don’t.

The final bit is to make sure the starting and stopping of a timer to renew the lease is put into the appropriate spots.

Take a look at the gatekeeper code and if you have questions, please feel free to post them in the comments.

LogParser

The parser is also pretty straightforward. It takes a CloudBlockBlob object (which would be a log file) as a parameter for its constructor, then we call the ParseFile method to inspect the log file.

public void ParseFile(FoundBlobDelegate callback)
{
    using (Stream stream = logFile.OpenRead())
    {
        // read the log file
        using (StreamReader reader = new StreamReader(stream))
        {
            string logEntry;
            while ((logEntry = reader.ReadLine()) != null)
            {
                // parse the log entry
                BlobLogEntry blobLog = new BlobLogEntry(logEntry);
 
                //NOTE: PutBlockList is the final write for a large block blob
                // PutBlob can also be used for small enough blobs, but also presents an overwrite of an existing one
                if (blobLog.OperationType.Equals("PutBlob") || blobLog.OperationType.Equals("PutBlockList"))
                    callback(blobLog.RequestUrl);
            }
        }
    }
}

This method opens a stream on the blob, and then reads through it line by line. Each line is parsed using the BlobLogEntry object, and if the OperationType is "PutBlob" or "PutBlockList", the callback is invoked with the blob's request URL.

Now I could have put this method into the LogScanner, but as I pointed out earlier, it's highly likely it will take longer to parse the logs than to detect them. So in a real world implementation, the LogScanner may simply notify a pool of parsers, possibly via a queue. So separating the implementations made a certain amount of sense, especially when I look ahead to having to deal with larger page blobs.

LogScanner

This is where most of my time on the project was spent. It has a few parallels with the LogParser in that we have a constructor that accepts some parameters (a CloudBlobClient and an instance of the Gatekeeper class), as well as Start and Stop methods.

Internally, the LogScanner object will be using the CloudBlobClient to create a CloudBlobContainer object that’s looking at the “$logs” container. We then use the gatekeeper to make sure that if I have multiple processes running log scans, only one of them can actually do the processing. Finally, it uses an internal timer object to make sure we’re scanning for new log files at a regular interval (which defaults to 5 minutes).

When we call the Start method, the LogScanner takes a delegate that the calling process can use to determine what action should be taken when a new log file is detected (such as using the LogParser to digest it). It then starts the gatekeeper process, and attempts to do an initial scan for logs (like Mike’s post said WebJobs does). Once that scan is complete, it will start the timer so we can do additional scans at the specified interval.

The Stop method just reverses these actions, stopping the scan timer and the gatekeeper. The real meat of this class is what happens when we scan for log files. So let's walk through this a bit before I show you the code.

The first thing we need to be able to do is get a list of blobs in the $logs container. We have two scenarios we have to support with this: get everything (for an initial scan), and get just the new stuff for incremental scans. The challenge is that Azure storage only supports listing blobs filtered on the name, not on any metadata or properties. The initial scan is fairly simple; we set our filter criteria to "blob/", which will get all blob service logs in the container.

So let's say we've already done a scan and we stored the last log file we found in our Gatekeeper, so I know where I left off. But how do I pick back up again? I could just filter for all logs and iterate through until we get back to where we left off. Perfectly OK, but it doesn't strike me as particularly efficient. If we think back to how the logs are named, I can parse the last log I found to get the year, month, day, and hour for which that log was produced. So when I pick back up on scanning, I scan for that hour and all the hours between then and UTC now.

Note: You could alternatively scan by day, month, or year. Depending on the frequency of your scans and the production of logs, these options could be more efficient than my hourly approach.

We start by extracting the datetime values from the last log file name (uri in the sample below) we read from our gatekeeper…

int startPOS = uri.IndexOf("blob/") + 5;
int endPOS = uri.LastIndexOf('/');
 
return uri.Substring(startPOS, endPOS - startPOS);

We know all the URIs will have a “blob/” at the beginning since that’s the service we’re monitoring. Furthermore, the file names all end in a six digit sequence number with a ‘.log’ suffix. So if I find the position of the last ‘/’ character in the string, I can now extract the YYYY/MM/DD/HHmm portions from the URI. We can make all these assumptions because the log naming conventions are published and therefore somewhat immutable.

Note: Currently, the mm portion of log URI will always be zero per the published naming convention. This is a key assumption for our processing.

Next, we need to convert this substring to a datetime type

DateTime tmpDT;
// convert prefix to datetime
DateTime.TryParseExact(fileprefix, "yyyy/MM/dd/HHmm", null,
                       DateTimeStyles.None, out tmpDT);
return tmpDT;

This takes our URI substring, and converts it into a DateTime, leaving us to simply calculate the delta between the current UTC datetime and this value to know how many hour periods we need to filter for.

ScanPasses = ((DateTime.UtcNow - PrefixToDateTime(startingPrefix)).TotalHours + 1);

So now we know that we will do one filtered list for each hour from the last hour we found a file to the current datetime. Ideally, this could be optimized so that the gatekeeper stores the last scanned period so we don't have to scan past hours for which there was no traffic. But my assumption is that if we're scanning the logs, we expect traffic at fairly regular intervals, so repetitive scans of empty "hours" shouldn't happen often. And when you add up the cost of those scans versus the programmer time to optimize things, I could scan a few eons of empty logs before the cost would match the programmer cost to fine tune this.

Now that we're armed with what we need to do the scans of the logs, let's look at some of the code…

// get last log file value from gatekeeper
string lastLog = gatekeeper.GetText();
 
// calculate starting prefix
if (!lastLog.Equals("blob/")) // we had a "last log" from previous runs
{
    startingPrefix = getPrefixFromURI(lastLog); // use that prefix as our starting point
    ScanPasses = ((DateTime.UtcNow - PrefixToDateTime(startingPrefix)).TotalHours + 1);
    pastPreviousLog = false; // don't start raising "found log" events until we're past the last processed log
}

We start by getting the last log file we found from the gatekeeper. If that value is not "blob/", then we're doing a subsequent scan. We'll get the date/time prefix from the log URI, and use that to calculate the number of scans we need to do. We also set a value that tells us we haven't yet passed our previously found log file. We need this last part because subsequent scans will always resume in the same hour of the last log file we processed, and it's possible that new log files have arrived.

Next we will enter a loop that will execute once for each scan pass we calculated we need. If it's a first time scan, we'll only do one pass because our blob list filter will cover all available logs.

// List the blobs using the prefix
IEnumerable<IListBlobItem> blobs = 
    logContainer.ListBlobs(string.Format("blob/{0}", startingPrefix), true, BlobListingDetails.Metadata, null);
 
// iterate the list of log files
foreach (IListBlobItem item in blobs)
{
    CloudBlockBlob log = item as CloudBlockBlob;
    if (log != null)
    {
        string LogURI = log.Uri.ToString();
        if (pastPreviousLog)
        {
            // call Delegate to act on log file
            this.callback(log);
 
            // update gatekeeper blob 
            gatekeeper.SetText(LogURI);
            lastLog = LogURI;
        }
        if (lastLog.Equals(LogURI, StringComparison.OrdinalIgnoreCase))
            pastPreviousLog = true;
    }
}

For each log file, we look at the URI. If we're past the last log file (as recorded by the gatekeeper), we call the callback method handed into our object, alerting the calling process that a new log file has been found. We then ask the gatekeeper to save that URI as our new starting point for the next scan. Lastly, in case we had a previous log file recorded, we check to see if we're at it, so we can process the additional logs that follow.

And as we exit the log listing loop, we increment our filter criteria (so we can scan the next available hour), and decrement the scanpasses value so we know how many scans remain.
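The increment itself is just date arithmetic on the prefix. A self-contained sketch of what that might look like (the method name is mine, not necessarily the project's):

using System;
using System.Globalization;

static class PrefixMath
{
    // given the current hour prefix (e.g. "2014/08/22/1400"), produce the next hour's prefix
    public static string NextHourPrefix(string currentPrefix)
    {
        DateTime current;
        DateTime.TryParseExact(currentPrefix, "yyyy/MM/dd/HHmm", null, DateTimeStyles.None, out current);

        // mm is always "00" per the published log naming convention
        return current.AddHours(1).ToString("yyyy/MM/dd/HH") + "00";
    }
}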

On either side of this, we also enable and disable the timer object. The only purpose of this is that on the off chance it takes us more than 5 minutes to scan the logs, we don’t double up on the scan operations.

Running the Sample

Hopefully you'll find this solution pretty straightforward. With the classes in place, all that remains is to implement them, in this case as a sample console application.

LogScanner and LogParser need some delegate methods. For LogScanner, we’ll use this …

public static void LogFound(CloudBlockBlob LogBlob)
{
    Console.WriteLine(string.Format("Parsing Log File: {0}", LogBlob.Uri));
 
    // Parse the Log
    LogParser myParser = new LogParser(LogBlob);
    //HINT: we could drop the log file into a queue and process asynchronously
    myParser.ParseFile(FoundBlob);
 
    //Option: delete the log once it's processed
}

When the LogScanner finds a new log file, it will call this delegate. For my sample I've chosen to write the event to the console output and immediately parse the file via the LogParser. Just keep in mind that the current implementation is a synchronous, blocking call, so in a real production situation you likely won't want to do this. Instead, consider writing the event to a queue, where subscribers can then pick it up and process it.
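If you went the queue route, the delegate might do nothing more than drop the log's URI onto an Azure storage queue for a pool of workers to pick up. Here's a minimal sketch of that hand-off; the queue name "logstoparse" is just a placeholder.

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.Queue;

class QueuedLogHandoff
{
    private readonly CloudQueue logQueue;

    public QueuedLogHandoff(CloudStorageAccount storageAccount)
    {
        // "logstoparse" is a placeholder queue name
        logQueue = storageAccount.CreateCloudQueueClient().GetQueueReference("logstoparse");
        logQueue.CreateIfNotExists();
    }

    // alternate LogFound delegate: enqueue the log's URI instead of parsing it inline
    public void LogFound(CloudBlockBlob logBlob)
    {
        logQueue.AddMessage(new CloudQueueMessage(logBlob.Uri.ToString()));
    }
}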

We follow this up with a delegate for LogParser that will be called as we parse the log files that were found, and locate what we believe to be a blob.

public static void FoundBlob(string newBlobUri)
{
    // filter however you like, by container, file name, etc... 
    if (!newBlobUri.Contains("gatekeeper")) // ignore gatekeeper updates
        Console.WriteLine(string.Format("Found new blob: {0}", newBlobUri));
}

You’ll notice that in this method, I’m doing a wee bit of filtering based on the BlobURI. In a real implementation, you may only want to watch a handful of containers. In my sample implementation, the blob object that’s at the heart of the gatekeeper object will have the name “gatekeeper”, so I went for the simple approach to make sure I ignore any operations related to it. I thought about putting filter criteria (such as container) as an attribute of the LogParser, but ultimately settled on this approach as being far more flexible.

The final step was to go into the console app and set things in motion…

// set up our private variables.
string storageAccountString = Properties.Settings.Default.AzureStorageConnection;
 
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(storageAccountString);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

We start by retrieving the Azure Storage Account’s connection string, and using that string to get a CloudStorageAccount object, with which we create a CloudBlobClient.

// make sure we have a gatekeeper in place
CloudBlobContainer gatekeeperContainer = blobClient.GetContainerReference("gatekeeper");
gatekeeperContainer.CreateIfNotExists(); // want to make sure the container is there... 
CloudBlockBlob gatekeeperBlob = gatekeeperContainer.GetBlockBlobReference("gatekeeper");
Gatekeeper mygatekeeper = new Gatekeeper(gatekeeperBlob, "blob/");

Using the CloudBlobClient, we create a container where our gatekeeper blob will go, then get a CloudBlockBlob that will be the gatekeeper blob (the blob we’ll put leases on). Finally, using that blob, we create the gatekeeper object which also initializes the contents.

Next, we initialize the LogScanner and tell it to start processing, calling the delegate we already defined.

LogScanner myScanner = new LogScanner(blobClient, mygatekeeper);
myScanner.ScanInterval = new TimeSpan(0, 5, 0);
myScanner.Start(LogFound);

After that, all that remains is to give myself a simple loop to run in while the LogScanner and LogParser do their work. I’ve put in one that will run for up to an hour. After the loop exits, it will stop the scanner, which will release the lease on the blob. If you stop the console app forcibly, just be aware that the gatekeeper lease will persist for up to 1 minute. So your initial scan upon launching the program likely won’t have any results unless you wait at least 1 minute before restarting.

With the sample program complete, all that remains is to set the Azure Storage Account Connection string in the program’s application settings (using a storage account that has the Blob write logging enabled), then compile and run the solution. As it runs, you can upload blobs into it (perhaps using the Publishing Console project also located in the FileDepot Codeplex project), and within 5-10 minutes, you should start seeing files show up in the BlobWatcher console app.

Magic, no longer

So with this, I hope I've shed a bit of light on the Azure Storage logs and how they can be used. As I look back on creating this sample, I find that I almost spent more time digging into how storage logs work than I spent actually working on the code. The final product could use some fine tuning, as well as enhancement for page blob scenarios. But as a starting point, I'm fairly happy with it.

Admittedly, if all you really want to do is monitor for new blobs and act on them, your best approach is to use Azure WebJobs. That team has far more time and resources than I do, and as such, they can give you a solution that will be far more robust than my simple code sample. But replacing WebJobs was never my objective; I just wanted to help highlight how Azure storage logging can be used to do more than just track errors and capacity utilization.

Please do check out BlobWatcher at the Azure File Depot on Codeplex. And more importantly, leave feedback either here or there. I want to make sure the project is fulfilling some common needs and to that end, one can never have enough feedback.

Until next time!

 

Azure Files – Share Management

Note: If you are going to be using Azure Files from the same VM regularly, be sure to follow the instructions in this blog post to ensure that the connection is persistent.

We recently announced a new preview feature, Azure Files. This feature allows you to mount an SMB based file share into your Azure hosted PaaS and IaaS VMs. As this feature is in preview, the various related bits are also in a preview state. And as with many previews, there's some risk when you mix early release bits with current production bits, which can cause difficulties.

So with this in mind, I decided that I’d do something I haven’t done in some time and that’s write some code that goes directly against the Storage API, to create, delete, and list shares created in Azure Files. And do this in a way that takes no dependencies on any “preview” bits.

We'll start with the Create Share API. For this we'll need our account name, one of the keys, and the name of the share we're working on. We begin by creating the basic REST request. As described at the link above, we need to use the "PUT" verb against the '2014-02-14' version of the API. I'm also going to set the content type to 'application/xml' and give it a content length of 0. We'll do this with an HttpWebRequest object as follows:

var request = (HttpWebRequest)HttpWebRequest.Create(string.Format("https://{0}.file.core.windows.net/{1}?restype=share", 
    creds.AccountName, shareName));
request.Method = "PUT";
request.Headers["x-ms-version"] = "2014-02-14";
request.ContentType = "application/xml";
request.ContentLength = 0;

The variables used in this are the account name (creds.AccountName) and the name of the share we want to create (shareName).

Once we have the request, we then have to sign it. Now you could do this manually, building the string to sign and doing the HMAC-SHA hashing… But since I can take a dependency on the existing Azure Storage SDK (v4.0.3), we can just use the SharedKeyLiteAuthenticationHandler class to sign the request for us. Big thanks to my colleague Kevin Williamson for pointing me at this critical piece, which had changed since I last worked with the Storage REST API.

SharedKeyLiteAuthenticationHandler auth = new SharedKeyLiteAuthenticationHandler(SharedKeyLiteCanonicalizer.Instance, creds, creds.AccountName);
auth.SignRequest(request, null);

By leveraging this aspect of the Azure SDK, we save ourselves the hassle of having to manually generate the string to be signed (canonicalizing it) and then actually doing the signature. If you'd like to learn more about Azure Storage authentication, I'd recommend checking out the MSDN article on the subject.

With the signed request created, we only have to execute the request, and trap for any errors.

// send the request
HttpWebResponse response = null;
try
{
    response = (HttpWebResponse)request.GetResponse();
    Console.WriteLine("Share successfully created!");
}
catch (WebException ex)
{
    Console.WriteLine(string.Format("Create failed, error message is: {0}", ex.Message));
}

And that's all there really is to it. If the request fails, it will throw a WebException, and you can look at the failure for additional details. Now if you want to learn more about the Azure Files REST API, you can find a slew of great information already out on MSDN. This includes one extremely helpful page related to naming and references.

Now what I did was take this, add in the Delete and List commands, and roll them all up into a simple little console app. So with this app, you can now run a command like…

AzureFileShareHelper -create -acct:<accountname> -key:<accountkey> -share:myshare

This will create the share for you and even return the URL use to mount the share into an Azure VM. Just change the verb to -delete or -list if you want to leverage another operation. :)
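For reference, the Delete operation is built the same way as the Create request above, just with a 'DELETE' verb against the same share URI. Here's a rough sketch reusing the creds and shareName variables from earlier; it mirrors the documented Delete Share REST call, but isn't the exact code from the console app.

var deleteRequest = (HttpWebRequest)HttpWebRequest.Create(string.Format("https://{0}.file.core.windows.net/{1}?restype=share",
    creds.AccountName, shareName));
deleteRequest.Method = "DELETE";
deleteRequest.Headers["x-ms-version"] = "2014-02-14";

// sign it just like the create request
var deleteAuth = new SharedKeyLiteAuthenticationHandler(SharedKeyLiteCanonicalizer.Instance, creds, creds.AccountName);
deleteAuth.SignRequest(deleteRequest, null);

try
{
    using (var deleteResponse = (HttpWebResponse)deleteRequest.GetResponse())
    {
        Console.WriteLine("Share successfully deleted!");
    }
}
catch (WebException ex)
{
    Console.WriteLine(string.Format("Delete failed, error message is: {0}", ex.Message));
}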

 

Azure File Depot

It’s been a little while since my last update. BUILD 2014 has come and gone and my group, DPE (Developer Platform Evangelism), has been re-branded to DX (Developer Experience). Admittedly a name is a name, but this change comes at a time when my role is beginning to transform. For the last 18 months, my primary task has been to help organizations be successful in adopting the Azure platform. Going forward, that will still remain significant, but now we’re also tasked with helping demonstrate our technology and our skills through the creation of intellectual property (frameworks, code samples, etc…) that will assist customers. So less PowerPoint (YEAH!), more coding (DOUBLE YEAH!).

To that end, I've opted to tackle a fairly straightforward task that's come up often: namely, the need to move files from one place to another. It's come up at least 3-4 times over the last few months, so it seems like a good first project under our changing direction. So I'd like to introduce you to my first open sourced effort, the Azure File Depot.

What is it?

In short, the Azure File Depot is an effort to provide sample implementations of various tasks related to using blob storage to move files around. It contains a series of simple classes that help demonstrate common tasks (such as creating a Shared Access Signature for a blob) and business challenges (the automated publication of files from on-premises to Azure and/or an Azure hosted VM).

Over time, it's my hope that we may attract a community around this and evolve this little project into a true framework. But in the meantime, it's a place for me to do what I do best and that's take learnings and share them. At least I hope I'm good at it since it's a significant portion of my job. :)

What isn’t it?

I need to stress that this project isn’t intended to be a production ready framework for solving common problems. The goal here is to create a collection of reference implementations that address some common challenges. While we do want to demonstrate solid coding practices, there will be places where the samples take less than elegant approaches.

I’m fully aware that in many cases there may be better implementations out there, in some cases perhaps even off the shelf solutions that are production ready. That’s not what this project is after. I have many good friends in Microsoft’s engineering teams that I know will groan and sigh at the code in this project. I only ask that you be kind and keep in mind, this is an educational effort and essentially one big collection of code snippets.

So what’s in it currently?

What we’ve included in this first release is the common ask I referred to above. The need to take files generated on-premises and push them to Azure. Either letting them land in blob storage, or have them eventually land in an Azure hosted VM.

Here’s a diagram that outlines the scenario.

FileDepotDiagram

Step 1 – File placed in a folder on the network

The scenario begins with a file being created by some process and saved to the file system. This location could be a network file share or just as easily could be on the same server as the process itself.

Step 2 – The Location is monitored for new files

That location is in turn monitored by a "Publication Service". Our reference implementation uses the C# FileSystemWatcher class, which allows the application to receive notification of file change events from Windows.

Step 3 – Publication service detects file and uploads to blob storage

When the creation of a new file raises an event in the application, the publishing app waits to get an exclusive lock on the file (making sure nothing is still writing to the file), then uploads it to a blob container.
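Steps 2 and 3 boil down to a FileSystemWatcher feeding a blob upload. Here's a condensed sketch of that flow, not the project's actual publication service; the container name "filedepot" is a placeholder, and a real implementation would retry the file lock and handle errors.

using System.IO;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class PublicationSketch
{
    private readonly CloudBlobContainer container;
    private readonly FileSystemWatcher watcher;

    public PublicationSketch(CloudStorageAccount account, string folderToWatch)
    {
        container = account.CreateCloudBlobClient().GetContainerReference("filedepot"); // placeholder container
        container.CreateIfNotExists();

        // raise an event whenever a new file shows up in the watched folder
        watcher = new FileSystemWatcher(folderToWatch);
        watcher.Created += (sender, e) => UploadFile(e.FullPath);
        watcher.EnableRaisingEvents = true;
    }

    private void UploadFile(string path)
    {
        // FileShare.None fails while something is still writing the file,
        // so a real implementation would retry until the exclusive lock succeeds
        using (var stream = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.None))
        {
            CloudBlockBlob blob = container.GetBlockBlobReference(Path.GetFileName(path));
            blob.UploadFromStream(stream);
        }
    }
}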

Step 4 – Notification message with SAS for blob is published

After the blob is uploaded, the publication service then generates a shared access signature and publishes a message to a “Messages” Service Bus topic so that interested processes can be alerted that there’s a new file to be downloaded.

Step 5 – Subscribers receive message

Processes that want to subscribe to these notifications create subscriptions on that topic so they can receive the alerts.

Step 6 – Download blob and save to local disk

The subscribing process then uses the shared access signature to download the blob, placing it in the local file system.

Now this process could be used to push files from any location (cloud or on-premises) to any possible receiver. I like the example because it demonstrates a few key points of cloud architecture:

  • Use of messaging to create temporal decoupling and load leveling
  • Shared Access Signatures to grant temporary access to secure, private blob storage for potentially insecure/anonymous clients
  • Use of Service Bus Topics to implement pub/sub message model

So in this one example, we have a usable implementation pattern (I've provided all the basic helper classes as well as sample implementations of both console and Windows Service applications). We also have a few reusable code snippets (create a blob SAS, interact with Service Bus Topics, upload/download files to/from block blobs).
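To give a flavor of those snippets, here's a condensed sketch of step 4: generate a read-only SAS for the uploaded blob and publish its URL to a Service Bus topic. The topic name "messages" and the one-hour window are placeholders, and this is a simplification of what's in the project.

using System;
using Microsoft.ServiceBus.Messaging;
using Microsoft.WindowsAzure.Storage.Blob;

class NotificationSketch
{
    public void PublishNotification(CloudBlockBlob uploadedBlob, string serviceBusConnectionString)
    {
        // read-only SAS valid for one hour (placeholder window)
        string sas = uploadedBlob.GetSharedAccessSignature(new SharedAccessBlobPolicy
        {
            Permissions = SharedAccessBlobPermissions.Read,
            SharedAccessExpiryTime = DateTime.UtcNow.AddHours(1)
        });

        // the full URL a subscriber can use to download the blob
        string downloadUrl = uploadedBlob.Uri + sas;

        // publish to the "messages" topic (placeholder name) so subscribers can react
        TopicClient topicClient = TopicClient.CreateFromConnectionString(serviceBusConnectionString, "messages");
        topicClient.Send(new BrokeredMessage(downloadUrl));
    }
}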

At least one customer I’m working with today will find these samples helpful. I hope others will as well.

What’s next?

I have a few things I plan to do with this in the near term. Namely make the samples a bit more robust: error handling, logging, and maybe even *gasp* unit tests! I also want to add in support for larger files by showing how to implement this with page blobs (which are cheaper if you're using less than 500TB of storage).

We may also explore using this to do not just new file publication, but perhaps updates as well as adding some useful metadata properties to the blobs and messages.

I also want to look at including more scenarios. In fact, if you read this and have a scenario, you can fork the project and send us a pull request. And if you're wondering how to do something that you think could fit into this project, please drop me a line.

That’s all the time I have for today. Please look over the project as we’d love your feedback.
