SAS, it’s just another token

Note: Please bear with me, I authored this post in Markdown. :)

I’ve been trying to finish this post since September of 2014, but I kept getting distracted from it. Well, this lovely (it’s 5 degrees Fahrenheit here in Minnesota) Saturday morning, it IS my distraction. I’ve been focused the last few weeks on my new love, the Simple Cloud Manager Project, as well as some internal stuff I can’t talk about just yet. Digging into things like Ember, Jekyll, Broccoli, GitHub Pages, git workflows, etc… has been great. But it’s made me keenly aware of how much development has leapfrogged my skills as a Visual Studio/.NET centric cloud architect. With all that learning, I needed to take a moment and get back to something I was a little more comfortable with. Namely, Azure services and specifically Service Bus and Shared Access Signatures (SAS).

I continue to see emails, forum posts, etc… regarding SAS vs ACS for various scenarios. First off, I’d like to state that both approaches have their merit. But something we all need to come to terms with is that at their heart, both approaches are based around a security token. So as the name of this blog article points out, SAS is just a token.

What is in a SAS token?

For the Azure Service Bus, the token is simply a string that looks like the following

SharedAccessSignature sr=https%3a%2f%2fmynamespace.servicebus.windows.net%2fvendor-&sig=AQGQJjSzXxECxcz%2bbT2rasdfasdfasdfa%2bkBq%2bdJZVabU%3d&se=64953734126&skn=PolicyName

Within this string, you see a set of URL encoded parameters. Let’s break them down a bit…

SharedAccessSignature – identifies the type of authorization token being provided. An ACS token starts with “WRAP” instead.

sr – this is the resource string we’re sharing access to. In the example above, the signature is for anything at or under the path “https://mynamespace.servicebus.windows.net/vendor-“

sig – this is a generated, HMAC-SHA256 hash of the resource string and expiry that was created using a private access key.

se – the expiry date/time for the signature expressed in the number of seconds since epoch (00:00:00 UTC on January 1st, 1970)

skn – the policy/authorization rule whose key was used to generate the signature and whose permissions determine what can be done

The token, or signature, is created by using the resource path (the URL that we want to access) and an expiry date/time. An HMAC-SHA256 hash of those parameters is then generated using the key of a specific authorization policy/access rule. In its own way, using the policy name and its key is not that different than using an identity and password. And like an ACS token, we have an expiry value that helps ensure the token we receive can only be used for a given period of time.

Generating our own Token

So the next logical question is how to generate our own token. If we opt to use .NET and have the ability to leverage the Azure Service Bus SDK, we can pull together a simple console application to do this for us.

Start by creating a Console application, and adding some prompts for the parameters we need, so that the main method looks like this…

static void Main(string[] args)
{
    Console.WriteLine("What is your service bus namespace?");
    string sbNamespace = Console.ReadLine();

    Console.WriteLine("What is the path?");
    string sbPath = Console.ReadLine();

    Console.WriteLine("What existing policy would you like to use to generate your signature?");
    string sbPolicy = Console.ReadLine();

    Console.WriteLine("What is the policy's secret key?");
    string sbKey = Console.ReadLine();

    Console.WriteLine("When should this expire (MM/DD/YY HH, GMT)?");
    string sbExpiry = Console.ReadLine();

    Console.WriteLine("Press any key to exit...");
    Console.ReadKey();
}

The first parameter we capture and save is the namespace we want to access, without the “.servicebus.windows.net” part. Next is the path to the Service Bus entities we want to provide access to. This can be a specific entity such as a queue name or, as I mentioned last time, a partial path to grant access to multiple resources. Then we need to provide a named policy (which you can set up via the portal) and one of its secret keys. Finally, we specify when this signature should expire.

Next, we need to transform the expiration time that was entered as a string into a TimeSpan (how long the SAS needs to ‘stay alive’). We’ll insert this right after we read the expiration value…

// convert the string into a TimeSpan...
// (CultureInfo and DateTimeStyles need a "using System.Globalization;" at the top)
CultureInfo enUS = new CultureInfo("en-US");
DateTime tmpDT;
bool gotDate = DateTime.TryParseExact(sbExpiry, "M/dd/yy HH", enUS, DateTimeStyles.None, out tmpDT);
if (!gotDate)
    Console.WriteLine("'{0}' is not in an acceptable format.", sbExpiry);

// the value entered is GMT, so subtract the current UTC time to get the TimeSpan
TimeSpan expiry = tmpDT - DateTime.UtcNow;

Now we have all the variables we need to create a signature, so it’s time to generate it. For that, we’ll use a couple classes contained in the .NET Azure Service Bus SDK. We’ll start by adding it to our project using the instructions available on MSDN (so I don’t have to retype them all here).
With the proper references added to the project, we add a simple using clause at the top…

using Microsoft.ServiceBus;

Then add the code that will create the SAS token for us right after the code that created our TimeSpan.

var serviceUri = ServiceBusEnvironment.CreateServiceUri("https", sbNamespace, sbPath).ToString().Trim('/');
string generatedSaS = SharedAccessSignatureTokenProvider.GetSharedAccessSignature(sbPolicy, sbKey, serviceUri, expiry);

And there we have it. A usable SAS token that will automatically expire after a given period of time, or that we can revoke immediately by removing the policy on which it’s based.

But what about doing this without the SDK?

Let’s start by looking at what the Service Bus SDK is doing for us. Fortunately, Sreeram Garlapati has already written some code to generate a signature.

static string CreateSasToken(string uri, string keyName, string key)
{
    // Set token lifetime to 20 minutes. When supplying a device with a token, you might want to use a longer expiration time.
    DateTime origin = new DateTime(1970, 1, 1, 0, 0, 0, 0);
    TimeSpan diff = DateTime.Now.ToUniversalTime() - origin;
    uint tokenExpirationTime = Convert.ToUInt32(diff.TotalSeconds) + 20 * 60;

    string stringToSign = HttpUtility.UrlEncode(uri) + "\n" + tokenExpirationTime;
    HMACSHA256 hmac = new HMACSHA256(Encoding.UTF8.GetBytes(key));

    string signature = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));
    string token = String.Format(CultureInfo.InvariantCulture, "SharedAccessSignature sr={0}&sig={1}&se={2}&skn={3}",
        HttpUtility.UrlEncode(uri), HttpUtility.UrlEncode(signature), tokenExpirationTime, keyName);
    return token;
}

This example follows the steps available at this SAS Authentication with Service Bus article. Namely:
– use the time offset from UTC time January 1st, 1970 in seconds to set when the SAS should expire
– create the string to be signed using the URI and the expiry time
– sign that string via HMACSHA256 and the key for the policy we’re basing our signature on
– base64 encode the signature
– create the fully signed URL with the appropriate parameters

With the steps clearly laid out, it’s just a matter of converting this into the language of your choice. Be it JavaScript, Objective-C, PHP, Ruby… it doesn’t really matter as long as you can perform these same steps.

In the future, it’s my sincere hope that we’ll actually see something in the Azure Service Bus portal that will make this even easier. Perhaps even a callable API that could be leveraged.

But what about “Connection Strings”

This is something I’ve had debates about. If you look at most of the Service Bus examples out there, they all use a connection string. I’m not sure why this is, except that it seems simpler because you don’t have to generate the SAS. The reality is that the connection string you get from the portal works much like a SAS, except that it lacks an expiry. The only way to revoke a connection string is by revoking the policy on which it’s based. This seems fine, until you realize you only get a handful of policies per entity, so creating hundreds of policies to be used by customers is a tricky proposition.

So what are you to do when you want to use a SAS, but all the examples use a connection string? Let’s start by looking at a connection string example.

Endpoint=sb://<namespace>/;SharedAccessKeyName=<SharedAccessRuleName>;SharedAccessKey=<SharedAccessKey>

This string has several parameters:

sb – the protocol to be used. In this case it’s ‘sb’, which is Service Bus shorthand for “use AMQP”.

namespace – the URL we’re trying to access

SharedAccessRuleName – the policy we’re using

SharedAccessKey – the policy’s secret key

The common approach is to put this string into a configuration setting and with the .NET SDK, load it as follows…

EventHubClient client = EventHubClient.Create(ConfigurationManager.AppSettings["eventHubName"]);

or

string connectionString = CloudConfigurationManager.GetSetting("Microsoft.ServiceBus.ConnectionString");
var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

These are common examples you’ll see using the connection string. But what if you want to use a SAS instead? For that, we go up a level to the Service Bus MessagingFactory. Both of the examples above abstract away this factory. But we can still reach back and use it.

We start by creating the URI we want to access, a token provider based on our SAS, and a messaging factory:

Uri runtimeUri = ServiceBusEnvironment.CreateServiceUri("sb", "<namespace>", string.Empty);
TokenProvider runtimeToken = TokenProvider.CreateSharedAccessSignatureTokenProvider("<SAS token we generated earlier>");
MessagingFactory mf = MessagingFactory.Create(runtimeUri, runtimeToken);

Alternatively, we can create the token provider using a policy and its secret key. But here we want to use a SAS token.

Now it’s just a matter of using the messaging factory to create a client object…

QueueClient sendClient = mf.CreateQueueClient(qPath);

Simple as that. A couple quick line changes and we’ve gone from a dependency on connection strings (ick!) to using SAS tokens.

So back to SAS vs ACS

So even with all this in mind, there’s one argument that keeps getting brought up: SAS tokens expire.

Yes, they do. But so do ACS and nearly all other claims-based tokens. The only real difference is that most of the “common” security mechanisms support a way of renewing the tokens. Be this asking the user to log in again, or some underlying bits which store the identity/key and use them to request new tokens when the old one is about to expire. The reality is that this same pattern needs to be implemented when you’re using SAS tokens.

Given what I’ve shown you above, you can now stand up your own simple token service that accepts a couple parameters (identity/key), authenticates them, selects the appropriate policy and URL for the provided identity, and then creates and returns the appropriate SAS.
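
To make that concrete, here’s a minimal sketch of what such a broker method might look like. Everything in it (the method name, the hard-coded client check, the policy values, the one-hour expiry) is a placeholder assumption rather than a prescribed implementation; a real broker would validate against its own identity store and look up the policy/path assigned to the caller.

public string IssueSasToken(string identity, string key)
{
    // placeholder credential check; validate against your own store in practice
    if (identity != "vendorA" || key != "not-a-real-secret")
        throw new UnauthorizedAccessException("unknown client");

    // placeholder mapping of this client to a namespace, path, and policy
    string sbNamespace = "mynamespace";
    string sbPath = "vendor-";      // partial path, just like the earlier example
    string sbPolicy = "PolicyName";
    string sbKey = "<policy secret key>";

    var serviceUri = ServiceBusEnvironment.CreateServiceUri("https", sbNamespace, sbPath)
        .ToString().Trim('/');

    // issue a short-lived signature; the client is expected to come back for a fresh one
    return SharedAccessSignatureTokenProvider.GetSharedAccessSignature(
        sbPolicy, sbKey, serviceUri, TimeSpan.FromHours(1));
}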

The client then implements the same types of patterns we can already see for things like mobile notification services. Namely, store tokens locally until they’re about to expire. When they are approaching their expiry, reach back out using our credentials and ask for an updated token. Finally, use the token we’ve been provided to perform the required operations.

All that’s really left up to you is to determine the expiry for the token. You can have one set value for all tokens. You may also opt to have tokens for more sensitive operations expire faster. You have that flexibility.

So until next time, enjoy SAS. Just please… don’t be afraid of them.

SimpleFileFetcher for Azure Start up tasks

I know some of you are waiting for more news on my “Simple Cloud Manager” Project. It’s coming, I swear! Meanwhile, I had to tackle one fairly common Azure task today and wanted to share my solution with you all.

I’m working on a project where we needed an Azure PaaS start-up task that would download a remote file and unzip it for me. Now there are several approaches out there of varying degrees of complexity. PowerShell, native code, etc… But I wanted something even simpler: a black box that I could pass parameters to and that required no additional assemblies/files to work. To that end I sat down and spent about 2 hours crafting the “Simple File Fetcher”.

The usage is fairly simple: simplefilefetcher -targeturi:<where to download the file from> -targetfile:<where to save it to>

You can also optionally specify a parameter that tells it to unzip the file, ‘-unzip’, and another that will make sure downloaded file is deleted, ‘-deleteoriginal’.

I spent most of the 2 hours looking at and trying various options. The final product was < 100 lines of code and, now that I know how to do it, would likely take me only 10-20 minutes to rebuild (most of that spent debugging the argument parsing). So instead of boring you all with an explanation of the code, I’ll just share it along with a release build of the console app so you can just use it. :)
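
For the curious, here’s a rough sketch of the core logic, not the actual released code. It assumes WebClient for the download and System.IO.Compression for the unzip, with the parsed arguments stubbed out as local variables.

using System.IO;
using System.IO.Compression;   // reference System.IO.Compression.FileSystem
using System.Net;

class SimpleFileFetcher
{
    static void Main(string[] args)
    {
        // stand-ins for the parsed -targeturi, -targetfile, -unzip and -deleteoriginal arguments
        string targetUri = "https://example.com/payload.zip";
        string targetFile = @"C:\temp\payload.zip";
        bool unzip = true;
        bool deleteOriginal = true;

        // download the remote file
        using (var client = new WebClient())
        {
            client.DownloadFile(targetUri, targetFile);
        }

        if (unzip)
        {
            // unzip next to where the file was saved
            ZipFile.ExtractToDirectory(targetFile, Path.GetDirectoryName(targetFile));

            if (deleteOriginal)
                File.Delete(targetFile);
        }
    }
}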

Until next time!

Azure Tenant Management

Hey everyone. I’m writing this post from Redmond, WA. I’m here on the Microsoft campus this week to meet with some of my colleagues and explore options for my first significant open source project effort. We’re here to spend three days trying to find ways to manage the allocation of, and monitor the billable usage of, Azure resources. There are two high level objectives for this effort:

  • Allow non-technical users to provision and access Azure resources without the need to understand Azure’s terminology or even be granted direct access to the underlying Azure subscription
  • Monitor the billable usage of the allocated resources, and track against a “cap”

On the surface this seems pretty simple. That is, until you realize that not all the Azure services expose individual resource usage, and of the ones that do, they all do it separately. It gets even more complicated when you realize that you may need to do things like have users “share” cloud services and even storage accounts.

Couple examples

So let’s dive into this a little more and explore a couple of the possible use cases I’ve heard from customers.

Journalism Students

Scenario: We have a series of journalism students that will be learning to provision and maintain an online content management system. They need someplace to host their website and upload content, and they need GIT integration so they can version any code changes they make to their site’s code. The instructor for the course starts by creating an individual web site for each student with a basic CMS system. The student will then access this workspace as they perform their coursework.

Challenges: Depending on the languages they are using, we may be able to get by with Azure Web Sites. But this only allows us up to 10 “free” web sites, so what happens to the other students? Additionally, students don’t know anything about the different SKUs available and just want things that work, so do we need to provide “warm-up” and auto-scaling? And since the instructor is setting up the web sites for them, we need a simple way for the instructor to get the resources provisioned and give the students access to them without the instructor needing to even be aware of the Azure subscriptions. We also need to track compute and bandwidth usage on the individual web sites.

Software Testers

Scenario: A small company has remote testers that perform quality assurance work on software. These workers are distributed but need to remote into Windows VMs to run tests. Ideally, these VMs will be hosted “in the cloud”, and the company wants a simple façade whereby the workers can select which software they need to test and then provision a virtual machine for them. The projects being tested should be “billed back” for the resources used by the testers and the testers work on multiple projects. Additionally, the testers should be able to focus on the work they have to do, not how to manage and provision Azure resources.

Challenges: This one will likely be Azure Virtual Machines. But we need to juggle not only compute/bandwidth, but track the impact on storage (transactions and total amount stored) as well. We also need to be able to provision VMs from a selection of custom gallery images and get them running for the testers, sometimes across subscriptions. Finally, we need to be aware of challenges with regards to VM endpoints and cloud services if we want to maximize the density of these VMs.

Business Intelligence

Scenario: Students are learning to use Oracle databases to analyze trends. The instructor is using the base Oracle database images from the Azure gallery but has added various tools and sample datasets to them for the students to use. The students will use the virtual machines for various labs over the duration of the course and each lab should only take a few hours.

Challenges: If these VMs were kept running 24×7, it would cost thousands of dollars per month per student. So we need to make sure we can automate the start and stop of the VMs to help control these costs. And since the Oracle licensing fees appear as a separate charge, we need to be able to predict these as well based on current rates and the amount of time the VM was active.

So what are we going to do about it?

In short, my plan is to create a framework that we will release via open source to help fill some of these gaps. A simple, web based user interface for accessing your allocated resources, some back end services that monitor resource usage and track that against quotas set by solution administrators. Underneath all that, a system that allows you to “tag” resources as associated with users or specific projects. If all goes well, I hope to have the first version of this framework published and available by the end of March 2015, focusing on Azure Web Sites, IaaS VMs, and Azure Storage.

However, and this is the point of my little announcement post, we’re not going to make you wait until this is done. As this project progresses, I plan to regularly post here and in January we’ll hopefully have a GIT repository where you’ll be able to check out the work as we progress. Furthermore, I plan to actively work with organizations that want to use this solution so that our initial version will not be the only one.

So look for more on this in the next couple weeks as we share our learnings and plans. But also, let me know via the comments if this is something you see value in and what scenarios you may have. And oh, we still don’t have a name for the framework yet. So please post a comment or tweet @brentcodemonkey with your ideas. :)

Until next time!

Azure’s new Event Hub

Over the past few months, I had the good fortune to be accepted to present at ThatConference in Wisconsin and CloudDevelop in Ohio. I count myself even more fortunate because at the time I submitted my session for both these events, it was about a new Azure solution that hadn’t even been announced yet, the Event Hub.

Whenever possible, I like to put demos into a real world context. For this one, I reached out to two colleagues that were presenting at ThatConference and collectively we came up with the idea of a conference attendee tracking solution. For my part of this series, I was going to cover using Event Hub to ingest event messages from various sources (social media, mobile apps, and proximity sensors) and feed them into the hub. I also wrote some code so that the other sessions could consume the messages.

Event Hub vs. Topics/Queues

The first thing to get out of the way is that Event Hub is NOT just a new variation on Topics/Queues. For this, I’ve found a simple visual example works best.

This is topics/queues

This is Event Hub

The key differentiator between the two is scale. A courier can pick up a selection of packages, and ensure they are delivered. But if you need to move hundreds of thousands of packages, you can do that with a lot of couriers, or you could build a distribution center capable of handling that kind of volume more quickly. Event Hub is that distribution center. But it’s been built as a managed service so you don’t have to build your own expensive facility. You can just leverage the one we’ve built for you.

In Service Bus, topics and queues are about the transportation and delivery of a specific payload (the brokered message) from point A to point B. These come with specific features (guaranteed delivery, visibility controls, etc…) that also limit the scale at which a single queue can operate. Event Hub was built to solve the challenge of scaled ingestion of messages, but did so by trading away these types of atomic operations. The easiest way to think of Event Hub is as a giant buffer into which you place messages, and they are automatically retained for a given period of time. You then have the ability to read those messages much as you would read a file stream from disk. You can even rewind all the way back to the beginning of the stream and process everything again.

And as you might expect given the different focus of the two solutions, the programming models are also different. So it’s also important to understand that switching from one to the other isn’t simply a matter of switching the SDK.

What is the Event Hub?

If you think back to Topics/Queues, you had the option of enabling partitions via the EnablePartitioning property. This would cause the topic or queue to switch from a single broker (the service-side edge compute node) to 16 brokers, increasing the total throughput of the queue by 16 times. We call this approach partitioning. And this is exactly what Event Hub does.

When you create an Event Hub, you determine the number of partitions that you want (from 16, the default, up to 1024). This allows you to scale out the processes that need to consume events. Partitions are also used to help direct messages. When you send an event to the hub, you can assign a partition key, which is in turn hashed by the Event Hub brokers so that it lands in a given partition. This hash ensures that as long as the same partition key is used, the events will be placed into the same partition and in turn will be picked up by the same receiver. If you fail to specify a partition key, the events will be distributed randomly.

When it comes to throughput, this isn’t the end of the story. Event Hubs also have “throughput units”. By default you start with a single throughput unit that allows 1 MB/s in and 2 MB/s out through your hubs. You can request this to be scaled up to 1000 throughput units. When you purchase a throughput unit, you do this at the namespace level since it applies to all the event hubs in that namespace.

So what we have is a service that can scale to handle massive ingestion of events, combined with a huge buffer just in case the back end (which also features scalable consumption) can’t keep up with the rate at which messages are being sent. This gives us scale on multiple facets, as a managed, hosted service.

So about that presentation…

So the next obvious question is, “how does it work?” This is where my demos came in. I wanted to show using Event Hub to ingest events from multiple sources: a social media feed, a mobile app used by vendors to scan attendee badges, and proximity sensors scattered around the conference to help measure session attendance.

I started by realizing that when I consume events, I need to know what type they are (aka how to deserialize them). To make this easy, I started by defining my own custom .NET message types. I selected Twitter for the media feed, and for those messages the type class declaration looks like this:
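
(The original post showed this class as a screenshot. Here’s a minimal sketch of what it describes; the type and member names are my assumptions, not the actual demo code.)

[DataContract]
public class TweetEvent
{
    [DataMember]
    public string User { get; set; }        // who tweeted

    [DataMember]
    public string Text { get; set; }        // the text of the message

    [DataMember]
    public DateTime CreatedAt { get; set; } // when it was created
}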

So we have who tweeted, the text of the message, and when it was created. I decorated the class with various data attributes to aid in serialization.

When a tweet is found, we’ll need a client to send the event…
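
(Also a screenshot originally; here’s a sketch along the lines described below, with assumed configuration setting names.)

// connection string and hub name both come from the application configuration file
string connectionString = ConfigurationManager.AppSettings["Microsoft.ServiceBus.ConnectionString"];
string hubName = ConfigurationManager.AppSettings["hubName"];

EventHubClient client = EventHubClient.CreateFromConnectionString(connectionString, hubName);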

This creates an EventHubClient object, using a connection string from the configuration file, and a configuration setting that defines the name of the hub I want to send to.

Next up, we need to create my event object, and serialize it.
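
(Roughly, assuming the TweetEvent sketch above and sample values in place of the real twitter query results.)

var tweetEvent = new TweetEvent
{
    User = "@brentcodemonkey",
    Text = "hello from ThatConference!",
    CreatedAt = DateTime.UtcNow
};

string serializedEvent = JsonConvert.SerializeObject(tweetEvent);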

I opted to use Newtonsoft’s .NET JSON serializer. It was already brought in by the Event Hub NuGet package. JSON is lighter weight than XML, and since Event Hub is based on total throughput, not # of messages, it made sense to keep the payloads as small as was convenient.

Finally, I have to actually send the message:
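
(A sketch of the send itself, following the description below; the partition key and property values are assumptions.)

// wrap the serialized JSON in an EventData, pick a partition key, and tag it with a type
EventData eventData = new EventData(Encoding.Unicode.GetBytes(serializedEvent))
{
    PartitionKey = "twitter"
};
eventData.Properties["Type"] = "Tweet";  // lets the consumers know how to deserialize it

client.Send(eventData);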

I create an instance of the EventData object using the serialized event, and assign a partition key to it. Furthermore, I also add a custom property to it that my event processors can then use to determine how to deserialize the event. Finally, I call the EventHubClient method, Send, handing it my event as a parameter. The default way for the .NET client to do all this is to use AMQP 1.0 under the covers. But you can also do this via REST from just about any language you want. Here’s an example using JavaScript…
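
(The JavaScript was also a screenshot in the original. This sketch follows the same XMLHttpRequest pattern shown in the Shared Access Signatures post further down; the URI, payload shape, and signature are placeholders.)

var scanEvent = { vendorId: 'vendorA', attendeeId: 'nnn', scannedAt: new Date().toISOString() };

var xmlHttpRequest = new XMLHttpRequest();
xmlHttpRequest.open('POST', 'https://<namespace>.servicebus.windows.net/<hubname>/publishers/vendor/messages', true);
xmlHttpRequest.setRequestHeader('Content-Type', 'application/atom+xml;type=entry;charset=utf-8');
xmlHttpRequest.setRequestHeader('Authorization', '<SharedAccessSignature>');
xmlHttpRequest.send(JSON.stringify(scanEvent));  // the string goes out UTF-8 encoded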

This example comes from another part of the presentation’s demo, where I use a web page with embedded JavaScript to simulate the vendor mobile device app for scanning attendee badges. In this case, the partition key is in the URI and is set to “vendor”. While I’m still sending a JSON payload, this one uses a UTF-8 encoding instead of Unicode. Another reason it can be important to have an event type available when we’re consuming events.

Now you’ll recall I explained that the partition key is used to help distribute the events so that we end up with a fairly even distribution among the consuming processes. So why would I choose to bind each of my examples to a single partition? In my case, I knew that volumes would be low, so there wasn’t much of an issue with overloading my consuming processes. But you can also use this approach if you want to ensure that the same consuming process always gets the events from the same source. Something that can be really handy if the consuming process is using the events to maintain an in-memory state model of some type.

So what about consuming the events?

Events are consumed via “consumer groups”. Each group can track its position within the overall event hub ‘stream’ separately, allowing for parallel consumption of events. A default group is created when the event hub is created, but we can create our own. Consuming processes in turn create receivers, which connect to the various partitions to actually consume the events. This would normally require you to code up some rather complicated logic to ensure that if the process that owns a given set of receivers becomes unavailable, another process can pick up the slack. Fortunately, the event hub team thought of this already and created another nuget package called the EventProcessorHost.

Simply put, this is a handy, .NET based approach to handle resiliency and fault tolerance of event consumers/receivers. It uses Azure Storage blobs to track which receivers are attached to a given partition in an event hub. If you add or remove consuming processes, it will redistribute the receivers accordingly. I used this approach for my presentation to create a simple console app that displays the events coming into the hub. There are really just three parts of the solution: the program itself, a receiver class, and an event processor class.

The console program is the simplest bit of code…
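
(The program was shown as a screenshot; here’s a minimal sketch of what the description below covers. The Receiver constructor arguments and setting names are assumptions.)

static void Main(string[] args)
{
    string eventHubName = ConfigurationManager.AppSettings["hubName"];
    string consumerGroupName = ConfigurationManager.AppSettings["consumerGroup"];

    // create the consumer group if it doesn't already exist
    var namespaceManager = NamespaceManager.CreateFromConnectionString(
        ConfigurationManager.AppSettings["Microsoft.ServiceBus.ConnectionString"]);
    namespaceManager.CreateConsumerGroupIfNotExists(eventHubName, consumerGroupName);

    // hand off to the receiver, which distributes threads across the partitions
    var receiver = new Receiver(eventHubName, consumerGroupName);
    receiver.MessageProcessingWithPartitionDistribution();

    Console.WriteLine("Receiving. Press enter to stop.");
    Console.ReadLine();
}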

We use the namespace manager to create a consumer group if the one we want doesn’t already exist. Then we instantiate a Receiver object, and tell that object to start processing events, distributing threads across the various partitions in the event hub. The nuget package comes with its own Receiver class, so there’s not much you really need to do. The core of the receiver is in the MessageProcessingWithPartitionDistribution method.
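
(A rough sketch of that receiver, with assumed field and setting names; the real sample class has a bit more to it.)

class Receiver
{
    readonly string eventHubName;
    readonly string consumerGroupName;

    public Receiver(string eventHubName, string consumerGroupName)
    {
        this.eventHubName = eventHubName;
        this.consumerGroupName = consumerGroupName;
    }

    public void MessageProcessingWithPartitionDistribution()
    {
        // blobs in this storage account hold the per-partition leases and checkpoints
        string storageConnectionString = ConfigurationManager.AppSettings["AzureStorageConnectionString"];
        string serviceBusConnectionString = ConfigurationManager.AppSettings["Microsoft.ServiceBus.ConnectionString"];

        // "singleworker" shows up as the Owner in the per-partition lease blobs
        var eventProcessorHost = new EventProcessorHost(
            "singleworker",
            this.eventHubName,
            this.consumerGroupName,   // our own group instead of the default
            serviceBusConnectionString,
            storageConnectionString);

        // start asynchronous processing; this points at the IEventProcessor implementation below
        eventProcessorHost.RegisterEventProcessorAsync<SimpleEventProcessor>().Wait();
    }
}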

You’ll note that this may actually be a bit different than the version that arrives with the NuGet package. This is because I’ve modified it to use a consumer group name I specify, instead of just the default name. Otherwise, it’s the same example. I get the Azure Storage connection string (where the blobs that will control our leases will go), and then use that to create an EventProcessorHost object. We then tell the host to start doing asynchronous event processing (via RegisterEventProcessorAsync). This registration actually points to our third class, which implements the IEventProcessor interface. Again, a template is provided as part of the NuGet package, so we don’t have to write the entire thing ourselves. But if you look at this ProcessEventsAsync method, we see the heart of it…
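
(And a sketch of the processor class, based on the description that follows; the exact member names, the "Scan" type value, and the UTF-8/Unicode split are assumptions.)

class SimpleEventProcessor : IEventProcessor
{
    DateTime lastCheckpoint = DateTime.UtcNow;

    Task IEventProcessor.OpenAsync(PartitionContext context)
    {
        Console.WriteLine("Processor opened for partition {0}", context.Lease.PartitionId);
        return Task.FromResult<object>(null);
    }

    async Task IEventProcessor.ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
    {
        foreach (EventData eventData in messages)
        {
            // the custom "Type" property set by the sender tells us how to deserialize
            string eventType = (string)eventData.Properties["Type"];

            // badge-scan events arrive as UTF-8, tweets as Unicode
            string payload = eventType == "Scan"
                ? Encoding.UTF8.GetString(eventData.GetBytes())
                : Encoding.Unicode.GetString(eventData.GetBytes());

            Console.WriteLine("Partition {0}, {1}: {2}", context.Lease.PartitionId, eventType, payload);
        }

        // checkpoint roughly once a minute so we don't re-read old events after a restart
        if (DateTime.UtcNow > lastCheckpoint.AddMinutes(1))
        {
            await context.CheckpointAsync();
            lastCheckpoint = DateTime.UtcNow;
        }
    }

    async Task IEventProcessor.CloseAsync(PartitionContext context, CloseReason reason)
    {
        if (reason == CloseReason.Shutdown)
            await context.CheckpointAsync();
    }
}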

What’s happening behind the scenes is that a thread is being spun up for each partition on the Event Hub. This thread then uses a blob lease to take ownership of a partition, then attaches a receiver and begins consuming the events. Each time it pulls events (by default, it will pull 10 at a time), the method I show above is called. This method just loops through the collection of events, and every minute will tell the EventProcessorHost to checkpoint (record where we’re at in the stream) our progress. Inside of the foreach loop is the code that looks at the events, deserializes them appropriately, and displays them on the program’s console.

You can see we’re checking the event’s “Type” property, and then deserializing it back into an object with the proper type of encoding. It’s a simple example, but I think it drives the point home.

We can see some of what’s going on under the covers of the processor by looking at the blob storage account we’ve associated with our processor. First up, the EventProcessorHost creates a container in the storage account that is named the same as our event hub (so if you have multiple hubs with the same name in different namespaces, be sure to use different storage accounts). Within that container is a blob named “eventhub.info” which contains a JSON payload that describes the container and the hub.

{“EventHubPath”:”defaulthub”,”CreatedAt”:”2014-10-16T20:45:16.36″,”PartitionCount”:16}

This shows the location of the hub, when this container/file was created, and the number of partitions in the hub. Getting the number of partitions is why you must use a connection string or SAS for the hub that has manage permissions. Also within this container is one blob (zero indexed) for each partition in the hub. These blobs also contain JSON payloads.

{“PartitionId”:”0″,”Owner”:”singleworker”,”Token”:”642cbd55-6723-47ca-85ee-401bf9b2bbea”,”Epoch”:1,”Offset”:”-1″}

We have the partition this file is for, the owner (aka the EventProcessorHost name we gave this), a token (presumably for the lease), an epoch (not sure what this is for YET), and an offset. This last value is the position we’re at in the stream. When you call the CheckpointAsync method of our SimpleEventProcessor, this will update the value of the offset so we don’t read old messages again.

Now if we spin up two copies of this application, after a minute or so you’d see the two change ownership of various partitions. Messages start appearing in both, and provided you’re spreading your messages over enough partitions, you’ll even be able to see the partition keys at work as different clients get messages from specific partitions.

Ok, but what about the presentations?

Now when I started this post, I mentioned that there was a presentation and set of demos to go along with this. I’ve uploaded both for you to take away and use as you see fit. So enjoy!

Annotated Event Hub PowerPoint Presentation Deck && Event Hub Visual Studio 2013 Demo Solution (contains 3 demos and 5 projects)

Until next time!

Shared Access Signatures with Azure Service Bus

Sorry for the silence folks. Microsoft’s fiscal year end was June 30th and I started on a new team on July 1st. While July and August are usually great periods for folks like me to get some extra blogging done, I’ve had a few distractions that kept me from writing. Namely learning, learning, learning and trying to find my “voice” on the new team.

Going forward, I’m going to start having more of a focus on this “internet of things” fad that everyone’s talking about. And within that, I’m going to be sticking fairly close to home and working on the services side of things. Even more tightly focused, I’m going to go deeper on “build” vs “buy” scenarios and focus a goodly amount of my available time on one of the Azure product collections I’ve often felt didn’t get enough respect… Service Bus.

So in the coming weeks expect to see blog posts on Event Hub, possibly Hybrid Connections, and for starters, setting a few things straight about Service Bus in general.

SAS vs. ACS

So my first starting point is to call out a ‘stealth update’ that happened recently. As of sometime on/after August 21st 2014, if you create a new Service Bus root namespace via the Azure Management Portal, it will no longer include the associated Access Control Service namespace. This change is in line with recommended guidance the product team has been giving for some time. Namely, to use SAS over ACS whenever possible.

Note: You can still generate the associated ACS namespace automatically by using the new-azuresbnamespace powershell cmdlet. Just be aware that a new version of this will be coming soon that will match the default behavior of the management portal. When the new cmdlet is available, you will need to append the “-useAcs true” parameter to the command if you still want to create the ACS namespace.

There are a few reasons for this guidance. The first is that according to the data the team had available to them, many folks doing ACS authentication were using the “owner” credential. This identity had FULL control over the namespace. So using it was akin to having an app use the SA (system administrator) identity for a SQL database. Secondly, ACS requires two calls for the first-time operation: one to get the ACS token, one to perform the requested service bus operation. Now the token had a default time to live of 3 hours, but some SDKs didn’t cache the token, and as a result all operations against the SB were generating two calls, which increases the latency of the operation significantly. As if that weren’t enough, ACS only supports about 40 authentications per second. So if your SDK didn’t cache the token, your possible throughput on a single queue goes from somewhere near 2,000 messages a second down to 40 at best.

Now ACS has some benefits to be sure. In general, folks are much more familiar with username/password models than shared access signatures. You could create identities for specific publishers/consumers (within reason), as well as scope those identities and their permissions to specific paths. Allowing a single identity to send/receive from multiple queues, for example. It also had the ACS management namespace with a GUI to help manage things. And to shut down access, all one has to do is revoke the identity and access is cut off. But many of these needs can also be met using Shared Access Signatures if one knows how. And that is what I’d like to start helping you with in this post. :)

Shared Access Policies/Rules & Connection Information

Ok, first issue… If you use the management portal, you’ll see the ability to create/manage Shared Access Policies, but in the SDK and API, these are referred to as a SharedAccessAuthorizationRule. For the sake of simplicity, and consistency with Azure storage, I’ll refer to this from now on simply as “policies” (which matches the Azure Storage naming).

In Service Bus terms, a policy (aka SharedAccessAuthorizationRule) is a named set of permissions associated with an entity. The entity can be the Service Bus’ root namespace (the name you gave it when it was created), a queue, a topic, or an event hub. Basically an addressable endpoint that has a name assigned to it. For each entity you can have up to twelve policies and each policy is allowed a mix of the same three permissions: manage, send, and listen. Each policy also has two access keys much like Azure Storage and for the same reason. So you can do key swaps periodically and ensure you always have at least one active, valid key available to your applications when an old one is regenerated.
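
As a quick illustration, here’s a sketch of what creating one of these policies looks like in code: a queue provisioned with its own send-only rule. The queue and policy names are just example values.

// sketch: create a queue that carries its own send-only policy named "Publish"
var namespaceManager = NamespaceManager.CreateFromConnectionString(
    "<namespace-level connection string with Manage rights>");

QueueDescription queueDescription = new QueueDescription("vendor-queue2");
queueDescription.Authorization.Add(new SharedAccessAuthorizationRule(
    "Publish",
    SharedAccessAuthorizationRule.GenerateRandomKey(),   // primary key; swap/regenerate it periodically
    new[] { AccessRights.Send }));

namespaceManager.CreateQueue(queueDescription);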

It’s these policies that are the “Connection Information” you access via the portal and see available as SAS connection strings. And it’s the connection strings that lead me to a bit of an issue I have with how many service bus demos are done.

Service Bus Clients

When you create your first service bus project using the .NET SDK and one of the tutorials, you’ll likely be asked at some point to add code that looks like the following:

// Create QueueClient
QueueClient client = QueueClient.Create("vendor-queue2");

// insert the message body into the request
BrokeredMessage message = new BrokeredMessage("hello world!");

// execute the request and get the response
client.Send(message);

Notice that the sample specifies a queue name that we want a client for, but no credentials. That’s because within the SDK, this method is overloaded to look for an application configuration setting by the name of “Microsoft.ServiceBus.ConnectionString”. The value of this string is the SAS connection string you can get from the portal. It gives the application access to the entity until such time as the policy/rule is removed. In other words, you can re-write this code to look like this:

// Create QueueClient from a connection string
QueueClient client = QueueClient.CreateFromConnectionString("Endpoint=sb://<namespace>/;SharedAccessKeyName=<SharedAccessRuleName>;SharedAccessKey=<RuleKey>", "vendor-queue2");

// insert the message body into the request
BrokeredMessage message = new BrokeredMessage("hello world!");

// execute the request and get the response
client.Send(message);

By using CreateFromConnectionString in place of the simple Create, we can specify our own connection string. But again, this is permanent access until the policy/rule is removed. It also highlights the issue I have with the way the available samples/demos work that I mentioned earlier. I bemoaned the use of the “owner” credential when doing ACS. Let’s look at the default connection string that the Service Bus NuGet package puts into the application configuration file:

    <add key="Microsoft.ServiceBus.ConnectionString" value="Endpoint=sb://[your namespace].servicebus.windows.net;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=[your secret]" />

This sample refers to a policy/rule named “RootManageSharedAccessKey”. When you create a new namespace, this default policy has been created for you automatically with permission to listen, send, AND MANAGE the entire namespace. Please! For the love of all that is digital, NEVER… EVER use this credential for anything other than an application that needs to manage all aspects of a given namespace! Instead, configure your own policies with just the permissions that are needed. If you put the default policy into all your apps, we’re right back to the “owner” credential situation we had with ACS.

There’s another issue with this approach. Rules/policies must be associated with a specific service bus entity. This is where SAS comes into play.

Our First SAS

From the root namespace, entities in Service Bus are accessible by a path that lies directly under that namespace. Together with the root namespace, this gives each entity an addressable path. Like so…

brentsample-ns.servicebus.windows.net/my-queue
brentsample-ns.servicebus.windows.net/my-eventhub
brentsample-ns.servicebus.windows.net/your-topic

Now I could create a policy at the root namespace that has “send” permission. But using it as a connection string would give the sender access to send to everything in the namespace. Alternatively, I could create individual policies/rules on each of these entities. But then I need to juggle all those different connection strings.

If we opt to use a SAS, we have a simpler way to help restrict access, and we also make management a touch easier, by creating signatures that allow access to a partial path, namely something like entities that begin with “my-”. Unfortunately, the management portal does not yet provide the ability to create a SAS. So we either need to write a bit of .NET code, or leverage 3rd party utilities. Fortunately the code is pretty simple. Using Visual Studio, you can create a new Console Application and then add the NuGet package for Azure Service Bus. Then all that remains is to populate some variables and use these two lines of code to generate our signature.

var serviceUri = ServiceBusEnvironment.CreateServiceUri("https", sbNamespace, sbPath).ToString().Trim('/');
string generatedSaS = SharedAccessSignatureTokenProvider.GetSharedAccessSignature(sbPolicy, sbKey, serviceUri, expiry);

The important variables in here are:

sbNamespace – the name of the service bus namespace. Don’t include the “.servicebus.windows.net” stuff. Just the name we gave it when it was created.
sbPath – the name, or partial name, of the entities we want to access. For this example, let’s pretend it’s “my-”
sbPolicy – the rule/policy that has the permissions we want the signature to include
sbKey – one of the two secret keys of the rule/policy we’re referencing
expiry – a date/time of when the signature should expire

If we fill these in, we get a signature that looks something like:

SharedAccessSignature sr=https%3a%2f%2fbmssample-ns.servicebus.windows.net%2fmy-&sig=B9cy8OEuxum2UN5VjsC4JPhbVU7jwJi%2bq20qiaXk24s%3d&se=64953912194&skn=Publish

Now that we have this signature, we want to be able to use it to interact with one of our entities. There’s no “CreateFromSAS” option, but fortunately in the .NET SDK we can use this signature together with a MessagingFactory to create our entity client.

MessagingFactorySettings mfSettings = new MessagingFactorySettings();
mfSettings.TransportType = TransportType.NetMessaging;
mfSettings.TokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider("<signature we just created>");
MessagingFactory mf = MessagingFactory.Create("sb://thatconference-ns.servicebus.windows.net", mfSettings);

// Create Client
QueueClient client = mf.CreateQueueClient(queueName);

And in this case, the same signature will work for the queue ‘my-queue’ or the event hub ‘my-eventhub’ (although for the latter the TransportType needs to be Amqp). We can even take this same signature and put it into JavaScript for use, perhaps, in a NodeJS application…

var xmlHttpRequest = new XMLHttpRequest();
xmlHttpRequest.open('POST', 'https://thatconference-ns.servicebus.windows.net/my-eventhub/publishers/vendorA-DeviceIDnnn/messages', true);
xmlHttpRequest.setRequestHeader('Content-Type', 'application/atom+xml;type=entry;charset=utf-8');
xmlHttpRequest.setRequestHeader('Authorization', '<SharedAccessSignature>');

In the case of Event Hub, where we’ll have various publishers, we can do exactly the same thing, but using a rule/policy from the event hub and generating a signature that goes deeper, like “my-eventhub/publishers/vendorA-”.

Policy Expiry and Renewal

So here’s the $50,000 question. With ACS I could (at least to a certain scale) have a single identity for each client and remove them at any time. With SAS, I can remove a signature by removing its underlying policy at any time. But since I’m limited to twelve policies, how do I recreate the ability to revoke on demand? Simply put, you don’t.

Now before you throw your hands up and state with exasperation that this approach “simply won’t work”, I do ask you to take a long hard look at this requirement. In my experience, I find that the vast majority of the time, allowing someone to publish to an entity is a matter of trust. You’ve entered into an agreement with them and aren’t likely to revoke that access on a whim. The nature of this trust is rarely volatile/short term in nature. If it’s a customer, they are usually engaged for the term of their subscription (monthly or annual are common). So you know when your trust will expire and need to be renewed.

In this situation, we are planning for an eventuality that rarely comes to pass. And one that has an alternative that requires just a small amount of work: implementing your own “credential broker”.

If we look back at what the ACS did, you presented it with credentials, and it issued you a token. You then used that token to access the appropriate service bus resources. That’s exactly what our credential broker would do. Periodically, the requesting client would call a service you are hosting and present some credentials (username/pass, certificate, PSK, etc…). Your broker service validates this request (and likely logs it), then issues back a SAS ‘token’ with an appropriate expiry. The client then (hopefully) caches the SAS ‘token’, and uses it on subsequent requests until it expires, then goes back to the broker to get a new SAS ‘token’.

Now if this isn’t enough, we still have the ability to remove/disable the queue (or event hub). So in a way we get the best of both worlds. We have automatic expiry of tokens to ensure “key rotation” while also having the ability to revoke access immediately.

This is just one possible pattern. So instead of offering up alternatives, I’d love to hear from any of you via the comments on the patterns you have used to help manage shared access signatures.

Expiration Reached

In the coming weeks/months I’m going to generate a series of posts that will dive into various Service Bus related topics more deeply. If there’s something specific you’d like to see, please let me know. Otherwise you can look forward to posts on accessing Service Bus from other languages/SDKs, programmatic management of namespaces/rules, and resilient architectures around Service Bus.

I hope this article has helped clear up some of the fog around the Azure Service Bus. So until next time!

New Role: focus on the Enterprise

For the last 18+ months, I’ve been fortunate enough to be part of a great team of professionals that have focused on helping ISVs (Independent Software Vendors) successfully adopt the Azure platform. I’ve enjoyed being a part of this team more than I can express, but the time has come for me to return to the Enterprise focused market where I’ve spent the majority of my 20+ year career.

Effective July 1st, I am transitioning to a new team within my current organization (TED) that will focus on Enterprise Architecture. I remain focused on the Azure platform, but will be bringing what I’ve learned from working with ISVs, as well as my past experiences with Enterprise customers, to try and help grow the platform even further. This new team, led by Barry Briggs, will also allow me to join forces with great minds such as Josh Holmes and my old friend David Makogon.

I’m extremely excited about this move. While a significant number of the Fortune 500 companies out there already have Azure, I’m convinced there is still an incredible amount of untapped potential. And I look forward to working with this new team to help unlock new stories, and exciting new solutions.

Attempting to define “IOT”

NOTE: the following represents my own opinions and should NOT be consider an official viewpoint for any person or organization. These opinions are also still in flux, so if you ask me again in a month, it’s likely to have changed.

Jason, a friend and colleague, recently posted an update on his blog titled “What is Internet of Things”. In his post, he calls out a couple great potential scenarios, but I called him out for not really defining IOT. He attempted to counter with what is, in my opinion, a marketing blurb. I like and respect Jason, so much of this back and forth was good natured colleague rib-poking. But I realized I shouldn’t be poking at his attempt without making my own.

How do you define the undefinable?

To start with, I want to call out that attempting to define “IOT” is like attempting to define “the cloud”. Over time, “cloud” has settled on a definition that revolves around a collection of attributes: scalable, self-service, pay for what you use, and internet accessible. These still fluctuate greatly, and to some degree depend on the viewpoint of the consumer of “cloud” based services. But it gave us a starting point.

Using that as an example, I think we could define IOT solutions as also requiring a set of attributes:

Things – A ‘thing’ is a specialized, autonomous piece of technology capable of performing an action, but it does not possess a traditional human user interface. A phone or computer is a device. A security hand scanner, motion sensor, or even the GPS sensor in a phone could be considered a “thing”.

Data – Our things, being highly specialized, will have limited capabilities. But one of these will be the ability to report on the information they gather so something else can process the data, turning it into information. They *may* also be able to receive commands to alter their function (i.e. changing your sampling rate from 1s to 5s). In many cases, we can likely expect both.

Connectivity – How that data moves in and out of “the things”, requires some type of connection. This could be a cellular network, location specific wi-fi, etc… In part, this refers to how the “things” interact or the data is gathered. The connection can be persistent (always on), or transient (on/off as required), but if you have to do stuff like plug a device into it or move data via USB or SD, we’re lacking the “internet” part of IOT. A big discussion point here is if the “thing” is directly connected, or requires assistance to connect (like the GPS sensors mentioned above).

Management – This attribute is how we track the “things”. Are they curated and highly managed (likely in a factory scenario)? Or are they anonymous and unmanaged (open sourced climate telemetry gathering)? This also covers questions like how do I identify the device and separate it from “rogue” devices.

So with a set of initial attributes identified, the next step is to use them to define some scenarios.

A Factory Scenario

So Jason’s post calls out a manufacturing scenario. I’m careful to say “a” scenario, because our definition above allows for a nearly infinite set of scenarios. But here’s one possibility: a single factory that makes widgets. For this scenario, we can identify our IOT attributes.

The things: each piece of machinery has 3 sensors on it (power monitoring, pieces made, and operating temperature) and one actuator (power control).

Data: The sensors each capture a specific measurement every 1 second and report it every 5 seconds while the machine is running to a local controller.

Connectivity: The sensors are wired to a “controller” that’s mounted on the machine. This in turn is connected into the factory’s private wi-fi network. While the controller can take action on the data, it mostly just displays it and hands it back up to a central service on the network.

Management: When a machine is installed in the factory, there’s a step where the machine (and its sensors) are connected to the network, and “registered” with the factory’s services. Additionally, since the sensors are hard-wired to the machine “controller”, it has information about the sensors it can in turn share up to the factory services.

So we have a basic scenario that has all four attributes and meets our basic criteria. The factory is capable of determining when the data is trending downward (perhaps power usage is increasing while the number of pieces being produced is dropping). It can then take action, like telling the machine to shut down because a technician is on the way. We can also collect and trend the data, mining it over time.

But IOT also creates some common challenges.

Ingestion of telemetry: If I only have 100 machines, this isn’t a big deal. But what if I’m in a scenario where I have several thousand or hundreds of thousands? How do I scale my factory services to ingest that many connections and messages?

Device Management: Sensors fail and have to get replaced. As machines fail, they may be parted out. So a sensor that fails may be replaced with a sensor that used to be on a different machine. So I constantly need to be able to track the relationships.

Connectivity: What happens if the factory wi-fi goes down? Or worse yet a visitor to the factory taps into the network and starts sending rogue messages that alter my factory data? What if they send commands to the machines that cause them to overheat?

It’s these challenges that all the vendors are racing to solve with their various IOT Solutions!

What are the solutions?

I recently saw slides from a session given by Alessandro Bassi at the M2M+ Industry Summary in Milano, Italy. In this session, he calls out that innovation for its own sake will usually fail. It needs to be supported by a good business model. So the vendors in this space are all attempting to offer their business solutions.

In some cases, these solutions are highly targeted. The tech startup NEST is a good example of this. It has a thing, the thing gathers data, it’s connected and transmits the data, and it’s managed (you register your device). In this case, the vendor is trying to solve a specific problem, driving down home energy costs, and marketing this solution to consumers is their business model.

In other cases, the solutions are industry focused. With the vendor providing a collection of services that work together in a more flexible way to help drive larger, more strategic initiatives. In the link I just shared, it’s a collection of solutions targeting hardware and software to help with factory scenarios like I listed above. This solution is positioned to organizations that are in that industry vertical, to address the requirements of that industry.

And lastly, we have services like the ones I’ve been working with lately: ISS, Service Bus, Project Orleans, and HDInsight. These are more building block oriented. They aren’t specific to an industry or even a set of scenarios. They are meant to allow higher level, more encompassing solutions to be created. These get marketed to either software vendors looking to build out commercial or consumer solutions, or to organizations that need to build out customized solutions for internal use.

Each approach has pros and cons, but then they are also targeting a different business models. So it’s about picking the approach that best addresses your needs. Meanwhile, the vendors are all looking to capitalize on the market and find a way to sell their solution to solve the “IOT problem”.

Summary – have we accomplished anything?

This was the first time I’ve really put any of these thoughts down. Looking back over the typos and grammar errors, I have to ask if I’ve accomplished anything. I did set down a rough definition for what I think “IOT” is. This also allows me to call out some of the common challenges to related scenarios and ultimately even call out the types of solutions the industry is offering. So I guess you could say I have defined IOT.

But I can’t help but think the practical reality is more difficult. The definition of IOT will continue to evolve and there will always be shades of gray. So it’s important to keep in mind that it means different things to different people. With that in mind, I think I’ll stick to my “remain calm and ask questions” approach, and when someone comes to me with “an IOT scenario”, my first response will always be “so tell me about it”. The scenario and its challenges will always trump an arbitrary terminology definition. And in the end, it’s the solution and the business value that it brings that really matters more than the industry buzzword. Isn’t it?

Until next time!
