Loading it on with Event Grid

Last week, I had one of those moments that makes me really appreciate my role at Microsoft. I was fortunate enough to spend 2 days working along side colleagues and members of the Event Grid product group (PG) with the Event Grid. We played with some stuff that hasn’t been announced (sorry, not covering that here), as well as tested out a few scenarios for them and gave feedback on the service.

The scenario I picked was “Event Grid under load”, basically to try and DDOS the service. The objective of this scenario was to send load to a custom topic in Event Grid and see a) could I get throttling to kick in and b) what were the behaviors we saw.  The answer was a) yes and b) educational.

The Solution

Event Grid commits to handle the ingestion of up to 5,000 events per second, but we were encouraged to see how far we could push things. And while there are some solutions out there that can be used to generate simulated load, I went to what I knew… Service Fabric. I opted to create a simple micro-service that could generate traffic, then use a Service Fabric cluster to run multiple copies of that service. The service is pretty simple, about 20 lines of code inserted into the “run” method of a stateless service.


List<EventGridEvent> eventList = new List<EventGridEvent>();

for(int i=1; i <= 1000; i++)
{
    EventGridEvent myEvent = new EventGridEvent()
    {
        Subject = $"Event {i}",
        EventType = "Type" + i.ToString()[0],
        EventTime = DateTime.UtcNow,
        Id = Guid.NewGuid().ToString(),
        Data = "sample data",
        DataVersion = "1.0"
    };
    eventList.Add(myEvent);
}

TopicCredentials topicCredentials = new TopicCredentials(sasKey);
EventGridClient client = new EventGridClient(topicCredentials);
client.PublishEventsAsync(eventgridtopic, eventList);

await Task.Delay(TimeSpan.FromMilliseconds(100), cancellationToken);

I started running a single copy of this service on my development machine using my local development Service Fabric cluster. I played around with settings going from a batch of 100 events once per second and tweaking things until it was finally running as you see above with a batch of 1000 events about 10 times a second. At that point we stood up a 5 node Service Fabric cluster running Standard_D2_v2 instances so we could deploy the service there. I was able to batch up 1000 of my events because I didn’t exceed the 1mb maximum payload of a single publish operation. So if you opt for this same type of bulk send, please keep that in mind when designing your batch size.

Event Grid is pretty smart, so when an event comes in, it will persist it to storage, than look for subscribers that need to receive it. If there are none, it stops there. So to really help finish this scenario, we needed to have a way to consume the load. This could prove much more difficult then generating the load, so we instead opted for an easy solution… Event Hub. We created a subscriber on our Custom Topic that would just send all the event that came in to an Event Hub that was configured to automatically scale up to the maximum limits.

With everything in place, it was time to start testing.

The Results

We started our first real test with one copy of the load generating service on each node in the cluster, generating approximately 50,000 events per second. Using the Azure Portal, we monitored the Custom Topic and its sole subscriber closely and saw that everything appeared to be handling the load admirably. The only real deviation we saw was momentary degradation that we believe presented the Event Hub engaging its automatic scaling to handle the extra traffic. This is important because unlike a typical load test where you’d slowly scale up to the 50,000, we basically went from 0 to 50k in a few seconds.

Raja, a lead architect on Event Grid, was sitting next to me monitoring the cluster in Azure that hosted my custom topic. After getting pretty excited about what I was seeing, he agreed that we could double the traffic. So after a quick redeploy of the solution, we doubled our load to 100k events per second. Things started off great, but after a few minutes we noticed we started seeing steadily increasing failure rates on sends to the topic. When we looked at the data, the cause became apparent. Event Hub was throttling the Event Grid’s attempts to send events to it. This created back-pressure on the Event Grid cluster as it started executing its retry scenarios.

With this finding in mind, we realized that we need to change the test a bit and filter the Event Hub subscriber. We went with a simple option again and opted to have the Event Hub only subscribe to events where the Event Type was “Type 1”. In the code above, we take the first number of the loop (1-1000) and use that as the Event Type. Thus generating 9 different events types. So with the revised filtering, we returned to sending 50k events per second, with the Event Hub only getting slightly less than 5k of them. This worked fine, so we double it again to 100k. This time… because we have eliminated the back-pressure, we were able to achieve 100k without any failures.

With 100k successfully in place, Raja looked at the cluster and gave the thumbs up again for us to double traffic to 200k. We quickly saw that we’d hit our upper limit and we started getting publication errors on the custom topic. By our account, we were hitting the limit at just a nudge past 100k. But we let the load run for awhile and continued to monitor what was going on. As we watched, we saw that our throughput would drop, at times to as low as 30k per second but would never really get above that 100k limit. Raja explained that what was happening was that the cluster had capacity and was giving it to us up to a certain point. But as pressure from other tenants on the cluster came and went, our throughput would be throttled down to ensure that there were resources available for those tenants. In short, we had proven that the system was behaving exactly as intended.

So what did we learn…

The entire experience was great and I had alot of fun. But more importantly, I learned a few things. And in keeping with my role… part of the job is to make sure I pass these learnings along to others. So here’s what we have…

First, while the service can deliver beyond the 5,000 events per second that the PG commits too, we shouldn’t count on that. This burned folks back in the early days of Azure when they would build solutions using Azure Blobs and base their design on tests that showed it was exceeding the throughput targets that were committed to at that time. When things would get throttled down to something closer to those commitments, it would break folks solutions. So when you design and build a solution, be sure to design based on the targets for the service, not observed behaviors.

Secondly, back-pressure from subscriber egress (this include failed deliver retries) can affect ingestion throughput. So while its good to know that Event Grid will retry failures, you’ll want to take steps to keep the amount of resource drain from retries to the minimum. This can mean ensuring that the subscriber(s) can handle the load, but also that when testing your solution, you properly simulate the impact of all the subscribers that you may have in your production environment.

And finally, and I would hope this goes without saying… pick the right solution for the right job. Our simulated load in this case represented an atypical use case for Event Grid. Grid, unlike Event Hub, is meant to deliver the items we need to act on.. the actionable events. The nature of my simulated traffic was more akin to a logging solution where I’m just tossing everything at it. In reality, I’d probably have a log stream, and only send certain events to the grid. So this load test certainly didn’t use it in the matter that was intended.

So there you have it. I managed to throw enough load at both the Event Hub and Event Grid to cause both solutions to throttle my traffic. With a few tweaks, my simple service could also possibly be used to create a nice little (if admittedly overly expensive) HTTP DDOS engine. But that’s not what is important here (even if it was kind of fun). What is important is that we learned a few things about how this service works and how to design solutions that don’t end up creating more problems then they solve.

Until next time!

Advertisements

Azure Event Grid

On January 30th, Azure launched a new service that, at least in Azure terms, represents a bit of a paradigm shift. This new service, the Azure Event Grid, is unlike anything else that exists in Azure, for one simple reason… its part of the “fabric” that is Azure. In Azure, you create a database, a virtual machine, a queue, etc… but there’s no creating a “grid”. The Azure Event Grid just exists. For all purposes, its just part of Azure.

This caused me a bit of confusion at first. I’ve never claimed to be the smartest guy in the room, but I think I’m fairly savy. So if I was confused by this at first, I figured there have to be others that might have similar challenges. So I wanted to put my own version of what this service is down in writing.

Overview

Azure Event Grid (we’ll just call it “the grid” for short, and because I’m a Tron fan), conceptually, can be thought of as a model for publishing, and consuming events using a pub/sub model. It has topics to which events are published, and subscribers that allow you to get those events with filtering so you only receive the events you want. Unlike queues and Azure’s Event Hub, grid subscriptions don’t need to be polled, instead they use a message pump approach to push events to a predefined endpoint (usually a web hook). And unlike Event Hub, its not a long term buffer. Instead it does it’s best to retry delivery periodically before the message is finally discarded. For a more detailed breakdown of the differences, be sure to look at the official documentation.

Topics come in two varieties, system topics and custom topics. System topics are created and managed by the various Azure Services. Azure storage, Azure Subscriptions, Events Grid already provides system topics and more are on the way. System topics are “just part of Azure”, so there’s no need to create them and you won’t see them in the portal. They are “just there” and managed by Azure for you to use if you so desire.  If you want to publish your own events, you can create a custom topic and you’ll be able to see and manage the custom topics as you would any other Azure resource. To loop back on the early message, the topic is an input endpoint, and the subscription are output endpoints, the grid is in the middle doing the plumbing between the two so you don’t have too.

To get the messages, you create subscriptions on the topics. A subscription provides the filtering criteria describing the events you want to receive and the endpoint you want events that meet those criteria to be sent too. Where you “see” subscriptions as Azure resource objects, depends on the type of topic they are associated with. If its a custom topic, you’ll see them listed as sub-resource objects of the custom topic they belong too. For system topics, you’ll see subscribers as sub-resources of the system resources you’re subscribing too (sort of like queues within a Service Bus namespace).

One last little catch here. System topics, even though they are part of Azure, are not system wide. For example, you currently can’t subscribe to all events from every storage account within a subscription. You instead have to subscribe to each storage account you want messages on. This means individual subscriptions on each storage account. Fortunately, you can have each of those subscriptions sending events to the same endpoint, giving you consolidated processing. But you do have to manage the individual subscriptions.

System topics do have another important difference from custom topics. A system topic will have an established list of events types, and their schema. These published schemas are there to help you better determine the proper filtering when creating subscriptions on system defined topics. This doesn’t exist for custom topics because you, as you’d expect, have full control over the events that get published into it.

Publishing Events

Like I said, system topics publish their own events and you can’t publish to those same topics. So if you want to have your own events, you will need to create a custom topic and send the events to that. This is done using a simple HTTP operation. In c#, it xould look as follows:

 HttpClient client = new HttpClient();
 client.DefaultRequestHeaders.Add("aeg-sas-key", sasKey); // add in our security header

 HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Post, topicEndpoint) // create our request object
 {
     Content = new StringContent(JsonConvert.SerializeObject(eventList), Encoding.UTF8, "application/json")
 };
 HttpResponseMessage response = client.SendAsync(request).GetAwaiter().GetResult(); // post the events

sasKey is the security token for the custom topic. eventList is a generic List object that contains a collection of our event objects for serialization. That custom object should look something like this.

 public class GridEvent where T : class
 {
     public string Id { get; set; }
     public string Subject { get; set; }
     public string EventType { get; set; }
     public T Data { get; set; }
     public DateTime EventTime { get; set; }
 }

Of course you can do follow the approach above using the language of your choice. I just provided an example in C# as its my primary language. If you are using C#, you may opt to use the .NET SDK instead. If so, it looks a bit like this.

<pre>// create a list object for the events that will be send
List eventList = new List();

// loop through adding events to the list
foreach (var message in eventHubMessages)
{
    EventGridEvent myEvent = new EventGridEvent()
    {
        Id = Guid.NewGuid().ToString(),
        EventTime = DateTime.UtcNow,
        EventType = $"{eventType}",
        Subject = $"{eventMessage}",
        Data = $"{eventData}",
        DataVersion = "1.0"
    };
    eventList.Add(myEvent);
}

// create the topic credential
TopicCredentials topicCredentials = new TopicCredentials(sasKey);

// Create the client object and publish/send to topic
EventGridClient client = new EventGridClient(topicCredentials);
client.PublishEventsAsync($"{gridname}.{regionprefix}.eventgrid.azure.net", eventList).Wait();</pre>

In this case I’m using the SDK defined EventGridEvent class (naming brought to you by the Department of Redundancy Department). But the general approach is still the same, create a collection, put the event objects into the collection, then send them to the topic endpoint. The collection is used to help batch things together because http handshakes can be resource intensive so this helps reduce the number of handshakes when we’re sending events. You can also find this full SDK example on github.

Consuming Events

As we mentioned, a subscription defines what events you want to receive from a topic and the endpoint they should be sent too. When you create the subscription, you define these settings using a pre-defined schema. The schema represents a JSON object that describes the properties of the subscription. Things like…

includedEventTypes – an array of the event types you want to receive. If not specified, it defaults to all types. If one or more types is specified and you’re creating a subscription on a system topic, the values must match registered event types for that topic.

subjectBeginsWith – the value of the event’s Subject property starts with the specified value. What is contained in the Subject property will depend on the topic you are subscribing too.

subjectEndsWith – the value of the event’s Subject property ends with the specified value

subjectIsCaseSensitive – if the two subject matches should be case sensitive or not

Now the obvious question is why doesn’t Event Grid just support something like REGEX for matching on the Subject property (which is just a string value). I won’t speak for the product team on this but, but my theory is that when you’re building a solution like Azure Event Grid… one that commits to dealing with potentially tens of thousands of events and doing so with single digit millisecond latency… you have to make choices. REGEX, while simple to use, is a bit more complex under the covers and therefore more computationally intensive then a simple string match on the beginning or ending of a string. So I suspect they’ve done this  to maximize throughput while minimizing latency.

When you create the subscription, assuming you’re sending to an https/webhook endpoint, you’ll need to make sure the receiver is prepared to respond to the validation event. Validation of endpoints is important, because without it, Event Grid turns into a giant DDOS engine. The validation event is Event Grid’s way of ensuring that events are only being sent to endpoints that are prepared to receive them. When the subscription is created, it sends a special event to the endpoint and expects a specific response back in return. If its not received, the subscriber will be invalidated and events won’t be published. This helps prevent abuse by ensuring that all endpoints agree to receive events. Another important point here, unless you want to write your own validation code, when creating an Azure Function to be used as an endpoint, be sure to use the Event Grid trigger and not the more generic HTTP trigger. If you want an example of using the HTTP trigger, then I’d recommend this post by Brandon Hurlburt.

Now at this point I could show you code for processing the event once its been delivered. But fortunately the product team has already done a really good job of documenting how to do this. And given that we’ve already covered how to wire things up, I don’t know that I can add anything meaningful.

Event – End of Line

So that’s really it. Event Grid, because its part of Azure, is incredibly easy to use. Its really just a matter of saying “I want to subscribe to these events, and here’s where to send them”.  This makes it a great example of serverless approach because the wiring is taken care of for you so you can focus on just processing the event itself. That said, this service is still very new, so the number of Azure system topics is somewhat limited. But the team behind the Event Grid is working to change this and is also great about listening to suggestions on how to improve the service. If you have problems using the event grid and need some advice, post on StackOverflow with the tag “azure-eventgrid”. If you have a suggestion about the service, something you think is needed or could be improved, then make use of the official Azure Feedback forum.

Lastly, I’d like to thank Bahram, Dan R, and the rest of the Service Bus team for the great work they did bringing Event Grid to market. Also a shout out to Cicil Phillip for his post on sending to event grid with http and C#.

Until next time!

PS – yes, this post was written while listening the Tron Legacy soundtrack on repeat. Please don’t judge. 🙂

Azure Logic Apps, Functions, and Service Bus

Here we are yet again. Me writing something on this blog if for no other reason then to document something I learned. There’s no real narrative behind this one other than I built another POC for a partner and in the process found some things I wanted to pull together.

The story here is about digging beyond the Logic App designer and interacting with Service Bus queues, topics, and event hubs. Access and manipulating the message properties as we start chaining Logic App workflows together with functions and custom code.

Since we’re going beyond what the designer currently supports, we’ll look exclusively at the “code view” for everything.

Sending to a Queue

The first step was to create a workflow that would accept an HTTP request and use that to create a message in an Azure Service Bus queue. In doing this ‘simple’ task I learned two things, how to compose an object and how to set a custom message property.

The compose action allows you to construct a JSON object from various inputs. I wanted to be able to send a message to a queue for further processing as well as to event hub for logging. So being able to compose the object once and reuse it was VERY handy.

 
"ComposeJobMsg": {
   "inputs": {
       "JobID": "@{body('SaveJobtoDatabase')?['OutputParameters']['JobID']}",
       "customer": "@{triggerBody()?['customer']}",
       "job_payload": "@triggerBody()?['job_payload']",
       "job_type": "@{triggerBody()?['job_type']}"
   },
   "runAfter": {
       "SaveJobtoDatabase": [
         "Succeeded"
       ]
   },
   "type": "Compose"
}  

This action takes input from the workflow trigger and the result of a previous stored procedure, SaveJobToDatabase, and constructs a simple JSON object with four properties (with horribly inconsistent naming conventions I know).

With the message object created, I can now send it to a queue, specifying the output of the compose operation as the ContentData for my queue message. The SendToQueue action’s body looks like this:

"body": {
   "ContentData": "@{encodeBase64(string(outputs('ComposeJobMsg')))}",
   "ContentType": "JSON",
   "Properties": {
       "job_type": "@{triggerBody()?['job_type']}"
   }
}

There are a few things going on here I want to point out. We’re taking the output of the compose action, outputs(‘ComposeJobMsg’), and converting it from a JSON object to a string. We then base64 encode that string to ensure it will survive transport through the queue. We’re also starting the ContentData value with ‘@{‘ to designate that we’re using a parameter value and we want to treat it as a string. Using ‘{‘ to inform the Logic App that its a string is unnecessary, but sometimes its nice to err on the side of caution. You can learn more about the use of expressions like ‘@‘ and ‘{‘ in the Workflow Definition Language documentation.

Next up, we make sure to set the ContentType as “JSON”.  And finally I add a custom property, “job_type” and set its value to parameter that was on the workflow trigger (again treating that value as a string).

Queue as a Trigger

This is where things started to get interesting. I created a second workflow that is triggered “when a message is received’ and set it to run at 30 second intervals. But this created a problem with trying to update the workflow. Currently (this is something that’s being worked on), the Logic Apps connector takes advantage of Service Bus Queues’ long polling capabilities. Long polling is great because it helps reduce the latency between when a message arrives and it can be processed. So even though the workflow was set to check every 30 seconds… when a message want sent to the queue, it triggered the workflow almost immediately.

The reason for this is that the workflow is not actually running at 30 second intervals, but instead starts polling the queue and waits for that to time out, then it’ll wait 30 seconds and poll again. Where this creates an issue is that if you are in active development, you’re likely going to be changing the workflow every few minutes. Tweak this… run a test… adjust that, run a test. When the workflow start’s polling, its going to wait about 10 minutes for that operation to time out. So even though the “save” works fine, any changes you’re making won’t take affect until the next time the workflow is triggered (after the long-polling times out).

The recommendation I was given, that worked really great (thanks Jeff), is to change the  interval to something like once a day. Then via the portal, we can use the “run trigger” feature to kick off a one-time run of the workflow. So what I would do is I’d modify the workflow, submit a test message to the queue, then manually trigger it. Admittedly, its not as smooth as I’d like, but it gets the job done. The product team seems aware of this so I’m hopeful this workaround won’t be needed for the long term. Once development is complete, the “production” version of the workflow can use a normal timing setting, as long as we’re aware of and OK with the long polling behavior.

Accessing Queue Message Properties

I wanted to be able to consume events both via a Logic App workflow, as well as from some C# code. The workflow portion would look at the job_type property I set above, and use that in a condition to control routing of the message to another queue. If you’re using the drag/drop workflow designer, its pretty easy to get at the queue message ContentData. If you click on it in the designer, the code behind will insert something like this:

triggerBody()['ContentData']

Something was pointed out to me (thanks again Jeff!) and the lighbulb went off.  Note that ContentData is the exact same property we set when we sent the message to the queue up above. So if we wanted to access the job_type value we set, we simply access the Properties collection like so:

triggerBody()?['Properties']['job_type']

You can’t currently do this via the designer, so you’ll need to flip over to code view if you want to access the individual properties within the Properties collection.

But what if we want at the actual payload of ContentData? The JSON object is there, we just have to reverse what we did when we put it into the message. We’ll use a couple Workflow Definition Language functions undo the base64 encoding and get the string content. That string is JSON, so we use the json method to convert it to an object. Once its an object again, we can then access any properties within it, such as the JobID we set when we composed the original object.

@json(base64toString(triggerBody()['ContentData']))['JobID']

In C#, if we want to get at the contents of our object, we do a similar process to get the body of the BrokeredMessage object, and transform that JSON payload into an object.

// get the message body
var body = message.GetBody<Stream>();
string jsonJob = new StreamReader(body, true).ReadToEnd();

// convert message body to object
dynamic job = JsonConvert.DeserializeObject(jsonJob);

What about Event Hub?

There isn’t a connector for Event Hub (at least as of the authoring of this post). So I created an Azure function (code on github) to do this for me. It accepted a few parameters and put it into the event hub so I could later process them via stream analytics. Calling it from the workflow was then pretty straight forward.

"LogToEventHub": {
   "inputs": {
       "body": {
           "JobID": "@{json(base64toString(triggerBody()['ContentData']))['JobID']}",
           "customer": "@{json(base64toString(triggerBody()['ContentData']))['customer']}",
           "job_payload": "@string(outputs('ComposeLogMsg'))",
           "status": "routing"
       },
       "function": {
           "id": "<insert your function reference here>"
       }
   },
   "runAfter": {
       "ComposeLogMsg": [
           "Succeeded"
       ]
   },
   "type": "Function"
}

Just like when we were sending content to the queue, make sure you know what format the objects should go into event hub should be in. My function is called via HTTP, so parameters of the body need to be strings. I opted to use compose to create the payload, then convert that to a string to be output to the event hub. Make sure you know what you’re passing and how it needs to be done as forgetting the proper ‘{‘ or ‘@’ can cause a real headache.

One final Gotcha, WebJobs

Now this one is a truly personal note. When you add Azure Function to a resource group, its currently a valid target for a VSTS web publish. In fact, if you look at the Function in the portal, you can click an option to view the hosting Web App. Once in that web app, you could see any web jobs you may or may not have accidentally deployed to the wrong location (yeah, it happened). I’ve been told that this will eventually be disabled. But in the interim, I wanted to share this little tidbit so nobody else wastes a late night hour trying to figure out why event messages are being consumed when the web jobs that were processing it all are all stopped (or so you thought).

Lesson learned. 🙂

Until next time!

A JavaScript based Windows 10 UWP & Notification Hub

Here we are, back at the blog again as promised. Admittedly, this is a week or so later then I had planned, but… best laid plans and of mice and men and all that. I’ll blame three straight weeks of travelling to visit with partners for the missed target. But all that aside, I am back to share something I’ve been thinking about blogging on for over a year. Namely Azure’s Notification Hub.

Notification Hub has bounced around a bit. First it was under the Web Apps banner, then finally landed under the Service Bus brand. Regardless of which area it falls under, I think it’s a highly underappreciated service. Perhaps this will help change that perception in some small way.

Overview

In short, Notification hub is a way to centralize your notifications across multiple platforms. Its provides you with a single API interface from which you can send notifications to a myriad of platforms (iOS, Android, Kindle, Windows, etc…). In doing this, the service handles negotiations with the platform specific APIs and also helps deal with the various throttling limits that are in place for each.

As if this weren’t enough, Event Hub (EH) also provides several higher level functions.

Device Registry – a single location that you can use to store details and metadata about individual devices. The service monitor for and purge expired device registrations. Saving you from being kicked off the notification service of various providers.

Templates – the capability to define templates that can be used to map a single message to different platforms and notification message formats.

Tags – meta-data attributes that can be used to help you target the recipient(s) of a given message. Allowing you to target a single device, a group of devices, or all devices as you see fit.

Each of these topics could be grounds for its own post or possibly even a series of posts. But today I’m interested in one specific topic, using NH to send notifications to a Windows 10 UWP (Universal Windows Platform) application written in HTML5/JS.

To get this all to work, we have to perform the following steps:

  1. Define the application in the Windows Store to and configure the Windows Notification Service (WNS)
  2. Configure the Notification Hub to talk with the application’s WNS endpoint.
  3. Have the app discover its “channel uri” which is used by the Windows Notification Service (WNS) to send notifications to the app.
  4. Using that URI, ask the Notification Hub to a unique device ID
  5. Using the device ID, register the device with the Notification Hub

So let’s get to it…

The Windows Store, WNS, and Notification Hub

To do notifications, we need to start by setting up WNS, and this requires us to declare our application at Windows Dev Center: https://dev.windows.com/

After logging in, click on “Dashboard” then select “Create a new app”. You’ll need to provide a name for your app (that must be unique) so it can be reserved for your use. With the app created we then click on “Services->Push notifications”. The app should already be configured for push notifications by default, but we need to get the credentials so we can configure the Notification Hub.

In the first section on the Windows Push Notification Services, locate the “Live Service site” link in the second paragraph and click it. This routes you over to the app settings for the new app. We want to capture and record two values here, the Package SID and Client secret as shown below…

DevStore

 

Note: the values contained above are not valid. So please don’t even try. :)

Now we’ll switch over to the Azure Management Portal (classic) and create a new Notification Hub. Select the “+” at the bottom and do App Services -> Service Bus -> Notification Hub -> Quick Create. Provide a hub name, where its located, and a unique namespace for it.

CreateHub

After a few minutes (even faster if you’re re-using an existing namespace), the hub will be created and ready for configuration. We’ll click on its “configure” tab and enter in the Package SID and Client Secret from above.

win1-hubConfigureHub

Save this change, and switch back to the dashboard for the Notification Hub, and get its connection information for the DefaultFullSharedAccessSignature.

HubShareAccessSig

This use of the “full” permissions, can be a security issue. But I’ll take more about that in the next section.

With these steps done, we’ve declared a store application, created a Windows Notification Service endpoint for it, and associated the WNS with our Notification Hub. Its start to start building the app.

The Solution

Last time I worked with Notification Hub, was in September of 2014 with a Windows 8.1 app written in Angular. Since that time I’ve been actively trying to change my development habits and learn more JavaScript. So when another partner approach me about their HTML5/JavaScript application, I wanted to make another go at getting Notification Hub to work, this time with JavaScript. The challenge I soon found out was that we (Microsoft), still don’t have many materials available for building UWPs with JavaScript.

I want to fix that. To that end I’m creating a GitHub repository where I’ll be putting my explorations in creating UWPs and hopefully others will either fork this effort and perhaps even contribute back to it. I’ll start with what I hope is a fairly solid Notification Hub example.

I started by creating a Visual Studio 2015 solution that included a JavaScript Blank Universal Windows App, and a C# based ASP.NET Web API. Yeah, I know I said I’m trying to learn more about JavaScript, so I could have built the API in Node, but one battle at a time. I’m putting both in the same solution for convenience since I can then debug both side by side in the same VS instance.

The reason I created two projects was simple, security. I could have let the application directly interact with the Notification Hub api. However, this would have required me to put the shared access key for the hub into the app itself. But by placing it into a server side API, it helps protect the key from being pulled out of the app and put into malicious use.

The REST API

With the solution in place, I started by building an API side first. I’ll create a single API controller that will have two methods. There will be a POST method that accepts a channel URI and returns a Notification Hub Device ID, and a PUT method that takes the Device ID and other values and registers it for notifications. You could put both these into a single operation. But there may be times where you just want to update the registration (say adding/updating a tag). So I prefer this two step process.

We start by creating a model that will be used by the controller. This class will be used when the controller is activated to help initialize a Notification Hub client. The class looks like this..

    public class Notifications
    {
        public static Notifications Instance = new Notifications();

        public NotificationHubClient Hub { get; set; }

        private Notifications()
        {
            //get properties that have NH connection info
            string nhConnString = Properties.Settings.Default.NotificationHubConnectionString;
            string nhName = Properties.Settings.Default.NotificationHubName;

            // create a NH client 
            Hub = NotificationHubClient.CreateClientFromConnectionString(nhConnString, nhName);
        }
    }

You’ll note that this code depends on two settings. Right-click on the project and add those now. The ConnectionString value we captured when we set up the notification hub. And the hub name we specified when we set up the hub. For this code to compile, we also have to make sure we use nuget to add the Microsoft Azure Notification Hub package to our project and a couple of using clauses so that everything resolves properly.

We’ll then add the controller that will contain our Notification Hub registration API. we start by adding a private NotificationHubClient variable and then populate this variable using the model we created earlier.

    public class RegisterController : ApiController
    {
        private NotificationHubClient hub;

        public RegisterController()
        {
            hub = Notifications.Instance.Hub;
        }
    }

For the post method, I start by define the request object. I create an object that is my request payload, having a single parameter that is the “channel URI”.

        public class regGetRegistrationId
        {
            public string channeluri { get; set; }
        }

With this, we can define the POST api method. This method will accept the channel URI (aka handle), and try to find an existing device ID for this URI in the Notification Hub device registry. If we find a match, we’ll return its device ID, and it not, we’ll create a new one. When checking existing devices, we do this in a loop to help make sure we don’t have multiple registrations using the same channel URI (which could result in the app getting the same notification multiple times).

        [HttpPost]
        [Route("api/register/")]
        public async Task&lt;HttpResponseMessage&gt; Post([FromBody]regGetRegistrationId request)
        {
            string newRegistrationId = null;
            string channelUri = request.channeluri;

            //todo: validate the URI is for notify.windows.com domain

            // make sure there are no existing registrations for the channel URI provided
            if (channelUri != null)
            {
                var registrations = await hub.GetRegistrationsByChannelAsync(channelUri, 100);

                foreach (RegistrationDescription registration in registrations)
                {
                    if (newRegistrationId == null)
                    {
                        newRegistrationId = registration.RegistrationId;
                    }
                    else
                    {
                        await hub.DeleteRegistrationAsync(registration);
                    }
                }
            }

            if (newRegistrationId == null)
                newRegistrationId = await hub.CreateRegistrationIdAsync();

            return Request.CreateResponse(HttpStatusCode.OK, newRegistrationId);
        }

Moving on to the PUT method for actually registering the device the app is running on, I again start by declaring my request payload/contract. This one has the device ID from the previous call, the platform we want to register with, the handle (in WNS, this is the Channel URI), and a set of tags that can be used for targeting notifications.

        public class DeviceRegistration
        {
            public string deviceid { get; set; } // the device ID
            public string platform { get; set; } // what platform, WNS, APNS, etc... 
            public string handle { get; set; } // callback handle for the associated notification service
            public string[] tags { get; set; } // tags to be used in targetting notifications
        }

Finally, we have the API method itself. It takes the payload, reformats the details into what’s needed for the Notification Hub SDK, and performs the registration. If the device is already registered, this would update/overlay the registration with new values.

        [HttpPut]
        [Route("api/register/")]
        public async Task&lt;HttpResponseMessage&gt; Put([FromBody]DeviceRegistration deviceUpdate)
        {
            RegistrationDescription registration = null;
            switch (deviceUpdate.platform.ToLower())
            {
                case "mpns":
                    registration = new MpnsRegistrationDescription(deviceUpdate.handle);
                    break;
                case "wns":
                    registration = new WindowsRegistrationDescription(deviceUpdate.handle);
                    break;
                case "apns":
                    registration = new AppleRegistrationDescription(deviceUpdate.handle);
                    break;
                case "gcm":
                    registration = new GcmRegistrationDescription(deviceUpdate.handle);
                    break;
                default:
                    throw new HttpResponseException(HttpStatusCode.BadRequest);
            }

            registration.RegistrationId = deviceUpdate.deviceid;
            registration.Tags = new HashSet&lt;string&gt;(deviceUpdate.tags);
            // optionally you could supplement/override the list of client supplied tags here
            // tags will help you target specific devices or groups of devices depending on your needs

            try
            {
                await hub.CreateOrUpdateRegistrationAsync(registration);
            }
            catch (MessagingException e)
            {
                //ReturnGoneIfHubResponseIsGone(e);
            }

            return Request.CreateResponse(HttpStatusCode.OK);
        }

In this example, we’re taking whatever tags were handed to our API method. In reality, the API may handle this type of update. Perhaps designating specific tags based off of information about the user of the app, or settings the user has defined in the app itself.

At this point, you could use a tool like Fiddler to test the API directly. You can monitor the notification hub dashboard in the Azure management portal to make sure operations are succeeding. There’s a bit of delay from when actions show up on the dashboard after being performed (seems like 5-10 minutes for the graph across the top, but an hour or so for the device registration count across the bottom). So don’t expect immediate results if you’re doing this type of testing. I’d suggest just firing off a couple REST requests to ensure your code doesn’t throw any obvious examples and then get back to coding up the rest of the app.

The client side JavaScript code

With the API in place, we can start working on the JavaScript side of things. The bulk of the code can be found in a stand-alone object I created called notification.js. This object has a single method named registration. Its a bit long, so we’ll look at it step by step.

First up, we need to look to see if the app was run previously and saved its channel URI.

        // check and see if we have a saved ChannelURI
        var applicationData = Windows.Storage.ApplicationData.current;
        var localSettings = applicationData.localSettings;

        var savedChannelURI = localSettings.values["WNSChannelURI"];
        //savedChannelURI = "re-register"; // uncomment this line to force re-registration every time the app runs

Note the last line, if uncomment it can over-write whatever the saved value was. A channel URI can change and usually won’t last more than 30 days. So the recommendation is that you get it, and save it, and only re-register with the Notification hub if it changes. If you un-comment this line of code, you can run the app over and over again while testing/debugging your code. Just make sure you remove it after you’re done.

Next, we’re going to get a new channel URI by using the Windows UWP API.

        // get current channel URI for notifications
        var pushNotifications = Windows.Networking.PushNotifications;
        var channelOperation = pushNotifications.PushNotificationChannelManager.createPushNotificationChannelForApplicationAsync();

        // get current channel URI and check against saved URI
        channelOperation.then(function (newChannel) {
            return newChannel.uri;
        }).then(function (currentChannelURI) {
            // do stuff here!
        }).done();

This code sets up a Push Notification client and returns a promise for when the operation is complete, This returns the URI which is then passed to the “then” promise for additional operation. Its inside that promise that the real work is done.

We start by checking to see if the channel URI we just received is any different then the one we have saved.

    if (!savedChannelURI || savedChannelURI.toLowerCase() != currentChannelURI.toLowerCase()) {

And if not, we’ll start making our calls to our rest API, starting with the call to get a device ID

                // get a Notification Hub registration ID via the API
                WinJS.xhr({
                    type: "post",
                    url: "http://localhost:7521/api/register",
                    headers: { "Content-type": "application/x-www-form-urlencoded" },
                    responseType: "text",
                    data: "channeluri=" + currentChannelURI.toLowerCase()
                }).then(function (getIdSuccess) {

If this completes successful, inside of its “then” function, we set up the parameters for the call to the second API, passing it the device/registration ID we received back.

                    // strip the double quotes off the string, we don't want those
                    var deviceId = getIdSuccess.responseText.replace(/['"]/g, '');
                        console.log("Device ID is: " + deviceId);

                    // create object for notification hub device registration
                    // tag values used are arbitrary and could be supplemented by any
                    // assigned on the server side
                    var registrationpayload = {
                        "deviceid"  : deviceId,
                        "platform"  : "wns",
                        "handle"    : currentChannelURI,
                        "tags"      : ["tag1", "tag2"]
                    };

Note that I’m stripping of the quotes that came at the beginning and end of my deviceId. I then use it and other values to construct the remainder of the PUT request payload. Which I can now call…

                    // update the registration
                    WinJS.xhr({
                        type: "put",
                        url: "http://localhost:7521/api/register/",
                        headers: { "Content-type": "application/json" },
                        data: JSON.stringify(registrationpayload)
                    }).then(
                        function (registerSuccess) {
                            console.log("Device successfully registered");

                            // save/update channel URI for next app launch
                            localSettings.values["WNSChannelURI"] = currentChannelURI;
                        },
                        function (error) {
                            console.log(JSON.parse(error.responseText));
                        }
                    ).done();

Its inside the “then” for this second call that I save the new URI for the next time the app launches. It should also be noted that I’m not really doing any error handling in these samples and that in a production quality app, you really should have a few retries in this (since it will fail if there’s no network connection).

With this object and its method declared, we can now wire it up to the application. I put it in its own JavaScript file so it could be easily reused by another project, so we’ll want to add it to the default.html page of our new app.

    &lt;!-- UWP___JavaScript references --&gt;
    &lt;link href="/css/default.css" rel="stylesheet" /&gt;
    &lt;!-- TAG: #notificationhubjs --&gt;
    &lt;script src="/js/notifications.js"&gt;&lt;/script&gt; &lt;!-- Notification Hub Integration --&gt;
    &lt;script src="/js/default.js"&gt;&lt;/script&gt;

Note hose this was inserted before the default.js file. This helps ensure that the object is available to default.js which is the entry point for my application. Inside the default.js we’ll add the code to access our new method.

    // TODO: This application has been newly launched. Initialize your application here.

    // TAG: #notificationhubjs			    
    uwpNotifications.registerChannelURI(); // register the app for notifications

By placing this in the block of the default.js after the “newly launched” I’ll ensure this code is called when the app is launched, but not each time the app is resumed. Which is my intent. Now all that remains is to associate my app with the app I created in the store.

Store Association & testing.

Fortunately, since we’ve already registered our application with the store, visual studio makes this step really easy. Right-click on the UWP application in the solution explorer, and select Store -> Associate App with the Store.

storeassociation

You’ll get walked through a wizard that will log you into the store and let you select an existing app, or create a new one. Select the app you created above, and associate it with your app.

Note: If you are using a public repository such as GitHub, don’t check in your code after making the association. By associating the project with the store, you are altering the application manifest and adding in a file that contains store specific details. I recommend doing this final stage/testing in a separate branch to allow you to more easily merge back changes without bringing over those details.

With this change done, we’re can run the app and test our notifications. Start by first launching the Web API in debug mode. I found it was easiest to set that project to my start-up project. Next, run the UWP application in debug mode. You’ll find you can set breakpoints in both apps and they’ll get hit. Which makes doing basic debugging of the code a lot easier.

Once you know the app has launched and appears to have registered with the Notification Hub properly, we can go back to the Windows Management portal and use the notification hub to test our application. Be sure to set the platform to “Windows” and the notification type to “Toast”.

notificationtest

If we’ve done everything correctly, you should receive a toast notification. What’s even better, is that if you stop debugging, you can still send the notification. This is one of the things that makes notifications awesome. Your app doesn’t even need to be running for you to still send notifications to it. And with Windows 10, you can even allow the user to take actions on those notifications. But that’s likely a subject left for another day.

Conclusion

So hopefully this has helped stitch a few topics together in a way that’s helpful. As I mentioned earlier, I’ve put all this code up on GitHub for you to borrow and reuse if you see fit. Over time I hope to grow that project, so if you want to skip to the parts covered in this blog, just clone the repository, and look for the tag #notificationhubjs.

Until next time!

 

Azure’s new Event Hub

Over the past few months, I had the good fortune to be accepted to present at ThatConference in Wisconsin and CloudDevelop in Ohio. I count myself even more fortunate because at the time I submitted my session for both these events, it was about a new Azure solution that hadn’t even been announced yet, the Event Hub.

Whenever possible, I like to put demos into a real world context. For this one, I reached out to two colleagues that were presenting at ThatConference and collectively we came up with the idea to do a conference attendee tracking solution. For my part of this series, I was going to cover using Event Hub to ingest event messages from various sources (social media, mobile apps, and proximity sensors) and feeding those into the hub. I also wrote some code so that the other sessions could consume the messages.

Event Hub vs. Topics/Queues

The first question to get out of the way is that Event Hub is NOT just a new variation on Topics/Queues. For this, I’ve found a simple visual example works best.

This is topics/queues

This is Event Hub

 

The key differentiator between the two is scale. A courier can pick up a selection of packages, and ensure they are delivered. But if you need to move hundreds of thousands of packages, you can do that with a lot of couriers, or you could build a distribution center capable of handling that kind of volume more quickly. Event Hub is that distribution center. But it’s been built as a managed service so you don’t have to build your own expensive facility. You can just leverage the one we’ve built for you.

In Service Bus, topics and queues are about the transportation and delivery of a specific payload (the brokered message) from point A to point B. These come with specific features (guaranteed delivery, visibility controls, etc…) that also limit the scale at which a single queue can operate. Service Bus was built to solve the challenges of scaled ingestion of messages, but did so with the trade-off of these types of atomic operations. The easiest way to think of Event Hub is as a giant buffer into which you place messages, and they are automatically retained for a given period of time. You then have the ability to read those messages much as you would read a file stream from disk. You can even rewind all the way back to the beginning of the stream and process everything again.

And as you might expect given the different focus of the two solutions, the programming models are also different. So it’s also important to understand that switching from one to the other isn’t simply a matter of switching the SDK.

What is the Event Hub?

If you think back to Topics/Queues, you had the option of enabling partitions via the EnablePartioning property. This would cause the topic or queue to switch from a single event hub broker (the service side edge compute node), to 16 brokers, increasing the total throughput of the queue by 16 times. We call this approach, partitioning. And this is exactly what Event Hub does.

When you create an Event Hub, you determine the number of partitions that you want (from 16, the default, up to 1024). This allows you to scale out the processes that need to consume events. Partitions are also used to help direct messages. When you send an event to the hub, you can assign a partition key which is in turn hashed by the Event hub brokers so that it lands in a given partition. This hash ensures that as long as the same partition key is used, the events will be placed into the same partition and in turn will be picked up by the same reciever. If you fail to specify a partition, the events will be distributed randomly.

When it comes to throughput, this isn’t the end of the story. Event Hubs also have “throughput units”. By default you start with a single throughput unit that allows 1mb/s in and 2mb/s out through your hubs. You can request this to get scaled up to 1000 throughput units. When you purchase a throughput unit, you do this at the namespace level since it applies to all your event hubs in that namespace.

So what we have is a service that can scale to handle massive ingestion of events, combined with a huge buffer just in case the back end, which also features scalable consumption, can’t keep up with the rate in which messages are being sent. This gives us scale on multiple facets, as a managed, hosted service.

So about that presentation…

So the next obvious question is, “how does it work?” This is where my demos came in. I wanted to show using event hug to consume events from multiple sources: a social media feed, a mobile app used by vendors to scan attendee badges, and proximity sensors scattered around the conference to help measure session attendance.

I started by realizing that when I consume event, I needed to know what type they were (aka how to deserialize them). To make this easy, I started by defining my own customer, .NET message types. I selected twitter for the media feed and for the messages, the type class declaration looks like this:

So we have who tweeted, the text of the message, and when it was created. I decorated the class with various data attributes to aid in serialization.

When a tweet is found, we’ll need a client to send the event…

This creates an EventHubClient object, using a connection string from the configuration file, and a configuration setting that defines the name of the hub I want to send to.

Next up, we need to create my event object, and serialize it.

I opted to use Newtonsoft’s .NET JSON serializer. It was already brought in by the Event Hub nuget package. JSON is lighter weight then XML, and since Event Hub is based on total throughput, not # of messages, it made sense to keep the payloads as small as was convenient.

Finally, I have to actually send the message:

I create an instance of the EventData object using the serialized event, and assign a partition key to it. Furthermore, I also add a custom property to it that my event processors can then use to determine how to deserialize the event. Finally, I call the EventHubClient method, Send, handing my event as a property. The default way for the .NET client to do all this is to use AMQP 1.0 under the covers. But you can also do this via REST from just about any language you want. Here’s an example using Javascript…

This example comes from another part of the presentation’s demo, where I use a web page with imbedded javascript to simulate the vendor mobile device app for scanning attendee badges. In this case, the Partition key is in the URI and is set to “vendor”. While I’m still sending a JSON payload, this one uses a UTF-8 encoding instead of Unicode. Another reason it could be important that we have an event type when we’re consuming events.

Now you’ll recall I explained that the partition key is used to help distribute the events so that we end up with a fairly even distribution amoung the consuming processes. So why would I select to bind each of my examples to a single partition? In my case, I knew that volumes would be low, so there wasn’t much of an issue with overloading my consuming processes. But you can also use this approach if you want to ensure that the same consuming process always gets the events from the same source. Something that can be really handy if the consuming process is using the events to maintain an in-memory state model of some type.

So what about consuming the events?

Events are consumed via “consumer groups”. Each group can track its position within the overall event hub ‘stream’ separately, allowing for parallel consumption of events. A default group is created when the event hub is created, but we can create our own. Consuming processes in turn create receivers, which connect to the various partitions to actually consume the events. This would normally require you to code up some rather complicated logic to ensure that if the process that owns a given set of receivers becomes unavailable, another process can pick up the slack. Fortunately, the event hub team thought of this already and created another nuget package called the EventProcessorHost.

Simply put, this is a handy, .NET based approach to handle resiliency and fault tolerance of event consumers/recievers. It uses Azure Storage blobs to track which receivers are attached to a given partition in an event hub. If you add or remove consuming processes, it will redistribute the receivers accordingly. I used this approach for my presentation to create a simple console app that displays the events coming into the hub. There’s really just three parts of the solution: the program itself, a receiver class, and an event processor class.

The console program is the simplest bit of code…

We use the namespace manager to create a consumer group if the one we want doesn’t already exist. Then we instantiate a Receiver object, and tell that object to start processing events, distributing threads across the various partitions in the event hub. The nuget package comes with its own Receiver class, so there’s not much you really need to do. The core of the receiver is in the MessageProcessingWithPartitionDistribution method.

You’ll note that this may actually be a bit different then the version that arrives with the nuget package. This is because I’ve modified it to use a consumer group name I specify, instead of just the default name. Otherwise, it’s the same example. I get the Azure Storage connection string (where the blobs that will control our leases will go), and then uses that to create an EventProcessorHost object. We then tell the host to start doing asynchronous event processing (via RegisterEventProcessorAsync). This registration, actually points to our third class, which implements the IEventProcessor interface. Again a template is provided as part of the nuget package, so we don’t have to write the entire thing ourselves. But if you look at this ProcessEventsAsync method, we see the heart of it…

What’s happening behinds the scenes is that a thread is being spun up for each partition on the Event Hub. This thread then uses a blob lease to take ownership of a partition, then attached a receiver and begins consuming the events. Each time it pulls events (by default, it will pull 10 at a time), the method I show above is called. This method just loops through the collection of events, and every minute will tell the EventProcessorHost to checkpoint (record were we’re at in the stream) our progress. Inside of the foreach loop is the code that looks at the events, deserializes appropriately, and displays then on the programs console.

You can see we’re checking the events “Type” property, and then deseralizing it back into an object with the proper type of encoding. It’s a simple example, but I think drives the point home.

We can see some of what’s going on under the covers of the processor by looking at the blob storage account we’ve associated with our processor. First up, the EventProcessor creates a container in the storage account that is named the same as our event hub (so if you have multiple hubs with the same name in different namespaces, be sure to use different storage accounts). Within that container is a blob named “evenhub.info” which contains a JSON payload that describes the container and the hub.

{“EventHubPath”:”defaulthub”,”CreatedAt”:”2014-10-16T20:45:16.36″,”PartitionCount”:16}

This shows the location of the hub, when this container/file was created, and the number of partitions in the hub. Getting the number of partitions is why you must use a connection string or SAS for the hub that has manager permissions. Also within this container is one blob (zero indexed) for each partition in the hub. These blobs also contain JSON payloads.

{“PartitionId”:”0″,”Owner”:”singleworker”,”Token”:”642cbd55-6723-47ca-85ee-401bf9b2bbea”,”Epoch”:1,”Offset”:”-1″}

We have the partition this file is for, the owner (aka the EventProcessorHost name we gave this), A token (presumably for the lease), an Epoch (not sure what this is for YET), and an Offset. This last value is the position we’re at in the stream. When you call the CheckPointAsync method of our SimpleEventProcessor, this will update the value of the offset so we don’t read old message again.

Now if we spin up two copies of this application, after a minute or so, you’d see the two change ownership of various partitions. Messages start appearing in both and providing you’re spreading your messages over enough partitions, you’ll even be able to see the partition keys at work as different clients will get messages from specific partitions.

Ok, but what about the presentations?

Now when I started this post, I mentioned that there was a presentation and set of demos to go along with this. I’ve upload both for you to take away and use as you see fit. So enjoy!

Annotated Event Hub PowerPoint Presentation Deck && Event Hub Visual Studio 2013 Demo Solution (contains 3 demos and 5 projects)

Until next time!

Shared Access Signatures with Azure Service Bus

Sorry for the silence folks. Microsoft’s fiscal year end was over June 31st and I started on a new team on July 1st. While July and August are usually great periods for folks like me to get some extra blogging done, I’ve had a few distractions that kept me from writing. Namely learning, learning, learning and trying to find my “voice” on the new team.

Going forward, I’m going to start having more of a focus on this “internet of things” fad that everyone’s talking about. And within that, I’m going to be sticking fairly close to home and working on the services side of things. Even more tightly focused, I’m going to go deeper on “build” vs “buy” scenarios and focus a goodly amount of my available time on one of the Azure product collections I’ve often felt didn’t get enough respect… Service Bus.

So in the coming weeks expect to see blog posts on Event Hub, possibly Hybrid Connections, and for starters, setting a few things straight about Service Bus in general.

SAS vs. ACS

So my first starting point is to call out a ‘stealth update’ that happened recently. As of sometime on/after August 21st 2014, if you create a new Service Bus root namespace via the Azure Management Portal, it will no longer include the associated Access Control Service namespace. This change is in following with recommended guidance the product team has been saying for some time. Namely to use SAS over ACS whenever possible.

Note: You can still generate the associated ACS namespace automatically by using the new-azuresbnamespace powershell cmdlet. Just be aware that a new version of this will be coming soon that will match the default behavior of the management portal. When the new cmdlet is available, you will need to append the “-useAcs true” parameter to the command if you still want to create the ACS namespace.

There are a few reasons for this guidance. The first is that according to the data the team had available to them, many folks doing ACS authentication were using the “owner” credential. This identity had FULL control over the namespace. So using it was akin to having an app use the SA (system administrator) identity for a SQL database. Secondly, ACS requires two calls for the first time operation: one to get the ACS token, one to perform the requested service bus operation. Now the token had a default time to life of 3 hours, but some SDKs didn’t cache the token and as a result all operations against the SB were generating two calls which increases the latency of the operation significantly. As if these weren’t enough, ACS only supports about 40 authentications per second. So if your SDK didn’t cache the token, your possible throughput on a single queue goes from somewhere near 2,000 messages a second down to 40 at best.

Now ACS has some benefits to be sure. In general, folks are much more familiar with username/password models then shared access signatures. You could create identities for specific publishers/consumers (within reason), as well as scope those identities and their permissions to specific paths. Allowing a single identity to send/receive from multiple queues for example. It also had the ACS management namespace with a GUI to help manage things. And to shut down access, all one has to do is revoke the identity and access is cut off.But many of these needs can also be met using Shared Access Signatures if one knows how. And that is what I’d like to start helping you with in this post. J

Shared Access Policies/Rules & Connection Information

Ok, first issue… If you use the management portal, you’ll see the ability to create/manage Shared Access Policies, but in the SDK and API, these are referred to as a SharedAccessAuthorizationRule. For the sake of simplicity, and consistency with Azure storage, I’ll refer to this from now on simply as “policies” (which matches the Azure Storage naming).

In Service Bus terms, a policy (aka SharedAccessAuthorizationRule) is a named set of permissions associated with an entity. The entity can be the Service Bus’ root namespace (the name you gave it when it was created), a queue, a topic, or an event hub. Basically an addressable endpoint that has a name assigned to it. For each entity you can have up to twelve policies and each policy is allowed a mix of the same three permissions: manage, send, and listen. Each policy also has two access keys much like Azure Storage and for the same reason. So you can do key swaps periodically and ensure you always have at least one active, valid key available to your applications when an old one is regenerated.

It’s these policies that are the “Connection Information” you access via the portal and see available as SAS connection strings. And it’s the connection strings that lead me to a bit of an issue I have with how many service bus demos are done.

Service Bus Clients

When you create your first service bus project using the .NET SDK and one of the tutorials, you’ll likely be asked at some point to add code that looks like the following:

// Create EventHubClient
QueueClient client = QueueClient.Create(“vendor-queue2”);

// insert the message body into the request
BrokeredMessage message = new BrokeredMessage(“hello world!”);

// execute the request and get the response 
client.Send(message);

Notice that the sample specifies a queue name that we want a client for, but no credentials. That’s because within the SDK, this method is overloaded to look for an application configuration setting by the name of “Microsoft.ServiceBus.ConnectionString”. The value of this string is the SAS connection string you can get from the portal. It gives the application access to the entity until such time as the policy/rule is removed. In other words, you can re-write this code to look like this:

// Create EventHubClient
QueueClient client = QueueClient.CreateFromConnectionString("Endpoint=sb://<namespace>/;SharedAccessKeyName=<SharedAccessRuleName>;SharedAccessKey=<RuleKey>""vendor-queue2");

// insert the message body into the request
BrokeredMessage message = new BrokeredMessage("hello world!");

// execute the request and get the response
client.Send(message);

By using CreateFromConnectionString in place of the simple Create, we can specify our own connection string. But again, this is permanent access until the policy/rule is removed. It also highlights the issue I have with the way the available samples/demos work I mentioned earlier. I bemoaned the use of the “owner” credential when doing ACS. Lets look at the default connection string that the Service Bus Nuget package puts into the application configuration file:

    <add key="Microsoft.ServiceBus.ConnectionString" value="Endpoint=sb://[your namespace].servicebus.windows.net;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=[your secret]" />

This sample refers to a policy/rule named “RootManageSharedAccessKey”. When you create a new namespace, this default policy has been created for you automatically with permission to listen, send, AND MANAGE the entire namespace. Please! For the love of all that is digital, NEVER… EVER use this credential for anything other than an application that needs to manage all aspects of a given namespace! Instead, configure your own policies with just the permissions that are needed. If you put the default policy into all your apps, we’re right back to the “owner” credential situation we had with ACS.

There’s another issue with this approach. Rules/policies must be associated with a specific service bus entity. This is where SAS comes into play.

Our First SAS

From the root namespace, entities in service bus are accessible by a path that lies directly under that namespace. Together with the root namespace, this can be represented by a path. Like so…

brentsample-ns.servicebus.windows.net/my-queue
brentsample-ns.servicebus.windows.net/my-eventhub
brentsample-ns.servicebus.windows.net/your-topic

Now I could create a policy at the root namespace that has “send” permission. But using it as a connection string would give the sender access to send to everything in the namespace. Alternatively, I could create individual policies/rules on each of these entities. But then I need to juggle all those different connection strings.

If we opt to use a SAS, we have a simpler way to help restrict access, but also make management a touch easier by creating signatures that allow access to a partial path, namely something like entities that begin with “my-“. Unfortunately, the management portal does yet provide the ability to create a SAS. So we either need to write a bit of .NET code, or leverage 3rd party utilities. Fortunately the code is pretty simple. Using Visual Studio, you can create a new Console Application and then add the Nuget package for Azure Service Bus. Then all that remains is to populate some variables and use these two lines of code to generate our signature.

var serviceUri = ServiceBusEnvironment.CreateServiceUri("https", sbNamespace, sbPath).ToString().Trim('/');
string generatedSaS = SharedAccessSignatureTokenProvider.GetSharedAccessSignature(sbPolicy, sbKey, serviceUri, expiry);

The important variables in here are:

sbNamespace – the name of the service bus namespace. Don’t include the “.servicebus.windows.net” stuff. Just the name when we created it.
sbPath – the name, or partial name of the entities we want to access. For this example, let’s pretend its “my-”
sbPolicy – this is the rule/policy that has the permissions we want to the signature to include
sbKey – one of the two secret keys of the rule/policy we’re referencing
expiry – a date/time of when the signature should expire.

If we fill these in, we get a signature that looks something like:

SharedAccessSignature sr=https%3a%2f%2fbmssample-ns.servicebus.windows.net%2fmy-&sig=B9cy8OEuxum2UN5VjsC4JPhbVU7jwJi%2bq20qiaXk24s%3d&se=64953912194&skn=Publish

Now that we have this signature, we want to be able to use it to interact with one of our entities. There’s no “CreateFromSAS” option, but fortunately in the .NET SDK we can use this signature together with a MessagingFactory to create our entity client.

MessagingFactorySettings mfSettings = new MessagingFactorySettings();
mfSettings.TransportType = TransportType.NetMessaging;
mfSettings.TokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider("<signature we just created>");
MessagingFactory mf = MessagingFactory.Create("sb://thatconference-ns.servicebus.windows.net", mfSettings);

// Create Client
QueueClient client = mf.CreateQueueClient(queueName);

And in this case, the same signature will work for the queue ‘my-queue’, or the event hub ‘my-eventhub’ (although for the later the TransportType needs to be Amqp). We can even take this same signature, and put into into javascript for use perhaps in a NodeJs application…

var xmlHttpRequest = new XMLHttpRequest();
xmlHttpRequest.open('POST''https://thatconference-ns.servicebus.windows.net/my-eventhub/publishers/vendorA-DeviceIDnnn/messages'true);
xmlHttpRequest.setRequestHeader('Content-Type''application/atom+xml;type=entry;charset=utf-8');
xmlHttpRequest.setRequestHeader('Authorization''<SharedAccessSignature>');

In the case of event hub, were we’ll have various publishers, we can do exactly the same but using a rule/policy from the event hub and generate a signature that goes deeper like “my-eventhub/publishers/vendorA-“.

Policy Expiry and Renewal

So here’s the $50,000 question. With ACS I could (at least to a certain scale), have a single identity for each client and remove them at any time. With SAS, I can remove a signature by removing its underlying policy at any time. But since I’m limited to twelve policies. How to I recreate the ability to revoke on demand. Simply put, you don’t.

Now before you throw your hands up and state with exasperation that this approach “simply won’t work”, I do ask you to take a long hard look at this requirement. In my experience, I find that the vast majority of the time, allowing someone to publish to an entity is a matter of trust. You’ve entered into an agreement with them and aren’t likely to revoke that access on a whim. The nature of this trust is rarely volatile/short term in nature. If it’s a customer, they are usually engaged for the term of their subscription (monthly or annual are common). So you know when your trust will expire and need to be renewed.

In this situation, were are planning for an eventuality that rarely comes to pass. And one that has an alternative that requires just a small amount of work, implementing your own “credential broker”.

If we look back at what the ACS did, you present it with credentials, and it issued you a token. You then used that token to access the appropriate service bus resources. That’s exactly what our credential broker would do. Periodically, the requesting client would call to a service you are hosting and present some credentials (username/pass, certificate, PSK, etc…). Your broker service validates this request (and likely logs it), then issues back a SAS ‘token’ with an appropriate expiry. The client then (hopefully) caches the SAS ‘token’, and uses it on subsequent requests until it expires and then goes back to the broker to get a new SAS ‘token’.

Now if this isn’t  enough, we still have the ability to remove/disable the queue (or event hub). So in a way we get the best of both worlds. We have automatic expiry of tokens to ensure “key rotation” while also having the ability revoke access immediately.

This is just one possible pattern. So instead of offering up alternatives, I’d love to hear from any of you via the comments on the patterns you have used to help manage shared access signatures.

Expiration Reached

In the coming weeks/months I’m going to generate a series of posts that will dive into various Service Bus related topics more deeply. If there’s something specific you’d like to see, please let me know. Otherwise you can look forward to posts on access service bus from other languages/sdks, programmatic management of namespaces/rules, and resilient architectures around Service Bus.

I hope this article has helped clear up some of the fog around the Azure Service Bus. So until next time!

Azure File Depot

It’s been a little while since my last update. BUILD 2014 has come and gone and my group, DPE (Developer Platform Evangelism), has been re-branded to DX (Developer Experience). Admittedly a name is a name, but this change comes at a time when my role is beginning to transform. For the last 18 months, my primary task has been to help organizations be successful in adopting the Azure platform. Going forward, that will still remain significant, but now we’re also tasked with helping demonstrate our technology and our skills through the creation of intellectual property (frameworks, code samples, etc…) that will assist customers. So less PowerPoint (YEAH!), more coding (DOUBLE YEAH!).

To that end, I’ve opted to tackle a fairly straightforward task that’s come up often. Namely the need to move files from one place to another. It’s come up at least 3-4 times over the last few months so it seems like a good first project under our changing direction. To that end, I’d like to present you to my first open sourced effort, the Azure File Depot.

What is it?

In short, the Azure File Depot is an effort to provide sample implementations of various tasks related to using blob storage to move files around. It contains a series of simple classes that help demonstrate common tasks (such as creating a Shared Access Signature for a blob) and business challenges (the automated publication of files from on-premises to Azure and/or an Azure hosted VM).

Over time, it’s my hope that we may attract a community around this and evolve this little project into a true framework. But in the meantime, it’s a place for me to do what I do best and that’s take learnings and share them. At least I hope I’m good at it since it’s a significant portion of my job. J

What isn’t it?

I need to stress that this project isn’t intended to be a production ready framework for solving common problems. The goal here is to create a collection of reference implementations that address some common challenges. While we do want to demonstrate solid coding practices, there will be places where the samples take less than elegant approaches.

I’m fully aware that in many cases there may be better implementations out there, in some cases perhaps even off the shelf solutions that are production ready. That’s not what this project is after. I have many good friends in Microsoft’s engineering teams that I know will groan and sigh at the code in this project. I only ask that you be kind and keep in mind, this is an educational effort and essentially one big collection of code snippets.

So what’s in it currently?

What we’ve included in this first release is the common ask I referred to above. The need to take files generated on-premises and push them to Azure. Either letting them land in blob storage, or have them eventually land in an Azure hosted VM.

Here’s a diagram that outlines the scenario.

FileDepotDiagram

Step 1 – File placed in a folder on the network

The scenario begins with a file being created by some process and saved to the file system. This location could be a network file share or just as easily could be on the same server as the process itself.

Step 2 – The Location is monitored for new files

That location is in turn monitored by a “Publication Service”. Our reference implementation uses the c# FileSystemwatcher class which allows the application to be receive notification of file change events from Windows.

Step 3 – Publication service detects file and uploads to blob storage

When the creation of a new file raises an event in the application, the publishing app waits to get an exclusive lock on the file (making sure nothing is still writing to the file), then uploads it to a blob container.

Step 4 – Notification message with SAS for blob is published

After the blob is uploaded, the publication service then generates a shared access signature and publishes a message to a “Messages” Service Bus topic so that interested processes can be alerted that there’s a new file to be downloaded.

Step 5 – Subscribers receive message

Processes that want to subscribe to these notifications create subscriptions on that topic so they can receive the alerts.

Step 6 – Download blob and save to local disk

The subscribing process then use the shared access signature to then download the blob, placing it in the local file system.

Now this process could be used to push files from any location (cloud or on-premises) to any possible receiver. I like the example because it demonstrates a few key points of cloud architecture:

  • Use of messaging to create temporal decoupling and load leveling
  • Shared Access Signatures to grant temporary access to secure, private blob storage for potentially insecure/anonymous clients
  • Use of Service Bus Topics to implement pub/sub message model

So in this one example, we have a usable implementation pattern (I’ve provided all the basic helper classes as well as sample implementations of both console and Windows Services applications). We also have a few reusable code snippets (create a blob SAS, interact with Service Bus Topics, upload/download files to/from block blobs.

At least one customer I’m working with today will find these samples helpful. I hope others will as well.

What’s next?

I have a few things I plan to do this with this in the near term. Namely make the samples a bit more robust: error handling, logging, and maybe even *gasp* unit tests! I also want to add in support for larger files by showing how to implement this with page blobs (which are cheaper if you’re using less than 500TB of storage).

We may also explore using this to do not just new file publication, but perhaps updates as well as adding some useful metadata properties to the blobs and messages.

I also want to look at including more based scenarios. In fact, if you read this, and have a scenario, you can fork the project and send us a pull request. In fact, you’re wondering how to do something that you think could fit into this project, please drop me a line.

That’s all the time I have for today. Please look over the project as we’d love your feedback.