Network Isolation/Security with Azure Service Fabric

There are times you really need to take things beyond the “file new” experience and implement a more advanced scenario. And with these opportunities, there are times you realize that what you need likely isn’t a “one off” kind of thing. There are larger implications to what you need that can help solve a myriad of problems. This is the story of one of these scenarios.

I was recently working with a partner as they explored Service Fabric. They liked what they saw, but there was a “but” (there almost always is). This partner is in the government space, and one of the requirements they had is that all public facing services are isolated and secured from any “back end” services (in a DMZ). If you’ve been doing IT for any length of time, this shouldn’t come as news. But the question they had for me was how to do this with Service Fabric.

There were a couple of ways to address this that immediately came to mind. We could deploy the front end web application as an Azure Web App, hosted in an App Service Environment joined to the same VNet as the Service Fabric cluster. We could also set up two Service Fabric clusters, again joined by a single VNet. The issue with both of these is that the front and back ends of the solution would need to be deployed and managed separately. Not a huge deal admittedly, but it did complicate the provisioning and deployment processes a bit, and it also seemed to run counter to the idea of a Service Fabric “application” being composed of multiple services deployed as a single entity. I was fortunate that I had previously engaged my friend and colleague Kal to bring his considerable Service Fabric experience into play with this partner, and he suggested a third option, one we all found fairly intriguing.

A Service Fabric cluster has Node Types, each of which maps directly to a VM Scale Set. Taking advantage of this, we could place different node types into different subnets and put Network Security Groups (NSGs) on the subnets to provide the level of isolation the partner required. We would then use Placement Constraints to ensure that the services within an application are only hosted in the proper subnet, using constraints specific to the node type, or types, in that subnet.

We ran the idea by Mark Fussell, the lead Program Manager of the Service Fabric team. As we talked, we realized that folks had secured a cluster from all external access before, but there didn’t appear to be a public, previously documented version of what we were proposing. Mark was supportive of the idea, and even offered up that in some of the “larger” Service Fabric clusters, the placement constraint approach has been used to ensure that the services that make up the Service Fabric cluster itself remain isolated from those that comprise the applications deployed within it.

Our mission clear, I set to work! We were going to create an Azure Resource Manager (ARM) template to create our “DMZ’d Service Fabric cluster”.

Network Topology 

The first step was to create the overall network topology.

[Figure: network topology showing the front end, back end, and management subnets]

We have a front end subnet with a public load balancer that handles traffic from the internet. There is a back end subnet with an internal load balancer that does not allow any connections from outside of the virtual network (it uses a private IP). Finally, we have a management subnet that contains the cluster services, including the web portal (on port 19080) and the TCP client API (port 19000). For good measure, we’re also going to toss an RDP jump box into this subnet so if something goes wrong with any of the nodes in the cluster, we can remote in and troubleshoot (something I used the heck out of while crafting this template).

With this in place, we then define the VM Scale Sets, and bind their network configurations to the proper subnets as follows:

"networkInterfaceConfigurations": [ 
  { 
    "name": "[variables('nodesMgmnt')['nicName']]", 
    "properties": { 
      "ipConfigurations": [ 
        { 
          "name": "[concat(variables('nodesMgmnt')['nicName'],'-',0)]", 
          "properties": { 
            "loadBalancerBackendAddressPools": [ 
              { 
                "id": "[variables('lbMgmnt')['PoolID']]"
              } 
            ], 
            "subnet": { 
              "id": "[variables('subnetManagement')['Ref']]"
            } 
          } 
        } 
      ], 
      "primary": true
    } 
  } 
]

With the VM Scale Sets in place, we then moved on to the Service Fabric cluster to define each Node Type. Here’s the cluster node type definition for the management subnet node type.

   
{ 
  "name": "[variables('nodesMgmnt')['TypeName']]", 
  "applicationPorts": { 
    "endPort": "[variables('svcFabCluster')['applicationEndPort']]", 
    "startPort": "[variables('svcFabCluster')['applicationStartPort']]"
  }, 
  "clientConnectionEndpointPort": "[variables('svcFabCluster')['tcpGatewayPort']]", 
  "durabilityLevel": "Bronze", 
  "ephemeralPorts": { 
    "endPort": "[variables('svcFabCluster')['ephemeralEndPort']]", 
    "startPort": "[variables('svcFabCluster')['ephemeralStartPort']]"
  }, 
  "httpGatewayEndpointPort": "[variables('svcFabCluster')['httpGatewayPort']]", 
  "isPrimary": true, 
  "placementProperties": { 
    "isDMZ": "false"
  },            
  "vmInstanceCount": "[variables('nodesMgmnt')['capacity']]"
}    

The “name” of this Node Type must match the name of a VM Scale Set; that’s how the two get wired together. Since this sample is for our “management” node type, it would also be the only one with the isPrimary property set to true.

At this point, we debugged the template to ensure it was valid and that the cluster would come up “green”. The next (and harder) step was to start securing the cluster.

Note: If you create a cluster via the Azure portal with multiple node types, each node type will get its own subnet. However, we were after a reusable ARM template so we had to configure things ourselves.

Network Security

Unfortunately, when we set out to create this, there wasn’t much publicly available on the ports that are needed within a fabric cluster. So we had to do some guesswork, some heavy digging, and make a few wishes for good luck. In this section I’m hoping to lay out some of what we learned to save others the effort.

First off, we started by blocking all inbound connections on the three subnets. I then opened ports 19080 (used by the Service Fabric web portal) and 19000 (used by the Fabric Client and PowerShell) on the “management” subnet so I could interact with the cluster remotely. This was all done interactively via the Azure Portal, so we could test the rules out and then use Resource Explorer to export them to our template. We assumed that with these rules in place, we would see some of the nodes in the cluster go “red” or unhealthy. But we didn’t!

It took a day or so, but we eventually figured out that we were seeing two separate systems collide. First, when a VM is brought up, the Service Fabric extension is inserted into it. This extension then registers the node with the cluster, and as part of that process a series of connections is established. These connections are not ephemeral; they remain up for the life of the node. Our mistake was in assuming these connections were temporary and established only when needed, the way we encourage most of our partners to build their applications.

Since these are established, persistent connections, they are not impacted when new NSG rules are applied. This makes sense since the NSG rules are there to interrogate any new connection requests, not look over everything that’s already been established. So the nodes would remain green until we rebooted them (tearing down their connections) and they tried (and failed) to re-establish their connection to the cluster.

With this sorted out, we set about putting the remainder of the rules in place for the subnets. We knew we wanted internet connectivity to any application/service ports in the front end, and VNet-only connectivity to the application/service ports in the back end. But what we were missing were the ports that Service Fabric itself needed. We found most of these in the cluster manifest:

   
<Endpoints> 
  <ClientConnectionEndpoint Port="19000" /> 
  <LeaseDriverEndpoint Port="1026" /> 
  <ClusterConnectionEndpoint Port="1025" /> 
  <HttpGatewayEndpoint Port="19080" Protocol="http" /> 
  <ServiceConnectionEndpoint Port="1027" /> 
  <ApplicationEndpoints StartPort="20000" EndPort="30000" /> 
  <EphemeralEndpoints StartPort="49152" EndPort="65534" /> 
</Endpoints>    

This worked fine at first. We stood up the cluster with these rules properly in place and the nodes were all green. However, when we tried to deploy an app to the cluster, it would always time out during the copy step. I spent a couple of hours troubleshooting this one, eventually realizing that something inside the cluster was still blocked. I spent a bit of time looking at WireShark and Netstat runs inside the nodes to determine what could still be the blocker. This could have carried on for some time had it not been for Vaishnav Kidambi pointing out that Service Fabric uses SMB (TCP port 445) to copy the application/service packages around to the nodes in the cluster. We added a rule for that, and things started to work!

Note: As a result of this work, the Service Fabric product team has acknowledged that there’s a need for better documentation on the ports used by Service Fabric. So keep an eye out for additions to the official documentation.

Here’s what the final set of inbound rules for the Network Security Group (NSG) associated with the management subnet looked like.

[Figure: the final inbound NSG rules for the management subnet]

A quick rundown… I’ll start at the bottom of the list (rule 4000, the last rule to be evaluated, since NSG rules are processed in order of ascending priority number) and work my way up. Rule 4000 blocks all traffic into the subnet. Rules 3950 and 3960 enable RDP connections within the VNet, and to the RDP jump box (at internal IP 10.0.3.4) from the internet. The next three rules (3920-3940) allow the connections needed by Service Fabric within the VNet only (thus allowing all the Service Fabric agents on the nodes to communicate). And finally, the first two rules (3900 and 3910) open up external connections for ports 19080 and 19000. Rules 3960, 3900, and 3910 are unique to the management subnet. I’ll get to why 19000 and 19080 are unique to this subnet in a moment.
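
To make that concrete, here’s roughly what one of these rules looks like inside the NSG resource of the ARM template. This is a sketch rather than a copy from the template (the rule name and exact values are illustrative), and the SMB rule mentioned earlier looks the same with destination port 445 and a VNet-only source.

{
  "name": "AllowSvcFabPortal",
  "properties": {
    "description": "Allow the Service Fabric web portal from the internet",
    "priority": 3900,
    "direction": "Inbound",
    "access": "Allow",
    "protocol": "Tcp",
    "sourceAddressPrefix": "Internet",
    "sourcePortRange": "*",
    "destinationAddressPrefix": "*",
    "destinationPortRange": "19080"
  }
}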

Dynamic vs Static Ports

One sidebar for a moment. Connectivity between the front and back end is restricted to a set of ports you set when you run the template (it defaults to 80 and 443). In Service Fabric terms, these are static ports. When you build services you also have the option of asking the fabric for a port to use, a dynamic port. As of the writing of this article, the Azure load balancer does not support these dynamic ports. So to leverage them via the load balancer and our network isolation, we’d have to have a way to update both each time a port is allocated or released. Not ideal.

My thought is that most of the use of dynamic ports is likely going to be between services that have a trusted relationship. That relationship would likely result in the services being placed inside the same subnet. If you needed to expose something they were doing to the “outside world”, you would likely set up a gateway/façade service that in turn might be load balanced. It’s this gateway service that would be exposed on a static port so that it can easily be reached via a load balancer and secured with NSG rules.

Restricting Service Placement

With the network topology set and the security rules for each of the subnets sorted, next up was ensuring that application services get placed into the proper locations. Service Fabric services can be given placement constraints. These constraints, defined in the service manifest, are checked against the Placement Properties of each node type to determine which node types should host a service instance. These are commonly used for things like restricting services that require more memory to nodes that have more memory available, or for situations where specific types of hardware are required (a GPU for example).

Each node type gets a default placement property, NodeTypeName, which you can reference in a service manifest like so.

   
<ServiceTypes> 
  <!-- This is the name of your ServiceType. 
       This name must match the string used in RegisterServiceType call in Program.cs. -->
  <StatelessServiceType ServiceTypeName="Web2Type"> 
    <PlacementConstraints>(NodeTypeName==BackEnd)</PlacementConstraints> 
  </StatelessServiceType> 
</ServiceTypes>    

Now we may want to have other constraints beyond just NodeTypeName. Placement Properties can be assigned to the various Node Types in the cluster manifest. Or, if you’re doing this via an ARM template as I was, you can declare them directly in the template via a property within the NodeType definition/declaration.

   
"placementProperties": { 
  "isDMZ": "true" 
},

If you look at the node type definition I used earlier, you’ll see where this property collection goes. In that template, “isDMZ” is false.

Combined, the placement properties and the placement constraints help ensure that each of the services goes into the subnet that has already been configured to securely host it. But this does pose a challenge. If we declare the placement constraint in the service manifest as I show above, it restricts which clusters we can deploy the service to. If a cluster doesn’t have our placement properties declared, the service will fail to deploy. We could address this by removing and then adding the placement constraints later (not ideal) or altering the cluster manifests (again not ideal). But there are two other options. First, we could craft our own definition of the application/service types and register them with the cluster, then copy the packages to the cluster.

This article contains a section that talks about doing this via C# or PowerShell. Another option, and one I think I actually prefer (but admittedly haven’t tried), is to use a build event to alter the manifest. You can then trigger this event based on various parameters to control whether it happens for a local build vs. a cloud build, perhaps even going so far as reading a value from the Application Parameters or Publication Profile files. But for now, I’ll need to set these aside. There’s also a third option I’m investigating, but I’m not confident enough to bring it up yet. I hope to eventually circle back on these.

There is one other placement consideration (I mentioned I’d get to this). There are two things unique to the management node type/subnet. The first is that it’s the only subnet I would open ports 19000 and 19080 on. The reason is that this is the only node type in the cluster manifest that is marked as “isPrimary”. A Service Fabric cluster can only have one “primary” node type, and that node type is the one where all the “system” services will be placed (Naming, FileStore, Cluster Manager, etc…). So setting “isPrimary” ensures that these services will be placed into this subnet, allowing me to keep them separate from any application services. I previously mentioned that this approach was proposed by Mark Fussell of the Service Fabric team. It’s a pattern that’s used by some larger clusters to help ensure that fabric management resource demands can be scaled independently of application needs.

Between placement of the management services on the primary node type, and restricting application placement via constraints, we can now put each of our services only where we want them to be.

Using the JumpBox

A common technique in cloud solutions is to leverage a “jump box”. Allowing direct, remote access to a virtual machine is sensitive and risky. To help manage this risk, there’s usually one or more restricted access points that are used as gatekeepers. You access one of these gatekeepers as a jumping off point to reach resources inside the security boundary. We’ve set up this approach, allowing you to RDP into a jump box from which you would then RDP into the other boxes within the VNet.

Using this template, you’ll need to address all your VM instances via IP. Since we’re using dynamic IPs within the VNet, you can RDP into a box using a fairly simple address scheme. The third octet of the IP address represents the subnet you want to access (1=front end, 2=back end, 3=management) and the final octet is the specific machine. Azure reserves the first three addresses in each subnet for its own use, so the VMs in the front end or management subnets start at 4 (the first front end node, for example, is 10.0.1.4). For the back end subnet, I’ve used 10.0.2.4 as the private IP for the internal load balancer, so the nodes in that subnet start at 5.

The next step would be to adapt the “allowJumpBoxRDP” security rule on the management subnet so that it only allows connections from trusted sources (say, your on-premises network).

Many diet colas died to bring you this information

So there you have it. I’ll admit that on the surface it may not seem like much. But if you’ve ever built an ARM template, you know how much effort it requires. Add in all the stuff I had to learn/discover to get it to a functional state and validate it by deploying apps to it (which required more debugging and bug fixes) and… well… we’re talking quite a bit of effort. So I’m hoping that this article and the template will help a few folks avoid what I had to go through.

The entire template (complete with jump box) can be found in my GitHub repo. I’m going to continue to try and polish it, and I’m also looking at getting it published (with additional guidance on usage) in the Azure QuickStart Templates repository. So be sure to let me know of any suggestions or bugs you find. I’ll do my best to get them worked in.

Until next time!

PS – thank you to everyone who helped contribute to this effort: Kal, Jason, Corey, Patrick, Mike, Shenlong, Vaishnav, Chacko, and Mikkel

Extending Logic Apps with Azure Functions

Note: This article is based on services that were in preview at the time it was written. The user experience and any issues/challenges mentioned below may differ from what was eventually released for general availability.

I remember when Azure was only three services. Technically, it was four, but let’s not mince details. Today, there are dozens of services, each with a multitude of features. So it’s impossible to be an expert in them all. As a technology specialist, an architect, I try to understand the basics of most of them, but only really go deep in areas where I have a specific interest. But from time to time, the partners I work with have asks that require me to go into areas I would have otherwise skimmed over.

One such request was to help a partner solve a challenge they were facing with correlating transactions sent to them by an upstream solution provider they were working with. A simple enough request on the surface, but in this case my partner also had a short runway to implement it. As such, they wanted to avoid having to do a large amount of coding. They wanted to embrace the speed and flexibility that PaaS gives you.

They looked at Logic Apps, a spiritual successor to BizTalk orchestrations. Unfortunately, Logic Apps doesn’t currently have a “message box” type of functionality that would allow for the correlation of multiple messages. However, Logic Apps does have the ability to integrate with another service, Azure Functions, which allows us to extend the features of Logic Apps with custom code.

So I set about creating a simple proof of concept to prove out how something like this could work.

Overview of the solution

The workflow would happen in two parts. The first step would be to correlate the two inbound transactions we’d receive that together comprise a “complete order”. A second workflow would then be triggered to monitor an installation process that would occur.

For the first workflow, I wanted to keep it simple. My POC used two transactions of the same format, each containing a customer name and a product type. The workflow would receive them via a REST API call. Each transaction would then be written to an Azure SQL DB and the workflow would respond back to the requestor with a “success” response code (200). After this, the workflow would call out to an Azure Function to determine if the order was complete. If it was, we would drop a message into a queue that would trigger the second half of the workflow. If it wasn’t, we’d just end there and wait for the next transaction to come in. Here’s what the first workflow looks like in Azure Logic Apps.

[Figure: the first workflow in the Logic Apps designer]

The second part we’ll discuss later. But these simple steps would give me my “message box”.

Now, this article won’t cover getting started with Logic Apps or Functions. There’s enough written on these subjects already, IMHO, that I don’t feel I have anything new to add. So if you haven’t worked with these before, I highly recommend at least covering their “getting started” materials.

Triggering the workflow

Logic Apps gives you a multitude of “triggers” that can be used to start a workflow. For this proof of concept, I opted to trigger the workflow manually when an HTTP request is received. Since it’s just a POC, I didn’t bother to secure this endpoint. This also means I can easily test my workflow using a tool like Fiddler or Postman.

I do this using a “manual trigger”, specifically the “Request – When an HTTP request is received”. You’ll find this under the “Show Microsoft Managed APIs” list. I also keep the request body simple, just a string that is the “customer” identifier, and another that is the “product” identifier. In the Logic App, the request body schema looks like this:

{
    "properties": {
        "customer": {
            "type": "string"
        },
        "product": {
            "type": "string"
        }
    },
    "title": "Product Order",
    "type": "object"
}

So it’s a simple JSON object called “Product Order” that contains two string properties, “customer” and “product”. I don’t think it could get much simpler.
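
That means testing the trigger from Fiddler or Postman is just a matter of POSTing a body along these lines (the values here are made up):

{
    "customer": "Contoso",
    "product": "WidgetA"
}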

Saving the transaction to our “message box”

The first step in re-creating BizTalk’s functionality is having a place to store the messages. Logic Apps has the ability to perform actions against Azure SQL DB, so I opted to leverage this connector. I created a database and put a table in it that looked like this:

RowID – a GUID that has a default value of “newid()”

Customer – a string

Product – a string

Complete – a boolean value, defaults to false

There are several SQL connectors available in Logic Apps, including an “insert row” action. However, I struggled to get it to work. I wanted to use the defaults I had set up in the database for the RowID and Complete fields, and the insert row action told me I had to specify values for all the columns in the table. So I opted instead to create a stored procedure in the database and use the “Execute stored procedure” connector. The stored procedure accepts two parameters, the Customer and Product strings from the HTTP request body.
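
The stored procedure itself is trivial. Here’s a minimal sketch (the procedure name and column sizes are my own assumptions; the table and columns are the ones described above, with RowID and Complete falling back to their column defaults):

CREATE PROCEDURE InsertOrder
    @Customer nvarchar(50),
    @Product nvarchar(50)
AS
BEGIN
    -- RowID and Complete pick up their defaults (newid() and false)
    INSERT INTO Orders (Customer, Product)
    VALUES (@Customer, @Product);
END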

With the message safely saved in the database, we then respond back to the requestor letting them know we have received and processed it. This is done via the “Response” action. I could have done this immediately after the request was received, but I wanted to wait until after I had saved the message. This way I’m really indicating to the requestor that I’ve received AND processed their request.

Correlating the orders

With the orders now saved, I can begin the process of adding my custom business logic to correlate them. I created an Azure Function app and defined a new function named “CheckOrderComplete”. This is a C# based function triggered (again) by an HTTP request. I chose C# because the partner I’m working with does much of their work in C#, so it was a good fit. The HTTP trigger made sense since we’re already using HTTP operations for the workflow trigger; why not remain consistent? The objective of the function is to query the database and see if I have the two transactions needed for a “complete” order on my end.

Azure Functions provides a C# developer reference that was really helpful. However, it still took a bit of trial and error my first time. So I’m going to try and break down my full csx file into the various changes I made. The first change was that I needed to reference the assembly that would allow me to interact with the SQL DB where the requests had been stored.

   
#r "Newtonsoft.Json"
#r "System.Data"   

The Newtonsoft line was already there since I’m dealing with an HTTP request that needs to have its JSON payload translated into C#. But I had to reference the external System.Data assembly, so I added the second line to get the SQL client. The #r is like adding a reference in a Visual Studio project. This assembly is already available in the Azure Functions hosting environment (think of it as the NuGet package already being installed), so no other action was necessary to make it available for my use.

Next up, I had to add a “using” clause to the code so I could leverage the assembly I just referenced. This works just like it would in a Visual Studio project.

using System.Data.SqlClient;   

This function is called via an unsecured (it is only a POC after all) web hook. So the parameters are going to come to us as a JSON object that I need to deserialize and validate…

string jsonContent = await req.Content.ReadAsStringAsync();
dynamic data = JsonConvert.DeserializeObject(jsonContent);
  
if (data.customer == null || data.product == null) {
    return req.CreateResponse(HttpStatusCode.BadRequest, new {
        error = "Please pass customer/product properties in the input object"
    });
} 

And now that I know my customer and product parameters are both here, I can use those to check the database. This looks like any other SQL command I’d execute.

  
SqlConnection sqlConnection1 = new SqlConnection("{your connection string}");
SqlCommand cmd = new SqlCommand();

cmd.CommandText = "select count(Distinct Product) from Orders where Customer = '" + data.customer + "' AND complete = 0";  
cmd.CommandType = System.Data.CommandType.Text;
cmd.Connection = sqlConnection1;

sqlConnection1.Open();
  
int productCnt;  
int.TryParse(cmd.ExecuteScalar().ToString(), out productCnt);
log.Info("Product Count for Customer '" + data.customer + "' is " + productCnt.ToString());
  
sqlConnection1.Close();    

I took the easy route and hard-coded my connection string since this is a POC. The function runs in an App Service, so I could instead store the connection string in an application setting and read it from the environment. That would definitely be the preferred approach for production code.

With the SQL check complete, I have a count of the distinct products the customer has ordered that have not yet been completed. I’m expecting at least 2. So we can now check the count and respond to the caller with a success or failure.

if (productCnt > 1) {
     return req.CreateResponse(HttpStatusCode.OK, new {
         greeting = $"Customer Order is complete!"
     }); 
} else {
     return req.CreateResponse(HttpStatusCode.BadRequest, new {
         greeting = $"Order Incomplete!"
     }); 
}

This check is pretty basic, I’ll admit, but for my POC it does the job. It counts the incomplete orders for a given customer (yeah, embedded SQL… injection worries… I know) and succeeds when there are at least two distinct products outstanding. The real example would be far more robust and also (I hope) more secure. You could also let the function perform additional operations, such as marking both items complete. It’s up to you.
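
On the injection point: the obvious hardening step, which I didn’t bother with in the POC, is to swap the string concatenation for a SQL parameter. Something like this, as a sketch against the same command object used above:

// pass the customer value as a parameter instead of concatenating it into the SQL text
cmd.CommandText = "select count(Distinct Product) from Orders where Customer = @customer AND complete = 0";
cmd.Parameters.Add(new SqlParameter("@customer", (string)data.customer));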

With the function created (and tested), we can then connect it to the Logic App workflow. The two product teams have made this really simple. With that action in place, we then add a condition check using the status code returned from my function. If it’s equal to 200, we drop a message into a queue that another workflow will pick up to complete processing of the order.

[Figure: the function action and status code condition in the Logic App designer]

Long Running Workflows

I mentioned above that there would be a second workflow. This second part may take minutes, hours, or even days/weeks to complete. Having a single workflow run that long is problematic. It could lose state mid-process, among a host of other problems. So I wanted to avoid that.

The plan here is that when you send the final event message in our first workflow to, in this case, a Service Bus queue, you can set optional properties. One of these is ScheduledEnqueueTimeUtc, which allows you to send the message now but have it not become visible until perhaps an hour in the future. The second workflow would then be triggered by the receipt of this event message. That workflow can check to see if the process has been completed (perhaps using another Azure Function) and, when complete, drop a message into yet another queue to signal completion. If it’s not yet complete, it drops the message back into the queue with a scheduled enqueue time again set in the future.

This allows that workflow, which could take weeks, to ensure that its “state” is maintained, even if the workflow itself needs to restart.
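
If you were doing the same trick from code rather than from the Logic App’s Service Bus action, the idea looks roughly like this (a sketch using the older Microsoft.ServiceBus.Messaging client; the queue name and the one-hour delay are just placeholders):

using System;
using Microsoft.ServiceBus.Messaging;

var client = QueueClient.CreateFromConnectionString("{service bus connection string}", "order-monitor");

// Send the event now, but keep it invisible to receivers for an hour
// so the monitoring workflow doesn't pick it up until then.
var message = new BrokeredMessage("{order identifier}");
message.ScheduledEnqueueTimeUtc = DateTime.UtcNow.AddHours(1);

client.Send(message);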

Summary

So it may not seem like much. But I was pretty excited to find a way to accomplish what my partner was after in only 34 lines of code (once you remove all the wrapper stuff). And they were pleased with the end product.

Admittedly, as I write this, Logic Apps and Functions are both still in preview, and there are some rough edges. But there’s a significant amount of potential to be leveraged here, so I have little doubt that they will be cleaning those edges up. And should you want to build this yourself, I’ve added some of the code to my personal GitHub repository.

Enjoy, and until next time!

Multiple Windows with Windows 10 and JavaScript

So a little known fact about Universal Windows Platform (UWP) apps is that they can have more than a single window. At least it was little known to me until I was asked by a partner if there was a way to do it. As it turns out, there are actually two options. There’s the multiple window approach used by applications like the Mail app. There’s also the projection manager approach, where you intentionally want to display a window on a separate screen from the main window. These approaches have actually been around since Windows 8 and center on the use of the ApplicationView object.

In poking around, I found that there’s a solid collection of Windows 8.1 and Visual Studio 2013 samples for C++, C#, and even JavaScript. There was even an excellent blog post about implementing multiple windows/views with JavaScript, but it was again focused on Windows 8.1. I needed an example of this approach with Windows 10 and Visual Studio 2015, and unfortunately, the Windows 8 example wasn’t working. So I set about fixing it.

WinJS 4 and the missing MSAppView

It turns out when moving from Windows 8 to Windows 10, you’re also moving from WinJS 2 to WinJS 4. In WinJS 4, some of the more “Microsoft-ish” ways of doing things were dropped in favor of allowing JavaScript developers to leverage the way they’ve always done things. The primary change was the deprecation of the MSAppView object and its various functions. This object represented a Windows Store app’s window (app view) and gave us methods like close and postMessage, as well as a property that was the viewId of the window.

With this object removed, some of the functionality of the aforementioned Windows 8.1 sample was broken. So to get it working, I had to find a replacement, or better yet, build one.

myAppView

I started by crafting a replacement for the deprecated MSAppView, which I called, simply enough, myAppView, giving this object some of the same functions/properties as its predecessor.

    function myAppView(window) {
        this.viewId = MSApp.getViewId(window);
        this.window = window;
        this.postMessage = function (message, domain) {
            this.window.postMessage(message, domain);
        };
    }

With this in place, I also needed to implement two more missing functions, the first being MSApp.createNewView (the WinJS counterpart of CoreApplication.CreateNewView). Since these would be “global” functions, I opted to wrap them in a WinJS namespace called CustomAppView. Here’s the createNewView implementation.

    createNewView: function(page) {
        var newWindow = window.open(page, null, "msHideView=yes");
        //var newWindow = window.open(page);

        return new myAppView(newWindow);
    }

This instantiates a new window using the common JavaScript function window.open. However, what we’ve done is pass the value msHideView=yes in the optional window features parameter. This directive means the window exists, but isn’t yet visible. I searched the internet and for the life of me couldn’t find a single reference for this. Thankfully, I was able to track someone down inside of the team responsible for WinJS and they shared this little gem with me. Then, using the handle for this new and invisible window, I create an instance of the myAppView object. This object exposes the same functions/properties as its predecessor, allowing much of the sample code to remain in place.
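
For context (this piece isn’t shown in the snippets here and isn’t specific to the sample), a window created hidden this way typically gets shown later through the standard WinRT view switcher. Roughly, assuming newView is the object returned by CustomAppView.createNewView:

    // Ask the system to show the hidden view as a standalone window.
    var viewManagement = Windows.UI.ViewManagement;
    viewManagement.ApplicationViewSwitcher.tryShowAsStandaloneAsync(newView.viewId)
        .then(function (shown) {
            if (!shown) {
                // the system declined to show the view
            }
        });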

We also had to craft a replacement for MSApp.getViewOpener:

    getViewOpener: function () {
        var openerWindow = window.opener;
        var parentWindow = window.parent;
        return new myAppView(parentWindow);
    }

You can see the full implementation of the myAppView and the two functions on GitHub.

Out with the old, in with the new

In the existing sample, I was focused on getting “Scenario 1” working. This centered around the ViewManager object and functions for managing the view. The first of those was one to create a new view from a URL. In here, I had to replace the existing MSApp.createNewView with my new implementation.

    //var newView = MSApp.createNewView(page);
    //BMS
    var newView = CustomAppView.createNewView(page);

So we call the method we crafted above to create the window and return a copy of the myAppView object.

Next up, I had to replace MSApp.getViewOpener, which is used to help me keep track of who “owns” the current window and to assist in sending messages between the windows.

    //this.opener = MSApp.getViewOpener();
    //BMS
    this.opener = CustomAppView.getViewOpener();

I played a bit to see which works better, window.opener or window.parent, and found that parent gave me the best results. However, it’s important to note that according to the official documentation, this could return a window that “may not necessarily be the same window reported by the window.parent or window.opener properties”. But I found one that worked for my needs, so hopefully it will work for you also.

I’m still no expert

I’d like to stress, I know just enough JavaScript to continue to find it frustrating. The challenges I have with it are well known by my colleagues. However, I do appreciate opportunities like this that force me to dig into it and learn more. And this is one of those times where I hope my frustration will help save you some. I’ve posted the complete project up on GitHub so you can pull it and try it yourself. Just be forewarned that I’ve only really focused on Scenario 1, so the others may still have issues. :) I’ll try to keep working on this from time to time to make the sample a bit better (it really needs some inline comments).

Until next time!

Azure Automation And SQL DB

Author’s Note: The version of this script on GitHub was updated in May of 2016 to include the Azure AD tenant ID where the credential exists, as well as the subscription ID, to make sure the proper subscription is selected. This should help situations where there are multiple subscriptions. You will also need to include these variable values when setting up your job schedule.

What a difference two months makes. Two months ago, I helped an old friend out with a customer by working up a sample of using Azure Automation to resize an Azure SQL DB. At the time, importing modules wasn’t as seamless an experience as anyone would have liked. So I intended to blog about it, but simply didn’t get around to it.

Until today! With our recent addition of the Azure Automation module gallery, this has just gotten easier. Given that someone invested in helping build that functionality out, I figured I couldn’t slack off and not write about this any longer. So here we go…

Resizing Azure SQL DBs

First off, we need to address a bit of a misconception. While Azure SQL DB prices are stated in terms of “per month”, the reality is that you are billed “per hour” (see the FAQ). You can adjust the service tier, performance level, or eDTUs as often as you want. Within a given hour, the price you pay will be for the highest option you exercised during that hour (run at S2 for ten minutes and Basic for the rest of the hour, and that hour is billed at S2).

This means that if you either have predictable performance needs (highs or lows), or are actively monitoring your database utilization (see my previous blog post), you can tweak your database capacity accordingly.

To take advantage of this, we can use Azure Automation and some PowerShell.

Preparing the Database for resizing

Before I dive into setting up the automation itself, let’s discuss a few details. My approach will leverage Azure’s Role Based Access Control (RBAC) features. Manipulating the database will be done by an identity that I’m setting up specifically for Automation, and it will have access to the database(s) I want to manipulate.

Presuming you have access as an Administrator to the Azure AD instance that’s associated with your subscription’s management, locate Active Directory in the portal (search or browse) and select it. This should re-direct you to the old portal (at the time of this writing, Azure AD is not currently available in the new portal). Once there, add a new user.

[Figure: adding a new user in the classic portal’s Active Directory section]

You can use your own identity for this, but I’m much more a fan of setting up “service” identities for these things. If you don’t have administrative access to AD in your subscription (not uncommon if it’s federated with your on-premises domain), then you’ll need to work with whomever manages your Active Directory domain to get the service identity created.

With the identity created, we can now go to the Azure SQL Database server that’s hosting our database and add the service identity as a user with the “Owner” role.

[Figure: adding the service identity as an Owner on the SQL Database server]

This grants the service identity the ability to resize (among other things) all the databases on this server.

Setting up our Automation

Now we have to set up an Automation account in our Azure subscription. It’s this account that will be the container for all our automation tasks, their assets, and any schedules we have them operate under. In the new Azure portal, click on “+ New”, then select “Management”. Alternatively, you can enter “Automation” into the search box and follow that path.

[Figure: creating a new Automation account]

Give your Automation account a globally unique name, and make sure it’s associated with the proper subscription, resource group, and region (Automation is only available in a handful of regions currently).

With the account created, we’ll first add the service identity we created earlier to the Automation account as a Credential asset. A Credential asset will be used by our automation script to execute the commands to resize the database. Other assets include PowerShell modules, certificates, pre-set variables, etc… This also allows you to have multiple scripts that reuse the same Credential asset, without each having to be updated every time the credential changes (say for periodic password changes).

When I first did this a couple months ago, I had to manually add in another PowerShell module. But they’ve started importing more of them by default, so that’s no longer necessary (unless you are tied to specific modules that are not available by default or you need to tag a newer/older version of something that’s already included).

With the credential in place, we can now add the runbook. I’ve created one for you already, so just download that, and then in the Azure Automation account, import an existing runbook.

[Figure: importing the runbook]

Set the type to “PowerShell Workflow”, give it a meaningful name (one will likely be defaulted), and give it a description. The import should only take a few seconds, then you can select the new runbook. Once in, you can go into “edit” mode to either make changes or test it. I highly recommend a test run this way first. To execute a test, you’ll provide the parameters called out in the script…

CredentialAssetName – the name we gave the Credential asset we created

ResourceGroupName – my script works on the new Resource Manager based Azure SQL DBs, so we want to know what resource group the database is in

ServerName – the name of the database server (just the first part, leave out the ‘.database.windows.net’). This will need to be in all lower case.

DatabaseName – the name of the database to be resized

NewEdition – what is the new edition we’re resizing to (Basic, Standard, etc…)

NewPricingTier – what tier we are moving to (S1, S2, etc.; leave blank if the edition is Basic)
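
To give a sense of what the runbook does with these parameters, here’s a rough sketch of its core. This is not the script from the repo, just the general shape of the idea using the AzureRM cmdlets of the time (the May 2016 update mentioned in the author’s note also passes the tenant and subscription IDs when logging in):

workflow Resize-AzureSqlDatabase
{
    param
    (
        [string]$CredentialAssetName,
        [string]$ResourceGroupName,
        [string]$ServerName,
        [string]$DatabaseName,
        [string]$NewEdition,
        [string]$NewPricingTier
    )

    # Pull the service identity from the Automation credential asset and log in with it
    $credential = Get-AutomationPSCredential -Name $CredentialAssetName
    Add-AzureRmAccount -Credential $credential | Out-Null

    # Resize the database to the requested edition and performance level
    Set-AzureRmSqlDatabase -ResourceGroupName $ResourceGroupName `
        -ServerName $ServerName `
        -DatabaseName $DatabaseName `
        -Edition $NewEdition `
        -RequestedServiceObjectiveName $NewPricingTier
}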

Fill in each appropriately and “Start” the test. Depending on the size of your database, this could take some time, so go get something to drink, check email, etc…

Done? Good! Hopefully the test ran well, so let’s now publish our runbook. Publishing is important because it allows us to edit and test one version of the runbook while another is used by any scheduled occurrences.

[Figure: publishing the runbook]

With the publish complete, our final step is to schedule this to run. In my case, I know that each afternoon I want to set the database back down to “Basic” because I won’t be using it in the evening (it’s just a dev database). I’ve got the option of creating a new schedule or associating the job with an existing one. This allows multiple runbooks to be executed on a single schedule entry.

[Figure: linking the runbook to a schedule]

With the schedule set, we just need to set the parameters, much like we did when we tested the script.

[Figure: setting the runbook parameters for the scheduled job]

Click “OK” a couple of times, and our runbook will be all set to run on our schedule. We can also execute the runbook “on demand” by selecting it in the portal and clicking on “Start”.

Note: Once created, the portal does not currently let us edit the schedule. You’ll need to remove and re-add it. But you can turn schedules on/off.

Monitoring our Runbook

Each time the runbook is executed by the schedule, this is referred to as a job. Either in the Automation account or in the runbook itself, we can select “Jobs” and view any jobs that have run recently. This includes both the ones that were started “on demand” and those run via the scheduler. Here you can look at the output, as well as the logs that were written.

[Figure: viewing a job’s output and logs]

And there we have it!

So what’s next?

For me, next up is lunch. For you, hopefully this will work out of the box and help you save a few bucks here and there. I would also recommend you check out some of the scripts in the gallery and think of other interesting uses for Azure Automation.

Until next time!

PS – HUGE thanks to the team that has open sourced Live Writer. Works like a dream!

Azure SQL DB Usage Visualization with Power BI

It’s taken a week longer than I had hoped, but I’m finally ready for the next installment of my “blog more” effort. As many of you are aware, I’m not a data person. However, there’s no ignoring the fact that applications are either a conduit for, or a generator of, data. So it’s inevitable that I have to step out of my comfort zone from time to time. What’s important is to grab these opportunities and learn something from them.

The most recent example of this was when I had a partner that was faced with the deprecation of the Azure SQL DB Web/Business pricing tiers. Their solution is cost sensitive and when they looked at the Azure portal’s recommendations for the new tiers for their many databases, they had a bit of sticker shock.

With some help from folks like Guy Haycock of the Azure SQL team and a good starting blog post on SQL DB usage monitoring, the partner was able to determine that the high tier recommendations were due to their daily database backup processes. Since the new SQL tiers offer point-in-time restore, my partner didn’t need to do their own backups, and as a result their actual needs were much lower than what the portal suggested. Saving them A LOT of money.

To identify what they really needed, what the partner did was pull the SQL diagnostic view data down into Excel and then visualize the data there. But I thought there had to be an even easier way to not just render it once, but produce an ongoing dashboard. And to that end, I started looking at Power BI.

Which is what this post is about.

The Diagnostic Management Views

The first step is to understand the management views and how they can help us. The blog post I mentioned earlier refers to two separate management views: the “classic” sys.resource_stats view, which displays data for up to 14 days with 15 minute averages, and the new sys.dm_db_resource_stats view, which holds 1 hour of data with 15 second averages.

For the “classic” view, there are two schemas: the web/business version and the new schema. The new schema is very similar to the new sys.dm_db_resource_stats schema, which is what I’ll focus on for the examples in this blog post. The key difference is that resource_stats is located in the Azure SQL DB “master” database, while dm_db_resource_stats can be found in the individual databases.

There are four values available to us: CPU, data IO, log writes, and memory. Each is expressed as a percentage of what’s available given the database’s current tier. This allows us to query the data directly to get details:

SELECT * FROM dm_db_resource_stats

Or if we want to get a bit more complex, we can bundle values together per minute and chart out the min, max, and average per minute:

SELECT
    distinct convert(datetime,convert(char,end_time,100)) as Clock,
    max(avg_cpu_percent) as MaxCPU,
    min(avg_cpu_percent) as MinCPU,
    avg(avg_cpu_percent) as AvgCPU,
    max(avg_log_write_percent) as MaxLogWrite,
    min(avg_log_write_percent) as MinLogWrite,
    avg(avg_log_write_percent) as AvgLogWrite,
    max(avg_data_io_percent) as MaxIOPercent,
    min(avg_data_io_percent) as MinIOPercent,
    avg(avg_data_io_percent) as AvgIOPercent
FROM sys.dm_db_resource_stats
GROUP BY convert(datetime,convert(char,end_time,100))

Using the new dm_db_resource_stats view, my experiments consistently got back 256 rows (64 minutes), which is a nice small result set. This gives you a good “point in time” measure of a database’s resource usage. If you use the classic resource_stats view (which you query at the master DB level), you will get back a few more rows and will likely want to filter the result set on the appropriate database_name.

Since the values represent a percentage of the available resources, this lets us see how close to the database tier’s cap our database is operating, and thus when we may want to make adjustments.
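
For example, a small variation of the per-minute query above that only returns the minutes where CPU averaged above some threshold (80% here, purely as an illustration) makes it easy to spot when a database is pushing against its tier:

SELECT
    convert(datetime,convert(char,end_time,100)) as Clock,
    avg(avg_cpu_percent) as AvgCPU
FROM sys.dm_db_resource_stats
GROUP BY convert(datetime,convert(char,end_time,100))
HAVING avg(avg_cpu_percent) > 80
ORDER BY Clock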

Power BI visualization

With a result set figured out, we’re ready to start wiring up the Power BI visualizations. For simplicity, I’m going to use the online version of Power BI. It’s an online database, so why not an online reporting tool? Power BI online is available for free, so just head over and sign up. And while this is a sort of “getting started with” Power BI article, I’m not going to dive into everything about the portal. So if you’re new to it, you may need to poke around a bit or watch a couple of the tutorials to get started.

After logging into Power BI, start by getting connected to your database.

[Figure: the Power BI workspace, getting data]

Note: You’ll want to make sure that your SQL Database’s firewall is allowing connections from Azure services.

On the next page, select Azure SQL Database and click connect. Then fill in the server name (complete with the .database.windows.net part) and the database name. But before you click on next, select “Enable Advanced Options”.

[Figure: connecting to the Azure SQL Database with advanced options enabled]

This is important because Power BI online does not see the system/diagnostic views. So we’re going to use the queries we were working with above to get our data. Just paste the query into the “Custom Filters” box.

Another important item to point out before we continue is that since we’re using an Azure SQL DB, we can’t have a refresh interval of less than 15 minutes. If you need more frequent updates, you may need to use Power BI Desktop.

With this form completed, click on “Next” and provide the Username and Password for your database and click on “Sign in”. Since our query is fairly small, the import of the dataset should only take a few seconds and once done, we’ll arrive at a dashboard with our SQL database pinned to it. Click on the database tile and we get a new, blank report that we can start to populate.

Along the right side of our report, select the “Line Chart” visualization and then check AvgCPU, MaxCPU, and MinCPU in the field list. Then drag the Clock field to the Axis area. This setup should look something like this:

[Figure: configuring the line chart visualization]

If done properly, we should have a visualization that looks something like…

[Figure: the resulting CPU usage line chart]

From there you can change/resize the visualization and change various properties of it (colors, labels, title, background, etc…) by clicking on the paint brush icon below the visualizations. You can even add other visualizations to this same report by clicking away in the whitespace and adding them like so…

[Figure: a sample report with multiple visualizations]

It’s entirely up to you what values you want to display and how.

So there you have it

Not really much to it. The hardest part was figuring out how to connect Power BI online to the management view. I like that the Power BI desktop tool calls it a “query” and not a “filter”. Less confusion there.

If I had more time, it would be neat to look at creating a content sample pack for Azure SQL DB (creation of content packs is not self-service for the moment). But I have a couple of other irons in the fire, so perhaps I can circle back on that topic another day. :) So in the interim, perhaps have a look at the SQL Sentry content pack (note: this requires SQL Sentry’s Performance Advisor product, which has a free trial).

Until next time!

A JavaScript based Windows 10 UWP & Notification Hub

Here we are, back at the blog again as promised. Admittedly, this is a week or so later than I had planned, but… best laid plans of mice and men and all that. I’ll blame three straight weeks of travelling to visit with partners for the missed target. But all that aside, I am back to share something I’ve been thinking about blogging on for over a year: Azure’s Notification Hub.

Notification Hub has bounced around a bit. First it was under the Web Apps banner, then finally landed under the Service Bus brand. Regardless of which area it falls under, I think it’s a highly underappreciated service. Perhaps this will help change that perception in some small way.

Overview

In short, Notification Hub is a way to centralize your notifications across multiple platforms. It provides you with a single API interface from which you can send notifications to a myriad of platforms (iOS, Android, Kindle, Windows, etc…). In doing this, the service handles negotiations with the platform specific APIs and also helps deal with the various throttling limits that are in place for each.

As if this weren’t enough, Notification Hub (NH) also provides several higher level functions.

Device Registry – a single location that you can use to store details and metadata about individual devices. The service monitors for and purges expired device registrations, saving you from being kicked off the various providers’ notification services.

Templates – the capability to define templates that can be used to map a single message to different platforms and notification message formats.

Tags – metadata attributes that can be used to help you target the recipient(s) of a given message, allowing you to target a single device, a group of devices, or all devices as you see fit.

Each of these topics could be grounds for its own post or possibly even a series of posts. But today I’m interested in one specific topic, using NH to send notifications to a Windows 10 UWP (Universal Windows Platform) application written in HTML5/JS.

To get this all to work, we have to perform the following steps:

  1. Define the application in the Windows Store and configure the Windows Notification Service (WNS)
  2. Configure the Notification Hub to talk with the application’s WNS endpoint.
  3. Have the app discover its “channel URI”, which is used by WNS to send notifications to the app (see the sketch after this list).
  4. Using that URI, ask the Notification Hub for a unique device ID.
  5. Using the device ID, register the device with the Notification Hub.
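
For reference, step 3 is the only piece that happens purely on the device, and in a JavaScript UWP it boils down to a call like the following. This is just a sketch; registerWithApi is a hypothetical helper standing in for the call to the Web API described later.

    var pushNotifications = Windows.Networking.PushNotifications;

    pushNotifications.PushNotificationChannelManager
        .createPushNotificationChannelForApplicationAsync()
        .then(function (channel) {
            // channel.uri is the value we hand to our REST API (steps 4 and 5)
            registerWithApi(channel.uri);
        });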

So let’s get to it…

The Windows Store, WNS, and Notification Hub

To do notifications, we need to start by setting up WNS, and this requires us to declare our application at Windows Dev Center: https://dev.windows.com/

After logging in, click on “Dashboard” then select “Create a new app”. You’ll need to provide a name for your app (that must be unique) so it can be reserved for your use. With the app created we then click on “Services->Push notifications”. The app should already be configured for push notifications by default, but we need to get the credentials so we can configure the Notification Hub.

In the first section on the Windows Push Notification Services, locate the “Live Service site” link in the second paragraph and click it. This routes you over to the app settings for the new app. We want to capture and record two values here, the Package SID and Client secret as shown below…

[Figure: the Package SID and Client secret in the app settings]

Note: the values contained above are not valid. So please don’t even try. :)

Now we’ll switch over to the Azure Management Portal (classic) and create a new Notification Hub. Select the “+” at the bottom and go App Services -> Service Bus -> Notification Hub -> Quick Create. Provide a hub name, the region where it’s located, and a unique namespace for it.

[Figure: creating the Notification Hub via Quick Create]

After a few minutes (even faster if you’re re-using an existing namespace), the hub will be created and ready for configuration. We’ll click on its “configure” tab and enter in the Package SID and Client Secret from above.

[Figure: configuring the hub with the Package SID and Client secret]

Save this change, and switch back to the dashboard for the Notification Hub, and get its connection information for the DefaultFullSharedAccessSignature.

[Figure: the hub’s DefaultFullSharedAccessSignature connection information]

This use of the “full” permissions can be a security issue, but I’ll talk more about that in the next section.

With these steps done, we’ve declared a store application, created a Windows Notification Service endpoint for it, and associated the WNS with our Notification Hub. It’s time to start building the app.

The Solution

The last time I worked with Notification Hub was in September of 2014, with a Windows 8.1 app written in Angular. Since that time I’ve been actively trying to change my development habits and learn more JavaScript. So when another partner approached me about their HTML5/JavaScript application, I wanted to make another go at getting Notification Hub to work, this time with JavaScript. The challenge, I soon found out, was that we (Microsoft) still don’t have many materials available for building UWPs with JavaScript.

I want to fix that. To that end I’m creating a GitHub repository where I’ll be putting my explorations in creating UWPs, and hopefully others will fork this effort and perhaps even contribute back to it. I’ll start with what I hope is a fairly solid Notification Hub example.

I started by creating a Visual Studio 2015 solution that included a JavaScript Blank Universal Windows App, and a C# based ASP.NET Web API. Yeah, I know I said I’m trying to learn more about JavaScript, so I could have built the API in Node, but one battle at a time. I’m putting both in the same solution for convenience since I can then debug both side by side in the same VS instance.

The reason I created two projects was simple: security. I could have let the application directly interact with the Notification Hub API. However, this would have required me to put the shared access key for the hub into the app itself. By placing it in a server side API instead, it helps protect the key from being pulled out of the app and put to malicious use.

The REST API

With the solution in place, I started by building the API side first. I’ll create a single API controller that will have two methods: a POST method that accepts a channel URI and returns a Notification Hub device ID, and a PUT method that takes the device ID and other values and registers it for notifications. You could put both of these into a single operation, but there may be times where you just want to update the registration (say adding/updating a tag), so I prefer this two step process.

We start by creating a model that will be used by the controller. This class is used when the controller is activated to help initialize a Notification Hub client. The class looks like this…

    public class Notifications
    {
        public static Notifications Instance = new Notifications();

        public NotificationHubClient Hub { get; set; }

        private Notifications()
        {
            //get properties that have NH connection info
            string nhConnString = Properties.Settings.Default.NotificationHubConnectionString;
            string nhName = Properties.Settings.Default.NotificationHubName;

            // create a NH client 
            Hub = NotificationHubClient.CreateClientFromConnectionString(nhConnString, nhName);
        }
    }

You’ll note that this code depends on two settings: the connection string we captured when we set up the Notification Hub, and the hub name we specified when we created it. Right-click on the project and add those now. For this code to compile, we also have to use NuGet to add the Microsoft Azure Notification Hubs package to the project, along with a couple of using clauses so that everything resolves properly.
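For reference, these are the using directives I would expect at the top of the model and controller files (the exact namespaces can vary a bit between versions of the Notification Hubs NuGet package, so treat these as a guide rather than gospel):

    // Notification Hub client and the *RegistrationDescription types
    using Microsoft.Azure.NotificationHubs;
    // MessagingException used in the PUT method further below
    using Microsoft.Azure.NotificationHubs.Messaging;
    using System.Threading.Tasks;
    using System.Web.Http;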

We’ll then add the controller that will contain our Notification Hub registration API. We start by adding a private NotificationHubClient variable and then populate this variable using the model we created earlier.

    public class RegisterController : ApiController
    {
        private NotificationHubClient hub;

        public RegisterController()
        {
            hub = Notifications.Instance.Hub;
        }
    }

For the POST method, I start by defining the request object: a payload class with a single property, the channel URI.

        public class regGetRegistrationId
        {
            public string channeluri { get; set; }
        }

With this, we can define the POST API method. This method accepts the channel URI (aka the handle) and tries to find an existing device ID for this URI in the Notification Hub device registry. If we find a match, we’ll return its device ID; if not, we’ll create a new one. When checking existing devices, we do this in a loop to help make sure we don’t end up with multiple registrations using the same channel URI (which could result in the app getting the same notification multiple times).

        [HttpPost]
        [Route("api/register/")]
        public async Task<HttpResponseMessage> Post([FromBody]regGetRegistrationId request)
        {
            string newRegistrationId = null;
            string channelUri = request.channeluri;

            //todo: validate the URI is for notify.windows.com domain

            // make sure there are no existing registrations for the channel URI provided
            if (channelUri != null)
            {
                var registrations = await hub.GetRegistrationsByChannelAsync(channelUri, 100);

                foreach (RegistrationDescription registration in registrations)
                {
                    if (newRegistrationId == null)
                    {
                        newRegistrationId = registration.RegistrationId;
                    }
                    else
                    {
                        await hub.DeleteRegistrationAsync(registration);
                    }
                }
            }

            if (newRegistrationId == null)
                newRegistrationId = await hub.CreateRegistrationIdAsync();

            return Request.CreateResponse(HttpStatusCode.OK, newRegistrationId);
        }

Moving on to the PUT method for actually registering the device the app is running on, I again start by declaring my request payload/contract. This one has the device ID from the previous call, the platform we want to register with, the handle (in WNS, this is the Channel URI), and a set of tags that can be used for targeting notifications.

        public class DeviceRegistration
        {
            public string deviceid { get; set; } // the device ID
            public string platform { get; set; } // what platform, WNS, APNS, etc... 
            public string handle { get; set; } // callback handle for the associated notification service
            public string[] tags { get; set; } // tags to be used in targeting notifications
        }

Finally, we have the API method itself. It takes the payload, reformats the details into what’s needed for the Notification Hub SDK, and performs the registration. If the device is already registered, this would update/overlay the registration with new values.

        [HttpPut]
        [Route("api/register/")]
        public async Task<HttpResponseMessage> Put([FromBody]DeviceRegistration deviceUpdate)
        {
            RegistrationDescription registration = null;
            switch (deviceUpdate.platform.ToLower())
            {
                case "mpns":
                    registration = new MpnsRegistrationDescription(deviceUpdate.handle);
                    break;
                case "wns":
                    registration = new WindowsRegistrationDescription(deviceUpdate.handle);
                    break;
                case "apns":
                    registration = new AppleRegistrationDescription(deviceUpdate.handle);
                    break;
                case "gcm":
                    registration = new GcmRegistrationDescription(deviceUpdate.handle);
                    break;
                default:
                    throw new HttpResponseException(HttpStatusCode.BadRequest);
            }

            registration.RegistrationId = deviceUpdate.deviceid;
            registration.Tags = new HashSet<string>(deviceUpdate.tags);
            // optionally you could supplement/override the list of client supplied tags here
            // tags will help you target specific devices or groups of devices depending on your needs

            try
            {
                await hub.CreateOrUpdateRegistrationAsync(registration);
            }
            catch (MessagingException e)
            {
                //ReturnGoneIfHubResponseIsGone(e);
            }

            return Request.CreateResponse(HttpStatusCode.OK);
        }

In this example, we’re taking whatever tags were handed to our API method. In reality, the API might manage this itself, perhaps assigning specific tags based on information about the user of the app, or on settings the user has defined in the app itself. Tags matter because they are what you use to target notifications at specific devices or groups of devices when you send, as sketched below.
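To make that concrete, here is a minimal, hypothetical server-side send helper (it is not part of the registration API above, and the method name is my own) that pushes a WNS toast only to devices whose registration carries a given tag, reusing the hub client from the Notifications model:

    // Hypothetical helper: send a WNS toast only to registrations carrying the given tag.
    public static async Task SendToastToTagAsync(string message, string tag)
    {
        var hub = Notifications.Instance.Hub;

        // Raw WNS toast payload; the XML schema is defined by Windows, not by Notification Hubs.
        string toast = "<toast><visual><binding template=\"ToastText01\">" +
                       "<text id=\"1\">" + message + "</text>" +
                       "</binding></visual></toast>";

        // The second argument is a tag (or tag expression) used to filter which registrations receive it.
        await hub.SendWindowsNativeNotificationAsync(toast, tag);
    }

A call like SendToastToTagAsync("Hello from the hub", "tag1") would then reach only the devices that registered with “tag1”.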

At this point, you could use a tool like Fiddler to test the API directly. You can monitor the Notification Hub dashboard in the Azure management portal to make sure operations are succeeding. There’s a bit of a delay before actions show up on the dashboard after being performed (it seems like 5-10 minutes for the graph across the top, but an hour or so for the device registration count across the bottom), so don’t expect immediate results if you’re doing this type of testing. I’d suggest just firing off a couple of REST requests to ensure your code doesn’t throw any obvious exceptions, and then get back to coding up the rest of the app.
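If you’d rather script that smoke test than drive it through Fiddler, a throwaway console app along these lines exercises both methods. The port and channel URI below are placeholders (use the port Visual Studio assigned to your Web API project and a real channel URI captured from the client):

    using System;
    using System.Collections.Generic;
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;

    class RegistrationApiSmokeTest
    {
        static void Main()
        {
            RunAsync().GetAwaiter().GetResult();
        }

        static async Task RunAsync()
        {
            // Placeholder values - substitute your own local port and a real channel URI.
            var baseUrl = "http://localhost:7521/api/register/";
            var channelUri = "https://example.notify.windows.com/?token=placeholder";

            using (var client = new HttpClient())
            {
                // POST: exchange the channel URI for a Notification Hub registration (device) ID
                var postBody = new FormUrlEncodedContent(new Dictionary<string, string>
                {
                    { "channeluri", channelUri }
                });
                var postResponse = await client.PostAsync(baseUrl, postBody);
                var deviceId = (await postResponse.Content.ReadAsStringAsync()).Trim('"');
                Console.WriteLine("Device ID: " + deviceId);

                // PUT: register the device with its platform, handle, and tags
                var putPayload = "{\"deviceid\":\"" + deviceId + "\",\"platform\":\"wns\"," +
                                 "\"handle\":\"" + channelUri + "\",\"tags\":[\"tag1\",\"tag2\"]}";
                var putResponse = await client.PutAsync(baseUrl,
                    new StringContent(putPayload, Encoding.UTF8, "application/json"));
                Console.WriteLine("PUT status: " + putResponse.StatusCode);
            }
        }
    }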

The client-side JavaScript code

With the API in place, we can start working on the JavaScript side of things. The bulk of the code can be found in a stand-alone object I created, uwpNotifications, which lives in its own file, notifications.js. The object has a single method, registerChannelURI. It’s a bit long, so we’ll look at it step by step.

First up, we need to look to see if the app was run previously and saved its channel URI.

        // check and see if we have a saved ChannelURI
        var applicationData = Windows.Storage.ApplicationData.current;
        var localSettings = applicationData.localSettings;

        var savedChannelURI = localSettings.values["WNSChannelURI"];
        //savedChannelURI = "re-register"; // uncomment this line to force re-registration every time the app runs

Note the last line: if uncommented, it will overwrite whatever the saved value was. A channel URI can change, and usually won’t last more than 30 days, so the recommendation is that you get it, save it, and only re-register with the Notification Hub if it changes. If you un-comment this line of code, you can run the app over and over again while testing/debugging your code. Just make sure you remove it after you’re done.

Next, we’re going to get a new channel URI by using the Windows UWP API.

        // get current channel URI for notifications
        var pushNotifications = Windows.Networking.PushNotifications;
        var channelOperation = pushNotifications.PushNotificationChannelManager.createPushNotificationChannelForApplicationAsync();

        // get current channel URI and check against saved URI
        channelOperation.then(function (newChannel) {
            return newChannel.uri;
        }).then(function (currentChannelURI) {
            // do stuff here!
        }).done();

This code sets up a push notification channel and returns a promise for when the operation is complete. The promise resolves with the channel URI, which is then passed along to the next “then” handler for further processing. It’s inside that handler that the real work is done.

We start by checking to see if the channel URI we just received is any different from the one we have saved.

    if (!savedChannelURI || savedChannelURI.toLowerCase() != currentChannelURI.toLowerCase()) {

If it is different (or we have nothing saved), we’ll start making our calls to our REST API, starting with the call to get a device ID.

                // get a Notification Hub registration ID via the API
                WinJS.xhr({
                    type: "post",
                    url: "http://localhost:7521/api/register",
                    headers: { "Content-type": "application/x-www-form-urlencoded" },
                    responseType: "text",
                    data: "channeluri=" + currentChannelURI.toLowerCase()
                }).then(function (getIdSuccess) {

If this completes successfully, inside of its “then” function we set up the parameters for the call to the second API, passing it the device/registration ID we received back.

                    // strip the double quotes off the string, we don't want those
                    var deviceId = getIdSuccess.responseText.replace(/['"]/g, '');
                        console.log("Device ID is: " + deviceId);

                    // create object for notification hub device registration
                    // tag values used are arbitrary and could be supplemented by any
                    // assigned on the server side
                    var registrationpayload = {
                        "deviceid"  : deviceId,
                        "platform"  : "wns",
                        "handle"    : currentChannelURI,
                        "tags"      : ["tag1", "tag2"]
                    };

Note that I’m stripping off the quotes that came at the beginning and end of my deviceId. I then use it and the other values to construct the remainder of the PUT request payload, which I can now send…

                    // update the registration
                    WinJS.xhr({
                        type: "put",
                        url: "http://localhost:7521/api/register/",
                        headers: { "Content-type": "application/json" },
                        data: JSON.stringify(registrationpayload)
                    }).then(
                        function (registerSuccess) {
                            console.log("Device successfully registered");

                            // save/update channel URI for next app launch
                            localSettings.values["WNSChannelURI"] = currentChannelURI;
                        },
                        function (error) {
                            console.log(JSON.parse(error.responseText));
                        }
                    ).done();

It’s inside the “then” for this second call that I save the new URI for the next time the app launches. It should also be noted that I’m not really doing any error handling in these samples; in a production-quality app you really should build in a few retries (since these calls will fail if there’s no network connection).

With this object and its method declared, we can now wire it up to the application. I put it in its own JavaScript file so it can easily be reused by another project, which means we need to add it to the default.html page of our new app.

    <!-- UWP___JavaScript references -->
    <link href="/css/default.css" rel="stylesheet" />
    <!-- TAG: #notificationhubjs -->
    <script src="/js/notifications.js"></script> <!-- Notification Hub Integration -->
    <script src="/js/default.js"></script>

Note how this was inserted before the default.js file. This helps ensure that the object is available to default.js, which is the entry point for my application. Inside default.js we’ll add the code to access our new method.

    // TODO: This application has been newly launched. Initialize your application here.

    // TAG: #notificationhubjs			    
    uwpNotifications.registerChannelURI(); // register the app for notifications

By placing this in the block of default.js that runs after the “newly launched” check, I ensure this code is called when the app is launched, but not each time the app is resumed, which is my intent. Now all that remains is to associate my app with the app I created in the store.

Store Association & Testing

Fortunately, since we’ve already registered our application with the store, Visual Studio makes this step really easy. Right-click on the UWP application in the solution explorer, and select Store -> Associate App with the Store.

storeassociation

You’ll get walked through a wizard that will log you into the store and let you select an existing app, or create a new one. Select the app you created above, and associate it with your app.

Note: If you are using a public repository such as GitHub, don’t check in your code after making the association. By associating the project with the store, you are altering the application manifest and adding a file that contains store-specific details. I recommend doing this final stage/testing in a separate branch so you can more easily merge back changes without bringing over those details.

With this change done, we can run the app and test our notifications. Start by launching the Web API in debug mode; I found it was easiest to set that project as my start-up project. Next, run the UWP application in debug mode. You’ll find you can set breakpoints in both projects and they’ll get hit, which makes basic debugging of the code a lot easier.

Once you know the app has launched and appears to have registered with the Notification Hub properly, we can go back to the Azure Management Portal and use the Notification Hub to send a test notification to our application. Be sure to set the platform to “Windows” and the notification type to “Toast”.

notificationtest

If we’ve done everything correctly, you should receive a toast notification. What’s even better is that if you stop debugging, you can still send the notification. This is one of the things that makes notifications awesome: your app doesn’t even need to be running for you to send notifications to it. And with Windows 10, you can even allow the user to take actions on those notifications. But that’s likely a subject best left for another day.

Conclusion

So hopefully this has stitched a few topics together in a way that’s useful. As I mentioned earlier, I’ve put all this code up on GitHub for you to borrow and reuse as you see fit. Over time I hope to grow that project, so if you want to skip straight to the parts covered in this blog, just clone the repository and look for the tag #notificationhubjs.

Until next time!

 

Introduction to the Service Fabric

I’ve spent most of the last 7 years focused on the cloud. During that time I’ve worked with customers of all sizes in a myriad of industries to adopt cloud platforms and build solutions they can then provide to their customers. If I’ve learned nothing else, it’s that continuous innovation is necessary and EVERY cloud solution could be improved.

By now, you’re hopefully familiar with the concept of DevOps. This blending of developer and IT pro responsibilities is at the center of many cloud infrastructures. Unfortunately, most approaches still don’t offer a good way to separate these two fairly disparate viewpoints. They either require the developer to learn too much about infrastructure, or the IT Pro needs to know too much about building deployment scripts. The challenge is to satisfy the unique needs of both audiences, while still allowing them to stay focused on their passions.

What we need is a platform the developer can build applications for that can scale, provide resiliency, and be used to easily deliver performant, stateful, and highly available services. This same platform also needs to be something that can be deployed either in the cloud as a managed service, or set up on-premises using existing compute resources. It also needs to be something that can allow the IT Pro to worry less about the applications/services that are on it, and more about keeping the underlying infrastructure healthy.

To that end, I’m pleased to talk about our newest innovation, Microsoft Azure Service Fabric. But before I dive into exactly what Service Fabric is, let’s talk about the road that’s led us here.

Infrastructure as a Service

A natural evolution from managed infrastructure, it’s easy to argue that IaaS simply built upon the hypervisor technologies we’d already been using for years. The only real difference here was the level of automation taking care of common tasks (such as relocating virtual machines in case of hardware failure). This solution also had the benefit that to a degree, you could stand up your own private IaaS cloud.

The issue here however was that many of the old problems still remained. You had to manage the virtual machines, patching/upgrading them. And if you deployed an application to this type of infrastructure, there was still a lot of work to be done to make it resilient in case of a failure. Especially in scenarios where the application had any type of state information that you didn’t want to lose.

When you add in common tasks like rolling upgrades, application version management, etc… IaaS really starts to look like its predecessor. IaaS did reduce the need to manage hardware, but still didn’t address how we build highly scalable, resilient, solutions.

Containerization

Linux (and, you could easily argue, more specifically Docker) brought us containerization. Containers are deployed onto machines (either physical or virtual) as self-contained, mostly isolated components. This allows individual services to be quickly and easily combined into complex solutions. The level of automation available pushed the underlying machines further back on the stack, so while the underlying shortcomings of IaaS still exist, the automation lets us worry about them even less.

In addition to the composable nature of container based solutions, this technology also offered the advantage of less bloat. If you used virtual machines to isolate your services, you have the overhead of a full operating system for each instance. Containers are much lighter weight, sacrificing a degree of resource isolation to save resources overall.

Unfortunately, this approach is still about putting components on infrastructure, and not really about the applications those components make up. So many of the problems with application lifecycle management (ALM) that we saw with IaaS still exist. There are solutions that can be layered on top of containerization to help manage some of this, but they add even more complexity.

Platform as a Service

PaaS, be it Microsoft Azure Cloud Services or solutions like Heroku, tried to solve some of this by pushing the underlying OS so far into the background that it nearly vanished. At least until something went wrong. And unfortunately, in the cloud there’s only one absolute: failure is inevitable.

Don’t get me wrong. I still love the promise of platform as a service. It is supposed to give us a place to deploy applications where we can depend on the platform to take care of common tasks like scalability, failover, and rolling upgrades. Unfortunately, in most cases PaaS solutions fell a bit short of their goals. Version management was mostly non-existent, and if you wanted things to be truly resilient, you needed to externalize any state information, which meant taking a dependency on external caches or data stores.

Another key challenge is that PaaS adoption was often done using traditional n-tier architecture patterns. The result was that you would design a system comprised of components for the different layers, then deploy them as individual pieces. While you could scale the number of copies/instances of the solution components based on needed capacity, this pattern still leads to wasted resources, as each instance is often under-utilized.

Enter the Service Fabric

We (Microsoft) watched customers and often times ourselves struggle with these problems. And more importantly, we watched what was being done to overcome them. At the same time, we were also learning from the solutions we were building. Over time, common patterns became visible and the solutions for them were industrialized. It’s out of these learnings that we began to craft the Service Fabric.

The goal of Service Fabric, as the name implies, is to provide a framework that allows services to be easily stitched together into complex solutions. It will manage the services by helping them discover and communicate with each other, while also helping them maintain a healthy state. The services could be a web application, a traditional service, or even some executable.

The fabric intends to specifically solve the following challenges:

- Provide a platform into which services are deployed, reducing the need to worry about the underlying virtual machine(s)

- Increase the density of deployed services to decrease latency without increasing complexity

- Manage deployed services, providing failover, version management, and discoverability

- Give services a way to coordinate actions such as data replication, to increase resiliency when state needs to be maintained

And while we’re just now announcing this new offering, we’ve already tried the waters with this approach ourselves. Existing Azure Services such as Service Bus Event Hubs, Document DB, and the latest version of Azure SQL Database are being delivered on Service Fabric.

What is the Service Fabric

Service Fabric is a distributed systems platform that allows developers building services and applications to avoid complex distributed infrastructure problems and focus instead on implementing their workloads and business logic, while adding scalability and reliability. It also greatly reduces the burden on application administrators and IT operators by implementing simple workflows for provisioning, deploying, patching, and monitoring services. It provides first-class support for the full lifecycle of cloud applications, from initial deployment to eventual decommissioning.

At the heart of these buzzwords is the concept of a Service Fabric Cluster. A cluster is a collection of servers (physical or virtual) that are stitched together via the Service Fabric software (did you see what I did there? *grin*). Individually, these servers are referred to as “nodes”. The Service Fabric software keeps track of each node and the applications that have been deployed to the cluster. It’s this monitoring that allows the fabric to detect when a node fails and move the instances of an application to another node, thus providing resiliency for your application.

One of the interesting things about the Service Fabric is that it’s headless.  The Service Fabric software on each node synchronizes with the instances running on other nodes in the cluster. In essence, keeping track of the state of its own node, while also sharing that state with the other nodes. Those nodes in turn share their state back out. This is important because when you want to know anything about the cluster, you can simply query the Service Fabric management API on any node and you’ll get the same answer.
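As a quick illustration (my own sketch, not something taken from the original post), you could point a FabricClient at any node’s client connection endpoint and ask about the whole cluster. The address below is a placeholder, and an unsecured cluster is assumed for brevity:

    using System;
    using System.Fabric;            // FabricClient, from the Service Fabric client SDK
    using System.Threading.Tasks;

    class ClusterQuery
    {
        static void Main()
        {
            RunAsync().GetAwaiter().GetResult();
        }

        static async Task RunAsync()
        {
            // Any node's client connection endpoint (port 19000 by default) gives the same answers,
            // because every node shares the cluster state with its peers.
            var client = new FabricClient("mycluster.westus.cloudapp.azure.com:19000");

            var nodes = await client.QueryManager.GetNodeListAsync();
            foreach (var node in nodes)
            {
                Console.WriteLine(node.NodeName + " : " + node.NodeStatus + " : " + node.HealthState);
            }
        }
    }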

When you deploy an application to the cluster, a notification is sent to one of the nodes in the cluster. This node in turn looks at what it knows of the other nodes, and determines where to place the various parts of the application. Notifications are then in turn sent to the selected nodes so they can take the appropriate actions.

But before we go too deep into this, let’s look at a Service Fabric Application.

Service Fabric Application Model

Service Fabric operates on the notion that applications are composed of services. But the term ‘service’ is used fairly loosely. A service could be an application listening for a request, or a console application that runs in a loop, performing some action. A service is really just a process that you want to run.

Services in turn, are composed of three parts: code, config, and data. Just as you would expect applications and services to be versionable, Service Fabric allows you to version these components. This is extremely powerful because now you can deploy updates of any of these components independently of the others.

The service acts as a containerized, functional unit. It can be developed and revised individually, start/run independently, and scaled in isolation from other services. But most importantly, the services, when acting together, form the higher level applications that are needed for today’s cloud scale solutions.

svcFabAppModel

This separation provides several advantages. First off, we can easily upgrade and scale individual instances of the services as needed. Another key advantage of Service Fabric is that we can deploy as many applications into the cluster as its resources (CPU, memory, etc.) can support. This can even include multiple, different versions of the same application or service, each managed separately.

But this isn’t the end of the story. This example was fairly simple because the services in question are not stateful. And a solution doesn’t have to get very complex before the need to store state arises.

Stateless vs Stateful Services

As mentioned earlier, a key failure of most cloud solutions is that they attempt to follow a traditional “n-tier” architecture. The result is that you have clear separation of concerns and externalized data stores. This in turn introduces both complexity and latency, since the solution has to traverse various artificial boundaries, such as remote calls to read/write data, or takes on more moving pieces, such as a caching layer.

To address this issue, Service Fabric has two types of services: stateless and stateful. A stateless service is entirely self-contained and keeps no durable state of its own. The other type, the stateful service, has something that stateless services lack: as its name implies, it can store data and, more importantly, leverage the Service Fabric to replicate that data across multiple instances of the service. This speeds up access, because we can now durably store information within the fabric cluster, and since the data is replicated, we don’t sacrifice resiliency. We also simplify the solution by reducing its dependency on external services, any one of which could impact the availability of our application.

Stateful Service Fabric services (say that 5 times fast), work because they leverage the cluster for discoverability and communication. Each Stateful Service should run at least three instances. The cluster will select one of these copies to be the primary, and the others will be secondary replicas. The instances of the service then leverage the Service Fabric to keep the replicas in quorum. As transactions occur against the primary, they are sent over and applied to each of the replicas.

This works because when you request an endpoint for the service, the fabric will automatically route the request to the elected primary. When changes occur (inserts, updates, deletes), these transactions are then replayed on the secondary replicas. Each replica will store its current state either in memory, locally on disk, or optionally remotely (although that last approach should be used with caution, for the reasons we mentioned earlier).

Internally, the stateful service leverages another Service Fabric concept, the distributed collection. It’s these collections that actually do the work of storing data and cooperating with the Service Fabric to ensure replication is performed. The service container provides an activation point for the collection, while simultaneously providing a process that can host any service endpoints that allow for interaction with the collection. What’s important to note here is that the only part of any application in the cluster that can interact with a given collection is its hosting service.
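To make the distributed collection idea a bit more concrete, here is a minimal sketch (my own illustration, with made-up names) of a stateful service that owns a reliable dictionary and updates it inside a transaction; the fabric takes care of replicating each committed change to the secondary replicas:

    using System;
    using System.Fabric;                               // StatefulServiceContext
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.ServiceFabric.Data.Collections;    // IReliableDictionary
    using Microsoft.ServiceFabric.Services.Runtime;    // StatefulService

    // Illustrative stateful service: the collection lives inside the service,
    // and only this service can touch it.
    internal sealed class CounterService : StatefulService
    {
        public CounterService(StatefulServiceContext context)
            : base(context)
        {
        }

        protected override async Task RunAsync(CancellationToken cancellationToken)
        {
            // The reliable dictionary is the "distributed collection" described above.
            var counters = await this.StateManager
                .GetOrAddAsync<IReliableDictionary<string, long>>("counters");

            while (!cancellationToken.IsCancellationRequested)
            {
                using (var tx = this.StateManager.CreateTransaction())
                {
                    // Each change is committed as a transaction and replicated to the secondaries.
                    await counters.AddOrUpdateAsync(tx, "heartbeat", 1, (key, value) => value + 1);
                    await tx.CommitAsync();
                }

                await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
            }
        }
    }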

Now some stateful services may operate just fine using a single set of primary/secondary instances. But as demand increases, you may need to scale out the service to allow for even greater throughput. This is where “partitions” come into play. When the service registers the distributed collection with the Service Fabric, it defines the number of partitions and the key ranges for those partitions.

It should be stressed that the partition is a logical construct only. You could have a distributed collection that’s running on a single primary that has 10,000 partitions. That same collection could be spread across 10, 20, or more primary instances. It’s all based on your demand. How this is done is a bit more of an advanced topic, so we’ll leave that for another time.

PaaS for the new generation of cloud solutions

So that is it for this introduction. We’ve barely scratched the surface of what this framework can help you accomplish. So we’re hoping you’ll join us for some of the other sessions we have planned that explore various aspects of Service Fabric more deeply.

Thank you, and until next time!
