Azure File Depot
May 16, 2014 Leave a comment
It’s been a little while since my last update. BUILD 2014 has come and gone and my group, DPE (Developer Platform Evangelism), has been re-branded to DX (Developer Experience). Admittedly a name is a name, but this change comes at a time when my role is beginning to transform. For the last 18 months, my primary task has been to help organizations be successful in adopting the Azure platform. Going forward, that will still remain significant, but now we’re also tasked with helping demonstrate our technology and our skills through the creation of intellectual property (frameworks, code samples, etc…) that will assist customers. So less PowerPoint (YEAH!), more coding (DOUBLE YEAH!).
To that end, I’ve opted to tackle a fairly straightforward task that’s come up often. Namely the need to move files from one place to another. It’s come up at least 3-4 times over the last few months so it seems like a good first project under our changing direction. To that end, I’d like to present you to my first open sourced effort, the Azure File Depot.
What is it?
In short, the Azure File Depot is an effort to provide sample implementations of various tasks related to using blob storage to move files around. It contains a series of simple classes that help demonstrate common tasks (such as creating a Shared Access Signature for a blob) and business challenges (the automated publication of files from on-premises to Azure and/or an Azure hosted VM).
Over time, it’s my hope that we may attract a community around this and evolve this little project into a true framework. But in the meantime, it’s a place for me to do what I do best and that’s take learnings and share them. At least I hope I’m good at it since it’s a significant portion of my job. J
What isn’t it?
I need to stress that this project isn’t intended to be a production ready framework for solving common problems. The goal here is to create a collection of reference implementations that address some common challenges. While we do want to demonstrate solid coding practices, there will be places where the samples take less than elegant approaches.
I’m fully aware that in many cases there may be better implementations out there, in some cases perhaps even off the shelf solutions that are production ready. That’s not what this project is after. I have many good friends in Microsoft’s engineering teams that I know will groan and sigh at the code in this project. I only ask that you be kind and keep in mind, this is an educational effort and essentially one big collection of code snippets.
So what’s in it currently?
What we’ve included in this first release is the common ask I referred to above. The need to take files generated on-premises and push them to Azure. Either letting them land in blob storage, or have them eventually land in an Azure hosted VM.
Here’s a diagram that outlines the scenario.
Step 1 – File placed in a folder on the network
The scenario begins with a file being created by some process and saved to the file system. This location could be a network file share or just as easily could be on the same server as the process itself.
Step 2 – The Location is monitored for new files
That location is in turn monitored by a “Publication Service”. Our reference implementation uses the c# FileSystemwatcher class which allows the application to be receive notification of file change events from Windows.
Step 3 – Publication service detects file and uploads to blob storage
When the creation of a new file raises an event in the application, the publishing app waits to get an exclusive lock on the file (making sure nothing is still writing to the file), then uploads it to a blob container.
Step 4 – Notification message with SAS for blob is published
After the blob is uploaded, the publication service then generates a shared access signature and publishes a message to a “Messages” Service Bus topic so that interested processes can be alerted that there’s a new file to be downloaded.
Step 5 – Subscribers receive message
Processes that want to subscribe to these notifications create subscriptions on that topic so they can receive the alerts.
Step 6 – Download blob and save to local disk
The subscribing process then use the shared access signature to then download the blob, placing it in the local file system.
Now this process could be used to push files from any location (cloud or on-premises) to any possible receiver. I like the example because it demonstrates a few key points of cloud architecture:
- Use of messaging to create temporal decoupling and load leveling
- Shared Access Signatures to grant temporary access to secure, private blob storage for potentially insecure/anonymous clients
- Use of Service Bus Topics to implement pub/sub message model
So in this one example, we have a usable implementation pattern (I’ve provided all the basic helper classes as well as sample implementations of both console and Windows Services applications). We also have a few reusable code snippets (create a blob SAS, interact with Service Bus Topics, upload/download files to/from block blobs.
At least one customer I’m working with today will find these samples helpful. I hope others will as well.
I have a few things I plan to do this with this in the near term. Namely make the samples a bit more robust: error handling, logging, and maybe even *gasp* unit tests! I also want to add in support for larger files by showing how to implement this with page blobs (which are cheaper if you’re using less than 500TB of storage).
We may also explore using this to do not just new file publication, but perhaps updates as well as adding some useful metadata properties to the blobs and messages.
I also want to look at including more based scenarios. In fact, if you read this, and have a scenario, you can fork the project and send us a pull request. In fact, you’re wondering how to do something that you think could fit into this project, please drop me a line.
That’s all the time I have for today. Please look over the project as we’d love your feedback.