Thursday, November 24, 2011

Storing data in the cloud with Azure

Suppose you’re developing a new podcasting application for Windows 7. For this application, you want to convert MP3 files to WMA. To convert an MP3 file, you first need to read the file from a hard disk (and eventually write the result). Even though there are thousands of different disk drives, you don’t need to concern yourself with the implementation of these drives because the operating system provides you with an abstracted view of the disk drive. To save the converted file to the disk, you can write the file to the filesystem; the operating system manages how it’s written to the physical device. The same piece of code that you would use to save your podcast will work, regardless of the physical disk drive.

In the same way that Windows 7 abstracts the complexities of the physical hardware of a desktop PC away from your application, Windows Azure abstracts the physical cloud infrastructure away from your applications using configuration and managed APIs.

Applications can’t subsist on code alone; they usually need to store and retrieve data to provide any real value. In the next section, we’ll discuss how Azure provides you with shared storage, and then we’ll take a quick tour of the BLOB storage service, messaging, and the Table storage service.


Understanding Azure’s shared storage mechanism
If we consider the MP3 example in the context of Windows Azure, rather than abstracting your application away from a single disk, Windows Azure needs to abstract your application away from the physical server (not just the disk). Your application doesn’t have to be directly tied to the storage infrastructure of Azure. You’re abstracted away from it so that changes in the infrastructure don’t impact your code or application. Also, the data needs to be stored in shared space, which isn’t tied to a physical server and can be accessed by multiple physical servers.

Your services won’t always be continually running on the same physical machine. Your roles (web or worker) could be shut down and moved to another machine at any time to handle faults or upgrades. In the case of web roles, the load balancer could be distributing requests to a pool of web servers, meaning that an incoming request could be performed on any machine.

To run services in such an environment, all instances of your roles (web and worker) need access to a consistent, durable, and scalable storage service. Windows Azure provides a scalable storage service, which can be accessed both inside and outside the Microsoft data centers. When you register for Windows Azure, you’ll be able to create your own storage accounts with a set of endpoint URIs that you can use to access the storage services for your account. The storage services are accessed via a set of REST APIs that are secured by an authentication token.
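To make the idea of per-account endpoint URIs concrete, here is a minimal sketch. It assumes the commonly documented `https://<account>.<service>.core.windows.net` pattern, which the text above doesn't spell out, and the account name `mystorage` is a hypothetical placeholder. The function only builds strings; it doesn't contact any service.

```python
# Sketch only: builds the endpoint URIs for a storage account.
# The URI pattern and the account name are assumptions, not taken from the text.
SERVICES = ("blob", "queue", "table")


def storage_endpoints(account_name):
    """Return a dict mapping each storage service to its endpoint URI."""
    return {
        service: f"https://{account_name}.{service}.core.windows.net"
        for service in SERVICES
    }


endpoints = storage_endpoints("mystorage")  # hypothetical account name
for service, uri in endpoints.items():
    print(f"{service}: {uri}")
```

Each service gets its own endpoint, which is why a single storage account can serve BLOBs, queues, and tables side by side.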

Windows Azure storage services are hosted in the fabric in the same way as your own roles are hosted. Windows Azure is a scalable solution; you never need to worry about running out of capacity.


Storing and accessing BLOB data
Windows Azure provides the ability to store binary files (BLOBs) in a storage area known as BLOB storage.

In your storage account, you create a set of containers (similar to folders) that you can store your binary files in. In the initial version of the BLOB storage service, a container can be set either to private access (you must use an authentication key to access the files held in this container) or to public access (anyone on the internet can access the files, without using an authentication key).

Let’s return to the audio file conversion (MP3 to WMA) scenario. In this example, you’re converting a source recording of your podcast (Podcast01.mp3) to Windows Media Audio (Podcast01.wma). The source files are held in BLOB storage in a private container called Source Files, and the destination files are held in BLOB storage in a public container called Converted Files. Anyone in the world can access the converted files because they’re held in a public container, but only you can access the files in the private container because it’s secured by your authentication token. Both the private and public containers are held in the storage account called MyStorage.
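The difference between the two containers can be modeled in a few lines. The sketch below is an in-memory stand-in, not the Azure API: the `Container` class, the `AccessDenied` exception, and the key value are all hypothetical, and the rule that writes always need the key is an assumption of this model.

```python
class AccessDenied(Exception):
    """Raised when a request lacks the key a container requires."""


class Container:
    """In-memory model of a BLOB container (hypothetical, for illustration)."""

    def __init__(self, name, public, account_key):
        self.name = name
        self.public = public
        self._account_key = account_key
        self._blobs = {}

    def put(self, blob_name, data, key):
        # Assumption of this model: writes always require the account key.
        if key != self._account_key:
            raise AccessDenied(f"cannot write to {self.name}")
        self._blobs[blob_name] = data

    def get(self, blob_name, key=None):
        # Public containers allow anonymous reads; private ones need the key.
        if not self.public and key != self._account_key:
            raise AccessDenied(f"cannot read from {self.name}")
        return self._blobs[blob_name]


KEY = "secret-key"  # hypothetical authentication key
source = Container("Source Files", public=False, account_key=KEY)
converted = Container("Converted Files", public=True, account_key=KEY)
source.put("Podcast01.mp3", b"mp3 bytes", key=KEY)
converted.put("Podcast01.wma", b"wma bytes", key=KEY)

print(converted.get("Podcast01.wma"))  # anonymous read of a public container
try:
    source.get("Podcast01.mp3")  # anonymous read of a private container
except AccessDenied as exc:
    print("denied:", exc)
```

The anonymous read succeeds only against the public container; the same call against Source Files is rejected unless the key is supplied.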

BLOBs can be split up into more manageable chunks known as blocks for more efficient uploading of files. This is only the tip of the iceberg in terms of what you can do with BLOB storage in Azure. In part 4, we’ll explore BLOB storage and usage in much more detail.
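Splitting a BLOB into blocks comes down to slicing the file into fixed-size chunks. The following sketch shows only that slicing step; the function name and the tiny block size are chosen for illustration, and nothing here uploads data or reflects the service’s real block size limits.

```python
def split_into_blocks(data, block_size):
    """Yield consecutive chunks of data, each at most block_size bytes long."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


payload = bytes(range(10))  # stand-in for the contents of an audio file
blocks = list(split_into_blocks(payload, block_size=4))
print([len(block) for block in blocks])  # [4, 4, 2]

# Joining the blocks in order reproduces the original file.
reassembled = b"".join(blocks)
print(reassembled == payload)  # True
```

Because each block is independent, blocks can be sent separately and a failed block can be retried without resending the whole file.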

BLOBs play the role of a filesystem in the cloud, but there are other important aspects of the storage subsystem. One of those is the ability to store and forward messages to other services through a message queue.


Messaging via queues
Message queues are the primary mechanism for communicating with worker roles. Typically, a web role or an external service places a message in the queue for processing. Instances of the worker role poll the queue for any new messages and then process the retrieved message. After a message is read from the queue, it’s not available to any other instances of the worker role. Queues are considered part of the Azure storage system because the messages are stored in a durable manner while they wait to be picked up in the queue.
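The polling pattern described above can be sketched with Python’s in-process `queue.Queue` standing in for the Azure queue service. This is an analogy under stated assumptions, not the Azure API: the worker names and message contents are hypothetical, and threads play the part of worker role instances.

```python
import queue
import threading


def worker(name, messages, handled, lock):
    """Poll the queue until it is empty, recording each message handled."""
    while True:
        try:
            message = messages.get_nowait()
        except queue.Empty:
            return  # nothing left to process
        with lock:
            handled.append((name, message))


# A web role (or external service) places messages in the queue.
messages = queue.Queue()
for n in range(1, 6):
    messages.put(f"Source Files/Podcast{n:02d}.mp3")

# Two worker instances poll the same queue.
handled = []
lock = threading.Lock()
workers = [
    threading.Thread(target=worker, args=(f"worker-{i}", messages, handled, lock))
    for i in range(2)
]
for thread in workers:
    thread.start()
for thread in workers:
    thread.join()

for name, message in sorted(handled, key=lambda pair: pair[1]):
    print(f"{message} handled by {name}")
```

Whichever worker retrieves a message is the only one that sees it, so every message is handled exactly once even though both workers poll the same queue.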

In the audio file conversion example, after the source podcast BLOB (Podcast01.mp3) is placed in the Source Files container, a web role or external service places a message (containing the location of the BLOB) in the queue. A worker role retrieves the message and performs the conversion. After the worker role converts the file from MP3 to WMA, it places the converted file (Podcast01.wma) in the Converted Files container.

Windows Azure also provides you with the ability to store data in a highly scalable, simple Table storage service.


Storing data in tables
The Table storage service provides the ability to store serialized entities in a big table; entities can then be partitioned across multiple servers.
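To illustrate how entities in one big table might be spread across servers, here is a small in-memory sketch. It is not how the service is implemented: the server list and the hash-based placement are assumptions made for illustration, and the `PartitionKey` and `RowKey` property names are not mentioned in the text above.

```python
import hashlib
from collections import defaultdict

SERVERS = ["server-0", "server-1", "server-2"]  # hypothetical storage servers


def server_for(partition_key):
    """Pick a server for a partition key (hash placement is an assumption)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]


class Table:
    """In-memory model of a table whose entities are spread over servers."""

    def __init__(self):
        self._servers = defaultdict(dict)

    def insert(self, entity):
        key = (entity["PartitionKey"], entity["RowKey"])
        # Store a copy so later changes to the caller's dict don't leak in.
        self._servers[server_for(entity["PartitionKey"])][key] = dict(entity)

    def get(self, partition_key, row_key):
        return self._servers[server_for(partition_key)][(partition_key, row_key)]


sessions = Table()
sessions.insert({"PartitionKey": "user-42", "RowKey": "session-1", "Data": "cart=3"})
sessions.insert({"PartitionKey": "user-42", "RowKey": "session-2", "Data": "cart=0"})
sessions.insert({"PartitionKey": "user-77", "RowKey": "session-1", "Data": "cart=1"})

print(sessions.get("user-42", "session-1")["Data"])
print(server_for("user-42"), server_for("user-77"))
```

In this model, entities that share a partition key always land on the same server, so a lookup only needs to visit one machine.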

Tables are a simple storage mechanism that’s particularly suitable for session management or user authentication. Tables don’t provide a relational database in the cloud; if you need the power of a database (such as server-side joins), then SQL Azure is the better option.

Source of information: Azure in Action (Manning, 2010)