Microsoft Codename "Astoria": Data Services for the Web
Pablo Castro has announced a very cool project, Microsoft Codename "Astoria": Data Services for the Web.
So, if:
You build data-aware web applications (are there web devs that don't?)
You are interested in data access over the web (data over the web is what makes the web interesting)
You have been asking: "What is the programming model for data over web?" (I have)
You want more than simple file storage in the sky - you want relational data in the cloud you can program against - read and write (every developer I've spoken to wants this)
You like REST ('nuff said)
Then...read on.
Codename "Astoria" is the project I was working on before I left Microsoft. Astoria was probably the single hardest reason for me to leave as the momentum of the project was really picking up speed and the potential of Astoria was being grokked by a number of teams inside Microsoft. So to see Pablo's presentation today at MIX07 was nothing short of glorious.
On to "What is Astoria?"
Astoria is the cloudiest city in North America. Astoria is also the codename for an incubation project started some months ago attempting to answer the following questions: if you could provide a dead-simple way of programming against a relational data store that resides on the internet, what should the programming model look like? Could it be simpler than SOAP-based data access programming?
There are two parts (well, more - but we'll start with two) to the Astoria story.
The first part is the more traditional Microsoft CTP thing. The downloadable bits let you very rapidly put a REST-based API (that is, URLs) in front of an existing SQL Server database. That's right - REST to SQL in a jiffy! You download the bits for Visual Studio Orcas, do a couple of wizzy things and before you know it you have your data available through HTTP. If you want, you can then apply your own security model, constraints, etc. Downloads info and links here.
The second part is what I consider to be the coolest thing in data over the web today. The Microsoft team has deployed an online implementation of Astoria here, providing some sample read-only datasets for you to play with. There are plans to allow anyone to define their own data model and then have their data stored the same way, read-write, but it's not there today, and when that goes online it'll be for experimental use only, not the real deal, yet.
There are currently four datasets for you to play with:
Northwind Data Service
Northwind is a classic database example, so we had to make it part of the sample set for the Astoria online service.
AdventureWorks Data Service
AdventureWorks is the new sample database that comes with Microsoft SQL Server 2005 and has a rich schema and a lot of data.
Encarta Data Service
The Microsoft® Encarta® team was nice enough to let us use a subset of their Encarta Encyclopedia article base and expose it as a data service.
TagSpace Data Service
The folks at Microsoft TagSpace were also kind enough to let us borrow their data (without user information, of course); this data service exposes the Tags, Tagged Items, Bookmarks and the relationships between them.
Let's look at the TagSpace Data Service as an example. Here's the entry point (data service root): http://astoria.sandbox.live.com/tagspace/tagspace.rse
What comes back is an XML payload (you can request JSON or simple RDF too) listing the entities you can begin to browse through. There are three: Tags, TaggedItems and UserBookmarks.
<?xml version="1.0" encoding="utf-8" ?>
<DataService xml:base="http://astoria.sandbox.live.com/tagspace/tagspace.rse">
  <TagSpaceEntities uri=".">
    <Tags href="Tags" />
    <TaggedItems href="TaggedItems" />
    <UserBookmarks href="UserBookmarks" />
  </TagSpaceEntities>
</DataService>
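To make the navigation concrete, here is a minimal Python sketch that parses a service-root payload like the one above and lists the URL of each entity set. The payload is pasted in as a string rather than fetched, since the sandbox may not be reachable. Appending each href to the end of the entry point URL is an assumption based on how the URLs in this post are built, and `entity_set_urls` is just a made-up helper name, not part of Astoria.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predefined in XML, so ElementTree expands xml:base to this key.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Service-root payload copied from the post (XML declaration omitted).
ROOT_PAYLOAD = """
<DataService xml:base="http://astoria.sandbox.live.com/tagspace/tagspace.rse">
  <TagSpaceEntities uri=".">
    <Tags href="Tags" />
    <TaggedItems href="TaggedItems" />
    <UserBookmarks href="UserBookmarks" />
  </TagSpaceEntities>
</DataService>
"""


def entity_set_urls(payload):
    """Return {entity set name: URL} for every element that carries an href."""
    root = ET.fromstring(payload.strip())
    base = root.get(XML_BASE).rstrip("/")
    urls = {}
    for container in root:            # e.g. <TagSpaceEntities>
        for entity_set in container:  # e.g. <Tags href="Tags" />
            href = entity_set.get("href")
            if href is not None:
                # Assumption: hrefs are appended to the entry point URL,
                # matching the ".../tagspace.rse/tags" style used in the post.
                urls[entity_set.tag] = base + "/" + href
    return urls


if __name__ == "__main__":
    for name, url in entity_set_urls(ROOT_PAYLOAD).items():
        print(name, "->", url)
```

The same loop should work for any payload shaped like the one above; nothing here depends on TagSpace beyond the pasted sample.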
So, let's look at all the tags. Simply add "/tags" to the end of the entry point URL: http://astoria.sandbox.live.com/tagspace/tagspace.rse/tags
One of the tags listed in the XML payload returned is vista:
<Tag uri="Tags[vista]">
<TagName>vista</TagName>
<CreatedDate>11/6/2006 11:40:31 AM</CreatedDate>
<TaggedItems href="Tags[vista]/TaggedItems" />
Now let's replace "/tags" with "/Tags[vista]/TaggedItems" in the URL to get all the items tagged with 'vista': http://astoria.sandbox.live.com/tagspace/tagspace.rse/Tags[vista]/TaggedItems
Now we see all the items tagged 'vista'. In the payload that comes back, we see that 'TaggedItem' is another entity made up of a number of elements: the item ID (in this case '2'), the item title, the item description and the URL of the item:
<TaggedItem uri="TaggedItems[2]">
<TaggedItemId>2</TaggedItemId>
<Title>How do I change Windows Vista Media Center's Record folder...</Title>
<Description>Well, currently I installed Windows Vista Ultimate on my PC. I have two harddrives, a 80gig, and a...</Description>
<Uri>http://beta.communities.microsoft.com/Forums/thread.aspx?ThreadId=6abaf29f-81e0-4cbb-bcab-93541e49f0ea&MessageId=ddf043cb-903f-437a-9aa3-730c92398b37</Uri>
Now that I know the ID (the 'key') of a specific item, I can ask for just that item by replacing "TaggedItems" with "TaggedItems[2]" in the URL hitting the data service: http://astoria.sandbox.live.com/tagspace/tagspace.rse/Tags[vista]/TaggedItems[2]
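The URL pattern in this walkthrough (an entity set, an optional key in square brackets, then a navigation to related entities) is regular enough to sketch as a tiny Python helper. The names `entity_path` and `service_url` are made up for illustration and are not part of Astoria; the sketch only reproduces the URL shapes shown in this post and makes no claim about escaping rules for keys.

```python
SERVICE_ROOT = "http://astoria.sandbox.live.com/tagspace/tagspace.rse"


def entity_path(*segments):
    """Build a path from plain names and (entity_set, key) pairs.

    "Tags"            -> Tags
    ("Tags", "vista") -> Tags[vista]
    """
    parts = []
    for segment in segments:
        if isinstance(segment, tuple):
            entity_set, key = segment
            parts.append(f"{entity_set}[{key}]")
        else:
            parts.append(segment)
    return "/".join(parts)


def service_url(*segments, root=SERVICE_ROOT):
    """Append the path to the data service root, as the post does by hand."""
    return root.rstrip("/") + "/" + entity_path(*segments)


if __name__ == "__main__":
    print(service_url("Tags"))                                # all tags
    print(service_url(("Tags", "vista"), "TaggedItems"))      # items tagged 'vista'
    print(service_url(("Tags", "vista"), ("TaggedItems", 2))) # one item by key
```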
The underlying technology that allows the data to be viewed as entities is the ADO.NET Entity Framework, so the data model you are traversing is an Entity Data Model (the EDM consists of the high-level constructs “entities” and “associations”). The cool thing is, as a consumer of the data for your app (or as a human navigating through the URLs), you don't need anything installed: you just hit the URLs and off you go. The payload returns enough information for you to start navigating the data model and the data itself, and to use the data in your apps.
There are a number of other very cool aspects to all this, and more detail is available in this .doc - Using Microsoft Codename Astoria, but to summarise just a few points here:
The data service is reachable over regular HTTP requests, and standard HTTP verbs such as GET, POST, PUT and DELETE are used to perform operations against the service.
The URL-based query syntax supports a number of operators. Examples:
eq (Equal): /Customers[City eq 'London']
ne (Not equal): /Customers[City ne 'London']
gt (Greater than): /Orders[OrderDate gt '1998-5-1']
gteq (Greater than or equal): /Orders[Freight gteq 800]
lt (Less than): /Orders[Freight lt 1]
lteq (Less than or equal): /Orders[OrderDate lteq '1999-5-4']
There are plenty more operators too. You can also create new entities using HTTP POST requests, delete entities using HTTP DELETE against the URLs, and manage associations between entities through URIs.
Powerful stuff, but one thing to point out here: the team is looking for feedback. It's actually pretty unusual to have a project reveal so much at this early stage of a technology's development, but the team was itching to get feedback as early as possible in the design process.
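As a rough illustration of the points above, here is a Python sketch that builds filter URLs with the six operators listed and prepares (but never sends) a DELETE request. Several things here are assumptions rather than documented Astoria behaviour: `ROOT` is a placeholder rather than a real endpoint, percent-encoding the spaces in the predicate is ordinary URL hygiene that I am assuming the service accepts, the key `ALFKI` is just a sample Northwind-style customer key, and `filter_url` and `delete_request` are made-up names.

```python
import urllib.request
from urllib.parse import quote

ROOT = "http://example.com/northwind.rse"  # placeholder, not a real Astoria endpoint

# Only the operators shown in this post; the service reportedly has more.
OPERATORS = {"eq", "ne", "gt", "gteq", "lt", "lteq"}


def literal(value):
    """Single-quote strings (as in the post's examples); leave numbers bare."""
    if isinstance(value, str):
        return "'" + value + "'"
    return str(value)


def filter_url(entity_set, field, op, value, root=ROOT):
    """Build e.g. .../Customers[City eq 'London'] with spaces percent-encoded."""
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    predicate = f"{field} {op} {literal(value)}"
    # Assumption: the service decodes %20 back to a space.
    return f"{root}/{entity_set}[{quote(predicate, safe=chr(39))}]"


def delete_request(entity_url):
    """Prepare an HTTP DELETE for an entity URL without sending it."""
    return urllib.request.Request(entity_url, method="DELETE")


if __name__ == "__main__":
    print(filter_url("Customers", "City", "eq", "London"))
    print(filter_url("Orders", "Freight", "gteq", 800))
    req = delete_request(ROOT + "/Customers[ALFKI]")  # hypothetical key
    print(req.get_method(), req.full_url)
```

Creating an entity would use POST in the same way, but the post does not show the request body format, so it is left out here.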
If you are at MIX07, Pablo is repeating his session on Wednesday May 2, 11:30 AM - 12:45 PM, Lando 4201.
Can't wait for the team to update the service so we can start hosting our own data up there too...
--
Update - 11:19pm
I caught up with Pablo and video'd our chat together. Also, blog reactions so far have been positive. New post here.