Thursday, January 15, 2009

Amazon Offers Public Data Sets

David Linthicum of Intelligent Enterprise reports the following --

It was bound to happen sooner or later -- Amazon is now in the live data business with the recent launch of Public Data Sets on AWS (Amazon Web Services). In short:

"Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. An initial list of data sets is already available, and more will be added soon."

In essence, most data out there in the public domain, including information provided by the government, is now available, for free, from Amazon. This is clearly the company's entrance into the data business, with other data sets to appear shortly. Perhaps there will be address validation, mapping, and other information that can be delivered through Web APIs to mashups, portals, ecommerce sites, or even traditional enterprise applications. The idea is that information you could otherwise find on the Web visually becomes available through an API, for machine-to-machine integration.

This is nothing new, by the way; there have been a number of startups that have been providing information-as-a-service through Web APIs (typically Web services) for some time now, including much of the information that AWS is looking to provide with this initial offering. However, they have been charging for that information while Amazon [is not] when it's for use within their own cloud. You can count on Amazon offering these services, perhaps for a fee, to those who want access to the systems outside of their cloud.

AWS is not, however, offering high-value services of the kind offered by D&B or credit check services. They will have to charge for those, since those source providers do not give that information away. The end game for Amazon will be to offer as much useful information as it can get its hands on, thus increasing the value of the cloud services and APIs.

This is a different game for Amazon. It's more than providing infrastructure-as-a-service, such as databases and application processing clouds, but they are not maintaining the underlying data (that's a different type of business to be in, trust me). However, it could be more lucrative for the company in the long run, and among the larger players, Amazon is the first to broker this type of information as-a-service. You can count on Google, Force.com, and perhaps Microsoft, to follow up with such offerings. My bet is that Google will be second out of the gate, since they are already API-rich.

1 comment:

Anonymous said...

Ernie, nice write-up. It's only a matter of time before Google (as you pointed out) gets into this arena as well with their own collection of public data sets.

The only hesitation I would have, as a developer, about using these data sets is the possibility of becoming reliant on them while they are free and then having to pay for access to them in the future.

I also, for some reason, get a bad feeling about anyone trying to capitalize on "public information". I realize that any fees would be related to the service, but it still worries me.
