<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: &#8220;Any any any old data&#8221;</title>
	<atom:link href="http://blog.paulwalk.net/2008/10/07/any-any-any-old-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.paulwalk.net/2008/10/07/any-any-any-old-data/</link>
	<description></description>
	<lastBuildDate>Thu, 14 Jan 2010 08:36:39 -0600</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Owen Stephens</title>
		<link>http://blog.paulwalk.net/2008/10/07/any-any-any-old-data/comment-page-1/#comment-249</link>
		<dc:creator>Owen Stephens</dc:creator>
		<pubDate>Thu, 30 Oct 2008 16:45:35 +0000</pubDate>
		<guid isPermaLink="false">http://blog.paulwalk.net/?p=113#comment-249</guid>
		<description>I&#039;ve done my own blog post on this, but I feel that actually EC2/S3 etc. are really &#039;on the other side of the cloud&#039; - they aren&#039;t really &#039;of it&#039;. I agree there is an issue of provenance, but I actually think that for S3 etc. provenance is just as important for a service as it is for data.

I suspect there is something lurking in both ideas about being &#039;distributed&#039; - which seems to me to be linked to the idea of &#039;remotely hosted&#039; but goes beyond this.

My understanding is that the Internet was designed in such a way as not to have a single point of failure - you could always use an alternative route. I think this is an essential point of the idea of the internet as a commoditised network. When we talk about Cloud computing I can&#039;t see that this is true - so Cloud computing is not analogous to the network Cloud. I think the idea of a &#039;data cloud&#039; makes more sense in this context - it has more similarities to what is being called cloud computing than the idea of the network cloud.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve done my own blog post on this, but I feel that actually EC2/S3 etc. are really &#8216;on the other side of the cloud&#8217; &#8211; they aren&#8217;t really &#8216;of it&#8217;. I agree there is an issue of provenance, but I actually think that for S3 etc. provenance is just as important for a service as it is for data.</p>
<p>I suspect there is something lurking in both ideas about being &#8216;distributed&#8217; &#8211; which seems to me to be linked to the idea of &#8216;remotely hosted&#8217; but goes beyond this.</p>
<p>My understanding is that the Internet was designed in such a way as not to have a single point of failure &#8211; you could always use an alternative route. I think this is an essential point of the idea of the internet as a commoditised network. When we talk about Cloud computing I can&#8217;t see that this is true &#8211; so Cloud computing is not analogous to the network Cloud. I think the idea of a &#8216;data cloud&#8217; makes more sense in this context &#8211; it has more similarities to what is being called cloud computing than the idea of the network cloud.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chris Rusbridge</title>
		<link>http://blog.paulwalk.net/2008/10/07/any-any-any-old-data/comment-page-1/#comment-248</link>
		<dc:creator>Chris Rusbridge</dc:creator>
		<pubDate>Wed, 08 Oct 2008 01:12:26 +0000</pubDate>
		<guid isPermaLink="false">http://blog.paulwalk.net/?p=113#comment-248</guid>
		<description>Paul, maybe you are saying that the cloud weakens provenance? I followed through on the connected data sources image on Paul&#039;s post, and then linked to Geonames. In their example, &quot;[2] http://sws.geonames.org/3020251/about.rdf&quot; stood for a document about the particular place. This document could be anything; it could be from Wikipedia, from a tourist brochure, from the town council... and it&#039;s easy to see that the document would be different in each case. The provenance would help me distinguish and decide how much to trust.

It&#039;s perhaps a little different with real data... do I care who measured the temperature at station 435 yesterday noon, as long as someone did? Well, for scientific purposes, I guess one would care, so the provenance is important. But for many ordinary purposes I think one would not care much, unless one found reasons to distrust those data; then chasing up the provenance would be an issue. (A colleague told me once he was annoyed at how much the BBC&#039;s weather forecasts for his home town differed from the Met Office forecast on which they were supposedly based... after parallel correspondences with someone in the Met Office and someone in the BBC, it became clear that the BBC&#039;s parameters for identifying the town or otherwise selecting the data for the forecast were wrong; in other words the BBC had been passing off a forecast from somewhere else for years... apocryphal, alleged, etc).

There are also different ways of things being &quot;in the cloud&quot;. Some services are hosted by EC2/S3 and so are &quot;in the cloud&quot;, but have perfectly &quot;real-looking&quot; URIs. Nothing wrong with doing the same thing for data.

I&#039;ve argued with Lorcan sometimes that &quot;in the cloud&quot; is not too meaningful, ditto &quot;moving to the network level&quot;. There&#039;s real hardware, real servers, real OS&#039;s providing these services. What the hell difference does it make from my hardware and servers? Maybe the only thing is, I don&#039;t have to be bothered about some things that I used to have to worry about.

So, not knowing whether I&#039;m agreeing with you or not, maybe the thing we want is data, as authoritative as we need, with provenance available, clearly identified. But _where_ it is? Am I bovvered?</description>
		<content:encoded><![CDATA[<p>Paul, maybe you are saying that the cloud weakens provenance? I followed through on the connected data sources image on Paul&#8217;s post, and then linked to Geonames. In their example, &#8220;[2] <a href="http://sws.geonames.org/3020251/about.rdf" rel="nofollow">http://sws.geonames.org/3020251/about.rdf</a>&#8221; stood for a document about the particular place. This document could be anything; it could be from Wikipedia, from a tourist brochure, from the town council&#8230; and it&#8217;s easy to see that the document would be different in each case. The provenance would help me distinguish and decide how much to trust.</p>
<p>It&#8217;s perhaps a little different with real data&#8230; do I care who measured the temperature at station 435 yesterday noon, as long as someone did? Well, for scientific purposes, I guess one would care, so the provenance is important. But for many ordinary purposes I think one would not care much, unless one found reasons to distrust those data; then chasing up the provenance would be an issue. (A colleague told me once he was annoyed at how much the BBC&#8217;s weather forecasts for his home town differed from the Met Office forecast on which they were supposedly based&#8230; after parallel correspondences with someone in the Met Office and someone in the BBC, it became clear that the BBC&#8217;s parameters for identifying the town or otherwise selecting the data for the forecast were wrong; in other words the BBC had been passing off a forecast from somewhere else for years&#8230; apocryphal, alleged, etc).</p>
<p>There are also different ways of things being &#8220;in the cloud&#8221;. Some services are hosted by EC2/S3 and so are &#8220;in the cloud&#8221;, but have perfectly &#8220;real-looking&#8221; URIs. Nothing wrong with doing the same thing for data.</p>
<p>I&#8217;ve argued with Lorcan sometimes that &#8220;in the cloud&#8221; is not too meaningful, ditto &#8220;moving to the network level&#8221;. There&#8217;s real hardware, real servers, real OS&#8217;s providing these services. What the hell difference does it make from my hardware and servers? Maybe the only thing is, I don&#8217;t have to be bothered about some things that I used to have to worry about.</p>
<p>So, not knowing whether I&#8217;m agreeing with you or not, maybe the thing we want is data, as authoritative as we need, with provenance available, clearly identified. But _where_ it is? Am I bovvered?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
