<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: A google mystery</title>
	<atom:link href="http://dague.net/2007/11/19/a-google-mystery/feed/" rel="self" type="application/rss+xml" />
	<link>http://dague.net/2007/11/19/a-google-mystery/</link>
	<description>Various rambling thoughts from my personal corner of the internet</description>
	<pubDate>Sun, 12 Oct 2008 03:41:28 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
		<item>
		<title>By: Sean Dague</title>
		<link>http://dague.net/2007/11/19/a-google-mystery/#comment-128</link>
		<dc:creator>Sean Dague</dc:creator>
		<pubDate>Mon, 19 Nov 2007 21:39:57 +0000</pubDate>
		<guid isPermaLink="false">http://dague.net/2007/11/19/a-google-mystery/#comment-128</guid>
		<description>Right, I think we were talking past each other: the point wasn't that "it's amazing that long strings have a small number of google hits", it was "it's amazing that the summary google has for my web page (which is not in any of my content on my website) is a long string that exists nowhere else on the internet, so it can't be chalked up to some bucketing of blogs or the like".

Anyway, now I think the mystery is more or less solved.</description>
		<content:encoded><![CDATA[<p>Right, I think we were talking past each other: the point wasn&#8217;t that &#8220;it&#8217;s amazing that long strings have a small number of google hits&#8221;, it was &#8220;it&#8217;s amazing that the summary google has for my web page (which is not in any of my content on my website) is a long string that exists nowhere else on the internet, so it can&#8217;t be chalked up to some bucketing of blogs or the like&#8221;.</p>
<p>Anyway, now I think the mystery is more or less solved.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick</title>
		<link>http://dague.net/2007/11/19/a-google-mystery/#comment-127</link>
		<dc:creator>Nick</dc:creator>
		<pubDate>Mon, 19 Nov 2007 21:35:17 +0000</pubDate>
		<guid isPermaLink="false">http://dague.net/2007/11/19/a-google-mystery/#comment-127</guid>
		<description>Also, it seems to me that Google is slowly moving from the "innovation" phase to the "assimilation" phase of their business.  Next comes "domination", "stagnation", and "superannuation", at least if they follow Microsoft's lead.</description>
		<content:encoded><![CDATA[<p>Also, it seems to me that Google is slowly moving from the &#8220;innovation&#8221; phase to the &#8220;assimilation&#8221; phase of their business.  Next comes &#8220;domination&#8221;, &#8220;stagnation&#8221;, and &#8220;superannuation&#8221;, at least if they follow Microsoft&#8217;s lead.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick</title>
		<link>http://dague.net/2007/11/19/a-google-mystery/#comment-126</link>
		<dc:creator>Nick</dc:creator>
		<pubDate>Mon, 19 Nov 2007 21:29:40 +0000</pubDate>
		<guid isPermaLink="false">http://dague.net/2007/11/19/a-google-mystery/#comment-126</guid>
		<description>I was only addressing the uniqueness issue.  I had no idea where the summary string came from, but it did seem reminiscent of your old web page.</description>
		<content:encoded><![CDATA[<p>I was only addressing the uniqueness issue.  I had no idea where the summary string came from, but it did seem reminiscent of your old web page.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean Dague</title>
		<link>http://dague.net/2007/11/19/a-google-mystery/#comment-125</link>
		<dc:creator>Sean Dague</dc:creator>
		<pubDate>Mon, 19 Nov 2007 21:17:08 +0000</pubDate>
		<guid isPermaLink="false">http://dague.net/2007/11/19/a-google-mystery/#comment-125</guid>
		<description>Yes, very true.  But I think you are missing the point. :)

Google shows that specific string as my page summary in its index.  It doesn't show any content from my page there, just that summary string.  The only place that text exists on the internet is on google's summary of my page.

However, I may have just figured out the mystery.  Google seems to be joining in dmoz information to their index now.  The dmoz description that I put in place 7 years ago seems to be trumping found content on my homepage (which is somewhat bad behavior on google's part I think).  I think the change of my website was coincidental to this other issue.</description>
		<content:encoded><![CDATA[<p>Yes, very true.  But I think you are missing the point. <img src='http://dague.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Google shows that specific string as my page summary in its index.  It doesn&#8217;t show any content from my page there, just that summary string.  The only place that text exists on the internet is on google&#8217;s summary of my page.</p>
<p>However, I may have just figured out the mystery.  Google seems to be joining in dmoz information to their index now.  The dmoz description that I put in place 7 years ago seems to be trumping found content on my homepage (which is somewhat bad behavior on google&#8217;s part I think).  I think the change of my website was coincidental to this other issue.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick</title>
		<link>http://dague.net/2007/11/19/a-google-mystery/#comment-124</link>
		<dc:creator>Nick</dc:creator>
		<pubDate>Mon, 19 Nov 2007 21:13:07 +0000</pubDate>
		<guid isPermaLink="false">http://dague.net/2007/11/19/a-google-mystery/#comment-124</guid>
		<description>Furthermore, check this out:

"Includes personal information, photographs, family, and friends" -&#62; 6 hits

"Includes personal information, photographs, family" -&#62; 28 hits

"Includes personal information, photographs" -&#62; 230 hits

"Includes personal information" -&#62; 63,500 hits

"Includes personal" -&#62; 597,000 hits

"Includes" -&#62; 627,000,000 hits

Try graphing those data points and you'll get something like a logarithmic curve where x is the number of words, and y is the number of hits.</description>
		<content:encoded><![CDATA[<p>Furthermore, check this out:</p>
<p>&#8220;Includes personal information, photographs, family, and friends&#8221; -&gt; 6 hits</p>
<p>&#8220;Includes personal information, photographs, family&#8221; -&gt; 28 hits</p>
<p>&#8220;Includes personal information, photographs&#8221; -&gt; 230 hits</p>
<p>&#8220;Includes personal information&#8221; -&gt; 63,500 hits</p>
<p>&#8220;Includes personal&#8221; -&gt; 597,000 hits</p>
<p>&#8220;Includes&#8221; -&gt; 627,000,000 hits</p>
<p>Try graphing those data points and you&#8217;ll get something like a logarithmic curve where x is the number of words, and y is the number of hits.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick</title>
		<link>http://dague.net/2007/11/19/a-google-mystery/#comment-123</link>
		<dc:creator>Nick</dc:creator>
		<pubDate>Mon, 19 Nov 2007 21:00:54 +0000</pubDate>
		<guid isPermaLink="false">http://dague.net/2007/11/19/a-google-mystery/#comment-123</guid>
		<description>If the set of attributes was large enough, then it's plausible that you were the only one assigned this particular subset of attributes in this particular order.  Add to that the possibility that the algorithm was modified very recently, and until now was not producing that particular subset of attributes in that particular order.

I tried doing a search for "Includes personal information, friends, photographs, and snowboarding." and I got only 3 results, all of which refer to a Chris Davy, who is on the same page as you are.

Same for "Includes personal information and poetry."  4 results, all for Dan Dempsey.

If you tack enough words together, the probability that google has indexed another page with the exact same combination of words decreases dramatically.</description>
		<content:encoded><![CDATA[<p>If the set of attributes was large enough, then it&#8217;s plausible that you were the only one assigned this particular subset of attributes in this particular order.  Add to that the possibility that the algorithm was modified very recently, and until now was not producing that particular subset of attributes in that particular order.</p>
<p>I tried doing a search for &#8220;Includes personal information, friends, photographs, and snowboarding.&#8221; and I got only 3 results, all of which refer to a Chris Davy, who is on the same page as you are.</p>
<p>Same for &#8220;Includes personal information and poetry.&#8221;  4 results, all for Dan Dempsey.</p>
<p>If you tack enough words together, the probability that google has indexed another page with the exact same combination of words decreases dramatically.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean Dague</title>
		<link>http://dague.net/2007/11/19/a-google-mystery/#comment-122</link>
		<dc:creator>Sean Dague</dc:creator>
		<pubDate>Mon, 19 Nov 2007 16:43:43 +0000</pubDate>
		<guid isPermaLink="false">http://dague.net/2007/11/19/a-google-mystery/#comment-122</guid>
		<description>Well, searching for the exact string is interesting for the following reason: the only place the string exists is in google's index of my content, and the only place it is found is in things which index the google hits for my site.

So the google description of my website is a self-reinforcing piece of metadata that google invented and has applied only to my website.</description>
		<content:encoded><![CDATA[<p>Well, searching for the exact string is interesting for the following reason: the only place the string exists is in google&#8217;s index of my content, and the only place it is found is in things which index the google hits for my site.</p>
<p>So the google description of my website is a self-reinforcing piece of metadata that google invented and has applied only to my website.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick</title>
		<link>http://dague.net/2007/11/19/a-google-mystery/#comment-121</link>
		<dc:creator>Nick</dc:creator>
		<pubDate>Mon, 19 Nov 2007 16:10:33 +0000</pubDate>
		<guid isPermaLink="false">http://dague.net/2007/11/19/a-google-mystery/#comment-121</guid>
		<description>Try the search without the quotes.  It looks to me like you were assigned a subset of a larger set of attributes which happened to be unique to your page.  Other entries in the google directory are very similar, but none of them are exactly the same.  I don't know if wordpress or google assigned you those attributes, but it was probably based on some algorithm looking for keywords, links and images that assigns these attributes.</description>
		<content:encoded><![CDATA[<p>Try the search without the quotes.  It looks to me like you were assigned a subset of a larger set of attributes which happened to be unique to your page.  Other entries in the google directory are very similar, but none of them are exactly the same.  I don&#8217;t know if wordpress or google assigned you those attributes, but it was probably based on some algorithm looking for keywords, links and images that assigns these attributes.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
