Full Version: Extensive encyclopedia article
badlydrawnjeff
I can't do screengrabs where I am, so if someone wants to get photographic evidence, please do so.

Do yourself a favor - search "Hugo Chavez" in Google. When I do that at the time of the posting, this is what comes up:

QUOTE
Hugo Chávez - Wikipedia, the free encyclopedia - 9:45am
Hyperlinked encyclopedia article about the President of Venezuela.
en.wikipedia.org/wiki/Hugo_Chávez - 299k - Cached - Similar pages - Note this


That's odd. Instead of text, we get...this.

So I decide to look up "George W. Bush," and get this:

QUOTE
George W. Bush - Wikipedia, the free encyclopedia
Open-source encyclopedia article provides personal, business and political information about the President, his policies, and public perceptions and ...
en.wikipedia.org/wiki/George_W._Bush - 319k - Cached - Similar pages - Note this


Interesting. Now I'm intrigued. Is this a new thing for all BLPs?:

QUOTE
Kevin Spacey - Wikipedia, the free encyclopedia
Kevin Spacey (born July 26, 1959) is an American actor (film and stage) and director. Spacey grew up in California, and began his career as a stage actor ...
en.wikipedia.org/wiki/Kevin_Spacey - 69k - Cached - Similar pages - Note this


...nope. Hm. Maybe just all controversial figures?

QUOTE
Mahmoud Ahmadinejad - Wikipedia, the free encyclopedia
Hyperlinked encyclopedia article about the President of the Islamic Republic of Iran.
en.wikipedia.org/wiki/Mahmoud_Ahmadinejad - 259k - Cached - Similar pages - Note this

Mahmoud Abbas - Wikipedia, the free encyclopedia
Hyperlinked encyclopedia article provides the Fatah politician's personal and political biography.
en.wikipedia.org/wiki/Mahmoud_Abbas - 82k - Cached - Similar pages - Note this

Ann Coulter - Wikipedia, the free encyclopedia
Extensive encyclopedia article about the conservative commentator and author includes her biography, positions on issues, quotes and books.
en.wikipedia.org/wiki/Ann_Coulter - 178k - Cached - Similar pages - Note this

Michael Moore - Wikipedia, the free encyclopedia
Another 1999 series, Michael Moore Live, was aired in the UK only on Channel 4, though it was broadcast from New York. This show had a similar format to The ...
en.wikipedia.org/wiki/Michael_Moore - 88k - Cached - Similar pages - Note this (bdj - huh?)

Hillary Rodham Clinton - Wikipedia, the free encyclopedia
[72]) Hillary Clinton made culturally dismissive remarks about Tammy .... Other investigations took place during Hillary Clinton's time as First Lady. ...
en.wikipedia.org/wiki/Hillary_Clinton - 361k - Cached - Similar pages - Note this

Don Imus - Wikipedia, the free encyclopedia
John Donald "Don" Imus, Jr. (born July 23, 1940 [1]) is an American humorist, philanthropist, writer, radio and television talk show host in the mould of a ...
en.wikipedia.org/wiki/Don_Imus - 115k - Cached - Similar pages - Note this


So it's not consistent, either. A quick search in the Hugo Chavez source code doesn't appear to show anything new in the metadata. I'm not sure if there are some new templates being put on articles since I'm not involved, but they're too specific to work well anyway, no?

So, what's the deal?






Firsfron of Ronchester
QUOTE(badlydrawnjeff @ Fri 28th September 2007, 5:55pm) *

So, what's the deal?


I had wondered this last year, when [[Stegosaurus]] became a Featured Article. There was suddenly a new Google description ("Looks at the characteristics of this dinosaur, and the relationship between the three different species.") But like you said, it's not consistent: other Featured Articles don't have them. I assumed it was the developers adding a search meta tag, like this one <link rel="search" type="application/opensearchdescription+xml" href="/w/opensearch_desc.php" title="Wikipedia (English)" />
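For anyone who wants to repeat the "check the page source" step, here is a minimal sketch, assuming you have already saved the article's HTML to a string. It lists every `<meta>` and `<link>` tag and reports whether a description meta tag is present. The names `HeadTagCollector`, `collect_head_tags`, `find_description` and `SAMPLE_HEAD` are made up for illustration; the only tag in the sample is the OpenSearch link quoted above.

```python
from html.parser import HTMLParser

# Hypothetical sample: a page head containing only the OpenSearch link quoted above.
SAMPLE_HEAD = """<html><head>
<title>Stegosaurus - Wikipedia, the free encyclopedia</title>
<link rel="search" type="application/opensearchdescription+xml" href="/w/opensearch_desc.php" title="Wikipedia (English)" />
</head><body></body></html>"""


class HeadTagCollector(HTMLParser):
    """Collects the attributes of every <meta> and <link> tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here by default.
        if tag in ("meta", "link"):
            self.tags.append((tag, dict(attrs)))


def collect_head_tags(html):
    """Return a list of (tag, attributes) pairs for all meta and link tags."""
    collector = HeadTagCollector()
    collector.feed(html)
    return collector.tags


def find_description(html):
    """Return the content of a <meta name="description"> tag, or None if absent."""
    for tag, attrs in collect_head_tags(html):
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            return attrs.get("content")
    return None
```

If `find_description` returns `None` for a page whose search result still shows a tidy summary, the summary is presumably coming from somewhere other than the page itself.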
Somey
Damn, these people are clever, aren't they?

You're right, those words don't appear anywhere in the page source for those articles, but they appear on Ask.com, not just Google:

http://www.ask.com/web?q=%22extensive+ency...c=121&o=0&l=dir

But on Yahoo!, the word "extensive" isn't used:

With quotes - no Yahoo results:
http://search.yahoo.com/search;_ylt=A0geu6...=Search&fr=moz2

But without quotes...
http://search.yahoo.com/search?p=extensive...0&pstart=1&b=21

And this Google search, with quotes, returns only the WP article on Burkina Faso:

http://www.google.com/search?hl=en&safe=of...org&btnG=Search

So while it certainly sounds like some sort of weird conspiracy theory, there's definitely something unusual going on here, and someone/something on Wikipedia is doing it. Somehow they're recognizing search engine crawlers and feeding them different content for selected pages! I mean, the proof is right there, on all these major search engines - did they expect nobody would notice? (Kudos to you though, Jeff, for being the first...)

To be fair, this is actually something we've been asking them to do for almost two years, but we were sort of hoping they'd make it available to anyone with a valid reason who asked - not just controversial ultra-right-wing conservatives and a handful of African countries.

Jeff, would you mind if I changed the thread title to "Extensive encyclopedia article" and moved the original title to the subtitle? That way, when people search Google for that phrase, they'll be more likely to find this thread.
MrM
QUOTE(Somey @ Fri 28th September 2007, 3:30pm) *

Somehow they're recognizing search engine crawlers and feeding them different content for selected pages!


If that were the case, wouldn't the different content show up in the google cache? It must be some metadata somewhere that only some of the search crawlers use.

Daniel Brandt
My understanding is that Google grabs the snippet from the Open Directory Project, and if it cannot find a snippet there, then it looks for a topic sentence or relevant text in the page itself. I do see these snippets at ODP. Now the question is, which army of volunteer editors at ODP decided to edit a bunch of descriptions for Wikipedia articles?
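
The lookup order described above can be sketched roughly as follows. This is only an illustration of that understanding, not Google's actual algorithm: the function name `choose_snippet`, the `directory_descriptions` dictionary standing in for ODP data, and the 150-character window are all assumptions.

```python
def choose_snippet(url, query, page_text, directory_descriptions, width=150):
    """Pick a result snippet: directory description first, page text second.

    directory_descriptions is a hypothetical dict mapping a URL to the
    description a directory editor wrote for it.
    """
    description = directory_descriptions.get(url)
    if description:
        # An editor-written description exists, so use it as-is.
        return description

    # Fallback: show a window of page text around the first query term found.
    lowered = page_text.lower()
    for term in query.lower().split():
        pos = lowered.find(term)
        if pos != -1:
            start = max(0, pos - width // 2)
            end = min(len(page_text), start + width)
            return page_text[start:end].strip()

    # No query term appears in the text: fall back to the opening of the page.
    return page_text[:width].strip()
```

Under this sketch, an article with a directory entry always gets the same summary regardless of the query, while one without an entry gets whatever stretch of text happens to match, which would fit the mix of results quoted at the top of the thread.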

It's not like you can become an admin on ODP after a few thousand edits, and start grinding lesser editors into dust just for the sheer fun of it, like you can on Wikipedia. Or maybe there's something I don't know?
Somey
QUOTE(MrM @ Fri 28th September 2007, 3:01pm) *
If that were the case, wouldn't the different content show up in the google cache? It must be some metadata somewhere that only some of the search crawlers use.

Hmm... True, it isn't in the cached versions, not for Ask.com either. So what are we saying, then? That multiple search engines are cooperatively grabbing page-description data from a completely different content source? Operated by... who, exactly?

I guess like Firsfron sorta suggested, it could be someplace like OpenSearch.org, or a site that's affiliated with them (OpenSearch is owned by Amazon.com, so it's not quite as "open" as they'd probably like you to believe). So if (like Daniel says) it's coming from dmoz.org, that would explain the consistency between search engines... but is that better or worse than having Wikipedia be responsible for it?
Firsfron of Ronchester
QUOTE(Somey @ Fri 28th September 2007, 8:10pm) *

I guess like Firsfron sorta suggested, it could be someplace like OpenSearch.org, or a site that's affiliated with them (OpenSearch is owned by Amazon.com, so it's not quite as "open" as they'd probably like you to believe). So if (like Daniel says) it's coming from dmoz.org, that would explain the consistency between search engines... but is that better or worse than having Wikipedia be responsible for it?


Sorry if my theory is/was off-base. The display is definitely not new, though. I noticed it back in November 2006, and it could be older than that. 2006 was just when I first started noticing the descriptions.
Somey
QUOTE(Daniel Brandt @ Fri 28th September 2007, 3:06pm) *
Now the question is, which army of volunteer editors at ODP decided to edit a bunch of descriptions for Wikipedia articles?

Apparently the right-wing ones, because while there are entries for Bill O'Reilly, Rush Limbaugh, and Ann Coulter, there aren't any for their less-conservative counterparts such as Bill Moyers and Al Franken. No entry for Jon Stewart, Christopher Hitchens, or Noam Chomsky either, though Chomsky has one for the French Wikipedia...

Try it yourselves, folks:

http://search.dmoz.org/advanced_search.html


QUOTE(Firsfron of Ronchester @ Fri 28th September 2007, 3:22pm) *
Sorry if my theory is/was off-base. The display is definitely not new, though. I noticed it back in November 2006, and it could be older than that. 2006 was just when I first started noticing the descriptions.

OK - I, in turn, am sorry I crossed that out... it's always confusing when something like this first gets noticed. And Jeff might as well just be chopped liver anyway - he's used to being treated like that!

It just looks like the issue here is that someone at dmoz.org is engaging in preferential treatment for a certain class of articles and topics... Since WP results are almost always at the top of every Google search, and are nearly always contextual, this is actually a very valuable propaganda tool for whoever can manage to control it, or even just influence it. Most people who know much about DMOZ have probably tended to think of it as relatively unbiased, but it appears they've been infiltrated.
badlydrawnjeff
I only noticed it for the first time today, and I don't remember reading about it elsewhere - I would have made a bit of a stink about it if I had found out at the same time as Firsfron.

Somey, change the title at will.

Daniel Brandt:
QUOTE
My understanding is that Google grabs the snippet from the Open Directory Project, and if it cannot find a snippet there, then it looks for a topic sentence or relevant text in the page itself.


This may make sense - I don't know the first thing about the Open Directory Project, but the "relevant text" part fits what Google ends up showing for Hillary Rodham Clinton and Michael Moore: the closest matches to the name in the article text.

Odd, regardless.

ETA: Here's the bizarre part, though, and maybe some of you have looked into this further: why only Wikipedia? 5 Google pages in and nothing else gets this sort of tag. Why would dmoz affect Wikipedia like this, and to what benefit?