Full Version: Please help me get Google-Watch deleted
Daniel Brandt
The Wikipedia article about Google Watch is up for deletion. Please give me a hand and vote to delete. It was bad five years ago, and then when I tried to improve it in October 2005 — the same month that SlimVirgin and I began wrestling over the biographical stub she started on me — I was instantly slapped down by Slim, Jimbo, User2004 (Will_Beback), and Eloquence (Erik Moeller).

It's still bad, and I'm still mad.
Error59
It isn't going to be deleted. The Big Question here - who is Manhattan Samurai a sock of?
EricBarbour
Incompetence, no. More of their childish anti-Brandt-war, yes.

I smell stinky, stinky socks.

Daniel, I'd be happy to log in and vote to delete, but unfortunately I'd be tempted to start ranting at these freaks. Sorry.

Just print the AFD discussion out, and save it. Let this one pass. Consider it a long-term legal strategy. If they keep it up (and that seems likely), eventually you'll have a fat file folder that a defamation attorney might find amusing to read.
Daniel Brandt
QUOTE(EricBarbour @ Sun 12th October 2008, 8:24pm) *

Incompetence, no. More of their childish anti-Brandt-war, yes.

I smell stinky, stinky socks.

Daniel, I'd be happy to log in and vote to delete, but unfortunately I'd be tempted to start ranting at these freaks. Sorry.

Just print the AFD discussion out, and save it. Let this one pass. Consider it a long-term legal strategy. If they keep it up (and that seems likely), eventually you'll have a fat file folder that a defamation attorney might find amusing to read.

You're right, Eric. My mistake. I thought there was some room for negotiation between me and the Hive. I was wrong, and now I feel foolish.

I've told Google that they can get back into google-watch.org and wikipedia-watch.org again. According to their webmaster guidelines, it should happen within 3 to 5 days.

You cannot reason with a swarm of bees. It's war to the death if you hope to retain any self-respect.
Daniel Brandt
QUOTE(Daniel Brandt @ Sun 12th October 2008, 11:34pm) *

I've told Google that they can get back into google-watch.org and wikipedia-watch.org again. According to their webmaster guidelines, it should happen within 3 to 5 days.

Holy smoke, that was fast. It was more like 5 hours.
Newyorkbrad
QUOTE(Daniel Brandt @ Mon 13th October 2008, 9:46am) *

QUOTE(Daniel Brandt @ Sun 12th October 2008, 11:34pm) *

I've told Google that they can get back into google-watch.org and wikipedia-watch.org again. According to their webmaster guidelines, it should happen within 3 to 5 days.

Holy smoke, that was fast. It was more like 5 hours.

I saw a recent post on The Volokh Conspiracy that a test page created there had shown up on Google within an hour after posting.

How much time elapses between when a new Wikipedia page is created and when it is picked up on Google and/or on Wikipedia mirror sites? This would be important information for the ongoing flagged revisions discussions and related matters.
Somey
QUOTE(Newyorkbrad @ Mon 13th October 2008, 2:40pm) *
How much time elapses between when a new Wikipedia page is created and when it is picked up on Google and/or on Wikipedia mirror sites? This would be important information for the ongoing flagged revisions discussions and related matters.

I think that would depend on how many existing pages link to the new page, as well as just sheer chance. It doesn't surprise me that some new WP pages would be indexed within an hour, if there were a lot of red links that changed to blue with the page's creation, but it does surprise me that Wikipedia-Watch would show up within 5 hours. It could just be a fluke, but maybe more people link to that than we thought.

For the record, I think I mentioned in one of the private forums that while I do think Wikipedia should delete the Google-Watch article, I personally wouldn't want to be among the ones calling for significant changes in their so-called "deletion policy" that would give the same opt-out rights to non-human entities that humans should have as a matter of course. Sorry if that's overly equivocal... Obviously this is putting aside the fact that WP doesn't even give human beings any sort of opt-out rights now. To me, of course, that makes Wikipedia anti-human.
Daniel Brandt
We're talking apples and oranges here. I used Google's "Webmaster Tools" interface to remove both wikipedia-watch.org and google-watch.org. All you need is a gmail account, and you need access to your site's root directory. You fix your robots.txt to exclude Google if it's the whole site you want deleted, and then while you're in the Webmaster Tool after logging into your Google account, you are given a dummy filename with a long string of encrypted numbers. You stick this in your root directory, and Google instantly checks you off as "verified" by looking to see if that file is there.
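
For anyone who wants to see the robots.txt step concretely, here is a minimal sketch in Python. It only checks whether a given robots.txt would shut Googlebot out of a whole site; it does not touch the Webmaster Tools verification file or the removal request. The two robots.txt bodies, the `can_crawl` helper, and the example.org URL are assumptions made up for illustration, not copies of anything on the sites discussed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts only Google's crawler out of the whole site.
BLOCK_GOOGLE = """\
User-agent: Googlebot
Disallow: /
"""

# Hypothetical robots.txt that lets every crawler in
# (an empty Disallow value means nothing is disallowed).
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/some-page.html"  # placeholder URL
    print(can_crawl(BLOCK_GOOGLE, "Googlebot", page))
    print(can_crawl(BLOCK_GOOGLE, "OtherBot", page))
    print(can_crawl(ALLOW_ALL, "Googlebot", page))
```

Swapping the first file for the second corresponds to restoring robots.txt to allow crawling again.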

Once verified, you can remove individual pages or the entire site. It takes only a few hours before the individual pages or the entire site is out of Google.

You can use the same tool to reinclude that content. That page says "3-5 business days" for reinclusion, which is why I was surprised that it happened in about 5 hours.

This is completely different than crawling pages that haven't been indexed by Google. After the few hours it took for Google to completely remove those two sites, I restored the robots.txt to allow crawling. Google continued to crawl, even though both sites were out of the index.

When I requested reinclusion, everything came back exactly as it was before the removal. The PageRanks popped up from zero to their previous values, and all the rankings were exactly the same for all the search terms I tried to look at, on any and all of the pages on those two sites.

This is a back-end filter operation that I just described. It's not a "let's look for new pages and index them, and make them available, and see where they rank from day to day in competition with the rest of the web."

It was fun and educational.

I have noticed that Google's crawl frequency of the User_talk pages in Wikipedia is rather slow. While they all seem to have the "noindex,follow" in them, including in the archived User_talk pages, a huge number are still in Google. If you look at Google's cache copy of the ones still available, you will see that some were last crawled as long as two or three months ago. My guess is that the User_talk pages are on an entirely different crawl schedule than the article mainspace pages. Certainly a lot different than the crawling of wikipediareview.com, which is very fast. Most blogs and forums are crawled almost constantly, it seems. Static sites are crawled less frequently.

What I think the Wikimedia Foundation should do is get one of the Google fanboys on staff to study the Webmaster Tool situation, and become something of an expert. Maybe even ask Matt Cutts if there's a way to submit stuff in a bulk file to get various URLs that fit some pattern instantly deleted. It might be asking too much of Google to expect them to provide this capability, but it wouldn't hurt to ask.
UseOnceAndDestroy
QUOTE(Newyorkbrad @ Mon 13th October 2008, 8:40pm) *
How much time elapses between when a new Wikipedia page is created and when it is picked up on Google

For a site as big and rapidly updated as Wikipedia - fast. Google appears to note update frequency and react accordingly. (WP appears to get extra-special treatment).

Random sample:

6 hours old
Indexed 5 hours ago

QUOTE
and/or on Wikipedia mirror sites?

Whenever the bloke running the site feels like scraping Wikipedia.
Newyorkbrad
QUOTE(UseOnceAndDestroy @ Mon 13th October 2008, 6:46pm) *

QUOTE(Newyorkbrad @ Mon 13th October 2008, 8:40pm) *
How much time elapses between when a new Wikipedia page is created and when it is picked up on Google

For a site as big and rapidly updated as Wikipedia - fast. Google appears to note update frequency and react accordingly. (WP appears to get extra-special treatment).

Random sample:

6 hours old
Indexed 5 hours ago

QUOTE
and/or on Wikipedia mirror sites?

Whenever the bloke running the site feels like scraping Wikipedia.

And conversely, when a page is deleted, how long does it take to disappear from Google? Quite a bit longer, I might imagine, but am I right? This has serious implications that I want to think through and address. (See also my Wikipedia talkpage.)
Daniel Brandt
QUOTE(Newyorkbrad @ Mon 13th October 2008, 7:09pm) *


And conversely, when a page is deleted, how long does it take to disappear from Google? Quite a bit longer, I might imagine, but am I right? This has serious implications that I want to think through and address. (See also my Wikipedia talkpage.)

Yes, it takes longer to get a Wikipedia page out of Google.

In my Wikipedia experience, it takes Google about 5 to 7 days to find a mainspace article that suddenly has a "noindex,nofollow" in the deletion-page headers. This causes Google to drop the page and its cache copy rather quickly once it sees it, but it simply isn't checking back more than once every few days.
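
A rough way to check what a crawler would see on such a page is to read the robots meta tag out of the HTML. The sketch below uses only Python's standard library; the class name, the `robots_directives` helper, and the sample HTML are assumptions for illustration and are not taken from Wikipedia's actual page source.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex,nofollow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives present in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    # Made-up sample page standing in for a deleted article.
    sample = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
    print(sorted(robots_directives(sample)))
```

The tag only takes effect once the crawler fetches the page again, which is why the revisit interval matters so much.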

It takes a lot longer to check back on User-space pages, because Google assigns them a lower priority. It could be two or three months before Google checks back and re-indexes those pages, and reads the meta headers for possible new instructions.

Google is much quicker at finding new pages than it is at checking back on old pages to see if anything has changed. Their priorities are related to what is required to keep their search engine competitive. You gotta have the latest buzz before anyone else, because otherwise you aren't "kool."

As I explained above, if Wikipedia was organized to use the tools made available to webmasters by Google, you could get a page out of Google within about three hours. This is available to any webmaster.

In very serious cases of legal threats, one would think that Mike Godwin would not only blank the page or edit out the material and lock it down, but also use the Google tool to delete it from Google within hours, instead of days. It would also be good public relations to be this quick.

Then again, I rather doubt that Godwin is even aware of how Google works and what his options might be.

EricBarbour
QUOTE(Daniel Brandt @ Mon 13th October 2008, 7:29pm) *

In very serious cases of legal threats, one would think that Mike Godwin would not only blank the page or edit out the material and lock it down, but also use the Google tool to delete it from Google within hours, instead of days. It would also be good public relations to be this quick.
Then again, I rather doubt that Godwin is even aware of how Google works and what his options might be.


It seems to me that, Google aside, we have NO IDEA how many pending lawsuits are currently being fought off by Mr. Godwin and the WMF staff. I've never seen a WM news page discussing their current legal status, have you?

All I can find is this.
http://meta.wikimedia.org/wiki/Category:Legal

Just like a corporation, they are hiding any business-related legal problems from the attention of their contributors and investors.
Proabivouac
QUOTE(Error59 @ Sun 12th October 2008, 6:24am) *

It isn't going to be deleted. The Big Question here - who is Manhattan Samurai a sock of?

"Manhattan Samurai"…

http://en.wikipedia.org/w/index.php?title=...nhattan+Samurai

…is the same user as "BillDeanCarter" and "Pat Hobby", at least:

http://en.wikipedia.org/w/index.php?title=...=BillDeanCarter
http://en.wikipedia.org/w/index.php?title=...arget=Pat_Hobby

I'm told he has several dozen throwaway accounts as well.
Daniel Brandt
Hey boys and girls and whatever, help me out here. There is still time to vote!

Image
GlassBeadGame
QUOTE(Daniel Brandt @ Wed 15th October 2008, 8:47pm) *

Hey boys and girls and whatever, help me out here. There is still time to vote!

Image


A perfectly serviceable demand for retraction, Daniel.