Full Version: Your two rights on Wikipedia
thekohser
I've heard it said that users have two rights on Wikipedia:

1. The right to "fork" the database
2. The right to leave the project

So, I'm thinking about forking the English Wikipedia. How exactly does one go about doing that? I thought that the Wikimedia Foundation had given up about 18 months ago on trying to produce regularly available data dumps of the entire project, presumably because their servers were choking on the process.

Is it now incumbent on a forker of the mother database (the "mother forker") to execute the entire process from "outside" Wikipedia?

And another question -- how might one fork the Simple English Wikipedia, which has a much more manageable 25,704 articles?

Greg
Nathan
Find the data dumps then import them into your database?
thekohser
QUOTE(Nathan @ Mon 18th February 2008, 2:43pm) *

Find the data dumps then import them into your database?


Sure, but where are these elusive data dumps? I thought the last stable, successful one was back at the end of 2006!?

Greg
GlassBeadGame
QUOTE(thekohser @ Mon 18th February 2008, 2:53pm) *

QUOTE(Nathan @ Mon 18th February 2008, 2:43pm) *

Find the data dumps then import them into your database?


Sure, but where are these elusive data dumps? I thought the last stable, successful one was back at the end of 2006!?

Greg


I don't think there is any way to execute a dump from outside, as you would need sufficient permissions on the database, so you would have to rely on existing publicly available dumps. It would still be an interesting project even if the dump was rather old. After all, it's not like the project is improving anymore. You could insist on IRL identities of editors, respect experts, treat businesses with respect, exercise editorial restraint and implement BLP reform. I think the approach would be like marble sculpture: cut away everything that doesn't look like an encyclopedia. You would have a much better product within a year, even with only a modest number of committed editors.
gomi
This is one of the big lies of Wikipedia -- that you can fork it. There have been successful backups during 2007 -- as recently as December, but they get removed as soon as they are complete. There is a very small window in which to pick one up. Wordbomb has some, but I think they are old.

EternalIdealist
The misconception that database dumps are somehow rare or difficult to come by is one of the most persistent falsehoods. People really should bother to Google. :rolleyes:

Wikimedia db dumps

Wikipedia's page about acquiring the db
Somey
Yeah! I even took a photo of one, just the other day:

FORUM Image


I'm not sure how you'd fork something like that, though. Maybe a pitchfork...
thekohser
QUOTE(EternalIdealist @ Wed 20th February 2008, 12:33am) *

The misconception that database dumps are somehow rare or difficult to come by is one of the most persistent falsehoods. People really should bother to Google. :rolleyes:

Wikimedia db dumps

Wikipedia's page about acquiring the db


LOL. Try clicking the English Wikipedia HTML dump. (Doesn't work.)

Try grabbing the XML dump of just the most recent edited pages of the English Wikipedia. (Doesn't work.)

So, you were rolling your eyes, because...?
Pumpkin Muffins
QUOTE(thekohser @ Mon 18th February 2008, 7:05pm) *

I've heard it said that users have two rights on Wikipedia:

1. The right to "fork" the database
2. The right to leave the project

So, I'm thinking about forking the English Wikipedia. How exactly does one go about doing that? I thought that the Wikimedia Foundation had given up about 18 months ago on trying to produce regularly available data dumps of the entire project, presumably because their servers were choking on the process.

Is it now incumbent on a forker of the mother database (the "mother forker") to execute the entire process from "outside" Wikipedia?

And another question -- how might one fork the Simple English Wikipedia, which has a much more manageable 25,704 articles?

Greg


To fork, you'd want "All pages, current versions only", not "All pages with complete edit history". The latter is the one that crashes all the time before completing.



thekohser
QUOTE(Pumpkin Muffins @ Wed 20th February 2008, 1:19am) *

To fork, you'd want "All pages, current versions only", not "All pages with complete edit history". The latter is the one that crashes all the time before completing.


Pumpkin, I realize that. Show me where I can get a working copy of the 6 GB file of "All pages, current versions only". Please!
Pumpkin Muffins
QUOTE(thekohser @ Wed 20th February 2008, 6:24am) *

QUOTE(Pumpkin Muffins @ Wed 20th February 2008, 1:19am) *

To fork, you'd want "All pages, current versions only", not "All pages with complete edit history". The latter is the one that crashes all the time before completing.


Pumpkin, I realize that. Show me where I can get a working copy of the 6 GB file of "All pages, current versions only". Please!


Here? Or here? ... I don't know if these files are functional, though. The XML dumps need to be converted.
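
On the point that the XML dumps need to be converted: a minimal Python sketch of the first half of that job, assuming the dump is a bzip2-compressed MediaWiki XML export with page, title and text elements. The names iter_pages and iter_dump are made up for illustration, and this only reads pages out of the file; loading them into a database is a separate step not shown here.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree adds to namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(fileobj):
    """Stream (title, text) pairs from an open XML export, one page at a time.

    Assumes each <page> holds a <title> and at least one <text>; if a page
    carries several revisions, the last <text> seen wins.
    """
    for _event, elem in ET.iterparse(fileobj, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # release the page's children once they have been read


def iter_dump(path):
    """Open a .bz2 dump (hypothetical path) and stream its pages."""
    with bz2.open(path, "rb") as f:
        yield from iter_pages(f)
```

Something like `for title, text in iter_dump("pages-articles.xml.bz2"): ...` would then walk the dump one page at a time rather than loading the whole file into memory; the filename there is a placeholder, not a real download.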
Nathan
QUOTE(thekohser @ Wed 20th February 2008, 12:52am) *

QUOTE(EternalIdealist @ Wed 20th February 2008, 12:33am) *

The misconception that database dumps are somehow rare or difficult to come by is one of the most persistent falsehoods. People really should bother to Google. :rolleyes:

Wikimedia db dumps

Wikipedia's page about acquiring the db


LOL. Try clicking the English Wikipedia HTML dump. (Doesn't work.)

Try grabbing the XML dump of just the most recent edited pages of the English Wikipedia. (Doesn't work.)

So, you were rolling your eyes, because...?


There's a dump right there, though. Oops, not what you want.
dtobias
When you gotta take a dump, you gotta take a dump!

To the tune of the William Tell Overture / Lone Ranger theme:

Take a dump, take a dump, take a dump dump dump
Take a dump, take a dump, take a dump dump dump
Take a dump, take a dump, take a dump dump dump
Every day, take a dump dump dump!

Error59
Dtobias - you may enjoy The Diarrhea Song :happy:
thekohser
Reminds me of a song a co-worker of mine would sing from the Men's room when I worked in a carpet warehouse as a teenager --

Stranded! Stranded! Stranded on the bathroom bowl...

What do you do, when you just had a poo...

And you gotta have a roll?!
JohnA
I assume that Wikipedia has told you to go fork yourself?

Greg, this is probably what you want: http://download.wikimedia.org/enwiki/20080...rticles.xml.bz2
thekohser
QUOTE(JohnA @ Wed 20th February 2008, 9:38am) *

I assume that Wikipedia has told you to go fork yourself?

Greg, this is probably what you want: http://download.wikimedia.org/enwiki/20080...rticles.xml.bz2


Perhaps. We'll see -- I'm 66% downloaded now.
JohnA
Now that you've got it, what are you going to do with it?
thekohser
QUOTE(JohnA @ Fri 22nd February 2008, 3:11am) *

Now that you've got it, what are you going to do with it?


Stay tuned. I'm assembling a strategy team and will likely be incorporating, either with or without venture capital. We've already discussed what I might do with it, elsewhere on here.

To discuss any more would just be sabotaging my own first-mover advantage.

Greg