Proposal talk:Make Wikisource scale

This is only a draft. I hope that Wikisourcerors will improve it, and that we can discuss the best way forward for Wikisource. Nemo 10:08, 3 September 2009 (UTC)

Work with Project Gutenberg

Project Gutenberg, and particularly Project Gutenberg Distributed Proofreaders (see PGDP.net), have set themselves the task of proofreading public domain texts and making them available as plain text for others to use.

I think we should work with them, not try to compete with them. I suggest we concentrate on wikifying the Gutenberg texts: taking their output and adding HTML markup, formatting, and links to create a version that is more usable online. They specialise in one part of the job, we in another.

Where collaboration grows, that is great. In some places our collaborators may choose to have joint events, or even a common organisation. That is for later. Does that make sense? Filceolaire 10:23, 3 September 2009 (UTC)

I think it makes sense, but I don't have a firm opinion on that. In Italian we have a similar project, called LiberLiber, which offers users PD texts in several formats (.txt, .pdf, .rtf). They have been transcribing books for 15 years and have a big database now. I agree on working on collaboration instead of competition, but I want to raise two of the biggest issues:
  1. they don't have images, and we need those to use the Proofreading Extension and to be trustworthy.
  2. sometimes I don't see the point of formatting a text that already exists in another project.
I know that this is important, because wiki means open: it means interlinking with Wikipedia, in-text citations and quotes linked as wikilinks in Wikisource, categorizing, etc.
But I think we should work a lot in the direction of automatically converting texts from another project to Wikisource.
We should make the task of converting a book quick and easy, instead of boring and complicated. Why don't we work on the OpenOffice export to wiki syntax? That could be a starting point. --Aubrey 14:42, 3 September 2009 (UTC)
Filceolaire, this is point 3 of the proposal: if we find that they have a better system, then we can [improve and] use it. Nemo 18:09, 3 September 2009 (UTC)
We have never been in competition with Project Gutenberg: we work on a parallel line, with different tools and methods. And we have already imported a lot of works from Gutenberg. There are some issues with texts on Gutenberg, mainly in languages other than English:
  1. Formatting is very poor, and not even consistent across all works;
  2. Support for alphabets other than Latin is almost nonexistent, so this is obviously a problem for most languages other than English;
  3. The Gutenberg interface is English only. In fact the whole project is focused on only one language, English.
Yann 18:21, 10 September 2009 (UTC)

Impact?

Some proposals will have massive impact on end-users, including non-editors. Some will have minimal impact. What will be the impact of this proposal on our end-users? -- Philippe 21:28, 3 September 2009 (UTC)

First point of "motivation". :-) Nemo 09:33, 4 September 2009 (UTC)

DJVU and OCR

Wikisource needs an efficient and reliable tool for converting scans to DJVU and doing OCR, similar to Any2djvu. Any2djvu itself only works for English, is slow, and is often overloaded.

But what Wikisource mostly needs is manpower: without a massive increase in contributors, no tool will make the project scale. Yann 18:24, 10 September 2009 (UTC)

Do you think that simpler and more effective tools would help the project to attract more contributors? Does Project Gutenberg/Distributed Proofreaders suggest so? Nemo 14:05, 12 September 2009 (UTC)