free literature

      ... or how a treebook becomes an e-book ...



      There are two main sources. You can buy a book and scan it yourself (second-hand bookstores and garage sales offer plenty), or you can obtain existing scans from elsewhere on the web. Gallica, the digital library of the Bibliothèque nationale de France, has about 100,000 books available and doesn't object to PG using them; the Internet Archive hosts about 20 times as many, from American and Canadian libraries among others.

      In the case of Project Gutenberg (based in the USA) this concerns books which are in the public domain (out of copyright) in the USA - which means, on the whole, books published before 1923. Extensive copyright information is available at PG.

      The scans are run through OCR (optical character recognition) and the result is saved in text format. Several rounds of proofreading follow, in which the text output is read side by side with the original. The OCR output can of course contain a lot of mistakes, depending on the quality of the scans, the age and condition of (older) books, etc. Finally, the text is massaged into a pleasantly readable format fit for online reading.
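
      As a rough illustration of that last "massaging" step, here is a minimal Python sketch. It is not PG's actual tooling: the function name `tidy_ocr_text`, the 72-column width, and the assumption that paragraphs are separated by blank lines are all made up for this example. It rejoins words that the OCR output split with a hyphen at the end of a line and rewraps each paragraph.

```python
import re
import textwrap


def tidy_ocr_text(raw: str, width: int = 72) -> str:
    """Rejoin end-of-line hyphenation and rewrap paragraphs (illustrative only)."""
    # Join a word split across lines, e.g. "opti-\ncal" becomes "optical".
    # Caveat: a genuinely hyphenated word broken at the hyphen
    # (e.g. "second-\nhand") is also joined, so a proofreader must still check.
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)

    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", joined.strip())

    # Collapse stray whitespace inside each paragraph, then rewrap it.
    wrapped = [
        textwrap.fill(" ".join(paragraph.split()), width=width)
        for paragraph in paragraphs
    ]
    return "\n\n".join(wrapped)


sample = (
    "The scans are read with opti-\ncal character   recognition.\n\n"
    "Then proof-\nreading follows."
)
print(tidy_ocr_text(sample))
```

      A script like this only takes care of mechanical clean-up; it does not replace the proofreading rounds, in which a human compares the text with the original page.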


      Added a recent translation of Homer's Odyssey below, to read online or download here.