Best Source for Public Domain Material
Your Web Empire should include at least one ‘anchor’ site with really great content you create yourself. But you just will not have time to write content for all of your sites, especially when you have several dozen of them. Sometimes you really need content that will not require much work on your part.
Public domain material fits that bill. I don’t know how many websites consist of material from the Gutenberg collection of texts, but I am sure the number is substantial, as that is one of the oldest and best-known collections of public domain texts. For that very reason, it is not a great source: any content you take from there is very likely to be categorized as duplicate content.
There have been a couple of much-publicized efforts to scan books, most notably those by Microsoft and Google. These books are available as PDF image scans, but they have also been run through OCR software, so the texts are available too. Depending on the quality of the scanned image, some of these OCR texts are surprisingly accurate. The best way to access these books, as well as some from other projects (including Gutenberg), is to use the Internet Archive text collection.
This collection is huge enough, and growing fast enough, that even Google has not fully indexed it. Use the search function to find texts in the subject area relevant to your website, then download a few in the text format and see if the OCR scan is accurate. If it is, choose a line of 40 or 50 characters from the text and search for it (within quotes) on Google. Often you will get ‘no results found’, which indicates the text is not in the Google database. Run the text through a spell-check to catch any OCR errors, then add it to your site. When Google indexes it, yours can become the authority site for that text.
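If you are checking more than a handful of texts, picking the test line by hand gets tedious. Here is a rough Python sketch of that step, not a finished tool. The function names (candidate_phrases, quoted_query, search_url) are made up for this example, and the search URL is just the ordinary Google results address with the quoted phrase attached. It only builds the queries; you still open them in a browser and look at the results yourself.

```python
from urllib.parse import quote_plus


def candidate_phrases(text, min_len=40, max_len=50, limit=3):
    """Return up to `limit` snippets of min_len..max_len characters.

    Each snippet is the start of a line, cut at a word boundary.
    Lines containing a double quote are skipped so the snippet can be
    wrapped in quotes for an exact-phrase search.
    """
    phrases = []
    for line in text.splitlines():
        if '"' in line:
            continue
        snippet = ""
        for word in line.split():
            trial = f"{snippet} {word}".strip()
            if len(trial) > max_len:
                break
            snippet = trial
        if len(snippet) >= min_len:
            phrases.append(snippet)
            if len(phrases) == limit:
                break
    return phrases


def quoted_query(phrase):
    """Wrap the phrase in double quotes for an exact-phrase search."""
    return f'"{phrase}"'


def search_url(phrase):
    """Build a Google search URL for the quoted phrase (hypothetical helper)."""
    return "https://www.google.com/search?q=" + quote_plus(quoted_query(phrase))


if __name__ == "__main__":
    # Assumes you saved the downloaded text as book.txt (a placeholder name).
    with open("book.txt", encoding="utf-8", errors="replace") as handle:
        for phrase in candidate_phrases(handle.read()):
            print(search_url(phrase))
```

Checking two or three lines from different parts of the book is safer than checking one, since a single line may happen to contain an OCR error that hides an existing copy.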

There are also images, sound recordings, and video clips in the Internet Archive database. Just use the drop-down menu to select other formats, and search for sound or images to supplement your textual material. You can build an entire site around public domain material that has not been indexed in the search engines!