Content comes in many ‘flavors’: it can be text, images, audio/video, Flash, PDF or various other formats. It can be housed in a database or in individual files. If you want your website to be indexed by the search engines so that they can send you free traffic, it should go heavy on the text. Images are OK so long as they include captions, and the image tag in your HTML should have both the ‘alt’ and ‘title’ attributes filled in, preferably with different text in each (saying essentially the same thing in different words).
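As a rough illustration of the image advice above, here is a minimal Python sketch that builds the HTML for a captioned image and refuses identical ‘alt’ and ‘title’ text. The function name `image_markup` and the sample file name are made up for this example, and wrapping the image in `<figure>`/`<figcaption>` is just one reasonable way to attach a caption.

```python
from html import escape


def image_markup(src: str, alt: str, title: str, caption: str) -> str:
    """Return HTML for a captioned image with distinct alt and title text.

    Hypothetical helper for illustration only.
    """
    # The two attributes should say the same thing in different words.
    if alt.strip().lower() == title.strip().lower():
        raise ValueError("alt and title should use different wording")
    return (
        "<figure>"
        f'<img src="{escape(src, quote=True)}" '
        f'alt="{escape(alt, quote=True)}" '
        f'title="{escape(title, quote=True)}">'
        f"<figcaption>{escape(caption)}</figcaption>"
        "</figure>"
    )


# Example call with placeholder values.
example_html = image_markup(
    "sedan.jpg",
    "Blue four-door sedan parked on a street",
    "A blue sedan seen from the front",
    "Our test car on delivery day",
)
```

The escaping step matters if your captions come out of a database, since a stray quote mark or angle bracket would otherwise break the tag.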
Audio/video (a/v) formats are not indexed by search engines. Flash is not indexed. PDF is indexed by Google, but some of the smaller search engines cannot read these files. Also, many users will see that your document is a PDF and click on the ‘view as html’ link Google provides. They never actually visit your site, and the rendered version is usually far inferior to the formatted PDF file, giving visitors a poor (and false) impression of your site.
Podcasts and similar a/v presentations may be easier for some people to produce, and they are popular for some sites, but you miss out on a lot of free search engine traffic, and need to make up for that with other means of promotion. Ideally you should have a complete transcript of the audio, or better yet that plus a verbal description of the video portion, but that can be a tremendous amount of work to produce. At the very minimum, if you want to include a/v material, include a good description of the content.
So in most cases, we will be using primarily text material for our content, or graphics with text descriptions. Where do you get that material? I have already mentioned public domain content in the post Free Content. Finding public domain material that has not already been published online is one way to use this material without running into duplicate content problems.
Writing all of your own material is time-consuming, but it doesn’t cost you money, so for the budget-conscious beginner it may be the preferred choice. Just be sure to double-check your spelling and grammar. It does not have to be great prose, worthy of a Hemingway, but you need to be able to get your point across to your audience.
Another way to get content is to pay for written articles. There are a few unscrupulous sorts who try to fob off copied material as their own after changing just a few words here and there, so be careful who you deal with.
Creating a database is one of the best ways to produce content. Select some aspect of your subject area and build a database by collecting facts: not copyrighted material like descriptions or reviews, but things like names, statistics, feature lists and options. If you have an automobile site, for example, you can create a database by make, model and year, showing which options and engine sizes were available. Always try to put more in your database than you need for your site. That way, when you produce a related site in the future, you can extract a different part of that database for the new site without duplicating the information on your original site.
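To make the automobile example concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the column names and the sample rows are all assumptions invented for illustration (the makes and models are placeholders, not real data); your own database would hold whatever facts suit your subject.

```python
import sqlite3


def build_car_db(rows):
    """Create an in-memory database of cars (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE cars (
            make          TEXT    NOT NULL,
            model         TEXT    NOT NULL,
            year          INTEGER NOT NULL,
            engine_litres REAL,
            option_name   TEXT
        )
        """
    )
    # One row per (car, option) pair keeps the sketch simple.
    conn.executemany("INSERT INTO cars VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def options_for(conn, make, model, year):
    """List the options recorded for one make, model and year."""
    cur = conn.execute(
        "SELECT DISTINCT option_name FROM cars "
        "WHERE make = ? AND model = ? AND year = ? "
        "ORDER BY option_name",
        (make, model, year),
    )
    return [row[0] for row in cur]


# Placeholder rows, not real vehicle data.
sample_rows = [
    ("ExampleMake", "ModelA", 2001, 2.0, "sunroof"),
    ("ExampleMake", "ModelA", 2001, 2.0, "alloy wheels"),
    ("ExampleMake", "ModelB", 2002, 3.0, "towbar"),
]
car_db = build_car_db(sample_rows)
model_a_options = options_for(car_db, "ExampleMake", "ModelA", 2001)
```

Storing extra columns you do not show yet (engine size here) is what lets you pull a different slice of the same data for a later site.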
You can also buy databases full of content. Since they are for sale, they will not be unique, so you should not use one database as the sole basis for a website, but it is a great way to supplement other material. By selecting a few fields from one database and a few others from a related database, you can create a unique mix that looks different enough from what others do with the same data that you won’t get penalized for duplicate content. And by selecting a subset of the records from a database, you can randomly rotate some material, giving your site an appearance of freshness. Remember, you are building a Web Empire, not just a website.
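The two tricks just described, mixing fields from two databases and rotating a random subset of records, might look something like the following Python sketch. The function names, field names and sample records are hypothetical, and the records are plain dictionaries standing in for rows from whatever databases you actually bought.

```python
import random


def merge_fields(primary, secondary, key, primary_fields, secondary_fields):
    """Combine chosen fields from two record lists that share a key."""
    lookup = {row[key]: row for row in secondary}
    merged = []
    for row in primary:
        other = lookup.get(row[key])
        if other is None:
            continue  # skip records missing from the second source
        record = {key: row[key]}
        record.update({f: row[f] for f in primary_fields})
        record.update({f: other[f] for f in secondary_fields})
        merged.append(record)
    return merged


def rotate_subset(records, count, seed):
    """Pick a repeatable random subset; change the seed to rotate it."""
    rng = random.Random(seed)
    return rng.sample(records, min(count, len(records)))


# Placeholder records for illustration only.
db_one = [
    {"id": 1, "name": "Item A", "weight": 10, "color": "red"},
    {"id": 2, "name": "Item B", "weight": 12, "color": "blue"},
    {"id": 3, "name": "Item C", "weight": 15, "color": "green"},
]
db_two = [
    {"id": 1, "price": 100, "rating": 4},
    {"id": 2, "price": 120, "rating": 5},
]
mixed = merge_fields(db_one, db_two, "id", ["name", "color"], ["price"])
todays_pick = rotate_subset(mixed, 1, seed="2024-01-01")
```

Seeding with something like today's date means every visitor sees the same selection on a given day, while the selection can change from one day to the next.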