MetaTags
Metatags are tags that contain commands to the gathering programs of search engines that try to index your page, or commands and information to a browser that retrieves your page. They are not displayed on the page but you can use them to control how your site behaves and how it's indexed.
Metatags go in the header: between the tags <HEAD> and </HEAD>. Their general form is:
<META name="type" content="text">
For instance: <META name="author" content="Judy Koren">
Put them after the title tag. HTML doesn't require this, but search engines that don't understand metatags expect to find <TITLE> as the first element in the header and may not read the title correctly if it isn't.
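As a minimal sketch of the placement described above, a header might look like the following. The title text and the body line are made up for illustration, and the metatag uses the name/content form that the entries in the next section follow.

```html
<HTML>
<HEAD>
<!-- The title comes first, so engines that ignore metatags still find it -->
<TITLE>Nairobi University Online</TITLE>
<!-- Metatags follow the title, still inside the header -->
<META name="author" content="Judy Koren">
</HEAD>
<BODY>
<P>Visible page text goes here; the metatag above is not displayed.</P>
</BODY>
</HTML>
```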
What MetaTags are there?
Here's a list; the words in the "content" field are obviously only examples. I don't claim it's complete, but it contains the ones you're likely to need.
- <META name="Keywords" content="college newspaper current-events news Africa">
Gives a search engine keywords to index the page by, even if they don't appear in the text of the page itself. Most search engines will weight words in the keywords field as more important than words in the full text.
Don't include a keyword more than once. People used to include keywords multiple times in this field in order to raise the ranking of their document for that keyword and get it placed higher in the search results. It worked for a time and then the search engine companies caught on. Today a search engine typically refuses to index your page at all if you "spam" it with the same keyword more than once in this field.
- <META name="Description" content="Nairobi University Online is the electronic version of the university's student newspaper. It's published weekly and includes local campus news and information">
Gives a search engine an "abstract" which the engine provides together with the title of the page in search results. If you don't have a description field, and you haven't manually provided an abstract when submitting your page to a particular search engine, the search results contain the first 25 or 30 words of the text, which often include several separate elements (main heading, contents list and text) and don't make sense when read together.
- <META name="Robots" content="index, nofollow">
Tells the search engine to index this page but not to follow its links and index the pages they lead to. Conversely, "noindex" tells it not to index this page, and "follow" tells it to index the pages linked to.
- <META name="Revisit-after" content="45 days">
Tells the search engine how often you think it's worth re-indexing the page. Most search engines keep to their own schedule, but some take your view into consideration on this point.
- <META name="Rating" content="General">
I have never actually seen this used, but perhaps that's because I don't often visit sites with content="Adult"... This is one of the things the government just might decide to require at some point.
- <META name="Generator" content="FrontPage Express 2.0">
Most HTML editors put this in automatically. I consider it free advertising and take it out if I can be bothered. But some people maintain that it helps the browser to interpret the HTML code correctly if it knows what editor produced it. I personally think that if the editor produces code that is so non-standard it can't be interpreted properly by the rules of HTML, you should find a better editor.
- <META name="Author" content="Judy Koren">
That pretty well explains itself. I haven't seen a search engine actually do much with this, but some will let you search for authors of pages.
- <META name="Copyright" content="Copyright ©1998 Unesco">
I always put this in the text of the page, which you should do anyway, but putting it in a metatag as well lets a search engine index the information as a specific field.
- <META name="Expires" content="30 June 1998">
Tells the search engine to delete the page from its index after that date.
- <META http-equiv="Content-Type" content="text/html; charset=windows-1255">
This is becoming more necessary as HTML tries to address the problem of multiple alphabets. It tells the browser what character set the page was written for. For a Web page the components up to "charset=" are invariant (the content type is always text/html); only the name of the character set varies. There's a semicolon after "text/html", and you can put a space after it if that's easier on your eyes. There is no quotation mark after "charset=": the quoted value opens before "text/html" and closes only at the end of the whole string, just before the concluding >.
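Putting several of the tags above together, a complete header for the example newspaper might be sketched as follows. The selection of tags, the title wording and the "index, follow" setting are illustrative assumptions, not a recommended set; the character set is simply carried over from the entry above.

```html
<HEAD>
<TITLE>Nairobi University Online</TITLE>
<!-- Character set: no quotation mark after charset=, one closing quote at the end -->
<META http-equiv="Content-Type" content="text/html; charset=windows-1255">
<!-- Each keyword appears once only -->
<META name="Keywords" content="college newspaper current-events news Africa">
<META name="Description" content="Nairobi University Online is the electronic version of the university's student newspaper. It's published weekly and includes local campus news and information">
<!-- index, follow: index this page and the pages it links to -->
<META name="Robots" content="index, follow">
<META name="Author" content="Judy Koren">
<META name="Copyright" content="Copyright &copy;1998 Unesco">
</HEAD>
```

The &copy; entity stands in for the literal © so that the tag does not depend on the page's character set; that substitution is an assumption of this sketch, not something the entries above require.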
Written by J. Koren for Unesco
©1998