Site maps, sitemaps, image search, 404 pages
One of the first things you want to do when you restructure your site or add a new piece of quality content is add it to your site maps. You don’t need to do this with every blog post, but you should list every main page in your website that you would like your visitors to be able to find. Including pages in a site map doesn’t guarantee that the search engines will index them, but it helps.
There are two major types of site maps, and you need both:
- An HTML site map is a standard, public-facing web page intended to help real people find content on your site.
- An XML sitemap is a file written in Extensible Markup Language (XML) specifically for search engines, not for human visitors. It usually isn’t linked where people would come across it, and most people wouldn’t find the markup useful even if they opened the file.
Note the different spellings: HTML site maps use two words, “site map.” XML sitemaps use one word, “sitemap.”
HTML Site Maps
HTML site maps are intended to make your site easier for visitors to use. People navigate the web in different ways, so you want to give them multiple ways to find your content. Some people like the discovery process of going through menus and following links, but some of your visitors will be very task-focused and will want to go directly to the content they care about. A site map lets this second group find specific pages without having to go through menus or other navigation. Since HTML site maps are treated like any other web page on your site, they’re also a good way for search engines to discover more of your pages. In general:
- If your site has fewer than 200 pages, you can include everything in one site map
- If your site has more than 200 pages, consider a main site map that has the top level pages and links to category-specific site maps
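The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the `plan_site_maps` function, the `(category, title, url)` tuples, and the `/site-map/...` paths are made up for the example, and the sketch treats a site with exactly 200 pages as small enough for a single site map. It also lists only the category links on the main site map, leaving out the top-level pages for brevity.

```python
from collections import defaultdict

PAGE_LIMIT = 200  # threshold from the guideline above


def plan_site_maps(pages):
    """Decide which HTML site map pages to build.

    pages: list of (category, title, url) tuples (a hypothetical input shape).
    Returns a dict mapping a site map name to its list of (title, url) links.
    """
    if len(pages) <= PAGE_LIMIT:
        # Small site: everything fits on a single site map.
        return {"site-map": [(title, url) for _, title, url in pages]}

    # Larger site: group pages by category.
    by_category = defaultdict(list)
    for category, title, url in pages:
        by_category[category].append((title, url))

    # The main site map links to one site map per category.
    plan = {
        "site-map": [(cat, f"/site-map/{cat}/") for cat in sorted(by_category)]
    }
    for cat, entries in by_category.items():
        plan[f"site-map/{cat}"] = entries
    return plan
```

Each entry in the returned plan can then be fed to whatever template you use to render the actual HTML page.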
Your task-oriented visitors aren’t there for pretty graphics, they don’t want to read paragraphs of text, and they don’t want to spend a lot of time trying to figure out what link they should click. Give them a nice, clean layout with logical headings, subheadings, and page names. Apple does a really good job with their site map. It’s organized into major categories, such as iPod.
An XML sitemap is a file created specifically for search engines. Having one won’t guarantee the search engines will index all of the pages in your site and won’t influence how the pages rank, but it is the quickest and easiest way for search engines to learn about your pages and give you the best chance to get them indexed.
Since sitemaps are intended for search engines instead of human visitors, there is a specific format for the file that you need to follow. The sitemap must:
- Begin with an opening <urlset> tag and end with a closing </urlset> tag
- Specify the namespace (protocol standard) within the <urlset> tag
- Include a <url> entry for each URL, as a parent XML tag
- Include a <loc> child entry for each <url> parent tag
- Contain only URLs from a single host, such as www.example.com or store.example.com
All other tags are optional.
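To make the format concrete, here is a minimal sketch that builds a sitemap containing only the required tags, using Python’s standard library. The function name `build_sitemap` and the example.com URLs are placeholders for illustration; the namespace value is the one published in the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace from the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) for a list of page URLs."""
    # All URLs in one sitemap must share a single host.
    hosts = {urlsplit(u).netloc for u in urls}
    if len(hosts) > 1:
        raise ValueError(f"URLs span more than one host: {sorted(hosts)}")

    # <urlset> is the root element and carries the namespace.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page_url in urls:
        url = ET.SubElement(urlset, "url")          # one <url> parent per page
        ET.SubElement(url, "loc").text = page_url   # required <loc> child

    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/about",
    ]))
```

Special characters such as `&` in a URL are escaped automatically when the XML is serialized, which is one reason to generate the file with an XML library rather than by pasting strings together.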
You can see examples and all of the options at: http://www.sitemaps.org/protocol.html
Once your XML sitemap is ready, upload it to your site and submit it in Google Webmaster Tools (since renamed Google Search Console) by going to Optimization > Sitemaps and choosing Add/Test Sitemap.
Once you have a standard XML sitemap, you can expand it to include your images. This will help search engines such as Google discover the images you have on your site, so those images can be included in image search. Like the standard XML sitemap, there is a specific format to follow for image search and the details can be found at: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=178636.
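As a rough sketch of what that expansion looks like, the Python below adds image entries to a standard sitemap. It assumes Google’s image extension namespace and its `image:image` and `image:loc` tags; the function name `build_image_sitemap` and the example.com URLs are placeholders. Check the page linked above for the full list of supported tags.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Google's image sitemap extension namespace.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize elements in the image namespace with the "image:" prefix.
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages: dict mapping a page URL to the list of image URLs on that page."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            # Each image gets an <image:image> block inside its page's <url>.
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url

    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_image_sitemap({
        "http://www.example.com/gallery": [
            "http://www.example.com/images/photo1.jpg",
            "http://www.example.com/images/photo2.jpg",
        ],
    }))
```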
A video sitemap is another addition to your XML sitemap. It’s the most complicated of the sitemaps, but if you have video content you would like people to discover in search engines, it’s worth taking the time to expand your XML sitemap to include your video content. Video content includes web pages which embed video, URLs to players for video, or the URLs of raw video content hosted on your site.
Each URL entry must contain the following information:
- Landing page URL
- Thumbnail URL
- Raw video file location and/or the player URL (SWF)
You can learn more about video sitemaps at http://support.google.com/webmasters/bin/answer.py?hl=en&answer=80472
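Here is a hedged sketch of a video entry, again in Python. It assumes Google’s video extension namespace and its `video:thumbnail_loc`, `video:content_loc`, and `video:player_loc` tags. Google’s documentation also asks for a title and description, so the sketch writes `video:title` and `video:description` when they are supplied. The function name, dictionary keys, and example.com URLs are invented for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Google's video sitemap extension namespace.
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"

ET.register_namespace("video", VIDEO_NS)

# Child tags written inside <video:video>, in this order, when a value exists.
VIDEO_FIELDS = ("thumbnail_loc", "title", "description", "content_loc", "player_loc")


def build_video_sitemap(videos):
    """videos: list of dicts with 'landing_page', 'thumbnail_loc', and at
    least one of 'content_loc' (raw file) or 'player_loc' (player URL)."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for info in videos:
        if not info.get("thumbnail_loc"):
            raise ValueError("each video needs a thumbnail_loc")
        if not (info.get("content_loc") or info.get("player_loc")):
            raise ValueError("each video needs content_loc and/or player_loc")

        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = info["landing_page"]  # landing page URL
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        for field in VIDEO_FIELDS:
            if info.get(field):
                ET.SubElement(video, f"{{{VIDEO_NS}}}{field}").text = info[field]

    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_video_sitemap([{
        "landing_page": "http://www.example.com/videos/intro",
        "thumbnail_loc": "http://www.example.com/thumbs/intro.jpg",
        "title": "Product introduction",
        "description": "A two-minute overview of the product.",
        "content_loc": "http://www.example.com/media/intro.mp4",
    }]))
```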
No matter how careful you are, people will occasionally get to a page on your website that doesn’t work. Perhaps you moved a page and forgot to set up a redirect, or maybe you mistyped an internal link when you were building a page. Sometimes it is completely out of your control; perhaps the visitor mistyped a URL, or another website linked to a page on your site that doesn’t exist. By default, a web server returns an error message with some version of “404 Page Not Found” when this happens. By now, most people have learned that this means something went wrong and they’ll just move on, but there is a better way than an impersonal, unsympathetic generic page.
When you create a custom 404 page, there are a few things you want to keep in mind:
- The page is there to help people understand where they are if they get lost.
- Something went wrong and your visitor ended up on this page instead of the page they were expecting. Make sure they have a positive experience and use this as an opportunity to reinforce your brand.
- Provide links back to the main domain, ensuring users and search engines can access other pages on your site.
- Link to key content:
  - Important/popular pages
  - Contact information
  - Sign up for a demo
  - Link to your site map
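As one possible illustration of that checklist, the sketch below renders a simple custom 404 page with Python’s standard library. The link labels and paths in `KEY_LINKS`, the page wording, and the names `render_404` and `NotFoundHandler` are all assumptions made for the example. The handler answers every request with the 404 page just to keep the sketch short; on a real site, only requests that match no page would reach it. Note that the response still carries a 404 status code, so search engines don’t mistake the error page for real content.

```python
from html import escape
from http.server import BaseHTTPRequestHandler

# Hypothetical key links for the 404 page: (label, path).
KEY_LINKS = [
    ("Home", "/"),
    ("Popular pages", "/popular"),
    ("Contact us", "/contact"),
    ("Sign up for a demo", "/demo"),
    ("Site map", "/site-map"),
]


def render_404(requested_path):
    """Return the HTML for a friendly 404 page with links to key content."""
    links = "\n".join(
        f'    <li><a href="{escape(href)}">{escape(label)}</a></li>'
        for label, href in KEY_LINKS
    )
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>Page not found</title></head><body>\n"
        "  <h1>Sorry, we couldn't find that page</h1>\n"
        f"  <p>Nothing lives at <code>{escape(requested_path)}</code>. "
        "These links may help:</p>\n"
        f"  <ul>\n{links}\n  </ul>\n"
        "</body></html>\n"
    )


class NotFoundHandler(BaseHTTPRequestHandler):
    """Serves the custom page while keeping the 404 status code."""

    def do_GET(self):
        body = render_404(self.path).encode("utf-8")
        self.send_response(404)  # keep the error status; do not send 200
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To try it locally:
#   from http.server import HTTPServer
#   HTTPServer(("", 8000), NotFoundHandler).serve_forever()
```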
There are a number of great examples of custom 404 pages, from the basic glorified site map to the humorous; just for fun, search for “custom 404 page examples”. If you want to see what a specific website is doing for its 404 page, go to the website and try to access a page you know won’t exist. For example, if you want to see Apple.com’s 404 page, you could go to http://www.apple.com/asdf.
Posts so far in this series:
- Part 5: SEO tactics including navigation and canonical URLs
- Epilogue: SEO Best Practices That No Longer Work