Earlier this morning Google launched its new Google Sitemaps program. It allows you to submit your site through an XML feed, similar to the Trusted Feed program with Yahoo! (yes, I know it has been renamed, but I am “old school” so it stays as “Trusted Feed”!). Now, Google has always stated it would never charge for inclusion in its index, and it is living up to its word. More Info.
GoogleGuy has posted his thoughts on the Google Blog, in a Q&A session, and in a discussion on Sitemaps.
Summary: Webmasters create XML files containing the URLs they want crawled, along with optional hints about those URLs, such as when a page last changed and how often it changes. They host the Sitemap on their own server and tell Google where it is. Google will provide an open-source tool called Sitemap Generator to assist in this process. Eventually, Google is hoping web servers will natively support the protocol so there are no extra steps for webmasters. When a Sitemap changes, Google will support auto-notification so it can pick up the newest version.
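To make the idea concrete, here is a rough sketch in Python of what generating one of these Sitemap files might look like. The URLs are placeholders, and the schema namespace is an assumption on my part (check Google's documentation for the exact version it expects); the `lastmod` and `changefreq` entries are the optional hints described above.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the Sitemap protocol; the exact version Google
# expects may differ from this -- verify against the official docs.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"

def build_sitemap(pages):
    """Build a Sitemap XML string from (url, lastmod, changefreq) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc                # the URL itself
        ET.SubElement(url, "lastmod").text = lastmod        # optional hint
        ET.SubElement(url, "changefreq").text = changefreq  # optional hint
    return ET.tostring(urlset, encoding="unicode")

# Placeholder URLs for illustration only.
sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2005-06-03", "daily"),
    ("http://www.example.com/about.html", "2005-01-01", "monthly"),
])
print(sitemap_xml)
```

You would save the resulting file somewhere on your server and then point Google at its location.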
Why Google is Doing This: Google wants to index all publicly available information so it can offer better search results. However, web crawling is currently limited. Crawlers don’t know all the pages at a website (e.g., dynamic pages), when those pages change, how often to recrawl them, or how much load to put on a site, so they have to guess. Google wants to work collaboratively with webmasters to get a big picture of all the URLs it should be crawling, and how often they should be recrawled. Ultimately this benefits Google’s users by increasing the coverage and freshness of the index.
Michael Nguyen has a great breakdown of the XML Feed.