Stephen Sumner

What is an XML Sitemap and How Should You Optimise It

You can build the most technically elaborate website using just about any platform or codebase available. It can have the best layout, beautiful images and clear and concise articles. But how is anyone going to find it?

When you launch your website, it isn’t good enough to assume “If you build it, they will come,” because they won’t, unless you have some tools in the background to let the search engine spiders know your site is up and ready for crawling, indexation and ranking.

Yes, you can share your content across social media, but unless you already have a large following, that’s not nearly enough if you want thousands of visitors stopping by each day. When people search Google for something in particular, they won’t find you unless your site is listed in the search results.

You need a sitemap, but not just any sitemap. You need an XML sitemap for search engine optimisation. Adding an XML sitemap is just one of the ever-growing list of tasks you need to perform to win at SEO these days.

WHAT IS A SITEMAP?

Firstly, an XML sitemap, or what some people refer to as a “Google sitemap,” is a different beast from a navigational sitemap, or HTML sitemap as we call it in SEO circles. An HTML sitemap is a directory that you put on your website or blog for your human visitors. It lists the various sections of your site to help those visitors navigate their way around. You’ve probably stumbled across one: on many sites they are linked from the footer navigation.

An XML sitemap, however, is used by search engine robots, spiders or crawlers to read your site’s URLs, site architecture and the metadata contained in your site. You don’t post an XML sitemap to your site for your human visitors to see (except for people like me). It is read by search engine spiders behind the scenes and tells them which URLs are to be spidered, how often they should revisit and whether a given URL should be treated with a higher priority than another, although there is some doubt whether the priority directive really gets much attention these days.

As such, an XML sitemap is a critical tool in your SEO (search engine optimisation) tool kit, yet it is still one that many online marketing folks and website builders overlook. I have worked on hundreds of sites over the years, and when I conduct my ecommerce SEO audits it still surprises me how many sites are either missing an XML sitemap altogether or have one that is not set up correctly.

WHAT INFORMATION IS IN AN XML SITEMAP?

How often each of your pages is updated: the more often you update your site with fresh, new content, the more important and relevant the robots will consider your site (that’s the plan anyway).
When the pages were last updated: again, recent dates are critical so the bots don’t see your site as being abandoned.
The importance of each page relative to the other pages: there is a priority setting for specific URLs, but it’s probably not going to make any difference. It’s also important to have a few internal links on each of your posts, linking to other related content on your site.
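
To make those fields concrete, here is a minimal sketch in Python that reads them out of a small sitemap. The example.com URLs, the dates and the `read_sitemap` helper are made up for illustration; the tag names (`loc`, `lastmod`, `changefreq`, `priority`) and the namespace are the ones defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; the URLs and dates are placeholders.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://www.example.com/blog/xml-sitemaps</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry; optional fields that are absent are None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entry = {}
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            entry[tag] = url.findtext(f"sm:{tag}", default=None, namespaces=NS)
        entries.append(entry)
    return entries


entries = read_sitemap(SAMPLE_SITEMAP)
for entry in entries:
    print(entry)
```

Only `loc` is required for each entry; the second URL above leaves out the update frequency and priority, which is perfectly valid.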

WHO NEEDS AN XML SITEMAP?

If you do online marketing with any type of website or blog, you need an XML sitemap. Ecommerce SEO is one area where it’s critical. The nature of ecommerce websites means that pages and categories come and go all the time depending on inventory, seasonality and more. Being able to give search engines a dynamically updated map of all your pages matters more the bigger the site is, as it will otherwise take search engines longer to crawl the site and find new URLs amongst the existing ones.

If you operate a website that contains hundreds or thousands of pages and/or posts of archived content that you haven’t linked together, the sitemap will help the search engines map it out for crawling. When you submit your XML sitemap to Google, Yahoo! and Bing, you’re letting them know that your site and the pages listed in the sitemap are available to crawl for possible indexing on their search engines.

HOW DO YOU SET UP AN XML SITEMAP FOR YOUR SITE OR BLOG?

There are a couple of ways to create an XML sitemap. The easiest way is to build your site with a Content Management System (CMS) platform like WordPress (WP), Drupal or Joomla, and let their plugins or extensions do the heavy lifting for you. You can also write the XML by hand, but why make things harder than they have to be?

CMS Plugins, Modules and Extensions
XML sitemaps are usually easy for sites built using mainstream CMS platforms. There are various modules, extensions and plugins inside of CMS platforms that you simply install and activate then let the programming do its job. You really don’t have much to do at all.

One of the useful features of most XML sitemap tools is that they’ll continually update the sitemap automatically. As soon as you publish new content, the new page or post, along with its related metadata, is added to the sitemap.

If you remove a page or change it in some way, that information is automatically updated as well since the plug-in will crawl your site periodically, gather the newest information and update the sitemap accordingly.

Static HTML Sites Using XML Code
If you built your site using plain HTML, it may be a bit trickier and take a bit longer, especially if you have a large site. You have the option of writing the XML sitemap structure yourself and adding the URL of each page to it.
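
If you would rather not type the XML by hand, a few lines of Python can produce it from a list of pages. This is only a sketch under a few assumptions: the `build_sitemap` helper, the example.com URLs and the dates are hypothetical, while the `urlset`, `url`, `loc` and `lastmod` element names follow the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs.

    lastmod is a 'YYYY-MM-DD' string, or None to leave the tag out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        node = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(node, "loc").text = url
        if lastmod:
            ET.SubElement(node, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; replace with your own URLs and dates.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", "2024-01-15"),
    ("https://www.example.com/shop?cat=1&page=2", None),
])
print(sitemap_xml)
```

Save the result as a UTF-8 file (typically `sitemap.xml`) on your web server.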

You can also simply list each URL on its own line in a plain text (.txt) file.
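
The plain text variant is simple enough that a sketch fits in a handful of lines. The `build_text_sitemap` helper and the URLs are hypothetical; the only rule it relies on is the one described above, one full URL per line and nothing else.

```python
def build_text_sitemap(urls):
    """Return text-sitemap content: one URL per line, blanks and repeats dropped."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            lines.append(url)
    return "\n".join(lines) + "\n"


# Hypothetical URLs, including a blank entry and a repeat to show the clean-up.
text_sitemap = build_text_sitemap([
    "https://www.example.com/",
    " https://www.example.com/about ",
    "",
    "https://www.example.com/",
])
print(text_sitemap)
```

Write the returned string to a UTF-8 encoded file such as `sitemap.txt`.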

Either way, if you have hundreds of pages already, it could be very time-consuming. If you haven’t built your site with any type of CMS platform, it’s best to get the XML sitemap started with your first published page.

Another negative to maintaining your own XML sitemap on a non-CMS platform site is remembering to remove or change information as your site grows and changes. Every time you add, remove, rename or update a page, including changes to keywords, meta descriptions and the like, you have to go in and manually update the matching entry (its URL or last-modified date) in your XML sitemap as well.

If it’s in your budget, you can also contract an SEO specialist to either do the job for you or show you how to get it started and maintained.

There are a number of tools available that will crawl a site like the search engine spiders do and then create a static XML sitemap. The downside is that if you make a change to the site URLs or structure, you will need to run the process again.
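
For the curious, the crawl such tools perform can be sketched roughly as follows. This is an illustration rather than how any particular tool works: the `crawl` function, the `LinkCollector` class and the in-memory `PAGES` site are all hypothetical, and `fetch` is a stand-in for whatever downloads a page (here a dictionary lookup, so the example runs offline).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch, max_pages=500):
    """Breadth-first crawl of one host.

    fetch(url) must return the page's HTML, or None if it cannot be retrieved.
    Returns the URLs that were fetched successfully, in discovery order.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    found = []
    while queue and len(found) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # broken link: leave it out of the sitemap
        found.append(url)
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return found


# Hypothetical three-page site held in memory.
PAGES = {
    "https://www.example.com/": '<a href="/about">About</a> <a href="/shop#top">Shop</a> '
                                '<a href="https://other.example.org/">Elsewhere</a>',
    "https://www.example.com/about": '<a href="/">Home</a>',
    "https://www.example.com/shop": '<a href="/missing">Gone</a>',
}
urls = crawl("https://www.example.com/", PAGES.get)
print(urls)
```

The list it returns is a snapshot, which is exactly why a static sitemap built this way goes stale as soon as the site changes.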

HOW DO YOU OPTIMISE AN XML SITEMAP?

The main thing you want to do is have the sitemap direct the search engine spiders to your most important pages. If you’re practising good search engine optimisation or ecommerce SEO tactics, the sitemap should contain all your core pages and should not include redirects or URLs that return 404 errors, as these just impede the search engine spiders.

Never include the URL of any page or post that you do not want indexed and have marked as “noindex”. A good example of a “noindex” page is a Thank You page that is only shown to those who subscribe to your list. No one else needs to see or know about that page except your subscribers.

Another page you may publish as “noindex” is the checkout page of an ecommerce site, again because no one needs to see it except your buyers.
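
Put together, these rules amount to a simple filter over your list of pages. The sketch below assumes you already have each page’s HTTP status code and noindex flag from a crawl or an export; the `Page` record, the `sitemap_candidates` function and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    status: int            # HTTP status code the URL returned
    noindex: bool = False  # True if the page is marked "noindex"


def sitemap_candidates(pages):
    """Keep only pages that return 200 and are not marked noindex."""
    return [p.url for p in pages if p.status == 200 and not p.noindex]


# Hypothetical audit data for a small shop.
pages = [
    Page("https://www.example.com/", 200),
    Page("https://www.example.com/old-offer", 301),               # redirect
    Page("https://www.example.com/discontinued", 404),            # missing page
    Page("https://www.example.com/thank-you", 200, noindex=True),
    Page("https://www.example.com/checkout", 200, noindex=True),
    Page("https://www.example.com/shop", 200),
]
keep = sitemap_candidates(pages)
print(keep)
```

Only the home page and the shop page survive here, which is what you want the spiders to spend their time on.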

There are some additional options in XML sitemaps to specify how frequently the search engines should crawl a given URL in the XML sitemap and to assign it a priority. However, Google has downplayed the effectiveness of these attributes.

SIZE MATTERS

If you have a few hundred pages on your site, you’re probably good with just one XML file. However, if you’re looking at thousands of pages, it is best to create more than one XML file and split up the pages. You can designate which pages go first and put the rest in subsequent XML sitemap files.

Doing this keeps the crawling process streamlined for the search engine spiders because they can read the sitemap quicker if it’s a smaller file.
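
Splitting is easy to automate. For reference, the sitemaps.org protocol caps a single sitemap file at 50,000 URLs and 50 MB uncompressed, and defines a sitemap index file that points to the individual sitemaps. The sketch below assumes hypothetical helper names (`chunk`, `build_sitemap_index`) and made-up file names, and uses a tiny chunk size so the result is easy to read.

```python
import xml.etree.ElementTree as ET

# Namespace and per-file URL limit defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index that lists the location of each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        node = ET.SubElement(index, "sitemap")
        ET.SubElement(node, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical site with five pages, split two per file for illustration.
all_urls = [f"https://www.example.com/page-{n}" for n in range(1, 6)]
groups = chunk(all_urls, size=2)
index_xml = build_sitemap_index(
    [f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(groups) + 1)]
)
print(groups)
print(index_xml)
```

Each group would then be written out as its own sitemap file, with your most important pages placed in the first one.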

One way to do this easily within CMS platforms is to use Yoast SEO, an extension or plug-in available for both WP and Drupal. One of Yoast’s features is to automatically generate separate XML sitemaps for pages, posts, attachments, categories, tags and authors.

Again, keeping separate, smaller sitemaps makes it easier for the search engine bots to crawl your site and determine which pages should be indexed.

SUBMITTING YOUR XML SITEMAP TO GOOGLE AND OTHER SEARCH ENGINES

Contrary to what you may have been told over the years, simply submitting your sitemap does not guarantee that your site or blog is going to be indexed. The best way to have a solid chance at Google indexing your pages is to make sure your pages are optimised for Google’s spiders.

An XML sitemap is just that – a map. It lists the URLs and metadata of your site. By submitting it to Google, Bing, Yahoo!, etc., you are just letting the search engines know that the site is ready for indexing and you’re giving them a map to find your site and its pages.

You really can’t optimise the sitemap itself. It’s just data. The best way to ensure that your sitemap includes the important data is to make sure your pages and posts are optimised.

IN SUMMARY

Building an XML sitemap is something that every website or blog needs if it is to succeed with its SEO efforts. It must be a part of every search engine optimisation toolbox, especially in ecommerce SEO, where traffic is critical to sales and success.

If you’re not absolutely sure where to start, please do drop me a line and I will be happy to help!

Author Profile
Director at Local SEO Ltd | Website

Stephen has been working with SEO and Digital Marketing since 1999 and has held a variety of senior roles in a wide range of companies both in the UK and Internationally.
