Posted by Crownpeak November 24, 2015

Is Your CMS SEO-Ready? - Site Structure & Navigation

Part 2

Our last blog post on tips for SEO-ready web content management discussed how digital marketing teams can best leverage their CMS to improve On-Page SEO Elements. This post will focus on how content editors can optimize site structure and navigation elements to drive SEO success.

Search engines are pretty good at finding their way around websites. However, to help them find and categorize pages confidently, it pays to provide "maps," "guideposts," and "roadblocks" that tell them which content is most relevant and which content you would (or wouldn't) like them to discover. Here are the ways you can help search engine web crawlers do your content justice.

“Maps”: Leverage both XML & HTML sitemaps

When a search engine hits your domain, it tries to make sense of all the pages it visits. As it crawls them, its web crawler attempts to categorize pages or groups of pages by topic and recurring keyword content. A confusing site structure hurts this process, while a clear one helps it. Even so, search engines can sometimes use extra clarification to reinforce the confidence of their categorization decisions. That's where sitemaps come in handy.

Sitemaps are crucial SEO components of any website. They explicitly spell out the layout of a site's pages and reinforce relationships between certain groupings of pages.

Prepare two sitemaps that are always kept up-to-date (ideally through automatic mechanisms):

  • HTML Sitemap: This kind of sitemap is essentially a web page that lays out all of your site’s pages by logical location grouping. This reinforces the topical relevance of certain groups of pages. In addition to helping web crawlers make sense of your site, it’s also a great tool for helping your visitors find specific pages themselves.
  • XML Sitemap: This kind of sitemap is an XML file. While it's not practical for helping site visitors make sense of the site, it's a crucial file that makes search engines aware of all of your site's important pages. Simple examples of both formats follow this list.
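
As a rough sketch of what each format might look like (the URLs, dates, and page names here are placeholders, and the XML version follows the sitemaps.org protocol):

HTML sitemap example:

<ul>
  <li><a href="http://www.example.com/">Home</a>
    <ul>
      <li><a href="http://www.example.com/section/">Section</a></li>
      <li><a href="http://www.example.com/section/subsection/">Subsection</a></li>
    </ul>
  </li>
</ul>

XML sitemap example:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2015-11-24</lastmod>
    <changefreq>weekly</changefreq>
  </url>
  <url>
    <loc>http://www.example.com/section/subsection/explicit-post-name.aspx</loc>
  </url>
</urlset>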

[Image: Crownpeak sitemap]

When implementing sitemaps, be sure to refer to them in your site's Robots.txt file, which emphasizes their presence to web crawlers. Just adding a simple line like the one shown below to the Robots.txt file for each sitemap makes a huge difference.

Sitemap code example:

Sitemap: http://www.domain.com/sitemap.xml

An SEO-Ready CMS can do the following:

  • Provide out-of-the-box templates for deploying automated, customizable sitemaps (both XML and HTML).

“Guideposts”: URL structure and breadcrumbs

Let's say you are travelling somewhere new. You're lost in an unfamiliar place, so you pull out a map to find your way. However, even with a map to guide you, it can be difficult to use it effectively if there are no road signs anywhere on your journey.

Even with your map in hand, you still have to do extra work: you're forced to find your way by walking partway down the wrong paths first and squinting at vague landmarks in the distance. Correct "signage" would help you a great deal in this scenario.

Search engine web crawlers benefit from clear “guideposts” in a similar way; they find and categorize content more easily if the navigation elements are clear.

Simplify URL structure

Search engines pay a great deal of attention to the formatting of URLs to try to understand where pages sit in relation to each other on a site. That's why it's important that your site's URLs are consistent, clear, and easy to understand. Rule of thumb: if your URL structure and naming conventions are hard for a human to understand, they certainly won't be of any help to search engines trying to figure out your site.

Some content management systems deliver dynamic content or indexed web pages (like blog posts or articles) through unique IDs for a specific asset or page. These dynamic URLs are often assembled in real time from long strings of cryptic text that tend to confuse search engines, and in the process they negatively impact your SEO.

Cryptic URL (SEO-unfriendly):

http://www.example.com/section/post-1234326203462.php?long&20querystring=256sggadgg456t26a

To fix this issue, some systems require an elaborate URL-rewriting strategy to produce search-friendly "vanity URLs." Other systems automatically write URLs in a manner that naturally reflects a site's directory structure instead of using cryptic URLs.

SEO-friendly URL:

http://www.example.com/section/subsection/explicit-post-name.aspx

Also, in some scenarios, it may be helpful to eliminate slashes (/) in particularly deep URLs with many levels of nesting. If a URL steps down through too many directories, crawlers may associate it with the wrong folder in that chain. If you have flexible control over URL formatting, a good solution can be to eliminate or replace the slashes in the final published URL:

Deep URL:

http://www.example.com/level-1/level-2/level-3/level-4/file.aspx

Optimized URL, replacing “/” with “_”:

http://www.example.com/level-1_level-2_level-3_level-4_file.aspx

Clarify with breadcrumbs

It's important to create a naturally flowing hierarchy through your website's navigation. Part of this, as discussed above, comes down to how your site's back-end URL structure is organized. However, it's also crucial that you emphasize that back-end structure in what's displayed on the page's front end. Take clarification one step further by visually marking the structural URL hierarchy with on-page breadcrumbs (especially when a page is embedded deep in your site).

The benefits of breadcrumbs extend beyond SEO. Breadcrumbs make content easier to find and improve navigability for search engines and site visitors alike: the breadcrumb trail shows users where they currently are on your site and lets them navigate back to the home page and other higher-level pages.

Use "schema" to reinforce strength of breadcrumbs

When you are travelling down a major road in an automobile, it helps when the road signs follow a standardized format that is recognized across different roads. For instance, national "Interstate" route numbers in the United States almost always share the same style of sign: a red, white, and blue shield on a green background. It's a recognizable commonality that's consistent on roads across different states.

The same should go for website breadcrumbs. Breadcrumbs are more effective when they use standardized markers that are consistent across websites. That's where "Schema" comes in.

"Schema" refers to standardized methods of web page markup that were created in collaboration between the major search engines (i.e. Google, Yahoo!, Bing, etc.). Essentially, a bunch of search engine experts got together and agreed on a bunch of different ways of coding and tagging different web page elements to make sites better search optimized. Visit Schema.org to see all the different ways it can be used to optimize your sites.

In the case of breadcrumbs, Schema markup is especially helpful for explicitly indicating that a list of links on your page should be recognized as a breadcrumb trail (and not just a random list of links).

[Image: Breadcrumb example without schema markup]

[Image: Breadcrumb example with schema markup]
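
As a simplified sketch of the difference, the snippets below show a plain breadcrumb list and the same list marked up with Schema.org's BreadcrumbList vocabulary (microdata); the URLs and page names are placeholders.

Breadcrumb without schema markup:

<ol class="breadcrumb">
  <li><a href="http://www.example.com/">Home</a></li>
  <li><a href="http://www.example.com/section/">Section</a></li>
  <li><a href="http://www.example.com/section/subsection/">Subsection</a></li>
</ol>

Breadcrumb with schema markup:

<ol class="breadcrumb" itemscope itemtype="http://schema.org/BreadcrumbList">
  <li itemprop="itemListElement" itemscope itemtype="http://schema.org/ListItem">
    <a itemprop="item" href="http://www.example.com/"><span itemprop="name">Home</span></a>
    <meta itemprop="position" content="1" />
  </li>
  <li itemprop="itemListElement" itemscope itemtype="http://schema.org/ListItem">
    <a itemprop="item" href="http://www.example.com/section/"><span itemprop="name">Section</span></a>
    <meta itemprop="position" content="2" />
  </li>
  <li itemprop="itemListElement" itemscope itemtype="http://schema.org/ListItem">
    <a itemprop="item" href="http://www.example.com/section/subsection/"><span itemprop="name">Subsection</span></a>
    <meta itemprop="position" content="3" />
  </li>
</ol>

With marked-up breadcrumbs like these, search engines can recognize the trail and display it in search results in place of a raw URL.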

An SEO-ready CMS can do the following:

  • Deliver dynamic content without the need for long, convoluted URLs.
  • Automatically write default URLs that reflect directory structure.
  • Allow easy custom URL editing for all pages.
  • Generate breadcrumbs easily, with minimal coding.
  • Allow for markup from Schema.org.

"Roadblocks": "You shall not pass!"

You don't need to be a wizard to prevent web crawlers from going to parts of your site you'd prefer they skip. Here's what you need to know about putting up basic SEO roadblocks with your CMS.

rel="nofollow"

Every time your site links to an external site, by default, that link acts like a vote of confidence cast for the external site. The danger is that if you cast votes of confidence for sites that search engines consider crummy, the search engines will in turn penalize your site for promoting them. But sometimes you simply have to link to external sites you aren't entirely sure of. Or perhaps your site incorporates user-generated content, like blog post comments, where visitors might post "spammy" links.

To deal with these issues, you can apply the "nofollow" attribute to links as needed. With "nofollow", webmasters can tell search engines "don't follow links on this page or page section" or "don't follow this specific link."
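
As a quick sketch (the URL here is a placeholder), a link-level "nofollow" is applied through the rel attribute:

Link with "nofollow" example:

<a href="http://www.example.com/unvetted-page" rel="nofollow">A link you can't vouch for</a>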

For instance, adding "nofollow" to comment columns and message boards will help protect your site and reputation. In this scenario, our recommendation is to always implement and enforce this practice by default (it's far more time-efficient than trying to vet such comment links yourself down the road). Here on our blog, the commenting functionality is powered by our integration with Disqus, which automatically incorporates "nofollow" into all user-posted links.

Robots.txt

Sometimes it's critical that search engines don't direct visitors to certain parts of your site. You may have landing pages that you only want visitors to discover through channels other than search. Or you may decide it's a bad experience for visitors to randomly stumble onto your site through the seventh page of your blog index in a search result. For scenarios like this, it's sometimes necessary to tell search engines to ignore certain pages.

A website's "Robots.txt" file provides explicit instructions to search engines on which pages to index as search results. In addition to highlighting the presence of things like sitemaps (as discussed before), Robots.txt also tells web crawlers which pages to not index as search results. From here, site administrators can exempt specific files, entire directories, or files that mean certain rule-based criteria.

Robots.txt example:

User-agent: *
Disallow: /folder1/
Disallow: /folder2/specific-file-1.aspx
Disallow: /folder2/specific-file-2.aspx

Robots.txt isn't the only place to set crawler rules: "noindex" and "nofollow" directives can also be applied at the page level with the robots meta tag.
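
A minimal sketch of that page-level directive, placed in a page's <head>, tells crawlers not to index the page or follow its links:

Robots meta tag example:

<meta name="robots" content="noindex, nofollow" />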

An SEO-ready CMS can do the following:

  • Provide flexible and accessible control of Robots.txt.
  • Enable easy application of "nofollow" tags where necessary.
  • Leverage integrated platforms for user-generated content (UGC) or commenting that incorporate "nofollow" into visitor-posted links.

The Crownpeak difference

Crownpeak Digital Experience Management gives digital marketers, developers, and content editors the tools to easily manage the site structure and navigation elements discussed in this post:

  • Full flexibility over published URL structure.
  • Quick implementation of standardized SEO elements like sitemaps and breadcrumbs.
  • Easy controls for managing web crawler traffic through Robots.txt and rel="nofollow".