SEO today: One year after Google’s JavaScript update

It’s now been a year since Google’s October 27, 2014 blog post, Updating our technical Webmaster Guidelines. That quietly launched post notified the world that Googlebot now indexes JavaScript, and it marked a significant development in the state of SEO. This blog post is part of an ongoing series about SEO. To read all the blogs in this series, click here.

JavaScript has never been more integral to the functioning of the modern web, and of ecommerce in particular, so this change brings search in line with today’s browsers and user experience.

Bazaarvoice’s experience and research over the past year have revealed some fundamental guidelines that will help you and your team better navigate the technical aspects of SEO while the industry is in its current, transitional state.

Consider these guidelines in today’s world of the JavaScript-crawling Googlebot:

  1. View Source is obsolete – mostly

    Since the beginning of SEO time, about 1995, SEO professionals have agreed on a simple principle: search engines only read what’s visible in View Source. But because the world’s number-one search engine now crawls a more complete version of web pages, we have to look beyond View Source to newer methods.

    To properly understand how search engines see our web pages, we must evaluate and audit two versions of our source code: the View Source HTML and the Rendered HTML (Inspect Element). Without auditing both versions, we don’t get a complete picture of the content or markup hierarchy that we present to search engines. For more details, stay tuned for my next blog, SEO Tricks: How to Audit Code Using Inspect Element.
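
    If you’d rather script this comparison than eyeball it in the browser, a headless browser can capture both versions of a page in one pass. Below is a minimal sketch using Node.js and the Puppeteer package; the tool choice and file name are our own assumptions, not anything prescribed by Google.

    ```js
    // audit.js - print the size of both HTML versions of a page.
    // Run with: node audit.js https://www.example.com
    const puppeteer = require('puppeteer');

    (async () => {
      const url = process.argv[2];
      const browser = await puppeteer.launch();
      const page = await browser.newPage();

      // The navigation response body is the raw HTML (what View Source shows).
      const response = await page.goto(url, { waitUntil: 'networkidle0' });
      const viewSource = await response.text();

      // page.content() serializes the live DOM after JavaScript has run
      // (what Inspect Element shows).
      const rendered = await page.content();

      console.log('View Source length:', viewSource.length);
      console.log('Rendered length:', rendered.length);
      // A large gap usually means significant JavaScript-injected content to audit.

      await browser.close();
    })();
    ```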

  2. There is a timeout and some content will be missed – sometimes

    Search engine bots such as Googlebot must run efficiently; they can’t wait around forever for pages to load. It is therefore important to consider how timing affects a search engine’s ability to crawl and index your website. One aspect of timing is especially relevant as we seek to understand the JavaScript-enabled Googlebot: all evidence indicates that it has a timeout. We do not yet know exactly how much time Google allows, or what factors affect the allotment, but a timeout clearly exists. JavaScript-powered edits and additions must therefore load quickly to be picked up by Googlebot.
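
    One practical way to check your pages against this unknown timeout is to measure how long after navigation your JavaScript-injected content actually appears. A rough sketch follows; the '#reviews' selector is a hypothetical placeholder for your injected content, and the 5-second budget is our own illustrative threshold, not a figure published by Google.

    ```js
    // timing.js - measure when JavaScript-injected content appears.
    // Run with: node timing.js https://www.example.com/some-page
    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();

      const start = Date.now();
      await page.goto(process.argv[2]);
      // '#reviews' stands in for whatever element your JavaScript injects.
      await page.waitForSelector('#reviews', { timeout: 30000 });
      const elapsed = Date.now() - start;

      console.log('Injected content appeared after ' + elapsed + ' ms');
      // Illustrative budget only; Google has not published its actual timeout.
      if (elapsed > 5000) {
        console.warn('Content may load too slowly for Googlebot to pick up.');
      }

      await browser.close();
    })();
    ```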

  3. Assume that Googlebot will not scroll or interact

    Many websites have experimented with lazy-loading (aka infinite-scrolling, scroll-loading, or progressive-loading) techniques over the past couple of years. While techniques that load content as you scroll down the page provide a user experience attractive to consumers, and often produce better initial performance statistics, they may have a negative impact on SEO. When auditing code using the Inspect Element method, load the page and do not interact or scroll. Assume that Google will only load the content that is visible in the initial, pre-interaction state of the page.
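
    To verify what a non-scrolling crawler would see, you can compare the amount of content present before and after scrolling. A small sketch, assuming a hypothetical '.product-tile' selector for the lazy-loaded items:

    ```js
    // scroll-check.js - compare content before and after scrolling.
    // Run with: node scroll-check.js https://www.example.com/category-page
    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto(process.argv[2], { waitUntil: 'networkidle0' });

      // Count items in the initial, pre-interaction state: the safe
      // assumption for what Googlebot will see.
      const before = await page.$$eval('.product-tile', (els) => els.length);

      // Scroll to the bottom to trigger lazy-loading, wait briefly, recount.
      await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
      await new Promise((resolve) => setTimeout(resolve, 2000));
      const after = await page.$$eval('.product-tile', (els) => els.length);

      console.log('Items before scroll:', before, '| after scroll:', after);
      // Anything that appears only after scrolling may be invisible to Googlebot.

      await browser.close();
    })();
    ```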

  4. Google’s Structured Data Testing Tool is broken

    As absurd as this sounds, the updated version of Google’s Structured Data Testing Tool, launched in March 2014, does not include content and markup added after a page reaches the end user’s web browser. Even though the tool was refreshed and relaunched after Google’s JavaScript update, its Fetch URL feature mimics the functionality of Googlebot prior to October 27, 2014. Therefore, when using Fetch URL, the All Good and Error messages within the Structured Data Testing Tool may not be accurate.

    To overcome this issue, do not use Fetch URL. Instead, copy and paste markup directly into the Structured Data Testing Tool. Begin by pasting in the HTML retrieved via Inspect Element, as noted in SEO Tricks: How to Audit Code Using Inspect Element. Evaluate the feedback from the tool, then repeat the process using the HTML retrieved via View Source. Pay close attention to the hierarchy of the HTML, as well as any differences. It may take some effort to fully understand the differences between these two versions of the same page.
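
    To make the copy-and-paste workflow less tedious, you can script the extraction of both HTML versions to local files and paste each into the tool in turn. A minimal sketch, reusing the same headless-browser approach as above (the file names are our own):

    ```js
    // extract.js - save both HTML versions for pasting into the
    // Structured Data Testing Tool.
    const fs = require('fs');
    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      const response = await page.goto(process.argv[2], { waitUntil: 'networkidle0' });

      // Raw HTML (View Source) and the post-JavaScript DOM (Inspect Element).
      fs.writeFileSync('view-source.html', await response.text());
      fs.writeFileSync('rendered.html', await page.content());

      await browser.close();
      // Paste rendered.html into the tool first, then view-source.html,
      // and compare the structured-data feedback for each.
    })();
    ```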

  5. Canonical tags and other metadata can be edited with JavaScript

    Tests performed by Bazaarvoice, as well as by Merkle | RKG, have confirmed that Google will honor JavaScript edits to the HEAD section of pages. When making such edits, consider the JavaScript timeout noted above, as well as the order of operations as your page is built out. If the HEAD edit is performed too late, it may be missed.
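
    As an illustration, here is one way a canonical URL can be set or updated with a few lines of plain JavaScript. The key, per the timeout caveat above, is to run it as early in page construction as possible; the URL below is a placeholder.

    ```js
    // Run as early as possible (e.g., an inline script in the HEAD), not in
    // a deferred or lazily loaded script, so the edit lands before any timeout.
    (function () {
      var canonical = document.querySelector('link[rel="canonical"]');
      if (!canonical) {
        canonical = document.createElement('link');
        canonical.rel = 'canonical';
        document.head.appendChild(canonical);
      }
      // Placeholder URL for illustration only.
      canonical.href = 'https://www.example.com/preferred-url';
    })();
    ```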

  6. Bing and Yahoo! are trying

    At Bazaarvoice, we continue to monitor changes and track the differences in how Google, Bing, and Yahoo! index pages. Relative to Google, the other search engines have crawled only a very small percentage of the Internet with a JavaScript-enabled browser. Therefore, we must consider the limitations of search engines other than Google and seek to understand the traffic potential of each engine in different geographic markets.

To learn more about Bazaarvoice solutions to drive traffic through SEO, visit the Spotlights pages of our website here.

  • Andrew Mitschke

    Regarding your first point, I usually like to use the “cache:” search operator to get the cached, text-only version of a website. It’s great for seeing whether all of your content is actually being seen/stored/indexed by Google. Example: http://webcache.googleusercontent.com/search?q=cache:www.bazaarvoice.com&strip=1&vwsrc=0

  • Michael DeHaven

    Thanks for your comment, Andrew. Unfortunately, Google has not updated the “cache:” system to reflect the post-October 2014 Googlebot. Let’s take a look at an example on cutco.com so you can see the gap.

    1) Load the page:
    https://www.cutco.com/products/product.jsp?item=peel-n-pare-pack

    2) Find the following review text in the page: “Have used Cutco for years at my house”

    3) Search Google for that text by placing quotes around it:
    “Have used Cutco for years at my house”
    Google should return the Peel n’ Pare Pack page.

    4) Load the same page using the technique you recommended:
    http://webcache.googleusercontent.com/search?q=cache:https://www.cutco.com/products/product.jsp%3Fitem=peel-n-pare-pack

    5) Note that the text is not found in the page. That specific portion of the page was added with our search-friendly JavaScript integration method.

    As you can see, this text is indexed but not rendered by Google’s Webcache system. Be careful: some of our favorite techniques are no longer telling you the whole truth!