Does Google crawl/index the “computed” or raw HTML source?

I’ve got a situation in which a few of my pages are each split across multiple pages (by the WordPress “nextpage” feature). The same post’s content is spread over two or more pages, like so:

http://mysite.com/mypage/
http://mysite.com/mypage/2
http://mysite.com/mypage/3

So the post itself has one HTML title tag, <title>My Page</title>, but since it’s spread over more than one page, I had to create a script to add a unique HTML title tag for each of those pages in order to get Google to index them.

To do that, I’m using

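// Split the request path into segments, ignoring leading and trailing slashes
// so that /mypage/2 and /mypage/2/ are treated the same.
$exploded = explode("/", trim($_SERVER['REQUEST_URI'], "/"));
$last = end($exploded);

// A numeric last segment on a non-archive page is the "nextpage" number.
if( is_numeric( $last ) && !is_archive())
{
    $title = $title." (Page ".$last.")";
}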

Which creates unique page titles for each of those “paged” pages like so:

<title>My Page</title>
<title>My Page (Page 2)</title>
<title>My Page (Page 3)</title>

Now I’m trying to enhance this a bit by replacing the “(Page X)” suffix with a more descriptive title.

So, in my markup, when a post is paged like this, I include an HTML “details” element that contains the post’s table of contents, like so:

<details class="myEl" open="open">
    <summary>In this article</summary>
    <ol>
        <li><a href="post-slug/">Introduction</a></li>
        <li><a href="post-slug/2/" class="active">Title for the second page</a></li>
        <li><a href="post-slug/3/">Title for the third page</a></li>
    </ol>
</details>

To copy the active TOC entry’s text into the title tag (replacing the “Page X” titles), I’m using this jQuery script, which works flawlessly to change the title in the “computed” source:

<script>
    jQuery(document).ready(function(){
        var title = jQuery('.myEl').find('a.active').text();
        jQuery('title').text(title);
    });
</script>

However, when I test these pages using the Google Structured Data Testing Tool, the title remains unchanged from the “(Page X)” syntax. It’s as if Google is parsing the raw HTML source and not the computed source.

Can this be confirmed?

1 comment

  1. While some crawlers are capable of running JS and accessing the rendered page, the majority are not. As a result, crawlers generally base their information on the raw HTML and use the rendered page (if available) for things like detecting black-hat SEO tactics (hidden keyword stuffing, link changes, JS redirects, etc.).

    If you want Google (and other search engines) to pick up on your improved title, you will have to send that in the HTML, instead of modifying it after page load.
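
    One rough way to check what a crawler that does not run JS would receive is to read the title straight out of the raw markup, without executing any scripts. The following is only a minimal sketch in JavaScript (Node): `rawTitle` and `sampleHtml` are hypothetical names made up for illustration, the sample markup is modeled on the question, and the regular expression is a crude approximation, not a real HTML parser.

```javascript
// Hypothetical helper: returns the text of the first <title> element found in
// a raw HTML string, or null if there is none. No scripts are executed, which
// approximates what a crawler that does not run JavaScript would read.
function rawTitle(html) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}

// Sample markup modeled on the question (an assumption, not the real page):
// the server sends "(Page 2)", and the script would only swap in the
// table-of-contents entry once a browser runs it.
const sampleHtml = `
<html>
  <head><title>My Page (Page 2)</title></head>
  <body>
    <details class="myEl" open="open">
      <ol>
        <li><a href="post-slug/2/" class="active">Title for the second page</a></li>
      </ol>
    </details>
    <script>
      jQuery(document).ready(function(){
        jQuery('title').text(jQuery('.myEl').find('a.active').text());
      });
    </script>
  </body>
</html>`;

console.log(rawTitle(sampleHtml)); // prints "My Page (Page 2)"
```

    If the raw source of a real page (for example, what the browser shows under “view source”) still reports the “(Page X)” title when passed to a check like this, the descriptive title is only being applied after page load and would need to be generated on the server instead.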