Google Indexing Non-Existent URLs. WordPress Doesn’t Show 404

I was inspecting the Google search results for “site:mywordpress.org” and found lots of pages indexed that shouldn’t exist.

There are two problems here:

  1. I don’t know how Google located, crawled, or found these URLs.

  2. WordPress doesn’t show a 404 error, so it looks like duplicate content.

I tried the WordPress support forums, but no one responded. I also have not been able to find anyone reporting this problem. Here’s an example of what I am seeing:

mywordpress.org/blog-post/
mywordpress.org/blog-post/1363035032000/

I’ve added a canonical link to the head and submitted many removal requests in Google Webmaster Tools (WMT), but I’m still seeing some results like this.

I’ve tested this on a few WordPress installs. It seems that if you add any string of numbers to the end of a permalink, WordPress still displays the content rather than showing a 404 error.

I also noticed that the number being added to the permalinks is a UNIX timestamp with three zeros on the end, which is what a timestamp in milliseconds would look like. As of this post, the current UNIX timestamp is 1363035971.
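To sanity-check that reading, here is a minimal sketch that treats the suffix as milliseconds since the Unix epoch and converts it to a UTC date. The function name `suffix_to_utc` is hypothetical, and the cast assumes a 64-bit PHP build so the 13-digit value fits in an integer.

```php
<?php
// Hypothetical helper: interpret a numeric URL suffix as milliseconds
// since the Unix epoch and format it as a UTC date string.
function suffix_to_utc( string $suffix ): string {
    // Dropping the last three digits turns milliseconds into seconds.
    $seconds = intdiv( (int) $suffix, 1000 );
    return gmdate( 'Y-m-d H:i:s', $seconds );
}

echo suffix_to_utc( '1363035032000' ), "\n"; // 2013-03-11 20:50:32
```

The result lands about fifteen minutes before the timestamp quoted above, which is consistent with the suffix being generated at request time rather than being a random number.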

I’m looking for some advice on what I should do. I’m particularly interested in a PHP function that would check the URL for a string of numbers at the end and, if there is one, 301 redirect it to the right permalink. I’d also value any input on why Google is finding these wrong URLs and whether the UNIX timestamp is a clue.

2 comments

  1. Did you check whether a plugin is causing this? Also check your permalink settings under Settings > Permalinks.

    Until you locate the source of the problem, you could work around it by using the Redirect plugin.

    This plugin has many features, two features important to your case are:

    • All URLs can be redirected, not just ones that don’t exist
    • Full regular expression support

    So with the help of a regular expression, you would probably be able to redirect URLs with trailing numbers to the correct URL.
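    The regular expression idea can be sketched in plain PHP. This is only an illustration under stated assumptions: the function name `strip_numeric_suffix` is hypothetical, the site is assumed to use a /%postname%/ style permalink, and only suffixes of ten or more digits are treated as bogus so that real paginated URLs such as /blog-post/2/ are left alone.

```php
<?php
// Hypothetical helper: return the path without a trailing timestamp-like
// segment, or null when the path has nothing to strip.
function strip_numeric_suffix( string $path ): ?string {
    // Match "<prefix>/<10 or more digits>" with an optional trailing slash.
    if ( preg_match( '#^(.+?)/\d{10,}/?$#', $path, $matches ) ) {
        return $matches[1] . '/';
    }
    return null;
}

var_dump( strip_numeric_suffix( '/blog-post/1363035032000/' ) ); // string(11) "/blog-post/"
var_dump( strip_numeric_suffix( '/blog-post/2/' ) );             // NULL
```

    In a theme, a non-null result could be passed to `wp_safe_redirect( home_url( $target ), 301 )` from a `template_redirect` callback, followed by `exit`. The same pattern should also work as the source expression in a redirect plugin that supports regular expressions.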

  2. I had the same issue and found a solution.

    Just add this to your theme’s functions.php:

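    add_action( 'template_redirect', 'so16179138_template_redirect', 0 );
    function so16179138_template_redirect()
    {
        if( is_singular() )
        {
            // WordPress reads a trailing number in a permalink as the "page" query var.
            global $post, $page, $wp_query;
            // A post has one page, plus one for each <!--nextpage--> marker.
            $num_pages = substr_count( $post->post_content, '<!--nextpage-->' ) + 1;
            if( $page > $num_pages ){
                // Send a real 404 status so crawlers drop the URL.
                $wp_query->set_404();
                status_header( 404 );
                nocache_headers();
                include( get_template_directory() . '/404.php' );
                exit;
            }
        }
    }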