WordPress: Strange rewrite issue, /randomstring/categoryname works

A bit of background: we manage a network of 25 different WP sites that all run off the same codebase. Recently an SEO analyst joined us and noticed that a couple of the sites had odd 404 issues for URLs like this:

**/category/featured-article/ryan-mcnamara-new-different/news/page/2/**

So I disabled all plugins and hooks, tried a fresh install, and it still happens. It turns out it only happens when the site's permalink structure ends in .html. So I delved into the rewrite code, and this is what happens for the URL: **/category/featured-article/ryan-mcnamara-new-different/news**

  • If the permalink structure is **/%category%/%postname%/**, then of the
    available rewrite rules ($wp_rewrite->rewrite_rules()) this rule is
    matched: **(.+?)/([^/]+)(/[0-9]+)?/?$**, causing a 404 as expected.
  • If the permalink structure is **/%category%/%postname%.html**, then this
    rule is matched instead: **(.+?)/?$**, which maps to
    **index.php?category_name=$matches[1]**, which is why the category is
    rendered. When users/bots find these category pages and click the
    pagination links, they get taken to
    **/category/featured-article/ryan-mcnamara-new-different/news/page/2/**, which
    causes a 404.
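To make the two cases above easier to reproduce outside WordPress, here is a minimal Python sketch of first-match rule resolution. It is only an approximation: the two patterns quoted above are used as-is, but the `.html` variant of the post rule, the query template for the post rule, and the helper name `resolve` are my assumptions, and the real rule list is much longer.

```python
import re

# Request path from the question, without the leading slash.
REQUEST = "category/featured-article/ryan-mcnamara-new-different/news"

# Query templates. Only CATEGORY_QUERY is quoted in the question;
# POST_QUERY is an assumed mapping for illustration.
POST_QUERY = "index.php?category_name=$matches[1]&name=$matches[2]&page=$matches[3]"
CATEGORY_QUERY = "index.php?category_name=$matches[1]"

# Ordered (pattern, query) pairs: the first pattern that matches wins.
RULES_PLAIN = [
    (r"(.+?)/([^/]+)(/[0-9]+)?/?$", POST_QUERY),
    (r"(.+?)/?$", CATEGORY_QUERY),
]
RULES_HTML = [
    # Assumed shape of the post rule once the structure ends in ".html".
    (r"(.+?)/([^/]+)\.html(/[0-9]+)?/?$", POST_QUERY),
    (r"(.+?)/?$", CATEGORY_QUERY),
]


def resolve(rules, request):
    """Return (pattern, query) for the first rule matching the request."""
    for pattern, query in rules:
        # re.match anchors at the start of the string only.
        match = re.match(pattern, request)
        if match:
            for index, value in enumerate(match.groups(), start=1):
                query = query.replace(f"$matches[{index}]", value or "")
            return pattern, query
    return None


print(resolve(RULES_PLAIN, REQUEST))
print(resolve(RULES_HTML, REQUEST))
```

With the plain structure the request is claimed by the post rule, with "news" treated as the post name. With the `.html` structure the post rule no longer matches, so the request falls through to the catch-all category rule.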

The first question is how people are finding these pages in the first place, which is an issue I can deal with. The question for this forum is: is this a bug in the rewrite rules WP ships by default, or should the paginate_links function be smarter about creating pagination URLs? Has anybody seen this problem before?

Caveats: no, I can't force all sites to drop the .html, and no, I don't have the ability to change core WP code for this problem.
