How can I optimize this VERY slow MySQL query?

I have a WordPress-based site which catalogs scholarly articles for a regional science journal.

Simply said, the system has several thousand “Posts”, and each has a “pub_type” Taxonomy in which only 1 Term is selected: “Manuscript”, or something else.

Read More

Each Post also has a variety of other Taxonomies/Terms it is related to.


Objective: Get a list of terms for a particular Taxonomy. For each term, Count the number of Posts related to it, AND determine how many of those Posts have “Manuscript” set within the “pub_type” Taxonomy.


Current Query:

SELECT term_id, term_id as term_id_b, name, slug,
( SELECT COUNT(id) FROM wp_posts WHERE id IN 
    ( SELECT object_id FROM wp_term_relationships WHERE term_taxonomy_id IN 
        ( SELECT term_taxonomy_id FROM wp_term_taxonomy WHERE term_id = term_id_b ) 
) AND post_status = "publish" ) as count,
( SELECT COUNT(id) FROM wp_posts WHERE id IN 
    ( SELECT object_id FROM wp_term_relationships WHERE term_taxonomy_id IN 
        ( SELECT term_taxonomy_id FROM wp_term_taxonomy WHERE term_id IN 
            ( SELECT term_id FROM wp_terms WHERE term_id = term_id_b ) 
        )
    ) 
AND id IN 
( SELECT object_id FROM wp_term_relationships WHERE term_taxonomy_id IN
    ( SELECT term_taxonomy_id FROM wp_term_taxonomy WHERE term_id = 
        ( SELECT term_id FROM wp_terms WHERE name = "Manuscript" ) 
    AND taxonomy = "pub_type" )
)
AND post_status = "publish"
) as manuscript_count
FROM wp_terms 
WHERE term_id IN 
( SELECT term_id FROM wp_term_taxonomy WHERE taxonomy = "'.$taxonomy.'" )
ORDER BY name ASC

While this query does work, it runs extremely slow… as in between 3-5 minutes depending on server load. It’s so bad that, in order to maintain site performance, I’ve had to cache the query results to a JSON text file and only let the query run every 2 hours.

I know that the main problem here is that I’ve used sub-queries for everything. While I am trying to learn more about working with joins, I don’t yet know enough to write this query any other way.

Can anyone offer some insights or suggestions as to how I can tame this beast?

edit: Here is a screenshot of the EXPLAIN output from the query:

http://i.imgur.com/Axaqun3.png

Related posts

Leave a Reply

1 comment

  1. See this great post about the use of IN and how it can negatively affect query speed. This has your problem written all over it.

    Essentially, with a traditional IN(values) query, you are simply searching for each value. In a subquery IN as you have (IN(SELECT)).

    In it, the author quotes from MySQL’s manual:

    If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(MxN), rather than O(M+N) as it would be for an uncorrelated subquery.

    Using this logic, you have wp_posts at as an outer query, and within it nested wp_terms_relationship at 10 rows, and within that nested, wp_terms again. That one is 949*(10*1)). The second time around you have only two levels of nesting, but because the final level only has one row, it is still the same impact: You are ending up parsing through 9490 rows.

    Subqueries are not your friend. While sometimes they may be necessary, they can almost always be solved using JOIN. This does not seem like it is much different based on your objectives. By rewriting this query and stretching to attempt to use JOIN...ON, you will find that after a brief learning curve it will be much easier to read your own code, follow its logical progression, and, most likely, see it speed up too. At the very least, utilize joins where you can. You will thank yourself later.