Troubleshooting WordPress/Woocommerce custom SQL query for reporting

Hopefully this is the right forum, my question seems to overlap the stack exchange community so this seemed best.

I have some custom reports for my WooCommerce orders on my wordpress site. I have one query that is just freezing locally, meaning in my localhost my CPU goes to 100% and it never finishes and I don’t understand why. To the point here is the query:

Read More
SELECT SUM(postmeta.meta_value)
FROM pca_postmeta AS postmeta
LEFT JOIN pca_woocommerce_order_items AS orders ON orders.order_id = postmeta.post_id
WHERE postmeta.meta_key = '_order_total'
AND orders.order_item_id IN (
    SELECT item_meta.order_item_id 
    FROM pca_woocommerce_order_itemmeta AS item_meta 
    LEFT JOIN pca_woocommerce_order_items AS orders ON item_meta.order_item_id = orders.order_item_id 
    LEFT JOIN pca_posts AS posts ON posts.ID = orders.order_id 
    WHERE item_meta.meta_value = '23563' 
    AND posts.post_status IN ('wc-processing','wc-completed') 
    GROUP BY orders.order_id
)

As you can hopefully see the goal here is to get the summation of all orders from this specific campaign (23563). The nested query works exactly as expected on its own, returning just a list of IDs like so:

NOTE: little curious if 2.6289 secs is long when it only returned 65 total, although there are 148220 total

enter image description here

The problem is this query doesn’t seem to like the nested part. Any suggestions? Completely different approach in mind?

P.S. I use that nested query at other times as well to represent all orders by campaign id in my php reporting class. But for my question PHP has nothing to do with it.

UPDATE/FOLLOW UP:

Is it possible to convert this into a join as described here: Using a SELECT statement within a WHERE clause ? I’m a little light on my SQL so not sure how I would do that but it seems promising

Related posts

1 comment

  1. GROUP BY  orders.order_id
    

    does not make sense because you are selecting only order_item_id.

    pca_woocommerce_order_itemmeta would benefit from

    INDEX(meta_value, order_item_id)
    

    An this might be an equivalent query, but avoiding the IN(SELECT...):

    SELECT  SUM(pm.meta_value)
        FROM  
          ( SELECT  im.order_item_id
                FROM  pca_woocommerce_order_itemmeta AS im
                LEFT JOIN  pca_woocommerce_order_items AS o
                                 ON im.order_item_id = o.order_item_id
                LEFT JOIN  pca_posts AS posts ON posts.ID = o.order_id
                WHERE  im.meta_value = '23563'
                  AND  posts.post_status IN ('wc-processing','wc-completed')
                GROUP BY  o.order_id 
          ) AS w
        JOIN  pca_woocommerce_order_items AS o ON w.order_item_id = o.order_item_id
        JOIN  pca_postmeta AS pm ON o.order_id = pm.post_id
        WHERE  pm.meta_key = '_order_total'
    

    Edit

    Some principles behind what I did. Here I am guessing at what the optimizer will do with various possible formulations of the query.

    • I got rid of LEFT — This may have changed the output. But I needed to avoid LEFT JOIN ( SELECT ... ) which would not be optimizable.
    • By having one subquery in the list of “tables” being JOINed, the optimizer will (almost certainly) start with the subquery and do “Nested Loop Joins” to the other tables. NLJ is the common way to perform a query.
    • A subselect like that has no index, so it needs to be first in the order, else it will be very inefficient.
    • Without subqueries, the optimizer generally likes to start with whichever table has something in the WHERE clause.
    • The requirement to start with the subquery “table” is stronger than the desire to pick the table based on WHERE pm.meta_key = '_order_total'.
    • Inside the subquery, the only “=” test (WHERE im.meta_value = '23563) provides the likely starting point for that set of JOINs. This is further enhanced by it not being ‘right’ of a LEFT JOIN. Hence, I suggested that index.

Comments are closed.