How do you force a query in MySQL to use an index?

I am trying to improve the performance of a hammered wordpress DB by adding indexes to queries that appear in the slow query log.

In MS SQL you can use query hints to force a query to use an index but it is usually quite easy to get a query to use an index if you cover the columns correctly etc.

Read More

I have this query that appears in the slow query log a lot

SELECT SQL_CALC_FOUND_ROWS  wp_posts.ID 
FROM wp_posts  
WHERE 1=1  
  AND wp_posts.post_type = 'post' 
  AND (wp_posts.post_status = 'publish')  
ORDER BY wp_posts.post_date DESC 
  LIMIT 18310, 5;

I created a covering unique index on wp_posts on post_date, post_status, post_type and post_id and restarted MySQL however when I run explain the index used is

status_password_id

and in the possible keys my new index doesn’t even appear although it’s a covering index e.g I just get

type_status_date,status_password_id

Therefore neither the used index or the possible choices the “optimiser” if MySQL has one is even considering my index which has post_date as the first column. I would have thought a query that is basically doing a TOP and ordering by date with

ORDER BY wp_posts.post_date DESC LIMIT 18310, 5;

Would want to use an index sorted by date for speed, especially one that had all the other fields required to satisfy the query in it as well?

Does MySQL have query hints to force an index to be used for speed/performance tests or is there something else I need to do to see why this index is being ignored.

I would love it if Navicat had a Visual Query Execution Plan like MS SQL but it seems EXPLAIN is the best it has to offer.

Anyone with any hints on how I can either force the index to be used or work out why its being ignored would be very helpful!

Thanks

Related posts

Leave a Reply

4 comments

  1. Does MySQL have query hints to force an index to be used for speed/performance tests or is there something else I need to do to see why this index is being ignored.

    The documentation answers this question in some detail:

    By specifying USE INDEX(index_list), you can tell MySQL to use
    only one of the named indexes to find rows in the table. The
    alternative syntax IGNORE INDEX(index_list) can be used to tell
    MySQL to not use some particular index or indexes. These hints are
    useful if EXPLAIN shows that MySQL is using the wrong index
    from the list of possible indexes.

    You can also use FORCE INDEX, which acts like USE INDEX(index_list)
    but with the addition that a table scan is assumed to be very
    expensive
    . In other words, a table scan is used only if there is no
    way to use one of the given indexes to find rows in the table.

    Each hint requires the names of indexes, not the names of columns. The
    name of a PRIMARY KEY is PRIMARY. To see the index names for a
    table, use SHOW INDEX.

    If USE INDEX doesn’t work, try using IGNORE INDEX to see what the optimizer’s second choice is (or third, and so on).

    A simple example of the syntax would be:

    SELECT * FROM t1 USE INDEX (i1) IGNORE INDEX (i2) WHERE ...
    

    There are many more where that came from, in the linked docs. I’ve linked to the version 5.0 pages, but you can easily navigate to the appropriate version using the left sidebar; some additional syntax options are available as of version 5.1.

  2. MySQL 5.6 has support for a new format of EXPLAIN, which the MySQL Workbench GUI can visualize in a more appealing way. But that doesn’t help you if you’re stuck on MySQL 5.5 or earlier.

    MySQL does have hints as @AirThomas mentions, but you should really use them sparingly. In a simple query like the one you show, it should never be necessary to use index hints — if you have the right index. And using index hints means you have hard-coded index names into your application, so if you add or drop indexes, you have to update your code.

    In your query, an index on (post_date, post_status, post_type, post_id) is not going to help.

    You want the left-most column in the index to be used for row restriction. So put post_status, post_type first. Best if the more selective column is first. That is, if post_type = 'post' matches 3% of the table, and post_status = 'publish' matches 1% of the table, then put post_status first before post_type.

    Since you used = for both conditions and the AND operator, you know that all matching rows are basically tied with respect to those two columns. So if you use post_date as the third column in the index, then the optimizer knows it can fetch the rows in the order they are stored in the index, and it can skip doing any other work for the ORDER BY. You can see this working if “Using filesort” disappears from your EXPLAIN output.

    So your index likely should be:

    ALTER TABLE wp_posts ADD INDEX (post_status, post_type, post_date);
    

    You may also enjoy my presentation How to Design Indexes, Really.

    You don’t need to add ID to the index in this case, because InnoDB indexes automatically contain the primary key column(s).

    LIMIT 18310, 5 is bound to be costly. MySQL has to generate the whole result set on the server side, up to 18315 rows, only to discard most of them. Who in the world needs to skip to the 3662nd page, anyway?!

    SQL_CALC_FOUND_ROWS is a major performance killer when you have large result sets that you’re paging through, because MySQL has to generate the whole result set, both before and after the page you requested. Best to get rid of that query modifier unless you really need FOUND_ROWS(), and even if you do need the number of rows, it can sometimes* be quicker to run two queries, one with SELECT COUNT(*).
    (* Test both ways to make sure.)

    Here are some more tips on optimizing LIMIT:

  3. Try changing the order of your index definition to

    post_type, post_status, post_date, post_id
    

    or

    post_date desc, post_type, post_status,  post_id
    
  4. Just to let you know I am on a different PC so my username has changed but I did write the original question.

    What I think would be very helpful is a conversion guide to help people from MS SQL backgrounds covert to MySQL as it seems there’s some difference in the index tuning I didn’t realise especially that different storage engines automatically add in primary keys, plus how to handle the lack of tools to aid performance tuning.

    I am used to creating my main clustered index, primary key and other unique constraints and indexes, then non-clustered indexes with included keys, some covering indexes etc.

    I will then run a timed job to log the missing index DMV report to a table to prevent the data being lost during any restart. I can then run reports to check for indexes the SQL optimiser thinks “should” be used or those it is “not” using. I can then use this information, counters of mis-hits and potential efficiency percentage if the missing index was used, as a guide to help fine tune the indexing for performance.

    As far as I can tell MySQL doesn’t have anything similar to the DMV’s MsSQL has?

    The nice graphical execution plan built into MS SQL Studio from aeons ago helps with tuning a lot and the bog standard MySQL explain is poor in comparison. I will look into that tool you mentioned although running a select @@version returns 5.0.51a-24+lenny5-log so I doubt it will help me.

    A couple of things though regarding the posts:

    1. The aim was to have a covering index so no bookmark lookups (if you call them that in MySQL) were required and the data could come straight from the index.

    2. As nearly all my posts are “published” (99.99%) and the post_type is nearly all “post” (99.99%) with a tiny percentage of “pages”. There is no selectivity in those two columns and they are in the index for the cover. I’ve turned off auto-drafts to prevent a build up of revisions etc and the number of drafts is very small.

    3. Therefore I would have thought having the post_date as the first key in the index would have been of more help as the LIMIT (as you say is expensive, and I have no control over WordPress’s code) therefore surely the ORDER BY and LIMIT (which is basically a TOP) would have been the most costly and selective part of the query and of more use to the index in comparison with the other keys (which are not selective at all). This is why I put it first.

    4. I am using WordPress and the table is wp_posts and its storage engine is MyISAM which I believe I cannot change due to the requirement for it to have full text searching.

    5. As I said to someone else I already have an index with the order post_type, post_status and post_date but EXPLAIN only shows it in the possible keys and then ignores it to use the index based around these columns: post_status, password and id.

    6. As password is not used in the query and post_status is totally unselective (as all my post_types are “post”) I am a loss to why MySQL’s “clever” optimiser thinks this index should be selected above either the ones provided OR my own?

    So I am still stuck as no suggestion seems to work.

    I’ve tried changing the ordering several times and even though I only have 20k rows it takes half an hour or more each time! I don’t know if this is normal in MySQL or not but in MSSQL it takes minutes to add/drop indexes on tables with millions of rows.

    So as nothing has worked so far I want to know (why?) and obviously about query hints to see if that does any good at all.

    I have re-started the DB after re-indexing (and even restarted the webserver).

    Thanks for your help.