I am trying to improve the performance of a hammered wordpress DB by adding indexes to queries that appear in the slow query log.
In MS SQL you can use query hints to force a query to use an index but it is usually quite easy to get a query to use an index if you cover the columns correctly etc.
I have this query that appears in the slow query log a lot
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
WHERE 1=1
AND wp_posts.post_type = 'post'
AND (wp_posts.post_status = 'publish')
ORDER BY wp_posts.post_date DESC
LIMIT 18310, 5;
I created a covering unique index on wp_posts
on post_date, post_status, post_type and post_id
and restarted MySQL however when I run explain the index used is
status_password_id
and in the possible keys my new index doesn’t even appear although it’s a covering index e.g I just get
type_status_date,status_password_id
Therefore neither the used index or the possible choices the “optimiser” if MySQL has one is even considering my index which has post_date as the first column. I would have thought a query that is basically doing a TOP and ordering by date with
ORDER BY wp_posts.post_date DESC LIMIT 18310, 5;
Would want to use an index sorted by date for speed, especially one that had all the other fields required to satisfy the query in it as well?
Does MySQL have query hints to force an index to be used for speed/performance tests or is there something else I need to do to see why this index is being ignored.
I would love it if Navicat had a Visual Query Execution Plan like MS SQL but it seems EXPLAIN is the best it has to offer.
Anyone with any hints on how I can either force the index to be used or work out why its being ignored would be very helpful!
Thanks
The documentation answers this question in some detail:
If
USE INDEX
doesn’t work, try usingIGNORE INDEX
to see what the optimizer’s second choice is (or third, and so on).A simple example of the syntax would be:
There are many more where that came from, in the linked docs. I’ve linked to the version 5.0 pages, but you can easily navigate to the appropriate version using the left sidebar; some additional syntax options are available as of version 5.1.
MySQL 5.6 has support for a new format of EXPLAIN, which the MySQL Workbench GUI can visualize in a more appealing way. But that doesn’t help you if you’re stuck on MySQL 5.5 or earlier.
MySQL does have hints as @AirThomas mentions, but you should really use them sparingly. In a simple query like the one you show, it should never be necessary to use index hints — if you have the right index. And using index hints means you have hard-coded index names into your application, so if you add or drop indexes, you have to update your code.
In your query, an index on
(post_date, post_status, post_type, post_id)
is not going to help.You want the left-most column in the index to be used for row restriction. So put
post_status, post_type
first. Best if the more selective column is first. That is, ifpost_type = 'post'
matches 3% of the table, andpost_status = 'publish'
matches 1% of the table, then put post_status first before post_type.Since you used
=
for both conditions and theAND
operator, you know that all matching rows are basically tied with respect to those two columns. So if you usepost_date
as the third column in the index, then the optimizer knows it can fetch the rows in the order they are stored in the index, and it can skip doing any other work for the ORDER BY. You can see this working if “Using filesort” disappears from your EXPLAIN output.So your index likely should be:
You may also enjoy my presentation How to Design Indexes, Really.
You don’t need to add ID to the index in this case, because InnoDB indexes automatically contain the primary key column(s).
LIMIT 18310, 5
is bound to be costly. MySQL has to generate the whole result set on the server side, up to 18315 rows, only to discard most of them. Who in the world needs to skip to the 3662nd page, anyway?!SQL_CALC_FOUND_ROWS
is a major performance killer when you have large result sets that you’re paging through, because MySQL has to generate the whole result set, both before and after the page you requested. Best to get rid of that query modifier unless you really needFOUND_ROWS()
, and even if you do need the number of rows, it can sometimes* be quicker to run two queries, one withSELECT COUNT(*)
.(* Test both ways to make sure.)
Here are some more tips on optimizing LIMIT:
Try changing the order of your index definition to
or
Just to let you know I am on a different PC so my username has changed but I did write the original question.
What I think would be very helpful is a conversion guide to help people from MS SQL backgrounds covert to MySQL as it seems there’s some difference in the index tuning I didn’t realise especially that different storage engines automatically add in primary keys, plus how to handle the lack of tools to aid performance tuning.
I am used to creating my main clustered index, primary key and other unique constraints and indexes, then non-clustered indexes with included keys, some covering indexes etc.
I will then run a timed job to log the missing index DMV report to a table to prevent the data being lost during any restart. I can then run reports to check for indexes the SQL optimiser thinks “should” be used or those it is “not” using. I can then use this information, counters of mis-hits and potential efficiency percentage if the missing index was used, as a guide to help fine tune the indexing for performance.
As far as I can tell MySQL doesn’t have anything similar to the DMV’s MsSQL has?
The nice graphical execution plan built into MS SQL Studio from aeons ago helps with tuning a lot and the bog standard MySQL explain is poor in comparison. I will look into that tool you mentioned although running a select @@version returns 5.0.51a-24+lenny5-log so I doubt it will help me.
A couple of things though regarding the posts:
The aim was to have a covering index so no bookmark lookups (if you call them that in MySQL) were required and the data could come straight from the index.
As nearly all my posts are “published” (99.99%) and the post_type is nearly all “post” (99.99%) with a tiny percentage of “pages”. There is no selectivity in those two columns and they are in the index for the cover. I’ve turned off auto-drafts to prevent a build up of revisions etc and the number of drafts is very small.
Therefore I would have thought having the post_date as the first key in the index would have been of more help as the LIMIT (as you say is expensive, and I have no control over WordPress’s code) therefore surely the ORDER BY and LIMIT (which is basically a TOP) would have been the most costly and selective part of the query and of more use to the index in comparison with the other keys (which are not selective at all). This is why I put it first.
I am using WordPress and the table is wp_posts and its storage engine is MyISAM which I believe I cannot change due to the requirement for it to have full text searching.
As I said to someone else I already have an index with the order post_type, post_status and post_date but EXPLAIN only shows it in the possible keys and then ignores it to use the index based around these columns: post_status, password and id.
As password is not used in the query and post_status is totally unselective (as all my post_types are “post”) I am a loss to why MySQL’s “clever” optimiser thinks this index should be selected above either the ones provided OR my own?
So I am still stuck as no suggestion seems to work.
I’ve tried changing the ordering several times and even though I only have 20k rows it takes half an hour or more each time! I don’t know if this is normal in MySQL or not but in MSSQL it takes minutes to add/drop indexes on tables with millions of rows.
So as nothing has worked so far I want to know (why?) and obviously about query hints to see if that does any good at all.
I have re-started the DB after re-indexing (and even restarted the webserver).
Thanks for your help.