Fulltext search in posts sorted by relevancy

I’m building a custom search page and what I need is to sort results by relevancy (eg. occurences of a keyword)

Using MySQL docs I’ve built this query:

Read More
SELECT ID
FROM $wpdb->posts
WHERE MATCH (post_title, post_content) AGAINST ('$keyword_list')
AND post_status = 'publish'
AND post_type = 'issue'

Problem A) If I test it using a keyword(s) that appears multiple times in my test posts, it
does find those posts, but it’s not sorted correctly by relevancy. For example the first found posts has around 3 occurences of the keyword while the posts returned on 3rd place has roughly 15 occurences.

Problem B) There are two posts that contain keyword “medicare”, if I search for “medicare” it does return those posts, however when I search for “medi”, “medic” etc. it doesn’t return anything.

Is there a way to query it using MATCH/AGAINST or do I have to use LIKE % – in that case, how to sort by relevancy.

Thanks

Related posts

Leave a Reply

3 comments

  1. You may want to check the Sphinx search engine
    http://sphinxsearch.com/

    It took me about one day to set it up and learn how to use, but it does really good job with sorting by relevancy and grammar stemming. Also it is faster than fulltext search for big data sets thanks to indexing.

  2. This won’t solve problem B, but as far as problem A: I don’t think full text searches automatically sort by relevance. I’m pretty sure you have to select the “AGAINST” clause AS something to even get the relevance into the results. Like:

    ... WHERE MATCH (post_title, post_content) AGAINST ('$keyword_list') as relevance ...
    ... ORDER BY relevance
    

    But I’d also agree with Aleh: if you’re looking for a sophisticated search, you’ll probably need something designed for that purpose, like Sphinx.

  3. This may or may not be important in your case, but FULLTEXT search works very strangely when there isn’t very much text to search against. It’s really a pain in the neck to test on a small data set. You may want to try boolean mode when you’re working with a small text data set.

    You should try…

     WHERE MATCH (post_title, post_content) 
           AGAINST ('keyword keyword keyword' IN BOOLEAN MODE)
    

    @MathSmath’s point about ordering by relevance is also correct.