How does WordPress generate URL slugs?

Is there a page somewhere that details exactly how WordPress generates slugs for URLs? I’m writing a script that needs to generate URL slugs identical to the ones WordPress generates.

5 comments

  1. As @SinisterBeard rightly pointed out in a comment on the question a couple of years ago, this answer is long outdated: the function(s) mentioned below have since been superseded by a newer API:

    See wp_unique_post_slug.

    Original Answer

    Off the bat, I can’t give you a page/tutorial/documentation on how WP slugs are generated, but take a look at the sanitize_title() function.

    Don’t be misled by the function name: it is not meant to sanitize a title for further use as a page/post title. It takes a title string and returns a version that can be used in a URL:

    • strips HTML & PHP
    • strips special chars
    • converts all characters to lowercase
    • replaces whitespace and periods with hyphens/dashes (underscores are kept)
    • reduces multiple consecutive dashes to one

    There might be edge cases where core does something additional (you’d have to look at the source to verify that sanitize_title() always suffices to generate exactly the slug you expect), but this should cover at least 99% of cases, if not all of them.
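
    The steps listed above can be sketched in plain PHP, without loading WordPress. This is only an approximation: approximate_wp_slug() is a hypothetical helper, not core code, and it simply drops non-ASCII characters instead of handling accents or encodings. It also keeps underscores, which is my reading of how core behaves; treat that as an assumption and check it against the source.

```php
<?php
// Hypothetical helper: approximates the listed steps in plain PHP.
// It is NOT WordPress core code; accents, entities and other non-ASCII
// input are simply dropped here.
function approximate_wp_slug(string $title): string
{
    $slug = strip_tags($title);                        // strip HTML and PHP tags
    $slug = strtolower($slug);                         // lowercase (ASCII letters)
    $slug = str_replace('.', '-', $slug);              // periods become hyphens
    $slug = preg_replace('/[^a-z0-9 _-]/', '', $slug); // drop other special characters
    $slug = preg_replace('/\s+/', '-', $slug);         // whitespace runs become one hyphen
    $slug = preg_replace('/-+/', '-', $slug);          // collapse repeated hyphens
    return trim($slug, '-');                           // no leading or trailing hyphen
}

echo approximate_wp_slug('<b>Hello</b>,   World v2.0!'), "\n"; // hello-world-v2-0
```

    Comparing the output of such a sketch with what sanitize_title() returns on a real install is a quick way to find the edge cases mentioned above.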

  2. You can use this function:

    static public function slugify($text)
    {
      // replace runs of characters that are not letters or digits with -
      $text = preg_replace('~[^\pL\d]+~u', '-', $text);
    
      // transliterate
      $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
    
      // remove everything except hyphens and word characters
      $text = preg_replace('~[^-\w]+~', '', $text);
    
      // trim
      $text = trim($text, '-');
    
      // remove duplicate -
      $text = preg_replace('~-+~', '-', $text);
    
      // lowercase
      $text = strtolower($text);
    
      if (empty($text)) {
        return 'n-a';
      }
    
      return $text;
    }
    

    It’s roughly how the WP URL sanitize function works.

  3. Core at your service

    There’s no developer mode built into WordPress aside from WP_DEBUG, which doesn’t help much in this case. Basically, WP uses the “Rewrite API”, a function-based, low-level wrapper for the WP_Rewrite class, which you can read about in the Codex. The global $wp_rewrite object is at your service if you want to inspect it or interact with the class.
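
    As a minimal sketch of inspecting that global: the helper below lists the active rewrite rules when run inside WordPress and returns an empty array anywhere else. inspect_rewrite_rules() is a hypothetical name; the only core pieces it relies on are the $wp_rewrite global and its wp_rewrite_rules() method.

```php
<?php
// Sketch: list the active rewrite rules, if WordPress is loaded.
// inspect_rewrite_rules() is a hypothetical helper, not a core function.
function inspect_rewrite_rules(): array
{
    $rewrite = $GLOBALS['wp_rewrite'] ?? null;

    // Outside of WordPress neither the global nor the class exists.
    if (!($rewrite instanceof WP_Rewrite)) {
        return [];
    }

    // wp_rewrite_rules() returns the rules as pattern => query pairs.
    $rules = $rewrite->wp_rewrite_rules();
    return is_array($rules) ? $rules : [];
}

foreach (inspect_rewrite_rules() as $pattern => $query) {
    echo $pattern, ' => ', $query, "\n";
}
```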

    Plugins that help you look into it

    Toscho’s “T5 Rewrite” plugin and Jan Fabry’s “Monkeyman Rewrite Analyzer” plugin will guide your way. I’ve written a small extension for “T5 Rewrite” to smoothly integrate it with the “Monkeyman Rewrite Analyzer”, which you can find in the “T5 Rewrite” repo’s wiki on GitHub.

    The “Monkeyman” plugin adds a new page, filed in the admin UI menu under Tools. The “T5 Rewrite” plugin adds a new help tab to the Settings > Permalinks page. My extension adds the help tabs to the mentioned Tools page too.

    Here’s a screenshot of what the “T5 Rewrite” plugin’s help tab content looks like.

    [Screenshot: “T5 Rewrite” help tab]

    Vorlage = Pattern | Beschreibung = Explanation | Beispiele = Examples

    Notes

    The “T5 Rewrite” plugin does a wonderful job of helping you inspect the rewrite object. And it does even more: it adds new possibilities. That’s why it is part of my basic plugins package (at least in my installations).

  4. Actually, if you look at the core function wp_insert_post() (in post.php), you will see that it does the following:

    $data['post_name'] = wp_unique_post_slug( sanitize_title( $data['post_title'], $post_ID ), $post_ID, $data['post_status'], $post_type, $post_parent );
    
    $wpdb->update( $wpdb->posts, array( 'post_name' => $data['post_name'] ), $where );
    

    The key thing to note is that it uses both wp_unique_post_slug and sanitize_title:

    wp_unique_post_slug( sanitize_title( 
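
    To illustrate why the second function matters, here is a rough sketch of the uniqueness step. It is not core’s implementation: unique_slug_sketch() is a hypothetical helper that checks a plain array instead of the database and ignores the other rules core applies (post type, parent, reserved names). The numbering from 2 upwards reflects my understanding of the core behaviour.

```php
<?php
// Hypothetical stand-in for the uniqueness step: core checks the database,
// this sketch checks a plain array of slugs that are already taken.
function unique_slug_sketch(string $slug, array $taken): string
{
    if (!in_array($slug, $taken, true)) {
        return $slug; // slug is free, use it unchanged
    }

    $suffix = 2; // assumption: numbering starts at 2, as I understand core to do
    while (in_array($slug . '-' . $suffix, $taken, true)) {
        $suffix++;
    }

    return $slug . '-' . $suffix;
}

echo unique_slug_sketch('hello-world', ['hello-world', 'hello-world-2']), "\n"; // hello-world-3
```

    A script that has to reproduce WordPress slugs exactly therefore needs to know which slugs already exist, not just how a title is sanitized.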
    
  5. Forgive me for reviving an old question, but I had the same need and found that this method works perfectly for me:

    $some_string = "DON'T STOP ME NOW!";
    $slug = sanitize_title(sanitize_title($some_string, '', 'save'), '', 'query');
    echo $slug; // dont-stop-me-now
    

    This method uses a double sanitization.

    The first one uses the save mode, where HTML and PHP tags are stripped, and accents are removed (accented characters are replaced with non-accented equivalents).

    The second pass uses the query mode, which ensures all spaces are replaced with dashes and other punctuation is removed.
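
    For anyone who needs the same effect outside WordPress, the two passes can be imitated in plain PHP. This is only a sketch under loud assumptions: both function names are hypothetical, and the accent map covers just a handful of lowercase characters, whereas core handles far more. Anything not in the map is dropped.

```php
<?php
// Hypothetical, heavily trimmed stand-in for the accent removal step.
// Only a few lowercase characters are mapped; core covers many more.
function strip_accents_sketch(string $text): string
{
    $map = [
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ä' => 'a',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
        'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
        'ç' => 'c', 'ñ' => 'n',
    ];
    return strtr($text, $map); // replaces listed characters, leaves the rest
}

function two_pass_slug_sketch(string $text): string
{
    // Pass 1: strip tags and accents (what the answer attributes to 'save').
    $text = strip_accents_sketch(strip_tags($text));
    // Pass 2: lowercase, drop punctuation, turn spaces into dashes.
    $text = preg_replace('/[^a-z0-9 _-]/', '', strtolower($text));
    return trim(preg_replace('/[\s-]+/', '-', $text), '-');
}

echo two_pass_slug_sketch("DON'T STOP ME NOW!"), "\n"; // dont-stop-me-now
echo two_pass_slug_sketch('Crème brûlée'), "\n";       // creme-brulee
```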

    Hope this helps someone! 🙂