WordPress slug issue with non-latin characters

I’m using permalinks in WP as: domain.com/category/post_name

The issue is that post names contain non-Latin characters such as Chinese, Hebrew, and Arabic. WordPress percent-encodes them to something like %20%18%6b%20, so every character of the encoded form counts toward the length limit. The slug ends up several times longer than the title (three characters per encoded byte), and even very short slugs get truncated.

How can I fix that, or at least extend the length limit? I’ve tried extending the database field “post_name” from 200 to 500, but slugs are still truncated.
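The length blow-up described above can be reproduced outside WordPress. This is a minimal sketch in plain PHP, not WordPress's own code: it assumes the sample title 中文拼音 (inferred from the percent-encoded slug quoted in the pinyin-slug description further down) and uses PHP's rawurlencode() as a stand-in for the encoding WordPress applies to slugs.

```php
<?php
// Sample title (assumption): four Chinese characters.
$title = '中文拼音';

// rawurlencode() turns every UTF-8 byte into three ASCII characters (%XX).
// strtolower() only matches the lowercase hex form seen in WordPress slugs.
$encoded = strtolower(rawurlencode($title));

echo preg_match_all('/./u', $title), "\n"; // 4  characters in the title
echo strlen($title), "\n";                 // 12 UTF-8 bytes (3 per character)
echo strlen($encoded), "\n";               // 36 characters in the encoded slug
echo $encoded, "\n";                       // %e4%b8%ad%e6%96%87%e6%8b%bc%e9%9f%b3

// With 9 encoded characters per Chinese character, a 200-character limit
// leaves room for only 22 characters of the original title.
echo intdiv(200, 9), "\n";                 // 22
```

Scripts that use 2 bytes per character in UTF-8, such as unpointed Hebrew or Arabic letters, cost 6 encoded characters each instead of 9, so the same limit holds about 33 of them.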

2 comments

  1. Permalinks like http://example/שָׁלוֹם actually work in my WordPress 3.3. This could be due to the remove_accents() improvements for i18n permalinks.

    As Sean & Steve noted,

    • make sure you’re using WordPress ≥ 3.3
    • make sure your .htaccess file contains a rule similar to RewriteRule . /index.php [L]
    • check that your database is UTF-8 encoded (and consider converting to UTF-8 if not).
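For the last point in the checklist above, one place to look is the character set WordPress uses for its database connection, which is set in wp-config.php. The fragment below is a sketch of the relevant constants only; the values shown are the common defaults of that era and are an assumption about your install, and they do not by themselves convert tables that were created with another encoding.

```php
<?php
// Fragment of wp-config.php (values are assumptions; check your own file).
// DB_CHARSET is the character set used for the database connection and for new tables.
define('DB_CHARSET', 'utf8');

// DB_COLLATE is usually left empty so the default collation for the charset applies.
define('DB_COLLATE', '');
```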

    [My original answer follows, not so relevant now but maybe still useful:]

    See

    If your post titles contain some ASCII characters, you can strip out non-ASCII characters when generating post slugs.

    Some plugins may help:

    • http://wordpress.org/extend/plugins/strings-sanitizer/

      Aggressively sanitizes titles for clean, SEO friendly post slugs, and media filenames during upload. Works by converting common accentuated UTF-8 characters, as well as a few special cyrillic, hebrew, spanish and german characters.

    • http://wordpress.org/extend/plugins/universal-slugs/

      […] if you happen to speak a language that uses characted that are not included on the English alphabet, then you either have to bear with massive, odd looking permalinks, or manually update each one whenever you write a post or a page. […] The plugin will also remove common words such as “and”, “και”, “το”, “the” etc. from the URLs, as they do simply contribute to the URL length without adding anything to the meaning or the SEO value.

    • http://wordpress.org/extend/plugins/pinyin-slug/

      For example, when you publish a post with a title like this: “Chinese PinYin” WordPress automatically assigns a long filename to your post, called a post slug: /%e4%b8%ad%e6%96%87%e6%8b%bc%e9%9f%b3 […] With Chinese PinYin plugin activated, the slug for our example blog post would look like this: /zhongwenpinyin

    • http://wordpress.org/extend/plugins/remove-utf-8-from-slug/

      remove all UTF-8 from title to permalink

    • http://wordpress.org/extend/plugins/pinyin-seo/

      Convert Chinese characters to Pinyin Permalinks.

    Also, some of the multilingual plugins might be able to translate your slugs into English (and hence Latin-only characters) automatically, but I haven’t used any of them, so I’m not sure.

  2. Apart from sanitizing, the only way to extend the slug length is to modify the WordPress core code.

    In wp-includes/formatting.php, replace 200 with the limit you need:

    $title = utf8_uri_encode($title, 200);
    

    In wp-includes/post.php, search for the 3 lines containing:

    $alt_post_name = substr( $slug, 0, 200 - ( strlen( $suffix ) + 1 ) ) . "-$suffix";
    

    The problem will haunt you with each WP update, because updating overwrites these core edits and they have to be reapplied.
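To make the arithmetic in that post.php line easier to follow, here is a rough standalone sketch. build_alt_post_name() is a hypothetical helper, not a WordPress function; it mirrors the quoted expression with the hard-coded 200 turned into a parameter, and the sample slug is made up for illustration.

```php
<?php
// Hypothetical helper mirroring the quoted post.php line.
// $limit replaces the hard-coded 200; it is not a WordPress function.
function build_alt_post_name(string $slug, int $suffix, int $limit = 200): string
{
    // Reserve room for "-" plus the digits of the suffix, then truncate the slug.
    $room = $limit - (strlen((string) $suffix) + 1);
    return substr($slug, 0, $room) . "-$suffix";
}

// Made-up slug: one encoded Chinese character repeated 30 times (270 ASCII characters).
$slug = str_repeat('%e4%b8%ad', 30);

$default = build_alt_post_name($slug, 2);      // limit of 200, as in core
$raised  = build_alt_post_name($slug, 2, 500); // raised limit

echo strlen($default), "\n"; // 200: slug cut to 198 characters, plus "-2"
echo strlen($raised), "\n";  // 272: the full 270-character slug, plus "-2"
```

Whatever limit is chosen presumably has to be the same in formatting.php, in post.php, and in the size of the post_name column, otherwise the shortest of them still truncates the slug.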