How can I automatically delete comments that contain Chinese / Russian characters?

Once in a while, I get comments like these:

苍天有眼啊,让我在有生之年得以观得如此精彩绝伦的帖子。

They are absolutely useless to me. I don’t even know if it’s Chinese / Japanese / Korean / …

How can I tell WordPress to automatically delete (not just spam) those messages?

2 comments

  1. The characters you quoted are Han characters (used in written Chinese); they match the Unicode script property \p{Han}.

    You can perform a regular expression search in a plugin like so:

    <?php
    
    /**
     * Plugin Name: Drop comments by chars
     * Description: Trash comments that contain Han, Hangul or Cyrillic characters.
     * Version:     2014.02.18
     * Author:      David Naber
     * Licence:     MIT
     * Licence URI: http://opensource.org/licenses/mit-license.php
     * Copyright:   (c) 2014 David Naber
     * @see         http://wordpress.stackexchange.com/q/116973/31323
     */
    
    /**
     * Check for the occurrence of Han, Hangul and Cyrillic characters.
     *
     * @param string $content
     * @return bool
     */
    function wpse_116973_has_unallowed_char( $content ) {
    
        // \p{...} matches by Unicode script; the u modifier enables UTF-8 mode.
        return (bool) preg_match( '~\p{Hangul}|\p{Han}|\p{Cyrillic}~u', $content );
    }
    
    /**
     * @wp-hook comment_post
     * @param int $comment_ID
     * @param int|string $approved 1, 0, or 'spam'
     * @return void
     */
    function wpse_116973_trash_unallowed_comments( $comment_ID, $approved ) {
    
        $comment = get_comment( $comment_ID );
        if ( ! wpse_116973_has_unallowed_char( $comment->comment_content ) )
            return;
    
        wp_trash_comment( $comment_ID );
    }
    add_action( 'comment_post', 'wpse_116973_trash_unallowed_comments', 10, 2 );
    

    The helper function wpse_116973_has_unallowed_char() checks for characters from the Han (Chinese), Hangul (Korean) and Cyrillic (Russian) scripts. The plugin moves matching comments to the trash.
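
    To see what such a check matches without touching a live site, here is a minimal standalone sketch that runs a script-property pattern outside WordPress. The function name demo_has_unallowed_char() and the sample strings are made up for illustration; it assumes PHP's PCRE was built with Unicode property support, which is the usual case.

    ```php
    <?php

    // Hypothetical helper for illustration only; not part of the plugin.
    function demo_has_unallowed_char( string $content ): bool {
        // \p{Hangul}, \p{Han} and \p{Cyrillic} match single characters by
        // Unicode script. The u modifier makes PCRE treat pattern and
        // subject as UTF-8.
        return (bool) preg_match( '~\p{Hangul}|\p{Han}|\p{Cyrillic}~u', $content );
    }

    // Made-up sample comments: Latin text should pass, the rest should not.
    $samples = array(
        'Nice post, thanks!',
        'Café déjà vu',
        '苍天有眼啊',
        'Привет',
        '안녕하세요',
    );

    foreach ( $samples as $sample ) {
        $verdict = demo_has_unallowed_char( $sample ) ? 'blocked' : 'allowed';
        printf( "%s => %s\n", $sample, $verdict );
    }
    ```

    Japanese kana fall outside these three scripts. If they should be caught as well, adding \p{Hiragana} and \p{Katakana} to the alternation would be one way to do it.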

    If you only want to mark them as spam, use the filter pre_comment_approved like this:

    /**
     * @wp-hook pre_comment_approved
     * @param int|string $approved 1, 0, or 'spam'
     * @param array $commentdata
     * @return int|string The unchanged $approved value, or 'spam'
     */
    function wpse_116973_allow_comment( $approved, $commentdata ) {
    
        if ( wpse_116973_has_unallowed_char( $commentdata[ 'comment_content' ] ) )
            return 'spam';
    
        return $approved;
    }
    add_filter( 'pre_comment_approved', 'wpse_116973_allow_comment', 10, 2 );
    
  2. You can use the built-in comment blacklist under Dashboard > Settings > Discussion (http://www.example.com/wp-admin/options-discussion.php). Add a few characters from the alphabet you want to exclude and it won’t bother you again, I guess.
