Character replacement encoding php

I have a string in which I want to replace every ‘a’ character with the Greek ‘α’ character. I don’t want to touch anything inside the HTML elements in the string, e.g. <a href="http://a-url-with-a-characters">text</a>.

The function:

function grstrletter($string){

    $skip = false;
    $str_length = strlen($string);

    for ($i=0; $i < $str_length; $i++){

        if($string[$i] == '<'){
            $skip = true;
        }

        if($string[$i] == '>'){
            $skip = false;
        }

        if ($string[$i]=='a' && !$skip){
            $string[$i] = 'α';
        }
    }

    return $string;

}

Another function I wrote works perfectly, but it doesn’t take the HTML elements into account.

function grstrletter_no_html($string){

 return strtr($string, array('a' => 'α'));

}

I also tried a lot of the encoding functions that PHP offers, with no luck.

When I echo the Greek letter directly, the browser outputs it without a problem. When I return the string from the function, the browser shows the familiar question mark inside a diamond wherever a replacement occurred.

My header has <meta http-equiv="content-type" content="text/html; charset=UTF-8">, and I also tried PHP's header('Content-Type: text/html; charset=utf-8'), but again with no luck.

The string comes from a database in UTF-8 and the site runs on WordPress, so I just use the WordPress functions to get the content I want. I don’t think it is a database problem, because when I use my function grstrletter_no_html() everything works fine.

The problem seems to happen when I iterate the string character by character.

The file is saved as UTF-8 without BOM (Notepad++). I also tried changing the encoding of the file, again with no luck.

I also tried replacing the Greek letter with the corresponding HTML entities (the numeric character reference for α and the named entity &alpha;), but got the same result.

I haven’t tried any regex yet.

I would appreciate any help and thanks in advance.

Tried: Greek characters encoding works in HTML but not in PHP

EDIT

The solution, based on deceze’s brilliant answer:

function grstrletter($string){

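    $skip = false;

    // strlen() is re-evaluated on every pass because each replacement
    // makes the string one byte longer.
    for ($i=0; $i < strlen($string); $i++){

        if($string[$i] == '<'){
            $skip = true;
        }

        if($string[$i] == '>'){
            $skip = false;
        }

        if ($string[$i]=='a' && !$skip){
            // 'α' is two bytes in UTF-8, so splice it in instead of
            // assigning it to a single byte offset.
            $string = substr($string, 0, $i) . 'α' . substr($string, $i+1);
            $i++; // step over the second byte of the inserted 'α'
        }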
    }

    return $string;

}
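
For comparison, here is a regex-based sketch that is not taken from deceze's answer. It splits the string into tags and text, then reuses the strtr() call from grstrletter_no_html() on the text parts only. The name grstrletter_split is hypothetical, and the sketch assumes UTF-8 input, a UTF-8 source file, and tags that contain no literal '>' inside attribute values.

```php
<?php
// Hypothetical helper, not part of the original post.
// Assumes UTF-8 input and that this file itself is saved as UTF-8.
function grstrletter_split($string)
{
    // Split on anything that looks like a tag. PREG_SPLIT_DELIM_CAPTURE
    // keeps the matched tags in the result, so the array alternates:
    // text, tag, text, tag, ..., text.
    $parts = preg_split('/(<[^>]*>)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // Even indexes hold text between tags; odd indexes hold the tags.
        if ($i % 2 === 0) {
            $parts[$i] = strtr($part, array('a' => 'α'));
        }
    }

    return implode('', $parts);
}

echo grstrletter_split('a <a href="http://a">banana</a> a');
// α <a href="http://a">bαnαnα</a> α
```

Unlike the loop above, this leaves an unclosed '<' (one with no matching '>') as ordinary text, so it may behave differently on malformed markup.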

1 comment

  1. The problem is that you’re setting only a single byte of your string. Example:

    $str = "\x00\x00\x00";
    
    var_dump(bin2hex($str));
    
    $str[1] = "\xff\xff";
    
    var_dump(bin2hex($str));
    

    Output:

    string(6) "000000"
    string(6) "00ff00"
    

    You’re setting a two-byte character, but only one byte of it is actually pushed into the string. The second result here would have to be 00ffff for your code to work.

    What you need is to cut the string from 0 to $i - 1, concatenate the 'α' into it, then concatenate the rest of the string $i + 1 to end onto it if you want to insert a multibyte character. That, or work with characters instead of bytes using the mbstring functions.

    For more background information, see What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text.
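
A minimal sketch of the mbstring route mentioned above, walking the string character by character and building a new string instead of writing into byte offsets. The function name grstrletter_mb is hypothetical, and the sketch assumes the mbstring extension is loaded and that both the input and the source file are UTF-8.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes ext-mbstring is available and everything is UTF-8.
function grstrletter_mb($string)
{
    $skip = false;
    $out = '';
    $length = mb_strlen($string, 'UTF-8'); // length in characters, not bytes

    for ($i = 0; $i < $length; $i++) {
        // mb_substr() returns one whole character, however many bytes it has.
        $char = mb_substr($string, $i, 1, 'UTF-8');

        if ($char === '<') {
            $skip = true;
        } elseif ($char === '>') {
            $skip = false;
        }

        // Appending to a new string avoids single-byte offset writes.
        $out .= ($char === 'a' && !$skip) ? 'α' : $char;
    }

    return $out;
}

echo grstrletter_mb('a <a href="http://a">banana</a> a');
// α <a href="http://a">bαnαnα</a> α
```

Calling mb_substr() once per character keeps the sketch short, but it can be slow on long strings.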