Why won’t my preg_replace work with content_save_pre?

I’ve written a filter function to replace all image URLs in a post with URLs from the Uploads directory of my site. However, when I use the content_save_pre filter, it doesn’t work, but if I use the content_edit_pre filter, it does. Here is my code:

$oldPostContent = $post->post_content; // pass old content into function as argument
function get_the_post_imgs($oldPostContent) {
    global $post;
    $newPostContent = $oldPostContent;
    $newPostContent = preg_replace_callback(
        "/<img.*src=[\"']([^\"']*)[\"'].*>/i", // pattern to match to (i.e. the contents of an <img src="..." /> tag)
        function ($match) {
            return "hello world"; // obviously it's a lot more complicated than this but for your sake i've massively simplified it so the src="..." content is just replaced with "hello world" for now
        },
        $oldPostContent
    );

    return $newPostContent; // return new manipulated content
}

add_filter('content_save_pre', 'get_the_post_imgs'); // this doesn't work on the content, but if I use content_edit_pre, it does work

I’ve added some comments along the way to help you out. I’m pretty sure it’s not a problem with the regex (http://regex101.com/r/aP4fU9) so it’s something to do with my code not working the way content_save_pre needs it to work.

Thanks for any help 🙂

1 comment

  1. I copied your code to a file in my home directory on my Ubuntu box, removed the weird character that @s_ha_dum noticed, and ran it. It worked as I’d expect.

    Code:

    <?php
    
    function get_the_post_imgs($oldPostContent) {
        $newPostContent = $oldPostContent;
        $newPostContent = preg_replace_callback(
            "/<img.*src=[\"']([^\"']*)[\"'].*>/i", // pattern to match to (i.e. the contents of an <img src="..." /> tag)
            function ($match) {
                return "hello world"; // obviously it's a lot more complicated than this but for your sake i've massively simplified it so the src="..." content is just replaced with "hello world" for now
            },
            $oldPostContent
        );
    
        return $newPostContent; // return new manipulated content
    }
    
    echo get_the_post_imgs( '<a href="foo.php"><img align="right" src="http://example.com/fire.gif" /></a><br />' );
    
    ?>
    

    Running this code from the command line:

    % php -e foo.php 
    <a href="foo.php">hello world
    

    Your regex might be a little greedy for your purposes, but it works.
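
    One thing a standalone run like this cannot show is what the content looks like when WordPress hands it to the filter. As an assumption that nobody in this thread has confirmed, content reaching content_save_pre may still be slash-escaped (src=\"...\"), while content_edit_pre sees it unescaped. The sketch below only imitates that situation with plain PHP's addslashes() and stripslashes(); replace_imgs() is a hypothetical stand-in for the function above, and no WordPress functions are involved.

    ```php
    <?php

    // Hypothetical stand-in for get_the_post_imgs(), using the same pattern.
    function replace_imgs($content) {
        return preg_replace_callback(
            "/<img.*src=[\"']([^\"']*)[\"'].*>/i",
            function ($match) {
                return "hello world";
            },
            $content
        );
    }

    $plain   = '<a href="foo.php"><img align="right" src="http://example.com/fire.gif" /></a><br />';
    $slashed = addslashes($plain); // assumption: imitates slash-escaped content, i.e. src=\"...\"

    // The pattern expects a quote right after src=, but slashed input has a backslash there,
    // so nothing matches and the content comes back unchanged.
    $untouched = replace_imgs($slashed);

    // Unslash, replace, then re-slash so the caller gets the same escaping back.
    $fixed = addslashes(replace_imgs(stripslashes($slashed)));

    var_dump($untouched === $slashed); // whether the slashed input passed through unchanged
    echo $fixed, "\n";
    ```

    If that assumption holds for your setup, wp_unslash() and wp_slash() would be the likely WordPress counterparts to use inside the filter, but check what your filter actually receives before relying on this.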

    PS, trying to run your code with the weird character still in place:

    % php -e foo.php
    PHP Parse error:  syntax error, unexpected '"/<img.*src=["']([^"']*)["'' (T_CONSTANT_ENCAPSED_STRING) in /home/username/foo.php on line 6
    

    Edit — Since you asked in the comments, I thought I’d try to cobble up a less greedy regex for you. Try this on for size:

    <img.*src=["']([^"']*)["'][^>]*>
    

    The extra [^>]* term at the end stops the match at the <img...> tag's own closing >, instead of carrying forward to the last > in the line (in my example, the closing > of the <br /> tag).
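
    To see the difference side by side, here is a small sketch that runs both patterns over the same test string used earlier. The variable names ($greedy, $tighter, $replace) are just for illustration.

    ```php
    <?php

    $html = '<a href="foo.php"><img align="right" src="http://example.com/fire.gif" /></a><br />';

    // Original pattern: the trailing .* runs on to the last > in the line.
    $greedy  = "/<img.*src=[\"']([^\"']*)[\"'].*>/i";
    // Tighter pattern: [^>]* cannot cross a >, so the match ends with the <img> tag.
    $tighter = "/<img.*src=[\"']([^\"']*)[\"'][^>]*>/i";

    $replace = function ($match) {
        return "hello world";
    };

    $greedyResult  = preg_replace_callback($greedy, $replace, $html);
    $tighterResult = preg_replace_callback($tighter, $replace, $html);

    echo $greedyResult, "\n";  // <a href="foo.php">hello world
    echo $tighterResult, "\n"; // <a href="foo.php">hello world</a><br />
    ```

    Note that the .* before src is still greedy, so two <img> tags on the same line would still be swallowed as a single match.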