How do I use preg_replace to insert a text string between the 3rd and 4th paragraphs?

I’m trying to figure out how to create a common newspaper device called a ‘pullquote’ in a WordPress post. (But this isn’t strictly a WordPress question; it’s more a generic Regex question.) I have a tag to surround the text in the post. I want to copy the text between the tags (which I know how to do) and then insert it between the 3rd and 4th instances of the p tags in the post.

The function below finds the text and strips out the tags, but simply prepends the matched text to the beginning. I need help with targeting the 3rd/4th paragraph.

OR… maybe I’m thinking about this wrong. Perhaps there is some way to target the elements as one can do with jQuery nth-child?

Post:

<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].</p>
<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>
<p>And here is a 3rd paragraph.</p>
<p>And here is a 4th paragraph.</p>

Desired Result

<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of Tatort or Bukow & Konig.</p>
<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>
<p>And here is a 3rd paragraph.</p>
<blockquote class="pullquote">Tatort or Bukow & Konig</blockquote>
<p>And here is a 4th paragraph.</p>

So far this is what I have for code:

function jchwebdev_pullquote( $content ) {
    $newcontent = $content;
    $replacement = '$1';
    $matches = array();
    $pattern = "~\[callout\](.*?)\[/callout\]~s";
    // strip out 'shortcode'
    $newcontent = preg_replace($pattern, $replacement, $content);
    if( preg_match($pattern, $content, $matches)) {
      // now have formatted pullquote 
      $pullquote = '<blockquote class="pullquote">' .$matches[1] . '</blockquote>';
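      // now how do I target and insert $pullquote
      // between 3rd and 4th paragraph?
      // ($third_fourth_pattern and $third_fourth_replacement are the missing pieces)
      $newcontent = preg_replace($third_fourth_pattern, $third_fourth_replacement, $newcontent);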
      return $newcontent;
    }
    return $content;    
}
add_filter( 'the_content' , 'jchwebdev_pullquote');

Edit: I want to modify my question to be a bit more WordPress specific. WordPress actually converts newlines to <p> tags. Most WordPress posts don’t even use explicit ‘p’ tags because they are unneeded. The problem with the solutions so far is that they seem to strip out newline chars, so if the post (source text) has newlines, it looks weird.

Typical real world WordPress post:

If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].

If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.

And here is a 3rd paragraph.


And here is a 5th paragraph.

WordPress renders it like so:

<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].</p>
<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>
<p>And here is a 3rd paragraph.</p>
<p></p>
<p>And here is a 5th paragraph.</p>

So in a perfect world, I would like to take that ‘typical real world post’ and have preg_replace render it as :

If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of Tatort or Bukow & Konig.

If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.

And here is a 3rd paragraph.

<blockquote class="callout">Tatort or Bukow & Konig</blockquote>

And here is a 5th paragraph.

…which WordPress will then render as:

<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of Tatort or Bukow & Konig.</p>
<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>
<p>And here is a 3rd paragraph.</p>
<blockquote class="callout">Tatort or Bukow & Konig</blockquote>
<p>And here is a 5th paragraph.</p>

Maybe this has gotten too far off the mark and I should re-post in a WordPress forum, but I think all I need is a way to alter the preg_replace to use a newline char as a delimiter instead of the <p>, and to figure out how to not strip out those newline chars from the returned string.

THANKS FOR ALL THE HELP SO FAR!

3 comments

  1. If you want to use PHP HTML/XML parsing, please refer to How do you parse and process HTML/XML in PHP?.

    If you prefer a regex solution, here is one:

    FIND: (?s)((?:<p>.*?</p>\s*){3})

    This regex captures the first 3 <p> elements (together with any whitespace that follows them) into group 1; the replacement then adds the new node after them.

    REPLACE: $1<blockquote class="pullquote">Tatort or Bukow & Konig</blockquote>\n

    Code:

    $re = "/(?s)((?:<p>.*?<\/p>\s*){3})/"; 
    $str = "<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].</p>\n<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>\n<p>And here is a 3rd paragraph.</p>\n<p>And here is a 4th paragraph.</p>"; 
    // $1 re-emits the three captured paragraphs; the inner quotes must be escaped
    $subst = "$1<blockquote class=\"pullquote\">Tatort or Bukow & Konig</blockquote>\n"; 
    // limit of 1: only the first group of three paragraphs is replaced
    $result = preg_replace($re, $subst, $str, 1);
    

    Demo is here.
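
    The replacement above hard-codes the quote text. As a rough sketch only, the same pattern can be combined with the callout extraction from the question so the quote comes from the post itself. The function name pullquote_after_third_p is hypothetical (it is not from the original post), and preg_replace_callback is used here as a design choice so that a quote containing "$1" or a backslash is inserted literally instead of being read as a backreference.

    ```php
    <?php
    // Hypothetical helper, not part of the original answer.
    // Assumes $content already contains explicit <p>...</p> paragraphs.
    function pullquote_after_third_p(string $content): string
    {
        $callout = '~\[callout\](.*?)\[/callout\]~s';

        // Nothing to do if the post has no [callout] marker.
        if (!preg_match($callout, $content, $m)) {
            return $content;
        }

        // Strip the marker tags but keep the quoted text in place.
        $content = preg_replace($callout, '$1', $content);

        $pullquote = '<blockquote class="pullquote">' . $m[1] . '</blockquote>' . "\n";

        // Match the first three paragraphs (plus trailing whitespace) once,
        // and append the pullquote after them via a callback.
        return preg_replace_callback(
            '~(?:<p>.*?</p>\s*){3}~s',
            function (array $match) use ($pullquote): string {
                return $match[0] . $pullquote;
            },
            $content,
            1
        );
    }
    ```

    If the post has fewer than three paragraphs, the second pattern does not match and the content is returned with only the marker tags stripped.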

  2. You could do this in a single preg_replace function.

    $re = "~^(?:(?!/p).)*<p>(?:(?!/p).)*\[callout\](.*?)\[/callout\].*?</p>(?:[^<>]*<p>.*?</p>){2}[^<]*\K~s";
    $str = "<p>If you wanna improve yer German, don't try to read Heine or some elevated crap... watch old episodes of [callout]Tatort or Bukow & Konig[/callout].</p>\n<p>If I were teaching a music appreciation I wouldn't teach Beethoven. I'd teach Stamitz and average composers.</p>\n<p>And here is a 3rd paragraph.</p>\n<p>And here is a 4th paragraph.</p>";
    $subst = "<blockquote class=\"pullquote\">$1</blockquote>\n";
    $result = preg_replace($re, $subst, $str);
    echo $result;
    

    DEMO

    Code in eval

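  3. Simply using (.*?</p>){3}\K with the s modifier, you can achieve what you want (the limit of 1 stops the pullquote from being inserted again after every further group of three paragraphs):

    $content = preg_replace("@(.*?</p>){3}\K@s", $pullquote, $content, 1);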
    

    I made some changes in your function to work correctly:

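    function jchwebdev_pullquote( $content )
    {
        // brackets must be escaped, otherwise they are read as character classes
        $pattern = "~\[callout\](.*?)\[/callout\]~s";
        if(preg_match($pattern, $content, $matches))
        {
          // strip the [callout] tags but keep the quoted text in place
          $content = preg_replace($pattern, '$1', $content);
          $pullquote = '<blockquote class="pullquote">' .$matches[1] . '</blockquote>';
          // \K drops the three matched paragraphs from the match, so the pullquote is inserted after them
          $content = preg_replace("@(.*?</p>){3}\K@s", $pullquote, $content, 1);
          return $content;
        }
        return $content;    
    }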
    

    Regex live demo

    PHP live demo

    Update #1

    Optimized: Using single preg_replace to avoid multiple patterns to be applied:

    function jchwebdev_pullquote( $content )
    {
        $pattern = "\[callout\](.*?)\[/callout\]";
        if(preg_match("@(?s)$pattern@", $content, $matches))
        {
          // group 2 = quote text, group 3 = everything up to the end of the 3rd paragraph
          $content = preg_replace("@(?s)($pattern)((.*?</p>){3})@", '\2\3<blockquote class="pullquote">\2</blockquote>', $content);
          return $content;
        }
        return $content;
    }
    

    PHP live demo
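
For the plain-text case raised in the question's edit (paragraphs separated by blank lines rather than <p> tags), one possible approach is to split on runs of blank lines while keeping the separators, so that no newline is lost when the pieces are joined again. This is only a sketch: the function name pullquote_plain_text and the $after parameter are hypothetical, and it assumes the function sees the raw post text, that is, it runs before WordPress converts newlines to paragraphs (whether that holds depends on how and when the filter is hooked, which you would need to check).

```php
<?php
// Hypothetical helper, not from the original thread.
// Assumes paragraphs in $content are separated by two or more line breaks.
function pullquote_plain_text(string $content, int $after = 3): string
{
    $callout = '~\[callout\](.*?)\[/callout\]~s';

    // Leave the post untouched if there is no [callout] marker.
    if (!preg_match($callout, $content, $m)) {
        return $content;
    }

    // Strip the marker tags but keep the quoted text in place.
    $content = preg_replace($callout, '$1', $content);
    $pullquote = '<blockquote class="pullquote">' . $m[1] . '</blockquote>';

    // Split on blank-line runs and keep each separator as its own element:
    // [para1, sep, para2, sep, para3, sep, ...]
    $parts = preg_split('~(\R{2,})~', $content, -1, PREG_SPLIT_DELIM_CAPTURE);

    // Index of the paragraph after which the pullquote should go.
    $index = 2 * ($after - 1);
    if (!isset($parts[$index])) {
        return $content; // fewer paragraphs than requested
    }

    // Insert a blank line plus the pullquote right after that paragraph;
    // the original separator that followed it is kept as-is.
    array_splice($parts, $index + 1, 0, ["\n\n" . $pullquote]);

    return implode('', $parts);
}
```

Because the original separators are re-joined unchanged, any extra blank lines in the source text survive; only one blank line and the blockquote are added.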