DOM manipulation in WordPress - Olatech Community

Im trying to use the DOM in PHP to do a pretty specific job and Ive got no luck so far, the objective is to take a string of HTML from a WordPress blog post (from the DB, this is a wordpress plugin). And then out of that HTML replace <div id="do_not_edit">old content</div>" with <div id="do_not_edit">new content</div>" in its place. Saving anything above and below that div in its structure.

Then save the HTML back into the DB, should be simple really, I have read that a regex wouldnt be the right way to go here so Ive turned to the DOM instead.

The problem is I just cant get it to work, cant extract the div or anything.

Help me!!

UPDATE

The HTML coming out of the wordpress table looks like:

Congratulations on finding us here on the world wide web, we are on a  mission to create a website that will show off your culinary skills  better than any other website does.

<div id="do_not_edit">blah blah</div>
We want this website to be fun and  easy to use, we strive for simple elegance and incredible functionality.We aim to provide a 'complete package'. By this we want to create a  website where people can meet, share ideas and help each other out.

After several different (incorrect) workings all Ive got below is:

$content = ($wpdb->get_var( "SELECT `post_content` FROM $wpdb->posts WHERE ID = {$article[post_id]}" ));        

$doc = new DOMDocument();
$doc->validateOnParse = true; 
$doc->loadHTMLFile($content);
$element = $doc->getElementById('do_not_edit');
echo $element;

Post Views: 2

2 comments

If you are sure that the HTML from WordPress contains only one div, the following should work:

$doc = new DOMDocument();
$doc->validateOnParse = false; 
$doc->loadHTML($content);
$divs = $doc->getElementsByTagName('div');
echo $divs->item(0)->textContent;

If not, try:

$doc = new DOMDocument();
$doc->validateOnParse = false; 
$doc->loadHTML($content);
$divs = $doc->getElementsByTagName('div');

for($i=0; $i<$divs->length; $i++)
{
  $id = $divs->item($i)->attributes->getNamedItem('id');
  if($id && $id->value == 'do_not_edit')
  {
    //your code here...
    $node = $divs->item($i);
    $newText = new DOMText("This is some new content");

    $node->appendChild($newText);
    $node->removeChild($node->firstChild);
    break;
  }
}

$html = $doc->saveHTML();

Your HTML is not a complete HTML document, which is what DOMDocument expects. One option would be to wrap your HTML so it’s a complete document:

$content = ($wpdb->get_var( "SELECT `post_content` FROM $wpdb->posts WHERE ID = {$article[post_id]}" ));

$content = '<html><head><title></title></head><body>'.$content.'</body></html>';

$doc = new DOMDocument();
$doc->validateOnParse = false; 
$doc->loadHTML($content);
$element = $doc->getElementById('do_not_edit');
echo $element;

It’s a bit hacky, but might easily solve the problem.

DOM manipulation

Leave a Reply Cancel reply

2 comments

Social Network

Related posts

Leave a Reply Cancel reply

2 comments

Social Network