PHP function to scrape first image

In a WordPress blog I’m using the following function to scrape the page (single post view) and find the first image and, if none is found, to use a default image:

    function catch_that_image() {
        global $post;
        $first_img = '';

        // The inner single quotes must be escaped inside a single-quoted PHP string.
        // [^>]+ keeps the match inside one <img> tag, so the first image wins.
        if (preg_match('/<img[^>]+src=[\'"]([^\'"]+)[\'"]/i', $post->post_content, $matches)) {
            $first_img = $matches[1];
        }

        if (empty($first_img)) { // Fall back to a default image
            $first_img = "http://custome_url_for_default_image.png";
        }
        return $first_img;
    }

I tried pasting it as-is into a Tumblr theme, but ran into problems: it does not load as a PHP function. Surely I’m missing something. If anyone has an idea for troubleshooting this, I’ll be glad to try it.

Thanks,

P.

1 comment

  1. The best way to do this would be to avoid using regexes to parse HTML.

    Try using DOMDocument:

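    function catch_that_image() {
        global $post;
        $dom = new DOMDocument();
        // Post content is an HTML fragment, so keep libxml parse warnings quiet.
        libxml_use_internal_errors(true);
        $dom->loadHTML($post->post_content);
        libxml_clear_errors();
        // getElementsByTagName() returns matches in document order.
        $imgTags = $dom->getElementsByTagName('img');
        if ($imgTags->length > 0) {
            $imgElement = $imgTags->item(0);
            return $imgElement->getAttribute('src');
        } else {
            return 'http://custome_url_for_default_image.png';
        }
    }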
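
For experimenting outside WordPress, the DOMDocument approach above can be sketched as a standalone function. This is a minimal sketch, not part of the original answer: the name `first_image_src` and its parameters are hypothetical, and it takes the HTML as an argument rather than reading the global `$post`. It also assumes you want to return the default for empty input (which `loadHTML()` rejects) and to keep libxml's parse warnings quiet.

```php
<?php
// Hypothetical helper (not from the original answer): returns the src of the
// first <img> in $html, or $default when no usable image is found.
function first_image_src(string $html, string $default): string
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return $default;
    }

    // Collect libxml parse warnings internally instead of emitting them,
    // then restore whatever setting was active before.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // <img> elements come back in document order; take the first non-empty src.
    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            return $src;
        }
    }

    return $default;
}
```

Inside a WordPress template this could be called as `first_image_src($post->post_content, 'http://custome_url_for_default_image.png')`. Passing the HTML in as a parameter makes the function easy to try on plain strings before wiring it into a theme.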