Replace all images in an XML file

I have imported all my WordPress content and now I want to replace all the images with a placeholder image. The most obvious way, I think, is to search and replace all the image URLs. I tried doing it manually, but the file is big enough to make me rethink that.

This is an example of wordpress exported XML file: https://wpcom-themes.svn.automattic.com/demo/theme-unit-test-data.xml

I would like to replace all image URLs with placehold.it URLs (http://placehold.it/).

I am using the Sublime Text editor. Is there a regex to find all the image URLs in an XML file? I am not very good with regex.

Thanks in advance!

2 comments

  1. A simple regex to replace all image src attributes with some placeholder text would be:

    Search for:

    <img (.*?)src=".*?"
    

    Replace with:

    <img $1src="http://example.com"
    

    If you want to use a placehold.it URL that matches each image's dimensions, search for:

    <img (.*?)src=".*?"(.*?)width="(\d+)" height="(\d+)"
    

    Replace with:

    <img $1src="http://placehold.it/$3x$4"$2width="$3" height="$4"
    

    Explanation:

    • .*? matches 0 or more characters, as few as possible (non-greedy)
    • \d+ means 1 or more digits
    • ( and ) capture the contents of the parentheses and save them to $1, $2, $3, etc.

    • <img (.*?)src captures any characters between <img and src and saves them in $1, so if there is a class attribute, an ID, or anything like that, it will be saved as $1. .*? can also match nothing, so $1 can also be blank.

    • width="(\d+)" captures the digits that give the image width, and saves them to $3 (since it's the third set of parentheses in that regular expression).
  2. regex:

    (<img\s+.*?src\s*=\s*)(?|"(.*?)"|'(.*?)')(.*?/?>)
    

    replacement:

    $1"http://placehold.it/"$3
    

    If your editor supports regex search and replace, use the above; otherwise, in PHP:

    $string = preg_replace( '/(<img\s+.*?src\s*=\s*)(?|"(.*?)"|\'(.*?)\')(.*?\/?>)/is', '$1"http://placehold.it/"$3', $string );
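
If the export is too large to handle comfortably in an editor, the same two patterns can be applied from a script. Below is a minimal sketch in Python, not a definitive implementation. The function names `replace_sized` and `replace_any` and the constant `PLACEHOLDER` are made up for illustration. Python's `re` module has no branch-reset group `(?|...)`, so the second pattern is swapped for an equivalent that uses a backreference to the opening quote.

```python
import re

PLACEHOLDER = "http://placehold.it/"

# First answer's pattern: attributes before src, attributes between src and
# width, then the width and height digits.
SIZED_IMG = re.compile(r'<img (.*?)src=".*?"(.*?)width="(\d+)" height="(\d+)"')

# Variant of the second answer's pattern. Group 2 is the opening quote and
# \2 requires the same quote to close the attribute value.
ANY_IMG = re.compile(
    r'''(<img\s+.*?src\s*=\s*)(["'])(.*?)\2(.*?/?>)''',
    re.IGNORECASE | re.DOTALL,
)


def replace_sized(xml):
    """Point src at a placeholder sized from the width/height attributes."""
    def repl(match):
        before, between, width, height = match.groups()
        return (
            f'<img {before}src="{PLACEHOLDER}{width}x{height}"'
            f'{between}width="{width}" height="{height}"'
        )
    return SIZED_IMG.sub(repl, xml)


def replace_any(xml):
    """Point every src at the bare placeholder URL, whatever the quoting."""
    def repl(match):
        return f'{match.group(1)}"{PLACEHOLDER}"{match.group(4)}'
    return ANY_IMG.sub(repl, xml)


if __name__ == "__main__":
    sample = '<img class="a" src="http://x/y.png" alt="z" width="300" height="200" />'
    print(replace_sized(sample))
    print(replace_any("<img src='http://x/y.png' alt='z'>"))
```

The two functions are alternatives rather than steps: running `replace_any` after `replace_sized` would overwrite the sized URLs with the bare one. To process the real export, read the file into a string, pass it through one of them, and write the result to a new file so the original stays untouched.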