I want to take some HTML content and operate on a few particular attributes in a few particular tags.
I am trying to implement the following algorithm in PHP:
- Find all ‘a’ tags and for each
  - if its href attribute matches a given url
    - if this tag contains an ‘img’ tag with a matching ‘src’ attribute
      - do stuff: may require changing contents/attributes of current ‘a’ and ‘img’ tags
    - endif (this tag contains an ‘img’ tag with a matching ‘src’ attribute)
  - endif (its href attribute matches a given url)
- Next
I am trying to implement this within a WordPress plugin, so we can assume that I have all the functionality of that platform available.
What methods/techniques can I use to implement this? I am really bad with regular expressions, so I was hoping there might be an easier way to do this. If anyone can recommend a good way to build the regular expressions, that would be acceptable too.
Any help would be greatly appreciated!
Whatever you do, don’t try to parse the HTML using regex!
Parse it with a parser 🙂
For example, PHP’s DOM extension (DOMDocument).
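As a rough sketch of how the algorithm from the question might look with PHP's DOM extension: the function name rewrite_matching_links, the exact-string comparison of href/src, and the "do stuff" step (adding a class and a data attribute) are all assumptions made for illustration, not something the question specifies.

```php
<?php
// Hypothetical helper (not from the original post): walks every <a>, and when
// its href matches $targetHref and it contains an <img> whose src matches
// $targetSrc, applies an illustrative "do stuff" step to both tags.
function rewrite_matching_links(string $html, string $targetHref, string $targetSrc): string
{
    $doc = new DOMDocument();

    // Real-world markup is rarely valid; collect libxml warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    // The XML encoding hint makes libxml read the input as UTF-8.
    // The wrapper <div> lets us get the fragment back without <html>/<body>.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list into an array so edits cannot disturb the iteration.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $a) {
        // Assumption: "matches" means exact string equality.
        if ($a->getAttribute('href') !== $targetHref) {
            continue;
        }
        foreach (iterator_to_array($a->getElementsByTagName('img')) as $img) {
            if ($img->getAttribute('src') !== $targetSrc) {
                continue;
            }
            // "Do stuff": placeholder changes, replace with the real logic.
            $a->setAttribute('class', 'matched-link');
            $img->setAttribute('data-matched', '1');
        }
    }

    // Serialize only the children of our wrapper <div> (the first div in the document).
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

In a WordPress plugin, a function like this could presumably be attached to the the_content filter via add_filter, so that it receives the post HTML and returns the modified version; the two target URLs would have to come from wherever the plugin stores its settings.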
This sounds like a good job for an HTML parser 🙂