My first attempt using RE has me stuck. I’m using Regex on a WordPress website via the Search-Regex Plugin and need to match on a specific " buried within a bunch of HTML code. HTML example:
provide brand-strengthening efforts for the 10-school conference. </p>
<p>
<a href="http://www.learfield.com/oldblog/.a/6a00d8345233fa69e201157155a6fc970c-pi">
<img alt="MOvalleyConf500"
border="0"
class="at-xid-6a00d8345233fa69e201157155a6fc970c"
src="http://www.learfield.com/oldblog/.a/6a00d8345233fa69e201157155a6fc970c-800wi"
style="border: 1px solid black; margin: 0px; width: 502px; height: 384px;"
title="MOvalleyConf500" />
</a>
</p>
<p>The photo above
In the above example, there are three targets:
6a00d8345233fa69e201157155a6fc970c-pi"
6a00d8345233fa69e201157155a6fc970c"
6a00d8345233fa69e201157155a6fc970c-800wi"
The regex I’m using is /6a00d834.*?"/. It locates them; however, I only want to match on the ending " and not the entire string. These are images that are missing their file extension, so I need to replace the ending " with .jpg". I understand the replacement part of the expression; it’s the initial matching I’m having trouble with.
I have a bunch of these (221). All the targets begin with 6a00d834, followed by some random alphanumeric characters, and end with a ".
Appreciate any insight. Thanks.
Edit added from OP’s comment: Actually it’s on a WordPress site using a plugin (REGEX) to query and replace data within SQL. I can use any Perl compatible regex. (Note from editor – depending on the plugin, this is most likely not actually using Perl but PHP’s implementation of PCRE.)
String replacement can be done along with the matching. Since you’re using PHP, use preg_replace with a pattern that breaks the match into two groups and then inserts ‘.jpg’ between them.
For the WordPress regex plugin, use /(6a00d834.*?)(")/ for the match string, and then use \1.jpg\2 for the replacement string.
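To see what the two-group replacement does outside the plugin, here is a minimal sketch in Python rather than PHP (an assumption made only so the snippet is easy to run; Python's re.sub accepts the same \1 and \2 backreferences in the replacement string). The html string is a shortened copy of the example from the question, and the variable names are illustrative.

```python
import re

# Shortened copy of the HTML from the question (illustrative sample data).
html = (
    '<a href="http://www.learfield.com/oldblog/.a/6a00d8345233fa69e201157155a6fc970c-pi">'
    '<img class="at-xid-6a00d8345233fa69e201157155a6fc970c" '
    'src="http://www.learfield.com/oldblog/.a/6a00d8345233fa69e201157155a6fc970c-800wi" />'
    '</a>'
)

# Group 1: the prefix plus everything up to the nearest quote (lazy match).
# Group 2: the closing quote itself.
pattern = re.compile(r'(6a00d834.*?)(")')

# Re-emit both groups with ".jpg" inserted between them.
fixed = pattern.sub(r'\1.jpg\2', html)
print(fixed)
```

Note that all three targets get the extension, including the class attribute value, because the pattern only looks for the 6a00d834 prefix.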
Wouldn’t this work?
Edit: You said in one of your comments you wanted to replace the " with .jpg"; in that case this regexp would probably work:
However, the best thing to do is probably to use the first regexp I provided, and use a replacement string that looks like this:
Of course, \1 has to be replaced with whatever your particular regexp engine uses for backreferences.
Your question is not entirely clear, but perhaps you mean:
(That is: match 6a00d834 followed by zero or more characters that are not a ", followed by a ".)
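As a quick check of that description, the pattern can be written with a negated character class, [^"]*, which cannot run past the first closing quote. A small Python sketch (the language choice and the sample string, trimmed from the question's HTML, are assumptions for illustration):

```python
import re

# One attribute pair trimmed from the question's HTML (illustrative sample).
sample = (
    'src="http://www.learfield.com/oldblog/.a/6a00d8345233fa69e201157155a6fc970c-800wi" '
    'title="MOvalleyConf500"'
)

# 6a00d834, then any run of characters that are not a double quote, then the quote.
pattern = re.compile(r'6a00d834[^"]*"')

match = pattern.search(sample)
matched_text = match.group(0)
print(matched_text)
```

Because the class excludes the quote character, the match stops at the first " no matter how greedy the * is.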
Alternatively, if it is available in the regex engine you are using, you can use a non-greedy specifier to limit the ‘*’ meta-character. Keep in mind that the answer to any regex question depends on the engine you are using. For example:
Notice that the ‘?’ does not serve as a non-greedy specifier in sed.
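The difference between the greedy and non-greedy forms is easy to see side by side. The sketch below uses Python's re module, which does support the *? lazy quantifier (this is an assumption about tooling, not what the plugin runs); the line variable is a made-up fragment modeled on the question's HTML.

```python
import re

# Made-up single-line fragment modeled on the question's HTML.
line = 'class="at-xid-6a00d8345233fa69e201157155a6fc970c" src="x" title="MOvalleyConf500"'

# Greedy: .* runs to the LAST double quote on the line.
greedy = re.search(r'6a00d834.*"', line).group(0)

# Non-greedy: .*? stops at the FIRST double quote after the prefix.
lazy = re.search(r'6a00d834.*?"', line).group(0)

print(greedy)
print(lazy)
```

In an engine without lazy quantifiers, such as sed, the negated class [^"]* gives the same stopping point as .*? does here.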
I assume that you want to extract everything after 6a00d834 up to the first following ". So try this:
The match of the first grouping will then be the string you are looking for.
Perhaps use a group operator?
Then, depending on your regex API, you can pull out just what is matched in the parens.
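How the parenthesized part is pulled out varies by API. As one hedged illustration, Python's re.findall returns only the group contents when the pattern has a single group; the pattern and sample HTML below are assumptions based on the question, not the exact expression from this answer.

```python
import re

# Shortened copy of the HTML from the question (illustrative sample data).
html = (
    '<a href="http://www.learfield.com/oldblog/.a/6a00d8345233fa69e201157155a6fc970c-pi">'
    '<img class="at-xid-6a00d8345233fa69e201157155a6fc970c" '
    'src="http://www.learfield.com/oldblog/.a/6a00d8345233fa69e201157155a6fc970c-800wi" />'
    '</a>'
)

# The parentheses capture the identifier; the closing quote stays outside the group.
ids = re.findall(r'(6a00d834[^"]*)"', html)

for found in ids:
    print(found)
```

Each entry in ids is the captured text without the trailing quote, which is the part a replacement would need to extend with .jpg.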
Edit
Ah, you want to do string replacement. I’ll guess you’re using Perl. Try this: