Problem with Regular Expression URL matching for http://localhost/

I am trying to use this project on GitHub, https://github.com/ErisDS/Migrate, to migrate the URL settings in my WordPress database from a localhost dev install to a live URL.

At the moment the code throws an error for the URL to be replaced, “http://localhost/mysitename”, but it does accept the new URL, “http://www.mywebsitename.com”.

From what I can tell, the error comes from this regular expression not accepting localhost as a valid URL. Any ideas how I can update it to accept the localhost URL?

The full code can be viewed on GitHub.

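function checkURL($url)
{
    // Expected shape: http://host.tld[/path...][/file.ext[?key=value]][&key=value...]
    $url_regex = '/^(http:\/\/[a-zA-Z0-9_-]+(?:\.[a-zA-Z0-9_-]+)*\.[a-zA-Z]{2,4}(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+=[a-zA-Z0-9_]+)?)?(?:&[a-zA-Z0-9_]+=[a-zA-Z0-9_]+)*)$/';
    // A bare "http://" is rejected before the pattern is tried.
    if ($url == 'http://') {
        return false;
    }
    return preg_match($url_regex, $url);
}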

3 comments

  1. You can add “localhost” to the acceptable hostnames by changing it to this:

    /^(http:\/\/(?:[a-zA-Z0-9_-]+(?:\.[a-zA-Z0-9_-]+)*\.[a-zA-Z]{2,4}|localhost)(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+=[a-zA-Z0-9_]+)?)?(?:&[a-zA-Z0-9_]+=[a-zA-Z0-9_]+)*)$/
    

    This part matches the http:// prefix:

    http:\/\/
    

    And this part matches the hostname:

    [a-zA-Z0-9_-]+(?:\.[a-zA-Z0-9_-]+)*\.[a-zA-Z]{2,4}
    

    So you can just change the hostname checker to a non-capturing alternation group that explicitly includes “localhost”:

    (?:X|localhost)
    

    where X is the existing hostname-matching sub-expression. The (?: bit starts a non-capturing group; using one ensures that any existing group number references won’t get messed up.

    And some live examples: http://ideone.com/M0qqh
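
    The alternation can be sketched as a drop-in variant of the question’s function. This is only a sketch: the name checkURLWithLocalhost is made up for illustration, and the pattern assumes the backslash escapes shown above.

    ```php
    <?php
    // Hypothetical variant of checkURL(); only the hostname part of the pattern changes.
    function checkURLWithLocalhost($url)
    {
        // (?:X|localhost): X is the original "name.tld" hostname sub-expression.
        $url_regex = '/^(http:\/\/(?:[a-zA-Z0-9_-]+(?:\.[a-zA-Z0-9_-]+)*\.[a-zA-Z]{2,4}|localhost)(?:\/[a-zA-Z0-9_]+)*(?:\/[a-zA-Z0-9_]+\.[a-zA-Z]{2,4}(?:\?[a-zA-Z0-9_]+=[a-zA-Z0-9_]+)?)?(?:&[a-zA-Z0-9_]+=[a-zA-Z0-9_]+)*)$/';
        if ($url == 'http://') {
            return false;
        }
        // preg_match() returns 1 on a match and 0 when nothing matches.
        return preg_match($url_regex, $url);
    }

    var_dump(checkURLWithLocalhost('http://localhost/mysitename'));  // int(1)
    var_dump(checkURLWithLocalhost('http://www.mywebsitename.com')); // int(1)
    var_dump(checkURLWithLocalhost('http://localhost/'));            // int(0)
    ```

    Note that a trailing slash is still rejected, because every path segment in the pattern needs at least one character after the slash.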

    I think a simpler regex might serve you better though, since that one doesn’t deal with CGI parameters very well. You could try this:

    /(http:\/\/[^\/]+\/([^\s]+[^,.?!:;])?)/
    

    and see if it works for your data. That one is pretty loose but probably good enough for a one-shot conversion. That should properly match the URLs in these:

    'here is a URL http://localhost/x?a=b.'
    'More http://example.com nonsense!.'
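
    A quick way to check the loose pattern against your own strings is to run it through preg_match_all. This is a sketch only: the helper name extractUrls and the third sample string are made up for illustration, and the pattern is assumed to carry the usual backslash escapes.

    ```php
    <?php
    // Loose pattern: "http://", a host (anything up to the next slash), a slash,
    // then an optional path that may not end in sentence punctuation.
    $loose = '/(http:\/\/[^\/]+\/([^\s]+[^,.?!:;])?)/';

    // Hypothetical helper: returns every URL the pattern captures in $text.
    function extractUrls(string $pattern, string $text): array
    {
        preg_match_all($pattern, $text, $matches);
        return $matches[1]; // contents of capture group 1, one entry per match
    }

    $samples = [
        'here is a URL http://localhost/x?a=b.',
        'More http://example.com nonsense!.',
        'see http://localhost/mysitename/wp-content for uploads',
    ];

    foreach ($samples as $text) {
        print_r(extractUrls($loose, $text));
    }
    ```

    With the pattern written this way, the first sample yields http://localhost/x?a=b without the trailing period. The second sample has no slash after the host, so it yields no match at all; that may not matter for a migration where every stored URL has a path, but it is worth checking against your own data.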
    

    You could also try Joseph’s suggestion from the comments.