Separating URLs from a string?

I have a textbox in the user profile area where users enter RSS URLs, one per line (separated by a return). The text is then saved to the database. How do I get all the URLs into an array?

I know it is a very basic question. I searched the web, found lots of things, and got confused, so I think a little discussion will help.

I also want to do some validation: if a user doesn’t put http:// before a URL, the code should add it, and URLs with protocols other than http:// should be ignored.

My CMS is WordPress, so if there are any built-in functions that might help, let me know.

3 comments

  1. After using the explode function, you’ll want a foreach loop to do the validation you’re looking for:

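    // Browsers usually submit textarea line breaks as "\r\n", so split on "\r"
    // and trim each piece to drop the leftover "\n".
    $urlArray = explode("\r", $_POST["textBox"]);
    if(!empty($urlArray))
    {
        foreach($urlArray as $url)
        {
             $url = trim($url);
             //Do your regex checking here
        }
    }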
    

    If RegEx is too complicated for your skill level (though I’d really recommend learning it), you could use substr to check the start of each line instead, though this is a much less powerful approach. An example would be:

    $htmlString = substr($url, 0, 7);
    if($htmlString != "http://")
    {
      $url = "http://" . $url;
    }
    
  2. Use PHP’s explode function:

    $urlArray = explode("\r", $_POST["URlsInputName"]);
    

    This will return an array of the URLs, assuming $_POST["URlsInputName"] holds the value of your input box, with the URLs separated by carriage returns.

  3. A basic block follows:

    $text_box_text = <<<TEXT_BOX
            http://example.com
            hello.example
            ftp://ftp.invalid
            HTTP:/fixslash.invalid
    TEXT_BOX
    ;
    
    function url_map ( $likely )
    {
        $likely = preg_replace( '/^\s+/', '', $likely );   #clean up white spaces
        $likely = preg_replace( '/\s+$/', '', $likely );   #
        $likely = preg_match( '/^\w+:/', $likely ) ? $likely : "http://$likely";  #prepend
        $likely = preg_replace( '|^http:/*|i', 'http://', $likely );   #fix /s and case
        return $likely;
    }
    
    $likely_urls = preg_split( '/[\r\n]+/', $text_box_text );
    $good_urls = preg_grep( '/^http:/', array_map( "url_map",  $likely_urls ) );
    

    As this is user input you need to be a bit more paranoid. More error checks are always in order:

    preg_grep( '|^http://[-\w][-.\w]+/|', ... )   #assure valid host name

    and more.
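
Pulling the three comments together, a minimal sketch might look like the following. The function name normalize_feed_urls() is hypothetical, and the host check with parse_url() is an added assumption rather than something from the thread. It splits on any line ending, trims each line, prepends http:// when no scheme is present, and drops anything that is not an http:// URL with a host.

```php
<?php
// Hypothetical helper (name not from the thread): turn raw textarea text
// into an array of cleaned-up http:// URLs.
function normalize_feed_urls( $text )
{
    // Split on any run of CR/LF so "\r\n", "\n" and "\r" all work;
    // PREG_SPLIT_NO_EMPTY drops empty pieces.
    $lines = preg_split( '/[\r\n]+/', $text, -1, PREG_SPLIT_NO_EMPTY );

    $urls = array();
    foreach ( $lines as $line ) {
        $url = trim( $line );
        if ( $url === '' ) {
            continue;   // line contained only whitespace
        }
        // No scheme at all: assume the user meant http://
        if ( ! preg_match( '/^\w+:/', $url ) ) {
            $url = 'http://' . $url;
        }
        // Normalise the case and the number of slashes after "http:"
        $url = preg_replace( '|^http:/*|i', 'http://', $url );

        // Keep only http:// URLs for which parse_url() can find a host
        $host = parse_url( $url, PHP_URL_HOST );
        if ( strpos( $url, 'http://' ) === 0 && is_string( $host ) && $host !== '' ) {
            $urls[] = $url;
        }
    }
    return $urls;
}

$good_urls = normalize_feed_urls( "  http://example.com\r\nhello.example\r\nftp://ftp.invalid\r\nHTTP:/fixslash.invalid" );
// $good_urls is now:
// array( 'http://example.com', 'http://hello.example', 'http://fixslash.invalid' )
```

This only checks the shape of each URL, not whether the feed exists. Note that a line such as localhost:8080 is treated as already having a scheme and is therefore dropped, so tighten the scheme test if your users are likely to enter ports without http://.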