SED Match/Replace URL and Update Serialized Array Count

Below is an example snippet from a sql dump file. This specific row contains a meta_value of a WordPress PHP serialized array. During database restores in dev., test., and qc. environments I’m using sed to replace URLs with the respective environment sub-domain.

INSERT INTO `wp_postmeta`
    (`meta_id`,
     `post_id`,
     `meta_key`,
     `meta_value`)
VALUES
    (527,
     1951,
     'ut_parallax_image',
     'a:4:{
          s:17:"background-image";
          s:33:"http://example.com/background.jpg";
          s:23:"mobile-background-image";
          s:37:"www.example.com/mobile-background.jpg";
      }')
;

However, I need to extend this to correct the string length in the serialized arrays after replace.

Read More
sed -r -e "s/://(www.)?${domain}/://1${1}.${domain}/g" "/vagrant/repositories/apache/$domain/_sql/$(basename "$file")" > "/vagrant/repositories/apache/$domain/_sql/$1.$(basename "$file")"

The result should look like this for dev.:

INSERT INTO `wp_postmeta`
    (`meta_id`,
     `post_id`,
     `meta_key`,
     `meta_value`)
VALUES
    (527,
     1951,
     'ut_parallax_image',
     'a:4:{
          s:17:"background-image";
          s:37:"http://dev.example.com/background.jpg";
          s:23:"mobile-background-image";
          s:41:"www.dev.example.com/mobile-background.jpg";
      }')
;

I’d prefer to not introduce any dependancies other than sed.

Related posts

Leave a Reply

5 comments

  1. Thanks @John1024. @Fabio and @Seth, I not sure for perfomance, but these code work and without wp-cli:

    localdomain=mylittlewordpress.local
    maindomain=strongwordpress.site.ru
    cat dump.sql | sed 's/;s:/;ns:/g' | awk -F'"' '/s:.+'$maindomain'/ {sub("'$maindomain'", "'$localdomain'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' | sed ':a;N;$!ba;s/;ns:/;s:/g' | sed "s/$maindomain/$localdomain/g" | mysql -u$USER -p$PASS $DBNAME
    

    PHP serialized string exploded by ‘;s:’ to multiline string and awk processed all lines by @John1024 solution.

    cat dump.sql | sed 's/;s:/;ns:/g'
    

    Redirect output to awk

    awk -F'"' '/^s:.+'$maindomain'/ {sub("'$maindomain'", "'$localdomain'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1'
    

    After all lines processed, multiline implode to one line (as then exists in original dump.sql). Thanks @Zsolt https://stackoverflow.com/a/1252191

    sed ':a;N;$!ba;s/;ns:/;s:/g'
    

    Addition sed replacement need for any other strings in wordpress database.

    sed "s/$maindomain/$localdomain/g"
    

    And load into main server DB

    ... | mysql -u$USER -p$PASS $DBNAME
    
  2. Your algorithm involves arithmetic. That makes sed a poor choice. Consider awk instead.

    Consider this input file:

    $ cat inputfile
      something...
      s:33:"http://example.com/background.jpg";
      s:37:"www.example.com/mobile-background.jpg";
      s:33:"http://www.example.com/background.jpg";
      more lines...
    

    I believe that this does what you want:

    $ awk -F'"' '/://(www[.])?example.com/ {sub("example.com", "dev.example.com"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' inputfile 
      something...
      s:37:"http://dev.example.com/background.jpg";
      s:37:"www.example.com/mobile-background.jpg";
      s:41:"http://www.dev.example.com/background.jpg";
      more lines...
    
  3. Here is a sample text file you asked for (it’s a database export).

    Original (https://www.example.com) :

    LOCK TABLES `wp_options` WRITE;
    INSERT INTO `wp_options` VALUES (1,'siteurl','https://www.example.com','yes'),(18508,'optionsframework','a:48:{s:4:"logo";s:75:"https://www.example.com/wp-content/uploads/2014/04/logo_imbrique_small3.png";s:7:"favicon";s:62:"https://www.example.com/wp-content/uploads/2017/04/favicon.ico";}','yes')
    /*!40000 ALTER TABLE `wp_options` ENABLE KEYS */;
    UNLOCK TABLES;
    

    Result needed (http://example.localhost) :

    LOCK TABLES `wp_options` WRITE;
    INSERT INTO `wp_options` VALUES (1,'siteurl','http://example.localhost','yes'),(18508,'optionsframework','a:48:{s:4:"logo";s:76:"http://example.localhost/wp-content/uploads/2014/04/logo_imbrique_small3.png";s:7:"favicon";s:64:"https://example.localhost/wp-content/uploads/2017/04/favicon.ico";}','yes');
    /*!40000 ALTER TABLE `wp_options` ENABLE KEYS */;
    UNLOCK TABLES;
    

    As you can see :

    • there is multiple occurence on the same line
    • escape characters aren’t counted in length number (eg: “/”)
    • some occurence aren’t preceded by “s:” length number (no need to replace, it can be done after awk with a simple sed)

    Thanks in advance !

  4. @Alexander Demidov’s answer is great, here’s our implementation for reference

    public static function replaceInFile(string $replace, string $replacement, string $absoluteFilePath): void
    {
    
        ColorCode::colorCode("Attempting to replace ::n($replace)nwith replacement ::n($replacement)n in file ::n(file://$absoluteFilePath)", iColorCode::BACKGROUND_MAGENTA);
    
        $replaceDelimited = preg_quote($replace, '/');
    
        $replacementDelimited = preg_quote($replacement, '/');
    
        $replaceExecutable = CarbonPHP::CARBON_ROOT . 'extras/replaceInFileSerializeSafe.sh';
    
            // @link https://stackoverflow.com/questions/29902647/sed-match-replace-url-and-update-serialized-array-count
        $replaceBashCmd = "chmod +x $replaceExecutable && $replaceExecutable '$absoluteFilePath' '$replaceDelimited' '$replace' '$replacementDelimited' '$replacement'";
    
        Background::executeAndCheckStatus($replaceBashCmd);
    
    }
    
    public static function executeAndCheckStatus(string $command, bool $exitOnFailure = true): int
    {
    
        $output = [];
    
        $return_var = null;
    
        ColorCode::colorCode('Running CMD >> ' . $command,
            iColorCode::BACKGROUND_BLUE);
    
        exec($command, $output, $return_var);
    
        if ($return_var !== 0 && $return_var !== '0') {
    
            ColorCode::colorCode("The command >>  $command nt returned with a status code (" . $return_var . '). Expecting 0 for success.', iColorCode::RED);
    
            $output = implode(PHP_EOL, $output);
    
            ColorCode::colorCode("Command output::t $output ", iColorCode::RED);
    
            if ($exitOnFailure) {
    
                exit($return_var);
    
            }
    
        }
    
        return (int) $return_var;
    
    }
        
    
    #!/usr/bin/env bash
    
    set -e
    
    SQL_FILE="$1"
    
    replaceDelimited="$2"
    
    replace="$3"
    
    replacementDelimited="$4"
    
    replacement="$5"
    
    if ! grep --quiet "$replace" "$SQL_FILE" ;
    then
    
      exit 0;
    
    fi
    
    cp "$SQL_FILE" "$SQL_FILE.old.sql"
    
    # @link https://stackoverflow.com/questions/29902647/sed-match-replace-url-and-update-serialized-array-count
    # @link https://serverfault.com/questions/1114188/php-serialize-awk-command-speed-up/1114191#1114191
    sed 's/;s:/;ns:/g' "$SQL_FILE" | 
      awk -F'"' '/s:.+'$replaceDelimited'/ {sub("'$replace'", "'$replacement'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' 2>/dev/null  | 
      sed -e ':a' -e 'N' -e '$!ba' -e 's/;ns:/;s:/g' | 
      sed "s/$replaceDelimited/$replacementDelimited/g" > "$SQL_FILE.replaced.sql"
    
    cp "$SQL_FILE.replaced.sql" "$SQL_FILE"