Skip to content

Commit

Permalink
Make PHP replacement more efficient
Browse files Browse the repository at this point in the history
This change makes PHP replacement queries and memory usage more
efficient.

It improves the primary key SELECT queries by eliminating the use of
OFFSET because OFFSET requires that the database consider all rows up to
OFFSET before taking rows up to the LIMIT. The new query relies on
primary key conditions to more efficiently eliminate previous rows from
consideration. This way, the database can use an index to identify rows
with keys greater than those of the previous chunk.

It improves memory usage by doing updates along the way rather than
storing all a column's updates in memory until the end. At Automattic,
when we limit search-replace to 4GB of memory, we sometimes exceed that
limit for large sites. It's possible there are other things that
contribute to high memory usage within the search-replace command, but
as a first step, we can reduce memory requirements by no longer keeping
all updated column values in memory simultaneously.
  • Loading branch information
brandonpayton committed May 10, 2023
1 parent 5c54825 commit 56faa0c
Show file tree
Hide file tree
Showing 2 changed files with 67 additions and 16 deletions.
44 changes: 43 additions & 1 deletion features/search-replace.feature
Original file line number Diff line number Diff line change
Expand Up @@ -1149,7 +1149,7 @@ Feature: Do global search/replace
Success:
"""

Scenario: Chunking works without skipping lines
Scenario: Chunking a precise search and replace works without skipping lines
Given a WP install
And a create_sql_file.sh file:
"""
Expand Down Expand Up @@ -1190,3 +1190,45 @@ Feature: Do global search/replace
"""
Success: Made 0 replacements.
"""

Scenario: Chunking a regex search and replace works without skipping lines
Given a WP install
And a create_sql_file.sh file:
"""
#!/bin/bash
echo "CREATE TABLE \`wp_123_test\` (\`key\` INT(5) UNSIGNED NOT NULL AUTO_INCREMENT, \`text\` TEXT, PRIMARY KEY (\`key\`) );" > test_db.sql
echo "INSERT INTO \`wp_123_test\` (\`text\`) VALUES" >> test_db.sql
index=1
while [[ $index -le 199 ]];
do
echo "('abc'),('abc'),('abc'),('abc'),('abc'),('abc'),('abc'),('abc'),('abc'),('abc')," >> test_db.sql
index=`expr $index + 1`
done
echo "('abc'),('abc'),('abc'),('abc'),('abc'),('abc'),('abc'),('abc'),('abc'),('abc');" >> test_db.sql
"""
And I run `bash create_sql_file.sh`
And I run `wp db query "SOURCE test_db.sql;"`

When I run `wp search-replace --dry-run 'abc' 'def' --all-tables-with-prefix --skip-columns=guid,domain --regex`
Then STDOUT should contain:
"""
Success: 2000 replacements to be made.
"""

When I run `wp search-replace 'abc' 'def' --all-tables-with-prefix --skip-columns=guid,domain --regex`
Then STDOUT should contain:
"""
Success: Made 2000 replacements.
"""

When I run `wp search-replace --dry-run 'abc' 'def' --all-tables-with-prefix --skip-columns=guid,domain --regex`
Then STDOUT should contain:
"""
Success: 0 replacements to be made.
"""

When I run `wp search-replace 'abc' 'def' --all-tables-with-prefix --skip-columns=guid,domain --regex`
Then STDOUT should contain:
"""
Success: Made 0 replacements.
"""
39 changes: 24 additions & 15 deletions src/Search_Replace_Command.php
Original file line number Diff line number Diff line change
Expand Up @@ -537,9 +537,16 @@ private function php_handle_col( $col, $primary_keys, $table, $old, $new ) {
$count = 0;
$replacer = new SearchReplacer( $old, $new, $this->recurse_objects, $this->regex, $this->regex_flags, $this->regex_delimiter, null !== $this->log_handle, $this->regex_limit );

$table_sql = self::esc_sql_ident( $table );
$col_sql = self::esc_sql_ident( $col );
$where = $this->regex ? '' : " WHERE $col_sql" . $wpdb->prepare( ' LIKE BINARY %s', '%' . self::esc_like( $old ) . '%' );
$table_sql = self::esc_sql_ident( $table );
$col_sql = self::esc_sql_ident( $col );

$base_key_condition = '';
$where_key = '';
if ( ! $this->regex ) {
$base_key_condition = "$col_sql" . $wpdb->prepare( ' LIKE BINARY %s', '%' . self::esc_like( $old ) . '%' );
$where_key = "WHERE $base_key_condition";
}

$escaped_primary_keys = self::esc_sql_ident( $primary_keys );
$primary_keys_sql = implode( ',', $escaped_primary_keys );
$order_by_keys = array_map(
Expand All @@ -550,17 +557,12 @@ static function ( $key ) {
);
$order_by_sql = 'ORDER BY ' . implode( ',', $order_by_keys );
$limit = 1000;
$offset = 0;

// Updates have to be deferred to after the chunking is completed, as
// the offset will otherwise not work correctly.
$updates = [];

// 2 errors:
// - WordPress.DB.PreparedSQL.InterpolatedNotPrepared -- escaped through self::esc_sql_ident
// - WordPress.CodeAnalysis.AssignmentInCondition -- no reason to do copy-paste for a single valid assignment in while
// phpcs:ignore
while ( $rows = $wpdb->get_results( "SELECT {$primary_keys_sql} FROM {$table_sql} {$where} {$order_by_sql} LIMIT {$limit} OFFSET {$offset}" ) ) {
while ( $rows = $wpdb->get_results( "SELECT {$primary_keys_sql} FROM {$table_sql} {$where_key} {$order_by_sql} LIMIT {$limit}" ) ) {
foreach ( $rows as $keys ) {
$where_sql = '';
foreach ( (array) $keys as $k => $v ) {
Expand Down Expand Up @@ -595,15 +597,22 @@ static function ( $key ) {
$update_where[ $k ] = $v;
}

$updates[] = [ $table, array( $col => $value ), $update_where ];
$wpdb->update( $table, [ $col => $value ], $update_where );
}
}

$offset += $limit;
}

foreach ( $updates as $update ) {
$wpdb->update( ...$update );
// Because we are ordering by primary keys from least to greatest,
// we can exclude previous chunks from consideration by adding greater-than conditions
// to insist the next chunk's keys must be greater than the last of this chunk's keys.
$last_keys = end( $rows );
$where_key_conditions = array();
if ( $base_key_condition ) {
$where_key_conditions[] = $base_key_condition;
}
foreach ( (array) $last_keys as $k => $v ) {
$where_key_conditions[] = self::esc_sql_ident( $k ) . ' > ' . self::esc_sql_value( $v );
}
$where_key = 'WHERE ' . implode( 'AND', $where_key_conditions );
}

if ( $this->verbose && 'table' === $this->format ) {
Expand Down

0 comments on commit 56faa0c

Please sign in to comment.