-
Notifications
You must be signed in to change notification settings - Fork 822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH spread MySQL ORM bind parameters #11434
Conversation
Because the rest of the "bit of hackery" is now unnecessary The ORM rewrite supporting parameterised queries was merged in July 2014 d8e9af8 and relased in October 2015. https://github.com/silverstripe/silverstripe-framework/releases/tag/3.2.0 The official PHP support at the time was 5.3.2+ https://web.archive.org/web/20140616033159/http://www.silverstripe.org/system-requirements Which excluded the spread operator introduce in PHP 5.6 from being used. https://www.php.net/manual/en/migration56.new-features.php#migration56.new-features.splat Ten years later none of this is relevant anymore, so we can save some useless data shuffling from every parameterised query by just directly using the spread operator.
In the PR description you've said However the spread operator was already being used? There's an inline comment that says the hackery is required So seems like this PR is removing something that's required |
Then why did all the tests pass? I've adjusted the PR to make use of markdown, hopefully it can be easier to understand. Please reconsider, @emteknetnz |
Unit tests won't cover every edge case. There's a note on the PHP docs saying Does this matter? I don't have certainty about that. This PR may be probably works fine, though it time it's introducing a risk of regression for seemingly no real benefit. |
I don't know why you're brining this up. The code does not use This PR removes unnecessary code from a low level querying class. If it broke querying (e.g. by triggering a fatal error), I expect nearly every test that uses the database to fail as parameterised queries are preferred by the ORM. |
I'm honestly not sure what the consequences of changing this are. I've done some local testing and this PR does change the behaviour of the method with edge case use. It's outside the normal use of this method however it's public API so a project somewhere may be doing something like this and I don't want to accidentally break someones site // app/_config.php
use SilverStripe\ORM\DB;
use SilverStripe\ORM\Connect\MySQLDatabase;
use SilverStripe\ORM\Connect\MySQLiConnector;
/**@var MySQLDatabase $database */
$database = DB::get_conn();
/** @var MySQLiConnector $connector */
$connector = $database->getConnector();
/** @var mysqli_stmt $statement */
$statement = $connector->prepareStatement('SELECT * FROM SiteTree WHERE ID = ?', $success);
$id = 1;
$connector->bindParameters($statement, ['i', &$id]);
$id = 2;
$statement->execute();
var_dump($statement->get_result()->fetch_assoc()['Title']);
die;
// without PR = string(4) "Home"
// with PR = string(8) "About Us" While it's a low-level, there would only be a very small performance benefit from merging this PR, so I'd prefer to leave this as is as presumably when this code was initially written there was some level of understanding and thought put into respecting the variable references |
None.
When this code was written it used |
I updated the PR message again. Is it clearer now? |
It does, that makes a bit more sense, though there's no functional difference between I don't want to remove code that's doing what's it's supposed to do for so little performance benefit, for instance assuming 1,000 parameterized queries are run on a web request // app/_config.php
$start = microtime(true);
$parameters = ['abc', 'def', 'xyz'];
for ($j = 0; $j < 1000; $j++) {
$boundNames = [];
$parametersCount = count($parameters ?? []);
for ($i = 0; $i < $parametersCount; $i++) {
$boundName = "param$i";
$$boundName = $parameters[$i];
$boundNames[] = &$$boundName;
}
}
$end = microtime(true);
$diff = $end - $start;
var_dump($diff);
die;
// float(0.00021719932556152344) // or 0.2 milliseconds |
With the test code above the author is explicitly manipulating the value before excecution, otherwise there is no reason to pass by reference. I would argue that it should change the output, and file it as a bug that it doesn't. And Silverstripe runs on average 50 million* queries per request. Change the target branch to |
So this is just a hard "no" then? |
I don't think it's worth targeting at 6. While this PR is likely safe, the potential measured upside is minuscule. Websites making ~1K database queries per page view is probably considered quite bad? I can't imagine it being something like 100K queries which would bump the time saving up to 20ms which would be noticeable. If that were the case then the webpage would probably be taking about 30 seconds to render at that point. I don't think we should be removing code that appears to be doing something which presumably whoever wrote it understands its purpose better than I do, unless there's a good reason to. |
🤷♂️ OK. |
I don't understand why this can't be pointed at 6 - feels like there's enough time to battle-test it before it gets released. The code comment literally describes this as hackery; it feels like the risk here is being vastly over-inflated. To your other point, there are plenty of unoptimised websites in the world, on plenty of cheaply-rented servers. There are dev environments, there are setups so complex we couldn't even comprehend. Surely any improvement, at such a low risk, should be embraced here, rather than rebuffed? I'm all for safety and caution, but this feels like overkill, and has the potential to discourage open-source contributions, when something so seemingly innocuous - with a considerable lead time - is immediately closed. |
The code comment is
The code comment is saying we NEED to do the hackery because the arguments MUST be passed a certain way. Look it's obviously ugly looking code, though it's apparently doing something so I'd rather just leave it there. This PR isn't fixing a bug and there's no real performance benefit, so it seems like purpose of this PR was just to remove some ugly looking code. |
The comment, and code, is from 10 years ago. The "need" has the context of the time, the "hackery" is still true 😄 I'd note that @NightJar has done all the hard work in figuring out the impact and implications of this change already. Is "cleaning up code and a minor performance improvement" not a high enough bar for contribution? I'm not sure what your argument is there. Would this be accepted if there was a test case that alleviated your fear that it is still "doing something"? |
That's a good reason to contribute. In this case though I'm really not sure about that what's been submitted, it seems like it's possibly regressive. As mentioned in my testing above, the behaviour of the method has changed as a result of this PR, meaning the "redundant code" that's been removed isn't redundant, it's actually doing something which it's supposed to do (at least it intends to do something it's supposed to). I did validate in my testing that it didn't matter if call_user_func_array() or if the spread operator was used, so the hackery appears to still be doing what it originally intended to do 10 years ago. So I can't simply say this code is redundant. Now, if instead that hackery has been doing the WRONG thing this whole time, then this PR is fixing a bug and should be merged into |
Removing the rest of the "bit of hackery" as it is now unnecessary.
The ORM rewrite supporting parameterised queries was merged in July 2014 and released in October 2015. The official PHP support at the time was 5.3.2+ which excluded the spread operator from being used (introduced in PHP 5.6), meaning all this variable shuffling had to happen due to the PHP docs note about using
call_user_func_array
as this code did (but has since been made reudnant by use of the spread operator in 437e53f).Ten years later none of this is relevant anymore, so we can save some useless data shuffling from every parameterised query by just directly using the spread operator.