Page MenuHomePhabricator

Create a script to update afl_var_dump, drop back-compat code
Closed, ResolvedPublic

Description

AbuseFilter let people test rules against old filter hits, but this has a cost: while checking filters for an ongoing action, we save variables in the DB. Nowadays we save an array with native data types (PHP's ones), but in the past we used to serialize a whole AbuseFilterVariableHolder object, which in turn included AFPData objects and AFComputedVariable object. The latter could include even bigger stuff, for instance a whole Article object! This was a poor choice (and led to, for instance, T187153) and the behaviour has been changed in 2013, but old entries still hold serialized classes. This is why we need a maintenance script to clean all those entries and convert them to the current format.

Since we need a script to change old dumps, we should seize the opportunity to do several improvements at the same time. Including the above, we have:

  1. Check the afl_var_dump table, and if the column contains an AbuseFilterVariableHolder (it did before rEABFf94f42b506714023570bf26cf42775287f82b515) move it to the text table (via storeVarDump)
    1. Remove the first if inside AbuseFilter::loadVarDump (back-compat code)
  2. Check the text table, and if the serialized value is an AbuseFilterVariableHolder re-store it using storeVarDump. For every row, remove the 'nativeDataArray' flag and add 'utf-8' (T34478)
    1. Remove other back-compat code from loadVarDump and storeVarDump (everything related to 'nativeDataArray': the flag shouldn't be applied anymore, and all data will be native)
  3. Since all dumps will be in the text table, change the afl_var_dump column to only hold the text id, and also perform a schema change to make it hold a bigint (or whatever the long integer data type is called)
    1. Stop adding the "stored-text" prefix
  4. Remove meta-variables ('context', 'logged_local_ids' and 'logged_global_ids') from code (patch1, patch2)
    1. Remove them from logs using the script
  5. Rename old variables to use the new names (T204236) inside all old logs
    1. Remove AbuseFilterVariableHolder::mVarsVersion and its uses
  6. Use JSON serialization instead of PHP's serialize per T161647
  7. Fix empty afl_var_dump (T110854)
  8. Ensure that variable names are saved lowercase (goes with this patch, although we already lcase the names and I don't know whether we saved uppercase names in the past)
  9. Handle corrupted/non-unserializable rows (T214193)

And maybe there's even more: I think that reasonably any other change related to stored var dumps should be included in this list.

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change 482515 abandoned by Daimona Eaytoy:
Store only the insert ID in afl_var_dump

Reason:
Done in I3242acd5c5163a941f584d6119e3ad3b3cad8c29

https://gerrit.wikimedia.org/r/482515

Change 482518 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Drop back-compat code

https://gerrit.wikimedia.org/r/482518

Change 482499 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] [WIP] Add a maintenance script to clean afl_var_dump

https://gerrit.wikimedia.org/r/482499

Change 482500 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Add updateVarDumps to update.php and handle new code

https://gerrit.wikimedia.org/r/482500

Change 482513 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Change afl_var_dump to be a foreign key to text.old_id

https://gerrit.wikimedia.org/r/482513

Change 482518 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Drop back-compat code

https://gerrit.wikimedia.org/r/482518

Change 482513 abandoned by Daimona Eaytoy:
Change afl_var_dump to be a foreign key to text.old_id

Reason:
We've changed the format to tt:, following what MCR is doing.

https://gerrit.wikimedia.org/r/482513

daniel triaged this task as Medium priority.Sep 30 2019, 4:14 PM

Change 551603 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Add updateVarDumps to update.php

https://gerrit.wikimedia.org/r/551603

Change 482500 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Start using new format for var dumps

https://gerrit.wikimedia.org/r/482500

Change 482499 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Add a maintenance script to clean afl_var_dump

https://gerrit.wikimedia.org/r/482499

Change 576405 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Minor fixes for the updateVarDumps script

https://gerrit.wikimedia.org/r/576405

Change 576405 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Minor fixes for the updateVarDumps script

https://gerrit.wikimedia.org/r/576405

Change 576421 had a related patch set uploaded (by Jforrester; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@wmf/1.35.0-wmf.22] Minor fixes for the updateVarDumps script

https://gerrit.wikimedia.org/r/576421

Change 576421 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@wmf/1.35.0-wmf.22] Minor fixes for the updateVarDumps script

https://gerrit.wikimedia.org/r/576421

Mentioned in SAL (#wikimedia-operations) [2020-03-03T19:38:47Z] <jforrester@deploy1001> Synchronized php-1.35.0-wmf.22/extensions/AbuseFilter: T213006 T246539: Minor fixes for the updateVarDumps script (duration: 01m 05s)

@Daimona are you waiting on code review from CPT? Moving this to completed from our side for now but feel free to move it back to needed if I'm wrong.

@Daimona are you waiting on code review from CPT? Moving this to completed from our side for now but feel free to move it back to needed if I'm wrong.

Currently, no. I'll likely ask for review for https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/482518/ and related patches, but that should happen at least in the next MW version. Can I poke you when the time comes?

Change 577244 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] updateVarDumps: Use new ExternalStore method for updating records

https://gerrit.wikimedia.org/r/577244

@Daimona Yes please do, if you can retag it with Platform Engineering it will be moved back into our queue and get triaged.

Change 577244 abandoned by Daimona Eaytoy:
updateVarDumps: Use new ExternalStore method for updating records

Reason:
Per core patch

https://gerrit.wikimedia.org/r/577244

Change 579274 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] updateVarDumps: Print orphaned ES records, don't try updating ES records

https://gerrit.wikimedia.org/r/579274

Change 579274 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] updateVarDumps: Print orphaned ES records, don't try updating ES records

https://gerrit.wikimedia.org/r/579274

Change 593574 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] updateVarDumps: move MediaWikiServices away from constructor

https://gerrit.wikimedia.org/r/593574

Change 593574 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] updateVarDumps: move MediaWikiServices away from constructor

https://gerrit.wikimedia.org/r/593574

Change 622211 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@REL1_35] Add updateVarDumps to update.php

https://gerrit.wikimedia.org/r/622211

Change 551603 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Add updateVarDumps to update.php

https://gerrit.wikimedia.org/r/551603

Change 622211 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@REL1_35] Add updateVarDumps to update.php

https://gerrit.wikimedia.org/r/622211

Daimona set Due Date to Dec 1 2020, 10:59 PM.Sep 27 2020, 8:51 AM

Setting due date to reflect the fact that this should be resolved before the 1.36 release, so we can remove the BC code there. Should this fail, I'd still like to remove the BC code there, but keep it on master until T246539 is done.

So, this "only" took two years. Success! Resolved?

Daimona removed a project: Patch-For-Review.

So, this "only" took two years. Success! Resolved?

Yes! Finally! (The remaining patch is in gate-and-submit)

Change 482518 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Drop back-compat code

https://gerrit.wikimedia.org/r/482518