Implement database storage for Welcome Survey
Open, Low, Public

Description

The WelcomeSurvey feature in GrowthExperiments currently stores responses in the user_properties table. This was never intended to be a long-term storage option, and now that the feature is deployed on most Wikipedias, it's more urgent to set up storage that can scale alongside feature usage.

The proposal is:

  • create a database table, growthexperiments_welcome_survey_responses, with columns for ID, user ID, timestamp of submission, group name, and a survey_submission JSON blob
  • implement a staged migration: (1) write to new storage, read from old; (2) write to new storage, read from new; (3) read/write new storage only and drop the old values from user_properties
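The staged migration in the second bullet could be sketched roughly as follows. This is only an illustration, not the extension's code: `Stage`, `SurveyStore`, and the in-memory dicts standing in for user_properties and the new table are all hypothetical names. It also assumes that the first stage writes to both stores so that reads from the old store stay current, which the proposal does not spell out.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical names for the three migration stages."""
    WRITE_NEW_READ_OLD = 1
    WRITE_NEW_READ_NEW = 2
    NEW_ONLY = 3


class SurveyStore:
    """Routes reads and writes according to the current migration stage."""

    def __init__(self, stage):
        self.stage = stage
        self.old = {}  # stand-in for user_properties: user_id -> latest response
        self.new = {}  # stand-in for the new table: user_id -> list of responses

    def save(self, user_id, submission):
        # Every stage writes to the new store.
        self.new.setdefault(user_id, []).append(submission)
        # Assumption: stage 1 also keeps the old store up to date,
        # because reads still come from it.
        if self.stage is Stage.WRITE_NEW_READ_OLD:
            self.old[user_id] = submission

    def load(self, user_id):
        if self.stage is Stage.WRITE_NEW_READ_OLD:
            return self.old.get(user_id)
        rows = self.new.get(user_id)
        return rows[-1] if rows else None

    def drop_old(self):
        # Only safe once nothing reads from the old store any more.
        if self.stage is not Stage.NEW_ONLY:
            raise RuntimeError("old values are still in use at this stage")
        self.old.clear()
```

In a real deployment the stage would presumably be a per-wiki configuration value, so each step can be rolled out and rolled back independently.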

Example submission:

{
  "reason": "placeholder",
  "edited": "placeholder",
  "languages": [
    "en"
  ],
  "mailinglist": true,
  "_group": "exp2_target_specialpage",
  "_render_date": "20220315204755",
  "_submit_date": "20220315204822",
  "_counter": 2
}

Note that the submit date, group, and counter can be removed from the JSON blob, since submit date and group will have dedicated columns in the table and the counter is represented by multiple rows per user.
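As a rough sketch of that split, the helper below separates a submission like the example above into column values and a reduced JSON blob. The function name `split_submission` and the column keys `submit_date` and `group` are hypothetical; only the underscore-prefixed field names come from the example submission.

```python
import json

# Fields that would no longer live in the blob: two move to dedicated
# columns, and _counter is implied by the number of rows per user.
COLUMN_KEYS = ("_submit_date", "_group", "_counter")


def split_submission(submission):
    """Return (column values, JSON blob without the column fields)."""
    columns = {
        "submit_date": submission.get("_submit_date"),
        "group": submission.get("_group"),
    }
    remaining = {k: v for k, v in submission.items() if k not in COLUMN_KEYS}
    # Compact separators keep the stored blob small.
    blob = json.dumps(remaining, separators=(",", ":"))
    return columns, blob
```

Note that `_render_date` stays in the blob, matching the example row in the table.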

So the table might look like:

| ID | user ID | submit date | group | values |
| --- | --- | --- | --- | --- |
| 1 | 1 | 20220315204822 | exp2_target_specialpage | {"reason":"placeholder","edited":"placeholder","languages":["en"],"mailinglist":1, "_render_date":"20220315204755" } |
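For illustration, a table of this shape could be prototyped as below. This is a sketch under assumptions: SQLite is used only so that the example runs standalone, the column names and types are hypothetical, and the helper functions are not part of GrowthExperiments. It also shows how the old `_counter` value could be derived by counting rows per user.

```python
import sqlite3

# Hypothetical column names and types; the task only names the columns loosely.
SCHEMA = """
CREATE TABLE growthexperiments_welcome_survey_responses (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    submit_date TEXT NOT NULL,
    group_name TEXT NOT NULL,
    survey_submission TEXT NOT NULL
);
CREATE INDEX wsr_user_submit
    ON growthexperiments_welcome_survey_responses (user_id, submit_date);
"""


def create_db():
    """Create an in-memory database with the sketched table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def insert_response(conn, user_id, submit_date, group_name, blob):
    """Insert one submission as a new row and return its ID."""
    cur = conn.execute(
        "INSERT INTO growthexperiments_welcome_survey_responses "
        "(user_id, submit_date, group_name, survey_submission) "
        "VALUES (?, ?, ?, ?)",
        (user_id, submit_date, group_name, blob),
    )
    conn.commit()
    return cur.lastrowid


def count_submissions(conn, user_id):
    """Number of rows for a user, replacing the old _counter field."""
    row = conn.execute(
        "SELECT COUNT(*) FROM growthexperiments_welcome_survey_responses "
        "WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return row[0]
```

The index on user ID and submit date is a guess at the most likely access pattern (latest response per user, plus periodic deletion by date).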

Event Timeline

DMburugu triaged this task as Medium priority. (Dec 20 2022, 9:09 AM)
DMburugu moved this task from Triaged to Current Maintenance Focus on the Growth-Team board.
DMburugu lowered the priority of this task from Medium to Low. (Jan 9 2023, 4:53 PM)

In terms of scaling, this seems less important: we delete the data after a while, so the growth of user_properties is not huge. Given the amount of work involved in DB migrations, I'm unsure whether it's worth it.

OTOH, since we don't use the data much, I guess we don't really need a migration. We could just switch to a new table one day. The data of users who registered in the last 90 days would be lost from the perspective of the extension, but all we would need to do is ensure that those users don't get the reminder about filling out the welcome survey again; otherwise it would be fine.

I wonder, if we do put in the effort to set up a new table, could we somehow use it as a baby step towards a structured user profile?

> I wonder, if we do put in the effort to set up a new table, could we somehow use it as a baby step towards a structured user profile?

Can you elaborate on what you have in mind for that? Since the welcome survey contains private responses, and we need to truncate it periodically, I feel like a separate table for this would make sense. I do like the idea of building out tables for user profile support, but wonder if that might be better done in mediawiki/core from the beginning.

Adding myself as a subscriber in case we do decide to create a new table. That would require the Welcome Survey aggregation script to be updated to use the new data source. This should not be seen as an argument for or against any particular solution; it's just a change in data that I'll need to adapt to. Carry on.