Phase out "redis_sessions" cluster and away from memcached cluster
Closed, Resolved; Public

Description

This task is about phasing out the co-located Redis instances on MW's memcached servers, which were historically used for MW's session storage and MW's main stash, both of which have been or are being migrated to other backends.

With T212129 done, the redis_sessions cluster is no longer in use, so it is high time we removed it!

Background
We have been discussing this topic for a few quarters now, and it is time to make the final push to get it done. Please attach any relevant tasks and update the description with a checklist accordingly.

We are planning to keep this cluster on Redis 2.8 even after we upgrade those hosts to buster, but this adds the administrative overhead of keeping and maintaining a very old Redis version (T265643).

Related Objects

Status    Subtype           Assigned         Task
Resolved                    jijiki
Resolved                    aaron
Resolved                    Eevans
Resolved                    aaron
Resolved                    Krinkle
Resolved                    Papaul
Resolved                    Marostegui
Resolved                    aaron
Resolved                    Krinkle
Resolved                    tstarling
Resolved                    tstarling
Resolved  PRODUCTION ERROR  jcrespo
Resolved                    tstarling
Resolved                    Marostegui
Resolved                    tstarling
Resolved                    jijiki
Resolved                    jijiki
Resolved                    jijiki
Resolved                    Clement_Goubert
Resolved                    None
Declined                    None

Event Timeline

Krinkle renamed this task from Phasing out MediaWiki Redis to Phasing out "redis_sessions" MediaWiki cluster. Nov 9 2020, 7:40 PM
Krinkle updated the task description.

Clarified title and summary based on chat with Effie on IRC.

The two remaining uses for a simpler non-replicated Redis are ChronologyProtector and (probably) CentralAuth tokens. I suppose these could be folded into the dedicated Redis cluster we have for MediaWiki, on the "rdb*" hosts. This is used for MW's lock manager, and was (previously) used for JobQueue. The CP and CA uses could be migrated there fairly easily, I think.

Migrating CP or CA to something else is also an option long-term, but not a priority, I think, as we have plenty of other more impactful things to work on. This is mostly working fine as it is, and is still a win-win in terms of operational complexity and maintenance cost.

Just a correction, so as not to confuse the various redis clusters 🙂. The rdb* hosts constitute what we call the misc redis cluster, currently used by changeprop, ORES and by our docker-registry. MW's lock manager consists of 3 redis hosts from the 'session redis' cluster.

Please note that this cluster is not highly available, so if one master goes down, it will have to be manually substituted by the secondary one. Moreover, data in this cluster is dc-local and not replicated.
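
To make the operational consequence concrete, here is a minimal sketch of what "dc-local, no automatic failover" means for a client looking up its Redis master. The hostnames, the `Shard` class and the `master_for` helper are all hypothetical and are not taken from mediawiki-config or puppet; the sketch only models the behaviour described above.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    """One Redis shard in a single datacenter: a master plus a standby secondary."""
    master: str
    secondary: str

    def promote_secondary(self) -> None:
        # Manual failover: an operator swaps the roles by hand.
        # Nothing in this model does it automatically when the master dies.
        self.master, self.secondary = self.secondary, self.master


# Hypothetical hostnames, for illustration only.
CLUSTER = {
    "eqiad": [Shard("rdb-a.eqiad.example", "rdb-b.eqiad.example")],
    "codfw": [Shard("rdb-a.codfw.example", "rdb-b.codfw.example")],
}


def master_for(dc: str, shard_index: int = 0) -> str:
    """Return the master a client in `dc` should use.

    Lookups never cross datacenters, because the data is dc-local
    and not replicated between them.
    """
    return CLUSTER[dc][shard_index].master
```

Under these assumptions, `master_for("eqiad")` keeps returning the same host until someone calls `promote_secondary()` on that shard, and a failure in one datacenter changes nothing for clients in the other.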

jijiki triaged this task as Medium priority. Nov 10 2020, 3:59 PM

Change 640576 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] ProductionServices: Document hostname of redis_lock hosts

https://gerrit.wikimedia.org/r/640576

Change 640576 merged by jenkins-bot:
[operations/mediawiki-config@master] ProductionServices: Document hostname of redis_lock hosts

https://gerrit.wikimedia.org/r/640576

Change 646638 had a related patch set uploaded (by Effie Mouzeli; owner: Effie Mouzeli):
[operations/puppet@production] redis: define redis version on buster

https://gerrit.wikimedia.org/r/646638

Change 647197 had a related patch set uploaded (by Effie Mouzeli; owner: Effie Mouzeli):
[operations/puppet@production] redis: define redis version on buster for multidc

https://gerrit.wikimedia.org/r/647197

Change 646638 abandoned by Effie Mouzeli:
[operations/puppet@production] redis: define redis version on buster

Reason:
abandoned for 647197

https://gerrit.wikimedia.org/r/646638

Change 647197 merged by Effie Mouzeli:
[operations/puppet@production] redis: define redis version on buster for multidc

https://gerrit.wikimedia.org/r/647197

jijiki renamed this task from Phasing out "redis_sessions" MediaWiki cluster to Phasing out "redis_sessions" MediaWiki cluster from the memcached cluster. Apr 19 2021, 6:52 PM
jijiki renamed this task from Phasing out "redis_sessions" MediaWiki cluster from the memcached cluster to Phasing out "redis_sessions" MediaWiki cluster and away from the memcached cluster. Apr 19 2021, 6:59 PM
Krinkle renamed this task from Phasing out "redis_sessions" MediaWiki cluster and away from the memcached cluster to Phase out "redis_sessions" cluster and away from memcached cluster. Aug 19 2022, 12:55 PM

Change 824734 had a related patch set uploaded (by Krinkle; author: Krinkle):

[operations/mediawiki-config@master] redis: Remove references to nutcracker and redis_sessions cluster

https://gerrit.wikimedia.org/r/824734

Change 824736 had a related patch set uploaded (by Krinkle; author: Krinkle):

[operations/mediawiki-config@master] Remove references to now-empty redis.php file

https://gerrit.wikimedia.org/r/824736

Change 824737 had a related patch set uploaded (by Krinkle; author: Krinkle):

[operations/mediawiki-config@master] redis: Remove now-empty and unreferenced redis.php file

https://gerrit.wikimedia.org/r/824737

Change 824734 merged by jenkins-bot:

[operations/mediawiki-config@master] redis: Remove references to nutcracker and redis_sessions cluster

https://gerrit.wikimedia.org/r/824734

Change 824736 merged by jenkins-bot:

[operations/mediawiki-config@master] Remove references to now-empty redis.php file

https://gerrit.wikimedia.org/r/824736

Change 824737 merged by jenkins-bot:

[operations/mediawiki-config@master] redis: Remove now-empty and unreferenced redis.php file

https://gerrit.wikimedia.org/r/824737

Change 530019 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[mediawiki/core@master] [WIP] Migrate uses of MemcachedClient to new MemcachedHandle class

https://gerrit.wikimedia.org/r/530019

Change 864830 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/puppet@production] Redis sessions: Goodbye

https://gerrit.wikimedia.org/r/864830

Change 865117 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/mediawiki-config@master] ProductionServices: Replace use redis_misc servers for LockManager (1/6)

https://gerrit.wikimedia.org/r/865117

Change 865118 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/mediawiki-config@master] ProductionServices: Replace use redis_misc servers for LockManager (2/6)

https://gerrit.wikimedia.org/r/865118

Change 865119 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/mediawiki-config@master] ProductionServices: Replace use redis_misc servers for LockManager (3/6)

https://gerrit.wikimedia.org/r/865119

Change 865121 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/mediawiki-config@master] ProductionServices: Replace use redis_misc servers for LockManager (4/6)

https://gerrit.wikimedia.org/r/865121

Change 865122 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/mediawiki-config@master] ProductionServices: Replace use redis_misc servers for LockManager (5/6)

https://gerrit.wikimedia.org/r/865122

Change 865123 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/mediawiki-config@master] ProductionServices: Use redis_misc servers for LockManager (6/6)

https://gerrit.wikimedia.org/r/865123

Change 865117 merged by jenkins-bot:

[operations/mediawiki-config@master] ProductionServices: Use redis_misc servers for LockManager (1/6)

https://gerrit.wikimedia.org/r/865117

Mentioned in SAL (#wikimedia-operations) [2022-12-07T09:25:17Z] <jiji@deploy1002> Started scap: Backport for [[gerrit:865117|ProductionServices: Use redis_misc servers for LockManager (1/6) (T267581)]]

Mentioned in SAL (#wikimedia-operations) [2022-12-07T09:27:19Z] <jiji@deploy1002> jiji and jiji: Backport for [[gerrit:865117|ProductionServices: Use redis_misc servers for LockManager (1/6) (T267581)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-12-07T09:34:25Z] <jiji@deploy1002> Finished scap: Backport for [[gerrit:865117|ProductionServices: Use redis_misc servers for LockManager (1/6) (T267581)]] (duration: 09m 08s)

Change 865118 merged by Effie Mouzeli:

[operations/mediawiki-config@master] ProductionServices: Use redis_misc servers for LockManager (2/6)

https://gerrit.wikimedia.org/r/865118

Mentioned in SAL (#wikimedia-operations) [2022-12-07T10:07:03Z] <jiji@deploy1002> Started scap: Backport for [[gerrit:865118|ProductionServices: Use redis_misc servers for LockManager (2/6) (T267581)]]

Mentioned in SAL (#wikimedia-operations) [2022-12-07T10:09:04Z] <jiji@deploy1002> jiji and jiji: Backport for [[gerrit:865118|ProductionServices: Use redis_misc servers for LockManager (2/6) (T267581)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-12-07T10:17:52Z] <jiji@deploy1002> Finished scap: Backport for [[gerrit:865118|ProductionServices: Use redis_misc servers for LockManager (2/6) (T267581)]] (duration: 10m 48s)

Change 865119 merged by jenkins-bot:

[operations/mediawiki-config@master] ProductionServices: Use redis_misc servers for LockManager (3/6)

https://gerrit.wikimedia.org/r/865119

Mentioned in SAL (#wikimedia-operations) [2022-12-07T10:25:15Z] <jiji@deploy1002> Started scap: Backport for [[gerrit:865119|ProductionServices: Use redis_misc servers for LockManager (3/6) (T267581)]]

Mentioned in SAL (#wikimedia-operations) [2022-12-07T10:27:12Z] <jiji@deploy1002> jiji and jiji: Backport for [[gerrit:865119|ProductionServices: Use redis_misc servers for LockManager (3/6) (T267581)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-12-07T10:35:44Z] <jiji@deploy1002> Finished scap: Backport for [[gerrit:865119|ProductionServices: Use redis_misc servers for LockManager (3/6) (T267581)]] (duration: 10m 29s)

Change 865121 merged by jenkins-bot:

[operations/mediawiki-config@master] ProductionServices: Use redis_misc servers for LockManager (4/6)

https://gerrit.wikimedia.org/r/865121

Mentioned in SAL (#wikimedia-operations) [2022-12-07T15:37:55Z] <jiji@deploy1002> Started scap: Backport for [[gerrit:865121|ProductionServices: Use redis_misc servers for LockManager (4/6) (T267581)]]

Mentioned in SAL (#wikimedia-operations) [2022-12-07T15:39:49Z] <jiji@deploy1002> jiji and jiji: Backport for [[gerrit:865121|ProductionServices: Use redis_misc servers for LockManager (4/6) (T267581)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-12-07T15:46:25Z] <jiji@deploy1002> Finished scap: Backport for [[gerrit:865121|ProductionServices: Use redis_misc servers for LockManager (4/6) (T267581)]] (duration: 08m 29s)

Change 865122 merged by jenkins-bot:

[operations/mediawiki-config@master] ProductionServices: Use redis_misc servers for LockManager (5/6)

https://gerrit.wikimedia.org/r/865122

Mentioned in SAL (#wikimedia-operations) [2022-12-07T15:57:32Z] <jiji@deploy1002> Started scap: Backport for [[gerrit:865122|ProductionServices: Use redis_misc servers for LockManager (5/6) (T267581)]]

Mentioned in SAL (#wikimedia-operations) [2022-12-07T15:59:25Z] <jiji@deploy1002> jiji and jiji: Backport for [[gerrit:865122|ProductionServices: Use redis_misc servers for LockManager (5/6) (T267581)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-12-07T16:08:32Z] <jiji@deploy1002> Finished scap: Backport for [[gerrit:865122|ProductionServices: Use redis_misc servers for LockManager (5/6) (T267581)]] (duration: 10m 59s)

Change 865123 merged by jenkins-bot:

[operations/mediawiki-config@master] ProductionServices: Use redis_misc servers for LockManager (6/6)

https://gerrit.wikimedia.org/r/865123

Mentioned in SAL (#wikimedia-operations) [2022-12-07T16:46:55Z] <jiji@deploy1002> Started scap: Backport for [[gerrit:865123|ProductionServices: Use redis_misc servers for LockManager (6/6) (T267581)]]

Mentioned in SAL (#wikimedia-operations) [2022-12-07T16:48:52Z] <jiji@deploy1002> jiji and jiji: Backport for [[gerrit:865123|ProductionServices: Use redis_misc servers for LockManager (6/6) (T267581)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-12-07T17:01:42Z] <jiji@deploy1002> Finished scap: Backport for [[gerrit:865123|ProductionServices: Use redis_misc servers for LockManager (6/6) (T267581)]] (duration: 14m 46s)

Change 867707 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mediawiki-common: Replace redis_session servers with rdb*

https://gerrit.wikimedia.org/r/867707

Change 864830 merged by Effie Mouzeli:

[operations/puppet@production] Redis sessions: Goodbye

https://gerrit.wikimedia.org/r/864830

Change 868393 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] profile::spicerack: Stop writing Redis sessions data

https://gerrit.wikimedia.org/r/868393

Change 868393 merged by Muehlenhoff:

[operations/puppet@production] profile::spicerack: Stop writing Redis sessions data

https://gerrit.wikimedia.org/r/868393

Change 867707 merged by jenkins-bot:

[operations/deployment-charts@master] mediawiki-common: Replace redis_session servers with rdb*

https://gerrit.wikimedia.org/r/867707

Change #530019 abandoned by Aaron Schulz:

[mediawiki/core@master] Migrate uses of MemcachedClient to new MemcachedConnRef class

Reason:

https://gerrit.wikimedia.org/r/530019

Change #530019 restored by Aaron Schulz:

[mediawiki/core@master] Migrate uses of MemcachedClient to new MemcachedConnRef class

https://gerrit.wikimedia.org/r/530019