
Wikifunctions is down
Closed, Resolved | Public | Bug Report

Description

Steps to replicate the issue (include links if applicable):

What happens?: An error is displayed, see screenshot

What should have happened instead?: Page should be displayed

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

Screenshot_20240908_124149_Chrome.jpg (1×1 px, 177 KB)

Event Timeline

mw-wikifunctions seems to be down in eqiad at the moment:

vgutierrez@cp6016:~$ nc -zv mw-wikifunctions.discovery.wmnet 4451
nc: connect to mw-wikifunctions.discovery.wmnet (10.2.2.88) port 4451 (tcp) failed: Connection refused
vgutierrez@cp6016:~$ nc -zv mw-wikifunctions.svc.eqiad.wmnet 4451
nc: connect to mw-wikifunctions.svc.eqiad.wmnet (10.2.2.88) port 4451 (tcp) failed: Connection refused
vgutierrez@cp6016:~$ nc -zv mw-wikifunctions.svc.codfw.wmnet 4451
Connection to mw-wikifunctions.svc.codfw.wmnet (10.2.1.88) 4451 port [tcp/*] succeeded!
Vgutierrez claimed this task.

Service should be restored now.

Joe reopened this task as In Progress. Mon, Sep 9, 1:08 PM
Joe subscribed.

For the record, the cause was a relatively aggressive crawler exhausting all available resources. While we've rate-limited this bot, I think we should use robots.txt to disallow crawling of most pages.

Probably just disallowing crawling of /view/ would be more than enough. Once we have better backend performance and organic traffic increases, we can think about lifting that block.

Jdforrester-WMF triaged this task as Unbreak Now! priority.
Jdforrester-WMF subscribed.

I've added a general ban of ClaudeBot for all pages to https://www.wikifunctions.org/wiki/MediaWiki:Robots.txt for now. Re-Resolving, but we'll want to follow up to see if we can support higher request rates at lower load.
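For anyone wanting to sanity-check rules like the ones discussed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules text is an assumption written for illustration (a full ban for ClaudeBot plus a `/view/` disallow for everyone else); it is not the actual content of MediaWiki:Robots.txt. The `is_allowed` helper, the `SomeOtherBot` agent name and the example URLs are hypothetical too.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only, NOT the real MediaWiki:Robots.txt.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /view/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the hypothetical rules above let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "https://www.wikifunctions.org"
    for agent, path in [
        ("ClaudeBot", "/wiki/Wikifunctions:Main_Page"),
        ("SomeOtherBot", "/view/en/Z801"),
        ("SomeOtherBot", "/wiki/Wikifunctions:Main_Page"),
    ]:
        print(agent, path, is_allowed(agent, base + path))
```

Note that robots.txt is advisory: it only helps against crawlers that choose to honor it, so it complements server-side rate limiting rather than replacing it.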

Can/have we set a general rate limit so that when the next bot tries it, we don't go down again?
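As a rough illustration of what a general per-client rate limit could look like, here is a minimal token-bucket sketch. It is not the mechanism used in production for Wikifunctions (the thread does not say how the bot was rate-limited); the `TokenBucket` class, its parameters and the numbers in the usage example are all assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket: allows bursts up to `capacity`,
    then a sustained `rate` of requests per second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens
        self.clock = clock        # injectable so the logic can be tested
        self.tokens = capacity    # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    # Hypothetical limits: 5 requests/second sustained, bursts of up to 10.
    bucket = TokenBucket(rate=5.0, capacity=10.0)
    accepted = sum(bucket.allow() for _ in range(50))
    print(f"accepted {accepted} of 50 back-to-back requests")
```

In practice one bucket would be kept per client (for example per IP or per user agent), so that a single aggressive crawler gets throttled without affecting other traffic.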

I wonder if we can lift that ban again? I think the crawler in combination with T374241 was causing the site instability. I would suggest that we un-ban the bot and see if it causes issues again, because I don't think the crawling by itself would cause the issues described here.