Retry when requested range not satisfied #2222

tsutsu · 2024-06-13T00:28:19Z

I've run into the kind of bad-Range-response-header aborts reported in #1344 a couple of times now — mostly when downloading 100GB+ files from lesser-known CDNs using high connection parallelism.

In the cases I've observed where partial-content responses carry incorrect Range (Content-Range) response headers, which responses do this seems completely non-deterministic. Retrying the same large split download several times results in a different random one or two splits (out of the thousands that make up the download) range-aborting each time.

My hypothesis for what's going on in these cases (and perhaps in many of the ones people are noticing in #1344) is that aria2c is talking to a load balancer and/or hitting DNS round-robin, such that each partial request is directed to a different server backend, and the pool of backend nodes has a heterogeneous mix of webserver versions or configurations, such that a minority of the nodes are bugged somehow. Each "roll of the dice" when requesting a split can therefore land on a bad server that gives you a bad Range response header.

But why it happens doesn't really matter; all that matters is that these cases are non-deterministic, and therefore will almost always be fixed by just retrying the request.
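As a rough back-of-the-envelope model of the above (all numbers are illustrative assumptions, not measurements): suppose each request independently lands on a bad backend with probability $p$, and the download has $n$ splits. Without retries, the expected number of aborting splits and the chance that at least one split aborts are

$$
\mathbb{E}[\text{bad splits}] = np, \qquad P(\text{at least one abort}) = 1 - (1 - p)^n .
$$

With $p = 10^{-3}$ and $n = 2000$ this gives $np = 2$ bad splits on average and $1 - 0.999^{2000} \approx 0.86$, which is in line with "one or two splits out of thousands". If each split instead gets up to $k$ independent tries, a split only fails when all $k$ tries hit a bad backend:

$$
P(\text{split fails}) = p^k, \qquad P(\text{at least one abort}) = 1 - (1 - p^k)^n \approx n p^k .
$$

For $k = 5$ that is about $2000 \times 10^{-15} = 2 \times 10^{-12}$. The independence assumption is the weak point: it breaks down if a retry is routed straight back to the same backend.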

This PR simply makes aria2 retry, rather than abort, when !(HttpRequest::isRangeSatisfied(...)).

I've used aria2 with this fix to download from one of these flaky hosts, and retrying does "solve" the problem. I suspect it will solve most if not all of the problems seen by people in #1344.

(And in the unlikely case where a bad Range response is a deterministic problem with a webserver, aria2 will still eventually hit its retry limit and abort anyway.)
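To make the retry-instead-of-abort behaviour concrete, here is a minimal standalone sketch. It is not the patch itself and does not use aria2's classes: `ByteRange`, `RangeRequest`, `SplitResult`, and `fetchSplit` are hypothetical names invented for illustration, and the real change lives in aria2's own response validation and retry machinery.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical types for illustration only; these are not aria2's classes.
struct ByteRange {
  std::int64_t first;
  std::int64_t last;
  bool operator==(const ByteRange& other) const {
    return first == other.first && last == other.last;
  }
};

enum class SplitResult { Ok, RetryLimitReached };

// Stands in for one HTTP range request. It returns the range the server
// claims to have sent back (i.e. what Content-Range reported).
using RangeRequest = std::function<ByteRange(const ByteRange&)>;

// Re-issue the request whenever the returned range does not match the
// requested one, and give up only after maxTries attempts.
inline SplitResult fetchSplit(const ByteRange& wanted,
                              const RangeRequest& request,
                              int maxTries,
                              int* triesUsed = nullptr) {
  assert(maxTries > 0);
  for (int attempt = 1; attempt <= maxTries; ++attempt) {
    const ByteRange got = request(wanted);
    if (triesUsed != nullptr) {
      *triesUsed = attempt;
    }
    if (got == wanted) {
      return SplitResult::Ok;  // range satisfied, the split can proceed
    }
    // Range not satisfied: previously an abort, here just another attempt.
  }
  // A deterministically broken server still ends the download eventually.
  return SplitResult::RetryLimitReached;
}
```

A flaky backend that answers wrongly once and correctly afterwards yields `SplitResult::Ok` on the second attempt, while a backend that always answers wrongly yields `SplitResult::RetryLimitReached` after `maxTries` attempts.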

tsutsu · 2024-06-13T00:31:35Z

I would also suggest tweaking the retry logic to use a minimum retry-wait of e.g. 2s for retries triggered by this error code specifically, even when the user has not specified a --retry-wait.

When talking to a load balancer that uses least-conn upstream routing, retrying immediately after a bad Range response header might (depending on the load balancer implementation, and how busy it is) all but guarantee that you re-acquire exactly the same flaky backend you just released.

Adding an enforced delay for this case would give someone else the opportunity to take the connection to the bugged backend you just released, so that you'll get a different one. 😄
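A minimal sketch of what such a floor could look like, assuming the retry cause is known at the point where the wait is chosen. `RetryCause`, `kMinRangeRetryWait`, and `effectiveRetryWait` are hypothetical names, not aria2 identifiers, and the 2s value is just the example figure from above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Hypothetical retry causes; this is not aria2's error_code enum.
enum class RetryCause { RangeNotSatisfied, Other };

// Assumed floor, taken from the "e.g. 2s" suggestion; not a tuned value.
constexpr std::chrono::seconds kMinRangeRetryWait{2};

// Returns the wait to apply before the next attempt. The user's
// --retry-wait is honoured as-is, except that range-mismatch retries
// never wait less than the floor.
inline std::chrono::seconds effectiveRetryWait(
    std::chrono::seconds userRetryWait, RetryCause cause) {
  assert(userRetryWait.count() >= 0);
  if (cause == RetryCause::RangeNotSatisfied) {
    return std::max(userRetryWait, kMinRangeRetryWait);
  }
  return userRetryWait;
}
```

With no --retry-wait set (a wait of 0s), a range-mismatch retry would wait 2s, a user-specified 10s would be left alone, and retries for any other cause would be unaffected.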

Thoughts?

Retry when requested range not satisfied

8dd0805
