64 lines
2.4 KiB
Diff
64 lines
2.4 KiB
Diff
From: Yuchung Cheng <ycheng@google.com>
|
|
Date: Wed, 6 Jan 2016 12:42:38 -0800
|
|
Subject: tcp: fix zero cwnd in tcp_cwnd_reduction
|
|
Origin: https://git.kernel.org/linus/8b8a321ff72c785ed5e8b4cf6eda20b35d427390
|
|
|
|
Patch 3759824da87b ("tcp: PRR uses CRB mode by default and SS mode
|
|
conditionally") introduced a bug that cwnd may become 0 when both
|
|
inflight and sndcnt are 0 (cwnd = inflight + sndcnt). This may lead
|
|
to a div-by-zero if the connection starts another cwnd reduction
|
|
phase by setting tp->prior_cwnd to the current cwnd (0) in
|
|
tcp_init_cwnd_reduction().
|
|
|
|
To prevent this we skip PRR operation when nothing is acked or
|
|
sacked. Then cwnd must be positive in all cases as long as ssthresh
|
|
is positive:
|
|
|
|
1) The proportional reduction mode
|
|
inflight > ssthresh > 0
|
|
|
|
2) The reduction bound mode
|
|
a) inflight == ssthresh > 0
|
|
|
|
b) inflight < ssthresh
|
|
sndcnt > 0 since newly_acked_sacked > 0 and inflight < ssthresh
|
|
|
|
Therefore in all cases inflight and sndcnt can not both be 0.
|
|
We check invalid tp->prior_cwnd to avoid potential div0 bugs.
|
|
|
|
In reality this bug is triggered only with a sequence of less common
|
|
events. For example, the connection is terminating an ECN-triggered
|
|
cwnd reduction with an inflight 0, then it receives reordered/old
|
|
ACKs or DSACKs from prior transmission (which acks nothing). Or the
|
|
connection is in fast recovery stage that marks everything lost,
|
|
but fails to retransmit due to local issues, then receives data
|
|
packets from other end which acks nothing.
|
|
|
|
Fixes: 3759824da87b ("tcp: PRR uses CRB mode by default and SS mode conditionally")
|
|
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
|
|
Signed-off-by: Yuchung Cheng <ycheng@google.com>
|
|
Signed-off-by: Neal Cardwell <ncardwell@google.com>
|
|
Signed-off-by: Eric Dumazet <edumazet@google.com>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
---
|
|
net/ipv4/tcp_input.c | 3 +++
|
|
1 file changed, 3 insertions(+)
|
|
|
|
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
|
|
index 2d656ee..d4c5115 100644
|
|
--- a/net/ipv4/tcp_input.c
|
|
+++ b/net/ipv4/tcp_input.c
|
|
@@ -2478,6 +2478,9 @@ static void tcp_cwnd_reduction(struct sock *sk, const int prior_unsacked,
|
|
int newly_acked_sacked = prior_unsacked -
|
|
(tp->packets_out - tp->sacked_out);
|
|
|
|
+ if (newly_acked_sacked <= 0 || WARN_ON_ONCE(!tp->prior_cwnd))
|
|
+ return;
|
|
+
|
|
tp->prr_delivered += newly_acked_sacked;
|
|
if (delta < 0) {
|
|
u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +
|
|
--
|
|
2.1.4
|
|
|