When initiating a graceful shutdown while having data chunks
on the retransmission queue with a peer which is in zero
window mode the shutdown is never completed because the
retransmission error count is reset periodically by the
following two rules:
- Do not timeout association while doing zero window probe.
- Reset overall error count when a heartbeat request has
been acknowledged.
The graceful shutdown will wait for all outstanding TSN to
be acknowledged before sending the SHUTDOWN request. This
never happens due to the peer's zero window not acknowledging
the continuously retransmitted data chunks. Although the
error counter is incremented for each failed retransmission,
the receiving of the SACK announcing the zero window clears
the error count again immediately. Also heartbeat requests
continue to be sent periodically. The peer acknowledges these
requests causing the error counter to be reset as well.
This patch changes behaviour to only reset the overall error
counter for the above rules while not in shutdown. After
reaching the maximum number of retransmission attempts, the
T5 shutdown guard timer is scheduled to give the receiver
some additional time to recover. The timer is stopped as soon
as the receiver acknowledges any data.
The issue can be easily reproduced by establishing a sctp
association over the loopback device, constantly queueing
data at the sender while not reading any at the receiver.
Wait for the window to reach zero, then initiate a shutdown
by killing both processes simultaneously. The association
will never be freed and the chunks on the retransmission
queue will be retransmitted indefinitely.
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the peer restart the asoc, we should not only fail any unsent/unacked
data, but also stop the T3-rtx, SACK, T4-rto timers, and teardown ASCONF
queues.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove SCTP_CMD_TRANSMIT command as it never be used.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. SCTP_CMD_NUM_VERBS,SCTP_CMD_MAX
These two macros have never been used for several years since v2.6.12-rc2.
2.sctp_port_rover,sctp_port_alloc_lock
The commit 063930 abandoned global variables of port_rover and port_alloc_lock,
but still keep two macros to refer to them.
So, remove them now.
commit 0639300900
Author: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Wed Oct 10 17:30:18 2007 -0700
[SCTP]: port randomization
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currenlty, sctp breaks up user messages into fragments and
sends each fragment to the lower layer by itself. This means
that for each fragment we go all the way down the stack
and back up. This also discourages bundling of multiple
fragments when they can fit into a sigle packet (ex: due
to user setting a low fragmentation threashold).
We introduce a new command SCTP_CMD_SND_MSG and hand the
whole message down state machine. The state machine and
the side-effect parser will cork the queue, add all chunks
from the message to the queue, and then un-cork the queue
thus causing the chunks to get transmitted.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Commit faee47cdbf
(sctp: Fix the RTO-doubling on idle-link heartbeats)
broke the RTO doubling for data retransmits. If the
heartbeat was sent before the data T3-rtx time, the
the RTO will not double upon the T3-rtx expiration.
Distingish between the operations by passing an argument
to the function.
Additionally, Wei Youngjun pointed out that our treatment
of requested HEARTBEATS and timer HEARTBEATS is the same
wrt resetting congestion window. That needs to be separated,
since user requested HEARTBEATS should not treat the link
as idle.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When receiving an error length INIT-ACK during COOKIE-WAIT,
a 0-vtag ABORT will be responsed. This action violates the
protocol apparently. This patch achieves the following things.
1 If the INIT-ACK contains all the fixed parameters, use init-tag
recorded from INIT-ACK as vtag.
2 If the INIT-ACK doesn't contain all the fixed parameters,
just reflect its vtag.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With a was number of callsites sctp_add_cmd_sf wrapper bloats
kernel by some amount. Due to unlikely tracking allyesconfig,
with the initial result were around ~7kB (thus caught my
attention) while a non-debug config produced only ~2.3kB effect.
I (ij) proposed first a patch to uninline it but Vlad responded
with a patch that removed the only sctp_add_cmd call which is
wrapped by sctp_add_cmd_sf (I wasn't sure if I could do that).
I did minor cleanup to Vlad's patch.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I was notified by Randy Stewart that lksctp claims to be
"the reference implementation". First of all, "the
refrence implementation" was the original implementation
of SCTP in usersapce written ty Randy and a few others.
Second, after looking at the definiton of 'reference implementation',
we don't really meet the requirements.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Commit d0ce92910b broke several retransmit
cases including fast retransmit. The reason is that we should
only delay by rto while doing retranmists as a result of a timeout.
Retransmit as a result of path mtu discover, fast retransmit, or
other evernts that should trigger immidiate retransmissions got broken.
Also, since rto is doubled prior to marking of packets elegable for
retransmission, we never marked correct chunks anyway.
The fix is provide a reason for a given retransmission so that we
can mark chunks appropriately and to save the old rto value to do
comparisons against.
All regressions tests passed with this code.
Spotted by Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Implement processing for the CHUNKS, RANDOM, and HMAC parameters and
deal with how this parameters are effected by association restarts.
In particular, during unexpeted INIT processing, we need to reply with
parameters from the original INIT chunk. Also, after restart, we need
to update the old association with new peer parameters and change the
association shared keys.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During the INIT/COOKIE-ACK collision cases, it's possible to get
into a situation where the association id is not yet set at the time
of the user event generation. As a result, user events have an
association id set to 0 which will confuse applications.
This happens if we hit case B of duplicate cookie processing.
In the particular example found and provided by Oscar Isaula
<Oscar.Isaula@motorola.com>, flow looks like this:
A B
---- INIT-------> (lost)
<---------INIT------
---- INIT-ACK--->
<------ Cookie ECHO
When the Cookie Echo is received, we end up trying to update the
association that was created on A as a result of the (lost) INIT,
but that association doesn't have the ID set yet.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
argument stored for SCTP_CMD_INIT_FAILED is always __be16
(protocol error). Introduced new field and accessor for
it (SCTP_PERR()); switched to their use (from SCTP_U32() and
.u32)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
On ia64, it is possible to get NaT Consumption Fault and a kernel panic
when initializing sctp sideeffect commands arguments. The union
sctp_arg_t contains different sized elements and when loading a smaller
sized element (32 or 16 bits), it is possible for a speculative load to
fail and result in a NaT bit set which causes a kernel crash. The easy
way to get around it is to load the largerst member of the union.
Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implements sctp_connectx() as defined in the SCTP sockets API draft by
tunneling the request through a setsockopt().
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!