Commit Graph

586 Commits

Author SHA1 Message Date
Vlad Yasevich
cb95ea32a4 sctp: Don't do NAGLE delay on large writes that were fragmented small
SCTP will delay the last part of a large write due to NAGLE, if that
part is smaller then MTU.  Since we are doing large writes, we might
as well send the last portion now instead of waiting untill the next
large write happens.  The small portion will be sent as is regardless,
so it's better to not delay it.

This is a result of much discussions with Wei Yongjun <yjwei@cn.fujitsu.com>
and Doug Graham <dgraham@nortel.com>.  Many thanks go out to them.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:59 -04:00
Vlad Yasevich
b29e790728 sctp: Nagle delay should be based on path mtu
The decision to delay due to Nagle should be based on the path mtu
and future packet size.  We currently incorrectly base it on
'frag_point' which is the SCTP DATA segment size, and also we do
not count DATA chunk header overhead in the computation.  This
actuall allows situations where a user can set low 'frag_point',
and then send small messages without delay.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:59 -04:00
Vlad Yasevich
d4d6fb5787 sctp: Try not to change a_rwnd when faking a SACK from SHUTDOWN.
We currently set a_rwnd to 0 when faking a SACK from SHUTDOWN.
This results in an hung association if the remote only uses
SHUTDOWNs (which it's allowed to do) to acknowlege DATA when
closing.  The reason for that is that we simply honor the a_rwnd
from the sack, but since we faked it to be 0, we enter 0-window
probing.  The fix is to use the peers old rwnd and add our flight
size to it.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:59 -04:00
Vlad Yasevich
4d3c46e683 sctp: drop a_rwnd to 0 when receive buffer overflows.
SCTP has a problem that when small chunks are used, it is possible
to exhaust the receiver buffer without fully closing receive window.
This happens due to all overhead that we have account for with small
messages.  To fix this, when receive buffer is exceeded, we'll drop
the window to 0 and save the 'drop' portion.  When application starts
reading data and freeing up recevie buffer space, we'll wait until
we've reached the 'drop' window and then add back this 'drop' one
mtu at a time.  This worked well in testing and under stress produced
rather even recovery.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:59 -04:00
Vlad Yasevich
33ce828131 sctp: Clear fast_recovery on the transport when T3 timer expires.
If T3 timer expires, we are retransmitting data due to timeout any
any fast recovery is null and void.  We can clear the fast recovery
flag.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:58 -04:00
Vlad Yasevich
b9f8478682 sctp: Fix error count increments that were results of HEARTBEATS
SCTP RFC 4960 states that unacknowledged HEARTBEATS count as
errors agains a given transport or endpoint.  As such, we
should increment the error counts for only for unacknowledged
HB, otherwise we detect failure too soon.  This goes for both
the overall error count and the path error count.

Now, there is a difference in how the detection is done
between the two.  The path error detection is done after
the increment, so to detect it properly, we actually need
to exceed the path threshold.  The overall error detection
is done _BEFORE_ the increment.  Thus to detect the failure,
it's enough for the error count to match the threshold.
This is why all the state functions use '>=' to detect failure,
while path detection uses '>'.

Thanks goes to Chunbo Luo <chunbo.luo@windriver.com> who first
proposed patches to fix this issue and made me re-read the spec
and the code to figure out how this cruft really works.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:58 -04:00
Alexey Dobriyan
d71a09ed55 sctp: use proc_create()
create_proc_entry() is deprecated (not formally, though).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:58 -04:00
Wei Yongjun
dadb50cc1a sctp: fix check the chunk length of received HEARTBEAT-ACK chunk
The receiver of the HEARTBEAT should respond with a HEARTBEAT ACK
that contains the Heartbeat Information field copied from the
received HEARTBEAT chunk. So the received HEARTBEAT-ACK chunk
must have a length of:
  sizeof(sctp_chunkhdr_t) + sizeof(sctp_sender_hb_info_t)

A badly formatted HB-ACK chunk, it is possible that we may access
invalid memory.  We should really make sure that the chunk format
is what we expect, before attempting to touch the data.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:58 -04:00
Wei Yongjun
a2f36eec56 sctp: drop SHUTDOWN chunk if the TSN is less than the CTSN
If Cumulative TSN Ack field of SHUTDOWN chunk is less than the
Cumulative TSN Ack Point then drop the SHUTDOWN chunk.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:57 -04:00
Vlad Yasevich
9c5c62be2f sctp: Send user messages to the lower layer as one
Currenlty, sctp breaks up user messages into fragments and
sends each fragment to the lower layer by itself.  This means
that for each fragment we go all the way down the stack
and back up.  This also discourages bundling of multiple
fragments when they can fit into a sigle packet (ex: due
to user setting a low fragmentation threashold).

We introduce a new command SCTP_CMD_SND_MSG and hand the
whole message down state machine.  The state machine and
the side-effect parser will cork the queue, add all chunks
from the message to the queue, and then un-cork the queue
thus causing the chunks to get transmitted.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:57 -04:00
Vlad Yasevich
5d7ff261ef sctp: Try to encourage SACK bundling with DATA.
If the association has a SACK timer pending and now DATA queued
to be send, we'll try to bundle the SACK with the next application send.
As such, try encourage bundling by accounting for SACK in the size
of the first chunk fragment.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:56 -04:00
Vlad Yasevich
e83963b769 sctp: Generate SACKs when actually sending outbound DATA
We are now trying to bundle SACKs when we have outbound
DATA to send.  However, there are situations where this
outbound DATA will not be sent (due to congestion or 
available window).  In such cases it's ok to wait for the
timer to expire.  This patch refactors the sending code
so that betfore attempting to bundle the SACK we check
to see if the DATA will actually be transmitted.

Based on eirlier works for Doug Graham <dgraham@nortel.com> and
Wei Youngjun <yjwei@cn.fujitsu.com>.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:56 -04:00
Vlad Yasevich
3e62abf92f sctp: Fix data segmentation with small frag_size
Since an application may specify the maximum SCTP fragment size
that all data should be fragmented to, we need to fix how
we do segmentation.   Right now, if a user specifies a small
fragment size, the segment size can go negative in the presence
of AUTH or COOKIE_ECHO bundling.

What we need to do is track the largest possbile DATA chunk that
can fit into the mtu.  Then if the fragment size specified is
bigger then this maximum length, we'll shrink it down.  Otherwise,
we just use the smaller segment size without changing it further.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:56 -04:00
Vlad Yasevich
bec9640bb0 sctp: Disallow new connection on a closing socket
If a socket has a lot of association that are in the process of
of being closed/aborted, it is possible for a remote to establish
new associations during the time period that the old ones are shutting
down.  If this was a result of a close() call, there will be no socket
and will cause a memory leak.  We'll prevent this by setting the
socket state to CLOSING and disallow new associations when in this state.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:56 -04:00
Doug Graham
af87b823ca sctp: Fix piggybacked ACKs
This patch corrects the conditions under which a SACK will be piggybacked
on a DATA packet.  The previous condition was incorrect due to a
misinterpretation of RFC 4960 and/or RFC 2960.  Specifically, the
following paragraph from section 6.2 had not been implemented correctly:

   Before an endpoint transmits a DATA chunk, if any received DATA
   chunks have not been acknowledged (e.g., due to delayed ack), the
   sender should create a SACK and bundle it with the outbound DATA
   chunk, as long as the size of the final SCTP packet does not exceed
   the current MTU.  See Section 6.2.

When about to send a DATA chunk, the code now checks to see if the SACK
timer is running.  If it is, we know we have a SACK to send to the
peer, so we append the SACK (assuming available space in the packet)
and turn off the timer.  For a simple request-response scenario, this
will result in the SACK being bundled with the response, meaning the
the SACK is received quickly by the client, and also meaning that no
separate SACK packet needs to be sent by the server to acknowledge the
request.  Prior to this patch, a separate SACK packet would have been
sent by the server SCTP only after its delayed-ACK timer had expired
(usually 200ms).  This is wasteful of bandwidth, and can also have a
major negative impact on performance due the interaction of delayed ACKs
with the Nagle algorithm.

Signed-off-by: Doug Graham <dgraham@nortel.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:55 -04:00
Vlad Yasevich
40187886bc sctp: release cached route when the transport goes down.
When the sctp transport is marked down, we can release the
cached route and force a new lookup when attempting to use
this transport for anything.  This way, if a better route
or source address is available, we'll try to use it.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:55 -04:00
Wei Yongjun
3cd9749c0b sctp: update the route for non-active transports after addresses are added
Update the route and saddr entries for the non-active transports as some
of the added addresses can be used as better source addresses, or may
be there is a better route.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:55 -04:00
Wei Yongjun
44e65c1ef1 sctp: check the unrecognized ASCONF parameter before access it
This patch fix to check the unrecognized ASCONF parameter before
access it.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:54 -04:00
Wei Yongjun
425e0f6852 sctp: avoid overwrite the return value of sctp_process_asconf_ack()
The return value of sctp_process_asconf_ack() may be
overwritten while process parameters with no error.
This patch fixed the problem.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-09-04 18:20:54 -04:00
David S. Miller
aa11d958d1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	arch/microblaze/include/asm/socket.h
2009-08-12 17:44:53 -07:00
Rafael Laufer
418372b0ab sctp: fix missing destroy of percpu counter variable in sctp_proc_exit()
Commit 1748376b66,
	net: Use a percpu_counter for sockets_allocated

added percpu_counter function calls to sctp_proc_init code path, but
forgot to add them to sctp_proc_exit().  This resulted in a following
Ooops when performing this test
	# modprobe sctp
	# rmmod -f sctp
	# modprobe sctp

[  573.862512] BUG: unable to handle kernel paging request at f8214a24
[  573.862518] IP: [<c0308b8f>] __percpu_counter_init+0x3f/0x70
[  573.862530] *pde = 37010067 *pte = 00000000
[  573.862534] Oops: 0002 [#1] SMP
[  573.862537] last sysfs file: /sys/module/libcrc32c/initstate
[  573.862540] Modules linked in: sctp(+) crc32c libcrc32c binfmt_misc bridge
stp bnep lp snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep
snd_pcm_oss snd_mixer_oss arc4 joydev snd_pcm ecb pcmcia snd_seq_dummy
snd_seq_oss iwlagn iwlcore snd_seq_midi snd_rawmidi snd_seq_midi_event
yenta_socket rsrc_nonstatic thinkpad_acpi snd_seq snd_timer snd_seq_device
mac80211 psmouse sdhci_pci sdhci nvidia(P) ppdev video snd soundcore serio_raw
pcspkr iTCO_wdt iTCO_vendor_support led_class ricoh_mmc pcmcia_core intel_agp
nvram agpgart usbhid parport_pc parport output snd_page_alloc cfg80211 btusb
ohci1394 ieee1394 e1000e [last unloaded: sctp]
[  573.862589]
[  573.862593] Pid: 5373, comm: modprobe Tainted: P  R        (2.6.31-rc3 #6)
7663B15
[  573.862596] EIP: 0060:[<c0308b8f>] EFLAGS: 00010286 CPU: 1
[  573.862599] EIP is at __percpu_counter_init+0x3f/0x70
[  573.862602] EAX: f8214a20 EBX: f80faa14 ECX: c48c0000 EDX: f80faa20
[  573.862604] ESI: f80a7000 EDI: 00000000 EBP: f69d5ef0 ESP: f69d5eec
[  573.862606]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  573.862610] Process modprobe (pid: 5373, ti=f69d4000 task=c2130c70
task.ti=f69d4000)
[  573.862612] Stack:
[  573.862613]  00000000 f69d5f18 f80a70a8 f80fa9fc 00000000 fffffffc f69d5f30
c018e2d4
[  573.862619] <0> 00000000 f80a7000 00000000 f69d5f88 c010112b 00000000
c07029c0 fffffffb
[  573.862626] <0> 00000000 f69d5f38 c018f83f f69d5f54 c0557cad f80fa860
00000001 c07010c0
[  573.862634] Call Trace:
[  573.862644]  [<f80a70a8>] ? sctp_init+0xa8/0x7d4 [sctp]
[  573.862650]  [<c018e2d4>] ? marker_update_probe_range+0x184/0x260
[  573.862659]  [<f80a7000>] ? sctp_init+0x0/0x7d4 [sctp]
[  573.862662]  [<c010112b>] ? do_one_initcall+0x2b/0x160
[  573.862666]  [<c018f83f>] ? tracepoint_module_notify+0x2f/0x40
[  573.862671]  [<c0557cad>] ? notifier_call_chain+0x2d/0x70
[  573.862678]  [<c01588fd>] ? __blocking_notifier_call_chain+0x4d/0x60
[  573.862682]  [<c016b2f1>] ? sys_init_module+0xb1/0x1f0
[  573.862686]  [<c0102ffc>] ? sysenter_do_call+0x12/0x28
[  573.862688] Code: 89 48 08 b8 04 00 00 00 e8 df aa ec ff ba f4 ff ff ff 85
c0 89 43 14 74 31 b8 b0 18 71 c0 e8 19 b9 24 00 a1 c4 18 71 c0 8d 53 0c <89> 50
04 89 43 0c b8 b0 18 71 c0 c7 43 10 c4 18 71 c0 89 15 c4
[  573.862725] EIP: [<c0308b8f>] __percpu_counter_init+0x3f/0x70 SS:ESP
0068:f69d5eec
[  573.862730] CR2: 00000000f8214a24
[  573.862734] ---[ end trace 39c4e0b55e7cf54d ]---

Signed-off-by: Rafael Laufer <rlaufer@cisco.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-09 21:45:43 -07:00
Jan Engelhardt
36cbd3dcc1 net: mark read-only arrays as const
String literals are constant, and usually, we can also tag the array
of pointers const too, moving it to the .rodata section.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-05 10:42:58 -07:00
Wei Yongjun
1bc4ee4088 sctp: fix warning at inet_sock_destruct() while release sctp socket
Commit 'net: Move rx skb_orphan call to where needed' broken sctp protocol
with warning at inet_sock_destruct(). Actually, sctp can do this right with
sctp_sock_rfree_frag() and sctp_skb_set_owner_r_frag() pair.

    sctp_sock_rfree_frag(skb);
    sctp_skb_set_owner_r_frag(skb, newsk);

This patch not revert the commit d55d87fdff,
instead remove the sctp_sock_rfree_frag() function.

------------[ cut here ]------------
WARNING: at net/ipv4/af_inet.c:151 inet_sock_destruct+0xe0/0x142()
Modules linked in: sctp ipv6 dm_mirror dm_region_hash dm_log dm_multipath
scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Pid: 1808, comm: sctp_test Not tainted 2.6.31-rc2 #40
Call Trace:
 [<c042dd06>] warn_slowpath_common+0x6a/0x81
 [<c064a39a>] ? inet_sock_destruct+0xe0/0x142
 [<c042dd2f>] warn_slowpath_null+0x12/0x15
 [<c064a39a>] inet_sock_destruct+0xe0/0x142
 [<c05fde44>] __sk_free+0x19/0xcc
 [<c05fdf50>] sk_free+0x18/0x1a
 [<ca0d14ad>] sctp_close+0x192/0x1a1 [sctp]
 [<c0649f7f>] inet_release+0x47/0x4d
 [<c05fba4d>] sock_release+0x19/0x5e
 [<c05fbab3>] sock_close+0x21/0x25
 [<c049c31b>] __fput+0xde/0x189
 [<c049c3de>] fput+0x18/0x1a
 [<c049988f>] filp_close+0x56/0x60
 [<c042f422>] put_files_struct+0x5d/0xa1
 [<c042f49f>] exit_files+0x39/0x3d
 [<c043086a>] do_exit+0x1a5/0x5dd
 [<c04a86c2>] ? d_kill+0x35/0x3b
 [<c0438fa4>] ? dequeue_signal+0xa6/0x115
 [<c0430d05>] do_group_exit+0x63/0x8a
 [<c0439504>] get_signal_to_deliver+0x2e1/0x2f9
 [<c0401d9e>] do_notify_resume+0x7c/0x6b5
 [<c043f601>] ? autoremove_wake_function+0x0/0x34
 [<c04a864e>] ? __d_free+0x3d/0x40
 [<c04a867b>] ? d_free+0x2a/0x3c
 [<c049ba7e>] ? vfs_write+0x103/0x117
 [<c05fc8fa>] ? sys_socketcall+0x178/0x182
 [<c0402a56>] work_notifysig+0x13/0x19
---[ end trace 9db92c463e789fba ]---

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-06 12:47:08 -07:00
Wei Yongjun
ff0ac74afb sctp: xmit sctp packet always return no route error
Commit 'net: skb->dst accessors'(adf30907d6)
broken the sctp protocol stack, the sctp packet can never be sent out after
Eric Dumazet's patch, which have typo in the sctp code.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Vlad Yasevich <vladisalv.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-29 19:41:53 -07:00
Brian Haley
d5fdd6babc ipv6: Use correct data types for ICMPv6 type and code
Change all the code that deals directly with ICMPv6 type and code
values to use u8 instead of a signed int as that's the actual data
type.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-23 04:31:07 -07:00
Eric Dumazet
31e6d363ab net: correct off-by-one write allocations reports
commit 2b85a34e91
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.

We need to take into account this offset when reporting
sk_wmem_alloc to user, in PROC_FS files or various
ioctls (SIOCOUTQ/TIOCOUTQ)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-18 00:29:12 -07:00
Jesper Dangaard Brouer
eaa184a1a1 sctp: protocol.c call rcu_barrier() on unload.
On module unload call rcu_barrier(), this is needed as synchronize_rcu()
is not strong enough.  The kmem_cache_destroy() does invoke
synchronize_rcu() but it does not provide same protection.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-10 01:11:25 -07:00
David S. Miller
1b003be39e sctp: Use frag list abstraction interfaces.
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-09 00:24:07 -07:00
Vlad Yasevich
c6ba68a266 sctp: support non-blocking version of the new sctp_connectx() API
Prior implementation of the new sctp_connectx() call that returns
an association ID did not work correctly on non-blocking socket.
This is because we could not return both a EINPROGRESS error and
an association id.  This is a new implementation that supports this.

Originally from Ivan Skytte Jørgensen <isj-sctp@i1.dk

Signed-off-by: Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:47 -04:00
Wei Yongjun
9919b455fc sctp: fix to choose alternate destination when retransmit ASCONF chunk
RFC 5061 Section 5.1 ASCONF Chunk Procedures said:

B4)  Re-transmit the ASCONF Chunk last sent and if possible choose an
     alternate destination address (please refer to [RFC4960],
     Section 6.4.1).  An endpoint MUST NOT add new parameters to this
     chunk; it MUST be the same (including its Sequence Number) as
     the last ASCONF sent.  An endpoint MAY, however, bundle an
     additional ASCONF with new ASCONF parameters with the next
     Sequence Number.  For details, see Section 5.5.

This patch fix to choose an alternate destination address when
re-transmit the ASCONF chunk, with some dup codes cleanup.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:46 -04:00
Jean-Mickael Guerin
d48e074dfd sctp: fix sack_timeout sysctl min and max types
sctp_sack_timeout is defined as int, but the sysctl's maxsize is set
to sizeof(long) and the min/max are defined as long.

Signed-off-by: jean-mickael.guerin@6wind.com
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:46 -04:00
Wei Yongjun
10a43cea7d sctp: fix panic when T4-rto timer expire on removed transport
If T4-rto timer is expired on a removed transport, kernel panic
will occur when we do failure management on that transport.
You can reproduce this use the following sequence:

Endpoint A                           Endpoint B
(ESTABLISHED)                        (ESTABLISHED)

            <-----------------      ASCONF
                                    (SRC=X)
ASCONF        ----------------->
(Delete IP Address = X)
            <-----------------      ASCONF-ACK
                                    (Success Indication)
            <-----------------      ASCONF
                                    (T4-rto timer expire)

This patch fixed the problem.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:46 -04:00
Wei Yongjun
6345b19985 sctp: fix panic when T2-shutdown timer expire on removed transport
If T2-shutdown timer is expired on a removed transport, kernel
panic will occur when we do failure management on that transport.
You can reproduce this use the following sequence:

  Endpoint A                           Endpoint B
  (ESTABLISHED)                        (ESTABLISHED)

                <-----------------      SHUTDOWN
                                        (SRC=X)
  ASCONF        ----------------->
  (Delete IP Address = X)
                <-----------------      ASCONF-ACK
                                        (Success Indication)
                <-----------------      SHUTDOWN
                                        (T2-shutdown timer expire)
This patch fixed the problem.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:46 -04:00
Wei Yongjun
a2c395846c sctp: fix to only enable IPv6 address support on PF_INET6 socket
If socket is create by PF_INET type, it can not used IPv6 address
to send/recv DATA. So only enable IPv6 address support on PF_INET6
socket.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:46 -04:00
Wei Yongjun
4553e88d87 sctp: fix a typo in net/sctp/sm_statetable.c
Just fix a typo in net/sctp/sm_statetable.c.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:45 -04:00
Wei Yongjun
945e5abcee sctp: fix the error code when ASCONF is received with invalid address
Use Unresolvable Address error cause instead of Invalid Mandatory
Parameter error cause when process ASCONF chunk with invalid address
since address parameters are not mandatory in the ASCONF chunk.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:45 -04:00
Wei Yongjun
a987f762ca sctp: fix report unrecognized parameter in ACSONF-ACK
RFC5061 Section 5.2.  Upon Reception of an ASCONF Chunk

V2)  In processing the chunk, the receiver should build a
     response message with the appropriate error TLVs, as
     specified in the Parameter type bits, for any ASCONF
     Parameter it does not understand.  To indicate an
     unrecognized parameter, Cause Type 8 should be used as
     defined in the ERROR in Section 3.3.10.8, [RFC4960].  The
     endpoint may also use the response to carry rejections for
     other reasons, such as resource shortages, etc., using the
     Error Cause TLV and an appropriate error condition.

So we should indicate an unrecognized parameter with error
SCTP_ERROR_UNKNOWN_PARAM in ACSONF-ACK chunk, not
SCTP_ERROR_INV_PARAM.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2009-06-03 09:14:45 -04:00
Eric Dumazet
adf30907d6 net: skb->dst accessors
Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-03 02:51:04 -07:00
Eric Dumazet
511c3f92ad net: skb->rtable accessor
Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb

Delete skb->rtable field

Setting rtable is not allowed, just set dst instead as rtable is an alias.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-03 02:51:02 -07:00
Jesse Brandeburg
8dc92f7e2e sctp: add feature bit for SCTP offload in hardware
this is the sctp code to enable hardware crc32c offload for
adapters that support it.

Originally by: Vlad Yasevich <vladislav.yasevich@hp.com>

modified by Jesse Brandeburg <jesse.brandeburg@intel.com>

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-28 01:53:14 -07:00
Alexey Dobriyan
99b7623380 proc 2/2: remove struct proc_dir_entry::owner
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.

So, let's nuke it.

Kudos to Jeff Layton for reminding about this, let's say, oversight.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:44 +04:00
Vlad Yasevich
8d2f9e8116 sctp: Clean up TEST_FRAME hacks.
Remove 2 TEST_FRAME hacks that are no longer needed.  These allowed
sctp regression tests to compile before, but are no longer needed.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 13:41:09 -07:00
David S. Miller
2b1c4354de Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/virtio_net.c
2009-03-20 02:27:41 -07:00
Al Viro
cb0dc77de0 net: fix sctp breakage
broken by commit 5e739d1752aca4e8f3e794d431503bfca3162df4; AFAICS should
be -stable fodder as well...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Aced-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-18 19:12:42 -07:00
malc
6fc791ee63 sctp: add Adaptation Layer Indication parameter only when it's set
RFC5061 states:

        Each adaptation layer that is defined that wishes
        to use this parameter MUST specify an adaptation code point in an
        appropriate RFC defining its use and meaning.

If the user has not set one - assume they don't want to sent the param
with a zero Adaptation Code Point.

Rationale - Currently the IANA defines zero as reserved - and
1 as the only valid value - so we consider zero to be unset - to save
adding a boolean to the socket structure.

Including this parameter unconditionally causes endpoints that do not
understand it to report errors unnecessarily.

Signed-off-by: Malcolm Lashley <mlashley@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:37:58 -07:00
Wei Yongjun
76595024ff sctp: fix to send FORWARD-TSN chunk only if peer has such capable
RFC3758 Section 3.3.1.  Sending Forward-TSN-Supported param in INIT

   Note that if the endpoint chooses NOT to include the parameter, then
   at no time during the life of the association can it send or process
   a FORWARD TSN.

If peer does not support PR-SCTP capable, don't send FORWARD-TSN chunk
to peer.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:37:58 -07:00
Wei Yongjun
5ffad5aceb sctp: fix to indicate ASCONF support in INIT-ACK only if peer has such capable
This patch fix to indicate ASCONF support in INIT-ACK only if peer has
such capable.

This patch also fix to calc the chunk size if peer has no FWD-TSN
capable.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:37:56 -07:00
Vlad Yasevich
5e8f3f703a sctp: simplify sctp listening code
sctp_inet_listen() call is split between UDP and TCP style.  Looking
at the code, the two functions are almost the same and can be
merged into a single helper.  This also fixes a bug that was
fixed in the UDP function, but missed in the TCP function.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:37:56 -07:00
David S. Miller
508827ff0a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/tokenring/tmspci.c
	drivers/net/ucc_geth_mii.c
2009-03-05 02:06:47 -08:00
Brian Haley
fb13d9f9e4 SCTP: change sctp_ctl_sock_init() to try IPv4 if IPv6 fails
Change sctp_ctl_sock_init() to try IPv4 if IPv6 socket registration
fails.  Required if the IPv6 module is loaded with "disable=1", else
SCTP will fail to load.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-04 03:20:26 -08:00