Commit Graph

2244 Commits

Author SHA1 Message Date
Petr Malat
8467dda0c2 sctp: Initialize daddr on peeled off socket
Function sctp_do_peeloff() wrongly initializes daddr of the original
socket instead of the peeled off socket, which makes getpeername()
return zeroes instead of the primary address. Initialize the new socket
instead.

Fixes: d570ee490f ("[SCTP]: Correctly set daddr for IPv6 sockets during peeloff")
Signed-off-by: Petr Malat <oss@malat.biz>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/20220409063611.673193-1-oss@malat.biz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 20:33:10 -07:00
Xin Long
e2d88f9ce6 sctp: use the correct skb for security_sctp_assoc_request
Yi Chen reported an unexpected sctp connection abort, and it occurred when
COOKIE_ECHO is bundled with DATA Fragment by SCTP HW GSO. As the IP header
is included in chunk->head_skb instead of chunk->skb, it failed to check
IP header version in security_sctp_assoc_request().

According to Ondrej, SELinux only looks at IP header (address and IPsec
options) and XFRM state data, and these are all included in head_skb for
SCTP HW GSO packets. So fix it by using head_skb when calling
security_sctp_assoc_request() in processing COOKIE_ECHO.

v1->v2:
  - As Ondrej noticed, chunk->head_skb should also be used for
    security_sctp_assoc_established() in sctp_sf_do_5_1E_ca().

Fixes: e215dab1c4 ("security: call security_sctp_assoc_request in sctp_sf_do_5_1D_ce")
Reported-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/71becb489e51284edf0c11fc15246f4ed4cef5b6.1649337862.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08 14:38:58 -07:00
Jamie Bainbridge
e3d37210df sctp: count singleton chunks in assoc user stats
Singleton chunks (INIT, HEARTBEAT PMTU probes, and SHUTDOWN-
COMPLETE) are not counted in SCTP_GET_ASOC_STATS "sas_octrlchunks"
counter available to the assoc owner.

These are all control chunks so they should be counted as such.

Add counting of singleton chunks so they are properly accounted for.

Fixes: 196d675934 ("sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/c9ba8785789880cf07923b8a5051e174442ea9ee.1649029663.git.jamie.bainbridge@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-05 09:51:12 +02:00
Linus Torvalds
c269497d24 selinux/stable-5.18 PR 20220321
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmI473AUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPaMBAAuxb+RBG0Wqlt0ktUYHF0ZxDVJOTK
 YGGmaDp657YJ349+c0U3mrhm7Wj8Mn7Eoz3tAYUWRQ5xPziJQRX7PfxFzT/qpPUz
 XYLRppwCWpLSB5NdpNzK3RdGNv+/9BzZ6gmjTj2wfsUCOA8cfpB1pYwyIWm6M9B+
 FXMTZ7WOqiuJ3wJa5nD1PPM1z+99nPkYiE6/iKsDidbQgSl8NX6mJY/yUsVxcZ6A
 c45n0Pf6Fj9w1XKdVDPfiRY4nekmPCwqbrn7QVtiuCYyC54JcZNmuCQnoN8dy5XY
 s/j2M2DBxT6M9rjOqQznL5jGdNKFCWydCAso06JO/13pfakvPpSS6v95Iltqkbtw
 1oHf3j5URIirAhyqcyPGoQz+g5c6krgx/Z2GOpvDs9r/AQ80GlpOBYhN3x61lVT5
 MLYq0ylV1Vfosnv7a6+AQZ9lJAkmIqws1WtG28adn7/zMPyD/hWwQ7736k/50CMl
 oC6zi3G6jCZueWdHZviqf96bjW20ZmNL2DQRy0n8ZSQQGgrsQnFgYMpXtB1Zv8+m
 XaDOPo20Ne68rzmTsEp2gVgcnXFc5/KQBDvaUta9etrbTEWqQqqTWiP8mA2QiGme
 JwKMgprV0uVDd6s9TC/O0as02xoKrWuGaL7czhlFxuL45k0nYDmk7ea/gz9MrcWV
 Y5pzAxs4LVMwVzs=
 =5E1v
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20220321' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "We've got a number of SELinux patches queued up, the highlights are:

   - Fixup the security_fs_context_parse_param() LSM hook so it executes
     all of the LSM hook implementations unless a serious error occurs.

     We also correct the SELinux hook implementation so that it returns
     zero on success.

   - In addition to a few SELinux mount option parsing fixes, we
     simplified the parsing by moving it earlier in the process.

     The logic was that it was unlikely an admin/user would use the new
     mount API and not have the policy loaded before passing the SELinux
     options.

   - Properly fixed the LSM/SELinux/SCTP hooks with the addition of the
     security_sctp_assoc_established() hook.

     This work was done in conjunction with the netdev folks and should
     complete the move of the SCTP labeling from the endpoints to the
     associations.

   - Fixed a variety of sparse warnings caused by changes in the "__rcu"
     markings of some core kernel structures.

   - Ensure we access the superblock's LSM security blob using the
     stacking-safe accessors.

   - Added the ability for the kernel to always allow FIOCLEX and
     FIONCLEX if the "ioctl_skip_cloexec" policy capability is
     specified.

   - Various constifications improvements, type casting improvements,
     additional return value checks, and dead code/parameter removal.

   - Documentation fixes"

* tag 'selinux-pr-20220321' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (23 commits)
  selinux: shorten the policy capability enum names
  docs: fix 'make htmldocs' warning in SCTP.rst
  selinux: allow FIOCLEX and FIONCLEX with policy capability
  selinux: use correct type for context length
  selinux: drop return statement at end of void functions
  security: implement sctp_assoc_established hook in selinux
  security: add sctp_assoc_established hook
  selinux: parse contexts for mount options early
  selinux: various sparse fixes
  selinux: try to use preparsed sid before calling parse_sid()
  selinux: Fix selinux_sb_mnt_opts_compat()
  LSM: general protection fault in legacy_parse_param
  selinux: fix a type cast problem in cred_init_security()
  selinux: drop unused macro
  selinux: simplify cred_init_security
  selinux: do not discard const qualifier in cast
  selinux: drop unused parameter of avtab_insert_node
  selinux: drop cast to same type
  selinux: enclose macro arguments in parenthesis
  selinux: declare name parameter of hash_eval const
  ...
2022-03-21 20:47:54 -07:00
Eric Dumazet
633593a808 sctp: fix kernel-infoleak for SCTP sockets
syzbot reported a kernel infoleak [1] of 4 bytes.

After analysis, it turned out r->idiag_expires is not initialized
if inet_sctp_diag_fill() calls inet_diag_msg_common_fill()

Make sure to clear idiag_timer/idiag_retrans/idiag_expires
and let inet_diag_msg_sctpasoc_fill() fill them again if needed.

[1]

BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline]
BUG: KMSAN: kernel-infoleak in copyout lib/iov_iter.c:154 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x6ef/0x25a0 lib/iov_iter.c:668
 instrument_copy_to_user include/linux/instrumented.h:121 [inline]
 copyout lib/iov_iter.c:154 [inline]
 _copy_to_iter+0x6ef/0x25a0 lib/iov_iter.c:668
 copy_to_iter include/linux/uio.h:162 [inline]
 simple_copy_to_iter+0xf3/0x140 net/core/datagram.c:519
 __skb_datagram_iter+0x2d5/0x11b0 net/core/datagram.c:425
 skb_copy_datagram_iter+0xdc/0x270 net/core/datagram.c:533
 skb_copy_datagram_msg include/linux/skbuff.h:3696 [inline]
 netlink_recvmsg+0x669/0x1c80 net/netlink/af_netlink.c:1977
 sock_recvmsg_nosec net/socket.c:948 [inline]
 sock_recvmsg net/socket.c:966 [inline]
 __sys_recvfrom+0x795/0xa10 net/socket.c:2097
 __do_sys_recvfrom net/socket.c:2115 [inline]
 __se_sys_recvfrom net/socket.c:2111 [inline]
 __x64_sys_recvfrom+0x19d/0x210 net/socket.c:2111
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Uninit was created at:
 slab_post_alloc_hook mm/slab.h:737 [inline]
 slab_alloc_node mm/slub.c:3247 [inline]
 __kmalloc_node_track_caller+0xe0c/0x1510 mm/slub.c:4975
 kmalloc_reserve net/core/skbuff.c:354 [inline]
 __alloc_skb+0x545/0xf90 net/core/skbuff.c:426
 alloc_skb include/linux/skbuff.h:1158 [inline]
 netlink_dump+0x3e5/0x16c0 net/netlink/af_netlink.c:2248
 __netlink_dump_start+0xcf8/0xe90 net/netlink/af_netlink.c:2373
 netlink_dump_start include/linux/netlink.h:254 [inline]
 inet_diag_handler_cmd+0x2e7/0x400 net/ipv4/inet_diag.c:1341
 sock_diag_rcv_msg+0x24a/0x620
 netlink_rcv_skb+0x40c/0x7e0 net/netlink/af_netlink.c:2494
 sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:277
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x1093/0x1360 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0x14d9/0x1720 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg net/socket.c:725 [inline]
 sock_write_iter+0x594/0x690 net/socket.c:1061
 do_iter_readv_writev+0xa7f/0xc70
 do_iter_write+0x52c/0x1500 fs/read_write.c:851
 vfs_writev fs/read_write.c:924 [inline]
 do_writev+0x645/0xe00 fs/read_write.c:967
 __do_sys_writev fs/read_write.c:1040 [inline]
 __se_sys_writev fs/read_write.c:1037 [inline]
 __x64_sys_writev+0xe5/0x120 fs/read_write.c:1037
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Bytes 68-71 of 2508 are uninitialized
Memory access of size 2508 starts at ffff888114f9b000
Data copied to user address 00007f7fe09ff2e0

CPU: 1 PID: 3478 Comm: syz-executor306 Not tainted 5.17.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 8f840e47f1 ("sctp: add the sctp_diag.c file")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20220310001145.297371-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 14:46:42 -08:00
Ondrej Mosnacek
5e50f5d4ff security: add sctp_assoc_established hook
security_sctp_assoc_established() is added to replace
security_inet_conn_established() called in
sctp_sf_do_5_1E_ca(), so that asoc can be accessed in security
subsystem and save the peer secid to asoc->peer_secid.

Fixes: 72e89f5008 ("security: Add support for SCTP security hooks")
Reported-by: Prashanth Prahlad <pprahlad@redhat.com>
Based-on-patch-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-02-15 15:03:38 -05:00
Jakub Kicinski
b9adba350a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05 14:36:10 -08:00
Xin Long
f9d31c4cf4 sctp: hold endpoint before calling cb in sctp_transport_lookup_process
The same fix in commit 5ec7d18d18 ("sctp: use call_rcu to free endpoint")
is also needed for dumping one asoc and sock after the lookup.

Fixes: 86fdb3448c ("sctp: ensure ep is not destroyed before doing the dump")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-02 12:46:41 +00:00
Jakub Kicinski
aec53e60e0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
  commit 077cdda764 ("net/mlx5e: TC, Fix memory leak with rules with internal port")
  commit 31108d142f ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'")
  commit 4390c6edc0 ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'")
  https://lore.kernel.org/all/20211229065352.30178-1-saeed@kernel.org/

net/smc/smc_wr.c
  commit 49dc9013e3 ("net/smc: Use the bitmap API when applicable")
  commit 349d43127d ("net/smc: fix kernel panic caused by race of smc_sock")
  bitmap_zero()/memset() is removed by the fix

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-30 12:12:12 -08:00
Xin Long
5ec7d18d18 sctp: use call_rcu to free endpoint
This patch is to delay the endpoint free by calling call_rcu() to fix
another use-after-free issue in sctp_sock_dump():

  BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20
  Call Trace:
    __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218
    lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
    __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
    _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
    spin_lock_bh include/linux/spinlock.h:334 [inline]
    __lock_sock+0x203/0x350 net/core/sock.c:2253
    lock_sock_nested+0xfe/0x120 net/core/sock.c:2774
    lock_sock include/net/sock.h:1492 [inline]
    sctp_sock_dump+0x122/0xb20 net/sctp/diag.c:324
    sctp_for_each_transport+0x2b5/0x370 net/sctp/socket.c:5091
    sctp_diag_dump+0x3ac/0x660 net/sctp/diag.c:527
    __inet_diag_dump+0xa8/0x140 net/ipv4/inet_diag.c:1049
    inet_diag_dump+0x9b/0x110 net/ipv4/inet_diag.c:1065
    netlink_dump+0x606/0x1080 net/netlink/af_netlink.c:2244
    __netlink_dump_start+0x59a/0x7c0 net/netlink/af_netlink.c:2352
    netlink_dump_start include/linux/netlink.h:216 [inline]
    inet_diag_handler_cmd+0x2ce/0x3f0 net/ipv4/inet_diag.c:1170
    __sock_diag_cmd net/core/sock_diag.c:232 [inline]
    sock_diag_rcv_msg+0x31d/0x410 net/core/sock_diag.c:263
    netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2477
    sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:274

This issue occurs when asoc is peeled off and the old sk is freed after
getting it by asoc->base.sk and before calling lock_sock(sk).

To prevent the sk free, as a holder of the sk, ep should be alive when
calling lock_sock(). This patch uses call_rcu() and moves sock_put and
ep free into sctp_endpoint_destroy_rcu(), so that it's safe to try to
hold the ep under rcu_read_lock in sctp_transport_traverse_process().

If sctp_endpoint_hold() returns true, it means this ep is still alive
and we have held it and can continue to dump it; If it returns false,
it means this ep is dead and can be freed after rcu_read_unlock, and
we should skip it.

In sctp_sock_dump(), after locking the sk, if this ep is different from
tsp->asoc->ep, it means during this dumping, this asoc was peeled off
before calling lock_sock(), and the sk should be skipped; If this ep is
the same with tsp->asoc->ep, it means no peeloff happens on this asoc,
and due to lock_sock, no peeloff will happen either until release_sock.

Note that delaying endpoint free won't delay the port release, as the
port release happens in sctp_endpoint_destroy() before calling call_rcu().
Also, freeing endpoint by call_rcu() makes it safe to access the sk by
asoc->base.sk in sctp_assocs_seq_show() and sctp_rcv().

Thanks Jones to bring this issue up.

v1->v2:
  - improve the changelog.
  - add kfree(ep) into sctp_endpoint_destroy_rcu(), as Jakub noticed.

Reported-by: syzbot+9276d76e83e3bcde6c99@syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones@linaro.org>
Fixes: d25adbeb0c ("sctp: fix an use-after-free issue in sctp_sock_dump")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-25 17:13:37 +00:00
Xin Long
3d3b2f57d4 sctp: move hlist_node and hashent out of sctp_ep_common
Struct sctp_ep_common is included in both asoc and ep, but hlist_node
and hashent are only needed by ep after asoc_hashtable was dropped by
Commit b5eff71283 ("sctp: drop the old assoc hashtable of sctp").

So it is better to move hlist_node and hashent from sctp_ep_common to
sctp_endpoint, and it saves some space for each asoc.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-23 11:15:30 +00:00
Xin Long
703319094c sctp: make the raise timer more simple and accurate
Currently, the probe timer is reused as the raise timer when PLPMTUD is in
the Search Complete state. raise_count was introduced to count how many
times the probe timer has timed out. When raise_count reaches to 30, the
raise timer handler will be triggered.

During the whole processing above, the timer keeps timing out every probe_
interval. It is a waste for the Search Complete state, as the raise timer
only needs to time out after 30 * probe_interval.

Since the raise timer and probe timer are never used at the same time, it
is no need to keep probe timer 'alive' in the Search Complete state. This
patch to introduce sctp_transport_reset_raise_timer() to start the timer
as the raise timer when entering the Search Complete state. When entering
the other states, sctp_transport_reset_probe_timer() will still be called
to reset the timer to the probe timer.

raise_count can be removed from sctp_transport as no need to count probe
timer timeout for raise timer timeout. last_rtx_chunks can be removed as
sctp_transport_reset_probe_timer() can be called in the place where asoc
rtx_data_chunks is changed.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/edb0e48988ea85997488478b705b11ddc1ba724a.1637781974.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 21:00:45 -08:00
Eric Dumazet
4b66d2161b net: annotate accesses to dev->gso_max_size
dev->gso_max_size is written under RTNL protection, or when the device is
not yet visible, but is read locklessly.

Add the READ_ONCE()/WRITE_ONCE() pairs, and use netif_set_gso_max_size()
where we can to better document what is going on.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:49:42 +00:00
Jakub Kicinski
50fc24944a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-18 13:13:16 -08:00
Eric Dumazet
b3cb764aa1 net: drop nopreempt requirement on sock_prot_inuse_add()
This is distracting really, let's make this simpler,
because many callers had to take care of this
by themselves, even if on x86 this adds more
code than really needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-16 13:20:45 +00:00
Paul Moore
32a370abf1 net,lsm,selinux: revert the security_sctp_assoc_established() hook
This patch reverts two prior patches, e7310c9402
("security: implement sctp_assoc_established hook in selinux") and
7c2ef0240e ("security: add sctp_assoc_established hook"), which
create the security_sctp_assoc_established() LSM hook and provide a
SELinux implementation.  Unfortunately these two patches were merged
without proper review (the Reviewed-by and Tested-by tags from
Richard Haines were for previous revisions of these patches that
were significantly different) and there are outstanding objections
from the SELinux maintainers regarding these patches.

Work is currently ongoing to correct the problems identified in the
reverted patches, as well as others that have come up during review,
but it is unclear at this point in time when that work will be ready
for inclusion in the mainline kernel.  In the interest of not keeping
objectionable code in the kernel for multiple weeks, and potentially
a kernel release, we are reverting the two problematic patches.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-11-12 12:07:02 -05:00
Linus Torvalds
f54ca91fe6 Networking fixes for 5.16-rc1, including fixes from bpf, can
and netfilter.
 
 Current release - regressions:
 
  - bpf: do not reject when the stack read size is different
    from the tracked scalar size
 
  - net: fix premature exit from NAPI state polling in napi_disable()
 
  - riscv, bpf: fix RV32 broken build, and silence RV64 warning
 
 Current release - new code bugs:
 
  - net: fix possible NULL deref in sock_reserve_memory
 
  - amt: fix error return code in amt_init(); fix stopping the workqueue
 
  - ax88796c: use the correct ioctl callback
 
 Previous releases - always broken:
 
  - bpf: stop caching subprog index in the bpf_pseudo_func insn
 
  - security: fixups for the security hooks in sctp
 
  - nfc: add necessary privilege flags in netlink layer, limit operations
    to admin only
 
  - vsock: prevent unnecessary refcnt inc for non-blocking connect
 
  - net/smc: fix sk_refcnt underflow on link down and fallback
 
  - nfnetlink_queue: fix OOB when mac header was cleared
 
  - can: j1939: ignore invalid messages per standard
 
  - bpf, sockmap:
    - fix race in ingress receive verdict with redirect to self
    - fix incorrect sk_skb data_end access when src_reg = dst_reg
    - strparser, and tls are reusing qdisc_skb_cb and colliding
 
  - ethtool: fix ethtool msg len calculation for pause stats
 
  - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries
    to access an unregistering real_dev
 
  - udp6: make encap_rcv() bump the v6 not v4 stats
 
  - drv: prestera: add explicit padding to fix m68k build
 
  - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
 
  - drv: mvpp2: fix wrong SerDes reconfiguration order
 
 Misc & small latecomers:
 
  - ipvs: auto-load ipvs on genl access
 
  - mctp: sanity check the struct sockaddr_mctp padding fields
 
  - libfs: support RENAME_EXCHANGE in simple_rename()
 
  - avoid double accounting for pure zerocopy skbs
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGNQdwACgkQMUZtbf5S
 IrsiMQ//f66lTJ8PJ5Qj70hX9dC897olx7uGHB9eiKoyOcJI459hFlfXwRU2T4Tf
 fPNwPNUQ9Mynw9tX/jWEi+7zd6r6TSHGXK49U9/rIbQ95QjKY4LHowIE63x+vPl2
 5Cpf+80zXC3DUX1fijgyG1ujnU3kBaqopTxDLmlsHw2PGkwT5Ox1DUwkhc370eEL
 xlpq3PYGWA8/AQNyhSVBkG/UmoLaq0jYNP5yVcOj4jGjgcgLe1SLrqczENr35QHZ
 cRkuBsFBMBZF7wSX2f9qQIB/+b1pcLlD9IO+K3S7Ruq+rUd7qfL/tmwNxEh0axYK
 AyIun1Bxcy7QJGjtpGAz+Ku7jS9T3HxzyxhqilQo3co8jAW0WJ1YwHl+XPgQXyjV
 DLG6Vxt4syiwsoSXGn8MQugs4nlBT+0qWl8YamIR+o7KkAYPc2QWkXlzEDfNeIW8
 JNCZA3sy7VGi1ytorZGx16sQsEWnyRG9a6/WV20Dr+HVs1SKPcFzIfG6mVngR07T
 mQMHnbAF6Z5d8VTcPQfMxd7UH48s1bHtk5lcSTa3j0Cw+GkA6ytTmjPdJ1qRcdkH
 dl9jAfADe4O6frG+9XH7FEFqhmkghVI7bOCA4ZOhClVaIcDGgEZc2y7sY9/oZ7P4
 KXBD2R5X1caCUM0UtzwL7/8ddOtPtHIrFnhY+7+I6ijt9qmI0BY=
 =Ttgq
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, can and netfilter.

  Current release - regressions:

   - bpf: do not reject when the stack read size is different from the
     tracked scalar size

   - net: fix premature exit from NAPI state polling in napi_disable()

   - riscv, bpf: fix RV32 broken build, and silence RV64 warning

  Current release - new code bugs:

   - net: fix possible NULL deref in sock_reserve_memory

   - amt: fix error return code in amt_init(); fix stopping the
     workqueue

   - ax88796c: use the correct ioctl callback

  Previous releases - always broken:

   - bpf: stop caching subprog index in the bpf_pseudo_func insn

   - security: fixups for the security hooks in sctp

   - nfc: add necessary privilege flags in netlink layer, limit
     operations to admin only

   - vsock: prevent unnecessary refcnt inc for non-blocking connect

   - net/smc: fix sk_refcnt underflow on link down and fallback

   - nfnetlink_queue: fix OOB when mac header was cleared

   - can: j1939: ignore invalid messages per standard

   - bpf, sockmap:
      - fix race in ingress receive verdict with redirect to self
      - fix incorrect sk_skb data_end access when src_reg = dst_reg
      - strparser, and tls are reusing qdisc_skb_cb and colliding

   - ethtool: fix ethtool msg len calculation for pause stats

   - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to
     access an unregistering real_dev

   - udp6: make encap_rcv() bump the v6 not v4 stats

   - drv: prestera: add explicit padding to fix m68k build

   - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge

   - drv: mvpp2: fix wrong SerDes reconfiguration order

  Misc & small latecomers:

   - ipvs: auto-load ipvs on genl access

   - mctp: sanity check the struct sockaddr_mctp padding fields

   - libfs: support RENAME_EXCHANGE in simple_rename()

   - avoid double accounting for pure zerocopy skbs"

* tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (123 commits)
  selftests/net: udpgso_bench_rx: fix port argument
  net: wwan: iosm: fix compilation warning
  cxgb4: fix eeprom len when diagnostics not implemented
  net: fix premature exit from NAPI state polling in napi_disable()
  net/smc: fix sk_refcnt underflow on linkdown and fallback
  net/mlx5: Lag, fix a potential Oops with mlx5_lag_create_definer()
  gve: fix unmatched u64_stats_update_end()
  net: ethernet: lantiq_etop: Fix compilation error
  selftests: forwarding: Fix packet matching in mirroring selftests
  vsock: prevent unnecessary refcnt inc for nonblocking connect
  net: marvell: mvpp2: Fix wrong SerDes reconfiguration order
  net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory
  net: stmmac: allow a tc-taprio base-time of zero
  selftests: net: test_vxlan_under_vrf: fix HV connectivity test
  net: hns3: allow configure ETS bandwidth of all TCs
  net: hns3: remove check VF uc mac exist when set by PF
  net: hns3: fix some mac statistics is always 0 in device version V2
  net: hns3: fix kernel crash when unload VF while it is being reset
  net: hns3: sync rx ring head in echo common pull
  net: hns3: fix pfc packet number incorrect after querying pfc parameters
  ...
2021-11-11 09:49:36 -08:00
Alexey Khoroshilov
e7ea51cd87 sctp: remove unreachable code from sctp_sf_violation_chunk()
sctp_sf_violation_chunk() is not called with asoc argument equal to NULL,
but if that happens it would lead to NULL pointer dereference
in sctp_vtag_verify().

The patch removes code that handles NULL asoc in sctp_sf_violation_chunk().

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Proposed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-07 19:32:39 +00:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
Mianhan Liu
a1554c0026 include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h
nr_free_buffer_pages could be exposed through mm.h instead of swap.h.
The advantage of this change is that it can reduce the obsolete
includes.  For example, net/ipv4/tcp.c wouldn't need swap.h any more
since it has already included mm.h.  Similarly, after checking all the
other files, it comes that tcp.c, udp.c meter.c ,...  follow the same
rule, so these files can have swap.h removed too.

Moreover, after preprocessing all the files that use
nr_free_buffer_pages, it turns out that those files have already
included mm.h.Thus, we can move nr_free_buffer_pages from swap.h to mm.h
safely.  This change will not affect the compilation of other files.

Link: https://lkml.kernel.org/r/20210912133640.1624-1-liumh1@shanghaitech.edu.cn
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
CC: Ulf Hansson <ulf.hansson@linaro.org>
Cc: "David S . Miller" <davem@davemloft.net>
Cc: Simon Horman <horms@verge.net.au>
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:43 -07:00
Xin Long
7c2ef0240e security: add sctp_assoc_established hook
security_sctp_assoc_established() is added to replace
security_inet_conn_established() called in
sctp_sf_do_5_1E_ca(), so that asoc can be accessed in security
subsystem and save the peer secid to asoc->peer_secid.

v1->v2:
  - fix the return value of security_sctp_assoc_established() in
    security.h, found by kernel test robot and Ondrej.

Fixes: 72e89f5008 ("security: Add support for SCTP security hooks")
Reported-by: Prashanth Prahlad <pprahlad@redhat.com>
Reviewed-by: Richard Haines <richard_c_haines@btinternet.com>
Tested-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:09:20 +00:00
Xin Long
e215dab1c4 security: call security_sctp_assoc_request in sctp_sf_do_5_1D_ce
The asoc created when receives the INIT chunk is a temporary one, it
will be deleted after INIT_ACK chunk is replied. So for the real asoc
created in sctp_sf_do_5_1D_ce() when the COOKIE_ECHO chunk is received,
security_sctp_assoc_request() should also be called.

v1->v2:
  - fix some typo and grammar errors, noticed by Ondrej.

Fixes: 72e89f5008 ("security: Add support for SCTP security hooks")
Reported-by: Prashanth Prahlad <pprahlad@redhat.com>
Reviewed-by: Richard Haines <richard_c_haines@btinternet.com>
Tested-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:09:20 +00:00
Xin Long
c081d53f97 security: pass asoc to sctp_assoc_request and sctp_sk_clone
This patch is to move secid and peer_secid from endpoint to association,
and pass asoc to sctp_assoc_request and sctp_sk_clone instead of ep. As
ep is the local endpoint and asoc represents a connection, and in SCTP
one sk/ep could have multiple asoc/connection, saving secid/peer_secid
for new asoc will overwrite the old asoc's.

Note that since asoc can be passed as NULL, security_sctp_assoc_request()
is moved to the place right after the new_asoc is created in
sctp_sf_do_5_1B_init() and sctp_sf_do_unexpected_init().

v1->v2:
  - fix the description of selinux_netlbl_skbuff_setsid(), as Jakub noticed.
  - fix the annotation in selinux_sctp_assoc_request(), as Richard Noticed.

Fixes: 72e89f5008 ("security: Add support for SCTP security hooks")
Reported-by: Prashanth Prahlad <pprahlad@redhat.com>
Reviewed-by: Richard Haines <richard_c_haines@btinternet.com>
Tested-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-03 11:09:20 +00:00
Xin Long
75cf662c64 sctp: return true only for pathmtu update in sctp_transport_pl_toobig
sctp_transport_pl_toobig() supposes to return true only if there's
pathmtu update, so that in sctp_icmp_frag_needed() it would call
sctp_assoc_sync_pmtu() and sctp_retransmit(). This patch is to fix
these return places in sctp_transport_pl_toobig().

Fixes: 8369640831 ("sctp: do state transition when receiving an icmp TOOBIG packet")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 12:21:23 +01:00
Xin Long
40171248bb sctp: allow IP fragmentation when PLPMTUD enters Error state
Currently when PLPMTUD enters Error state, transport pathmtu will be set
to MIN_PLPMTU(512) while probe is continuing with BASE_PLPMTU(1200). It
will cause pathmtu to stay in a very small value, even if the real pmtu
is some value like 1000.

RFC8899 doesn't clearly say how to set the value in Error state. But one
possibility could be keep using BASE_PLPMTU for the real pmtu, but allow
to do IP fragmentation when it's in Error state.

As it says in rfc8899#section-5.4:

   Some paths could be unable to sustain packets of the BASE_PLPMTU
   size.  The Error State could be implemented to provide robustness to
   such paths.  This allows fallback to a smaller than desired PLPMTU
   rather than suffer connectivity failure.  This could utilize methods
   such as endpoint IP fragmentation to enable the PL sender to
   communicate using packets smaller than the BASE_PLPMTU.

This patch is to set pmtu to BASE_PLPMTU instead of MIN_PLPMTU for Error
state in sctp_transport_pl_send/toobig(), and set packet ipfragok for
non-probe packets when it's in Error state.

Fixes: 1dc68c1945 ("sctp: do state transition when PROBE_COUNT == MAX_PROBES on HB send path")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 12:21:23 +01:00
Xin Long
9d02831e51 sctp: add vtag check in sctp_sf_ootb
sctp_sf_ootb() is called when processing DATA chunk in closed state,
and many other places are also using it.

The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.

When fails to verify the vtag from the chunk, this patch sets asoc
to NULL, so that the abort will be made with the vtag from the
received chunk later.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:45 -07:00
Xin Long
ef16b1734f sctp: add vtag check in sctp_sf_do_8_5_1_E_sa
sctp_sf_do_8_5_1_E_sa() is called when processing SHUTDOWN_ACK chunk
in cookie_wait and cookie_echoed state.

The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.

Note that when fails to verify the vtag from SHUTDOWN-ACK chunk,
SHUTDOWN COMPLETE message will still be sent back to peer, but
with the vtag from SHUTDOWN-ACK chunk, as said in 5) of
rfc4960#section-8.4.

While at it, also remove the unnecessary chunk length check from
sctp_sf_shut_8_4_5(), as it's already done in both places where
it calls sctp_sf_shut_8_4_5().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:44 -07:00
Xin Long
aa0f697e45 sctp: add vtag check in sctp_sf_violation
sctp_sf_violation() is called when processing HEARTBEAT_ACK chunk
in cookie_wait state, and some other places are also using it.

The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:44 -07:00
Xin Long
a64b341b86 sctp: fix the processing for COOKIE_ECHO chunk
1. In closed state: in sctp_sf_do_5_1D_ce():

  When asoc is NULL, making packet for abort will use chunk's vtag
  in sctp_ootb_pkt_new(). But when asoc exists, vtag from the chunk
  should be verified before using peer.i.init_tag to make packet
  for abort in sctp_ootb_pkt_new(), and just discard it if vtag is
  not correct.

2. In the other states: in sctp_sf_do_5_2_4_dupcook():

  asoc always exists, but duplicate cookie_echo's vtag will be
  handled by sctp_tietags_compare() and then take actions, so before
  that we only verify the vtag for the abort sent for invalid chunk
  length.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:44 -07:00
Xin Long
438b95a7c9 sctp: fix the processing for INIT_ACK chunk
Currently INIT_ACK chunk in non-cookie_echoed state is processed in
sctp_sf_discard_chunk() to send an abort with the existent asoc's
vtag if the chunk length is not valid. But the vtag in the chunk's
sctphdr is not verified, which may be exploited by one to cook a
malicious chunk to terminal a SCTP asoc.

sctp_sf_discard_chunk() also is called in many other places to send
an abort, and most of those have this problem. This patch is to fix
it by sending abort with the existent asoc's vtag only if the vtag
from the chunk's sctphdr is verified in sctp_sf_discard_chunk().

Note on sctp_sf_do_9_1_abort() and sctp_sf_shutdown_pending_abort(),
the chunk length has been verified before sctp_sf_discard_chunk(),
so replace it with sctp_sf_discard(). On sctp_sf_do_asconf_ack() and
sctp_sf_do_asconf(), move the sctp_chunk_length_valid check ahead of
sctp_sf_discard_chunk(), then replace it with sctp_sf_discard().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:44 -07:00
Xin Long
eae5783908 sctp: fix the processing for INIT chunk
This patch fixes the problems below:

1. In non-shutdown_ack_sent states: in sctp_sf_do_5_1B_init() and
   sctp_sf_do_5_2_2_dupinit():

  chunk length check should be done before any checks that may cause
  to send abort, as making packet for abort will access the init_tag
  from init_hdr in sctp_ootb_pkt_new().

2. In shutdown_ack_sent state: in sctp_sf_do_9_2_reshutack():

  The same checks as does in sctp_sf_do_5_2_2_dupinit() is needed
  for sctp_sf_do_9_2_reshutack().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:43 -07:00
Xin Long
4f7019c7eb sctp: use init_tag from inithdr for ABORT chunk
Currently Linux SCTP uses the verification tag of the existing SCTP
asoc when failing to process and sending the packet with the ABORT
chunk. This will result in the peer accepting the ABORT chunk and
removing the SCTP asoc. One could exploit this to terminate a SCTP
asoc.

This patch is to fix it by always using the initiate tag of the
received INIT chunk for the ABORT chunk to be sent.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 12:36:43 -07:00
Eiichi Tsukata
a2d859e3fc sctp: account stream padding length for reconf chunk
sctp_make_strreset_req() makes repeated calls to sctp_addto_chunk()
which will automatically account for padding on each call. inreq and
outreq are already 4 bytes aligned, but the payload is not and doing
SCTP_PAD4(a + b) (which _sctp_make_chunk() did implicitly here) is
different from SCTP_PAD4(a) + SCTP_PAD4(b) and not enough. It led to
possible attempt to use more buffer than it was allocated and triggered
a BUG_ON.

Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Fixes: cc16f00f65 ("sctp: add support for generating stream reconf ssn reset request chunk")
Reported-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/b97c1f8b0c7ff79ac4ed206fc2c49d3612e0850c.1634156849.git.mleitner@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-14 07:15:22 -07:00
Xin Long
f7e745f8e9 sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
We should always check if skb_header_pointer's return is NULL before
using it, otherwise it may cause null-ptr-deref, as syzbot reported:

  KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
  RIP: 0010:sctp_rcv_ootb net/sctp/input.c:705 [inline]
  RIP: 0010:sctp_rcv+0x1d84/0x3220 net/sctp/input.c:196
  Call Trace:
  <IRQ>
   sctp6_rcv+0x38/0x60 net/sctp/ipv6.c:1109
   ip6_protocol_deliver_rcu+0x2e9/0x1ca0 net/ipv6/ip6_input.c:422
   ip6_input_finish+0x62/0x170 net/ipv6/ip6_input.c:463
   NF_HOOK include/linux/netfilter.h:307 [inline]
   NF_HOOK include/linux/netfilter.h:301 [inline]
   ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:472
   dst_input include/net/dst.h:460 [inline]
   ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
   NF_HOOK include/linux/netfilter.h:307 [inline]
   NF_HOOK include/linux/netfilter.h:301 [inline]
   ipv6_rcv+0x28c/0x3c0 net/ipv6/ip6_input.c:297

Fixes: 3acb50c18d ("sctp: delay as much as possible skb_linearize")
Reported-by: syzbot+581aff2ae6b860625116@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:36:19 +01:00
Xin Long
ae954bbc45 sctp: move the active_key update after sh_keys is added
In commit 58acd10092 ("sctp: update active_key for asoc when old key is
being replaced"), sctp_auth_asoc_init_active_key() is called to update
the active_key right after the old key is deleted and before the new key
is added, and it caused that the active_key could be found with the key_id.

In Ying Xu's testing, the BUG_ON in sctp_auth_asoc_init_active_key() was
triggered:

  [ ] kernel BUG at net/sctp/auth.c:416!
  [ ] RIP: 0010:sctp_auth_asoc_init_active_key.part.8+0xe7/0xf0 [sctp]
  [ ] Call Trace:
  [ ]  sctp_auth_set_key+0x16d/0x1b0 [sctp]
  [ ]  sctp_setsockopt.part.33+0x1ba9/0x2bd0 [sctp]
  [ ]  __sys_setsockopt+0xd6/0x1d0
  [ ]  __x64_sys_setsockopt+0x20/0x30
  [ ]  do_syscall_64+0x5b/0x1a0

So fix it by moving the active_key update after sh_keys is added.

Fixes: 58acd10092 ("sctp: update active_key for asoc when old key is being replaced")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-03 11:43:43 +01:00
Marcelo Ricardo Leitner
557fb5862c sctp: fix return value check in __sctp_rcv_asconf_lookup
As Ben Hutchings noticed, this check should have been inverted: the call
returns true in case of success.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 0c5dc070ff ("sctp: validate from_addr_param return")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-28 09:25:18 +01:00
Chen Shen
2ebda02714 sctp: delete addr based on sin6_scope_id
sctp_inet6addr_event deletes 'addr' from 'local_addr_list' when setting
netdev down, but it is possible to delete the incorrect entry (match
the first one with the same ipaddr, but the different 'ifindex'), if
there are some netdevs with the same 'local-link' ipaddr added already.
It should delete the entry depending on 'sin6_addr' and 'sin6_scope_id'
both. otherwise, the endpoint will call 'sctp_sf_ootb' if it can't find
the according association when receives 'heartbeat', and finally will
reply 'abort'.

For example:
1.when linux startup
the entries in local_addr_list:
ifindex:35 addr:fe80::40:43ff:fe80:0 (eths0.201)
ifindex:36 addr:fe80::40:43ff:fe80:0 (eths0.209)
ifindex:37 addr:fe80::40:43ff:fe80:0 (eths0.210)

the route table:
local fe80::40:43ff:fe80:0 dev eths0.201
local fe80::40:43ff:fe80:0 dev eths0.209
local fe80::40:43ff:fe80:0 dev eths0.210

2.after 'ifconfig eths0.209 down'
the entries in local_addr_list:
ifindex:36 addr:fe80::40:43ff:fe80:0 (eths0.209)
ifindex:37 addr:fe80::40:43ff:fe80:0 (eths0.210)

the route table:
local fe80::40:43ff:fe80:0 dev eths0.201
local fe80::40:43ff:fe80:0 dev eths0.210

3.asoc not found for src:[fe80::40:43ff:fe80:0]:37381 dst:[:1]:53335
::1->fe80::40:43ff:fe80:0 HEARTBEAT
fe80::40:43ff:fe80:0->::1 ABORT

Signed-off-by: Chen Shen <peterchenshen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 12:34:03 +01:00
Xin Long
eacf078cf4 sctp: send pmtu probe only if packet loss in Search Complete state
This patch is to introduce last_rtx_chunks into sctp_transport to detect
if there's any packet retransmission/loss happened by checking against
asoc's rtx_data_chunks in sctp_transport_pl_send().

If there is, namely, transport->last_rtx_chunks != asoc->rtx_data_chunks,
the pmtu probe will be sent out. Otherwise, increment the pl.raise_count
and return when it's in Search Complete state.

With this patch, if in Search Complete state, which is a long period, it
doesn't need to keep probing the current pmtu unless there's data packet
loss. This will save quite some traffic.

v1->v2:
  - add the missing Fixes tag.

Fixes: 0dac127c05 ("sctp: do black hole detection in search complete state")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25 23:06:02 +01:00
Xin Long
058e6e0ed0 sctp: improve the code for pmtu probe send and recv update
This patch does 3 things:

  - make sctp_transport_pl_send() and sctp_transport_pl_recv()
    return bool type to decide if more probe is needed to send.

  - pr_debug() only when probe is really needed to send.

  - count pl.raise_count in sctp_transport_pl_send() instead of
    sctp_transport_pl_recv(), and it's only incremented for the
    1st probe for the same size.

These are preparations for the next patch to make probes happen
only when there's packet loss in Search Complete state.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25 23:06:02 +01:00
Xin Long
02dc2ee7c7 sctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set
Currently, in sctp_packet_config(), sctp_transport_pmtu_check() is
called to update transport pathmtu with dst's mtu when dst's mtu
has been changed by non sctp stack like xfrm.

However, this should only happen when SPP_PMTUD_ENABLE is set, no
matter where dst's mtu changed. This patch is to fix by checking
SPP_PMTUD_ENABLE flag before calling sctp_transport_pmtu_check().

Thanks Jacek for reporting and looking into this issue.

v1->v2:
  - add the missing "{" to fix the build error.

Fixes: 69fec325a6 ('Revert "sctp: remove sctp_transport_pmtu_check"')
Reported-by: Jacek Szafraniec <jacek.szafraniec@nokia.com>
Tested-by: Jacek Szafraniec <jacek.szafraniec@nokia.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-21 14:17:58 -07:00
Xin Long
58acd10092 sctp: update active_key for asoc when old key is being replaced
syzbot reported a call trace:

  BUG: KASAN: use-after-free in sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112
  Call Trace:
   sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112
   sctp_set_owner_w net/sctp/socket.c:131 [inline]
   sctp_sendmsg_to_asoc+0x152e/0x2180 net/sctp/socket.c:1865
   sctp_sendmsg+0x103b/0x1d30 net/sctp/socket.c:2027
   inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:821
   sock_sendmsg_nosec net/socket.c:703 [inline]
   sock_sendmsg+0xcf/0x120 net/socket.c:723

This is an use-after-free issue caused by not updating asoc->shkey after
it was replaced in the key list asoc->endpoint_shared_keys, and the old
key was freed.

This patch is to fix by also updating active_key for asoc when old key is
being replaced with a new one. Note that this issue doesn't exist in
sctp_auth_del_key_id(), as it's not allowed to delete the active_key
from the asoc.

Fixes: 1b1e0bc994 ("sctp: add refcnt support for sh_key")
Reported-by: syzbot+b774577370208727d12b@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-21 08:44:44 -07:00
Xin Long
2f3fdd8d48 sctp: trim optlen when it's a huge value in sctp_setsockopt
After commit ca84bd058d ("sctp: copy the optval from user space in
sctp_setsockopt"), it does memory allocation in sctp_setsockopt with
the optlen, and it would fail the allocation and return error if the
optlen from user space is a huge value.

This breaks some sockopts, like SCTP_HMAC_IDENT, SCTP_RESET_STREAMS and
SCTP_AUTH_KEY, as when processing these sockopts before, optlen would
be trimmed to a biggest value it needs when optlen is a huge value,
instead of failing the allocation and returning error.

This patch is to fix the allocation failure when it's a huge optlen from
user space by trimming it to the biggest size sctp sockopt may need when
necessary, and this biggest size is from sctp_setsockopt_reset_streams()
for SCTP_RESET_STREAMS, which is bigger than those for SCTP_HMAC_IDENT
and SCTP_AUTH_KEY.

Fixes: ca84bd058d ("sctp: copy the optval from user space in sctp_setsockopt")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-18 09:47:29 -07:00
Yajun Deng
01757f536a net: Use nlmsg_unicast() instead of netlink_unicast()
It has 'if (err >0 )' statement in nlmsg_unicast(), so use nlmsg_unicast()
instead of netlink_unicast(), this looks more concise.

v2: remove the change in netfilter.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 09:28:29 -07:00
Xin Long
1d11fa231c sctp: move 198 addresses from unusable to private scope
The doc draft-stewart-tsvwg-sctp-ipv4-00 that restricts 198 addresses
was never published. These addresses as private addresses should be
allowed to use in SCTP.

As Michael Tuexen suggested, this patch is to move 198 addresses from
unusable to private scope.

Reported-by: Sérgio <surkamp@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:47:13 -07:00
Xin Long
650b2a846d sctp: check pl.raise_count separately from its increment
As Marcelo's suggestion this will make code more clear to read.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:46:44 -07:00
Dan Carpenter
a34dcbfa14 sctp: prevent info leak in sctp_make_heartbeat()
The "hbinfo" struct has a 4 byte hole at the end so we have to zero it
out to prevent stack information from being disclosed.

Fixes: fe59379b9a ("sctp: do the basic send and recv for PLPMTUD probe")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 11:16:38 -07:00
Linus Torvalds
dbe69e4337 Networking changes for 5.14.
Core:
 
  - BPF:
    - add syscall program type and libbpf support for generating
      instructions and bindings for in-kernel BPF loaders (BPF loaders
      for BPF), this is a stepping stone for signed BPF programs
    - infrastructure to migrate TCP child sockets from one listener
      to another in the same reuseport group/map to improve flexibility
      of service hand-off/restart
    - add broadcast support to XDP redirect
 
  - allow bypass of the lockless qdisc to improving performance
    (for pktgen: +23% with one thread, +44% with 2 threads)
 
  - add a simpler version of "DO_ONCE()" which does not require
    jump labels, intended for slow-path usage
 
  - virtio/vsock: introduce SOCK_SEQPACKET support
 
  - add getsocketopt to retrieve netns cookie
 
  - ip: treat lowest address of a IPv4 subnet as ordinary unicast address
        allowing reclaiming of precious IPv4 addresses
 
  - ipv6: use prandom_u32() for ID generation
 
  - ip: add support for more flexible field selection for hashing
        across multi-path routes (w/ offload to mlxsw)
 
  - icmp: add support for extended RFC 8335 PROBE (ping)
 
  - seg6: add support for SRv6 End.DT46 behavior
 
  - mptcp:
     - DSS checksum support (RFC 8684) to detect middlebox meddling
     - support Connection-time 'C' flag
     - time stamping support
 
  - sctp: packetization Layer Path MTU Discovery (RFC 8899)
 
  - xfrm: speed up state addition with seq set
 
  - WiFi:
     - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
     - aggregation handling improvements for some drivers
     - minstrel improvements for no-ack frames
     - deferred rate control for TXQs to improve reaction times
     - switch from round robin to virtual time-based airtime scheduler
 
  - add trace points:
     - tcp checksum errors
     - openvswitch - action execution, upcalls
     - socket errors via sk_error_report
 
 Device APIs:
 
  - devlink: add rate API for hierarchical control of max egress rate
             of virtual devices (VFs, SFs etc.)
 
  - don't require RCU read lock to be held around BPF hooks
    in NAPI context
 
  - page_pool: generic buffer recycling
 
 New hardware/drivers:
 
  - mobile:
     - iosm: PCIe Driver for Intel M.2 Modem
     - support for Qualcomm MSM8998 (ipa)
 
  - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
 
  - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
 
  - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
 
  - NXP SJA1110 Automotive Ethernet 10-port switch
 
  - Qualcomm QCA8327 switch support (qca8k)
 
  - Mikrotik 10/25G NIC (atl1c)
 
 Driver changes:
 
  - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP
    (our first foray into MAC/PHY description via ACPI)
 
  - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
 
  - Mellanox/Nvidia NIC (mlx5)
    - NIC VF offload of L2 bridging
    - support IRQ distribution to Sub-functions
 
  - Marvell (prestera):
     - add flower and match all
     - devlink trap
     - link aggregation
 
  - Netronome (nfp): connection tracking offload
 
  - Intel 1GE (igc): add AF_XDP support
 
  - Marvell DPU (octeontx2): ingress ratelimit offload
 
  - Google vNIC (gve): new ring/descriptor format support
 
  - Qualcomm mobile (rmnet & ipa): inline checksum offload support
 
  - MediaTek WiFi (mt76)
     - mt7915 MSI support
     - mt7915 Tx status reporting
     - mt7915 thermal sensors support
     - mt7921 decapsulation offload
     - mt7921 enable runtime pm and deep sleep
 
  - Realtek WiFi (rtw88)
     - beacon filter support
     - Tx antenna path diversity support
     - firmware crash information via devcoredump
 
  - Qualcomm 60GHz WiFi (wcn36xx)
     - Wake-on-WLAN support with magic packets and GTK rekeying
 
  - Micrel PHY (ksz886x/ksz8081): add cable test support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDb+fUACgkQMUZtbf5S
 Irs2Jg//aqN0Q8CgIvYCVhPxQw1tY7pTAbgyqgBZ01vwjyvtIOgJiWzSfFEU84mX
 M8fcpFX5eTKrOyJ9S6UFfQ/JG114n3hjAxFFT4Hxk2gC1Tg0vHuFQTDHcUl28bUE
 mTm61e1YpdorILnv2k5JVQ/wu0vs5QKDrjcYcrcPnh+j93wvnPOgAfDBV95nZzjS
 OTt4q2fR8GzLcSYWWsclMbDNkzyTG50RW/0Yd6aGjr5QGvXfrMeXfUJNz533PMf/
 w5lNyjRKv+x9mdTZJzU0+msNUrZgUdRz7W8Ey8lD3hJZRE+D6/uU7FtsE8Mi3+uc
 HWxeZUyzA3YF1MfVl/eesbxyPT7S/OkLzk4O5B35FbqP0YltaP+bOjq1/nM3ce1/
 io9Dx9pIl/2JANUgRCAtLi8Z2dkvRoqTaBxZ/nPudCCljFwDwl6joTMJ7Ow22i5Y
 5aIkcXFmZq4LbJDiHvbTlqT7yiuaEvu2UK/23bSIg/K3nF4eAmkY9Y1EgiMf60OF
 78Ttw0wk2tUegwaS5MZnCniKBKDyl9gM2F6rbZ/IxQRR2LTXFc1B6gC+ynUxgXfh
 Ub8O++6qGYGYZ0XvQH4pzco79p3qQWBTK5beIp2eu6BOAjBVIXq4AibUfoQLACsu
 hX7jMPYd0kc3WFgUnKgQP8EnjFSwbf4XiaE7fIXvWBY8hzCw2h4=
 =LvtX
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - BPF:
      - add syscall program type and libbpf support for generating
        instructions and bindings for in-kernel BPF loaders (BPF loaders
        for BPF), this is a stepping stone for signed BPF programs
      - infrastructure to migrate TCP child sockets from one listener to
        another in the same reuseport group/map to improve flexibility
        of service hand-off/restart
      - add broadcast support to XDP redirect

   - allow bypass of the lockless qdisc to improving performance (for
     pktgen: +23% with one thread, +44% with 2 threads)

   - add a simpler version of "DO_ONCE()" which does not require jump
     labels, intended for slow-path usage

   - virtio/vsock: introduce SOCK_SEQPACKET support

   - add getsocketopt to retrieve netns cookie

   - ip: treat lowest address of a IPv4 subnet as ordinary unicast
     address allowing reclaiming of precious IPv4 addresses

   - ipv6: use prandom_u32() for ID generation

   - ip: add support for more flexible field selection for hashing
     across multi-path routes (w/ offload to mlxsw)

   - icmp: add support for extended RFC 8335 PROBE (ping)

   - seg6: add support for SRv6 End.DT46 behavior

   - mptcp:
      - DSS checksum support (RFC 8684) to detect middlebox meddling
      - support Connection-time 'C' flag
      - time stamping support

   - sctp: packetization Layer Path MTU Discovery (RFC 8899)

   - xfrm: speed up state addition with seq set

   - WiFi:
      - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
      - aggregation handling improvements for some drivers
      - minstrel improvements for no-ack frames
      - deferred rate control for TXQs to improve reaction times
      - switch from round robin to virtual time-based airtime scheduler

   - add trace points:
      - tcp checksum errors
      - openvswitch - action execution, upcalls
      - socket errors via sk_error_report

  Device APIs:

   - devlink: add rate API for hierarchical control of max egress rate
     of virtual devices (VFs, SFs etc.)

   - don't require RCU read lock to be held around BPF hooks in NAPI
     context

   - page_pool: generic buffer recycling

  New hardware/drivers:

   - mobile:
      - iosm: PCIe Driver for Intel M.2 Modem
      - support for Qualcomm MSM8998 (ipa)

   - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices

   - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches

   - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)

   - NXP SJA1110 Automotive Ethernet 10-port switch

   - Qualcomm QCA8327 switch support (qca8k)

   - Mikrotik 10/25G NIC (atl1c)

  Driver changes:

   - ACPI support for some MDIO, MAC and PHY devices from Marvell and
     NXP (our first foray into MAC/PHY description via ACPI)

   - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx

   - Mellanox/Nvidia NIC (mlx5)
      - NIC VF offload of L2 bridging
      - support IRQ distribution to Sub-functions

   - Marvell (prestera):
      - add flower and match all
      - devlink trap
      - link aggregation

   - Netronome (nfp): connection tracking offload

   - Intel 1GE (igc): add AF_XDP support

   - Marvell DPU (octeontx2): ingress ratelimit offload

   - Google vNIC (gve): new ring/descriptor format support

   - Qualcomm mobile (rmnet & ipa): inline checksum offload support

   - MediaTek WiFi (mt76)
      - mt7915 MSI support
      - mt7915 Tx status reporting
      - mt7915 thermal sensors support
      - mt7921 decapsulation offload
      - mt7921 enable runtime pm and deep sleep

   - Realtek WiFi (rtw88)
      - beacon filter support
      - Tx antenna path diversity support
      - firmware crash information via devcoredump

   - Qualcomm WiFi (wcn36xx)
      - Wake-on-WLAN support with magic packets and GTK rekeying

   - Micrel PHY (ksz886x/ksz8081): add cable test support"

* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
  tcp: change ICSK_CA_PRIV_SIZE definition
  tcp_yeah: check struct yeah size at compile time
  gve: DQO: Fix off by one in gve_rx_dqo()
  stmmac: intel: set PCI_D3hot in suspend
  stmmac: intel: Enable PHY WOL option in EHL
  net: stmmac: option to enable PHY WOL with PMT enabled
  net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
  net: use netdev_info in ndo_dflt_fdb_{add,del}
  ptp: Set lookup cookie when creating a PTP PPS source.
  net: sock: add trace for socket errors
  net: sock: introduce sk_error_report
  net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
  net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
  net: dsa: include fdb entries pointing to bridge in the host fdb list
  net: dsa: include bridge addresses which are local in the host fdb list
  net: dsa: sync static FDB entries on foreign interfaces to hardware
  net: dsa: install the host MDB and FDB entries in the master's RX filter
  net: dsa: reference count the FDB addresses at the cross-chip notifier level
  net: dsa: introduce a separate cross-chip notifier type for host FDBs
  net: dsa: reference count the MDB entries at the cross-chip notifier level
  ...
2021-06-30 15:51:09 -07:00
Jakub Kicinski
b6df00789e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in net/netfilter/nf_tables_api.c.

Duplicate fix in tools/testing/selftests/net/devlink_port_split.py
- take the net-next version.

skmsg, and L4 bpf - keep the bpf code but remove the flags
and err params.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-29 15:45:27 -07:00
Alexander Aring
e3ae2365ef net: sock: introduce sk_error_report
This patch introduces a function wrapper to call the sk_error_report
callback. That will prepare to add additional handling whenever
sk_error_report is called, for example to trace socket errors.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29 11:28:21 -07:00
Linus Torvalds
8ec035ac4a fallthrough fixes for Clang for 5.14-rc1
Hi Linus,
 
 Please, pull the following patches that fix many fall-through warnings
 when building with Clang 12.0.0 and this[1] change reverted. Notice
 that in order to enable -Wimplicit-fallthrough for Clang, such change[1]
 is meant to be reverted at some point. So, these patches help to move
 in that direction.
 
 Thanks!
 
 [1] commit e2079e93f5 ("kbuild: Do not enable -Wimplicit-fallthrough for clang for now")
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAmDaNe8ACgkQRwW0y0cG
 2zFfGA/9G1A/Hrf261/P9olyYe2TRBwLnO1tUDREm3qtJ2JdKpf+7EM3VDm+Ue/A
 qhNmwp5G7nmp7Nqq8MfbdFjeo/rPS67voXiOfO8b0pU+E4XlOc+B1BXL0BWtnP7b
 xvuauklQU6dmCp2u44vsxdBIO6ooR0uQh+7/+1la+mPyEk9mlooQ4lyFcpfA53yt
 zxEGrx0tZBrDXghEI1CkHxOaJaX3qhw4EUYvxe8n2L7Dgx+o2djL/G4/SRYH/xoq
 MZa8TLyCuR3J0Ph4TfDONhMmf8ZLn+j70xBhewcVfZ1JfvGSVw4DQNN44KZCDnrK
 tGsBo5VFksjbmX83LmT8UlqB1rTP4nVQtRmtOPvbQA9kd19yy+Y64Y58FcGU2FHl
 PWt3rQJ1JzBo3TtzQoz7HSJCt9QTil4U7hFbNtcp5BbWQfUPkRgpWcL3FOchZbZ6
 FnLMqHanw2lrKMzZEoyHvg6G7BT67k3rrFgtd/xGSn8ohtfKXaZBYa9PKrQ0LwuG
 o8tQtIX1owj4rbdI1t6Ob4X/tT6Y7DzH8nsF+TsJQ4XeSCD2rURUcYltBMIlEr16
 DFj7iWKIrrX80/JRsBXu7a9h8nn5YptxV12SGRq/Cu/2jfRwjDye4IzsCyqMf67n
 oEN6YC1XYaEUmKXTnI8Z0CxY0qwSTcNjeH5Ci9jWepinsqD3Jxw=
 =Kt2q
 -----END PGP SIGNATURE-----

Merge tag 'fallthrough-fixes-clang-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull fallthrough fixes from Gustavo Silva:
 "Fix many fall-through warnings when building with Clang 12.0.0 and
  '-Wimplicit-fallthrough' so that we at some point will be able to
  enable that warning by default"

* tag 'fallthrough-fixes-clang-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (26 commits)
  rxrpc: Fix fall-through warnings for Clang
  drm/nouveau/clk: Fix fall-through warnings for Clang
  drm/nouveau/therm: Fix fall-through warnings for Clang
  drm/nouveau: Fix fall-through warnings for Clang
  xfs: Fix fall-through warnings for Clang
  xfrm: Fix fall-through warnings for Clang
  tipc: Fix fall-through warnings for Clang
  sctp: Fix fall-through warnings for Clang
  rds: Fix fall-through warnings for Clang
  net/packet: Fix fall-through warnings for Clang
  net: netrom: Fix fall-through warnings for Clang
  ide: Fix fall-through warnings for Clang
  hwmon: (max6621) Fix fall-through warnings for Clang
  hwmon: (corsair-cpro) Fix fall-through warnings for Clang
  firewire: core: Fix fall-through warnings for Clang
  braille_console: Fix fall-through warnings for Clang
  ipv4: Fix fall-through warnings for Clang
  qlcnic: Fix fall-through warnings for Clang
  bnxt_en: Fix fall-through warnings for Clang
  netxen_nic: Fix fall-through warnings for Clang
  ...
2021-06-28 20:03:38 -07:00
Marcelo Ricardo Leitner
ef6c8d6ccf sctp: add param size validation for SCTP_PARAM_SET_PRIMARY
When SCTP handles an INIT chunk, it calls for example:
sctp_sf_do_5_1B_init
  sctp_verify_init
    sctp_verify_param
  sctp_process_init
    sctp_process_param
      handling of SCTP_PARAM_SET_PRIMARY

sctp_verify_init() wasn't doing proper size validation and neither the
later handling, allowing it to work over the chunk itself, possibly being
uninitialized memory.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28 15:34:50 -07:00
Marcelo Ricardo Leitner
b6ffe7671b sctp: validate chunk size in __rcv_asconf_lookup
In one of the fallbacks that SCTP has for identifying an association for an
incoming packet, it looks for AddIp chunk (from ASCONF) and take a peek.
Thing is, at this stage nothing was validating that the chunk actually had
enough content for that, allowing the peek to happen over uninitialized
memory.

Similar check already exists in actual asconf handling in
sctp_verify_asconf().

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28 15:34:50 -07:00
Marcelo Ricardo Leitner
50619dbf8d sctp: add size validation when walking chunks
The first chunk in a packet is ensured to be present at the beginning of
sctp_rcv(), as a packet needs to have at least 1 chunk. But the second
one, may not be completely available and ch->length can be over
uninitialized memory.

Fix here is by only trying to walk on the next chunk if there is enough to
hold at least the header, and then proceed with the ch->length validation
that is already there.

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28 15:34:50 -07:00
Marcelo Ricardo Leitner
0c5dc070ff sctp: validate from_addr_param return
Ilja reported that, simply putting it, nothing was validating that
from_addr_param functions were operating on initialized memory. That is,
the parameter itself was being validated by sctp_walk_params, but it
doesn't check for types and their specific sizes and it could be a 0-length
one, causing from_addr_param to potentially work over the next parameter or
even uninitialized memory.

The fix here is to, in all calls to from_addr_param, check if enough space
is there for the wanted IP address type.

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28 15:34:50 -07:00
Xin Long
fea1d5b17f sctp: send the next probe immediately once the last one is acked
These is no need to wait for 'interval' period for the next probe
if the last probe is already acked in search state. The 'interval'
period waiting should be only for probe failure timeout and the
current pmtu check when it's in search complete state.

This change will shorten the probe time a lot in search state, and
also fix the document accordingly.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:58:03 -07:00
Xin Long
0dac127c05 sctp: do black hole detection in search complete state
Currently the PLPMUTD probe will stop for a long period (interval * 30)
after it enters search complete state. If there's a pmtu change on the
route path, it takes a long time to be aware if the ICMP TooBig packet
is lost or filtered.

As it says in rfc8899#section-4.3:

  "A DPLPMTUD method MUST NOT rely solely on this method."
  (ICMP PTB message).

This patch is to enable the other method for search complete state:

  "A PL can use the DPLPMTUD probing mechanism to periodically
   generate probe packets of the size of the current PLPMTU."

With this patch, the probe will continue with the current pmtu every
'interval' until the PMTU_RAISE_TIMER 'timeout', which we implement
by adding raise_count to raise the probe size when it counts to 30
and removing the SCTP_PL_COMPLETE check for PMTU_RAISE_TIMER.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:58:03 -07:00
Xin Long
9e47df005c sctp: process sctp over udp icmp err on sctp side
Previously, sctp over udp was using udp tunnel's icmp err process, which
only does sk lookup on sctp side. However for sctp's icmp error process,
there are more things to do, like syncing assoc pmtu/retransmit packets
for toobig type err, and starting proto_unreach_timer for unreach type
err etc.

Now after adding PLPMTUD, which also requires to process toobig type err
on sctp side. This patch is to process icmp err on sctp side by parsing
the type/code/info in .encap_err_lookup and call sctp's icmp processing
functions. Note as the 'redirect' err process needs to know the outer
ip(v6) header's, we have to leave it to udp(v6)_err to handle it.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
d83060759a sctp: extract sctp_v4_err_handle function from sctp_v4_err
This patch is to extract sctp_v4_err_handle() from sctp_v4_err() to
only handle the icmp err after the sock lookup, and it also makes
the code clearer.

sctp_v4_err_handle() will be used in sctp over udp's err handling
in the following patch.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
f6549bd37b sctp: extract sctp_v6_err_handle function from sctp_v6_err
This patch is to extract sctp_v6_err_handle() from sctp_v6_err() to
only handle the icmp err after the sock lookup, and it also makes
the code clearer.

sctp_v6_err_handle() will be used in sctp over udp's err handling
in the following patch.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
237a6a2e31 sctp: remove the unessessary hold for idev in sctp_v6_err
Same as in tcp_v6_err() and __udp6_lib_err(), there's no need to
hold idev in sctp_v6_err(), so just call __in6_dev_get() instead.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
7307e4fa4d sctp: enable PLPMTUD when the transport is ready
sctp_transport_pl_reset() is called whenever any of these 3 members in
transport is changed:

  - probe_interval
  - param_flags & SPP_PMTUD_ENABLE
  - state == ACTIVE

If all are true, start the PLPMTUD when it's not yet started. If any of
these is false, stop the PLPMTUD when it's already running.

sctp_transport_pl_update() is called when the transport dst has changed.
It will restart the PLPMTUD probe. Again, the pathmtu won't change but
use the dst's mtu until the Search phase is done.

Note that after using PLPMTUD, the pathmtu is only initialized with the
dst mtu when the transport dst changes. At other time it is updated by
pl.pmtu. So sctp_transport_pmtu_check() will be called only when PLPMTUD
is disabled in sctp_packet_config().

After this patch, the PLPMTUD feature from RFC8899 will be activated
and can be used by users.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
8369640831 sctp: do state transition when receiving an icmp TOOBIG packet
PLPMTUD will short-circuit the old process for icmp TOOBIG packets.
This part is described in rfc8899#section-4.6.2 (PL_PTB_SIZE =
PTB_SIZE - other_headers_len). Note that from rfc8899#section-5.2
State Machine, each case below is for some specific states only:

  a) PL_PTB_SIZE < MIN_PLPMTU || PL_PTB_SIZE >= PROBED_SIZE,
     discard it, for any state

  b) MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU,
     Base -> Error, for Base state

  c) BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU,
     Search -> Base or Complete -> Base, for Search and Complete states.

  d) PLPMTU < PL_PTB_SIZE < PROBED_SIZE,
     set pl.probe_size to PL_PTB_SIZE then verify it, for Search state.

The most important one is case d), which will help find the optimal
fast during searching. Like when pathmtu = 1392 for SCTP over IPv4,
the search will be (20 is iphdr_len):

  1. probe with 1200 - 20
  2. probe with 1232 - 20
  3. probe with 1264 - 20
  ...
  7. probe with 1388 - 20
  8. probe with 1420 - 20

When sending the probe with 1420 - 20, TOOBIG may come with PL_PTB_SIZE =
1392 - 20. Then it matches case d), and saves some rounds to try with the
1392 - 20 probe. But of course, PLPMTUD doesn't trust TOOBIG packets, and
it will go back to the common searching once the probe with the new size
can't be verified.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
b87641aff9 sctp: do state transition when a probe succeeds on HB ACK recv path
As described in rfc8899#section-5.2, when a probe succeeds, there might
be the following state transitions:

  - Base -> Search, occurs when probe succeeds with BASE_PLPMTU,
    pl.pmtu is not changing,
    pl.probe_size increases by SCTP_PL_BIG_STEP,

  - Error -> Search, occurs when probe succeeds with BASE_PLPMTU,
    pl.pmtu is changed from SCTP_MIN_PLPMTU to SCTP_BASE_PLPMTU,
    pl.probe_size increases by SCTP_PL_BIG_STEP.

  - Search -> Search Complete, occurs when probe succeeds with the probe
    size SCTP_MAX_PLPMTU less than pl.probe_high,
    pl.pmtu is not changing, but update *pathmtu* with it,
    pl.probe_size is set back to pl.pmtu to double check it.

  - Search Complete -> Search, occurs when probe succeeds with the probe
    size equal to pl.pmtu,
    pl.pmtu is not changing,
    pl.probe_size increases by SCTP_PL_MIN_STEP.

So search process can be described as:

 1. When it just enters 'Search' state, *pathmtu* is not updated with
    pl.pmtu, and probe_size increases by a big step (SCTP_PL_BIG_STEP)
    each round.

 2. Until pl.probe_high is set when a probe fails, and probe_size
    decreases back to pl.pmtu, as described in the last patch.

 3. When the probe with the new size succeeds, probe_size changes to
    increase by a small step (SCTP_PL_MIN_STEP) due to pl.probe_high
    is set.

 4. Until probe_size is next to pl.probe_high, the searching finishes and
    it goes to 'Complete' state and updates *pathmtu* with pl.pmtu, and
    then probe_size is set to pl.pmtu to confirm by once more probe.

 5. This probe occurs after "30 * probe_inteval", a much longer time than
    that in Search state. Once it is done it goes to 'Search' state again
    with probe_size increased by SCTP_PL_MIN_STEP.

As we can see above, during the searching, pl.pmtu changes while *pathmtu*
doesn't. *pathmtu* is only updated when the search finishes by which it
gets an optimal value for it. A big step is used at the beginning until
it gets close to the optimal value, then it changes to a small step until
it has this optimal value.

The small step is also used in 'Complete' until it goes to 'Search' state
again and the probe with 'pmtu + the small step' succeeds, which means a
higher size could be used. Then probe_size changes to increase by a big
step again until it gets close to the next optimal value.

Note that anytime when black hole is detected, it goes directly to 'Base'
state with pl.pmtu set to SCTP_BASE_PLPMTU, as described in the last patch.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
1dc68c1945 sctp: do state transition when PROBE_COUNT == MAX_PROBES on HB send path
The state transition is described in rfc8899#section-5.2,
PROBE_COUNT == MAX_PROBES means the probe fails for MAX times, and the
state transition includes:

  - Base -> Error, occurs when BASE_PLPMTU Confirmation Fails,
    pl.pmtu is set to SCTP_MIN_PLPMTU,
    probe_size is still SCTP_BASE_PLPMTU;

  - Search -> Base, occurs when Black Hole Detected,
    pl.pmtu is set to SCTP_BASE_PLPMTU,
    probe_size is set back to SCTP_BASE_PLPMTU;

  - Search Complete -> Base, occurs when Black Hole Detected
    pl.pmtu is set to SCTP_BASE_PLPMTU,
    probe_size is set back to SCTP_BASE_PLPMTU;

Note a black hole is encountered when a sender is unaware that packets
are not being delivered to the destination endpoint. So it includes the
probe failures with equal probe_size to pl.pmtu, and definitely not
include that with greater probe_size than pl.pmtu. The later one is the
normal probe failure where probe_size should decrease back to pl.pmtu
and pl.probe_high is set.  pl.probe_high would be used on HB ACK recv
path in the next patch.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
fe59379b9a sctp: do the basic send and recv for PLPMTUD probe
This patch does exactly what rfc8899#section-6.2.1.2 says:

   The SCTP sender needs to be able to determine the total size of a
   probe packet.  The HEARTBEAT chunk could carry a Heartbeat
   Information parameter that includes, besides the information
   suggested in [RFC4960], the probe size to help an implementation
   associate a HEARTBEAT ACK with the size of probe that was sent.  The
   sender could also use other methods, such as sending a nonce and
   verifying the information returned also contains the corresponding
   nonce.  The length of the PAD chunk is computed by reducing the
   probing size by the size of the SCTP common header and the HEARTBEAT
   chunk.

Note that HB ACK chunk will carry back whatever HB chunk carried, including
the probe_size we put it in; We also check hbinfo->probe_size in the HB ACK
against link->pl.probe_size to validate this HB ACK chunk.

v1->v2:
  - Remove the unused 'sp' and add static for sctp_packet_bundle_pad().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
92548ec2f1 sctp: add the probe timer in transport for PLPMTUD
There are 3 timers described in rfc8899#section-5.1.1:

  PROBE_TIMER, PMTU_RAISE_TIMER, CONFIRMATION_TIMER

This patches adds a 'probe_timer' in transport, and it works as either
PROBE_TIMER or PMTU_RAISE_TIMER. At most time, it works as PROBE_TIMER
and expires every a 'probe_interval' time to send the HB probe packet.
When transport pl enters COMPLETE state, it works as PMTU_RAISE_TIMER
and expires in 'probe_interval * 30' time to go back to SEARCH state
and do searching again.

SCTP HB is an acknowledged packet, CONFIRMATION_TIMER is not needed.

The timer will start when transport pl enters BASE state and stop
when it enters DISABLED state.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:52 -07:00
Xin Long
3190b649b4 sctp: add SCTP_PLPMTUD_PROBE_INTERVAL sockopt for sock/asoc/transport
With this socket option, users can change probe_interval for
a transport, asoc or sock after it's created.

Note that if the change is for an asoc, also apply the change
to each transport in this asoc.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:51 -07:00
Xin Long
d1e462a7a5 sctp: add probe_interval in sysctl and sock/asoc/transport
PLPMTUD can be enabled by doing 'sysctl -w net.sctp.probe_interval=n'.
'n' is the interval for PLPMTUD probe timer in milliseconds, and it
can't be less than 5000 if it's not 0.

All asoc/transport's PLPMTUD in a new socket will be enabled by default.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:51 -07:00
Xin Long
745a32117b sctp: add pad chunk and its make function and event table
This chunk is defined in rfc4820#section-3, and used to pad an
SCTP packet. The receiver must discard this chunk and continue
processing the rest of the chunks in the packet.

Add it now, as it will be bundled with a heartbeat chunk to probe
pmtu in the following patches.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22 11:28:51 -07:00
Zheng Yongjun
0c2c366e0e sctp: sm_statefuns: Fix spelling mistakes
Fix some spelling mistakes in comments:
genereate ==> generate
correclty ==> correctly
boundries ==> boundaries
failes ==> fails
isses ==> issues
assocition ==> association
signe ==> sign
assocaition ==> association
managemement ==> management
restransmissions ==> retransmission
sideffect ==> sideeffect
bomming ==> booming
chukns ==> chunks
SHUDOWN ==> SHUTDOWN
violationg ==> violating
explcitly ==> explicitly
CHunk ==> Chunk

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Link: https://lore.kernel.org/r/20210601020801.3625358-1-zhengyongjun3@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:45:15 -07:00
Jakub Kicinski
5ada57a9a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 09:55:10 -07:00
Xin Long
b2540cdce6 sctp: fix the proc_handler for sysctl encap_port
proc_dointvec() cannot do min and max check for setting a value
when extra1/extra2 is set, so change it to proc_dointvec_minmax()
for sysctl encap_port.

Fixes: e8a3001c21 ("sctp: add encap_port for netns sock asoc and transport")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:18:29 -07:00
Xin Long
297739bd73 sctp: add the missing setting for asoc encap_port
This patch is to add the missing setting back for asoc encap_port.

Fixes: 8dba29603b ("sctp: add SCTP_REMOTE_UDP_ENCAPS_PORT sockopt")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:17:26 -07:00
Gustavo A. R. Silva
0572b37b27 sctp: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple
of warnings by explicitly adding a break statement and replacing a
comment with a goto statement instead of letting the code fall through
to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-05-17 20:06:33 -05:00
Yejune Deng
5796254e46 net: Remove the member netns_ok
Every protocol has the 'netns_ok' member and it is euqal to 1. The
'if (!prot->netns_ok)' always false in inet_add_protocol().

Signed-off-by: Yejune Deng <yejunedeng@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-17 15:29:35 -07:00
Xin Long
34e5b01186 sctp: delay auto_asconf init until binding the first addr
As Or Cohen described:

  If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
  held and sp->do_auto_asconf is true, then an element is removed
  from the auto_asconf_splist without any proper locking.

  This can happen in the following functions:
  1. In sctp_accept, if sctp_sock_migrate fails.
  2. In inet_create or inet6_create, if there is a bpf program
     attached to BPF_CGROUP_INET_SOCK_CREATE which denies
     creation of the sctp socket.

This patch is to fix it by moving the auto_asconf init out of
sctp_init_sock(), by which inet_create()/inet6_create() won't
need to operate it in sctp_destroy_sock() when calling
sk_common_release().

It also makes more sense to do auto_asconf init while binding the
first addr, as auto_asconf actually requires an ANY addr bind,
see it in sctp_addr_wq_timeout_handler().

This addresses CVE-2021-23133.

Fixes: 6102365876 ("bpf: Add new cgroup attach type to enable sock modifications")
Reported-by: Or Cohen <orcohen@paloaltonetworks.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:36:21 -07:00
Xin Long
01bfe5e8e4 Revert "net/sctp: fix race condition in sctp_destroy_sock"
This reverts commit b166a20b07.

This one has to be reverted as it introduced a dead lock, as
syzbot reported:

       CPU0                    CPU1
       ----                    ----
  lock(&net->sctp.addr_wq_lock);
                               lock(slock-AF_INET6);
                               lock(&net->sctp.addr_wq_lock);
  lock(slock-AF_INET6);

CPU0 is the thread of sctp_addr_wq_timeout_handler(), and CPU1
is that of sctp_close().

The original issue this commit fixed will be fixed in the next
patch.

Reported-by: syzbot+959223586843e69a2674@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:36:21 -07:00
Xin Long
f282df0391 sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b
Normally SCTP_MIB_CURRESTAB is always incremented once asoc enter into
ESTABLISHED from the state < ESTABLISHED and decremented when the asoc
is being deleted.

However, in sctp_sf_do_dupcook_b(), the asoc's state can be changed to
ESTABLISHED from the state >= ESTABLISHED where it shouldn't increment
SCTP_MIB_CURRESTAB. Otherwise, one asoc may increment MIB_CURRESTAB
multiple times but only decrement once at the end.

I was able to reproduce it by using scapy to do the 4-way shakehands,
after that I replayed the COOKIE-ECHO chunk with 'peer_vtag' field
changed to different values, and SCTP_MIB_CURRESTAB was incremented
multiple times and never went back to 0 even when the asoc was freed.

This patch is to fix it by only incrementing SCTP_MIB_CURRESTAB when
the state < ESTABLISHED in sctp_sf_do_dupcook_b().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:32:12 -07:00
Xin Long
7aa4e54739 Revert "sctp: Fix SHUTDOWN CTSN Ack in the peer restart case"
This reverts commit 12dfd78e3a.

This can be reverted as shutdown and cookie_ack chunk are using the
same asoc since commit 35b4f24415 ("sctp: do asoc update earlier
in sctp_sf_do_dupcook_a").

Reported-by: Jere Leppänen <jere.leppanen@nokia.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:30:04 -07:00
Xin Long
22008f560b Revert "Revert "sctp: Fix bundling of SHUTDOWN with COOKIE-ACK""
This reverts commit 7e9269a5ac.

As Jere notice, commit 35b4f24415 ("sctp: do asoc update earlier
in sctp_sf_do_dupcook_a") only keeps the SHUTDOWN and COOKIE-ACK
with the same asoc, not transport. So we have to bring this patch
back.

Reported-by: Jere Leppänen <jere.leppanen@nokia.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:30:04 -07:00
Xin Long
51eac7f2f0 sctp: do asoc update earlier in sctp_sf_do_dupcook_b
The same thing should be done for sctp_sf_do_dupcook_b().
Meanwhile, SCTP_CMD_UPDATE_ASSOC cmd can be removed.

v1->v2:
  - Fix the return value in sctp_sf_do_assoc_update().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:06:34 -07:00
Xin Long
7e9269a5ac Revert "sctp: Fix bundling of SHUTDOWN with COOKIE-ACK"
This can be reverted as shutdown and cookie_ack chunk are using the
same asoc since the last patch.

This reverts commit 145cb2f717.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:06:34 -07:00
Xin Long
35b4f24415 sctp: do asoc update earlier in sctp_sf_do_dupcook_a
There's a panic that occurs in a few of envs, the call trace is as below:

  [] general protection fault, ... 0x29acd70f1000a: 0000 [#1] SMP PTI
  [] RIP: 0010:sctp_ulpevent_notify_peer_addr_change+0x4b/0x1fa [sctp]
  []  sctp_assoc_control_transport+0x1b9/0x210 [sctp]
  []  sctp_do_8_2_transport_strike.isra.16+0x15c/0x220 [sctp]
  []  sctp_cmd_interpreter.isra.21+0x1231/0x1a10 [sctp]
  []  sctp_do_sm+0xc3/0x2a0 [sctp]
  []  sctp_generate_timeout_event+0x81/0xf0 [sctp]

This is caused by a transport use-after-free issue. When processing a
duplicate COOKIE-ECHO chunk in sctp_sf_do_dupcook_a(), both COOKIE-ACK
and SHUTDOWN chunks are allocated with the transort from the new asoc.
However, later in the sideeffect machine, the old asoc is used to send
them out and old asoc's shutdown_last_sent_to is set to the transport
that SHUTDOWN chunk attached to in sctp_cmd_setup_t2(), which actually
belongs to the new asoc. After the new_asoc is freed and the old asoc
T2 timeout, the old asoc's shutdown_last_sent_to that is already freed
would be accessed in sctp_sf_t2_timer_expire().

Thanks Alexander and Jere for helping dig into this issue.

To fix it, this patch is to do the asoc update first, then allocate
the COOKIE-ACK and SHUTDOWN chunks with the 'updated' old asoc. This
would make more sense, as a chunk from an asoc shouldn't be sent out
with another asoc. We had fixed quite a few issues caused by this.

Fixes: 145cb2f717 ("sctp: Fix bundling of SHUTDOWN with COOKIE-ACK")
Reported-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Reported-by: syzbot+bbe538efd1046586f587@syzkaller.appspotmail.com
Reported-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:06:34 -07:00
Jakub Kicinski
8203c7ce4e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
 - keep the ZC code, drop the code related to reinit
net/bridge/netfilter/ebtables.c
 - fix build after move to net_generic

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-17 11:08:07 -07:00
Gustavo A. R. Silva
e5272ad4aa sctp: Fix out-of-bounds warning in sctp_process_asconf_param()
Fix the following out-of-bounds warning:

net/sctp/sm_make_chunk.c:3150:4: warning: 'memcpy' offset [17, 28] from the object at 'addr' is out of the bounds of referenced subobject 'v4' with type 'struct sockaddr_in' at offset 0 [-Warray-bounds]

This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().

Link: https://github.com/KSPP/linux/issues/109
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 16:55:15 -07:00
Or Cohen
b166a20b07 net/sctp: fix race condition in sctp_destroy_sock
If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
held and sp->do_auto_asconf is true, then an element is removed
from the auto_asconf_splist without any proper locking.

This can happen in the following functions:
1. In sctp_accept, if sctp_sock_migrate fails.
2. In inet_create or inet6_create, if there is a bpf program
   attached to BPF_CGROUP_INET_SOCK_CREATE which denies
   creation of the sctp socket.

The bug is fixed by acquiring addr_wq_lock in sctp_destroy_sock
instead of sctp_close.

This addresses CVE-2021-23133.

Reported-by: Or Cohen <orcohen@paloaltonetworks.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Fixes: 6102365876 ("bpf: Add new cgroup attach type to enable sock modifications")
Signed-off-by: Or Cohen <orcohen@paloaltonetworks.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-13 14:59:46 -07:00
Jakub Kicinski
8859a44ea0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

MAINTAINERS
 - keep Chandrasekar
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
 - simple fix + trust the code re-added to param.c in -next is fine
include/linux/bpf.h
 - trivial
include/linux/ethtool.h
 - trivial, fix kdoc while at it
include/linux/skmsg.h
 - move to relevant place in tcp.c, comment re-wrapped
net/core/skmsg.c
 - add the sk = sk // sk = NULL around calls
net/tipc/crypto.c
 - trivial

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-09 20:48:35 -07:00
Maciej Żenczykowski
630e4576f8 net-ipv6: bugfix - raw & sctp - switch to ipv6_can_nonlocal_bind()
Found by virtue of ipv6 raw sockets not honouring the per-socket
IP{,V6}_FREEBIND setting.

Based on hits found via:
  git grep '[.]ip_nonlocal_bind'
We fix both raw ipv6 sockets to honour IP{,V6}_FREEBIND and IP{,V6}_TRANSPARENT,
and we fix sctp sockets to honour IP{,V6}_TRANSPARENT (they already honoured
FREEBIND), and not just the ipv6 'ip_nonlocal_bind' sysctl.

The helper is defined as:
  static inline bool ipv6_can_nonlocal_bind(struct net *net, struct inet_sock *inet) {
    return net->ipv6.sysctl.ip_nonlocal_bind || inet->freebind || inet->transparent;
  }
so this change only widens the accepted opt-outs and is thus a clean bugfix.

I'm not entirely sure what 'fixes' tag to add, since this is AFAICT an ancient bug,
but IMHO this should be applied to stable kernels as far back as possible.
As such I'm adding a 'fixes' tag with the commit that originally added the helper,
which happened in 4.19.  Backporting to older LTS kernels (at least 4.9 and 4.14)
would presumably require open-coding it or backporting the helper as well.

Other possibly relevant commits:
  v4.18-rc6-1502-g83ba4645152d net: add helpers checking if socket can be bound to nonlocal address
  v4.18-rc6-1431-gd0c1f01138c4 net/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind
  v4.14-rc5-271-gb71d21c274ef sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND
  v4.7-rc7-1883-g9b9742022888 sctp: support ipv6 nonlocal bind
  v4.1-12247-g35a256fee52c ipv6: Nonlocal bind

Cc: Lorenzo Colitti <lorenzo@google.com>
Fixes: 83ba464515 ("net: add helpers checking if socket can be bound to nonlocal address")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-05 12:56:52 -07:00
Lu Wei
21c00a186f net: sctp: Fix some typos
Modify "unkown" to "unknown" in net/sctp/sm_make_chunk.c and
Modify "orginal" to "original" in net/sctp/socket.c.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-28 17:52:50 -07:00
Bhaskar Chowdhury
fb373c8455 sm_statefuns.c: Mundane spello fixes
s/simulataneous/simultaneous/    ....in three dirrent places.
s/tempory/temporary/
s/interpeter/interpreter/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-28 17:31:14 -07:00
Xin Long
8ff0b1f08e sctp: move sk_route_caps check and set into sctp_outq_flush_transports
The sk's sk_route_caps is set in sctp_packet_config, and later it
only needs to change when traversing the transport_list in a loop,
as the dst might be changed in the tx path.

So move sk_route_caps check and set into sctp_outq_flush_transports
from sctp_packet_transmit. This also fixes a dst leak reported by
Chen Yi:

  https://bugzilla.kernel.org/show_bug.cgi?id=212227

As calling sk_setup_caps() in sctp_packet_transmit may also set the
sk_route_caps for the ctrl sock in a netns. When the netns is being
deleted, the ctrl sock's releasing is later than dst dev's deleting,
which will cause this dev's deleting to hang and dmesg error occurs:

  unregister_netdevice: waiting for xxx to become free. Usage count = 1

Reported-by: Chen Yi <yiche@redhat.com>
Fixes: bcd623d8e9 ("sctp: call sk_setup_caps in sctp_packet_transmit instead")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-19 11:34:49 -07:00
Drew Fustini
d93ef30164 net: sctp: trivial: fix typo in comment
Fix typo of 'overflow' for comment in sctp_tsnmap_check().

Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-04 13:48:32 -08:00
David S. Miller
dc9d87581d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-10 13:30:12 -08:00
NeilBrown
af8085f3a4 net: fix iteration for sctp transport seq_files
The sctp transport seq_file iterators take a reference to the transport
in the ->start and ->next functions and releases the reference in the
->show function.  The preferred handling for such resources is to
release them in the subsequent ->next or ->stop function call.

Since Commit 1f4aace60b ("fs/seq_file.c: simplify seq_file iteration
code and interface") there is no guarantee that ->show will be called
after ->next, so this function can now leak references.

So move the sctp_transport_put() call to ->next and ->stop.

Fixes: 1f4aace60b ("fs/seq_file.c: simplify seq_file iteration code and interface")
Reported-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-08 10:15:49 -08:00
Xin Long
1fef8544bf sctp: remove the NETIF_F_SG flag before calling skb_segment
It makes more sense to clear NETIF_F_SG instead of set it when
calling skb_segment() in sctp_gso_segment(), since SCTP GSO is
using head_skb's fraglist, of which all frags are linear skbs.

This will make SCTP GSO code more understandable.

Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-16 19:05:59 -08:00
Christophe JAILLET
5112cf59d7 sctp: Fix some typo
s/tranport/transport/

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201122180704.1366636-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:44:11 -08:00
Jakub Kicinski
56495a2442 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 19:08:46 -08:00
Xin Long
057a10fa1f sctp: change to hold/put transport for proto_unreach_timer
A call trace was found in Hangbin's Codenomicon testing with debug kernel:

  [ 2615.981988] ODEBUG: free active (active state 0) object type: timer_list hint: sctp_generate_proto_unreach_event+0x0/0x3a0 [sctp]
  [ 2615.995050] WARNING: CPU: 17 PID: 0 at lib/debugobjects.c:328 debug_print_object+0x199/0x2b0
  [ 2616.095934] RIP: 0010:debug_print_object+0x199/0x2b0
  [ 2616.191533] Call Trace:
  [ 2616.194265]  <IRQ>
  [ 2616.202068]  debug_check_no_obj_freed+0x25e/0x3f0
  [ 2616.207336]  slab_free_freelist_hook+0xeb/0x140
  [ 2616.220971]  kfree+0xd6/0x2c0
  [ 2616.224293]  rcu_do_batch+0x3bd/0xc70
  [ 2616.243096]  rcu_core+0x8b9/0xd00
  [ 2616.256065]  __do_softirq+0x23d/0xacd
  [ 2616.260166]  irq_exit+0x236/0x2a0
  [ 2616.263879]  smp_apic_timer_interrupt+0x18d/0x620
  [ 2616.269138]  apic_timer_interrupt+0xf/0x20
  [ 2616.273711]  </IRQ>

This is because it holds asoc when transport->proto_unreach_timer starts
and puts asoc when the timer stops, and without holding transport the
transport could be freed when the timer is still running.

So fix it by holding/putting transport instead for proto_unreach_timer
in transport, just like other timers in transport.

v1->v2:
  - Also use sctp_transport_put() for the "out_unlock:" path in
    sctp_generate_proto_unreach_event(), as Marcelo noticed.

Fixes: 50b5d6ad63 ("sctp: Fix a race between ICMP protocol unreachable and connect()")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/102788809b554958b13b95d33440f5448113b8d6.1605331373.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 11:57:12 -08:00
Jakub Kicinski
ae0d0bb29b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-06 17:33:38 -08:00
Xin Long
0356010d82 sctp: bring inet(6)_skb_parm back to sctp_input_cb
inet(6)_skb_parm was removed from sctp_input_cb by Commit a1dd2cf2f1
("sctp: allow changing transport encap_port by peer packets"), as it
thought sctp_input_cb->header is not used any more in SCTP.

syzbot reported a crash:

  [ ] BUG: KASAN: use-after-free in decode_session6+0xe7c/0x1580
  [ ]
  [ ] Call Trace:
  [ ]  <IRQ>
  [ ]  dump_stack+0x107/0x163
  [ ]  kasan_report.cold+0x1f/0x37
  [ ]  decode_session6+0xe7c/0x1580
  [ ]  __xfrm_policy_check+0x2fa/0x2850
  [ ]  sctp_rcv+0x12b0/0x2e30
  [ ]  sctp6_rcv+0x22/0x40
  [ ]  ip6_protocol_deliver_rcu+0x2e8/0x1680
  [ ]  ip6_input_finish+0x7f/0x160
  [ ]  ip6_input+0x9c/0xd0
  [ ]  ipv6_rcv+0x28e/0x3c0

It was caused by sctp_input_cb->header/IP6CB(skb) still used in sctp rx
path decode_session6() but some members overwritten by sctp6_rcv().

This patch is to fix it by bring inet(6)_skb_parm back to sctp_input_cb
and not overwriting it in sctp4/6_rcv() and sctp_udp_rcv().

Reported-by: syzbot+5be8aebb1b7dfa90ef31@syzkaller.appspotmail.com
Fixes: a1dd2cf2f1 ("sctp: allow changing transport encap_port by peer packets")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/136c1a7a419341487c504be6d1996928d9d16e02.1604472932.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-05 14:27:30 -08:00