Commit Graph

1072629 Commits

Author SHA1 Message Date
Anshuman Khandual
708e8af492 arm64: errata: Add detection for TRBE trace data corruption
TRBE implementations affected by Arm erratum #1902691 might corrupt trace
data or deadlock, when it's being written into the memory. So effectively
TRBE is broken and hence cannot be used to capture trace data. This adds
a new errata ARM64_ERRATUM_1902691 in arm64 errata framework.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-doc@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1643120437-14352-5-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2022-01-27 12:01:53 -07:00
Anshuman Khandual
3bd94a8759 arm64: errata: Add detection for TRBE invalid prohibited states
TRBE implementations affected by Arm erratum #2038923 might get TRBE into
an inconsistent view on whether trace is prohibited within the CPU. As a
result, the trace buffer or trace buffer state might be corrupted. This
happens after TRBE buffer has been enabled by setting TRBLIMITR_EL1.E,
followed by just a single context synchronization event before execution
changes from a context, in which trace is prohibited to one where it isn't,
or vice versa. In these mentioned conditions, the view of whether trace is
prohibited is inconsistent between parts of the CPU, and the trace buffer
or the trace buffer state might be corrupted. This adds a new errata
ARM64_ERRATUM_2038923 in arm64 errata framework.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-doc@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1643120437-14352-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2022-01-27 12:01:53 -07:00
Anshuman Khandual
607a9afaae arm64: errata: Add detection for TRBE ignored system register writes
TRBE implementations affected by Arm erratum #2064142 might fail to write
into certain system registers after the TRBE has been disabled. Under some
conditions after TRBE has been disabled, writes into certain TRBE registers
TRBLIMITR_EL1, TRBPTR_EL1, TRBBASER_EL1, TRBSR_EL1 and TRBTRG_EL1 will be
ignored and not be effected. This adds a new errata ARM64_ERRATUM_2064142
in arm64 errata framework.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-doc@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1643120437-14352-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2022-01-27 12:01:53 -07:00
Anshuman Khandual
53960faf2b arm64: Add Cortex-A510 CPU part definition
Add the CPU Partnumbers for the new Arm designs.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1643120437-14352-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2022-01-27 12:01:53 -07:00
Linus Torvalds
23a46422c5 Networking fixes for 5.17-rc2, including fixes from netfilter and can.
Current release - new code bugs:
 
  - tcp: add a missing sk_defer_free_flush() in tcp_splice_read()
 
  - tcp: add a stub for sk_defer_free_flush(), fix CONFIG_INET=n
 
  - nf_tables: set last expression in register tracking area
 
  - nft_connlimit: fix memleak if nf_ct_netns_get() fails
 
  - mptcp: fix removing ids bitmap setting
 
  - bonding: use rcu_dereference_rtnl when getting active slave
 
  - fix three cases of sleep in atomic context in drivers: lan966x, gve
 
  - handful of build fixes for esoteric drivers after netdev->dev_addr
    was made const
 
 Previous releases - regressions:
 
  - revert "ipv6: Honor all IPv6 PIO Valid Lifetime values", it broke
    Linux compatibility with USGv6 tests
 
  - procfs: show net device bound packet types
 
  - ipv4: fix ip option filtering for locally generated fragments
 
  - phy: broadcom: hook up soft_reset for BCM54616S
 
 Previous releases - always broken:
 
  - ipv4: raw: lock the socket in raw_bind()
 
  - ipv4: decrease the use of shared IPID generator to decrease the
    chance of attackers guessing the values
 
  - procfs: fix cross-netns information leakage in /proc/net/ptype
 
  - ethtool: fix link extended state for big endian
 
  - bridge: vlan: fix single net device option dumping
 
  - ping: fix the sk_bound_dev_if match in ping_lookup
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmHy4mMACgkQMUZtbf5S
 IrvOZg/9HyOFAJrCYBlgA3zskHBdqYOGn9M3LCIevBcrCzQeigT+U1uWCINfBn+H
 DmsljeYKTicHZ38+HjdNXmzdnMqHtU+iJl4Ep1mcDNywygofW8JcS2Nf0n6Y+hK6
 nzyEa23DBt9UAiLmGXUTIoJwEhDRbuL/eH1/ZkkPLG7GtShtEDAKHg+dJBgHbYgJ
 0MQs3Q4s6AQ1PYOC0Z0zByhpSrAo2c4X/tr6g2ExNxU0vnydUbjIME0a5clFULr+
 ziVeOo4e83FINPaZiYAXEDbMGUC0z+rp1RoGsgRCdTnixi5BclkmEeGRaChYJHTZ
 T7tfIC2H0vZHu5/pAXFqwEHiRbminLv9jLkvA1/J67jbnpoNWTLD2jkuIWFlaY/Z
 xDm7LnVBB1CdLmXYo2ItSC/8ws9GANpJOq/vFvm+uOYZNKUVctfQ5viA3+hOSULC
 6BJHC0m5UminHZPVge9s1XZClarHK4jMMTH9Du2sHLsl3fedNxbgvcVPFdHswLdF
 uYiUGMSrIXuQjXw6SNmR4/voJgzikvYhT+jwMn4vTeWoFQFi5eNUch0MPfUImlXG
 e3T6WJHrOY3yJFyWQQ9GGLStchD72+iGq2uWLfOIyu9NRKCNBj4Kkm3bUvfqYp5b
 d5sP/nl93o3um4WskxB/fDLyhSXWjprgM9mKI45ilPhUC8bWQyo=
 =mwR3
 -----END PGP SIGNATURE-----

Merge tag 'net-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter and can.

  Current release - new code bugs:

   - tcp: add a missing sk_defer_free_flush() in tcp_splice_read()

   - tcp: add a stub for sk_defer_free_flush(), fix CONFIG_INET=n

   - nf_tables: set last expression in register tracking area

   - nft_connlimit: fix memleak if nf_ct_netns_get() fails

   - mptcp: fix removing ids bitmap setting

   - bonding: use rcu_dereference_rtnl when getting active slave

   - fix three cases of sleep in atomic context in drivers: lan966x, gve

   - handful of build fixes for esoteric drivers after netdev->dev_addr
     was made const

  Previous releases - regressions:

   - revert "ipv6: Honor all IPv6 PIO Valid Lifetime values", it broke
     Linux compatibility with USGv6 tests

   - procfs: show net device bound packet types

   - ipv4: fix ip option filtering for locally generated fragments

   - phy: broadcom: hook up soft_reset for BCM54616S

  Previous releases - always broken:

   - ipv4: raw: lock the socket in raw_bind()

   - ipv4: decrease the use of shared IPID generator to decrease the
     chance of attackers guessing the values

   - procfs: fix cross-netns information leakage in /proc/net/ptype

   - ethtool: fix link extended state for big endian

   - bridge: vlan: fix single net device option dumping

   - ping: fix the sk_bound_dev_if match in ping_lookup"

* tag 'net-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
  net: bridge: vlan: fix memory leak in __allowed_ingress
  net: socket: rename SKB_DROP_REASON_SOCKET_FILTER
  ipv4: remove sparse error in ip_neigh_gw4()
  ipv4: avoid using shared IP generator for connected sockets
  ipv4: tcp: send zero IPID in SYNACK messages
  ipv4: raw: lock the socket in raw_bind()
  MAINTAINERS: add missing IPv4/IPv6 header paths
  MAINTAINERS: add more files to eth PHY
  net: stmmac: dwmac-sun8i: use return val of readl_poll_timeout()
  net: bridge: vlan: fix single net device option dumping
  net: stmmac: skip only stmmac_ptp_register when resume from suspend
  net: stmmac: configure PTP clock source prior to PTP initialization
  Revert "ipv6: Honor all IPv6 PIO Valid Lifetime values"
  connector/cn_proc: Use task_is_in_init_pid_ns()
  pid: Introduce helper task_is_in_init_pid_ns()
  gve: Fix GFP flags when allocing pages
  net: lan966x: Fix sleep in atomic context when updating MAC table
  net: lan966x: Fix sleep in atomic context when injecting frames
  ethernet: seeq/ether3: don't write directly to netdev->dev_addr
  ethernet: 8390/etherh: don't write directly to netdev->dev_addr
  ...
2022-01-27 20:58:39 +02:00
Paul Menzel
854d0982ee docs/vm: Fix typo in *harden*
Fixes: df4e817b71 ("mm: page table check")
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/r/20220117111338.115455-1-pmenzel@molgen.mpg.de
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-01-27 11:22:34 -07:00
Pali Rohár
573fe46e39 Documentation: arm: marvell: Extend Avanta list
Include another two SoCs from Avanta family.

Signed-off-by: Pali Rohár <pali@kernel.org>
Link: https://lore.kernel.org/r/20220121115804.28824-1-pali@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-01-27 11:22:34 -07:00
Takahiro Itazuri
10855b45a4 docs: fix typo in Documentation/kernel-hacking/locking.rst
Change copy_from_user*( to copy_from_user() .

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
Link: https://lore.kernel.org/r/20220124081447.34066-1-itazur@amazon.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-01-27 11:22:33 -07:00
Jonathan Corbet
941518d653 docs: Hook the RTLA documents into the kernel docs build
The RTLA documents were added to Documentation/ but never hooked into the
rest of the docs build, leading to a bunch of warnings like:

  Documentation/tools/rtla/rtla-osnoise.rst: WARNING: document isn't included in any toctree

Add some basic glue to wire these documents into the build so that they are
available with the rest of the rendered docs.  No attempt has been made to
turn the RTLA docs into proper RST files rather than warmed-over man pages;
that is an exercise for the future.

Fixes: d40d48e1f1 ("rtla: Add Documentation")
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/877dau555q.fsf@meer.lwn.net
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-01-27 11:20:39 -07:00
Usama Arif
f6133fbd37 io_uring: remove unused argument from io_rsrc_node_alloc
io_ring_ctx is not used in the function.

Signed-off-by: Usama Arif <usama.arif@bytedance.com>
Link: https://lore.kernel.org/r/20220127140444.4016585-1-usama.arif@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-27 10:18:53 -07:00
Laibin Qiu
10825410b9 blk-mq: Fix wrong wakeup batch configuration which will cause hang
Commit 180dccb0db ("blk-mq: fix tag_get wait task can't be
awakened") will recalculate wake_batch when incrementing or decrementing
active_queues to avoid wake_batch > hctx_max_depth. At the same time, in
order to not affect performance as much as possible, the minimum wakeup
batch is set to 4. But when the QD is small (such as QD=1), if inc or dec
active_queues increases wakeup batch, that can lead to a hang:

Fix this problem with the following strategies:
QD          :  >= 32 | < 32
---------------------------------
wakeup batch:  8~4   | 3~1

Fixes: 180dccb0db ("blk-mq: fix tag_get wait task can't be awakened")
Link: https://lore.kernel.org/linux-block/78cafe94-a787-e006-8851-69906f0c2128@huawei.com/T/#t
Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Tested-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Link: https://lore.kernel.org/r/20220127100047.1763746-1-qiulaibin@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-27 10:15:32 -07:00
Tim Yi
fd20d97383 net: bridge: vlan: fix memory leak in __allowed_ingress
When using per-vlan state, if vlan snooping and stats are disabled,
untagged or priority-tagged ingress frame will go to check pvid state.
If the port state is forwarding and the pvid state is not
learning/forwarding, untagged or priority-tagged frame will be dropped
but skb memory is not freed.
Should free skb when __allowed_ingress returns false.

Fixes: a580c76d53 ("net: bridge: vlan: add per-vlan state")
Signed-off-by: Tim Yi <tim.yi@pica8.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20220127074953.12632-1-tim.yi@pica8.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 09:01:25 -08:00
Menglong Dong
364df53c08 net: socket: rename SKB_DROP_REASON_SOCKET_FILTER
Rename SKB_DROP_REASON_SOCKET_FILTER, which is used
as the reason of skb drop out of socket filter before
it's part of a released kernel. It will be used for
more protocols than just TCP in future series.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/all/20220127091308.91401-2-imagedong@tencent.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 08:45:13 -08:00
Eric Dumazet
3c42b20198 ipv4: remove sparse error in ip_neigh_gw4()
./include/net/route.h:373:48: warning: incorrect type in argument 2 (different base types)
./include/net/route.h:373:48:    expected unsigned int [usertype] key
./include/net/route.h:373:48:    got restricted __be32 [usertype] daddr

Fixes: 5c9f7c1dfc ("ipv4: Add helpers for neigh lookup for nexthop")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220127013404.1279313-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 08:38:33 -08:00
Jakub Kicinski
3ede6465e7 Merge branch 'ipv4-less-uses-of-shared-ip-generator'
Eric Dumazet says:

====================
ipv4: less uses of shared IP generator

From: Eric Dumazet <edumazet@google.com>

We keep receiving research reports based on linux IPID generation.

Before breaking part of the Internet by switching to pure
random generator, this series reduces the need for the
shared IP generator for TCP sockets.
====================

Link: https://lore.kernel.org/r/20220127011022.1274803-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 08:37:06 -08:00
Eric Dumazet
23f57406b8 ipv4: avoid using shared IP generator for connected sockets
ip_select_ident_segs() has been very conservative about using
the connected socket private generator only for packets with IP_DF
set, claiming it was needed for some VJ compression implementations.

As mentioned in this referenced document, this can be abused.
(Ref: Off-Path TCP Exploits of the Mixed IPID Assignment)

Before switching to pure random IPID generation and possibly hurt
some workloads, lets use the private inet socket generator.

Not only this will remove one vulnerability, this will also
improve performance of TCP flows using pmtudisc==IP_PMTUDISC_DONT

Fixes: 73f156a6e8 ("inetpeer: get rid of ip_id_count")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reported-by: Ray Che <xijiache@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 08:37:02 -08:00
Eric Dumazet
970a5a3ea8 ipv4: tcp: send zero IPID in SYNACK messages
In commit 431280eebe ("ipv4: tcp: send zero IPID for RST and
ACK sent in SYN-RECV and TIME-WAIT state") we took care of some
ctl packets sent by TCP.

It turns out we need to use a similar strategy for SYNACK packets.

By default, they carry IP_DF and IPID==0, but there are ways
to ask them to use the hashed IP ident generator and thus
be used to build off-path attacks.
(Ref: Off-Path TCP Exploits of the Mixed IPID Assignment)

One of this way is to force (before listener is started)
echo 1 >/proc/sys/net/ipv4/ip_no_pmtu_disc

Another way is using forged ICMP ICMP_FRAG_NEEDED
with a very small MTU (like 68) to force a false return from
ip_dont_fragment()

In this patch, ip_build_and_send_pkt() uses the following
heuristics.

1) Most SYNACK packets are smaller than IPV4_MIN_MTU and therefore
can use IP_DF regardless of the listener or route pmtu setting.

2) In case the SYNACK packet is bigger than IPV4_MIN_MTU,
we use prandom_u32() generator instead of the IPv4 hashed ident one.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Ray Che <xijiache@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Cc: Geoff Alexander <alexandg@cs.unm.edu>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 08:37:02 -08:00
Mathias Krause
a0f90c8815 drm/vmwgfx: Fix stale file descriptors on failed usercopy
A failing usercopy of the fence_rep object will lead to a stale entry in
the file descriptor table as put_unused_fd() won't release it. This
enables userland to refer to a dangling 'file' object through that still
valid file descriptor, leading to all kinds of use-after-free
exploitation scenarios.

Fix this by deferring the call to fd_install() until after the usercopy
has succeeded.

Fixes: c906965dee ("drm/vmwgfx: Add export fence to file descriptor support")
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Zack Rusin <zackr@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-27 17:55:20 +02:00
Eric Dumazet
153a0d187e ipv4: raw: lock the socket in raw_bind()
For some reason, raw_bind() forgot to lock the socket.

BUG: KCSAN: data-race in __ip4_datagram_connect / raw_bind

write to 0xffff8881170d4308 of 4 bytes by task 5466 on cpu 0:
 raw_bind+0x1b0/0x250 net/ipv4/raw.c:739
 inet_bind+0x56/0xa0 net/ipv4/af_inet.c:443
 __sys_bind+0x14b/0x1b0 net/socket.c:1697
 __do_sys_bind net/socket.c:1708 [inline]
 __se_sys_bind net/socket.c:1706 [inline]
 __x64_sys_bind+0x3d/0x50 net/socket.c:1706
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff8881170d4308 of 4 bytes by task 5468 on cpu 1:
 __ip4_datagram_connect+0xb7/0x7b0 net/ipv4/datagram.c:39
 ip4_datagram_connect+0x2a/0x40 net/ipv4/datagram.c:89
 inet_dgram_connect+0x107/0x190 net/ipv4/af_inet.c:576
 __sys_connect_file net/socket.c:1900 [inline]
 __sys_connect+0x197/0x1b0 net/socket.c:1917
 __do_sys_connect net/socket.c:1927 [inline]
 __se_sys_connect net/socket.c:1924 [inline]
 __x64_sys_connect+0x3d/0x50 net/socket.c:1924
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x00000000 -> 0x0003007f

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 5468 Comm: syz-executor.5 Not tainted 5.17.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 14:09:10 +00:00
Jakub Kicinski
966f435add MAINTAINERS: add missing IPv4/IPv6 header paths
Add missing headers to the IP entry.

Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 13:58:28 +00:00
Jakub Kicinski
492fefbaaf MAINTAINERS: add more files to eth PHY
include/linux/linkmode.h and include/linux/mii.h
do not match anything in MAINTAINERS. Looks like
they should be under Ethernet PHY.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 13:55:39 +00:00
Jens Axboe
3c8cef9f3d nvme fixes for Linux 5.17
- add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs (Wu Zheng)
  - remove the unneeded ret variable in nvmf_dev_show (Changcheng Deng)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmHyRy8LHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYM7kw//XR9X95EsjTZ89AQPxpIcD08tuAP1g+qOyo+LjP+U
 N+Y2DiObRxbL2gmJKldPTeHl8m6CZiP6gdYfyHfRztNpeyCGY55EPy3GrmTRFVz+
 aTMkYCtOzJCeRgPYjPuGnjGpovfogSQRbEhndms67Z73OE32+2O9Ru8dLdoS5ecJ
 LGGjuH77V7SyjRzYNuv3adMf1LS5HBAuH38C64SGF5AkIzGVAPvqzqU8F34Q/fLb
 hLQCQ12joX9qx2/TrouSqb9x5uOpEe3AbTOlaL4nBMHIoziIOWqUNKJ01expc8zB
 K2ys+kHd3pSc930eZgmEZYaeRdLOZ4oPtHlaKFS4ia934SkX0kdV4tbRMAUV+dHz
 le15eMqrmNWvUP/BE4QO7FbeBGhlMRtVBkNtooBfluUhANsLjwZyUZAyU+bF4V3n
 PCJ85FyNK7stXHs6vMODywkQ82cgwaB9fHQ1L2ED8AUeYF2i7auzOF4LJ16vrCSE
 Tfs+1Ht2rPb9fgDSdl0FOTyA20rdpqJiECAvx54zOnZg/lC6XZRYFYzIg2LtcBdd
 W483Z77KQ3hglzBvlgtuhukqAOcU4w5eKGmH7NyiBoud2PG7YVRP7J7++jqn1Apf
 azN6bSgI8gEZzds0JYKv2rVCLQ4VmOR04PwkfrvCGPOaFyE1TomMVYOyFY1JTFw+
 Exc=
 =viZK
 -----END PGP SIGNATURE-----

Merge tag 'nvme-5.17-2022-01-27' of git://git.infradead.org/nvme into block-5.17

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.17

 - add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs (Wu Zheng)
 - remove the unneeded ret variable in nvmf_dev_show (Changcheng Deng)"

* tag 'nvme-5.17-2022-01-27' of git://git.infradead.org/nvme:
  nvme-fabrics: remove the unneeded ret variable in nvmf_dev_show
  nvme-pci: add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs
2022-01-27 06:52:50 -07:00
Jisheng Zhang
9e0db41e7a net: stmmac: dwmac-sun8i: use return val of readl_poll_timeout()
When readl_poll_timeout() timeout, we'd better directly use its return
value.

Before this patch:
[    2.145528] dwmac-sun8i: probe of 4500000.ethernet failed with error -14

After this patch:
[    2.138520] dwmac-sun8i: probe of 4500000.ethernet failed with error -110

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 13:51:10 +00:00
Nikolay Aleksandrov
dcb2c5c6ca net: bridge: vlan: fix single net device option dumping
When dumping vlan options for a single net device we send the same
entries infinitely because user-space expects a 0 return at the end but
we keep returning skb->len and restarting the dump on retry. Fix it by
returning the value from br_vlan_dump_dev() if it completed or there was
an error. The only case that must return skb->len is when the dump was
incomplete and needs to continue (-EMSGSIZE).

Reported-by: Benjamin Poirier <bpoirier@nvidia.com>
Fixes: 8dcea18708 ("net: bridge: vlan: add rtm definitions and dump support")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 13:48:17 +00:00
David S. Miller
aa44323e1c Merge branch 'stmmac-ptp-fix'
Mohammad Athari Bin Ismail says:

====================
Fix PTP issue in stmmac

This patch series to fix PTP issue in stmmac related to:
1/ PTP clock source configuration during initialization.
2/ PTP initialization during resume from suspend.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 13:46:18 +00:00
Mohammad Athari Bin Ismail
0735e639f1 net: stmmac: skip only stmmac_ptp_register when resume from suspend
When resume from suspend, besides skipping PTP registration, it also
skipping PTP HW initialization. This could cause PTP clock not able to
operate properly when resume from suspend.

To fix this, only stmmac_ptp_register() is skipped when resume from
suspend.

Fixes: fe13192911 ("stmmac: Don't init ptp again when resume from suspend/hibernation")
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 13:45:56 +00:00
Mohammad Athari Bin Ismail
94c82de43e net: stmmac: configure PTP clock source prior to PTP initialization
For Intel platform, it is required to configure PTP clock source prior PTP
initialization in MAC. So, need to move ptp_clk_freq_config execution from
stmmac_ptp_register() to stmmac_init_ptp().

Fixes: 76da35dc99 ("stmmac: intel: Add PSE and PCH PTP clock source selection")
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 13:45:56 +00:00
Guillaume Nault
36268983e9 Revert "ipv6: Honor all IPv6 PIO Valid Lifetime values"
This reverts commit b75326c201.

This commit breaks Linux compatibility with USGv6 tests. The RFC this
commit was based on is actually an expired draft: no published RFC
currently allows the new behaviour it introduced.

Without full IETF endorsement, the flash renumbering scenario this
patch was supposed to enable is never going to work, as other IPv6
equipements on the same LAN will keep the 2 hours limit.

Fixes: b75326c201 ("ipv6: Honor all IPv6 PIO Valid Lifetime values")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-27 13:35:14 +00:00
Linus Torvalds
626b2dda76 rpmsg fixes for v5.17-rc1
The cdev cleanup in the rpmsg_char driver was not performed properly,
 resulting in unpredicable behaviour when the parent remote processor is
 stopped with any of the cdevs open by a client.
 
 The two patches transitions the implementation to use cdev_device_add()
 and cdev_del_device(), to capture the relationship between the two
 objects, and relocates the incorrectly placed cdev_del().
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmHxa4UbHGJqb3JuLmFu
 ZGVyc3NvbkBsaW5hcm8ub3JnAAoJEAsfOT8Nma3Fi7sQAMHdsaqZapq9MMLRtGJi
 YjpOvnH4ftOfr1+yfsKbbmH9ubUX4pdqgG7z+sRRzpmNjEwsKxe6rKXG30D+ZUX9
 b8waVW5qC6A3VjDzDVdC+Mz2HhUkkeukxFzfwAuTOnpGHwtILkCid5A+jv1cJAz0
 BDrszx+kIDrXC0KpI9V0mlPn2iqONYZHgjSRgQtw1xe2LtDL+5/iLvoP7hGq0qOx
 ujK9Ux+FOK2+H6G7xaNLK1GekVJBa2kZRcYYhHPpLrpV5Ul9IXpBGUYfiVFPp5WD
 ZDNQEImd96EQJs8rfQnAq9WhFtk95BSe/oGP2V/NB3O+RqD1IjtQxnFk99DFrXnQ
 DlgzSTD1iV356y8ZFPhlaCLAaPueFMXRRoKMhSaP28d3wNNOawDKwnXb7Amf7y2c
 Aeygibrh6UnOCROPWlvMrhImPuOiTyGBmD0TbNSGdYoXiZqpITmcVoW6NDyM0zST
 YB/PW091k6wXLmumARd5G6xK/+hALx6WFf50StxncXqax6aRy7HtaiqLMO95X+5w
 kD6GF4mBasqtYgRW9bWTndT/H2r09T9R5xksZDmWE9DAcsozz8HSBAXe8XqZiPED
 ZfqUlfHVBOYuMYb8P4bUYW3ZK9p5LIQeP1RbPojgWcf3/Rn6aLLg/mOUkbqRFSa7
 XXj8jSmvMyGLbLFSvtcnU7z0
 =KdZJ
 -----END PGP SIGNATURE-----

Merge tag 'rpmsg-v5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull rpmsg fixes from Bjorn Andersson:
 "The cdev cleanup in the rpmsg_char driver was not performed properly,
  resulting in unpredicable behaviour when the parent remote processor
  is stopped with any of the cdevs open by a client.

  Two patches transitions the implementation to use cdev_device_add()
  and cdev_del_device(), to capture the relationship between the two
  objects, and relocates the incorrectly placed cdev_del()"

* tag 'rpmsg-v5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  rpmsg: char: Fix race between the release of rpmsg_eptdev and cdev
  rpmsg: char: Fix race between the release of rpmsg_ctrldev and cdev
2022-01-27 11:23:26 +02:00
Linus Torvalds
96b5590a48 remoteproc fixes for v5.17-rc1
The interaction between the various Qualcomm remoteproc drivers and the
 Qualcomm "QMP" driver (used to communicate with the power-management
 hardware) was reworked in v5.17-rc1, but failed to account for the new
 Kconfig dependency.
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmHxaXIbHGJqb3JuLmFu
 ZGVyc3NvbkBsaW5hcm8ub3JnAAoJEAsfOT8Nma3FHP0P/36qFsB7vuyveRuA+7PE
 MLbA35xvQjLztY8yoYDXB5akruHclPUuLnzkarogED8pH4WmaAUIB9lups/rFQwt
 m2skIZ8hot2jRFYWvBKe6RMaZjKPoPCUSx94gDP315/VTryoJAxEg9rdHrnfBaAm
 IexxXbSxAyEH8K3aXRHxXj9snD+uC7e3LRePvbiZoPT0xY80DoDlaAw4vBor+Cl/
 VlHPO+hJFJUM+LtdMbDe3MiBQfuun3yAIIGn8wzc9M7aDrHCN4kX/MwayEB8TPI3
 V1im7P0tTFMC4kx4FeQKSf4uq45Qsofqp4W3PFDWy5wCNX1L2QUIh4w7CINDSoZX
 dL0sGqxyvlMZTlI22H2Ga69F37mksl8sJ4aVXvsYJ5hx9Z6Y/YwTeG4oz/BlWB/6
 44KxoNxCB+fITvI11O0bnyozbGS5AkKeBTN1vcowJS/VpdKDcvaYAxwasZhFZqMC
 2p6ZjdWL96MQqw3BYW/if/AymN0LyVE4tTezU50hhE2JufJRpq8Qo11A8uRMKUxN
 uI2Clcp0tlBxtzyhn8P8mQf3xKexBj4ETEmnYXlChL2ZJ4VoIJiGHAssKGbWUzru
 nf28Bf7Cr0AXJqWN2tIvnt+36i0sRUWxSmSAlhTb8flB4ygAhZlwkTQjsXOsW8pW
 AHULY6Tixaa3Wn6E2I+hWs+e
 =CCeM
 -----END PGP SIGNATURE-----

Merge tag 'rproc-v5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull remoteproc fix from Bjorn Andersson:
 "The interaction between the various Qualcomm remoteproc drivers and
  the Qualcomm 'QMP' driver (used to communicate with the
  power-management hardware) was reworked in v5.17-rc1, but failed to
  account for the new Kconfig dependency"

* tag 'rproc-v5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  remoteproc: qcom: q6v5: fix service routines build errors
2022-01-27 11:19:20 +02:00
Thomas Bogendoerfer
fa62f39dc7 MIPS: Fix build error due to PTR used in more places
Use PTR_WD instead of PTR to avoid clashes with other parts.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-01-27 09:04:19 +01:00
Greg Kroah-Hartman
d1ad2721b1 kbuild: remove include/linux/cyclades.h from header file check
The file now rightfully throws up a big warning that it should never be
included, so remove it from the header_check test.

Fixes: f23653fe64 ("tty: Partially revert the removal of the Cyclades public API")
Cc: stable <stable@vger.kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: "Maciej W. Rozycki" <macro@embecosm.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20220127073304.42399-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-27 08:51:08 +01:00
Changcheng Deng
a5f3851b7f nvme-fabrics: remove the unneeded ret variable in nvmf_dev_show
Remove unneeded variable and directly return 0.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-01-27 08:17:17 +01:00
Wu Zheng
25e58af4be nvme-pci: add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs
The Intel P4500/P4600 SSDs do not report a subsystem NQN despite claiming
compliance to a standards version where reporting one is required.

Add the IGNORE_DEV_SUBNQN quirk to not fail the initialization of a
second such SSDs in a system.

Signed-off-by: Zheng Wu <wu.zheng@intel.com>
Signed-off-by: Ye Jinhe <jinhe.ye@intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-01-27 08:17:14 +01:00
Jakub Kicinski
c7ec845f0e Merge branch 'pid-introduce-helper-task_is_in_root_ns'
Leo Yan says:

====================
pid: Introduce helper task_is_in_root_ns()

This patch series introduces a helper function task_is_in_init_pid_ns()
to replace open code.  The two patches are extracted from the original
series [1] for network subsystem.

As a plan, we can firstly land this patch set into kernel 5.18; there
have 5 patches are left out from original series [1], as a next step,
I will resend them for appropriate linux-next merging.

[1] https://lore.kernel.org/lkml/20211208083320.472503-1-leo.yan@linaro.org/
====================

Link: https://lore.kernel.org/r/20220126050427.605628-1-leo.yan@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-26 18:57:13 -08:00
Leo Yan
42c66d1675 connector/cn_proc: Use task_is_in_init_pid_ns()
This patch replaces open code with task_is_in_init_pid_ns() to check if
a task is in root PID namespace.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-26 18:57:09 -08:00
Leo Yan
d7e4f8545b pid: Introduce helper task_is_in_init_pid_ns()
Currently the kernel uses open code in multiple places to check if a
task is in the root PID namespace with the kind of format:

  if (task_active_pid_ns(current) == &init_pid_ns)
      do_something();

This patch creates a new helper function, task_is_in_init_pid_ns(), it
returns true if a passed task is in the root PID namespace, otherwise
returns false.  So it will be used to replace open codes.

Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-26 18:57:09 -08:00
Catherine Sullivan
a92f7a6fee gve: Fix GFP flags when allocing pages
Use GFP_ATOMIC when allocating pages out of the hotpath,
continue to use GFP_KERNEL when allocating pages during setup.

GFP_KERNEL will allow blocking which allows it to succeed
more often in a low memory enviornment but in the hotpath we do
not want to allow the allocation to block.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Link: https://lore.kernel.org/r/20220126003843.3584521-1-awogbemila@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-26 18:45:01 -08:00
Zhou Qingyang
9b6d90e208 ata: pata_platform: Fix a NULL pointer dereference in __pata_platform_probe()
In __pata_platform_probe(), devm_kzalloc() is assigned to ap->ops and
there is a dereference of it right after that, which could introduce a
NULL pointer dereference bug.

Fix this by adding a NULL check of ap->ops.

This bug was found by a static analyzer.

Builds with 'make allyesconfig' show no new warnings,
and our static analyzer no longer warns about this code.

Fixes: f3d5e4f18d ("ata: pata_of_platform: Allow to use 16-bit wide data transfer")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
2022-01-27 11:22:43 +09:00
Eric W. Biederman
f9d87929d4 ucount: Make get_ucount a safe get_user replacement
When the ucount code was refactored to create get_ucount it was missed
that some of the contexts in which a rlimit is kept elevated can be
the only reference to the user/ucount in the system.

Ordinary ucount references exist in places that also have a reference
to the user namspace, but in POSIX message queues, the SysV shm code,
and the SIGPENDING code there is no independent user namespace
reference.

Inspection of the the user_namespace show no instance of circular
references between struct ucounts and the user_namespace.  So
hold a reference from struct ucount to i's user_namespace to
resolve this problem.

Link: https://lore.kernel.org/lkml/YZV7Z+yXbsx9p3JN@fixkernel.com/
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Reported-by: Mathias Krause <minipli@grsecurity.net>
Tested-by: Mathias Krause <minipli@grsecurity.net>
Reviewed-by: Mathias Krause <minipli@grsecurity.net>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Fixes: d646969055 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
Fixes: 6e52a9f053 ("Reimplement RLIMIT_MSGQUEUE on top of ucounts")
Fixes: d7c9e99aee ("Reimplement RLIMIT_MEMLOCK on top of ucounts")
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-26 18:34:11 -06:00
Paul E. McKenney
da123016ca rcu-tasks: Fix computation of CPU-to-list shift counts
The ->percpu_enqueue_shift field is used to map from the running CPU
number to the index of the corresponding callback list.  This mapping
can change at runtime in response to varying callback load, resulting
in varying levels of contention on the callback-list locks.

Unfortunately, the initial value of this field is correct only if the
system happens to have a power-of-two number of CPUs, otherwise the
callbacks from the high-numbered CPUs can be placed into the callback list
indexed by 1 (rather than 0), and those index-1 callbacks will be ignored.
This can result in soft lockups and hangs.

This commit therefore corrects this mapping, adding one to this shift
count as needed for systems having odd numbers of CPUs.

Fixes: 7a30871b6a ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection")
Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-01-26 13:04:05 -08:00
Jeff Layton
4584a768f2 ceph: set pool_ns in new inode layout for async creates
Dan reported that he was unable to write to files that had been
asynchronously created when the client's OSD caps are restricted to a
particular namespace.

The issue is that the layout for the new inode is only partially being
filled. Ensure that we populate the pool_ns_data and pool_ns_len in the
iinfo before calling ceph_fill_inode.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/54013
Fixes: 9a8d03ca2e ("ceph: attempt to do async create when possible")
Reported-by: Dan van der Ster <dan@vanderster.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-26 20:17:50 +01:00
Jeff Layton
932a9b5870 ceph: properly put ceph_string reference after async create attempt
The reference acquired by try_prep_async_create is currently leaked.
Ensure we put it.

Cc: stable@vger.kernel.org
Fixes: 9a8d03ca2e ("ceph: attempt to do async create when possible")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-26 20:17:50 +01:00
Xiubo Li
89d43d0551 ceph: put the requests/sessions when it fails to alloc memory
When failing to allocate the sessions memory we should make sure
the req1 and req2 and the sessions get put. And also in case the
max_sessions decreased so when kreallocate the new memory some
sessions maybe missed being put.

And if the max_sessions is 0 krealloc will return ZERO_SIZE_PTR,
which will lead to a distinct access fault.

URL: https://tracker.ceph.com/issues/53819
Fixes: e1a4541ec0 ("ceph: flush the mdlog before waiting on unsafe reqs")
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-26 20:17:50 +01:00
Evgenii Stepanov
3758a6c74e arm64: extable: fix load_unaligned_zeropad() reg indices
In ex_handler_load_unaligned_zeropad() we erroneously extract the data and
addr register indices from ex->type rather than ex->data. As ex->type will
contain EX_TYPE_LOAD_UNALIGNED_ZEROPAD (i.e. 4):
 * We'll always treat X0 as the address register, since EX_DATA_REG_ADDR is
   extracted from bits [9:5]. Thus, we may attempt to dereference an
   arbitrary address as X0 may hold an arbitrary value.
 * We'll always treat X4 as the data register, since EX_DATA_REG_DATA is
   extracted from bits [4:0]. Thus we will corrupt X4 and cause arbitrary
   behaviour within load_unaligned_zeropad() and its caller.

Fix this by extracting both values from ex->data as originally intended.

On an MTE-enabled QEMU image we are hitting the following crash:
 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
 Call trace:
  fixup_exception+0xc4/0x108
  __do_kernel_fault+0x3c/0x268
  do_tag_check_fault+0x3c/0x104
  do_mem_abort+0x44/0xf4
  el1_abort+0x40/0x64
  el1h_64_sync_handler+0x60/0xa0
  el1h_64_sync+0x7c/0x80
  link_path_walk+0x150/0x344
  path_openat+0xa0/0x7dc
  do_filp_open+0xb8/0x168
  do_sys_openat2+0x88/0x17c
  __arm64_sys_openat+0x74/0xa0
  invoke_syscall+0x48/0x148
  el0_svc_common+0xb8/0xf8
  do_el0_svc+0x28/0x88
  el0_svc+0x24/0x84
  el0t_64_sync_handler+0x88/0xec
  el0t_64_sync+0x1b4/0x1b8
 Code: f8695a69 71007d1f 540000e0 927df12a (f940014a)

Fixes: 753b323687 ("arm64: extable: add load_unaligned_zeropad() handler")
Cc: <stable@vger.kernel.org> # 5.16.x
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Evgenii Stepanov <eugenis@google.com>
Link: https://lore.kernel.org/r/20220125182217.2605202-1-eugenis@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-26 18:58:12 +00:00
Dan Carpenter
fc55e63e14 counter: fix an IS_ERR() vs NULL bug
There are 8 callers for devm_counter_alloc() and they all check for NULL
instead of error pointers.  I think NULL is the better thing to return
for allocation functions so update counter_alloc() and devm_counter_alloc()
to return NULL instead of error pointers.

Fixes: c18e276030 ("counter: Provide alternative counter registration functions")
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220111173243.GA2192@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26 19:40:33 +01:00
Paolo Bonzini
dd4516aee3 selftests: kvm: move vm_xsave_req_perm call to amx_test
There is no need for tests other than amx_test to enable dynamic xsave
states.  Remove the call to vm_xsave_req_perm from generic code,
and move it inside the test.  While at it, allow customizing the bit
that is requested, so that future tests can use it differently.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26 12:45:20 -05:00
Like Xu
05a9e06505 KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any time
XCR0 is reset to 1 by RESET but not INIT and IA32_XSS is zeroed by
both RESET and INIT. The kvm_set_msr_common()'s handling of MSR_IA32_XSS
also needs to update kvm_update_cpuid_runtime(). In the above cases, the
size in bytes of the XSAVE area containing all states enabled by XCR0 or
(XCRO | IA32_XSS) needs to be updated.

For simplicity and consistency, existing helpers are used to write values
and call kvm_update_cpuid_runtime(), and it's not exactly a fast path.

Fixes: a554d207dc ("KVM: X86: Processor States following Reset or INIT")
Cc: stable@vger.kernel.org
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220126172226.2298529-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26 12:42:45 -05:00
Like Xu
4c282e51e4 KVM: x86: Update vCPU's runtime CPUID on write to MSR_IA32_XSS
Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the
size in bytes of the XSAVE area is affected by the states enabled in XSS.

Fixes: 203000993d ("kvm: vmx: add MSR logic for XSAVES")
Cc: stable@vger.kernel.org
Signed-off-by: Like Xu <likexu@tencent.com>
[sean: split out as a separate patch, adjust Fixes tag]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220126172226.2298529-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26 12:42:45 -05:00
Xiaoyao Li
be4f3b3f82 KVM: x86: Keep MSR_IA32_XSS unchanged for INIT
It has been corrected from SDM version 075 that MSR_IA32_XSS is reset to
zero on Power up and Reset but keeps unchanged on INIT.

Fixes: a554d207dc ("KVM: X86: Processor States following Reset or INIT")
Cc: stable@vger.kernel.org
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220126172226.2298529-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26 12:42:44 -05:00