Commit Graph

1031102 Commits

Author SHA1 Message Date
Trond Myklebust
19598141f4 nfsd: Fix a warning for nfsd_file_close_inode
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-10-01 11:17:40 -04:00
Trond Myklebust
f2e717d655 nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero
RFC3530 notes that the 'dircount' field may be zero, in which case the
recommendation is to ignore it, and only enforce the 'maxcount' field.
In RFC5661, this recommendation to ignore a zero valued field becomes a
requirement.

Fixes: aee3776441 ("nfsd4: fix rd_dircount enforcement")
Cc: <stable@vger.kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-30 16:53:17 -04:00
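The zero-'dircount' rule described in f2e717d655 can be sketched in userspace C roughly as follows. This is an illustration only, not the nfsd code: the function name `readdir_entry_fits` and all of its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the nfsd implementation): decide whether one
 * more directory entry may be added to a READDIR reply.
 */
bool readdir_entry_fits(uint32_t dircount, uint32_t maxcount,
                        uint32_t dir_used, uint32_t reply_used,
                        uint32_t entry_dir_len, uint32_t entry_reply_len)
{
    /* 'maxcount' bounds the whole reply and is always enforced. */
    if (reply_used > maxcount || entry_reply_len > maxcount - reply_used)
        return false;

    /* A zero 'dircount' means "no hint": skip that check entirely. */
    if (dircount == 0)
        return true;

    return dir_used <= dircount && entry_dir_len <= dircount - dir_used;
}
```

With `dircount == 0`, only the `maxcount` comparison can reject an entry.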
Patrick Ho
1d625050c7 nfsd: fix error handling of register_pernet_subsys() in init_nfsd()
init_nfsd() should not unregister pernet subsys if the register fails
but should instead unwind from the last successful operation which is
register_filesystem().

Unregistering a failed register_pernet_subsys() call can result in
a kernel GPF as revealed by programmatically injecting an error in
register_pernet_subsys().

Verified the fix handled failure gracefully with no lingering nfsd
entry in /proc/filesystems.  The bug was introduced by commit
bd5ae9288d ("nfsd: register pernet ops last, unregister first");
the original error handling logic was correct.

Fixes: bd5ae9288d ("nfsd: register pernet ops last, unregister first")
Cc: stable@vger.kernel.org
Signed-off-by: Patrick Ho <Patrick.Ho@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-30 10:58:52 -04:00
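The unwinding rule from 1d625050c7 (a failed registration is not itself unregistered; cleanup starts from the last step that succeeded) can be sketched with the usual goto ladder. The `fake_*` functions and the global flags below are hypothetical stand-ins, not the kernel calls.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state standing in for the kernel's registries. */
bool fs_registered;
bool pernet_registered;

int fake_register_filesystem(void)
{
    fs_registered = true;
    return 0;
}

void fake_unregister_filesystem(void)
{
    fs_registered = false;
}

int fake_register_pernet_subsys(bool inject_failure)
{
    if (inject_failure)
        return -ENOMEM;      /* nothing was registered on this path */
    pernet_registered = true;
    return 0;
}

/* Registers the filesystem first and the pernet subsystem last. */
int fake_init(bool inject_failure)
{
    int ret;

    ret = fake_register_filesystem();
    if (ret < 0)
        goto out;
    ret = fake_register_pernet_subsys(inject_failure);
    if (ret < 0)
        goto out_unregister_fs;  /* do NOT unregister the failed step */
    return 0;

out_unregister_fs:
    fake_unregister_filesystem();
out:
    return ret;
}
```

Calling `fake_init(true)` leaves both flags false, mirroring the "no lingering entry" check mentioned in the commit message.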
Dai Ngo
02579b2ff8 nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN
When the back channel enters SEQ4_STATUS_CB_PATH_DOWN state, the client
recovers by sending BIND_CONN_TO_SESSION but the server fails to recover
the back channel and leaves it as NFSD4_CB_DOWN.

Fix by enhancing nfsd4_bind_conn_to_session to probe the back channel
by calling nfsd4_probe_callback.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-17 10:35:12 -04:00
Chuck Lever
89c485c7a3 NLM: Fix svcxdr_encode_owner()
Dai Ngo reports that, since the XDR overhaul, the NLM server crashes
when the TEST procedure wants to return NLM_DENIED. There is a bug
in svcxdr_encode_owner() that none of our standard test cases found.

Replace the open-coded function with a call to an appropriate
pre-fabricated XDR helper.

Reported-by: Dai Ngo <Dai.Ngo@oracle.com>
Fixes: a6a63ca565 ("lockd: Common NLM XDR helpers")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-17 10:35:10 -04:00
NeilBrown
0c217d5066 SUNRPC: improve error response to over-size gss credential
When the NFS server receives a large gss (kerberos) credential and tries
to pass it up to rpc.svcgssd (which is deprecated), it triggers an
infinite loop in cache_read().

cache_request() always returns -EAGAIN, and this causes a "goto again".

This patch:
 - changes the error to -E2BIG to avoid the infinite loop, and
 - generates a WARN_ONCE when rsi_request first sees an over-sized
   credential.  The warning suggests switching to gssproxy.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196583
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-03 13:38:11 -04:00
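Why the error code matters in 0c217d5066 can be illustrated with a small retry loop. This is a simplified userspace sketch, not cache_read(): `format_request`, `read_request`, and the `max_tries` bound are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical formatter: the request needs 'needed' bytes of buffer. */
int format_request(size_t needed, size_t buflen)
{
    if (needed > buflen)
        return -E2BIG;   /* can never fit, so retrying is pointless */
    return (int)needed;
}

/*
 * Reader loop: only -EAGAIN means "try again". Any other result,
 * including -E2BIG, ends the loop. 'max_tries' bounds this sketch.
 */
int read_request(size_t needed, size_t buflen, int max_tries)
{
    int ret = -EAGAIN;

    while (max_tries-- > 0) {
        ret = format_request(needed, buflen);
        if (ret != -EAGAIN)
            break;
    }
    return ret;
}
```

If the formatter returned -EAGAIN for an over-sized request, an unbounded version of this loop would never terminate.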
NeilBrown
e38b3f2005 SUNRPC: don't pause on incomplete allocation
alloc_pages_bulk_array() attempts to allocate at least one page based on
the provided pages, and then opportunistically allocates more if that
can be done without dropping the spinlock.

So if it returns fewer than requested, that could just mean that it
needed to drop the lock.  In that case, try again immediately.

Only pause for a time if no progress could be made.

Reported-and-tested-by: Mike Javorski <mike.javorski@gmail.com>
Reported-and-tested-by: Lothar Paltins <lopa@mailbox.org>
Fixes: f6e70aab9d ("SUNRPC: refresh rq_pages using a bulk page allocator")
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Mel Gorman <mgorman@suse.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-01 11:05:07 -04:00
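The retry policy described in e38b3f2005 (retry at once after partial progress, pause only after no progress) might look like this in a userspace sketch. The callback types, `fill_pages`, and the `demo_*` helpers are hypothetical and do not mirror the alloc_pages_bulk_array() signature.

```c
#include <stddef.h>

/* Hypothetical allocator: returns the new total of filled slots. */
typedef size_t (*bulk_alloc_fn)(size_t wanted, size_t filled, void *ctx);
typedef void (*pause_fn)(void *ctx);

size_t fill_pages(size_t wanted, bulk_alloc_fn alloc, pause_fn pause,
                  void *ctx)
{
    size_t filled = 0;

    while (filled < wanted) {
        size_t now = alloc(wanted, filled, ctx);

        if (now == filled)
            pause(ctx);      /* no progress at all: back off */
        filled = now;        /* partial progress: loop again at once */
    }
    return filled;
}

/* Demo allocator: 'step' pages per call, but the 2nd call gets nothing. */
struct demo_ctx { size_t step; size_t calls; size_t pauses; };

size_t demo_alloc(size_t wanted, size_t filled, void *p)
{
    struct demo_ctx *ctx = p;

    ctx->calls++;
    if (ctx->calls == 2)
        return filled;
    filled += ctx->step;
    return filled > wanted ? wanted : filled;
}

void demo_pause(void *p)
{
    ((struct demo_ctx *)p)->pauses++;
}
```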
J. Bruce Fields
0bcc7ca40b nfsd: fix crash on LOCKT on reexported NFSv3
Unlike other filesystems, NFSv3 tries to use fl_file in the GETLK case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26 15:32:29 -04:00
J. Bruce Fields
bb0a55bb71 nfs: don't allow reexport reclaims
In the reexport case, nfsd is currently passing along locks with the
reclaim bit set.  The client sends a new lock request, which is granted
if there's currently no conflict--even if it's possible a conflicting
lock could have been briefly held in the interim.

We don't currently have any way to safely grant reclaim, so for now
let's just deny them all.

I'm doing this by passing the reclaim bit to nfs and letting it fail the
call, with the idea that eventually the client might be able to do
something more forgiving here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26 15:32:28 -04:00
J. Bruce Fields
b840be2f00 lockd: don't attempt blocking locks on nfs reexports
As in the v4 case, it doesn't work well to block waiting for a lock on
an nfs filesystem.

As in the v4 case, that means we're depending on the client to poll.
It's probably incorrect to depend on that, but I *think* clients do poll
in practice.  In any case, it's an improvement over hanging the lockd
thread indefinitely as we currently are.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26 15:32:18 -04:00
J. Bruce Fields
f657f8eef3 nfs: don't attempt blocking locks on nfs reexports
NFS implements blocking locks by blocking inside its lock method.  In
the reexport case, this blocks the nfs server thread, which could lead
to deadlocks since an nfs server thread might be required to unlock the
conflicting lock.  It also causes a crash, since the nfs server thread
assumes it can free the lock when its lm_notify lock callback is called.

Ideal would be to make the nfs lock method return without blocking in
this case, but for now it works just not to attempt blocking locks.  The
difference is just that the original client will have to poll (as it
does in the v4.0 case) instead of getting a callback when the lock's
available.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26 15:32:10 -04:00
J. Bruce Fields
7f024fcd5c Keep read and write fds with each nlm_file
We shouldn't really be using a read-only file descriptor to take a write
lock.

Most filesystems will put up with it.  But NFS, for example, won't.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-23 18:05:31 -04:00
J. Bruce Fields
b661601a9f lockd: update nlm_lookup_file reexport comment
Update comment to reflect that we *do* allow reexport, whether it's a
good idea or not....

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-23 12:56:17 -04:00
J. Bruce Fields
a81041b7d8 nlm: minor refactoring
Make this lookup slightly more concise, and prepare for changing how we
look this up in a following patch.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-23 12:56:17 -04:00
J. Bruce Fields
2dc6f19e4f nlm: minor nlm_lookup_file argument change
It'll come in handy to get the whole nlm_lock.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-23 12:56:03 -04:00
J. Bruce Fields
7de875b231 lockd: lockd server-side shouldn't set fl_ops
Locks have two sets of op arrays, fl_lmops for the lock manager (lockd
or nfsd), fl_ops for the filesystem.  The server-side lockd code has
been setting its own fl_ops, which leads to confusion (and crashes) in
the reexport case, where the filesystem expects to be the only one
setting fl_ops.

And there's no reason for it that I can see; the lm_get/put_owner ops do
the same job.

Reported-by: Daire Byrne <daire@dneg.com>
Tested-by: Daire Byrne <daire@dneg.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-21 11:48:34 -04:00
Chuck Lever
400edd8c04 SUNRPC: Add documentation for the fail_sunrpc/ directory
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-20 13:50:33 -04:00
Chuck Lever
3a12618059 SUNRPC: Server-side disconnect injection
Disconnect injection stress-tests the ability for both client and
server implementations to behave resiliently in the face of network
instability.

A file called /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect
enables administrators to turn off server-side disconnect injection
while allowing other types of sunrpc errors to be injected. The
default setting is that server-side disconnect injection is enabled
(ignore=false).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-20 13:50:33 -04:00
Chuck Lever
a4ae308143 SUNRPC: Move client-side disconnect injection
Disconnect injection stress-tests the ability for both client and
server implementations to behave resiliently in the face of network
instability.

Convert the existing client-side disconnect injection infrastructure
to use the kernel's generic error injection facility. The generic
facility has a richer set of injection criteria.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-20 13:50:32 -04:00
Chuck Lever
c782af2500 SUNRPC: Add a /sys/kernel/debug/fail_sunrpc/ directory
This directory will contain a set of administrative controls for
enabling error injection for kernel RPC consumers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-20 13:50:32 -04:00
Chuck Lever
729580ddc5 svcrdma: xpt_bc_xprt is already clear in __svc_rdma_free()
svc_xprt_free() already "puts" the bc_xprt before calling the
transport's "free" method. No need to do it twice.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-19 08:29:32 -04:00
J. Bruce Fields
f7104cc1a9 nfsd4: Fix forced-expiry locking
This should use the network-namespace-wide client_lock, not the
per-client cl_lock.

You shouldn't see any bugs unless you're actually using the
forced-expiry interface introduced by 89c905becc.

Fixes: 89c905becc ("nfsd: allow forced expiration of NFSv4 clients")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:54 -04:00
J. Bruce Fields
5a47534462 rpc: fix gss_svc_init cleanup on failure
The failure case here should be rare, but it's obviously wrong.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:53 -04:00
Chuck Lever
b4ab2fea7c SUNRPC: Add RPC_AUTH_TLS protocol numbers
Shared by client and server. See:

https://www.iana.org/assignments/rpc-authentication-numbers/rpc-authentication-numbers.xhtml

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:53 -04:00
Jia He
d02a3a2cb2 lockd: change the proc_handler for nsm_use_hostnames
nsm_use_hostnames is a module parameter that is also exported through
sysctl procfs, so that users can change it from userspace. But the
minimal unit for a sysctl procfs read/write is sizeof(int).
On big-endian systems, converting between bool and int this way causes
errors for these proc items.

This patch uses a new proc_handler, proc_dobool, to fix it.

Signed-off-by: Jia He <hejianet@gmail.com>
Reviewed-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
[thuth: Fix typo in commit message]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:53 -04:00
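The endianness problem behind d02a3a2cb2 can be simulated in portable C. The sketch below assumes a 1-byte bool sitting at the start of a 4-byte integer write; the helper names are hypothetical, and this is not the proc_dobool implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lay out a 32-bit value in memory in the requested byte order. */
void store_int(uint8_t out[4], uint32_t value, bool big_endian)
{
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;

        out[i] = (uint8_t)(value >> shift);
    }
}

/* What a 1-byte bool at the start of that integer would read. */
bool first_byte_as_bool(const uint8_t bytes[4])
{
    return bytes[0] != 0;
}

/* The safe route: convert by value instead of reinterpreting memory. */
bool int_to_bool(uint32_t value)
{
    return value != 0;
}
```

On a big-endian layout the integer 1 puts its only non-zero byte last, so the bool at the first byte reads as false.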
Jia He
a2071573d6 sysctl: introduce new proc handler proc_dobool
This lets bool variables be displayed correctly in sysctl procfs on
both big- and little-endian systems. sizeof(bool) is arch dependent;
proc_dobool should work on all arches.

Suggested-by: Pan Xinhui <xinhui@linux.vnet.ibm.com>
Signed-off-by: Jia He <hejianet@gmail.com>
[thuth: rebased the patch to the current kernel version]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:53 -04:00
Chuck Lever
5c11720767 SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency()
Some paths through svc_process() leave rqst->rq_procinfo set to
NULL, which triggers a crash if tracing happens to be enabled.

Fixes: 89ff87494c ("SUNRPC: Display RPC procedure names instead of proc numbers")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:53 -04:00
NeilBrown
ea49dc7900 NFSD: remove vanity comments
Including one's name in copyright claims is appropriate.  Including it
in random comments is just vanity.  After 2 decades, it is time for
these to be gone.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:53 -04:00
Chuck Lever
07a92d009f svcrdma: Convert rdma->sc_rw_ctxts to llist
Relieve contention on sc_rw_ctxt_lock by converting rdma->sc_rw_ctxts
to an llist.

The goal is to reduce the average overhead of Send completions,
because a transport's completion handlers are single-threaded on
one CPU core. This change reduces CPU utilization of each Send
completion by 2-3% on my server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
2021-08-17 11:47:53 -04:00
Chuck Lever
b6c2bfea09 svcrdma: Relieve contention on sc_send_lock.
/proc/lock_stat indicates that the sc_send_lock is heavily
contended when the server is under load from a single client.

To address this, convert the send_ctxt free list to an llist.
Returning an item to the send_ctxt cache is now waitless, which
reduces the instruction path length in the single-threaded Send
handler (svc_rdma_wc_send).

The goal is to enable the ib_comp_wq worker to handle a higher
RPC/RDMA Send completion rate given the same CPU resources. This
change reduces CPU utilization of Send completion by 2-3% on my
server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
2021-08-17 11:47:53 -04:00
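The idea of returning items to a free list without taking a lock, as in b6c2bfea09, can be sketched with C11 atomics. This is a userspace analogy of a lock-less singly linked list, not the kernel's llist API; `lf_list`, `lf_push`, and `lf_take_all` are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
    struct node *next;
};

struct lf_list {
    _Atomic(struct node *) first;
};

/* Push one node onto the list head without holding a lock. */
void lf_push(struct lf_list *l, struct node *n)
{
    struct node *old = atomic_load(&l->first);

    do {
        n->next = old;   /* 'old' is refreshed when the CAS fails */
    } while (!atomic_compare_exchange_weak(&l->first, &old, n));
}

/* Detach the whole list in a single atomic exchange. */
struct node *lf_take_all(struct lf_list *l)
{
    return atomic_exchange(&l->first, NULL);
}
```

A producer never waits on a lock here; at worst it repeats the compare-and-exchange when another thread updated the head first.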
Chuck Lever
6c8c84f525 svcrdma: Fewer calls to wake_up() in Send completion handler
Because wake_up() takes an IRQ-safe lock, it can be expensive,
especially to call inside of a single-threaded completion handler.
What's more, the Send wait queue almost never has waiters, so
most of the time, this is an expensive no-op.

As always, the goal is to reduce the average overhead of each
completion, because a transport's completion handlers are single-
threaded on one CPU core. This change reduces CPU utilization of
the Send completion thread by 2-3% on my server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
2021-08-17 11:47:53 -04:00
Benjamin Coddington
cd2d644ddb lockd: Fix invalid lockowner cast after vfs_test_lock
After calling vfs_test_lock() the pointer to a conflicting lock can be
returned, and that lock is not guaranteed to be owned by nlm.  In that
case, we cannot cast it to struct nlm_lockowner.  Instead return the pid
of that conflicting lock.

Fixes: 646d73e91b ("lockd: Show pid of lockd for remote locks")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:52 -04:00
Chuck Lever
d27b74a867 NFSD: Use new __string_len C macros for nfsd_clid_class
Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:52 -04:00
Chuck Lever
408c0de706 NFSD: Use new __string_len C macros for the nfs_dirent tracepoint
Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:52 -04:00
Steven Rostedt (VMware)
883b4aee4d tracing: Add trace_event helper macros __string_len() and __assign_str_len()
There are a few cases where a string that is to be recorded in a trace
event does not have a terminating 'nul' character, and instead, the
tracepoint passes in the length of the string to record.

Add two helper macros to the trace event code that make this easier
than tricks with "%.*s" logic.

  __string_len() which is similar to __string() for declaration, but takes a
                 length argument.

  __assign_str_len() which is similar to __assign_str() for assigning the
                 string, but it too takes a length argument.

Note, the TRACE_EVENT() macro will allocate 'len + 1' bytes on the ring
buffer, which will be used to store the string. It is a requirement
that the 'len' used for this is at most the length of the string being
recorded.

This string can still be retrieved with __get_str(), just like strings
created with __string().

Link: https://lore.kernel.org/linux-nfs/20210513105018.7539996a@gandalf.local.home/

Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:52 -04:00
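The 'len + 1' storage rule from 883b4aee4d can be illustrated with a plain C analogy. This is not the tracing macros themselves; `record_str_len` is a hypothetical helper that copies a length-delimited string into a nul-terminated buffer.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Copy 'len' bytes from a source that may lack a terminating nul into
 * freshly allocated storage of 'len + 1' bytes, then terminate it.
 * The caller must ensure 'len' does not exceed the source length.
 */
char *record_str_len(const char *src, size_t len)
{
    char *dst = malloc(len + 1);

    if (!dst)
        return NULL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

After the copy, the stored string can be read like any ordinary C string, which is the property that lets __get_str() work unchanged.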
Chuck Lever
496d83cf0f NFSD: Batch release pages during splice read
Large splice reads call put_page() repeatedly. put_page() is
relatively expensive to call, so replace it with the new
svc_rqst_replace_page() helper to help amortize that cost.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
2021-08-17 11:47:52 -04:00
Chuck Lever
2f0f88f42f SUNRPC: Add svc_rqst_replace_page() API
Replacing a page in rq_pages[] requires a get_page(), which is a
bus-locked operation, and a put_page(), which can be even more
costly.

To reduce the cost of replacing a page in rq_pages[], batch the
put_page() operations by collecting "freed" pages in a pagevec,
and then release those pages when the pagevec is full. This
pagevec is also emptied when each RPC completes.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:52 -04:00
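The batching approach in 2f0f88f42f (collect freed pages, release them together when the container fills, and empty it at the end of each request) can be sketched as below. The struct, the function names, and the capacity of 15 are assumptions made for this illustration, not the pagevec API.

```c
#include <stddef.h>

#define BATCH_SIZE 15   /* arbitrary capacity chosen for this sketch */

struct release_batch {
    void *items[BATCH_SIZE];
    size_t count;     /* items currently waiting for release */
    size_t flushes;   /* how many bulk releases have happened */
};

/* Release everything collected so far in one go. */
void batch_flush(struct release_batch *b)
{
    if (b->count == 0)
        return;
    /* A real implementation would release each b->items[i] here. */
    b->count = 0;
    b->flushes++;
}

/* Queue one item; release in bulk only when the batch is full. */
void batch_add(struct release_batch *b, void *item)
{
    b->items[b->count++] = item;
    if (b->count == BATCH_SIZE)
        batch_flush(b);
}
```

Queuing 31 items triggers two bulk releases and leaves one item pending, which a final `batch_flush()` at request completion would pick up.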
Chuck Lever
c7e0b781b7 NFSD: Clean up splice actor
A few useful observations:

 - The value in @size is never modified.

 - splice_desc.len is an unsigned int, and so is xdr_buf.page_len.
   An implicit cast to size_t is unnecessary.

 - The computation of .page_len is the same in all three arms
   of the "if" statement, so hoist it out to make it clear that
   the operation is an unconditional invariant.

The resulting function is 18 bytes shorter on my system (-Os).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
2021-08-17 11:47:52 -04:00
Linus Torvalds
7c60610d47 Linux 5.14-rc6
2021-08-15 13:40:53 -10:00
Linus Torvalds
ecf9343196 powerpc fixes for 5.14 #5
- Fix crashes coming out of nap on 32-bit Book3s (eg. powerbooks).
- Fix critical and debug interrupts on BookE, seen as crashes when using ptrace.
- Fix an oops when running an SMP kernel on a UP system.
- Update pseries LPAR security flavor after partition migration.
- Fix an oops when using kprobes on BookE.
- Fix oops on 32-bit pmac by not calling do_IRQ() from timer_interrupt().
- Fix softlockups on CPU hotplug into a CPU-less node with xive (P9).

Thanks to: Cédric Le Goater, Christophe Leroy, Finn Thain, Geetika Moolchandani, Laurent
Dufour, Laurent Vivier, Nicholas Piggin, Pu Lehui, Radu Rendec, Srikar Dronamraju, Stan
Johnson.

Merge tag 'powerpc-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix crashes coming out of nap on 32-bit Book3s (eg. powerbooks).

 - Fix critical and debug interrupts on BookE, seen as crashes when
   using ptrace.

 - Fix an oops when running an SMP kernel on a UP system.

 - Update pseries LPAR security flavor after partition migration.

 - Fix an oops when using kprobes on BookE.

 - Fix oops on 32-bit pmac by not calling do_IRQ() from
   timer_interrupt().

 - Fix softlockups on CPU hotplug into a CPU-less node with xive (P9).

Thanks to Cédric Le Goater, Christophe Leroy, Finn Thain, Geetika
Moolchandani, Laurent Dufour, Laurent Vivier, Nicholas Piggin, Pu Lehui,
Radu Rendec, Srikar Dronamraju, and Stan Johnson.

* tag 'powerpc-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/xive: Do not skip CPU-less nodes when creating the IPIs
  powerpc/interrupt: Do not call single_step_exception() from other exceptions
  powerpc/interrupt: Fix OOPS by not calling do_IRQ() from timer_interrupt()
  powerpc/kprobes: Fix kprobe Oops happens in booke
  powerpc/pseries: Fix update of LPAR security flavor after LPM
  powerpc/smp: Fix OOPS in topology_init()
  powerpc/32: Fix critical and debug interrupts on BOOKE
  powerpc/32s: Fix napping restore in data storage interrupt (DSI)
2021-08-15 06:57:43 -10:00
Linus Torvalds
c4f14eac22 A set of fixes for PCI/MSI and x86 interrupt startup:
- Mask all MSI-X entries when enabling MSI-X otherwise stale unmasked
  entries stay around e.g. when a crashkernel is booted.

- Enforce masking of a MSI-X table entry when updating it, which is
  mandatory according to the specification

- Ensure that writes to MSI[-X] tables are flushed.

- Prevent invalid bits being set in the MSI mask register

- Properly serialize modifications to the mask cache and the mask register
  for multi-MSI.

- Cure the violation of the affinity setting rules on X86 during interrupt
  startup which can cause lost and stale interrupts. Move the initial
  affinity setting ahead of actually enabling the interrupt.

- Ensure that MSI interrupts are completely torn down before freeing them
  in the error handling case.

- Prevent an array out of bounds access in the irq timings code.

Merge tag 'irq-urgent-2021-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A set of fixes for PCI/MSI and x86 interrupt startup:

   - Mask all MSI-X entries when enabling MSI-X otherwise stale unmasked
     entries stay around e.g. when a crashkernel is booted.

   - Enforce masking of a MSI-X table entry when updating it, which is
     mandatory according to the specification

   - Ensure that writes to MSI[-X] tables are flushed.

   - Prevent invalid bits being set in the MSI mask register

   - Properly serialize modifications to the mask cache and the mask
     register for multi-MSI.

   - Cure the violation of the affinity setting rules on X86 during
     interrupt startup which can cause lost and stale interrupts. Move
     the initial affinity setting ahead of actually enabling the
     interrupt.

   - Ensure that MSI interrupts are completely torn down before freeing
     them in the error handling case.

   - Prevent an array out of bounds access in the irq timings code"

* tag 'irq-urgent-2021-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  driver core: Add missing kernel doc for device::msi_lock
  genirq/msi: Ensure deactivation on teardown
  genirq/timings: Prevent potential array overflow in __irq_timings_store()
  x86/msi: Force affinity setup before startup
  x86/ioapic: Force affinity setup before startup
  genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP
  PCI/MSI: Protect msi_desc::masked for multi-MSI
  PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown()
  PCI/MSI: Correct misleading comments
  PCI/MSI: Do not set invalid bits in MSI mask
  PCI/MSI: Enforce MSI[X] entry updates to be visible
  PCI/MSI: Enforce that MSI-X table entry is masked for update
  PCI/MSI: Mask all unused MSI-X entries
  PCI/MSI: Enable and mask MSI-X early
2021-08-15 06:49:40 -10:00
Linus Torvalds
839da25385 - Fix a CONFIG symbol's spelling

Merge tag 'locking_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:

 - Fix a CONFIG symbol's spelling

* tag 'locking_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Use the correct rtmutex debugging config option
2021-08-15 06:46:04 -10:00
Linus Torvalds
12aef8acf0 Ard says:
A batch of fixes for the arm64 stub image loader:
 
 - fix a logic bug that can make the random page allocator fail
   spuriously
 - force reallocation of the Image when it overlaps with firmware
   reserved memory regions
 - fix an oversight that defeated an optimization introduced earlier
   where images loaded at a suitable offset are never moved if booting
   without randomization
 - complain about images that were not loaded at the right offset by the
   firmware image loader.

Merge tag 'efi_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI fixes from Borislav Petkov:
 "A batch of fixes for the arm64 stub image loader:

   - fix a logic bug that can make the random page allocator fail
     spuriously

   - force reallocation of the Image when it overlaps with firmware
     reserved memory regions

   - fix an oversight that defeated an optimization introduced earlier
     where images loaded at a suitable offset are never moved if booting
     without randomization

   - complain about images that were not loaded at the right offset by
     the firmware image loader"

* tag 'efi_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/libstub: arm64: Double check image alignment at entry
  efi/libstub: arm64: Warn when efi_random_alloc() fails
  efi/libstub: arm64: Relax 2M alignment again for relocatable kernels
  efi/libstub: arm64: Force Image reallocation if BSS was not reserved
  arm64: efi: kaslr: Fix occasional random alloc (and boot) failure
2021-08-15 06:38:26 -10:00
Linus Torvalds
b045b8cc86 - An objdump checker fix to ignore parenthesized strings in the objdump
version
 
 - Fix resctrl default monitoring groups reporting when new subgroups get
   created

Merge tag 'x86_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
 "Two fixes:

   - An objdump checker fix to ignore parenthesized strings in the
     objdump version

   - Fix resctrl default monitoring groups reporting when new subgroups
     get created"

* tag 'x86_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix default monitoring groups reporting
  x86/tools: Fix objdump version check again
2021-08-15 06:30:24 -10:00
Linus Torvalds
3e763ec791 ARM:
- Plug race between enabling MTE and creating vcpus
 
 - Fix off-by-one bug when checking whether an address range is RAM
 
 x86:
 
 - Fixes for the new MMU, especially a memory leak on hosts with <39
   physical address bits
 
 - Remove bogus EFER.NX checks on 32-bit non-PAE hosts
 
 - WAITPKG fix

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "ARM:

   - Plug race between enabling MTE and creating vcpus

   - Fix off-by-one bug when checking whether an address range is RAM

  x86:

   - Fixes for the new MMU, especially a memory leak on hosts with <39
     physical address bits

   - Remove bogus EFER.NX checks on 32-bit non-PAE hosts

   - WAITPKG fix"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
  KVM: x86/mmu: Don't step down in the TDP iterator when zapping all SPTEs
  KVM: x86/mmu: Don't leak non-leaf SPTEs when zapping all SPTEs
  KVM: nVMX: Use vmx_need_pf_intercept() when deciding if L0 wants a #PF
  kvm: vmx: Sync all matching EPTPs when injecting nested EPT fault
  KVM: x86: remove dead initialization
  KVM: x86: Allow guest to set EFER.NX=1 on non-PAE 32-bit kernels
  KVM: VMX: Use current VMCS to query WAITPKG support for MSR emulation
  KVM: arm64: Fix race when enabling KVM_ARM_CAP_MTE
  KVM: arm64: Fix off-by-one in range_is_memory
2021-08-15 06:21:30 -10:00
Linus Torvalds
0aa78d1709 SCSI fixes on 20210814
Three minor fixes, all in drivers.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Three minor fixes, all in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: mpt3sas: Fix incorrectly assigned error return and check
  scsi: storvsc: Log TEST_UNIT_READY errors as warnings
  scsi: lpfc: Move initialization of phba->poll_list earlier to avoid crash
2021-08-14 19:51:58 -10:00
Linus Torvalds
7ba34c0cba libnvdimm fixes for v5.14-rc6
 - Fix support for NFIT "virtual" ranges (BIOS-defined memory disks)
 
 - Fix recovery from failed label storage areas on NVDIMM devices
 
 - Miscellaneous cleanups from Ira's investigation of dax_direct_access
   paths preparing for stray-write protection.

Merge tag 'libnvdimm-fixes-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
 "A couple of fixes for long-standing bugs, a warning fixup, and some
  miscellaneous dax cleanups.

  The bugs were recently found due to new platforms looking to use the
  ACPI NFIT "virtual" device definition, and new error injection
  capabilities to trigger error responses to label area requests. Ira's
  cleanups have been long pending; I neglected to send them earlier and
  see no harm in including them now. This has all appeared in -next with
  no reported issues.

  Summary:

   - Fix support for NFIT "virtual" ranges (BIOS-defined memory disks)

   - Fix recovery from failed label storage areas on NVDIMM devices

   - Miscellaneous cleanups from Ira's investigation of
     dax_direct_access paths preparing for stray-write protection"

* tag 'libnvdimm-fixes-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  tools/testing/nvdimm: Fix missing 'fallthrough' warning
  libnvdimm/region: Fix label activation vs errors
  ACPI: NFIT: Fix support for virtual SPA ranges
  dax: Ensure errno is returned from dax_direct_access
  fs/dax: Clarify nr_pages to dax_direct_access()
  fs/fuse: Remove unneeded kaddr parameter
2021-08-14 19:46:39 -10:00
Linus Torvalds
12f41321ce USB fix for 5.14-rc6
Here is a single revert for 5.14-rc6 of a commit that caused problems in
 5.14-rc5.  It has been in linux-next almost all week, and has resolved
 the issues that were reported on lots of different systems that were not
 the platform that the change was originally tested on (gotta love SoC
 cores used in multiple devices from multiple vendors...)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'usb-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fix from Greg KH:
 "A single revert for 5.14-rc6 of a commit that caused problems in
  5.14-rc5. It has been in linux-next almost all week, and has resolved
  the issues that were reported on lots of different systems that were
  not the platform that the change was originally tested on (gotta love
  SoC cores used in multiple devices from multiple vendors...)"

* tag 'usb-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  Revert "usb: dwc3: gadget: Use list_replace_init() before traversing lists"
2021-08-14 19:22:33 -10:00
Linus Torvalds
56aee57345 IIO fixes for 5.14-rc6
Here are some small IIO driver fixes for reported problems for 5.14-rc6
 (no staging driver fixes at the moment).
 
 All of them resolve reported issues and have been in linux-next all week
 with no reported problems.  Full details are in the shortlog.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'staging-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull IIO driver fixes from Greg KH:
 "Here are some small IIO driver fixes for reported problems for
  5.14-rc6 (no staging driver fixes at the moment).

  All of them resolve reported issues and have been in linux-next all
  week with no reported problems. Full details are in the shortlog"

* tag 'staging-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: adc: Fix incorrect exit of for-loop
  iio: humidity: hdc100x: Add margin to the conversion time
  dt-bindings: iio: st: Remove wrong items length check
  iio: accel: fxls8962af: fix i2c dependency
  iio: adis: set GPIO reset pin direction
  iio: adc: ti-ads7950: Ensure CS is deasserted after reading channels
  iio: accel: fxls8962af: fix potential use of uninitialized symbol
2021-08-14 19:16:30 -10:00
Linus Torvalds
76c9e465dd Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "One driver bugfix, a documentation bugfix, and an "uninitialized data"
  leak fix for the core"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  Documentation: i2c: add i2c-sysfs into index
  i2c: dev: zero out array used for i2c reads from userspace
  i2c: iproc: fix race between client unreg and tasklet
2021-08-14 18:59:53 -10:00