Commit Graph

251 Commits

Author SHA1 Message Date
Chuck Lever
38f080f3cd NFSD: Move callback_wq into struct nfs4_client
Commit 8838203667 ("nfsd: update workqueue creation") made the
callback_wq single-threaded, presumably to protect modifications of
cl_cb_client. See documenting comment for nfsd4_process_cb_update().

However, cl_cb_client is per-lease. There's no other reason that all
callback operations need to be dispatched via a single thread. The
single threading here means all client callbacks can be blocked by a
problem with one client.

Change the NFSv4 callback client so it serializes per-lease instead
of serializing all NFSv4 callback operations on the server.

Reported-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-05-06 09:07:16 -04:00
Linus Torvalds
e33c4963bf nfsd-6.9 fixes:
- Revert some backchannel fixes that went into v6.9-rc
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmYqZecACgkQM2qzM29m
 f5fOPhAAlxsB8ThAAyg+vOMeR6lxutEjPzez+U4M3N1s8dj9eMmcx8s6DPxsvfTS
 ExhcqxauCEwh9q+yZQ5ismWiE43Gvf39tvKGXzH30oM4v8xYo9nzxi6w9mEGyqR+
 bkpbvbpuCfSeVY4WXD2rFI8DMh1G8PM9llue2QVLWPny2y5K2gAsbinpV/QQtdgp
 LEwhnGcX2ZmSh9yd7aGiOOO18CHU9rMygcb4XDzmwWggO8eCrElWbp4x8yuZJFAD
 OzuIx00WYny00rlMpZUyZ67hhg0+KcGQVldhr+A8f0K1baFCfW4QDq2T3RGTonoe
 DWdDGW7zwBbBw/rg4NUhdIZ9CArml9bB8ohPsaGr99e/95mtpSnXoEDJvKYgsFVE
 QembV7zPZqzEjtrR6g/tdf8/g1yRRwUMLze+RLjzz70TohLKvj7+qHCHirNMCMVE
 +HXvEdPKCzdZ24YST2wDh6sQ571ka6TX8fL6gYU8O+w+fiitLkEVXGE7KDJvynkm
 bsoJxTtbTuFleaW79d51VpIuiW1uyqKJrD1RnwDlMpp+HDtUnoVXU5SfGnmMp4KG
 Tz9Hx7hgFthXpajDpyzgLMnYVJ8IvXnoQaOp97mqFe/jSktXfslDdsj/R5ZAjZHO
 SfViZ8hjyoxrna7bPh250tSCb/S3qljhsIOc/3tAl8nPWGLoV28=
 =O2Ox
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Revert some backchannel fixes that went into v6.9-rc

* tag 'nfsd-6.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  Revert "NFSD: Convert the callback workqueue to use delayed_work"
  Revert "NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down"
2024-04-25 09:31:06 -07:00
Chuck Lever
8ddb7142c8 Revert "NFSD: Convert the callback workqueue to use delayed_work"
This commit was a pre-requisite for commit c1ccfcf1a9 ("NFSD:
Reschedule CB operations when backchannel rpc_clnt is shut down"),
which has already been reverted.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-23 20:12:41 -04:00
Chuck Lever
9c8ecb9308 Revert "NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down"
The reverted commit attempted to enable NFSD to retransmit pending
callback operations if an NFS client disconnects, but
unintentionally introduces a hazardous behavior regression if the
client becomes permanently unreachable while callback operations are
still pending.

A disconnect can occur due to network partition or if the NFS server
needs to force the NFS client to retransmit (for example, if a GSS
window under-run occurs).

Reverting the commit will make NFSD behave the same as it did in
v6.8 and before. Pending callback operations are permanently lost if
the client connection is terminated before the client receives them.

For some callback operations, this loss is not harmful.

However, for CB_RECALL, the loss means a delegation might be revoked
unnecessarily. For CB_OFFLOAD, pending COPY operations will never
complete unless the NFS client subsequently sends an OFFLOAD_STATUS
operation, which the Linux NFS client does not currently implement.

These issues still need to be addressed somehow.

Reported-by: Dai Ngo <dai.ngo@oracle.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218735
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-23 20:12:34 -04:00
Linus Torvalds
a01c9fe323 NFSD 6.9 Release Notes
The bulk of the patches for this release are optimizations, code
 clean-ups, and minor bug fixes.
 
 One new feature to mention is that NFSD administrators now have the
 ability to revoke NFSv4 open and lock state. NFSD's NFSv3 support
 has had this capability for some time.
 
 As always I am grateful to NFSD contributors, reviewers, and
 testers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmXwV4QACgkQM2qzM29m
 f5c7cg/8CRe0mGbeEMonoSycBjANDuiRolCM+DhVccUvSyWPqf4blF5yrNHcf5zN
 WmjQHVXIJUMVpLovcakj+4aBIuXGgdSmBJamFTy9fVfcFadiWYRceNgMMXpLMDDI
 fMAszRUyfL/r0Evj0Zajt86R5/gGn+W9X6HlDc1k7VV0Z+fzRw9WMxADy11cgHLp
 mh2bzyPmwu0EfBYlWNWLqzWVZm1C5UCGnlInyr0KXImCLOkpJqAVXTDvDkGFW2Qw
 1kJhodyabf6fRV2ZqPjLUuR4aRqABey83rB0N5z7MumO/dJUBW3CHR3uNMqvkmh3
 XevI8bPzS2Kypijcx7dONtkDWwU+fsvCdepNpmVDB73B19BFiLG+HDbMypJ0dmp+
 rvvfILRDCmIb+FA1DUeT3lIc6ac1f1+qAVc7hi3E7rGctEJWeHDsZg+E1PuTvpxM
 3XfRaFnucY5vwyiB2/uI4eblBHcVXoKho+pUqQMegLPRbgsEUyFUfg3+ZMtntagd
 OVUXvWYIARP97HNh0J5ChcGI72UpXtFWMlbbiTiCzYx4FeiCffeczIERXNJ4FYAg
 fKUaiBhdAN1PPFCRXJORZ5XlSIeZttUNSJUPfmuOpkscMdkpRUIhuEUYo9K8/1eL
 O+YZeGW/kTG+llxOERfEHJoekLf1TgGdU7oBmTIgQIK03hTUih8=
 =75G4
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "The bulk of the patches for this release are optimizations, code
  clean-ups, and minor bug fixes.

  One new feature to mention is that NFSD administrators now have the
  ability to revoke NFSv4 open and lock state. NFSD's NFSv3 support has
  had this capability for some time.

  As always I am grateful to NFSD contributors, reviewers, and testers"

* tag 'nfsd-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (75 commits)
  NFSD: Clean up nfsd4_encode_replay()
  NFSD: send OP_CB_RECALL_ANY to clients when number of delegations reaches its limit
  NFSD: Document nfsd_setattr() fill-attributes behavior
  nfsd: Fix NFSv3 atomicity bugs in nfsd_setattr()
  nfsd: Fix a regression in nfsd_setattr()
  NFSD: OP_CB_RECALL_ANY should recall both read and write delegations
  NFSD: handle GETATTR conflict with write delegation
  NFSD: add support for CB_GETATTR callback
  NFSD: Document the phases of CREATE_SESSION
  NFSD: Fix the NFSv4.1 CREATE_SESSION operation
  nfsd: clean up comments over nfs4_client definition
  svcrdma: Add Write chunk WRs to the RPC's Send WR chain
  svcrdma: Post WRs for Write chunks in svc_rdma_sendto()
  svcrdma: Post the Reply chunk and Send WR together
  svcrdma: Move write_info for Reply chunks into struct svc_rdma_send_ctxt
  svcrdma: Post Send WR chain
  svcrdma: Fix retry loop in svc_rdma_send()
  svcrdma: Prevent a UAF in svc_rdma_send()
  svcrdma: Fix SQ wake-ups
  svcrdma: Increase the per-transport rw_ctx count
  ...
2024-03-12 14:27:37 -07:00
Dai Ngo
6487a13b5c NFSD: add support for CB_GETATTR callback
Includes:
   . CB_GETATTR proc for nfs4_cb_procedures[]
   . XDR encoding and decoding function for CB_GETATTR request/reply
   . add nfs4_cb_fattr to nfs4_delegation for sending CB_GETATTR
     and store file attributes from client's reply.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:31 -05:00
Chuck Lever
e3179e443c NFSD: Remove redundant cb_seq_status initialization
As far as I can see, setting cb_seq_status in nfsd4_init_cb() is
superfluous because it is set again in nfsd4_cb_prepare().

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:17 -05:00
Chuck Lever
84ebf02d75 NFSD: Remove BUG_ON in nfsd4_process_cb_update()
Don't kill the kworker thread, and don't panic while cl_lock is
held. There's no need for scorching the earth here.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:16 -05:00
Chuck Lever
112bdd597d NFSD: Replace comment with lockdep assertion
Convert a code comment into a real assertion.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:16 -05:00
Chuck Lever
627fb18302 NFSD: Remove unused @reason argument
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:15 -05:00
Chuck Lever
9a026aec88 NFSD: Add callback operation lifetime trace points
Help observe the flow of callback operations.

bc_shutdown() records exactly when the backchannel RPC client is
destroyed and cl_cb_client is replaced with NULL.

Examples include:

         nfsd-955   [004]   650.013997: nfsd_cb_queue:        addr=192.168.122.6:0 client 65b3c5b8:f541f749 cb=0xffff8881134b02f8 (first try)
kworker/u21:4-497   [004]   650.014050: nfsd_cb_seq_status:   task:00000001@00000001 sessionid=65b3c5b8:f541f749:00000001:00000000 tk_status=-107 seq_status=1
kworker/u21:4-497   [004]   650.014051: nfsd_cb_restart:      addr=192.168.122.6:0 client 65b3c5b8:f541f749 cb=0xffff88810e39f400 (first try)
kworker/u21:4-497   [004]   650.014066: nfsd_cb_queue:        addr=192.168.122.6:0 client 65b3c5b8:f541f749 cb=0xffff88810e39f400 (need restart)


kworker/u16:0-10    [006]   650.065750: nfsd_cb_start:        addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=UNKNOWN
kworker/u16:0-10    [006]   650.065752: nfsd_cb_bc_update:    addr=192.168.122.6:0 client 65b3c5b8:f541f749 cb=0xffff8881134b02f8 (first try)
kworker/u16:0-10    [006]   650.065754: nfsd_cb_bc_shutdown:  addr=192.168.122.6:0 client 65b3c5b8:f541f749 cb=0xffff8881134b02f8 (first try)
kworker/u16:0-10    [006]   650.065810: nfsd_cb_new_state:    addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=DOWN

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:15 -05:00
Chuck Lever
a7cdf065e1 NFSD: Rename nfsd_cb_state trace point
Make it clear where backchannel state is updated.

Example trace point output:

kworker/u16:0-10    [006]  2800.080404: nfsd_cb_new_state:    addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=UP
         nfsd-940   [003]  2800.478368: nfsd_cb_new_state:    addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=UNKNOWN
kworker/u16:0-10    [003]  2800.478828: nfsd_cb_new_state:    addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=DOWN

kworker/u16:0-10    [005]  2802.039724: nfsd_cb_start:        addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=UP
kworker/u16:0-10    [005]  2810.611452: nfsd_cb_start:        addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=FAULT
kworker/u16:0-10    [005]  2810.616832: nfsd_cb_start:        addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=UNKNOWN
kworker/u16:0-10    [005]  2810.616931: nfsd_cb_start:        addr=192.168.122.6:0 client 65b3c5b8:f541f749 state=DOWN

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:14 -05:00
Chuck Lever
8626664c87 NFSD: Replace dprintks in nfsd4_cb_sequence_done()
Improve observability of backchannel session operation.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:14 -05:00
Chuck Lever
43b02dba11 NFSD: Retransmit callbacks after client reconnects
NFSv4.1 clients assume that if they disconnect, that will force the
server to resend pending callback operations once a fresh connection
has been established.

Turns out NFSD has not been resending after reconnect.

Fixes: 7ba6cad6c8 ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:13 -05:00
Chuck Lever
c1ccfcf1a9 NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down
As part of managing a client disconnect, NFSD closes down and
replaces the backchannel rpc_clnt.

If a callback operation is pending when the backchannel rpc_clnt is
shut down, currently nfsd4_run_cb_work() just discards that
callback. But there are multiple cases to deal with here:

 o The client's lease is getting destroyed. Throw the CB away.

 o The client disconnected. It might be forcing a retransmit of
   CB operations, or it could have disconnected for other reasons.
   Reschedule the CB so it is retransmitted when the client
   reconnects.

Since callback operations can now be rescheduled, ensure that
cb_ops->prepare can be called only once by moving the
cb_ops->prepare paragraph down to just before the rpc_call_async()
call.

Fixes: 2bbfed98a4 ("nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:12 -05:00
Chuck Lever
fe0e9580e2 NFSD: Convert the callback workqueue to use delayed_work
Normally, NFSv4 callback operations are supposed to be sent to the
client as soon as they are queued up.

In a moment, I will introduce a recovery path where the server has
to wait for the client to reconnect. We don't want a hard busy wait
here -- the callback should be requeued to try again in several
milliseconds.

For now, convert nfsd4_callback from struct work_struct to struct
delayed_work, and queue with a zero delay argument. This should
avoid behavior changes for current operation.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:12 -05:00
Chuck Lever
961b4b5e86 NFSD: Reset cb_seq_status after NFS4ERR_DELAY
I noticed that once an NFSv4.1 callback operation gets a
NFS4ERR_DELAY status on CB_SEQUENCE and then the connection is lost,
the callback client loops, resending it indefinitely.

The switch arm in nfsd4_cb_sequence_done() that handles
NFS4ERR_DELAY uses rpc_restart_call() to rearm the RPC state machine
for the retransmit, but that path does not call the rpc_prepare_call
callback again. Thus cb_seq_status is set to -10008 by the first
NFS4ERR_DELAY result, but is never set back to 1 for the retransmits.

nfsd4_cb_sequence_done() thinks it's getting nothing but a
long series of CB_SEQUENCE NFS4ERR_DELAY replies.

Fixes: 7ba6cad6c8 ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01 09:12:12 -05:00
Jeff Layton
05580bbfc6
nfsd: adapt to breakup of struct file_lock
Most of the existing APIs have remained the same, but subsystems that
access file_lock fields directly need to reach into struct
file_lock_core now.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20240131-flsplit-v3-42-c6129007ee8d@kernel.org
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-05 13:11:43 +01:00
ChenXiaoSong
52e8910075 NFSv4, NFSD: move enum nfs_cb_opnum4 to include/linux/nfs4.h
Callback operations enum is defined in client and server, move it to
common header file.

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 17:54:26 -05:00
Chuck Lever
1227561c2f NFSD: Revert 738401a9bd
There's nothing wrong with this commit, but this is dead code now
that nothing triggers a CB_GETATTR callback. It can be re-introduced
once the issues with handling conflicting GETATTRs are resolved.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-18 11:22:19 -05:00
Dai Ngo
738401a9bd NFSD: add support for CB_GETATTR callback
Includes:
   . CB_GETATTR proc for nfs4_cb_procedures[]
   . XDR encoding and decoding function for CB_GETATTR request/reply
   . add nfs4_cb_fattr to nfs4_delegation for sending CB_GETATTR
     and store file attributes from client's reply.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16 12:44:08 -04:00
Dai Ngo
7de82c2f36 NFSD: callback request does not use correct credential for AUTH_SYS
Currently callback request does not use the credential specified in
CREATE_SESSION if the security flavor for the back channel is AUTH_SYS.

Problem was discovered by pynfs 4.1 DELEG5 and DELEG7 test with error:
DELEG5   st_delegation.testCBSecParms     : FAILURE
           expected callback with uid, gid == 17, 19, got 0, 0

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Fixes: 8276c902bb ("SUNRPC: remove uid and gid from struct auth_cred")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-04-04 09:55:27 -04:00
Linus Torvalds
aeba12b26c nfsd-6.2 supplement:
- Address numerous reports of refcount underflows in NFSD's filecache
 - Address a UAF in callback setup error handling
 - Address a UAF during server-to-server copy
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmOgbwoACgkQM2qzM29m
 f5f2ew/+OWN7A+cPw5mSY7mfeVc7yb8la7X+dfENrDnypwHj5HP+xm1X9HqB6+Ci
 IVY+0rv1VLMAb+SWXJeK77plNAtIbbpvyqoAzV504wiPJIe70X4iW9fYWYdPwaQE
 PfJADcH+Rvcm5Pg8Gfvmkkij69O6oXUbtLIknfI/U5KSZuDTMsROeH1lMnH7TtOl
 DQ8I2/QKm8C8yFWL0VhUeOLRupND9A0eFw4R+SN9r+cEYT+3daGMWHe9HVekTv/D
 r1hbh5BmHMaU9cN72qQZhylQJ80MSabDmeLuQXL0AFy/FolJF7qr5hJGvh4A6pZW
 n1zIhxXoPHn6jtpE481qAdDMEsFB794oCJqsFZMHwa5OF/xosJ4exjd8lHO7HEvQ
 wdvVpQeHS9Gf+h4IZ0mhu7lW4AdOfLe46Y/LGM2AAcUPSXvViK0leMTqhwVzmiZw
 sOOvEHl8YDOTg4leodYzZ3XLflPiET1My+5GgsuTmr9wkwiMvhj2mNAYVB32H0sI
 zC5XaWLye+6np6tiLkhOTxDtI4K+zX9h6f6zBxPr2+u4TNH76UmPAsKTwV37kYD+
 HfXhYLmJbl6YMnmQbDBeDFkG3mJn1xdoZIoXBb15trYLQDakOdA0Q1SnSSjlM/sD
 hWiByiLR8cE3+jut8dGOddhpKQi3zqBqr2p4tIQpjJ926GV1Ung=
 =STtE
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull more nfsd updates from Chuck Lever:
 "This contains a number of crasher fixes that were not ready for the
  initial pull request last week.

  In particular, Jeff's patch attempts to address reference count
  underflows in NFSD's filecache, which have been very difficult to
  track down because there is no reliable reproducer.

  Common failure modes:
      https://bugzilla.kernel.org/show_bug.cgi?id=216691#c11
      https://bugzilla.kernel.org/show_bug.cgi?id=216674#c6
      https://bugzilla.redhat.com/show_bug.cgi?id=2138605

  The race windows were found by inspection and the clean-ups appear
  sensible and pass regression testing, so we include them here in the
  hope that they address the problem. However we remain vigilant because
  we don't have 100% certainty yet that the problem is fully addressed.

  Summary:

   - Address numerous reports of refcount underflows in NFSD's filecache

   - Address a UAF in callback setup error handling

   - Address a UAF during server-to-server copy"

* tag 'nfsd-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: fix use-after-free in __nfs42_ssc_open()
  nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure
  nfsd: rework refcounting in filecache
2022-12-19 09:10:33 -06:00
Linus Torvalds
48ea09cdda hardening updates for v6.2-rc1
- Convert flexible array members, fix -Wstringop-overflow warnings,
   and fix KCFI function type mismatches that went ignored by
   maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook).
 
 - Remove the remaining side-effect users of ksize() by converting
   dma-buf, btrfs, and coredump to using kmalloc_size_roundup(),
   add more __alloc_size attributes, and introduce full testing
   of all allocator functions. Finally remove the ksize() side-effect
   so that each allocation-aware checker can finally behave without
   exceptions.
 
 - Introduce oops_limit (default 10,000) and warn_limit (default off)
   to provide greater granularity of control for panic_on_oops and
   panic_on_warn (Jann Horn, Kees Cook).
 
 - Introduce overflows_type() and castable_to_type() helpers for
   cleaner overflow checking.
 
 - Improve code generation for strscpy() and update str*() kern-doc.
 
 - Convert strscpy and sigphash tests to KUnit, and expand memcpy
   tests.
 
 - Always use a non-NULL argument for prepare_kernel_cred().
 
 - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell).
 
 - Adjust orphan linker section checking to respect CONFIG_WERROR
   (Xin Li).
 
 - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu).
 
 - Fix um vs FORTIFY warnings for always-NULL arguments.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOZSOoWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjAAD/0YkvpU7f03f8hcQMJK6wv//24K
 AW41hEaBikq9RcmkuvkLLrJRibGgZ5O2xUkUkxRs/HxhkhrZ0kEw8sbwZe8MoWls
 F4Y9+TDjsrdHmjhfcBZdLnVxwcKK5wlaEcpjZXtbsfcdhx3TbgcDA23YELl5t0K+
 I11j4kYmf9SLl4CwIrSP5iACml8CBHARDh8oIMF7FT/LrjNbM8XkvBcVVT6hTbOV
 yjgA8WP2e9GXvj9GzKgqvd0uE/kwPkVAeXLNFWopPi4FQ8AWjlxbBZR0gamA6/EB
 d7TIs0ifpVU2JGQaTav4xO6SsFMj3ntoUI0qIrFaTxZAvV4KYGrPT/Kwz1O4SFaG
 rN5lcxseQbPQSBTFNG4zFjpywTkVCgD2tZqDwz5Rrmiraz0RyIokCN+i4CD9S0Ds
 oEd8JSyLBk1sRALczkuEKo0an5AyC9YWRcBXuRdIHpLo08PsbeUUSe//4pe303cw
 0ApQxYOXnrIk26MLElTzSMImlSvlzW6/5XXzL9ME16leSHOIfDeerPnc9FU9Eb3z
 ODv22z6tJZ9H/apSUIHZbMciMbbVTZ8zgpkfydr08o87b342N/ncYHZ5cSvQ6DWb
 jS5YOIuvl46/IhMPT16qWC8p0bP5YhxoPv5l6Xr0zq0ooEj0E7keiD/SzoLvW+Qs
 AHXcibguPRQBPAdiPQ==
 =yaaN
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kernel hardening updates from Kees Cook:

 - Convert flexible array members, fix -Wstringop-overflow warnings, and
   fix KCFI function type mismatches that went ignored by maintainers
   (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook)

 - Remove the remaining side-effect users of ksize() by converting
   dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add
   more __alloc_size attributes, and introduce full testing of all
   allocator functions. Finally remove the ksize() side-effect so that
   each allocation-aware checker can finally behave without exceptions

 - Introduce oops_limit (default 10,000) and warn_limit (default off) to
   provide greater granularity of control for panic_on_oops and
   panic_on_warn (Jann Horn, Kees Cook)

 - Introduce overflows_type() and castable_to_type() helpers for cleaner
   overflow checking

 - Improve code generation for strscpy() and update str*() kern-doc

 - Convert strscpy and sigphash tests to KUnit, and expand memcpy tests

 - Always use a non-NULL argument for prepare_kernel_cred()

 - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell)

 - Adjust orphan linker section checking to respect CONFIG_WERROR (Xin
   Li)

 - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu)

 - Fix um vs FORTIFY warnings for always-NULL arguments

* tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits)
  ksmbd: replace one-element arrays with flexible-array members
  hpet: Replace one-element array with flexible-array member
  um: virt-pci: Avoid GCC non-NULL warning
  signal: Initialize the info in ksignal
  lib: fortify_kunit: build without structleak plugin
  panic: Expose "warn_count" to sysfs
  panic: Introduce warn_limit
  panic: Consolidate open-coded panic_on_warn checks
  exit: Allow oops_limit to be disabled
  exit: Expose "oops_count" to sysfs
  exit: Put an upper limit on how often we can oops
  panic: Separate sysctl logic from CONFIG_SMP
  mm/pgtable: Fix multiple -Wstringop-overflow warnings
  mm: Make ksize() a reporting-only function
  kunit/fortify: Validate __alloc_size attribute results
  drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
  drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
  driver core: Add __alloc_size hint to devm allocators
  overflow: Introduce overflows_type() and castable_to_type()
  coredump: Proactively round up to kmalloc bucket size
  ...
2022-12-14 12:20:00 -08:00
Dan Aloni
3bc8edc98b nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure
On error situation `clp->cl_cb_conn.cb_xprt` should not be given
a reference to the xprt otherwise both client cleanup and the
error handling path of the caller call to put it. Better to
delay handing over the reference to a later branch.

[   72.530665] refcount_t: underflow; use-after-free.
[   72.531933] WARNING: CPU: 0 PID: 173 at lib/refcount.c:28 refcount_warn_saturate+0xcf/0x120
[   72.533075] Modules linked in: nfsd(OE) nfsv4(OE) nfsv3(OE) nfs(OE) lockd(OE) compat_nfs_ssc(OE) nfs_acl(OE) rpcsec_gss_krb5(OE) auth_rpcgss(OE) rpcrdma(OE) dns_resolver fscache netfs grace rdma_cm iw_cm ib_cm sunrpc(OE) mlx5_ib mlx5_core mlxfw pci_hyperv_intf ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nft_counter xt_addrtype nft_compat br_netfilter bridge stp llc nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set overlay nf_tables nfnetlink crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xfs serio_raw virtio_net virtio_blk net_failover failover fuse [last unloaded: sunrpc]
[   72.540389] CPU: 0 PID: 173 Comm: kworker/u16:5 Tainted: G           OE     5.15.82-dan #1
[   72.541511] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module+el8.7.0+1084+97b81f61 04/01/2014
[   72.542717] Workqueue: nfsd4_callbacks nfsd4_run_cb_work [nfsd]
[   72.543575] RIP: 0010:refcount_warn_saturate+0xcf/0x120
[   72.544299] Code: 55 00 0f 0b 5d e9 01 50 98 00 80 3d 75 9e 39 08 00 0f 85 74 ff ff ff 48 c7 c7 e8 d1 60 8e c6 05 61 9e 39 08 01 e8 f6 51 55 00 <0f> 0b 5d e9 d9 4f 98 00 80 3d 4b 9e 39 08 00 0f 85 4c ff ff ff 48
[   72.546666] RSP: 0018:ffffb3f841157cf0 EFLAGS: 00010286
[   72.547393] RAX: 0000000000000026 RBX: ffff89ac6231d478 RCX: 0000000000000000
[   72.548324] RDX: ffff89adb7c2c2c0 RSI: ffff89adb7c205c0 RDI: ffff89adb7c205c0
[   72.549271] RBP: ffffb3f841157cf0 R08: 0000000000000000 R09: c0000000ffefffff
[   72.550209] R10: 0000000000000001 R11: ffffb3f841157ad0 R12: ffff89ac6231d180
[   72.551142] R13: ffff89ac6231d478 R14: ffff89ac40c06180 R15: ffff89ac6231d4b0
[   72.552089] FS:  0000000000000000(0000) GS:ffff89adb7c00000(0000) knlGS:0000000000000000
[   72.553175] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   72.553934] CR2: 0000563a310506a8 CR3: 0000000109a66000 CR4: 0000000000350ef0
[   72.554874] Call Trace:
[   72.555278]  <TASK>
[   72.555614]  svc_xprt_put+0xaf/0xe0 [sunrpc]
[   72.556276]  nfsd4_process_cb_update.isra.11+0xb7/0x410 [nfsd]
[   72.557087]  ? update_load_avg+0x82/0x610
[   72.557652]  ? cpuacct_charge+0x60/0x70
[   72.558212]  ? dequeue_entity+0xdb/0x3e0
[   72.558765]  ? queued_spin_unlock+0x9/0x20
[   72.559358]  nfsd4_run_cb_work+0xfc/0x270 [nfsd]
[   72.560031]  process_one_work+0x1df/0x390
[   72.560600]  worker_thread+0x37/0x3b0
[   72.561644]  ? process_one_work+0x390/0x390
[   72.562247]  kthread+0x12f/0x150
[   72.562710]  ? set_kthread_struct+0x50/0x50
[   72.563309]  ret_from_fork+0x22/0x30
[   72.563818]  </TASK>
[   72.564189] ---[ end trace 031117b1c72ec616 ]---
[   72.566019] list_add corruption. next->prev should be prev (ffff89ac4977e538), but was ffff89ac4763e018. (next=ffff89ac4763e018).
[   72.567647] ------------[ cut here ]------------

Fixes: a4abc6b12e ("nfsd: Fix svc_xprt refcnt leak when setup callback client failed")
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-12 09:18:44 -05:00
Dai Ngo
3959066b69 NFSD: add support for sending CB_RECALL_ANY
Add XDR encode and decode function for CB_RECALL_ANY.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-10 11:01:12 -05:00
Kees Cook
5a17f040fa cred: Do not default to init_cred in prepare_kernel_cred()
A common exploit pattern for ROP attacks is to abuse prepare_kernel_cred()
in order to construct escalated privileges[1]. Instead of providing a
short-hand argument (NULL) to the "daemon" argument to indicate using
init_cred as the base cred, require that "daemon" is always set to
an actual task. Replace all existing callers that were passing NULL
with &init_task.

Future attacks will need to have sufficiently powerful read/write
primitives to have found an appropriately privileged task and written it
to the ROP stack as an argument to succeed, which is similarly difficult
to the prior effort needed to escalate privileges before struct cred
existed: locate the current cred and overwrite the uid member.

This has the added benefit of meaning that prepare_kernel_cred() can no
longer exceed the privileges of the init task, which may have changed from
the original init_cred (e.g. dropping capabilities from the bounding set).

[1] https://google.com/search?q=commit_creds(prepare_kernel_cred(0))

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Steve French <sfrench@samba.org>
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Russ Weight <russell.h.weight@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Link: https://lore.kernel.org/r/20221026232943.never.775-kees@kernel.org
2022-11-01 10:04:52 -07:00
Jeff Layton
b95239ca49 nfsd: make nfsd4_run_cb a bool return function
queue_work can return false and not queue anything, if the work is
already queued. If that happens in the case of a CB_RECALL, we'll have
taken an extra reference to the stid that will never be put. Ensure we
throw a warning in that case.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:50:57 -04:00
Chuck Lever
a11ada99ce NFSD: Move copy offload callback arguments into a separate structure
Refactor so that CB_OFFLOAD arguments can be passed without
allocating a whole struct nfsd4_copy object. On my system (x86_64)
this removes another 96 bytes from struct nfsd4_copy.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:17:00 -04:00
NeilBrown
d8b26071e6 NFSD: simplify struct nfsfh
Most of the fields in 'struct knfsd_fh' are 2 levels deep (a union and a
struct) and are accessed using macros like:

 #define fh_FOO fh_base.fh_new.fb_FOO

This patch makes the union and struct anonymous, so that "fh_FOO" can be
a name directly within 'struct knfsd_fh' and the #defines aren't needed.

The file handle as a whole is sometimes accessed as "fh_base" or
"fh_base.fh_pad", neither of which are particularly helpful names.
As the struct holding the filehandle is now anonymous, we
cannot use the name of that, so we union it with 'fh_raw' and use that
where the raw filehandle is needed.  fh_raw also ensure the structure is
large enough for the largest possible filehandle.

fh_raw is a 'char' array, removing any need to cast it for memcpy etc.

SVCFH_fmt() is simplified using the "%ph" printk format.  This
changes the appearance of filehandles in dprintk() debugging, making
them a little more precise.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-10-02 15:51:10 -04:00
J. Bruce Fields
05570a2b01 nfsd: rpc_peeraddr2str needs rcu lock
I'm not even sure cl_xprt can change here, but we're getting "suspicious
RCU usage" warnings, and other rpc_peeraddr2str callers are taking the
rcu lock.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-06 20:14:42 -04:00
Chuck Lever
1d2bf65983 NFSD: Remove the nfsd_cb_work and nfsd_cb_done tracepoints
Clean up: These are noise in properly working systems. If you really
need to observe the operation of the callback mechanism, use the
sunrpc:rpc\* tracepoints along with the workqueue tracepoints.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-05-18 13:44:04 -04:00
Chuck Lever
4ade892ae1 NFSD: Add an nfsd_cb_probe tracepoint
Record a tracepoint event when the server performs a callback
probe. This event can be enabled as a group with other nfsd_cb
tracepoints.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-05-18 13:44:04 -04:00
Chuck Lever
3c92fba557 NFSD: Enhance the nfsd_cb_setup tracepoint
Display the transport protocol and authentication flavor so admins
can see what they might be getting wrong.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-05-18 13:44:04 -04:00
Chuck Lever
9f57c6062b NFSD: Remove spurious cb_setup_err tracepoint
This path is not really an error path, so the tracepoint I added
there is just noise.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-05-18 13:44:04 -04:00
Chuck Lever
b200f0e353 NFSD: Adjust cb_shutdown tracepoint
Show when the upper layer requested a shutdown. RPC tracepoints can
already show when rpc_shutdown_client() is called.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-05-18 13:44:04 -04:00
Chuck Lever
8476c69a7f NFSD: Capture every CB state transition
We were missing one.

As a clean-up, add a helper that sets the new CB state and fires
a tracepoint. The tracepoint fires only when the state changes, to
help reduce trace log noise.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-05-18 13:44:03 -04:00
Olga Kornievskaia
b4250dd868 NFSD: fix error handling in NFSv4.0 callbacks
When the server tries to do a callback and a client fails it due to
authentication problems, we need the server to set callback down
flag in RENEW so that client can recover.

Suggested-by: Bruce Fields <bfields@redhat.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/linux-nfs/FB84E90A-1A03-48B3-8BF7-D9D10AC2C9FE@oracle.com/T/#t
2021-03-11 10:58:49 -05:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Xiyu Yang
a4abc6b12e nfsd: Fix svc_xprt refcnt leak when setup callback client failed
nfsd4_process_cb_update() invokes svc_xprt_get(), which increases the
refcount of the "c->cn_xprt".

The reference counting issue happens in one exception handling path of
nfsd4_process_cb_update(). When setup callback client failed, the
function forgets to decrease the refcnt increased by svc_xprt_get(),
causing a refcnt leak.

Fix this issue by calling svc_xprt_put() when setup callback client
failed.

Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-28 18:15:00 -04:00
Chuck Lever
1eace0d1e9 NFSD: Add tracepoints for monitoring NFSD callbacks
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-20 17:30:44 -04:00
Chuck Lever
6221f1d9b6 SUNRPC: Fix backchannel RPC soft lockups
Currently, after the forward channel connection goes away,
backchannel operations are causing soft lockups on the server
because call_transmit_status's SOFTCONN logic ignores ENOTCONN.
Such backchannel Calls are aggressively retried until the client
reconnects.

Backchannel Calls should use RPC_TASK_NOCONNECT rather than
RPC_TASK_SOFTCONN. If there is no forward connection, the server is
not capable of establishing a connection back to the client, thus
that backchannel request should fail before the server attempts to
send it. Commit 58255a4e3c ("NFSD: NFSv4 callback client should
use RPC_TASK_SOFTCONN") was merged several years before
RPC_TASK_NOCONNECT was available.

Because setup_callback_client() explicitly sets NOPING, the NFSv4.0
callback connection depends on the first callback RPC to initiate
a connection to the client. Thus NFSv4.0 needs to continue to use
RPC_TASK_SOFTCONN.

Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org> # v4.20+
2020-04-17 12:40:31 -04:00
Arnd Bergmann
20b7d86f29 nfsd: use boottime for lease expiry calculation
A couple of time_t variables are only used to track the state of the
lease time and its expiration. The code correctly uses the 'time_after()'
macro to make this work on 32-bit architectures even beyond year 2038,
but the get_seconds() function and the time_t type itself are deprecated
as they behave inconsistently between 32-bit and 64-bit architectures
and often lead to code that is not y2038 safe.

As a minor issue, using get_seconds() leads to problems with concurrent
settimeofday() or clock_settime() calls, in the worst case timeout never
triggering after the time has been set backwards.

Change nfsd to use time64_t and ktime_get_boottime_seconds() here. This
is clearly excessive, as boottime by itself means we never go beyond 32
bits, but it does mean we handle this correctly and consistently without
having to worry about corner cases and should be no more expensive than
the previous implementation on 64-bit architectures.

The max_cb_time() function gets changed in order to avoid an expensive
64-bit division operation, but as the lease time is at most one hour,
there is no change in behavior.

Also do the same for server-to-server copy expiration time.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[bfields@redhat.com: fix up copy expiration]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 22:07:17 -05:00
J. Bruce Fields
cc1ce2f13e nfsd: document callback_wq serialization of callback code
The callback code relies on the fact that much of it is only ever called
from the ordered workqueue callback_wq, and this is worth documenting.

Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:59 -05:00
J. Bruce Fields
20428a8047 nfsd: mark cb path down on unknown errors
An unexpected error is probably a sign that something is wrong with the
callback path.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:59 -05:00
Trond Myklebust
2bbfed98a4 nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()
When we're destroying the client lease, and we call
nfsd4_shutdown_callback(), we must ensure that we do not return
before all outstanding callbacks have terminated and have
released their payloads.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:59 -05:00
Trond Myklebust
12357f1b2c nfsd: minor 4.1 callback cleanup
Move all the cb_holds_slot management into helper functions.  No change
in behavior.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:44 -05:00
J. Bruce Fields
10fa8acf0f nfsd: Remove unnecessary NULL checks
"cb" is never actually NULL in these functions.

On a quick skim of the history, they seem to have been there from the
beginning.  I'm not sure if they originally served a purpose.

Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-16 13:49:29 -04:00
Linus Torvalds
700a800a94 This pull consists mostly of nfsd container work:
Scott Mayhew revived an old api that communicates with a userspace
 daemon to manage some on-disk state that's used to track clients across
 server reboots.  We've been using a usermode_helper upcall for that, but
 it's tough to run those with the right namespaces, so a daemon is much
 friendlier to container use cases.
 
 Trond fixed nfsd's handling of user credentials in user namespaces.  He
 also contributed patches that allow containers to support different sets
 of NFS protocol versions.
 
 The only remaining container bug I'm aware of is that the NFS reply
 cache is shared between all containers.  If anyone's aware of other gaps
 in our container support, let me know.
 
 The rest of this is miscellaneous bugfixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAlzcWNcVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+DUEP/0WD3jKNAHFV3M5YQPAI9fz/iCND
 Db/A4oWP5qa6JmwmHe61il29QeGqkeFr/NPexgzM3Xw2E39d7RBXBeWyVDuqb0wr
 6SCXjXibTsuAHg11nR8Xf0P5Vej3rfGbG6up5lLCIDTEZxVpWoaBJnM8+3bewuCj
 XbeiDW54oiMbmDjon3MXqVAIF/z7LjorecJ+Yw5+0Jy7KZ6num9Kt8+fi7qkEfFd
 i5Bp9KWgzlTbJUJV4EX3ZKN3zlGkfOvjoo2kP3PODPVMB34W8jSLKkRSA1tDWYZg
 43WhBt5OODDlV6zpxSJXehYKIB4Ae469+RRaIL4F+ORRK+AzR0C/GTuOwJiG+P3J
 n95DX5WzX74nPOGQJgAvq4JNpZci85jM3jEK1TR2M7KiBDG5Zg+FTsPYVxx5Sgah
 Akl/pjLtHQPSdBbFGHn5TsXU+gqWNiKsKa9663tjxLb8ldmJun6JoQGkAEF9UJUn
 dzv0UxyHeHAblhSynY+WsUR+Xep9JDo/p5LyFK4if9Sd62KeA1uF/MFhAqpKZF81
 mrgRCqW4sD8aVTBNZI06pZzmcZx4TRr2o+Oj5KAXf6Yk6TJRSGfnQscoMMBsTLkw
 VK1rBQ/71TpjLHGZZZEx1YJrkVZAMmw2ty4DtK2f9jeKO13bWmUpc6UATzVufHKA
 C1rUZXJ5YioDbYDy
 =TUdw
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "This consists mostly of nfsd container work:

  Scott Mayhew revived an old api that communicates with a userspace
  daemon to manage some on-disk state that's used to track clients
  across server reboots. We've been using a usermode_helper upcall for
  that, but it's tough to run those with the right namespaces, so a
  daemon is much friendlier to container use cases.

  Trond fixed nfsd's handling of user credentials in user namespaces. He
  also contributed patches that allow containers to support different
  sets of NFS protocol versions.

  The only remaining container bug I'm aware of is that the NFS reply
  cache is shared between all containers. If anyone's aware of other
  gaps in our container support, let me know.

  The rest of this is miscellaneous bugfixes"

* tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux: (23 commits)
  nfsd: update callback done processing
  locks: move checks from locks_free_lock() to locks_release_private()
  nfsd: fh_drop_write in nfsd_unlink
  nfsd: allow fh_want_write to be called twice
  nfsd: knfsd must use the container user namespace
  SUNRPC: rsi_parse() should use the current user namespace
  SUNRPC: Fix the server AUTH_UNIX userspace mappings
  lockd: Pass the user cred from knfsd when starting the lockd server
  SUNRPC: Temporary sockets should inherit the cred from their parent
  SUNRPC: Cache the process user cred in the RPC server listener
  nfsd: Allow containers to set supported nfs versions
  nfsd: Add custom rpcbind callbacks for knfsd
  SUNRPC: Allow further customisation of RPC program registration
  SUNRPC: Clean up generic dispatcher code
  SUNRPC: Add a callback to initialise server requests
  SUNRPC/nfs: Fix return value for nfs4_callback_compound()
  nfsd: handle legacy client tracking records sent by nfsdcld
  nfsd: re-order client tracking method selection
  nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
  nfsd: un-deprecate nfsdcld
  ...
2019-05-15 18:21:43 -07:00
Linus Torvalds
06cbd26d31 NFS client updates for Linux 5.2
Stable bugfixes:
 - Fall back to MDS if no deviceid is found rather than aborting   # v4.11+
 - NFS4: Fix v4.0 client state corruption when mount
 
 Features:
 - Much improved handling of soft mounts with NFS v4.0
   - Reduce risk of false positive timeouts
   - Faster failover of reads and writes after a timeout
   - Added a "softerr" mount option to return ETIMEDOUT instead of
     EIO to the application after a timeout
 - Increase number of xprtrdma backchannel requests
 - Add additional xprtrdma tracepoints
 - Improved send completion batching for xprtrdma
 
 Other bugfixes and cleanups:
 - Return -EINVAL when NFS v4.2 is passed an invalid dedup mode
 - Reduce usage of GFP_ATOMIC pages in SUNRPC
 - Various minor NFS over RDMA cleanups and bugfixes
 - Use the correct container namespace for upcalls
 - Don't share superblocks between user namespaces
 - Various other container fixes
 - Make nfs_match_client() killable to prevent soft lockups
 - Don't mark all open state for recovery when handling recallable state revoked flag
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlzUjdcACgkQ18tUv7Cl
 QOsUiw/+OirzlZI7XeHfpZ/CwS7A+tSk3AAg9PDS1gjbfylER0g++GpA08tXnmDt
 JdUnBKYC5ujLyAqxN1j7QK+EvmXZQro8rucJxhEdPJMIQDC65fQQnmW7efl2bAEv
 CAWNDCf9Xe4g6X8LSR5jrnaMV4kuOQBYX4wqrrmaV8I+g/A/GKXW262KWnAv+w1M
 Y1ZlX+d1Gm8hODXhvqz4lldW6bkyrpWpU9BKUtYSYnSR0x1fam6PLPuCTm74fEDR
 N/Tgy5XvJi4xgti4SOZ/dI2O/Oqu6ut81PEPlhs8sTX04G8bLhr+hl3rSksCZFlu
 Afz9Hcnxg6XYB3Va7j7AO67H5SbyX4Zyj5cRMipXQE7Ebc1iXo5lu3vdhAEOAtNx
 fdNJlqD86MC/XWbtM+DfWlD+KjtpZ+lkxN+xuMgC/kVaPTeFI7nEWM796hJP/4no
 EYtnSLbSpJyH6F7wH9IL5V2EJYFxbzTvnPSTxV+QNZ0HgF17gTY0AGmQBzDE5bF0
 tfQteOG6MYXMHg64pTEzjlowlXOWdnE5TnuaFpt64/yP+hVznZMepBMSkxZO1xYt
 jc1wQlJkv/SyVH7cMGsj5lw3A6zwTrLManDUUmrLjIsVVmh4dk8WKlNtWQmvf1v6
 nFBklUa2GzH8LWKRT2ftNGcUeEiCuw/QF9oE5T/V7/7SQ/wmmvA=
 =skb2
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "Highlights include:

  Stable bugfixes:
   - Fall back to MDS if no deviceid is found rather than aborting   # v4.11+
   - NFS4: Fix v4.0 client state corruption when mount

  Features:
   - Much improved handling of soft mounts with NFS v4.0:
       - Reduce risk of false positive timeouts
       - Faster failover of reads and writes after a timeout
       - Added a "softerr" mount option to return ETIMEDOUT instead of
         EIO to the application after a timeout
   - Increase number of xprtrdma backchannel requests
   - Add additional xprtrdma tracepoints
   - Improved send completion batching for xprtrdma

  Other bugfixes and cleanups:
   - Return -EINVAL when NFS v4.2 is passed an invalid dedup mode
   - Reduce usage of GFP_ATOMIC pages in SUNRPC
   - Various minor NFS over RDMA cleanups and bugfixes
   - Use the correct container namespace for upcalls
   - Don't share superblocks between user namespaces
   - Various other container fixes
   - Make nfs_match_client() killable to prevent soft lockups
   - Don't mark all open state for recovery when handling recallable
     state revoked flag"

* tag 'nfs-for-5.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (69 commits)
  SUNRPC: Rebalance a kref in auth_gss.c
  NFS: Fix a double unlock from nfs_match,get_client
  nfs: pass the correct prototype to read_cache_page
  NFSv4: don't mark all open state for recovery when handling recallable state revoked flag
  SUNRPC: Fix an error code in gss_alloc_msg()
  SUNRPC: task should be exit if encode return EKEYEXPIRED more times
  NFS4: Fix v4.0 client state corruption when mount
  PNFS fallback to MDS if no deviceid found
  NFS: make nfs_match_client killable
  lockd: Store the lockd client credential in struct nlm_host
  NFS: When mounting, don't share filesystems between different user namespaces
  NFS: Convert NFSv2 to use the container user namespace
  NFSv4: Convert the NFS client idmapper to use the container user namespace
  NFS: Convert NFSv3 to use the container user namespace
  SUNRPC: Use namespace of listening daemon in the client AUTH_GSS upcall
  SUNRPC: Use the client user namespace when encoding creds
  NFS: Store the credential of the mount process in the nfs_server
  SUNRPC: Cache cred of process creating the rpc_client
  xprtrdma: Remove stale comment
  xprtrdma: Update comments that reference ib_drain_qp
  ...
2019-05-09 14:33:15 -07:00