Commit Graph

3046 Commits

Author SHA1 Message Date
Linus Torvalds
14bd41e418 Merge tag 'fsnotify_for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:
 "A few fsnotify fixes from Amir fixing fallout from big fsnotify
  overhaul a few months back and an improvement of defaults limiting
  maximum number of inotify watches from Waiman"

* tag 'fsnotify_for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: fix events reported to watching parent and child
  inotify: convert to handle_inode_event() interface
  fsnotify: generalize handle_inode_event()
  inotify: Increase default inotify.max_user_watches limit to 1048576
2020-12-17 10:56:27 -08:00
Trond Myklebust
716a8bc7f7 nfsd: Record NFSv4 pre/post-op attributes as non-atomic
For the case of NFSv4, specify to the client that the pre/post-op
attributes were not recorded atomically with the main operation.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:38 -05:00
Trond Myklebust
01cbf38539 nfsd: Set PF_LOCAL_THROTTLE on local filesystems only
Don't set PF_LOCAL_THROTTLE on remote filesystems like NFS, since they
aren't expected to ever be subject to double buffering.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:38 -05:00
Trond Myklebust
2e19d10c14 nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE
If the underlying filesystem times out, then we want knfsd to return
NFSERR_JUKEBOX/DELAY rather than NFSERR_STALE.
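
The mapping described above can be sketched in user-space C. This is only an illustration, not the patch itself: the helper name fh_lookup_status() is hypothetical, and the numeric status values are the NFSv3 codes from RFC 1813.

```c
#include <errno.h>

/* NFSv3 status codes as defined in RFC 1813. */
enum nfs_status {
	NFS_OK         = 0,
	NFSERR_STALE   = 70,    /* file handle is no longer valid */
	NFSERR_JUKEBOX = 10008, /* transient condition: client should retry */
};

/*
 * Hypothetical helper (not a kernel function): translate the negative
 * errno returned by a file handle lookup into an NFS status.
 */
static enum nfs_status fh_lookup_status(int err)
{
	switch (err) {
	case 0:
		return NFS_OK;
	case -ETIMEDOUT:
		/* The underlying filesystem timed out; the handle may still be valid. */
		return NFSERR_JUKEBOX;
	default:
		/* Any other lookup failure is reported as a stale handle. */
		return NFSERR_STALE;
	}
}
```

With this split, a client that receives the JUKEBOX/DELAY status retries later instead of discarding a handle that may still be good.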

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:38 -05:00
Jeff Layton
7f84b488f9 nfsd: close cached files prior to a REMOVE or RENAME that would replace target
It's not uncommon for some workloads to do a bunch of I/O to a file and
delete it just afterward. If knfsd has a cached open file however, then
the file may still be open when the dentry is unlinked. If the
underlying filesystem is nfs, then that could trigger it to do a
sillyrename.

On a REMOVE or RENAME, scan the nfsd_file cache for open files that
correspond to the inode, and proactively unhash and put their
references. This should prevent any delete-on-last-close activity from
occurring, solely due to knfsd's open file cache.

This must be done synchronously, though, so we use the variants that call
flush_delayed_fput. However, there are deadlock possibilities if you call
flush_delayed_fput while holding locks. In the case of
nfsd_rename, we don't even do the lookups of the dentries to be renamed
until we've locked for rename.

Once we've figured out what the target dentry is for a rename, check to
see whether there are cached open files associated with it. If there
are, then unwind all of the locking, close them all, and then reattempt
the rename.

None of this is really necessary for "typical" filesystems though. It's
mostly of use for NFS, so declare a new export op flag and use that to
determine whether to close the files beforehand.
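
The unwind-and-retry flow described above can be sketched as a small user-space model. Every name here (struct target, rename_sketch(), close_cached_files(), the lock counter) is a hypothetical stand-in chosen for illustration; none of it is the actual nfsd code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the rename target and its export settings. */
struct target {
	int  cached_opens;        /* open files held in the server's file cache */
	bool close_before_unlink; /* the filesystem opted in via its export flag */
};

static int locks_held; /* number of "rename locks" currently held */

static void lock_rename_sketch(void)   { locks_held++; }
static void unlock_rename_sketch(void) { locks_held--; }

/* Closing is synchronous, so it must only run with no locks held. */
static int close_cached_files(struct target *t)
{
	if (locks_held != 0)
		return -1; /* would risk a deadlock */
	t->cached_opens = 0;
	return 0;
}

/* Returns the number of attempts needed, or -1 on failure. */
static int rename_sketch(struct target *t)
{
	int attempts = 0;

retry:
	attempts++;
	lock_rename_sketch();
	/* The target is only known once the locks are held. */
	if (t->close_before_unlink && t->cached_opens > 0) {
		unlock_rename_sketch();         /* unwind all of the locking */
		if (close_cached_files(t) != 0) /* close with no locks held */
			return -1;
		goto retry;                     /* reattempt the rename */
	}
	/* ... the rename itself would happen here ... */
	unlock_rename_sketch();
	return attempts;
}
```

A target with cached opens on an opted-in filesystem takes two passes; any other target completes in one.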

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:38 -05:00
Jeff Layton
ba5e8187c5 nfsd: allow filesystems to opt out of subtree checking
When we start allowing NFS to be reexported, then we have some problems
when it comes to subtree checking. In principle, we could allow it, but
it would mean encoding parent info in the filehandles and there may not
be enough space for that in an NFSv3 filehandle.

To enforce this at export upcall time, we add a new export_ops flag
that declares the filesystem ineligible for subtree checking.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:38 -05:00
Jeff Layton
daab110e47 nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations
With NFSv3 nfsd will always attempt to send along WCC data to the
client. This generally involves saving off the in-core inode information
prior to doing the operation on the given filehandle, and then issuing a
vfs_getattr to it after the op.

Some filesystems (particularly clustered or networked ones) have an
expensive ->getattr inode operation. Atomicity is also often difficult
or impossible to guarantee on such filesystems. For those, we're best
off not trying to provide WCC information to the client at all, and to
simply allow it to poll for that information as needed with a GETATTR
RPC.

This patch adds a new flags field to struct export_operations, and
defines a new EXPORT_OP_NOWCC flag that filesystems can use to indicate
that nfsd should not attempt to provide WCC info in NFSv3 replies. It
also adds a blurb about the new flags field and flag to the exporting
documentation.
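
A minimal sketch of how such a flags field might be consulted, written as user-space C. The struct name, the helper should_collect_wcc(), and the bit value chosen for EXPORT_OP_NOWCC are assumptions made for illustration; consult the kernel headers for the real definitions.

```c
#include <stdbool.h>

/* Illustrative bit value only; the real definition lives in the kernel headers. */
#define EXPORT_OP_NOWCC 0x1UL

/* Hypothetical cut-down version of struct export_operations. */
struct export_operations_sketch {
	unsigned long flags;
};

/*
 * Decide whether the server should save pre-op attributes and fetch
 * post-op attributes for a request. nfs_version is 2 or 3 in this sketch.
 */
static bool should_collect_wcc(const struct export_operations_sketch *ops,
			       int nfs_version)
{
	if (nfs_version == 2)
		return false; /* NFSv2 replies never carry WCC data */
	return !(ops->flags & EXPORT_OP_NOWCC);
}
```

A filesystem with an expensive getattr would set the flag in its export operations, and clients would fall back to GETATTR when they need fresh attributes.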

The server will now also skip collecting this information for NFSv2,
since that info is never used there anyway.

Note that this patch does not add this flag to any filesystem
export_operations structures. This was originally developed to allow
reexporting nfs via nfsd.

Other filesystems may want to consider enabling this flag too. It's hard
to tell however which ones have export operations to enable export via
knfsd and which ones mostly rely on them for open-by-filehandle support,
so I'm leaving that up to the individual maintainers to decide. I am
cc'ing the relevant lists for those filesystems that I think may want to
consider adding this though.

Cc: HPDD-discuss@lists.01.org
Cc: ceph-devel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: fuse-devel@lists.sourceforge.net
Cc: ocfs2-devel@oss.oracle.com
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:38 -05:00
J. Bruce Fields
1631087ba8 Revert "nfsd4: support change_attr_type attribute"
This reverts commit a85857633b.

We're still factoring ctime into our change attribute even in the
IS_I_VERSION case.  If someone sets the system time backwards, a client
could see the change attribute go backwards.  Maybe we can just say
"well, don't do that", but there's some question whether that's good
enough, or whether we need a better guarantee.

Also, the client still isn't actually using the attribute.

While we're still figuring this out, let's just stop returning this
attribute.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:38 -05:00
J. Bruce Fields
942b20dc24 nfsd4: don't query change attribute in v2/v3 case
inode_query_iversion() has side effects, and there's no point calling it
when we're not even going to use it.

We check whether we're currently processing a v4 request by checking
fh_maxsize, which is arguably a little hacky; we could add a flag to
svc_fh instead.
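
The fh_maxsize check can be sketched as follows. The struct is a hypothetical cut-down stand-in for svc_fh and fh_is_v4() is an illustrative helper name; the size constants are the protocol maximums for each NFS version.

```c
#include <stdbool.h>

/* Maximum file handle sizes, in bytes, for each protocol version. */
#define NFS_FHSIZE  32  /* NFSv2 */
#define NFS3_FHSIZE 64  /* NFSv3 */
#define NFS4_FHSIZE 128 /* NFSv4 */

/* Hypothetical stand-in for the one svc_fh field this check needs. */
struct svc_fh_sketch {
	int fh_maxsize;
};

/* Infer the protocol version from the handle's maximum size. */
static bool fh_is_v4(const struct svc_fh_sketch *fhp)
{
	return fhp->fh_maxsize == NFS4_FHSIZE;
}
```

The inference works only because each version has a distinct maximum handle size, which is why an explicit flag would be the less fragile choice.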

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:38 -05:00
J. Bruce Fields
4b03d99794 nfsd: minor nfsd4_change_attribute cleanup
Minor cleanup, no change in behavior.

Also pull out a common helper that'll be useful elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:37 -05:00
J. Bruce Fields
b2140338d8 nfsd: simplify nfsd4_change_info
It doesn't make sense to carry all these extra fields around.  Just
make everything into a change attribute from the start.

This is just cleanup, there should be no change in behavior.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:37 -05:00
J. Bruce Fields
70b87f7729 nfsd: only call inode_query_iversion in the I_VERSION case
inode_query_iversion() can modify i_version.  Depending on the exported
filesystem, that may not be safe.  For example, if you're re-exporting
NFS, NFS stores the server's change attribute in i_version and does not
expect it to be modified locally.  This has been observed causing
unnecessary cache invalidations.

The way a filesystem indicates that it's OK to call
inode_query_iversion() is by setting SB_I_VERSION.

So, move the I_VERSION check out of encode_change(), where it's used
only in GETATTR responses, to nfsd4_change_attribute(), which is
also called for pre- and post- operation attributes.

(Note we could also pull the NFSEXP_V4ROOT case into
nfsd4_change_attribute().  That would actually be a no-op,
since pre/post attrs are only used for metadata-modifying operations,
and V4ROOT exports are read-only.  But we might make the change in
the future just for simplicity.)
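
The effect of moving the check can be sketched with a user-space model. All names below are hypothetical, and the way ctime and i_version are combined (the shift widths in particular) is an assumption made for illustration. The point is only that the query, which may modify i_version, is skipped unless the filesystem opted in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode fields that matter here. */
struct inode_sketch {
	bool     sb_i_version; /* the filesystem set SB_I_VERSION */
	uint64_t i_version;
	int      queries;      /* counts calls that may modify i_version */
	int64_t  ctime_sec;
	uint32_t ctime_nsec;
};

/* Models a query with side effects by counting each call. */
static uint64_t query_iversion(struct inode_sketch *inode)
{
	inode->queries++;
	return inode->i_version;
}

/* Build a change attribute, querying i_version only when it is safe. */
static uint64_t change_attribute(struct inode_sketch *inode)
{
	uint64_t chattr = (uint64_t)inode->ctime_sec;

	if (inode->sb_i_version) {
		/* Assumed layout: ctime mixed with the version counter. */
		chattr <<= 30;
		chattr += inode->ctime_nsec;
		chattr += query_iversion(inode);
	} else {
		/* Fallback: derive the attribute from ctime alone. */
		chattr <<= 32;
		chattr += inode->ctime_nsec;
	}
	return chattr;
}
```

For a re-exported NFS inode, which does not set SB_I_VERSION, the query counter stays at zero, so the i_version value stored from the server is left untouched.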

Reported-by: Daire Byrne <daire@dneg.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:39:37 -05:00
Dai Ngo
ca9364dde5 NFSD: Fix 5 seconds delay when doing inter server copy
Since commit b4868b44c5 ("NFSv4: Wait for stateid updates after
CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers a
5-second delay regardless of the size of the copy. The delay comes from
nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
fails because the seqid in both nfs4_state and nfs4_stateid is 0.

Fix by modifying nfs4_init_cp_state to return the stateid with seqid 1
instead of 0. This is also to conform with section 4.8 of RFC 7862.

Here is the relevant paragraph from section 4.8 of RFC 7862:

   A copy offload stateid's seqid MUST NOT be zero.  In the context of a
   copy offload operation, it is inappropriate to indicate "the most
   recent copy offload operation" using a stateid with a seqid of zero
   (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
   stateid refers to internal state in the server and there may be
   several asynchronous COPY operations being performed in parallel on
   the same file by the server.  Therefore, a copy offload stateid with
   a seqid of zero MUST be considered invalid.
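
The rule quoted above can be sketched in a few lines of user-space C. The struct and both helper names are hypothetical and do not reproduce nfs4_init_cp_state; the sketch only shows a copy stateid starting at seqid 1 and a validity check that rejects zero.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an NFSv4 stateid: a seqid plus 12 opaque bytes. */
struct stateid_sketch {
	uint32_t      seqid;
	unsigned char other[12];
};

/* Initialize a copy offload stateid; the seqid starts at 1, never 0. */
static void init_copy_stateid(struct stateid_sketch *st,
			      const unsigned char other[12])
{
	st->seqid = 1;
	memcpy(st->other, other, sizeof(st->other));
}

/* A copy offload stateid with a seqid of zero is considered invalid. */
static bool copy_stateid_is_valid(const struct stateid_sketch *st)
{
	return st->seqid != 0;
}
```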

Fixes: ce0887ac96 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:38:34 -05:00
Chuck Lever
eb162e1772 NFSD: Fix sparse warning in nfs4proc.c
linux/fs/nfsd/nfs4proc.c:1542:24: warning: incorrect type in assignment (different base types)
linux/fs/nfsd/nfs4proc.c:1542:24:    expected restricted __be32 [assigned] [usertype] status
linux/fs/nfsd/nfs4proc.c:1542:24:    got int

Clean-up: The dup_copy_fields() function returns only zero, so make
it return void for now, and get rid of the return code check.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:38:34 -05:00
kazuo ito
4420440c57 nfsd: Fix message level for normal termination
The warning message from nfsd terminating normally
can confuse system administrators or monitoring software.

Though it's not exactly fair to pin-point a commit where it
originated, the current form in the current place started
to appear in:

Fixes: e096bbc648 ("knfsd: remove special handling for SIGHUP")
Signed-off-by: kazuo ito <kzpn200@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09 09:38:33 -05:00
Amir Goldstein
950cc0d2be fsnotify: generalize handle_inode_event()
The handle_inode_event() interface was added as (quoting comment):
"a simple variant of handle_event() for groups that only have inode
marks and don't have ignore mask".

In other words, all backends except fanotify.  The inotify backend
also falls under this category, but because it required extra arguments
it was left out of the initial pass of backends conversion to the
simple interface.

This results in code duplication between the generic helper
fsnotify_handle_event() and the inotify_handle_event() callback,
which also happens to be buggy.

Generalize the handle_inode_event() arguments and add the check for
the FS_EXCL_UNLINK flag to the generic helper, so that the inotify
backend can be converted to use the simple interface.

Link: https://lore.kernel.org/r/20201202120713.702387-2-amir73il@gmail.com
CC: stable@vger.kernel.org
Fixes: b9a1b97725 ("fsnotify: create method handle_inode_event() in fsnotify_operations")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-12-03 14:58:35 +01:00
Chuck Lever
5cfc822f3e NFSD: Remove macros that are no longer used
Now that all the NFSv4 decoder functions have been converted to
make direct calls to the xdr helpers, remove the unused C macros.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:44 -05:00
Chuck Lever
d9b74bdac6 NFSD: Replace READ* macros in nfsd4_decode_compound()
And clean-up: Now that we have removed the DECODE_TAIL macro from
nfsd4_decode_compound(), we observe that there's no benefit for
nfsd4_decode_compound() to return nfs_ok or nfserr_bad_xdr only to
have its sole caller convert those values to one or zero,
respectively. Have nfsd4_decode_compound() return 1/0 instead.
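
The decoding style this series moves toward can be sketched in user-space C: a bounds-checked helper that reads one big-endian 32-bit word, and a caller that reports success as 1 and failure as 0. The cursor type and both function names are hypothetical, and the "header" is simplified to two words (it omits the tag that a real COMPOUND carries).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a received XDR buffer. */
struct xdr_cursor {
	const unsigned char *p;   /* next unread byte */
	const unsigned char *end; /* one past the last byte */
};

/* Decode one big-endian u32. Returns 0 on success, -1 on a short buffer. */
static int decode_u32(struct xdr_cursor *xdr, uint32_t *out)
{
	if ((size_t)(xdr->end - xdr->p) < 4)
		return -1;
	*out = ((uint32_t)xdr->p[0] << 24) | ((uint32_t)xdr->p[1] << 16) |
	       ((uint32_t)xdr->p[2] << 8)  |  (uint32_t)xdr->p[3];
	xdr->p += 4;
	return 0;
}

/* Simplified two-word header. Returns 1 on success, 0 on malformed input. */
static int decode_header_sketch(struct xdr_cursor *xdr,
				uint32_t *minorversion, uint32_t *opcnt)
{
	if (decode_u32(xdr, minorversion) < 0)
		return 0;
	if (decode_u32(xdr, opcnt) < 0)
		return 0;
	return 1;
}
```

Returning 1 or 0 directly spares the caller from translating a protocol status into the value it actually needs.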

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:44 -05:00
Chuck Lever
3a237b4af5 NFSD: Make nfsd4_ops::opnum a u32
Avoid passing a "pointer to int" argument to xdr_stream_decode_u32.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
2212036cad NFSD: Replace READ* macros in nfsd4_decode_listxattrs()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
403366a7e8 NFSD: Replace READ* macros in nfsd4_decode_setxattr()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
830c71502a NFSD: Replace READ* macros in nfsd4_decode_xattr_name()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
3dfd0b0e15 NFSD: Replace READ* macros in nfsd4_decode_clone()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
9d32b412fe NFSD: Replace READ* macros in nfsd4_decode_seek()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
2846bb0525 NFSD: Replace READ* macros in nfsd4_decode_offload_status()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
f9a953fb36 NFSD: Replace READ* macros in nfsd4_decode_copy_notify()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
e8febea719 NFSD: Replace READ* macros in nfsd4_decode_copy()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
f49e4b4d58 NFSD: Replace READ* macros in nfsd4_decode_nl4_server()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:43 -05:00
Chuck Lever
6aef27aaea NFSD: Replace READ* macros in nfsd4_decode_fallocate()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
0d6467844d NFSD: Replace READ* macros in nfsd4_decode_reclaim_complete()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
c95f2ec349 NFSD: Replace READ* macros in nfsd4_decode_destroy_clientid()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
b7a0c8f6e7 NFSD: Replace READ* macros in nfsd4_decode_test_stateid()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
cf907b1132 NFSD: Replace READ* macros in nfsd4_decode_sequence()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
53d70873e3 NFSD: Replace READ* macros in nfsd4_decode_secinfo_no_name()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
645fcad371 NFSD: Replace READ* macros in nfsd4_decode_layoutreturn()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
c8e88e3aa7 NFSD: Replace READ* macros in nfsd4_decode_layoutget()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
5185980d8a NFSD: Replace READ* macros in nfsd4_decode_layoutcommit()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
044959715f NFSD: Replace READ* macros in nfsd4_decode_getdeviceinfo()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:42 -05:00
Chuck Lever
aec387d590 NFSD: Replace READ* macros in nfsd4_decode_free_stateid()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
94e254af1f NFSD: Replace READ* macros in nfsd4_decode_destroy_session()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
81243e3fe3 NFSD: Replace READ* macros in nfsd4_decode_create_session()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
3a3f1fbacb NFSD: Add a helper to decode channel_attrs4
De-duplicate some code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
10ff842281 NFSD: Add a helper to decode nfs_impl_id4
Refactor for clarity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
523ec6ed6f NFSD: Add a helper to decode state_protect4_a
Refactor for clarity.

Also, remove a stale comment. Commit ed94164398 ("nfsd: implement
machine credential support for some operations") added support for
SP4_MACH_CRED, so state_protect_a is no longer completely ignored.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
547bfeb4cd NFSD: Add a separate decoder for ssv_sp_parms
Refactor for clarity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
2548aa784d NFSD: Add a separate decoder to handle state_protect_ops
Refactor for clarity and de-duplication of code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
571e0451c4 NFSD: Replace READ* macros in nfsd4_decode_bind_conn_to_session()
A dedicated sessionid4 decoder is introduced that will be used by
other operation decoders in subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:41 -05:00
Chuck Lever
0f81d96098 NFSD: Replace READ* macros in nfsd4_decode_backchannel_ctl()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:40 -05:00
Chuck Lever
1a99440807 NFSD: Replace READ* macros in nfsd4_decode_cb_sec()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:40 -05:00
Chuck Lever
a4a80c15ca NFSD: Replace READ* macros in nfsd4_decode_release_lockowner()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:40 -05:00