linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-21 19:53:59 +08:00

Author	SHA1	Message	Date
NeilBrown	466e16f092	nfsd: check for EBUSY from vfs_rmdir/vfs_unink. vfs_rmdir and vfs_unlink can return -EBUSY if the target is a mountpoint. This currently gets passed to nfserrno() by nfsd_unlink(), and that results in a WARNing, which is not user-friendly. Possibly the best NFSv4 error is NFS4ERR_FILE_OPEN, because there is a sense in which the object is currently in use by some other task. The Linux NFSv4 client will map this back to EBUSY, which is an added benefit. For NFSv3, the best we can do is probably NFS3ERR_ACCES, which isn't true, but is not less true than the other options. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-30 14:59:52 -05:00
Trond Myklebust	a25e3726b3	nfsd: Ensure CLONE persists data and metadata changes to the target file The NFSv4.2 CLONE operation has implicit persistence requirements on the target file, since there is no protocol requirement that the client issue a separate operation to persist data. For that reason, we should call vfs_fsync_range() on the destination file after a successful call to vfs_clone_file_range(). Fixes: `ffa0160a10` ("nfsd: implement the NFSv4.2 CLONE operation") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-30 14:55:38 -05:00
J. Bruce Fields	7c149057d0	nfsd: restore NFSv3 ACL support An error in `e333f3bbef` left the nfsd_acl_program->pg_vers array empty, which effectively turned off the server's support for NFSv3 ACLs. Fixes: `e333f3bbef` "nfsd: Allow containers to set supported nfs versions" Cc: stable@vger.kernel.org Cc: Trond Myklebust <trondmy@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-19 17:43:05 -05:00
Scott Mayhew	a2e2f2dc77	nfsd: v4 support requires CRYPTO_SHA256 The new nfsdcld client tracking operations use sha256 to compute hashes of the kerberos principals, so make sure CRYPTO_SHA256 is enabled. Fixes: `6ee95d1c89` ("nfsd: add support for upcall version 2") Reported-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-12 14:53:26 -05:00
Scott Mayhew	18b9a895e6	nfsd: Fix cld_net->cn_tfm initialization Don't assign an error pointer to cld_net->cn_tfm, otherwise an oops will occur in nfsd4_remove_cld_pipe(). Also, move the initialization of cld_net->cn_tfm so that it occurs after the check to see if nfsdcld is running. This is necessary because nfsd4_client_tracking_init() looks for -ETIMEDOUT to determine whether to use the "old" nfsdcld tracking ops. Fixes: `6ee95d1c89` ("nfsd: add support for upcall version 2") Reported-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-12 14:53:26 -05:00
Mao Wenan	2a67803e13	nfsd: Drop LIST_HEAD where the variable it declares is never used. The declarations were introduced with the file, but the declared variables were not used. Fixes: `65294c1f2c` ("nfsd: add a new struct file caching facility to nfsd") Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-08 12:32:59 -05:00
J. Bruce Fields	cc1ce2f13e	nfsd: document callback_wq serialization of callback code The callback code relies on the fact that much of it is only ever called from the ordered workqueue callback_wq, and this is worth documenting. Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-08 12:32:59 -05:00
J. Bruce Fields	20428a8047	nfsd: mark cb path down on unknown errors An unexpected error is probably a sign that something is wrong with the callback path. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-08 12:32:59 -05:00
Trond Myklebust	2bbfed98a4	nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback() When we're destroying the client lease, and we call nfsd4_shutdown_callback(), we must ensure that we do not return before all outstanding callbacks have terminated and have released their payloads. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-08 12:32:59 -05:00
Trond Myklebust	12357f1b2c	nfsd: minor 4.1 callback cleanup Move all the cb_holds_slot management into helper functions. No change in behavior. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-11-08 12:32:44 -05:00
Andy Shevchenko	12b4157b7d	nfsd: remove private bin2hex implementation Calling sprintf in a loop is not very efficient, and in any case, we already have an implementation of bin-to-hex conversion in lib/ which we might as well use. Note that original code used to nul-terminate the destination while bin2hex doesn't. That's why replace kmalloc() with kzalloc(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-10-11 12:49:14 -04:00
Scott Mayhew	6e73e92b15	nfsd4: fix up replay_matches_cache() When running an nfs stress test, I see quite a few cached replies that don't match up with the actual request. The first comment in replay_matches_cache() makes sense, but the code doesn't seem to match... fix it. This isn't exactly a bugfix, as the server isn't required to catch every case of a false retry. So, we may as well do this, but if this is fixing a problem then that suggests there's a client bug. Fixes: `53da6a53e1` ("nfsd4: catch some false session retries") Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-10-09 16:39:31 -04:00
J. Bruce Fields	c4b77edb3f	nfsd: "\%s" should be "%s" Randy says: > sparse complains about these, as does gcc when used with --pedantic. > sparse says: > > ../fs/nfsd/nfs4state.c:2385:23: warning: unknown escape sequence: '\%' > ../fs/nfsd/nfs4state.c:2385:23: warning: unknown escape sequence: '\%' > ../fs/nfsd/nfs4state.c:2388:23: warning: unknown escape sequence: '\%' > ../fs/nfsd/nfs4state.c:2388:23: warning: unknown escape sequence: '\%' I'm not sure how this crept in. Fix it. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-10-08 16:01:33 -04:00
YueHaibing	19a1aad888	nfsd: remove set but not used variable 'len' Fixes gcc '-Wunused-but-set-variable' warning: fs/nfsd/nfs4xdr.c: In function nfsd4_encode_splice_read: fs/nfsd/nfs4xdr.c:3464:7: warning: variable len set but not used [-Wunused-but-set-variable] It is not used since commit `83a63072c8` ("nfsd: fix nfs read eof detection") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-10-08 16:01:33 -04:00
Linus Torvalds	298fb76a55	Highlights: - add a new knfsd file cache, so that we don't have to open and close on each (NFSv2/v3) READ or WRITE. This can speed up read and write in some cases. It also replaces our readahead cache. - Prevent silent data loss on write errors, by treating write errors like server reboots for the purposes of write caching, thus forcing clients to resend their writes. - Tweak the code that allocates sessions to be more forgiving, so that NFSv4.1 mounts are less likely to hang when a server already has a lot of clients. - Eliminate an arbitrary limit on NFSv4 ACL sizes; they should now be limited only by the backend filesystem and the maximum RPC size. - Allow the server to enforce use of the correct kerberos credentials when a client reclaims state after a reboot. And some miscellaneous smaller bugfixes and cleanup. -----BEGIN PGP SIGNATURE----- iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl2OoFcVHGJmaWVsZHNA ZmllbGRzZXMub3JnAAoJECebzXlCjuG+dRoP/3OW1NxPjpjbCQWZL0M+O3AYJJla W8E+uoZKMosFEe/ymokMD0Vn5s47jPaMCifMjHZa2GygW8zHN9X2v0HURx/lob+o /rJXwMn78N/8kdbfDz2FvaCPeT0IuNzRIFBV8/sSXofqwCBwvPo+cl0QGrd4/xLp X35qlupx62TRk+kbdRjvv8kpS5SJ7BvR+FSA1WubNYWw2hpdEsr2OCFdGq2Wvthy DK6AfGBXfJGsOE+HGCSj6ejRV6i0UOJ17P8gRSsx+YT0DOe5E7ROjt+qvvRwk489 wmR8Vjuqr1e40eGAUq3xuLfk5F5NgycY4ekVxk/cTVFNwWcz2DfdjXQUlyPAbrSD SqIyxN1qdKT24gtr7AHOXUWJzBYPWDgObCVBXUGzyL81RiDdhf38HRNjL2TcSDld tzCjQ0wbPw+iT74v6qQRY05oS+h3JOtDjU4pxsBnxVtNn4WhGJtaLfWW8o1C1QwU bc4aX3TlYhDmzU7n7Zjt4rFXGJfyokM+o6tPao1Z60Pmsv1gOk4KQlzLtW/jPHx4 ZwYTwVQUKRDBfC62nmgqDyGI3/Qu11FuIxL2KXUCgkwDxNWN4YkwYjOGw9Lb5qKM wFpxq6CDNZB/IWLEu8Yg85kDPPUJMoI657lZb7Osr/MfBpU0YljcMOIzLBy8uV1u 9COUbPaQipiWGu/0 =diBo -----END PGP SIGNATURE----- Merge tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "Highlights: - Add a new knfsd file cache, so that we don't have to open and close on each (NFSv2/v3) READ or WRITE. This can speed up read and write in some cases. It also replaces our readahead cache. - Prevent silent data loss on write errors, by treating write errors like server reboots for the purposes of write caching, thus forcing clients to resend their writes. - Tweak the code that allocates sessions to be more forgiving, so that NFSv4.1 mounts are less likely to hang when a server already has a lot of clients. - Eliminate an arbitrary limit on NFSv4 ACL sizes; they should now be limited only by the backend filesystem and the maximum RPC size. - Allow the server to enforce use of the correct kerberos credentials when a client reclaims state after a reboot. And some miscellaneous smaller bugfixes and cleanup" * tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux: (34 commits) sunrpc: clean up indentation issue nfsd: fix nfs read eof detection nfsd: Make nfsd_reset_boot_verifier_locked static nfsd: degraded slot-count more gracefully as allocation nears exhaustion. nfsd: handle drc over-allocation gracefully. nfsd: add support for upcall version 2 nfsd: add a "GetVersion" upcall for nfsdcld nfsd: Reset the boot verifier on all write I/O errors nfsd: Don't garbage collect files that might contain write errors nfsd: Support the server resetting the boot verifier nfsd: nfsd_file cache entries should be per net namespace nfsd: eliminate an unnecessary acl size limit Deprecate nfsd fault injection nfsd: remove duplicated include from filecache.c nfsd: Fix the documentation for svcxdr_tmpalloc() nfsd: Fix up some unused variable warnings nfsd: close cached files prior to a REMOVE or RENAME that would replace target nfsd: rip out the raparms cache nfsd: have nfsd_test_lock use the nfsd_file cache nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache ...	2019-09-27 17:00:27 -07:00
Trond Myklebust	83a63072c8	nfsd: fix nfs read eof detection Currently, the knfsd server assumes that a short read indicates an end of file. That assumption is incorrect. The short read means that either we've hit the end of file, or we've hit a read error. In the case of a read error, the client may want to retry (as per the implementation recommendations in RFC1813 and RFC7530), but currently it is being told that it hit an eof. Move the code to detect eof from version specific code into the generic nfsd read. Report eof only in the two following cases: 1) read() returns a zero length short read with no error. 2) the offset+length of the read is >= the file size. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-23 16:24:08 -04:00
YueHaibing	65643f4c82	nfsd: Make nfsd_reset_boot_verifier_locked static Fix sparse warning: fs/nfsd/nfssvc.c:364:6: warning: symbol 'nfsd_reset_boot_verifier_locked' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-23 12:20:10 -04:00
NeilBrown	2030ca560c	nfsd: degraded slot-count more gracefully as allocation nears exhaustion. This original code in nfsd4_get_drc_mem() would hand out 30 slots (approximately NFSD_MAX_MEM_PER_SESSION bytes at slightly over 2K per slot) to each requesting client until it ran out of space, then it would possibly give one last client a reduced allocation, then fail the allocation. Since commit `de766e5704` ("nfsd: give out fewer session slots as limit approaches") the last 90 slots to be given to about 12 clients with quickly reducing slot counts (better than just 3 clients). This still seems unnecessarily hasty. A subsequent patch allows over-allocation so every client gets at least one slot, but that might be a bit restrictive. The requested number of nfsd threads is the best guide we have to the expected number of clients, so use that - if it is at least 8. 256 threads on a 256Meg machine - which is a lot for a tiny machine - would result in nfsd_drc_max_mem being 2Meg, so 8K (3 slots) would be available for the first client, and over 200 clients would get more than 1 slot. So I don't think this change will be too debilitating on poorly configured machines, though it does mean that a sensible configuration is a little more important. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-20 12:31:51 -04:00
NeilBrown	7f49fd5d7a	nfsd: handle drc over-allocation gracefully. Currently, if there are more clients than allowed for by the space allocation in set_max_drc(), we fail a SESSION_CREATE request with NFS4ERR_DELAY. This means that the client retries indefinitely, which isn't a user-friendly response. The RFC requires NFS4ERR_NOSPC, but that would at best result in a clean failure on the client, which is not much more friendly. The current space allocation is a best-guess and doesn't provide any guarantees, we could still run out of space when trying to allocate drc space. So fail more gracefully - always give out at least one slot. If all clients used all the space in all slots, we might start getting memory pressure, but that is possible anyway. So ensure 'num' is always at least 1, and remove the test for it being zero. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-20 12:30:02 -04:00
Linus Torvalds	e170eb2771	Merge branch 'work.mount-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount API infrastructure updates from Al Viro: "Infrastructure bits of mount API conversions. The rest is more of per-filesystem updates and that will happen in separate pull requests" * 'work.mount-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: mtd: Provide fs_context-aware mount_mtd() replacement vfs: Create fs_context-aware mount_bdev() replacement new helper: get_tree_keyed() vfs: set fs_context::user_ns for reconfigure	2019-09-18 13:15:58 -07:00
Scott Mayhew	6ee95d1c89	nfsd: add support for upcall version 2 Version 2 upcalls will allow the nfsd to include a hash of the kerberos principal string in the Cld_Create upcall. If a principal is present in the svc_cred, then the hash will be included in the Cld_Create upcall. We attempt to use the svc_cred.cr_raw_principal (which is returned by gssproxy) first, and then fall back to using the svc_cred.cr_principal (which is returned by both gssproxy and rpc.svcgssd). Upon a subsequent restart, the hash will be returned in the Cld_Gracestart downcall and stored in the reclaim_str_hashtbl so it can be used when handling reclaim opens. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-10 09:26:33 -04:00
Scott Mayhew	11a60d1592	nfsd: add a "GetVersion" upcall for nfsdcld Add a "GetVersion" upcall to allow nfsd to determine the maximum upcall version that the nfsdcld userspace daemon supports. If the daemon responds with -EOPNOTSUPP, then we know it only supports v1. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-10 09:26:33 -04:00
Trond Myklebust	bbf2f09883	nfsd: Reset the boot verifier on all write I/O errors If multiple clients are writing to the same file, then due to the fact we share a single file descriptor between all NFSv3 clients writing to the file, we have a situation where clients can miss the fact that their file data was not persisted. While this should be rare, it could cause silent data loss in situations where multiple clients are using NLM locking or O_DIRECT to write to the same file. Unfortunately, the stateless nature of NFSv3 and the fact that we can only identify clients by their IP address means that we cannot trivially cache errors; we would not know when it is safe to release them from the cache. So the solution is to declare a reboot. We understand that this should be a rare occurrence, since disks are usually stable. The most frequent occurrence is likely to be ENOSPC, at which point all writes to the given filesystem are likely to fail anyway. So the expectation is that clients will be forced to retry their writes until they hit the fatal error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-10 09:23:41 -04:00
Trond Myklebust	055b24a8f2	nfsd: Don't garbage collect files that might contain write errors If a file may contain unstable writes that can error out, then we want to avoid garbage collecting the struct nfsd_file that may be tracking those errors. So in the garbage collector, we try to avoid collecting files that aren't clean. Furthermore, we avoid immediately kicking off the garbage collector in the case where the reference drops to zero for the case where there is a write error that is being tracked. If the file is unhashed while an error is pending, then declare a reboot, to ensure the client resends any unstable writes. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-10 09:23:41 -04:00
Trond Myklebust	27c438f53e	nfsd: Support the server resetting the boot verifier Add support to allow the server to reset the boot verifier in order to force clients to resend I/O after a timeout failure. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-10 09:23:41 -04:00
Trond Myklebust	5e113224c1	nfsd: nfsd_file cache entries should be per net namespace Ensure that we can safely clear out the file cache entries when the nfs server is shut down on a container. Otherwise, the file cache may end up pinning the mounts. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-09-10 09:23:41 -04:00
Al Viro	533770cc0a	new helper: get_tree_keyed() For vfs_get_keyed_super users. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-09-05 14:34:22 -04:00
J. Bruce Fields	2b86e3aaf9	nfsd: eliminate an unnecessary acl size limit We're unnecessarily limiting the size of an ACL to less than what most filesystems will support. Some users do hit the limit and it's confusing and unnecessary. It still seems prudent to impose some limit on the number of ACEs the client gives us before passing it straight to kmalloc(). So, let's just limit it to the maximum number that would be possible given the amount of data left in the argument buffer. That will still leave one limit beyond whatever the filesystem imposes: the client and server negotiate a limit on the size of a request, which we have to respect. But we're no longer imposing any additional arbitrary limit. struct nfs4_ace is 20 bytes on my system and the maximum call size we'll negotiate is about a megabyte, so in practice this is limiting the allocation here to about a megabyte. Reported-by: "de Vandiere, Louis" <louis.devandiere@atos.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-28 21:13:45 -04:00
J. Bruce Fields	9d60d93198	Deprecate nfsd fault injection This is only useful for client testing. I haven't really maintained it, and reference counting and locking are wrong at this point. You can get some of the same functionality now from nfsd/clients/. It was a good idea but I think its time has passed. In the unlikely event of users, hopefully the BROKEN dependency will prompt them to speak up. Otherwise I expect to remove it soon. Reported-by: Alex Lyakas <alex@zadara.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-26 10:36:56 -04:00
YueHaibing	bb13f35b96	nfsd: remove duplicated include from filecache.c Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-20 11:15:34 -04:00
Trond Myklebust	ed9927533a	nfsd: Fix the documentation for svcxdr_tmpalloc() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:09:10 -04:00
Trond Myklebust	b96811cd02	nfsd: Fix up some unused variable warnings Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:09:10 -04:00
Jeff Layton	7775ec57f4	nfsd: close cached files prior to a REMOVE or RENAME that would replace target It's not uncommon for some workloads to do a bunch of I/O to a file and delete it just afterward. If knfsd has a cached open file however, then the file may still be open when the dentry is unlinked. If the underlying filesystem is nfs, then that could trigger it to do a sillyrename. On a REMOVE or RENAME scan the nfsd_file cache for open files that correspond to the inode, and proactively unhash and put their references. This should prevent any delete-on-last-close activity from occurring, solely due to knfsd's open file cache. This must be done synchronously though so we use the variants that call flush_delayed_fput. There are deadlock possibilities if you call flush_delayed_fput while holding locks, however. In the case of nfsd_rename, we don't even do the lookups of the dentries to be renamed until we've locked for rename. Once we've figured out what the target dentry is for a rename, check to see whether there are cached open files associated with it. If there are, then unwind all of the locking, close them all, and then reattempt the rename. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:09:10 -04:00
Jeff Layton	501cb1849f	nfsd: rip out the raparms cache The raparms cache was set up in order to ensure that we carry readahead information forward from one RPC call to the next. In other words, it was set up because each RPC call was forced to open a struct file, then close it, causing the loss of readahead information that is normally cached in that struct file, and used to keep the page cache filled when a user calls read() multiple times on the same file descriptor. Now that we cache the struct file, and reuse it for all the I/O calls to a given file by a given user, we no longer have to keep a separate readahead cache. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:09:09 -04:00
Jeff Layton	6b556ca287	nfsd: have nfsd_test_lock use the nfsd_file cache Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:09:09 -04:00
Jeff Layton	5c4583b2b7	nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache Have nfs4_preprocess_stateid_op pass back a nfsd_file instead of a filp. Since we now presume that the struct file will be persistent in most cases, we can stop fiddling with the raparms in the read code. This also means that we don't really care about the rd_tmp_file field anymore. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:09:09 -04:00
Jeff Layton	eb82dd3937	nfsd: convert fi_deleg_file and ls_file fields to nfsd_file Have them keep an nfsd_file reference instead of a struct file. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:09:09 -04:00
Jeff Layton	fd4f83fd7d	nfsd: convert nfs4_file->fi_fds array to use nfsd_files Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:09:09 -04:00
Jeff Layton	5920afa3c8	nfsd: hook nfsd_commit up to the nfsd_file cache Use cached filps if possible instead of opening a new one every time. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:00:40 -04:00
Jeff Layton	48cd7b5125	nfsd: hook up nfsd_read to the nfsd_file cache Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:00:40 -04:00
Jeff Layton	b493523926	nfsd: hook up nfsd_write to the new nfsd_file cache Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:00:39 -04:00
Jeff Layton	65294c1f2c	nfsd: add a new struct file caching facility to nfsd Currently, NFSv2/3 reads and writes have to open a file, do the read or write and then close it again for each RPC. This is highly inefficient, especially when the underlying filesystem has a relatively slow open routine. This patch adds a new open file cache to knfsd. Rather than doing an open for each RPC, the read/write handlers can call into this cache to see if there is one already there for the correct filehandle and NFS_MAY_READ/WRITE flags. If there isn't an entry, then we create a new one and attempt to perform the open. If there is, then we wait until the entry is fully instantiated and return it if it is at the end of the wait. If it's not, then we attempt to take over construction. Since the main goal is to speed up NFSv2/3 I/O, we don't want to close these files on last put of these objects. We need to keep them around for a little while since we never know when the next READ/WRITE will come in. Cache entries have a hardcoded 1s timeout, and we have a recurring workqueue job that walks the cache and purges any entries that have expired. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Richard Sharpe <richard.sharpe@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-19 11:00:39 -04:00
J. Bruce Fields	10fa8acf0f	nfsd: Remove unnecessary NULL checks "cb" is never actually NULL in these functions. On a quick skim of the history, they seem to have been there from the beginning. I'm not sure if they originally served a purpose. Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-16 13:49:29 -04:00
He Zhe	78e70e780b	nfsd4: Fix kernel crash when reading proc file reply_cache_stats reply_cache_stats uses wrong parameter as seq file private structure and thus causes the following kernel crash when users read /proc/fs/nfsd/reply_cache_stats BUG: kernel NULL pointer dereference, address: 00000000000001f9 PGD 0 P4D 0 Oops: 0000 [#3] SMP PTI CPU: 6 PID: 1502 Comm: cat Tainted: G D 5.3.0-rc3+ #1 Hardware name: Intel Corporation Broadwell Client platform/Basking Ridge, BIOS BDW-E2R1.86C.0118.R01.1503110618 03/11/2015 RIP: 0010:nfsd_reply_cache_stats_show+0x3b/0x2d0 Code: 41 54 49 89 f4 48 89 fe 48 c7 c7 b3 10 33 88 53 bb e8 03 00 00 e8 88 82 d1 ff bf 58 89 41 00 e8 eb c5 85 00 48 83 eb 01 75 f0 <41> 8b 94 24 f8 01 00 00 48 c7 c6 be 10 33 88 4c 89 ef bb e8 03 00 RSP: 0018:ffffaa520106fe08 EFLAGS: 00010246 RAX: 000000cfe1a77123 RBX: 0000000000000000 RCX: 0000000000291b46 RDX: 000000cf00000000 RSI: 0000000000000006 RDI: 0000000000291b28 RBP: ffffaa520106fe20 R08: 0000000000000006 R09: 000000cfe17e55dd R10: ffffa424e47c0000 R11: 000000000000030b R12: 0000000000000001 R13: ffffa424e5697000 R14: 0000000000000001 R15: ffffa424e5697000 FS: 00007f805735f580(0000) GS:ffffa424f8f80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000001f9 CR3: 00000000655ce005 CR4: 00000000003606e0 Call Trace: seq_read+0x194/0x3e0 __vfs_read+0x1b/0x40 vfs_read+0x95/0x140 ksys_read+0x61/0xe0 __x64_sys_read+0x1a/0x20 do_syscall_64+0x4d/0x120 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f805728b861 Code: fe ff ff 50 48 8d 3d 86 b4 09 00 e8 79 e0 01 00 66 0f 1f 84 00 00 00 00 00 48 8d 05 d9 19 0d 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 48 83 ec 28 48 89 54 RSP: 002b:00007ffea1ce3c38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f805728b861 RDX: 0000000000020000 RSI: 00007f8057183000 RDI: 0000000000000003 RBP: 00007f8057183000 R08: 00007f8057182010 R09: 0000000000000000 R10: 0000000000000022 R11: 0000000000000246 R12: 0000559a60e8ff10 R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 Modules linked in: CR2: 00000000000001f9 ---[ end trace 01613595153f0cba ]--- RIP: 0010:nfsd_reply_cache_stats_show+0x3b/0x2d0 Code: 41 54 49 89 f4 48 89 fe 48 c7 c7 b3 10 33 88 53 bb e8 03 00 00 e8 88 82 d1 ff bf 58 89 41 00 e8 eb c5 85 00 48 83 eb 01 75 f0 <41> 8b 94 24 f8 01 00 00 48 c7 c6 be 10 33 88 4c 89 ef bb e8 03 00 RSP: 0018:ffffaa52004b3e08 EFLAGS: 00010246 RAX: 0000002bab45a7c6 RBX: 0000000000000000 RCX: 0000000000291b4c RDX: 0000002b00000000 RSI: 0000000000000004 RDI: 0000000000291b28 RBP: ffffaa52004b3e20 R08: 0000000000000004 R09: 0000002bab1c8c7a R10: ffffa424e5500000 R11: 00000000000002a9 R12: 0000000000000001 R13: ffffa424e4475000 R14: 0000000000000001 R15: ffffa424e4475000 FS: 00007f805735f580(0000) GS:ffffa424f8f80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000001f9 CR3: 00000000655ce005 CR4: 00000000003606e0 Killed Fixes: `3ba75830ce` ("nfsd4: drc containerization") Signed-off-by: He Zhe <zhe.he@windriver.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-16 13:36:55 -04:00
J. Bruce Fields	bebd699716	nfsd: initialize i_private before d_add A process could race in an open and attempt to read one of these files before i_private is initialized, and get a spurious error. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-15 16:24:07 -04:00
J. Bruce Fields	dc46bba709	nfsd: use i_wrlock instead of rcu for nfsdfs i_private synchronize_rcu() gets called multiple times each time a client is destroyed. If the laundromat thread has a lot of clients to destroy, the delay can be noticeable. This was causing pynfs test RENEW3 to fail. We could embed an rcu_head in each inode and do the kref_put in an rcu callback. But simplest is just to take a lock here. (I also wonder if the laundromat thread would be better replaced by a bunch of scheduled work or timers or something.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-15 14:53:02 -04:00
Tetsuo Handa	d6846bfbee	nfsd: fix dentry leak upon mkdir failure. syzbot is reporting that nfsd_mkdir() forgot to remove dentry created by d_alloc_name() when __nfsd_mkdir() failed (due to memory allocation fault injection) [1]. [1] https://syzkaller.appspot.com/bug?id=ce41a1f769ea4637ebffedf004a803e8405b4674 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+2c95195d5d433f6ed6cb@syzkaller.appspotmail.com> Fixes: `e8a79fb14f` ("nfsd: add nfsd/clients directory") [bfields: clean up in nfsd_mkdir instead of __nfsd_mkdir] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-08-15 14:53:00 -04:00
Linus Torvalds	933a90bf4f	Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount updates from Al Viro: "The first part of mount updates. Convert filesystems to use the new mount API" * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) mnt_init(): call shmem_init() unconditionally constify ksys_mount() string arguments don't bother with registering rootfs init_rootfs(): don't bother with init_ramfs_fs() vfs: Convert smackfs to use the new mount API vfs: Convert selinuxfs to use the new mount API vfs: Convert securityfs to use the new mount API vfs: Convert apparmorfs to use the new mount API vfs: Convert openpromfs to use the new mount API vfs: Convert xenfs to use the new mount API vfs: Convert gadgetfs to use the new mount API vfs: Convert oprofilefs to use the new mount API vfs: Convert ibmasmfs to use the new mount API vfs: Convert qib_fs/ipathfs to use the new mount API vfs: Convert efivarfs to use the new mount API vfs: Convert configfs to use the new mount API vfs: Convert binfmt_misc to use the new mount API convenience helper: get_tree_single() convenience helper get_tree_nodev() vfs: Kill sget_userns() ...	2019-07-19 10:42:02 -07:00
Linus Torvalds	f632a8170a	Driver Core and debugfs changes for 5.3-rc1 Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Because of this, there is going to be some merge issues with your tree at the moment, I'll follow up with the expected resolutions to make it easier for you. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups (will cause build warnings with s390 and coresight drivers in your tree) - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for. Other than the merge issues, functionality is working properly in linux-next :) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K LpRyb3zX29oChFaZkc5a =XrEZ -----END PGP SIGNATURE----- Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for" * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits) debugfs: make error message a bit more verbose orangefs: fix build warning from debugfs cleanup patch ubifs: fix build warning after debugfs cleanup patch driver: core: Allow subsystems to continue deferring probe drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT arch_topology: Remove error messages on out-of-memory conditions lib: notifier-error-inject: no need to check return value of debugfs_create functions swiotlb: no need to check return value of debugfs_create functions ceph: no need to check return value of debugfs_create functions sunrpc: no need to check return value of debugfs_create functions ubifs: no need to check return value of debugfs_create functions orangefs: no need to check return value of debugfs_create functions nfsd: no need to check return value of debugfs_create functions lib: 842: no need to check return value of debugfs_create functions debugfs: provide pr_fmt() macro debugfs: log errors when something goes wrong drivers: s390/cio: Fix compilation warning about const qualifiers drivers: Add generic helper to match by of_node driver_find_device: Unify the match function with class_find_device() bus_find_device: Unify the match callback with class_find_device ...	2019-07-12 12:24:03 -07:00
YueHaibing	b78fa45d4e	nfsd: Make __get_nfsdfs_client() static Fix sparse warning: fs/nfsd/nfsctl.c:1221:22: warning: symbol '__get_nfsdfs_client' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2019-07-09 19:36:33 -04:00

1 2 3 4 5 ...

2780 Commits