linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-02 16:44:10 +08:00

Author	SHA1	Message	Date
Kinglong Mee	3c7aa15d20	NFSD: Using min/max/min_t/max_t for calculate Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-06-23 11:31:36 -04:00
Christoph Hellwig	b52bd7bccc	nfsd: remove unused function nfsd_read_file Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:32:27 -04:00
J. Bruce Fields	dc97618ddd	nfsd4: separate splice and readv cases The splice and readv cases are actually quite different--for example the former case ignores the array of vectors we build up for the latter. It is probably clearer to separate the two cases entirely. There's some code duplication between the split out encoders, but this is only temporary and will be fixed by a later patch. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:32:09 -04:00
J. Bruce Fields	02fe470774	nfsd4: nfsd_vfs_read doesn't use file handle parameter Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:32:09 -04:00
NeilBrown	8658452e4a	nfsd: Only set PF_LESS_THROTTLE when really needed. PF_LESS_THROTTLE has a very specific use case: to avoid deadlocks and live-locks while writing to the page cache in a loop-back NFS mount situation. It therefore makes sense to only set PF_LESS_THROTTLE in this situation. We now know when a request came from the local-host so it could be a loop-back mount. We already know when we are handling write requests, and when we are doing anything else. So combine those two to allow nfsd to still be throttled (like any other process) in every situation except when it is known to be problematic. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:59:19 -04:00
Kinglong Mee	368fe39b50	NFSD: Don't clear SUID/SGID after root writing data We're clearing the SUID/SGID bits on write by hand in nfsd_vfs_write, even though the subsequent vfs_writev() call will end up doing this for us (through file system write methods eventually calling file_remove_suid(), e.g., from __generic_file_aio_write). So, remove the redundant nfsd code. The only change in behavior is when the write is by root, in which case we previously cleared SUID/SGID, but will now leave it alone. The new behavior is the behavior of every filesystem we've checked. It seems better to be consistent with local filesystem behavior. And the security advantage seems limited as root could always restore these bits by hand if it wanted. SUID/SGID is not cleared after writing data with (root, local ext4), File: ‘test’ Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: 803h/2051d Inode: 1200137 Links: 1 Access: (4777/-rwsrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root) Context: unconfined_u:object_r:admin_home_t:s0 Access: 2014-04-18 21:36:31.016029014 +0800 Modify: 2014-04-18 21:36:31.016029014 +0800 Change: 2014-04-18 21:36:31.026030285 +0800 Birth: - File: ‘test’ Size: 5 Blocks: 8 IO Block: 4096 regular file Device: 803h/2051d Inode: 1200137 Links: 1 Access: (4777/-rwsrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root) Context: unconfined_u:object_r:admin_home_t:s0 Access: 2014-04-18 21:36:31.016029014 +0800 Modify: 2014-04-18 21:36:31.040032065 +0800 Change: 2014-04-18 21:36:31.040032065 +0800 Birth: - With no_root_squash, (root, remote ext4), SUID/SGID are cleared, File: ‘test’ Size: 0 Blocks: 0 IO Block: 262144 regular empty file Device: 24h/36d Inode: 786439 Links: 1 Access: (4777/-rwsrwxrwx) Uid: ( 1000/ test) Gid: ( 1000/ test) Context: system_u:object_r:nfs_t:s0 Access: 2014-04-18 21:45:32.155805097 +0800 Modify: 2014-04-18 21:45:32.155805097 +0800 Change: 2014-04-18 21:45:32.168806749 +0800 Birth: - File: ‘test’ Size: 5 Blocks: 8 IO Block: 262144 regular file Device: 24h/36d Inode: 786439 Links: 1 Access: (0777/-rwxrwxrwx) Uid: ( 1000/ test) Gid: ( 1000/ test) Context: system_u:object_r:nfs_t:s0 Access: 2014-04-18 21:45:32.155805097 +0800 Modify: 2014-04-18 21:45:32.184808783 +0800 Change: 2014-04-18 21:45:32.184808783 +0800 Birth: - Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-21 12:17:16 -04:00
Linus Torvalds	75ff24fa52	Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "Highlights: - server-side nfs/rdma fixes from Jeff Layton and Tom Tucker - xdr fixes (a larger xdr rewrite has been posted but I decided it would be better to queue it up for 3.16). - miscellaneous fixes and cleanup from all over (thanks especially to Kinglong Mee)" * 'for-3.15' of git://linux-nfs.org/~bfields/linux: (36 commits) nfsd4: don't create unnecessary mask acl nfsd: revert v2 half of "nfsd: don't return high mode bits" nfsd4: fix memory leak in nfsd4_encode_fattr() nfsd: check passed socket's net matches NFSd superblock's one SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp SUNRPC: New helper for creating client with rpc_xprt NFSD: Free backchannel xprt in bc_destroy NFSD: Clear wcc data between compound ops nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+ nfsd4: fix nfs4err_resource in 4.1 case nfsd4: fix setclientid encode size nfsd4: remove redundant check from nfsd4_check_resp_size nfsd4: use more generous NFS4_ACL_MAX nfsd4: minor nfsd4_replay_cache_entry cleanup nfsd4: nfsd4_replay_cache_entry should be static nfsd4: update comments with obsolete function name rpc: Allow xdr_buf_subsegment to operate in-place NFSD: Using free_conn free connection SUNRPC: fix memory leak of peer addresses in XPRT ...	2014-04-08 18:28:14 -07:00
Miklos Szeredi	520c8b1650	vfs: add renameat2 syscall Add new renameat2 syscall, which is the same as renameat with an added flags argument. Pass flags to vfs_rename() and to i_op->rename() as well. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: J. Bruce Fields <bfields@redhat.com>	2014-04-01 17:08:42 +02:00
J. Bruce Fields	fbb74a34a5	nfsd: typo in nfsd_rename comment Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 18:02:11 -04:00
J. Bruce Fields	9f67f18993	nfsd: notify_change needs elevated write count Looks like this bug has been here since these write counts were introduced, not sure why it was just noticed now. Thanks also to Jan Kara for pointing out the problem. Cc: stable@vger.kernel.org Reported-by: Matthew Rahtz <mrahtz@rapitasystems.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:31:48 -04:00
J. R. Okajima	1406b916f4	nfsd: fix lost nfserrno() call in nfsd_setattr() There is a regression in `208d0ac` 2014-01-07 nfsd4: break only delegations when appropriate which deletes an nfserrno() call in nfsd_setattr() (by accident, probably), and NFSD becomes ignoring an error from VFS. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-02-18 12:25:57 -05:00
Linus Torvalds	d9894c228b	Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: - Handle some loose ends from the vfs read delegation support. (For example nfsd can stop breaking leases on its own in a fewer places where it can now depend on the vfs to.) - Make life a little easier for NFSv4-only configurations (thanks to Kinglong Mee). - Fix some gss-proxy problems (thanks Jeff Layton). - miscellaneous bug fixes and cleanup * 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits) nfsd: consider CLAIM_FH when handing out delegation nfsd4: fix delegation-unlink/rename race nfsd4: delay setting current_fh in open nfsd4: minor nfs4_setlease cleanup gss_krb5: use lcm from kernel lib nfsd4: decrease nfsd4_encode_fattr stack usage nfsd: fix encode_entryplus_baggage stack usage nfsd4: simplify xdr encoding of nfsv4 names nfsd4: encode_rdattr_error cleanup nfsd4: nfsd4_encode_fattr cleanup minor svcauth_gss.c cleanup nfsd4: better VERIFY comment nfsd4: break only delegations when appropriate NFSD: Fix a memory leak in nfsd4_create_session sunrpc: get rid of use_gssp_lock sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt sunrpc: don't wait for write before allowing reads from use-gss-proxy file nfsd: get rid of unused function definition Define op_iattr for nfsd4_open instead using macro NFSD: fix compile warning without CONFIG_NFSD_V3 ...	2014-01-30 10:18:43 -08:00
J. Bruce Fields	4335723e8e	nfsd4: fix delegation-unlink/rename race If a file is unlinked or renamed between the time when we do the local open and the time when we get the delegation, then we will return to the client indicating that it holds a delegation even though the file no longer exists under the name it was open under. But a client performing an open-by-name, when it is returned a delegation, must be able to assume that the file is still linked at the name it was opened under. So, hold the parent i_mutex for longer to prevent concurrent renames or unlinks. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-27 13:59:16 -05:00
Christoph Hellwig	4ac7249ea5	nfsd: use get_acl and ->set_acl Remove the boilerplate code to marshall and unmarhall ACL objects into xattrs and operate on the posix_acl objects directly. Also move all the ACL handling code into nfs?acl.c where it belongs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-26 08:26:41 -05:00
J. Bruce Fields	208d0acc49	nfsd4: break only delegations when appropriate As a temporary fix, nfsd was breaking all leases on unlink, link, rename, and setattr. Now that we can distinguish between leases and delegations, we can be nicer and break only the delegations, and not bother lease-holders with operations they don't care about. And we get to delete some code while we're at it. Note that in the presence of delegations the vfs calls here all return -EWOULDBLOCK instead of blocking, so nfsd threads will not get stuck waiting for delegation returns. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-07 16:01:15 -05:00
Stanislav Kholmanskikh	c4fa6d7c59	nfsd: revoking of suid/sgid bits after chown() in a consistent way There is an inconsistency in the handling of SUID/SGID file bits after chown() between NFS and other local file systems. Local file systems (for example, ext3, ext4, xfs, btrfs) revoke SUID/SGID bits after chown() on a regular file even if the owner/group of the file has not been changed: ~# touch file; chmod ug+s file; chmod u+x file ~# ls -l file -rwsr-Sr-- 1 root root 0 Dec 6 04:49 file ~# chown root file; ls -l file -rwxr-Sr-- 1 root root 0 Dec 6 04:49 file but NFS doesn't do that: ~# touch file; chmod ug+s file; chmod u+x file ~# ls -l file -rwsr-Sr-- 1 root root 0 Dec 6 04:49 file ~# chown root file; ls -l file -rwsr-Sr-- 1 root root 0 Dec 6 04:49 file NFS does that only if the owner/group has been changed: ~# touch file; chmod ug+s file; chmod u+x file ~# ls -l file -rwsr-Sr-- 1 root root 0 Dec 6 05:02 file ~# chown bin file; ls -l file -rwxr-Sr-- 1 bin root 0 Dec 6 05:02 file See: http://pubs.opengroup.org/onlinepubs/9699919799/functions/chown.html "If the specified file is a regular file, one or more of the S_IXUSR, S_IXGRP, or S_IXOTH bits of the file mode are set, and the process has appropriate privileges, it is implementation-defined whether the set-user-ID and set-group-ID bits are altered." So both variants are acceptable by POSIX. This patch makes NFS to behave like local file systems. Signed-off-by: Stanislav Kholmanskikh <stanislav.kholmanskikh@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-12-12 11:37:18 -05:00
Christoph Hellwig	987da47910	nfsd: make sure to balance get/put_write_access Use a straight goto error label style in nfsd_setattr to make sure we always do the put_write_access call after we got it earlier. Note that the we have been failing to do that in the case nfsd_break_lease() returns an error, a bug introduced into 2.6.38 with `6a76bebefe` "nfsd4: break lease on nfsd setattr". Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-18 12:06:48 -05:00
Christoph Hellwig	818e5a22e9	nfsd: split up nfsd_setattr Split out two helpers to make the code more readable and easier to verify for correctness. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-18 12:06:47 -05:00
J. Bruce Fields	27ac0ffeac	locks: break delegations on any attribute modification NFSv4 uses leases to guarantee that clients can cache metadata as well as data. Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: David Howells <dhowells@redhat.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Dustin Kirkland <dustin.kirkland@gazzang.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:44 -05:00
J. Bruce Fields	146a8595c6	locks: break delegations on link Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Dustin Kirkland <dustin.kirkland@gazzang.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:43 -05:00
J. Bruce Fields	8e6d782cab	locks: break delegations on rename Cc: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:43 -05:00
J. Bruce Fields	b21996e36c	locks: break delegations on unlink We need to break delegations on any operation that changes the set of links pointing to an inode. Start with unlink. Such operations also hold the i_mutex on a parent directory. Breaking a delegation may require waiting for a timeout (by default 90 seconds) in the case of a unresponsive NFS client. To avoid blocking all directory operations, we therefore drop locks before waiting for the delegation. The logic then looks like: acquire locks ... test for delegation; if found: take reference on inode release locks wait for delegation break drop reference on inode retry It is possible this could never terminate. (Even if we take precautions to prevent another delegation being acquired on the same inode, we could get a different inode on each retry.) But this seems very unlikely. The initial test for a delegation happens after the lock on the target inode is acquired, but the directory inode may have been acquired further up the call stack. We therefore add a "struct inode **" argument to any intervening functions, which we use to pass the inode back up to the caller in the case it needs a delegation synchronously broken. Cc: David Howells <dhowells@redhat.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Dustin Kirkland <dustin.kirkland@gazzang.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:42 -05:00
Al Viro	a6a9f18f0a	nfsd: switch to %p[dD] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-10-24 23:34:51 -04:00
Harshula Jayasuriya	e4daf1ffbe	nfsd: nfsd_open: when dentry_open returns an error do not propagate as struct file The following call chain: ------------------------------------------------------------ nfs4_get_vfs_file - nfsd_open - dentry_open - do_dentry_open - __get_file_write_access - get_write_access - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY; ------------------------------------------------------------ can result in the following state: ------------------------------------------------------------ struct nfs4_file { ... fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0}, fi_access = {{ counter = 0x1 }, { counter = 0x0 }}, ... ------------------------------------------------------------ 1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NULL, hence nfsd_open() is called where we get status set to an error and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented. 2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented. Thus we leave a landmine in the form of the nfs4_file data structure in an incorrect state. 3) Eventually, when __nfs4_file_put_access() is called it finds fi_access[O_WRONLY] being non-zero, it decrements it and calls nfs4_file_put_fd() which tries to fput -ETXTBSY. ------------------------------------------------------------ ... [exception RIP: fput+0x9] RIP: ffffffff81177fa9 RSP: ffff88062e365c90 RFLAGS: 00010282 RAX: ffff880c2b3d99cc RBX: ffff880c2b3d9978 RCX: 0000000000000002 RDX: dead000000100101 RSI: 0000000000000001 RDI: ffffffffffffffe6 RBP: ffff88062e365c90 R8: ffff88041fe797d8 R9: ffff88062e365d58 R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000007 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd] #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd] #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd] #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd] #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd] #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd] #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd] #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc] #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc] #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd] #19 [ffff88062e365ee8] kthread at ffffffff81090886 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a ------------------------------------------------------------ Cc: stable@vger.kernel.org Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-23 12:15:32 -04:00
Linus Torvalds	0ff08ba5d0	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from Bruce Fields: "Changes this time include: - 4.1 enabled on the server by default: the last 4.1-specific issues I know of are fixed, so we're not going to find the rest of the bugs without more exposure. - Experimental support for NFSv4.2 MAC Labeling (to allow running selinux over NFS), from Dave Quigley. - Fixes for some delicate cache/upcall races that could cause rare server hangs; thanks to Neil Brown and Bodo Stroesser for extreme debugging persistence. - Fixes for some bugs found at the recent NFS bakeathon, mostly v4 and v4.1-specific, but also a generic bug handling fragmented rpc calls" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: support minorversion 1 by default nfsd4: allow destroy_session over destroyed session svcrpc: fix failures to handle -1 uid's sunrpc: Don't schedule an upcall on a replaced cache entry. net/sunrpc: xpt_auth_cache should be ignored when expired. sunrpc/cache: ensure items removed from cache do not have pending upcalls. sunrpc/cache: use cache_fresh_unlocked consistently and correctly. sunrpc/cache: remove races with queuing an upcall. nfsd4: return delegation immediately if lease fails nfsd4: do not throw away 4.1 lock state on last unlock nfsd4: delegation-based open reclaims should bypass permissions svcrpc: don't error out on small tcp fragment svcrpc: fix handling of too-short rpc's nfsd4: minor read_buf cleanup nfsd4: fix decoding of compounds across page boundaries nfsd4: clean up nfs4_open_delegation NFSD: Don't give out read delegations on creates nfsd4: allow client to send no cb_sec flavors nfsd4: fail attempts to request gss on the backchannel nfsd4: implement minimal SP4_MACH_CRED ...	2013-07-11 10:17:13 -07:00
Al Viro	ac6614b764	[readdir] constify ->actor Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:57:05 +04:00
Al Viro	5c0ba4e076	[readdir] introduce iterate_dir() and dir_context iterate_dir(): new helper, replacing vfs_readdir(). struct dir_context: contains the readdir callback (and will get more stuff in it), embedded into whatever data that callback wants to deal with; eventually, we'll be passing it to ->readdir() replacement instead of (data,filldir) pair. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:46:46 +04:00
David Quigley	18032ca062	NFSD: Server implementation of MAC Labeling Implement labeled NFS on the server: encoding and decoding, and writing and reading, of file labels. Enabled with CONFIG_NFSD_V4_SECURITY_LABEL. Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com> Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg> Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg> Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-15 09:27:02 -04:00
J. Bruce Fields	aa387d6ce1	nfsd: fix EXDEV checking in rename We again check for the EXDEV a little later on, so the first check is redundant. This check is also slightly racier, since a badly timed eviction from the export cache could leave us with the two fh_export pointers pointing to two different cache entries which each refer to the same underlying export. It's better to compare vfsmounts as the later check does, but that leaves a minor security hole in the case where the two exports refer to two different directories especially if (for example) they have different root-squashing options. So, compare ex_path.dentry too. Reported-by: Joe Habermann <joe.habermann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-26 16:18:15 -04:00
Kent Overstreet	e49dbbf3e7	nfsd: fix bad offset use vfs_writev() updates the offset argument - but the code then passes the offset to vfs_fsync_range(). Since offset now points to the offset after what was just written, this is probably not what was intended Introduced by `face15025f` "nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes". Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: stable@vger.kernel.org Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-22 16:55:15 -04:00
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Eric W. Biederman	6fab877900	nfsd: Properly compare and initialize kuids and kgids Use uid_eq(uid, GLOBAL_ROOT_UID) instead of !uid. Use gid_eq(gid, GLOBAL_ROOT_GID) instead of !gid. Use uid_eq(uid, INVALID_UID) instead of uid == -1 Use gid_eq(uid, INVALID_GID) instead of gid == -1 Use uid = GLOBAL_ROOT_UID instead of uid = 0; Use gid = GLOBAL_ROOT_GID instead of gid = 0; Use !uid_eq(uid1, uid2) instead of uid1 != uid2. Use !gid_eq(gid1, gid2) instead of gid1 != gid2. Use uid_eq(uid1, uid2) instead of uid1 == uid2. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:16:09 -08:00
J. Bruce Fields	10532b560b	Revert "nfsd: warn on odd reply state in nfsd_vfs_read" This reverts commit `79f77bf9a4`. This is obviously wrong, and I have no idea how I missed seeing the warning in testing: I must just not have looked at the right logs. The caller bumps rq_resused/rq_next_page, so it will always be hit on a large enough read. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-21 17:07:45 -08:00
J. Bruce Fields	afc59400d6	nfsd4: cleanup: replace rq_resused count by rq_next_page pointer It may be a matter of personal taste, but I find this makes the code clearer. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 22:00:16 -05:00
J. Bruce Fields	79f77bf9a4	nfsd: warn on odd reply state in nfsd_vfs_read As far as I can tell this shouldn't currently happen--or if it does, something is wrong and data is going to be corrupted. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 21:55:46 -05:00
Neil Brown	7007c90fb9	nfsd: avoid permission checks on EXCLUSIVE_CREATE replay With NFSv4, if we create a file then open it we explicit avoid checking the permissions on the file during the open because the fact that we created it ensures we should be allow to open it (the create and the open should appear to be a single operation). However if the reply to an EXCLUSIVE create gets lots and the client resends the create, the current code will perform the permission check - because it doesn't realise that it did the open already.. This patch should fix this. Note that I haven't actually seen this cause a problem. I was just looking at the code trying to figure out a different EXCLUSIVE open related issue, and this looked wrong. (Fix confirmed with pynfs 4.0 test OPEN4--bfields) Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de> [bfields: use OWNER_OVERRIDE and update for 4.1] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:31 -05:00
J. Bruce Fields	face15025f	nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes NFSv4 shares the same struct file across multiple writes. (And we'd like NFSv2 and NFSv3 to do that as well some day.) So setting O_SYNC on the struct file as a way to request a synchronous write doesn't work. Instead, do a vfs_fsync_range() in that case. Reported-by: Peter Staubach <pstaubach@exagrid.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:34 -05:00
J. Bruce Fields	fae5096ad2	nfsd: assume writeable exportabled filesystems have f_sync I don't really see how you could claim to support nfsd and not support fsync somehow. And in practice a quick look through the exportable filesystems suggests the only ones without an ->fsync are read-only (efs, isofs, squashfs) or in-memory (shmem). Also, performing a write and then returning an error if the sync fails (as we would do here in the wgather case) seems unhelpful to clients. Also remove an incorrect comment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:33 -05:00
J. Bruce Fields	f474af7051	UAPI Disintegration 2012-10-09 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIVAwUAUHPmWxOxKuMESys7AQKN4w//XDwALfbf0MXIw+gwyRiUtJe9mGexvI6X 1R4FWU9a3ImzEZP4cWnmPGT2wmC/x007DcIvx8cyvbdlSuqtR2i/DC+HbWabiLRn nJS7Eer1BJvLv5dn6NmXMEz7yB4Z46+frcmBs3WQeR0sqBMDm+rjQzCqECznO8Jc VtCbox+VR2DuWcM++YECTblYEH3Z+doDXUN2eBaD8L9x3klPbPXD7OcRyOnry8w+ ynmUTKKyH4+hpxDakYrObPIg+vFCxb4QRck1mlgA4wbvb3eqjhM0oOCYJ8GvmILA vdFYztWCjkiuOl5djtXBlsClX8SAMOBYlRed+R1GvjNCSR+WCWrFJJ2F8qoQ1w87 9ts2/8qrozS8luTB475SkT2uLdJkIUKX89Oh+dWeE8YkbPnRPj5lNAdtNY5QSyDq VaRpIo+YfmZygyvHJQlAXBuZ0mvzcPzArfcPgSVTD3B7xTEGVu/45V7SnQX5os/V v39ySPXMdGOIdvK51gw7OtZl64uqrEKu39PyYDX/GUADflp/CHD0J7PJrQePbsH9 AQolVZDIxTfKqYQnUdL8+C8Zc24RowEzz3c2+aO89MSzwGqev3q8sXRVbW/Iqryg p+V3nHe+ipKcga5tOBlPr9KDtDd7j3xN2yaIwf5/QyO1OHBpjAZP1gjSVDcUcwpi svYy4kPn3PA= =etoL -----END PGP SIGNATURE----- nfs: disintegrate UAPI for nfs This is to complete part of the Userspace API (UAPI) disintegration for which the preparatory patches were pulled recently. After these patches, userspace headers will be segregated into: include/uapi/linux/.../foo.h for the userspace interface stuff, and: include/linux/.../foo.h for the strictly kernel internal stuff. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-09 18:35:22 -04:00
Eric W. Biederman	5f3a4a28ec	userns: Pass a userns parameter into posix_acl_to_xattr and posix_acl_from_xattr - Pass the user namespace the uid and gid values in the xattr are stored in into posix_acl_from_xattr. - Pass the user namespace kuid and kgid values should be converted into when storing uid and gid values in an xattr in posix_acl_to_xattr. - Modify all callers of posix_acl_from_xattr and posix_acl_to_xattr to pass in &init_user_ns. In the short term this change is not strictly needed but it makes the code clearer. In the longer term this change is necessary to be able to mount filesystems outside of the initial user namespace that natively store posix acls in the linux xattr format. Cc: Theodore Tso <tytso@mit.edu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-18 01:01:35 -07:00
J. Bruce Fields	fac7a17b5f	nfsd4: cast readlink() bug argument As we already do in readv, writev. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 17:46:19 -04:00
Linus Torvalds	a0e881b7c1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second vfs pile from Al Viro: "The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the deadlock reproduced by xfstests 068), symlink and hardlink restriction patches, plus assorted cleanups and fixes. Note that another fsfreeze deadlock (emergency thaw one) is not dealt with - the series by Fernando conflicts a lot with Jan's, breaks userland ABI (FIFREEZE semantics gets changed) and trades the deadlock for massive vfsmount leak; this is going to be handled next cycle. There probably will be another pull request, but that stuff won't be in it." Fix up trivial conflicts due to unrelated changes next to each other in drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c} * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits) delousing target_core_file a bit Documentation: Correct s_umount state for freeze_fs/unfreeze_fs fs: Remove old freezing mechanism ext2: Implement freezing btrfs: Convert to new freezing mechanism nilfs2: Convert to new freezing mechanism ntfs: Convert to new freezing mechanism fuse: Convert to new freezing mechanism gfs2: Convert to new freezing mechanism ocfs2: Convert to new freezing mechanism xfs: Convert to new freezing code ext4: Convert to new freezing mechanism fs: Protect write paths by sb_start_write - sb_end_write fs: Skip atime update on frozen filesystem fs: Add freezing handling to mnt_want_write() / mnt_drop_write() fs: Improve filesystem freezing handling switch the protection of percpu_counter list to spinlock nfsd: Push mnt_want_write() outside of i_mutex btrfs: Push mnt_want_write() outside of i_mutex fat: Push mnt_want_write() outside of i_mutex ...	2012-08-01 10:26:23 -07:00
Linus Torvalds	08843b79fb	Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from J. Bruce Fields: "This has been an unusually quiet cycle--mostly bugfixes and cleanup. The one large piece is Stanislav's work to containerize the server's grace period--but that in itself is just one more step in a not-yet-complete project to allow fully containerized nfs service. There are a number of outstanding delegation, container, v4 state, and gss patches that aren't quite ready yet; 3.7 may be wilder." * 'nfsd-next' of git://linux-nfs.org/~bfields/linux: (35 commits) NFSd: make boot_time variable per network namespace NFSd: make grace end flag per network namespace Lockd: move grace period management from lockd() to per-net functions LockD: pass actual network namespace to grace period management functions LockD: manage grace list per network namespace SUNRPC: service request network namespace helper introduced NFSd: make nfsd4_manager allocated per network namespace context. LockD: make lockd manager allocated per network namespace LockD: manage grace period per network namespace Lockd: add more debug to host shutdown functions Lockd: host complaining function introduced LockD: manage used host count per networks namespace LockD: manage garbage collection timeout per networks namespace LockD: make garbage collector network namespace aware. LockD: mark host per network namespace on garbage collect nfsd4: fix missing fault_inject.h include locks: move lease-specific code out of locks_delete_lock locks: prevent side-effects of locks_release_private before file_lock is initialized NFSd: set nfsd_serv to NULL after service destruction NFSd: introduce nfsd_destroy() helper ...	2012-07-31 14:42:28 -07:00
Jan Kara	4a55c1017b	nfsd: Push mnt_want_write() outside of i_mutex When mnt_want_write() starts to handle freezing it will get a full lock semantics requiring proper lock ordering. So push mnt_want_write() call consistently outside of i_mutex. CC: linux-nfs@vger.kernel.org CC: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-31 01:02:51 +04:00
Al Viro	765927b2d5	switch dentry_open() to struct path, make it grab references itself Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-23 00:01:29 +04:00
Al Viro	312b63fba9	don't pass nameidata * to vfs_create() all we want is a boolean flag, same as the method gets now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:34:50 +04:00
J. Bruce Fields	d91d0b5690	nfsd: allow owner_override only for regular files We normally allow the owner of a file to override permissions checks on IO operations, since: - the client will take responsibility for doing an access check on open; - the permission checks offer no protection against malicious clients--if they can authenticate as the file's owner then they can always just change its permissions; - checking permission on each IO operation breaks the usual posix rule that permission is checked only on open. However, we've never allowed the owner to override permissions on readdir operations, even though the above logic would also apply to directories. I've never heard of this causing a problem, probably because a) simultaneously opening and creating a directory (with restricted mode) isn't possible, and b) opening a directory, then chmod'ing it, is rare. Our disallowal of owner-override on directories appears to be an accident, though--the readdir itself succeeds, and then we fail just because lookup_one_len() calls in our filldir methods fail. I'm not sure what the easiest fix for that would be. For now, just make this behavior obvious by denying the override right at the start. This also fixes some odd v4 behavior: with the rdattr_error attribute requested, it would perform the readdir but return an ACCES error with each entry. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-10 16:41:35 -04:00
Jeff Layton	b108fe6b08	nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek They're equivalent, but SEEK_SET is more informative... Cc: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-25 15:38:23 -04:00
J. Bruce Fields	9dc4e6c4d1	nfsd: don't fail unchecked creates of non-special files Allow a v3 unchecked open of a non-regular file succeed as if it were a lookup; typically a client in such a case will want to fall back on a local open, so succeeding and giving it the filehandle is more useful than failing with nfserr_exist, which makes it appear that nothing at all exists by that name. Similarly for v4, on an open-create, return the same errors we would on an attempt to open a non-regular file, instead of returning nfserr_exist. This fixes a problem found doing a v4 open of a symlink with O_RDONLY\|O_CREAT, which resulted in the current client returning EEXIST. Thanks also to Trond for analysis. Cc: stable@kernel.org Reported-by: Orion Poplawski <orion@cora.nwra.com> Tested-by: Orion Poplawski <orion@cora.nwra.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:49:52 -04:00

1 2 3 4 5 ...

258 Commits