linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-18 17:54:13 +08:00

Author	SHA1	Message	Date
Kinglong Mee	6896f15aab	nfsd: Fix an FS_LAYOUT_TYPES/LAYOUT_TYPES encode bug Currently we'll respond correctly to a request for either FS_LAYOUT_TYPES or LAYOUT_TYPES, but not to a request for both attributes simultaneously. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-08-31 16:12:39 -04:00
Kinglong Mee	0a2050d744	NFSD: Store parent's stat in a separate value After commit `ae7095a7c4` (nfsd4: helper function for getting mounted_on ino) we ignore the return value from get_parent_attributes(). Also, the following FATTR4_WORD2_LAYOUT_BLKSIZE uses stat.blksize, so to avoid overwriting that, use an independent value for the parent's attributes. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-08-31 15:11:05 -04:00
Kinglong Mee	c2227a39a0	nfsd: Drop BUG_ON and ignore SECLABEL on absent filesystem On an absent filesystem (one served by another server), we need to be able to handle requests for certain attributest (like fs_locations, so the client can find out which server does have the filesystem), but others we can't. We forgot to take that into account when adding another attribute bitmask work for the SECURITY_LABEL attribute. There an export entry with the "refer" option can result in: [ 88.414272] kernel BUG at fs/nfsd/nfs4xdr.c:2249! [ 88.414828] invalid opcode: 0000 [#1] SMP [ 88.415368] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iosf_mbi ppdev btrfs coretemp crct10dif_pclmul crc32_pclmul crc32c_intel xor ghash_clmulni_intel raid6_pq vmw_balloon parport_pc parport i2c_piix4 shpchp vmw_vmci acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi mptscsih serio_raw mptbase e1000 scsi_transport_spi ata_generic pata_acpi [last unloaded: nfsd] [ 88.417827] CPU: 0 PID: 2116 Comm: nfsd Not tainted 4.0.7-300.fc22.x86_64 #1 [ 88.418448] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [ 88.419093] task: ffff880079146d50 ti: ffff8800785d8000 task.ti: ffff8800785d8000 [ 88.419729] RIP: 0010:[<ffffffffa04b3c10>] [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd] [ 88.420376] RSP: 0000:ffff8800785db998 EFLAGS: 00010206 [ 88.421027] RAX: 0000000000000001 RBX: 000000000018091a RCX: ffff88006668b980 [ 88.421676] RDX: 00000000fffef7fc RSI: 0000000000000000 RDI: ffff880078d05000 [ 88.422315] RBP: ffff8800785dbb58 R08: ffff880078d043f8 R09: ffff880078d4a000 [ 88.422968] R10: 0000000000010000 R11: 0000000000000002 R12: 0000000000b0a23a [ 88.423612] R13: ffff880078d05000 R14: ffff880078683100 R15: ffff88006668b980 [ 88.424295] FS: 0000000000000000(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000 [ 88.424944] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.425597] CR2: 00007f40bc370f90 CR3: 0000000035af5000 CR4: 00000000001407f0 [ 88.426285] Stack: [ 88.426921] ffff8800785dbaa8 ffffffffa049e4af ffff8800785dba08 ffffffff813298f0 [ 88.427585] ffff880078683300 ffff8800769b0de8 0000089d00000001 0000000087f805e0 [ 88.428228] ffff880000000000 ffff880079434a00 0000000000000000 ffff88006668b980 [ 88.428877] Call Trace: [ 88.429527] [<ffffffffa049e4af>] ? exp_get_by_name+0x7f/0xb0 [nfsd] [ 88.430168] [<ffffffff813298f0>] ? inode_doinit_with_dentry+0x210/0x6a0 [ 88.430807] [<ffffffff8123833e>] ? d_lookup+0x2e/0x60 [ 88.431449] [<ffffffff81236133>] ? dput+0x33/0x230 [ 88.432097] [<ffffffff8123f214>] ? mntput+0x24/0x40 [ 88.432719] [<ffffffff812272b2>] ? path_put+0x22/0x30 [ 88.433340] [<ffffffffa049ac87>] ? nfsd_cross_mnt+0xb7/0x1c0 [nfsd] [ 88.433954] [<ffffffffa04b54e0>] nfsd4_encode_dirent+0x1b0/0x3d0 [nfsd] [ 88.434601] [<ffffffffa04b5330>] ? nfsd4_encode_getattr+0x40/0x40 [nfsd] [ 88.435172] [<ffffffffa049c991>] nfsd_readdir+0x1c1/0x2a0 [nfsd] [ 88.435710] [<ffffffffa049a530>] ? nfsd_direct_splice_actor+0x20/0x20 [nfsd] [ 88.436447] [<ffffffffa04abf30>] nfsd4_encode_readdir+0x120/0x220 [nfsd] [ 88.437011] [<ffffffffa04b58cd>] nfsd4_encode_operation+0x7d/0x190 [nfsd] [ 88.437566] [<ffffffffa04aa6dd>] nfsd4_proc_compound+0x24d/0x6f0 [nfsd] [ 88.438157] [<ffffffffa0496103>] nfsd_dispatch+0xc3/0x220 [nfsd] [ 88.438680] [<ffffffffa006f0cb>] svc_process_common+0x43b/0x690 [sunrpc] [ 88.439192] [<ffffffffa0070493>] svc_process+0x103/0x1b0 [sunrpc] [ 88.439694] [<ffffffffa0495a57>] nfsd+0x117/0x190 [nfsd] [ 88.440194] [<ffffffffa0495940>] ? nfsd_destroy+0x90/0x90 [nfsd] [ 88.440697] [<ffffffff810bb728>] kthread+0xd8/0xf0 [ 88.441260] [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180 [ 88.441762] [<ffffffff81789e58>] ret_from_fork+0x58/0x90 [ 88.442322] [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180 [ 88.442879] Code: 0f 84 93 05 00 00 83 f8 ea c7 85 a0 fe ff ff 00 00 27 30 0f 84 ba fe ff ff 85 c0 0f 85 a5 fe ff ff e9 e3 f9 ff ff 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 be 04 00 00 00 4c 89 ef 4c 89 8d 68 fe [ 88.444052] RIP [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd] [ 88.444658] RSP <ffff8800785db998> [ 88.445232] ---[ end trace 6cb9d0487d94a29f ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-07-20 14:58:22 -04:00
Christoph Hellwig	68e8bb0334	nfsd: wrap too long lines in nfsd4_encode_read Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-06-22 14:15:05 -04:00
Christoph Hellwig	96bcad5064	nfsd: fput rd_file from XDR encode context Remove the hack where we fput the read-specific file in generic code. Instead we can do it in nfsd4_encode_read as that gets called for all error cases as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-06-22 14:15:04 -04:00
Christoph Hellwig	af90f707fa	nfsd: take struct file setup fully into nfs4_preprocess_stateid_op This patch changes nfs4_preprocess_stateid_op so it always returns a valid struct file if it has been asked for that. For that we now allocate a temporary struct file for special stateids, and check permissions if we got the file structure from the stateid. This ensures that all callers will get their handling of special stateids right, and avoids code duplication. There is a little wart in here because the read code needs to know if we allocated a file structure so that it can copy around the read-ahead parameters. In the long run we should probably aim to cache full file structures used with special stateids instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-06-22 14:15:03 -04:00
Christoph Hellwig	e749a4621e	nfsd: clean up raparams handling Refactor the raparam hash helpers to just deal with the raparms, and keep opening/closing files separate from that. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-06-19 15:39:51 -04:00
Andreas Gruenbacher	0c9d65e76a	nfsd: Checking for acl support does not require fetching any acls Whether or not a file system supports acls can be determined with IS_POSIXACL(inode) and does not require trying to fetch any acls; the code for computing the supported_attrs and aclsupport attributes can be simplified. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-05-29 11:04:02 -04:00
Linus Torvalds	9ec3a646fe	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only	2015-04-26 17:22:07 -07:00
J. Bruce Fields	6e4891dc28	nfsd4: fix READ permission checking In the case we already have a struct file (derived from a stateid), we still need to do permission-checking; otherwise an unauthorized user could gain access to a file by sniffing or guessing somebody else's stateid. Cc: stable@vger.kernel.org Fixes: `dc97618ddd` "nfsd4: separate splice and readv cases" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-04-21 16:16:01 -04:00
David Howells	2b0143b5c9	VFS: normal filesystems (and lustre): d_inode() annotations that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:06:57 -04:00
Kinglong Mee	1ec8c0c47f	nfsd: Remove duplicate macro define for max sec label length NFS4_MAXLABELLEN has defined for sec label max length, use it directly. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-03-31 16:46:39 -04:00
Kinglong Mee	b77a4b2edb	NFSD: Using path_equal() for checking two paths Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-03-31 16:46:38 -04:00
Kinglong Mee	376675daea	NFSD: Take care the return value from nfsd4_encode_stateid Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-03-25 21:13:02 -04:00
Kinglong Mee	db59c0ef08	NFSD: Take care the return value from nfsd4_decode_stateid Return status after nfsd4_decode_stateid failed. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-03-20 12:43:59 -04:00
Christoph Hellwig	9cf514ccfa	nfsd: implement pNFS operations Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage outstanding layouts and devices. Layout management is very straight forward, with a nfs4_layout_stateid structure that extends nfs4_stid to manage layout stateids as the top-level structure. It is linked into the nfs4_file and nfs4_client structures like the other stateids, and contains a linked list of layouts that hang of the stateid. The actual layout operations are implemented in layout drivers that are not part of this commit, but will be added later. The worst part of this commit is the management of the pNFS device IDs, which suffers from a specification that is not sanely implementable due to the fact that the device-IDs are global and not bound to an export, and have a small enough size so that we can't store the fsid portion of a file handle, and must never be reused. As we still do need perform all export authentication and validation checks on a device ID passed to GETDEVICEINFO we are caught between a rock and a hard place. To work around this issue we add a new hash that maps from a 64-bit integer to a fsid so that we can look up the export to authenticate against it, a 32-bit integer as a generation that we can bump when changing the device, and a currently unused 32-bit integer that could be used in the future to handle more than a single device per export. Entries in this hash table are never deleted as we can't reuse the ids anyway, and would have a severe lifetime problem anyway as Linux export structures are temporary structures that can go away under load. Parts of the XDR data, structures and marshaling/unmarshaling code, as well as many concepts are derived from the old pNFS server implementation from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman, Mike Sager, Ricardo Labiaga and many others. Signed-off-by: Christoph Hellwig <hch@lst.de>	2015-02-02 18:09:42 +01:00
Christoph Hellwig	4c94e13e9c	nfsd: factor out a helper to decode nfstime4 values Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-23 10:29:13 -05:00
J. Bruce Fields	0ec016e3e0	nfsd4: tweak rd_dircount accounting RFC 3530 14.2.24 says This value represents the length of the names of the directory entries and the cookie value for these entries. This length represents the XDR encoding of the data (names and cookies)... The "xdr encoding" of the name should probably include the 4 bytes for the length. But this is all just a hint so not worth e.g. backporting to stable. Also reshuffle some lines to more clearly group together the dircount-related code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-01-07 14:48:10 -05:00
Linus Torvalds	0b233b7c79	Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "A comparatively quieter cycle for nfsd this time, but still with two larger changes: - RPC server scalability improvements from Jeff Layton (using RCU instead of a spinlock to find idle threads). - server-side NFSv4.2 ALLOCATE/DEALLOCATE support from Anna Schumaker, enabling fallocate on new clients" * 'for-3.19' of git://linux-nfs.org/~bfields/linux: (32 commits) nfsd4: fix xdr4 count of server in fs_location4 nfsd4: fix xdr4 inclusion of escaped char sunrpc/cache: convert to use string_escape_str() sunrpc: only call test_bit once in svc_xprt_received fs: nfsd: Fix signedness bug in compare_blob sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt sunrpc: convert to lockless lookup of queued server threads sunrpc: fix potential races in pool_stats collection sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it sunrpc: require svc_create callers to pass in meaningful shutdown routine sunrpc: have svc_wake_up only deal with pool 0 sunrpc: convert sp_task_pending flag to use atomic bitops sunrpc: move rq_cachetype field to better optimize space sunrpc: move rq_splice_ok flag into rq_flags sunrpc: move rq_dropme flag into rq_flags sunrpc: move rq_usedeferral flag to rq_flags sunrpc: move rq_local field to rq_flags sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it nfsd: minor off by one checks in __write_versions() sunrpc: release svc_pool_map reference when serv allocation fails ...	2014-12-16 15:25:31 -08:00
Benjamin Coddington	bf7491f1be	nfsd4: fix xdr4 count of server in fs_location4 Fix a bug where nfsd4_encode_components_esc() incorrectly calculates the length of server array in fs_location4--note that it is a count of the number of array elements, not a length in bytes. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: `082d4bd72a` (nfsd4: "backfill" using write_bytes_to_xdr_buf) Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 15:52:17 -05:00
Benjamin Coddington	5a64e56976	nfsd4: fix xdr4 inclusion of escaped char Fix a bug where nfsd4_encode_components_esc() includes the esc_end char as an additional string encoding. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org Fixes: `e7a0444aef` "nfsd: add IPv6 addr escaping to fs_location hosts" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 15:51:30 -05:00
Jeff Layton	779fb0f3af	sunrpc: move rq_splice_ok flag into rq_flags Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-12-09 11:22:21 -05:00
Al Viro	a455589f18	assorted conversions to %p[dD] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-11-19 13:01:20 -05:00
Anna Schumaker	b0cb908523	nfsd: Add DEALLOCATE support DEALLOCATE only returns a status value, meaning we can use the noop() xdr encoder to reply to the client. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-11-07 16:20:15 -05:00
Anna Schumaker	95d871f03c	nfsd: Add ALLOCATE support The ALLOCATE operation is used to preallocate space in a file. I can do this by using vfs_fallocate() to do the actual preallocation. ALLOCATE only returns a status indicator, so we don't need to write a special encode() function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-11-07 16:19:49 -05:00
Linus Torvalds	6dea0737bc	Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "Highlights: - support the NFSv4.2 SEEK operation (allowing clients to support SEEK_HOLE/SEEK_DATA), thanks to Anna. - end the grace period early in a number of cases, mitigating a long-standing annoyance, thanks to Jeff - improve SMP scalability, thanks to Trond" * 'for-3.18' of git://linux-nfs.org/~bfields/linux: (55 commits) nfsd: eliminate "to_delegation" define NFSD: Implement SEEK NFSD: Add generic v4.2 infrastructure svcrdma: advertise the correct max payload nfsd: introduce nfsd4_callback_ops nfsd: split nfsd4_callback initialization and use nfsd: introduce a generic nfsd4_cb nfsd: remove nfsd4_callback.cb_op nfsd: do not clear rpc_resp in nfsd4_cb_done_sequence nfsd: fix nfsd4_cb_recall_done error handling nfsd4: clarify how grace period ends nfsd4: stop grace_time update at end of grace period nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls nfsd: serialize nfsdcltrack upcalls for a particular client nfsd: pass extra info in env vars to upcalls to allow for early grace period end nfsd: add a v4_end_grace file to /proc/fs/nfsd lockd: add a /proc/fs/lockd/nlm_end_grace file nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE nfsd: remove redundant boot_time parm from grace_done client tracking op ...	2014-10-08 12:51:44 -04:00
J. Bruce Fields	15b23ef5d3	nfsd4: fix corruption of NFSv4 read data The calculation of page_ptr here is wrong in the case the read doesn't start at an offset that is a multiple of a page. The result is that nfs4svc_encode_compoundres sets rq_next_page to a value one too small, and then the loop in svc_free_res_pages may incorrectly fail to clear a page pointer in rq_respages[]. Pages left in rq_respages[] are available for the next rpc request to use, so xdr data may be written to that page, which may hold data still waiting to be transmitted to the client or data in the page cache. The observed result was silent data corruption seen on an NFSv4 client. We tag this as "fixing" `05638dc73a` because that commit exposed this bug, though the incorrect calculation predates it. Particular thanks to Andrea Arcangeli and David Gilbert for analysis and testing. Fixes: `05638dc73a` "nfsd4: simplify server xdr->next_page use" Cc: stable@vger.kernel.org Reported-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-30 15:57:04 -04:00
Anna Schumaker	24bab49122	NFSD: Implement SEEK This patch adds server support for the NFS v4.2 operation SEEK, which returns the position of the next hole or data segment in a file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-29 14:35:20 -04:00
Anna Schumaker	87a15a8090	NFSD: Add generic v4.2 infrastructure It's cleaner to introduce everything at once and have the server reply with "not supported" than it would be to introduce extra operations when implementing a specific one in the middle of the list. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-29 14:35:19 -04:00
J. Bruce Fields	aee3776441	nfsd4: fix rd_dircount enforcement Commit `3b29970909` "nfsd4: enforce rd_dircount" totally misunderstood rd_dircount; it refers to total non-attribute bytes returned, not number of directory entries returned. Bring the code into agreement with RFC 3530 section 14.2.24. Cc: stable@vger.kernel.org Fixes: `3b29970909` "nfsd4: enforce rd_dircount" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-08 12:02:03 -04:00
J. Bruce Fields	f7b43d0c99	nfsd4: reserve adequate space for LOCK op As of `8c7424cff6` "nfsd4: don't try to encode conflicting owner if low on space", we permit the server to process a LOCK operation even if there might not be space to return the conflicting lockowner, because we've made returning the conflicting lockowner optional. However, the rpc server still wants to know the most we might possibly return, so we need to take into account the possible conflicting lockowner in the svc_reserve_space() call here. Symptoms were log messages like "RPC request reserved 88 but used 108". Fixes: `8c7424cff6` "nfsd4: don't try to encode conflicting owner if low on space" Reported-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:14 -04:00
J. Bruce Fields	1383bf37ce	nfsd4: remove obsolete comment We do what Neil suggests now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:14 -04:00
Linus Torvalds	0d10c2c170	Merge branch 'for-3.17' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "This includes a major rewrite of the NFSv4 state code, which has always depended on a single mutex. As an example, open creates are no longer serialized, fixing a performance regression on NFSv3->NFSv4 upgrades. Thanks to Jeff, Trond, and Benny, and to Christoph for review. Also some RDMA fixes from Chuck Lever and Steve Wise, and miscellaneous fixes from Kinglong Mee and others" * 'for-3.17' of git://linux-nfs.org/~bfields/linux: (167 commits) svcrdma: remove rdma_create_qp() failure recovery logic nfsd: add some comments to the nfsd4 object definitions nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net nfsd: remove nfs4_lock_state: nfs4_laundromat nfsd: Remove nfs4_lock_state(): reclaim_complete() nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session() nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn() nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt() nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op() nfsd: remove old fault injection infrastructure nfsd: add more granular locking to *_delegations fault injectors nfsd: add more granular locking to forget_openowners fault injector nfsd: add more granular locking to forget_locks fault injector nfsd: add a list_head arg to nfsd_foreach_client_lock ...	2014-08-09 14:31:18 -07:00
Jeff Layton	58fb12e6a4	nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache We don't want to rely on the client_mutex for protection in the case of NFSv4 open owners. Instead, we add a mutex that will only be taken for NFSv4.0 state mutating operations, and that will be released once the entire compound is done. Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay take a reference to the stateowner when they are using it for NFSv4.0 open and lock replay caching. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-31 14:20:19 -04:00
Kinglong Mee	f98bac5a30	NFSD: Fix crash encoding lock reply on 32-bit Commit `8c7424cff6` "nfsd4: don't try to encode conflicting owner if low on space" forgot to free conf->data in nfsd4_encode_lockt and before sign conf->data to NULL in nfsd4_encode_lock_denied, causing a leak. Worse, kfree() can be called on an uninitialized pointer in the case of a succesful lock (or one that fails for a reason other than a conflict). (Note that lock->lk_denied.ld_owner.data appears it should be zero here, until you notice that it's one arm of a union the other arm of which is written to in the succesful case by the memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid, sizeof(stateid_t)); in nfsd4_lock(). In the 32-bit case this overwrites ld_owner.data.) Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: `8c7424cff6` ""nfsd4: don't try to encode conflicting owner if low on space" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-23 10:31:56 -04:00
J. Bruce Fields	5d6031ca74	nfsd4: zero op arguments beyond the 8th compound op The first 8 ops of the compound are zeroed since they're a part of the argument that's zeroed by the memset(rqstp->rq_argp, 0, procp->pc_argsize); in svc_process_common(). But we handle larger compounds by allocating the memory on the fly in nfsd4_decode_compound(). Other than code recently fixed by `01529e3f81` "NFSD: Fix memory leak in encoding denied lock", I don't know of any examples of code depending on this initialization. But it definitely seems possible, and I'd rather be safe. Compounds this long are unusual so I'm much more worried about failure in this poorly tested cases than about an insignificant performance hit. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-17 16:20:39 -04:00
Kinglong Mee	d5d5c304b1	NFSD: Fix bad checking of space for padding in splice read Note that the caller has already reserved space for count and eof, so xdr->p has already moved past them, only the padding remains. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Fixes `dc97618ddd` (nfsd4: separate splice and readv cases) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 15:19:25 -04:00
Kinglong Mee	01529e3f81	NFSD: Fix memory leak in encoding denied lock Commit `8c7424cff6` (nfsd4: don't try to encode conflicting owner if low on space) forgot free conf->data in nfsd4_encode_lockt and before sign conf->data to NULL in nfsd4_encode_lock_denied. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-09 20:55:08 -04:00
Trond Myklebust	b607664ee7	nfsd: Cleanup nfs4svc_encode_compoundres Move the slot return, put session etc into a helper in fs/nfsd/nfs4state.c instead of open coding in nfs4svc_encode_compoundres. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-08 17:14:34 -04:00
Kinglong Mee	1055414fe1	NFSD: Avoid warning message when compile at i686 arch fs/nfsd/nfs4xdr.c: In function 'nfsd4_encode_readv': >> fs/nfsd/nfs4xdr.c:3137:148: warning: comparison of distinct pointer types lacks a cast [enabled by default] thislen = min(len, ((void )xdr->end - (void )xdr->p)); Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-08 17:14:28 -04:00
J. Bruce Fields	d5e2338324	nfsd4: replace defer_free by svcxdr_tmpalloc Avoid an extra allocation for the tmpbuf struct itself, and stop ignoring some allocation failures. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-08 17:14:27 -04:00
J. Bruce Fields	bcaab953b1	nfsd4: remove nfs4_acl_new This is a not-that-useful kmalloc wrapper. And I'd like one of the callers to actually use something other than kmalloc. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-08 17:14:27 -04:00
J. Bruce Fields	29c353b3fe	nfsd4: define svcxdr_dupstr to share some common code Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-08 17:14:26 -04:00
J. Bruce Fields	ce043ac826	nfsd4: remove unused defer_free argument `28e05dd845` "knfsd: nfsd4: represent nfsv4 acl with array instead of linked list" removed the last user that wanted a custom free function. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-08 17:14:25 -04:00
J. Bruce Fields	7fb84306f5	nfsd4: rename cr_linkname->cr_data The name of a link is currently stored in cr_name and cr_namelen, and the content in cr_linkname and cr_linklen. That's confusing. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-08 17:14:24 -04:00
J. Bruce Fields	b829e9197a	nfsd: fix rare symlink decoding bug An NFS operation that creates a new symlink includes the symlink data, which is xdr-encoded as a length followed by the data plus 0 to 3 bytes of zero-padding as required to reach a 4-byte boundary. The vfs, on the other hand, wants null-terminated data. The simple way to handle this would be by copying the data into a newly allocated buffer with space for the final null. The current nfsd_symlink code tries to be more clever by skipping that step in the (likely) case where the byte following the string is already 0. But that assumes that the byte following the string is ours to look at. In fact, it might be the first byte of a page that we can't read, or of some object that another task might modify. Worse, the NFSv4 code tries to fix the problem by actually writing to that byte. In the NFSv2/v3 cases this actually appears to be safe: - nfs3svc_decode_symlinkargs explicitly null-terminates the data (after first checking its length and copying it to a new page). - NFSv2 limits symlinks to 1k. The buffer holding the rpc request is always at least a page, and the link data (and previous fields) have maximum lengths that prevent the request from reaching the end of a page. In the NFSv4 case the CREATE op is potentially just one part of a long compound so can end up on the end of a page if you're unlucky. The minimal fix here is to copy and null-terminate in the NFSv4 case. The nfsd_symlink() interface here seems too fragile, though. It should really either do the copy itself every time or just require a null-terminated string. Reported-by: Jeff Layton <jlayton@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-08 17:14:22 -04:00
Kinglong Mee	c3a4561796	nfsd: Fix bad reserving space for encoding rdattr_error Introduced by commit `561f0ed498` (nfsd4: allow large readdirs). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-07 14:16:31 -04:00
Avi Kivity	69bbd9c7b9	nfs: fix nfs4d readlink truncated packet XDR requires 4-byte alignment; nfs4d READLINK reply writes out the padding, but truncates the packet to the padding-less size. Fix by taking the padding into consideration when truncating the packet. Symptoms: # ll /mnt/ ls: cannot read symbolic link /mnt/test: Input/output error total 4 -rw-r--r--. 1 root root 0 Jun 14 01:21 123456 lrwxrwxrwx. 1 root root 6 Jul 2 03:33 test drwxr-xr-x. 1 root root 0 Jul 2 23:50 tmp drwxr-xr-x. 1 root root 60 Jul 2 23:44 tree Signed-off-by: Avi Kivity <avi@cloudius-systems.com> Fixes: `476a7b1f4b` (nfsd4: don't treat readlink like a zero-copy operation) Reviewed-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-02 17:37:13 -04:00
J. Bruce Fields	76f47128f9	nfsd: fix rare symlink decoding bug An NFS operation that creates a new symlink includes the symlink data, which is xdr-encoded as a length followed by the data plus 0 to 3 bytes of zero-padding as required to reach a 4-byte boundary. The vfs, on the other hand, wants null-terminated data. The simple way to handle this would be by copying the data into a newly allocated buffer with space for the final null. The current nfsd_symlink code tries to be more clever by skipping that step in the (likely) case where the byte following the string is already 0. But that assumes that the byte following the string is ours to look at. In fact, it might be the first byte of a page that we can't read, or of some object that another task might modify. Worse, the NFSv4 code tries to fix the problem by actually writing to that byte. In the NFSv2/v3 cases this actually appears to be safe: - nfs3svc_decode_symlinkargs explicitly null-terminates the data (after first checking its length and copying it to a new page). - NFSv2 limits symlinks to 1k. The buffer holding the rpc request is always at least a page, and the link data (and previous fields) have maximum lengths that prevent the request from reaching the end of a page. In the NFSv4 case the CREATE op is potentially just one part of a long compound so can end up on the end of a page if you're unlucky. The minimal fix here is to copy and null-terminate in the NFSv4 case. The nfsd_symlink() interface here seems too fragile, though. It should really either do the copy itself every time or just require a null-terminated string. Reported-by: Jeff Layton <jlayton@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-06-27 16:10:46 -04:00
Kinglong Mee	3c7aa15d20	NFSD: Using min/max/min_t/max_t for calculate Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-06-23 11:31:36 -04:00

1 2 3 4 5 ...

330 Commits