linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-11 12:28:41 +08:00

Author	SHA1	Message	Date
David Windsor	6391af6f58	vfs: Copy struct mount.mnt_id to userspace using put_user() The mnt_id field can be copied with put_user(), so there is no need to use copy_to_user(). In both cases, hardened usercopy is being bypassed since the size is constant, and not open to runtime manipulation. This patch is verbatim from Brad Spengler/PaX Team's PAX_USERCOPY whitelisting code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. Signed-off-by: David Windsor <dave@nullcore.net> [kees: adjust commit log] Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2018-01-15 12:07:51 -08:00
David Windsor	6a9b88204c	vfs: Define usercopy region in names_cache slab caches VFS pathnames are stored in the names_cache slab cache, either inline or across an entire allocation entry (when approaching PATH_MAX). These are copied to/from userspace, so they must be entirely whitelisted. cache object allocation: include/linux/fs.h: #define __getname() kmem_cache_alloc(names_cachep, GFP_KERNEL) example usage trace: strncpy_from_user+0x4d/0x170 getname_flags+0x6f/0x1f0 user_path_at_empty+0x23/0x40 do_mount+0x69/0xda0 SyS_mount+0x83/0xd0 fs/namei.c: getname_flags(...): ... result = __getname(); ... kname = (char )result->iname; result->name = kname; len = strncpy_from_user(kname, filename, EMBEDDED_NAME_MAX); ... if (unlikely(len == EMBEDDED_NAME_MAX)) { const size_t size = offsetof(struct filename, iname[1]); kname = (char )result; result = kzalloc(size, GFP_KERNEL); ... result->name = kname; len = strncpy_from_user(kname, filename, PATH_MAX); In support of usercopy hardening, this patch defines the entire cache object in the names_cache slab cache as whitelisted, since it may entirely hold name strings to be copied to/from userspace. This patch is verbatim from Brad Spengler/PaX Team's PAX_USERCOPY whitelisting code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. Signed-off-by: David Windsor <dave@nullcore.net> [kees: adjust commit log, add usage trace] Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2018-01-15 12:07:50 -08:00
David Windsor	80344266c1	dcache: Define usercopy region in dentry_cache slab cache When a dentry name is short enough, it can be stored directly in the dentry itself (instead in a separate kmalloc allocation). These dentry short names, stored in struct dentry.d_iname and therefore contained in the dentry_cache slab cache, need to be coped to userspace. cache object allocation: fs/dcache.c: __d_alloc(...): ... dentry = kmem_cache_alloc(dentry_cache, ...); ... dentry->d_name.name = dentry->d_iname; example usage trace: filldir+0xb0/0x140 dcache_readdir+0x82/0x170 iterate_dir+0x142/0x1b0 SyS_getdents+0xb5/0x160 fs/readdir.c: (called via ctx.actor by dir_emit) filldir(..., const char *name, ...): ... copy_to_user(..., name, namlen) fs/libfs.c: dcache_readdir(...): ... next = next_positive(dentry, p, 1) ... dir_emit(..., next->d_name.name, ...) In support of usercopy hardening, this patch defines a region in the dentry_cache slab cache in which userspace copy operations are allowed. This region is known as the slab cache's usercopy region. Slab caches can now check that each dynamic copy operation involving cache-managed memory falls entirely within the slab's usercopy region. This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY whitelisting code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. Signed-off-by: David Windsor <dave@nullcore.net> [kees: adjust hunks for kmalloc-specific things moved later] [kees: adjust commit log, provide usage trace] Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>	2018-01-15 12:07:50 -08:00
Linus Torvalds	2db767d988	NFS client fixes for Linux 4.15-rc2 Bugfixes: - NFSv4: Ensure gcc 4.4.4 can compile initialiser for "invalid_stateid" - SUNRPC: Allow connect to return EHOSTUNREACH - SUNRPC: Handle ENETDOWN errors -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlohwp4ACgkQ18tUv7Cl QOtq1A//RPOxJBPQsImfkVTiVzxZbS8k2/obJSZjPYoNozmywEJs9dnFYJVCFUGp l9AvRd/SjXOVjGovk6ZhDCY3xA2eP1XfOLiVg7EhpczPVCRNJ34BUT7hWyxnTLSz MKc1qLLfVaSjsLioO6YmdCPjiGC0KegrBKNlRlIbI+OjCq5aNJpz73Fb4mFgCp5M taERunf7X29WHxAVn0c3mhIHN7tpCi9SgfbMURBEKLNrzj7RxnRY07dT1S9Mg/Yg 4FWU9FIpAyk9C9we/LR9jUywZQ3GGJFFFTOo8RfyMB/LR9RACSXnbHjhI1nUEQTb R/NpBxlpvxEOapHdmw32jwj1fkY/WYlUiJekQhjEekp/HkFNdctQL8PjrhG6lIW7 eBfFqZ2RUhYF1OQ8k4o0pR60O2scH3/D7tZwpgnJMFSpQSMnPnU8K3gvn/B5Mi4f UPDHtfj3GlWCIIJq1RIqKN4mt4tPktatnTCLIzDmqNbwqISwxow1lxmSesNejULo MryXLLl5M3XegjokXs0d0hadoywswHRTAxXxQEZav0dKMcHq4F0NirVw+VOIyNCB CztIVFI5Czzo4h4x99lgN26bNTysGMvse2qiPkVVr0CZt2leyrZyTl9khvDe3C0t ijyq882b4LqibuQtnI3l/Pynrrowfp7fqYx7SO62VJjraBVYUzE= =eQyi -----END PGP SIGNATURE----- Merge tag 'nfs-for-4.15-2' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client fixes from Anna Schumaker: "These patches fix a problem with compiling using an old version of gcc, and also fix up error handling in the SUNRPC layer. - NFSv4: Ensure gcc 4.4.4 can compile initialiser for "invalid_stateid" - SUNRPC: Allow connect to return EHOSTUNREACH - SUNRPC: Handle ENETDOWN errors" * tag 'nfs-for-4.15-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: SUNRPC: Handle ENETDOWN errors SUNRPC: Allow connect to return EHOSTUNREACH NFSv4: Ensure gcc 4.4.4 can compile initialiser for "invalid_stateid"	2017-12-01 20:04:20 -05:00
Linus Torvalds	788c1da05b	Changes since last update: - Fix memory leaks that appeared after removing ifork inline data buffer - Recover deferred rmap update log items in correct order - Fix memory leaks when buffer construction fails - Fix memory leaks when bmbt is corrupt - Fix some uninitialized variables and math problems in the quota scrubber - Add some omitted attribution tags on the log replay commit - Fix some UBSAN complaints about integer overflows with large sparse files - Implement an effective inode mode check in online fsck - Fix log's inability to retry quota item writeout due to transient errors -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCgAGBQJaIDZ8AAoJEPh/dxk0SrTrTD4QAIUq223XSyqMJYkAK163zMj4 PADY30MV7uMlFBLEm3b7ZEWA/vtFzDM7Qpa61WN15oR5jEVSqSFes9AzuLeISqia s7Hc1ksqgZLNaMnW+jQc4iT/yiCVhiWw3rFC4tahDVCF2lJO/la3ToUBbcoADAFk kBYVN1H1t5b+n5+A9QY6+Vxm6LXGPPo8vNyCQCEtN+dE7CcSEL4Ff9H9GmJiVPzk rG6uizwRvxZje/yY1jEnkCSI88Gj1v0L//VmIDDuGjCZleYxwbTQQO0l8p4S+Su8 48la8PZbk3KcBTfiRbcU0m4995DHDVT/mAOWHeZnv+ZI5jhDEe1lpJG5l65kwPK+ BOoTYaRaBv3yZvEOob6wEqyfT3A1dxXstKBJLPyHx+McqFH8+NV2WAry+6dedOkv Hwz6+OlAFmuBuhOZAZSt0LSWxu/qYovo5lCSNrBtiLlmDyFjtdbanQ7s8oWaV7p/ wimNV4Y+Y3XiePOEUftnG8yxOULZS4KMeYsdJxj9HzaKloYHQer+MWfPe0gzExBb eE3P9PckQpcx9hK8LE1irgDCDG6J2eb8b5sFZY0eNzngdtWCR/xYz3NFT+72kz3s XOI0mByH1Ab0Q1lvJml0RyW86Uj7lpMD2SzV2nVhbYrW81rkkzb7AQx5VyO57Gq6 WAX9mHNNRcY+uVrbb8QQ =oTB7 -----END PGP SIGNATURE----- Merge tag 'xfs-4.15-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Darrick Wong: "Here are some bug fixes for 4.15-rc2. - fix memory leaks that appeared after removing ifork inline data buffer - recover deferred rmap update log items in correct order - fix memory leaks when buffer construction fails - fix memory leaks when bmbt is corrupt - fix some uninitialized variables and math problems in the quota scrubber - add some omitted attribution tags on the log replay commit - fix some UBSAN complaints about integer overflows with large sparse files - implement an effective inode mode check in online fsck - fix log's inability to retry quota item writeout due to transient errors" * tag 'xfs-4.15-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Properly retry failed dquot items in case of error during buffer writeback xfs: scrub inode mode properly xfs: remove unused parameter from xfs_writepage_map xfs: ubsan fixes xfs: calculate correct offset in xfs_scrub_quota_item xfs: fix uninitialized variable in xfs_scrub_quota xfs: fix leaks on corruption errors in xfs_bmap.c xfs: fortify xfs_alloc_buftarg error handling xfs: log recovery should replay deferred ops in order xfs: always free inline data before resetting inode fork during ifree	2017-12-01 20:00:19 -05:00
David Howells	f8de483e74	afs: Properly reset afs_vnode (inode) fields When an AFS inode is allocated by afs_alloc_inode(), the allocated afs_vnode struct isn't necessarily reset from the last time it was used as an inode because the slab constructor is only invoked once when the memory is obtained from the page allocator. This means that information can leak from one inode to the next because we're not calling kmem_cache_zalloc(). Some of the information isn't reset, in particular the permit cache pointer. Bring the clearances up to date. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com>	2017-12-01 11:51:24 +00:00
David Howells	1bcab12521	afs: Fix permit refcounting Fix four refcount bugs in afs_cache_permit(): (1) When checking the result of the kzalloc(), we can't just return, but must put 'permits'. (2) We shouldn't put permits immediately after hashing a new permit as we need to keep the pointer stable so that we can check to see if vnode->permit_cache has changed before we decide whether to assign to it. (3) 'permits' is being put twice. (4) We need to put either the replacement or the thing replaced after the assignment to vnode->permit_cache. Without this, lots of the following are seen: Kernel BUG at ffffffffa039857b [verbose debug info unavailable] ------------[ cut here ]------------ Kernel BUG at ffffffffa039858a [verbose debug info unavailable] ------------[ cut here ]------------ The addresses are in the .text..refcount section of the kafs.ko module. Following the relocation records for the __ex_table section shows one to be due to the decrement in afs_put_permits() and the other to be key_get() in afs_cache_permit(). Occasionally, the following is seen: refcount_t overflow at afs_cache_permit+0x57d/0x5c0 [kafs] in cc1[562], uid/euid: 0/0 WARNING: CPU: 0 PID: 562 at kernel/panic.c:657 refcount_error_report+0x9c/0xac ... Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com>	2017-12-01 11:40:43 +00:00
Linus Torvalds	9c41180be4	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota & reiserfs changes from Jan Kara: - two error checking improvements for quota - remove bogus i_version increase for reiserfs * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Check for register_shrinker() failure. quota: propagate error from __dquot_initialize reiserfs: remove unneeded i_version bump	2017-11-30 18:38:47 -05:00
Carlos Maiolino	373b0589dc	xfs: Properly retry failed dquot items in case of error during buffer writeback Once the inode item writeback errors is already fixed, it's time to fix the same problem in dquot code. Although there were no reports of users hitting this bug in dquot code (at least none I've seen), the bug is there and I was already planning to fix it when the correct approach to fix the inodes part was decided. This patch aims to fix the same problem in dquot code, regarding failed buffers being unable to be resubmitted once they are flush locked. Tested with the recently test-case sent to fstests list by Hou Tao. Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-11-30 08:47:40 -08:00
Darrick J. Wong	3b42d38575	xfs: scrub inode mode properly Since we've used up all the bits in i_mode, the existing mode check doesn't actually do anything useful. However, we've not used all the bit values in the format portion of i_mode, so we /do/ need to test that for bad values. Fixes: `80e4e1268` ("xfs: scrub inodes") Fixes-coverity-id: 1423992 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2017-11-30 08:43:52 -08:00
Darrick J. Wong	2d5f4b5beb	xfs: remove unused parameter from xfs_writepage_map The first thing that xfs_writepage_map does is clobber the offset parameter. Since we never use the passed-in value, turn the parameter into a local variable. This gets rid of an UBSAN warning in generic/466. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2017-11-30 08:43:52 -08:00
Darrick J. Wong	22a6c83777	xfs: ubsan fixes Fix some complaints from the UBSAN about signed integer addition overflows. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2017-11-30 08:43:52 -08:00
Linus Torvalds	a0908a1b7d	Merge branch 'akpm' (patches from Andrew) Mergr misc fixes from Andrew Morton: "28 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (28 commits) fs/hugetlbfs/inode.c: change put_page/unlock_page order in hugetlbfs_fallocate() mm/hugetlb: fix NULL-pointer dereference on 5-level paging machine autofs: revert "autofs: fix AT_NO_AUTOMOUNT not being honored" autofs: revert "autofs: take more care to not update last_used on path walk" fs/fat/inode.c: fix sb_rdonly() change mm, memcg: fix mem_cgroup_swapout() for THPs mm: migrate: fix an incorrect call of prep_transhuge_page() kmemleak: add scheduling point to kmemleak_scan() scripts/bloat-o-meter: don't fail with division by 0 fs/mbcache.c: make count_objects() more robust Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" mm/madvise.c: fix madvise() infinite loop under special circumstances exec: avoid RLIMIT_STACK races with prlimit() IB/core: disable memory registration of filesystem-dax vmas v4l2: disable filesystem-dax mapping support mm: fail get_vaddr_frames() for filesystem-dax mappings mm: introduce get_user_pages_longterm device-dax: implement ->split() to catch invalid munmap attempts mm, hugetlbfs: introduce ->split() to vm_operations_struct scripts/faddr2line: extend usage on generic arch ...	2017-11-29 19:12:44 -08:00
Nadav Amit	72639e6df4	fs/hugetlbfs/inode.c: change put_page/unlock_page order in hugetlbfs_fallocate() hugetlfs_fallocate() currently performs put_page() before unlock_page(). This scenario opens a small time window, from the time the page is added to the page cache, until it is unlocked, in which the page might be removed from the page-cache by another core. If the page is removed during this time windows, it might cause a memory corruption, as the wrong page will be unlocked. It is arguable whether this scenario can happen in a real system, and there are several mitigating factors. The issue was found by code inspection (actually grep), and not by actually triggering the flow. Yet, since putting the page before unlocking is incorrect it should be fixed, if only to prevent future breakage or someone copy-pasting this code. Mike said: "I am of the opinion that this does not need to be sent to stable. Although the ordering is current code is incorrect, there is no way for this to be a problem with current locking. In addition, I verified that the perhaps bigger issue with sys_fadvise64(POSIX_FADV_DONTNEED) for hugetlbfs and other filesystems is addressed in `3a77d21480` ("mm: fadvise: avoid fadvise for fs without backing device")" Link: http://lkml.kernel.org/r/20170826191124.51642-1-namit@vmware.com Fixes: `70c3547e36` ("hugetlbfs: add hugetlbfs_fallocate()") Signed-off-by: Nadav Amit <namit@vmware.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-29 18:40:43 -08:00
Ian Kent	5d38f049ce	autofs: revert "autofs: fix AT_NO_AUTOMOUNT not being honored" Commit `42f4614821` ("autofs: fix AT_NO_AUTOMOUNT not being honored") allowed the fstatat(2) system call to properly honor the AT_NO_AUTOMOUNT flag but introduced a semantic change. In order to honor AT_NO_AUTOMOUNT a semantic change was made to the negative dentry case for stat family system calls in follow_automount(). This changed the unconditional triggering of an automount in this case to no longer be done and an error returned instead. This has caused more problems than I expected so reverting the change is needed. In a discussion with Neil Brown it was concluded that the automount(8) daemon can implement this change without kernel modifications. So that will be done instead and the autofs module documentation updated with a description of the problem and what needs to be done by module users for this specific case. Link: http://lkml.kernel.org/r/151174730120.6162.3848002191530283984.stgit@pluto.themaw.net Fixes: `42f4614821` ("autofs: fix AT_NO_AUTOMOUNT not being honored") Signed-off-by: Ian Kent <raven@themaw.net> Cc: Neil Brown <neilb@suse.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Colin Walters <walters@redhat.com> Cc: Ondrej Holy <oholy@redhat.com> Cc: <stable@vger.kernel.org> [4.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-29 18:40:43 -08:00
Ian Kent	43694d4bf8	autofs: revert "autofs: take more care to not update last_used on path walk" While commit `092a53452b` ("autofs: take more care to not update last_used on path walk") helped (partially) resolve a problem where automounts were not expiring due to aggressive accesses from user space it has a side effect for very large environments. This change helps with the expire problem by making the expire more aggressive but, for very large environments, that means more mount requests from clients. When there are a lot of clients that can mean fairly significant server load increases. It turns out I put the last_used in this position to solve this very problem and failed to update my own thinking of the autofs expire policy. So the patch being reverted introduces a regression which should be fixed. Link: http://lkml.kernel.org/r/151174729420.6162.1832622523537052460.stgit@pluto.themaw.net Fixes: `092a53452b` ("autofs: take more care to not update last_used on path walk") Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: NeilBrown <neilb@suse.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: <stable@vger.kernel.org> [4.11+] Cc: Colin Walters <walters@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Ondrej Holy <oholy@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-29 18:40:43 -08:00
OGAWA Hirofumi	b6e8e12c0a	fs/fat/inode.c: fix sb_rdonly() change Commit `bc98a42c1f` ("VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)") converted fat_remount():new_rdonly from a bool to an int. However fat_remount() depends upon the compiler's conversion of a non-zero integer into boolean `true'. Fix it by switching `new_rdonly' back into a bool. Link: http://lkml.kernel.org/r/87mv3d5x51.fsf@mail.parknet.co.jp Fixes: `bc98a42c1f` ("VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)") Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Joe Perches <joe@perches.com> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-29 18:40:43 -08:00
Jiang Biao	d5dabd6339	fs/mbcache.c: make count_objects() more robust When running ltp stress test for 7*24 hours, vmscan occasionally emits the following warning continuously: mb_cache_scan+0x0/0x3f0 negative objects to delete nr=-9232265467809300450 ... Tracing shows the freeable(mb_cache_count returns) is -1, which causes the continuous accumulation and overflow of total_scan. This patch makes sure that mb_cache_count() cannot return a negative value, which makes the mbcache shrinker more robust. Link: http://lkml.kernel.org/r/1511753419-52328-1-git-send-email-jiang.biao2@zte.com.cn Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Minchan Kim <minchan@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: <zhong.weidong@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-29 18:40:43 -08:00
Kees Cook	04e35f4495	exec: avoid RLIMIT_STACK races with prlimit() While the defense-in-depth RLIMIT_STACK limit on setuid processes was protected against races from other threads calling setrlimit(), I missed protecting it against races from external processes calling prlimit(). This adds locking around the change and makes sure that rlim_max is set too. Link: http://lkml.kernel.org/r/20171127193457.GA11348@beast Fixes: `64701dee41` ("exec: Use sane stack rlimit under secureexec") Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Reported-by: Brad Spengler <spender@grsecurity.net> Acked-by: Serge Hallyn <serge@hallyn.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-29 18:40:42 -08:00
Dan Williams	c7da82b894	mm: replace pmd_write with pmd_access_permitted in fault + gup paths The 'access_permitted' helper is used in the gup-fast path and goes beyond the simple _PAGE_RW check to also: - validate that the mapping is writable from a protection keys standpoint - validate that the pte has _PAGE_USER set since all fault paths where pmd_write is must be referencing user-memory. Link: http://lkml.kernel.org/r/151043111049.2842.15241454964150083466.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-29 18:40:42 -08:00
Linus Torvalds	b915176102	Highlights: - Fixes from Trond for some races in the NFSv4 state code. - Fix from Naofumi Honda for a typo in the blocked lock notificiation code. - Fixes from Vasily Averin for some problems starting and stopping lockd especially in network namespaces. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJaHxq/AAoJECebzXlCjuG+QOYP/jIa9dZnbau3owP8RJJv1+VI RSMYAZkIjy1vixn/BymZo55R7+23BhdLe8CDsknXWo85mIj61kpV1bwF2lVc7FWm +Pt93DkUsUBEjf+/3/58TLknYs5o7UhsEw2Qjg+D3BkO+z95biNa0hUBle2+Nnwi vBQLGqdlCIFZxuzEo7yUlGdKTyefzab4bocgRnh/5JMs+bHzPDD74W1GrGB1oEKX VSGzq0d7LLe23yIJwgP1eaa0tQr1/WsxlL8xD5Im6mXcN9aYa/7VZhg/oCluy8ac v95IBjQUkFqvw5OjDgSX5ZgKzokmRxLjnaUX2JT/sLCk1WxdJhUomw8qb1AJLvav e6xce1M+dR5VihTrD/cEe0xB7CXKXywPd6pXQBosAMInhS79aU8brIPCDtLvNwCw XvtNybbqC0Go89YMt2zuRfBkV7W3FmM1h+h4PWVl+iCl/7+AYIXD1qeX/FuIjnk6 SMEdtTb/cqECuh55YefEljUzY1vKYgquxCNCvcbSrMtVSOZYXXufheY+fBjf5DBb Bnsd1FiPtVkwFwX8bTbGlOOub1Ryl9SD4Ae0Ynu2FNYSFL8BVXTHkTHm9UHl83s5 pr0T6bKlpg+YzZrHVh2Herr9Ze89C9uM7oCU1M062vk4+Cg65paqNTnWVtflYPhG y9p0hsY5csyzm0SZ/1Ui =pR9D -----END PGP SIGNATURE----- Merge tag 'nfsd-4.15-1' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "I screwed up my merge window pull request; I only sent half of what I meant to. There were no new features, just bugfixes of various importance and some very minor cleanup, so I think it's all still appropriate for -rc2. Highlights: - Fixes from Trond for some races in the NFSv4 state code. - Fix from Naofumi Honda for a typo in the blocked lock notificiation code - Fixes from Vasily Averin for some problems starting and stopping lockd especially in network namespaces" * tag 'nfsd-4.15-1' of git://linux-nfs.org/~bfields/linux: (23 commits) lockd: fix "list_add double add" caused by legacy signal interface nlm_shutdown_hosts_net() cleanup race of nfsd inetaddr notifiers vs nn->nfsd_serv change race of lockd inetaddr notifiers vs nlmsvc_rqst change SUNRPC: make cache_detail structures const NFSD: make cache_detail structures const sunrpc: make the function arg as const nfsd: check for use of the closed special stateid nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat lockd: lost rollback of set_grace_period() in lockd_down_net() lockd: added cleanup checks in exit_net hook grace: replace BUG_ON by WARN_ONCE in exit_net hook nfsd: fix locking validator warning on nfs4_ol_stateid->st_mutex class lockd: remove net pointer from messages nfsd: remove net pointer from debug messages nfsd: Fix races with check_stateid_generation() nfsd: Ensure we check stateid validity in the seqid operation checks nfsd: Fix race in lock stateid creation nfsd4: move find_lock_stateid nfsd: Ensure we don't recognise lock stateids after freeing them ...	2017-11-29 14:49:26 -08:00
Linus Torvalds	26cd94744e	for-4.15-rc2-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlofBpkACgkQxWXV+ddt WDvtTQ//emI1QsD4N0e4BxMcZ1bcigiEk3jc4gj+biRapnMHHAHOqJbVtpK1v8gS PCTw+4uD5UOGvhBtS4kXJn8e2qxWCESWJDXwVlW0RHmWLfwd9z7ly0sBMi3oiIqH qief8CIkk3oTexiuuJ3mZGxqnDQjRGtWx2LM+bRJBWMk+jN32v2ObSlv9V505a5M 1daDBsjWojFWa8d4r3YZNJq1df2om/dwVQZ0Wk59bacIo9Xbvok0X459cOlWuv0p mjx8m8uA/z+HVdkTYlzyKpq08O8Z4shj3GrBbSnZ511gKzV+c9jJPxij5pKm3Z2z KW4Mp17+/7GSNcSsJiqnOYi+wtOrak2lD0COlZTijnY2jrv18h8ianoIM6CpzUdy +b09yuFXbPLoUfyl6vFaO/JHuvAkQdaR2tJbds6lvW+liC1ReoL4W1WcUjY6nv9f 6wTaIv0vwgrHaxeIzxKNpnsTlpHAgorFFk0/w8nLb40WX8AoJ/95lo2zws8oaFDN 0Fylu3NYhoDrJZK+D8dbsWx2eTsFVCqep4w0+iEVZl3lfuy3FZl1pu8CL7ru9vJl DNieh+lUvK1Fk+SYIoilGoriW96RbU8+jPo2W4A1ENzeMJfrNCSWtUSZZp4XT4tO 8m1PGud07XBLSxd62bAEDV3KZO2DnY1WxgXbKuIHSi9D5CI1LMo= =7UW+ -----END PGP SIGNATURE----- Merge tag 'for-4.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "We've collected some fixes in since the pre-merge window freeze. There's technically only one regression fix for 4.15, but the rest seems important and candidates for stable. - fix missing flush bio puts in error cases (is serious, but rarely happens) - fix reporting stat::st_blocks for buffered append writes - fix space cache invalidation - fix out of bound memory access when setting zlib level - fix potential memory corruption when fsync fails in the middle - fix crash in integrity checker - incremetnal send fix, path mixup for certain unlink/rename combination - pass flags to writeback so compressed writes can be throttled properly - error handling fixes" * tag 'for-4.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: incremental send, fix wrong unlink path after renaming file btrfs: tree-checker: Fix false panic for sanity test Btrfs: fix list_add corruption and soft lockups in fsync btrfs: Fix wild memory access in compression level parser btrfs: fix deadlock when writing out space cache btrfs: clear space cache inode generation always Btrfs: fix reported number of inode blocks after buffered append writes Btrfs: move definition of the function btrfs_find_new_delalloc_bytes Btrfs: bail out gracefully rather than BUG_ON btrfs: dev_alloc_list is not protected by RCU, use normal list_del btrfs: add missing device::flush_bio puts btrfs: Fix transaction abort during failure in btrfs_rm_dev_item Btrfs: add write_flags for compression bio	2017-11-29 14:26:50 -08:00
Trond Myklebust	445f288d70	NFSv4: Ensure gcc 4.4.4 can compile initialiser for "invalid_stateid" gcc 4.4.4 is too old to have full C11 anonymous union support, so the current initialiser fails to compile. Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> (compile-)Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-11-29 13:46:32 -05:00
Tetsuo Handa	88bc0ede8d	quota: Check for register_shrinker() failure. register_shrinker() might return -ENOMEM error since Linux 3.12. Call panic() as with other failure checks in this function if register_shrinker() failed. Fixes: `1d3d4437ea` ("vmscan: per-node deferred work") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jan Kara <jack@suse.com> Cc: Michal Hocko <mhocko@suse.com> Reviewed-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>	2017-11-29 16:46:48 +01:00
Eric Sandeen	712d361d59	xfs: calculate correct offset in xfs_scrub_quota_item It's only used for tracepoints so it's relatively harmless, but the offset is calculated incorrectly in xfs_scrub_quota_item. qi_dqperchunk is the nr. of dquots per "chunk" which we have conveniently cough defined to always be 1 FSB. Therefore block_offset * qi_dqperchunk == first id in that chunk, and so offset = id / qi_dqperchunk id * dqperchunk is ... meaningless. Fixes-coverity-id: 1423965 Fixes: `c2fc338c` ("xfs: scrub quota information") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-11-28 08:57:11 -08:00
Eric Sandeen	eda6bc27cc	xfs: fix uninitialized variable in xfs_scrub_quota On the first pass through the while(1) loop, we get to xfs_scrub_should_terminate() which can test the uninitialized error variable. Fixes-coverity-id: 1423737 Fixes: `c2fc338c` ("xfs: scrub quota information") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-11-28 08:57:11 -08:00
Eric Sandeen	d41c6172bd	xfs: fix leaks on corruption errors in xfs_bmap.c Use _GOTO instead of _RETURN so we can free the allocated cursor on error. Fixes: `bf80628` ("xfs: remove xfs_bmse_shift_one") Fixes-coverity-id: 1423813, 1423676 Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-11-28 08:57:11 -08:00
Michal Hocko	d210a9874b	xfs: fortify xfs_alloc_buftarg error handling percpu_counter_init failure path doesn't clean up &btp->bt_lru list. Call list_lru_destroy in that error path. Similarly register_shrinker error path is not handled. While it is unlikely to trigger these error path, it is not impossible especially the later might fail with large NUMAs. Let's handle the failure to make the code more robust. Noticed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-11-28 08:57:11 -08:00
Filipe Manana	ea37d5998b	Btrfs: incremental send, fix wrong unlink path after renaming file Under some circumstances, an incremental send operation can issue wrong paths for unlink commands related to files that have multiple hard links and some (or all) of those links were renamed between the parent and send snapshots. Consider the following example: Parent snapshot . (ino 256) \|---- a/ (ino 257) \| \|---- b/ (ino 259) \| \| \|---- c/ (ino 260) \| \| \|---- f2 (ino 261) \| \| \| \|---- f2l1 (ino 261) \| \|---- d/ (ino 262) \|---- f1l1_2 (ino 258) \|---- f2l2 (ino 261) \|---- f1_2 (ino 258) Send snapshot . (ino 256) \|---- a/ (ino 257) \| \|---- f2l1/ (ino 263) \| \|---- b2/ (ino 259) \| \|---- c/ (ino 260) \| \| \|---- d3 (ino 262) \| \| \|---- f1l1_2 (ino 258) \| \| \|---- f2l2_2 (ino 261) \| \| \|---- f1_2 (ino 258) \| \| \| \|---- f2 (ino 261) \| \|---- f1l2 (ino 258) \| \|---- d (ino 261) When computing the incremental send stream the following steps happen: 1) When processing inode 261, a rename operation is issued that renames inode 262, which currently as a path of "d", to an orphan name of "o262-7-0". This is done because in the send snapshot, inode 261 has of its hard links with a path of "d" as well. 2) Two link operations are issued that create the new hard links for inode 261, whose names are "d" and "f2l2_2", at paths "/" and "o262-7-0/" respectively. 3) Still while processing inode 261, unlink operations are issued to remove the old hard links of inode 261, with names "f2l1" and "f2l2", at paths "a/" and "d/". However path "d/" does not correspond anymore to the directory inode 262 but corresponds instead to a hard link of inode 261 (link command issued in the previous step). This makes the receiver fail with a ENOTDIR error when attempting the unlink operation. The problem happens because before sending the unlink operation, we failed to detect that inode 262 was one of ancestors for inode 261 in the parent snapshot, and therefore we didn't recompute the path for inode 262 before issuing the unlink operation for the link named "f2l2" of inode 262. The detection failed because the function "is_ancestor()" only follows the first hard link it finds for an inode instead of all of its hard links (as it was originally created for being used with directories only, for which only one hard link exists). So fix this by making "is_ancestor()" follow all hard links of the input inode. A test case for fstests follows soon. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2017-11-28 17:15:30 +01:00
Chao Yu	1a6152d36d	quota: propagate error from __dquot_initialize In commit `6184fc0b8d` ("quota: Propagate error from ->acquire_dquot()"), we have propagated error from __dquot_initialize to caller, but we forgot to handle such error in add_dquot_ref(), so, currently, during quota accounting information initialization flow, if we failed for some of inodes, we just ignore such error, and do account for others, which is not a good implementation. In this patch, we choose to let user be aware of such error, so after turning on quota successfully, we can make sure all inodes disk usage can be accounted, which will be more reasonable. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>	2017-11-28 16:08:08 +01:00
Qu Wenruo	69fc6cbbac	btrfs: tree-checker: Fix false panic for sanity test [BUG] If we run btrfs with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y, it will instantly cause kernel panic like: ------ ... assertion failed: 0, file: fs/btrfs/disk-io.c, line: 3853 ... Call Trace: btrfs_mark_buffer_dirty+0x187/0x1f0 [btrfs] setup_items_for_insert+0x385/0x650 [btrfs] __btrfs_drop_extents+0x129a/0x1870 [btrfs] ... ----- [Cause] Btrfs will call btrfs_check_leaf() in btrfs_mark_buffer_dirty() to check if the leaf is valid with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y. However quite some btrfs_mark_buffer_dirty() callers() don't really initialize its item data but only initialize its item pointers, leaving item data uninitialized. This makes tree-checker catch uninitialized data as error, causing such panic. : These callers include but not limited to setup_items_for_insert() btrfs_split_item() btrfs_expand_item() [Fix] Add a new parameter @check_item_data to btrfs_check_leaf(). With @check_item_data set to false, item data check will be skipped and fallback to old btrfs_check_leaf() behavior. So we can still get early warning if we screw up item pointers, and avoid false panic. Cc: Filipe Manana <fdmanana@gmail.com> Reported-by: Lakshmipathi.G <lakshmipathi.g@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2017-11-28 14:59:09 +01:00
Linus Torvalds	8f5abe842e	proc: don't report kernel addresses in /proc/<pid>/stack This just changes the file to report them as zero, although maybe even that could be removed. I checked, and at least procps doesn't actually seem to parse the 'stack' file at all. And since the file doesn't necessarily even exist (it requires CONFIG_STACKTRACE), possibly other tools don't really use it either. That said, in case somebody parses it with tools, just having that zero there should keep such tools happy. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-27 16:45:56 -08:00
Vasily Averin	81833de1a4	lockd: fix "list_add double add" caused by legacy signal interface restart_grace() uses hardcoded init_net. It can cause to "list_add double add" in following scenario: 1) nfsd and lockd was started in several net namespaces 2) nfsd in init_net was stopped (lockd was not stopped because it have users from another net namespaces) 3) lockd got signal, called restart_grace() -> set_grace_period() and enabled lock_manager in hardcoded init_net. 4) nfsd in init_net is started again, its lockd_up() calls set_grace_period() and tries to add lock_manager into init_net 2nd time. Jeff Layton suggest: "Make it safe to call locks_start_grace multiple times on the same lock_manager. If it's already on the global grace_list, then don't try to add it again. (But we don't intentionally add twice, so for now we WARN about that case.) With this change, we also need to ensure that the nfsd4 lock manager initializes the list before we call locks_start_grace. While we're at it, move the rest of the nfsd_net initialization into nfs4_state_create_net. I see no reason to have it spread over two functions like it is today." Suggested patch was updated to generate warning in described situation. Suggested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Vasily Averin	9e137ed5ab	nlm_shutdown_hosts_net() cleanup nlm_complain_hosts() walks through nlm_server_hosts hlist, which should be protected by nlm_host_mutex. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Vasily Averin	2317dc557a	race of nfsd inetaddr notifiers vs nn->nfsd_serv change nfsd_inet[6]addr_event uses nn->nfsd_serv without taking nfsd_mutex, which can be changed during execution of notifiers and crash the host. Moreover if notifiers were enabled in one net namespace they are enabled in all other net namespaces, from creation until destruction. This patch allows notifiers to access nn->nfsd_serv only after the pointer is correctly initialized and delays cleanup until notifiers are no longer in use. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Vasily Averin	6b18dd1c03	race of lockd inetaddr notifiers vs nlmsvc_rqst change lockd_inet[6]addr_event use nlmsvc_rqst without taken nlmsvc_mutex, nlmsvc_rqst can be changed during execution of notifiers and crash the host. Patch enables access to nlmsvc_rqst only when it was correctly initialized and delays its cleanup until notifiers are no longer in use. Note that nlmsvc_rqst can be temporally set to ERR_PTR, so the "if (nlmsvc_rqst)" check in notifiers is insufficient on its own. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Bhumika Goyal	ae2e408ec2	NFSD: make cache_detail structures const Make these const as they are only getting passed to the function cache_create_net having the argument as const. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Andrew Elble	ae254dac72	nfsd: check for use of the closed special stateid Prevent the use of the closed (invalid) special stateid by clients. Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Naofumi Honda	64ebe12494	nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat From kernel 4.9, my two nfsv4 servers sometimes suffer from "panic: unable to handle kernel page request" in posix_unblock_lock() called from nfs4_laundromat(). These panics diseappear if we revert the commit "nfsd: add a LRU list for blocked locks". The cause appears to be a typo in nfs4_laundromat(), which is also present in nfs4_state_shutdown_net(). Cc: stable@vger.kernel.org Fixes: `7919d0a27f` "nfsd: add a LRU list for blocked locks" Cc: jlayton@redhat.com Reveiwed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Vasily Averin	3a2b19d1ee	lockd: lost rollback of set_grace_period() in lockd_down_net() Commit `efda760fe9` ("lockd: fix lockd shutdown race") is incorrect, it removes lockd_manager and disarm grace_period_end for init_net only. If nfsd was started from another net namespace lockd_up_net() calls set_grace_period() that adds lockd_manager into per-netns list and queues grace_period_end delayed work. These action should be reverted in lockd_down_net(). Otherwise it can lead to double list_add on after restart nfsd in netns, and to use-after-free if non-disarmed delayed work will be executed after netns destroy. Fixes: `efda760fe9` ("lockd: fix lockd shutdown race") Cc: stable@vger.kernel.org Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Vasily Averin	a3152f1440	lockd: added cleanup checks in exit_net hook Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Vasily Averin	b872285751	grace: replace BUG_ON by WARN_ONCE in exit_net hook Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Andrew Elble	4f34bd0540	nfsd: fix locking validator warning on nfs4_ol_stateid->st_mutex class The use of the st_mutex has been confusing the validator. Use the proper nested notation so as to not produce warnings. Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Vasily Averin	e919b07652	lockd: remove net pointer from messages Publishing of net pointer is not safe, use net->ns.inum as net ID in debug messages [ 171.757678] lockd_up_net: per-net data created; net=f00001e7 [ 171.767188] NFSD: starting 90-second grace period (net f00001e7) [ 300.653313] lockd: nuking all hosts in net f00001e7... [ 300.653641] lockd: host garbage collection for net f00001e7 [ 300.653968] lockd: nlmsvc_mark_resources for net f00001e7 [ 300.711483] lockd_down_net: per-net data destroyed; net=f00001e7 [ 300.711847] lockd: nuking all hosts in net 0... [ 300.711847] lockd: host garbage collection for net 0 [ 300.711848] lockd: nlmsvc_mark_resources for net 0 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Vasily Averin	ba589528d6	nfsd: remove net pointer from debug messages Publishing of net pointer is not safe, replace it in debug meesages by net->ns.inum [ 119.989161] nfsd: initializing export module (net: f00001e7). [ 171.767188] NFSD: starting 90-second grace period (net f00001e7) [ 322.185240] nfsd: shutting down export module (net: f00001e7). [ 322.186062] nfsd: export shutdown complete (net: f00001e7). Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	03da3169c6	nfsd: Fix races with check_stateid_generation() The various functions that call check_stateid_generation() in order to compare a client-supplied stateid with the nfs4_stid state, usually need to atomically check for closed state. Those that perform the check after locking the st_mutex using nfsd4_lock_ol_stateid() should now be OK, but we do want to fix up the others. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	9271d7e509	nfsd: Ensure we check stateid validity in the seqid operation checks After taking the stateid st_mutex, we want to know that the stateid still represents valid state before performing any non-idempotent actions. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	beeca19cf1	nfsd: Fix race in lock stateid creation If we're looking up a new lock state, and the creation fails, then we want to unhash it, just like we do for OPEN. However in order to do so, we need to that no other LOCK requests can grab the mutex until we have unhashed it (and marked it as closed). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	fd1fd685b3	nfsd4: move find_lock_stateid Trivial cleanup to simplify following patch. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	659aefb68e	nfsd: Ensure we don't recognise lock stateids after freeing them In order to deal with lookup races, nfsd4_free_lock_stateid() needs to be able to signal to other stateful functions that the lock stateid is no longer valid. Right now, nfsd_lock() will check whether or not an existing stateid is still hashed, but only in the "new lock" path. To ensure the stateid invalidation is also recognised by the "existing lock" path, and also by a second call to nfsd4_free_lock_stateid() itself, we can change the type to NFS4_CLOSED_STID under the stp->st_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00

1 2 3 4 5 ...

51695 Commits