linux/fs/nfs
Jeff Layton d529ef83c3 NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping
There is a possible race in how the nfs_invalidate_mapping function is
handled.  Currently, we go and invalidate the pages in the file and then
clear NFS_INO_INVALID_DATA.

The problem is that it's possible for a stale page to creep into the
mapping after the page was invalidated (i.e., via readahead). If another
writer comes along and sets the flag after that happens but before
invalidate_inode_pages2 returns then we could clear the flag
without the cache having been properly invalidated.

So, we must clear the flag first and then invalidate the pages. Doing
this however, opens another race:

It's possible to have two concurrent read() calls that end up in
nfs_revalidate_mapping at the same time. The first one clears the
NFS_INO_INVALID_DATA flag and then goes to call nfs_invalidate_mapping.

Just before calling that though, the other task races in, checks the
flag and finds it cleared. At that point, it trusts that the mapping is
good and gets the lock on the page, allowing the read() to be satisfied
from the cache even though the data is no longer valid.

These effects are easily manifested by running diotest3 from the LTP
test suite on NFS. That program does a series of DIO writes and buffered
reads. The operations are serialized and page-aligned but the existing
code fails the test since it occasionally allows a read to come out of
the cache incorrectly. While mixing direct and buffered I/O isn't
recommended, I believe it's possible to hit this in other ways that just
use buffered I/O, though that situation is much harder to reproduce.

The problem is that the checking/clearing of that flag and the
invalidation of the mapping really need to be atomic. Fix this by
serializing concurrent invalidations with a bitlock.

At the same time, we also need to allow other places that check
NFS_INO_INVALID_DATA to check whether we might be in the middle of
invalidating the file, so fix up a couple of places that do that
to look for the new NFS_INO_INVALIDATING flag.

Doing this requires us to be careful not to set the bitlock
unnecessarily, so this code only does that if it believes it will
be doing an invalidation.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-27 15:35:56 -05:00
..
blocklayout nfs: fix do_div() warning by instead using sector_div() 2013-12-04 12:57:37 -05:00
objlayout NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount 2013-06-28 15:34:45 -04:00
cache_lib.c NFS: simplify and clean cache library 2013-02-15 10:43:36 -05:00
cache_lib.h NFS: simplify and clean cache library 2013-02-15 10:43:36 -05:00
callback_proc.c NFS: When displaying session slot numbers, use "%u" consistently 2013-09-03 15:26:30 -04:00
callback_xdr.c Merge branch 'labeled-nfs' into linux-next 2013-06-28 16:29:51 -04:00
callback.c nfs: Use PTR_ERR_OR_ZERO in 'nfs41_callback_up' function 2013-10-28 18:16:55 -04:00
callback.h NFS: Add in v4.2 callback operation 2013-06-08 16:20:18 -04:00
client.c NFS: cache parsed auth_info in nfs_server 2013-10-28 15:37:43 -04:00
delegation.c NFSv4: Add tracepoints for debugging delegations 2013-08-22 08:58:24 -04:00
delegation.h NFSv4: Fix CB_RECALL_ANY to only return delegations that are not in use 2013-04-05 17:03:57 -04:00
dir.c NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping 2014-01-27 15:35:56 -05:00
direct.c nfs: page cache invalidation for dio 2014-01-13 17:29:50 -05:00
dns_resolve.c NFS: Enabling v4.2 should not recompile nfsd and lockd 2013-11-19 16:20:40 -05:00
dns_resolve.h NFS: DNS resolver cache per network namespace context introduced 2012-01-31 18:20:26 -05:00
file.c NFS: dprintk() should not print negative fileids and inode numbers 2014-01-05 15:51:23 -05:00
fscache-index.c NFS: Use the inode->i_version to cache NFSv4 change attribute information 2011-10-18 09:14:34 -07:00
fscache.c NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open() 2013-09-27 18:40:25 +01:00
fscache.h NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open() 2013-09-27 18:40:25 +01:00
getroot.c NFS:Add labels to client function prototypes 2013-06-08 16:20:15 -04:00
idmap.c NFSv4: Convert idmapper to use the new framework for pipefs dentries 2013-09-01 11:12:42 -04:00
inode.c NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping 2014-01-27 15:35:56 -05:00
internal.h NFS: Enabling v4.2 should not recompile nfsd and lockd 2013-11-19 16:20:40 -05:00
iostat.h
Kconfig nfs: fix pnfs Kconfig defaults 2013-11-15 13:41:43 -05:00
Makefile NFS: Enable slot table helpers for NFSv4.0 2013-09-03 15:26:33 -04:00
mount_clnt.c nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it 2013-06-28 15:51:51 -04:00
namespace.c nfs: use %p[dD] instead of open-coded (and often racy) equivalents 2013-10-24 23:34:50 -04:00
netns.h nfs: include NFSv4 header in netns.h 2012-10-02 08:17:02 -07:00
nfs2super.c NFS: Convert v2 into a module 2012-07-30 19:06:41 -04:00
nfs2xdr.c nfs: Convert nfs2xdr to use kuids and kgids 2013-02-13 06:15:30 -08:00
nfs3acl.c NFS: dprintk() should not print negative fileids and inode numbers 2014-01-05 15:51:23 -05:00
nfs3client.c NFS: Only initialize the ACL client in the v3 case 2012-07-30 19:05:54 -04:00
nfs3proc.c nfs: use %p[dD] instead of open-coded (and often racy) equivalents 2013-10-24 23:34:50 -04:00
nfs3super.c NFS: Convert v3 into a module 2012-07-30 19:06:46 -04:00
nfs3xdr.c nfs: Convert nfs3xdr to use kuids and kgids 2013-02-13 06:15:31 -08:00
nfs4_fs.h NFS: Enabling v4.2 should not recompile nfsd and lockd 2013-11-19 16:20:40 -05:00
nfs4client.c nfs4: fix discover_server_trunking use after free 2014-01-20 16:08:06 -07:00
nfs4file.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
nfs4filelayout.c pnfs: fix BUG in filelayout_recover_commit_reqs 2014-01-21 14:12:18 -07:00
nfs4filelayout.h NFSv4.1: Use layout credentials for get_deviceinfo calls 2013-06-06 16:24:37 -04:00
nfs4filelayoutdev.c nfs: fix dead code of ipv6_addr_scope 2014-01-05 15:38:21 -05:00
nfs4getroot.c NFSv4: Fix security auto-negotiation 2013-09-07 16:18:30 -04:00
nfs4namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
nfs4proc.c nfs: handle servers that support only ALLOW ACE type. 2014-01-27 11:54:35 -05:00
nfs4renewd.c workqueue: use mod_delayed_work() instead of cancel + queue 2012-08-13 16:27:37 -07:00
nfs4session.c When CONFIG_NFS_V4_1 is not enabled, "make C=2" emits this warning: 2013-09-04 12:26:30 -04:00
nfs4session.h When CONFIG_NFS_V4_1 is not enabled, "make C=2" emits this warning: 2013-09-04 12:26:30 -04:00
nfs4state.c point to the right include file in a comment (left over from a9004abc3) 2014-01-05 15:51:35 -05:00
nfs4super.c NFSv4.1: Fix a race in nfs4_write_inode 2014-01-13 13:34:36 -05:00
nfs4sysctl.c nfs: include nfs4_fh.h in nfs4sysctl.c 2012-10-02 08:17:03 -07:00
nfs4trace.c NFSv4.1: Add tracepoints for debugging slot table operations 2013-08-22 08:58:27 -04:00
nfs4trace.h NFSv4.1: Add tracepoints for debugging test_stateid events 2013-08-22 08:58:27 -04:00
nfs4xdr.c NFSv4: OPEN must handle the NFS4ERR_IO return code correctly 2013-12-06 13:06:30 -05:00
nfs.h NFS: Convert v4 into a module 2012-07-30 19:06:52 -04:00
nfsroot.c SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG 2012-03-20 13:08:26 -04:00
nfstrace.c NFS: Add event tracing for generic NFS lookups 2013-08-22 08:58:18 -04:00
nfstrace.h NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping 2014-01-27 15:35:56 -05:00
pagelist.c NFS: Don't check lock owner compatability unless file is locked (part 2) 2013-09-06 11:27:41 -04:00
pnfs_dev.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
pnfs.c NFSv4.1: Fix a race in nfs4_write_inode 2014-01-13 13:34:36 -05:00
pnfs.h NFSv4.1: Don't trust attributes if a pNFS LAYOUTCOMMIT is outstanding 2014-01-13 12:08:11 -05:00
proc.c nfs: use %p[dD] instead of open-coded (and often racy) equivalents 2013-10-24 23:34:50 -04:00
read.c NFS: dprintk() should not print negative fileids and inode numbers 2014-01-05 15:51:23 -05:00
super.c NFS: correctly report misuse of "migration" mount option. 2013-11-15 13:41:43 -05:00
symlink.c
sysctl.c NFS: Initialize v4 sysctls from nfs_init_v4() 2012-07-17 13:33:18 -04:00
unlink.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
write.c NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping 2014-01-27 15:35:56 -05:00