linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-20 11:13:58 +08:00

Author	SHA1	Message	Date
Theodore Ts'o	cde2d7a796	ext4: flush the extent status cache during EXT4_IOC_SWAP_BOOT Previously we weren't swapping only some of the extent_status LRU fields during the processing of the EXT4_IOC_SWAP_BOOT ioctl. The much safer thing to do is to just completely flush the extent status tree when doing the swap. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Zheng Liu <gnehzuil.liu@gmail.com> Cc: stable@vger.kernel.org	2013-08-12 09:29:30 -04:00
Jaegeuk Kim	479bd73ac4	f2fs: should cover i_xattr_nid with its xattr node page lock Previously, f2fs_setxattr assigns i_xattr_nid in the inode page inconsistently. The scenario is: = Thread 1 = = Thread 2 = = fi->i_xattr_nid = = on-disk nid = f2fs_setxattr 0 0 new_node_page X 0 sync_inode_page X X checkpoint X X -. grab_cache_page X X \| --> allocate a new xattr node block or -ENOSPC <----------------' At this moment, the checkpoint stores inconsistent data where the inode has i_xattr_nid but actual xattr node block is not allocated yet. So, we should assign the real i_xattr_nid only after its xattr node block is allocated. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-12 16:04:53 +09:00
Jaegeuk Kim	9c02740c01	f2fs: check the free space first in new_node_page Let's check the free space in prior to the main process of allocating a new node page. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-12 16:00:46 +09:00
Gu Zheng	41dfde135f	f2fs: clean up the needless end 'return' of void function Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-12 11:49:22 +09:00
Linus Torvalds	d92581fcad	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "These are assorted fixes, mostly from Josef nailing down xfstests runs. Zach also has a long standing fix for problems with readdir wrapping f_pos (or ctx->pos) These patches were spread out over different bases, so I rebased things on top of rc4 and retested overnight" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: don't loop on large offsets in readdir Btrfs: check to see if root_list is empty before adding it to dead roots Btrfs: release both paths before logging dir/changed extents Btrfs: allow splitting of hole em's when dropping extent cache Btrfs: make sure the backref walker catches all refs to our extent Btrfs: fix backref walking when we hit a compressed extent Btrfs: do not offset physical if we're compressed Btrfs: fix extent buffer leak after backref walking Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified	2013-08-10 15:21:47 -07:00
Linus Torvalds	b8ea0d06ff	NFS client bugfixes for 3.11 - Stable patch for lockd to fix Oopses due to inappropriate calls to utsname()->nodename - Stable patches for sunrpc to fix Oopses on shutdown when using AF_LOCAL sockets with rpcbind - Fix memory leak and error checking issues in nfs4_proc_lookup_mountpoint - Fix a regression with the sync mount option failing to work for nfs4 mounts - Fix a writeback performance issue when doing cache invalidation - Remove an incorrect call to nfs_setsecurity in nfs_fhget -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJSBRqqAAoJEGcL54qWCgDyejoP/RNfkebEex2ugAAfymMOUNjA qc8X/YO8KDQD5T0X8P9MFXzgSSMPT5YfemxfKDpkjJlYwij8O9D8r1BAfgmhhKFI bVp9J8pH5Y3jXgLFVsubEc9N9CXLI0yRzcOFfzapYmGJV/ZO4cBAi8HMowFtbREj l2uk9H8XQ1p8KqVWXhFKoXVL2b9q0CGj5z3+RkE/rD5uuZyusbTB6OhNFArPnwXk aCx373mlvpIAhU6DkueCiXaHH06Mff2Vlu6eBkGrdfBGC6l+x1nwevxYH750fqSU 8O94rQkw9++qnIIvBJF/g1NzqyDhychJcXtgGLdxYUdWH3c8tevJZxCEj7U/dIJQ ndEaZGFxSFfdnxrwJBWtB+xsEfe9K4no9JwlkyVi8oZ5j2NUv7cJpA5cdQ4IUf/1 uqTlIxtPHQhHWCUUKpGLlhLiZyvwOPtJvuBl/Pc9UYQbyNtSqjYBk9mamrrIC6FK mF6jXgWe9x+miBqWYrEdPNLGdx/hUhhqGweYPJa6jTcxif+2l2xGfscWYI89Io/e qy8YNcHUrRci+o+YfY4lLhk88WBZogzFOYc4jDaLCEL1TE5B2k/jHr9v39V5S7Ks 63bhmfCTxB82uMzFeUWbiPzWQzt030pXPYy/PUPaucy/+QaZC/0lZ3HJKRKI37JJ ygSD7ndCZJCG+PxLkk49 =W12x -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Stable patch for lockd to fix Oopses due to inappropriate calls to utsname()->nodename - Stable patches for sunrpc to fix Oopses on shutdown when using AF_LOCAL sockets with rpcbind - Fix memory leak and error checking issues in nfs4_proc_lookup_mountpoint - Fix a regression with the sync mount option failing to work for nfs4 mounts - Fix a writeback performance issue when doing cache invalidation - Remove an incorrect call to nfs_setsecurity in nfs_fhget * tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix up nfs4_proc_lookup_mountpoint NFS: Remove unnecessary call to nfs_setsecurity in nfs_fhget() NFSv4: Fix the sync mount option for nfs4 mounts NFS: Fix writeback performance issue on cache invalidation SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister SUNRPC: Don't auto-disconnect from the local rpcbind socket LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs	2013-08-10 15:20:37 -07:00
Linus Torvalds	022e5d098b	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Some fixes for a 4.1 feature that in retrospect probably should have waited for 3.12.... But it appears to be working now" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID nfsd4: Fix MACH_CRED NULL dereference	2013-08-10 15:19:58 -07:00
Zach Brown	db62efbbf8	btrfs: don't loop on large offsets in readdir When btrfs readdir() hits the last entry it sets the readdir offset to a huge value to stop buggy apps from breaking when the same name is returned by readdir() with concurrent rename()s. But unconditionally setting the offset to INT_MAX causes readdir() to loop returning any entries with offsets past INT_MAX. It only takes a few hours of constant file creation and removal to create entries past INT_MAX. So let's set the huge offset to LLONG_MAX if the last entry has already overflowed 32bit loff_t. Without large offsets behaviour is identical. With large offsets 64bit apps will work and 32bit apps will be no more broken than they currently are if they see large offsets. Signed-off-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:34:56 -04:00
Josef Bacik	cfad392b22	Btrfs: check to see if root_list is empty before adding it to dead roots A user reported a panic when running with autodefrag and deleting snapshots. This is because we could end up trying to add the root to the dead roots list twice. To fix this check to see if we are empty before adding ourselves to the dead roots list. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:30:23 -04:00
Josef Bacik	f3b15ccdbb	Btrfs: release both paths before logging dir/changed extents The ceph guys tripped over this bug where we were still holding onto the original path that we used to copy the inode with when logging. This is based on Chris's fix which was reported to fix the problem. We need to drop the paths in two cases anyway so just move the drop up so that we don't have duplicate code. Thanks, Cc: stable@vger.kernel.org Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:30:16 -04:00
Josef Bacik	ee20a98314	Btrfs: allow splitting of hole em's when dropping extent cache I noticed while running multi-threaded fsync tests that sometimes fsck would complain about an improper gap. This happens because we fail to add a hole extent to the file, which was happening when we'd split a hole EM because btrfs_drop_extent_cache was just discarding the whole em instead of splitting it. So this patch fixes this by allowing us to split a hole em properly, which means that added holes actually get logged properly and we no longer see this fsck error. Thankfully we're tolerant of these sort of problems so a user would not see any adverse effects of this bug, other than fsck complaining. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:30:09 -04:00
Josef Bacik	ed8c4913da	Btrfs: make sure the backref walker catches all refs to our extent Because we don't mess with the offset into the extent for compressed we will properly find both extents for this case [extent a][extent b][rest of extent a] but because we already added a ref for the front half we won't add the inode information for the second half. This causes us to leak that memory and not print out the other offset when we do logical-resolve. So fix this by calling ulist_add_merge and then add our eie to the existing entry if there is one. With this patch we get both offsets out of logical-resolve. With this and the other 2 patches I've sent we now pass btrfs/276 on my vm with compress-force=lzo set. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:30:03 -04:00
Josef Bacik	8ca15e05e6	Btrfs: fix backref walking when we hit a compressed extent If you do btrfs inspect-internal logical-resolve on a compressed extent that has been partly overwritten it won't find anything. This is because we try and match the extent offset we've searched for based on the extent offset in the data extent entry. However this doesn't work for compressed extents because the offsets are for the uncompressed size, not the compressed size. So instead only do this check if we are not compressed, that way we can get an actual entry for the physical offset rather than nothing for compressed. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:29:56 -04:00
Josef Bacik	b76bb70136	Btrfs: do not offset physical if we're compressed xfstest btrfs/276 was freaking out on slower boxes partly because fiemap was offsetting the physical based on the extent offset. This is perfectly fine with uncompressed extents, however the extent offset is into the uncompressed area, not the compressed. So we can return a physical value that isn't at all within the area we have allocated on disk. Fix this by returning the start of the extent if it is compressed no matter what the offset. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:29:50 -04:00
Liu Bo	b5b9b5b318	Btrfs: fix extent buffer leak after backref walking commit 47fb091fb787420cd195e66f162737401cce023f(Btrfs: fix unlock after free on rewinded tree blocks) takes an extra increment on the reference of allocated dummy extent buffer, so now we cannot free this dummy one, and end up with extent buffer leak. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:29:42 -04:00
Liu Bo	e68afa49ae	Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents For partial extents, snapshot-aware defrag does not work as expected, since a) we use the wrong logical offset to search for parents, which should be disk_bytenr + extent_offset, not just disk_bytenr, b) 'offset' returned by the backref walking just refers to key.offset, not the 'offset' stored in btrfs_extent_data_ref which is (key.offset - extent_offset). The reproducer: $ mkfs.btrfs sda $ mount sda /mnt $ btrfs sub create /mnt/sub $ for i in `seq 5 -1 1`; do dd if=/dev/zero of=/mnt/sub/foo bs=5k count=1 seek=$i conv=notrunc oflag=sync; done $ btrfs sub snap /mnt/sub /mnt/snap1 $ btrfs sub snap /mnt/sub /mnt/snap2 $ sync; btrfs filesystem defrag /mnt/sub/foo; $ umount /mnt $ btrfs-debug-tree sda (Here we can check whether the defrag operation is snapshot-awared. This addresses the above two problems. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:29:17 -04:00
Jie Liu	7cddc19392	btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified Create a small file and fallocate it to a big size with FALLOC_FL_KEEP_SIZE option, then truncate it back to the small size again, the disk free space is not changed back in this case. i.e, total 4 -rw-r--r-- 1 root root 512 Jun 28 11:35 test Filesystem Size Used Avail Use% Mounted on .... /dev/sdb1 8.0G 56K 7.2G 1% /mnt -rw-r--r-- 1 root root 512 Jun 28 11:35 /mnt/test Filesystem Size Used Avail Use% Mounted on .... /dev/sdb1 8.0G 5.1G 2.2G 70% /mnt Filesystem Size Used Avail Use% Mounted on .... /dev/sdb1 8.0G 5.1G 2.2G 70% /mnt With this fix, the truncated up space is back as: Filesystem Size Used Avail Use% Mounted on .... /dev/sdb1 8.0G 56K 7.2G 1% /mnt Signed-off-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-08-09 19:28:56 -04:00
Oleg Nesterov	201d3dfa4d	dlm: kill the unnecessary and wrong device_close()->recalc_sigpending() device_close()->recalc_sigpending() is not needed, sigprocmask() takes care of TIF_SIGPENDING correctly. And without ->siglock it is racy and wrong, it can wrongly clear TIF_SIGPENDING and miss a signal. But even with this patch device_close() is still buggy: 1. sigprocmask() should not be used, we have set_task_blocked(), but this is minor. 2. We should never block SIGKILL or SIGSTOP, and this is what the code tries to do. 3. This can't protect against SIGKILL or SIGSTOP anyway. Another thread can do signal_wake_up(), say, do_signal_stop() or complete_signal() or debugger. 4. sigprocmask(SIG_BLOCK, allsigs) doesn't necessarily clears TIF_SIGPENDING, say, freezing() or ->jobctl. 5. device_write() looks equally wrong by the same reason. Looks like, this tries to protect some wait_event_interruptible() logic from signals, it should be turned into uninterruptible wait. Or we need to implement something like signals_stop/start for such a use-case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-08-09 10:48:20 -07:00
Jan Kara	97a2847d06	Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeffm/linux-reiserfs into for_next_testing Reiserfs locking fixes.	2013-08-09 10:49:58 +02:00
Paul Gortmaker	e99a03c6f5	jbd: use a single printk for jbd_debug() Backport of jbd2 commit `169f1a2a87` ("jbd2: use a single printk for jbd_debug()") Since the jbd_debug() is implemented with two separate printk() calls, it can lead to corrupted and misleading debug output like the following (see lines marked with ""): [ 290.339362] (fs/jbd2/journal.c, 203): kjournald2: kjournald2 wakes [ 290.339365] (fs/jbd2/journal.c, 155): kjournald2: commit_sequence=42103, commit_request=42104 [ 290.339369] (fs/jbd2/journal.c, 158): kjournald2: OK, requests differ [ 290.339376] (fs/jbd2/journal.c, 648): jbd2_log_wait_commit: [* 290.339379] (fs/jbd2/commit.c, 370): jbd2_journal_commit_transaction: JBD2: want 42104, j_commit_sequence=42103 [* 290.339382] JBD2: starting commit of transaction 42104 [ 290.339410] (fs/jbd2/revoke.c, 566): jbd2_journal_write_revoke_records: Wrote 0 revoke records [ 290.376555] (fs/jbd2/commit.c, 1088): jbd2_journal_commit_transaction: JBD2: commit 42104 complete, head 42079 i.e. the debug output from log_wait_commit and journal_commit_transaction have become interleaved. The output should have been: (fs/jbd2/journal.c, 648): jbd2_log_wait_commit: JBD2: want 42104, j_commit_sequence=42103 (fs/jbd2/commit.c, 370): jbd2_journal_commit_transaction: JBD2: starting commit of transaction 42104 It is expected that this is not easy to replicate -- I was only able to cause it on preempt-rt kernels, and even then only under heavy I/O load. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jan Kara <jack@suse.cz>	2013-08-09 10:49:00 +02:00
Jaegeuk Kim	d71b5564c0	f2fs: introduce cur_cp_version function to reduce code size This patch introduces a new inline function, cur_cp_version, to reduce redundant codes. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-09 15:25:37 +09:00
Jaegeuk Kim	e518ff81c3	f2fs: fix inconsistency between xattr node blocks and its inode Previously xattr node blocks are stored to the COLD_NODE log, which means that our roll-forward mechanism doesn't recover the xattr node blocks at all. Only the direct node blocks in the WARM_NODE log can be recovered. So, let's resolve the issue simply by conducting checkpoint during fsync when a file has a modified xattr node block. This approach is able to degrade the performance, but normally the checkpoint overhead is shown at the initial fsync call after the xattr entry changes. Once the checkpoint is done, no additional overhead would be occurred. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-09 15:25:24 +09:00
Jaegeuk Kim	dbe6a5ff4f	f2fs: fix the use of XATTR_NODE_OFFSET This patch fixes the use of XATTR_NODE_OFFSET. o The offset should not use several MSB bits which are used by marking node blocks. o IS_DNODE should handle XATTR_NODE_OFFSET to avoid potential abnormality during the fsync call. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-09 14:57:56 +09:00
Piotr Sarna	6ae6514b33	ext4: fix mount/remount error messages for incompatible mount options Commit `5688978` ("ext4: improve handling of conflicting mount options") introduced incorrect messages shown while choosing wrong mount options. First of all, both cases of incorrect mount options, "data=journal,delalloc" and "data=journal,dioread_nolock" result in the same error message. Secondly, the problem above isn't solved for remount option: the mismatched parameter is simply ignored. Moreover, ext4_msg states that remount with options "data=journal,delalloc" succeeded, which is not true. To fix it up, I added a simple check after parse_options() call to ensure that data=journal and delalloc/dioread_nolock parameters are not present at the same time. Signed-off-by: Piotr Sarna <p.sarna@partner.samsung.com> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-08-08 23:02:24 -04:00
Theodore Ts'o	59d9fa5c2e	ext4: allow the mount options nodelalloc and data=journal Commit `26092bf` ("ext4: use a table-driven handler for mount options") wrongly disallows the specifying the mount options nodelalloc and data=journal simultaneously. This is incorrect; it should have only disallowed the combination of delalloc and data=journal simultaneously. Reported-by: Piotr Sarna <p.sarna@partner.samsung.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-08-08 23:01:24 -04:00
Tejun Heo	8af01f56a0	cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/ The names of the two struct cgroup_subsys_state accessors - cgroup_subsys_state() and task_subsys_state() - are somewhat awkward. The former clashes with the type name and the latter doesn't even indicate it's somehow related to cgroup. We're about to revamp large portion of cgroup API, so, let's rename them so that they're less awkward. Most per-controller usages of the accessors are localized in accessor wrappers and given the amount of scheduled changes, this isn't gonna add any noticeable headache. Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state() to task_css(). This patch is pure rename. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>	2013-08-08 20:11:22 -04:00
Jeff Mahoney	d2d0395fd1	reiserfs: locking, release lock around quota operations Previous commits released the write lock across quota operations but missed several places. In particular, the free operations can also call into the file system code and take the write lock, causing deadlocks. This patch introduces some more helpers and uses them for quota call sites. Without this patch applied, reiserfs + quotas runs into deadlocks under anything more than trivial load. Signed-off-by: Jeff Mahoney <jeffm@suse.com>	2013-08-08 17:34:47 -04:00
Jeff Mahoney	278f6679f4	reiserfs: locking, handle nested locks properly The reiserfs write lock replaced the BKL and uses similar semantics. Frederic's locking code makes a distinction between when the lock is nested and when it's being acquired/released, but I don't think that's the right distinction to make. The right distinction is between the lock being released at end-of-use and the lock being released for a schedule. The unlock should return the depth and the lock should restore it, rather than the other way around as it is now. This patch implements that and adds a number of places where the lock should be dropped. Signed-off-by: Jeff Mahoney <jeffm@suse.com>	2013-08-08 17:34:46 -04:00
Jeff Mahoney	4c05141df5	reiserfs: locking, push write lock out of xattr code The reiserfs xattr code doesn't need the write lock and sleeps all over the place. We can simplify the locking by releasing it and reacquiring after the xattr call. Signed-off-by: Jeff Mahoney <jeffm@suse.com>	2013-08-08 17:27:19 -04:00
Linus Torvalds	55f5bfd4c9	ext4 bugfixes for v3.11-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJR+YUoAAoJENNvdpvBGATwxsgQAJzxe8x1SWW8jrvasMELW3bS I7gl3lu789GROmZo5MI6+XI1DSCjHcMTisoPnPGx8q9LNhiYQGjp9Loxgl+L+Cem wbDfkSHrhxtDoEw2rdpPb/hFHOpsv8t6TT53GVbdc5l/lxwyvdrblXxhWbLuTRCQ G0X2VoxvTph7SyVPpRqoU4U99vW3g6LRI0dsxyU8kLc/kp7/t89gwh6Gj0wlb6tb 4pswvEuxK4HauMPjajCIaEeGpQoHbvCVpB4P9HvEtL/wklc5hNIAWy5+KYjMFWil 8Vc6RsJv6DZc66z+X4PT1oYrx6bkhNKNRGBETV2/dzeLU/7o2ttUIQ0vlOlN9gKV xSne19OKwqhhRMuWrc3LDzunlFiLex2zHbZJVDOeKNXmEJVpO9ZPVkvpoi/g0vOE DPwcc89J+763CVmXu8uyPqgQRfJTp2dXNggP3lINCDi1mFayrBqR0S6rQ1B2m/6R 8HYDjJ6gBpXBcs0FTGh1cfbpI8GpPx9Xu3Q/FtjvczEXUO3831KCm+k7O3cUjB72 I7I3wmyYQvUanOc39fg+EZ0cXpiSIZG4Mjv+8vWgADFlCMUKMaC+vp2PuKmBxTuk 7dAkxKr2kXfMirGWF1R/vo/CaKnCWmJbAJDFJ8Mt/Nk3zn3NOl/kCRtxndlj7k+m /s8DHlVLyD8dgvzRO9gp =xqB3 -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o. Misc ext4 fixes, delayed by Ted moving mail servers and email getting marked as spam due to bad spf records. * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: add WARN_ON to check the length of allocated blocks ext4: fix retry handling in ext4_ext_truncate() ext4: destroy ext4_es_cachep on module unload ext4: make sure group number is bumped after a inode allocation race	2013-08-08 09:38:19 -07:00
Jaegeuk Kim	c2d715d144	f2fs: fix a build failure due to missing the kobject header This patch should resolve the following error reported by kbuild test robot. All error/warnings: In file included from fs/f2fs/dir.c:13:0: >> fs/f2fs/f2fs.h:435:17: error: field 's_kobj' has incomplete type struct kobject s_kobj; The failure was caused by missing the kobject header file in dir.c. So, this patch move the header file to the right location, f2fs.h. CC: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-08 15:56:49 +09:00
Trond Myklebust	b72888cb0b	NFSv4: Fix up nfs4_proc_lookup_mountpoint Currently, we do not check the return value of client = rpc_clone_client(), nor do we shut down the resulting cloned rpc_clnt in the case where a NFS4ERR_WRONGSEC has caused nfs4_proc_lookup_common() to replace the original value of 'client' (causing a memory leak). Fix both issues and simplify the code by moving the call to rpc_clone_client() until after nfs4_proc_lookup_common() has done its business. Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-08-07 20:47:26 -04:00
Trond Myklebust	eddffa4084	NFS: Remove unnecessary call to nfs_setsecurity in nfs_fhget() We only need to call it on the creation of the inode. Reported-by: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Steve Dickson <SteveD@redhat.com> Cc: Dave Quigley <dpquigl@davequigley.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-08-07 17:07:41 -04:00
Scott Mayhew	e890db0104	NFSv4: Fix the sync mount option for nfs4 mounts The sync mount option stopped working for NFSv4 mounts after commit `c02d7adf8c` (NFSv4: Replace nfs4_path_walk() with FS path lookup in a private namespace). If MS_SYNCHRONOUS is set in the super_block that we're cloning from, then it should be set in the new super_block as well. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-08-07 17:07:41 -04:00
Trond Myklebust	f8806c843f	NFS: Fix writeback performance issue on cache invalidation If a cache invalidation is triggered, and we happen to have a lot of writebacks cached at the time, then the call to invalidate_inode_pages2() will end up calling ->launder_page() on each and every dirty page in order to sync its contents to disk, thus defeating write coalescing. The following patch ensures that we try to sync the inode to disk before calling invalidate_inode_pages2() so that we do the writeback as efficiently as possible. Reported-by: William Dauchy <william@gandi.net> Reported-by: Pascal Bouchareine <pascal@gandi.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: William Dauchy <william@gandi.net> Reviewed-by: Jeff Layton <jlayton@redhat.com>	2013-08-07 17:07:40 -04:00
Linus Torvalds	b7bc9e7d80	Oleg Nesterov has been working hard in closing all the holes that can lead to race conditions between deleting an event and accessing an event debugfs file. This included a fix to the debugfs system (acked by Greg Kroah-Hartman). We think that all the holes have been patched and hopefully we don't find more. I haven't marked all of them for stable because I need to examine them more to figure out how far back some of the changes need to go. Along the way, some other fixes have been made. Alexander Z Lam fixed some logic where the wrong buffer was being modifed. Andrew Vagin found a possible corruption for machines that actually allocate cpumask, as a reference to one was being zeroed out by mistake. Dhaval Giani found a bad prototype when tracing is not configured. And I not only had some changes to help Oleg, but also finally fixed a long standing bug that Dave Jones and others have been hitting, where a module unload and reload can cause the function tracing accounting to get screwed up. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJR/7FvAAoJEOdOSU1xswtMCAgH/RpxS5mQcQnhrGfu1qnSk+3v CyL1X9IkvgP5f+N59tbU3ZuE8Q3YQvTII35na38fgbHbwWrhYjLSvlf3Pwtre8zr Rjm9b8zA6iSobHu3DyuuupzMsLE+SpBakVSmG6mi8izZyuCi0YHMZMnHTGNnW9Vv YFqkfGgQQWMqiUHLcueetOqWAjR2JiJaLfr47rsRLNflF6dB2MbG/7YbYJ0TBk7E hemVmR7JH7606eJZ6nR7bh/hYv0sL2SSqVVaOHO5nWFLFbI8CybpNVGcoD+nihZY M7LrZT1C1mn3nKrfDhyXy4dwwx6wBON0fY23eJTHMVNnc0tvY1xqH52vTdPILWg= =UfsE -----END PGP SIGNATURE----- Merge tag 'trace-fixes-3.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Oleg Nesterov has been working hard in closing all the holes that can lead to race conditions between deleting an event and accessing an event debugfs file. This included a fix to the debugfs system (acked by Greg Kroah-Hartman). We think that all the holes have been patched and hopefully we don't find more. I haven't marked all of them for stable because I need to examine them more to figure out how far back some of the changes need to go. Along the way, some other fixes have been made. Alexander Z Lam fixed some logic where the wrong buffer was being modifed. Andrew Vagin found a possible corruption for machines that actually allocate cpumask, as a reference to one was being zeroed out by mistake. Dhaval Giani found a bad prototype when tracing is not configured. And I not only had some changes to help Oleg, but also finally fixed a long standing bug that Dave Jones and others have been hitting, where a module unload and reload can cause the function tracing accounting to get screwed up" * tag 'trace-fixes-3.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix reset of time stamps during trace_clock changes tracing: Make TRACE_ITER_STOP_ON_FREE stop the correct buffer tracing: Fix trace_dump_stack() proto when CONFIG_TRACING is not set tracing: Fix fields of struct trace_iterator that are zeroed by mistake tracing/uprobes: Fail to unregister if probe event files are in use tracing/kprobes: Fail to unregister if probe event files are in use tracing: Add comment to describe special break case in probe_remove_event_call() tracing: trace_remove_event_call() should fail if call/file is in use debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs) ftrace: Check module functions being traced on reload ftrace: Consolidate some duplicate code for updating ftrace ops tracing: Change remove_event_file_dir() to clear "d_subdirs"->i_private tracing: Introduce remove_event_file_dir() tracing: Change f_start() to take event_mutex and verify i_private != NULL tracing: Change event_filter_read/write to verify i_private != NULL tracing: Change event_enable/disable_read() to verify i_private != NULL tracing: Turn event/id->i_private into call->event.type	2013-08-07 13:01:30 -07:00
Weston Andros Adamson	58cd57bfd9	nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID - don't BUG_ON() when not SP4_NONE - calculate recv and send reserve sizes correctly Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-07 12:06:07 -04:00
J. Bruce Fields	c47205914c	nfsd4: Fix MACH_CRED NULL dereference Fixes a NULL-dereference on attempts to use MACH_CRED protection over auth_sys. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-07 12:05:51 -04:00
Jeff Layton	757c4f6260	cifs: don't instantiate new dentries in readdir for inodes that need to be revalidated immediately David reported that commit `c2b93e06` (cifs: only set ops for inodes in I_NEW state) caused a regression with mfsymlinks. Prior to that patch, if a mfsymlink dentry was instantiated at readdir time, the inode would get a new set of ops when it was revalidated. After that patch, this did not occur. This patch addresses this by simply skipping instantiating dentries in the readdir codepath when we know that they will need to be immediately revalidated. The next attempt to use that dentry will cause a new lookup to occur (which is basically what we want to happen anyway). Cc: <stable@vger.kernel.org> Cc: "Stefan (metze) Metzmacher" <metze@samba.org> Cc: Sachin Prabhu <sprabhu@redhat.com> Reported-and-Tested-by: David McBride <dwm37@cam.ac.uk> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2013-08-07 10:57:06 -05:00
Jin Xu	a569469e96	f2fs: fix a deadlock in fsync This patch fixes a deadlock bug that occurs quite often when there are concurrent write and fsync on a same file. Following is the simplified call trace when tasks get hung. fsync thread: - f2fs_sync_file ... - f2fs_write_data_pages ... - update_extent_cache ... - update_inode - wait_on_page_writeback bdi writeback thread - __writeback_single_inode - f2fs_write_data_pages - mutex_lock(sbi->writepages) The deadlock happens when the fsync thread waits on a inode page that has been added to the f2fs' cached bio sbi->bio[NODE], and unfortunately, no one else could be able to submit the cached bio to block layer for writeback. This is because the fsync thread already hold a sbi->fs_lock and the sbi->writepages lock, causing the bdi thread being blocked when attempt to write data pages for the same inode. At the same time, f2fs_gc thread does not notice the situation and could not help. Even the sync syscall gets blocked. To fix it, we could submit the cached bio first before waiting on a inode page that is being written back. Signed-off-by: Jin Xu <jinuxstyle@gmail.com> [Jaegeuk Kim: add more cases to use f2fs_wait_on_page_writeback] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-06 22:00:36 +09:00
Namjae Jeon	df273efc36	f2fs: remove redundant code from f2fs_write_begin This code is being used for nobh_write_end() function. But since now f2fs_write_end function is added so there is no need for this code. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-06 22:00:35 +09:00
Namjae Jeon	d2dc095f42	f2fs: add sysfs entries to select the gc policy Add sysfs entry gc_idle to control the gc policy. Where gc_idle = 1 corresponds to selecting a cost benefit approach, while gc_idle = 2 corresponds to selecting a greedy approach to garbage collection. The selection is mutually exclusive one approach will work at any point. If gc_idle = 0, then this option is disabled. Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: change the select_gc_type() flow slightly] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-06 22:00:18 +09:00
Namjae Jeon	b59d0bae6c	f2fs: add sysfs support for controlling the gc_thread Add sysfs entries to control the timing parameters for f2fs gc thread. Various Sysfs options introduced are: gc_min_sleep_time: Min Sleep time for GC in ms gc_max_sleep_time: Max Sleep time for GC in ms gc_no_gc_sleep_time: Default Sleep time for GC in ms Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: fix an umount bug and some minor changes] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-06 21:53:34 +09:00
Trond Myklebust	9a1b6bf818	LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in which case we're in entirely the wrong namespace. Secondly, commit `8aac62706a` (move exit_task_namespaces() outside of exit_notify()) now means that exit_task_work() is called after exit_task_namespaces(), which triggers an Oops when we're freeing up the locks. Fix this by ensuring that we initialise the nlm_host's rpc_client at mount time, so that the cl_nodename field is initialised to the value of utsname()->nodename that the net namespace uses. Then replace the lockd callers of utsname()->nodename. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Toralf Förster <toralf.foerster@gmx.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Nix <nix@esperi.org.uk> Cc: Jeff Layton <jlayton@redhat.com> Cc: stable@vger.kernel.org # 3.10.x	2013-08-05 15:03:46 -04:00
Zheng Liu	3d62c45b38	vfs: add missing check for __O_TMPFILE in fcntl_init() As comment in include/uapi/asm-generic/fcntl.h described, when introducing new O_* bits, we need to check its uniqueness in fcntl_init(). But __O_TMPFILE bit is missing. So fix it. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-08-05 18:25:32 +04:00
Andy Lutomirski	bb2314b479	fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink Every now and then someone proposes a new flink syscall, and this spawns a long discussion of whether it would be a security problem. I think that this is missing the point: flink is already allowed without privilege as long as /proc is mounted -- it's called AT_SYMLINK_FOLLOW. Now that O_TMPFILE is here, the ability to create a file with O_TMPFILE, write it, and link it in is very convenient. The only problem is that it requires that /proc be mounted so that you can do: linkat(AT_FDCWD, "/proc/self/fd/<tmpfd>", dfd, path, AT_SYMLINK_NOFOLLOW) This sucks -- it's much nicer to do: linkat(tmpfd, "", dfd, path, AT_EMPTY_PATH) Let's allow it. If this turns out to be excessively scary, it we could instead require that the inode in question be I_LINKABLE, but this seems pointless given the /proc situation Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-08-05 18:24:11 +04:00
Andy Lutomirski	e305f48bc4	fs: Fix file mode for O_TMPFILE O_TMPFILE, like O_CREAT, should respect the requested mode and should create regular files. This fixes two bugs: O_TMPFILE required privilege (because the mode ended up as 000) and it produced bogus inodes with no type. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-08-05 18:24:10 +04:00
Al Viro	672fe15d09	reiserfs: fix deadlock in umount Since remove_proc_entry() started to wait for IO in progress (i.e. since 2007 or so), the locking in fs/reiserfs/proc.c became wrong; if procfs read happens between the moment when umount() locks the victim superblock and removal of /proc/fs/reiserfs/<device>/*, we'll get a deadlock - read will wait for s_umount (in sget(), called by r_start()), while umount will wait in remove_proc_entry() for that read to finish, holding s_umount all along. Fortunately, the same change allows a much simpler race avoidance - all we need to do is remove the procfs entries in the very beginning of reiserfs ->kill_sb(); that'll guarantee that pointer to superblock will remain valid for the duration for procfs IO, so we don't need sget() to keep the sucker alive. As the matter of fact, we can get rid of the home-grown iterator completely, and use single_open() instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-08-05 17:37:37 +04:00
Paul Gortmaker	a91de85247	jbd: relocate assert after state lock in journal_commit_transaction() The state lock is taken after we are doing an assert on the state value, not before. So we might in fact be doing an assert on a transient value. Ensure the state check is within the scope of the state lock being taken. Backport of jbd2 commit `3ca841c106` ("jbd2: relocate assert after state lock in journal_commit_transaction()") Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jan Kara <jack@suse.cz>	2013-08-01 23:25:38 +02:00
Gu Zheng	62c610460d	ocfs2/refcounttree: add the missing NULL check of the return value of find_or_create_page() Add the missing NULL check of the return value of find_or_create_page() in function ocfs2_duplicate_clusters_by_page(). [akpm@linux-foundation.org: fix layout, per Joel] Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Acked-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-31 14:41:02 -07:00
Jan Kara	e729eac6f6	udf: Refuse RW mount of the filesystem instead of making it RO Refuse RW mount of udf filesystem. So far we just silently changed it to RO mount but when the media is writeable, block layer won't notice this change and thus will think device is used RW and will block eject button of the drive. That is unexpected by users because for non-writeable media eject button works just fine. Userspace mount(8) command handles this just fine and retries mounting with MS_RDONLY set so userspace shouldn't see any regression. Plus any tool mounting udf is likely confronted with the case of read-only media where block layer already refuses to mount the filesystem without MS_RDONLY set so our behavior shouldn't be anything new for it. Reported-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Jan Kara <jack@suse.cz>	2013-07-31 22:14:51 +02:00
Jan Kara	d759bfa4e7	udf: Standardize return values in mount sequence Change all function used in filesystem discovery during mount to user standard kernel return values - -errno on error, 0 on success instead of 1 on failure and 0 on success. This allows us to pass error number (not just failure / success) so we can abort device scanning earlier in case of errors like EIO or ENOMEM . Also we will be able to return EROFS in case writeable mount is requested but writing isn't supported. Signed-off-by: Jan Kara <jack@suse.cz>	2013-07-31 22:14:50 +02:00
Jan Kara	17b7f7cf58	isofs: Refuse RW mount of the filesystem instead of making it RO Refuse RW mount of isofs filesystem. So far we just silently changed it to RO mount but when the media is writeable, block layer won't notice this change and thus will think device is used RW and will block eject button of the drive. That is unexpected by users because for non-writeable media eject button works just fine. Userspace mount(8) command handles this just fine and retries mounting with MS_RDONLY set so userspace shouldn't see any regression. Plus any tool mounting isofs is likely confronted with the case of read-only media where block layer already refuses to mount the filesystem without MS_RDONLY set so our behavior shouldn't be anything new for it. Reported-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Jan Kara <jack@suse.cz>	2013-07-31 22:14:50 +02:00
Eric Sandeen	cf7eff4666	ext3: allow specifying external journal by pathname mount option It's always been a hassle that if an external journal's device number changes, the filesystem won't mount. And since boot-time enumeration can change, device number changes aren't unusual. The current mechanism to update the journal location is by passing in a mount option w/ a new devnum, but that's a hassle; it's a manual approach, fixing things after the fact. Adding a mount option, "-o journal_path=/dev/$DEVICE" would help, since then we can do i.e. # mount -o journal_path=/dev/disk/by-label/$JOURNAL_LABEL ... and it'll mount even if the devnum has changed, as shown here: # losetup /dev/loop0 journalfile # mke2fs -L mylabel-journal -O journal_dev /dev/loop0 # mkfs.ext3 -L mylabel -J device=/dev/loop0 /dev/sdb1 Change the journal device number: # losetup -d /dev/loop0 # losetup /dev/loop1 journalfile And today it will fail: # mount /dev/sdb1 /mnt/test mount: wrong fs type, bad option, bad superblock on /dev/sdb1, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg \| tail or so # dmesg \| tail -n 1 [17343.240702] EXT3-fs (sdb1): error: couldn't read superblock of external journal But with this new mount option, we can specify the new path: # mount -o journal_path=/dev/loop1 /dev/sdb1 /mnt/test # (which does update the encoded device number, incidentally): # umount /dev/sdb1 # dumpe2fs -h /dev/sdb1 \| grep "Journal device" dumpe2fs 1.41.12 (17-May-2010) Journal device: 0x0701 But best of all we can just always mount by journal-path, and it'll always work: # mount -o journal_path=/dev/disk/by-label/mylabel-journal /dev/sdb1 /mnt/test # So the journal_path option can be specified in fstab, and as long as the disk is available somewhere, and findable by label (or by UUID), we can mount. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>	2013-07-31 22:11:15 +02:00
Jeff Layton	66ffd113f5	cifs: set sb->s_d_op before calling d_make_root() Currently, the s_root dentry doesn't get its d_op pointer set to anything. This breaks lookups in the root of case-insensitive mounts since that relies on having d_hash and d_compare routines that know to treat the filename as case-insensitive. cifs.ko has been broken this way for a long time, but commit `1c929cfe6` ("switch cifs"), added a cryptic comment which is removed in the patch below, which makes me wonder if this was done deliberately for some reason. It's not clear to me why we'd want the s_root not to have d_op set properly. It may have something to do with d_automount or d_revalidate on the root, but my suspicion in looking over the code is that Al was just trying to preserve the existing behavior when changing this code over to use s_d_op. This patch changes it so that we set s_d_op before calling d_make_root and removes the comment. I tested mounting, accessing and unmounting several types of shares (including DFS referrals) and everything still seemed to work OK afterward. I could be missing something however, so please do let me know if I am. Reported-by: Jan-Marek Glogowski <glogow@fbihome.de> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2013-07-31 13:45:02 -05:00
Jeff Layton	ba48202932	cifs: fix bad error handling in crypto code Jarod reported an Oops like when testing with fips=1: CIFS VFS: could not allocate crypto hmacmd5 CIFS VFS: could not crypto alloc hmacmd5 rc -2 CIFS VFS: Error -2 during NTLMSSP authentication CIFS VFS: Send error in SessSetup = -2 BUG: unable to handle kernel NULL pointer dereference at 000000000000004e IP: [<ffffffff812b5c7a>] crypto_destroy_tfm+0x1a/0x90 PGD 0 Oops: 0000 [#1] SMP Modules linked in: md4 nls_utf8 cifs dns_resolver fscache kvm serio_raw virtio_balloon virtio_net mperf i2c_piix4 cirrus drm_kms_helper ttm drm i2c_core virtio_blk ata_generic pata_acpi CPU: 1 PID: 639 Comm: mount.cifs Not tainted 3.11.0-0.rc3.git0.1.fc20.x86_64 #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88007bf496e0 ti: ffff88007b080000 task.ti: ffff88007b080000 RIP: 0010:[<ffffffff812b5c7a>] [<ffffffff812b5c7a>] crypto_destroy_tfm+0x1a/0x90 RSP: 0018:ffff88007b081d10 EFLAGS: 00010282 RAX: 0000000000001f1f RBX: ffff880037422000 RCX: ffff88007b081fd8 RDX: 000000000000001f RSI: 0000000000000006 RDI: fffffffffffffffe RBP: ffff88007b081d30 R08: ffff880037422000 R09: ffff88007c090100 R10: 0000000000000000 R11: 00000000fffffffe R12: fffffffffffffffe R13: ffff880037422000 R14: ffff880037422000 R15: 00000000fffffffe FS: 00007fc322f4f780(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000000004e CR3: 000000007bdaa000 CR4: 00000000000006e0 Stack: ffffffff81085845 ffff880037422000 ffff8800375e7400 ffff880037422000 ffff88007b081d48 ffffffffa0176022 ffff880037422000 ffff88007b081d60 ffffffffa015c07b ffff880037600600 ffff88007b081dc8 ffffffffa01610e1 Call Trace: [<ffffffff81085845>] ? __cancel_work_timer+0x75/0xf0 [<ffffffffa0176022>] cifs_crypto_shash_release+0x82/0xf0 [cifs] [<ffffffffa015c07b>] cifs_put_tcp_session+0x8b/0xe0 [cifs] [<ffffffffa01610e1>] cifs_mount+0x9d1/0xad0 [cifs] [<ffffffffa014ff50>] cifs_do_mount+0xa0/0x4d0 [cifs] [<ffffffff811ab6e9>] mount_fs+0x39/0x1b0 [<ffffffff811c466f>] vfs_kern_mount+0x5f/0xf0 [<ffffffff811c6a9e>] do_mount+0x23e/0xa20 [<ffffffff811c66e6>] ? copy_mount_options+0x36/0x170 [<ffffffff811c7303>] SyS_mount+0x83/0xc0 [<ffffffff8165c8d9>] system_call_fastpath+0x16/0x1b Code: eb 9e 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 49 89 fc 53 48 83 ec 08 48 85 ff 74 46 <48> 83 7e 48 00 48 8b 5e 50 74 4b 48 89 f7 e8 83 fc ff ff 4c 8b RIP [<ffffffff812b5c7a>] crypto_destroy_tfm+0x1a/0x90 RSP <ffff88007b081d10> CR2: 000000000000004e The cifs code allocates some crypto structures. If that fails, it returns an error, but it leaves the pointers set to their PTR_ERR values. Then later when it tries to clean up, it sees that those values are non-NULL and then passes them to the routine that frees them. Fix this by setting the pointers to NULL after collecting the error code in this situation. Cc: Sachin Prabhu <sprabhu@redhat.com> Reported-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2013-07-31 13:44:59 -05:00
Oleg Nesterov	776164c1fa	debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs) debugfs_remove_recursive() is wrong, 1. it wrongly assumes that !list_empty(d_subdirs) means that this dir should be removed. This is not that bad by itself, but: 2. if d_subdirs does not becomes empty after __debugfs_remove() it gives up and silently fails, it doesn't even try to remove other entries. However ->d_subdirs can be non-empty because it still has the already deleted !debugfs_positive() entries. 3. simple_release_fs() is called even if __debugfs_remove() fails. Suppose we have dir1/ dir2/ file2 file1 and someone opens dir1/dir2/file2. Now, debugfs_remove_recursive(dir1/dir2) succeeds, and dir1/dir2 goes away. But debugfs_remove_recursive(dir1) silently fails and doesn't remove this directory. Because it tries to delete (the already deleted) dir1/dir2/file2 again and then fails due to "Avoid infinite loop" logic. Test-case: #!/bin/sh cd /sys/kernel/debug/tracing echo 'p:probe/sigprocmask sigprocmask' >> kprobe_events sleep 1000 < events/probe/sigprocmask/id & echo -n >\| kprobe_events [ -d events/probe ] && echo "ERR!! failed to rm probe" And after that it is not possible to create another probe entry. With this patch debugfs_remove_recursive() skips !debugfs_positive() files although this is not strictly needed. The most important change is that it does not try to make ->d_subdirs empty, it simply scans the whole list(s) recursively and removes as much as possible. Link: http://lkml.kernel.org/r/20130726151256.GC19472@redhat.com Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2013-07-31 12:16:31 -04:00
Dan Carpenter	f0c5e565bb	f2fs: remove an unneeded kfree(NULL) This kfree() is no longer needed after a79dc083d7 "f2fs: move bio_private allocation out of f2fs_bio_alloc()". The "bio->bi_private" is NULL here so it's a no-op. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-31 19:07:01 +09:00
Andi Shyti	fe090e4e44	cifs: file: initialize oparms.reconnect before using it In the cifs_reopen_file function, if the following statement is asserted: (tcon->unix_ext && cap_unix(tcon->ses) && (CIFS_UNIX_POSIX_PATH_OPS_CAP & (tcon->fsUnixInfo.Capability))) and we succeed to open with cifs_posix_open, the function jumps to the label reopen_success and checks for oparms.reconnect which is not initialized. This issue has been reported by scan.coverity.com Signed-off-by: Andi Shyti <andi@etezian.org> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2013-07-30 23:54:49 -05:00
Steve French	1b244081af	Do not attempt to do cifs operations reading symlinks with SMB2 When use of symlinks is enabled (mounting with mfsymlinks option) to non-Samba servers, we always tried to use cifs, even when we were mounted with SMB2 or SMB3, which causes the server to drop the network connection. This patch separates out the protocol specific operations for cifs from the code which recognizes symlinks, and fixes the problem where with SMB2 mounts we attempt cifs operations to open and read symlinks. The next patch will add support for SMB2 for opening and reading symlinks. Additional followon patches will address the similar problem creating symlinks. Signed-off-by: Steve French <smfrench@gmail.com>	2013-07-30 23:54:45 -05:00
Chen Gang	057d6332b2	cifs: extend the buffer length enought for sprintf() using For cifs_set_cifscreds() in "fs/cifs/connect.c", 'desc' buffer length is 'CIFSCREDS_DESC_SIZE' (56 is less than 256), and 'ses->domainName' length may be "255 + '\0'". The related sprintf() may cause memory overflow, so need extend related buffer enough to hold all things. It is also necessary to be sure of 'ses->domainName' must be less than 256, and define the related macro instead of hard code number '256'. Signed-off-by: Chen Gang <gang.chen@asianux.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Reviewed-by: Scott Lovenberg <scott.lovenberg@gmail.com> CC: <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com>	2013-07-30 23:54:40 -05:00
Tejun Heo	ededf305a8	dlm: WQ_NON_REENTRANT is meaningless and going away `dbf2576e37` ("workqueue: make all workqueues non-reentrant") made WQ_NON_REENTRANT no-op and the flag is going away. Remove its usages. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Teigland <teigland@redhat.com>	2013-07-30 09:24:24 -05:00
Jaegeuk Kim	cbd56e7d20	f2fs: fix handling orphan inodes This patch fixes mishandling of the sbi->n_orphans variable. If users request lots of f2fs_unlink(), check_orphan_space() could be contended. In such the case, sbi->n_orphans can be read incorrectly so that f2fs_unlink() would fall into the wrong state which results in the failure of add_orphan_inode(). So, let's increment sbi->n_orphans virtually prior to the actual orphan inode stuffs. After that, let's release sbi->n_orphans by calling release_orphan_inode or remove_orphan_inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Gu Zheng	d8207f6958	f2fs: move bio_private allocation out of f2fs_bio_alloc() bio->bi_private is not always needed. As in the reading data path, end_read_io does not need bio_private for further using, so moving bio_private allocation out of f2fs_bio_alloc(). Alloc it in the submit_write_page(), and ignore it in the f2fs_readpage(). Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Gu Zheng	60ed9a0f53	f2fs: use list_for_each rather than list_for_each_safe, in remove_orphan_inode() As we remove the target single node, so list_for_each is enought, in order to clean up, we use list_for_each_entry instead. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Gu Zheng	2d219c5188	f2fs: use seq_puts()/seq_putc() rather than seq_printf() where possible For string without format specifiers, using seq_puts()/seq_putc() instead of seq_printf(). Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Jaegeuk Kim	f0947e5cce	f2fs: fix i_name during f2fs_sync_file As similar as the i_pino fix, i_name also should be fixed when i_nlink is 1. The errorneous scenario is like this. 1. touch test1 2. link test1 test2 3. unlink test2 4. fsync test1 After this, i_name should be test1. CC: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Jaegeuk Kim	1cd14cafc6	f2fs: update file name in the inode block during f2fs_rename The error is reproducible by: 0. mkfs.f2fs /dev/sdb1 & mount 1. touch test1 2. touch test2 3. mv test1 test2 4. umount 5. dumpt.f2fs -i 4 /dev/sdb1 After this, when we retrieve the inode->i_name of test2 by dump.f2fs, we get test1 instead of test2. This is because f2fs didn't update the file name during the f2fs_rename. So, this patch fixes that. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Gu Zheng	4559071063	f2fs: introduce help function F2FS_NODE() Introduce help function F2FS_NODE() to simplify the conversion of node_page to f2fs_node. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:02 +09:00
Gu Zheng	963d4f7d7b	f2fs: add a help func F2FS_STAT() to get the f2fs_stat_info Add a help func F2FS_STAT() to get the f2fs_stat_info. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:02 +09:00
Jaegeuk Kim	5e176d54a6	f2fs: add proc entry to monitor current usage of segments You can monitor valid block counts of whole segments in: /proc/fs/f2fs/sdb1/segment_info. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:02 +09:00
Jaegeuk Kim	e5d2385ed6	f2fs: recover date requested by fdatasync In order to support SQLite that uses fdatasync instead of fsync, we should guarantee the data requested by fdatasync can be recovered after sudden-power- off. So, let's remove the fdatasync condition in f2fs_sync_file. Otherwise, we can restore the data after sudden-power-off due to nonexistence of any fsync mark'ed node blocks. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:02 +09:00
Greg Kroah-Hartman	b78b6b3a9a	Merge 3.11-rc3 into driver-core-next We want these fixes in this branch. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-29 12:30:13 -07:00
Zheng Liu	44fb851dfb	ext4: add WARN_ON to check the length of allocated blocks In commit `921f266b`: ext4: add self-testing infrastructure to do a sanity check, some sanity checks were added in map_blocks to make sure 'retval == map->m_len'. Enable these checks by default and report any assertion failures using ext4_warning() and WARN_ON() since they can help us to figure out some bugs that are otherwise hard to hit. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-07-29 12:51:42 -04:00
Theodore Ts'o	94eec0fc35	ext4: fix retry handling in ext4_ext_truncate() We tested for ENOMEM instead of -ENOMEM. Oops. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-07-29 12:12:56 -04:00
Greg Kroah-Hartman	4183fb9503	cuse: convert class code to use dev_groups The dev_attrs field of struct class is going away soon, dev_groups should be used instead. This converts the cuse class code to use the correct field. Acked-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-26 18:05:18 -07:00
Eric Sandeen	dd12ed144e	ext4: destroy ext4_es_cachep on module unload Without this, module can't be reloaded. [ 500.521980] kmem_cache_sanity_check (ext4_extent_status): Cache name already exists. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org # v3.8+	2013-07-26 15:21:11 -04:00
Theodore Ts'o	a34eb50374	ext4: make sure group number is bumped after a inode allocation race When we try to allocate an inode, and there is a race between two CPU's trying to grab the same inode, _and_ this inode is the last free inode in the block group, make sure the group number is bumped before we continue searching the rest of the block groups. Otherwise, we end up searching the current block group twice, and we end up skipping searching the last block group. So in the unlikely situation where almost all of the inodes are allocated, it's possible that we will return ENOSPC even though there might be free inodes in that last block group. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-07-26 15:15:46 -04:00
Linus Torvalds	6c4155a9cd	xfs: fix for 3.11-rc3 - fix for regression in commit `cca9f93a52`, recovery causing filesystem corruption after a crash -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJR8ZWnAAoJENaLyazVq6ZOb7cP/iLwa59qX5sHAoYRGterE4Li 34xPkihJEjcbvNCtK+rznXT9ohSvwTahnrdlLy/bQ6d0K1gBX3j1DD4cpTvGRWJR hEBbQU0PXhXjRL6ixgxfeNPfbEfNMYhiTFfjPhBjKVgzYN3NnJZ1lv8zTHaeQ8JP m7dEKrrg/J8LsW18fq2E0p/SjKi7cT1mEf8jkcYu0UGYd7yDtQSukMbEjfsIJq9L DpB3QXHQkUf1UlVdUvLncGmcDUPAEt+8/ae9uUpY2nxHv+7jmzAoCyRUCTDYsIh2 gznQjsns56B2FWfnkyzXC3nMaoyIZpT8Fy3FRBsQRKGboOeOPS+/Yyzf/FcLQ8Jl yMXA0oR+3Ft7wJ62+aSuP3/dug8TbBk09bI+RqV4D+GwM7n7kLE/Fo3kQLva5Aqf rZIhwzfBDl51vxRzm4I29wOkfvQRXndy4c0hYtfeVy0lBA2yCFLSlzGha5EX+CxM s1kbpOkuOOE5k5Mgjve/iIKbwG3OKEuPCrESJPG+sTREAkkXkycnVQft2ihJYgg8 yIgPG4fxpIIpwTdC016YAa/raOm/unIG6ko+ec3m2rB2lmo8j3vQOjoIFuV0KYV1 enzhK5F+sQJl9evQOgfJc+uOMgjs1DrE38hnlQ8rc3LXa5Dtb7ReMRAT7z2FxicF keAPwJNrMlwgIyYi+3B+ =hrYY -----END PGP SIGNATURE----- Merge tag 'for-linus-v3.11-rc3' of git://oss.sgi.com/xfs/xfs Pull xfs fix from Ben Myers: "Fix for regression in commit `cca9f93a52` ("xfs: don't do IO when creating an new inode"), recovery causing filesystem corruption after a crash" * tag 'for-linus-v3.11-rc3' of git://oss.sgi.com/xfs/xfs: xfs: di_flushiter considered harmful	2013-07-26 11:22:54 -07:00
Linus Torvalds	f315cf5e02	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd fix from Bruce Fields: "One more nfsd bugfix for 3.11" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: nfsd: nfsd_open: when dentry_open returns an error do not propagate as struct file	2013-07-26 11:21:43 -07:00
Dave Chinner	e1b4271ac2	xfs: di_flushiter considered harmful When we made all inode updates transactional, we no longer needed the log recovery detection for inodes being newer on disk than the transaction being replayed - it was redundant as replay of the log would always result in the latest version of the inode would be on disk. It was redundant, but left in place because it wasn't considered to be a problem. However, with the new "don't read inodes on create" optimisation, flushiter has come back to bite us. Essentially, the optimisation made always initialises flushiter to zero in the create transaction, and so if we then crash and run recovery and the inode already on disk has a non-zero flushiter it will skip recovery of that inode. As a result, log recovery does the wrong thing and we end up with a corrupt filesystem. Because we have to support old kernel to new kernel upgrades, we can't just get rid of the flushiter support in log recovery as we might be upgrading from a kernel that doesn't have fully transactional inode updates. Unfortunately, for v4 superblocks there is no way to guarantee that log recovery knows about this fact. We cannot add a new inode format flag to say it's a "special inode create" because it won't be understood by older kernels and so recovery could do the wrong thing on downgrade. We cannot specially detect the combination of zero mode/non-zero flushiter on disk to non-zero mode, zero flushiter in the log item during recovery because wrapping of the flushiter can result in false detection. Hence that makes this "don't use flushiter" optimisation limited to a disk format that guarantees that we don't need it. And that means the only fix here is to limit the "no read IO on create" optimisation to version 5 superblocks.... Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit `e60896d8f2`)	2013-07-25 10:41:42 -05:00
Tetsuo Handa	9548906b2b	xattr: Constify ->name member of "struct xattr". Since everybody sets kstrdup()ed constant string to "struct xattr"->name but nobody modifies "struct xattr"->name , we can omit kstrdup() and its failure checking by constifying ->name member of "struct xattr". Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Joel Becker <jlbec@evilplan.org> [ocfs2] Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Tested-by: Paul Moore <paul@paul-moore.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-07-25 19:30:03 +10:00
Eric W. Biederman	5ff9d8a65c	vfs: Lock in place mounts from more privileged users When creating a less privileged mount namespace or propogating mounts from a more privileged to a less privileged mount namespace lock the submounts so they may not be unmounted individually in the child mount namespace revealing what is under them. This enforces the reasonable expectation that it is not possible to see under a mount point. Most of the time mounts are on empty directories and revealing that does not matter, however I have seen an occassionaly sloppy configuration where there were interesting things concealed under a mount point that probably should not be revealed. Expirable submounts are not locked because they will eventually unmount automatically so whatever is under them already needs to be safe for unprivileged users to access. From a practical standpoint these restrictions do not appear to be significant for unprivileged users of the mount namespace. Recursive bind mounts and pivot_root continues to work, and mounts that are created in a mount namespace may be unmounted there. All of which means that the common idiom of keeping a directory of interesting files and using pivot_root to throw everything else away continues to work just fine. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-07-24 09:14:46 -07:00
Linus Torvalds	55c62960b0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse bugfixes from Miklos Szeredi: "These are bugfixes and a cleanup to the "readdirplus" feature" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: readdirplus: cleanup fuse: readdirplus: change attributes once fuse: readdirplus: fix instantiate fuse: readdirplus: sanity checks fuse: readdirplus: fix dentry leak	2013-07-23 14:37:04 -07:00
Trond Myklebust	4f3cc4809a	NFSv4: Fix brainfart in attribute length calculation The calculation of the attribute length was 4 bytes off. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Andre Heider <a.heider@gmail.com> Reported-and-tested-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-23 14:24:59 -07:00
Harshula Jayasuriya	e4daf1ffbe	nfsd: nfsd_open: when dentry_open returns an error do not propagate as struct file The following call chain: ------------------------------------------------------------ nfs4_get_vfs_file - nfsd_open - dentry_open - do_dentry_open - __get_file_write_access - get_write_access - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY; ------------------------------------------------------------ can result in the following state: ------------------------------------------------------------ struct nfs4_file { ... fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0}, fi_access = {{ counter = 0x1 }, { counter = 0x0 }}, ... ------------------------------------------------------------ 1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NULL, hence nfsd_open() is called where we get status set to an error and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented. 2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented. Thus we leave a landmine in the form of the nfs4_file data structure in an incorrect state. 3) Eventually, when __nfs4_file_put_access() is called it finds fi_access[O_WRONLY] being non-zero, it decrements it and calls nfs4_file_put_fd() which tries to fput -ETXTBSY. ------------------------------------------------------------ ... [exception RIP: fput+0x9] RIP: ffffffff81177fa9 RSP: ffff88062e365c90 RFLAGS: 00010282 RAX: ffff880c2b3d99cc RBX: ffff880c2b3d9978 RCX: 0000000000000002 RDX: dead000000100101 RSI: 0000000000000001 RDI: ffffffffffffffe6 RBP: ffff88062e365c90 R8: ffff88041fe797d8 R9: ffff88062e365d58 R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000007 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd] #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd] #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd] #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd] #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd] #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd] #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd] #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc] #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc] #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd] #19 [ffff88062e365ee8] kthread at ffffffff81090886 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a ------------------------------------------------------------ Cc: stable@vger.kernel.org Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-23 12:15:32 -04:00
Steven Whitehouse	ad151d5444	GFS2: Replace PTR_RET with PTR_ERR_OR_ZERO PTR_RET should be PTR_ERR Reported-by: Sachin Kamat <sachin.kamat@linaro.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-07-22 11:11:46 +09:30
Zheng Liu	dda5690def	ext3: fix a BUG when opening a file with O_TMPFILE flag When we try to open a file with O_TMPFILE flag, we will trigger a bug. The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and this check always fails because we set ->i_nlink = 1 in inode_init_always(). We can use the following program to trigger it: int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_TMPFILE, 0666); if (fd < 0) { perror("open "); return -1; } close(fd); return 0; } The oops message looks like this: kernel: kernel BUG at fs/ext3/namei.c:1992! kernel: invalid opcode: 0000 [#1] SMP kernel: Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand ipv6 dm_mirror dm_region_hash dm_log dm_mod parport_pc parport serio_raw sg dcdbas pcspkr i2c_i801 ehci_pci ehci_hcd button acpi_cpufreq mperf e1000e ptp pps_core ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core ext3 jbd sd_mod ahci libahci libata scsi_mod uhci_hcd kernel: CPU: 0 PID: 2882 Comm: tst_tmpfile Not tainted 3.11.0-rc1+ #4 kernel: Hardware name: Dell Inc. OptiPlex 780 /0V4W66, BIOS A05 08/11/2010 kernel: task: ffff880112d30050 ti: ffff8801124d4000 task.ti: ffff8801124d4000 kernel: RIP: 0010:[<ffffffffa00db5ae>] [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3] kernel: RSP: 0018:ffff8801124d5cc8 EFLAGS: 00010202 kernel: RAX: 0000000000000000 RBX: ffff880111510128 RCX: ffff8801114683a0 kernel: RDX: 0000000000000000 RSI: ffff880111510128 RDI: ffff88010fcf65a8 kernel: RBP: ffff8801124d5d18 R08: 0080000000000000 R09: ffffffffa00d3b7f kernel: R10: ffff8801114683a0 R11: ffff8801032a2558 R12: 0000000000000000 kernel: R13: ffff88010fcf6800 R14: ffff8801032a2558 R15: ffff8801115100d8 kernel: FS: 00007f5d172b5700(0000) GS:ffff880117c00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b kernel: CR2: 00007f5d16df15d0 CR3: 0000000110b1d000 CR4: 00000000000407f0 kernel: Stack: kernel: 000000000000000c ffff8801048a7dc8 ffff8801114685a8 ffffffffa00b80d7 kernel: ffff8801124d5e38 ffff8801032a2558 ffff88010ce24d68 0000000000000000 kernel: ffff88011146b300 ffff8801124d5d44 ffff8801124d5d78 ffffffffa00db7e1 kernel: Call Trace: kernel: [<ffffffffa00b80d7>] ? journal_start+0x8c/0xbd [jbd] kernel: [<ffffffffa00db7e1>] ext3_tmpfile+0xb2/0x13b [ext3] kernel: [<ffffffff821076f8>] path_openat+0x11f/0x5e7 kernel: [<ffffffff821c86b4>] ? list_del+0x11/0x30 kernel: [<ffffffff82065fa2>] ? __dequeue_entity+0x33/0x38 kernel: [<ffffffff82107cd5>] do_filp_open+0x3f/0x8d kernel: [<ffffffff82112532>] ? __alloc_fd+0x50/0x102 kernel: [<ffffffff820f9296>] do_sys_open+0x13b/0x1cd kernel: [<ffffffff820f935c>] SyS_open+0x1e/0x20 kernel: [<ffffffff82398c02>] system_call_fastpath+0x16/0x1b kernel: Code: 39 c7 0f 85 67 01 00 00 0f b7 03 25 00 f0 00 00 3d 00 40 00 00 74 18 3d 00 80 00 00 74 11 3d 00 a0 00 00 74 0a 83 7b 48 00 74 04 <0f> 0b eb fe 49 8b 85 50 03 00 00 4c 89 f6 48 c7 c7 c0 99 0e a0 kernel: RIP [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3] kernel: RSP <ffff8801124d5cc8> Here we couldn't call clear_nlink() directly because in d_tmpfile() we will call inode_dec_link_count() to decrease ->i_nlink. So this commit tries to call d_tmpfile() before ext4_orphan_add() to fix this problem. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk>	2013-07-20 22:03:20 -04:00
Zheng Liu	e94bd3490f	ext4: fix a BUG when opening a file with O_TMPFILE flag When we try to open a file with O_TMPFILE flag, we will trigger a bug. The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and this check always fails because we set ->i_nlink = 1 in inode_init_always(). We can use the following program to trigger it: int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_TMPFILE, 0666); if (fd < 0) { perror("open "); return -1; } close(fd); return 0; } The oops message looks like this: kernel BUG at fs/ext4/namei.c:2572! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dlci bridge stp hidp cmtp kernelcapi l2tp_ppp l2tp_netlink l2tp_core sctp libcrc32c rfcomm tun fuse nfnetli nk can_raw ipt_ULOG can_bcm x25 scsi_transport_iscsi ipx p8023 p8022 appletalk phonet psnap vmw_vsock_vmci_transport af_key vmw_vmci rose vsock atm can netrom ax25 af_rxrpc ir da pppoe pppox ppp_generic slhc bluetooth nfc rfkill rds caif_socket caif crc_ccitt af_802154 llc2 llc snd_hda_codec_realtek snd_hda_intel snd_hda_codec serio_raw snd_pcm pcsp kr edac_core snd_page_alloc snd_timer snd soundcore r8169 mii sr_mod cdrom pata_atiixp radeon backlight drm_kms_helper ttm CPU: 1 PID: 1812571 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #12 Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010 task: ffff88007dfe69a0 ti: ffff88010f7b6000 task.ti: ffff88010f7b6000 RIP: 0010:[<ffffffff8125ce69>] [<ffffffff8125ce69>] ext4_orphan_add+0x299/0x2b0 RSP: 0018:ffff88010f7b7cf8 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff8800966d3020 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88007dfe70b8 RDI: 0000000000000001 RBP: ffff88010f7b7d40 R08: ffff880126a3c4e0 R09: ffff88010f7b7ca0 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801271fd668 R13: ffff8800966d2f78 R14: ffff88011d7089f0 R15: ffff88007dfe69a0 FS: 00007f70441a3740(0000) GS:ffff88012a800000(0000) knlGS:00000000f77c96c0 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000002834000 CR3: 0000000107964000 CR4: 00000000000007e0 DR0: 0000000000780000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Stack: 0000000000002000 00000020810b6dde 0000000000000000 ffff88011d46db00 ffff8800966d3020 ffff88011d7089f0 ffff88009c7f4c10 ffff88010f7b7f2c ffff88007dfe69a0 ffff88010f7b7da8 ffffffff8125cfac ffff880100000004 Call Trace: [<ffffffff8125cfac>] ext4_tmpfile+0x12c/0x180 [<ffffffff811cba78>] path_openat+0x238/0x700 [<ffffffff8100afc4>] ? native_sched_clock+0x24/0x80 [<ffffffff811cc647>] do_filp_open+0x47/0xa0 [<ffffffff811db73f>] ? __alloc_fd+0xaf/0x200 [<ffffffff811ba2e4>] do_sys_open+0x124/0x210 [<ffffffff81010725>] ? syscall_trace_enter+0x25/0x290 [<ffffffff811ba3ee>] SyS_open+0x1e/0x20 [<ffffffff816ca8d4>] tracesys+0xdd/0xe2 [<ffffffff81001001>] ? start_thread_common.constprop.6+0x1/0xa0 Code: 04 00 00 00 89 04 24 31 c0 e8 c4 77 04 00 e9 43 fe ff ff 66 25 00 d0 66 3d 00 80 0f 84 0e fe ff ff 83 7b 48 00 0f 84 04 fe ff ff <0f> 0b 49 8b 8c 24 50 07 00 00 e9 88 fe ff ff 0f 1f 84 00 00 00 Here we couldn't call clear_nlink() directly because in d_tmpfile() we will call inode_dec_link_count() to decrease ->i_nlink. So this commit tries to call d_tmpfile() before ext4_orphan_add() to fix this problem. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Tested-by: Darrick J. Wong <darrick.wong@oracle.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-20 21:58:38 -04:00
Linus Torvalds	36231d255b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "The sget() one is a long-standing bug and will need to go into -stable (in fact, it had been originally caught in RHEL6), the other two are 3.11-only" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: constify dentry parameter in d_count() livelock avoidance in sget() allow O_TMPFILE to work with O_WRONLY	2013-07-20 10:50:01 -07:00
Linus Torvalds	19bf1c2c7b	Fixes for 3.11-rc2, sent at 5pm, in the professoinal style. :-) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJR6cmlAAoJENNvdpvBGATwZF0P/0a7ET511UJwQbgAIq5ftFlj 86Bzvy28xo2T85t64L+Ib2XDehWHk0sZlQpB/gK8MLYn4rCRWCxkQAshKwoequsC AhuvQ7NtX9vJNCSR30+RrLhkvj6UKsMuM724adARLBUgMBoScABzZImR1e14ELah bN27a4Bk2aNUpNX68QYdQX3TGiHGZy//lNmh81JTxFS3Moqm6bIZAJbYpOslATsI Q5nti/TjQJKso2gF7Jx7NffXv0g5rGxaVQEZJPpfIv1Vs0b6vabK/sYp608ayM0K qKyjJABaHR1Pzb16V82ZqvSlsHm/ARhCF1nMM6gQ8nwl/plxcQ6Jvd/qJsNej3b/ 7Jfm86xLe+G0G5oeNEJXsoEFAsvxug6ZRMfyoRHaPlGIksmz+Jc9kzTtM3qzdzOB 5OPJwlONlM4dRVA6rgb7KiuE3h/sRt4CctFejD0f6mUqKa+B+zyHq/a/8a+60IqQ /sDiTQrqrI6LWxECFasDNoGxtnvVtKC21jbg+MTzumZDvjgnJIFFe5NrinI6SB9x VQYVq/vVkE576VTwGAttTg3s4sRwQKd/iuQjuoP76iFFHvq/sNX6fBq0NW5gpsj2 WAfH+fLQsMcVJ2MAcc3DwdBT1wQbLu+Y19hv4TDOZRmnKGhq9K08hzWR4tIUKdFJ UcjWk35Wuoz1IGpVlHJ5 =ngfz -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "Fixes for 3.11-rc2, sent at 5pm, in the professoinal style. :-)" I'm not sure I like this new level of "professionalism". 9-5, people, 9-5. * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: call ext4_es_lru_add() after handling cache miss ext4: yield during large unlinks ext4: make the extent_status code more robust against ENOMEM failures ext4: simplify calculation of blocks to free on error ext4: fix error handling in ext4_ext_truncate()	2013-07-20 10:48:59 -07:00
Linus Torvalds	3be542d464	NFS client bugfixes for 3.11 - Fix a regression against NFSv4 FreeBSD servers when creating a new file - Fix another regression in rpc_client_register() -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJR6ZKMAAoJEGcL54qWCgDyQX8P/19LKLNKcL+y2zVGjLbXMTq0 TpyWdBO0ux7QcqnPEDg+Jpvu62IowYiKTtaSOXtHb5BNjQMBo2RKw3B0eMBoCp/z 6gHmQRD2hMgqwBxBwHceV+dNwueCUiZW7GqaaNh6/3bpGQefegdONnLEifuPogEu oZmEuiVrGDfITEF7D4k5+shXCQN4eNH0LFuIQo4XXdCqmK6PwvOsidZ7YwHVC3Mg /Jzda2YsCxHj8kPi1xb9skPPAn6g4kdfYfyr/xSY7IviPixrkg/nEEK1b8xHU81e a0dd0Yx5kq6fR8LsBvQCHdj2m7doHM15jf5Np5G7VnnaWEjB2y+QftkxWc9lCNU3 t2fr9YVD7ZG/GGNSFePHAHmBY0OqDB1Htp4vcwEQfzX6CAR3Hel82WVvut62Z6m4 G5qHjwdqUFhmRN//SWlDpEqSn+pbeCvPhQS60ayN0TLivRsscm/I4yA75odAnn9b 4su1IcUpqeJGeV6yDyMUqbx4kYZFyCZg/DNkThXiTKOs47A7ogSS9ev2fTB/V+jd rroNHNd/U508ze9D6D4ai9vR78uUp4wKNSSBZMCkBtNh0uSApOTgyGVhertB1EKS vgAr4T1tc+9t+0qg1Sb+hbKyBM/KaS5zUrPn+APHPoBXPh5PSVBzeNJkpxHRw/V0 ZxkEgSQKLZSXYb5ab770 =XE+7 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Fix a regression against NFSv4 FreeBSD servers when creating a new file - Fix another regression in rpc_client_register() * tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix a regression against the FreeBSD server SUNRPC: Fix another issue with rpc_client_register()	2013-07-20 10:48:24 -07:00
Linus Torvalds	90290c4ebe	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next Pull btrfs fixes from Josef Bacik: "I'm playing the role of Chris Mason this week while he's on vacation. There are a few critical fixes for btrfs here, all regressions and have been tested well" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next: Btrfs: fix wrong write offset when replacing a device Btrfs: re-add root to dead root list if we stop dropping it Btrfs: fix lock leak when resuming snapshot deletion Btrfs: update drop progress before stopping snapshot dropping	2013-07-20 10:47:38 -07:00
Al Viro	acfec9a5a8	livelock avoidance in sget() Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about to fail. The superblock is on ->fs_supers, ->s_umount is held exclusive, ->s_active is 1. Along comes two more processes, trying to mount the same thing; sget() in each is picking that superblock, bumping ->s_count and trying to grab ->s_umount. ->s_active is 3 now. Original mount(2) finally gets to deactivate_locked_super() on failure; ->s_active is 2, superblock is still ->fs_supers because shutdown will not happen until ->s_active hits 0. ->s_umount is dropped and now we have two processes chasing each other: s_active = 2, A acquired ->s_umount, B blocked A sees that the damn thing is stillborn, does deactivate_locked_super() s_active = 1, A drops ->s_umount, B gets it A restarts the search and finds the same superblock. And bumps it ->s_active. s_active = 2, B holds ->s_umount, A blocked on trying to get it ... and we are in the earlier situation with A and B switched places. The root cause, of course, is that ->s_active should not grow until we'd got MS_BORN. Then failing ->mount() will have deactivate_locked_super() shut the damn thing down. Fortunately, it's easy to do - the key point is that grab_super() is called only for superblocks currently on ->fs_supers, so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and bump ->s_active; we must never increment ->s_count for superblocks past ->kill_sb(), but grab_super() is never called for those. The bug is pretty old; we would've caught it by now, if not for accidental exclusion between sget() for block filesystems; the things like cgroup or e.g. mtd-based filesystems don't have anything of that sort, so they get bitten. The right way to deal with that is obviously to fix sget()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-20 04:58:58 +04:00
Al Viro	ba57ea64cb	allow O_TMPFILE to work with O_WRONLY Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-20 03:11:32 +04:00
Linus Torvalds	89a8c5940d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "An update for the BFP jit to the latest and greatest, two patches to get kdump working again, the random-abort ptrace extention for transactional execution, the z90crypt module alias for ap and a tiny cleanup" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/zcrypt: Alias for new zcrypt device driver base module s390/kdump: Allow copy_oldmem_page() copy to virtual memory s390/kdump: Disable mmap for s390 s390/bpf,jit: add pkt_type support s390/bpf,jit: address randomize and write protect jit code s390/bpf,jit: use generic jit dumper s390/bpf,jit: call module_free() from any context s390/qdio: remove unused variable s390/ptrace: PTRACE_TE_ABORT_RAND	2013-07-19 15:08:12 -07:00
Stefan Behrens	115930cb2d	Btrfs: fix wrong write offset when replacing a device Miao Xie reported the following issue: The filesystem was corrupted after we did a device replace. Steps to reproduce: # mkfs.btrfs -f -m single -d raid10 <device0>..<device3> # mount <device0> <mnt> # btrfs replace start -rfB 1 <device4> <mnt> # umount <mnt> # btrfsck <device4> The reason for the issue is that we changed the write offset by mistake, introduced by commit `625f1c8dc`. We read the data from the source device at first, and then write the data into the corresponding place of the new device. In order to implement the "-r" option, the source location is remapped using btrfs_map_block(). The read takes place on the mapped location, and the write needs to take place on the unmapped location. Currently the write is using the mapped location, and this commit changes it back by undoing the change to the write address that the aforementioned commit added by mistake. Reported-by: Miao Xie <miaox@cn.fujitsu.com> Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-07-19 15:07:26 -04:00
Josef Bacik	d29a9f629e	Btrfs: re-add root to dead root list if we stop dropping it If we stop dropping a root for whatever reason we need to add it back to the dead root list so that we will re-start the dropping next transaction commit. The other case this happens is if we recover a drop because we will add a root without adding it to the fs radix tree, so we can leak it's root and commit root extent buffer, adding this to the dead root list makes this cleanup happen. Thanks, Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-07-19 15:07:19 -04:00
Josef Bacik	fec386ac14	Btrfs: fix lock leak when resuming snapshot deletion We aren't setting path->locks[level] when we resume a snapshot deletion which means we won't unlock the buffer when we free the path. This causes deadlocks if we happen to re-allocate the block before we've evicted the extent buffer from cache. Thanks, Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-07-19 15:07:11 -04:00
Josef Bacik	3c8f242257	Btrfs: update drop progress before stopping snapshot dropping Alex pointed out a problem and fix that exists in the drop one snapshot at a time patch. If we decide we need to exit for whatever reason (umount for example) we will just exit the snapshot dropping without updating the drop progress. So the next time we go to resume we will BUG_ON() because we can't find the extent we left off at because we never updated it. This patch fixes the problem. Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-07-19 15:07:03 -04:00
Linus Torvalds	7a62711aac	Driver core patches for 3.11-rc2 Here are some driver core patches for 3.11-rc2. They aren't really bugfixes, but a bunch of new helper macros for drivers to properly create attribute groups, which drivers and subsystems need to fix up a ton of race issues with incorrectly creating sysfs files (binary and normal) after userspace has been told that the device is present. Also here is the ability to create binary files as attribute groups, to solve that race condition, which was impossible to do before this, so that's my fault the drivers were broken. The majority of the .c changes is indenting and moving code around a bit. It affects no existing code, but allows the large backlog of 70+ patches that I already have created to start flowing into the different subtrees, instead of having to live in my driver-core tree, causing merge nightmares in linux-next for the next few months. These were finalized too late for the -rc1 merge window, which is why they were didn't make that pull request, testing and review from others didn't happen until a few weeks ago, and then there's the whole distraction of the past few days, which prevented these from getting to you sooner, sorry about that. Oh, and there's a bugfix for the documentation build warning in here as well. All of these have been in linux-next this week, with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (GNU/Linux) iEYEABECAAYFAlHoRUUACgkQMUfUDdst+ymkNACdHAjEXZZmXohDuCb2SqyMeQsz AZcAn3qqJa/NoPEgTCgOkDlAQZM6BnC5 =+Gqk -----END PGP SIGNATURE----- Merge tag 'driver-core-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core patches from Greg KH: "Here are some driver core patches for 3.11-rc2. They aren't really bugfixes, but a bunch of new helper macros for drivers to properly create attribute groups, which drivers and subsystems need to fix up a ton of race issues with incorrectly creating sysfs files (binary and normal) after userspace has been told that the device is present. Also here is the ability to create binary files as attribute groups, to solve that race condition, which was impossible to do before this, so that's my fault the drivers were broken. The majority of the .c changes is indenting and moving code around a bit. It affects no existing code, but allows the large backlog of 70+ patches that I already have created to start flowing into the different subtrees, instead of having to live in my driver-core tree, causing merge nightmares in linux-next for the next few months. These were finalized too late for the -rc1 merge window, which is why they were didn't make that pull request, testing and review from others didn't happen until a few weeks ago, and then there's the whole distraction of the past few days, which prevented these from getting to you sooner, sorry about that. Oh, and there's a bugfix for the documentation build warning in here as well. All of these have been in linux-next this week, with no reported problems" * tag 'driver-core-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: driver-core: fix new kernel-doc warning in base/platform.c sysfs: use file mode defines from stat.h sysfs: add more helper macro's for (bin_)attribute(_groups) driver core: add default groups to struct class driver core: Introduce device_create_groups sysfs: prevent warning when only using binary attributes sysfs: add support for binary attributes in groups driver core: device.h: add RW and RO attribute macros sysfs.h: add BIN_ATTR macro sysfs.h: add ATTRIBUTE_GROUPS() macro sysfs.h: add __ATTR_RW() macro	2013-07-18 12:48:40 -07:00
Michael Holzheu	5a74953ff5	s390/kdump: Disable mmap for s390 The kdump mmap patch series (git commit `83086978c6`) directly map the PT_LOADs to memory. On s390 this does not work because the copy_from_oldmem() function swaps [0,crashkernel size] with [crashkernel base, crashkernel base+crashkernel size]. The swap int copy_from_oldmem() was done in order correctly implement /dev/oldmem. See: http://marc.info/?l=kexec&m=136940802511603&w=2 Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-07-18 13:40:18 +02:00
Trond Myklebust	b4a2cf76ab	NFSv4: Fix a regression against the FreeBSD server Technically, the Linux client is allowed by the NFSv4 spec to send 3 word bitmaps as part of an OPEN request. However, this causes the current FreeBSD server to return NFS4ERR_ATTRNOTSUPP errors. Fix the regression by making the Linux client use a 2 word bitmap unless doing NFSv4.2 with labeled NFS. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-17 16:54:46 -04:00
Linus Torvalds	61f98b0fca	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from Bruce Fields: "Just three minor bugfixes" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: svcrdma: underflow issue in decode_write_list() nfsd4: fix minorversion support interface lockd: protect nlm_blocked access in nlmsvc_retry_blocked	2013-07-17 13:43:55 -07:00
Miklos Szeredi	c7263bcdc4	fuse: readdirplus: cleanup Niels noted that we don't need the 'dentry = NULL' line. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Niels de Vos <ndevos@redhat.com>	2013-07-17 14:53:54 +02:00
Miklos Szeredi	fa2b721360	fuse: readdirplus: change attributes once If we got the inode through fuse_iget() then the attributes are already up-to-date. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2013-07-17 14:53:53 +02:00
Miklos Szeredi	2914941e31	fuse: readdirplus: fix instantiate Fuse does instantiation slightly differently from NFS/CIFS which use d_materialise_unique(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@vger.kernel.org	2013-07-17 14:53:53 +02:00
Miklos Szeredi	a28ef45cbb	fuse: readdirplus: sanity checks Add sanity checks before adding or updating an entry with data received from readdirplus. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@vger.kernel.org	2013-07-17 14:53:53 +02:00
Niels de Vos	53ce9a3364	fuse: readdirplus: fix dentry leak In case d_lookup() returns a dentry with d_inode == NULL, the dentry is not returned with dput(). This results in triggering a BUG() in shrink_dcache_for_umount_subtree(): BUG: Dentry ...{i=0,n=...} still in use (1) [unmount of fuse fuse] [SzM: need to d_drop() as well] Reported-by: Justin Clift <jclift@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Brian Foster <bfoster@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> CC: stable@vger.kernel.org	2013-07-17 14:53:53 +02:00
Oliver Schinagl	388a8c353d	sysfs: prevent warning when only using binary attributes When only using bin_attrs instead of attrs the kernel prints a warning and refuses to create the sysfs entry. This fixes that. Signed-off-by: Oliver Schinagl <oliver@schinagl.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-16 10:57:36 -07:00
Greg Kroah-Hartman	6ab9cea160	sysfs: add support for binary attributes in groups groups should be able to support binary attributes, just like it supports "normal" attributes. This lets us only handle one type of structure, groups, throughout the driver core and subsystems, making binary attributes a "full fledged" part of the driver model, and not something just "tacked on". Reported-by: Oliver Schinagl <oliver@schinagl.nl> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-16 10:57:36 -07:00
Theodore Ts'o	63b999685c	ext4: call ext4_es_lru_add() after handling cache miss If there are no items in the extent status tree, ext4_es_lru_add() is a no-op. So it is not sufficient to call ext4_es_lru_add() before we try to lookup an entry in the extent status tree. We also need to call it at the end of ext4_ext_map_blocks(), after items have been added to the extent status tree. This could lead to inodes with that have extent status trees but which are not in the LRU list, which means they won't get considered for eviction by the es_shrinker. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Zheng Liu <wenqing.lz@taobao.com> Cc: stable@vger.kernel.org	2013-07-16 10:28:47 -04:00
Sachin Kamat	cd633972e1	Btrfs: volume: Replace PTR_RET with PTR_ERR_OR_ZERO PTR_RET is now deprecated. Use PTR_ERR_OR_ZERO instead. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2013-07-16 16:06:02 +09:30
Theodore Ts'o	76828c8826	ext4: yield during large unlinks During large unlink operations on files with extents, we can use a lot of CPU time. This adds a cond_resched() call when starting to examine the next level of a multi-level extent tree. Multi-level extent trees are rare in the first place, and this should rarely be executed. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-07-15 12:27:47 -04:00
Linus Torvalds	47188d39b5	Various regression and bug fixes for ext4. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJR43U/AAoJENNvdpvBGATwdl4P+gI23RkFXTHKvd3XtmXLQojT ncRXVOAARuRZiMbiAOzXv/BDSkLHnOHw6fVLK5buFTLlpQ00tdlrd6ngui4NTe+v Qo0GUqL09iSMLEgZV0OwxV5EULPpYb/xQwfQNAqG3pQbUFq/JdxptBT7r/go/YnX bzWSDiMKeFQoIgH1/xDGXRrfcSdEbjewMfT7lXq+XWRlPyyJPjLnxzDGfJDaOLSR rCZJOsbCfxzwhBd2HFzH55CGGU4yoZ6O7qpsMoF1gjqUSJ2DmVhMV/NSspmTnKRd EZKDT7LK8c02UNdYzLPzPpRjAQfUWBgnh9R84Ake8Py2UHGommTyz6TqMmNTbW5Q EMRd461v+8bvIYnbe/tkT+CTTkC7lRapX6AYaq8k+MpLIWE1bmvX+bMRYOejTE4r jTgYUktzaVzx/4XdgT837vCbsFttixL3x62XelrkZoANw/m0+jgOn9mY5pjDFp8j Eq5wWJ8IsuxCofk/qQj5rOK7/3tFcdJULCoX8f3AB0vooAUKTXBYxYflfIeSgqeZ vlp0ymj588pimH3LM0Vs1BT/aGh0JninLIBk+hcb2YxC2NzvLO2pjSV8i+olBU+C Yq7MoakdT/FDTWp8WbbZm21C95Tj/zCfMCBSgC0k7LpQVM00ts87UdUgfAZPzI1w ZISZFy6O/zhPMFAZCxfV =qf2h -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "Various regression and bug fixes for ext4" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: don't allow ext4_free_blocks() to fail due to ENOMEM ext4: fix spelling errors and a comment in extent_status tree ext4: rate limit printk in buffer_io_error() ext4: don't show usrquota/grpquota twice in /proc/mounts ext4: fix warning in ext4_evict_inode() ext4: fix ext4_get_group_number() ext4: silence warning in ext4_writepages()	2013-07-14 21:47:51 -07:00
Theodore Ts'o	e15f742ce8	ext4: make the extent_status code more robust against ENOMEM failures Some callers of ext4_es_remove_extent() and ext4_es_insert_extent() may not be completely robust against ENOMEM failures (or the consequences of reflecting ENOMEM back up to userspace may lead to xfstest or user application failure). To mitigate against this, when trying to insert an entry in the extent status tree, try to shrink the inode's extent status tree before returning ENOMEM. If there are entries which don't record information about extents under delayed allocations, freeing one of them is preferable to returning ENOMEM. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>	2013-07-15 00:12:14 -04:00
Theodore Ts'o	c8e15130e1	ext4: simplify calculation of blocks to free on error In ext4_ext_map_blocks(), if we have successfully allocated the data blocks, but then run into trouble inserting the extent into the extent tree, most likely due to an ENOSPC condition, determine the arguments to ext4_free_blocks() in a simpler way which is easier to prove to be correct. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-07-15 00:09:37 -04:00
Theodore Ts'o	8acd5e9b12	ext4: fix error handling in ext4_ext_truncate() Previously ext4_ext_truncate() was ignoring potential error returns from ext4_es_remove_extent() and ext4_ext_remove_space(). This can lead to the on-diks extent tree and the extent status tree cache getting out of sync, which is particuarlly bad, and can lead to file system corruption and potential data loss. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-07-15 00:09:19 -04:00
Rusty Russell	8c6ffba0ed	PTR_RET is now PTR_ERR_OR_ZERO(): Replace most. Sweep of the simple cases. Cc: netdev@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-kernel@lists.infradead.org Cc: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-15 11:25:01 +09:30
Linus Torvalds	41d9884c44	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs stuff from Al Viro: "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including making simple_lookup() usable for filesystems with non-NULL s_d_op, which allows us to get rid of quite a bit of ugliness" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sunrpc: now we can just set ->s_d_op cgroup: we can use simple_lookup() now efivarfs: we can use simple_lookup() now make simple_lookup() usable for filesystems that set ->s_d_op configfs: don't open-code d_alloc_name() __rpc_lookup_create_exclusive: pass string instead of qstr rpc_create_*_dir: don't bother with qstr llist: llist_add() can use llist_add_batch() llist: fix/simplify llist_add() and llist_add_batch() fput: turn "list_head delayed_fput_list" into llist_head fs/file_table.c:fput(): add comment Safer ABI for O_TMPFILE	2013-07-14 11:42:26 -07:00
Al Viro	6e8cd2cb46	efivarfs: we can use simple_lookup() now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-14 17:48:35 +04:00
Al Viro	74931da7a6	make simple_lookup() usable for filesystems that set ->s_d_op Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-14 17:43:25 +04:00
Al Viro	ec193cf5af	configfs: don't open-code d_alloc_name() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-14 17:16:52 +04:00
Linus Torvalds	be9c6d9169	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Just a bunch of small fixes and tidy ups: 1) Finish the "busy_poll" renames, from Eliezer Tamir. 2) Fix RCU stalls in IFB driver, from Ding Tianhong. 3) Linearize buffers properly in tun/macvtap zerocopy code. 4) Don't crash on rmmod in vxlan, from Pravin B Shelar. 5) Spinlock used before init in alx driver, from Maarten Lankhorst. 6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry Kravkov. 7) Dummy and ifb driver load failure paths can oops, fixes from Tan Xiaojun and Ding Tianhong. 8) Correct MTU calculations in IP tunnels, from Alexander Duyck. 9) Account all TCP retransmits in SNMP stats properly, from Yuchung Cheng. 10) atl1e and via-rhine do not handle DMA mapping failures properly, from Neil Horman. 11) Various equal-cost multipath route fixes in ipv6 from Hannes Frederic Sowa" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits) ipv6: only static routes qualify for equal cost multipathing via-rhine: fix dma mapping errors atl1e: fix dma mapping warnings tcp: account all retransmit failures usb/net/r815x: fix cast to restricted __le32 usb/net/r8152: fix integer overflow in expression net: access page->private by using page_private net: strict_strtoul is obsolete, use kstrtoul instead drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe net/usb: add relative mii functions for r815x net/tipc: use %*phC to dump small buffers in hex form qlcnic: Adding Maintainers. gre: Fix MTU sizing check for gretap tunnels pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts pkt_sched: sch_qfq: improve efficiency of make_eligible gso: Update tunnel segmentation to support Tx checksum offload inet: fix spacing in assignment ifb: fix oops when loading the ifb failed ...	2013-07-13 17:42:22 -07:00
Linus Torvalds	239dab4636	xfs: update (#2 ) for 3.11-rc1 - fix for xfs_fsr returning -EINVAL - cleanup in xfs_bulkstat - cleanup in xfs_open_by_handle - update mount options documentation - clean up local format handling in xfs_bmapi_write - fix dquot log reservations which were too small - fix sgid inheritance for subdirectories when default acls are in use - add project quota fields to various structures - fix teardown of quotainfo structures when quotas are turned off -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJR4FDjAAoJENaLyazVq6ZOhZIQAMLWC4Wz+3PpRLkIlXZ83wdG LYDNxdYntSVDXNLCrfIhuavgW1yhseLcQZD34g0hgOLQRsmjvUw2ikO6u7qlyoV1 GZZVwdVhsZNicycvuEE0Sva9Jmjgxe0XUOCxJBNAWN0fm5Jzg7w0OGWyzxn2obRX L4LNPqnQS/phCrtNYfUnXpCuUcy0KbpX4GYrdt5tThIWHm7AcyRnOArEkFvVuwdf N3OSN/jSMaEF/GsAqmYFYSXhuL9P1vyBSlyFW82YbyFFd4FKZJbRiFbOsrgbFSkA Ssum7N+rfMc9DCbJsrztgxFaYpj42JR5eCm+jvTejx8nJWKiGjMVtzjq4QtwQQ6e vby7MzdjZ+l2oJclA0y8hOjeg0R7sPVP7xZziZRuK4PHsjtBH3N2FtCOeBtlGhyW 14LK+z+5YXU/gEwmxV5LaknODb2mxvWycf70jaQ6bvrQRUiFnPNIxYKvgAx8YJxl jgYSassHHKtLg0S54P/T91tRsyDVOhy5lqgeogzK5uYa+v+xlMloG+fLzb9GmIgS hXgUIAo+lNlHZkw1FdD4aRgh3OMiUvLQN6woBMbbXfS5XaNpF1UG30YAXeeIqV5e cLChzY+jQiCsmcktb3YQs9C5yfxEciFFSYmZOaKCTgQRWWnyI4/lAu1gd2KtTYUt ZfV0niME4wp0kBWZHOEH =QiZy -----END PGP SIGNATURE----- Merge tag 'for-linus-v3.11-rc1-2' of git://oss.sgi.com/xfs/xfs Pull more xfs updates from Ben Myers: "Here are a fix for xfs_fsr, a cleanup in bulkstat, a cleanup in xfs_open_by_handle, updated mount options documentation, a cleanup in xfs_bmapi_write, a fix for the size of dquot log reservations, a fix for sgid inheritance when acls are in use, a fix for cleaning up quotainfo structures, and some more of the work which allows group and project quotas to be used together. We had a few more in this last quota category that we might have liked to get in, but it looks there are still a few items that need to be addressed. - fix for xfs_fsr returning -EINVAL - cleanup in xfs_bulkstat - cleanup in xfs_open_by_handle - update mount options documentation - clean up local format handling in xfs_bmapi_write - fix dquot log reservations which were too small - fix sgid inheritance for subdirectories when default acls are in use - add project quota fields to various structures - fix teardown of quotainfo structures when quotas are turned off" * tag 'for-linus-v3.11-rc1-2' of git://oss.sgi.com/xfs/xfs: xfs: Fix the logic check for all quotas being turned off xfs: Add pquota fields where gquota is used. xfs: fix sgid inheritance for subdirectories inheriting default acls [V3] xfs: dquot log reservations are too small xfs: remove local fork format handling from xfs_bmapi_write() xfs: update mount options documentation xfs: use get_unused_fd_flags(0) instead of get_unused_fd() xfs: clean up unused codes at xfs_bulkstat() xfs: use XFS_BMAP_BMDR_SPACE vs. XFS_BROOT_SIZE_ADJ	2013-07-13 11:40:24 -07:00
Linus Torvalds	f1c4108852	Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "Fixes for 4 cifs bugs, including a reconnect problem, a problem parsing responses to SMB2 open request, and setting nlink incorrectly to some servers which don't report it properly on the wire. Also improves data integrity on reconnect with series from Pavel which adds durable handle support for SMB2." * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Fix a deadlock when a file is reopened CIFS: Reopen the file if reconnect durable handle failed [CIFS] Fix minor endian error in durable handle patch series CIFS: Reconnect durable handles for SMB2 CIFS: Make SMB2_open use cifs_open_parms struct CIFS: Introduce cifs_open_parms struct CIFS: Request durable open for SMB2 opens CIFS: Simplify SMB2 create context handling CIFS: Simplify SMB2_open code path CIFS: Respect create_options in smb2_open_file CIFS: Fix lease context buffer parsing [CIFS] use sensible file nlink values if unprovided Limit allocation of crypto mechanisms to dialect which requires	2013-07-13 11:20:49 -07:00
Oleg Nesterov	4f5e65a1cc	fput: turn "list_head delayed_fput_list" into llist_head fput() and delayed_fput() can use llist and avoid the locking. This is unlikely path, it is not that this change can improve the performance, but this way the code looks simpler. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Vagin <avagin@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: David Howells <dhowells@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-13 13:29:10 +04:00
Andrew Morton	64372501e2	fs/file_table.c:fput(): add comment A missed update to "fput: task_work_add() can fail if the caller has passed exit_task_work()". Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Vagin <avagin@openvz.org> Cc: David Howells <dhowells@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-13 13:27:42 +04:00
Al Viro	bb458c644a	Safer ABI for O_TMPFILE [suggested by Rasmus Villemoes] make O_DIRECTORY \| O_RDWR part of O_TMPFILE; that will fail on old kernels in a lot more cases than what I came up with. And make sure O_CREAT doesn't get there... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-13 13:26:37 +04:00
Theodore Ts'o	e7676a704e	ext4: don't allow ext4_free_blocks() to fail due to ENOMEM The filesystem should not be marked inconsistent if ext4_free_blocks() is not able to allocate memory. Unfortunately some callers (most notably ext4_truncate) don't have a way to reflect an error back up to the VFS. And even if we did, most userspace applications won't deal with most system calls returning ENOMEM anyway. Reported-by: Nagachandra P <nagachandra@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-07-13 00:40:35 -04:00
Theodore Ts'o	bdafe42aaf	ext4: fix spelling errors and a comment in extent_status tree Replace "assertation" with "assertion" in lots and lots of debugging messages. Correct the comment stating when ext4_es_insert_extent() is used. It was no doubt tree at one point, but it is no longer true... Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Zheng Liu <gnehzuil.liu@gmail.com>	2013-07-13 00:40:31 -04:00
J. Bruce Fields	35f7a14fc1	nfsd4: fix minorversion support interface You can turn on or off support for minorversions using e.g. echo "-4.2" >/proc/fs/nfsd/versions However, the current implementation is a little wonky. For example, the above will turn off 4.2 support, but it will also turn on 4.1 support. This didn't matter as long as we only had 2 minorversions, which was true till very recently. And do a little cleanup here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-12 16:48:52 -04:00
Anatol Pomozov	e8974c3930	ext4: rate limit printk in buffer_io_error() If there are a lot of outstanding buffered IOs when a device is taken offline (due to hardware errors etc), ext4_end_bio prints out a message for each failed logical block. While this is desirable, we see thousands of such lines being printed out before the serial console gets overwhelmed, causing ext4_end_bio() wait for the printk to complete. This in itself isn't a disaster, except for the detail that this function is being called with the queue lock held. This causes any other function in the block layer to spin on its spin_lock_irqsave while the serial console is draining. If NMI watchdog is enabled on this machine then it eventually comes along and shoots the machine in the head. The end result is that losing any one disk causes the machine to go down. This patch rate limits the printk to bandaid around the problem. Tested: xfstests Change-Id: I8ab5690dcf4f3a67e78be147d45e489fdf4a88d8 Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-07-11 22:42:42 -04:00
Pavel Shilovsky	689c3db4d5	CIFS: Fix a deadlock when a file is reopened If we request reading or writing on a file that needs to be reopened, it causes the deadlock: we are already holding rw semaphore for reading and then we try to acquire it for writing in cifs_relock_file. Fix this by acquiring the semaphore for reading in cifs_relock_file due to we don't make any changes in locks and don't need a write access. CC: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2013-07-11 18:05:41 -05:00
Pavel Shilovsky	b33fcf1c9d	CIFS: Reopen the file if reconnect durable handle failed This is a follow-on patch for 8/8 patch from the durable handles series. It fixes the problem when durable file handle timeout expired on the server and reopen returns -ENOENT for such files. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2013-07-11 18:05:08 -05:00
Theodore Ts'o	ad065dd016	ext4: don't show usrquota/grpquota twice in /proc/mounts We now print mount options in a generic fashion in ext4_show_options(), so we shouldn't be explicitly printing the {usr,grp}quota options in ext4_show_quota_options(). Without this patch, /proc/mounts can look like this: /dev/vdb /vdb ext4 rw,relatime,quota,usrquota,data=ordered,usrquota 0 0 ^^^^^^^^ ^^^^^^^^ Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-07-11 18:54:37 -04:00
Chandra Seetharaman	c31ad439e8	xfs: Fix the logic check for all quotas being turned off During the review of seperate pquota inode patches, David noticed that the test to detect all quotas being turned off was incorrect, and hence the block was not freeing all the quota information. The check made sense in Irix, but in Linux, quota is turned off one at a time, which makes the test invalid for Linux. This problem existed since XFS was ported to Linux. David suggested to fix the problem by detecting when all quotas are turned off by checking m_qflags. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-07-11 16:49:10 -05:00
David Jeffery	1c327d962f	lockd: protect nlm_blocked access in nlmsvc_retry_blocked In nlmsvc_retry_blocked, the check that the list is non-empty and acquiring the pointer of the first entry is unprotected by any lock. This allows a rare race condition when there is only one entry on the list. A function such as nlmsvc_grant_callback() can be called, which will temporarily remove the entry from the list. Between the list_empty() and list_entry(),the list may become empty, causing an invalid pointer to be used as an nlm_block, leading to a possible crash. This patch adds the nlm_block_lock around these calls to prevent concurrent use of the nlm_blocked list. This was a regression introduced by `f904be9cc7` "lockd: Mostly remove BKL from the server". Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-11 17:24:07 -04:00
Linus Torvalds	36805aaea5	Merge branch 'for-3.11/core' of git://git.kernel.dk/linux-block Pull core block IO updates from Jens Axboe: "Here are the core IO block bits for 3.11. It contains: - A tweak to the reserved tag logic from Jan, for weirdo devices with just 3 free tags. But for those it improves things substantially for random writes. - Periodic writeback fix from Jan. Marked for stable as well. - Fix for a race condition in IO scheduler switching from Jianpeng. - The hierarchical blk-cgroup support from Tejun. This is the grunt of the series. - blk-throttle fix from Vivek. Just a note that I'm in the middle of a relocation, whole family is flying out tomorrow. Hence I will be awal the remainder of this week, but back at work again on Monday the 15th. CC'ing Tejun, since any potential "surprises" will most likely be from the blk-cgroup work. But it's been brewing for a while and sitting in my tree and linux-next for a long time, so should be solid." * 'for-3.11/core' of git://git.kernel.dk/linux-block: (36 commits) elevator: Fix a race in elevator switching block: Reserve only one queue tag for sync IO if only 3 tags are available writeback: Fix periodic writeback after fs mount blk-throttle: implement proper hierarchy support blk-throttle: implement throtl_grp->has_rules[] blk-throttle: Account for child group's start time in parent while bio climbs up blk-throttle: add throtl_qnode for dispatch fairness blk-throttle: make throtl_pending_timer_fn() ready for hierarchy blk-throttle: make tg_dispatch_one_bio() ready for hierarchy blk-throttle: make blk_throtl_bio() ready for hierarchy blk-throttle: make blk_throtl_drain() ready for hierarchy blk-throttle: dispatch from throtl_pending_timer_fn() blk-throttle: implement dispatch looping blk-throttle: separate out throtl_service_queue->pending_timer from throtl_data->dispatch_work blk-throttle: set REQ_THROTTLED from throtl_charge_bio() and gate stats update with it blk-throttle: implement sq_to_tg(), sq_to_td() and throtl_log() blk-throttle: add throtl_service_queue->parent_sq blk-throttle: generalize update_disptime optimization in blk_throtl_bio() blk-throttle: dispatch to throtl_data->service_queue.bio_lists[] blk-throttle: move bio_lists[] and friends to throtl_service_queue ...	2013-07-11 13:03:24 -07:00
Linus Torvalds	1466b77a7b	NFS client updates for Linux 3.11 (part 2) Highlights include: - Fix an_rpc pipefs regression that causes a deadlock on mount - Readdir optimisations by Scott Mayhew and Jeff Layton - clean up the rpc_pipefs dentry operation setup -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJR3vVIAAoJEGcL54qWCgDyBWEP/0blqSlJId4zZj4xDviRFqJ4 93C7b/Vn7LrAcNCgDQsPkkzTwAX5yTB1H5eNtMuyggAdGj89d4n0jXgBniIMHmqI Pjrr/XMQ65NddehrO491N01iJSfP9wE3CizJodnAv4VxMRO3xqiJG85lcnoLOFea V1FnEFUu9oi8e93cQt2fe6KdmTu/SuRqlqR7WPGyTFgS26x1l8nkp2OQgulit5Up lWuaxg4xbKOdj1jfUDXZhWUnDtkFjxyGxnKR63aA2X1DEGCUTJ6gB3tAl9pvnUb2 RTQF3GVj+Bm/E3gE6ULJvqOjhsgWYjLAZn6hDA3yNAIiFyV7aA6gwK4oKy/B47a6 tFEN2O1EupWzCqGyHhTArk+oEBLfUv/EgFyo7+Y0YIFV4sQTu5RbaZ0nQ2geY6LA 50q2GH57tkXTs859gtBPQgKzgRF1ulkF1FDY9EYQHyGiUbNxBfx+6/2OI04ubQt3 1gKUmm9w1WVzYGmHcHbxsXPT53NtAnHXW4ExcMgpaZ1YOPuIILm78ZuAw78XB/dd mvXRtbhVt/gs7qZAQQPp1iHIv+vnJ0KgjO62gbuTIRftw5jwWrpWcfYMUUZrMnyM kn326z3f4gn/vSDZI7J4tOfG1Uc7eNy+cJxStjtiNWTs3UzuWJKzJH0rZnoNZdei xAkLhjIUEybAqIpXJuGH =NqQf -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull second set of NFS client updates from Trond Myklebust: "This mainly contains some small readdir optimisations that had dependencies on Al Viro's readdir rewrite. There is also a fix for a nasty deadlock which surfaced earlier in this merge window. Highlights include: - Fix an_rpc pipefs regression that causes a deadlock on mount - Readdir optimisations by Scott Mayhew and Jeff Layton - clean up the rpc_pipefs dentry operation setup" * tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Fix a deadlock in rpc_client_register() rpc_pipe: rpc_dir_inode_operations can be static NFS: Allow nfs_updatepage to extend a write under additional circumstances NFS: Make nfs_readdir revalidate less often NFS: Make nfs_attribute_cache_expired() non-static rpc_pipe: set dentry operations at d_alloc time nfs: set verifier on existing dentries in nfs_prime_dcache	2013-07-11 12:11:35 -07:00
Linus Torvalds	19d2f8e0fb	Second round of 9p patches for the 3.11 merge window. Several of these patches were rebased in order to correct style issues. Only stylistic changes were made versus the patches which were in linux-next for two weeks. The rebases have been in linux-next for 3 days and have passed my regressions. The bulk of these are RDMA fixes and improvements. There's also some additions on the extended attributes front to support some additional namespaces and a new option for TCP to force allocation of mount requests from a priviledged port. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: GPGTools - http://gpgtools.org iQIcBAABAgAGBQJR3rWXAAoJEDZk62b0Tg6xabIP/12I+SkQ57wRN03EQy5fqUdX gK/YMHKQ9QuDnZPBvrZ2lypesQNqVU0KINay6VEA86JG1gwzPyUd2MnpQ7F0vV3N XwVD54IoflV/M74xUnrgGWB8YxaPcdacQQ8yazX+mEgOgYGdWmDAl7FHmAkdKAFB gSl25f3PNJX1Rjay0dssNVXrVPXuJY/fZXKnNQZKtRwXffRWKsWHd8FU0Eq7F30A kNQB8tmMSfHBBjP+tzR0My6/kQ09jzHdtZOkH9IgVpNzqrd8tfy0l6tEvFypxqGT 5oQFoxHHL/tUW05V0P3gYany2A7lEhSUifPKS6omqHO+vPlw+pDJw+xWlNq9fnDt 8S8znqVuEHhvqRQW7zFdb9ac2MZi8CHHhC2wGIZ7GYjNG2q5XwE8b/QhdXQeFin7 ibugvoW7+ZdcDewpQW27oO0g7B/8hRt8KC+1lc/8rITKIfGxbNJkGzTDl0F4Co7v IH7Ew5PHPe6ZiuU0QSdU+NBuvk8g8sWGxx04Xvzl3WicwOg7XvN3ivrKB9oN2U1x 50KZRnYpwQQv/9AxyhroYU+Ufje8SF4v++zsq1eMzUcHsC/C73eatw2m764t+X4S 8yMLrgqY1Nzif4nAMi/SDMnB/R1bXeuc8kXD9xT6XD9d2tf6e+zCHhQklVeC0tuK RiVRJqGrfanbKMnWIG0Y =n9rI -----END PGP SIGNATURE----- Merge tag 'for-linus-3.11-merge-window-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull second round of 9p patches from Eric Van Hensbergen: "Several of these patches were rebased in order to correct style issues. Only stylistic changes were made versus the patches which were in linux-next for two weeks. The rebases have been in linux-next for 3 days and have passed my regressions. The bulk of these are RDMA fixes and improvements. There's also some additions on the extended attributes front to support some additional namespaces and a new option for TCP to force allocation of mount requests from a priviledged port" * tag 'for-linus-3.11-merge-window-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: Remove the unused variable "err" in v9fs_vfs_getattr() 9P: Add cancelled() to the transport functions. 9P/RDMA: count posted buffers without a pending request 9P/RDMA: Improve error handling in rdma_request 9P/RDMA: Do not free req->rc in error handling in rdma_request() 9P/RDMA: Use a semaphore to protect the RQ 9P/RDMA: Protect against duplicate replies 9P/RDMA: increase P9_RDMA_MAXSIZE to 1MB 9pnet: refactor struct p9_fcall alloc code 9P/RDMA: rdma_request() needs not allocate req->rc 9P: Fix fcall allocation for rdma fs/9p: xattr: add trusted and security namespaces net/9p: add privport option to 9p tcp transport	2013-07-11 10:21:23 -07:00
Linus Torvalds	746919d266	Code cleanups and improved buffer handling during page crypto operations - Remove redundant code by merging some encrypt and decrypt functions - Get rid of a helper page allocation during page decryption by using in-place decryption - Better use of entire pages during page crypto operations - Several code cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIbBAABCgAGBQJR3Z1JAAoJENaSAD2qAscK+jAP92NR3W17njuPFJBHCROsS481 tyhzy2W/LlNK1njnS6SRP3O3Icv3adiJRtMZePV7bpCH3yD/JoYZ6VHQBNV7msHW VuEwE6eqCkKl3NLunhwd4+m5R9qijJYGfYEzTve/RNASDU7/LPcTUF8OQ5dIB0wA J6IMGGZSpsFa8ymN01YzmUEmOUx1IBR2aYBT8Og4Ke117ywDYqxS0ghd1rb953sS 7H8wnNcijs1DLGe71SZnKMVCYwO32GWarxDBAa0KsabcyY4Yr43O3ov/CsIAAA7B Q1Dn7KNNSOCu9G6fHrnuMOTncGnNPLhIe6Yc0PCZ7ykVstpzlNkKJ628IEonsJaJ 4bYc3bqq4KH7rqMxjA+1GoLehpJWJzqwfiFI1fWLlYMmO2ky126rJUgSNBHQe9+M iWl+ZrYokSsNWBcUsIq7SJFaLIhWDNcb+Wl7RiTNBBwoBaZclrNuWKIyeWPhH+9/ +/K3LBaggujzVpE743wgJhY60sfdHZmaRAD9agEbcG773JePXBg9OkiUp/hKSe8s UaGkfmwAlz8u6mR1eJCuFDCqwJKByyT4vObuOFroh7NgOHaQZghlnO4HwuOjzp6U wTUiMVslFY9WAsEWxdDhaCXxB8IrjHz3YZGIt8PU2eT6ucQU+HLvkkxqtzSYm9/7 BBfryWZKwR7T50JdehI= =mRpN -----END PGP SIGNATURE----- Merge tag 'ecryptfs-3.11-rc1-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull eCryptfs updates from Tyler Hicks: "Code cleanups and improved buffer handling during page crypto operations: - Remove redundant code by merging some encrypt and decrypt functions - Get rid of a helper page allocation during page decryption by using in-place decryption - Better use of entire pages during page crypto operations - Several code cleanups" * tag 'ecryptfs-3.11-rc1-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: Use ecryptfs_dentry_to_lower_path in a couple of places eCryptfs: Make extent and scatterlist crypt function parameters similar eCryptfs: Collapse crypt_page_offset() into crypt_extent() eCryptfs: Merge ecryptfs_encrypt_extent() and ecryptfs_decrypt_extent() eCryptfs: Combine page_offset crypto functions eCryptfs: Combine encrypt_scatterlist() and decrypt_scatterlist() eCryptfs: Decrypt pages in-place eCryptfs: Accept one offset parameter in page offset crypto functions eCryptfs: Simplify lower file offset calculation eCryptfs: Read/write entire page during page IO eCryptfs: Use entire helper page during page crypto operations eCryptfs: Cocci spatch "memdup.spatch"	2013-07-11 10:20:18 -07:00
Linus Torvalds	9db019278c	A couple cleanups to JFS for 3.11 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (GNU/Linux) iQIcBAABAgAGBQJR3XzMAAoJEDaohF61QIxk/6YP/1ehZym/G2sl836FVYqrMhiz E/IWbU64JbZnwc4HSOEOhZiYiRL9SDlLpPX11lKrTP84bzjw1gd1I6J2hnExmkPv 46S6uosl6XZ1APQmK7MXAQPzbGGgfUvsgGXcm7z1hodaOFiC+AJ6D0ytM5fDS8O7 zdqH4rhSVfVbsGYRECn/x9AdSO5sRSxcxS/U/lNBFKLNrMiAdQKGp7rr1HjO5GBs zKFCcq7DJ2XFiK0jm5/tPd87ENSIVeO60B49+JPxcgcJY3AnX7qC7HGEpkrJ2vHN WRxJ71Ut9mKMaykQ4E2RFE6o8O5LlXGcCN80/quAOFyJ0henw2TwrjYnBhXTHcYu V4ug0gU+7rH2eQZmoy6AxUl32lA4+Vu9J0+/64jWOFzpR5xZD50S3r7e1FUapXmv OBXY1fZLT8JcDDMANUI3zVykAONdy4cj6i2gUj2p7v+43CO/LTaa7a3M67XCP/me g+Kl7WBkUjEBTX4JnT8A9ZQ74IGcVkpdLJ+MRx1XmJfJR9wC1h0iSw8+ZUTCDXub Bmm6k2ah24PEZjj+S1aDhdC15RoErjXYyP4Apu2dJTNqz8l7fbWUU96L8MXtqbcq 6Ih3sFf0RSpbYcDFJrrWdoXsgtUxO2K87Efh9ueJU46R2vP0rrK6JX7eDUHVne/h bFQyAGfu74rk0f8YD1ob =s2QZ -----END PGP SIGNATURE----- Merge tag 'jfs-3.11' of git://github.com/kleikamp/linux-shaggy Pull jfs update from Dave Kleikamp: "A couple cleanups to JFS for 3.11" * tag 'jfs-3.11' of git://github.com/kleikamp/linux-shaggy: jfs: Update jfs_error jfs: fix sparse warning in fs/jfs/xattr.c	2013-07-11 10:19:34 -07:00
Linus Torvalds	0ff08ba5d0	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from Bruce Fields: "Changes this time include: - 4.1 enabled on the server by default: the last 4.1-specific issues I know of are fixed, so we're not going to find the rest of the bugs without more exposure. - Experimental support for NFSv4.2 MAC Labeling (to allow running selinux over NFS), from Dave Quigley. - Fixes for some delicate cache/upcall races that could cause rare server hangs; thanks to Neil Brown and Bodo Stroesser for extreme debugging persistence. - Fixes for some bugs found at the recent NFS bakeathon, mostly v4 and v4.1-specific, but also a generic bug handling fragmented rpc calls" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: support minorversion 1 by default nfsd4: allow destroy_session over destroyed session svcrpc: fix failures to handle -1 uid's sunrpc: Don't schedule an upcall on a replaced cache entry. net/sunrpc: xpt_auth_cache should be ignored when expired. sunrpc/cache: ensure items removed from cache do not have pending upcalls. sunrpc/cache: use cache_fresh_unlocked consistently and correctly. sunrpc/cache: remove races with queuing an upcall. nfsd4: return delegation immediately if lease fails nfsd4: do not throw away 4.1 lock state on last unlock nfsd4: delegation-based open reclaims should bypass permissions svcrpc: don't error out on small tcp fragment svcrpc: fix handling of too-short rpc's nfsd4: minor read_buf cleanup nfsd4: fix decoding of compounds across page boundaries nfsd4: clean up nfs4_open_delegation NFSD: Don't give out read delegations on creates nfsd4: allow client to send no cb_sec flavors nfsd4: fail attempts to request gss on the backchannel nfsd4: implement minimal SP4_MACH_CRED ...	2013-07-11 10:17:13 -07:00
Chandra Seetharaman	92f8ff73f1	xfs: Add pquota fields where gquota is used. Add project quota changes to all the places where group quota field is used: * add separate project quota members into various structures * split project quota and group quotas so that instead of overriding the group quota members incore, the new project quota members are used instead * get rid of usage of the OQUOTA flag incore, in favor of separate group and project quota flags. * add a project dquot argument to various functions. Not using the pquotino field from superblock yet. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-07-11 10:35:32 -05:00
Jan Kara	822dbba334	ext4: fix warning in ext4_evict_inode() The following race can lead to ext4_evict_inode() seeing i_ioend_count > 0 and thus triggering a sanity check warning: CPU1 CPU2 ext4_end_bio() ext4_evict_inode() ext4_finish_bio() end_page_writeback(); truncate_inode_pages() evict page WARN_ON(i_ioend_count > 0); ext4_put_io_end_defer() ext4_release_io_end() dec i_ioend_count This is possible use-after-free bug since we decrement i_ioend_count in possibly released inode. Since i_ioend_count is used only for sanity checks one possible solution would be to just remove it but for now I'd like to keep those sanity checks to help debugging the new ext4 writeback code. This patch changes ext4_end_bio() to call ext4_put_io_end_defer() before ext4_finish_bio() in the shortcut case when unwritten extent conversion isn't needed. In that case we don't need the io_end so we are safe to drop it early. Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-07-10 21:31:04 -04:00
Michel Lespinasse	98d1e64f95	mm: remove free_area_cache Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-10 18:11:34 -07:00
Eliezer Tamir	076bb0c82a	net: rename include/net/ll_poll.h to include/net/busy_poll.h Rename the file and correct all the places where it is included. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-10 17:08:27 -07:00
Steve French	1c46943f84	[CIFS] Fix minor endian error in durable handle patch series Fix endian warning: CHECK fs/cifs/smb2pdu.c fs/cifs/smb2pdu.c:1068:40: warning: incorrect type in assignment (different base types) fs/cifs/smb2pdu.c:1068:40: expected restricted __le32 [usertype] Next fs/cifs/smb2pdu.c:1068:40: got unsigned long Signed-off-by: Steve French <smfrench@gmail.com>	2013-07-10 13:08:55 -05:00
Pavel Shilovsky	9cbc0b7339	CIFS: Reconnect durable handles for SMB2 On reconnects, we need to reopen file and then obtain all byte-range locks held by the client. SMB2 protocol provides feature to make this process atomic by reconnecting to the same file handle with all it's byte-range locks. This patch adds this capability for SMB2 shares. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>	2013-07-10 13:08:40 -05:00
Pavel Shilovsky	064f6047a1	CIFS: Make SMB2_open use cifs_open_parms struct to prepare it for further durable handle reconnect processing. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>	2013-07-10 13:08:40 -05:00
Pavel Shilovsky	226730b4d8	CIFS: Introduce cifs_open_parms struct and pass it to the open() call. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>	2013-07-10 13:08:40 -05:00
Pavel Shilovsky	63eb3def32	CIFS: Request durable open for SMB2 opens by passing durable context together with a handle caching lease or batch oplock. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>	2013-07-10 13:08:39 -05:00
Pavel Shilovsky	d22cbfecbd	CIFS: Simplify SMB2 create context handling to make it easier to add other create context further. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>	2013-07-10 13:08:39 -05:00
Pavel Shilovsky	59aa371841	CIFS: Simplify SMB2_open code path by passing a filename to a separate iovec regardless of its length. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>	2013-07-10 13:08:39 -05:00
Pavel Shilovsky	ca81983fe5	CIFS: Respect create_options in smb2_open_file and eliminated unused file_attribute parms of SMB2_open. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>	2013-07-10 13:08:39 -05:00
Pavel Shilovsky	fd55439638	CIFS: Fix lease context buffer parsing to prevent missing RqLs context if it's not the first one. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>	2013-07-10 13:08:39 -05:00
Carlos Maiolino	42c49d7f24	xfs: fix sgid inheritance for subdirectories inheriting default acls [V3] XFS removes sgid bits of subdirectories under a directory containing a default acl. When a default acl is set, it implies xfs to call xfs_setattr_nonsize() in its code path. Such function is shared among mkdir and chmod system calls, and does some checks unneeded by mkdir (calling inode_change_ok()). Such checks remove sgid bit from the inode after it has been granted. With this patch, we extend the meaning of XFS_ATTR_NOACL flag to avoid these checks when acls are being inherited (thanks hch). Also, xfs_setattr_mode, doesn't need to re-check for group id and capabilities permissions, this only implies in another try to remove sgid bit from the directories. Such check is already done either on inode_change_ok() or xfs_setattr_nonsize(). Changelog: V2: Extends the meaning of XFS_ATTR_NOACL instead of wrap the tests into another function V3: Remove S_ISDIR check in xfs_setattr_nonsize() from the patch Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-07-10 10:21:51 -05:00
Matthew Wilcox	cc18ec3c8f	Use ecryptfs_dentry_to_lower_path in a couple of places There are two places in ecryptfs that benefit from using ecryptfs_dentry_to_lower_path() instead of separate calls to ecryptfs_dentry_to_lower() and ecryptfs_dentry_to_lower_mnt(). Both sites use fewer instructions and less stack (determined by examining objdump output). Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>	2013-07-09 23:40:28 -07:00
Linus Torvalds	496322bc91	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "This is a re-do of the net-next pull request for the current merge window. The only difference from the one I made the other day is that this has Eliezer's interface renames and the timeout handling changes made based upon your feedback, as well as a few bug fixes that have trickeled in. Highlights: 1) Low latency device polling, eliminating the cost of interrupt handling and context switches. Allows direct polling of a network device from socket operations, such as recvmsg() and poll(). Currently ixgbe, mlx4, and bnx2x support this feature. Full high level description, performance numbers, and design in commit `0a4db187a9` ("Merge branch 'll_poll'") From Eliezer Tamir. 2) With the routing cache removed, ip_check_mc_rcu() gets exercised more than ever before in the case where we have lots of multicast addresses. Use a hash table instead of a simple linked list, from Eric Dumazet. 3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski, Marek Puzyniak, Michal Kazior, and Sujith Manoharan. 4) Support reporting the TUN device persist flag to userspace, from Pavel Emelyanov. 5) Allow controlling network device VF link state using netlink, from Rony Efraim. 6) Support GRE tunneling in openvswitch, from Pravin B Shelar. 7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from Daniel Borkmann and Eric Dumazet. 8) Allow controlling of TCP quickack behavior on a per-route basis, from Cong Wang. 9) Several bug fixes and improvements to vxlan from Stephen Hemminger, Pravin B Shelar, and Mike Rapoport. In particular, support receiving on multiple UDP ports. 10) Major cleanups, particular in the area of debugging and cookie lifetime handline, to the SCTP protocol code. From Daniel Borkmann. 11) Allow packets to cross network namespaces when traversing tunnel devices. From Nicolas Dichtel. 12) Allow monitoring netlink traffic via AF_PACKET sockets, in a manner akin to how we monitor real network traffic via ptype_all. From Daniel Borkmann. 13) Several bug fixes and improvements for the new alx device driver, from Johannes Berg. 14) Fix scalability issues in the netem packet scheduler's time queue, by using an rbtree. From Eric Dumazet. 15) Several bug fixes in TCP loss recovery handling, from Yuchung Cheng. 16) Add support for GSO segmentation of MPLS packets, from Simon Horman. 17) Make network notifiers have a real data type for the opaque pointer that's passed into them. Use this to properly handle network device flag changes in arp_netdev_event(). From Jiri Pirko and Timo Teräs. 18) Convert several drivers over to module_pci_driver(), from Peter Huewe. 19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a O(1) calculation instead. From Eric Dumazet. 20) Support setting of explicit tunnel peer addresses in ipv6, just like ipv4. From Nicolas Dichtel. 21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet. 22) Prevent a single high rate flow from overruning an individual cpu during RX packet processing via selective flow shedding. From Willem de Bruijn. 23) Don't use spinlocks in TCP md5 signing fast paths, from Eric Dumazet. 24) Don't just drop GSO packets which are above the TBF scheduler's burst limit, chop them up so they are in-bounds instead. Also from Eric Dumazet. 25) VLAN offloads are missed when configured on top of a bridge, fix from Vlad Yasevich. 26) Support IPV6 in ping sockets. From Lorenzo Colitti. 27) Receive flow steering targets should be updated at poll() time too, from David Majnemer. 28) Fix several corner case regressions in PMTU/redirect handling due to the routing cache removal, from Timo Teräs. 29) We have to be mindful of ipv4 mapped ipv6 sockets in upd_v6_push_pending_frames(). From Hannes Frederic Sowa. 30) Fix L2TP sequence number handling bugs, from James Chapman." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits) drivers/net: caif: fix wrong rtnl_is_locked() usage drivers/net: enic: release rtnl_lock on error-path vhost-net: fix use-after-free in vhost_net_flush net: mv643xx_eth: do not use port number as platform device id net: sctp: confirm route during forward progress virtio_net: fix race in RX VQ processing virtio: support unlocked queue poll net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit Documentation: Fix references to defunct linux-net@vger.kernel.org net/fs: change busy poll time accounting net: rename low latency sockets functions to busy poll bridge: fix some kernel warning in multicast timer sfc: Fix memory leak when discarding scattered packets sit: fix tunnel update via netlink dt:net:stmmac: Add dt specific phy reset callback support. dt:net:stmmac: Add support to dwmac version 3.610 and 3.710 dt:net:stmmac: Allocate platform data only if its NULL. net:stmmac: fix memleak in the open method ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available net: ipv6: fix wrong ping_v6_sendmsg return value ...	2013-07-09 18:24:39 -07:00
Scott Mayhew	c7559663e4	NFS: Allow nfs_updatepage to extend a write under additional circumstances Currently nfs_updatepage allows a write to be extended to cover a full page only if we don't have a byte range lock lock on the file... but if we have a write delegation on the file or if we have the whole file locked for writing then we should be allowed to extend the write as well. Signed-off-by: Scott Mayhew <smayhew@redhat.com> [Trond: fix up call to nfs_have_delegation()] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-09 19:32:50 -04:00
Dave Chinner	b0a9dab78a	xfs: dquot log reservations are too small During review of the separate project quota inode patches, it became obvious that the dquot log reservation calculation underestimated the number dquots that can be modified in a transaction. This has it's roots way back in the Irix quota implementation. That is, when quotas were first implemented in XFS, it only supported user and project quotas as Irix did not have group quotas. Hence the worst case operation involving dquot modification was calculated to involve 2 user dquots and 1 project dquot or 1 user dequot and 2 project dquots. i.e. 3 dquots. This was determined back in 1996, and has remained unchanged ever since. However, back in 2001, the Linux XFS port dropped all support for project quota and implmented group quotas over the top. This was effectively done with a search-and-replace of project with group, and as such the log reservation was not changed. However, with the advent of group quotas, chmod and rename now could modify more than 3 dquots in a single transaction - both could modify 4 dquots. Hence this log reservation has been wrong for a long time. In 2005, project quota support was reintroduced into Linux, but it was implemented to be mutually exclusive to group quotas and so this didn't add any new changes to the dquot log reservation. Hence when project quotas were in use (rather than group quotas) the log reservation was again valid, just like in the Irix days. Now, with the addition of the separate project quota inode, group and project quotas are no longer mutually exclusive, and hence operations can now modify three dquots per inode where previously it was only two. The worst case here is the rename transaction, which can allocate/free space on two different directory inodes, and if they have different uid/gid/prid configurations and are world writeable, then rename can actually modify 6 different dquots now. Further, the dquot log reservation doesn't take into account the space used by the dquot log format structure that precedes the dquot that is logged, and hence further underestimates the worst case log space required by dquots during a transaction. This has been missing since the first commit in 1996. Hence the worst case log reservation needs to be increased from 3 to 6, and it needs to take into account a log format header for each of those dquots. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-07-09 16:43:16 -05:00
Dave Chinner	f3508bcddf	xfs: remove local fork format handling from xfs_bmapi_write() The conversion from local format to extent format requires interpretation of the data in the fork being converted, so it cannot be done in a generic way. It is up to the caller to convert the fork format to extent format before calling into xfs_bmapi_write() so format conversion can be done correctly. The code in xfs_bmapi_write() to convert the format is used implicitly by the attribute and directory code, but they specifically zero the fork size so that the conversion does not do any allocation or manipulation. Move this conversion into the shortform to leaf functions for the dir/attr code so the conversions are explicitly controlled by all callers. Now we can remove the conversion code in xfs_bmapi_write. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-07-09 16:40:22 -05:00
Scott Mayhew	07b5ce8ef2	NFS: Make nfs_readdir revalidate less often Make nfs_readdir revalidate only when we're at the beginning of the directory or if the cached attributes have expired. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-09 17:17:07 -04:00
Scott Mayhew	43f291cd07	NFS: Make nfs_attribute_cache_expired() non-static NFS: Make nfs_attribute_cache_expired() non-static so we can call it from nfs_readdir(). Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-09 17:17:07 -04:00
Jeff Layton	cda57a1ef6	nfs: set verifier on existing dentries in nfs_prime_dcache nfs_prime_dcache currently only sets the verifier when it doesn't initially a matching dentry in the dcache. Set the verifier in the case where we do find a dentry in the dcache. This ensures that we don't have to look up the dentry again if we want to use it after a readdir. Cc: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-09 17:16:39 -04:00
Yann Droneaud	862a62937e	xfs: use get_unused_fd_flags(0) instead of get_unused_fd() Macro get_unused_fd() is used to allocate a file descriptor with default flags. Those default flags (0) can be "unsafe": O_CLOEXEC must be used by default to not leak file descriptor across exec(). Instead of macro get_unused_fd(), functions anon_inode_getfd() or get_unused_fd_flags() should be used with flags given by userspace. If not possible, flags should be set to O_CLOEXEC to provide userspace with a default safe behavor. In a further patch, get_unused_fd() will be removed so that new code start using anon_inode_getfd() or get_unused_fd_flags() with correct flags. This patch replaces calls to get_unused_fd() with equivalent call to get_unused_fd_flags(0) to preserve current behavor for existing code. The hard coded flag value (0) should be reviewed on a per-subsystem basis, and, if possible, set to O_CLOEXEC. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-07-09 15:53:57 -05:00
Jie Liu	9cee4c5b7b	xfs: clean up unused codes at xfs_bulkstat() There are some unused codes at xfs_bulkstat(): - Variable bp is defined to point to the on-disk inode cluster buffer, but it proved to be of no practical help. - We process the chunks of good inodes which were fetched by iterating btree records from an AG. When processing inodes from each chunk, the code recomputing agbno if run into the first inode of a cluster, however, the agbno is not being used thereafter. This patch tries to clean up those things. Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-07-09 15:36:21 -05:00
Linus Torvalds	a82a729f04	Merge branch 'akpm' (updates from Andrew Morton) Merge second patch-bomb from Andrew Morton: - misc fixes - audit stuff - fanotify/inotify/dnotify things - most of the rest of MM. The new cache shrinker code from Glauber and Dave Chinner probably isn't quite stabilized yet. - ptrace - ipc - partitions - reboot cleanups - add LZ4 decompressor, use it for kernel compression * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) lib/scatterlist: error handling in __sg_alloc_table() scsi_debug: fix do_device_access() with wrap around range crypto: talitos: use sg_pcopy_to_buffer() lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer() lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next() crypto: add lz4 Cryptographic API lib: add lz4 compressor module arm: add support for LZ4-compressed kernel lib: add support for LZ4-compressed kernel decompressor: add LZ4 decompressor module lib: add weak clz/ctz functions reboot: move arch/x86 reboot= handling to generic kernel reboot: arm: change reboot_mode to use enum reboot_mode reboot: arm: prepare reboot_mode for moving to generic kernel code reboot: arm: remove unused restart_mode fields from some arm subarchs reboot: unicore32: prepare reboot_mode for moving to generic kernel code reboot: x86: prepare reboot_mode for moving to generic kernel code reboot: checkpatch.pl the new kernel/reboot.c file reboot: move shutdown/reboot related functions to kernel/reboot.c reboot: remove -stable friendly PF_THREAD_BOUND define ...	2013-07-09 13:33:36 -07:00
Linus Torvalds	9a5889ae1c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There is some follow-on RBD cleanup after the last window's code drop, a series from Yan fixing multi-mds behavior in cephfs, and then a sprinkling of bug fixes all around. Some warnings, sleeping while atomic, a null dereference, and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits) libceph: fix invalid unsigned->signed conversion for timespec encoding libceph: call r_unsafe_callback when unsafe reply is received ceph: fix race between cap issue and revoke ceph: fix cap revoke race ceph: fix pending vmtruncate race ceph: avoid accessing invalid memory libceph: Fix NULL pointer dereference in auth client code ceph: Reconstruct the func ceph_reserve_caps. ceph: Free mdsc if alloc mdsc->mdsmap failed. ceph: remove sb_start/end_write in ceph_aio_write. ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL. ceph: fix sleeping function called from invalid context. ceph: move inode to proper flushing list when auth MDS changes rbd: fix a couple warnings ceph: clear migrate seq when MDS restarts ceph: check migrate seq before changing auth cap ceph: fix race between page writeback and truncate ceph: reset iov_len when discarding cap release messages ceph: fix cap release race libceph: fix truncate size calculation ...	2013-07-09 12:39:10 -07:00
Linus Torvalds	e3a0dd98e1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs update from Chris Mason: "These are the usual mixture of bugs, cleanups and performance fixes. Miao has some really nice tuning of our crc code as well as our transaction commits. Josef is peeling off more and more problems related to early enospc, and has a number of important bug fixes in here too" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (81 commits) Btrfs: wait ordered range before doing direct io Btrfs: only do the tree_mod_log_free_eb if this is our last ref Btrfs: hold the tree mod lock in __tree_mod_log_rewind Btrfs: make backref walking code handle skinny metadata Btrfs: fix crash regarding to ulist_add_merge Btrfs: fix several potential problems in copy_nocow_pages_for_inode Btrfs: cleanup the code of copy_nocow_pages_for_inode() Btrfs: fix oops when recovering the file data by scrub function Btrfs: make the chunk allocator completely tree lockless Btrfs: cleanup orphaned root orphan item Btrfs: fix wrong mirror number tuning Btrfs: cleanup redundant code in btrfs_submit_direct() Btrfs: remove btrfs_sector_sum structure Btrfs: check if we can nocow if we don't have data space Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc Btrfs: use a percpu to keep track of possibly pinned bytes Btrfs: check for actual acls rather than just xattrs when caching no acl Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate Btrfs: optimize reada_for_balance Btrfs: optimize read_block_for_search ...	2013-07-09 12:33:09 -07:00
Eliezer Tamir	76b1e9b981	net/fs: change busy poll time accounting Suggested by Linus: Changed time accounting for busy-poll: - Make it microsecond based. - Use unsigned longs. - Revert back to use time_after instead of time_in_range. Reorder poll/select busy loop conditions: - Clear busy_flag after one time we can't busy-poll. - Only init busy_end if we actually are going to busy-poll. Added one more missing need_resched() test. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-09 12:33:04 -07:00
Linus Torvalds	da89bd213f	xfs: update for 3.11-rc1 - part of the work to allow project quotas and group quotas to be used together - inode change count - inode create transaction - block queue plugging in buffer readahead and bulkstat - ordered log vector support - removal of dead code in and around xfs_sync_inode_grab, xfs_ialloc_get_rec, XFS_MOUNT_RETERR, XFS_ALLOCFREE_LOG_RES, XFS_DIROP_LOG_RES, xfs_chash, ctl_table, and xfs_growfs_data_private - don't keep silent if sunit/swidth can not be changed via mount - fix a leak of remote symlink blocks into the filesystem when xattrs are used on symlinks - fix for fiemap to return FIEMAP_EXTENT_UNKOWN flag on delay extents - part of a fix for xfs_fsr - disable speculative preallocation with small files - performance improvements for inode creates and deletes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJR3F9pAAoJENaLyazVq6ZOiKAP/jyfPbVj5AOiLVtTLhJUQ3Qf urCiMjl87BToixFxa/yxeOBrUbBiOgUQ/2om4b5cryYhN2RtQWiEi/iVdeBZw3rR 1J6VxH09R25GVRobIj2AwJ87eXyqKvi2DaVBrQbgva0BH8fmWKhQISfwLTPUIXcK GTrVpS1amTWxh69/5p5bxaxo4u2y7DbZwYC4xQOcPeSrfxizMQmN2JqtUjtLDyKp J08Md04vNcxJvfbFopcSFAncStIr6xnKMiaqvFttybLJf9HL9MqoaO9Xp+6aX1QI Pibae3oFctH1tGOmjgyg30AbPjzAHaGDSw9vuRaommQZbMwiZXAV2VGuJ5QWtAbi wXh5GIzaap1Z0EYuD9qY0hsFizOpmL+1YH+F20qIqa3i5CiDMeYMWlmzhfN63zH9 Zk2j1YUkbxLmKMEgl8s++eLfT1yenuzVAp5zUodApOPp161FOMJlVhhD55WDvKtP 2/3Ig25fz5CLwcIZT1MoZ9B7UP9dfrAAK30AcUmO3Iumj6b8bsxh4pzSpJ02U1cG cMIe+OhxC6KqDZlXeLmDXodnOo9Wc4glwdivcynxCNFv6WS5Ez/ufm+4oN+OVUo7 PwyNtoJfsptSYZyIRorXNez54ww3cOvvqjifUasTZTGkTSqCbf1LclzuMxcXDlyO 3YzE/DCK8ZWwJc/ysbn/ =v6KW -----END PGP SIGNATURE----- Merge tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs Pull xfs update from Ben Myers: "This includes several bugfixes, part of the work for project quotas and group quotas to be used together, performance improvements for inode creation/deletion, buffer readahead, and bulkstat, implementation of the inode change count, an inode create transaction, and the removal of a bunch of dead code. There are also some duplicate commits that you already have from the 3.10-rc series. - part of the work to allow project quotas and group quotas to be used together - inode change count - inode create transaction - block queue plugging in buffer readahead and bulkstat - ordered log vector support - removal of dead code in and around xfs_sync_inode_grab, xfs_ialloc_get_rec, XFS_MOUNT_RETERR, XFS_ALLOCFREE_LOG_RES, XFS_DIROP_LOG_RES, xfs_chash, ctl_table, and xfs_growfs_data_private - don't keep silent if sunit/swidth can not be changed via mount - fix a leak of remote symlink blocks into the filesystem when xattrs are used on symlinks - fix for fiemap to return FIEMAP_EXTENT_UNKOWN flag on delay extents - part of a fix for xfs_fsr - disable speculative preallocation with small files - performance improvements for inode creates and deletes" * tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs: (61 commits) xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD xfs: Change xfs_dquot_acct to be a 2-dimensional array xfs: Code cleanup and removal of some typedef usage xfs: Replace macro XFS_DQ_TO_QIP with a function xfs: Replace macro XFS_DQUOT_TREE with a function xfs: Define a new function xfs_is_quota_inode() xfs: implement inode change count xfs: Use inode create transaction xfs: Inode create item recovery xfs: Inode create transaction reservations xfs: Inode create log items xfs: Introduce an ordered buffer item xfs: Introduce ordered log vector support xfs: xfs_ifree doesn't need to modify the inode buffer xfs: don't do IO when creating an new inode xfs: don't use speculative prealloc for small files xfs: plug directory buffer readahead xfs: add pluging for bulkstat readahead xfs: Remove dead function prototype xfs_sync_inode_grab() xfs: Remove the left function variable from xfs_ialloc_get_rec() ...	2013-07-09 12:29:12 -07:00
Linus Torvalds	be0c5d8c0b	NFS client updates for Linux 3.11 Feature highlights include: - Add basic client support for NFSv4.2 - Add basic client support for Labeled NFS (selinux for NFSv4.2) - Fix the use of credentials in NFSv4.1 stateful operations, and add support for NFSv4.1 state protection. Bugfix highlights: - Fix another NFSv4 open state recovery race - Fix an NFSv4.1 back channel session regression - Various rpc_pipefs races - Fix another issue with NFSv3 auth negotiation -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJR2vsSAAoJEGcL54qWCgDyWBIP/AqlpBBAblxbNQ1Bl/0m1Pdb iKH961qgM4U1BzK0svGtHTZqkovpm4o/VbkbKBT5mQ4g6SbbsJ/AsS1plCyfnIZi bdnKNJyj6zg0NsAkJ3vKWqd4BTaP+icdSfEIlRKQxAPESewN7b5B3OWgY4KdYmnk q5BP25anC1ryxVycSY67ux8S2IKXVSRZeCZv+RO21rvZ2G0bV5y7t8Om28ztxEnU RKrHgQHgaaktR7i8QVO0sbiWq3iqLa3GPkUvFLwWGr8PQJtTkYY0QwYSrsV3N4rY hYpMRUZFHpZ8UG5YvBT6xyOy/XaGwMGKSfZjB9/YG4QVju+tTy50U1JbTil5PEWY GHWYF68aurIeUkXrhSv8AVnOnhir0mISx5ou/SV7p0QoAZ92V6kq+LkPrW520qlc z8ILh3j28pN3ZUCIEArcaZhYCt48uO2hwBi5TqevQyyGRsXFGbN1moD5jvHkllft Fi0XGuCBdvhrzFRZcsEl+PDq7fT8lXUK2BHe8oR5jz9PhUp+jpEl9m/eg3RsjJjN DuxsHye2U4chScdnRtLBQvpFtdINvWX/Gy8Bi7kdE5tsQySvOa+rdwuBc7h88PHC +4xI2iX3z4O1+GpsAe/T9+pjW689jEilS+eVDRVEGl6yHGn9q8PYOayjPjwbJHxS R2mLTRhKu1DKguTzO13f =wGjn -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Feature highlights include: - Add basic client support for NFSv4.2 - Add basic client support for Labeled NFS (selinux for NFSv4.2) - Fix the use of credentials in NFSv4.1 stateful operations, and add support for NFSv4.1 state protection. Bugfix highlights: - Fix another NFSv4 open state recovery race - Fix an NFSv4.1 back channel session regression - Various rpc_pipefs races - Fix another issue with NFSv3 auth negotiation Please note that Labeled NFS does require some additional support from the security subsystem. The relevant changesets have all been reviewed and acked by James Morris." * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits) NFS: Set NFS_CS_MIGRATION for NFSv4 mounts NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs nfs: have NFSv3 try server-specified auth flavors in turn nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it nfs: move server_authlist into nfs_try_mount_request nfs: refactor "need_mount" code out of nfs_try_mount SUNRPC: PipeFS MOUNT notification optimization for dying clients SUNRPC: split client creation routine into setup and registration SUNRPC: fix races on PipeFS UMOUNT notifications SUNRPC: fix races on PipeFS MOUNT notifications NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize NFS: Improve legacy idmapping fallback NFSv4.1 end back channel session draining NFS: Apply v4.1 capabilities to v4.2 NFSv4.1: Clean up layout segment comparison helper names NFSv4.1: layout segment comparison helpers should take 'const' parameters NFSv4: Move the DNS resolver into the NFSv4 module rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set ...	2013-07-09 12:09:43 -07:00
Linus Torvalds	1f792dd176	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext3 fix and quota cleanup from Jan Kara: "A fix of ext3 error reporting from fsync and a quota cleanup" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Convert use of typedef ctl_table to struct ctl_table ext3: Fix fsync error handling after filesystem abort.	2013-07-09 12:08:43 -07:00
Linus Torvalds	c75e24752c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull third set of VFS updates from Al Viro: "Misc stuff all over the place. There will be one more pile in a couple of days" This is an "evil merge" that also uses the new d_count helper in fs/configfs/dir.c, missed by commit `84d08fa888` ("helper for reading ->d_count") * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ncpfs: fix error return code in ncp_parse_options() locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock seq_file: add seq_list_*_percpu helpers f2fs: fix readdir incorrectness mode_t whack-a-mole... lustre: kill the pointless wrapper helper for reading ->d_count	2013-07-09 11:26:44 -07:00
Eric Sandeen	a69c7c0772	xfs: use XFS_BMAP_BMDR_SPACE vs. XFS_BROOT_SIZE_ADJ XFS_BROOT_SIZE_ADJ is an undocumented macro which accounts for the difference in size between the on-disk and in-core btree root. It's much clearer to just use the newly-added XFS_BMAP_BMDR_SPACE macro which gives us the on-disk size directly. In one case, we must test that the if_broot exists before applying the macro, however. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-07-09 13:21:15 -05:00
Mike Lockwood	6e5b93ee55	fatfs: add FAT_IOCTL_GET_VOLUME_ID This patch, originally from Android kernel, adds vfat ioctl command FAT_IOCTL_GET_VOLUME_ID, with this command we can get the vfat volume ID using following code: ioctl(fd, FAT_IOCTL_GET_VOLUME_ID, &volume_ID) This patch is a modified version of the patch by Mike Lockwood, with changes from Dmitry Pervushin, who noticed the original patch makes some volume IDs abiguous with error returns: for example, if volume id is 0xFFFFFDAD, that matches -ENOIOCTLCMD, we get "FFFFFFFF" from the user space. So add a parameter to ioctl to get the correct volume ID. Android uses vfat volume ID to identify different sd card, when a new sd card is inserted to device, android can scan the media on it and pop up new contents. Signed-off-by: Bintian Wang <bintian.wang@linaro.org> Cc: dmitry pervushin <dpervushin@gmail.com> Cc: Mike Lockwood <lockwood@android.com> Cc: Colin Cross <ccross@android.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: John Stultz <john.stultz@linaro.org> Cc: Sean McNeil <sean@mcneil.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:25 -07:00
Wei Yongjun	2417898b34	ncpfs: fix error return code in ncp_parse_options() Fix to return -EINVAL from the option parse error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:25 -07:00
Wanpeng Li	25d130ba22	mm/writeback: don't check force_wait to handle bdi->work_list After commit `839a8e8660` ("writeback: replace custom worker pool implementation with unbound workqueue"), bdi_writeback_workfn runs off bdi_writeback->dwork, on each execution, it processes bdi->work_list and reschedules if there are more things to do instead of flush any work that race with us existing. It is unecessary to check force_wait in wb_do_writeback since it is always 0 after the mentioned commit. This patch remove the force_wait in wb_do_writeback. Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Reviewed-by: Tejun Heo <tj@kernel.org> Reviewed-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:22 -07:00
Haicheng Li	1205784100	fs/fs-writeback.c: : make wb_do_writeback() as static It's not used globally and could be static. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:22 -07:00
Lino Sanfilippo	9756b9187e	fsnotify: update comments concerning locking scheme There have been changes in the locking scheme of fsnotify but the comments in the source code have not been updated yet. This patch corrects this. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:20 -07:00
Lino Sanfilippo	e1e5a9f84e	inotify: fix race when adding a new watch In inotify_new_watch() the number of watches for a group is compared against the max number of allowed watches and increased afterwards. The check and incrementation is not done atomically, so it is possible for multiple concurrent threads to pass the check and increment the number of marks above the allowed max. This patch uses an inotify groups mark_lock to ensure that both check and incrementation are done atomic. Furthermore we dont have to worry about the race that allows a concurrent thread to add a watch just after inotify_update_existing_watch() returned with -ENOENT anymore, since this is also synchronized by the groups mark mutex now. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:20 -07:00
Lino Sanfilippo	52f8572980	dnotify: replace dnotify_mark_mutex with mark mutex of dnotify_group There is no need to use a special mutex to protect against the fcntl/close race (see dnotify.c for a description of this race). Instead the dnotify_groups mark mutex can be used. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:20 -07:00
Lino Sanfilippo	5e9c070ca0	fanotify: put duplicate code for adding vfsmount/inode marks into an own function The code under the groups mark_mutex in fanotify_add_inode_mark() and fanotify_add_vfsmount_mark() is almost identical. So put it into a seperate function. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:20 -07:00
Lino Sanfilippo	7b18527c4a	fanotify: fix races when adding/removing marks For both adding an event to an existing mark and destroying a mark we first have to find it via fsnotify_find_[inode\|vfsmount]_mark(). But getting the mark and adding an event (or destroying it) is not done atomically. This opens a race where a thread is about to destroy a mark while another thread still finds the same mark and adds an event to its mask although it will be destroyed. Another race exists concerning the excess of a groups number of marks limit: When a mark is added the number of group marks is checked against the max number of marks per group and increased afterwards. Since check and increment is also not done atomically, this may result in 2 or more processes passing the check at the same time and increasing the number of group marks above the allowed limit. With this patch both races are avoided by doing the concerning operations with the groups mark mutex locked. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:19 -07:00
Dan Carpenter	de1e0c40ac	fanotify: info leak in copy_event_to_user() The ->reserved field isn't cleared so we leak one byte of stack information to userspace. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:19 -07:00
Eliezer Tamir	cbf55001b2	net: rename low latency sockets functions to busy poll Rename functions in include/net/ll_poll.h to busy wait. Clarify documentation about expected power use increase. Rename POLL_LL to POLL_BUSY_LOOP. Add need_resched() testing to poll/select busy loops. Note, that in select and poll can_busy_poll is dynamic and is updated continuously to reflect the existence of supported sockets with valid queue information. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-08 19:25:45 -07:00
J. Bruce Fields	d109148111	nfsd4: support minorversion 1 by default We now have minimal minorversion 1 support; turn it on by default. This can still be turned off with "echo -4.1 >/proc/fs/nfsd/versions". Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-08 19:48:10 -04:00
J. Bruce Fields	f0f51f5cdd	nfsd4: allow destroy_session over destroyed session RFC 5661 allows a client to destroy a session using a compound associated with the destroyed session, as long as the DESTROY_SESSION op is the last op of the compound. We attempt to allow this, but testing against a Solaris client (which does destroy sessions in this way) showed that we were failing the DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference count on the session (held by us) represented another rpc in progress over this session. Fix this by noting that in this case the expected reference count is 1, not 0. Also, note as long as the session holds a reference to the compound we're destroying, we can't free it here--instead, delay the free till the final put in nfs4svc_encode_compoundres. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-08 19:46:38 -04:00
Wei Yongjun	4fbeb19d53	ncpfs: fix error return code in ncp_parse_options() Fix to return -EINVAL from the option parse error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-08 13:36:43 +04:00
Jeff Layton	7012b02a2b	locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock The file_lock_list is only used for /proc/locks. The vastly common case is for locks to be put onto the list and come off again, without ever being traversed. Help optimize for this use-case by moving to percpu hlist_head-s. At the same time, we can make the locking less contentious by moving to an lglock. When iterating over the lists for /proc/locks, we must take the global lock and then iterate over each CPU's list in turn. This change necessitates a new fl_link_cpu field to keep track of which CPU the entry is on. On x86_64 at least, this field is placed within an existing hole in the struct to avoid growing the size. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-08 13:36:42 +04:00
Jeff Layton	0bc77381c1	seq_file: add seq_list__percpu helpers When we convert the file_lock_list to a set of percpu lists, we'll need a way to iterate over them in order to output /proc/locks info. Add some seq_list__percpu helpers to handle that. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-08 13:36:41 +04:00
Jaegeuk Kim	99b072bb38	f2fs: fix readdir incorrectness In the previous Al Viro's readdir patch set, there occurs a bug when running xfstest: 006 as follows. [Error output] alpha size = 4, name length = 6, total files = 4096, nproc=1 1023 files created rm: cannot remove `/mnt/f2fs/permname.15150/a': Directory not empty [Correct output] alpha size = 4, name length = 6, total files = 4096, nproc=1 4097 files created This bug is due to the misupdate of directory position in ctx. So, this patch fixes this. [AV: fixed a braino] CC: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-08 13:35:48 +04:00
Gu Zheng	f2692ea8d5	fs/9p: Remove the unused variable "err" in v9fs_vfs_getattr() Delete the unused variable "err" in v9fs_vfs_getattr() Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:18:31 -05:00
Jim Garlick	d9a738597f	fs/9p: xattr: add trusted and security namespaces Allow requests for security.* and trusted.* xattr name spaces to pass through to server. The new files are 99% cut and paste from fs/9p/xattr_user.c with the namespaces changed. It has the intended effect in superficial testing. I do not know much detail about how these namespaces are used, but passing them through to the server, which can decide whether to handle them or not, seems reasonable. I want to support a use case where an ext4 file system is mounted via 9P, then re-exported via samba to windows clients in a cluster. Windows wants to store xattrs such as security.NTACL. This works when ext4 directly backs samba, but not when 9P is inserted. This use case is documented here: http://code.google.com/p/diod/issues/detail?id=95 Signed-off-by: Jim Garlick <garlick@llnl.gov> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:18 -05:00
Linus Torvalds	21884a83b2	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: "The timer changes contain: - posix timer code consolidation and fixes for odd corner cases - sched_clock implementation moved from ARM to core code to avoid duplication by other architectures - alarm timer updates - clocksource and clockevents unregistration facilities - clocksource/events support for new hardware - precise nanoseconds RTC readout (Xen feature) - generic support for Xen suspend/resume oddities - the usual lot of fixes and cleanups all over the place The parts which touch other areas (ARM/XEN) have been coordinated with the relevant maintainers. Though this results in an handful of trivial to solve merge conflicts, which we preferred over nasty cross tree merge dependencies. The patches which have been committed in the last few days are bug fixes plus the posix timer lot. The latter was in akpms queue and next for quite some time; they just got forgotten and Frederic collected them last minute." * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (59 commits) hrtimer: Remove unused variable hrtimers: Move SMP function call to thread context clocksource: Reselect clocksource when watchdog validated high-res capability posix-cpu-timers: don't account cpu timer after stopped thread runtime accounting posix_timers: fix racy timer delta caching on task exit posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule() selftests: add basic posix timers selftests posix_cpu_timers: consolidate expired timers check posix_cpu_timers: consolidate timer list cleanups posix_cpu_timer: consolidate expiry time type tick: Sanitize broadcast control logic tick: Prevent uncontrolled switch to oneshot mode tick: Make oneshot broadcast robust vs. CPU offlining x86: xen: Sync the CMOS RTC as well as the Xen wallclock x86: xen: Sync the wallclock when the system time is set timekeeping: Indicate that clock was set in the pvclock gtod notifier timekeeping: Pass flags instead of multiple bools to timekeeping_update() xen: Remove clock_was_set() call in the resume path hrtimers: Support resuming with two or more CPUs online (but stopped) timer: Fix jiffies wrap behavior of round_jiffies_common() ...	2013-07-06 14:09:38 -07:00
Theodore Ts'o	960fd856fd	ext4: fix ext4_get_group_number() The function ext4_get_group_number() was introduced as an optimization in commit `bd86298e60`. Unfortunately, this commit incorrectly calculate the group number for file systems with a 1k block size (when s_first_data_block is 1 instead of zero). This could cause the following kernel BUG: [ 568.877799] ------------[ cut here ]------------ [ 568.877833] kernel BUG at fs/ext4/mballoc.c:3728! [ 568.877840] Oops: Exception in kernel mode, sig: 5 [#1] [ 568.877845] SMP NR_CPUS=32 NUMA pSeries [ 568.877852] Modules linked in: binfmt_misc [ 568.877861] CPU: 1 PID: 3516 Comm: fs_mark Not tainted 3.10.0-03216-g7c6809f-dirty #1 [ 568.877867] task: c0000001fb0b8000 ti: c0000001fa954000 task.ti: c0000001fa954000 [ 568.877873] NIP: c0000000002f42a4 LR: c0000000002f4274 CTR: c000000000317ef8 [ 568.877879] REGS: c0000001fa956ed0 TRAP: 0700 Not tainted (3.10.0-03216-g7c6809f-dirty) [ 568.877884] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 24000428 XER: 00000000 [ 568.877902] SOFTE: 1 [ 568.877905] CFAR: c0000000002b5464 [ 568.877908] GPR00: 0000000000000001 c0000001fa957150 c000000000c6a408 c0000001fb588000 GPR04: 0000000000003fff c0000001fa9571c0 c0000001fa9571c4 000138098c50625f GPR08: 1301200000000000 0000000000000002 0000000000000001 0000000000000000 GPR12: 0000000024000422 c00000000f33a300 0000000000008000 c0000001fa9577f0 GPR16: c0000001fb7d0100 c000000000c29190 c0000000007f46e8 c000000000a14672 GPR20: 0000000000000001 0000000000000008 ffffffffffffffff 0000000000000000 GPR24: 0000000000000100 c0000001fa957278 c0000001fdb2bc78 c0000001fa957288 GPR28: 0000000000100100 c0000001fa957288 c0000001fb588000 c0000001fdb2bd10 [ 568.877993] NIP [c0000000002f42a4] .ext4_mb_release_group_pa+0xec/0x1c0 [ 568.877999] LR [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 [ 568.878004] Call Trace: [ 568.878008] [c0000001fa957150] [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 (unreliable) [ 568.878017] [c0000001fa957200] [c0000000002fb070] .ext4_mb_discard_lg_preallocations+0x394/0x444 [ 568.878025] [c0000001fa957340] [c0000000002fb45c] .ext4_mb_release_context+0x33c/0x734 [ 568.878032] [c0000001fa957440] [c0000000002fbcf8] .ext4_mb_new_blocks+0x4a4/0x5f4 [ 568.878039] [c0000001fa957510] [c0000000002ef56c] .ext4_ext_map_blocks+0xc28/0x1178 [ 568.878047] [c0000001fa957640] [c0000000002c1a94] .ext4_map_blocks+0x2c8/0x490 [ 568.878054] [c0000001fa957730] [c0000000002c536c] .ext4_writepages+0x738/0xc60 [ 568.878062] [c0000001fa957950] [c000000000168a78] .do_writepages+0x5c/0x80 [ 568.878069] [c0000001fa9579d0] [c00000000015d1c4] .__filemap_fdatawrite_range+0x88/0xb0 [ 568.878078] [c0000001fa957aa0] [c00000000015d23c] .filemap_write_and_wait_range+0x50/0xfc [ 568.878085] [c0000001fa957b30] [c0000000002b8edc] .ext4_sync_file+0x220/0x3c4 [ 568.878092] [c0000001fa957be0] [c0000000001f849c] .vfs_fsync_range+0x64/0x80 [ 568.878098] [c0000001fa957c70] [c0000000001f84f0] .vfs_fsync+0x38/0x4c [ 568.878105] [c0000001fa957d00] [c0000000001f87f4] .do_fsync+0x54/0x90 [ 568.878111] [c0000001fa957db0] [c0000000001f8894] .SyS_fsync+0x28/0x3c [ 568.878120] [c0000001fa957e30] [c000000000009c88] syscall_exit+0x0/0x7c [ 568.878125] Instruction dump: [ 568.878130] 60000000 813d0034 81610070 38000000 7f8b4800 419e001c 813f007c 7d2bfe70 [ 568.878144] 7d604a78 7c005850 54000ffe 7c0007b4 <0b000000> e8a10076 e87f0090 7fa4eb78 [ 568.878160] ---[ end trace 594d911d9654770b ]--- In addition fix the STD_GROUP optimization so that it works for bigalloc file systems as well. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Li Zhong <lizhongfs@gmail.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@vger.kernel.org # 3.10	2013-07-05 23:11:16 -04:00
Jan Kara	27d7c4ed1f	ext4: silence warning in ext4_writepages() The loop in mpage_map_and_submit_extent() is guaranteed to always run at least once since the caller of mpage_map_and_submit_extent() makes sure map->m_len > 0. So make that explicit using do-while instead of pure while which also silences the compiler warning about uninitialized 'err' variable. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>	2013-07-05 21:57:22 -04:00
Linus Torvalds	2dd1cb5a7e	Only a single patch which fixes a message. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJR1to/AAoJECmIfjd9wqK0t80P/3p1iovDYk+bERO0W2AYSWRO vDXxdT95xmP9qT81+nd3/Y5WYAl86LLftct+CrecJv4NtwkK4Fb+By3V8TpBS4VH ROoTXRTiXe8krrszi+6w9lNCuomqmlJoBC3sbTwRuFyGa4XPKYCoawbEu1OySPmW cK/z633usm+k67KZ184B/KW+3znFs0G6fZ18mKRT9ImfCpP6E/xq5a7+xYrLdEcg MdI3hRZIpEfDVl4RX12UhxwHXcMzc/tLRYBoWf3ukJaKSRuVfoQMh+slc7L9RlAY YaIr2Q4qiUSsv4j7+Hq18sSQwvmi7rLxHyG6ZbN9PU8uzPAlXqI1sqhNKwMVXNyh AJWtHkkt/nZY+vZmCsOZY7whGtVKoumCqSWWMai/FyXmZN79LZRY5cC/MuC49UQX 4I6tA6S7KCQSo2TM8s/iyqQ5xpzHNGaMnJ0DJ3XJaDI0B7GhJ1sv9EB2F7ZA96je hvLnlGklPyPz/QuKP7tCa3OakXm0Lns6uietzRxNARb5feMfBr7/ACSuPPcoc8hg v3YDIrXGeffg8kQmIsbYCGkyv1Il/Axh8bVvArJHf7YF21z+3ROXkOnzeqdTfuHQ rnFKgPrT0pxLoCz1sh0lS5QCVpxatUvYvVN+M3iVhDjhCm73mOssnYqJ3DjSJfO/ zFppRcF9D3bHCjyOYICt =5YQr -----END PGP SIGNATURE----- Merge tag 'upstream-3.11-rc1' of git://git.infradead.org/linux-ubifs Pull ubifs fix from Artem Bityutskiy: "Only a single patch which fixes a message" * tag 'upstream-3.11-rc1' of git://git.infradead.org/linux-ubifs: UBIFS: correct mount message	2013-07-05 12:08:47 -07:00

... 2 3 4 5 6 ...

32867 Commits