2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-11-19 08:05:27 +08:00
Commit Graph

20532 Commits

Author SHA1 Message Date
Jesper Juhl
a4264b3f40 UDF: Close small mem leak in udf_find_entry()
Hi,

There's a small memory leak in fs/udf/namei.c::udf_find_entry().

We dynamically allocate memory for 'fname' with kmalloc() and in most
situations we free it before we leave the function, but there is one
situation where we do not (but should). This patch closes the leak by
jumping to the 'out_ok' label which does the correct cleanup rather than
doing half the cleanup and returning directly.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:53:53 +01:00
Jan Kara
4651c5900e udf: Fix directory corruption after extent merging
If udf_bread() called from udf_add_entry() managed to merge created extent to
an already existing one (or if previous extents could be merged), the code
truncating the last extent to proper size would just overwrite the freshly
allocated extent with an extent that used to be in that place.  This obviously
results in a directory corruption. Fix the problem by properly reloading the
last extent.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:57 +01:00
Jan Kara
8754a3f718 udf: Protect udf_file_aio_write from possible races
Code doing conversion from INICB file to a normal file in udf_file_aio_write()
is not protected by any lock from other code modifying the inode. Use
i_alloc_sem for that.

Reported-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:57 +01:00
Alessio Igor Bogani
9db9f9e31d udf: Remove unnecessary bkl usages
The udf_readdir(), udf_lookup(), udf_create(), udf_mknod(), udf_mkdir(),
udf_rmdir(), udf_link(), udf_get_parent() and udf_unlink() seems already
adequately protected by i_mutex held by VFS invoking calls. The udf_rename()
instead should be already protected by lock_rename again by VFS. The
udf_ioctl(), udf_fill_super() and udf_evict_inode() don't requires any further
protection.

This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:57 +01:00
Alessio Igor Bogani
7db09be629 udf: Use of s_alloc_mutex to serialize udf_relocate_blocks() execution
This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:56 +01:00
Alessio Igor Bogani
4d0fb621d3 udf: Replace bkl with the UDF_I(inode)->i_data_sem for protect udf_inode_info struct
Replace bkl with the UDF_I(inode)->i_data_sem rw semaphore in
udf_release_file(), udf_symlink(), udf_symlink_filler(), udf_get_block(),
udf_block_map(), and udf_setattr(). The rule now is that any operation
on regular file's or symlink's extents (or generally allocation information
including goal block) needs to hold i_data_sem.

This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:56 +01:00
Jan Kara
d1668fe390 udf: Remove BKL from free space counting functions
udf_count_free_bitmap() does not need BKL because bitmaps are in a fixed
place on disk and so we can count set bits without serialization.
udf_count_free_table() is now protected by s_alloc_mutex instead of BKL
to get a consistent view of free space extents.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:56 +01:00
Jan Kara
7abc2e45e4 udf: Call udf_add_free_space() for more blocks at once in udf_free_blocks()
There's no need to call udf_add_free_space() for one block at a time. It saves
us noticeable amount of work and yields different result from the original
code only if the filesystem is corrupted and bitmap bit is already cleared.
In such case counter of free blocks is probably wrong anyways so the change
does not matter.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:56 +01:00
Jan Kara
0484b1cedc udf: Remove BKL from udf_put_super() and udf_remount_fs()
udf_put_super() does not need BKL because the filesystem is shut down so
there's nothing to race with. The credential changes in udf_remount_fs()
and LVID changes are now protected by dedicated locks so we can remove BKL
from this function as well.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:55 +01:00
Jan Kara
c03cad241a udf: Protect default inode credentials by rwlock
Superblock carries credentials (uid, gid, etc.) which are used as default
values in __udf_read_inode() when media does not provide these. These
credentials can change during remount so we protect them by a rwlock so that
each inode gets a consistent set of credentials.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:55 +01:00
Jan Kara
949f4a7c08 udf: Protect all modifications of LVID with s_alloc_mutex
udf_open_lvid() and udf_close_lvid() were modifying LVID without
s_alloc_mutex. Since they can be called from remount, the modification
could race with other filesystem modifications of LVID so protect them
by s_alloc_mutex just to be sure.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:55 +01:00
Jan Kara
d664b6af60 udf: Move handling of uniqueID into a helper function and protect it by a s_alloc_mutex
uniqueID handling has been duplicated in three places. Move it into a common
helper. Since we modify an LVID buffer with uniqueID update, we take
sbi->s_alloc_mutex to protect agaist other modifications of the structure.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:55 +01:00
Jan Kara
49521de119 udf: Remove BKL from udf_update_inode
udf_update_inode() does not need BKL since on-disk inode modifications are
protected by the buffer lock and reading of values of in-memory inode is
safe without any lock. In some cases we can write inconsistent inode state
to disk but in that case inode will be marked dirty and overwritten later.

Also make unnecessarily global udf_sync_inode() static.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:54 +01:00
Jan Kara
f2a6cc1f14 udf: Convert UDF_SB(sb)->s_flags to use bitops
Use atomic bitops to manipulate with sb flags to make manipulation safe
without any locking.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:54 +01:00
Joe Perches
fab3c8581f fs/udf: Add printf format/argument verification
Add __attribute__((format... to udf_warning.

All arguments matched formats, no other changes necessary.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:54 +01:00
Joe Perches
ed2ae6f691 fs/udf: Use vzalloc
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-01-06 17:03:53 +01:00
Linus Torvalds
eda4b716ea Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
  ocfs2: Fix system inodes cache overflow.
  ocfs2: Hold ip_lock when set/clear flags for indexed dir.
  ocfs2: Adjust masklog flag values
  Ocfs2: Teach 'coherency=full' O_DIRECT writes to correctly up_read i_alloc_sem.
  ocfs2/dlm: Migrate lockres with no locks if it has a reference
2010-12-23 16:36:48 -08:00
Linus Torvalds
55fb78a3a8 Merge branch 'linus-hot-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'linus-hot-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix on-line resizing regression
2010-12-23 16:25:31 -08:00
Theodore Ts'o
8a7411a243 ext4: fix on-line resizing regression
https://bugzilla.kernel.org/show_bug.cgi?id=25352

This regression was caused by commit a31437b85: "ext4: use
sb_issue_zeroout in setup_new_group_blocks", by accidentally dropping
the code which reserved the block group descriptor and inode table
blocks.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-23 15:00:54 -05:00
Prasad Joshi
f06328d772 logfs: fix "Kernel BUG at readwrite.c:1193"
This happens when __logfs_create() tries to write a new inode to the disk
which is full.

__logfs_create() associates the transaction pointer with inode.  During
the logfs_write_inode() function call chain this transaction pointer is
moved from inode to page->private using function move_inode_to_page
(do_write_inode() -> inode_to_page() -> move_inode_to_page)

When the write inode fails, the transaction is aborted and iput is called
on the failed inode.  During delete_inode the same transaction pointer
associated with the page is getting used.  Thus causing kernel BUG.

The patch checks for error in write_inode() and restores the page->private
to NULL.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=20162

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Joern Engel <joern@logfs.org>
Cc: Florian Mickler <florian@mickler.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-22 19:43:33 -08:00
Prasad Joshi
eabb26cacd logfs: fix deadlock in logfs_get_wblocks, hold and wait on super->s_write_mutex
do_logfs_journal_wl_pass() should use GFP_NOFS for memory allocation GC
code calls btree_insert32 with GFP_KERNEL while holding a mutex
super->s_write_mutex.

The same mutex is used in address_space_operations->writepage(), and a
call to writepage() could be triggered as a result of memory allocation
in btree_insert32, causing a deadlock.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=20342

Signed-off-by: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Joern Engel <joern@logfs.org>
Cc: Florian Mickler <florian@mickler.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-22 19:43:33 -08:00
Tao Ma
7d8f98769e ocfs2: Fix system inodes cache overflow.
When we store system inodes cache in ocfs2_super,
we use a array for global system inodes. But unfortunately,
the range is calculated wrongly which makes it overflow and
pollute ocfs2_super->local_system_inodes.
This patch fix it by setting the range properly.

The corresponding bug is ossbug1303.
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1303

Cc: stable@kernel.org
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-12-22 02:35:36 -08:00
Linus Torvalds
9d5004fcf6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: handle partial result from get_user_pages
  ceph: mark user pages dirty on direct-io reads
  ceph: fix null pointer dereference in ceph_init_dentry for nfs reexport
  ceph: fix direct-io on non-page-aligned buffers
  ceph: fix msgr_init error path
2010-12-20 21:32:20 -08:00
Al Viro
3cb50ddf97 Fix btrfs b0rkage
Buggered-in: 76dda93c6a ("Btrfs: add snapshot/subvolume destroy
ioctl")

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-20 09:09:57 -08:00
Henry C Chang
b6aa5901c7 ceph: mark user pages dirty on direct-io reads
For read operation, we have to set the argument _write_ of get_user_pages
to 1 since we will write data to pages. Also, we need to SetPageDirty before
releasing these pages.

Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-12-17 09:54:40 -08:00
Sage Weil
92cf765237 ceph: fix null pointer dereference in ceph_init_dentry for nfs reexport
The fh_to_dentry etc. methods use ceph_init_dentry(), which assumes that
d_parent is defined.  It isn't for those callers, so check!

Signed-off-by: Sage Weil <sage@newdream.net>
2010-12-17 09:53:48 -08:00
Linus Torvalds
a3383e8372 Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
  fanotify: fill in the metadata_len field on struct fanotify_event_metadata
  fanotify: split version into version and metadata_len
  fanotify: Dont try to open a file descriptor for the overflow event
  fanotify: Introduce FAN_NOFD
  fanotify: do not leak user reference on allocation failure
  inotify: stop kernel memory leak on file creation failure
  fanotify: on group destroy allow all waiters to bypass permission check
  fanotify: Dont allow a mask of 0 if setting or removing a mark
  fanotify: correct broken ref counting in case adding a mark failed
  fanotify: if set by user unset FMODE_NONOTIFY before fsnotify_perm() is called
  fanotify: remove packed from access response message
  fanotify: deny permissions when no event was sent
2010-12-16 15:45:49 -08:00
Tao Ma
8ac33dc86d ocfs2: Hold ip_lock when set/clear flags for indexed dir.
When we set/clear the dyn_features for an inode we hold the ip_lock.
So do it when we set/clear OCFS2_INDEXED_DIR_FL also.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-12-16 00:36:15 -08:00
Sunil Mushran
41b41a26d4 ocfs2: Adjust masklog flag values
Two masklogs had the same flag value.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-12-16 00:36:11 -08:00
Ryusuke Konishi
947b10ae0a nilfs2: fix regression of garbage collection ioctl
On 2.6.37-rc1, garbage collection ioctl of nilfs was broken due to the
commit 263d90cefc ("nilfs2: remove own inode hash used for GC"),
and leading to filesystem corruption.

The patch doesn't queue gc-inodes for log writer if they are reused
through the vfs inode cache.  Here, gc-inode is the inode which
buffers blocks to be relocated on GC.  That patch queues gc-inodes in
nilfs_init_gcinode() function, but this function is not called when
they don't have I_NEW flag.  Thus, some of live blocks are wrongly
overrode without being moved to new logs.

This resolves the problem by moving the gc-inode queueing to an outer
function to ensure it's done right.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2010-12-16 14:35:18 +09:00
Henry C Chang
ab226e21ad ceph: fix direct-io on non-page-aligned buffers
The user buffer may be 512-byte aligned, not page-aligned.  We were
assuming the buffer was page-aligned and only accounting for
non-page-aligned io offsets.

Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-12-15 20:46:16 -08:00
Linus Torvalds
a4851d8f7d Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix typo which broke '..' detection in ext4_find_entry()
  ext4: Turn off multiple page-io submission by default
2010-12-15 12:41:17 -08:00
Tavis Ormandy
462e635e5b install_special_mapping skips security_file_mmap check.
The install_special_mapping routine (used, for example, to setup the
vdso) skips the security check before insert_vm_struct, allowing a local
attacker to bypass the mmap_min_addr security restriction by limiting
the available pages for special mappings.

bprm_mm_init() also skips the check, and although I don't think this can
be used to bypass any restrictions, I don't see any reason not to have
the security check.

  $ uname -m
  x86_64
  $ cat /proc/sys/vm/mmap_min_addr
  65536
  $ cat install_special_mapping.s
  section .bss
      resb BSS_SIZE
  section .text
      global _start
      _start:
          mov     eax, __NR_pause
          int     0x80
  $ nasm -D__NR_pause=29 -DBSS_SIZE=0xfffed000 -f elf -o install_special_mapping.o install_special_mapping.s
  $ ld -m elf_i386 -Ttext=0x10000 -Tbss=0x11000 -o install_special_mapping install_special_mapping.o
  $ ./install_special_mapping &
  [1] 14303
  $ cat /proc/14303/maps
  0000f000-00010000 r-xp 00000000 00:00 0                                  [vdso]
  00010000-00011000 r-xp 00001000 00:19 2453665                            /home/taviso/install_special_mapping
  00011000-ffffe000 rwxp 00000000 00:00 0                                  [stack]

It's worth noting that Red Hat are shipping with mmap_min_addr set to
4096.

Signed-off-by: Tavis Ormandy <taviso@google.com>
Acked-by: Kees Cook <kees@ubuntu.com>
Acked-by: Robert Swiecki <swiecki@google.com>
[ Changed to not drop the error code - akpm ]
Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-15 12:30:36 -08:00
Eric Paris
7d13162332 fanotify: fill in the metadata_len field on struct fanotify_event_metadata
The fanotify_event_metadata now has a field which is supposed to
indicate the length of the metadata portion of the event.  Fill in that
field as well.

Based-in-part-on-patch-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-12-15 13:58:18 -05:00
Aaro Koskinen
6d5c3aa84b ext4: fix typo which broke '..' detection in ext4_find_entry()
There should be a check for the NUL character instead of '0'.

Fortunately the only thing that cares about this is NFS serving, which
is why we didn't notice this in the merge window testing.

Reported-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-14 21:45:31 -05:00
Theodore Ts'o
1449032be1 ext4: Turn off multiple page-io submission by default
Jon Nelson has found a test case which causes postgresql to fail with
the error:

psql:t.sql:4: ERROR: invalid page header in block 38269 of relation base/16384/16581

Under memory pressure, it looks like part of a file can end up getting
replaced by zero's.  Until we can figure out the cause, we'll roll
back the change and use block_write_full_page() instead of
ext4_bio_write_page().  The new, more efficient writing function can
be used via the mount option mblk_io_submit, so we can test and fix
the new page I/O code.

To reproduce the problem, install postgres 8.4 or 9.0, and pin enough
memory such that the system just at the end of triggering writeback
before running the following sql script:

begin;
create temporary table foo as select x as a, ARRAY[x] as b FROM
generate_series(1, 10000000 ) AS x;
create index foo_a_idx on foo (a);
create index foo_b_idx on foo USING GIN (b);
rollback;

If the temporary table is created on a hard drive partition which is
encrypted using dm_crypt, then under memory pressure, approximately
30-40% of the time, pgsql will issue the above failure.

This patch should fix this problem, and the problem will come back if
the file system is mounted with the mblk_io_submit mount option.

Reported-by: Jon Nelson <jnelson@jamponi.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-14 15:27:50 -05:00
Linus Torvalds
5111711d3e Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.37' of git://linux-nfs.org/~bfields/linux:
  nfsd: Fix possible BUG_ON firing in set_change_info
  sunrpc: prevent use-after-free on clearing XPT_BUSY
2010-12-14 11:09:05 -08:00
Linus Torvalds
e13cf63f2b Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: prevent RAID level downgrades when space is low
  Btrfs: account for missing devices in RAID allocation profiles
  Btrfs: EIO when we fail to read tree roots
  Btrfs: fix compiler warnings
  Btrfs: Make async snapshot ioctl more generic
  Btrfs: pwrite blocked when writing from the mmaped buffer of the same page
  Btrfs: Fix a crash when mounting a subvolume
  Btrfs: fix sync subvol/snapshot creation
  Btrfs: Fix page leak in compressed writeback path
  Btrfs: do not BUG if we fail to remove the orphan item for dead snapshots
  Btrfs: fixup return code for btrfs_del_orphan_item
  Btrfs: do not do fast caching if we are allocating blocks for tree_root
  Btrfs: deal with space cache errors better
  Btrfs: fix use after free in O_DIRECT
2010-12-14 11:08:13 -08:00
Linus Torvalds
073f21ae13 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: verify ioctl retries
  fuse: fix ioctl when server is 32bit
2010-12-14 11:07:39 -08:00
Linus Torvalds
497b5b13c9 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: log timestamp changes to the source inode in rename
2010-12-14 11:06:17 -08:00
Linus Torvalds
e97b71ded9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix ioctl magic
  ceph: Behave better when handling file lock replies.
  ceph: pass lock information by struct file_lock instead of as individual params.
  ceph: Handle file locks in replies from the MDS.
  ceph: avoid possible null deref in readdir after dir llseek
2010-12-14 11:02:15 -08:00
Linus Torvalds
38971ce2fa Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix panic after nfs_umount()
  nfs: remove extraneous and problematic calls to nfs_clear_request
  nfs: kernel should return EPROTONOSUPPORT when not support NFSv4
  NFS: Fix fcntl F_GETLK not reporting some conflicts
  nfs: Discard ACL cache on mode update
  NFS: Readdir cleanups
  NFS: nfs_readdir_search_for_cookie() don't mark as eof if cookie not found
  NFS: Fix a memory leak in nfs_readdir
  Call the filesystem back whenever a page is removed from the page cache
  NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler
2010-12-14 08:51:12 -08:00
Linus Torvalds
caa4a59574 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: remove bogus remapping of error in cifs_filldir()
  cifs: allow calling cifs_build_path_to_root on incomplete cifs_sb
  cifs: fix check of error return from is_path_accessable
  cifs: remove Local_System_Name
  cifs: fix use of CONFIG_CIFS_ACL
  cifs: add attribute cache timeout (actimeo) tunable
2010-12-14 08:49:15 -08:00
Chris Mason
83a50de97f Btrfs: prevent RAID level downgrades when space is low
The extent allocator has code that allows us to fill
allocations from any available block group, even if it doesn't
match the raid level we've requested.

This was put in because adding a new drive to a filesystem
made with the default mkfs options actually upgrades the metadata from
single spindle dup to full RAID1.

But, the code also allows us to allocate from a raid0 chunk when we
really want a raid1 or raid10 chunk.  This can cause big trouble because
mkfs creates a small (4MB) raid0 chunk for data and metadata which then
goes unused for raid1/raid10 installs.

The allocator will happily wander in and allocate from that chunk when
things get tight, which is not correct.

The fix here is to make sure that we provide duplication when the
caller has asked for it.  It does all the dups to be any raid level,
which preserves the dup->raid1 upgrade abilities.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-12-13 20:07:01 -05:00
Chris Mason
cd02dca564 Btrfs: account for missing devices in RAID allocation profiles
When we mount in RAID degraded mode without adding a new device to
replace the failed one, we can end up using the wrong RAID flags for
allocations.

This results in strange combinations of block groups (raid1 in a raid10
filesystem) and corruptions when we try to allocate blocks from single
spindle chunks on drives that are actually missing.

The first device has two small 4MB chunks in it that mkfs creates and
these are usually unused in a raid1 or raid10 setup.  But, in -o degraded,
the allocator will fall back to these because the mask of desired raid groups
isn't correct.

The fix here is to count the missing devices as we build up the list
of devices in the system.  This count is used when picking the
raid level to make sure we continue using the same levels that were
in place before we lost a drive.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-12-13 20:06:52 -05:00
Chris Mason
68433b73b1 Btrfs: EIO when we fail to read tree roots
If we just get a plain IO error when we read tree roots, the code
wasn't properly sending that error up the chain.  This allowed mounts to
continue when they should failed, and allowed operations
on partially setup root structs.  The end result was usually oopsen
on spinlocks that hadn't been spun up correctly.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-12-13 14:47:58 -05:00
Jan Beulich
3dd1462e82 Btrfs: fix compiler warnings
... regarding an unused function when !MIGRATION, and regarding a
printk() format string vs argument mismatch.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-12-10 16:29:11 -05:00
Li Zefan
fdfb1e4f6c Btrfs: Make async snapshot ioctl more generic
If we had reserved some bytes in struct btrfs_ioctl_vol_args, we
wouldn't have to create a new structure for async snapshot creation.

Here we convert async snapshot ioctl to use a more generic ABI, as
we'll add more ioctls for snapshots/subvolumes in the future, readonly
snapshots for example.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-12-10 16:29:11 -05:00
Xin Zhong
914ee295af Btrfs: pwrite blocked when writing from the mmaped buffer of the same page
This problem is found in meego testing:
http://bugs.meego.com/show_bug.cgi?id=6672
A file in btrfs is mmaped and the mmaped buffer is passed to pwrite to write to the same page
of the same file. In btrfs_file_aio_write(), the pages is locked by prepare_pages(). So when
btrfs_copy_from_user() is called, page fault happens and the same page needs to be locked again
in filemap_fault(). The fix is to move iov_iter_fault_in_readable() before prepage_pages() to make page
fault happen before pages are locked. And also disable page fault in critical region in
btrfs_copy_from_user().

Reviewed-by: Yan, Zheng<zheng.z.yan@intel.com>
Signed-off-by: Zhong, Xin <xin.zhong@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-12-10 16:29:10 -05:00
Li Zefan
f106e82caa Btrfs: Fix a crash when mounting a subvolume
We should drop dentry before deactivating the superblock, otherwise
we can hit this bug:

BUG: Dentry f349a690{i=100,n=/} still in use (1) [unmount of btrfs loop1]
...

Steps to reproduce the bug:

  # mount /dev/loop1 /mnt
  # mkdir save
  # btrfs subvolume snapshot /mnt save/snap1
  # umount /mnt
  # mount -o subvol=save/snap1 /dev/loop1 /mnt
  (crash)

Reported-by: Michael Niederle <mniederle@gmx.at>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-12-10 16:29:10 -05:00