linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-23 20:53:53 +08:00

Author	SHA1	Message	Date
Linus Torvalds	06a60deca8	Merge tag 'for-f2fs-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "New features: - in-memory extent_cache - fs_shutdown to test power-off-recovery - use inline_data to store symlink path - show f2fs as a non-misc filesystem Major fixes: - avoid CPU stalls on sync_dirty_dir_inodes - fix some power-off-recovery procedure - fix handling of broken symlink correctly - fix missing dot and dotdot made by sudden power cuts - handle wrong data index during roll-forward recovery - preallocate data blocks for direct_io ... and a bunch of minor bug fixes and cleanups" * tag 'for-f2fs-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (71 commits) f2fs: pass checkpoint reason on roll-forward recovery f2fs: avoid abnormal behavior on broken symlink f2fs: flush symlink path to avoid broken symlink after POR f2fs: change 0 to false for bool type f2fs: do not recover wrong data index f2fs: do not increase link count during recovery f2fs: assign parent's i_mode for empty dir f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries f2fs: fix mismatching lock and unlock pages for roll-forward recovery f2fs: fix sparse warnings f2fs: limit b_size of mapped bh in f2fs_map_bh f2fs: persist system.advise into on-disk inode f2fs: avoid NULL pointer dereference in f2fs_xattr_advise_get f2fs: preallocate fallocated blocks for direct IO f2fs: enable inline data by default f2fs: preserve extent info for extent cache f2fs: initialize extent tree with on-disk extent info of inode f2fs: introduce __{find,grab}_extent_tree f2fs: split set_data_blkaddr from f2fs_update_extent_cache f2fs: enable fast symlink by utilizing inline data ...	2015-04-18 11:17:20 -04:00
Jaegeuk Kim	10027551cc	f2fs: pass checkpoint reason on roll-forward recovery This patch adds CP_RECOVERY to remain recovery information for checkpoint. And, it makes sure writing checkpoint in this case. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-16 09:45:40 -07:00
Jaegeuk Kim	feb7cbb079	f2fs: avoid abnormal behavior on broken symlink When f2fs_symlink was triggered and checkpoint was done before syncing its link path, f2fs can get broken symlink like "xxx -> \0\0\0". This incurs abnormal path_walk by VFS. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-16 09:45:40 -07:00
Jaegeuk Kim	d0cae97cb6	f2fs: flush symlink path to avoid broken symlink after POR This patch tries to avoid broken symlink case after POR in best effort. This results in performance regression. But, if f2fs has inline_data and the target path is under 3KB-sized long, the page would be stored in its inode_block, so that there would be no performance regression. Note that, if user wants to keep this file atomically, it needs to trigger dir->fsync. And, there is still a hole to produce broken symlink. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-16 09:45:35 -07:00
Taehee Yoo	9df47ba759	f2fs: change 0 to false for bool type in the f2fs_fill_super function, variable "retry" is bool type i think that it should be set as false. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-15 16:13:28 -07:00
Omar Sandoval	22c6186ece	direct_IO: remove rw from a_ops->direct_IO() Now that no one is using rw, remove it completely. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:29:45 -04:00
Omar Sandoval	6f67376318	direct_IO: use iov_iter_rw() instead of rw everywhere The rw parameter to direct_IO is redundant with iov_iter->type, and treated slightly differently just about everywhere it's used: some users do rw & WRITE, and others do rw == WRITE where they should be doing a bitwise check. Simplify this with the new iov_iter_rw() helper, which always returns either READ or WRITE. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:29:45 -04:00
Omar Sandoval	17f8c842d2	Remove rw from {,__,do_}blockdev_direct_IO() Most filesystems call through to these at some point, so we'll start here. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:29:44 -04:00
Al Viro	5d5d568975	make new_sync_{read,write}() static All places outside of core VFS that checked ->read and ->write for being NULL or called the methods directly are gone now, so NULL {read,write} with non-NULL {read,write}_iter will do the right thing in all cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:29:40 -04:00
Jaegeuk Kim	e03b07d908	f2fs: do not recover wrong data index During the roll-forward recovery, if we found a new data index written fsync lastly, we need to recover new block address. But, if that address was corrupted, we should not recover that. Otherwise, f2fs gets kernel panic from: In check_index_in_prev_nodes(), sentry = get_seg_entry(sbi, segno); --------------------------> out-of-range segno. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:59 -07:00
Jaegeuk Kim	418f6c2770	f2fs: do not increase link count during recovery If there are multiple fsynced dnodes having a dent flag, roll-forward routine sets FI_INC_LINK for their inode, and recovery_dentry increases its link count accordingly. That results in normal file having a link count as 2, so we can't unlink those files. This was added to handle several inode blocks having same inode number with different directory paths. But, current f2fs doesn't replay all of path changes and only recover its dentry for the last fsynced inode block. So, there is no reason to do this. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:58 -07:00
Jaegeuk Kim	cb58463bc9	f2fs: assign parent's i_mode for empty dir When assigning i_mode for dotdot, it needs to assign parent's i_mode. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:58 -07:00
Jaegeuk Kim	510022a858	f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries If f2fs was corrupted with missing dot dentries, it needs to recover them after fsck.f2fs detection. The underlying precedure is: 1. The fsck.f2fs remains F2FS_INLINE_DOTS flag in directory inode, if it detects missing dot dentries. 2. When f2fs looks up the corrupted directory, it triggers f2fs_add_link with proper inode numbers and their dot and dotdot names. 3. Once f2fs recovers the directory without errors, it removes F2FS_INLINE_DOTS finally. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:57 -07:00
Jaegeuk Kim	c9ef481097	f2fs: fix mismatching lock and unlock pages for roll-forward recovery Previously, inode page is not correctly locked and unlocked in pair during the roll-forward recovery. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:56 -07:00
Jaegeuk Kim	adad81ed42	f2fs: fix sparse warnings This patch fixes the below warning. sparse warnings: (new ones prefixed by >>) >> fs/f2fs/inode.c:56:23: sparse: restricted __le32 degrades to integer >> fs/f2fs/inode.c:56:52: sparse: restricted __le32 degrades to integer Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:55 -07:00
Chao Yu	1b3e27a92a	f2fs: limit b_size of mapped bh in f2fs_map_bh Map bh over max size which caller defined is not needed, limit it in f2fs_map_bh. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:55 -07:00
Chao Yu	30c62fdb25	f2fs: persist system.advise into on-disk inode This patch fixes to dirty inode for persisting i_advise of f2fs inode info into on-disk inode if user sets system.advise through setxattr. Otherwise the new value will be lost. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:54 -07:00
Chao Yu	84e97c2767	f2fs: avoid NULL pointer dereference in f2fs_xattr_advise_get We will encounter oops by executing below command. getfattr -n system.advise /mnt/f2fs/file Killed message log: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs] pdpt = 00000000319b7001 pde = 0000000000000000 Oops: 0002 [#1] SMP Modules linked in: f2fs(O) snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev snd_seq_device snd_timer bnep snd rfcomm microcode bluetooth soundcore i2c_piix4 mac_hid serio_raw parport_pc ppdev lp parport binfmt_misc hid_generic psmouse usbhid hid e1000 [last unloaded: f2fs] CPU: 3 PID: 3134 Comm: getfattr Tainted: G O 4.0.0-rc1 #6 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 task: f3a71b60 ti: f19a6000 task.ti: f19a6000 EIP: 0060:[<f8b54d69>] EFLAGS: 00010246 CPU: 3 EIP is at f2fs_xattr_advise_get+0x29/0x40 [f2fs] EAX: 00000000 EBX: f19a7e71 ECX: 00000000 EDX: f8b5b467 ESI: 00000000 EDI: f2008570 EBP: f19a7e14 ESP: f19a7e08 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 80050033 CR2: 00000000 CR3: 319b8000 CR4: 000007f0 Stack: f8b5a634 c0cbb580 00000000 f19a7e34 c1193850 00000000 00000007 f19a7e71 f19a7e64 c0cbb580 c1193810 f19a7e50 c1193c00 00000000 00000000 00000000 c0cbb580 00000000 f19a7f70 c1194097 00000000 00000000 00000000 74737973 Call Trace: [<c1193850>] generic_getxattr+0x40/0x50 [<c1193810>] ? xattr_resolve_name+0x80/0x80 [<c1193c00>] vfs_getxattr+0x70/0xa0 [<c1194097>] getxattr+0x87/0x190 [<c11801d7>] ? path_lookupat+0x57/0x5f0 [<c11819d2>] ? putname+0x32/0x50 [<c116653a>] ? kmem_cache_alloc+0x2a/0x130 [<c11819d2>] ? putname+0x32/0x50 [<c11819d2>] ? putname+0x32/0x50 [<c11819d2>] ? putname+0x32/0x50 [<c11827f9>] ? user_path_at_empty+0x49/0x70 [<c118283f>] ? user_path_at+0x1f/0x30 [<c11941e7>] path_getxattr+0x47/0x80 [<c11948e7>] SyS_getxattr+0x27/0x30 [<c163f748>] sysenter_do_call+0x12/0x12 Code: 66 90 55 89 e5 57 56 53 66 66 66 66 90 8b 78 20 89 d3 ba 67 b4 b5 f8 89 d8 89 ce e8 42 7c 7b c8 85 c0 75 16 0f b6 87 44 01 00 00 <88> 06 b8 01 00 00 00 5b 5e 5f 5d c3 8d 76 00 b8 ea ff ff ff eb EIP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs] SS:ESP 0068:f19a7e08 CR2: 0000000000000000 ---[ end trace 860260654f1f416a ]--- The reason is that in getfattr there are two steps which is indicated by strace info: 1) try to lookup and get size of specified xattr. 2) get value of the extented attribute. strace info: getxattr("/mnt/f2fs/file", "system.advise", 0x0, 0) = 1 getxattr("/mnt/f2fs/file", "system.advise", "\x00", 256) = 1 For the first step, getfattr may pass a NULL pointer in @value and zero in @size as parameters for ->getxattr, but we access this @value pointer directly without checking whether the pointer is valid or not in f2fs_xattr_advise_get, so the oops occurs. This patch fixes this issue by verifying @value pointer before using. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:53 -07:00
Chao Yu	df6136ef55	f2fs: preallocate fallocated blocks for direct IO Normally, due to DIO_SKIP_HOLES flag is set by default, blockdev_direct_IO in f2fs_direct_IO tries to skip DIO in holes when writing inside i_size, this makes us falling back to buffered IO which shows lower performance. So in commit `59b802e5a4` ("f2fs: allocate data blocks in advance for f2fs_direct_IO"), we improve perfromance by allocating data blocks in advance if we meet holes no matter in i_size or not, since with it we can avoid falling back to buffered IO. But we forget to consider for unwritten fallocated block in this commit. This patch tries to fix it for fallocate case, this helps to improve performance. Test result: Storage info: sandisk ultra 64G micro sd card. touch /mnt/f2fs/file truncate -s 67108864 /mnt/f2fs/file fallocate -o 0 -l 67108864 /mnt/f2fs/file time dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=64 conv=notrunc oflag=direct Time before applying the patch: 67108864 bytes (67 MB) copied, 36.16 s, 1.9 MB/s real 0m36.162s user 0m0.000s sys 0m0.180s Time after applying the patch: 67108864 bytes (67 MB) copied, 27.7776 s, 2.4 MB/s real 0m27.780s user 0m0.000s sys 0m0.036s Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:52 -07:00
Wanpeng Li	7534279798	f2fs: enable inline data by default Enable inline_data feature by default since it brings us better performance and space utilization and now has already stable. Add another option noinline_data to disable it during mount. Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org> Suggested-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:52 -07:00
Chao Yu	0bdee48250	f2fs: preserve extent info for extent cache This patch tries to preserve last extent info in extent tree cache into on-disk inode, so this can help us to reuse the last extent info next time for performance. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:51 -07:00
Chao Yu	028a41e893	f2fs: initialize extent tree with on-disk extent info of inode With normal extent info cache, we records largest extent mapping between logical block and physical block into extent info, and we persist extent info in on-disk inode. When we enable extent tree cache, if extent info of on-disk inode is exist, and the extent is not a small fragmented mapping extent. We'd better to load the extent info into extent tree cache when inode is loaded. By this way we can have more chance to hit extent tree cache rather than taking more time to read dnode page for block address. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:50 -07:00
Chao Yu	93dfc52656	f2fs: introduce __{find,grab}_extent_tree This patch introduces __{find,grab}_extent_tree for reusing by following patches. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:49 -07:00
Chao Yu	216a620a7c	f2fs: split set_data_blkaddr from f2fs_update_extent_cache Split __set_data_blkaddr from f2fs_update_extent_cache for readability. Additionally rename __set_data_blkaddr to set_data_blkaddr for exporting. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:49 -07:00
Wanpeng Li	368a0e40b5	f2fs: enable fast symlink by utilizing inline data Fast symlink can utilize inline data flow to avoid using any i_addr region, since we need to handle many cases such as truncation, roll-forward recovery, and fsck/dump tools. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:48 -07:00
Jaegeuk Kim	8ce67cb07d	f2fs: add some tracepoints to debug volatile and atomic writes Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:47 -07:00
Jaegeuk Kim	3c6c2bebef	f2fs: avoid punch_hole overhead when releasing volatile data This patch is to avoid some punch_hole overhead when releasing volatile data. If volatile data was not written yet, we just can make the first page as zero. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:46 -07:00
Jaegeuk Kim	83e21db693	f2fs: avoid wrong f2fs_bug_on when truncating inline_data This patch removes wrong f2fs_bug_on in truncate_inline_inode. When there is no space, it can happen a corner case where i_isze is over MAX_INLINE_SIZE while its inode is still inline_data. The scenario is 1. write small data into file #A. 2. fill the whole partition to 100%. 3. truncate 4096 on file #A. 4. write data at 8192 offset. --> f2fs_write_begin -> -ENOSPC = f2fs_convert_inline_page -> f2fs_write_failed -> truncate_blocks -> truncate_inline_inode BUG_ON, since i_size is 4096. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:46 -07:00
Jaegeuk Kim	78373b7319	f2fs: enhance multi-threads performance Previously, f2fs_write_data_pages has a mutex, sbi->writepages, to serialize data writes to maximize write bandwidth, while sacrificing multi-threads performance. Practically, however, multi-threads environment is much more important for users. So this patch tries to remove the mutex. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:45 -07:00
Jaegeuk Kim	3402e87cfb	f2fs: set buffer_new when new blocks are allocated This patch modifies to call set_buffer_new, if new blocks are allocated. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:44 -07:00
Chao Yu	2adc3505cf	f2fs: set SBI_NEED_FSCK when encountering exception in recovery This patch tries to set SBI_NEED_FSCK flag into sbi only when we fail to recover in fill_super, so we could skip fscking image when we fail to fill super for other reason. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:43 -07:00
Jaegeuk Kim	21cb1d99bc	f2fs: fix to cover sentry_lock for block allocation In the following call stack, f2fs changes the bitmap for dirty segments and # of dirty sentries without grabbing sit_i->sentry_lock. This can result in mismatch on bitmap and # of dirty sentries, since if there are some direct_io operations. In allocate_data_block, - __allocate_new_segments - mutex_lock(&curseg->curseg_mutex); - s_ops->allocate_segment - new_curseg/change_curseg - reset_curseg - __set_sit_entry_type - __mark_sit_entry_dirty - set_bit(dirty_sentries_bitmap) - dirty_sentries++; Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:43 -07:00
Chao Yu	d6d4f1cb91	f2fs: fix to check current blkaddr in __allocate_data_blocks In __allocate_data_blocks, we should check current blkaddr which is located at ofs_in_node of dnode page instead of checking first blkaddr all the time. Otherwise we can only allocate one blkaddr in each dnode page. Fix it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:42 -07:00
Chao Yu	0bfcfcca3d	f2fs: fix to truncate inline data past EOF Previously if inode is with inline data, we will try to invalid partial inline data in page #0 when we truncate size of inode in truncate_partial_data_page(). And then we set page #0 to dirty, after this we can synchronize inode page with page #0 at ->writepage(). But sometimes we will fail to operate page #0 in truncate_partial_data_page() due to below reason: a) if offset is zero, we will skip setting page #0 to dirty. b) if page #0 is not uptodate, we will fail to update it as it has no mapping data. So with following operations, we will meet recent data which should be truncated. 1.write inline data to file 2.sync first data page to inode page 3.truncate file size to 0 4.truncate file size to max_inline_size 5.echo 1 > /proc/sys/vm/drop_caches 6.read file --> meet original inline data which is remained in inode page. This patch renames truncate_inline_data() to truncate_inline_inode() for code readability, then use truncate_inline_inode() to truncate inline data in inode page in truncate_blocks() and truncate page #0 in truncate_partial_data_page() for fixing. v2: o truncate partially #0 page in truncate_partial_data_page to avoid keeping old data in #0 page. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:41 -07:00
Chao Yu	83dfe53c18	f2fs: fix reference leaks in f2fs_acl_create Our f2fs_acl_create is copied and modified from posix_acl_create to avoid deadlock bug when inline_dentry feature is enabled. Now, we got reference leaks in posix_acl_create, and this has been fixed in commit `fed0b588be` ("posix_acl: fix reference leaks in posix_acl_create") by Omar Sandoval. https://lkml.org/lkml/2015/2/9/5 Let's fix this issue in f2fs_acl_create too. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Reviewed-by: Changman Lee <cm224.lee@ssamsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:40 -07:00
Chao Yu	bda190760b	f2fs: fix to calculate max length of contiguous free slots correctly When lookuping for creating, we will try to record the level of current dentry hash table if current dentry has enough contiguous slots for storing name of new file which will be created later, this can save our lookup time when add a link into parent dir. But currently in find_target_dentry, our current length of contiguous free slots is not calculated correctly. This make us leaving some holes in dentry block occasionally, it wastes our space of dentry block. Let's refactor the lookup flow for max slots as following to fix this issue: a) increase max_len if current slot is free; b) update max_slots with max_len if max_len is larger than max_slots; c) reset max_len to zero if current slot is not free. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:40 -07:00
Wanpeng Li	57ed1e95ba	f2fs: fix unlocked nat set cache operation nm_i->nat_tree_lock is used to sync both the operations of nat entry cache tree and nat set cache tree, however, it isn't held when flush nat entries during checkpoint which lead to potential race, this patch fix it by holding the lock when gang lookup nat set cache and delete item from nat set cache. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:39 -07:00
Changman Lee	e0150392dd	f2fs: cleanup statement about max orphan inodes calc Through each macro, we can read the meaning easily. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:38 -07:00
Yuan Zhong	d9f46bb1a8	f2fs: remove unnecessary condition judgment Remove the unnecessary condition judgment, because 'max_slots' has been initialized to '0' at the beginging of the function, as following: if (max_slots) *max_slots = 0; Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:38 -07:00
Yuan Zhong	b1f73b79d2	f2fs: set the correct place of initializing *res_page The function 'find_in_inline_dir()' contain 'res_page' as an argument. So, we should initiaize 'res_page' before this function. Signed-off-by: Yuan Zhong <yuan.mark.zhong@samsung.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:37 -07:00
Wanpeng Li	7fd97019b8	f2fs: reduce searching region of segmap when set free section In __set_free we will check whether all segment are free in one section when free one segment, in order to set section to free status. But the searching region of segmap is from start segno to last segno of main area, it's not necessary. So let's just only check all segment bitmap of target section. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:36 -07:00
Wanpeng Li	fdf6c8be33	f2fs: fix extent cache memory leak extent tree/node slab cache is created during f2fs insmod, how, it isn't destroyed during f2fs rmmod, this patch fix it by destroy extent tree/node slab cache once rmmod f2fs. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:35 -07:00
Jaegeuk Kim	d7196c5a32	f2fs: relocate Kconfig from misc filesystems The f2fs has been shipped on many smartphone devices during a couple of years. So, it is worth to relocate Kconfig into main page from misc filesystems for developers to choose it more easily. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:35 -07:00
Jaegeuk Kim	7662916591	f2fs: report -ENOENT for unreached data indices If inode has inline_data, it should report -ENOENT when accessing out-of-bound region. This is used by f2fs_fiemap which treats -ENOENT with no error. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:34 -07:00
Jaegeuk Kim	cff28521bb	f2fs: clear append/update flags once fsync is done When fsync is done through checkpoint, previous f2fs missed to clear append and update flag. This patch fixes to clear them. This was originally catched by Changman Lee before. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:33 -07:00
Jaegeuk Kim	d5669f7b9b	f2fs: avoid to trigger writepage during POR This patch doesn't make any effect on previous behavior, since f2fs_write_data_page bypasses writing the page during POR. But, the difference is that this patch avoids holding writepages mutex. This is to avoid the following false warning, since this can happen only when mount and shutdown are triggered at the same time. ====================================================== [ INFO: possible circular locking dependency detected ] 4.0.0-rc1+ #3 Tainted: G O ------------------------------------------------------- kworker/u8:0/2270 is trying to acquire lock: (&sbi->gc_mutex){+.+.+.}, at: [<ffffffffa02bdd33>] f2fs_balance_fs+0x73/0x90 [f2fs] but task is already holding lock: (&sbi->writepages){+.+...}, at: [<ffffffffa02b261b>] f2fs_write_data_pages+0xcb/0x3a0 [f2fs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sbi->writepages){+.+...}: [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0 [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530 [<ffffffffa02b261b>] f2fs_write_data_pages+0xcb/0x3a0 [f2fs] [<ffffffff811c38c1>] do_writepages+0x21/0x50 [<ffffffff8126c5a6>] __writeback_single_inode+0x76/0xbf0 [<ffffffff8126e23a>] writeback_single_inode+0xea/0x1c0 [<ffffffff8126e425>] write_inode_now+0x95/0xa0 [<ffffffff81259dab>] iput+0x20b/0x3f0 [<ffffffffa02c1c8b>] recover_data.constprop.14+0x26b/0xa80 [f2fs] [<ffffffffa02c2776>] recover_fsync_data+0x2b6/0x5e0 [f2fs] [<ffffffffa02a9744>] f2fs_fill_super+0xb24/0xb90 [f2fs] [<ffffffff8123d7f4>] mount_bdev+0x1a4/0x1e0 [<ffffffffa02a3c85>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffff8123e159>] mount_fs+0x39/0x180 [<ffffffff8125e51b>] vfs_kern_mount+0x6b/0x160 [<ffffffff81261554>] do_mount+0x204/0xbe0 [<ffffffff8126223b>] SyS_mount+0x8b/0xe0 [<ffffffff81863e6d>] system_call_fastpath+0x16/0x1b -> #1 (&sbi->cp_mutex){+.+...}: [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0 [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530 [<ffffffffa02acbf2>] write_checkpoint+0x42/0x1230 [f2fs] [<ffffffffa02a847d>] f2fs_sync_fs+0x9d/0x2a0 [f2fs] [<ffffffff81272f82>] sync_filesystem+0x82/0xb0 [<ffffffff8123c214>] generic_shutdown_super+0x34/0x100 [<ffffffff8123c5f7>] kill_block_super+0x27/0x70 [<ffffffffa02a3c60>] kill_f2fs_super+0x20/0x30 [f2fs] [<ffffffff8123ca49>] deactivate_locked_super+0x49/0x80 [<ffffffff8123d05e>] deactivate_super+0x4e/0x70 [<ffffffff8125df63>] cleanup_mnt+0x43/0x90 [<ffffffff8125e002>] __cleanup_mnt+0x12/0x20 [<ffffffff810a82e4>] task_work_run+0xc4/0xf0 [<ffffffff8101f0bd>] do_notify_resume+0x8d/0xa0 [<ffffffff81864141>] int_signal+0x12/0x17 -> #0 (&sbi->gc_mutex){+.+.+.}: [<ffffffff810e2866>] __lock_acquire+0x1ac6/0x1c90 [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0 [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530 [<ffffffffa02bdd33>] f2fs_balance_fs+0x73/0x90 [f2fs] [<ffffffffa02b5938>] f2fs_write_data_page+0x348/0x5b0 [f2fs] [<ffffffffa02af9da>] __f2fs_writepage+0x1a/0x50 [f2fs] [<ffffffff811c1b54>] write_cache_pages+0x274/0x6f0 [<ffffffffa02b2630>] f2fs_write_data_pages+0xe0/0x3a0 [f2fs] [<ffffffff811c38c1>] do_writepages+0x21/0x50 [<ffffffff8126c5a6>] __writeback_single_inode+0x76/0xbf0 [<ffffffff8126d44a>] writeback_sb_inodes+0x32a/0x710 [<ffffffff8126d8cf>] __writeback_inodes_wb+0x9f/0xd0 [<ffffffff8126dcdb>] wb_writeback+0x3db/0x850 [<ffffffff8126e848>] bdi_writeback_workfn+0x148/0x980 [<ffffffff810a3782>] process_one_work+0x1e2/0x840 [<ffffffff810a3f01>] worker_thread+0x121/0x460 [<ffffffff810a9dc8>] kthread+0xf8/0x110 [<ffffffff81863dbc>] ret_from_fork+0x7c/0xb0 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:32 -07:00
Changman Lee	e1235983e3	f2fs: add stat info for moved blocks by background gc This patch is for looking into gc performance of f2fs in detail. Signed-off-by: Changman Lee <cm224.lee@samsung.com> [Jaegeuk Kim: fix build errors] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:32 -07:00
Chao Yu	b28c3f9493	f2fs: fix to issue small discard in real-time mode discard Now in f2fs, we share functions and structures for batch mode and real-time mode discard. For real-time mode discard, in shared function add_discard_addrs, we will use uninitialized trim_minlen in struct cp_control to compare with length of contiguous free blocks to decide whether skipping discard fragmented freespace or not, this makes us ignore small discard sometimes. Fix it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Reviewed-by : Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:31 -07:00
Sebastian Andrzej Siewior	7ecebe5e07	f2fs: add cond_resched() to sync_dirty_dir_inodes() In a preempt-off enviroment a alot of FS activity (write/delete) I run into a CPU stall: \| NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59] \| Modules linked in: \| CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: G W 3.19.0-00010-g10c11c51ffed #153 \| Workqueue: writeback bdi_writeback_workfn (flush-179:0) \| task: df230000 ti: df23e000 task.ti: df23e000 \| PC is at __submit_merged_bio+0x6c/0x110 \| LR is at f2fs_submit_merged_bio+0x74/0x80 … \| [<c00085c4>] (gic_handle_irq) from [<c0012e84>] (__irq_svc+0x44/0x5c) \| Exception stack(0xdf23fb48 to 0xdf23fb90) \| fb40: deef3484 ffff0001 ffff0001 00000027 deef3484 00000000 \| fb60: deef3440 00000000 de426000 deef34ec deefc440 df23fbb4 df23fbb8 df23fb90 \| fb80: c02191f0 c0218fa0 60000013 ffffffff \| [<c0012e84>] (__irq_svc) from [<c0218fa0>] (__submit_merged_bio+0x6c/0x110) \| [<c0218fa0>] (__submit_merged_bio) from [<c02191f0>] (f2fs_submit_merged_bio+0x74/0x80) \| [<c02191f0>] (f2fs_submit_merged_bio) from [<c021624c>] (sync_dirty_dir_inodes+0x70/0x78) \| [<c021624c>] (sync_dirty_dir_inodes) from [<c0216358>] (write_checkpoint+0x104/0xc10) \| [<c0216358>] (write_checkpoint) from [<c021231c>] (f2fs_sync_fs+0x80/0xbc) \| [<c021231c>] (f2fs_sync_fs) from [<c0221eb8>] (f2fs_balance_fs_bg+0x4c/0x68) \| [<c0221eb8>] (f2fs_balance_fs_bg) from [<c021e9b8>] (f2fs_write_node_pages+0x40/0x110) \| [<c021e9b8>] (f2fs_write_node_pages) from [<c00de620>] (do_writepages+0x34/0x48) \| [<c00de620>] (do_writepages) from [<c0145714>] (__writeback_single_inode+0x50/0x228) \| [<c0145714>] (__writeback_single_inode) from [<c0146184>] (writeback_sb_inodes+0x1a8/0x378) \| [<c0146184>] (writeback_sb_inodes) from [<c01463e4>] (__writeback_inodes_wb+0x90/0xc8) \| [<c01463e4>] (__writeback_inodes_wb) from [<c01465f8>] (wb_writeback+0x1dc/0x28c) \| [<c01465f8>] (wb_writeback) from [<c0146dd8>] (bdi_writeback_workfn+0x2ac/0x460) \| [<c0146dd8>] (bdi_writeback_workfn) from [<c003c3fc>] (process_one_work+0x11c/0x3a4) \| [<c003c3fc>] (process_one_work) from [<c003c844>] (worker_thread+0x17c/0x490) \| [<c003c844>] (worker_thread) from [<c0041398>] (kthread+0xec/0x100) \| [<c0041398>] (kthread) from [<c000ed10>] (ret_from_fork+0x14/0x24) As it turns out, the code loops in sync_dirty_dir_inodes() and waits for others to make progress but since it never leaves the CPU there is no progress made. At the time of this stall, there is also a rm process blocked: \| rm R running 0 1989 1774 0x00000000 \| [<c047c55c>] (__schedule) from [<c00486dc>] (__cond_resched+0x30/0x4c) \| [<c00486dc>] (__cond_resched) from [<c047c8c8>] (_cond_resched+0x4c/0x54) \| [<c047c8c8>] (_cond_resched) from [<c00e1aec>] (truncate_inode_pages_range+0x1f0/0x5e8) \| [<c00e1aec>] (truncate_inode_pages_range) from [<c00e1fd8>] (truncate_inode_pages+0x28/0x30) \| [<c00e1fd8>] (truncate_inode_pages) from [<c00e2148>] (truncate_inode_pages_final+0x60/0x64) \| [<c00e2148>] (truncate_inode_pages_final) from [<c020c92c>] (f2fs_evict_inode+0x4c/0x268) \| [<c020c92c>] (f2fs_evict_inode) from [<c0137214>] (evict+0x94/0x140) \| [<c0137214>] (evict) from [<c01377e8>] (iput+0xc8/0x134) \| [<c01377e8>] (iput) from [<c01333e4>] (d_delete+0x154/0x180) \| [<c01333e4>] (d_delete) from [<c0129870>] (vfs_rmdir+0x114/0x12c) \| [<c0129870>] (vfs_rmdir) from [<c012d644>] (do_rmdir+0x158/0x168) \| [<c012d644>] (do_rmdir) from [<c012dd90>] (SyS_unlinkat+0x30/0x3c) \| [<c012dd90>] (SyS_unlinkat) from [<c000ec40>] (ret_fast_syscall+0x0/0x4c) As explained by Jaegeuk Kim: \|This inode is the directory (c.f., do_rmdir) causing a infinite loop on \|sync_dirty_dir_inodes. \|The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the \|inode is under eviction, it submits bios and do it again until eviction \|is finished. This patch adds a cond_resched() (as suggested by Jaegeuk) after a BIO is submitted so other thread can make progress. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> [Jaegeuk Kim: change fs/f2fs to f2fs in subject as naming convention] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:30 -07:00
Wanpeng Li	14b4281776	f2fs: fix max orphan inodes calculation cp_payload is introduced for sit bitmap to support large volume, and it is just after the block of f2fs_checkpoint + nat bitmap, so the first segment should include F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload + orphan blocks. However, current max orphan inodes calculation don't consider cp_payload, this patch fix it by reducing the number of cp_payload from total blocks of the first segment when calculate max orphan inodes. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:29 -07:00

1 2 3 4 5 ...

895 Commits