linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-11 21:38:32 +08:00

Author	SHA1	Message	Date
Jaegeuk Kim	22ad0b6ab4	f2fs: add bitmaps for empty or full NAT blocks This patches adds bitmaps to represent empty or full NAT blocks containing free nid entries. If we can find valid crc\|cp_ver in the last block of checkpoint pack, we'll use these bitmaps when building free nids. In order to avoid checkpointing burden, up-to-date bitmaps will be flushed only during umount time. So, normally we can get this gain, but when power-cut happens, we rely on fsck.f2fs which recovers this bitmap again. After this patch, we build free nids from nid #0 at mount time to make more full NAT blocks, but in runtime, we check empty NAT blocks to load free nids without loading any NAT pages from disk. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-27 09:59:54 -08:00
Yunlei He	5e8256ac2e	f2fs: replace rw semaphore extent_tree_lock with mutex lock This patch replace rw semaphore extent_tree_lock with mutex lock for no read cases with this lock. Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-27 09:59:53 -08:00
Jaegeuk Kim	47b8980816	f2fs: much larger batched trim_fs job We have a kernel thread to issue discard commands, so we can increase the number of batched discard sections. By default, now it becomes 4GB range. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-27 09:59:30 -08:00
Jaegeuk Kim	ad4d307fce	f2fs: avoid very large discard command This patch adds MAX_DISCARD_BLOCKS() to avoid issuing too much large single discard command. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-27 09:59:20 -08:00
Hou Pengyang	e93b986525	f2fs: add ovp valid_blocks check for bg gc victim to fg_gc For foreground gc, greedy algorithm should be adapted, which makes this formula work well: (2 * (100 / config.overprovision + 1) + 6) But currently, we fg_gc have a prior to select bg_gc victim segments to gc first, these victims are selected by cost-benefit algorithm, we can't guarantee such segments have the small valid blocks, which may destroy the f2fs rule, on the worstest case, would consume all the free segments. This patch fix this by add a filter in check_bg_victims, if segment's has # of valid blocks over overprovision ratio, skip such segments. Cc: <stable@vger.kernel.org> Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-23 11:28:20 -08:00
Jaegeuk Kim	88c5c13a50	f2fs: fix multiple f2fs_add_link() calls having same name It turns out a stakable filesystem like sdcardfs in AOSP can trigger multiple vfs_create() to lower filesystem. In that case, f2fs will add multiple dentries having same name which breaks filesystem consistency. Until upper layer fixes, let's work around by f2fs, which shows actually not much performance regression. Cc: <stable@vger.kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-23 11:23:25 -08:00
Chao Yu	d260081ccf	f2fs: change recovery policy of xattr node block Currently, if we call fsync after updating the xattr date belongs to the file, f2fs needs to trigger checkpoint to keep xattr data consistent. But, this policy cause low performance as checkpoint will block most foreground operations and cause unneeded and unrelated IOs around checkpoint. This patch will reuse regular file recovery policy for xattr node block, so, we change to write xattr node block tagged with fsync flag to warm area instead of cold area, and during recovery, we search warm node chain for fsynced xattr block, and do the recovery. So, for below application IO pattern, performance can be improved obviously: - touch file - create/update/delete xattr entry in file - fsync file Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-23 10:10:52 -08:00
Jaegeuk Kim	942fd3192f	f2fs: check last page index in cached bio to decide submission If the cached bio has the last page's index, then we need to submit it. Otherwise, we don't need to submit it and can wait for further IO merges. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-23 10:10:48 -08:00
Jaegeuk Kim	d68f735b3b	f2fs: check io submission more precisely This patch check IO submission more precisely than previous rough check. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-23 10:10:47 -08:00
Jaegeuk Kim	a00861dbca	f2fs: show # of APPEND and UPDATE inodes This patch shows cached # of APPEND and UPDATE inode entries. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 20:54:53 -08:00
DongOh Shin	cac5a3d8f5	f2fs: fix 446 coding style warnings in f2fs.h 1) Nine coding style warnings below have been resolved: "Missing a blank line after declarations" 2) 435 coding style warnings below have been resolved: "function definition argument 'x' should also have an identifier name" 3) Two coding style warnings below have been resolved: "macros should not use a trailing semicolon" Signed-off-by: DongOh Shin <doscode.kr@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 20:24:55 -08:00
DongOh Shin	c64ab12e36	f2fs: fix 3 coding style errors in f2fs.h Two coding style errors below have been resolved: "Macros with complex values should be enclosed in parentheses" And a coding style error below has been resolved: "space prohibited before that ',' (ctx:WxW)" Signed-off-by: DongOh Shin <doscode.kr@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 20:24:55 -08:00
Jaegeuk Kim	dc91de78e5	f2fs: do not preallocate blocks which has wrong buffer Sheng Yong reports needless preallocation if write(small_buffer, large_size) is called. In that case, f2fs preallocates large_size, but vfs returns early due to small_buffer size. Let's detect it before preallocation phase in f2fs. Reported-by: Sheng Yong <shengyong1@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 20:24:48 -08:00
Jaegeuk Kim	dcc9165dbf	f2fs: show # of on-going flush and discard bios This patch adds stat information for flush and discard commands. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 20:24:47 -08:00
Jaegeuk Kim	1546996348	f2fs: add a kernel thread to issue discard commands asynchronously This patch adds a kernel thread to issue discard commands. It proposes three states, D_PREP, D_SUBMIT, and D_DONE to identify current bio status. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 20:24:45 -08:00
Jaegeuk Kim	0b54fb8458	f2fs: factor out discard command info into discard_cmd_control This patch adds discard_cmd_control with the existing discarding controls. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 18:48:53 -08:00
Jaegeuk Kim	b01a92019c	f2fs: clean up flush/discard command namings This patch simply cleans up the names for flush/discard commands. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 18:48:51 -08:00
Chao Yu	599a09b2c1	f2fs: check in-memory nat version bitmap This patch adds a mirror for nat version bitmap, and use it to detect in-memory bitmap corruption which may be caused by bit-transition of cache or memory overflow. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 18:48:49 -08:00
Chao Yu	5fe457430e	f2fs: introduce FI_ATOMIC_COMMIT This patch introduces a new flag to indicate inode status of doing atomic write committing, so that, we can keep atomic write status for inode during atomic committing, then we can skip GCing pages of atomic write inode, that avoids random GCed datas being mixed with current transaction, so isolation of transaction can be kept. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 18:48:48 -08:00
Jaegeuk Kim	25290fa559	f2fs: return fs_trim if there is no candidate If there is no candidate to submit discard command during f2fs_trim_fs, let's return without checkpoint. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-02-22 18:48:40 -08:00
Jaegeuk Kim	4e6a8d9b22	f2fs: relax async discard commands more This patch relaxes async discard commands to avoid waiting its end_io during checkpoint. Instead of waiting them during checkpoint, it will be done when actually reusing them. Test on initial partition of nvme drive. # time fstrim /mnt/test Before : 6.158s After : 4.822s Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-01-29 12:46:01 +09:00
Jaegeuk Kim	26a28a0c1e	f2fs: show the max number of atomic operations This patch adds to show the max number of atomic operations which are conducting concurrently. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-01-29 12:46:01 +09:00
Jaegeuk Kim	0a595ebaaa	f2fs: support IO alignment for DATA and NODE writes This patch implements IO alignment by filling dummy blocks in DATA and NODE write bios. If we can guarantee, for example, 32KB or 64KB for such the IOs, we can eliminate underlying dummy page problem which FTL conducts in order to close MLC or TLC partial written pages. Note that, - it requires "-o mode=lfs". - IO size should be power of 2, not exceed BIO_MAX_PAGES, 256. - read IO is still 4KB. - do checkpoint at fsync, if dummy NODE page was written. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2017-01-29 12:46:01 +09:00
Linus Torvalds	5084fdf081	This merge request includes the dax-4.0-iomap-pmd branch which is needed for both ext4 and xfs dax changes to use iomap for DAX. It also includes the fscrypt branch which is needed for ubifs encryption work as well as ext4 encryption and fscrypt cleanups. Lots of cleanups and bug fixes, especially making sure ext4 is robust against maliciously corrupted file systems --- especially maliciously corrupted xattr blocks and a maliciously corrupted superblock. Also fix ext4 support for 64k block sizes so it works well on ppcle. Fixed mbcache so we don't miss some common xattr blocks that can be merged. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlhQQVEACgkQ8vlZVpUN gaN9TQgAoCD+V4kJjMCFhiV8u6QR3hqD6bOZbggo5wJf4CHglWkmrbAmc3jANOgH CKsXDRRjxuDjPXf1ukB1i4M7ArLYjkbbzKdsu7lismoJLS+w8uwUKSNdep+LYMjD alxUcf5DCzLlUmdOdW4yE22L+CwRfqfs8IpBvKmJb7DrAKiwJVA340ys6daBGuu1 63xYx0QIyPzq0xjqLb6TVf88HUI4NiGVXmlm2wcrnYd5966hEZd/SztOZTVCVWOf Z0Z0fGQ1WJzmaBB9+YV3aBi+BObOx4m2PUprIa531+iEW02E+ot5Xd4vVQFoV/r4 NX3XtoBrT1XlKagy2sJLMBoCavqrKw== =j4KP -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "This merge request includes the dax-4.0-iomap-pmd branch which is needed for both ext4 and xfs dax changes to use iomap for DAX. It also includes the fscrypt branch which is needed for ubifs encryption work as well as ext4 encryption and fscrypt cleanups. Lots of cleanups and bug fixes, especially making sure ext4 is robust against maliciously corrupted file systems --- especially maliciously corrupted xattr blocks and a maliciously corrupted superblock. Also fix ext4 support for 64k block sizes so it works well on ppcle. Fixed mbcache so we don't miss some common xattr blocks that can be merged" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits) dax: Fix sleep in atomic contex in grab_mapping_entry() fscrypt: Rename FS_WRITE_PATH_FL to FS_CTX_HAS_BOUNCE_BUFFER_FL fscrypt: Delay bounce page pool allocation until needed fscrypt: Cleanup page locking requirements for fscrypt_{decrypt,encrypt}_page() fscrypt: Cleanup fscrypt_{decrypt,encrypt}_page() fscrypt: Never allocate fscrypt_ctx on in-place encryption fscrypt: Use correct index in decrypt path. fscrypt: move the policy flags and encryption mode definitions to uapi header fscrypt: move non-public structures and constants to fscrypt_private.h fscrypt: unexport fscrypt_initialize() fscrypt: rename get_crypt_info() to fscrypt_get_crypt_info() fscrypto: move ioctl processing more fully into common code fscrypto: remove unneeded Kconfig dependencies MAINTAINERS: fscrypto: recommend linux-fsdevel for fscrypto patches ext4: do not perform data journaling when data is encrypted ext4: return -ENOMEM instead of success ext4: reject inodes with negative size ext4: remove another test in ext4_alloc_file_blocks() Documentation: fix description of ext4's block_validity mount option ext4: fix checks for data=ordered and journal_async_commit options ...	2016-12-14 09:17:42 -08:00
Linus Torvalds	09cb6464fe	for-f2fs-4.10 This patch series contains several performance tuning patches regarding to the IO submission flow, in addition to supporting new features such as a ZBC-base drive and multiple devices. It also includes some major bug fixes such as: - checkpoint version control - fdatasync-related roll-forward recovery routine - memory boundary or null-pointer access in corner cases - missing error cases It has various minor clean-up patches as well. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYTx44AAoJEEAUqH6CSFDSnAQP/jeYJq5Zd0bweEF5g00Ec1Qg qNKQ57e9EHDRaDLBUmHHEaCEPRL0bw6SOUUWWqzGA07KcsIK+Yb/dGAyIcuV7WMl PjntVbYm4yARDYBHGupdOCzFSkzr8gDalb+98jJnoGUonsftljhES9jedQ1NjAms GFPHDNtirZM/r0bjKkYKjpqJ6FCxFxcGPfb/GtohDajIpohWfKZiemaXGTgtYR4d iBVek16h+Hprz90ycZBY69uz0TdAwu/gb+htMVBrAdExHWvlFzgp35OIywiAB/YX 3QD/x4t2HqOBaNYiiOAY4ukVW/Yyqa/ZAzbm+m5B5CAcFYiWXMy+cMXUY9HJJ/K0 wdvi//Avtvgpp2PVZFn2pASx14vgMFylBzuNgKpP6MPdtWTEL33jT7VYs9Nuz45E dgZ9IpiDt4DeTRuZ4mPO5iH7bVHPvAVV80bpXzirCCzDeNZ1EFFIQzXh/2UAmCxI twPXGBIYul0aIl9JkWAyhCZSd3XDSqedpfPudknjhzM9Xb1H5X0QJco7f/UwsWXH WxV6lHr1Q7UH96wJ7x/GAqj8ArOAASRV18+K51dqU+DWHnFPpBArJe39FVf8NGWs Fz1ZmlWBQ0ZgzvLkGa80llhjalXIEy/JabMrpy6VrzQGxHdmW4cVxe4dJ3710WxX VysJUcNMRKxMUTWOKsxp =Boum -----END PGP SIGNATURE----- Merge tag 'for-f2fs-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch series contains several performance tuning patches regarding to the IO submission flow, in addition to supporting new features such as a ZBC-base drive and multiple devices. It also includes some major bug fixes such as: - checkpoint version control - fdatasync-related roll-forward recovery routine - memory boundary or null-pointer access in corner cases - missing error cases It has various minor clean-up patches as well" * tag 'for-f2fs-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (66 commits) f2fs: fix a missing size change in f2fs_setattr f2fs: fix to access nullified flush_cmd_control pointer f2fs: free meta pages if sanity check for ckpt is failed f2fs: detect wrong layout f2fs: call sync_fs when f2fs is idle Revert "f2fs: use percpu_counter for # of dirty pages in inode" f2fs: return AOP_WRITEPAGE_ACTIVATE for writepage f2fs: do not activate auto_recovery for fallocated i_size f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack f2fs: fix 32-bit build f2fs: set ->owner for debugfs status file's file_operations f2fs: fix incorrect free inode count in ->statfs f2fs: drop duplicate header timer.h f2fs: fix wrong AUTO_RECOVER condition f2fs: do not recover i_size if it's valid f2fs: fix fdatasync f2fs: fix to account total free nid correctly f2fs: fix an infinite loop when flush nodes in cp f2fs: don't wait writeback for datas during checkpoint f2fs: fix wrong written_valid_blocks counting ...	2016-12-14 09:07:36 -08:00
Eric Biggers	db717d8e26	fscrypto: move ioctl processing more fully into common code Multiple bugs were recently fixed in the "set encryption policy" ioctl. To make it clear that fscrypt_process_policy() and fscrypt_get_policy() implement ioctls and therefore their implementations must take standard security and correctness precautions, rename them to fscrypt_ioctl_set_policy() and fscrypt_ioctl_get_policy(). Make the latter take in a struct file * to make it consistent with the former. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2016-12-11 16:26:07 -05:00
Jaegeuk Kim	5eba8c5d1f	f2fs: fix to access nullified flush_cmd_control pointer f2fs_sync_file() remount_ro - f2fs_readonly - destroy_flush_cmd_control - f2fs_issue_flush - no fcc pointer! So, this patch doesn't free fcc in this case, but just stop its kernel thread which sends flush commands. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-12-07 18:56:50 -08:00
Jaegeuk Kim	204706c7ac	Revert "f2fs: use percpu_counter for # of dirty pages in inode" This reverts commit `1beba1b3a9`. The perpcu_counter doesn't provide atomicity in single core and consume more DRAM. That incurs fs_mark test failure due to ENOMEM. Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-12-05 11:43:59 -08:00
Jaegeuk Kim	26787236b3	f2fs: do not activate auto_recovery for fallocated i_size If a file needs to keep its i_size by fallocate, we need to turn off auto recovery during roll-forward recovery. This will resolve the below scenario. 1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4096" -c "fsync" 2. xfs_io -f /mnt/f2fs/file -c "falloc -k 4096 4096" -c "fsync" 3. md5sum /mnt/f2fs/file; 4. godown /mnt/f2fs/ 5. umount /mnt/f2fs/ 6. mount -t f2fs /dev/sdx /mnt/f2fs 7. md5sum /mnt/f2fs/file Reported-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-29 15:42:58 -08:00
Jaegeuk Kim	8508e44ae9	f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack We don't guarantee cp_addr is fixed by cp_version. This is to sync with f2fs-tools. Cc: stable@vger.kernel.org Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-28 13:39:58 -08:00
Jaegeuk Kim	97dd26ad83	f2fs: fix wrong AUTO_RECOVER condition If i_size is not aligned to the f2fs's block size, we should not skip inode update during fsync. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-25 10:16:05 -08:00
Chao Yu	281518c694	f2fs: fix fdatasync For below two cases, we can't guarantee data consistence: a) 1. xfs_io "pwrite 0 4195328" "fsync" 2. xfs_io "pwrite 4195328 1024" "fdatasync" 3. godown 4. umount & mount --> isize we updated before fdatasync won't be recovered b) 1. xfs_io "pwrite -S 0xcc 0 4202496" "fsync" 2. xfs_io "fpunch 4194304 4096" "fdatasync" 3. godown 4. umount & mount --> dnode we punched before fdatasync won't be recovered The reason is that normally fdatasync won't be aware of modification of metadata in file, e.g. isize changing, dnode updating, so in ->fsync we will skip flushing node pages for above cases, result in making fdatasynced file being lost during recovery. Currently we have introduced DIRTY_META global list in sbi for tracking dirty inode selectively, so in fdatasync we can choose to flush nodes depend on dirty state of current inode in the list. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-25 10:16:03 -08:00
Chao Yu	04d47e6738	f2fs: fix to account total free nid correctly Thread A Thread B Thread C - f2fs_create - f2fs_new_inode - f2fs_lock_op - alloc_nid alloc last nid - f2fs_unlock_op - f2fs_create - f2fs_new_inode - f2fs_lock_op - alloc_nid as node count still not be increased, we will loop in alloc_nid - f2fs_write_node_pages - f2fs_balance_fs_bg - f2fs_sync_fs - write_checkpoint - block_operations - f2fs_lock_all - f2fs_lock_op While creating new inode, we do not allocate and account nid atomically, so that when there is almost no free nids left, we may encounter deadloop like above stack. In order to avoid that, reuse nm_i::available_nids for accounting free nids and make nid allocation and counting being atomical during node creation. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-25 10:16:01 -08:00
Chao Yu	36951b38d1	f2fs: don't wait writeback for datas during checkpoint Normally, while committing checkpoint, we will wait on all pages to be writebacked no matter the page is data or metadata, so in scenario where there are lots of data IO being submitted with metadata, we may suffer long latency for waiting writeback during checkpoint. Indeed, we only care about persistence for pages with metadata, but not pages with data, as file system consistent are only related to metadate, so in order to avoid encountering long latency in above scenario, let's recognize and reference metadata in submitted IOs, wait writeback only for metadatas. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-25 10:15:59 -08:00
Jaegeuk Kim	7702bdbe50	f2fs: avoid BG_GC in f2fs_balance_fs If many threads hit has_not_enough_free_secs() in f2fs_balance_fs() at the same time, all the threads would do FG_GC or BG_GC. In this critical path, we totally don't need to do BG_GC at all. Let's avoid that. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-25 10:15:57 -08:00
Jaegeuk Kim	a7de608691	f2fs: use err for f2fs_preallocate_blocks This patch has no functional change. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-25 10:15:14 -08:00
Jaegeuk Kim	3c62be17d4	f2fs: support multiple devices This patch implements multiple devices support for f2fs. Given multiple devices by mkfs.f2fs, f2fs shows them entirely as one big volume under one f2fs instance. Internal block management is very simple, but we will modify block allocation and background GC policy to boost IO speed by exploiting them accoording to each device speed. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-25 10:15:13 -08:00
Jaegeuk Kim	6ae1be13e8	f2fs: revert segment allocation for direct IO Now we don't need to be too much careful about storage alignment for dio, since its speed becomes quite fast and we'd better avoid any misalignment first. Revert: `38aa0889b2` (f2fs: align direct_io'ed data to section) Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-25 10:15:02 -08:00
Damien Le Moal	178053e2f1	f2fs: Cache zoned block devices zone type With the zoned block device feature enabled, section discard need to do a zone reset for sections contained in sequential zones, and a regular discard (if supported) for sections stored in conventional zones. Avoid the need for a costly report zones to obtain a section zone type when discarding it by caching the types of the device zones in the super block information. This cache is initialized at mount time for mounts with the zoned block device feature enabled. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-23 12:11:22 -08:00
Damien Le Moal	96ba2decb4	f2fs: Always enable discard for zoned blocks devices Zone write pointer reset acts as discard for zoned block devices. So if the zoned block device feature is enabled, always declare that discard is enabled, even if the device does not actually support the command. For the same reason, prevent the use the "nodicard" mount option. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-23 12:11:19 -08:00
Damien Le Moal	0bfd7a091c	f2fs: Use generic zoned block device terminology SMR stands for "Shingled Magnetic Recording" which makes sense only for hard disk drives (spinning rust). The ZBC/ZAC standards enable management of SMR disks, but solid state drives may also support those standards. So rename the HMSMR feature to BLKZONED to avoid a HDD centric terminology. For the same reason, rename f2fs_sb_mounted_hmsmr to f2fs_sb_mounted_blkzoned. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-23 12:11:16 -08:00
Chao Yu	ed6bd4b146	f2fs: report error of f2fs_fill_dentries Report error of f2fs_fill_dentries to ->iterate_shared, otherwise when error ocurrs, user may just list part of dirents in target directory without any hints. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-23 12:11:12 -08:00
Jaegeuk Kim	35782b233f	f2fs: remove percpu_count due to performance regression This patch removes percpu_count usage due to performance regression in iozone. Fixes: `523be8a6b3` ("f2fs: use percpu_counter for page counters") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-23 12:11:10 -08:00
Jaegeuk Kim	7c45729a4d	f2fs: keep dirty inodes selectively for checkpoint This is to avoid no free segment bug during checkpoint caused by a number of dirty inodes. The case was reported by Chao like this. 1. mount with lazytime option 2. fill 4k file until disk is full 3. sync filesystem 4. read all files in the image 5. umount In this case, we actually don't need to flush dirty inode to inode page during checkpoint. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-23 12:11:08 -08:00
Chao Yu	3a2ad5672b	f2fs: don't interrupt free nids building during nid allocation Let build_free_nids support sync/async methods, in allocation flow of nids, we use synchronuous method, so that we can avoid looping in alloc_nid when free memory is low; in unblock_operations and f2fs_balance_fs_bg we use asynchronuous method in where low memory condition can interrupt us. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-23 12:11:02 -08:00
Chao Yu	b8559dc242	f2fs: split free nid list During free nid allocation, in order to do preallocation, we will tag free nid entry as allocated one and still leave it in free nid list, for other allocators who want to grab free nids, it needs to traverse the free nid list for lookup. It becomes overhead in scenario of allocating free nid intensively by multithreads. This patch splits free nid list to two list: {free,alloc}_nid_list, to keep free nids and preallocated free nids separately, after that, traverse latency will be gone, besides split nid_cnt for separate statistic. Additionally, introduce __insert_nid_to_list and __remove_nid_from_list for cleanup. Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: modify f2fs_bug_on to avoid needless branches] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-11-23 12:11:00 -08:00
Christoph Hellwig	ef295ecf09	block: better op and flags encoding Now that we don't need the common flags to overflow outside the range of a 32-bit type we can encode them the same way for both the bio and request fields. This in addition allows us to place the operation first (and make some room for more ops while we're at it) and to stop having to shift around the operation values. In addition this allows passing around only one value in the block layer instead of two (and eventuall also in the file systems, but we can do that later) and thus clean up a lot of code. Last but not least this allows decreasing the size of the cmd_flags field in struct request to 32-bits. Various functions passing this value could also be updated, but I'd like to avoid the churn for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-10-28 08:48:16 -06:00
Chao Yu	0f34802858	f2fs: support checkpoint error injection This patch adds to support checkpoint error injection in f2fs for testing fatal error tolerance, it will be useful that it can simulate abnormal power off by f2fs itself instead of calling godown ioctl by running apps. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-30 17:34:35 -07:00
Chao Yu	1ecc0c5c50	f2fs: support configuring fault injection per superblock Previously, we only support global fault injection configuration, so that when we configure type/rate of fault injection through sysfs, mount option, it will influence all f2fs partition which is being used. It is not make sence, since it will be not convenient if developer want to test separated partitions with different fault injection rate/type simultaneously, also it's not possible to enable fault injection in one partition and disable fault injection in other one. >From now on, we move global configuration of fault injection in module into per-superblock, hence injection testing can be more flexible. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-30 17:34:31 -07:00
Weichao Guo	5b7a487cf3	f2fs: add customized migrate_page callback This patch improves the migration of dirty pages and allows migrating atomic written pages that F2FS uses in Page Cache. Instead of the fallback releasing page path, it provides better performance for memory compaction, CMA and other users of memory page migrating. For dirty pages, there is no need to write back first when migrating. For an atomic written page before committing, we can migrate the page and update the related 'inmem_pages' list at the same time. Signed-off-by: Weichao Guo <guoweichao@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix some coding style] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2016-09-30 17:34:23 -07:00

1 2 3 4 5 ...

512 Commits