In this round, we've got a huge number of patches that improve code readability
along with minor bug fixes, while we've mainly fixed some critical issues in
recently-added per-block age-based extent_cache, atomic write support, and some
folio cases.
Enhancement:
- add sysfs nodes to set last_age_weight and manage discard_io_aware_gran
- show ipu policy in debugfs
- reduce stack memory cost by using bitfield in struct f2fs_io_info
- introduce trace_f2fs_replace_atomic_write_block
- enhance iostat support and adds flush commands
Bug fix:
- revert "f2fs: truncate blocks in batch in __complete_revoke_list()"
- fix kernel crash on the atomic write abort flow
- call clear_page_private_reference in .{release,invalid}_folio
- support .migrate_folio for compressed inode
- fix cgroup writeback accounting with fs-layer encryption
- retry to update the inode page given data corruption
- fix kernel crash due to null io->bio
- fix some bugs in per-block age-based extent_cache:
a. wrong calculation of block age
b. update age extent in f2fs_do_zero_range()
c. update age extent correctly during truncation
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmP9M/cACgkQQBSofoJI
UNIX1Q//Yp+nDeY91H3IO6aMSHPqRoBBnVTr8ERtLUF0fuQbBkzcQE+t8cMSoYDM
88sxoC+F7UnovNr84VeuKlHN4hYyAXuxj8OZetXI3XX+yiO+auEPtljGA0BaGwqL
93lIg8nIQ2ing/oZ/4+h4dvpYCPrhKOQS6h1sHhIWlql6Wxwxq01uA47i0Ni6y/o
D23JPFaDQfumN8qy1bm5xfLRhTQmaE35n5NhcBJpUD/rGK92NPXv7RLKPgZc3tKN
tJmL+NXa3NNEx5e5TSP7JX+rhD7KlL5XlB/m8LbpPIx338I0pt0uzAc7nTIOlzM3
DI0q3HXe9U3+JBHi+rKxkIniiRDmvhPx3NzdgcsYg75EnwdrNazsRHulBUEAXB4v
ghHbx53OxA2uSnUVbkY4HXNYf7cYjrx5vbX0oqEx48btBCC8KGFIcHtI72tIBee3
xdCxoM3e2AWijkFBOCkThXNuNNbdifQzn2e7xR7W+o0L9hwdR5t7tHhHT+cqG9Ox
6UKWoZZUjYUAV/YQT5Qh6570GsGncM8gHAUMz7DVTIOB9wYkHtb0Q9tUmoQccwf5
46r84c4/jxUQHt8SkXBBl1aiqHmR7EF17YvM5AXIdG3DqqOy70NWkM48hMOyIy2G
eY/wTVJwpd6oDveQPtxaHn7Wo3jSPNUlidOfAYQ7Itm34VXY/zc=
=yXLl
-----END PGP SIGNATURE-----
Merge tag 'f2fs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've got a huge number of patches that improve code
readability along with minor bug fixes, while we've mainly fixed some
critical issues in recently-added per-block age-based extent_cache,
atomic write support, and some folio cases.
Enhancements:
- add sysfs nodes to set last_age_weight and manage
discard_io_aware_gran
- show ipu policy in debugfs
- reduce stack memory cost by using bitfield in struct f2fs_io_info
- introduce trace_f2fs_replace_atomic_write_block
- enhance iostat support and adds flush commands
Bug fixes:
- revert "f2fs: truncate blocks in batch in __complete_revoke_list()"
- fix kernel crash on the atomic write abort flow
- call clear_page_private_reference in .{release,invalid}_folio
- support .migrate_folio for compressed inode
- fix cgroup writeback accounting with fs-layer encryption
- retry to update the inode page given data corruption
- fix kernel crash due to NULL io->bio
- fix some bugs in per-block age-based extent_cache:
- wrong calculation of block age
- update age extent in f2fs_do_zero_range()
- update age extent correctly during truncation"
* tag 'f2fs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (81 commits)
f2fs: drop unnecessary arg for f2fs_ioc_*()
f2fs: Revert "f2fs: truncate blocks in batch in __complete_revoke_list()"
f2fs: synchronize atomic write aborts
f2fs: fix wrong segment count
f2fs: replace si->sbi w/ sbi in stat_show()
f2fs: export ipu policy in debugfs
f2fs: make kobj_type structures constant
f2fs: fix to do sanity check on extent cache correctly
f2fs: add missing description for ipu_policy node
f2fs: fix to set ipu policy
f2fs: fix typos in comments
f2fs: fix kernel crash due to null io->bio
f2fs: use iostat_lat_type directly as a parameter in the iostat_update_and_unbind_ctx()
f2fs: add sysfs nodes to set last_age_weight
f2fs: fix f2fs_show_options to show nogc_merge mount option
f2fs: fix cgroup writeback accounting with fs-layer encryption
f2fs: fix wrong calculation of block age
f2fs: fix to update age extent in f2fs_do_zero_range()
f2fs: fix to update age extent correctly during truncation
f2fs: fix to avoid potential memory corruption in __update_iostat_latency()
...
In do_read_inode(), sanity check for extent cache should be called after
f2fs_init_read_extent_tree(), fix it.
Fixes: 72840cccc0 ("f2fs: allocate the extent_cache by default")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
.i_compress_level was introduced by commit 3fde13f817 ("f2fs: compress:
support compress level"), but never be used.
This patch updates as below:
- load high 8-bits of on-disk .i_compress_flag to in-memory .i_compress_level
- load low 8-bits of on-disk .i_compress_flag to in-memory .i_compress_flag
- change type of in-memory .i_compress_flag from unsigned short to unsigned
char.
w/ above changes, we can avoid unneeded bit shift whenever during
.init_compress_ctx(), and shrink size of struct f2fs_inode_info.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch tries to use bitfield in struct f2fs_io_info to improve
memory usage.
struct f2fs_io_info {
...
unsigned int need_lock:8; /* indicate we need to lock cp_rwsem */
unsigned int version:8; /* version of the node */
unsigned int submitted:1; /* indicate IO submission */
unsigned int in_list:1; /* indicate fio is in io_list */
unsigned int is_por:1; /* indicate IO is from recovery or not */
unsigned int retry:1; /* need to reallocate block address */
unsigned int encrypted:1; /* indicate file is encrypted */
unsigned int post_read:1; /* require post read */
...
};
After this patch, size of struct f2fs_io_info reduces from 136 to 120.
[Nathan: fix a compile warning (single-bit-bitfield-constant-conversion)]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The current discard_io_aware_gran is a fixed value, change it to be
configurable through the sys node.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Previously, we supported to account iostat io_bytes,
in this patch, it adds to account iostat count and avg_bytes:
time: 1671648667
io_bytes count avg_bytes
[WRITE]
app buffered data: 31 2 15
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
discard_wake and gc_wake have only two values, 0 or 1.
So there is no need to use int type to store them.
BTW, move discard_wake to the end of the
discard_cmd_control structure.
Before:
- sizeof(struct discard_cmd_control): 8392
After move:
- sizeof(struct discard_cmd_control): 8384
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
There is no need to additionally use f2fs_show_injection_info()
to output information. Concatenate time_to_inject() and
__time_to_inject() via a macro. In the new __time_to_inject()
function, pass in the caller function name and parent function.
In this way, we no longer need the f2fs_show_injection_info() function,
and let's remove it.
Suggested-by: Chao Yu <chao@kernel.org>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs_init_compress_mempool() only initializes the memory pool during
the f2fs module init phase. Let's mark it as __init like any other
function.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The create argument is always identicaly to map->m_may_create, so use
that consistently.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This allows to keep the f2fs_do_map_lock based locking scheme
private to data.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
All but three callers of f2fs_lookup_extent_cache just want the block
address. Add a small helper to simplify them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Split __submit_bio into one function each for reads and writes, and a
helper for aligning writes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
NEW_ADDR blocks are purely in-memory preallocated blocks, and thus
equivalent to what the core FS code calls delayed allocations, and not
unwritten extents which do have on-disk blocks allocated from which
reads always return zeroes until they are converted to written status.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
m_flags is never interchanged with the buffer_heads b_flags directly,
so use separate codepoints from that.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
b763f3bedc ("f2fs: restructure f2fs page.private layout") missed
to call clear_page_private_reference() in .{release,invalid}_folio,
fix it, though it's not a big deal since folio_detach_private() was
called to clear all privae info and reference count in the page.
BTW, remove page_private_reference() definition as it never be used.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Commit 3db1de0e58 ("f2fs: change the current atomic write way")
has removed all users of PAGE_PRIVATE_ATOMIC_WRITE, remove its
definition and related functions.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch introduces a runtime hot/cold data separation method
for f2fs, in order to improve the accuracy for data temperature
classification, reduce the garbage collection overhead after
long-term data updates.
Enhanced hot/cold data separation can record data block update
frequency as "age" of the extent per inode, and take use of the age
info to indicate better temperature type for data block allocation:
- It records total data blocks allocated since mount;
- When file extent has been updated, it calculate the count of data
blocks allocated since last update as the age of the extent;
- Before the data block allocated, it searches for the age info and
chooses the suitable segment for allocation.
Test and result:
- Prepare: create about 30000 files
* 3% for cold files (with cold file extension like .apk, from 3M to 10M)
* 50% for warm files (with random file extension like .FcDxq, from 1K
to 4M)
* 47% for hot files (with hot file extension like .db, from 1K to 256K)
- create(5%)/random update(90%)/delete(5%) the files
* total write amount is about 70G
* fsync will be called for .db files, and buffered write will be used
for other files
The storage of test device is large enough(128G) so that it will not
switch to SSR mode during the test.
Benefit: dirty segment count increment reduce about 14%
- before: Dirty +21110
- after: Dirty +18286
Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Introduce f2fs_is_readonly() and use it to simplify code.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() have never
been used since they were introduced by this commit
76f105a2dbcd("f2fs: add feature facility in superblock").
So let's remove them. BTW, convert f2fs_sb_has_##name to return bool.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Through this node, you can control the background discard
to run more aggressively or not aggressively when reach the
utilization rate of the space.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Do cleanup in f2fs_tuning_parameters() and __init_discard_policy(),
let's use macro instead of number.
Suggested-by: Chao Yu <chao@kernel.org>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If compress_extension is set, and a newly created file matches the
extension, the file could be marked as compression file. However,
if inline_data is also enabled, there is no chance to check its
extension since f2fs_should_compress() always returns false.
This patch moves set_compress_inode(), which do extension check, in
f2fs_should_compress() to check extensions before setting inline
data flag.
Fixes: 7165841d57 ("f2fs: fix to check inline_data during compressed inode conversion")
Signed-off-by: Sheng Yong <shengyong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Before this patch, the varibale 'readdir_ra' takes effect if it's equal
to '1' or not, so we can change type for it from 'int' to 'bool'.
Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
introduce a new ioctl to replace the whole content of a file atomically,
which means it induces truncate and content update at the same time.
We can start it with F2FS_IOC_START_ATOMIC_REPLACE and complete it with
F2FS_IOC_COMMIT_ATOMIC_WRITE. Or abort it with
F2FS_IOC_ABORT_ATOMIC_WRITE.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Wei Chen reports a kernel bug as blew:
INFO: task syz-executor.0:29056 blocked for more than 143 seconds.
Not tainted 5.15.0-rc5 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.0 state:D stack:14632 pid:29056 ppid: 6574 flags:0x00000004
Call Trace:
__schedule+0x4a1/0x1720
schedule+0x36/0xe0
rwsem_down_write_slowpath+0x322/0x7a0
fscrypt_ioctl_set_policy+0x11f/0x2a0
__f2fs_ioctl+0x1a9f/0x5780
f2fs_ioctl+0x89/0x3a0
__x64_sys_ioctl+0xe8/0x140
do_syscall_64+0x34/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Eric did some investigation on this issue, quoted from reply of Eric:
"Well, the quality of this bug report has a lot to be desired (not on
upstream kernel, reproducer is full of totally irrelevant stuff, not
sent to the mailing list of the filesystem whose disk image is being
fuzzed, etc.). But what is going on is that f2fs_empty_dir() doesn't
consider the case of a directory with an extremely large i_size on a
malicious disk image.
Specifically, the reproducer mounts an f2fs image with a directory
that has an i_size of 14814520042850357248, then calls
FS_IOC_SET_ENCRYPTION_POLICY on it.
That results in a call to f2fs_empty_dir() to check whether the
directory is empty. f2fs_empty_dir() then iterates through all
3616826182336513 blocks the directory allegedly contains to check
whether any contain anything. i_rwsem is held during this, so
anything else that tries to take it will hang."
In order to solve this issue, let's use f2fs_get_next_page_offset()
to speed up iteration by skipping holes for all below functions:
- f2fs_empty_dir
- f2fs_readdir
- find_in_level
The way why we can speed up iteration was described in
'commit 3cf4574705 ("f2fs: introduce get_next_page_offset to speed
up SEEK_DATA")'.
Meanwhile, in f2fs_empty_dir(), let's use f2fs_find_data_page()
instead f2fs_get_lock_data_page(), due to i_rwsem was held in
caller of f2fs_empty_dir(), there shouldn't be any races, so it's
fine to not lock dentry page during lookuping dirents in the page.
Link: https://lore.kernel.org/lkml/536944df-a0ae-1dd8-148f-510b476e1347@kernel.org/T/
Reported-by: Wei Chen <harperchen1110@gmail.com>
Cc: Eric Biggers <ebiggers@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We need to make sure i_size doesn't change until atomic write commit is
successful and restore it when commit is failed.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The user can set the trial count limit for GC urgent and
idle mode with replaced gc_remaining_trials.. If GC thread gets
to the limit, the mode will turn back to GC normal mode finally.
It was applied only to GC_URGENT, while this patch expands it for
GC_IDLE.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Revert "f2fs: make gc_urgent and gc_segment_mode sysfs node readable".
Add a gc_mode sysfs node to show the current gc_mode as a string.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The current max_ordered_discard is a fixed value, change it to be
configurable through the sys node.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 377224c47118("f2fs: don't split checkpoint in fstrim") obsolete
batch mode and related sysfs entry.
Since this testing sysfs node has been deprecated for a long time, let's
remove it.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch supports to inject fault into f2fs_is_valid_blkaddr() to
simulate accessing inconsistent data/meta block addressses from caller.
Usage:
a) echo 262144 > /sys/fs/f2fs/<dev>/inject_type or
b) mount -o fault_type=262144 <dev> <mountpoint>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This round looks fairly small comparing to the previous updates which includes
mostly minor bug fixes. Nevertheless, as we've still interested in improving
the stability, Chao added some debugging methods to diagnoze subtle runtime
inconsistency problem.
Enhancement
- store all the corruption or failure reasons in superblock
- detect meta inode, summary info, and block address inconsistency
- increase the limit for reserve_root for low-end devices
- add the number of compressed IO in iostat
Bug fix
- DIO write fix for zoned devices
- do out-of-place writes for cold files
- fix some stat updates (FS_CP_DATA_IO, dirty page count)
- fix race condition on setting FI_NO_EXTENT flag
- fix data races when freezing super
- fix wrong continue condition check in GC
- do not allow ATGC for LFS mode
In addition, there're some code enhancement and clean-ups as usual.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmNEVIkACgkQQBSofoJI
UNL/Qg//eu7k196yIKflDZmp5aJbb5ybpFmh7XkPiqAV17ns+R2uLGq68BvTs+Tg
rqCjB7j2kkBh1kN32R7aGcx6tcbHjWc94pi59YTGQ6+pwkop3KJxFHSwAaUw6y34
8NZwmsnrm9rv0A0QPhQPK19yWmG/2smUE9b/u7M3+20I1WANaxdS/vOKbZz/amOu
f/BvsIIGS7Zzm9OpBCvGmq9Qpd83jlH6PuYGTC/OVbCrUiAJEmwN8wGsKP/9qB/5
KxVpdlh3vxulS6ixNbMu2qw9GBAQpAOz50+eDL5ZtGvGIQNHZRpGlfpJoW1lz0EO
4fJtpf5OMGqUbNaPCTG4qQGYAtKWA9YnFeWSS7RViQ6MryRXZMK8ka5eIe5Qblcf
AXD/eU2gKzOu0fuvdBRCt/wTSb4gY8sMNhe4psDsZxfhaYIpX8Ee/XVa4d+Z4frg
irN9gid1k3laMTx9dwJL8m7gIFvy3pak6l3B0bA69fAXd3faI40enuyfubFxnDet
OuRNxj8j3J5C140ag5KOuBCRub2/aPaj9YSQqUstf64d8FzN/Ypn5iVPTs2DP/3D
bcAFBwCS2+MCsk9+ra0WldZ5awdd6CRHDkvaYeDEuLCaLHUCo6CXe3aIyWawJBvJ
RnghKNv82RIV+rQlI1/sg8lseoDnEZTp5iwDGw/qZ+ZUyn05apM=
=aZ9y
-----END PGP SIGNATURE-----
Merge tag 'f2fs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"This round looks fairly small comparing to the previous updates and
includes mostly minor bug fixes. Nevertheless, as we've still
interested in improving the stability, Chao added some debugging
methods to diagnoze subtle runtime inconsistency problem.
Enhancements:
- store all the corruption or failure reasons in superblock
- detect meta inode, summary info, and block address inconsistency
- increase the limit for reserve_root for low-end devices
- add the number of compressed IO in iostat
Bug fixes:
- DIO write fix for zoned devices
- do out-of-place writes for cold files
- fix some stat updates (FS_CP_DATA_IO, dirty page count)
- fix race condition on setting FI_NO_EXTENT flag
- fix data races when freezing super
- fix wrong continue condition check in GC
- do not allow ATGC for LFS mode
In addition, there're some code enhancement and clean-ups as usual"
* tag 'f2fs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits)
f2fs: change to use atomic_t type form sbi.atomic_files
f2fs: account swapfile inodes
f2fs: allow direct read for zoned device
f2fs: support recording errors into superblock
f2fs: support recording stop_checkpoint reason into super_block
f2fs: remove the unnecessary check in f2fs_xattr_fiemap
f2fs: introduce cp_status sysfs entry
f2fs: fix to detect corrupted meta ino
f2fs: fix to account FS_CP_DATA_IO correctly
f2fs: code clean and fix a type error
f2fs: add "c_len" into trace_f2fs_update_extent_tree_range for compressed file
f2fs: fix to do sanity check on summary info
f2fs: port to vfs{g,u}id_t and associated helpers
f2fs: fix to do sanity check on destination blkaddr during recovery
f2fs: let FI_OPU_WRITE override FADVISE_COLD_BIT
f2fs: fix race condition on setting FI_NO_EXTENT flag
f2fs: remove redundant check in f2fs_sanity_check_cluster
f2fs: add static init_idisk_time function to reduce the code
f2fs: fix typo
f2fs: fix wrong dirty page count when race between mmap and fallocate.
...
inode_lock[ATOMIC_FILE] was used for protecting sbi->atomic_files,
update atomic_files variable's type to atomic_t instead of unsigned
int, then inode_lock[ATOMIC_FILE] can be obsoleted.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This reverts dbf8e63f48 ("f2fs: remove device type check for direct IO"),
and apply the below first version, since it contributed out-of-order DIO writes.
For zoned devices, f2fs forbids direct IO and forces buffered IO
to serialize write IOs. However, the constraint does not apply to
read IOs.
Cc: stable@vger.kernel.org
Fixes: dbf8e63f48 ("f2fs: remove device type check for direct IO")
Signed-off-by: Eunhee Rho <eunhee83.rho@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch supports to record detail reason of FSCORRUPTED error into
f2fs_super_block.s_errors[].
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch supports to record stop_checkpoint error into
f2fs_super_block.s_stop_reason[].
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs_inode_info.cp_task was introduced for FS_CP_DATA_IO accounting
since commit b0af6d491a ("f2fs: add app/fs io stat").
However, cp_task usage coverage has been increased due to below
commits:
commit 040d2bb318 ("f2fs: fix to avoid deadloop if data_flush is on")
commit 186857c5a1 ("f2fs: fix potential recursive call when enabling data_flush")
So that, if data_flush mountoption is on, when data flush was
triggered from background, the IO from data flush will be accounted
as checkpoint IO type incorrectly.
In order to fix this issue, this patch splits cp_task into two:
a) cp_task: used for IO accounting
b) wb_task: used to avoid deadlock
Fixes: 040d2bb318 ("f2fs: fix to avoid deadloop if data_flush is on")
Fixes: 186857c5a1 ("f2fs: fix potential recursive call when enabling data_flush")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>