linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-11 04:18:39 +08:00

Author	SHA1	Message	Date
Linus Torvalds	63fa605041	Changes since last update: - Make sure only regular inodes can be used for file-backed mounts; - Two minor codebase cleanups. -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEQ0A6bDUS9Y+83NPFUXZn5Zlu5qoFAmcNHqARHHhpYW5nQGtl cm5lbC5vcmcACgkQUXZn5Zlu5qo1XA/+MFbobJ4bWxJQKnouLlCiFQ5C1xEFbVn2 HasHLfrMIcdz/n/S3Ib4Ayi+9W0zM2Ekq9EG+fuOBqjZP17+EOj3e7OPtVVPNwx0 u2GbD9zNCliZg9PigCfPO+6oImt6l/Mytmx+7bELqbMywAy7JNCNesJuyycsTcja o1I3dNNUZdppilohXPIENTRLjBlOuGBaZdUXDih0LqB+Pb0jgXTP6JfD88h1MLFw xBbhqQ1A/GgyESfsMpZFn2xvFIocLBCIAdAehi9M1AiEwCTjGkTZ66WW3H6Es/Zp vcC9KjHJoGGCXxZf8mnoQHQo/WqQuNUPc2BVf9iExzCo0nwRArcTbAu5Bskqg0LF c+a7FrrxhODz8ioxOOiMUqG4b3/qGkzlk6w5a/t7IRrmFtmcXmPWZ14aI8qpy7o/ CW3iPUoF/zEsmmFvOgJtHwy3g+bC8KhDvz3fqFIDSSMjSKjqb4cPYSe/L5MyhwED wmLgp1uYjEyR0uuqqUp93FEYIbHuO5HpPRT5crLczRIoYn7bXRhjNWLCTmzlLqrj yDAQsrngK99BQ7g0FTQ/OV9si/HRRGsusZmCkeCb6KnRNIvml4X9/WXKc1ioOFk/ 3MSaxlQlTXzCCctjVCDNn9GfD/yR1cXu2sUpGSEnP1ssLG4ARyXGVfoeSw7gJ4xn C5lm9SOmkzU= =0gFG -----END PGP SIGNATURE----- Merge tag 'erofs-for-6.12-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "The main one fixes a syzbot issue due to the invalid inode type out of file-backed mounts. The others are minor cleanups without actual logic changes. Summary: - Make sure only regular inodes can be used for file-backed mounts - Two minor codebase cleanups" * tag 'erofs-for-6.12-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: get rid of kaddr in `struct z_erofs_maprecorder` erofs: get rid of z_erofs_try_to_claim_pcluster() erofs: ensure regular inodes for file-backed mounts	2024-10-14 11:12:09 -07:00
Gao Xiang	ae54567eaa	erofs: get rid of kaddr in `struct z_erofs_maprecorder` `kaddr` becomes useless after switching to metabuf. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241010235830.1535616-1-hsiangkao@linux.alibaba.com	2024-10-11 13:36:58 +08:00
Al Viro	5f60d5f6bb	move asm/unaligned.h to linux/unaligned.h asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h	2024-10-02 17:23:23 -04:00
Gao Xiang	7c3ca1838a	erofs: restrict pcluster size limitations Error out if {en,de}encoded size of a pcluster is unsupported: Maximum supported encoded size (of a pcluster): 1 MiB Maximum supported decoded size (of a pcluster): 12 MiB Users can still choose to use supported large configurations (e.g., for archival purposes), but there may be performance penalties in low-memory scenarios compared to smaller pclusters. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240912074156.2925394-1-hsiangkao@linux.alibaba.com	2024-09-12 23:00:09 +08:00
Hongzhen Luo	1c076f1f4d	erofs: get rid of z_erofs_map_blocks_iter_* tracepoints Consolidate them under erofs_map_blocks_* for simplicity since we have many other ways to know if a given inode is compressed or not. Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240710083459.208362-1-hongzhen@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2024-07-10 18:57:06 +08:00
Gao Xiang	9b32b063be	erofs: ensure m_llen is reset to 0 if metadata is invalid Sometimes, the on-disk metadata might be invalid due to user interrupts, storage failures, or other unknown causes. In that case, z_erofs_map_blocks_iter() may still return a valid m_llen while other fields remain invalid (e.g., m_plen can be 0). Due to the return value of z_erofs_scan_folio() in some path will be ignored on purpose, the following z_erofs_scan_folio() could then use the invalid value by accident. Let's reset m_llen to 0 to prevent this. Link: https://lore.kernel.org/r/20240629185743.2819229-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2024-06-30 10:54:28 +08:00
Al Viro	4afe6b8d21	erofs: don't round offset down for erofs_read_metabuf() There's only one place where struct z_erofs_maprecorder ->kaddr is used not in the same function that has assigned it - the value read in unpack_compacted_index() gets calculated in z_erofs_load_compact_lcluster(). With minor massage we can switch to storing it with offset in block already added. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20240425195944.GE1031757@ZenIV Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2024-05-18 01:52:48 +08:00
Al Viro	076d965eb8	erofs: don't align offset for erofs_read_metabuf() (simple cases) Most of the callers of erofs_read_metabuf() have the following form: block = erofs_blknr(sb, offset); off = erofs_blkoff(sb, offset); p = erofs_read_metabuf(...., erofs_pos(sb, block), ...); if (IS_ERR(p)) return PTR_ERR(p); q = p + off; // no further uses of p, block or off. The value passed to erofs_read_metabuf() is offset rounded down to block size, i.e. offset - off. Passing offset as-is would increase the return value by off in case of success and keep the return value unchanged in in case of error. In other words, the same could be achieved by q = erofs_read_metabuf(...., offset, ...); if (IS_ERR(q)) return PTR_ERR(q); This commit convert these simple cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20240425195915.GD1031757@ZenIV Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2024-05-18 01:47:26 +08:00
Al Viro	e09815446d	erofs: mechanically convert erofs_read_metabuf() to offsets just lift the call of erofs_pos() into the callers; it will collapse in most of them, but that's better done caller-by-caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20240425195846.GC1031757@ZenIV Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2024-05-18 01:46:18 +08:00
Gao Xiang	7c35de4df1	erofs: Zstandard compression support Add Zstandard compression as the 4th supported algorithm since it becomes more popular now and some end users have asked this for quite a while [1][2]. Each EROFS physical cluster contains only one valid standard Zstandard frame as described in [3] so that decompression can be performed on a per-pcluster basis independently. Currently, it just leverages multi-call stream decompression APIs with internal sliding window buffers. One-shot or bufferless decompression could be implemented later for even better performance if needed. [1] https://github.com/erofs/erofs-utils/issues/6 [2] https://lore.kernel.org/r/Y08h+z6CZdnS1XBm@B-P7TQMD6M-0146.lan [3] https://www.rfc-editor.org/rfc/rfc8478.txt Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240508234453.17896-1-xiang@kernel.org	2024-05-09 07:46:56 +08:00
Gao Xiang	d69189428d	erofs: clean up z_erofs_load_full_lcluster() Only four lcluster types here, remove redundant code. No real logic changes. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240508123357.3266173-1-hsiangkao@linux.alibaba.com	2024-05-08 20:36:43 +08:00
Linus Torvalds	6f3625006b	Changes since last update: - Fix a "BUG: kernel NULL pointer dereference" issue due to inconsistent on-disk indices of compressed inodes against per-sb `available_compr_algs` generated by Syzkaller; - Don't use certain unnecessary folio_() helpers if the folio type (page cache) is known. -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEQ0A6bDUS9Y+83NPFUXZn5Zlu5qoFAmWpO4cRHHhpYW5nQGtl cm5lbC5vcmcACgkQUXZn5Zlu5qrJ7w//UpMasVxNpnZCsaWntDhp8AM9+wQZjosM sc0B1sFjuISQuGfjVEpnlabSudzRRGKI/0R55M8/woa8fuSXJiRNou+bv9Ogi+Aa CJ4E4+TSCGq98rjuuM9gb5L7V36pBp0PtxgANzKskHcq5w5JUNG6f6nhNQqnvRUG M7hBvzzLLz3fRPZZFzdu5S8ekwuBrq8K/PBM7PFfDgbl5IZ0cjLXXIdx61MXTro9 FGGJSRbJsUYg6+sqb0YWmluW4CBiwe7crovp6IaPBU0744Ga+jGyTNrOWAGjW42e 7glsM5MClTfmv17LJK3jV1Dg8EPkKtrhpeTCdECnWnuAyLGKFOT4juNc68GzCieR sSRR+WhmF/B2msAvyH4+gcaULCMAhLiVL1Yf1sfaxC1walEuyEM0EPWEHhAEGXjA BpT6+EZBbYdh24hpyNSNWy/xGMHuiUFy7940yII0o/9cvEbMXNPtIHxA09mOH08X 1tWgLlsLJ69ApIFYD3TkP9yNj22HrxRCQByKvYEe9JsmxwqDayXUP5FQLv1NPNMm ds36PDbNpxAM/cBnQcfPbZSODSWOCkLIHtmOvFP12tiixMG7yc4KY14Wuj3ZyHYr T16BZLlcdobHPapSsxzEQqPTgAYBcvh+6PHXfwnLsoXSYQXoxaUQMX1JREnmC3+I 4nMpKIp3qpY= =knvn -----END PGP SIGNATURE----- Merge tag 'erofs-for-6.8-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: - Fix a "BUG: kernel NULL pointer dereference" issue due to inconsistent on-disk indices of compressed inodes against per-sb `available_compr_algs` generated by Syzkaller - Don't use certain unnecessary folio_() helpers if the folio type (page cache) is known * tag 'erofs-for-6.8-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: Don't use certain unnecessary folio_*() functions erofs: fix inconsistent per-file compression format	2024-01-18 18:12:26 -08:00
Gao Xiang	118a8cf504	erofs: fix inconsistent per-file compression format EROFS can select compression algorithms on a per-file basis, and each per-file compression algorithm needs to be marked in the on-disk superblock for initialization. However, syzkaller can generate inconsistent crafted images that use an unsupported algorithmtype for specific inodes, e.g. use MicroLZMA algorithmtype even it's not set in `sbi->available_compr_algs`. This can lead to an unexpected "BUG: kernel NULL pointer dereference" if the corresponding decompressor isn't built-in. Fix this by checking against `sbi->available_compr_algs` for each m_algorithmformat request. Incorrect !erofs_sb_has_compr_cfgs preset bitmap is now fixed together since it was harmless previously. Reported-by: <bugreport@ubisectech.com> Fixes: `8f89926290` ("erofs: get compression algorithms directly on mapping") Fixes: `622ceaddb7` ("erofs: lzma compression support") Reviewed-by: Yue Hu <huyue2@coolpad.com> Link: https://lore.kernel.org/r/20240113150602.1471050-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2024-01-13 23:58:08 +08:00
Gao Xiang	8d2517aaee	erofs: fix up compacted indexes for block size < 4096 Previously, the block size always equaled to PAGE_SIZE, therefore `lclusterbits` couldn't be less than 12. Since sub-page compressed blocks are now considered, `lobits` for a lcluster in each pack cannot always be `lclusterbits` as before. Otherwise, there is no enough room for the special value `Z_EROFS_LI_D0_CBLKCNT`. To support smaller block sizes, `lobits` for each compacted lcluster is now calculated as: lobits = max(lclusterbits, ilog2(Z_EROFS_LI_D0_CBLKCNT) + 1) Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20231206091057.87027-4-hsiangkao@linux.alibaba.com	2023-12-15 01:47:19 +08:00
Gao Xiang	ffa09b3bd0	erofs: DEFLATE compression support Add DEFLATE compression as the 3rd supported algorithm. DEFLATE is a popular generic-purpose compression algorithm for quite long time (many advanced formats like gzip, zlib, zip, png are all based on that) as Apple documentation written "If you require interoperability with non-Apple devices, use COMPRESSION_ZLIB. [1]". Due to its popularity, there are several hardware on-market DEFLATE accelerators, such as (s390) DFLTCC, (Intel) IAA/QAT, (HiSilicon) ZIP accelerator, etc. In addition, there are also several high-performence IP cores and even open-source FPGA approches available for DEFLATE. Therefore, it's useful to support DEFLATE compression in order to find a way to utilize these accelerators for asynchronous I/Os and get benefits from these later. Besides, it's a good choice to trade off between compression ratios and performance compared to LZ4 and LZMA. The DEFLATE core format is simple as well as easy to understand, therefore the code size of its decompressor is small even for the bootloader use cases. The runtime memory consumption is quite limited too (e.g. 32K + ~7K for each zlib stream). As usual, EROFS ourperforms similar approaches too. Alternatively, DEFLATE could still be used for some specific files since EROFS supports multiple compression algorithms in one image. [1] https://developer.apple.com/documentation/compression/compression_algorithm Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230810154859.118330-1-hsiangkao@linux.alibaba.com	2023-08-11 12:11:17 +08:00
Gao Xiang	8241fdd3cd	erofs: clean up zmap.c Several trivial cleanups which aren't quite necessary to split: - Rename lcluster load functions as well as justify full indexes since they are typically used for global deduplication for compressed data; - Avoid unnecessary lines, comments for simplicity. No logic changes. Reviewed-by: Guo Xuenan <guoxuenan@huaweicloud.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230615064421.103178-1-hsiangkao@linux.alibaba.com	2023-06-22 21:16:34 +08:00
Gao Xiang	001b8ccd06	erofs: fix compact 4B support for 16k block size In compact 4B, two adjacent lclusters are packed together as a unit to form on-disk indexes for effective random access, as below: (amortized = 4, vcnt = 2) _____________________________________________ \|___@_____ encoded bits __________\|_ blkaddr _\| 0 . amortized * vcnt = 8 . . . . amortized * vcnt - 4 = 4 . . .____________________________. \|_type (2 bits)_\|_clusterofs_\| Therefore, encoded bits for each pack are 32 bits (4 bytes). IOWs, since each lcluster can get 16 bits for its type and clusterofs, the maximum supported lclustersize for compact 4B format is 16k (14 bits). Fix this to enable compact 4B format for 16k lclusters (blocks), which is tested on an arm64 server with 16k page size. Fixes: `152a333a58` ("staging: erofs: add compacted compression indexes support") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230601112341.56960-1-hsiangkao@linux.alibaba.com	2023-06-18 12:10:53 +08:00
Gao Xiang	10656f9ca6	erofs: sunset erofs_dbg() Such debug messages are rarely used now. Let's get rid of these, and revert locally if they are needed for debugging. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230414083027.12307-1-hsiangkao@linux.alibaba.com	2023-04-17 01:15:54 +08:00
Gao Xiang	4fdadd5b0f	erofs: get rid of z_erofs_fill_inode() Prior to big pclusters, non-compact compression indexes could have empty headers. Let's just avoid the legacy path since it can be handled properly as a specific compression header with z_erofs_fill_inode_lazy() too. Tested with erofs-utils exist versions. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230413092241.73829-1-hsiangkao@linux.alibaba.com	2023-04-17 01:15:53 +08:00
Gao Xiang	cc4efd3dd2	erofs: stop parsing non-compact HEAD index if clusterofs is invalid Syzbot generated a crafted image [1] with a non-compact HEAD index of clusterofs 33024 while valid numbers should be 0 ~ lclustersize-1, which causes the following unexpected behavior as below: BUG: unable to handle page fault for address: fffff52101a3fff9 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 23ffed067 P4D 23ffed067 PUD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 4398 Comm: kworker/u5:1 Not tainted 6.3.0-rc6-syzkaller-g09a9639e56c0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/30/2023 Workqueue: erofs_worker z_erofs_decompressqueue_work RIP: 0010:z_erofs_decompress_queue+0xb7e/0x2b40 ... Call Trace: <TASK> z_erofs_decompressqueue_work+0x99/0xe0 process_one_work+0x8f6/0x1170 worker_thread+0xa63/0x1210 kthread+0x270/0x300 ret_from_fork+0x1f/0x30 Note that normal images or images using compact indexes are not impacted. Let's fix this now. [1] https://lore.kernel.org/r/000000000000ec75b005ee97fbaa@google.com Reported-and-tested-by: syzbot+aafb3f37cfeb6534c4ac@syzkaller.appspotmail.com Fixes: `02827e1796` ("staging: erofs: add erofs_map_blocks_iter") Fixes: `152a333a58` ("staging: erofs: add compacted compression indexes support") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230410173714.104604-1-hsiangkao@linux.alibaba.com	2023-04-17 01:15:49 +08:00
Gao Xiang	1c7f49a767	erofs: tidy up EROFS on-disk naming - Get rid of all "vle" (variable-length extents) expressions since they only expand overall name lengths unnecessarily; - Rename COMPRESSION_LEGACY to COMPRESSED_FULL; - Move on-disk directory definitions ahead of compression; - Drop unused extended attribute definitions; - Move inode ondisk union `i_u` out as `union erofs_inode_i_u`. No actual logical change. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230331063149.25611-1-hsiangkao@linux.alibaba.com	2023-04-17 01:15:46 +08:00
Jingbo Xu	3acea5fc33	erofs: avoid hardcoded blocksize for subpage block support As the first step of converting hardcoded blocksize to that specified in on-disk superblock, convert all call sites of hardcoded blocksize to sb->s_blocksize except for: 1) use sbi->blkszbits instead of sb->s_blocksize in erofs_superblock_csum_verify() since sb->s_blocksize has not been updated with the on-disk blocksize yet when the function is called. 2) use inode->i_blkbits instead of sb->s_blocksize in erofs_bread(), since the inode operated on may be an anonymous inode in fscache mode. Currently the anonymous inode is allocated from an anonymous mount maintained in erofs, while in the near future we may allocate anonymous inodes from a generic API directly and thus have no access to the anonymous inode's i_sb. Thus we keep the block size in i_blkbits for anonymous inodes in fscache mode. Be noted that this patch only gets rid of the hardcoded blocksize, in preparation for actually setting the on-disk block size in the following patch. The hard limit of constraining the block size to PAGE_SIZE still exists until the next patch. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230313135309.75269-2-jefflexu@linux.alibaba.com [ Gao Xiang: fold a patch to fix incorrect truncated offsets. ] Link: https://lore.kernel.org/r/20230413035734.15457-1-zhujia.zj@bytedance.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-04-17 01:15:44 +08:00
Gao Xiang	9ff471800b	erofs: get rid of a useless DBG_BUGON `err` could be -EINTR and it should not be the case. Actually such DBG_BUGON is useless. Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230309053148.9223-2-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-03-09 23:36:01 +08:00
Gao Xiang	999f2f9a63	erofs: get rid of z_erofs_do_map_blocks() forward declaration The code can be neater without forward declarations. Let's get rid of z_erofs_do_map_blocks() forward declaration. Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230204093040.97967-5-hsiangkao@linux.alibaba.com	2023-02-15 08:11:25 +08:00
Gao Xiang	b780d3fc61	erofs: simplify iloc() Actually we could pass in inodes directly to clean up all callers. Also rename iloc() as erofs_iloc(). Link: https://lore.kernel.org/r/20230114150823.432069-1-xiang@kernel.org Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-02-15 08:11:24 +08:00
Siddh Raman Pant	6acd87d509	erofs/zmap.c: Fix incorrect offset calculation Effective offset to add to length was being incorrectly calculated, which resulted in iomap->length being set to 0, triggering a WARN_ON in iomap_iter_done(). Fix that, and describe it in comments. This was reported as a crash by syzbot under an issue about a warning encountered in iomap_iter_done(), but unrelated to erofs. C reproducer: https://syzkaller.appspot.com/text?tag=ReproC&x=1037a6b2880000 Kernel config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=e2021a61197ebe02 Dashboard link: https://syzkaller.appspot.com/bug?extid=a8e049cd3abd342936b6 Reported-by: syzbot+a8e049cd3abd342936b6@syzkaller.appspotmail.com Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Siddh Raman Pant <code@siddh.me> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20221209102151.311049-1-code@siddh.me Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-01-10 16:38:53 +08:00
Gao Xiang	c505feba4c	erofs: validate the extent length for uncompressed pclusters syzkaller reported a KASAN use-after-free: https://syzkaller.appspot.com/bug?extid=2ae90e873e97f1faf6f2 The referenced fuzzed image actually has two issues: - m_pa == 0 as a non-inlined pcluster; - The logical length is longer than its physical length. The first issue has already been addressed. This patch addresses the second issue by checking the extent length validity. Reported-by: syzbot+2ae90e873e97f1faf6f2@syzkaller.appspotmail.com Fixes: `02827e1796` ("staging: erofs: add erofs_map_blocks_iter") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20221205150050.47784-2-hsiangkao@linux.alibaba.com	2022-12-07 10:56:31 +08:00
Gao Xiang	d5d188b8f8	erofs: fix missing unmap if z_erofs_get_extent_compressedlen() fails Otherwise, meta buffers could be leaked. Fixes: `cec6e93bea` ("erofs: support parsing big pcluster compress indexes") Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20221205150050.47784-1-hsiangkao@linux.alibaba.com	2022-12-07 10:56:31 +08:00
Gao Xiang	927e5010ff	erofs: use kmap_local_page() only for erofs_bread() Convert all mapped erofs_bread() users to use kmap_local_page() instead of kmap() or kmap_atomic(). Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-and-tested-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20221018105313.4940-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-12-07 10:56:31 +08:00
Yue Hu	664609e49f	erofs: fix illegal unmapped accesses in z_erofs_fill_inode_lazy() Note that we are still accessing 'h_idata_size' and 'h_fragmentoff' after calling erofs_put_metabuf(), that is not correct. Fix it. Fixes: `ab92184ff8` ("erofs: add on-disk compressed tail-packing inline support") Fixes: `b15b2e307c` ("erofs: support on-disk compressed fragments data") Signed-off-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20221005013528.62977-1-zbestahu@163.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-10-17 06:55:48 +08:00
Gao Xiang	53a7f9961c	erofs: clean up unnecessary code and comments Some conditional macros and comments are useless. Link: https://lore.kernel.org/r/20220927063607.54832-1-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@coolpad.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-27 17:27:25 +08:00
Yue Hu	31da107fdb	erofs: fold in z_erofs_reload_indexes() The name of this function looks not very accurate compared to it's implementation and it's only a wrapper to erofs_read_metabuf(). So, let's fold it directly instead. Signed-off-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220927032518.25266-1-zbestahu@gmail.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-27 14:39:31 +08:00
Gao Xiang	5c2a64252c	erofs: introduce partial-referenced pclusters Due to deduplication for compressed data, pclusters can be partially referenced with their prefixes. Together with the user-space implementation, it enables EROFS variable-length global compressed data deduplication with rolling hash. Link: https://lore.kernel.org/r/20220923014915.4362-1-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@coolpad.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-26 23:55:43 +08:00
Yue Hu	b15b2e307c	erofs: support on-disk compressed fragments data Introduce on-disk compressed fragments data feature. This approach adds a new field called `h_fragmentoff' in the per-file compression header to indicate the fragment offset of each tail pcluster or the whole file in the special packed inode. Similar to ztailpacking, it will also find and record the 'headlcn' of the tail pcluster when initializing per-inode zmap for making follow-on requests more easy. Signed-off-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/YzHKxcFTlHGgXeH9@B-P7TQMD6M-0146.local Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-26 23:55:39 +08:00
Yue Hu	fdffc091e6	erofs: support interlaced uncompressed data for compressed files Currently, uncompressed data is all handled in the shifted way, which means we have to shift the whole on-disk plain pcluster to get the logical data. However, since we are also using in-place I/O for uncompressed data, data copy will be reduced a lot if pcluster is recorded in the interlaced way as illustrated below: _______________________________________________________________ \| \| \| \|_ tail part \|_ head part _\| \|<- blk0 ->\| .. \|<- blkn-2 ->\|<- blkn-1 ->\| The logical data then becomes: ________________________________________________________ \|_ head part _\|_ blk0 _\| .. \|_ blkn-2 _\|_ tail part _\| In addition, non-4k plain pclusters are also survived by the interlaced way, which can be used for non-4k lclusters as well. However, it's almost impossible to de-duplicate uncompressed data in the interlaced way, therefore shifted uncompressed data is still useful. Signed-off-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/8369112678604fdf4ef796626d59b1fdd0745a53.1663898962.git.huyue2@coolpad.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-23 10:55:56 +08:00
Yue Hu	ea0b7b0d59	erofs: avoid the potentially wrong m_plen for big pcluster Actually, 'compressedlcs' stores compressed block count rather than lcluster count. Therefore, the number of bits for shifting the count should be 'LOG_BLOCK_SIZE' rather than 'lclusterbits' although current lcluster size is 4K. The value of 'm_plen' will be wrong once we enable the non 4K-sized lcluster. Signed-off-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220812060150.8510-1-huyue2@coolpad.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-05 23:22:01 +08:00
Gao Xiang	ab474fccd0	erofs: clean up z_erofs_extent_lookback Avoid the unnecessary tail recursion since it can be converted into a loop directly in order to prevent potential stack overflow. It's a pretty straightforward conversion. Link: https://lore.kernel.org/r/20220310182743.102365-1-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-03-17 00:08:48 +08:00
Gao Xiang	d467e980d0	erofs: silence warnings related to impossible m_plen Dan reported two smatch warnings [1], .. warn: should '1 << lclusterbits' be a 64 bit type? .. warn: should 'm->compressedlcs << lclusterbits' be a 64 bit type? In practice, m_plen cannot be more than 1MiB due to on-disk constraint for the compression mode, so we're always safe here. In order to make static analyzers happy and not report again, let's silence them instead. [1] https://lore.kernel.org/r/202203091002.lJVzsX6e-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220310173448.19962-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-03-16 09:39:07 +08:00
Gao Xiang	24331050a3	erofs: fix small compressed files inlining Prior to ztailpacking feature, it's enough that each lcluster has two pclusters at most, and the last pcluster should be turned into an uncompressed pcluster when necessary. For example, _________________________________________________ \|_ pcluster n-2 _\|_ pcluster n-1 _\|____ EOFed ____\| which should be converted into: _________________________________________________ \|_ pcluster n-2 _\|_ pcluster n-1 (uncompressed)' _\| That is fine since either pcluster n-1 or (uncompressed)' takes one physical block. However, after ztailpacking was supported, the game is changed since the last pcluster can be inlined now. And such case above is quite common for inlining small files. Therefore, in order to inline more effectively, special EOF lclusters are now supported which can have three parts at most, as illustrated below: _________________________________________________ \|_ pcluster n-2 _\|_ pcluster n-1 _\|____ EOFed ____\| ^ i_size Actually similar code exists in Yue Hu's original patchset [1], but I removed this part on purpose. After evaluating more real cases with small files, I've changed my mind. [1] https://lore.kernel.org/r/20211215094449.15162-1-huyue2@yulong.com Link: https://lore.kernel.org/r/20220203190203.30794-1-xiang@kernel.org Fixes: `ab92184ff8` ("erofs: add on-disk compressed tail-packing inline support") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-02-04 12:37:12 +08:00
Gao Xiang	09c543798c	erofs: use meta buffers for zmap operations Get rid of old erofs_get_meta_page() within zmap operations by using on-stack meta buffers in order to prepare subpage and folio features. Finally, erofs_get_meta_page() is useless. Get rid of it! Link: https://lore.kernel.org/r/20220102040017.51352-6-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-01-04 23:47:36 +08:00
Yue Hu	ab92184ff8	erofs: add on-disk compressed tail-packing inline support Introduces erofs compressed tail-packing inline support. This approach adds a new field called `h_idata_size' in the per-file compression header to indicate the encoded size of each tail-packing pcluster. At runtime, it will find the start logical offset of the tail pcluster when initializing per-inode zmap and record such extent (headlcn, idataoff) information to the in-memory inode. Therefore, follow-on requests can directly recognize if one pcluster is a tail-packing inline pcluster or not. Link: https://lore.kernel.org/r/20211228054604.114518-6-hsiangkao@linux.alibaba.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-12-31 00:51:10 +08:00
Gao Xiang	622ceaddb7	erofs: lzma compression support Add MicroLZMA support in order to maximize compression ratios for specific scenarios. For example, it's useful for low-end embedded boards and as a secondary algorithm in a file for specific access patterns. MicroLZMA is a new container format for raw LZMA1, which was created by Lasse Collin aiming to minimize old LZMA headers and get rid of unnecessary EOPM (end of payload marker) as well as to enable fixed-sized output compression, especially for 4KiB pclusters. Similar to LZ4, inplace I/O approach is used to minimize runtime memory footprint when dealing with I/O. Overlapped decompression is handled with 1) bounced buffer for data under processing or 2) extra short-lived pages from the on-stack pagepool which will be shared in the same read request (128KiB for example). Link: https://lore.kernel.org/r/20211010213145.17462-8-xiang@kernel.org Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-10-19 23:44:30 +08:00
Gao Xiang	72bb52620f	erofs: introduce the secondary compression head Previously, for each HEAD lcluster, it can be either HEAD or PLAIN lcluster to indicate whether the whole pcluster is compressed or not. In this patch, a new HEAD2 head type is introduced to specify another compression algorithm other than the primary algorithm for each compressed file, which can be used for upcoming LZMA compression and LZ4 range dictionary compression for various data patterns. It has been stayed in the EROFS roadmap for years. Complete it now! Link: https://lore.kernel.org/r/20211017165721.2442-1-xiang@kernel.org Reviewed-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-10-19 23:44:19 +08:00
Gao Xiang	8f89926290	erofs: get compression algorithms directly on mapping Currently, z_erofs_map_blocks_iter() returns whether extents are compressed or not, and the decompression frontend gets the specific algorithms then. It works but not quite well in many aspests, for example: - The decompression frontend has to deal with whether extents are compressed or not again and lookup the algorithms if compressed. It's duplicated and too detailed about the on-disk mapping. - A new secondary compression head will be introduced later so that each file can have 2 compression algorithms at most for different type of data. It could increase the complexity of the decompression frontend if still handled in this way; - A new readmore decompression strategy will be introduced to get better performance for much bigger pcluster and lzma, which needs the specific algorithm in advance as well. Let's look up compression algorithms in z_erofs_map_blocks_iter() directly instead. Link: https://lore.kernel.org/r/20211008200839.24541-2-xiang@kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Reviewed-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-10-18 00:15:55 +08:00
Yue Hu	c40dd3ca2a	erofs: clear compacted_2b if compacted_4b_initial > totalidx Currently, the whole indexes will only be compacted 4B if compacted_4b_initial > totalidx. So, the calculated compacted_2b is worthless for that case. It may waste CPU resources. No need to update compacted_4b_initial as mkfs since it's used to fulfill the alignment of the 1st compacted_2b pack and would handle the case above. We also need to clarify compacted_4b_end here. It's used for the last lclusters which aren't fitted in the previous compacted_2b packs. Some messages are from Xiang. Link: https://lore.kernel.org/r/20210914035915.1190-1-zbestahu@gmail.com Signed-off-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> [ Gao Xiang: it's enough to use "compacted_4b_initial < totalidx". ] Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-09-23 23:23:04 +08:00
Gao Xiang	eadcd6b5a1	erofs: add fiemap support with iomap This adds fiemap support for both uncompressed files and compressed files by using iomap infrastructure. Link: https://lore.kernel.org/r/20210813052931.203280-3-hsiangkao@linux.alibaba.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-08-19 00:13:43 +08:00
Gao Xiang	d95ae5e253	erofs: add support for the full decompressed length Previously, there is no need to get the full decompressed length since EROFS supports partial decompression. However for some other cases such as fiemap, the full decompressed length is necessary for iomap to make it work properly. This patch adds a way to get the full decompressed length. Note that it takes more metadata overhead and it'd be avoided if possible in the performance sensitive scenario. Link: https://lore.kernel.org/r/20210818152231.243691-1-hsiangkao@linux.alibaba.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-08-19 00:13:26 +08:00
Gao Xiang	c5fcb51111	erofs: clean up file headers & footers - Remove my outdated misleading email address; - Get rid of all unnecessary trailing newline by accident. Link: https://lore.kernel.org/r/20210602160634.10757-1-xiang@kernel.org Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-06-08 00:41:24 +08:00
Gao Xiang	0852b6ca94	erofs: fix 1 lcluster-sized pcluster for big pcluster If the 1st NONHEAD lcluster of a pcluster isn't CBLKCNT lcluster type rather than a HEAD or PLAIN type instead, which means its pclustersize _must_ be 1 lcluster (since its uncompressed size < 2 lclusters), as illustrated below: HEAD HEAD / PLAIN lcluster type ____________ ____________ \|_:__________\|_________:__\| file data (uncompressed) . . .____________. \|____________\| pcluster data (compressed) Such on-disk case was explained before [1] but missed to be handled properly in the runtime implementation. It can be observed if manually generating 1 lcluster-sized pcluster with 2 lclusters (thus CBLKCNT doesn't exist.) Let's fix it now. [1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org Fixes: `cec6e93bea` ("erofs: support parsing big pcluster compress indexes") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <xiang@kernel.org>	2021-05-13 15:58:46 +08:00
Gao Xiang	b86269f438	erofs: support parsing big pcluster compact indexes Different from non-compact indexes, several lclusters are packed as the compact form at once and an unique base blkaddr is stored for each pack, so each lcluster index would take less space on avarage (e.g. 2 bytes for COMPACT_2B.) btw, that is also why BIG_PCLUSTER switch should be consistent for compact head0/1. Prior to big pcluster, the size of all pclusters was 1 lcluster. Therefore, when a new HEAD lcluster was scanned, blkaddr would be bumped by 1 lcluster. However, that way doesn't work anymore for big pcluster since we actually don't know the compressed size of pclusters in advance (before reading CBLKCNT lcluster). So, instead, let blkaddr of each pack be the first pcluster blkaddr with a valid CBLKCNT, in detail, 1) if CBLKCNT starts at the pack, this first valid pcluster is itself, e.g. _____________________________________________________________ \|_CBLKCNT0_\|_NONHEAD_\| .. \|_HEAD_\|_CBLKCNT1_\| ... \|_HEAD_\| ... ^ = blkaddr base ^ += CBLKCNT0 ^ += CBLKCNT1 2) if CBLKCNT doesn't start at the pack, the first valid pcluster is the next pcluster, e.g. _________________________________________________________ \| NONHEAD_\| .. \|_HEAD_\|_CBLKCNT0_\| ... \|_HEAD_\|_HEAD_\| ... ^ = blkaddr base ^ += CBLKCNT0 ^ += 1 When a CBLKCNT is found, blkaddr will be increased by CBLKCNT lclusters, or a new HEAD is found immediately, bump blkaddr by 1 instead (see the picture above.) Also noted if CBLKCNT is the end of the pack, instead of storing delta1 (distance of the next HEAD lcluster) as normal NONHEADs, it still uses the compressed block count (delta0) since delta1 can be calculated indirectly but the block count can't. Adjust decoding logic to fit big pcluster compact indexes as well. Link: https://lore.kernel.org/r/20210407043927.10623-9-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-04-10 03:20:18 +08:00

1 2

63 Commits