linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-28 22:54:05 +08:00

Author	SHA1	Message	Date
Jingbo Xu	97cf5d53b4	erofs: get rid of unneeded GFP_NOFS Clean up some leftovers since there is no way for EROFS to be called again from a reclaim context. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240124031945.130782-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2024-01-25 11:24:19 +08:00
Gao Xiang	0ee3a0d59e	erofs: enable sub-page compressed block support Let's just disable cached decompression and inplace I/Os for partial pages as the first step in order to enable sub-page block initial support. In other words, currently it works primarily based on temporary short-lived pages. Don't expect too much in terms of performance. Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20231206091057.87027-6-hsiangkao@linux.alibaba.com	2023-12-18 15:49:39 +08:00
Ferry Meng	914fa861e3	erofs: simplify erofs_read_inode() After commit `1c7f49a767` ("erofs: tidy up EROFS on-disk naming"), there is a unique `union erofs_inode_i_u` so that we could parse the union directly. Besides, it also replaces `inode->i_sb` with `sb` for simplicity. Signed-off-by: Ferry Meng <mengferry@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20231109111822.17944-1-mengferry@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-11-17 19:55:34 +08:00
Jeff Layton	594370f7e8	erofs: convert to new timestamp accessors Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-30-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2023-10-18 13:26:21 +02:00
Linus Torvalds	615e95831e	v6.6-vfs.ctime -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTKAAKCRCRxhvAZXjc oifJAQCzi/p+AdQu8LA/0XvR7fTwaq64ZDCibU4BISuLGT2kEgEAuGbuoFZa0rs2 XYD/s4+gi64p9Z01MmXm2XO1pu3GPg0= =eJz5 -----END PGP SIGNATURE----- Merge tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs timestamp updates from Christian Brauner: "This adds VFS support for multi-grain timestamps and converts tmpfs, xfs, ext4, and btrfs to use them. This carries acks from all relevant filesystems. The VFS always uses coarse-grained timestamps when updating the ctime and mtime after a change. This has the benefit of allowing filesystems to optimize away a lot of metadata updates, down to around 1 per jiffy, even when a file is under heavy writes. Unfortunately, this has always been an issue when we're exporting via NFSv3, which relies on timestamps to validate caches. A lot of changes can happen in a jiffy, so timestamps aren't sufficient to help the client decide to invalidate the cache. Even with NFSv4, a lot of exported filesystems don't properly support a change attribute and are subject to the same problems with timestamp granularity. Other applications have similar issues with timestamps (e.g., backup applications). If we were to always use fine-grained timestamps, that would improve the situation, but that becomes rather expensive, as the underlying filesystem would have to log a lot more metadata updates. This introduces fine-grained timestamps that are used when they are actively queried. This uses the 31st bit of the ctime tv_nsec field to indicate that something has queried the inode for the mtime or ctime. When this flag is set, on the next mtime or ctime update, the kernel will fetch a fine-grained timestamp instead of the usual coarse-grained one. As POSIX generally mandates that when the mtime changes, the ctime must also change the kernel always stores normalized ctime values, so only the first 30 bits of the tv_nsec field are ever used. Filesytems can opt into this behavior by setting the FS_MGTIME flag in the fstype. Filesystems that don't set this flag will continue to use coarse-grained timestamps. Various preparatory changes, fixes and cleanups are included: - Fixup all relevant places where POSIX requires updating ctime together with mtime. This is a wide-range of places and all maintainers provided necessary Acks. - Add new accessors for inode->i_ctime directly and change all callers to rely on them. Plain accesses to inode->i_ctime are now gone and it is accordingly rename to inode->__i_ctime and commented as requiring accessors. - Extend generic_fillattr() to pass in a request mask mirroring in a sense the statx() uapi. This allows callers to pass in a request mask to only get a subset of attributes filled in. - Rework timestamp updates so it's possible to drop the @now parameter the update_time() inode operation and associated helpers. - Add inode_update_timestamps() and convert all filesystems to it removing a bunch of open-coding" * tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (107 commits) btrfs: convert to multigrain timestamps ext4: switch to multigrain timestamps xfs: switch to multigrain timestamps tmpfs: add support for multigrain timestamps fs: add infrastructure for multigrain timestamps fs: drop the timespec64 argument from update_time xfs: have xfs_vn_update_time gets its own timestamp fat: make fat_update_time get its own timestamp fat: remove i_version handling from fat_update_time ubifs: have ubifs_update_time use inode_update_timestamps btrfs: have it use inode_update_timestamps fs: drop the timespec64 arg from generic_update_time fs: pass the request_mask to generic_fillattr fs: remove silly warning from current_time gfs2: fix timestamp handling on quota inodes fs: rename i_ctime field to __i_ctime selinux: convert to ctime accessor functions security: convert to ctime accessor functions apparmor: convert to ctime accessor functions sunrpc: convert to ctime accessor functions ...	2023-08-28 09:31:32 -07:00
Jeff Layton	0d72b92883	fs: pass the request_mask to generic_fillattr generic_fillattr just fills in the entire stat struct indiscriminately today, copying data from the inode. There is at least one attribute (STATX_CHANGE_COOKIE) that can have side effects when it is reported, and we're looking at adding more with the addition of multigrain timestamps. Add a request_mask argument to generic_fillattr and have most callers just pass in the value that is passed to getattr. Have other callers (e.g. ksmbd) just pass in STATX_BASIC_STATS. Also move the setting of STATX_CHANGE_COOKIE into generic_fillattr. Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Paulo Alcantara (SUSE)" <pc@manguebit.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230807-mgctime-v7-2-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2023-08-09 08:56:36 +02:00
Jeff Layton	7be935e18e	erofs: convert to ctime accessor functions In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime. Acked-by: Gao Xiang <xiang@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230705190309.579783-37-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2023-07-13 10:28:06 +02:00
Xin Yin	18bddc5b67	erofs: fix fsdax unavailability for chunk-based regular files DAX can be used to share page cache between VMs, reducing guest memory overhead. And chunk based data format is widely used for VM and container image. So enable dax support for it, make erofs better used for VM scenarios. Fixes: `c5aa903a59` ("erofs: support reading chunk-based uncompressed files") Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230711062130.7860-1-yinxin.x@bytedance.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-07-12 00:50:56 +08:00
Gao Xiang	10656f9ca6	erofs: sunset erofs_dbg() Such debug messages are rarely used now. Let's get rid of these, and revert locally if they are needed for debugging. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230414083027.12307-1-hsiangkao@linux.alibaba.com	2023-04-17 01:15:54 +08:00
Gao Xiang	4fdadd5b0f	erofs: get rid of z_erofs_fill_inode() Prior to big pclusters, non-compact compression indexes could have empty headers. Let's just avoid the legacy path since it can be handled properly as a specific compression header with z_erofs_fill_inode_lazy() too. Tested with erofs-utils exist versions. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230413092241.73829-1-hsiangkao@linux.alibaba.com	2023-04-17 01:15:53 +08:00
Jingbo Xu	d3c4bdcc75	erofs: set block size to the on-disk block size Set the block size to that specified in on-disk superblock. Also remove the hard constraint of PAGE_SIZE block size for the uncompressed device backend. This constraint is temporarily remained for compressed device and fscache backend, as there is more work needed to handle the condition where the block size is not equal to PAGE_SIZE. It is worth noting that the on-disk block size is read prior to erofs_superblock_csum_verify(), as the read block size is needed in the latter. Besides, later we are going to make erofs refer to tar data blobs (which is 512-byte aligned) for OCI containers, where the block size is 512 bytes. In this case, the 512-byte block size may not be adequate for a directory to contain enough dirents. To fix this, we are also going to introduce directory block size independent on the block size. Due to we have already supported block size smaller than PAGE_SIZE now, disable all these images with such separated directory block size until we supported this feature later. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230313135309.75269-3-jefflexu@linux.alibaba.com [ Gao Xiang: update documentation. ] Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-04-17 01:15:45 +08:00
Jingbo Xu	3acea5fc33	erofs: avoid hardcoded blocksize for subpage block support As the first step of converting hardcoded blocksize to that specified in on-disk superblock, convert all call sites of hardcoded blocksize to sb->s_blocksize except for: 1) use sbi->blkszbits instead of sb->s_blocksize in erofs_superblock_csum_verify() since sb->s_blocksize has not been updated with the on-disk blocksize yet when the function is called. 2) use inode->i_blkbits instead of sb->s_blocksize in erofs_bread(), since the inode operated on may be an anonymous inode in fscache mode. Currently the anonymous inode is allocated from an anonymous mount maintained in erofs, while in the near future we may allocate anonymous inodes from a generic API directly and thus have no access to the anonymous inode's i_sb. Thus we keep the block size in i_blkbits for anonymous inodes in fscache mode. Be noted that this patch only gets rid of the hardcoded blocksize, in preparation for actually setting the on-disk block size in the following patch. The hard limit of constraining the block size to PAGE_SIZE still exists until the next patch. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230313135309.75269-2-jefflexu@linux.alibaba.com [ Gao Xiang: fold a patch to fix incorrect truncated offsets. ] Link: https://lore.kernel.org/r/20230413035734.15457-1-zhujia.zj@bytedance.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-04-17 01:15:44 +08:00
Linus Torvalds	dc483c851f	Changes since last update: - Add per-cpu kthreads for low-latency decompression for Android use cases; - Get rid of tagged pointer helpers since they are rarely used now; - Several code cleanups to reduce codebase; - Documentation and MAINTAINERS updates. -----BEGIN PGP SIGNATURE----- iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCY/IDjhEceGlhbmdAa2Vy bmVsLm9yZwAKCRA5NzHcH7XmBNbTAQDT2njll/B2JSYbVC2I2HYTZSyFXEaHhH+M 6gHRbEhTWAD/VNiAcdE600IkUwut/78tDvwlz/XJSd2JQMMwkTSviwc= =oroQ -----END PGP SIGNATURE----- Merge tag 'erofs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "The most noticeable feature for this cycle is per-CPU kthread decompression since Android use cases need low-latency I/O handling in order to ensure the app runtime performance, currently unbounded workqueue latencies are not quite good for production on many aarch64 hardwares and thus we need to introduce a deterministic expectation for these. Decompression is CPU-intensive and it is sleepable for EROFS, so other alternatives like decompression under softirq contexts are not considered. More details are in the corresponding commit message. Others are random cleanups around the whole codebase and we will continue to clean up further in the next few months. Due to Lunar New Year holidays, some other new features were not completely reviewed and solidified as expected and we may delay them into the next version. Summary: - Add per-cpu kthreads for low-latency decompression for Android use cases - Get rid of tagged pointer helpers since they are rarely used now - Several code cleanups to reduce codebase - Documentation and MAINTAINERS updates" * tag 'erofs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (21 commits) erofs: fix an error code in z_erofs_init_zip_subsystem() erofs: unify anonymous inodes for blob erofs: relinquish volume with mutex held erofs: maintain cookies of share domain in self-contained list erofs: remove unused device mapping in meta routine MAINTAINERS: erofs: Add Documentation/ABI/testing/sysfs-fs-erofs Documentation/ABI: sysfs-fs-erofs: update supported features erofs: remove unused EROFS_GET_BLOCKS_RAW flag erofs: update print symbols for various flags in trace erofs: make kobj_type structures constant erofs: add per-cpu threads for decompression as an option erofs: tidy up internal.h erofs: get rid of z_erofs_do_map_blocks() forward declaration erofs: move zdata.h into zdata.c erofs: remove tagged pointer helpers erofs: avoid tagged pointers to mark sync decompression erofs: get rid of erofs_inode_datablocks() erofs: simplify iloc() erofs: get rid of debug_one_dentry() erofs: remove linux/buffer_head.h dependency ...	2023-02-20 12:23:40 -08:00
Gao Xiang	b780d3fc61	erofs: simplify iloc() Actually we could pass in inodes directly to clean up all callers. Also rename iloc() as erofs_iloc(). Link: https://lore.kernel.org/r/20230114150823.432069-1-xiang@kernel.org Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-02-15 08:11:24 +08:00
Gao Xiang	7c3511a2c8	erofs: clean up erofs_iget() Move inode hash function into inode.c and simplify erofs_iget(). Link: https://lore.kernel.org/r/20230113065226.68801-1-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-02-15 08:10:46 +08:00
Christian Brauner	b74d24f7a7	fs: port ->getattr() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:25 +01:00
Linus Torvalds	4a6bff1187	Changes since the last update: - Enable large folios for iomap/fscache mode; - Avoid sysfs warning due to mounting twice with the same fsid and domain_id in fscache mode; - Refine fscache interface among erofs, fscache, and cachefiles; - Use kmap_local_page() only for metabuf; - Fixes around crafted images found by syzbot; - Minor cleanups and documentation updates. -----BEGIN PGP SIGNATURE----- iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCY5S3khEceGlhbmdAa2Vy bmVsLm9yZwAKCRA5NzHcH7XmBLr3AQDA5xpztSsxfe0Gp+bwf12ySuntimJxXmAj 83EHCfSC+AEAu4fcWkIF38MBBVJvFVjFaXCZKmFossbI5Rp8TuqPpgk= =HDsJ -----END PGP SIGNATURE----- Merge tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "In this cycle, large folios are now enabled in the iomap/fscache mode for uncompressed files first. In order to do that, we've also cleaned up better interfaces between erofs and fscache, which are acked by fscache/netfs folks and included in this pull request. Other than that, there are random fixes around erofs over fscache and crafted images by syzbot, minor cleanups and documentation updates. Summary: - Enable large folios for iomap/fscache mode - Avoid sysfs warning due to mounting twice with the same fsid and domain_id in fscache mode - Refine fscache interface among erofs, fscache, and cachefiles - Use kmap_local_page() only for metabuf - Fixes around crafted images found by syzbot - Minor cleanups and documentation updates" * tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: validate the extent length for uncompressed pclusters erofs: fix missing unmap if z_erofs_get_extent_compressedlen() fails erofs: Fix pcluster memleak when its block address is zero erofs: use kmap_local_page() only for erofs_bread() erofs: enable large folios for fscache mode erofs: support large folios for fscache mode erofs: switch to prepare_ondemand_read() in fscache mode fscache,cachefiles: add prepare_ondemand_read() callback erofs: clean up cached I/O strategies erofs: update documentation erofs: check the uniqueness of fsid in shared domain in advance erofs: enable large folios for iomap mode	2022-12-12 20:14:04 -08:00
Gao Xiang	927e5010ff	erofs: use kmap_local_page() only for erofs_bread() Convert all mapped erofs_bread() users to use kmap_local_page() instead of kmap() or kmap_atomic(). Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-and-tested-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20221018105313.4940-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-12-07 10:56:31 +08:00
Jingbo Xu	e6687b8922	erofs: enable large folios for fscache mode Enable large folios for fscache mode. Enable this feature for non-compressed format for now, until the compression part supports large folios later. One thing worth noting is that, the feature is not enabled for the meta data routine since meta inodes don't need large folios for now, nor do they support readahead yet. Also document this new feature. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Link: https://lore.kernel.org/r/20221201074256.16639-3-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-12-07 10:56:31 +08:00
Jingbo Xu	ce529cc25b	erofs: enable large folios for iomap mode Enable large folios for iomap mode. Then the readahead routine will pass down large folios containing multiple pages. Let's enable this for non-compressed format for now, until the compression part supports large folios later. When large folios supported, the iomap routine will allocate iomap_page for each large folio and thus we need iomap_release_folio() and iomap_invalidate_folio() to free iomap_page when these folios get reclaimed or invalidated. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20221130060455.44532-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-12-07 10:52:06 +08:00
Christian Brauner	cac2f8b8d8	fs: rename current get acl method The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. The current inode operation for getting posix acls takes an inode argument but various filesystems (e.g., 9p, cifs, overlayfs) need access to the dentry. In contrast to the ->set_acl() inode operation we cannot simply extend ->get_acl() to take a dentry argument. The ->get_acl() inode operation is called from: acl_permission_check() -> check_acl() -> get_acl() which is part of generic_permission() which in turn is part of inode_permission(). Both generic_permission() and inode_permission() are called in the ->permission() handler of various filesystems (e.g., overlayfs). So simply passing a dentry argument to ->get_acl() would amount to also having to pass a dentry argument to ->permission(). We should avoid this unnecessary change. So instead of extending the existing inode operation rename it from ->get_acl() to ->get_inode_acl() and add a ->get_acl() method later that passes a dentry argument and which filesystems that need access to the dentry can implement instead of ->get_inode_acl(). Filesystems like cifs which allow setting and getting posix acls but not using them for permission checking during lookup can simply not implement ->get_inode_acl(). This is intended to be a non-functional change. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Suggested-by/Inspired-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2022-10-20 10:13:27 +02:00
Gao Xiang	312fe643ad	erofs: clean up erofs_iget() isdir indicated REQ_META\|REQ_PRIO which no longer works now. Get rid of isdir entirely. Link: https://lore.kernel.org/r/20220927063607.54832-2-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@coolpad.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-27 17:27:45 +08:00
Gao Xiang	1dd73601a1	erofs: fix order >= MAX_ORDER warning due to crafted negative i_size As syzbot reported [1], the root cause is that i_size field is a signed type, and negative i_size is also less than EROFS_BLKSIZ. As a consequence, it's handled as fast symlink unexpectedly. Let's fall back to the generic path to deal with such unusual i_size. [1] https://lore.kernel.org/r/000000000000ac8efa05e7feaa1f@google.com Reported-by: syzbot+f966c13b1b4fc0403b19@syzkaller.appspotmail.com Fixes: `431339ba90` ("staging: erofs: add inode operations") Reviewed-by: Yue Hu <huyue2@coolpad.com> Link: https://lore.kernel.org/r/20220909023948.28925-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-09-20 07:59:32 +08:00
Jeffle Xu	0130e4e8e4	erofs: leave compressed inodes unsupported in fscache mode for now erofs over fscache doesn't support the compressed layout yet. It will cause NULL crash if there are compressed inodes contained when working in fscache mode. So far in the erofs based container image distribution scenarios (RAFS v6), the compressed RAFS v6 images are downloaded and then decompressed on demand as an uncompressed erofs image. Then the erofs image is mounted in fscache mode for containers to use. IOWs, currently compressed data is decompressed on the userspace side instead and uncompressed erofs images will be finally cached. The fscache support for the compressed layout is still under development and it will be used for runtime decompression feature. Anyway, to avoid the potential crash, let's leave the compressed inodes unsupported in fscache mode until we support it later. Fixes: `1442b02b66` ("erofs: implement fscache-based data read for non-inline layout") Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220526010344.118493-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-05-29 15:34:54 +08:00
Jeffle Xu	1442b02b66	erofs: implement fscache-based data read for non-inline layout Implement the data plane of reading data from data blobs over fscache for non-inline layout. Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220425122143.56815-19-jefflexu@linux.alibaba.com Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-05-18 00:11:20 +08:00
Chao Yu	6c459b78d4	erofs: support idmapped mounts This patch enables idmapped mounts for erofs, since all dedicated helpers for this functionality existsm, so, in this patch we just pass down the user_namespace argument from the VFS methods to the relevant helpers. Simple idmap example on erofs image: 1. mkdir dir 2. touch dir/file 3. mkfs.erofs erofs.img dir 4. mount -t erofs -o loop erofs.img /mnt/erofs/ 5. ls -ln /mnt/erofs/ total 0 -rw-rw-r-- 1 1000 1000 0 May 17 15:26 file 6. mount-idmapped --map-mount b:1000:1001:1 /mnt/erofs/ /mnt/scratch_erofs/ 7. ls -ln /mnt/scratch_erofs/ total 0 -rw-rw-r-- 1 1001 1001 0 May 17 15:26 file Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Chao Yu <chao.yu@oppo.com> Link: https://lore.kernel.org/r/20220517104103.3570721-1-chao@kernel.org Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-05-17 23:56:20 +08:00
Gao Xiang	1f7aa6caef	erofs: remove obsoleted comments Some comments haven't been useful anymore since the code updated. Let's drop them instead. Link: https://lore.kernel.org/r/20220506194612.117120-2-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-05-17 23:38:13 +08:00
David Anderson	a1108dcd93	erofs: rename ctime to mtime EROFS images should inherit modification time rather than change time, since users and host tooling have no easy way to control change time. To reflect the new timestamp meaning, i_ctime and i_ctime_nsec are renamed to i_mtime and i_mtime_nsec. Link: https://lore.kernel.org/r/20220311041829.3109511-1-dvander@google.com # v1 Signed-off-by: David Anderson <dvander@google.com> [ Gao Xiang: update document as well. ] Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220317114959.106787-1-hsiangkao@linux.alibaba.com # v2 Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-03-17 23:41:14 +08:00
Gao Xiang	c521e3ad6c	erofs: use meta buffers for inode operations Get rid of old erofs_get_meta_page() within inode operations by using on-stack meta buffers in order to prepare subpage and folio features. Link: https://lore.kernel.org/r/20220102040017.51352-3-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2022-01-04 23:44:46 +08:00
Gao Xiang	e62424651f	erofs: decouple basic mount options from fs_context Previously, EROFS mount options are all in the basic types, so erofs_fs_context can be directly copied with assignment. However, when the multiple device feature is introduced, it's hard to handle multiple device information like the other basic mount options. Let's separate basic mount option usage from fs_context, thus multiple device information can be handled gracefully then. No logic changes. Link: https://lore.kernel.org/r/20211007070224.12833-1-hsiangkao@linux.alibaba.com Reviewed-by: Chao Yu <chao@kernel.org> Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-10-17 23:57:15 +08:00
Gao Xiang	d705117ddd	erofs: fix misbehavior of unsupported chunk format check Unsupported chunk format should be checked with "if (vi->chunkformat & ~EROFS_CHUNK_FORMAT_ALL)" Found when checking with 4k-byte blockmap (although currently mkfs uses inode chunk indexes format by default.) Link: https://lore.kernel.org/r/20210922095141.233938-1-hsiangkao@linux.alibaba.com Fixes: `c5aa903a59` ("erofs: support reading chunk-based uncompressed files") Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-09-23 23:22:04 +08:00
Gao Xiang	1266b4a7ec	erofs: fix double free of 'copied' Dan reported a new smatch warning [1] "fs/erofs/inode.c:210 erofs_read_inode() error: double free of 'copied'" Due to new chunk-based format handling logic, the error path can be called after kfree(copied). Set "copied = NULL" after kfree(copied) to fix this. [1] https://lore.kernel.org/r/202108251030.bELQozR7-lkp@intel.com Link: https://lore.kernel.org/r/20210825120757.11034-1-hsiangkao@linux.alibaba.com Fixes: `c5aa903a59` ("erofs: support reading chunk-based uncompressed files") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-08-25 22:05:58 +08:00
Gao Xiang	c5aa903a59	erofs: support reading chunk-based uncompressed files Add runtime support for chunk-based uncompressed files described in the previous patch. Link: https://lore.kernel.org/r/20210820100019.208490-2-hsiangkao@linux.alibaba.com Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-08-20 22:38:01 +08:00
Gao Xiang	eadcd6b5a1	erofs: add fiemap support with iomap This adds fiemap support for both uncompressed files and compressed files by using iomap infrastructure. Link: https://lore.kernel.org/r/20210813052931.203280-3-hsiangkao@linux.alibaba.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-08-19 00:13:43 +08:00
Gao Xiang	06252e9ce0	erofs: dax support for non-tailpacking regular file DAX is quite useful for some VM use cases in order to save guest memory extremely with minimal lightweight EROFS. In order to prepare for such use cases, add preliminary dax support for non-tailpacking regular files for now. Tested with the DRAM-emulated PMEM and the EROFS image generated by "mkfs.erofs -Enoinline_data enwik9.fsdax.img enwik9" Link: https://lore.kernel.org/r/20210805003601.183063-3-hsiangkao@linux.alibaba.com Cc: nvdimm@lists.linux.dev Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-08-10 00:14:59 +08:00
Huang Jianan	a08e67a028	erofs: iomap support for non-tailpacking DIO Add iomap support for non-tailpacking uncompressed data in order to support DIO and DAX. Direct I/O is useful in certain scenarios for uncompressed files. For example, double pagecache can be avoid by direct I/O when loop device is used for uncompressed files containing upper layer compressed filesystem. This adds iomap DIO support for non-tailpacking cases first and tail-packing inline files are handled in the follow-up patch. Link: https://lore.kernel.org/r/20210805003601.183063-2-hsiangkao@linux.alibaba.com Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-08-10 00:14:42 +08:00
Gao Xiang	c5fcb51111	erofs: clean up file headers & footers - Remove my outdated misleading email address; - Get rid of all unnecessary trailing newline by accident. Link: https://lore.kernel.org/r/20210602160634.10757-1-xiang@kernel.org Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2021-06-08 00:41:24 +08:00
Gao Xiang	24a806d849	erofs: add unsupported inode i_format check If any unknown i_format fields are set (may be of some new incompat inode features), mark such inode as unsupported. Just in case of any new incompat i_format fields added in the future. Link: https://lore.kernel.org/r/20210329003614.6583-1-hsiangkao@aol.com Fixes: `431339ba90` ("staging: erofs: add inode operations") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2021-03-29 10:20:45 +08:00
Christian Brauner	549c729771	fs: make helpers idmap mount aware Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches. As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods. Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-01-24 14:27:20 +01:00
Christian Brauner	0d56a4518d	stat: handle idmapped mounts The generic_fillattr() helper fills in the basic attributes associated with an inode. Enable it to handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace before we store the uid and gid. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-12-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-01-24 14:27:17 +01:00
Gao Xiang	d3938ee23e	erofs: derive atime instead of leaving it empty EROFS has _only one_ ondisk timestamp (ctime is currently documented and recorded, we might also record mtime instead with a new compat feature if needed) for each extended inode since EROFS isn't mainly for archival purposes so no need to keep all timestamps on disk especially for Android scenarios due to security concerns. Also, romfs/cramfs don't have their own on-disk timestamp, and squashfs only records mtime instead. Let's also derive access time from ondisk timestamp rather than leaving it empty, and if mtime/atime for each file are really needed for specific scenarios as well, we can also use xattrs to record them then. Link: https://lore.kernel.org/r/20201031195102.21221-1-hsiangkao@aol.com [ Gao Xiang: It'd be better to backport for user-friendly concern. ] Fixes: `431339ba90` ("staging: erofs: add inode operations") Cc: stable <stable@vger.kernel.org> # 4.19+ Reported-by: nl6720 <nl6720@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2020-11-04 09:15:33 +08:00
Gao Xiang	0dcd3c94e0	erofs: fix extended inode could cross boundary Each ondisk inode should be aligned with inode slot boundary (32-byte alignment) because of nid calculation formula, so all compact inodes (32 byte) cannot across page boundary. However, extended inode is now 64-byte form, which can across page boundary in principle if the location is specified on purpose, although it's hard to be generated by mkfs due to the allocation policy and rarely used by Android use case now mainly for > 4GiB files. For now, only two fields `i_ctime_nsec` and `i_nlink' couldn't be read from disk properly and cause out-of-bound memory read with random value. Let's fix now. Fixes: `431339ba90` ("staging: erofs: add inode operations") Cc: <stable@vger.kernel.org> # 4.19+ Link: https://lore.kernel.org/r/20200729175801.GA23973@xiangao.remote.csb Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2020-08-03 21:04:46 +08:00
Alexander A. Klimov	592e7cd00b	erofs: Replace HTTP links with HTTPS ones Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w\|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Link: https://lore.kernel.org/r/20200713130944.34419-1-grandmaster@al2klimov.de Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2020-08-03 21:04:29 +08:00
Chengguang Xu	e7cda1ee94	erofs: code cleanup by removing ifdef macro surrounding Define erofs_listxattr and erofs_xattr_handlers to NULL when CONFIG_EROFS_FS_XATTR is not enabled, then we can remove many ugly ifdef macros in the code. Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20200526090343.22794-1-cgxu519@mykernel.net Signed-off-by: Gao Xiang <hsiangkao@redhat.com>	2020-05-27 16:46:20 +08:00
Gao Xiang	4231138fe0	erofs: always use iget5_locked As Christoph said [1] [2], "Just use the slightly more complicated 32-bit version everywhere so that you have a single actually tested code path. And then remove this helper. " [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ [2] https://lore.kernel.org/r/20190902125320.GA16726@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-25-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 20:10:09 +02:00
Gao Xiang	4f761fa253	erofs: rename errln/infoln/debugln to erofs_{err, info, dbg} Add prefix "erofs_" to these functions and print sb->s_id as a prefix to erofs_{err, info} so that the user knows which file system is affected. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-23-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 20:10:09 +02:00
Gao Xiang	84947eb603	erofs: save one level of indentation As Christoph said [1], ".. and save one level of indentation." [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-22-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 20:10:09 +02:00
Gao Xiang	e2c71e74b2	erofs: kill all erofs specific fault injection As Christoph suggested [1], "Please just use plain kmalloc everywhere and let the normal kernel error injection code take care of injeting any errors." [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-20-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 20:10:08 +02:00
Gao Xiang	99634bf388	erofs: add "erofs_" prefix for common and short functions Add erofs_ prefix to free_inode, alloc_inode, ... Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-19-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 20:10:08 +02:00
Gao Xiang	e655b5b3a2	erofs: kill prio and nofail of erofs_get_meta_page() As Christoph pointed out [1], "Why is there __erofs_get_meta_page with the two weird booleans instead of a single erofs_get_meta_page that gets and gfp_t for additional flags and an unsigned int for additional bio op flags." And since all callers can handle errors, let's kill prio and nofail and erofs_get_inline_page() now. [1] https://lore.kernel.org/r/20190830162812.GA10694@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-17-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 20:10:08 +02:00

1 2

58 Commits