linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2025-01-10 15:54:39 +08:00

Author	SHA1	Message	Date
Dave Chinner	ebb7fb1557	xfs, iomap: limit individual ioend chain lengths in writeback Trond Myklebust reported soft lockups in XFS IO completion such as this: watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [kworker/12:1:3106] CPU: 12 PID: 3106 Comm: kworker/12:1 Not tainted 4.18.0-305.10.2.el8_4.x86_64 #1 Workqueue: xfs-conv/md127 xfs_end_io [xfs] RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20 Call Trace: wake_up_page_bit+0x8a/0x110 iomap_finish_ioend+0xd7/0x1c0 iomap_finish_ioends+0x7f/0xb0 xfs_end_ioend+0x6b/0x100 [xfs] xfs_end_io+0xb9/0xe0 [xfs] process_one_work+0x1a7/0x360 worker_thread+0x1fa/0x390 kthread+0x116/0x130 ret_from_fork+0x35/0x40 Ioends are processed as an atomic completion unit when all the chained bios in the ioend have completed their IO. Logically contiguous ioends can also be merged and completed as a single, larger unit. Both of these things can be problematic as both the bio chains per ioend and the size of the merged ioends processed as a single completion are both unbound. If we have a large sequential dirty region in the page cache, write_cache_pages() will keep feeding us sequential pages and we will keep mapping them into ioends and bios until we get a dirty page at a non-sequential file offset. These large sequential runs will result in bio and ioend chaining to optimise the io patterns. The pages under writeback are pinned within these chains until the submission chaining is broken, allowing the entire chain to be completed. This can result in huge chains being processed in IO completion context. We get deep bio chaining if we have large contiguous physical extents. We will keep adding pages to the current bio until it is full, then we'll chain a new bio to keep adding pages for writeback. Hence we can build bio chains that map millions of pages and tens of gigabytes of RAM if the page cache contains big enough contiguous dirty file regions. This long bio chain pins those pages until the final bio in the chain completes and the ioend can iterate all the chained bios and complete them. OTOH, if we have a physically fragmented file, we end up submitting one ioend per physical fragment that each have a small bio or bio chain attached to them. We do not chain these at IO submission time, but instead we chain them at completion time based on file offset via iomap_ioend_try_merge(). Hence we can end up with unbound ioend chains being built via completion merging. XFS can then do COW remapping or unwritten extent conversion on that merged chain, which involves walking an extent fragment at a time and running a transaction to modify the physical extent information. IOWs, we merge all the discontiguous ioends together into a contiguous file range, only to then process them individually as discontiguous extents. This extent manipulation is computationally expensive and can run in a tight loop, so merging logically contiguous but physically discontiguous ioends gains us nothing except for hiding the fact that we broke the ioends up into individual physical extents at submission and then need to loop over those individual physical extents at completion. Hence we need to have mechanisms to limit ioend sizes and to break up completion processing of large merged ioend chains: 1. bio chains per ioend need to be bound in length. Pure overwrites go straight to iomap_finish_ioend() in softirq context with the exact bio chain attached to the ioend by submission. Hence the only way to prevent long holdoffs here is to bound ioend submission sizes because we can't reschedule in softirq context. 2. iomap_finish_ioends() has to handle unbound merged ioend chains correctly. This relies on any one call to iomap_finish_ioend() being bound in runtime so that cond_resched() can be issued regularly as the long ioend chain is processed. i.e. this relies on mechanism #1 to limit individual ioend sizes to work correctly. 3. filesystems have to loop over the merged ioends to process physical extent manipulations. This means they can loop internally, and so we break merging at physical extent boundaries so the filesystem can easily insert reschedule points between individual extent manipulations. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reported-and-tested-by: Trond Myklebust <trondmy@hammerspace.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2022-01-26 09:19:20 -08:00
Linus Torvalds	3acbdbf42e	dax + libnvdimm for v5.17 Tags offered after the branch was cut: Reviewed-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/Ydb/3P+8nvjCjYfO@redhat.com Merge tag 'libnvdimm-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull dax and libnvdimm updates from Dan Williams: "The bulk of this is a rework of the dax_operations API after discovering the obstacles it posed to the work-in-progress DAX+reflink support for XFS and other copy-on-write filesystem mechanics. Primarily the need to plumb a block_device through the API to handle partition offsets was a sticking point and Christoph untangled that dependency in addition to other cleanups to make landing the DAX+reflink support easier. The DAX_PMEM_COMPAT option has been around for 4 years and not only are distributions shipping userspace that understand the current configuration API, but some are not even bothering to turn this option on anymore, so it seems a good time to remove it per the deprecation schedule. Recall that this was added after the device-dax subsystem moved from /sys/class/dax to /sys/bus/dax for its sysfs organization. All recent functionality depends on /sys/bus/dax. Some other miscellaneous cleanups and reflink prep patches are included as well. Summary: - Simplify the dax_operations API: - Eliminate bdev_dax_pgoff() in favor of the filesystem maintaining and applying a partition offset to all its DAX iomap operations. - Remove wrappers and device-mapper stacked callbacks for ->copy_from_iter() and ->copy_to_iter() in favor of moving block_device relative offset responsibility to the dax_direct_access() caller. - Remove the need for an @bdev in filesystem-DAX infrastructure - Remove unused uio helpers copy_from_iter_flushcache() and copy_mc_to_iter() as only the non-check_copy_size() versions are used for DAX. - Prepare XFS for the pending (next merge window) DAX+reflink support - Remove deprecated DEV_DAX_PMEM_COMPAT support - Cleanup a straggling misuse of the GUID api" * tag 'libnvdimm-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (38 commits) iomap: Fix error handling in iomap_zero_iter(); ACPI: NFIT: Import GUID before use; dax: remove the copy_from_iter and copy_to_iter methods; dax: remove the DAXDEV_F_SYNC flag; dax: simplify dax_synchronous and set_dax_synchronous; uio: remove copy_from_iter_flushcache() and copy_mc_to_iter(); iomap: turn the byte variable in iomap_zero_iter into a ssize_t; memremap: remove support for external pgmap refcounts; fsdax: don't require CONFIG_BLOCK; iomap: build the block based code conditionally; dax: fix up some of the block device related ifdefs; fsdax: shift partition offset handling into the file systems; dax: return the partition offset from fs_dax_get_by_bdev; iomap: add a IOMAP_DAX flag; xfs: pass the mapping flags to xfs_bmbt_to_iomap; xfs: use xfs_direct_write_iomap_ops for DAX zeroing; xfs: move dax device handling into xfs_{alloc,free}_buftarg; ext4: cleanup the dax handling in ext4_fill_super; ext2: cleanup the dax handling in ext2_fill_super; fsdax: decouple zeroing from the iomap buffered I/O code; ...	2022-01-12 15:46:11 -08:00
Matthew Wilcox (Oracle)	9e05e95ca8	iomap: Fix error handling in iomap_zero_iter() iomap_write_end() does not return a negative errno to indicate an error, but the number of bytes successfully copied. It cannot return an error today, so include a debugging assertion like the one in iomap_unshare_iter(). Fixes: `c6f4046865` ("fsdax: decouple zeroing from the iomap buffered I/O code") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211221044450.517558-1-willy@infradead.org Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2021-12-23 14:04:11 -08:00
Matthew Wilcox (Oracle)	4d7bd0eb72	iomap: Inline __iomap_zero_iter into its caller To make the merge easier, replicate the inlining of __iomap_zero_iter() into iomap_zero_iter() that is currently in the nvdimm tree. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-21 13:51:08 -05:00
Matthew Wilcox (Oracle)	60d8231089	iomap: Support large folios in invalidatepage If we're punching a hole in a large folio, we need to remove the per-folio iomap data as the folio is about to be split and each page will need its own. If a dirty folio is only partially-uptodate, the iomap data contains the information about which blocks cannot be written back, so assert that a dirty folio is fully uptodate. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:08 -05:00
Matthew Wilcox (Oracle)	589110e897	iomap: Convert iomap_migrate_page() to use folios The arguments are still pages for now, but we can use folios internally and cut out a lot of calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:08 -05:00
Matthew Wilcox (Oracle)	e735c00794	iomap: Convert iomap_add_to_ioend() to take a folio We still iterate one block at a time, but now we call compound_head() less often. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:08 -05:00
Matthew Wilcox (Oracle)	81d4782a74	iomap: Simplify iomap_do_writepage() Rename end_offset to end_pos and offset_into_page to poff to match the rest of the file. Simplify the handling of the last page straddling i_size by doing the EOF check based on the byte granularity i_size instead of converting to a pgoff prematurely. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:08 -05:00
Matthew Wilcox (Oracle)	926550362d	iomap: Simplify iomap_writepage_map() Rename end_offset to end_pos and file_offset to pos to match the rest of the file. Simplify the loop by calculating nblocks up front instead of each time around the loop. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:08 -05:00
Matthew Wilcox (Oracle)	6e478521df	iomap,xfs: Convert ->discard_page to ->discard_folio XFS has the only implementation of ->discard_page today, so convert it to use folios in the same patch as converting the API. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:07 -05:00
Matthew Wilcox (Oracle)	9c4ce08dd2	iomap: Convert iomap_write_end_inline to take a folio This conversion is only safe because iomap only supports writes to inline data which starts at the beginning of the file. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-18 00:06:07 -05:00
Matthew Wilcox (Oracle)	bc6123a84a	iomap: Convert iomap_write_begin() and iomap_write_end() to folios These functions still only work in PAGE_SIZE chunks, but there are fewer conversions from tail to head pages as a result of this patch. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:07 -05:00
Matthew Wilcox (Oracle)	a25def1fe5	iomap: Convert __iomap_zero_iter to use a folio The zero iterator can work in folio-sized chunks instead of page-sized chunks. This will save a lot of page cache lookups if the file is cached in large folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:07 -05:00
Matthew Wilcox (Oracle)	d454ab82bc	iomap: Allow iomap_write_begin() to be called with the full length In the future, we want write_begin to know the entire length of the write so that it can choose to allocate large folios. Pass the full length in from __iomap_zero_iter() and limit it where necessary. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-18 00:06:00 -05:00
Matthew Wilcox (Oracle)	ea0f843aa7	iomap: Convert iomap_page_mkwrite to use a folio If we write to any page in a folio, we have to mark the entire folio as dirty, and potentially COW the entire folio, because it'll all get written back as one unit. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	3aa9c659bf	iomap: Convert readahead and readpage to use a folio Handle folios of arbitrary size instead of working in PAGE_SIZE units. readahead_folio() decreases the page refcount for you, so this is not quite a mechanical change. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	874628a2c5	iomap: Convert iomap_read_inline_data to take a folio We still only support up to a single page of inline data (at least, per call to iomap_read_inline_data()), but it can now be written into the middle of a folio in case we decide to allocate a 16KiB page for a file that's 8.1KiB in size. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	431c0566bb	iomap: Use folio offsets instead of page offsets Pass a folio around instead of the page, and make sure the offset is relative to the start of the folio instead of the start of a page. Also use size_t for offset & length to make it clear that these are byte counts, and to support >2GB folios in the future. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	8ffd74e9a8	iomap: Convert bio completions to use folios Use bio_for_each_folio() to iterate over each folio in the bio instead of iterating over each page. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	cd1e5afe55	iomap: Pass the iomap_page into iomap_set_range_uptodate All but one caller already has the iomap_page, so we can avoid getting it again. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	8306a5f563	iomap: Add iomap_invalidate_folio Keep iomap_invalidatepage around as a wrapper for use in address_space operations. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	39f16c8345	iomap: Convert iomap_releasepage to use a folio This is an address_space operation, so its argument must remain as a struct page, but we can use a folio internally. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	c46e8324ca	iomap: Convert iomap_page_release to take a folio iomap_page_release() was also assuming that it was being passed a head page. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:52 -05:00
Matthew Wilcox (Oracle)	435d44b3fd	iomap: Convert iomap_page_create to take a folio This function already assumed it was being passed a head page, so just formalise that. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:51 -05:00
Matthew Wilcox (Oracle)	95c4cd053a	iomap: Convert to_iomap_page to take a folio The big comment about only using a head page can go away now that it takes a folio argument. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2021-12-16 15:49:51 -05:00
Matthew Wilcox (Oracle)	d1bd0b4ebf	fs/buffer: Convert __block_write_begin_int() to take a folio There are no plans to convert buffer_head infrastructure to use large folios, but __block_write_begin_int() is called from iomap, and it's more convenient and less error-prone if we pass in a folio from iomap. It also has a nice saving of almost 200 bytes of code from removing repeated calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>	2021-12-16 15:49:51 -05:00
Christoph Hellwig	de291b5902	iomap: turn the byte variable in iomap_zero_iter into a ssize_t @bytes also holds the return value from iomap_write_end, which can contain a negative error value. As @bytes is always less than the page size even the signed type can hold the entire possible range. Fixes: `c6f4046865` ("fsdax: decouple zeroing from the iomap buffered I/O code") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211208091203.2927754-1-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2021-12-08 07:04:21 -08:00
Christoph Hellwig	c6f4046865	fsdax: decouple zeroing from the iomap buffered I/O code Unshare the DAX and iomap buffered I/O page zeroing code. This code previously did an IS_DAX check deep inside the iomap code, which in fact was the only DAX check in the code. Instead move these checks into the callers. Most callers already have DAX special casing anyway and XFS will need it for reflink support as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20211129102203.2243509-19-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2021-12-04 08:58:53 -08:00
Andreas Gruenbacher	5ad448ce29	iomap: iomap_read_inline_data cleanup Change iomap_read_inline_data to return 0 or an error code; this simplifies the callers. Add a description. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> [djwong: document the return value of iomap_read_inline_data explicitly] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-11-24 10:15:47 -08:00
Andreas Gruenbacher	d8af404ffc	iomap: Fix inline extent handling in iomap_readpage Before commit `740499c784` ("iomap: fix the iomap_readpage_actor return value for inline data"), when hitting an IOMAP_INLINE extent, iomap_readpage_actor would report having read the entire page. Since then, it only reports having read the inline data (iomap->length). This will force iomap_readpage into another iteration, and the filesystem will report an unaligned hole after the IOMAP_INLINE extent. But iomap_readpage_actor (now iomap_readpage_iter) isn't prepared to deal with unaligned extents, it will get things wrong on filesystems with a block size smaller than the page size, and we'll eventually run into the following warning in iomap_iter_advance: WARN_ON_ONCE(iter->processed > iomap_length(iter)); Fix that by changing iomap_readpage_iter to return 0 when hitting an inline extent; this will cause iomap_iter to stop immediately. To fix readahead as well, change iomap_readahead_iter to pass on iomap_readpage_iter return values less than or equal to zero. Fixes: `740499c784` ("iomap: fix the iomap_readpage_actor return value for inline data") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-11-21 16:28:07 -08:00
Andreas Gruenbacher	a6294593e8	iov_iter: Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable Turn iov_iter_fault_in_readable into a function that returns the number of bytes not faulted in, similar to copy_to_user, instead of returning a non-zero value when any of the requested pages couldn't be faulted in. This supports the existing users that require all pages to be faulted in as well as new users that are happy if any pages can be faulted in. Rename iov_iter_fault_in_readable to fault_in_iov_iter_readable to make sure this change doesn't silently break things. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-10-18 16:35:06 +02:00
Christoph Hellwig	fad0a1ab34	iomap: constify iomap_iter_srcmap The srcmap returned from iomap_iter_srcmap is never modified, so mark the iomap returned from it const and constify a lot of code that never modifies the iomap. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	b74b1293e6	iomap: rework unshare flag Instead of another internal flags namespace inside of buffered-io.c, just pass a UNSHARE hint in the main iomap flags field. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	1b5c1e36dc	iomap: pass an iomap_iter to various buffered I/O helpers Pass the iomap_iter structure instead of individual parameters to various internal helpers for buffered I/O. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	253564baff	iomap: switch iomap_page_mkwrite to use iomap_iter Switch iomap_page_mkwrite to use iomap_iter. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	2aa3048e03	iomap: switch iomap_zero_range to use iomap_iter Switch iomap_zero_range to use iomap_iter. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	8fc274d1f4	iomap: switch iomap_file_unshare to use iomap_iter Switch iomap_file_unshare to use iomap_iter. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	ce83a0251c	iomap: switch iomap_file_buffered_write to use iomap_iter Switch iomap_file_buffered_write to use iomap_iter. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	f6d480006c	iomap: switch readahead and readpage to use iomap_iter Switch the page cache read functions to use iomap_iter instead of iomap_apply. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	740499c784	iomap: fix the iomap_readpage_actor return value for inline data The actor should never return a larger value than the length that was passed in. The current code handles this gracefully, but the upcoming iter model will be more picky. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	1acd9e9c01	iomap: mark the iomap argument to iomap_read_page_sync const iomap_read_page_sync never modifies the passed in iomap, so mark it const. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	78c64b00f8	iomap: mark the iomap argument to iomap_read_inline_data const iomap_read_inline_data never modifies the passed in iomap, so mark it const. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Christoph Hellwig	1d25d0aecf	iomap: remove the iomap arguments to ->page_{prepare,done} These aren't actually used by the only instance implementing the methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-16 21:26:33 -07:00
Darrick J. Wong	b69eea82d3	iomap: pass writeback errors to the mapping Modern-day mapping_set_error has the ability to squash the usual negative error code into something appropriate for long-term storage in a struct address_space -- ENOSPC becomes AS_ENOSPC, and everything else becomes EIO. iomap squashes /everything/ to EIO, just as XFS did before that, but this doesn't make sense. Fix this by making it so that we can pass ENOSPC to userspace when writeback fails due to space problems. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>	2021-08-16 12:12:52 -07:00
Matthew Wilcox (Oracle)	ae44f9c286	iomap: Add another assertion to inline data handling Check that the file tail does not cross a page boundary. Requested by Andreas. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-05 10:30:33 -07:00
Matthew Wilcox (Oracle)	ab069d5fdc	iomap: Use kmap_local_page instead of kmap_atomic kmap_atomic() has the side-effect of disabling pagefaults and preemption. kmap_local_page() does not do this and is preferred. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-05 10:30:33 -07:00
Andreas Gruenbacher	f1f264b4c1	iomap: Fix some typos and bad grammar Fix some typos and bad grammar in buffered-io.c to make the comments easier to read. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-03 09:43:14 -07:00
Matthew Wilcox (Oracle)	b405435b41	iomap: Support inline data with block size < page size Remove the restriction that inline data must start on a page boundary in a file. This allows, for example, the first 2KiB to be stored out of line and the trailing 30 bytes to be stored inline. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-03 09:43:13 -07:00
Gao Xiang	69f4a26c1e	iomap: support reading inline data from non-zero pos The existing inline data support only works for cases where the entire file is stored as inline data. For larger files, EROFS stores the initial blocks separately and the remainder of the file ("file tail") adjacent to the inode. Generalise inline data to allow reading the inline file tail. Tails may not cross a page boundary in memory. We currently have no filesystems that support tails and writing, so that case is currently disabled (see iomap_write_begin_inline). Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-03 09:43:13 -07:00
Christoph Hellwig	c1b79f11f4	iomap: simplify iomap_add_to_ioend Now that the outstanding writes are counted in bytes, there is no need to use the low-level __bio_try_merge_page API, we can switch back to always using bio_add_page and simplify iomap_add_to_ioend again. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-08-03 09:43:13 -07:00
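
The three mechanisms listed in commit ebb7fb1557 (bounded ioends at submission, reschedule points during completion, and no merging across physical extent boundaries) can be modelled in userspace roughly as follows. This is only a sketch of the idea, not kernel code: the `Ioend` class, its `phys` field (a physical start address in bytes), `split_for_submission`, `can_merge`, `merge_ioends`, `complete`, and the `max_bytes` limit are all hypothetical names invented for illustration, and the real limit used by the patch is not stated in the log above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Ioend:
    offset: int  # logical file offset, in bytes
    size: int    # length, in bytes
    phys: int    # physical start address, in bytes (assumed unit)


def split_for_submission(offset: int, size: int, phys: int,
                         max_bytes: int) -> List[Ioend]:
    """Mechanism 1: cap every ioend at max_bytes when it is built."""
    out = []
    done = 0
    while done < size:
        n = min(max_bytes, size - done)
        out.append(Ioend(offset + done, n, phys + done))
        done += n
    return out


def can_merge(a: Ioend, b: Ioend) -> bool:
    """Mechanism 3: merge only if b follows a logically AND physically."""
    logically_contiguous = a.offset + a.size == b.offset
    physically_contiguous = a.phys + a.size == b.phys
    return logically_contiguous and physically_contiguous


def merge_ioends(ioends: List[Ioend]) -> List[List[Ioend]]:
    """Group completed ioends into chains, one chain per physical extent."""
    chains: List[List[Ioend]] = []
    for io in sorted(ioends, key=lambda i: i.offset):
        if chains and can_merge(chains[-1][-1], io):
            chains[-1].append(io)
        else:
            chains.append([io])
    return chains


def complete(chains: List[List[Ioend]], resched: Callable[[], None]) -> int:
    """Mechanism 2: finish bounded ioends, yielding between extents."""
    finished = 0
    for chain in chains:
        for io in chain:
            finished += io.size  # each step is bounded by max_bytes
        resched()  # stand-in for a reschedule point such as cond_resched()
    return finished
```

In this model a 10-byte contiguous extent split with `max_bytes=4` yields three ioends that merge back into one chain, while a fourth ioend that is logically adjacent but lives at an unrelated physical address starts a new chain, so the completion loop gets a reschedule point between the two extents.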

111 Commits