Go to file
David Howells db0aa2e956
mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
Define a data structure, struct folio_queue, to represent a sequence of
folios and a kernel-internal I/O iterator type, ITER_FOLIOQ, to allow a
list of folio_queue structures to be used to provide a buffer to
iov_iter-taking functions, such as sendmsg and recvmsg.

The folio_queue structure looks like:

	struct folio_queue {
		struct folio_batch	vec;
		u8			orders[PAGEVEC_SIZE];
		struct folio_queue	*next;
		struct folio_queue	*prev;
		unsigned long		marks;
		unsigned long		marks2;
	};

It does not use a list_head so that next and/or prev can be set to NULL at
the ends of the list, allowing iov_iter-handling routines to determine that
they *are* the ends without needing to store a head pointer in the iov_iter
struct.

A folio_batch struct is used to hold the folio pointers which allows the
batch to be passed to batch handling functions.  Two mark bits are
available per slot.  The intention is to use at least one of them to mark
folios that need putting, but that might not be ultimately necessary.
Accessor functions are used to access the slots to do the masking and an
additional accessor function is used to indicate the size of the array.

The order of each folio is also stored in the structure to avoid the need
for iov_iter_advance() and iov_iter_revert() to have to query each folio to
find its size.

With careful barriering, this can be used as an extending buffer with new
folios inserted and new folio_queue structs added without the need for a
lock.  Further, provided we always keep at least one struct in the buffer,
we can also remove consumed folios and consumed structs from the head end
as we without the need for locks.

[Questions/thoughts]

 (1) To manage this, I need a head pointer, a tail pointer, a tail slot
     number (assuming insertion happens at the tail end and the next
     pointers point from head to tail).  Should I put these into a struct
     of their own, say "folio_queue_head" or "rolling_buffer"?

     I will end up with two of these in netfs_io_request eventually, one
     keeping track of the pagecache I'm dealing with for buffered I/O and
     the other to hold a bounce buffer when we need one.

 (2) Should I make the slots {folio,off,len} or bio_vec?

 (3) This is intended to replace ITER_XARRAY eventually.  Using an xarray
     in I/O iteration requires the taking of the RCU read lock, doing
     copying under the RCU read lock, walking the xarray (which may change
     under us), handling retries and dealing with special values.

     The advantage of ITER_XARRAY is that when we're dealing with the
     pagecache directly, we don't need any allocation - but if we're doing
     encrypted comms, there's a good chance we'd be using a bounce buffer
     anyway.

     This will require afs, erofs, cifs, orangefs and fscache to be
     converted to not use this.  afs still uses it for dirs and symlinks;
     some of erofs usages should be easy to change, but there's one which
     won't be so easy; ceph's use via fscache can be fixed by porting ceph
     to netfslib; cifs is using xarray as a bounce buffer - that can be
     moved to use sheaves instead; and orangefs has a similar problem to
     erofs - maybe orangefs could use netfslib?

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Gao Xiang <xiang@kernel.org>
cc: Mike Marshall <hubcap@omnibond.com>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
cc: linux-afs@lists.infradead.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: linux-erofs@lists.ozlabs.org
cc: devel@lists.orangefs.org
Link: https://lore.kernel.org/r/20240814203850.2240469-13-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-12 12:20:21 +02:00
arch ARM fix for v6.11 2024-09-04 09:17:33 -07:00
block block: fix detection of unsupported WRITE SAME in blkdev_issue_write_zeroes 2024-08-28 08:49:25 -06:00
certs kbuild: use $(src) instead of $(srctree)/$(src) for source directory 2024-05-10 04:34:52 +09:00
crypto crypto: testmgr - generate power-of-2 lengths more often 2024-07-13 11:50:28 +12:00
Documentation mm/memcontrol: respect zswap.writeback setting from parent cg too 2024-09-01 17:59:02 -07:00
drivers ata fixes for 6.11-rc7 2024-09-01 19:59:59 -07:00
fs netfs: Use bh-disabling spinlocks for rreq->lock 2024-09-05 11:00:42 +02:00
include mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios 2024-09-12 12:20:21 +02:00
init Rust fixes for v6.11 2024-08-16 11:24:06 -07:00
io_uring io_uring/kbuf: return correct iovec count from classic buffer peek 2024-08-30 10:45:54 -06:00
ipc sysctl: treewide: constify the ctl_table argument of proc_handlers 2024-07-24 20:59:29 +02:00
kernel 17 hotfixes, 15 of which are cc:stable. 2024-09-04 08:37:33 -07:00
lib mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios 2024-09-12 12:20:21 +02:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm vfs-6.11-rc7.fixes 2024-09-04 09:33:57 -07:00
net Including fixes from bluetooth, wireless and netfilter. 2024-08-30 06:14:39 +12:00
rust Rust fixes for v6.11 2024-08-16 11:24:06 -07:00
samples treewide: remove unnecessary <linux/version.h> inclusion 2024-08-12 18:36:44 +09:00
scripts scripts: fix gfp-translate after ___GFP_*_BITS conversion to an enum 2024-09-01 17:59:01 -07:00
security Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging 2024-09-01 09:18:48 +12:00
sound sound fixes for 6.11-rc6 2024-08-28 06:24:22 +12:00
tools selftests: mm: fix build errors on armhf 2024-09-01 17:58:59 -07:00
usr initramfs: shorten cmd_initfs in usr/Makefile 2024-07-16 01:07:52 +09:00
virt KVM: x86: Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX) 2024-08-14 12:28:24 -04:00
.clang-format Docs: Move clang-format from process/ to dev-tools/ 2024-06-26 16:36:00 -06:00
.cocciconfig
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore Add Jeff Kirsher to .get_maintainer.ignore 2024-03-08 11:36:54 +00:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: add script and target to generate pacman package 2024-07-22 01:24:22 +09:00
.mailmap mailmap: update entry for Jan Kuliga 2024-09-01 17:59:02 -07:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS tracing: Update of MAINTAINERS and CREDITS file 2024-07-18 14:08:42 -07:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS USB fixes for 6.11-rc6 2024-09-01 07:06:28 +12:00
Makefile Linux 6.11-rc6 2024-09-01 19:46:02 +12:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.