- Enable CMA
- Add support for MB v11
- Defconfig updates
- Minor fixes
-----BEGIN PGP SIGNATURE-----
iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCXjlJ1gAKCRDKSWXLKUoM
IWy9AJ4tauV9sUb+zNadrYxI+2zemRstUwCfQ49LG4kHpFCv8ldSTmhBPJY/3MI=
=QpT4
-----END PGP SIGNATURE-----
Merge tag 'microblaze-v5.6-rc1' of git://git.monstr.eu/linux-2.6-microblaze
Pull Microblaze update from Michal Simek:
- enable CMA
- add support for MB v11
- defconfig updates
- minor fixes
* tag 'microblaze-v5.6-rc1' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Add ID for Microblaze v11
microblaze: Prevent the overflow of the start
microblaze: Wire CMA allocator
asm-generic: Make dma-contiguous.h a mandatory include/asm header
microblaze: Sync defconfig with latest Kconfig layout
microblaze: defconfig: Disable EXT2 driver and Enable EXT3 & EXT4 drivers
microblaze: Align comments with register usage
In case the start + cache size is more than the max int the
start overflows.
Prevent the same.
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Based on commit 04e3543e22 ("microblaze: use the generic dma coherent
remap allocator")
CMA can be easily enabled by calling dma_contiguous_reserve() at the end of
mmu_init(). High limit is end of lowmem space which is completely unused at
this point of time.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Layout was changed by commit 6210b6402f
("kernel-hacking: group sysrq/kgdb/ubsan into 'Generic Kernel Debugging Instruments'")
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
As EXT4 filesystem driver is used for handling EXT2 file systems as
well. There is no need to enable EXT2 driver. This patch disables EXT2
and enables EXT3/EXT4 drivers.
Signed-off-by: Manish Narani <manish.narani@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXjFo8wAKCRCRxhvAZXjc
omaGAQDVwCHQekqxp2eC8EJH4Pkt+Bn1BLrA25stlTo93YBPHgEAsPVUCRNcrZAl
VncYmxCfpt3Yu0S/MTVXu5xrRiIXPQk=
=uqTN
-----END PGP SIGNATURE-----
Merge tag 'threads-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull thread management updates from Christian Brauner:
"Sargun Dhillon over the last cycle has worked on the pidfd_getfd()
syscall.
This syscall allows for the retrieval of file descriptors of a process
based on its pidfd. A task needs to have ptrace_may_access()
permissions with PTRACE_MODE_ATTACH_REALCREDS (suggested by Oleg and
Andy) on the target.
One of the main use-cases is in combination with seccomp's user
notification feature. As a reminder, seccomp's user notification
feature was made available in v5.0. It allows a task to retrieve a
file descriptor for its seccomp filter. The file descriptor is usually
handed of to a more privileged supervising process. The supervisor can
then listen for syscall events caught by the seccomp filter of the
supervisee and perform actions in lieu of the supervisee, usually
emulating syscalls. pidfd_getfd() is needed to expand its uses.
There are currently two major users that wait on pidfd_getfd() and one
future user:
- Netflix, Sargun said, is working on a service mesh where users
should be able to connect to a dns-based VIP. When a user connects
to e.g. 1.2.3.4:80 that runs e.g. service "foo" they will be
redirected to an envoy process. This service mesh uses seccomp user
notifications and pidfd to intercept all connect calls and instead
of connecting them to 1.2.3.4:80 connects them to e.g.
127.0.0.1:8080.
- LXD uses the seccomp notifier heavily to intercept and emulate
mknod() and mount() syscalls for unprivileged containers/processes.
With pidfd_getfd() more uses-cases e.g. bridging socket connections
will be possible.
- The patchset has also seen some interest from the browser corner.
Right now, Firefox is using a SECCOMP_RET_TRAP sandbox managed by a
broker process. In the future glibc will start blocking all signals
during dlopen() rendering this type of sandbox impossible. Hence,
in the future Firefox will switch to a seccomp-user-nofication
based sandbox which also makes use of file descriptor retrieval.
The thread for this can be found at
https://sourceware.org/ml/libc-alpha/2019-12/msg00079.html
With pidfd_getfd() it is e.g. possible to bridge socket connections
for the supervisee (binding to a privileged port) and taking actions
on file descriptors on behalf of the supervisee in general.
Sargun's first version was using an ioctl on pidfds but various people
pushed for it to be a proper syscall which he duely implemented as
well over various review cycles. Selftests are of course included.
I've also added instructions how to deal with merge conflicts below.
There's also a small fix coming from the kernel mentee project to
correctly annotate struct sighand_struct with __rcu to fix various
sparse warnings. We've received a few more such fixes and even though
they are mostly trivial I've decided to postpone them until after -rc1
since they came in rather late and I don't want to risk introducing
build warnings.
Finally, there's a new prctl() command PR_{G,S}ET_IO_FLUSHER which is
needed to avoid allocation recursions triggerable by storage drivers
that have userspace parts that run in the IO path (e.g. dm-multipath,
iscsi, etc). These allocation recursions deadlock the device.
The new prctl() allows such privileged userspace components to avoid
allocation recursions by setting the PF_MEMALLOC_NOIO and
PF_LESS_THROTTLE flags. The patch carries the necessary acks from the
relevant maintainers and is routed here as part of prctl()
thread-management."
* tag 'threads-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim
sched.h: Annotate sighand_struct with __rcu
test: Add test for pidfd getfd
arch: wire up pidfd_getfd syscall
pid: Implement pidfd_getfd syscall
vfs, fdtable: Add fget_task helper
Pull openat2 support from Al Viro:
"This is the openat2() series from Aleksa Sarai.
I'm afraid that the rest of namei stuff will have to wait - it got
zero review the last time I'd posted #work.namei, and there had been a
leak in the posted series I'd caught only last weekend. I was going to
repost it on Monday, but the window opened and the odds of getting any
review during that... Oh, well.
Anyway, openat2 part should be ready; that _did_ get sane amount of
review and public testing, so here it comes"
From Aleksa's description of the series:
"For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown
flags are present[1].
This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road
to being added to openat(2).
Furthermore, the need for some sort of control over VFS's path
resolution (to avoid malicious paths resulting in inadvertent
breakouts) has been a very long-standing desire of many userspace
applications.
This patchset is a revival of Al Viro's old AT_NO_JUMPS[3] patchset
(which was a variant of David Drysdale's O_BENEATH patchset[4] which
was a spin-off of the Capsicum project[5]) with a few additions and
changes made based on the previous discussion within [6] as well as
others I felt were useful.
In line with the conclusions of the original discussion of
AT_NO_JUMPS, the flag has been split up into separate flags. However,
instead of being an openat(2) flag it is provided through a new
syscall openat2(2) which provides several other improvements to the
openat(2) interface (see the patch description for more details). The
following new LOOKUP_* flags are added:
LOOKUP_NO_XDEV:
Blocks all mountpoint crossings (upwards, downwards, or through
absolute links). Absolute pathnames alone in openat(2) do not
trigger this. Magic-link traversal which implies a vfsmount jump is
also blocked (though magic-link jumps on the same vfsmount are
permitted).
LOOKUP_NO_MAGICLINKS:
Blocks resolution through /proc/$pid/fd-style links. This is done
by blocking the usage of nd_jump_link() during resolution in a
filesystem. The term "magic-links" is used to match with the only
reference to these links in Documentation/, but I'm happy to change
the name.
It should be noted that this is different to the scope of
~LOOKUP_FOLLOW in that it applies to all path components. However,
you can do openat2(NO_FOLLOW|NO_MAGICLINKS) on a magic-link and it
will *not* fail (assuming that no parent component was a
magic-link), and you will have an fd for the magic-link.
In order to correctly detect magic-links, the introduction of a new
LOOKUP_MAGICLINK_JUMPED state flag was required.
LOOKUP_BENEATH:
Disallows escapes to outside the starting dirfd's
tree, using techniques such as ".." or absolute links. Absolute
paths in openat(2) are also disallowed.
Conceptually this flag is to ensure you "stay below" a certain
point in the filesystem tree -- but this requires some additional
to protect against various races that would allow escape using
"..".
Currently LOOKUP_BENEATH implies LOOKUP_NO_MAGICLINKS, because it
can trivially beam you around the filesystem (breaking the
protection). In future, there might be similar safety checks done
as in LOOKUP_IN_ROOT, but that requires more discussion.
In addition, two new flags are added that expand on the above ideas:
LOOKUP_NO_SYMLINKS:
Does what it says on the tin. No symlink resolution is allowed at
all, including magic-links. Just as with LOOKUP_NO_MAGICLINKS this
can still be used with NOFOLLOW to open an fd for the symlink as
long as no parent path had a symlink component.
LOOKUP_IN_ROOT:
This is an extension of LOOKUP_BENEATH that, rather than blocking
attempts to move past the root, forces all such movements to be
scoped to the starting point. This provides chroot(2)-like
protection but without the cost of a chroot(2) for each filesystem
operation, as well as being safe against race attacks that
chroot(2) is not.
If a race is detected (as with LOOKUP_BENEATH) then an error is
generated, and similar to LOOKUP_BENEATH it is not permitted to
cross magic-links with LOOKUP_IN_ROOT.
The primary need for this is from container runtimes, which
currently need to do symlink scoping in userspace[7] when opening
paths in a potentially malicious container.
There is a long list of CVEs that could have bene mitigated by
having RESOLVE_THIS_ROOT (such as CVE-2017-1002101,
CVE-2017-1002102, CVE-2018-15664, and CVE-2019-5736, just to name a
few).
In order to make all of the above more usable, I'm working on
libpathrs[8] which is a C-friendly library for safe path resolution.
It features a userspace-emulated backend if the kernel doesn't support
openat2(2). Hopefully we can get userspace to switch to using it, and
thus get openat2(2) support for free once it's ready.
Future work would include implementing things like
RESOLVE_NO_AUTOMOUNT and possibly a RESOLVE_NO_REMOTE (to allow
programs to be sure they don't hit DoSes though stale NFS handles)"
* 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Documentation: path-lookup: include new LOOKUP flags
selftests: add openat2(2) selftests
open: introduce openat2(2) syscall
namei: LOOKUP_{IN_ROOT,BENEATH}: permit limited ".." resolution
namei: LOOKUP_IN_ROOT: chroot-like scoped resolution
namei: LOOKUP_BENEATH: O_BENEATH-like scoped resolution
namei: LOOKUP_NO_XDEV: block mountpoint crossing
namei: LOOKUP_NO_MAGICLINKS: block magic-link resolution
namei: LOOKUP_NO_SYMLINKS: block symlink resolution
namei: allow set_root() to produce errors
namei: allow nd_jump_link() to produce errors
nsfs: clean-up ns_get_path() signature to return int
namei: only return -ECHILD from follow_dotdot_rcu()
Here are the big set of tty and serial driver updates for 5.6-rc1
Included in here are:
- dummy_con cleanups (touches lots of arch code)
- sysrq logic cleanups (touches lots of serial drivers)
- samsung driver fixes (wasn't really being built)
- conmakeshash move to tty subdir out of scripts
- lots of small tty/serial driver updates
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXjFRBg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yn2VACgkge7vTeUNeZFc+6F4NWphAQ5tCQAoK/MMbU6
0O8ef7PjFwCU4s227UTv
=6m40
-----END PGP SIGNATURE-----
Merge tag 'tty-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here are the big set of tty and serial driver updates for 5.6-rc1
Included in here are:
- dummy_con cleanups (touches lots of arch code)
- sysrq logic cleanups (touches lots of serial drivers)
- samsung driver fixes (wasn't really being built)
- conmakeshash move to tty subdir out of scripts
- lots of small tty/serial driver updates
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (140 commits)
tty: n_hdlc: Use flexible-array member and struct_size() helper
tty: baudrate: SPARC supports few more baud rates
tty: baudrate: Synchronise baud_table[] and baud_bits[]
tty: serial: meson_uart: Add support for kernel debugger
serial: imx: fix a race condition in receive path
serial: 8250_bcm2835aux: Document struct bcm2835aux_data
serial: 8250_bcm2835aux: Use generic remapping code
serial: 8250_bcm2835aux: Allocate uart_8250_port on stack
serial: 8250_bcm2835aux: Suppress register_port error on -EPROBE_DEFER
serial: 8250_bcm2835aux: Suppress clk_get error on -EPROBE_DEFER
serial: 8250_bcm2835aux: Fix line mismatch on driver unbind
serial_core: Remove unused member in uart_port
vt: Correct comment documenting do_take_over_console()
vt: Delete comment referencing non-existent unbind_con_driver()
arch/xtensa/setup: Drop dummy_con initialization
arch/x86/setup: Drop dummy_con initialization
arch/unicore32/setup: Drop dummy_con initialization
arch/sparc/setup: Drop dummy_con initialization
arch/sh/setup: Drop dummy_con initialization
arch/s390/setup: Drop dummy_con initialization
...
Pull scheduler updates from Ingo Molnar:
"These were the main changes in this cycle:
- More -rt motivated separation of CONFIG_PREEMPT and
CONFIG_PREEMPTION.
- Add more low level scheduling topology sanity checks and warnings
to filter out nonsensical topologies that break scheduling.
- Extend uclamp constraints to influence wakeup CPU placement
- Make the RT scheduler more aware of asymmetric topologies and CPU
capacities, via uclamp metrics, if CONFIG_UCLAMP_TASK=y
- Make idle CPU selection more consistent
- Various fixes, smaller cleanups, updates and enhancements - please
see the git log for details"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
sched/fair: Define sched_idle_cpu() only for SMP configurations
sched/topology: Assert non-NUMA topology masks don't (partially) overlap
idle: fix spelling mistake "iterrupts" -> "interrupts"
sched/fair: Remove redundant call to cpufreq_update_util()
sched/psi: create /proc/pressure and /proc/pressure/{io|memory|cpu} only when psi enabled
sched/fair: Fix sgc->{min,max}_capacity calculation for SD_OVERLAP
sched/fair: calculate delta runnable load only when it's needed
sched/cputime: move rq parameter in irqtime_account_process_tick
stop_machine: Make stop_cpus() static
sched/debug: Reset watchdog on all CPUs while processing sysrq-t
sched/core: Fix size of rq::uclamp initialization
sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
sched/fair: Load balance aggressively for SCHED_IDLE CPUs
sched/fair : Improve update_sd_pick_busiest for spare capacity case
watchdog: Remove soft_lockup_hrtimer_cnt and related code
sched/rt: Make RT capacity-aware
sched/fair: Make EAS wakeup placement consider uclamp restrictions
sched/fair: Make task_fits_capacity() consider uclamp restrictions
sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with()
sched/uclamp: Make uclamp util helpers use and return UL values
...
Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:
- Cleanup of the GOP [graphics output] handling code in the EFI stub
- Complete refactoring of the mixed mode handling in the x86 EFI stub
- Overhaul of the x86 EFI boot/runtime code
- Increase robustness for mixed mode code
- Add the ability to disable DMA at the root port level in the EFI
stub
- Get rid of RWX mappings in the EFI memory map and page tables,
where possible
- Move the support code for the old EFI memory mapping style into its
only user, the SGI UV1+ support code.
- plus misc fixes, updates, smaller cleanups.
... and due to interactions with the RWX changes, another round of PAT
cleanups make a guest appearance via the EFI tree - with no side
effects intended"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
efi/x86: Disable instrumentation in the EFI runtime handling code
efi/libstub/x86: Fix EFI server boot failure
efi/x86: Disallow efi=old_map in mixed mode
x86/boot/compressed: Relax sed symbol type regex for LLVM ld.lld
efi/x86: avoid KASAN false positives when accessing the 1: 1 mapping
efi: Fix handling of multiple efi_fake_mem= entries
efi: Fix efi_memmap_alloc() leaks
efi: Add tracking for dynamically allocated memmaps
efi: Add a flags parameter to efi_memory_map
efi: Fix comment for efi_mem_type() wrt absent physical addresses
efi/arm: Defer probe of PCIe backed efifb on DT systems
efi/x86: Limit EFI old memory map to SGI UV machines
efi/x86: Avoid RWX mappings for all of DRAM
efi/x86: Don't map the entire kernel text RW for mixed mode
x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd
efi/libstub/x86: Fix unused-variable warning
efi/libstub/x86: Use mandatory 16-byte stack alignment in mixed mode
efi/libstub/x86: Use const attribute for efi_is_64bit()
efi: Allow disabling PCI busmastering on bridges during boot
efi/x86: Allow translating 64-bit arguments for mixed mode calls
...
/* Background. */
For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown flags
are present[1].
This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road to
being added to openat(2).
Userspace also has a hard time figuring out whether a particular flag is
supported on a particular kernel. While it is now possible with
contemporary kernels (thanks to [3]), older kernels will expose unknown
flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during
openat(2) time matches modern syscall designs and is far more
fool-proof.
In addition, the newly-added path resolution restriction LOOKUP flags
(which we would like to expose to user-space) don't feel related to the
pre-existing O_* flag set -- they affect all components of path lookup.
We'd therefore like to add a new flag argument.
Adding a new syscall allows us to finally fix the flag-ignoring problem,
and we can make it extensible enough so that we will hopefully never
need an openat3(2).
/* Syscall Prototype. */
/*
* open_how is an extensible structure (similar in interface to
* clone3(2) or sched_setattr(2)). The size parameter must be set to
* sizeof(struct open_how), to allow for future extensions. All future
* extensions will be appended to open_how, with their zero value
* acting as a no-op default.
*/
struct open_how { /* ... */ };
int openat2(int dfd, const char *pathname,
struct open_how *how, size_t size);
/* Description. */
The initial version of 'struct open_how' contains the following fields:
flags
Used to specify openat(2)-style flags. However, any unknown flag
bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR)
will result in -EINVAL. In addition, this field is 64-bits wide to
allow for more O_ flags than currently permitted with openat(2).
mode
The file mode for O_CREAT or O_TMPFILE.
Must be set to zero if flags does not contain O_CREAT or O_TMPFILE.
resolve
Restrict path resolution (in contrast to O_* flags they affect all
path components). The current set of flags are as follows (at the
moment, all of the RESOLVE_ flags are implemented as just passing
the corresponding LOOKUP_ flag).
RESOLVE_NO_XDEV => LOOKUP_NO_XDEV
RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS
RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS
RESOLVE_BENEATH => LOOKUP_BENEATH
RESOLVE_IN_ROOT => LOOKUP_IN_ROOT
open_how does not contain an embedded size field, because it is of
little benefit (userspace can figure out the kernel open_how size at
runtime fairly easily without it). It also only contains u64s (even
though ->mode arguably should be a u16) to avoid having padding fields
which are never used in the future.
Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE
is no longer permitted for openat(2). As far as I can tell, this has
always been a bug and appears to not be used by userspace (and I've not
seen any problems on my machines by disallowing it). If it turns out
this breaks something, we can special-case it and only permit it for
openat(2) but not openat2(2).
After input from Florian Weimer, the new open_how and flag definitions
are inside a separate header from uapi/linux/fcntl.h, to avoid problems
that glibc has with importing that header.
/* Testing. */
In a follow-up patch there are over 200 selftests which ensure that this
syscall has the correct semantics and will correctly handle several
attack scenarios.
In addition, I've written a userspace library[4] which provides
convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary
because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care
must be taken when using RESOLVE_IN_ROOT'd file descriptors with other
syscalls). During the development of this patch, I've run numerous
verification tests using libpathrs (showing that the API is reasonably
usable by userspace).
/* Future Work. */
Additional RESOLVE_ flags have been suggested during the review period.
These can be easily implemented separately (such as blocking auto-mount
during resolution).
Furthermore, there are some other proposed changes to the openat(2)
interface (the most obvious example is magic-link hardening[5]) which
would be a good opportunity to add a way for userspace to restrict how
O_PATH file descriptors can be re-opened.
Another possible avenue of future work would be some kind of
CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace
which openat2(2) flags and fields are supported by the current kernel
(to avoid userspace having to go through several guesses to figure it
out).
[1]: https://lwn.net/Articles/588444/
[2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com
[3]: commit 629e014bb8 ("fs: completely ignore unknown open flags")
[4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523
[5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/
[6]: https://youtu.be/ggD-eb3yPVs
Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
con_init in tty/vt.c will now set conswitchp to dummy_con if it's unset.
Drop it from arch setup code.
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://lore.kernel.org/r/20191218214506.49252-12-nivedita@alum.mit.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This wires up the pidfd_getfd syscall for all architectures.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200107175927.4558-4-sargun@sargun.me
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Use a more generic name for additional table sorting usecases,
such as the upcoming ORC table sorting feature. This tool is
not tied to exception table sorting anymore.
No functional changes intended.
[ mingo: Rewrote the changelog. ]
Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-6-shile.zhang@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In the x86 MM code we'd like to untangle various types of historic
header dependency spaghetti, but for this we'd need to pass to
the generic vmalloc code various vmalloc related defines that
customarily come via the <asm/page.h> low level arch header.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT.
Both PREEMPT and PREEMPT_RT require the same functionality which today
depends on CONFIG_PREEMPT.
Switch the entry code over to use CONFIG_PREEMPTION.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20191015191821.11479-12-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
microblaze has only two-level page tables and can use pgtable-nopmd and
folding of the upper layers.
Replace usage of include/asm-generic/4level-fixup.h and explicit
definition of __PAGETABLE_PMD_FOLDED in microblaze with
include/asm-generic/pgtable-nopmd.h and adjust page table manipulation
macros and functions accordingly.
Link: http://lkml.kernel.org/r/1572938135-31886-7-git-send-email-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Anatoly Pugachev <matorola@gmail.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Peter Rosin <peda@axentia.se>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Sam Creasey <sammy@sammy.net>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- improve dma-debug scalability (Eric Dumazet)
- tiny dma-debug cleanup (Dan Carpenter)
- check for vmap memory in dma_map_single (Kees Cook)
- check for dma_addr_t overflows in dma-direct when using
DMA offsets (Nicolas Saenz Julienne)
- switch the x86 sta2x11 SOC to use more generic DMA code
(Nicolas Saenz Julienne)
- fix arm-nommu dma-ranges handling (Vladimir Murzin)
- use __initdata in CMA (Shyam Saini)
- replace the bus dma mask with a limit (Nicolas Saenz Julienne)
- merge the remapping helpers into the main dma-direct flow (me)
- switch xtensa to the generic dma remap handling (me)
- various cleanups around dma_capable (me)
- remove unused dev arguments to various dma-noncoherent helpers (me)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl3f+eULHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPyPg/+PVHCrhmepudQQFHu6wfurE5U77iNnoUifvG+b5z5
5mHmTMkQwyox6rKDe8NuFApAhz1VJDSUgSelPmvTSOIEIGXCvX1p+GqRSVS5YQON
aLzGvbWKE8hCpaPdDHKYDauD1FZGMM8L2P5oOMF9X9fQ94xxRqfqJM6c8iD16Sgg
+aOgPNzTnxQHJFF/Dbt/mjJrKXWI+XF+bgUbH+l9yKa7Dd7ibmJR8yl9hs1jmp0H
1CZ+CizwnAs57rCd1a6Ybc6gj59tySc03NMnnbTko+KDxrcbD3Ee2tpqHVkkCjYz
Yl0m4FIpbotrpokL/FIS727bVvkjbWgoeM+kiVPoYzmZea3pq/tFDr6tp/BxDhFj
TZXSFfgQljlYMD3ppSoklFlfjGriVWV0tPO3arPXwuuMF5EX/IMQmvxei05jpc8n
iELNXOP9iZZkY4tLHy2hn2uWrxBRrS1WQwlLg9hahlNRzyfFSyHeP0zWlVDt+RgF
5CCbEI+HQcUqg1FApB30lQNWTn1+dJftrpKVBlgNBIyIa/z2rFbt8GdSnItxjfQX
/XX8EZbFvF6AcXkgURkYFIoKM/EbYShOSLcYA3PTUtcuTnF6Kk5eimySiGWZTVCS
prruSFDZJOvL3SnOIMIiYVmBdB7lEbDyLI/VYuhoECXEDCJpVmRktNkJNg4q6/E+
fjQ=
=e5wO
-----END PGP SIGNATURE-----
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux; tag 'dma-mapping-5.5' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- improve dma-debug scalability (Eric Dumazet)
- tiny dma-debug cleanup (Dan Carpenter)
- check for vmap memory in dma_map_single (Kees Cook)
- check for dma_addr_t overflows in dma-direct when using DMA offsets
(Nicolas Saenz Julienne)
- switch the x86 sta2x11 SOC to use more generic DMA code (Nicolas
Saenz Julienne)
- fix arm-nommu dma-ranges handling (Vladimir Murzin)
- use __initdata in CMA (Shyam Saini)
- replace the bus dma mask with a limit (Nicolas Saenz Julienne)
- merge the remapping helpers into the main dma-direct flow (me)
- switch xtensa to the generic dma remap handling (me)
- various cleanups around dma_capable (me)
- remove unused dev arguments to various dma-noncoherent helpers (me)
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux:
* tag 'dma-mapping-5.5' of git://git.infradead.org/users/hch/dma-mapping: (22 commits)
dma-mapping: treat dev->bus_dma_mask as a DMA limit
dma-direct: exclude dma_direct_map_resource from the min_low_pfn check
dma-direct: don't check swiotlb=force in dma_direct_map_resource
dma-debug: clean up put_hash_bucket()
powerpc: remove support for NULL dev in __phys_to_dma / __dma_to_phys
dma-direct: avoid a forward declaration for phys_to_dma
dma-direct: unify the dma_capable definitions
dma-mapping: drop the dev argument to arch_sync_dma_for_*
x86/PCI: sta2x11: use default DMA address translation
dma-direct: check for overflows on 32 bit DMA addresses
dma-debug: increase HASH_SIZE
dma-debug: reorder struct dma_debug_entry fields
xtensa: use the generic uncached segment support
dma-mapping: merge the generic remapping helpers into dma-direct
dma-direct: provide mmap and get_sgtable method overrides
dma-direct: remove the dma_handle argument to __dma_direct_alloc_pages
dma-direct: remove __dma_direct_free_pages
usb: core: Remove redundant vmap checks
kernel: dma-contiguous: mark CMA parameters __initdata/__initconst
dma-debug: add a schedule point in debug_dma_dump_mappings()
...
- clean up various obsolete ioremap and iounmap variants
- add a new generic ioremap implementation and switch csky, nds32 and
riscv over to it
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl3cKcsLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYO1CRAAwFQigsbi0CqqshPWnP0owKV+HA4Xfz/lQZsd7SM/
BVXhKyDJQum6gp73dW025HCfjidTknsbdCUIP/LNUgAnop3lOlnB31/munDnJJ1H
6hB1pc+zB9VgbOe0A6TxtxPRm5aE33k1hZIZS99lOh7mY3FvF7mbkkbVoCjdS3Cq
a9bTX+X+esfUQ5GgaIc2zmz2GLkyFXIeVGs8/CoOX58ESCWQcVZrsQRompo4SgrI
jqwf47NzdmK8hW4mZ+jdQUiWiAmNs5+2om7Bvi/deFAIFUo1/hLHvQzqEGramq/j
5SPHax2gWAN3uWYP91QISkUAJWFydwgmUDoTO1M04ov4xLuBrqIQmc43tLjHo2UT
RwMozWJWN+gkB9zTIboqMPi2qcuDaWcCij7LwHl5zLxPTcOKsrALarL55BQ8MipQ
x6fpvskrQQvlArNTsRWFRUq0mCtkzE3wMZ9RR3AIETQL2hlAzB1S4gzhD+Z6WTYY
pXNgkunonVGxwyN/7iJTEl/mvF/+MynGcWqhrwHZLqncyhn/WJJ2USH3nAD1+yjp
v8v6UUeMXIjUsGAyfTjXy/WXAfwRuSC038AAFcmWKDdh08h4XvPHRficT4U8wr34
7WzGizHP9f1CqrhYL/4exhPY9X2Yb7HhsFd0bZGG0rRvSillPUp0b8s++m12QuQU
+VY=
=ooiA
-----END PGP SIGNATURE-----
Merge tag 'ioremap-5.5' of git://git.infradead.org/users/hch/ioremap
Pull generic ioremap support from Christoph Hellwig:
"This adds the remaining bits for an entirely generic ioremap and
iounmap to lib/ioremap.c. To facilitate that, it cleans up the giant
mess of weird ioremap variants we had with no users outside the arch
code.
For now just the three newest ports use the code, but there is more
than a handful others that can be converted without too much work.
Summary:
- clean up various obsolete ioremap and iounmap variants
- add a new generic ioremap implementation and switch csky, nds32 and
riscv over to it"
* tag 'ioremap-5.5' of git://git.infradead.org/users/hch/ioremap: (21 commits)
nds32: use generic ioremap
csky: use generic ioremap
csky: remove ioremap_cache
riscv: use the generic ioremap code
lib: provide a simple generic ioremap implementation
sh: remove __iounmap
nios2: remove __iounmap
hexagon: remove __iounmap
m68k: rename __iounmap and mark it static
arch: rely on asm-generic/io.h for default ioremap_* definitions
asm-generic: don't provide ioremap for CONFIG_MMU
asm-generic: ioremap_uc should behave the same with and without MMU
xtensa: clean up ioremap
x86: Clean up ioremap()
parisc: remove __ioremap
nios2: remove __ioremap
alpha: remove the unused __ioremap wrapper
hexagon: clean up ioremap
ia64: rename ioremap_nocache to ioremap_uc
unicore32: remove ioremap_cached
...
These are pure cache maintainance routines, so drop the unused
struct device argument.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Various architectures that use asm-generic/io.h still defined their
own default versions of ioremap_nocache, ioremap_wt and ioremap_wc
that point back to plain ioremap directly or indirectly. Remove these
definitions and rely on asm-generic/io.h instead. For this to work
the backup ioremap_* defintions needs to be changed to purely cpp
macros instea of inlines to cover for architectures like openrisc
that only define ioremap after including <asm-generic/io.h>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Palmer Dabbelt <palmer@dabbelt.com>
For dma-direct we know that the DMA address is an encoding of the
physical address that we can trivially decode. Use that fact to
provide implementations that do not need the arch_dma_coherent_to_pfn
architecture hook. Note that we still can only support mmap of
non-coherent memory only if the architecture provides a way to set an
uncached bit in the page tables. This must be true for architectures
that use the generic remap helpers, but other architectures can also
manually select it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
This patch increases max dtb size to 64K from 32K. This fixes the issue of
kernel hang with larger dtb of size greater than 32KB.
Signed-off-by: Siva Durga Prasad Paladugu <siva.durga.paladugu@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Currently dropbear does not run in background because devtmps and tmpfs
is not enabled by default. Enable devtmps and tmpfs to fix this issue.
Signed-off-by: Manjukumar Matha <manjukumar.harthikote-matha@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Since the enabling and disabling of IRQs within preempt_schedule_irq()
is contained in a need_resched() loop, we don't need the outer arch
code loop.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Merge updates from Andrew Morton:
- a few hot fixes
- ocfs2 updates
- almost all of -mm (slab-generic, slab, slub, kmemleak, kasan,
cleanups, debug, pagecache, memcg, gup, pagemap, memory-hotplug,
sparsemem, vmalloc, initialization, z3fold, compaction, mempolicy,
oom-kill, hugetlb, migration, thp, mmap, madvise, shmem, zswap,
zsmalloc)
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (132 commits)
mm/zsmalloc.c: fix a -Wunused-function warning
zswap: do not map same object twice
zswap: use movable memory if zpool support allocate movable memory
zpool: add malloc_support_movable to zpool_driver
shmem: fix obsolete comment in shmem_getpage_gfp()
mm/madvise: reduce code duplication in error handling paths
mm: mmap: increase sockets maximum memory size pgoff for 32bits
mm/mmap.c: refine find_vma_prev() with rb_last()
riscv: make mmap allocation top-down by default
mips: use generic mmap top-down layout and brk randomization
mips: replace arch specific way to determine 32bit task with generic version
mips: adjust brk randomization offset to fit generic version
mips: use STACK_TOP when computing mmap base address
mips: properly account for stack randomization and stack guard gap
arm: use generic mmap top-down layout and brk randomization
arm: use STACK_TOP when computing mmap base address
arm: properly account for stack randomization and stack guard gap
arm64, mm: make randomization selected by generic topdown mmap layout
arm64, mm: move generic mmap layout functions to mm
arm64: consider stack randomization for mmap base only when necessary
...
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem
cache for page table allocations on several architectures that do not use
PAGE_SIZE tables for one or more levels of the page table hierarchy.
Most architectures do not implement these functions and use __weak default
NOP implementation of pgd_cache_init(). Since there is no such default
for pgtable_cache_init(), its empty stub is duplicated among most
architectures.
Rename the definitions of pgd_cache_init() to pgtable_cache_init() and
drop empty stubs of pgtable_cache_init().
Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Will Deacon <will@kernel.org> [arm64]
Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The microblaze implementation of pte_alloc_one() has a provision to
allocated PTEs from high memory, but neither CONFIG_HIGHPTE nor pte_map*()
versions for suitable for HIGHPTE are defined.
Except that, microblaze version of pte_alloc_one() is identical to the
generic one as well as the implementations of pte_free() and
pte_free_kernel().
Switch microblaze to use the generic versions of these functions. Also
remove pte_free_slow() that is not referenced anywhere in the code.
Link: http://lkml.kernel.org/r/1565690952-32158-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm: remove quicklist page table caches".
A while ago Nicholas proposed to remove quicklist page table caches [1].
I've rebased his patch on the curren upstream and switched ia64 and sh to
use generic versions of PTE allocation.
[1] https://lore.kernel.org/linux-mm/20190711030339.20892-1-npiggin@gmail.com
This patch (of 3):
Remove page table allocator "quicklists". These have been around for a
long time, but have not got much traction in the last decade and are only
used on ia64 and sh architectures.
The numbers in the initial commit look interesting but probably don't
apply anymore. If anybody wants to resurrect this it's in the git
history, but it's unhelpful to have this code and divergent allocator
behaviour for minor archs.
Also it might be better to instead make more general improvements to page
allocator if this is still so slow.
Link: http://lkml.kernel.org/r/1565250728-21721-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Clean up reset gpio handler
- Defconfig updates
- Add support for 8 byte get_user()
- Switch to generic dma code
In merge please fix dma_atomic_pool_init reported also by:
https://lkml.org/lkml/2019/9/2/393
or
https://lore.kernel.org/linux-next/20190902214011.2a5400c9@canb.auug.org.au/
-----BEGIN PGP SIGNATURE-----
iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCXYnguwAKCRDKSWXLKUoM
IaF8AKCawdYH+58xRg7riR7Evbv2kM0ghwCfXosnu6Ncv07UEY9Tv5zx/qibafk=
=yI3X
-----END PGP SIGNATURE-----
Merge tag 'microblaze-v5.4-rc1' of git://git.monstr.eu/linux-2.6-microblaze
Pull Microblaze updates from Michal Simek:
- clean up reset gpio handler
- defconfig updates
- add support for 8 byte get_user()
- switch to generic dma code
* tag 'microblaze-v5.4-rc1' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Switch to standard restart handler
microblaze: defconfig synchronization
microblaze: Enable Xilinx AXI emac driver by default
arch/microblaze: support get_user() of size 8 bytes
microblaze: remove ioremap_fullcache
microblaze: use the generic dma coherent remap allocator
microblaze/nommu: use the generic uncached segment support
The microblaze uses the legacy APIs to dig out a GPIO pin
defined in the root of the device tree to issue a hard
reset of the platform.
Asserting a hard reset should be done using the standard
DT-enabled and fully GPIO descriptor aware driver in
drivers/power/reset/gpio-restart.c using the bindings
from Documentation/devicetree/bindings/power/reset/gpio-restart.txt
To achieve this, first make sure microblaze makes use of
the standard kernel restart path utilizing do_kernel_restart()
from <linux/reboot.h>. Put in some grace time and an
emergency print if the restart does not properly assert.
As this is basic platform functionality we patch the DTS
file and defconfig in one go for a lockstep change.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[ Michal: Move machine_restart back to reset.c ]
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
arch/microblaze/ is missing support for get_user() of size 8 bytes,
so add it by using __copy_from_user().
While there, also drop a lot of the code duplication.
Fixes these build errors:
drivers/infiniband/core/uverbs_main.o: In function `ib_uverbs_write':
drivers/infiniband/core/.tmp_gl_uverbs_main.o:(.text+0x13a4): undefined reference to `__user_bad'
drivers/android/binder.o: In function `binder_thread_write':
drivers/android/.tmp_gl_binder.o:(.text+0xda6c): undefined reference to `__user_bad'
drivers/android/.tmp_gl_binder.o:(.text+0xda98): undefined reference to `__user_bad'
drivers/android/.tmp_gl_binder.o:(.text+0xdf10): undefined reference to `__user_bad'
drivers/android/.tmp_gl_binder.o:(.text+0xe498): undefined reference to `__user_bad'
drivers/android/binder.o:drivers/android/.tmp_gl_binder.o:(.text+0xea78): more undefined references to `__user_bad' follow
'make allmodconfig' now builds successfully for arch/microblaze/.
Fixes: 538722ca3b ("microblaze: fix get_user/put_user side-effects")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven J. Magnani <steve@digidescorp.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Doug Ledford <dledford@redhat.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
CONFIG_ARCH_NO_COHERENT_DMA_MMAP is now functionally identical to
!CONFIG_MMU, so remove the separate symbol. The only difference is that
arm did not set it for !CONFIG_MMU, but arm uses a separate dma mapping
implementation including its own mmap method, which is handled by moving
the CONFIG_MMU check in dma_can_mmap so that is only applies to the
dma-direct case, just as the other ifdefs for it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
This switches to using common code for the DMA allocations, including
potential use of the CMA allocator if configured.
Switching to the generic code enables DMA allocations from atomic
context, which is required by the DMA API documentation, and also
adds various other minor features drivers start relying upon. It
also makes sure we have on tested code base for all architectures
that require uncached pte bits for coherent DMA allocations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Stop providing our own arch alloc/free hooks for nommu platforms and
just expose the segment offset and use the generic dma-direct
allocator.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXSMhhgAKCRCRxhvAZXjc
or7kAP9VzDcQaK/WoDd2ezh2C7Wh5hNy9z/qJVCa6Tb+N+g1UgEAxbhFUg55uGOA
JNf7fGar5JF5hBMIXR+NqOi1/sb4swg=
=ELWo
-----END PGP SIGNATURE-----
Merge tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull clone3 system call from Christian Brauner:
"This adds the clone3 syscall which is an extensible successor to clone
after we snagged the last flag with CLONE_PIDFD during the 5.2 merge
window for clone(). It cleanly supports all of the flags from clone()
and thus all legacy workloads.
There are few user visible differences between clone3 and clone.
First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse
this flag. Second, the CSIGNAL flag is deprecated and will cause
EINVAL to be reported. It is superseeded by a dedicated "exit_signal"
argument in struct clone_args thus freeing up even more flags. And
third, clone3 gives CLONE_PIDFD a dedicated return argument in struct
clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr
argument.
The clone3 uapi is designed to be easy to handle on 32- and 64 bit:
/* uapi */
struct clone_args {
__aligned_u64 flags;
__aligned_u64 pidfd;
__aligned_u64 child_tid;
__aligned_u64 parent_tid;
__aligned_u64 exit_signal;
__aligned_u64 stack;
__aligned_u64 stack_size;
__aligned_u64 tls;
};
and a separate kernel struct is used that uses proper kernel typing:
/* kernel internal */
struct kernel_clone_args {
u64 flags;
int __user *pidfd;
int __user *child_tid;
int __user *parent_tid;
int exit_signal;
unsigned long stack;
unsigned long stack_size;
unsigned long tls;
};
The system call comes with a size argument which enables the kernel to
detect what version of clone_args userspace is passing in. clone3
validates that any additional bytes a given kernel does not know about
are set to zero and that the size never exceeds a page.
A nice feature is that this patchset allowed us to cleanup and
simplify various core kernel codepaths in kernel/fork.c by making the
internal _do_fork() function take struct kernel_clone_args even for
legacy clone().
This patch also unblocks the time namespace patchset which wants to
introduce a new CLONE_TIMENS flag.
Note, that clone3 has only been wired up for x86{_32,64}, arm{64}, and
xtensa. These were the architectures that did not require special
massaging.
Other architectures treat fork-like system calls individually and
after some back and forth neither Arnd nor I felt confident that we
dared to add clone3 unconditionally to all architectures. We agreed to
leave this up to individual architecture maintainers. This is why
there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3
which any architecture can set once it has implemented support for
clone3. The patch also adds a cond_syscall(clone3) for architectures
such as nios2 or h8300 that generate their syscall table by simply
including asm-generic/unistd.h. The hope is to get rid of
__ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon"
* tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
arch: handle arches who do not yet define clone3
arch: wire-up clone3() syscall
fork: add clone3