Commit Graph

22286 Commits

Author SHA1 Message Date
David Hildenbrand
b611719978 mm/memory_hotplug: prepare passing flags to add_memory() and friends
We soon want to pass flags, e.g., to mark added System RAM resources.
mergeable.  Prepare for that.

This patch is based on a similar patch by Oscar Salvador:

https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.de

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Juergen Gross <jgross@suse.com> # Xen related part
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Libor Pechacek <lpechacek@suse.cz>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Leonardo Bras <leobras.c@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julien Grall <julien@xen.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Link: https://lkml.kernel.org/r/20200911103459.10306-5-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:18 -07:00
Aneesh Kumar K.V
379c926d63 powerpc/mm: move setting pte specific flags to pfn_pte
powerpc used to set the pte specific flags in set_pte_at().  This is
different from other architectures.  To be consistent with other
architecture update pfn_pte to set _PAGE_PTE on ppc64.  Also, drop now
unused pte_mkpte.

We add a VM_WARN_ON() to catch the usage of calling set_pte_at() without
setting _PAGE_PTE bit.  We will remove that after a few releases.

With respect to huge pmd entries, pmd_mkhuge() takes care of adding the
_PAGE_PTE bit.

[akpm@linux-foundation.org: whitespace fix, per Christophe]

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:14 -07:00
Aneesh Kumar K.V
392b466981 powerpc/mm: add DEBUG_VM WARN for pmd_clear
Patch series "mm/debug_vm_pgtable fixes", v4.

This patch series includes fixes for debug_vm_pgtable test code so that
they follow page table updates rules correctly.  The first two patches
introduce changes w.r.t ppc64.

Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
page table update rules.

These tests are broken w.r.t page table update rules and results in kernel
crash as below.

[   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
    pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
    lr: c0000000005eeeec: pte_update+0x11c/0x190
    sp: c000000c6d1e7950
   msr: 8000000002029033
  current = 0xc000000c6d172c80
  paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/0
kernel BUG at arch/powerpc/mm/pgtable.c:304!
[link register   ] c0000000005eeeec pte_update+0x11c/0x190
[c000000c6d1e7950] 0000000000000001 (unreliable)
[c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
[c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
[c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
[c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c

With DEBUG_VM disabled

[   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
[   20.530183] Faulting instruction address: 0xc0000000000df330
cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
    pc: c0000000000df330: memset+0x68/0x104
    lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
    sp: c000000c6d19f990
   msr: 8000000002009033
   dar: 0
  current = 0xc000000c6d177480
  paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/0
[link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
[c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
[c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
[c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
[c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c

This patch (of 13):

With the hash page table, the kernel should not use pmd_clear for clearing
huge pte entries.  Add a DEBUG_VM WARN to catch the wrong usage.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lkml.kernel.org/r/20200902114222.181353-1-aneesh.kumar@linux.ibm.com
Link: https://lkml.kernel.org/r/20200902114222.181353-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:14 -07:00
Linus Torvalds
5a32c3413d dma-mapping updates for 5.10
- rework the non-coherent DMA allocator
  - move private definitions out of <linux/dma-mapping.h>
  - lower CMA_ALIGNMENT (Paul Cercueil)
  - remove the omap1 dma address translation in favor of the common
    code
  - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)
  - support per-node DMA CMA areas (Barry Song)
  - increase the default seg boundary limit (Nicolin Chen)
  - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)
  - various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl+IiPwLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPKEQ//TM8vxjucnRl/pklpMin49dJorwiVvROLhQqLmdxw
 286ZKpVzYYAPc7LnNqwIBugnFZiXuHu8xPKQkIiOa2OtNDTwhKNoBxOAmOJaV6DD
 8JfEtZYeX5mKJ/Nqd2iSkIqOvCwZ9Wzii+aytJ2U88wezQr1fnyF4X49MegETEey
 FHWreSaRWZKa0MMRu9AQ0QxmoNTHAQUNaPc0PeqEtPULybfkGOGw4/ghSB7WcKrA
 gtKTuooNOSpVEHkTas2TMpcBp6lxtOjFqKzVN0ml+/nqq5NeTSDx91VOCX/6Cj76
 mXIg+s7fbACTk/BmkkwAkd0QEw4fo4tyD6Bep/5QNhvEoAriTuSRbhvLdOwFz0EF
 vhkF0Rer6umdhSK7nPd7SBqn8kAnP4vBbdmB68+nc3lmkqysLyE4VkgkdH/IYYQI
 6TJ0oilXWFmU6DT5Rm4FBqCvfcEfU2dUIHJr5wZHqrF2kLzoZ+mpg42fADoG4GuI
 D/oOsz7soeaRe3eYfWybC0omGR6YYPozZJ9lsfftcElmwSsFrmPsbO1DM5IBkj1B
 gItmEbOB9ZK3RhIK55T/3u1UWY3Uc/RVr+kchWvADGrWnRQnW0kxYIqDgiOytLFi
 JZNH8uHpJIwzoJAv6XXSPyEUBwXTG+zK37Ce769HGbUEaUrE71MxBbQAQsK8mDpg
 7fM=
 =Bkf/
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - rework the non-coherent DMA allocator

 - move private definitions out of <linux/dma-mapping.h>

 - lower CMA_ALIGNMENT (Paul Cercueil)

 - remove the omap1 dma address translation in favor of the common code

 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

 - support per-node DMA CMA areas (Barry Song)

 - increase the default seg boundary limit (Nicolin Chen)

 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

 - various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
  dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove <asm/dma-contiguous.h>
  dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split <linux/dma-mapping.h>
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement ->alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...
2020-10-15 14:43:29 -07:00
Jason A. Donenfeld
e6037a753e powerpc32: don't adjust unmoved stack pointer in csum_partial_copy_generic() epilogue
A recent change to the checksum code removed usage of some extra
arguments, alongside with storage on the stack for those, and the stack
pointer no longer needed to be adjusted in the function prologue.

But a left over subtraction wasn't removed in the function epilogue,
causing the function to return with the stack pointer moved 16 bytes
away from where it should have.  This corrupted local state and lead to
weird crashes.

This simply removes the leftover instruction from the epilogue.

Fixes: 70d65cd555 ("ppc: propagate the calling conventions change down to csum_partial_copy_generic()")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14 16:04:13 -07:00
Linus Torvalds
d5660df4a5 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "181 patches.

  Subsystems affected by this patch series: kbuild, scripts, ntfs,
  ocfs2, vfs, mm (slab, slub, kmemleak, dax, debug, pagecache, fadvise,
  gup, swap, memremap, memcg, selftests, pagemap, mincore, hmm, dma,
  memory-failure, vmallo and migration)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (181 commits)
  mm/migrate: remove obsolete comment about device public
  mm/migrate: remove cpages-- in migrate_vma_finalize()
  mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary
  memblock: use separate iterators for memory and reserved regions
  memblock: implement for_each_reserved_mem_region() using __next_mem_region()
  memblock: remove unused memblock_mem_size()
  x86/setup: simplify reserve_crashkernel()
  x86/setup: simplify initrd relocation and reservation
  arch, drivers: replace for_each_membock() with for_each_mem_range()
  arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
  memblock: reduce number of parameters in for_each_mem_range()
  memblock: make memblock_debug and related functionality private
  memblock: make for_each_memblock_type() iterator private
  mircoblaze: drop unneeded NUMA and sparsemem initializations
  riscv: drop unneeded node initialization
  h8300, nds32, openrisc: simplify detection of memory extents
  arm64: numa: simplify dummy_numa_init()
  arm, xtensa: simplify initialization of high memory pages
  dma-contiguous: simplify cma_early_percent_memory()
  KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
  ...
2020-10-14 09:57:24 -07:00
Mike Rapoport
b10d6bca87 arch, drivers: replace for_each_membock() with for_each_mem_range()
There are several occurrences of the following pattern:

	for_each_memblock(memory, reg) {
		start = __pfn_to_phys(memblock_region_memory_base_pfn(reg);
		end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));

		/* do something with start and end */
	}

Using for_each_mem_range() iterator is more appropriate in such cases and
allows simpler and cleaner code.

[akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
[rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
  Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:35 -07:00
Mike Rapoport
c9118e6c37 arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
There are several occurrences of the following pattern:

	for_each_memblock(memory, reg) {
		start_pfn = memblock_region_memory_base_pfn(reg);
		end_pfn = memblock_region_memory_end_pfn(reg);

		/* do something with start_pfn and end_pfn */
	}

Rather than iterate over all memblock.memory regions and each time query
for their start and end PFNs, use for_each_mem_pfn_range() iterator to get
simpler and clearer code.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>	[.clang-format]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-12-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:35 -07:00
Mike Rapoport
6e245ad4a1 memblock: reduce number of parameters in for_each_mem_range()
Currently for_each_mem_range() and for_each_mem_range_rev() iterators are
the most generic way to traverse memblock regions.  As such, they have 8
parameters and they are hardly convenient to users.  Most users choose to
utilize one of their wrappers and the only user that actually needs most
of the parameters is memblock itself.

To avoid yet another naming for memblock iterators, rename the existing
for_each_mem_range[_rev]() to __for_each_mem_range[_rev]() and add a new
for_each_mem_range[_rev]() wrappers with only index, start and end
parameters.

The new wrapper nicely fits into init_unavailable_mem() and will be used
in upcoming changes to simplify memblock traversals.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>	[MIPS]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-11-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:35 -07:00
Mike Rapoport
04ba0a923f KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
Patch series "memblock: seasonal cleaning^w cleanup", v3.

These patches simplify several uses of memblock iterators and hide some of
the memblock implementation details from the rest of the system.

This patch (of 17):

The memory size calculation in kvm_cma_reserve() traverses memblock.memory
rather than simply call memblock_phys_mem_size().  The comment in that
function suggests that at some point there should have been call to
memblock_analyze() before memblock_phys_mem_size() could be used.  As of
now, there is no memblock_analyze() at all and memblock_phys_mem_size()
can be used as soon as cold-plug memory is registered with memblock.

Replace loop over memblock.memory with a call to memblock_phys_mem_size().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://lkml.kernel.org/r/20200818151634.14343-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20200818151634.14343-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:34 -07:00
Dan Williams
b7b3c01b19 mm/memremap_pages: support multiple ranges per invocation
In support of device-dax growing the ability to front physically
dis-contiguous ranges of memory, update devm_memremap_pages() to track
multiple ranges with a single reference counter and devm instance.

Convert all [devm_]memremap_pages() users to specify the number of ranges
they are mapping in their 'struct dev_pagemap' instance.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: "Jérôme Glisse" <jglisse@redhat.co
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hulk Robot <hulkci@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Jason Yan <yanaijie@huawei.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Jia He <justin.he@arm.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/159643103789.4062302.18426128170217903785.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106116293.30709.13350662794915396198.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:28 -07:00
Dan Williams
a4574f63ed mm/memremap_pages: convert to 'struct range'
The 'struct resource' in 'struct dev_pagemap' is only used for holding
resource span information.  The other fields, 'name', 'flags', 'desc',
'parent', 'sibling', and 'child' are all unused wasted space.

This is in preparation for introducing a multi-range extension of
devm_memremap_pages().

The bulk of this change is unwinding all the places internal to libnvdimm
that used 'struct resource' unnecessarily, and replacing instances of
'struct dev_pagemap'.res with 'struct dev_pagemap'.range.

P2PDMA had a minor usage of the resource flags field, but only to report
failures with "%pR".  That is replaced with an open coded print of the
range.

[dan.carpenter@oracle.com: mm/hmm/test: use after free in dmirror_allocate_chunk()]
  Link: https://lkml.kernel.org/r/20200926121402.GA7467@kadam

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	[xen]
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hulk Robot <hulkci@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Jason Yan <yanaijie@huawei.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Jia He <justin.he@arm.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/159643103173.4062302.768998885691711532.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106115761.30709.13539840236873663620.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:28 -07:00
Linus Torvalds
8b05418b25 seccomp updates for v5.10-rc1
- heavily refactor seccomp selftests (and clone3 selftests dependency) to
   fix powerpc (Kees Cook, Thadeu Lima de Souza Cascardo)
 - fix style issue in selftests (Zou Wei)
 - upgrade "unknown action" from KILL_THREAD to KILL_PROCESS (Rich Felker)
 - replace task_pt_regs(current) with current_pt_regs() (Denis Efremov)
 - fix corner-case race in USER_NOTIF (Jann Horn)
 - make CONFIG_SECCOMP no longer per-arch (YiFei Zhu)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl+E1LAWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJgRfD/0cq7W51+o34719vefC+oZaMjJJ
 Bd5HYshmr6NRpMqn0OhtT9kVi6OeV0sK0VJeNxSISDIaGNJ8xCI9YhnXwzY+7myK
 +IQu3i2Hv7dlWvTaXWFLL+mvfk6WopLntFGGJQ8KPMnP2gcfH2AZmOeAKGFGhBDe
 NwpAUZ9zriXg9JCQp6u0FzPJgk8KfgfHjUY6Hsa095gg0aPSJhc8bWEUNBQwjCe6
 uIcxDP/zK2WWaEhO9BfHt6/VTcXw7QgTLS3yM+pwBCgR1JHs7HMhtgcwPT410qES
 LmYD8OiHmv5AZhDjcCcNipKEv3ZnxkLnpU/6hfaKM4zn/DoaR/zbfjO9U017rcNV
 9gf7k5siAP7DH48IFlqf4Erzd3xyF0OJDnVfC7NiPtggPfO9aWOHJJZCuJRQOdrN
 qPMjkaQzFb02qb501PLEn55F24OLDjz1vFOqpkJm2/XamOBVV4uiRKmfpNEo/MOf
 QkhSvzvwEFErWwzPH95uFyVhs42stwnM3ppnwtya2+U5kxXdNvbAR8N5leH7siaU
 ab+YJIHW59+BxXTlKgXIcqBP/6RqJWJtuT9OqGs0K2A7FhQSexh5MOm+9vvGgIwZ
 Qjyijku8dB3aV94BNGnlJq6BV+4Hc6EGadh7h3b8GiRAUTYo0pk5G/iKL6Ii+R6p
 0msJENqalKFtNCr70w==
 =a4u2
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
 "The bulk of the changes are with the seccomp selftests to accommodate
  some powerpc-specific behavioral characteristics. Additional cleanups,
  fixes, and improvements are also included:

   - heavily refactor seccomp selftests (and clone3 selftests
     dependency) to fix powerpc (Kees Cook, Thadeu Lima de Souza
     Cascardo)

   - fix style issue in selftests (Zou Wei)

   - upgrade "unknown action" from KILL_THREAD to KILL_PROCESS (Rich
     Felker)

   - replace task_pt_regs(current) with current_pt_regs() (Denis
     Efremov)

   - fix corner-case race in USER_NOTIF (Jann Horn)

   - make CONFIG_SECCOMP no longer per-arch (YiFei Zhu)"

* tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
  seccomp: Make duplicate listener detection non-racy
  seccomp: Move config option SECCOMP to arch/Kconfig
  selftests/clone3: Avoid OS-defined clone_args
  selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit
  selftests/seccomp: Allow syscall nr and ret value to be set separately
  selftests/seccomp: Record syscall during ptrace entry
  selftests/seccomp: powerpc: Fix seccomp return value testing
  selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET
  selftests/seccomp: Avoid redundant register flushes
  selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG
  selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG
  selftests/seccomp: Remove syscall setting #ifdefs
  selftests/seccomp: mips: Remove O32-specific macro
  selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro
  selftests/seccomp: arm: Define SYSCALL_NUM_SET macro
  selftests/seccomp: mips: Define SYSCALL_NUM_SET macro
  selftests/seccomp: Provide generic syscall setting macro
  selftests/seccomp: Refactor arch register macros to avoid xtensa special case
  selftests/seccomp: Use __NR_mknodat instead of __NR_mknod
  selftests/seccomp: Use bitwise instead of arithmetic operator for flags
  ...
2020-10-13 16:33:43 -07:00
Linus Torvalds
39a5101f98 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Allow DRBG testing through user-space af_alg
   - Add tcrypt speed testing support for keyed hashes
   - Add type-safe init/exit hooks for ahash

  Algorithms:
   - Mark arc4 as obsolete and pending for future removal
   - Mark anubis, khazad, sead and tea as obsolete
   - Improve boot-time xor benchmark
   - Add OSCCA SM2 asymmetric cipher algorithm and use it for integrity

  Drivers:
   - Fixes and enhancement for XTS in caam
   - Add support for XIP8001B hwrng in xiphera-trng
   - Add RNG and hash support in sun8i-ce/sun8i-ss
   - Allow imx-rngc to be used by kernel entropy pool
   - Use crypto engine in omap-sham
   - Add support for Ingenic X1830 with ingenic"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (205 commits)
  X.509: Fix modular build of public_key_sm2
  crypto: xor - Remove unused variable count in do_xor_speed
  X.509: fix error return value on the failed path
  crypto: bcm - Verify GCM/CCM key length in setkey
  crypto: qat - drop input parameter from adf_enable_aer()
  crypto: qat - fix function parameters descriptions
  crypto: atmel-tdes - use semicolons rather than commas to separate statements
  crypto: drivers - use semicolons rather than commas to separate statements
  hwrng: mxc-rnga - use semicolons rather than commas to separate statements
  hwrng: iproc-rng200 - use semicolons rather than commas to separate statements
  hwrng: stm32 - use semicolons rather than commas to separate statements
  crypto: xor - use ktime for template benchmarking
  crypto: xor - defer load time benchmark to a later time
  crypto: hisilicon/zip - fix the uninitalized 'curr_qm_qp_num'
  crypto: hisilicon/zip - fix the return value when device is busy
  crypto: hisilicon/zip - fix zero length input in GZIP decompress
  crypto: hisilicon/zip - fix the uncleared debug registers
  lib/mpi: Fix unused variable warnings
  crypto: x86/poly1305 - Remove assignments with no effect
  hwrng: npcm - modify readl to readb
  ...
2020-10-13 08:50:16 -07:00
Linus Torvalds
22230cd2c5 Merge branch 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat mount cleanups from Al Viro:
 "The last remnants of mount(2) compat buried by Christoph.

  Buried into NFS, that is.

  Generally I'm less enthusiastic about "let's use in_compat_syscall()
  deep in call chain" kind of approach than Christoph seems to be, but
  in this case it's warranted - that had been an NFS-specific wart,
  hopefully not to be repeated in any other filesystems (read: any new
  filesystem introducing non-text mount options will get NAKed even if
  it doesn't mess the layout up).

  IOW, not worth trying to grow an infrastructure that would avoid that
  use of in_compat_syscall()..."

* 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: remove compat_sys_mount
  fs,nfs: lift compat nfs4 mount data handling into the nfs code
  nfs: simplify nfs4_parse_monolithic
2020-10-12 16:44:57 -07:00
Linus Torvalds
e18afa5bfa Merge branch 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat quotactl cleanups from Al Viro:
 "More Christoph's compat cleanups: quotactl(2)"

* 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  quota: simplify the quotactl compat handling
  compat: add a compat_need_64bit_alignment_fixup() helper
  compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
2020-10-12 16:37:13 -07:00
Linus Torvalds
85ed13e78d Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat iovec cleanups from Al Viro:
 "Christoph's series around import_iovec() and compat variant thereof"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  security/keys: remove compat_keyctl_instantiate_key_iov
  mm: remove compat_process_vm_{readv,writev}
  fs: remove compat_sys_vmsplice
  fs: remove the compat readv/writev syscalls
  fs: remove various compat readv/writev helpers
  iov_iter: transparently handle compat iovecs in import_iovec
  iov_iter: refactor rw_copy_check_uvector and import_iovec
  iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
  compat.h: fix a spelling error in <linux/compat.h>
2020-10-12 16:35:51 -07:00
Linus Torvalds
c90578360c Merge branch 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull copy_and_csum cleanups from Al Viro:
 "Saner calling conventions for csum_and_copy_..._user() and friends"

[ Removing 800+ lines of code and cleaning stuff up is good  - Linus ]

* 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ppc: propagate the calling conventions change down to csum_partial_copy_generic()
  amd64: switch csum_partial_copy_generic() to new calling conventions
  sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
  xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
  mips: propagate the calling convention change down into __csum_partial_copy_..._user()
  mips: __csum_partial_copy_kernel() has no users left
  mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
  sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
  i386: propagate the calling conventions change down to csum_partial_copy_generic()
  sh: propage the calling conventions change down to csum_partial_copy_generic()
  m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
  arm: propagate the calling convention changes down to csum_partial_copy_from_user()
  alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
  saner calling conventions for csum_and_copy_..._user()
  csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  csum_partial_copy_nocheck(): drop the last argument
  unify generic instances of csum_partial_copy_nocheck()
  icmp_push_reply(): reorder adding the checksum up
  skb_copy_and_csum_bits(): don't bother with the last argument
2020-10-12 16:24:13 -07:00
Linus Torvalds
1c6890707e This tree prepares to unify the kretprobe trampoline handler and make
kretprobe lockless. (Those patches are still work in progress.)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+EgmMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hKQg//WrYVMc+lLG+QP4IuKfolZVGNeS60crZE
 mTvs4iX8gBrU5omrgatrjUiDhrln6MiTf6H0ec072BAho91lom/AlyDUQbta5sls
 uXKzIjHe9J7ca+myXGDiXkGmWXgcBYHBHifyzf04xhPyFXH869HLxFXCHeV1S3m7
 Tga1Lceths425t8nnYb9yao9k26l22BSklzPqEM/XNNnktrMiaiYlfgUxi1g3hMj
 v9IbZy43qpzljyrnfRk/tRGMnZ/BtZpj7swQEjUVOKgmcymX6bQoxqYvpAH5mYX7
 jqKcTLsw/Jm4YhZdeBpjZc2JNQkNJSLjiXMMtQTmncPKx2shuU1s4KhgRtYEEeyI
 BO37k3RwplED7/yBJtojNt0WWYfd7X2ee8SPuSW/VPL6jSDgJii3Um0AldPZ0J3g
 72OT4rJkyqFER0ZKSf8uIym2Zi7F5IvtzK2xJAzquOQlYdCaKSNrWurckOzWHMm9
 JKqUqq3nV4mFUKEE7Kf0Nu3UgQZNKpxUNepWBoJb3j6baK32Qgb6qpNLLPTTi2qJ
 AwxicRlr7jzdyP2cwvU5z2FuilPypOob8ZnowhhIyV+4xQY9CymJ3uluXattDC74
 ZNgydTyyYCo0PwYZGUDeE8o87apYd3+sEOErLtw4CjaoiadxDaMBmfsHzU7W29Rc
 Fow4+FQCK/Y=
 =2jY/
 -----END PGP SIGNATURE-----

Merge tag 'perf-kprobes-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf/kprobes updates from Ingo Molnar:
 "This prepares to unify the kretprobe trampoline handler and make
  kretprobe lockless (those patches are still work in progress)"

* tag 'perf-kprobes-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
  kprobes: Make local functions static
  kprobes: Free kretprobe_instance with RCU callback
  kprobes: Remove NMI context check
  sparc: kprobes: Use generic kretprobe trampoline handler
  sh: kprobes: Use generic kretprobe trampoline handler
  s390: kprobes: Use generic kretprobe trampoline handler
  powerpc: kprobes: Use generic kretprobe trampoline handler
  parisc: kprobes: Use generic kretprobe trampoline handler
  mips: kprobes: Use generic kretprobe trampoline handler
  ia64: kprobes: Use generic kretprobe trampoline handler
  csky: kprobes: Use generic kretprobe trampoline handler
  arc: kprobes: Use generic kretprobe trampoline handler
  arm64: kprobes: Use generic kretprobe trampoline handler
  arm: kprobes: Use generic kretprobe trampoline handler
  x86/kprobes: Use generic kretprobe trampoline handler
  kprobes: Add generic kretprobe trampoline handler
2020-10-12 14:21:15 -07:00
Linus Torvalds
34eb62d868 Orphan link sections were a long-standing source of obscure bugs,
because the heuristics that various linkers & compilers use to handle them
 (include these bits into the output image vs discarding them silently)
 are both highly idiosyncratic and also version dependent.
 
 Instead of this historically problematic mess, this tree by Kees Cook (et al)
 adds build time asserts and build time warnings if there's any orphan section
 in the kernel or if a section is not sized as expected.
 
 And because we relied on so many silent assumptions in this area, fix a metric
 ton of dependencies and some outright bugs related to this, before we can
 finally enable the checks on the x86, ARM and ARM64 platforms.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+Edv4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hiKBAApdJEOaK7hMc3013DYNctklIxEPJL2mFJ
 11YJRIh4pUJTF0TE+EHT/D+rSIuRsyuoSmOQBQ61/wVSnyG067GjjVJRqh/eYaJ1
 fDhJi2FuHOjXl+CiN0KxzBjjp+V4NhF7jHT59tpQSvfZeg7FjteoxfztxaCp5ek3
 S3wHB3CC4c4jE3lfjHem1E9/PwT4kwPYx1c3gAUdEqJdjkihjX9fWusfjLeqW6/d
 Y5VkApi6bL9XiZUZj5l0dEIweLJJ86+PkKJqpo3spxxEak1LSn1MEix+lcJ8e1Kg
 sb/bEEivDcmFlFWOJnn0QLquCR0Cx5bz1pwsL0tuf0yAd4+sXX5IMuGUysZlEdKM
 BHL9h5HbevGF4BScwZwZH7lyEg7q67s5KnRu4hxy0Swfcj7y0oT/9lXqpbpZ2DqO
 Hd+bRRQKIbqnTMp0hcit9LfpLp93vj0dBlaV5ocAJJlu62u9VnwGG5HQuZ5giLUr
 kA1SLw63Y1wopFRxgFyER8les7eLsu0zxHeK44rRVlVnfI99OMTOgVNicmDFy3Fm
 AfcnfJG0BqBEJGQz5es34uQQKKBwFPtC9NztopI62KiwOspYYZyrO1BNxdOc6DlS
 mIHrmO89HMXuid5eolvLaFqUWirHoWO8TlycgZxUWVHc2txVPjAEU/axouU/dSSU
 w/6GpzAa+7g=
 =fXAw
 -----END PGP SIGNATURE-----

Merge tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull orphan section checking from Ingo Molnar:
 "Orphan link sections were a long-standing source of obscure bugs,
  because the heuristics that various linkers & compilers use to handle
  them (include these bits into the output image vs discarding them
  silently) are both highly idiosyncratic and also version dependent.

  Instead of this historically problematic mess, this tree by Kees Cook
  (et al) adds build time asserts and build time warnings if there's any
  orphan section in the kernel or if a section is not sized as expected.

  And because we relied on so many silent assumptions in this area, fix
  a metric ton of dependencies and some outright bugs related to this,
  before we can finally enable the checks on the x86, ARM and ARM64
  platforms"

* tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/boot/compressed: Warn on orphan section placement
  x86/build: Warn on orphan section placement
  arm/boot: Warn on orphan section placement
  arm/build: Warn on orphan section placement
  arm64/build: Warn on orphan section placement
  x86/boot/compressed: Add missing debugging sections to output
  x86/boot/compressed: Remove, discard, or assert for unwanted sections
  x86/boot/compressed: Reorganize zero-size section asserts
  x86/build: Add asserts for unwanted sections
  x86/build: Enforce an empty .got.plt section
  x86/asm: Avoid generating unused kprobe sections
  arm/boot: Handle all sections explicitly
  arm/build: Assert for unwanted sections
  arm/build: Add missing sections
  arm/build: Explicitly keep .ARM.attributes sections
  arm/build: Refactor linker script headers
  arm64/build: Assert for unwanted sections
  arm64/build: Add missing DWARF sections
  arm64/build: Use common DISCARDS in linker script
  arm64/build: Remove .eh_frame* sections due to unwind tables
  ...
2020-10-12 13:39:19 -07:00
Linus Torvalds
cc7343724e Surgery of the MSI interrupt handling to prepare the support of upcoming
devices which require non-PCI based MSI handling.
 
   - Cleanup historical leftovers all over the place
 
   - Rework the code to utilize more core functionality
 
   - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain
     assignment to PCI devices possible.
 
   - Assign irqdomains to PCI devices at initialization time which allows
     to utilize the full functionality of hierarchical irqdomains.
 
   - Remove arch_.*_msi_irq() functions from X86 and utilize the irqdomain
     which is assigned to the device for interrupt management.
 
   - Make the arch_.*_msi_irq() support conditional on a config switch and
     let the last few users select it.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+EUxcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoagLEACGp5U7a4mk24GsOZJDhrua1PHR/fhb
 enn/5yOPpxDXdYmtFHIjV5qrNjDTV/WqDlI96KOi+oinG1Eoj0O/MA1AcSRhp6nf
 jVdAuK1X0DHDUTEeTAP0JFwqd2j0KlIOphBrIMgeWIf1CRKlYiJaO+ioF9fKgwZ/
 /HigOTSykGYMPggm3JXnWTWtJkKSGFxeADBvVHt5RpVmbWtrI4YoSBxKEMtvjyeM
 5+GsqbCad1CnFYTN74N+QWVGmgGnUWGEzWsPYnJ9hW+yyjad1kWx3n6NcCWhssaC
 E4vAXl6JuCPntL7jBFkbfUkQsgq12ThMZYWpCq8pShJA9O2tDKkxIGasHWrIt4cz
 nYrESiv6hM7edjtOvBc086Gd0A2EyGOM879goHyaNVaTO4rI6jfZG7PlW1HHWibS
 mf/bdTXBtULGNgEt7T8Qnb8sZ+D01WqzLrq/wm645jIrTzvNHUEpOhT1aH/g4TFQ
 cNHD5PcM9OTmiBir9srNd47+1s2mpfwdMYHKBt2QgiXMO8fRgdtr6WLQE4vJjmG8
 sA0yGGsgdTKeg2wW1ERF1pWL0Lt05Iaa42Skm0D3BwcOG2n5ltkBHzVllto9cTUh
 kIldAOgxGE6QeCnnlrnbHz5mvzt/3Ih/PIKqPSUAC94Kx1yvVHRYuOvDExeO8DFB
 P+f0TkrscZObSg==
 =JlqV
 -----END PGP SIGNATURE-----

Merge tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 irq updates from Thomas Gleixner:
 "Surgery of the MSI interrupt handling to prepare the support of
  upcoming devices which require non-PCI based MSI handling:

   - Cleanup historical leftovers all over the place

   - Rework the code to utilize more core functionality

   - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain
     assignment to PCI devices possible.

   - Assign irqdomains to PCI devices at initialization time which
     allows to utilize the full functionality of hierarchical
     irqdomains.

   - Remove arch_.*_msi_irq() functions from X86 and utilize the
     irqdomain which is assigned to the device for interrupt management.

   - Make the arch_.*_msi_irq() support conditional on a config switch
     and let the last few users select it"

* tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS
  x86/apic/msi: Unbreak DMAR and HPET MSI
  iommu/amd: Remove domain search for PCI/MSI
  iommu/vt-d: Remove domain search for PCI/MSI[X]
  x86/irq: Make most MSI ops XEN private
  x86/irq: Cleanup the arch_*_msi_irqs() leftovers
  PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
  x86/pci: Set default irq domain in pcibios_add_device()
  iommm/amd: Store irq domain in struct device
  iommm/vt-d: Store irq domain in struct device
  x86/xen: Wrap XEN MSI management into irqdomain
  irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
  x86/xen: Consolidate XEN-MSI init
  x86/xen: Rework MSI teardown
  x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
  PCI/MSI: Provide pci_dev_has_special_msi_domain() helper
  PCI_vmd_Mark_VMD_irqdomain_with_DOMAIN_BUS_VMD_MSI
  irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
  x86/irq: Initialize PCI/MSI domain at PCI init time
  x86/pci: Reducde #ifdeffery in PCI init code
  ...
2020-10-12 11:40:41 -07:00
Linus Torvalds
ca1b66922a * Extend the recovery from MCE in kernel space also to processes which
encounter an MCE in kernel space but while copying from user memory by
 sending them a SIGBUS on return to user space and umapping the faulty
 memory, by Tony Luck and Youquan Song.
 
 * memcpy_mcsafe() rework by splitting the functionality into
 copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
 support for new hardware which can recover from a machine check
 encountered during a fast string copy and makes that the default and
 lets the older hardware which does not support that advance recovery,
 opt in to use the old, fragile, slow variant, by Dan Williams.
 
 * New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.
 
 * Do not use MSR-tracing accessors in #MC context and flag any fault
 while accessing MCA architectural MSRs as an architectural violation
 with the hope that such hw/fw misdesigns are caught early during the hw
 eval phase and they don't make it into production.
 
 * Misc fixes, improvements and cleanups, as always.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl+EIpUACgkQEsHwGGHe
 VUouoBAAgwb+NkWZtIqGImV4f+LOyFjhTR/r/7ZyiijXdbhOIuAdc/jQM31mQxug
 sX2jxaRYnf1n6SLA0ggX99gwr2deRQ/hsNf5Abw55GC+Z1dOxpGL0k59A3ELl1IR
 H9KYmCAFQIHvzfk38qcdND73XHcgthQoXFBOG9wAPAdgDWnaiWt6lcLAq8OiJTmp
 D8pInAYhcnL8YXwMGyQQ1KkFn9HwydoWDsK5Ff2shaw2/+dMQqd1zetenbVtjhLb
 iNYGvV7Bi/RQ8PyMbzmtTWa4kwQJAHC2gptkGxty//2ADGVBbqUQdqF9TjIWCNy5
 V6Ldv5zo0/1s7DOzji3htzqkSs/K1Ea6d2LtZjejkJipHKV5x068UC6Fu+PlfS2D
 VZfcICeapU4G2F3Zvks2DlZ7dVTbHCvoI78Qi7bBgczPUVmk6iqah4xuQaiHyBJc
 kTFDA4Nnf/026GpoWRiFry9vqdnHBZyLet5A6Y+SoWF0FbhYnCVPpq4MnussYoav
 lUIi9ZZav6X2RZp9DDM1f9d5xubtKq0DKt93wvzqAhjK0T2DikckJ+riOYkI6N8t
 fHCBNUkdfgyMzJUTBPAzYQ7RmjbjKWJi7xWP0oz6+GqOJkQfSTVC5/2yEffbb3ya
 whYRS6iklbl7yshzaOeecXsZcAeK2oGPfoHg34WkHFgXdF5mNgA=
 =u1Wg
 -----END PGP SIGNATURE-----

Merge tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:

 - Extend the recovery from MCE in kernel space also to processes which
   encounter an MCE in kernel space but while copying from user memory
   by sending them a SIGBUS on return to user space and umapping the
   faulty memory, by Tony Luck and Youquan Song.

 - memcpy_mcsafe() rework by splitting the functionality into
   copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
   support for new hardware which can recover from a machine check
   encountered during a fast string copy and makes that the default and
   lets the older hardware which does not support that advance recovery,
   opt in to use the old, fragile, slow variant, by Dan Williams.

 - New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.

 - Do not use MSR-tracing accessors in #MC context and flag any fault
   while accessing MCA architectural MSRs as an architectural violation
   with the hope that such hw/fw misdesigns are caught early during the
   hw eval phase and they don't make it into production.

 - Misc fixes, improvements and cleanups, as always.

* tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Allow for copy_mc_fragile symbol checksum to be generated
  x86/mce: Decode a kernel instruction to determine if it is copying from user
  x86/mce: Recover from poison found while copying from user space
  x86/mce: Avoid tail copy when machine check terminated a copy from user
  x86/mce: Add _ASM_EXTABLE_CPY for copy user access
  x86/mce: Provide method to find out the type of an exception handler
  x86/mce: Pass pointer to saved pt_regs to severity calculation routines
  x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
  x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
  x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list
  x86/mce: Add Skylake quirk for patrol scrub reported errors
  RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE()
  x86/mce: Annotate mce_rd/wrmsrl() with noinstr
  x86/mce/dev-mcelog: Do not update kflags on AMD systems
  x86/mce: Stop mce_reign() from re-computing severity for every CPU
  x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR
  x86/mce: Increase maximum number of banks to 64
  x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
  x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
  RAS/CEC: Fix cec_init() prototype
2020-10-12 10:14:38 -07:00
YiFei Zhu
282a181b1a seccomp: Move config option SECCOMP to arch/Kconfig
In order to make adding configurable features into seccomp easier,
it's better to have the options at one single location, considering
especially that the bulk of seccomp code is arch-independent. An quick
look also show that many SECCOMP descriptions are outdated; they talk
about /proc rather than prctl.

As a result of moving the config option and keeping it default on,
architectures arm, arm64, csky, riscv, sh, and xtensa did not have SECCOMP
on by default prior to this and SECCOMP will be default in this change.

Architectures microblaze, mips, powerpc, s390, sh, and sparc have an
outdated depend on PROC_FS and this dependency is removed in this change.

Suggested-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/lkml/CAG48ez1YWz9cnp08UZgeieYRhHdqh-ch7aNwc4JRBnGyrmgfMg@mail.gmail.com/
Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
[kees: added HAVE_ARCH_SECCOMP help text, tweaked wording]
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/9ede6ef35c847e58d61e476c6a39540520066613.1600951211.git.yifeifz2@illinois.edu
2020-10-08 13:17:47 -07:00
Dan Williams
ec6347bb43 x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.

Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:

  On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
  >
  > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
  > >
  > > However now I see that copy_user_generic() works for the wrong reason.
  > > It works because the exception on the source address due to poison
  > > looks no different than a write fault on the user address to the
  > > caller, it's still just a short copy. So it makes copy_to_user() work
  > > for the wrong reason relative to the name.
  >
  > Right.
  >
  > And it won't work that way on other architectures. On x86, we have a
  > generic function that can take faults on either side, and we use it
  > for both cases (and for the "in_user" case too), but that's an
  > artifact of the architecture oddity.
  >
  > In fact, it's probably wrong even on x86 - because it can hide bugs -
  > but writing those things is painful enough that everybody prefers
  > having just one function.

Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().

Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.

One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.

 [ bp: Massage a bit. ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: <stable@vger.kernel.org>
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-06 11:18:04 +02:00
Christoph Hellwig
9f4df96b87 dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
Move more nitty gritty DMA implementation details into the common
internal header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:06 +02:00
Christoph Hellwig
0a0f0d8be7 dma-mapping: split <linux/dma-mapping.h>
Split out all the bits that are purely for dma_map_ops implementations
and related code into a new <linux/dma-map-ops.h> header so that they
don't get pulled into all the drivers.  That also means the architecture
specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h>
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:03 +02:00
Christoph Hellwig
c3973b401e mm: remove compat_process_vm_{readv,writev}
Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:15 -04:00
Christoph Hellwig
598b3cec83 fs: remove compat_sys_vmsplice
Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:15 -04:00
Christoph Hellwig
5f764d624a fs: remove the compat readv/writev syscalls
Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:14 -04:00
David S. Miller
1f25c9bbfd Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-09-29

The following pull-request contains BPF updates for your *net* tree.

We've added 7 non-merge commits during the last 14 day(s) which contain
a total of 7 files changed, 28 insertions(+), 8 deletions(-).

The main changes are:

1) fix xdp loading regression in libbpf for old kernels, from Andrii.

2) Do not discard packet when NETDEV_TX_BUSY, from Magnus.

3) Fix corner cases in libbpf related to endianness and kconfig, from Tony.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 01:49:20 -07:00
He Zhe
9cf51446e6 bpf, powerpc: Fix misuse of fallthrough in bpf_jit_comp()
The user defined label following "fallthrough" is not considered by GCC
and causes build failure.

kernel-source/include/linux/compiler_attributes.h:208:41: error: attribute
'fallthrough' not preceding a case label or default label [-Werror]
 208   define fallthrough _attribute((fallthrough_))
                          ^~~~~~~~~~~~~

Fixes: df561f6688 ("treewide: Use fallthrough pseudo-keyword")
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/bpf/20200928090023.38117-1-zhe.he@windriver.com
2020-09-29 16:39:11 +02:00
Thomas Gleixner
981aa1d366 PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS
The unconditional selection of PCI_MSI_ARCH_FALLBACKS has an unmet
dependency because PCI_MSI_ARCH_FALLBACKS is defined in a 'if PCI' clause.

As it is only relevant when PCI_MSI is enabled, update the affected
architecture Kconfigs to make the selection of PCI_MSI_ARCH_FALLBACKS
depend on 'if PCI_MSI'.

Fixes: 077ee78e39 ("PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable")
Reported-by: Qian Cai <cai@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Links: https://lore.kernel.org/r/cdfd63305caa57785b0925dd24c0711ea02c8527.camel@redhat.com
2020-09-28 16:03:10 +02:00
Christoph Hellwig
efa70f2fdc dma-mapping: add a new dma_alloc_pages API
This API is the equivalent of alloc_pages, except that the returned memory
is guaranteed to be DMA addressable by the passed in device.  The
implementation will also be used to provide a more sensible replacement
for DMA_ATTR_NON_CONSISTENT flag.

Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages
as its backend.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)
2020-09-25 06:20:47 +02:00
Christoph Hellwig
8c1c6c7588 Merge branch 'master' of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into dma-mapping-for-next
Pull in the latest 5.9 tree for the commit to revert the
V4L2_FLAG_MEMORY_NON_CONSISTENT uapi addition.
2020-09-25 06:19:19 +02:00
Christoph Hellwig
028abd9222 fs: remove compat_sys_mount
compat_sys_mount is identical to the regular sys_mount now, so remove it
and use the native version everywhere.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-22 23:45:57 -04:00
Linus Torvalds
5a55d36f71 powerpc fixes for 5.9 #5
Opt us out of the DEBUG_VM_PGTABLE support for now as it's causing crashes.
 
 Fix a long standing bug in our DMA mask handling that was hidden until recently,
 and which caused problems with some drivers.
 
 Fix a boot failure on systems with large amounts of RAM, and no hugepage support
 and using Radix MMU, only seen in the lab.
 
 A few other minor fixes.
 
 Thanks to:
   Alexey Kardashevskiy, Aneesh Kumar K.V, Gautham R. Shenoy, Hari Bathini, Ira
   Weiny, Nick Desaulniers, Shirisha Ganta, Vaibhav Jain, Vaidyanathan
   Srinivasan.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl9kk4UTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLM5D/42Wuuq6hOpGEfE2XjBjsOxCR07SCw9
 CQ/c72jw/1tDLe0YclPVYAZc8BmT8uLo6tE2Ot+0vI+1Y06rRvC5g5uQBwp1zD/t
 MOwC0d4zf8a7WzBCcVaBv9HMHVOaKaTBwQc2R4k6NzYtARIf5m0evMPOWINioRsv
 /x4+Np8aeQd1WiVn6PBdqL8w1yRhk8LsVDvX35lFzQlgZvH09umXSGjw9K442xdE
 lr1PrV9GKd4DeudwLHPkMNs8Ul1QTxmY5vKIAklsJ5g3dBySfM7+GMrTzYOHYkWr
 aqGfGH6ojdFSQZRo7QwFhO52Kni7JN7AIoUPEBDqLb1fR10w8wesdjCs8JQMXIMc
 8Eo210EbiSsq6kG/LzqZuStLUAup3rQd20+wWua7jo8HbcZOLDH7pPGwNtJInPTr
 gwH7sALYhTzFUAO4LzqVVE+yA8wndFPHoz+QSkO6LZJKuszON2LJ0r+IEx1l9Wmr
 ClZYujK36/N+ih42xBFcBZi2lL0VKzlkn7u7NDeZQBONgoJMCoWkGDkEHnMI9zdT
 1iYcVZyY+IpJLcwxH+NP8weJKHIZbP4kvuFXR/QIpHdrDWKaTT95BvBhAK1KMYBo
 lLo81zMTzJCz2wWDg/+3sLuEzHdGr//uxcVXxjqE2vyEckeuZjiCyz0dRNEY4Sw3
 vB2p3Zl7L0J4pQ==
 =cj6B
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Some more powerpc fixes for 5.9:

   - Opt us out of the DEBUG_VM_PGTABLE support for now as it's causing
     crashes.

   - Fix a long standing bug in our DMA mask handling that was hidden
     until recently, and which caused problems with some drivers.

   - Fix a boot failure on systems with large amounts of RAM, and no
     hugepage support and using Radix MMU, only seen in the lab.

   - A few other minor fixes.

  Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Gautham R. Shenoy,
  Hari Bathini, Ira Weiny, Nick Desaulniers, Shirisha Ganta, Vaibhav
  Jain, and Vaidyanathan Srinivasan"

* tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
  cpuidle: pseries: Fix CEDE latency conversion from tb to us
  powerpc/dma: Fix dma_map_ops::get_required_mask
  Revert "powerpc/build: vdso linker warning for orphan sections"
  powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc
  selftests/powerpc: Skip PROT_SAO test in guests/LPARS
  powerpc/book3s64/radix: Fix boot failure with large amount of guest memory
2020-09-18 11:48:25 -07:00
Christoph Hellwig
cc7886d25b compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
lift the compat_s64 and compat_u64 definitions into common code using the
COMPAT_FOR_U64_ALIGNMENT symbol for the x86 special case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-17 13:00:46 -04:00
Thomas Gleixner
077ee78e39 PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture
requires them or not. Architectures which are fully utilizing hierarchical
irq domains should never call into that code.

It's not only architectures which depend on that by implementing one or
more of the weak functions, there is also a bunch of drivers which relies
on the weak functions which invoke msi_controller::setup_irq[s] and
msi_controller::teardown_irq.

Make the architectures and drivers which rely on them select them in Kconfig
and if not selected replace them by stub functions which emit a warning and
fail the PCI/MSI interrupt allocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200826112333.992429909@linutronix.de
2020-09-16 16:52:37 +02:00
Linus Torvalds
973c096f6a vgacon: remove software scrollback support
Yunhai Zhang recently fixed a VGA software scrollback bug in commit
ebfdfeeae8 ("vgacon: Fix for missing check in scrollback handling"),
but that then made people look more closely at some of this code, and
there were more problems on the vgacon side, but also the fbcon software
scrollback.

We don't really have anybody who maintains this code - probably because
nobody actually _uses_ it any more.  Sure, people still use both VGA and
the framebuffer consoles, but they are no longer the main user
interfaces to the kernel, and haven't been for decades, so these kinds
of extra features end up bitrotting and not really being used.

So rather than try to maintain a likely unused set of code, I'll just
aggressively remove it, and see if anybody even notices.  Maybe there
are people who haven't jumped on the whole GUI badnwagon yet, and think
it's just a fad.  And maybe those people use the scrollback code.

If that turns out to be the case, we can resurrect this again, once
we've found the sucker^Wmaintainer for it who actually uses it.

Reported-by: NopNop Nop <nopitydays@gmail.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Cc: 张云海 <zhangyunhai@nsfocus.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-14 10:06:15 -07:00
Christoph Hellwig
5ceda74093 dma-direct: rename and cleanup __phys_to_dma
The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious.  Try
to improve the situation by renaming __phys_to_dma to
phys_to_dma_unencryped, and not forcing architectures that want to
override phys_to_dma to actually provide __phys_to_dma.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2020-09-11 09:14:43 +02:00
Christoph Hellwig
7bc5c428a6 dma-direct: remove __dma_to_phys
There is no harm in just always clearing the SME encryption bit, while
significantly simplifying the interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2020-09-11 09:14:25 +02:00
Vaibhav Jain
0460534b53 powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
The newly introduced 'perf_stats' attribute uses the default access
mode of 0444, allowing non-root users to access performance stats of
an nvdimm and potentially force the kernel into issuing a large number
of expensive hypercalls. Since the information exposed by this
attribute cannot be cached it is better to ward off access to this
attribute from users who don't need to access to these performance
statistics.

Hence update the access mode of 'perf_stats' attribute to be only
readable by root users.

Fixes: 2d02bf835e ("powerpc/papr_scm: Fetch nvdimm performance stats from PHYP")
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200907110540.21349-1-vaibhav@linux.ibm.com
2020-09-09 14:44:38 +10:00
Masami Hiramatsu
b6c5a58dd8 powerpc: kprobes: Use generic kretprobe trampoline handler
Use the generic kretprobe trampoline handler. Don't use
framepointer verification.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/159870610825.1229682.2090635992093223399.stgit@devnote2
2020-09-08 11:52:34 +02:00
Alexey Kardashevskiy
437ef802e0 powerpc/dma: Fix dma_map_ops::get_required_mask
There are 2 problems with it:
  1. "<" vs expected "<<"
  2. the shift number is an IOMMU page number mask, not an address
  mask as the IOMMU page shift is missing.

This did not hit us before f1565c24b5 ("powerpc: use the generic
dma_ops_bypass mode") because we had additional code to handle bypass
mask so this chunk (almost?) never executed.However there were
reports that aacraid does not work with "iommu=nobypass".

After f1565c24b5, aacraid (and probably others which call
dma_get_required_mask() before setting the mask) was unable to enable
64bit DMA and fall back to using IOMMU which was known not to work,
one of the problems is double free of an IOMMU page.

This fixes DMA for aacraid, both with and without "iommu=nobypass" in
the kernel command line. Verified with "stress-ng -d 4".

Fixes: 6a5c7be5e4 ("powerpc: Override dma_get_required_mask by platform hook and ops")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200908015106.79661-1-aik@ozlabs.ru
2020-09-08 14:20:55 +10:00
Herbert Xu
d08d387b73 crypto: powerpc/crc-vpmsum_test - Fix sparse endianness warning
This patch fixes a sparse endianness warning by changing crc32 to
__le32 instead of u32:

  CHECK   ../arch/powerpc/crypto/crc-vpmsum_test.c
../arch/powerpc/crypto/crc-vpmsum_test.c:102:39: warning: cast from restricted __le32

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-09-04 17:57:15 +10:00
Nicolin Chen
1e9d90dbed dma-mapping: introduce dma_get_seg_boundary_nr_pages()
We found that callers of dma_get_seg_boundary mostly do an ALIGN
with page mask and then do a page shift to get number of pages:
    ALIGN(boundary + 1, 1 << shift) >> shift

However, the boundary might be as large as ULONG_MAX, which means
that a device has no specific boundary limit. So either "+ 1" or
passing it to ALIGN() would potentially overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here into a helper function doing:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

This patch introduces and applies dma_get_seg_boundary_nr_pages()
as an overflow-free helper for the dma_get_seg_boundary() callers
to get numbers of pages. It also takes care of the NULL dev case
for non-DMA API callers.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-03 18:12:15 +02:00
Michael Ellerman
4c62285439 Revert "powerpc/build: vdso linker warning for orphan sections"
This reverts commit f2af201002.

It added a usage of cc-ldoption, but cc-ldoption was removed in commit
055efab312 ("kbuild: drop support for cc-ldoption").

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-09-03 15:42:26 +10:00
Aneesh Kumar K.V
675bceb097 powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc
The test is broken w.r.t page table update rules and results in kernel
crash as below. Disable the support until we get the tests updated.

[   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
    pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
    lr: c0000000005eeeec: pte_update+0x11c/0x190
    sp: c000000c6d1e7950
   msr: 8000000002029033
  current = 0xc000000c6d172c80
  paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/0
kernel BUG at arch/powerpc/mm/pgtable.c:304!
[link register   ] c0000000005eeeec pte_update+0x11c/0x190
[c000000c6d1e7950] 0000000000000001 (unreliable)
[c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
[c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
[c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
[c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c

With DEBUG_VM disabled

[   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
[   20.530183] Faulting instruction address: 0xc0000000000df330
cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
    pc: c0000000000df330: memset+0x68/0x104
    lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
    sp: c000000c6d19f990
   msr: 8000000002009033
   dar: 0
  current = 0xc000000c6d177480
  paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/0
[link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
[c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
[c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
[c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
[c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
33:mon>

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200902040122.136414-1-aneesh.kumar@linux.ibm.com
2020-09-02 22:14:26 +10:00
Kees Cook
c604abc3f6 vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG
The .comment section doesn't belong in STABS_DEBUG. Split it out into a
new macro named ELF_DETAILS. This will gain other non-debug sections
that need to be accounted for when linking with --orphan-handling=warn.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20200821194310.3089815-5-keescook@chromium.org
2020-09-01 09:50:35 +02:00
Linus Torvalds
b69bea8a65 A set of fixes for lockdep, tracing and RCU:
- Prevent recursion by using raw_cpu_* operations
 
   - Fixup the interrupt state in the cpu idle code to be consistent
 
   - Push rcu_idle_enter/exit() invocations deeper into the idle path so
     that the lock operations are inside the RCU watching sections
 
   - Move trace_cpu_idle() into generic code so it's called before RCU goes
     idle.
 
   - Handle raw_local_irq* vs. local_irq* operations correctly
 
   - Move the tracepoints out from under the lockdep recursion handling
     which turned out to be fragile and inconsistent.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl9L5qETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoV/NEADG+h02tj2I4gP7IQ3nVodEzS1+odPI
 orabY5ggH0kn4YIhPB4UtOd5zKZjr3FJs9wEhyhQpV6ZhvFfgaIKiYqfg+Q81aMO
 /BXrfh6jBD2Hu7gaPBnVdkKeh1ehl+w0PhTeJhPBHEEvbGeLUYWwyPNlaKz//VQl
 XCWl7e7o/Uw2UyJ469SCx3z+M2DMNqwdMys/zcqvTLiBdLNCwp4TW5ACzEA0rfHh
 Pepu3eIKnMURyt82QanrOATvT2io9pOOaUh59zeKi2WM8ikwKd/Eho2kXYng6GvM
 GzX4Kn13MsNobZXf9BhqEGICdRkaJqLsXlmBNmbJdSTCn5W2lLZqu2wCEp5VZHCc
 XwMbey8ek+BRskJMqAV4oq2GA8Om9KEYWOOdixyOG0UJCiW5qDowuDYBXTLV7FWj
 XhzLGuHpUF9eKLKokJ7ideLaDcpzwYjHr58pFLQrqPwmjVKWguLeYMg5BhhTiEuV
 wNfiLIGdMNsCpYKhnce3o9paV8+hy1ZveWhNy+/4HaDLoEwI2T62i8R7xxbrcWMg
 sgdAiQG+kVLwSJ13bN+Cz79uLYTIbqGaZHtOXmeIT3jSxBjx5RlXfzocwTHSYrNk
 GuLYHd7+QaemN49Rrf4bPR16Db7ifL32QkUtLBTBLcnos9jM+fcl+BWyqYRxhgDv
 xzDS+vfK8DvRiA==
 =Hgt6
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "A set of fixes for lockdep, tracing and RCU:

   - Prevent recursion by using raw_cpu_* operations

   - Fixup the interrupt state in the cpu idle code to be consistent

   - Push rcu_idle_enter/exit() invocations deeper into the idle path so
     that the lock operations are inside the RCU watching sections

   - Move trace_cpu_idle() into generic code so it's called before RCU
     goes idle.

   - Handle raw_local_irq* vs. local_irq* operations correctly

   - Move the tracepoints out from under the lockdep recursion handling
     which turned out to be fragile and inconsistent"

* tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep,trace: Expose tracepoints
  lockdep: Only trace IRQ edges
  mips: Implement arch_irqs_disabled()
  arm64: Implement arch_irqs_disabled()
  nds32: Implement arch_irqs_disabled()
  locking/lockdep: Cleanup
  x86/entry: Remove unused THUNKs
  cpuidle: Move trace_cpu_idle() into generic code
  cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic
  sched,idle,rcu: Push rcu_idle deeper into the idle path
  cpuidle: Fixup IRQ state
  lockdep: Use raw_cpu_*() for per-cpu variables
2020-08-30 11:43:50 -07:00