linux/kernel/dma/direct.c

681 lines
18 KiB
C
Raw Normal View History

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (C) 2018-2020 Christoph Hellwig.
*
* DMA operations that map physical memory directly without using an IOMMU.
*/
mm: remove include/linux/bootmem.h Move remaining definitions and declarations from include/linux/bootmem.h into include/linux/memblock.h and remove the redundant header. The includes were replaced with the semantic patch below and then semi-automated removal of duplicated '#include <linux/memblock.h> @@ @@ - #include <linux/bootmem.h> + #include <linux/memblock.h> [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal] Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 06:09:49 +08:00
#include <linux/memblock.h> /* for max_pfn */
#include <linux/export.h>
#include <linux/mm.h>
#include <linux/dma-map-ops.h>
#include <linux/scatterlist.h>
#include <linux/pfn.h>
#include <linux/vmalloc.h>
#include <linux/set_memory.h>
#include <linux/slab.h>
#include "direct.h"
/*
* Most architectures use ZONE_DMA for the first 16 Megabytes, but some use
* it for entirely different regions. In that case the arch code needs to
* override the variable below for dma-direct to work properly.
*/
u64 zone_dma_limit __ro_after_init = DMA_BIT_MASK(24);
static inline dma_addr_t phys_to_dma_direct(struct device *dev,
phys_addr_t phys)
{
dma-direct: Force unencrypted DMA under SME for certain DMA masks If a device doesn't support DMA to a physical address that includes the encryption bit (currently bit 47, so 48-bit DMA), then the DMA must occur to unencrypted memory. SWIOTLB is used to satisfy that requirement if an IOMMU is not active (enabled or configured in passthrough mode). However, commit fafadcd16595 ("swiotlb: don't dip into swiotlb pool for coherent allocations") modified the coherent allocation support in SWIOTLB to use the DMA direct coherent allocation support. When an IOMMU is not active, this resulted in dma_alloc_coherent() failing for devices that didn't support DMA addresses that included the encryption bit. Addressing this requires changes to the force_dma_unencrypted() function in kernel/dma/direct.c. Since the function is now non-trivial and SME/SEV specific, update the DMA direct support to add an arch override for the force_dma_unencrypted() function. The arch override is selected when CONFIG_AMD_MEM_ENCRYPT is set. The arch override function resides in the arch/x86/mm/mem_encrypt.c file and forces unencrypted DMA when either SEV is active or SME is active and the device does not support DMA to physical addresses that include the encryption bit. Fixes: fafadcd16595 ("swiotlb: don't dip into swiotlb pool for coherent allocations") Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> [hch: moved the force_dma_unencrypted declaration to dma-mapping.h, fold the s390 fix from Halil Pasic] Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-07-11 03:01:19 +08:00
if (force_dma_unencrypted(dev))
return phys_to_dma_unencrypted(dev, phys);
return phys_to_dma(dev, phys);
}
static inline struct page *dma_direct_to_page(struct device *dev,
dma_addr_t dma_addr)
{
return pfn_to_page(PHYS_PFN(dma_to_phys(dev, dma_addr)));
}
u64 dma_direct_get_required_mask(struct device *dev)
{
phys_addr_t phys = (phys_addr_t)(max_pfn - 1) << PAGE_SHIFT;
u64 max_dma = phys_to_dma_direct(dev, phys);
return (1ULL << (fls64(max_dma) - 1)) * 2 - 1;
}
static gfp_t dma_direct_optimal_gfp_mask(struct device *dev, u64 *phys_limit)
{
u64 dma_limit = min_not_zero(
dev->coherent_dma_mask,
dev->bus_dma_limit);
/*
* Optimistically try the zone that the physical address mask falls
* into first. If that returns memory that isn't actually addressable
* we will fallback to the next lower zone and try again.
*
* Note that GFP_DMA32 and GFP_DMA are no ops without the corresponding
* zones.
*/
*phys_limit = dma_to_phys(dev, dma_limit);
if (*phys_limit <= zone_dma_limit)
return GFP_DMA;
if (*phys_limit <= DMA_BIT_MASK(32))
return GFP_DMA32;
return 0;
}
swiotlb: if swiotlb is full, fall back to a transient memory pool Try to allocate a transient memory pool if no suitable slots can be found and the respective SWIOTLB is allowed to grow. The transient pool is just enough big for this one bounce buffer. It is inserted into a per-device list of transient memory pools, and it is freed again when the bounce buffer is unmapped. Transient memory pools are kept in an RCU list. A memory barrier is required after adding a new entry, because any address within a transient buffer must be immediately recognized as belonging to the SWIOTLB, even if it is passed to another CPU. Deletion does not require any synchronization beyond RCU ordering guarantees. After a buffer is unmapped, its physical addresses may no longer be passed to the DMA API, so the memory range of the corresponding stale entry in the RCU list never matches. If the memory range gets allocated again, then it happens only after a RCU quiescent state. Since bounce buffers can now be allocated from different pools, add a parameter to swiotlb_alloc_pool() to let the caller know which memory pool is used. Add swiotlb_find_pool() to find the memory pool corresponding to an address. This function is now also used by is_swiotlb_buffer(), because a simple boundary check is no longer sufficient. The logic in swiotlb_alloc_tlb() is taken from __dma_direct_alloc_pages(), simplified and enhanced to use coherent memory pools if needed. Note that this is not the most efficient way to provide a bounce buffer, but when a DMA buffer can't be mapped, something may (and will) actually break. At that point it is better to make an allocation, even if it may be an expensive operation. Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-08-01 14:24:01 +08:00
bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
{
dma_addr_t dma_addr = phys_to_dma_direct(dev, phys);
if (dma_addr == DMA_MAPPING_ERROR)
return false;
return dma_addr + size - 1 <=
min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit);
}
static int dma_set_decrypted(struct device *dev, void *vaddr, size_t size)
{
if (!force_dma_unencrypted(dev))
return 0;
return set_memory_decrypted((unsigned long)vaddr, PFN_UP(size));
}
static int dma_set_encrypted(struct device *dev, void *vaddr, size_t size)
{
int ret;
if (!force_dma_unencrypted(dev))
return 0;
ret = set_memory_encrypted((unsigned long)vaddr, PFN_UP(size));
if (ret)
pr_warn_ratelimited("leaking DMA memory that can't be re-encrypted\n");
return ret;
}
static void __dma_direct_free_pages(struct device *dev, struct page *page,
size_t size)
{
if (swiotlb_free(dev, page, size))
return;
dma_free_contiguous(dev, page, size);
}
static struct page *dma_direct_alloc_swiotlb(struct device *dev, size_t size)
{
struct page *page = swiotlb_alloc(dev, size);
if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
swiotlb_free(dev, page, size);
return NULL;
}
return page;
}
static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
gfp_t gfp, bool allow_highmem)
{
int node = dev_to_node(dev);
struct page *page = NULL;
u64 phys_limit;
WARN_ON_ONCE(!PAGE_ALIGNED(size));
if (is_swiotlb_for_alloc(dev))
return dma_direct_alloc_swiotlb(dev, size);
gfp |= dma_direct_optimal_gfp_mask(dev, &phys_limit);
page = dma_alloc_contiguous(dev, size, gfp);
if (page) {
if (!dma_coherent_ok(dev, page_to_phys(page), size) ||
(!allow_highmem && PageHighMem(page))) {
dma_free_contiguous(dev, page, size);
page = NULL;
}
}
again:
if (!page)
page = alloc_pages_node(node, gfp, get_order(size));
if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
__free_pages(page, get_order(size));
page = NULL;
if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
phys_limit < DMA_BIT_MASK(64) &&
!(gfp & (GFP_DMA32 | GFP_DMA))) {
gfp |= GFP_DMA32;
goto again;
}
if (IS_ENABLED(CONFIG_ZONE_DMA) && !(gfp & GFP_DMA)) {
gfp = (gfp & ~GFP_DMA32) | GFP_DMA;
goto again;
}
}
return page;
}
/*
* Check if a potentially blocking operations needs to dip into the atomic
* pools for the given device/gfp.
*/
static bool dma_direct_use_pool(struct device *dev, gfp_t gfp)
{
return !gfpflags_allow_blocking(gfp) && !is_swiotlb_for_alloc(dev);
}
static void *dma_direct_alloc_from_pool(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp)
{
struct page *page;
u64 phys_limit;
void *ret;
if (WARN_ON_ONCE(!IS_ENABLED(CONFIG_DMA_COHERENT_POOL)))
return NULL;
gfp |= dma_direct_optimal_gfp_mask(dev, &phys_limit);
page = dma_alloc_from_pool(dev, size, &ret, gfp, dma_coherent_ok);
if (!page)
return NULL;
*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
return ret;
}
static void *dma_direct_alloc_no_mapping(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp)
{
struct page *page;
page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO, true);
if (!page)
return NULL;
/* remove any dirty cache lines on the kernel alias */
if (!PageHighMem(page))
arch_dma_prep_coherent(page, size);
/* return the page pointer as the opaque cookie */
*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
return page;
}
void *dma_direct_alloc(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
{
bool remap = false, set_uncached = false;
struct page *page;
void *ret;
size = PAGE_ALIGN(size);
if (attrs & DMA_ATTR_NO_WARN)
gfp |= __GFP_NOWARN;
if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
!force_dma_unencrypted(dev) && !is_swiotlb_for_alloc(dev))
return dma_direct_alloc_no_mapping(dev, size, dma_handle, gfp);
if (!dev_is_dma_coherent(dev)) {
if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_ALLOC) &&
!is_swiotlb_for_alloc(dev))
return arch_dma_alloc(dev, size, dma_handle, gfp,
attrs);
/*
* If there is a global pool, always allocate from it for
* non-coherent devices.
*/
if (IS_ENABLED(CONFIG_DMA_GLOBAL_POOL))
return dma_alloc_from_global_coherent(dev, size,
dma_handle);
/*
* Otherwise we require the architecture to either be able to
* mark arbitrary parts of the kernel direct mapping uncached,
* or remapped it uncached.
*/
set_uncached = IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED);
remap = IS_ENABLED(CONFIG_DMA_DIRECT_REMAP);
if (!set_uncached && !remap) {
pr_warn_once("coherent DMA allocations not supported on this platform.\n");
return NULL;
}
}
/*
* Remapping or decrypting memory may block, allocate the memory from
* the atomic pools instead if we aren't allowed block.
*/
if ((remap || force_dma_unencrypted(dev)) &&
dma_direct_use_pool(dev, gfp))
return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
/* we always manually zero the memory once we are done */
page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO, true);
if (!page)
return NULL;
/*
* dma_alloc_contiguous can return highmem pages depending on a
* combination the cma= arguments and per-arch setup. These need to be
* remapped to return a kernel virtual address.
*/
if (PageHighMem(page)) {
remap = true;
set_uncached = false;
}
if (remap) {
pgprot_t prot = dma_pgprot(dev, PAGE_KERNEL, attrs);
if (force_dma_unencrypted(dev))
prot = pgprot_decrypted(prot);
/* remove any dirty cache lines on the kernel alias */
arch_dma_prep_coherent(page, size);
/* create a coherent mapping */
ret = dma_common_contiguous_remap(page, size, prot,
__builtin_return_address(0));
if (!ret)
goto out_free_pages;
} else {
ret = page_address(page);
if (dma_set_decrypted(dev, ret, size))
goto out_leak_pages;
}
memset(ret, 0, size);
if (set_uncached) {
arch_dma_prep_coherent(page, size);
ret = arch_dma_set_uncached(ret, size);
if (IS_ERR(ret))
goto out_encrypt_pages;
}
*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
return ret;
out_encrypt_pages:
if (dma_set_encrypted(dev, page_address(page), size))
return NULL;
out_free_pages:
__dma_direct_free_pages(dev, page, size);
return NULL;
out_leak_pages:
return NULL;
}
void dma_direct_free(struct device *dev, size_t size,
void *cpu_addr, dma_addr_t dma_addr, unsigned long attrs)
{
unsigned int page_order = get_order(size);
if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
!force_dma_unencrypted(dev) && !is_swiotlb_for_alloc(dev)) {
/* cpu_addr is a struct page cookie, not a kernel address */
dma_free_contiguous(dev, cpu_addr, size);
return;
}
if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_ALLOC) &&
Merge branch 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb updates from Konrad Rzeszutek Wilk: "A new feature called restricted DMA pools. It allows SWIOTLB to utilize per-device (or per-platform) allocated memory pools instead of using the global one. The first big user of this is ARM Confidential Computing where the memory for DMA operations can be set per platform" * 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: (23 commits) swiotlb: use depends on for DMA_RESTRICTED_POOL of: restricted dma: Don't fail device probe on rmem init failure of: Move of_dma_set_restricted_buffer() into device.c powerpc/svm: Don't issue ultracalls if !mem_encrypt_active() s390/pv: fix the forcing of the swiotlb swiotlb: Free tbl memory in swiotlb_exit() swiotlb: Emit diagnostic in swiotlb_exit() swiotlb: Convert io_default_tlb_mem to static allocation of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS swiotlb: add overflow checks to swiotlb_bounce swiotlb: fix implicit debugfs declarations of: Add plumbing for restricted DMA pool dt-bindings: of: Add restricted DMA pool swiotlb: Add restricted DMA pool initialization swiotlb: Add restricted DMA alloc/free support swiotlb: Refactor swiotlb_tbl_unmap_single swiotlb: Move alloc_size to swiotlb_find_slots swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing swiotlb: Update is_swiotlb_active to add a struct device argument swiotlb: Update is_swiotlb_buffer to add a struct device argument ...
2021-09-04 01:34:44 +08:00
!dev_is_dma_coherent(dev) &&
!is_swiotlb_for_alloc(dev)) {
arch_dma_free(dev, size, cpu_addr, dma_addr, attrs);
return;
}
if (IS_ENABLED(CONFIG_DMA_GLOBAL_POOL) &&
!dev_is_dma_coherent(dev)) {
if (!dma_release_from_global_coherent(page_order, cpu_addr))
WARN_ON_ONCE(1);
return;
}
/* If cpu_addr is not from an atomic pool, dma_free_from_pool() fails */
if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) &&
dma_free_from_pool(dev, cpu_addr, PAGE_ALIGN(size)))
return;
if (is_vmalloc_addr(cpu_addr)) {
vunmap(cpu_addr);
} else {
if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED))
arch_dma_clear_uncached(cpu_addr, size);
if (dma_set_encrypted(dev, cpu_addr, size))
return;
}
__dma_direct_free_pages(dev, dma_direct_to_page(dev, dma_addr), size);
}
struct page *dma_direct_alloc_pages(struct device *dev, size_t size,
dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
{
struct page *page;
void *ret;
if (force_dma_unencrypted(dev) && dma_direct_use_pool(dev, gfp))
return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
page = __dma_direct_alloc_pages(dev, size, gfp, false);
if (!page)
return NULL;
ret = page_address(page);
if (dma_set_decrypted(dev, ret, size))
goto out_leak_pages;
memset(ret, 0, size);
*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
return page;
out_leak_pages:
return NULL;
}
void dma_direct_free_pages(struct device *dev, size_t size,
struct page *page, dma_addr_t dma_addr,
enum dma_data_direction dir)
{
void *vaddr = page_address(page);
/* If cpu_addr is not from an atomic pool, dma_free_from_pool() fails */
if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) &&
dma_free_from_pool(dev, vaddr, size))
return;
if (dma_set_encrypted(dev, vaddr, size))
return;
__dma_direct_free_pages(dev, page, size);
}
#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
defined(CONFIG_SWIOTLB)
void dma_direct_sync_sg_for_device(struct device *dev,
struct scatterlist *sgl, int nents, enum dma_data_direction dir)
{
struct scatterlist *sg;
int i;
for_each_sg(sgl, sg, nents, i) {
phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg));
swiotlb: reduce swiotlb pool lookups With CONFIG_SWIOTLB_DYNAMIC enabled, each round-trip map/unmap pair in the swiotlb results in 6 calls to swiotlb_find_pool(). In multiple places, the pool is found and used in one function, and then must be found again in the next function that is called because only the tlb_addr is passed as an argument. These are the six call sites: dma_direct_map_page: 1. swiotlb_map -> swiotlb_tbl_map_single -> swiotlb_bounce dma_direct_unmap_page: 2. dma_direct_sync_single_for_cpu -> is_swiotlb_buffer 3. dma_direct_sync_single_for_cpu -> swiotlb_sync_single_for_cpu -> swiotlb_bounce 4. is_swiotlb_buffer 5. swiotlb_tbl_unmap_single -> swiotlb_del_transient 6. swiotlb_tbl_unmap_single -> swiotlb_release_slots Reduce the number of calls by finding the pool at a higher level, and passing it as an argument instead of searching again. A key change is for is_swiotlb_buffer() to return a pool pointer instead of a boolean, and then pass this pool pointer to subsequent swiotlb functions. There are 9 occurrences of is_swiotlb_buffer() used to test if a buffer is a swiotlb buffer before calling a swiotlb function. To reduce code duplication in getting the pool pointer and passing it as an argument, introduce inline wrappers for this pattern. The generated code is essentially unchanged. Since is_swiotlb_buffer() no longer returns a boolean, rename some functions to reflect the change: * swiotlb_find_pool() becomes __swiotlb_find_pool() * is_swiotlb_buffer() becomes swiotlb_find_pool() * is_xen_swiotlb_buffer() becomes xen_swiotlb_find_pool() With these changes, a round-trip map/unmap pair requires only 2 pool lookups (listed using the new names and wrappers): dma_direct_unmap_page: 1. dma_direct_sync_single_for_cpu -> swiotlb_find_pool 2. swiotlb_tbl_unmap_single -> swiotlb_find_pool These changes come from noticing the inefficiencies in a code review, not from performance measurements. With CONFIG_SWIOTLB_DYNAMIC, __swiotlb_find_pool() is not trivial, and it uses an RCU read lock, so avoiding the redundant calls helps performance in a hot path. When CONFIG_SWIOTLB_DYNAMIC is *not* set, the code size reduction is minimal and the perf benefits are likely negligible, but no harm is done. No functional change is intended. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Petr Tesarik <petr@tesarici.cz> Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-07-09 03:41:00 +08:00
swiotlb_sync_single_for_device(dev, paddr, sg->length, dir);
if (!dev_is_dma_coherent(dev))
arch_sync_dma_for_device(paddr, sg->length,
dir);
}
}
#endif
#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) || \
defined(CONFIG_SWIOTLB)
void dma_direct_sync_sg_for_cpu(struct device *dev,
struct scatterlist *sgl, int nents, enum dma_data_direction dir)
{
struct scatterlist *sg;
int i;
for_each_sg(sgl, sg, nents, i) {
phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg));
if (!dev_is_dma_coherent(dev))
arch_sync_dma_for_cpu(paddr, sg->length, dir);
swiotlb: reduce swiotlb pool lookups With CONFIG_SWIOTLB_DYNAMIC enabled, each round-trip map/unmap pair in the swiotlb results in 6 calls to swiotlb_find_pool(). In multiple places, the pool is found and used in one function, and then must be found again in the next function that is called because only the tlb_addr is passed as an argument. These are the six call sites: dma_direct_map_page: 1. swiotlb_map -> swiotlb_tbl_map_single -> swiotlb_bounce dma_direct_unmap_page: 2. dma_direct_sync_single_for_cpu -> is_swiotlb_buffer 3. dma_direct_sync_single_for_cpu -> swiotlb_sync_single_for_cpu -> swiotlb_bounce 4. is_swiotlb_buffer 5. swiotlb_tbl_unmap_single -> swiotlb_del_transient 6. swiotlb_tbl_unmap_single -> swiotlb_release_slots Reduce the number of calls by finding the pool at a higher level, and passing it as an argument instead of searching again. A key change is for is_swiotlb_buffer() to return a pool pointer instead of a boolean, and then pass this pool pointer to subsequent swiotlb functions. There are 9 occurrences of is_swiotlb_buffer() used to test if a buffer is a swiotlb buffer before calling a swiotlb function. To reduce code duplication in getting the pool pointer and passing it as an argument, introduce inline wrappers for this pattern. The generated code is essentially unchanged. Since is_swiotlb_buffer() no longer returns a boolean, rename some functions to reflect the change: * swiotlb_find_pool() becomes __swiotlb_find_pool() * is_swiotlb_buffer() becomes swiotlb_find_pool() * is_xen_swiotlb_buffer() becomes xen_swiotlb_find_pool() With these changes, a round-trip map/unmap pair requires only 2 pool lookups (listed using the new names and wrappers): dma_direct_unmap_page: 1. dma_direct_sync_single_for_cpu -> swiotlb_find_pool 2. swiotlb_tbl_unmap_single -> swiotlb_find_pool These changes come from noticing the inefficiencies in a code review, not from performance measurements. With CONFIG_SWIOTLB_DYNAMIC, __swiotlb_find_pool() is not trivial, and it uses an RCU read lock, so avoiding the redundant calls helps performance in a hot path. When CONFIG_SWIOTLB_DYNAMIC is *not* set, the code size reduction is minimal and the perf benefits are likely negligible, but no harm is done. No functional change is intended. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Petr Tesarik <petr@tesarici.cz> Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-07-09 03:41:00 +08:00
swiotlb_sync_single_for_cpu(dev, paddr, sg->length, dir);
if (dir == DMA_FROM_DEVICE)
arch_dma_mark_clean(paddr, sg->length);
}
if (!dev_is_dma_coherent(dev))
arch_sync_dma_for_cpu_all();
}
/*
* Unmaps segments, except for ones marked as pci_p2pdma which do not
* require any further action as they contain a bus address.
*/
void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
int nents, enum dma_data_direction dir, unsigned long attrs)
{
struct scatterlist *sg;
int i;
for_each_sg(sgl, sg, nents, i) {
dma-mapping: name SG DMA flag helpers consistently sg_is_dma_bus_address() is inconsistent with the naming pattern of its corresponding setters and its own kerneldoc, so take the majority vote and rename it sg_dma_is_bus_address() (and fix up the missing underscores in the kerneldoc too). This gives us a nice clear pattern where SG DMA flags are SG_DMA_<NAME>, and the helpers for acting on them are sg_dma_<action>_<name>(). Link: https://lkml.kernel.org/r/20230612153201.554742-14-catalin.marinas@arm.com Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Link: https://lore.kernel.org/r/fa2eca2862c7ffc41b50337abffb2dfd2864d3ea.1685036694.git.robin.murphy@arm.com Tested-by: Isaac J. Manjarres <isaacmanjarres@google.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Mike Snitzer <snitzer@kernel.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Saravana Kannan <saravanak@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12 23:31:57 +08:00
if (sg_dma_is_bus_address(sg))
sg_dma_unmark_bus_address(sg);
else
dma_direct_unmap_page(dev, sg->dma_address,
sg_dma_len(sg), dir, attrs);
}
}
#endif
int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
enum dma_data_direction dir, unsigned long attrs)
{
struct pci_p2pdma_map_state p2pdma_state = {};
enum pci_p2pdma_map_type map;
struct scatterlist *sg;
int i, ret;
for_each_sg(sgl, sg, nents, i) {
if (is_pci_p2pdma_page(sg_page(sg))) {
map = pci_p2pdma_map_segment(&p2pdma_state, dev, sg);
switch (map) {
case PCI_P2PDMA_MAP_BUS_ADDR:
continue;
case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
/*
* Any P2P mapping that traverses the PCI
* host bridge must be mapped with CPU physical
* address and not PCI bus addresses. This is
* done with dma_direct_map_page() below.
*/
break;
default:
ret = -EREMOTEIO;
goto out_unmap;
}
}
sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
sg->offset, sg->length, dir, attrs);
if (sg->dma_address == DMA_MAPPING_ERROR) {
ret = -EIO;
goto out_unmap;
}
sg_dma_len(sg) = sg->length;
}
return nents;
out_unmap:
dma_direct_unmap_sg(dev, sgl, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC);
return ret;
}
dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
size_t size, enum dma_data_direction dir, unsigned long attrs)
{
dma_addr_t dma_addr = paddr;
if (unlikely(!dma_capable(dev, dma_addr, size, false))) {
dev_err_once(dev,
"DMA addr %pad+%zu overflow (mask %llx, bus limit %llx).\n",
&dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
WARN_ON_ONCE(1);
return DMA_MAPPING_ERROR;
}
return dma_addr;
}
int dma_direct_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
{
struct page *page = dma_direct_to_page(dev, dma_addr);
int ret;
ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
if (!ret)
sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
return ret;
}
bool dma_direct_can_mmap(struct device *dev)
{
return dev_is_dma_coherent(dev) ||
IS_ENABLED(CONFIG_DMA_NONCOHERENT_MMAP);
}
int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
{
unsigned long user_count = vma_pages(vma);
unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
unsigned long pfn = PHYS_PFN(dma_to_phys(dev, dma_addr));
int ret = -ENXIO;
vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
if (force_dma_unencrypted(dev))
vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
return ret;
if (dma_mmap_from_global_coherent(vma, cpu_addr, size, &ret))
return ret;
if (vma->vm_pgoff >= count || user_count > count - vma->vm_pgoff)
return -ENXIO;
return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
user_count << PAGE_SHIFT, vma->vm_page_prot);
}
int dma_direct_supported(struct device *dev, u64 mask)
{
u64 min_mask = (max_pfn - 1) << PAGE_SHIFT;
/*
* Because 32-bit DMA masks are so common we expect every architecture
* to be able to satisfy them - either by not supporting more physical
* memory, or by providing a ZONE_DMA32. If neither is the case, the
* architecture needs to use an IOMMU instead of the direct mapping.
*/
if (mask >= DMA_BIT_MASK(32))
return 1;
/*
* This check needs to be against the actual bit mask value, so use
* phys_to_dma_unencrypted() here so that the SME encryption mask isn't
* part of the check.
*/
if (IS_ENABLED(CONFIG_ZONE_DMA))
min_mask = min_t(u64, min_mask, zone_dma_limit);
return mask >= phys_to_dma_unencrypted(dev, min_mask);
}
dma-mapping: fix dma_addressing_limited() if dma_range_map can't cover all system RAM There is an unusual case that the range map covers right up to the top of system RAM, but leaves a hole somewhere lower down. Then it prevents the nvme device dma mapping in the checking path of phys_to_dma() and causes the hangs at boot. E.g. On an Armv8 Ampere server, the dsdt ACPI table is: Method (_DMA, 0, Serialized) // _DMA: Direct Memory Access { Name (RBUF, ResourceTemplate () { QWordMemory (ResourceConsumer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, 0x0000000000000000, // Granularity 0x0000000000000000, // Range Minimum 0x00000000FFFFFFFF, // Range Maximum 0x0000000000000000, // Translation Offset 0x0000000100000000, // Length ,, , AddressRangeMemory, TypeStatic) QWordMemory (ResourceConsumer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, 0x0000000000000000, // Granularity 0x0000006010200000, // Range Minimum 0x000000602FFFFFFF, // Range Maximum 0x0000000000000000, // Translation Offset 0x000000001FE00000, // Length ,, , AddressRangeMemory, TypeStatic) QWordMemory (ResourceConsumer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, 0x0000000000000000, // Granularity 0x00000060F0000000, // Range Minimum 0x00000060FFFFFFFF, // Range Maximum 0x0000000000000000, // Translation Offset 0x0000000010000000, // Length ,, , AddressRangeMemory, TypeStatic) QWordMemory (ResourceConsumer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite, 0x0000000000000000, // Granularity 0x0000007000000000, // Range Minimum 0x000003FFFFFFFFFF, // Range Maximum 0x0000000000000000, // Translation Offset 0x0000039000000000, // Length ,, , AddressRangeMemory, TypeStatic) }) But the System RAM ranges are: cat /proc/iomem |grep -i ram 90000000-91ffffff : System RAM 92900000-fffbffff : System RAM 880000000-fffffffff : System RAM 8800000000-bff5990fff : System RAM bff59d0000-bff5a4ffff : System RAM bff8000000-bfffffffff : System RAM So some RAM ranges are out of dma_range_map. Fix it by checking whether each of the system RAM resources can be properly encompassed within the dma_range_map. Signed-off-by: Jia He <justin.he@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-10-28 18:20:59 +08:00
/*
* To check whether all ram resource ranges are covered by dma range map
* Returns 0 when further check is needed
* Returns 1 if there is some RAM range can't be covered by dma_range_map
*/
static int check_ram_in_range_map(unsigned long start_pfn,
unsigned long nr_pages, void *data)
{
unsigned long end_pfn = start_pfn + nr_pages;
const struct bus_dma_region *bdr = NULL;
const struct bus_dma_region *m;
struct device *dev = data;
while (start_pfn < end_pfn) {
for (m = dev->dma_range_map; PFN_DOWN(m->size); m++) {
unsigned long cpu_start_pfn = PFN_DOWN(m->cpu_start);
if (start_pfn >= cpu_start_pfn &&
start_pfn - cpu_start_pfn < PFN_DOWN(m->size)) {
bdr = m;
break;
}
}
if (!bdr)
return 1;
start_pfn = PFN_DOWN(bdr->cpu_start) + PFN_DOWN(bdr->size);
}
return 0;
}
bool dma_direct_all_ram_mapped(struct device *dev)
{
if (!dev->dma_range_map)
return true;
return !walk_system_ram_range(0, PFN_DOWN(ULONG_MAX) + 1, dev,
check_ram_in_range_map);
}
size_t dma_direct_max_mapping_size(struct device *dev)
{
/* If SWIOTLB is active, use its maximum mapping size */
if (is_swiotlb_active(dev) &&
(dma_addressing_limited(dev) || is_swiotlb_force_bounce(dev)))
return swiotlb_max_mapping_size(dev);
return SIZE_MAX;
}
bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr)
{
return !dev_is_dma_coherent(dev) ||
swiotlb: reduce swiotlb pool lookups With CONFIG_SWIOTLB_DYNAMIC enabled, each round-trip map/unmap pair in the swiotlb results in 6 calls to swiotlb_find_pool(). In multiple places, the pool is found and used in one function, and then must be found again in the next function that is called because only the tlb_addr is passed as an argument. These are the six call sites: dma_direct_map_page: 1. swiotlb_map -> swiotlb_tbl_map_single -> swiotlb_bounce dma_direct_unmap_page: 2. dma_direct_sync_single_for_cpu -> is_swiotlb_buffer 3. dma_direct_sync_single_for_cpu -> swiotlb_sync_single_for_cpu -> swiotlb_bounce 4. is_swiotlb_buffer 5. swiotlb_tbl_unmap_single -> swiotlb_del_transient 6. swiotlb_tbl_unmap_single -> swiotlb_release_slots Reduce the number of calls by finding the pool at a higher level, and passing it as an argument instead of searching again. A key change is for is_swiotlb_buffer() to return a pool pointer instead of a boolean, and then pass this pool pointer to subsequent swiotlb functions. There are 9 occurrences of is_swiotlb_buffer() used to test if a buffer is a swiotlb buffer before calling a swiotlb function. To reduce code duplication in getting the pool pointer and passing it as an argument, introduce inline wrappers for this pattern. The generated code is essentially unchanged. Since is_swiotlb_buffer() no longer returns a boolean, rename some functions to reflect the change: * swiotlb_find_pool() becomes __swiotlb_find_pool() * is_swiotlb_buffer() becomes swiotlb_find_pool() * is_xen_swiotlb_buffer() becomes xen_swiotlb_find_pool() With these changes, a round-trip map/unmap pair requires only 2 pool lookups (listed using the new names and wrappers): dma_direct_unmap_page: 1. dma_direct_sync_single_for_cpu -> swiotlb_find_pool 2. swiotlb_tbl_unmap_single -> swiotlb_find_pool These changes come from noticing the inefficiencies in a code review, not from performance measurements. With CONFIG_SWIOTLB_DYNAMIC, __swiotlb_find_pool() is not trivial, and it uses an RCU read lock, so avoiding the redundant calls helps performance in a hot path. When CONFIG_SWIOTLB_DYNAMIC is *not* set, the code size reduction is minimal and the perf benefits are likely negligible, but no harm is done. No functional change is intended. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Petr Tesarik <petr@tesarici.cz> Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-07-09 03:41:00 +08:00
swiotlb_find_pool(dev, dma_to_phys(dev, dma_addr));
}
/**
* dma_direct_set_offset - Assign scalar offset for a single DMA range.
* @dev: device pointer; needed to "own" the alloced memory.
* @cpu_start: beginning of memory region covered by this offset.
* @dma_start: beginning of DMA/PCI region covered by this offset.
* @size: size of the region.
*
* This is for the simple case of a uniform offset which cannot
* be discovered by "dma-ranges".
*
* It returns -ENOMEM if out of memory, -EINVAL if a map
* already exists, 0 otherwise.
*
* Note: any call to this from a driver is a bug. The mapping needs
* to be described by the device tree or other firmware interfaces.
*/
int dma_direct_set_offset(struct device *dev, phys_addr_t cpu_start,
dma_addr_t dma_start, u64 size)
{
struct bus_dma_region *map;
u64 offset = (u64)cpu_start - (u64)dma_start;
if (dev->dma_range_map) {
dev_err(dev, "attempt to add DMA range to existing map\n");
return -EINVAL;
}
if (!offset)
return 0;
map = kcalloc(2, sizeof(*map), GFP_KERNEL);
if (!map)
return -ENOMEM;
map[0].cpu_start = cpu_start;
map[0].dma_start = dma_start;
map[0].size = size;
dev->dma_range_map = map;
return 0;
}