License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information in it,
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to the file. She confirmed any determination that was
not immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source.
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note", otherwise it was "GPL-2.0". The results were:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point). Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                        270
   GPL-2.0+ WITH Linux-syscall-note                       169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)     21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)     17
   LGPL-2.1+ WITH Linux-syscall-note                       15
   GPL-1.0+ WITH Linux-syscall-note                        14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)     5
   LGPL-2.0+ WITH Linux-syscall-note                        4
   LGPL-2.1 WITH Linux-syscall-note                         3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)               3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)              1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights.
The Windriver scanner is based on an older version of FOSSology in part,
so they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to
have copy/paste license identifier errors, and have been fixed to reflect
the correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
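
The tagging step described in the message above (adding the SPDX tag "in
the format that the file expected", with different comment types for
header and .c files) can be illustrated with a small sketch. This is not
the script Thomas and Greg used, which is not shown here; the helper names
has_suffix() and spdx_tag_line() are hypothetical, and the sketch assumes
only the two cases named in the message: a "//" comment for .c files (as
on the first line of this file) and a "/* */" comment for .h files.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return nonzero if path ends with the given suffix. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Hypothetical helper: format the SPDX line for a file into buf.
 * .c files get a C++ style comment, .h files a C style comment.
 * Returns 0 on success, -1 for file types this sketch does not
 * handle or when buf is too small.
 */
static int spdx_tag_line(const char *path, const char *license,
			 char *buf, size_t len)
{
	const char *fmt;
	int n;

	assert(path && license && buf);

	if (has_suffix(path, ".c"))
		fmt = "// SPDX-License-Identifier: %s";
	else if (has_suffix(path, ".h"))
		fmt = "/* SPDX-License-Identifier: %s */";
	else
		return -1;

	n = snprintf(buf, len, fmt, license);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A real tool would also have to cover Makefiles, assembly, scripts and
other file types, which this sketch deliberately rejects.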

// SPDX-License-Identifier: GPL-2.0
/*
 * Copyright (C) 2018-2020 Christoph Hellwig.
 *
 * DMA operations that map physical memory directly without using an IOMMU.
 */
#include <linux/memblock.h> /* for max_pfn */
#include <linux/export.h>
#include <linux/mm.h>
#include <linux/dma-map-ops.h>
#include <linux/scatterlist.h>
#include <linux/pfn.h>
#include <linux/vmalloc.h>
#include <linux/set_memory.h>
#include <linux/slab.h>
#include "direct.h"

/*
 * Most architectures use ZONE_DMA for the first 16 Megabytes, but some use
 * it for entirely different regions. In that case the arch code needs to
 * override the variable below for dma-direct to work properly.
 */
u64 zone_dma_limit __ro_after_init = DMA_BIT_MASK(24);

static inline dma_addr_t phys_to_dma_direct(struct device *dev,
		phys_addr_t phys)
{
	if (force_dma_unencrypted(dev))
		return phys_to_dma_unencrypted(dev, phys);
	return phys_to_dma(dev, phys);
}

static inline struct page *dma_direct_to_page(struct device *dev,
		dma_addr_t dma_addr)
{
	return pfn_to_page(PHYS_PFN(dma_to_phys(dev, dma_addr)));
}

u64 dma_direct_get_required_mask(struct device *dev)
{
	phys_addr_t phys = (phys_addr_t)(max_pfn - 1) << PAGE_SHIFT;
	u64 max_dma = phys_to_dma_direct(dev, phys);

	return (1ULL << (fls64(max_dma) - 1)) * 2 - 1;
}
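
The mask arithmetic in dma_direct_get_required_mask() can be checked in
user space with the sketch below. fls64_sketch() is a portable stand-in
written for this example (the kernel's fls64() returns the 1-based index
of the highest set bit, or 0 for a zero argument), and
required_mask_sketch() is a hypothetical name; the sketch also assumes a
nonzero input, which it asserts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable stand-in for fls64(): 1-based index of the highest set bit,
 * or 0 when x is 0.
 */
static int fls64_sketch(uint64_t x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Smallest all-ones mask that covers max_dma (must be nonzero here). */
static uint64_t required_mask_sketch(uint64_t max_dma)
{
	assert(max_dma != 0);
	/*
	 * Shift by fls64() - 1 and then double: shifting a 64-bit value by
	 * 64 would be undefined when bit 63 is set, whereas here the
	 * doubling wraps to 0 and the subtraction yields a full mask.
	 */
	return (1ULL << (fls64_sketch(max_dma) - 1)) * 2 - 1;
}
```

For example, a highest DMA address of 0x123456789 has bit 32 as its top
bit, so the required mask is 0x1ffffffff (33 bits).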

static gfp_t dma_direct_optimal_gfp_mask(struct device *dev, u64 *phys_limit)
{
	u64 dma_limit = min_not_zero(
		dev->coherent_dma_mask,
		dev->bus_dma_limit);

	/*
	 * Optimistically try the zone that the physical address mask falls
	 * into first. If that returns memory that isn't actually addressable
	 * we will fall back to the next lower zone and try again.
	 *
	 * Note that GFP_DMA32 and GFP_DMA are no ops without the corresponding
	 * zones.
	 */
	*phys_limit = dma_to_phys(dev, dma_limit);
	if (*phys_limit <= zone_dma_limit)
		return GFP_DMA;
	if (*phys_limit <= DMA_BIT_MASK(32))
		return GFP_DMA32;
	return 0;
}
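
The zone choice made by dma_direct_optimal_gfp_mask() can be modelled in
user space as follows. This is a simplified sketch, not the kernel code:
it assumes an identity DMA-to-physical translation (so dma_to_phys() is
omitted), and enum zone_hint, DMA_BIT_MASK_SK(), min_not_zero_u64() and
pick_zone() are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as DMA_BIT_MASK(): n low bits set, all bits for n == 64. */
#define DMA_BIT_MASK_SK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

enum zone_hint { ZONE_HINT_NONE, ZONE_HINT_DMA32, ZONE_HINT_DMA };

/* Minimum of two limits where 0 means "no limit set". */
static uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Pick the most permissive zone whose whole range is addressable under
 * the tighter of the two device limits.
 */
static enum zone_hint pick_zone(uint64_t coherent_mask, uint64_t bus_limit,
				uint64_t zone_dma_limit)
{
	uint64_t limit = min_not_zero_u64(coherent_mask, bus_limit);

	assert(zone_dma_limit <= DMA_BIT_MASK_SK(32));
	if (limit <= zone_dma_limit)
		return ZONE_HINT_DMA;
	if (limit <= DMA_BIT_MASK_SK(32))
		return ZONE_HINT_DMA32;
	return ZONE_HINT_NONE;
}
```

With the default 24-bit zone_dma_limit, a device with a 64-bit coherent
mask but a 30-bit bus limit lands in the DMA32 case, because the bus
limit is the tighter of the two.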

swiotlb: if swiotlb is full, fall back to a transient memory pool

Try to allocate a transient memory pool if no suitable slots can be found
and the respective SWIOTLB is allowed to grow. The transient pool is just
big enough for this one bounce buffer. It is inserted into a per-device
list of transient memory pools, and it is freed again when the bounce
buffer is unmapped.

Transient memory pools are kept in an RCU list. A memory barrier is
required after adding a new entry, because any address within a transient
buffer must be immediately recognized as belonging to the SWIOTLB, even if
it is passed to another CPU.

Deletion does not require any synchronization beyond RCU ordering
guarantees. After a buffer is unmapped, its physical addresses may no
longer be passed to the DMA API, so the memory range of the corresponding
stale entry in the RCU list never matches. If the memory range gets
allocated again, then it happens only after an RCU quiescent state.

Since bounce buffers can now be allocated from different pools, add a
parameter to swiotlb_alloc_pool() to let the caller know which memory pool
is used. Add swiotlb_find_pool() to find the memory pool corresponding to
an address. This function is now also used by is_swiotlb_buffer(), because
a simple boundary check is no longer sufficient.

The logic in swiotlb_alloc_tlb() is taken from __dma_direct_alloc_pages(),
simplified and enhanced to use coherent memory pools if needed.

Note that this is not the most efficient way to provide a bounce buffer,
but when a DMA buffer can't be mapped, something may (and will) actually
break. At that point it is better to make an allocation, even if it may be
an expensive operation.

Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
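
The message above explains why a single boundary check no longer suffices
once transient pools exist: an address has to be matched against every
pool on a per-device list. The sketch below shows that idea only. It is a
plain single-threaded linked list with no RCU, memory barriers or
locking, and struct pool_sketch, pool_add(), pool_del() and pool_find()
are hypothetical names, not the swiotlb API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pool_sketch {
	uint64_t start;			/* first address covered */
	uint64_t end;			/* one past the last address covered */
	struct pool_sketch *next;
};

/* Insert a transient pool at the head of the per-device list. */
static void pool_add(struct pool_sketch **head, struct pool_sketch *pool)
{
	assert(pool->start < pool->end);
	pool->next = *head;
	*head = pool;
}

/* Unlink a pool again, as done when its bounce buffer is unmapped. */
static void pool_del(struct pool_sketch **head, struct pool_sketch *pool)
{
	struct pool_sketch **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == pool) {
			*pp = pool->next;
			pool->next = NULL;
			return;
		}
	}
}

/* Return the pool containing addr, or NULL if addr is not bounced. */
static struct pool_sketch *pool_find(struct pool_sketch *head, uint64_t addr)
{
	struct pool_sketch *p;

	for (p = head; p; p = p->next)
		if (addr >= p->start && addr < p->end)
			return p;
	return NULL;
}
```

In the kernel the list walk runs under RCU, which is what lets deletion
proceed without extra synchronization as the message describes.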

bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
{
	dma_addr_t dma_addr = phys_to_dma_direct(dev, phys);

	if (dma_addr == DMA_MAPPING_ERROR)
		return false;
	return dma_addr + size - 1 <=
		min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit);
}
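
The comparison in dma_coherent_ok() tests the last byte of the buffer,
not one past it, which is why the expression is dma_addr + size - 1. A
minimal user-space sketch of that inclusive-end check follows;
range_fits() is a hypothetical name, and the limit is passed in directly
instead of being derived from a device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if every byte of [dma_addr, dma_addr + size) is <= limit. */
static bool range_fits(uint64_t dma_addr, size_t size, uint64_t limit)
{
	assert(size > 0);
	return dma_addr + size - 1 <= limit;
}
```

A 4 KiB buffer at 0xfffff000 ends at 0xffffffff and so still fits a
32-bit mask, while the same buffer one byte higher does not.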

static int dma_set_decrypted(struct device *dev, void *vaddr, size_t size)
{
	if (!force_dma_unencrypted(dev))
		return 0;
	return set_memory_decrypted((unsigned long)vaddr, PFN_UP(size));
}

static int dma_set_encrypted(struct device *dev, void *vaddr, size_t size)
{
	int ret;

	if (!force_dma_unencrypted(dev))
		return 0;
	ret = set_memory_encrypted((unsigned long)vaddr, PFN_UP(size));
	if (ret)
		pr_warn_ratelimited("leaking DMA memory that can't be re-encrypted\n");
	return ret;
}

static void __dma_direct_free_pages(struct device *dev, struct page *page,
				    size_t size)
{
	if (swiotlb_free(dev, page, size))
		return;
	dma_free_contiguous(dev, page, size);
}

static struct page *dma_direct_alloc_swiotlb(struct device *dev, size_t size)
{
	struct page *page = swiotlb_alloc(dev, size);

	if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
		swiotlb_free(dev, page, size);
		return NULL;
	}

	return page;
}

static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
		gfp_t gfp, bool allow_highmem)
{
	int node = dev_to_node(dev);
	struct page *page = NULL;
	u64 phys_limit;

	WARN_ON_ONCE(!PAGE_ALIGNED(size));

	if (is_swiotlb_for_alloc(dev))
		return dma_direct_alloc_swiotlb(dev, size);

	gfp |= dma_direct_optimal_gfp_mask(dev, &phys_limit);
	page = dma_alloc_contiguous(dev, size, gfp);
	if (page) {
		if (!dma_coherent_ok(dev, page_to_phys(page), size) ||
		    (!allow_highmem && PageHighMem(page))) {
			dma_free_contiguous(dev, page, size);
			page = NULL;
		}
	}
again:
	if (!page)
		page = alloc_pages_node(node, gfp, get_order(size));
	if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
		__free_pages(page, get_order(size));
		page = NULL;

		if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
		    phys_limit < DMA_BIT_MASK(64) &&
		    !(gfp & (GFP_DMA32 | GFP_DMA))) {
			gfp |= GFP_DMA32;
			goto again;
		}

		if (IS_ENABLED(CONFIG_ZONE_DMA) && !(gfp & GFP_DMA)) {
			gfp = (gfp & ~GFP_DMA32) | GFP_DMA;
			goto again;
		}
	}

	return page;
}
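
The "again:" loop in __dma_direct_alloc_pages() retries in progressively
lower zones when the page it got back is not addressable by the device.
The user-space model below is a sketch of that control flow only.
Everything in it is hypothetical: enum zone_sk, the alloc_fn callback,
demo_alloc() with its made-up addresses, and the use of 0 as "allocation
failed". It also omits the CMA attempt, highmem handling and the freeing
of rejected pages.

```c
#include <assert.h>
#include <stdint.h>

enum zone_sk { ZONE_NORMAL_SK, ZONE_DMA32_SK, ZONE_DMA_SK };

/* Hypothetical allocator: returns a physical address, or 0 on failure. */
typedef uint64_t (*alloc_fn)(enum zone_sk zone);

/* Demo allocator with one fixed, made-up address per zone. */
static uint64_t demo_alloc(enum zone_sk zone)
{
	switch (zone) {
	case ZONE_DMA_SK:
		return 0x00800000ULL;	/* below 16 MiB */
	case ZONE_DMA32_SK:
		return 0x80000000ULL;	/* below 4 GiB */
	default:
		return 0x100000000ULL;	/* above 4 GiB */
	}
}

/*
 * Allocate from the starting zone; while the result lies above limit,
 * retry in the next lower zone that is available. *attempts counts the
 * allocator calls made.
 */
static uint64_t alloc_with_fallback(alloc_fn alloc, enum zone_sk zone,
				    uint64_t size, uint64_t limit,
				    int have_dma32, int have_dma,
				    int *attempts)
{
	assert(alloc && attempts && size > 0);

	for (;;) {
		uint64_t phys = alloc(zone);

		(*attempts)++;
		if (!phys)
			return 0;		/* out of memory: no retry */
		if (phys + size - 1 <= limit)
			return phys;		/* addressable: done */

		/* The kernel frees the unusable page at this point. */
		if (have_dma32 && limit < UINT64_MAX &&
		    zone == ZONE_NORMAL_SK) {
			zone = ZONE_DMA32_SK;
			continue;
		}
		if (have_dma && zone != ZONE_DMA_SK) {
			zone = ZONE_DMA_SK;
			continue;
		}
		return 0;
	}
}
```

As in the kernel function, a failed allocation is not retried; only an
allocation that succeeded but is out of the device's reach triggers the
step down to the next zone.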

/*
 * Check if a potentially blocking operation needs to dip into the atomic
 * pools for the given device/gfp.
 */
static bool dma_direct_use_pool(struct device *dev, gfp_t gfp)
{
	return !gfpflags_allow_blocking(gfp) && !is_swiotlb_for_alloc(dev);
}

static void *dma_direct_alloc_from_pool(struct device *dev, size_t size,
		dma_addr_t *dma_handle, gfp_t gfp)
{
	struct page *page;
	u64 phys_limit;
	void *ret;

	if (WARN_ON_ONCE(!IS_ENABLED(CONFIG_DMA_COHERENT_POOL)))
		return NULL;

	gfp |= dma_direct_optimal_gfp_mask(dev, &phys_limit);
	page = dma_alloc_from_pool(dev, size, &ret, gfp, dma_coherent_ok);
	if (!page)
		return NULL;
	*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
	return ret;
}

static void *dma_direct_alloc_no_mapping(struct device *dev, size_t size,
		dma_addr_t *dma_handle, gfp_t gfp)
{
	struct page *page;

	page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO, true);
	if (!page)
		return NULL;

	/* remove any dirty cache lines on the kernel alias */
	if (!PageHighMem(page))
		arch_dma_prep_coherent(page, size);

	/* return the page pointer as the opaque cookie */
	*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
	return page;
}

void *dma_direct_alloc(struct device *dev, size_t size,
		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
{
	bool remap = false, set_uncached = false;
	struct page *page;
	void *ret;

	size = PAGE_ALIGN(size);
	if (attrs & DMA_ATTR_NO_WARN)
		gfp |= __GFP_NOWARN;

	if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
	    !force_dma_unencrypted(dev) && !is_swiotlb_for_alloc(dev))
		return dma_direct_alloc_no_mapping(dev, size, dma_handle, gfp);

	if (!dev_is_dma_coherent(dev)) {
		if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_ALLOC) &&
		    !is_swiotlb_for_alloc(dev))
			return arch_dma_alloc(dev, size, dma_handle, gfp,
					      attrs);

		/*
		 * If there is a global pool, always allocate from it for
		 * non-coherent devices.
		 */
		if (IS_ENABLED(CONFIG_DMA_GLOBAL_POOL))
			return dma_alloc_from_global_coherent(dev, size,
					dma_handle);

		/*
		 * Otherwise we require the architecture to either be able to
		 * mark arbitrary parts of the kernel direct mapping uncached,
		 * or remap it uncached.
		 */
		set_uncached = IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED);
		remap = IS_ENABLED(CONFIG_DMA_DIRECT_REMAP);
		if (!set_uncached && !remap) {
			pr_warn_once("coherent DMA allocations not supported on this platform.\n");
			return NULL;
		}
	}

	/*
	 * Remapping or decrypting memory may block, allocate the memory from
	 * the atomic pools instead if we aren't allowed to block.
	 */
	if ((remap || force_dma_unencrypted(dev)) &&
	    dma_direct_use_pool(dev, gfp))
		return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);

	/* we always manually zero the memory once we are done */
	page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO, true);
	if (!page)
		return NULL;

	/*
	 * dma_alloc_contiguous can return highmem pages depending on a
	 * combination of the cma= arguments and per-arch setup. These need to
	 * be remapped to return a kernel virtual address.
	 */
	if (PageHighMem(page)) {
		remap = true;
		set_uncached = false;
	}

	if (remap) {
		pgprot_t prot = dma_pgprot(dev, PAGE_KERNEL, attrs);

		if (force_dma_unencrypted(dev))
			prot = pgprot_decrypted(prot);

		/* remove any dirty cache lines on the kernel alias */
		arch_dma_prep_coherent(page, size);

		/* create a coherent mapping */
		ret = dma_common_contiguous_remap(page, size, prot,
				__builtin_return_address(0));
		if (!ret)
			goto out_free_pages;
	} else {
		ret = page_address(page);
		if (dma_set_decrypted(dev, ret, size))
			goto out_leak_pages;
	}

	memset(ret, 0, size);

	if (set_uncached) {
		arch_dma_prep_coherent(page, size);
		ret = arch_dma_set_uncached(ret, size);
		if (IS_ERR(ret))
			goto out_encrypt_pages;
	}

	*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
	return ret;

out_encrypt_pages:
	if (dma_set_encrypted(dev, page_address(page), size))
		return NULL;
out_free_pages:
	__dma_direct_free_pages(dev, page, size);
	return NULL;
out_leak_pages:
	return NULL;
}

void dma_direct_free(struct device *dev, size_t size,
		void *cpu_addr, dma_addr_t dma_addr, unsigned long attrs)
{
	unsigned int page_order = get_order(size);

	if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
	    !force_dma_unencrypted(dev) && !is_swiotlb_for_alloc(dev)) {
		/* cpu_addr is a struct page cookie, not a kernel address */
		dma_free_contiguous(dev, cpu_addr, size);
		return;
	}

	if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_ALLOC) &&
	    !dev_is_dma_coherent(dev) &&
	    !is_swiotlb_for_alloc(dev)) {
		arch_dma_free(dev, size, cpu_addr, dma_addr, attrs);
		return;
	}

	if (IS_ENABLED(CONFIG_DMA_GLOBAL_POOL) &&
	    !dev_is_dma_coherent(dev)) {
		if (!dma_release_from_global_coherent(page_order, cpu_addr))
			WARN_ON_ONCE(1);
		return;
	}

	/* If cpu_addr is not from an atomic pool, dma_free_from_pool() fails */
	if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) &&
	    dma_free_from_pool(dev, cpu_addr, PAGE_ALIGN(size)))
		return;

	if (is_vmalloc_addr(cpu_addr)) {
		vunmap(cpu_addr);
	} else {
		if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED))
			arch_dma_clear_uncached(cpu_addr, size);
		if (dma_set_encrypted(dev, cpu_addr, size))
			return;
	}

	__dma_direct_free_pages(dev, dma_direct_to_page(dev, dma_addr), size);
}

struct page *dma_direct_alloc_pages(struct device *dev, size_t size,
		dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
{
	struct page *page;
	void *ret;

	if (force_dma_unencrypted(dev) && dma_direct_use_pool(dev, gfp))
		return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);

	page = __dma_direct_alloc_pages(dev, size, gfp, false);
	if (!page)
		return NULL;

	ret = page_address(page);
	if (dma_set_decrypted(dev, ret, size))
		goto out_leak_pages;
	memset(ret, 0, size);
	*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
	return page;
out_leak_pages:
	return NULL;
}

void dma_direct_free_pages(struct device *dev, size_t size,
		struct page *page, dma_addr_t dma_addr,
		enum dma_data_direction dir)
{
	void *vaddr = page_address(page);

	/* If cpu_addr is not from an atomic pool, dma_free_from_pool() fails */
	if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) &&
	    dma_free_from_pool(dev, vaddr, size))
		return;

	if (dma_set_encrypted(dev, vaddr, size))
		return;
	__dma_direct_free_pages(dev, page, size);
}

swiotlb: reduce swiotlb pool lookups

With CONFIG_SWIOTLB_DYNAMIC enabled, each round-trip map/unmap pair
in the swiotlb results in 6 calls to swiotlb_find_pool(). In multiple
places, the pool is found and used in one function, and then must
be found again in the next function that is called because only the
tlb_addr is passed as an argument. These are the six call sites:

dma_direct_map_page:
 1. swiotlb_map -> swiotlb_tbl_map_single -> swiotlb_bounce

dma_direct_unmap_page:
 2. dma_direct_sync_single_for_cpu -> is_swiotlb_buffer
 3. dma_direct_sync_single_for_cpu -> swiotlb_sync_single_for_cpu ->
    swiotlb_bounce
 4. is_swiotlb_buffer
 5. swiotlb_tbl_unmap_single -> swiotlb_del_transient
 6. swiotlb_tbl_unmap_single -> swiotlb_release_slots

Reduce the number of calls by finding the pool at a higher level, and
passing it as an argument instead of searching again. A key change is
for is_swiotlb_buffer() to return a pool pointer instead of a boolean,
and then pass this pool pointer to subsequent swiotlb functions.

There are 9 occurrences of is_swiotlb_buffer() used to test if a buffer
is a swiotlb buffer before calling a swiotlb function. To reduce code
duplication in getting the pool pointer and passing it as an argument,
introduce inline wrappers for this pattern. The generated code is
essentially unchanged.

Since is_swiotlb_buffer() no longer returns a boolean, rename some
functions to reflect the change:
 * swiotlb_find_pool() becomes __swiotlb_find_pool()
 * is_swiotlb_buffer() becomes swiotlb_find_pool()
 * is_xen_swiotlb_buffer() becomes xen_swiotlb_find_pool()

With these changes, a round-trip map/unmap pair requires only 2 pool
lookups (listed using the new names and wrappers):

dma_direct_unmap_page:
 1. dma_direct_sync_single_for_cpu -> swiotlb_find_pool
 2. swiotlb_tbl_unmap_single -> swiotlb_find_pool

These changes come from noticing the inefficiencies in a code review,
not from performance measurements. With CONFIG_SWIOTLB_DYNAMIC,
__swiotlb_find_pool() is not trivial, and it uses an RCU read lock,
so avoiding the redundant calls helps performance in a hot path.
When CONFIG_SWIOTLB_DYNAMIC is *not* set, the code size reduction
is minimal and the perf benefits are likely negligible, but no
harm is done.

No functional change is intended.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>

#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
    defined(CONFIG_SWIOTLB)
void dma_direct_sync_sg_for_device(struct device *dev,
		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
{
	struct scatterlist *sg;
	int i;

	for_each_sg(sgl, sg, nents, i) {
		phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg));

		swiotlb_sync_single_for_device(dev, paddr, sg->length, dir);

		if (!dev_is_dma_coherent(dev))
			arch_sync_dma_for_device(paddr, sg->length,
					dir);
	}
}
#endif
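
The key change in the "swiotlb: reduce swiotlb pool lookups" commit
message, returning a pool pointer instead of a boolean and passing it
down, can be illustrated with the user-space sketch below. It is not the
swiotlb code: struct pool_sk, the single demo_pool, the lookups counter
and all function names are invented for the example, and the "sync"
functions just return an offset so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pool_sk {
	uint64_t start;
	uint64_t end;		/* one past the last address */
};

static struct pool_sk demo_pool = { 0x1000, 0x2000 };
static unsigned int lookups;	/* counts calls to the costly lookup */

/* Stand-in for the non-trivial pool search. */
static struct pool_sk *find_pool_sk(uint64_t addr)
{
	lookups++;
	if (addr >= demo_pool.start && addr < demo_pool.end)
		return &demo_pool;
	return NULL;
}

/* Old pattern: boolean test, then the callee searches again. */
static int is_buffer_old(uint64_t addr)
{
	return find_pool_sk(addr) != NULL;
}

static uint64_t sync_old(uint64_t addr)
{
	struct pool_sk *pool = find_pool_sk(addr);

	return pool ? addr - pool->start : 0;
}

static uint64_t unmap_sync_old(uint64_t addr)
{
	if (!is_buffer_old(addr))
		return 0;
	return sync_old(addr);
}

/* New pattern: look up once and hand the pool to the callee. */
static uint64_t sync_new(struct pool_sk *pool, uint64_t addr)
{
	assert(pool);
	return addr - pool->start;
}

static uint64_t unmap_sync_new(uint64_t addr)
{
	struct pool_sk *pool = find_pool_sk(addr);

	if (!pool)
		return 0;
	return sync_new(pool, addr);
}
```

Both variants compute the same result; the old one performs two lookups
per call and the new one performs one, which is the saving the commit
message describes for each step of the unmap path.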

#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) || \
    defined(CONFIG_SWIOTLB)
void dma_direct_sync_sg_for_cpu(struct device *dev,
		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
{
	struct scatterlist *sg;
	int i;

	for_each_sg(sgl, sg, nents, i) {
		phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg));

		if (!dev_is_dma_coherent(dev))
			arch_sync_dma_for_cpu(paddr, sg->length, dir);

swiotlb: reduce swiotlb pool lookups
With CONFIG_SWIOTLB_DYNAMIC enabled, each round-trip map/unmap pair
in the swiotlb results in 6 calls to swiotlb_find_pool(). In multiple
places, the pool is found and used in one function, and then must
be found again in the next function that is called because only the
tlb_addr is passed as an argument. These are the six call sites:
dma_direct_map_page:
1. swiotlb_map -> swiotlb_tbl_map_single -> swiotlb_bounce
dma_direct_unmap_page:
2. dma_direct_sync_single_for_cpu -> is_swiotlb_buffer
3. dma_direct_sync_single_for_cpu -> swiotlb_sync_single_for_cpu ->
swiotlb_bounce
4. is_swiotlb_buffer
5. swiotlb_tbl_unmap_single -> swiotlb_del_transient
6. swiotlb_tbl_unmap_single -> swiotlb_release_slots
Reduce the number of calls by finding the pool at a higher level, and
passing it as an argument instead of searching again. A key change is
for is_swiotlb_buffer() to return a pool pointer instead of a boolean,
and then pass this pool pointer to subsequent swiotlb functions.
There are 9 occurrences of is_swiotlb_buffer() used to test if a buffer
is a swiotlb buffer before calling a swiotlb function. To reduce code
duplication in getting the pool pointer and passing it as an argument,
introduce inline wrappers for this pattern. The generated code is
essentially unchanged.
Since is_swiotlb_buffer() no longer returns a boolean, rename some
functions to reflect the change:
* swiotlb_find_pool() becomes __swiotlb_find_pool()
* is_swiotlb_buffer() becomes swiotlb_find_pool()
* is_xen_swiotlb_buffer() becomes xen_swiotlb_find_pool()
With these changes, a round-trip map/unmap pair requires only 2 pool
lookups (listed using the new names and wrappers):
dma_direct_unmap_page:
1. dma_direct_sync_single_for_cpu -> swiotlb_find_pool
2. swiotlb_tbl_unmap_single -> swiotlb_find_pool
These changes come from noticing the inefficiencies in a code review,
not from performance measurements. With CONFIG_SWIOTLB_DYNAMIC,
__swiotlb_find_pool() is not trivial, and it uses an RCU read lock,
so avoiding the redundant calls helps performance in a hot path.
When CONFIG_SWIOTLB_DYNAMIC is *not* set, the code size reduction
is minimal and the perf benefits are likely negligible, but no
harm is done.
No functional change is intended.
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-07-09 03:41:00 +08:00
		swiotlb_sync_single_for_cpu(dev, paddr, sg->length, dir);
2020-08-17 22:41:50 +08:00

		if (dir == DMA_FROM_DEVICE)
			arch_dma_mark_clean(paddr, sg->length);
2018-12-03 18:43:54 +08:00
	}
2018-09-08 17:22:43 +08:00

2018-12-03 18:43:54 +08:00
	if (!dev_is_dma_coherent(dev))
2019-11-08 01:03:11 +08:00
		arch_sync_dma_for_cpu_all();
2018-09-08 17:22:43 +08:00
}
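
The lookup-saving pattern described in the "swiotlb: reduce swiotlb pool lookups" commit message can be sketched in user space. This is only an illustration under stated assumptions, not the kernel code: struct pool, find_pool(), sync_in_pool(), sync_if_pooled() and the lookups counter are hypothetical names invented for this sketch. The finder returns a pointer (NULL meaning "not in any pool") instead of a boolean, and an inline wrapper hands that pointer to the worker so the worker never has to search again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bounce-buffer pool: one address range. */
struct pool {
	uint64_t start;
	uint64_t end;	/* exclusive */
	int syncs;	/* how many times this pool was synced */
};

static struct pool pools[] = {
	{ 0x1000, 0x2000, 0 },
	{ 0x8000, 0x9000, 0 },
};

static int lookups;	/* counts searches, to make the saving visible */

/* Returns the owning pool, or NULL if addr is not in any pool. */
static struct pool *find_pool(uint64_t addr)
{
	size_t i;

	lookups++;
	for (i = 0; i < sizeof(pools) / sizeof(pools[0]); i++)
		if (addr >= pools[i].start && addr < pools[i].end)
			return &pools[i];
	return NULL;
}

/* Worker: takes the pool as an argument instead of searching again. */
static void sync_in_pool(struct pool *pool, uint64_t addr)
{
	assert(pool != NULL);
	(void)addr;
	pool->syncs++;
}

/* Wrapper: membership test and worker call share a single lookup. */
static inline int sync_if_pooled(uint64_t addr)
{
	struct pool *pool = find_pool(addr);

	if (!pool)
		return 0;
	sync_in_pool(pool, addr);
	return 1;
}
```

With a boolean test followed by a worker that searches on its own, each sync_if_pooled() call would cost two searches; in this sketch it costs one, which is the saving the commit message counts per call site.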

2022-07-09 00:50:56 +08:00
/*
 * Unmaps segments, except for ones marked as pci_p2pdma which do not
 * require any further action as they contain a bus address.
 */
2018-12-03 18:43:54 +08:00
void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
2018-09-08 17:22:43 +08:00
		int nents, enum dma_data_direction dir, unsigned long attrs)
{
2018-12-03 18:43:54 +08:00
	struct scatterlist *sg;
	int i;

2022-07-09 00:50:56 +08:00
	for_each_sg(sgl, sg, nents, i) {
2023-06-12 23:31:57 +08:00
		if (sg_dma_is_bus_address(sg))
2022-07-09 00:50:56 +08:00
			sg_dma_unmark_bus_address(sg);
		else
			dma_direct_unmap_page(dev, sg->dma_address,
					      sg_dma_len(sg), dir, attrs);
	}
2018-09-08 17:22:43 +08:00
}
#endif

2018-04-16 21:24:51 +08:00
int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
		enum dma_data_direction dir, unsigned long attrs)
2016-02-03 13:46:32 +08:00
{
2022-07-09 00:50:56 +08:00
	struct pci_p2pdma_map_state p2pdma_state = {};
	enum pci_p2pdma_map_type map;
2016-02-03 13:46:32 +08:00
	struct scatterlist *sg;
2022-07-09 00:50:56 +08:00
	int i, ret;
2016-02-03 13:46:32 +08:00

	for_each_sg(sgl, sg, nents, i) {
2022-07-09 00:50:56 +08:00
		if (is_pci_p2pdma_page(sg_page(sg))) {
			map = pci_p2pdma_map_segment(&p2pdma_state, dev, sg);
			switch (map) {
			case PCI_P2PDMA_MAP_BUS_ADDR:
				continue;
			case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
				/*
				 * Any P2P mapping that traverses the PCI
				 * host bridge must be mapped with CPU physical
				 * address and not PCI bus addresses. This is
				 * done with dma_direct_map_page() below.
				 */
				break;
			default:
				ret = -EREMOTEIO;
				goto out_unmap;
			}
		}

2018-12-03 18:14:09 +08:00
		sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
				sg->offset, sg->length, dir, attrs);
2022-07-09 00:50:56 +08:00
		if (sg->dma_address == DMA_MAPPING_ERROR) {
			ret = -EIO;
2018-12-03 18:43:54 +08:00
			goto out_unmap;
2022-07-09 00:50:56 +08:00
		}
2016-02-03 13:46:32 +08:00
		sg_dma_len(sg) = sg->length;
	}

	return nents;
2018-12-03 18:43:54 +08:00

out_unmap:
	dma_direct_unmap_sg(dev, sgl, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC);
2022-07-09 00:50:56 +08:00
	return ret;
2016-02-03 13:46:32 +08:00
}
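
The error path of dma_direct_map_sg() follows a "map everything or undo what was mapped" shape: on failure at index i, only the first i entries are unmapped. A minimal user-space sketch of that shape follows. It assumes a toy mapper; struct seg, map_one(), unmap_one(), map_all(), MAP_ERROR and unmapped_count are hypothetical names for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define MAP_ERROR UINT64_MAX	/* hypothetical sentinel, in the role of DMA_MAPPING_ERROR */

struct seg {
	uint64_t phys;
	uint64_t dma;	/* 0 while unmapped (the sketch never maps address 0) */
};

static int unmapped_count;	/* incremented by unmap_one() */

/* Hypothetical mapper: fails for any address at or above 4 GiB. */
static uint64_t map_one(uint64_t phys)
{
	return phys >= (1ULL << 32) ? MAP_ERROR : phys;
}

static void unmap_one(struct seg *s)
{
	assert(s->dma != MAP_ERROR);	/* only mapped entries are undone */
	s->dma = 0;
	unmapped_count++;
}

/* Returns nents on success, or -1 after undoing the first i mappings. */
static int map_all(struct seg *segs, int nents)
{
	int i;

	for (i = 0; i < nents; i++) {
		segs[i].dma = map_one(segs[i].phys);
		if (segs[i].dma == MAP_ERROR)
			goto out_unmap;
	}
	return nents;

out_unmap:
	/* Entries [0, i) were mapped; entry i itself failed. */
	segs[i].dma = 0;
	while (i-- > 0)
		unmap_one(&segs[i]);
	return -1;
}
```

Passing the loop index as the count on the error path is what keeps the unwind from touching entries that were never mapped.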

2019-01-05 01:20:05 +08:00
dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
		size_t size, enum dma_data_direction dir, unsigned long attrs)
{
	dma_addr_t dma_addr = paddr;

2019-11-20 00:38:58 +08:00
	if (unlikely(!dma_capable(dev, dma_addr, size, false))) {
2020-02-03 21:54:50 +08:00
		dev_err_once(dev,
			     "DMA addr %pad+%zu overflow (mask %llx, bus limit %llx).\n",
			     &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
		WARN_ON_ONCE(1);
2019-01-05 01:20:05 +08:00
		return DMA_MAPPING_ERROR;
	}

	return dma_addr;
}

2019-10-29 18:01:37 +08:00
int dma_direct_get_sgtable(struct device *dev, struct sg_table *sgt,
		void *cpu_addr, dma_addr_t dma_addr, size_t size,
		unsigned long attrs)
{
	struct page *page = dma_direct_to_page(dev, dma_addr);
	int ret;

	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
	if (!ret)
		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
	return ret;
}

bool dma_direct_can_mmap(struct device *dev)
{
	return dev_is_dma_coherent(dev) ||
		IS_ENABLED(CONFIG_DMA_NONCOHERENT_MMAP);
}

int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
		void *cpu_addr, dma_addr_t dma_addr, size_t size,
		unsigned long attrs)
{
	unsigned long user_count = vma_pages(vma);
	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	unsigned long pfn = PHYS_PFN(dma_to_phys(dev, dma_addr));
	int ret = -ENXIO;

	vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
2022-03-31 14:01:21 +08:00
	if (force_dma_unencrypted(dev))
		vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
2019-10-29 18:01:37 +08:00

	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
		return ret;
2021-06-23 20:21:16 +08:00
	if (dma_mmap_from_global_coherent(vma, cpu_addr, size, &ret))
		return ret;
2019-10-29 18:01:37 +08:00

	if (vma->vm_pgoff >= count || user_count > count - vma->vm_pgoff)
		return -ENXIO;
	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
			user_count << PAGE_SHIFT, vma->vm_page_prot);
}
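
The bounds test in dma_direct_mmap() compares the requested page count against count - vma->vm_pgoff rather than adding the offset to the request. A small sketch of why that form is safe against wraparound is given below. window_fits() is a hypothetical helper written for this illustration; it returns the logical negation of the kernel's -ENXIO condition.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Returns true when a window of `want` pages starting at page `off`
 * fits inside a buffer of `count` pages. Written as a subtraction so
 * that off + want is never computed and therefore cannot wrap around.
 */
static bool window_fits(unsigned long off, unsigned long want,
			unsigned long count)
{
	if (off >= count)
		return false;
	/* count - off cannot underflow because off < count here. */
	return want <= count - off;
}
```

A naive `off + want <= count` test would accept off = 2 with want = ULONG_MAX on a 4-page buffer, because the sum wraps to 1; the subtraction form rejects it.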

2017-12-24 22:04:32 +08:00
int dma_direct_supported(struct device *dev, u64 mask)
{
2020-02-04 01:11:10 +08:00
	u64 min_mask = (max_pfn - 1) << PAGE_SHIFT;
2018-09-07 15:31:58 +08:00

2020-02-04 01:11:10 +08:00
	/*
	 * Because 32-bit DMA masks are so common we expect every architecture
	 * to be able to satisfy them - either by not supporting more physical
	 * memory, or by providing a ZONE_DMA32. If neither is the case, the
	 * architecture needs to use an IOMMU instead of the direct mapping.
	 */
	if (mask >= DMA_BIT_MASK(32))
		return 1;
2018-09-07 15:31:58 +08:00

2018-12-17 22:39:16 +08:00
	/*
2020-08-17 23:34:03 +08:00
	 * This check needs to be against the actual bit mask value, so use
	 * phys_to_dma_unencrypted() here so that the SME encryption mask isn't
2018-12-17 22:39:16 +08:00
	 * part of the check.
	 */
2020-02-04 01:11:10 +08:00
	if (IS_ENABLED(CONFIG_ZONE_DMA))
2024-08-11 15:09:35 +08:00
		min_mask = min_t(u64, min_mask, zone_dma_limit);
2020-08-17 23:34:03 +08:00
	return mask >= phys_to_dma_unencrypted(dev, min_mask);
2017-12-24 22:04:32 +08:00
}
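
The decision made by dma_direct_supported() can be restated as plain mask arithmetic. The sketch below is a simplification under stated assumptions: it skips the phys_to_dma_unencrypted() translation entirely, and mask_supported(), last_page_addr and zone_limit are hypothetical names introduced for illustration. BIT_MASK_N() has the same shape as the kernel's DMA_BIT_MASK().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same shape as DMA_BIT_MASK(): the low n bits set. */
#define BIT_MASK_N(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * last_page_addr: address of the last page of RAM.
 * zone_limit: top of the low DMA zone; pass UINT64_MAX if there is none.
 */
static bool mask_supported(uint64_t mask, uint64_t last_page_addr,
			   uint64_t zone_limit)
{
	uint64_t min_mask = last_page_addr;

	/* 32-bit and wider masks are always accepted. */
	if (mask >= BIT_MASK_N(32))
		return true;

	/* Narrower masks only need to reach the low zone, if one exists. */
	if (zone_limit < min_mask)
		min_mask = zone_limit;

	return mask >= min_mask;
}
```

For example, a 24-bit mask is accepted on a machine with 8 GiB of RAM only if the low zone tops out at or below 16 MiB - 1; with a 30-bit zone limit the same mask is rejected.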
2019-02-07 19:59:15 +08:00

dma-mapping: fix dma_addressing_limited() if dma_range_map can't cover all system RAM

In an unusual case, the range map covers right up to the top of system
RAM but leaves a hole somewhere lower down. This then prevents DMA
mapping for the NVMe device in the checking path of phys_to_dma() and
causes hangs at boot.

E.g. on an Armv8 Ampere server, the DSDT ACPI table is:

 Method (_DMA, 0, Serialized)  // _DMA: Direct Memory Access
 {
     Name (RBUF, ResourceTemplate ()
     {
         QWordMemory (ResourceConsumer, PosDecode, MinFixed,
             MaxFixed, Cacheable, ReadWrite,
             0x0000000000000000, // Granularity
             0x0000000000000000, // Range Minimum
             0x00000000FFFFFFFF, // Range Maximum
             0x0000000000000000, // Translation Offset
             0x0000000100000000, // Length
             ,, , AddressRangeMemory, TypeStatic)
         QWordMemory (ResourceConsumer, PosDecode, MinFixed,
             MaxFixed, Cacheable, ReadWrite,
             0x0000000000000000, // Granularity
             0x0000006010200000, // Range Minimum
             0x000000602FFFFFFF, // Range Maximum
             0x0000000000000000, // Translation Offset
             0x000000001FE00000, // Length
             ,, , AddressRangeMemory, TypeStatic)
         QWordMemory (ResourceConsumer, PosDecode, MinFixed,
             MaxFixed, Cacheable, ReadWrite,
             0x0000000000000000, // Granularity
             0x00000060F0000000, // Range Minimum
             0x00000060FFFFFFFF, // Range Maximum
             0x0000000000000000, // Translation Offset
             0x0000000010000000, // Length
             ,, , AddressRangeMemory, TypeStatic)
         QWordMemory (ResourceConsumer, PosDecode, MinFixed,
             MaxFixed, Cacheable, ReadWrite,
             0x0000000000000000, // Granularity
             0x0000007000000000, // Range Minimum
             0x000003FFFFFFFFFF, // Range Maximum
             0x0000000000000000, // Translation Offset
             0x0000039000000000, // Length
             ,, , AddressRangeMemory, TypeStatic)
     })

But the System RAM ranges are:

 cat /proc/iomem |grep -i ram
 90000000-91ffffff : System RAM
 92900000-fffbffff : System RAM
 880000000-fffffffff : System RAM
 8800000000-bff5990fff : System RAM
 bff59d0000-bff5a4ffff : System RAM
 bff8000000-bfffffffff : System RAM

So some RAM ranges fall outside dma_range_map.

Fix it by checking whether each of the system RAM resources can be
properly encompassed within the dma_range_map.

Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-10-28 18:20:59 +08:00
/*
 * Check whether all RAM resource ranges are covered by the DMA range map.
 * Returns 0 when further checking is needed.
 * Returns 1 if some RAM range cannot be covered by dma_range_map.
 */
static int check_ram_in_range_map(unsigned long start_pfn,
				  unsigned long nr_pages, void *data)
{
	unsigned long end_pfn = start_pfn + nr_pages;
	const struct bus_dma_region *bdr = NULL;
	const struct bus_dma_region *m;
	struct device *dev = data;

	while (start_pfn < end_pfn) {
		for (m = dev->dma_range_map; PFN_DOWN(m->size); m++) {
			unsigned long cpu_start_pfn = PFN_DOWN(m->cpu_start);

			if (start_pfn >= cpu_start_pfn &&
			    start_pfn - cpu_start_pfn < PFN_DOWN(m->size)) {
				bdr = m;
				break;
			}
		}
		if (!bdr)
			return 1;

		start_pfn = PFN_DOWN(bdr->cpu_start) + PFN_DOWN(bdr->size);
	}

	return 0;
}

bool dma_direct_all_ram_mapped(struct device *dev)
{
	if (!dev->dma_range_map)
		return true;
	return !walk_system_ram_range(0, PFN_DOWN(ULONG_MAX) + 1, dev,
				      check_ram_in_range_map);
}
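
The coverage walk can be tried out in user space with the numbers from the commit message. The sketch below is an approximation under stated assumptions: it works on byte addresses rather than PFNs, it looks the match up afresh on every pass of the loop, and struct range, find_range(), ram_covered() and dma_map are hypothetical names for illustration. The table holds the four _DMA windows as (Range Minimum, Length) pairs, with a zero-size terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t size;	/* size 0 terminates the table */
};

/* Returns the entry containing addr, or NULL if none does. */
static const struct range *find_range(const struct range *map, uint64_t addr)
{
	for (; map->size; map++)
		if (addr >= map->start && addr - map->start < map->size)
			return map;
	return NULL;
}

/* True when every byte of [start, end) lies in some map entry. */
static bool ram_covered(const struct range *map, uint64_t start, uint64_t end)
{
	while (start < end) {
		/* Fresh lookup on each pass of the loop. */
		const struct range *r = find_range(map, start);

		if (!r)
			return false;
		start = r->start + r->size;	/* skip past the covered stretch */
	}
	return true;
}

/* The four _DMA windows from the commit message, zero-terminated. */
static const struct range dma_map[] = {
	{ 0x0000000000000000ULL, 0x0000000100000000ULL },
	{ 0x0000006010200000ULL, 0x000000001FE00000ULL },
	{ 0x00000060F0000000ULL, 0x0000000010000000ULL },
	{ 0x0000007000000000ULL, 0x0000039000000000ULL },
	{ 0, 0 },
};
```

Fed the /proc/iomem ranges quoted above, the sketch reports 90000000-91ffffff and 8800000000-bff5990fff as covered, and 880000000-fffffffff as not covered, since it falls between the first window (which ends at 4 GiB) and the second (which starts at 0x6010200000).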

2019-02-07 19:59:15 +08:00
size_t dma_direct_max_mapping_size(struct device *dev)
{
	/* If SWIOTLB is active, use its maximum mapping size */
2021-06-19 11:40:36 +08:00
	if (is_swiotlb_active(dev) &&
2021-06-24 23:55:20 +08:00
	    (dma_addressing_limited(dev) || is_swiotlb_force_bounce(dev)))
2019-07-17 04:00:54 +08:00
		return swiotlb_max_mapping_size(dev);
	return SIZE_MAX;
2019-02-07 19:59:15 +08:00
}
2020-06-29 21:03:56 +08:00

bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr)
{
	return !dev_is_dma_coherent(dev) ||
2024-07-09 03:41:00 +08:00
	       swiotlb_find_pool(dev, dma_to_phys(dev, dma_addr));
2020-06-29 21:03:56 +08:00
}
2020-09-18 00:43:40 +08:00

/**
 * dma_direct_set_offset - Assign scalar offset for a single DMA range.
 * @dev:	device pointer; needed to "own" the alloced memory.
 * @cpu_start:  beginning of memory region covered by this offset.
 * @dma_start:  beginning of DMA/PCI region covered by this offset.
 * @size:	size of the region.
 *
 * This is for the simple case of a uniform offset which cannot
 * be discovered by "dma-ranges".
 *
 * It returns -ENOMEM if out of memory, -EINVAL if a map
 * already exists, 0 otherwise.
 *
 * Note: any call to this from a driver is a bug.  The mapping needs
 * to be described by the device tree or other firmware interfaces.
 */
int dma_direct_set_offset(struct device *dev, phys_addr_t cpu_start,
			 dma_addr_t dma_start, u64 size)
{
	struct bus_dma_region *map;
	u64 offset = (u64)cpu_start - (u64)dma_start;

	if (dev->dma_range_map) {
		dev_err(dev, "attempt to add DMA range to existing map\n");
		return -EINVAL;
	}

	if (!offset)
		return 0;

	map = kcalloc(2, sizeof(*map), GFP_KERNEL);
	if (!map)
		return -ENOMEM;
	map[0].cpu_start = cpu_start;
	map[0].dma_start = dma_start;
	map[0].size = size;
	dev->dma_range_map = map;
	return 0;
}
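
dma_direct_set_offset() allocates two entries and fills only the first, so the zeroed second entry acts as the end-of-table marker that range-map walkers stop on. A user-space sketch of building and using such a table follows. struct region, make_offset_map(), cpu_to_dma() and XLATE_ERROR are hypothetical names for this illustration, and calloc() stands in for kcalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;	/* 0 marks the end of the table */
};

#define XLATE_ERROR UINT64_MAX	/* hypothetical "no translation" marker */

/* Builds a one-entry table; the second, zeroed slot is the terminator. */
static struct region *make_offset_map(uint64_t cpu_start, uint64_t dma_start,
				      uint64_t size)
{
	struct region *map = calloc(2, sizeof(*map));

	if (!map)
		return NULL;
	map[0].cpu_start = cpu_start;
	map[0].dma_start = dma_start;
	map[0].size = size;
	return map;
}

/* Translates a CPU address to a device address, or XLATE_ERROR. */
static uint64_t cpu_to_dma(const struct region *map, uint64_t cpu_addr)
{
	for (; map->size; map++)
		if (cpu_addr >= map->cpu_start &&
		    cpu_addr - map->cpu_start < map->size)
			return cpu_addr - map->cpu_start + map->dma_start;
	return XLATE_ERROR;
}
```

With a 1 GiB region at CPU address 0x80000000 mapped to device address 0, CPU address 0x80001000 translates to 0x1000, while addresses just below the region or at its exclusive end have no translation.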
|