Merge branches 'arm/mediatek', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next

Joerg Roedel 2024-03-08 09:05:59 +01:00
49 changed files with 2309 additions and 1853 deletions

View File

@ -0,0 +1,276 @@
What: /sys/kernel/debug/iommu/intel/iommu_regset
Date: December 2023
Contact: Jingqi Liu <Jingqi.liu@intel.com>
Description:
This file dumps all the register contents for each IOMMU device.
Example in Kabylake:
::
$ sudo cat /sys/kernel/debug/iommu/intel/iommu_regset
IOMMU: dmar0 Register Base Address: 26be37000
Name Offset Contents
VER 0x00 0x0000000000000010
GCMD 0x18 0x0000000000000000
GSTS 0x1c 0x00000000c7000000
FSTS 0x34 0x0000000000000000
FECTL 0x38 0x0000000000000000
[...]
IOMMU: dmar1 Register Base Address: fed90000
Name Offset Contents
VER 0x00 0x0000000000000010
GCMD 0x18 0x0000000000000000
GSTS 0x1c 0x00000000c7000000
FSTS 0x34 0x0000000000000000
FECTL 0x38 0x0000000000000000
[...]
IOMMU: dmar2 Register Base Address: fed91000
Name Offset Contents
VER 0x00 0x0000000000000010
GCMD 0x18 0x0000000000000000
GSTS 0x1c 0x00000000c7000000
FSTS 0x34 0x0000000000000000
FECTL 0x38 0x0000000000000000
[...]
What: /sys/kernel/debug/iommu/intel/ir_translation_struct
Date: December 2023
Contact: Jingqi Liu <Jingqi.liu@intel.com>
Description:
This file dumps the table entries for Interrupt
remapping and Interrupt posting.
Example in Kabylake:
::
$ sudo cat /sys/kernel/debug/iommu/intel/ir_translation_struct
Remapped Interrupt supported on IOMMU: dmar0
IR table address:100900000
Entry SrcID DstID Vct IRTE_high IRTE_low
0 00:0a.0 00000080 24 0000000000040050 000000800024000d
1 00:0a.0 00000001 ef 0000000000040050 0000000100ef000d
Remapped Interrupt supported on IOMMU: dmar1
IR table address:100300000
Entry SrcID DstID Vct IRTE_high IRTE_low
0 00:02.0 00000002 26 0000000000040010 000000020026000d
[...]
****
Posted Interrupt supported on IOMMU: dmar0
IR table address:100900000
Entry SrcID PDA_high PDA_low Vct IRTE_high IRTE_low
What: /sys/kernel/debug/iommu/intel/dmar_translation_struct
Date: December 2023
Contact: Jingqi Liu <Jingqi.liu@intel.com>
Description:
This file dumps the Intel IOMMU DMA remapping tables, such
as the root table, context table, PASID directory and PASID
table entries, in debugfs. Legacy mode does not support
PASID, so the PASID field defaults to '-1' and the other
PASID-related fields are invalid.
Example in Kabylake:
::
$ sudo cat /sys/kernel/debug/iommu/intel/dmar_translation_struct
IOMMU dmar1: Root Table Address: 0x103027000
B.D.F Root_entry
00:02.0 0x0000000000000000:0x000000010303e001
Context_entry
0x0000000000000102:0x000000010303f005
PASID PASID_table_entry
-1 0x0000000000000000:0x0000000000000000:0x0000000000000000
IOMMU dmar0: Root Table Address: 0x103028000
B.D.F Root_entry
00:0a.0 0x0000000000000000:0x00000001038a7001
Context_entry
0x0000000000000000:0x0000000103220e7d
PASID PASID_table_entry
0 0x0000000000000000:0x0000000000800002:0x00000001038a5089
[...]
What: /sys/kernel/debug/iommu/intel/invalidation_queue
Date: December 2023
Contact: Jingqi Liu <Jingqi.liu@intel.com>
Description:
This file exports invalidation queue internals of each
IOMMU device.
Example in Kabylake:
::
$ sudo cat /sys/kernel/debug/iommu/intel/invalidation_queue
Invalidation queue on IOMMU: dmar0
Base: 0x10022e000 Head: 20 Tail: 20
Index qw0 qw1 qw2
0 0000000000000014 0000000000000000 0000000000000000
1 0000000200000025 0000000100059c04 0000000000000000
2 0000000000000014 0000000000000000 0000000000000000
qw3 status
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
[...]
Invalidation queue on IOMMU: dmar1
Base: 0x10026e000 Head: 32 Tail: 32
Index qw0 qw1 status
0 0000000000000004 0000000000000000 0000000000000000
1 0000000200000025 0000000100059804 0000000000000000
2 0000000000000011 0000000000000000 0000000000000000
[...]
What: /sys/kernel/debug/iommu/intel/dmar_perf_latency
Date: December 2023
Contact: Jingqi Liu <Jingqi.liu@intel.com>
Description:
This file is used to control latency sampling and to show
counts of execution-time ranges for various invalidation
types, per DMAR unit.
Firstly, write a value to
/sys/kernel/debug/iommu/intel/dmar_perf_latency
to enable sampling.
The possible values are as follows:
* 0 - disable sampling all latency data
* 1 - enable sampling IOTLB invalidation latency data
* 2 - enable sampling devTLB invalidation latency data
* 3 - enable sampling intr entry cache invalidation latency data
Next, reading /sys/kernel/debug/iommu/intel/dmar_perf_latency gives
a snapshot of the sampling results of all enabled monitors.
Examples in Kabylake:
::
1) Disable sampling all latency data:
$ sudo echo 0 > /sys/kernel/debug/iommu/intel/dmar_perf_latency
2) Enable sampling IOTLB invalidation latency data
$ sudo echo 1 > /sys/kernel/debug/iommu/intel/dmar_perf_latency
$ sudo cat /sys/kernel/debug/iommu/intel/dmar_perf_latency
IOMMU: dmar0 Register Base Address: 26be37000
<0.1us 0.1us-1us 1us-10us 10us-100us 100us-1ms
inv_iotlb 0 0 0 0 0
1ms-10ms >=10ms min(us) max(us) average(us)
inv_iotlb 0 0 0 0 0
[...]
IOMMU: dmar2 Register Base Address: fed91000
<0.1us 0.1us-1us 1us-10us 10us-100us 100us-1ms
inv_iotlb 0 0 18 0 0
1ms-10ms >=10ms min(us) max(us) average(us)
inv_iotlb 0 0 2 2 2
3) Enable sampling devTLB invalidation latency data
$ sudo echo 2 > /sys/kernel/debug/iommu/intel/dmar_perf_latency
$ sudo cat /sys/kernel/debug/iommu/intel/dmar_perf_latency
IOMMU: dmar0 Register Base Address: 26be37000
<0.1us 0.1us-1us 1us-10us 10us-100us 100us-1ms
inv_devtlb 0 0 0 0 0
>=10ms min(us) max(us) average(us)
inv_devtlb 0 0 0 0
[...]
What: /sys/kernel/debug/iommu/intel/<bdf>/domain_translation_struct
Date: December 2023
Contact: Jingqi Liu <Jingqi.liu@intel.com>
Description:
This file dumps a specified page table of the Intel IOMMU
in legacy mode or scalable mode.
For a device that only supports legacy mode, dump its
page table through the debugfs file in the debugfs device
directory, e.g.
/sys/kernel/debug/iommu/intel/0000:00:02.0/domain_translation_struct.
For a device that supports scalable mode, dump the
page table of a specified PASID through the debugfs file in
the debugfs PASID directory, e.g.
/sys/kernel/debug/iommu/intel/0000:00:02.0/1/domain_translation_struct.
Examples in Kabylake:
::
1) Dump the page table of device "0000:00:02.0" that only supports legacy mode.
$ sudo cat /sys/kernel/debug/iommu/intel/0000:00:02.0/domain_translation_struct
Device 0000:00:02.0 @0x1017f8000
IOVA_PFN PML5E PML4E
0x000000008d800 | 0x0000000000000000 0x00000001017f9003
0x000000008d801 | 0x0000000000000000 0x00000001017f9003
0x000000008d802 | 0x0000000000000000 0x00000001017f9003
PDPE PDE PTE
0x00000001017fa003 0x00000001017fb003 0x000000008d800003
0x00000001017fa003 0x00000001017fb003 0x000000008d801003
0x00000001017fa003 0x00000001017fb003 0x000000008d802003
[...]
2) Dump the page table of device "0000:00:0a.0" with PASID "1" that
supports scalable mode.
$ sudo cat /sys/kernel/debug/iommu/intel/0000:00:0a.0/1/domain_translation_struct
Device 0000:00:0a.0 with pasid 1 @0x10c112000
IOVA_PFN PML5E PML4E
0x0000000000000 | 0x0000000000000000 0x000000010df93003
0x0000000000001 | 0x0000000000000000 0x000000010df93003
0x0000000000002 | 0x0000000000000000 0x000000010df93003
PDPE PDE PTE
0x0000000106ae6003 0x0000000104b38003 0x0000000147c00803
0x0000000106ae6003 0x0000000104b38003 0x0000000147c01803
0x0000000106ae6003 0x0000000104b38003 0x0000000147c02803
[...]
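
The nodes above are ordinary debugfs files, so they can be driven from a small
user-space program as well as from the shell. The following is a minimal C
sketch, not part of this patch, assuming debugfs is mounted at /sys/kernel/debug
and the program runs with root privileges; it enables IOTLB invalidation latency
sampling (value 1) and prints the snapshot, mirroring the dmar_perf_latency
example above.
::
#include <stdio.h>
#include <stdlib.h>

#define LATENCY_FILE "/sys/kernel/debug/iommu/intel/dmar_perf_latency"

int main(void)
{
	FILE *f;
	char line[256];

	/* Enable sampling of IOTLB invalidation latency data (value 1). */
	f = fopen(LATENCY_FILE, "w");
	if (!f) {
		perror(LATENCY_FILE);
		return EXIT_FAILURE;
	}
	fputs("1\n", f);
	fclose(f);

	/* Read back a snapshot of all enabled monitors. */
	f = fopen(LATENCY_FILE, "r");
	if (!f) {
		perror(LATENCY_FILE);
		return EXIT_FAILURE;
	}
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);

	return EXIT_SUCCESS;
}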

View File

@ -83,6 +83,7 @@ properties:
- description: Qcom Adreno GPUs implementing "qcom,smmu-500" and "arm,mmu-500"
items:
- enum:
- qcom,qcm2290-smmu-500
- qcom,sa8775p-smmu-500
- qcom,sc7280-smmu-500
- qcom,sc8280xp-smmu-500
@ -93,6 +94,7 @@ properties:
- qcom,sm8350-smmu-500
- qcom,sm8450-smmu-500
- qcom,sm8550-smmu-500
- qcom,sm8650-smmu-500
- const: qcom,adreno-smmu
- const: qcom,smmu-500
- const: arm,mmu-500
@ -462,6 +464,7 @@ allOf:
compatible:
items:
- enum:
- qcom,qcm2290-smmu-500
- qcom,sm6115-smmu-500
- qcom,sm6125-smmu-500
- const: qcom,adreno-smmu
@ -484,7 +487,12 @@ allOf:
- if:
properties:
compatible:
const: qcom,sm8450-smmu-500
items:
- const: qcom,sm8450-smmu-500
- const: qcom,adreno-smmu
- const: qcom,smmu-500
- const: arm,mmu-500
then:
properties:
clock-names:
@ -508,7 +516,13 @@ allOf:
- if:
properties:
compatible:
const: qcom,sm8550-smmu-500
items:
- enum:
- qcom,sm8550-smmu-500
- qcom,sm8650-smmu-500
- const: qcom,adreno-smmu
- const: qcom,smmu-500
- const: arm,mmu-500
then:
properties:
clock-names:
@ -534,7 +548,6 @@ allOf:
- cavium,smmu-v2
- marvell,ap806-smmu-500
- nvidia,smmu-500
- qcom,qcm2290-smmu-500
- qcom,qdu1000-smmu-500
- qcom,sc7180-smmu-500
- qcom,sc8180x-smmu-500
@ -544,7 +557,6 @@ allOf:
- qcom,sdx65-smmu-500
- qcom,sm6350-smmu-500
- qcom,sm6375-smmu-500
- qcom,sm8650-smmu-500
- qcom,x1e80100-smmu-500
then:
properties:

View File

@ -11245,7 +11245,6 @@ F: drivers/iommu/
F: include/linux/iommu.h
F: include/linux/iova.h
F: include/linux/of_iommu.h
F: include/uapi/linux/iommu.h
IOMMUFD
M: Jason Gunthorpe <jgg@nvidia.com>

View File

@ -163,6 +163,9 @@ config IOMMU_SVA
select IOMMU_MM_DATA
bool
config IOMMU_IOPF
bool
config FSL_PAMU
bool "Freescale IOMMU support"
depends on PCI
@ -196,7 +199,7 @@ source "drivers/iommu/iommufd/Kconfig"
config IRQ_REMAP
bool "Support for Interrupt Remapping"
depends on X86_64 && X86_IO_APIC && PCI_MSI && ACPI
select DMAR_TABLE
select DMAR_TABLE if INTEL_IOMMU
help
Supports Interrupt remapping for IO-APIC and MSI devices.
To use x2apic mode in the CPU's which support x2APIC enhancements or
@ -398,6 +401,7 @@ config ARM_SMMU_V3_SVA
bool "Shared Virtual Addressing support for the ARM SMMUv3"
depends on ARM_SMMU_V3
select IOMMU_SVA
select IOMMU_IOPF
select MMU_NOTIFIER
help
Support for sharing process address spaces with devices using the

View File

@ -26,6 +26,7 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o io-pgfault.o
obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o
obj-$(CONFIG_IOMMU_IOPF) += io-pgfault.o
obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
obj-$(CONFIG_APPLE_DART) += apple-dart.o

View File

@ -39,20 +39,16 @@ extern enum io_pgtable_fmt amd_iommu_pgtable;
extern int amd_iommu_gpt_level;
bool amd_iommu_v2_supported(void);
struct amd_iommu *get_amd_iommu(unsigned int idx);
u8 amd_iommu_pc_get_max_banks(unsigned int idx);
bool amd_iommu_pc_supported(void);
u8 amd_iommu_pc_get_max_counters(unsigned int idx);
int amd_iommu_pc_get_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
u8 fxn, u64 *value);
int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
u8 fxn, u64 *value);
/* Device capabilities */
int amd_iommu_pdev_enable_cap_pri(struct pci_dev *pdev);
void amd_iommu_pdev_disable_cap_pri(struct pci_dev *pdev);
int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid, u64 address);
/* GCR3 setup */
int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data,
ioasid_t pasid, unsigned long gcr3);
int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, ioasid_t pasid);
/*
* This function flushes all internal caches of
* the IOMMU used by this driver.
@ -63,10 +59,10 @@ void amd_iommu_domain_update(struct protection_domain *domain);
void amd_iommu_domain_flush_complete(struct protection_domain *domain);
void amd_iommu_domain_flush_pages(struct protection_domain *domain,
u64 address, size_t size);
int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
unsigned long cr3);
int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, u32 pasid);
void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
ioasid_t pasid, u64 address, size_t size);
void amd_iommu_dev_flush_pasid_all(struct iommu_dev_data *dev_data,
ioasid_t pasid);
#ifdef CONFIG_IRQ_REMAP
int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
@ -77,10 +73,6 @@ static inline int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
}
#endif
#define PPR_SUCCESS 0x0
#define PPR_INVALID 0x1
#define PPR_FAILURE 0xf
int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
int status, int tag);
@ -150,6 +142,21 @@ static inline void *alloc_pgtable_page(int nid, gfp_t gfp)
return page ? page_address(page) : NULL;
}
/*
* This must be called after device probe completes. During probe
* use rlookup_amd_iommu() to get the iommu.
*/
static inline struct amd_iommu *get_amd_iommu_from_dev(struct device *dev)
{
return iommu_get_iommu_dev(dev, struct amd_iommu, iommu);
}
/* This must be called after device probe completes. */
static inline struct amd_iommu *get_amd_iommu_from_dev_data(struct iommu_dev_data *dev_data)
{
return iommu_get_iommu_dev(dev_data->dev, struct amd_iommu, iommu);
}
bool translation_pre_enabled(struct amd_iommu *iommu);
bool amd_iommu_is_attach_deferred(struct device *dev);
int __init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line);
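
The comment above get_amd_iommu_from_dev() is the key constraint for callers:
during probe, before the per-device IOMMU data is attached, rlookup_amd_iommu()
is still the way to find the IOMMU; after probe, the new helpers are a direct
dereference. A minimal sketch of that split follows; example_get_iommu() is a
hypothetical illustration, not part of this change, while the two helpers it
calls are the real ones declared in this header.

/* Illustration only: picking the lookup helper by call-site phase. */
static struct amd_iommu *example_get_iommu(struct device *dev, bool probing)
{
	if (probing)
		/* Probe path: the device's IOMMU data is not set up yet. */
		return rlookup_amd_iommu(dev);

	/* Post-probe path: direct lookup via the attached IOMMU data. */
	return get_amd_iommu_from_dev(dev);
}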

View File

@ -453,15 +453,6 @@
#define MAX_DOMAIN_ID 65536
/* Protection domain flags */
#define PD_DMA_OPS_MASK BIT(0) /* domain used for dma_ops */
#define PD_DEFAULT_MASK BIT(1) /* domain is a default dma_ops
domain for an IOMMU */
#define PD_PASSTHROUGH_MASK BIT(2) /* domain has no page
translation */
#define PD_IOMMUV2_MASK BIT(3) /* domain has gcr3 table */
#define PD_GIOV_MASK BIT(4) /* domain enable GIOV support */
/* Timeout stuff */
#define LOOP_TIMEOUT 100000
#define MMIO_STATUS_TIMEOUT 2000000
@ -513,14 +504,6 @@ extern struct kmem_cache *amd_iommu_irq_cache;
#define for_each_iommu_safe(iommu, next) \
list_for_each_entry_safe((iommu), (next), &amd_iommu_list, list)
#define APERTURE_RANGE_SHIFT 27 /* 128 MB */
#define APERTURE_RANGE_SIZE (1ULL << APERTURE_RANGE_SHIFT)
#define APERTURE_RANGE_PAGES (APERTURE_RANGE_SIZE >> PAGE_SHIFT)
#define APERTURE_MAX_RANGES 32 /* allows 4GB of DMA address space */
#define APERTURE_RANGE_INDEX(a) ((a) >> APERTURE_RANGE_SHIFT)
#define APERTURE_PAGE_INDEX(a) (((a) >> 21) & 0x3fULL)
struct amd_iommu;
struct iommu_domain;
struct irq_domain;
@ -541,6 +524,13 @@ struct amd_irte_ops;
#define io_pgtable_cfg_to_data(x) \
container_of((x), struct amd_io_pgtable, pgtbl_cfg)
struct gcr3_tbl_info {
u64 *gcr3_tbl; /* Guest CR3 table */
int glx; /* Number of levels for GCR3 table */
u32 pasid_cnt; /* Track attached PASIDs */
u16 domid; /* Per device domain ID */
};
struct amd_io_pgtable {
struct io_pgtable_cfg pgtbl_cfg;
struct io_pgtable iop;
@ -549,6 +539,11 @@ struct amd_io_pgtable {
u64 *pgd; /* v2 pgtable pgd pointer */
};
enum protection_domain_mode {
PD_MODE_V1 = 1,
PD_MODE_V2,
};
/*
* This structure contains generic data for IOMMU protection domains
* independent of their use.
@ -560,10 +555,8 @@ struct protection_domain {
struct amd_io_pgtable iop;
spinlock_t lock; /* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
int glx; /* Number of levels for GCR3 table */
int nid; /* Node ID */
u64 *gcr3_tbl; /* Guest CR3 table */
unsigned long flags; /* flags to find out type of domain */
enum protection_domain_mode pd_mode; /* Track page table type */
bool dirty_tracking; /* dirty tracking is enabled in the domain */
unsigned dev_cnt; /* devices assigned to this domain */
unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
@ -816,6 +809,7 @@ struct iommu_dev_data {
struct list_head list; /* For domain->dev_list */
struct llist_node dev_data_list; /* For global dev_data_list */
struct protection_domain *domain; /* Domain the device is bound to */
struct gcr3_tbl_info gcr3_info; /* Per-device GCR3 table */
struct device *dev;
u16 devid; /* PCI Device ID */

View File

@ -2068,6 +2068,9 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
/* Prevent binding other PCI device drivers to IOMMU devices */
iommu->dev->match_driver = false;
/* ACPI _PRT won't have an IRQ for IOMMU */
iommu->dev->irq_managed = 1;
pci_read_config_dword(iommu->dev, cap_ptr + MMIO_CAP_HDR_OFFSET,
&iommu->cap);
@ -2769,6 +2772,7 @@ static void early_enable_iommu(struct amd_iommu *iommu)
iommu_enable_command_buffer(iommu);
iommu_enable_event_buffer(iommu);
iommu_set_exclusion_range(iommu);
iommu_enable_gt(iommu);
iommu_enable_ga(iommu);
iommu_enable_xt(iommu);
iommu_enable_irtcachedis(iommu);
@ -2825,6 +2829,7 @@ static void early_enable_iommus(void)
iommu_disable_irtcachedis(iommu);
iommu_enable_command_buffer(iommu);
iommu_enable_event_buffer(iommu);
iommu_enable_gt(iommu);
iommu_enable_ga(iommu);
iommu_enable_xt(iommu);
iommu_enable_irtcachedis(iommu);
@ -2838,10 +2843,8 @@ static void enable_iommus_v2(void)
{
struct amd_iommu *iommu;
for_each_iommu(iommu) {
for_each_iommu(iommu)
iommu_enable_ppr_log(iommu);
iommu_enable_gt(iommu);
}
}
static void enable_iommus_vapic(void)
@ -3694,13 +3697,11 @@ u8 amd_iommu_pc_get_max_banks(unsigned int idx)
return 0;
}
EXPORT_SYMBOL(amd_iommu_pc_get_max_banks);
bool amd_iommu_pc_supported(void)
{
return amd_iommu_pc_present;
}
EXPORT_SYMBOL(amd_iommu_pc_supported);
u8 amd_iommu_pc_get_max_counters(unsigned int idx)
{
@ -3711,7 +3712,6 @@ u8 amd_iommu_pc_get_max_counters(unsigned int idx)
return 0;
}
EXPORT_SYMBOL(amd_iommu_pc_get_max_counters);
static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
u8 fxn, u64 *value, bool is_write)

View File

@ -350,38 +350,26 @@ static const struct iommu_flush_ops v2_flush_ops = {
static void v2_free_pgtable(struct io_pgtable *iop)
{
struct protection_domain *pdom;
struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
pdom = container_of(pgtable, struct protection_domain, iop);
if (!(pdom->flags & PD_IOMMUV2_MASK))
if (!pgtable || !pgtable->pgd)
return;
/* Clear gcr3 entry */
amd_iommu_domain_clear_gcr3(&pdom->domain, 0);
/* Make changes visible to IOMMUs */
amd_iommu_domain_update(pdom);
/* Free page table */
free_pgtable(pgtable->pgd, get_pgtable_level());
pgtable->pgd = NULL;
}
static struct io_pgtable *v2_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
{
struct amd_io_pgtable *pgtable = io_pgtable_cfg_to_data(cfg);
struct protection_domain *pdom = (struct protection_domain *)cookie;
int ret;
int ias = IOMMU_IN_ADDR_BIT_SIZE;
pgtable->pgd = alloc_pgtable_page(pdom->nid, GFP_ATOMIC);
if (!pgtable->pgd)
return NULL;
ret = amd_iommu_domain_set_gcr3(&pdom->domain, 0, iommu_virt_to_phys(pgtable->pgd));
if (ret)
goto err_free_pgd;
if (get_pgtable_level() == PAGE_MODE_5_LEVEL)
ias = 57;
@ -395,11 +383,6 @@ static struct io_pgtable *v2_alloc_pgtable(struct io_pgtable_cfg *cfg, void *coo
cfg->tlb = &v2_flush_ops;
return &pgtable->iop;
err_free_pgd:
free_pgtable_page(pgtable->pgd);
return NULL;
}
struct io_pgtable_init_fns io_pgtable_amd_iommu_v2_init_fns = {

View File

@ -45,10 +45,6 @@
#define CMD_SET_TYPE(cmd, t) ((cmd)->data[1] |= ((t) << 28))
/* IO virtual address start page frame number */
#define IOVA_START_PFN (1)
#define IOVA_PFN(addr) ((addr) >> PAGE_SHIFT)
/* Reserved IOVA ranges */
#define MSI_RANGE_START (0xfee00000)
#define MSI_RANGE_END (0xfeefffff)
@ -79,6 +75,9 @@ struct kmem_cache *amd_iommu_irq_cache;
static void detach_device(struct device *dev);
static void set_dte_entry(struct amd_iommu *iommu,
struct iommu_dev_data *dev_data);
/****************************************************************************
*
* Helper functions
@ -87,7 +86,7 @@ static void detach_device(struct device *dev);
static inline bool pdom_is_v2_pgtbl_mode(struct protection_domain *pdom)
{
return (pdom && (pdom->flags & PD_IOMMUV2_MASK));
return (pdom && (pdom->pd_mode == PD_MODE_V2));
}
static inline int get_acpihid_device_id(struct device *dev,
@ -1388,14 +1387,9 @@ void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
static int device_flush_iotlb(struct iommu_dev_data *dev_data, u64 address,
size_t size, ioasid_t pasid, bool gn)
{
struct amd_iommu *iommu;
struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
struct iommu_cmd cmd;
int qdep;
qdep = dev_data->ats_qdep;
iommu = rlookup_amd_iommu(dev_data->dev);
if (!iommu)
return -EINVAL;
int qdep = dev_data->ats_qdep;
build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, address,
size, pasid, gn);
@ -1415,16 +1409,12 @@ static int device_flush_dte_alias(struct pci_dev *pdev, u16 alias, void *data)
*/
static int device_flush_dte(struct iommu_dev_data *dev_data)
{
struct amd_iommu *iommu;
struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
struct pci_dev *pdev = NULL;
struct amd_iommu_pci_seg *pci_seg;
u16 alias;
int ret;
iommu = rlookup_amd_iommu(dev_data->dev);
if (!iommu)
return -EINVAL;
if (dev_is_pci(dev_data->dev))
pdev = to_pci_dev(dev_data->dev);
@ -1453,27 +1443,37 @@ static int device_flush_dte(struct iommu_dev_data *dev_data)
return ret;
}
/*
* TLB invalidation function which is called from the mapping functions.
* It invalidates a single PTE if the range to flush is within a single
* page. Otherwise it flushes the whole TLB of the IOMMU.
*/
static void __domain_flush_pages(struct protection_domain *domain,
static int domain_flush_pages_v2(struct protection_domain *pdom,
u64 address, size_t size)
{
struct iommu_dev_data *dev_data;
struct iommu_cmd cmd;
int ret = 0;
list_for_each_entry(dev_data, &pdom->dev_list, list) {
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
u16 domid = dev_data->gcr3_info.domid;
build_inv_iommu_pages(&cmd, address, size,
domid, IOMMU_NO_PASID, true);
ret |= iommu_queue_command(iommu, &cmd);
}
return ret;
}
static int domain_flush_pages_v1(struct protection_domain *pdom,
u64 address, size_t size)
{
struct iommu_cmd cmd;
int ret = 0, i;
ioasid_t pasid = IOMMU_NO_PASID;
bool gn = false;
if (pdom_is_v2_pgtbl_mode(domain))
gn = true;
build_inv_iommu_pages(&cmd, address, size, domain->id, pasid, gn);
build_inv_iommu_pages(&cmd, address, size,
pdom->id, IOMMU_NO_PASID, false);
for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
if (!domain->dev_iommu[i])
if (!pdom->dev_iommu[i])
continue;
/*
@ -1483,6 +1483,28 @@ static void __domain_flush_pages(struct protection_domain *domain,
ret |= iommu_queue_command(amd_iommus[i], &cmd);
}
return ret;
}
/*
* TLB invalidation function which is called from the mapping functions.
* It flushes range of PTEs of the domain.
*/
static void __domain_flush_pages(struct protection_domain *domain,
u64 address, size_t size)
{
struct iommu_dev_data *dev_data;
int ret = 0;
ioasid_t pasid = IOMMU_NO_PASID;
bool gn = false;
if (pdom_is_v2_pgtbl_mode(domain)) {
gn = true;
ret = domain_flush_pages_v2(domain, address, size);
} else {
ret = domain_flush_pages_v1(domain, address, size);
}
list_for_each_entry(dev_data, &domain->dev_list, list) {
if (!dev_data->ats_enabled)
@ -1551,6 +1573,29 @@ static void amd_iommu_domain_flush_all(struct protection_domain *domain)
CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
}
void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
ioasid_t pasid, u64 address, size_t size)
{
struct iommu_cmd cmd;
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
build_inv_iommu_pages(&cmd, address, size,
dev_data->gcr3_info.domid, pasid, true);
iommu_queue_command(iommu, &cmd);
if (dev_data->ats_enabled)
device_flush_iotlb(dev_data, address, size, pasid, true);
iommu_completion_wait(iommu);
}
void amd_iommu_dev_flush_pasid_all(struct iommu_dev_data *dev_data,
ioasid_t pasid)
{
amd_iommu_dev_flush_pasid_pages(dev_data, 0,
CMD_INV_IOMMU_ALL_PAGES_ADDRESS, pasid);
}
void amd_iommu_domain_flush_complete(struct protection_domain *domain)
{
int i;
@ -1592,6 +1637,49 @@ static void domain_flush_devices(struct protection_domain *domain)
device_flush_dte(dev_data);
}
static void update_device_table(struct protection_domain *domain)
{
struct iommu_dev_data *dev_data;
list_for_each_entry(dev_data, &domain->dev_list, list) {
struct amd_iommu *iommu = rlookup_amd_iommu(dev_data->dev);
set_dte_entry(iommu, dev_data);
clone_aliases(iommu, dev_data->dev);
}
}
void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
{
update_device_table(domain);
domain_flush_devices(domain);
}
void amd_iommu_domain_update(struct protection_domain *domain)
{
/* Update device table */
amd_iommu_update_and_flush_device_table(domain);
/* Flush domain TLB(s) and wait for completion */
amd_iommu_domain_flush_all(domain);
}
int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
int status, int tag)
{
struct iommu_dev_data *dev_data;
struct amd_iommu *iommu;
struct iommu_cmd cmd;
dev_data = dev_iommu_priv_get(&pdev->dev);
iommu = get_amd_iommu_from_dev(&pdev->dev);
build_complete_ppr(&cmd, dev_data->devid, pasid, status,
tag, dev_data->pri_tlp);
return iommu_queue_command(iommu, &cmd);
}
/****************************************************************************
*
* The next functions belong to the domain allocation. A domain is
@ -1656,16 +1744,22 @@ static void free_gcr3_tbl_level2(u64 *tbl)
}
}
static void free_gcr3_table(struct protection_domain *domain)
static void free_gcr3_table(struct gcr3_tbl_info *gcr3_info)
{
if (domain->glx == 2)
free_gcr3_tbl_level2(domain->gcr3_tbl);
else if (domain->glx == 1)
free_gcr3_tbl_level1(domain->gcr3_tbl);
if (gcr3_info->glx == 2)
free_gcr3_tbl_level2(gcr3_info->gcr3_tbl);
else if (gcr3_info->glx == 1)
free_gcr3_tbl_level1(gcr3_info->gcr3_tbl);
else
BUG_ON(domain->glx != 0);
WARN_ON_ONCE(gcr3_info->glx != 0);
free_page((unsigned long)domain->gcr3_tbl);
gcr3_info->glx = 0;
/* Free per device domain ID */
domain_id_free(gcr3_info->domid);
free_page((unsigned long)gcr3_info->gcr3_tbl);
gcr3_info->gcr3_tbl = NULL;
}
/*
@ -1684,33 +1778,133 @@ static int get_gcr3_levels(int pasids)
return levels ? (DIV_ROUND_UP(levels, 9) - 1) : levels;
}
/* Note: This function expects iommu_domain->lock to be held prior calling the function. */
static int setup_gcr3_table(struct protection_domain *domain, int pasids)
static int setup_gcr3_table(struct gcr3_tbl_info *gcr3_info,
struct amd_iommu *iommu, int pasids)
{
int levels = get_gcr3_levels(pasids);
int nid = iommu ? dev_to_node(&iommu->dev->dev) : NUMA_NO_NODE;
if (levels > amd_iommu_max_glx_val)
return -EINVAL;
domain->gcr3_tbl = alloc_pgtable_page(domain->nid, GFP_ATOMIC);
if (domain->gcr3_tbl == NULL)
if (gcr3_info->gcr3_tbl)
return -EBUSY;
/* Allocate per device domain ID */
gcr3_info->domid = domain_id_alloc();
gcr3_info->gcr3_tbl = alloc_pgtable_page(nid, GFP_ATOMIC);
if (gcr3_info->gcr3_tbl == NULL) {
domain_id_free(gcr3_info->domid);
return -ENOMEM;
}
domain->glx = levels;
domain->flags |= PD_IOMMUV2_MASK;
amd_iommu_domain_update(domain);
gcr3_info->glx = levels;
return 0;
}
static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
struct protection_domain *domain, bool ats, bool ppr)
static u64 *__get_gcr3_pte(struct gcr3_tbl_info *gcr3_info,
ioasid_t pasid, bool alloc)
{
int index;
u64 *pte;
u64 *root = gcr3_info->gcr3_tbl;
int level = gcr3_info->glx;
while (true) {
index = (pasid >> (9 * level)) & 0x1ff;
pte = &root[index];
if (level == 0)
break;
if (!(*pte & GCR3_VALID)) {
if (!alloc)
return NULL;
root = (void *)get_zeroed_page(GFP_ATOMIC);
if (root == NULL)
return NULL;
*pte = iommu_virt_to_phys(root) | GCR3_VALID;
}
root = iommu_phys_to_virt(*pte & PAGE_MASK);
level -= 1;
}
return pte;
}
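
The walk above consumes the PASID nine bits per level, highest level first
(gcr3_info->glx is the number of levels minus one). The small stand-alone
program below, shown only to make the index arithmetic concrete and using an
arbitrary example PASID, reproduces the same computation.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t pasid = 0x12345;	/* arbitrary example PASID */
	int glx = 1;			/* i.e. a two-level GCR3 table */
	int level;

	/* Same index selection as __get_gcr3_pte(): 9 bits per level. */
	for (level = glx; level >= 0; level--)
		printf("level %d index = 0x%03x\n",
		       level, (pasid >> (9 * level)) & 0x1ff);

	return 0;
}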
static int update_gcr3(struct iommu_dev_data *dev_data,
ioasid_t pasid, unsigned long gcr3, bool set)
{
struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
u64 *pte;
pte = __get_gcr3_pte(gcr3_info, pasid, true);
if (pte == NULL)
return -ENOMEM;
if (set)
*pte = (gcr3 & PAGE_MASK) | GCR3_VALID;
else
*pte = 0;
amd_iommu_dev_flush_pasid_all(dev_data, pasid);
return 0;
}
int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data, ioasid_t pasid,
unsigned long gcr3)
{
struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
int ret;
iommu_group_mutex_assert(dev_data->dev);
ret = update_gcr3(dev_data, pasid, gcr3, true);
if (ret)
return ret;
gcr3_info->pasid_cnt++;
return ret;
}
int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, ioasid_t pasid)
{
struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
int ret;
iommu_group_mutex_assert(dev_data->dev);
ret = update_gcr3(dev_data, pasid, 0, false);
if (ret)
return ret;
gcr3_info->pasid_cnt--;
return ret;
}
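
amd_iommu_set_gcr3()/amd_iommu_clear_gcr3() are now per-device operations and,
as the iommu_group_mutex_assert() calls show, expect the caller to hold the
group mutex. The sketch below only illustrates how a caller might pair them;
the function names and the origin of the GCR3 value are assumptions for
illustration, not something this patch adds.

/* Illustration only: a hypothetical pairing of the new per-device GCR3
 * helpers. Callers must hold the iommu group mutex (see the asserts above).
 */
static int example_bind_pasid(struct iommu_dev_data *dev_data,
			      ioasid_t pasid, unsigned long gcr3_pa)
{
	/* Installs the GCR3 entry for @pasid, allocating levels on demand. */
	return amd_iommu_set_gcr3(dev_data, pasid, gcr3_pa);
}

static void example_unbind_pasid(struct iommu_dev_data *dev_data,
				 ioasid_t pasid)
{
	/* Clears the entry and flushes the device's mappings for @pasid. */
	amd_iommu_clear_gcr3(dev_data, pasid);
}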
static void set_dte_entry(struct amd_iommu *iommu,
struct iommu_dev_data *dev_data)
{
u64 pte_root = 0;
u64 flags = 0;
u32 old_domid;
u16 devid = dev_data->devid;
u16 domid;
struct protection_domain *domain = dev_data->domain;
struct dev_table_entry *dev_table = get_dev_table(iommu);
struct gcr3_tbl_info *gcr3_info = &dev_data->gcr3_info;
if (gcr3_info && gcr3_info->gcr3_tbl)
domid = dev_data->gcr3_info.domid;
else
domid = domain->id;
if (domain->iop.mode != PAGE_MODE_NONE)
pte_root = iommu_virt_to_phys(domain->iop.root);
@ -1724,23 +1918,23 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
* When SNP is enabled, Only set TV bit when IOMMU
* page translation is in use.
*/
if (!amd_iommu_snp_en || (domain->id != 0))
if (!amd_iommu_snp_en || (domid != 0))
pte_root |= DTE_FLAG_TV;
flags = dev_table[devid].data[1];
if (ats)
if (dev_data->ats_enabled)
flags |= DTE_FLAG_IOTLB;
if (ppr)
if (dev_data->ppr)
pte_root |= 1ULL << DEV_ENTRY_PPR;
if (domain->dirty_tracking)
pte_root |= DTE_FLAG_HAD;
if (domain->flags & PD_IOMMUV2_MASK) {
u64 gcr3 = iommu_virt_to_phys(domain->gcr3_tbl);
u64 glx = domain->glx;
if (gcr3_info && gcr3_info->gcr3_tbl) {
u64 gcr3 = iommu_virt_to_phys(gcr3_info->gcr3_tbl);
u64 glx = gcr3_info->glx;
u64 tmp;
pte_root |= DTE_FLAG_GV;
@ -1768,12 +1962,13 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
((u64)GUEST_PGTABLE_5_LEVEL << DTE_GPT_LEVEL_SHIFT);
}
if (domain->flags & PD_GIOV_MASK)
/* GIOV is supported with V2 page table mode only */
if (pdom_is_v2_pgtbl_mode(domain))
pte_root |= DTE_FLAG_GIOV;
}
flags &= ~DEV_DOMID_MASK;
flags |= domain->id;
flags |= domid;
old_domid = dev_table[devid].data[1] & DEV_DOMID_MASK;
dev_table[devid].data[1] = flags;
@ -1804,16 +1999,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
amd_iommu_apply_erratum_63(iommu, devid);
}
static void do_attach(struct iommu_dev_data *dev_data,
struct protection_domain *domain)
static int do_attach(struct iommu_dev_data *dev_data,
struct protection_domain *domain)
{
struct amd_iommu *iommu;
bool ats;
iommu = rlookup_amd_iommu(dev_data->dev);
if (!iommu)
return;
ats = dev_data->ats_enabled;
struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
int ret = 0;
/* Update data structures */
dev_data->domain = domain;
@ -1827,22 +2017,40 @@ static void do_attach(struct iommu_dev_data *dev_data,
domain->dev_iommu[iommu->index] += 1;
domain->dev_cnt += 1;
/* Init GCR3 table and update device table */
if (domain->pd_mode == PD_MODE_V2) {
/* By default, setup GCR3 table to support single PASID */
ret = setup_gcr3_table(&dev_data->gcr3_info, iommu, 1);
if (ret)
return ret;
ret = update_gcr3(dev_data, 0,
iommu_virt_to_phys(domain->iop.pgd), true);
if (ret) {
free_gcr3_table(&dev_data->gcr3_info);
return ret;
}
}
/* Update device table */
set_dte_entry(iommu, dev_data->devid, domain,
ats, dev_data->ppr);
set_dte_entry(iommu, dev_data);
clone_aliases(iommu, dev_data->dev);
device_flush_dte(dev_data);
return ret;
}
static void do_detach(struct iommu_dev_data *dev_data)
{
struct protection_domain *domain = dev_data->domain;
struct amd_iommu *iommu;
struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
iommu = rlookup_amd_iommu(dev_data->dev);
if (!iommu)
return;
/* Clear GCR3 table */
if (domain->pd_mode == PD_MODE_V2) {
update_gcr3(dev_data, 0, 0, false);
free_gcr3_table(&dev_data->gcr3_info);
}
/* Update data structures */
dev_data->domain = NULL;
@ -1886,7 +2094,7 @@ static int attach_device(struct device *dev,
if (dev_is_pci(dev))
pdev_enable_caps(to_pci_dev(dev));
do_attach(dev_data, domain);
ret = do_attach(dev_data, domain);
out:
spin_unlock(&dev_data->lock);
@ -1954,8 +2162,7 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
ret = iommu_init_device(iommu, dev);
if (ret) {
if (ret != -ENOTSUPP)
dev_err(dev, "Failed to initialize - trying to proceed anyway\n");
dev_err(dev, "Failed to initialize - trying to proceed anyway\n");
iommu_dev = ERR_PTR(ret);
iommu_ignore_device(iommu, dev);
} else {
@ -1998,42 +2205,6 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
return acpihid_device_group(dev);
}
/*****************************************************************************
*
* The next functions belong to the dma_ops mapping/unmapping code.
*
*****************************************************************************/
static void update_device_table(struct protection_domain *domain)
{
struct iommu_dev_data *dev_data;
list_for_each_entry(dev_data, &domain->dev_list, list) {
struct amd_iommu *iommu = rlookup_amd_iommu(dev_data->dev);
if (!iommu)
continue;
set_dte_entry(iommu, dev_data->devid, domain,
dev_data->ats_enabled, dev_data->ppr);
clone_aliases(iommu, dev_data->dev);
}
}
void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
{
update_device_table(domain);
domain_flush_devices(domain);
}
void amd_iommu_domain_update(struct protection_domain *domain)
{
/* Update device table */
amd_iommu_update_and_flush_device_table(domain);
/* Flush domain TLB(s) and wait for completion */
amd_iommu_domain_flush_all(domain);
}
/*****************************************************************************
*
* The following functions belong to the exported interface of AMD IOMMU
@ -2070,9 +2241,6 @@ static void protection_domain_free(struct protection_domain *domain)
if (domain->iop.pgtbl_cfg.tlb)
free_io_pgtable_ops(&domain->iop.iop.ops);
if (domain->flags & PD_IOMMUV2_MASK)
free_gcr3_table(domain);
if (domain->iop.root)
free_page((unsigned long)domain->iop.root);
@ -2094,19 +2262,16 @@ static int protection_domain_init_v1(struct protection_domain *domain, int mode)
return -ENOMEM;
}
domain->pd_mode = PD_MODE_V1;
amd_iommu_domain_set_pgtable(domain, pt_root, mode);
return 0;
}
static int protection_domain_init_v2(struct protection_domain *domain)
static int protection_domain_init_v2(struct protection_domain *pdom)
{
domain->flags |= PD_GIOV_MASK;
domain->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
if (setup_gcr3_table(domain, 1))
return -ENOMEM;
pdom->pd_mode = PD_MODE_V2;
pdom->domain.pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
return 0;
}
@ -2194,11 +2359,8 @@ static struct iommu_domain *do_iommu_domain_alloc(unsigned int type,
struct protection_domain *domain;
struct amd_iommu *iommu = NULL;
if (dev) {
iommu = rlookup_amd_iommu(dev);
if (!iommu)
return ERR_PTR(-ENODEV);
}
if (dev)
iommu = get_amd_iommu_from_dev(dev);
/*
* Since DTE[Mode]=0 is prohibited on SNP-enabled system,
@ -2279,7 +2441,7 @@ static int amd_iommu_attach_device(struct iommu_domain *dom,
{
struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
struct protection_domain *domain = to_pdomain(dom);
struct amd_iommu *iommu = rlookup_amd_iommu(dev);
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
int ret;
/*
@ -2337,7 +2499,7 @@ static int amd_iommu_map_pages(struct iommu_domain *dom, unsigned long iova,
int prot = 0;
int ret = -EINVAL;
if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
if ((domain->pd_mode == PD_MODE_V1) &&
(domain->iop.mode == PAGE_MODE_NONE))
return -EINVAL;
@ -2383,7 +2545,7 @@ static size_t amd_iommu_unmap_pages(struct iommu_domain *dom, unsigned long iova
struct io_pgtable_ops *ops = &domain->iop.iop.ops;
size_t r;
if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
if ((domain->pd_mode == PD_MODE_V1) &&
(domain->iop.mode == PAGE_MODE_NONE))
return 0;
@ -2418,7 +2580,7 @@ static bool amd_iommu_capable(struct device *dev, enum iommu_cap cap)
case IOMMU_CAP_DEFERRED_FLUSH:
return true;
case IOMMU_CAP_DIRTY_TRACKING: {
struct amd_iommu *iommu = rlookup_amd_iommu(dev);
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
return amd_iommu_hd_support(iommu);
}
@ -2447,9 +2609,7 @@ static int amd_iommu_set_dirty_tracking(struct iommu_domain *domain,
}
list_for_each_entry(dev_data, &pdomain->dev_list, list) {
iommu = rlookup_amd_iommu(dev_data->dev);
if (!iommu)
continue;
iommu = get_amd_iommu_from_dev_data(dev_data);
dev_table = get_dev_table(iommu);
pte_root = dev_table[dev_data->devid].data[0];
@ -2509,9 +2669,7 @@ static void amd_iommu_get_resv_regions(struct device *dev,
return;
devid = PCI_SBDF_TO_DEVID(sbdf);
iommu = rlookup_amd_iommu(dev);
if (!iommu)
return;
iommu = get_amd_iommu_from_dev(dev);
pci_seg = iommu->pci_seg;
list_for_each_entry(entry, &pci_seg->unity_map, list) {
@ -2645,216 +2803,6 @@ const struct iommu_ops amd_iommu_ops = {
}
};
static int __flush_pasid(struct protection_domain *domain, u32 pasid,
u64 address, size_t size)
{
struct iommu_dev_data *dev_data;
struct iommu_cmd cmd;
int i, ret;
if (!(domain->flags & PD_IOMMUV2_MASK))
return -EINVAL;
build_inv_iommu_pages(&cmd, address, size, domain->id, pasid, true);
/*
* IOMMU TLB needs to be flushed before Device TLB to
* prevent device TLB refill from IOMMU TLB
*/
for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
if (domain->dev_iommu[i] == 0)
continue;
ret = iommu_queue_command(amd_iommus[i], &cmd);
if (ret != 0)
goto out;
}
/* Wait until IOMMU TLB flushes are complete */
amd_iommu_domain_flush_complete(domain);
/* Now flush device TLBs */
list_for_each_entry(dev_data, &domain->dev_list, list) {
struct amd_iommu *iommu;
int qdep;
/*
There might be non-IOMMUv2 capable devices in an IOMMUv2
* domain.
*/
if (!dev_data->ats_enabled)
continue;
qdep = dev_data->ats_qdep;
iommu = rlookup_amd_iommu(dev_data->dev);
if (!iommu)
continue;
build_inv_iotlb_pages(&cmd, dev_data->devid, qdep,
address, size, pasid, true);
ret = iommu_queue_command(iommu, &cmd);
if (ret != 0)
goto out;
}
/* Wait until all device TLBs are flushed */
amd_iommu_domain_flush_complete(domain);
ret = 0;
out:
return ret;
}
static int __amd_iommu_flush_page(struct protection_domain *domain, u32 pasid,
u64 address)
{
return __flush_pasid(domain, pasid, address, PAGE_SIZE);
}
int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid,
u64 address)
{
struct protection_domain *domain = to_pdomain(dom);
unsigned long flags;
int ret;
spin_lock_irqsave(&domain->lock, flags);
ret = __amd_iommu_flush_page(domain, pasid, address);
spin_unlock_irqrestore(&domain->lock, flags);
return ret;
}
static int __amd_iommu_flush_tlb(struct protection_domain *domain, u32 pasid)
{
return __flush_pasid(domain, pasid, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
}
int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid)
{
struct protection_domain *domain = to_pdomain(dom);
unsigned long flags;
int ret;
spin_lock_irqsave(&domain->lock, flags);
ret = __amd_iommu_flush_tlb(domain, pasid);
spin_unlock_irqrestore(&domain->lock, flags);
return ret;
}
static u64 *__get_gcr3_pte(u64 *root, int level, u32 pasid, bool alloc)
{
int index;
u64 *pte;
while (true) {
index = (pasid >> (9 * level)) & 0x1ff;
pte = &root[index];
if (level == 0)
break;
if (!(*pte & GCR3_VALID)) {
if (!alloc)
return NULL;
root = (void *)get_zeroed_page(GFP_ATOMIC);
if (root == NULL)
return NULL;
*pte = iommu_virt_to_phys(root) | GCR3_VALID;
}
root = iommu_phys_to_virt(*pte & PAGE_MASK);
level -= 1;
}
return pte;
}
static int __set_gcr3(struct protection_domain *domain, u32 pasid,
unsigned long cr3)
{
u64 *pte;
if (domain->iop.mode != PAGE_MODE_NONE)
return -EINVAL;
pte = __get_gcr3_pte(domain->gcr3_tbl, domain->glx, pasid, true);
if (pte == NULL)
return -ENOMEM;
*pte = (cr3 & PAGE_MASK) | GCR3_VALID;
return __amd_iommu_flush_tlb(domain, pasid);
}
static int __clear_gcr3(struct protection_domain *domain, u32 pasid)
{
u64 *pte;
if (domain->iop.mode != PAGE_MODE_NONE)
return -EINVAL;
pte = __get_gcr3_pte(domain->gcr3_tbl, domain->glx, pasid, false);
if (pte == NULL)
return 0;
*pte = 0;
return __amd_iommu_flush_tlb(domain, pasid);
}
int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
unsigned long cr3)
{
struct protection_domain *domain = to_pdomain(dom);
unsigned long flags;
int ret;
spin_lock_irqsave(&domain->lock, flags);
ret = __set_gcr3(domain, pasid, cr3);
spin_unlock_irqrestore(&domain->lock, flags);
return ret;
}
int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, u32 pasid)
{
struct protection_domain *domain = to_pdomain(dom);
unsigned long flags;
int ret;
spin_lock_irqsave(&domain->lock, flags);
ret = __clear_gcr3(domain, pasid);
spin_unlock_irqrestore(&domain->lock, flags);
return ret;
}
int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
int status, int tag)
{
struct iommu_dev_data *dev_data;
struct amd_iommu *iommu;
struct iommu_cmd cmd;
dev_data = dev_iommu_priv_get(&pdev->dev);
iommu = rlookup_amd_iommu(&pdev->dev);
if (!iommu)
return -ENODEV;
build_complete_ppr(&cmd, dev_data->devid, pasid, status,
tag, dev_data->pri_tlp);
return iommu_queue_command(iommu, &cmd);
}
#ifdef CONFIG_IRQ_REMAP
/*****************************************************************************

View File

@ -779,7 +779,8 @@ static void apple_dart_domain_free(struct iommu_domain *domain)
kfree(dart_domain);
}
static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
static int apple_dart_of_xlate(struct device *dev,
const struct of_phandle_args *args)
{
struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
struct platform_device *iommu_pdev = of_find_device_by_node(args->np);

View File

@ -10,7 +10,6 @@
#include <linux/slab.h>
#include "arm-smmu-v3.h"
#include "../../iommu-sva.h"
#include "../../io-pgtable-arm.h"
struct arm_smmu_mmu_notifier {
@ -364,7 +363,13 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid,
struct arm_smmu_bond *bond;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_domain *smmu_domain;
if (!(domain->type & __IOMMU_DOMAIN_PAGING))
return -ENODEV;
smmu_domain = to_smmu_domain(domain);
if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
return -ENODEV;
if (!master || !master->sva_enabled)
return -ENODEV;
@ -470,7 +475,6 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master)
static int arm_smmu_master_sva_enable_iopf(struct arm_smmu_master *master)
{
int ret;
struct device *dev = master->dev;
/*
@ -483,16 +487,7 @@ static int arm_smmu_master_sva_enable_iopf(struct arm_smmu_master *master)
if (!master->iopf_enabled)
return -EINVAL;
ret = iopf_queue_add_device(master->smmu->evtq.iopf, dev);
if (ret)
return ret;
ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
if (ret) {
iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
return ret;
}
return 0;
return iopf_queue_add_device(master->smmu->evtq.iopf, dev);
}
static void arm_smmu_master_sva_disable_iopf(struct arm_smmu_master *master)
@ -502,7 +497,6 @@ static void arm_smmu_master_sva_disable_iopf(struct arm_smmu_master *master)
if (!master->iopf_enabled)
return;
iommu_unregister_device_fault_handler(dev);
iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
}

File diff suppressed because it is too large.

View File

@ -609,8 +609,6 @@ struct arm_smmu_ctx_desc_cfg {
struct arm_smmu_s2_cfg {
u16 vmid;
u64 vttbr;
u64 vtcr;
};
struct arm_smmu_strtab_cfg {
@ -697,7 +695,6 @@ struct arm_smmu_stream {
struct arm_smmu_master {
struct arm_smmu_device *smmu;
struct device *dev;
struct arm_smmu_domain *domain;
struct list_head domain_head;
struct arm_smmu_stream *streams;
/* Locked by the iommu core using the group mutex */
@ -715,7 +712,6 @@ struct arm_smmu_master {
enum arm_smmu_domain_stage {
ARM_SMMU_DOMAIN_S1 = 0,
ARM_SMMU_DOMAIN_S2,
ARM_SMMU_DOMAIN_BYPASS,
};
struct arm_smmu_domain {

View File

@ -260,6 +260,7 @@ static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
{ .compatible = "qcom,sm6375-mdss" },
{ .compatible = "qcom,sm8150-mdss" },
{ .compatible = "qcom,sm8250-mdss" },
{ .compatible = "qcom,x1e80100-mdss" },
{ }
};

View File

@ -1546,7 +1546,8 @@ static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
return ret;
}
static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
static int arm_smmu_of_xlate(struct device *dev,
const struct of_phandle_args *args)
{
u32 mask, fwid = 0;

View File

@ -546,7 +546,8 @@ static struct iommu_device *qcom_iommu_probe_device(struct device *dev)
return &qcom_iommu->iommu;
}
static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
static int qcom_iommu_of_xlate(struct device *dev,
const struct of_phandle_args *args)
{
struct qcom_iommu_dev *qcom_iommu;
struct platform_device *iommu_pdev;

View File

@ -859,6 +859,11 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
iommu_deferred_attach(dev, domain))
return DMA_MAPPING_ERROR;
/* If anyone ever wants this we'd need support in the IOVA allocator */
if (dev_WARN_ONCE(dev, dma_get_min_align_mask(dev) > iova_mask(iovad),
"Unsupported alignment constraint\n"))
return DMA_MAPPING_ERROR;
size = iova_align(iovad, size + iova_off);
iova = iommu_dma_alloc_iova(domain, size, dma_mask, dev);

View File

@ -1431,7 +1431,7 @@ static void exynos_iommu_release_device(struct device *dev)
}
static int exynos_iommu_of_xlate(struct device *dev,
struct of_phandle_args *spec)
const struct of_phandle_args *spec)
{
struct platform_device *sysmmu = of_find_device_by_node(spec->np);
struct exynos_iommu_owner *owner = dev_iommu_priv_get(dev);

View File

@ -51,6 +51,7 @@ config INTEL_IOMMU_SVM
depends on X86_64
select MMU_NOTIFIER
select IOMMU_SVA
select IOMMU_IOPF
help
Shared Virtual Memory (SVM) provides a facility for devices
to access DMA resources through process address space by
@ -64,17 +65,6 @@ config INTEL_IOMMU_DEFAULT_ON
one is found. If this option is not selected, DMAR support can
be enabled by passing intel_iommu=on to the kernel.
config INTEL_IOMMU_BROKEN_GFX_WA
bool "Workaround broken graphics drivers (going away soon)"
depends on BROKEN && X86
help
Current Graphics drivers tend to use physical address
for DMA and avoid using DMA APIs. Setting this config
option permits the IOMMU driver to set a unity map for
all the OS-visible memory. Hence the driver can continue
to use physical addresses for DMA, at least until this
option is removed in the 2.6.32 kernel.
config INTEL_IOMMU_FLOPPY_WA
def_bool y
depends on X86

View File

@ -5,5 +5,7 @@ obj-$(CONFIG_DMAR_TABLE) += trace.o cap_audit.o
obj-$(CONFIG_DMAR_PERF) += perf.o
obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += debugfs.o
obj-$(CONFIG_INTEL_IOMMU_SVM) += svm.o
ifdef CONFIG_INTEL_IOMMU
obj-$(CONFIG_IRQ_REMAP) += irq_remapping.o
endif
obj-$(CONFIG_INTEL_IOMMU_PERF_EVENTS) += perfmon.o

View File

@ -1095,7 +1095,9 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
iommu->agaw = agaw;
iommu->msagaw = msagaw;
iommu->segment = drhd->segment;
iommu->device_rbtree = RB_ROOT;
spin_lock_init(&iommu->device_rbtree_lock);
mutex_init(&iommu->iopf_lock);
iommu->node = NUMA_NO_NODE;
ver = readl(iommu->reg + DMAR_VER_REG);
@ -1271,6 +1273,8 @@ static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
{
u32 fault;
int head, tail;
struct device *dev;
u64 iqe_err, ite_sid;
struct q_inval *qi = iommu->qi;
int shift = qi_shift(iommu);
@ -1315,6 +1319,13 @@ static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
tail = readl(iommu->reg + DMAR_IQT_REG);
tail = ((tail >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
/*
* SID field is valid only when the ITE field is Set in FSTS_REG
* see Intel VT-d spec r4.1, section 11.4.9.9
*/
iqe_err = dmar_readq(iommu->reg + DMAR_IQER_REG);
ite_sid = DMAR_IQER_REG_ITESID(iqe_err);
writel(DMA_FSTS_ITE, iommu->reg + DMAR_FSTS_REG);
pr_info("Invalidation Time-out Error (ITE) cleared\n");
@ -1324,6 +1335,19 @@ static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
head = (head - 2 + QI_LENGTH) % QI_LENGTH;
} while (head != tail);
/*
* If device was released or isn't present, no need to retry
* the ATS invalidate request anymore.
*
* 0 value of ite_sid means old VT-d device, no ite_sid value.
* see Intel VT-d spec r4.1, section 11.4.9.9
*/
if (ite_sid) {
dev = device_rbtree_find(iommu, ite_sid);
if (!dev || !dev_is_pci(dev) ||
!pci_device_is_present(to_pci_dev(dev)))
return -ETIMEDOUT;
}
if (qi->desc_status[wait_index] == QI_ABORT)
return -EAGAIN;
}
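
The ITE handling above boils down to a single question: is the device that
timed out still present? Expressed as a stand-alone predicate, this is an
illustration of the logic that qi_check_fault() open-codes, not a helper added
by this patch; it assumes ite_sid has already been extracted with
DMAR_IQER_REG_ITESID().

/* Illustration only: the retry decision open-coded in qi_check_fault().
 * An ite_sid of 0 means the hardware reported no source ID (older VT-d),
 * so the invalidation is retried as before.
 */
static bool example_ats_inv_should_retry(struct intel_iommu *iommu, u16 ite_sid)
{
	struct device *dev;

	if (!ite_sid)
		return true;

	dev = device_rbtree_find(iommu, ite_sid);
	if (!dev || !dev_is_pci(dev) || !pci_device_is_present(to_pci_dev(dev)))
		return false;	/* device is gone: fail with -ETIMEDOUT */

	return true;
}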

View File

@ -27,7 +27,6 @@
#include "iommu.h"
#include "../dma-iommu.h"
#include "../irq_remapping.h"
#include "../iommu-sva.h"
#include "pasid.h"
#include "cap_audit.h"
#include "perfmon.h"
@ -97,6 +96,81 @@ static phys_addr_t root_entry_uctp(struct root_entry *re)
return re->hi & VTD_PAGE_MASK;
}
static int device_rid_cmp_key(const void *key, const struct rb_node *node)
{
struct device_domain_info *info =
rb_entry(node, struct device_domain_info, node);
const u16 *rid_lhs = key;
if (*rid_lhs < PCI_DEVID(info->bus, info->devfn))
return -1;
if (*rid_lhs > PCI_DEVID(info->bus, info->devfn))
return 1;
return 0;
}
static int device_rid_cmp(struct rb_node *lhs, const struct rb_node *rhs)
{
struct device_domain_info *info =
rb_entry(lhs, struct device_domain_info, node);
u16 key = PCI_DEVID(info->bus, info->devfn);
return device_rid_cmp_key(&key, rhs);
}
/*
* Looks up an IOMMU-probed device using its source ID.
*
* Returns the pointer to the device if there is a match. Otherwise,
* returns NULL.
*
* Note that this helper doesn't guarantee that the device won't be
* released by the iommu subsystem after being returned. The caller
* should use its own synchronization mechanism to avoid the device
* being released during its use if that is possibly the case.
*/
struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid)
{
struct device_domain_info *info = NULL;
struct rb_node *node;
unsigned long flags;
spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
node = rb_find(&rid, &iommu->device_rbtree, device_rid_cmp_key);
if (node)
info = rb_entry(node, struct device_domain_info, node);
spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);
return info ? info->dev : NULL;
}
static int device_rbtree_insert(struct intel_iommu *iommu,
struct device_domain_info *info)
{
struct rb_node *curr;
unsigned long flags;
spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
curr = rb_find_add(&info->node, &iommu->device_rbtree, device_rid_cmp);
spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);
if (WARN_ON(curr))
return -EEXIST;
return 0;
}
static void device_rbtree_remove(struct device_domain_info *info)
{
struct intel_iommu *iommu = info->iommu;
unsigned long flags;
spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
rb_erase(&info->node, &iommu->device_rbtree);
spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);
}
/*
* This domain is a static identity mapping domain.
* 1. This domain creates a static 1:1 mapping to all usable memory.
@ -1776,34 +1850,17 @@ static void domain_exit(struct dmar_domain *domain)
kfree(domain);
}
/*
* Get the PASID directory size for scalable mode context entry.
* Value of X in the PDTS field of a scalable mode context entry
* indicates PASID directory with 2^(X + 7) entries.
*/
static unsigned long context_get_sm_pds(struct pasid_table *table)
{
unsigned long pds, max_pde;
max_pde = table->max_pasid >> PASID_PDE_SHIFT;
pds = find_first_bit(&max_pde, MAX_NR_PASID_BITS);
if (pds < 7)
return 0;
return pds - 7;
}
static int domain_context_mapping_one(struct dmar_domain *domain,
struct intel_iommu *iommu,
struct pasid_table *table,
u8 bus, u8 devfn)
{
struct device_domain_info *info =
domain_lookup_dev_info(domain, iommu, bus, devfn);
u16 did = domain_id_iommu(domain, iommu);
int translation = CONTEXT_TT_MULTI_LEVEL;
struct dma_pte *pgd = domain->pgd;
struct context_entry *context;
int ret;
int agaw, ret;
if (hw_pass_through && domain_type_is_si(domain))
translation = CONTEXT_TT_PASS_THROUGH;
@ -1846,65 +1903,37 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
}
context_clear_entry(context);
context_set_domain_id(context, did);
if (sm_supported(iommu)) {
unsigned long pds;
/* Setup the PASID DIR pointer: */
pds = context_get_sm_pds(table);
context->lo = (u64)virt_to_phys(table->table) |
context_pdts(pds);
/* Setup the RID_PASID field: */
context_set_sm_rid2pasid(context, IOMMU_NO_PASID);
if (translation != CONTEXT_TT_PASS_THROUGH) {
/*
* Setup the Device-TLB enable bit and Page request
* Enable bit:
* Skip top levels of page tables for iommu which has
* less agaw than default. Unnecessary for PT mode.
*/
if (info && info->ats_supported)
context_set_sm_dte(context);
if (info && info->pri_supported)
context_set_sm_pre(context);
if (info && info->pasid_supported)
context_set_pasid(context);
} else {
struct dma_pte *pgd = domain->pgd;
int agaw;
context_set_domain_id(context, did);
if (translation != CONTEXT_TT_PASS_THROUGH) {
/*
* Skip top levels of page tables for iommu which has
* less agaw than default. Unnecessary for PT mode.
*/
for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
ret = -ENOMEM;
pgd = phys_to_virt(dma_pte_addr(pgd));
if (!dma_pte_present(pgd))
goto out_unlock;
}
if (info && info->ats_supported)
translation = CONTEXT_TT_DEV_IOTLB;
else
translation = CONTEXT_TT_MULTI_LEVEL;
context_set_address_root(context, virt_to_phys(pgd));
context_set_address_width(context, agaw);
} else {
/*
* In pass through mode, AW must be programmed to
* indicate the largest AGAW value supported by
* hardware. And ASR is ignored by hardware.
*/
context_set_address_width(context, iommu->msagaw);
for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
ret = -ENOMEM;
pgd = phys_to_virt(dma_pte_addr(pgd));
if (!dma_pte_present(pgd))
goto out_unlock;
}
context_set_translation_type(context, translation);
if (info && info->ats_supported)
translation = CONTEXT_TT_DEV_IOTLB;
else
translation = CONTEXT_TT_MULTI_LEVEL;
context_set_address_root(context, virt_to_phys(pgd));
context_set_address_width(context, agaw);
} else {
/*
* In pass through mode, AW must be programmed to
* indicate the largest AGAW value supported by
* hardware. And ASR is ignored by hardware.
*/
context_set_address_width(context, iommu->msagaw);
}
context_set_translation_type(context, translation);
context_set_fault_enable(context);
context_set_present(context);
if (!ecap_coherent(iommu->ecap))
@ -1934,43 +1963,29 @@ out_unlock:
return ret;
}
struct domain_context_mapping_data {
struct dmar_domain *domain;
struct intel_iommu *iommu;
struct pasid_table *table;
};
static int domain_context_mapping_cb(struct pci_dev *pdev,
u16 alias, void *opaque)
{
struct domain_context_mapping_data *data = opaque;
struct device_domain_info *info = dev_iommu_priv_get(&pdev->dev);
struct intel_iommu *iommu = info->iommu;
struct dmar_domain *domain = opaque;
return domain_context_mapping_one(data->domain, data->iommu,
data->table, PCI_BUS_NUM(alias),
alias & 0xff);
return domain_context_mapping_one(domain, iommu,
PCI_BUS_NUM(alias), alias & 0xff);
}
static int
domain_context_mapping(struct dmar_domain *domain, struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct domain_context_mapping_data data;
struct intel_iommu *iommu = info->iommu;
u8 bus = info->bus, devfn = info->devfn;
struct pasid_table *table;
table = intel_pasid_get_table(dev);
if (!dev_is_pci(dev))
return domain_context_mapping_one(domain, iommu, table,
bus, devfn);
data.domain = domain;
data.iommu = iommu;
data.table = table;
return domain_context_mapping_one(domain, iommu, bus, devfn);
return pci_for_each_dma_alias(to_pci_dev(dev),
&domain_context_mapping_cb, &data);
domain_context_mapping_cb, domain);
}
/* Returns a number of VTD pages, but aligned to MM page size */
@ -2160,9 +2175,6 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
struct context_entry *context;
u16 did_old;
if (!iommu)
return;
spin_lock(&iommu->lock);
context = iommu_context_addr(iommu, bus, devfn, 0);
if (!context) {
@ -2170,14 +2182,7 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
return;
}
if (sm_supported(iommu)) {
if (hw_pass_through && domain_type_is_si(info->domain))
did_old = FLPT_DEFAULT_DID;
else
did_old = domain_id_iommu(info->domain, iommu);
} else {
did_old = context_domain_id(context);
}
did_old = context_domain_id(context);
context_clear_entry(context);
__iommu_flush_cache(iommu, context, sizeof(*context));
@ -2188,9 +2193,6 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
DMA_CCMD_MASK_NOBIT,
DMA_CCMD_DEVICE_INVL);
if (sm_supported(iommu))
qi_flush_pasid_cache(iommu, did_old, QI_PC_ALL_PASIDS, 0);
iommu->flush.flush_iotlb(iommu,
did_old,
0,
@ -2330,28 +2332,19 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
list_add(&info->link, &domain->devices);
spin_unlock_irqrestore(&domain->lock, flags);
/* PASID table is mandatory for a PCI device in scalable mode. */
if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev)) {
/* Setup the PASID entry for requests without PASID: */
if (hw_pass_through && domain_type_is_si(domain))
ret = intel_pasid_setup_pass_through(iommu,
dev, IOMMU_NO_PASID);
else if (domain->use_first_level)
ret = domain_setup_first_level(iommu, domain, dev,
IOMMU_NO_PASID);
else
ret = intel_pasid_setup_second_level(iommu, domain,
dev, IOMMU_NO_PASID);
if (ret) {
dev_err(dev, "Setup RID2PASID failed\n");
device_block_translation(dev);
return ret;
}
}
if (dev_is_real_dma_subdevice(dev))
return 0;
if (!sm_supported(iommu))
ret = domain_context_mapping(domain, dev);
else if (hw_pass_through && domain_type_is_si(domain))
ret = intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID);
else if (domain->use_first_level)
ret = domain_setup_first_level(iommu, domain, dev, IOMMU_NO_PASID);
else
ret = intel_pasid_setup_second_level(iommu, domain, dev, IOMMU_NO_PASID);
ret = domain_context_mapping(domain, dev);
if (ret) {
dev_err(dev, "Domain context map failed\n");
device_block_translation(dev);
return ret;
}
@ -2712,10 +2705,6 @@ static int __init init_dmars(void)
iommu_set_root_entry(iommu);
}
#ifdef CONFIG_INTEL_IOMMU_BROKEN_GFX_WA
dmar_map_gfx = 0;
#endif
if (!dmar_map_gfx)
iommu_identity_mapping |= IDENTMAP_GFX;
@ -3799,30 +3788,6 @@ static void domain_context_clear(struct device_domain_info *info)
&domain_context_clear_one_cb, info);
}
static void dmar_remove_one_dev_info(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct dmar_domain *domain = info->domain;
struct intel_iommu *iommu = info->iommu;
unsigned long flags;
if (!dev_is_real_dma_subdevice(info->dev)) {
if (dev_is_pci(info->dev) && sm_supported(iommu))
intel_pasid_tear_down_entry(iommu, info->dev,
IOMMU_NO_PASID, false);
iommu_disable_pci_caps(info);
domain_context_clear(info);
}
spin_lock_irqsave(&domain->lock, flags);
list_del(&info->link);
spin_unlock_irqrestore(&domain->lock, flags);
domain_detach_iommu(domain, iommu);
info->domain = NULL;
}
/*
* Clear the page table pointer in context or pasid table entries so that
* all DMA requests without PASID from the device are blocked. If the page
@ -4027,6 +3992,10 @@ int prepare_domain_attach_device(struct iommu_domain *domain,
dmar_domain->agaw--;
}
if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev) &&
context_copied(iommu, info->bus, info->devfn))
return intel_pasid_setup_sm_context(dev);
return 0;
}
@ -4330,26 +4299,50 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
}
dev_iommu_priv_set(dev, info);
ret = device_rbtree_insert(iommu, info);
if (ret)
goto free;
if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev)) {
ret = intel_pasid_alloc_table(dev);
if (ret) {
dev_err(dev, "PASID table allocation failed\n");
kfree(info);
return ERR_PTR(ret);
goto clear_rbtree;
}
if (!context_copied(iommu, info->bus, info->devfn)) {
ret = intel_pasid_setup_sm_context(dev);
if (ret)
goto free_table;
}
}
intel_iommu_debugfs_create_dev(info);
return &iommu->iommu;
free_table:
intel_pasid_free_table(dev);
clear_rbtree:
device_rbtree_remove(info);
free:
kfree(info);
return ERR_PTR(ret);
}
static void intel_iommu_release_device(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
mutex_lock(&iommu->iopf_lock);
device_rbtree_remove(info);
mutex_unlock(&iommu->iopf_lock);
if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev) &&
!context_copied(iommu, info->bus, info->devfn))
intel_pasid_teardown_sm_context(dev);
dmar_remove_one_dev_info(dev);
intel_pasid_free_table(dev);
intel_iommu_debugfs_remove_dev(info);
kfree(info);
@ -4492,23 +4485,15 @@ static int intel_iommu_enable_iopf(struct device *dev)
if (ret)
return ret;
ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
if (ret)
goto iopf_remove_device;
ret = pci_enable_pri(pdev, PRQ_DEPTH);
if (ret)
goto iopf_unregister_handler;
if (ret) {
iopf_queue_remove_device(iommu->iopf_queue, dev);
return ret;
}
info->pri_enabled = 1;
return 0;
iopf_unregister_handler:
iommu_unregister_device_fault_handler(dev);
iopf_remove_device:
iopf_queue_remove_device(iommu->iopf_queue, dev);
return ret;
}
static int intel_iommu_disable_iopf(struct device *dev)
@ -4529,14 +4514,7 @@ static int intel_iommu_disable_iopf(struct device *dev)
*/
pci_disable_pri(to_pci_dev(dev));
info->pri_enabled = 0;
/*
* With PRI disabled and outstanding PRQs drained, unregistering
* fault handler and removing device from iopf queue should never
* fail.
*/
WARN_ON(iommu_unregister_device_fault_handler(dev));
WARN_ON(iopf_queue_remove_device(iommu->iopf_queue, dev));
iopf_queue_remove_device(iommu->iopf_queue, dev);
return 0;
}
@ -4855,6 +4833,7 @@ static const struct iommu_dirty_ops intel_dirty_ops = {
const struct iommu_ops intel_iommu_ops = {
.blocked_domain = &blocking_domain,
.release_domain = &blocking_domain,
.capable = intel_iommu_capable,
.hw_info = intel_iommu_hw_info,
.domain_alloc = intel_iommu_domain_alloc,

View File

@ -719,9 +719,16 @@ struct intel_iommu {
#endif
struct iopf_queue *iopf_queue;
unsigned char iopfq_name[16];
/* Synchronization between fault report and iommu device release. */
struct mutex iopf_lock;
struct q_inval *qi; /* Queued invalidation info */
u32 iommu_state[MAX_SR_DMAR_REGS]; /* Store iommu states between suspend and resume.*/
/* rb tree for all probed devices */
struct rb_root device_rbtree;
/* protect the device_rbtree */
spinlock_t device_rbtree_lock;
#ifdef CONFIG_IRQ_REMAP
struct ir_table *ir_table; /* Interrupt remapping info */
struct irq_domain *ir_domain;
@ -755,6 +762,8 @@ struct device_domain_info {
struct intel_iommu *iommu; /* IOMMU used by this device */
struct dmar_domain *domain; /* pointer to domain */
struct pasid_table *pasid_table; /* pasid table */
/* device tracking node (lookup by PCI RID) */
struct rb_node node;
#ifdef CONFIG_INTEL_IOMMU_DEBUGFS
struct dentry *debugfs_dentry; /* pointer to device directory dentry */
#endif
@ -1081,13 +1090,14 @@ void free_pgtable_page(void *vaddr);
void iommu_flush_write_buffer(struct intel_iommu *iommu);
struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *parent,
const struct iommu_user_data *user_data);
struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid);
#ifdef CONFIG_INTEL_IOMMU_SVM
void intel_svm_check(struct intel_iommu *iommu);
int intel_svm_enable_prq(struct intel_iommu *iommu);
int intel_svm_finish_prq(struct intel_iommu *iommu);
int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
struct iommu_page_response *msg);
void intel_svm_page_response(struct device *dev, struct iopf_fault *evt,
struct iommu_page_response *msg);
struct iommu_domain *intel_svm_domain_alloc(void);
void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid);
void intel_drain_pasid_prq(struct device *dev, u32 pasid);

View File

@ -214,6 +214,9 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
if (!info || !info->ats_enabled)
return;
if (pci_dev_is_disconnected(to_pci_dev(dev)))
return;
sid = info->bus << 8 | info->devfn;
qdep = info->ats_qdep;
pfsid = info->pfsid;
@ -667,3 +670,205 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
return 0;
}
/*
* Interfaces to setup or teardown a pasid table to the scalable-mode
* context table entry:
*/
static void device_pasid_table_teardown(struct device *dev, u8 bus, u8 devfn)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
struct context_entry *context;
spin_lock(&iommu->lock);
context = iommu_context_addr(iommu, bus, devfn, false);
if (!context) {
spin_unlock(&iommu->lock);
return;
}
context_clear_entry(context);
__iommu_flush_cache(iommu, context, sizeof(*context));
spin_unlock(&iommu->lock);
/*
* Cache invalidation for changes to a scalable-mode context table
* entry.
*
* Section 6.5.3.3 of the VT-d spec:
* - Device-selective context-cache invalidation;
* - Domain-selective PASID-cache invalidation to affected domains
* (can be skipped if all PASID entries were not-present);
* - Domain-selective IOTLB invalidation to affected domains;
* - Global Device-TLB invalidation to affected functions.
*
* The iommu has been parked in the blocking state. All domains have
* been detached from the device or PASID. The PASID and IOTLB caches
* have been invalidated during the domain detach path.
*/
iommu->flush.flush_context(iommu, 0, PCI_DEVID(bus, devfn),
DMA_CCMD_MASK_NOBIT, DMA_CCMD_DEVICE_INVL);
devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID);
}
static int pci_pasid_table_teardown(struct pci_dev *pdev, u16 alias, void *data)
{
struct device *dev = data;
if (dev == &pdev->dev)
device_pasid_table_teardown(dev, PCI_BUS_NUM(alias), alias & 0xff);
return 0;
}
void intel_pasid_teardown_sm_context(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
if (!dev_is_pci(dev)) {
device_pasid_table_teardown(dev, info->bus, info->devfn);
return;
}
pci_for_each_dma_alias(to_pci_dev(dev), pci_pasid_table_teardown, dev);
}
/*
* Get the PASID directory size for a scalable-mode context entry.
* A value of X in the PDTS field of a scalable-mode context entry
* indicates a PASID directory with 2^(X + 7) entries.
*/
static unsigned long context_get_sm_pds(struct pasid_table *table)
{
unsigned long pds, max_pde;
max_pde = table->max_pasid >> PASID_PDE_SHIFT;
pds = find_first_bit(&max_pde, MAX_NR_PASID_BITS);
if (pds < 7)
return 0;
return pds - 7;
}
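/*
 * A quick worked example of the PDTS encoding above (not from the patch;
 * it assumes PASID_PDE_SHIFT is 6, i.e. 64 PASID-table entries per
 * directory entry):
 *
 *   max_pasid = 1 << 20  ->  max_pde = 1 << 14
 *   find_first_bit(&max_pde, ...) = 14  ->  pds = 14 - 7 = 7
 *   PDTS = 7 then encodes a PASID directory with 2^(7 + 7) = 16384
 *   entries, which matches max_pde.
 */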
static int context_entry_set_pasid_table(struct context_entry *context,
struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct pasid_table *table = info->pasid_table;
struct intel_iommu *iommu = info->iommu;
unsigned long pds;
context_clear_entry(context);
pds = context_get_sm_pds(table);
context->lo = (u64)virt_to_phys(table->table) | context_pdts(pds);
context_set_sm_rid2pasid(context, IOMMU_NO_PASID);
if (info->ats_supported)
context_set_sm_dte(context);
if (info->pri_supported)
context_set_sm_pre(context);
if (info->pasid_supported)
context_set_pasid(context);
context_set_fault_enable(context);
context_set_present(context);
__iommu_flush_cache(iommu, context, sizeof(*context));
return 0;
}
static int device_pasid_table_setup(struct device *dev, u8 bus, u8 devfn)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
struct context_entry *context;
spin_lock(&iommu->lock);
context = iommu_context_addr(iommu, bus, devfn, true);
if (!context) {
spin_unlock(&iommu->lock);
return -ENOMEM;
}
if (context_present(context) && !context_copied(iommu, bus, devfn)) {
spin_unlock(&iommu->lock);
return 0;
}
if (context_copied(iommu, bus, devfn)) {
context_clear_entry(context);
__iommu_flush_cache(iommu, context, sizeof(*context));
/*
* For kdump cases, old valid entries may be cached due to
* in-flight DMA and the copied pgtable, but they are never
* explicitly unmapped, so cache flushes are needed for all
* domain IDs and PASIDs used in the copied PASID table. Since
* there is no record of which domain IDs and PASIDs the copied
* tables used, fall back to global PASID and IOTLB cache
* invalidation.
*/
iommu->flush.flush_context(iommu, 0,
PCI_DEVID(bus, devfn),
DMA_CCMD_MASK_NOBIT,
DMA_CCMD_DEVICE_INVL);
qi_flush_pasid_cache(iommu, 0, QI_PC_GLOBAL, 0);
iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID);
/*
* At this point, the device is expected to have completed reset
* during its driver probe stage, so no in-flight DMA remains and
* no further cleanup is needed here.
*/
clear_context_copied(iommu, bus, devfn);
}
context_entry_set_pasid_table(context, dev);
spin_unlock(&iommu->lock);
/*
* It's a non-present to present mapping. If hardware doesn't cache
* non-present entries, we don't need to flush the caches. If it
* does cache non-present entries, then it does so in the special
* domain #0, which we have to flush:
*/
if (cap_caching_mode(iommu->cap)) {
iommu->flush.flush_context(iommu, 0,
PCI_DEVID(bus, devfn),
DMA_CCMD_MASK_NOBIT,
DMA_CCMD_DEVICE_INVL);
iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_DSI_FLUSH);
}
return 0;
}
static int pci_pasid_table_setup(struct pci_dev *pdev, u16 alias, void *data)
{
struct device *dev = data;
if (dev != &pdev->dev)
return 0;
return device_pasid_table_setup(dev, PCI_BUS_NUM(alias), alias & 0xff);
}
/*
* Set the device's PASID table in its context table entry.
*
* The PASID table is installed in the context entries of both the
* device itself and its DMA alias requester IDs.
*/
int intel_pasid_setup_sm_context(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
if (!dev_is_pci(dev))
return device_pasid_table_setup(dev, info->bus, info->devfn);
return pci_for_each_dma_alias(to_pci_dev(dev), pci_pasid_table_setup, dev);
}
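/*
 * A minimal sketch (not part of the patch) of the probe-time ordering used
 * by intel_iommu_probe_device(): allocate the PASID table first, then
 * install it into the context entries of the device and its DMA aliases.
 * The wrapper name is illustrative; only the intel_pasid_* helpers are
 * taken from the code above.
 */
static int example_setup_scalable_mode(struct device *dev)
{
	int ret;

	ret = intel_pasid_alloc_table(dev);
	if (ret)
		return ret;

	ret = intel_pasid_setup_sm_context(dev);
	if (ret)
		intel_pasid_free_table(dev);

	return ret;
}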

View File

@ -318,4 +318,6 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
bool fault_ignore);
void intel_pasid_setup_page_snoop_control(struct intel_iommu *iommu,
struct device *dev, u32 pasid);
int intel_pasid_setup_sm_context(struct device *dev);
void intel_pasid_teardown_sm_context(struct device *dev);
#endif /* __INTEL_PASID_H */

View File

@ -33,7 +33,7 @@ int dmar_latency_enable(struct intel_iommu *iommu, enum latency_type type)
spin_lock_irqsave(&latency_lock, flags);
if (!iommu->perf_statistic) {
iommu->perf_statistic = kzalloc(sizeof(*lstat) * DMAR_LATENCY_NUM,
iommu->perf_statistic = kcalloc(DMAR_LATENCY_NUM, sizeof(*lstat),
GFP_ATOMIC);
if (!iommu->perf_statistic) {
ret = -ENOMEM;

View File

@ -22,7 +22,6 @@
#include "iommu.h"
#include "pasid.h"
#include "perf.h"
#include "../iommu-sva.h"
#include "trace.h"
static irqreturn_t prq_event_thread(int irq, void *d);
@ -315,10 +314,11 @@ out:
return 0;
}
static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev,
struct iommu_domain *domain, ioasid_t pasid)
static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
struct intel_svm_dev *sdev;
struct intel_svm *svm;
@ -360,7 +360,6 @@ static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev,
sdev->iommu = iommu;
sdev->did = FLPT_DEFAULT_DID;
sdev->sid = PCI_DEVID(info->bus, info->devfn);
init_rcu_head(&sdev->rcu);
if (info->ats_enabled) {
sdev->qdep = info->ats_qdep;
if (sdev->qdep >= QI_DEV_EIOTLB_MAX_INVS)
@ -408,13 +407,6 @@ void intel_svm_remove_dev_pasid(struct device *dev, u32 pasid)
if (svm->notifier.ops)
mmu_notifier_unregister(&svm->notifier, mm);
pasid_private_remove(svm->pasid);
/*
* We mandate that no page faults may be outstanding
* for the PASID when intel_svm_unbind_mm() is called.
* If that is not obeyed, subtle errors will happen.
* Let's make them less subtle...
*/
memset(svm, 0x6b, sizeof(*svm));
kfree(svm);
}
}
@ -562,16 +554,12 @@ static int prq_to_iommu_prot(struct page_req_dsc *req)
return prot;
}
static int intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev,
struct page_req_dsc *desc)
static void intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev,
struct page_req_dsc *desc)
{
struct iommu_fault_event event;
if (!dev || !dev_is_pci(dev))
return -ENODEV;
struct iopf_fault event = { };
/* Fill in event data for device specific processing */
memset(&event, 0, sizeof(struct iommu_fault_event));
event.fault.type = IOMMU_FAULT_PAGE_REQ;
event.fault.prm.addr = (u64)desc->addr << VTD_PAGE_SHIFT;
event.fault.prm.pasid = desc->pasid;
@ -603,7 +591,7 @@ static int intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev,
event.fault.prm.private_data[0] = ktime_to_ns(ktime_get());
}
return iommu_report_device_fault(dev, &event);
iommu_report_device_fault(dev, &event);
}
static void handle_bad_prq_event(struct intel_iommu *iommu,
@ -650,7 +638,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
struct intel_iommu *iommu = d;
struct page_req_dsc *req;
int head, tail, handled;
struct pci_dev *pdev;
struct device *dev;
u64 address;
/*
@ -696,23 +684,22 @@ bad_req:
if (unlikely(req->lpig && !req->rd_req && !req->wr_req))
goto prq_advance;
pdev = pci_get_domain_bus_and_slot(iommu->segment,
PCI_BUS_NUM(req->rid),
req->rid & 0xff);
/*
* If prq is to be handled outside iommu driver via receiver of
* the fault notifiers, we skip the page response here.
*/
if (!pdev)
mutex_lock(&iommu->iopf_lock);
dev = device_rbtree_find(iommu, req->rid);
if (!dev) {
mutex_unlock(&iommu->iopf_lock);
goto bad_req;
}
if (intel_svm_prq_report(iommu, &pdev->dev, req))
handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
else
trace_prq_report(iommu, &pdev->dev, req->qw_0, req->qw_1,
req->priv_data[0], req->priv_data[1],
iommu->prq_seq_number++);
pci_dev_put(pdev);
intel_svm_prq_report(iommu, dev, req);
trace_prq_report(iommu, dev, req->qw_0, req->qw_1,
req->priv_data[0], req->priv_data[1],
iommu->prq_seq_number++);
mutex_unlock(&iommu->iopf_lock);
prq_advance:
head = (head + sizeof(*req)) & PRQ_RING_MASK;
}
@ -742,9 +729,8 @@ prq_advance:
return IRQ_RETVAL(handled);
}
int intel_svm_page_response(struct device *dev,
struct iommu_fault_event *evt,
struct iommu_page_response *msg)
void intel_svm_page_response(struct device *dev, struct iopf_fault *evt,
struct iommu_page_response *msg)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
@ -753,7 +739,6 @@ int intel_svm_page_response(struct device *dev,
bool private_present;
bool pasid_present;
bool last_page;
int ret = 0;
u16 sid;
prm = &evt->fault.prm;
@ -762,16 +747,6 @@ int intel_svm_page_response(struct device *dev,
private_present = prm->flags & IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA;
last_page = prm->flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
if (!pasid_present) {
ret = -EINVAL;
goto out;
}
if (prm->pasid == 0 || prm->pasid >= PASID_MAX) {
ret = -EINVAL;
goto out;
}
/*
* Per VT-d spec. v3.0 ch7.7, system software must respond
* with page group response if private data is present (PDP)
@ -800,17 +775,6 @@ int intel_svm_page_response(struct device *dev,
qi_submit_sync(iommu, &desc, 1, 0);
}
out:
return ret;
}
static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
return intel_svm_bind_mm(iommu, dev, domain, pasid);
}
static void intel_svm_domain_free(struct iommu_domain *domain)

View File

@ -11,101 +11,140 @@
#include <linux/slab.h>
#include <linux/workqueue.h>
#include "iommu-sva.h"
#include "iommu-priv.h"
/**
* struct iopf_queue - IO Page Fault queue
* @wq: the fault workqueue
* @devices: devices attached to this queue
* @lock: protects the device list
/*
* Return the fault parameter of a device if it exists. Otherwise, return NULL.
* On a successful return, the caller takes a reference on this parameter and
* must put it after use by calling iopf_put_dev_fault_param().
*/
struct iopf_queue {
struct workqueue_struct *wq;
struct list_head devices;
struct mutex lock;
};
/**
* struct iopf_device_param - IO Page Fault data attached to a device
* @dev: the device that owns this param
* @queue: IOPF queue
* @queue_list: index into queue->devices
* @partial: faults that are part of a Page Request Group for which the last
* request hasn't been submitted yet.
*/
struct iopf_device_param {
struct device *dev;
struct iopf_queue *queue;
struct list_head queue_list;
struct list_head partial;
};
struct iopf_fault {
struct iommu_fault fault;
struct list_head list;
};
struct iopf_group {
struct iopf_fault last_fault;
struct list_head faults;
struct work_struct work;
struct device *dev;
};
static int iopf_complete_group(struct device *dev, struct iopf_fault *iopf,
enum iommu_page_response_code status)
static struct iommu_fault_param *iopf_get_dev_fault_param(struct device *dev)
{
struct iommu_page_response resp = {
.version = IOMMU_PAGE_RESP_VERSION_1,
.pasid = iopf->fault.prm.pasid,
.grpid = iopf->fault.prm.grpid,
.code = status,
};
struct dev_iommu *param = dev->iommu;
struct iommu_fault_param *fault_param;
if ((iopf->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) &&
(iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID))
resp.flags = IOMMU_PAGE_RESP_PASID_VALID;
rcu_read_lock();
fault_param = rcu_dereference(param->fault_param);
if (fault_param && !refcount_inc_not_zero(&fault_param->users))
fault_param = NULL;
rcu_read_unlock();
return iommu_page_response(dev, &resp);
return fault_param;
}
static void iopf_handler(struct work_struct *work)
/* Caller must hold a reference of the fault parameter. */
static void iopf_put_dev_fault_param(struct iommu_fault_param *fault_param)
{
struct iopf_group *group;
struct iommu_domain *domain;
struct iopf_fault *iopf, *next;
enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;
if (refcount_dec_and_test(&fault_param->users))
kfree_rcu(fault_param, rcu);
}
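/*
 * A minimal sketch (not part of the patch) of the reference pattern the two
 * helpers above implement: take a reference on the fault parameter before
 * using it and drop it when done. The function name and body are
 * illustrative only.
 */
static void example_touch_fault_param(struct device *dev)
{
	struct iommu_fault_param *fault_param;

	fault_param = iopf_get_dev_fault_param(dev);
	if (!fault_param)
		return;		/* no fault parameter attached */

	/* ... inspect fault_param->partial or fault_param->faults ... */

	iopf_put_dev_fault_param(fault_param);
}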
group = container_of(work, struct iopf_group, work);
domain = iommu_get_domain_for_dev_pasid(group->dev,
group->last_fault.fault.prm.pasid, 0);
if (!domain || !domain->iopf_handler)
status = IOMMU_PAGE_RESP_INVALID;
static void __iopf_free_group(struct iopf_group *group)
{
struct iopf_fault *iopf, *next;
list_for_each_entry_safe(iopf, next, &group->faults, list) {
/*
* For the moment, errors are sticky: don't handle subsequent
* faults in the group if there is an error.
*/
if (status == IOMMU_PAGE_RESP_SUCCESS)
status = domain->iopf_handler(&iopf->fault,
domain->fault_data);
if (!(iopf->fault.prm.flags &
IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
if (!(iopf->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
kfree(iopf);
}
iopf_complete_group(group->dev, &group->last_fault, status);
/* Pair with iommu_report_device_fault(). */
iopf_put_dev_fault_param(group->fault_param);
}
void iopf_free_group(struct iopf_group *group)
{
__iopf_free_group(group);
kfree(group);
}
EXPORT_SYMBOL_GPL(iopf_free_group);
static struct iommu_domain *get_domain_for_iopf(struct device *dev,
struct iommu_fault *fault)
{
struct iommu_domain *domain;
if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) {
domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0);
if (IS_ERR(domain))
domain = NULL;
} else {
domain = iommu_get_domain_for_dev(dev);
}
if (!domain || !domain->iopf_handler) {
dev_warn_ratelimited(dev,
"iopf (pasid %d) without domain attached or handler installed\n",
fault->prm.pasid);
return NULL;
}
return domain;
}
/* Non-last request of a group. Postpone until the last one. */
static int report_partial_fault(struct iommu_fault_param *fault_param,
struct iommu_fault *fault)
{
struct iopf_fault *iopf;
iopf = kzalloc(sizeof(*iopf), GFP_KERNEL);
if (!iopf)
return -ENOMEM;
iopf->fault = *fault;
mutex_lock(&fault_param->lock);
list_add(&iopf->list, &fault_param->partial);
mutex_unlock(&fault_param->lock);
return 0;
}
static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param,
struct iopf_fault *evt,
struct iopf_group *abort_group)
{
struct iopf_fault *iopf, *next;
struct iopf_group *group;
group = kzalloc(sizeof(*group), GFP_KERNEL);
if (!group) {
/*
* We always need to construct the group as we need it to abort
* the request at the driver if it can't be handled.
*/
group = abort_group;
}
group->fault_param = iopf_param;
group->last_fault.fault = evt->fault;
INIT_LIST_HEAD(&group->faults);
INIT_LIST_HEAD(&group->pending_node);
list_add(&group->last_fault.list, &group->faults);
/* See if we have partial faults for this group */
mutex_lock(&iopf_param->lock);
list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) {
if (iopf->fault.prm.grpid == evt->fault.prm.grpid)
/* Insert *before* the last fault */
list_move(&iopf->list, &group->faults);
}
list_add(&group->pending_node, &iopf_param->faults);
mutex_unlock(&iopf_param->lock);
return group;
}
/**
* iommu_queue_iopf - IO Page Fault handler
* @fault: fault event
* @cookie: struct device, passed to iommu_register_device_fault_handler.
* iommu_report_device_fault() - Report fault event to device driver
* @dev: the device
* @evt: fault event data
*
* Add a fault to the device workqueue, to be handled by mm.
* Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ
* handler. If this function fails then ops->page_response() was called to
* complete evt if required.
*
* This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must discard
* them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't
@ -137,83 +176,57 @@ static void iopf_handler(struct work_struct *work)
* freed after the device has stopped generating page faults (or the iommu
* hardware has been set to block the page faults) and the pending page faults
* have been flushed.
*
* Return: 0 on success and <0 on error.
*/
int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt)
{
int ret;
struct iommu_fault *fault = &evt->fault;
struct iommu_fault_param *iopf_param;
struct iopf_group abort_group = {};
struct iopf_group *group;
struct iopf_fault *iopf, *next;
struct iopf_device_param *iopf_param;
struct device *dev = cookie;
struct dev_iommu *param = dev->iommu;
lockdep_assert_held(&param->lock);
if (fault->type != IOMMU_FAULT_PAGE_REQ)
/* Not a recoverable page fault */
return -EOPNOTSUPP;
/*
* As long as we're holding param->lock, the queue can't be unlinked
* from the device and therefore cannot disappear.
*/
iopf_param = param->iopf_param;
if (!iopf_param)
return -ENODEV;
iopf_param = iopf_get_dev_fault_param(dev);
if (WARN_ON(!iopf_param))
return;
if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) {
iopf = kzalloc(sizeof(*iopf), GFP_KERNEL);
if (!iopf)
return -ENOMEM;
iopf->fault = *fault;
/* Non-last request of a group. Postpone until the last one */
list_add(&iopf->list, &iopf_param->partial);
return 0;
report_partial_fault(iopf_param, fault);
iopf_put_dev_fault_param(iopf_param);
/* A request that is not the last does not need to be ack'd */
}
group = kzalloc(sizeof(*group), GFP_KERNEL);
if (!group) {
/*
* The caller will send a response to the hardware. But we do
* need to clean up before leaving, otherwise partial faults
* will be stuck.
*/
ret = -ENOMEM;
goto cleanup_partial;
}
/*
* This is the last page fault of a group. Allocate an iopf group and
* pass it to the domain's page fault handler. The group holds a
* reference on the fault parameter, which is released on the response
* or error path of this function. If an error is returned, the caller
* will send a response to the hardware. We need to clean up before
* leaving, otherwise partial faults will be stuck.
*/
group = iopf_group_alloc(iopf_param, evt, &abort_group);
if (group == &abort_group)
goto err_abort;
group->dev = dev;
group->last_fault.fault = *fault;
INIT_LIST_HEAD(&group->faults);
list_add(&group->last_fault.list, &group->faults);
INIT_WORK(&group->work, iopf_handler);
group->domain = get_domain_for_iopf(dev, fault);
if (!group->domain)
goto err_abort;
/* See if we have partial faults for this group */
list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) {
if (iopf->fault.prm.grpid == fault->prm.grpid)
/* Insert *before* the last fault */
list_move(&iopf->list, &group->faults);
}
/*
* On success, the iopf handler must call iopf_group_response() and
* iopf_free_group().
*/
if (group->domain->iopf_handler(group))
goto err_abort;
queue_work(iopf_param->queue->wq, &group->work);
return 0;
return;
cleanup_partial:
list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) {
if (iopf->fault.prm.grpid == fault->prm.grpid) {
list_del(&iopf->list);
kfree(iopf);
}
}
return ret;
err_abort:
iopf_group_response(group, IOMMU_PAGE_RESP_FAILURE);
if (group == &abort_group)
__iopf_free_group(group);
else
iopf_free_group(group);
}
EXPORT_SYMBOL_GPL(iommu_queue_iopf);
EXPORT_SYMBOL_GPL(iommu_report_device_fault);
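/*
 * A minimal sketch (not part of the patch) of a driver-side report, modeled
 * on intel_svm_prq_report(): fill an iopf_fault from the hardware page
 * request and hand it to the core. The function name and field values are
 * illustrative only.
 */
static void example_report_prq(struct device *dev, u64 addr, u32 pasid, u16 grpid)
{
	struct iopf_fault event = { };

	event.fault.type = IOMMU_FAULT_PAGE_REQ;
	event.fault.prm.addr = addr;
	event.fault.prm.pasid = pasid;
	event.fault.prm.grpid = grpid;
	event.fault.prm.flags = IOMMU_FAULT_PAGE_REQUEST_PASID_VALID |
				IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;

	iommu_report_device_fault(dev, &event);
}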
/**
* iopf_queue_flush_dev - Ensure that all queued faults have been processed
@ -229,25 +242,51 @@ EXPORT_SYMBOL_GPL(iommu_queue_iopf);
*/
int iopf_queue_flush_dev(struct device *dev)
{
int ret = 0;
struct iopf_device_param *iopf_param;
struct dev_iommu *param = dev->iommu;
struct iommu_fault_param *iopf_param;
if (!param)
/*
* It's a driver bug to be here after iopf_queue_remove_device().
* Therefore, it's safe to dereference the fault parameter without
* holding the lock.
*/
iopf_param = rcu_dereference_check(dev->iommu->fault_param, true);
if (WARN_ON(!iopf_param))
return -ENODEV;
mutex_lock(&param->lock);
iopf_param = param->iopf_param;
if (iopf_param)
flush_workqueue(iopf_param->queue->wq);
else
ret = -ENODEV;
mutex_unlock(&param->lock);
flush_workqueue(iopf_param->queue->wq);
return ret;
return 0;
}
EXPORT_SYMBOL_GPL(iopf_queue_flush_dev);
/**
* iopf_group_response - Respond to a group of page faults
* @group: the group of faults with the same group id
* @status: the response code
*/
void iopf_group_response(struct iopf_group *group,
enum iommu_page_response_code status)
{
struct iommu_fault_param *fault_param = group->fault_param;
struct iopf_fault *iopf = &group->last_fault;
struct device *dev = group->fault_param->dev;
const struct iommu_ops *ops = dev_iommu_ops(dev);
struct iommu_page_response resp = {
.pasid = iopf->fault.prm.pasid,
.grpid = iopf->fault.prm.grpid,
.code = status,
};
/* Only send response if there is a fault report pending */
mutex_lock(&fault_param->lock);
if (!list_empty(&group->pending_node)) {
ops->page_response(dev, &group->last_fault, &resp);
list_del_init(&group->pending_node);
}
mutex_unlock(&fault_param->lock);
}
EXPORT_SYMBOL_GPL(iopf_group_response);
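/*
 * A minimal sketch (not part of the patch) of a fault consumer, following
 * iommu_sva_handle_iopf() in iommu-sva.c: walk the group, resolve each
 * fault, then respond and free the group. example_resolve_fault() is a
 * hypothetical stand-in for the real page-fault handling.
 */
static void example_iopf_work(struct iopf_group *group)
{
	struct iopf_fault *iopf;
	enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;

	list_for_each_entry(iopf, &group->faults, list) {
		/* example_resolve_fault() is hypothetical. */
		status = example_resolve_fault(&iopf->fault);
		if (status != IOMMU_PAGE_RESP_SUCCESS)
			break;
	}

	iopf_group_response(group, status);
	iopf_free_group(group);
}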
/**
* iopf_queue_discard_partial - Remove all pending partial faults
* @queue: the queue whose partial faults need to be discarded
@ -261,18 +300,20 @@ EXPORT_SYMBOL_GPL(iopf_queue_flush_dev);
int iopf_queue_discard_partial(struct iopf_queue *queue)
{
struct iopf_fault *iopf, *next;
struct iopf_device_param *iopf_param;
struct iommu_fault_param *iopf_param;
if (!queue)
return -EINVAL;
mutex_lock(&queue->lock);
list_for_each_entry(iopf_param, &queue->devices, queue_list) {
mutex_lock(&iopf_param->lock);
list_for_each_entry_safe(iopf, next, &iopf_param->partial,
list) {
list_del(&iopf->list);
kfree(iopf);
}
mutex_unlock(&iopf_param->lock);
}
mutex_unlock(&queue->lock);
return 0;
@ -288,34 +329,42 @@ EXPORT_SYMBOL_GPL(iopf_queue_discard_partial);
*/
int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
{
int ret = -EBUSY;
struct iopf_device_param *iopf_param;
int ret = 0;
struct dev_iommu *param = dev->iommu;
struct iommu_fault_param *fault_param;
const struct iommu_ops *ops = dev_iommu_ops(dev);
if (!param)
if (!ops->page_response)
return -ENODEV;
iopf_param = kzalloc(sizeof(*iopf_param), GFP_KERNEL);
if (!iopf_param)
return -ENOMEM;
INIT_LIST_HEAD(&iopf_param->partial);
iopf_param->queue = queue;
iopf_param->dev = dev;
mutex_lock(&queue->lock);
mutex_lock(&param->lock);
if (!param->iopf_param) {
list_add(&iopf_param->queue_list, &queue->devices);
param->iopf_param = iopf_param;
ret = 0;
if (rcu_dereference_check(param->fault_param,
lockdep_is_held(&param->lock))) {
ret = -EBUSY;
goto done_unlock;
}
fault_param = kzalloc(sizeof(*fault_param), GFP_KERNEL);
if (!fault_param) {
ret = -ENOMEM;
goto done_unlock;
}
mutex_init(&fault_param->lock);
INIT_LIST_HEAD(&fault_param->faults);
INIT_LIST_HEAD(&fault_param->partial);
fault_param->dev = dev;
refcount_set(&fault_param->users, 1);
list_add(&fault_param->queue_list, &queue->devices);
fault_param->queue = queue;
rcu_assign_pointer(param->fault_param, fault_param);
done_unlock:
mutex_unlock(&param->lock);
mutex_unlock(&queue->lock);
if (ret)
kfree(iopf_param);
return ret;
}
EXPORT_SYMBOL_GPL(iopf_queue_add_device);
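/*
 * A minimal sketch (not part of the patch) of the enable side, mirroring
 * intel_iommu_enable_iopf(): attach the device to the IOPF queue before
 * enabling PRI so that page requests only arrive once the queue can take
 * them. EXAMPLE_PRQ_DEPTH and the function name are illustrative only.
 */
#define EXAMPLE_PRQ_DEPTH	32

static int example_enable_iopf(struct iopf_queue *queue, struct pci_dev *pdev)
{
	int ret;

	ret = iopf_queue_add_device(queue, &pdev->dev);
	if (ret)
		return ret;

	ret = pci_enable_pri(pdev, EXAMPLE_PRQ_DEPTH);
	if (ret)
		iopf_queue_remove_device(queue, &pdev->dev);

	return ret;
}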
@ -325,40 +374,66 @@ EXPORT_SYMBOL_GPL(iopf_queue_add_device);
* @queue: IOPF queue
* @dev: device to remove
*
* Caller makes sure that no more faults are reported for this device.
* Remove a device from an iopf_queue. It is recommended to follow these
* steps when removing a device:
*
* Return: 0 on success and <0 on error.
* - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware
* and flush any hardware page request queues. This should be done before
* calling into this helper.
* - Acknowledge all outstanding PRQs to the device: Respond to all outstanding
* page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should
* not retry. This helper function handles this.
* - Disable PRI on the device: After calling this helper, the caller could
* then disable PRI on the device.
*
* Calling iopf_queue_remove_device() essentially disassociates the device.
* The fault_param might still exist, but iommu_page_response() will do
* nothing. The device fault parameter reference count has been properly
* passed from iommu_report_device_fault() to the fault handling work, and
* will eventually be released after iommu_page_response().
*/
int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
{
int ret = -EINVAL;
struct iopf_fault *iopf, *next;
struct iopf_device_param *iopf_param;
struct iopf_fault *partial_iopf;
struct iopf_fault *next;
struct iopf_group *group, *temp;
struct dev_iommu *param = dev->iommu;
if (!param || !queue)
return -EINVAL;
struct iommu_fault_param *fault_param;
const struct iommu_ops *ops = dev_iommu_ops(dev);
mutex_lock(&queue->lock);
mutex_lock(&param->lock);
iopf_param = param->iopf_param;
if (iopf_param && iopf_param->queue == queue) {
list_del(&iopf_param->queue_list);
param->iopf_param = NULL;
ret = 0;
fault_param = rcu_dereference_check(param->fault_param,
lockdep_is_held(&param->lock));
if (WARN_ON(!fault_param || fault_param->queue != queue))
goto unlock;
mutex_lock(&fault_param->lock);
list_for_each_entry_safe(partial_iopf, next, &fault_param->partial, list)
kfree(partial_iopf);
list_for_each_entry_safe(group, temp, &fault_param->faults, pending_node) {
struct iopf_fault *iopf = &group->last_fault;
struct iommu_page_response resp = {
.pasid = iopf->fault.prm.pasid,
.grpid = iopf->fault.prm.grpid,
.code = IOMMU_PAGE_RESP_INVALID
};
ops->page_response(dev, iopf, &resp);
list_del_init(&group->pending_node);
}
mutex_unlock(&fault_param->lock);
list_del(&fault_param->queue_list);
/* dec the ref owned by iopf_queue_add_device() */
rcu_assign_pointer(param->fault_param, NULL);
iopf_put_dev_fault_param(fault_param);
unlock:
mutex_unlock(&param->lock);
mutex_unlock(&queue->lock);
if (ret)
return ret;
/* Just in case some faults are still stuck */
list_for_each_entry_safe(iopf, next, &iopf_param->partial, list)
kfree(iopf);
kfree(iopf_param);
return 0;
}
EXPORT_SYMBOL_GPL(iopf_queue_remove_device);
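/*
 * A minimal sketch (not part of the patch) of the teardown order described
 * in the kernel-doc above. Step 1 is driver specific and only shown as a
 * comment; the function name is illustrative only.
 */
static void example_disable_iopf(struct iopf_queue *queue, struct pci_dev *pdev)
{
	/*
	 * 1. Driver specific: PRI generation has already been turned off in
	 *    the IOMMU hardware and the hardware page request queue drained.
	 */

	/* 2. Ack anything still outstanding and detach from the queue. */
	iopf_queue_remove_device(queue, &pdev->dev);

	/* 3. Finally turn off PRI on the device itself. */
	pci_disable_pri(pdev);
}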
@ -404,7 +479,7 @@ EXPORT_SYMBOL_GPL(iopf_queue_alloc);
*/
void iopf_queue_free(struct iopf_queue *queue)
{
struct iopf_device_param *iopf_param, *next;
struct iommu_fault_param *iopf_param, *next;
if (!queue)
return;

View File

@ -21,10 +21,11 @@ int iommu_group_replace_domain(struct iommu_group *group,
struct iommu_domain *new_domain);
int iommu_device_register_bus(struct iommu_device *iommu,
const struct iommu_ops *ops, struct bus_type *bus,
const struct iommu_ops *ops,
const struct bus_type *bus,
struct notifier_block *nb);
void iommu_device_unregister_bus(struct iommu_device *iommu,
struct bus_type *bus,
const struct bus_type *bus,
struct notifier_block *nb);
#endif /* __LINUX_IOMMU_PRIV_H */

View File

@ -7,7 +7,7 @@
#include <linux/sched/mm.h>
#include <linux/iommu.h>
#include "iommu-sva.h"
#include "iommu-priv.h"
static DEFINE_MUTEX(iommu_sva_lock);
@ -176,15 +176,25 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle)
}
EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
void mm_pasid_drop(struct mm_struct *mm)
{
struct iommu_mm_data *iommu_mm = mm->iommu_mm;
if (!iommu_mm)
return;
iommu_free_global_pasid(iommu_mm->pasid);
kfree(iommu_mm);
}
/*
* I/O page fault handler for SVA
*/
enum iommu_page_response_code
iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
static enum iommu_page_response_code
iommu_sva_handle_mm(struct iommu_fault *fault, struct mm_struct *mm)
{
vm_fault_t ret;
struct vm_area_struct *vma;
struct mm_struct *mm = data;
unsigned int access_flags = 0;
unsigned int fault_flags = FAULT_FLAG_REMOTE;
struct iommu_fault_page_request *prm = &fault->prm;
@ -234,13 +244,54 @@ out_put_mm:
return status;
}
void mm_pasid_drop(struct mm_struct *mm)
static void iommu_sva_handle_iopf(struct work_struct *work)
{
struct iommu_mm_data *iommu_mm = mm->iommu_mm;
struct iopf_fault *iopf;
struct iopf_group *group;
enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;
if (!iommu_mm)
return;
group = container_of(work, struct iopf_group, work);
list_for_each_entry(iopf, &group->faults, list) {
/*
* For the moment, errors are sticky: don't handle subsequent
* faults in the group if there is an error.
*/
if (status != IOMMU_PAGE_RESP_SUCCESS)
break;
iommu_free_global_pasid(iommu_mm->pasid);
kfree(iommu_mm);
status = iommu_sva_handle_mm(&iopf->fault, group->domain->mm);
}
iopf_group_response(group, status);
iopf_free_group(group);
}
static int iommu_sva_iopf_handler(struct iopf_group *group)
{
struct iommu_fault_param *fault_param = group->fault_param;
INIT_WORK(&group->work, iommu_sva_handle_iopf);
if (!queue_work(fault_param->queue->wq, &group->work))
return -EBUSY;
return 0;
}
struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm)
{
const struct iommu_ops *ops = dev_iommu_ops(dev);
struct iommu_domain *domain;
domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
if (!domain)
return NULL;
domain->type = IOMMU_DOMAIN_SVA;
mmgrab(mm);
domain->mm = mm;
domain->owner = ops;
domain->iopf_handler = iommu_sva_iopf_handler;
return domain;
}
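/*
 * A minimal sketch (not part of the patch) of how a device driver consumes
 * the SVA domain allocated above, assuming the usual iommu_sva_bind_device()
 * / iommu_sva_get_pasid() / iommu_sva_unbind_device() API and current->mm as
 * the address space to share. The function name is illustrative only.
 */
static int example_use_sva(struct device *dev)
{
	struct iommu_sva *handle;
	u32 pasid;

	handle = iommu_sva_bind_device(dev, current->mm);
	if (IS_ERR(handle))
		return PTR_ERR(handle);

	pasid = iommu_sva_get_pasid(handle);
	/* ... program the device to tag its DMA with this PASID ... */

	iommu_sva_unbind_device(handle);
	return 0;
}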

View File

@ -1,71 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* SVA library for IOMMU drivers
*/
#ifndef _IOMMU_SVA_H
#define _IOMMU_SVA_H
#include <linux/mm_types.h>
/* I/O Page fault */
struct device;
struct iommu_fault;
struct iopf_queue;
#ifdef CONFIG_IOMMU_SVA
int iommu_queue_iopf(struct iommu_fault *fault, void *cookie);
int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev);
int iopf_queue_remove_device(struct iopf_queue *queue,
struct device *dev);
int iopf_queue_flush_dev(struct device *dev);
struct iopf_queue *iopf_queue_alloc(const char *name);
void iopf_queue_free(struct iopf_queue *queue);
int iopf_queue_discard_partial(struct iopf_queue *queue);
enum iommu_page_response_code
iommu_sva_handle_iopf(struct iommu_fault *fault, void *data);
#else /* CONFIG_IOMMU_SVA */
static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
{
return -ENODEV;
}
static inline int iopf_queue_add_device(struct iopf_queue *queue,
struct device *dev)
{
return -ENODEV;
}
static inline int iopf_queue_remove_device(struct iopf_queue *queue,
struct device *dev)
{
return -ENODEV;
}
static inline int iopf_queue_flush_dev(struct device *dev)
{
return -ENODEV;
}
static inline struct iopf_queue *iopf_queue_alloc(const char *name)
{
return NULL;
}
static inline void iopf_queue_free(struct iopf_queue *queue)
{
}
static inline int iopf_queue_discard_partial(struct iopf_queue *queue)
{
return -ENODEV;
}
static inline enum iommu_page_response_code
iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
{
return IOMMU_PAGE_RESP_INVALID;
}
#endif /* CONFIG_IOMMU_SVA */
#endif /* _IOMMU_SVA_H */

View File

@ -36,8 +36,6 @@
#include "dma-iommu.h"
#include "iommu-priv.h"
#include "iommu-sva.h"
static struct kset *iommu_group_kset;
static DEFINE_IDA(iommu_group_ida);
static DEFINE_IDA(iommu_global_pasid_ida);
@ -291,7 +289,7 @@ EXPORT_SYMBOL_GPL(iommu_device_unregister);
#if IS_ENABLED(CONFIG_IOMMUFD_TEST)
void iommu_device_unregister_bus(struct iommu_device *iommu,
struct bus_type *bus,
const struct bus_type *bus,
struct notifier_block *nb)
{
bus_unregister_notifier(bus, nb);
@ -305,7 +303,8 @@ EXPORT_SYMBOL_GPL(iommu_device_unregister_bus);
* some memory to hold a notifier_block.
*/
int iommu_device_register_bus(struct iommu_device *iommu,
const struct iommu_ops *ops, struct bus_type *bus,
const struct iommu_ops *ops,
const struct bus_type *bus,
struct notifier_block *nb)
{
int err;
@ -463,13 +462,24 @@ static void iommu_deinit_device(struct device *dev)
/*
* release_device() must stop using any attached domain on the device.
* If there are still other devices in the group they are not effected
* If there are still other devices in the group, they are not affected
* by this callback.
*
* The IOMMU driver must set the device to either an identity or
* blocking translation and stop using any domain pointer, as it is
* going to be freed.
* If the iommu driver provides release_domain, the core code ensures
* that domain is attached prior to calling release_device. Drivers can
* use this to enforce a translation on the idle iommu. Typically, the
* global static blocked_domain is a good choice.
*
* Otherwise, the iommu driver must set the device to either an identity
* or a blocking translation in release_device() and stop using any
* domain pointer, as it is going to be freed.
*
* Regardless, if a delayed attach never occurred, then the release
* should still avoid touching any hardware configuration either.
*/
if (!dev->iommu->attach_deferred && ops->release_domain)
ops->release_domain->ops->attach_dev(ops->release_domain, dev);
if (ops->release_device)
ops->release_device(dev);
@ -1248,6 +1258,25 @@ void iommu_group_remove_device(struct device *dev)
}
EXPORT_SYMBOL_GPL(iommu_group_remove_device);
#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU_API)
/**
* iommu_group_mutex_assert - Check device group mutex lock
* @dev: the device that has group param set
*
* This function is called by an iommu driver to check whether it holds
* the group mutex lock for the given device.
*
* Note that this function must be called after device group param is set.
*/
void iommu_group_mutex_assert(struct device *dev)
{
struct iommu_group *group = dev->iommu_group;
lockdep_assert_held(&group->mutex);
}
EXPORT_SYMBOL_GPL(iommu_group_mutex_assert);
#endif
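/*
 * A minimal sketch (not part of the patch) of the intended use: a helper
 * that relies on the caller holding the group mutex can enforce that with
 * the assertion above. The function name and body are illustrative only.
 */
static void example_update_group_state(struct device *dev)
{
	iommu_group_mutex_assert(dev);

	/* ... safe to touch per-group state owned by dev->iommu_group ... */
}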
static struct device *iommu_group_first_dev(struct iommu_group *group)
{
lockdep_assert_held(&group->mutex);
@ -1330,217 +1359,6 @@ void iommu_group_put(struct iommu_group *group)
}
EXPORT_SYMBOL_GPL(iommu_group_put);
/**
* iommu_register_device_fault_handler() - Register a device fault handler
* @dev: the device
* @handler: the fault handler
* @data: private data passed as argument to the handler
*
* When an IOMMU fault event is received, this handler gets called with the
* fault event and data as argument. The handler should return 0 on success. If
* the fault is recoverable (IOMMU_FAULT_PAGE_REQ), the consumer should also
* complete the fault by calling iommu_page_response() with one of the following
* response code:
* - IOMMU_PAGE_RESP_SUCCESS: retry the translation
* - IOMMU_PAGE_RESP_INVALID: terminate the fault
* - IOMMU_PAGE_RESP_FAILURE: terminate the fault and stop reporting
* page faults if possible.
*
* Return 0 if the fault handler was installed successfully, or an error.
*/
int iommu_register_device_fault_handler(struct device *dev,
iommu_dev_fault_handler_t handler,
void *data)
{
struct dev_iommu *param = dev->iommu;
int ret = 0;
if (!param)
return -EINVAL;
mutex_lock(&param->lock);
/* Only allow one fault handler registered for each device */
if (param->fault_param) {
ret = -EBUSY;
goto done_unlock;
}
get_device(dev);
param->fault_param = kzalloc(sizeof(*param->fault_param), GFP_KERNEL);
if (!param->fault_param) {
put_device(dev);
ret = -ENOMEM;
goto done_unlock;
}
param->fault_param->handler = handler;
param->fault_param->data = data;
mutex_init(&param->fault_param->lock);
INIT_LIST_HEAD(&param->fault_param->faults);
done_unlock:
mutex_unlock(&param->lock);
return ret;
}
EXPORT_SYMBOL_GPL(iommu_register_device_fault_handler);
/**
* iommu_unregister_device_fault_handler() - Unregister the device fault handler
* @dev: the device
*
* Remove the device fault handler installed with
* iommu_register_device_fault_handler().
*
* Return 0 on success, or an error.
*/
int iommu_unregister_device_fault_handler(struct device *dev)
{
struct dev_iommu *param = dev->iommu;
int ret = 0;
if (!param)
return -EINVAL;
mutex_lock(&param->lock);
if (!param->fault_param)
goto unlock;
/* we cannot unregister handler if there are pending faults */
if (!list_empty(&param->fault_param->faults)) {
ret = -EBUSY;
goto unlock;
}
kfree(param->fault_param);
param->fault_param = NULL;
put_device(dev);
unlock:
mutex_unlock(&param->lock);
return ret;
}
EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler);
/**
* iommu_report_device_fault() - Report fault event to device driver
* @dev: the device
* @evt: fault event data
*
* Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ
* handler. When this function fails and the fault is recoverable, it is the
* caller's responsibility to complete the fault.
*
* Return 0 on success, or an error.
*/
int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt)
{
struct dev_iommu *param = dev->iommu;
struct iommu_fault_event *evt_pending = NULL;
struct iommu_fault_param *fparam;
int ret = 0;
if (!param || !evt)
return -EINVAL;
/* we only report device fault if there is a handler registered */
mutex_lock(&param->lock);
fparam = param->fault_param;
if (!fparam || !fparam->handler) {
ret = -EINVAL;
goto done_unlock;
}
if (evt->fault.type == IOMMU_FAULT_PAGE_REQ &&
(evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) {
evt_pending = kmemdup(evt, sizeof(struct iommu_fault_event),
GFP_KERNEL);
if (!evt_pending) {
ret = -ENOMEM;
goto done_unlock;
}
mutex_lock(&fparam->lock);
list_add_tail(&evt_pending->list, &fparam->faults);
mutex_unlock(&fparam->lock);
}
ret = fparam->handler(&evt->fault, fparam->data);
if (ret && evt_pending) {
mutex_lock(&fparam->lock);
list_del(&evt_pending->list);
mutex_unlock(&fparam->lock);
kfree(evt_pending);
}
done_unlock:
mutex_unlock(&param->lock);
return ret;
}
EXPORT_SYMBOL_GPL(iommu_report_device_fault);
int iommu_page_response(struct device *dev,
struct iommu_page_response *msg)
{
bool needs_pasid;
int ret = -EINVAL;
struct iommu_fault_event *evt;
struct iommu_fault_page_request *prm;
struct dev_iommu *param = dev->iommu;
const struct iommu_ops *ops = dev_iommu_ops(dev);
bool has_pasid = msg->flags & IOMMU_PAGE_RESP_PASID_VALID;
if (!ops->page_response)
return -ENODEV;
if (!param || !param->fault_param)
return -EINVAL;
if (msg->version != IOMMU_PAGE_RESP_VERSION_1 ||
msg->flags & ~IOMMU_PAGE_RESP_PASID_VALID)
return -EINVAL;
/* Only send response if there is a fault report pending */
mutex_lock(&param->fault_param->lock);
if (list_empty(&param->fault_param->faults)) {
dev_warn_ratelimited(dev, "no pending PRQ, drop response\n");
goto done_unlock;
}
/*
* Check if we have a matching page request pending to respond,
* otherwise return -EINVAL
*/
list_for_each_entry(evt, &param->fault_param->faults, list) {
prm = &evt->fault.prm;
if (prm->grpid != msg->grpid)
continue;
/*
* If the PASID is required, the corresponding request is
* matched using the group ID, the PASID valid bit and the PASID
* value. Otherwise only the group ID matches request and
* response.
*/
needs_pasid = prm->flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID;
if (needs_pasid && (!has_pasid || msg->pasid != prm->pasid))
continue;
if (!needs_pasid && has_pasid) {
/* No big deal, just clear it. */
msg->flags &= ~IOMMU_PAGE_RESP_PASID_VALID;
msg->pasid = 0;
}
ret = ops->page_response(dev, evt, msg);
list_del(&evt->list);
kfree(evt);
break;
}
done_unlock:
mutex_unlock(&param->fault_param->lock);
return ret;
}
EXPORT_SYMBOL_GPL(iommu_page_response);
/**
* iommu_group_id - Return ID for a group
* @group: the group to ID
@ -2986,7 +2804,7 @@ bool iommu_default_passthrough(void)
}
EXPORT_SYMBOL_GPL(iommu_default_passthrough);
const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
const struct iommu_ops *iommu_ops_from_fwnode(const struct fwnode_handle *fwnode)
{
const struct iommu_ops *ops = NULL;
struct iommu_device *iommu;
@ -3037,7 +2855,7 @@ void iommu_fwspec_free(struct device *dev)
}
EXPORT_SYMBOL_GPL(iommu_fwspec_free);
int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
int iommu_fwspec_add_ids(struct device *dev, const u32 *ids, int num_ids)
{
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
int i, new_num;
@ -3623,26 +3441,6 @@ struct iommu_domain *iommu_get_domain_for_dev_pasid(struct device *dev,
}
EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev_pasid);
struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm)
{
const struct iommu_ops *ops = dev_iommu_ops(dev);
struct iommu_domain *domain;
domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
if (!domain)
return NULL;
domain->type = IOMMU_DOMAIN_SVA;
mmgrab(mm);
domain->mm = mm;
domain->owner = ops;
domain->iopf_handler = iommu_sva_handle_iopf;
domain->fault_data = mm;
return domain;
}
ioasid_t iommu_alloc_global_pasid(struct device *dev)
{
int ret;

View File

@ -24,24 +24,8 @@ static bool iova_rcache_insert(struct iova_domain *iovad,
static unsigned long iova_rcache_get(struct iova_domain *iovad,
unsigned long size,
unsigned long limit_pfn);
static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
static void free_iova_rcaches(struct iova_domain *iovad);
unsigned long iova_rcache_range(void)
{
return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
}
static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
{
struct iova_domain *iovad;
iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead);
free_cpu_cached_iovas(cpu, iovad);
return 0;
}
static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
static void free_global_cached_iovas(struct iova_domain *iovad);
static struct iova *to_iova(struct rb_node *node)
@ -252,54 +236,6 @@ static void free_iova_mem(struct iova *iova)
kmem_cache_free(iova_cache, iova);
}
int iova_cache_get(void)
{
mutex_lock(&iova_cache_mutex);
if (!iova_cache_users) {
int ret;
ret = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, "iommu/iova:dead", NULL,
iova_cpuhp_dead);
if (ret) {
mutex_unlock(&iova_cache_mutex);
pr_err("Couldn't register cpuhp handler\n");
return ret;
}
iova_cache = kmem_cache_create(
"iommu_iova", sizeof(struct iova), 0,
SLAB_HWCACHE_ALIGN, NULL);
if (!iova_cache) {
cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD);
mutex_unlock(&iova_cache_mutex);
pr_err("Couldn't create iova cache\n");
return -ENOMEM;
}
}
iova_cache_users++;
mutex_unlock(&iova_cache_mutex);
return 0;
}
EXPORT_SYMBOL_GPL(iova_cache_get);
void iova_cache_put(void)
{
mutex_lock(&iova_cache_mutex);
if (WARN_ON(!iova_cache_users)) {
mutex_unlock(&iova_cache_mutex);
return;
}
iova_cache_users--;
if (!iova_cache_users) {
cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD);
kmem_cache_destroy(iova_cache);
}
mutex_unlock(&iova_cache_mutex);
}
EXPORT_SYMBOL_GPL(iova_cache_put);
/**
* alloc_iova - allocates an iova
* @iovad: - iova domain in question
@ -654,11 +590,18 @@ struct iova_rcache {
struct delayed_work work;
};
static struct kmem_cache *iova_magazine_cache;
unsigned long iova_rcache_range(void)
{
return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
}
static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
{
struct iova_magazine *mag;
mag = kmalloc(sizeof(*mag), flags);
mag = kmem_cache_alloc(iova_magazine_cache, flags);
if (mag)
mag->size = 0;
@ -667,7 +610,7 @@ static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
static void iova_magazine_free(struct iova_magazine *mag)
{
kfree(mag);
kmem_cache_free(iova_magazine_cache, mag);
}
static void
@ -990,5 +933,71 @@ static void free_global_cached_iovas(struct iova_domain *iovad)
spin_unlock_irqrestore(&rcache->lock, flags);
}
}
static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
{
struct iova_domain *iovad;
iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead);
free_cpu_cached_iovas(cpu, iovad);
return 0;
}
int iova_cache_get(void)
{
int err = -ENOMEM;
mutex_lock(&iova_cache_mutex);
if (!iova_cache_users) {
iova_cache = kmem_cache_create("iommu_iova", sizeof(struct iova), 0,
SLAB_HWCACHE_ALIGN, NULL);
if (!iova_cache)
goto out_err;
iova_magazine_cache = kmem_cache_create("iommu_iova_magazine",
sizeof(struct iova_magazine),
0, SLAB_HWCACHE_ALIGN, NULL);
if (!iova_magazine_cache)
goto out_err;
err = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, "iommu/iova:dead",
NULL, iova_cpuhp_dead);
if (err) {
pr_err("IOVA: Couldn't register cpuhp handler: %pe\n", ERR_PTR(err));
goto out_err;
}
}
iova_cache_users++;
mutex_unlock(&iova_cache_mutex);
return 0;
out_err:
kmem_cache_destroy(iova_cache);
kmem_cache_destroy(iova_magazine_cache);
mutex_unlock(&iova_cache_mutex);
return err;
}
EXPORT_SYMBOL_GPL(iova_cache_get);
void iova_cache_put(void)
{
mutex_lock(&iova_cache_mutex);
if (WARN_ON(!iova_cache_users)) {
mutex_unlock(&iova_cache_mutex);
return;
}
iova_cache_users--;
if (!iova_cache_users) {
cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD);
kmem_cache_destroy(iova_cache);
kmem_cache_destroy(iova_magazine_cache);
}
mutex_unlock(&iova_cache_mutex);
}
EXPORT_SYMBOL_GPL(iova_cache_put);
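/*
 * A minimal sketch (not part of the patch) of the usual lifetime pairing for
 * the caches above. The iova-domain calls (init_iova_domain(),
 * alloc_iova_fast(), free_iova_fast(), put_iova_domain()) and the constants
 * are assumed from the existing IOVA API and shown only to place
 * iova_cache_get()/iova_cache_put() in context.
 */
static int example_iova_user(struct iova_domain *iovad)
{
	unsigned long pfn;
	int ret;

	ret = iova_cache_get();
	if (ret)
		return ret;

	init_iova_domain(iovad, SZ_4K, 1);

	pfn = alloc_iova_fast(iovad, 16, DMA_BIT_MASK(32) >> PAGE_SHIFT, true);
	if (pfn)
		free_iova_fast(iovad, pfn, 16);

	put_iova_domain(iovad);
	iova_cache_put();
	return 0;
}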
MODULE_AUTHOR("Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>");
MODULE_LICENSE("GPL");

View File

@ -709,7 +709,7 @@ static phys_addr_t ipmmu_iova_to_phys(struct iommu_domain *io_domain,
}
static int ipmmu_init_platform_device(struct device *dev,
struct of_phandle_args *args)
const struct of_phandle_args *args)
{
struct platform_device *ipmmu_pdev;
@ -773,7 +773,7 @@ static bool ipmmu_device_is_allowed(struct device *dev)
}
static int ipmmu_of_xlate(struct device *dev,
struct of_phandle_args *spec)
const struct of_phandle_args *spec)
{
if (!ipmmu_device_is_allowed(dev))
return -ENODEV;
@ -1005,7 +1005,6 @@ static const struct of_device_id ipmmu_of_ids[] = {
static int ipmmu_probe(struct platform_device *pdev)
{
struct ipmmu_vmsa_device *mmu;
struct resource *res;
int irq;
int ret;
@ -1025,8 +1024,7 @@ static int ipmmu_probe(struct platform_device *pdev)
return ret;
/* Map I/O memory and request IRQ. */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
mmu->base = devm_ioremap_resource(&pdev->dev, res);
mmu->base = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(mmu->base))
return PTR_ERR(mmu->base);
@ -1123,7 +1121,6 @@ static void ipmmu_remove(struct platform_device *pdev)
ipmmu_device_reset(mmu);
}
#ifdef CONFIG_PM_SLEEP
static int ipmmu_resume_noirq(struct device *dev)
{
struct ipmmu_vmsa_device *mmu = dev_get_drvdata(dev);
@ -1153,18 +1150,14 @@ static int ipmmu_resume_noirq(struct device *dev)
}
static const struct dev_pm_ops ipmmu_pm = {
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(NULL, ipmmu_resume_noirq)
NOIRQ_SYSTEM_SLEEP_PM_OPS(NULL, ipmmu_resume_noirq)
};
#define DEV_PM_OPS &ipmmu_pm
#else
#define DEV_PM_OPS NULL
#endif /* CONFIG_PM_SLEEP */
static struct platform_driver ipmmu_driver = {
.driver = {
.name = "ipmmu-vmsa",
.of_match_table = of_match_ptr(ipmmu_of_ids),
.pm = DEV_PM_OPS,
.of_match_table = ipmmu_of_ids,
.pm = pm_sleep_ptr(&ipmmu_pm),
},
.probe = ipmmu_probe,
.remove_new = ipmmu_remove,

View File

@ -99,7 +99,8 @@ int __init irq_remapping_prepare(void)
if (disable_irq_remap)
return -ENOSYS;
if (intel_irq_remap_ops.prepare() == 0)
if (IS_ENABLED(CONFIG_INTEL_IOMMU) &&
intel_irq_remap_ops.prepare() == 0)
remap_ops = &intel_irq_remap_ops;
else if (IS_ENABLED(CONFIG_AMD_IOMMU) &&
amd_iommu_irq_ops.prepare() == 0)

View File

@ -598,7 +598,7 @@ static void print_ctx_regs(void __iomem *base, int ctx)
static int insert_iommu_master(struct device *dev,
struct msm_iommu_dev **iommu,
struct of_phandle_args *spec)
const struct of_phandle_args *spec)
{
struct msm_iommu_ctx_dev *master = dev_iommu_priv_get(dev);
int sid;
@ -626,7 +626,7 @@ static int insert_iommu_master(struct device *dev,
}
static int qcom_iommu_of_xlate(struct device *dev,
struct of_phandle_args *spec)
const struct of_phandle_args *spec)
{
struct msm_iommu_dev *iommu = NULL, *iter;
unsigned long flags;

View File

@ -957,7 +957,8 @@ static struct iommu_group *mtk_iommu_device_group(struct device *dev)
return group;
}
static int mtk_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
static int mtk_iommu_of_xlate(struct device *dev,
const struct of_phandle_args *args)
{
struct platform_device *m4updev;
@ -1264,7 +1265,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
data->plat_data = of_device_get_match_data(dev);
/* Protect memory. HW will access this region on a translation fault. */
protect = devm_kzalloc(dev, MTK_PROTECT_PA_ALIGN * 2, GFP_KERNEL);
protect = devm_kcalloc(dev, 2, MTK_PROTECT_PA_ALIGN, GFP_KERNEL);
if (!protect)
return -ENOMEM;
data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);

View File

@ -398,7 +398,8 @@ static const struct iommu_ops mtk_iommu_v1_ops;
* MTK generation one IOMMU HW only supports one IOMMU domain, and all the
* clients share the same IOVA address space.
*/
static int mtk_iommu_v1_create_mapping(struct device *dev, struct of_phandle_args *args)
static int mtk_iommu_v1_create_mapping(struct device *dev,
const struct of_phandle_args *args)
{
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_v1_data *data;
@ -621,8 +622,8 @@ static int mtk_iommu_v1_probe(struct platform_device *pdev)
data->dev = dev;
/* Protect memory. HW will access here on translation fault. */
protect = devm_kzalloc(dev, MTK_PROTECT_PA_ALIGN * 2,
GFP_KERNEL | GFP_DMA);
protect = devm_kcalloc(dev, 2, MTK_PROTECT_PA_ALIGN,
GFP_KERNEL | GFP_DMA);
if (!protect)
return -ENOMEM;
data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);

View File

@ -29,7 +29,7 @@ static int of_iommu_xlate(struct device *dev,
!of_device_is_available(iommu_spec->np))
return -ENODEV;
ret = iommu_fwspec_init(dev, &iommu_spec->np->fwnode, ops);
ret = iommu_fwspec_init(dev, fwnode, ops);
if (ret)
return ret;
/*

View File

@ -1140,7 +1140,7 @@ static void rk_iommu_release_device(struct device *dev)
}
static int rk_iommu_of_xlate(struct device *dev,
struct of_phandle_args *args)
const struct of_phandle_args *args)
{
struct platform_device *iommu_dev;
struct rk_iommudata *data;

View File

@ -390,7 +390,8 @@ static struct iommu_device *sprd_iommu_probe_device(struct device *dev)
return &sdev->iommu;
}
static int sprd_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
static int sprd_iommu_of_xlate(struct device *dev,
const struct of_phandle_args *args)
{
struct platform_device *pdev;

View File

@ -819,7 +819,7 @@ static struct iommu_device *sun50i_iommu_probe_device(struct device *dev)
}
static int sun50i_iommu_of_xlate(struct device *dev,
struct of_phandle_args *args)
const struct of_phandle_args *args)
{
struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
unsigned id = args->args[0];

View File

@ -830,7 +830,7 @@ static struct tegra_smmu *tegra_smmu_find(struct device_node *np)
}
static int tegra_smmu_configure(struct tegra_smmu *smmu, struct device *dev,
struct of_phandle_args *args)
const struct of_phandle_args *args)
{
const struct iommu_ops *ops = smmu->iommu.ops;
int err;
@ -959,7 +959,7 @@ static struct iommu_group *tegra_smmu_device_group(struct device *dev)
}
static int tegra_smmu_of_xlate(struct device *dev,
struct of_phandle_args *args)
const struct of_phandle_args *args)
{
struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);

View File

@ -1051,7 +1051,8 @@ static struct iommu_group *viommu_device_group(struct device *dev)
return generic_device_group(dev);
}
static int viommu_of_xlate(struct device *dev, struct of_phandle_args *args)
static int viommu_of_xlate(struct device *dev,
const struct of_phandle_args *args)
{
return iommu_fwspec_add_ids(dev, args->args, 1);
}

View File

@ -368,11 +368,6 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
return 0;
}
static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
{
return dev->error_state == pci_channel_io_perm_failure;
}
/* pci_dev priv_flags */
#define PCI_DEV_ADDED 0
#define PCI_DPC_RECOVERED 1

View File

@ -14,7 +14,6 @@
#include <linux/err.h>
#include <linux/of.h>
#include <linux/iova_bitmap.h>
#include <uapi/linux/iommu.h>
#define IOMMU_READ (1 << 0)
#define IOMMU_WRITE (1 << 1)
@ -41,8 +40,110 @@ struct iommu_domain_ops;
struct iommu_dirty_ops;
struct notifier_block;
struct iommu_sva;
struct iommu_fault_event;
struct iommu_dma_cookie;
struct iommu_fault_param;
#define IOMMU_FAULT_PERM_READ (1 << 0) /* read */
#define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */
#define IOMMU_FAULT_PERM_EXEC (1 << 2) /* exec */
#define IOMMU_FAULT_PERM_PRIV (1 << 3) /* privileged */
/* Generic fault types, can be expanded for IRQ remapping faults */
enum iommu_fault_type {
IOMMU_FAULT_PAGE_REQ = 1, /* page request fault */
};
/**
* struct iommu_fault_page_request - Page Request data
* @flags: encodes whether the corresponding fields are valid and whether this
* is the last page in group (IOMMU_FAULT_PAGE_REQUEST_* values).
* When IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID is set, the page response
* must have the same PASID value as the page request. When it is clear,
* the page response should not have a PASID.
* @pasid: Process Address Space ID
* @grpid: Page Request Group Index
* @perm: requested page permissions (IOMMU_FAULT_PERM_* values)
* @addr: page address
* @private_data: device-specific private information
*/
struct iommu_fault_page_request {
#define IOMMU_FAULT_PAGE_REQUEST_PASID_VALID (1 << 0)
#define IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE (1 << 1)
#define IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA (1 << 2)
#define IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID (1 << 3)
u32 flags;
u32 pasid;
u32 grpid;
u32 perm;
u64 addr;
u64 private_data[2];
};
/**
* struct iommu_fault - Generic fault data
* @type: fault type from &enum iommu_fault_type
* @prm: Page Request message, when @type is %IOMMU_FAULT_PAGE_REQ
*/
struct iommu_fault {
u32 type;
struct iommu_fault_page_request prm;
};
/**
* enum iommu_page_response_code - Return status of fault handlers
* @IOMMU_PAGE_RESP_SUCCESS: Fault has been handled and the page tables
* populated, retry the access. This is "Success" in PCI PRI.
* @IOMMU_PAGE_RESP_FAILURE: General error. Drop all subsequent faults from
* this device if possible. This is "Response Failure" in PCI PRI.
* @IOMMU_PAGE_RESP_INVALID: Could not handle this fault, don't retry the
* access. This is "Invalid Request" in PCI PRI.
*/
enum iommu_page_response_code {
IOMMU_PAGE_RESP_SUCCESS = 0,
IOMMU_PAGE_RESP_INVALID,
IOMMU_PAGE_RESP_FAILURE,
};
/**
* struct iommu_page_response - Generic page response information
* @pasid: Process Address Space ID
* @grpid: Page Request Group Index
* @code: response code from &enum iommu_page_response_code
*/
struct iommu_page_response {
u32 pasid;
u32 grpid;
u32 code;
};
struct iopf_fault {
struct iommu_fault fault;
/* node for pending lists */
struct list_head list;
};
struct iopf_group {
struct iopf_fault last_fault;
struct list_head faults;
/* list node for iommu_fault_param::faults */
struct list_head pending_node;
struct work_struct work;
struct iommu_domain *domain;
/* The device's fault data parameter. */
struct iommu_fault_param *fault_param;
};
/**
* struct iopf_queue - IO Page Fault queue
* @wq: the fault workqueue
* @devices: devices attached to this queue
* @lock: protects the device list
*/
struct iopf_queue {
struct workqueue_struct *wq;
struct list_head devices;
struct mutex lock;
};
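
struct iopf_queue above is the shared container an IOMMU driver hands its fault-capable devices; the iopf_queue_* helpers declared near the end of this header operate on it. A minimal sketch of the driver-side setup, with my_iommu as a hypothetical per-instance structure:

#include <linux/errno.h>
#include <linux/iommu.h>

struct my_iommu {
    struct iopf_queue *iopf_queue;    /* hypothetical per-IOMMU instance data */
};

static int my_iommu_init_iopf(struct my_iommu *iommu)
{
    /* one workqueue-backed fault queue per IOMMU instance */
    iommu->iopf_queue = iopf_queue_alloc("my-iommu-iopf");
    if (!iommu->iopf_queue)
        return -ENOMEM;
    return 0;
}

static int my_iommu_enable_iopf(struct my_iommu *iommu, struct device *dev)
{
    /* attach an endpoint that may generate recoverable page faults */
    return iopf_queue_add_device(iommu->iopf_queue, dev);
}
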
/* iommu fault flags */
#define IOMMU_FAULT_READ 0x0
@ -50,7 +151,6 @@ struct iommu_dma_cookie;
typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
struct device *, unsigned long, int, void *);
typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault *, void *);
struct iommu_domain_geometry {
dma_addr_t aperture_start; /* First address that can be mapped */
@ -110,8 +210,7 @@ struct iommu_domain {
unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */
struct iommu_domain_geometry geometry;
struct iommu_dma_cookie *iova_cookie;
enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault,
void *data);
int (*iopf_handler)(struct iopf_group *group);
void *fault_data;
union {
struct {
@ -468,16 +567,15 @@ struct iommu_ops {
/* Request/Free a list of reserved regions for a device */
void (*get_resv_regions)(struct device *dev, struct list_head *list);
int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
int (*of_xlate)(struct device *dev, const struct of_phandle_args *args);
bool (*is_attach_deferred)(struct device *dev);
/* Per device IOMMU features */
int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f);
int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f);
int (*page_response)(struct device *dev,
struct iommu_fault_event *evt,
struct iommu_page_response *msg);
void (*page_response)(struct device *dev, struct iopf_fault *evt,
struct iommu_page_response *msg);
int (*def_domain_type)(struct device *dev);
void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid);
@ -487,6 +585,7 @@ struct iommu_ops {
struct module *owner;
struct iommu_domain *identity_domain;
struct iommu_domain *blocked_domain;
struct iommu_domain *release_domain;
struct iommu_domain *default_domain;
};
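
The of_xlate member above now takes a const struct of_phandle_args, which matches the driver-side conversions earlier in this diff. A minimal sketch of an implementation under the new prototype (my_of_xlate and the one-cell binding are assumptions, modeled on the virtio-iommu hunk above):

#include <linux/errno.h>
#include <linux/iommu.h>
#include <linux/of.h>

static int my_of_xlate(struct device *dev,
                       const struct of_phandle_args *args)
{
    /* assume a single specifier cell carrying the master/stream ID */
    if (args->args_count != 1)
        return -EINVAL;

    /* stash the ID in the device's iommu_fwspec for later attach */
    return iommu_fwspec_add_ids(dev, args->args, 1);
}
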
@ -577,39 +676,35 @@ struct iommu_device {
u32 max_pasids;
};
/**
* struct iommu_fault_event - Generic fault event
*
* Can represent recoverable faults such as a page requests or
* unrecoverable faults such as DMA or IRQ remapping faults.
*
* @fault: fault descriptor
* @list: pending fault event list, used for tracking responses
*/
struct iommu_fault_event {
struct iommu_fault fault;
struct list_head list;
};
/**
* struct iommu_fault_param - per-device IOMMU fault data
* @handler: Callback function to handle IOMMU faults at device level
* @data: handler private data
* @faults: holds the pending faults which needs response
* @lock: protect pending faults list
* @users: user counter to manage the lifetime of the data
* @rcu: rcu head for kfree_rcu()
* @dev: the device that owns this param
* @queue: IOPF queue
* @queue_list: index into queue->devices
* @partial: faults that are part of a Page Request Group for which the last
* request hasn't been submitted yet.
* @faults: holds the pending faults which need response
*/
struct iommu_fault_param {
iommu_dev_fault_handler_t handler;
void *data;
struct list_head faults;
struct mutex lock;
refcount_t users;
struct rcu_head rcu;
struct device *dev;
struct iopf_queue *queue;
struct list_head queue_list;
struct list_head partial;
struct list_head faults;
};
/**
* struct dev_iommu - Collection of per-device IOMMU data
*
* @fault_param: IOMMU detected device fault reporting data
* @iopf_param: I/O Page Fault queue and data
* @fwspec: IOMMU fwspec data
* @iommu_dev: IOMMU device this device is linked to
* @priv: IOMMU Driver private data
@ -624,8 +719,7 @@ struct iommu_fault_param {
*/
struct dev_iommu {
struct mutex lock;
struct iommu_fault_param *fault_param;
struct iopf_device_param *iopf_param;
struct iommu_fault_param __rcu *fault_param;
struct iommu_fwspec *fwspec;
struct iommu_device *iommu_dev;
void *priv;
@ -654,6 +748,22 @@ static inline struct iommu_device *dev_to_iommu_device(struct device *dev)
return (struct iommu_device *)dev_get_drvdata(dev);
}
/**
* iommu_get_iommu_dev - Get iommu_device for a device
* @dev: an end-point device
*
* Note that this function must be called from within an iommu_ops callback
* to retrieve the iommu_device for a device; the core code guarantees that
* it will not invoke the op without an attached iommu.
*/
static inline struct iommu_device *__iommu_get_iommu_dev(struct device *dev)
{
return dev->iommu->iommu_dev;
}
#define iommu_get_iommu_dev(dev, type, member) \
container_of(__iommu_get_iommu_dev(dev), type, member)
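
A hedged usage sketch of the container_of helper above, assuming a driver that embeds its struct iommu_device in a private per-IOMMU structure (my_smmu is a hypothetical name):

#include <linux/iommu.h>

struct my_smmu {
    void __iomem *base;
    struct iommu_device iommu;    /* registered via iommu_device_register() */
};

static struct my_smmu *to_my_smmu(struct device *dev)
{
    /* only valid from iommu_ops callbacks, where dev->iommu is guaranteed */
    return iommu_get_iommu_dev(dev, struct my_smmu, iommu);
}
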
static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
{
*gather = (struct iommu_iotlb_gather) {
@ -719,16 +829,6 @@ extern int iommu_group_for_each_dev(struct iommu_group *group, void *data,
extern struct iommu_group *iommu_group_get(struct device *dev);
extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group);
extern void iommu_group_put(struct iommu_group *group);
extern int iommu_register_device_fault_handler(struct device *dev,
iommu_dev_fault_handler_t handler,
void *data);
extern int iommu_unregister_device_fault_handler(struct device *dev);
extern int iommu_report_device_fault(struct device *dev,
struct iommu_fault_event *evt);
extern int iommu_page_response(struct device *dev,
struct iommu_page_response *msg);
extern int iommu_group_id(struct iommu_group *group);
extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
@ -905,8 +1005,8 @@ struct iommu_mm_data {
int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
const struct iommu_ops *ops);
void iommu_fwspec_free(struct device *dev);
int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids);
const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode);
int iommu_fwspec_add_ids(struct device *dev, const u32 *ids, int num_ids);
const struct iommu_ops *iommu_ops_from_fwnode(const struct fwnode_handle *fwnode);
static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
{
@ -948,8 +1048,6 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group);
int iommu_device_claim_dma_owner(struct device *dev, void *owner);
void iommu_device_release_dma_owner(struct device *dev);
struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
int iommu_attach_device_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid);
void iommu_detach_device_pasid(struct iommu_domain *domain,
@ -1138,31 +1236,6 @@ static inline void iommu_group_put(struct iommu_group *group)
{
}
static inline
int iommu_register_device_fault_handler(struct device *dev,
iommu_dev_fault_handler_t handler,
void *data)
{
return -ENODEV;
}
static inline int iommu_unregister_device_fault_handler(struct device *dev)
{
return 0;
}
static inline
int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt)
{
return -ENODEV;
}
static inline int iommu_page_response(struct device *dev,
struct iommu_page_response *msg)
{
return -ENODEV;
}
static inline int iommu_group_id(struct iommu_group *group)
{
return -ENODEV;
@ -1256,7 +1329,7 @@ static inline int iommu_fwspec_add_ids(struct device *dev, u32 *ids,
}
static inline
const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
const struct iommu_ops *iommu_ops_from_fwnode(const struct fwnode_handle *fwnode)
{
return NULL;
}
@ -1311,12 +1384,6 @@ static inline int iommu_device_claim_dma_owner(struct device *dev, void *owner)
return -ENODEV;
}
static inline struct iommu_domain *
iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm)
{
return NULL;
}
static inline int iommu_attach_device_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid)
{
@ -1343,6 +1410,14 @@ static inline ioasid_t iommu_alloc_global_pasid(struct device *dev)
static inline void iommu_free_global_pasid(ioasid_t pasid) {}
#endif /* CONFIG_IOMMU_API */
#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU_API)
void iommu_group_mutex_assert(struct device *dev);
#else
static inline void iommu_group_mutex_assert(struct device *dev)
{
}
#endif
/**
* iommu_map_sgtable - Map the given buffer to the IOMMU domain
* @domain: The IOMMU domain to perform the mapping
@ -1456,6 +1531,8 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
struct mm_struct *mm);
void iommu_sva_unbind_device(struct iommu_sva *handle);
u32 iommu_sva_get_pasid(struct iommu_sva *handle);
struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
#else
static inline struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
@ -1480,6 +1557,68 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
}
static inline void mm_pasid_drop(struct mm_struct *mm) {}
static inline struct iommu_domain *
iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm)
{
return NULL;
}
#endif /* CONFIG_IOMMU_SVA */
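
iommu_sva_domain_alloc() is now declared here, behind CONFIG_IOMMU_SVA, alongside the bind API it backs. A minimal consumer-side sketch of that API, assuming a driver that wants to DMA into the calling process's address space (my_bind_current_mm is hypothetical):

#include <linux/err.h>
#include <linux/iommu.h>
#include <linux/sched.h>

static struct iommu_sva *my_bind_current_mm(struct device *dev, u32 *pasid)
{
    struct iommu_sva *handle;

    /* bind the device to current->mm; the core sets up the SVA domain */
    handle = iommu_sva_bind_device(dev, current->mm);
    if (IS_ERR(handle))
        return handle;

    /* the PASID to program into the device's DMA requests */
    *pasid = iommu_sva_get_pasid(handle);
    return handle;    /* released later with iommu_sva_unbind_device(handle) */
}
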
#ifdef CONFIG_IOMMU_IOPF
int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev);
void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev);
int iopf_queue_flush_dev(struct device *dev);
struct iopf_queue *iopf_queue_alloc(const char *name);
void iopf_queue_free(struct iopf_queue *queue);
int iopf_queue_discard_partial(struct iopf_queue *queue);
void iopf_free_group(struct iopf_group *group);
void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt);
void iopf_group_response(struct iopf_group *group,
enum iommu_page_response_code status);
#else
static inline int
iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
{
return -ENODEV;
}
static inline void
iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
{
}
static inline int iopf_queue_flush_dev(struct device *dev)
{
return -ENODEV;
}
static inline struct iopf_queue *iopf_queue_alloc(const char *name)
{
return NULL;
}
static inline void iopf_queue_free(struct iopf_queue *queue)
{
}
static inline int iopf_queue_discard_partial(struct iopf_queue *queue)
{
return -ENODEV;
}
static inline void iopf_free_group(struct iopf_group *group)
{
}
static inline void
iommu_report_device_fault(struct device *dev, struct iopf_fault *evt)
{
}
static inline void iopf_group_response(struct iopf_group *group,
enum iommu_page_response_code status)
{
}
#endif /* CONFIG_IOMMU_IOPF */
#endif /* __LINUX_IOMMU_H */
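
Taken together with the iopf_fault and iommu_fault definitions earlier in this header, the reporting path that replaces the old iommu_fault_event-based iommu_report_device_fault() looks roughly as follows; a sketch of a driver's page-request handler, assuming the hardware fields have already been decoded (my_report_page_request is hypothetical):

#include <linux/iommu.h>

/* called from the driver's page-request interrupt path */
static void my_report_page_request(struct device *dev, u32 pasid, u64 addr)
{
    struct iopf_fault evt = {
        .fault = {
            .type = IOMMU_FAULT_PAGE_REQ,
            .prm = {
                .flags = IOMMU_FAULT_PAGE_REQUEST_PASID_VALID |
                         IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
                .pasid = pasid,
                .perm  = IOMMU_FAULT_PERM_READ,
                .addr  = addr,
            },
        },
    };

    /* the core groups the fault and later replies via ops->page_response() */
    iommu_report_device_fault(dev, &evt);
}
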

View File

@ -2517,6 +2517,11 @@ static inline struct pci_dev *pcie_find_root_port(struct pci_dev *dev)
return NULL;
}
static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
{
return dev->error_state == pci_channel_io_perm_failure;
}
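
pci_dev_is_disconnected() moves from the private drivers/pci header into the public one, presumably so callers outside the PCI core, such as the IOMMU drivers in this series, can avoid issuing requests to surprise-removed devices. A brief hedged usage sketch (my_flush_dev_iotlb is hypothetical):

#include <linux/pci.h>

static void my_flush_dev_iotlb(struct pci_dev *pdev)
{
    /* device is gone; a device-TLB invalidation would only time out */
    if (pci_dev_is_disconnected(pdev))
        return;

    /* ... issue the ATS invalidation to the device here ... */
}
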
void pci_request_acs(void);
bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags);
bool pci_acs_path_enabled(struct pci_dev *start,

View File

@ -1,161 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
* IOMMU user API definitions
*/
#ifndef _UAPI_IOMMU_H
#define _UAPI_IOMMU_H
#include <linux/types.h>
#define IOMMU_FAULT_PERM_READ (1 << 0) /* read */
#define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */
#define IOMMU_FAULT_PERM_EXEC (1 << 2) /* exec */
#define IOMMU_FAULT_PERM_PRIV (1 << 3) /* privileged */
/* Generic fault types, can be expanded IRQ remapping fault */
enum iommu_fault_type {
IOMMU_FAULT_DMA_UNRECOV = 1, /* unrecoverable fault */
IOMMU_FAULT_PAGE_REQ, /* page request fault */
};
enum iommu_fault_reason {
IOMMU_FAULT_REASON_UNKNOWN = 0,
/* Could not access the PASID table (fetch caused external abort) */
IOMMU_FAULT_REASON_PASID_FETCH,
/* PASID entry is invalid or has configuration errors */
IOMMU_FAULT_REASON_BAD_PASID_ENTRY,
/*
* PASID is out of range (e.g. exceeds the maximum PASID
* supported by the IOMMU) or disabled.
*/
IOMMU_FAULT_REASON_PASID_INVALID,
/*
* An external abort occurred fetching (or updating) a translation
* table descriptor
*/
IOMMU_FAULT_REASON_WALK_EABT,
/*
* Could not access the page table entry (Bad address),
* actual translation fault
*/
IOMMU_FAULT_REASON_PTE_FETCH,
/* Protection flag check failed */
IOMMU_FAULT_REASON_PERMISSION,
/* access flag check failed */
IOMMU_FAULT_REASON_ACCESS,
/* Output address of a translation stage caused Address Size fault */
IOMMU_FAULT_REASON_OOR_ADDRESS,
};
/**
* struct iommu_fault_unrecoverable - Unrecoverable fault data
* @reason: reason of the fault, from &enum iommu_fault_reason
* @flags: parameters of this fault (IOMMU_FAULT_UNRECOV_* values)
* @pasid: Process Address Space ID
* @perm: requested permission access using by the incoming transaction
* (IOMMU_FAULT_PERM_* values)
* @addr: offending page address
* @fetch_addr: address that caused a fetch abort, if any
*/
struct iommu_fault_unrecoverable {
__u32 reason;
#define IOMMU_FAULT_UNRECOV_PASID_VALID (1 << 0)
#define IOMMU_FAULT_UNRECOV_ADDR_VALID (1 << 1)
#define IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID (1 << 2)
__u32 flags;
__u32 pasid;
__u32 perm;
__u64 addr;
__u64 fetch_addr;
};
/**
* struct iommu_fault_page_request - Page Request data
* @flags: encodes whether the corresponding fields are valid and whether this
* is the last page in group (IOMMU_FAULT_PAGE_REQUEST_* values).
* When IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID is set, the page response
* must have the same PASID value as the page request. When it is clear,
* the page response should not have a PASID.
* @pasid: Process Address Space ID
* @grpid: Page Request Group Index
* @perm: requested page permissions (IOMMU_FAULT_PERM_* values)
* @addr: page address
* @private_data: device-specific private information
*/
struct iommu_fault_page_request {
#define IOMMU_FAULT_PAGE_REQUEST_PASID_VALID (1 << 0)
#define IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE (1 << 1)
#define IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA (1 << 2)
#define IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID (1 << 3)
__u32 flags;
__u32 pasid;
__u32 grpid;
__u32 perm;
__u64 addr;
__u64 private_data[2];
};
/**
* struct iommu_fault - Generic fault data
* @type: fault type from &enum iommu_fault_type
* @padding: reserved for future use (should be zero)
* @event: fault event, when @type is %IOMMU_FAULT_DMA_UNRECOV
* @prm: Page Request message, when @type is %IOMMU_FAULT_PAGE_REQ
* @padding2: sets the fault size to allow for future extensions
*/
struct iommu_fault {
__u32 type;
__u32 padding;
union {
struct iommu_fault_unrecoverable event;
struct iommu_fault_page_request prm;
__u8 padding2[56];
};
};
/**
* enum iommu_page_response_code - Return status of fault handlers
* @IOMMU_PAGE_RESP_SUCCESS: Fault has been handled and the page tables
* populated, retry the access. This is "Success" in PCI PRI.
* @IOMMU_PAGE_RESP_FAILURE: General error. Drop all subsequent faults from
* this device if possible. This is "Response Failure" in PCI PRI.
* @IOMMU_PAGE_RESP_INVALID: Could not handle this fault, don't retry the
* access. This is "Invalid Request" in PCI PRI.
*/
enum iommu_page_response_code {
IOMMU_PAGE_RESP_SUCCESS = 0,
IOMMU_PAGE_RESP_INVALID,
IOMMU_PAGE_RESP_FAILURE,
};
/**
* struct iommu_page_response - Generic page response information
* @argsz: User filled size of this data
* @version: API version of this structure
* @flags: encodes whether the corresponding fields are valid
* (IOMMU_FAULT_PAGE_RESPONSE_* values)
* @pasid: Process Address Space ID
* @grpid: Page Request Group Index
* @code: response code from &enum iommu_page_response_code
*/
struct iommu_page_response {
__u32 argsz;
#define IOMMU_PAGE_RESP_VERSION_1 1
__u32 version;
#define IOMMU_PAGE_RESP_PASID_VALID (1 << 0)
__u32 flags;
__u32 pasid;
__u32 grpid;
__u32 code;
};
#endif /* _UAPI_IOMMU_H */