The IOMMU VT-d implementation uses the first level for GPA->HPA translation
by default. Although both the first level and the second level could handle
the DMA translation, they're different in some way. For example, the second
level translation has separate controls for the Access/Dirty page tracking.
With the first level translation, there's no such control. On the other
hand, the second level translation has the page-level control for forcing
snoop, but the first level only has global control with pasid granularity.
This uses the second level for GPA->HPA translation so that we can provide
a consistent hardware interface for use cases like dirty page tracking for
live migration.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20211014053839.727419-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
An iommu domain could be allocated and mapped before it's attached to any
device. This requires that in scalable mode, when the domain is allocated,
the format (FL or SL) of the page table must be determined. In order to
achieve this, the platform should support consistent SL or FL capabilities
on all IOMMU's. This adds a check for this and aborts IOMMU probing if it
doesn't meet this requirement.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20211014053839.727419-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When the dmar translation fault happens, the kernel prints a single line
fault reason with corresponding hexadecimal code defined in the Intel VT-d
specification.
Currently, when user wants to debug the translation fault in detail,
debugfs is used for dumping the dmar_translation_struct, which is not
available when the kernel failed to boot.
Dump the DMAR translation structure, pagewalk the IO page table and print
the page table entry when the fault happens.
This takes effect only when CONFIG_DMAR_DEBUG is enabled.
Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Link: https://lore.kernel.org/r/20210815203845.31287-1-kyung.min.park@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211014053839.727419-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Handling of intel_iommu kernel command line option should return "true" to
indicate option is valid and so avoid logging it as unknown by the core
parsing code.
Also log unknown sub-options at the notice level to let user know of
potential typos or similar.
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://lore.kernel.org/r/20210831112947.310080-1-tvrtko.ursulin@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211014053839.727419-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
sid2groups keeps track of which stream id combinations belong to a
iommu_group to assign those correctly to devices.
When a iommu_group is freed a stale pointer will however remain in
sid2groups. This prevents devices with the same stream id combination
to ever be attached again (see below).
Fix that by creating a shadow copy of the stream id configuration
when a group is allocated for the first time and clear the sid2group
entry when that group is freed.
# echo 1 >/sys/bus/pci/devices/0000\:03\:00.0/remove
pci 0000:03:00.0: Removing from iommu group 1
# echo 1 >/sys/bus/pci/rescan
[...]
pci 0000:03:00.0: BAR 0: assigned [mem 0x6a0000000-0x6a000ffff 64bit pref]
pci 0000:03:00.0: BAR 2: assigned [mem 0x6a0010000-0x6a001ffff 64bit pref]
pci 0000:03:00.0: BAR 6: assigned [mem 0x6c0100000-0x6c01007ff pref]
tg3 0000:03:00.0: Failed to add to iommu group 1: -2
[...]
Fixes: 46d1fb072e ("iommu/dart: Add DART iommu driver")
Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sven Peter <sven@svenpeter.dev>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210924134502.15589-1-sven@svenpeter.dev
Signed-off-by: Joerg Roedel <jroedel@suse.de>
719a193356 ("iommu/vt-d: Tweak the description of a DMA fault") changed
the DMA fault reason from hex to decimal. It also added "0x" prefixes to
the PCI bus/device, e.g.,
- DMAR: [INTR-REMAP] Request device [00:00.5]
+ DMAR: [INTR-REMAP] Request device [0x00:0x00.5]
These no longer match dev_printk() and other similar messages in
dmar_match_pci_path() and dmar_acpi_insert_dev_scope().
Drop the "0x" prefixes from the bus and device addresses.
Fixes: 719a193356 ("iommu/vt-d: Tweak the description of a DMA fault")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20210903193711.483999-1-helgaas@kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210922054726.499110-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
apple_dart_tlb_flush_{all,walk} expect to get a struct apple_dart_domain
but instead get a struct iommu_domain right now. This breaks those two
functions and can lead to kernel panics like the one below.
DART can only invalidate the entire TLB and apple_dart_iotlb_sync will
already flush everything. There's no need to do that again inside those
two functions. Let's just drop them.
pci 0000:03:00.0: Removing from iommu group 1
Unable to handle kernel paging request at virtual address 0000000100000023
[...]
Call trace:
_raw_spin_lock_irqsave+0x54/0xbc
apple_dart_hw_stream_command.constprop.0+0x2c/0x130
apple_dart_tlb_flush_all+0x48/0x90
free_io_pgtable_ops+0x40/0x70
apple_dart_domain_free+0x2c/0x44
iommu_group_release+0x68/0xac
kobject_cleanup+0x4c/0x1fc
kobject_cleanup+0x14c/0x1fc
kobject_put+0x64/0x84
iommu_group_remove_device+0x110/0x180
iommu_release_device+0x50/0xa0
[...]
Fixes: 46d1fb072e ("iommu/dart: Add DART iommu driver")
Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sven Peter <sven@svenpeter.dev>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210921153934.35647-1-sven@svenpeter.dev
Signed-off-by: Joerg Roedel <jroedel@suse.de>
vduse driver supporting blk
virtio-vsock support for end of record with SEQPACKET
vdpa: mac and mq support for ifcvf and mlx5
vdpa: management netlink for ifcvf
virtio-i2c, gpio dt bindings
misc fixes, cleanups
NB: when merging this with
b542e383d8 ("eventfd: Make signal recursion protection a task bit")
from Linus' tree, replace eventfd_signal_count with
eventfd_signal_allowed, and drop the export of eventfd_wake_count from
("eventfd: Export eventfd_wake_count to modules").
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmE1+awPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpt6EIAJy0qrc62lktNA0IiIVJSLbUbTMmFj8MzkGR
8UxZdhpjWqBPJPyaOuNeksAqTGm/UAPEYx3C2c95Jhej7anFpy7dbCtIXcPHLJME
DjcJg+EDrlNCj8m0FcsHpHWsFzPMERJpyEZNxgB5WazQbv+yWhGrg2FN5DCnF0Ro
ZFYeKSVty148pQ0nHl8X0JM2XMtqit+O+LvKN2HQZ+fubh7BCzMxzkHY0QLHIzUS
UeZqd3Qm8YcbqnlX38P5D6k+NPiTEgknmxaBLkPxg6H3XxDAmaIRFb8Ldd1rsgy1
zTLGDiSGpVDIpawRnuEAzqJThV3Y5/MVJ1WD+mDYQ96tmhfp+KY=
=DBH/
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- vduse driver ("vDPA Device in Userspace") supporting emulated virtio
block devices
- virtio-vsock support for end of record with SEQPACKET
- vdpa: mac and mq support for ifcvf and mlx5
- vdpa: management netlink for ifcvf
- virtio-i2c, gpio dt bindings
- misc fixes and cleanups
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits)
Documentation: Add documentation for VDUSE
vduse: Introduce VDUSE - vDPA Device in Userspace
vduse: Implement an MMU-based software IOTLB
vdpa: Support transferring virtual addressing during DMA mapping
vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap()
vdpa: Add an opaque pointer for vdpa_config_ops.dma_map()
vhost-iotlb: Add an opaque pointer for vhost IOTLB
vhost-vdpa: Handle the failure of vdpa_reset()
vdpa: Add reset callback in vdpa_config_ops
vdpa: Fix some coding style issues
file: Export receive_fd() to modules
eventfd: Export eventfd_wake_count to modules
iova: Export alloc_iova_fast() and free_iova_fast()
virtio-blk: remove unneeded "likely" statements
virtio-balloon: Use virtio_find_vqs() helper
vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro
vsock_test: update message bounds test for MSG_EOR
af_vsock: rename variables in receive loop
virtio/vsock: support MSG_EOR bit processing
vhost/vsock: support MSG_EOR bit processing
...
Including:
- Intel VT-d:
- PASID leakage in intel_svm_unbind_mm();
- Deadlock in intel_svm_drain_prq().
- AMD IOMMU: Fixes for an unhandled page-fault bug when AVIC is used
for a KVM guest.
- Make CONFIG_IOMMU_DEFAULT_DMA_LAZY architecture instead of IOMMU
driver dependent
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmE7fPoACgkQK/BELZcB
GuPaIA/8CGoRP1ARzGgrNb67+Y5T0Ut332YASa9vDyfcJugxbbqoAQ0dn8ZzMfpd
zTqsBXHk++2hcfbgk3WbZDhB8Vb5qYd4t881wy36N+pVdRd+NjDpJbLtH1HAyu0a
K1aYW+ZCd3/8vAgfTqKA1+nlS+urA52hB0MkUlaPwNG5LUALk8G9lXbA419WXPku
HQeyP8xy3D33znuq23MT6dnNL/InAIHJgPm+kNfDGFMfIS68clDcnUszPpMenWsU
0oTSIauD3kQJoA9ElV64+OZfq2IEmltvCChErW4Le4cU0BIuX3NiN+RmmreJAmxU
zko1Lz4AosfWEHIYiTIEe2W/N9SwQkwsDXSqZViD/4Bw7wVc5+M+YMynF84kWakn
kFQ1Lq9hvB/KYblbB93Lbdae3YYwoHNSe402rtNtDcSY/rFnthGdU+scGgjzlKra
p+1CWo0CpTI4L1Wr1UI/0G9CDQeluXYILMQiB0RbDBLAKDvsE7Zf2gsZjHYkHo40
WnQNI54j09JktR648rUCHahwx8v7tuXV7zQtJuhjIYIiDmM9uI7cUA6hwrsn1km3
o+CrmCAY5nMsRcjoMeNbeKq2lUH3xC/LP5WD7eg2twzw5KvJ6sNVudsbImRV2RS6
JUe22/IJEhy2B6wmq5Tbjn7gEGlV7PovnaRp7S8y4z2wFAohVNs=
=/POW
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v5.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- Intel VT-d:
- PASID leakage in intel_svm_unbind_mm()
- Deadlock in intel_svm_drain_prq()
- AMD IOMMU: Fixes for an unhandled page-fault bug when AVIC is used
for a KVM guest.
- Make CONFIG_IOMMU_DEFAULT_DMA_LAZY architecture instead of IOMMU
driver dependent
* tag 'iommu-fixes-v5.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Clarify default domain Kconfig
iommu/vt-d: Fix a deadlock in intel_svm_drain_prq()
iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm()
iommu/amd: Remove iommu_init_ga()
iommu/amd: Relocate GAMSup check to early_enable_iommus
Although strictly it is the AMD and Intel drivers which have an existing
expectation of lazy behaviour by default, it ends up being rather
unintuitive to describe this literally in Kconfig. Express it instead as
an architecture dependency, to clarify that it is a valid config-time
decision. The end result is the same since virtio-iommu doesn't support
lazy mode and thus falls back to strict at runtime regardless.
The per-architecture disparity is a matter of historical expectations:
the AMD and Intel drivers have been lazy by default since 2008, and
changing that gets noticed by people asking where their I/O throughput
has gone. Conversely, Arm-based systems with their wider assortment of
IOMMU drivers mostly only support strict mode anyway; only the Arm SMMU
drivers have later grown support for passthrough and lazy mode, for
users who wanted to explicitly trade off isolation for performance.
These days, reducing the default level of isolation in a way which may
go unnoticed by users who expect otherwise hardly seems worth risking
for the sake of one line of Kconfig, so here's where we are.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/69a0c6f17b000b54b8333ee42b3124c1d5a869e2.1631105737.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
pasid_mutex and dev->iommu->param->lock are held while unbinding mm is
flushing IO page fault workqueue and waiting for all page fault works to
finish. But an in-flight page fault work also need to hold the two locks
while unbinding mm are holding them and waiting for the work to finish.
This may cause an ABBA deadlock issue as shown below:
idxd 0000:00:0a.0: unbind PASID 2
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc7+ #549 Not tainted [ 186.615245] ----------
dsa_test/898 is trying to acquire lock:
ffff888100d854e8 (¶m->lock){+.+.}-{3:3}, at:
iopf_queue_flush_dev+0x29/0x60
but task is already holding lock:
ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
intel_svm_unbind+0x34/0x1e0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (pasid_mutex){+.+.}-{3:3}:
__mutex_lock+0x75/0x730
mutex_lock_nested+0x1b/0x20
intel_svm_page_response+0x8e/0x260
iommu_page_response+0x122/0x200
iopf_handle_group+0x1c2/0x240
process_one_work+0x2a5/0x5a0
worker_thread+0x55/0x400
kthread+0x13b/0x160
ret_from_fork+0x22/0x30
-> #1 (¶m->fault_param->lock){+.+.}-{3:3}:
__mutex_lock+0x75/0x730
mutex_lock_nested+0x1b/0x20
iommu_report_device_fault+0xc2/0x170
prq_event_thread+0x28a/0x580
irq_thread_fn+0x28/0x60
irq_thread+0xcf/0x180
kthread+0x13b/0x160
ret_from_fork+0x22/0x30
-> #0 (¶m->lock){+.+.}-{3:3}:
__lock_acquire+0x1134/0x1d60
lock_acquire+0xc6/0x2e0
__mutex_lock+0x75/0x730
mutex_lock_nested+0x1b/0x20
iopf_queue_flush_dev+0x29/0x60
intel_svm_drain_prq+0x127/0x210
intel_svm_unbind+0xc5/0x1e0
iommu_sva_unbind_device+0x62/0x80
idxd_cdev_release+0x15a/0x200 [idxd]
__fput+0x9c/0x250
____fput+0xe/0x10
task_work_run+0x64/0xa0
exit_to_user_mode_prepare+0x227/0x230
syscall_exit_to_user_mode+0x2c/0x60
do_syscall_64+0x48/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
¶m->lock --> ¶m->fault_param->lock --> pasid_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(pasid_mutex);
lock(¶m->fault_param->lock);
lock(pasid_mutex);
lock(¶m->lock);
*** DEADLOCK ***
2 locks held by dsa_test/898:
#0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at:
iommu_sva_unbind_device+0x53/0x80
#1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
intel_svm_unbind+0x34/0x1e0
stack backtrace:
CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549
Hardware name: Intel Corporation Kabylake Client platform/KBL S
DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016
Call Trace:
dump_stack_lvl+0x5b/0x74
dump_stack+0x10/0x12
print_circular_bug.cold+0x13d/0x142
check_noncircular+0xf1/0x110
__lock_acquire+0x1134/0x1d60
lock_acquire+0xc6/0x2e0
? iopf_queue_flush_dev+0x29/0x60
? pci_mmcfg_read+0xde/0x240
__mutex_lock+0x75/0x730
? iopf_queue_flush_dev+0x29/0x60
? pci_mmcfg_read+0xfd/0x240
? iopf_queue_flush_dev+0x29/0x60
mutex_lock_nested+0x1b/0x20
iopf_queue_flush_dev+0x29/0x60
intel_svm_drain_prq+0x127/0x210
? intel_pasid_tear_down_entry+0x22e/0x240
intel_svm_unbind+0xc5/0x1e0
iommu_sva_unbind_device+0x62/0x80
idxd_cdev_release+0x15a/0x200
pasid_mutex protects pasid and svm data mapping data. It's unnecessary
to hold pasid_mutex while flushing the workqueue. To fix the deadlock
issue, unlock pasid_pasid during flushing the workqueue to allow the works
to be handled.
Fixes: d5b9e4bfe0 ("iommu/vt-d: Report prq to io-pgfault framework")
Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com
[joro: Removed timing information from kernel log messages]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The mm->pasid will be used in intel_svm_free_pasid() after load_pasid()
during unbinding mm. Clearing it in load_pasid() will cause PASID cannot
be freed in intel_svm_free_pasid().
Additionally mm->pasid was updated already before load_pasid() during pasid
allocation. No need to update it again in load_pasid() during binding mm.
Don't update mm->pasid to avoid the issues in both binding mm and unbinding
mm.
Fixes: 4048377414 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers")
Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since the function has been simplified and only call iommu_init_ga_log(),
remove the function and replace with iommu_init_ga_log() instead.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210820202957.187572-4-suravee.suthikulpanit@amd.com
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently, iommu_init_ga() checks and disables IOMMU VAPIC support
(i.e. AMD AVIC support in IOMMU) when GAMSup feature bit is not set.
However it forgets to clear IRQ_POSTING_CAP from the previously set
amd_iommu_irq_ops.capability.
This triggers an invalid page fault bug during guest VM warm reboot
if AVIC is enabled since the irq_remapping_cap(IRQ_POSTING_CAP) is
incorrectly set, and crash the system with the following kernel trace.
BUG: unable to handle page fault for address: 0000000000400dd8
RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc
Call Trace:
svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd]
? kvm_make_all_cpus_request_except+0x50/0x70 [kvm]
kvm_request_apicv_update+0x10c/0x150 [kvm]
svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd]
svm_enable_irq_window+0x26/0xa0 [kvm_amd]
vcpu_enter_guest+0xbbe/0x1560 [kvm]
? avic_vcpu_load+0xd5/0x120 [kvm_amd]
? kvm_arch_vcpu_load+0x76/0x240 [kvm]
? svm_get_segment_base+0xa/0x10 [kvm_amd]
kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm]
kvm_vcpu_ioctl+0x22a/0x5d0 [kvm]
__x64_sys_ioctl+0x84/0xc0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes by moving the initializing of AMD IOMMU interrupt remapping mode
(amd_iommu_guest_ir) earlier before setting up the
amd_iommu_irq_ops.capability with appropriate IRQ_POSTING_CAP flag.
[joro: Squashed the two patches and limited
check_features_on_all_iommus() to CONFIG_IRQ_REMAP
to fix a compile warning.]
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210820202957.187572-2-suravee.suthikulpanit@amd.com
Link: https://lore.kernel.org/r/20210820202957.187572-3-suravee.suthikulpanit@amd.com
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Export alloc_iova_fast() and free_iova_fast() so that
some modules can make use of the per-CPU cache to get
rid of rbtree spinlock in alloc_iova() and free_iova()
during IOVA allocation.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210831103634.33-2-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Including:
- New DART IOMMU driver for Apple Silicon M1 chips.
- Optimizations for iommu_[map/unmap] performance
- Selective TLB flush support for the AMD IOMMU driver to make
it more efficient on emulated IOMMUs.
- Rework IOVA setup and default domain type setting to move more
code out of IOMMU drivers and to support runtime switching
between certain types of default domains.
- VT-d Updates from Lu Baolu:
- Update the virtual command related registers
- Enable Intel IOMMU scalable mode by default
- Preset A/D bits for user space DMA usage
- Allow devices to have more than 32 outstanding PRs
- Various cleanups
- ARM SMMU Updates from Will Deacon:
- SMMUv3: Minor optimisation to avoid zeroing struct members on CMD submission
- SMMUv3: Increased use of batched commands to reduce submission latency
- SMMUv3: Refactoring in preparation for ECMDQ support
- SMMUv2: Fix races when probing devices with identical StreamIDs
- SMMUv2: Optimise walk cache flushing for Qualcomm implementations
- SMMUv2: Allow deep sleep states for some Qualcomm SoCs with shared clocks
- Various smaller optimizations, cleanups, and fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmEyKYAACgkQK/BELZcB
GuOzAxAAnJ02PG07BnFFFGN2/o3eVON4LQUXquMePjcZ8A8oQf073jO/ybWNnpJK
5V+DHRg2CAugFHks/EwIrxFXAWZuStrcnk81d8t6T6ROQl47Zv1qshksUTnsDQnz
V7mQ1P/pcsBwCUf73aD9ncLmLkiuTVfHKoKfe3gHeQI+H+2Lw4ijzB8kIwUqhkHI
heJZLDmO87S2Mr7zlCmMQH5R550fHrTKSbUCx9QqFu3GgWsjkU+3u1S17xR1bEoW
hmhJhyAw+MLrSgdeG4U9o+6AcQuRELEHfVSq7PtDxQ6hEVziGYGY4Nk+YiEcXFiv
mu9qfEkaP/2QOKszvks+nhHrwDnJ9WLnEEskiEFwjsaFauIKsRscfYVUBTWeYXJT
9t/PVngigWLDhGO0NEPthQvJExvJJs1MQQ72CcA6dd0XdGpN+aRglIUWUJP/nQHd
doAx4/1YWnHVkWWUef8NgmVvlHdoXjA7vy4QGL9FYCqV6ImfhAkJYKJ99X6Ovlmk
gje/Kx+5wUPT2nXNbTkjalIylyUNpugMY4xD7K06VXjvRMUf2SbYNDQxYJaDDld6
nDt0F0NvEyrj7HO8egwIZbX3MOikhMGHur48yEyCTbm+9oHQffkODq1o4OfuxJh2
nq0G5Plln9CEmhQVwzibcPSNlYPe8AZbbXqQ9DrJFusEpYj+01c=
=zVaQ
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- New DART IOMMU driver for Apple Silicon M1 chips
- Optimizations for iommu_[map/unmap] performance
- Selective TLB flush support for the AMD IOMMU driver to make it more
efficient on emulated IOMMUs
- Rework IOVA setup and default domain type setting to move more code
out of IOMMU drivers and to support runtime switching between certain
types of default domains
- VT-d Updates from Lu Baolu:
- Update the virtual command related registers
- Enable Intel IOMMU scalable mode by default
- Preset A/D bits for user space DMA usage
- Allow devices to have more than 32 outstanding PRs
- Various cleanups
- ARM SMMU Updates from Will Deacon:
SMMUv3:
- Minor optimisation to avoid zeroing struct members on CMD submission
- Increased use of batched commands to reduce submission latency
- Refactoring in preparation for ECMDQ support
SMMUv2:
- Fix races when probing devices with identical StreamIDs
- Optimise walk cache flushing for Qualcomm implementations
- Allow deep sleep states for some Qualcomm SoCs with shared clocks
- Various smaller optimizations, cleanups, and fixes
* tag 'iommu-updates-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (85 commits)
iommu/io-pgtable: Abstract iommu_iotlb_gather access
iommu/arm-smmu: Fix missing unlock on error in arm_smmu_device_group()
iommu/vt-d: Add present bit check in pasid entry setup helpers
iommu/vt-d: Use pasid_pte_is_present() helper function
iommu/vt-d: Drop the kernel doc annotation
iommu/vt-d: Allow devices to have more than 32 outstanding PRs
iommu/vt-d: Preset A/D bits for user space DMA usage
iommu/vt-d: Enable Intel IOMMU scalable mode by default
iommu/vt-d: Refactor Kconfig a bit
iommu/vt-d: Remove unnecessary oom message
iommu/vt-d: Update the virtual command related registers
iommu: Allow enabling non-strict mode dynamically
iommu: Merge strictness and domain type configs
iommu: Only log strictness for DMA domains
iommu: Expose DMA domain strictness via sysfs
iommu: Express DMA strictness via the domain type
iommu/vt-d: Prepare for multiple DMA domain types
iommu/arm-smmu: Prepare for multiple DMA domain types
iommu/amd: Prepare for multiple DMA domain types
iommu: Introduce explicit type for non-strict DMA domains
...
Pull swiotlb updates from Konrad Rzeszutek Wilk:
"A new feature called restricted DMA pools. It allows SWIOTLB to
utilize per-device (or per-platform) allocated memory pools instead of
using the global one.
The first big user of this is ARM Confidential Computing where the
memory for DMA operations can be set per platform"
* 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: (23 commits)
swiotlb: use depends on for DMA_RESTRICTED_POOL
of: restricted dma: Don't fail device probe on rmem init failure
of: Move of_dma_set_restricted_buffer() into device.c
powerpc/svm: Don't issue ultracalls if !mem_encrypt_active()
s390/pv: fix the forcing of the swiotlb
swiotlb: Free tbl memory in swiotlb_exit()
swiotlb: Emit diagnostic in swiotlb_exit()
swiotlb: Convert io_default_tlb_mem to static allocation
of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS
swiotlb: add overflow checks to swiotlb_bounce
swiotlb: fix implicit debugfs declarations
of: Add plumbing for restricted DMA pool
dt-bindings: of: Add restricted DMA pool
swiotlb: Add restricted DMA pool initialization
swiotlb: Add restricted DMA alloc/free support
swiotlb: Refactor swiotlb_tbl_unmap_single
swiotlb: Move alloc_size to swiotlb_find_slots
swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
swiotlb: Update is_swiotlb_active to add a struct device argument
swiotlb: Update is_swiotlb_buffer to add a struct device argument
...
- fix debugfs initialization order (Anthony Iliopoulos)
- use memory_intersects() directly (Kefeng Wang)
- allow to return specific errors from ->map_sg
(Logan Gunthorpe, Martin Oliveira)
- turn the dma_map_sg return value into an unsigned int (me)
- provide a common global coherent pool іmplementation (me)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmEvY+8LHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPaehAAsgnBzzzoLHO83pgs0KL92c+0DiMNHYmaMCJOvZXk
x2Irv+O74WikRJc4S7uQ26p2spjmUxjmiOjld+8+NN0liD4QO9BQ/SZpIp8emuKS
/yPG6Xh86xSl/OrPL1y7kGeHkRi5sm3mRhcTdILFQFPLcSReupe++GRfnvrpbOPk
tj3pBGXluD6iJH12BBt00ushUVzZ0F2xaF6xUDAs94RSZ3tlqsfx6c928Y1KxSZh
f89q/KuaokyogFG7Ujj/nYgIUETaIs2W6UmxBfRzdEMJFSffwomUMbw+M+qGJ7/d
2UjamFYRX16FReE8WNsndbX1E6k5JBW12E1qwV3dUwatlNLWEaRq3PNiWkF7zcFH
LDkpDYN6s5bIDPTfDp21XfPygoH8KQhnD9lVf0aB7n04uu8VJrGB9+10PpkCJVXD
0b2dcuSwCO7hAfTfNGVV8f3EI/1XPflr1hJvMgcVtY53CR96ldp+4QaElzWLXumN
MyptirmrVITNVyVwGzhGAblXBLWdarXD0EXudyiaF4Xbrj3AkIOSUCghEwKLpjQf
UwMFFwSE8yGxKTRK4HfU5gMzy6G751fU7TUe5lmxZLovDflQoSXMWgHE8e7r0Qel
o5v6lmUzoWz2fAISf3xjauo2ncgmfWMwYM6C7OJy5nG73QXLQId9J+ReXbmrgrrN
DgI=
=spje
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- fix debugfs initialization order (Anthony Iliopoulos)
- use memory_intersects() directly (Kefeng Wang)
- allow to return specific errors from ->map_sg (Logan Gunthorpe,
Martin Oliveira)
- turn the dma_map_sg return value into an unsigned int (me)
- provide a common global coherent pool іmplementation (me)
* tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping: (31 commits)
hexagon: use the generic global coherent pool
dma-mapping: make the global coherent pool conditional
dma-mapping: add a dma_init_global_coherent helper
dma-mapping: simplify dma_init_coherent_memory
dma-mapping: allow using the global coherent pool for !ARM
ARM/nommu: use the generic dma-direct code for non-coherent devices
dma-direct: add support for dma_coherent_default_memory
dma-mapping: return an unsigned int from dma_map_sg{,_attrs}
dma-mapping: disallow .map_sg operations from returning zero on error
dma-mapping: return error code from dma_dummy_map_sg()
x86/amd_gart: don't set failed sg dma_address to DMA_MAPPING_ERROR
x86/amd_gart: return error code from gart_map_sg()
xen: swiotlb: return error code from xen_swiotlb_map_sg()
parisc: return error code from .map_sg() ops
sparc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
sparc/iommu: return error codes from .map_sg() ops
s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR
s390/pci: return error code from s390_dma_map_sg()
powerpc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
powerpc/iommu: return error code from .map_sg() ops
...
These are updates for drivers that are tied to a particular SoC,
including the correspondig device tree bindings:
- A couple of reset controller changes for unisoc, uniphier, renesas
and zte platforms
- memory controller driver fixes for omap and tegra
- Rockchip io domain driver updates
- Lots of updates for qualcomm platforms, mostly touching their
firmware and power management drivers
- Tegra FUSE and firmware driver updateѕ
- Support for virtio transports in the SCMI firmware framework
- cleanup of ixp4xx drivers, towards enabling multiplatform
support and bringing it up to date with modern platforms
- Minor updates for keystone, mediatek, omap, renesas.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iD8DBQBhLz215t5GS2LDRf4RAjlHAJ473D0PymaTzv68EuPHThG+DEPifQCdGjLq
QGBB6JidIP8rtEdC+LWBB8I=
=M5+N
-----END PGP SIGNATURE-----
Merge tag 'drivers-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC driver updates from Arnd Bergmann:
"These are updates for drivers that are tied to a particular SoC,
including the correspondig device tree bindings:
- A couple of reset controller changes for unisoc, uniphier, renesas
and zte platforms
- memory controller driver fixes for omap and tegra
- Rockchip io domain driver updates
- Lots of updates for qualcomm platforms, mostly touching their
firmware and power management drivers
- Tegra FUSE and firmware driver updateѕ
- Support for virtio transports in the SCMI firmware framework
- cleanup of ixp4xx drivers, towards enabling multiplatform support
and bringing it up to date with modern platforms
- Minor updates for keystone, mediatek, omap, renesas"
* tag 'drivers-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (96 commits)
reset: simple: remove ZTE details in Kconfig help
soc: rockchip: io-domain: Remove unneeded semicolon
soc: rockchip: io-domain: add rk3568 support
dt-bindings: power: add rk3568-pmu-io-domain support
bus: ixp4xx: return on error in ixp4xx_exp_probe()
soc: renesas: Prefer memcpy() over strcpy()
firmware: tegra: Stop using seq_get_buf()
soc/tegra: fuse: Enable fuse clock on suspend for Tegra124
soc/tegra: fuse: Add runtime PM support
soc/tegra: fuse: Clear fuse->clk on driver probe failure
soc/tegra: pmc: Prevent racing with cpuilde driver
soc/tegra: bpmp: Remove unused including <linux/version.h>
dt-bindings: soc: ti: pruss: Add dma-coherent property
soc: ti: Remove pm_runtime_irq_safe() usage for smartreflex
soc: ti: pruss: Enable support for ICSSG subsystems on K3 AM64x SoCs
dt-bindings: soc: ti: pruss: Update bindings for K3 AM64x SoCs
firmware: arm_scmi: Use WARN_ON() to check configured transports
firmware: arm_scmi: Fix boolconv.cocci warnings
soc: mediatek: mmsys: Fix missing UFOE component in mt8173 table routing
soc: mediatek: mmsys: add MT8365 support
...
- Improve ftrace code patching so that stop_machine is not required anymore.
This requires a small common code patch acked by Steven Rostedt:
https://lore.kernel.org/linux-s390/20210730220741.4da6fdf6@oasis.local.home/
- Enable KCSAN for s390. This comes with a small common code change to fix a
compile warning. Acked by Marco Elver:
https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com
- Add KFENCE support for s390. This also comes with a minimal x86 patch from
Marco Elver who said also this can be carried via the s390 tree:
https://lore.kernel.org/linux-s390/YQJdarx6XSUQ1tFZ@elver.google.com/
- More changes to prepare the decompressor for relocation.
- Enable DAT also for CPU restart path.
- Final set of register asm removal patches; leaving only three locations where
needed and sane.
- Add NNPA, Vector-Packed-Decimal-Enhancement Facility 2, PCI MIO support to
hwcaps flags.
- Cleanup hwcaps implementation.
- Add new instructions to in-kernel disassembler.
- Various QDIO cleanups.
- Add SCLP debug feature.
- Various other cleanups and improvements all over the place.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmEs1GsACgkQIg7DeRsp
bsJ9+A/9FApCECNPgu6jOX4Ee+no+LxpCPUF8rvt56TFTLv7+Dhm7fJl0xQ9utsZ
FyLMDAr1/FKdm2wBW23QZH4vEIt1bd6e/03DwwK+6IjHKZHRIfB8eGJMsLj/TDzm
K6/+FI7qXjvpNXxgkCqXf5yESi/y5Dgr+16kTBhPZj5awRiwe5puPamji3uiQ45V
r4MdGCCC9BnTZvtPpUrr8ImnUqHJ4/TMo1YYdykLbZFuAvvYUyZ5YG5kh0pMa8JZ
DGJpfLQfy7ZNscIzdVhZtfzzESVtS6/AOeBzDMO1pbM1CGXtvpJJP0Wjlr/PGwoW
fvuMHpqTlDi+TfNZiPP5lwsFC89xSd6gtZH7vAuI8kFCXgW3RMjABF6h/mzpH1WO
jXyaSEZROc/83gxPMYyOYiqrKyAFPbpZ/Rnav2bvGQGneqx7RvmpF3GgA9WEo1PW
rMDoEbLstJuHk0E2uEV+OnQd5F7MHNonzpYfp/7pyQ+PL8w2GExV9yngVc/f3TqG
HYLC9rc3K6DkxZappcJm0qTb7lDTMFI7xK3g9RiqPQBJE1v1MYE/rai48nW69ypE
bRNL76AjyXKo+zKR2wlhJVMY1I1+DarMopHhZj6fzQT5te1LLsv8OuTU2gkt6dIq
YoSYOYvModf3HbKnJul2tszQG9yl+vpE9MiCyBQSsxIYXCriq/c=
=WDRh
-----END PGP SIGNATURE-----
Merge tag 's390-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
- Improve ftrace code patching so that stop_machine is not required
anymore. This requires a small common code patch acked by Steven
Rostedt:
https://lore.kernel.org/linux-s390/20210730220741.4da6fdf6@oasis.local.home/
- Enable KCSAN for s390. This comes with a small common code change to
fix a compile warning. Acked by Marco Elver:
https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com
- Add KFENCE support for s390. This also comes with a minimal x86 patch
from Marco Elver who said also this can be carried via the s390 tree:
https://lore.kernel.org/linux-s390/YQJdarx6XSUQ1tFZ@elver.google.com/
- More changes to prepare the decompressor for relocation.
- Enable DAT also for CPU restart path.
- Final set of register asm removal patches; leaving only three
locations where needed and sane.
- Add NNPA, Vector-Packed-Decimal-Enhancement Facility 2, PCI MIO
support to hwcaps flags.
- Cleanup hwcaps implementation.
- Add new instructions to in-kernel disassembler.
- Various QDIO cleanups.
- Add SCLP debug feature.
- Various other cleanups and improvements all over the place.
* tag 's390-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (105 commits)
s390: remove SCHED_CORE from defconfigs
s390/smp: do not use nodat_stack for secondary CPU start
s390/smp: enable DAT before CPU restart callback is called
s390: update defconfigs
s390/ap: fix state machine hang after failure to enable irq
KVM: s390: generate kvm hypercall functions
s390/sclp: add tracing of SCLP interactions
s390/debug: add early tracing support
s390/debug: fix debug area life cycle
s390/debug: keep debug data on resize
s390/diag: make restart_part2 a local label
s390/mm,pageattr: fix walk_pte_level() early exit
s390: fix typo in linker script
s390: remove do_signal() prototype and do_notify_resume() function
s390/crypto: fix all kernel-doc warnings in vfio_ap_ops.c
s390/pci: improve DMA translation init and exit
s390/pci: simplify CLP List PCI handling
s390/pci: handle FH state mismatch only on disable
s390/pci: fix misleading rc in clp_set_pci_fn()
s390/boot: factor out offset_vmlinux_info() function
...
Currently zpci_dma_init_device()/zpci_dma_exit_device() is called as
part of zpci_enable_device()/zpci_disable_device() and errors for
zpci_dma_exit_device() are always ignored even if we could abort.
Improve upon this by moving zpci_dma_exit_device() out of
zpci_disable_device() and check for errors whenever we have a way to
abort the current operation. Note that for example in
zpci_event_hard_deconfigured() the device is expected to be gone so we
really can't abort and proceed even in case of error.
Similarly move the cc == 3 special case out of zpci_unregister_ioat()
and into the callers allowing to abort when finding an already disabled
devices precludes proceeding with the operation.
While we are at it log IOAT register/unregister errors in the s390
debugfs log,
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Previously io-pgtable merely passed the iommu_iotlb_gather pointer
through to helpers, but now it has grown its own direct dereference.
This turns out to break the build for !IOMMU_API configs where the
structure only has a dummy definition. It will probably also crash
drivers who don't use the gather mechanism and simply pass in NULL.
Wrap this dereference in a suitable helper which can both be stubbed
out for !IOMMU_API and encapsulate a NULL check otherwise.
Fixes: 7a7c5badf8 ("iommu: Indicate queued flushes via gather data")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/83672ee76f6405c82845a55c148fa836f56fbbc1.1629465282.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add the missing unlock before return from function arm_smmu_device_group()
in the error handling case.
Fixes: b1a1347912 ("iommu/arm-smmu: Fix race condition during iommu_group creation")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210820074949.1946576-1-yangyingliang@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Kernel doc validator is unhappy with the following
.../perf.c:16: warning: Function parameter or member 'latency_lock' not described in 'DEFINE_SPINLOCK'
.../perf.c:16: warning: expecting prototype for perf.c(). Prototype was for DEFINE_SPINLOCK() instead
Drop kernel doc annotation since the top comment is not in the required format.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210729163538.40101-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210818134852.1847070-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The minimum per-IOMMU PRQ queue size is one 4K page, this is more entries
than the hardcoded limit of 32 in the current VT-d code. Some devices can
support up to 512 outstanding PRQs but underutilized by this limit of 32.
Although, 32 gives some rough fairness when multiple devices share the same
IOMMU PRQ queue, but far from optimal for customized use case. This extends
the per-IOMMU PRQ queue size to four 4K pages and let the devices have as
many outstanding page requests as they can.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We preset the access and dirty bits for IOVA over first level usage only
for the kernel DMA (i.e., when domain type is IOMMU_DOMAIN_DMA). We should
also preset the FL A/D for user space DMA usage. The idea is that even the
user space A/D bit memory write is unnecessary. We should avoid it to
minimize the overhead.
Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The commit 8950dcd83a ("iommu/vt-d: Leave scalable mode default off")
leaves the scalable mode default off and end users could turn it on with
"intel_iommu=sm_on". Using the Intel IOMMU scalable mode for kernel DMA,
user-level device access and Shared Virtual Address have been enabled.
This enables the scalable mode by default if the hardware advertises the
support and adds kernel options of "intel_iommu=sm_on/sm_off" for end
users to configure it through the kernel parameters.
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This fixes the "shared memory state machine" (SMSM) interrupt logic to
avoid missing transitions happening while the interrupts are masked.
SM6115 support is added to smd-rpm and rpmpd.
The Qualcomm SCM firmware driver is once again made possible to compile
and load as a kernel module.
An out-of-bounds error related to the cooling devices of the AOSS driver
is corrected. The binding is converted to YAML and a generic compatible
is introduced to reduce the driver churn.
The GENI wrapper gains a helper function used in I2C and SPI for
switching the serial engine hardware to use the wrapper's DMA-engine.
Lastly it contains a number of cleanups and smaller fixes for rpmhpd,
socinfo, CPR, mdt_loader and the GENI DT binding.
-----BEGIN PGP SIGNATURE-----
iQJPBAABCAA5FiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmEa29wbHGJqb3JuLmFu
ZGVyc3NvbkBsaW5hcm8ub3JnAAoJEAsfOT8Nma3FFnMP/A2Od7JYhdjH7sfc3i3B
0zws88lT7XTo9KSMpLFrQg1Qzp3ELDMfVhS2CpsekZn1g7s6RGnbQrh2Mac9Yh4z
+7e4YLhoMxEkdbEVvbVRIX4parFWD/KxUdkyXM8gKZntJzOFl6VY5V8aKi+7IO+/
CdWHvELDVXMe2LkBd3lKE1AlS2MjkohXpFKgRwkY4r2nVXwqYTdkfJvXdGdhECGr
ld4ZIvIR6ERLmVpGvTxdU5W0z2xsLTbPYDPSv6mPkWqDYSOyXV3zABV1fSkH28ot
wIoesHyI3vL4/LNIlHn+tcWj1Ou8hSzxZmxfq7cdKbkfwPLCWE5D8+HEO4kmbFiF
5Dds+oxvKSFpf/wppK7bUSEd9Q+dKsrFt2mdWy/sYRe1EaEv5sFBgE0rV3+c6ykL
tptUEFlaB2si7PKSpKje8czHn4Akuc6BwT6xovZZ72K8CNz9D71etSkoNLLXa54d
bJibw2eNTT1EOACC/FPBO9AS11Icm6wszn/dcaSwaSPGQ6cR3lvAwHqzDFMGHp+x
L+iojgnZoHykFhQjGuGrI3yTHOpp0MCNxRoN7DlFwm7KLKVHqeqg+xHXtV9sJer8
iAhY/uepLRxc1oC5Z+Ejx1gABmKycXtzKQ9ecwTclrk66ampWQBlv5+Bxd5w/hux
ZR96mJPmpk1WKOX3FAgdeaaP
=88qN
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmEdDA4ACgkQmmx57+YA
GNmAmw//WfdprWl2SJGfyojhDeeUq6ZeNesMFBabFss7Ar++szhyFSu3bqlyH6WC
ZgPOZOJakKQq2EGKNm7RuEgeR8sEUs9czetNO8AqVy1szlhqnUKhG0OO+uLqRLQn
Hvf1fc+uG3xeaFP0/Q/4fuG+GL1fnmXIJWl0nZFUGfWyQ3mxxm5fLq1DM27KAzP1
eKMqtCh3F0qHMWZTJd084LO2XXyUTVVBbXAQKX/IQmH6+BK3Q2YMToo/919HCnZs
XfgcdKuB/fOfi1n+PgGNODdTJ/Uy10WkSALZHlzKfr77AUCs9+RWxdkogCmACJBs
IhNoZ/6D6a6kCIEuaEjHtNsMAVoG0bYpaB9vFLhJgF4wfdCd+DuOXkCy9B7vI4/8
7/SKArKYrG1sPlhDGFaWZjWEFBCGycDDsHQ4T2ecZ5d3f+Kuimgx7NLYhKRHPI+9
8QVJwNbIGrNXjwIn6S0AeqDLoXIzMmAbNvuX1lFz0OyEkbgDYPSuUPmKYoKh0VL5
+aTPYANbKxfF7nIPxfN580yjQGZmJmctyhkqnavEB6HdNySO2/oM5F2uGRVdVKnv
5HaPLqWZf2PffjAg+bU3O3bBlYRdIpEaKaa1eHFIMTgK7nlcvFESWWtwg+IBKdqY
JI3Kkg5PP3jxqxybzgFPE298vI1G6/so2A12HBi0dl28XheK+wM=
=dS58
-----END PGP SIGNATURE-----
Merge tag 'qcom-drivers-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/drivers
Qualcomm driver updates for v5.15
This fixes the "shared memory state machine" (SMSM) interrupt logic to
avoid missing transitions happening while the interrupts are masked.
SM6115 support is added to smd-rpm and rpmpd.
The Qualcomm SCM firmware driver is once again made possible to compile
and load as a kernel module.
An out-of-bounds error related to the cooling devices of the AOSS driver
is corrected. The binding is converted to YAML and a generic compatible
is introduced to reduce the driver churn.
The GENI wrapper gains a helper function used in I2C and SPI for
switching the serial engine hardware to use the wrapper's DMA-engine.
Lastly it contains a number of cleanups and smaller fixes for rpmhpd,
socinfo, CPR, mdt_loader and the GENI DT binding.
* tag 'qcom-drivers-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
soc: qcom: smsm: Fix missed interrupts if state changes while masked
soc: qcom: smsm: Implement support for get_irqchip_state
soc: qcom: mdt_loader: be more informative on errors
dt-bindings: qcom: geni-se: document iommus
soc: qcom: smd-rpm: Add SM6115 compatible
soc: qcom: geni: Add support for gpi dma
soc: qcom: geni: move GENI_IF_DISABLE_RO to common header
PM: AVS: qcom-cpr: Use nvmem_cell_read_variable_le_u32()
drivers: soc: qcom: rpmpd: Add SM6115 RPM Power Domains
dt-bindings: power: rpmpd: Add SM6115 to rpmpd binding
dt-bindings: soc: qcom: smd-rpm: Add SM6115 compatible
soc: qcom: aoss: Fix the out of bound usage of cooling_devs
firmware: qcom_scm: Allow qcom_scm driver to be loadable as a permenent module
soc: qcom: socinfo: Don't print anything if nothing found
soc: qcom: rpmhpd: Use corner in power_off
soc: qcom: aoss: Add generic compatible
dt-bindings: soc: qcom: aoss: Convert to YAML
dt-bindings: soc: qcom: aoss: Add SC8180X and generic compatible
firmware: qcom_scm: remove a duplicative condition
firmware: qcom_scm: Mark string array const
Link: https://lore.kernel.org/r/20210816214840.581244-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Allocating and enabling a flush queue is in fact something we can
reasonably do while a DMA domain is active, without having to rebuild it
from scratch. Thus we can allow a strict -> non-strict transition from
sysfs without requiring to unbind the device's driver, which is of
particular interest to users who want to make selective relaxations to
critical devices like the one serving their root filesystem.
Disabling and draining a queue also seems technically possible to
achieve without rebuilding the whole domain, but would certainly be more
involved. Furthermore there's not such a clear use-case for tightening
up security *after* the device may already have done whatever it is that
you don't trust it not to do, so we only consider the relaxation case.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/d652966348c78457c38bf18daf369272a4ebc2c9.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
To parallel the sysfs behaviour, merge the new build-time option
for DMA domain strictness into the default domain type choice.
Suggested-by: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/d04af35b9c0f2a1d39605d7a9b451f5e1f0c7736.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When passthrough is enabled, the default strictness policy becomes
irrelevant, since any subsequent runtime override to a DMA domain type
now embodies an explicit choice of strictness as well. Save on noise by
only logging the default policy when it is meaningfully in effect.
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9d2bcba880c6d517d0751ed8bd4960853030b4d7.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The sysfs interface for default domain types exists primarily so users
can choose the performance/security tradeoff relevant to their own
workload. As such, the choice between the policies for DMA domains fits
perfectly as an additional point on that scale - downgrading a
particular device from a strict default to non-strict may be enough to
let it reach the desired level of performance, while still retaining
more peace of mind than with a wide-open identity domain. Now that we've
abstracted non-strict mode as a distinct type of DMA domain, allow it to
be chosen through the user interface as well.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0e08da5ed4069fd3473cfbadda758ca983becdbf.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Eliminate the iommu_get_dma_strict() indirection and pipe the
information through the domain type from the beginning. Besides
the flow simplification this also has several nice side-effects:
- Automatically implies strict mode for untrusted devices by
virtue of their IOMMU_DOMAIN_DMA override.
- Ensures that we only end up using flush queues for drivers
which are aware of them and can actually benefit.
- Allows us to handle flush queue init failure by falling back
to strict mode instead of leaving it to possibly blow up later.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/47083d69155577f1367877b1594921948c366eb3.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In preparation for the strict vs. non-strict decision for DMA domains
to be expressed in the domain type, make sure we expose our flush queue
awareness by accepting the new domain type, and test the specific
feature flag where we want to identify DMA domains in general. The DMA
ops reset/setup can simply be made unconditional, since iommu-dma
already knows only to touch DMA domains.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/31a8ef868d593a2f3826a6a120edee81815375a7.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Promote the difference between strict and non-strict DMA domains from an
internal detail to a distinct domain feature and type, to pave the road
for exposing it through the sysfs default domain interface.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/08cd2afaf6b63c58ad49acec3517c9b32c2bb946.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
IO_PGTABLE_QUIRK_NON_STRICT was never a very comfortable fit, since it's
not a quirk of the pagetable format itself. Now that we have a more
appropriate way to convey non-strict unmaps, though, this last of the
non-quirk quirks can also go, and with the flush queue code also now
enforcing its own ordering we can have a lovely cleanup all round.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/155b5c621cd8936472e273a8b07a182f62c6c20d.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since iommu_iotlb_gather exists to help drivers optimise flushing for a
given unmap request, it is also the logical place to indicate whether
the unmap is strict or not, and thus help them further optimise for
whether to expect a sync or a flush_all subsequently. As part of that,
it also seems fair to make the flush queue code take responsibility for
enforcing the really subtle ordering requirement it brings, so that we
don't need to worry about forgetting that if new drivers want to add
flush queue support, and can consolidate the existing versions.
While we're adding to the kerneldoc, also fill in some info for
@freelist which was overlooked previously.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/bf5f8e2ad84e48c712ccbf80fa8c610594c7595f.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
iommu_dma_init_domain() is now only called from iommu_setup_dma_ops(),
which has already assumed dev to be non-NULL.
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/06024523c080364390016550065e3cfe8031367e.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>