After binding a device to an mm, device drivers currently need to
register a mm_exit handler. This function is called when the mm exits,
to gracefully stop DMA targeting the address space and flush page faults
to the IOMMU.
This is deemed too complex for the MMU release() notifier, which may be
triggered by any mmput() invocation, from about 120 callsites [1]. The
upcoming SVA module has an example of such complexity: the I/O Page
Fault handler would need to call mmput_async() instead of mmput() after
handling an IOPF, to avoid triggering the release() notifier which would
in turn drain the IOPF queue and lock up.
Another concern is the DMA stop function taking too long, up to several
minutes [2]. For some mmput() callers this may disturb other users. For
example, if the OOM killer picks the mm bound to a device as the victim
and that mm's memory is locked, if the release() takes too long, it
might choose additional innocent victims to kill.
To simplify the MMU release notifier, don't forward the notification to
device drivers. Since they don't stop DMA on mm exit anymore, the PASID
lifetime is extended:
(1) The device driver calls bind(). A PASID is allocated.
Here any DMA fault is handled by mm, and on error we don't print
anything to dmesg. Userspace can easily trigger errors by issuing DMA
on unmapped buffers.
(2) exit_mmap(), for example the process took a SIGKILL. This step
doesn't happen during normal operations. Remove the pgd from the
PASID table, since the page tables are about to be freed. Invalidate
the IOTLBs.
Here the device may still perform DMA on the address space. Incoming
transactions are aborted but faults aren't printed out. ATS
Translation Requests return Successful Translation Completions with
R=W=0. PRI Page Requests return with Invalid Request.
(3) The device driver stops DMA, possibly following release of a fd, and
calls unbind(). PASID table is cleared, IOTLB invalidated if
necessary. The page fault queues are drained, and the PASID is
freed.
If DMA for that PASID is still running here, something went seriously
wrong and errors should be reported.
For now remove iommu_sva_ops entirely. We might need to re-introduce
them at some point, for example to notify device drivers of unhandled
IOPF.
[1] https://lore.kernel.org/linux-iommu/20200306174239.GM31668@ziepe.ca/
[2] https://lore.kernel.org/linux-iommu/4d68da96-0ad5-b412-5987-2f7a6aa796c3@amd.com/
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200423125329.782066-3-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>