CONFIG_MIGRATE_VMA_HELPER guards helpers that are required for proper
device private memory support. Remove the option and just check for
CONFIG_DEVICE_PRIVATE instead.
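A minimal sketch of the resulting header guard, assuming the three
migrate_vma entry points are the symbols being protected (the exact
declaration list in migrate.h may differ):

  /* include/linux/migrate.h (sketch) */
  #ifdef CONFIG_DEVICE_PRIVATE
  int migrate_vma_setup(struct migrate_vma *args);
  void migrate_vma_pages(struct migrate_vma *migrate);
  void migrate_vma_finalize(struct migrate_vma *migrate);
  #endif /* CONFIG_DEVICE_PRIVATE */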
Link: https://lore.kernel.org/r/20190814075928.23766-11-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
No one ever checks this flag, and we could easily get that information
from the page if needed.
Link: https://lore.kernel.org/r/20190814075928.23766-10-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Now that we can rely on errors in the normal control flow there is no
need for this flag, remove it.
Link: https://lore.kernel.org/r/20190814075928.23766-9-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Factor the main copy page to vram routine out into a helper that acts
on a single page and which doesn't require the nouveau_dmem_migrate
structure for argument passing. As an added benefit the new version
only allocates the dma address array once and reuses it for each
subsequent chunk of work.
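A hedged sketch of the chunked flow this describes; the function shape
and the nouveau_dmem_migrate_copy_one() helper follow the description
above, but names and arguments are illustrative:

  static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm,
          struct migrate_vma *args, dma_addr_t *dma_addrs)
  {
      unsigned long i, npages = (args->end - args->start) >> PAGE_SHIFT;
      unsigned long nr_dma = 0;

      /* Copy to vram one page at a time; no per-chunk allocation. */
      for (i = 0; i < npages; i++) {
          args->dst[i] = nouveau_dmem_migrate_copy_one(drm,
                  args->src[i], &dma_addrs[nr_dma]);
          if (args->dst[i])
              nr_dma++;
      }

      migrate_vma_pages(args);

      /* The caller-provided dma address array is reused per chunk. */
      while (nr_dma--)
          dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma],
                         PAGE_SIZE, DMA_BIDIRECTIONAL);
      migrate_vma_finalize(args);
  }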
Link: https://lore.kernel.org/r/20190814075928.23766-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Factor the main copy page to ram routine out into a helper that acts on
a single page and which doesn't require the nouveau_dmem_fault
structure for argument passing. Also remove the loop over multiple
pages as we only handle one at the moment, although the structure of
the main worker function makes it relatively easy to add multi page
support back if needed in the future. But at least for now this avoids
the need to dynamically allocate memory for the dma addresses in
what is essentially the page fault path.
Link: https://lore.kernel.org/r/20190814075928.23766-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
nouveau_dmem_migrate_vma and nouveau_dmem_convert_pfn are only called
when CONFIG_DRM_NOUVEAU_SVM is enabled, so there is no need to provide
!CONFIG_DRM_NOUVEAU_SVM stubs for them.
Link: https://lore.kernel.org/r/20190814075928.23766-6-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Factor out the end of fencing logic from the two migration routines.
Link: https://lore.kernel.org/r/20190814075928.23766-5-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Factor out the repeated device memory address calculation into
a helper.
Link: https://lore.kernel.org/r/20190814075928.23766-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When we start a new batch of dma_map operations we need to reset dma_nr,
as we start filling a newly allocated array.
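A tiny sketch of the fix (field names are illustrative):

  /* Starting a new batch: the array is freshly allocated, so the
   * fill index must start over. */
  migrate->dma = kmalloc(sizeof(*migrate->dma) * npages, GFP_KERNEL);
  if (!migrate->dma)
      goto error;
  migrate->dma_nr = 0;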
Link: https://lore.kernel.org/r/20190814075928.23766-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There isn't any good reason to pass callbacks to migrate_vma. Instead
we can just export the three steps done by this function to drivers and
let them sequence the operation without callbacks. This removes a lot
of boilerplate code as-is, and will allow the drivers to drastically
improve code flow and error handling further on.
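A minimal sketch of the resulting driver-side sequence, using the
migrate_vma_setup()/migrate_vma_pages()/migrate_vma_finalize() steps
this exports:

  struct migrate_vma args = {
      .vma   = vma,
      .start = start,
      .end   = end,
      .src   = src_pfns,
      .dst   = dst_pfns,
  };
  int ret;

  ret = migrate_vma_setup(&args);
  if (ret < 0)
      return ret;

  /* The driver allocates destination pages and copies the data
   * here, filling args.dst with migrate_pfn() encoded entries. */

  migrate_vma_pages(&args);
  migrate_vma_finalize(&args);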
Link: https://lore.kernel.org/r/20190814075928.23766-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Jason Gunthorpe says:
====================
Add mmu_notifier_get/put for managing mmu notifier registrations
This series introduces a new registration flow for mmu_notifiers based on
the idea that the user would like to get a single refcounted piece of
memory for a mm, keyed to its use.
For instance many users of mmu_notifiers use an interval tree or similar
to dispatch notifications to some object. There are many objects but only
one notifier subscription per mm holding the tree.
Of the 12 places that call mmu_notifier_register:
- 7 are maintaining some kind of obvious mapping of mm_struct to
mmu_notifier registration, ie in some linked list or hash table. Of
the 7 this series converts 4 (gru, hmm, RDMA, radeon)
- 3 (hfi1, gntdev, vhost) are registering multiple notifiers, but each
one immediately does some VA range filtering, ie with an interval tree.
These would be better with a global subsystem-wide range filter and
could convert to this API.
- 2 (kvm, amd_iommu) are deliberately using a single mm at a time, and
really can't use this API. One of the intel-svm's modes is also in this
list
The 3 unconverted drivers (of those 7) are:
- intel-svm
This driver tracks mm's in a global linked list 'global_svm_list'
and would benefit from this API.
Its flow is a bit complex, since it also wants a set of non-shared
notifiers.
- i915_gem_userptr
This driver tracks mm's in a per-device hash
table (dev_priv->mm_structs), but only has an optional use of
mmu_notifiers. Since it still seems to need the hash table it is
difficult to convert.
- amdkfd/kfd_process
This driver is using a global SRCU hash table to track mm's
The control flow here is very complicated and the driver is relying on
this hash table to be fast on the ioctl syscall path.
It would definitely benefit, but only if the ioctl path didn't need to
do the search so often.
====================
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
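A hedged sketch of the new flow from a subsystem's point of view; the
my_* names are hypothetical, only mmu_notifier_get()/mmu_notifier_put()
and the alloc_notifier/free_notifier ops come from the series:

  #include <linux/mmu_notifier.h>
  #include <linux/slab.h>

  struct my_notifier {
      struct mmu_notifier mn;
      /* per-mm subsystem state, e.g. an interval tree */
  };

  static struct mmu_notifier *my_alloc_notifier(struct mm_struct *mm)
  {
      struct my_notifier *p = kzalloc(sizeof(*p), GFP_KERNEL);

      return p ? &p->mn : ERR_PTR(-ENOMEM);
  }

  static void my_free_notifier(struct mmu_notifier *mn)
  {
      kfree(container_of(mn, struct my_notifier, mn));
  }

  static const struct mmu_notifier_ops my_ops = {
      .alloc_notifier = my_alloc_notifier,
      .free_notifier  = my_free_notifier,
  };

  static int my_attach(void)
  {
      /* One refcounted registration per mm, keyed by &my_ops. */
      struct mmu_notifier *mn = mmu_notifier_get(&my_ops, current->mm);

      if (IS_ERR(mn))
          return PTR_ERR(mn);
      /* ... use container_of(mn, struct my_notifier, mn) ... */
      mmu_notifier_put(mn); /* freeing is deferred via SRCU */
      return 0;
  }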
The sequence of mmu_notifier_unregister_no_release(),
mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the
free_notifier callback.
As this is the last user of those APIs, converting it means we can drop
them.
Link: https://lore.kernel.org/r/20190806231548.25242-11-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When using mmu_notifier_unregister_no_release() the caller must ensure
there is a SRCU synchronize before the mn memory is freed, otherwise use
after free races are possible, for instance:
    CPU0                                          CPU1
                                             invalidate_range_start
                                                hlist_for_each_entry_rcu(..)
    mmu_notifier_unregister_no_release(&p->mn)
    kfree(mn)
                                                if (mn->ops->invalidate_range_end)
The error unwind in amdkfd misses the SRCU synchronization.
amdkfd keeps the kfd_process around until the mm is released, so split the
flow to fully initialize the kfd_process and register it for find_process,
and with the notifier. Past this point the kfd_process does not need to be
cleaned up as it is fully ready.
The final failable step does a vm_mmap() and does not seem to impact the
kfd_process global state. Since it also cannot be undone (and already has
problems with undo if it internally fails), it has to be last.
This way we don't have to try to unwind the mmu_notifier_register() and
avoid the problem with the SRCU.
Along the way this also fixes various other error unwind bugs in the flow.
Fixes: 45102048f7 ("amdkfd: Add process queue manager module")
Link: https://lore.kernel.org/r/20190806231548.25242-10-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
radeon is using a device global hash table to track what mmu_notifiers
have been registered on struct mm. This is better served with the new
get/put scheme instead.
radeon has a bug where it was not blocking notifier release() until all
the BOs had been invalidated. This could result in a use after free of
the pages backing the BOs. This is tied into a second bug where radeon left the
notifiers running endlessly even once the interval tree became
empty. This could result in a use after free with module unload.
Both are fixed by changing the lifetime model, the BOs exist in the
interval tree with their natural lifetimes independent of the mm_struct
lifetime using the get/put scheme. The release runs synchronously and just
does invalidate_start across the entire interval tree to create the
required DMA fence.
Additions to the interval tree after release are already impossible as
only current->mm is used during the add.
Link: https://lore.kernel.org/r/20190806231548.25242-9-jgg@ziepe.ca
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is a significant simplification: it eliminates all the remaining
'hmm' stuff in mm_struct, eliminates krefing along the critical notifier
paths, and takes away all the ugly locking and abuse of page_table_lock.
mmu_notifier_get() provides the single struct hmm per struct mm which
eliminates mm->hmm.
It also directly guarantees that no mmu_notifier op callback is callable
while concurrent free is possible, this eliminates all the krefs inside
the mmu_notifier callbacks.
The remaining krefs in the range code were overly cautious, drivers are
already not permitted to free the mirror while a range exists.
Link: https://lore.kernel.org/r/20190806231548.25242-6-jgg@ziepe.ca
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
GRU is already using almost the same algorithm as get/put, it even
helpfully has a 10 year old comment to make this algorithm common, which
is finally happening.
There are a few differences and fixes from this conversion:
- GRU used rcu not srcu to read the hlist
- Unclear how the locking worked to prevent gru_register_mmu_notifier()
from running concurrently with gru_drop_mmu_notifier() - this version is
safe
- GRU had a release function which only set a variable, without any locking,
  and skipped the synchronize_srcu during unregister, which looks racy;
  this conversion makes it reliable via the integrated call_srcu().
- It is unclear if the mmap_sem is actually held when
__mmu_notifier_register() was called; lockdep will now warn if this is
wrong
Link: https://lore.kernel.org/r/20190806231548.25242-5-jgg@ziepe.ca
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dimitri Sivanich <sivanich@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Many places in the kernel have a flow where userspace will create some
object and that object will need to connect to the subsystem's
mmu_notifier subscription for the duration of its lifetime.
In this case the subsystem is usually tracking multiple mm_structs and it
is difficult to keep track of what struct mmu_notifier's have been
allocated for what mm's.
Since this has been open coded in a variety of exciting ways, provide core
functionality to do this safely.
This approach uses the struct mmu_notifier_ops * as a key to determine if
the subsystem has a notifier registered on the mm or not. If there is a
registration then the existing notifier struct is returned, otherwise the
ops->alloc_notifier() is used to create a new per-subsystem notifier for
the mm.
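A hedged sketch of that lookup, simplified from the description (the
real code in mm/mmu_notifier.c differs in detail):

  static struct mmu_notifier *
  find_get_mmu_notifier(struct mm_struct *mm, const struct mmu_notifier_ops *ops)
  {
      struct mmu_notifier *mn;

      spin_lock(&mm->mmu_notifier_mm->lock);
      hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
          if (mn->ops != ops)
              continue;
          mn->users++;   /* existing registration: take a reference */
          spin_unlock(&mm->mmu_notifier_mm->lock);
          return mn;
      }
      spin_unlock(&mm->mmu_notifier_mm->lock);
      return NULL;       /* caller falls back to ops->alloc_notifier() */
  }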
The destroy side incorporates an async call_srcu based destruction which
will avoid bugs in the callers such as commit 6d7c3cde93 ("mm/hmm: fix
use after free with struct hmm in the mmu notifiers").
Since we are inside the mmu notifier core, locking is fairly simple: the
allocation uses the same approach as for mmu_notifier_mm, the write side
of the mmap_sem makes everything deterministic and we only need to do
hlist_add_head_rcu() under the mm_take_all_locks(). The new users count
and the discoverability in the hlist is fully serialized by the
mmu_notifier_mm->lock.
Link: https://lore.kernel.org/r/20190806231548.25242-4-jgg@ziepe.ca
Co-developed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
A prior commit e0f3c3f78d ("mm/mmu_notifier: init notifier if necessary")
made an attempt at doing this, but had to be reverted as calling
the GFP_KERNEL allocator under the i_mmap_mutex causes deadlock, see
commit 35cfa2b0b4 ("mm/mmu_notifier: allocate mmu_notifier in advance").
However, we can avoid that problem by doing the allocation only under
the mmap_sem, which is already happening.
Since all writers to mm->mmu_notifier_mm hold the write side of the
mmap_sem reading it under that sem is deterministic and we can use that to
decide if the allocation path is required, without speculation.
The actual update to mmu_notifier_mm must still be done under the
mm_take_all_locks() to ensure read-side coherency.
Link: https://lore.kernel.org/r/20190806231548.25242-3-jgg@ziepe.ca
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This simplifies the code to not have so many one line functions and extra
logic. __mmu_notifier_register() simply becomes the entry point to
register the notifier, and the other one calls it under lock.
Also add a lockdep_assert to check that the callers are holding the lock
as expected.
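A sketch of the simplified structure (close to, but not necessarily
identical with, the final code):

  int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
  {
      lockdep_assert_held_write(&mm->mmap_sem);
      /* ... the actual registration work ... */
      return 0;
  }

  int mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
  {
      int ret;

      down_write(&mm->mmap_sem);
      ret = __mmu_notifier_register(mn, mm);
      up_write(&mm->mmap_sem);
      return ret;
  }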
Link: https://lore.kernel.org/r/20190806231548.25242-2-jgg@ziepe.ca
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Make HMM_MIRROR an option that is selected by drivers wanting to use it
instead of a user visible option as it is just a low-level implementation
detail.
Link: https://lore.kernel.org/r/20190806160554.14046-15-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There isn't really any architecture specific code in this page table walk
implementation, so drop the dependencies.
Link: https://lore.kernel.org/r/20190806160554.14046-14-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Stub out the whole function and assign NULL to the .hugetlb_entry method
if CONFIG_HUGETLB_PAGE is not set, as the method won't ever be called in
that case.
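The pattern, sketched (the real walker body is elided):

  #ifdef CONFIG_HUGETLB_PAGE
  static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
                                        unsigned long start, unsigned long end,
                                        struct mm_walk *walk)
  {
      /* ... handle the hugetlb entry ... */
      return 0;
  }
  #else
  #define hmm_vma_walk_hugetlb_entry NULL
  #endif /* CONFIG_HUGETLB_PAGE */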
Link: https://lore.kernel.org/r/20190806160554.14046-13-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Stub out the whole function when CONFIG_TRANSPARENT_HUGEPAGE is not set to
make the function easier to read.
Link: https://lore.kernel.org/r/20190806160554.14046-12-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
We only need the special pud_entry walker if PUD-sized hugepages and pte
mappings are supported, else the common pagewalk code will take care of
the iteration. Not implementing this callback reduces the amount of code
compiled for non-x86 platforms, and also fixes compile failures with other
architectures when helpers like pud_pfn are not implemented.
Link: https://lore.kernel.org/r/20190806160554.14046-11-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
pte_index is an internal arch helper in various architectures, without
consistent semantics. Open code the calculation of a PMD index based on
the virtual address instead.
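The change, sketched from the description (before/after of the affected
computation):

  /* before: arch-internal helper with inconsistent semantics */
  pfn = pmd_pfn(pmd) + pte_index(addr);

  /* after: open-coded page index within the PMD */
  pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);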
Link: https://lore.kernel.org/r/20190806160554.14046-10-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The pagewalk code already passes the value as the hmask parameter.
Link: https://lore.kernel.org/r/20190806160554.14046-9-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
All users pass PAGE_SIZE here, and if we wanted to support single entries
for huge pages we should really just add a HMM_FAULT_HUGEPAGE flag
that uses the huge page size, instead of having the caller calculate
that size just for the hmm code to verify it.
Link: https://lore.kernel.org/r/20190806160554.14046-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The start, end and page_shift values are all saved in the range structure,
so we might as well use that for argument passing.
Link: https://lore.kernel.org/r/20190806160554.14046-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
We'll need the nouveau_svmm structure to improve the function soon. For
now this allows using the svmm->mm reference to unlock the mmap_sem, and
thus the same dereference chain that the caller uses to lock and unlock
it.
Link: https://lore.kernel.org/r/20190806160554.14046-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The list is used to add the range to another list as an entry in the core
hmm code, and intended as a private member not exposed to drivers. There
is no need to initialize it in a driver.
Link: https://lore.kernel.org/r/20190806160554.14046-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
hmm_range_fault can only return -EAGAIN if called with the
HMM_FAULT_ALLOW_RETRY flag, which amdgpu never passes. Remove the handling
for the -EAGAIN case with its non-standard locking scheme.
Link: https://lore.kernel.org/r/20190806160554.14046-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Since hmm_range_fault() doesn't use the struct hmm_range vma field, remove
it.
Link: https://lore.kernel.org/r/20190726005650.2566-8-rcampbell@nvidia.com
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
walk_page_range() will only call hmm_vma_walk_hugetlb_entry() for
hugetlbfs pages and doesn't call hmm_vma_walk_pmd() in this case.
Therefore, it is safe to remove the check for vma->vm_flags & VM_HUGETLB
in hmm_vma_walk_pmd().
Link: https://lore.kernel.org/r/20190726005650.2566-7-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add a HMM_FAULT_SNAPSHOT flag so that hmm_range_snapshot can be merged
into the almost identical hmm_range_fault function.
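Caller-side sketch of the merged entry point:

  /* was hmm_range_snapshot(&range): read-only, no faulting */
  ret = hmm_range_fault(&range, HMM_FAULT_SNAPSHOT);

  /* fault pages in as before */
  ret = hmm_range_fault(&range, 0);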
Link: https://lore.kernel.org/r/20190726005650.2566-5-rcampbell@nvidia.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This allows easier expansion to other flags, and also makes the callers a
little easier to read.
Link: https://lore.kernel.org/r/20190726005650.2566-4-rcampbell@nvidia.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
A few more comments and minor programming style clean ups. There should
be no functional changes.
Link: https://lore.kernel.org/r/20190726005650.2566-3-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The hmm_mirror_ops callback function sync_cpu_device_pagetables() passes a
struct hmm_update which is a simplified version of struct
mmu_notifier_range. This is unnecessary so replace hmm_update with
mmu_notifier_range directly.
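Sketch of a driver-side callback with the new signature:

  static int driver_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
          const struct mmu_notifier_range *update)
  {
      /* invalidate device mappings covering
       * [update->start, update->end) */
      return 0;
  }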
Link: https://lore.kernel.org/r/20190726005650.2566-2-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
[jgg: white space tuning]
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The magic dropping of mmap_sem when handle_mm_fault returns VM_FAULT_RETRY
is rather subtle. Add a comment explaining it.
Link: https://lore.kernel.org/r/20190724065258.16603-8-hch@lst.de
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
[hch: wrote a changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Switch the one remaining user in nouveau over to its replacement, and
remove all the wrappers.
Link: https://lore.kernel.org/r/20190724065258.16603-7-hch@lst.de
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-EAGAIN has a magic meaning for non-blocking faults, so don't overload it.
Given that the caller doesn't check for specific error codes this change
is purely cosmetic.
Link: https://lore.kernel.org/r/20190724065258.16603-6-hch@lst.de
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently nouveau_svm_fault expects nouveau_range_fault to never unlock
mmap_sem, but the latter unlocks it for a random selection of error
codes. Fix this up by always unlocking mmap_sem for non-zero return values
in nouveau_range_fault, and only unlocking it in the caller for successful
returns.
Link: https://lore.kernel.org/r/20190724065258.16603-5-hch@lst.de
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The parameter is always false, so remove it as well as the -EAGAIN
handling that can only happen for the non-blocking case.
Link: https://lore.kernel.org/r/20190724065258.16603-4-hch@lst.de
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
These two functions are marked as legacy APIs to get rid of, but seem to
suit the current nouveau flow. Move them to the only user in preparation
for fixing a locking bug involving caller and callee. All comments
referring to the old API have been removed as this now is a driver private
helper.
Link: https://lore.kernel.org/r/20190724065258.16603-3-hch@lst.de
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
We should not have two different error codes for the same
condition. EAGAIN must be reserved for the FAULT_FLAG_ALLOW_RETRY retry
case and signals to the caller that the mmap_sem has been unlocked.
Use EBUSY for the !valid case so that callers can get the locking right.
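Caller-side sketch of the contract after this change, assuming the
then-current bool 'block' argument to hmm_range_fault():

  ret = hmm_range_fault(range, false);
  if (ret == -EBUSY) {
      /* !range->valid: mmap_sem is still held, caller may retry */
  } else if (ret == -EAGAIN) {
      /* FAULT_FLAG_ALLOW_RETRY path: mmap_sem has been dropped */
  }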
Link: https://lore.kernel.org/r/20190724065258.16603-2-hch@lst.de
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
[jgg: elaborated commit message]
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Merge tag 'devicetree-fixes-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull Devicetree fixes from Rob Herring:
"Fix several warnings/errors in validation of binding schemas"
* tag 'devicetree-fixes-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: pinctrl: stm32: Fix missing 'clocks' property in examples
dt-bindings: iio: ad7124: Fix dtc warnings in example
dt-bindings: iio: avia-hx711: Fix avdd-supply typo in example
dt-bindings: pinctrl: aspeed: Fix AST2500 example errors
dt-bindings: pinctrl: aspeed: Fix 'compatible' schema errors
dt-bindings: riscv: Limit cpus schema to only check RiscV 'cpu' nodes
dt-bindings: Ensure child nodes are of type 'object'
Pull vfs documentation typo fix from Al Viro.
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
typo fix: it's d_make_root, not d_make_inode...
Merge tag '5.3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Two fixes for stable, one that had dependency on earlier patch in this
merge window and can now go in, and a perf improvement in SMB3 open"
* tag '5.3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module number
cifs: flush before set-info if we have writeable handles
smb3: optimize open to not send query file internal info
cifs: copy_file_range needs to strip setuid bits and update timestamps
CIFS: fix deadlock in cached root handling
Commit b3aa14f022 ("iommu: remove the mapping_error dma_map_ops
method") incorrectly changed the return value check for
dma_ops_alloc_iova() in map_sg(), causing a crash under memory
pressure: dma_ops_alloc_iova() never returns DMA_MAPPING_ERROR on
failure but 0, so the error handling was all wrong.
kernel BUG at drivers/iommu/iova.c:801!
Workqueue: kblockd blk_mq_run_work_fn
RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0
Call Trace:
free_cpu_cached_iovas+0xbd/0x150
alloc_iova_fast+0x8c/0xba
dma_ops_alloc_iova.isra.6+0x65/0xa0
map_sg+0x8c/0x2a0
scsi_dma_map+0xc6/0x160
pqi_aio_submit_io+0x1f6/0x440 [smartpqi]
pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi]
scsi_queue_rq+0x79c/0x1200
blk_mq_dispatch_rq_list+0x4dc/0xb70
blk_mq_sched_dispatch_requests+0x249/0x310
__blk_mq_run_hw_queue+0x128/0x200
blk_mq_run_work_fn+0x27/0x30
process_one_work+0x522/0xa10
worker_thread+0x63/0x5b0
kthread+0x1d2/0x1f0
ret_from_fork+0x22/0x40
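A hedged sketch of the fix in map_sg() (context simplified):

  address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
  if (!address)   /* was: if (address == DMA_MAPPING_ERROR) */
      goto out_err;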
Fixes: b3aa14f022 ("iommu: remove the mapping_error dma_map_ops method")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>