linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-26 13:44:15 +08:00

Author	SHA1	Message	Date
Arnd Bergmann	7d31677bb7	gpu: host1x: fix uninitialized variable use The error handling for platform_get_irq() failing no longer works after a recent change, clang now points this out with a warning: drivers/gpu/host1x/dev.c:520:6: error: variable 'syncpt_irq' is uninitialized when used here [-Werror,-Wuninitialized] if (syncpt_irq < 0) ^~~~~~~~~~ Fix this by removing the variable and checking the correct error status. Fixes: `625d4ffb43` ("gpu: host1x: Rewrite syncpoint interrupt handling") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-03-20 11:12:37 -07:00
Linus Torvalds	a13de74e47	IOMMU Updates for Linux v6.3: Including: - Consolidate iommu_map/unmap functions. There have been blocking and atomic variants so far, but that was problematic as this approach does not scale with required new variants which just differ in the GFP flags used. So Jason consolidated this back into single functions that take a GFP parameter. This has the potential to cause conflicts with other trees, as they introduce new call-sites for the changed functions. I offered them to pull in the branch containing these changes and resolve it, but I am not sure everyone did that. The conflicts this caused with upstream up to v6.2-rc8 are resolved in the final merge commit. - Retire the detach_dev() call-back in iommu_ops - Arm SMMU updates from Will: - Device-tree binding updates: * Cater for three power domains on SM6375 * Document existing compatible strings for Qualcomm SoCs * Tighten up clocks description for platform-specific compatible strings - Enable Qualcomm workarounds for some additional platforms that need them - Intel VT-d updates from Lu Baolu: - Add Intel IOMMU performance monitoring support - Set No Execute Enable bit in PASID table entry - Two performance optimizations - Fix PASID directory pointer coherency - Fix missed rollbacks in error path - Cleanups - Apple t8110 DART support - Exynos IOMMU: - Implement better fault handling - Error handling fixes - Renesas IPMMU: - Add device tree bindings for r8a779g0 - AMD IOMMU: - Various fixes for handling on SNP-enabled systems and handling of faults with unknown request-ids - Cleanups and other small fixes - Various other smaller fixes and cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmP0hDwACgkQK/BELZcB GuM43RAA0YieShO+X0h6TFGfbK0zVoPd91giZehWBv9rHK7pP4iY8UEtBLBWGx/t CId4t98mmKmC212zz8QxrwAEzyTIRY+2t1yrpG2aVkoTYk8inMb07TU37wganh3O T0QccXN+9b2BS4k8yro5f3uX0d/C1JQVcMowwr53VMb/e73huqP1VTbz06/CIWMH DUhVRCzmNhSvoUOT5n7g6+ZDH+pot8WPZbtHV7FowEsmPCRc7Fj8kXyI9FEwKwrZ hIV5Y+6Lej8nQScgbO8MfblJym3VrBoSoM4GY2w0L0rjQw6m+Xtea5rT0W39YVWy YpiscLTL8TIMPP9zK1dXVygTaABK4J2iWmheHPkpKXIhK0iuH3Dke0Do5p6DNITj 7J2YlaNEB480D5hvNBKsbbGHavgGPT8m529Sz0R7mSC7omRzqiG5Vsb46IXL+2bc 92ojjYNfXb6OCtagIr2LMBLZRL2JCODqF1dUmyZfA8GKOHLP5kZXoMM+sZbQ2aUL 1LOxRZVx+tlb9V4VaH1ZSs/6eM+HLDzjtHeu3PoWYf6mW4AEt4S/yl9SKAkGdBqt jCUErmYB1nU/eefqG1jhWRpQeJabcT3Oe30NZru1pfMoREThhjbAACw1JxWtoe1X ipGpV6lAP7tQUGuRk3/9O1lNqElJuNwC5lVTjS4FJ38vYQhQbao= =ZaZV -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Consolidate iommu_map/unmap functions. There have been blocking and atomic variants so far, but that was problematic as this approach does not scale with required new variants which just differ in the GFP flags used. So Jason consolidated this back into single functions that take a GFP parameter. - Retire the detach_dev() call-back in iommu_ops - Arm SMMU updates from Will: - Device-tree binding updates: - Cater for three power domains on SM6375 - Document existing compatible strings for Qualcomm SoCs - Tighten up clocks description for platform-specific compatible strings - Enable Qualcomm workarounds for some additional platforms that need them - Intel VT-d updates from Lu Baolu: - Add Intel IOMMU performance monitoring support - Set No Execute Enable bit in PASID table entry - Two performance optimizations - Fix PASID directory pointer coherency - Fix missed rollbacks in error path - Cleanups - Apple t8110 DART support - Exynos IOMMU: - Implement better fault handling - Error handling fixes - Renesas IPMMU: - Add device tree bindings for r8a779g0 - AMD IOMMU: - Various fixes for handling on SNP-enabled systems and handling of faults with unknown request-ids - Cleanups and other small fixes - Various other smaller fixes and cleanups * tag 'iommu-updates-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (71 commits) iommu/amd: Skip attach device domain is same as new domain iommu: Attach device group to old domain in error path iommu/vt-d: Allow to use flush-queue when first level is default iommu/vt-d: Fix PASID directory pointer coherency iommu/vt-d: Avoid superfluous IOTLB tracking in lazy mode iommu/vt-d: Fix error handling in sva enable/disable paths iommu/amd: Improve page fault error reporting iommu/amd: Do not identity map v2 capable device when snp is enabled iommu: Fix error unwind in iommu_group_alloc() iommu/of: mark an unused function as __maybe_unused iommu: dart: DART_T8110_ERROR range should be 0 to 5 iommu/vt-d: Enable IOMMU perfmon support iommu/vt-d: Add IOMMU perfmon overflow handler support iommu/vt-d: Support cpumask for IOMMU perfmon iommu/vt-d: Add IOMMU perfmon support iommu/vt-d: Support Enhanced Command Interface iommu/vt-d: Retrieve IOMMU perfmon capability information iommu/vt-d: Support size of the register set in DRHD iommu/vt-d: Set No Execute Enable bit in PASID table entry iommu/vt-d: Remove sva from intel_svm_dev ...	2023-02-24 13:40:13 -08:00
Linus Torvalds	a93e884edf	Driver core changes for 6.3-rc1 Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6 6oeFOjD3JDju3cQsfGgd =Su6W -----END PGP SIGNATURE----- Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems" [ Geert Uytterhoeven points out that that last sentence isn't true, and that there's a pending report that has a fix that is queued up - Linus ] * tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits) debugfs: drop inline constant formatting for ERR_PTR(-ERROR) OPP: fix error checking in opp_migrate_dentry() debugfs: update comment of debugfs_rename() i3c: fix device.h kernel-doc warnings dma-mapping: no need to pass a bus_type into get_arch_dma_ops() driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place Revert "driver core: add error handling for devtmpfs_create_node()" Revert "devtmpfs: add debug info to handle()" Revert "devtmpfs: remove return value of devtmpfs_delete_node()" driver core: cpu: don't hand-override the uevent bus_type callback. devtmpfs: remove return value of devtmpfs_delete_node() devtmpfs: add debug info to handle() driver core: add error handling for devtmpfs_create_node() driver core: bus: update my copyright notice driver core: bus: add bus_get_dev_root() function driver core: bus: constify bus_unregister() driver core: bus: constify some internal functions driver core: bus: constify bus_get_kset() driver core: bus: constify bus_register/unregister_notifier() driver core: remove private pointer from struct bus_type ...	2023-02-24 12:58:55 -08:00
Thierry Reding	9026ba7223	gpu: host1x: Use tegra_dev_iommu_get_stream_id() Use the newly implemented tegra_dev_iommu_get_stream_id() helper to encapsulate and centralize the IOMMU stream ID access. Signed-off-by: Thierry Reding <treding@nvidia.com>	2023-01-27 17:41:49 +01:00
Greg Kroah-Hartman	2a81ada32f	driver core: make struct bus_type.uevent() take a const * The uevent() callback in struct bus_type should not be modifying the device that is passed into it, so mark it as a const * and propagate the function signature changes out into all relevant subsystems that use this callback. Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-27 13:45:52 +01:00
Mikko Perttunen	d5179020f5	gpu: host1x: External timeout/cancellation for fences Currently all fences have a 30 second timeout to ensure they are cleaned up if the fence never completes otherwise. However, this one size fits all solution doesn't actually fit in every case, such as syncpoint waiting where we want to be able to have timeouts longer than 30 seconds. As such, we want to be able to give control over fence cancellation to the caller (and maybe eventually get rid of the internal timeout altogether). Here we add this cancellation mechanism by essentially adding a function for entering the timeout path by function call, and changing the syncpoint wait function to use it. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2023-01-26 15:55:38 +01:00
Mikko Perttunen	625d4ffb43	gpu: host1x: Rewrite syncpoint interrupt handling Move from the old, complex intr handling code to a new implementation based on dma_fences. While there is a fair bit of churn to get there, the new implementation is much simpler and likely faster as well due to allowing signaling directly from interrupt context. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2023-01-26 15:55:38 +01:00
Mikko Perttunen	c24973ed79	gpu: host1x: Implement job tracking using DMA fences In anticipation of removal of the intr API, implement job tracking using DMA fences instead. The main two things about this are making cdma_update schedule the work since fence completion can now be called from interrupt context, and some complication in ensuring the callback is not running when we free the fence. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2023-01-26 15:55:38 +01:00
Mikko Perttunen	f0fb260a0c	gpu: host1x: Implement syncpoint wait using DMA fences In anticipation of removal of the intr API, move host1x_syncpt_wait to use DMA fences instead. As of this patch, this means that waits have a 30 second maximum timeout because of the implicit timeout we have with fences, but that will be lifted in a follow-up patch. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2023-01-26 15:55:38 +01:00
Mikko Perttunen	eb258cc1fd	gpu: host1x: Don't skip assigning syncpoints to channels The code to write the syncpoint channel assignment register incorrectly skips the write if hypervisor registers are not available. The register, however, is within the guest aperture so remove the check and assign syncpoints properly even on virtualized systems. Fixes: `c3f52220f2` ("gpu: host1x: Enable Tegra186 syncpoint protection") Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2023-01-26 15:55:38 +01:00
Mikko Perttunen	79aad29c7d	gpu: host1x: Fix mask for syncpoint increment register On Tegra186+, the syncpoint ID has 10 bits of space. To allow using more than 256 syncpoints, fix the mask. Fixes: `9abdd497cd` ("gpu: host1x: Tegra234 device data and headers") Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2023-01-26 15:55:38 +01:00
Jason Gunthorpe	1369459b2e	iommu: Add a gfp parameter to iommu_map() The internal mechanisms support this, but instead of exposting the gfp to the caller it wrappers it into iommu_map() and iommu_map_atomic() Fix this instead of adding more variants for GFP_KERNEL_ACCOUNT. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/1-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-01-25 11:52:00 +01:00
Thierry Reding	08fef75f5e	gpu: host1x: Staticize host1x_syncpt_fence_ops This structure is never used outside the file, so make it locally scoped. Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-11-25 16:14:59 +01:00
Liu Shixin	a624bd9cbd	gpu: host1x: Use DEFINE_SHOW_ATTRIBUTE to simplify debugfs code Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the debugfs code for the status and status_all entries. No functional change. Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-11-25 16:14:59 +01:00
Mikko Perttunen	97b93b7a4a	gpu: host1x: Add stream ID register data for NVDEC on Tegra234 Add entries for NVDEC to the Tegra234 SID table. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-11-25 16:14:58 +01:00
Mikko Perttunen	8935002fc3	gpu: host1x: Select context device based on attached IOMMU On Tegra234, engines that are programmed through Host1x channels can be attached to either the NISO0 or NISO1 SMMU. Because of that, when selecting a context device to use with an engine, we need to select one that is also attached to the same SMMU. Add a parameter to host1x_memory_context_alloc to specify which device we are allocating a context for, and use it to pick an appropriate context device. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> [treding@nvidia.com: update !IOMMU_API stub signature] Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-11-25 16:14:19 +01:00
Robin Murphy	c2418f911a	gpu: host1x: Avoid trying to use GART on Tegra20 Since commit `c7e3ca515e` ("iommu/tegra: gart: Do not register with bus") quite some time ago, the GART driver has effectively disabled itself to avoid issues with the GPU driver expecting it to work in ways that it doesn't. As of commit `57365a04c9` ("iommu: Move bus setup to IOMMU device registration") that bodge no longer works, but really the GPU driver should be responsible for its own behaviour anyway. Make the workaround explicit. Reported-by: Jon Hunter <jonathanh@nvidia.com> Suggested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-11-18 09:33:20 +01:00
Christophe JAILLET	2e1bfb314c	gpu: host1x: Use the bitmap API to allocate bitmaps Use bitmap_zalloc()/bitmap_free() instead of hand-writing them. It is less verbose and it improves the semantic. While at it, remove a useless bitmap_zero() call. The bitmap is already zero'ed when allocated. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 17:38:34 +02:00
Mikko Perttunen	8c92243d9e	gpu: host1x: Generalize host1x_cdma_push_wide() host1x_cdma_push_wide() had the assumptions that the last parameter word was a NOP opcode, and that NOP opcodes could be used in all situations. Neither are true with the new job opcode sequence, so adjust the function to not have these assumptions, and instead place an early RESTART opcode when necessary to jump back to the beginning of the pushbuffer. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 17:36:26 +02:00
Mikko Perttunen	5b7239c17c	gpu: host1x: Initialize syncval in channel_submit() During the refactoring of channel_submit(), assignment of syncval was moved but it is also used in channel_submit(). Add this assignment back to channel_submit() as well. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 17:35:19 +02:00
Robin Murphy	f99e689181	gpu: host1x: Register context bus unconditionally Conditional registration is a problem for other subsystems which may unwittingly try to interact with host1x_context_device_bus_type in an uninitialised state on non-Tegra platforms. A look under /sys/bus on a typical system already reveals plenty of entries from enabled but otherwise irrelevant configs, so lets keep things simple and register our context bus unconditionally too. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:31:24 +02:00
Mikko Perttunen	0ae4ae9158	gpu: host1x: Use RESTART_W to skip timed out jobs on Tegra186+ When MLOCK enforcement is enabled, the 0-word write currently done is rejected by the hardware outside of an MLOCK region. As such, on these chips, which also have the newer, more convenient RESTART_W opcode, use that instead to skip over the timed out job. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:53 +02:00
Mikko Perttunen	a94b8a77bc	gpu: host1x: Add MLOCK release code on Tegra234 With the full-featured opcode sequence using MLOCKs, we need to also unlock those MLOCKs in the event of a timeout. However, it turns out that on Tegra186/Tegra194, by default, we don't need to do this; furthermore, on Tegra234 it is much simpler to do; so only implement this on Tegra234 for the time being. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:53 +02:00
Mikko Perttunen	1411796f20	gpu: host1x: Rewrite job opcode sequence For new (Tegra186+) SoCs, use a new ('full-featured') job opcode sequence that is compatible with virtualization. In particular, the Host1x hardware in Tegra234 is more strict regarding the sequence, requiring ACQUIRE_MLOCK-SETCLASS-SETSTREAMID opcodes to occur in that sequence without gaps (except for SETPAYLOAD), so let's do it properly in one go now. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:53 +02:00
Mikko Perttunen	9abdd497cd	gpu: host1x: Tegra234 device data and headers Add device data and chip headers for Tegra234. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:52 +02:00
Mikko Perttunen	7afd1194a3	gpu: host1x: Program interrupt destinations on Tegra234 On Tegra234, each Host1x VM has 8 interrupt lines. Each syncpoint can be configured with which interrupt line should be used for threshold interrupt, allowing for load balancing. For now, to keep backwards compatibility, just set all syncpoints to the first interrupt. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:52 +02:00
Mikko Perttunen	ee8f894f3f	gpu: host1x: Allow reset to be missing Host1x on Tegra234 does not have a software-controllable reset line. As such, don't bail out if we don't find one in the device tree. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:52 +02:00
Mikko Perttunen	939179fab8	gpu: host1x: Program virtualization tables Program virtualization tables specifying which VMs have access to which Host1x hardware resources. Programming these has become mandatory in Tegra234. For now, since the driver does not operate as a Host1x hypervisor, we basically allow access to everything to everyone. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:52 +02:00
Mikko Perttunen	97dea367d8	gpu: host1x: Simplify register mapping and add common aperture Refactor 'regs' property loading using devm_platform_ioremap_* and add loading of the 'common' region found on Tegra234. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:52 +02:00
Mikko Perttunen	3000c4ac02	gpu: host1x: Deduplicate hardware headers Host1x class information and opcodes are unchanged or backwards compatible across SoCs so let's not duplicate them for each one but have them in a shared header file. At the same time, add opcode functions for acquire/release_mlock. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:52 +02:00
Mikko Perttunen	2486254781	gpu: host1x: Program context stream ID on submission Add code to do stream ID switching at the beginning of a job. The stream ID is switched to the stream ID specified by the context passed in the job structure. Before switching the stream ID, an OP_DONE wait is done on the channel's engine to ensure that there is no residual ongoing work that might do DMA using the new stream ID. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:52 +02:00
Mikko Perttunen	8aa5bcb616	gpu: host1x: Add context device management code Add code to register context devices from device tree, allocate them out and manage their refcounts. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-07-08 16:27:52 +02:00
Mikko Perttunen	597b89d30b	gpu: host1x: Add context bus The context bus is a "dummy" bus that contains struct devices that correspond to IOMMU contexts assigned through Host1x to processes. Even when host1x itself is built as a module, the bus is registered in built-in code so that the built-in ARM SMMU driver is able to reference it. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-06-01 11:50:42 +02:00
Jon Hunter	74bb98dd91	gpu: host1x: Show all allocated syncpts via debugfs When the host1x syncpts status is dumped via the debugfs, syncpts that have been allocated but not yet used are not shown and so currently it is not possible to see all the allocated syncpts. Update the path for dumping the syncpt status via the debugfs to show all allocated syncpts even if they have not been used yet. Note that when the syncpt status is dumped by the kernel itself for debugging only the active syncpt are shown. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-04-06 15:33:57 +02:00
Thierry Reding	3e9c458433	gpu: host1x: Do not use mapping cache for job submissions Buffer mappings used in job submissions are usually small and not rapidly reused as opposed to framebuffers (which are usually large and rapidly reused, for example when page-flipping between double-buffered framebuffers). Avoid going through the mapping cache for these buffers since the cache would also lead to leaks if nobody is ever releasing the cache's last reference. For DRM/KMS these last references are dropped when the framebuffers are removed and therefore no longer needed. While at it, also add a note about the need to explicitly remove the final reference to the mapping in the cache. Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-04-06 15:12:36 +02:00
Christophe JAILLET	025c6643a8	gpu: host1x: Fix a memory leak in 'host1x_remove()' Add a missing 'host1x_channel_list_free()' call in the remove function, as already done in the error handling path of the probe function. Fixes: `8474b02531` ("gpu: host1x: Refactor channel allocation code") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-03-01 11:13:09 +01:00
Christophe JAILLET	e5d5db1a79	gpu: host1x: Fix an error handling path in 'host1x_probe()' Add the missing 'host1x_bo_cache_destroy()' call in the error handling path of the probe, as already done in the remove function. In order to simplify the error handling, move the 'host1x_bo_cache_init()' call after all the devm_ function. Fixes: `1f39b1dfa5` ("drm/tegra: Implement buffer object cache") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-03-01 11:13:03 +01:00
Mikko Perttunen	184b58fa81	gpu: host1x: Always return syncpoint value when waiting The new TegraDRM UAPI uses syncpoint waiting with timeout set to zero to indicate reading the syncpoint value. To support that we need to return the syncpoint value always when waiting. Fixes: `44e9613813` ("drm/tegra: Implement syncpoint wait UAPI") Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-02-16 17:20:53 +01:00
Dmitry Osipenko	22d7ee32f1	gpu: host1x: Fix hang on Tegra186+ Tegra186+ hangs if host1x hardware is disabled at a kernel boot time because we touch hardware before runtime PM is resumed. Move sync point assignment initialization to the RPM-resume callback. Older SoCs were unaffected because they skip that sync point initialization. Tested-by: Jon Hunter <jonathanh@nvidia.com> # T186 Reported-by: Jon Hunter <jonathanh@nvidia.com> # T186 Fixes: `6b6776e2ab` ("gpu: host1x: Add initial runtime PM and OPP support") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2022-01-27 18:27:02 +01:00
Dmitry Osipenko	d5185965c3	gpu: host1x: Add back arm_iommu_detach_device() Host1x DMA buffer isn't mapped properly when CONFIG_ARM_DMA_USE_IOMMU=y. The memory management code of Host1x driver has a longstanding overhaul overdue and it's not obvious where the problem is in this case. Hence let's add back the old workaround which we already had sometime before. It explicitly detaches Host1x device from the offending implicit IOMMU domain. This fixes a completely broken Host1x DMA in case of ARM32 multiplatform kernel config. Cc: stable@vger.kernel.org Fixes: `af1cbfb9bf` ("gpu: host1x: Support DMA mapping of buffers") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:28:51 +01:00
Dmitry Osipenko	9ca790f446	gpu: host1x: Add host1x_channel_stop() Add host1x_channel_stop() which waits till channel becomes idle and then stops the channel hardware. This is needed for supporting suspend/resume by host1x drivers since the hardware state is lost after power-gating, thus the channel needs to be stopped before client enters into suspend. Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30 Tested-by: Paul Fertser <fercerpav@gmail.com> # PAZ00 T20 Tested-by: Nicolas Chauvet <kwizart@gmail.com> # PAZ00 T20 and TK1 T124 Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30 Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:07:07 +01:00
Dmitry Osipenko	6b6776e2ab	gpu: host1x: Add initial runtime PM and OPP support Add runtime PM and OPP support to the Host1x driver. For the starter we will keep host1x always-on because dynamic power management require a major refactoring of the driver code since lot's of code paths are missing the RPM handling and we're going to remove some of these paths in the future. Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30 Tested-by: Paul Fertser <fercerpav@gmail.com> # PAZ00 T20 Tested-by: Nicolas Chauvet <kwizart@gmail.com> # PAZ00 T20 and TK1 T124 Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30 Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:07:07 +01:00
Robin Murphy	4abfc0e3a5	gpu: host1x: Add missing DMA API include Host1x seems to be relying on picking up dma-mapping.h transitively from iova.h, which has no reason to include it in the first place. Fix the former issue before we totally break things by fixing the latter one. CC: Thierry Reding <thierry.reding@gmail.com> CC: Mikko Perttunen <mperttunen@nvidia.com> CC: dri-devel@lists.freedesktop.org Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:07:07 +01:00
Arnd Bergmann	6c7a388b62	gpu: host1x: select CONFIG_DMA_SHARED_BUFFER Linking fails when dma-buf is disabled: ld.lld: error: undefined symbol: dma_fence_release >>> referenced by fence.c >>> gpu/host1x/fence.o:(host1x_syncpt_fence_enable_signaling) in archive drivers/built-in.a >>> referenced by fence.c >>> gpu/host1x/fence.o:(host1x_fence_signal) in archive drivers/built-in.a >>> referenced by fence.c >>> gpu/host1x/fence.o:(do_fence_timeout) in archive drivers/built-in.a Fixes: `687db2207b` ("gpu: host1x: Add DMA fence implementation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:07:06 +01:00
Randy Dunlap	ab3c971d2f	gpu: host1x: Drop excess kernel-doc entry @key Fix kernel-doc warning in host1x: ../drivers/gpu/host1x/bus.c:774: warning: Excess function parameter 'key' description in '__host1x_client_register' Fixes: `0cfe5a6e75` ("gpu: host1x: Split up client initalization and registration") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-tegra@vger.kernel.org Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:07:06 +01:00
Mikko Perttunen	46f226c93d	drm/tegra: Add NVDEC driver Add support for booting and using NVDEC on Tegra210, Tegra186 and Tegra194 to the Host1x and TegraDRM drivers. Booting in secure mode is not currently supported. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:07:06 +01:00
Thierry Reding	1f39b1dfa5	drm/tegra: Implement buffer object cache This cache is used to avoid mapping and unmapping buffer objects unnecessarily. Mappings are cached per client and stay hot until the buffer object is destroyed. Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:07:06 +01:00
Thierry Reding	c6aeaf56f4	drm/tegra: Implement correct DMA-BUF semantics DMA-BUF requires that each device that accesses a DMA-BUF attaches to it separately. To do so the host1x_bo_pin() and host1x_bo_unpin() functions need to be reimplemented so that they can return a mapping, which either represents an attachment or a map of the driver's own GEM object. Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-12-16 14:07:06 +01:00
Thierry Reding	c3dbfb9c49	gpu: host1x: Plug potential memory leak The memory allocated for a DMA fence could be leaked if the code failed to allocate the waiter object. Make sure to release the fence allocation on failure. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-09-16 18:06:52 +02:00
Dmitry Osipenko	a81cf839a0	gpu/host1x: fence: Make spinlock static The DEFINE_SPINLOCK macro creates a global spinlock symbol that is visible to the whole kernel. This is unintended in the code, fix it. Fixes: `687db2207b` ("gpu: host1x: Add DMA fence implementation") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2021-09-16 18:06:51 +02:00

1 2 3 4 5 ...

331 Commits