Currently only part of the MMU SPI/SEI interrupts are enabled, although
there is no real reason to not enable all.
The only exception is "burst_fifo_full" which is expected for PMMU
because it has a 2 entries FIFO, and thus is it not enabled for it.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to be more explicit we should use the term compute_reset
for describing the reset in which only the compute engines gets
reset.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Fix the following coccicheck warning:
./drivers/misc/habanalabs/gaudi2/gaudi2.c:9727:48-53: WARNING:
conversion to bool not needed here
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Change is_idle functions so it would be more usable outside debugfs.
Do this by replacing seq_file parameter with regular string.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
H/W being dirty during initialization is completely expected in case
f/w tools are used before loading the driver. As it is not an error,
and as it doesn't give any meaningful information to the user,
no point of printing it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Doing compute reset can be the traditional inference soft reset
that is supported only in Goya.
Or it can be the new reset upon device release, which is supported
in Gaudi2 and above.
Therefore, wherever suitable, use the terminology of compute reset
instead of soft reset.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Upon the initialization of a user context, map the host memory page of
the virtual MSI-X doorbell in the device MMU.
A reserved VA is used for this purpose, so user can use it directly
without any allocation/map operation.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Modify the decoder wrapper blocks to generate interrupts using the
virtual MSI-X doorbell.
As a decoder wrapper block cannot write directly to HBW upon completion,
it writes instead to SOB which is monitored by a master monitor.
When resolved, this monitor will be the one to actually write to the
virtual MSI-X doorbell.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Modify the CQ which is used for CS completion, to use the virtual MSI-X
doorbell.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Following patches are going to add more reserved sync objects and
monitors.
To make the counting of these reserved resources simpler, replace the
existing RESERVED_* defines with enumerations.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Due to a watchdog timer in the LBW path, writes to the MSI-X doorbell
can return sporadic error responses.
To work-around this issue, a virtual MSI-X doorbell on the HBW path is
configured, using the MSI-X AXI slave interface in the PCIe controller.
Upon an access to a configured HBW host address, the controller will
generate MSI-X interrupt instead of treating the access as regular host
memory access.
This patch allocates the dedicate host memory page, and communicate the
address to F/W, so it will configure the relevant address match
registers in the controller, and will use this address to generate MSI-X
interrupts for F/W events.
Following patches will handle other initiators in the device, to move
them to use the virtual MSI-X doorbell.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
For gaudi2 we need to send a value to F/W as part of the
PCI_ACCESS packet.
As a preparation, modify hl_fw_send_pci_access_msg() to have a 'value'
field.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
roundup will create an error in 32-bit architectures as we use
64-bit variables.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Otherwise, due to how we calculate it, we might fail in FIELD_PREP
checks.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
User application should be able to get notification for any decoder
completion. Hence, we introduce a new interface in which a user
can wait for all current decoder pending interrupts.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Current naming convention can be misleading. Hence renaming some
variables and defines in order to be more explicit.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Currently we are not waiting for preboot ready after hard reset.
This leads to a race in which COMMs protocol begins but will get no
response from the f/w.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Correctable ECC events are not fatal, but as they accumulate, the f/w
can decide that a hard-rest is required. This indication is
propagated to the host using the existing ECC event interface.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add the Gaudi2 code to initialize the ASIC's profiler. The profile
receives its initialization values from the user, same as in Gaudi2,
but the code to initialize is in the driver because the configuration
space of the device is not directly exposed to the user.
Signed-off-by: Benjamin Dotan <bdotan@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Use the generic security module to block all registers in the ASIC and
then open only those that are needed to be accessed by the user.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add the ASIC-specific code for Gaudi2. Supply (almost) all of the
function callbacks that the driver's common code need to initialize,
finalize and submit workloads to the Gaudi2 ASIC.
It also contains the code to initialize the F/W of the Gaudi2 ASIC
and to receive events from the F/W.
It contains new debugfs entry to dump razwi events. razwi is a case
where the device's engines create a transaction that reaches an
invalid destination.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>