linux/drivers
Oded Gabbay f8c8c7d5f1 habanalabs: add device reset support
This patch adds support for doing various on-the-fly resets of Goya.

The driver supports two types of resets:
1. soft-reset
2. hard-reset

Soft-reset is done when the device detects a timeout of a command
submission that was given to the device. The soft-reset process only resets
the engines that are relevant for the submission of compute jobs, i.e. the
DMA channels, the TPCs and the MME. The purpose is to bring the device as
fast as possible to a working state.

Hard-reset is done in several cases:
1. After soft-reset is done but the device is not responding
2. When fatal errors occur inside the device, e.g. ECC error
3. When the driver is removed

Hard-reset performs a reset of the entire chip except for the PCI
controller and the PLLs. It is a much longer process than soft-reset, but it
helps to recover the device without the need to reboot the Host.
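
The reset policy described above can be sketched roughly as follows. This is only an illustration of the decision logic in the commit message, not the driver's code: the enum and function names (`reset_type`, `reset_event`, `pick_reset`, `escalate_reset`) are hypothetical and do not come from the patch.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the driver's identifiers. */
enum reset_type { RESET_NONE, RESET_SOFT, RESET_HARD };

enum reset_event {
    EVENT_CS_TIMEOUT,    /* a command submission timed out */
    EVENT_FATAL_ERROR,   /* fatal error inside the device, e.g. ECC */
    EVENT_DRIVER_REMOVE  /* the driver is being removed */
};

/* Pick the first reset to attempt for a given event. */
static enum reset_type pick_reset(enum reset_event ev)
{
    switch (ev) {
    case EVENT_CS_TIMEOUT:
        return RESET_SOFT;   /* fast path: compute engines only */
    case EVENT_FATAL_ERROR:
    case EVENT_DRIVER_REMOVE:
        return RESET_HARD;   /* whole chip except PCI controller and PLLs */
    }
    return RESET_NONE;
}

/*
 * Decide whether a follow-up reset is needed once a reset has completed.
 * Only a soft-reset that leaves the device unresponsive escalates.
 */
static enum reset_type escalate_reset(enum reset_type done,
                                      bool device_responding)
{
    if (done == RESET_SOFT && !device_responding)
        return RESET_HARD;
    return RESET_NONE;
}
```

In this sketch a caller would invoke `pick_reset()` when an event arrives, perform that reset, and then call `escalate_reset()` with the result of a liveness check to see whether a hard-reset must follow.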

After hard-reset, the driver will restore the max power attribute and in
case of manual power management, the frequencies that were set.

This patch also adds two sysfs entries, which allow the root user to
initiate a soft or hard reset.
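
As a rough userspace illustration, a reset could be requested by writing to one of these sysfs entries. The commit message does not name the attributes or say what value they expect, so the helper below (`trigger_reset`, a hypothetical name) takes the attribute path from the caller and assumes that writing "1" triggers the reset.

```c
#include <stdio.h>

/*
 * Write "1" to a sysfs attribute to request a reset.
 * attr_path is supplied by the caller; the real attribute names and
 * the expected value are assumptions, not taken from the patch.
 * Returns 0 on success, -1 on failure.
 */
static int trigger_reset(const char *attr_path)
{
    FILE *f = fopen(attr_path, "w");

    if (!f)
        return -1;              /* missing attribute or not root */

    if (fputs("1", f) == EOF) {
        fclose(f);
        return -1;
    }

    /* fclose() flushes the write; sysfs errors can surface here. */
    return fclose(f) == 0 ? 0 : -1;
}
```

Since the entries are meant for the root user, an unprivileged caller should expect the `fopen()` call to fail.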

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18 09:46:45 +01:00
accessibility
acpi ACPI: Set debug output flags independent of ACPICA 2019-02-07 12:24:28 +01:00
amba
android binder: fix handling of misaligned binder object 2019-02-15 08:49:20 +01:00
ata libata: Add NOLPM quirk for SAMSUNG MZ7TE512HMHP-000L1 SSD 2019-02-06 12:47:09 -07:00
atm atm: he: fix sign-extension overflow on large shift 2019-01-17 11:27:00 -08:00
auxdisplay auxdisplay: charlcd: fix x/y command parsing 2018-12-21 21:27:21 +01:00
base Driver core fixes for 5.0-rc6 2019-02-08 10:53:44 -08:00
bcma
block for-linus-20190118 2019-01-20 09:12:50 +12:00
bluetooth Bluetooth: hci_bcm: Handle specific unknown packets after firmware loading 2018-12-19 13:43:42 +01:00
bus ARM: SoC driver updates 2018-12-31 17:32:35 -08:00
cdrom gdrom: fix a memory leak bug 2018-12-29 08:20:44 -07:00
char char: lp: mark expected switch fall-through 2019-02-13 19:45:57 +01:00
clk clk: qcom: gcc: Use active only source for CPUSS clocks 2019-01-24 11:41:48 -08:00
clocksource arch/csky patches for 4.21-rc1 2019-01-05 09:50:07 -08:00
connector
cpufreq Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-sleep' 2019-01-11 10:09:51 +01:00
cpuidle cpuidle: poll_state: Fix default time limit 2019-01-30 22:57:42 +01:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-01-31 23:09:00 -08:00
dax mm, devm_memremap_pages: fix shutdown handling 2018-12-28 12:11:47 -08:00
dca
devfreq
dio
dma dmaengine-fix-5.0-rc6 2019-02-10 10:39:37 -08:00
dma-buf drivers/dma-buf/udmabuf.c: convert to use vm_fault_t 2019-01-04 13:13:46 -08:00
edac EDAC, altera: Fix S10 persistent register offset 2019-01-24 17:13:59 +01:00
eisa
extcon extcon: ptn5150: Fix return value check in ptn5150_i2c_probe() 2019-02-11 17:21:38 +09:00
firewire scsi: communicate max segment size to the DMA mapping code 2019-01-22 20:40:59 -05:00
firmware ARM: SoC fixes for linux-5.0 2019-02-08 16:23:41 -08:00
fmc
fpga Merge 5.0-rc6 into char-misc-next 2019-02-11 09:05:58 +01:00
fsi
gnss
gpio gpio: vf610: Mask all GPIO interrupts 2019-01-28 15:28:43 +01:00
gpu drm-misc-fixes for v5.0-rc6: 2019-02-08 10:32:49 +10:00
hid HID: debug: fix the ring buffer implementation 2019-01-29 12:09:11 +01:00
hsi
hv vmbus: fix subchannel removal 2019-01-09 19:20:31 -05:00
hwmon hwmon: (tmp421) Correct the misspelling of the tmp442 compatible attribute in OF device ID table 2019-01-17 12:54:52 -08:00
hwspinlock hwspinlock: fix return value check in stm32_hwspinlock_probe() 2019-01-03 11:42:10 -08:00
hwtracing coresight: Use event attributes for sink selection 2019-02-08 12:27:36 +01:00
i2c i2c: omap: Use noirq system sleep pm ops to idle device for suspend 2019-02-05 13:13:20 +01:00
i3c i3c: master: dw: fix deadlock 2019-01-26 11:14:25 +01:00
ide ide: ensure atapi sense request aren't preempted 2019-01-31 08:25:09 -07:00
idle
iio First set of IIO fixes for the 5.0 cycle. 2019-02-03 13:10:41 +01:00
infiniband IB/uverbs: Fix OOPs in uverbs_user_mmap_disassociate 2019-01-29 13:57:22 -07:00
input Mostly driver fixes, but there's a core framework fix in here too. 2019-01-31 23:22:57 -08:00
interconnect interconnect: Revert to previous config if any request fails 2019-01-22 13:37:25 +01:00
iommu IOMMU Fix for Linux v5.0-rc5: 2019-02-08 15:34:10 -08:00
ipack
irqchip Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-02-10 09:54:19 -08:00
isdn mISDN: fix a race in dev_expire_timer() 2019-02-05 16:39:29 -08:00
leds leds: lp5523: fix a missing check of return value of lp55xx_read 2019-01-17 22:27:39 +01:00
lightnvm lightnvm: pblk: fix use-after-free bug 2018-12-22 14:45:35 -07:00
macintosh macintosh/via-cuda: Don't rely on Cuda to end a transfer 2019-01-22 10:21:45 +01:00
mailbox mailbox: tegra-hsp: Use device-managed registration API 2018-12-21 22:31:26 -06:00
mcb
md dm: don't use bio_trim() afterall 2019-02-06 17:24:37 -05:00
media media: vim2m: only cancel work if it is for right context 2019-01-16 11:13:25 -05:00
memory ARM: SoC: late updates 2019-01-05 11:30:37 -08:00
memstick MMC core: 2018-12-28 16:52:18 -08:00
message scsi: flip the default on use_clustering 2018-12-18 23:13:12 -05:00
mfd mfd: Fix unmet dependency warning for MFD_TPS68470 2019-01-29 10:55:34 +01:00
misc habanalabs: add device reset support 2019-02-18 09:46:45 +01:00
mmc mmc: mediatek: fix incorrect register setting of hs400_cmd_int_delay 2019-01-28 12:49:28 +01:00
mtd mtd: rawnand: gpmi: fix MX28 bus master lockup problem 2019-02-06 09:39:22 +01:00
mux
net net: dsa: b53: Fix for failure when irq is not defined in dt 2019-02-07 18:18:37 -08:00
nfc
ntb Merge 5.0-rc4 into char-misc-next 2019-01-28 08:13:52 +01:00
nubus
nvdimm libnvdimm/security: Require nvdimm_security_setup_events() to succeed 2019-01-21 09:57:43 -08:00
nvme nvme-pci: fix rapid add remove sequence 2019-02-06 16:35:33 +01:00
nvmem nvmem: allow to select i.MX nvmem driver for i.MX 7D 2019-02-03 13:09:37 +01:00
of OF: properties: add missing of_node_put 2019-01-16 12:49:53 -06:00
opp cpufreq: scpi/scmi: Fix freeing of dynamic OPPs 2019-01-04 12:19:40 +01:00
oprofile
parisc Kconfig file consolidation for v4.21 2018-12-29 13:40:29 -08:00
parport parport: daisy: use new parport device model 2019-02-13 19:45:56 +01:00
pci pci-v5.0-fixes-4 2019-02-08 15:32:10 -08:00
pcmcia Included in this update: 2019-01-05 11:23:17 -08:00
perf perf/aux: Make perf_event accessible to setup_aux() 2019-02-08 12:27:36 +01:00
phy USB/PHY fixes for 5.0-rc4 2019-01-25 12:57:09 -10:00
pinctrl pinctrl: sunxi: Correct number of IRQ banks on H6 main pin controller 2019-01-22 10:52:39 +01:00
platform Merge 5.0-rc6 into char-misc-next 2019-02-11 09:05:58 +01:00
pnp Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
power power supply and reset changes for the v4.21 series 2018-12-28 20:22:45 -08:00
powercap
pps Kconfig updates for v4.21 2018-12-29 13:03:29 -08:00
ps3
ptp ptp: check that rsv field is zero in struct ptp_sys_offset_extended 2019-01-08 16:22:56 -05:00
pwm pwm: imx: Add ipg clock operation 2018-12-24 12:06:56 +01:00
rapidio cross-tree: phase out dma_zalloc_coherent() 2019-01-08 07:58:37 -05:00
ras treewide: surround Kconfig file paths with double quotes 2018-12-22 00:25:54 +09:00
regulator Merge remote-tracking branch 'regulator/topic/coupled' into regulator-next 2018-12-21 13:43:35 +00:00
remoteproc virtio: don't allocate vqs when names[i] = NULL 2019-01-14 20:15:19 -05:00
reset reset: uniphier-glue: Add AHCI reset control support in glue layer 2019-01-07 16:38:51 +01:00
rpmsg
rtc RTC for 4.21 2019-01-01 13:24:31 -08:00
s390 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-02-08 11:21:54 -08:00
sbus Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next 2018-12-26 10:32:18 -08:00
scsi Merge 5.0-rc6 into char-misc-next 2019-02-11 09:05:58 +01:00
sfi
sh
siox
slimbus slimbus: core: add missing spin_lock_init on txn_lock 2019-01-22 13:34:35 +01:00
sn
soc soc/fsl fixes for v5.0 2019-01-30 11:14:04 +01:00
soundwire
spi cross-tree: phase out dma_zalloc_coherent() 2019-01-08 07:58:37 -05:00
spmi
ssb
staging Staging/IIO driver fixes for 5.0-rc6 2019-02-08 10:51:59 -08:00
target scsi: target: make the pi_prot_format ConfigFS path readable 2019-02-04 21:40:32 -05:00
tc
tee OP-TEE dynamic shm log message 2018-12-31 13:06:30 -08:00
thermal Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2019-01-23 16:23:41 +13:00
thunderbolt
tty TTY/Serial fixes for 5.0-rc6 2019-02-08 10:49:55 -08:00
uio driver: uio: fix possible use-after-free in __uio_register_device 2019-01-31 16:36:52 +01:00
usb usb: typec: tcpm: Correct the PPS out_volt calculation 2019-01-31 09:14:00 +01:00
uwb
vfio vfio-pci/nvlink2: Fix ancient gcc warnings 2019-01-23 08:20:43 -07:00
vhost vhost: fix OOB in get_rx_bufs() 2019-01-28 22:53:09 -08:00
video Merge 5.0-rc4 into char-misc-next 2019-01-28 08:13:52 +01:00
virt
virtio virtio: drop internal struct from UAPI 2019-02-05 15:29:48 -05:00
visorbus
vlynq
vme
w1 treewide: surround Kconfig file paths with double quotes 2018-12-22 00:25:54 +09:00
watchdog watchdog: tqmx86: Fix a couple IS_ERR() vs NULL bugs 2019-01-07 10:10:35 +01:00
xen arm64/xen: fix xen-swiotlb cache flushing 2019-01-23 22:14:56 +01:00
zorro
Kconfig interconnect: Add generic on-chip interconnect API 2019-01-22 13:37:25 +01:00
Makefile interconnect: Add generic on-chip interconnect API 2019-01-22 13:37:25 +01:00