Current IRQ statistics support does not show detail counts for I/O
interrupts which are processed internally only. The result is a
summation count which is way off such as this one:
CPU0 CPU1 CPU2
I/O: 1331 710 442
[...]
QAI: 15 16 16 [I/O] QDIO Adapter Interrupt
QDI: 1 0 0 [I/O] QDIO Interrupt
DAS: 706 645 381 [I/O] DASD
C15: 26 10 0 [I/O] 3215
C70: 0 0 0 [I/O] 3270
TAP: 0 0 0 [I/O] Tape
VMR: 0 0 0 [I/O] Unit Record Devices
LCS: 0 0 0 [I/O] LCS
CLW: 0 0 0 [I/O] CLAW
CTC: 0 0 0 [I/O] CTC
APB: 0 0 0 [I/O] AP Bus
Fix this by moving I/O interrupt accounting into the common I/O layer.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The cio_ignore purge function is intended to only remove CCW devices
which are in the offline state. There is a time frame after the purge
function finished where a CCW device is scheduled for removal but
still accessible. When the device is set online during this time
frame, it may first appear online before it is then removed.
Fix this by preventing that CCW devices can be set online while there
is work (such as removal triggered by the purge function) for it
pending. Also ensure that the purge function does not schedule devices
for removal which are in the process of being set online.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make ccw_bus_type static. ccw_device drivers have to
use ccw_driver_register.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the owner and name members of struct
ccw_driver and convert all drivers to store
this data in the embedded struct device_driver.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the owner and name members of struct
css_driver and convert all drivers to store
this data in the embedded struct device_driver.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use the subchannels drv_data to access io_subchannel_private
for io subchannels.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
While resuming report any found paths as new to the
device drivers.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If device recognition is interrupted by a subchannel event
indicating that the device is gone, ccw_device_init_count
is not correctly decreased.
Fix this by reporting the corresponding event to the device
recognition callback via the state machine.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Function ccw_device_cancel_halt_clear may cause an unexpected kernel
panic if a clear function is currently active at the subchannel for
which it is called. In that case, the iretry counter used to determine
the number of retries is never initialized, leading to an immediate
failure of the function which results in a kernel panic.
Fix this by initializing the iretry counter when the function is
first called. Also replace the kernel panic with a return code: a
single malfunctioning I/O device should not automatically cause a
system-wide kernel panic.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds a notification mechanism to inform ccw drivers
about changes to channel paths, which occured while the device
is online.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If a ccwdevice is lost during hibernation and a different
ccwdevice is attached to the same subchannel, we will
deregister the old ccw device and register the new one.
Since deregistration is not allowed in this context, we
handle this action later. However, some parts of the
registration process for the new device were started anyway,
so that the old device structure is no longer accessible.
Fix this by deferring both actions to the afterwards
scheduled subchannel event.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If no driver is attached to a device or the driver provides no
set_online/set_offline function, setting this device online/offline
via its sysfs online attribute will silently fail but return success.
This patch changes the behavior to return -EINVAL in those cases.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
A Linux interface for the CHSC command
store-I/O-operation-status-and-initiate-logging (SIOSL).
Model-dependent logging within the channel subsystem can be invoked
via a helper function or a writable subchannel device attribute.
Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After we try to steal a lock on a ccw device in boxed state,
we have to restart device recognition and potentially reprobing.
In this case ccw_device_init_count was erroneously decreased
twice. This patch fixes the issue.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ccw_device_pm_restore: trigger subchannel event to better handle
changes to the subchannel device.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Callers of ccw_device_notify could not distinguish between a driver
who has no notifier registered and a driver who doesn't want to keep
a device after a certain event. Change this by adding proper return
codes.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make the potentially long blocking wait_event's used by the cio
settle mechanism interruptible.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We used to maintain 2 singlethreaded workqueues for synchronization
and to trigger work from interrupt context. Since our latest cio
changes we only use one of these workqueues. So get rid of the
unused workqueue, rename the remaining one to "cio_work_q" and move
its ownership to the channel subsystem driver.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Using dev_set_drvdata prior to device_register will force the driver core
to kmalloc its private data. Since we use this for the console subchannel
lets set the drvdata before taking the subchannels spinlock.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If we detect a busy subchannel after the driver's set_offline
callback returned in ccw_device_set_offline, the current behavior
is to unregister the device, which may lead to undesired
consequences. Change this to just quiesce the subchannel and go on
with the offline processing.
Note: This is no excuse for not fixing these drivers -
after the set_offline callback they should have no running IO!
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
IO subchannels are always unregistered in process context, so use
spin_lock_irq in the corresponding remove callback.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ensure that there will be no more interrupts for an
unregistered device by using the same quiesce and disable loop
as in io_subchannel_shutdown.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Try to disable the old subchannel before we ask the driver core
to move the attached device to a new parent. This way we can use
the QUIESCE state during shutdown which prevents a possible use
after free situation in some error cases.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
DEV_STATE_QUIESCE is used to stop all IO on a busy subchannel.
This patch fixes the following problems related to the QUIESCE
state:
* Fix a potential race condition which could occur when the
resulting state was DEV_STATE_OFFLINE.
* Add missing locking around cio_disable_subchannel,
ccw_device_cancel_halt_clear and the cdev's handler.
* Loop until we know for sure that the subchannel is disabled.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The function ccw_device_unregister has to ensure to remove
all references obtained by device_add and device_initialize.
Unfortunately it gets called for devices which are
1) uninitialized, 2) initialized but unregistered, and
3) registered devices. To distinguish 1) and 2) this patch
introduces a new flag "initialized", which is 1 as long as we
hold the initial device reference.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We used to maintain a "registered" flag in our ccw_device_private
structure. This patch removes the "registered" flag and converts
all users of it to device_is_registered which has the exact same
meaning.
Note: The usage the atomic operation test_and_clear_bit is replaced
by the non-atomic if (device_is_registered()) device_del(). This
will not do harm, since we serialize calls to ccw_device_unregister
with a single-threaded workqueue.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After changing all internal I/O functions to use the newly introduced
ccw request infrastructure, retries are handled automatically after a
clear operation. Therefore remove the internal retry flag and
associated code.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Accept a request for setting a not-operational device offline.
This way, users can remove devices from Linux which would otherwise
remain unusable until reboot.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use the newly introduced ccw request infrastructure to implement
pgid related operations: sense pgid, set pgid and disband pg.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Device recognition needs to be started with the ccw device lock
held to prevent race conditions between I/O starting and interrupt
reception.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the return code from ccw_device_recognition and handle
recognition errors through the existing callback
ccw_device_recog_done to reduce cleanup code duplication.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Print a warning message in case a ccw device enters boxed or
not operational state during online/offline processing.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce a central mechanism for performing delayed ccw device work
to ensure that different types of work do not overwrite each other.
Prioritization ensures that the most important work is always
performed while less important tasks are either obsoleted or repeated
later.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ensure that current and future users of sch->work do not overwrite
each other by introducing a single mechanism for delayed subchannel
work.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Change the initiative to update subchannel-ccw device associations
to the subchannel: when there is an indication that the internal
association no longer reflects the current hardware state, mark
each affected subchannel as requiring attention. Once processing
reaches a subchannel, determine the correct association for that
subchannel at that time and perform the necessary device_move
operations.
This change fixes problems with the previous approach which would
leave devices in an inconsistent state when a new hardware change
occurred while a device_move was already scheduled.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
sch_create_and_recog_new_device() associates a parent subchannel
with its ccw device child even though this is already done by
the subsequently called io_subchannel_recog(). Also make sure
io_subchannel_recog() sets the association under lock.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
io_subchannel_probe() frees memory for sch->private which is later
freed again when io_subchannel_remove() is called. Fix this problem
by removing the cleanup in io_subchannel_probe().
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use cio_is_console() in io_subchannel_probe to indicate that the
special handling is console specific. As long as there is no other
subchannel for which this might be true, it is misleading to speak
of "early devices". Should more of these devices be introduced,
a cleanup of all console special handling is in order anyway.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When a ccw device appears not operational, inform the associated
device driver and act according to the response: if the driver
wants to keep the device, put it into the disconnected state.
If not, or if there is no driver or if the device is not online,
unregister it. This approach is consistent with no-path event
handling.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce the css_driver callback settle which can be implemented
by a subchannel driver to wait for the subchannel type specific
asynchronous work to finish.
In channel_subsystem_init_sync we call that for each subchannel
driver.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Define initialization sequence of css and ccw bus init calls by merging
them into a single init call. Also introduce channel_subsystem_init_sync
to wait for the initialization of devices to finish.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Let attribute group vectors be declared "const". We'd
like to let most attribute metadata live in read-only
sections... this is a start.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We used the init_name to set the console ccw_device's name early
at the boot stage. This patch moves the name setting (for all ccw
devices) to the point where we actually register the device. At this
time we can do dynamic allocations and therefore use dev_set_name.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We use a test_and_clear_bit to prevent a device from being
unregistered twice. Unfortunately in this cases the "final"
put_device (from device_initialize) was issued more than once,
resulting in an use after free error. Fix this by moving this
put_device to ccw_device_unregister.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When using s390dbf with "%s" in sprintf format strings the string itself
is not copied to the dbf buffer.
Since in this case only pointers are stored in the s390dbf, we should
not use dev_name - which is bound to the lifetime of the device.
Reading this entry from s390dbf after the device was released will cause
an use after free error.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When unit checks trigger sensing the device state is set to W4SENSE
until sense completion; then the device state is set back to
ONLINE. If a unit check occurs while set online or set offline
requests are processed then it might happen that the device's
temporary W4SENSE state causes these functions to terminate,
leaving the device in an inconsistent state when the state is set
back to ONLINE later on so that the device cannot be set online or
offline any longer.
To solve this, set online/offline and related rollback or error
routines are processed only if the device is in a final or
DISCONNECTED state.
Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ensure to always hold an extra device reference for scheduling a
subchannel deregistration, by moving the get_device to
ccw_device_schedule_sch_unregister. This fixes an use after free
error in ccw_device_call_sch_unregister where put_device was called
on an already freed device structure.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move debug traces for start I/O and interrupt events to exclusive
trace levels. Also change tracing in hot-path from sprintf (costly)
to hex.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ensure that the hardware interruption parameter for a subchannel is
reset when the associated subchannel data structure is freed.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>