* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
Remove DEBUG_SEMAPHORE from Kconfig
Improve semaphore documentation
Simplify semaphore implementation
Add down_timeout and change ACPI to use it
Introduce down_killable()
Generic semaphore implementation
Add semaphore.h to kernel_lock.c
Fix quota.h includes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (104 commits)
IB/iser: Don't change itt endianness
IB/mlx4: Update module version and release date
IPoIB: Handle case when P_Key is deleted and re-added at same index
IB/iser: Release connection resources on RDMA_CM_EVENT_DEVICE_REMOVAL event
IB/mlx4: Fix incorrect comment
IB/mlx4: Fix race when detaching a QP from a multicast group
IB/ehca: Support all ibv_devinfo values in query_device() and query_port()
RDMA/nes: Free IRQ before killing tasklet
IB/mthca: Update module version and release date
IB/mlx4: Update QP state if query QP succeeds
IB/mthca: Update QP state if query QP succeeds
RDMA/amso1100: Add check for NULL reply_msg in c2_intr()
IB/mlx4: Add support for resizing CQs
IB/mlx4: Add support for modifying CQ moderation parameters
IPoIB: Support modifying IPoIB CQ event moderation
IB/core: Add support for modify CQ
IPoIB: Add basic ethtool support
mlx4_core: Increase max number of QPs to 128K
RDMA/amso1100: Add support for "send with invalidate" work requests
IB/core: Add support for "send with invalidate" work requests
...
ACPI currently emulates a timeout for semaphores with calls to
down_trylock and sleep. This produces horrible behaviour in terms of
fairness and excessive wakeups. Now that we have a unified semaphore
implementation, adding a real down_timeout is almost trivial.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
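The difference between the emulated timeout and a real one can be sketched in user space. This is only an analogy under stated assumptions: it uses POSIX semaphores rather than the kernel's struct semaphore, sem_timedwait() stands in for down_timeout(), and emulated_down_timeout() and real_down_timeout() are hypothetical names, not the ACPI code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/*
 * Emulated timeout (the pattern being replaced): poll with a trylock
 * and sleep between attempts.  Every nap ends in a wakeup, and a waiter
 * that happens to be asleep can be overtaken by a later caller.
 */
static int emulated_down_timeout(sem_t *sem, int timeout_ms)
{
	const struct timespec nap = { 0, 1000000L };	/* 1 ms */
	int waited_ms = 0;

	assert(sem != NULL);
	for (;;) {
		if (sem_trywait(sem) == 0)
			return 0;
		if (waited_ms >= timeout_ms)
			return -ETIME;
		nanosleep(&nap, NULL);
		waited_ms++;
	}
}

/*
 * Real timeout: a single blocking call that sleeps until the semaphore
 * is released or the deadline passes.
 */
static int real_down_timeout(sem_t *sem, int timeout_ms)
{
	struct timespec deadline;

	assert(sem != NULL);
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}
	while (sem_timedwait(sem, &deadline) != 0) {
		if (errno == EINTR)
			continue;	/* interrupted: retry with the same deadline */
		return -ETIME;		/* deadline passed */
	}
	return 0;
}
```

Both helpers return 0 on success and -ETIME on timeout, so a caller could switch from one to the other without changing its error handling.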
Move the function that prints the segment warning messages found in the
monreader driver and the dcssblk driver to the extmem base code.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Comments that looked like kernel-doc but were not formatted correctly
have been fixed. Some minor cleanup of the comments has been done as
well.
Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The most notable part of this commit is the new local header file entry.h,
which contains the declarations of all functions that are only called
from asm code or are arch internal. That way we can avoid extern
declarations in C files.
This is more or less the same as what was done for sparc64.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
This way we get rid of s390's NO_IDLE_HZ and use the generic dynticks
variant instead. In addition we get high resolution timers for free.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Remove the program check generating monitor calls and use function
calls instead. There is no real advantage in using monitor calls,
but they do make debugging harder because of all the program checks
they generate.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The next generation of OSA adapters allows retrieval of further
self-describing information. This is the preparatory infrastructure patch
for further exploitation in the qeth driver.
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
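As a small illustration of the replacement described above: __func__ is the predefined identifier standardized in C99, while __FUNCTION__ is the older gcc spelling. The helper name format_trace() below is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Prefix a message with the name of the enclosing function.
 * __func__ expands to "format_trace" here; with gcc, __FUNCTION__
 * would yield the same string but is not portable.
 */
static int format_trace(char *buf, size_t len, const char *msg)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s: %s", __func__, msg);
}
```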
This patch allows user space applications to access large amounts of
truly random data. The random data source is the built-in hardware
random number generator on the CEX2C cards.
Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The API for hardware random number generators is currently limited to
devices that never fail. If the hardware is registered as a source for
random numbers it has to work. This prevents the use of I/O based
random number devices where the I/O might fail.
Add a check for errors after the read from a hardware random number device.
This patch is required to support large random numbers retrieved
from the CEX2C cards on System z.
Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
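The error check described above can be modelled in user space roughly as follows. This is a sketch under assumptions, not the hw_random core: struct rng_source, rng_read() and the flaky demo back end are hypothetical, and the choice to return the bytes gathered so far before reporting an error is an assumption of this model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware RNG back end. */
struct rng_source {
	/* Returns the number of bytes stored in *data, or a negative errno. */
	int (*data_read)(struct rng_source *rng, uint32_t *data);
};

/* Fill buf from the source; stop at the first error instead of assuming success. */
static long rng_read(struct rng_source *rng, unsigned char *buf, size_t size)
{
	size_t done = 0;

	assert(rng != NULL && rng->data_read != NULL);
	while (done < size) {
		uint32_t data = 0;
		int n = rng->data_read(rng, &data);
		size_t chunk;

		if (n < 0)	/* the new check: the device may fail */
			return done ? (long)done : (long)n;
		if (n == 0)
			break;	/* nothing available right now */
		chunk = (size_t)n < sizeof(data) ? (size_t)n : sizeof(data);
		if (chunk > size - done)
			chunk = size - done;
		memcpy(buf + done, &data, chunk);
		done += chunk;
	}
	return (long)done;
}

/* Demo back end: delivers two words, then reports an I/O error. */
struct flaky_source {
	struct rng_source rng;	/* must stay the first member */
	int calls;
};

static int flaky_read(struct rng_source *rng, uint32_t *data)
{
	struct flaky_source *f = (struct flaky_source *)rng;

	if (f->calls++ >= 2)
		return -EIO;
	*data = 0x12345678u;
	return (int)sizeof(*data);
}
```

With the flaky back end, a 16-byte request yields a short read of 8 bytes, and the next request reports -EIO to the caller instead of returning garbage.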
Add permanent and temporary model capacity and the corresponding
capacity value fields for the three capacity identifiers to the
output of /proc/sysinfo.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
drivers/s390/sysinfo.c uses the store system information instruction to query
the system for information about the machine, the LPAR and additional
hypervisors. KVM has to implement the host part for this instruction.
To avoid code duplication, this patch splits the common definitions from
sysinfo.c into a separate header file include/asm-s390/sysinfo.h for KVM use.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The system now reports system information messages (SIM) to the user.
The System Reference Code (SRC) which is reported to the user gives
the ability to look up the reason for the SIM online in the
documentation of the storage server.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
If user space opens a unit record device node, vmur leaves the kernel
with the open_mutex lock still held to prevent other processes from opening
the device simultaneously. This causes lockdep to complain about a lock held
when returning to user space.
The mutex is now replaced by a wait queue to serialize device open.
Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
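The idea of serializing open without holding a lock across the return to user space can be sketched with pthreads. This is a user-space analogy, not the vmur code: struct ur_device, ur_open(), ur_release() and the open_flag field are hypothetical names, and a condition variable stands in for the kernel wait queue.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct ur_device {
	pthread_mutex_t lock;	/* held only briefly, never across return */
	pthread_cond_t wait;	/* plays the role of the wait queue */
	int open_flag;		/* nonzero while the device is open */
};

/* Claim the device; wait for the current holder unless nonblock is set. */
static int ur_open(struct ur_device *dev, int nonblock)
{
	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	while (dev->open_flag) {
		if (nonblock) {
			pthread_mutex_unlock(&dev->lock);
			return -EBUSY;
		}
		pthread_cond_wait(&dev->wait, &dev->lock);
	}
	dev->open_flag = 1;
	pthread_mutex_unlock(&dev->lock);
	return 0;		/* no lock is held on return */
}

/* Give the device back and wake one waiter. */
static void ur_release(struct ur_device *dev)
{
	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	dev->open_flag = 0;
	pthread_mutex_unlock(&dev->lock);
	pthread_cond_signal(&dev->wait);
}
```

Exclusive access is now represented by a flag rather than by lock ownership, so nothing is left locked when the opener returns.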
When a tape device is set online, offline and online again, the following
error message is printed on the console: "sysfs: duplicate filename
'non-rewinding' can not be created". The reason is that when setting a
device online, the tape driver creates a sysfs symlink from the tape device
to the tape class device. Unfortunately the symlink is not removed
correctly when the device is set offline: instead of the tape device
object, the class device object is passed to sysfs_remove_link.
This patch fixes the problem by passing the correct tape device object.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Allocating dasd_fba_private without GFP_DMA results in an I/O error
while reading the device characteristics of an FBA disk.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Problem:
Usually every FCP device has its own indicator field the adapter
uses to signal outstanding work. Once a certain limit of devices
is reached, a common indicator field is used. In certain scenarios
qdio resets this common indicator field, but handles only part of
the FCP-devices sharing the common indicator field. Thus inbound
traffic on the non-processed shared FCP-devices is not recognized
immediately.
Solution:
Make sure the common indicator field is reset only if all FCP devices
sharing the indicator have been processed.
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reintroduce the in_interrupt() check in the sclp_tty code. Add a may_schedule
parameter to the vt220 write function, so we can let the write function
know whether it may schedule or not. Scheduling is thus disallowed for all
console calls and may be allowed for tty calls.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
It is now possible to trigger cm_enable processing several times in
parallel without causing a kernel panic.
Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Currently, we don't do much on no path or no device situations during
normal user I/O, since we rely on reports regarding those events by
the machine. If we trigger a path verification to bring our device
state up-to-date, we (a) may recover from path failures earlier and
(b) better handle situations where the hardware/hypervisor doesn't
give us enough notifications.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Make sure we wait for previous evaluations triggered by path state
changes to have settled before we manipulate path states again.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
I compiled the kernel without deadline, and the dasd code exits the old
scheduler (CFQ), fails to load the new one (deadline), and then things just
hang - with one of these (sorry about the weird chars - I copy & pasted it
from a 3270 console):
dasd(eckd): 0.0.0151: 3390/0A(CU:3990/01) Cyl:3338 Head:15 Sec:224
------------ cut here ------------
Badness at kernel/mutex.c:134
Modules linked in: dasd_eckd_mod dasd_mod
CPU: 0 Not tainted 2.6.25-rc3 #9
Process exe (pid: 538, task: 000000000d172000, ksp: 000000000d21ef88)
Krnl PSW : 0404000180000000 000000000022fb5c (mutex_lock_nested+0x2a4/0x2cc)
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000024218 000000000076fc78 0000000000000000 000000000000000f
000000000022f92e 0000000000449898 000000000f921c00 000003e000162590
00000000001539c4 000000000d172000 070000007fffffff 000000000d21f400
000000000f8f2560 00000000002413f8 000000000022fb44 000000000d21f400
Krnl Code: 000000000022fb50: bf2f1000 icm %r2,15,0(%r1)
000000000022fb54: a774fef6 brc 7,22f940
000000000022fb58: a7f40001 brc 15,22fb5a
>000000000022fb5c: a7f4fef2 brc 15,22f940
000000000022fb60: c0e5fffa112a brasl %r14,171db4
000000000022fb66: 1222 ltr %r2,%r2
000000000022fb68: a784fedb brc 8,22f91e
000000000022fb6c: c010002a0086 larl %r1,76fc78
Call Trace:
(<000000000022f92e> mutex_lock_nested+0x76/0x2cc)
<00000000001539c4> elevator_exit+0x38/0x80
<0000000000156ffe> blk_cleanup_queue+0x62/0x7c
<000003e0001d5414> dasd_change_state+0xe0/0x8ec
<000003e0001d5cae> dasd_set_target_state+0x8e/0x9c
<000003e0001d5f74> dasd_generic_set_online+0x160/0x284
<000003e00011e83a> dasd_eckd_set_online+0x2e/0x40
<0000000000199bf4> ccw_device_set_online+0x170/0x2c0
<0000000000199d9e> online_store_recog_and_online+0x5a/0x14c
<000000000019a08a> online_store+0xbe/0x2ec
<000000000018456c> dev_attr_store+0x38/0x58
<000000000010efbc> sysfs_write_file+0x130/0x190
<00000000000af582> vfs_write+0xb2/0x160
<00000000000afc7c> sys_write+0x54/0x9c
<0000000000025e16> sys32_write+0x2e/0x50
<0000000000024218> sysc_noemu+0x10/0x16
<0000000077e82bd2> 0x77e82bd2
Set elevator pointer to NULL in order to avoid double elevator_exit
calls when elevator_init call for deadline iosched fails.
Also make sure the dasd device driver depends on IOSCHED_DEADLINE so
the default IO scheduler of the dasd driver is present.
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
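The double-exit problem and its fix can be illustrated with a small user-space model. All names here (elevator_model, queue_model, switch_scheduler, cleanup_queue) are hypothetical stand-ins for the block layer objects in the trace above; the sketch only shows why clearing the pointer matters.

```c
#include <assert.h>
#include <stddef.h>

struct elevator_model {
	int exit_calls;		/* how often this scheduler was torn down */
};

struct queue_model {
	struct elevator_model *elevator;
};

static void elevator_exit_model(struct elevator_model *e)
{
	e->exit_calls++;
}

/*
 * Switch schedulers.  'replacement' is NULL when initialising the new
 * scheduler fails, for example because it is not built in.
 */
static int switch_scheduler(struct queue_model *q,
			    struct elevator_model *replacement)
{
	assert(q != NULL);
	if (q->elevator)
		elevator_exit_model(q->elevator);
	q->elevator = NULL;	/* the fix: forget the scheduler we just exited */
	if (!replacement)
		return -1;
	q->elevator = replacement;
	return 0;
}

/* Queue teardown exits whatever scheduler is still attached. */
static void cleanup_queue(struct queue_model *q)
{
	if (q->elevator) {
		elevator_exit_model(q->elevator);
		q->elevator = NULL;
	}
}
```

Without the assignment to NULL, a failed switch followed by cleanup_queue() would tear down the old scheduler a second time, which is the situation the trace above shows.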
The itt field in struct iscsi_data is not defined with any particular
endianness. open-iscsi should use it as-is without byte-swapping it.
This fixes sparse warnings coming from doing ntohl(hdr->itt).
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The mlx4_ib driver is stable enough for production use, so bump the
version number to 1.0 to indicate this.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a P_Key is deleted and then re-added at the same index, then IPoIB
gets confused because __ipoib_ib_dev_flush() only checks whether the
index is the same without checking whether the P_Key was present, so
the interface is stopped when the P_Key is deleted, but the event when
the P_Key is re-added gets ignored and the interface never gets
restarted.
Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey()
everywhere in IPoIB, since none of the places that look for P_Keys are
in a fast path or in non-sleeping context, and in general we want to
kill off the whole caching infrastructure eventually. This also fixes
consistency problems caused because some IPoIB queries were cached and
some were uncached during the window where the cache was not updated.
Thanks to Venkata Subramonyam <vsubramo@cisco.com> for debugging this
problem and testing this fix.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When a RDMA_CM_EVENT_DEVICE_REMOVAL event is raised, iSER should
release the connection resources.
This is necessary when the IB HCA module is unloaded while open-iscsi
is still running. Currently, iSER just BUG()s.
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
mlx4 hardware does not support external DDR memory. Moreover, UAR
area (BAR 2) can change depending on FW version.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When detaching the last QP from an MCG entry, we need to make
sure that at no time is an entry with zero QPs linked to the list of
MCGs for the corresponding hash index. So don't write back the MCG
entry if we are removing the last QP; just unlink the entry.
Also, remove an unnecessary MCG read when attaching a QP requires
allocation of a new entry in the AMGM.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Also, introduce a few inline helper functions to make the code more readable.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Move the free_irq() call in nes_remove() to before the tasklet_kill();
otherwise there is a window after tasklet_kill() where a new interrupt
can be handled and reschedule the tasklet, leading to a use-after-free
crash.
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The ib_mthca driver has been stable for a while, so bump the version
number to 1.0 to indicate this.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the QP was moved to another state (such as SQE) by the hardware,
then after this change the user no longer has to set the IBV_QP_CUR_STATE
mask when calling modify QP to recover from this state.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the QP was moved to another state (such as SQE) by the hardware,
then after this change the user no longer has to set the IBV_QP_CUR_STATE
mask when calling modify QP to recover from this state.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix a place where we might dereference a NULL pointer; this fixes
Coverity CID 1392. On inspection I also found a place where we could
attempt to kmem_cache_free() a NULL pointer, so fix this too.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This can be used to tune at run time the parameters controlling the
event (interrupt) generation rate and thus reduce the overhead
incurred by handling interrupts resulting in better throughput. Since
IPoIB uses a single CQ for both RX and TX, RX is chosen to dictate
configuration for both RX and TX.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add support for modifying CQ parameters for controlling event
generation moderation.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Just add the infrastructure so we can add functionality later.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
With the advent of large clusters that use multicore hosts, 64K QPs
are not enough. Increase the default maximum number of QPs to 128K.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Handle IB_WR_SEND_WITH_INV work requests.
This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a
"send with invalidate" work request as defined in the iWARP verbs and
the InfiniBand base memory management extensions. Also put "imm_data"
and a new "invalidate_rkey" member in a new "ex" union in struct
ib_send_wr. The invalidate_rkey member can be used to pass in an
R_Key/STag to be invalidated. Add this new union to struct
ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in
ib_uverbs_post_send().
Fix up low-level drivers to deal with the change to struct ib_send_wr,
and just remove the imm_data initialization from net/sunrpc/xprtrdma/,
since that code never does any send with immediate operations.
Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since
the iWARP drivers currently in the tree set the bit. The amso1100
driver at least will silently fail to honor the IB_SEND_INVALIDATE bit
if passed in as part of userspace send requests (since it does not
implement kernel bypass work request queueing). Remove the flag from
all existing drivers that set it until we know which ones are OK.
The value chosen for the new flag is not consecutive, to avoid clashing
with flags defined in the XRC patches, which are not merged yet but
which are already in use and are likely to be merged soon.
This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
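The layout change described above, with "imm_data" and "invalidate_rkey" sharing an "ex" union, can be sketched with a reduced model. Only the member names ex, imm_data and invalidate_rkey come from the commit text; the struct, enum and helper names are hypothetical, and the real struct ib_send_wr has many more fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum wr_opcode_model { WR_SEND, WR_SEND_WITH_IMM, WR_SEND_WITH_INV };

struct send_wr_model {
	enum wr_opcode_model opcode;
	union {
		uint32_t imm_data;		/* used by WR_SEND_WITH_IMM */
		uint32_t invalidate_rkey;	/* used by WR_SEND_WITH_INV */
	} ex;
};

/* Build a "send with invalidate" request for the given R_Key/STag. */
static struct send_wr_model make_send_with_inv(uint32_t rkey)
{
	struct send_wr_model wr;

	memset(&wr, 0, sizeof(wr));
	wr.opcode = WR_SEND_WITH_INV;
	wr.ex.invalidate_rkey = rkey;
	return wr;
}

/* Returns 1 and stores the key if this request invalidates an R_Key/STag. */
static int wr_invalidate_rkey(const struct send_wr_model *wr, uint32_t *rkey)
{
	assert(wr != NULL && rkey != NULL);
	if (wr->opcode != WR_SEND_WITH_INV)
		return 0;
	*rkey = wr->ex.invalidate_rkey;
	return 1;
}
```

Because a work request carries either immediate data or a key to invalidate, never both, the union adds the new member without growing the structure; the opcode tells a driver which member is valid.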
This patch adds the initialization calls into the new 7220 HCA files,
changes the Makefile to compile and link the new files, and code to
handle send DMA.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The patch adds a number of minor changes to support newer HCAs:
- New send buffer control bits
- New error condition bits
- Locking and initialization changes
- More send buffers
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a new file which allows the IBA7220 send DMA engine to be used from
userland. The routines here are not linked in yet; that will happen in
a follow-on patch.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a new header file which allows the IBA7220 send DMA engine to be used
from userland. The definitions here are not used yet; that will happen
in a follow-on patch.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The IBA7220 HCA has a new feature to DMA data to the on chip send
buffers instead of or in addition to the host CPU doing the data
transfer. This patch adds code to support the send DMA queue.
Signed-off-by: John Gregor <john.gregor@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>