In a setup with a high bandwidth and high latency network, eventually
involving deep queues in routers, it is beneficial to only fill those
queues up to an limited extend with resync data.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
To reasonably control resync speed over drbd-proxy connections,
drbd has to measure the current delay of packets transmitted over
the (possibly congested) data socket vs the meta-data socket.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Delay_probes are new packets in the DRBD protocol, which allow
DRBD to know the current delay packets have on the data socket.
(relative to the meta data socket)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The "surplus" bits of the old (smaller) bitmap must be clean
in case of online-grow without resync.
Note: Reverted 67ae8b80d4a116ab3b7094eb3723506b20c06dff as
well, since the lines added by this patch are redundant. The
bits get set by the bm_set_surplus(b) call before that.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Some wish to be notified of all instances of split brain, not just those that
go unresolved. The initial-split-brain handler is called to notify someone
upon detection of all split brain conditions even if auto-recovery policies
are configured.
Signed-off-by: Adam Gandelman <adam.gandelman@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The condition does not fit the commend (I may well be Primary,
even if I lost the disk earlier and now the connection).
And this is catched below anyways, where it also gets logged.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Even if it should never happen if the peer does behave, we need to
double check, and not even attempt access beyond end of device.
It usually would be caught by lower layers, resulting in "IO error",
but may also end up in the internal meta data area.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
In case both nodes are "inconsistent", invalidate would
have started a resync anyways, without a chance to ever
succeed, just filling the logs with warning messages.
Simply disallow that state change,
re-using the SS_NO_UP_TO_DATE_DISK return value.
This also changes the corresponding error string to
"Need access to UpToDate Data" -- I found the
"Refusing to be Primary without at least one UpToDate disk"
answer misleading in some situations anyways.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Don't forget to drain the digest in case we cannot satisfy a
checksum based resync or online-verify request.
It would additionally cause a protocoll error,
dropping the connection.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
block_id may be ID_SYNCER,
as well as checksum based resync request magic, or online verify magic.
Let's just drop that ASSERT.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
commit e4f925e12e
Author: Philipp Reisner <philipp.reisner@linbit.com>
Date: Wed Mar 17 14:18:41 2010 +0100
drbd: Do not upgrade state to Outdated if already Inconsistent
prevented the necessary state transition for attaching while connected
(Diskless -> Consistent respectively Outdated).
This is the fix for the fix.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
There was a race condition:
In a situation with a SyncSource+Primary and a SyncTarget+Secondary node,
and a resync dependency to some other device. After both nodes decided
to do the resync, the other device finishes its resync process.
At that time SyncSource already sent the P_SYNC_UUID packet, and
already updated its peer disk state to Inconsistent.
The SyncTarget node waits for the P_SYNC_UUID and sends a state packet
to report the resync dependency change. That packet still carries
a disk state of Outdated.
Impact:
If application writes come in, during that time on the Primary node,
those do not get replicated, and the out-of-sync counter gets increased.
=> The completion of resync is not detected on the primary node.
=> stalled.
Those blocks get resync'ed with the next resync, since the are get
marked as out-of-sync in the bitmap.
In order to fix this, we filter out that wrong state change in the
sanitize_state() function.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
To document that we know about deprecation of proc_create,
even though we are not affected, as we don't use the ->data member,
open code proc_create_data(..., NULL);
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Fix sparse warnings:
drivers/block/cciss.c:1591:37: warning: symbol 'i' shadows an earlier one
drivers/block/cciss.c:2437:21: warning: symbol 'i' shadows an earlier one
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Make the PARIDE menu be displayed correctly, with proper/expected
indentation, by moving the GDROM kconfig symbol, which was
splitting the PARIDE kconfig symbol from its dependent symbols.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
fix regression introduced in 8.3.3:
commit a9b17323f2875f5d9b132c2b476a750bf44b10c7
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Wed Aug 12 15:18:33 2009 +0200
out-of-spinlock completion of master bio
: (bio_rw(bio) == READA)
? read_completed_with_error
: read_ahead_completed_with_error;
is obviously not what was intended.
No one noticed because of
* page-cache at work,
* local RAIDs
Impact:
Failed local READs are not retried remotely,
but errored to upper layers, causing filesystems
to remount read-only, or worse.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The pktcdvd driver uses proper locking and does not need the BKL in the
ioctl and llseek functions of the character device, so kill both.
Moving the compat_ioctl handling from common code into the driver itself
fixes build problems when CONFIG_BLOCK is disabled.
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch just convert all blkdev_issue_xxx function to common
set of flags. Wait/allocation semantics preserved.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
We leak memory if "--dry-run" is not supported by the peer.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits)
cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
loop: Update mtime when writing using aops
block: expose the statistics in blkio.time and blkio.sectors for the root cgroup
backing-dev: Handle class_create() failure
Block: Fix block/elevator.c elevator_get() off-by-one error
drbd: lc_element_by_index() never returns NULL
cciss: unlock on error path
cfq-iosched: Do not merge queues of BE and IDLE classes
cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging
i2o: Remove the dangerous kobj_to_i2o_device macro
block: remove 16 bytes of padding from struct request on 64bits
cfq-iosched: fix a kbuild regression
block: make CONFIG_BLK_CGROUP visible
Remove GENHD_FL_DRIVERFS
block: Export max number of segments and max segment size in sysfs
block: Finalize conversion of block limits functions
block: Fix overrun in lcm() and move it to lib
vfs: improve writeback_inodes_wb()
paride: fix off-by-one test
drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
...
Update mtime when writing to backing filesystem using the address space
operations write_begin and write_end.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
We take the spin_lock again in fail_all_cmds() so we need to unlock here.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Steve Cameron <scameron@beardog.cce.hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We take the spin_lock again in fail_all_cmds() so we need to unlock
here.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (56 commits)
doc: fix typo in comment explaining rb_tree usage
Remove fs/ntfs/ChangeLog
doc: fix console doc typo
doc: cpuset: Update the cpuset flag file
Fix of spelling in arch/sparc/kernel/leon_kernel.c no longer needed
Remove drivers/parport/ChangeLog
Remove drivers/char/ChangeLog
doc: typo - Table 1-2 should refer to "status", not "statm"
tree-wide: fix typos "ass?o[sc]iac?te" -> "associate" in comments
No need to patch AMD-provided drivers/gpu/drm/radeon/atombios.h
devres/irq: Fix devm_irq_match comment
Remove reference to kthread_create_on_cpu
tree-wide: Assorted spelling fixes
tree-wide: fix 'lenght' typo in comments and code
drm/kms: fix spelling in error message
doc: capitalization and other minor fixes in pnp doc
devres: typo fix s/dev/devm/
Remove redundant trailing semicolons from macros
fix typo "definetly" -> "definitely" in comment
tree-wide: s/widht/width/g typo in comments
...
Fix trivial conflict in Documentation/laptops/00-INDEX
Just code the test directly
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make debugt messages a little neater.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reduces indent.
Makes a bit more readable and intelligible.
Return value now at bottom of function.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Print the function name not the pointer address where useful and possible
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add and use __func__ to is_alive.
Use __func__ in some DPRINTs.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move DPRINT macro definition above 1st use Consolidate a format string
(>80 columns) Add a newline to an unterminated message Comment neatened
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The code could not be compiled without the #define, so just remove it and
the #ifdef/#endif lines.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Prior to patch "drivers/block/floppy.c: Use pr_<level>" only
reschedule_timeout(,"request done"...) printed a numeric value after a
reschedule_timeout event message.
Restore that behavior and remove the now unnecessary argument.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change for(;;) with continue; to label: goto label
Reduces indentation and adds a bit of clarity.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Used a couple of times, might simplify the code a bit.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use it directly in the one place it's used.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Various functions use int where bool is appropriate
lock_fdc, wait_til_done, poll_drive, user_reset_fdc
Convert to bool.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove ugly IN/OUT macros, use direct case and code
Add missing semicolon after ECALL
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Spacing, column alignment and a for loop with
a naked semicolon converted to an assign and while
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Macros with hidden returns are not nice.
Convert the 2 uses to use direct code.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move assigns above if()s
Remove unnecessary parentheses from returns
Use a temporary for a duplicated test
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert bare printk to pr_info and pr_cont
Convert printk(KERN_ERR to pr_err
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With `while (j++ < PX_SPIN)' j reaches PX_SPIN + 1 after the loop. This
is probably unlikely to produce a problem.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Up to now, applying the in-core activity-log to the on-disk
bitmap did not care for logical_block_size.
On logical_block_size != 512 byte, this very likely results
in misalligned block access and spurious "io errors".
We now simply always submit aligned whole 4k blocks, fixing this
for logical block sizes of 512, 1024, 2048 and 4096.
For even larger logical block sizes, this won't work.
But I'm not aware of devices with such properties being available.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Up to now this only worked for Outdated and Inconsistent disks, that
it did not worked for Consistent disks was an inconsistent omission.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This is a race condition that existed for ages.
The previous commit reduces the window, this one closes it.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
cmdname() should map command number to its human readable
representation. The string table was incomplete, though.
Maybe rather do a switch() block, and let the compiler help us
to keep it complete?
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Fixes a race and potential kernel panic if e.g. the worker was just
about to send a few P_RS_IS_IN_SYNC via the meta socket for checksum
based resync, while the receiver destroys the sockets in
drbd_disconnect.
To make sure no-one is using the meta socket,
it is not enough to stop the asender...
Grab the meta socket mutex before destroying it.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Depending on resync request size,
we need to account for more than one bit.
Impact: cosmetic
If SyncTarget reported correctly 100% equal checksums,
the SyncSource usually reported 12% equal checksums instead,
because it only counted requests, we typically do 32k resync requests,
and the bitmap granularity is still 4k.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Situation:
we have diverging data sets, i.e. we had a split brain somewhen,
but currently are connected, one node diskless.
Then we try to attach that disk, figure it is consistent,
but has a diverging data set, we refuse to attach.
This led to strange state changes:
22:18:35 bb drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> Connected) pdsk( DUnknown -> UpToDate )
22:19:30 bb drbd1: disk( Diskless -> Attaching )
22:19:30 bb drbd1: disk( Attaching -> Negotiating )
22:19:30 bb drbd1: drbd_sync_handshake:
22:19:30 bb drbd1: self 97BF25798B9D5222:F33D1F62ADE698DD:4269796F9D027C83:AC45D8B5C3C1BF93 bits:19449 flags:0
22:19:30 bb drbd1: peer 280DFB6E125465D3:F33D1F62ADE698DC:4269796F9D027C82:AC45D8B5C3C1BF93 bits:2575806 flags:0
22:19:30 bb drbd1: uuid_compare()=100 by rule 90
22:19:30 bb drbd1: Split-Brain detected, dropping connection!
22:19:30 bb drbd1: disk( Negotiating -> Diskless )
while the other side says:
22:19:30 aa drbd1: Split-Brain detected, dropping connection!
22:19:30 aa drbd1: Disk attach process on the peer node was aborted.
22:19:30 aa drbd1: conn( Connected -> TOO_LARGE ) pdsk( Diskless -> Consistent )
This should be fixed now.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
we still don't support 4k 'physical' sectors 'natively',
but use a read-modify-write workaround.
And we even tried to use the extra page before we allocated it :(
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The bm_change semaphore is semantically a mutex. Convert it to a real
mutex.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Constify struct sysfs_ops.
This is part of the ops structure constification
effort started by Arjan van de Ven et al.
Benefits of this constification:
* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime
* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime
* potentially better optimized code as the compiler
can assume that the const data cannot be changed
* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing
Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.
Also drivers can extend the attributes with own data fields
and use that in the low level function.
This makes the class attributes the same as sysdev_class attributes
and plain attributes.
This will allow further cleanups in drivers.
Full tree sweep converting all users.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
i8253_lock needs to be a real spinlock in preempt-rt, i.e. it can
not be converted to a sleeping lock.
Convert it to raw_spinlock and fix up all users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
LKML-Reference: <20100217163751.030764372@linutronix.de>
* 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block: (38 commits)
block: don't access jiffies when initialising io_context
cfq: remove 8 bytes of padding from cfq_rb_root on 64 bit builds
block: fix for "Consolidate phys_segment and hw_segment limits"
cfq-iosched: quantum check tweak
blktrace: perform cleanup after setup error
blkdev: fix merge_bvec_fn return value checks
cfq-iosched: requests "in flight" vs "in driver" clarification
cciss: Fix problem with scatter gather elements in the scsi half of the driver
cciss: eliminate unnecessary pointer use in cciss scsi code
cciss: do not use void pointer for scsi hba data
cciss: factor out scatter gather chain block mapping code
cciss: fix scatter gather chain block dma direction kludge
cciss: simplify scatter gather code
cciss: factor out scatter gather chain block allocation and freeing
cciss: detect bad alignment of scsi commands at build time
cciss: clarify command list padding calculation
cfq-iosched: rethink seeky detection for SSDs
cfq-iosched: rework seeky detection
block: remove padding from io_context on 64bit builds
block: Consolidate phys_segment and hw_segment limits
...
cciss: Fix problem with scatter gather elements in the scsi half of the driver
When support for more than 31 scatter gather elements was added to the block
half of the driver, the SCSI half of the driver was not addressed, and the bump
from 31 to 32 scatter gather elements in the command block itself (not chained)
actually broke the SCSI half of the driver, so that any transfer requiring 32
scatter gather elements wouldn't work. This fix also increases the max transfer
size and size of the scatter gather table to the limit supported by the controller
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: eliminate unnecessary pointer use in cciss scsi code
An extra level of indirection was being used in some places
for no real reason.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: do not use void pointer for scsi hba data
and get rid of related unnecessary type casting
and delete some superfluous and misleading comments nearby.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: factor out scatter gather chain block mapping code
Rationale is I want to use this code from the scsi half of the
driver.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: fix scatter gather chain block dma direction kludge
The data direction for the chained block of scatter gather
elements should always be PCI_DMA_TODEVICE, but was mistakenly
set to the direction of the data transfer, then a kludge to
fix it was added, in which pci_dma_sync_single_for_device or
pci_dma_sync_single_for_cpu was called. If the correct direction
is used in the first place, the kludge isn't needed.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: simplify scatter gather code.
Instead of allocating an array of pointers to a structure
containing an SGDescriptor structure, and two other elements
that aren't really used, just allocate SGDescriptor structs.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: factor out scatter gather chain block allocation and freeing
Rationale is that I want to use this code from the scsi half of the
driver.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: detect bad alignment of scsi commands at build time
Incidentally fix some nearby c++ style comments.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: clarify command list padding calculation
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Adjust the platform device code to conform with the code style used in the
rest of this patch series. No need to name resources nor to register
devices which are not applicable.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Except for SCSI no device drivers distinguish between physical and
hardware segment limits. Consolidate the two into a single segment
limit.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The block layer calling convention is blk_queue_<limit name>.
blk_queue_max_sectors predates this practice, leading to some confusion.
Rename the function to appropriately reflect that its intended use is to
set max_hw_sectors.
Also introduce a temporary wrapper for backwards compability. This can
be removed after the merge window is closed.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Add a BLK_ prefix to block layer constants.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Now that the bio list management stuff is generic, convert pktcdvd to
use bio lists instead of its own private bio list implementation.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Peter Osterlund <petero2@telia.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Allow reading various alignment values from the config page. This
allows the guest to much better align I/O requests depending on the
storage topology.
Note that the formats for the config values appear a bit messed up,
but we follow the formats used by ATA and SCSI so they are expected in
the storage world.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are several duplicate definitions in cciss_cmd.h and cciss_ioctl.h.
Consolidate these into the new cciss_defs.h file. This patch doesn't change
the definitions exposed under include/linux, so userspace apps shouldn't
be affected.
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: dann frazier <dannf@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Some cleanup before the header file split-out so we don't propagate this style
into new files.
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: dann frazier <dannf@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
It is possible (and expected) for there to be holes in the h->drv[]
array, that is, some elements may be NULL pointers. cciss_seq_show
needs to be made aware of this possibility to avoid an Oops.
To reproduce the Oops which this fixes:
1) Create two "arrays" in the Array Configuratino Utility and
several logical drives on each array.
2) cat /proc/driver/cciss/cciss* in an infinite loop
3) delete some of the logical drives in the first "array."
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This is the counterpart to cba767175b
("pktcdvd: remove broken dev_t export of class devices"). Device is not
registered using dev_t, so it should not be destroyed using device_destroy
which looks up the device by dev_t. This will fail and adding the device
again will fail with the "duplicate name" error. This is fixed using
device_unregister instead of device_destroy.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
blk_queue_make_request() internally calls blk_set_default_limits(),
so calling blk_queue_max_segment_size() before is useless.
Ergo: move the call to blk_queue_max_segment_size() down a few lines.
Impact:
If, after a fresh modprobe, you first connect a Diskless drbd,
then attach, this could result in a DRBD Protocol Error at first.
The next connection attempt would then succeeded.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The id_table field of the struct virtio_driver is constant in <linux/virtio.h>
so it is worth to make id_table also constant.
The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
struct I1 {
...
const struct I2 *x;
...
};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
struct I1 y = {
.x = E,
};
@c@
identifier r.I2;
identifier s.E;
@@
const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+ const
struct I2 E[] = ...;
// </smpl>
Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The ids field of the struct xenbus_device_id is constant in <linux/xen/xenbus.h>
so it is worth to make blkfront_ids also constant.
The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
struct I1 {
...
const struct I2 *x;
...
};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
struct I1 y = {
.x = E,
};
@c@
identifier r.I2;
identifier s.E;
@@
const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+ const
struct I2 E[] = ...;
// </smpl>
Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The match_table field of the struct of_device_id is constant in <linux/of_platform.h>
so it is worth to make ace_of_match also constant.
The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
struct I1 {
...
const struct I2 *x;
...
};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
struct I1 y = {
.x = E,
};
@c@
identifier r.I2;
identifier s.E;
@@
const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+ const
struct I2 E[] = ...;
// </smpl>
Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The id_table field of the struct usb_device_id is constant in <linux/usb.h>
so it is worth to make ub_usb_ids also constant.
The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
struct I1 {
...
const struct I2 *x;
...
};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
struct I1 y = {
.x = E,
};
@c@
identifier r.I2;
identifier s.E;
@@
const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+ const
struct I2 E[] = ...;
// </smpl>
Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The id_table field of the struct pci_driver is constant in <linux/pci.h>
so it is worth to make the initialization data also constant.
The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
struct I1 {
...
const struct I2 *x;
...
};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
struct I1 y = {
.x = E,
};
@c@
identifier r.I2;
identifier s.E;
@@
const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+ const
struct I2 E[] = ...;
// </smpl>
Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
It is called LBDAF since 2.6.31.
impact:
without this change, on 32bit,
DRBD would wrongly claim to only support 2TiB devices.
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Test the just-allocated value for NULL rather than some other value.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x,y;
statement S;
@@
x = \(kmalloc\|kcalloc\|kzalloc\)(...);
(
if ((x) == NULL) S
|
if (
- y
+ x
== NULL)
S
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Immediately after changing the write ordering method, the epoch can already
be finished at this point.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
!CONFIG_OPT evalues to FALSE if CONFIG_OPT='m'. Do not display the
"DRBD disabled..." message if the dependencies are compiled as module.
Signed-off-by: Johannes Thoma <johannes.thoma@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
In D_DISKLESS we do not hand out any new references to ldev (local_cnt)
therefore waiting until all previously handed out refereces got returned
is sufficient before actually freeing mdev->ldev.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Use resource_size() for ioremap.
The ioremap appears to be passing the incorrect size for the platform
resource. Unfortunately, I can't locate a user in mainline to verify
this. Using resource_size should be the correct fix.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: unsik Kim <donari75@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
DAC960_LP_Controller and DAC960_V2_Controller have the same value, but
elsewhere it is DAC960_V1_Controller or DAC960_V2_Controller that is used
in the FirmwareType field.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
rsp->count is unsigned so the test does not work.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
rsp->count is unsigned so the test does not work.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block:
cfq: set workload as expired if it doesn't have any slice left
Fix a CFQ crash in "for-2.6.33" branch of block tree
cfq: Remove wait_request flag when idle time is being deleted
cfq-iosched: commenting non-obvious initialization
cfq-iosched: Take care of corner cases of group losing share due to deletion
cfq-iosched: Get rid of cfqq wait_busy_done flag
cfq: Optimization for close cooperating queue searching
block,xd: Delay allocation of DMA buffers until device is known
drbd: Following the hmac change to SHASH (see linux commit 8bd1209cff)
cfq-iosched: reduce write depth only if sync was delayed
gcc is not convinced that the floppy.c ioctl has sufficient bound checks:
In function `copy_from_user',
inlined from `fd_copyin' at drivers/block/floppy.c:3080,
inlined from `fd_ioctl' at drivers/block/floppy.c:3503:
arch/x86/include/asm/uaccess_32.h:211:
warning: call to `copy_from_user_overflow' declared with attribute
warning: copy_from_user buffer size is not provably correct
And frankly, as a human I have a hard time proving the same more or less
(the size comes from the ioctl argument. humpf. maybe. the code isn't
very nice)
This patch adds an explicit check to make 100% sure it's safe, better than
finding out later that there indeed was a gap.
[akpm@linux-foundation.org: add WARN_ON()]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (151 commits)
powerpc: Fix usage of 64-bit instruction in 32-bit altivec code
MAINTAINERS: Add PowerPC patterns
powerpc/pseries: Track previous CPPR values to correctly EOI interrupts
powerpc/pseries: Correct pseries/dlpar.c build break without CONFIG_SMP
powerpc: Make "intspec" pointers in irq_host->xlate() const
powerpc/8xx: DTLB Miss cleanup
powerpc/8xx: Remove DIRTY pte handling in DTLB Error.
powerpc/8xx: Start using dcbX instructions in various copy routines
powerpc/8xx: Restore _PAGE_WRITETHRU
powerpc/8xx: Add missing Guarded setting in DTLB Error.
powerpc/8xx: Fixup DAR from buggy dcbX instructions.
powerpc/8xx: Tag DAR with 0x00f0 to catch buggy instructions.
powerpc/8xx: Update TLB asm so it behaves as linux mm expects.
powerpc/8xx: Invalidate non present TLBs
powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate
pseries/pseries: Add code to online/offline CPUs of a DLPAR node
powerpc: stop_this_cpu: remove the cpu from the online map.
powerpc/pseries: Add kernel based CPU DLPAR handling
sysfs/cpu: Add probe/release files
powerpc/pseries: Kernel DLPAR Infrastructure
...
Loading the XD module triggers a warning like
WARNING: at mm/page_alloc.c:1805
__alloc_pages_nodemask+0x127/0x48f()
Hardware name: System Product Name
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.32-rc8-git5 #1
Call Trace:
[<c103d94b>] warn_slowpath_common+0x65/0x95
[<c103d98d>] warn_slowpath_null+0x12/0x15
[<c109550c>] __alloc_pages_nodemask+0x127/0x48f
[<c10be964>] ? get_slab+0x8/0x50
[<c10b8979>] alloc_page_interleave+0x2e/0x6e
[<c10b8a10>] alloc_pages_current+0x57/0x99
[<c2083a4a>] ? xd_init+0x0/0x482
[<c1094c38>] __get_free_pages+0xd/0x1e
[<c2083a94>] xd_init+0x4a/0x482
[<c2082df0>] ? loop_init+0x104/0x16a
[<c169162d>] ? loop_probe+0x0/0xaf
[<c2083a4a>] ? xd_init+0x0/0x482
[<c1001143>] do_one_initcall+0x51/0x13f
[<c204a307>] kernel_init+0x10b/0x15f
[<c204a1fc>] ? kernel_init+0x0/0x15f
[<c1004347>] kernel_thread_helper+0x7/0x10
---[ end trace 686db6333ade6e7a ]---
xd: Out of memory.
The warning is because the alloc_pages is called with an
order >= MAX_ORDER. The simplistic reason is that get_order(0) returns garbage
values when given 0 as a size. The more complex reason is that the XD driver
initialisation is broken.
It's not clear why this ever worked. XD allocates a buffer for DMA based
on the value of xd_maxsectors. This value is determined by the exact
type of controller in use but the value is determined *after* an attempt
has been made to allocate the buffer. i.e. the requested size of the DMA
buffer will always be 0.
This patch alters how XD is initialised slightly by allocating the
buffer when and if a device has actually been detected. The error paths
are updated to suit the new logic.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The hotplug mediabay has tendrils deep into drivers/ide code
which makes a libata port reather difficult. In addition it's
ugly and could be done better.
This reworks the interface between the mediabay and the rest
of the world so that:
- Any macio_driver can now have a mediabay_event callback
which will be called when that driver sits on a mediabay and
it's been either plugged or unplugged. The device type is
passed as an argument. We can now move all the IDE cruft
into the IDE driver itself
- A check_media_bay() function can be used to take a peek
at the type of device currently in the bay if any, a cleaner
variant of the previous function with the same name.
- A pair of lock/unlock functions are exposed to allow the
IDE driver to block the hotplug callbacks during the initial
setup and probing of the bay in order to avoid nasty race
conditions.
- The mediabay code no longer needs to spin on the status
register of the IDE interface when it detects an IDE device,
this is done just fine by the IDE code itself
Overall, less code, simpler, and allows for another driver
than our old drivers/ide based one.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
"Definition" is misspelled "defintion" in several comments; this
patch fixes them. No code changes.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Prevent the AoE block driver from creating cache aliases of page cache
pages on machines with virtually indexed caches.
Building kernels on an AT91SAM9G20 board without this patch fails with
segmentation faults after a couple of passes.
Signed-off-by: Peter Horton <zero@colonel-panic.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since 8.3.3 we fail to do the resync when a partial resynch is not
possible, but a full synch is necessary.
This regression was introduced with 7101539930c0a89146959e7a39c09ad9c3516434
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Otherwise the 'state fixup' in the receiver will change to Unconnected,
but the receiver will terminate itself, and any attempt at 'down'ing
that drbd later will block forever.
see also Bugz. #259
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
this is uncritical, as we still also serialize in userland,
but to correctly serialize on the CONFIG_PENDING bit,
it must be wait_event(state_wait, \!test_and_set_bit)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
A recent commit broke the ia64 build:
Author: Don Brace <brace@beardog.cce.hp.com>
Date: Thu Nov 12 12:50:01 2009 -0600
cciss: Add enhanced scatter-gather support.
because of this hunk:
--- a/drivers/block/cciss.h
+++ b/drivers/block/cciss.h
+struct Cmd_sg_list {
+ SGDescriptor_struct *sgchain;
+ dma64_addr_t sg_chain_dma;
+ int chain_block_size;
+};
The issue is that dma64_addr_t isn't #define'd on ia64.
The way that we're using Cmd_sg_list.sg_chain_dma is to hold an
address returned from pci_map_single().
+ temp64.val = pci_map_single(h->pdev,
+ h->cmd_sg_list[c->cmdindex]->sgchain,
+ len, dir);
+
+ h->cmd_sg_list[c->cmdindex]->sg_chain_dma = temp64.val;
pci_map_single() returns a dma_addr_t too.
This code will still work even on a 32-bit x86 build, where
dma_addr_t is defined to be a u32 because it will simply be
promoted to the __u64 that temp64.val is defined as.
Thus, declaring Cmd_sg_list.sg_chain_dma as dma_addr_t is safe.
Cc: Don Brace <brace@beardog.cce.hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
On driver unload, only free up the extra scatter gather data if they were
allocated in the first place (the controller supports it) and don't forget
to free up the sg_cmd_list array of pointers.
Signed-off-by: Don Brace <brace@beardog.cce.hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
No need to export those device attributes.
In fact, without this patch, we can trip over a build error if cciss
is a built-in and another driver also declares and exports attributes
with the same name.
You'll see errors like:
drivers/scsi/built-in.o: multiple definition of `dev_attr_lunid'
drivers/block/built-in.o: first defined here
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Cc: <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: Fix weird usage of ENXIO in cciss_scsi.c
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: Add enhanced scatter-gather support. For controllers which
supported, more than 512 scatter-gather elements per command may
be used, and the max transfer size can be increased to 8192 blocks.
Signed-off-by: Don Brace <brace@beardog.cce.hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: Do not automatically rescan on UNIT ATTENTION/LUN DATA CHANGED
There are problems with doing this. If, say, several logical drives
are deleted at once, several such UNIT ATTENTIONS will be encountered,
often during the rescan triggered by the first such UNIT ATTENTION.
The block layer may be in the midst of trying to add logical drives
which were just deleted (resulting in the subsequent UNIT ATTENTION(s).)
Making the rescan code robust enough to tolerate this kind of thing
is too complicated for the moment. So, for now, we just don't do it.
Note: This UNIT ATTENTION/LUN DATA CHANGED situation only occurs on
the MSA2012.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: Remove unnecessary check in scan_thread
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: fix typo that causes scsi status to be lost.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: remove sendcmd() as it is no longer used.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: clean up code in cciss_shutdown. Send the flush cache
command down with interrupts still enabled, and do not do DMA
from the stack.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: Remove the "withirq" parameter from various functions where possible
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: Retry driver initiated cmds with unit attention condition
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: Fix problem with remove_from_scan_list that on driver unload
it doesn't remove the controller from the scan list correctly if
the controller is currently being scanned for new devices.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: Make device attributes static
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
There is a nice gem in drivers/block/ataflop.c::do_fd_request()
void do_fd_request(struct request_queue * q)
{
unsigned long flags;
DPRINT(("do_fd_request for pid %d\n",current->pid));
while( fdc_busy ) sleep_on( &fdc_wait );
fdc_busy = 1;
stdma_lock(floppy_irq, NULL);
atari_disable_irq( IRQ_MFP_FDC );
local_save_flags(flags); /* The request function is called with ints
local_irq_disable(); * disabled... so must save the IPL for later */
redo_fd_request();
local_irq_restore(flags);
atari_enable_irq( IRQ_MFP_FDC );
}
If you look at the code long enough, you will notioce that the
local_irq_disable() call is actually commented out. This has been
introduced back in 2002 in [1], but as you can see, the same bug has been
there even before, with the sti() call being commented out in the very
same way :)
I am not familiar with the code myself at all, but I guess that the whole
stuff can just be removed. Why do we need save_flags/restore_flags at all,
without actually disabling the local IRQs afterwards? The
redo_fd_request() doesn't seem to do anything that would mess with flags
inconsistently.
[1] http://lkml.org/lkml/2002/12/27/58
Jens:
That does look odd. The comment is correct that the function is entered
with interrupts disabled (and the queue lock held). So I'd say your
patch looks fine, the whole save/restore business looks meaningless.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Michael Schmitz <schmitz@biophys.uni-duesseldorf.de>
Move xen_domain and related tests out of asm-x86 to xen/xen.h so they
can be included whenever they are necessary.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When there are many blocks on the fly (ua), and the AL gets into "starving"
mode (random IO, scattered all over the device), and the connections gets
interrupted, the receiver thread deadlocks in the drbd_disconnect() code path.
Affected are only nodes in Primary role.
The bug triggers most likely on system that mirror over "long distances"
Regression introduced shortly before 8.3.3
with git commit 31e0f1250f174ac1ee317f360943a0159e19edc8
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
When IO gets frozen due to a broken fence-peer script, the user
should be able to thaw IO by the resume-io command.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
To check wether we are truncating a very large device due to limited
meta data space, we need to check the ll_dev size.
Also improve the printk to suggest "flexible" or "internal".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The current PS3 VRAM driver uses msleep() to wait for completion of RSX
DMA transfers between system memory and VRAM. Depending on the system
timing, the processing delay and overhead of this msleep() call can
significantly impact VRAM driver IO.
To avoid the condition, add a short duration (200 usec max) udelay()
polling loop before entering the msleep() polling loop.
Signed-off-by: Hideyuki Sasaki <xhide@rd.scei.sony.co.jp>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Acked-by: Jim Paris <jim@jtan.com>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This reverts "Add serial number support for virtio_blk, V4a".
Turns out that virtio_pci, lguest and s/390 all have an 8 bit limit
on virtio config space, so noone could ever use this.
This is coming back later in a cleaner form.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: john cooper <john.cooper@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Rusty,
commit 3ca4f5ca73
virtio: add virtio IDs file
moved all device IDs into a single file. While the change itself is
a very good one, it can break userspace applications. For example
if a userspace tool wanted to get the ID of virtio_net it used to
include virtio_net.h. This does no longer work, since virtio_net.h
does not include virtio_ids.h.
This patch moves all "#include <linux/virtio_ids.h>" from the C
files into the header files, making the header files compatible with
the old ones.
In addition, this patch exports virtio_ids.h to userspace.
CC: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It seems like the addition of QUEUE_FLAG_VIRT caueses major performance
regressions for Fedora users:
https://bugzilla.redhat.com/show_bug.cgi?id=509383https://bugzilla.redhat.com/show_bug.cgi?id=505695
while I can't reproduce those extreme regressions myself I think the flag
is wrong.
Rationale:
QUEUE_FLAG_VIRT expands to QUEUE_FLAG_NONROT which casus the queue
unplugged immediately. This is not a good behaviour for at least
qemu and kvm where we do have significant overhead for every
I/O operations. Even with all the latested speeups (native AIO,
MSI support, zero copy) we can only get native speed for up to 128kb
I/O requests we already are down to 66% of native performance for 4kb
requests even on my laptop running the Intel X25-M SSD for which the
QUEUE_FLAG_NONROT was designed.
If we ever get virtio-blk overhead low enough that this flag makes
sense it should only be set based on a feature flag set by the host.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Add cciss_allow_hpsa module parameter. This parameter causes
the cciss driver to ignore any Smart Array devices known to be
supported by the hpsa driver.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>