Commit Graph

233828 Commits

Author SHA1 Message Date
Philipp Reisner
c4752ef128 drbd: When proxy's buffer drained off go into regular resync mode
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:49 +01:00
Philipp Reisner
73a01a18b9 drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNC
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:48 +01:00
Philipp Reisner
67531718d8 drbd: Implemented two new connection states Ahead/Behind
In this connection mode, the ahead node no longer replicates
application IO. The behind node's disk becomes outdated.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:46 +01:00
Philipp Reisner
422028b1ca drbd: New configuration parameters for dealing with network congestion
net {
    on_congestion {block|pull-ahead|disconnect};
    congestion-fill {sectors};
    congestion-extents {al-extents};
}

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:45 +01:00
Philipp Reisner
759fbdfba6 drbd: Track the numbers of sectors in flight
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:43 +01:00
Lars Ellenberg
688593c5a8 drbd: Renamed write_flags_to_bio() to wire_flags_to_bio()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:32 +01:00
Lars Ellenberg
4896e8c1b8 drbd: restore compatibility with 32bit kernels
With commit
drbd: further converge progress display of resync and online-verify
a u64/u64 division was accidentally introduced, causing an unresolvable
symbol __udivdi3 to be referenced. For that division, 32 bits are
still sufficient for now, so we can revert to unsigned long instead.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:13 +01:00
Lars Ellenberg
1816a2b47a drbd: properly use max_hw_sectors to limit our bio size
To ease tracking of bios in some hash tables, we want it to
not cross certain boundaries (128k, used to be 32k).
We limit the maximum bio size using queue parameters.

Historically some defines and variables we use there have been named
max_segment_size, which was misguided. Rename them to max_bio_size,
and use [blk_]queue_max_hw_sectors where appropriate.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:11 +01:00
Lars Ellenberg
3129b1b9ae drbd: debug: limit netlink-broadcast of request on digest mismatch to 32k
We used to be limited to 32k requests,
but have increased that limit to 128k now.

This part of the code can only deal with 32k,
it would scramble arbitrary pages for larger requests.

As it is used for debugging only anyways,
it is ok to simply truncate the dumped data here.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:09 +01:00
Lars Ellenberg
470be44ab1 drbd: detect modification of in-flight buffers
With data-integrity digest enabled, double-check on the sending side
for modifications by upper layers of buffers under write back,
so we can tell it apart from corruption on the "wire".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:08 +01:00
Lars Ellenberg
5f9915bbb8 drbd: further converge progress display of resync and online-verify
Show progressbar and ETA always, with proc_details >= 1 also show the
current sector position for both resync and online-verify on both nodes.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:06 +01:00
Lars Ellenberg
18edc0b9d7 drbd: fix potential wrap of 32bit oos:%lu display in /proc/drbd
When converting bits (4k resolution, still) to kB, we shift left.  If it
was a large number of bits on a 32bit box (>= 4 TiB storage), we may
wrap the 32bit unsigned long base type, resulting in incorrect display.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:04 +01:00
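
The wrap described in 18edc0b9d7 can be reproduced in user space. The following is a minimal sketch, not the DRBD code: it assumes one bitmap bit covers 4 KiB (so bits to KiB is a left shift by 2) and uses uint32_t to stand in for a 32-bit unsigned long. The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumption: one bitmap bit covers 4 KiB, so bits -> KiB is a shift by 2. */
#define BIT_TO_KB_SHIFT 2

/* Models a 32-bit unsigned long: bits shifted past bit 31 are lost. */
static inline uint32_t bits_to_kb_32(uint32_t bits)
{
    return bits << BIT_TO_KB_SHIFT;
}

/* Widening before the shift keeps the full result. */
static inline uint64_t bits_to_kb_64(uint32_t bits)
{
    return (uint64_t)bits << BIT_TO_KB_SHIFT;
}
```

With 2^30 out-of-sync bits (4 TiB at 4 KiB per bit), the 32-bit variant yields 0 while the widened variant yields 2^32 KiB.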
Lars Ellenberg
2649f0809f drbd: use the resync controller for online-verify requests as well
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:03 +01:00
Lars Ellenberg
e65f440d47 drbd: factor out drbd_rs_number_requests
Preparation patch to be able to use the auto-throttling resync controller
for online-verify requests as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:01 +01:00
Lars Ellenberg
9bd28d3c90 drbd: factor out drbd_rs_controller_reset
Preparation patch to be able to use the auto-throttling resync controller
for online-verify requests as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:59 +01:00
Lars Ellenberg
439d595379 drbd: show progress bar and ETA for online-verify
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:58 +01:00
Lars Ellenberg
ea5442aff6 drbd: advance progress step marks for online-verify
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:56 +01:00
Lars Ellenberg
c6ea14dfa3 drbd: factor out advancement of resync marks for progress reporting
This is in preparation to unify progress reporting of
online-verify and resync requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:54 +01:00
Lars Ellenberg
de228bba67 drbd: initialize online-verify progress tracking on verify target
For partial (resumed) online verify, initialize the resync step marks
once we know what the online verify start sector is.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:53 +01:00
Lars Ellenberg
30b743a2d5 drbd: improve online-verify progress tracking
For a partial (resumed) online-verify, initialize rs_total not to total
bits, but to number of bits to check in this run, to match the meaning
rs_total has for actual resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:51 +01:00
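
The effect of 30b743a2d5 on the progress display can be illustrated with a small sketch. It is not taken from the DRBD source: it assumes 512-byte sectors and 4 KiB per bitmap bit (8 sectors per bit), and both helper names are hypothetical.

```c
#include <stdint.h>

/* Assumption: 512-byte sectors, 4 KiB per bitmap bit => 8 sectors per bit. */
#define SECTORS_PER_BIT 8

/* Bits a verify run has to check when it resumes at start_sector.
 * Hypothetical helper, not a DRBD function. */
static inline uint64_t verify_bits_this_run(uint64_t total_bits,
                                            uint64_t start_sector)
{
    uint64_t start_bit = start_sector / SECTORS_PER_BIT;

    return start_bit >= total_bits ? 0 : total_bits - start_bit;
}

/* Progress in per mille, relative to the total passed in. */
static inline unsigned progress_permille(uint64_t run_total, uint64_t bits_left)
{
    if (run_total == 0)
        return 1000;
    if (bits_left > run_total)
        bits_left = run_total;
    return (unsigned)(((run_total - bits_left) * 1000) / run_total);
}
```

For a device of 1000 bits resumed at sector 4000, the run covers 500 bits. With 250 bits left, progress relative to the run is 500 per mille, whereas dividing by the device total would report 750.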
Lars Ellenberg
2652561886 drbd: only reset online-verify start sector if verify completed
For network hiccups during online-verify, on the next verify
triggered, we by default want to resume where it left off.

After any replication link interruption, there will be a (possibly
empty) resync.  Do not reset online-verify start sector if some resync
completed, that would defeat the purpose.

Only reset the start sector once a verify run is completed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:49 +01:00
Jens Axboe
4c63f5646e Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/core
Conflicts:
	block/blk-core.c
	block/blk-flush.c
	drivers/md/raid1.c
	drivers/md/raid10.c
	drivers/md/raid5.c
	fs/nilfs2/btnode.c
	fs/nilfs2/mdt.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:58:35 +01:00
Vivek Goyal
69d60eb96a blk-throttle: Use blk_plug in throttle dispatch
Use plug in throttle dispatch also as we are dispatching a bunch of
bios in throttle context and some of them might merge.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe
721a9602e6 block: kill off REQ_UNPLUG
With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe
cf15900e12 aio: remove request submission batching
This should be useless now that we have on-stack plugging. So let's just
kill it.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Shaohua Li
9f5b942546 fs: make aio plug
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe
2ed1a6bcf9 fs: make mpage read/write_pages() plug
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:26 +01:00
Jens Axboe
5b417b1873 read-ahead: use plugging
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:26 +01:00
Jens Axboe
55602dd66f fs: make generic file read/write functions plug
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:26 +01:00
Jens Axboe
7eaceaccab block: remove per-queue plugging
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:07 +01:00
Jens Axboe
73c1010119 block: initial patch for on-stack per-task plugging
This patch adds support for creating a queuing context outside
of the queue itself. This enables us to batch up pieces of IO
before grabbing the block device queue lock and submitting them to
the IO scheduler.

The context is created on the stack of the process and assigned in
the task structure, so that we can auto-unplug it if we hit a schedule
event.

The current queue plugging happens implicitly if IO is submitted to
an empty device, yet callers have to remember to unplug that IO when
they are going to wait for it. This is an ugly API and has caused bugs
in the past. Additionally, it requires hacks in the vm (->sync_page()
callback) to handle that logic. By switching to an explicit plugging
scheme we make the API a lot nicer and can get rid of the ->sync_page()
hack in the vm.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:45:54 +01:00
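
The batching idea behind 73c1010119 can be sketched as a toy user-space model. This is not the kernel API: every name below is hypothetical, the "queue lock" is only a counter, and the automatic unplug on a schedule event is not modelled. It only shows why collecting requests in a caller-owned context reduces trips to the queue.

```c
#include <stddef.h>

#define PLUG_MAX 16

/* Toy stand-in for a device queue; counts lock round trips. */
struct toy_queue {
    unsigned lock_acquisitions;
    unsigned dispatched;
};

/* Toy stand-in for the on-stack plug owned by the submitter. */
struct toy_plug {
    int pending[PLUG_MAX];
    size_t count;
};

/* Hand a batch to the queue: one lock acquisition per batch. */
static inline void toy_queue_dispatch(struct toy_queue *q,
                                      const int *reqs, size_t n)
{
    (void)reqs;               /* request payload is irrelevant to the model */
    q->lock_acquisitions++;
    q->dispatched += (unsigned)n;
}

static inline void toy_plug_start(struct toy_plug *plug)
{
    plug->count = 0;
}

/* Explicit unplug: flush whatever has been collected. */
static inline void toy_plug_finish(struct toy_queue *q, struct toy_plug *plug)
{
    if (plug->count) {
        toy_queue_dispatch(q, plug->pending, plug->count);
        plug->count = 0;
    }
}

/* Submit one request, either directly (plug == NULL) or via the plug. */
static inline void toy_submit(struct toy_queue *q, struct toy_plug *plug,
                              int req)
{
    if (!plug) {
        toy_queue_dispatch(q, &req, 1);
        return;
    }
    if (plug->count == PLUG_MAX)
        toy_plug_finish(q, plug);
    plug->pending[plug->count++] = req;
}
```

In this model, four requests submitted between toy_plug_start() and toy_plug_finish() cost one lock acquisition, while four unplugged submissions cost four.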
Jens Axboe
a488e74976 scsi: convert to blk_delay_queue()
It was always abuse to reuse the plugging infrastructure for this,
convert it to the (new) real API for delaying queueing a bit. A
default delay of 3 msec is defined, to match the previous
behaviour.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:45:54 +01:00
Jens Axboe
0a41e90bb7 ide-cd: convert to blk_delay_queue() for a short pause
It was always abuse to reuse the plugging infrastructure for this,
convert it to the (new) real API for delaying queueing a bit.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Acked-by: David S. Miller <davem@davemloft.net>
2011-03-10 08:45:54 +01:00
Jens Axboe
3cca6dc1c8 block: add API for delaying work/request_fn a little bit
Currently we use plugging for that, but as plugging is going away,
we need an alternative mechanism.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:45:54 +01:00
Tejun Heo
cafb0bfca1 staging: Convert to bdops->check_events()
Convert two staging drivers - blkvsc_drv and cyasblkdev_block - from
->media_changed() to ->check_events().  The former always indicated
media changed while the latter always indicated media not changed.
Not sure what the drivers are trying to achieve but keep the original
behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:29 +01:00
Tejun Heo
3c0d206092 pktcdvd: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

pktcdvd needs to forward all event related operations to the
underlying device.  Forward ->check_events() instead of
->media_changed() and inherit disk->[async_]events.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Peter Osterlund <petero2@telia.com>
2011-03-09 19:54:28 +01:00
Tejun Heo
6fac80e3aa umem: Drop dummy ->media_changed()
umem doesn't implement media changed detection and there's no need to
implement dummy callback anymore.  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:28 +01:00
Tejun Heo
ffe80cea35 s390/tape_block: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

s390/tape_block buffers media changed state and clears it on
revalidation.  It will behave correctly with kernel event polling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-03-09 19:54:28 +01:00
Tejun Heo
f47350fdec i2o_block: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

i2o_block buffers media changed state and clears it after reporting.
It will behave correctly with kernel event polling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
2011-03-09 19:54:28 +01:00
Tejun Heo
3a200911ad xsysace: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

xsysace buffers media changed state and clears it on revalidation.  It
will behave correctly with kernel event polling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:28 +01:00
Tejun Heo
aaa7c01546 ub: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

ub buffers media changed state and clears it on revalidation.  It will
behave correctly with kernel event polling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Pete Zaitcev <zaitcev@redhat.com>
2011-03-09 19:54:28 +01:00
Tejun Heo
4bbde77787 swim[3]: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

Both swim and swim3 buffer media changed state and clear it on
revalidation.  They will behave correctly with kernel event polling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Laurent Vivier <laurent@lvivier.info>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-09 19:54:28 +01:00
Tejun Heo
507daea227 dac960: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

DAC960 media change notification seems to be one way (once set, never
cleared) and will generate spurious events when polled once the
condition triggers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:28 +01:00
Tejun Heo
b1b56b93f3 paride: Convert to bdops->check_events()
Convert paride drivers from ->media_changed() to ->check_events().

pcd and pd buffer and clear events after reporting; however, pf
unconditionally reports MEDIA_CHANGE and will generate spurious events
when polled.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Tim Waugh <tim@cyberelk.net>
2011-03-09 19:54:28 +01:00
Tejun Heo
1c27030bd2 gdrom,viocd: Convert to bdops->check_events()
Convert gdrom and viocd from ->media_changed() to ->check_events().

It's unclear how the conditions are cleared and it's possible that it
may generate spurious events when polled.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:28 +01:00
Tejun Heo
1a8a74f03f floppy,{ami|ata}flop: Convert to bdops->check_events()
Convert the floppy drivers from ->media_changed() to ->check_events().
Both floppy and ataflop buffer media changed state bit and clear them
on revalidation and will behave correctly with kernel event polling.

I can't tell how amiflop clears its event and it's possible that it
may generate spurious events when polled.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:27 +01:00
Tejun Heo
5b03a1b140 ide: Convert to bdops->check_events()
Convert ->media_changed() to the new ->check_events() method.  The
conversion is mostly mechanical.  The only notable change is that
cdrom now doesn't generate any event if @slot_nr isn't CDSL_CURRENT.
It used to return -EINVAL which would be treated as media changed.  As
media changer isn't supported anyway, this doesn't make any
difference.

This makes ide emit the standard disk events and allows kernel event
polling.  Currently, only MEDIA_CHANGE event is implemented.  Adding
support for EJECT_REQUEST shouldn't be difficult; however, given that
ide driver is already deprecated, it probably is best to leave it
alone.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-ide@vger.kernel.org
2011-03-09 19:54:27 +01:00
Tejun Heo
69e02c59a7 block: Don't check events while open is in progress
Not all block drivers clear events immediately after reporting.  Some
do so in ->revalidate_disk() or other steps during ->open().  There is
a slim chance event poll may happen between the clearing event check
from check_disk_change() and the actual clearing of the events which
would result in spurious events.

Block event checks while block device open is in progress.  There is
no need to kick explicit event check afterwards as events are always
checked during open.

-v2: The original patch could have called disk_unblock_events() with
     an already released or %NULL @disk causing oops.  Fixed by making
     sure references are put after disk_unblock_events() is called.
     It also makes the error path of __blkdev_get() a bit simpler.
     This problem was reported by Jens.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:27 +01:00
Tejun Heo
6936217cc7 block: Don't check events on close unless it was blocked
The block event mechanism currently always checks events when the
device is being closed regardless of the open mode.  The intention was
to allow detection of EJECT_REQUEST when a device is closed whether
disk event polling is enabled or not.

This is unnecessary as, for devices of interest, events are checked
from either userland or kernel and in the former case ->check_events()
is performed on open of each poll attempt anyway.  Furthermore, this
unconditional event check on close makes the code susceptible to event
loop if the block driver doesn't clear reported events correctly - an
event triggers userland to open and close the device which in turn
causes another event, rinse and repeat.

Check events on close only if it was blocked by excl write open.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:27 +01:00
Tejun Heo
facc31ddc3 block: Don't implicitly trigger event check on disk_unblock_events()
Currently, disk_unblock_events() implicitly kicks an event check if the
block count reaches zero.  This behavior is not described in the
comment and hinders future changes.  Make the unblocker
explicitly check events by calling disk_check_events() as necessary.

This patch doesn't cause any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:27 +01:00