2006-08-30 02:06:00 +08:00
/* bounce buffer handling for block devices
 *
 * - Split from highmem.c
 */

2014-06-07 05:38:30 +08:00
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

2006-08-30 02:06:00 +08:00
#include <linux/mm.h>
2011-10-16 14:01:52 +08:00
#include <linux/export.h>
2006-08-30 02:06:00 +08:00
#include <linux/swap.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/gfp.h>
2006-08-30 02:06:00 +08:00
#include <linux/bio.h>
#include <linux/pagemap.h>
#include <linux/mempool.h>
#include <linux/blkdev.h>
2015-05-23 05:13:32 +08:00
#include <linux/backing-dev.h>
2006-08-30 02:06:00 +08:00
#include <linux/init.h>
#include <linux/hash.h>
#include <linux/highmem.h>
2011-10-21 03:24:30 +08:00
#include <linux/bootmem.h>
2014-06-07 05:38:30 +08:00
#include <linux/printk.h>
2006-08-30 02:06:00 +08:00
#include <asm/tlbflush.h>
tracing/events: convert block trace points to TRACE_EVENT()
TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
these new capabilities to this tracepoint:
- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
...
Cons:
- no dev_t info for the output of plug, unplug_timer and unplug_io events.
no dev_t info for getrq and sleeprq events if bio == NULL.
no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
This is mainly because we can't get the device from a request queue.
But this may change in the future.
- A packet command is converted to a string in TP_assign, not TP_print.
Blktrace, by contrast, does the conversion just before output.
Since pc requests should be rather rare, this is not a big issue.
- In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
has a unique format, which means we have some unused data in a trace entry.
The overhead is minimized by using __dynamic_array() instead of __array().
I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
     dd                  dd + ioctl blktrace   dd + TRACE_EVENT (splice)
1    7.36s, 42.7 MB/s    7.50s, 42.0 MB/s      7.41s, 42.5 MB/s
2    7.43s, 42.3 MB/s    7.48s, 42.1 MB/s      7.43s, 42.4 MB/s
3    7.38s, 42.6 MB/s    7.45s, 42.2 MB/s      7.41s, 42.5 MB/s
So the overhead of tracing is very small, and no regression when using
those trace events vs blktrace.
And the binary output of TRACE_EVENT is much smaller than blktrace:
# ls -l -h
-rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
-rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
-rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
Following are some comparisons between TRACE_EVENT and blktrace:
plug:
kjournald-480 [000] 303.084981: block_plug: [kjournald]
kjournald-480 [000] 303.084981: 8,0 P N [kjournald]
unplug_io:
kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1
remap:
kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
kjournald-480 [000] 303.085043: 8,0 A W 102736992 + 8 <- (8,8) 33384
bio_backmerge:
kjournald-480 [000] 303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
kjournald-480 [000] 303.085086: 8,0 M W 102737032 + 8 [kjournald]
getrq:
kjournald-480 [000] 303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084975: 8,0 G W 102736984 + 8 [kjournald]
bash-2066 [001] 1072.953770: 8,0 G N [bash]
bash-2066 [001] 1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
rq_complete:
konsole-2065 [001] 300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
konsole-2065 [001] 300.053191: 8,0 C W 103669040 + 16 [0]
ksoftirqd/1-7 [001] 1072.953811: 8,0 C N (5a 00 08 00 00 00 00 00 24 00) [0]
ksoftirqd/1-7 [001] 1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
rq_insert:
kjournald-480 [000] 303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084986: 8,0 I W 102736984 + 8 [kjournald]
Changelog from v2 -> v3:
- use the newly introduced __dynamic_array().
Changelog from v1 -> v2:
- use __string() instead of __array() to minimize the memory required
to store hex dump of rq->cmd().
- support large pc requests.
- add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
- some cleanups.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-09 13:43:05 +08:00
#include <trace/events/block.h>

2006-08-30 02:06:00 +08:00
#define POOL_SIZE	64
#define ISA_POOL_SIZE	16

static mempool_t *page_pool, *isa_page_pool;

2012-06-17 04:41:05 +08:00
#if defined(CONFIG_HIGHMEM) || defined(CONFIG_NEED_BOUNCE_POOL)
2006-08-30 02:06:00 +08:00
static __init int init_emergency_pool(void)
{
2012-06-17 04:41:05 +08:00
#if defined(CONFIG_HIGHMEM) && !defined(CONFIG_MEMORY_HOTPLUG)
2011-10-21 03:24:30 +08:00
	if (max_pfn <= max_low_pfn)
2006-08-30 02:06:00 +08:00
		return 0;
2011-10-21 03:24:30 +08:00
#endif
2006-08-30 02:06:00 +08:00

	page_pool = mempool_create_page_pool(POOL_SIZE, 0);
	BUG_ON(!page_pool);
2014-06-07 05:38:30 +08:00
	pr_info("pool size: %d pages\n", POOL_SIZE);
2006-08-30 02:06:00 +08:00

	return 0;
}

__initcall(init_emergency_pool);
2012-06-17 04:41:05 +08:00
#endif
2006-08-30 02:06:00 +08:00

2012-06-17 04:41:05 +08:00
#ifdef CONFIG_HIGHMEM
2006-08-30 02:06:00 +08:00
/*
 * highmem version, map in to vec
 */
static void bounce_copy_vec(struct bio_vec *to, unsigned char *vfrom)
{
	unsigned long flags;
	unsigned char *vto;

	local_irq_save(flags);
2011-11-25 23:14:39 +08:00
	vto = kmap_atomic(to->bv_page);
2006-08-30 02:06:00 +08:00
	memcpy(vto + to->bv_offset, vfrom, to->bv_len);
2011-11-25 23:14:39 +08:00
	kunmap_atomic(vto);
2006-08-30 02:06:00 +08:00
	local_irq_restore(flags);
}

#else /* CONFIG_HIGHMEM */

#define bounce_copy_vec(to, vfrom)	\
	memcpy(page_address((to)->bv_page) + (to)->bv_offset, vfrom, (to)->bv_len)

#endif /* CONFIG_HIGHMEM */

/*
 * allocate pages in the DMA region for the ISA pool
 */
static void *mempool_alloc_pages_isa(gfp_t gfp_mask, void *data)
{
	return mempool_alloc_pages(gfp_mask | GFP_DMA, data);
}

/*
 * gets called "every" time someone init's a queue with BLK_BOUNCE_ISA
 * as the max address, so check if the pool has already been created.
 */
int init_emergency_isa_pool(void)
{
	if (isa_page_pool)
		return 0;

	isa_page_pool = mempool_create(ISA_POOL_SIZE, mempool_alloc_pages_isa,
				       mempool_free_pages, (void *) 0);
	BUG_ON(!isa_page_pool);

2014-06-07 05:38:30 +08:00
	pr_info("isa pool size: %d pages\n", ISA_POOL_SIZE);
2006-08-30 02:06:00 +08:00
	return 0;
}

/*
 * Simple bounce buffer support for highmem pages. Depending on the
 * queue gfp mask set, *to may or may not be a highmem page. kmap it
 * always, it will do the Right Thing
 */
static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
{
	unsigned char *vfrom;
2013-11-24 09:19:00 +08:00
	struct bio_vec tovec, *fromvec = from->bi_io_vec;
	struct bvec_iter iter;

	bio_for_each_segment(tovec, to, iter) {
		if (tovec.bv_page != fromvec->bv_page) {
			/*
			 * fromvec->bv_offset and fromvec->bv_len might have
			 * been modified by the block layer, so use the original
			 * copy, bounce_copy_vec already uses tovec->bv_len
			 */
			vfrom = page_address(fromvec->bv_page) +
				tovec.bv_offset;

			bounce_copy_vec(&tovec, vfrom);
			flush_dcache_page(tovec.bv_page);
		}
2006-08-30 02:06:00 +08:00

2013-11-24 09:19:00 +08:00
		fromvec++;
2006-08-30 02:06:00 +08:00
	}
}

static void bounce_end_io(struct bio *bio, mempool_t *pool, int err)
{
	struct bio *bio_orig = bio->bi_private;
	struct bio_vec *bvec, *org_vec;
	int i;

	/*
	 * free up bounce indirect pages used
	 */
2013-02-07 04:23:11 +08:00
	bio_for_each_segment_all(bvec, bio, i) {
2006-08-30 02:06:00 +08:00
		org_vec = bio_orig->bi_io_vec + i;
		if (bvec->bv_page == org_vec->bv_page)
			continue;

		dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
		mempool_free(bvec->bv_page, pool);
	}

2007-09-27 18:47:43 +08:00
	bio_endio(bio_orig, err);
2006-08-30 02:06:00 +08:00
	bio_put(bio);
}

2007-09-27 18:47:43 +08:00
static void bounce_end_io_write(struct bio *bio, int err)
2006-08-30 02:06:00 +08:00
{
	bounce_end_io(bio, page_pool, err);
}

2007-09-27 18:47:43 +08:00
static void bounce_end_io_write_isa(struct bio *bio, int err)
2006-08-30 02:06:00 +08:00
{
	bounce_end_io(bio, isa_page_pool, err);
}

static void __bounce_end_io_read(struct bio *bio, mempool_t *pool, int err)
{
	struct bio *bio_orig = bio->bi_private;

	if (test_bit(BIO_UPTODATE, &bio->bi_flags))
		copy_to_high_bio_irq(bio_orig, bio);

	bounce_end_io(bio, pool, err);
}

2007-09-27 18:47:43 +08:00
static void bounce_end_io_read(struct bio *bio, int err)
2006-08-30 02:06:00 +08:00
{
	__bounce_end_io_read(bio, page_pool, err);
}

2007-09-27 18:47:43 +08:00
static void bounce_end_io_read_isa(struct bio *bio, int err)
2006-08-30 02:06:00 +08:00
{
	__bounce_end_io_read(bio, isa_page_pool, err);
}

2013-02-22 08:42:55 +08:00
#ifdef CONFIG_NEED_BOUNCE_POOL
static int must_snapshot_stable_pages(struct request_queue *q, struct bio *bio)
{
	if (bio_data_dir(bio) != WRITE)
		return 0;

	if (!bdi_cap_stable_pages_required(&q->backing_dev_info))
		return 0;

2013-04-30 06:07:25 +08:00
	return test_bit(BIO_SNAP_STABLE, &bio->bi_flags);
2013-02-22 08:42:55 +08:00
}
#else
static int must_snapshot_stable_pages(struct request_queue *q, struct bio *bio)
{
	return 0;
}
#endif /* CONFIG_NEED_BOUNCE_POOL */

2007-07-24 15:28:11 +08:00
static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
2013-02-22 08:42:55 +08:00
			       mempool_t *pool, int force)
2006-08-30 02:06:00 +08:00
{
2012-09-11 05:30:37 +08:00
	struct bio *bio;
	int rw = bio_data_dir(*bio_orig);
2013-11-24 09:19:00 +08:00
	struct bio_vec *to, from;
	struct bvec_iter iter;
2012-09-11 05:30:37 +08:00
	unsigned i;
2006-08-30 02:06:00 +08:00

2013-10-01 04:45:09 +08:00
	if (force)
		goto bounce;
2013-11-24 09:19:00 +08:00
	bio_for_each_segment(from, *bio_orig, iter)
		if (page_to_pfn(from.bv_page) > queue_bounce_pfn(q))
2012-09-11 05:30:37 +08:00
			goto bounce;
2006-08-30 02:06:00 +08:00

2012-09-11 05:30:37 +08:00
	return;
bounce:
	bio = bio_clone_bioset(*bio_orig, GFP_NOIO, fs_bio_set);
2006-08-30 02:06:00 +08:00

2012-09-06 06:22:02 +08:00
	bio_for_each_segment_all(to, bio, i) {
2012-09-11 05:30:37 +08:00
		struct page *page = to->bv_page;
2008-12-23 19:44:19 +08:00

2012-09-11 05:30:37 +08:00
		if (page_to_pfn(page) <= queue_bounce_pfn(q) && !force)
			continue;
2006-08-30 02:06:00 +08:00

2012-09-11 05:30:37 +08:00
		to->bv_page = mempool_alloc(pool, q->bounce_gfp);
block:bounce: fix call inc_|dec_zone_page_state on different pages confuse value of NR_BOUNCE
Commit d2c5e30c9a1420902262aa923794d2ae4e0bc391
("[PATCH] zoned vm counters: conversion of nr_bounce to per zone counter")
converted the nr_bounce statistic to a per-zone counter plus one global value
in vm_stat, but it calls inc_|dec_zone_page_state on different pages, and
therefore on different zones, which causes unexpected values of NR_BOUNCE.
Below is the result on my machine:
Mar 2 09:26:08 udknight kernel: [144766.778265] Mem-Info:
Mar 2 09:26:08 udknight kernel: [144766.778266] DMA per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778268] CPU 0: hi: 0, btch: 1 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778269] CPU 1: hi: 0, btch: 1 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778270] Normal per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778271] CPU 0: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778273] CPU 1: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778274] HighMem per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778275] CPU 0: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778276] CPU 1: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778279] active_anon:46926 inactive_anon:287406 isolated_anon:0
Mar 2 09:26:08 udknight kernel: [144766.778279] active_file:105085 inactive_file:139432 isolated_file:0
Mar 2 09:26:08 udknight kernel: [144766.778279] unevictable:653 dirty:0 writeback:0 unstable:0
Mar 2 09:26:08 udknight kernel: [144766.778279] free:178957 slab_reclaimable:6419 slab_unreclaimable:9966
Mar 2 09:26:08 udknight kernel: [144766.778279] mapped:4426 shmem:305277 pagetables:784 bounce:0
Mar 2 09:26:08 udknight kernel: [144766.778279] free_cma:0
Mar 2 09:26:08 udknight kernel: [144766.778286] DMA free:3324kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 2 09:26:08 udknight kernel: [144766.778287] lowmem_reserve[]: 0 822 3754 3754
Mar 2 09:26:08 udknight kernel: [144766.778293] Normal free:26828kB min:3632kB low:4540kB high:5448kB active_anon:4872kB inactive_anon:68kB active_file:1796kB inactive_file:1796kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:892920kB managed:842560kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:4144kB slab_reclaimable:25676kB slab_unreclaimable:39864kB kernel_stack:1944kB pagetables:3136kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2412612 all_unreclaimable? yes
Mar 2 09:26:08 udknight kernel: [144766.778294] lowmem_reserve[]: 0 0 23451 23451
Mar 2 09:26:08 udknight kernel: [144766.778299] HighMem free:685676kB min:512kB low:3748kB high:6984kB active_anon:182832kB inactive_anon:1149556kB active_file:418544kB inactive_file:555932kB unevictable:2612kB isolated(anon):0kB isolated(file):0kB present:3001732kB managed:3001732kB mlocked:0kB dirty:0kB writeback:0kB mapped:17704kB shmem:1216964kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:75771152kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Mar 2 09:26:08 udknight kernel: [144766.778300] lowmem_reserve[]: 0 0 0 0
You can see bounce:75771152kB for HighMem, but bounce:0 for lowmem and global.
This patch fixes it.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-04-26 16:43:31 +08:00
		inc_zone_page_state(to->bv_page, NR_BOUNCE);
2006-08-30 02:06:00 +08:00

		if (rw == WRITE) {
			char *vto, *vfrom;

2012-09-11 05:30:37 +08:00
			flush_dcache_page(page);

2006-08-30 02:06:00 +08:00
			vto = page_address(to->bv_page) + to->bv_offset;
2012-09-11 05:30:37 +08:00
			vfrom = kmap_atomic(page) + to->bv_offset;
2006-08-30 02:06:00 +08:00
			memcpy(vto, vfrom, to->bv_len);
2012-09-11 05:30:37 +08:00
			kunmap_atomic(vfrom);
2006-08-30 02:06:00 +08:00
		}
	}

2008-10-30 15:34:33 +08:00
	trace_block_bio_bounce(q, *bio_orig);
2007-01-12 19:20:26 +08:00

2006-08-30 02:06:00 +08:00
	bio->bi_flags |= (1 << BIO_BOUNCED);

	if (pool == page_pool) {
		bio->bi_end_io = bounce_end_io_write;
		if (rw == READ)
			bio->bi_end_io = bounce_end_io_read;
	} else {
		bio->bi_end_io = bounce_end_io_write_isa;
		if (rw == READ)
			bio->bi_end_io = bounce_end_io_read_isa;
	}

	bio->bi_private = *bio_orig;
	*bio_orig = bio;
}
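
The NR_BOUNCE accounting in __blk_queue_bounce() and bounce_end_io() can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct page_model`, `inc_bounce`, `dec_bounce`, and `simulate` are hypothetical names. It assumes, following the 2015-04-26 commit message, that before the fix the increment was charged to a page in a different zone than the page later passed to the decrement. Unlike the kernel's reported statistics, the model lets a counter go negative to make the imbalance visible.

```c
/* Hypothetical userspace model: one NR_BOUNCE-like counter per zone. */
enum zone_id { ZONE_LOW, ZONE_HIGH, NR_ZONES };

struct page_model {
	enum zone_id zone;	/* zone the modeled page belongs to */
};

static long nr_bounce[NR_ZONES];

static void inc_bounce(const struct page_model *page)
{
	nr_bounce[page->zone]++;
}

static void dec_bounce(const struct page_model *page)
{
	nr_bounce[page->zone]--;
}

/*
 * Simulate `rounds` bounced I/Os. When count_on_bounce_page is zero, the
 * increment is charged to the original (high zone) page, which models the
 * mismatch. Otherwise it is charged to the bounce (low zone) page. The
 * decrement is always applied to the bounce page, as at I/O completion.
 */
static void simulate(int rounds, int count_on_bounce_page)
{
	struct page_model orig = { ZONE_HIGH };
	struct page_model bounce = { ZONE_LOW };
	int i;

	nr_bounce[ZONE_LOW] = 0;
	nr_bounce[ZONE_HIGH] = 0;

	for (i = 0; i < rounds; i++) {
		inc_bounce(count_on_bounce_page ? &bounce : &orig);
		dec_bounce(&bounce);
	}
}
```

Calling `simulate(n, 0)` leaves the high-zone counter growing with every I/O while the low-zone counter drifts the other way. `simulate(n, 1)` keeps both counters balanced, because the increment and the decrement then hit the same page.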

2007-07-24 15:28:11 +08:00
void blk_queue_bounce(struct request_queue *q, struct bio **bio_orig)
2006-08-30 02:06:00 +08:00
{
2013-02-22 08:42:55 +08:00
	int must_bounce;
2006-08-30 02:06:00 +08:00
	mempool_t *pool;

2007-09-27 19:01:25 +08:00
	/*
	 * Data-less bio, nothing to bounce
	 */
2008-08-14 19:12:15 +08:00
	if (!bio_has_data(*bio_orig))
2007-09-27 19:01:25 +08:00
		return;

2013-02-22 08:42:55 +08:00
	must_bounce = must_snapshot_stable_pages(q, *bio_orig);

2006-08-30 02:06:00 +08:00
	/*
	 * for non-isa bounce case, just check if the bounce pfn is equal
	 * to or bigger than the highest pfn in the system -- in that case,
	 * don't waste time iterating over bio segments
	 */
	if (!(q->bounce_gfp & GFP_DMA)) {
2013-02-22 08:42:55 +08:00
		if (queue_bounce_pfn(q) >= blk_max_pfn && !must_bounce)
2006-08-30 02:06:00 +08:00
			return;
		pool = page_pool;
	} else {
		BUG_ON(!isa_page_pool);
		pool = isa_page_pool;
	}

	/*
	 * slow path
	 */
2013-02-22 08:42:55 +08:00
	__blk_queue_bounce(q, bio_orig, pool, must_bounce);
2006-08-30 02:06:00 +08:00
}

EXPORT_SYMBOL(blk_queue_bounce);
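
The decision made by blk_queue_bounce() and __blk_queue_bounce() can be summarized in a userspace sketch. The names `can_skip_scan` and `needs_bounce` are hypothetical, and plain arrays of page frame numbers stand in for bio segments. The model covers only the pfn comparison and the `force` flag, not the ISA pool selection or the data copy.

```c
#include <stddef.h>

/*
 * Fast path (hypothetical model): if the queue can address every page in
 * the system and bouncing is not forced, no segment needs to be inspected.
 */
static int can_skip_scan(unsigned long bounce_pfn, unsigned long max_pfn,
			 int force)
{
	return bounce_pfn >= max_pfn && !force;
}

/*
 * Slow path (hypothetical model): bounce if forced, or if any segment's
 * page frame number lies above the queue's bounce limit.
 */
static int needs_bounce(const unsigned long *seg_pfns, size_t nr_segs,
			unsigned long bounce_pfn, int force)
{
	size_t i;

	if (force)
		return 1;

	for (i = 0; i < nr_segs; i++)
		if (seg_pfns[i] > bounce_pfn)
			return 1;

	return 0;
}
```

A caller would test `can_skip_scan()` first and fall back to `needs_bounce()` only when the fast path does not apply, mirroring the order of checks in the functions above.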