From 1be62dc190ebaca331038962c873e7967de6cc4b Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Fri, 4 Apr 2008 14:38:17 -0700 Subject: [PATCH] Be more careful about marking buffers dirty Mikulas Patocka noted that the optimization where we check if a buffer was already dirty (and we avoid re-dirtying it) was not really SMP-safe. Since the read of the old status was not synchronized with anything, an aggressive CPU re-ordering of memory accesses might have moved that read up to before the data was even written to the buffer, and another CPU that cleaned it again, causing the newly dirty state to never actually hit the disk. Admittedly this would probably never trigger in practice, but it's still wrong. Mikulas sent a patch that fixed the problem, but I dislike the subtlety of the whole optimization, so this is an alternate fix that is more explicit about the particular SMP ordering for the optimization, and separates out the speculative reads of the buffer state into its own conditional (and makes the memory barrier only happen if we are likely to actually hit the optimized case in the first place). I considered removing the optimization entirely, but Andrew argued for it's continued existence. I'm a push-over. Cc: Mikulas Patocka Cc: Andrew Morton Signed-off-by: Linus Torvalds --- fs/buffer.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/fs/buffer.c b/fs/buffer.c index 98196327ddf0..39ff14403d13 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -1181,7 +1181,20 @@ __getblk_slow(struct block_device *bdev, sector_t block, int size) void mark_buffer_dirty(struct buffer_head *bh) { WARN_ON_ONCE(!buffer_uptodate(bh)); - if (!buffer_dirty(bh) && !test_set_buffer_dirty(bh)) + + /* + * Very *carefully* optimize the it-is-already-dirty case. + * + * Don't let the final "is it dirty" escape to before we + * perhaps modified the buffer. + */ + if (buffer_dirty(bh)) { + smp_mb(); + if (buffer_dirty(bh)) + return; + } + + if (!test_set_buffer_dirty(bh)) __set_page_dirty(bh->b_page, page_mapping(bh->b_page), 0); }