linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-11-18 23:54:26 +08:00

Michael S. Tsirkin 298359c5bf exit: fix oops in sync_mm_rss	2010-03-24 16:31:21 -07:00

In 2.6.34-rc1, removing vhost_net module causes an oops in sync_mm_rss (called from do_exit) when workqueue is destroyed. This does not happen on net-next, or with vhost on top of 2.6.33.

The issue seems to be introduced by `34e55232e5` ("mm: avoid false sharing of mm_counter") which added sync_mm_rss() that is passed task->mm, and dereferences it without checking. If task is a kernel thread, mm might be NULL. I think this might also happen e.g. with aio.

This patch fixes the oops by calling sync_mm_rss when task->mm is set to NULL. I also added BUG_ON to detect any other cases where counters get incremented while mm is NULL.

The oops I observed looks like this:

BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
IP: [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
PGD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
CPU 2
Modules linked in: vhost_net(-) tun bridge stp sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm i5000_edac edac_core rtc_cmos bnx2 button i2c_i801 i2c_core rtc_core e1000e sg joydev ide_cd_mod serio_raw pcspkr rtc_lib cdrom virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]

Pid: 2046, comm: vhost Not tainted 2.6.34-rc1-vhost #25 System Planar/IBM System x3550 -[7978B3G]-
RIP: 0010:[<ffffffff810b436d>] [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
RSP: 0018:ffff8802379b7e60 EFLAGS: 00010202
RAX: 0000000000000008 RBX: ffff88023f2390c0 RCX: 0000000000000000
RDX: ffff88023f2396b0 RSI: 0000000000000000 RDI: ffff88023f2390c0
RBP: ffff8802379b7e60 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88023aecfbc0 R11: 0000000000013240 R12: 0000000000000000
R13: ffffffff81051a6c R14: ffffe8ffffc0f540 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000002a8 CR3: 000000023af23000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process vhost (pid: 2046, threadinfo ffff8802379b6000, task ffff88023f2390c0)
Stack:
 ffff8802379b7ee0 ffffffff81040687 ffffe8ffffc0f558 ffffffffa00a3e2d
<0> 0000000000000000 ffff88023f2390c0 ffffffff81055817 ffff8802379b7e98
<0> ffff8802379b7e98 0000000100000286 ffff8802379b7ee0 ffff88023ad47d78
Call Trace:
 [<ffffffff81040687>] do_exit+0x147/0x6c4
 [<ffffffffa00a3e2d>] ? handle_rx_net+0x0/0x17 [vhost_net]
 [<ffffffff81055817>] ? autoremove_wake_function+0x0/0x39
 [<ffffffff81051a6c>] ? worker_thread+0x0/0x229
 [<ffffffff810553c9>] kthreadd+0x0/0xf2
 [<ffffffff810038d4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81055342>] ? kthread+0x0/0x87
 [<ffffffff810038d0>] ? kernel_thread_helper+0x0/0x10
Code: 00 8b 87 6c 02 00 00 85 c0 74 14 48 98 f0 48 01 86 a0 02 00 00 c7 87 6c 02 00 00 00 00 00 00 8b 87 70 02 00 00 85 c0 74 14 48 98 <f0> 48 01 86 a8 02 00 00 c7 87 70 02 00 00 00 00 00 00 8b 87 74
RIP [<ffffffff810b436d>] sync_mm_rss+0x33/0x6f
 RSP <ffff8802379b7e60>
CR2: 00000000000002a8
---[ end trace 41603ba922beddd2 ]---
Fixing recursive fault but reboot is needed!

(note: handle_rx_net is a work item using workqueue in question). sync_mm_rss+0x33/0x6f gave me a hint. I also tried reverting `34e55232e5` and the oops goes away.

The module in question calls use_mm and later unuse_mm from a kernel thread. It is when this kernel thread is destroyed that the crash happens.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
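
The failure mode described in this commit message can be illustrated with a small user-space model. This is a minimal sketch, not the kernel patch itself: `mm_model`, `task_model`, and every `*_model` function are hypothetical stand-ins invented for illustration, and `assert` plays the role that `BUG_ON` plays in the kernel. It assumes the fix works as the message says: cached counters are flushed at the point where the task's mm is set to NULL, so the exit path never has to dereference a NULL mm.

```c
#include <assert.h>
#include <stddef.h>

#define NR_MM_COUNTERS 3	/* hypothetical number of RSS counters */

/* Hypothetical stand-in for the shared per-mm counters. */
struct mm_model {
	long rss_stat[NR_MM_COUNTERS];
};

/* Hypothetical stand-in for a task that caches counter deltas privately. */
struct task_model {
	struct mm_model *mm;		/* NULL for a plain kernel thread */
	int rss_cached[NR_MM_COUNTERS];
};

/* Flush the task's cached deltas into mm. The caller must pass a valid mm. */
static void sync_mm_rss_model(struct task_model *task, struct mm_model *mm)
{
	int i;

	assert(mm != NULL);	/* models the BUG_ON mentioned in the message */
	for (i = 0; i < NR_MM_COUNTERS; i++) {
		if (task->rss_cached[i]) {
			mm->rss_stat[i] += task->rss_cached[i];
			task->rss_cached[i] = 0;
		}
	}
}

/* Cache a counter update in the task; only legal while an mm is attached. */
static void add_rss_fast_model(struct task_model *task, int member, int val)
{
	assert(task->mm != NULL);
	task->rss_cached[member] += val;
}

/* A kernel thread temporarily adopts a user mm. */
static void use_mm_model(struct task_model *task, struct mm_model *mm)
{
	task->mm = mm;
}

/* Flush cached counters while mm is still valid, then drop the mm. */
static void unuse_mm_model(struct task_model *task)
{
	sync_mm_rss_model(task, task->mm);
	task->mm = NULL;
}

/* Exit path: only sync when there is an mm left to sync into. */
static void do_exit_model(struct task_model *task)
{
	if (task->mm)
		sync_mm_rss_model(task, task->mm);
}

/* Walk through the lifecycle that crashed: use_mm, unuse_mm, then exit. */
long demo_kthread_lifecycle(void)
{
	struct mm_model mm = { { 0 } };
	struct task_model kthread = { NULL, { 0 } };

	use_mm_model(&kthread, &mm);
	add_rss_fast_model(&kthread, 0, 2);
	add_rss_fast_model(&kthread, 0, 3);
	unuse_mm_model(&kthread);	/* deltas reach mm before mm becomes NULL */
	do_exit_model(&kthread);	/* no NULL dereference: sync is skipped */
	return mm.rss_stat[0];
}
```

Calling `demo_kthread_lifecycle()` from a `main` function exercises the whole sequence. Without the flush in `unuse_mm_model` and the NULL check in `do_exit_model`, the exit path in this model would hit the same kind of NULL mm access that the oops above reports.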
..
backing-dev.c	flusher: Fix PF_FROZEN race	2009-12-03 13:49:43 +01:00
bootmem.c	x86: Make 64 bit use early_res instead of bootmem before slab	2010-02-12 09:41:59 -08:00
bounce.c	block: remove some includings of blktrace_api.h	2009-06-16 11:19:36 +02:00
debug-pagealloc.c	generic debug pagealloc	2009-04-01 08:59:13 -07:00
dmapool.c	dmapools: protect page_list walk in show_pools()	2009-06-30 18:56:00 -07:00
fadvise.c	readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM	2010-03-06 11:26:25 -08:00
failslab.c	failslab: add ability to filter slab caches	2010-02-26 19:19:39 +02:00
filemap_xip.c	mm: clean up mm_counter	2010-03-06 11:26:23 -08:00
filemap.c	mm: use rlimit helpers	2010-03-06 11:26:24 -08:00
fremap.c	mm: clean up mm_counter	2010-03-06 11:26:23 -08:00
highmem.c	grammar fix in comment	2010-02-05 12:22:40 +01:00
hugetlb.c	Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm	2010-03-01 09:15:15 -08:00
hwpoison-inject.c	HWPOISON: Don't do early filtering if filter is disabled	2009-12-16 12:20:01 +01:00
init-mm.c	mm: consolidate init_mm definition	2009-06-16 19:47:28 -07:00
internal.h	HWPOISON: add an interface to switch off/on all the page filters	2009-12-16 12:19:59 +01:00
Kconfig	Merge branch 'x86-bootmem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip	2010-03-03 08:15:05 -08:00
Kconfig.debug	trivial: improve help text for mm debug config options	2009-09-21 15:14:57 +02:00
kmemcheck.c	kmemcheck: add hooks for the page allocator	2009-06-15 15:48:33 +02:00
kmemleak-test.c	percpu: clean up percpu variable definitions	2009-06-24 15:13:48 +09:00
kmemleak.c	Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6	2009-12-17 16:00:19 -08:00
ksm.c	mm/ksm.c is doing an unneeded _notify in write_protect_page.	2010-03-24 16:31:20 -07:00
maccess.c	maccess,probe_kernel: Allow arch specific override probe_kernel_(read|write)	2010-01-07 11:58:36 -06:00
madvise.c	HWPOISON: Add a madvise() injector for soft page offlining	2009-12-16 12:20:00 +01:00
Makefile	make generic_acl slightly more generic	2009-12-16 12:16:49 -05:00
memcontrol.c	memcontrol: fix potential null deref	2010-03-24 16:31:19 -07:00
memory_hotplug.c	mm: introduce dump_page() and print symbolic flag names	2010-03-12 15:52:28 -08:00
memory-failure.c	mm: change anon_vma linking to fix multi-process server scalability issue	2010-03-06 11:26:26 -08:00
memory.c	exit: fix oops in sync_mm_rss	2010-03-24 16:31:21 -07:00
mempolicy.c	tmpfs: cleanup mpol_parse_str()	2010-03-24 16:31:21 -07:00
mempool.c	mm: remove broken 'kzalloc' mempool	2009-09-22 07:17:35 -07:00
migrate.c	mm/migrate.c: kill anon local variable from migrate_page_copy	2010-03-06 11:26:25 -08:00
mincore.c	mm: hugetlb: fix hugepage memory leak in mincore()	2009-12-15 08:53:24 -08:00
mlock.c	mm: use rlimit helpers	2010-03-06 11:26:24 -08:00
mm_init.c
mmap.c	Add generic sys_old_mmap()	2010-03-12 15:52:32 -08:00
mmu_context.c	exit: fix oops in sync_mm_rss	2010-03-24 16:31:21 -07:00
mmu_notifier.c	ksm: add mmu_notifier set_pte_at_notify()	2009-09-22 07:17:31 -07:00
mmzone.c	[ARM] Double check memmap is actually valid with a memmap has unexpected holes V2	2009-05-18 11:22:24 +01:00
mprotect.c	perf: Do the big rename: Performance Counters -> Performance Events	2009-09-21 14:28:04 +02:00
mremap.c	mm: change anon_vma linking to fix multi-process server scalability issue	2010-03-06 11:26:26 -08:00
msync.c	[CVE-2009-0029] System call wrappers part 13	2009-01-14 14:15:23 +01:00
nommu.c	nommu: fix an incorrect comment in the do_mmap_shared_file()	2010-03-24 16:31:20 -07:00
oom_kill.c	memcg: fix oom kill behavior	2010-03-12 15:52:38 -08:00
page_alloc.c	mm: introduce dump_page() and print symbolic flag names	2010-03-12 15:52:28 -08:00
page_cgroup.c	memcg: avoid use cmpxchg in swap cgroup maintainance	2010-03-17 18:43:47 -07:00
page_io.c	swap: rework map_swap_page() again	2009-12-15 08:53:16 -08:00
page_isolation.c	memory hotplug: fix page_zone() calculation in test_pages_isolated()	2008-11-06 15:41:19 -08:00
page-writeback.c	writeback: remove unused nonblocking and congestion checks	2009-12-03 13:54:25 +01:00
pagewalk.c	mm hugetlb: add hugepage support to pagemap	2009-12-15 08:53:24 -08:00
percpu.c	early_res: Add free_early_partial()	2010-02-26 08:25:35 +01:00
prio_tree.c
quicklist.c	cpumask: use new-style cpumask ops in mm/quicklist.	2009-09-24 09:34:52 +09:30
readahead.c	readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM	2010-03-06 11:26:25 -08:00
rmap.c	vmscan: detect mapped file pages used only once	2010-03-06 11:26:27 -08:00
shmem.c	Fix breakage in shmem.c	2009-12-16 19:48:48 -05:00
slab.c	Merge branches 'slab/cleanups', 'slab/failslab', 'slab/fixes' and 'slub/percpu' into slab-for-linus	2010-03-04 12:07:50 +02:00
slob.c	slab: remove duplicate kmem_cache_init_late() declarations	2009-08-06 11:36:25 +03:00
slub.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial	2010-03-12 16:04:50 -08:00
sparse-vmemmap.c	sparsemem: Put mem map for one node together.	2010-02-12 09:42:38 -08:00
sparse.c	sparsemem: Fix compilation on PowerPC	2010-03-01 17:59:24 -08:00
swap_state.c	mm: add_to_swap_cache() does not return -EEXIST	2009-09-22 07:17:35 -07:00
swap.c	mm: remove free_hot_page()	2010-03-06 11:26:25 -08:00
swapfile.c	memcg: move charges of anonymous swap	2010-03-12 15:52:36 -08:00
thrash.c	mm: pass mm to grab_swap_token	2009-06-23 12:50:05 -07:00
truncate.c	vfs: Fix vmtruncate() regression	2010-01-13 16:09:33 -08:00
util.c	nommu: don't need get_unmapped_area() for NOMMU	2010-01-16 12:15:40 -08:00
vmalloc.c	mm: purge fragmented percpu vmap blocks	2010-02-02 12:50:47 -08:00
vmscan.c	vmscan: detect mapped file pages used only once	2010-03-06 11:26:27 -08:00
vmstat.c	mm: restore zone->all_unreclaimable to independence word	2010-03-06 11:26:25 -08:00