2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-24 21:24:00 +08:00
linux-next/kernel/sched
Kirill Tkhai 1effd9f193 sched/numa: Fix unsafe get_task_struct() in task_numa_assign()
Unlocked access to dst_rq->curr in task_numa_compare() is racy.
If curr task is exiting this may be a reason of use-after-free:

task_numa_compare()                    do_exit()
    ...                                        current->flags |= PF_EXITING;
    ...                                    release_task()
    ...                                        ~~delayed_put_task_struct()~~
    ...                                    schedule()
    rcu_read_lock()                        ...
    cur = ACCESS_ONCE(dst_rq->curr)        ...
        ...                                rq->curr = next;
        ...                                    context_switch()
        ...                                        finish_task_switch()
        ...                                            put_task_struct()
        ...                                                __put_task_struct()
        ...                                                    free_task_struct()
        task_numa_assign()                                     ...
            get_task_struct()                                  ...

As noted by Oleg:

  <<The lockless get_task_struct(tsk) is only safe if tsk == current
    and didn't pass exit_notify(), or if this tsk was found on a rcu
    protected list (say, for_each_process() or find_task_by_vpid()).
    IOW, it is only safe if release_task() was not called before we
    take rcu_read_lock(), in this case we can rely on the fact that
    delayed_put_pid() can not drop the (potentially) last reference
    until rcu_read_unlock().

    And as Kirill pointed out task_numa_compare()->task_numa_assign()
    path does get_task_struct(dst_rq->curr) and this is not safe. The
    task_struct itself can't go away, but rcu_read_lock() can't save
    us from the final put_task_struct() in finish_task_switch(); this
    reference goes away without rcu gp>>

The patch provides simple check of PF_EXITING flag. If it's not set,
this guarantees that call_rcu() of delayed_put_task_struct() callback
hasn't happened yet, so we can safely do get_task_struct() in
task_numa_assign().

Locked dst_rq->lock protects from concurrency with the last schedule().
Reusing or unmapping of cur's memory may happen without it.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1413962231.19914.130.camel@tkhai
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28 10:46:02 +01:00
..
auto_group.c sched: Change autogroup_move_group() to use for_each_thread() 2014-08-20 09:47:18 +02:00
auto_group.h Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled" 2012-12-11 10:23:45 +01:00
clock.c time: Replace __get_cpu_var uses 2014-08-26 13:45:44 -04:00
completion.c sched: Move completion code from core.c to completion.c 2013-11-06 07:49:19 +01:00
core.c sched: Fix race between task_group and sched_task_group 2014-10-28 10:45:59 +01:00
cpuacct.c cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes 2014-07-15 11:05:09 -04:00
cpuacct.h sched/cpuacct: Initialize root cpuacct earlier 2013-04-10 13:54:20 +02:00
cpudeadline.c sched/deadline: Fix inter- exclusive cpusets migrations 2014-09-24 14:46:57 +02:00
cpudeadline.h sched/deadline: Replace NR_CPUS arrays 2014-05-22 10:21:28 +02:00
cpupri.c Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
cpupri.h sched/cpupri: Replace NR_CPUS arrays 2014-05-22 10:21:29 +02:00
cputime.c sched, time: Fix build error with 64 bit cputime_t on 32 bit systems 2014-10-03 05:46:55 +02:00
deadline.c sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer() 2014-10-28 10:46:01 +01:00
debug.c sched: print_rq(): Don't use tasklist_lock 2014-09-24 14:47:04 +02:00
fair.c sched/numa: Fix unsafe get_task_struct() in task_numa_assign() 2014-10-28 10:46:02 +01:00
features.h sched: Rename capacity related flags 2014-06-05 11:52:32 +02:00
idle_task.c sched: Transform resched_task() into resched_curr() 2014-07-16 13:38:19 +02:00
idle.c sched: Let the scheduler see CPU idle states 2014-09-24 14:46:58 +02:00
Makefile sched/idle: Move cpu/idle.c to sched/idle.c 2014-02-11 09:58:30 +01:00
proc.c cpuidle: menu: Lookup CPU runqueues less 2014-08-06 21:17:45 +02:00
rt.c Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2014-10-15 07:48:18 +02:00
sched.h Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2014-10-15 07:48:18 +02:00
stats.c kernel: audit/fix non-modular users of module_init in core code 2014-04-03 16:21:07 -07:00
stats.h sched: Micro-optimize by dropping unnecessary task_rq() calls 2013-09-25 13:51:06 +02:00
stop_task.c sched: Add wrapper for checking task_struct::on_rq 2014-08-20 14:52:59 +02:00
wait.c SCHED: add some "wait..on_bit...timeout()" interfaces. 2014-09-25 08:23:57 -04:00