Commit Graph

648503 Commits

Author SHA1 Message Date
Ding Tianhong
bb42ca4740 clocksource/drivers/arm_arch_timer: Work around Hisilicon erratum 161010101
Erratum Hisilicon-161010101 says that the ARM generic timer counter "has
the potential to contain an erroneous value when the timer value
changes". Accesses to TVAL (both read and write) are also affected due
to the implicit counter read. Accesses to CVAL are not affected.

The workaround is to reread the system count registers until the value
of the second read is larger than the first one by less than 32, the
system counter can be guaranteed not to return wrong value twice by
back-to-back read and the error value is always larger than the correct
one by 32. Writes to TVAL are replaced with an equivalent write to CVAL.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[Mark: split patch, fix Kconfig, reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-08 00:14:04 +01:00
Ding Tianhong
16d10ef29f clocksource/drivers/arm_arch_timer: Introduce generic errata handling infrastructure
Currently we have code inline in the arch timer probe path to cater for
Freescale erratum A-008585, complete with ifdeffery. This is a little
ugly, and will get worse as we try to add more errata handling.

This patch refactors the handling of Freescale erratum A-008585. Now the
erratum is described in a generic arch_timer_erratum_workaround
structure, and the probe path can iterate over these to detect errata
and enable workarounds.

This will simplify the addition and maintenance of code handling
Hisilicon erratum 161010101.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[Mark: split patch, correct Kconfig, reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-08 00:14:03 +01:00
Ding Tianhong
5444ea6a7f clocksource/drivers/arm_arch_timer: Remove fsl-a008585 parameter
Having a command line option to flip the errata handling for a
particular erratum is a little bit unusual, and it's vastly superior to
pass this in the DT. By common consensus, it's best to kill off the
command line parameter.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[Mark: split patch, reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-08 00:14:03 +01:00
Ding Tianhong
729e55225b clocksource/drivers/arm_arch_timer: Add dt binding for hisilicon-161010101 erratum
This erratum describes a bug in logic outside the core, so MIDR can't be
used to identify its presence, and reading an SoC-specific revision
register from common arch timer code would be awkward.  So, describe it
in the device tree.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-08 00:13:57 +01:00
Chris Brandt
fb6002a826 clocksource/drivers/ostm: Add renesas-ostm timer driver
This patch adds a OSTM driver for the Renesas architecture.
The OS Timer (OSTM) has independent channels that can be
used as a freerun or interval times.
This driver uses the first probed device as a clocksource
and then any additional devices as clock events.

Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07 20:58:30 +01:00
Chris Brandt
a1966cd29d clocksource/drivers/ostm: Document renesas-ostm timer DT bindings
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07 20:58:30 +01:00
David Engraf
7b9f1d16e6 clocksource/drivers/tcb_clksrc: Use 32 bit tcb as sched_clock
On newer boards the TC can be read as single 32 bit value without locking.
Thus the clock can be used as reference for sched_clock which is much more
accurate than the jiffies implementation.

Tested on a Atmel SAMA5D2 board.

Signed-off-by: David Engraf <david.engraf@sysgo.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07 20:58:30 +01:00
Linus Walleij
4750535bc9 clocksource/drivers/gemini: Add driver for the Cortina Gemini
This is a rewrite of the Gemini timer
driver in arch/arm/mach-gemini/timer.c trying to do everything
the device tree way:

- Make every IO-access relative to a base address and dynamic
  so we can do a dynamic ioremap and get going.
- Do not poke around directly in the global syscon registers,
  access them using the syscon regmap style design pattern for
  the one register we need to check.
- Find register range and interrupt from the device tree.

Cc: Janos Laube <janos.dev@gmail.com>
Cc: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07 20:58:30 +01:00
Linus Walleij
9744b181ee clocksource: add DT bindings for Cortina Gemini
This adds device tree bindings for the Cortina Systems Gemini
timer block used in these SoCs.

Cc: Janos Laube <janos.dev@gmail.com>
Cc: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07 20:58:30 +01:00
Daniel Lezcano
376bc27150 clockevents: Add a clkevt-of mechanism like clksrc-of
The current code uses the CLOCKSOURCE_OF_DECLARE macro to fill the clksrc
table with a t-uple (name, init_function).

Unfortunately it ends up to the clockevent and the clocksource being
both initialized with this macro. It is not a problem by itself but there
is not a clear distinction between a clockevent and a clocksource in the
code initialization path. Somebody can argue there are the same IP block
and the same DT node. But conceptually from the software side, there are
two distincts entities and as is they should be initialized separetely.
Some drivers which do not have a clocksource end up by using the
CLOCKSOURCE_OF_DECLARE macro to declare a clockevent.

Another result is the fuzzy organization in the clocksource directory,
where the clockevents are implemented in the same file than the
clocksources or file labelled timer-something implementing a clocksource.

This patch provides another macro to specifically declare a clockevent in
the same way than the clocksource and gives the opportunity to write two
separate drivers, one for the clocksource and another for the clockevents.

Hopefully, that can help to do some housework in the directory, perhaps
split the drivers in to entities, for example:
	- clksrc-rockchip.c
	- clkevt-rockchip.c

Also, it gives the possibility to declare clocksources separately in the
DT and then use a clocksource from IP block while while clockevents are
used from another IP block.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-02-07 20:58:30 +01:00
Waiman Long
668802c257 tick/broadcast: Reduce lock cacheline contention
It was observed that on an Intel x86 system without the ARAT (Always
running APIC timer) feature and with fairly large number of CPUs as
well as CPUs coming in and out of intel_idle frequently, the lock
contention on the tick_broadcast_lock can become significant.

To reduce contention, the lock is put into its own cacheline and all
the cpumask_var_t variables are put into the __read_mostly section.

Running the SP benchmark of the NAS Parallel Benchmarks on a 4-socket
16-core 32-thread Nehalam system, the performance number improved
from 3353.94 Mop/s to 3469.31 Mop/s when this patch was applied on
a 4.9.6 kernel.  This is a 3.4% improvement.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1485799063-20857-1-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04 08:54:46 +01:00
Thomas Gleixner
9556ad6ad0 Merge branch 'fortglx/4.11/time' of https://git.linaro.org/people/john.stultz/linux into timers/core
- Remove unused functions
 - Document udelay inaccuracy
 - Remove posix timer data from task struct when posix timers are off
2017-01-30 11:22:39 +01:00
Nicolas Pitre
b18b6a9cef timers: Omit POSIX timer stuff from task_struct when disabled
When CONFIG_POSIX_TIMERS is disabled, it is preferable to remove related
structures from struct task_struct and struct signal_struct as they
won't contain anything useful and shouldn't be relied upon by mistake.
Code still referencing those structures is also disabled here.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-01-27 13:05:26 -08:00
Jiri Slaby
4c45c5167c x86/timer: Make delay() work during early bootup
When a panic happens during bootup, "Rebooting in X seconds.." is
shown, but reboot happens immediatelly. It is because panic() uses mdelay()
and mdelay() calls __const_udelay() immediately, which does not
work while booting.

The per_cpu cpu_info.loops_per_jiffy value is not initialized yet, so
__const_udelay() actually multiplies the number of loops by zero. This
results in __const_udelay() to delay the execution only by a nanosecond
or so.

So check whether cpu_info.loops_per_jiffy is zero and use
loops_per_jiffy in that case. mdelay() will not be so precise without
proper calibration, but it works relatively well.

Before:

  [    0.170039] delaying 100ms
  [    0.170828] done

After

  [    0.214042] delaying 100ms
  [    0.313974] done

I do not think the added check matters given we are about to spin the
processor in the next few hundred cycles.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170119114730.2670-1-jslaby@suse.cz
[ Minor edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-22 10:03:12 +01:00
Russell King
9f8197980d delay: Add explanation of udelay() inaccuracy
There seems to be some misunderstanding that udelay() and friends will
always guarantee the specified delay.  This is a false understanding.
When udelay() is based on CPU cycles, it can return early for many
reasons which are detailed by Linus' reply to me in a thread in 2011:

  http://lists.openwall.net/linux-kernel/2011/01/12/372

However, a udelay test module was created in 2014 which allows udelay()
to only be 0.5% fast, which is outside of the CPU-cycles udelay()
results I measured back in 2011, which were deemed to be in the "we
don't care" region.

test_udelay() should be fixed to reflect the real allowable tolerance
on udelay(), rather than 0.5%.

Cc: David Riley <davidriley@chromium.org>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-01-20 14:32:39 -08:00
Geliang Tang
d852d39432 timerqueue: Use rb_entry_safe() instead of open-coding it
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/0d5cf199ac43792df0b6f7e2145545c30fa1dbbe.1482222135.git.geliangtang@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-20 08:03:42 +01:00
Peter Zijlstra
9e3d6223d2 math64, timers: Fix 32bit mul_u64_u32_shr() and friends
It turns out that while GCC-4.4 manages to generate 32x32->64 mult
instructions for the 32bit mul_u64_u32_shr() code, any GCC after that
fails horribly.

Fix this by providing an explicit mul_u32_u32() function which can be
architcture provided.

Reported-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
Cc: Christopher S. Hall <christopher.s.hall@intel.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Liav Rehana <liavr@mellanox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Parit Bhargava <prarit@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161209083011.GD15765@worktop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 11:31:50 +01:00
Linus Torvalds
e96f8f18c8 Merge branch 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "These are all over the place.

  The tracepoint part of the pull fixes a crash and adds a little more
  information to two tracepoints, while the rest are good old fashioned
  fixes"

* 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: make tracepoint format strings more compact
  Btrfs: add truncated_len for ordered extent tracepoints
  Btrfs: add 'inode' for extent map tracepoint
  btrfs: fix crash when tracepoint arguments are freed by wq callbacks
  Btrfs: adjust outstanding_extents counter properly when dio write is split
  Btrfs: fix lockdep warning about log_mutex
  Btrfs: use down_read_nested to make lockdep silent
  btrfs: fix locking when we put back a delayed ref that's too new
  btrfs: fix error handling when run_delayed_extent_op fails
  btrfs: return the actual error value from  from btrfs_uuid_tree_iterate
2017-01-13 17:40:22 -08:00
Linus Torvalds
04e396277b Two small fixups for the filesystem changes that went into this merge
window.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJYeQymAAoJEEp/3jgCEfOLLVsH/28qRsjVPWr5JuL1SF86//kd
 rAi7QUfbNgXHqbb10a9za9pNuLhHr3kImIfvQ04wYiYQY+IaAapiRXwQev8BsNAa
 yENUc8XwNgydw4FU1ia5PkGOJLDtujtfgjWT2v+gf1HUzLaV6alBzqDwUZBt3xJz
 mlYC82oFkXPa0BFmLUXtT/jJu/ZI8caO4KB34/UKi7LjBQk1ca7E2xVUoDtdQmEm
 ciPE98akU4JiB99aOgGdwemBzkAMHEGQpImTzqHr/tbIUj0MqVAjH9FVOhRCbjMy
 6MSR+U9yUzJkBzefS5enijAoExVc8cD/A0nIaKGVb6qWrIrk51/Opl6iILeVLUo=
 =28cq
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-4.10-rc4' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Two small fixups for the filesystem changes that went into this merge
  window"

* tag 'ceph-for-4.10-rc4' of git://github.com/ceph/ceph-client:
  ceph: fix get_oldest_context()
  ceph: fix mds cluster availability check
2017-01-13 17:38:05 -08:00
Linus Torvalds
af54efa4f5 VFIO fixes for v4.10-rc4
- Cleanups and bug fixes for the mtty sample driver (Dan Carpenter)
  - Export and make use of has_capability() to fix incorrect use of
    ns_capable() for testing task capabilities (Jike Song)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJYeTWfAAoJECObm247sIsi1PEP/0lIkIQWBUlVWC1QA6bJN0Xx
 9c4pA34kLJwCpEtEoPxf6owjgK7kSBgIUUqBaNNDdKZQYttGgA+qiX3HhuvEigKL
 vEq5/TqwL6vv2aIUp/5uPP4NNTJD8RynwkfDI1B8DVQN6E1GM2zozpFUiZbDUxz/
 sgIuby9nuG3WTVLgOVayyMHlPTXG1+l+quRlAhMAseD7LMx7q/71NIjKggSUFRQG
 fkOVVTqfCnLJmIyq/cWbJt2cDgeWQq2/Ik6gje3SiOFtxi8fRdlzONUL+tHM1KgT
 r0htrq+r3B7BxI0CMZuoHIBt1SK443yu39xDzb0iXDSb5W9gwR14uFMuXv1ftfM0
 qkZnvpsXaT6wpKvK2ztmHgUiKJmOTgYrG77Dhz4oz6Mm0Y1mn6bV4yueoF/rQIn0
 GrM1Af/SVLf3Vhxw6i5a1s7kDgpySw8FfucKO5Xv3cOaIgNtlrrjxbKKa9DZ3wd7
 mnjD30XHwxEim8OCgv7CFswPsc5TiqYJTKGbnSJGo67ZCXWxXFHLIab0cn5yMd8G
 Qgw4mLnIv2rkRZOWpgMy4PedCNjZXNuQbW3I90kDb/VlPvRdCqUIsO0Ty10yaNhe
 s8Gwmxphoi3U/J7Y4T/BsfkCZ4Umut9gAt/WsG4kgWj3v0FOmxLgl39lC0cRigR6
 l7HSf0fOg/D9k6EN1xnc
 =yeS9
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v4.10-rc4' of git://github.com/awilliam/linux-vfio

Pull VFIO fixes from Alex Williamson:

 - Cleanups and bug fixes for the mtty sample driver (Dan Carpenter)

 - Export and make use of has_capability() to fix incorrect use of
   ns_capable() for testing task capabilities (Jike Song)

* tag 'vfio-v4.10-rc4' of git://github.com/awilliam/linux-vfio:
  vfio/type1: Remove pid_namespace.h include
  vfio iommu type1: fix the testing of capability for remote task
  capability: export has_capability
  vfio-mdev: remove some dead code
  vfio-mdev: buffer overflow in ioctl()
  vfio-mdev: return -EFAULT if copy_to_user() fails
2017-01-13 17:35:43 -08:00
Linus Torvalds
406732c932 * fix for module unload vs. deferred jump labels (note: there might be
other buggy modules!)
 * two NULL pointer dereferences from syzkaller
 * CVE from syzkaller, very serious on 4.10-rc, "just" kernel memory
   leak on releases
 * CVE from security@kernel.org, somewhat serious on AMD, less so on
   Intel
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJYd7l5AAoJEL/70l94x66DLWYH/0GUg+lK9J/gj0kwqi6BwsOP
 Rrs5Y7XvyNLsy/piBrrHDHvRa+DfAkrU8nepwgygX/yuGmSDV/zmdIb8XA/dvKht
 MN285NFlVjTyznYlU/LH3etx11CHLMNclishiFHQbcnohtvhOe+fvN6RVNdfeRxm
 d9iBPOum15ikc1xDl2z8Op+ZXVjMxkgLkzIXFcDBpJf4BvUx0X+ZHZXIKdizVhgU
 ZMD2ds/MutMB8X1A52qp6kQvT7xE4rp87M0So4qDMTbAto5G4ZmMaWC5MlK2Oxe/
 o+3qnx4vVz4H6uYzg1N4diHiC+buhgtXCLwwkcUOKKUVqJRP9e0Bh7kw8JA52XU=
 =C+tM
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:

 - fix for module unload vs deferred jump labels (note: there might be
   other buggy modules!)

 - two NULL pointer dereferences from syzkaller

 - also syzkaller: fix emulation of fxsave/fxrstor/sgdt/sidt, problem
   made worse during this merge window, "just" kernel memory leak on
   releases

 - fix emulation of "mov ss" - somewhat serious on AMD, less so on Intel

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: fix emulation of "MOV SS, null selector"
  KVM: x86: fix NULL deref in vcpu_scan_ioapic
  KVM: eventfd: fix NULL deref irqbypass consumer
  KVM: x86: Introduce segmented_write_std
  KVM: x86: flush pending lapic jump label updates on module unload
  jump_labels: API for flushing deferred jump label updates
2017-01-13 17:06:24 -08:00
Linus Torvalds
a65c92597d - Fix huge_ptep_set_access_flags() to return "changed" when any of the
ptes in the contiguous range is changed, not just the last one
 
 - Fix the adr_l assembly macro to work in modules under KASLR
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYeRmYAAoJEGvWsS0AyF7x6tYQAJd0Rtb88kwalDYb/kMcutBU
 3xyjvb8mIEKtnMOP1wS4o3YdqD6ke9OMCUm2EAwhAxgkfzwklsDOOOUlWsDijif2
 X3TzYWoKVgoje3oFODXOHMZNLqU6lBmuVN6G4ZdVPsTfvntTLE4cn9q828OgLdtB
 L1H+cRkHMhO9w4a0VxZFsNWtSDs4UugGLUp/cNLA4gXFj4atw8+bgX9o7BsmCb1d
 x+rd3LDWJb+a1YFKhKJkLQO+uQKk3n7d1WQ0DrQeDBgPs4uzMx422WpfmoW+j/dq
 MV/6C8ZYtQczS4BKp8k9apFHq3SC0bZcPLhtXqf/NZZCCLvDKS0iPflDAArYmIHo
 mOnmYhw+SeGc0llp9+tDaReco71HAqzdlpYnhGEePDEc0ZXBBr4/xqAwQoY4tgWa
 uZLSGZuiGqCFovzLb+LMLEtQlFyu48w+Y4Ct6r0M9gmRmU6d8msoEvXkA2IB/q8z
 JGFdFkJ1ZD8MtabRqUzYhuqn7WD+aC5eA3uqImnPjcrqNaYaiSy8Wif6vO+7asz5
 1YWyEaLuL9rITllunTQuK0crgZGjplwhGKYASz/w82AZebBeTl84adK/x7jrJgbn
 BPxQRHg4LqoX7i6tU3KWc/ulbE8EzOeJabCcKN8HnkPvt2akgKh/nlH3NQVLpG0l
 c/ffN90w3+fK7pNQKYnu
 =aUnr
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Fix huge_ptep_set_access_flags() to return "changed" when any of the
   ptes in the contiguous range is changed, not just the last one

 - Fix the adr_l assembly macro to work in modules under KASLR

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: assembler: make adr_l work in modules under KASLR
  arm64: hugetlb: fix the wrong return value for huge_ptep_set_access_flags
2017-01-13 17:00:42 -08:00
Linus Torvalds
c79d47f14f SCSI fixes on 20170113
The major fix is the bfa firmware, since the latest 10Gb cards fail
 probing with the current firmware.  The rest is a set of minor fixes:
 one missed Kconfig dependency causing randconfig failures, a missed
 error return on an error leg, a change for how multiqueue waits on a
 blocked device and a don't reset while in reset fix.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJYeOuBAAoJEAVr7HOZEZN4TxEP/j3nlFKrs7XEhlbbZrezyHkA
 RvEWAkpoSXpQiVvuqHcObOX5Jh9I8ndBMRh2MyS2zLV4PA4SzwrprgemaB5rL9+j
 +5IIw7MdHY5SHP0IrvpTae+c1kOnHRfSdsLORqohB9uhqJiSbwhXgV0Q19hRzttA
 qmjD5RBI4sY3a+/CjGQevaM2Y/WQqCp4+J5kvclr3AmEoTjMkAiOsjbSuclFHWX9
 CxNKSMv/6z34ZgqS0FgqCrCAI5kO5UHrjznslcEtgVECIOTrNer3g5e75DhQOcD6
 Mhklxbx1dfG9r3h3oush8Zy9IIJoXZFZLRNmhhX7zcKkuYUaR9LU2COUVCTOEE+s
 SAtp4qdXxCZ8sFyHS+E9ajTYSeTw05fmGEbAu1dqJ8FUxkRnvsWouCiw/TI+ps94
 mJSiS2UGHAYxpLdWFbnKC+2CaCG32ygXINUcZbIItmT2cdA+ERAPAO6tDegjgb0S
 MoeRxAlUGfUehHReCppRV3igwjtw2vtQaY5hFq1+v8n00Ezt6kDNeeTanqobMrHr
 71rOZAos9aT5wa39pmbW18M3pfUB1qeuGn6di4ArYk8f9ILAMI4qykvnDHH3INcn
 BdCIY3SgRo/Zq+87vFWZr+56CcbVqq0GV38FKoRZcHglyOjDw8ibQY6elcsjBtJC
 S1Z5wlkv1T9tfA+8mRX6
 =kjPw
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "The major fix is the bfa firmware, since the latest 10Gb cards fail
  probing with the current firmware.

  The rest is a set of minor fixes: one missed Kconfig dependency
  causing randconfig failures, a missed error return on an error leg, a
  change for how multiqueue waits on a blocked device and a don't reset
  while in reset fix"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: bfa: Increase requested firmware version to 3.2.5.1
  scsi: snic: Return error code on memory allocation failure
  scsi: fnic: Avoid sending reset to firmware when another reset is in progress
  scsi: qedi: fix build, depends on UIO
  scsi: scsi-mq: Wait for .queue_rq() if necessary
2017-01-13 12:38:36 -08:00
Linus Torvalds
6d90b4f99d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "Small driver fixups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elants_i2c - avoid divide by 0 errors on bad touchscreen data
  Input: adxl34x - make it enumerable in ACPI environment
  Input: ALPS - fix TrackStick Y axis handling for SS5 hardware
  Input: synaptics-rmi4 - fix F03 build error when serio is module
  Input: xpad - use correct product id for x360w controllers
  Input: synaptics_i2c - change msleep to usleep_range for small msecs
  Input: i8042 - add Pegatron touchpad to noloop table
  Input: joydev - remove unused linux/miscdevice.h include
2017-01-13 11:49:34 -08:00
Alex Williamson
94a6fa899d vfio/type1: Remove pid_namespace.h include
Using has_capability() rather than ns_capable(), we're no longer using
this header.

Cc: Jike Song <jike.song@intel.com>
Cc: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-13 08:23:33 -07:00
Jike Song
d1b333d12c vfio iommu type1: fix the testing of capability for remote task
Before the mdev enhancement type1 iommu used capable() to test the
capability of current task; in the course of mdev development a
new requirement, testing for another task other than current, was
raised.  ns_capable() was used for this purpose, however it still
tests current, the only difference is, in a specified namespace.

Fix it by using has_capability() instead, which tests the cap for
specified task in init_user_ns, the same namespace as capable().

Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-12 16:05:35 -07:00
Linus Torvalds
557ed56cc7 sound fixes for 4.10-rc4
This time we got a few more fixes than the previous rc's, and the most
 of commits were about ASoC.  The only significant change in the core
 side is the regression fix wrt the aux device list handling, and all
 the rest are driver-specific small / trivial fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJYd+juAAoJEGwxgFQ9KSmkdpkQALA1JjQSHYyPybvwJNAjTUnB
 +VaG+tTK0Q6TRS9KFUte41a8/FVazXwK9xBsEsOjaYFfd/nJj7WRoZsmbwMALY+S
 6S7FBPBxuvrhzFSWCHBFL9E0aaxeHo81tHiZQoaSj4TAQJvIH9amDCNLgMM+kdIf
 nRp5CgXZT4Xco462Ge+dpnz6KL7mRtv3f9EpjAHT/ptEWRIDQb7KOj3cmq7F5Jpk
 SFigCWfHdthH6ldwVKnl3ZRGPYqmdVwq+Wq+jOKByDNK4yPKb0s3JFtldvDkszWI
 IxblPwVbjnOPGvWY4dI1FT20Xuqjhgbm1HoZQMQsNGHwqZsUxdNtQTO7V3nlE57I
 Kf+l6FC0f4aveIPo9Qt6J6T6kkPIYNhIhEaVhFXX3LFU91/f8fcDpyzunPP1pbA0
 TyCbDsc6boPfrCh0qYeOZdfsJpfomL1hTyGpPkQiGGqKzw+uO3Jv+lsTeq/Qd0Ud
 RvOhcW+UrP/gBIb7Q1zQpK7vbJna00WpmqBjYClENGuLWUCdPOOpkcLHVXOKsy/0
 bxKsGt9rWzz6LtLX4bWgwHdu0tn9iXNggbcshp5LjyoChVCXaTh36uQHRyIXk8Bw
 VhbMFW1ZIMac9pHj47EkpSTRsGPHZYwCnK5utWqf8RRgV86DpUrR9rrX2wVUxGYF
 pGOR8RzEoM5i7PRWJyJE
 =gGt3
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "This time we got a few more fixes than the previous rc's, and most of
  commits were about ASoC.

  The only significant change in the core side is the regression fix wrt
  the aux device list handling, and all the rest are driver-specific
  small / trivial fixes"

* tag 'sound-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Add a quirk for Plantronics BT600
  ASoC: rt5645: set sel_i2s_pre_div1 to 2
  ASoC: dpcm: Avoid putting stream state to STOP when FE stream is paused
  ASoC: Intel: Skylake: Release FW ctx in cleanup
  ASoC: Intel: bytcr-rt5640: fix settings in internal clock mode
  ASoC: fsl_ssi: set fifo watermark to more reliable value
  ASoC: nau8825: fix invalid configuration in Pre-Scalar of FLL
  ASoC: nau8825: correct the function name of register
  ASoC: Intel: Skylake: Fix to fail safely if module not available in path
  ASoC: tlv320aic3x: Mark the RESET register as volatile
  ASoC: Fix binding and probing of auxiliary components
  ASoC: wm_adsp: Don't overrun firmware file buffer when reading region data
  ASoC: Intel: bytcr_rt5640: fallback mechanism if MCLK is not enabled
  ASoC: hdmi-codec: use unsigned type to structure members with bit-field
  ASoC: topology: kfree kcontrol->private_value before freeing kcontrol
  ASoC: rsnd: don't double free kctrl
  ASoC: dwc: Fix PIO mode initialization
2017-01-12 14:45:59 -08:00
Linus Torvalds
e28ac1fc31 Contained in this update:
- Fix free space request handling when low on disk space
 - Remove redundant log failure error messages
 - Free truncate dirty pages instead of letting them build up forever
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJYdnh5AAoJEPh/dxk0SrTrI5sP/2FYAxrj3Iw684e92PWeIe1c
 pZir+0LEmlXgly4wprli7/BO6WZjESuMF3huBZZxtYLx3UUvGRavHup620z3Xa2e
 YRdSiBnvchOe3F0UfFO9wT15vEOzwWS61FfX/g35t4mD6ItY0XISesTx4+XA3Cqi
 8zf7OAjXI5WQietthwc9zmfhpyWgyw8CkeXVtqd5whNJN/6E80wh5IE6D6cJ1xkX
 2qNicrrruTUACPyJxrTrR/0kkoxDoYgBapy3kqV0R1uK+ttNrGlGjfdapKs0tqMb
 ezksj0sy/rx+HjfGEHx52mTiYRRTHEsGt/yMa1pT8o0p2wvR0nxnvzkvV8DeMoX3
 0wYRxATsv/t8Oog+ug6khB/FtppSGJML+XxG3bV9itgCkAIFAbb7vZRrThnzVqOr
 ChbQvBhchY4lYA1pef862QHLRraJF84HuY/ypyE6DX+nQWkGjDfEeHFPcm6eVSXG
 xEplK8wjs09pSklH/OQD+GsO4eedBHc4hdzDuBTjCmt867/Lk8Uw4RIjGMYjbBx2
 ovU3pTHfdlO6/qJmfSWnBAAZjKAjna/p47ZfDLzio23hb1hK68Qe/z8+tPnAzkHW
 rowA6KfBi13tX6Njds/lceiYMrnMRPqNBA+Og1wCyljP2Q4shbADa6czrFDzPn6P
 xZbfhKK2CTbYSvxi5S3G
 =8bsr
 -----END PGP SIGNATURE-----

Merge tag 'xfs-for-linus-4.10-rc4-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "As promised last week, here's some stability fixes from Christoph and
  Jan Kara:

   - fix free space request handling when low on disk space

   - remove redundant log failure error messages

   - free truncated dirty pages instead of letting them build up
     forever"

* tag 'xfs-for-linus-4.10-rc4-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Timely free truncated dirty pages
  xfs: don't print warnings when xfs_log_force fails
  xfs: don't rely on ->total in xfs_alloc_space_available
  xfs: adjust allocation length in xfs_alloc_space_available
  xfs: fix bogus minleft manipulations
  xfs: bump up reserved blocks in xfs_alloc_set_aside
2017-01-12 11:06:26 -08:00
Linus Torvalds
9ca277eba0 remoteproc fixes for v4.10
This fixes two regressions that has been reported to be introduced in
 v4.10-rc1.
 
 * The first fix corrects an incorrect usage of the kref api.
 
 * The second reverts the change to make the resource table read-only. As the
   space each vdev resource is used as virtio device config space it must be
   shared with the remote.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYdqtvAAoJEAsfOT8Nma3F58IP/A88sWYN0Fp8Zv7tU1H1sk4B
 YFRCbf+g+9fFUEA55infRbCrQxzPV5ZINeIV8hmsFgradSnMLntTrl0KUpgjYr7q
 eH6Nt7AHkloPME+fGV2m3aVrQGQDYOMaJ5kgRoFMn4YFU5KekcCwnwuwgxiSJiq5
 PBVhUZfFUoBEWfAcCh9palkUZ7V5FMtSu5ZujPu6rpEwpchZqla4hTCVCDJsZsQo
 HpF+Df30MSML8PiFtvX8aEKz4KzKKAFENAs0FQL22YcyhMF2A0LRy/LfVoL2ANDc
 c0CVAEQueVr5diACVYtUzYQKJPib/Lw97OqO2rlh3F6AxsLhOcM46agx28OtJEKv
 NRJRKKX5xWpXofsO+MyZ0odpesGWOOecouTbnmL6BgHeJKW2BVbkntbv/VDFD0f4
 BIxCioLNCY/zGZ/FCConmceqoXoMJwgL5ZHp2vcL4icwZj4M//B1zQGBmWmauCZG
 x/SA7A6r8DZ/Y6GyUP4rpzA0m3GUhEvFU4N4nEaQjBe9tVsougD2NIlgoDcARoO8
 e7CiKNuAqViSeMJzlNmDBARDoQFCDoGLZhhS4qQsyMCmd6ySmu2x5MVNEuMRUpp3
 wafRU5CMHJw/EUyFlybNQmkReE7RDhWuTZ42UBK+Zx1orhu+fH6ulPgPTaxKgp+d
 x4XcbiZU4iqAobZ/wWsP
 =GUNc
 -----END PGP SIGNATURE-----

Merge tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteproc

Pull remoteproc fixes from Bjorn Andersson:
 "This fixes two regressions that have been reported to be introduced in
  v4.10-rc1.

   - correct an incorrect usage of the kref api

   - revert the change to make the resource table read-only. As the
     space each vdev resource is used as virtio device config space it
     must be shared with the remote"

* tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteproc:
  Revert "remoteproc: Merge table_ptr and cached_table pointers"
  remoteproc: fix vdev reference management
2017-01-12 11:00:22 -08:00
Linus Torvalds
1d865da79e rpmsg fixes for v4.10
This fixes a regression introduced in v4.10-rc1 that prohibits multiple
 channels with the same name but different endpoint addresses to be used.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYdqptAAoJEAsfOT8Nma3F3xcP/R2zVEsl7tbFt+DJdXTi6ZvK
 U+pTRuyGpao1oQVbcUf/aQ+IYqO5Ndr06qTzwnBI/44DYlHBihqskUepmvUYapNl
 Pl9n3S0BdDSuN8OpR8QJMLaWIzSbz9CM5bM8St7nORUy21rjPL89EkuUjaq/lU1P
 hr0HNfe/naOgcp93u/w7j+tS0afakjrNFxgXaU8UCasOSoXw1Uw5/aivIp1Oc6mN
 dOpEFykRHA1fTL5giMEWAqoh6tWHDfVaeyNSGcL8/KxoT2W9dSJ8ed37uLm2pnNK
 9e6lTTO8yRpOIT99XCytsZrezhASamg0PawY3OSQ8YE8gLqIpTRAz/YPyQaw83e3
 GwRb/zHINf4GsuyopAj8pIstJj57pB5M5rnMI/sUZsV0BRjFhOUPkGsuS7enCOfp
 CyNlIWXOOBirBefpeZLnYoYN+IIDKqhZ4/en1IMp04A6BNr4RrjlPguE4krgaK21
 yaAGrDMhCr5V+RQeOzwMtuTkNog8cML7CRwXhsFu8NmrNIsEj+eTL0aEzxFERIo4
 hkaLPIUyy8Njy2SCsp1hETDajrVagPWUw49zMcRAkaMQ5ev+wJnrNe+/7NNBzIXI
 PgtZQ0Eq0sE8KCEc37o5mt6CQv/vn8KRq+I6KoLmdxihD5QyxOcoQMpRjlH0vIrx
 5HiloewfU6PfoS1HqfB/
 =xcek
 -----END PGP SIGNATURE-----

Merge tag 'rpmsg-v4.10-fixes' of git://github.com/andersson/remoteproc

Pull rpmsg fixes from Bjorn Andersson:
 "This fixes a regression introduced in v4.10-rc1 that prohibits
  multiple channels with the same name but different endpoint addresses
  to be used"

* tag 'rpmsg-v4.10-fixes' of git://github.com/andersson/remoteproc:
  rpmsg: virtio_rpmsg_bus: fix channel creation
2017-01-12 10:58:16 -08:00
Linus Torvalds
95ce13138e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:

 - device descriptor length validation fix to hid-cypress driver from
   Greg

 - introduction of a short delay into i2c-hid, which is not really
   mandated by the spec, but fixes Asus Touchpads

 - Petzl USB connectable flashlight quirk from myself

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: i2c-hid: Add sleep between POWER ON and RESET
  HID: hid-cypress: validate length of report
  HID: ignore Petzl USB headlamp
2017-01-12 10:55:28 -08:00
Linus Torvalds
cb38b45346 Merge branch 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux
Pull scsi target fixes from Bart Van Assche:

 - a series of bug fixes for the XCOPY implementation from David
   Disseldorp

 - one bug fix for the ibmvscsis driver, a driver that is used for
   communication between partitions on IBM POWER systems.

* 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux:
  ibmvscsis: Fix srp_transfer_data fail return code
  target: support XCOPY requests without parameters
  target: check for XCOPY parameter truncation
  target: use XCOPY segment descriptor CSCD IDs
  target: check XCOPY segment descriptor CSCD IDs
  target: simplify XCOPY wwn->se_dev lookup helper
  target: return UNSUPPORTED TARGET/SEGMENT DESC TYPE CODE sense
  target: bounds check XCOPY total descriptor list length
  target: bounds check XCOPY segment descriptor list
  target: use XCOPY TOO MANY TARGET DESCRIPTORS sense
  target: add XCOPY target/segment desc sense codes
2017-01-12 10:41:20 -08:00
Geng, Jichao
84fcc2d2bd ceph: fix get_oldest_context()
For no snapshot case, we should use ci->truncate_{seq,size}.

Fixes: 5f743e4566 ("ceph: record truncate size/seq for snap data writeback")
Signed-off-by: Geng, Jichao <geng.jichao@h3c.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-12 19:31:01 +01:00
Yan, Zheng
cc8e834293 ceph: fix mds cluster availability check
We should apply the check after getting the initial mdsmap.

Fixes: e9e427f0a1 ("ceph: check availability of mds cluster on mount")
Link: http://tracker.ceph.com/issues/18161
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-12 19:31:01 +01:00
Linus Torvalds
607ae5f269 Merge tag 'md/4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull md fixes from Shaohua Li:
 "Basically one fix for raid5 cache which is merged in this cycle,
  others are trival fixes"

* tag 'md/4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid5: Use correct IS_ERR() variation on pointer check
  md: cleanup mddev flag clear for takeover
  md/r5cache: fix spelling mistake on "recoverying"
  md/r5cache: assign conf->log before r5l_load_log()
  md/r5cache: simplify handling of sh->log_start in recovery
  md/raid5-cache: removes unnecessary write-through mode judgments
  md/raid10: Refactor raid10_make_request
  md/raid1: Refactor raid1_make_request
2017-01-12 10:17:59 -08:00
Ard Biesheuvel
41c066f2c4 arm64: assembler: make adr_l work in modules under KASLR
When CONFIG_RANDOMIZE_MODULE_REGION_FULL=y, the offset between loaded
modules and the core kernel may exceed 4 GB, putting symbols exported
by the core kernel out of the reach of the ordinary adrp/add instruction
pairs used to generate relative symbol references. So make the adr_l
macro emit a movz/movk sequence instead when executing in module context.

While at it, remove the pointless special case for the stack pointer.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-12 18:10:52 +00:00
Paolo Bonzini
33ab91103b KVM: x86: fix emulation of "MOV SS, null selector"
This is CVE-2017-2583.  On Intel this causes a failed vmentry because
SS's type is neither 3 nor 7 (even though the manual says this check is
only done for usable SS, and the dmesg splat says that SS is unusable!).
On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb.

The fix fabricates a data segment descriptor when SS is set to a null
selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb.
Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3;
this in turn ensures CPL < 3 because RPL must be equal to CPL.

Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing
the bug and deciphering the manuals.

Reported-by: Xiaohan Zhang <zhangxiaohan1@huawei.com>
Fixes: 79d5b4c3cd
Cc: stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 15:17:13 +01:00
Jike Song
19c816e8e4 capability: export has_capability
has_capability() is sometimes needed by modules to test capability
for specified task other than current, so export it.

Cc: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-12 07:01:56 -07:00
Wanpeng Li
546d87e5c9 KVM: x86: fix NULL deref in vcpu_scan_ioapic
Reported by syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
    IP: _raw_spin_lock+0xc/0x30
    PGD 3e28eb067
    PUD 3f0ac6067
    PMD 0
    Oops: 0002 [#1] SMP
    CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
    Call Trace:
     ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
     kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
     ? pick_next_task_fair+0xe1/0x4e0
     ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
     kvm_vcpu_ioctl+0x33a/0x600 [kvm]
     ? hrtimer_try_to_cancel+0x29/0x130
     ? do_nanosleep+0x97/0xf0
     do_vfs_ioctl+0xa1/0x5d0
     ? __hrtimer_init+0x90/0x90
     ? do_nanosleep+0x5b/0xf0
     SyS_ioctl+0x79/0x90
     do_syscall_64+0x6e/0x180
     entry_SYSCALL64_slow_path+0x25/0x25
    RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0

The syzkaller folks reported a NULL pointer dereference due to
ENABLE_CAP succeeding even without an irqchip.  The Hyper-V
synthetic interrupt controller is activated, resulting in a
wrong request to rescan the ioapic and a NULL pointer dereference.

    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <linux/kvm.h>
    #include <pthread.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #ifndef KVM_CAP_HYPERV_SYNIC
    #define KVM_CAP_HYPERV_SYNIC 123
    #endif

    void* thr(void* arg)
    {
	struct kvm_enable_cap cap;
	cap.flags = 0;
	cap.cap = KVM_CAP_HYPERV_SYNIC;
	ioctl((long)arg, KVM_ENABLE_CAP, &cap);
	return 0;
    }

    int main()
    {
	void *host_mem = mmap(0, 0x1000, PROT_READ|PROT_WRITE,
			MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	int kvmfd = open("/dev/kvm", 0);
	int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
	struct kvm_userspace_memory_region memreg;
	memreg.slot = 0;
	memreg.flags = 0;
	memreg.guest_phys_addr = 0;
	memreg.memory_size = 0x1000;
	memreg.userspace_addr = (unsigned long)host_mem;
	host_mem[0] = 0xf4;
	ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg);
	int cpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);
	struct kvm_sregs sregs;
	ioctl(cpufd, KVM_GET_SREGS, &sregs);
	sregs.cr0 = 0;
	sregs.cr4 = 0;
	sregs.efer = 0;
	sregs.cs.selector = 0;
	sregs.cs.base = 0;
	ioctl(cpufd, KVM_SET_SREGS, &sregs);
	struct kvm_regs regs = { .rflags = 2 };
	ioctl(cpufd, KVM_SET_REGS, &regs);
	ioctl(vmfd, KVM_CREATE_IRQCHIP, 0);
	pthread_t th;
	pthread_create(&th, 0, thr, (void*)(long)cpufd);
	usleep(rand() % 10000);
	ioctl(cpufd, KVM_RUN, 0);
	pthread_join(th, 0);
	return 0;
    }

This patch fixes it by failing ENABLE_CAP if without an irqchip.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 5c919412fe (kvm/x86: Hyper-V synthetic interrupt controller)
Cc: stable@vger.kernel.org # 4.5+
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:52:52 +01:00
Wanpeng Li
4f3dbdf47e KVM: eventfd: fix NULL deref irqbypass consumer
Reported syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    IP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
    PGD 0

    Oops: 0002 [#1] SMP
    CPU: 1 PID: 125 Comm: kworker/1:1 Not tainted 4.9.0+ #1
    Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm]
    task: ffff9bbe0dfbb900 task.stack: ffffb61802014000
    RIP: 0010:irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
    Call Trace:
     irqfd_shutdown+0x66/0xa0 [kvm]
     process_one_work+0x16b/0x480
     worker_thread+0x4b/0x500
     kthread+0x101/0x140
     ? process_one_work+0x480/0x480
     ? kthread_create_on_node+0x60/0x60
     ret_from_fork+0x25/0x30
    RIP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] RSP: ffffb61802017e20
    CR2: 0000000000000008

The syzkaller folks reported a NULL pointer dereference that due to
unregister an consumer which fails registration before. The syzkaller
creates two VMs w/ an equal eventfd occasionally. So the second VM
fails to register an irqbypass consumer. It will make irqfd as inactive
and queue an workqueue work to shutdown irqfd and unregister the irqbypass
consumer when eventfd is closed. However, the second consumer has been
initialized though it fails registration. So the token(same as the first
VM's) is taken to unregister the consumer through the workqueue, the
consumer of the first VM is found and unregistered, then NULL deref incurred
in the path of deleting consumer from the consumers list.

This patch fixes it by making irq_bypass_register/unregister_consumer()
looks for the consumer entry based on consumer pointer itself instead of
token matching.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:42:34 +01:00
Steve Rutherford
129a72a0d3 KVM: x86: Introduce segmented_write_std
Introduces segemented_write_std.

Switches from emulated reads/writes to standard read/writes in fxsave,
fxrstor, sgdt, and sidt.  This fixes CVE-2017-2584, a longstanding
kernel memory leak.

Since commit 283c95d0e3 ("KVM: x86: emulate FXSAVE and FXRSTOR",
2016-11-09), which is luckily not yet in any final release, this would
also be an exploitable kernel memory *write*!

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: 96051572c8
Fixes: 283c95d0e3
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:34:58 +01:00
David Matlack
cef84c302f KVM: x86: flush pending lapic jump label updates on module unload
KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled).
These are implemented with delayed_work structs which can still be
pending when the KVM module is unloaded. We've seen this cause kernel
panics when the kvm_intel module is quickly reloaded.

Use the new static_key_deferred_flush() API to flush pending updates on
module unload.

Signed-off-by: David Matlack <dmatlack@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:33:17 +01:00
David Matlack
b6416e6101 jump_labels: API for flushing deferred jump label updates
Modules that use static_key_deferred need a way to synchronize with
any delayed work that is still pending when the module is unloaded.
Introduce static_key_deferred_flush() which flushes any pending
jump label updates.

Signed-off-by: David Matlack <dmatlack@google.com>
Cc: stable@vger.kernel.org
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:33:16 +01:00
Brendan McGrath
a89af4abdf HID: i2c-hid: Add sleep between POWER ON and RESET
Support for the Asus Touchpad was recently added. It turns out this
device can fail initialisation (and become unusable) when the RESET
command is sent too soon after the POWER ON command.

Unfortunately the i2c-hid specification does not specify the need for
a delay between these two commands. But it was discovered the Windows
driver has a 1ms delay.

As a result, this patch modifies the i2c-hid module to add a sleep
inbetween the POWER ON and RESET commands which lasts between 1ms and 5ms.

See https://github.com/vlasenko/hid-asus-dkms/issues/24 for further
details.

Signed-off-by: Brendan McGrath <redmcg@redmandi.dyndns.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-01-11 21:55:35 +01:00
Linus Torvalds
ba836a6f5a Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "27 fixes.

  There are three patches that aren't actually fixes. They're simple
  function renamings which are nice-to-have in mainline as ongoing net
  development depends on them."

* akpm: (27 commits)
  timerfd: export defines to userspace
  mm/hugetlb.c: fix reservation race when freeing surplus pages
  mm/slab.c: fix SLAB freelist randomization duplicate entries
  zram: support BDI_CAP_STABLE_WRITES
  zram: revalidate disk under init_lock
  mm: support anonymous stable page
  mm: add documentation for page fragment APIs
  mm: rename __page_frag functions to __page_frag_cache, drop order from drain
  mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free
  mm, memcg: fix the active list aging for lowmem requests when memcg is enabled
  mm: don't dereference struct page fields of invalid pages
  mailmap: add codeaurora.org names for nameless email commits
  signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
  mm: pmd dirty emulation in page fault handler
  ipc/sem.c: fix incorrect sem_lock pairing
  lib/Kconfig.debug: fix frv build failure
  mm: get rid of __GFP_OTHER_NODE
  mm: fix remote numa hits statistics
  mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}
  ocfs2: fix crash caused by stale lvb with fsdlm plugin
  ...
2017-01-11 11:15:15 -08:00
Dan Carpenter
73da4268fd vfio-mdev: remove some dead code
We set info.count to 1 in mtty_get_irq_info() so static checkers
complain that, "Why do we have impossible conditions?"  The answer is
that it seems to be left over dead code that can be safely removed.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11 12:12:37 -07:00
Dan Carpenter
5c677869e0 vfio-mdev: buffer overflow in ioctl()
This is a sample driver for documentation so the impact is probably
pretty low.  But we should check that bar_index is valid so we
don't write beyond the end of the mdev_state->region_info[] array.

Fixes: 9d1a546c53 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11 12:12:29 -07:00
Dan Carpenter
6ed0993a0b vfio-mdev: return -EFAULT if copy_to_user() fails
The copy_to_user() function returns the number of bytes which it wasn't
able to copy but we want to return a negative error code.

Fixes: 9d1a546c53 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11 12:06:35 -07:00
Takashi Iwai
6cf4569ce3 ASoC: Fixes for v4.10
As well as the usual smattering of driver specific fixes collected since
 the merge window this has one particularly important fix to the core for
 handling of aux_devs which was broken during the merge window by some of
 the componentization refactoring.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlh2as0THGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0LBrB/92Z6gD0GrbQjP6LkMJ0SwmAMjWOOy+
 hDTr8m9CNwSHwQW/L+rAnyS8WBB46jiJ4/mTw6Sz7YIyY0Xdv5RY7IPPuWC92JQd
 jA+0lcfGe0p86ZvVhK2tye+EHTBqKgfIzO2Sl5XNzaQZiw0S8g/FjJIjBABOGkty
 oyK2iYHAW5H7aNVZfoXR9QQBqWniSh5hh06tCDs7Gy90zlKSOoWDUUfux5pubzVR
 mXOxTnie6bU7Rf0IKzdAQ5EI3zt2XT3XtFgv47VYp4bKW8LbkSo8JCVORGymoq+c
 k+Oc8YPbpAY5Jh4tZ9tSup1Ce7DJvE1sf4VOuHkAoXjKO+Pjp+/qTo50
 =KUQm
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.10

As well as the usual smattering of driver specific fixes collected since
the merge window this has one particularly important fix to the core for
handling of aux_devs which was broken during the merge window by some of
the componentization refactoring.
2017-01-11 19:49:27 +01:00
Jan Kara
0a417b8dc1 xfs: Timely free truncated dirty pages
Commit 99579ccec4 "xfs: skip dirty pages in ->releasepage()" started
to skip dirty pages in xfs_vm_releasepage() which also has the effect
that if a dirty page is truncated, it does not get freed by
block_invalidatepage() and is lingering in LRU list waiting for reclaim.
So a simple loop like:

while true; do
	dd if=/dev/zero of=file bs=1M count=100
	rm file
done

will keep using more and more memory until we hit low watermarks and
start pagecache reclaim which will eventually reclaim also the truncate
pages. Keeping these truncated (and thus never usable) pages in memory
is just a waste of memory, is unnecessarily stressing page cache
reclaim, and reportedly also leads to anonymous mmap(2) returning ENOMEM
prematurely.

So instead of just skipping dirty pages in xfs_vm_releasepage(), return
to old behavior of skipping them only if they have delalloc or unwritten
buffers and fix the spurious warnings by warning only if the page is
clean.

CC: stable@vger.kernel.org
CC: Brian Foster <bfoster@redhat.com>
CC: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Petr Tůma <petr.tuma@d3s.mff.cuni.cz>
Fixes: 99579ccec4
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-11 10:20:04 -08:00