2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-01 10:13:58 +08:00
Commit Graph

563816 Commits

Author SHA1 Message Date
Mickaël Salaün
e04c989eb7 um: Fix ptrace GETREGS/SETREGS bugs
This fix two related bugs:
* PTRACE_GETREGS doesn't get the right orig_ax (syscall) value
* PTRACE_SETREGS can't set the orig_ax value (erased by initial value)

Get rid of the now useless and error-prone get_syscall().

Fix inconsistent behavior in the ptrace implementation for i386 when
updating orig_eax automatically update the syscall number as well. This
is now updated in handle_syscall().

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Thomas Meyer <thomas@m3y3r.de>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Anton Ivanov <aivanov@brocade.com>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: David Drysdale <drysdale@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Kees Cook <keescook@chromium.org>
2016-01-10 21:49:48 +01:00
Vegard Nossum
a7df4716d1 um: link with -lpthread
Similarly to commit fb1770aa78, with gcc 5
on Ubuntu and CONFIG_STATIC_LINK=y I was seeing these linker errors:

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/librt.a(timer_create.o): In function `__timer_create_new':
(.text+0xcd): undefined reference to `pthread_once'
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/librt.a(timer_create.o): In function `__timer_create_new':
(.text+0x126): undefined reference to `pthread_attr_init'
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/librt.a(timer_create.o): In function `__timer_create_new':
(.text+0x168): undefined reference to `pthread_attr_setdetachstate'
[...]

Obviously we also need -lpthread for librt.a.

Cc: stable@vger.kernel.org # 4.4
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:48 +01:00
Anton Ivanov
8c6157b6b3 um: Update UBD to use pread/pwrite family of functions
This decreases the number of syscalls per read/write by half.

Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:48 +01:00
Anton Ivanov
470a166e8c um: Do not change hard IRQ flags in soft IRQ processing
Software IRQ processing in generic architectures assumes that the
exit out of hard IRQ may have re-enabled interrupts (some
architectures may have an implicit EOI). It presumes them enabled
and toggles the flags once more just in case unless this is turned
off in the architecture specific hardirq.h by setting
__ARCH_IRQ_EXIT_IRQS_DISABLED

This patch adds this to UML where due to the way IRQs are handled
it is an optimization (it works fine without it too).

Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:48 +01:00
Anton Ivanov
d5e3f5cbe5 um: Prevent IRQ handler reentrancy
The existing IRQ handler design in UML does not prevent reentrancy

This is mitigated by fd-enable/fd-disable semantics for the IO
portion of the UML subsystem. The timer, however, can and is
re-entered resulting in very deep stack usage and occasional
stack exhaustion.

This patch prevents this by checking if there is a timer
interrupt in-flight before processing any pending timer interrupts.

Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:47 +01:00
Vegard Nossum
0754fb298f uml: flush stdout before forking
I was seeing some really weird behaviour where piping UML's output
somewhere would cause output to get duplicated:

  $ ./vmlinux | head -n 40
  Checking that ptrace can change system call numbers...Core dump limits :
          soft - 0
          hard - NONE
  OK
  Checking syscall emulation patch for ptrace...Core dump limits :
          soft - 0
          hard - NONE
  OK
  Checking advanced syscall emulation patch for ptrace...Core dump limits :
          soft - 0
          hard - NONE
  OK
  Core dump limits :
          soft - 0
          hard - NONE

This is because these tests do a fork() which duplicates the non-empty
stdout buffer, then glibc flushes the duplicated buffer as each child
exits.

A simple workaround is to flush before forking.

Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:47 +01:00
Vegard Nossum
9f2dfda2f2 uml: fix hostfs mknod()
An inverted return value check in hostfs_mknod() caused the function
to return success after handling it as an error (and cleaning up).

It resulted in the following segfault when trying to bind() a named
unix socket:

  Pid: 198, comm: a.out Not tainted 4.4.0-rc4
  RIP: 0033:[<0000000061077df6>]
  RSP: 00000000daae5d60  EFLAGS: 00010202
  RAX: 0000000000000000 RBX: 000000006092a460 RCX: 00000000dfc54208
  RDX: 0000000061073ef1 RSI: 0000000000000070 RDI: 00000000e027d600
  RBP: 00000000daae5de0 R08: 00000000da980ac0 R09: 0000000000000000
  R10: 0000000000000003 R11: 00007fb1ae08f72a R12: 0000000000000000
  R13: 000000006092a460 R14: 00000000daaa97c0 R15: 00000000daaa9a88
  Kernel panic - not syncing: Kernel mode fault at addr 0x40, ip 0x61077df6
  CPU: 0 PID: 198 Comm: a.out Not tainted 4.4.0-rc4 #1
  Stack:
   e027d620 dfc54208 0000006f da981398
   61bee000 0000c1ed daae5de0 0000006e
   e027d620 dfcd4208 00000005 6092a460
  Call Trace:
   [<60dedc67>] SyS_bind+0xf7/0x110
   [<600587be>] handle_syscall+0x7e/0x80
   [<60066ad7>] userspace+0x3e7/0x4e0
   [<6006321f>] ? save_registers+0x1f/0x40
   [<6006c88e>] ? arch_prctl+0x1be/0x1f0
   [<60054985>] fork_handler+0x85/0x90

Let's also get rid of the "cosmic ray protection" while we're at it.

Fixes: e9193059b1 "hostfs: fix races in dentry_name() and inode_name()"
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:47 +01:00
Sudip Mukherjee
eb37bc3f85 m68k: Provide __phys_to_pfn() and __pfn_to_phys()
The defconfig build of m68k was failing with the error:
implicit declaration of function '__pfn_to_phys'

Other architectures have added <asm/memory.h>, but if we do so here then
we will also get redeclaration of some other functions. So it is better
to copy these macros into page.h.

Fixes: 0a3c3bf11240 ("x86, mm: introduce vmem_altmap to augment vmemmap_populate()")
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Reported-by: Guenter Roeck <linux@roeck-us.net> (m68knommu)
[geert: Apply to page.h instead of page_mm.h to cover nommu, reword]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2016-01-10 13:56:40 +01:00
Finn Thain
2d522618c8 m68k/atari, m68k/sun3: Fix SCSI platform device registration when driver is modular
Fixes: 3ff228af84 ("atari_scsi: Convert to platform device")
Fixes: 0d31f87591 ("sun3_scsi: Convert to platform device")
Reported-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2016-01-10 13:56:40 +01:00
Linus Torvalds
eac6f76ac7 SCSI fixes on 20160109
Single fix for machines with pages > 4k (PPC mostly).  There's a bug in our
 optimal transfer size code where we don't account for pages > 4k and can set
 the transfer size to be less than the page size causing nasty failures.
 
 Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJWkUiHAAoJEDeqqVYsXL0MArAH/2XWKJGI9tr0AQQ79WGD/kjV
 KZsUjKP9sshjxXBB8jK+TFvO/Z2BE92XzXtIswgdc6OdeANyhE+LhdbNpuooX2gv
 lW28RAbSVLrwJzyr7B2VGAiCOR9opGu2opOJnQMo05pSAFqJxNG4l1Ap+4pOX9/1
 ffTwidgk6bWs5zKlDwbETHVv/X50U90O5MyJBvf9KC7YvFhD41OIz7QEqiHgs1qW
 T6J80ZH4ZhWN+pRMHlybJ7RwP7TVjUSkDRLWRSX8IWjisbDqY86JVVt/BUA3lbhL
 sLDc/mku3ZPsRAUJL544ydAPWA2DtJ9PjPkYuMgPQOCyUfAKq5OTf6hG0Mghm34=
 =n00Y
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fix from James Bottomley:
 "A single fix for machines with pages > 4k (PPC mostly).

  There's a bug in our optimal transfer size code where we don't account
  for pages > 4k and can set the transfer size to be less than the page
  size causing nasty failures"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  sd: Reject optimal transfer length smaller than page size
2016-01-09 14:53:48 -08:00
Linus Torvalds
c0cb139345 PCI updates for v4.4:
TI DRA7xx host bridge driver
     Mark driver as broken (Richard Cochran)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWkSPdAAoJEFmIoMA60/r80kQP/jPgr13qJ0bt+DE8WMZ98zRu
 BKpgL2MMQ2PzHjpHvrce1y2WceSdnPvloB5fElpi0PtYRzjmhZ607tgBpUjUEm5U
 u/jmFa28PlzzGMu8zQuV6Ge4zFp8PdcNGxZqRNpxZl4HLltxYgd7LnM9wgmSAUo7
 VwZlO5QIjHUPQ2oFC1o7F7pRhme1RpfUDwXaoHAER1E44pHBRksjY/xRlP/Nknh5
 7QOcb62e6N/ngR6Vb01evNVSdtH1+HQQlPaxPALw3sItaRsqqB9rLfXecw5Xu1NN
 h2DiRTcG/2X5bA8yxYZtmkFyvkFQFHjoxgvD2RHf7jb9TX0qyryFJceAKyyAUmFT
 A4z3XSmd54tXjetFkSzYsUbb0Egp1atBLT1Uw8d7UH4djebnwh3hTqPNFpRZBDeA
 AZRIkhTeupRjJGgtYsVghsiOQeS0yg4OboDVBPJljELOcsYN/nuoNvreXCp8qk4H
 pHyq9AQ3XVu3OS94SvDNmTdssUNARPL020/K8FDOkbOnD9EDK/J/PdiEGCZtNea4
 nu8qVo1PaROd5HBxQ2zB6PDiAIXAEfuqmFZPRXd5xa/DHxSfkHRJMmdE1TUVeFHn
 rB+8Fz/PqdkA4wIM3STWtVZOCjYnn6WkeWYS95jX9PmVc1Pa8ByNYkrYc5mxGtar
 uzqdDmZqYJAzRKkIadMR
 =srOL
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.4-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixlet from Bjorn Helgaas:
 "This marks the TI DRA7xx host bridge driver as broken.  Apparently it
  has never worked without some additional out-of-tree code, so I'm
  going to mark it broken now and remove it completely next cycle unless
  it's fixed"

* tag 'pci-v4.4-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: dra7xx: Mark driver as broken
2016-01-09 14:44:44 -08:00
Ingo Molnar
3eb9ede23b perf/core improvements and fixes:
New features:
 
 - Allow using trace events fields as sort order keys, making 'perf evlist --trace_fields'
   show those, and then the user can select a subset and use like:
 
     perf top -e sched:sched_switch -s prev_comm,next_comm
 
   That works as well in 'perf report' when handling files containing
   tracepoints.
 
   The default when just tracepoint events are found in a perf.data file is to
   format it like ftrace, using the libtraceevent formatters, plugins, etc (Namhyung Kim)
 
 - Add support in 'perf script' to process 'perf stat record' generated files,
   culminating in a python perf script that calculates CPI (Cycles per
   Instruction) (Jiri Olsa)
 
 - Show random perf tool tips in the 'perf report' bottom line (Namhyung Kim)
 
 - perf report now defaults to --group if the perf.data file has grouped events, try it with:
 
   # perf record -e '{cycles,instructions}' -a sleep 1
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ]
   # perf report
   # Samples: 1K of event 'anon group { cycles, instructions }'
   # Event count (approx.): 1955219195
   #
   #       Overhead  Command     Shared Object      Symbol
 
      2.86%   0.22%  swapper     [kernel.kallsyms]  [k] intel_idle
      1.05%   0.33%  firefox     libxul.so          [.] js::SetObjectElement
      1.05%   0.00%  kworker/0:3 [kernel.kallsyms]  [k] gen6_ring_get_seqno
      0.88%   0.17%  chrome      chrome             [.] 0x0000000000ee27ab
      0.65%   0.86%  firefox     libxul.so          [.] js::ValueToId<(js::AllowGC)1>
      0.64%   0.23%  JS Helper   libxul.so          [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay
      0.62%   1.27%  firefox     libxul.so          [.] js::GetIterator
      0.61%   1.74%  firefox     libxul.so          [.] js::NativeSetProperty
      0.61%   0.31%  firefox     libxul.so          [.] js::SetPropertyByDefining
 
 User visible fixes:
 
 - Coect data mmaps so that the DWARF unwinder can handle usecases needing them,
   like softice (Jiri Olsa)
 
 - Decay callchains in fractal mode, fixing up cases where 'perf top -g' would
   show entries with more than 100% (Namhyung Kim)
 
 Infrastructure:
 
 - Sync tools/lib with the lib/ in the kernel sources for find_bit.c and
   move bitmap.[ch] from tools/perf/util/ to tools/lib/ (Arnaldo Carvalho de Melo)
 
 - No need to set attr.sample_freq in some 'perf test' entries that only
   want to deal with PERF_RECORD_ meta-events, improve a bit error output
   for CQM test (Arnaldo Carvalho de Melo)
 
 - Fix python binding build, adding some missing object files now required
   due to cpumap using find_bit stuff (Arnaldo Carvalho de Melo)
 
 - tools/build improvemnts (Jiri Olsa)
 
 - Add more files to cscope/ctags databases (Jiri Olsa)
 
 - Do not show 'trace' in 'perf help' if it is not compiled in (Jiri Olsa)
 
 - Make perf_evlist__open() open evsels with their cpus and threads,
   like perf record does, making them consistent (Adrian Hunter)
 
 - Fix pmu snapshot initialization bug (Stephane Eranian)
 
 - Add missing headers in perf's MANIFEST (Wang Nan)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWj/imAAoJENZQFvNTUqpADg4P/2/zn/JlIQbFMBo0YS4HtBtp
 OIvpU9XV0YyMWEbxEUfNOM3qnOfqZWOuTHvqVywnz1JXHxKNt9j7KjPhBA2avLEc
 WuloOf7Af1eUjroKW/tl1FsPatW0x0zXEVqR5XjUrPfXge6rYPuMwQ2f4oTCzc6H
 uf5fH1H2vK/iQsOu9X+IoGEKxoJF22zabKDQy7q+48gSq/TVtz1wJfHqYBCCFOh8
 c+MEMnOZ3F0DXJ9iRsXbChcOmkHTfAnu5CM8GlkvM38VnJA69K+AkFXC3YuExvA1
 PghTVZMBsYDyHNq+ewIshrJ1xGz0/OwXX2IUtJfFPGbWhMZYl2NCg1SuLfTl7jCF
 m/iR5LzdHpLiEIaimuK+8eIVfLdRyxZUeN9/BQzUFrl7pqe2n1xIjIvUX0djHPv9
 YlQ6ZI/g/nJ1AunC0QhWiwSkmUas/YATNKt7CunOnSjJey2p9T91lhzstqvCur4Q
 XH1iIA4o2A67vRLbVEb2eh54QT2BASO1H+suYPNTvU55W5gGz9pJJjVxIbpT5+lk
 dAZ8vXwxOg1jMFIgDrm6mbosGcs4lUBeeKJbE7ImevpSyOECR8lRdV4yX+bfkuUe
 FL5fNfJyFcr3q2p+sRTpikrq32C3iyQ08VhYYNLuVSUOQgNYEyLTmhMkotZYjmIk
 JZO4/8trMU7FR2WpnGqG
 =tbfd
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New features:

- Allow using trace events fields as sort order keys, making 'perf evlist --trace_fields'
  show those, and then the user can select a subset and use like:

    perf top -e sched:sched_switch -s prev_comm,next_comm

  That works as well in 'perf report' when handling files containing
  tracepoints.

  The default when just tracepoint events are found in a perf.data file is to
  format it like ftrace, using the libtraceevent formatters, plugins, etc (Namhyung Kim)

- Add support in 'perf script' to process 'perf stat record' generated files,
  culminating in a python perf script that calculates CPI (Cycles per
  Instruction) (Jiri Olsa)

- Show random perf tool tips in the 'perf report' bottom line (Namhyung Kim)

- perf report now defaults to --group if the perf.data file has grouped events, try it with:

  # perf record -e '{cycles,instructions}' -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ]
  # perf report
  # Samples: 1K of event 'anon group { cycles, instructions }'
  # Event count (approx.): 1955219195
  #
  #       Overhead  Command     Shared Object      Symbol

     2.86%   0.22%  swapper     [kernel.kallsyms]  [k] intel_idle
     1.05%   0.33%  firefox     libxul.so          [.] js::SetObjectElement
     1.05%   0.00%  kworker/0:3 [kernel.kallsyms]  [k] gen6_ring_get_seqno
     0.88%   0.17%  chrome      chrome             [.] 0x0000000000ee27ab
     0.65%   0.86%  firefox     libxul.so          [.] js::ValueToId<(js::AllowGC)1>
     0.64%   0.23%  JS Helper   libxul.so          [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay
     0.62%   1.27%  firefox     libxul.so          [.] js::GetIterator
     0.61%   1.74%  firefox     libxul.so          [.] js::NativeSetProperty
     0.61%   0.31%  firefox     libxul.so          [.] js::SetPropertyByDefining

User visible fixes:

- Coect data mmaps so that the DWARF unwinder can handle usecases needing them,
  like softice (Jiri Olsa)

- Decay callchains in fractal mode, fixing up cases where 'perf top -g' would
  show entries with more than 100% (Namhyung Kim)

Infrastructure changes:

- Sync tools/lib with the lib/ in the kernel sources for find_bit.c and
  move bitmap.[ch] from tools/perf/util/ to tools/lib/ (Arnaldo Carvalho de Melo)

- No need to set attr.sample_freq in some 'perf test' entries that only
  want to deal with PERF_RECORD_ meta-events, improve a bit error output
  for CQM test (Arnaldo Carvalho de Melo)

- Fix python binding build, adding some missing object files now required
  due to cpumap using find_bit stuff (Arnaldo Carvalho de Melo)

- tools/build improvemnts (Jiri Olsa)

- Add more files to cscope/ctags databases (Jiri Olsa)

- Do not show 'trace' in 'perf help' if it is not compiled in (Jiri Olsa)

- Make perf_evlist__open() open evsels with their cpus and threads,
  like perf record does, making them consistent (Adrian Hunter)

- Fix pmu snapshot initialization bug (Stephane Eranian)

- Add missing headers in perf's MANIFEST (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-09 17:17:33 +01:00
Guenter Roeck
91918d13eb hwmon: (nct6683) Add basic support for NCT6683 on Mitac boards
Mitac microcode differs from Intel microcode. One key difference
is that pwm values can be written.

Detect vendor from customer ID field and no longer use DMI data
to identify which microcode is running on the chip.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-01-09 07:31:58 -08:00
Michal Hocko
751e5f5c75 vmstat: allocate vmstat_wq before it is used
kernel test robot has reported the following crash:

  BUG: unable to handle kernel NULL pointer dereference at 00000100
  IP: [<c1074df6>] __queue_work+0x26/0x390
  *pdpt = 0000000000000000 *pde = f000ff53f000ff53 *pde = f000ff53f000ff53
  Oops: 0000 [#1] PREEMPT PREEMPT SMP SMP
  CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.4.0-rc4-00139-g373ccbe #1
  Workqueue: events vmstat_shepherd
  task: cb684600 ti: cb7ba000 task.ti: cb7ba000
  EIP: 0060:[<c1074df6>] EFLAGS: 00010046 CPU: 0
  EIP is at __queue_work+0x26/0x390
  EAX: 00000046 EBX: cbb37800 ECX: cbb37800 EDX: 00000000
  ESI: 00000000 EDI: 00000000 EBP: cb7bbe68 ESP: cb7bbe38
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 00000100 CR3: 01fd5000 CR4: 000006b0
  Stack:
  Call Trace:
    __queue_delayed_work+0xa1/0x160
    queue_delayed_work_on+0x36/0x60
    vmstat_shepherd+0xad/0xf0
    process_one_work+0x1aa/0x4c0
    worker_thread+0x41/0x440
    kthread+0xb0/0xd0
    ret_from_kernel_thread+0x21/0x40

The reason is that start_shepherd_timer schedules the shepherd work item
which uses vmstat_wq (vmstat_shepherd) before setup_vmstat allocates
that workqueue so if the further initialization takes more than HZ we
might end up scheduling on a NULL vmstat_wq.  This is really unlikely
but not impossible.

Fixes: 373ccbe592 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
Reported-by: kernel test robot <ying.huang@linux.intel.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-08 23:47:54 -08:00
Jann Horn
a7f61e89af compat_ioctl: don't call do_ioctl under set_fs(KERNEL_DS)
This replaces all code in fs/compat_ioctl.c that translated
ioctl arguments into a in-kernel structure, then performed
do_ioctl under set_fs(KERNEL_DS), with code that allocates
data on the user stack and can call the VFS ioctl handler
under USER_DS.

This is done as a hardening measure because the caller
does not know what kind of ioctl handler will be invoked,
only that no corresponding compat_ioctl handler exists and
what the ioctl command number is. The accidental
invocation of an unlocked_ioctl handler that unexpectedly
calls copy_to_user could be a severe security issue.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-08 21:18:13 -05:00
Al Viro
66cf191f3e compat_ioctl: don't pass fd around when not needed
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-08 21:16:50 -05:00
Jann Horn
b43417216e compat_ioctl: don't look up the fd twice
In code in fs/compat_ioctl.c that translates ioctl arguments
into a in-kernel structure, then performs sys_ioctl, possibly
under set_fs(KERNEL_DS), this commit changes the sys_ioctl
calls to do_ioctl calls. do_ioctl is a new function that does
the same thing as sys_ioctl, but doesn't look up the fd again.

This change is made to avoid (potential) security issues
because of ioctl handlers that accept one of the ioctl
commands I2C_FUNCS, VIDEO_GET_EVENT, MTIOCPOS, MTIOCGET,
TIOCGSERIAL, TIOCSSERIAL, RTC_IRQP_READ, RTC_EPOCH_READ.
This can happen for multiple reasons:

 - The ioctl command number could be reused.
 - The ioctl handler might not check the full ioctl
   command. This is e.g. true for drm_ioctl.
 - The ioctl handler is very special, e.g. cuse_file_ioctl

The real issue is that set_fs(KERNEL_DS) is used here,
but that's fixed in a separate commit
"compat_ioctl: don't call do_ioctl under set_fs(KERNEL_DS)".

This change mitigates potential security issues by
preventing a race that permits invocation of
unlocked_ioctl handlers under KERNEL_DS through compat
code even if a corresponding compat_ioctl handler exists.

So far, no way has been identified to use this to damage
kernel memory without having CAP_SYS_ADMIN in the init ns
(with the capability, doing reads/writes at arbitrary
kernel addresses should be easy through CUSE's ioctl
handler with FUSE_IOCTL_UNRESTRICTED set).

[AV: two missed sys_ioctl() taken care of]

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-08 21:16:11 -05:00
Mikulas Patocka
385277bfb5 dm snapshot: fix hung bios when copy error occurs
When there is an error copying a chunk dm-snapshot can incorrectly hold
associated bios indefinitely, resulting in hung IO.

The function copy_callback sets pe->error if there was error copying the
chunk, and then calls complete_exception.  complete_exception calls
pending_complete on error, otherwise it calls commit_exception with
commit_callback (and commit_callback calls complete_exception).

The persistent exception store (dm-snap-persistent.c) assumes that calls
to prepare_exception and commit_exception are paired.
persistent_prepare_exception increases ps->pending_count and
persistent_commit_exception decreases it.

If there is a copy error, persistent_prepare_exception is called but
persistent_commit_exception is not.  This results in the variable
ps->pending_count never returning to zero and that causes some pending
exceptions (and their associated bios) to be held forever.

Fix this by unconditionally calling commit_exception regardless of
whether the copy was successful.  A new "valid" parameter is added to
commit_exception -- when the copy fails this parameter is set to zero so
that the chunk that failed to copy (and all following chunks) is not
recorded in the snapshot store.  Also, remove commit_callback now that
it is merely a wrapper around pending_complete.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
2016-01-08 20:03:05 -05:00
Linus Torvalds
44d8a7d5c1 ARM: SoC fixes for 4.4-rc
This is the final small set of ARM SoC bug fixes for linux-4.4,
 almost all regressions:
 
 OMAP:		data corruption on the Nokia N900 flash
 Allwinner:	Two defconfig change to get USB working again
 ARM Versatile:	Interrupt numbers gone bad after an older bug fix
 Nomadik:	Crashes from incorrect L2 cache settings
 VIA vt8500:	SD/MMC support on WM8650 never worked
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVpA/vGCrR//JCVInAQK1NQ//fk4EYJj6JMwaB/I2hGtpUpartVtXj9tj
 916+NazORGc/UBP9vG/bLWnZrKzZQdeLd2HwDT9eyHtZ+Qq8bm+No312l10EATkS
 EAZKy4oy0HH5BD8yE5xM+CQn9fxh8GcmGJ/hKacpJnY2e0aaGgp9iHiPWH8kRmO4
 kBbUWsbfGjxzfM+a/9XGrzcTWZLSOu6VGtxvdp2rGy+Un8AWVc6FQz48xsFXWivB
 5JHp0qTnyUebSNbsxq05BDorybDuNSho5d3r94H85WwfbdybCxuFYedTZ1r3vCh9
 rWDI8R2vb3uotzYYjigRwyYqggUz6TewWqwnyeyGsMHOCEU/q/2n2ASLhdxKl2x8
 UZ+kKDhRiraaxnsNh/apVy4yvs7KmsUM6DgxzvXW3uyrrLrjJyGHKY5doHt5Z5mJ
 ErhOW024skA8GNiWObmNiKzUdKEbO4e7ReFgwbLbA8W6ACVVMNdL4udsmkb+nj25
 +VFWOg5WVG5WJFaEoSOGze7J0+SSwu7wP0HRPmWFotslExh5HsrvkSw429sDDB0R
 4OwT0Ru0vDZvGrHD+Bu6mFJO1nL2kWKVsolyben1sA+OIbm7eoHcennRUFf2sog0
 yWD03DT9iz//bmWPHVntcdRTTaXYvaChPWzHxlbBxw2DCfRkzy4Tx/Ic44s2Kjuq
 PgQjTmNlnQ8=
 =Z1E0
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "This is the final small set of ARM SoC bug fixes for linux-4.4, almost
  all regressions:

  OMAP:
   - data corruption on the Nokia N900 flash

  Allwinner:
   - Two defconfig change to get USB working again

  ARM Versatile:
   - Interrupt numbers gone bad after an older bug fix

  Nomadik:
   - Crashes from incorrect L2 cache settings

  VIA vt8500:
   - SD/MMC support on WM8650 never worked"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  dts: vt8500: Add SDHC node to DTS file for WM8650
  ARM: Fix broken USB support in multi_v7_defconfig for sunxi devices
  ARM: versatile: fix MMC/SD interrupt assignment
  ARM: nomadik: set latencies to 8 cycles
  ARM: OMAP2+: Fix onenand rate detection to avoid filesystem corruption
  ARM: Fix broken USB support in sunxi_defconfig
2016-01-08 16:11:05 -08:00
Linus Torvalds
516c50cde6 A simple fix. I'm sending it before the merge window, because it refines
a patch found in your master branch but not yet in the kvm/next branch
 that is destined for 4.5.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWj+yRAAoJEL/70l94x66DulYH/0OGP+yIHDDFlBqtPRm6q0pr
 r8pSVRPPd4GY2SOJDBsBvMmWphFSYKIoCTyMbFnikADHM2yh/pycwLU/uzCM5xQl
 uABMsCUntwbGaKq+A4bOvsNO49ueRCkML4ToVuKNTeuEKRYfdnlj3XcAMMgsUfEF
 QGz8W2cm9xPn69df91cfBuFLLFeQVv2XsjA5WpqzzvWy5HEs1F07aVh57TI4j8OF
 eFdn3Lkes9Ync70KjEy2QKe2Su0EWjderE0oqAORKomwZFVCYv/Vg1wERJYsugg5
 UyYCY2j1tKlycKYDnO47L1xoS9JgMHY05OsH08Sn/EXBjRjnEVwTyco5pGPmuNA=
 =5Lst
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fix from Paolo Bonzini:
 "A simple fix.  I'm sending it before the merge window, because it
  refines a patch found in your master branch but not yet in the
  kvm/next branch that is destined for 4.5"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: only channel 0 of the i8254 is linked to the HPET
2016-01-08 15:58:14 -08:00
Linus Torvalds
496b0b57c0 ACPI fix for final v4.4
Just one obvious fix that adds a missing function argument in
 ACPI code introduced recently (Kees Cook).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJWkD8gAAoJEILEb/54YlRxOC4P/RDQxQkTzQPskyqHayzamCSQ
 cEqKYOL199Do42OD0mlJm9gc2Ax0tLvSEK/6VRz4WB+xqrlUz8Lcnqrcjld632jx
 a+xxqXcgIYTV03ujhNs4rRiWBedH9alwKvy4SoukQkL6eeXsE7llk1hZNjoqA3rh
 VsVsPSxQNLOAkJJzaHpBrGE3weCdXAmUb7aaFe1mjj1eGrPP+MjOYRSKWoyRn4bA
 RWL84wUmaM5X/NowCuFJZHEXGyiYm3EBFu4oqfKMRJDE6saJ0DLIPfIuaEX6srox
 gSsOB0tCHhrVR4vhfFD2x4P3rMEwt39IcW/5dAj76zYH2yyYiCpC4SmdYIJTkI+a
 /8y+a9/4CyKfOGhZ+lfJQOo0HbMHOGkF1VitNHGmiGnefTKDravFXIVqFqWBvgm3
 oT1tPtM4fmgy787WBVMP+P7NQ9yh4/YQ5oB+3Mi5AS13mT1PCTKe32q9LiuMUAtE
 QmcK0yvDAUM9LwNDYffUFVoUxkCAUft7CYIso46X5qjDOp94fB52K/JxM4aMX90b
 kjtKWLWVFGzBCehj7cb4RX6Jqaq/5cR8f3pLf+wyC9tli2DtThX2aiFXjA3PcY8z
 TZwZ8L31LL6apbIW5wZTMN9f6cKiM4YT44KYxWFdgf/S8ZqRx9EtjqrAKcJxgziF
 WM+yB+XEb7Q21WcTESV/
 =XAk2
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Just one obvious fix that adds a missing function argument in ACPI
  code introduced recently (Kees Cook)"

* tag 'pm+acpi-4.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / property: avoid leaking format string into kobject name
2016-01-08 15:50:59 -08:00
Linus Torvalds
650e5455d8 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "A handful of x86 fixes:

   - a syscall ABI fix, fixing an Android breakage
   - a Xen PV guest fix relating to the RTC device, causing a
     non-working console
   - a Xen guest syscall stack frame fix
   - an MCE hotplug CPU crash fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/numachip: Fix NumaConnect2 MMCFG PCI access
  x86/entry: Restore traditional SYSENTER calling convention
  x86/entry: Fix some comments
  x86/paravirt: Prevent rtc_cmos platform device init on PV guests
  x86/xen: Avoid fast syscall path for Xen PV guests
  x86/mce: Ensure offline CPUs don't participate in rendezvous process
2016-01-08 15:21:48 -08:00
Linus Torvalds
de03017958 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc scheduler fixes"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Reset task's lockless wake-queues on fork()
  sched/core: Fix unserialized r-m-w scribbling stuff
  sched/core: Check tgid in is_global_init()
  sched/fair: Fix multiplication overflow on 32-bit systems
2016-01-08 13:57:13 -08:00
Linus Torvalds
3ab6d1ebd5 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Two core subsystem fixes, plus a handful of tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix race in swevent hash
  perf: Fix race in perf_event_exec()
  perf list: Robustify event printing routine
  perf list: Add support for PERF_COUNT_SW_BPF_OUT
  perf hists browser: Fix segfault if use symbol filter in cmdline
  perf hists browser: Reset selection when refresh
  perf hists browser: Add NULL pointer check to prevent crash
  perf buildid-list: Fix return value of perf buildid-list -k
  perf buildid-list: Show running kernel build id fix
2016-01-08 13:52:59 -08:00
Linus Torvalds
ea83ae2fb3 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
 "Fixes a core IRQ subsystem deadlock"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Prevent chip buslock deadlock
2016-01-08 13:46:59 -08:00
Linus Torvalds
a6a7358e49 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block revert from Jens Axboe:
 "The previous pull request had a split fix for NVMe, however there are
  corner cases where that ends up blowing up.

  So let's revert it for 4.4.  The regression isn't introduced in this
  cycle, and it's "just" a performance regression, not a
  stability/integrity issue"

* 'for-linus' of git://git.kernel.dk/linux-block:
  Revert "block: Split bios on chunk boundaries"
2016-01-08 13:39:09 -08:00
Linus Torvalds
212c7f66ec dmaengine fixes for 4.4
Late fixes for 4.4 are three fixes for drivers which include a
 revert of mic-x100 fix which is causing regression, xgene fix for
 double IRQ and async_tx fix to use GFP_NOWAIT
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWj7XNAAoJEHwUBw8lI4NHOEEQAIj9YoQjkHkPo2PviIeiX3Zy
 HNAIM0GepYSZrJO+duKAzFxK4mEGC92gpyOiv6NJ91RdC2ijYc9gb6Z7ig/g4znq
 odqMGFIdBEqZAq1PhN81BtJqdkw4EmzffS/GaEol0MM85Hvm5r91GfOx86AKs0HW
 emOJL5XXG8/byNY7xihCl+55u9l1E5c7+K8QNj3++bgKK07KIor4qtxBNUqejQUj
 /eFdi8Z1SFOHHDvdYKGVRRm70uq6BqfYcsLfLcgd7uEDkvz0Zcy7SF11hY67z8QU
 ood0eyyuBhuZjxzZN+5No9UkRf2BKfjXUex/wr2AnuqF2xYcGYC76tFKeAUbOc6s
 YF+SdfDoQVys3o6EhYj6hj9s1viepu41tpnOB8ldBE2B/Gyn0iq/ipj05rsMKIRp
 ivefoSb7KG1O6aesJ/n7vUBY8VWiE3KBcRYoIotW5jNQ83Jb/e40ougpshzxeWYR
 08PekcG+CAUlnfaF8f0+c3xuo2Q23MMpf6e3XqlNxCGBpwIn8NEtLw1qcLdKI9zG
 x6zguGxEZDIoGYC0kjmP+PNAjrJmNVHutcJODgK73H4zNvY1O6Kg+dOBLrwkqXh5
 Um6RRDzOcEWsGMHg0akPHWaaAt7slCvtTXLMtzMysHR1mkihqAbzq6Zr0QoGjekz
 j20fC8o8bArX9cCDO71F
 =jYJY
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-4.4' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "Late fixes for 4.4 are three fixes for drivers which include a revert
  of mic-x100 fix which is causing regression, xgene fix for double IRQ
  and async_tx fix to use GFP_NOWAIT"

* tag 'dmaengine-fix-4.4' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: xgene-dma: Fix double IRQ issue by setting IRQ_DISABLE_UNLAZY flag
  async_tx: use GFP_NOWAIT rather than GFP_IO
  dmaengine: Revert "dmaengine: mic_x100: add missing spin_unlock"
2016-01-08 12:23:00 -08:00
Linus Torvalds
436950a65d Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi fix from Jean Delvare.

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  firmware: dmi_scan: Fix UUID endianness for SMBIOS >= 2.6
2016-01-08 12:18:45 -08:00
Linus Torvalds
4054f64c93 sound fixes for 4.4
A slightly higher volume than a new year's wish, but not too
 worrisome: a large LOC is only for HD-audio device-specific quirks,
 so fairly safe to apply.  The rest ASoC fixes are all trivial and
 small; a simple replacement of mutex call with nested lock version,
 a few Arizona and Realtek codec fixes, and a regression fix for
 Skylake firmware handling.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJWjmpaAAoJEGwxgFQ9KSmkypcP/AsZgxRp9kU7xBKcymZR6ldi
 OtXse9Irie7/Da23v98VfOmC0SF1RYDnEO+yDi3ud31ZGyqcwCMBrcO1ZrvuQw1w
 3LDsUdVsOJyD2bK/u9pDy+HfZQ62I+IVN/lwQ3eA4fimxae5M+6Ptt4q6tAiOVul
 mk6sf+aVIIpGRJDV0icVj+S+N1o05PhukEPCrzi8NgLjN6QYwOeaIEQa6/ymZjfe
 mcYG7SQsyyxGtdDpGEebZtQJVxmYY93qnLDykikLsNquCJKRX4GGWZqZn2dVF73M
 afZWt921+CvPdLIabQYgjPpqMniFR/XbGXuXYhLtADbb806SFmy68Dvfs29u9qDH
 ZwmsMcvwx/CjeyuZGicbjlApND9pA5RHGkL4qE2hCxkrrBrfWPfVKsa5DKUMthiy
 vIUTBQohgMQE+H1N6S1vd+4itnUdMaxhF4+yV1oIzHtAo1m+bHR6Mzein3O9lS9o
 7ch028nrhDZZZNOMl3vsbCS8R9KAXKSpojwgCZ94QwY8xSelXXiLQY05ggA1U5Bo
 w7Uqnn3AnvGieORv3ZZ0TKEQSGO6rxMUB+L8PsrJEiC8F3tkSr9yYVQzXxo69Qhs
 jQXAREwdwarugVj+ER8rIyFfJ9rd1+8yZT3oRr+WhmaXvuQ6CoGrQ/P0+VsMUSxC
 RJKBEeGjaFRhc2NedGO7
 =XSAp
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A slightly higher volume than a new year's wish, but not too
  worrisome: a large LOC is only for HD-audio device-specific quirks, so
  fairly safe to apply.  The rest ASoC fixes are all trivial and small;
  a simple replacement of mutex call with nested lock version, a few
  Arizona and Realtek codec fixes, and a regression fix for Skylake
  firmware handling"

* tag 'sound-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: Intel: Skylake: Fix the memory leak
  ASoC: Intel: Skylake: Revert previous broken fix memory leak fix
  ASoC: Use nested lock for snd_soc_dapm_mutex_lock
  ASoC: rt5645: add sys clk detection
  ALSA: hda - Add keycode map for alc input device
  ALSA: hda - Add mic mute hotkey quirk for Lenovo ThinkCentre AIO
  ASoC: arizona: Fix bclk for sample rates that are multiple of 4kHz
2016-01-08 11:52:18 -08:00
Chris Wilson
1f1a89ac05 x86/mm: Micro-optimise clflush_cache_range()
Whilst inspecting the asm for clflush_cache_range() and some perf profiles
that required extensive flushing of single cachelines (from part of the
intel-gpu-tools GPU benchmarks), we noticed that gcc was reloading
boot_cpu_data.x86_clflush_size on every iteration of the loop. We can
manually hoist that read which perf regarded as taking ~25% of the
function time for a single cacheline flush.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sai Praneeth <sai.praneeth.prakhya@intel.com>
Link: http://lkml.kernel.org/r/1452246933-10890-1-git-send-email-chris@chris-wilson.co.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-01-08 19:27:39 +01:00
Andrey Smetanin
ac3e5fcae8 kvm/x86: Hyper-V SynIC timers tracepoints
Trace the following Hyper SynIC timers events:
* periodic timer start
* one-shot timer start
* timer callback
* timer expiration and message delivery result
* timer config setup
* timer count setup
* timer cleanup

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:43 +01:00
Andrey Smetanin
18659a9cb1 kvm/x86: Hyper-V SynIC tracepoints
Trace the following Hyper SynIC events:
* set msr
* set sint irq
* ack sint
* sint irq eoi

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:43 +01:00
Andrey Smetanin
f3b138c5d8 kvm/x86: Update SynIC timers on guest entry only
Consolidate updating the Hyper-V SynIC timers in a
single place: on guest entry in processing KVM_REQ_HV_STIMER
request.  This simplifies the overall logic, and makes sure
the most current state of msrs and guest clock is used for
arming the timers (to achieve that, KVM_REQ_HV_STIMER
has to be processed after KVM_REQ_CLOCK_UPDATE).

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:42 +01:00
Andrey Smetanin
7be58a6488 kvm/x86: Skip SynIC vector check for QEMU side
QEMU zero-inits Hyper-V SynIC vectors. We should allow that,
and don't reject zero values if set by the host.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:42 +01:00
Andrey Smetanin
23a3b201fd kvm/x86: Hyper-V fix SynIC timer disabling condition
Hypervisor Function Specification(HFS) doesn't require
to disable SynIC timer at timer config write if timer->count = 0.

So drop this check, this allow to load timers MSR's
during migration restore, because config are set before count
in QEMU side.

Also fix condition according to HFS doc(15.3.1):
"It is not permitted to set the SINTx field to zero for an
enabled timer. If attempted, the timer will be
marked disabled (that is, bit 0 cleared) immediately."

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:41 +01:00
Andrey Smetanin
0cdeabb118 kvm/x86: Reorg stimer_expiration() to better control timer restart
Split stimer_expiration() into two parts - timer expiration message
sending and timer restart/cleanup based on timer state(config).

This also fixes a bug where a one-shot timer message whose delivery
failed once would get lost for good.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:41 +01:00
Andrey Smetanin
f808495da5 kvm/x86: Hyper-V unify stimer_start() and stimer_restart()
This will be used in future to start Hyper-V SynIC timer
in several places by one logic in one function.

Changes v2:
* drop stimer->count == 0 check inside stimer_start()
* comment stimer_start() assumptions

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:40 +01:00
Andrey Smetanin
019b9781cc kvm/x86: Drop stimer_stop() function
The function stimer_stop() is called in one place
so remove the function and replace it's call by function
content.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:40 +01:00
Andrey Smetanin
1ac1b65ac1 kvm/x86: Hyper-V timers fix incorrect logical operation
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:39 +01:00
Paolo Bonzini
2860c4b167 KVM: move architecture-dependent requests to arch/
Since the numbers now overlap, it makes sense to enumerate
them in asm/kvm_host.h rather than linux/kvm_host.h.  Functions
that refer to architecture-specific requests are also moved
to arch/.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:04:36 +01:00
Paolo Bonzini
6662ba347b KVM: renumber vcpu->request bits
Leave room for 4 more arch-independent requests.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:03:06 +01:00
Paolo Bonzini
0cd3104372 KVM: document which architecture uses each request bit
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:03:06 +01:00
Paolo Bonzini
6c71f8ae15 KVM: Remove unused KVM_REQ_KICK to save a bit in vcpu->requests
Suggested-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
[Takuya moved all subsequent constants to fill the void, but that
 is useless in view of the following patches.  So this change looks
 nothing like the original. - Paolo]
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08 19:03:05 +01:00
Paolo Bonzini
d0ecb44dd4 KVM: s390: Feature and fix for 4.5
- allow the runtime instrumentation support inside the guests
 - remove a useless memory barrier
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJWjstnAAoJEBF7vIC1phx8iKEQAI9j2Y+BqXbeS0rSJtalCh9k
 f4Z3LX5GK46YzspKQcFQH++Rrowhf6bQWnArrYFXjvhkBOsIWadNEMc5nmk3odxw
 uOfPUxj/LlmaJFi3flf91aQQEcRKOHOUlzu4x+JGquYDGMkZbgpFSwnJViRj0g/a
 iLnQLAmKPGQsIPc8Ja9W4OSx3Xfg4szqa481HYQzDIjeLvkqiDdoIaQcUgXQjVMx
 E+GC10xvBiycRXViMqcNbVBggTjxi9U+0WoNI051OUut8S10i6WQmuTPEmZz5Oep
 hRkIsPncIrQC2t0nGep5xIDcvX9/56rmQHYG+4jutdPQAWuUzLxxb1JEfhhX1dvG
 h9+n0FCL+3wIvL4WO+enjKtqBFjqAWiktuG5MQBBG7nYfcjaSi1ATWhZm8I++L6N
 dRTxV4axZoJ88SmBpM8wIUQwUfBKqD7D6IPwU1OGj0bg9L4JnIo1c3GT0f5I0Gw0
 vp8dH5HmhJEWtRDprdnm/49emzZ+2wtlVs2jDXmrIy8hN31+dBiVC8T50kEy1350
 LY0aPSAOvg+DeGcw1aZ2OKxKj+XIKEIA+Huq7rAJRELMvxzfOhSUDbOdUr8R9eWR
 5wBbZKMgK7Eqg2lu2bfL/Clsj4YlSAuvKuLMXfXV6/sZo88BUJj8yFSqOADggHKn
 NcbVMcxn2XWDkUA5ol+F
 =Quu/
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Feature and fix for 4.5

- allow the runtime instrumentation support inside the guests
- remove a useless memory barrier
2016-01-08 19:03:03 +01:00
Namhyung Kim
775d8a1b0d perf evlist: Add --trace-fields option to show trace fields
To use dynamic sort keys, it might be good to add an option to see the
list of field names.

  $ perf evlist -i perf.data.sched
  sched:sched_switch
  sched:sched_stat_wait
  sched:sched_stat_sleep
  sched:sched_stat_iowait
  sched:sched_stat_runtime
  sched:sched_process_fork
  sched:sched_wakeup
  sched:sched_wakeup_new
  sched:sched_migrate_task
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

  $ perf evlist -i perf.data.sched --trace-fields
  sched:sched_switch: trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
  sched:sched_stat_wait: trace_fields: comm,pid,delay
  sched:sched_stat_sleep: trace_fields: comm,pid,delay
  sched:sched_stat_iowait: trace_fields: comm,pid,delay
  sched:sched_stat_runtime: trace_fields: comm,pid,runtime,vruntime
  sched:sched_process_fork: trace_fields: parent_comm,parent_pid,child_comm,child_pid
  sched:sched_wakeup: trace_fields: comm,pid,prio,success,target_cpu
  sched:sched_wakeup_new: trace_fields: comm,pid,prio,success,target_cpu
  sched:sched_migrate_task: trace_fields: comm,pid,prio,orig_cpu,dest_cpu

Committer notes:

For another file, in verbose mode:

  # perf evlist -v --trace-fields
  sched:sched_switch: type: 2, size: 112, config: 0x10b, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452125549-1511-5-git-send-email-namhyung@kernel.org
[ Replaced 'trace_fields=' with 'trace_fields: ' to make the output consistent in -v mode ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:23:02 -03:00
Jiri Olsa
5c0cf22477 perf record: Store data mmaps for dwarf unwind
Currently we don't synthesize data mmap by default. It depends on -d
option, that enables data address sampling.

But we've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.

Making data mmaps to be synthesized for dwarf unwind as well.

Reported-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160107133022.GA32115@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:17:45 -03:00
Jiri Olsa
0ba98149f8 perf libdw: Check for mmaps also in MAP__VARIABLE tree
We've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.

Reported-and-Tested-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:16:57 -03:00
Jiri Olsa
0ddf5246f7 perf unwind: Check for mmaps also in MAP__VARIABLE tree
We've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.

Reported-and-Tested-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:16:34 -03:00
Jiri Olsa
f22ed827a8 perf unwind: Use find_map function in access_dso_mem
The find_map helper is already there, so let's use it.

Also we're going to introduce wider search in following patch, so it'll
be easier to make this change on single place.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Noel Grandin <noelgrandin@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:16:12 -03:00
Jiri Olsa
d2190a8091 perf evlist: Remove perf_evlist__(enable|disable)_event functions
Replacing them with perf_evsel__(enable|disable).

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:15:43 -03:00