Commit Graph

125486 Commits

Author SHA1 Message Date
Alexander Shishkin
8ee83b2ab3 perf/x86/intel/pt: Add support for PTWRITE and power event tracing
The Intel PT facility grew some new functionality:

  * PTWRITE packet carries the payload of the new PTWRITE instruction
    that can be used to instrument Intel PT traces with user-supplied
    data. Packets of this type are only generated if 'ptwrite' capability
    is set and PTWEn bit is set in the event attribute's config. Flow
    update packets (FUP) can be generated on PTWRITE packets if FUPonPTW
    config bit is set. Setting these bits is not allowed if 'ptwrite'
    capability is not set.

  * PWRE, PWRX, MWAIT, EXSTOP packets communicate core power management
    events. These depend on 'power_event_tracing' capability and are
    enabled by setting PwrEvtEn bit in the event attribute.

Extend the driver capabilities and provide the proper sanity checks in the
event validation function.

[ tglx: Massaged changelog ]

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: vince@deater.net
Cc: eranian@google.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20160916134819.1978-1-alexander.shishkin@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 01:18:28 +02:00
Kan Liang
cd34cd97b7 perf/x86/intel/uncore: Add Skylake server uncore support
This patch implements the uncore monitoring driver for Skylake server.
The uncore subsystem in Skylake server is similar to previous
server. There are some differences in config register encoding and pci
device IDs. Besides, Skylake introduces many new boxes to reflect the
MESH architecture changes.

The control registers for IIO and UPI have been extended to 64 bit. This
patch also introduces event_mask_ext to handle the high 32 bit mask.

The CHA box number could vary for different machines. This patch gets
the CHA box number by counting the CHA register space during
initialization at runtime.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471378190-17276-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:18:52 +02:00
Harry Pan
2668c61956 perf/x86/rapl: Enable Apollo Lake RAPL support
This patch enables RAPL counters (energy consumption counters)
support for Intel Apollo Lake (Goldmont) processors (Model 92):

RAPL of Goldmont, unlikes ESU increment of Silvermont/Airmont,
it likes the Haswell microarchitecture in 1/2^ESU joules and
supports power domains in PP0/PP1/PKG/RAM.

ESU and power domains refer to Intel Software Developers' Manual,
Vol. 3C, Order No. 325384, Table 35-12.

Usage example:

  $ perf list
  $ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry.pan@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: bp@alien8.de
Cc: gs0622@gmail.com
Cc: hpa@zytor.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/1473325738-730-1-git-send-email-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:18:52 +02:00
Ingo Molnar
5006921837 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:17:54 +02:00
Peter Zijlstra
8ef9b8455a perf/x86/intel: Fix PEBSv3 record drain
Alexander hit the WARN_ON_ONCE(!event) on his Skylake while running
the perf fuzzer.

This means the PEBSv3 record included a status bit for an inactive
event, something that _should_ not happen.

Move the code that filters the status bits against our known PEBS
events up a spot to guarantee we only deal with events we know about.

Further add "continue" statements to the WARN_ON_ONCE()s such that
we'll not die nor generate silly events in case we ever do hit them
again.

Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: stable@vger.kernel.org
Fixes: a3d86542de ("perf/x86/intel/pebs: Add PEBSv3 decoding")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:15:39 +02:00
Alexander Shishkin
ef9ef3befa perf/x86/intel/bts: Kill a silly warning
At the moment, intel_bts will WARN() out if there is more than one
event writing to the same ring buffer, via SET_OUTPUT, and will only
send data from one event to a buffer.

There is no reason to have this warning in, so kill it.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-6-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:15:38 +02:00
Alexander Shishkin
4d4c474124 perf/x86/intel/bts: Fix BTS PMI detection
Since BTS doesn't have a dedicated PMI status bit, the driver needs to
take extra care to check for the condition that triggers it to avoid
spurious NMI warnings.

Regardless of the local BTS context state, the only way of knowing that
the NMI is ours is to compare the write pointer against the interrupt
threshold.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-5-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:15:38 +02:00
Alexander Shishkin
a9a94401c2 perf/x86/intel/bts: Fix confused ordering of PMU callbacks
The intel_bts driver is using a CPU-local 'started' variable to order
callbacks and PMIs and make sure that AUX transactions don't get messed
up. However, the ordering rules in regard to this variable is a complete
mess, which recently resulted in perf_fuzzer-triggered warnings and
panics.

The general ordering rule that is patch is enforcing is that this
cpu-local variable be set only when the cpu-local AUX transaction is
active; consequently, this variable is to be checked before the AUX
related bits can be touched.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-4-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:15:37 +02:00
Sebastian Andrzej Siewior
7d762e49c2 perf/x86/amd/uncore: Prevent use after free
The resent conversion of the cpu hotplug support in the uncore driver
introduced a regression due to the way the callbacks are invoked at
initialization time.

The old code called the prepare/starting/online function on each online cpu
as a block. The new code registers the hotplug callbacks in the core for
each state. The core invokes the callbacks at each registration on all
online cpus.

The code implicitely relied on the prepare/starting/online callbacks being
called as combo on a particular cpu, which was not obvious and completely
undocumented.

The resulting subtle wreckage happens due to the way how the uncore code
manages shared data structures for cpus which share an uncore resource in
hardware. The sharing is determined in the cpu starting callback, but the
prepare callback allocates per cpu data for the upcoming cpu because
potential sharing is unknown at this point. If the starting callback finds
a online cpu which shares the hardware resource it takes a refcount on the
percpu data of that cpu and puts the own data structure into a
'free_at_online' pointer of that shared data structure. The online callback
frees that.

With the old model this worked because in a starting callback only one non
unused structure (the one of the starting cpu) was available. The new code
allocates the data structures for all cpus when the prepare callback is
registered.

Now the starting function iterates through all online cpus and looks for a
data structure (skipping its own) which has a matching hardware id. The id
member of the data structure is initialized to 0, but the hardware id can
be 0 as well. The resulting wreckage is:

  CPU0 finds a matching id on CPU1, takes a refcount on CPU1 data and puts
  its own data structure into CPU1s data structure to be freed.

  CPU1 skips CPU0 because the data structure is its allegedly unsued own.
  It finds a matching id on CPU2, takes a refcount on CPU1 data and puts
  its own data structure into CPU2s data structure to be freed.

  ....

Now the online callbacks are invoked.

  CPU0 has a pointer to CPU1s data and frees the original CPU0 data. So
  far so good.

  CPU1 has a pointer to CPU2s data and frees the original CPU1 data, which
  is still referenced by CPU0 ---> Booom

So there are two issues to be solved here:

1) The id field must be initialized at allocation time to a value which
   cannot be a valid hardware id, i.e. -1

   This prevents the above scenario, but now CPU1 and CPU2 both stick their
   own data structure into the free_at_online pointer of CPU0. So we leak
   CPU1s data structure.

2) Fix the memory leak described in #1

   Instead of having a single pointer, use a hlist to enqueue the
   superflous data structures which are then freed by the first cpu
   invoking the online callback.

Ideally we should know the sharing _before_ invoking the prepare callback,
but that's way beyond the scope of this bug fix.

[ tglx: Rewrote changelog ]

Fixes: 96b2bd3866 ("perf/x86/amd/uncore: Convert to hotplug state machine")
Reported-and-tested-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20160909160822.lowgmkdwms2dheyv@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-10 00:00:06 +02:00
Jiri Olsa
79d102cbfd perf/x86/intel/cqm: Check cqm/mbm enabled state in event init
Yanqiu Zhang reported kernel panic when using mbm event
on system where CQM is detected but without mbm event
support, like with perf:

  # perf stat -e 'intel_cqm/event=3/' -a

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  IP: [<ffffffff8100d64c>] update_sample+0xbc/0xe0
  ...
   <IRQ>
   [<ffffffff8100d688>] __intel_mbm_event_init+0x18/0x20
   [<ffffffff81113d6b>] flush_smp_call_function_queue+0x7b/0x160
   [<ffffffff81114853>] generic_smp_call_function_single_interrupt+0x13/0x60
   [<ffffffff81052017>] smp_call_function_interrupt+0x27/0x40
   [<ffffffff816fb06c>] call_function_interrupt+0x8c/0xa0
  ...

The reason is that we currently allow to init mbm event
even if mbm support is not detected.  Adding checks for
both cqm and mbm events and support into cqm's event_init.

Fixes: 33c3cc7acf ("perf/x86/mbm: Add Intel Memory B/W Monitoring enumeration and init")
Reported-by: Yanqiu Zhang <yanqzhan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1473089407-21857-1-git-send-email-jolsa@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-06 10:42:12 +02:00
Stephane Eranian
24cf84672e perf/x86/intel/uncore: Handle non-standard counter offset
The offset of the counters for UPI and M2M boxes on Skylake server is
non-standard (8 bytes apart).

This patch introduces a custom flag UNCORE_BOX_FLAG_CTL_OFFS8 to
specially handle it.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471378190-17276-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-05 13:15:08 +02:00
Kan Liang
68ce4a0dea perf/x86/intel/uncore: Remove hard-coded implementation for Node ID mapping location
The method to build PCI bus to socket mapping is similar among
platforms. However, the PCI location which stores Node ID mapping could
vary between different platforms. For example, the Node ID mapping address
on Skylake server is different from the previous platform. Also, to
build the mapping for the PCI bus without UBOX, it has to start from
bus 0 on Skylake server.

This patch removes the current hardcoded implementation and adds
three parameters for snbep_pci2phy_map_init(). This way the Node ID mapping
address and bus searching direction can be configured according to
different platforms.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nilay Vaish <nilayvaish@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471378190-17276-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-05 13:15:07 +02:00
Ingo Molnar
2cc538412a Merge branch 'perf/urgent' into perf/core, to pick up fixed and resolve conflicts
Conflicts:
	kernel/events/core.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-05 12:09:59 +02:00
Linus Torvalds
9ca581b50d Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
 "A single fix for an AMD erratum so machines without a BIOS fix work"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/AMD: Apply erratum 665 on machines without a BIOS fix
2016-09-04 08:45:41 -07:00
Linus Torvalds
2bece1a010 - arm64 fix: debug exception unmasking on the CPU resume path
- ARM PMU fixes: memory leak on error path and NULL pointer dereference
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXyqG5AAoJEGvWsS0AyF7x1NsP/irMTRdy27zrV2PV5rhuA2Hx
 1xvpLD2vG5mUA89XDKFWwqXCw44VDdWv2TaqvzyVE+oGO1A/1wiH6ZVdqoYCsQ7x
 AcxlJv21i0jZTO/xlSVA/vayhTq3UHXveUcn+Ji5y6u9pLE7Eb7h922oFMV8DlvR
 7UM37TEXclDzSpmmUSynmhu/OwFKjzQGSo0J19cAj7bV/9V/si09Obh3IskcbrQe
 o5tQ+Eel4huWi99h2Uo8HPcsPkQ08Q1ISloSz4yWTEfIX4R0F3R2yP+hpJWuaOy6
 HoIeKE3dCj2PIUyMonvnTFLzBoddxL/N7H/J9XpyHvjG/bmngV2cQrjjIMBIKVC7
 zjw5UuYi39prP2KA+vJMmzkzad67GrEqsWk3NXMwqF43HAW2abAlhTNC2F/or1+V
 YxtTVYAlk7F+AnItVBIZfX43H2LqVl0RrnCfYN2z7+pn2sBcJdRH2IuRj+Pl+WZk
 ErKBZoDHd2NYEh3rNL6HhghDn9obNUVbeDtkMk7ZLRvI919sOuDqcMRRJ5VlBW1s
 EQQNjEQaRZxUOKTEWvSIcdKmr2jOrf8m35NlPAEbGBa5ZDtM86AbEBIHrOiZdCQT
 XN01tUFP/LwqgtjlgFnak806dwwZXeGzF/VqivS0oV8DfPVqtRn8xJrqR35dvcNx
 R4mC9bBozPzGWWwP65qV
 =aasV
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "arm64 and arm/perf fixes:

   - arm64 fix: debug exception unmasking on the CPU resume path

   - ARM PMU fixes: memory leak on error path and NULL pointer
     dereference"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kernel: Fix unmasked debug exceptions when restoring mdscr_el1
  drivers/perf: arm_pmu: Fix NULL pointer dereference during probe
  drivers/perf: arm_pmu: Fix leak in error path
2016-09-03 12:31:37 -07:00
Linus Torvalds
018c81b827 Staging/IIO driver fixes for 4.8-rc5
Here are a number of small fixes for staging and IIO drivers that
 resolve reported problems.
 
 Full details are in the shortlog.  All of these have been in linux-next
 with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iFYEABECABYFAlfK4HoPHGdyZWdAa3JvYWguY29tAAoJEDFH1A3bLfspt0MAn0wC
 dYhZOUHxOptLiEkVGXFCU9kzAJ4gETEbuGn9lgp2TFATOOAN7oqPUw==
 =6MKk
 -----END PGP SIGNATURE-----

Merge tag 'staging-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/IIO driver fixes from Greg KH:
 "Here are a number of small fixes for staging and IIO drivers that
  resolve reported problems.

  Full details are in the shortlog.  All of these have been in
  linux-next with no reported issues"

* tag 'staging-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (35 commits)
  arm: dts: rockchip: add reset node for the exist saradc SoCs
  arm64: dts: rockchip: add reset saradc node for rk3368 SoCs
  iio: adc: rockchip_saradc: reset saradc controller before programming it
  iio: accel: kxsd9: Fix raw read return
  iio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample
  iio: adc: ti_am335x_adc: Protect FIFO1 from concurrent access
  include/linux: fix excess fence.h kernel-doc notation
  staging: wilc1000: correctly check if associatedsta has not been found
  staging: wilc1000: NULL dereference on error
  staging: wilc1000: txq_event: Fix coding error
  MAINTAINERS: Add file patterns for ion device tree bindings
  MAINTAINERS: Update maintainer entry for wilc1000
  iio: chemical: atlas-ph-sensor: fix typo in val assignment
  iio: fix sched WARNING "do not call blocking ops when !TASK_RUNNING"
  staging: comedi: ni_mio_common: fix AO inttrig backwards compatibility
  staging: comedi: dt2811: fix a precedence bug
  staging: comedi: adv_pci1760: Do not return EINVAL for CMDF_ROUND_DOWN.
  staging: comedi: ni_mio_common: fix wrong insn_write handler
  staging: comedi: comedi_test: fix timer race conditions
  staging: comedi: daqboard2000: bug fix board type matching code
  ...
2016-09-03 11:33:33 -07:00
Emanuel Czirai
d199299675 x86/AMD: Apply erratum 665 on machines without a BIOS fix
AMD F12h machines have an erratum which can cause DIV/IDIV to behave
unpredictably. The workaround is to set MSRC001_1029[31] but sometimes
there is no BIOS update containing that workaround so let's do it
ourselves unconditionally. It is simple enough.

[ Borislav: Wrote commit message. ]

Signed-off-by: Emanuel Czirai <icanrealizeum@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Yaowu Xu <yaowu@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160902053550.18097-1-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-02 20:42:28 +02:00
Steven Rostedt
15301a5707 x86/paravirt: Do not trace _paravirt_ident_*() functions
Łukasz Daniluk reported that on a RHEL kernel that his machine would lock up
after enabling function tracer. I asked him to bisect the functions within
available_filter_functions, which he did and it came down to three:

  _paravirt_nop(), _paravirt_ident_32() and _paravirt_ident_64()

It was found that this is only an issue when noreplace-paravirt is added
to the kernel command line.

This means that those functions are most likely called within critical
sections of the funtion tracer, and must not be traced.

In newer kenels _paravirt_nop() is defined within gcc asm(), and is no
longer an issue.  But both _paravirt_ident_{32,64}() causes the
following splat when they are traced:

 mm/pgtable-generic.c:33: bad pmd ffff8800d2435150(0000000001d00054)
 mm/pgtable-generic.c:33: bad pmd ffff8800d3624190(0000000001d00070)
 mm/pgtable-generic.c:33: bad pmd ffff8800d36a5110(0000000001d00054)
 mm/pgtable-generic.c:33: bad pmd ffff880118eb1450(0000000001d00054)
 NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [systemd-journal:469]
 Modules linked in: e1000e
 CPU: 2 PID: 469 Comm: systemd-journal Not tainted 4.6.0-rc4-test+ #513
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 task: ffff880118f740c0 ti: ffff8800d4aec000 task.ti: ffff8800d4aec000
 RIP: 0010:[<ffffffff81134148>]  [<ffffffff81134148>] queued_spin_lock_slowpath+0x118/0x1a0
 RSP: 0018:ffff8800d4aefb90  EFLAGS: 00000246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011eb16d40
 RDX: ffffffff82485760 RSI: 000000001f288820 RDI: ffffea0000008030
 RBP: ffff8800d4aefb90 R08: 00000000000c0000 R09: 0000000000000000
 R10: ffffffff821c8e0e R11: 0000000000000000 R12: ffff880000200fb8
 R13: 00007f7a4e3f7000 R14: ffffea000303f600 R15: ffff8800d4b562e0
 FS:  00007f7a4e3d7840(0000) GS:ffff88011eb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f7a4e3f7000 CR3: 00000000d3e71000 CR4: 00000000001406e0
 Call Trace:
   _raw_spin_lock+0x27/0x30
   handle_pte_fault+0x13db/0x16b0
   handle_mm_fault+0x312/0x670
   __do_page_fault+0x1b1/0x4e0
   do_page_fault+0x22/0x30
   page_fault+0x28/0x30
   __vfs_read+0x28/0xe0
   vfs_read+0x86/0x130
   SyS_read+0x46/0xa0
   entry_SYSCALL_64_fastpath+0x1e/0xa8
 Code: 12 48 c1 ea 0c 83 e8 01 83 e2 30 48 98 48 81 c2 40 6d 01 00 48 03 14 c5 80 6a 5d 82 48 89 0a 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 18 09 eb 02 f3 90 8b

Reported-by: Łukasz Daniluk <lukasz.daniluk@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-02 09:40:47 -07:00
James Morse
744c6c37cc arm64: kernel: Fix unmasked debug exceptions when restoring mdscr_el1
Changes to make the resume from cpu_suspend() code behave more like
secondary boot caused debug exceptions to be unmasked early by
__cpu_setup(). We then go on to restore mdscr_el1 in cpu_do_resume(),
potentially taking break or watch points based on uninitialised registers.

Mask debug exceptions in cpu_do_resume(), which is specific to resume
from cpu_suspend(). Debug exceptions will be restored to their original
state by local_dbg_restore() in cpu_suspend(), which runs after
hw_breakpoint_restore() has re-initialised the other registers.

Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Fixes: cabe1c81ea ("arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va")
Cc: <stable@vger.kernel.org> # 4.7+
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-02 17:19:55 +01:00
Arnd Bergmann
236dec0510 kconfig: tinyconfig: provide whole choice blocks to avoid warnings
Using "make tinyconfig" produces a couple of annoying warnings that show
up for build test machines all the time:

    .config:966:warning: override: NOHIGHMEM changes choice state
    .config:965:warning: override: SLOB changes choice state
    .config:963:warning: override: KERNEL_XZ changes choice state
    .config:962:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state
    .config:933:warning: override: SLOB changes choice state
    .config:930:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state
    .config:870:warning: override: SLOB changes choice state
    .config:868:warning: override: KERNEL_XZ changes choice state
    .config:867:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state

I've made a previous attempt at fixing them and we discussed a number of
alternatives.

I tried changing the Makefile to use "merge_config.sh -n
$(fragment-list)" but couldn't get that to work properly.

This is yet another approach, based on the observation that we do want
to see a warning for conflicting 'choice' options, and that we can
simply make them non-conflicting by listing all other options as
disabled.  This is a trivial patch that we can apply independent of
plans for other changes.

Link: http://lkml.kernel.org/r/20160829214952.1334674-2-arnd@arndb.de
Link: https://storage.kernelci.org/mainline/v4.7-rc6/x86-tinyconfig/build.log
https://patchwork.kernel.org/patch/9212749/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-01 17:52:01 -07:00
Josh Poimboeuf
0d025d271e mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:

1) "copy_from_user() buffer size is too small" compile warning/error

   This is a static warning which happens when object size and copy size
   are both const, and copy size > object size.  I didn't see any false
   positives for this one.  So the function warning attribute seems to
   be working fine here.

   Note this scenario is always a bug and so I think it should be
   changed to *always* be an error, regardless of
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.

2) "copy_from_user() buffer size is not provably correct" compile warning

   This is another static warning which happens when I enable
   __compiletime_object_size() for new compilers (and
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS).  It happens when object size
   is const, but copy size is *not*.  In this case there's no way to
   compare the two at build time, so it gives the warning.  (Note the
   warning is a byproduct of the fact that gcc has no way of knowing
   whether the overflow function will be called, so the call isn't dead
   code and the warning attribute is activated.)

   So this warning seems to only indicate "this is an unusual pattern,
   maybe you should check it out" rather than "this is a bug".

   I get 102(!) of these warnings with allyesconfig and the
   __compiletime_object_size() gcc check removed.  I don't know if there
   are any real bugs hiding in there, but from looking at a small
   sample, I didn't see any.  According to Kees, it does sometimes find
   real bugs.  But the false positive rate seems high.

3) "Buffer overflow detected" runtime warning

   This is a runtime warning where object size is const, and copy size >
   object size.

All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:

  2fb0815c9e ("gcc4: disable __compiletime_object_size for GCC 4.6+")

That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size().  But in fact,
__compiletime_object_size() seems to be working fine.  The false
positives were instead triggered by #2 above.  (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)

So remove warning #2 to get rid of all the false positives, and re-enable
warnings #1 and #3 by reverting the above commit.

Furthermore, since #1 is a real bug which is detected at compile time,
upgrade it to always be an error.

Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-30 10:10:21 -07:00
Linus Torvalds
1f6a563ee0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Segregate namespaces properly in conntrack dumps, from Liping Zhang.

 2) tcp listener refcount fix in netfilter tproxy, from Eric Dumazet.

 3) Fix timeouts in qed driver due to xmit_more, from Yuval Mintz.

 4) Fix use-after-free in tcp_xmit_retransmit_queue().

 5) Userspace header fixups (use of __u32, missing includes, etc.) from
    Mikko Rapeli.

 6) Further refinements to fragmentation wrt gso and tunnels, from
    Shmulik Ladkani.

 7) Trigger poll correctly for zero length UDP packets, from Eric
    Dumazet.

 8) TCP window scaling fix, also from Eric Dumazet.

 9) SLAB_DESTROY_BY_RCU is not relevant any more for UDP sockets.

10) Module refcount leak in qdisc_create_dflt(), from Eric Dumazet.

11) Fix deadlock in cp_rx_poll() of 8139cp driver, from Gao Feng.

12) Memory leak in rhashtable's alloc_bucket_locks(), from Eric Dumazet.

13) Add new device ID to alx driver, from Owen Lin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits)
  Add Killer E2500 device ID in alx driver.
  net: smc91x: fix SMC accesses
  Documentation: networking: dsa: Remove platform device TODO
  net/mlx5: Increase number of ethtool steering priorities
  net/mlx5: Add error prints when validate ETS failed
  net/mlx5e: Fix memory leak if refreshing TIRs fails
  net/mlx5e: Add ethtool counter for TX xmit_more
  net/mlx5e: Fix ethtool -g/G rx ring parameter report with striding RQ
  net/mlx5e: Don't wait for SQ completions on close
  net/mlx5e: Don't post fragmented MPWQE when RQ is disabled
  net/mlx5e: Don't wait for RQ completions on close
  net/mlx5e: Limit UMR length to the device's limitation
  rhashtable: fix a memory leak in alloc_bucket_locks()
  sfc: fix potential stack corruption from running past stat bitmask
  team: loadbalance: push lacpdus to exact delivery
  net: hns: dereference ppe_cb->ppe_common_cb if it is non-null
  8139cp: Fix one possible deadloop in cp_rx_poll
  i40e: Change some init flow for the client
  Revert "phy: IRQ cannot be shared"
  net: dsa: bcm_sf2: Fix race condition while unmasking interrupts
  ...
2016-08-29 12:29:13 -07:00
Linus Torvalds
2a90309e06 powerpc fixes for 4.8 #4
Andrew Donnellan (1):
       cxl: use pcibios_free_controller_deferred() when removing vPHBs
 
 Andrzej Hajda (1):
       powerpc/powernv/pci: fix iterator signedness
 
 Boqun Feng (1):
       powerpc, hotplug: Avoid to touch non-existent cpumasks.
 
 Christophe Leroy (1):
       powerpc: sysdev: cpm: fix gpio save_regs functions
 
 Cyril Bur (1):
       powerpc: signals: Discard transaction state from signal frames
 
 Guenter Roeck (1):
       powerpc: cputhreads: Add missing include file
 
 Markus Elfring (3):
       drivers/macintosh: Delete owner assignment
       powerpc/512x: Delete unnecessary assignment for the field "owner"
       powerpc: mpc8349emitx: Delete unnecessary assignment for the field "owner"
 
 Mauricio Faria de Oliveira (1):
       powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)
 
 Michael Ellerman (1):
       powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
 
 Mukesh Ojha (1):
       powerpc/powernv : Drop reference added by kset_find_obj()
 
 Nicholas Piggin (3):
       powerpc/pseries: PACA save area fix for general exception vs MCE
       powerpc/pseries: PACA save area fix for MCE vs MCE
       powerpc/tm: do not use r13 for tabort_syscall
 
 Paolo Bonzini (1):
       powerpc: move hmi.c to arch/powerpc/kvm/
 
 Paul Gortmaker (1):
       powerpc: migrate exception table users off module.h and onto extable.h
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXw7jmAAoJEHM62YSLdExeu6YP+gMNWkW4CDjvIo8OsVMhYKqX
 RP3c6xa8APcizfdxssKMwX+STP6nfEzRUMCsDMmChSSpb1ZtSNdwM2t2zmlLDe8r
 wKHN7Chqw1EJsnIT6TLa79fvHft03MyGQCTjeRvr54aFcTVJHuAgZnTeuejFBrtN
 hHCsO/pEv+KjgGlDkN+UvEMUs0quI5Ri8kRJpDxO6il1agcPpeYFYB3ksIcgwaVH
 0X/MwLJbzENmiDLhtb3S0jQWGMijmpPbcz5b3z36gIpTxddEPnC9UhYCdc924V/C
 sREuku7rihXDEda6B/qNpk8kOxXxIkE/dh5eWVMyWVDHIXJA6hy4z0zpoF9xe4Ba
 H2/ERyCfsP3LqonOpDOWKs/tkTAmMXOwdfgr+geoc4vDsbR2X8jyjHqb3bdj7J2A
 YtcjmQ/w9HCnzC0weme1+0xR2tfyjUciNOdNoRHP/shLAObzc14xXlpg0IZ3eou2
 oozGqkHnGoA6SrsqezF0UmsAf3884BcVscY1N45p1s8ulMRkyHTgPHM8KnGpHoHq
 1zgUZji2F4VFgjGrMJyMZ82c3usP3/N65d2nkvRv/QzkmdTat4OmXLiB0Qky26Wy
 gW7GsyRpGdEvW/KxhLxuobxPw7xg4mGQ3MdUMWzL3Df17IuRquuFhjljeQIUsQDE
 I7NVfZATWJ7mqJQU2s5L
 =+1I0
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Ben Herrenschmidt:
 "This was meant to be sent early last week, but I has a change pending
  on one of the fixes and other things made me forget all about.  Ugh.

  We have some misc fixes for powerpc 4.8.  Some trivial bits and some
  regressions, and a trivial cleanup or two that I saw no point in
  letting rot in patchwork"

* tag 'powerpc-4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: signals: Discard transaction state from signal frames
  powerpc/powernv : Drop reference added by kset_find_obj()
  powerpc/tm: do not use r13 for tabort_syscall
  powerpc: move hmi.c to arch/powerpc/kvm/
  powerpc: sysdev: cpm: fix gpio save_regs functions
  powerpc/pseries: PACA save area fix for MCE vs MCE
  powerpc/pseries: PACA save area fix for general exception vs MCE
  powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
  powerpc, hotplug: Avoid to touch non-existent cpumasks.
  powerpc: migrate exception table users off module.h and onto extable.h
  powerpc/powernv/pci: fix iterator signedness
  powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)
  cxl: use pcibios_free_controller_deferred() when removing vPHBs
  powerpc: mpc8349emitx: Delete unnecessary assignment for the field "owner"
  powerpc/512x: Delete unnecessary assignment for the field "owner"
  drivers/macintosh: Delete owner assignment
  powerpc: cputhreads: Add missing include file
2016-08-29 12:12:15 -07:00
Russell King
2fb04fdf30 net: smc91x: fix SMC accesses
Commit b70661c708 ("net: smc91x: use run-time configuration on all ARM
machines") broke some ARM platforms through several mistakes.  Firstly,
the access size must correspond to the following rule:

(a) at least one of 16-bit or 8-bit access size must be supported
(b) 32-bit accesses are optional, and may be enabled in addition to
    the above.

Secondly, it provides no emulation of 16-bit accesses, instead blindly
making 16-bit accesses even when the platform specifies that only 8-bit
is supported.

Reorganise smc91x.h so we can make use of the existing 16-bit access
emulation already provided - if 16-bit accesses are supported, use
16-bit accesses directly, otherwise if 8-bit accesses are supported,
use the provided 16-bit access emulation.  If neither, BUG().  This
exactly reflects the driver behaviour prior to the commit being fixed.

Since the conversion incorrectly cut down the available access sizes on
several platforms, we also need to go through every platform and fix up
the overly-restrictive access size: Arnd assumed that if a platform can
perform 32-bit, 16-bit and 8-bit accesses, then only a 32-bit access
size needed to be specified - not so, all available access sizes must
be specified.

This likely fixes some performance regressions in doing this: if a
platform does not support 8-bit accesses, 8-bit accesses have been
emulated by performing a 16-bit read-modify-write access.

Tested on the Intel Assabet/Neponset platform, which supports only 8-bit
accesses, which was broken by the original commit.

Fixes: b70661c708 ("net: smc91x: use run-time configuration on all ARM machines")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-28 23:44:55 -04:00
Cyril Bur
78a3e8889b powerpc: signals: Discard transaction state from signal frames
Userspace can begin and suspend a transaction within the signal
handler which means they might enter sys_rt_sigreturn() with the
processor in suspended state.

sys_rt_sigreturn() wants to restore process context (which may have
been in a transaction before signal delivery). To do this it must
restore TM SPRS. To achieve this, any transaction initiated within the
signal frame must be discarded in order to be able to restore TM SPRs
as TM SPRs can only be manipulated non-transactionally..
>From the PowerPC ISA:
  TM Bad Thing Exception [Category: Transactional Memory]
   An attempt is made to execute a mtspr targeting a TM register in
   other than Non-transactional state.

Not doing so results in a TM Bad Thing:
[12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable]
[12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033)
[12045.221540] Oops: Unrecoverable exception, sig: 6 [#1]
[12045.221586] SMP NR_CPUS=2048 NUMA PowerNV
[12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
 nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter
 ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm
 uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure
 scsi_transport_sas bnx2x ipr mdio libcrc32c
[12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 #34
[12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000
[12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000
[12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700   Not tainted (4.7.0)
[12045.222418] MSR: 9000000300201033 <SF,HV,ME,IR,DR,RI,LE,TM[SE]> CR: 28444280  XER: 20000000
[12045.222625] CFAR: c0000000000163b8 SOFTE: 0 PACATMSCRATCH: 900000014280f033
GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0
GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000
GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000
GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0
[12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c
[12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0
[12045.223630] Call Trace:
[12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0
[12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108
[12045.223806] Instruction dump:
[12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
[12045.223955] 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
[12045.224074] ---[ end trace cb8002ee240bae76 ]---

It isn't clear exactly if there is really a use case for userspace
returning with a suspended transaction, however, doing so doesn't (on
its own) constitute a bad frame. As such, this patch simply discards
the transactional state of the context calling the sigreturn and
continues.

Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-29 12:48:40 +10:00
Mukesh Ojha
a9cbf0b219 powerpc/powernv : Drop reference added by kset_find_obj()
In a situation, where Linux kernel gets notified about duplicate error log
from OPAL, it is been observed that kernel fails to remove sysfs entries
(/sys/firmware/opal/elog/0xXXXXXXXX) of such error logs. This is because,
we currently search the error log/dump kobject in the kset list via
'kset_find_obj()' routine. Which eventually increment the reference count
by one, once it founds the kobject.

So, unless we decrement the reference count by one after it found the kobject,
we would not be able to release the kobject properly later.

This patch adds the 'kobject_put()' which was missing earlier.

Signed-off-by: Mukesh Ojha <mukesh02@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-29 12:48:21 +10:00
Nicholas Piggin
cc7786d3ee powerpc/tm: do not use r13 for tabort_syscall
tabort_syscall runs with RI=1, so a nested recoverable machine
check will load the paca into r13 and overwrite what we loaded
it with, because exceptions returning to privileged mode do not
restore r13.

Fixes: b4b56f9eca (powerpc/tm: Abort syscalls in active transactions)
Cc: stable@vger.kernel.org
Signed-off-by: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-29 12:47:56 +10:00
Linus Torvalds
5d84ee7964 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
 "A single bugfix to prevent irq remapping when the ioapic is disabled"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Do not init irq remapping if ioapic is disabled
2016-08-28 10:00:21 -07:00
Linus Torvalds
af56ff27eb * ARM fixes:
** fixes for ITS init issues, error handling, IRQ leakage, race conditions
 ** An erratum workaround for timers
 ** Some removal of misleading use of errors and comments
 ** A fix for GICv3 on 32-bit guests
 * MIPS fix where the guest could wrongly map the first page of physical memory
 * x86 nested virtualization fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXtyfVAAoJEL/70l94x66Dhe4IAIOGI/OYVWU5IfUQ01oeRgD3
 7wN222OmyC/K0/hSZc7ndRdcQfr5ombgM9XsS/EbkcRacWxAUHDX2FaYMpKgjT2M
 Dnh2tJHuPz/4VtByGQ2fZ4hziK7amn18/MtPFCee+mIj0ya2fcWZ4qHVU8pKC6Ps
 mVVZ0kxXsdV4pw9y6XgBLz/4bTLeASKvhFZrWOnjJoa+GeH2MFwocS0xaEI0HwxP
 HVwcgoRdGXJuKUB9jE9FDWmWOgdoLnCG1bNUOvXKPcE0ZaFQDT4I4dImkBys3rqz
 jbqnhLrpGEY2ZC3Rj+VyD2MOXbYOOSi59GRwYmCkqD96ZarHxSu3PdyCxmIFWzM=
 =+4WK
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "ARM:
   - fixes for ITS init issues, error handling, IRQ leakage, race
     conditions
   - an erratum workaround for timers
   - some removal of misleading use of errors and comments
   - a fix for GICv3 on 32-bit guests

  MIPS:
   - fix for where the guest could wrongly map the first page of
     physical memory

  x86:
   - nested virtualization fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  MIPS: KVM: Check for pfn noslot case
  kvm: nVMX: fix nested tsc scaling
  KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
  KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC
  arm64: KVM: report configured SRE value to 32-bit world
  arm64: KVM: remove misleading comment on pmu status
  KVM: arm/arm64: timer: Workaround misconfigured timer interrupt
  arm64: Document workaround for Cortex-A72 erratum #853709
  KVM: arm/arm64: Change misleading use of is_error_pfn
  KVM: arm64: ITS: avoid re-mapping LPIs
  KVM: arm64: check for ITS device on MSI injection
  KVM: arm64: ITS: move ITS registration into first VCPU run
  KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
  KVM: arm64: vgic-its: Plug race in vgic_put_irq
  KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
  KVM: arm64: ITS: return 1 on successful MSI injection
2016-08-27 15:51:50 -07:00
Linus Torvalds
5e608a0270 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "11 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: silently skip readahead for DAX inodes
  dax: fix device-dax region base
  fs/seq_file: fix out-of-bounds read
  mm: memcontrol: avoid unused function warning
  mm: clarify COMPACTION Kconfig text
  treewide: replace config_enabled() with IS_ENABLED() (2nd round)
  printk: fix parsing of "brl=" option
  soft_dirty: fix soft_dirty during THP split
  sysctl: handle error writing UINT_MAX to u32 fields
  get_maintainer: quiet noisy implicit -f vcs_file_exists checking
  byteswap: don't use __builtin_bswap*() with sparse
2016-08-26 23:12:12 -07:00
Linus Torvalds
65fc7d54ef ARM64 fix to avoid potential TLB conflict when CONFIG_RANDOMIZE_BASE is
enabled.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXwH52AAoJEGvWsS0AyF7xQukP/1c9tMKzvX2tEaEenwv5nIvV
 /fUfYzdPl97+KHl50N8CPY3xomHUo5kR+p/PDMi/yH2wljOCUmOI7alpWOrBizK9
 GbaGGU4rFsfmRlyxWkMfnYWjKkiXdDR4RPGwvW3SOC7sUDnI4od+1Dk9r9O/vomw
 14tggbL6HCR66MpcuH+yDCxBKvDnUuu8YPWoT3CX+VJCEPLXquO+d1GQKo28iYdu
 GxAxGB3NCNFiGyTIifLZk7OdozniCkCblhd+wdwRmMMPPgA75aueuUFsvaOMpo6a
 Tc5DoVTgLHiqAcJN7GWTKvrQeocxOXSJkEns4iTEmYuCVackwokvaLu6cFFxET5P
 zdG4crOjfoGPUvKB2yM3RnbYSRKGKOBkPsL/mqtyu2JgApPCqwLK6EQ6g0fWwouU
 OXvhmVyUGG53VXFYfPu1EzsE/ilRVzevHSlt3hHki9YkwXbKTgGfS7WQWQD6dkUe
 ks8gUpSmLBeW7SRVfB0YRrN3cFCd5dbkelES5S/97zHWkpIJy4ZoFKGQYOV3oj18
 rjHq1miPD40abe2IhbdZSz2mtiYpXjKfRvwuGc/MPySaB+WBIzHYwrGuiDvrqlgm
 nmFAeRMA8ulqq3VM5VklJbyk8I9luC7tHojKKgzWru711awuiRTi38bJaaAG1piJ
 bfvWK/pICaat+BIQp9wh
 =LYZy
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull ARM64 fix from Catalin Marinas:
 "ARM64 fix to avoid potential TLB conflict when CONFIG_RANDOMIZE_BASE
  is enabled"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASE
2016-08-26 23:05:19 -07:00
Linus Torvalds
219c04cea3 PCI updates for v4.8:
Resource management
     Update "pci=resource_alignment" documentation (Mathias Koehrer)
 
   MSI
     Use positive flags in pci_alloc_irq_vectors() (Christoph Hellwig)
     Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors() (Christoph Hellwig)
 
   Intel VMD host bridge driver
     Fix infinite loop executing irq's (Keith Busch)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXvvHgAAoJEFmIoMA60/r8y+MP/3cXG5FG3fdRB9Pn8q5BvdD4
 hmIgOpuEH859iYFOUJNxVB6haxG73Z+z3aFvNUDKTOh7GXQ7UHzNw1qj1OEmjXS0
 ex45ynjxP2DTBIO3eoyLpTKaRuOCAz03sDIucK9NzSBkrazBGoaxRRYY5gi+V8K9
 CNsbT8pPSGOrfRX6nk2QjiGMHVw+ExurvskbwuQmRYmHGU1Y1nnkmlnT5AMUJsn+
 BgSLOLFkv9UOIfLVY2+/KCnPFfi0nIxspCJHinosLl5BsgULgFPdRFKr6SiY+yhx
 W3d8MKnhDX5wsajtE7V/HurYmETiyimLL1ZBPJOCzGXwNGrwvUrt1rji+83S+1Zx
 LbrPGDRNBLW51prcrxOx1Pr/RDaHT0+/2UrSvaeUnTqniV5xW1daBPvB6h8jbKih
 C+GGqfiHURvJi1sd9QUMP95yVaxVgSBM9BrKfNYAmVYarnCpx3xbdeMv8gA0+QHx
 Ts0QKJ7ph80mcztdk2H0Ov7OK+OGjDhZPDijXiCTSPfKgK0HPsaytDCULt9LcqAv
 M8daeD42nwe36xOlNgcosj++gL+VxVYk/4BAiOWCgwF+k2EDh1ol8/eK6a+hXAqC
 qslLMNNucZP5BSsSIYc5KVGbkEXuHTw8viUNvBhLE2CWD3SY378IlzrF7QxjgKNl
 AawNqjfXY7obX3m5xr9g
 =qJG4
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Resource management:
   - Update "pci=resource_alignment" documentation (Mathias Koehrer)

  MSI:
   - Use positive flags in pci_alloc_irq_vectors() (Christoph Hellwig)
   - Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors() (Christoph Hellwig)

  Intel VMD host bridge driver:
   - Fix infinite loop executing irq's (Keith Busch)"

* tag 'pci-v4.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  x86/PCI: VMD: Fix infinite loop executing irq's
  PCI: Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors()
  PCI: Use positive flags in pci_alloc_irq_vectors()
  PCI: Update "pci=resource_alignment" documentation
2016-08-26 18:26:07 -07:00
Masahiro Yamada
a5ff1b34e1 treewide: replace config_enabled() with IS_ENABLED() (2nd round)
Commit 97f2645f35 ("tree-wide: replace config_enabled() with
IS_ENABLED()") mostly killed config_enabled(), but some new users have
appeared for v4.8-rc1.  They are all used for a boolean option, so can
be replaced with IS_ENABLED() safely.

Link: http://lkml.kernel.org/r/1471970749-24867-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26 17:39:35 -07:00
Mark Rutland
fd363bd417 arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASE
When CONFIG_RANDOMIZE_BASE is selected, we modify the page tables to remap the
kernel at a newly-chosen VA range. We do this with the MMU disabled, but do not
invalidate TLBs prior to re-enabling the MMU with the new tables. Thus the old
mappings entries may still live in TLBs, and we risk violating
Break-Before-Make requirements, leading to TLB conflicts and/or other issues.

We invalidate TLBs when we uninsall the idmap in early setup code, but prior to
this we are subject to issues relating to the Break-Before-Make violation.

Avoid these issues by invalidating the TLBs before the new mappings can be
used by the hardware.

Fixes: f80fb3a3d5 ("arm64: add support for kernel ASLR")
Cc: <stable@vger.kernel.org> # 4.6+
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-25 11:11:32 +01:00
Linus Torvalds
4935e04ef4 Merge branch 'for-linus-4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fix from Richard Weinberger:
 "This contains a fix for a build regression introduced during the merge
  window"

* 'for-linus-4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Don't discard .text.exit section
2016-08-24 16:04:59 -04:00
Linus Torvalds
fe2dd21282 xen: regression fix for 4.8-rc3
- Fix a regression in the xenbus device preventing userspace tools
   from working.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXvdugAAoJEFxbo/MsZsTRAwEH/AiKLV4T0OiARv/df827WVnL
 obUmEAh/wVSWZh2xdUNurDOH64lEfeBDSBIpGPQMLGmXLzNEQO9u8ZJYWJ7R1Ryp
 JU37lu3DP7HqQqTXsy8ltgcBkwVaQZAo0GRtDeua80ZPdjulnZirwHWS48TuNIFF
 pVtW4Eoy1BNAVri55o5hOIub4HUKMRoNB/J+o+SKLyJEvOon+qD4pOfIhR3sqeja
 oYVX7QpY/4Miymd5uI9v8LUefS4PW/U58a7tjr414Ng4mzQbZOHDmNyWF0CH27lj
 INAmgMXDG7RtiSQMWPKtDQUvuefApKoeRmFr6mQ/xHyCX3cAzOw07+p0rKacCig=
 =PTX1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.8b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen regression fix from David Vrabel:
 "Fix a regression in the xenbus device preventing userspace tools from
  working"

* tag 'for-linus-4.8b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: change the type of xen_vcpu_id to uint32_t
  xenbus: don't look up transaction IDs for ordinary writes
2016-08-24 14:04:30 -04:00
Vitaly Kuznetsov
55467dea29 xen: change the type of xen_vcpu_id to uint32_t
We pass xen_vcpu_id mapping information to hypercalls which require
uint32_t type so it would be cleaner to have it as uint32_t. The
initializer to -1 can be dropped as we always do the mapping before using
it and we never check the 'not set' value anyway.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24 18:17:27 +01:00
Wanpeng Li
2e63ad4bd5 x86/apic: Do not init irq remapping if ioapic is disabled
native_smp_prepare_cpus
  -> default_setup_apic_routing
    -> enable_IR_x2apic
      -> irq_remapping_prepare
        -> intel_prepare_irq_remapping
          -> intel_setup_irq_remapping		  

So IR table is setup even if "noapic" boot parameter is added. As a result we
crash later when the interrupt affinity is set due to a half initialized
remapping infrastructure.

Prevent remap initialization when IOAPIC is disabled.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Link: http://lkml.kernel.org/r/1471954039-3942-1-git-send-email-wanpeng.li@hotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-08-24 09:45:40 +02:00
Keith Busch
21c80c9fef x86/PCI: VMD: Fix infinite loop executing irq's
We can't initialize the list head on deletion as this causes the node to
point to itself, which causes an infinite loop if vmd_irq() happens to be
servicing that node.

The list initialization was trying to fix a bug from multiple calls to
disable the same IRQ.  Fix this instead by having the VMD driver track if
the interrupt is enabled.

[bhelgaas: changelog, add "Fixes"]
Fixes: 97e9230635 ("x86/PCI: VMD: Initialize list item in IRQ disable")
Reported-by: Grzegorz Koczot <grzegorz.koczot@intel.com>
Tested-by: Miroslaw Drost <miroslaw.drost@intel.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by Jon Derrick: <jonathan.derrick@intel.com>
2016-08-23 16:36:42 -05:00
Andrey Ryabinin
dad2232844 um: Don't discard .text.exit section
Commit e41f501d39 ("vmlinux.lds: account for destructor sections")
added '.text.exit' to EXIT_TEXT which is discarded at link time by default.
This breaks compilation of UML:
     `.text.exit' referenced in section `.fini_array' of
     /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libc.a(sdlerror.o):
     defined in discarded section `.text.exit' of
     /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libc.a(sdlerror.o)

Apparently UML doesn't want to discard exit text, so let's place all EXIT_TEXT
sections in .exit.text.

Fixes: e41f501d39 ("vmlinux.lds: account for destructor sections")
Reported-by: Stefan Traby <stefan@hello-penguin.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-08-23 23:16:16 +02:00
Linus Torvalds
d1fdafa10f Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a number of memory corruption bugs in the newly added
  sha256-mb/sha256-mb code"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: sha512-mb - fix ctx pointer
  crypto: sha256-mb - fix ctx pointer and digest copy
2016-08-23 14:29:00 -04:00
Caesar Wang
3d4267a5a3 arm: dts: rockchip: add reset node for the exist saradc SoCs
SARADC controller needs to be reset before programming it, otherwise
it will not function properly.

Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-08-23 19:08:29 +01:00
Caesar Wang
78ec79bfd5 arm64: dts: rockchip: add reset saradc node for rk3368 SoCs
SARADC controller needs to be reset before programming it, otherwise
it will not function properly.

Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-08-23 19:08:21 +01:00
Linus Torvalds
ef0e1ea885 ARC Fixes for 4.8-rc4
- Support for Syscall ABI v4 with upstream gcc 6.x
 
 - Lockdep fix (Daniel Mentz)
 
 - gdb register clobber (Liav Rehana)
 
 - Couple of missing exports for modules
 
 - Other fixes here and there
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXuz5EAAoJEGnX8d3iisJeGmAP/AjKaBVrSsQCJW9S9Xg5Tcx8
 30UPJOqRI25HkQVmeQaehNgtFPx2nN8oqOeUA4edFMJXoE44cBvPPvuBeAKYj7qV
 ROv6ssomJt/DdoRdkbnUZqh0nLrQwR0srYkiWLqQp9zlxUpwfCM2tHah2RB3xV0d
 Tet7nIAgUxEt42+rfSNbhZUphwHebvh7fbu1czDRr1L78fp266XM84n3uQj8aTpC
 3QK4ddWJU8qPU24kVa7kLg45cCw4W2KUHGBzJmZWeUtv/04+t6wCZQu0tOeZ4/Mm
 WnbCRnJrvYi+LjnXi+7ymmMN/qd+FOeRQ4MWLHcC7GBCChQ/2WCJVM4bDSfWCzWa
 qe3aDRd7cq9Yjyzf3j34tDwhYQirwNRkYI7ps9fjsSmDMDM6hlXwNry+a6Y27Z4O
 AFfBCHJDFhKAflm34ryskiDWQotZ30JtuFRgKKK3oWLeAOL/foDW8nbLea5AB2Rd
 CtPIZTwKq+MQW6l/24V4F5kQNZZA6IuaqwSwugNAZLaONm/OsxXMMomo9RTfV+xH
 Z4i3dQHvwNrGfBoYABdP+QBDibkqdX0a3y5H/4wIyZAe4pKw4VntsYhV/bsCHTp8
 GuFeaR86ii48RmwR40gtaYQ4/CZFKsw3eQk+aAcCPsONVy3hIpsgeaAceIlQxuDD
 LDvbsaUU4a1xm5PhtbUN
 =1WGt
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.8-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - support for Syscall ABI v4 with upstream gcc 6.x

 - lockdep fix (Daniel Mentz)

 - gdb register clobber (Liav Rehana)

 - couple of missing exports for modules

 - other fixes here and there

* tag 'arc-4.8-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: export __udivdi3 for modules
  ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS
  ARC: export kmap
  ARC: Support syscall ABI v4
  ARC: use correct offset in pt_regs for saving/restoring user mode r25
  ARC: Elide redundant setup of DMA callbacks
  ARC: Call trace_hardirqs_on() before enabling irqs
2016-08-22 17:53:02 -05:00
Paolo Bonzini
7c379526d7 powerpc: move hmi.c to arch/powerpc/kvm/
hmi.c functions are unused unless sibling_subcore_state is nonzero, and
that in turn happens only if KVM is in use.  So move the code to
arch/powerpc/kvm/, putting it under CONFIG_KVM_BOOK3S_HV_POSSIBLE
rather than CONFIG_PPC_BOOK3S_64.  The sibling_subcore_state is also
included in struct paca_struct only if KVM is supported by the kernel.

Cc: Daniel Axtens <dja@axtens.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: kvm-ppc@vger.kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Christophe Leroy
41017a7579 powerpc: sysdev: cpm: fix gpio save_regs functions
of_mm_gpiochip_add_data() calls mm_gc->save_regs() before
setting the data. Therefore ->save_regs() cannot use
gpiochip_get_data()

[    0.275940] Unable to handle kernel paging request for data at address 0x00000130
[    0.283120] Faulting instruction address: 0xc01b44cc
[    0.288175] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.293343] PREEMPT CMPC885
[    0.296141] CPU: 0 PID: 1 Comm: swapper Not tainted 4.7.0-g65124df-dirty #68
[    0.304131] task: c6074000 ti: c6080000 task.ti: c6080000
[    0.309459] NIP: c01b44cc LR: c0011720 CTR: c0011708
[    0.314372] REGS: c6081d90 TRAP: 0300   Not tainted  (4.7.0-g65124df-dirty)
[    0.322267] MSR: 00009032 <EE,ME,IR,DR,RI>  CR: 24000028  XER: 20000000
[    0.328813] DAR: 00000130 DSISR: c0000000
GPR00: c01b6d0c c6081e40 c6074000 c6017000 c9028000 c601d028 c6081dd8 00000000
GPR08: c601d028 00000000 ffffffff 00000001 24000044 00000000 c0002790 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c05643b0 00000083
GPR24: c04a1a6c c0560000 c04a8308 c04c6480 c0012498 c6017000 c7ffcc78 c6017000
[    0.360806] NIP [c01b44cc] gpiochip_get_data+0x4/0xc
[    0.365684] LR [c0011720] cpm1_gpio16_save_regs+0x18/0x44
[    0.370972] Call Trace:
[    0.373451] [c6081e50] [c01b6d0c] of_mm_gpiochip_add_data+0x70/0xdc
[    0.379624] [c6081e70] [c00124c0] cpm_init_par_io+0x28/0x118
[    0.385238] [c6081e80] [c04a8ac0] do_one_initcall+0xb0/0x17c
[    0.390819] [c6081ef0] [c04a8cbc] kernel_init_freeable+0x130/0x1dc
[    0.396924] [c6081f30] [c00027a4] kernel_init+0x14/0x110
[    0.402177] [c6081f40] [c000b424] ret_from_kernel_thread+0x5c/0x64
[    0.408233] Instruction dump:
[    0.411168] 4182fafc 3f80c040 48234c6d 3bc0fff0 3b9c5ed0 4bfffaf4 81290020 712a0004
[    0.418825] 4182fb34 48234c51 4bfffb2c 81230004 <80690130> 4e800020 7c0802a6 9421ffe0
[    0.426763] ---[ end trace fe4113ee21d72ffa ]---

fixes: e65078f1f3 ("powerpc: sysdev: cpm1: use gpiochip data pointer")
fixes: a14a2d484b ("powerpc: cpm_common: use gpiochip data pointer")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Nicholas Piggin
a74599a504 powerpc/pseries: PACA save area fix for MCE vs MCE
MCE must not enable MSR_RI until PACA_EXMC is no longer being used.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Nicholas Piggin
3f3b5dc14c powerpc/pseries: PACA save area fix for general exception vs MCE
MCE must not use PACA_EXGEN. When a general exception enables MSR_RI,
that means SPRN_SRR[01] and SPRN_SPRG are no longer used. However the
PACA save area is still in use.
Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Michael Ellerman
66443efa83 powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
When booting from an OpenFirmware which supports it, we use the
"ibm,client-architecture-support" firmware call to communicate
our capabilities to firmware.

The format of the structure we pass to firmware is specified in
PAPR (Power Architecture Platform Requirements), or the public version
LoPAPR (Linux on Power Architecture Platform Reference).

Referring to table 244 in LoPAPR v1.1, option vector 5 contains a 4 byte
field at bytes 17-20 for the "Platform Facilities Enable". This is
followed by a 1 byte field at byte 21 for "Sub-Processor Represenation
Level".

Comparing to the code, there we have the Platform Facilities
options (OV5_PFO_*) at byte 17, but we fail to pad that field out to its
full width of 4 bytes. This means the OV5_SUB_PROCESSORS option is
incorrectly placed at byte 18.

Fix it by adding zero bytes for bytes 18, 19, 20, and comment the bytes
to hopefully make it clearer in future.

As far as I'm aware nothing actually consumes this value at this time,
so the effect of this bug is nil in practice.

It does mean we've been incorrectly setting bit 15 of the "Platform
Facilities Enable" option for the past ~3 1/2 years, so we should avoid
allocating that bit to anything else in future.

Fixes: df77c79920 ("powerpc/pseries: Update ibm,architecture.vec for PAPR 2.7/POWER8")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Boqun Feng
19ab58d19e powerpc, hotplug: Avoid to touch non-existent cpumasks.
We observed a kernel oops when running a PPC guest with config NR_CPUS=4
and qemu option "-smp cores=1,threads=8":

[   30.634781] Unable to handle kernel paging request for data at
address 0xc00000014192eb17
[   30.636173] Faulting instruction address: 0xc00000000003e5cc
[   30.637069] Oops: Kernel access of bad area, sig: 11 [#1]
[   30.637877] SMP NR_CPUS=4 NUMA pSeries
[   30.638471] Modules linked in:
[   30.638949] CPU: 3 PID: 27 Comm: migration/3 Not tainted
4.7.0-07963-g9714b26 #1
[   30.640059] task: c00000001e29c600 task.stack: c00000001e2a8000
[   30.640956] NIP: c00000000003e5cc LR: c00000000003e550 CTR:
0000000000000000
[   30.642001] REGS: c00000001e2ab8e0 TRAP: 0300   Not tainted
(4.7.0-07963-g9714b26)
[   30.643139] MSR: 8000000102803033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22004084  XER: 00000000
[   30.644583] CFAR: c000000000009e98 DAR: c00000014192eb17 DSISR: 40000000 SOFTE: 0
GPR00: c00000000140a6b8 c00000001e2abb60 c0000000016dd300 0000000000000003
GPR04: 0000000000000000 0000000000000004 c0000000016e5920 0000000000000008
GPR08: 0000000000000004 c00000014192eb17 0000000000000000 0000000000000020
GPR12: c00000000140a6c0 c00000000ffffc00 c0000000000d3ea8 c00000001e005680
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c00000001e6b3a00 0000000000000000 0000000000000001
GPR24: c00000001ff85138 c00000001ff85130 000000001eb6f000 0000000000000001
GPR28: 0000000000000000 c0000000017014e0 0000000000000000 0000000000000018
[   30.653882] NIP [c00000000003e5cc] __cpu_disable+0xcc/0x190
[   30.654713] LR [c00000000003e550] __cpu_disable+0x50/0x190
[   30.655528] Call Trace:
[   30.655893] [c00000001e2abb60] [c00000000003e550] __cpu_disable+0x50/0x190 (unreliable)
[   30.657280] [c00000001e2abbb0] [c0000000000aca0c] take_cpu_down+0x5c/0x100
[   30.658365] [c00000001e2abc10] [c000000000163918] multi_cpu_stop+0x1a8/0x1e0
[   30.659617] [c00000001e2abc60] [c000000000163cc0] cpu_stopper_thread+0xf0/0x1d0
[   30.660737] [c00000001e2abd20] [c0000000000d8d70] smpboot_thread_fn+0x290/0x2a0
[   30.661879] [c00000001e2abd80] [c0000000000d3fa8] kthread+0x108/0x130
[   30.662876] [c00000001e2abe30] [c000000000009968] ret_from_kernel_thread+0x5c/0x74
[   30.664017] Instruction dump:
[   30.664477] 7bde1f24 38a00000 787f1f24 3b600001 39890008 7d204b78 7d05e214 7d0b07b4
[   30.665642] 796b1f24 7d26582a 7d204a14 7d29f214 <7d4048a8> 7d4a3878 7d4049ad 40c2fff4
[   30.666854] ---[ end trace 32643b7195717741 ]---

The reason of this is that in __cpu_disable(), when we try to set the
cpu_sibling_mask or cpu_core_mask of the sibling CPUs of the disabled
one, we don't check whether the current configuration employs those
sibling CPUs(hw threads). And if a CPU is not employed by a
configuration, the percpu structures cpu_{sibling,core}_mask are not
allocated, therefore accessing those cpumasks will result in problems as
above.

This patch fixes this problem by adding an addition check on whether the
id is no less than nr_cpu_ids in the sibling CPU iteration code.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00