During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.
Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.
This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
With the four patches applied I can run a combination of all
test cases successfully at the same time:
http://ozlabs.org/~anton/junkcode/dscr_default_test.chttp://ozlabs.org/~anton/junkcode/dscr_explicit_test.chttp://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.
We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.
This was found with the following test case:
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.
If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.
This issue was found with the following testcase:
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.
This issue was found with the following test case:
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event. In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.
When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1. Thus this adds an empty handler
function for them. We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch_update_cpu_topology() should only return 1 when the topology has
actually changed, and should return 0 otherwise.
This patch fixes a potential bug where rebuild_sched_domains() would
reinitialize the sched domains even when the topology hasn't changed.
Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Embedded.Hypervisor category defines GSPRG0..3 physical registers for guests.
Avoid SPRG4-7 usage as scratch in host exception handlers, otherwise guest
SPRG4-7 registers will be clobbered.
For bolted TLB miss exception handlers, which is the version currently
supported by KVM, use SPRN_SPRG_GEN_SCRATCH aka SPRG0 instead of
SPRN_SPRG_TLB_SCRATCH aka SPRG6. Keep using TLB PACA slots to fit in one
64-byte cache line.
For critical exception handlers use SPRG3 instead of SPRG7. Provide a routine
to store and restore user-visible SPRGs. This will be subsequently used
to restore VDSO information in SPRG3. Add EX_R13 to paca slots to free up
SPRG3 and change the critical exception epilog to use it.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Refactor exception prolog to get rid of mfspr srr1 duplicate. This was
introduced by KVM integration, with DO_KVM macro logic expecting srr1 value
earlier in r11.
Reserve r11 to hold srr1's value also required at the end of the prolog and
free up r10 to serve as spare in addition macros.
For syscalls case this change does not add any performance penalty. For irq
soft-disabled case the change adds a store/load of conditional register value
to/from a paca slot. Paca slots fit in one 64-byte cache line so these
additional operations have little impact on performance.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Hook DO_KVM macro into 64-bit booke for KVM integration. Extend interrupt
handlers' parameter list with interrupt vector numbers to accomodate the macro.
Only the bolted version of tlb miss handers is addressed now.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Guest Doorbell interrupts use guest save and restore registers. Add a new
Guest Doorbell exception type to accommodate GSRR0/1 SPRs usage in exception
prolog and fix the exception handler.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Machine check exception handler was using a wrong prolog. Hypervisors like
KVM which are called early from the exception handler rely on the interrupt
source.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is the port of uprobes to powerpc. Usage is similar to x86.
[root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc
Added new event:
probe_libc:malloc (on 0xb4860)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc -aR sleep 1
[root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20
[ perf record: Woken up 22 times to write data ]
[ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ]
[root@xxxx ~]# ./bin/perf report --stdio
...
69.05% tar libc-2.12.so [.] malloc
28.57% rm libc-2.12.so [.] malloc
1.32% avahi-daemon libc-2.12.so [.] malloc
0.58% bash libc-2.12.so [.] malloc
0.28% sshd libc-2.12.so [.] malloc
0.08% irqbalance libc-2.12.so [.] malloc
0.05% bzip2 libc-2.12.so [.] malloc
0.04% sleep libc-2.12.so [.] malloc
0.03% multipathd libc-2.12.so [.] malloc
0.01% sendmail libc-2.12.so [.] malloc
0.01% automount libc-2.12.so [.] malloc
The trap_nr addition patch is a prereq.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add thread_struct.trap_nr and use it to store the last exception
the thread experienced. In this patch, we populate the field at
various places where we force_sig_info() to the process.
This is also used in uprobes to determine if the probed instruction
caused an exception.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move is_trap() and relatives to a common file to be shared between kprobes
and uprobes.
Code movement only; no change in functionality.
Suggested by Michael Ellerman.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have an old FIXME in reg.h which points out that we should standardise
on PVR_foo for our PVR #defines. Currently we use PVR_ on 32-bit and PV_
on 64-bit.
So do that rename and remove the FIXME.
Seeing as we're touching all but one usage of __is_processor(), rename it
to something less ugly and more indicative of what it does, which is
simply to check the PVR version.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The pseries hvterm driver only registers a udbg backend (for xmon and
other low level debugging mechanisms) when hvc0 is recognized as the
firmware console at boot time, not if it's detected later on, for
example because the firmware is using a graphics card.
This can make debugging challenging especially under X11, and there's
really no good reason for that limitation, so let's hookup udbg
whenever hvc0 is detected instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
hvc_console has two methods to instanciate the consoles.
hvc_instanciate is meant to be called at early boot, while hvc_alloc is
called for more dynamically probed objects.
Currently, it only deals with adding kernel consoles in the former case,
which means for example that if a console only uses dynamic probing, it
will never be usable as a kernel console even when specifying
console=hvc0 explicitly, which could be considered annoying...
More specifically, on pseries, we only do the early instanciate for the
console currently used by the firmware, so if you have your firmware
configured to go to a video card, for example, you cannot get your
kernel console, oops messages, etc... on your serial port or hypervisor
console, which would be handy to deal with oopses.
This fixes it by checking if hvc_console.flags & CON_ENABLED is set when
registering a new dynamic console, and if not, redo the index check and
re-register the console if the index matches, allowing console=hvcN to
work.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It contains no code and is not included by anyone.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It's empty now, apart from other includes.
Fixup a few files that were getting things via this header.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
virt_to_abs() is just a wrapper around __pa(), call __pa() directly.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These days they are just __va() and __pa() respectively.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These days they are just __va() and __pa() respectively.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These days they are just __va() and __pa() respectively.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
abs_to_virt() simply calls __va() and we'd like to get rid of it,
so replace all abs_to_virt() uses with __va().
Similarly virt_to_abs() just calls __pa().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
abs_to_virt() is just a wrapper around __va(), call __va() directly.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
abs_to_virt() simply calls __va() and we'd like to get rid of it,
so replace all abs_to_virt() uses with __va().
__va() returns void *, so when assigning to a pointer there's no need
to cast.
Similarly virt_to_abs() just calls __pa().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These days they are just wrappers around __pa() and __va() respectively.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These days they are just wrappers around __pa() and __va() respectively.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
abs_to_virt() is just a wrapper around __va(), call __va() directly.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Since commit f5339277 phys_to_abs() is 100% a nop, so just remove it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
phys_to_abs() is a nop, don't use it.
virt_to_abs() is just a wrapper around __pa(), call __pa() directly.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In commit f5339277 "powerpc: Remove FW_FEATURE ISERIES from arch code", we
removed the bulk of the iSeries code, but missed a few bits.
Remove the mschunks bits, these were only ever used on iSeries as far as I
know, and are definitely not used anymore.
Make it even clearer that phys_to_abs() is a nop, by making it a macro. We
still have a few users of this, but should clean those up.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Test-compiling obscure machines I notice that the gemini (which
by the way lacks a defconfig) is broken since some time back.
Adding a simple missing include makes it build again.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Two regression fixes and one boot-loader compatibility fix from Simon Horman.
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: armadillo800eva: enable rw rootfs mount
ARM: shmobile: mackerel: fixup usb module order
ARM: shmobile: armadillo800eva: fixup: sound card detection order
After commit 26b88520b8 ("mmc:
omap_hsmmc: remove private DMA API implementation"), the Nokia N800
here stopped booting:
[ 2.086181] Waiting for root device /dev/mmcblk0p1...
[ 2.324066] Unhandled fault: imprecise external abort (0x406) at 0x00000000
[ 2.331451] Internal error: : 406 [#1] ARM
[ 2.335784] Modules linked in:
[ 2.339050] CPU: 0 Not tainted (3.6.0-rc3 #60)
[ 2.344146] PC is at default_idle+0x28/0x30
[ 2.348602] LR is at trace_hardirqs_on_caller+0x15c/0x1b0
...
This turned out to be due to memory corruption caused by long-broken
PIO code in drivers/mmc/host/omap.c. (Previously, this driver had
been using DMA; but the above commit caused the MMC driver to fall
back to PIO mode with an unmodified Kconfig.)
The PIO code, added with the rest of the driver in commit
730c9b7e66 ("[MMC] Add OMAP MMC host
driver"), confused bytes with 16-bit words. This bug caused memory
located after the PIO transfer buffer to be corrupted with transfers
larger than 32 bytes. The driver also did not increment the buffer
pointer after the transfer occurred. This bug resulted in data
corruption during any transfer larger than 64 bytes.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
For several MoviNAND eMMC parts, there are known issues with secure
erase and secure trim. For these specific MoviNAND devices, we skip
these operations.
Specifically, there is a bug in the eMMC firmware that causes
unrecoverable corruption when the MMC is erased with MMC_CAP_ERASE
enabled.
References:
http://forum.xda-developers.com/showthread.php?t=1644364https://plus.google.com/111398485184813224730/posts/21pTYfTsCkB#111398485184813224730/posts/21pTYfTsCkB
Signed-off-by: Ian Chen <ian.cy.chen@samsung.com>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable <stable@vger.kernel.org> [3.0+]
Signed-off-by: Chris Ball <cjb@laptop.org>
The documentation for the dw_mmc part says that the low power
mode should normally only be set for MMC and SD memory and should
be turned off for SDIO cards that need interrupts detected.
The best place I could find to do this is when the SDIO interrupt
was first enabled. I rely on the fact that dw_mci_setup_bus()
will be called when it's time to reenable.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Data transfer will be continued until all the bytes are transmitted,
even if data crc error occurs during a multiple-block data transfer.
This means RXDR/TXDR interrupts will occurs until data transfer is
terminated. Early setting of host->sg to NULL prevents going into
xxx_data_pio functions, hence permanent unhandled RXDR/TXDR interrupts
occurs. And checking error interrupt status in the xxx_data_pio functions
is no need because dw_mci_interrupt does do the same. This patch also
removes it.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Datasheet of SYNOPSYS mentions that DTO(Data Transfer Over) interrupt
will be raised even if some error interrupts, however it is actually
found that DTO does not occur. SYNOPSYS has confirmed this issue.
Current implementation defers the call of tasklet_schedule until DTO
when the error interrupts is happened. This patch fixes error handling.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
RINTSTS status includes masked interrupts as well as unmasked.
data_status and cmd_status are set by value of RINTSTS in interrupt handler
and tasklet finally uses it to decide whether error is happened or not.
In addition, MINTSTS status is used for setting data_status in PIO.
Masked error interrupt will not be handled and that status can be considered
non-error case.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Reviewed By: Girish K S <girish.shivananjappa@linaro.org>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Even if the datasheet says that the not busy flag has to be used only
for write operations, it's false except for version lesser than v2xx.
Not waiting on the not busy flag for read operations can cause the
controller to hang-up during the initialization of some SD cards
with DMA after the first CMD6 -- the next command is sent too early.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: stable <stable@vger.kernel.org> [3.5, 3.6]
Signed-off-by: Chris Ball <cjb@laptop.org>
Since commit 30832ab56 ("mmc: sdhci: Always pass clock request value
zero to set_clock host op") was merged, esdhc_set_clock starts hitting
"if (clock == 0)" where ESDHC_SYSTEM_CONTROL has been operated. This
causes SDHCI card-detection function being broken. Fix the regression
by moving "if (clock == 0)" above ESDHC_SYSTEM_CONTROL operation.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Descriptor array structure has been moved into blackfin dma.h head file.
This patch fix below error:
drivers/mmc/host/bfin_sdh.c:52:8: error: redefinition of 'struct
dma_desc_array'
make[4]: *** [drivers/mmc/host/bfin_sdh.o] Error 1
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bob Liu <lliubbo@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
UBI was mistakingly using 'kfree()' instead of 'kmem_cache_free()' when
freeing "attach eraseblock" structures in vtbl.c. Thankfully, this happened
only when we were doing auto-format, so many systems were unaffected. However,
there are still many users affected.
It is strange, but the system did not crash and nothing bad happened when
the SLUB memory allocator was used. However, in case of SLOB we observed an
crash right away.
This problem was introduced in 2.6.39 by commit
"6c1e875 UBI: add slab cache for ubi_scan_leb objects"
A note for stable trees:
Because variable were renamed, this won't cleanly apply to older kernels.
Changing names like this should help:
1. ai -> si
2. aeb_slab_cache -> seb_slab_cache
3. new_aeb -> new_seb
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Tested-by: Richard Genoud <richard.genoud@gmail.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: stable@vger.kernel.org [v2.6.39+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>