This change sets the PCI devices' initial DMA capabilities
conservatively and promotes them at the request of the driver,
as opposed to assuming advanced DMA capabilities. The old design
runs the risk of breaking drivers that assume default capabilities.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This was really only useful for TILE64 when we mapped the
kernel data with small pages. Now we use a huge page and we
really don't want to map different parts of the kernel
data in different ways.
We retain the __write_once name in case we want to bring
it back to life at some point in the future.
Note that this change uncovered a latent bug where the
"smp_topology" variable happened to always be aligned mod 8
so we could store two "int" values at once, but when we
eliminated __write_once it ended up only aligned mod 4.
Fix with an explicit annotation.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This chip is no longer being actively developed for (it was superceded
by the TILEPro64 in 2008), and in any case the existing compiler and
toolchain in the community do not support it. It's unlikely that the
kernel works with TILE64 at this point as the configuration has not been
tested in years. The support is also awkward as it requires maintaining
a significant number of ifdefs. So, just remove it altogether.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The definisions of __ffs(), __fls(), and ffs() for tile are almost same
as asm-generic/bitops-*.h. The only difference is that it is defined
as __always_inline or inline. So this switches to use those headers.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [moved #includes to end]
We use virt_to_pte(NULL, va) a lot, which isn't very obvious.
I added virt_to_kpte(va) as a more obvious wrapper function,
that also validates the va as being a kernel adddress.
And, I fixed the semantics of virt_to_pte() so that we handle
the pud and pmd the same way, and we now document the fact that
we handle the final pte level differently.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Newer hypervisors have an API for reporting per-cpu statistics
information. This change allows seeing that information via
/sys/devices/system/cpu/cpuN/hv_stats file for each core.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Enter kernel debugger at boot with:
--hvd UART_1=1 --hvx kgdbwait --hvx kgdboc=ttyS1,115200
or at runtime with:
echo ttyS1,115200 > /sys/module/kgdboc/parameters/kgdboc
echo g > /proc/sysrq-trigger
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The TILE-Gx chip includes an on-chip UART. This change adds support
for using the UART from within the kernel. The UART shim has more
functionality than is exposed here, but to keep the kernel code and
binary simpler, this is a subset of the full API designed to enable
a standard Linux tty serial driver only.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The existing code relied on the hardware definition (<arch/chip.h>)
to specify how much VA and PA space was available. It's convenient
to allow customizing this for some configurations, so provide symbols
MAX_PA_WIDTH and MAX_VA_WIDTH in <asm/page.h> that can be modified
if desired.
Additionally, move away from the MEM_XX_INTRPT nomenclature to
define the start of various regions within the VA space. In fact
the cleaner symbol is, for example, MEM_SV_START, to indicate the
start of the area used for supervisor code; the actual address of the
interrupt vectors is not as important, and can be changed if desired.
As part of this change, convert from "intrpt1" nomenclature (which
built in the old privilege-level 1 model) to a simple "intrpt".
Also strip out some tilepro-specific code supporting modifying the
PL the kernel could run at, since we don't actually support using
different PLs in tilepro, only tilegx.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Technically, user privilege is anything less than kernel
privilege. We modify the existing user_mode() macro to have
this semantic (and use it in a couple of places it wasn't being
used before), and add an IS_KERNEL_EX1() macro to the assembly
code as well.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This tile-specific API had a minor bug, in that if a super huge (>4GB)
page mapped a particular address range, we wouldn't handle it correctly.
As part of fixing that bug, I also cleaned up some of the pud and pmd
accessors to make them more consistent.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Previously, we used a special-purpose register (SPR_SYSTEM_SAVE_K_0)
to hold the CPU number and the top of the current kernel stack
by using the low bits to hold the CPU number, and using the high
bits to hold the address of the page just above where we'd want
the kernel stack to be. That way we could initialize a new SP
when first entering the kernel by just masking the SPR value and
subtracting a couple of words.
However, it's actually more useful to be able to place an arbitrary
kernel-top value in the SPR. This allows us to create a new stack
context (e.g. for virtualization) with an arbitrary top-of-stack VA.
To make this work, we now store the CPU number in the high bits,
above the highest legal VA bit (42 bits in the current tilegx
microarchitecture). The full 42 bits are thus available to store the
top of stack value. Getting the current cpu (a relatively common
operation) is still fast; it's now a shift rather than a mask.
We make this change only for tilegx, since tilepro has too few SPR
bits to do this, and we don't need this support on tilepro anyway.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Normally the build doesn't include these warnings, but at one
point I built with -Wsign-compare, and noticed a few things that
are technically bugs. This change fixes those things.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Nothing in the codebase was using them, and as written they took
"unsigned long" as the physical address rather than "phys_addr_t",
which is wrong on tilepro anyway. Rather than fixing stale APIs,
just remove them; if there's ever demand for them on this platform,
we can put them back.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
With this change, tile Linux now supports address-space layout
randomization for shared objects, stack, heap and vdso.
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Tony Lu <zlu@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This may fix a reported bug where an R_TILEGX_64 in a module was not
pointing to an aligned address.
Reported-by: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change includes support for Kprobes, Jprobes and Return Probes.
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Tony Lu <zlu@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This commit adds support for static ftrace, graph function support,
and dynamic tracer support.
Signed-off-by: Tony Lu <zlu@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change creates the framework for vDSO calls, makes the existing
rt_sigreturn() mechanism use it, and adds a fast gettimeofday().
Now that we need to expose the vDSO address to userspace, we add
AT_SYSINFO_EHDR to the set of aux entries provided to userspace.
(You can disable any extra vDSO support by booting with vdso=0,
but the rt_sigreturn vDSO page will still be provided.)
Note that glibc has supported the tile vDSO since release 2.17.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
First, fix a bug in asm/unaligned.h; we need to just use the asm-generic
unaligned.h so we properly choose endian-correct flavors.
Second, keep the hv/hypervisor.h ABI fully "native" in the sense that
we don't have __BIG_ENDIAN__ ifdefs there. Instead, we use macros in
the head_NN.S assembly code to properly extract two 32-bit structure
members from a 64-bit register holding the structure.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change adds support for CONFIG_PREEMPT (full kernel preemption).
In addition to the core support, this change includes a number
of places where we fix up uses of smp_processor_id() and per-cpu
variables. I also eliminate the PAGE_HOME_HERE and PAGE_HOME_UNKNOWN
values for page homing, as it turns out they weren't being used.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change adds support for avoiding recursive backtracer crashes;
we haven't seen this in practice other than when things are seriously
corrupt, but it may help avoid losing the root cause of a crash.
Also, don't abort kernel backtracers for invalid userspace PC's.
If we do, we lose the ability to backtrace through a userspace
call to a bad address above PAGE_OFFSET, even though that it can
be perfectly reasonable to continue the backtrace in such a case.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change enables unaligned userspace memory access via a kernel
fast path on tilegx. The kernel tracks user PC/instruction pairs
per-thread using a direct-mapped cache in userspace. The cache
maps those PC/instruction pairs to JIT'ed instruction sequences that
load or store using byte-wide load store intructions and then
synthesize 2-, 4- or 8-byte load or store results. Once an
instruction has been seen to generate an unaligned access once,
subsequent hits on that instruction typically require overhead
of only around 50 cycles if cache and TLB is hot.
We support the prctl() PR_GET_UNALIGN / PR_SET_UNALIGN sys call to
enable or disable unaligned fixups on a per-process basis.
To do this we pull some of the tilepro unaligned support out of the
single_step.c file; tilepro uses instruction disassembly for both
single-step and unaligned access support. Since tilegx actually has
hardware singlestep support, though, it's cleaner to keep the tilegx
unaligned access code in a separate file. While we're at it,
properly rename the tilepro-specific types, etc., to have tilepro
suffixes instead of generic tile suffixes.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change improves and cleans up the tile console.
- We enable HVC_IRQ support on tilegx, with the addition of a new
Tilera hypervisor API for tilegx to allow a console IPI. If IPI
support is not available we fall back to the previous polling mode.
- We simplify the earlyprintk code to use CON_BOOT and eliminate some
of the other supporting earlyprintk code.
- A new tile_console_write() primitive is used to send output to
the console and is factored out of the hvc_tile driver.
This lets us support a "sim_console" boot argument to allow using
simulator hooks to send output to the "console" as a slightly
faster alternative to emulating the hardware more directly.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Tilera Gx72 systems, the logic for figuring out whether
a given port is root complex is slightly different.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The standard kernel function dma_get_required_mask() uses the
highest DRAM address to determine if 32-bit or 64-bit DMA addressing
is needed. This only works on architectures that have direct mapping
between the PA and the PCI address space, i.e. those that don't have
I/O TLBs or have I/O TLB but choose to use direct mapping. Neither
of these are true for tilegx. Whether to use 64-bit DMA should depend
on the PCI device's capability only, not on the amount of DRAM
installeds, so we now advertise a 64-bit DMA mask unconditionally.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The .mem_resources[] field in the pci_controller struct
is now obsoleted by the .mem_space and .io_space fields.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The TRIO shim initialization is shared with other kernel drivers
such as the endpoint and StreamIO drivers, so reorganize the
initialization flow to ensure that the root complex driver properly
initializes TRIO state regardless of what kind of TRIO driver will
end up using the shim.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
To enable this functionality, configure CONFIG_TILE_PCI_IO. Without
this flag, the kernel still assigns I/O address ranges to the
devices, but no TRIO resource and mapping support is provided.
We assign disjoint I/O address ranges to separate PCIe domains.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Besides using pr_info() to print the linkdown status for a plug-in
slot, add extra indication that this is expected if the slot is empty.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
To support PCIe devices with higher number of MSI-X interrupt vectors,
e.g. 16 for the LSI RAID card, enhance the Gx RC stack to provide more
MSI-X vectors by using the TRIO Scatter Queues, which provide 8 more
vectors in addition to ~10 from the Map Mem regions.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The LSI MEGARAID SAS HBA suffers from the problem where it can do
64-bit DMA to streaming buffers but not to consistent buffers.
In other words, 64-bit DMA is used for disk data transfers and 32-bit
DMA must be used for control message transfers. According to LSI,
the firmware is not fully functional yet. This change implements a
kind of hybrid dma_ops to support this.
Note that on most other platforms, the 64-bit DMA addressing space is the
same as the 32-bit DMA space and they overlap the physical memory space.
No special arrangement is needed to support this kind of mixed DMA
capability. On TILE-Gx, the 64-bit DMA space is completely separate
from the 32-bit DMA space. Due to the use of the IOMMU, the 64-bit DMA
space doesn't overlap the physical memory space. On the other hand,
the 32-bit DMA space overlaps the physical memory space under 4GB.
The separate address spaces make it necessary to have separate dma_ops.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
- remove unneeded <linux/bootmem.h> include in pci.c
- eliminate unused pci_controller.first_busno field
- prefer msleep to mdelay
- remove stale comment about pci_scan_bus_parented()
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Using strlen as a model, add length checking to create strnlen.
Signed-off-by: Ken Steele <ken@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The "inv" (invalidate) instruction is generally less safe than "finv"
(flush and invalidate), as it will drop dirty data from the cache.
It turns out we have almost no need for "inv" (other than for the
older 32-bit architecture in some limited cases), so convert to
"finv" where possible and delete the extra "inv" infrastructure.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Need add cmpxchg64(), or will cause compiling issue.
Need define it as cmpxchg() only for 64-bit operation, since cmpxchg()
can support 8 bytes.
The related error (with allmodconfig):
drivers/block/blockconsole.c: In function ‘bcon_advance_console_bytes’:
drivers/block/blockconsole.c:164:2: error: implicit declaration of function ‘cmpxchg64’ [-Werror=implicit-function-declaration]
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Normalize global variables exported by vmlinux.lds to conform usage
guidelines from include/asm-generic/sections.h.
1) Use _text to mark the start of the kernel image including the head
text, and _stext to mark the start of the .text section.
2) Export mandatory global variables __init_begin and __init_end.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull voluntary preemption fixes from Ingo Molnar:
"This tree contains a speedup which is achieved through better
might_sleep()/might_fault() preemption point annotations for uaccess
functions, by Michael S Tsirkin:
1. The only reason uaccess routines might sleep is if they fault.
Make this explicit for all architectures.
2. A voluntary preemption point in uaccess functions means compiler
can't inline them efficiently, this breaks assumptions that they
are very fast and small that e.g. net code seems to make. Remove
this preemption point so behaviour matches with what callers
assume.
3. Accesses (e.g through socket ops) to kernel memory with KERNEL_DS
like net/sunrpc does will never sleep. Remove an unconditinal
might_sleep() in the might_fault() inline in kernel.h (used when
PROVE_LOCKING is not set).
4. Accesses with pagefault_disable() return EFAULT but won't cause
caller to sleep. Check for that and thus avoid might_sleep() when
PROVE_LOCKING is set.
These changes offer a nice speedup for CONFIG_PREEMPT_VOLUNTARY=y
kernels, here's a network bandwidth measurement between a virtual
machine and the host:
before:
incoming: 7122.77 Mb/s
outgoing: 8480.37 Mb/s
after:
incoming: 8619.24 Mb/s [ +21.0% ]
outgoing: 9455.42 Mb/s [ +11.5% ]
I kept these changes in a separate tree, separate from scheduler
changes, because it's a mixed MM and scheduler topic"
* 'sched-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mm, sched: Allow uaccess in atomic with pagefault_disable()
mm, sched: Drop voluntary schedule from might_fault()
x86: uaccess s/might_sleep/might_fault/
tile: uaccess s/might_sleep/might_fault/
powerpc: uaccess s/might_sleep/might_fault/
mn10300: uaccess s/might_sleep/might_fault/
microblaze: uaccess s/might_sleep/might_fault/
m32r: uaccess s/might_sleep/might_fault/
frv: uaccess s/might_sleep/might_fault/
arm64: uaccess s/might_sleep/might_fault/
asm-generic: uaccess s/might_sleep/might_fault/
Pull scheduler updates from Ingo Molnar:
"The main changes:
- load-calculation cleanups and improvements, by Alex Shi
- various nohz related tidying up of statisics, by Frederic
Weisbecker
- factor out /proc functions to kernel/sched/proc.c, by Paul
Gortmaker
- simplify the RT policy scheduler, by Kirill Tkhai
- various fixes and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
sched/debug: Remove CONFIG_FAIR_GROUP_SCHED mask
sched/debug: Fix formatting of /proc/<PID>/sched
sched: Fix typo in struct sched_avg member description
sched/fair: Fix typo describing flags in enqueue_entity
sched/debug: Add load-tracking statistics to task
sched: Change get_rq_runnable_load() to static and inline
sched/tg: Remove tg.load_weight
sched/cfs_rq: Change atomic64_t removed_load to atomic_long_t
sched/tg: Use 'unsigned long' for load variable in task group
sched: Change cfs_rq load avg to unsigned long
sched: Consider runnable load average in move_tasks()
sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task
sched: Update cpu load after task_tick
sched: Fix sleep time double accounting in enqueue entity
sched: Set an initial value of runnable avg for new forked task
sched: Move a few runnable tg variables into CONFIG_SMP
Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
sched: Don't mix use of typedef ctl_table and struct ctl_table
sched: Remove WARN_ON(!sd) from init_sched_groups_power()
sched: Fix memory leakage in build_sched_groups()
...
Most of the stuff from kernel/sched.c was moved to kernel/sched/core.c long time
back and the comments/Documentation never got updated.
I figured it out when I was going through sched-domains.txt and so thought of
fixing it globally.
I haven't crossed check if the stuff that is referenced in sched/core.c by all
these files is still present and hasn't changed as that wasn't the motive behind
this patch.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/cdff76a265326ab8d71922a1db5be599f20aad45.1370329560.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The only reason uaccess routines might sleep
is if they fault. Make this explicit.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1369577426-26721-8-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull tile update from Chris Metcalf:
"The interesting bug fix is support for the upcoming "4.2" release of
the Tilera hypervisor, which by default launches Linux at privilege
level 2 instead of 1. The fix lets new and old hypervisors and
Linuxes interoperate more smoothly, so I've tagged it for
stable@kernel.org so that older Linuxes will be able to boot under the
newer hypervisor."
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
usb: tilegx: fix memleak when create hcd fail
arch/tile: remove inline marking of EXPORT_SYMBOL functions
rtc: rtc-tile: add missing platform_device_unregister() when module exit
tile: support new Tilera hypervisor
The Tilera hypervisor shipped in releases up through MDE 4.1 launches
the client operating system (i.e. Linux) at privilege level 1 (PL1).
Starting with MDE 4.2, as part of the work to enable KVM, the
Tilera hypervisor launches Linux at PL2 instead.
This commit makes the KERNEL_PL option default to 2 for tilegx, while
still saying at 1 for tilepro, which doesn't have an updated hypervisor.
It also explains how and when you might want to choose another value.
In addition, we change a small buglet in the on-chip Ethernet driver,
where we were failing to use the KERNEL_PL constant in an API call.
To make the transition cleaner, this change also provides the updated
hv_init() API for the new hypervisor that supports announcing Linux's
compiled-in PL, so the hypervisor can generate a suitable error in the
case of a mismatched hypervisor and Linux binary.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: stable@vger.linux.org
Pull tile arch changes from Chris Metcalf:
"These are some minor new feature work and other changes that didn't
merit getting pushed up after the 3.9 merge window closed.
There should be a lot more activity in the 3.11 merge window"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: Fix syscall return value passed to tracepoint
tile: comment assumption about __insn_mtspr for <asm/irqflags.h>
tile: ns2cycles should use __raw_get_cpu_var
arch: remove KCORE_ELF again [tile]
tile: remove two outdated Kconfig entries
tile: support atomic64_dec_if_positive()
tile: support TIF_SYSCALL_TRACEPOINT; select HAVE_SYSCALL_TRACEPOINTS
tile: Add definition of NR_syscalls
tile: move declaration of sys_call_table to <asm/syscall.h>
arch/tile: Enable HAVE_ARCH_TRACEHOOK
arch/tile: Call tracehook_report_syscall_{entry,exit} in syscall trace
The help text for this config is duplicated across the x86, parisc, and
s390 Kconfig.debug files. Arnd Bergman noted that the help text was
slightly misleading and should be fixed to state that enabling this
option isn't a problem when using pre 4.4 gcc.
To simplify the rewording, consolidate the text into lib/Kconfig.debug
and modify it there to be more explicit about when you should say N to
this config.
Also, make the text a bit more generic by stating that this option
enables compile time checks so we can cover architectures which emit
warnings vs. ones which emit errors. The details of how an
architecture decided to implement the checks isn't as important as the
concept of compile time checking of copy_from_user() calls.
While we're doing this, remove all the copy_from_user_overflow() code
that's duplicated many times and place it into lib/ so that any
architecture supporting this option can get the function for free.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>