ARCH_USE_BUILTIN_BSWAP will use __builtin_bswap16(), __builtin_bswap32()
and __builtin_bswap64() where available. This allows better instruction
scheduling. On pre-R2 processors it will result in 32 bit and 64 bit
swapping being performed in a call to a __bswapsi2() rsp. __bswapdi2()
functions, so we add these, too.
For a 4.2 kernel with GCC 4.9 this yields the following kernel sizes:
text data bss dec hex filename
3996071 155804 88992 4240867 40b5e3 vmlinux ip22 baseline
3985687 159900 88992 4234579 409d53 vmlinux ip22 + bswap patch
6913157 378552 251024 7542733 7317cd vmlinux ip27 baseline
6878581 378552 251024 7508157 7290bd vmlinux ip27 + bswap patch
5773777 268752 187424 6229953 5f0fc1 vmlinux malta baseline
5773401 268752 187424 6229577 5f0e49 vmlinux malta + bswap patch
Presumably the code size improvments yield better cache hit rate thus
better performance compensating for the extra function call but this
will still need to be benchmarked.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The following instructions have been removed from MIPS R6
ulw, ulh, swl, lwr, lwl, swr.
However, all of them are used in the MIPS specific checksum implementation.
As a result of which, we will use the generic checksum on MIPS R6
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
For non MIPSr2 processors, such as the BMIPS 5000, calls to
arch_local_irq_disable() and others may be preempted, and in doing
so a stale value may be restored to c0_status. This fix disables
preemption for such processors prior to the call and enables it
after the call.
Those functions that needed this fix have been "outlined" to
mips-atomic.c, as they are no longer good candidates for inlining.
This bug was observed in a BMIPS 5000, occuring once every few hours
in a continuous reboot test. It was traced to the write_lock_irq()
function which was being invoked in release_task() in exit.c.
By placing a number of "nops" inbetween the mfc0/mtc0 pair in
arch_local_irq_disable(), which is called by write_lock_irq(), we
were able to greatly increase the occurance of this bug. Similarly,
the application of this commit silenced the bug.
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: David Daney <ddaney.cavm@gmail.com>
Cc: Kevin Cernekee cernekee@gmail.com
Cc: Jim Quinlan <jim2101024@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/4321/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The "else clause" of most functions in bitops.h invoked
raw_local_irq_{save,restore}() and in doing so had a dependency on
irqflags.h. This fix moves said code to bitops.c, removing the
dependency.
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: David Daney <ddaney.cavm@gmail.com>
Cc: Kevin Cernekee cernekee@gmail.com
Cc: Jim Quinlan <jim2101024@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/4320/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Allows us not to duplicate more lines in arch/mips/lib/Makefile.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/3329/
Signed-off-by: John Crispin <blogic@openwrt.org>
We can save the 451 lines of code that comprise memcpy-inatomic.S at the
expense of a single instruction in the memcpy prolog. We also use an
additional register (t6), so this may cause increased register pressure in
some places as well. But I think the reduced maintenance burden, of not
having two nearly identical implementations, makes it worth it.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add NLM_XLR_BOARD, CPU_XLR and other config options
Makefile updates, mostly based on r4k
Signed-off-by: Jayachandran C <jayachandranc@netlogicmicro.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2334/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Outlining fixes the issue were on certain CPUs such as the R10000 family
the delay loop would need an extra cycle if it overlaps a cacheline
boundary.
The rewrite also fixes build errors with GCC 4.4 which was changed in
way incompatible with the kernel's inline assembly.
Relying on pure C for computation of the delay value removes the need for
explicit. The price we pay is a slight slowdown of the computation - to
be fixed on another day.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Take all the OCTEON specific files that were added, and hook them into
the build system for the arch/mips. For versions of GCC that lack
OCTEON support, override gas target architecture.
Signed-off-by: Tomaso Paoletti <tpaoletti@caviumnetworks.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We already have sufficient infrastructure to support VR5500 and VR5500A
series processors. Here's a Makefile support to make it selectable by
ports, and enable it for NEC EMMA2RH Markeins board.
This patch also fixes a confused target help, and adds 1Gb PageMask bits
supported by VR5500 and its variants.
Signed-off-by: Shinya Kuribayashi <shinya.kuribayashi@necel.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Certain 32-bit kernel configurations seem to be able to cause references,
this was observed with gcc 4.1.2.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported by Eugene Surovegin <ebs@ebshome.net>.
If only modules were users of these functions they did not get linked into
the kernel proper, so later module loads would fail as well.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Many Makefiles in arch/mips have EXTRA_AFLAGS := $(CFLAGS) line. This
is redundant while AFLAGS contains $(cflags-y) and any options only
listed in CFLAGS (not in cflags-y) should be unnecessary for asm
sources.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
From the 01408c4939 log message:
The problem is that when we write to a file, the copy from userspace to
pagecache is first done with preemption disabled, so if the source
address is not immediately available the copy fails *and* *zeros* *the*
*destination*.
This is a problem because a concurrent read (which admittedly is an odd
thing to do) might see zeros rather that was there before the write, or
what was there after, or some mixture of the two (any of these being a
reasonable thing to see).
If the copy did fail, it will immediately be retried with preemption
re-enabled so any transient problem with accessing the source won't
cause an error.
The first copying does not need to zero any uncopied bytes, and doing
so causes the problem. It uses copy_from_user_atomic rather than
copy_from_user so the simple expedient is to change copy_from_user_atomic
to *not* zero out bytes on failure.
< --- end cite --- >
This patch finally implements at least a not so pretty solution by
duplicating the relevant part of __copy_user.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This implementation has support for the concept of one separate ioport
address space by PCI domain. A pointer to the virtual address where
the port space of a domain has been mapped has been added to struct
pci_controller and systems should be fixed to fill in this value. For
single domain systems this will be the same value as passed to
set_io_port_base().
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The 32-bit version and 64-bit version are almost equal. Unify them.
This makes further improvements (for example, supporting CDEX, etc.)
easier.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Implement optimized asm version of csum_partial_copy_nocheck,
csum_partial_copy_from_user and csum_and_copy_to_user which can do
calculate and copy in parallel, based on memcpy.S.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The 32-bit version and 64-bit version are almost equal. Unify them. This
makes further improvements (for example, copying with parallel, supporting
PREFETCH, etc.) easier.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
It took a while longer than on other architectures but gcc has finally
started to strike us as well ...
This also fixes the damage by 6edfba1b33.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Several implementations were essentialy a common piece of C code using
the cmpxchg() macro. Put the implementation in one spot that everyone
can share, and convert sparc64 over to using this.
Alpha is the lone arch-specific implementation, which codes up a
special fast path for the common case in order to avoid GP reloading
which a pure C version would require.
Signed-off-by: David S. Miller <davem@davemloft.net>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!