Commit Graph

547563 Commits

Author SHA1 Message Date
Vineet Gupta
30b9dbee89 ARC: Fix silly typo in MAINTAINERS file 2015-11-14 13:12:31 +05:30
Vineet Gupta
1cfc05cbe2 ARC: cpu_relax() to be compiler barrier even for UP
cpu_relax() on ARC has been barrier only for SMP (and no-op for UP). Per
recent discussions, it is safer to make it a compiler barrier
unconditionally.

Link: http://lkml.kernel.org/r/53A7D3AA.9020100@synopsys.com
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:12:30 +05:30
Vineet Gupta
a6416f57ce ARC: use ASL assembler mnemonic
ARCompact and ARCv2 only have ASL, while binutils used to support LSL as
a alias mnemonic.

Newer binutils (upstream) don't want to do that so replace it.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:12:21 +05:30
Vineet Gupta
541366da6a ARC: [arcompact] Handle bus error from userspace as Interrupt not exception
Bus errors from userspace on ARCompact based cores are handled by core
as a high priority L2 interrupt but current code treated it as interrupt
Handling an interrupt like exception is certainly not going to go unnoticed.
(and it worked so far as we never saw a Bus error from userspace until
IPPK guys tested a DDR controller with ECC error detection etc hence
needed to explicitly trigger/handle such errors)

 - So move mem_service exception handler from common code into ARCv2 code.
 - In ARCompact code, define  mem_service as L2 interrupt handler which
   just drops down to pure kernel mode and goes of to enqueue SIGBUS

Reported-by: Nelson Pereira <npereira@synopsys.com>
Tested-by: Ana Martins <amartins@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:12:20 +05:30
Vineet Gupta
76a8c40c65 ARC: remove extraneous header include
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:11:38 +05:30
Vineet Gupta
ac506b7f22 ARCv2: lib: memcpy: use local symbols
Otherwise perf profiles don't charge tme to memcpy

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-03 17:33:00 +05:30
Vineet Gupta
5a364c2a17 ARC: mm: PAE40 support
This is the first working implementation of 40-bit physical address
extension on ARCv2.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-29 18:41:30 +05:30
Vineet Gupta
25d464183c ARC: mm: PAE40: tlbex.S: Explicitify the size of pte_t
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:29 +05:30
Vineet Gupta
28b4af729f ARC: mm: PAE40: switch to using phys_addr_t for physical addresses
That way a single flip of phys_addr_t to 64 bit ensures all places
dealing with physical addresses get correct data

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:29 +05:30
Vineet Gupta
29e332261d ARC: mm: HIGHMEM: populate high memory from DT
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:26 +05:30
Vineet Gupta
45890f6d34 ARC: mm: HIGHMEM: kmap API implementation
Implement kmap* API for ARC.

This enables
 - permanent kernel maps (pkmaps): :kmap() API
 - fixmap : kmap_atomic()

We use a very simple/uniform approach for both (unlike some of the other
arches). So fixmap doesn't use the customary compile time address stuff.
The important semantic is sleep'ability (pkmap) vs. not (fixmap) which
the API guarantees.

Note that this patch only enables highmem for subsequent PAE40 support
as there is no real highmem for ARC in pure 32-bit paradigm as explained
below.

ARC has 2:2 address split of the 32-bit address space with lower half
being translated (virtual) while upper half unstranslated
(0x8000_0000 to 0xFFFF_FFFF). kernel itself is linked at base of
unstranslated space (i.e. 0x8000_0000 onwards), which is mapped to say
DDR 0x0 by external Bus Glue logic (outside the core). So kernel can
potentially access 1.75G worth of memory directly w/o need for highmem.
(the top 256M is taken by uncached peripheral space from 0xF000_0000 to
0xFFFF_FFFF)

In PAE40, hardware can address memory beyond 4G (0x1_0000_0000) while
the logical/virtual addresses remain 32-bits. Thus highmem is required
for kernel proper to be able to access these pages for it's own purposes
(user space is agnostic to this anyways).

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:49:04 +05:30
Vineet Gupta
6101be5ad4 ARC: mm: preps ahead of HIGHMEM support #2
Explicit'ify that all memory added so far is low memory
Nothing semantical

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:49:00 +05:30
Vineet Gupta
336e2136e1 ARC: mm: preps ahead of HIGHMEM support
Before we plug in highmem support, some of code needs to be ready for it
 - copy_user_highpage() needs to be using the kmap_atomic API
 - mk_pte() can't assume page_address()
 - do_page_fault() can't assume VMALLOC_END is end of kernel vaddr space

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:05 +05:30
Alexey Brodkin
d40846457f ARC: mm: use generic macros _BITUL()/_AC()
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:05 +05:30
Vineet Gupta
8840e14cd8 ARC: mm: Improve Duplicate PD Fault handler
- Move the verbosity knob from .data to .bss by using inverted logic
 - No need to readout PD1 descriptor
 - clip the non pfn bits of PD0 to avoid clipping inside the loop

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:04 +05:30
Vineet Gupta
9acdc911b5 MAINTAINERS: Add public mailing list for ARC
Cc: <stable@vger.kernel.org> #3.9+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:43 +05:30
Vineet Gupta
f759ee57b2 ARC: Ensure DT mem base is same as what kernel is built with
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:42 +05:30
Vineet Gupta
483bcc99c0 ARC: boot: Non Master cpus only need to call EARLY_CPU_SETUP once
With prev fixes, all cores now start via common entry point @stext which
already calls EARLY_CPU_SETUP for all cores - so no need to invoke it
again

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:42 +05:30
Vineet Gupta
aa0efcde45 ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_smp()
MCIP now registers it's own per cpu setup routine (for IPI IRQ request)
using smp_ops.init_irq_cpu().

So no need for platforms to do that. This now completely decouples
platforms from MCIP.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:41 +05:30
Vineet Gupta
286130ebf1 ARC: smp: Introduce smp hook @init_irq_cpu called for all cores
Note this is not part of platform owned static machine_desc,
but more of device owned plat_smp_ops (rather misnamed) which a IPI
provider or some such typically defines.

This will help us seperate out the IPI registration from platform
specific init_cpu_smp() into device specific init_irq_cpu()

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:41 +05:30
Vineet Gupta
8721a7f5a6 ARC: smp: Rename platform hook @init_smp -> @init_cpu_smp
This conveys better that it is called for each cpu

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta
26b8f99623 ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_early_smp()
MCIP now registers it's own probe callback with smp_ops.init_early_smp()
which is called by ARC common code, so no need for platforms to do that.

This decouples the platforms and MCIP and helps confine MCIP details
to it's own file.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta
e55af4da02 ARC: smp: Introduce smp hook @init_early_smp for Master core
This adds a platform agnostic early SMP init hook which is called on
Master core before calling setup_processor()

  setup_arch()
     smp_init_cpus()
         smp_ops.init_early_smp()
     ...
     setup_processor()

How this helps:
 - Used for one time init of certain SMP centric IP blocks, before
   calling setup_processor() which probes various bits of core,
   possibly including this block

 - Currently platforms need to call this IP block init from their
   init routines, which doesn't make sense as this is specific to ARC
   core and not platform and otherwise requires copy/paste in all
   (and hence a possible point of failure)

e.g. MCIP init is called from 2 platforms currently (axs10x and sim)
which will go away once we have this.

This change only adds the hooks but they are empty for now. Next commit
will populate them and remove the explicit init calls from platforms.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta
4c82f28617 ARC: remove @init_time, @init_irq platform callbacks
These are not in use for ARC platforms. Moreover DT mechanims exist to
probe them w/o explicit platform calls.

 - clocksource drivers can use CLOCKSOURCE_OF_DECLARE()
 - intc IRQCHIP_DECLARE() calls + cascading inside DT allows external
   intc to be probed automatically

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:39 +05:30
Vineet Gupta
e0868e6f67 ARC: smp: irqchip: handle IPI as percpu irq like timer
The reason this was not done so far was lack of genuine IPI_IRQ for
ARC700, as we don't have a SMP version of core yet (which might change
soon thx to EZChip). Nevertheles to increase the build coverage, we
need to allow CONFIG_SMP for ARC700 and still be able to run it on a
UP platform (nsim or AXS101) with a UP Device Tree (SMP-on-UP)

The build itself requires some define for IPI_IRQ and even a dummy
value is fine since that code won't run anyways.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:39 +05:30
Vineet Gupta
3971cdc202 ARC: boot: Support Halt-on-reset and Run-on-reset SMP booting modes
For Run-on-reset, non masters need to spin wait. For Halt-on-reset they
can jump to entry point directly.

Also while at it, made reset vector handler as "the" entry point for
kernel including host debugger based boot (which uses the ELF header
entry point)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:08:17 +05:30
Vineet Gupta
f33e9c434b ARC: smp: Move default boot kick/wait code out of MCIP into common code
For non halt-on-reset case, all cores start of simultaneously in @stext.
Master core0 proceeds with kernel boot, while other spin-wait on
@wake_flag being set by master once it is ready. So NO hardware assist
is needed for master to "kick" the others.

This patch moves this soft implementation out of mcip.c (as there is no
hardware assist) into common smp.c

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:27 +05:30
Vineet Gupta
d0890ea5b6 ARC: boot log: decode more mmu config items
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:26 +05:30
Vineet Gupta
964cf28f9d ARC: boot log: move helper macros to header for reuse
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:25 +05:30
Vineet Gupta
b598e17f6a ARC: mm: compute TLB size as needed from ways * sets
This frees up some bits to hold more high level info such as PAE being
present, w/o increasing the size of already bloated cpuinfo struct

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:25 +05:30
Vineet Gupta
c583ee4fb0 ARC: mm: MMU v1..v3 only selectable for ARCompact ISA based cores
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:24 +05:30
Vineet Gupta
5c35ee642a ARC: make write_aux_reg safer against macro substitution
It was generating warnings when called as write_aux_reg(x, paddr >> 32)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:24 +05:30
Vineet Gupta
9fabcc636b ARC: [arcompact] entry.S: Elide extra check/branch in exception ret path
This is done by improving the laddering logic !

Before:

   if Exception
      goto excep_or_pure_k_ret

   if !Interrupt(L2)
      goto l1_chk
   else
      INTERRUPT_EPILOGUE 2

 l1_chk:
   if !Interrupt(L1)  (i.e. pure kernel mode)
      goto excep_or_pure_k_ret
   else
      INTERRUPT_EPILOGUE 1

 excep_or_pure_k_ret:
   EXCEPTION_EPILOGUE

Now:

   if !Interrupt(L1 or L2) (i.e. exception or pure kernel mode)
      goto excep_or_pure_k_ret

  ; guaranteed to be an interrupt
   if !Interrupt(L2)
      goto l1_ret
   else
      INTERRUPT_EPILOGUE 2

 ; by virtue of above, no need to chk for L1 active
 l1_ret:
    INTERRUPT_EPILOGUE 1

 excep_or_pure_k_ret:
    EXCEPTION_EPILOGUE

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:23 +05:30
Vineet Gupta
5f88808745 ARC: [arcompact] entry.S: Document preemption games for L2 intr
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:23 +05:30
Vineet Gupta
55a2ae775a ARC: [arcompact] entry.S: Improve early return from exception
The requirement is to
 - Reenable Exceptions (AE cleared)
 - Reenable Interrupts (E1/E2 set)

We need to do wiggle these bits into ERSTATUS and call RTIE.

Prev version used the pre-exception STATUS32 as starting point for what
goes into ERSTATUS. This required explicit fixups of U/DE/L bits.

Instead, use the current (in-exception) STATUS32 as starting point.
Being in exception handler U/DE/L can be safely assumed to be correct.
Only AE/E1/E2 need to be fixed.

So the new implementation is slightly better
 -Avoids read form memory
 -Is 4 bytes smaller for the typical 1 level of intr configuration
 -Depicts the semantics more clearly

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:22 +05:30
Vineet Gupta
9dbd3d9bfd ARC: [arcompact] don't check for hard isr calling local_irq_enable()
Historically this was done by ARC IDE driver, which is long gone.
IRQ core is pretty robust now and already checks if IRQs are enabled
in hard ISRs. Thus no point in checking this in arch code, for every
call of irq enabled.

Further if some driver does do that - let it bring down the system so we
notice/fix this sooner than covering up for sucker

This makes local_irq_enable() - for L1 only case atleast simple enough
so we can inline it.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:22 +05:30
Vineet Gupta
c7119d56d2 ARCv2: mm: THP: flush_pmd_tlb_range make SMP safe
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:21 +05:30
Vineet Gupta
722fe8fd36 ARCv2: mm: THP: Implement flush_pmd_tlb_range() optimization
Implement the TLB flush routine to evict a sepcific Super TLB entry,
vs. moving to a new ASID on every such flush.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:21 +05:30
Vineet Gupta
12ebc1581a mm,thp: introduce flush_pmd_tlb_range
ARCHes with special requirements for evicting THP backing TLB entries
can implement this.

Otherwise also, it can help optimize TLB flush in THP regime.
stock flush_tlb_range() typically has optimization to nuke the entire
TLB if flush span is greater than a certain threshhold, which will
likely be true for a single huge page. Thus a single thp flush will
invalidate the entrire TLB which is not desirable.

e.g. see arch/arc: flush_pmd_tlb_range

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/r/20151009100816.GC7873@node
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:20 +05:30
Vineet Gupta
bd5e88ad72 mm,thp: reduce ifdef'ery for THP in generic code
- pgtable-generic.c: Fold individual #ifdef for each helper into a top
  level #ifdef. Makes code more readable

- Converted the stub helpers for !THP to BUILD_BUG() vs. runtime BUG()

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/r/20151009133450.GA8597@node
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:20 +05:30
Vineet Gupta
52585bcc25 mm: group pte related helpers together
This reduces/simplifies the diff for the next patch which moves THP
specific code.

No semantical changes !

Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
Link: http://lkml.kernel.org/r/1442918096-17454-9-git-send-email-vgupta@synopsys.com
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:19 +05:30
Vineet Gupta
443a631283 Documentation/features/vm: THP now supported by ARC
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:19 +05:30
Vineet Gupta
6ce187985f ARCv2: mm: THP: boot validation/reporting
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:18 +05:30
Vineet Gupta
fe6c1b8611 ARCv2: mm: THP support
MMUv4 in HS38x cores supports Super Pages which are basis for Linux THP
support.

Normal and Super pages can co-exist (ofcourse not overlap) in TLB with a
new bit "SZ" in TLB page desciptor to distinguish between them.
Super Page size is configurable in hardware (4K to 16M), but fixed once
RTL builds.

The exact THP size a Linx configuration will support is a function of:
 - MMU page size (typical 8K, RTL fixed)
 - software page walker address split between PGD:PTE:PFN (typical
   11:8:13, but can be changed with 1 line)

So for above default, THP size supported is 8K * 256 = 2M

Default Page Walker is 2 levels, PGD:PTE:PFN, which in THP regime
reduces to 1 level (as PTE is folded into PGD and canonically referred
to as PMD).

Thus thp PMD accessors are implemented in terms of PTE (just like sparc)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:18 +05:30
Vineet Gupta
55ad769fde Documentation/features/vm: pte_special now supported by ARC
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:23 +05:30
Vineet Gupta
24830fc782 ARC: mm: Introduce PTE_SPECIAL
Needed for THP, but will also come in handy for fast GUP later

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:23 +05:30
Vineet Gupta
129cbed54a ARC: mm: pte flags comsetic cleanups, comments
No semantical changes

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:22 +05:30
Vineet Gupta
e8a75963a4 ARC: mm: switch pgtable_to to pte_t *
ARC is the only arch with unsigned long type (vs. struct page *).
Historically this was done to avoid the page_address() calls in various
arch hooks which need to get the virtual/logical address of the table.

Some arches alternately define it as pte_t *, and is as efficient as
unsigned long (generated code doesn't change)

Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:22 +05:30
Linus Torvalds
049e6dde7e Linux 4.3-rc4 2015-10-04 16:57:17 +01:00
Linus Torvalds
30c44659f4 Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull strscpy string copy function implementation from Chris Metcalf.

Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.

The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.

strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result.  To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.

strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string.  Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated.  It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.

strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG.  It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.

So why did I waffle about this for so long?

Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.

And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.

So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches.  Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: use global strscpy() rather than private copy
  string: provide strscpy()
  Make asm/word-at-a-time.h available on all architectures
2015-10-04 16:31:13 +01:00