This fixes a regression in 3.7-rc, which has since gone into stable.
Commit 00442ad04a ("mempolicy: fix a memory corruption by refcount
imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
on expecting alloc_page_vma() to drop the refcount it had acquired.
This deserves a rework: but for now fix the leak in shmem_alloc_page().
Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use
the same refcounting there as in shmem_alloc_page(), delete its onstack
mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() -
those were invented to let swapin_readahead() make an unknown number of
calls to alloc_pages_vma() with one mempolicy; but since 00442ad04a,
alloc_pages_vma() has kept refcount in balance, so now no problem.
Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull module fixes from Rusty Russell:
"Module signing build fixes for blackfin and metag"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
modsign: add symbol prefix to certificate list
linux/kernel.h: define SYMBOL_PREFIX
Merge 'block-dev' branch.
I was going to just mark everything here for stable and leave it to the
3.8 merge window, but having decided on doing another -rc, I migth as
well merge it now.
This removes the bd_block_size_semaphore semaphore that was added in
this release to fix a race condition between block size changes and
block IO, and replaces it with atomicity guaratees in fs/buffer.c
instead, along with simplifying fs/block-dev.c.
This removes more lines than it adds, makes the code generally simpler,
and avoids the latency/rt issues that the block size semaphore
introduced for mount.
I'm not happy with the timing, but it wouldn't be much better doing this
during the merge window and then having some delayed back-port of it
into stable.
* block-dev:
blkdev_max_block: make private to fs/buffer.c
direct-io: don't read inode->i_blkbits multiple times
blockdev: remove bd_block_size_semaphore again
fs/buffer.c: make block-size be per-page and protected by the page lock
Define SYMBOL_PREFIX to be the same as CONFIG_SYMBOL_PREFIX if set by
the architecture, or "" otherwise. This avoids the need for ugly #ifdefs
whenever symbols are referenced in asm blocks.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Pull perf fixes from Ingo Molnar:
"This is mostly about unbreaking architectures that took the UAPI
changes in the v3.7 cycle, plus misc fixes."
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf kvm: Fix building perf kvm on non x86 arches
perf kvm: Rename perf_kvm to perf_kvm_stat
perf: Make perf build for x86 with UAPI disintegration applied
perf powerpc: Use uapi/unistd.h to fix build error
tools: Pass the target in descend
tools: Honour the O= flag when tool build called from a higher Makefile
tools: Define a Makefile function to do subdir processing
x86: Export asm/{svm.h,vmx.h,perf_regs.h}
perf tools: Fix strbuf_addf() when the buffer needs to grow
perf header: Fix numa topology printing
perf, powerpc: Fix hw breakpoints returning -ENOSPC
Merge misc fixes from Andrew Morton:
"Seven fixes, some of them fingers-crossed :("
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (7 patches)
drivers/rtc/rtc-tps65910.c: fix invalid pointer access on _remove()
mm: soft offline: split thp at the beginning of soft_offline_page()
mm: avoid waking kswapd for THP allocations when compaction is deferred or contended
revert "Revert "mm: remove __GFP_NO_KSWAPD""
mm: vmscan: fix endless loop in kswapd balancing
mm/vmemmap: fix wrong use of virt_to_page
mm: compaction: fix return value of capture_free_page()
It apepars that this patch was innocent, and we hope that "mm: avoid
waking kswapd for THP allocations when compaction is deferred or
contended" will fix the final kswapd-spinning cause.
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We really don't want to look at the block size for the raw block device
accesses in fs/block-dev.c, because it may be changing from under us.
So get rid of the max_block logic entirely, since the caller should
already have done it anyway.
That leaves the only user of this function in fs/buffer.c, so move the
whole function there and make it static.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts the block-device direct access code to the previous
unlocked code, now that fs/buffer.c no longer needs external locking.
With this, fs/block_dev.c is back to the original version, apart from a
whitespace cleanup that I didn't want to revert.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
task_cputime_adjusted() and thread_group_cputime_adjusted()
essentially share the same code. They just don't use the same
source:
* The first function uses the cputime in the task struct and the
previous adjusted snapshot that ensures monotonicity.
* The second adds the cputime of all tasks in the group and the
previous adjusted snapshot of the whole group from the signal
structure.
Just consolidate the common code that does the adjustment. These
functions just need to fetch the values from the appropriate
source.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
We have thread_group_cputime() and thread_group_times(). The naming
doesn't provide enough information about the difference between
these two APIs.
To lower the confusion, rename thread_group_times() to
thread_group_cputime_adjusted(). This name better suggests that
it's a version of thread_group_cputime() that does some stabilization
on the raw cputime values. ie here: scale on top of CFS runtime
stats and bound lower value for monotonicity.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Use synchronize_sched_expedited() instead of synchronize_sched()
to improve mount speed.
This patch improves mount time from 0.500s to 0.013s for Jeff's
test-case.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
based on failures" reverted, Zdenek Kabelac reported the following
Hmm, so it's just took longer to hit the problem and observe
kswapd0 spinning on my CPU again - it's not as endless like before -
but still it easily eats minutes - it helps to turn off Firefox
or TB (memory hungry apps) so kswapd0 stops soon - and restart
those apps again. (And I still have like >1GB of cached memory)
kswapd0 R running task 0 30 2 0x00000000
Call Trace:
preempt_schedule+0x42/0x60
_raw_spin_unlock+0x55/0x60
put_super+0x31/0x40
drop_super+0x22/0x30
prune_super+0x149/0x1b0
shrink_slab+0xba/0x510
The sysrq+m indicates the system has no swap so it'll never reclaim
anonymous pages as part of reclaim/compaction. That is one part of the
problem but not the root cause as file-backed pages could also be
reclaimed.
The likely underlying problem is that kswapd is woken up or kept awake
for each THP allocation request in the page allocator slow path.
If compaction fails for the requesting process then compaction will be
deferred for a time and direct reclaim is avoided. However, if there
are a storm of THP requests that are simply rejected, it will still be
the the case that kswapd is awake for a prolonged period of time as
pgdat->kswapd_max_order is updated each time. This is noticed by the
main kswapd() loop and it will not call kswapd_try_to_sleep(). Instead
it will loopp, shrinking a small number of pages and calling
shrink_slab() on each iteration.
The temptation is to supply a patch that checks if kswapd was woken for
THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not
backed up by proper testing. As 3.7 is very close to release and this
is not a bug we should release with, a safer path is to revert "mm:
remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing
out the balance_pgdat() logic in general.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit baf05aa927 ("bug: introduce BUILD_BUG_ON_INVALID() macro")
introduces this macro only when _CHECKER_ is not defined. Define a
silent macro in the else condition to fix following sparse warning:
mm/filemap.c:395:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
mm/filemap.c:396:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
mm/filemap.c:397:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
include/linux/mm.h:419:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
include/linux/mm.h:419:9: error: not a function <noident>
Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simple build regression fix for DT device drivers on Sparc. An earlier
change had masked out the of_iomap() helper on SPARC.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQr/PiAAoJEEFnBt12D9kB5vsP/RpNnzUAE/tORTGqu/gHpMf/
doi88Y7+on68ZrQeHrtGhVQBvBrZiAohlP9z62qYkSThBb1m63k3HRvX4/Ygk8Qd
AID6TvtPTbsivaRh4l5cbROmKW7j22XUBUGDBO4mczbtxhTrvc58+51CPrBBH0GO
NGv+sBgaKO97dfflgnUtgFftZ9XYDtiSpL64fQ2dvAMoz9tHhOtYt9YS2egDHV49
wN8g8VjK7pZpuX7mRB3cCHujqEka59iFhitAzisqbr+3q4Hl6YSNAEhaNPRkuyRa
Jn5JLV1gKi5g3j3wh9P2y8ZiDhjUidBD8kk13mPTdDJCNo7PV89zB678+9U6n74N
5l2ZSzjO6ylWteag9pJJsfMQbybtsD8od2fOg+qa9KIl+cYtJV9NNg1WhBTnYDmG
iiOqZW0UyIn3YuAXSDP5T1PuiBtHHlFJwmZFNYMEMRf6tbQOs1MvNw9JYokZgO8f
j09M28yO2YafoTppBcyyYWUyLEl7iKV9lS2JaZA7OLNAlA58dmxB9Ge9RiTB4pbD
Z665edAnzWMZS320Mg5oyLJv33u7GyuVC1jm10O1ZNonhlUkUzSOnxpOMhnObJYN
Bfhl7MFFNqyUFu+1R6DACxeEU3Rk+CEMeOrSVvwS6bfVvWib+BFmZz/sS2YzOZyv
Rk14JvH91UBhXfThqnQ0
=yKFc
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6
Pull device tree regression fix from Grant Likely:
"Simple build regression fix for DT device drivers on Sparc. An
earlier change had masked out the of_iomap() helper on SPARC."
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6:
of/address: sparc: Declare of_iomap as an extern function for sparc again
This bug-fix makes sure that of_iomap is defined extern for sparc so that the
sparc-specific implementation of_iomap is once again used when including
include/linux/of_address.h in a sparc context. OF_GPIO that is now available for
sparc relies on this.
The bug was inadvertently introduced in a850a75, "of/address: add empty static
inlines for !CONFIG_OF", that added a static dummy inline for of_iomap when
!CONFIG_OF_ADDRESS. However, CONFIG_OF_ADDRESS is never defined for sparc, but
there is a sparc-specific implementation /arch/sparc/kernel/of_device_common.c.
This fix takes the same approach as 0bce04b that solved the equivalent problem
for of_address_to_resource.
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Pull i2c fixes from Wolfram Sang:
"Bugfixes for the i2c subsystem.
Except for a few one-liners, there is mainly one revert because of an
overlooked dependency. Since there is no linux-next at the moment, I
did some extra testing, and all was fine for me."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: mxs: Handle i2c DMA failure properly
i2c: s3c2410: Fix code to free gpios
i2c: omap: ensure writes to dev->buf_len are ordered
Revert "ARM: OMAP: convert I2C driver to PM QoS for MPU latency constraints"
i2c: at91: fix SMBus quick command
Pull input updates from Dmitry Torokhov:
"This fixes recent regression where /dev/input/mice got assigned wrong
device node which messed up setups with static /dev, and a regression
in ads7846 GPIO debounce setup."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
ARM - OMAP: ads7846: fix pendown debounce setting
Input: ads7846 - enable pendown GPIO debounce time setting
Input: mousedev - move /dev/input/mice to the correct minor
Input: MT - document new 'flags' argument of input_mt_init_slots()
Some platforms need the pendown GPIO debounce time setting programmed.
Since the pendown GPIO is handled by the driver, the debounce time
should also be handled along with the pendown GPIO request.
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Make perf build for x86 once the UAPI disintegration patches for that arch
have been applied by adding the appropriate -I flags - in the right order -
and then converting some #includes that use ../.. notation to find main kernel
headerfiles to use <asm/foo.h> and <linux/foo.h> instead.
Note that -Iarch/foo/include/uapi is present _before_ -Iarch/foo/include.
This makes sure we get the userspace version of the pt_regs struct. Ideally,
we wouldn't have the latter -I flag at all, but unfortunately we want
asm/svm.h and asm/vmx.h in builtin-kvm.c and these aren't part of the UAPI -
at least not for x86. I wonder if the bits outside of the __KERNEL__ guards
*should* be transferred there.
I note also that perf seems to do its dependency handling manually by listing
all the header files it might want to use in LIB_H in the Makefile. Can this
be changed to use -MD?
Note that to do make this work, we need to export and UAPI disintegrate
linux/hw_breakpoint.h, which I think should've been exported previously so that
perf can access the bits. We have to do this in the same patch to maintain
bisectability.
Signed-off-by: David Howells <dhowells@redhat.com>
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
revert "mm: fix-up zone present pages"
tmpfs: change final i_blocks BUG to WARNING
tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
rapidio: fix kernel-doc warnings
swapfile: fix name leak in swapoff
memcg: fix hotplugged memory zone oops
mips, arc: fix build failure
memcg: oom: fix totalpages calculation for memory.swappiness==0
mm: fix build warning for uninitialized value
mm: add anon_vma_lock to validate_mm()
Revert commit 7f1290f2f2 ("mm: fix-up zone present pages")
That patch tried to fix a issue when calculating zone->present_pages,
but it caused a regression on 32bit systems with HIGHMEM. With that
change, reset_zone_present_pages() resets all zone->present_pages to
zero, and fixup_zone_present_pages() is called to recalculate
zone->present_pages when the boot allocator frees core memory pages into
buddy allocator. Because highmem pages are not freed by bootmem
allocator, all highmem zones' present_pages becomes zero.
Various options for improving the situation are being discussed but for
now, let's return to the 3.6 code.
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix rapidio kernel-doc warnings:
Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
Warning(include/linux/rio.h:290): No description found for parameter 'switches'
Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When MEMCG is configured on (even when it's disabled by boot option),
when adding or removing a page to/from its lru list, the zone pointer
used for stats updates is nowadays taken from the struct lruvec. (On
many configurations, calculating zone from page is slower.)
But we have no code to update all the lruvecs (per zone, per memcg) when
a memory node is hotadded. Here's an extract from the oops which
results when running numactl to bind a program to a newly onlined node:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
IP: __mod_zone_page_state+0x9/0x60
Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
Call Trace:
__pagevec_lru_add_fn+0xdf/0x140
pagevec_lru_move_fn+0xb1/0x100
__pagevec_lru_add+0x1c/0x30
lru_add_drain_cpu+0xa3/0x130
lru_add_drain+0x2f/0x40
...
The natural solution might be to use a memcg callback whenever memory is
hotadded; but that solution has not been scoped out, and it happens that
we do have an easy location at which to update lruvec->zone. The lruvec
pointer is discovered either by mem_cgroup_zone_lruvec() or by
mem_cgroup_page_lruvec(), and both of those do know the right zone.
So check and set lruvec->zone in those; and remove the inadequate
attempt to set lruvec->zone from lruvec_init(), which is called before
NODE_DATA(node) has been allocated in such cases.
Ah, there was one exceptionr. For no particularly good reason,
mem_cgroup_force_empty_list() has its own code for deciding lruvec.
Change it to use the standard mem_cgroup_zone_lruvec() and
mem_cgroup_get_lru_size() too. In fact it was already safe against such
an oops (the lru lists in danger could only be empty), but we're better
proofed against future changes this way.
I've marked this for stable (3.6) since we introduced the problem in 3.5
(now closed to stable); but I have no idea if this is the only fix
needed to get memory hotadd working with memcg in 3.6, and received no
answer when I enquired twice before.
Reported-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We've been sitting on this longer than we meant to due to travel and
other activities, but the number of patches is luckily not that high.
Biggest changes are from a batch of OMAP bugfixes, but there are a
few for the broader set of SoCs too (bcm2835, pxa, highbank, tegra,
at91 and i.MX).
The OMAP patches contain some fixes for MUSB/PHY on omap4 which
ends up being a bit on the large side but needed for legacy (non-DT)
platforms. Beyond that there are a handful of hwmod/pm changes.
So, fairly noncontroversial stuff all in all, and as usual around this
time the fixes are well targeted at specific problems.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQpmQOAAoJEIwa5zzehBx37+YP/jzcKS1pU5Wf73GYIUcCPqO0
iRwLziexucUkWXnqIIPLedUJ8Dze/8q1tSTnQ7/JSXy9SYtJf651aj5OAo8w/cXO
d58+y1S+VTsFHbppKfQHbGpYq2n2f4PPvrL24ftp40OmomVl/ktqOB4fDN1/YuAw
lfTeo0v8MNfyVni5ij21rCNS17IC0Tl4Mfth8zIILWe6qdqajpla7CoO7ppmUM08
OAPi6NJL/l8vqfqNtGuk2x9cOca0jY1rdib/rfrL1LxrtLT//NP0d6h+wKaSxLWm
Qvr9nAsnmZNV0pFnYjVxfSMwM6Gf2SBh0QG3lF0akwfe3bEXqfnG5muAtWEhpTlt
MVx54UgKSWVBgfBH8/SsDkJ3UydxNO5XjHz9YOix1Sj390J2zpP3E24Y0vmYYaNn
c2seHeH4SckMZ1mddZgy1NT8y7/zaXQ2OSRLyVigJ9wjKmduuOW013BpKUlHFI9E
Uzh1GLpWe2Cwl3wKWlHRlhgp0NzpWEHhrPW254rOfai8P9xeFMMQH8eqUlSWZbDN
GAzqUWZ4/F6NxPV1GPrVCZjrA9IYKKtZ5GkPwC1FibC9Kfyb0rp9kr7KM4mzWUY3
7MxpFatMS4rdbJk5lXCQ1+EIIWF6xdNbURCyMAnoiaowMzdv+LkfBIW+17fDPDgI
WAu7+QCKTQd+i/bvMsKh
=aKWl
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"We've been sitting on this longer than we meant to due to travel and
other activities, but the number of patches is luckily not that high.
Biggest changes are from a batch of OMAP bugfixes, but there are a few
for the broader set of SoCs too (bcm2835, pxa, highbank, tegra, at91
and i.MX).
The OMAP patches contain some fixes for MUSB/PHY on omap4 which ends
up being a bit on the large side but needed for legacy (non-DT)
platforms. Beyond that there are a handful of hwmod/pm changes.
So, fairly noncontroversial stuff all in all, and as usual around this
time the fixes are well targeted at specific problems."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: imx: ehci: fix host power mask bit
ARM i.MX: fix error-valued pointer dereference in clk_register_gate2()
ARM: at91/usbh: fix overcurrent gpio setup
ARM: at91/AT91SAM9G45: fix crypto peripherals irq issue due to sparse irq support
ARM: boot: Fix usage of kecho
ARM: OMAP: ocp2scp: create omap device for ocp2scp
ARM: OMAP4: add _dev_attr_ to ocp2scp for representing usb_phy
drivers: bus: ocp2scp: add pdata support
irqchip: irq-bcm2835: Add terminating entry for of_device_id table
ARM: highbank: retry wfi on reset request
ARM: OMAP4: PM: fix regulator name for VDD_MPU
ARM: OMAP4: hwmod data: do not enable or reset the McPDM during kernel init
ARM: OMAP2+: hwmod: add flag to prevent hwmod code from touching IP block during init
ARM: dt: tegra: fix length of pad control and mux registers
ARM: OMAP: hwmod: wait for sysreset complete after enabling hwmod
ARM: OMAP2+: clockdomain: Fix OMAP4 ISS clk domain to support only SWSUP
ARM: pxa/spitz_pm: Fix hang when resuming from STR
ARM: pxa: hx4700: Fix backlight PWM device number
ARM: OMAP2+: PM: add missing newline to VC warning message
Users of GCC 4.7 have reported compiler errors due to having inline
applied to function declarations in clk-provider.h. The definitions
exist in drivers/clk/clk.c. An example error:
In file included from arch/arm/mach-omap2/clockdomain.c:25:0:
arch/arm/mach-omap2/clockdomain.c: In function ‘clkdm_clk_disable’:
include/linux/clk-provider.h:338:12: error: inlining failed in call to always_inline ‘__clk_get_enable_count’: function body not available
arch/arm/mach-omap2/clockdomain.c:1001:28: error: called from here
make[1]: *** [arch/arm/mach-omap2/clockdomain.o] Error 1
make: *** [arch/arm/mach-omap2] Error 2
This patch removes the use of inline from include/linux/clk-provider.h
but keeps the function definitions in drivers/clk/clk.c as inlined since
they are one-liners.
Signed-off-by: Igor Mazanov <i.mazanov@gmail.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: improved subject, added changelog]
Pull networking fixes from David Miller:
"Bug fixes galore, mostly in drivers as is often the case:
1) USB gadget and cdc_eem drivers need adjustments to their frame size
lengths in order to handle VLANs correctly. From Ian Coolidge.
2) TIPC and several network drivers erroneously call tasklet_disable
before tasklet_kill, fix from Xiaotian Feng.
3) r8169 driver needs to apply the WOL suspend quirk to more chipsets,
fix from Cyril Brulebois.
4) Fix multicast filters on RTL_GIGA_MAC_VER_35 r8169 chips, from
Nathan Walp.
5) FDB netlink dumps should use RTM_NEWNEIGH as the message type, not
zero. From John Fastabend.
6) Fix smsc95xx tx checksum offload on big-endian, from Steve
Glendinning.
7) __inet_diag_dump() needs to repsect and report the error value
returned from inet_diag_lock_handler() rather than ignore it.
Otherwise if an inet diag handler is not available for a particular
protocol, we essentially report success instead of giving an error
indication. Fix from Cyrill Gorcunov.
8) When the QFQ packet scheduler sees TSO/GSO packets it does not
handle things properly, and in fact ends up corrupting it's
datastructures as well as mis-schedule packets. Fix from Paolo
Valente.
9) Fix oopser in skb_loop_sk(), from Eric Leblond.
10) CXGB4 passes partially uninitialized datastructures in to FW
commands, fix from Vipul Pandya.
11) When we send unsolicited ipv6 neighbour advertisements, we should
send them to the link-local allnodes multicast address, as per
RFC4861. Fix from Hannes Frederic Sowa.
12) There is some kind of bug in the usbnet's kevent deferral
mechanism, but more immediately when it triggers an uncontrolled
stream of kernel messages spam the log. Rate limit the error log
message triggered when this problem occurs, as sending thousands
of error messages into the kernel log doesn't help matters at all,
and in fact makes further diagnosis more difficult.
From Steve Glendinning.
13) Fix gianfar restore from hibernation, from Wang Dongsheng.
14) The netlink message attribute sizes are wrong in the ipv6 GRE
driver, it was using the size of ipv4 addresses instead of ipv6
ones :-) Fix from Nicolas Dichtel."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
gre6: fix rtnl dump messages
gianfar: ethernet vanishes after restoring from hibernation
usbnet: ratelimit kevent may have been dropped warnings
ipv6: send unsolicited neighbour advertisements to all-nodes
net: usb: cdc_eem: Fix rx skb allocation for 802.1Q VLANs
usb: gadget: g_ether: fix frame size check for 802.1Q
cxgb4: Fix initialization of SGE_CONTROL register
isdn: Make CONFIG_ISDN depend on CONFIG_NETDEVICES
cxgb4: Initialize data structures before using.
af-packet: fix oops when socket is not present
pkt_sched: enable QFQ to support TSO/GSO
net: inet_diag -- Return error code if protocol handler is missed
net: bnx2x: Fix typo in bnx2x driver
smsc95xx: fix tx checksum offload for big endian
rtnetlink: Use nlmsg type RTM_NEWNEIGH from dflt fdb dump
ptp: update adjfreq callback description
r8169: allow multicast packets on sub-8168f chipset.
r8169: Fix WoL on RTL8168d/8111d.
drivers/net: use tasklet_kill in device remove/close process
tipc: do not use tasklet_disable before tasklet_kill
Pull sparc fixes from David Miller:
"Several build/bug fixes for sparc, including:
1) Configuring a mix of static vs. modular sparc64 crypto modules
didn't work, remove an ill-conceived attempt to only have to build
the device match table for these drivers once to fix the problem.
Reported by Meelis Roos.
2) Make the montgomery multiple/square and mpmul instructions actually
usable in 32-bit tasks. Essentially this involves providing 32-bit
userspace with a way to use a 64-bit stack when it needs to.
3) Our sparc64 atomic backoffs don't yield cpu strands properly on
Niagara chips. Use pause instruction when available to achieve
this, otherwise use a benign instruction we know blocks the strand
for some time.
4) Wire up kcmp
5) Fix the build of various drivers by removing the unnecessary
blocking of OF_GPIO when SPARC.
6) Fix unintended regression wherein of_address_to_resource stopped
being provided. Fix from Andreas Larsson.
7) Fix NULL dereference in leon_handle_ext_irq(), also from Andreas
Larsson."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix build with mix of modular vs. non-modular crypto drivers.
sparc: Support atomic64_dec_if_positive properly.
of/address: sparc: Declare of_address_to_resource() as an extern function for sparc again
sparc32, leon: Check for existent irq_map entry in leon_handle_ext_irq
sparc: Add sparc support for platform_get_irq()
sparc: Allow OF_GPIO on sparc.
qlogicpti: Fix build warning.
sparc: Wire up sys_kcmp.
sparc64: Improvde documentation and readability of atomic backoff code.
sparc64: Use pause instruction when available.
sparc64: Fix cpu strand yielding.
sparc64: Make montmul/montsqr/mpmul usable in 32-bit threads.
This bug-fix makes sure that of_address_to_resource is defined extern for sparc
so that the sparc-specific implementation of of_address_to_resource() is once
again used when including include/linux/of_address.h in a sparc context. A
number of drivers in mainline relies on this function working for sparc.
The bug was introduced in a850a75544, "of/address:
add empty static inlines for !CONFIG_OF". Contrary to that commit title, the
static inlines are added for !CONFIG_OF_ADDRESS, and CONFIG_OF_ADDRESS is never
defined for sparc. This is good behavior for the other functions in
include/linux/of_address.h, as the extern functions defined in
drivers/of/address.c only gets linked when OF_ADDRESS is configured. However,
for of_address_to_resource there exists a sparc-specific implementation in
arch/sparc/arch/sparc/kernel/of_device_common.c
Solution suggested by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The of_device_id match data is now marked as const and
must not be modified. This changes the dw_mmc to mark
all pointers passing the dw_mci_drv_data or dw_mci_dma_ops
structures as const, and also marks the static definitions
as const.
drivers/mmc/host/dw_mmc-exynos.c: In function 'dw_mci_exynos_probe':
drivers/mmc/host/dw_mmc-exynos.c:234:11: warning: assignment discards 'const' qualifier from pointer target type [enabled by default]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Abraham <thomas.abraham@linaro.org>
Cc: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
CMD23 causes lots of errors in kernel on some freescale SoCs
(P1020, P1021, P1022, P1024, P1025 and P4080) when MMC card used,
which is because these controllers does not support CMD23,
even on the SoCs which declares CMD23 is supported.
Therefore, we'll not use CMD23.
Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Even though platform_get_irq returns error, 'host->irq'
always has an unsigned value. Less-than-zero comparison
of an unsigned value is never true. Type of 'unsigned int'
will be changed for 'int'.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
ocp2scp was not having pdata support which makes *musb* fail for non-dt
boot in OMAP platform. The pdata will have information about the devices
that is connected to ocp2scp. ocp2scp driver will now make use of this
information to create the devices that is attached to ocp2scp.
This is needed to fix MUSB regression caused by commit c9e4412a
(arm: omap: phy: remove unused functions from omap-phy-internal.c)
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Felipe Balbi <balbi@ti.com>
[tony@atomide.com: updated comments for regression info]
Signed-off-by: Tony Lindgren <tony@atomide.com>
This patch updates the adjfreq callback description to include a note that the
delta in ppb is always relative to the base frequency, and not to the current
frequency of the hardware clock.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
CC: stable@vger.kernel.org [v3.5+]
CC: Richard Cochran <richard.cochran@gmail.com>
CC: John Stultz <john.stultz@linaro.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This hashtable implementation is using hlist buckets to provide a simple
hashtable to prevent it from getting reimplemented all over the kernel.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
[ Merging this now, so that subsystems can start applying Sasha's
patches that use this - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After commit b3356bf0db (KVM: emulator: optimize "rep ins" handling),
the pieces of io data can be collected and write them to the guest memory
or MMIO together
Unfortunately, kvm splits the mmio access into 8 bytes and store them to
vcpu->mmio_fragments. If the guest uses "rep ins" to move large data, it
will cause vcpu->mmio_fragments overflow
The bug can be exposed by isapc (-M isapc):
[23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ ......]
[23154.858083] Call Trace:
[23154.859874] [<ffffffffa04f0e17>] kvm_get_cr8+0x1d/0x28 [kvm]
[23154.861677] [<ffffffffa04fa6d4>] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm]
[23154.863604] [<ffffffffa04f5a1a>] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm]
Actually, we can use one mmio_fragment to store a large mmio access then
split it when we pass the mmio-exit-info to userspace. After that, we only
need two entries to store mmio info for the cross-mmio pages access
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
- one recently introduced crash for dm-raid10 with discard
- one bug in new functionality that has been around for a few releases.
- minor bug in md's 'faulty' personality
and UAPI disintegration for md.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQIVAwUAUJB3rDnsnt1WYoG5AQLuAw/8C2I1LNHRc9zccO4akAg9AyYcpoNGcY6I
PG1SR7sQiKuQYNTwc7xqqYJ241r3U+Ablh8nurr0rbCmYX8rcnjwTZzhH6h0ER5Q
M31i7CKb2OY7VGKjs1FtlVnRtdRWVkLHHappEaT0NzjHUqpCDGZYcLMoSaLaCNdE
8P8GlAI+w8kachkWRnp1a4pdR7Kc1SnP97aZJ304EDy63gYwcsOg+m8zZj5h74u9
gJpVES1yqflN12CHIkK3K22QM9a1KbP9L9TKQSsevmOe4ju/ID3IlTKjKJvvYoUS
r9FJIJsGbzOREr1iap4hr81+rrH56t4o1FxgWCuj2wpw7EWelMFrTH0iMNNaxjyk
z+g7ZElnSjkOYxQXirKcWTJ+F5F4jEc48XlFNjtuvHz771xby3Q5dTN/+hMCQ9k1
JNML2A9QquK0jLZauRIsbBpVy2uC+vOoJ2BX2kcMOvuHUeCzK78x4HZjZi7mP6Dg
O9E4+ocGnFZsqnCPtBAxv9G8RE36Efp3uxms9HlwY6TeTGJWyZuiWDyNea2tRLct
OARMseYVxkup7DOnHirtb9Pywc3kkLqtXcWbZH68Hi5uHMrGFUO2ZhSwjfsC5+rZ
Nyt1lcRLZaxy/JFgHXzOeLqA2o/nY62OiMEgP+ENbASNJ4HKf685ytzmg2BVetsY
9E/KUQBEJqY=
=plEs
-----END PGP SIGNATURE-----
Merge tag 'md-3.7-fixes' of git://neil.brown.name/md
Pull md fixes from NeilBrown:
"Some fixes for md in 3.7
- one recently introduced crash for dm-raid10 with discard
- one bug in new functionality that has been around for a few
releases.
- minor bug in md's 'faulty' personality
and UAPI disintegration for md."
* tag 'md-3.7-fixes' of git://neil.brown.name/md:
MD RAID10: Fix oops when creating RAID10 arrays via dm-raid.c
md/raid1: Fix assembling of arrays containing Replacements.
md faulty: use disk_stack_limits()
UAPI: (Scripted) Disintegrate include/linux/raid
vtime_account() doesn't have the same role in
CONFIG_VIRT_CPU_ACCOUNTING and CONFIG_IRQ_TIME_ACCOUNTING.
In the first case it handles time accounting in any context. In
the second case it only handles irq time accounting.
So when vtime_account() is called from outside vtime_account_irq_*()
this call is pointless to CONFIG_IRQ_TIME_ACCOUNTING.
To fix the confusion, change vtime_account() to irqtime_account_irq()
in CONFIG_IRQ_TIME_ACCOUNTING. This way we ensure future account_vtime()
calls won't waste useless cycles in the irqtime APIs.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
With CONFIG_VIRT_CPU_ACCOUNTING, when vtime_account()
is called in irq entry/exit, we perform a check on the
context: if we are interrupting the idle task we
account the pending cputime to idle, otherwise account
to system time or its sub-areas: tsk->stime, hardirq time,
softirq time, ...
However this check for idle only concerns the hardirq entry
and softirq entry:
* Hardirq may directly interrupt the idle task, in which case
we need to flush the pending CPU time to idle.
* The idle task may be directly interrupted by a softirq if
it calls local_bh_enable(). There is probably no such call
in any idle task but we need to cover every case. Ksoftirqd
is not concerned because the idle time is flushed on context
switch and softirq in the end of hardirq have the idle time
already flushed from the hardirq entry.
In the other cases we always account to system/irq time:
* On hardirq exit we account the time to hardirq time.
* On softirq exit we account the time to softirq time.
To optimize this and avoid the indirect call to vtime_account()
and the checks it performs, specialize the vtime irq APIs and
only perform the check on irq entry. Irq exit can directly call
vtime_account_system().
CONFIG_IRQ_TIME_ACCOUNTING behaviour doesn't change and directly
maps to its own vtime_account() implementation. One may want
to take benefits from the new APIs to optimize irq time accounting
as well in the future.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Switching to or from guest context is done on ioctl context.
So by the time we call kvm_guest_enter() or kvm_guest_exit()
we know we are not running the idle task.
As a result, we can directly account the cputime using
vtime_account_system().
There are two good reasons to do this:
* We avoid some useless checks on guest switch. It optimizes
a bit this fast path.
* In the case of CONFIG_IRQ_TIME_ACCOUNTING, calling vtime_account()
checks for irq time to account. This is pointless since we know
we are not in an irq on guest switch. This is wasting cpu cycles
for no good reason. vtime_account_system() OTOH is a no-op in
this config option.
* We can remove the irq disable/enable around kvm guest switch in s390.
A further optimization may consist in introducing a vtime_account_guest()
that directly calls account_guest_time().
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
vtime_account_system() currently has only one caller with
vtime_account() which is irq safe.
Now we are going to call it from other places like kvm where
irqs are not always disabled by the time we account the cputime.
So let's make it irqsafe. The arch implementation part is now
prefixed with "__".
vtime_account_idle() arch implementation is prefixed accordingly
to stay consistent.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
These APIs are scattered around and are going to expand a bit.
Let's create a dedicated header file for sanity.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Use rcu_read_lock_sched / rcu_read_unlock_sched / synchronize_sched
instead of rcu_read_lock / rcu_read_unlock / synchronize_rcu.
This is an optimization. The RCU-protected region is very small, so
there will be no latency problems if we disable preempt in this region.
So we use rcu_read_lock_sched / rcu_read_unlock_sched that translates
to preempt_disable / preempt_disable. It is smaller (and supposedly
faster) than preemptible rcu_read_lock / rcu_read_unlock.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>