Host controller drivers use the NS_TO_US macro to convert transaction
times, which are computed in nanoseconds, to microseconds for
scheduling. Periodic scheduling requires worst-case estimates, but
the macro does its conversion using round-to-nearest. This patch
changes it to use round-up, giving a correct worst-case value.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
usb_wait_anchor_empty_timeout() should wait till the completion handler
has run. Both the zd1211rw driver and the uas driver (in its task mgmt) depend
on the completion handler having completed when usb_wait_anchor_empty_timeout()
returns, as they read state set by the completion handler after an
usb_wait_anchor_empty_timeout() call.
But __usb_hcd_giveback_urb() calls usb_unanchor_urb before calling the
completion handler. This is necessary as the completion handler may
re-submit and re-anchor the urb. But this introduces a race where the state
these drivers want to read has not been set yet by the completion handler
(this race is easily triggered with the uas task mgmt code).
I've considered adding an anchor_count to struct urb, which would be
incremented on anchor and decremented on unanchor, and then only actually
do the anchor / unanchor on 0 -> 1 and 1 -> 0 transtions, combined with
moving the unanchor call in hcd_giveback_urb to after calling the completion
handler. But this will only work if urb's are only re-anchored to the same
anchor as they were anchored to before the completion handler ran.
And at least one driver re-anchors to another anchor from the completion
handler (rtlwifi).
So I have come up with this patch instead, which adds the ability to
suspend wakeups of usb_wait_anchor_empty_timeout() waiters to the usb_anchor
functionality, and uses this in __usb_hcd_giveback_urb() to delay wake-ups
until the completion handler has run.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
And do so in a way which ensures that any fields added in the future will
also get properly zero-ed.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The loops which SPI controller drivers use to process the list of transfers
in a spi_message are typically very similar and have some error prone areas
such as the handling of /CS. Help simplify drivers by factoring this code
out into the core - if drivers provide a transfer_one() function instead
of a transfer_one_message() function the core will handle processing at the
message level.
/CS can be controlled by either setting cs_gpio or providing a set_cs
function. If this is not possible for hardware reasons then both can be
omitted and the driver should continue to implement manual /CS handling.
This is a first step in refactoring and it is expected that there will be
further enhancements, for example factoring out of the mapping of transfers
for DMA and the initiation and completion of interrupt driven transfers.
Signed-off-by: Mark Brown <broonie@linaro.org>
Many SPI drivers perform setup and tear down on every message, usually
doing things like DMA mapping the message. Provide hooks for them to use
to provide such operations.
This is of limited value for drivers that implement transfer_one_message()
but will be of much greater utility with future factoring out of standard
implementations of that function.
Signed-off-by: Mark Brown <broonie@linaro.org>
Pull drm fixes from Dave Airlie:
"All over the map..
- nouveau:
disable MSI, needs more work, will try again next merge window
- radeon:
audio + uvd regression fixes, dpm fixes, reset fixes
- i915:
the dpms fix might fix your haswell
And one pain in the ass revert, so we have VGA arbitration that when
implemented 4-5 years ago really hoped that GPUs could remove
themselves from arbitration completely once they had a kernel driver.
It seems Intel hw designers decided that was too nice a facility to
allow us to have so they removed it when they went on-die (so since
Ironlake at least). Now Alex Williamson added support for VGA
arbitration for newer GPUs however this now exposes itself to
userspace as requireing arbitration of GPU VGA regions and the X
server gets involved and disables things that it can't handle when VGA
access is possibly required around every operation.
So in order to not break userspace we just reverted things back to the
old known broken status so maybe we can try and design out way out.
Ville also had a patch to use stop machine for the two times Intel
needs to access VGA space, that might be acceptable with some rework,
but for now myself and Daniel agreed to just go back"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (23 commits)
Revert "i915: Update VGA arbiter support for newer devices"
Revert "drm/i915: Delay disabling of VGA memory until vgacon->fbcon handoff is done"
drm/radeon: re-enable sw ACR support on pre-DCE4
drm/radeon/dpm: disable bapm on TN asics
drm/radeon: improve soft reset on CIK
drm/radeon: improve soft reset on SI
drm/radeon/dpm: off by one in si_set_mc_special_registers()
drm/radeon/dpm/btc: off by one in btc_set_mc_special_registers()
drm/radeon: forever loop on error in radeon_do_test_moves()
drm/radeon: fix hw contexts for SUMO2 asics
drm/radeon: fix typo in CP DMA register headers
drm/radeon/dpm: disable multiple UVD states
drm/radeon: use hw generated CTS/N values for audio
drm/radeon: fix N/CTS clock matching for audio
drm/radeon: use 64-bit math to calculate CTS values for audio (v2)
drm/edid: catch kmalloc failure in drm_edid_to_speaker_allocation
Revert "drm/fb-helper: don't sleep for screen unblank when an oops is in progress"
drm/gma500: fix things after get/put page helpers
drm/nouveau/mc: disable msi support by default, it's busted in tons of places
drm/i915: Only apply DPMS to the encoder if enabled
...
Add REGULATOR_LINEAR_RANGE macro and convert regulator drivers to use it.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
linear ranges means each range has linear voltage settings.
So we can calculate max_uV for each linear range in regulator core rather than
set the max_uV field in drivers.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
The current VFIO-on-POWER implementation supports only user mode
driven mapping, i.e. QEMU is sending requests to map/unmap pages.
However this approach is really slow, so we want to move that to KVM.
Since H_PUT_TCE can be extremely performance sensitive (especially with
network adapters where each packet needs to be mapped/unmapped) we chose
to implement that as a "fast" hypercall directly in "real
mode" (processor still in the guest context but MMU off).
To be able to do that, we need to provide some facilities to
access the struct page count within that real mode environment as things
like the sparsemem vmemmap mappings aren't accessible.
This adds an API function realmode_pfn_to_page() to get page struct when
MMU is off.
This adds to MM a new function put_page_unless_one() which drops a page
if counter is bigger than 1. It is going to be used when MMU is off
(for example, real mode on PPC64) and we want to make sure that page
release will not happen in real mode as it may crash the kernel in
a horrible way.
CONFIG_SPARSEMEM_VMEMMAP and CONFIG_FLATMEM are supported.
Cc: linux-mm@kvack.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds hash_for_each_possible_rcu_notrace() which is basically
a notrace clone of hash_for_each_possible_rcu() which cannot be
used in real mode due to its tracing/debugging capability.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down
a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto'
constructs, as outlined here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
Implement a workaround suggested by Jakub Jelinek.
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This reverts commit 81b5c7bc8d.
Adding drm/i915 into the vga arbiter chain means that X (in a piece of
well-meant paranoia) will do a get/put on the vga decoding around
_every_ accel call down into the ddx. Which results in some nice
performance disasters [1]. This really breaks userspace, by disabling
DRI for everyone, and stops OpenGL from working, this isn't limited
to just the i915 but both the integrated and discrete GPUs on
multi-gpu systems, in other words this causes untold worlds of pain,
Ville tried to come up with a Great Hack to fiddle the required VGA
I/O ops behind everyone's back using stop_machine, but that didn't
really work out [2]. Given that we're fairly late in the -rc stage for
such games let's just revert this all.
One thing we might want to keep is to delay the disabling of the vga
decoding until the fbdev emulation and the fbcon screen is set up. If
we kill vga mem decoding beforehand fbcon can end up with a white
square in the top-left corner it tried to save from the vga memory for
a seamless transition. And we have bug reports on older platforms
which seem to match these symptoms.
But again that's something to play around with in -next.
References: [1] http://lists.x.org/archives/xorg-devel/2013-September/037763.html
References: [2] http://www.spinics.net/lists/intel-gfx/msg34062.html
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
For omaps, we still have dependencies to the legacy code
for handling the PRM (Power Reset Management) interrupts,
and also for reconfiguring the io wake-up chain after
changes.
Let's pass the PRM interrupt and the rearm functions via
auxdata. Then when at some point we have a proper PRM
driver, we can get the interrupt via device tree and
set up the rearm function as exported function in the
PRM driver.
By using auxdata we can remove a dependency to the
wake-up events for converting omap3 to be device
tree only.
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Prakash Manjunathappa <prakash.pm@ti.com>
Cc: Roger Quadros <rogerq@ti.com>
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
non-x86 platforms, in particular MIPS and ARM.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABCAAGBQJSVvO/AAoJENNvdpvBGATwNZ4P+wadRWY/Gdz/p9332qdVrGYs
nP4DPWSg+n3RH/fOnacEwHF5vqapTe03G82NriCaVGFP8O9j7bo6ByMKKkIR7yvr
4sHUX4YMc/DwchaIHH2xp8fQoMc3Mv7mn8bodTtPXgveeldEvtuUQM0q+j4DXZUT
qSLMGElgJYrpIf2Cm8JAIBkt2QuzpZPPX7Z6glZunpvfLSMmgn3Vj2ilNEx1YCFH
v+Rk1ZYLjg2LzUYqaO7HOXlRJqmE10I7ZmNvPXJZ9fuPmGYD9FU6WeHhmIAFYdFw
V6bAzou+LbnuNVoW6yiDvrKcOXgh2Spbk6SaKVSrcjVPfc87ocNzGWI4OTfNy1xI
Kv9u4YfU3pIUWPDGx0mvT/KXAXl/PGVfu7bYXDcN2I2tqlrbBPdIWqpFB2eTn7/j
//XbatoT6gGZTuseCKhYXWpG8AE5pCfbjGnd9il21fvlUDdkIq42wAs96qjc6Ruj
tPCi5yYzLiHsn4eau+SJqI1KxPLf6YJw9Qo+f70FGl63wXJU9Vr07ID2rGTwXm1m
Qf1joTtx900PvfzUaD0ODbQZaTbX6ebSOkriKpKWYwg+26Gdc7JAxIVI3HDOlOR+
++r1M4ERwDic/xdVsB6Mngmop3d1BeNU2IAoiRDZwcJpS1+MLivlIbd1PjBAt0bU
+oOm+wseHEzSnlgucQ0g
=qnTe
-----END PGP SIGNATURE-----
Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull /dev/random changes from Ted Ts'o:
"These patches are designed to enable improvements to /dev/random for
non-x86 platforms, in particular MIPS and ARM"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: allow architectures to optionally define random_get_entropy()
random: run random_int_secret_init() run after all late_initcalls
Allow architectures which have a disabled get_cycles() function to
provide a random_get_entropy() function which provides a fine-grained,
rapidly changing counter that can be used by the /dev/random driver.
For example, an architecture might have a rapidly changing register
used to control random TLB cache eviction, or DRAM refresh that
doesn't meet the requirements of get_cycles(), but which is good
enough for the needs of the random driver.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
This adds RCAR Gen2 USB phy support. The driver configures
USB channels 0/2 which are shared between PCI USB hosts and
USBHS/USBSS devices. It also controls internal USBHS phy.
Signed-off-by: Valentine Barshak <valentine.barshak@cogentembedded.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
It's helpful for a driver to put the pci slot name in its interrupt
names, so /proc/interrupts will show the pci slot of the device.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The layout of struct health_buffer was not according to firmware
specification. Fix it to comply.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Checksum calculations consume CPU resources and can be significant to
the rate of resource creation/destruction.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Pull more timekeeping items for v3.13 from John Stultz:
* Small cleanup in the clocksource code.
* Fix for rtc-pl031 to let it work with alarmtimers.
* Move arm64 to using the generic sched_clock framework & resulting
cleanup in the generic sched_clock code.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Nobody is using sched_clock_func() anymore now that sched_clock
supports up to 64 bits. Remove the hook so that new code only
uses sched_clock_register().
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Make it easier for drivers to include single register writes in
asynchronous sequences by providing async versions of the write
and update bits operations. The update bits operations are only
likely to be effective when used with devices that have caches
but this is common enough to be useful.
Signed-off-by: Mark Brown <broonie@linaro.org>
Shared faults can lead to lots of unnecessary page migrations,
slowing down the system, and causing private faults to hit the
per-pgdat migration ratelimit.
This patch adds sysctl numa_balancing_migrate_deferred, which specifies
how many shared page migrations to skip unconditionally, after each page
migration that is skipped because it is a shared fault.
This reduces the number of page migrations back and forth in
shared fault situations. It also gives a strong preference to
the tasks that are already running where most of the memory is,
and to moving the other tasks to near the memory.
Testing this with a much higher scan rate than the default
still seems to result in fewer page migrations than before.
Memory seems to be somewhat better consolidated than previously,
with multi-instance specjbb runs on a 4 node system.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-62-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With the scan rate code working (at least for multi-instance specjbb),
the large hammer that is "sched: Do not migrate memory immediately after
switching node" can be replaced with something smarter. Revert temporarily
migration disabling and all traces of numa_migrate_seq.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-61-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With scan rate adaptions based on whether the workload has properly
converged or not there should be no need for the scan period reset
hammer. Get rid of it.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-60-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Adjust numa_scan_period in task_numa_placement, depending on how much
useful work the numa code can do. The more local faults there are in a
given scan window the longer the period (and hence the slower the scan rate)
during the next window. If there are excessive shared faults then the scan
period will decrease with the amount of scaling depending on whether the
ratio of shared/private faults. If the preferred node changes then the
scan rate is reset to recheck if the task is properly placed.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-59-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Due to the way the pid is truncated, and tasks are moved between
CPUs by the scheduler, it is possible for the current task_numa_fault
to group together tasks that do not actually share memory together.
This patch adds a few easy sanity checks to task_numa_fault, joining
tasks together if they share the same tsk->mm, or if the fault was on
a page with an elevated mapcount, in a shared VMA.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-57-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It is possible for a task in a numa group to call exec, and
have the new (unrelated) executable inherit the numa group
association from its former self.
This has the potential to break numa grouping, and is trivial
to fix.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-51-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch uses the fraction of faults on a particular node for both task
and group, to figure out the best node to place a task. If the task and
group statistics disagree on what the preferred node should be then a full
rescan will select the node with the best combined weight.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-50-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A newly spawned thread inside a process should stay on the same
NUMA node as its parent. This prevents processes from being "torn"
across multiple NUMA nodes every time they spawn a new thread.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-49-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
And here's a little something to make sure not the whole world ends up
in a single group.
As while we don't migrate shared executable pages, we do scan/fault on
them. And since everybody links to libc, everybody ends up in the same
group.
Suggested-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1381141781-10992-47-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It is desirable to model from userspace how the scheduler groups tasks
over time. This patch adds an ID to the numa_group and reports it via
/proc/PID/status.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-45-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While parallel applications tend to align their data on the cache
boundary, they tend not to align on the page or THP boundary.
Consequently tasks that partition their data can still "false-share"
pages presenting a problem for optimal NUMA placement.
This patch uses NUMA hinting faults to chain tasks together into
numa_groups. As well as storing the NID a task was running on when
accessing a page a truncated representation of the faulting PID is
stored. If subsequent faults are from different PIDs it is reasonable
to assume that those two tasks share a page and are candidates for
being grouped together. Note that this patch makes no scheduling
decisions based on the grouping information.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1381141781-10992-44-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change the per page last fault tracking to use cpu,pid instead of
nid,pid. This will allow us to try and lookup the alternate task more
easily. Note that even though it is the cpu that is store in the page
flags that the mpol_misplaced decision is still based on the node.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1381141781-10992-43-git-send-email-mgorman@suse.de
[ Fixed build failure on 32-bit systems. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use the new stop_two_cpus() to implement migrate_swap(), a function that
flips two tasks between their respective cpus.
I'm fairly sure there's a less crude way than employing the stop_two_cpus()
method, but everything I tried either got horribly fragile and/or complex. So
keep it simple for now.
The notable detail is how we 'migrate' tasks that aren't runnable
anymore. We'll make it appear like we migrated them before they went to
sleep. The sole difference is the previous cpu in the wakeup path, so we
override this.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Link: http://lkml.kernel.org/r/1381141781-10992-39-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Introduce stop_two_cpus() in order to allow controlled swapping of two
tasks. It repurposes the stop_machine() state machine but only stops
the two cpus which we can do with on-stack structures and avoid
machine wide synchronization issues.
The ordering of CPUs is important to avoid deadlocks. If unordered then
two cpus calling stop_two_cpus on each other simultaneously would attempt
to queue in the opposite order on each CPU causing an AB-BA style deadlock.
By always having the lowest number CPU doing the queueing of works, we can
guarantee that works are always queued in the same order, and deadlocks
are avoided.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
[ Implemented deadlock avoidance. ]
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Link: http://lkml.kernel.org/r/1381141781-10992-38-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a preferred node is selected for a tasks there is an attempt to migrate
the task to a CPU there. This may fail in which case the task will only
migrate if the active load balancer takes action. This may never happen if
the conditions are not right. This patch will check at NUMA hinting fault
time if another attempt should be made to migrate the task. It will only
make an attempt once every five seconds.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-34-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is a 90% regression observed with a large Oracle performance test
on a 4 node system. Profiles indicated that the overhead was due to
contention on sp_lock when looking up shared memory policies. These
policies do not have the appropriate flags to allow them to be
automatically balanced so trapping faults on them is pointless. This
patch skips VMAs that do not have MPOL_F_MOF set.
[riel@redhat.com: Initial patch]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-and-tested-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-32-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ideally it would be possible to distinguish between NUMA hinting faults that
are private to a task and those that are shared. If treated identically
there is a risk that shared pages bounce between nodes depending on
the order they are referenced by tasks. Ultimately what is desirable is
that task private pages remain local to the task while shared pages are
interleaved between sharing tasks running on different nodes to give good
average performance. This is further complicated by THP as even
applications that partition their data may not be partitioning on a huge
page boundary.
To start with, this patch assumes that multi-threaded or multi-process
applications partition their data and that in general the private accesses
are more important for cpu->memory locality in the general case. Also,
no new infrastructure is required to treat private pages properly but
interleaving for shared pages requires additional infrastructure.
To detect private accesses the pid of the last accessing task is required
but the storage requirements are a high. This patch borrows heavily from
Ingo Molnar's patch "numa, mm, sched: Implement last-CPU+PID hash tracking"
to encode some bits from the last accessing task in the page flags as
well as the node information. Collisions will occur but it is better than
just depending on the node information. Node information is then used to
determine if a page needs to migrate. The PID information is used to detect
private/shared accesses. The preferred NUMA node is selected based on where
the maximum number of approximately private faults were measured. Shared
faults are not taken into consideration for a few reasons.
First, if there are many tasks sharing the page then they'll all move
towards the same node. The node will be compute overloaded and then
scheduled away later only to bounce back again. Alternatively the shared
tasks would just bounce around nodes because the fault information is
effectively noise. Either way accounting for shared faults the same as
private faults can result in lower performance overall.
The second reason is based on a hypothetical workload that has a small
number of very important, heavily accessed private pages but a large shared
array. The shared array would dominate the number of faults and be selected
as a preferred node even though it's the wrong decision.
The third reason is that multiple threads in a process will race each
other to fault the shared page making the fault information unreliable.
Signed-off-by: Mel Gorman <mgorman@suse.de>
[ Fix complication error when !NUMA_BALANCING. ]
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-30-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently automatic NUMA balancing is unable to distinguish between false
shared versus private pages except by ignoring pages with an elevated
page_mapcount entirely. This avoids shared pages bouncing between the
nodes whose task is using them but that is ignored quite a lot of data.
This patch kicks away the training wheels in preparation for adding support
for identifying shared/private pages is now in place. The ordering is so
that the impact of the shared/private detection can be easily measured. Note
that the patch does not migrate shared, file-backed within vmas marked
VM_EXEC as these are generally shared library pages. Migrating such pages
is not beneficial as there is an expectation they are read-shared between
caches and iTLB and iCache pressure is generally low.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-28-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ideally it would be possible to distinguish between NUMA hinting faults
that are private to a task and those that are shared. This patch prepares
infrastructure for separately accounting shared and private faults by
allocating the necessary buffers and passing in relevant information. For
now, all faults are treated as private and detection will be introduced
later.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-26-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch favours moving tasks towards NUMA node that recorded a higher
number of NUMA faults during active load balancing. Ideally this is
self-reinforcing as the longer the task runs on that node, the more faults
it should incur causing task_numa_placement to keep the task running on that
node. In reality a big weakness is that the nodes CPUs can be overloaded
and it would be more efficient to queue tasks on an idle node and migrate
to the new node. This would require additional smarts in the balancer so
for now the balancer will simply prefer to place the task on the preferred
node for a PTE scans which is controlled by the numa_balancing_settle_count
sysctl. Once the settle_count number of scans has complete the schedule
is free to place the task on an alternative node if the load is imbalanced.
[srikar@linux.vnet.ibm.com: Fixed statistics]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
[ Tunable and use higher faults instead of preferred. ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-23-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
NUMA hinting fault counts and placement decisions are both recorded in the
same array which distorts the samples in an unpredictable fashion. The values
linearly accumulate during the scan and then decay creating a sawtooth-like
pattern in the per-node counts. It also means that placement decisions are
time sensitive. At best it means that it is very difficult to state that
the buffer holds a decaying average of past faulting behaviour. At worst,
it can confuse the load balancer if it sees one node with an artifically high
count due to very recent faulting activity and may create a bouncing effect.
This patch adds a second array. numa_faults stores the historical data
which is used for placement decisions. numa_faults_buffer holds the
fault activity during the current scan window. When the scan completes,
numa_faults decays and the values from numa_faults_buffer are copied
across.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-22-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch selects a preferred node for a task to run on based on the
NUMA hinting faults. This information is later used to migrate tasks
towards the node during balancing.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-21-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch tracks what nodes numa hinting faults were incurred on.
This information is later used to schedule a task on the node storing
the pages most frequently faulted by the task.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-20-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The NUMA PTE scan rate is controlled with a combination of the
numa_balancing_scan_period_min, numa_balancing_scan_period_max and
numa_balancing_scan_size. This scan rate is independent of the size
of the task and as an aside it is further complicated by the fact that
numa_balancing_scan_size controls how many pages are marked pte_numa and
not how much virtual memory is scanned.
In combination, it is almost impossible to meaningfully tune the min and
max scan periods and reasoning about performance is complex when the time
to complete a full scan is is partially a function of the tasks memory
size. This patch alters the semantic of the min and max tunables to be
about tuning the length time it takes to complete a scan of a tasks occupied
virtual address space. Conceptually this is a lot easier to understand. There
is a "sanity" check to ensure the scan rate is never extremely fast based on
the amount of virtual memory that should be scanned in a second. The default
of 2.5G seems arbitrary but it is to have the maximum scan rate after the
patch roughly match the maximum scan rate before the patch was applied.
On a similar note, numa_scan_period is in milliseconds and not
jiffies. Properly placed pages slow the scanning rate but adding 10 jiffies
to numa_scan_period means that the rate scanning slows depends on HZ which
is confusing. Get rid of the jiffies_to_msec conversion and treat it as ms.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-18-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
PTE scanning and NUMA hinting fault handling is expensive so commit
5bca2303 ("mm: sched: numa: Delay PTE scanning until a task is scheduled
on a new node") deferred the PTE scan until a task had been scheduled on
another node. The problem is that in the purely shared memory case that
this may never happen and no NUMA hinting fault information will be
captured. We are not ruling out the possibility that something better
can be done here but for now, this patch needs to be reverted and depend
entirely on the scan_delay to avoid punishing short-lived processes.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-16-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSUc9zAAoJEHm+PkMAQRiG9DMH/AtpuAF6LlMRPjrCeuJQ1pyh
T0IUO+CsLKO6qtM5IyweP8V6zaasNjIuW1+B6IwVIl8aOrM+M7CwRiKvpey26ldM
I8G2ron7hqSOSQqSQs20jN2yGAqQGpYIbTmpdGLAjQ350NNNvEKthbP5SZR5PAmE
UuIx5OGEkaOyZXvCZJXU9AZkCxbihlMSt2zFVxybq2pwnGezRUYgCigE81aeyE0I
QLwzzMVdkCxtZEpkdJMpLILAz22jN4RoVDbXRa2XC7dA9I2PEEXI9CcLzqCsx2Ii
8eYS+no2K5N2rrpER7JFUB2B/2X8FaVDE+aJBCkfbtwaYTV9UYLq3a/sKVpo1Cs=
=xSFJ
-----END PGP SIGNATURE-----
Merge tag 'v3.12-rc4' into sched/core
Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree
before applying more scheduler patches.
Conflicts:
arch/avr32/include/asm/Kbuild
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf fixes from Ingo Molnar:
"Various fixlets:
On the kernel side:
- fix a race
- fix a bug in the handling of the perf ring-buffer data page
On the tooling side:
- fix the handling of certain corrupted perf.data files
- fix a bug in 'perf probe'
- fix a bug in 'perf record + perf sched'
- fix a bug in 'make install'
- fix a bug in libaudit feature-detection on certain distros"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf session: Fix infinite loop on invalid perf.data file
perf tools: Fix installation of libexec components
perf probe: Fix to find line information for probe list
perf tools: Fix libaudit test
perf stat: Set child_pid after perf_evlist__prepare_workload()
perf tools: Add default handler for mmap2 events
perf/x86: Clean up cap_user_time* setting
perf: Fix perf_pmu_migrate_context
This adds a new driver to support the s3c24xx dma using the dmaengine
and makes the old one in mach-s3c24xx obsolete in the long run.
Conceptually the s3c24xx-dma feels like a distant relative of the pl08x
with numerous virtual channels being mapped to a lot less physical ones.
The driver therefore borrows a lot from the amba-pl08x driver in this
regard. Functionality-wise the driver gains a memcpy ability in addition
to the slave_sg one.
The driver supports both the method for requesting the peripheral used
by SoCs before the S3C2443 and the different method for S3C2443 and later.
On earlier SoCs the hardware channels usable for specific peripherals is
constrainted while on later SoCs all channels can be used for any peripheral.
Tested on a s3c2416-based board, memcpy using the dmatest module and
slave_sg partially using the spi-s3c64xx driver.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
From Sebastian Hasselbarth:
This is a patch set based on an RFC [1][2] sent earlier to provide
a common arch/arm init for DT clock providers. Currently, the call to
of_clk_init(NULL) to initialize DT clock providers is spread among several
mach-dirs. Since most machs require DT clocks initialized before timers,
no initcall can be used.
By adding of_clk_init(NULL) to arch/arm time_init(), we can remove all
mach-specific .init_time hooks that basically called of_clk_init and
clocksource_of_init.
In contrast to the RFC version, of_clk_init(NULL) is now only called if
no custom .init_time callback is set. This allows some machs to still
call clock init themselves, as not all can be converted now. Therefore,
this patch sets drops conversion of mach-mvebu and mach-zynq. New machs
that were introduced with v3.12-rc1 are also converted, except mach-u300
that requires clocks before irqs.
* 'clk-of-init-v2_for-3.13' of https://github.com/shesselba/linux-dove: (29 commits)
ARM: vt8500: remove custom .init_time hook
ARM: vexpress: remove custom .init_time hook
ARM: tegra: remove custom .init_time hook
ARM: sunxi: remove custom .init_time hook
ARM: sti: remove custom .init_time hook
ARM: socfpga: remove custom .init_time hook
ARM: rockchip: remove custom .init_time hook
ARM: prima2: remove custom .init_time hook
ARM: nspire: remove custom .init_time hook
ARM: nomadik: remove custom .init_time hook
ARM: mxs: remove custom .init_time hook
ARM: kirkwood: remove custom .init_time hook
ARM: imx: remove custom .init_time hook
ARM: highbank: remove custom .init_time hook
ARM: exynos: remove custom .init_time hook
ARM: dove: remove custom .init_time hook
ARM: bcm2835: remove custom .init_time hook
ARM: bcm: provide common arch init for DT clocks
ARM: call of_clk_init from default time_init handler
ARM: vt8500: prepare for arch-wide .init_time callback
...
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull HID fixes from Jiri Kosina:
- fix for hidraw reference counting regression, by Manoj Chourasia
- fix for minor number allocation for uhid, by David Herrmann
- other small unsorted fixes / device ID additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: wiimote: fix FF deadlock
HID: add Holtek USB ID 04d9:a081 SHARKOON DarkGlider
HID: hidraw: close underlying device at removal of last reader
HID: roccat: Fix "cannot create duplicate filename" problems
HID: uhid: allocate static minor
virtio wants to pass in cpumask_of(cpu), make parameter
const to avoid build warnings.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No in-kernel code is now using this, they have all be converted over to
using the bin_attrs support in attribute groups, so this field, and the
code in the driver core that was creating/remove the binary files can be
removed.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that all in-kernel users of the dev_attrs field are converted to use
dev_groups, we can safely remove dev_attrs from struct class.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A couple of fixes from the IOMMU side:
* Some small fixes for the new ARM-SMMU driver
* A register offset correction for VT-d
* Adding MAINTAINERS entry for drivers/iommu
Overall no really big or intrusive changes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSTtOsAAoJECvwRC2XARrjD9cQAOGijCeHW3fZr5iVNCPQQVUq
NNvGZpqjFwtV/8QvBy/HZ6Jt2j/4c24od1xewQflwBuvhxokTPUVs30LYVMGqE//
tfWM5x/kL7Iz2JBAu0vT9Ihq/elIy4zGE8w7a6hio5K0UQVmvsQ73SWYrxut0jcR
F9e9BaNt7LI27v20Ph48ejVHNTIQT07GDZJXRtYiBI9VmI8O6aTpu8OOeS5Grrv+
tyIpt3DYnqbsTdsnF5YlIVn23d/MYR8be2wnpaGh/vShZlwPsU8ay9/29cJhqyGD
5GWuRK+4OKaXXWhzpcwMc+iYwCIp1IKkCc5dax1xVMedlOzRxtQpZXEZEjlv9/aS
sINp2kBnkJssGO781OWr7HL9G/OxdKHokG8AiizFSS18VDy76AVI3sWLCJwuFPPW
X+SAYQiph7liVkUKEFwITTu4CJ5TClwcy0ovFGqpnhGLIKp3woEO8K1RznBdYZgH
22ZSm3GpTi6XG53cP2INBQ0cKXg6nbJPhczyUiaSLDVGlFS+VMGZavCsjn4ceq7u
/k1M9uwhE8JqS6T2dTROy/ZuTOoMTFm4yGTIpec/S9RtRvPjwVMEUQN+y419AV7k
PzAuxefsCOqEcviVt0pMz/aFjdPw6slNJNAG1zWckvw6DrMmKrFGbyH8KMRSchzY
uo0xEoMIft8Mmfvu1IVe
=a8UE
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"A couple of fixes from the IOMMU side:
- some small fixes for the new ARM-SMMU driver
- a register offset correction for VT-d
- add MAINTAINERS entry for drivers/iommu
Overall no really big or intrusive changes"
* tag 'iommu-fixes-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
x86/iommu: correct ICS register offset
MAINTAINERS: add overall IOMMU section
iommu/arm-smmu: don't enable SMMU device until probing has completed
iommu/arm-smmu: fix iommu_present() test in init
iommu/arm-smmu: fix a signedness bug
The GPIO number of the RESET line can be passed to the
driver using the gpio_reset member.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Since we are changing wait.h profoundly, use the opportunity to:
- add a sentence to explain what this file is about
- remove whitespace noise
- prettify weird looking line break fixup attempts
- standardize type definition and initialization sequences
- use consistent style details
No code is changed.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-O8dIie5swnctqpupakatvqyq@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change all __wait_event*() implementations to match the corresponding
wait_event*() signature for convenience.
In particular this does away with the weird 'ret' logic. Since there
are __wait_event*() users this requires we update them too.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092529.042563462@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While not a whole-sale replacement like the others we can still reduce
the size of __wait_event_hrtimeout() considerably by noting that the
actual core of __wait_event_hrtimeout() is identical to what
___wait_event() generates.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.972793648@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.898691966@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.831085521@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.759956109@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.686006009@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.612813379@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.541716442@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.469616907@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.396949919@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.325264677@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reduce macro complexity by using the new ___wait_event() helper.
No change in behaviour, identical generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.254863348@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There's far too much duplication in the __wait_event macros; in order
to fix this introduce ___wait_event() a macro with the capability to
replace most other macros.
With the previous patches changing the various __wait_event*()
implementations to be more uniform; we can now collapse the lot
without also changing generated code.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.181897111@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Purely a preparatory patch; it changes the control flow to match what
will soon be generated by generic code so that that patch can be a
unity transform.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.107994763@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 4c663cf ("wait: fix false timeouts when using
wait_event_timeout()") introduced an additional condition check after
a timeout but there's a few issues;
- it forgot one site
- it put the check after the main loop; not at the actual timeout
check.
Cure both; by wrapping the condition (as suggested by Oleg), this
avoids double evaluation of 'condition' which could be quite big.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092528.028892896@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There's two patterns to check signals in the __wait_event*() macros:
if (!signal_pending(current)) {
schedule();
continue;
}
ret = -ERESTARTSYS;
break;
And the more natural:
if (signal_pending(current)) {
ret = -ERESTARTSYS;
break;
}
schedule();
Change them all into the latter form.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131002092527.956416254@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add a generic qualifier for transaction events, as a new sample
type that returns a flag word. This is particularly useful
for qualifying aborts: to distinguish aborts which happen
due to asynchronous events (like conflicts caused by another
CPU) versus instructions that lead to an abort.
The tuning strategies are very different for those cases,
so it's important to distinguish them easily and early.
Since it's inconvenient and inflexible to filter for this
in the kernel we report all the events out and allow
some post processing in user space.
The flags are based on the Intel TSX events, but should be fairly
generic and mostly applicable to other HTM architectures too. In addition
to various flag words there's also reserved space to report an
program supplied abort code. For TSX this is used to distinguish specific
classes of aborts, like a lock busy abort when doing lock elision.
Flags:
Elision and generic transactions (ELISION vs TRANSACTION)
(HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION)
Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC)
Retryable transaction (RETRY)
Conflicts with other threads (CONFLICT)
Transaction write capacity overflow (CAPACITY WRITE)
Transaction read capacity overflow (CAPACITY READ)
Transactions implicitely aborted can also return an abort code.
This can be used to signal specific events to the profiler. A common
case is abort on lock busy in a RTM eliding library (code 0xff)
To handle this case we include the TSX abort code
Common example aborts in TSX would be:
- Data conflict with another thread on memory read.
Flags: TRANSACTION|ASYNC|CONFLICT
- executing a WRMSR in a transaction. Flags: TRANSACTION|SYNC
- HLE transaction in user space is too large
Flags: ELISION|SYNC|CAPACITY-WRITE
The only flag that is somewhat TSX specific is ELISION.
This adds the perf core glue needed for reporting the new flag word out.
v2: Add MEM/MISC
v3: Move transaction to the end
v4: Separate capacity-read/write and remove misc
v5: Remove _SAMPLE. Move abort flags to 32bit. Rename
transaction to txn
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While auditing the list_entry usage due to a trinity bug I found that
perf_pmu_migrate_context violates the rules for
perf_event::event_entry.
The problem is that perf_event::event_entry is a RCU list element, and
hence we must wait for a full RCU grace period before re-using the
element after deletion.
Therefore the usage in perf_pmu_migrate_context() which re-uses the
entry immediately is broken. For now introduce another list_head into
perf_event for this specific usage.
This doesn't actually fix the trinity report because that never goes
through this code.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-mkj72lxagw1z8fvjm648iznw@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Two function declarations are absence if not define CONFIG_DEBUG_FS
in include/linux/debugfs.h
Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function was preventing us from supporting multiple
instances. Get rid of it. Since we support DT boots only,
users can get the control device phandle from the DT node.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
omap_get_control_dev() is being deprecated as it doesn't support
multiple instances. As control device is present only from OMAP4
onwards which supports DT only, we use phandles to get the
reference to the control device.
Also get rid of "ti,has-mailbox" property as it is redundant and
we can determine that from whether "ctrl-module" property is present
or not. Get rid of has_mailbox from musb_hdrc_platform_data as well.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add support for new device types and in the process rid of "ti,type"
device tree property. The correct type of device will be determined
from the compatible string instead.
Introduce a compatible string for each device type. At the moment
we support 4 types OTGHS, USB2, PIPE3 (e.g. USB3) and DRA7USB2.
Update DT binding information to reflect these changes.
Also get rid of omap_control_usb3_phy_power(). Just one function
i.e. omap_control_usb_phy_power() will now take care of all PHY types.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
omap-control device is present from OMAP4 onwards which
support device tree boots only. So get rid of platform data.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds transfer type enum and packet definitions for
WA_XFER_ISO_PACKET_INFO and WA_XFER_ISO_PACKET_STATUS packets.
It also changes instances of __attribute__((packed)) to __packed to make
checkpatch.pl happy.
Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds a kgdb_nmicallin() interface that can be used by
external NMI handlers to call the KGDB/KDB handler. The primary
need for this is for those types of NMI interrupts where all the
CPUs have already received the NMI signal. Therefore no
send_IPI(NMI) is required, and in fact it will cause a 2nd
unhandled NMI to occur. This generates the "Dazed and Confuzed"
messages.
Since all the CPUs are getting the NMI at roughly the same time,
it's not guaranteed that the first CPU that hits the NMI handler
will manage to enter KGDB and set the dbg_master_lock before the
slaves start entering. The new argument "send_ready" was added
for KGDB to signal the NMI handler to release the slave CPUs for
entry into KGDB.
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20131002151417.928886849@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds a new spi_w8r16be() helper, which is similar to spi_w8r16()
except that it converts the read data word from big endian to native endianness
before returning it. The reason for introducing this new helper is that for SPI
slave devices it is quite common that the read 16 bit data word is in big
endian. So users of spi_w8r16() have to convert the result to native endianness
manually. A second reason is that in this case the endianness of the return
value of spi_w8r16() depends on its sign. If it is negative (i.e. a error code)
it is already in native endianness, if it is positive it is in big endian. The
sparse code checker doesn't like this kind of mixed endianness and special
annotations are necessary to keep it quiet (E.g. casting to be16 using __force).
Doing the conversion to native endianness in the helper function does not
require such annotations since we are not mixing different endiannesses in the
same variable.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
Pull (mostly) ARM clocksource driver updates from Daniel Lezcano:
" - Soren Brinkmann added FEAT_PERCPU to a clock device when it is local
per cpu. This feature prevents the clock framework to choose a per cpu
timer as a broadcast timer. This problem arised when the ARM global
timer is used when switching to the broadcast timer which is the case
now on Xillinx with its cpuidle driver.
- Stephen Boyd extended the generic sched_clock code to support 64bit
counters and removes the setup_sched_clock deprecation, as that causes
lots of warnings since there's still users in the arch/arm tree. He
added also the CLOCK_SOURCE_SUSPEND_NONSTOP flag on the architected
timer as they continue counting during suspend.
- Uwe Kleine-König added some missing __init sections and consolidated the
code by moving the of_node_put call from the drivers to the function
clocksource_of_init. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSSKOHAAoJEHm+PkMAQRiGeREH/3EqHmJPBzmVoJwR9/ykDoLg
u+TJTkuxZG220WhgXS7W/0ECyBX0U7yA0bY9PZbqgcdiLjY0veR18/pOhEq5RzHq
ub8Q+AJdiORF/sq268q7gnNmy3rSCgnrAyHA/bzBtkbisYODwZPYvWQVUjgNZ2dW
qtW/TE9rjANcUrk8WdOu9oWcwsq4cyG3cscbfHE/JLFy/8tB5GoD158gxKLZsLXk
uTCeUHMmvFRT56fZwfyvNstA8ozxXcHBmuu6+Ttceky2zeGzp6dOrd+d2SU1Ps3O
P91x4e/Af4RFEwDczGP6TpSBEf/J/JaqrM1drjhnQHho0hrNRZVUXhADFVADCXY=
=dOjB
-----END PGP SIGNATURE-----
Merge tag 'v3.12-rc3' into timers/core
Merge Linux 3.12-rc3 - refresh the tree with the latest fixes before merging new bits.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jamal sent patch to add tc user simple actions to iproute2
but required header was not being exported.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the flag CLOCK_EVT_FEAT_PERCPU which is supposed to be set for per
cpu clockevent devices.
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Michal Simek <michal.simek@xilinx.com>
The spec states that the client should not resend requests because
the server will disconnect if it needs to drop an RPC request.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Pull networking changes from David Miller:
1) Multiply in netfilter IPVS can overflow when calculating destination
weight. From Simon Kirby.
2) Use after free fixes in IPVS from Julian Anastasov.
3) SFC driver bug fixes from Daniel Pieczko.
4) Memory leak in pcan_usb_core failure paths, from Alexey Khoroshilov.
5) Locking and encapsulation fixes to serial line CAN driver, from
Andrew Naujoks.
6) Duplex and VF handling fixes to bnx2x driver from Yaniv Rosner,
Eilon Greenstein, and Ariel Elior.
7) In lapb, if no other packets are outstanding, T1 timeouts actually
stall things and no packet gets sent. Fix from Josselin Costanzi.
8) ICMP redirects should not make it to the socket error queues, from
Duan Jiong.
9) Fix bugs in skge DMA mapping error handling, from Nikulas Patocka.
10) Fix setting of VLAN priority field on via-rhine driver, from Roget
Luethi.
11) Fix TX stalls and VLAN promisc programming in be2net driver from
Ajit Khaparde.
12) Packet padding doesn't get handled correctly in new usbnet SG
support code, from Ming Lei.
13) Fix races in netdevice teardown wrt. network namespace closing.
From Eric W. Biederman.
14) Fix potential missed initialization of net_secret if not TCP
connections are openned. From Eric Dumazet.
15) Cinterion PLXX product ID in qmi_wwan driver is wrong, from
Aleksander Morgado.
16) skb_cow_head() can change skb->data and thus packet header pointers,
don't use stale ip_hdr reference in ip_tunnel code.
17) Backend state transition handling fixes in xen-netback, from Paul
Durrant.
18) Packet offset for AH protocol is handled wrong in flow dissector,
from Eric Dumazet.
19) Taking down an fq packet scheduler instance can leave stale packets
in the queues, fix from Eric Dumazet.
20) Fix performance regressions introduced by TCP Small Queues. From
Eric Dumazet.
21) IPV6 GRE tunneling code calculates max_headroom incorrectly, from
Hannes Frederic Sowa.
22) Multicast timer handlers in ipv4 and ipv6 can be the last and final
reference to the ipv4/ipv6 specific network device state, so use the
reference put that will check and release the object if the
reference hits zero. From Salam Noureddine.
23) Fix memory corruption in ip_tunnel driver, and use skb_push()
instead of __skb_push() so that similar bugs are less hard to find.
From Steffen Klassert.
24) Add forgotten hookup of rtnl_ops in SIT and ip6tnl drivers, from
Nicolas Dichtel.
25) fq scheduler doesn't accurately rate limit in certain circumstances,
from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
pkt_sched: fq: rate limiting improvements
ip6tnl: allow to use rtnl ops on fb tunnel
sit: allow to use rtnl ops on fb tunnel
ip_tunnel: Remove double unregister of the fallback device
ip_tunnel_core: Change __skb_push back to skb_push
ip_tunnel: Add fallback tunnels to the hash lists
ip_tunnel: Fix a memory corruption in ip_tunnel_xmit
qlcnic: Fix SR-IOV configuration
ll_temac: Reset dma descriptors indexes on ndo_open
skbuff: size of hole is wrong in a comment
ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put
ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put
ethernet: moxa: fix incorrect placement of __initdata tag
ipv6: gre: correct calculation of max_headroom
powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file
Revert "powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file"
bonding: Fix broken promiscuity reference counting issue
tcp: TSQ can use a dynamic limit
dm9601: fix IFF_ALLMULTI handling
pkt_sched: fq: qdisc dismantle fixes
...
Don't call hid_open_device till there is actually an user. This saves
power by not opening underlying transport for HID. Also close device
if there are no active mfd client using HID sensor hub.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
All arch overriden implementations of do_softirq() share the following
common code: disable irqs (to avoid races with the pending check),
check if there are softirqs pending, then execute __do_softirq() on
a specific stack.
Consolidate the common parts such that archs only worry about the
stack switch.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Since commit c93bdd0e03 ("netvm: allow skb allocation to use PFMEMALLOC
reserves"), hole size is one bit less than what is written in the comment.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Stable fix for Oopses in the pNFS files layout driver
- Fix a regression when doing a non-exclusive file create on NFSv4.x
- NFSv4.1 security negotiation fixes when looking up the root filesystem
- Fix a memory ordering issue in the pNFS files layout driver
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSSfNNAAoJEGcL54qWCgDybGYQAJGm4/vd7/rWZ49KIjGFGkFo
sCt0UOK6Y6ALhUOIlIreXsQ+Iwn9aAoIIRgx8UwnB+hO6PGnSyFuJZZx1KE8V2kj
6JlE5FbsWV+3uFQzNJQsNcoj7NZMzIRZT7x+7QansBOdSQjgQc3ig2sAMWREZjn8
GxMOl8FNRrnP8gRom30ZScgMp1YDM8J1ql80S/nbxh2NOLBsvgg9VapzJhhqkMyl
b7WKX4Qbg4AeSaxIAIrIwcZ7L2YS09JGC40VSybQARs0/7J8fjOZPs7CmrUCoB5F
DmT5vfEC4+dqDf8PMyoFVfxK5ua5Sb/FGQmagYYa8bSgY7Uq03akYI++co+4PZU1
f3SN6CSvVffzGMdXAhUupOZQbkKvKFxR2MTGy8s7dxdkQudd4RioYPDmLfCHlbmb
VY5kFh/Duqso1FCrcfvZoC88ElrWUz5yoVzZyECOEwCs1wjI6bjmGdSqCSbU75Lm
Z0XOAn1cStwFvGwCbGZPUzlvueji3coDdCFPBXAOFHzisLYoo/Lxenw7l5D1qM5b
02iZllcIo340vw8wxHZxVebecFo33P90X1gjv0HQQkV/6EeNgq4D47SWTPxRq3Ai
Dl9MFjTPl51oseDLrH6I/hBvcqjksB1M1+WjifT0bCIi3Y0HAea2U0wgweHS3vAd
QHqIpIJxNHDjPBMDWEZW
=ScfI
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.12-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Stable fix for Oopses in the pNFS files layout driver
- Fix a regression when doing a non-exclusive file create on NFSv4.x
- NFSv4.1 security negotiation fixes when looking up the root
filesystem
- Fix a memory ordering issue in the pNFS files layout driver
* tag 'nfs-for-3.12-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Give "flavor" an initial value to fix a compile warning
NFSv4.1: try SECINFO_NO_NAME flavs until one works
NFSv4.1: Ensure memory ordering between nfs4_ds_connect and nfs4_fl_prepare_ds
NFSv4.1: nfs4_fl_prepare_ds - fix bugs when the connect attempt fails
NFSv4: Honour the 'opened' parameter in the atomic_open() filesystem method
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
pidns: fix free_pid() to handle the first fork failure
ipc,msg: prevent race with rmid in msgsnd,msgrcv
ipc/sem.c: update sem_otime for all operations
mm/hwpoison: fix the lack of one reference count against poisoned page
mm/hwpoison: fix false report on 2nd attempt at page recovery
mm/hwpoison: fix test for a transparent huge page
mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood
block: change config option name for cmdline partition parsing
mm/mlock.c: prevent walking off the end of a pagetable in no-pmd configuration
mm: avoid reinserting isolated balloon pages into LRU lists
arch/parisc/mm/fault.c: fix uninitialized variable usage
include/asm-generic/vtime.h: avoid zero-length file
nilfs2: fix issue with race condition of competition between segments for dirty blocks
Documentation/kernel-parameters.txt: replace kernelcore with Movable
mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored
kernel/kmod.c: check for NULL in call_usermodehelper_exec()
ipc/sem.c: synchronize the proc interface
ipc/sem.c: optimize sem_lock()
ipc/sem.c: fix race in sem_lock()
mm/compaction.c: periodically schedule when freeing pages
...
- Move all the clock definitions over to the device tree
- Remove all now-redundant AUXDATA and make the ux500 device
tree only
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSRA9cAAoJEEEQszewGV1zpiQP/iwq3mMXJm5NRnG7NdlfaZkO
TUtobAxdTQDbz70nhgLG5z+FD+qxqYed7MnF/NTRIwe6ZsV74lvY4WloAkNizfDL
TeZYxFiSrv5oTHLfJzuynveIflnG0Wfm3MAMFYp3aeR+h9fGrpDkxOlMIleI+BP0
C9wjysYAO2vhAQIUT3ONwybMgPfgsEmQMpMsPzOpfdLmgtqQrELaHmucN8Hghagy
rTwxq6f3gOxYeFAlEAw1RxapH4GI+VBxoLCXGpkTJqhLK4DAo67NphoUPgNAXp7I
IGZwWTYbNVa6GlvLVXFdCmV6UVF+qkTn9/Tedn+a6LF/5XtA44DLGAvhPRZBGdrY
k5nWCyc/90gpDF8pSi7cJkoZLUW3wEwDpYi9XmZA3dsLnbWW3ct8kX3iJNFQ/jAe
lx+kJP9erMQEtiuVfEXXYbK8ILHi8yoEWfiq7FY3UIbHkKIjEdJOCxk5to6BeynA
PKPJNO7dgYdhiM4Rqz4AcF1xCEe5NAQVltNfeqPVg/7o4nlQnIaUQVUpbDnYXaCN
iB0tVbjSiFA4C/C5oz0k2VhKH8YMsgd+CZ9ALS2jhGNiIfFOp8JuuJ8KL86m7iZQ
aC0rPXT8xkEF0a9lVU1UCCiSf6zIpZ0G1iJ6mZ+hA/FJ2pzeLEGEa4iYpsMhcnfQ
0d9fVtYBKBKCMXaeJP0z
=FQfo
-----END PGP SIGNATURE-----
Merge tag 'ux500-dt-for-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson into next/dt
From Linus Walleij:
This is a huge device tree and ATAG removal series for ux500:
- Move all the clock definitions over to the device tree
- Remove all now-redundant AUXDATA and make the ux500 device
tree only
* tag 'ux500-dt-for-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson: (92 commits)
ARM: ux500: delete devices-common remnants
clk: ux500: Provide a look-up for the ARMSS clock
ARM: ux500: Enable CPUFreq on Snowball
ARM: ux500: Provide a Device Tree node for CPUFreq in the DBx500
ARM: ux500: Provide a clock lookup for the Hash driver
ARM: ux500: Provide a clock lookup for the Crypto driver
ARM: ux500: Fix trivial white-space error in the DBX500 DTSI file
ARM: ux500: Remove ATAG booting support for Snowball
ARM: ux500: Remove ATAG booting support for HREF
ARM: ux500: Remove ATAG booting support for U8520
ARM: ux500: Remove ATAG booting support for MOP500
ARM: ux500: Purge UIB framework when booting with ATAGs
ARM: ux500: Take out STUIB support when not booting with Device Tree
ARM: ux500: Remove BU21013 ROHM TS support when booting with only ATAGs
ARM: ux500: Don't register the STMPE/SKE when booting with ATAG support
ARM: ux500: Delete U8500 UIB support when booting with ATAGs
ARM: ux500: Don't register Synaptics RMI4 TS when booting with ATAGs
ARM: ux500: Purge DB8500 PRCMU registration when not booting with DT
ARM: ux500: Stop requesting the SoC device to play 'parent' role
ARM: ux500: Remove UART support when booting without Device Tree
...
Signed-off-by: Olof Johansson <olof@lixom.net>
We need/want the mei fixes in here so we can apply other updates that
are depending on them.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here are some HyperV and MEI driver fixes for 3.12-rc3. They resolve some
issues that people have been reporting for them.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.21 (GNU/Linux)
iEYEABECAAYFAlJIgcwACgkQMUfUDdst+ynRGwCgoIIGIXG2RTVaTDm8wezRERGA
ZDoAoKZe5Bec2CSqydcPNZWYg4ATTMjn
=6r+r
-----END PGP SIGNATURE-----
Merge tag 'char-misc-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some HyperV and MEI driver fixes for 3.12-rc3. They resolve
some issues that people have been reporting for them"
* tag 'char-misc-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: Terminate vmbus version negotiation on timeout
Drivers: hv: util: Correctly support ws2008R2 and earlier
mei: cancel stall timers in mei_reset
mei: bus: stop wait for read during cl state transition
mei: make me client counters less error prone
This is a small, mainly to get a couple of compile bug related fixes into
the tree ASAP.
New device support:
1) Add ad5446 dac support to the ad5641 driver.
New functionality and cleanups:
1) Optional power supply regulators for the st pressure sensors drivers using
the new optional regulator interface.
2) Bit of tidying up of naming in the sysfs trigger.
Bug fixes from the previous series:
1) Missing select IIO_BUFFER for ti_am335x_adc
2) Drop a bonus bracket in iio-trig-bfin-timmer
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.21 (GNU/Linux)
iQIcBAABAgAGBQJSSJGeAAoJEFSFNJnE9BaIY7gP/i5Ld5glnnG88dtM5g7Cs7Ws
utyKeWh1QxfbtRH8MDo8GfxKAszVeckG552A/iOfm0wvqJT1JIKRdUl2rMcFvfBS
V0YFMKlJlXgfVhvy4X9NgfllrlGxAniFQ115l3JnPPhnEfr4IgWOt1K5dmK0kqhF
Ztt9ukmsa2VctefYP5tcRh1Pw1rYdFndgkdZ64+fYjkAideMbWxgQdh91al1prBz
wXTu43djnQimvkZLRQ0tX0/WQ9zs8H3bNm7TEgOqU8bbX785B8XQv6N6OFnqXT9S
IjbZHniih8eU13gtno+yBR6OO0gYw20IxXmVRUUR/F94JT4KXaByGBrMz8Z5f+o0
OilkJjOdn/jH+Dd1XHJN7Pk8Wx6PxwxeENu3qVn7p1vg27OcpY0Ft2vcdecnE11j
7bP3HYLcxGxLCW0qYs6voK2SnvZAJBrw8MB4sdUAmykdn1xJ25wccQm6iMKTvTYI
+AipHI3jpfllIC6Lj1IVhksAYhd7BlMK5sVH0sgWirLbqACSweRpbXCcRlSvguGY
jEX5+Rf29QfYbcIsRlk52+n47FpVVOfIDjc0ygBGLovaE+FLorWUNIJjeM3f570D
l2nwWjHUvnNFtA/NnBf4NLrwHqXzBX30Httk+6eISRJiyjr4ZEBAVpRt/luiWlqY
bOEBICwy1sEzF4ymIAIa
=KKlS
-----END PGP SIGNATURE-----
Merge tag 'iio-for-3.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:
Second set of new functionality for IIO in the 3.13 cycle - with bug fixes for first set.
This is a small, mainly to get a couple of compile bug related fixes into
the tree ASAP.
New device support:
1) Add ad5446 dac support to the ad5641 driver.
New functionality and cleanups:
1) Optional power supply regulators for the st pressure sensors drivers using
the new optional regulator interface.
2) Bit of tidying up of naming in the sysfs trigger.
Bug fixes from the previous series:
1) Missing select IIO_BUFFER for ti_am335x_adc
2) Drop a bonus bracket in iio-trig-bfin-timmer
This patch converts clk-imx2[38] clocksource_of_init compatible init
associated with fsl,imx2[38]-clkctrl. With arch/arm calling
of_clk_init(NULL) from time_init(), we can now also remove custom
.init_time hooks.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Common clock framework allows to register clock providers to get called
on of_clk_init() by using CLK_OF_DECLARE. This converts sunxi clock
providers to make use of it and get rid of the mach specific clk init
call. As sunxi has a bunch of independent clk provider nodes, we hook
current clock init to board compatible to make it called once.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Common clock framework allows to register clock providers to get called
on of_clk_init() by using CLK_OF_DECLARE. This converts nomadik clock
provider to make use of it and get rid of the mach specific clk init
call. As clocks require system reset controller base address to be
initialized each clock driver checks src_base and calls new
nomadik_src_init if required.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Mike Turquette <mturquette@linaro.org>
Commit 638c5115a7949(USBNET: support DMA SG) introduces DMA SG
if the usb host controller is capable of building packet from
discontinuous buffers, but missed handling padding packet when
building DMA SG.
This patch attachs the pre-allocated padding packet at the
end of the sg list, so padding packet can be sent to device
if drivers require that.
Reported-by: David Laight <David.Laight@aculab.com>
Acked-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull s390 lockref enablement from Heiko Carstens:
"Enabling the new lockless lockref variant on s390 would have been
trivial until Tony Luck added a cpu_relax() call into the
CMPXCHG_LOOP(), with commit d472d9d98b ("lockref: Relax in cmpxchg
loop")
As already mentioned cpu_relax() is very expensive on s390 since it
yields() the current virtual cpu. So we are talking of several
thousand cycles. Considering this enabling the lockless lockref
variant would contradict the intention of the new semantics. And also
some quick measurements show performance regressions of 50% and more.
Simply removing the cpu_relax() call again seems also not very
desireable since Waiman Long reported that for some workloads the call
improved performance by 5%."
* 'lockref' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: enable ARCH_USE_CMPXCHG_LOCKREF
lockref: use arch_mutex_cpu_relax() in CMPXCHG_LOOP()
mutex: replace CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX with simple ifdef
Clean-up to fix some warnings for !OF builds and spelling fixes in docs
- Clean-up openrisc prom.h
- Fix warnings caused by of_irq.h ifdefs
- Spelling fix for Synopsys
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJSQk1CAAoJEMhvYp4jgsXilvEH/RztxbKNiscz7TYg3BDEKMaY
Wg029sdygyhl+Xb78FaS3Qwt98pK2JSWqzoNv+Tv4tfksGeLWVpe6HZlGvE/PEhb
2gUppV/ILyFSH8NfeaPkfyqdqr2XpCkcpkyGgYz7jQzNti5si448LC4oJR4tcXbo
FyKguRPpTtCwDsG86fPpQgc3fHK+C7W0oQu+PCB0DEk2ApMR05rGpPiMEYsKlj56
FE9n0DMZ+hfMCUivraVTyHB9Xjo4/kTv9s88hkjwkli+N3UrIUgOGsQpNt27QIFP
F95roRtVOmgRbgtMAFaEy7nfd+3QuVfonjfYNJzxpRHWew7LxFcVEwWRjb0zaKQ=
=KDHt
-----END PGP SIGNATURE-----
Merge tag 'devicetree-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree fixes from Rob Herring:
"Clean-up to fix some warnings for !OF builds and spelling fixes in
docs:
- Clean-up openrisc prom.h
- Fix warnings caused by of_irq.h ifdefs
- Spelling fix for Synopsys"
* tag 'devicetree-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dts: Fix misspelling of Synopsys
of: clean-up ifdefs in of_irq.h
openrisc: clean-up prom.h
Now that all in-kernel users of bus_type.drv_attrs have been converted
to use drv_groups instead, the drv_attrs field, and logic surrounding
it, can be removed.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that all in-kernel users of bus_type.bus_attrs have been converted
to use bus_groups instead, the bus_attrs field, and logic surrounding
it, can be removed.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus suggested to replace
#ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
#define arch_mutex_cpu_relax() cpu_relax()
#endif
with just a simple
#ifndef arch_mutex_cpu_relax
# define arch_mutex_cpu_relax() cpu_relax()
#endif
to get rid of CONFIG_HAVE_CPU_RELAX_SIMPLE. So architectures can
simply define arch_mutex_cpu_relax if they want an architecture
specific function instead of having to add a select statement in
their Kconfig in addition.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Yuanhan reported a serious throughput regression in his pigz
benchmark. Using the ftrace patch I found that several idle
paths need more TLC before we can switch the generic
need_resched() over to preempt_need_resched.
The preemption paths benefit most from preempt_need_resched and
do indeed use it; all other need_resched() users don't really
care that much so reverting need_resched() back to
tif_need_resched() is the simple and safe solution.
Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: lkp@linux.intel.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20130927153003.GF15690@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Used the generic PHY framework API to create the PHY. For powering on
and powering off the PHY, power_on and power_off ops are used. Once the
MUSB OMAP glue is adapted to the new framework, the suspend and resume
ops of usb phy library will be removed. Also twl4030-usb driver is moved
to drivers/phy/.
However using the old usb phy library cannot be completely removed
because otg is intertwined with phy and moving to the new
framework completely will break otg. Once we have a separate otg state machine,
we can get rid of the usb phy library.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The PHY framework provides a set of APIs for the PHY drivers to
create/destroy a PHY and APIs for the PHY users to obtain a reference to the
PHY with or without using phandle. For dt-boot, the PHY drivers should
also register *PHY provider* with the framework.
PHY drivers should create the PHY by passing id and ops like init, exit,
power_on and power_off. This framework is also pm runtime enabled.
The documentation for the generic PHY framework is added in
Documentation/phy.txt and the documentation for dt binding can be found at
Documentation/devicetree/bindings/phy/phy-bindings.txt
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Tested-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use i_writecount to control whether to get an fscache cookie in nfs_open() as
NFS does not do write caching yet. I *think* this is the cause of a problem
encountered by Mark Moseley whereby __fscache_uncache_page() gets a NULL
pointer dereference because cookie->def is NULL:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160
PGD 0
Thread overran stack, or stack corrupted
Oops: 0000 [#1] SMP
Modules linked in: ...
CPU: 7 PID: 18993 Comm: php Not tainted 3.11.1 #1
Hardware name: Dell Inc. PowerEdge R420/072XWF, BIOS 1.3.5 08/21/2012
task: ffff8804203460c0 ti: ffff880420346640
RIP: 0010:[<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160
RSP: 0018:ffff8801053af878 EFLAGS: 00210286
RAX: 0000000000000000 RBX: ffff8800be2f8780 RCX: ffff88022ffae5e8
RDX: 0000000000004c66 RSI: ffffea00055ff440 RDI: ffff8800be2f8780
RBP: ffff8801053af898 R08: 0000000000000001 R09: 0000000000000003
R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00055ff440
R13: 0000000000001000 R14: ffff8800c50be538 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88042fc60000(0063) knlGS:00000000e439c700
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000001d8f000 CR4: 00000000000607f0
Stack:
...
Call Trace:
[<ffffffff81365a72>] __nfs_fscache_invalidate_page+0x42/0x70
[<ffffffff813553d5>] nfs_invalidate_page+0x75/0x90
[<ffffffff811b8f5e>] truncate_inode_page+0x8e/0x90
[<ffffffff811b90ad>] truncate_inode_pages_range.part.12+0x14d/0x620
[<ffffffff81d6387d>] ? __mutex_lock_slowpath+0x1fd/0x2e0
[<ffffffff811b95d3>] truncate_inode_pages_range+0x53/0x70
[<ffffffff811b969d>] truncate_inode_pages+0x2d/0x40
[<ffffffff811b96ff>] truncate_pagecache+0x4f/0x70
[<ffffffff81356840>] nfs_setattr_update_inode+0xa0/0x120
[<ffffffff81368de4>] nfs3_proc_setattr+0xc4/0xe0
[<ffffffff81357f78>] nfs_setattr+0xc8/0x150
[<ffffffff8122d95b>] notify_change+0x1cb/0x390
[<ffffffff8120a55b>] do_truncate+0x7b/0xc0
[<ffffffff8121f96c>] do_last+0xa4c/0xfd0
[<ffffffff8121ffbc>] path_openat+0xcc/0x670
[<ffffffff81220a0e>] do_filp_open+0x4e/0xb0
[<ffffffff8120ba1f>] do_sys_open+0x13f/0x2b0
[<ffffffff8126aaf6>] compat_SyS_open+0x36/0x50
[<ffffffff81d7204c>] sysenter_dispatch+0x7/0x24
The code at the instruction pointer was disassembled:
> (gdb) disas __fscache_uncache_page
> Dump of assembler code for function __fscache_uncache_page:
> ...
> 0xffffffff812a18ff <+31>: mov 0x48(%rbx),%rax
> 0xffffffff812a1903 <+35>: cmpb $0x0,0x10(%rax)
> 0xffffffff812a1907 <+39>: je 0xffffffff812a19cd <__fscache_uncache_page+237>
These instructions make up:
ASSERTCMP(cookie->def->type, !=, FSCACHE_COOKIE_TYPE_INDEX);
That cmpb is the faulting instruction (%rax is 0). So cookie->def is NULL -
which presumably means that the cookie has already been at least partway
through __fscache_relinquish_cookie().
What I think may be happening is something like a three-way race on the same
file:
PROCESS 1 PROCESS 2 PROCESS 3
=============== =============== ===============
open(O_TRUNC|O_WRONLY)
open(O_RDONLY)
open(O_WRONLY)
-->nfs_open()
-->nfs_fscache_set_inode_cookie()
nfs_fscache_inode_lock()
nfs_fscache_disable_inode_cookie()
__fscache_relinquish_cookie()
nfs_inode->fscache = NULL
<--nfs_fscache_set_inode_cookie()
-->nfs_open()
-->nfs_fscache_set_inode_cookie()
nfs_fscache_inode_lock()
nfs_fscache_enable_inode_cookie()
__fscache_acquire_cookie()
nfs_inode->fscache = cookie
<--nfs_fscache_set_inode_cookie()
<--nfs_open()
-->nfs_setattr()
...
...
-->nfs_invalidate_page()
-->__nfs_fscache_invalidate_page()
cookie = nfsi->fscache
-->nfs_open()
-->nfs_fscache_set_inode_cookie()
nfs_fscache_inode_lock()
nfs_fscache_disable_inode_cookie()
-->__fscache_relinquish_cookie()
-->__fscache_uncache_page(cookie)
<crash>
<--__fscache_relinquish_cookie()
nfs_inode->fscache = NULL
<--nfs_fscache_set_inode_cookie()
What is needed is something to prevent process #2 from reacquiring the cookie
- and I think checking i_writecount should do the trick.
It's also possible to have a two-way race on this if the file is opened
O_TRUNC|O_RDONLY instead.
Reported-by: Mark Moseley <moseleymark@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Provide the ability to enable and disable fscache cookies. A disabled cookie
will reject or ignore further requests to:
Acquire a child cookie
Invalidate and update backing objects
Check the consistency of a backing object
Allocate storage for backing page
Read backing pages
Write to backing pages
but still allows:
Checks/waits on the completion of already in-progress objects
Uncaching of pages
Relinquishment of cookies
Two new operations are provided:
(1) Disable a cookie:
void fscache_disable_cookie(struct fscache_cookie *cookie,
bool invalidate);
If the cookie is not already disabled, this locks the cookie against other
dis/enablement ops, marks the cookie as being disabled, discards or
invalidates any backing objects and waits for cessation of activity on any
associated object.
This is a wrapper around a chunk split out of fscache_relinquish_cookie(),
but it reinitialises the cookie such that it can be reenabled.
All possible failures are handled internally. The caller should consider
calling fscache_uncache_all_inode_pages() afterwards to make sure all page
markings are cleared up.
(2) Enable a cookie:
void fscache_enable_cookie(struct fscache_cookie *cookie,
bool (*can_enable)(void *data),
void *data)
If the cookie is not already enabled, this locks the cookie against other
dis/enablement ops, invokes can_enable() and, if the cookie is not an
index cookie, will begin the procedure of acquiring backing objects.
The optional can_enable() function is passed the data argument and returns
a ruling as to whether or not enablement should actually be permitted to
begin.
All possible failures are handled internally. The cookie will only be
marked as enabled if provisional backing objects are allocated.
A later patch will introduce these to NFS. Cookie enablement during nfs_open()
is then contingent on i_writecount <= 0. can_enable() checks for a race
between open(O_RDONLY) and open(O_WRONLY/O_RDWR). This simplifies NFS's cookie
handling and allows us to get rid of open(O_RDONLY) accidentally introducing
caching to an inode that's open for writing already.
One operation has its API modified:
(3) Acquire a cookie.
struct fscache_cookie *fscache_acquire_cookie(
struct fscache_cookie *parent,
const struct fscache_cookie_def *def,
void *netfs_data,
bool enable);
This now has an additional argument that indicates whether the requested
cookie should be enabled by default. It doesn't need the can_enable()
function because the caller must prevent multiple calls for the same netfs
object and it doesn't need to take the enablement lock because no one else
can get at the cookie before this returns.
Signed-off-by: David Howells <dhowells@redhat.com
Add wrapper functions for dealing with cookie->n_active:
(*) __fscache_use_cookie() to increment it.
(*) __fscache_unuse_cookie() to decrement and test against zero.
(*) __fscache_wake_unused_cookie() to wake up anyone waiting for it to reach
zero.
The second and third are split so that the third can be done after cookie->lock
has been released in case the waiter wakes up whilst we're still holding it and
tries to get it.
We will need to wake-on-zero once the cookie disablement patch is applied
because it will then be possible to see n_active become zero without the cookie
being relinquished.
Also move the cookie usement out of fscache_attr_changed_op() and into
fscache_attr_changed() and the operation struct so that cookie disablement
will be able to track it.
Whilst we're at it, only increment n_active if we're about to do
fscache_submit_op() so that we don't have to deal with undoing it if anything
earlier fails. Possibly this should be moved into fscache_submit_op() which
could look at FSCACHE_OP_UNUSE_COOKIE.
Signed-off-by: David Howells <dhowells@redhat.com>
This patchset modify extcon core to remove unnecessary allocation sequence for
'dev' instance and change extcon_dev_register() interface. extcon-gpio use
gpiolib API to get debounce time and include small fix of extcon core/device
driver.
Detailed description for patchset:
1. Modify extcon core driver
- The extcon-gpio driver use gpio_set_debounce() API provided from gpiolib
if gpio driver for SoC support gpio_set_debounce() function and support 'gpio_
activ_low' filed to check whether gpio active state is 1(high) or 0(low).
- Change field type of 'dev' in structure extcon_dev and remove the sequence
of allocating memory of 'struct dev' on extcon_dev_register() function because
extcon device must need 'struct device.
- Change extcon_dev_register() prototype to simplify it and remove unnecessary
parameter as below:
2. Fix coding style and typo
- extcon core : Fix indentation coding style and remove unnecessary casting
- extcon-max8997 : Fix checkpatch warning
- extcon-max77693 : Fix checkpatch warning
- extcon-arizona : Fix typo of comment and modify minor issue
- extcon-palmas : Use dev_get_platdata()
3. Modify extcon-arizona driver
- Modify minor issue about micbias and comparision statement
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSRNuGAAoJEJzN3yze689TcaQP/0jyJEFTkB0EsutUsK5K4Ul9
Hv6G07ccCQZOR8Bk5VR3a/ldX2nMlc8uZkhCrcvxbPM9NDuUNt5Q18EFrMCYomwz
bipFPds6Q2UuGxnaSgqBQKJ6PH/IYXTfS1Y2nbaOivKK18L62WBWw4wCz59WWyS8
B4ROjn0OjfzHlbp3uPjjCAqY9PpVsxFfTxszR0+Xt4hbapet/7UTgyMO6fJRZ0JF
gzWCKcXghRvv5ZEravX5sUc4SW6bNYKq3CgPBPIEmo33W0obk5bKOY5qDVdc7e03
WhYCPhm/gYsEfHq/Ah9k9xz3Q5ObrNdDep5nAFrS4PpoyJfalYwpRmcExTTuCEXj
CXmR/NpSK2BPlxgrQ32/bTve2bMtT16ZAMXKAIxHGwAUG/RAuQIZx0trjTSERNi1
A0juMZG3a2o9q+heCIeUp/bjhMGqCgfmlSwGrinU6hZMEj8DL4KtgiZrozOp4RlF
SX+LOvNA40EK+S2FljY30G3NkTCLksJimjQ6v37b1vhSnXBNh5R+lhzXUY6cBBHB
b5oo+6P4A9CCcmoUg8XOb5/NEjJX8JxpUKGpF/saBgHOQl5xXAfYBYVi5HVpgrJ1
Soqlv2s/wUxmGGjG9meC+AEvTu/DtKGNUMtGDZM8TUULAxwFsvdStSeJ5JHqyXuE
iaAj5pGTNkFsXnwPISZT
=8Hr2
-----END PGP SIGNATURE-----
Merge tag 'extcon-next-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next
Chanwoo writes:
Update extcon for 3.13
This patchset modify extcon core to remove unnecessary allocation sequence for
'dev' instance and change extcon_dev_register() interface. extcon-gpio use
gpiolib API to get debounce time and include small fix of extcon core/device
driver.
Detailed description for patchset:
1. Modify extcon core driver
- The extcon-gpio driver use gpio_set_debounce() API provided from gpiolib
if gpio driver for SoC support gpio_set_debounce() function and support 'gpio_
activ_low' filed to check whether gpio active state is 1(high) or 0(low).
- Change field type of 'dev' in structure extcon_dev and remove the sequence
of allocating memory of 'struct dev' on extcon_dev_register() function because
extcon device must need 'struct device.
- Change extcon_dev_register() prototype to simplify it and remove unnecessary
parameter as below:
2. Fix coding style and typo
- extcon core : Fix indentation coding style and remove unnecessary casting
- extcon-max8997 : Fix checkpatch warning
- extcon-max77693 : Fix checkpatch warning
- extcon-arizona : Fix typo of comment and modify minor issue
- extcon-palmas : Use dev_get_platdata()
3. Modify extcon-arizona driver
- Modify minor issue about micbias and comparision statement
This patch remove extcon_dev_register()'s second parameter which means
the pointer of parent device to simplify prototype of this function.
So, if extcon device has the parent device, it should set the pointer of
parent device to edev.dev.parent in extcon device driver instead of in
extcon_dev_register().
Cc: Graeme Gregory <gg@slimlogic.co.uk>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
The extcon device must always need 'struct device' so this patch change
field type of 'dev' instead of allocating memory for 'struct device' on
extcon_dev_register() function.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Myungjoo Ham <cw00.choi@samsung.com>
This patch add 'gpio_active_low' field to 'struct gpio_extcon_data'
to check whether gpio active state is 1(high) or 0(low).
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Prevent drivers relying on platform_driver_probe from requesting
deferred probing in order to avoid further futile probe attempts (either
the driver has been unregistered or its probe function has been set to
platform_drv_probe_fail when probing is retried).
Note that several platform drivers currently return subsystem errors
from probe and that these can include -EPROBE_DEFER (e.g. if a gpio
request fails).
Add a warning to platform_drv_probe that can be used to catch drivers
that inadvertently request probe deferral while using
platform_driver_probe.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A common way to handle kobject lifetimes in embedded in objects with
different lifetime rules is to pair the kobject with a struct completion.
This introduces a kobj_completion structure that can be used in place
of the pairing, along with several convenience functions for
initialization, release, and put-and-wait.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pre-existing sysfs interfaces which take explicit namespace
argument are weird in that they place the optional @ns in front of
@name which is contrary to the established convention. For example,
we end up forcing vast majority of sysfs_get_dirent() users to do
sysfs_get_dirent(parent, NULL, name), which is silly and error-prone
especially as @ns and @name may be interchanged without causing
compilation warning.
This renames sysfs_get_dirent() to sysfs_get_dirent_ns() and swap the
positions of @name and @ns, and sysfs_get_dirent() is now a wrapper
around sysfs_get_dirent_ns(). This makes confusions a lot less
likely.
There are other interfaces which take @ns before @name. They'll be
updated by following patches.
This patch doesn't introduce any functional changes.
v2: EXPORT_SYMBOL_GPL() wasn't updated leading to undefined symbol
error on module builds. Reported by build test robot. Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's no reason for sysfs to be calling ktype->namespace(). It is
backwards, obfuscates what's going on and unnecessarily tangles two
separate layers.
There are two places where symlink code calls ktype->namespace().
* sysfs_do_create_link_sd() calls it to find out the namespace tag of
the target directory. Unless symlinking races with cross-namespace
renaming, this equals @target_sd->s_ns.
* sysfs_rename_link() uses it to find out the new namespace to rename
to and the new namespace can be different from the existing one.
The function is renamed to sysfs_rename_link_ns() with an explicit
@ns argument and the ktype->namespace() invocation is shifted to the
device layer.
While this patch replaces ktype->namespace() invocation with the
recorded result in @target_sd, this shouldn't result in any behvior
difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For some unrecognizable reason, namespace information is communicated
to sysfs through ktype->namespace() callback when there's *nothing*
which needs the use of a callback. The whole sequence of operations
is completely synchronous and sysfs operations simply end up calling
back into the layer which just invoked it in order to find out the
namespace information, which is completely backwards, obfuscates
what's going on and unnecessarily tangles two separate layers.
This patch doesn't remove ktype->namespace() but shifts its handling
to kobject layer. We probably want to get rid of the callback in the
long term.
This patch adds an explicit param to sysfs_{create|rename|move}_dir()
and renames them to sysfs_{create|rename|move}_dir_ns(), respectively.
ktype->namespace() invocations are moved to the calling sites of the
above functions. A new helper kboject_namespace() is introduced which
directly tests kobj_ns_type_operations->type which should give the
same result as testing sysfs_fs_type(parent_sd) and returns @kobj's
namespace tag as necessary. kobject_namespace() is extern as it will
be used from another file in the following patches.
This patch should be an equivalent conversion without any functional
difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysfs ns (namespace) implementation became more convoluted than
necessary while trying to hide ns information from visible interface.
The relatively recent attr ns support is a good example.
* attr ns tag is determined by sysfs_ops->namespace() callback while
dir tag is determined by kobj_type->namespace(). The placement is
arbitrary.
* Instead of performing operations with explicit ns tag, the namespace
callback is routed through sysfs_attr_ns(), sysfs_ops->namespace(),
class_attr_namespace(), class_attr->namespace(). It's not simpler
in any sense. The only thing this convolution does is traversing
the whole stack backwards.
The namespace callbacks are unncessary because the operations involved
are inherently synchronous. The information can be provided in in
straight-forward top-down direction and reversing that direction is
unnecessary and against basic design principles.
This backward interface is unnecessarily convoluted and hinders
properly separating out sysfs from driver model / kobject for proper
layering. This patch updates attr ns support such that
* sysfs_ops->namespace() and class_attr->namespace() are dropped.
* sysfs_{create|remove}_file_ns(), which take explicit @ns param, are
added and sysfs_{create|remove}_file() are now simple wrappers
around the ns aware functions.
* ns handling is dropped from sysfs_chmod_file(). Nobody uses it at
this point. sysfs_chmod_file_ns() can be added later if necessary.
* Explicit @ns is propagated through class_{create|remove}_file_ns()
and netdev_class_{create|remove}_file_ns().
* driver/net/bonding which is currently the only user of attr
namespace is updated to use netdev_class_{create|remove}_file_ns()
with @bh->net as the ns tag instead of using the namespace callback.
This patch should be an equivalent conversion without any functional
difference. It makes the code easier to follow, reduces lines of code
a bit and helps proper separation and layering.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current code does not correctly negotiate the version numbers for the util
driver when hosted on earlier hosts. The version numbers presented by this
driver were not compatible with the version numbers supported by Windows Server
2008. Fix this problem.
I would like to thank Olaf Hering (ohering@suse.com) for identifying the problem.
Reported-by: Olaf Hering <ohering@suse.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's no longer needed, and the struct hv_ring_buffer_debug_info
structure shouldn't be "global" so move it to the local .h file instead.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's only used once, only contains 2 function calls, so just make those
calls directly, deleting the function, and the now unneeded structure
entirely.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the "client_monitor_conn_id" and "server_monitor_conn_id" bus
attributes to the dev_groups structure, removing the need for it to be
in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the "client_monitor_latency" and "server_monitor_latency" bus
attributes to the dev_groups structure, removing the need for it to be
in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the "client_monitor_pending" and "server_monitor_pending" bus
attributes to the dev_groups structure, removing the need for it to be
in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the "device_id" bus attribute to the dev_groups structure,
removing the need for it to be in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the "class_id" bus attribute to the dev_groups structure,
removing the need for it to be in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the "state" bus attribute to the dev_groups structure,
removing the need for it to be in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the "state" bus attribute to the dev_groups structure,
removing the need for it to be in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch is the first in a series that moves the hv bus code to use the
dev_groups field instead of dev_attrs, as dev_attrs is going away in future
kernel releases.
It moves the id sysfs file to the dev_groups structure, and creates the needed
show/store functions, instead of relying on one "universal" function for this.
By doing this, it removes the need for this to be in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Determine if we've created a new file by examining the directory change
attribute and/or the O_EXCL flag.
This fixes a regression when doing a non-exclusive create of a new file.
If the FILE_CREATED flag is not set, the atomic_open() command will
perform full file access permissions checks instead of just checking
for MAY_OPEN.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Here we're using the old clock initialisation function as a template.
It's necessary to remove all of the clk_register_clkdev() calls as
they don't make sense when booting with Device Tree.
Cc: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
These are required to request DBx500 PRCMU clocks from Device Tree. The
numbers used are taken directly from the Hardware Specification document.
We're moving them from the DBx500 PRCMU include file into the DT include
directory and referencing them from the former via a #include.
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
... as stipulated by the Hardware Specification document.
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
udev has this nice feature of creating "dead" /dev/<node> device-nodes if
it finds a devnode:<node> modalias. Once the node is accessed, the kernel
automatically loads the module that provides the node. However, this
requires udev to know the major:minor code to use for the node. This
feature was introduced by:
commit 578454ff7e
Author: Kay Sievers <kay.sievers@vrfy.org>
Date: Thu May 20 18:07:20 2010 +0200
driver core: add devname module aliases to allow module on-demand auto-loading
However, uhid uses dynamic minor numbers so this doesn't actually work. We
need to load uhid to know which minor it's going to use.
Hence, allocate a static minor (just like uinput does) and we're good
to go.
Reported-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Although originally conceived as a hook for port drivers to know
when a port reference is dropped, no driver uses this method.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The following patch is required to resolve remote wake issues with
certain devices.
Issue description:
If the remote wake is issued from the device in a specific timing
condition while the system is entering sleep state then it may cause
system to auto wake on subsequent sleep cycle.
Root cause:
Host controller rebroadcasts the Resume signal > 100 µseconds after
receiving the original resume event from the device. For proper
function, some devices may require the rebroadcast of resume event
within the USB spec of 100µS.
Workaroud:
1. Filter the AMD platforms with Yangtze chipset, then judge of all the usb
devices are mouse or not. And get out the port id which attached a mouse
with Pixart controller.
2. Then reset the port which attached issue device during system resume
from S3.
[Q] Why the special devices are only mice? Would high speed devices
such as 3G modem or USB Bluetooth adapter trigger this issue?
- Current this sensitivity is only confined to devices that use Pixart
controllers. This controller is designed for use with LS mouse
devices only. We have not observed any other devices failing. There
may be a small risk for other devices also but this patch (reset
device in resume phase) will cover the cases if required.
[Q] Shouldn’t the resume signal be sent within 100 us for every
device?
- The Host controller may not send the resume signal within 100us,
this our host controller specification change. This is why we
require the patch to prevent side effects on certain known devices.
[Q] Why would clicking mouse INTENSELY to wake the system up trigger
this issue?
- This behavior is specific to the devices that use Pixart controller.
It is timing dependent on when the resume event is triggered during
the sleep state.
[Q] Is it a host controller issue or mouse?
- It is the host controller behavior during resume that triggers the
device incorrect behavior on the next resume.
This patch sets USB_QUIRK_RESET_RESUME flag for these Pixart-based mice
when they attached to platforms with AMD Yangtze chipset.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A few fixes for dm-snapshot, a 32 bit fix for dm-stats, a couple error
handling fixes for dm-multipath. A fix for the thin provisioning target
to not expose non-zero discard limits if discards are disabled.
Lastly, add two DM module parameters which allow users to tune the
emergency memory reserves that DM mainatins per device -- this helps fix
a long-standing issue for dm-multipath. The conservative default
reserve for request-based dm-multipath devices (256) has proven
problematic for users with many multipathed SCSI devices but relatively
little memory. To responsibly select a smaller value users should use
the new nr_bios tracepoint info (via commit 75afb352 "block: Add nr_bios
to block_rq_remap tracepoint") to determine the peak number of bios
their workloads create.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSQMVHAAoJEMUj8QotnQNaOXgIAJS6/XJKMoHfiDJ9M+XD34rZ
Uyr9TEnubX3DKCRBiY23MUcCQn3fx6BjCGv5/c8L4jQFIuLyDi2yatqpwXcbGSJh
G/S/y6u0Axek+ew7TS80OFop4nblW6MoKnoh9/4N55Ofa+1WvKM4ERUGjHGbauyS
TxmLQPToCFPLYRIOZ+imd6hQuIZ1+FFdJFvi7kY9O6Llx2sLD6fWi1iruBd/Da2H
ByMX3biGN45mSpcBzRbSC/FkJ9CRIvT9n82BDPS0o3Tllt8NaVlEDaovB7h4ncc0
bFuT2Z3Q38B9uZ8Lj0bqdGzv3kXMLCkLo6WhWjyUt84hmDPAzRpBwt60jUqWyZs=
=bjVp
-----END PGP SIGNATURE-----
Merge tag 'dm-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device-mapper fixes from Mike Snitzer:
"A few fixes for dm-snapshot, a 32 bit fix for dm-stats, a couple error
handling fixes for dm-multipath. A fix for the thin provisioning
target to not expose non-zero discard limits if discards are disabled.
Lastly, add two DM module parameters which allow users to tune the
emergency memory reserves that DM mainatins per device -- this helps
fix a long-standing issue for dm-multipath. The conservative default
reserve for request-based dm-multipath devices (256) has proven
problematic for users with many multipathed SCSI devices but
relatively little memory. To responsibly select a smaller value users
should use the new nr_bios tracepoint info (via commit 75afb352
"block: Add nr_bios to block_rq_remap tracepoint") to determine the
peak number of bios their workloads create"
* tag 'dm-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: add reserved_bio_based_ios module parameter
dm: add reserved_rq_based_ios module parameter
dm: lower bio-based mempool reservation
dm thin: do not expose non-zero discard limits if discards disabled
dm mpath: disable WRITE SAME if it fails
dm-snapshot: fix performance degradation due to small hash size
dm snapshot: workaround for a false positive lockdep warning
dm stats: fix possible counter corruption on 32-bit systems
dm mpath: do not fail path on -ENOSPC
This resolves the merge problem with two iio drivers that Stephen
Rothwell pointed out.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The old rcu_is_cpu_idle() function is just __rcu_is_watching() with
preemption disabled. This commit therefore renames rcu_is_cpu_idle()
to rcu_is_watching.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
There is currently no way for kernel code to determine whether it
is safe to enter an RCU read-side critical section, in other words,
whether or not RCU is paying attention to the currently running CPU.
Given the large and increasing quantity of code shared by the idle loop
and non-idle code, the this shortcoming is becoming increasingly painful.
This commit therefore adds __rcu_is_watching(), which returns true if
it is safe to enter an RCU read-side critical section on the currently
running CPU. This function is quite fast, using only a __this_cpu_read().
However, the caller must disable preemption.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Move the workaround for double sending AUDIO_CODEC and AUDIO_DAC writes
into the SPI core, aiding refactoring to eliminate the ASoC custom I/O
functions and avoiding the extra writes for I2C.
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Remove the bloat of the C calling convention out of the
preempt_enable() sites by creating an ASM wrapper which allows us to
do an asm("call ___preempt_schedule") instead.
calling.h bits by Andi Kleen
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-tk7xdi1cvvxewixzke8t8le1@git.kernel.org
[ Fixed build error. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When using per-cpu preempt_count variables we need to save/restore the
preempt_count on context switch (into per task storage; for instance
the old thread_info::preempt_count variable) because of
PREEMPT_ACTIVE.
However, this means that on fork() the preempt_count value of the last
context switch gets copied and if we had a PREEMPT_ACTIVE switch right
before cloning a child task the child task will now too have
PREEMPT_ACTIVE set and start its life with an extra PREEMPT_ACTIVE
count.
Therefore we need to make init_task_preempt_count() unconditional;
this resets whatever preempt_count we inherited from our parent
process.
Doing so for !per-cpu implementations is harmless.
For !PREEMPT_COUNT kernels we need to be careful not to start life
with an increased preempt_count.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-4k0b7oy1rcdyzochwiixuwi9@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rewrite the preempt_count macros in order to extract the 3 basic
preempt_count value modifiers:
__preempt_count_add()
__preempt_count_sub()
and the new:
__preempt_count_dec_and_test()
And since we're at it anyway, replace the unconventional
$op_preempt_count names with the more conventional preempt_count_$op.
Since these basic operators are equivalent to the previous _notrace()
variants, do away with the _notrace() versions.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-ewbpdbupy9xpsjhg960zwbv8@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to prepare to per-arch implementations of preempt_count move
the required bits into an asm-generic header and use this for all
archs.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to combine the preemption and need_resched test we need to
fold the need_resched information into the preempt_count value.
Since the NEED_RESCHED flag is set across CPUs this needs to be an
atomic operation, however we very much want to avoid making
preempt_count atomic, therefore we keep the existing TIF_NEED_RESCHED
infrastructure in place but at 3 sites test it and fold its value into
preempt_count; namely:
- resched_task() when setting TIF_NEED_RESCHED on the current task
- scheduler_ipi() when resched_task() sets TIF_NEED_RESCHED on a
remote task it follows it up with a reschedule IPI
and we can modify the cpu local preempt_count from
there.
- cpu_idle_loop() for when resched_task() found tsk_is_polling().
We use an inverted bitmask to indicate need_resched so that a 0 means
both need_resched and !atomic.
Also remove the barrier() in preempt_enable() between
preempt_enable_no_resched() and preempt_check_resched() to avoid
having to reload the preemption value and allow the compiler to use
the flags of the previuos decrement. I couldn't come up with any sane
reason for this barrier() to be there as preempt_enable_no_resched()
already has a barrier() before doing the decrement.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-7a7m5qqbn5pmwnd4wko9u6da@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Replace the single preempt_count() 'function' that's an lvalue with
two proper functions:
preempt_count() - returns the preempt_count value as rvalue
preempt_count_set() - Allows setting the preempt-count value
Also provide preempt_count_ptr() as a convenience wrapper to implement
all modifying operations.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-orxrbycjozopqfhb4dxdkdvb@git.kernel.org
[ Fixed build failure. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
regressed several workloads and caused excessive reschedule
interrupts.
The patch in question failed to notice that the x86 code had an
inverted sense of the polling state versus the new generic code (x86:
default polling, generic: default !polling).
Fix the two prominent x86 mwait based idle drivers and introduce a few
new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
usage).
Also switch the idle routines to using tif_need_resched() which is an
immediate TIF_NEED_RESCHED test as opposed to need_resched which will
end up being slightly different.
Reported-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: lenb@kernel.org
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Preemption semantics are going to change which mandate a change.
All DRM usage sites are already broken and will not be affected (much)
by this change. DRM people are aware and will remove the last few
stragglers.
For now, leave an empty stub that generates a warning, once all users
are gone we can remove this.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: airlied@linux.ie
Cc: daniel.vetter@ffwll.ch
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-qfc1el2zvhxiyut4ai99ij4n@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The x86/AMD64 EFI stubs must use a call wrapper to convert between
the Linux and EFI ABIs, so void pointers are sufficient. For ARM,
the ABIs are compatible, so we can directly invoke the function
pointers. The functions that are used by the ARM stub are updated
to match the EFI definitions.
Also add some EFI types used by EFI functions.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Acked-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Much of of_irq.h is needlessly ifdef'ed. Clean this up and minimize the
amount ifdef'ed code. This fixes some build warnings when CONFIG_OF
is not enabled (seen on i386 and x86_64):
include/linux/of_irq.h:82:7: warning: 'struct device_node' declared inside parameter list [enabled by default]
include/linux/of_irq.h:82:7: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
include/linux/of_irq.h:87:47: warning: 'struct device_node' declared inside parameter list [enabled by default]
Compile tested on i386, sparc and arm.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Revert commit 3b38722efd ("memcg, vmscan: integrate soft reclaim
tighter with zone shrinking code")
I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert commit a5b7c87f92 ("vmscan, memcg: do softlimit reclaim also
for targeted reclaim")
I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert commit de57780dc6 ("memcg: enhance memcg iterator to support
predicates")
I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
watchdog_tresh controls how often nmi perf event counter checks per-cpu
hrtimer_interrupts counter and blows up if the counter hasn't changed
since the last check. The counter is updated by per-cpu
watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh
period which guarantees that hrtimer is scheduled 2 times per the main
period. Both hrtimer and perf event are started together when the
watchdog is enabled.
So far so good. But...
But what happens when watchdog_thresh is updated from sysctl handler?
proc_dowatchdog will set a new sampling period and hrtimer callback
(watchdog_timer_fn) will use the new value in the next round. The
problem, however, is that nobody tells the perf event that the sampling
period has changed so it is ticking with the period configured when it
has been set up.
This might result in an ear ripping dissonance between perf and hrtimer
parts if the watchdog_thresh is increased. And even worse it might lead
to KABOOM if the watchdog is configured to panic on such a spurious
lockup.
This patch fixes the issue by updating both nmi perf even counter and
hrtimers if the threshold value has changed.
The nmi one is disabled and then reinitialized from scratch. This has
an unpleasant side effect that the allocation of the new event might
fail theoretically so the hard lockup detector would be disabled for
such cpus. On the other hand such a memory allocation failure is very
unlikely because the original event is deallocated right before.
It would be much nicer if we just changed perf event period but there
doesn't seem to be any API to do that right now. It is also unfortunate
that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we
cannot use on_each_cpu() and do the same thing from the per-cpu context.
The update from the current CPU should be safe because
perf_event_disable removes the event atomically before it clears the
per-cpu watchdog_ev so it cannot change anything under running handler
feet.
The hrtimer is simply restarted (thanks to Don Zickus who has pointed
this out) if it is queued because we cannot rely it will fire&adopt to
the new sampling period before a new nmi event triggers (when the
treshold is decreased).
[akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove NEED_MACH_GPIO_H config select option for ARCH_DAVINCI
to start using gpiolib interface for davinci platforms. This makes
it easier to use the gpio driver on other platforms as it breaks
dependency on mach-davinci.
Latencies for gpio_get/set APIs will increase. On measurement,
latency was found to have increased by 18 microsecond with
gpiolib API as compared to inline APIs.
Measurement was done on DA850 EVM for gpio_get_value() API by
taking the printk timing across the call with interrupts disabled.
inline gpio API with interrupt disabled
[ 29.734337] before gpio_get
[ 29.736847] after gpio_get
Time difference 0.00251
gpio library with interrupt disabled
[ 272.876763] before gpio_get
[ 272.879291] after gpio_get
Time difference 0.002528
Latency increased by (0.002528 - 0.00251) = 18 microsecond.
While at it, remove GPIO_TYPE_DAVINCI enum definition as
gpio-davinci.c is converted to Linux device driver model.
Signed-off-by: Philip Avinash <avinashphilip@ti.com>
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
[nsekhar@ti.com: minor edits to commit message]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
According to Intel Vt-D specs, the offset of Invalidation complete
status register should be 0x9C, not 0x98.
See Intel's VT-d spec, Revision 1.3, Chapter 10.4, Page 98;
Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Modify DaVinci GPIO driver to become a platform device
driver.
The driver does not have platform driver structure or
a probe. Instead, it has pure_initcall function for
initialization. The platform specific informaiton is
obtained using the DaVinci specific davinci_soc_info
structure. This is a problem for Device Tree (DT)
implementation.
As a first stage of DT conversion, we implement a probe.
Additional notes:
- The driver registration happens as postcore_initcall.
This is required since machine init functions like
da850_lcd_hw_init() make use of GPIO.
- Start using devres APIs for simpler error handling.
Signed-off-by: KV Sujith <sujithkv@ti.com>
[avinashphilip@ti.com: Move global definition of
"davinci_gpio_controller" to local]
Signed-off-by: Philip Avinash <avinashphilip@ti.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
[nsekhar@ti.com: drop unused structure member, rebase to new
clean-up patch and fix error messages]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
We have USB fixes now in Linus's tree that we need to properly sort out
with reverts and the like in the usb-next branch, so merge them together
and do it by hand.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The power to some of the sensors are controlled by regulators. In most
cases these are 'always on', but if not they will fail to work until
the regulator is enabled using the relevant APIs. This patch allows for
the Vdd_IO power supply to be specified by either platform data or
Device Tree.
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The power to some of the sensors are controlled by regulators. In most
cases these are 'always on', but if not they will fail to work until
the regulator is enabled using the relevant APIs. This patch allows for
the Vdd power supply to be specified by either platform data or Device
Tree.
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The list_splice_init_rcu() function allows a list visible to RCU readers
to be spliced into another list visible to RCU readers. This is OK,
except for the use of INIT_LIST_HEAD(), which does pointer updates
without doing anything to make those updates safe for concurrent readers.
Of course, most of the time INIT_LIST_HEAD() is being used in reader-free
contexts, such as initialization or cleanup, so it is OK for it to update
pointers in an unsafe-for-RCU-readers manner. This commit therefore
creates an INIT_LIST_HEAD_RCU() that uses ACCESS_ONCE() to make the updates
reader-safe. The reason that we can use ACCESS_ONCE() instead of the more
typical rcu_assign_pointer() is that list_splice_init_rcu() is updating the
pointers to reference something that is already visible to readers, so
that there is no problem with pre-initialized values.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The some platforms (e.g., ARM) initializes their clocks as
late_initcalls for some unknown reason. So make sure
random_int_secret_init() is run after all of the late_initcalls are
run.
Cc: stable@vger.kernel.org
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Pull block IO fixes from Jens Axboe:
"After merge window, no new stuff this time only a collection of neatly
confined and simple fixes"
* 'for-3.12/core' of git://git.kernel.dk/linux-block:
cfq: explicitly use 64bit divide operation for 64bit arguments
block: Add nr_bios to block_rq_remap tracepoint
If the queue is dying then we only call the rq->end_io callout. This leaves bios setup on the request, because the caller assumes when the blk_execute_rq_nowait/blk_execute_rq call has completed that the rq->bios have been cleaned up.
bio-integrity: Fix use of bs->bio_integrity_pool after free
blkcg: relocate root_blkg setting and clearing
block: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)
block: trace all devices plug operation
A number of new drivers and some new functionality + a lot of cleanups
all over IIO.
New Core Elements
1) New INT_TIME info_mask element for integration time, which may have
different effects on measurement noise and similar, than an amplifier
and hence is different from existing SCALE. Already existed in some
drivers as a custom attribute.
2) Introduce a iio_push_buffers_with_timestamp helper to cover the common
case of filling the last 64 bits of data to be passed to the buffer with
a timestamp. Applied to lots of drivers. Cuts down on repeated code and
moves a slightly fiddly bit of logic into a single location.
3) Introduce info_mask_[shared_by_dir/shared_by_all] elements to allow support
of elements such as sampling_frequency which is typically shared by all
input channels on a device. This reduces code and makes these controls
available from in kernel consumers of IIO devices.
New drivers
1) MCP3422/3/4 ADC
2) TSL4531 ambient light sensor
3) TCS3472/5 color light sensor
4) GP2AP020A00F ambient light / proximity sensor
5) LPS001WP support added to ST pressure sensor driver.
New driver functionality
1) ti_am335x_adc Add buffered sampling support.
This device has a hardware fifo that is fed directly into an IIO kfifo
buffer based on a watershed interrupt. Note this will act as an example
of how to handle this increasingly common type of device.
The only previous example - sca3000 - take a less than optimal approach
which is largely why it is still in staging.
A couple of little cleanups for that new functionality followed later.
Core cleanups:
1) MAINTAINERS - Sachin actually brought my email address up to date because
I said I'd do it and never got around to it :)
2) Assign buffer list elements as single element lists to simplify the
iio_buffer_is_active logic.
3) wake_up_interruptible_poll instead of wake_up_interruptible to only wake
up threads waiting for poll notifications.
4) Add O_CLOEXEC flag to anon_inode_get_fd call for IIO event interface.
5) Change iio_push_to_buffers to take a void * pointer so as to avoid some
annoying and unnecessary type casts.
6) iio_compute_scan_bytes incorrectly took a long rather than unsigned long.
7) Various minor tidy ups.
Driver cleanups (in no particular order)
1) Another set of devm_ allocations patches from Sachin Kamat.
2) tsl2x7x - 0 to NULL cleanup.
3) hmc5843 - fix missing > in MODULE_AUTHOR
4) Set of strict_strto* to kstrto* conversions.
5) mxs-lradc - fix ordering of resource removal to match creation
6) mxs-lradc - add MODULE_ALIAS
7) adc7606 - drop a work pending test duplicated in core functions.
8) hmc5843 - devm_ allocation patch
9) Series of redundant breaks removed.
10) ad2s1200 - pr_err -> dev_err
11) adjd_s311 - use INT_TIME
12) ST sensors - large set of cleanups from Lee Jones and removed restriction
to using only triggers provided by the st_sensors themselves from
Dennis Ciocca.
13) dummy and tmp006 provide sampling_frequency via info_mask_shared_by_all.
14) tcs3472 - fix incorrect buffer size and wrong device pointer used in
suspend / resume functions.
15) max1363 - use defaults for buffer setup ops as provided by the triggered
buffer helpers as they are the same as were specified in max1363 driver.
16) Trivial tidy ups in a number of other drivers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.21 (GNU/Linux)
iQIcBAABAgAGBQJSP0HUAAoJEFSFNJnE9BaIwiQQAKuJaoPrdMezm1TDaqgrzQWQ
U95mSJ19xPYVSQVNHFLFidcajhADRMFhMUGOJF64VZObEdOtFWI0UkrJjFhYtJTt
n1B6qAqPjatmruj434+n5PW32XtareOPThso5EDCAW0X+CNgSOgda+TVj+9g1Ilg
Onltb3wugMcs27FakZpKv1YuGyKAKE6uT/33qr++cuynR89JZOlp0QmLgIXobVRR
WdjuiH8OXFA4LsP7dWQhoSejs6+JPMn992qkACUc5fztQfFfCk0eJsgQIsOXkz1e
U6MFvab0LtdPKDRyzT1kIpK/Jxf1OVNiOYaQNIGuNMipa+5WRz2lF1sZyERQTJWR
HOZehkikBdL73WaaKwyaLTsYyDMbYM9ZkpLrBEFRr7ocZpg/0LA84BWYYDWu1Nok
9Ib9xNAxcAgFwQMJpiz9J3ap/IzV2qJT9rv78q1chVwhNhVDs2CbwcuZKAB4UvWs
Oz7C0Xx5DA/K7DlpJMLaVB1+BRJ3C1I9Jbr84mnu0clgOqFE+nrdKZcUTrOTFXdy
2yTp7Bkc2JiRtOYhI40UL79N08KCGNTUfigmUDQseF2dsaNlz5rTOiMifYQCRw9+
C1kxY00emzlGTvfUDdPwkiQTtz8tWf9Ahvjx/ufGfed68KWDMs1VuGNcqEzgqKNI
SMP0VTEXbCiLeWYMqGep
=mMgm
-----END PGP SIGNATURE-----
Merge tag 'iio-for-3.13a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:
First round of new drivers, functionality and cleanups for IIO in the 3.13 cycle
A number of new drivers and some new functionality + a lot of cleanups
all over IIO.
New Core Elements
1) New INT_TIME info_mask element for integration time, which may have
different effects on measurement noise and similar, than an amplifier
and hence is different from existing SCALE. Already existed in some
drivers as a custom attribute.
2) Introduce a iio_push_buffers_with_timestamp helper to cover the common
case of filling the last 64 bits of data to be passed to the buffer with
a timestamp. Applied to lots of drivers. Cuts down on repeated code and
moves a slightly fiddly bit of logic into a single location.
3) Introduce info_mask_[shared_by_dir/shared_by_all] elements to allow support
of elements such as sampling_frequency which is typically shared by all
input channels on a device. This reduces code and makes these controls
available from in kernel consumers of IIO devices.
New drivers
1) MCP3422/3/4 ADC
2) TSL4531 ambient light sensor
3) TCS3472/5 color light sensor
4) GP2AP020A00F ambient light / proximity sensor
5) LPS001WP support added to ST pressure sensor driver.
New driver functionality
1) ti_am335x_adc Add buffered sampling support.
This device has a hardware fifo that is fed directly into an IIO kfifo
buffer based on a watershed interrupt. Note this will act as an example
of how to handle this increasingly common type of device.
The only previous example - sca3000 - take a less than optimal approach
which is largely why it is still in staging.
A couple of little cleanups for that new functionality followed later.
Core cleanups:
1) MAINTAINERS - Sachin actually brought my email address up to date because
I said I'd do it and never got around to it :)
2) Assign buffer list elements as single element lists to simplify the
iio_buffer_is_active logic.
3) wake_up_interruptible_poll instead of wake_up_interruptible to only wake
up threads waiting for poll notifications.
4) Add O_CLOEXEC flag to anon_inode_get_fd call for IIO event interface.
5) Change iio_push_to_buffers to take a void * pointer so as to avoid some
annoying and unnecessary type casts.
6) iio_compute_scan_bytes incorrectly took a long rather than unsigned long.
7) Various minor tidy ups.
Driver cleanups (in no particular order)
1) Another set of devm_ allocations patches from Sachin Kamat.
2) tsl2x7x - 0 to NULL cleanup.
3) hmc5843 - fix missing > in MODULE_AUTHOR
4) Set of strict_strto* to kstrto* conversions.
5) mxs-lradc - fix ordering of resource removal to match creation
6) mxs-lradc - add MODULE_ALIAS
7) adc7606 - drop a work pending test duplicated in core functions.
8) hmc5843 - devm_ allocation patch
9) Series of redundant breaks removed.
10) ad2s1200 - pr_err -> dev_err
11) adjd_s311 - use INT_TIME
12) ST sensors - large set of cleanups from Lee Jones and removed restriction
to using only triggers provided by the st_sensors themselves from
Dennis Ciocca.
13) dummy and tmp006 provide sampling_frequency via info_mask_shared_by_all.
14) tcs3472 - fix incorrect buffer size and wrong device pointer used in
suspend / resume functions.
15) max1363 - use defaults for buffer setup ops as provided by the triggered
buffer helpers as they are the same as were specified in max1363 driver.
16) Trivial tidy ups in a number of other drivers.
Adding the number of bios in a remapped request to 'block_rq_remap'
tracepoint.
Request remapper clones bios in a request to track the completion
status of each bio. So the number of bios can be useful information
for investigation.
Related discussions:
http://www.redhat.com/archives/dm-devel/2013-August/msg00084.htmlhttp://www.redhat.com/archives/dm-devel/2013-September/msg00024.html
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Drivers using software buffers often store the timestamp in their data buffer
before calling iio_push_to_buffers() with that data buffer. Storing the
timestamp in the buffer usually involves some ugly pointer arithmetic. This
patch adds a new helper function called iio_push_buffers_with_timestamp() which
is similar to iio_push_to_buffers but takes an additional timestamp parameter.
The function will help to hide to uglyness in one central place instead of
exposing it in every driver. If timestamps are enabled for the IIO device
iio_push_buffers_with_timestamp() will store the timestamp as the last element
in buffer, before passing the buffer on to iio_push_buffers(). The buffer needs
large enough to hold the timestamp in this case. If timestamps are disabled
iio_push_buffers_with_timestamp() will behave just like iio_push_buffers().
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Oleksandr Kravchenko <o.v.kravchenko@globallogic.com>
Cc: Josh Wu <josh.wu@atmel.com>
Cc: Denis Ciocca <denis.ciocca@gmail.com>
Cc: Manuel Stahl <manuel.stahl@iis.fraunhofer.de>
Cc: Ge Gao <ggao@invensense.com>
Cc: Peter Meerwald <pmeerw@pmeerw.net>
Cc: Jacek Anaszewski <j.anaszewski@samsung.com>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Previously the driver had only one-shot reading functionality.
This patch adds continuous sampling support to the driver.
Continuous sampling starts when buffer is enabled.
HW IRQ wakes worker thread that pushes samples to userspace.
Sampling stops when buffer is disabled by userspace.
Patil Rachna (TI) laid the ground work for ADC HW register access.
Russ Dill (TI) fixed bugs in the driver relevant to FIFOs and IRQs.
I fixed channel scanning so multiple ADC channels can be read
simultaneously and pushed to userspace.
Restructured the driver to fit IIO ABI.
And added INDIO_BUFFER_HARDWARE mode.
Signed-off-by: Zubair Lutfullah <zubair.lutfullah@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Russ Dill <Russ.Dill@ti.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
To be able to use the hex ascii functions in case sensitive environments
the array hex_asc_upper[] and the needed functions for hex_byte_pack_upper()
are introduced.
Signed-off-by: Andre Naujoks <nautsch2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add helper macro to set the min and max value of the register range.
This is useful when initialising the register ranges of the device like
static const struct regmap_range readable_ranges[] = {
regmap_reg_range(DEVICE_REG0, DEVICE_REG10),
};
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Workaround the SCSI layer's problematic WRITE SAME heuristics by
disabling WRITE SAME in the DM multipath device's queue_limits if an
underlying device disabled it.
The WRITE SAME heuristics, with both the original commit 5db44863b6
("[SCSI] sd: Implement support for WRITE SAME") and the updated commit
66c28f971 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling
WRITE SAME(10) even without successfully determining it is supported.
After the first failed WRITE SAME the SCSI layer will disable WRITE SAME
for the device (by setting sdkp->device->no_write_same which results in
'max_write_same_sectors' in device's queue_limits to be set to 0).
When a device is stacked ontop of such a SCSI device any changes to that
SCSI device's queue_limits do not automatically propagate up the stack.
As such, a DM multipath device will not have its WRITE SAME support
disabled. This causes the block layer to continue to issue WRITE SAME
requests to the mpath device which causes paths to fail and (if mpath IO
isn't configured to queue when no paths are available) it will result in
actual IO errors to the upper layers.
This fix doesn't help configurations that have additional devices
stacked ontop of the mpath device (e.g. LVM created linear DM devices
ontop). A proper fix that restacks all the queue_limits from the bottom
of the device stack up will need to be explored if SCSI will continue to
use this model of optimistically allowing op codes and then disabling
them after they fail for the first time.
Before this patch:
EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.
device-mapper: multipath: Failing path 8:112.
end_request: I/O error, dev dm-6, sector 4616
dm-6: WRITE SAME failed. Manually zeroing.
end_request: I/O error, dev dm-6, sector 4616
end_request: I/O error, dev dm-6, sector 5640
end_request: I/O error, dev dm-6, sector 6664
end_request: I/O error, dev dm-6, sector 7688
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.
end_request: I/O error, dev dm-6, sector 524296
Aborting journal on device dm-6-8.
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.
# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
33553920
After this patch:
EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.
# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
0
It should be noted that WRITE SAME support wasn't enabled in DM
multipath until v3.10.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: stable@vger.kernel.org # 3.10+
This patch builds on patch 2 and periodically decays that max value to
do idle balancing per sched domain by approximately 1% per second. Also
decay the rq's max_idle_balance_cost value.
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379096813-3032-4-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In this patch, we keep track of the max cost we spend doing idle load balancing
for each sched domain. If the avg time the CPU remains idle is less then the
time we have already spent on idle balancing + the max cost of idle balancing
in the sched domain, then we don't continue to attempt the balance. We also
keep a per rq variable, max_idle_balance_cost, which keeps track of the max
time spent on newidle load balances throughout all its domains so that we can
determine the avg_idle's max value.
By using the max, we avoid overrunning the average. This further reduces the
chance we attempt balancing when the CPU is not idle for longer than the cost
to balance.
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379096813-3032-3-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>