linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-28 07:04:00 +08:00

Author	SHA1	Message	Date
Alan Stern	63fb3a2800	USB: NS_TO_US should round up Host controller drivers use the NS_TO_US macro to convert transaction times, which are computed in nanoseconds, to microseconds for scheduling. Periodic scheduling requires worst-case estimates, but the macro does its conversion using round-to-nearest. This patch changes it to use round-up, giving a correct worst-case value. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-11 16:37:45 -07:00
Hans de Goede	6ec4147e7b	usb-anchor: Delay usb_wait_anchor_empty_timeout wake up till completion is done usb_wait_anchor_empty_timeout() should wait till the completion handler has run. Both the zd1211rw driver and the uas driver (in its task mgmt) depend on the completion handler having completed when usb_wait_anchor_empty_timeout() returns, as they read state set by the completion handler after an usb_wait_anchor_empty_timeout() call. But __usb_hcd_giveback_urb() calls usb_unanchor_urb before calling the completion handler. This is necessary as the completion handler may re-submit and re-anchor the urb. But this introduces a race where the state these drivers want to read has not been set yet by the completion handler (this race is easily triggered with the uas task mgmt code). I've considered adding an anchor_count to struct urb, which would be incremented on anchor and decremented on unanchor, and then only actually do the anchor / unanchor on 0 -> 1 and 1 -> 0 transtions, combined with moving the unanchor call in hcd_giveback_urb to after calling the completion handler. But this will only work if urb's are only re-anchored to the same anchor as they were anchored to before the completion handler ran. And at least one driver re-anchors to another anchor from the completion handler (rtlwifi). So I have come up with this patch instead, which adds the ability to suspend wakeups of usb_wait_anchor_empty_timeout() waiters to the usb_anchor functionality, and uses this in __usb_hcd_giveback_urb() to delay wake-ups until the completion handler has run. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-11 16:33:58 -07:00
Hans de Goede	9ef73dbdd0	usb-anchor: Ensure poisened gets initialized to 0 And do so in a way which ensures that any fields added in the future will also get properly zero-ed. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-11 16:33:58 -07:00
Mark Brown	2cc6e2e0c8	Merge remote-tracking branch 'spi/topic/s3c64xx' into spi-loop	2013-10-11 20:10:13 +01:00
Mark Brown	b158935f70	spi: Provide common spi_message processing loop The loops which SPI controller drivers use to process the list of transfers in a spi_message are typically very similar and have some error prone areas such as the handling of /CS. Help simplify drivers by factoring this code out into the core - if drivers provide a transfer_one() function instead of a transfer_one_message() function the core will handle processing at the message level. /CS can be controlled by either setting cs_gpio or providing a set_cs function. If this is not possible for hardware reasons then both can be omitted and the driver should continue to implement manual /CS handling. This is a first step in refactoring and it is expected that there will be further enhancements, for example factoring out of the mapping of transfers for DMA and the initiation and completion of interrupt driven transfers. Signed-off-by: Mark Brown <broonie@linaro.org>	2013-10-11 20:09:50 +01:00
Mark Brown	2841a5fc37	spi: Provide per-message prepare and unprepare operations Many SPI drivers perform setup and tear down on every message, usually doing things like DMA mapping the message. Provide hooks for them to use to provide such operations. This is of limited value for drivers that implement transfer_one_message() but will be of much greater utility with future factoring out of standard implementations of that function. Signed-off-by: Mark Brown <broonie@linaro.org>	2013-10-11 20:09:13 +01:00
Linus Torvalds	cd4edf7a34	Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "All over the map.. - nouveau: disable MSI, needs more work, will try again next merge window - radeon: audio + uvd regression fixes, dpm fixes, reset fixes - i915: the dpms fix might fix your haswell And one pain in the ass revert, so we have VGA arbitration that when implemented 4-5 years ago really hoped that GPUs could remove themselves from arbitration completely once they had a kernel driver. It seems Intel hw designers decided that was too nice a facility to allow us to have so they removed it when they went on-die (so since Ironlake at least). Now Alex Williamson added support for VGA arbitration for newer GPUs however this now exposes itself to userspace as requireing arbitration of GPU VGA regions and the X server gets involved and disables things that it can't handle when VGA access is possibly required around every operation. So in order to not break userspace we just reverted things back to the old known broken status so maybe we can try and design out way out. Ville also had a patch to use stop machine for the two times Intel needs to access VGA space, that might be acceptable with some rework, but for now myself and Daniel agreed to just go back" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (23 commits) Revert "i915: Update VGA arbiter support for newer devices" Revert "drm/i915: Delay disabling of VGA memory until vgacon->fbcon handoff is done" drm/radeon: re-enable sw ACR support on pre-DCE4 drm/radeon/dpm: disable bapm on TN asics drm/radeon: improve soft reset on CIK drm/radeon: improve soft reset on SI drm/radeon/dpm: off by one in si_set_mc_special_registers() drm/radeon/dpm/btc: off by one in btc_set_mc_special_registers() drm/radeon: forever loop on error in radeon_do_test_moves() drm/radeon: fix hw contexts for SUMO2 asics drm/radeon: fix typo in CP DMA register headers drm/radeon/dpm: disable multiple UVD states drm/radeon: use hw generated CTS/N values for audio drm/radeon: fix N/CTS clock matching for audio drm/radeon: use 64-bit math to calculate CTS values for audio (v2) drm/edid: catch kmalloc failure in drm_edid_to_speaker_allocation Revert "drm/fb-helper: don't sleep for screen unblank when an oops is in progress" drm/gma500: fix things after get/put page helpers drm/nouveau/mc: disable msi support by default, it's busted in tons of places drm/i915: Only apply DPMS to the encoder if enabled ...	2013-10-11 10:41:21 -07:00
Axel Lin	8828bae464	regulator: Add REGULATOR_LINEAR_RANGE macro Add REGULATOR_LINEAR_RANGE macro and convert regulator drivers to use it. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@linaro.org>	2013-10-11 12:49:16 +01:00
Axel Lin	e277e65680	regulator: Remove max_uV from struct regulator_linear_range linear ranges means each range has linear voltage settings. So we can calculate max_uV for each linear range in regulator core rather than set the max_uV field in drivers. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@linaro.org>	2013-10-11 12:49:12 +01:00
Alexey Kardashevskiy	8e0861fa3c	powerpc: Prepare to support kernel handling of IOMMU map/unmap The current VFIO-on-POWER implementation supports only user mode driven mapping, i.e. QEMU is sending requests to map/unmap pages. However this approach is really slow, so we want to move that to KVM. Since H_PUT_TCE can be extremely performance sensitive (especially with network adapters where each packet needs to be mapped/unmapped) we chose to implement that as a "fast" hypercall directly in "real mode" (processor still in the guest context but MMU off). To be able to do that, we need to provide some facilities to access the struct page count within that real mode environment as things like the sparsemem vmemmap mappings aren't accessible. This adds an API function realmode_pfn_to_page() to get page struct when MMU is off. This adds to MM a new function put_page_unless_one() which drops a page if counter is bigger than 1. It is going to be used when MMU is off (for example, real mode on PPC64) and we want to make sure that page release will not happen in real mode as it may crash the kernel in a horrible way. CONFIG_SPARSEMEM_VMEMMAP and CONFIG_FLATMEM are supported. Cc: linux-mm@kvack.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 17:24:39 +11:00
Alexey Kardashevskiy	81fcfb813f	hashtable: add hash_for_each_possible_rcu_notrace() This adds hash_for_each_possible_rcu_notrace() which is basically a notrace clone of hash_for_each_possible_rcu() which cannot be used in real mode due to its tracing/debugging capability. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 17:21:14 +11:00
Ingo Molnar	ec0ad3d01f	Merge branch 'core/urgent' into sched/core Merge in asm goto fix, to be able to apply the asm/rmwcc.h fix. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-11 07:39:37 +02:00
Ingo Molnar	3f0116c323	compiler/gcc4: Add quirk for 'asm goto' miscompilation bug Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto' constructs, as outlined here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 Implement a workaround suggested by Jakub Jelinek. Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-11 07:39:14 +02:00
Dave Airlie	ebff5fa9d5	Revert "i915: Update VGA arbiter support for newer devices" This reverts commit `81b5c7bc8d`. Adding drm/i915 into the vga arbiter chain means that X (in a piece of well-meant paranoia) will do a get/put on the vga decoding around _every_ accel call down into the ddx. Which results in some nice performance disasters [1]. This really breaks userspace, by disabling DRI for everyone, and stops OpenGL from working, this isn't limited to just the i915 but both the integrated and discrete GPUs on multi-gpu systems, in other words this causes untold worlds of pain, Ville tried to come up with a Great Hack to fiddle the required VGA I/O ops behind everyone's back using stop_machine, but that didn't really work out [2]. Given that we're fairly late in the -rc stage for such games let's just revert this all. One thing we might want to keep is to delay the disabling of the vga decoding until the fbdev emulation and the fbcon screen is set up. If we kill vga mem decoding beforehand fbcon can end up with a white square in the top-left corner it tried to save from the vga memory for a seamless transition. And we have bug reports on older platforms which seem to match these symptoms. But again that's something to play around with in -next. References: [1] http://lists.x.org/archives/xorg-devel/2013-September/037763.html References: [2] http://www.spinics.net/lists/intel-gfx/msg34062.html Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-10-11 15:19:22 +10:00
Tony Lindgren	dc7743aa3c	pinctrl: single: Add support for auxdata For omaps, we still have dependencies to the legacy code for handling the PRM (Power Reset Management) interrupts, and also for reconfiguring the io wake-up chain after changes. Let's pass the PRM interrupt and the rearm functions via auxdata. Then when at some point we have a proper PRM driver, we can get the interrupt via device tree and set up the rearm function as exported function in the PRM driver. By using auxdata we can remove a dependency to the wake-up events for converting omap3 to be device tree only. Cc: Peter Ujfalusi <peter.ujfalusi@ti.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Prakash Manjunathappa <prakash.pm@ti.com> Cc: Roger Quadros <rogerq@ti.com> Cc: Haojian Zhuang <haojian.zhuang@gmail.com> Cc: linux-kernel@vger.kernel.org Reviewed-by: Kevin Hilman <khilman@linaro.org> Tested-by: Kevin Hilman <khilman@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Tony Lindgren <tony@atomide.com>	2013-10-10 15:30:47 -07:00
Linus Torvalds	f715729ee4	These patches are designed to enable improvements to /dev/random for non-x86 platforms, in particular MIPS and ARM. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABCAAGBQJSVvO/AAoJENNvdpvBGATwNZ4P+wadRWY/Gdz/p9332qdVrGYs nP4DPWSg+n3RH/fOnacEwHF5vqapTe03G82NriCaVGFP8O9j7bo6ByMKKkIR7yvr 4sHUX4YMc/DwchaIHH2xp8fQoMc3Mv7mn8bodTtPXgveeldEvtuUQM0q+j4DXZUT qSLMGElgJYrpIf2Cm8JAIBkt2QuzpZPPX7Z6glZunpvfLSMmgn3Vj2ilNEx1YCFH v+Rk1ZYLjg2LzUYqaO7HOXlRJqmE10I7ZmNvPXJZ9fuPmGYD9FU6WeHhmIAFYdFw V6bAzou+LbnuNVoW6yiDvrKcOXgh2Spbk6SaKVSrcjVPfc87ocNzGWI4OTfNy1xI Kv9u4YfU3pIUWPDGx0mvT/KXAXl/PGVfu7bYXDcN2I2tqlrbBPdIWqpFB2eTn7/j //XbatoT6gGZTuseCKhYXWpG8AE5pCfbjGnd9il21fvlUDdkIq42wAs96qjc6Ruj tPCi5yYzLiHsn4eau+SJqI1KxPLf6YJw9Qo+f70FGl63wXJU9Vr07ID2rGTwXm1m Qf1joTtx900PvfzUaD0ODbQZaTbX6ebSOkriKpKWYwg+26Gdc7JAxIVI3HDOlOR+ ++r1M4ERwDic/xdVsB6Mngmop3d1BeNU2IAoiRDZwcJpS1+MLivlIbd1PjBAt0bU +oOm+wseHEzSnlgucQ0g =qnTe -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull /dev/random changes from Ted Ts'o: "These patches are designed to enable improvements to /dev/random for non-x86 platforms, in particular MIPS and ARM" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: allow architectures to optionally define random_get_entropy() random: run random_int_secret_init() run after all late_initcalls	2013-10-10 12:31:43 -07:00
Theodore Ts'o	61875f30da	random: allow architectures to optionally define random_get_entropy() Allow architectures which have a disabled get_cycles() function to provide a random_get_entropy() function which provides a fine-grained, rapidly changing counter that can be used by the /dev/random driver. For example, an architecture might have a rapidly changing register used to control random TLB cache eviction, or DRAM refresh that doesn't meet the requirements of get_cycles(), but which is good enough for the needs of the random driver. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-10-10 14:30:53 -04:00
Valentine Barshak	5578b266e9	usb: phy: Add RCAR Gen2 USB phy This adds RCAR Gen2 USB phy support. The driver configures USB channels 0/2 which are shared between PCI USB hosts and USBHS/USBSS devices. It also controls internal USBHS phy. Signed-off-by: Valentine Barshak <valentine.barshak@cogentembedded.com> Signed-off-by: Felipe Balbi <balbi@ti.com>	2013-10-10 11:59:48 -05:00
Sagi Grimberg	ada9f5d007	IB/mlx5: Fix eq names to display nicely in /proc/interrupts It's helpful for a driver to put the pci slot name in its interrupt names, so /proc/interrupts will show the pci slot of the device. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-10-10 09:23:59 -07:00
Eli Cohen	2f6daec14d	mlx5: Fix layout of struct mlx5_init_seg The layout of struct health_buffer was not according to firmware specification. Fix it to comply. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-10-10 09:23:57 -07:00
Eli Cohen	c1868b8225	mlx5: Remove checksum on command interface commands Checksum calculations consume CPU resources and can be significant to the rate of resource creation/destruction. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-10-10 09:23:56 -07:00
Ingo Molnar	8a749de5e3	Merge branch 'fortglx/3.13/time' of git://git.linaro.org/people/jstultz/linux into timers/core Pull more timekeeping items for v3.13 from John Stultz: * Small cleanup in the clocksource code. * Fix for rtc-pl031 to let it work with alarmtimers. * Move arm64 to using the generic sched_clock framework & resulting cleanup in the generic sched_clock code. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-10 06:25:23 +02:00
Stephen Boyd	b4042ceaab	sched_clock: Remove sched_clock_func() hook Nobody is using sched_clock_func() anymore now that sched_clock supports up to 64 bits. Remove the hook so that new code only uses sched_clock_register(). Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: John Stultz <john.stultz@linaro.org>	2013-10-09 16:54:39 -07:00
Mark Brown	915f441b6f	regmap: Provide asynchronous write and update bits operations Make it easier for drivers to include single register writes in asynchronous sequences by providing async versions of the write and update bits operations. The update bits operations are only likely to be effective when used with devices that have caches but this is common enough to be useful. Signed-off-by: Mark Brown <broonie@linaro.org>	2013-10-09 14:05:26 +01:00
Rik van Riel	de1c9ce6f0	sched/numa: Skip some page migrations after a shared fault Shared faults can lead to lots of unnecessary page migrations, slowing down the system, and causing private faults to hit the per-pgdat migration ratelimit. This patch adds sysctl numa_balancing_migrate_deferred, which specifies how many shared page migrations to skip unconditionally, after each page migration that is skipped because it is a shared fault. This reduces the number of page migrations back and forth in shared fault situations. It also gives a strong preference to the tasks that are already running where most of the memory is, and to moving the other tasks to near the memory. Testing this with a much higher scan rate than the default still seems to result in fewer page migrations than before. Memory seems to be somewhat better consolidated than previously, with multi-instance specjbb runs on a 4 node system. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-62-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:48:21 +02:00
Rik van Riel	1e3646ffc6	mm: numa: Revert temporarily disabling of NUMA migration With the scan rate code working (at least for multi-instance specjbb), the large hammer that is "sched: Do not migrate memory immediately after switching node" can be replaced with something smarter. Revert temporarily migration disabling and all traces of numa_migrate_seq. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-61-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:48:20 +02:00
Mel Gorman	930aa174fc	sched/numa: Remove the numa_balancing_scan_period_reset sysctl With scan rate adaptions based on whether the workload has properly converged or not there should be no need for the scan period reset hammer. Get rid of it. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-60-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:48:18 +02:00
Rik van Riel	04bb2f9475	sched/numa: Adjust scan rate in task_numa_placement Adjust numa_scan_period in task_numa_placement, depending on how much useful work the numa code can do. The more local faults there are in a given scan window the longer the period (and hence the slower the scan rate) during the next window. If there are excessive shared faults then the scan period will decrease with the amount of scaling depending on whether the ratio of shared/private faults. If the preferred node changes then the scan rate is reset to recheck if the task is properly placed. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-59-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:48:16 +02:00
Rik van Riel	dabe1d9924	sched/numa: Be more careful about joining numa groups Due to the way the pid is truncated, and tasks are moved between CPUs by the scheduler, it is possible for the current task_numa_fault to group together tasks that do not actually share memory together. This patch adds a few easy sanity checks to task_numa_fault, joining tasks together if they share the same tsk->mm, or if the fault was on a page with an elevated mapcount, in a shared VMA. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-57-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:48:12 +02:00
Ingo Molnar	b32e86b430	sched/numa: Add debugging Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1381141781-10992-53-git-send-email-mgorman@suse.de	2013-10-09 14:48:04 +02:00
Rik van Riel	82727018b0	sched/numa: Call task_numa_free() from do_execve() It is possible for a task in a numa group to call exec, and have the new (unrelated) executable inherit the numa group association from its former self. This has the potential to break numa grouping, and is trivial to fix. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-51-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:48:00 +02:00
Mel Gorman	83e1d2cd9e	sched/numa: Use group fault statistics in numa placement This patch uses the fraction of faults on a particular node for both task and group, to figure out the best node to place a task. If the task and group statistics disagree on what the preferred node should be then a full rescan will select the node with the best combined weight. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-50-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:47:58 +02:00
Rik van Riel	5e1576ed0e	sched/numa: Stay on the same node if CLONE_VM A newly spawned thread inside a process should stay on the same NUMA node as its parent. This prevents processes from being "torn" across multiple NUMA nodes every time they spawn a new thread. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-49-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:47:57 +02:00
Peter Zijlstra	6688cc0547	mm: numa: Do not group on RO pages And here's a little something to make sure not the whole world ends up in a single group. As while we don't migrate shared executable pages, we do scan/fault on them. And since everybody links to libc, everybody ends up in the same group. Suggested-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1381141781-10992-47-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:47:53 +02:00
Mel Gorman	e29cf08b05	sched/numa: Report a NUMA task group ID It is desirable to model from userspace how the scheduler groups tasks over time. This patch adds an ID to the numa_group and reports it via /proc/PID/status. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-45-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:47:49 +02:00
Peter Zijlstra	8c8a743c50	sched/numa: Use {cpu, pid} to create task groups for shared faults While parallel applications tend to align their data on the cache boundary, they tend not to align on the page or THP boundary. Consequently tasks that partition their data can still "false-share" pages presenting a problem for optimal NUMA placement. This patch uses NUMA hinting faults to chain tasks together into numa_groups. As well as storing the NID a task was running on when accessing a page a truncated representation of the faulting PID is stored. If subsequent faults are from different PIDs it is reasonable to assume that those two tasks share a page and are candidates for being grouped together. Note that this patch makes no scheduling decisions based on the grouping information. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1381141781-10992-44-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:47:47 +02:00
Peter Zijlstra	90572890d2	mm: numa: Change page last {nid,pid} into {cpu,pid} Change the per page last fault tracking to use cpu,pid instead of nid,pid. This will allow us to try and lookup the alternate task more easily. Note that even though it is the cpu that is store in the page flags that the mpol_misplaced decision is still based on the node. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1381141781-10992-43-git-send-email-mgorman@suse.de [ Fixed build failure on 32-bit systems. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 14:47:45 +02:00
Peter Zijlstra	ac66f54772	sched/numa: Introduce migrate_swap() Use the new stop_two_cpus() to implement migrate_swap(), a function that flips two tasks between their respective cpus. I'm fairly sure there's a less crude way than employing the stop_two_cpus() method, but everything I tried either got horribly fragile and/or complex. So keep it simple for now. The notable detail is how we 'migrate' tasks that aren't runnable anymore. We'll make it appear like we migrated them before they went to sleep. The sole difference is the previous cpu in the wakeup path, so we override this. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/r/1381141781-10992-39-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:46 +02:00
Peter Zijlstra	1be0bd77c5	stop_machine: Introduce stop_two_cpus() Introduce stop_two_cpus() in order to allow controlled swapping of two tasks. It repurposes the stop_machine() state machine but only stops the two cpus which we can do with on-stack structures and avoid machine wide synchronization issues. The ordering of CPUs is important to avoid deadlocks. If unordered then two cpus calling stop_two_cpus on each other simultaneously would attempt to queue in the opposite order on each CPU causing an AB-BA style deadlock. By always having the lowest number CPU doing the queueing of works, we can guarantee that works are always queued in the same order, and deadlocks are avoided. Signed-off-by: Peter Zijlstra <peterz@infradead.org> [ Implemented deadlock avoidance. ] Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/r/1381141781-10992-38-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:45 +02:00
Mel Gorman	6b9a7460b6	sched/numa: Retry migration of tasks to CPU on a preferred node When a preferred node is selected for a tasks there is an attempt to migrate the task to a CPU there. This may fail in which case the task will only migrate if the active load balancer takes action. This may never happen if the conditions are not right. This patch will check at NUMA hinting fault time if another attempt should be made to migrate the task. It will only make an attempt once every five seconds. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-34-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:40 +02:00
Mel Gorman	fc3147245d	mm: numa: Limit NUMA scanning to migrate-on-fault VMAs There is a 90% regression observed with a large Oracle performance test on a 4 node system. Profiles indicated that the overhead was due to contention on sp_lock when looking up shared memory policies. These policies do not have the appropriate flags to allow them to be automatically balanced so trapping faults on them is pointless. This patch skips VMAs that do not have MPOL_F_MOF set. [riel@redhat.com: Initial patch] Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-and-tested-by: Joe Mario <jmario@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-32-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:38 +02:00
Mel Gorman	b795854b1f	sched/numa: Set preferred NUMA node based on number of private faults Ideally it would be possible to distinguish between NUMA hinting faults that are private to a task and those that are shared. If treated identically there is a risk that shared pages bounce between nodes depending on the order they are referenced by tasks. Ultimately what is desirable is that task private pages remain local to the task while shared pages are interleaved between sharing tasks running on different nodes to give good average performance. This is further complicated by THP as even applications that partition their data may not be partitioning on a huge page boundary. To start with, this patch assumes that multi-threaded or multi-process applications partition their data and that in general the private accesses are more important for cpu->memory locality in the general case. Also, no new infrastructure is required to treat private pages properly but interleaving for shared pages requires additional infrastructure. To detect private accesses the pid of the last accessing task is required but the storage requirements are a high. This patch borrows heavily from Ingo Molnar's patch "numa, mm, sched: Implement last-CPU+PID hash tracking" to encode some bits from the last accessing task in the page flags as well as the node information. Collisions will occur but it is better than just depending on the node information. Node information is then used to determine if a page needs to migrate. The PID information is used to detect private/shared accesses. The preferred NUMA node is selected based on where the maximum number of approximately private faults were measured. Shared faults are not taken into consideration for a few reasons. First, if there are many tasks sharing the page then they'll all move towards the same node. The node will be compute overloaded and then scheduled away later only to bounce back again. Alternatively the shared tasks would just bounce around nodes because the fault information is effectively noise. Either way accounting for shared faults the same as private faults can result in lower performance overall. The second reason is based on a hypothetical workload that has a small number of very important, heavily accessed private pages but a large shared array. The shared array would dominate the number of faults and be selected as a preferred node even though it's the wrong decision. The third reason is that multiple threads in a process will race each other to fault the shared page making the fault information unreliable. Signed-off-by: Mel Gorman <mgorman@suse.de> [ Fix complication error when !NUMA_BALANCING. ] Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-30-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:35 +02:00
Mel Gorman	1bc115d87d	mm: numa: Scan pages with elevated page_mapcount Currently automatic NUMA balancing is unable to distinguish between false shared versus private pages except by ignoring pages with an elevated page_mapcount entirely. This avoids shared pages bouncing between the nodes whose task is using them but that is ignored quite a lot of data. This patch kicks away the training wheels in preparation for adding support for identifying shared/private pages is now in place. The ordering is so that the impact of the shared/private detection can be easily measured. Note that the patch does not migrate shared, file-backed within vmas marked VM_EXEC as these are generally shared library pages. Migrating such pages is not beneficial as there is an expectation they are read-shared between caches and iTLB and iCache pressure is generally low. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-28-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:32 +02:00
Mel Gorman	ac8e895bd2	sched/numa: Add infrastructure for split shared/private accounting of NUMA hinting faults Ideally it would be possible to distinguish between NUMA hinting faults that are private to a task and those that are shared. This patch prepares infrastructure for separately accounting shared and private faults by allocating the necessary buffers and passing in relevant information. For now, all faults are treated as private and detection will be introduced later. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-26-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:30 +02:00
Mel Gorman	3a7053b322	sched/numa: Favour moving tasks towards the preferred node This patch favours moving tasks towards NUMA node that recorded a higher number of NUMA faults during active load balancing. Ideally this is self-reinforcing as the longer the task runs on that node, the more faults it should incur causing task_numa_placement to keep the task running on that node. In reality a big weakness is that the nodes CPUs can be overloaded and it would be more efficient to queue tasks on an idle node and migrate to the new node. This would require additional smarts in the balancer so for now the balancer will simply prefer to place the task on the preferred node for a PTE scans which is controlled by the numa_balancing_settle_count sysctl. Once the settle_count number of scans has complete the schedule is free to place the task on an alternative node if the load is imbalanced. [srikar@linux.vnet.ibm.com: Fixed statistics] Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> [ Tunable and use higher faults instead of preferred. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-23-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:26 +02:00
Mel Gorman	745d61476d	sched/numa: Update NUMA hinting faults once per scan NUMA hinting fault counts and placement decisions are both recorded in the same array which distorts the samples in an unpredictable fashion. The values linearly accumulate during the scan and then decay creating a sawtooth-like pattern in the per-node counts. It also means that placement decisions are time sensitive. At best it means that it is very difficult to state that the buffer holds a decaying average of past faulting behaviour. At worst, it can confuse the load balancer if it sees one node with an artifically high count due to very recent faulting activity and may create a bouncing effect. This patch adds a second array. numa_faults stores the historical data which is used for placement decisions. numa_faults_buffer holds the fault activity during the current scan window. When the scan completes, numa_faults decays and the values from numa_faults_buffer are copied across. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-22-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:25 +02:00
Mel Gorman	688b7585d1	sched/numa: Select a preferred node with the most numa hinting faults This patch selects a preferred node for a task to run on based on the NUMA hinting faults. This information is later used to migrate tasks towards the node during balancing. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-21-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:23 +02:00
Mel Gorman	f809ca9a55	sched/numa: Track NUMA hinting faults on per-node basis This patch tracks what nodes numa hinting faults were incurred on. This information is later used to schedule a task on the node storing the pages most frequently faulted by the task. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-20-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:22 +02:00
Mel Gorman	598f0ec0bc	sched/numa: Set the scan rate proportional to the memory usage of the task being scanned The NUMA PTE scan rate is controlled with a combination of the numa_balancing_scan_period_min, numa_balancing_scan_period_max and numa_balancing_scan_size. This scan rate is independent of the size of the task and as an aside it is further complicated by the fact that numa_balancing_scan_size controls how many pages are marked pte_numa and not how much virtual memory is scanned. In combination, it is almost impossible to meaningfully tune the min and max scan periods and reasoning about performance is complex when the time to complete a full scan is is partially a function of the tasks memory size. This patch alters the semantic of the min and max tunables to be about tuning the length time it takes to complete a scan of a tasks occupied virtual address space. Conceptually this is a lot easier to understand. There is a "sanity" check to ensure the scan rate is never extremely fast based on the amount of virtual memory that should be scanned in a second. The default of 2.5G seems arbitrary but it is to have the maximum scan rate after the patch roughly match the maximum scan rate before the patch was applied. On a similar note, numa_scan_period is in milliseconds and not jiffies. Properly placed pages slow the scanning rate but adding 10 jiffies to numa_scan_period means that the rate scanning slows depends on HZ which is confusing. Get rid of the jiffies_to_msec conversion and treat it as ms. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-18-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:20 +02:00
Mel Gorman	b726b7dfb4	Revert "mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node" PTE scanning and NUMA hinting fault handling is expensive so commit `5bca2303` ("mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node") deferred the PTE scan until a task had been scheduled on another node. The problem is that in the purely shared memory case that this may never happen and no NUMA hinting fault information will be captured. We are not ruling out the possibility that something better can be done here but for now, this patch needs to be reverted and depend entirely on the scan_delay to avoid punishing short-lived processes. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1381141781-10992-16-git-send-email-mgorman@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:40:17 +02:00
Ingo Molnar	37bf06375c	Linux 3.12-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJSUc9zAAoJEHm+PkMAQRiG9DMH/AtpuAF6LlMRPjrCeuJQ1pyh T0IUO+CsLKO6qtM5IyweP8V6zaasNjIuW1+B6IwVIl8aOrM+M7CwRiKvpey26ldM I8G2ron7hqSOSQqSQs20jN2yGAqQGpYIbTmpdGLAjQ350NNNvEKthbP5SZR5PAmE UuIx5OGEkaOyZXvCZJXU9AZkCxbihlMSt2zFVxybq2pwnGezRUYgCigE81aeyE0I QLwzzMVdkCxtZEpkdJMpLILAz22jN4RoVDbXRa2XC7dA9I2PEEXI9CcLzqCsx2Ii 8eYS+no2K5N2rrpER7JFUB2B/2X8FaVDE+aJBCkfbtwaYTV9UYLq3a/sKVpo1Cs= =xSFJ -----END PGP SIGNATURE----- Merge tag 'v3.12-rc4' into sched/core Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree before applying more scheduler patches. Conflicts: arch/avr32/include/asm/Kbuild Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 12:36:13 +02:00
Linus Torvalds	0e7a3ed04f	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Various fixlets: On the kernel side: - fix a race - fix a bug in the handling of the perf ring-buffer data page On the tooling side: - fix the handling of certain corrupted perf.data files - fix a bug in 'perf probe' - fix a bug in 'perf record + perf sched' - fix a bug in 'make install' - fix a bug in libaudit feature-detection on certain distros" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf session: Fix infinite loop on invalid perf.data file perf tools: Fix installation of libexec components perf probe: Fix to find line information for probe list perf tools: Fix libaudit test perf stat: Set child_pid after perf_evlist__prepare_workload() perf tools: Add default handler for mmap2 events perf/x86: Clean up cap_user_time* setting perf: Fix perf_pmu_migrate_context	2013-10-08 09:23:12 -07:00
Linus Walleij	bfabb59433	Linux 3.12-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJSUc9zAAoJEHm+PkMAQRiG9DMH/AtpuAF6LlMRPjrCeuJQ1pyh T0IUO+CsLKO6qtM5IyweP8V6zaasNjIuW1+B6IwVIl8aOrM+M7CwRiKvpey26ldM I8G2ron7hqSOSQqSQs20jN2yGAqQGpYIbTmpdGLAjQ350NNNvEKthbP5SZR5PAmE UuIx5OGEkaOyZXvCZJXU9AZkCxbihlMSt2zFVxybq2pwnGezRUYgCigE81aeyE0I QLwzzMVdkCxtZEpkdJMpLILAz22jN4RoVDbXRa2XC7dA9I2PEEXI9CcLzqCsx2Ii 8eYS+no2K5N2rrpER7JFUB2B/2X8FaVDE+aJBCkfbtwaYTV9UYLq3a/sKVpo1Cs= =xSFJ -----END PGP SIGNATURE----- Merge tag 'v3.12-rc4' into devel Linux 3.12-rc4	2013-10-08 13:27:11 +02:00
Heiko Stuebner	ddeccb8d6b	dmaengine: add driver for Samsung s3c24xx SoCs This adds a new driver to support the s3c24xx dma using the dmaengine and makes the old one in mach-s3c24xx obsolete in the long run. Conceptually the s3c24xx-dma feels like a distant relative of the pl08x with numerous virtual channels being mapped to a lot less physical ones. The driver therefore borrows a lot from the amba-pl08x driver in this regard. Functionality-wise the driver gains a memcpy ability in addition to the slave_sg one. The driver supports both the method for requesting the peripheral used by SoCs before the S3C2443 and the different method for S3C2443 and later. On earlier SoCs the hardware channels usable for specific peripherals is constrainted while on later SoCs all channels can be used for any peripheral. Tested on a s3c2416-based board, memcpy using the dmatest module and slave_sg partially using the spi-s3c64xx driver. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2013-10-08 06:42:10 +09:00
Alexei Starovoitov	d45ed4a4e3	net: fix unsafe set_memory_rw from softirq on x86 system with net.core.bpf_jit_enable = 1 sudo tcpdump -i eth1 'tcp port 22' causes the warning: [ 56.766097] Possible unsafe locking scenario: [ 56.766097] [ 56.780146] CPU0 [ 56.786807] ---- [ 56.793188] lock(&(&vb->lock)->rlock); [ 56.799593] <Interrupt> [ 56.805889] lock(&(&vb->lock)->rlock); [ 56.812266] [ 56.812266] * DEADLOCK * [ 56.812266] [ 56.830670] 1 lock held by ksoftirqd/1/13: [ 56.836838] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380 [ 56.849757] [ 56.849757] stack backtrace: [ 56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45 [ 56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 [ 56.882004] ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007 [ 56.895630] ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001 [ 56.909313] ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001 [ 56.923006] Call Trace: [ 56.929532] [<ffffffff8175a145>] dump_stack+0x55/0x76 [ 56.936067] [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208 [ 56.942445] [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50 [ 56.948932] [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150 [ 56.955470] [<ffffffff810ccb52>] mark_lock+0x282/0x2c0 [ 56.961945] [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50 [ 56.968474] [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50 [ 56.975140] [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90 [ 56.981942] [<ffffffff810cef72>] lock_acquire+0x92/0x1d0 [ 56.988745] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 56.995619] [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50 [ 57.002493] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 57.009447] [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380 [ 57.016477] [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380 [ 57.023607] [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460 [ 57.030818] [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10 [ 57.037896] [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0 [ 57.044789] [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0 [ 57.051720] [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40 [ 57.058727] [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40 [ 57.065577] [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30 [ 57.072338] [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0 [ 57.078962] [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0 [ 57.085373] [<ffffffff81058245>] run_ksoftirqd+0x35/0x70 cannot reuse jited filter memory, since it's readonly, so use original bpf insns memory to hold work_struct defer kfree of sk_filter until jit completed freeing tested on x86_64 and i386 Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 15:16:45 -04:00
Olof Johansson	c884357ec6	Merge branch 'clk-of-init-v2_for-3.13' of https://github.com/shesselba/linux-dove into next/cleanup From Sebastian Hasselbarth: This is a patch set based on an RFC [1][2] sent earlier to provide a common arch/arm init for DT clock providers. Currently, the call to of_clk_init(NULL) to initialize DT clock providers is spread among several mach-dirs. Since most machs require DT clocks initialized before timers, no initcall can be used. By adding of_clk_init(NULL) to arch/arm time_init(), we can remove all mach-specific .init_time hooks that basically called of_clk_init and clocksource_of_init. In contrast to the RFC version, of_clk_init(NULL) is now only called if no custom .init_time callback is set. This allows some machs to still call clock init themselves, as not all can be converted now. Therefore, this patch sets drops conversion of mach-mvebu and mach-zynq. New machs that were introduced with v3.12-rc1 are also converted, except mach-u300 that requires clocks before irqs. * 'clk-of-init-v2_for-3.13' of https://github.com/shesselba/linux-dove: (29 commits) ARM: vt8500: remove custom .init_time hook ARM: vexpress: remove custom .init_time hook ARM: tegra: remove custom .init_time hook ARM: sunxi: remove custom .init_time hook ARM: sti: remove custom .init_time hook ARM: socfpga: remove custom .init_time hook ARM: rockchip: remove custom .init_time hook ARM: prima2: remove custom .init_time hook ARM: nspire: remove custom .init_time hook ARM: nomadik: remove custom .init_time hook ARM: mxs: remove custom .init_time hook ARM: kirkwood: remove custom .init_time hook ARM: imx: remove custom .init_time hook ARM: highbank: remove custom .init_time hook ARM: exynos: remove custom .init_time hook ARM: dove: remove custom .init_time hook ARM: bcm2835: remove custom .init_time hook ARM: bcm: provide common arch init for DT clocks ARM: call of_clk_init from default time_init handler ARM: vt8500: prepare for arch-wide .init_time callback ... Signed-off-by: Olof Johansson <olof@lixom.net>	2013-10-07 09:47:31 -07:00
Linus Torvalds	fd848319e7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID fixes from Jiri Kosina: - fix for hidraw reference counting regression, by Manoj Chourasia - fix for minor number allocation for uhid, by David Herrmann - other small unsorted fixes / device ID additions * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: wiimote: fix FF deadlock HID: add Holtek USB ID 04d9:a081 SHARKOON DarkGlider HID: hidraw: close underlying device at removal of last reader HID: roccat: Fix "cannot create duplicate filename" problems HID: uhid: allocate static minor	2013-10-07 09:30:36 -07:00
Michael S. Tsirkin	3573540caf	netif_set_xps_queue: make cpu mask const virtio wants to pass in cpumask_of(cpu), make parameter const to avoid build warnings. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 12:29:26 -04:00
Greg Kroah-Hartman	4c015ba24b	Merge 3.12-rc4 into usb-next We want the USB fixes in this branch as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-06 17:33:56 -07:00
Greg Kroah-Hartman	97a7729a5c	Merge 3.12-rc4 into tty-next We want the tty fixes in this branch as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-06 17:28:16 -07:00
Greg Kroah-Hartman	02ab343f3b	Merge 3.12-rc4 into staging-next We want the staging fixes in this branch as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-06 17:22:43 -07:00
Greg Kroah-Hartman	a6b01deda1	driver core: remove dev_bin_attrs from struct class No in-kernel code is now using this, they have all be converted over to using the bin_attrs support in attribute groups, so this field, and the code in the driver core that was creating/remove the binary files can be removed. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-06 00:01:47 -07:00
Greg Kroah-Hartman	bcc8edb52f	driver core: remove dev_attrs from struct class Now that all in-kernel users of the dev_attrs field are converted to use dev_groups, we can safely remove dev_attrs from struct class. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-05 23:59:34 -07:00
Heikki Krogerus	e4a49a6015	usb: remove intel_mid_otg.h It's not used anymore. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-05 17:54:13 -07:00
Linus Torvalds	8e1a254099	IOMMU Fixes for Linux v3.12-rc3 A couple of fixes from the IOMMU side: * Some small fixes for the new ARM-SMMU driver * A register offset correction for VT-d * Adding MAINTAINERS entry for drivers/iommu Overall no really big or intrusive changes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJSTtOsAAoJECvwRC2XARrjD9cQAOGijCeHW3fZr5iVNCPQQVUq NNvGZpqjFwtV/8QvBy/HZ6Jt2j/4c24od1xewQflwBuvhxokTPUVs30LYVMGqE// tfWM5x/kL7Iz2JBAu0vT9Ihq/elIy4zGE8w7a6hio5K0UQVmvsQ73SWYrxut0jcR F9e9BaNt7LI27v20Ph48ejVHNTIQT07GDZJXRtYiBI9VmI8O6aTpu8OOeS5Grrv+ tyIpt3DYnqbsTdsnF5YlIVn23d/MYR8be2wnpaGh/vShZlwPsU8ay9/29cJhqyGD 5GWuRK+4OKaXXWhzpcwMc+iYwCIp1IKkCc5dax1xVMedlOzRxtQpZXEZEjlv9/aS sINp2kBnkJssGO781OWr7HL9G/OxdKHokG8AiizFSS18VDy76AVI3sWLCJwuFPPW X+SAYQiph7liVkUKEFwITTu4CJ5TClwcy0ovFGqpnhGLIKp3woEO8K1RznBdYZgH 22ZSm3GpTi6XG53cP2INBQ0cKXg6nbJPhczyUiaSLDVGlFS+VMGZavCsjn4ceq7u /k1M9uwhE8JqS6T2dTROy/ZuTOoMTFm4yGTIpec/S9RtRvPjwVMEUQN+y419AV7k PzAuxefsCOqEcviVt0pMz/aFjdPw6slNJNAG1zWckvw6DrMmKrFGbyH8KMRSchzY uo0xEoMIft8Mmfvu1IVe =a8UE -----END PGP SIGNATURE----- Merge tag 'iommu-fixes-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: "A couple of fixes from the IOMMU side: - some small fixes for the new ARM-SMMU driver - a register offset correction for VT-d - add MAINTAINERS entry for drivers/iommu Overall no really big or intrusive changes" * tag 'iommu-fixes-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: x86/iommu: correct ICS register offset MAINTAINERS: add overall IOMMU section iommu/arm-smmu: don't enable SMMU device until probing has completed iommu/arm-smmu: fix iommu_present() test in init iommu/arm-smmu: fix a signedness bug	2013-10-04 09:05:12 -07:00
Roger Quadros	8e933359ee	usb: phy: generic: Add gpio_reset to platform data The GPIO number of the RESET line can be passed to the driver using the gpio_reset member. Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Felipe Balbi <balbi@ti.com>	2013-10-04 09:25:22 -05:00
Ingo Molnar	fb869b6e91	sched/wait: Clean up wait.h details a bit Since we are changing wait.h profoundly, use the opportunity to: - add a sentence to explain what this file is about - remove whitespace noise - prettify weird looking line break fixup attempts - standardize type definition and initialization sequences - use consistent style details No code is changed. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-O8dIie5swnctqpupakatvqyq@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 13:57:19 +02:00
Peter Zijlstra	35a2af94c7	sched/wait: Make the __wait_event() interface more friendly Change all __wait_event() implementations to match the corresponding wait_event() signature for convenience. In particular this does away with the weird 'ret' logic. Since there are __wait_event() users this requires we update them too. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092529.042563462@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:25 +02:00
Peter Zijlstra	ebdc195f2e	sched/wait: Collapse __wait_event_hrtimeout() While not a whole-sale replacement like the others we can still reduce the size of __wait_event_hrtimeout() considerably by noting that the actual core of __wait_event_hrtimeout() is identical to what ___wait_event() generates. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.972793648@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:22 +02:00
Peter Zijlstra	cf7361fd96	sched/wait: Collapse __wait_event_killable() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.898691966@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:21 +02:00
Peter Zijlstra	0d1e1c8a43	sched/wait: Collapse __wait_event_interruptible_tty() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.831085521@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:20 +02:00
Peter Zijlstra	a1dc6852ac	sched/wait: Collapse __wait_event_interruptible_lock_irq_timeout() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.759956109@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:20 +02:00
Peter Zijlstra	8fbd88fa17	sched/wait: Collapse __wait_event_interruptible_lock_irq() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.686006009@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:19 +02:00
Peter Zijlstra	13cb5042a4	sched/wait: Collapse __wait_event_lock_irq() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.612813379@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:50 +02:00
Peter Zijlstra	48c2521717	sched/wait: Collapse __wait_event_interruptible_exclusive() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.541716442@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:49 +02:00
Peter Zijlstra	c2ebb1fb4e	sched/wait: Collapse __wait_event_interruptible_timeout() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.469616907@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:49 +02:00
Peter Zijlstra	f13f4c41c9	sched/wait: Collapse __wait_event_interruptible() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.396949919@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:48 +02:00
Peter Zijlstra	ddc1994b82	sched/wait: Collapse __wait_event_timeout() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.325264677@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:47 +02:00
Peter Zijlstra	854267f438	sched/wait: Collapse __wait_event() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.254863348@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:46 +02:00
Peter Zijlstra	41a1431b17	sched/wait: Introduce ___wait_event() There's far too much duplication in the __wait_event macros; in order to fix this introduce ___wait_event() a macro with the capability to replace most other macros. With the previous patches changing the various __wait_event*() implementations to be more uniform; we can now collapse the lot without also changing generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.181897111@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:46 +02:00
Peter Zijlstra	bb632bc449	sched/wait: Change the wait_exclusive control flow Purely a preparatory patch; it changes the control flow to match what will soon be generated by generic code so that that patch can be a unity transform. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.107994763@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:45 +02:00
Peter Zijlstra	2953ef246b	sched/wait: Change timeout logic Commit `4c663cf` ("wait: fix false timeouts when using wait_event_timeout()") introduced an additional condition check after a timeout but there's a few issues; - it forgot one site - it put the check after the main loop; not at the actual timeout check. Cure both; by wrapping the condition (as suggested by Oleg), this avoids double evaluation of 'condition' which could be quite big. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.028892896@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:44 +02:00
Peter Zijlstra	2f2a2b60ad	sched/wait: Make the signal_pending() checks consistent There's two patterns to check signals in the __wait_event*() macros: if (!signal_pending(current)) { schedule(); continue; } ret = -ERESTARTSYS; break; And the more natural: if (signal_pending(current)) { ret = -ERESTARTSYS; break; } schedule(); Change them all into the latter form. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092527.956416254@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:44 +02:00
Andi Kleen	fdfbbd07e9	perf: Add generic transaction flags Add a generic qualifier for transaction events, as a new sample type that returns a flag word. This is particularly useful for qualifying aborts: to distinguish aborts which happen due to asynchronous events (like conflicts caused by another CPU) versus instructions that lead to an abort. The tuning strategies are very different for those cases, so it's important to distinguish them easily and early. Since it's inconvenient and inflexible to filter for this in the kernel we report all the events out and allow some post processing in user space. The flags are based on the Intel TSX events, but should be fairly generic and mostly applicable to other HTM architectures too. In addition to various flag words there's also reserved space to report an program supplied abort code. For TSX this is used to distinguish specific classes of aborts, like a lock busy abort when doing lock elision. Flags: Elision and generic transactions (ELISION vs TRANSACTION) (HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION) Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC) Retryable transaction (RETRY) Conflicts with other threads (CONFLICT) Transaction write capacity overflow (CAPACITY WRITE) Transaction read capacity overflow (CAPACITY READ) Transactions implicitely aborted can also return an abort code. This can be used to signal specific events to the profiler. A common case is abort on lock busy in a RTM eliding library (code 0xff) To handle this case we include the TSX abort code Common example aborts in TSX would be: - Data conflict with another thread on memory read. Flags: TRANSACTION\|ASYNC\|CONFLICT - executing a WRMSR in a transaction. Flags: TRANSACTION\|SYNC - HLE transaction in user space is too large Flags: ELISION\|SYNC\|CAPACITY-WRITE The only flag that is somewhat TSX specific is ELISION. This adds the perf core glue needed for reporting the new flag word out. v2: Add MEM/MISC v3: Move transaction to the end v4: Separate capacity-read/write and remove misc v5: Remove _SAMPLE. Move abort flags to 32bit. Rename transaction to txn Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:06:08 +02:00
Peter Zijlstra	9886167d20	perf: Fix perf_pmu_migrate_context While auditing the list_entry usage due to a trinity bug I found that perf_pmu_migrate_context violates the rules for perf_event::event_entry. The problem is that perf_event::event_entry is a RCU list element, and hence we must wait for a full RCU grace period before re-using the element after deletion. Therefore the usage in perf_pmu_migrate_context() which re-uses the entry immediately is broken. For now introduce another list_head into perf_event for this specific usage. This doesn't actually fix the trinity report because that never goes through this code. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-mkj72lxagw1z8fvjm648iznw@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 09:58:53 +02:00
Weijie Yang	5b8802143a	fs/debugfs: add declaration for no CONFIG_DEBUG_FS Two function declarations are absence if not define CONFIG_DEBUG_FS in include/linux/debugfs.h Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-03 16:14:12 -07:00
Roger Quadros	0bb85dc2d3	usb: phy: omap: get rid of omap_get_control_dev() This function was preventing us from supporting multiple instances. Get rid of it. Since we support DT boots only, users can get the control device phandle from the DT node. Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-03 15:47:31 -07:00
Roger Quadros	8934d3e4d0	usb: musb: omap2430: Don't use omap_get_control_dev() omap_get_control_dev() is being deprecated as it doesn't support multiple instances. As control device is present only from OMAP4 onwards which supports DT only, we use phandles to get the reference to the control device. Also get rid of "ti,has-mailbox" property as it is redundant and we can determine that from whether "ctrl-module" property is present or not. Get rid of has_mailbox from musb_hdrc_platform_data as well. Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-03 15:47:31 -07:00
Roger Quadros	6cb9310a32	usb: phy: omap: Add new device types and remove omap_control_usb3_phy_power() Add support for new device types and in the process rid of "ti,type" device tree property. The correct type of device will be determined from the compatible string instead. Introduce a compatible string for each device type. At the moment we support 4 types OTGHS, USB2, PIPE3 (e.g. USB3) and DRA7USB2. Update DT binding information to reflect these changes. Also get rid of omap_control_usb3_phy_power(). Just one function i.e. omap_control_usb_phy_power() will now take care of all PHY types. Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-03 15:47:30 -07:00
Roger Quadros	4fd06af96b	usb: phy: omap-control: Get rid of platform data omap-control device is present from OMAP4 onwards which support device tree boots only. So get rid of platform data. Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-03 15:47:30 -07:00
Thomas Pugliese	4ae1a5bd3f	usb: wusbcore: Add isoc transfer type enum and packet definitions This patch adds transfer type enum and packet definitions for WA_XFER_ISO_PACKET_INFO and WA_XFER_ISO_PACKET_STATUS packets. It also changes instances of __attribute__((packed)) to __packed to make checkpatch.pl happy. Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-10-03 15:46:26 -07:00
Mike Travis	8daaa5f826	kdb: Add support for external NMI handler to call KGDB/KDB This patch adds a kgdb_nmicallin() interface that can be used by external NMI handlers to call the KGDB/KDB handler. The primary need for this is for those types of NMI interrupts where all the CPUs have already received the NMI signal. Therefore no send_IPI(NMI) is required, and in fact it will cause a 2nd unhandled NMI to occur. This generates the "Dazed and Confuzed" messages. Since all the CPUs are getting the NMI at roughly the same time, it's not guaranteed that the first CPU that hits the NMI handler will manage to enter KGDB and set the dbg_master_lock before the slaves start entering. The new argument "send_ready" was added for KGDB to signal the NMI handler to release the slave CPUs for entry into KGDB. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Jason Wessel <jason.wessel@windriver.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20131002151417.928886849@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-03 18:47:54 +02:00
Lars-Peter Clausen	05071aa864	spi: Add a spi_w8r16be() helper This patch adds a new spi_w8r16be() helper, which is similar to spi_w8r16() except that it converts the read data word from big endian to native endianness before returning it. The reason for introducing this new helper is that for SPI slave devices it is quite common that the read 16 bit data word is in big endian. So users of spi_w8r16() have to convert the result to native endianness manually. A second reason is that in this case the endianness of the return value of spi_w8r16() depends on its sign. If it is negative (i.e. a error code) it is already in native endianness, if it is positive it is in big endian. The sparse code checker doesn't like this kind of mixed endianness and special annotations are necessary to keep it quiet (E.g. casting to be16 using __force). Doing the conversion to native endianness in the helper function does not require such annotations since we are not mixing different endiannesses in the same variable. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@linaro.org>	2013-10-03 13:52:03 +01:00
Ingo Molnar	68e9074028	Merge branch 'clockevents/3.13' of git://git.linaro.org/people/dlezcano/linux into timers/core Pull (mostly) ARM clocksource driver updates from Daniel Lezcano: " - Soren Brinkmann added FEAT_PERCPU to a clock device when it is local per cpu. This feature prevents the clock framework to choose a per cpu timer as a broadcast timer. This problem arised when the ARM global timer is used when switching to the broadcast timer which is the case now on Xillinx with its cpuidle driver. - Stephen Boyd extended the generic sched_clock code to support 64bit counters and removes the setup_sched_clock deprecation, as that causes lots of warnings since there's still users in the arch/arm tree. He added also the CLOCK_SOURCE_SUSPEND_NONSTOP flag on the architected timer as they continue counting during suspend. - Uwe Kleine-König added some missing __init sections and consolidated the code by moving the of_node_put call from the drivers to the function clocksource_of_init. " Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-03 07:57:02 +02:00
Ingo Molnar	6c09f6d830	Linux 3.12-rc3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJSSKOHAAoJEHm+PkMAQRiGeREH/3EqHmJPBzmVoJwR9/ykDoLg u+TJTkuxZG220WhgXS7W/0ECyBX0U7yA0bY9PZbqgcdiLjY0veR18/pOhEq5RzHq ub8Q+AJdiORF/sq268q7gnNmy3rSCgnrAyHA/bzBtkbisYODwZPYvWQVUjgNZ2dW qtW/TE9rjANcUrk8WdOu9oWcwsq4cyG3cscbfHE/JLFy/8tB5GoD158gxKLZsLXk uTCeUHMmvFRT56fZwfyvNstA8ozxXcHBmuu6+Ttceky2zeGzp6dOrd+d2SU1Ps3O P91x4e/Af4RFEwDczGP6TpSBEf/J/JaqrM1drjhnQHho0hrNRZVUXhADFVADCXY= =dOjB -----END PGP SIGNATURE----- Merge tag 'v3.12-rc3' into timers/core Merge Linux 3.12-rc3 - refresh the tree with the latest fixes before merging new bits. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-03 07:52:21 +02:00
stephen hemminger	5bc3db5c9c	tc: export tc_defact.h to userspace Jamal sent patch to add tc user simple actions to iproute2 but required header was not being exported. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-02 16:39:11 -04:00
Soren Brinkmann	3713c0cfd0	clockchips: Add FEAT_PERCPU clockevent flag Add the flag CLOCK_EVT_FEAT_PERCPU which is supposed to be set for per cpu clockevent devices. Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Michal Simek <michal.simek@xilinx.com>	2013-10-02 11:33:23 +02:00
Trond Myklebust	99875249bf	NFSv4: Ensure that we disable the resend timeout for NFSv4 The spec states that the client should not resend requests because the server will disconnect if it needs to drop an RPC request. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-10-01 18:22:11 -04:00
Trond Myklebust	8a19a0b6cb	SUNRPC: Add RPC task and client level options to disable the resend timeout Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-10-01 18:22:11 -04:00
Trond Myklebust	90051ea774	SUNRPC: Clean up - convert xprt_prepare_transmit to return a bool Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-10-01 18:22:11 -04:00
Linus Torvalds	c31eeaced2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking changes from David Miller: 1) Multiply in netfilter IPVS can overflow when calculating destination weight. From Simon Kirby. 2) Use after free fixes in IPVS from Julian Anastasov. 3) SFC driver bug fixes from Daniel Pieczko. 4) Memory leak in pcan_usb_core failure paths, from Alexey Khoroshilov. 5) Locking and encapsulation fixes to serial line CAN driver, from Andrew Naujoks. 6) Duplex and VF handling fixes to bnx2x driver from Yaniv Rosner, Eilon Greenstein, and Ariel Elior. 7) In lapb, if no other packets are outstanding, T1 timeouts actually stall things and no packet gets sent. Fix from Josselin Costanzi. 8) ICMP redirects should not make it to the socket error queues, from Duan Jiong. 9) Fix bugs in skge DMA mapping error handling, from Nikulas Patocka. 10) Fix setting of VLAN priority field on via-rhine driver, from Roget Luethi. 11) Fix TX stalls and VLAN promisc programming in be2net driver from Ajit Khaparde. 12) Packet padding doesn't get handled correctly in new usbnet SG support code, from Ming Lei. 13) Fix races in netdevice teardown wrt. network namespace closing. From Eric W. Biederman. 14) Fix potential missed initialization of net_secret if not TCP connections are openned. From Eric Dumazet. 15) Cinterion PLXX product ID in qmi_wwan driver is wrong, from Aleksander Morgado. 16) skb_cow_head() can change skb->data and thus packet header pointers, don't use stale ip_hdr reference in ip_tunnel code. 17) Backend state transition handling fixes in xen-netback, from Paul Durrant. 18) Packet offset for AH protocol is handled wrong in flow dissector, from Eric Dumazet. 19) Taking down an fq packet scheduler instance can leave stale packets in the queues, fix from Eric Dumazet. 20) Fix performance regressions introduced by TCP Small Queues. From Eric Dumazet. 21) IPV6 GRE tunneling code calculates max_headroom incorrectly, from Hannes Frederic Sowa. 22) Multicast timer handlers in ipv4 and ipv6 can be the last and final reference to the ipv4/ipv6 specific network device state, so use the reference put that will check and release the object if the reference hits zero. From Salam Noureddine. 23) Fix memory corruption in ip_tunnel driver, and use skb_push() instead of __skb_push() so that similar bugs are less hard to find. From Steffen Klassert. 24) Add forgotten hookup of rtnl_ops in SIT and ip6tnl drivers, from Nicolas Dichtel. 25) fq scheduler doesn't accurately rate limit in certain circumstances, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits) pkt_sched: fq: rate limiting improvements ip6tnl: allow to use rtnl ops on fb tunnel sit: allow to use rtnl ops on fb tunnel ip_tunnel: Remove double unregister of the fallback device ip_tunnel_core: Change __skb_push back to skb_push ip_tunnel: Add fallback tunnels to the hash lists ip_tunnel: Fix a memory corruption in ip_tunnel_xmit qlcnic: Fix SR-IOV configuration ll_temac: Reset dma descriptors indexes on ndo_open skbuff: size of hole is wrong in a comment ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put ethernet: moxa: fix incorrect placement of __initdata tag ipv6: gre: correct calculation of max_headroom powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file Revert "powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file" bonding: Fix broken promiscuity reference counting issue tcp: TSQ can use a dynamic limit dm9601: fix IFF_ALLMULTI handling pkt_sched: fq: qdisc dismantle fixes ...	2013-10-01 12:58:48 -07:00
Srinivas Pandruvada	1df3a40115	HID: Delay opening HID device Don't call hid_open_device till there is actually an user. This saves power by not opening underlying transport for HID. Also close device if there are no active mfd client using HID sensor hub. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Jonathan Cameron <jic23@kernel.org>	2013-10-01 16:19:08 +01:00
Frederic Weisbecker	7d65f4a655	irq: Consolidate do_softirq() arch overriden implementations All arch overriden implementations of do_softirq() share the following common code: disable irqs (to avoid races with the pending check), check if there are softirqs pending, then execute __do_softirq() on a specific stack. Consolidate the common parts such that archs only worry about the stack switch. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-10-01 12:53:25 +02:00
Nicolas Dichtel	4590672357	skbuff: size of hole is wrong in a comment Since commit `c93bdd0e03` ("netvm: allow skb allocation to use PFMEMALLOC reserves"), hole size is one bit less than what is written in the comment. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 22:32:39 -07:00
Linus Torvalds	f927318840	NFS client bugfixes for 3.12 - Stable fix for Oopses in the pNFS files layout driver - Fix a regression when doing a non-exclusive file create on NFSv4.x - NFSv4.1 security negotiation fixes when looking up the root filesystem - Fix a memory ordering issue in the pNFS files layout driver -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJSSfNNAAoJEGcL54qWCgDybGYQAJGm4/vd7/rWZ49KIjGFGkFo sCt0UOK6Y6ALhUOIlIreXsQ+Iwn9aAoIIRgx8UwnB+hO6PGnSyFuJZZx1KE8V2kj 6JlE5FbsWV+3uFQzNJQsNcoj7NZMzIRZT7x+7QansBOdSQjgQc3ig2sAMWREZjn8 GxMOl8FNRrnP8gRom30ZScgMp1YDM8J1ql80S/nbxh2NOLBsvgg9VapzJhhqkMyl b7WKX4Qbg4AeSaxIAIrIwcZ7L2YS09JGC40VSybQARs0/7J8fjOZPs7CmrUCoB5F DmT5vfEC4+dqDf8PMyoFVfxK5ua5Sb/FGQmagYYa8bSgY7Uq03akYI++co+4PZU1 f3SN6CSvVffzGMdXAhUupOZQbkKvKFxR2MTGy8s7dxdkQudd4RioYPDmLfCHlbmb VY5kFh/Duqso1FCrcfvZoC88ElrWUz5yoVzZyECOEwCs1wjI6bjmGdSqCSbU75Lm Z0XOAn1cStwFvGwCbGZPUzlvueji3coDdCFPBXAOFHzisLYoo/Lxenw7l5D1qM5b 02iZllcIo340vw8wxHZxVebecFo33P90X1gjv0HQQkV/6EeNgq4D47SWTPxRq3Ai Dl9MFjTPl51oseDLrH6I/hBvcqjksB1M1+WjifT0bCIi3Y0HAea2U0wgweHS3vAd QHqIpIJxNHDjPBMDWEZW =ScfI -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.12-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Stable fix for Oopses in the pNFS files layout driver - Fix a regression when doing a non-exclusive file create on NFSv4.x - NFSv4.1 security negotiation fixes when looking up the root filesystem - Fix a memory ordering issue in the pNFS files layout driver * tag 'nfs-for-3.12-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Give "flavor" an initial value to fix a compile warning NFSv4.1: try SECINFO_NO_NAME flavs until one works NFSv4.1: Ensure memory ordering between nfs4_ds_connect and nfs4_fl_prepare_ds NFSv4.1: nfs4_fl_prepare_ds - fix bugs when the connect attempt fails NFSv4: Honour the 'opened' parameter in the atomic_open() filesystem method	2013-09-30 17:10:26 -07:00
Linus Torvalds	522d6d38f8	Merge branch 'akpm' (fixes from Andrew Morton) Merge misc fixes from Andrew Morton. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits) pidns: fix free_pid() to handle the first fork failure ipc,msg: prevent race with rmid in msgsnd,msgrcv ipc/sem.c: update sem_otime for all operations mm/hwpoison: fix the lack of one reference count against poisoned page mm/hwpoison: fix false report on 2nd attempt at page recovery mm/hwpoison: fix test for a transparent huge page mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood block: change config option name for cmdline partition parsing mm/mlock.c: prevent walking off the end of a pagetable in no-pmd configuration mm: avoid reinserting isolated balloon pages into LRU lists arch/parisc/mm/fault.c: fix uninitialized variable usage include/asm-generic/vtime.h: avoid zero-length file nilfs2: fix issue with race condition of competition between segments for dirty blocks Documentation/kernel-parameters.txt: replace kernelcore with Movable mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored kernel/kmod.c: check for NULL in call_usermodehelper_exec() ipc/sem.c: synchronize the proc interface ipc/sem.c: optimize sem_lock() ipc/sem.c: fix race in sem_lock() mm/compaction.c: periodically schedule when freeing pages ...	2013-09-30 14:32:32 -07:00
Rafael Aquini	117aad1e9e	mm: avoid reinserting isolated balloon pages into LRU lists Isolated balloon pages can wrongly end up in LRU lists when migrate_pages() finishes its round without draining all the isolated page list. The same issue can happen when reclaim_clean_pages_from_list() tries to reclaim pages from an isolated page list, before migration, in the CMA path. Such balloon page leak opens a race window against LRU lists shrinkers that leads us to the following kernel panic: BUG: unable to handle kernel NULL pointer dereference at 0000000000000028 IP: [<ffffffff810c2625>] shrink_page_list+0x24e/0x897 PGD 3cda2067 PUD 3d713067 PMD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 340 Comm: kswapd0 Not tainted 3.12.0-rc1-22626-g4367597 #87 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 RIP: shrink_page_list+0x24e/0x897 RSP: 0000:ffff88003da499b8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88003e82bd60 RCX: 00000000000657d5 RDX: 0000000000000000 RSI: 000000000000031f RDI: ffff88003e82bd40 RBP: ffff88003da49ab0 R08: 0000000000000001 R09: 0000000081121a45 R10: ffffffff81121a45 R11: ffff88003c4a9a28 R12: ffff88003e82bd40 R13: ffff88003da0e800 R14: 0000000000000001 R15: ffff88003da49d58 FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000067d9000 CR3: 000000003ace5000 CR4: 00000000000407b0 Call Trace: shrink_inactive_list+0x240/0x3de shrink_lruvec+0x3e0/0x566 __shrink_zone+0x94/0x178 shrink_zone+0x3a/0x82 balance_pgdat+0x32a/0x4c2 kswapd+0x2f0/0x372 kthread+0xa2/0xaa ret_from_fork+0x7c/0xb0 Code: 80 7d 8f 01 48 83 95 68 ff ff ff 00 4c 89 e7 e8 5a 7b 00 00 48 85 c0 49 89 c5 75 08 80 7d 8f 00 74 3e eb 31 48 8b 80 18 01 00 00 <48> 8b 74 0d 48 8b 78 30 be 02 00 00 00 ff d2 eb RIP [<ffffffff810c2625>] shrink_page_list+0x24e/0x897 RSP <ffff88003da499b8> CR2: 0000000000000028 ---[ end trace 703d2451af6ffbfd ]--- Kernel panic - not syncing: Fatal exception This patch fixes the issue, by assuring the proper tests are made at putback_movable_pages() & reclaim_clean_pages_from_list() to avoid isolated balloon pages being wrongly reinserted in LRU lists. [akpm@linux-foundation.org: clarify awkward comment text] Signed-off-by: Rafael Aquini <aquini@redhat.com> Reported-by: Luiz Capitulino <lcapitulino@redhat.com> Tested-by: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-30 14:31:02 -07:00
Olof Johansson	8e17a7f32a	This is a huge device tree and ATAG removal series for ux500: - Move all the clock definitions over to the device tree - Remove all now-redundant AUXDATA and make the ux500 device tree only -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJSRA9cAAoJEEEQszewGV1zpiQP/iwq3mMXJm5NRnG7NdlfaZkO TUtobAxdTQDbz70nhgLG5z+FD+qxqYed7MnF/NTRIwe6ZsV74lvY4WloAkNizfDL TeZYxFiSrv5oTHLfJzuynveIflnG0Wfm3MAMFYp3aeR+h9fGrpDkxOlMIleI+BP0 C9wjysYAO2vhAQIUT3ONwybMgPfgsEmQMpMsPzOpfdLmgtqQrELaHmucN8Hghagy rTwxq6f3gOxYeFAlEAw1RxapH4GI+VBxoLCXGpkTJqhLK4DAo67NphoUPgNAXp7I IGZwWTYbNVa6GlvLVXFdCmV6UVF+qkTn9/Tedn+a6LF/5XtA44DLGAvhPRZBGdrY k5nWCyc/90gpDF8pSi7cJkoZLUW3wEwDpYi9XmZA3dsLnbWW3ct8kX3iJNFQ/jAe lx+kJP9erMQEtiuVfEXXYbK8ILHi8yoEWfiq7FY3UIbHkKIjEdJOCxk5to6BeynA PKPJNO7dgYdhiM4Rqz4AcF1xCEe5NAQVltNfeqPVg/7o4nlQnIaUQVUpbDnYXaCN iB0tVbjSiFA4C/C5oz0k2VhKH8YMsgd+CZ9ALS2jhGNiIfFOp8JuuJ8KL86m7iZQ aC0rPXT8xkEF0a9lVU1UCCiSf6zIpZ0G1iJ6mZ+hA/FJ2pzeLEGEa4iYpsMhcnfQ 0d9fVtYBKBKCMXaeJP0z =FQfo -----END PGP SIGNATURE----- Merge tag 'ux500-dt-for-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson into next/dt From Linus Walleij: This is a huge device tree and ATAG removal series for ux500: - Move all the clock definitions over to the device tree - Remove all now-redundant AUXDATA and make the ux500 device tree only * tag 'ux500-dt-for-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson: (92 commits) ARM: ux500: delete devices-common remnants clk: ux500: Provide a look-up for the ARMSS clock ARM: ux500: Enable CPUFreq on Snowball ARM: ux500: Provide a Device Tree node for CPUFreq in the DBx500 ARM: ux500: Provide a clock lookup for the Hash driver ARM: ux500: Provide a clock lookup for the Crypto driver ARM: ux500: Fix trivial white-space error in the DBX500 DTSI file ARM: ux500: Remove ATAG booting support for Snowball ARM: ux500: Remove ATAG booting support for HREF ARM: ux500: Remove ATAG booting support for U8520 ARM: ux500: Remove ATAG booting support for MOP500 ARM: ux500: Purge UIB framework when booting with ATAGs ARM: ux500: Take out STUIB support when not booting with Device Tree ARM: ux500: Remove BU21013 ROHM TS support when booting with only ATAGs ARM: ux500: Don't register the STMPE/SKE when booting with ATAG support ARM: ux500: Delete U8500 UIB support when booting with ATAGs ARM: ux500: Don't register Synaptics RMI4 TS when booting with ATAGs ARM: ux500: Purge DB8500 PRCMU registration when not booting with DT ARM: ux500: Stop requesting the SoC device to play 'parent' role ARM: ux500: Remove UART support when booting without Device Tree ... Signed-off-by: Olof Johansson <olof@lixom.net>	2013-09-30 09:08:46 -07:00
Mark Brown	3abd38db6d	Merge remote-tracking branch 'regulator/fix/doc' into regulator-linus	2013-09-30 12:04:30 +01:00
Greg Kroah-Hartman	df9b17f586	Merge 3.12-rc3 into usb-next We want the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-29 18:45:55 -07:00
Greg Kroah-Hartman	4ceedcf815	Merge 3.12-rc3 into tty-next We want the tty/serial fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-29 18:44:13 -07:00
Greg Kroah-Hartman	73b2277718	Merge 3.12-rc3 into staging-next We want the staging fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-29 18:42:21 -07:00
Greg Kroah-Hartman	88502b9c0a	Merge 3.12-rc3 into driver-core-next We want the driver core and sysfs fixes in here to make merges and development easier. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-29 18:29:23 -07:00
Greg Kroah-Hartman	d717349368	Merge 3.12-rc3 into char-misc-next We need/want the mei fixes in here so we can apply other updates that are depending on them. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-29 18:27:03 -07:00
Linus Torvalds	c23c2234de	Char / Misc driver fixes for 3.12-rc3 Here are some HyperV and MEI driver fixes for 3.12-rc3. They resolve some issues that people have been reporting for them. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (GNU/Linux) iEYEABECAAYFAlJIgcwACgkQMUfUDdst+ynRGwCgoIIGIXG2RTVaTDm8wezRERGA ZDoAoKZe5Bec2CSqydcPNZWYg4ATTMjn =6r+r -----END PGP SIGNATURE----- Merge tag 'char-misc-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some HyperV and MEI driver fixes for 3.12-rc3. They resolve some issues that people have been reporting for them" * tag 'char-misc-3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Drivers: hv: vmbus: Terminate vmbus version negotiation on timeout Drivers: hv: util: Correctly support ws2008R2 and earlier mei: cancel stall timers in mei_reset mei: bus: stop wait for read during cl state transition mei: make me client counters less error prone	2013-09-29 13:44:48 -07:00
Greg Kroah-Hartman	b4e8459947	Second set of new functionality for IIO in the 3.13 cycle - with bug fixes for first set. This is a small, mainly to get a couple of compile bug related fixes into the tree ASAP. New device support: 1) Add ad5446 dac support to the ad5641 driver. New functionality and cleanups: 1) Optional power supply regulators for the st pressure sensors drivers using the new optional regulator interface. 2) Bit of tidying up of naming in the sysfs trigger. Bug fixes from the previous series: 1) Missing select IIO_BUFFER for ti_am335x_adc 2) Drop a bonus bracket in iio-trig-bfin-timmer -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (GNU/Linux) iQIcBAABAgAGBQJSSJGeAAoJEFSFNJnE9BaIY7gP/i5Ld5glnnG88dtM5g7Cs7Ws utyKeWh1QxfbtRH8MDo8GfxKAszVeckG552A/iOfm0wvqJT1JIKRdUl2rMcFvfBS V0YFMKlJlXgfVhvy4X9NgfllrlGxAniFQ115l3JnPPhnEfr4IgWOt1K5dmK0kqhF Ztt9ukmsa2VctefYP5tcRh1Pw1rYdFndgkdZ64+fYjkAideMbWxgQdh91al1prBz wXTu43djnQimvkZLRQ0tX0/WQ9zs8H3bNm7TEgOqU8bbX785B8XQv6N6OFnqXT9S IjbZHniih8eU13gtno+yBR6OO0gYw20IxXmVRUUR/F94JT4KXaByGBrMz8Z5f+o0 OilkJjOdn/jH+Dd1XHJN7Pk8Wx6PxwxeENu3qVn7p1vg27OcpY0Ft2vcdecnE11j 7bP3HYLcxGxLCW0qYs6voK2SnvZAJBrw8MB4sdUAmykdn1xJ25wccQm6iMKTvTYI +AipHI3jpfllIC6Lj1IVhksAYhd7BlMK5sVH0sgWirLbqACSweRpbXCcRlSvguGY jEX5+Rf29QfYbcIsRlk52+n47FpVVOfIDjc0ygBGLovaE+FLorWUNIJjeM3f570D l2nwWjHUvnNFtA/NnBf4NLrwHqXzBX30Httk+6eISRJiyjr4ZEBAVpRt/luiWlqY bOEBICwy1sEzF4ymIAIa =KKlS -----END PGP SIGNATURE----- Merge tag 'iio-for-3.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next Jonathan writes: Second set of new functionality for IIO in the 3.13 cycle - with bug fixes for first set. This is a small, mainly to get a couple of compile bug related fixes into the tree ASAP. New device support: 1) Add ad5446 dac support to the ad5641 driver. New functionality and cleanups: 1) Optional power supply regulators for the st pressure sensors drivers using the new optional regulator interface. 2) Bit of tidying up of naming in the sysfs trigger. Bug fixes from the previous series: 1) Missing select IIO_BUFFER for ti_am335x_adc 2) Drop a bonus bracket in iio-trig-bfin-timmer	2013-09-29 13:12:23 -07:00
Sebastian Hesselbarth	dd03ee9ae5	ARM: mxs: remove custom .init_time hook This patch converts clk-imx2[38] clocksource_of_init compatible init associated with fsl,imx2[38]-clkctrl. With arch/arm calling of_clk_init(NULL) from time_init(), we can now also remove custom .init_time hooks. Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Acked-by: Mike Turquette <mturquette@linaro.org> Acked-by: Shawn Guo <shawn.guo@linaro.org>	2013-09-29 21:09:34 +02:00
Sebastian Hesselbarth	be0804513a	clk: sunxi: declare OF clock provider Common clock framework allows to register clock providers to get called on of_clk_init() by using CLK_OF_DECLARE. This converts sunxi clock providers to make use of it and get rid of the mach specific clk init call. As sunxi has a bunch of independent clk provider nodes, we hook current clock init to board compatible to make it called once. Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com> Acked-by: Mike Turquette <mturquette@linaro.org>	2013-09-29 21:07:16 +02:00
Sebastian Hesselbarth	74227e65f9	clk: nomadik: declare OF clock provider Common clock framework allows to register clock providers to get called on of_clk_init() by using CLK_OF_DECLARE. This converts nomadik clock provider to make use of it and get rid of the mach specific clk init call. As clocks require system reset controller base address to be initialized each clock driver checks src_base and calls new nomadik_src_init if required. Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Mike Turquette <mturquette@linaro.org>	2013-09-29 21:07:06 +02:00
Ming Lei	60e453a940	USBNET: fix handling padding packet Commit 638c5115a7949(USBNET: support DMA SG) introduces DMA SG if the usb host controller is capable of building packet from discontinuous buffers, but missed handling padding packet when building DMA SG. This patch attachs the pre-allocated padding packet at the end of the sg list, so padding packet can be sent to device if drivers require that. Reported-by: David Laight <David.Laight@aculab.com> Acked-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-28 15:42:49 -04:00
Linus Torvalds	aeebc26457	Merge branch 'lockref' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 lockref enablement from Heiko Carstens: "Enabling the new lockless lockref variant on s390 would have been trivial until Tony Luck added a cpu_relax() call into the CMPXCHG_LOOP(), with commit `d472d9d98b` ("lockref: Relax in cmpxchg loop") As already mentioned cpu_relax() is very expensive on s390 since it yields() the current virtual cpu. So we are talking of several thousand cycles. Considering this enabling the lockless lockref variant would contradict the intention of the new semantics. And also some quick measurements show performance regressions of 50% and more. Simply removing the cpu_relax() call again seems also not very desireable since Waiman Long reported that for some workloads the call improved performance by 5%." * 'lockref' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: enable ARCH_USE_CMPXCHG_LOCKREF lockref: use arch_mutex_cpu_relax() in CMPXCHG_LOOP() mutex: replace CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX with simple ifdef	2013-09-28 12:36:19 -07:00
Linus Torvalds	f2e98aa830	DeviceTree fixes for 3.12 Clean-up to fix some warnings for !OF builds and spelling fixes in docs - Clean-up openrisc prom.h - Fix warnings caused by of_irq.h ifdefs - Spelling fix for Synopsys -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJSQk1CAAoJEMhvYp4jgsXilvEH/RztxbKNiscz7TYg3BDEKMaY Wg029sdygyhl+Xb78FaS3Qwt98pK2JSWqzoNv+Tv4tfksGeLWVpe6HZlGvE/PEhb 2gUppV/ILyFSH8NfeaPkfyqdqr2XpCkcpkyGgYz7jQzNti5si448LC4oJR4tcXbo FyKguRPpTtCwDsG86fPpQgc3fHK+C7W0oQu+PCB0DEk2ApMR05rGpPiMEYsKlj56 FE9n0DMZ+hfMCUivraVTyHB9Xjo4/kTv9s88hkjwkli+N3UrIUgOGsQpNt27QIFP F95roRtVOmgRbgtMAFaEy7nfd+3QuVfonjfYNJzxpRHWew7LxFcVEwWRjb0zaKQ= =KDHt -----END PGP SIGNATURE----- Merge tag 'devicetree-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DeviceTree fixes from Rob Herring: "Clean-up to fix some warnings for !OF builds and spelling fixes in docs: - Clean-up openrisc prom.h - Fix warnings caused by of_irq.h ifdefs - Spelling fix for Synopsys" * tag 'devicetree-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dts: Fix misspelling of Synopsys of: clean-up ifdefs in of_irq.h openrisc: clean-up prom.h	2013-09-28 11:57:26 -07:00
Greg Kroah-Hartman	e18945b159	driver-core: remove struct bus_type.drv_attrs Now that all in-kernel users of bus_type.drv_attrs have been converted to use drv_groups instead, the drv_attrs field, and logic surrounding it, can be removed. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-28 10:18:20 -07:00
Greg Kroah-Hartman	b4e46138f9	driver-core: remove struct bus_type.bus_attrs Now that all in-kernel users of bus_type.bus_attrs have been converted to use bus_groups instead, the bus_attrs field, and logic surrounding it, can be removed. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-28 08:32:11 -07:00
Heiko Carstens	083986e824	mutex: replace CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX with simple ifdef Linus suggested to replace #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX #define arch_mutex_cpu_relax() cpu_relax() #endif with just a simple #ifndef arch_mutex_cpu_relax # define arch_mutex_cpu_relax() cpu_relax() #endif to get rid of CONFIG_HAVE_CPU_RELAX_SIMPLE. So architectures can simply define arch_mutex_cpu_relax if they want an architecture specific function instead of having to add a select statement in their Kconfig in addition. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2013-09-28 12:46:21 +02:00
Peter Zijlstra	75f93fed50	sched: Revert need_resched() to look at TIF_NEED_RESCHED Yuanhan reported a serious throughput regression in his pigz benchmark. Using the ftrace patch I found that several idle paths need more TLC before we can switch the generic need_resched() over to preempt_need_resched. The preemption paths benefit most from preempt_need_resched and do indeed use it; all other need_resched() users don't really care that much so reverting need_resched() back to tif_need_resched() is the simple and safe solution. Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: lkp@linux.intel.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20130927153003.GF15690@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-28 10:04:47 +02:00
Kishon Vijay Abraham I	6747caa76c	usb: phy: twl4030: use the new generic PHY framework Used the generic PHY framework API to create the PHY. For powering on and powering off the PHY, power_on and power_off ops are used. Once the MUSB OMAP glue is adapted to the new framework, the suspend and resume ops of usb phy library will be removed. Also twl4030-usb driver is moved to drivers/phy/. However using the old usb phy library cannot be completely removed because otg is intertwined with phy and moving to the new framework completely will break otg. Once we have a separate otg state machine, we can get rid of the usb phy library. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Acked-by: Felipe Balbi <balbi@ti.com> Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-27 17:36:58 -07:00
Kishon Vijay Abraham I	ff76496347	drivers: phy: add generic PHY framework The PHY framework provides a set of APIs for the PHY drivers to create/destroy a PHY and APIs for the PHY users to obtain a reference to the PHY with or without using phandle. For dt-boot, the PHY drivers should also register PHY provider with the framework. PHY drivers should create the PHY by passing id and ops like init, exit, power_on and power_off. This framework is also pm runtime enabled. The documentation for the generic PHY framework is added in Documentation/phy.txt and the documentation for dt binding can be found at Documentation/devicetree/bindings/phy/phy-bindings.txt Cc: Tomasz Figa <t.figa@samsung.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Acked-by: Felipe Balbi <balbi@ti.com> Tested-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-27 17:35:41 -07:00
David Howells	f1fe29b4a0	NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open() Use i_writecount to control whether to get an fscache cookie in nfs_open() as NFS does not do write caching yet. I think this is the cause of a problem encountered by Mark Moseley whereby __fscache_uncache_page() gets a NULL pointer dereference because cookie->def is NULL: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160 PGD 0 Thread overran stack, or stack corrupted Oops: 0000 [#1] SMP Modules linked in: ... CPU: 7 PID: 18993 Comm: php Not tainted 3.11.1 #1 Hardware name: Dell Inc. PowerEdge R420/072XWF, BIOS 1.3.5 08/21/2012 task: ffff8804203460c0 ti: ffff880420346640 RIP: 0010:[<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160 RSP: 0018:ffff8801053af878 EFLAGS: 00210286 RAX: 0000000000000000 RBX: ffff8800be2f8780 RCX: ffff88022ffae5e8 RDX: 0000000000004c66 RSI: ffffea00055ff440 RDI: ffff8800be2f8780 RBP: ffff8801053af898 R08: 0000000000000001 R09: 0000000000000003 R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00055ff440 R13: 0000000000001000 R14: ffff8800c50be538 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88042fc60000(0063) knlGS:00000000e439c700 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000000001d8f000 CR4: 00000000000607f0 Stack: ... Call Trace: [<ffffffff81365a72>] __nfs_fscache_invalidate_page+0x42/0x70 [<ffffffff813553d5>] nfs_invalidate_page+0x75/0x90 [<ffffffff811b8f5e>] truncate_inode_page+0x8e/0x90 [<ffffffff811b90ad>] truncate_inode_pages_range.part.12+0x14d/0x620 [<ffffffff81d6387d>] ? __mutex_lock_slowpath+0x1fd/0x2e0 [<ffffffff811b95d3>] truncate_inode_pages_range+0x53/0x70 [<ffffffff811b969d>] truncate_inode_pages+0x2d/0x40 [<ffffffff811b96ff>] truncate_pagecache+0x4f/0x70 [<ffffffff81356840>] nfs_setattr_update_inode+0xa0/0x120 [<ffffffff81368de4>] nfs3_proc_setattr+0xc4/0xe0 [<ffffffff81357f78>] nfs_setattr+0xc8/0x150 [<ffffffff8122d95b>] notify_change+0x1cb/0x390 [<ffffffff8120a55b>] do_truncate+0x7b/0xc0 [<ffffffff8121f96c>] do_last+0xa4c/0xfd0 [<ffffffff8121ffbc>] path_openat+0xcc/0x670 [<ffffffff81220a0e>] do_filp_open+0x4e/0xb0 [<ffffffff8120ba1f>] do_sys_open+0x13f/0x2b0 [<ffffffff8126aaf6>] compat_SyS_open+0x36/0x50 [<ffffffff81d7204c>] sysenter_dispatch+0x7/0x24 The code at the instruction pointer was disassembled: > (gdb) disas __fscache_uncache_page > Dump of assembler code for function __fscache_uncache_page: > ... > 0xffffffff812a18ff <+31>: mov 0x48(%rbx),%rax > 0xffffffff812a1903 <+35>: cmpb $0x0,0x10(%rax) > 0xffffffff812a1907 <+39>: je 0xffffffff812a19cd <__fscache_uncache_page+237> These instructions make up: ASSERTCMP(cookie->def->type, !=, FSCACHE_COOKIE_TYPE_INDEX); That cmpb is the faulting instruction (%rax is 0). So cookie->def is NULL - which presumably means that the cookie has already been at least partway through __fscache_relinquish_cookie(). What I think may be happening is something like a three-way race on the same file: PROCESS 1 PROCESS 2 PROCESS 3 =============== =============== =============== open(O_TRUNC\|O_WRONLY) open(O_RDONLY) open(O_WRONLY) -->nfs_open() -->nfs_fscache_set_inode_cookie() nfs_fscache_inode_lock() nfs_fscache_disable_inode_cookie() __fscache_relinquish_cookie() nfs_inode->fscache = NULL <--nfs_fscache_set_inode_cookie() -->nfs_open() -->nfs_fscache_set_inode_cookie() nfs_fscache_inode_lock() nfs_fscache_enable_inode_cookie() __fscache_acquire_cookie() nfs_inode->fscache = cookie <--nfs_fscache_set_inode_cookie() <--nfs_open() -->nfs_setattr() ... ... -->nfs_invalidate_page() -->__nfs_fscache_invalidate_page() cookie = nfsi->fscache -->nfs_open() -->nfs_fscache_set_inode_cookie() nfs_fscache_inode_lock() nfs_fscache_disable_inode_cookie() -->__fscache_relinquish_cookie() -->__fscache_uncache_page(cookie) <crash> <--__fscache_relinquish_cookie() nfs_inode->fscache = NULL <--nfs_fscache_set_inode_cookie() What is needed is something to prevent process #2 from reacquiring the cookie - and I think checking i_writecount should do the trick. It's also possible to have a two-way race on this if the file is opened O_TRUNC\|O_RDONLY instead. Reported-by: Mark Moseley <moseleymark@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-27 18:40:25 +01:00
David Howells	94d30ae90a	FS-Cache: Provide the ability to enable/disable cookies Provide the ability to enable and disable fscache cookies. A disabled cookie will reject or ignore further requests to: Acquire a child cookie Invalidate and update backing objects Check the consistency of a backing object Allocate storage for backing page Read backing pages Write to backing pages but still allows: Checks/waits on the completion of already in-progress objects Uncaching of pages Relinquishment of cookies Two new operations are provided: (1) Disable a cookie: void fscache_disable_cookie(struct fscache_cookie cookie, bool invalidate); If the cookie is not already disabled, this locks the cookie against other dis/enablement ops, marks the cookie as being disabled, discards or invalidates any backing objects and waits for cessation of activity on any associated object. This is a wrapper around a chunk split out of fscache_relinquish_cookie(), but it reinitialises the cookie such that it can be reenabled. All possible failures are handled internally. The caller should consider calling fscache_uncache_all_inode_pages() afterwards to make sure all page markings are cleared up. (2) Enable a cookie: void fscache_enable_cookie(struct fscache_cookie cookie, bool (can_enable)(void data), void data) If the cookie is not already enabled, this locks the cookie against other dis/enablement ops, invokes can_enable() and, if the cookie is not an index cookie, will begin the procedure of acquiring backing objects. The optional can_enable() function is passed the data argument and returns a ruling as to whether or not enablement should actually be permitted to begin. All possible failures are handled internally. The cookie will only be marked as enabled if provisional backing objects are allocated. A later patch will introduce these to NFS. Cookie enablement during nfs_open() is then contingent on i_writecount <= 0. can_enable() checks for a race between open(O_RDONLY) and open(O_WRONLY/O_RDWR). This simplifies NFS's cookie handling and allows us to get rid of open(O_RDONLY) accidentally introducing caching to an inode that's open for writing already. One operation has its API modified: (3) Acquire a cookie. struct fscache_cookie fscache_acquire_cookie( struct fscache_cookie parent, const struct fscache_cookie_def def, void *netfs_data, bool enable); This now has an additional argument that indicates whether the requested cookie should be enabled by default. It doesn't need the can_enable() function because the caller must prevent multiple calls for the same netfs object and it doesn't need to take the enablement lock because no one else can get at the cookie before this returns. Signed-off-by: David Howells <dhowells@redhat.com	2013-09-27 18:40:25 +01:00
David Howells	8fb883f3e3	FS-Cache: Add use/unuse/wake cookie wrappers Add wrapper functions for dealing with cookie->n_active: () __fscache_use_cookie() to increment it. () __fscache_unuse_cookie() to decrement and test against zero. (*) __fscache_wake_unused_cookie() to wake up anyone waiting for it to reach zero. The second and third are split so that the third can be done after cookie->lock has been released in case the waiter wakes up whilst we're still holding it and tries to get it. We will need to wake-on-zero once the cookie disablement patch is applied because it will then be possible to see n_active become zero without the cookie being relinquished. Also move the cookie usement out of fscache_attr_changed_op() and into fscache_attr_changed() and the operation struct so that cookie disablement will be able to track it. Whilst we're at it, only increment n_active if we're about to do fscache_submit_op() so that we don't have to deal with undoing it if anything earlier fails. Possibly this should be moved into fscache_submit_op() which could look at FSCACHE_OP_UNUSE_COOKIE. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-27 18:40:25 +01:00
John W. Linville	0a878747e1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem Also fixed-up a badly indented closing brace... Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-09-27 13:11:17 -04:00
Mark Brown	0cab71e701	Merge remote-tracking branch 'spi/fix/s3c64xx' into spi-s3c64xx Conflicts: drivers/spi/spi-s3c64xx.c	2013-09-27 14:27:56 +01:00
Greg Kroah-Hartman	2424a7339b	Update extcon for 3.13 This patchset modify extcon core to remove unnecessary allocation sequence for 'dev' instance and change extcon_dev_register() interface. extcon-gpio use gpiolib API to get debounce time and include small fix of extcon core/device driver. Detailed description for patchset: 1. Modify extcon core driver - The extcon-gpio driver use gpio_set_debounce() API provided from gpiolib if gpio driver for SoC support gpio_set_debounce() function and support 'gpio_ activ_low' filed to check whether gpio active state is 1(high) or 0(low). - Change field type of 'dev' in structure extcon_dev and remove the sequence of allocating memory of 'struct dev' on extcon_dev_register() function because extcon device must need 'struct device. - Change extcon_dev_register() prototype to simplify it and remove unnecessary parameter as below: 2. Fix coding style and typo - extcon core : Fix indentation coding style and remove unnecessary casting - extcon-max8997 : Fix checkpatch warning - extcon-max77693 : Fix checkpatch warning - extcon-arizona : Fix typo of comment and modify minor issue - extcon-palmas : Use dev_get_platdata() 3. Modify extcon-arizona driver - Modify minor issue about micbias and comparision statement -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJSRNuGAAoJEJzN3yze689TcaQP/0jyJEFTkB0EsutUsK5K4Ul9 Hv6G07ccCQZOR8Bk5VR3a/ldX2nMlc8uZkhCrcvxbPM9NDuUNt5Q18EFrMCYomwz bipFPds6Q2UuGxnaSgqBQKJ6PH/IYXTfS1Y2nbaOivKK18L62WBWw4wCz59WWyS8 B4ROjn0OjfzHlbp3uPjjCAqY9PpVsxFfTxszR0+Xt4hbapet/7UTgyMO6fJRZ0JF gzWCKcXghRvv5ZEravX5sUc4SW6bNYKq3CgPBPIEmo33W0obk5bKOY5qDVdc7e03 WhYCPhm/gYsEfHq/Ah9k9xz3Q5ObrNdDep5nAFrS4PpoyJfalYwpRmcExTTuCEXj CXmR/NpSK2BPlxgrQ32/bTve2bMtT16ZAMXKAIxHGwAUG/RAuQIZx0trjTSERNi1 A0juMZG3a2o9q+heCIeUp/bjhMGqCgfmlSwGrinU6hZMEj8DL4KtgiZrozOp4RlF SX+LOvNA40EK+S2FljY30G3NkTCLksJimjQ6v37b1vhSnXBNh5R+lhzXUY6cBBHB b5oo+6P4A9CCcmoUg8XOb5/NEjJX8JxpUKGpF/saBgHOQl5xXAfYBYVi5HVpgrJ1 Soqlv2s/wUxmGGjG9meC+AEvTu/DtKGNUMtGDZM8TUULAxwFsvdStSeJ5JHqyXuE iaAj5pGTNkFsXnwPISZT =8Hr2 -----END PGP SIGNATURE----- Merge tag 'extcon-next-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next Chanwoo writes: Update extcon for 3.13 This patchset modify extcon core to remove unnecessary allocation sequence for 'dev' instance and change extcon_dev_register() interface. extcon-gpio use gpiolib API to get debounce time and include small fix of extcon core/device driver. Detailed description for patchset: 1. Modify extcon core driver - The extcon-gpio driver use gpio_set_debounce() API provided from gpiolib if gpio driver for SoC support gpio_set_debounce() function and support 'gpio_ activ_low' filed to check whether gpio active state is 1(high) or 0(low). - Change field type of 'dev' in structure extcon_dev and remove the sequence of allocating memory of 'struct dev' on extcon_dev_register() function because extcon device must need 'struct device. - Change extcon_dev_register() prototype to simplify it and remove unnecessary parameter as below: 2. Fix coding style and typo - extcon core : Fix indentation coding style and remove unnecessary casting - extcon-max8997 : Fix checkpatch warning - extcon-max77693 : Fix checkpatch warning - extcon-arizona : Fix typo of comment and modify minor issue - extcon-palmas : Use dev_get_platdata() 3. Modify extcon-arizona driver - Modify minor issue about micbias and comparision statement	2013-09-26 20:47:25 -07:00
Chanwoo Choi	42d7d7539a	extcon: Simplify extcon_dev_register() prototype by removing unnecessary parameter This patch remove extcon_dev_register()'s second parameter which means the pointer of parent device to simplify prototype of this function. So, if extcon device has the parent device, it should set the pointer of parent device to edev.dev.parent in extcon device driver instead of in extcon_dev_register(). Cc: Graeme Gregory <gg@slimlogic.co.uk> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>	2013-09-27 09:37:01 +09:00
Chanwoo Choi	dae6165124	extcon: Change field type of 'dev' in extcon_dev structure The extcon device must always need 'struct device' so this patch change field type of 'dev' instead of allocating memory for 'struct device' on extcon_dev_register() function. Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Myungjoo Ham <cw00.choi@samsung.com>	2013-09-27 09:37:01 +09:00
Guenter Roeck	5bfbdc9caa	extcon: gpio: Add support for active-low presence to detect pins This patch add 'gpio_active_low' field to 'struct gpio_extcon_data' to check whether gpio active state is 1(high) or 0(low). Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>	2013-09-27 09:37:01 +09:00
Chanwoo Choi	a75e1c73a4	extcon: Fix indentation coding style to improve readability Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>	2013-09-27 09:37:00 +09:00
Johan Hovold	3f9120b042	driver core: prevent deferred probe with platform_driver_probe Prevent drivers relying on platform_driver_probe from requesting deferred probing in order to avoid further futile probe attempts (either the driver has been unregistered or its probe function has been set to platform_drv_probe_fail when probing is retried). Note that several platform drivers currently return subsystem errors from probe and that these can include -EPROBE_DEFER (e.g. if a gpio request fails). Add a warning to platform_drv_probe that can be used to catch drivers that inadvertently request probe deferral while using platform_driver_probe. Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 16:18:32 -07:00
Jeff Mahoney	eee0316497	kobject: introduce kobj_completion A common way to handle kobject lifetimes in embedded in objects with different lifetime rules is to pair the kobject with a struct completion. This introduces a kobj_completion structure that can be used in place of the pairing, along with several convenience functions for initialization, release, and put-and-wait. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 16:17:33 -07:00
Tejun Heo	388975ccca	sysfs: clean up sysfs_get_dirent() The pre-existing sysfs interfaces which take explicit namespace argument are weird in that they place the optional @ns in front of @name which is contrary to the established convention. For example, we end up forcing vast majority of sysfs_get_dirent() users to do sysfs_get_dirent(parent, NULL, name), which is silly and error-prone especially as @ns and @name may be interchanged without causing compilation warning. This renames sysfs_get_dirent() to sysfs_get_dirent_ns() and swap the positions of @name and @ns, and sysfs_get_dirent() is now a wrapper around sysfs_get_dirent_ns(). This makes confusions a lot less likely. There are other interfaces which take @ns before @name. They'll be updated by following patches. This patch doesn't introduce any functional changes. v2: EXPORT_SYMBOL_GPL() wasn't updated leading to undefined symbol error on module builds. Reported by build test robot. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Kay Sievers <kay@vrfy.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 15:33:18 -07:00
Tejun Heo	4b30ee58ee	sysfs: remove ktype->namespace() invocations in symlink code There's no reason for sysfs to be calling ktype->namespace(). It is backwards, obfuscates what's going on and unnecessarily tangles two separate layers. There are two places where symlink code calls ktype->namespace(). * sysfs_do_create_link_sd() calls it to find out the namespace tag of the target directory. Unless symlinking races with cross-namespace renaming, this equals @target_sd->s_ns. * sysfs_rename_link() uses it to find out the new namespace to rename to and the new namespace can be different from the existing one. The function is renamed to sysfs_rename_link_ns() with an explicit @ns argument and the ktype->namespace() invocation is shifted to the device layer. While this patch replaces ktype->namespace() invocation with the recorded result in @target_sd, this shouldn't result in any behvior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 15:30:22 -07:00
Tejun Heo	e34ff49061	sysfs: remove ktype->namespace() invocations in directory code For some unrecognizable reason, namespace information is communicated to sysfs through ktype->namespace() callback when there's nothing which needs the use of a callback. The whole sequence of operations is completely synchronous and sysfs operations simply end up calling back into the layer which just invoked it in order to find out the namespace information, which is completely backwards, obfuscates what's going on and unnecessarily tangles two separate layers. This patch doesn't remove ktype->namespace() but shifts its handling to kobject layer. We probably want to get rid of the callback in the long term. This patch adds an explicit param to sysfs_{create\|rename\|move}_dir() and renames them to sysfs_{create\|rename\|move}_dir_ns(), respectively. ktype->namespace() invocations are moved to the calling sites of the above functions. A new helper kboject_namespace() is introduced which directly tests kobj_ns_type_operations->type which should give the same result as testing sysfs_fs_type(parent_sd) and returns @kobj's namespace tag as necessary. kobject_namespace() is extern as it will be used from another file in the following patches. This patch should be an equivalent conversion without any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 15:30:22 -07:00
Tejun Heo	58292cbe66	sysfs: make attr namespace interface less convoluted sysfs ns (namespace) implementation became more convoluted than necessary while trying to hide ns information from visible interface. The relatively recent attr ns support is a good example. * attr ns tag is determined by sysfs_ops->namespace() callback while dir tag is determined by kobj_type->namespace(). The placement is arbitrary. * Instead of performing operations with explicit ns tag, the namespace callback is routed through sysfs_attr_ns(), sysfs_ops->namespace(), class_attr_namespace(), class_attr->namespace(). It's not simpler in any sense. The only thing this convolution does is traversing the whole stack backwards. The namespace callbacks are unncessary because the operations involved are inherently synchronous. The information can be provided in in straight-forward top-down direction and reversing that direction is unnecessary and against basic design principles. This backward interface is unnecessarily convoluted and hinders properly separating out sysfs from driver model / kobject for proper layering. This patch updates attr ns support such that * sysfs_ops->namespace() and class_attr->namespace() are dropped. * sysfs_{create\|remove}_file_ns(), which take explicit @ns param, are added and sysfs_{create\|remove}_file() are now simple wrappers around the ns aware functions. * ns handling is dropped from sysfs_chmod_file(). Nobody uses it at this point. sysfs_chmod_file_ns() can be added later if necessary. * Explicit @ns is propagated through class_{create\|remove}_file_ns() and netdev_class_{create\|remove}_file_ns(). * driver/net/bonding which is currently the only user of attr namespace is updated to use netdev_class_{create\|remove}_file_ns() with @bh->net as the ns tag instead of using the namespace callback. This patch should be an equivalent conversion without any functional difference. It makes the code easier to follow, reduces lines of code a bit and helps proper separation and layering. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Kay Sievers <kay@vrfy.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 14:50:01 -07:00
K. Y. Srinivasan	3a4916050b	Drivers: hv: util: Correctly support ws2008R2 and earlier The current code does not correctly negotiate the version numbers for the util driver when hosted on earlier hosts. The version numbers presented by this driver were not compatible with the version numbers supported by Windows Server 2008. Fix this problem. I would like to thank Olaf Hering (ohering@suse.com) for identifying the problem. Reported-by: Olaf Hering <ohering@suse.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 14:20:21 -07:00
Arend van Spriel	2bedea8f26	bcma: make bcma_core_pci_{up,down}() callable from atomic context This patch removes the bcma_core_pci_power_save() call from the bcma_core_pci_{up,down}() functions as it tries to schedule thus requiring to call them from non-atomic context. The function bcma_core_pci_power_save() is now exported so the calling module can explicitly use it in non-atomic context. This fixes the 'scheduling while atomic' issue reported by Tod Jackson and Joe Perches. [ 13.210710] BUG: scheduling while atomic: dhcpcd/1800/0x00000202 [ 13.210718] Modules linked in: brcmsmac nouveau coretemp kvm_intel kvm cordic brcmutil bcma dell_wmi atl1c ttm mxm_wmi wmi [ 13.210756] CPU: 2 PID: 1800 Comm: dhcpcd Not tainted 3.11.0-wl #1 [ 13.210762] Hardware name: Alienware M11x R2/M11x R2, BIOS A04 11/23/2010 [ 13.210767] ffff880177c92c40 ffff880170fd1948 ffffffff8169af5b 0000000000000007 [ 13.210777] ffff880170fd1ab0 ffff880170fd1958 ffffffff81697ee2 ffff880170fd19d8 [ 13.210785] ffffffff816a19f5 00000000000f4240 000000000000d080 ffff880170fd1fd8 [ 13.210794] Call Trace: [ 13.210813] [<ffffffff8169af5b>] dump_stack+0x4f/0x84 [ 13.210826] [<ffffffff81697ee2>] __schedule_bug+0x43/0x51 [ 13.210837] [<ffffffff816a19f5>] __schedule+0x6e5/0x810 [ 13.210845] [<ffffffff816a1c34>] schedule+0x24/0x70 [ 13.210855] [<ffffffff816a04fc>] schedule_hrtimeout_range_clock+0x10c/0x150 [ 13.210867] [<ffffffff810684e0>] ? update_rmtp+0x60/0x60 [ 13.210877] [<ffffffff8106915f>] ? hrtimer_start_range_ns+0xf/0x20 [ 13.210887] [<ffffffff816a054e>] schedule_hrtimeout_range+0xe/0x10 [ 13.210897] [<ffffffff8104f6fb>] usleep_range+0x3b/0x40 [ 13.210910] [<ffffffffa00371af>] bcma_pcie_mdio_set_phy.isra.3+0x4f/0x80 [bcma] [ 13.210921] [<ffffffffa003729f>] bcma_pcie_mdio_write.isra.4+0xbf/0xd0 [bcma] [ 13.210932] [<ffffffffa0037498>] bcma_pcie_mdio_writeread.isra.6.constprop.13+0x18/0x30 [bcma] [ 13.210942] [<ffffffffa00374ee>] bcma_core_pci_power_save+0x3e/0x80 [bcma] [ 13.210953] [<ffffffffa003765d>] bcma_core_pci_up+0x2d/0x60 [bcma] [ 13.210975] [<ffffffffa03dc17c>] brcms_c_up+0xfc/0x430 [brcmsmac] [ 13.210989] [<ffffffffa03d1a7d>] brcms_up+0x1d/0x20 [brcmsmac] [ 13.211003] [<ffffffffa03d2498>] brcms_ops_start+0x298/0x340 [brcmsmac] [ 13.211020] [<ffffffff81600a12>] ? cfg80211_netdev_notifier_call+0xd2/0x5f0 [ 13.211030] [<ffffffff815fa53d>] ? packet_notifier+0xad/0x1d0 [ 13.211064] [<ffffffff81656e75>] ieee80211_do_open+0x325/0xf80 [ 13.211076] [<ffffffff8106ac09>] ? __raw_notifier_call_chain+0x9/0x10 [ 13.211086] [<ffffffff81657b41>] ieee80211_open+0x71/0x80 [ 13.211101] [<ffffffff81526267>] __dev_open+0x87/0xe0 [ 13.211109] [<ffffffff8152650c>] __dev_change_flags+0x9c/0x180 [ 13.211117] [<ffffffff815266a3>] dev_change_flags+0x23/0x70 [ 13.211127] [<ffffffff8158cd68>] devinet_ioctl+0x5b8/0x6a0 [ 13.211136] [<ffffffff8158d5c5>] inet_ioctl+0x75/0x90 [ 13.211147] [<ffffffff8150b38b>] sock_do_ioctl+0x2b/0x70 [ 13.211155] [<ffffffff8150b681>] sock_ioctl+0x71/0x2a0 [ 13.211169] [<ffffffff8114ed47>] do_vfs_ioctl+0x87/0x520 [ 13.211180] [<ffffffff8113f159>] ? ____fput+0x9/0x10 [ 13.211198] [<ffffffff8106228c>] ? task_work_run+0x9c/0xd0 [ 13.211202] [<ffffffff8114f271>] SyS_ioctl+0x91/0xb0 [ 13.211208] [<ffffffff816aa252>] system_call_fastpath+0x16/0x1b [ 13.211217] NOHZ: local_softirq_pending 202 The issue was introduced in v3.11 kernel by following commit: commit `aa51e598d0` Author: Hauke Mehrtens <hauke@hauke-m.de> Date: Sat Aug 24 00:32:31 2013 +0200 brcmsmac: use bcma PCIe up and down functions replace the calls to bcma_core_pci_extend_L1timer() by calls to the newly introduced bcma_core_pci_ip() and bcma_core_pci_down() Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Cc: Arend van Spriel <arend@broadcom.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> This fix has been discussed with Hauke Mehrtens [1] selection option 3) and is intended for v3.12. Ref: [1] http://mid.gmane.org/5239B12D.3040206@hauke-m.de Cc: <stable@vger.kernel.org> # 3.11.x Cc: Tod Jackson <tod.jackson@gmail.com> Cc: Joe Perches <joe@perches.com> Cc: Rafal Milecki <zajec5@gmail.com> Cc: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Hante Meuleman <meuleman@broadcom.com> Signed-off-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-09-26 14:02:33 -04:00
Greg Kroah-Hartman	1fdde16d1f	hv: delete struct hv_dev_port_info It's no longer needed, and the struct hv_ring_buffer_debug_info structure shouldn't be "global" so move it to the local .h file instead. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:17 -07:00
Greg Kroah-Hartman	2c9be3eacc	hv: delete vmbus_get_debug_info() It's only used once, only contains 2 function calls, so just make those calls directly, deleting the function, and the now unneeded structure entirely. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:17 -07:00
Greg Kroah-Hartman	4947c7453b	hv: move "client/server_monitor_conn_id" bus attributes to dev_groups This moves the "client_monitor_conn_id" and "server_monitor_conn_id" bus attributes to the dev_groups structure, removing the need for it to be in a temporary structure. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:17 -07:00
Greg Kroah-Hartman	1cee272b02	hv: move "client/server_monitor_latency" bus attributes to dev_groups This moves the "client_monitor_latency" and "server_monitor_latency" bus attributes to the dev_groups structure, removing the need for it to be in a temporary structure. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:16 -07:00
Greg Kroah-Hartman	76c52bbe5e	hv: move "client/server_monitor_pending" bus attributes to dev_groups This moves the "client_monitor_pending" and "server_monitor_pending" bus attributes to the dev_groups structure, removing the need for it to be in a temporary structure. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:16 -07:00
Greg Kroah-Hartman	7c55e1d0e6	hv: move "device_id" bus attribute to dev_groups This moves the "device_id" bus attribute to the dev_groups structure, removing the need for it to be in a temporary structure. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:16 -07:00
Greg Kroah-Hartman	68234c049c	hv: move "class_id" bus attribute to dev_groups This moves the "class_id" bus attribute to the dev_groups structure, removing the need for it to be in a temporary structure. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:16 -07:00
Greg Kroah-Hartman	5ffd00e241	hv: move "monitor_id" bus attribute to dev_groups This moves the "state" bus attribute to the dev_groups structure, removing the need for it to be in a temporary structure. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:15 -07:00
Greg Kroah-Hartman	a8fb5f3d58	hv: move "state" bus attribute to dev_groups This moves the "state" bus attribute to the dev_groups structure, removing the need for it to be in a temporary structure. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:15 -07:00
Greg Kroah-Hartman	03f3a9107f	hv: use dev_groups for device attributes This patch is the first in a series that moves the hv bus code to use the dev_groups field instead of dev_attrs, as dev_attrs is going away in future kernel releases. It moves the id sysfs file to the dev_groups structure, and creates the needed show/store functions, instead of relying on one "universal" function for this. By doing this, it removes the need for this to be in a temporary structure. Tested-by: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 09:01:15 -07:00
Trond Myklebust	5bc2afc2b5	NFSv4: Honour the 'opened' parameter in the atomic_open() filesystem method Determine if we've created a new file by examining the directory change attribute and/or the O_EXCL flag. This fixes a regression when doing a non-exclusive create of a new file. If the FILE_CREATED flag is not set, the atomic_open() command will perform full file access permissions checks instead of just checking for MAY_OPEN. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-26 10:20:18 -04:00
Lee Jones	82b0f4b7c5	clk: ux500: Copy u8500_clk_init() ready for DT enablement Here we're using the old clock initialisation function as a template. It's necessary to remove all of the clk_register_clkdev() calls as they don't make sense when booting with Device Tree. Cc: Mike Turquette <mturquette@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2013-09-26 11:05:26 +02:00
Lee Jones	67f13daadc	mfd: dbx500-prcmu: Move PRCMU numerical clock identifiers into DT include file These are required to request DBx500 PRCMU clocks from Device Tree. The numbers used are taken directly from the Hardware Specification document. We're moving them from the DBx500 PRCMU include file into the DT include directory and referencing them from the former via a #include. Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2013-09-26 11:04:12 +02:00
Lee Jones	b0f4fe1edf	mfd: dbx500-prcmu: Correctly reorder PRCMU clock identifiers ... as stipulated by the Hardware Specification document. Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2013-09-26 11:04:06 +02:00
David Herrmann	19872d20c8	HID: uhid: allocate static minor udev has this nice feature of creating "dead" /dev/<node> device-nodes if it finds a devnode:<node> modalias. Once the node is accessed, the kernel automatically loads the module that provides the node. However, this requires udev to know the major:minor code to use for the node. This feature was introduced by: commit `578454ff7e` Author: Kay Sievers <kay.sievers@vrfy.org> Date: Thu May 20 18:07:20 2010 +0200 driver core: add devname module aliases to allow module on-demand auto-loading However, uhid uses dynamic minor numbers so this doesn't actually work. We need to load uhid to know which minor it's going to use. Hence, allocate a static minor (just like uinput does) and we're good to go. Reported-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-09-26 11:03:29 +02:00
Peter Hurley	469d6d0631	tty: Remove unused drop() method from tty_port interface Although originally conceived as a hook for port drivers to know when a port reference is dropped, no driver uses this method. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-25 18:08:34 -07:00
Huang Rui	7868943db1	usb: core: implement AMD remote wakeup quirk The following patch is required to resolve remote wake issues with certain devices. Issue description: If the remote wake is issued from the device in a specific timing condition while the system is entering sleep state then it may cause system to auto wake on subsequent sleep cycle. Root cause: Host controller rebroadcasts the Resume signal > 100 µseconds after receiving the original resume event from the device. For proper function, some devices may require the rebroadcast of resume event within the USB spec of 100µS. Workaroud: 1. Filter the AMD platforms with Yangtze chipset, then judge of all the usb devices are mouse or not. And get out the port id which attached a mouse with Pixart controller. 2. Then reset the port which attached issue device during system resume from S3. [Q] Why the special devices are only mice? Would high speed devices such as 3G modem or USB Bluetooth adapter trigger this issue? - Current this sensitivity is only confined to devices that use Pixart controllers. This controller is designed for use with LS mouse devices only. We have not observed any other devices failing. There may be a small risk for other devices also but this patch (reset device in resume phase) will cover the cases if required. [Q] Shouldn’t the resume signal be sent within 100 us for every device? - The Host controller may not send the resume signal within 100us, this our host controller specification change. This is why we require the patch to prevent side effects on certain known devices. [Q] Why would clicking mouse INTENSELY to wake the system up trigger this issue? - This behavior is specific to the devices that use Pixart controller. It is timing dependent on when the resume event is triggered during the sleep state. [Q] Is it a host controller issue or mouse? - It is the host controller behavior during resume that triggers the device incorrect behavior on the next resume. This patch sets USB_QUIRK_RESET_RESUME flag for these Pixart-based mice when they attached to platforms with AMD Yangtze chipset. Signed-off-by: Huang Rui <ray.huang@amd.com> Suggested-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-25 17:24:37 -07:00
Linus Torvalds	e93dd910b9	A set of device-mapper fixes for 3.12. A few fixes for dm-snapshot, a 32 bit fix for dm-stats, a couple error handling fixes for dm-multipath. A fix for the thin provisioning target to not expose non-zero discard limits if discards are disabled. Lastly, add two DM module parameters which allow users to tune the emergency memory reserves that DM mainatins per device -- this helps fix a long-standing issue for dm-multipath. The conservative default reserve for request-based dm-multipath devices (256) has proven problematic for users with many multipathed SCSI devices but relatively little memory. To responsibly select a smaller value users should use the new nr_bios tracepoint info (via commit `75afb352` "block: Add nr_bios to block_rq_remap tracepoint") to determine the peak number of bios their workloads create. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJSQMVHAAoJEMUj8QotnQNaOXgIAJS6/XJKMoHfiDJ9M+XD34rZ Uyr9TEnubX3DKCRBiY23MUcCQn3fx6BjCGv5/c8L4jQFIuLyDi2yatqpwXcbGSJh G/S/y6u0Axek+ew7TS80OFop4nblW6MoKnoh9/4N55Ofa+1WvKM4ERUGjHGbauyS TxmLQPToCFPLYRIOZ+imd6hQuIZ1+FFdJFvi7kY9O6Llx2sLD6fWi1iruBd/Da2H ByMX3biGN45mSpcBzRbSC/FkJ9CRIvT9n82BDPS0o3Tllt8NaVlEDaovB7h4ncc0 bFuT2Z3Q38B9uZ8Lj0bqdGzv3kXMLCkLo6WhWjyUt84hmDPAzRpBwt60jUqWyZs= =bjVp -----END PGP SIGNATURE----- Merge tag 'dm-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device-mapper fixes from Mike Snitzer: "A few fixes for dm-snapshot, a 32 bit fix for dm-stats, a couple error handling fixes for dm-multipath. A fix for the thin provisioning target to not expose non-zero discard limits if discards are disabled. Lastly, add two DM module parameters which allow users to tune the emergency memory reserves that DM mainatins per device -- this helps fix a long-standing issue for dm-multipath. The conservative default reserve for request-based dm-multipath devices (256) has proven problematic for users with many multipathed SCSI devices but relatively little memory. To responsibly select a smaller value users should use the new nr_bios tracepoint info (via commit `75afb352` "block: Add nr_bios to block_rq_remap tracepoint") to determine the peak number of bios their workloads create" * tag 'dm-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: add reserved_bio_based_ios module parameter dm: add reserved_rq_based_ios module parameter dm: lower bio-based mempool reservation dm thin: do not expose non-zero discard limits if discards disabled dm mpath: disable WRITE SAME if it fails dm-snapshot: fix performance degradation due to small hash size dm snapshot: workaround for a false positive lockdep warning dm stats: fix possible counter corruption on 32-bit systems dm mpath: do not fail path on -ENOSPC	2013-09-25 15:12:46 -07:00
Greg Kroah-Hartman	e2aad1d571	Merge 3.12-rc2 into staging-next. This resolves the merge problem with two iio drivers that Stephen Rothwell pointed out. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-25 08:59:04 -07:00
Paul E. McKenney	5c173eb8bc	rcu: Consistent rcu_is_watching() naming The old rcu_is_cpu_idle() function is just __rcu_is_watching() with preemption disabled. This commit therefore renames rcu_is_cpu_idle() to rcu_is_watching. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:45:06 -07:00
Paul E. McKenney	cc6783f788	rcu: Is it safe to enter an RCU read-side critical section? There is currently no way for kernel code to determine whether it is safe to enter an RCU read-side critical section, in other words, whether or not RCU is paying attention to the currently running CPU. Given the large and increasing quantity of code shared by the idle loop and non-idle code, the this shortcoming is becoming increasingly painful. This commit therefore adds __rcu_is_watching(), which returns true if it is safe to enter an RCU read-side critical section on the currently running CPU. This function is quite fast, using only a __this_cpu_read(). However, the caller must disable preemption. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:44:41 -07:00
Mark Brown	fd792f8fbc	mfd: mc13xxx: Move SPI erratum workaround into SPI I/O function Move the workaround for double sending AUDIO_CODEC and AUDIO_DAC writes into the SPI core, aiding refactoring to eliminate the ASoC custom I/O functions and avoiding the extra writes for I2C. Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>	2013-09-25 13:47:30 +01:00
Peter Zijlstra	1a338ac32c	sched, x86: Optimize the preempt_schedule() call Remove the bloat of the C calling convention out of the preempt_enable() sites by creating an ASM wrapper which allows us to do an asm("call ___preempt_schedule") instead. calling.h bits by Andi Kleen Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-tk7xdi1cvvxewixzke8t8le1@git.kernel.org [ Fixed build error. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:23:07 +02:00
Peter Zijlstra	a233f1120c	sched: Prepare for per-cpu preempt_count When using per-cpu preempt_count variables we need to save/restore the preempt_count on context switch (into per task storage; for instance the old thread_info::preempt_count variable) because of PREEMPT_ACTIVE. However, this means that on fork() the preempt_count value of the last context switch gets copied and if we had a PREEMPT_ACTIVE switch right before cloning a child task the child task will now too have PREEMPT_ACTIVE set and start its life with an extra PREEMPT_ACTIVE count. Therefore we need to make init_task_preempt_count() unconditional; this resets whatever preempt_count we inherited from our parent process. Doing so for !per-cpu implementations is harmless. For !PREEMPT_COUNT kernels we need to be careful not to start life with an increased preempt_count. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-4k0b7oy1rcdyzochwiixuwi9@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:55 +02:00
Peter Zijlstra	bdb4380658	sched: Extract the basic add/sub preempt_count modifiers Rewrite the preempt_count macros in order to extract the 3 basic preempt_count value modifiers: __preempt_count_add() __preempt_count_sub() and the new: __preempt_count_dec_and_test() And since we're at it anyway, replace the unconventional $op_preempt_count names with the more conventional preempt_count_$op. Since these basic operators are equivalent to the previous _notrace() variants, do away with the _notrace() versions. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-ewbpdbupy9xpsjhg960zwbv8@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:54 +02:00
Peter Zijlstra	a787870924	sched, arch: Create asm/preempt.h In order to prepare to per-arch implementations of preempt_count move the required bits into an asm-generic header and use this for all archs. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:50 +02:00
Peter Zijlstra	f27dde8dee	sched: Add NEED_RESCHED to the preempt_count In order to combine the preemption and need_resched test we need to fold the need_resched information into the preempt_count value. Since the NEED_RESCHED flag is set across CPUs this needs to be an atomic operation, however we very much want to avoid making preempt_count atomic, therefore we keep the existing TIF_NEED_RESCHED infrastructure in place but at 3 sites test it and fold its value into preempt_count; namely: - resched_task() when setting TIF_NEED_RESCHED on the current task - scheduler_ipi() when resched_task() sets TIF_NEED_RESCHED on a remote task it follows it up with a reschedule IPI and we can modify the cpu local preempt_count from there. - cpu_idle_loop() for when resched_task() found tsk_is_polling(). We use an inverted bitmask to indicate need_resched so that a 0 means both need_resched and !atomic. Also remove the barrier() in preempt_enable() between preempt_enable_no_resched() and preempt_check_resched() to avoid having to reload the preemption value and allow the compiler to use the flags of the previuos decrement. I couldn't come up with any sane reason for this barrier() to be there as preempt_enable_no_resched() already has a barrier() before doing the decrement. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-7a7m5qqbn5pmwnd4wko9u6da@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:49 +02:00
Peter Zijlstra	4a2b4b2227	sched: Introduce preempt_count accessor functions Replace the single preempt_count() 'function' that's an lvalue with two proper functions: preempt_count() - returns the preempt_count value as rvalue preempt_count_set() - Allows setting the preempt-count value Also provide preempt_count_ptr() as a convenience wrapper to implement all modifying operations. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-orxrbycjozopqfhb4dxdkdvb@git.kernel.org [ Fixed build failure. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:32 +02:00
Peter Zijlstra	ea81174789	sched, idle: Fix the idle polling state logic Mike reported that commit `7d1a9417` ("x86: Use generic idle loop") regressed several workloads and caused excessive reschedule interrupts. The patch in question failed to notice that the x86 code had an inverted sense of the polling state versus the new generic code (x86: default polling, generic: default !polling). Fix the two prominent x86 mwait based idle drivers and introduce a few new generic polling helpers (fixing the wrong smp_mb__after_clear_bit usage). Also switch the idle routines to using tif_need_resched() which is an immediate TIF_NEED_RESCHED test as opposed to need_resched which will end up being slightly different. Reported-by: Mike Galbraith <bitbucket@online.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: lenb@kernel.org Cc: tglx@linutronix.de Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:53:10 +02:00
Peter Zijlstra	3150398626	sched: Remove {set,clear}_need_resched Preemption semantics are going to change which mandate a change. All DRM usage sites are already broken and will not be affected (much) by this change. DRM people are aware and will remove the last few stragglers. For now, leave an empty stub that generates a warning, once all users are gone we can remove this. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: airlied@linux.ie Cc: daniel.vetter@ffwll.ch Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/n/tip-qfc1el2zvhxiyut4ai99ij4n@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:53:09 +02:00
Roy Franz	ed37ddffe2	efi: Add proper definitions for some EFI function pointers. The x86/AMD64 EFI stubs must use a call wrapper to convert between the Linux and EFI ABIs, so void pointers are sufficient. For ARM, the ABIs are compatible, so we can directly invoke the function pointers. The functions that are used by the ARM stub are updated to match the EFI definitions. Also add some EFI types used by EFI functions. Signed-off-by: Roy Franz <roy.franz@linaro.org> Acked-by: Mark Salter <msalter@redhat.com> Reviewed-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-09-25 12:34:33 +01:00
Rob Herring	b0b8c960ff	of: clean-up ifdefs in of_irq.h Much of of_irq.h is needlessly ifdef'ed. Clean this up and minimize the amount ifdef'ed code. This fixes some build warnings when CONFIG_OF is not enabled (seen on i386 and x86_64): include/linux/of_irq.h:82:7: warning: 'struct device_node' declared inside parameter list [enabled by default] include/linux/of_irq.h:82:7: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] include/linux/of_irq.h:87:47: warning: 'struct device_node' declared inside parameter list [enabled by default] Compile tested on i386, sparc and arm. Reported-by: Randy Dunlap <rdunlap@infradead.org> Cc: Grant Likely <grant.likely@linaro.org> Signed-off-by: Rob Herring <rob.herring@calxeda.com>	2013-09-24 21:12:32 -05:00
Andrew Morton	0608f43da6	revert "memcg, vmscan: integrate soft reclaim tighter with zone shrinking code" Revert commit `3b38722efd` ("memcg, vmscan: integrate soft reclaim tighter with zone shrinking code") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Andrew Morton	b1aff7fcf8	revert "vmscan, memcg: do softlimit reclaim also for targeted reclaim" Revert commit `a5b7c87f92` ("vmscan, memcg: do softlimit reclaim also for targeted reclaim") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Andrew Morton	694fbc0fe7	revert "memcg: enhance memcg iterator to support predicates" Revert commit `de57780dc6` ("memcg: enhance memcg iterator to support predicates") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Michal Hocko	9809b18fcf	watchdog: update watchdog_thresh properly watchdog_tresh controls how often nmi perf event counter checks per-cpu hrtimer_interrupts counter and blows up if the counter hasn't changed since the last check. The counter is updated by per-cpu watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh period which guarantees that hrtimer is scheduled 2 times per the main period. Both hrtimer and perf event are started together when the watchdog is enabled. So far so good. But... But what happens when watchdog_thresh is updated from sysctl handler? proc_dowatchdog will set a new sampling period and hrtimer callback (watchdog_timer_fn) will use the new value in the next round. The problem, however, is that nobody tells the perf event that the sampling period has changed so it is ticking with the period configured when it has been set up. This might result in an ear ripping dissonance between perf and hrtimer parts if the watchdog_thresh is increased. And even worse it might lead to KABOOM if the watchdog is configured to panic on such a spurious lockup. This patch fixes the issue by updating both nmi perf even counter and hrtimers if the threshold value has changed. The nmi one is disabled and then reinitialized from scratch. This has an unpleasant side effect that the allocation of the new event might fail theoretically so the hard lockup detector would be disabled for such cpus. On the other hand such a memory allocation failure is very unlikely because the original event is deallocated right before. It would be much nicer if we just changed perf event period but there doesn't seem to be any API to do that right now. It is also unfortunate that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we cannot use on_each_cpu() and do the same thing from the per-cpu context. The update from the current CPU should be safe because perf_event_disable removes the event atomically before it clears the per-cpu watchdog_ev so it cannot change anything under running handler feet. The hrtimer is simply restarted (thanks to Don Zickus who has pointed this out) if it is queued because we cannot rely it will fire&adopt to the new sampling period before a new nmi event triggers (when the treshold is decreased). [akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place] Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Fabio Estevam <festevam@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Philip Avinash	f1a4c52ff5	ARM: davinci: gpio: use gpiolib API instead of inline functions Remove NEED_MACH_GPIO_H config select option for ARCH_DAVINCI to start using gpiolib interface for davinci platforms. This makes it easier to use the gpio driver on other platforms as it breaks dependency on mach-davinci. Latencies for gpio_get/set APIs will increase. On measurement, latency was found to have increased by 18 microsecond with gpiolib API as compared to inline APIs. Measurement was done on DA850 EVM for gpio_get_value() API by taking the printk timing across the call with interrupts disabled. inline gpio API with interrupt disabled [ 29.734337] before gpio_get [ 29.736847] after gpio_get Time difference 0.00251 gpio library with interrupt disabled [ 272.876763] before gpio_get [ 272.879291] after gpio_get Time difference 0.002528 Latency increased by (0.002528 - 0.00251) = 18 microsecond. While at it, remove GPIO_TYPE_DAVINCI enum definition as gpio-davinci.c is converted to Linux device driver model. Signed-off-by: Philip Avinash <avinashphilip@ti.com> Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> [nsekhar@ti.com: minor edits to commit message] Signed-off-by: Sekhar Nori <nsekhar@ti.com>	2013-09-25 04:16:37 +05:30
Li, Zhen-Hua	82aeef0bf0	x86/iommu: correct ICS register offset According to Intel Vt-D specs, the offset of Invalidation complete status register should be 0x9C, not 0x98. See Intel's VT-d spec, Revision 1.3, Chapter 10.4, Page 98; Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2013-09-24 13:04:07 +02:00
KV Sujith	118150f22d	gpio: davinci: move to platform device Modify DaVinci GPIO driver to become a platform device driver. The driver does not have platform driver structure or a probe. Instead, it has pure_initcall function for initialization. The platform specific informaiton is obtained using the DaVinci specific davinci_soc_info structure. This is a problem for Device Tree (DT) implementation. As a first stage of DT conversion, we implement a probe. Additional notes: - The driver registration happens as postcore_initcall. This is required since machine init functions like da850_lcd_hw_init() make use of GPIO. - Start using devres APIs for simpler error handling. Signed-off-by: KV Sujith <sujithkv@ti.com> [avinashphilip@ti.com: Move global definition of "davinci_gpio_controller" to local] Signed-off-by: Philip Avinash <avinashphilip@ti.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> [nsekhar@ti.com: drop unused structure member, rebase to new clean-up patch and fix error messages] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>	2013-09-24 10:31:51 +05:30
Greg Kroah-Hartman	dc4fea795b	Merge branch 'master' into usb-next We have USB fixes now in Linus's tree that we need to properly sort out with reverts and the like in the usb-next branch, so merge them together and do it by hand. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-23 13:24:10 -07:00
Lee Jones	71e1980c8d	iio: pressure-core: st: Provide support for the Vdd_IO power supply The power to some of the sensors are controlled by regulators. In most cases these are 'always on', but if not they will fail to work until the regulator is enabled using the relevant APIs. This patch allows for the Vdd_IO power supply to be specified by either platform data or Device Tree. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>	2013-09-23 20:17:58 +01:00
Lee Jones	774487611c	iio: pressure-core: st: Provide support for the Vdd power supply The power to some of the sensors are controlled by regulators. In most cases these are 'always on', but if not they will fail to work until the regulator is enabled using the relevant APIs. This patch allows for the Vdd power supply to be specified by either platform data or Device Tree. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>	2013-09-23 20:17:54 +01:00
Paul E. McKenney	2a855b644c	rcu: Make list_splice_init_rcu() account for RCU readers The list_splice_init_rcu() function allows a list visible to RCU readers to be spliced into another list visible to RCU readers. This is OK, except for the use of INIT_LIST_HEAD(), which does pointer updates without doing anything to make those updates safe for concurrent readers. Of course, most of the time INIT_LIST_HEAD() is being used in reader-free contexts, such as initialization or cleanup, so it is OK for it to update pointers in an unsafe-for-RCU-readers manner. This commit therefore creates an INIT_LIST_HEAD_RCU() that uses ACCESS_ONCE() to make the updates reader-safe. The reason that we can use ACCESS_ONCE() instead of the more typical rcu_assign_pointer() is that list_splice_init_rcu() is updating the pointers to reference something that is already visible to readers, so that there is no problem with pre-initialized values. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2013-09-23 09:13:49 -07:00
Theodore Ts'o	47d06e532e	random: run random_int_secret_init() run after all late_initcalls The some platforms (e.g., ARM) initializes their clocks as late_initcalls for some unknown reason. So make sure random_int_secret_init() is run after all of the late_initcalls are run. Cc: stable@vger.kernel.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-09-23 06:35:06 -04:00
Linus Torvalds	68cf8d0c72	Merge branch 'for-3.12/core' of git://git.kernel.dk/linux-block Pull block IO fixes from Jens Axboe: "After merge window, no new stuff this time only a collection of neatly confined and simple fixes" * 'for-3.12/core' of git://git.kernel.dk/linux-block: cfq: explicitly use 64bit divide operation for 64bit arguments block: Add nr_bios to block_rq_remap tracepoint If the queue is dying then we only call the rq->end_io callout. This leaves bios setup on the request, because the caller assumes when the blk_execute_rq_nowait/blk_execute_rq call has completed that the rq->bios have been cleaned up. bio-integrity: Fix use of bs->bio_integrity_pool after free blkcg: relocate root_blkg setting and clearing block: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) block: trace all devices plug operation	2013-09-22 15:00:11 -07:00
Greg Kroah-Hartman	3ffdea3fec	First round of new drivers, functionality and cleanups for IIO in the 3.13 cycle A number of new drivers and some new functionality + a lot of cleanups all over IIO. New Core Elements 1) New INT_TIME info_mask element for integration time, which may have different effects on measurement noise and similar, than an amplifier and hence is different from existing SCALE. Already existed in some drivers as a custom attribute. 2) Introduce a iio_push_buffers_with_timestamp helper to cover the common case of filling the last 64 bits of data to be passed to the buffer with a timestamp. Applied to lots of drivers. Cuts down on repeated code and moves a slightly fiddly bit of logic into a single location. 3) Introduce info_mask_[shared_by_dir/shared_by_all] elements to allow support of elements such as sampling_frequency which is typically shared by all input channels on a device. This reduces code and makes these controls available from in kernel consumers of IIO devices. New drivers 1) MCP3422/3/4 ADC 2) TSL4531 ambient light sensor 3) TCS3472/5 color light sensor 4) GP2AP020A00F ambient light / proximity sensor 5) LPS001WP support added to ST pressure sensor driver. New driver functionality 1) ti_am335x_adc Add buffered sampling support. This device has a hardware fifo that is fed directly into an IIO kfifo buffer based on a watershed interrupt. Note this will act as an example of how to handle this increasingly common type of device. The only previous example - sca3000 - take a less than optimal approach which is largely why it is still in staging. A couple of little cleanups for that new functionality followed later. Core cleanups: 1) MAINTAINERS - Sachin actually brought my email address up to date because I said I'd do it and never got around to it :) 2) Assign buffer list elements as single element lists to simplify the iio_buffer_is_active logic. 3) wake_up_interruptible_poll instead of wake_up_interruptible to only wake up threads waiting for poll notifications. 4) Add O_CLOEXEC flag to anon_inode_get_fd call for IIO event interface. 5) Change iio_push_to_buffers to take a void * pointer so as to avoid some annoying and unnecessary type casts. 6) iio_compute_scan_bytes incorrectly took a long rather than unsigned long. 7) Various minor tidy ups. Driver cleanups (in no particular order) 1) Another set of devm_ allocations patches from Sachin Kamat. 2) tsl2x7x - 0 to NULL cleanup. 3) hmc5843 - fix missing > in MODULE_AUTHOR 4) Set of strict_strto* to kstrto* conversions. 5) mxs-lradc - fix ordering of resource removal to match creation 6) mxs-lradc - add MODULE_ALIAS 7) adc7606 - drop a work pending test duplicated in core functions. 8) hmc5843 - devm_ allocation patch 9) Series of redundant breaks removed. 10) ad2s1200 - pr_err -> dev_err 11) adjd_s311 - use INT_TIME 12) ST sensors - large set of cleanups from Lee Jones and removed restriction to using only triggers provided by the st_sensors themselves from Dennis Ciocca. 13) dummy and tmp006 provide sampling_frequency via info_mask_shared_by_all. 14) tcs3472 - fix incorrect buffer size and wrong device pointer used in suspend / resume functions. 15) max1363 - use defaults for buffer setup ops as provided by the triggered buffer helpers as they are the same as were specified in max1363 driver. 16) Trivial tidy ups in a number of other drivers. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (GNU/Linux) iQIcBAABAgAGBQJSP0HUAAoJEFSFNJnE9BaIwiQQAKuJaoPrdMezm1TDaqgrzQWQ U95mSJ19xPYVSQVNHFLFidcajhADRMFhMUGOJF64VZObEdOtFWI0UkrJjFhYtJTt n1B6qAqPjatmruj434+n5PW32XtareOPThso5EDCAW0X+CNgSOgda+TVj+9g1Ilg Onltb3wugMcs27FakZpKv1YuGyKAKE6uT/33qr++cuynR89JZOlp0QmLgIXobVRR WdjuiH8OXFA4LsP7dWQhoSejs6+JPMn992qkACUc5fztQfFfCk0eJsgQIsOXkz1e U6MFvab0LtdPKDRyzT1kIpK/Jxf1OVNiOYaQNIGuNMipa+5WRz2lF1sZyERQTJWR HOZehkikBdL73WaaKwyaLTsYyDMbYM9ZkpLrBEFRr7ocZpg/0LA84BWYYDWu1Nok 9Ib9xNAxcAgFwQMJpiz9J3ap/IzV2qJT9rv78q1chVwhNhVDs2CbwcuZKAB4UvWs Oz7C0Xx5DA/K7DlpJMLaVB1+BRJ3C1I9Jbr84mnu0clgOqFE+nrdKZcUTrOTFXdy 2yTp7Bkc2JiRtOYhI40UL79N08KCGNTUfigmUDQseF2dsaNlz5rTOiMifYQCRw9+ C1kxY00emzlGTvfUDdPwkiQTtz8tWf9Ahvjx/ufGfed68KWDMs1VuGNcqEzgqKNI SMP0VTEXbCiLeWYMqGep =mMgm -----END PGP SIGNATURE----- Merge tag 'iio-for-3.13a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next Jonathan writes: First round of new drivers, functionality and cleanups for IIO in the 3.13 cycle A number of new drivers and some new functionality + a lot of cleanups all over IIO. New Core Elements 1) New INT_TIME info_mask element for integration time, which may have different effects on measurement noise and similar, than an amplifier and hence is different from existing SCALE. Already existed in some drivers as a custom attribute. 2) Introduce a iio_push_buffers_with_timestamp helper to cover the common case of filling the last 64 bits of data to be passed to the buffer with a timestamp. Applied to lots of drivers. Cuts down on repeated code and moves a slightly fiddly bit of logic into a single location. 3) Introduce info_mask_[shared_by_dir/shared_by_all] elements to allow support of elements such as sampling_frequency which is typically shared by all input channels on a device. This reduces code and makes these controls available from in kernel consumers of IIO devices. New drivers 1) MCP3422/3/4 ADC 2) TSL4531 ambient light sensor 3) TCS3472/5 color light sensor 4) GP2AP020A00F ambient light / proximity sensor 5) LPS001WP support added to ST pressure sensor driver. New driver functionality 1) ti_am335x_adc Add buffered sampling support. This device has a hardware fifo that is fed directly into an IIO kfifo buffer based on a watershed interrupt. Note this will act as an example of how to handle this increasingly common type of device. The only previous example - sca3000 - take a less than optimal approach which is largely why it is still in staging. A couple of little cleanups for that new functionality followed later. Core cleanups: 1) MAINTAINERS - Sachin actually brought my email address up to date because I said I'd do it and never got around to it :) 2) Assign buffer list elements as single element lists to simplify the iio_buffer_is_active logic. 3) wake_up_interruptible_poll instead of wake_up_interruptible to only wake up threads waiting for poll notifications. 4) Add O_CLOEXEC flag to anon_inode_get_fd call for IIO event interface. 5) Change iio_push_to_buffers to take a void * pointer so as to avoid some annoying and unnecessary type casts. 6) iio_compute_scan_bytes incorrectly took a long rather than unsigned long. 7) Various minor tidy ups. Driver cleanups (in no particular order) 1) Another set of devm_ allocations patches from Sachin Kamat. 2) tsl2x7x - 0 to NULL cleanup. 3) hmc5843 - fix missing > in MODULE_AUTHOR 4) Set of strict_strto* to kstrto* conversions. 5) mxs-lradc - fix ordering of resource removal to match creation 6) mxs-lradc - add MODULE_ALIAS 7) adc7606 - drop a work pending test duplicated in core functions. 8) hmc5843 - devm_ allocation patch 9) Series of redundant breaks removed. 10) ad2s1200 - pr_err -> dev_err 11) adjd_s311 - use INT_TIME 12) ST sensors - large set of cleanups from Lee Jones and removed restriction to using only triggers provided by the st_sensors themselves from Dennis Ciocca. 13) dummy and tmp006 provide sampling_frequency via info_mask_shared_by_all. 14) tcs3472 - fix incorrect buffer size and wrong device pointer used in suspend / resume functions. 15) max1363 - use defaults for buffer setup ops as provided by the triggered buffer helpers as they are the same as were specified in max1363 driver. 16) Trivial tidy ups in a number of other drivers.	2013-09-22 11:30:12 -07:00
Jun'ichi Nomura	75afb35299	block: Add nr_bios to block_rq_remap tracepoint Adding the number of bios in a remapped request to 'block_rq_remap' tracepoint. Request remapper clones bios in a request to track the completion status of each bio. So the number of bios can be useful information for investigation. Related discussions: http://www.redhat.com/archives/dm-devel/2013-August/msg00084.html http://www.redhat.com/archives/dm-devel/2013-September/msg00024.html Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-09-21 13:57:47 -06:00
Lars-Peter Clausen	d2c3d072c4	iio: Add iio_push_buffers_with_timestamp() helper Drivers using software buffers often store the timestamp in their data buffer before calling iio_push_to_buffers() with that data buffer. Storing the timestamp in the buffer usually involves some ugly pointer arithmetic. This patch adds a new helper function called iio_push_buffers_with_timestamp() which is similar to iio_push_to_buffers but takes an additional timestamp parameter. The function will help to hide to uglyness in one central place instead of exposing it in every driver. If timestamps are enabled for the IIO device iio_push_buffers_with_timestamp() will store the timestamp as the last element in buffer, before passing the buffer on to iio_push_buffers(). The buffer needs large enough to hold the timestamp in this case. If timestamps are disabled iio_push_buffers_with_timestamp() will behave just like iio_push_buffers(). Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Cc: Oleksandr Kravchenko <o.v.kravchenko@globallogic.com> Cc: Josh Wu <josh.wu@atmel.com> Cc: Denis Ciocca <denis.ciocca@gmail.com> Cc: Manuel Stahl <manuel.stahl@iis.fraunhofer.de> Cc: Ge Gao <ggao@invensense.com> Cc: Peter Meerwald <pmeerw@pmeerw.net> Cc: Jacek Anaszewski <j.anaszewski@samsung.com> Cc: Fabio Estevam <fabio.estevam@freescale.com> Cc: Marek Vasut <marex@denx.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>	2013-09-21 19:23:50 +01:00
Zubair Lutfullah	ca9a563805	iio: ti_am335x_adc: Add continuous sampling support Previously the driver had only one-shot reading functionality. This patch adds continuous sampling support to the driver. Continuous sampling starts when buffer is enabled. HW IRQ wakes worker thread that pushes samples to userspace. Sampling stops when buffer is disabled by userspace. Patil Rachna (TI) laid the ground work for ADC HW register access. Russ Dill (TI) fixed bugs in the driver relevant to FIFOs and IRQs. I fixed channel scanning so multiple ADC channels can be read simultaneously and pushed to userspace. Restructured the driver to fit IIO ABI. And added INDIO_BUFFER_HARDWARE mode. Signed-off-by: Zubair Lutfullah <zubair.lutfullah@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Russ Dill <Russ.Dill@ti.com> Acked-by: Lee Jones <lee.jones@linaro.org> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>	2013-09-21 19:23:47 +01:00
Andre Naujoks	c26d436cbf	lib: introduce upper case hex ascii helpers To be able to use the hex ascii functions in case sensitive environments the array hex_asc_upper[] and the needed functions for hex_byte_pack_upper() are introduced. Signed-off-by: Andre Naujoks <nautsch2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-20 15:38:26 -04:00
Laxman Dewangan	6112fe60ac	regmap: add helper macro to set min/max range of register Add helper macro to set the min and max value of the register range. This is useful when initialising the register ranges of the device like static const struct regmap_range readable_ranges[] = { regmap_reg_range(DEVICE_REG0, DEVICE_REG10), }; Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Mark Brown <broonie@linaro.org>	2013-09-20 17:50:46 +01:00
Mike Snitzer	f84cb8a46a	dm mpath: disable WRITE SAME if it fails Workaround the SCSI layer's problematic WRITE SAME heuristics by disabling WRITE SAME in the DM multipath device's queue_limits if an underlying device disabled it. The WRITE SAME heuristics, with both the original commit `5db44863b6` ("[SCSI] sd: Implement support for WRITE SAME") and the updated commit `66c28f971` ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling WRITE SAME(10) even without successfully determining it is supported. After the first failed WRITE SAME the SCSI layer will disable WRITE SAME for the device (by setting sdkp->device->no_write_same which results in 'max_write_same_sectors' in device's queue_limits to be set to 0). When a device is stacked ontop of such a SCSI device any changes to that SCSI device's queue_limits do not automatically propagate up the stack. As such, a DM multipath device will not have its WRITE SAME support disabled. This causes the block layer to continue to issue WRITE SAME requests to the mpath device which causes paths to fail and (if mpath IO isn't configured to queue when no paths are available) it will result in actual IO errors to the upper layers. This fix doesn't help configurations that have additional devices stacked ontop of the mpath device (e.g. LVM created linear DM devices ontop). A proper fix that restacks all the queue_limits from the bottom of the device stack up will need to be explored if SCSI will continue to use this model of optimistically allowing op codes and then disabling them after they fail for the first time. Before this patch: EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null) device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121) device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121 end_request: critical target error, dev dm-6, sector 528 dm-6: WRITE SAME failed. Manually zeroing. device-mapper: multipath: Failing path 8:112. end_request: I/O error, dev dm-6, sector 4616 dm-6: WRITE SAME failed. Manually zeroing. end_request: I/O error, dev dm-6, sector 4616 end_request: I/O error, dev dm-6, sector 5640 end_request: I/O error, dev dm-6, sector 6664 end_request: I/O error, dev dm-6, sector 7688 end_request: I/O error, dev dm-6, sector 524288 Buffer I/O error on device dm-6, logical block 65536 lost page write due to I/O error on dm-6 JBD2: Error -5 detected when updating journal superblock for dm-6-8. end_request: I/O error, dev dm-6, sector 524296 Aborting journal on device dm-6-8. end_request: I/O error, dev dm-6, sector 524288 Buffer I/O error on device dm-6, logical block 65536 lost page write due to I/O error on dm-6 JBD2: Error -5 detected when updating journal superblock for dm-6-8. # cat /sys/block/sdh/queue/write_same_max_bytes 0 # cat /sys/block/dm-6/queue/write_same_max_bytes 33553920 After this patch: EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null) device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121) device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121 end_request: critical target error, dev dm-6, sector 528 dm-6: WRITE SAME failed. Manually zeroing. # cat /sys/block/sdh/queue/write_same_max_bytes 0 # cat /sys/block/dm-6/queue/write_same_max_bytes 0 It should be noted that WRITE SAME support wasn't enabled in DM multipath until v3.10. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org # 3.10+	2013-09-20 10:36:34 -04:00
Jason Low	f48627e686	sched/balancing: Periodically decay max cost of idle balance This patch builds on patch 2 and periodically decays that max value to do idle balancing per sched domain by approximately 1% per second. Also decay the rq's max_idle_balance_cost value. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379096813-3032-4-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 12:03:46 +02:00
Jason Low	9bd721c55c	sched/balancing: Consider max cost of idle balance per sched domain In this patch, we keep track of the max cost we spend doing idle load balancing for each sched domain. If the avg time the CPU remains idle is less then the time we have already spent on idle balancing + the max cost of idle balancing in the sched domain, then we don't continue to attempt the balance. We also keep a per rq variable, max_idle_balance_cost, which keeps track of the max time spent on newidle load balances throughout all its domains so that we can determine the avg_idle's max value. By using the max, we avoid overrunning the average. This further reduces the chance we attempt balancing when the CPU is not idle for longer than the cost to balance. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379096813-3032-3-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 12:03:44 +02:00

... 2 3 4 5 6 ...

37448 Commits