TRSCER register is configured differently by SoCs. TRSCER of R-Car Gen2 is
RINT8 bit only valid, other bits are reserved bits. This removes access to
TRSCER register reserve bit by adding variable trscer_err_mask to
sh_eth_cpu_data structure, set the register information to each SoCs.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FDR register of R-Car set in fdr_value can have the original settings.
This sets the value that is suitable for each SoCs to fdr_value of R8A777x
and R8A779x.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-01-06
This series contains fixes to i40e only.
Jesse provides a fix for when the driver was polling with interrupts
disabled the hardware would occasionally not write back descriptors.
His fix causes the driver to detect this situation and force an interrupt
to fire which will flush the stuck descriptor.
Anjali provides a couple of fixes, the first corrects an issue where
the receive port checksum error counter was incrementing incorrectly with
UDP encapsulated tunneled traffic. The second fix resolves an issue where
the driver was examining the outer protocol layer to set the inner protocol
layer checksum offload. In the case of TCP over IPv6 over an IPv4 based
VXLAN, the inner checksum offloads would be set to look for IPv4/UDP
instead of IPv6/TCP, so fixed the issue so that the driver will look at
the proper layer for encapsulation offload settings.
v2: fixed a bug in patch 01 of the series, where the interrupt rate impacted
4 port workloads by reducing throughput.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Charles Shirron and Paul Cassella from Cray Inc have reported kswapd
stuck in a busy loop with nothing left to balance, but
kswapd_try_to_sleep() failing to sleep. Their analysis found the cause
to be a combination of several factors:
1. A process is waiting in throttle_direct_reclaim() on pgdat->pfmemalloc_wait
2. The process has been killed (by OOM in this case), but has not yet been
scheduled to remove itself from the waitqueue and die.
3. kswapd checks for throttled processes in prepare_kswapd_sleep():
if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
wake_up(&pgdat->pfmemalloc_wait);
return false; // kswapd will not go to sleep
}
However, for a process that was already killed, wake_up() does not remove
the process from the waitqueue, since try_to_wake_up() checks its state
first and returns false when the process is no longer waiting.
4. kswapd is running on the same CPU as the only CPU that the process is
allowed to run on (through cpus_allowed, or possibly single-cpu system).
5. CONFIG_PREEMPT_NONE=y kernel is used. If there's nothing to balance, kswapd
encounters no voluntary preemption points and repeatedly fails
prepare_kswapd_sleep(), blocking the process from running and removing
itself from the waitqueue, which would let kswapd sleep.
So, the source of the problem is that we prevent kswapd from going to
sleep until there are processes waiting on the pfmemalloc_wait queue,
and a process waiting on a queue is guaranteed to be removed from the
queue only when it gets scheduled. This was done to make sure that no
process is left sleeping on pfmemalloc_wait when kswapd itself goes to
sleep.
However, it isn't necessary to postpone kswapd sleep until the
pfmemalloc_wait queue actually empties. To prevent processes from being
left sleeping, it's actually enough to guarantee that all processes
waiting on pfmemalloc_wait queue have been woken up by the time we put
kswapd to sleep.
This patch therefore fixes this issue by substituting 'wake_up' with
'wake_up_all' and removing 'return false' in the code snippet from
prepare_kswapd_sleep() above. Note that if any process puts itself in
the queue after this waitqueue_active() check, or after the wake up
itself, it means that the process will also wake up kswapd - and since
we are under prepare_to_wait(), the wake up won't be missed. Also we
update the comment prepare_kswapd_sleep() to hopefully more clearly
describe the races it is preventing.
Fixes: 5515061d22 ("mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org> [3.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We are supposed to take one css reference per each memory page and per
each swap entry accounted to a memory cgroup. However, during task
charges migration we take a reference to the destination cgroup twice
per each swap entry: first in mem_cgroup_do_precharge()->try_charge()
and then in mem_cgroup_move_swap_account(), permanently leaking the
destination cgroup.
The hunk taking the second reference seems to be a leftover from the
pre-00501b531c472 ("mm: memcontrol: rewrite charge API") era. Remove it
to fix the leak.
Fixes: e8ea14cc6e (mm: memcontrol: take a css reference for each charged page)
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 3e32cb2e0a ("mm: memcontrol: lockless page counters")
accidentally switched the soft limit default from infinity to zero,
which turns all memcgs with even a single page into soft limit excessors
and engages soft limit reclaim on all of them during global memory
pressure. This makes global reclaim generally more aggressive, but also
inverts the meaning of existing soft limit configurations where unset
soft limits are usually more generous than set ones.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These are obsolete since commit e30825f186 ("mm/debug-pagealloc:
prepare boottime configurable") was merged. So remove them.
[pebolle@tiscali.nl: find obsolete Kconfig options]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix clashing values for O_PATH and FMODE_NONOTIFY on sparc. The
clashing O_PATH value was added in commit 5229645bdc ("vfs: add
nonconflicting values for O_PATH") but this can't be changed as it is
user-visible.
FMODE_NONOTIFY is only used internally in the kernel, but it is in the
same numbering space as the other O_* flags, as indicated by the comment
at the top of include/uapi/asm-generic/fcntl.h (and its use in
fs/notify/fanotify/fanotify_user.c). So renumber it to avoid the clash.
All of this has happened before (commit 12ed2e36c9: "fanotify:
FMODE_NONOTIFY and __O_SYNC in sparc conflict"), and all of this will
happen again -- so update the uniqueness check in fcntl_init() to
include __FMODE_NONOTIFY.
Signed-off-by: David Drysdale <drysdale@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In ocfs2_link(), the parent directory inode passed to function
ocfs2_lookup_ino_from_name() is wrong. Parameter dir is the parent of
new_dentry not old_dentry. We should get old_dir from old_dentry and
lookup old_dentry in old_dir in case another node remove the old dentry.
With this change, hard linking works again, when paths are relative with
at least one subdirectory. This is how the problem was reproducable:
# mkdir a
# mkdir b
# touch a/test
# ln a/test b/test
ln: failed to create hard link `b/test' => `a/test': No such file or directory
However when creating links in the same dir, it worked well.
Now the link gets created.
Fixes: 0e048316ff ("ocfs2: check existence of old dentry in ocfs2_link()")
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reported-by: Szabo Aron - UBIT <aron@ubit.hu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Tested-by: Aron Szabo <aron@ubit.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
My ISP finally gave up on the old mail address, so I am moving things
over to bitmath.org instead. Also change the status fields to better
reflect reality.
Signed-off-by: Henrik Rydberg <rydberg@bitmath.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tejun, while reviewing the code, spotted the following race condition
between the dirtying and truncation of a page:
__set_page_dirty_nobuffers() __delete_from_page_cache()
if (TestSetPageDirty(page))
page->mapping = NULL
if (PageDirty())
dec_zone_page_state(page, NR_FILE_DIRTY);
dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
if (page->mapping)
account_page_dirtied(page)
__inc_zone_page_state(page, NR_FILE_DIRTY);
__inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
which results in an imbalance of NR_FILE_DIRTY and BDI_RECLAIMABLE.
Dirtiers usually lock out truncation, either by holding the page lock
directly, or in case of zap_pte_range(), by pinning the mapcount with
the page table lock held. The notable exception to this rule, though,
is do_wp_page(), for which this race exists. However, do_wp_page()
already waits for a locked page to unlock before setting the dirty bit,
in order to prevent a race where clear_page_dirty() misses the page bit
in the presence of dirty ptes. Upgrade that wait to a fully locked
set_page_dirty() to also cover the situation explained above.
Afterwards, the code in set_page_dirty() dealing with a truncation race
is no longer needed. Remove it.
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Constantly forking task causes unlimited grow of anon_vma chain. Each
next child allocates new level of anon_vmas and links vma to all
previous levels because pages might be inherited from any level.
This patch adds heuristic which decides to reuse existing anon_vma
instead of forking new one. It adds counter anon_vma->degree which
counts linked vmas and directly descending anon_vmas and reuses anon_vma
if counter is lower than two. As a result each anon_vma has either vma
or at least two descending anon_vmas. In such trees half of nodes are
leafs with alive vmas, thus count of anon_vmas is no more than two times
bigger than count of vmas.
This heuristic reuses anon_vmas as few as possible because each reuse
adds false aliasing among vmas and rmap walker ought to scan more ptes
when it searches where page is might be mapped.
Link: http://lkml.kernel.org/r/20120816024610.GA5350@evergreen.ssec.wisc.edu
Fixes: 5beb493052 ("mm: change anon_vma linking to fix multi-process server scalability issue")
[akpm@linux-foundation.org: fix typo, per Rik]
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Daniel Forrest <dan.forrest@ssec.wisc.edu>
Tested-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Jerome Marchand <jmarchan@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org> [2.6.34+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
wait_consider_task() checks EXIT_ZOMBIE after EXIT_DEAD/EXIT_TRACE and
both checks can fail if we race with EXIT_ZOMBIE -> EXIT_DEAD/EXIT_TRACE
change in between, gcc needs to reload p->exit_state after
security_task_wait(). In this case ->notask_error will be wrongly
cleared and do_wait() can hang forever if it was the last eligible
child.
Many thanks to Arne who carefully investigated the problem.
Note: this bug is very old but it was pure theoretical until commit
b3ab03160d ("wait: completely ignore the EXIT_DEAD tasks"). Before
this commit "-O2" was probably enough to guarantee that compiler won't
read ->exit_state twice.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Arne Goedeke <el@laramies.com>
Tested-by: Arne Goedeke <el@laramies.com>
Cc: <stable@vger.kernel.org> [3.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In dlm_process_recovery_data, only when dlm_new_lock failed the ret will
be set to -ENOMEM. And in this case, newlock is definitely NULL. So
test newlock is meaningless, remove it.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull kbuild fix from Michal Marek:
"make mrproper / distclean stopped removing the generated debian/
directory in v3.16. This fixes it"
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: Fix removal of the debian/ directory
The introduction of the uapi directories in v3.7-rc1 moved some of the
generated headers from arch/*/include/generated to the uapi directory,
keeping the #include directives intact.
This creates a problem when bisecting, because the unversioned files are
not cleaned automatically by git and the compiler might include stale
headers as a result. Instead of cleaning them in the Makefiles, promote
arch/*/include/generated/uapi in the search path. Under normal
circumstances, there is no overlap between this uapi subdirectory and
its parent, so the include choices remain the same. We keep
arch/*/include/generated/uapi in the USERINCLUDE variable so that it is
usable standalone.
Note that we cannot completely swap the order of the uapi and
kernel-only directories, since the headers in include/uapi/asm-generic
are meant to be wrapped by their include/asm-generic counterparts when
building kernel code.
Reported-by: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Reported-by: David Drysdale <dmd@lurklurk.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
and STi drivers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUrj7qAAoJEEEQszewGV1zMe0P/jhTWx/hP6IsgwTxlZrxrQjP
jLzoRyA9WE3+Rxk+W5XFl/NBX4JJxcIJvJ1ffc5VbjVNV1gZSTnxGn44YdSY3aCW
7Gkw0GGz8JzbbeZH8jJyPNM01SAMzKHZOIx9xIl9e+97kSku/wPQvOnmMoqXgEjS
d2Jxrh8VscL8pUexNyqXzI4JNMPnQ/opuzP/ODWoQ9UCv5/2eRlHlam+347yTYAo
fI65aMHXG/vOjDqMs//vtIs84Lyeh1oQnt6O/wDduzECpaA4y2AKMJpWUalRASaJ
Rm1B7OEpYDg3HPTEh1lhEUGyYCX0uTzEovJG+t5MCnaxX2S5lU/RJBNfbTHLOkb3
8JKU2DWrbDnhweyaAJE7QeFD2y35gckz8hnhN+3nmOTBxlae7qXBTEMDk0YJzvC3
tYPFjoj+/P3IROgWOKDzG5MUSx9pgUFDiEhTvQZeuDWjwQ+9NOZq/xfoyR2yk0ob
sBPv7SMaiubhAcP0F5GkXjH3G8Kyb9tsOZZsZV5v12EZtb64c4HYqHvlveN4mKRP
MEKknTVz8vY4yXqEtap4H/T+19aLzHJdr/pA75KKSbmOZ25cCEMBV2Ruqk5m8RFy
KdbWDaiPaen28TTB0wMyW00jheHa8P1xMF8cwTxUBDH6pM3N9QTG9U270HaIvbAu
9/6w1W880RCQMox+phVT
=gZIz
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl fixes from Linus Walleij:
"Allright allright I've been lazy over christmas and New Years. Here
are a few collected pin control fixes eventually. Details:
A set of assorted pin control fixes for the Rockchip and STi drivers"
* tag 'pinctrl-v3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: st: Add irq_disable hook to st_gpio_irqchip
pinctrl: st: avoid multiple mutex lock
pinctrl: rockchip: Fix enable/disable/mask/unmask
pinctrl: rockchip: Handle wakeup pins
- Fix ACPI power management intialization for device objects
corresponding to devices that are not present at the init time
(the _STA control method returns 0 for them) and therefore should
not be regarded as power manageable (Rafael J Wysocki).
- Rename a structure field and two functions used by the ACPI
processor driver to make them less tied to architectures that
use APICs (both x86 and ia64) and more suitable for ARM64
processors (Hanjun Guo).
- Add a disable_native_backlight quirk for Dell XPS15 L521X
designed in an unusual way preventing native backlight from
working on that machine (Hans de Goede).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJUrcCTAAoJEILEb/54YlRxeMUQAKOHS8rlq0XtOxieufCks0Rq
e96ZExiTdLI/KYZCgqwkSJxR7w2981hbFVIcFccPpo1k9Z1YowSOvWcHvmn1RwHQ
lXDfCEqWssIXMn0zctH3Ob/uBggvWFx9g5y8slOZ6W2LaUzzNdA+ZqxIbNsgwNSN
vji2E1m2ZhcwgThPYeXNsvvWJzYyC3LI4nZ8UIKE8kUMM5oxYoIlrW/qjoHc8KNV
AlUv+e0Z43XHy5jHaS8sQrKGrAPrdUroDRcByw1MG/V8r4vSXepvNjOXXwEZ9SF9
xPp/bzLLRPK8FW4MQ2VEhTcO0FcjblDTQDhenDHKyjEonWOse5A9VbAyZ2dORfZ+
juDsO3xalYk+OoHjRDdzxkQLIyQOATTcxkyGxyJf+HjcD6nfsdibWzisM6nl7E9h
hpwGRhb5sgbWV9QRwmkkvFBP/uCsF3rHA/qCZQCFsxs0n3Ty5wiAg2JEHEL/kPF9
6vPsX4ttukw5MbNzTyHfXOm41Ula3u4EfJvTdQjWNdx9uifBjsosetw3ho6q4XmP
e6mCBFIU0Fxxfnmgij5x6ufGCBCrkpbLuZsC63HyaRDiRDN4DuftDLesVEySfJix
+FiqilOyiOmSXHHC17cF8YNYsxMdFo1mGJDpV1PS1gnh/xwmEgfzwdWrjso7Y76L
9MilW/AXspuXxegu6JDu
=NEY0
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are an ACPI device power management initialization fix (-stable
material), two commits renaming stuff in the ACPI processor driver to
make it more suitable for ARM64 processors and a new ACPI backlight
blacklist entry.
Specifics:
- Fix ACPI power management intialization for device objects
corresponding to devices that are not present at the init time (the
_STA control method returns 0 for them) and therefore should not be
regarded as power manageable (Rafael J Wysocki).
- Rename a structure field and two functions used by the ACPI
processor driver to make them less tied to architectures that use
APICs (both x86 and ia64) and more suitable for ARM64 processors
(Hanjun Guo).
- Add a disable_native_backlight quirk for Dell XPS15 L521X designed
in an unusual way preventing native backlight from working on that
machine (Hans de Goede)"
* tag 'pm+acpi-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: Add disable_native_backlight quirk for Dell XPS15 L521X
ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu()
ACPI / processor: Convert apic_id to phys_id to make it arch agnostic
ACPI / PM: Fix PM initialization for devices that are not present
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAVK1agvu3V2unywtrAQLBEg/9FfHWzF0CcjPeYRaHgpmf28F+VjbYCu3R
GoUHGhugEwkXOs0jIEBSLxpMjKLC6EhbypjFyQpKpvKUKxa2gwqc8NopFaeN8rhP
pCwyCTgVVmKmhpEct73jp0B7rw7HGQ9v87oLTBgdaDKy2RQkLLhgJZVC4SAr0/sn
NXXA/An+xrWyzrisT085Pb7Cklj27k44lbeJfgKBDNqXrAOYG5LiomZmWPphHxq8
6bvgQulZZQiHEsASRC2JNly6Ijh6HW5OWtSqKIxo904HDaU5DnZsWopaCdcAl3le
auLE9MRDdhG8oOuOrMqcj9hSieSe1fCAG7fFNqGLHwcXitxvNO/A6W2psCfe9/4N
umHEjs6F9OBqAnu0AsbAO+K1NgJAikUy56wxpucRK6lfcJ6/xDCKFN3xLoKP+d2J
AzhNRB7qMah9deVVudU+XuipYuv5F397XIRBJTbR0cvpNO6K71OS4YCaf+q8r5Ex
gFtwLWlcY5XEqVPSiARaQiNmlx1kpRQzhnu2soMl+Bd0l0lYaI99Xig/z04ay4L+
Ed+BYWZfu2T/ueDm/INA40b5ZF2y9Qt0RN1CzUyqras+ZIwGhGOeRJfsEvO33sx9
W8gnjnMDLynarwmajmfaJfvTaj1iyT6omQiYgsjT0fe/JDgnEgKri/XJqxRtcPy5
nmMN9CdT+hw=
=CGYU
-----END PGP SIGNATURE-----
Merge tag 'keys-fixes-20150107' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull keyrings fixes from David Howells:
"Two fixes:
- Fix for the order in which things are done during key garbage
collection to prevent named keyrings causing a crash
[CVE-2014-9529].
- Fix assoc_array to explicitly #include rcupdate.h to prevent
compilation errors under certain circumstances"
* tag 'keys-fixes-20150107' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
assoc_array: Include rcupdate.h for call_rcu() definition
KEYS: close race between key lookup and freeing
This fixes a couple of bugs triggered by hot-unplug
of virtio devices, as well as a regression in vhost-net.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUrQ7bAAoJECgfDbjSjVRptOUH/R8uZWxD2LY8cyUpGuUKkhXi
SX727WQXMYd5tZH0MAiakFvL8+rGz5A/MAZFr+kT1xjgt3WAYHk9CwdvPBpd0YlD
uawqr2t5t+AzGbNLmn5S7D2OD89Yzk6dOsvotgvjtlOxWTcMhZxS6ynNB6iSM4Ul
K/5DCnDxgm3BYyljtdAEStQS7pft1BITtK1BBmJZ5EYR2+sOKcNAM7imYKJ0XA52
Spvkn/SIqLCaapLcDg451f3T1BupqqceaUMO0XKDDrGcLesJC1X1nXcxs4Q6zMED
Zh6frLnoKrCvzInuIrW5VyLwLn72HTJZ1EtKmp/Q1IdfXXuoBhOZxS8mFnvG09E=
=q55J
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
pull virtio/vhost fixes from Michael Tsirkin:
"This fixes a couple of bugs triggered by hot-unplug of virtio devices,
as well as a regression in vhost-net"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost/net: length miscalculation
virtio_pci: document why we defer kfree
virtio_pci: defer kfree until release callback
virtio_pci: device-specific release callback
virtio: make del_vqs idempotent
Including:
* A domain structure leak fix in the Intel VT-d driver
* Compile error fix for the VMSA IPMMU driver because of the
IOMMU_EXEC -> IOMMU_NOEXEC conversion
* Two small cleanups as an aftermath of the merge window and the
domain-leak fix
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJUrU8AAAoJECvwRC2XARrjMLwP/3bv9V+bX4JdCDb0pIY/69kC
+7pQqrE1la4yFVYGwGrzUD8mcswkMudc0j+mgmWZAoM73AV7k9qTcPRhtOPDIijn
nE1JezbY0CHjkW2QnKkT9HLMoC5xxESurc1FFeNnAfQD3Lx9T9ZMcsnyT5z3AxHc
IhqWtRC1bcqEUXmnGCm+VjnzsVH4sxSWc7tGkS6gMflqjQgAd8Fla9EeStwQo0rX
g139MuS7m1CwPLh9yXN0sQ1PWdczYy3o3p4Q0j+XUtGdJalmSS32NcoJYBHVQgJf
IzOFqCvVgzWtaiYS9YqKSX+dOJ3sUlK1q8XPikCoDX6TcipTpEirKdzqxmEXCM8h
WZ0GMfFzSerCfVR9imlKiTuei1SM4t/hQy0VBqA1gOwVb50RVeyUDIRlDECPpnJT
9uZuB1B5kHGXmpiJ7cUDZjUHnx/wH/mEuJj1eSMXr+KEEMu9QQpM4JgK1eKz5Vw3
qaaJBOQuccPj2tOgW7TN5liHL2rj+KkDtDGgeUX+sga0yenV0zNfa6Qe5gEt2BFu
fMHJlrKjabMouPB0Vx5R221/Q1ZdzZkFZ2xsyyk02yQqNrvYB6n0ZEreuq39tPfJ
XIfIfOTHzpw4DxrGsaK2D8mTOtlnhwFnsalE92pMhE2eU3jprpa3MRDMz9vvPRnu
7lUl8pv2C8chfBh1miVy
=e3/b
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Including:
- a domain structure leak fix in the Intel VT-d driver
- compile error fix for the VMSA IPMMU driver because of the
IOMMU_EXEC -> IOMMU_NOEXEC conversion
- two small cleanups as an aftermath of the merge window and the
domain-leak fix"
* tag 'iommu-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/rockchip: Drop owner assignment from platform_drivers
iommu/vt-d: Remove dead code in device_notifier
iommu/vt-d: Fix dmar_domain leak in iommu_attach_device
iommu/ipmmu-vmsa: Change IOMMU_EXEC to IOMMU_NOEXEC
Pull crypto fixes from Herbert Xu:
"This fixes a build problem with sha-mb with old toolchains and an
implementation bug in the ctr(aes)/by8 branch of aesni-intel that's
enabled when AVX is available"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: sha-mb - Add avx2_supported check.
crypto: aesni - fix "by8" variant for 128 bit keys
irqmap is optional property, so priv->domain can be NULL if !irqmap.
Thus add NULL test for priv->domain before calling irq_domain_remove()
to prevent NULL pointer dereference.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The queues and device need to be locked when messing with them.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This freezes and stops all the queues on device shutdown and restarts
them on resume. This fixes hotplug and reset issues when the controller
is actively being used.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Aborts all requeued commands prior to killing the request_queue. For
commands that time out on a dying request queue, set the "Do Not Retry"
bit on the command status so the command cannot be requeued. Finanally, if
the driver is requested to abort a command it did not start, do nothing.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This protects admin queue access on shutdown. When the controller is
disabled, the queue is frozen to prevent new entry, and unfrozen on
resume, and fixes cq_vector signedness to not suspend a queue twice.
Since unfreezing the queue makes it available for commands, it requires
the queue be initialized, so this moves this part after that.
Special handling is done when the device is unresponsive during
shutdown. This can be optimized to not require subsequent commands to
timeout, but saving that fix for later.
This patch also removes the kill signals in this path that were left-over
artifacts from the blk-mq conversion and no longer necessary.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
add -lrt to fix undefined reference to `clock_gettime'
error seen when the test is compiled using gcc 4.6.4.
Signed-off-by: Andrey Skvortsov <andrej.skvortzov@gmail.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Since there is no gendisk associated with the admin queue, the driver
needs to hold a reference to it until all open references to the
controller are closed.
This also combines queue cleanup with freeing the tag set since these
should not be separate.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Once the nvme callback is set for a request, the driver can start it
and make it available for timeout handling. For timed out commands on a
device that is not initialized, this fixes potential deadlocks that can
occur on startup and shutdown when a device is unresponsive since they
can now be cancelled.
Asynchronous requests do not have any expected timeout, so these are
using the new "REQ_NO_TIMEOUT" request flags.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Requests that haven't been started prior to a queue dying can be ended
in error without waiting for them to start and time out.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Added code comment to explain why this is done.
Signed-off-by: Jens Axboe <axboe@fb.com>
Some types of requests may be started that are not gauranteed to ever
complete. This adds a request flag that a driver can use so mark the
request as such.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Adds a helper function a driver can use to abort requeued requests in
case any are pending when h/w queues are being removed.
Signed-off-by: Jens Axboe <axboe@fb.com>
Kicking requeued requests will start h/w queues in a work_queue, which
may alter the driver's requested state to temporarily stop them. This
patch exports a method to cancel the q->requeue_work so a driver can be
assured stopped h/w queues won't be started up before it is ready.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Drivers can iterate over all allocated request tags, but their callback
needs a way to know if the driver started the request in the first place.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
When the queue is set to dying, wake up tasks that are waiting on frozen
queue so they realize it is dying and abandon their request.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Modified by me to add a code comment on the need for the wakeup.
Signed-off-by: Jens Axboe <axboe@fb.com>
When perf report on TUI shows callchain it checks first node has
siblings to determine whether it needs to print percentage value.
But it missed a case that first node is NULL. So sometimes it segfaults
like below:
$ perf top -g
perf: Segmentation fault
-------- backtrace --------
perf[0x4fcefb]
/usr/lib/libc.so.6(+0x33b20)[0x7f2a35839b20]
perf(rb_next+0x8)[0x47d3d8]
perf[0x4f6058]
perf[0x4f833b]
perf[0x4f8610]
perf[0x4f209e]
perf(ui_browser__run+0x3a)[0x4f2e6a]
perf[0x4f94ee]
perf(perf_evlist__tui_browse_hists+0x94)[0x4fbbf4]
perf[0x444d10]
/usr/lib/libpthread.so.0(+0x7314)[0x7f2a37070314]
/usr/lib/libc.so.6(clone+0x6d)[0x7f2a358ee5bd]
$ addr2line -e `which perf` 0x4f6058
/home/namhyung/project/linux/tools/perf/ui/browsers/hists.c:553
I don't know why the backtrace didn't print some symbols..
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 4087d11cd9 ("perf hists browser: Print overhead percent value for first-level callchain")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1419401076-21700-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Markus reported that "perf top -g" can leak ~300MB per second on his
machine. This is partly because it missed to free callchains when hist
entries are deleted. Fix it.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20141230053813.GD6081@sejong
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When perf report --children resorts output fields, it tries to put
caller above the callee. But this was only meaningful for a same thread
and doing this requires callchain enabled. So fix its check before
comparing the callchain depth.
This also changes the hist accumulation tests: In test 3, xmalloc in
bash thread should be above than other perf threads due to alphabetical
order of comm string. Also it's under page_fault in bash thread since
alphabetical order of dso name. The sys_perf_event_open in perf thread
is put on the last line since it's self overhead is 0.
In test 4, the sys_perf_event_open is put above other perf entries that
have same children overhead since its callchain depth is smaller.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1419309381-2593-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In case kasprintf() fails in xen_setup_timer() we assign name to the
static string "<timer kasprintf failed>". We, however, don't check
that fact before issuing kfree() in xen_teardown_timer(), kernel is
supposed to crash with 'kernel BUG at mm/slub.c:3341!'
Solve the issue by making name a fixed length string inside struct
xen_clock_event_device. 16 bytes should be enough.
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
If the non-RAM regions in the e820 memory map are larger than the size
of the initial balloon, a BUG was triggered as the frames are remaped
beyond the limit of the linear p2m. The frames are remapped into the
initial balloon area (xen_extra_mem) but not enough of this is
available.
Ensure enough extra memory regions are added for these remapped
frames.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
This accounting is just used to print a diagnostic message that isn't
very useful.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
With recent changes in p2m we now have legitimate cases when
p2m memory needs to be freed during early boot (i.e. before
slab is initialized).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The early ioremap support introduced by patch bf4b558eba
("arm64: add early_ioremap support") failed to add a call to
early_ioremap_reset() at an appropriate time. Without this call,
invocations of early_ioremap etc. that are done too late will go
unnoticed and may cause corruption.
This is exactly what happened when the first user of this feature
was added in patch f84d02755f ("arm64: add EFI runtime services").
The early mapping of the EFI memory map is unmapped during an early
initcall, at which time the early ioremap support is long gone.
Fix by adding the missing call to early_ioremap_reset() to
setup_arch(), and move the offending early_memunmap() to right after
the point where the early mapping of the EFI memory map is last used.
Fixes: f84d02755f ("arm64: add EFI runtime services")
Cc: <stable@vger.kernel.org>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
pmd_to_page() is only available if USE_SPLIT_PMD_PTLOCKS is defined.
The use of pmd_to_page in the gmap code can cause compile errors if
NR_CPUS is smaller than SPLIT_PTLOCK_CPUS. Do not use pmd_to_page
outside of USE_SPLIT_PMD_PTLOCKS sections.
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
- 'perf probe' should fall back to find probe point in symbols when failing
to do so in a debuginfo file (Masami Hiramatsu)
- Fix 'perf probe' crash in dwarf_getcfi_elf (Namhyung Kim)
- Fix shell completion with 'perf list' --raw-dump option (Taesoo Kim)
- Fix 'perf diff' to sort by baseline field by default (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUp1VfAAoJEBpxZoYYoA71q8oIAJcsGVKioROAGgcaorZJlm4M
XeC1THov9Rly96h73rKn7PNXs+YDouixNc6F63rxjKxePpLk3TWxcRdhjkXZtkzQ
m4VJR2nZT8UC5OBBonuKB5dvDBrNnhvDYD3VR6cTThBuwKv7eZzLcNBdc2K/oa+y
fmvv+Mk/6Kv3Ru0giOkSbiMLKaWUDP3stKl10cVIsVP7lIdbDm5IceXJChWnK1Uo
ydEknREXyNm11dUQ2LR/PpO+R/mAq+/IfgYYbr1mJd67Zgc+iix2H/Y0yq+PtwuH
Vsw5W+LFAzjQid7PhgNbsYXviMrZ5Vt74tskD9Ux3vCpcjvF/MoCrUJrcAe+Epw=
=0JUz
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
"
- 'perf probe' should fall back to find probe point in symbols when failing
to do so in a debuginfo file (Masami Hiramatsu)
- Fix 'perf probe' crash in dwarf_getcfi_elf (Namhyung Kim)
- Fix shell completion with 'perf list' --raw-dump option (Taesoo Kim)
- Fix 'perf diff' to sort by baseline field by default (Namhyung Kim)
"
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Complete overhaul to the main IOCTL function, kfd_ioctl(), according to
drm_ioctl() example. This includes changing the IOCTL definitions, so it
breaks compatibility with previous versions of the userspace. However,
because the kernel was not officialy released yet, and this the first
kernel that includes amdkfd, I assume I can still do that at this stage.
- A couple of bug fixes for the non-HWS path (used for bring-ups and
debugging purposes only).
* tag 'amdkfd-fixes-2015-01-06' of git://people.freedesktop.org/~gabbayo/linux:
drm/amdkfd: rewrite kfd_ioctl() according to drm_ioctl()
drm/amdkfd: reformat IOCTL definitions to drm-style
drm/amdkfd: Do copy_to/from_user in general kfd_ioctl()
drm/amdkfd: unmap VMID<-->PASID when relesing VMID (non-HWS)
drm/radeon: Assign VMID to PASID for IH in non-HWS mode
drm/radeon: do not leave queue acquired if timeout happens in kgd_hqd_destroy()
drm/amdkfd: Load mqd to hqd in non-HWS mode
drm/amd: Fixing typos in kfd<->kgd interface