Pull key handling fixes from James Morris:
"Quoting David Howells:
Here are three miscellaneous fixes:
(1) Fix a panic in some debugging code in PKCS#7. This can only
happen by explicitly inserting a #define DEBUG into the code.
(2) Fix the calculation of the digest length in the PE file parser.
This causes a failure where there should be a success.
(3) Fix the case where an X.509 cert can be added as an asymmetric key
to a trusted keyring with no trust restriction if no AKID is
supplied.
Bugs (1) and (2) aren't particularly problematic, but (3) allows a
security check to be bypassed. Happily, this is a recent regression
and never made it into a released kernel"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Fix for erroneous trust of incorrectly signed X.509 certs
pefile: Fix the failure of calculation for digest
PKCS#7: Fix panic when referring to the empty AKID when DEBUG defined
Pull input fixes from Dmitry Torokhov:
"A few more fixes for the input subsystem:
- restore naming for tsc2005 touchscreens as some userspace match on it
- fix out of bound access in legacy keyboard driver
- fixup in RMI4 driver
Everything is tagged for stable as well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: tsc200x - report proper input_dev name
tty/vt/keyboard: fix OOB access in do_compute_shiftstate()
Input: synaptics-rmi4 - fix maximum size check for F12 control register 8
Pull libnvdimm fix from Dan Williams:
"This contains a regression fix for a problem that was introduced in
v4.7-rc6.
In 4.7-rc1 we introduced auto-probing for the ACPI DSM (device-
specific-method) format that the platform firmware implements for
nvdimm devices. We initially fixed a regression in probing the QEMU
DSM implementation by making acpi_check_dsm() tolerant of the way QEMU
reports the "0 DSMs supported" condition.
However, that broke HPE platforms since that tolerance caused the
driver to mistakenly match the 1-zero-byte response those platforms
give to "unknown" commands. Instead, we simply make the driver
tolerant of not finding any supported DSMs. This has been tested to
work with both QEMU and HPE platforms.
This commit has appeared in a -next release with no reported issues"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nfit: make DIMM DSMs optional
Merge tag 'gpio-v4.7-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
"Compile problem fix for Tegra.
Sorry to send this in the last minute but Ingo says this build failure
is very prominent so I'm not going to wait for v4.7 before sending it.
It is a case of COMPILE_TEST causing more problems than it solves and
I'm already swearing about me shooting myself in the foot with that
gun :("
* tag 'gpio-v4.7-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: tegra: don't auto-enable for COMPILE_TEST
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Michael Turquette:
"Fix a bug in the at91 clk driver, two compile time warnings in sunxi
clk drivers, and one bug in a sunxi clk driver introduced in the 4.7
merge window"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: at91: fix clk_programmable_set_parent()
clk: sunxi: remove unused variable
clk: sunxi: display: Add per-clock flags
clk: sunxi: tcon-ch1: Do not return a negative error in get_parent
Pull libata fix from Tejun Heo:
"Another fallout from max_sectors bump a couple years ago. The lite-on
optical drive times out on large requests"
* 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: LITE-ON CX1-JB256-HP needs lower max_sectors
Now one more regression fix in addition to the previous pull request:
two changes in the core part are for unusual error paths, while the
rest are the regular HD-audio fixes and one USB-audio regression fix.
Merge tag 'sound-4.7-fix2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"No surprise, just a few small fixes: a couple of changes are seen in
the core part, and both of them are rather for unusual error paths.
The rest are the regular HD-audio fixes and one USB-audio regression
fix"
* tag 'sound-4.7-fix2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Fix quirks code is not called
ALSA: hda: add AMD Stoney PCI ID with proper driver caps
ALSA: hda - fix use-after-free after module unload
ALSA: pcm: Free chmap at PCM free callback, too
ALSA: ctl: Stop notification after disconnection
ALSA: hda/realtek - add new pin definition in alc225 pin quirk table
Pull NVMe fix from Jens Axboe:
"Late addition here, it's basically a revert of a patch that was added
in this merge window, but has proven to cause problems.
This is swapping out the RCU based namespace protection with a good
old mutex instead"
* 'for-linus' of git://git.kernel.dk/linux-block:
nvme: Remove RCU namespace protection
With this command sequence:
modprobe plip
modprobe pps_parport
rmmod pps_parport
the pps_parport module causes this crash:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: parport_detach+0x1d/0x60 [pps_parport]
Oops: 0000 [#1] SMP
...
Call Trace:
parport_unregister_driver+0x65/0xc0 [parport]
SyS_delete_module+0x187/0x210
The sequence that builds up to this is:
1) plip is loaded and takes the parport device for exclusive use:
plip0: Parallel port at 0x378, using IRQ 7.
2) pps_parport then fails to grab the device:
pps_parport: parallel port PPS client
parport0: cannot grant exclusive access for device pps_parport
pps_parport: couldn't register with parport0
3) rmmod of pps_parport is then killed because it tries to access
pardev->name, but pardev (taken from port->cad) is NULL.
So add a check for NULL in the test there too.
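As a rough illustration of the check described above, here is a minimal userspace sketch. The types pardevice_sketch and parport_sketch and the helper detach_should_proceed() are hypothetical stand-ins invented for this example, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel's parport types. */
struct pardevice_sketch {
    const char *name;
};

struct parport_sketch {
    struct pardevice_sketch *cad; /* NULL when no device was attached */
};

/*
 * Returns 1 when the port is owned by the driver called drv_name and
 * teardown may proceed, 0 when detach has nothing to do.
 */
int detach_should_proceed(const struct parport_sketch *port,
                          const char *drv_name)
{
    const struct pardevice_sketch *pardev;

    assert(port != NULL && drv_name != NULL);
    pardev = port->cad;

    /* Test for NULL first; reading pardev->name would crash otherwise. */
    if (!pardev || strcmp(pardev->name, drv_name) != 0)
        return 0;
    return 1;
}
```

Putting the NULL test on the left of the || means the name comparison is never evaluated for a port that has no current device.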
Link: http://lkml.kernel.org/r/20160714115245.12651-1-jslaby@suse.cz
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The curly braces are missing here so we print stuff unintentionally.
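The effect of the missing braces can be sketched in isolation. The functions below are hypothetical and only count how many report lines would be emitted; they are not the slabinfo code itself.

```c
#include <assert.h>

/*
 * Without braces only the first statement is guarded by the if, so the
 * detail line is emitted even when there were no failures.
 */
int report_lines_without_braces(unsigned long failures)
{
    int lines = 0;

    assert(failures <= 1000000UL); /* arbitrary sanity bound for the sketch */
    if (failures)
        lines++; /* header line */
    lines++;     /* detail line: runs unconditionally */
    return lines;
}

/* With braces both lines are emitted only when there is something to report. */
int report_lines_with_braces(unsigned long failures)
{
    int lines = 0;

    assert(failures <= 1000000UL);
    if (failures) {
        lines++; /* header line */
        lines++; /* detail line */
    }
    return lines;
}
```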
Fixes: 9da4714a2d ('slub: slabinfo update for cmpxchg handling')
Link: http://lkml.kernel.org/r/20160715211243.GE19522@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are no parentheses around this macro and it causes a problem when
we do:
index = rand() % THRASH_SIZE;
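A small sketch of the precedence problem, assuming the macro is an unparenthesised product such as 1000 * 1000 (the two macro names and the helper functions below are made up for the example):

```c
#include <assert.h>

/* Assumed shape of the problematic definition: no parentheses. */
#define THRASH_SIZE_UNSAFE 1000 * 1000
/* Parenthesised form. */
#define THRASH_SIZE_SAFE (1000 * 1000)

/*
 * Expands to (value % 1000) * 1000, because % and * have the same
 * precedence and associate left to right.
 */
long index_unsafe(long value)
{
    assert(value >= 0);
    return value % THRASH_SIZE_UNSAFE;
}

/* Expands to value % (1000 * 1000), which is what was intended. */
long index_safe(long value)
{
    assert(value >= 0);
    return value % THRASH_SIZE_SAFE;
}
```

With the unparenthesised form, the result is always a multiple of 1000 and can be as large as 999000 for any input, so it is not a uniformly distributed index below the intended size.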
Link: http://lkml.kernel.org/r/20160715210953.GC19522@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
radix_tree_iter_retry() resets slot to NULL, but it doesn't reset tags.
A NULL slot and a non-zero iter.tags are then passed to
radix_tree_next_slot(), leading to a crash:
RIP: radix_tree_next_slot include/linux/radix-tree.h:473
find_get_pages_tag+0x334/0x930 mm/filemap.c:1452
....
Call Trace:
pagevec_lookup_tag+0x3a/0x80 mm/swap.c:960
mpage_prepare_extent_to_map+0x321/0xa90 fs/ext4/inode.c:2516
ext4_writepages+0x10be/0x2b20 fs/ext4/inode.c:2736
do_writepages+0x97/0x100 mm/page-writeback.c:2364
__filemap_fdatawrite_range+0x248/0x2e0 mm/filemap.c:300
filemap_write_and_wait_range+0x121/0x1b0 mm/filemap.c:490
ext4_sync_file+0x34d/0xdb0 fs/ext4/fsync.c:115
vfs_fsync_range+0x10a/0x250 fs/sync.c:195
vfs_fsync fs/sync.c:209
do_fsync+0x42/0x70 fs/sync.c:219
SYSC_fdatasync fs/sync.c:232
SyS_fdatasync+0x19/0x20 fs/sync.c:230
entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207
We must reset iterator's tags to bail out from radix_tree_next_slot()
and go to the slow-path in radix_tree_next_chunk().
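The idea can be sketched with a heavily simplified iterator. The struct iter_sketch and both functions below are hypothetical models written for this note; they only mimic the relationship between the tag bitmap and the slot pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a tagged chunk iterator. */
struct iter_sketch {
    unsigned long index;      /* index of the current slot */
    unsigned long next_index; /* where the next chunk lookup starts */
    unsigned long tags;       /* bit 0 = current slot, bit n = slot + n */
};

/* Retry: restart the lookup at the current index and drop cached tags. */
void **iter_retry_sketch(struct iter_sketch *iter)
{
    assert(iter != NULL);
    iter->next_index = iter->index;
    iter->tags = 0; /* without this, the fast path below would use a NULL slot */
    return NULL;
}

/* Fast path: returns NULL to send the caller to the slow chunk lookup. */
void **next_slot_tagged_sketch(void **slot, struct iter_sketch *iter)
{
    assert(iter != NULL);
    iter->tags >>= 1;
    if (iter->tags == 0)
        return NULL;            /* nothing left in this chunk */
    while (!(iter->tags & 1)) { /* skip untagged slots */
        iter->tags >>= 1;
        iter->index++;
        slot++;
    }
    iter->index++;
    return slot + 1;            /* only meaningful for a non-NULL slot */
}
```

Once the tags are cleared, the fast path returns NULL before it ever does arithmetic on the slot pointer.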
Fixes: 46437f9a55 ("radix-tree: fix race in gang lookup")
Link: http://lkml.kernel.org/r/1468495196-10604-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The memory controller has quite a bit of state that usually outlives the
cgroup and pins its CSS until said state disappears. At the same time
it imposes a 16-bit limit on the CSS ID space to economically store IDs
in the wild. Consequently, when we use cgroups to contain frequent but
small and short-lived jobs that leave behind some page cache, we quickly
run into the 64k limitations of outstanding CSSs. Creating a new cgroup
fails with -ENOSPC while there are only a few, or even no user-visible
cgroups in existence.
Although pinning CSSs past cgroup removal is common, there are only two
instances that actually need an ID after a cgroup is deleted: cache
shadow entries and swapout records.
Cache shadow entries reference the ID weakly and can deal with the CSS
having disappeared when it's looked up later. They pose no hurdle.
Swap-out records do need to pin the css to hierarchically attribute
swapins after the cgroup has been deleted; though the only pages that
remain swapped out after offlining are tmpfs/shmem pages. And those
references are under the user's control, so they are manageable.
This patch introduces a private 16-bit memcg ID and switches swap and
cache shadow entries over to using that. This ID can then be recycled
after offlining when the CSS remains pinned only by objects that don't
specifically need it.
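To make the lifetime split concrete, here is a toy model in userspace C. Everything in it is an assumption made for illustration: the four-entry ID space stands in for the 16-bit space, and group_sketch, pin_by_cache(), pin_by_swap() and friends are invented names, not the memcg API.

```c
#include <assert.h>

#define ID_MAX 4 /* tiny stand-in for the 16-bit ID space */

static unsigned char id_used[ID_MAX + 1]; /* slot 0 means "no ID" */

struct group_sketch {
    int id;       /* private ID, 0 once released */
    int id_refs;  /* users that need the ID itself */
    int obj_pins; /* users that only keep the object alive */
};

/* Returns 0 on success, -1 when the ID space is exhausted. */
int group_create(struct group_sketch *g)
{
    int id;

    assert(g != 0);
    for (id = 1; id <= ID_MAX; id++) {
        if (!id_used[id]) {
            id_used[id] = 1;
            g->id = id;
            g->id_refs = 1; /* held while the group is online */
            g->obj_pins = 0;
            return 0;
        }
    }
    return -1;
}

/* Drop one ID reference; the last one makes the ID reusable. */
void group_id_put(struct group_sketch *g)
{
    assert(g->id_refs > 0);
    if (--g->id_refs == 0) {
        id_used[g->id] = 0;
        g->id = 0;
    }
}

/* Page cache pins the object but does not need the ID. */
void pin_by_cache(struct group_sketch *g) { g->obj_pins++; }

/* A swap record pins the object and also needs the ID. */
void pin_by_swap(struct group_sketch *g) { g->obj_pins++; g->id_refs++; }

/* Offlining drops the reference that was held for the online group. */
void group_offline(struct group_sketch *g) { group_id_put(g); }
```

In this model a group that is pinned only by cache gives its ID back at offline time, while one with an outstanding swap record keeps it.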
This script demonstrates the problem by faulting one cache page in a new
cgroup and deleting it again:
set -e
mkdir -p pages
for x in `seq 128000`; do
[ $((x % 1000)) -eq 0 ] && echo $x
mkdir /cgroup/foo
echo $$ >/cgroup/foo/cgroup.procs
echo trex >pages/$x
echo $$ >/cgroup/cgroup.procs
rmdir /cgroup/foo
done
When run on an unpatched kernel, we eventually run out of possible IDs
even though there are no visible cgroups:
[root@ham ~]# ./cssidstress.sh
[...]
65000
mkdir: cannot create directory '/cgroup/foo': No space left on device
After this patch, the IDs get released upon cgroup destruction and the
cache and css objects get released once memory reclaim kicks in.
[hannes@cmpxchg.org: init the IDR]
Link: http://lkml.kernel.org/r/20160621154601.GA22431@cmpxchg.org
Fixes: b2052564e6 ("mm: memcontrol: continue cache reclaim from offlined groups")
Link: http://lkml.kernel.org/r/20160617162516.GD19084@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: John Garcia <john.garcia@mesosphere.io>
Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Nikolay Borisov <kernel@kyup.com>
Cc: <stable@vger.kernel.org> [3.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The RST cpp:function handler is very pedantic: it doesn't allow any
macros like __user on it:
Documentation/media/kapi/dtv-core.rst:28: WARNING: Error when parsing function declaration.
If the function has no return type:
Error in declarator or parameters and qualifiers
Invalid definition: Expecting "(" in parameters_and_qualifiers. [error at 8]
ssize_t dvb_ringbuffer_pkt_read_user (struct dvb_ringbuffer * rbuf, size_t idx, int offset, u8 __user * buf, size_t len)
--------^
If the function has a return type:
Error in declarator or parameters and qualifiers
If pointer to member declarator:
Invalid definition: Expected '::' in pointer to member (function). [error at 37]
ssize_t dvb_ringbuffer_pkt_read_user (struct dvb_ringbuffer * rbuf, size_t idx, int offset, u8 __user * buf, size_t len)
-------------------------------------^
If declarator-id:
Invalid definition: Expecting "," or ")" in parameters_and_qualifiers, got "*". [error at 102]
ssize_t dvb_ringbuffer_pkt_read_user (struct dvb_ringbuffer * rbuf, size_t idx, int offset, u8 __user * buf, size_t len)
------------------------------------------------------------------------------------------------------^
So, we have to remove it from the function prototype.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The objtool build fails in a cross-compiled environment on a non-x86
host with "ARCH=x86_64":
tools/objtool/objtool-in.o: In function `decode_instructions':
tools/objtool/builtin-check.c:276: undefined reference to `arch_decode_instruction'
We could override the ARCH environment variable and change it back to
x86, similar to what the objtool Makefile was doing before; but it's
tricky to override environment variables consistently.
Instead, take an approach similar to the one used by the Linux top-level Makefile
and introduce a SRCARCH Makefile variable which evaluates to "x86" when
ARCH is either "x86_64" or "x86".
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160722191920.ej62fnspnqurbaa7@treble
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
objtool's Makefile was setting up ARCH but only fixing up x86_64 ->
x86. Using Makefile.arch will do the necessary fixups for all arches.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-hbq0bbh03u2b722vozcyql31@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For tools that need to be always compiled with the host headers.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-907q32k2nep6q670dkxypmu6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cross-building it for ARM on Ubuntu 16.04 shows that we only get
the free() prototype by luck in other environments. Fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0ktfgmmyhcfw8ondka2013f3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I stumbled over a build error with COMPILE_TEST and CONFIG_OF
disabled:
drivers/gpio/gpio-tegra.c: In function 'tegra_gpio_probe':
drivers/gpio/gpio-tegra.c:603:9: error: 'struct gpio_chip' has no member named 'of_node'
The problem is that the newly added GPIO_TEGRA Kconfig symbol
does not have a dependency on CONFIG_OF. However, there is another
problem here as the driver gets enabled unconditionally whenever
COMPILE_TEST is set.
This fixes both problems, by making the symbol user-visible
when COMPILE_TEST is set and default-enabled for ARCH_TEGRA=y.
As a side-effect, it is now possible to compile-test a Tegra
kernel with GPIO support disabled, which is harmless.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 4dd4dd1d21 ("gpio: tegra: Allow compile test")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Currently, osd_weight and osd_state fields are updated in the encoding
order. This is wrong, because an incremental map may look like e.g.
new_up_client: { osd=6, addr=... } # set osd_state and addr
new_state: { osd=6, xorstate=EXISTS } # clear osd_state
Suppose osd6's current osd_state is EXISTS (i.e. osd6 is down). After
applying new_up_client, osd_state is changed to EXISTS | UP. Carrying
on with the new_state update, we flip EXISTS and leave osd6 in a weird
"!EXISTS but UP" state. A non-existent OSD is considered down by the
mapping code
    for (i = 0; i < pg->pg_temp.len; i++) {
        if (ceph_osd_is_down(osdmap, pg->pg_temp.osds[i])) {
            if (ceph_can_shift_osds(pi))
                continue;

            temp->osds[temp->size++] = CRUSH_ITEM_NONE;
and so requests get directed to the second OSD in the set instead of
the first, resulting in OSD-side errors like:
[WRN] : client.4239 192.168.122.21:0/2444980242 misdirected client.4239.1:2827 pg 2.5df899f2 to osd.4 not [1,4,6] in e680/680
and hung rbds on the client:
[ 493.566367] rbd: rbd0: write 400000 at 11cc00000 (0)
[ 493.566805] rbd: rbd0: result -6 xferred 400000
[ 493.567011] blk_update_request: I/O error, dev rbd0, sector 9330688
The fix is to decouple application from the decoding and:
- apply new_weight first
- apply new_state before new_up_client
- twiddle osd_state flags if marking in
- clear out some of the state if osd is destroyed
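The ordering problem from the osd6 example can be reproduced with plain bit operations. This is a sketch only: the flag values and both helper functions are assumptions made for the example, and only the new_state and new_up_client steps are modelled.

```c
#include <assert.h>

#define OSD_EXISTS 0x1u /* assumed flag values for the sketch */
#define OSD_UP     0x2u

/* Encoding order: new_up_client first, then the new_state xor. */
unsigned int apply_in_encoding_order(unsigned int state, unsigned int xorstate)
{
    assert((xorstate & ~(OSD_EXISTS | OSD_UP)) == 0);
    state |= OSD_EXISTS | OSD_UP; /* new_up_client */
    state ^= xorstate;            /* new_state */
    return state;
}

/* Reordered: new_state first, then new_up_client. */
unsigned int apply_state_first(unsigned int state, unsigned int xorstate)
{
    assert((xorstate & ~(OSD_EXISTS | OSD_UP)) == 0);
    state ^= xorstate;            /* new_state */
    state |= OSD_EXISTS | OSD_UP; /* new_up_client */
    return state;
}
```

Starting from EXISTS with xorstate=EXISTS, the first function ends with only UP set, which is the "!EXISTS but UP" state, while the second ends with EXISTS | UP.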
Fixes: http://tracker.ceph.com/issues/14901
Cc: stable@vger.kernel.org # 3.15+: 6dd74e44dc: libceph: set 'exists' flag for newly up osd
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
To allow for child request context the struct akcipher_request child_req
needs to be at the end of the structure.
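A rough userspace sketch of why the position matters, assuming the child request's private context is laid out in memory directly after the child request itself. The struct names, fields and helpers here are hypothetical and much simpler than the real request structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical child request; its private context follows it in memory. */
struct child_req_sketch {
    int op;
};

/* Hypothetical parent context. */
struct parent_ctx_sketch {
    int key_size;
    unsigned char *orig_dst;
    struct child_req_sketch child_req; /* must stay the last member */
};

/* The child's context starts right behind the child request. */
void *child_ctx(struct child_req_sketch *req)
{
    assert(req != NULL);
    return req + 1;
}

/* Allocate the parent context plus room for the child's context. */
struct parent_ctx_sketch *parent_ctx_alloc(size_t child_ctx_size)
{
    return calloc(1, sizeof(struct parent_ctx_sketch) + child_ctx_size);
}
```

Because child_req is last, whatever the child writes into its context lands in the extra space behind the parent structure. If another member followed child_req, those writes would overwrite it.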
Cc: stable@vger.kernel.org
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Both the intent and the effect of reserve_bios_regions() are simple:
reserve the range from the apparent BIOS start (suitably filtered)
through 1MB and, if the EBDA start address is sensible, extend that
reservation downward to cover the EBDA as well.
The code is overcomplicated, though, and contains head-scratchers
like:
if (ebda_start < BIOS_START_MIN)
ebda_start = BIOS_START_MAX;
That snippet is trying to say "if ebda_start < BIOS_START_MIN,
ignore it".
Simplify it: reorder the code so that it makes sense. This should
have no functional effect under any circumstances.
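The reordered logic can be pictured roughly as follows. This is a sketch rather than the patch itself: the helper name pick_reserve_start() is made up, and the numeric values chosen for BIOS_START_MIN and BIOS_START_MAX are assumptions for the example.

```c
#include <assert.h>

#define BIOS_START_MIN 0x20000UL  /* assumed: 128K */
#define BIOS_START_MAX 0x9f000UL  /* assumed: 640K - 4K */
#define ONE_MB         0x100000UL

/*
 * Returns the start of the range to reserve; the range then runs from
 * the returned address up to 1MB.
 */
unsigned long pick_reserve_start(unsigned long bios_start,
                                 unsigned long ebda_start)
{
    /* Filter the apparent BIOS start: implausible values fall back to the cap. */
    if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
        bios_start = BIOS_START_MAX;

    /* A sensible EBDA start extends the reservation downward. */
    if (ebda_start >= BIOS_START_MIN && ebda_start < bios_start)
        bios_start = ebda_start;

    assert(bios_start < ONE_MB);
    return bios_start;
}
```

Written this way, an EBDA start below BIOS_START_MIN is simply ignored instead of being overwritten with another constant.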
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Mario Limonciello <mario_limonciello@dell.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/ef89c0c761be20ead8bd9a3275743e6259b6092a.1469135598.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It doesn't just control probing for the EBDA -- it controls whether we
detect and reserve the <1MB BIOS regions in general.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Mario Limonciello <mario_limonciello@dell.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/55bd591115498440d461857a7b64f349a5d911f3.1469135598.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The upper dentry may become stale before we call ovl_lock_rename_workdir.
For example, someone could (mistakenly or maliciously) manually unlink(2)
it directly from upperdir.
To ensure it is not stale, let's look it up after ovl_lock_rename_workdir
and check if it matches the upper dentry.
Essentially, it is the same problem and similar solution as in
commit 11f3710417 ("ovl: verify upper dentry before unlink and rename").
Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
sock_cmsg_send() can return different error codes and not only
-EINVAL, and we should properly propagate them.
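The difference between the two behaviours, sketched with a hypothetical helper. parse_cmsg_sketch() below is an invented stand-in that fails in two different ways; it is not sock_cmsg_send().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in: 0 on success, or one of two negative errnos. */
int parse_cmsg_sketch(int kind)
{
    assert(kind >= 0);
    if (kind == 1)
        return -EINVAL;
    if (kind == 2)
        return -ERANGE;
    return 0;
}

/* Every failure is flattened into -EINVAL. */
int send_flattening_errors(int kind)
{
    if (parse_cmsg_sketch(kind))
        return -EINVAL;
    return 0;
}

/* The callee's error code is handed back unchanged. */
int send_propagating_errors(int kind)
{
    int err = parse_cmsg_sketch(kind);

    if (err)
        return err;
    return 0;
}
```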
Fixes: c14ac9451c ("sock: enable timestamping using control messages")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this patch, if you used gfs2_jadd to add new journals of a
size smaller than the existing journals, replaying those new journals
would withdraw. That's because function gfs2_replay_incr_blk was
using the number of journal blocks (jd_block) from the superblock's
journal pointer. In other words, "My journal's max size" rather than
"the journal we're replaying's size." This patch changes the function
to use the size of the pertinent journal rather than always using the
journal we happen to be using.
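A minimal sketch of the wrap-around, assuming the block counter simply wraps at the size of the journal passed in. The struct journal_sketch and the helper name are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified journal descriptor. */
struct journal_sketch {
    unsigned int blocks; /* number of blocks in this journal */
};

/* Advance blk by one, wrapping at the size of the journal being replayed. */
void replay_incr_blk_sketch(const struct journal_sketch *jd, unsigned int *blk)
{
    assert(jd != NULL && blk != NULL);
    assert(jd->blocks > 0 && *blk < jd->blocks);
    if (++*blk == jd->blocks)
        *blk = 0;
}
```

If the size of a larger journal were used while replaying a smaller one, the counter would run past the last block of the small journal instead of wrapping to 0.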
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
I don't think it is really possible to have a system where CPUID
enumerates support for XSAVE but that does not have FP/SSE
(they are "legacy" features and always present).
But, I did manage to hit this case in qemu when I enabled its
somewhat shaky XSAVE support. The bummer is that the FPU is set
up before we parse the command-line or have *any* console support
including earlyprintk. That turned what should have been an easy
thing to debug into a bit more of an odyssey.
So a BUG() here is worthless. All it does is guarantee that
if/when we hit this case we have an empty console. So, remove
the BUG() and try to limp along by disabling XSAVE and trying to
continue. Add a comment on why we are doing this, and also add
a common "out_disable" path for leaving fpu__init_system_xstate().
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Previous patches added support for Intel's AVX-512 instructions to the
kernel and perf tools instruction decoders.
AVX-512 instructions are documented in Intel Architecture Instruction
Set Extensions Programming Reference (February 2016).
Add a representative set of instructions to perf's "new instructions"
test, e.g.:
perf test "new instructions"
Or to view a particular instruction:
perf test -v "new instructions" 2>&1 | grep vbroadcasti64x4
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for Intel's AVX-512 instructions to perf tools instruction
decoder used by Intel PT. The kernel's instruction decoder was updated in
a previous patch.
AVX-512 instructions are documented in Intel Architecture Instruction Set
Extensions Programming Reference (February 2016).
AVX-512 instructions are identified by an EVEX prefix which, for the purpose
of instruction decoding, can be treated as though it were a 4-byte VEX
prefix.
Existing instructions which can now accept an EVEX prefix need not be
further annotated in the op code map (x86-opcode-map.txt). In the case of
new instructions, the op code map is updated accordingly.
Also add associated Mask Instructions that are used to manipulate mask
registers used in AVX-512 instructions.
A representative set of instructions is added to the perf tools new
instructions test in a subsequent patch.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for Intel's AVX-512 instructions to the instruction decoder.
AVX-512 instructions are documented in Intel Architecture Instruction
Set Extensions Programming Reference (February 2016).
AVX-512 instructions are identified by an EVEX prefix which, for the
purpose of instruction decoding, can be treated as though it were a
4-byte VEX prefix.
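As a rough illustration of spotting such a prefix, the sketch below checks for the 0x62 escape byte followed by three payload bytes. The function name, the constants and the is_64bit parameter are made up for this example and are not part of the decoder's interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EVEX_ESCAPE     0x62 /* first byte of an EVEX prefix */
#define EVEX_PREFIX_LEN 4    /* escape byte plus three payload bytes */

/* Returns 4 if the bytes start with an EVEX prefix, otherwise 0. */
size_t evex_prefix_len(const uint8_t *insn, size_t len, int is_64bit)
{
    assert(insn != NULL);
    if (len < EVEX_PREFIX_LEN || insn[0] != EVEX_ESCAPE)
        return 0;
    /*
     * Outside 64-bit mode 0x62 is the BOUND opcode unless the top two
     * bits of the following byte are both set.
     */
    if (!is_64bit && (insn[1] & 0xc0) != 0xc0)
        return 0;
    return EVEX_PREFIX_LEN;
}
```

The 32-bit check mirrors how the 2-byte and 3-byte VEX escapes are told apart from the legacy instructions that share their first byte.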
Existing instructions which can now accept an EVEX prefix need not be
further annotated in the op code map (x86-opcode-map.txt). In the case
of new instructions, the op code map is updated accordingly.
Also add associated Mask Instructions that are used to manipulate mask
registers used in AVX-512 instructions.
The 'perf tools' instruction decoder is updated in a subsequent patch.
And a representative set of instructions is added to the perf tools new
instructions test in a subsequent patch.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So the reserve_ebda_region() code has accumulated a number of
problems over the years that make it really difficult to read
and understand:
- The calculation of 'lowmem' and 'ebda_addr' is an unnecessarily
interleaved mess of first lowmem, then ebda_addr, then lowmem tweaks...
- 'lowmem' here means 'super low mem' - i.e. 16-bit addressable memory. In other
parts of the x86 code 'lowmem' means 32-bit addressable memory... This makes it
super confusing to read.
- It does not help at all that we have various memory range markers, half of which
are 'start of range', half of which are 'end of range' - but this crucial
property is not obvious in the naming at all ... gave me a headache trying to
understand all this.
- Also, the 'ebda_addr' name sucks: it highlights that it's an address (which is
obvious, all values here are addresses!), while it does not highlight that it's
the _start_ of the EBDA region ...
- 'BIOS_LOWMEM_KILOBYTES' says a lot of things, except that this is the only value
that is a pointer to a value, not a memory range address!
- The function name itself is a misnomer: it says 'reserve_ebda_region()' while
its main purpose is to reserve all the firmware ROM typically between 640K and
1MB, while the 'EBDA' part is only a small part of that ...
- Likewise, the paravirt quirk flag name 'ebda_search' is misleading as well: this
too should be about whether to reserve firmware areas in the paravirt case.
- In fact thinking about this as 'end of RAM' is confusing: what this function
*really* wants to reserve is firmware data and code areas! Once the thinking is
inverted from a mixed 'ram' and 'reserved firmware area' notion to a pure
'reserved area' notion everything becomes a lot clearer.
To improve all this rewrite the whole code (without changing the logic):
- Firstly invert the naming from 'lowmem end' to 'BIOS reserved area start'
and propagate this concept through all the variable names and constants.
BIOS_RAM_SIZE_KB_PTR // was: BIOS_LOWMEM_KILOBYTES
BIOS_START_MIN // was: INSANE_CUTOFF
ebda_start // was: ebda_addr
bios_start // was: lowmem
BIOS_START_MAX // was: LOWMEM_CAP
- Then clean up the name of the function itself by renaming it
to reserve_bios_regions() and renaming the ::ebda_search paravirt
flag to ::reserve_bios_regions.
- Fix up all the comments (fix typos), harmonize and simplify their
formulation and remove comments that become unnecessary due to
the much better naming all around.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Parallel build can sporadically fail because asn1 headers may
not be built yet by the time qat_asym_algs.o is compiled:
drivers/crypto/qat/qat_common/qat_asym_algs.c:55:32: fatal error: qat_rsapubkey-asn1.h: No such file or directory
#include "qat_rsapubkey-asn1.h"
Cc: stable@vger.kernel.org
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For a front merge, the maximum number of sectors of the
request must be checked against the front merge BIO sector,
not the current sector of the request.
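A minimal sketch of the idea, using simplified stand-in types rather than
the block layer's own structures. The sketch_ names and the chunk-boundary
limit model are assumptions for illustration; the point shown is only that
the limit is looked up at the sector the merged request would start at,
i.e. the sector of the bio being merged in front.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for a bio and a request. */
struct sketch_bio {
	uint64_t sector;      /* first sector of the bio */
	unsigned int sectors; /* length in sectors */
};

struct sketch_request {
	uint64_t pos;         /* current first sector of the request */
	unsigned int sectors; /* current length in sectors */
};

/*
 * Assumed limit model: a request may not cross a boundary of
 * chunk_sectors (a power of two), so the room available depends on
 * the offset the request starts at.
 */
static unsigned int sketch_max_sectors(uint64_t offset,
				       unsigned int chunk_sectors)
{
	return chunk_sectors - (unsigned int)(offset & (chunk_sectors - 1));
}

static bool sketch_front_merge_ok(const struct sketch_request *rq,
				  const struct sketch_bio *bio,
				  unsigned int chunk_sectors)
{
	/* A front merge needs the bio to end exactly where the request starts. */
	if (bio->sector + bio->sectors != rq->pos)
		return false;

	/* After the merge the request starts at bio->sector, not rq->pos. */
	return rq->sectors + bio->sectors <=
	       sketch_max_sectors(bio->sector, chunk_sectors);
}
```

With a request at sector 8 of length 8 and a 16 sector chunk, a 4 sector bio
at sector 4 fits (12 sectors are available from sector 4), whereas checking
against sector 8 would report only 8 sectors and reject the merge.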
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Before merging a bio into an existing request, io scheduler is called to
get its approval first. However, the requests that come from a plug
flush may get merged by block layer without consulting with io
scheduler.
In case of CFQ, this can cause fairness problems. For instance, if a
request gets merged into a low weight cgroup's request, high weight cgroup
now will depend on low weight cgroup to get scheduled. If high weight cgroup
needs that io request to complete before submitting more requests, then it
will also lose its timeslice.
The following script demonstrates the problem. Group g1 has a low weight, g2
and g3 have equal high weights but g2's requests are adjacent to g1's
requests so they are subject to merging. Due to these merges, g2 gets
poor disk time allocation.
cat > cfq-merge-repro.sh << "EOF"
#!/bin/bash
set -e
IO_ROOT=/mnt-cgroup/io
mkdir -p $IO_ROOT
if ! mount | grep -qw $IO_ROOT; then
mount -t cgroup none -oblkio $IO_ROOT
fi
cd $IO_ROOT
for i in g1 g2 g3; do
if [ -d $i ]; then
rmdir $i
fi
done
mkdir g1 && echo 10 > g1/blkio.weight
mkdir g2 && echo 495 > g2/blkio.weight
mkdir g3 && echo 495 > g3/blkio.weight
RUNTIME=10
(echo $BASHPID > g1/cgroup.procs &&
fio --readonly --name name1 --filename /dev/sdb \
--rw read --size 64k --bs 64k --time_based \
--runtime=$RUNTIME --offset=0k &> /dev/null)&
(echo $BASHPID > g2/cgroup.procs &&
fio --readonly --name name1 --filename /dev/sdb \
--rw read --size 64k --bs 64k --time_based \
--runtime=$RUNTIME --offset=64k &> /dev/null)&
(echo $BASHPID > g3/cgroup.procs &&
fio --readonly --name name1 --filename /dev/sdb \
--rw read --size 64k --bs 64k --time_based \
--runtime=$RUNTIME --offset=256k &> /dev/null)&
sleep $((RUNTIME+1))
for i in g1 g2 g3; do
echo ---- $i ----
cat $i/blkio.time
done
EOF
# ./cfq-merge-repro.sh
---- g1 ----
8:16 162
---- g2 ----
8:16 165
---- g3 ----
8:16 686
After applying the patch:
# ./cfq-merge-repro.sh
---- g1 ----
8:16 90
---- g2 ----
8:16 445
---- g3 ----
8:16 471
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Provides the ability to identify DAX enabled devices in userspace.
Signed-off-by: Yigal Korman <yigal@plexistor.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Currently, presence of direct_access() in block_device_operations
indicates support of DAX on its block device. Because
block_device_operations is instantiated with 'const', this DAX
capability may not be enabled conditionally.
In preparation for supporting DAX to device-mapper devices, add
QUEUE_FLAG_DAX to request_queue flags to advertise their DAX
support. This will allow the DAX capability to be set based on how the
mapped device is composed.
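As a rough userspace model of why a per-queue flag helps: a flag word can be
changed at runtime, unlike a const operations table. Everything below is
hypothetical (the sketch_ names, the bit number, and the rule that a mapped
device is DAX capable only if all of its components are); it is not the
request_queue API.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_FLAG_DAX 0 /* hypothetical bit number */

/* Hypothetical stand-in for a request queue with a runtime flag word. */
struct sketch_queue {
	unsigned long flags;
};

static void sketch_queue_set_dax(struct sketch_queue *q, bool enable)
{
	if (enable)
		q->flags |= 1UL << SKETCH_QUEUE_FLAG_DAX;
	else
		q->flags &= ~(1UL << SKETCH_QUEUE_FLAG_DAX);
}

static bool sketch_queue_dax(const struct sketch_queue *q)
{
	return (q->flags >> SKETCH_QUEUE_FLAG_DAX) & 1UL;
}

/*
 * Assumed policy: advertise DAX on the mapped device only when every
 * underlying queue advertises it (and there is at least one).
 */
static void sketch_update_mapped_dax(struct sketch_queue *mapped,
				     const struct sketch_queue *const *parts,
				     size_t nr_parts)
{
	bool all_dax = nr_parts > 0;

	for (size_t i = 0; i < nr_parts; i++)
		if (!sketch_queue_dax(parts[i]))
			all_dax = false;

	sketch_queue_set_dax(mapped, all_dax);
}
```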
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <linux-s390@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Passes input_id struct to the common probe function for the tsc200x drivers
instead of just the bustype.
This allows for the use of the product variable to set the input_dev->name
variable according to the type of touchscreen used. Note that when we
introduced support for TSC2004 we started calling everything TSC200X, so
let's keep this quirk.
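A sketch of the naming choice, with a simplified stand-in for the input_id
struct. The product values 2004 and 2005 and both name strings are
assumptions made for the example, not taken from the driver.

```c
#include <stdint.h>

/* Simplified stand-in for the id passed to the common probe function. */
struct sketch_input_id {
	uint16_t bustype;
	uint16_t product;
};

/*
 * Assumed mapping: TSC2004 keeps the generic name it has had since it
 * was introduced, everything else gets the TSC2005 name.
 */
static const char *sketch_tsc200x_name(const struct sketch_input_id *id)
{
	return id->product == 2004 ? "TSC200X touchscreen"
				   : "TSC2005 touchscreen";
}
```

Passing the whole id rather than only the bustype is what makes the product
available at the point where the name is chosen.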
Signed-off-by: Michael Welling <mwelling@ieee.org>
Cc: stable@vger.kernel.org
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The size of individual keymap in drivers/tty/vt/keyboard.c is NR_KEYS,
which is currently 256, whereas number of keys/buttons in input device (and
therefore in key_down) is much larger - KEY_CNT - 768, and that can cause
out-of-bound access when we do
sym = U(key_maps[0][k]);
with large 'k'.
To fix it we should not attempt iterating beyond the smaller of NR_KEYS and
KEY_CNT.
Also while at it let's switch to for_each_set_bit() instead of open-coding
it.
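The bounded iteration can be modelled in userspace roughly as follows.
for_each_set_bit() is kernel-only, so this sketch open-codes the bit test;
the sketch_ names and the counting helper are hypothetical, and only the two
sizes (256 and 768) come from the description above.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_NR_KEYS 256 /* entries in one keymap */
#define SKETCH_KEY_CNT 768 /* bits in the key_down bitmap */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITMAP_LONGS \
	((SKETCH_KEY_CNT + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

static void sketch_set_bit(unsigned long *bitmap, size_t bit)
{
	bitmap[bit / SKETCH_BITS_PER_LONG] |= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

static int sketch_test_bit(const unsigned long *bitmap, size_t bit)
{
	return (bitmap[bit / SKETCH_BITS_PER_LONG] >>
		(bit % SKETCH_BITS_PER_LONG)) & 1UL;
}

/*
 * Count pressed keys that have a non-zero keymap entry. The keymap has
 * only SKETCH_NR_KEYS entries, so the loop stops at the smaller of the
 * two sizes and never indexes the keymap with a larger key number.
 */
static unsigned int sketch_count_mapped_keys(const unsigned long *key_down,
					     const unsigned short *keymap)
{
	size_t limit = SKETCH_NR_KEYS < SKETCH_KEY_CNT ? SKETCH_NR_KEYS
						       : SKETCH_KEY_CNT;
	unsigned int count = 0;

	for (size_t k = 0; k < limit; k++)
		if (sketch_test_bit(key_down, k) && keymap[k])
			count++;

	return count;
}
```

A key such as number 300 may well be down, but it is simply skipped because
no keymap entry exists for it.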
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
This reverts commit 47d6d752b9.
Commit f42ddca7be (doc-rst: kernel-doc directive, fix state machine
reporter) from Markus Heiser provides a better fix, so this configuration
change is no longer needed.
Add a reporter replacement that assigns the correct source name and line
number to a system message, as recorded in a ViewList.
[1] http://mid.gmane.org/CAKMK7uFMQ2wOp99t-8v06Om78mi9OvRZWuQsFJD55QA20BB3iw@mail.gmail.com
Signed-off-by: Markus Heiser <markus.heiser@darmarIT.de>
Tested-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Now that the new Sphinx world order is taking over, the information in
kernel-doc-nano-HOWTO.txt is outmoded. I hate to remove it altogether,
since it's one of those files that people expect to find. But we can add a
warning and fix all the other pointers to it.
Reminded-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
memset the command buffers rather than the pointers to them.
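The general bug class looks like the sketch below. The struct and function
names are hypothetical and the 64 byte size is arbitrary; the driver's real
buffer types are not shown here.

```c
#include <string.h>

/* Hypothetical command buffer. */
struct sketch_cmd {
	unsigned char bytes[64];
};

static void sketch_clear_cmd(struct sketch_cmd *cmd)
{
	/*
	 * Wrong: memset(&cmd, 0, sizeof(cmd)) would zero the local pointer
	 * variable itself and leave the buffer it points to untouched.
	 *
	 * Right: pass the pointer and the size of the pointed-to object.
	 */
	memset(cmd, 0, sizeof(*cmd));
}
```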
Fixes: b3f63c3d5e ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
`changeable` is the "version" of mode page requested by the user.
It will be less confusing/misleading if we do not check it
"together" with the setting bits of the drive.
Not to mention that we currently have ata_mselect_*() implemented
in a way that each of them will serve exclusively a particular bit
on each page. The old style will hence make the condition look even
more unnecessarily arcane if the ata_msense_*() is reflecting more
than one bit.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Due to PCI subsystem behaviour, unloading the AHCI driver will disable
MSI and enable INTx. When the HBA supports MSIx or Multiple MSI (MMSI), the
driver's irq handler doesn't clear the GHC.IS register. This works well when
reading or writing data, although GHC.IS is always non-zero. But when
unloading the driver (or during any other operation which disables MSIx and
enables INTx), the PCI subsystem uses a config write (Rx04.bit10) to enable
INTx. Because GHC.IS is non-zero, the HBA will falsely assume some port
needs interrupt service and asserts INTx. To make things worse, when the
AHCI controller shares the same interrupt pin with another PCI device, that
PCI device's ISR will be called and nobody de-asserts the previous INTx.
This patch clears GHC.IS in ahci_port_stop() even when using MSIx or
MMSI to prevent this case. It ensures GHC.IS is zero before PCI subsystem
enables INTx.
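A toy model of the sequence, assuming GHC.IS behaves as a write-1-to-clear
register with one bit per port. The struct and the sketch_ helpers are
hypothetical and only simulate the register in memory; they are not the
libahci code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the HBA: irq_stat stands in for GHC.IS. */
struct sketch_hba {
	uint32_t irq_stat;
};

/* Models a register write with write-1-to-clear semantics. */
static void sketch_write_irq_stat(struct sketch_hba *hba, uint32_t val)
{
	hba->irq_stat &= ~val;
}

/* On port stop, clear this port's pending bit regardless of irq mode. */
static void sketch_port_stop(struct sketch_hba *hba, unsigned int port_no)
{
	sketch_write_irq_stat(hba, UINT32_C(1) << port_no);
}

/* Once INTx is enabled, any pending bit makes the HBA assert the line. */
static bool sketch_intx_would_assert(const struct sketch_hba *hba)
{
	return hba->irq_stat != 0;
}
```

In this model INTx stays quiet after the switch only if every port that left
a bit set in irq_stat has gone through sketch_port_stop() first.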
tj: Minor updates to the comment.
Signed-off-by: Raymond Pang <raymond_rule@hotmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>