Large writes to uinput interface may cause rcu stalls. Let's add
cond_resched() to the loop to avoid this.
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Large writes to evdev interface may cause rcu stalls. Let's add
cond_resched() to the loop to avoid this.
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Daniel Borkmann says:
====================
pull-request: bpf 2018-10-05
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix to truncate input on ALU operations in 32 bit mode, from Jann.
2) Fixes for cgroup local storage to reject reserved flags on element
update and rejection of map allocation with zero-sized value, from Roman.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When I wrote commit 468f6eafa6 ("bpf: fix 32-bit ALU op verification"), I
assumed that, in order to emulate 64-bit arithmetic with 32-bit logic, it
is sufficient to just truncate the output to 32 bits; and so I just moved
the register size coercion that used to be at the start of the function to
the end of the function.
That assumption is true for almost every op, but not for 32-bit right
shifts, because those can propagate information towards the least
significant bit. Fix it by always truncating inputs for 32-bit ops to 32
bits.
Also get rid of the coerce_reg_to_size() after the ALU op, since that has
no effect.
Fixes: 468f6eafa6 ("bpf: fix 32-bit ALU op verification")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fix a commit 8a8158c85e ("MIPS: memset.S: EVA & fault support for
small_memset") regression and remove assembly warnings:
arch/mips/lib/memset.S: Assembler messages:
arch/mips/lib/memset.S:243: Warning: Macro instruction expanded into multiple instructions in a branch delay slot
triggering with the CPU_DADDI_WORKAROUNDS option set and this code:
PTR_SUBU a2, t1, a0
jr ra
PTR_ADDIU a2, 1
This is because with that option in place the DADDIU instruction, which
the PTR_ADDIU CPP macro expands to, becomes a GAS macro, which in turn
expands to an LI/DADDU (or actually ADDIU/DADDU) sequence:
13c: 01a4302f dsubu a2,t1,a0
140: 03e00008 jr ra
144: 24010001 li at,1
148: 00c1302d daddu a2,a2,at
...
Correct this by switching off the `noreorder' assembly mode and letting
GAS schedule this jump's delay slot, as there is nothing special about
it that would require manual scheduling. With this change in place
correct code is produced:
13c: 01a4302f dsubu a2,t1,a0
140: 24010001 li at,1
144: 03e00008 jr ra
148: 00c1302d daddu a2,a2,at
...
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 8a8158c85e ("MIPS: memset.S: EVA & fault support for small_memset")
Patchwork: https://patchwork.linux-mips.org/patch/20833/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: stable@vger.kernel.org # 4.17+
- Fix the build on Clear Linux, coping with redundant declarations of
function prototypes in python3 header files by adding
-Wno-redundant-decls to build with PYTHON=python3 (Arnaldo Carvalho de Melo)
- Fixes for processing inline frames in backtraces using DWARF based
unwinding (Milian Wolff)
- Cope with bad DWARF info for function names for inline frames,not
trying to demangle this symbol. Problem reported with rust but
reproduced as well with C++. Problem reported to the libbpf
maintainers (Milian Wolff)
- Fix python export to postgresql and sqlite code (Adrian Hunter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCW7eGGwAKCRCyPKLppCJ+
JzAWAP4rGO2bZmvRFBmg9go5xeYZmOOy8pk9zHslslrNlSSyrwEAvyI1Fk/YYR4I
9UDE+/F9pey7BVuY5V8pFplEEcFF8gY=
=K7kp
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo-4.19-20181005' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix the build on Clear Linux, coping with redundant declarations of
function prototypes in python3 header files by adding
-Wno-redundant-decls to build with PYTHON=python3 (Arnaldo Carvalho de Melo)
- Fixes for processing inline frames in backtraces using DWARF based
unwinding (Milian Wolff)
- Cope with bad DWARF info for function names for inline frames,not
trying to demangle this symbol. Problem reported with rust but
reproduced as well with C++. Problem reported to the libbpf
maintainers (Milian Wolff)
- Fix python export to postgresql and sqlite code (Adrian Hunter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
One important fix:
- Fix a memory leak with AMD IOMMU when SME is active and a VM
has assigned devices. In that case the complete guest memory
will be leaked without this fix.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJbtyAKAAoJECvwRC2XARrjungQAL5ceVrKiY1rFif/ZhH0fFtq
7FjZsa2cB1ROYfXCpTfVXhMSLB4sWW0Lg7eR3pTWxvx301eLZGPK/j/NSugZOSQB
39KMIvXkEEJk6QXMOC4SeFhOtHKXst8Gq7cbgmOcHt4oA8hEuphyLb5AuFOEE1iY
RVXT6e8Si2pQ6YjJ+CK/N59N/hCa2lcn+blHHyPTNtGoPPME8h0fb/lIaC5TDB2X
hMev0WgMk6qKCk3N7tDqX0vgEaHvPCmZlTQaM/TH85mBIjiLq/DUZ5dA0EJjPOAt
NSo+dh8737gDXoKvdQgBR3p8jxlsdBSLVyjuf2f0lHtYa3/PR2e4NicQQxrsQHrE
nJZ8AAfkfAX3yfCIdU4jdGUlAkt+y+HDMY/ewxYi5Z+4TNjmCwtt3KdjmNnSdMoF
1L7kmPRd+mgjVFa6hPVFU809ey7/G8fwlYpIqwc4Xid21nHrnkagHYdDUDX10xjL
mIt3gR9sfZsbouzN8OTKSEnc++5P8he2rTlrFD7Wg5Crt79AI7K4iHDeReiKAzo9
mHt1xYwQl+5HLtz/xoZ9RZAFgpPPyYZijbUqCXxhsTuInWnwKSIiLy+wYXrsW86n
XRAuPzCBUrTVbQ+Nsaunvf3iXVenB2c/565C22GrJ7irgMTllyArZhX12YhpvbPm
qsVJHKceTEve+AEkmzE2
=VoCg
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Joerg writes:
"IOMMU Fix for Linux v4.19-rc6
One important fix:
- Fix a memory leak with AMD IOMMU when SME is active and a VM
has assigned devices. In that case the complete guest memory
will be leaked without this fix."
* tag 'iommu-fixes-v4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Clear memory encryption mask from physical address
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJbtxXyAAoJEL/70l94x66D52kH/jEbBo/jz9Jx2bnbxYkG1YzO
cIpIRjbRcOKVFNGxjStlJ0PedQBWAfPQl+SywRfqwiSlOOt/yo0lZ5ZewENR2TxO
CLQC/OnV/5SU7BJvbsKgH9tc+Wp9X55wBUEalfcvG/knFlmR+eK/7TwTS+hv/U21
uYKRnGfz5AGfdmB9FyCn0blkPNnFaQ8KB+y+INZTkB+YZzNsybow230FRPs22fjX
HGeJ7gngah50M5gxDW+YPPNXFhs36x2hsyQXBN9TPxLPHoxTsRRoeqx2nl/UvA+e
LXZWg8/UAzXFO/fKVHkJX4jSnCDr2W7HYGNyLPtXFPWhcOelP1h9uHrfuX+fxA4=
=UNUo
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Paolo writes:
"KVM changes for 4.19-rc7
x86 and PPC bugfixes, mostly introduced in 4.19-rc1."
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: nVMX: fix entry with pending interrupt if APICv is enabled
KVM: VMX: hide flexpriority from guest when disabled at the module level
KVM: VMX: check for existence of secondary exec controls before accessing
KVM: PPC: Book3S HV: Avoid crash from THP collapse during radix page fault
KVM: x86: fix L1TF's MMIO GFN calculation
tools/kvm_stat: cut down decimal places in update interval dialog
KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS
KVM: x86: Do not use kvm_x86_ops->mpx_supported() directly
KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
KVM: x86: never trap MSR_KERNEL_GS_BASE
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlu2jLMACgkQiiy9cAdy
T1GLMQwAiUaUTtR0MN2ws+IHNnwFX5YnWFa1Cr/6Tgz9WBo5TrrvQcBGhTIUQok2
7w2+I0rXPO7P7Tq2odWn66BT1/l0o2U10aB4OHYlS7zqfCAp6GuwKGtATivRqjsg
ENn2mmBRuTmz+XtGmlVklmQpHjn4+MTR2GSLA57oo1fqX5YBeSxcq2UEciCGttRQ
nAvn923NrPSbGRvL/eDDWGXas8gMATXnrNlN8st2AsxnoFC6CrzUuRM0s/IFgp6r
Orh0jIlovlT2Q6LdND7n7xXnEVoyUTETG89IjRVwcBJ9LtXrJXpuWo7/sOr11cMy
sGeUkILNlmkIdLtIlUaZcGaclq/yfeKdMxziBT8tWP6wXIH4LrGCzvZ9jDtrzygU
EWL2nLsqMJNh0+wsYSZM/m9VFC1ReeB1AgIC2EHu/xzqWoTeL40i2OhKi/732FEX
H+FlTZUoorhFoI96AMJMrqXjXyqZUZSQ1b5iJ7/jq888Vdey+m7HiMUtxv0+5yzC
+lQksh8X
=4PHD
-----END PGP SIGNATURE-----
Merge tag '4.19-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Steve writes:
"SMB3 fixes
four small SMB3 fixes: one for stable, the others to address a more
recent regression"
* tag '4.19-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb3: fix lease break problem introduced by compounding
cifs: only wake the thread for the very last PDU in a compound
cifs: add a warning if we try to to dequeue a deleted mid
smb2: fix missing files in root share directory listing
Only use the mapped IP to find inline frames, but keep using the
unmapped IP for the callchain cursor. This ensures we properly show the
unmapped IP when displaying a frame we received via the
dso__parse_addr_inlines API for a module which does not contain
sufficient debug symbols to show the srcline.
This is another follow-up to commit 1961018469 ("perf script: Show
virtual addresses instead of offsets").
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Fixes: 1961018469 ("perf script: Show virtual addresses instead of offsets")
Link: http://lkml.kernel.org/r/20180926135207.30263-2-milian.wolff@kdab.com
Link: http://lkml.kernel.org/r/20181002073949.3297-1-milian.wolff@kdab.com
[ Squashed a fix from Milian for a problem reported by Ravi, fixed up space damage ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The functions vbin_printf() and bstr_printf() are used by trace_printk() to
try to keep the overhead down during printing. trace_printk() uses
vbin_printf() at the time of execution, as it only scans the fmt string to
record the printf values into the buffer, and then uses vbin_printf() to do
the conversions to print the string based on the format and the saved
values in the buffer.
This is an issue for dereferenced pointers, as before commit 841a915d20,
the processing of the pointer could happen some time after the pointer value
was recorded (reading the trace buffer). This means the processing of the
value at a later time could show different results, or even crash the
system, if the pointer no longer existed.
Commit 841a915d20 addressed this by processing dereferenced pointers at
the time of execution and save the result in the ring buffer as a string.
The bstr_printf() would then treat these pointers as normal strings, and
print the value. But there was an off-by-one bug here, where after
processing the argument, it move the pointer only "strlen(arg)" which made
the arg pointer not point to the next argument in the ring buffer, but
instead point to the nul character of the last argument. This causes any
values after a dereferenced pointer to be corrupted.
Cc: stable@vger.kernel.org
Fixes: 841a915d20 ("vsprintf: Do not have bprintf dereference pointers")
Reported-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
When building in ClearLinux using 'make PYTHON=python3' with gcc 8.2.1
it fails with:
GEN /tmp/build/perf/python/perf.so
In file included from /usr/include/python3.7m/Python.h:126,
from /git/linux/tools/perf/util/python.c:2:
/usr/include/python3.7m/import.h:58:24: error: redundant redeclaration of ‘_PyImport_AddModuleObject’ [-Werror=redundant-decls]
PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *, PyObject *);
^~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/python3.7m/import.h:47:24: note: previous declaration of ‘_PyImport_AddModuleObject’ was here
PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *name,
^~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
error: command 'gcc' failed with exit status 1
And indeed there is a redundant declaration in that Python.h file, one
with parameter names and the other without, so just add
-Wno-error=redundant-decls to the python setup instructions.
Now perf builds with gcc in ClearLinux with the following Dockerfile:
# docker.io/acmel/linux-perf-tools-build-clearlinux:latest
FROM docker.io/clearlinux:latest
MAINTAINER Arnaldo Carvalho de Melo <acme@kernel.org>
RUN swupd update && \
swupd bundle-add sysadmin-basic-dev
RUN mkdir -m 777 -p /git /tmp/build/perf /tmp/build/objtool /tmp/build/linux && \
groupadd -r perfbuilder && \
useradd -m -r -g perfbuilder perfbuilder && \
chown -R perfbuilder.perfbuilder /tmp/build/ /git/
USER perfbuilder
COPY rx_and_build.sh /
ENV EXTRA_MAKE_ARGS=PYTHON=python3
ENTRYPOINT ["/rx_and_build.sh"]
Now to figure out why the build fails with clang, that is present in the
above container as detected by the rx_and_build.sh script:
clang version 6.0.1 (tags/RELEASE_601/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/sbin
make: Entering directory '/git/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libslang: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
Makefile.config:331: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
make[1]: *** [Makefile.perf:206: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/git/linux/tools/perf'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thiago Macieira <thiago.macieira@intel.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-c3khb9ac86s00qxzjrueomme@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the rxrpc_data_ready() function to pick up all packets and to not miss
any. There are two problems:
(1) The sk_data_ready pointer on the UDP socket is set *after* it is
bound. This means that it's open for business before we're ready to
dequeue packets and there's a tiny window exists in which a packet can
sneak onto the receive queue, but we never know about it.
Fix this by setting the pointers on the socket prior to binding it.
(2) skb_recv_udp() will return an error (such as ENETUNREACH) if there was
an error on the transmission side, even though we set the
sk_error_report hook. Because rxrpc_data_ready() returns immediately
in such a case, it never actually removes its packet from the receive
queue.
Fix this by abstracting out the UDP dequeuing and checksumming into a
separate function that keeps hammering on skb_recv_udp() until it
returns -EAGAIN, passing the packets extracted to the remainder of the
function.
and two potential problems:
(3) It might be possible in some circumstances or in the future for
packets to be being added to the UDP receive queue whilst rxrpc is
running consuming them, so the data_ready() handler might get called
less often than once per packet.
Allow for this by fully draining the queue on each call as (2).
(4) If a packet fails the checksum check, the code currently returns after
discarding the packet without checking for more.
Allow for this by fully draining the queue on each call as (2).
Fixes: 17926a7932 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Fix some refs to init_net that should've been changed to the appropriate
network namespace.
Fixes: 2baec2c3f8 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Recently we implemented show_user_instructions() which dumps the code
around the NIP when a user space process dies with an unhandled
signal. This was modelled on the x86 code, and we even went so far as
to implement the exact same bug, namely that if the user process
crashed with its NIP pointing into the kernel we will dump kernel text
to dmesg. eg:
bad-bctr[2996]: segfault (11) at c000000000010000 nip c000000000010000 lr 12d0b0894 code 1
bad-bctr[2996]: code: fbe10068 7cbe2b78 7c7f1b78 fb610048 38a10028 38810020 fb810050 7f8802a6
bad-bctr[2996]: code: 3860001c f8010080 48242371 60000000 <7c7b1b79> 4082002c e8010080 eb610048
This was discovered on x86 by Jann Horn and fixed in commit
342db04ae7 ("x86/dumpstack: Don't dump kernel memory based on usermode RIP").
Fix it by checking the adjusted NIP value (pc) and number of
instructions against USER_DS, and bail if we fail the check, eg:
bad-bctr[2969]: segfault (11) at c000000000010000 nip c000000000010000 lr 107930894 code 1
bad-bctr[2969]: Bad NIP, not dumping instructions.
Fixes: 88b0fe1757 ("powerpc: Add show_user_instructions()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There are platforms which don't provide input clock rate but provide
I2C timing parameters. Commit 3bd4f27727 ("i2c: designware: Call
i2c_dw_clk_rate() only once in i2c_dw_init_master()") causes needless
warning during probe on those platforms since i2c_dw_clk_rate(), which
causes the warning when input clock is unknown, is called even when
there is no need to calculate timing parameters.
Fixes: 3bd4f27727 ("i2c: designware: Call i2c_dw_clk_rate() only once in i2c_dw_init_master()")
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org> # 4.19
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Before cloning into a file, update the ctime and remove sensitive
attributes like suid, just like we'd do for a regular file write.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
When we're reflinking between two files and the destination file range
is well beyond the destination file's EOF marker, zero any posteof
speculative preallocations in the destination file so that we don't
expose stale disk contents. The previous strategy of trying to clear
the preallocations does not work if the destination file has the
PREALLOC flag set.
Uncovered by shared/010.
Reported-by: Zorro Lang <zlang@redhat.com>
Bugzilla-id: https://bugzilla.kernel.org/show_bug.cgi?id=201259
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Refactor all the reflink preparation steps into a separate helper
that we'll use to land all the upcoming fixes for insufficient input
checks.
This rework also moves the invalidation of the destination range to
the prep function so that it is done before the range is remapped.
This ensures that nobody can access the data in range being remapped
until the remap is complete.
[dgc: fix xfs_reflink_remap_prep() return value and caller check to
handle vfs_clone_file_prep_inodes() returning 0 to mean "nothing to
do". ]
[dgc: make sure length changed by vfs_clone_file_prep_inodes() gets
propagated back to XFS code that does the remapping. ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Boris Ostrovsky reported a memory leak with device passthrough when SME
is active.
The VFIO driver uses iommu_iova_to_phys() to get the physical address for
an iova. This physical address is later passed into vfio_unmap_unpin() to
unpin the memory. The vfio_unmap_unpin() uses pfn_valid() before unpinning
the memory. The pfn_valid() check was failing because encryption mask was
part of the physical address returned. This resulted in the memory not
being unpinned and therefore leaked after the guest terminates.
The memory encryption mask must be cleared from the physical address in
iommu_iova_to_phys().
Fixes: 2543a786aa ("iommu/amd: Allow the AMD IOMMU to work with memory encryption")
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: <iommu@lists.linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When connecting SFP PHY to phylink use the detected interface.
Otherwise, the link fails to come up when the configured 'phy-mode'
differs from the SFP detected mode.
Move most of phylink_connect_phy() into __phylink_connect_phy(), and
leave phylink_connect_phy() as a wrapper. phylink_sfp_connect_phy() can
now pass the SFP detected PHY interface to __phylink_connect_phy().
This fixes 1GB SFP module link up on eth3 of the Macchiatobin board that
is configured in the DT to "2500base-x" phy-mode.
Fixes: 9525ae8395 ("phylink: add phylink infrastructure")
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
the be2net implementation of .ndo_tunnel_{add,del}() changes the value of
NETIF_F_GSO_UDP_TUNNEL bit in 'features' and 'hw_features', but it forgets
to call netdev_features_change(). Moreover, ethtool setting for that bit
can potentially be reverted after a tunnel is added or removed.
GSO already does software segmentation when 'hw_enc_features' is 0, even
if VXLAN offload is turned on. In addition, commit 096de2f83e ("benet:
stricter vxlan offloading check in be_features_check") avoids hardware
segmentation of non-VXLAN tunneled packets, or VXLAN packets having wrong
destination port. So, it's safe to avoid flipping the above feature on
addition/deletion of VXLAN tunnels.
Fixes: 630f4b7056 ("be2net: Export tunnel offloads only when a VxLAN tunnel is created")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One patch here, fixing a potential host crash introduced (or at least
exacerbated) by a previous fix for corruption relating to radix guest
page faults and THP operations.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJbtq7FAAoJEJ2a6ncsY3Gf0PAIAJvq2veNJs0HHsonuI592mK0
YMKL7IproGHBnIuVvpaI3o6iERjMtRaolh+G4dlk3xoQ/TXYzTdimT1cwF2jZblL
wQSsit5499UrJdPsFxBc7lc8yXwRR9+N/ooRAQS/TyK6w00Y+Aep7sTQpRdVCb1G
nskLWXBDWtdB6nGAX9dECpRQP+/voxEJkMMrsFM3rYbBkM/rOqu/IXsCoH2u2jGM
6zFDe7Z35VP77UaehK9W4fGHVc+92oD5E2kN56DNIotYaAT7fudPmNQF+g1GsVFc
Mv8AJdXiQIhZFbpRWTS/UTvyb/aq1Ou6Xd5msTYTMG8v+g0M9h8ZSH3FQCX8RMs=
=Fdit
-----END PGP SIGNATURE-----
Merge tag 'kvm-ppc-fixes-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master
Third set of PPC KVM fixes for 4.19
One patch here, fixing a potential host crash introduced (or at least
exacerbated) by a previous fix for corruption relating to radix guest
page faults and THP operations.
While we currently grab a runtime PM ref in nouveau's normal connector
detection code, we apparently don't do this for MST. This means if we're
in a scenario where the GPU is suspended and userspace attempts to do a
connector probe on an MSTC connector, the probe will fail entirely due
to the DP aux channel and GPU not being woken up:
[ 316.633489] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
[ 316.635713] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
[ 316.637785] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
...
So, grab a runtime PM ref here.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When we use raw socket as the vhost backend, a packet from virito with
gso offloading information, cannot be sent out in later validaton at
xmit path, as we did not set correct skb->protocol which is further used
for looking up the gso function.
To fix this, we set this field according to virito hdr information.
Fixes: e858fae2b0 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
Signed-off-by: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit c499696e79 ("net: dsa: b53: Stop using dev->cpu_port
incorrectly") was a bit too trigger happy in removing the CPU port from
the VLAN membership because we rely on DSA to program the CPU port VLAN,
which it does, except it does not bother itself with tagged/untagged and
just usese untagged.
Having the CPU port "follow" the user ports tagged/untagged is not great
and does not allow for properly differentiating, so keep the CPU port
tagged in all VLANs.
Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
Fixes: c499696e79 ("net: dsa: b53: Stop using dev->cpu_port incorrectly")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Load the respective NAT helper module if the flow uses it.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Misc. bug fixes.
4 small bug fixes related to setting firmware message enables bits, possible
memory leak when probe fails, and ring accouting when RDMA driver is loaded.
Please queue these for -stable as well. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When getting the max rings supported, get the reduced max_irqs
by the ones used by RDMA.
If the number MSIX is the limiting factor, this bug may cause the
max ring count to be higher than it should be when RDMA driver is
loaded and may result in ring allocation failures.
Fixes: 30f529473e ("bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the driver probe fails, all the resources that were allocated prior
to the failure must be freed. However, hwrm dma response memory is not
getting freed.
This patch fixes the problem described above.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In HWRM_QUEUE_COS2BW_CFG request, enables field should have the bits
set only for the queue ids which are having the valid parameters.
This causes firmware to return error when the TC to hardware CoS queue
mapping is not 1:1 during DCBNL ETS setup.
Fixes: 2e8ef77ee0 ("bnxt_en: Add TC to hardware QoS queue mapping logic.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The enables bit for VNIC was set wrong when calling the HWRM_FUNC_CFG
firmware call to reserve VNICs. This has the effect that the firmware
will keep a large number of VNICs for the PF, and having very few for
VFs. DPDK driver running on the VFs, which requires more VNICs, may not
work properly as a result.
Fixes: 674f50a5b0 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot was able to trigger rcu stalls by calling write()
with large number of bytes.
Add a cond_resched() in the loop to avoid this.
Link: https://lkml.org/lkml/2018/8/23/1106
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbtphfAAoJEAx081l5xIa+OwYP/1dmnns1njRhGP7DAL4+/5GH
U/LXZtHRhWKMdQOPyCyzZ0xBTPGG1utrTEphdGbYoHIM3yhpWXMmazQlfwmM2S0+
6F5ca0M5f4YKo8up/a8Tn215J5gf8LpoyzhSXaau9kbb2yfusmgB6ZRZOltGgW04
CKnERuvUDwfaHlNBADfcRGK1BzPVNBESzDJ4DhdYtmVSNlMjJ1bqvjjqqIUcNfP3
ifcbOcm6G+efSqSM9s55XbQml5pgeBKLvS2N27tIZp/2Y1HaIbBeWGGxJLcQpzDP
h1lEMmQl7Z81iVqAxFnHSzwEQ5YrxMKWOX9108MlYjDOYa6K3Q81tzOjMr8z3pGv
9Lfq1YCd6rwQ9H7UCwOuO9yvVGqgmEpT6kxVZaiEVFBwAgh7RsYSMwsD4npjKwHK
VYbuSXPN+BBcObqJTlZz3zOqTngtImUGX1xaqHgOTSCksBLtToBAL2KkvJeNrctr
4zTzsS9Rkce5hSMTzkh6mm0yzh74V9SnwAzhbS+UEEtPufE9u1BpOHeRYMeI95PV
cQnTj9zY5KcfeZjPURtOO2lCBzHDpTDSl+wIIXAJSwsX/FdksA/2B22vOqJbn4Zd
70M673sOg0FxE5wyCSNSiBs0Iq9oz5/cGDKzRJ1u4mh/KbQJ3y8nGQJamA4TafCM
rRZwcKXRHkPa+D6eSiw5
=slnZ
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2018-10-05' of git://anongit.freedesktop.org/drm/drm
Dave writes:
"amdgpu and two core fixes
Two fixes for amdgpu:
one corrects a use of process->mm
one fix for display code race condition that can result in a crash
Two core fixes:
One for a use-after-free in the leasing code
One for a cma/fbdev crash."
* tag 'drm-fixes-2018-10-05' of git://anongit.freedesktop.org/drm/drm:
drm/amdkfd: Fix incorrect use of process->mm
drm/amd/display: Signal hw_done() after waiting for flip_done()
drm/cma-helper: Fix crash in fbdev error path
drm: fix use-after-free read in drm_mode_create_lease_ioctl()
Cancel pending work before freeing smsc75xx private data structure
during binding. This fixes the following crash in the driver:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
IP: mutex_lock+0x2b/0x3f
<snipped>
Workqueue: events smsc75xx_deferred_multicast_write [smsc75xx]
task: ffff8caa83e85700 task.stack: ffff948b80518000
RIP: 0010:mutex_lock+0x2b/0x3f
<snipped>
Call Trace:
smsc75xx_deferred_multicast_write+0x40/0x1af [smsc75xx]
process_one_work+0x18d/0x2fc
worker_thread+0x1a2/0x269
? pr_cont_work+0x58/0x58
kthread+0xfa/0x10a
? pr_cont_work+0x58/0x58
? rcu_read_unlock_sched_notrace+0x48/0x48
ret_from_fork+0x22/0x40
Signed-off-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* fix use-after-free in regulatory code
* fix rx-mgmt key flag in AP mode (mac80211)
* fix wireless extensions compat code memory leak
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlu2bEUACgkQB8qZga/f
l8RAgw/7BfRpm3Kr7XW919naGkt/pQeJxUcuF9YggBpTCrp/DSLQYsjOBE5DyS/m
728oPD8jEDehUHasWKsbG7wit1S7ImExCHTPim8C1mbABhbqdhwD4ceUvBO7RYi2
p0+yN8X8z5D0qruMrNwhtxdE8iV9bBgmY6u1jubpJFkKLPf2euZyroH40b879CIn
aHqB42GNJCdwO2UFaPDH2cdx5DFWrDlfA1LGbrbuzrXMBfNGWYgen2JJH/5iDOyU
1rVXk/pUpVffp0Zde+66NtyCxxC0+hQwrTczEKXICb5qoWJpz6kugFGGO1oDQgdp
AbM7KNrV712h/qwTEnC1NG0KUXgocpwWIuf/cuTow0vGUJSl+O2pLS/3GLOwH2du
1u/FF4LiBc4NFXmWBPMN3LUN+Ica0/YWSbVwcv2c4guemV1EOGinlbFc+tnoue7M
fpLkQJUYCiEVFRXGWVaSl0Hr6z+zwgfa8qHYN2yq1qyB0dYHryYiVhgKLV7yisCm
RNy1hmVuV7rMsL3f4iUq/2xnL3U8qK1+19Mr+i58/kU4Tx2jMkWj0kivRdgYX4EN
XcBhJzWXrb3yMldACrCji6iRnrFMqg7osyEgFiMLMl4cZfs057+qsrJyR5xajVOi
Ws+hj3a1LukJ1nIhou4uOLAt9D7ohuJojQq6D2GLeBwfGPNkz4E=
=QyWK
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2018-10-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just three small fixes:
* fix use-after-free in regulatory code
* fix rx-mgmt key flag in AP mode (mac80211)
* fix wireless extensions compat code memory leak
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A cgroup which is already a threaded domain may be converted into a
threaded cgroup if the prerequisite conditions are met. When this
happens, all threaded descendant should also have their ->dom_cgrp
updated to the new threaded domain cgroup. Unfortunately, this
propagation was missing leading to the following failure.
# cd /sys/fs/cgroup/unified
# cat cgroup.subtree_control # show that no controllers are enabled
# mkdir -p mycgrp/a/b/c
# echo threaded > mycgrp/a/b/cgroup.type
At this point, the hierarchy looks as follows:
mycgrp [d]
a [dt]
b [t]
c [inv]
Now let's make node "a" threaded (and thus "mycgrp" s made "domain threaded"):
# echo threaded > mycgrp/a/cgroup.type
By this point, we now have a hierarchy that looks as follows:
mycgrp [dt]
a [t]
b [t]
c [inv]
But, when we try to convert the node "c" from "domain invalid" to
"threaded", we get ENOTSUP on the write():
# echo threaded > mycgrp/a/b/c/cgroup.type
sh: echo: write error: Operation not supported
This patch fixes the problem by
* Moving the opencoded ->dom_cgrp save and restoration in
cgroup_enable_threaded() into cgroup_{save|restore}_control() so
that mulitple cgroups can be handled.
* Updating all threaded descendants' ->dom_cgrp to point to the new
dom_cgrp when enabling threaded mode.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Reported-by: Amin Jamali <ajamali@pivotal.io>
Reported-by: Joao De Almeida Pereira <jpereira@pivotal.io>
Link: https://lore.kernel.org/r/CAKgNAkhHYCMn74TCNiMJ=ccLd7DcmXSbvw3CbZ1YREeG7iJM5g@mail.gmail.com
Fixes: 454000adaa ("cgroup: introduce cgroup->dom_cgrp and threaded css_set handling")
Cc: stable@vger.kernel.org # v4.14+
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCW7YOCQAKCRDh3BK/laaZ
PAmGAQC1/3+mkfHXqyDx+zv4f7A348ZULW26EWFWwMwlEljWowD+PlRFBgUu5v5c
194mCApZU7hl6o0u07aijjz6m81e6QE=
=dM89
-----END PGP SIGNATURE-----
Merge tag 'ovl-fixes-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Miklos writes:
"overlayfs fixes for 4.19-rc7
This update fixes a couple of regressions in the stacked file update
added in this cycle, as well as some older bugs uncovered by
syzkaller.
There's also one trivial naming change that touches other parts of
the fs subsystem."
* tag 'ovl-fixes-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix format of setxattr debug
ovl: fix access beyond unterminated strings
ovl: make symbol 'ovl_aops' static
vfs: swap names of {do,vfs}_clone_file_range()
ovl: fix freeze protection bypass in ovl_clone_file_range()
ovl: fix freeze protection bypass in ovl_write_iter()
ovl: fix memory leak on unlink of indexed file
A reload of the cache's DM table is needed during resize because
otherwise a crash will occur when attempting to access smq policy
entries associated with the portion of the cache that was recently
extended.
The reason is cache-size based data structures in the policy will not be
resized, the only way to safely extend the cache is to allow for a
proper cache policy initialization that occurs when the cache table is
loaded. For example the smq policy's space_init(), init_allocator(),
calc_hotspot_params() must be sized based on the extended cache size.
The fix for this is to disallow cache resizes of this pattern:
1) suspend "cache" target's device
2) resize the fast device used for the cache
3) resume "cache" target's device
Instead, the last step must be a full reload of the cache's DM table.
Fixes: 66a636356 ("dm cache: add stochastic-multi-queue (smq) policy")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Commit fd2fa9541 ("dm cache metadata: save in-core policy_hint_size to
on-disk superblock") enabled previously written policy hints to be
used after a cache is reactivated. But in doing so the cache
metadata's hint array was left exposed to out of bounds access because
on resize the metadata's on-disk hint array wasn't ever extended.
Fix this by ignoring that there are no on-disk hints associated with the
newly added cache blocks. An expanded on-disk hint array is later
rewritten upon the next clean shutdown of the cache.
Fixes: fd2fa9541 ("dm cache metadata: save in-core policy_hint_size to on-disk superblock")
Cc: stable@vger.kernel.org
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If __device_suspend() runs asynchronously (in which case the device
passed to it is in dpm_suspended_list at that point) and it returns
early on an error or pending wakeup, and the power.direct_complete
flag has been set for the device already, the subsequent
device_resume() will be confused by that and it will call
pm_runtime_enable() incorrectly, as runtime PM has not been
disabled for the device by __device_suspend().
To avoid that, clear power.direct_complete if __device_suspend()
is not going to disable runtime PM for the device before returning.
Fixes: aae4518b31 (PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily)
Reported-by: Al Cooper <alcooperx@gmail.com>
Tested-by: Al Cooper <alcooperx@gmail.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Ido Schimmel says:
====================
mlxsw: Couple of fixes
First patch works around an hardware issue in Spectrum-2 where a field
indicating the event type is always set to the same value. Since there
are only two event types and they are reported using different queues,
we can use the queue number to derive the event type.
Second patch prevents a router interface (RIF) leakage when a VLAN
device is deleted from on top a bridge device.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 602b74eda8 ("mlxsw: spectrum_switchdev: Do not leak RIFs
when removing bridge") I handled the case where RIFs created for VLAN
devices were not properly cleaned up when their real device (a bridge)
was removed.
However, I forgot to handle the case of the VLAN device itself being
removed. Do so now when the VLAN device is being unlinked from its real
device.
Fixes: 99f44bb352 ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reported-by: Artem Shvorin <art@qrator.net>
Tested-by: Artem Shvorin <art@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to a hardware issue in Spectrum-2, the field event_type of the event
queue element (EQE) has become reserved. It was used to distinguish between
command interface completion events and completion events.
Use queue number to determine event type, as command interface completion
events are always received on EQ0 and mlxsw driver maps completion events
to EQ1.
Fixes: c3ab435466 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell writes:
"A couple of small ARM fixes from Stefan and Thomas:
- Adding the io_pgetevents syscall
- Fixing a bounds check in pci_ioremap_io()"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8799/1: mm: fix pci_ioremap_io() offset check
ARM: 8787/1: wire up io_pgetevents syscall