Commit Graph

724011 Commits

Author SHA1 Message Date
Vishal Verma
24e3a7fb60 libnvdimm, btt: Fix an incompatibility in the log layout
Due to a spec misinterpretation, the Linux implementation of the BTT log
area had different padding scheme from other implementations, such as
UEFI and NVML.

This fixes the padding scheme, and defaults to it for new BTT layouts.
We attempt to detect the padding scheme in use when probing for an
existing BTT. If we detect the older/incompatible scheme, we continue
using it.

Reported-by: Juston Li <juston.li@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Fixes: 5212e11fde ("nd_btt: atomic sector updates")
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-12-21 14:59:27 -08:00
Vishal Verma
13b7954c0b libnvdimm, btt: add a couple of missing kernel-doc lines
Recent updates to btt.h neglected to add corresponding kernel-doc lines
for new structure members. Add them.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-12-21 14:59:27 -08:00
David S. Miller
c50b7c473f Merge branch 'net-zerocopy-fixes'
Saeed Mahameed says:

===================
Mellanox, mlx5 fixes 2017-12-19

The follwoing series includes some fixes for mlx5 core and etherent
driver.

Please pull and let me know if there is any problem.

This series doesn't introduce any conflict with the ongoing mlx5 for-next
submission.

For -stable:

kernels >= v4.7.y
    ("net/mlx5e: Fix possible deadlock of VXLAN lock")
    ("net/mlx5e: Add refcount to VXLAN structure")
    ("net/mlx5e: Prevent possible races in VXLAN control flow")
    ("net/mlx5e: Fix features check of IPv6 traffic")

kernels >= v4.9.y
    ("net/mlx5: Fix error flow in CREATE_QP command")
    ("net/mlx5: Fix rate limit packet pacing naming and struct")

kernels >= v4.13.y
    ("net/mlx5: FPGA, return -EINVAL if size is zero")

kernels >= v4.14.y
    ("Revert "mlx5: move affinity hints assignments to generic code")

All above patches apply and compile with no issues on corresponding -stable.
===================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-21 15:00:59 -05:00
Willem de Bruijn
b90ddd5687 skbuff: skb_copy_ubufs must release uarg even without user frags
skb_copy_ubufs creates a private copy of frags[] to release its hold
on user frags, then calls uarg->callback to notify the owner.

Call uarg->callback even when no frags exist. This edge case can
happen when zerocopy_sg_from_iter finds enough room in skb_headlen
to copy all the data.

Fixes: 3ece782693 ("sock: skb_copy_ubufs support for compound pages")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-21 15:00:58 -05:00
Willem de Bruijn
268b790679 skbuff: orphan frags before zerocopy clone
Call skb_zerocopy_clone after skb_orphan_frags, to avoid duplicate
calls to skb_uarg(skb)->callback for the same data.

skb_zerocopy_clone associates skb_shinfo(skb)->uarg from frag_skb
with each segment. This is only safe for uargs that do refcounting,
which is those that pass skb_orphan_frags without dropping their
shared frags. For others, skb_orphan_frags drops the user frags and
sets the uarg to NULL, after which sock_zerocopy_clone has no effect.

Qemu hangs were reported due to duplicate vhost_net_zerocopy_callback
calls for the same data causing the vhost_net_ubuf_ref_>refcount to
drop below zero.

Link: http://lkml.kernel.org/r/<CAF=yD-LWyCD4Y0aJ9O0e_CHLR+3JOeKicRRTEVCPxgw4XOcqGQ@mail.gmail.com>
Fixes: 1f8b977ab3 ("sock: enable MSG_ZEROCOPY")
Reported-by: Andreas Hartmann <andihartmann@01019freenet.de>
Reported-by: David Hill <dhill@redhat.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-21 15:00:58 -05:00
Linus Torvalds
9035a8961b Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "It's been a few weeks, so here's a small collection of fixes that
  should go into the current series.

  This contains:

   - NVMe pull request from Christoph, with a few important fixes.

   - kyber hang fix from Omar.

   - A blk-throttl fix from Shaohua, fixing a case where we double
     charge a bio.

   - Two call_single_data alignment fixes from me, fixing up some
     unfortunate changes that went into 4.14 without being properly
     reviewed on the block side (since nobody was CC'ed on the
     patch...).

   - A bounce buffer fix in two parts, one from me and one from Ming.

   - Revert bdi debug error handling patch. It's causing boot issues for
     some folks, and a week down the line, we're still no closer to a
     fix. Revert this patch for now until it's figured out, then we can
     retry for 4.16"

* 'for-linus' of git://git.kernel.dk/linux-block:
  Revert "bdi: add error handle for bdi_debug_register"
  null_blk: unalign call_single_data
  block: unalign call_single_data in struct request
  block-throttle: avoid double charge
  block: fix blk_rq_append_bio
  block: don't let passthrough IO go into .make_request_fn()
  nvme: setup streams after initializing namespace head
  nvme: check hw sectors before setting chunk sectors
  nvme: call blk_integrity_unregister after queue is cleaned up
  nvme-fc: remove double put reference if admin connect fails
  nvme: set discard_alignment to zero
  kyber: fix another domain token wait queue hang
2017-12-21 11:13:37 -08:00
Linus Torvalds
409232a450 ARM fixes:
- A bug in handling of SPE state for non-vhe systems
 - A fix for a crash on system shutdown
 - Three timer fixes, introduced by the timer optimizations for v4.15
 
 x86 fixes:
 - fix for a WARN that was introduced in 4.15
 - fix for SMM when guest uses PCID
 - fixes for several bugs found by syzkaller
 
 ... and a dozen papercut fixes for the kvm_stat tool.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJaO6N9AAoJEL/70l94x66DC1wH/Rf+u0Cj6ZQil6LK6Nf8bfPd
 3TqrwrxUDeXwi8GzsvK14izBr1mDzidSHIO0Q4XINFRSRdaf43h3R2im/SJqvNhP
 xktCmJI2CxN96oaC7kIExgwf3YKhFdLIADfbT8oR9p3xZG/+c97dkr3b4XtmVCDb
 ZXdUEOcKnoW4zwpfJN30FLlq4OwYvuYVz02AEfPivZRDfhhus/TYSnuSdxH8CLNf
 75ymuKyXoo/RELbimwbMk8Cm9+ey7PjlUGOgbnbXIFtmgznXhLzAOeES2B+46J5b
 sMBPlmiJrn6N//lM18CC5yOBzBLGsYOoXggtw4aU/5nM4GVcFebWedpcoD4D8Jw=
 =Bt8w
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "ARM fixes:
   - A bug in handling of SPE state for non-vhe systems
   - A fix for a crash on system shutdown
   - Three timer fixes, introduced by the timer optimizations for v4.15

  x86 fixes:
   - fix for a WARN that was introduced in 4.15
   - fix for SMM when guest uses PCID
   - fixes for several bugs found by syzkaller

  ... and a dozen papercut fixes for the kvm_stat tool"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
  tools/kvm_stat: sort '-f help' output
  kvm: x86: fix RSM when PCID is non-zero
  KVM: Fix stack-out-of-bounds read in write_mmio
  KVM: arm/arm64: Fix timer enable flow
  KVM: arm/arm64: Properly handle arch-timer IRQs after vtimer_save_state
  KVM: arm/arm64: timer: Don't set irq as forwarded if no usable GIC
  KVM: arm/arm64: Fix HYP unmapping going off limits
  arm64: kvm: Prevent restoring stale PMSCR_EL1 for vcpu
  KVM/x86: Check input paging mode when cs.l is set
  tools/kvm_stat: add line for totals
  tools/kvm_stat: stop ignoring unhandled arguments
  tools/kvm_stat: suppress usage information on command line errors
  tools/kvm_stat: handle invalid regular expressions
  tools/kvm_stat: add hint on '-f help' to man page
  tools/kvm_stat: fix child trace events accounting
  tools/kvm_stat: fix extra handling of 'help' with fields filter
  tools/kvm_stat: fix missing field update after filter change
  tools/kvm_stat: fix drilldown in events-by-guests mode
  tools/kvm_stat: fix command line option '-g'
  kvm: x86: fix WARN due to uninitialized guest FPU state
  ...
2017-12-21 10:44:13 -08:00
Shaohua Li
513674b5a2 net: reevalulate autoflowlabel setting after sysctl setting
sysctl.ip6.auto_flowlabels is default 1. In our hosts, we set it to 2.
If sockopt doesn't set autoflowlabel, outcome packets from the hosts are
supposed to not include flowlabel. This is true for normal packet, but
not for reset packet.

The reason is ipv6_pinfo.autoflowlabel is set in sock creation. Later if
we change sysctl.ip6.auto_flowlabels, the ipv6_pinfo.autoflowlabel isn't
changed, so the sock will keep the old behavior in terms of auto
flowlabel. Reset packet is suffering from this problem, because reset
packet is sent from a special control socket, which is created at boot
time. Since sysctl.ipv6.auto_flowlabels is 1 by default, the control
socket will always have its ipv6_pinfo.autoflowlabel set, even after
user set sysctl.ipv6.auto_flowlabels to 1, so reset packset will always
have flowlabel. Normal sock created before sysctl setting suffers from
the same issue. We can't even turn off autoflowlabel unless we kill all
socks in the hosts.

To fix this, if IPV6_AUTOFLOWLABEL sockopt is used, we use the
autoflowlabel setting from user, otherwise we always call
ip6_default_np_autolabel() which has the new settings of sysctl.

Note, this changes behavior a little bit. Before commit 42240901f7
(ipv6: Implement different admin modes for automatic flow labels), the
autoflowlabel behavior of a sock isn't sticky, eg, if sysctl changes,
existing connection will change autoflowlabel behavior. After that
commit, autoflowlabel behavior is sticky in the whole life of the sock.
With this patch, the behavior isn't sticky again.

Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Tom Herbert <tom@quantonium.net>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-21 13:07:20 -05:00
Eric Garver
c48e74736f openvswitch: Fix pop_vlan action for double tagged frames
skb_vlan_pop() expects skb->protocol to be a valid TPID for double
tagged frames. So set skb->protocol to the TPID and let skb_vlan_pop()
shift the true ethertype into position for us.

Fixes: 5108bbaddc ("openvswitch: add processing of L3 packets")
Signed-off-by: Eric Garver <e@erig.me>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-21 13:02:08 -05:00
Gabriel Krisman Bertazi
8bc0d7ac93 i915: Reject CCS modifiers for pipe C on Geminilake
Current code advertises (on the modifiers blob property) support for CCS
modifier for pipe C on GLK, only to reject it later when validating the
request before the atomic commit.

This fixes the tests igt@kms_ccs@pipe-c-*, which should skip on GLK for
pipe C (see bug 104096).

A relevant discussion is archived at:

https://lists.freedesktop.org/archives/intel-gfx/2017-December/150646.html

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104096
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171220002410.5604-1-krisman@collabora.co.uk
(cherry picked from commit f0cbd8bd87)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-12-21 19:51:03 +02:00
Marcin Ciupak
f235bcfe6a staging: pi433: remove unused rf69_set_dc_cut_off_frequency* functions
The following functions:
* rf69_set_dc_cut_off_frequency,
* rf69_set_dc_cut_off_frequency_intern,
* rf69_set_dc_cut_off_frequency_during_afc
are unused and should be removed along with type enum dcc_percent which
was used only by these functions.

Signed-off-by: Marcin Ciupak <marcin.s.ciupak@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:29:59 +01:00
Marcin Ciupak
9924a8339f staging: pi433: remove unused rf69_set_ook_threshold_type function
Function rf69_set_ook_threshold_type is unused and should be removed
along with type enum thresholdType which was used only by that function.

Signed-off-by: Marcin Ciupak <marcin.s.ciupak@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:29:59 +01:00
Marcin Ciupak
418e175a99 staging: pi433: remove unused rf69_set_ook_threshold_step function
Function rf69_set_ook_threshold_step is unused and should be removed
along with type enum thresholdStep which was used only by that function.

Signed-off-by: Marcin Ciupak <marcin.s.ciupak@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:29:59 +01:00
Marcin Ciupak
b44badd30b staging: pi433: remove unused rf69_set_rx_start_timeout function
Function rf69_set_rx_start_timeout is unused and should be removed.

Signed-off-by: Marcin Ciupak <marcin.s.ciupak@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:29:57 +01:00
Marcin Ciupak
68701fb146 staging: pi433: remove unused rf69_set_rssi_timeout function
Function rf69_set_rssi_timeout is unused and should be removed.

Signed-off-by: Marcin Ciupak <marcin.s.ciupak@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:29:57 +01:00
Marcin Ciupak
68a87a9a23 staging: pi433: remove unused rf69_get_payload_length function
Function rf69_get_payload_length is unused and should be removed.

Signed-off-by: Marcin Ciupak <marcin.s.ciupak@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:29:40 +01:00
Marcin Ciupak
db1bad5235 staging: pi433: make local functions static
Following functions:
* rf69_get_modulation
* rf69_read_reg
* rf69_write_reg
are used locally only and should be declared as static

Signed-off-by: Marcin Ciupak <marcin.s.ciupak@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:29:40 +01:00
Martin Homuth
b78559b605 staging: rtl8712: style fix multiple line dereferences
This patch fixes various coding style issues in the rtl8712 module as
noted by checkpatch.pl related to dereferencing over multiple lines.

It fixes the following checkpatch.pl warning:

WARNING: Avoid multiple line dereference - prefer %s

Signed-off-by: Martin Homuth <martin@martinhomuth.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:27:15 +01:00
Michael Gebhard
2c5f7348fe staging: comedi: add identifiers to function parameters
Fix these checkpatch.pl warnings in comedidev.h:
WARNING: function definition argument '<name>' should also have an identifier name

Introduces this checkpatch.pl warning in lines 195 and 205:
WARNING: line over 80 characters
Breaking these lines would make the code less compact.

Signed-off-by: Michael Gebhard <im72ywil@cip.cs.fau.de>
Signed-off-by: David Sauerwein <la38vyti@cip.cs.fau.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:25:37 +01:00
Roman Storozhenko
a1b475afe3 staging: lustre: Replace 'uint32_t' with 'u32' and 'uint64_t' with 'u64'
There are two reasons for replacing 'uint32_t' with 'u32'
and 'uint64_t' with 'u64':

1) As Linus Torvalds have said we should use kernel types:
http://lkml.iu.edu/hypermail//linux/kernel/1506.0/00160.html

2) There are only few places in the lustre codebase that use such types.
In the most cases it uses 'u32' and 'u64'.

Signed-off-by: Roman Storozhenko <romeusmeister@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:22:07 +01:00
Shreeya Patel
37edc1ccc9 Staging: rtl8723bs: Do not check for NOT NULL before kfree()
Do not check for NOT NULL before calling kfree because
if the pointer is NULL, no action occurs.
Done using the following semantic patch by coccinelle.

@@
expression ptr;
@@

- if (ptr != NULL)
  kfree(ptr);

Signed-off-by: Shreeya Patel <shreeya.patel23498@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:22:07 +01:00
Mikhail Shvetsov
16e1b4ebfb staging: vchiq_arm: Cleaning up codestyle warnings
This removes checkpatch.pl warnings:
WARNING: line over 80 characters

Signed-off-by: Mikhail Shvetsov <lameli67@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:21:09 +01:00
Mikhail Shvetsov
75ff22a426 staging: vchiq_arm: Fixing code style of comments
This removes checkpatch.pl warnings:

WARNING: Block comments use a trailing */ on a separate line
WARNING: Block comments should align the * on each line
WARNING: Block comments use * on subsequent lines

Signed-off-by: Mikhail Shvetsov <lameli67@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:21:09 +01:00
Mikhail Shvetsov
c44805a0a2 staging: vchiq_arm: Remove useless comments.
This removes useless comments duplicate function names.

Signed-off-by: Mikhail Shvetsov <lameli67@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:21:09 +01:00
Valentin Vidic
acf71f8dfc staging: pi433: fix CamelCase for fifo variables
Fixes checkpatch warnings:

  CHECK: Avoid CamelCase: <fifoEmpty>
  CHECK: Avoid CamelCase: <fifoFillCondition>
  CHECK: Avoid CamelCase: <fifoFull>
  CHECK: Avoid CamelCase: <fifoLevel>
  CHECK: Avoid CamelCase: <fifoLevelBelowThreshold>
  CHECK: Avoid CamelCase: <fifoNotEmpty>
  CHECK: Avoid CamelCase: <fifoOverrun>

Signed-off-by: Valentin Vidic <Valentin.Vidic@CARNet.hr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:17:05 +01:00
Oliver Graute
dd1114693b staging: pi433: pi433_write fixed the return value
The pi433_write function should return the number of processed bytes

Reported-by: Marcin Ciupak <marcin.s.ciupak@gmail.com>
Signed-off-by: Oliver Graute <oliver.graute@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:17:05 +01:00
Ioana Radulescu
93ddf0b211 staging: fsl-dpaa2/eth: Flow affinity for non-forwarded traffic
The previous patch ensures Tx flow affinity for forwarded frames,
but for termination traffic the initial flow affinity is determined
based on the skb hash, which is expected to hit only a few Tx queues
when there is a small number of flows.

Instead, use XPS (transmit packet steering) to set netdevice queue
affinity to the sending core.

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:14:22 +01:00
Ioana Radulescu
537336ce2e staging: fsl-dpaa2/eth: Flow affinity for IP forwarding
The driver xmit function chooses an egress FQ based on the current
core id. The network stack itself sets a mapping field in the skb
based on many things - the default one being a hash on packet fields,
which the current driver ignores.

This patch saves the ingress frame flow affinity information in the
skb. In case of forwarded frames, this info will then be used for Tx
and Tx confirmation hardware queue selection, ensuring all processing
of the given frame is done on a single core.

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 18:14:22 +01:00
Jens Axboe
6d0e4827b7 Revert "bdi: add error handle for bdi_debug_register"
This reverts commit a0747a859e.

It breaks some booting for some users, and more than a week
into this, there's still no good fix. Revert this commit
for now until a solution has been found.

Reported-by: Laura Abbott <labbott@redhat.com>
Reported-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-12-21 10:01:30 -07:00
Ido Schimmel
58acfd714e ipv6: Honor specified parameters in fibmatch lookup
Currently, parameters such as oif and source address are not taken into
account during fibmatch lookup. Example (IPv4 for reference) before
patch:

$ ip -4 route show
192.0.2.0/24 dev dummy0 proto kernel scope link src 192.0.2.1
198.51.100.0/24 dev dummy1 proto kernel scope link src 198.51.100.1

$ ip -6 route show
2001:db8:1::/64 dev dummy0 proto kernel metric 256 pref medium
2001:db8:2::/64 dev dummy1 proto kernel metric 256 pref medium
fe80::/64 dev dummy0 proto kernel metric 256 pref medium
fe80::/64 dev dummy1 proto kernel metric 256 pref medium

$ ip -4 route get fibmatch 192.0.2.2 oif dummy0
192.0.2.0/24 dev dummy0 proto kernel scope link src 192.0.2.1
$ ip -4 route get fibmatch 192.0.2.2 oif dummy1
RTNETLINK answers: No route to host

$ ip -6 route get fibmatch 2001:db8:1::2 oif dummy0
2001:db8:1::/64 dev dummy0 proto kernel metric 256 pref medium
$ ip -6 route get fibmatch 2001:db8:1::2 oif dummy1
2001:db8:1::/64 dev dummy0 proto kernel metric 256 pref medium

After:

$ ip -6 route get fibmatch 2001:db8:1::2 oif dummy0
2001:db8:1::/64 dev dummy0 proto kernel metric 256 pref medium
$ ip -6 route get fibmatch 2001:db8:1::2 oif dummy1
RTNETLINK answers: Network is unreachable

The problem stems from the fact that the necessary route lookup flags
are not set based on these parameters.

Instead of duplicating the same logic for fibmatch, we can simply
resolve the original route from its copy and dump it instead.

Fixes: 18c3a61c42 ("net: ipv6: RTM_GETROUTE: return matched fib result when requested")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-21 11:51:06 -05:00
Darrick J. Wong
68c58e9b9a xfs: only skip rmap owner checks for unknown-owner rmap removal
For rmap removal, refactor the rmap owner checks into a separate
function, then skip the checks if we are performing an unknown-owner
removal.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-12-21 08:48:38 -08:00
Darrick J. Wong
33df3a9cf9 xfs: always honor OWN_UNKNOWN rmap removal requests
Calling xfs_rmap_free with an unknown owner is supposed to remove any
rmaps covering that range regardless of owner.  This is used by the EFI
recovery code to say "we're freeing this, it mustn't be owned by
anything anymore", but for whatever reason xfs_free_ag_extent filters
them out.

Therefore, remove the filter and make xfs_rmap_unmap actually treat it
as a wildcard owner -- free anything that's already there, and if
there's no owner at all then that's fine too.

There are two existing callers of bmap_add_free that take care the rmap
deferred ops themselves and use OWN_UNKNOWN to skip the EFI-based rmap
cleanup; convert these to use OWN_NULL (via helpers), and now we really
require that an RUI (if any) gets added to the defer ops before any EFI.

Lastly, now that xfs_free_extent filters out OWN_NULL rmap free requests,
growfs will have to consult directly with the rmap to ensure that there
aren't any rmaps in the grown region.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-12-21 08:48:38 -08:00
Darrick J. Wong
0525e952dc xfs: queue deferred rmap ops for cow staging extent alloc/free in the right order
Under the deferred rmap operation scheme, there's a certain order in
which the rmap deferred ops have to be queued to maintain integrity
during log replay.  For alloc/map operations that order is cui -> rui;
for free/unmap operations that order is cui -> rui -> efi.  However, the
initial refcount code got the ordering wrong in the free side of things
because it queued refcount free op and an EFI and the refcount free op
queued a rmap free op, resulting in the order cui -> efi -> rui.

If we fail before the efd finishes, the efi recovery will try to do a
wildcard rmap removal and the subsequent rui will fail to find the rmap
and blow up.  This didn't ever happen due to other screws up in handling
unknown owner rmap removals, but those other screw ups broke recovery in
other ways, so fix the ordering to follow the intended rules.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-12-21 08:48:38 -08:00
Darrick J. Wong
86d692bfad xfs: set cowblocks tag for direct cow writes too
If a user performs a direct CoW write, we end up loading the CoW fork
with preallocated extents.  Therefore, we must set the cowblocks tag so
that they can be cleared out if we run low on space.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-12-21 08:47:37 -08:00
Darrick J. Wong
10ddf64e42 xfs: remove leftover CoW reservations when remounting ro
When we're remounting the filesystem readonly, remove all CoW
preallocations prior to going ro.  If the fs goes down after the ro
remount, we never clean up the staging extents, which means xfs_check
will trip over them on a subsequent run.  Practically speaking, the next
mount will clean them up too, so this is unlikely to be seen.  Since we
shut down the cowblocks cleaner on remount-ro, we also have to make sure
we start it back up if/when we remount-rw.

Found by adding clonerange to fsstress and running xfs/017.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-12-21 08:47:32 -08:00
Darrick J. Wong
363e59baa4 xfs: don't be so eager to clear the cowblocks tag on truncate
Currently, xfs_itruncate_extents clears the cowblocks tag if i_cnextents
is zero.  This is wrong, since i_cnextents only tracks real extents in
the CoW fork, which means that we could have some delayed CoW
reservations still in there that will now never get cleaned.

Fix a further bug where we /don't/ clear the reflink iflag if there are
any attribute blocks -- really, it's only safe to clear the reflink flag
if there are no data fork extents and no cow fork extents.

Found by adding clonerange to fsstress in xfs/017.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-12-21 08:47:28 -08:00
Stefan Raspl
aa12f594f9 tools/kvm_stat: sort '-f help' output
Sort the fields returned by specifying '-f help' on the command line.
While at it, simplify the code a bit, indent the output and eliminate an
extra blank line at the beginning.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 13:03:32 +01:00
Paolo Bonzini
fae1a3e775 kvm: x86: fix RSM when PCID is non-zero
rsm_load_state_64() and rsm_enter_protected_mode() load CR3, then
CR4 & ~PCIDE, then CR0, then CR4.

However, setting CR4.PCIDE fails if CR3[11:0] != 0.  It's probably easier
in the long run to replace rsm_enter_protected_mode() with an emulator
callback that sets all the special registers (like KVM_SET_SREGS would
do).  For now, set the PCID field of CR3 only after CR4.PCIDE is 1.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Fixes: 660a5d517a
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 12:59:54 +01:00
Jani Nikula
423561a0bd Merge tag 'gvt-fixes-2017-12-21' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2017-12-21:

- default pipe enable fix for virtual display (Xiaolin)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171221032500.xjofb4xyoihw3wo5@zhen-hp.sh.intel.com
2017-12-21 13:08:31 +02:00
Linus Torvalds
966031f340 n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
We added support for EXTPROC back in 2010 in commit 26df6d1340 ("tty:
Add EXTPROC support for LINEMODE") and the intent was to allow it to
override some (all?) ICANON behavior.  Quoting from that original commit
message:

         There is a new bit in the termios local flag word, EXTPROC.
         When this bit is set, several aspects of the terminal driver
         are disabled.  Input line editing, character echo, and mapping
         of signals are all disabled.  This allows the telnetd to turn
         off these functions when in linemode, but still keep track of
         what state the user wants the terminal to be in.

but the problem turns out that "several aspects of the terminal driver
are disabled" is a bit ambiguous, and you can really confuse the n_tty
layer by setting EXTPROC and then causing some of the ICANON invariants
to no longer be maintained.

This fixes at least one such case (TIOCINQ) becoming unhappy because of
the confusion over whether ICANON really means ICANON when EXTPROC is set.

This basically makes TIOCINQ match the case of read: if EXTPROC is set,
we ignore ICANON.  Also, make sure to reset the ICANON state ie EXTPROC
changes, not just if ICANON changes.

Fixes: 26df6d1340 ("tty: Add EXTPROC support for LINEMODE")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 11:19:22 +01:00
Dmitry Torokhov
9b3fa47d4a kobject: fix suppressing modalias in uevents delivered over netlink
The commit 4a336a23d6 ("kobject: copy env blob in one go") optimized
constructing uevent data for delivery over netlink by using the raw
environment buffer, instead of reconstructing it from individual
environment pointers. Unfortunately in doing so it broke suppressing
MODALIAS attribute for KOBJ_UNBIND events, as the code that suppressed this
attribute only adjusted the environment pointers, but left the buffer
itself alone. Let's fix it by making sure the offending attribute is
obliterated form the buffer as well.

Reported-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Casey Leedom <leedom@chelsio.com>
Fixes: 4a336a23d6 ("kobject: copy env blob in one go")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 11:10:33 +01:00
Keith Packard
d2a48e5254 drm: move lease init after validation in drm_lease_create
Patch bd36d3bab2 fixed a deadlock in the
failure path of drm_lease_create. This made the partially initialized
lease object visible for a short window of time.

To avoid having the lessee state appear transiently, I've rearranged
the code so that the lessor fields are not filled in until the
parameters are all validated and the function will succeed.

Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171221065424.1304-1-keithp@keithp.com
2017-12-21 09:49:40 +01:00
David S. Miller
8b6ca2bf5a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2017-12-21

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix multiple security issues in the BPF verifier mostly related
   to the value and min/max bounds tracking rework in 4.14. Issues
   range from incorrect bounds calculation in some BPF_RSH cases,
   to improper sign extension and reg size handling on 32 bit
   ALU ops, missing strict alignment checks on stack pointers, and
   several others that got fixed, from Jann, Alexei and Edward.

2) Fix various build failures in BPF selftests on sparc64. More
   specifically, librt needed to be added to the libs to link
   against and few format string fixups for sizeof, from David.

3) Fix one last remaining issue from BPF selftest build that was
   still occuring on s390x from the asm/bpf_perf_event.h include
   which could not find the asm/ptrace.h copy, from Hendrik.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-20 23:10:29 -05:00
Cathy Avery
d1b8b2391c scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error
When an I/O is returned with an srb_status of SRB_STATUS_INVALID_LUN
which has zero good_bytes it must be assigned an error. Otherwise the
I/O will be continuously requeued and will cause a deadlock in the case
where disks are being hot added and removed. sd_probe_async will wait
forever for its I/O to complete while holding scsi_sd_probe_domain.

Also returning the default error of DID_TARGET_FAILURE causes multipath
to not retry the I/O resulting in applications receiving I/O errors
before a failover can occur.

Signed-off-by: Cathy Avery <cavery@redhat.com>
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:23:11 -05:00
Alexei Starovoitov
82abbf8d2f bpf: do not allow root to mangle valid pointers
Do not allow root to convert valid pointers into unknown scalars.
In particular disallow:
 ptr &= reg
 ptr <<= reg
 ptr += ptr
and explicitly allow:
 ptr -= ptr
since pkt_end - pkt == length

1.
This minimizes amount of address leaks root can do.
In the future may need to further tighten the leaks with kptr_restrict.

2.
If program has such pointer math it's likely a user mistake and
when verifier complains about it right away instead of many instructions
later on invalid memory access it's easier for users to fix their progs.

3.
when register holding a pointer cannot change to scalar it allows JITs to
optimize better. Like 32-bit archs could use single register for pointers
instead of a pair required to hold 64-bit scalars.

4.
reduces architecture dependent behavior. Since code:
r1 = r10;
r1 &= 0xff;
if (r1 ...)
will behave differently arm64 vs x64 and offloaded vs native.

A significant chunk of ptr mangling was allowed by
commit f1174f77b5 ("bpf/verifier: rework value tracking")
yet some of it was allowed even earlier.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:26:29 +01:00
Daniel Borkmann
3db9128fcf Merge branch 'bpf-verifier-sec-fixes'
Alexei Starovoitov says:

====================
This patch set addresses a set of security vulnerabilities
in bpf verifier logic discovered by Jann Horn.
All of the patches are candidates for 4.14 stable.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:42 +01:00
Jann Horn
2255f8d520 selftests/bpf: add tests for recent bugfixes
These tests should cover the following cases:

 - MOV with both zero-extended and sign-extended immediates
 - implicit truncation of register contents via ALU32/MOV32
 - implicit 32-bit truncation of ALU32 output
 - oversized register source operand for ALU32 shift
 - right-shift of a number that could be positive or negative
 - map access where adding the operation size to the offset causes signed
   32-bit overflow
 - direct stack access at a ~4GiB offset

Also remove the F_LOAD_WITH_STRICT_ALIGNMENT flag from a bunch of tests
that should fail independent of what flags userspace passes.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Alexei Starovoitov
bb7f0f989c bpf: fix integer overflows
There were various issues related to the limited size of integers used in
the verifier:
 - `off + size` overflow in __check_map_access()
 - `off + reg->off` overflow in check_mem_access()
 - `off + reg->var_off.value` overflow or 32-bit truncation of
   `reg->var_off.value` in check_mem_access()
 - 32-bit truncation in check_stack_boundary()

Make sure that any integer math cannot overflow by not allowing
pointer math with large values.

Also reduce the scope of "scalar op scalar" tracking.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Jann Horn
179d1c5602 bpf: don't prune branches when a scalar is replaced with a pointer
This could be made safe by passing through a reference to env and checking
for env->allow_ptr_leaks, but it would only work one way and is probably
not worth the hassle - not doing it will not directly lead to program
rejection.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00
Jann Horn
a5ec6ae161 bpf: force strict alignment checks for stack pointers
Force strict alignment checks for stack pointers because the tracking of
stack spills relies on it; unaligned stack accesses can lead to corruption
of spilled registers, which is exploitable.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-21 02:15:41 +01:00