Commit Graph

752486 Commits

Author SHA1 Message Date
Guenter Roeck
1b59788979 hwmon: (k10temp) Add temperature offset for Ryzen 2700X
Ryzen 2700X has a temperature offset of 10 degrees C. If bit 19 of the
Temperature Control register is set, there is an additional offset of
49 degrees C. Take this into account as well.

Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-04-25 05:23:20 -07:00
Theodore Ts'o
4e00b339e2 random: rate limit unseeded randomness warnings
On systems without sufficient boot randomness, no point spamming dmesg.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-04-25 02:41:39 -04:00
Linus Torvalds
3be4aaf4e2 Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns bug fix from Eric Biederman:
 "Just a small fix to properly set the return code on error"

* 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  commoncap: Handle memory allocation failure.
2018-04-24 17:58:51 -07:00
Linus Torvalds
24cac7009c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix rtnl deadlock in ipvs, from Julian Anastasov.

 2) s390 qeth fixes from Julian Wiedmann (control IO completion stalls,
    bad MAC address update sequence, request side races on command IO
    timeouts).

 3) Handle seq_file overflow properly in l2tp, from Guillaume Nault.

 4) Fix VLAN priority mappings in cpsw driver, from Ivan Khoronzhuk.

 5) Packet scheduler ife action fixes (malformed TLV lengths, etc.) from
    Alexander Aring.

 6) Fix out of bounds access in tcp md5 option parser, from Jann Horn.

 7) Missing netlink attribute policies in rtm_ipv6_policy table, from
    Eric Dumazet.

 8) Missing socket address length checks in l2tp and pppoe connect, from
    Guillaume Nault.

 9) Fix netconsole over team and bonding, from Xin Long.

10) Fix race with AF_PACKET socket state bitfields, from Willem de
    Bruijn.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (51 commits)
  ice: Fix insufficient memory issue in ice_aq_manage_mac_read
  sfc: ARFS filter IDs
  net: ethtool: Add missing kernel doc for FEC parameters
  packet: fix bitfield update race
  ice: Do not check INTEVENT bit for OICR interrupts
  ice: Fix incorrect comment for action type
  ice: Fix initialization for num_nodes_added
  igb: Fix the transmission mode of queue 0 for Qav mode
  ixgbevf: ensure xdp_ring resources are free'd on error exit
  team: fix netconsole setup over team
  amd-xgbe: Only use the SFP supported transceiver signals
  amd-xgbe: Improve KR auto-negotiation and training
  amd-xgbe: Add pre/post auto-negotiation phy hooks
  pppoe: check sockaddr length in pppoe_connect()
  l2tp: check sockaddr length in pppol2tp_connect()
  net: phy: marvell: clear wol event before setting it
  ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
  bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
  tcp: don't read out-of-bounds opsize
  ibmvnic: Clean actual number of RX or TX pools
  ...
2018-04-24 14:16:40 -07:00
Hans de Goede
53fa1f6e8a ACPI / video: Only default only_lcd to true on Win8-ready _desktops_
Commit 5928c28152 (ACPI / video: Default lcd_only to true on Win8-ready
and newer machines) made only_lcd default to true on all machines where
acpi_osi_is_win8() returns true, including laptops.

The purpose of this is to avoid the bogus / non-working acpi backlight
interface which many newer BIOS-es define on desktop machines.

But this is causing a regression on some laptops, specifically on the
Dell XPS 13 2013 model, which does not have the LCD flag set for its
fully functional ACPI backlight interface.

Rather then DMI quirking our way out of this, this commits changes the
logic for setting only_lcd to true, to only do this on machines with
a desktop (or server) dmi chassis-type.

Note that we cannot simply only check the chassis-type and not register
the backlight interface based on that as there are some laptops and
tablets which have their chassis-type set to "3" aka desktop. Hopefully
the combination of checking the LCD flag, but only on devices with
a desktop(ish) chassis-type will avoid the needs for DMI quirks for this,
or at least limit the amount of DMI quirks which we need to a minimum.

Fixes: 5928c28152 (ACPI / video: Default lcd_only to true on Win8-ready and newer machines)
Reported-and-tested-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-24 22:42:35 +02:00
David S. Miller
d19efb729f Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2018-04-24

This series contains fixes to ixgbevf, igb and ice drivers.

Colin Ian King fixes the return value on error for the new XDP support
that went into ixgbevf for 4.17.

Vinicius provides a fix for queue 0 for igb, which was not receiving all
the credits it needed when QAV mode was enabled.

Anirudh provides several fixes for the new ice driver, starting with
properly initializing num_nodes_added to zero.  Fixed up a code comment
to better reflect what is really going on in the code.  Fixed how to
detect if an OICR interrupt has occurred to a more reliable method.

Md Fahad fixes the ice driver to allocate the right amount of memory
when reading and storing the devices MAC addresses.  The device can have
up to 2 MAC addresses (LAN and WoL), while WoL is currently not
supported, we need to ensure it can be properly handled when support is
added.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24 16:17:59 -04:00
Md Fahad Iqbal Polash
d6fef10c75 ice: Fix insufficient memory issue in ice_aq_manage_mac_read
For the MAC read operation, the device can return up to two (LAN and WoL)
MAC addresses. Without access to adequate memory, the device will return
an error. Fixed this by allocating the right amount of memory. Also, logic
to detect and copy the LAN MAC address into the port_info structure has
been added. Note that the WoL MAC address is ignored currently as the WoL
feature isn't supported yet.

Fixes: dc49c77236 ("ice: Get MAC/PHY/link info and scheduler topology")
Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24 12:27:49 -07:00
Michael S. Tsirkin
b400003250 virtio_balloon: add array of stat names
Jason Wang points out that it's very hard for users to build an array of
stat names. The naive thing is to use VIRTIO_BALLOON_S_NR but that
breaks if we add more stats - as done e.g. recently by commit 6c64fe7f2
("virtio_balloon: export hugetlb page allocation counts").

Let's add an array of reasonably readable names.

Fixes: 6c64fe7f2 ("virtio_balloon: export hugetlb page allocation counts")
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jonathan Helman <jonathan.helman@oracle.com>
2018-04-24 21:44:01 +03:00
Aurelien Jarno
85602bea29
RISC-V: build vdso-dummy.o with -no-pie
Debian toolcahin defaults to PIE, and I guess that will also be the case
of most distributions. This causes the following build failure:

  AS      arch/riscv/kernel/vdso/getcpu.o
  AS      arch/riscv/kernel/vdso/flush_icache.o
  VDSOLD  arch/riscv/kernel/vdso/vdso.so.dbg
  OBJCOPY arch/riscv/kernel/vdso/vdso.so
  AS      arch/riscv/kernel/vdso/vdso.o
  VDSOLD  arch/riscv/kernel/vdso/vdso-dummy.o
  LD      arch/riscv/kernel/vdso/vdso-syms.o
riscv64-linux-gnu-ld: attempted static link of dynamic object `arch/riscv/kernel/vdso/vdso-dummy.o'
make[2]: *** [arch/riscv/kernel/vdso/Makefile:43: arch/riscv/kernel/vdso/vdso-syms.o] Error 1
make[1]: *** [scripts/Makefile.build:575: arch/riscv/kernel/vdso] Error 2
make: *** [Makefile:1018: arch/riscv/kernel] Error 2

While the root Makefile correctly passes "-fno-PIE" to build individual
object files, the RISC-V kernel also builds vdso-dummy.o as an
executable, which is therefore linked as PIE. Fix that by updating this
specific link rule to also include "-no-pie".

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-24 10:54:46 -07:00
Christoph Hellwig
5b7252a268
riscv: there is no <asm/handle_irq.h>
So don't list it as generic-y.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-24 10:54:23 -07:00
Christoph Hellwig
86e11757d8
riscv: select DMA_DIRECT_OPS instead of redefining it
DMA_DIRECT_OPS is defined in lib/Kconfig, so don't duplicate it in
arch/riscv/Kconfig.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-24 10:54:08 -07:00
Edward Cree
f8d6203780 sfc: ARFS filter IDs
Associate an arbitrary ID with each ARFS filter, allowing to properly query
 for expiry.  The association is maintained in a hash table, which is
 protected by a spinlock.

v3: fix build warnings when CONFIG_RFS_ACCEL is disabled (thanks lkp-robot).
v2: fixed uninitialised variable (thanks davem and lkp-robot).

Fixes: 3af0f34290 ("sfc: replace asynchronous filter operations")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24 13:48:22 -04:00
Florian Fainelli
d805c52093 net: ethtool: Add missing kernel doc for FEC parameters
While adding support for ethtool::get_fecparam and set_fecparam, kernel
doc for these functions was missed, add those.

Fixes: 1a5f3da20b ("net: ethtool: add support for forward error correction modes")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24 13:38:42 -04:00
Willem de Bruijn
a6361f0ca4 packet: fix bitfield update race
Updates to the bitfields in struct packet_sock are not atomic.
Serialize these read-modify-write cycles.

Move po->running into a separate variable. Its writes are protected by
po->bind_lock (except for one startup case at packet_create). Also
replace a textual precondition warning with lockdep annotation.

All others are set only in packet_setsockopt. Serialize these
updates by holding the socket lock. Analogous to other field updates,
also hold the lock when testing whether a ring is active (pg_vec).

Fixes: 8dc4194474 ("[PACKET]: Add optional checksum computation for recvmsg")
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Reported-by: Byoungyoung Lee <byoungyoung@purdue.edu>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24 13:17:08 -04:00
Ben Shelton
30d84397af ice: Do not check INTEVENT bit for OICR interrupts
According to the hardware spec, checking the INTEVENT bit isn't a
reliable way to detect if an OICR interrupt has occurred. This is
because this bit can be cleared by the hardware/firmware before the
interrupt service routine has run. So instead, just check for OICR
events every time.

Fixes: 940b61af02 ("ice: Initialize PF and setup miscellaneous interrupt")
Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24 09:03:23 -07:00
Theodore Ts'o
6c1e851c4e random: fix possible sleeping allocation from irq context
We can do a sleeping allocation from an irq context when CONFIG_NUMA
is enabled.  Fix this by initializing the NUMA crng instances in a
workqueue.

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot+9de458f6a5e713ee8c1a@syzkaller.appspotmail.com
Fixes: 8ef35c866f ("random: set up the NUMA crng instances...")
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-04-24 12:00:08 -04:00
Anirudh Venkataramanan
34357a90d5 ice: Fix incorrect comment for action type
Action type 5 defines large action generic values. Fix comment to
reflect that better.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24 08:56:56 -07:00
Anirudh Venkataramanan
d332a38c95 ice: Fix initialization for num_nodes_added
ice_sched_add_nodes_to_layer is used recursively, and so we start
with num_nodes_added being 0. This way, in case of an error or if
num_nodes is NULL, the function just returns 0 to indicate that no
nodes were added.

Fixes: 5513b920a4 ("ice: Update Tx scheduler tree for VSI multi-Tx queue support")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24 08:55:42 -07:00
Vinicius Costa Gomes
2707df9773 igb: Fix the transmission mode of queue 0 for Qav mode
When Qav mode is enabled, queue 0 should be kept on Stream Reservation
mode. From the i210 datasheet, section 8.12.19:

"Note: Queue0 QueueMode must be set to 1b when TransmitMode is set to
Qav." ("QueueMode 1b" represents the Stream Reservation mode)

The solution is to give queue 0 the all the credits it might need, so
it has priority over queue 1.

A situation where this can happen is when cbs is "installed" only on
queue 1, leaving queue 0 alone. For example:

$ tc qdisc replace dev enp2s0 handle 100: parent root mqprio num_tc 3 \
     	   map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0

$ tc qdisc replace dev enp2s0 parent 100:2 cbs locredit -1470 \
     	   hicredit 30 sendslope -980000 idleslope 20000 offload 1

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24 08:53:19 -07:00
Colin Ian King
39035bfdc3 ixgbevf: ensure xdp_ring resources are free'd on error exit
The current error handling for failed resource setup for xdp_ring
data is a break out of the loop and returning 0 indicated everything
was OK, when in fact it is not.  Fix this by exiting via the
error exit label err_setup_tx that will clean up the resources
correctly and return and error status.

Detected by CoverityScan, CID#1466879 ("Logically dead code")

Fixes: 21092e9ce8 ("ixgbevf: Add support for XDP_TX action")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24 08:20:40 -07:00
Xin Long
9cf2f437ca team: fix netconsole setup over team
The same fix in Commit dbe173079a ("bridge: fix netconsole
setup over bridge") is also needed for team driver.

While at it, remove the unnecessary parameter *team from
team_port_enable_netpoll().

v1->v2:
  - fix it in a better way, as does bridge.

Fixes: 0fb52a27a0 ("team: cleanup netpoll clode")
Reported-by: João Avelino Bellomo Filho <jbellomo@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24 09:36:21 -04:00
Ard Biesheuvel
ac1e55b1fd ACPI / button: make module loadable when booted in non-ACPI mode
Modules such as nouveau.ko and i915.ko have a link time dependency on
acpi_lid_open(), and due to its use of acpi_bus_register_driver(),
the button.ko module that provides it is only loadable when booted in
ACPI mode. However, the ACPI button driver can be built into the core
kernel as well, in which case the dependency can always be satisfied,
and the dependent modules can be loaded regardless of whether the
system was booted in ACPI mode or not.

So let's fix this asymmetry by making the ACPI button driver loadable
as a module even if not booted in ACPI mode, so it can provide the
acpi_lid_open() symbol in the same way as when built into the kernel.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[ rjw: Minor adjustments of comments, whitespace and names. ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-24 12:47:34 +02:00
Markus Mayer
ee53a65dc7 cpufreq: brcmstb-avs-cpufreq: remove development debug support
This debug code was helpful while developing the driver, but it isn't
being used for anything anymore.

Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-24 11:34:57 +02:00
Mika Westerberg
a0a37862a4 ACPI / watchdog: Prefer iTCO_wdt on Lenovo Z50-70
WDAT table on Lenovo Z50-70 is using RTC SRAM (ports 0x70 and 0x71) to
store state of the timer. This conflicts with Linux RTC driver
(rtc-cmos.c) who fails to reserve those ports for itself preventing RTC
from functioning. In addition the WDAT table seems not to be fully
functional because it does not reset the system when the watchdog times
out.

On this system iTCO_wdt works just fine so we simply prefer to use it
instead of WDAT. This makes RTC working again and also results working
watchdog via iTCO_wdt.

Reported-by: Peter Milley <pbmilley@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199033
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-24 11:12:59 +02:00
David S. Miller
6cd968f448 Merge branch 'amd-xgbe-fixes'
aTom Lendacky says:

====================
amd-xgbe: AMD XGBE driver fixes 2018-04-23

This patch series addresses some issues in the AMD XGBE driver.

The following fixes are included in this driver update series:

- Improve KR auto-negotiation and training (2 patches)
  - Add pre and post auto-negotiation hooks
  - Use the pre and post auto-negotiation hooks to disable CDR tracking
    during auto-negotiation page exchange in KR mode
- Check for SFP tranceiver signal support and only use the signal if the
  SFP indicates that it is supported

This patch series is based on net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 21:24:23 -04:00
Tom Lendacky
117df655f8 amd-xgbe: Only use the SFP supported transceiver signals
The SFP eeprom indicates the transceiver signals (Rx LOS, Tx Fault, etc.)
that it supports.  Update the driver to include checking the eeprom data
when deciding whether to use a transceiver signal.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 21:24:22 -04:00
Tom Lendacky
96f4d430c5 amd-xgbe: Improve KR auto-negotiation and training
Update xgbe-phy-v2.c to make use of the auto-negotiation (AN) phy hooks
to improve the ability to successfully complete Clause 73 AN when running
at 10gbps.  Hardware can sometimes have issues with CDR lock when the
AN DME page exchange is being performed.

The AN and KR training hooks are used as follows:
- The pre AN hook is used to disable CDR tracking in the PHY so that the
  DME page exchange can be successfully and consistently completed.
- The post KR training hook is used to re-enable the CDR tracking so that
  KR training can successfully complete.
- The post AN hook is used to check for an unsuccessful AN which will
  increase a CDR tracking enablement delay (up to a maximum value).

Add two debugfs entries to allow control over use of the CDR tracking
workaround.  The debugfs entries allow the CDR tracking workaround to
be disabled and determine whether to re-enable CDR tracking before or
after link training has been initiated.

Also, with these changes the receiver reset cycle that is performed during
the link status check can be performed less often.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 21:24:22 -04:00
Tom Lendacky
4d945663a6 amd-xgbe: Add pre/post auto-negotiation phy hooks
Add hooks to the driver auto-negotiation (AN) flow to allow the different
phy implementations to perform any steps necessary to improve AN.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 21:24:22 -04:00
Guillaume Nault
a49e2f5d5f pppoe: check sockaddr length in pppoe_connect()
We must validate sockaddr_len, otherwise userspace can pass fewer data
than we expect and we end up accessing invalid data.

Fixes: 224cf5ad14 ("ppp: Move the PPP drivers")
Reported-by: syzbot+4f03bdf92fdf9ef5ddab@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 21:12:15 -04:00
Guillaume Nault
eb1c28c058 l2tp: check sockaddr length in pppol2tp_connect()
Check sockaddr_len before dereferencing sp->sa_protocol, to ensure that
it actually points to valid data.

Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Reported-by: syzbot+a70ac890b23b1bf29f5c@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 21:10:43 -04:00
Jingju Hou
b6a930fa88 net: phy: marvell: clear wol event before setting it
If WOL event happened once, the LED[2] interrupt pin will not be
cleared unless we read the CSISR register. If interrupts are in use,
the normal interrupt handling will clear the WOL event. Let's clear the
WOL event before enabling it if !phy_interrupt_is_valid().

Signed-off-by: Jingju Hou <Jingju.Hou@synaptics.com>
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 21:06:41 -04:00
David S. Miller
77621f024d Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:

1) Fix SIP conntrack with phones sending session descriptions for different
   media types but same port numbers, from Florian Westphal.

2) Fix incorrect rtnl_lock mutex logic from IPVS sync thread, from Julian
   Anastasov.

3) Skip compat array allocation in ebtables if there is no entries, also
   from Florian.

4) Do not lose left/right bits when shifting marks from xt_connmark, from
   Jack Ma.

5) Silence false positive memleak in conntrack extensions, from Cong Wang.

6) Fix CONFIG_NF_REJECT_IPV6=m link problems, from Arnd Bergmann.

7) Cannot kfree rule that is already in list in nf_tables, switch order
   so this error handling is not required, from Florian Westphal.

8) Release set name in error path, from Florian.

9) include kmemleak.h in nf_conntrack_extend.c, from Stepheh Rothwell.

10) NAT chain and extensions depend on NF_TABLES.

11) Out of bound access when renaming chains, from Taehee Yoo.

12) Incorrect casting in xt_connmark leads to wrong bitshifting.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 16:22:24 -04:00
Eric Dumazet
aa8f877849 ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
KMSAN reported use of uninit-value that I tracked to lack
of proper size check on RTA_TABLE attribute.

I also believe RTA_PREFSRC lacks a similar check.

Fixes: 86872cb579 ("[IPv6] route: FIB6 configuration using struct fib6_config")
Fixes: c3968a857a ("ipv6: RTA_PREFSRC support for ipv6 route source address selection")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 12:01:21 -04:00
Xin Long
ddea788c63 bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
After Commit 8a8efa22f5 ("bonding: sync netpoll code with bridge"), it
would set slave_dev npinfo in slave_enable_netpoll when enslaving a dev
if bond->dev->npinfo was set.

However now slave_dev npinfo is set with bond->dev->npinfo before calling
slave_enable_netpoll. With slave_dev npinfo set, __netpoll_setup called
in slave_enable_netpoll will not call slave dev's .ndo_netpoll_setup().
It causes that the lower dev of this slave dev can't set its npinfo.

One way to reproduce it:

  # modprobe bonding
  # brctl addbr br0
  # brctl addif br0 eth1
  # ifconfig bond0 192.168.122.1/24 up
  # ifenslave bond0 eth2
  # systemctl restart netconsole
  # ifenslave bond0 br0
  # ifconfig eth2 down
  # systemctl restart netconsole

The netpoll won't really work.

This patch is to remove that slave_dev npinfo setting in bond_enslave().

Fixes: 8a8efa22f5 ("bonding: sync netpoll code with bridge")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 11:52:35 -04:00
Jann Horn
7e5a206ab6 tcp: don't read out-of-bounds opsize
The old code reads the "opsize" variable from out-of-bounds memory (first
byte behind the segment) if a broken TCP segment ends directly after an
opcode that is neither EOL nor NOP.

The result of the read isn't used for anything, so the worst thing that
could theoretically happen is a pagefault; and since the physmap is usually
mostly contiguous, even that seems pretty unlikely.

The following C reproducer triggers the uninitialized read - however, you
can't actually see anything happen unless you put something like a
pr_warn() in tcp_parse_md5sig_option() to print the opsize.

====================================
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <stdlib.h>
#include <errno.h>
#include <stdarg.h>
#include <net/if.h>
#include <linux/if.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/in.h>
#include <linux/if_tun.h>
#include <err.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <assert.h>

void systemf(const char *command, ...) {
  char *full_command;
  va_list ap;
  va_start(ap, command);
  if (vasprintf(&full_command, command, ap) == -1)
    err(1, "vasprintf");
  va_end(ap);
  printf("systemf: <<<%s>>>\n", full_command);
  system(full_command);
}

char *devname;

int tun_alloc(char *name) {
  int fd = open("/dev/net/tun", O_RDWR);
  if (fd == -1)
    err(1, "open tun dev");
  static struct ifreq req = { .ifr_flags = IFF_TUN|IFF_NO_PI };
  strcpy(req.ifr_name, name);
  if (ioctl(fd, TUNSETIFF, &req))
    err(1, "TUNSETIFF");
  devname = req.ifr_name;
  printf("device name: %s\n", devname);
  return fd;
}

#define IPADDR(a,b,c,d) (((a)<<0)+((b)<<8)+((c)<<16)+((d)<<24))

void sum_accumulate(unsigned int *sum, void *data, int len) {
  assert((len&2)==0);
  for (int i=0; i<len/2; i++) {
    *sum += ntohs(((unsigned short *)data)[i]);
  }
}

unsigned short sum_final(unsigned int sum) {
  sum = (sum >> 16) + (sum & 0xffff);
  sum = (sum >> 16) + (sum & 0xffff);
  return htons(~sum);
}

void fix_ip_sum(struct iphdr *ip) {
  unsigned int sum = 0;
  sum_accumulate(&sum, ip, sizeof(*ip));
  ip->check = sum_final(sum);
}

void fix_tcp_sum(struct iphdr *ip, struct tcphdr *tcp) {
  unsigned int sum = 0;
  struct {
    unsigned int saddr;
    unsigned int daddr;
    unsigned char pad;
    unsigned char proto_num;
    unsigned short tcp_len;
  } fakehdr = {
    .saddr = ip->saddr,
    .daddr = ip->daddr,
    .proto_num = ip->protocol,
    .tcp_len = htons(ntohs(ip->tot_len) - ip->ihl*4)
  };
  sum_accumulate(&sum, &fakehdr, sizeof(fakehdr));
  sum_accumulate(&sum, tcp, tcp->doff*4);
  tcp->check = sum_final(sum);
}

int main(void) {
  int tun_fd = tun_alloc("inject_dev%d");
  systemf("ip link set %s up", devname);
  systemf("ip addr add 192.168.42.1/24 dev %s", devname);

  struct {
    struct iphdr ip;
    struct tcphdr tcp;
    unsigned char tcp_opts[20];
  } __attribute__((packed)) syn_packet = {
    .ip = {
      .ihl = sizeof(struct iphdr)/4,
      .version = 4,
      .tot_len = htons(sizeof(syn_packet)),
      .ttl = 30,
      .protocol = IPPROTO_TCP,
      /* FIXUP check */
      .saddr = IPADDR(192,168,42,2),
      .daddr = IPADDR(192,168,42,1)
    },
    .tcp = {
      .source = htons(1),
      .dest = htons(1337),
      .seq = 0x12345678,
      .doff = (sizeof(syn_packet.tcp)+sizeof(syn_packet.tcp_opts))/4,
      .syn = 1,
      .window = htons(64),
      .check = 0 /*FIXUP*/
    },
    .tcp_opts = {
      /* INVALID: trailing MD5SIG opcode after NOPs */
      1, 1, 1, 1, 1,
      1, 1, 1, 1, 1,
      1, 1, 1, 1, 1,
      1, 1, 1, 1, 19
    }
  };
  fix_ip_sum(&syn_packet.ip);
  fix_tcp_sum(&syn_packet.ip, &syn_packet.tcp);
  while (1) {
    int write_res = write(tun_fd, &syn_packet, sizeof(syn_packet));
    if (write_res != sizeof(syn_packet))
      err(1, "packet write failed");
  }
}
====================================

Fixes: cfb6eeb4c8 ("[TCP]: MD5 Signature Option (RFC2385) support.")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23 09:51:06 -04:00
Guenter Roeck
dbac00f0cf hwmon: (nct6683) Enable EC access if disabled at boot
On Asrock Z370M Pro4, it was observed that EC access was disabled after
initially booting the system. As a result, the driver failed to load
with
	nct6683: EC is disabled
After a suspend/resume cycle, the driver loaded correctly.
	nct6683: Found NCT6683D or compatible chip at 0x2e:0xa20
	nct6683 nct6683.2592: NCT6683D EC firmware version 1.0 build 07/18/16

Enable EC access after identifying the chip if disabled to fix the problem.
Warn the user that the data it reports may be unusable, similar to other
drivers for chips from Nuvoton.

Fixes: 41082d66bf ("hwmon: Driver for NCT6683D")
Reported-by: Jonathan Sims <jonathan.625266@earthlink.net>
Tested-by: Jonathan Sims <jonathan.625266@earthlink.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-04-23 06:12:26 -07:00
Jacopo Mondi
60695be2bb dma-mapping: postpone cpu addr translation on mmap
Postpone calling virt_to_page() translation on memory locations not
guaranteed to be backed by a struct page.  Try first to map memory from
the device coherent memory pool, then perform translation if that fails.

On some architectures, specifically SH when configured with the SPARSEMEM
memory model, assuming a struct page is always assigned to a memory
address lead to unexpected hangs during the virtual to page address
translation. This patch fixes that specific issue but applies in the
general case too.

Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-04-23 14:44:24 +02:00
Robin Murphy
41d0bbc749 dma-coherent: clarify dma_mmap_from_dev_coherent documentation
The use of "correctly mapped" here is misleading, since it can give the
wrong expectation in the case that the memory *should* have been mapped
from the per-device pool, but doing so failed for other reasons.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-04-23 14:44:17 +02:00
Takashi Iwai
504a918e67 dma-direct: don't retry allocation for no-op GFP_DMA
When an allocation with lower dma_coherent mask fails, dma_direct_alloc()
retries the allocation with GFP_DMA.  But, this is useless for
architectures that hav no ZONE_DMA.

Fix it by adding the check of CONFIG_ZONE_DMA before retrying the
allocation.

Fixes: 95f183916d ("dma-direct: retry allocations using GFP_DMA for small masks")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-04-23 14:43:27 +02:00
Mika Westerberg
ae860a19f3 PCI / PM: Do not clear state_saved in pci_pm_freeze() when smart suspend is set
If a driver uses DPM_FLAG_SMART_SUSPEND and the device is already
runtime suspended when hibernate is started PCI core skips runtime
resuming the device but still clears pci_dev->state_saved. After the
hibernation image is written pci_pm_thaw_noirq() makes sure subsequent
thaw phases for the device are also skipped leaving it runtime suspended
with pci_dev->state_saved == false.

When the device is eventually runtime resumed pci_pm_runtime_resume()
restores config space by calling pci_restore_standard_config(), however
because pci_dev->state_saved == false pci_restore_state() never actually
restores the config space leaving the device in a state that is not what
the driver might expect.

For example here is what happens for intel-lpss I2C devices once the
hibernation snapshot is taken:

  intel-lpss 0000:00:15.0: power state changed by ACPI to D0
  intel-lpss 0000:00:1e.0: power state changed by ACPI to D3cold
  video LNXVIDEO:00: Restoring backlight state
  PM: hibernation exit
  i2c_designware i2c_designware.1: Unknown Synopsys component type: 0xffffffff
  i2c_designware i2c_designware.0: Unknown Synopsys component type: 0xffffffff
  i2c_designware i2c_designware.1: timeout in disabling adapter
  i2c_designware i2c_designware.0: timeout in disabling adapter

Since PCI config space is not restored the device is still in D3hot
making MMIO register reads return 0xffffffff.

Fix this by clearing pci_dev->state_saved only if we actually end up
runtime resuming the device.

Fixes: c4b65157ae (PCI / PM: Take SMART_SUSPEND driver flag into account)
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-23 08:57:35 +02:00
Mika Westerberg
cc6a0e315a ACPI / scan: Initialize watchdog before PNP
At least on one Dell system the PNP motherboard resources device
includes resources used by WDAT table. Since PNP gets initialized before
WDAT it results following error and no watchdog:

  platform wdat_wdt: failed to claim resource 3: [io  0x046a-0x046c]
  ACPI: watchdog: Device creation failed: -16

Now, the PNP system driver is already accustomed with the situation that
it cannot reserve all those motherboard resources because drivers using
those might have reserved them already. In addition putting WDAT table
resources under motherboard resources device makes sense in general.

Fix this by initializing WDAT right before PNP. This allows WDAT to
reserve all its resources and still keeps PNP system driver happy.

Reported-by: Shubhrata.Priyadarsh@dell.com
Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-23 08:56:38 +02:00
Chen Yu
855c1c2fce ACPI / PM: Blacklist Low Power S0 Idle _DSM for ThinkPad X1 Tablet(2016)
ThinkPad X1 Tablet(2016) is reported to have issues with
the Low Power S0 Idle _DSM interface and since this machine
model generally can do ACPI S3 just fine, and user would
like to use S3 as default sleep model, add a blacklist
entry to disable that interface for ThinkPad X1 Tablet(2016).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199057
Reported-and-tested-by: Robin Lee <robinlee.sysu@gmail.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-23 08:53:17 +02:00
Martin Schwidefsky
6cf09958f3 s390: correct module section names for expoline code revert
The main linker script vmlinux.lds.S for the kernel image merges
the expoline code patch tables into two section ".nospec_call_table"
and ".nospec_return_table". This is *not* done for the modules,
there the sections retain their original names as generated by gcc:
".s390_indirect_call", ".s390_return_mem" and ".s390_return_reg".

The module_finalize code has to check for the compiler generated
section names, otherwise no code patching is done. This slows down
the module code in case of "spectre_v2=off".

Cc: stable@vger.kernel.org # 4.16
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-23 07:57:17 +02:00
Cornelia Huck
3368e547c5 vfio: ccw: process ssch with interrupts disabled
When we call ssch, an interrupt might already be pending once we
return from the START SUBCHANNEL instruction. Therefore we need to
make sure interrupts are disabled while holding the subchannel lock
until after we're done with our processing.

Cc: stable@vger.kernel.org #v4.12+
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-23 07:57:17 +02:00
Martin Schwidefsky
2317b07d05 s390: update sampling tag after task pid change
In a multi-threaded program any thread can call execve(). If this
is not done by the thread group leader, the de_thread() function
replaces the pid of the task that calls execve() with the pid of
thread group leader. If the task reaches user space again without
going over __switch_to() the sampling tag is still set to the old
pid.

Define the arch_setup_new_exec function to verify the task pid
and udpate the tag with LPP if it has changed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-23 07:57:17 +02:00
André Wild
5f3ba878e7 s390/cpum_cf: rename IBM z13/z14 counter names
Change the IBM z13/z14 counter names to be in sync with all other models.

Cc: stable@vger.kernel.org # v4.12+
Fixes: 3593eb944c ("s390/cpum_cf: add hardware counter support for IBM z14")
Fixes: 3fc7acebae ("s390/cpum_cf: add IBM z13 counter event names")
Signed-off-by: André Wild <wild@linux.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-23 07:57:17 +02:00
Stefan Haberland
5d27a2bf6e s390/dasd: fix IO error for newly defined devices
When a new CKD storage volume is defined at the storage server, Linux
may be relying on outdated information about that volume, which leads to
the following errors:

1. Command Reject Errors for minidisk on z/VM:

dasd-eckd.b3193d: 0.0.XXXX: An error occurred in the DASD device driver,
		  reason=09
dasd(eckd): I/O status report for device 0.0.XXXX:
dasd(eckd): in req: 00000000XXXXXXXX CC:00 FC:04 AC:00 SC:17 DS:02 CS:00
	    RC:0
dasd(eckd): device 0.0.2046: Failing CCW: 00000000XXXXXXXX
dasd(eckd): Sense(hex)  0- 7: 80 00 00 00 00 00 00 00
dasd(eckd): Sense(hex)  8-15: 00 00 00 00 00 00 00 00
dasd(eckd): Sense(hex) 16-23: 00 00 00 00 e1 00 0f 00
dasd(eckd): Sense(hex) 24-31: 00 00 40 e2 00 00 00 00
dasd(eckd): 24 Byte: 0 MSG 0, no MSGb to SYSOP

2. Equipment Check errors on LPAR or for dedicated devices on z/VM:

dasd(eckd): I/O status report for device 0.0.XXXX:
dasd(eckd): in req: 00000000XXXXXXXX CC:00 FC:04 AC:00 SC:17 DS:0E CS:40
	    fcxs:01 schxs:00 RC:0
dasd(eckd): device 0.0.9713: Failing TCW: 00000000XXXXXXXX
dasd(eckd): Sense(hex)  0- 7: 10 00 00 00 13 58 4d 0f
dasd(eckd): Sense(hex)  8-15: 67 00 00 00 00 00 00 04
dasd(eckd): Sense(hex) 16-23: e5 18 05 33 97 01 0f 0f
dasd(eckd): Sense(hex) 24-31: 00 00 40 e2 00 04 58 0d
dasd(eckd): 24 Byte: 0 MSG f, no MSGb to SYSOP

Fix this problem by using the up-to-date information provided during
online processing via the device specific SNEQ to detect the case of
outdated LCU data. If there is a difference, perform a re-read of that
data.

Cc: stable@vger.kernel.org
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-23 07:57:16 +02:00
Heiko Carstens
783c3b53b9 s390/uprobes: implement arch_uretprobe_is_alive()
Implement s390 specific arch_uretprobe_is_alive() to avoid SIGSEGVs
observed with uretprobes in combination with setjmp/longjmp.

See commit 2dea1d9c38 ("powerpc/uprobes: Implement
arch_uretprobe_is_alive()") for more details.

With this implemented all test cases referenced in the above commit
pass.

Reported-by: Ziqian SUN <zsun@redhat.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-23 07:57:16 +02:00
Sebastian Ott
af2e460ade s390/cio: update chpid descriptor after resource accessibility event
Channel path descriptors have been seen as something stable (as
long as the chpid is configured). Recent tests have shown that the
descriptor can also be altered when the link state of a channel path
changes. Thus it is necessary to update the descriptor during
handling of resource accessibility events.

Cc: <stable@vger.kernel.org>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-23 07:57:16 +02:00
Sudeep Holla
f18a36cf3f hwmon: (scmi) handle absence of few types of sensors
Currently the loop checks for non-zero count of sensors for each type
of sensors which is completely wrong. It also results in aborting the
registration of sensors if one or more types of sensors are completely
not supported by the platform SCMI firmware.

This patch fixes the issue by continue to loop and skiping sensor types
that are not present.

Fixes: b23688aefb ("hwmon: add support for sensors exported via ARM SCMI")
Reported-by: Jim Quinlan <james.quinlan@broadcom.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-04-22 19:39:55 -07:00