2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-26 14:14:01 +08:00
Commit Graph

760763 Commits

Author SHA1 Message Date
Shannon Nelson
2a8a15526d ixgbe: check ipsec ip addr against mgmt filters
Make sure we don't try to offload the decryption of an incoming
packet that should get delivered to the management engine.  This
is a corner case that will likely be very seldom seen, but could
really confuse someone if they were to hit it.

Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 10:29:32 -07:00
David S. Miller
20677108c5 Merge branch 'mlxsw-Fixes-in-offloading-of-mirror-to-gretap'
Ido Schimmel says:

====================
mlxsw: Fixes in offloading of mirror-to-gretap

Petr says:

These two patches fix issues in offloading of mirror-to-gretap when
bridge is present in the underlay.

In patch #1, reconsideration of SPAN configuration is not done right at
the point that SWITCHDEV_OBJ_ID_PORT_VLAN deletion notification is
distributed, but is postponed, because the notifications are actually
distributed before the relevant change is implemented in the bridge.

In patch #2, a problem in configuring VLAN tagging in situations when a
VLAN device is on top of an 802.1Q bridge whose egress port is marked as
"egress untagged". In that case, mlxsw would neglect to suppress the
tagging implicitly assumed after the VLAN device was seen.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 13:27:58 -04:00
Petr Machata
1fc68bb7c3 mlxsw: spectrum_span: Suppress VLAN on BRIDGE_VLAN_INFO_UNTAGGED
When offloading mirroring to gretap or ip6gretap netdevices, an 802.1q
bridge is one of the soft devices permissible in the underlay when
resolving the packet path. After the packet path is resolved to a
particular bridge egress device, flags on packet VLAN determine whether
the egressed packet should be tagged.

The current logic however only ever sets the VLAN tag, never suppresses
it. Thus if there's a VLAN netdevice above the bridge that determines
the packet VLAN, that VLAN is never unset, and mirroring is configured
with VLAN tagging.

Fix by setting the packet VLAN on both branches: set to zero (for unset)
when BRIDGE_VLAN_INFO_UNTAGGED, copy the resolved VLAN (e.g. from bridge
PVID) otherwise.

Fixes: 946a11e740 ("mlxsw: spectrum_span: Allow bridge for gretap mirror")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 13:27:57 -04:00
Petr Machata
f07ff01406 mlxsw: spectrum_switchdev: Postpone respin on object deletion
VLAN deletion notifications are emitted before the relevant change is
projected to bridge configuration. Thus, like with VLAN addition,
schedule SPAN respin for later.

Fixes: c520bc6986 ("mlxsw: Respin SPAN on switchdev events")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 13:27:56 -04:00
Tony Nguyen
88adce4ea8 ixgbe: fix possible race in reset subtask
Similar to ixgbevf, the same possibility for race exists. Extend the RTNL
lock in ixgbe_reset_subtask() to protect the state bits; this is to make
sure that we get the most up-to-date values for the bits and avoid a
possible race when going down.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 10:25:15 -07:00
Daniel Borkmann
cc5b114dcf bpf, i40e: add meta data support
Add support for XDP meta data when using build skb variant of
the i40e driver. Implementation is analogous to the existing
ixgbe and ixgbevf support for meta data from 366a88fe2f ("bpf,
ixgbe: add meta data support") and be8333322e ("ixgbevf: Add
support for meta data"). With the build skb variant we get
192 bytes of extra headroom which can be used for encaps or
meta data.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 10:23:58 -07:00
Michal Kubecek
fa1be7e01e ipv6: omit traffic class when calculating flow hash
Some of the code paths calculating flow hash for IPv6 use flowlabel member
of struct flowi6 which, despite its name, encodes both flow label and
traffic class. If traffic class changes within a TCP connection (as e.g.
ssh does), ECMP route can switch between path. It's also inconsistent with
other code paths where ip6_flowlabel() (returning only flow label) is used
to feed the key.

Use only flow label everywhere, including one place where hash key is set
using ip6_flowinfo().

Fixes: 51ebd31815 ("ipv6: add support of equal cost multipath (ECMP)")
Fixes: f70ea018da ("net: Add functions to get skb->hash based on flow structures")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 13:21:18 -04:00
YueHaibing
e9c721837d ixgbe: introduce a helper to simplify code
ixgbe_dbg_reg_ops_read and ixgbe_dbg_netdev_ops_read copy-pasting
the same code except for ixgbe_dbg_netdev_ops_buf/ixgbe_dbg_reg_ops_buf,
so introduce a helper ixgbe_dbg_common_ops_read to remove redundant code.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 10:21:13 -07:00
David S. Miller
a925ab48da Revert "ipv6: omit traffic class when calculating flow hash"
This reverts commit 87ae68c8b4.

Applied the wrong version of this fix, correct version
coming up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 13:20:38 -04:00
Emil Tantilov
7d6446db1b ixgbevf: fix possible race in the reset subtask
Extend the RTNL lock in ixgbevf_reset_subtask() to protect the state bits
check in addition to the call to ixgbevf_reinit_locked().

This is to make sure that we get the most up-to-date values for the bits
and avoid a possible race when going down.

Suggested-by: Zhiping du <zhipingdu@tencent.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 10:19:32 -07:00
Michal Kubecek
87ae68c8b4 ipv6: omit traffic class when calculating flow hash
Some of the code paths calculating flow hash for IPv6 use flowlabel member
of struct flowi6 which, despite its name, encodes both flow label and
traffic class. If traffic class changes within a TCP connection (as e.g.
ssh does), ECMP route can switch between path. It's also incosistent with
other code paths where ip6_flowlabel() (returning only flow label) is used
to feed the key.

Use only flow label everywhere, including one place where hash key is set
using ip6_flowinfo().

Fixes: 51ebd31815 ("ipv6: add support of equal cost multipath (ECMP)")
Fixes: f70ea018da ("net: Add functions to get skb->hash based on flow structures")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 13:18:35 -04:00
Alexander Duyck
4be87727d4 ixgbevf: Fix coexistence of malicious driver detection with XDP
In the case of the VF driver it is supposed to provide a context descriptor
that allows us to provide information about the header offsets inside of
the frame. However in the case of XDP we don't really have any of that
information since the data is minimally processed. As a result we were
seeing malicious driver detection (MDD) events being triggered when the PF
had that functionality enabled.

To address this I have added a bit of new code that will "prime" the XDP
ring by providing one context descriptor that assumes the minimal setup of
an Ethernet frame which is an L2 header length of 14. With just that we can
provide enough information to make the hardware happy so that we don't
trigger MDD events.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 10:17:55 -07:00
Linus Torvalds
f956d08a56 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Misc bits and pieces not fitting into anything more specific"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: delete unnecessary assignment in vfs_listxattr
  Documentation: filesystems: update filesystem locking documentation
  vfs: namei: use path_equal() in follow_dotdot()
  fs.h: fix outdated comment about file flags
  __inode_security_revalidate() never gets NULL opt_dentry
  make xattr_getsecurity() static
  vfat: simplify checks in vfat_lookup()
  get rid of dead code in d_find_alias()
  it's SB_BORN, not MS_BORN...
  msdos_rmdir(): kill BS comment
  remove rpc_rmdir()
  fs: avoid fdput() after failed fdget() in vfs_dedupe_file_range()
2018-06-04 10:14:28 -07:00
Sergey Nemov
06140c793d igb: Wait 10ms just once after TX queues reset
Move 10ms sleep out of function resetting TX queue.
Reset all the TX queues in one turn and
wait for all of them just once.

Use usleep_range() instead of mdelay() in order not to
affect transmission on other interfaces.

Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 10:11:14 -07:00
Linus Torvalds
cf626b0da7 Merge branch 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull procfs updates from Al Viro:
 "Christoph's proc_create_... cleanups series"

* 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (44 commits)
  xfs, proc: hide unused xfs procfs helpers
  isdn/gigaset: add back gigaset_procinfo assignment
  proc: update SIZEOF_PDE_INLINE_NAME for the new pde fields
  tty: replace ->proc_fops with ->proc_show
  ide: replace ->proc_fops with ->proc_show
  ide: remove ide_driver_proc_write
  isdn: replace ->proc_fops with ->proc_show
  atm: switch to proc_create_seq_private
  atm: simplify procfs code
  bluetooth: switch to proc_create_seq_data
  netfilter/x_tables: switch to proc_create_seq_private
  netfilter/xt_hashlimit: switch to proc_create_{seq,single}_data
  neigh: switch to proc_create_seq_data
  hostap: switch to proc_create_{seq,single}_data
  bonding: switch to proc_create_seq_data
  rtc/proc: switch to proc_create_single_data
  drbd: switch to proc_create_single
  resource: switch to proc_create_seq_data
  staging/rtl8192u: simplify procfs code
  jfs: simplify procfs code
  ...
2018-06-04 10:00:01 -07:00
Joanna Yurdal
1ec2297c5c igb: Clear TSICR interrupts together with ICR
Issuing "ip link set up/down" can block TSICR interrupts, what results in
missing PTP Tx timestamp and no PPS pulse generation.

Problem happens when the link is set up with the TSICR interrupts pending.
ICR is cleared before enabling interrupts, while TSICR is not. When all TSICR
interrupts are pending at this moment, time_sync interrupt will never
be generated. TSICR should be cleared as well.

In order to reproduce the issue:
1. Setup linux with IEEE 1588 grandmaster and PPS output enabled
2. Continue setting link up/down with random intervals between commands
3. Wait until PPS is not generated ( only one pulse is generated and PPS
dies), and ptp4l complains constantly about Tx timeout.

Signed-off-by: Joanna Yurdal <jyu@trackman.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 09:55:36 -07:00
Linus Torvalds
9c50eafc32 Merge branch 'work.rmdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull rmdir update from Al Viro:
 "More shrink_dcache_parent()-related stuff - killing the main source of
  potentially contended calls of that on large subtrees"

* 'work.rmdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  rmdir(),rename(): do shrink_dcache_parent() only on success
2018-06-04 09:53:33 -07:00
Jeff Kirsher
228046e761 Documentation: e1000: Update kernel documentation
Updated the e1000.txt kernel documentation with the latest information.

Also convert the text file to reStructuredText (RST) format, since the
Linux kernel documentation now uses this format for documentation.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
2018-06-04 09:52:16 -07:00
Jeff Kirsher
85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc
Over the years, several of the links have changed or are no longer valid
so update them.  In addition, the default values were incorrect for a
couple of parameters.

Converted the text file to the reStructuredText (RST) format, since the
Linux kernel documentation now uses this format for documentation.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
2018-06-04 09:48:54 -07:00
Benjamin Poirier
fff200caf6 e1000e: Ignore TSYNCRXCTL when getting I219 clock attributes
There have been multiple reports of crashes that look like
kernel: RIP: 0010:[<ffffffff8110303f>] timecounter_read+0xf/0x50
[...]
kernel: Call Trace:
kernel:  [<ffffffffa0806b0f>] e1000e_phc_gettime+0x2f/0x60 [e1000e]
kernel:  [<ffffffffa0806c5d>] e1000e_systim_overflow_work+0x1d/0x80 [e1000e]
kernel:  [<ffffffff810992c5>] process_one_work+0x155/0x440
kernel:  [<ffffffff81099e16>] worker_thread+0x116/0x4b0
kernel:  [<ffffffff8109f422>] kthread+0xd2/0xf0
kernel:  [<ffffffff8163184f>] ret_from_fork+0x3f/0x70

These can be traced back to the fact that e1000e_systim_reset() skips the
timecounter_init() call if e1000e_get_base_timinca() returns -EINVAL, which
leads to a null deref in timecounter_read().

Commit 83129b37ef ("e1000e: fix systim issues", v4.2-rc1) reworked
e1000e_get_base_timinca() in such a way that it can return -EINVAL for
e1000_pch_spt if the SYSCFI bit is not set in TSYNCRXCTL.

Some experimentation has shown that on I219 (e1000_pch_spt, "MAC: 12")
adapters, the E1000_TSYNCRXCTL_SYSCFI flag is unstable; TSYNCRXCTL reads
sometimes don't have the SYSCFI bit set. Retrying the read shortly after
finds the bit to be set. This was observed at boot (probe) but also link up
and link down.

Moreover, the phc (PTP Hardware Clock) seems to operate normally even after
reads where SYSCFI=0. Therefore, remove this register read and
unconditionally set the clock parameters.

Reported-by: Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de>
Message-Id: <20180425065243.g5mqewg5irkwgwgv@f2>
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1075876
Fixes: 83129b37ef ("e1000e: fix systim issues")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04 09:45:53 -07:00
Linus Torvalds
06c86e66d6 Merge branch 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull dcache updates from Al Viro:
 "This is the first part of dealing with livelocks etc around
  shrink_dcache_parent()."

* 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  restore cond_resched() in shrink_dcache_parent()
  dput(): turn into explicit while() loop
  dcache: move cond_resched() into the end of __dentry_kill()
  d_walk(): kill 'finish' callback
  d_invalidate(): unhash immediately
2018-06-04 08:57:36 -07:00
kbuild test robot
fe083b3f06 net: mvpp2: mvpp2_percpu_read_relaxed() can be static
Fixes: db9d7d36ee ("net: mvpp2: Split the PPv2 driver to a dedicated directory")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 11:36:29 -04:00
Eric Dumazet
eb73190f4f net/packet: refine check for priv area size
syzbot was able to trick af_packet again [1]

Various commits tried to address the problem in the past,
but failed to take into account V3 header size.

[1]

tpacket_rcv: packet too big, clamped from 72 to 4294967224. macoff=96
BUG: KASAN: use-after-free in prb_run_all_ft_ops net/packet/af_packet.c:1016 [inline]
BUG: KASAN: use-after-free in prb_fill_curr_block.isra.59+0x4e5/0x5c0 net/packet/af_packet.c:1039
Write of size 2 at addr ffff8801cb62000e by task kworker/1:2/2106

CPU: 1 PID: 2106 Comm: kworker/1:2 Not tainted 4.17.0-rc7+ #77
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: ipv6_addrconf addrconf_dad_work
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 print_address_description+0x6c/0x20b mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
 __asan_report_store2_noabort+0x17/0x20 mm/kasan/report.c:436
 prb_run_all_ft_ops net/packet/af_packet.c:1016 [inline]
 prb_fill_curr_block.isra.59+0x4e5/0x5c0 net/packet/af_packet.c:1039
 __packet_lookup_frame_in_block net/packet/af_packet.c:1094 [inline]
 packet_current_rx_frame net/packet/af_packet.c:1117 [inline]
 tpacket_rcv+0x1866/0x3340 net/packet/af_packet.c:2282
 dev_queue_xmit_nit+0x891/0xb90 net/core/dev.c:2018
 xmit_one net/core/dev.c:3049 [inline]
 dev_hard_start_xmit+0x16b/0xc10 net/core/dev.c:3069
 __dev_queue_xmit+0x2724/0x34c0 net/core/dev.c:3584
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3617
 neigh_resolve_output+0x679/0xad0 net/core/neighbour.c:1358
 neigh_output include/net/neighbour.h:482 [inline]
 ip6_finish_output2+0xc9c/0x2810 net/ipv6/ip6_output.c:120
 ip6_finish_output+0x5fe/0xbc0 net/ipv6/ip6_output.c:154
 NF_HOOK_COND include/linux/netfilter.h:277 [inline]
 ip6_output+0x227/0x9b0 net/ipv6/ip6_output.c:171
 dst_output include/net/dst.h:444 [inline]
 NF_HOOK include/linux/netfilter.h:288 [inline]
 ndisc_send_skb+0x100d/0x1570 net/ipv6/ndisc.c:491
 ndisc_send_ns+0x3c1/0x8d0 net/ipv6/ndisc.c:633
 addrconf_dad_work+0xbef/0x1340 net/ipv6/addrconf.c:4033
 process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
 worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
 kthread+0x345/0x410 kernel/kthread.c:240
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

The buggy address belongs to the page:
page:ffffea00072d8800 count:0 mapcount:-127 mapping:0000000000000000 index:0xffff8801cb620e80
flags: 0x2fffc0000000000()
raw: 02fffc0000000000 0000000000000000 ffff8801cb620e80 00000000ffffff80
raw: ffffea00072e3820 ffffea0007132d20 0000000000000002 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801cb61ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff8801cb61ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff8801cb620000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                      ^
 ffff8801cb620080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff8801cb620100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Fixes: 2b6867c2ce ("net/packet: fix overflow in check for priv area size")
Fixes: dc808110bb ("packet: handle too big packets for PACKET_V3")
Fixes: f6fb8f100b ("af-packet: TPACKET_V3 flexible buffer implementation.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 11:31:53 -04:00
Colin Ian King
76a45194e5 net: aquantia: make function aq_fw2x_get_mac_permanent static
The function aq_fw2x_get_mac_permanent is local to the source and does
not need to be in global scope, so make it static.

Cleans up sparse warning:
warning: symbol 'aq_fw2x_get_mac_permanent' was not declared. Should it
be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 11:30:50 -04:00
Magnus Karlsson
a65ea68b8d samples/bpf: minor *_nb_free performance fix
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-04 17:21:02 +02:00
Björn Töpel
a412ef54fc samples/bpf: adapted to new uapi
Here, the xdpsock sample application is adjusted to the new descriptor
format.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-04 17:21:02 +02:00
Björn Töpel
bbff2f321a xsk: new descriptor addressing scheme
Currently, AF_XDP only supports a fixed frame-size memory scheme where
each frame is referenced via an index (idx). A user passes the frame
index to the kernel, and the kernel acts upon the data.  Some NICs,
however, do not have a fixed frame-size model, instead they have a
model where a memory window is passed to the hardware and multiple
frames are filled into that window (referred to as the "type-writer"
model).

By changing the descriptor format from the current frame index
addressing scheme, AF_XDP can in the future be extended to support
these kinds of NICs.

In the index-based model, an idx refers to a frame of size
frame_size. Addressing a frame in the UMEM is done by offseting the
UMEM starting address by a global offset, idx * frame_size + offset.
Communicating via the fill- and completion-rings are done by means of
idx.

In this commit, the idx is removed in favor of an address (addr),
which is a relative address ranging over the UMEM. To convert an
idx-based address to the new addr is simply: addr = idx * frame_size +
offset.

We also stop referring to the UMEM "frame" as a frame. Instead it is
simply called a chunk.

To transfer ownership of a chunk to the kernel, the addr of the chunk
is passed in the fill-ring. Note, that the kernel will mask addr to
make it chunk aligned, so there is no need for userspace to do
that. E.g., for a chunk size of 2k, passing an addr of 2048, 2050 or
3000 to the fill-ring will refer to the same chunk.

On the completion-ring, the addr will match that of the Tx descriptor,
passed to the kernel.

Changing the descriptor format to use chunks/addr will allow for
future changes to move to a type-writer based model, where multiple
frames can reside in one chunk. In this model passing one single chunk
into the fill-ring, would potentially result in multiple Rx
descriptors.

This commit changes the uapi of AF_XDP sockets, and updates the
documentation.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-04 17:21:02 +02:00
Björn Töpel
a509a95536 xsk: proper Rx drop statistics update
Previously, rx_dropped could be updated incorrectly, e.g. if the XDP
program redirected the frame to a socket bound to a different queue
than where the XDP program was executing.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-04 17:21:02 +02:00
Björn Töpel
4e64c83525 xsk: proper fill queue descriptor validation
Previously the fill queue descriptor was not copied to kernel space
prior validating it, making it possible for userland to change the
descriptor post-kernel-validation.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-04 17:21:02 +02:00
Linus Torvalds
f459c34538 for-4.18/block-20180603
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJbFIrHAAoJEPfTWPspceCm2+kQAKo7o7HL30aRxJYu+gYafkuW
 PV47zr3e4vhMDEzDaMsh1+V7I7bm3uS+NZu6cFbcV+N9KXFpeb4V4Hvvm5cs+OC3
 WCOBi4eC1h4qnDQ3ZyySrCMN+KHYJ16pZqddEjqw+fhVudx8i+F+jz3Y4ZMDDc3q
 pArKZvjKh2wEuYXUMFTjaXY46IgPt+er94OwvrhyHk+4AcA+Q/oqSfSdDahUC8jb
 BVR3FV4I3NOHUaru0RbrUko13sVZSboWPCIFrlTDz8xXcJOnVHzdVS1WLFDXLHnB
 O8q9cADCfa4K08kz68RxykcJiNxNvz5ChDaG0KloCFO+q1tzYRoXLsfaxyuUDg57
 Zd93OFZC6hAzXdhclDFIuPET9OQIjDzwphodfKKmDsm3wtyOtydpA0o7JUEongp0
 O1gQsEfYOXmQsXlo8Ot+Z7Ne/HvtGZ91JahUa/59edxQbcKaMrktoyQsQ/d1nOEL
 4kXID18wPcFHWRQHYXyVuw6kbpRtQnh/U2m1eenSZ7tVQHwoe6mF3cfSf5MMseak
 k8nAnmsfEvOL4Ar9ftg61GOrImaQlidxOC2A8fmY5r0Sq/ZldvIFIZizsdTTCcni
 8SOTxcQowyqPf5NvMNQ8cKqqCJap3ppj4m7anZNhbypDIF2TmOWsEcXcMDn4y9on
 fax14DPLo59gBRiPCn5f
 =nga/
 -----END PGP SIGNATURE-----

Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

 - clean up how we pass around gfp_t and
   blk_mq_req_flags_t (Christoph)

 - prepare us to defer scheduler attach (Christoph)

 - clean up drivers handling of bounce buffers (Christoph)

 - fix timeout handling corner cases (Christoph/Bart/Keith)

 - bcache fixes (Coly)

 - prep work for bcachefs and some block layer optimizations (Kent).

 - convert users of bio_sets to using embedded structs (Kent).

 - fixes for the BFQ io scheduler (Paolo/Davide/Filippo)

 - lightnvm fixes and improvements (Matias, with contributions from Hans
   and Javier)

 - adding discard throttling to blk-wbt (me)

 - sbitmap blk-mq-tag handling (me/Omar/Ming).

 - remove the sparc jsflash block driver, acked by DaveM.

 - Kyber scheduler improvement from Jianchao, making it more friendly
   wrt merging.

 - conversion of symbolic proc permissions to octal, from Joe Perches.
   Previously the block parts were a mix of both.

 - nbd fixes (Josef and Kevin Vigor)

 - unify how we handle the various kinds of timestamps that the block
   core and utility code uses (Omar)

 - three NVMe pull requests from Keith and Christoph, bringing AEN to
   feature completeness, file backed namespaces, cq/sq lock split, and
   various fixes

 - various little fixes and improvements all over the map

* tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits)
  blk-mq: update nr_requests when switching to 'none' scheduler
  block: don't use blocking queue entered for recursive bio submits
  dm-crypt: fix warning in shutdown path
  lightnvm: pblk: take bitmap alloc. out of critical section
  lightnvm: pblk: kick writer on new flush points
  lightnvm: pblk: only try to recover lines with written smeta
  lightnvm: pblk: remove unnecessary bio_get/put
  lightnvm: pblk: add possibility to set write buffer size manually
  lightnvm: fix partial read error path
  lightnvm: proper error handling for pblk_bio_add_pages
  lightnvm: pblk: fix smeta write error path
  lightnvm: pblk: garbage collect lines with failed writes
  lightnvm: pblk: rework write error recovery path
  lightnvm: pblk: remove dead function
  lightnvm: pass flag on graceful teardown to targets
  lightnvm: pblk: check for chunk size before allocating it
  lightnvm: pblk: remove unnecessary argument
  lightnvm: pblk: remove unnecessary indirection
  lightnvm: pblk: return NVM_ error on failed submission
  lightnvm: pblk: warn in case of corrupted write buffer
  ...
2018-06-04 07:58:06 -07:00
Bob Peterson
6d1c2cf224 MAINTAINERS: Add Andreas Gruenbacher as a maintainer for gfs2
Add Andreas Gruenbacher as a maintainer for the gfs2 file system
and remove Steve Whitehouse.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04 09:33:07 -05:00
Eric Dumazet
d3adce9d03 MAINTAINERS: TCP gets its first maintainer
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:32:00 -04:00
Stephen Suryaputra
2f17becfbe vrf: check the original netdevice for generating redirect
Use the right device to determine if redirect should be sent especially
when using vrf. Same as well as when sending the redirect.

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:16:45 -04:00
Varsha Rao
cfed0a2c98 net: ethernet: mlx4: Remove unnecessary parentheses
This patch fixes the clang warning of extraneous parentheses, with the
following coccinelle script.

@@
identifier i;
expression e;
statement s;
@@
if (
-(i == e)
+i == e
 )
s

Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:14:39 -04:00
Jose Abreu
9a8a02c9d4 net: stmmac: Add Flexible PPS support
This adds support for Flexible PPS output (which is equivalent
to per_out output of PTP subsystem).

Tested using an oscilloscope and the following commands:

1) Start PTP4L:
	# ptp4l -A -4 -H -m -i eth0 &
2) Set Flexible PPS frequency:
	# echo <idx> <ts> <tns> <ps> <pns> > /sys/class/ptp/ptpX/period

Where, ts/tns is start time and ps/pns is period time, and ptpX is ptp
of eth0.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:13:16 -04:00
David S. Miller
9db9268fe4 Merge branch 'qed-next'
Sudarsana Reddy Kalluru says:

====================
qed: Fix issues in UFP feature commit 'cac6f691'.

This patch series fixes couple of issues in the UFP feature commit,
   cac6f691: Add support for Unified Fabric Port.

Changes from previous version:
------------------------------
v2: Added "Fixes:" tag.

Please consider applying it to "net-next".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:10:47 -04:00
Sudarsana Reddy Kalluru
b5fabb0800 qed: Fix use of incorrect shmem address.
Incorrect shared memory address is used while deriving the values
for tc and pri_type. Use shmem address corresponding to 'oem_cfg_func'
where the management firmare saves tc/pri_type values.

Fixes: cac6f691 ("qed: Add support for Unified Fabric Port")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:10:46 -04:00
Sudarsana Reddy Kalluru
5e9f20359a qed: Fix shared memory inconsistency between driver and the MFW.
The structure shared between driver and management firmware (MFW)
differ in sizes. The additional field defined by the MFW is not
relevant to the current driver. Add a dummy field to the structure.

Fixes: cac6f691 ("qed: Add support for Unified Fabric Port")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:10:46 -04:00
David S. Miller
1ec23c5407 Merge branch 'selftests-mirror_vlan-fixes'
Petr Machata says:

====================
selftests: forwarding: mirror_vlan: Fixlets

This patchset includes two small fixes for the tests that were
introduced in commit 1bb58d2d3c ("Merge branch
'Mirroring-tests-involving-VLAN'").

In patch #1, a "tc action trap" is uninstalled after the suite runs,
instead of being installed again.

In patch #2, a test in suite is renamed to differentiate it from another
test of the same name.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:08:46 -04:00
Petr Machata
6ebe5a7a66 selftests: forwarding: mirror_vlan: Change test description
The test description is displayed with the PASS/FAIL resolution after
the test is ran. There however already is one other test described
exactly like this, which makes it unclear which of the tests passed or
failed. Make the description unique.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:08:45 -04:00
Petr Machata
00d5622967 selftests: forwarding: mirror_vlan: Uninstall trap
Instead of installing a trap before tests run and uninstalling it after
they run, mirror_vlan.sh installs it twice due to a typo. Fix the typo.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 10:08:44 -04:00
Srinivas Kandagatla
ff2faf1289
ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it
dapm_kcontrol_data is freed as part of dapm_kcontrol_free(), leaving the
paths pointer dangling in the list.

This leads to system crash when we try to unload and reload sound card.
I hit this bug during ADSP crash/reboot test case on Dragon board DB410c.

Without this patch, on SLAB Poisoning enabled build, kernel crashes with
"BUG kmalloc-128 (Tainted: G        W        ): Poison overwritten"

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
2018-06-04 14:56:22 +01:00
David S. Miller
8284fd4cb8 Merge branch 'selftests-net-various'
Willem de Bruijn says:

====================
selftests/net: various

A few odds and ends to network tests:

- msg_zerocopy: run as part of kselftest
- udp gso:      add missing bounds test for minimal sizes
- psocket_snd:  initial basic conformance test
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 09:50:12 -04:00
Willem de Bruijn
75f0139fd6 selftests/net: add packet socket packet_snd test
Add regression tests for PF_PACKET transmission using packet_snd.

The TPACKET ring interface has tests for transmission and reception.
This is an initial stab at the same for the send call based interface.

Packets are sent over loopback, then read twice. The entire packet is
read from another packet socket and compared. The packet is also
verified to arrive at a UDP socket for protocol conformance.

The test sends a packet over loopback, testing the following options
(not the full cross-product):

- SOCK_DGRAM
- SOCK_RAW
- vlan tag
- qdisc bypass
- bind() and sendto()
- virtio_net_hdr
- csum offload (NOT actual csum feature, ignored on loopback)
- gso

Besides these basic functionality tests, the test runs from a set
of bounds checks, positive and negative. Running over loopback, which
has dev->min_header_len, it cannot generate variable length hhlen.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 09:50:01 -04:00
Willem de Bruijn
00f333e8d6 selftests/net: udpgso: test small gso_size boundary conditions
Verify that udpgso can generate segments smaller than device mtu, down
to the extreme case of 1B gso_size.

Verify that irrespective of gso_size, udpgso restricts the number of
segments it will generate per call (UDP_MAX_SEGMENTS).

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 09:49:39 -04:00
Willem de Bruijn
830669e691 selftests/net: enable msg_zerocopy test
The existing msg_zerocopy test takes additional protocol arguments.
Add a variant that takes no arguments and runs all supported variants.
Call this from kselftest.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 09:49:39 -04:00
Tonghao Zhang
2fa3c8a8b2 net: virtio: simplify the virtnet_find_vqs
Use the common free functions while return successfully.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 09:33:22 -04:00
Andreas Gruenbacher
628e366df1 gfs2: Iomap cleanups and improvements
Clean up gfs2_iomap_alloc and gfs2_iomap_get.  Document how
gfs2_iomap_alloc works: it now needs to be called separately after
gfs2_iomap_get where necessary; this will be used later by iomap write.
Move gfs2_iomap_ops into bmap.c.

Introduce a new gfs2_iomap_get_alloc helper and use it in
fallocate_chunk: gfs2_iomap_begin will become unsuitable for fallocate
with proper iomap write support.

In gfs2_block_map and fallocate_chunk, zero-initialize struct iomap.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04 07:56:51 -05:00
Andreas Gruenbacher
845802b112 gfs2: Remove ordered write mode handling from gfs2_trans_add_data
In journaled data mode, we need to add each buffer head to the current
transaction.  In ordered write mode, we only need to add the inode to
the ordered inode list.  So far, both cases are handled in
gfs2_trans_add_data.  This makes the code look misleading and is
inefficient for small block sizes as well.  Handle both cases separately
instead.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04 07:50:16 -05:00
Andreas Gruenbacher
d6382a3505 gfs2: gfs2_stuffed_write_end cleanup
First, change the sanity check in gfs2_stuffed_write_end to check for
the actual write size instead of the requested write size.

Second, use the existing teardown code in gfs2_write_end instead of
duplicating it in gfs2_stuffed_write_end.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04 07:45:53 -05:00