Commit Graph

678577 Commits

Author SHA1 Message Date
LABBE Corentin
93264150b0 arm64: allwinner: pine64: Enable dwmac-sun8i
The dwmac-sun8i hardware is present on the pine64
It uses an external PHY via RMII.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:08 -04:00
LABBE Corentin
103aefa01c arm64: allwinner: sun50i-a64: add dwmac-sun8i Ethernet driver
The dwmac-sun8i is an Ethernet MAC that supports 10/100/1000 Mbit
connections. It is very similar to the device found in the Allwinner
H3, but lacks the internal 100 Mbit PHY and its associated control
bits.
This adds the necessary bits to the Allwinner A64 SoC .dtsi, but keeps
it disabled at this level.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:07 -04:00
LABBE Corentin
b89acf34c6 arm64: allwinner: sun50i-a64: Add dt node for the syscon control module
This patch add the dt node for the syscon register present on the
Allwinner A64.

Only two register are present in this syscon and the only one useful is
the one dedicated to EMAC clock.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:07 -04:00
LABBE Corentin
6f9461d6a4 arm: sun8i: nanopi-neo: Enable dwmac-sun8i
The dwmac-sun8i hardware is present on the NanoPi Neo.
It uses the internal PHY.
This patch create the needed emac node.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:07 -04:00
LABBE Corentin
29eb9d2984 arm: sun8i: orangepi-pc-plus: Set EMAC activity LEDs to active high
On the Orange Pi PC Plus, the polarity of the LEDs on the RJ45 Ethernet
port were changed from active low to active high.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:06 -04:00
LABBE Corentin
0d38218c4d arm: sun8i: orangepi-2: Enable dwmac-sun8i
The dwmac-sun8i hardware is present on the Orange PI 2.
It uses the internal PHY.

This patch create the needed emac node.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:06 -04:00
LABBE Corentin
bec8f59b74 arm: sun8i: orangepi-one: Enable dwmac-sun8i
The dwmac-sun8i hardware is present on the Orange PI One.
It uses the internal PHY.

This patch create the needed emac node.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:06 -04:00
LABBE Corentin
0e4da34445 arm: sun8i: orangepi-zero: Enable dwmac-sun8i
The dwmac-sun8i hardware is present on the Orange PI Zero.
It uses the internal PHY.

This patch create the needed emac node.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:05 -04:00
LABBE Corentin
62781b2878 arm: sun8i: orangepi-pc: Enable dwmac-sun8i
The dwmac-sun8i hardware is present on the Orange PI PC.
It uses the internal PHY.

This patch create the needed emac node.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:05 -04:00
LABBE Corentin
33125eaae4 arm: sun8i: sunxi-h3-h5: add dwmac-sun8i ethernet driver
The dwmac-sun8i is an ethernet MAC hardware that support 10/100/1000
speed.

This patch enable the dwmac-sun8i on Allwinner H3/H5 SoC Device-tree.
SoC H3/H5 have an internal PHY, so optionals syscon and ephy are set.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:05 -04:00
LABBE Corentin
2c0cba482e arm: sun8i: sunxi-h3-h5: Add dt node for the syscon control module
This patch add the dt node for the syscon register present on the
Allwinner H3/H5

Only two register are present in this syscon and the only one useful is
the one dedicated to EMAC clock..

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:04 -04:00
LABBE Corentin
9f93ac8d40 net-next: stmmac: Add dwmac-sun8i
The dwmac-sun8i is a heavy hacked version of stmmac hardware by
allwinner.
In fact the only common part is the descriptor management and the first
register function.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:04 -04:00
LABBE Corentin
ce5a4ff3c5 dt-bindings: syscon: Add DT bindings documentation for Allwinner syscon
This patch adds documentation for Device-Tree bindings for the
syscon present in allwinner devices.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:04 -04:00
LABBE Corentin
0441bde003 dt-bindings: net-next: Add DT bindings documentation for Allwinner dwmac-sun8i
This patch adds documentation for Device-Tree bindings for the
Allwinner dwmac-sun8i driver.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:03 -04:00
LABBE Corentin
ec33d71de7 net-next: stmmac: add optional setup function
Instead of adding more ifthen logic for adding a new mac_device_info
setup function, it is easier to add a function pointer to the function
needed.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:03 -04:00
LABBE Corentin
3874191898 net-next: stmmac: export stmmac_set_mac_addr/stmmac_get_mac_addr
Thoses symbol will be needed for the dwmac-sun8i ethernet driver.
For letting it to be build as module, they need to be exported.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:53:02 -04:00
Stephen Rothwell
042cc40934 powerpc: use asm-generic/socket.h as much as possible
asm-generic/socket.h already has an exception for the differences that
powerpc needs, so just include it after defining the differences.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 14:48:05 -04:00
Yotam Gigi
ce6ef68f43 mlxsw: spectrum: Implement the ethtool flash_device callback
Add callback to the ethtool flash_device op. This callback uses the mlxfw
module to flash the new firmware file to the device.

As the firmware flash process takes about 20 seconds and ethtool takes the
rtnl lock during the flash_device callback, release the rtnl lock at the
beginning of the flash process and take it again before leaving the
callback. This way, the rtnl is not held during the process. To make sure
the device does not get deleted during the flash process, take a reference
to it before releasing the rtnl lock.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:25:41 -04:00
David S. Miller
40aa306f8d Merge branch 'qed-Status-block-changes'
Yuval Mintz says:

====================
qed: Status block changes

The device maintains a CAM mapping of the internal status blocks
and the various PF/VF MSI-x vector mappings.
During initialization, the driver reads the HW memory and constructs
a shadow SW implementation which it would later use for manipulation
of interrupts. E.g., when enabling VFs and setting their MSI-x tables.

The driver currently has some very strict assumptions on the order the
entries are placed in the CAM. Specifically, it assumes that all entries
belonging to a PF would be consecutive and in-order in the CAM, and that
the VF entries would then follow. But there's no actual HW constraint
enforcing this assumption [although management firmware does set it
accordingly to same assumption initially].

Since the CAM is re-configurable, there are now SW flows employeed
by other OSes that might cause the assumption to be invalid.
Such flows allow the PF to forfeit some of it's available interrupts
in favor of its VFs or vice versa.
While those are not employeed today by qed, we want to relax the
assumptions as much as we can -
both to allow functionality after PDA as well as allowing future
compatibility where the driver would be loaded after a newer one has
'dirtied' the CAM configuration.

In addition to patches meant for the above relaxation, the series
also contains various cleanups & refactoring for interrupt logic
[most of which is !semantic].
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:21 -04:00
Mintz, Yuval
1ee240e31d qed: No need to reset SBs on IOV init
Since we're resetting the IGU CAM each time we initialize the PF
device, there's no need to reset the VF SBs again when initializing
IOV.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:20 -04:00
Mintz, Yuval
ebbdcc669c qed: Reset IGU CAM to default on init
The IGU CAM contains an assocaition between hardware SBs
and interrupt lines, and it can be dynamically configured
to allow more interrupts in one entity over another, specifically
for Re-distibution of SBs between a PF and its child VFs.

While we don't yet use this functionality, there are other
clients that do and as such its possible the information
passed from management firmware during initialization in
regard to the possible number of SBs doesn't accurately reflect
the current HW configuration.

The following changes are going to apply to the driver init sequence:

 a. PF is going to re-configure all entries belonging to itself and
    its child VFs in IGU CAM based on the management firmware info
    regarding the number of SBs that are supposed to exist there.

 b. PF is going to stop using the SB resource [management firmware
    provided information] for anything but the initialization.
    Instead, it would use the live-time counters it maintains for
    the numbers.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:20 -04:00
Mintz, Yuval
50a207147f qed: Hold a single array for SBs
A PF today holds 2 different arrays - one holding information
about the HW configuration and one holding information about
the SBs that are used by the protocol drivers.
These arrays aren't really connected - e.g., protocol driver
initializing a given SB would not mark the same SB as occupied
in the HW shadow array.

Move into a single array [at least for PFs] - hold the mapping
of the driver-protocol SBs on the HW entry which they configure.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:19 -04:00
Mintz, Yuval
09b6b14749 qed: Provide auxiliary for getting free VF SB
IOV code is very intrusive in its manipulation of the status block
database.
Add a new auxiliary function to allow the PF to find an available unused
status block to configure for a specific VF's MSI-x vector.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:19 -04:00
Mintz, Yuval
1ac72433c5 qed: Remove assumption on SB order in IGU
Current code assumes there's a known layout for SBs in the IGU,
where all the SBs of a single entity would be laid in consecutive
order of vectors.

While the assumption is still kept by management firmware, we already
have the necessary information to eliminate it, so no reason to keep
it in code.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:18 -04:00
Mintz, Yuval
726fdbe9fa qed: Encapsulate interrupt counters in struct
We already have an API struct that contains interrupt-related
numbers. Use it to encapsulate all information relating to the
status of SBs as (used|free).

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:18 -04:00
Mintz, Yuval
a333f7f3fd qed: Add aux. function translating sb_id -> igu_sb_id
An additional step for relaxing the IGU order assumption, we now add
an auxiliary function that can be used for finding the HW status block
that's associated with a given MSI-x vector.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:17 -04:00
Mintz, Yuval
d031548e91 qed: Distinguish between sb_id and igu_sb_id
In qed code, sb_id means 2 different things:
  - An interrupt vector [usually when received as a parameter from
    a protocol driver, but not only] that's associated with a status
    block.

  - An index to a status block entity existing in HW.

This patch renames the references to the HW entity, adding an 'igu_'
prefix to allow an easier distinction.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:17 -04:00
Mintz, Yuval
d749dd0dc1 qed: IGU read revised
As a first step for relaxing various assumptions done by driver
about the IGU mapping, the driver is now going to read the entire
IGU into a shadow copy, and mark in its database each status block
that's relevant for it.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:17 -04:00
Mintz, Yuval
979cead3de qed: Minor refactoring in interrupt code
Separate the portions controlling interrupt enablement form those
controlling the ability of HW to generate attentions.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:16 -04:00
Mintz, Yuval
8befd73c23 qed: Make qed_int_cau_conf_pi() static
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:17:16 -04:00
Colin Ian King
7b954ed752 net: dsa: make function ksz_rcv static
function ksz_rcv can be made static as it does not need to be
in global scope. Reformat arguments to make it checkpatch warning
free too.

Cleans up sparse warning: "symbol 'ksz_rcv' was not declared. Should
it be static?"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 12:12:40 -04:00
Gao Feng
97fcc193f6 ppp: remove unnecessary bh disable in xmit path
Since the commit 55454a5658 ("ppp: avoid dealock on recursive xmit"),
the PPP xmit path is protected by wrapper functions which disable the
bh already. So it is unnecessary to disable the bh again in the real
xmit path.

Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 11:57:36 -04:00
Roopa Prabhu
ba52d61e0f ipv4: route: restore skb_dst_set in inet_rtm_getroute
recent updates to inet_rtm_getroute dropped skb_dst_set in
inet_rtm_getroute. This patch restores it because it is
needed to release the dst correctly.

Fixes: 3765d35ed8 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
Reported-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 11:30:41 -04:00
David S. Miller
a5e2ee5da4 bpf: Take advantage of stack_depth tracking in sparc64 JIT
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:35:00 -07:00
David S. Miller
551f40c42f Merge branch 'dsa-add-Microchip-KSZ9477-DSA-driver'
Woojung Huh says:

====================
dsa: add Microchip KSZ9477 DSA driver

This series of patches is for Microchip KSZ9477 DSA driver.
KSZ9477 is 7 ports GigE switch with numerous advanced features.
5 ports are 10/100/1000 Mbps internal PHYs and 2 ports have
Interfaces to SGMII, RGMII, MII or RMII.

This patch supports VLAN, MDB, FDB and port mirroring offloads.

Welcome reviews and comments from community.

Note: Tests are performed on internal development board.

V5
- add missing MODULE_LICENSE

V4
- update per review comments
- cosmetic changes
- net/dsa/tag_ksz.c
  * skb_put() & memset() are changed to skb_put_padto()
- drivers/net/dsa/microchip/ksz_common.
   * vlan access mutex is updated
   * mib_names[] is changed to static const

V3
- update per review comments
- cosmetic changes
- drivers/net/dsa/microchip/ksz_common.c
  * clean up ksz_switch_chips[]
  * consolidate checking loops into functions
  * update mutex for better locking
  * replace devm_kmalloc_array() to devm_kcalloc()
- MAINTAINERS
  * add missing net/dsa/tag_ksz.c

V2
- update per review comments
- several cosmetic changes
- net/dsa/tag_ksz.c
  * constants are changed to defines
  * remove skb_linearize() in ksz_rcv()
  * ksz_xmit()checks skb tailroom before allocate new skb
- drivers/net/phy/micrel.c
  * remove PHY_HAS_MAGICANEG from ksphy_driver[]
- drivers/net/dsa/microchip/ksz_common.c
  * add timeout to avoid endless loop
  * port initialization is move to ksz_port_enable() instead of  ksz_setup_ports()
- Documentation/devicetree/bindings/net/dsa/ksz.txt
  * fix typo and indentations
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 20:56:40 -04:00
Woojung Huh
419585a98e dsa: add maintainer of Microchip KSZ switches
Adding maintainer of Microchip KSZ switches.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 20:56:32 -04:00
Woojung Huh
5033a7cbec net: dsa: Add Microchip KSZ switches binding
A sample SPI configuration for Microchip KSZ switches.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 20:56:31 -04:00
Woojung Huh
b987e98e50 dsa: add DSA switch driver for Microchip KSZ9477
The KSZ9477 is a fully integrated layer 2, managed, 7 ports GigE switch
with numerous advanced features. 5 ports incorporate 10/100/1000 Mbps PHYs.
The other 2 ports have interfaces that can be configured as SGMII, RGMII, MII
or RMII. Either of these may connect directly to a host processor or
to an external PHY. The SGMII port may interface to a fiber optic transceiver.

This driver currently supports vlan, fdb, mdb & mirror dsa switch operations.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 20:56:31 -04:00
Woojung Huh
fc3973a1fa phy: micrel: add Microchip KSZ 9477 Switch PHY support
Adding Microchip 9477 Phy included in KSZ9477 Switch.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 20:56:31 -04:00
Woojung Huh
8b8010fb78 dsa: add support for Microchip KSZ tail tagging
Adding support for the Microchip KSZ switch family tail tagging.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 20:56:31 -04:00
David S. Miller
6c21a2a65d Merge branch 'bpf-stack-tracker'
Alexei Starovoitov says:

====================
bpf: stack depth tracking

Introduce tracking of bpf program stack depth in the verifier and use that
info to reduce bpf program stack consumption in the interpreter and x64 JIT.
Other JITs can take advantage of it as well in the future.
Most of the programs consume very little stack, so it's good optimization
in general and it's the first step toward bpf to bpf function calls.

Also use internal opcode for bpf_tail_call() marking to make clear
that jmp|call|x opcode is not uapi and may be used for actual
indirect call opcode in the future.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:48 -04:00
Alexei Starovoitov
2960ae48c4 bpf: take advantage of stack_depth tracking in x64 JIT
Take advantage of stack_depth tracking in x64 JIT.
Round up allocated stack by 8 bytes to make sure it stays aligned
for functions called from JITed bpf program.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:48 -04:00
Alexei Starovoitov
177366bf7c bpf: change x86 JITed program stack layout
in order to JIT programs with different stack sizes we need to
make epilogue and exception path to be stack size independent,
hence move auxiliary stack space from the bottom of the stack
to the top of the stack.
Nice side effect is that JITed function prologue becomes shorter
due to imm8 offset encoding vs imm32.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:48 -04:00
Alexei Starovoitov
b870aa901f bpf: use different interpreter depending on required stack size
16 __bpf_prog_run() interpreters for various stack sizes add .text
but not a lot comparing to run-time stack savings

   text	   data	    bss	    dec	    hex	filename
  26350   10328     624   37302    91b6 kernel/bpf/core.o.before_split
  25777   10328     624   36729    8f79 kernel/bpf/core.o.after_split
  26970	  10328	    624	  37922	   9422	kernel/bpf/core.o.now

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:48 -04:00
Alexei Starovoitov
105c03614b bpf: fix stack_depth usage by test_bpf.ko
test_bpf.ko doesn't call verifier before selecting interpreter or JITing,
hence the tests need to manually specify the amount of stack they consume.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:48 -04:00
Alexei Starovoitov
50bbfed967 bpf: track stack depth of classic bpf programs
To track stack depth of classic bpf programs we only need
to analyze ST|STX instructions, since check_load_and_stores()
verifies that programs can load from stack only after write.

We also need to change the way cBPF stack slots map to eBPF stack,
since typical classic programs are using slots 0 and 1, so they
need to map to stack offsets -4 and -8 respectively in order
to take advantage of small stack interpreter and JITs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:47 -04:00
Alexei Starovoitov
80a58d0255 bpf: reconcile bpf_tail_call and stack_depth
The next set of patches will take advantage of stack_depth tracking,
so make sure that the program that does bpf_tail_call() has
stack depth large enough for the callee.
We could have tracked the stack depth of the prog_array owner program
and only allow insertion of the programs with stack depth less
than the owner, but it will break existing applications.
Some of them have trivial root bpf program that only does
multiple bpf_tail_calls and at init time the prog array is empty.
In the future we may add a flag to do such tracking optionally,
but for now play simple and safe.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:47 -04:00
Alexei Starovoitov
8726679a0f bpf: teach verifier to track stack depth
teach verifier to track bpf program stack depth

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:47 -04:00
Alexei Starovoitov
f696b8f471 bpf: split bpf core interpreter
split __bpf_prog_run() interpreter into stack allocation and execution parts.
The code section shrinks which helps interpreter performance in some cases.
   text	   data	    bss	    dec	    hex	filename
  26350	  10328	    624	  37302	   91b6	kernel/bpf/core.o.before
  25777	  10328	    624	  36729	   8f79	kernel/bpf/core.o.after

Very short programs got slower (due to extra function call):
Before:
test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:0 7 PASS
test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:0 8 PASS
test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647 jited:0 7 PASS
test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296 jited:0 11 PASS
test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 = -1 jited:0 7 PASS
After:
test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:0 11 PASS
test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:0 11 PASS
test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647 jited:0 11 PASS
test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296 jited:0 14 PASS
test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 = -1 jited:0 10 PASS

Longer programs got faster:
Before:
test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations jited:0 20286 20513 PASS
test_bpf: #267 BPF_MAXINSNS: Call heavy transformations jited:0 31853 31768 PASS
test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:0 9815 PASS
test_bpf: #269 BPF_MAXINSNS: Very long jump backwards jited:0 6 PASS
test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse jited:0 13959 PASS
test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ... jited:0 210 PASS
test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id jited:0 21724 PASS
test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop jited:0 19118 PASS
After:
test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations jited:0 19008 18827 PASS
test_bpf: #267 BPF_MAXINSNS: Call heavy transformations jited:0 29238 28450 PASS
test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:0 9485 PASS
test_bpf: #269 BPF_MAXINSNS: Very long jump backwards jited:0 12 PASS
test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse jited:0 13257 PASS
test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ... jited:0 213 PASS
test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id jited:0 19389 PASS
test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop jited:0 19583 PASS

For real world production programs the difference is noise.

This patch is first step towards reducing interpreter stack consumption.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:47 -04:00
Alexei Starovoitov
71189fa9b0 bpf: free up BPF_JMP | BPF_CALL | BPF_X opcode
free up BPF_JMP | BPF_CALL | BPF_X opcode to be used by actual
indirect call by register and use kernel internal opcode to
mark call instruction into bpf_tail_call() helper.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31 19:29:47 -04:00