Commit Graph

855575 Commits

Author SHA1 Message Date
Tony Nguyen
3015b8fcb6 ice: Bump version number
Update driver version to 0.7.5

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:41:09 -07:00
Akeem G Abodunrin
b67f25d76e ice: Remove flag to track VF interrupt status
As a result of refactoring of VF VSIs interrupts code, there is no
need to track its configuration status again with ICE_VF_STATE_CFG_INTR
flag - In fact, it is not being checked anywhere in the code right now, so
this patch removes the dead code as applicable to the flag.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:41:05 -07:00
Brett Creeley
ba880734ba ice: Remove unnecessary flag ICE_FLAG_MSIX_ENA
This flag is not needed and is called every time we re-enable interrupts
in the hotpath so remove it. Also remove ice_vsi_req_irq() because it
was a wrapper function for ice_vsi_req_irq_msix() whose sole purpose was
checking the ICE_FLAG_MSIX_ENA flag.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:41:01 -07:00
Akeem G Abodunrin
9921494463 ice: Don't return error for disabling LAN Tx queue that does exist
Since Tx rings are being managed by FW/NVM, Tx rings might have not been
set up or driver had already wiped them off - In that case, call to
disable LAN Tx queue is being returned as not in existence. This patch
makes sure we don't return unnecessary error for such scenario.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:40:57 -07:00
Brett Creeley
a1e9968593 ice: Remove duplicate code in ice_alloc_rx_bufs
Currently if the call to ice_alloc_mapped_page() fails we jump to the
no_buf label, possibly call ice_release_rx_desc(), and return true
indicating that there is more work to do. In the success case we just
fall out of the while loop, possibly call ice_alloc_mapped_page(), and
return false saying we exhausted cleaned_count. This flow can be
improved by breaking if ice_alloc_mapped_page() fails and then the flow
outside of the while loop is the same for the failure and success case.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:40:52 -07:00
Brett Creeley
56923ab664 ice: Add stats for Rx drops at the port level
Currently we are not reporting dropped counts at the port level to
ethtool or netlink. This was found when debugging Rx dropped issues
and the total packets sent did not equal the total packets received
minus the rx_dropped, which was very confusing. To determine dropped
counts at the port level we need to read the PRTRPB_RDPC register.
To fix reporting we will store the dropped counts in the PF's
rx_discards. This will be reported to netlink by storing it in the
PF VSI's rx_missed_errors signaling that the receiver missed the
packet. Also, we will report this to ethtool in the rx_dropped.nic
field.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:40:46 -07:00
Akeem G Abodunrin
66b29e7a88 ice: Update number of VF queue before setting VSI resources
In case there is a request from a VF to change its number of queues, and
the request was successful, we need to update number of queues
configured on the VF before updating corresponding VSI for that VF,
especially LAN Tx queue tree and TC update, otherwise, we would continued
to use old value of vf->num_vf_qs for allocated Tx/Rx queues...

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:40:42 -07:00
Akeem G Abodunrin
d5a4635917 ice: Set up Tx scheduling tree based on alloc VSI Tx queues
This patch uses allocated number of Tx queues per VSI to set up its
scheduling tree instead of using total number of available Tx queues.
Only PF VSIs have total number of allocated Tx queues equal to number
of available Tx queues, other VSIs have different number of queues
configured.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:40:35 -07:00
Brett Creeley
cb7db35641 ice: Only bump Rx tail and release buffers once per napi_poll
Currently we bump the Rx tail and release/give buffers to hardware every
16 descriptors. This causes us to bump Rx tail up to 4 times per
napi_poll call. Also we are always bumping tail on an odd index and this
is a problem because hardware ignores the lower 3 bits in the QRX_TAIL
register. This is making it so hardware sees tail bumps only every 8
descriptors. Instead lets only bump Rx tail once per napi_poll if
the value aligns with hardware's expectations of the lower 3 bits being
cleared. Also only release/give Rx buffers once per napi_poll call.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:40:30 -07:00
Akeem G Abodunrin
c7aeb4d1b9 ice: Disable VFs until reset is completed
This patch adds code to clear VFs enable status until reset is completed,
and Tx/Rx rings are setup. Without this patch, the code flow request Tx
queues to be disabled after reset, especially PFR - where VF VSI Tx rings
have already been wiped off in the NVM and result to adminq error based on
the call to disable Tx LAN queue in ice_reset_all_vfs function call.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Tony Nguyen
6d5999467d ice: Do not configure port with no media
The firmware reports an error when trying to configure a port with no
media. Instead of always configuring the port, check for media before
attempting to configure it. In the absence of media, turn off link and
poll for media to become available before re-enabling link.

Move ice_force_phys_link_state() up to avoid forward declaration.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Jacob Keller
5c91ecfda5 ice: separate out control queue lock creation
The ice_init_all_ctrlq and ice_shutdown_all_ctrlq functions create and
destroy the locks used to protect the send and receive process of each
control queue.

This is problematic, as the driver may use these functions to shutdown
and re-initialize the control queues at run time. For example, it may do
this in response to a device reset.

If the driver failed to recover from a reset, it might leave the control
queues offline. In this case, the locks will no longer be initialized.
A later call to ice_sq_send_cmd will then attempt to acquire a lock that
has been destroyed.

It is incorrect behavior to access a lock that has been destroyed.

Indeed, ice_aq_send_cmd already tries to avoid accessing an offline
control queue, but the check occurs inside the lock.

The root of the problem is that the locks are destroyed at run time.

Modify ice_init_all_ctrlq and ice_shutdown_all_ctrlq such that they no
longer create or destroy the locks.

Introduce new functions, ice_create_all_ctrlq and ice_destroy_all_ctrlq.
Call these functions in ice_init_hw and ice_deinit_hw.

Now, the control queue locks will remain valid for the life of the
driver, and will not be destroyed until the driver unloads.

This also allows removing a duplicate check of the sq.count and
rq.count values when shutting down the controlqs. The ice_shutdown_ctrlq
function already checks this value under the lock. Previously
commit dec64ff10e ("ice: use [sr]q.count when checking if queue is
initialized") needed this check to happen outside the lock, because it
prevented duplicate attempts at destroying the locks.

The driver may now safely use ice_init_all_ctrlq and
ice_shutdown_all_ctrlq while handling reset events, without causing the
locks to be invalid.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Brett Creeley
c31a5c25bb ice: Always set prefena when configuring an Rx queue
Currently we are always setting prefena to 0. This is causing the
hardware to only fetch descriptors when there are none free in the cache
for a received packet instead of prefetching when it has used the last
descriptor regardless of incoming packets. Fix this by allowing the
hardware to prefetch Rx descriptors.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Tony Nguyen
17bc6d0721 ice: Move vector base setup to PF VSI
When interrupt tracking was refactored, during rebuild, the call to
ice_vsi_setup_vector_base() was inadvertently removed from the PF VSI
instead of being removed from the VF VSI. During reset, the failure to
properly setup the vector base generates a call trace. Correct this so
that resets/rebuilds properly complete.

Fixes: cbe66bfee6 ("ice: Refactor interrupt tracking")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Jacob Keller
36517fd397 ice: track hardware stat registers past rollover
Currently, ice_stat_update32 and ice_stat_update40 will limit the
value of the software statistic to 32 or 40 bits wide, depending on
which register is being read.

This means that if a driver is running for a long time, the displayed
software register values will roll over to zero at 40 bits or 32 bits.

This occurs because the functions directly assign the difference between
the previous value and current value of the hardware statistic.

Instead, add this value to the current software statistic, and then
update the previous value.

In this way, each time ice_stat_update40 or ice_stat_update32 are
called, they will increment the software tracking value by the
difference of the hardware register from its last read. The software
tracking value will correctly count up until it overflows a u64.

The only requirement is that the ice_stat_update functions be called at
least once each time the hardware register overflows.

While we're fixing ice_stat_update40, modify it to use rd64 instead of
two calls to rd32. Additionally, drop the now unnecessary hireg
function parameter.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Paul Greenwalt
5a056cd7ea ice: add lp_advertising flow control support
Add support for reporting link partner advertising when
ETHTOOL_GLINKSETTINGS defined. Get pause param reports the Tx/Rx
pause configured, and then ethtool issues ETHTOOL_GSET ioctl and
ice_get_settings_link_up reports the negotiated Tx/Rx pause. Negotiated
pause frame report per IEEE 802.3-2005 table 288-3.

$ ethtool --show-pause ens6f0
Pause parameters for ens6f0:
Autonegotiate:  on
RX:             on
TX:             on
RX negotiated:  on
TX negotiated:  on

$ ethtool ens6f0
Settings for ens6f0:
        Supported ports: [ FIBRE ]
        Supported link modes:   25000baseCR/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Supported FEC modes: None BaseR RS
        Advertised link modes:  25000baseCR/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: None BaseR RS
        Link partner advertised link modes:  Not reported
        Link partner advertised pause frame use: Symmetric
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 25000Mb/s
        Duplex: Full
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes

When ETHTOOL_GLINKSETTINGS is not defined, get pause param reports the
negotiated Tx/Rx pause.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
YueHaibing
6a7ce95d75 staging/octeon: Fix build error without CONFIG_NETDEVICES
While do COMPILE_TEST build without CONFIG_NETDEVICES,
we get Kconfig warning:

WARNING: unmet direct dependencies detected for PHYLIB
  Depends on [n]: NETDEVICES [=n]
  Selected by [y]:
  - OCTEON_ETHERNET [=y] && STAGING [=y] && (CAVIUM_OCTEON_SOC && NETDEVICES [=n] || COMPILE_TEST [=y])

Reported-by: Hulk Robot <hulkci@huawei.com>
Reported-by: Mark Brown <broonie@kernel.org>
Fixes: 171a9bae68 ("staging/octeon: Allow test build on !MIPS")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31 09:07:05 -07:00
David S. Miller
ac5fe22636 We have a reasonably large number of changes:
* lots more HE (802.11ax) support, particularly things
    relevant for the the AP side, but also mesh support
  * debugfs cleanups from Greg
  * some more work on extended key ID
  * start using genl parallel_ops, as preparation for
    weaning ourselves off RTNL and getting parallelism
  * various other changes all over
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl1BtsAACgkQB8qZga/f
 l8QTdQ//bD3NLitiIW4rBmC/w9str2TRpUbRN8Hf9xTnRh1smLGQop8jI8xpYdAE
 hXFlWn4glCkTkZrqtT8IDMcERb2YPHI94L7lMR1L6dXH7e1tiBZ7Qum5+rojLUCQ
 /XXZJnVa9AbdzBDtkhzCjkN8dCldMnkId0m6vkGdFre/b70FLDg2GlolqIchbDCz
 GxeKaw/uTLwu9nxEJhspDmiXDQtQd1ZsGNZRrovQ2nUuUfvX9sU54OmaoDHhqrUM
 iSwFZJAhrCV7nidOLs2X1rVs8hRymbrYnGu6NdUlfTQMqaOVgWhqniM1RMnJnnVi
 Ec/1OQg4KIj2rjJIXW/3QXAzAO6TDwV8xNk9al3m+yAkkSXR7tXP01plNUoMU4UW
 YxcaPD341H919GIqa71lieeEMyi7XIKKFfaZvGFLiDzidKKi0JkDuC+1w5UOJrbj
 PM/dzvKqvwOMEa5XahjrUQLsiGMgxBkDo36+F4rfaelzDKPuJBOokDS23vg/lYf5
 2v74WbeJVc45cAhSwp5/0RTsG0xhmFImfrE66VNB1vcJUVLvHTuATfXsBbRahOz/
 SglGBAGqigxJwIeYfgec3lPNfdV+0LwYnPAEjBIjCO8qlGvst4jLroHg7ayApr2E
 JfUbaWtpEb9cgAz7X6tEjv/BAnsRWPzpxb8Nm6JoZttTtA2VakE=
 =eUnh
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2019-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
We have a reasonably large number of changes:
 * lots more HE (802.11ax) support, particularly things
   relevant for the the AP side, but also mesh support
 * debugfs cleanups from Greg
 * some more work on extended key ID
 * start using genl parallel_ops, as preparation for
   weaning ourselves off RTNL and getting parallelism
 * various other changes all over
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31 08:59:41 -07:00
David S. Miller
164f0de315 Merge branch 'mlxsw-Test-coverage-for-DSCP-leftover-fix'
Petr Machata says:

====================
mlxsw: Test coverage for DSCP leftover fix

This patch set fixes some global scope pollution issues in the DSCP tests
(in patch #1), and then proceeds (in patch #2) to add a new test for
checking whether, after DSCP prioritization rules are removed from a port,
DSCP rewrites consistently to zero, instead of the last removed rule still
staying in effect.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31 08:47:14 -07:00
Petr Machata
d11786bb96 selftests: mlxsw: Add a test for leftover DSCP rule
Commit dedfde2fe1 ("mlxsw: spectrum_dcb: Configure DSCP map as the last
rule is removed") fixed a problem in mlxsw where last DSCP rule to be
removed remained in effect when DSCP rewrite was applied.

Add a selftest that covers this problem.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31 08:47:13 -07:00
Petr Machata
7700476f31 selftests: mlxsw: Fix local variable declarations in DSCP tests
These two tests have some problems in the global scope pollution and on
contrary, contain unnecessary local declarations. Fix them.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31 08:47:13 -07:00
Ding Xiang
7084148854 myri10ge: remove unneeded variable
"error" is unneeded,just return 0

Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31 08:44:00 -07:00
Christophe JAILLET
a9d41e7b8b net: ag71xx: Slighly simplify code in 'ag71xx_rings_init()'
A few lines above, we have:
   tx_size = BIT(tx->order);

So use 'tx_size' directly to be consistent with the way 'rx->descs_cpu' and
'rx->descs_dma' are computed below.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31 08:38:17 -07:00
Shay Bar
f39b07fdfb mac80211: HE STA disassoc due to QOS NULL not sent
In case of HE AP-STA link, ieee80211_send_nullfunc() will not
send the QOS NULL packet to check if AP is still associated.

In this case, probe_send_count will be non-zero and
ieee80211_sta_work() will later disassociate the AP, even
though no packet was ever sent.

Fix this by decrementing probe_send_count and not calling
ieee80211_send_nullfunc() in case of HE link, so that we
still wait for some time for the AP beacon to reappear and
don't disconnect right away.

Signed-off-by: Shay Bar <shay.bar@celeno.com>
Link: https://lore.kernel.org/r/20190703131848.22879-1-shay.bar@celeno.com
[clarify commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-31 13:26:41 +02:00
John Crispin
1ced169cc1 mac80211: allow setting spatial reuse parameters from bss_conf
Store the OBSS PD parameters inside bss_conf when bringing up an AP and/or
when a station connects to an AP. This allows the driver to configure the
HW accordingly.

Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190730163701.18836-3-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-31 11:00:52 +02:00
Johannes Berg
6d4dd4ef1a nl80211: add strict start type
Add a strict start type so all new attributes starting from
NL80211_ATTR_HE_OBSS_PD are validated strictly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-31 11:00:52 +02:00
John Crispin
796e90f42b cfg80211: add support for parsing OBBS_PD attributes
Add the data structure, policy and parsing code allowing userland to send
the OBSS PD information into the kernel.

Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190730163701.18836-2-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-31 11:00:52 +02:00
Karthikeyan Periyasamy
52dba8d7d5 mac80211: reject zero MAC address in add station
This came up in fuzz testing, and really we don't consider
all-zeroes to be a valid MAC address in most places, so
also reject it here to avoid confusion later on.

Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org>
Link: https://lore.kernel.org/r/1563959770-21570-1-git-send-email-periyasa@codeaurora.org
[rewrite commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-31 11:00:52 +02:00
Johannes Berg
50508d941c cfg80211: use parallel_ops for genl
Over time, we really need to get rid of all of our global locking.
One of the things needed is to use parallel_ops. This isn't really
the most important (RTNL is much more important) but OTOH we just
keep adding uses of genl_family_attrbuf() now. Use .parallel_ops to
disallow this.

Reviewed-By: Denis Kenzior <denkenz@gmail.com>
Link: https://lore.kernel.org/r/20190729143109.18683-1-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-31 11:00:46 +02:00
Johannes Berg
05d610af3e mac80211_hwsim: fill boottime_ns in netlink RX path
Give a proper boottime_ns value for netlink RX to avoid scan
issues here.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20190729160605.1074-1-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-31 10:51:43 +02:00
Colin Ian King
f12cac539f mac80211: add missing null return check from call to ieee80211_get_sband
The return from ieee80211_get_sband can potentially be a null pointer, so
it seems prudent to add a null check to avoid a null pointer dereference
on sband.

Addresses-Coverity: ("Dereference null return")
Fixes: 2ab4587675 ("mac80211: add support for the ADDBA extension element")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20190730143205.14261-1-colin.king@canonical.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-31 10:51:17 +02:00
David S. Miller
5133f36cef Merge branch 'net-dsa-ksz-Add-Microchip-KSZ87xx-support'
Marek Vasut says:

====================
net: dsa: ksz: Add Microchip KSZ87xx support

This series adds support for Microchip KSZ87xx switches, which are
slightly simpler compared to KSZ9xxx .
====================

Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:12:50 -07:00
Tristram Ha
e66f840c08 net: dsa: ksz: Add Microchip KSZ8795 DSA driver
Add Microchip KSZ8795 DSA driver.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:12:50 -07:00
Tristram Ha
016e43a26b net: dsa: ksz: Add KSZ8795 tag code
Add DSA tag code for Microchip KSZ8795 switch. The switch is simpler
and the tag is only 1 byte, instead of 2 as is the case with KSZ9477.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:12:50 -07:00
Marek Vasut
4c173472d0 dt-bindings: net: dsa: ksz: document Microchip KSZ87xx family switches
Document Microchip KSZ87xx family switches. These include
KSZ8765 - 5 port switch
KSZ8794 - 4 port switch
KSZ8795 - 5 port switch

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:12:50 -07:00
David S. Miller
c69e6eafff Merge branch 'vsock-virtio-optimizations-to-increase-the-throughput'
Stefano Garzarella says:

====================
vsock/virtio: optimizations to increase the throughput

This series tries to increase the throughput of virtio-vsock with slight
changes.
While I was testing the v2 of this series I discovered an huge use of memory,
so I added patch 1 to mitigate this issue. I put it in this series in order
to better track the performance trends.

v5:
- rebased all patches on net-next
- added Stefan's R-b and Michael's A-b

v4: https://patchwork.kernel.org/cover/11047717
v3: https://patchwork.kernel.org/cover/10970145
v2: https://patchwork.kernel.org/cover/10938743
v1: https://patchwork.kernel.org/cover/10885431

Below are the benchmarks step by step. I used iperf3 [1] modified with VSOCK
support. As Michael suggested in the v1, I booted host and guest with 'nosmap'.

A brief description of patches:
- Patches 1:   limit the memory usage with an extra copy for small packets
- Patches 2+3: reduce the number of credit update messages sent to the
               transmitter
- Patches 4+5: allow the host to split packets on multiple buffers and use
               VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max packet size allowed

                    host -> guest [Gbps]
pkt_size before opt   p 1     p 2+3    p 4+5

32         0.032     0.030    0.048    0.051
64         0.061     0.059    0.108    0.117
128        0.122     0.112    0.227    0.234
256        0.244     0.241    0.418    0.415
512        0.459     0.466    0.847    0.865
1K         0.927     0.919    1.657    1.641
2K         1.884     1.813    3.262    3.269
4K         3.378     3.326    6.044    6.195
8K         5.637     5.676   10.141   11.287
16K        8.250     8.402   15.976   16.736
32K       13.327    13.204   19.013   20.515
64K       21.241    21.341   20.973   21.879
128K      21.851    22.354   21.816   23.203
256K      21.408    21.693   21.846   24.088
512K      21.600    21.899   21.921   24.106

                    guest -> host [Gbps]
pkt_size before opt   p 1     p 2+3    p 4+5

32         0.045     0.046    0.057    0.057
64         0.089     0.091    0.103    0.104
128        0.170     0.179    0.192    0.200
256        0.364     0.351    0.361    0.379
512        0.709     0.699    0.731    0.790
1K         1.399     1.407    1.395    1.427
2K         2.670     2.684    2.745    2.835
4K         5.171     5.199    5.305    5.451
8K         8.442     8.500   10.083    9.941
16K       12.305    12.259   13.519   15.385
32K       11.418    11.150   11.988   24.680
64K       10.778    10.659   11.589   35.273
128K      10.421    10.339   10.939   40.338
256K      10.300     9.719   10.508   36.562
512K       9.833     9.808   10.612   35.979

As Stefan suggested in the v1, I measured also the efficiency in this way:
    efficiency = Mbps / (%CPU_Host + %CPU_Guest)

The '%CPU_Guest' is taken inside the VM. I know that it is not the best way,
but it's provided for free from iperf3 and could be an indication.

        host -> guest efficiency [Mbps / (%CPU_Host + %CPU_Guest)]
pkt_size before opt   p 1     p 2+3    p 4+5

32         0.35      0.45     0.79     1.02
64         0.56      0.80     1.41     1.54
128        1.11      1.52     3.03     3.12
256        2.20      2.16     5.44     5.58
512        4.17      4.18    10.96    11.46
1K         8.30      8.26    20.99    20.89
2K        16.82     16.31    39.76    39.73
4K        30.89     30.79    74.07    75.73
8K        53.74     54.49   124.24   148.91
16K       80.68     83.63   200.21   232.79
32K      132.27    132.52   260.81   357.07
64K      229.82    230.40   300.19   444.18
128K     332.60    329.78   331.51   492.28
256K     331.06    337.22   339.59   511.59
512K     335.58    328.50   331.56   504.56

        guest -> host efficiency [Mbps / (%CPU_Host + %CPU_Guest)]
pkt_size before opt   p 1     p 2+3    p 4+5

32         0.43      0.43     0.53     0.56
64         0.85      0.86     1.04     1.10
128        1.63      1.71     2.07     2.13
256        3.48      3.35     4.02     4.22
512        6.80      6.67     7.97     8.63
1K        13.32     13.31    15.72    15.94
2K        25.79     25.92    30.84    30.98
4K        50.37     50.48    58.79    59.69
8K        95.90     96.15   107.04   110.33
16K      145.80    145.43   143.97   174.70
32K      147.06    144.74   146.02   282.48
64K      145.25    143.99   141.62   406.40
128K     149.34    146.96   147.49   489.34
256K     156.35    149.81   152.21   536.37
512K     151.65    150.74   151.52   519.93

[1] https://github.com/stefano-garzarella/iperf/
====================

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:00:00 -07:00
Stefano Garzarella
0038ff357f vsock/virtio: change the maximum packet size allowed
Since now we are able to split packets, we can avoid limiting
their sizes to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE.
Instead, we can use VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max
packet size.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:00:00 -07:00
Stefano Garzarella
6dbd3e66e7 vhost/vsock: split packets to send using multiple buffers
If the packets to sent to the guest are bigger than the buffer
available, we can split them, using multiple buffers and fixing
the length in the packet header.
This is safe since virtio-vsock supports only stream sockets.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:00:00 -07:00
Stefano Garzarella
9632e9f61b vsock/virtio: fix locking in virtio_transport_inc_tx_pkt()
fwd_cnt and last_fwd_cnt are protected by rx_lock, so we should use
the same spinlock also if we are in the TX path.

Move also buf_alloc under the same lock.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:00:00 -07:00
Stefano Garzarella
b89d882dc9 vsock/virtio: reduce credit update messages
In order to reduce the number of credit update messages,
we send them only when the space available seen by the
transmitter is less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:00:00 -07:00
Stefano Garzarella
473c7391ce vsock/virtio: limit the memory used per-socket
Since virtio-vsock was introduced, the buffers filled by the host
and pushed to the guest using the vring, are directly queued in
a per-socket list. These buffers are preallocated by the guest
with a fixed size (4 KB).

The maximum amount of memory used by each socket should be
controlled by the credit mechanism.
The default credit available per-socket is 256 KB, but if we use
only 1 byte per packet, the guest can queue up to 262144 of 4 KB
buffers, using up to 1 GB of memory per-socket. In addition, the
guest will continue to fill the vring with new 4 KB free buffers
to avoid starvation of other sockets.

This patch mitigates this issue copying the payload of small
packets (< 128 bytes) into the buffer of last packet queued, in
order to avoid wasting memory.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 15:00:00 -07:00
Stephen Boyd
d1a55841ab net: Remove dev_err() usage after platform_get_irq()
We don't need dev_err() messages when platform_get_irq() fails now that
platform_get_irq() prints an error message itself when something goes
wrong. Let's remove these prints with a simple semantic patch.

// <smpl>
@@
expression ret;
struct platform_device *E;
@@

ret =
(
platform_get_irq(E, ...)
|
platform_get_irq_byname(E, ...)
);

if ( \( ret < 0 \| ret <= 0 \) )
{
(
-if (ret != -EPROBE_DEFER)
-{ ...
-dev_err(...);
-... }
|
...
-dev_err(...);
)
...
}
// </smpl>

While we're here, remove braces on if statements that only have one
statement (manually).

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:37:35 -07:00
David S. Miller
2d73a6c38d Merge branch 'Finish-conversion-of-skb_frag_t-to-bio_vec'
Jonathan Lemon says:

====================
Finish conversion of skb_frag_t to bio_vec

The recent conversion of skb_frag_t to bio_vec did not include
skb_frag's page_offset.  Add accessor functions for this field,
utilize them, and remove the union, restoring the original structure.

v2:
  - rename accessors
  - follow kdoc conventions
====================

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:21:32 -07:00
Jonathan Lemon
65c84f148e linux: Remove bvec page_offset, use bv_offset
Now that page_offset is referenced through accessors, remove
the union, and use bv_offset.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:21:32 -07:00
Jonathan Lemon
b54c9d5bd6 net: Use skb_frag_off accessors
Use accessor functions for skb fragment's page_offset instead
of direct references, in preparation for bvec conversion.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:21:32 -07:00
Jonathan Lemon
7240b60c98 linux: Add skb_frag_t page_offset accessors
Add skb_frag_off(), skb_frag_off_add(), skb_frag_off_set(),
and skb_frag_off_copy() accessors for page_offset.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:21:31 -07:00
David S. Miller
6ca04afbf9 Merge branch 'sctp-clean-up-sctp_connect-function'
Xin Long says:

====================
sctp: clean up __sctp_connect function

This patchset is to factor out some common code for
sctp_sendmsg_new_asoc() and __sctp_connect() into 2
new functioins.

v1->v2:
  - add the patch 1/5 to avoid a slab-out-of-bounds warning.
  - add some code comment for the check change in patch 2/5.
  - remove unused 'addrcnt' as Marcelo noticed in patch 3/5.
====================

Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:18:14 -07:00
Xin Long
a64e59c72c sctp: factor out sctp_connect_add_peer
In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it adds a peer with the other addr into the
asoc after this asoc is created with the 1st addr.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:18:14 -07:00
Xin Long
f26f995122 sctp: factor out sctp_connect_new_asoc
In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it creates the asoc and adds a peer with the
1st addr.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:18:14 -07:00
Xin Long
dd8378b3af sctp: clean up __sctp_connect
__sctp_connect is doing quit similar things as sctp_sendmsg_new_asoc.
To factor out common functions, this patch is to clean up their code
to make them look more similar:

  1. create the asoc and add a peer with the 1st addr.
  2. add peers with the other addrs into this asoc one by one.

while at it, also remove the unused 'addrcnt'.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30 14:18:14 -07:00