Alexei Starovoitov says:
====================
pull-request: bpf-next 2021-12-30
The following pull-request contains BPF updates for your *net-next* tree.
We've added 72 non-merge commits during the last 20 day(s) which contain
a total of 223 files changed, 3510 insertions(+), 1591 deletions(-).
The main changes are:
1) Automatic setrlimit in libbpf when bpf is memcg's in the kernel, from Andrii.
2) Beautify and de-verbose verifier logs, from Christy.
3) Composable verifier types, from Hao.
4) bpf_strncmp helper, from Hou.
5) bpf.h header dependency cleanup, from Jakub.
6) get_func_[arg|ret|arg_cnt] helpers, from Jiri.
7) Sleepable local storage, from KP.
8) Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support, from Kumar.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
sock.h is pretty heavily used (5k objects rebuilt on x86 after
it's touched). We can drop the include of filter.h from it and
add a forward declaration of struct sk_filter instead.
This decreases the number of rebuilt objects when bpf.h
is touched from ~5k to ~1k.
There's a lot of missing includes this was masking. Primarily
in networking tho, this time.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org
The ice hardware contains an embedded chip with firmware which can be
updated using devlink flash. The firmware which runs on this chip is
referred to as the Embedded Management Processor firmware (EMP
firmware).
Activating the new firmware image currently requires that the system be
rebooted. This is not ideal as rebooting the system can cause unwanted
downtime.
In practical terms, activating the firmware does not always require a
full system reboot. In many cases it is possible to activate the EMP
firmware immediately. There are a couple of different scenarios to
cover.
* The EMP firmware itself can be reloaded by issuing a special update
to the device called an Embedded Management Processor reset (EMP
reset). This reset causes the device to reset and reload the EMP
firmware.
* PCI configuration changes are only reloaded after a cold PCIe reset.
Unfortunately there is no generic way to trigger this for a PCIe
device without a system reboot.
When performing a flash update, firmware is capable of responding with
some information about the specific update requirements.
The driver updates the flash by programming a secondary inactive bank
with the contents of the new image, and then issuing a command to
request to switch the active bank starting from the next load.
The response to the final command for updating the inactive NVM flash
bank includes an indication of the minimum reset required to fully
update the device. This can be one of the following:
* A full power on is required
* A cold PCIe reset is required
* An EMP reset is required
The response to the command to switch flash banks includes an indication
of whether or not the firmware will allow an EMP reset request.
For most updates, an EMP reset is sufficient to load the new EMP
firmware without issues. In some cases, this reset is not sufficient
because the PCI configuration space has changed. When this could cause
incompatibility with the new EMP image, the firmware is capable of
rejecting the EMP reset request.
Add logic to ice_fw_update.c to handle the response data flash update
AdminQ commands.
For the reset level, issue a devlink status notification informing the
user of how to complete the update with a simple suggestion like
"Activate new firmware by rebooting the system".
Cache the status of whether or not firmware will restrict the EMP reset
for use in implementing devlink reload.
Implement support for devlink reload with the "fw_activate" flag. This
allows user space to request the firmware be activated immediately.
For the .reload_down handler, we will issue a request for the EMP reset
using the appropriate firmware AdminQ command. If we know that the
firmware will not allow an EMP reset, simply exit with a suitable
netlink extended ACK message indicating that the EMP reset is not
available.
For the .reload_up handler, simply wait until the driver has finished
resetting. Logic to handle processing of an EMP reset already exists in
the driver as part of its reset and rebuild flows.
Implement support for the devlink reload interface with the
"fw_activate" action. This allows userspace to request activation of
firmware without a reboot.
Note that support for indicating the required reset and EMP reset
restriction is not supported on old versions of firmware. The driver can
determine if the two features are supported by checking the device
capabilities report. I confirmed support has existed since at least
version 5.5.2 as reported by the 'fw.mgmt' version. Support to issue the
EMP reset request has existed in all version of the EMP firmware for the
ice hardware.
Check the device capabilities report to determine whether or not the
indications are reported by the running firmware. If the reset
requirement indication is not supported, always assume a full power on
is necessary. If the reset restriction capability is not supported,
always assume the EMP reset is available.
Users can verify if the EMP reset has activated the firmware by using
the devlink info report to check that the 'running' firmware version has
updated. For example a user might do the following:
# Check current version
$ devlink dev info
# Update the device
$ devlink dev flash pci/0000:af:00.0 file firmware.bin
# Confirm stored version updated
$ devlink dev info
# Reload to activate new firmware
$ devlink dev reload pci/0000:af:00.0 action fw_activate
# Confirm running version updated
$ devlink dev info
Finally, this change does *not* implement basic driver-only reload
support. I did look into trying to do this. However, it requires
significant refactor of how the ice driver probes and loads everything.
The ice driver probe and allocation flows were not designed with such
a reload in mind. Refactoring the flow to support this is beyond the
scope of this change.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_devlink_flash_update function performs a few upfront checks and
then calls ice_flash_pldm_image.
Most if these checks make more sense in the context of code within
ice_flash_pldm_image. Merge ice_devlink_flash_update and
ice_flash_pldm_image into one function, placing it in ice_fw_update.c
Since this is still the entry point for devlink, call the function
ice_devlink_flash_update instead of ice_flash_pldm_image. This leaves a
single function which handles the devlink parameters and then initiates
a PLDM update.
With this change, the ice_devlink_flash_update function in
ice_fw_update.c becomes the main entry point for flash update. It
elimintes some unnecessary boiler plate code between the two previous
functions. The ultimate motivation for this is that it eases supporting
a dry run with the PLDM library in a future change.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_devlink_flash_update function performs a few checks and then
calls ice_flash_pldm_image. One of these checks is to call
ice_check_for_pending_update. This function checks if the device has
a pending update, and cancels it if so. This is necessary to allow
a new flash update to proceed.
We want to refactor the ice code to eliminate ice_devlink_flash_update,
moving its checks into ice_flash_pldm_image.
To do this, ice_check_for_pending_update will become static, and only
called by ice_flash_pldm_image. To make this change easier to review,
first just move the function up within the ice_fw_update.c file.
While at it, note that the function has a misleading name. Its primary
action is to cancel a pending update. Using the verb "check" does not
imply this. Rename it to ice_cancel_pending_update.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
We have a region for reading the contents of the NVM flash as
a snapshot. This region does not allow reading the Shadow RAM, as it
always passes the FLASH_ONLY bit to the low level firmware interface.
Add a separate shadow-ram region which will allow snapshot of the
current contents of the Shadow RAM. This data is built from the NVM
contents but is distinct as the device builds up the Shadow RAM during
initialization, so being able to snapshot its contents can be useful
when attempting to debug flash related issues.
Fix the comment description of the nvm-flash region which incorrectly
stated that it filled the shadow-ram region, and add a comment
explaining that the nvm-flash region does not actually read the Shadow
RAM.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
As all functions now return standard error codes, propagate the values
being returned instead of converting them to generic values.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
ice_status previously had a variable to contain these values where other
error codes had a variable as well. With ice_status now being an int,
there is no need for two variables to hold error values. In cases where
this occurs, remove one of the excess variables and use a single one.
Some initialization of variables are no longer needed and have been
removed.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Clean up code after changing ice_status to int. Rearrange to fix reverse
Christmas tree and pull lines up where applicable.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
To prepare for removal of ice_status, change the variables from
ice_status to int. This eases the transition when values are changed to
return standard int error codes over enum ice_status.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Remove the ice_stat_str() function which prints the string
representation of the ice_status error code. With upcoming changes
moving away from ice_status, there will be no need for this function.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
vbool in ice_devlink_enable_roce_get can be assigned to a
non-0/1 constant.
Fix this assignment of vbool to be 0/1.
Fixes: e523af4ee5 ("net/ice: Add support for enable_iwarp and enable_roce devlink param")
Suggested-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Allow support for 'enable_iwarp' and 'enable_roce' devlink params to turn
on/off iWARP or RoCE protocol support for E800 devices.
For example, a user can turn on iWARP functionality with,
devlink dev param set pci/0000:07:00.0 name enable_iwarp value true cmode runtime
This add an iWARP auxiliary rdma device, ice.iwarp.<>, under this PF.
A user request to enable both iWARP and RoCE under the same PF is rejected
since this device does not support both protocols simultaneously on the
same port.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The helper function devm_add_action_or_reset() will internally
call devm_add_action(), and if devm_add_action() fails then it will
execute the action mentioned and return the error code. So
use devm_add_action_or_reset() instead of devm_add_action()
to simplify the error handling, reduce the code.
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently when a user uses "devlink dev info", the fw.mgmt.api will be
the major.minor numbers as shown below:
devlink dev info pci/0000:3b:00.0
pci/0000:3b:00.0:
driver ice
serial_number 00-01-00-ff-ff-00-00-00
versions:
fixed:
board.id K91258-000
running:
fw.mgmt 6.1.2
fw.mgmt.api 1.7 <--- No patch number included
fw.mgmt.build 0xd75e7d06
fw.mgmt.srev 5
fw.undi 1.2992.0
fw.undi.srev 5
fw.psid.api 3.10
fw.bundle_id 0x800085cc
fw.app.name ICE OS Default Package
fw.app 1.3.27.0
fw.app.bundle_id 0xc0000001
fw.netlist 3.10.2000-3.1e.0
fw.netlist.build 0x2a76e110
stored:
fw.mgmt.srev 5
fw.undi 1.2992.0
fw.undi.srev 5
fw.psid.api 3.10
fw.bundle_id 0x800085cc
fw.netlist 3.10.2000-3.1e.0
fw.netlist.build 0x2a76e110
There are many features in the driver that depend on the major, minor,
and patch version of the FW. Without the patch number in the output for
fw.mgmt.api debugging issues related to the FW API version is difficult.
Also, using major.minor.patch aligns with the existing firmware version
which uses a 3 digit value.
Fix this by making the fw.mgmt.api print the major.minor.patch
versions. Shown below is the result:
devlink dev info pci/0000:3b:00.0
pci/0000:3b:00.0:
driver ice
serial_number 00-01-00-ff-ff-00-00-00
versions:
fixed:
board.id K91258-000
running:
fw.mgmt 6.1.2
fw.mgmt.api 1.7.9 <--- patch number included
fw.mgmt.build 0xd75e7d06
fw.mgmt.srev 5
fw.undi 1.2992.0
fw.undi.srev 5
fw.psid.api 3.10
fw.bundle_id 0x800085cc
fw.app.name ICE OS Default Package
fw.app 1.3.27.0
fw.app.bundle_id 0xc0000001
fw.netlist 3.10.2000-3.1e.0
fw.netlist.build 0x2a76e110
stored:
fw.mgmt.srev 5
fw.undi 1.2992.0
fw.undi.srev 5
fw.psid.api 3.10
fw.bundle_id 0x800085cc
fw.netlist 3.10.2000-3.1e.0
fw.netlist.build 0x2a76e110
Fixes: ff2e5c700e ("ice: add basic handler for devlink .info_get")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Keeping devlink port inside VSI data structure causes some issues.
Since VF VSI is released during reset that means that we have to
unregister devlink port and register it again every time reset is
triggered. With the new changes in devlink API it
might cause deadlock issues. After calling
devlink_port_register/devlink_port_unregister devlink API is going to
lock rtnl_mutex. It's an issue when VF reset is triggered in netlink
operation context (like setting VF MAC address or VLAN),
because rtnl_lock is already taken by netlink. Another call of
rtnl_lock from devlink API results in dead-lock.
By moving devlink port to PF/VF we avoid creating/destroying it
during reset. Since this patch, devlink ports are created during
ice_probe, destroyed during ice_remove for PF and created during
ice_repr_add, destroyed during ice_repr_rem for VF.
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Write set and get eswitch mode functions used by devlink
ops. Use new pf struct member eswitch_mode to track current
eswitch mode in driver.
Changing eswitch mode is only allowed when there are no
VFs created.
Create new file for eswitch related code.
Add config flag ICE_SWITCHDEV to allow user to choose if
switchdev support should be enabled or disabled.
Use case examples:
- show current eswitch mode ('legacy' is the default one)
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode legacy
- move to 'switchdev' mode
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode
switchdev
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode switchdev
- create 2 VFs
[root@localhost]# echo 2 > /sys/class/net/ens4f1/device/sriov_numvfs
- unsuccessful attempt to change eswitch mode while VFs are created
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode legacy
devlink answers: Operation not supported
- destroy VFs
[root@localhost]# echo 0 > /sys/class/net/ens4f1/device/sriov_numvfs
- restore 'legacy' mode
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode legacy
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode legacy
Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
After commit a8f89fa277 ("ice: do not abort devlink info if board
identifier can't be found"), the getter/fallback() functions no longer
report an error. Convert the interface to a void so that it is no
longer possible to add a version field that is fatal. This makes
sense, because we should not fail to report other versions just
because one of the version pieces could not be found.
Finally, clean up the getter functions line wrapping so that none of
them take more than 80 columns, as is the usual style for networking
files.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
devlink_register() can't fail and always returns success, but all drivers
are obligated to check returned status anyway. This adds a lot of boilerplate
code to handle impossible flow.
Make devlink_register() void and simplify the drivers that use that
API call.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The devlink dev info command reports version information about the
device and firmware running on the board. This includes the "board.id"
field which is supposed to represent an identifier of the board design.
The ice driver uses the Product Board Assembly identifier for this.
In some cases, the PBA is not present in the NVM. If this happens,
devlink dev info will fail with an error. Instead, modify the
ice_info_pba function to just exit without filling in the context
buffer. This will cause the board.id field to be skipped. Log a dev_dbg
message in case someone wants to confirm why board.id is not showing up
for them.
Fixes: e961b679fb ("ice: add board identifier info to devlink .info_get")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20210819223451.245613-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
All kernel devlink implementations call to devlink_alloc() during
initialization routine for specific device which is used later as
a parent device for devlink_register().
Such late device assignment causes to the situation which requires us to
call to device_register() before setting other parameters, but that call
opens devlink to the world and makes accessible for the netlink users.
Any attempt to move devlink_register() to be the last call generates the
following error due to access to the devlink->dev pointer.
[ 8.758862] devlink_nl_param_fill+0x2e8/0xe50
[ 8.760305] devlink_param_notify+0x6d/0x180
[ 8.760435] __devlink_params_register+0x2f1/0x670
[ 8.760558] devlink_params_register+0x1e/0x20
The simple change of API to set devlink device in the devlink_alloc()
instead of devlink_register() fixes all this above and ensures that
prior to call to devlink_register() everything already set.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Requesting device firmware information while the device is busy cleaning
up after a reset can result in an unexpected failure:
This occurs because the command is attempting to access the device
AdminQ while it is down. Resolve this by having the command wait for
a while until the reset is complete. To do this, introduce
a reset_wait_queue and associated helper function "ice_wait_for_reset".
This helper will use the wait queue to sleep until the driver is done
rebuilding. Use of a wait queue is preferred because the potential sleep
duration can be several seconds.
To ensure that the thread wakes up properly, a new wake_up call is added
during all code paths which clear the reset state bits associated with
the driver rebuild flow.
Using this ensures that tools can request device information without
worrying about whether the driver is cleaning up from a reset.
Specifically, it is expected that a flash update could result in
a device reset, and it is better to delay the response for information
until the reset is complete rather than exit with an immediate failure.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When filling out information for the DEVLINK_CMD_INFO_GET, the driver
needs to read some device capabilities. Add an extack message to
properly inform the caller of the failure, as we do for other failures
in this function.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Just as we recently added support for other stored firmware flash
versions, support display of the stored UNDI Option ROM version via
devlink info.
To do this, we need to introduce a new ice_get_inactive_orom_ver
function. This is a little trickier than with other flash versions. The
Option ROM version data was being read from a special "Boot
Configuration" block of the NVM Preserved Field Area. This block only
contains the *active* Option ROM version data. It is populated when the
device firmware finishes updating the Option ROM.
This method is ineffective at reading the stored Option ROM version
data. Instead of reading from this section of the flash, replace this
version extraction with one which locates the Combo Version information
from within the Option ROM binary.
This data is stored within the Option ROM at a 512 byte offset, in
a simple structured format. The structure uses a simple modulo 256
checksum for integrity verification. Scan through the Option ROM to
locate the CIVD data section, and extract the Combo Version.
Refactor ice_get_orom_ver_info so that it takes the bank select
enumeration parameter. Use this to implement ice_get_inactive_orom_ver.
Although all ice devices have a Boot Configuration block in the NVM PFA,
not all devices have a valid Option ROM. In this case, the old
ice_get_orom_ver_info would "succeed" but report a version of all
zeros. The new implementation would fail to locate the $CIV section in
the Option ROM and report an error. Thus, we must ensure that
ice_init_nvm does not fail if ice_get_orom_ver_info fails.
Use the new ice_get_inactive_orom_ver to allow reporting the Option ROM
versions for a pending update via devlink info.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Add a function to read the inactive netlist bank for version
information. To support this, refactor how we read the netlist version
data. Instead of using the firmware AQ interface with a module ID, read
from the flash as a flat NVM, using ice_read_flash_module.
This change requires a slight adjustment to the offset values used, as
reading from the flat NVM includes the type field (which was stripped by
firmware previously). Cleanup the macro names and move them to
ice_type.h. For clarity in how we calculate the offsets and so that
programmers can easily map the offset value to the data sheet, use
a wrapper macro to account for the offset adjustments.
Use the newly added ice_get_inactive_netlist_ver function to extract the
version data from the pending netlist module update. Add the stored
variants of "fw.netlist", and "fw.netlist.build" to the info version map
array.
With this change, we now report the "fw.netlist" and "fw.netlist.build"
versions into the stored section of the devlink info report. As with the
main NVM module versions, if there is no pending update, we report the
currently active values as stored.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The devlink info interface supports drivers reporting "stored" versions.
These versions indicate the version of an update that has been
downloaded to the device, but is not yet active.
The code for extracting the NVM version recently changed to enable
support for reading from either the active or the inactive bank. Use
this to implement ice_get_inactive_nvm_ver, which will read the NVM
version data from the inactive section of flash.
When reporting the versions via devlink info, first read the device
capabilities. Determine if there is a pending flash update, and if so,
extract relevant version information from the inactive flash. Store
these within the info context structure.
When reporting "stored" firmware versions, devlink documentation
indicates that we ought to always report a stored value, even if there
is no pending update. In this common case, the stored version should
match the running version. This means that each stored version should by
default fallback to the same value as reported by the running handler.
To support this, modify the version structure to have both a "getter"
and a "fallback". Modify the control loop so that it will use the
"fallback" function if the "getter" function does not report a version.
To report versions for which we can read the stored value, use a new
"stored()" macro. This macro will insert two entries into the version
list. The first entry is the traditional running version. The second is
the stored version, implemented with a fallback to the active version.
This is a little tricky, but reduces the overall duplication of elements
in the entry list, and ensures that running and stored values remain
consistent.
To avoid some duplication, add a combined() macro that will insert both
the running and stored versions into the version entry list.
Using this new support, add pending version reporter functions for
"fw.psid.api" and "fw.bundle_id". This enables reporting the stored
values for some of versions in the NVM module of the flash.
Reporting management versions is not implemented by this patch. The
active management version is reported to the driver via the AdminQ
mailbox during load. Although the version must be in the firmware binary
somewhere, accessing this from the inactive firmware is not trivial and
has not been implemented in this change.
Future changes will introduce support for reading the UNDI Option ROM
version and the version associated with the Netlist module.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice driver uses an array of structures which link an info name with
a function that formats the associated version data into a string.
All existing format functions simply format already captured static data
from the driver hw structure. Future changes will introduce format
functions for reporting the versions of flash sections stored but not
yet applied. This type of version data is not stored as a member of the
hw structure. This is because (a) it might not yet exist in the case
there is no pending flash update, and (b) even if it does, it might
change such as if an update is canceled or replaced by a new update
before finalizing.
We could simply have each format function gather its own data upon being
called. However, in some cases the raw binary version data is
a combination of multiple different reported fields. Additionally, the
current interface doesn't have a way for the function to indicate that
the version doesn't exist.
Refactor this function interface to take a new ice_info_ctx structure
instead of the buffer pointer and length. This context structure allows
for future extensions to pre-gather version data that is stored within
the context struct instead of the hw struct.
Allocate this context structure initially at the start of
ice_devlink_info_get. We use dynamic allocation instead of a local stack
variable in order to avoid using too much kernel stack once we extend it
with additional data structures.
Modify the main loop that drives the info reporting so that the version
buffer string is always cleared between each format. Explicitly check
that the format function actually filled in a version string of non-zero
length. If the string is not provided, simply skip this version without
reporting an error. This allows for introducing format functions of
versions which may or may not be present, such as the version of
a pending update that has not yet been activated.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_nvm_info structure has become somewhat of a dumping ground for
all of the fields related to flash version. It holds the NVM version and
EETRACK id, the OptionROM info structure, the flash size, the ShadowRAM
size, and more.
A future change is going to add the ability to read the NVM version and
EETRACK ID from the inactive NVM bank. To make this simpler, it is
useful to have these NVM version info fields extracted to their own
structure.
Rename ice_nvm_info into ice_flash_info, and create a separate
ice_nvm_info structure that will contain the eetrack and NVM map
version. Move the netlist_ver structure into ice_flash_info and rename it
ice_netlist_info for consistency.
Modify the static ice_get_orom_ver_info to take the option rom structure
as a pointer. This makes it more obvious what portion of the hw struct
is being modified. Do the same for ice_get_netlist_ver_info.
Introduce a new ice_get_nvm_ver_info function, which will be similar to
ice_get_orom_ver_info and ice_get_netlist_ver_info, used to keep the NVM
version extraction code co-located.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When performing a flash update via devlink, device drivers may inform
user space of status updates via
devlink_flash_update_(begin|end|timeout|status)_notify functions.
It is expected that drivers do not send any status notifications unless
they send a begin and end message. If a driver sends a status
notification without sending the appropriate end notification upon
finishing (regardless of success or failure), the current implementation
of the devlink userspace program can get stuck endlessly waiting for the
end notification that will never come.
The current ice driver implementation may send such a status message
without the appropriate end notification in rare cases.
Fixing the ice driver is relatively simple: we just need to send the
begin_notify at the start of the function and always send an end_notify
no matter how the function exits.
Rather than assuming driver authors will always get this right in the
future, lets just fix the API so that it is not possible to get wrong.
Make devlink_flash_update_begin_notify and
devlink_flash_update_end_notify static, and call them in devlink.c core
code. Always send the begin_notify just before calling the driver's
flash_update routine. Always send the end_notify just after the routine
returns regardless of success or failure.
Doing this makes the status notification easier to use from the driver,
as it no longer needs to worry about catching failures and cleaning up
by calling devlink_flash_update_end_notify. It is now no longer possible
to do the wrong thing in this regard. We also save a couple of lines of
code in each driver.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
All drivers which implement the devlink flash update support, with the
exception of netdevsim, use either request_firmware or
request_firmware_direct to locate the firmware file. Rather than having
each driver do this separately as part of its .flash_update
implementation, perform the request_firmware within net/core/devlink.c
Replace the file_name parameter in the struct devlink_flash_update_params
with a pointer to the fw object.
Use request_firmware rather than request_firmware_direct. Although most
Linux distributions today do not have the fallback mechanism
implemented, only about half the drivers used the _direct request, as
compared to the generic request_firmware. In the event that
a distribution does support the fallback mechanism, the devlink flash
update ought to be able to use it to provide the firmware contents. For
distributions which do not support the fallback userspace mechanism,
there should be essentially no difference between request_firmware and
request_firmware_direct.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While debugging a recent failure to update the flash of an ice device,
I found it helpful to add additional logging which helped determine the
root cause of the problem being a timeout issue.
Add some extra dev_dbg() logging messages which can be enabled using the
dynamic debug facility, including one for ice_aq_wait_for_event that
will use jiffies to capture a rough estimate of how long we waited for
the completion of a firmware command.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Brijesh Behera <brijeshx.behera@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, the devlink_port structure is stored within the ice_pf. This
made sense because we create a single devlink_port for each PF. This
setup does not mesh with the abstractions in the driver very well, and
led to a flow where we accidentally call devlink_port_unregister twice
during error cleanup.
In particular, if devlink_port_register or devlink_port_unregister are
called twice, this leads to a kernel panic. This appears to occur during
some possible flows while cleaning up from a failure during driver
probe.
If register_netdev fails, then we will call devlink_port_unregister in
ice_cfg_netdev as it cleans up. Later, we again call
devlink_port_unregister since we assume that we must cleanup the port
that is associated with the PF structure.
This occurs because we cleanup the devlink_port for the main PF even
though it was not allocated. We allocated the port within a per-VSI
function for managing the main netdev, but did not release the port when
cleaning up that VSI, the allocation and destruction are not aligned.
Instead of attempting to manage the devlink_port as part of the PF
structure, manage it as part of the PF VSI. Doing this has advantages,
as we can match the de-allocation of the devlink_port with the
unregister_netdev associated with the main PF VSI.
Moving the port to the VSI is preferable as it paves the way for
handling devlink ports allocated for other purposes such as SR-IOV VFs.
Since we're changing up how we allocate the devlink_port, also change
the indexing. Originally, we indexed the port using the PF id number.
This came from an old goal of sharing a devlink for each physical
function. Managing devlink instances across multiple function drivers is
not workable. Instead, lets set the port number to the logical port
number returned by firmware and set the index using the VSI index
(sometimes referred to as VSI handle).
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add "fw.app.bundle_id" to display the DDP Track ID of the active DDP
package. This id is similar to "fw.bundle_id" and is a unique identifier
for the DDP package that is loaded in the device. Each new DDP has
a unique Track ID generated for it, and the ID can be used to identify
and track the DDP package.
Add documentation for the new devlink info version.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ice_info_get_dsn always returns 0, so just make it void.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use %*phD format to print small buffer as hex string.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support the recently added DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK
parameter in the ice flash update handler. Convert the overwrite mask
bitfield into the appropriate preservation level used by the firmware
when updating.
Because there is no equivalent preservation level for overwriting only
identifiers, this combination is rejected by the driver as not supported
with an appropriate extended ACK message.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The devlink core recently gained support for checking whether the driver
supports a flash_update parameter, via `supported_flash_update_params`.
However, parameters are specified as function arguments. Adding a new
parameter still requires modifying the signature of the .flash_update
callback in all drivers.
Convert the .flash_update function to take a new `struct
devlink_flash_update_params` instead. By using this structure, and the
`supported_flash_update_params` bit field, a new parameter to
flash_update can be added without requiring modification to existing
drivers.
As before, all parameters except file_name will require driver opt-in.
Because file_name is a necessary field to for the flash_update to make
sense, no "SUPPORTED" bitflag is provided and it is always considered
valid. All future additional parameters will require a new bit in the
supported_flash_update_params bitfield.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Bin Luo <luobin9@huawei.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When implementing .flash_update, drivers which do not support
per-component update are manually checking the component parameter to
verify that it is NULL. Without this check, the driver might accept an
update request with a component specified even though it will not honor
such a request.
Instead of having each driver check this, move the logic into
net/core/devlink.c, and use a new `supported_flash_update_params` field
in the devlink_ops. Drivers which will support per-component update must
now specify this by setting DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT in
the supported_flash_update_params in their devlink_ops.
This helps ensure that drivers do not forget to check for a NULL
component if they do not support per-component update. This also enables
a slightly better error message by enabling the core stack to set the
netlink bad attribute message to indicate precisely the unsupported
attribute in the message.
Going forward, any new additional parameter to flash update will require
a bit in the supported_flash_update_params bitfield.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Bin Luo <luobin9@huawei.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: Danielle Ratson <danieller@mellanox.com>
Cc: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass the region to be snapshotted to the function performing the
snapshot. This allows one function to operate on numerous regions.
v4:
Add missing kerneldoc for ICE
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a collection of minor fixes including typos, white space, and
style. No functional changes.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Use the newly added pldmfw library to implement device flash update for
the Intel ice networking device driver. This support uses the devlink
flash update interface.
The main parts of the flash include the Option ROM, the netlist module,
and the main NVM data. The PLDM firmware file contains modules for each
of these components.
Using the pldmfw library, the provided firmware file will be scanned for
the three major components, "fw.undi" for the Option ROM, "fw.mgmt" for
the main NVM module containing the primary device firmware, and
"fw.netlist" containing the netlist module.
The flash is separated into two banks, the active bank containing the
running firmware, and the inactive bank which we use for update. Each
module is updated in a staged process. First, the inactive bank is
erased, preparing the device for update. Second, the contents of the
component are copied to the inactive portion of the flash. After all
components are updated, the driver signals the device to switch the
active bank during the next EMP reset (which would usually occur during
the next reboot).
Although the firmware AdminQ interface does report an immediate status
for each command, the NVM erase and NVM write commands receive status
asynchronously. The driver must not continue writing until previous
erase and write commands have finished. The real status of the NVM
commands is returned over the receive AdminQ. Implement a simple
interface that uses a wait queue so that the main update thread can
sleep until the completion status is reported by firmware. For erasing
the inactive banks, this can take quite a while in practice.
To help visualize the process to the devlink application and other
applications based on the devlink netlink interface, status is reported
via the devlink_flash_update_status_notify. While we do report status
after each 4k block when writing, there is no real status we can report
during erasing. We simply must wait for the complete module erasure to
finish.
With this implementation, basic flash update for the ice hardware is
supported.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, devlink_port_attrs_set accepts a long list of parameters,
that most of them are devlink port's attributes.
Use the devlink_port_attrs struct to replace the relevant parameters.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new devlink region used for capturing a snapshot of the device
capabilities buffer which is reported by the firmware over the AdminQ.
This information can useful in debugging driver and firmware
interactions.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The flash memory for the ice hardware contains a block of information
used for link management called the Netlist module.
As this essentially represents another section of firmware, add its
version information to the output of the driver's .info_get handler.
This includes both a version and the first few bytes of a hash of the
module contents.
fw.netlist -> the version information extracted from the netlist module
fw.netlist.build-> first 4 bytes of the hash of the contents, similar
to fw.mgmt.build
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add a devlink region for exposing the device's Non Volatime Memory flash
contents.
Support the recently added .snapshot operation, enabling userspace to
request a snapshot of the NVM contents via DEVLINK_CMD_REGION_NEW.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Export a unique board identifier using "board.id" for devlink's
.info_get command.
Obtain this by reading the NVM for the PBA identification string.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The devlink .info_get callback allows the driver to report detailed
version information. The following devlink versions are reported with
this initial implementation:
"fw.mgmt" -> The version of the firmware that controls PHY, link, etc
"fw.mgmt.api" -> API version of interface exposed over the AdminQ
"fw.mgmt.build" -> Unique build id of the source for the management fw
"fw.undi" -> Version of the Option ROM containing the UEFI driver
"fw.psid.api" -> Version of the NVM image format.
"fw.bundle_id" -> Unique identifier for the combined flash image.
"fw.app.name" -> The name of the active DDP package.
"fw.app" -> The version of the active DDP package.
With this, devlink dev info can report at least as much information as
is reported by ETHTOOL_GDRVINFO.
Compare the output from ethtool vs from devlink:
$ ethtool -i ens785s0
driver: ice
version: 0.8.1-k
firmware-version: 0.80 0x80002ec0 1.2581.0
expansion-rom-version:
bus-info: 0000:3b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
$ devlink dev info pci/0000:3b:00.0
pci/0000:3b:00.0:
driver ice
serial number 00-01-ab-ff-ff-ca-05-68
versions:
running:
fw.mgmt 2.1.7
fw.mgmt.api 1.5
fw.mgmt.build 0x305d955f
fw.undi 1.2581.0
fw.psid.api 0.80
fw.bundle_id 0x80002ec0
fw.app.name ICE OS Default Package
fw.app 1.3.1.0
More pieces of information can be displayed, each version is kept
separate instead of munged together, and each version has an identifier
which comes with associated documentation.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>