================== AF_XDP TX Metadata ================== This document describes how to enable offloads when transmitting packets via :doc:`af_xdp`. Refer to :doc:`xdp-rx-metadata` on how to access similar metadata on the receive side. General Design ============== The headroom for the metadata is reserved via ``tx_metadata_len`` in ``struct xdp_umem_reg``. The metadata length is therefore the same for every socket that shares the same umem. The metadata layout is a fixed UAPI, refer to ``union xsk_tx_metadata`` in ``include/uapi/linux/if_xdp.h``. Thus, generally, the ``tx_metadata_len`` field above should contain ``sizeof(union xsk_tx_metadata)``. The headroom and the metadata itself should be located right before ``xdp_desc->addr`` in the umem frame. Within a frame, the metadata layout is as follows:: tx_metadata_len / \ +-----------------+---------+----------------------------+ | xsk_tx_metadata | padding | payload | +-----------------+---------+----------------------------+ ^ | xdp_desc->addr An AF_XDP application can request headrooms larger than ``sizeof(struct xsk_tx_metadata)``. The kernel will ignore the padding (and will still use ``xdp_desc->addr - tx_metadata_len`` to locate the ``xsk_tx_metadata``). For the frames that shouldn't carry any metadata (i.e., the ones that don't have ``XDP_TX_METADATA`` option), the metadata area is ignored by the kernel as well. The flags field enables the particular offload: - ``XDP_TXMD_FLAGS_TIMESTAMP``: requests the device to put transmission timestamp into ``tx_timestamp`` field of ``union xsk_tx_metadata``. - ``XDP_TXMD_FLAGS_CHECKSUM``: requests the device to calculate L4 checksum. ``csum_start`` specifies byte offset of where the checksumming should start and ``csum_offset`` specifies byte offset where the device should store the computed checksum. Besides the flags above, in order to trigger the offloads, the first packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` bit in the ``options`` field. Also note that in a multi-buffer packet only the first chunk should carry the metadata. Software TX Checksum ==================== For development and testing purposes its possible to pass ``XDP_UMEM_TX_SW_CSUM`` flag to ``XDP_UMEM_REG`` UMEM registration call. In this case, when running in ``XDK_COPY`` mode, the TX checksum is calculated on the CPU. Do not enable this option in production because it will negatively affect performance. Querying Device Capabilities ============================ Every devices exports its offloads capabilities via netlink netdev family. Refer to ``xsk-flags`` features bitmask in ``Documentation/netlink/specs/netdev.yaml``. - ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP`` - ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM`` See ``tools/net/ynl/samples/netdev.c`` on how to query this information. Example ======= See ``tools/testing/selftests/bpf/xdp_hw_metadata.c`` for an example program that handles TX metadata. Also see https://github.com/fomichev/xskgen for a more bare-bones example.