Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:

 1) Add Maglev hashing scheduler to IPVS, from Inju Song.

 2) Lots of new TC subsystem tests from Roman Mashak.

 3) Add TCP zero copy receive and fix delayed acks and autotuning with
    SO_RCVLOWAT, from Eric Dumazet.

 4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
    Brouer.

 5) Add ttl inherit support to vxlan, from Hangbin Liu.

 6) Properly separate ipv6 routes into their logically independent
    components. fib6_info for the routing table, and fib6_nh for sets of
    nexthops, which thus can be shared. From David Ahern.

 7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
    messages from XDP programs. From Nikita V. Shirokov.

 8) Lots of long overdue cleanups to the r8169 driver, from Heiner
    Kallweit.

 9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.

10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.

11) Plumb extack down into fib_rules, from Roopa Prabhu.

12) Add Flower classifier offload support to igb, from Vinicius Costa
    Gomes.

13) Add UDP GSO support, from Willem de Bruijn.

14) Add documentation for eBPF helpers, from Quentin Monnet.

15) Add TLS tx offload to mlx5, from Ilya Lesokhin.

16) Allow applications to be given the number of bytes available to read
    on a socket via a control message returned from recvmsg(), from
    Soheil Hassas Yeganeh.

17) Add x86_32 eBPF JIT compiler, from Wang YanQing.

18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
    From Björn Töpel.

19) Remove indirect load support from all of the BPF JITs and handle
    these operations in the verifier by translating them into native BPF
    instead. From Daniel Borkmann.

20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.

21) Allow XDP programs to do lookups in the main kernel routing tables
    for forwarding. From David Ahern.

22) Allow drivers to store hardware state into an ELF section of kernel
    dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.

23) Various RACK and loss detection improvements in TCP, from Yuchung
    Cheng.

24) Add TCP SACK compression, from Eric Dumazet.

25) Add User Mode Helper support and basic bpfilter infrastructure, from
    Alexei Starovoitov.

26) Support ports and protocol values in RTM_GETROUTE, from Roopa
    Prabhu.

27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
    Brouer.

28) Add lots of forwarding selftests, from Petr Machata.

29) Add generic network device failover driver, from Sridhar Samudrala.

* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
  strparser: Add __strp_unpause and use it in ktls.
  rxrpc: Fix terminal retransmission connection ID to include the channel
  net: hns3: Optimize PF CMDQ interrupt switching process
  net: hns3: Fix for VF mailbox receiving unknown message
  net: hns3: Fix for VF mailbox cannot receiving PF response
  bnx2x: use the right constant
  Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
  net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
  enic: fix UDP rss bits
  netdev-FAQ: clarify DaveM's position for stable backports
  rtnetlink: validate attributes in do_setlink()
  mlxsw: Add extack messages for port_{un, }split failures
  netdevsim: Add extack error message for devlink reload
  devlink: Add extack to reload and port_{un, }split operations
  net: metrics: add proper netlink validation
  ipmr: fix error path when ipmr_new_table fails
  ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
  net: hns3: remove unused hclgevf_cfg_func_mta_filter
  netfilter: provide udp*_lib_lookup for nf_tproxy
  qed*: Utilize FW 8.37.2.0
  ...
Linus Torvalds 2018-06-06 18:39:49 -07:00
commit 1c8c5a9d38
1744 changed files with 114089 additions and 42298 deletions

@ -0,0 +1,36 @@
=================
BPF documentation
=================
This directory contains documentation for the BPF (Berkeley Packet
Filter) facility, with a focus on the extended BPF version (eBPF).
This kernel side documentation is still a work in progress. The main
textual documentation is (for historical reasons) described in
`Documentation/networking/filter.txt`_, which describes both the classical
and extended BPF instruction sets.
The Cilium project also maintains a `BPF and XDP Reference Guide`_
that goes into great technical depth about the BPF Architecture.
The primary info for the bpf syscall is available in the `man-pages`_
for `bpf(2)`_.
Frequently asked questions (FAQ)
================================
Two sets of Questions and Answers (Q&A) are maintained.
* QA for common questions about BPF see: bpf_design_QA_
* QA for developers interacting with BPF subsystem: bpf_devel_QA_
.. Links:
.. _bpf_design_QA: bpf_design_QA.rst
.. _bpf_devel_QA: bpf_devel_QA.rst
.. _Documentation/networking/filter.txt: ../networking/filter.txt
.. _man-pages: https://www.kernel.org/doc/man-pages/
.. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
.. _BPF and XDP Reference Guide: http://cilium.readthedocs.io/en/latest/bpf/

@ -0,0 +1,221 @@
==============
BPF Design Q&A
==============
BPF extensibility and applicability to networking, tracing, security
in the linux kernel and several user space implementations of BPF
virtual machine led to a number of misunderstandings about what BPF actually is.
This short QA is an attempt to address that and outline a direction
of where BPF is heading long term.
.. contents::
    :local:
    :depth: 3

Questions and Answers
=====================
Q: Is BPF a generic instruction set similar to x64 and arm64?
-------------------------------------------------------------
A: NO.
Q: Is BPF a generic virtual machine ?
-------------------------------------
A: NO.
BPF is generic instruction set *with* C calling convention.
-----------------------------------------------------------
Q: Why C calling convention was chosen?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because BPF programs are designed to run in the linux kernel,
which is written in C. Hence BPF defines an instruction set compatible
with the two most used architectures, x64 and arm64 (and takes into
consideration important quirks of other architectures), and
defines a calling convention that is compatible with the C calling
convention of the linux kernel on those architectures.
Q: can multiple return values be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF allows only register R0 to be used as return value.
Q: can more than 5 function arguments be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set.
(unlike x64 ISA that allows msft, cdecl and other conventions)
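The register roles described in the two answers above can be modelled with a minimal sketch. This is an illustration only, not kernel code: the names ``ARG_REGS``, ``RET_REG`` and ``assign_args`` are hypothetical and exist only for this example.

```python
# Illustrative model of the BPF calling convention described above.
# All names here are hypothetical; none of them are kernel APIs.
ARG_REGS = ("R1", "R2", "R3", "R4", "R5")  # the only argument registers
RET_REG = "R0"                              # the only return-value register


def assign_args(args):
    """Map call arguments onto R1-R5, rejecting calls with more than five."""
    if len(args) > len(ARG_REGS):
        raise ValueError(f"at most {len(ARG_REGS)} arguments are allowed")
    return dict(zip(ARG_REGS, args))


regs = assign_args([10, 20, 30])
print(regs)
```

A call with six arguments raises ``ValueError`` in this model, mirroring the answer that more than five arguments will not be supported.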
Q: can BPF programs access instruction pointer or return address?
-----------------------------------------------------------------
A: NO.
Q: can BPF programs access stack pointer ?
------------------------------------------
A: NO.
Only the frame pointer (register R10) is accessible.
From the compiler's point of view it's necessary to have a stack pointer.
For example, LLVM defines register R11 as the stack pointer in its
BPF backend, but it makes sure that generated code never uses it.
Q: Does C-calling convention diminish possible use cases?
---------------------------------------------------------
A: YES.
BPF design forces addition of major functionality in the form
of kernel helper functions and kernel objects like BPF maps with
seamless interoperability between them. It lets the kernel call into
BPF programs and programs call kernel helpers with zero overhead,
as if all of them were native C code. That is particularly the case
for JITed BPF programs that are indistinguishable from
native kernel C code.
Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
------------------------------------------------------------------------
A: Soft yes.
At least for now until BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read only sections and all other normal constructs
that C code can produce.
Q: Can loops be supported in a safe way?
----------------------------------------
A: It's not clear yet.
BPF developers are trying to find a way to
support bounded loops where the verifier can guarantee that
the program terminates in less than 4096 instructions.
Instruction level questions
---------------------------
Q: LD_ABS and LD_IND instructions vs C code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: How come LD_ABS and LD_IND instructions are present in BPF whereas
C code cannot express them and has to use builtin intrinsics?
A: This is an artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them.
See 'direct packet access'.
Q: BPF instructions mapping not one-to-one to native CPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: It seems not all BPF instructions are one-to-one to native CPU.
For example why BPF_JNE and other compare and jumps are not cpu-like?
A: This was necessary to avoid introducing flags into ISA which are
impossible to make generic and efficient across CPU architectures.
Q: why BPF_DIV instruction doesn't map to x64 div?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because if we picked one-to-one relationship to x64 it would have made
it more complicated to support on arm64 and other archs. Also it
needs div-by-zero runtime check.
Q: why there is no BPF_SDIV for signed divide operation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because it would be rarely used. LLVM errors out in such a case and
prints a suggestion to use unsigned divide instead.
Q: Why BPF has implicit prologue and epilogue?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because architectures like sparc have register windows and in general
there are enough subtle differences between architectures that naively
storing the return address on the stack won't work. Another reason is that
BPF has to be safe from division by zero (and the legacy exception path
of the LD_ABS insn). Those instructions need to invoke the epilogue and
return implicitly.
Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because classic BPF didn't have them and BPF authors felt that a compiler
workaround would be acceptable. It turned out that programs lose performance
due to the lack of these compare instructions, and they were added.
These two instructions are a perfect example of what kind of new BPF
instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs.
New instructions that don't have one-to-one mapping to HW instructions
will not be accepted.
Q: BPF 32-bit subregister requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: BPF 32-bit subregisters have a requirement to zero upper 32-bits of BPF
registers which makes BPF an inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future?
A: NO. The first step to improve performance on 32-bit archs is to teach
LLVM to generate code that uses 32-bit subregisters. The second step
is to teach the verifier to mark operations where zeroing the upper bits
is unnecessary. Then JITs can take advantage of those markings and
drastically reduce the size of generated code and improve performance.
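The zeroing requirement mentioned in the question can be illustrated with a small model. This is a sketch that treats a BPF register as a 64-bit unsigned value. The helper names ``add64`` and ``add32`` are hypothetical and do not correspond to kernel or LLVM functions.

```python
# Illustrative model of 64-bit registers and their 32-bit subregisters.
# Function names are hypothetical and chosen for this example only.
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def add64(dst, src):
    """64-bit add: the full register is used, wrapping at 2**64."""
    return (dst + src) & MASK64


def add32(dst, src):
    """32-bit add on the subregisters: only the low 32 bits of the sum
    survive, so the upper 32 bits of the destination end up zeroed."""
    return (dst + src) & MASK32


r1 = 0xDEAD_BEEF_0000_0001
print(hex(add64(r1, 1)))  # upper half is preserved
print(hex(add32(r1, 1)))  # upper half is cleared
```

On a 32-bit CPU a JIT has to emit an extra instruction to clear the upper half after each such operation unless it knows the upper bits are never read, which is the cost the verifier markings described in the answer aim to avoid.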
Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, set of helper
functions and their arguments, recognized return codes are all part
of ABI. However when tracing programs are using bpf_probe_read() helper
to walk kernel internal datastructures and compile with kernel
internal headers these accesses can and will break with newer
kernels. The union bpf_attr -> kern_version is checked at load time
to prevent accidentally loading kprobe-based bpf programs written
for a different kernel. Networking programs don't do kern_version check.
Q: How much stack space a BPF program uses?
-------------------------------------------
A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used
and both the interpreter and most JITed code consume only the necessary amount.
Q: Can BPF be offloaded to HW?
------------------------------
A: YES. BPF HW offload is supported by NFP driver.
Q: Does classic BPF interpreter still exist?
--------------------------------------------
A: NO. Classic BPF programs are converted into extended BPF instructions.
Q: Can BPF call arbitrary kernel functions?
-------------------------------------------
A: NO. BPF programs can only call a set of helper functions which
is defined for every program type.
Q: Can BPF overwrite arbitrary kernel memory?
---------------------------------------------
A: NO.
Tracing bpf programs can *read* arbitrary memory with bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.
Q: Can BPF overwrite arbitrary user memory?
-------------------------------------------
A: Sort-of.
Tracing BPF programs can overwrite the user memory
of the current task with bpf_probe_write_user(). Every time such a
program is loaded the kernel will print a warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.
Q: bpf_trace_printk() helper warning
------------------------------------
Q: When the bpf_trace_printk() helper is used the kernel prints a nasty
warning message. Why is that?
A: This is done to nudge program authors into better interfaces when
programs need to pass data to user space. For example, bpf_perf_event_output()
can be used to efficiently stream data via the perf ring buffer.
BPF maps can be used for asynchronous data sharing between kernel
and user space. bpf_trace_printk() should only be used for debugging.
Q: New functionality via kernel modules?
----------------------------------------
Q: Can BPF functionality such as new program or map types, new
helpers, etc be added out of kernel module code?
A: NO.

@ -1,156 +0,0 @@
BPF extensibility and applicability to networking, tracing, security
in the linux kernel and several user space implementations of BPF
virtual machine led to a number of misunderstandings about what BPF actually is.
This short QA is an attempt to address that and outline a direction
of where BPF is heading long term.
Q: Is BPF a generic instruction set similar to x64 and arm64?
A: NO.
Q: Is BPF a generic virtual machine ?
A: NO.
BPF is generic instruction set _with_ C calling convention.
Q: Why C calling convention was chosen?
A: Because BPF programs are designed to run in the linux kernel
which is written in C, hence BPF defines instruction set compatible
with two most used architectures x64 and arm64 (and takes into
consideration important quirks of other architectures) and
defines calling convention that is compatible with C calling
convention of the linux kernel on those architectures.
Q: can multiple return values be supported in the future?
A: NO. BPF allows only register R0 to be used as return value.
Q: can more than 5 function arguments be supported in the future?
A: NO. BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set.
(unlike x64 ISA that allows msft, cdecl and other conventions)
Q: can BPF programs access instruction pointer or return address?
A: NO.
Q: can BPF programs access stack pointer ?
A: NO. Only frame pointer (register R10) is accessible.
From compiler point of view it's necessary to have stack pointer.
For example LLVM defines register R11 as stack pointer in its
BPF backend, but it makes sure that generated code never uses it.
Q: Does C-calling convention diminishes possible use cases?
A: YES. BPF design forces addition of major functionality in the form
of kernel helper functions and kernel objects like BPF maps with
seamless interoperability between them. It lets kernel call into
BPF programs and programs call kernel helpers with zero overhead.
As all of them were native C code. That is particularly the case
for JITed BPF programs that are indistinguishable from
native kernel C code.
Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
A: Soft yes. At least for now until BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read only sections and all other normal constructs
that C code can produce.
Q: Can loops be supported in a safe way?
A: It's not clear yet. BPF developers are trying to find a way to
support bounded loops where the verifier can guarantee that
the program terminates in less than 4096 instructions.
Q: How come LD_ABS and LD_IND instruction are present in BPF whereas
C code cannot express them and has to use builtin intrinsics?
A: This is artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them.
See 'direct packet access'.
Q: It seems not all BPF instructions are one-to-one to native CPU.
For example why BPF_JNE and other compare and jumps are not cpu-like?
A: This was necessary to avoid introducing flags into ISA which are
impossible to make generic and efficient across CPU architectures.
Q: why BPF_DIV instruction doesn't map to x64 div?
A: Because if we picked one-to-one relationship to x64 it would have made
it more complicated to support on arm64 and other archs. Also it
needs div-by-zero runtime check.
Q: why there is no BPF_SDIV for signed divide operation?
A: Because it would be rarely used. llvm errors in such case and
prints a suggestion to use unsigned divide instead
Q: Why BPF has implicit prologue and epilogue?
A: Because architectures like sparc have register windows and in general
there are enough subtle differences between architectures, so naive
store return address into stack won't work. Another reason is BPF has
to be safe from division by zero (and legacy exception path
of LD_ABS insn). Those instructions need to invoke epilogue and
return implicitly.
Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning?
A: Because classic BPF didn't have them and BPF authors felt that compiler
workaround would be acceptable. Turned out that programs lose performance
due to lack of these compare instructions and they were added.
These two instructions are a perfect example of what kind of new BPF
instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs.
New instructions that don't have one-to-one mapping to HW instructions
will not be accepted.
Q: BPF 32-bit subregisters have a requirement to zero upper 32-bits of BPF
registers which makes BPF inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future?
A: NO. The first thing to improve performance on 32-bit archs is to teach
LLVM to generate code that uses 32-bit subregisters. Then second step
is to teach verifier to mark operations where zero-ing upper bits
is unnecessary. Then JITs can take advantage of those markings and
drastically reduce size of generated code and improve performance.
Q: Does BPF have a stable ABI?
A: YES. BPF instructions, arguments to BPF programs, set of helper
functions and their arguments, recognized return codes are all part
of ABI. However when tracing programs are using bpf_probe_read() helper
to walk kernel internal datastructures and compile with kernel
internal headers these accesses can and will break with newer
kernels. The union bpf_attr -> kern_version is checked at load time
to prevent accidentally loading kprobe-based bpf programs written
for a different kernel. Networking programs don't do kern_version check.
Q: How much stack space a BPF program uses?
A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used
and both interpreter and most JITed code consume necessary amount.
Q: Can BPF be offloaded to HW?
A: YES. BPF HW offload is supported by NFP driver.
Q: Does classic BPF interpreter still exist?
A: NO. Classic BPF programs are converted into extended BPF instructions.
Q: Can BPF call arbitrary kernel functions?
A: NO. BPF programs can only call a set of helper functions which
is defined for every program type.
Q: Can BPF overwrite arbitrary kernel memory?
A: NO. Tracing bpf programs can _read_ arbitrary memory with bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.
Q: Can BPF overwrite arbitrary user memory?
A: Sort-of. Tracing BPF programs can overwrite the user memory
of the current task with bpf_probe_write_user(). Every time such
program is loaded the kernel will print warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.
Q: When bpf_trace_printk() helper is used the kernel prints nasty
warning message. Why is that?
A: This is done to nudge program authors into better interfaces when
programs need to pass data to user space. Like bpf_perf_event_output()
can be used to efficiently stream data via perf ring buffer.
BPF maps can be used for asynchronous data sharing between kernel
and user space. bpf_trace_printk() should only be used for debugging.
Q: Can BPF functionality such as new program or map types, new
helpers, etc be added out of kernel module code?
A: NO.

@ -0,0 +1,640 @@
=================================
HOWTO interact with BPF subsystem
=================================
This document provides information for the BPF subsystem about various
workflows related to reporting bugs, submitting patches, and queueing
patches for stable kernels.
For general information about submitting patches, please refer to
`Documentation/process/`_. This document only describes additional specifics
related to BPF.
.. contents::
    :local:
    :depth: 2

Reporting bugs
==============
Q: How do I report bugs for BPF kernel code?
--------------------------------------------
A: Since all BPF kernel development as well as bpftool and iproute2 BPF
loader development happens through the netdev kernel mailing list,
please report any found issues around BPF to the following mailing
list:
netdev@vger.kernel.org
This may also include issues related to XDP, BPF tracing, etc.
Given netdev has a high volume of traffic, please also add the BPF
maintainers to Cc (from kernel MAINTAINERS_ file):
* Alexei Starovoitov <ast@kernel.org>
* Daniel Borkmann <daniel@iogearbox.net>
In case a buggy commit has already been identified, make sure to keep
the actual commit authors in Cc as well for the report. They can
typically be identified through the kernel's git tree.
**Please do NOT report BPF issues to bugzilla.kernel.org since it
is a guarantee that the reported issue will be overlooked.**
Submitting patches
==================
Q: To which mailing list do I need to submit my BPF patches?
------------------------------------------------------------
A: Please submit your BPF patches to the netdev kernel mailing list:
netdev@vger.kernel.org
Historically, BPF came out of networking and has always been maintained
by the kernel networking community. Although these days BPF touches
many other subsystems as well, the patches are still routed mainly
through the networking community.
In case your patch has changes in various different subsystems (e.g.
tracing, security, etc), make sure to Cc the related kernel mailing
lists and maintainers from there as well, so they are able to review
the changes and provide their Acked-by's to the patches.
Q: Where can I find patches currently under discussion for BPF subsystem?
-------------------------------------------------------------------------
A: All patches that are Cc'ed to netdev are queued for review under netdev
patchwork project:
http://patchwork.ozlabs.org/project/netdev/list/
Patches that target BPF are assigned to a 'bpf' delegate for
further processing by the BPF maintainers. The current queue of
patches under review can be found at:
https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
Once the patches have been reviewed by the BPF community as a whole
and approved by the BPF maintainers, their status in patchwork will be
changed to 'Accepted' and the submitter will be notified by mail. This
means that the patches look good from a BPF perspective and have been
applied to one of the two BPF kernel trees.
In case feedback from the community requires a respin of the patches,
their status in patchwork will be set to 'Changes Requested', and they
will be purged from the current review queue. The same applies to patches
that get rejected or are not applicable to the BPF trees (but were
assigned to the 'bpf' delegate).
Q: How do the changes make their way into Linux?
------------------------------------------------
A: There are two BPF kernel trees (git repositories). Once patches have
been accepted by the BPF maintainers, they will be applied to one
of the two BPF trees:
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
The bpf tree itself is for fixes only, whereas bpf-next is for features,
cleanups or other kinds of improvements ("next-like" content). This is
analogous to the net and net-next trees for networking. Both bpf and
bpf-next will only have a master branch, so that it is always clear
which branch patches should be rebased against.
Accumulated BPF patches in the bpf tree will regularly get pulled
into the net kernel tree. Likewise, accumulated BPF patches accepted
into the bpf-next tree will make their way into net-next tree. net and
net-next are both run by David S. Miller. From there, they will go
into the kernel mainline tree run by Linus Torvalds. To read up on the
process of net and net-next being merged into the mainline tree, see
the `netdev FAQ`_ under:
`Documentation/networking/netdev-FAQ.txt`_
Occasionally, to prevent merge conflicts, we might send pull requests
to other trees (e.g. tracing) with a small subset of the patches, but
net and net-next are always the main trees targeted for integration.
The pull requests will contain a high-level summary of the accumulated
patches and can be searched on netdev kernel mailing list through the
following subject lines (``yyyy-mm-dd`` is the date of the pull
request)::

  pull-request: bpf yyyy-mm-dd
  pull-request: bpf-next yyyy-mm-dd

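To search a local mail archive for such pull requests, the subject line can be generated from a tree name and a date. The following is a minimal sketch. The function name ``pull_request_subject`` is hypothetical, and only the subject format itself comes from the text above.

```python
from datetime import date


def pull_request_subject(tree, day):
    """Build the subject line used for BPF pull requests on netdev.

    `tree` must be "bpf" or "bpf-next"; `day` is the pull request date.
    """
    if tree not in ("bpf", "bpf-next"):
        raise ValueError(f"unknown BPF tree: {tree!r}")
    # date.isoformat() yields yyyy-mm-dd, matching the documented format.
    return f"pull-request: {tree} {day.isoformat()}"


print(pull_request_subject("bpf-next", date(2018, 6, 5)))
```

The date used in the usage line is an arbitrary example, not a reference to a specific pull request.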
Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be applied to?
---------------------------------------------------------------------------------
A: The process is the very same as described in the `netdev FAQ`_, so
please read up on it. The subject line must indicate whether the
patch is a fix or rather "next-like" content in order to let the
maintainers know whether it is targeted at bpf or bpf-next.
For fixes eventually landing in bpf -> net tree, the subject must
look like::

  git format-patch --subject-prefix='PATCH bpf' start..finish

For features/improvements/etc that should eventually land in
bpf-next -> net-next, the subject must look like::

  git format-patch --subject-prefix='PATCH bpf-next' start..finish

If unsure whether the patch or patch series should go into bpf
or net directly, or bpf-next or net-next directly, it is not a
problem either if the subject line says net or net-next as target.
It is eventually up to the maintainers to do the delegation of
the patches.
If it is clear that patches should go into bpf or bpf-next tree,
please make sure to rebase the patches against those trees in
order to reduce potential conflicts.
In case the patch or patch series has to be reworked and sent out
again in a second or later revision, it is also required to add a
version number (``v2``, ``v3``, ...) into the subject prefix::

  git format-patch --subject-prefix='PATCH net-next v2' start..finish

When changes have been requested to the patch series, always send the
whole patch series again with the feedback incorporated (never send
individual diffs on top of the old series).
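The subject prefix rules above (target tree plus an optional revision number) can be captured in a small helper. This is a sketch under stated assumptions: ``subject_prefix`` and ``format_patch_argv`` are hypothetical names, and ``start..finish`` is the same placeholder revision range used in the examples above. Only the ``--subject-prefix`` option of ``git format-patch`` is relied upon.

```python
def subject_prefix(tree, version=1):
    """Return the subject prefix for a patch series.

    `tree` is the target tree (e.g. "bpf", "bpf-next", "net-next").
    A version number is appended only from the second revision onwards.
    """
    if version < 1:
        raise ValueError("version must be >= 1")
    prefix = f"PATCH {tree}"
    if version > 1:
        prefix += f" v{version}"
    return prefix


def format_patch_argv(tree, rev_range, version=1):
    """Build the argument list for `git format-patch` with the prefix set."""
    return [
        "git",
        "format-patch",
        f"--subject-prefix={subject_prefix(tree, version)}",
        rev_range,
    ]


print(format_patch_argv("bpf-next", "start..finish", version=2))
```

The resulting list can be passed to ``subprocess.run()`` from inside a kernel checkout. Building it as a list avoids shell quoting issues with the space in the prefix.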
Q: What does it mean when a patch gets applied to bpf or bpf-next tree?
-----------------------------------------------------------------------
A: It means that the patch looks good for mainline inclusion from
a BPF point of view.
Be aware that this is not a final verdict that the patch will
automatically get accepted into net or net-next trees eventually:
On the netdev kernel mailing list reviews can come in at any point
in time. If discussions around a patch conclude that it cannot
get included as-is, we will either apply a follow-up fix or drop
it from the trees entirely. Therefore, we also reserve the right to
rebase the trees when deemed necessary. After all, the purpose of the
tree is to:
i) accumulate and stage BPF patches for integration into trees
like net and net-next, and
ii) run an extensive BPF test suite and
workloads on the patches before they make their way any further.
Once the BPF pull request has been accepted by David S. Miller,
the patches end up in the net or net-next tree, respectively, and
make their way from there further into mainline. Again, see the
`netdev FAQ`_ for additional information, e.g. on how often they are
merged to mainline.
Q: How long do I need to wait for feedback on my BPF patches?
-------------------------------------------------------------
A: We try to keep the latency low. The usual time to feedback will
be around 2 or 3 business days. It may vary depending on the
complexity of changes and current patch load.
Q: How often do you send pull requests to major kernel trees like net or net-next?
----------------------------------------------------------------------------------
A: Pull requests will be sent out rather often in order to not
accumulate too many patches in bpf or bpf-next.
As a rule of thumb, expect pull requests for each tree regularly
at the end of the week. In some cases additional pull requests may
be sent in the middle of the week, depending on the current patch
load or urgency.
Q: Are patches applied to bpf-next when the merge window is open?
-----------------------------------------------------------------
A: For the time when the merge window is open, bpf-next will not be
processed. This is roughly analogous to net-next patch processing,
so feel free to read up on the `netdev FAQ`_ about further details.
During those two weeks of merge window, we might ask you to resend
your patch series once bpf-next is open again. Once Linus has released
a ``v*-rc1`` after the merge window, we continue processing of bpf-next.
For non-subscribers to kernel mailing lists, there is also a status
page run by David S. Miller on net-next that provides guidance:
http://vger.kernel.org/~davem/net-next.html
Q: Verifier changes and test cases
----------------------------------
Q: I made a BPF verifier change, do I need to add test cases for
BPF kernel selftests_?
A: If the patch has changes to the behavior of the verifier, then yes,
it is absolutely necessary to add test cases to the BPF kernel
selftests_ suite. If they are not present and we think they are
needed, then we might ask for them before accepting any changes.
In particular, test_verifier.c is tracking a high number of BPF test
cases, including a lot of corner cases that LLVM BPF back end may
generate out of the restricted C code. Thus, adding test cases is
absolutely crucial to make sure future changes do not accidentally
affect prior use-cases. Thus, treat those test cases as: verifier
behavior that is not tracked in test_verifier.c could potentially
be subject to change.
Q: samples/bpf preference vs selftests?
---------------------------------------
Q: When should I add code to `samples/bpf/`_ and when to BPF kernel
selftests_ ?
A: In general, we prefer additions to BPF kernel selftests_ rather than
`samples/bpf/`_. The rationale is very simple: kernel selftests are
regularly run by various bots to test for kernel regressions.
The more test cases we add to BPF selftests, the better the coverage
and the less likely it is that those could accidentally break. It is
not that BPF kernel selftests cannot demo how a specific feature can
be used.
That said, `samples/bpf/`_ may be a good place for people to get started,
so it might be advisable that simple demos of features could go into
`samples/bpf/`_, but advanced functional and corner-case testing rather
into kernel selftests.
If your sample looks like a test case, then go for BPF kernel selftests
instead!
Q: When should I add code to the bpftool?
-----------------------------------------
A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
a central user space tool for debugging and introspection of BPF programs
and maps that are active in the kernel. If UAPI changes related to BPF
make it possible to dump additional information about programs or maps,
then bpftool should be extended as well to support dumping it.
Q: When should I add code to iproute2's BPF loader?
---------------------------------------------------
A: For UAPI changes related to the XDP or tc layer (e.g. ``cls_bpf``),
the convention is that those control-path related changes are added to
iproute2's BPF loader as well from user space side. This is not only
useful to have UAPI changes properly designed to be usable, but also
to make those changes available to a wider user base of major
downstream distributions.
Q: Do you accept patches as well for iproute2's BPF loader?
-----------------------------------------------------------
A: Patches for the iproute2's BPF loader have to be sent to:
netdev@vger.kernel.org
While those patches are not processed by the BPF kernel maintainers,
please keep them in Cc as well, so they can be reviewed.
The official git repository for iproute2 is run by Stephen Hemminger
and can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/
The patches need to have a subject prefix of '``[PATCH iproute2
master]``' or '``[PATCH iproute2 net-next]``'. '``master``' or
'``net-next``' describes the target branch where the patch should be
applied to. Meaning, if kernel changes went into the net-next kernel
tree, then the related iproute2 changes need to go into the iproute2
net-next branch, otherwise they can be targeted at master branch. The
iproute2 net-next branch will get merged into the master branch after
the current iproute2 version from master has been released.
Like BPF, the patches end up in patchwork under the netdev project and
are delegated to 'shemminger' for further processing:
http://patchwork.ozlabs.org/project/netdev/list/?delegate=389
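The branch rule above can be captured in a few lines. The following is a
minimal sketch in Python, assuming a hypothetical helper
``iproute2_format_patch_cmd`` that is not part of iproute2 or the kernel
tree. Only ``git format-patch --subject-prefix`` and the prefixes themselves
come from the workflow described here; the ``vN`` suffix for respins is
assumed to follow the same convention as for kernel patches:

```python
def iproute2_format_patch_cmd(kernel_change_in_net_next, rev_range, version=1):
    """Build the git format-patch argv for an iproute2 BPF loader patch.

    Hypothetical helper for illustration only.
    """
    # Kernel side went into net-next -> iproute2 net-next branch, else master.
    branch = "net-next" if kernel_change_in_net_next else "master"
    prefix = "PATCH iproute2 " + branch
    if version > 1:
        # Respins carry a version number (v2, v3, ...) in the subject prefix.
        prefix += " v%d" % version
    return ["git", "format-patch", "--subject-prefix=" + prefix, rev_range]
```

The returned list can be handed to ``subprocess.run()`` from within an
iproute2 checkout; the helper itself only assembles the arguments.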
Q: What is the minimum requirement before I submit my BPF patches?
------------------------------------------------------------------
A: When submitting patches, always take the time and properly test your
patches *prior* to submission. Never rush them! If maintainers find
that your patches have not been properly tested, it is a good way to
get them grumpy. Testing patch submissions is a hard requirement!
Note, fixes that go to bpf tree *must* have a ``Fixes:`` tag included.
The same applies to fixes that target bpf-next, where the affected
commit is in net-next (or in some cases bpf-next). The ``Fixes:`` tag is
crucial in order to identify follow-up commits and tremendously helps
for people having to do backporting, so it is a must have!
We also don't accept patches with an empty commit message. Take your
time and properly write up a high quality commit message, it is
essential!
Think about it this way: other developers looking at your code a month
from now need to understand *why* a certain change has been done that
way, and whether there have been flaws in the analysis or assumptions
that the original author made. Thus providing a proper rationale and
describing the use-case for the changes is a must.
Patch submissions with >1 patch must have a cover letter which includes
a high level description of the series. This high level summary will
then be placed into the merge commit by the BPF maintainers such that
it is also accessible from the git log for future reference.
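Some of these requirements can be checked mechanically before sending. The
following is a minimal sketch in Python, assuming a hypothetical helper
``check_commit_message`` that is not part of the kernel tree. The expected
``Fixes:`` layout (at least 12 hex characters of the commit id followed by
the quoted subject) is the general kernel convention and not something
specific to BPF. Treating every ``Word: value`` line as a tag is a
simplification:

```python
import re

# Fixes: tag in the usual kernel form: >= 12 hex chars of the commit id,
# followed by the subject of the fixed commit in ("...").
FIXES_RE = re.compile(r'^Fixes: [0-9a-f]{12,40} \(".+"\)$', re.MULTILINE)

# Simplified notion of a tag line such as "Signed-off-by: ..." (assumption).
TAG_RE = re.compile(r'^[A-Za-z-]+: ')


def check_commit_message(message, is_fix):
    """Return a list of problems found in a commit message (hypothetical)."""
    problems = []
    lines = message.strip().splitlines()
    subject = lines[0] if lines else ""
    # Non-empty lines after the subject that are not tags count as body text.
    body = [l for l in lines[1:] if l.strip() and not TAG_RE.match(l)]
    if not subject:
        problems.append("missing subject line")
    if not body:
        problems.append("empty commit message body")
    if is_fix and not FIXES_RE.search(message):
        problems.append("missing Fixes: tag")
    return problems
```

An empty result only means that the formal parts are present. Whether the
message explains *why* the change was made still needs a human reader.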
Q: Features changing BPF JIT and/or LLVM
----------------------------------------
Q: What do I need to consider when adding a new instruction or feature
that would require BPF JIT and/or LLVM integration as well?
A: We try hard to keep all BPF JITs up to date such that the same user
experience can be guaranteed when running BPF programs on different
architectures without having the program punt to the less efficient
interpreter in case the in-kernel BPF JIT is enabled.
If you are unable to implement or test the required JIT changes for
certain architectures, please work together with the related BPF JIT
developers in order to get the feature implemented in a timely manner.
Please refer to the git log (``arch/*/net/``) to locate the necessary
people for helping out.
Also always make sure to add BPF test cases (e.g. test_bpf.c and
test_verifier.c) for new instructions, so that they can receive
broad test coverage and help run-time testing the various BPF JITs.
In case of new BPF instructions, once the changes have been accepted
into the Linux kernel, please implement support into LLVM's BPF back
end. See LLVM_ section below for further information.
Stable submission
=================
Q: I need a specific BPF commit in stable kernels. What should I do?
--------------------------------------------------------------------
A: In case you need a specific fix in stable kernels, first check whether
the commit has already been applied in the related ``linux-*.y`` branches:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/
If not the case, then drop an email to the BPF maintainers with the
netdev kernel mailing list in Cc and ask for the fix to be queued up:
netdev@vger.kernel.org
The process in general is the same as on netdev itself, see also the
`netdev FAQ`_ document.
Q: Do you also backport to kernels not currently maintained as stable?
----------------------------------------------------------------------
A: No. If you need a specific BPF commit in kernels that are currently not
maintained by the stable maintainers, then you are on your own.
The current stable and longterm stable kernels are all listed here:
https://www.kernel.org/
Q: The BPF patch I am about to submit needs to go to stable as well
-------------------------------------------------------------------
What should I do?
A: The same rules apply as with netdev patch submissions in general, see
`netdev FAQ`_ under:
`Documentation/networking/netdev-FAQ.txt`_
Never add "``Cc: stable@vger.kernel.org``" to the patch description, but
ask the BPF maintainers to queue the patches instead. This can be done
with a note, for example, under the ``---`` part of the patch which does
not go into the git log. Alternatively, this can be done as a simple
request by mail instead.
Q: Queue stable patches
-----------------------
Q: Where do I find currently queued BPF patches that will be submitted
to stable?
A: Once patches that fix critical bugs have been applied to the bpf tree,
they are queued up for stable submission under:
http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*
They will be on hold there at minimum until the related commit has made
its way into the mainline kernel tree.
After having been under broader exposure, the queued patches will be
submitted by the BPF maintainers to the stable maintainers.
Testing patches
===============
Q: How to run BPF selftests
---------------------------
A: After you have booted into the newly compiled kernel, navigate to
the BPF selftests_ suite in order to test BPF functionality (current
working directory points to the root of the cloned git tree)::

  $ cd tools/testing/selftests/bpf/
  $ make

To run the verifier tests::

  $ sudo ./test_verifier

The verifier tests print out all the current checks being
performed. The summary at the end of running all tests will dump
information of test successes and failures::

  Summary: 418 PASSED, 0 FAILED

In order to run through all BPF selftests, the following command is
needed::

  $ sudo make run_tests

See the kernel's selftest `Documentation/dev-tools/kselftest.rst`_
document for further documentation.
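When the verifier tests run from a script, the summary line can be parsed
instead of read by eye. The following is a minimal sketch in Python,
assuming the summary has exactly the ``Summary: N PASSED, M FAILED`` form
shown above (other kernel versions may print a different layout). The
helper name ``parse_verifier_summary`` is hypothetical:

```python
import re

# Matches the summary line in the form shown in this document.
SUMMARY_RE = re.compile(r"^Summary: (\d+) PASSED, (\d+) FAILED$", re.MULTILINE)


def parse_verifier_summary(output):
    """Return (passed, failed) from captured test_verifier output."""
    match = SUMMARY_RE.search(output)
    if match is None:
        # No recognizable summary: treat as an error rather than as success.
        raise ValueError("no summary line found")
    return int(match.group(1)), int(match.group(2))
```

A wrapper script could then fail its own run whenever the second value is
non-zero.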
Q: Which BPF kernel selftests version should I run my kernel against?
---------------------------------------------------------------------
A: If you run a kernel ``xyz``, then always run the BPF kernel selftests
from that kernel ``xyz`` as well. Do not expect that the BPF selftest
from the latest mainline tree will pass all the time.
In particular, test_bpf.c and test_verifier.c have a large number of
test cases and are constantly updated with new BPF test sequences, or
existing ones are adapted to verifier changes e.g. due to verifier
becoming smarter and being able to better track certain things.
LLVM
====
Q: Where do I find LLVM with BPF support?
-----------------------------------------
A: The BPF back end for LLVM is upstream in LLVM since version 3.7.1.
All major distributions these days ship LLVM with BPF back end enabled,
so for the majority of use-cases it is not required to compile LLVM by
hand anymore, just install the distribution provided package.
LLVM's static compiler lists the supported targets through
``llc --version``, make sure BPF targets are listed. Example::

  $ llc --version
  LLVM (http://llvm.org/):
    LLVM version 6.0.0svn
    Optimized build.
    Default target: x86_64-unknown-linux-gnu
    Host CPU: skylake

    Registered Targets:
      bpf    - BPF (host endian)
      bpfeb  - BPF (big endian)
      bpfel  - BPF (little endian)
      x86    - 32-bit X86: Pentium-Pro and above
      x86-64 - 64-bit X86: EM64T and AMD64

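The same check can be scripted. The following is a minimal sketch in Python,
assuming the output layout of the example above (a ``Registered Targets:``
header followed by ``name - description`` lines). The helper name
``registered_bpf_targets`` is hypothetical:

```python
def registered_bpf_targets(llc_version_output):
    """Return the BPF target names listed under 'Registered Targets:'."""
    targets = []
    in_targets = False
    for line in llc_version_output.splitlines():
        stripped = line.strip()
        if stripped == "Registered Targets:":
            in_targets = True
            continue
        if in_targets and " - " in stripped:
            # Each entry looks like "bpfel - BPF (little endian)".
            name = stripped.split(" - ", 1)[0].strip()
            if name.startswith("bpf"):
                targets.append(name)
    return targets
```

The input would typically be the captured standard output of
``llc --version``. An empty result suggests that the installed LLVM was
built without the BPF back end.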
To make use of the latest features added to LLVM's BPF back end,
developers are advised to run the latest LLVM releases. Support
for new BPF kernel features, such as additions to the BPF instruction
set, is often developed in LLVM and the kernel together.
All LLVM releases can be found at: http://releases.llvm.org/
Q: Got it, so how do I build LLVM manually anyway?
--------------------------------------------------
A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
that set up, proceed with building the latest LLVM and clang version
from the git repositories::

  $ git clone http://llvm.org/git/llvm.git
  $ cd llvm/tools
  $ git clone --depth 1 http://llvm.org/git/clang.git
  $ cd ..; mkdir build; cd build
  $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
             -DBUILD_SHARED_LIBS=OFF           \
             -DCMAKE_BUILD_TYPE=Release        \
             -DLLVM_BUILD_RUNTIME=OFF
  $ make -j $(getconf _NPROCESSORS_ONLN)

The built binaries can then be found in the build/bin/ directory, which
you can add to the PATH variable.
Q: Reporting LLVM BPF issues
----------------------------
Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
generation back end or about LLVM generated code that the verifier
refuses to accept?
A: Yes, please do!
LLVM's BPF back end is a key piece of the whole BPF
infrastructure and it ties deeply into verification of programs from the
kernel side. Therefore, any issues on either side need to be investigated
and fixed whenever necessary.
Please make sure to bring them up on the netdev kernel mailing
list and Cc the BPF maintainers for the LLVM and kernel bits:
* Yonghong Song <yhs@fb.com>
* Alexei Starovoitov <ast@kernel.org>
* Daniel Borkmann <daniel@iogearbox.net>
LLVM also has an issue tracker where BPF related bugs can be found:
https://bugs.llvm.org/buglist.cgi?quicksearch=bpf
However, it is better to reach out through the mailing lists with the
maintainers in Cc.
Q: New BPF instruction for kernel and LLVM
------------------------------------------
Q: I have added a new BPF instruction to the kernel, how can I integrate
it into LLVM?
A: LLVM has a ``-mcpu`` selector for the BPF back end in order to allow
the selection of BPF instruction set extensions. By default the
``generic`` processor target is used, which is the base instruction set
(v1) of BPF.
LLVM has an option to select ``-mcpu=probe`` where it will probe the host
kernel for supported BPF instruction set extensions and selects the
optimal set automatically.
For cross-compilation, a specific version can be selected manually as well::

  $ llc -march bpf -mcpu=help
  Available CPUs for this target:

    generic - Select the generic processor.
    probe   - Select the probe processor.
    v1      - Select the v1 processor.
    v2      - Select the v2 processor.
  [...]

Newly added BPF instructions to the Linux kernel need to follow the same
scheme, bump the instruction set version and implement probing for the
extensions such that ``-mcpu=probe`` users can benefit from the
optimization transparently when upgrading their kernels.
If you are unable to implement support for the newly added BPF instruction
please reach out to BPF developers for help.
By the way, the BPF kernel selftests run with ``-mcpu=probe`` for better
test coverage.
Q: clang flag for target bpf?
-----------------------------
Q: In some cases clang flag ``-target bpf`` is used but in other cases the
default clang target, which matches the underlying architecture, is used.
What is the difference and when should I use which?
A: Although LLVM IR generation and optimization try to stay architecture
independent, ``-target <arch>`` still has some impact on generated code:
- BPF program may recursively include header file(s) with file scope
inline assembly codes. The default target can handle this well,
while ``bpf`` target may fail if bpf backend assembler does not
understand these assembly codes, which is true in most cases.
- When compiled without ``-g``, additional elf sections, e.g.,
.eh_frame and .rela.eh_frame, may be present in the object file
with default target, but not with ``bpf`` target.
- The default target may turn a C switch statement into a switch table
lookup and jump operation. Since the switch table is placed
in the global readonly section, the bpf program will fail to load.
The bpf target does not support switch table optimization.
The clang option ``-fno-jump-tables`` can be used to disable
switch table generation.
- For clang ``-target bpf``, it is guaranteed that pointer or long /
unsigned long types will always have a width of 64 bit, no matter
whether underlying clang binary or default target (or kernel) is
32 bit. However, when native clang target is used, then it will
compile these types based on the underlying architecture's conventions,
meaning in case of 32 bit architecture, pointer or long / unsigned
long types e.g. in BPF context structure will have width of 32 bit
while the BPF LLVM back end still operates in 64 bit. The native
target is mostly needed in tracing for the case of walking ``pt_regs``
or other kernel structures where CPU's register width matters.
Otherwise, ``clang -target bpf`` is generally recommended.
You should use default target when:
- Your program includes a header file, e.g., ptrace.h, which eventually
pulls in some header files containing file scope host assembly codes.
- You can add ``-fno-jump-tables`` to work around the switch table issue.
Otherwise, you can use ``bpf`` target. Additionally, you *must* use bpf target
when:
- Your program uses data structures with pointer or long / unsigned long
types that interface with BPF helpers or context data structures. Access
into these structures is verified by the BPF verifier and may result
in verification failures if the native architecture is not aligned with
the BPF architecture, e.g. 64-bit. An example of this is
BPF_PROG_TYPE_SK_MSG, which requires ``-target bpf``.
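The selection rules above can be summarized as a small decision helper. The
following is a minimal sketch in Python. The function name
``choose_clang_target`` and its two boolean parameters are hypothetical
names for the two conditions listed above, and the handling of the case
where both apply is an assumption, since the text does not cover it:

```python
def choose_clang_target(includes_host_asm_headers, ptr_or_long_in_bpf_context):
    """Return "bpf" or "default" following the rules listed above."""
    if includes_host_asm_headers and ptr_or_long_in_bpf_context:
        # The rules do not say how to resolve this conflict, so do not guess.
        raise ValueError("conflicting requirements: not covered by the rules")
    if includes_host_asm_headers:
        # File scope host assembly is usually not understood by the bpf
        # backend assembler; -fno-jump-tables avoids the switch table issue
        # with the default target.
        return "default"
    # Generally recommended, and mandatory when pointer or long types in
    # helper or context structures must be 64 bit wide.
    return "bpf"
```

For example, a tracing program that includes ptrace.h would end up with the
default target, while a BPF_PROG_TYPE_SK_MSG program would get ``bpf``.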
.. Links
.. _Documentation/process/: https://www.kernel.org/doc/html/latest/process/
.. _MAINTAINERS: ../../MAINTAINERS
.. _Documentation/networking/netdev-FAQ.txt: ../networking/netdev-FAQ.txt
.. _netdev FAQ: ../networking/netdev-FAQ.txt
.. _samples/bpf/: ../../samples/bpf/
.. _selftests: ../../tools/testing/selftests/bpf/
.. _Documentation/dev-tools/kselftest.rst:
https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html
Happy BPF hacking!

@ -1,570 +0,0 @@
This document provides information for the BPF subsystem about various
workflows related to reporting bugs, submitting patches, and queueing
patches for stable kernels.
For general information about submitting patches, please refer to
Documentation/process/. This document only describes additional specifics
related to BPF.
Reporting bugs:
---------------
Q: How do I report bugs for BPF kernel code?
A: Since all BPF kernel development as well as bpftool and iproute2 BPF
loader development happens through the netdev kernel mailing list,
please report any found issues around BPF to the following mailing
list:
netdev@vger.kernel.org
This may also include issues related to XDP, BPF tracing, etc.
Given netdev has a high volume of traffic, please also add the BPF
maintainers to Cc (from kernel MAINTAINERS file):
Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann <daniel@iogearbox.net>
In case a buggy commit has already been identified, make sure to keep
the actual commit authors in Cc as well for the report. They can
typically be identified through the kernel's git tree.
Please do *not* report BPF issues to bugzilla.kernel.org since it
is a guarantee that the reported issue will be overlooked.
Submitting patches:
-------------------
Q: To which mailing list do I need to submit my BPF patches?
A: Please submit your BPF patches to the netdev kernel mailing list:
netdev@vger.kernel.org
Historically, BPF came out of networking and has always been maintained
by the kernel networking community. Although these days BPF touches
many other subsystems as well, the patches are still routed mainly
through the networking community.
In case your patch has changes in various different subsystems (e.g.
tracing, security, etc), make sure to Cc the related kernel mailing
lists and maintainers from there as well, so they are able to review
the changes and provide their Acked-by's to the patches.
Q: Where can I find patches currently under discussion for BPF subsystem?
A: All patches that are Cc'ed to netdev are queued for review under netdev
patchwork project:
http://patchwork.ozlabs.org/project/netdev/list/
Those patches which target BPF, are assigned to a 'bpf' delegate for
further processing from BPF maintainers. The current queue with
patches under review can be found at:
https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
Once the patches have been reviewed by the BPF community as a whole
and approved by the BPF maintainers, their status in patchwork will be
changed to 'Accepted' and the submitter will be notified by mail. This
means that the patches look good from a BPF perspective and have been
applied to one of the two BPF kernel trees.
In case feedback from the community requires a respin of the patches,
their status in patchwork will be set to 'Changes Requested', and purged
from the current review queue. Likewise for cases where patches would
get rejected or are not applicable to the BPF trees (but assigned to
the 'bpf' delegate).
Q: How do the changes make their way into Linux?
A: There are two BPF kernel trees (git repositories). Once patches have
been accepted by the BPF maintainers, they will be applied to one
of the two BPF trees:
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
The bpf tree itself is for fixes only, whereas bpf-next for features,
cleanups or other kind of improvements ("next-like" content). This is
analogous to net and net-next trees for networking. Both bpf and
bpf-next will only have a master branch in order to simplify against
which branch patches should get rebased to.
Accumulated BPF patches in the bpf tree will regularly get pulled
into the net kernel tree. Likewise, accumulated BPF patches accepted
into the bpf-next tree will make their way into net-next tree. net and
net-next are both run by David S. Miller. From there, they will go
into the kernel mainline tree run by Linus Torvalds. To read up on the
process of net and net-next being merged into the mainline tree, see
the netdev FAQ under:
Documentation/networking/netdev-FAQ.txt
Occasionally, to prevent merge conflicts, we might send pull requests
to other trees (e.g. tracing) with a small subset of the patches, but
net and net-next are always the main trees targeted for integration.
The pull requests will contain a high-level summary of the accumulated
patches and can be searched on netdev kernel mailing list through the
following subject lines (yyyy-mm-dd is the date of the pull request):
pull-request: bpf yyyy-mm-dd
pull-request: bpf-next yyyy-mm-dd
Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be
applied to?
A: The process is the very same as described in the netdev FAQ, so
please read up on it. The subject line must indicate whether the
patch is a fix or rather "next-like" content in order to let the
maintainers know whether it is targeted at bpf or bpf-next.
For fixes eventually landing in bpf -> net tree, the subject must
look like:
git format-patch --subject-prefix='PATCH bpf' start..finish
For features/improvements/etc that should eventually land in
bpf-next -> net-next, the subject must look like:
git format-patch --subject-prefix='PATCH bpf-next' start..finish
If unsure whether the patch or patch series should go into bpf
or net directly, or bpf-next or net-next directly, it is not a
problem either if the subject line says net or net-next as target.
It is eventually up to the maintainers to do the delegation of
the patches.
If it is clear that patches should go into bpf or bpf-next tree,
please make sure to rebase the patches against those trees in
order to reduce potential conflicts.
In case the patch or patch series has to be reworked and sent out
again in a second or later revision, it is also required to add a
version number (v2, v3, ...) into the subject prefix:
git format-patch --subject-prefix='PATCH net-next v2' start..finish
When changes have been requested to the patch series, always send the
whole patch series again with the feedback incorporated (never send
individual diffs on top of the old series).
Q: What does it mean when a patch gets applied to bpf or bpf-next tree?
A: It means that the patch looks good for mainline inclusion from
a BPF point of view.
Be aware that this is not a final verdict that the patch will
automatically get accepted into net or net-next trees eventually:
On the netdev kernel mailing list reviews can come in at any point
in time. If discussions around a patch conclude that they cannot
get included as-is, we will either apply a follow-up fix or drop
them from the trees entirely. Therefore, we also reserve the right to rebase
the trees when deemed necessary. After all, the purpose of the tree
is to i) accumulate and stage BPF patches for integration into trees
like net and net-next, and ii) run extensive BPF test suite and
workloads on the patches before they make their way any further.
Once the BPF pull request was accepted by David S. Miller, then
the patches end up in net or net-next tree, respectively, and
make their way from there further into mainline. Again, see the
netdev FAQ for additional information e.g. on how often they are
merged to mainline.
Q: How long do I need to wait for feedback on my BPF patches?
A: We try to keep the latency low. The usual time to feedback will
be around 2 or 3 business days. It may vary depending on the
complexity of changes and current patch load.
Q: How often do you send pull requests to major kernel trees like
net or net-next?
A: Pull requests will be sent out rather often in order to not
accumulate too many patches in bpf or bpf-next.
As a rule of thumb, expect pull requests for each tree regularly
at the end of the week. In some cases pull requests could additionally
come also in the middle of the week depending on the current patch
load or urgency.
Q: Are patches applied to bpf-next when the merge window is open?
A: For the time when the merge window is open, bpf-next will not be
processed. This is roughly analogous to net-next patch processing,
so feel free to read up on the netdev FAQ about further details.
During those two weeks of merge window, we might ask you to resend
your patch series once bpf-next is open again. Once Linus released
a v*-rc1 after the merge window, we continue processing of bpf-next.
For non-subscribers to kernel mailing lists, there is also a status
page run by David S. Miller on net-next that provides guidance:
http://vger.kernel.org/~davem/net-next.html
Q: I made a BPF verifier change, do I need to add test cases for
BPF kernel selftests?
A: If the patch has changes to the behavior of the verifier, then yes,
it is absolutely necessary to add test cases to the BPF kernel
selftests suite. If they are not present and we think they are
needed, then we might ask for them before accepting any changes.
In particular, test_verifier.c is tracking a high number of BPF test
cases, including a lot of corner cases that LLVM BPF back end may
generate out of the restricted C code. Thus, adding test cases is
absolutely crucial to make sure future changes do not accidentally
affect prior use-cases. Thus, treat those test cases as: verifier
behavior that is not tracked in test_verifier.c could potentially
be subject to change.
Q: When should I add code to samples/bpf/ and when to BPF kernel
selftests?
A: In general, we prefer additions to BPF kernel selftests rather than
samples/bpf/. The rationale is very simple: kernel selftests are
regularly run by various bots to test for kernel regressions.
The more test cases we add to BPF selftests, the better the coverage
and the less likely it is that those could accidentally break. It is
not that BPF kernel selftests cannot demo how a specific feature can
be used.
That said, samples/bpf/ may be a good place for people to get started,
so it might be advisable that simple demos of features could go into
samples/bpf/, but advanced functional and corner-case testing rather
into kernel selftests.
If your sample looks like a test case, then go for BPF kernel selftests
instead!
Q: When should I add code to the bpftool?
A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
a central user space tool for debugging and introspection of BPF programs
and maps that are active in the kernel. If UAPI changes related to BPF
make it possible to dump additional information about programs or maps,
then bpftool should be extended as well to support dumping it.
Q: When should I add code to iproute2's BPF loader?
A: For UAPI changes related to the XDP or tc layer (e.g. cls_bpf), the
convention is that those control-path related changes are added to
iproute2's BPF loader as well from user space side. This is not only
useful to have UAPI changes properly designed to be usable, but also
to make those changes available to a wider user base of major
downstream distributions.
Q: Do you accept patches as well for iproute2's BPF loader?
A: Patches for the iproute2's BPF loader have to be sent to:
netdev@vger.kernel.org
While those patches are not processed by the BPF kernel maintainers,
please keep them in Cc as well, so they can be reviewed.
The official git repository for iproute2 is run by Stephen Hemminger
and can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/
The patches need to have a subject prefix of '[PATCH iproute2 master]'
or '[PATCH iproute2 net-next]'. 'master' or 'net-next' describes the
target branch where the patch should be applied to. Meaning, if kernel
changes went into the net-next kernel tree, then the related iproute2
changes need to go into the iproute2 net-next branch, otherwise they
can be targeted at master branch. The iproute2 net-next branch will get
merged into the master branch after the current iproute2 version from
master has been released.
Like BPF, the patches end up in patchwork under the netdev project and
are delegated to 'shemminger' for further processing:
http://patchwork.ozlabs.org/project/netdev/list/?delegate=389
Q: What is the minimum requirement before I submit my BPF patches?
A: When submitting patches, always take the time and properly test your
patches *prior* to submission. Never rush them! If maintainers find
that your patches have not been properly tested, it is a good way to
get them grumpy. Testing patch submissions is a hard requirement!
Note, fixes that go to bpf tree *must* have a Fixes: tag included. The
same applies to fixes that target bpf-next, where the affected commit
is in net-next (or in some cases bpf-next). The Fixes: tag is crucial
in order to identify follow-up commits and tremendously helps for people
having to do backporting, so it is a must have!
We also don't accept patches with an empty commit message. Take your
time and properly write up a high quality commit message, it is
essential!
Think about it this way: other developers looking at your code a month
from now need to understand *why* a certain change has been done that
way, and whether there have been flaws in the analysis or assumptions
that the original author made. Thus providing a proper rationale and
describing the use-case for the changes is a must.
Patch submissions with >1 patch must have a cover letter which includes
a high level description of the series. This high level summary will
then be placed into the merge commit by the BPF maintainers such that
it is also accessible from the git log for future reference.
Q: What do I need to consider when adding a new instruction or feature
that would require BPF JIT and/or LLVM integration as well?
A: We try hard to keep all BPF JITs up to date such that the same user
experience can be guaranteed when running BPF programs on different
architectures without having the program punt to the less efficient
interpreter in case the in-kernel BPF JIT is enabled.
If you are unable to implement or test the required JIT changes for
certain architectures, please work together with the related BPF JIT
developers in order to get the feature implemented in a timely manner.
Please refer to the git log (arch/*/net/) to locate the necessary
people for helping out.
Also always make sure to add BPF test cases (e.g. test_bpf.c and
test_verifier.c) for new instructions, so that they can receive
broad test coverage and help run-time testing the various BPF JITs.
In case of new BPF instructions, once the changes have been accepted
into the Linux kernel, please implement support into LLVM's BPF back
end. See LLVM section below for further information.
Stable submission:
------------------
Q: I need a specific BPF commit in stable kernels. What should I do?
A: In case you need a specific fix in stable kernels, first check whether
the commit has already been applied in the related linux-*.y branches:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/
If not the case, then drop an email to the BPF maintainers with the
netdev kernel mailing list in Cc and ask for the fix to be queued up:
netdev@vger.kernel.org
The process in general is the same as on netdev itself, see also the
netdev FAQ document.
Q: Do you also backport to kernels not currently maintained as stable?
A: No. If you need a specific BPF commit in kernels that are currently not
maintained by the stable maintainers, then you are on your own.
The current stable and longterm stable kernels are all listed here:
https://www.kernel.org/
Q: The BPF patch I am about to submit needs to go to stable as well. What
should I do?
A: The same rules apply as with netdev patch submissions in general, see
netdev FAQ under:
Documentation/networking/netdev-FAQ.txt
Never add "Cc: stable@vger.kernel.org" to the patch description, but
ask the BPF maintainers to queue the patches instead. This can be done
with a note, for example, under the "---" part of the patch which does
not go into the git log. Alternatively, this can be done as a simple
request by mail instead.
Q: Where do I find currently queued BPF patches that will be submitted
to stable?
A: Once patches that fix critical bugs have been applied to the bpf tree,
they are queued up for stable submission under:
http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*
They will be on hold there at minimum until the related commit has made
its way into the mainline kernel tree.
After having been under broader exposure, the queued patches will be
submitted by the BPF maintainers to the stable maintainers.
Testing patches:
----------------
Q: Which BPF kernel selftests version should I run my kernel against?
A: If you run a kernel xyz, then always run the BPF kernel selftests from
that kernel xyz as well. Do not expect that the BPF selftest from the
latest mainline tree will pass all the time.
In particular, test_bpf.c and test_verifier.c have a large number of
test cases and are constantly updated with new BPF test sequences, or
existing ones are adapted to verifier changes, e.g. due to the verifier
becoming smarter and being able to better track certain things.
LLVM:
-----
Q: Where do I find LLVM with BPF support?
A: The BPF back end for LLVM has been upstream in LLVM since version 3.7.1.
All major distributions these days ship LLVM with BPF back end enabled,
so for the majority of use-cases it is not required to compile LLVM by
hand anymore, just install the distribution provided package.
LLVM's static compiler lists the supported targets through 'llc --version';
make sure that the BPF targets are listed. Example:
$ llc --version
LLVM (http://llvm.org/):
LLVM version 6.0.0svn
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: skylake
Registered Targets:
bpf - BPF (host endian)
bpfeb - BPF (big endian)
bpfel - BPF (little endian)
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
To make use of the latest features added to LLVM's BPF back end,
developers are advised to run the latest LLVM releases. Support
for new BPF kernel features, such as additions to the BPF instruction
set, is often developed together with the kernel changes.
All LLVM releases can be found at: http://releases.llvm.org/
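The check for registered BPF targets described above can also be scripted.
The following is a minimal Python sketch, assuming the output layout shown
in the 'llc --version' example; the function name and the sample text are
illustrative only and not part of LLVM or the kernel tree.

```python
def registered_bpf_targets(llc_version_output):
    """Return the BPF target names listed under 'Registered Targets:'."""
    targets = []
    in_targets = False
    for line in llc_version_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Registered Targets:"):
            in_targets = True
            continue
        if not in_targets or not stripped:
            continue
        # Assumption: each entry looks like "<name> - <description>".
        name = stripped.split(" - ", 1)[0].strip()
        if name in ("bpf", "bpfeb", "bpfel"):
            targets.append(name)
    return targets


# Sample text modelled on the example output above.
SAMPLE = """\
LLVM (http://llvm.org/):
  LLVM version 6.0.0svn
  Optimized build.
  Registered Targets:
    bpf    - BPF (host endian)
    bpfeb  - BPF (big endian)
    bpfel  - BPF (little endian)
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64
"""

print(registered_bpf_targets(SAMPLE))
```

In practice the text would come from running 'llc --version' (for example
through Python's subprocess module); an empty result suggests that the
installed LLVM was built without the BPF back end.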
Q: Got it, so how do I build LLVM manually anyway?
A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
that set up, proceed with building the latest LLVM and clang version
from the git repositories:
$ git clone http://llvm.org/git/llvm.git
$ cd llvm/tools
$ git clone --depth 1 http://llvm.org/git/clang.git
$ cd ..; mkdir build; cd build
$ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
-DBUILD_SHARED_LIBS=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_BUILD_RUNTIME=OFF
$ make -j $(getconf _NPROCESSORS_ONLN)
The built binaries can then be found in the build/bin/ directory, which
you can add to your PATH variable.
Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
generation back end or about LLVM generated code that the verifier
refuses to accept?
A: Yes, please do! LLVM's BPF back end is a key piece of the whole BPF
infrastructure and it ties deeply into verification of programs from the
kernel side. Therefore, any issues on either side need to be investigated
and fixed whenever necessary.
Please make sure to bring them up on the netdev kernel mailing
list and Cc the BPF maintainers for LLVM and kernel bits:
Yonghong Song <yhs@fb.com>
Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann <daniel@iogearbox.net>
LLVM also has an issue tracker where BPF related bugs can be found:
https://bugs.llvm.org/buglist.cgi?quicksearch=bpf
However, it is better to reach out through the mailing lists with the
maintainers in Cc.
Q: I have added a new BPF instruction to the kernel, how can I integrate
it into LLVM?
A: LLVM has a -mcpu selector for the BPF back end in order to allow the
selection of BPF instruction set extensions. By default the 'generic'
processor target is used, which is the base instruction set (v1) of BPF.
LLVM has an option to select -mcpu=probe, where it will probe the host
kernel for supported BPF instruction set extensions and select the
optimal set automatically.
For cross-compilation, a specific version can be selected manually as well.
$ llc -march bpf -mcpu=help
Available CPUs for this target:
generic - Select the generic processor.
probe - Select the probe processor.
v1 - Select the v1 processor.
v2 - Select the v2 processor.
[...]
BPF instructions newly added to the Linux kernel need to follow the same
scheme: bump the instruction set version and implement probing for the
extensions such that -mcpu=probe users can benefit from the optimization
transparently when upgrading their kernels.
If you are unable to implement support for the newly added BPF instruction
please reach out to BPF developers for help.
By the way, the BPF kernel selftests run with -mcpu=probe for better
test coverage.
Q: In some cases clang flag "-target bpf" is used but in other cases the
default clang target, which matches the underlying architecture, is used.
What is the difference and when should I use which?
A: Although LLVM IR generation and optimization try to stay architecture
independent, "-target <arch>" still has some impact on generated code:
- A BPF program may recursively include header file(s) with file scope
inline assembly code. The default target can handle this well,
while the bpf target may fail if the bpf backend assembler does not
understand this assembly code, which is true in most cases.
- When compiled without -g, additional elf sections, e.g.,
.eh_frame and .rela.eh_frame, may be present in the object file
with default target, but not with bpf target.
- The default target may turn a C switch statement into a switch table
lookup and jump operation. Since the switch table is placed
in the global readonly section, the bpf program will fail to load.
The bpf target does not support switch table optimization.
The clang option "-fno-jump-tables" can be used to disable
switch table generation.
- For clang -target bpf, it is guaranteed that pointer or long /
unsigned long types will always have a width of 64 bit, no matter
whether underlying clang binary or default target (or kernel) is
32 bit. However, when native clang target is used, then it will
compile these types based on the underlying architecture's conventions,
meaning in case of 32 bit architecture, pointer or long / unsigned
long types e.g. in BPF context structure will have width of 32 bit
while the BPF LLVM back end still operates in 64 bit. The native
target is mostly needed in tracing for the case of walking pt_regs
or other kernel structures where CPU's register width matters.
Otherwise, clang -target bpf is generally recommended.
You should use default target when:
- Your program includes a header file, e.g., ptrace.h, which eventually
pulls in some header files containing file scope host assembly codes.
- You can add "-fno-jump-tables" to work around the switch table issue.
Otherwise, you can use bpf target. Additionally, you _must_ use bpf target
when:
- Your program uses data structures with pointer or long / unsigned long
types that interface with BPF helpers or context data structures. Access
into these structures is verified by the BPF verifier and may result
in verification failures if the native architecture is not aligned with
the BPF architecture, e.g. 64-bit. An example of this is
BPF_PROG_TYPE_SK_MSG, which requires '-target bpf'.
Happy BPF hacking!

@ -0,0 +1,14 @@
STMicroelectronics STM32 Platforms System Controller
Properties:
- compatible : should contain two values. First value must be :
- "st,stm32mp157-syscfg" - for stm32mp157 based SoCs,
second value must be always "syscon".
- reg : offset and length of the register set.
Example:
syscfg: syscon@50020000 {
compatible = "st,stm32mp157-syscfg", "syscon";
reg = <0x50020000 0x400>;
};

@ -82,8 +82,6 @@ linked into one DSA cluster.
switch0: switch0@0 {
compatible = "marvell,mv88e6085";
#address-cells = <1>;
#size-cells = <0>;
reg = <0>;
dsa,member = <0 0>;
@ -135,8 +133,6 @@ linked into one DSA cluster.
switch1: switch1@0 {
compatible = "marvell,mv88e6085";
#address-cells = <1>;
#size-cells = <0>;
reg = <0>;
dsa,member = <0 1>;
@ -204,8 +200,6 @@ linked into one DSA cluster.
switch2: switch2@0 {
compatible = "marvell,mv88e6085";
#address-cells = <1>;
#size-cells = <0>;
reg = <0>;
dsa,member = <0 2>;

@ -2,7 +2,10 @@
Required properties:
- compatible: should be "qca,qca8337"
- compatible: should be one of:
"qca,qca8334"
"qca,qca8337"
- #size-cells: must be 0
- #address-cells: must be 1
@ -14,6 +17,20 @@ port and PHY id, each subnode describing a port needs to have a valid phandle
referencing the internal PHY connected to it. The CPU port of this switch is
always port 0.
A CPU port node has the following optional node:
- fixed-link : Fixed-link subnode describing a link to a non-MDIO
managed entity. See
Documentation/devicetree/bindings/net/fixed-link.txt
for details.
For QCA8K the 'fixed-link' sub-node supports only the following properties:
- 'speed' (integer, mandatory), to indicate the link speed. Accepted
values are 10, 100 and 1000
- 'full-duplex' (boolean, optional), to indicate that full duplex is
used. When absent, half duplex is assumed.
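The constraints on the QCA8K 'fixed-link' sub-node listed above can be
expressed as a small check. This is a hedged Python sketch, not part of the
binding or the driver: the helper names and the dictionary representation
of the node are assumptions made for illustration.

```python
def check_qca8k_fixed_link(props):
    """Return a list of problems found in a 'fixed-link' property dict.

    Hypothetical helper: 'props' maps property names to values, with a
    boolean property such as 'full-duplex' represented as True.
    """
    problems = []
    for name in props:
        # Only 'speed' and 'full-duplex' are supported for QCA8K.
        if name not in ("speed", "full-duplex"):
            problems.append("unsupported property: " + name)
    if "speed" not in props:
        problems.append("'speed' is mandatory")
    elif props["speed"] not in (10, 100, 1000):
        problems.append("'speed' must be 10, 100 or 1000")
    return problems


def is_full_duplex(props):
    # When 'full-duplex' is absent, half duplex is assumed.
    return bool(props.get("full-duplex", False))


print(check_qca8k_fixed_link({"speed": 1000, "full-duplex": True}))
```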
Example:
@ -53,6 +70,10 @@ Example:
label = "cpu";
ethernet = <&gmac1>;
phy-mode = "rgmii";
fixed-link {
speed = 1000;
full-duplex;
};
};
port@1 {

@ -7,6 +7,7 @@ Required properties:
- compatible: must be one of the following strings:
"allwinner,sun8i-a83t-emac"
"allwinner,sun8i-h3-emac"
"allwinner,sun8i-r40-gmac"
"allwinner,sun8i-v3s-emac"
"allwinner,sun50i-a64-emac"
- reg: address and length of the register for the device.
@ -20,18 +21,18 @@ Required properties:
- phy-handle: See ethernet.txt
- #address-cells: shall be 1
- #size-cells: shall be 0
- syscon: A phandle to the syscon of the SoC with one of the following
compatible string:
- allwinner,sun8i-h3-system-controller
- allwinner,sun8i-v3s-system-controller
- allwinner,sun50i-a64-system-controller
- allwinner,sun8i-a83t-system-controller
- syscon: A phandle to the device containing the EMAC or GMAC clock register
Optional properties:
- allwinner,tx-delay-ps: TX clock delay chain value in ps. Range value is 0-700. Default is 0)
- allwinner,rx-delay-ps: RX clock delay chain value in ps. Range value is 0-3100. Default is 0)
Both delay properties need to be a multiple of 100. They control the delay for
external PHY.
- allwinner,tx-delay-ps: TX clock delay chain value in ps.
Range is 0-700. Default is 0.
Unavailable for allwinner,sun8i-r40-gmac
- allwinner,rx-delay-ps: RX clock delay chain value in ps.
Range is 0-3100. Default is 0.
Range is 0-700 for allwinner,sun8i-r40-gmac
Both delay properties need to be a multiple of 100. They control the
clock delay for external RGMII PHY. They do not apply to the internal
PHY or external non-RGMII PHYs.
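The range and granularity rules for the two delay properties can be
summarised in code. The following Python sketch is illustrative only and
assumes nothing beyond the rules stated above; the function name and its
return convention are hypothetical.

```python
def check_delay_ps(prop, value_ps, compatible):
    """Return None if the delay value is valid, else a short reason."""
    is_r40 = compatible == "allwinner,sun8i-r40-gmac"
    if prop == "allwinner,tx-delay-ps":
        if is_r40:
            return "tx delay is unavailable for " + compatible
        upper = 700
    elif prop == "allwinner,rx-delay-ps":
        # The R40 GMAC has a narrower RX range than the other variants.
        upper = 700 if is_r40 else 3100
    else:
        return "unknown property " + prop
    if not 0 <= value_ps <= upper:
        return "value outside range 0-%d" % upper
    if value_ps % 100 != 0:
        return "value is not a multiple of 100"
    return None


print(check_delay_ps("allwinner,rx-delay-ps", 3100, "allwinner,sun8i-h3-emac"))
```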
Optional properties for the following compatibles:
- "allwinner,sun8i-h3-emac",

@ -86,70 +86,4 @@ Example:
* Gianfar PTP clock nodes
General Properties:
- compatible Should be "fsl,etsec-ptp"
- reg Offset and length of the register set for the device
- interrupts There should be at least two interrupts. Some devices
have as many as four PTP related interrupts.
Clock Properties:
- fsl,cksel Timer reference clock source.
- fsl,tclk-period Timer reference clock period in nanoseconds.
- fsl,tmr-prsc Prescaler, divides the output clock.
- fsl,tmr-add Frequency compensation value.
- fsl,tmr-fiper1 Fixed interval period pulse generator.
- fsl,tmr-fiper2 Fixed interval period pulse generator.
- fsl,max-adj Maximum frequency adjustment in parts per billion.
These properties set the operational parameters for the PTP
clock. You must choose these carefully for the clock to work right.
Here is how to figure good values:
TimerOsc = selected reference clock MHz
tclk_period = desired clock period nanoseconds
NominalFreq = 1000 / tclk_period MHz
FreqDivRatio = TimerOsc / NominalFreq (must be greater than 1.0)
tmr_add = ceil(2^32 / FreqDivRatio)
OutputClock = NominalFreq / tmr_prsc MHz
PulseWidth = 1 / OutputClock microseconds
FiperFreq1 = desired frequency in Hz
FiperDiv1 = 1000000 * OutputClock / FiperFreq1
tmr_fiper1 = tmr_prsc * tclk_period * FiperDiv1 - tclk_period
max_adj = 1000000000 * (FreqDivRatio - 1.0) - 1
The calculation for tmr_fiper2 is the same as for tmr_fiper1. The
driver expects that tmr_fiper1 will be correctly set to produce a 1
Pulse Per Second (PPS) signal, since this will be offered to the PPS
subsystem to synchronize the Linux clock.
The reference clock source is determined by the value held in the
CKSEL bits of the TMR_CTRL register. The "fsl,cksel" property holds
the value that is written directly into those bits; according to the
reference manual, the following clock sources can be used:
<0> - external high precision timer reference clock (TSEC_TMR_CLK
input is used for this purpose);
<1> - eTSEC system clock;
<2> - eTSEC1 transmit clock;
<3> - RTC clock input.
When this attribute is not used, eTSEC system clock will serve as
IEEE 1588 timer reference clock.
Example:
ptp_clock@24e00 {
compatible = "fsl,etsec-ptp";
reg = <0x24E00 0xB0>;
interrupts = <12 0x8 13 0x8>;
interrupt-parent = < &ipic >;
fsl,cksel = <1>;
fsl,tclk-period = <10>;
fsl,tmr-prsc = <100>;
fsl,tmr-add = <0x999999A4>;
fsl,tmr-fiper1 = <0x3B9AC9F6>;
fsl,tmr-fiper2 = <0x00018696>;
fsl,max-adj = <659999998>;
};
Refer to Documentation/devicetree/bindings/ptp/ptp-qoriq.txt

@ -11,6 +11,7 @@ Required properties on all platforms:
- "amlogic,meson8b-dwmac"
- "amlogic,meson8m2-dwmac"
- "amlogic,meson-gxbb-dwmac"
- "amlogic,meson-axg-dwmac"
Additionally "snps,dwmac" and any applicable more
detailed version number described in net/stmmac.txt
should be used.

@ -0,0 +1,54 @@
Microchip LAN78xx Gigabit Ethernet controller
The LAN78XX devices are usually configured by programming their OTP or with
an external EEPROM, but some platforms (e.g. Raspberry Pi 3 B+) have neither.
The Device Tree properties, if present, override the OTP and EEPROM.
Required properties:
- compatible: Should be one of "usb424,7800", "usb424,7801" or "usb424,7850".
Optional properties:
- local-mac-address: see ethernet.txt
- mac-address: see ethernet.txt
Optional properties of the embedded PHY:
- microchip,led-modes: a 0..4 element vector, with each element configuring
the operating mode of an LED. Omitted LEDs are turned off. Allowed values
are defined in "include/dt-bindings/net/microchip-lan78xx.h".
Example:
/* Based on the configuration for a Raspberry Pi 3 B+ */
&usb {
usb-port@1 {
compatible = "usb424,2514";
reg = <1>;
#address-cells = <1>;
#size-cells = <0>;
usb-port@1 {
compatible = "usb424,2514";
reg = <1>;
#address-cells = <1>;
#size-cells = <0>;
ethernet: ethernet@1 {
compatible = "usb424,7800";
reg = <1>;
local-mac-address = [ 00 11 22 33 44 55 ];
mdio {
#address-cells = <0x1>;
#size-cells = <0x0>;
eth_phy: ethernet-phy@1 {
reg = <1>;
microchip,led-modes = <
LAN78XX_LINK_1000_ACTIVITY
LAN78XX_LINK_10_100_ACTIVITY
>;
};
};
};
};
};
};

@ -0,0 +1,26 @@
Microsemi MII Management Controller (MIIM) / MDIO
=================================================
Properties:
- compatible: must be "mscc,ocelot-miim"
- reg: The base address of the MDIO bus controller register bank. Optionally, a
second register bank can be defined if there is an associated reset register
for internal PHYs
- #address-cells: Must be <1>.
- #size-cells: Must be <0>. MDIO addresses have no size component.
- interrupts: interrupt specifier (refer to the interrupt binding)
Typically an MDIO bus might have several children.
Example:
mdio@107009c {
#address-cells = <1>;
#size-cells = <0>;
compatible = "mscc,ocelot-miim";
reg = <0x107009c 0x36>, <0x10700f0 0x8>;
interrupts = <14>;
phy0: ethernet-phy@0 {
reg = <0>;
};
};

@ -0,0 +1,82 @@
Microsemi Ocelot network Switch
===============================
The Microsemi Ocelot network switch can be found on Microsemi SoCs (VSC7513,
VSC7514)
Required properties:
- compatible: Should be "mscc,vsc7514-switch"
- reg: Must contain an (offset, length) pair of the register set for each
entry in reg-names.
- reg-names: Must include the following entries:
- "sys"
- "rew"
- "qs"
- "hsio"
- "qsys"
- "ana"
- "portX" with X from 0 to the number of last port index available on that
switch
- interrupts: Should contain the switch interrupts for frame extraction and
frame injection
- interrupt-names: should contain the interrupt names: "xtr", "inj"
- ethernet-ports: A container for child nodes representing switch ports.
The ethernet-ports container has the following properties
Required properties:
- #address-cells: Must be 1
- #size-cells: Must be 0
Each port node must have the following mandatory properties:
- reg: Describes the port address in the switch
Port nodes may also contain the following optional standardised
properties, described in binding documents:
- phy-handle: Phandle to a PHY on an MDIO bus. See
Documentation/devicetree/bindings/net/ethernet.txt for details.
Example:
switch@1010000 {
compatible = "mscc,vsc7514-switch";
reg = <0x1010000 0x10000>,
<0x1030000 0x10000>,
<0x1080000 0x100>,
<0x10d0000 0x10000>,
<0x11e0000 0x100>,
<0x11f0000 0x100>,
<0x1200000 0x100>,
<0x1210000 0x100>,
<0x1220000 0x100>,
<0x1230000 0x100>,
<0x1240000 0x100>,
<0x1250000 0x100>,
<0x1260000 0x100>,
<0x1270000 0x100>,
<0x1280000 0x100>,
<0x1800000 0x80000>,
<0x1880000 0x10000>;
reg-names = "sys", "rew", "qs", "hsio", "port0",
"port1", "port2", "port3", "port4", "port5",
"port6", "port7", "port8", "port9", "port10",
"qsys", "ana";
interrupts = <21 22>;
interrupt-names = "xtr", "inj";
ethernet-ports {
#address-cells = <1>;
#size-cells = <0>;
port0: port@0 {
reg = <0>;
phy-handle = <&phy0>;
};
port1: port@1 {
reg = <1>;
phy-handle = <&phy1>;
};
};
};

@ -0,0 +1,30 @@
Qualcomm Bluetooth Chips
------------------------
This documents the binding structure and common properties for serial
attached Qualcomm devices.
Serial attached Qualcomm devices shall be a child node of the host UART
device the slave device is attached to.
Required properties:
- compatible: should contain one of the following:
* "qcom,qca6174-bt"
Optional properties:
- enable-gpios: gpio specifier used to enable chip
- clocks: clock provided to the controller (SUSCLK_32KHZ)
Example:
serial@7570000 {
label = "BT-UART";
status = "okay";
bluetooth {
compatible = "qcom,qca6174-bt";
enable-gpios = <&pm8994_gpios 19 GPIO_ACTIVE_HIGH>;
clocks = <&divclk4>;
};
};

@ -7,11 +7,11 @@ Required properties:
"sff,sfp" for SFP modules
"sff,sff" for soldered down SFF modules
Optional Properties:
- i2c-bus : phandle of an I2C bus controller for the SFP two wire serial
interface
Optional Properties:
- mod-def0-gpios : GPIO phandle and a specifier of the MOD-DEF0 (AKA Mod_ABS)
module presence input gpio signal, active (module absent) high. Must
not be present for SFF modules

@ -14,6 +14,7 @@ Required properties:
"renesas,ether-r8a7791" if the device is a part of R8A7791 SoC.
"renesas,ether-r8a7793" if the device is a part of R8A7793 SoC.
"renesas,ether-r8a7794" if the device is a part of R8A7794 SoC.
"renesas,gether-r8a77980" if the device is a part of R8A77980 SoC.
"renesas,ether-r7s72100" if the device is a part of R7S72100 SoC.
"renesas,rcar-gen1-ether" for a generic R-Car Gen1 device.
"renesas,rcar-gen2-ether" for a generic R-Car Gen2 or RZ/G1

@ -13,13 +13,25 @@ Required properties:
- reg: Address where registers are mapped and size of region.
- interrupts: Should contain the MAC interrupt.
- phy-mode: See ethernet.txt in the same directory. Allow to choose
"rgmii", "rmii", or "mii" according to the PHY.
"rgmii", "rmii", "mii", or "internal" according to the PHY.
The acceptable mode is SoC-dependent.
- phy-handle: Should point to the external phy device.
See ethernet.txt file in the same directory.
- clocks: A phandle to the clock for the MAC.
For Pro4 SoC, that is "socionext,uniphier-pro4-ave4",
another MAC clock, GIO bus clock and PHY clock are also required.
- clock-names: Should contain
- "ether", "ether-gb", "gio", "ether-phy" for Pro4 SoC
- "ether" for others
- resets: A phandle to the reset control for the MAC. For Pro4 SoC,
GIO bus reset is also required.
- reset-names: Should contain
- "ether", "gio" for Pro4 SoC
- "ether" for others
- socionext,syscon-phy-mode: A phandle to syscon with one argument
that configures phy mode. The argument is the ID of MAC instance.
Optional properties:
- resets: A phandle to the reset control for the MAC.
- local-mac-address: See ethernet.txt in the same directory.
Required subnode:
@ -34,8 +46,11 @@ Example:
interrupts = <0 66 4>;
phy-mode = "rgmii";
phy-handle = <&ethphy>;
clock-names = "ether";
clocks = <&sys_clk 6>;
reset-names = "ether";
resets = <&sys_rst 6>;
socionext,syscon-phy-mode = <&soc_glue 0>;
local-mac-address = [00 00 00 00 00 00];
mdio {

@ -6,14 +6,28 @@ Please see stmmac.txt for the other unchanged properties.
The device node has following properties.
Required properties:
- compatible: Should be "st,stm32-dwmac" to select glue, and
- compatible: For MCU family should be "st,stm32-dwmac" to select glue, and
"snps,dwmac-3.50a" to select IP version.
For MPU family should be "st,stm32mp1-dwmac" to select
glue, and "snps,dwmac-4.20a" to select IP version.
- clocks: Must contain a phandle for each entry in clock-names.
- clock-names: Should be "stmmaceth" for the host clock.
Should be "mac-clk-tx" for the MAC TX clock.
Should be "mac-clk-rx" for the MAC RX clock.
For MPU family, "ethstp" for the power mode clock and
"syscfg-clk" for the SYSCFG clock also need to be added.
- interrupt-names: Should contain a list of interrupt names corresponding to
the interrupts in the interrupts property, if available.
Should be "macirq" for the main MAC IRQ
Should be "eth_wake_irq" for the interrupt which wakes up the system
- st,syscon : Should be phandle/offset pair. The phandle to the syscon node which
encompasses the glue register, and the offset of the control register.
encompasses the glue register, and the offset of the control register.
Optional properties:
- clock-names: For MPU family "mac-clk-ck" for PHY without quartz
- st,int-phyclk (boolean) : valid only where the PHY does not have a quartz and needs to be
clocked by RCC
Example:
ethernet@40028000 {

@ -4,6 +4,7 @@ Required properties:
- compatible: Should be one of the following:
* "qcom,ath10k"
* "qcom,ipq4019-wifi"
* "qcom,wcn3990-wifi"
PCI based devices use compatible string "qcom,ath10k" and take calibration
data along with board specific data via "qcom,ath10k-calibration-data".
@ -18,8 +19,12 @@ In general, entry "qcom,ath10k-pre-calibration-data" and
"qcom,ath10k-calibration-data" conflict with each other and only one
can be provided per device.
SNOC based devices (i.e. wcn3990) use compatible string "qcom,wcn3990-wifi".
Optional properties:
- reg: Address and length of the register set for the device.
- reg-names: Must include the list of following reg names,
"membase"
- resets: Must contain an entry for each entry in reset-names.
See ../reset/reset.txt for details.
- reset-names: Must include the list of following reset names,
@ -49,6 +54,8 @@ Optional properties:
hw versions.
- qcom,ath10k-pre-calibration-data : pre calibration data as an array,
the length can vary between hw versions.
- <supply-name>-supply: handle to the regulator device tree node
optional "supply-name" is "vdd-0.8-cx-mx".
Example (to supply the calibration data alone):
@ -119,3 +126,27 @@ wifi0: wifi@a000000 {
qcom,msi_base = <0x40>;
qcom,ath10k-pre-calibration-data = [ 01 02 03 ... ];
};
Example (to supply wcn3990 SoC wifi block details):
wifi@18000000 {
compatible = "qcom,wcn3990-wifi";
reg = <0x18800000 0x800000>;
reg-names = "membase";
clocks = <&clock_gcc clk_aggre2_noc_clk>;
clock-names = "smmu_aggre2_noc_clk";
interrupts =
<0 130 0 /* CE0 */ >,
<0 131 0 /* CE1 */ >,
<0 132 0 /* CE2 */ >,
<0 133 0 /* CE3 */ >,
<0 134 0 /* CE4 */ >,
<0 135 0 /* CE5 */ >,
<0 136 0 /* CE6 */ >,
<0 137 0 /* CE7 */ >,
<0 138 0 /* CE8 */ >,
<0 139 0 /* CE9 */ >,
<0 140 0 /* CE10 */ >,
<0 141 0 /* CE11 */ >;
vdd-0.8-cx-mx-supply = <&pm8998_l5>;
};

@ -0,0 +1,69 @@
* Freescale QorIQ 1588 timer based PTP clock
General Properties:
- compatible Should be "fsl,etsec-ptp"
- reg Offset and length of the register set for the device
- interrupts There should be at least two interrupts. Some devices
have as many as four PTP related interrupts.
Clock Properties:
- fsl,cksel Timer reference clock source.
- fsl,tclk-period Timer reference clock period in nanoseconds.
- fsl,tmr-prsc Prescaler, divides the output clock.
- fsl,tmr-add Frequency compensation value.
- fsl,tmr-fiper1 Fixed interval period pulse generator.
- fsl,tmr-fiper2 Fixed interval period pulse generator.
- fsl,max-adj Maximum frequency adjustment in parts per billion.
These properties set the operational parameters for the PTP
clock. You must choose these carefully for the clock to work right.
Here is how to figure good values:
TimerOsc = selected reference clock MHz
tclk_period = desired clock period nanoseconds
NominalFreq = 1000 / tclk_period MHz
FreqDivRatio = TimerOsc / NominalFreq (must be greater than 1.0)
tmr_add = ceil(2^32 / FreqDivRatio)
OutputClock = NominalFreq / tmr_prsc MHz
PulseWidth = 1 / OutputClock microseconds
FiperFreq1 = desired frequency in Hz
FiperDiv1 = 1000000 * OutputClock / FiperFreq1
tmr_fiper1 = tmr_prsc * tclk_period * FiperDiv1 - tclk_period
max_adj = 1000000000 * (FreqDivRatio - 1.0) - 1
The calculation for tmr_fiper2 is the same as for tmr_fiper1. The
driver expects that tmr_fiper1 will be correctly set to produce a 1
Pulse Per Second (PPS) signal, since this will be offered to the PPS
subsystem to synchronize the Linux clock.
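The formulas above can be evaluated mechanically. The following Python
sketch follows them term by term using exact fractions; the function name
is hypothetical, and the 166 MHz reference clock used in the sample call is
an assumed input rather than a value taken from this binding.

```python
from fractions import Fraction
import math


def ptp_params(timer_osc_mhz, tclk_period_ns, tmr_prsc, fiper_freq_hz):
    """Evaluate the PTP clock formulas for one fixed interval pulse."""
    nominal_freq = Fraction(1000, tclk_period_ns)            # MHz
    freq_div_ratio = Fraction(timer_osc_mhz) / nominal_freq
    if freq_div_ratio <= 1:
        raise ValueError("FreqDivRatio must be greater than 1.0")
    tmr_add = math.ceil(Fraction(2 ** 32) / freq_div_ratio)
    output_clock = nominal_freq / tmr_prsc                    # MHz
    fiper_div = 1000000 * output_clock / fiper_freq_hz
    tmr_fiper = tmr_prsc * tclk_period_ns * fiper_div - tclk_period_ns
    max_adj = 1000000000 * (freq_div_ratio - 1) - 1
    return {
        "tmr_add": tmr_add,
        # int() truncates if the inputs do not divide evenly.
        "tmr_fiper": int(tmr_fiper),
        "max_adj": int(max_adj),
    }


# Assumed inputs: 166 MHz reference clock, 10 ns period, prescaler 100,
# and a 1 Hz (PPS) pulse for tmr_fiper1.
params = ptp_params(166, 10, 100, 1)
print(hex(params["tmr_fiper"]))
```

With a 10 ns period, a prescaler of 100 and a 1 Hz pulse, tmr_fiper
evaluates to 999999990 (0x3B9AC9F6). The results for tmr_add and max_adj
depend on the reference clock actually selected on the board.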
The reference clock source is determined by the value held in the
CKSEL bits of the TMR_CTRL register. The "fsl,cksel" property holds
the value that is written directly into those bits; according to the
reference manual, the following clock sources can be used:
<0> - external high precision timer reference clock (TSEC_TMR_CLK
input is used for this purpose);
<1> - eTSEC system clock;
<2> - eTSEC1 transmit clock;
<3> - RTC clock input.
When this attribute is not used, eTSEC system clock will serve as
IEEE 1588 timer reference clock.
Example:
ptp_clock@24e00 {
compatible = "fsl,etsec-ptp";
reg = <0x24E00 0xB0>;
interrupts = <12 0x8 13 0x8>;
interrupt-parent = < &ipic >;
fsl,cksel = <1>;
fsl,tclk-period = <10>;
fsl,tmr-prsc = <100>;
fsl,tmr-add = <0x999999A4>;
fsl,tmr-fiper1 = <0x3B9AC9F6>;
fsl,tmr-fiper2 = <0x00018696>;
fsl,max-adj = <659999998>;
};

@ -17,7 +17,8 @@ pool management.
Required properties:
- compatible : Must be "ti,keystone-navigator-qmss";
- compatible : Must be "ti,keystone-navigator-qmss".
: Must be "ti,66ak2g-navss-qm" for QMSS on K2G SoC.
- clocks : phandle to the reference clock for this device.
- queue-range : <start number> total range of queue numbers for the device.
- linkram0 : <address size> for internal link ram, where size is the total
@ -39,6 +40,12 @@ Required properties:
- Descriptor memory setup region.
- Queue Management/Queue Proxy region for queue Push.
- Queue Management/Queue Proxy region for queue Pop.
For QMSS on K2G SoC, following QM reg indexes are used in that order
- Queue Peek region.
- Queue configuration region.
- Queue Management/Queue Proxy region for queue Push/Pop.
- queue-pools : child node classifying the queue ranges into pools.
Queue ranges are grouped into 3 types of pools:
- qpend : pool of qpend(interruptible) queues

@ -5,6 +5,7 @@ Written 1996 by Gero Kuhlmann <gero@gkminix.han.de>
Updated 1997 by Martin Mares <mj@atrey.karlin.mff.cuni.cz>
Updated 2006 by Nico Schottelius <nico-kernel-nfsroot@schottelius.org>
Updated 2006 by Horms <horms@verge.net.au>
Updated 2018 by Chris Novakovic <chris@chrisn.me.uk>
@ -79,7 +80,7 @@ nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:
<dns0-ip>:<dns1-ip>
<dns0-ip>:<dns1-ip>:<ntp0-ip>
This parameter tells the kernel how to configure IP addresses of devices
and also how to set up the IP routing table. It was originally called
@ -110,6 +111,9 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:
will not be triggered if it is missing and NFS root is not
in operation.
Value is exported to /proc/net/pnp with the prefix "bootserver "
(see below).
Default: Determined using autoconfiguration.
The address of the autoconfiguration server is used.
@ -123,10 +127,13 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:
Default: Determined using autoconfiguration.
<hostname> Name of the client. May be supplied by autoconfiguration,
but its absence will not trigger autoconfiguration.
If specified and DHCP is used, the user provided hostname will
be carried in the DHCP request to hopefully update DNS record.
<hostname> Name of the client. If a '.' character is present, anything
before the first '.' is used as the client's hostname, and anything
after it is used as its NIS domain name. May be supplied by
autoconfiguration, but its absence will not trigger autoconfiguration.
If specified and DHCP is used, the user-provided hostname (and NIS
domain name, if present) will be carried in the DHCP request; this
may cause a DNS record to be created or updated for the client.
Default: Client IP address is used in ASCII notation.
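The splitting rule for the <hostname> field can be illustrated as follows.
This is a minimal Python sketch of the rule as described above, not the
kernel's implementation; the function name is hypothetical.

```python
def split_hostname(field):
    """Split the <hostname> field of "ip=" at the first '.' character.

    Returns (hostname, nis_domain); nis_domain is None when the field
    contains no '.'.
    """
    hostname, sep, nis_domain = field.partition(".")
    return hostname, (nis_domain if sep else None)


print(split_hostname("client.example.org"))
```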
@ -162,12 +169,55 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:
Default: any
<dns0-ip> IP address of first nameserver.
Value gets exported by /proc/net/pnp which is often linked
on embedded systems by /etc/resolv.conf.
<dns0-ip> IP address of primary nameserver.
Value is exported to /proc/net/pnp with the prefix "nameserver "
(see below).
<dns1-ip> IP address of second nameserver.
Same as above.
Default: None if not using autoconfiguration; determined
automatically if using autoconfiguration.
<dns1-ip> IP address of secondary nameserver.
See <dns0-ip>.
<ntp0-ip> IP address of a Network Time Protocol (NTP) server.
Value is exported to /proc/net/ipconfig/ntp_servers, but is
otherwise unused (see below).
Default: None if not using autoconfiguration; determined
automatically if using autoconfiguration.
After configuration (whether manual or automatic) is complete, two files
are created in the following format; lines are omitted if their respective
value is empty following configuration:
- /proc/net/pnp:
#PROTO: <DHCP|BOOTP|RARP|MANUAL> (depending on configuration method)
domain <dns-domain> (if autoconfigured, the DNS domain)
nameserver <dns0-ip> (primary name server IP)
nameserver <dns1-ip> (secondary name server IP)
nameserver <dns2-ip> (tertiary name server IP)
bootserver <server-ip> (NFS server IP)
- /proc/net/ipconfig/ntp_servers:
<ntp0-ip> (NTP server IP)
<ntp1-ip> (NTP server IP)
<ntp2-ip> (NTP server IP)
<dns-domain> and <dns2-ip> (in /proc/net/pnp) and <ntp1-ip> and <ntp2-ip>
(in /proc/net/ipconfig/ntp_servers) are requested during autoconfiguration;
they cannot be specified as part of the "ip=" kernel command line parameter.
Because the "domain" and "nameserver" options are recognised by DNS
resolvers, /etc/resolv.conf is often linked to /proc/net/pnp on systems
that use an NFS root filesystem.
Note that the kernel will not synchronise the system time with any NTP
servers it discovers; this is the responsibility of a user space process
(e.g. an initrd/initramfs script that passes the IP addresses listed in
/proc/net/ipconfig/ntp_servers to an NTP client before mounting the real
root filesystem if it is on NFS).
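A user space script can read the two files in the format described above.
The following Python sketch parses text in that format; the function names
are hypothetical and the sample addresses are documentation placeholders,
not values produced by any real configuration.

```python
def parse_pnp(text):
    """Parse the contents of /proc/net/pnp into a dictionary."""
    info = {"proto": None, "domain": None,
            "nameservers": [], "bootserver": None}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#PROTO:"):
            info["proto"] = line[len("#PROTO:"):].strip()
            continue
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        value = value.strip()
        if key == "nameserver":
            info["nameservers"].append(value)
        elif key in ("domain", "bootserver"):
            info[key] = value
    return info


def parse_ntp_servers(text):
    """Parse /proc/net/ipconfig/ntp_servers: one address per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Placeholder sample contents (assumed, for illustration only).
PNP_SAMPLE = """\
#PROTO: DHCP
domain example.org
nameserver 192.0.2.53
nameserver 192.0.2.54
bootserver 192.0.2.1
"""

print(parse_pnp(PNP_SAMPLE))
print(parse_ntp_servers("192.0.2.123\n192.0.2.124\n"))
```

Lines that the kernel omitted because their value was empty simply leave
the corresponding entry at its initial value.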
nfsrootdebug

@ -24,10 +24,10 @@ enum lowpan_lltypes.
To evaluate the private data, you can usually do:
static inline sturct lowpan_priv_foobar *
static inline struct lowpan_priv_foobar *
lowpan_foobar_priv(struct net_device *dev)
{
return (sturct lowpan_priv_foobar *)lowpan_priv(dev)->priv;
return (struct lowpan_priv_foobar *)lowpan_priv(dev)->priv;
}
switch (dev->type) {

@ -0,0 +1,312 @@
.. SPDX-License-Identifier: GPL-2.0
======
AF_XDP
======
Overview
========
AF_XDP is an address family that is optimized for high performance
packet processing.
This document assumes that the reader is familiar with BPF and XDP. If
not, the Cilium project has an excellent reference guide at
http://cilium.readthedocs.io/en/latest/bpf/.
Using the XDP_REDIRECT action from an XDP program, the program can
redirect ingress frames to other XDP enabled netdevs, using the
bpf_redirect_map() function. AF_XDP sockets enable the possibility for
XDP programs to redirect frames to a memory buffer in a user-space
application.
An AF_XDP socket (XSK) is created with the normal socket()
syscall. Associated with each XSK are two rings: the RX ring and the
TX ring. A socket can receive packets on the RX ring and it can send
packets on the TX ring. These rings are registered and sized with the
setsockopts XDP_RX_RING and XDP_TX_RING, respectively. It is mandatory
to have at least one of these rings for each socket. An RX or TX
descriptor ring points to a data buffer in a memory area called a
UMEM. RX and TX can share the same UMEM so that a packet does not have
to be copied between RX and TX. Moreover, if a packet needs to be kept
for a while due to a possible retransmit, the descriptor that points
to that packet can be changed to point to another and reused right
away. This again avoids copying data.
The UMEM consists of a number of equally sized chunks. A descriptor in
one of the rings references a frame by referencing its addr. The addr
is simply an offset within the entire UMEM region. The user space
allocates memory for this UMEM using whatever means it feels is most
appropriate (malloc, mmap, huge pages, etc). This memory area is then
registered with the kernel using the new setsockopt XDP_UMEM_REG. The
UMEM also has two rings: the FILL ring and the COMPLETION ring. The
fill ring is used by the application to send down addr for the kernel
to fill in with RX packet data. References to these frames will then
appear in the RX ring once each packet has been received. The
completion ring, on the other hand, contains frame addr that the
kernel has transmitted completely and can now be used again by user
space, for either TX or RX. Thus, the frame addrs appearing in the
completion ring are addrs that were previously transmitted using the
TX ring. In summary, the RX and FILL rings are used for the RX path
and the TX and COMPLETION rings are used for the TX path.
The socket is then finally bound with a bind() call to a device and a
specific queue id on that device, and it is not until bind is
completed that traffic starts to flow.
The UMEM can be shared between processes, if desired. If a process
wants to do this, it simply skips the registration of the UMEM and its
corresponding two rings, sets the XDP_SHARED_UMEM flag in the bind
call and submits the XSK of the process it would like to share UMEM
with as well as its own newly created XSK socket. The new process will
then receive frame addr references in its own RX ring that point to
this shared UMEM. Note that since the ring structures are
single-consumer / single-producer (for performance reasons), the new
process has to create its own socket with associated RX and TX rings,
since it cannot share this with the other process. This is also the
reason that there is only one set of FILL and COMPLETION rings per
UMEM. It is the responsibility of a single process to handle the UMEM.
How are packets then distributed from an XDP program to the XSKs? There
is a BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in full). The
user-space application can place an XSK at an arbitrary place in this
map. The XDP program can then redirect a packet to a specific index in
this map and at this point XDP validates that the XSK in that map was
indeed bound to that device and ring number. If not, the packet is
dropped. If the map is empty at that index, the packet is also
dropped. This also means that it is currently mandatory to have an XDP
program loaded (and one XSK in the XSKMAP) to be able to get any
traffic to user space through the XSK.
AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If the
driver does not have support for XDP, or XDP_SKB is explicitly chosen
when loading the XDP program, XDP_SKB mode is employed. It uses SKBs
together with the generic XDP support and copies the data out to user
space, and it serves as a fallback mode that works for any network
device. On the other hand, if the driver has support for XDP, it will
be used by the AF_XDP code to provide better performance, but there is
still a copy of the data into user space.
Concepts
========
In order to use an AF_XDP socket, a number of associated objects need
to be set up.
Jonathan Corbet has also written an excellent article on LWN,
"Accelerating networking with AF_XDP". It can be found at
https://lwn.net/Articles/750845/.
UMEM
----
UMEM is a region of virtually contiguous memory, divided into
equal-sized frames. A UMEM is associated with a netdev and a specific
queue id of that netdev. It is created and configured (chunk size,
headroom, start address and size) by using the XDP_UMEM_REG setsockopt
system call. A UMEM is bound to a netdev and queue id via the bind()
system call.
An AF_XDP socket is linked to a single UMEM, but one UMEM can have
multiple AF_XDP sockets. To share a UMEM created via one socket A,
the next socket B sets the XDP_SHARED_UMEM flag in struct sockaddr_xdp
member sxdp_flags and passes the file descriptor of A in struct
sockaddr_xdp member sxdp_shared_umem_fd.
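The sharing step can be sketched in C as below. The struct, its members and the XDP_SHARED_UMEM flag come from the linux/if_xdp.h header introduced with AF_XDP; the helper name xsk_shared_addr() and the fallback definition of AF_XDP for older C libraries are assumptions of this example.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/if_xdp.h>

#ifndef AF_XDP
#define AF_XDP 44 /* kernel value, assumed here for C libraries that lack it */
#endif

/*
 * Hypothetical helper: prepare the bind() address for socket B so that it
 * reuses the UMEM already registered on socket A (umem_owner_fd).
 */
void xsk_shared_addr(struct sockaddr_xdp *sxdp, unsigned int ifindex,
                     unsigned int queue_id, int umem_owner_fd)
{
    memset(sxdp, 0, sizeof(*sxdp));
    sxdp->sxdp_family = AF_XDP;
    sxdp->sxdp_ifindex = ifindex;             /* netdev to bind to */
    sxdp->sxdp_queue_id = queue_id;           /* queue on that netdev */
    sxdp->sxdp_flags = XDP_SHARED_UMEM;       /* skip UMEM registration */
    sxdp->sxdp_shared_umem_fd = umem_owner_fd; /* socket A owns the UMEM */
}
```

Socket B would then pass the filled structure to bind(), cast to struct sockaddr *, after creating its own RX and/or TX rings.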
The UMEM has two single-producer/single-consumer rings that are used
to transfer ownership of UMEM frames between the kernel and the
user-space application.
Rings
-----
There are four different kinds of rings: Fill, Completion, RX and
TX. All rings are single-producer/single-consumer, so the user-space
application needs explicit synchronization if multiple
processes/threads are reading/writing to them.
The UMEM uses two rings: Fill and Completion. Each socket associated
with the UMEM must have an RX queue, TX queue or both. Suppose there
is a setup with four sockets (all doing TX and RX). Then there will be
one Fill ring, one Completion ring, four TX rings and four RX rings.
The rings are head(producer)/tail(consumer) based rings. A producer
writes the data ring at the index pointed out by struct xdp_ring
producer member, and then increases the producer index. A consumer reads
the data ring at the index pointed out by struct xdp_ring consumer
member, and then increases the consumer index.
The rings are configured and created via the _RING setsockopt system
calls and mmapped to user-space using the appropriate offset to mmap()
(XDP_PGOFF_RX_RING, XDP_PGOFF_TX_RING, XDP_UMEM_PGOFF_FILL_RING and
XDP_UMEM_PGOFF_COMPLETION_RING).
The size of each ring must be a power of two.
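The power-of-two requirement is what lets the free-running producer and consumer indices be turned into a slot number with a single mask. A minimal sketch of the check and of the masking follows; the function names are made up for this example and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if entries is a non-zero power of two (exactly one bit set). */
bool ring_size_valid(uint32_t entries)
{
    return entries != 0 && (entries & (entries - 1)) == 0;
}

/*
 * Map a free-running index onto a slot. This only works when entries is a
 * power of two, because then (entries - 1) is a mask of the low bits.
 */
uint32_t ring_slot(uint32_t index, uint32_t entries)
{
    return index & (entries - 1);
}
```

Because the indices are unsigned and only ever masked, they may wrap around 2^32 without any special handling.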
UMEM Fill Ring
~~~~~~~~~~~~~~
The Fill ring is used to transfer ownership of UMEM frames from
user-space to kernel-space. The UMEM addrs are passed in the ring. As
an example, if the UMEM is 64k and each chunk is 4k, then the UMEM has
16 chunks and can pass addrs between 0 and 64k.
Frames passed to the kernel are used for the ingress path (RX rings).
The user application produces UMEM addrs to this ring. Note that the
kernel will mask the incoming addr. E.g. for a chunk size of 2k, the
log2(2048) LSB of the addr will be masked off, meaning that 2048, 2050
and 3000 refer to the same chunk.
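The arithmetic in the two examples above can be checked with a few lines of C. This is a sketch of the masking as described in the text, assuming the chunk size is a power of two; the helper names are hypothetical.

```c
#include <stdint.h>

/* Number of chunks in a UMEM of umem_size bytes. */
uint64_t umem_chunk_count(uint64_t umem_size, uint64_t chunk_size)
{
    return umem_size / chunk_size;
}

/*
 * Clear the low log2(chunk_size) bits of addr, giving the start of the
 * chunk that addr falls into. chunk_size must be a power of two.
 */
uint64_t umem_chunk_base(uint64_t addr, uint64_t chunk_size)
{
    return addr & ~(chunk_size - 1);
}
```

With a 64k UMEM and 4k chunks the count is 16, and with 2k chunks the addrs 2048, 2050 and 3000 all map to the chunk starting at 2048.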
UMEM Completetion Ring
~~~~~~~~~~~~~~~~~~~~~~
The Completion Ring is used to transfer ownership of UMEM frames from
kernel-space to user-space. Just like the Fill ring, UMEM addrs are
used.
Frames passed from the kernel to user-space are frames that have been
sent (TX ring) and can be used by user-space again.
The user application consumes UMEM addrs from this ring.
RX Ring
~~~~~~~
The RX ring is the receiving side of a socket. Each entry in the ring
is a struct xdp_desc descriptor. The descriptor contains the UMEM offset
(addr) and the length of the data (len).
If no frames have been passed to kernel via the Fill ring, no
descriptors will (or can) appear on the RX ring.
The user application consumes struct xdp_desc descriptors from this
ring.
TX Ring
~~~~~~~
The TX ring is used to send frames. The struct xdp_desc descriptor is
filled (index, length and offset) and passed into the ring.
To start the transfer a sendmsg() system call is required. This might
be relaxed in the future.
The user application produces struct xdp_desc descriptors to this
ring.
XSKMAP / BPF_MAP_TYPE_XSKMAP
----------------------------
On the XDP side there is a BPF map type BPF_MAP_TYPE_XSKMAP (XSKMAP) that
is used in conjunction with bpf_redirect_map() to pass the ingress
frame to a socket.
The user application inserts the socket into the map, via the bpf()
system call.
Note that if an XDP program tries to redirect to a socket that does
not match the queue configuration and netdev, the frame will be
dropped. E.g. an AF_XDP socket is bound to netdev eth0 and
queue 17. Only the XDP program executing for eth0 and queue 17 will
successfully pass data to the socket. Please refer to the sample
application (in samples/bpf/) for an example.
Usage
=====
In order to use AF_XDP sockets, two parts are needed: the
user-space application and the XDP program. For a complete setup and
usage example, please refer to the sample application. The user-space
side is xdpsock_user.c and the XDP side is xdpsock_kern.c.
Naive ring dequeue and enqueue could look like this::

    // struct xdp_rxtx_ring {
    //     __u32 *producer;
    //     __u32 *consumer;
    //     struct xdp_desc *desc;
    // };

    // struct xdp_umem_ring {
    //     __u32 *producer;
    //     __u32 *consumer;
    //     __u64 *desc;
    // };

    // Pick one pair of typedefs, depending on the kind of ring:
    //
    // typedef struct xdp_rxtx_ring RING;    (RX and TX rings)
    // typedef struct xdp_desc RING_TYPE;
    //
    // typedef struct xdp_umem_ring RING;    (Fill and Completion rings)
    // typedef __u64 RING_TYPE;
    //
    // RING_SIZE is the number of entries in the ring (a power of two).

    int dequeue_one(RING *ring, RING_TYPE *item)
    {
        __u32 entries = *ring->producer - *ring->consumer;

        if (entries == 0)
            return -1;

        // read-barrier!

        *item = ring->desc[*ring->consumer & (RING_SIZE - 1)];
        (*ring->consumer)++;
        return 0;
    }

    int enqueue_one(RING *ring, const RING_TYPE *item)
    {
        __u32 free_entries = RING_SIZE - (*ring->producer - *ring->consumer);

        if (free_entries == 0)
            return -1;

        ring->desc[*ring->producer & (RING_SIZE - 1)] = *item;

        // write-barrier!

        (*ring->producer)++;
        return 0;
    }

For a more optimized version, please refer to the sample application.
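For readers who want to compile and step through the idea, here is a self-contained model of the same single-producer/single-consumer scheme. It is only an in-process sketch: the ring lives in ordinary memory rather than in an mmap()ed kernel ring, the names demo_ring, demo_enqueue() and demo_dequeue() are invented for this example, and C11 atomics stand in for the read and write barriers mentioned in the comments above.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 4u /* must be a power of two; tiny on purpose */

struct demo_ring {
    _Atomic uint32_t producer; /* free-running, written by the producer */
    _Atomic uint32_t consumer; /* free-running, written by the consumer */
    uint64_t desc[RING_SIZE];  /* e.g. UMEM addrs, as in the Fill ring */
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
int demo_enqueue(struct demo_ring *r, uint64_t item)
{
    uint32_t prod = atomic_load_explicit(&r->producer, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->consumer, memory_order_acquire);

    if (prod - cons == RING_SIZE)
        return -1;
    r->desc[prod & (RING_SIZE - 1)] = item;
    /* Release: the entry must be visible before the new producer index. */
    atomic_store_explicit(&r->producer, prod + 1, memory_order_release);
    return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
int demo_dequeue(struct demo_ring *r, uint64_t *item)
{
    uint32_t cons = atomic_load_explicit(&r->consumer, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->producer, memory_order_acquire);

    if (prod == cons)
        return -1;
    *item = r->desc[cons & (RING_SIZE - 1)];
    /* Release: the entry must be read before the slot is handed back. */
    atomic_store_explicit(&r->consumer, cons + 1, memory_order_release);
    return 0;
}
```

The acquire/release pairing plays the role of the barriers: a consumer that observes the incremented producer index is guaranteed to also observe the entry written before it.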
Sample application
==================
There is an xdpsock benchmarking/test application included that
demonstrates how to use AF_XDP sockets with both private and shared
UMEMs. Say that you would like your UDP traffic from port 4242 to end
up in queue 16, on which we will enable AF_XDP. Here, we use ethtool
for this::
ethtool -N p3p2 rx-flow-hash udp4 fn
ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
action 16
Running the rxdrop benchmark in XDP_DRV mode can then be done
using::
samples/bpf/xdpsock -i p3p2 -q 16 -r -N
For XDP_SKB mode, use the switch "-S" instead of "-N" and all options
can be displayed with "-h", as usual.
Credits
=======
- Björn Töpel (AF_XDP core)
- Magnus Karlsson (AF_XDP core)
- Alexander Duyck
- Alexei Starovoitov
- Daniel Borkmann
- Jesper Dangaard Brouer
- John Fastabend
- Jonathan Corbet (LWN coverage)
- Michael S. Tsirkin
- Qi Z Zhang
- Willem de Bruijn

@ -140,7 +140,7 @@ bonding module at load time, or are specified via sysfs.
Module options may be given as command line arguments to the
insmod or modprobe command, but are usually specified in either the
/etc/modrobe.d/*.conf configuration files, or in a distro-specific
/etc/modprobe.d/*.conf configuration files, or in a distro-specific
configuration file (some of which are detailed in the next section).
Details on bonding support for sysfs is provided in the

@ -1,7 +1,7 @@
Linux* Base Driver for the Intel(R) PRO/100 Family of Adapters
==============================================================
March 15, 2011
June 1, 2018
Contents
========
@ -36,16 +36,9 @@ Channel Bonding documentation can be found in the Linux kernel source:
Identifying Your Adapter
========================
For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:
http://support.intel.com/support/network/adapter/pro100/21397.htm
For the latest Intel network drivers for Linux, refer to the following
website. In the search field, enter your adapter name or type, or use the
networking link on the left to search for your adapter:
http://downloadfinder.intel.com/scripts-df/support_intel.asp
For information on how to identify your adapter, and for the latest Intel
network drivers, refer to the Intel Support website:
http://www.intel.com/support
Driver Configuration Parameters
===============================
@ -57,22 +50,26 @@ Rx Descriptors: Number of receive descriptors. A receive descriptor is a data
structure that describes a receive buffer and its attributes to the network
controller. The data in the descriptor is used by the controller to write
data from the controller to host memory. In the 3.x.x driver the valid range
for this parameter is 64-256. The default value is 64. This parameter can be
changed using the command:
for this parameter is 64-256. The default value is 256. This parameter can be
changed using the command::
ethtool -G eth? rx n, where n is the number of desired rx descriptors.
ethtool -G eth? rx n
Where n is the number of desired Rx descriptors.
Tx Descriptors: Number of transmit descriptors. A transmit descriptor is a data
structure that describes a transmit buffer and its attributes to the network
controller. The data in the descriptor is used by the controller to read
data from the host memory to the controller. In the 3.x.x driver the valid
range for this parameter is 64-256. The default value is 64. This parameter
can be changed using the command:
range for this parameter is 64-256. The default value is 128. This parameter
can be changed using the command::
ethtool -G eth? tx n, where n is the number of desired tx descriptors.
ethtool -G eth? tx n
Where n is the number of desired Tx descriptors.
Speed/Duplex: The driver auto-negotiates the link speed and duplex settings by
default. The ethtool utility can be used as follows to force speed/duplex.
default. The ethtool utility can be used as follows to force speed/duplex::
ethtool -s eth? autoneg off speed {10|100} duplex {full|half}
@ -81,7 +78,7 @@ Speed/Duplex: The driver auto-negotiates the link speed and duplex settings by
Event Log Message Level: The driver uses the message level flag to log events
to syslog. The message level can be set at driver load time. It can also be
set using the command:
set using the command::
ethtool -s eth? msglvl n
@ -112,9 +109,9 @@ Additional Configurations
---------------------
In order to see link messages and other Intel driver information on your
console, you must set the dmesg level up to six. This can be done by
entering the following on the command line before loading the e100 driver:
entering the following on the command line before loading the e100 driver::
dmesg -n 8
dmesg -n 6
If you wish to see all messages issued by the driver, including debug
messages, set the dmesg level to eight.
@ -146,7 +143,8 @@ Additional Configurations
NAPI (Rx polling mode) is supported in the e100 driver.
See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
See https://wiki.linuxfoundation.org/networking/napi for more information
on NAPI.
Multiple Interfaces on Same Ethernet Broadcast Network
------------------------------------------------------
@ -160,7 +158,7 @@ Additional Configurations
If you have multiple interfaces in a server, either turn on ARP
filtering by
(1) entering: echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
(1) entering:: echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
(this only works if your kernel's version is higher than 2.4.5), or
(2) installing the interfaces in separate broadcast domains (either
@ -169,15 +167,11 @@ Additional Configurations
Support
=======
For general information, go to the Intel support website at:
http://www.intel.com/support/
http://support.intel.com
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related to the
issue to e1000-devel@lists.sourceforge.net.
or the Intel Wired Networking project hosted by Sourceforge at:
http://sourceforge.net/projects/e1000
If an issue is identified with the released source code on a supported kernel
with a supported adapter, email the specific information related to the issue
to e1000-devel@lists.sf.net.

@ -154,7 +154,7 @@ NOTE: When e1000 is loaded with default settings and multiple adapters
are in use simultaneously, the CPU utilization may increase non-
linearly. In order to limit the CPU utilization without impacting
the overall throughput, we recommend that you load the driver as
follows:
follows::
modprobe e1000 InterruptThrottleRate=3000,3000,3000
@ -167,8 +167,8 @@ NOTE: When e1000 is loaded with default settings and multiple adapters
RxDescriptors
-------------
Valid Range: 80-256 for 82542 and 82543-based adapters
80-4096 for all other supported adapters
Valid Range: 48-256 for 82542 and 82543-based adapters
48-4096 for all other supported adapters
Default Value: 256
This value specifies the number of receive buffer descriptors allocated
@ -230,8 +230,8 @@ speed. Duplex should also be set when Speed is set to either 10 or 100.
TxDescriptors
-------------
Valid Range: 80-256 for 82542 and 82543-based adapters
80-4096 for all other supported adapters
Valid Range: 48-256 for 82542 and 82543-based adapters
48-4096 for all other supported adapters
Default Value: 256
This value is the number of transmit descriptors allocated by the driver.
@ -242,41 +242,10 @@ NOTE: Depending on the available system resources, the request for a
higher number of transmit descriptors may be denied. In this case,
use a lower number.
TxDescriptorStep
----------------
Valid Range: 1 (use every Tx Descriptor)
4 (use every 4th Tx Descriptor)
Default Value: 1 (use every Tx Descriptor)
On certain non-Intel architectures, it has been observed that intense TX
traffic bursts of short packets may result in an improper descriptor
writeback. If this occurs, the driver will report a "TX Timeout" and reset
the adapter, after which the transmit flow will restart, though data may
have stalled for as much as 10 seconds before it resumes.
The improper writeback does not occur on the first descriptor in a system
memory cache-line, which is typically 32 bytes, or 4 descriptors long.
Setting TxDescriptorStep to a value of 4 will ensure that all TX descriptors
are aligned to the start of a system memory cache line, and so this problem
will not occur.
NOTES: Setting TxDescriptorStep to 4 effectively reduces the number of
TxDescriptors available for transmits to 1/4 of the normal allocation.
This has a possible negative performance impact, which may be
compensated for by allocating more descriptors using the TxDescriptors
module parameter.
There are other conditions which may result in "TX Timeout", which will
not be resolved by the use of the TxDescriptorStep parameter. As the
issue addressed by this parameter has never been observed on Intel
Architecture platforms, it should not be used on Intel platforms.
TxIntDelay
----------
Valid Range: 0-65535 (0=off)
Default Value: 64
Default Value: 8
This value delays the generation of transmit interrupts in units of
1.024 microseconds. Transmit interrupt reduction can improve CPU
@ -288,7 +257,7 @@ TxAbsIntDelay
-------------
(This parameter is supported only on 82540, 82545 and later adapters.)
Valid Range: 0-65535 (0=off)
Default Value: 64
Default Value: 32
This value, in units of 1.024 microseconds, limits the delay in which a
transmit interrupt is generated. Useful only if TxIntDelay is non-zero,
@ -310,7 +279,7 @@ Copybreak
---------
Valid Range: 0-xxxxxxx (0=off)
Default Value: 256
Usage: insmod e1000.ko copybreak=128
Usage: modprobe e1000.ko copybreak=128
Driver copies all packets below or equaling this size to a fresh RX
buffer before handing it up the stack.
@ -328,14 +297,6 @@ Default Value: 0 (disabled)
Allows PHY to turn off in lower power states. The user can turn off
this parameter in supported chipsets.
KumeranLockLoss
---------------
Valid Range: 0-1
Default Value: 1 (enabled)
This workaround skips resetting the PHY at shutdown for the initial
silicon releases of ICH8 systems.
Speed and Duplex Configuration
==============================
@ -397,12 +358,12 @@ Additional Configurations
------------
Jumbo Frames support is enabled by changing the MTU to a value larger than
the default of 1500. Use the ifconfig command to increase the MTU size.
For example:
For example::
ifconfig eth<x> mtu 9000 up
This setting is not saved across reboots. It can be made permanent if
you add:
you add::
MTU=9000

@ -0,0 +1,18 @@
.. SPDX-License-Identifier: GPL-2.0
========
FAILOVER
========
Overview
========
The failover module provides a generic interface for paravirtual drivers
to register a netdev and a set of ops with a failover instance. The ops
are used as event handlers that get called to handle netdev register/
unregister/link change/name change events on slave pci ethernet devices
with the same mac address as the failover netdev.
This enables paravirtual drivers to use a VF as an accelerated low latency
datapath. It also allows live migration of VMs with direct attached VFs by
failing over to the paravirtual datapath when the VF is unplugged.

@ -483,6 +483,12 @@ Example output from dmesg:
[ 3389.935851] JIT code: 00000030: 00 e8 28 94 ff e0 83 f8 01 75 07 b8 ff ff 00 00
[ 3389.935852] JIT code: 00000040: eb 02 31 c0 c9 c3
When CONFIG_BPF_JIT_ALWAYS_ON is enabled, bpf_jit_enable is permanently set to 1 and
setting any other value than that will result in failure. This is even the case for
setting bpf_jit_enable to 2, since dumping the final JIT image into the kernel log
is discouraged and introspection through bpftool (under tools/bpf/bpftool/) is the
generally recommended approach instead.
In the kernel source tree under tools/bpf/, there's bpf_jit_disasm for
generating disassembly out of the kernel log's hexdump:
@ -1136,6 +1142,7 @@ into a register from memory, the register's top 56 bits are known zero, while
the low 8 are unknown - which is represented as the tnum (0x0; 0xff). If we
then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
0x1ff), because of potential carries.
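The tnum example above can be reproduced with a small stand-alone program. The two operations below are modelled on the tracked-number helpers in kernel/bpf/tnum.c, but this is an illustrative sketch written for this text rather than a copy of the verifier code; the struct layout (value, mask) follows the notation used above.

```c
#include <stdint.h>

/* (value; mask): mask bits are unknown, the remaining bits equal value. */
struct tnum {
    uint64_t value;
    uint64_t mask;
};

/* A fully known constant has no unknown bits. */
struct tnum tnum_const(uint64_t v)
{
    struct tnum t = { v, 0 };
    return t;
}

/* OR: any bit known to be 1 in either operand is known 1 in the result. */
struct tnum tnum_or(struct tnum a, struct tnum b)
{
    uint64_t v = a.value | b.value;
    uint64_t mu = a.mask | b.mask;
    struct tnum t = { v, mu & ~v };
    return t;
}

/* ADD: unknown bits spread upwards wherever a carry might propagate. */
struct tnum tnum_add(struct tnum a, struct tnum b)
{
    uint64_t sm = a.mask + b.mask;
    uint64_t sv = a.value + b.value;
    uint64_t sigma = sm + sv;
    uint64_t chi = sigma ^ sv;            /* bits affected by possible carries */
    uint64_t mu = chi | a.mask | b.mask;
    struct tnum t = { sv & ~mu, mu };
    return t;
}
```

Starting from (0x0; 0xff), OR with the constant 0x40 gives (0x40; 0xbf), and adding the constant 1 to that gives (0x0; 0x1ff), matching the text.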
Besides arithmetic, the register state can also be updated by conditional
branches. For instance, if a SCALAR_VALUE is compared > 8, in the 'true' branch
it will have a umin_value (unsigned minimum value) of 9, whereas in the 'false'
@ -1144,14 +1151,16 @@ BPF_JSGE) would instead update the signed minimum/maximum values. Information
from the signed and unsigned bounds can be combined; for instance if a value is
first tested < 8 and then tested s> 4, the verifier will conclude that the value
is also > 4 and s< 8, since the bounds prevent crossing the sign boundary.
PTR_TO_PACKETs with a variable offset part have an 'id', which is common to all
pointers sharing that same variable offset. This is important for packet range
checks: after adding some variable to a packet pointer, if you then copy it to
another register and (say) add a constant 4, both registers will share the same
'id' but one will have a fixed offset of +4. Then if it is bounds-checked and
found to be less than a PTR_TO_PACKET_END, the other register is now known to
have a safe range of at least 4 bytes. See 'Direct packet access', below, for
more on PTR_TO_PACKET ranges.
checks: after adding a variable to a packet pointer register A, if you then copy
it to another register B and then add a constant 4 to A, both registers will
share the same 'id' but A will have a fixed offset of +4. Then if A is
bounds-checked and found to be less than a PTR_TO_PACKET_END, the register B is
now known to have a safe range of at least 4 bytes. See 'Direct packet access',
below, for more on PTR_TO_PACKET ranges.
The 'id' field is also used on PTR_TO_MAP_VALUE_OR_NULL, common to all copies of
the pointer returned from a map lookup. This means that when one copy is
checked and found to be non-NULL, all copies can become PTR_TO_MAP_VALUEs.

@ -67,7 +67,7 @@ Don't be confused by terminology: The GTP User Plane goes through
kernel accelerated path, while the GTP Control Plane goes to
Userspace :)
The official homepge of the module is at
The official homepage of the module is at
https://osmocom.org/projects/linux-kernel-gtp-u/wiki
== Userspace Programs with Linux Kernel GTP-U support ==
@ -120,7 +120,7 @@ If yo have questions regarding how to use the Kernel GTP module from
your own software, or want to contribute to the code, please use the
osmocom-net-gprs mailing list for related discussion. The list can be
reached at osmocom-net-gprs@lists.osmocom.org and the mailman
interface for managign your subscription is at
interface for managing your subscription is at
https://lists.osmocom.org/mailman/listinfo/osmocom-net-gprs
== Issue Tracker ==

@ -121,7 +121,7 @@ three options to deal with this:
- checksum neutral mapping
When an address is translated the difference can be offset
elsewhere in a part of the packet that is covered by the
elsewhere in a part of the packet that is covered by
the checksum. The low order sixteen bits of the identifier
are used. This method is preferred since it doesn't require
parsing a packet beyond the IP header and in most cases the

@ -6,9 +6,12 @@ Contents:
.. toctree::
:maxdepth: 2
af_xdp
batman-adv
can
dpaa2/index
e100
e1000
kapi
z8530book
msg_zerocopy

@ -26,7 +26,7 @@ ip_no_pmtu_disc - INTEGER
discarded. Outgoing frames are handled the same as in mode 1,
implicitly setting IP_PMTUDISC_DONT on every created socket.
Mode 3 is a hardend pmtu discover mode. The kernel will only
Mode 3 is a hardened pmtu discover mode. The kernel will only
accept fragmentation-needed errors if the underlying protocol
can verify them besides a plain socket lookup. Current
protocols for which pmtu events will be honored are TCP, SCTP
@ -449,8 +449,10 @@ tcp_recovery - INTEGER
features.
RACK: 0x1 enables the RACK loss detection for fast detection of lost
retransmissions and tail drops.
retransmissions and tail drops. It also subsumes and disables
RFC6675 recovery for SACK connections.
RACK: 0x2 makes RACK's reordering window static (min_rtt/4).
RACK: 0x4 disables RACK's DUPACK threshold heuristic
Default: 0x1
@ -523,6 +525,19 @@ tcp_rmem - vector of 3 INTEGERs: min, default, max
tcp_sack - BOOLEAN
Enable select acknowledgments (SACKS).
tcp_comp_sack_delay_ns - LONG INTEGER
TCP tries to reduce the number of SACKs sent, using a timer
based on 5% of SRTT, capped by this sysctl, in nanoseconds.
The default is 1ms, based on TSO autosizing period.
Default : 1,000,000 ns (1 ms)
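The capping rule described for tcp_comp_sack_delay_ns can be written out as plain arithmetic. This is only an illustration of the sentence above: the function name is hypothetical, and the kernel's own computation may use a different RTT estimate and rounding.

```c
#include <stdint.h>

/*
 * Hypothetical helper: delay before a compressed SACK is sent, taken as
 * 5% of the smoothed RTT and capped by the sysctl value (both in ns).
 */
uint64_t comp_sack_delay_ns(uint64_t srtt_ns, uint64_t sysctl_cap_ns)
{
    uint64_t delay = srtt_ns / 20; /* 5% of SRTT */

    return delay < sysctl_cap_ns ? delay : sysctl_cap_ns;
}
```

With the default cap of 1,000,000 ns, a 10 ms SRTT yields a 500,000 ns delay, while a 100 ms SRTT would give 5 ms and is therefore capped at 1 ms.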
tcp_comp_sack_nr - INTEGER
Max number of SACKs that can be compressed.
Using 0 disables SACK compression.
Default : 44
tcp_slow_start_after_idle - BOOLEAN
If set, provide RFC2861 behavior and time out the congestion
window after an idle period. An idle period is defined at
@ -652,11 +667,15 @@ tcp_tso_win_divisor - INTEGER
building larger TSO frames.
Default: 3
tcp_tw_reuse - BOOLEAN
Allow to reuse TIME-WAIT sockets for new connections when it is
safe from protocol viewpoint. Default value is 0.
tcp_tw_reuse - INTEGER
Enable reuse of TIME-WAIT sockets for new connections when it is
safe from protocol viewpoint.
0 - disable
1 - global enable
2 - enable for loopback traffic only
It should not be changed without advice/request of technical
experts.
Default: 2
tcp_window_scaling - BOOLEAN
Enable window scaling as defined in RFC1323.
@ -1428,6 +1447,19 @@ ip6frag_low_thresh - INTEGER
ip6frag_time - INTEGER
Time in seconds to keep an IPv6 fragment in memory.
IPv6 Segment Routing:
seg6_flowlabel - INTEGER
Controls the behaviour of computing the flowlabel of outer
IPv6 header in case of SR T.encaps
-1 set flowlabel to zero.
0 copy flowlabel from Inner packet in case of Inner IPv6
(Set flowlabel to 0 in case IPv4/L2)
1 Compute the flowlabel using seg6_make_flowlabel()
Default is 0.
conf/default/*:
Change the interface-specific default settings.

@ -25,8 +25,8 @@ Quote from RFC3173:
is implementation dependent.
Current IPComp implementation is indeed by the book, while as in practice
when sending non-compressed packet to the peer(whether or not packet len
is smaller than the threshold or the compressed len is large than original
when sending non-compressed packet to the peer (whether or not packet len
is smaller than the threshold or the compressed len is larger than original
packet len), the packet is dropped when checking the policy as this packet
matches the selector but not coming from any XFRM layer, i.e., with no
security path. Such naked packet will not eventually make it to upper layer.

@ -73,11 +73,11 @@ mode to make conn-tracking work.
This is the default option. To configure the IPvlan port in this mode,
user can choose to either add this option on the command-line or don't specify
anything. This is the traditional mode where slaves can cross-talk among
themseleves apart from talking through the master device.
themselves apart from talking through the master device.
5.2 private:
If this option is added to the command-line, the port is set in private
mode. i.e. port wont allow cross communication between slaves.
mode. i.e. port won't allow cross communication between slaves.
5.3 vepa:
If this is added to the command-line, the port is set in VEPA mode.

@ -1,4 +1,4 @@
Kernel Connection Mulitplexor
Kernel Connection Multiplexor
-----------------------------
Kernel Connection Multiplexor (KCM) is a mechanism that provides a message based
@ -31,7 +31,7 @@ KCM implements an NxM multiplexor in the kernel as diagrammed below:
KCM sockets
-----------
The KCM sockets provide the user interface to the muliplexor. All the KCM sockets
The KCM sockets provide the user interface to the multiplexor. All the KCM sockets
bound to a multiplexor are considered to have equivalent function, and I/O
operations in different sockets may be done in parallel without the need for
synchronization between threads in userspace.
@ -199,7 +199,7 @@ while. Example use:
BPF programs for message delineation
------------------------------------
BPF programs can be compiled using the BPF LLVM backend. For exmple,
BPF programs can be compiled using the BPF LLVM backend. For example,
the BPF program for parsing Thrift is:
#include "bpf.h" /* for __sk_buff */
@ -222,7 +222,7 @@ messages. The kernel provides necessary assurances that messages are sent
and received atomically. This relieves much of the burden applications have
in mapping a message based protocol onto the TCP stream. KCM also makes
application layer messages a unit of work in the kernel for the purposes of
steerng and scheduling, which in turn allows a simpler networking model in
steering and scheduling, which in turn allows a simpler networking model in
multithreaded applications.
Configurations
@ -272,7 +272,7 @@ on the socket thus waking up the application thread. When the application
sees the error (which may just be a disconnect) it should unattach the
socket from KCM and then close it. It is assumed that once an error is
posted on the TCP socket the data stream is unrecoverable (i.e. an error
may have occurred in the middle of receiving a messssge).
may have occurred in the middle of receiving a message).
TCP connection monitoring
-------------------------


@ -0,0 +1,116 @@
.. SPDX-License-Identifier: GPL-2.0
============
NET_FAILOVER
============
Overview
========
The net_failover driver provides an automated failover mechanism via APIs
to create and destroy a failover master netdev and manages the primary and
standby slave netdevs that get registered via the generic failover
infrastructure.
The failover netdev acts as a master device and controls 2 slave devices. The
original paravirtual interface is registered as 'standby' slave netdev and
a passthru/vf device with the same MAC gets registered as 'primary' slave
netdev. Both 'standby' and 'failover' netdevs are associated with the same
'pci' device. The user accesses the network interface via 'failover' netdev.
The 'failover' netdev chooses 'primary' netdev as default for transmits when
it is available with link up and running.
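The transmit-path preference described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct slave_dev`, `slave_usable()` and `pick_tx_slave()` are hypothetical names invented here, not the driver's actual data structures or functions, and falling back to the standby slave is inferred from the failover behaviour the text describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a slave netdev (not struct net_device). */
struct slave_dev {
	const char *name;
	bool present;	/* currently registered with the failover master */
	bool link_up;	/* carrier is up */
	bool running;	/* interface is up and running */
};

/* A slave can carry traffic only if it is registered, has link, and is running. */
static bool slave_usable(const struct slave_dev *dev)
{
	return dev->present && dev->link_up && dev->running;
}

/*
 * Prefer the primary (VF) slave when it is usable, otherwise fall back to
 * the standby (paravirtual) slave. Returns NULL when neither can transmit.
 * Both slots must be passed; an unplugged device is modelled by present=false.
 */
static const struct slave_dev *pick_tx_slave(const struct slave_dev *primary,
					      const struct slave_dev *standby)
{
	assert(primary != NULL && standby != NULL);

	if (slave_usable(primary))
		return primary;
	if (slave_usable(standby))
		return standby;
	return NULL;
}
```

In this sketch, unplugging the VF corresponds to clearing `present` on the primary slot, after which the selection falls back to the standby slave without the caller changing anything else.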
This can be used by paravirtual drivers to enable an alternate low latency
datapath. It also enables hypervisor controlled live migration of a VM with
direct attached VF by failing over to the paravirtual datapath when the VF
is unplugged.
virtio-net accelerated datapath: STANDBY mode
=============================================
net_failover enables hypervisor controlled accelerated datapath to virtio-net
enabled VMs in a transparent manner with no/minimal guest userspace changes.
To support this, the hypervisor needs to enable VIRTIO_NET_F_STANDBY
feature on the virtio-net interface and assign the same MAC address to both
virtio-net and VF interfaces.
Here is an example XML snippet that shows such configuration.
<interface type='network'>
<mac address='52:54:00:00:12:53'/>
<source network='enp66s0f0_br'/>
<target dev='tap01'/>
<model type='virtio'/>
<driver name='vhost' queues='4'/>
<link state='down'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</interface>
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:00:12:53'/>
<source>
<address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
</source>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
</interface>
Booting a VM with the above configuration will result in the following 3
netdevs created in the VM.
4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
inet 192.168.12.53/24 brd 192.168.12.255 scope global dynamic ens10
valid_lft 42482sec preferred_lft 42482sec
inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
valid_lft forever preferred_lft forever
5: ens10nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ens10 state UP group default qlen 1000
link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
ens10 is the 'failover' master netdev, ens10nsby and ens11 are the slave
'standby' and 'primary' netdevs respectively.
Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
==================================================================
net_failover also enables hypervisor controlled live migration to be supported
with VMs that have direct attached SR-IOV VF devices by automatic failover to
the paravirtual datapath when the VF is unplugged.
Here is a sample script that shows the steps to initiate live migration on
the source hypervisor.
# cat vf_xml
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:00:12:53'/>
<source>
<address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
</source>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
</interface>
# Source Hypervisor
#!/bin/bash
DOMAIN=fedora27-tap01
PF=enp66s0f0
VF_NUM=5
TAP_IF=tap01
VF_XML=vf_xml
MAC=52:54:00:00:12:53
ZERO_MAC=00:00:00:00:00:00
virsh domif-setlink $DOMAIN $TAP_IF up
bridge fdb del $MAC dev $PF master
virsh detach-device $DOMAIN $VF_XML
ip link set $PF vf $VF_NUM mac $ZERO_MAC
virsh migrate --live $DOMAIN qemu+ssh://$REMOTE_HOST/system
# Destination Hypervisor
#!/bin/bash
virsh attach-device $DOMAIN $VF_XML
virsh domif-setlink $DOMAIN $TAP_IF down


@ -179,6 +179,15 @@ A: No. See above answer. In short, if you think it really belongs in
dash marker line as described in Documentation/process/submitting-patches.rst to
temporarily embed that information into the patch that you send.
Q: Are all networking bug fixes backported to all stable releases?
A: Due to capacity constraints, Dave can only take care of the backports for
the last two stable releases. For earlier stable releases, each stable branch
maintainer is expected to take care of them. If you find that a patch is
missing from an earlier stable branch, please notify stable@vger.kernel.org
with either a commit ID or a formally backported patch, and CC Dave and other
relevant networking developers.
Q: Someone said that the comment style and coding convention is different
for the networking content. Is this true?


@ -113,6 +113,13 @@ whatever headers there might be.
NETIF_F_TSO_ECN means that hardware can properly split packets with CWR bit
set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).
* Transmit UDP segmentation offload
NETIF_F_GSO_UDP_L4 accepts a single UDP header with a payload that exceeds
gso_size. On segmentation, it segments the payload on gso_size boundaries and
replicates the network and UDP headers (fixing up the last one if less than
gso_size).
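As a rough illustration of the segmentation rule above, the following C sketch computes how many segments a payload yields and how long each one is. The helper names `udp_gso_segment_count()` and `udp_gso_segment_len()` are hypothetical and exist only for this example; the kernel's own segmentation code is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Number of segments produced when payload_len bytes are cut every gso_size bytes. */
static size_t udp_gso_segment_count(size_t payload_len, size_t gso_size)
{
	assert(gso_size > 0);
	return (payload_len + gso_size - 1) / gso_size;
}

/*
 * Payload length of segment idx (0-based). Every segment carries gso_size
 * bytes except possibly the last one, which carries the remainder; that is
 * the segment whose replicated headers need their length fields fixed up.
 */
static size_t udp_gso_segment_len(size_t payload_len, size_t gso_size, size_t idx)
{
	size_t count = udp_gso_segment_count(payload_len, gso_size);

	assert(idx < count);
	if (idx + 1 < count)
		return gso_size;
	return payload_len - idx * gso_size;
}
```

For example, a 3000-byte payload with a gso_size of 1400 gives three segments of 1400, 1400 and 200 bytes.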
* Transmit DMA from high memory
On platforms where this is relevant, NETIF_F_HIGHDMA signals that


@ -156,7 +156,7 @@ nf_conntrack_timestamp - BOOLEAN
nf_conntrack_udp_timeout - INTEGER (seconds)
default 30
nf_conntrack_udp_timeout_stream2 - INTEGER (seconds)
nf_conntrack_udp_timeout_stream - INTEGER (seconds)
default 180
This extended timeout will be used in case there is a UDP stream


@ -45,6 +45,7 @@ through bpf(2) and passing a verifier in the kernel, a JIT will then
translate these BPF proglets into native CPU instructions. There are
two flavors of JITs, the newer eBPF JIT currently supported on:
- x86_64
- x86_32
- arm64
- arm32
- ppc64


@ -2732,13 +2732,13 @@ L: netdev@vger.kernel.org
L: linux-kernel@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
Q: https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
S: Supported
F: arch/x86/net/bpf_jit*
F: Documentation/networking/filter.txt
F: Documentation/bpf/
F: include/linux/bpf*
F: include/linux/filter.h
F: include/trace/events/bpf.h
F: include/trace/events/xdp.h
F: include/uapi/linux/bpf*
F: include/uapi/linux/filter.h
@ -2751,6 +2751,7 @@ F: net/sched/act_bpf.c
F: net/sched/cls_bpf.c
F: samples/bpf/
F: tools/bpf/
F: tools/lib/bpf/
F: tools/testing/selftests/bpf/
BROADCOM B44 10/100 ETHERNET DRIVER
@ -5352,7 +5353,6 @@ F: include/linux/*mdio*.h
F: include/linux/of_net.h
F: include/linux/phy.h
F: include/linux/phy_fixed.h
F: include/linux/platform_data/mdio-gpio.h
F: include/linux/platform_data/mdio-bcm-unimac.h
F: include/trace/events/mdio.h
F: include/uapi/linux/mdio.h
@ -5453,6 +5453,14 @@ M: Josh Poimboeuf <jpoimboe@redhat.com>
S: Maintained
F: scripts/faddr2line
FAILOVER MODULE
M: Sridhar Samudrala <sridhar.samudrala@intel.com>
L: netdev@vger.kernel.org
S: Supported
F: net/core/failover.c
F: include/net/failover.h
F: Documentation/networking/failover.rst
FANOTIFY
M: Jan Kara <jack@suse.cz>
R: Amir Goldstein <amir73il@gmail.com>
@ -5663,7 +5671,6 @@ M: Claudiu Manoil <claudiu.manoil@nxp.com>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/net/ethernet/freescale/gianfar*
X: drivers/net/ethernet/freescale/gianfar_ptp.c
F: Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
FREESCALE GPMI NAND DRIVER
@ -5710,6 +5717,14 @@ S: Maintained
F: drivers/net/ethernet/freescale/fman
F: Documentation/devicetree/bindings/powerpc/fsl/fman.txt
FREESCALE QORIQ PTP CLOCK DRIVER
M: Yangbo Lu <yangbo.lu@nxp.com>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/ptp/ptp_qoriq.c
F: include/linux/fsl/ptp_qoriq.h
F: Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
FREESCALE QUAD SPI DRIVER
M: Han Xu <han.xu@nxp.com>
L: linux-mtd@lists.infradead.org
@ -7123,8 +7138,8 @@ Q: http://patchwork.ozlabs.org/project/intel-wired-lan/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git
S: Supported
F: Documentation/networking/e100.txt
F: Documentation/networking/e1000.txt
F: Documentation/networking/e100.rst
F: Documentation/networking/e1000.rst
F: Documentation/networking/e1000e.txt
F: Documentation/networking/igb.txt
F: Documentation/networking/igbvf.txt
@ -8521,6 +8536,7 @@ M: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/net/dsa/mv88e6xxx/
F: include/linux/platform_data/mv88e6xxx.h
F: Documentation/devicetree/bindings/net/dsa/marvell.txt
MARVELL ARMADA DRM SUPPORT
@ -9074,12 +9090,14 @@ W: http://www.mellanox.com
Q: http://patchwork.ozlabs.org/project/netdev/list/
F: drivers/net/ethernet/mellanox/mlx5/core/en_*
MELLANOX ETHERNET INNOVA DRIVER
MELLANOX ETHERNET INNOVA DRIVERS
R: Boris Pismenny <borisp@mellanox.com>
L: netdev@vger.kernel.org
S: Supported
W: http://www.mellanox.com
Q: http://patchwork.ozlabs.org/project/netdev/list/
F: drivers/net/ethernet/mellanox/mlx5/core/en_accel/*
F: drivers/net/ethernet/mellanox/mlx5/core/accel/*
F: drivers/net/ethernet/mellanox/mlx5/core/fpga/*
F: include/linux/mlx5/mlx5_ifc_fpga.h
@ -9339,6 +9357,12 @@ F: include/linux/cciss*.h
F: include/uapi/linux/cciss*.h
F: Documentation/scsi/smartpqi.txt
MICROSEMI ETHERNET SWITCH DRIVER
M: Alexandre Belloni <alexandre.belloni@bootlin.com>
L: netdev@vger.kernel.org
S: Supported
F: drivers/net/ethernet/mscc/
MICROSOFT SURFACE PRO 3 BUTTON DRIVER
M: Chen Yu <yu.c.chen@intel.com>
L: platform-driver-x86@vger.kernel.org
@ -9688,6 +9712,14 @@ S: Maintained
F: Documentation/hwmon/nct6775
F: drivers/hwmon/nct6775.c
NET_FAILOVER MODULE
M: Sridhar Samudrala <sridhar.samudrala@intel.com>
L: netdev@vger.kernel.org
S: Supported
F: drivers/net/net_failover.c
F: include/net/net_failover.h
F: Documentation/networking/net_failover.rst
NETEFFECT IWARP RNIC DRIVER (IW_NES)
M: Faisal Latif <faisal.latif@intel.com>
L: linux-rdma@vger.kernel.org
@ -9881,7 +9913,21 @@ F: net/ipv6/calipso.c
F: net/netfilter/xt_CONNSECMARK.c
F: net/netfilter/xt_SECMARK.c
NETWORKING [TCP]
M: Eric Dumazet <edumazet@google.com>
L: netdev@vger.kernel.org
S: Maintained
F: net/ipv4/tcp*.c
F: net/ipv4/syncookies.c
F: net/ipv6/tcp*.c
F: net/ipv6/syncookies.c
F: include/uapi/linux/tcp.h
F: include/net/tcp.h
F: include/linux/tcp.h
F: include/trace/events/tcp.h
NETWORKING [TLS]
M: Boris Pismenny <borisp@mellanox.com>
M: Aviad Yehezkel <aviadye@mellanox.com>
M: Dave Watson <davejwatson@fb.com>
L: netdev@vger.kernel.org
@ -11447,7 +11493,6 @@ S: Maintained
W: http://linuxptp.sourceforge.net/
F: Documentation/ABI/testing/sysfs-ptp
F: Documentation/ptp/*
F: drivers/net/ethernet/freescale/gianfar_ptp.c
F: drivers/net/phy/dp83640*
F: drivers/ptp/*
F: include/linux/ptp_cl*
@ -13458,6 +13503,7 @@ F: drivers/media/usb/stk1160/
STMMAC ETHERNET DRIVER
M: Giuseppe Cavallaro <peppe.cavallaro@st.com>
M: Alexandre Torgue <alexandre.torgue@st.com>
M: Jose Abreu <joabreu@synopsys.com>
L: netdev@vger.kernel.org
W: http://www.stlinux.com
S: Supported
@ -14683,7 +14729,9 @@ M: Woojung Huh <woojung.huh@microchip.com>
M: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
L: netdev@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/net/microchip,lan78xx.txt
F: drivers/net/usb/lan78xx.*
F: include/dt-bindings/net/microchip-lan78xx.h
USB MASS STORAGE DRIVER
M: Alan Stern <stern@rowland.harvard.edu>
@ -15472,6 +15520,14 @@ T: git git://linuxtv.org/media_tree.git
S: Maintained
F: drivers/media/tuners/tuner-xc2028.*
XDP SOCKETS (AF_XDP)
M: Björn Töpel <bjorn.topel@intel.com>
M: Magnus Karlsson <magnus.karlsson@intel.com>
L: netdev@vger.kernel.org
S: Maintained
F: kernel/bpf/xskmap.c
F: net/xdp/
XEN BLOCK SUBSYSTEM
M: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
M: Roger Pau Monné <roger.pau@citrix.com>


@ -509,6 +509,11 @@ ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLA
KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
endif
ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/cc-can-link.sh $(CC)), y)
CC_CAN_LINK := y
export CC_CAN_LINK
endif
ifeq ($(config-targets),1)
# ===========================================================================
# *config targets only - make sure prerequisites are updated, and descend


@ -84,7 +84,7 @@
*
* 1. First argument is passed using the arm 32bit registers and rest of the
* arguments are passed on stack scratch space.
* 2. First callee-saved arugument is mapped to arm 32 bit registers and rest
* 2. First callee-saved argument is mapped to arm 32 bit registers and rest
* arguments are mapped to scratch space on stack.
* 3. We need two 64 bit temp registers to do complex operations on eBPF
* registers.
@ -234,18 +234,11 @@ static void jit_fill_hole(void *area, unsigned int size)
#define SCRATCH_SIZE 80
/* total stack size used in JITed code */
#define _STACK_SIZE \
(ctx->prog->aux->stack_depth + \
+ SCRATCH_SIZE + \
+ 4 /* extra for skb_copy_bits buffer */)
#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
#define _STACK_SIZE (ctx->prog->aux->stack_depth + SCRATCH_SIZE)
#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
/* Get the offset of eBPF REGISTERs stored on scratch space. */
#define STACK_VAR(off) (STACK_SIZE-off-4)
/* Offset of skb_copy_bits buffer */
#define SKB_BUFFER STACK_VAR(SCRATCH_SIZE)
#define STACK_VAR(off) (STACK_SIZE - off)
#if __LINUX_ARM_ARCH__ < 7
@ -708,7 +701,7 @@ static inline void emit_a32_arsh_r64(const u8 dst[], const u8 src[], bool dstk,
}
/* dst = dst >> src */
static inline void emit_a32_lsr_r64(const u8 dst[], const u8 src[], bool dstk,
static inline void emit_a32_rsh_r64(const u8 dst[], const u8 src[], bool dstk,
bool sstk, struct jit_ctx *ctx) {
const u8 *tmp = bpf2a32[TMP_REG_1];
const u8 *tmp2 = bpf2a32[TMP_REG_2];
@ -724,7 +717,7 @@ static inline void emit_a32_lsr_r64(const u8 dst[], const u8 src[], bool dstk,
emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
}
/* Do LSH operation */
/* Do RSH operation */
emit(ARM_RSB_I(ARM_IP, rt, 32), ctx);
emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_LSR, rt), ctx);
@ -774,7 +767,7 @@ static inline void emit_a32_lsh_i64(const u8 dst[], bool dstk,
}
/* dst = dst >> val */
static inline void emit_a32_lsr_i64(const u8 dst[], bool dstk,
static inline void emit_a32_rsh_i64(const u8 dst[], bool dstk,
const u32 val, struct jit_ctx *ctx) {
const u8 *tmp = bpf2a32[TMP_REG_1];
const u8 *tmp2 = bpf2a32[TMP_REG_2];
@ -1199,8 +1192,8 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
s32 jmp_offset;
#define check_imm(bits, imm) do { \
if ((((imm) > 0) && ((imm) >> (bits))) || \
(((imm) < 0) && (~(imm) >> (bits)))) { \
if ((imm) >= (1 << ((bits) - 1)) || \
(imm) < -(1 << ((bits) - 1))) { \
pr_info("[%2d] imm=%d(0x%x) out of range\n", \
i, imm, imm); \
return -EINVAL; \
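The replacement condition in check_imm above is a signed-range test: a value is accepted only if it lies in [-2^(bits-1), 2^(bits-1)). The earlier form shifted by `bits` and therefore let through any positive value below 2^bits, one bit more than a signed field can hold. A standalone sketch of the same test follows; `fits_signed()` is a hypothetical helper written for this example, not a function from the JIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when imm fits in a signed two's-complement field that is `bits` wide. */
static bool fits_signed(int32_t imm, unsigned int bits)
{
	/* Keep the shift below well defined for a 32-bit signed limit. */
	assert(bits >= 1 && bits <= 31);

	int32_t limit = (int32_t)1 << (bits - 1);

	return imm >= -limit && imm < limit;
}
```

With an 8-bit field, 127 and -128 are accepted while 128 and -129 are rejected, which matches the usual two's-complement bounds.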
@ -1330,7 +1323,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
case BPF_ALU64 | BPF_RSH | BPF_K:
if (unlikely(imm > 63))
return -EINVAL;
emit_a32_lsr_i64(dst, dstk, imm, ctx);
emit_a32_rsh_i64(dst, dstk, imm, ctx);
break;
/* dst = dst << src */
case BPF_ALU64 | BPF_LSH | BPF_X:
@ -1338,7 +1331,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
break;
/* dst = dst >> src */
case BPF_ALU64 | BPF_RSH | BPF_X:
emit_a32_lsr_r64(dst, src, dstk, sstk, ctx);
emit_a32_rsh_r64(dst, src, dstk, sstk, ctx);
break;
/* dst = dst >> src (signed) */
case BPF_ALU64 | BPF_ARSH | BPF_X:
@ -1452,83 +1445,6 @@ exit:
emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
emit_ldx_r(dst, rn, dstk, off, ctx, BPF_SIZE(code));
break;
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
case BPF_LD | BPF_ABS | BPF_W:
case BPF_LD | BPF_ABS | BPF_H:
case BPF_LD | BPF_ABS | BPF_B:
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + src + imm)) */
case BPF_LD | BPF_IND | BPF_W:
case BPF_LD | BPF_IND | BPF_H:
case BPF_LD | BPF_IND | BPF_B:
{
const u8 r4 = bpf2a32[BPF_REG_6][1]; /* r4 = ptr to sk_buff */
const u8 r0 = bpf2a32[BPF_REG_0][1]; /*r0: struct sk_buff *skb*/
/* rtn value */
const u8 r1 = bpf2a32[BPF_REG_0][0]; /* r1: int k */
const u8 r2 = bpf2a32[BPF_REG_1][1]; /* r2: unsigned int size */
const u8 r3 = bpf2a32[BPF_REG_1][0]; /* r3: void *buffer */
const u8 r6 = bpf2a32[TMP_REG_1][1]; /* r6: void *(*func)(..) */
int size;
/* Setting up first argument */
emit(ARM_MOV_R(r0, r4), ctx);
/* Setting up second argument */
emit_a32_mov_i(r1, imm, false, ctx);
if (BPF_MODE(code) == BPF_IND)
emit_a32_alu_r(r1, src_lo, false, sstk, ctx,
false, false, BPF_ADD);
/* Setting up third argument */
switch (BPF_SIZE(code)) {
case BPF_W:
size = 4;
break;
case BPF_H:
size = 2;
break;
case BPF_B:
size = 1;
break;
default:
return -EINVAL;
}
emit_a32_mov_i(r2, size, false, ctx);
/* Setting up fourth argument */
emit(ARM_ADD_I(r3, ARM_SP, imm8m(SKB_BUFFER)), ctx);
/* Setting up function pointer to call */
emit_a32_mov_i(r6, (unsigned int)bpf_load_pointer, false, ctx);
emit_blx_r(r6, ctx);
emit(ARM_EOR_R(r1, r1, r1), ctx);
/* Check if return address is NULL or not.
* if NULL then jump to epilogue
* else continue to load the value from retn address
*/
emit(ARM_CMP_I(r0, 0), ctx);
jmp_offset = epilogue_offset(ctx);
check_imm24(jmp_offset);
_emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
/* Load value from the address */
switch (BPF_SIZE(code)) {
case BPF_W:
emit(ARM_LDR_I(r0, r0, 0), ctx);
emit_rev32(r0, r0, ctx);
break;
case BPF_H:
emit(ARM_LDRH_I(r0, r0, 0), ctx);
emit_rev16(r0, r0, ctx);
break;
case BPF_B:
emit(ARM_LDRB_I(r0, r0, 0), ctx);
/* No need to reverse */
break;
}
break;
}
/* ST: *(size *)(dst + off) = imm */
case BPF_ST | BPF_MEM | BPF_W:
case BPF_ST | BPF_MEM | BPF_H:


@ -36,4 +36,30 @@
drive-strength = <2>; /* 2 MA */
};
};
blsp1_uart1_default: blsp1_uart1_default {
mux {
pins = "gpio41", "gpio42", "gpio43", "gpio44";
function = "blsp_uart2";
};
config {
pins = "gpio41", "gpio42", "gpio43", "gpio44";
drive-strength = <16>;
bias-disable;
};
};
blsp1_uart1_sleep: blsp1_uart1_sleep {
mux {
pins = "gpio41", "gpio42", "gpio43", "gpio44";
function = "gpio";
};
config {
pins = "gpio41", "gpio42", "gpio43", "gpio44";
drive-strength = <2>;
bias-disable;
};
};
};


@ -14,6 +14,28 @@
};
};
bt_en_gpios: bt_en_gpios {
pinconf {
pins = "gpio19";
function = PMIC_GPIO_FUNC_NORMAL;
output-low;
power-source = <PM8994_GPIO_S4>; // 1.8V
qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
bias-pull-down;
};
};
wlan_en_gpios: wlan_en_gpios {
pinconf {
pins = "gpio8";
function = PMIC_GPIO_FUNC_NORMAL;
output-low;
power-source = <PM8994_GPIO_S4>; // 1.8V
qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
bias-pull-down;
};
};
volume_up_gpio: pm8996_gpio2 {
pinconf {
pins = "gpio2";
@ -26,6 +48,16 @@
};
};
divclk4_pin_a: divclk4 {
pinconf {
pins = "gpio18";
function = PMIC_GPIO_FUNC_FUNC2;
bias-disable;
power-source = <PM8994_GPIO_S4>;
};
};
usb3_vbus_det_gpio: pm8996_gpio22 {
pinconf {
pins = "gpio22";


@ -23,6 +23,7 @@
aliases {
serial0 = &blsp2_uart1;
serial1 = &blsp2_uart2;
serial2 = &blsp1_uart1;
i2c0 = &blsp1_i2c2;
i2c1 = &blsp2_i2c1;
i2c2 = &blsp2_i2c0;
@ -34,7 +35,36 @@
stdout-path = "serial0:115200n8";
};
clocks {
divclk4: divclk4 {
compatible = "fixed-clock";
#clock-cells = <0>;
clock-frequency = <32768>;
clock-output-names = "divclk4";
pinctrl-names = "default";
pinctrl-0 = <&divclk4_pin_a>;
};
};
soc {
serial@7570000 {
label = "BT-UART";
status = "okay";
pinctrl-names = "default", "sleep";
pinctrl-0 = <&blsp1_uart1_default>;
pinctrl-1 = <&blsp1_uart1_sleep>;
bluetooth {
compatible = "qcom,qca6174-bt";
/* bt_disable_n gpio */
enable-gpios = <&pm8994_gpios 19 GPIO_ACTIVE_HIGH>;
clocks = <&divclk4>;
};
};
serial@75b0000 {
label = "LS-UART1";
status = "okay";
@ -139,9 +169,40 @@
pinctrl-0 = <&usb2_vbus_det_gpio>;
};
bt_en: bt-en-1-8v {
pinctrl-names = "default";
pinctrl-0 = <&bt_en_gpios>;
compatible = "regulator-fixed";
regulator-name = "bt-en-regulator";
regulator-min-microvolt = <1800000>;
regulator-max-microvolt = <1800000>;
/* WLAN card specific delay */
startup-delay-us = <70000>;
enable-active-high;
};
wlan_en: wlan-en-1-8v {
pinctrl-names = "default";
pinctrl-0 = <&wlan_en_gpios>;
compatible = "regulator-fixed";
regulator-name = "wlan-en-regulator";
regulator-min-microvolt = <1800000>;
regulator-max-microvolt = <1800000>;
gpio = <&pm8994_gpios 8 0>;
/* WLAN card specific delay */
startup-delay-us = <70000>;
enable-active-high;
};
agnoc@0 {
qcom,pcie@600000 {
status = "okay";
perst-gpio = <&msmgpio 35 GPIO_ACTIVE_LOW>;
vddpe-supply = <&wlan_en>;
vddpe1-supply = <&bt_en>;
};
qcom,pcie@608000 {


@ -419,6 +419,16 @@
#clock-cells = <1>;
};
blsp1_uart1: serial@7570000 {
compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
reg = <0x07570000 0x1000>;
interrupts = <GIC_SPI 108 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&gcc GCC_BLSP1_UART2_APPS_CLK>,
<&gcc GCC_BLSP1_AHB_CLK>;
clock-names = "core", "iface";
status = "disabled";
};
blsp1_spi0: spi@7575000 {
compatible = "qcom,spi-qup-v2.2.1";
reg = <0x07575000 0x600>;


@ -21,7 +21,6 @@
#include <linux/bpf.h>
#include <linux/filter.h>
#include <linux/printk.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
#include <asm/byteorder.h>
@ -80,37 +79,6 @@ static inline void emit(const u32 insn, struct jit_ctx *ctx)
ctx->idx++;
}
static inline void emit_a64_mov_i64(const int reg, const u64 val,
struct jit_ctx *ctx)
{
u64 tmp = val;
int shift = 0;
emit(A64_MOVZ(1, reg, tmp & 0xffff, shift), ctx);
tmp >>= 16;
shift += 16;
while (tmp) {
if (tmp & 0xffff)
emit(A64_MOVK(1, reg, tmp & 0xffff, shift), ctx);
tmp >>= 16;
shift += 16;
}
}
static inline void emit_addr_mov_i64(const int reg, const u64 val,
struct jit_ctx *ctx)
{
u64 tmp = val;
int shift = 0;
emit(A64_MOVZ(1, reg, tmp & 0xffff, shift), ctx);
for (;shift < 48;) {
tmp >>= 16;
shift += 16;
emit(A64_MOVK(1, reg, tmp & 0xffff, shift), ctx);
}
}
static inline void emit_a64_mov_i(const int is64, const int reg,
const s32 val, struct jit_ctx *ctx)
{
@ -122,7 +90,8 @@ static inline void emit_a64_mov_i(const int is64, const int reg,
emit(A64_MOVN(is64, reg, (u16)~lo, 0), ctx);
} else {
emit(A64_MOVN(is64, reg, (u16)~hi, 16), ctx);
emit(A64_MOVK(is64, reg, lo, 0), ctx);
if (lo != 0xffff)
emit(A64_MOVK(is64, reg, lo, 0), ctx);
}
} else {
emit(A64_MOVZ(is64, reg, lo, 0), ctx);
@ -131,6 +100,59 @@ static inline void emit_a64_mov_i(const int is64, const int reg,
}
}
static int i64_i16_blocks(const u64 val, bool inverse)
{
return (((val >> 0) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
(((val >> 16) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
(((val >> 32) & 0xffff) != (inverse ? 0xffff : 0x0000)) +
(((val >> 48) & 0xffff) != (inverse ? 0xffff : 0x0000));
}
static inline void emit_a64_mov_i64(const int reg, const u64 val,
struct jit_ctx *ctx)
{
u64 nrm_tmp = val, rev_tmp = ~val;
bool inverse;
int shift;
if (!(nrm_tmp >> 32))
return emit_a64_mov_i(0, reg, (u32)val, ctx);
inverse = i64_i16_blocks(nrm_tmp, true) < i64_i16_blocks(nrm_tmp, false);
shift = max(round_down((inverse ? (fls64(rev_tmp) - 1) :
(fls64(nrm_tmp) - 1)), 16), 0);
if (inverse)
emit(A64_MOVN(1, reg, (rev_tmp >> shift) & 0xffff, shift), ctx);
else
emit(A64_MOVZ(1, reg, (nrm_tmp >> shift) & 0xffff, shift), ctx);
shift -= 16;
while (shift >= 0) {
if (((nrm_tmp >> shift) & 0xffff) != (inverse ? 0xffff : 0x0000))
emit(A64_MOVK(1, reg, (nrm_tmp >> shift) & 0xffff, shift), ctx);
shift -= 16;
}
}
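The block-counting heuristic above can be tried outside the kernel with a small user-space simulation. This is a sketch under stated assumptions: `top_bit()` stands in for `fls64(x) - 1`, `sim_mov_i64()` is a hypothetical name, the MOVZ/MOVN/MOVK effects are modelled with plain integer arithmetic rather than real instruction encodings, and the 32-bit shortcut through emit_a64_mov_i is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count 16-bit blocks that differ from the fill pattern (0xffff if inverse, else 0). */
static int i64_i16_blocks(uint64_t val, bool inverse)
{
	uint64_t fill = inverse ? 0xffff : 0x0000;
	int n = 0;

	for (int shift = 0; shift < 64; shift += 16)
		n += ((val >> shift) & 0xffff) != fill;
	return n;
}

/* Index of the highest set bit, or -1 for zero (stand-in for fls64(x) - 1). */
static int top_bit(uint64_t x)
{
	int i = -1;

	while (x) {
		x >>= 1;
		i++;
	}
	return i;
}

/*
 * Simulate the sequence: one MOVZ or MOVN for the highest interesting block,
 * then one MOVK per lower block that differs from the fill pattern.
 * Returns the instruction count and stores the resulting register value.
 */
static int sim_mov_i64(uint64_t val, uint64_t *result)
{
	bool inverse = i64_i16_blocks(val, true) < i64_i16_blocks(val, false);
	uint64_t fill = inverse ? 0xffff : 0x0000;
	int top = top_bit(inverse ? ~val : val);
	int shift = top < 0 ? 0 : (top / 16) * 16;
	int count = 1;
	uint64_t reg;

	if (inverse)	/* MOVN: reg = ~(imm16 << shift) */
		reg = ~(((~val >> shift) & 0xffff) << shift);
	else		/* MOVZ: reg = imm16 << shift */
		reg = ((val >> shift) & 0xffff) << shift;

	for (shift -= 16; shift >= 0; shift -= 16) {
		uint64_t block = (val >> shift) & 0xffff;

		if (block != fill) {
			/* MOVK: overwrite one 16-bit block, keep the others. */
			reg = (reg & ~((uint64_t)0xffff << shift)) | (block << shift);
			count++;
		}
	}

	assert(reg == val);	/* the simulated sequence must rebuild val */
	*result = reg;
	return count;
}
```

In this model, 0xffffffffffff1234 needs a single MOVN, while 0x0000123400005678 needs a MOVZ plus one MOVK, because blocks equal to the fill pattern are skipped.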
/*
* This is an unoptimized 64-bit immediate emission used for BPF to BPF call
* addresses. It will always do a full 64 bit decomposition as otherwise
* more complexity in the last extra pass is required since we previously
* reserved 4 instructions for the address.
*/
static inline void emit_addr_mov_i64(const int reg, const u64 val,
struct jit_ctx *ctx)
{
u64 tmp = val;
int shift = 0;
emit(A64_MOVZ(1, reg, tmp & 0xffff, shift), ctx);
for (;shift < 48;) {
tmp >>= 16;
shift += 16;
emit(A64_MOVK(1, reg, tmp & 0xffff, shift), ctx);
}
}
static inline int bpf2a64_offset(int bpf_to, int bpf_from,
const struct jit_ctx *ctx)
{
@ -163,7 +185,7 @@ static inline int epilogue_offset(const struct jit_ctx *ctx)
/* Tail call offset to jump into */
#define PROLOGUE_OFFSET 7
static int build_prologue(struct jit_ctx *ctx)
static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
{
const struct bpf_prog *prog = ctx->prog;
const u8 r6 = bpf2a64[BPF_REG_6];
@ -188,7 +210,7 @@ static int build_prologue(struct jit_ctx *ctx)
* | ... | BPF prog stack
* | |
* +-----+ <= (BPF_FP - prog->aux->stack_depth)
* |RSVD | JIT scratchpad
* |RSVD | padding
* current A64_SP => +-----+ <= (BPF_FP - ctx->stack_size)
* | |
* | ... | Function call stack
@ -210,19 +232,19 @@ static int build_prologue(struct jit_ctx *ctx)
/* Set up BPF prog stack base register */
emit(A64_MOV(1, fp, A64_SP), ctx);
/* Initialize tail_call_cnt */
emit(A64_MOVZ(1, tcc, 0, 0), ctx);
if (!ebpf_from_cbpf) {
/* Initialize tail_call_cnt */
emit(A64_MOVZ(1, tcc, 0, 0), ctx);
cur_offset = ctx->idx - idx0;
if (cur_offset != PROLOGUE_OFFSET) {
pr_err_once("PROLOGUE_OFFSET = %d, expected %d!\n",
cur_offset, PROLOGUE_OFFSET);
return -1;
cur_offset = ctx->idx - idx0;
if (cur_offset != PROLOGUE_OFFSET) {
pr_err_once("PROLOGUE_OFFSET = %d, expected %d!\n",
cur_offset, PROLOGUE_OFFSET);
return -1;
}
}
/* 4 byte extra for skb_copy_bits buffer */
ctx->stack_size = prog->aux->stack_depth + 4;
ctx->stack_size = STACK_ALIGN(ctx->stack_size);
ctx->stack_size = STACK_ALIGN(prog->aux->stack_depth);
/* Set up function call stack */
emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
@ -723,71 +745,6 @@ emit_cond_jmp:
emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
break;
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
case BPF_LD | BPF_ABS | BPF_W:
case BPF_LD | BPF_ABS | BPF_H:
case BPF_LD | BPF_ABS | BPF_B:
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + src + imm)) */
case BPF_LD | BPF_IND | BPF_W:
case BPF_LD | BPF_IND | BPF_H:
case BPF_LD | BPF_IND | BPF_B:
{
const u8 r0 = bpf2a64[BPF_REG_0]; /* r0 = return value */
const u8 r6 = bpf2a64[BPF_REG_6]; /* r6 = pointer to sk_buff */
const u8 fp = bpf2a64[BPF_REG_FP];
const u8 r1 = bpf2a64[BPF_REG_1]; /* r1: struct sk_buff *skb */
const u8 r2 = bpf2a64[BPF_REG_2]; /* r2: int k */
const u8 r3 = bpf2a64[BPF_REG_3]; /* r3: unsigned int size */
const u8 r4 = bpf2a64[BPF_REG_4]; /* r4: void *buffer */
const u8 r5 = bpf2a64[BPF_REG_5]; /* r5: void *(*func)(...) */
int size;
emit(A64_MOV(1, r1, r6), ctx);
emit_a64_mov_i(0, r2, imm, ctx);
if (BPF_MODE(code) == BPF_IND)
emit(A64_ADD(0, r2, r2, src), ctx);
switch (BPF_SIZE(code)) {
case BPF_W:
size = 4;
break;
case BPF_H:
size = 2;
break;
case BPF_B:
size = 1;
break;
default:
return -EINVAL;
}
emit_a64_mov_i64(r3, size, ctx);
emit(A64_SUB_I(1, r4, fp, ctx->stack_size), ctx);
emit_a64_mov_i64(r5, (unsigned long)bpf_load_pointer, ctx);
emit(A64_BLR(r5), ctx);
emit(A64_MOV(1, r0, A64_R(0)), ctx);
jmp_offset = epilogue_offset(ctx);
check_imm19(jmp_offset);
emit(A64_CBZ(1, r0, jmp_offset), ctx);
emit(A64_MOV(1, r5, r0), ctx);
switch (BPF_SIZE(code)) {
case BPF_W:
emit(A64_LDR32(r0, r5, A64_ZR), ctx);
#ifndef CONFIG_CPU_BIG_ENDIAN
emit(A64_REV32(0, r0, r0), ctx);
#endif
break;
case BPF_H:
emit(A64_LDRH(r0, r5, A64_ZR), ctx);
#ifndef CONFIG_CPU_BIG_ENDIAN
emit(A64_REV16(0, r0, r0), ctx);
#endif
break;
case BPF_B:
emit(A64_LDRB(r0, r5, A64_ZR), ctx);
break;
}
break;
}
default:
pr_err_once("unknown opcode %02x\n", code);
return -EINVAL;
@ -851,6 +808,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
struct bpf_prog *tmp, *orig_prog = prog;
struct bpf_binary_header *header;
struct arm64_jit_data *jit_data;
bool was_classic = bpf_prog_was_classic(prog);
bool tmp_blinded = false;
bool extra_pass = false;
struct jit_ctx ctx;
@ -905,7 +863,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
goto out_off;
}
if (build_prologue(&ctx)) {
if (build_prologue(&ctx, was_classic)) {
prog = orig_prog;
goto out_off;
}
@ -928,7 +886,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
skip_init_ctx:
ctx.idx = 0;
build_prologue(&ctx);
build_prologue(&ctx, was_classic);
if (build_body(&ctx)) {
bpf_jit_binary_free(header);


@ -95,7 +95,6 @@ enum reg_val_type {
* struct jit_ctx - JIT context
* @skf: The sk_filter
* @stack_size: eBPF stack size
* @tmp_offset: eBPF $sp offset to 8-byte temporary memory
* @idx: Instruction index
* @flags: JIT flags
* @offsets: Instruction offsets
@ -105,7 +104,6 @@ enum reg_val_type {
struct jit_ctx {
const struct bpf_prog *skf;
int stack_size;
int tmp_offset;
u32 idx;
u32 flags;
u32 *offsets;
@ -293,7 +291,6 @@ static int gen_int_prologue(struct jit_ctx *ctx)
locals_size = (ctx->flags & EBPF_SEEN_FP) ? MAX_BPF_STACK : 0;
stack_adjust += locals_size;
ctx->tmp_offset = locals_size;
ctx->stack_size = stack_adjust;
@ -399,7 +396,6 @@ static void gen_imm_to_reg(const struct bpf_insn *insn, int reg,
emit_instr(ctx, lui, reg, upper >> 16);
emit_instr(ctx, addiu, reg, reg, lower);
}
}
static int gen_imm_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
@ -547,28 +543,6 @@ static int gen_imm_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
return 0;
}
static void * __must_check
ool_skb_header_pointer(const struct sk_buff *skb, int offset,
int len, void *buffer)
{
return skb_header_pointer(skb, offset, len, buffer);
}
static int size_to_len(const struct bpf_insn *insn)
{
switch (BPF_SIZE(insn->code)) {
case BPF_B:
return 1;
case BPF_H:
return 2;
case BPF_W:
return 4;
case BPF_DW:
return 8;
}
return 0;
}
static void emit_const_to_reg(struct jit_ctx *ctx, int dst, u64 value)
{
if (value >= 0xffffffffffff8000ull || value < 0x8000ull) {
@ -1267,110 +1241,6 @@ jeq_common:
return -EINVAL;
break;
case BPF_LD | BPF_B | BPF_ABS:
case BPF_LD | BPF_H | BPF_ABS:
case BPF_LD | BPF_W | BPF_ABS:
case BPF_LD | BPF_DW | BPF_ABS:
ctx->flags |= EBPF_SAVE_RA;
gen_imm_to_reg(insn, MIPS_R_A1, ctx);
emit_instr(ctx, addiu, MIPS_R_A2, MIPS_R_ZERO, size_to_len(insn));
if (insn->imm < 0) {
emit_const_to_reg(ctx, MIPS_R_T9, (u64)bpf_internal_load_pointer_neg_helper);
} else {
emit_const_to_reg(ctx, MIPS_R_T9, (u64)ool_skb_header_pointer);
emit_instr(ctx, daddiu, MIPS_R_A3, MIPS_R_SP, ctx->tmp_offset);
}
goto ld_skb_common;
case BPF_LD | BPF_B | BPF_IND:
case BPF_LD | BPF_H | BPF_IND:
case BPF_LD | BPF_W | BPF_IND:
case BPF_LD | BPF_DW | BPF_IND:
ctx->flags |= EBPF_SAVE_RA;
src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp);
if (src < 0)
return src;
ts = get_reg_val_type(ctx, this_idx, insn->src_reg);
if (ts == REG_32BIT_ZERO_EX) {
/* sign extend */
emit_instr(ctx, sll, MIPS_R_A1, src, 0);
src = MIPS_R_A1;
}
if (insn->imm >= S16_MIN && insn->imm <= S16_MAX) {
emit_instr(ctx, daddiu, MIPS_R_A1, src, insn->imm);
} else {
gen_imm_to_reg(insn, MIPS_R_AT, ctx);
emit_instr(ctx, daddu, MIPS_R_A1, MIPS_R_AT, src);
}
/* truncate to 32-bit int */
emit_instr(ctx, sll, MIPS_R_A1, MIPS_R_A1, 0);
emit_instr(ctx, daddiu, MIPS_R_A3, MIPS_R_SP, ctx->tmp_offset);
emit_instr(ctx, slt, MIPS_R_AT, MIPS_R_A1, MIPS_R_ZERO);
emit_const_to_reg(ctx, MIPS_R_T8, (u64)bpf_internal_load_pointer_neg_helper);
emit_const_to_reg(ctx, MIPS_R_T9, (u64)ool_skb_header_pointer);
emit_instr(ctx, addiu, MIPS_R_A2, MIPS_R_ZERO, size_to_len(insn));
emit_instr(ctx, movn, MIPS_R_T9, MIPS_R_T8, MIPS_R_AT);
ld_skb_common:
emit_instr(ctx, jalr, MIPS_R_RA, MIPS_R_T9);
/* delay slot move */
emit_instr(ctx, daddu, MIPS_R_A0, MIPS_R_S0, MIPS_R_ZERO);
/* Check the error value */
b_off = b_imm(exit_idx, ctx);
if (is_bad_offset(b_off)) {
target = j_target(ctx, exit_idx);
if (target == (unsigned int)-1)
return -E2BIG;
if (!(ctx->offsets[this_idx] & OFFSETS_B_CONV)) {
ctx->offsets[this_idx] |= OFFSETS_B_CONV;
ctx->long_b_conversion = 1;
}
emit_instr(ctx, bne, MIPS_R_V0, MIPS_R_ZERO, 4 * 3);
emit_instr(ctx, nop);
emit_instr(ctx, j, target);
emit_instr(ctx, nop);
} else {
emit_instr(ctx, beq, MIPS_R_V0, MIPS_R_ZERO, b_off);
emit_instr(ctx, nop);
}
#ifdef __BIG_ENDIAN
need_swap = false;
#else
need_swap = true;
#endif
dst = MIPS_R_V0;
switch (BPF_SIZE(insn->code)) {
case BPF_B:
emit_instr(ctx, lbu, dst, 0, MIPS_R_V0);
break;
case BPF_H:
emit_instr(ctx, lhu, dst, 0, MIPS_R_V0);
if (need_swap)
emit_instr(ctx, wsbh, dst, dst);
break;
case BPF_W:
emit_instr(ctx, lw, dst, 0, MIPS_R_V0);
if (need_swap) {
emit_instr(ctx, wsbh, dst, dst);
emit_instr(ctx, rotr, dst, dst, 16);
}
break;
case BPF_DW:
emit_instr(ctx, ld, dst, 0, MIPS_R_V0);
if (need_swap) {
emit_instr(ctx, dsbh, dst, dst);
emit_instr(ctx, dshd, dst, dst);
}
break;
}
break;
case BPF_ALU | BPF_END | BPF_FROM_BE:
case BPF_ALU | BPF_END | BPF_FROM_LE:
dst = ebpf_to_mips_reg(ctx, insn, dst_reg);

@@ -3,7 +3,7 @@
# Arch-specific network modules
#
ifeq ($(CONFIG_PPC64),y)
obj-$(CONFIG_BPF_JIT) += bpf_jit_asm64.o bpf_jit_comp64.o
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp64.o
else
obj-$(CONFIG_BPF_JIT) += bpf_jit_asm.o bpf_jit_comp.o
endif

@@ -20,7 +20,7 @@
* with our redzone usage.
*
* [ prev sp ] <-------------
* [ nv gpr save area ] 8*8 |
* [ nv gpr save area ] 6*8 |
* [ tail_call_cnt ] 8 |
* [ local_tmp_var ] 8 |
* fp (r31) --> [ ebpf stack space ] upto 512 |
@@ -28,8 +28,8 @@
* sp (r1) ---> [ stack pointer ] --------------
*/
/* for gpr non volatile registers BPG_REG_6 to 10, plus skb cache registers */
#define BPF_PPC_STACK_SAVE (8*8)
/* for gpr non volatile registers BPG_REG_6 to 10 */
#define BPF_PPC_STACK_SAVE (6*8)
/* for bpf JIT code internal usage */
#define BPF_PPC_STACK_LOCALS 16
/* stack frame excluding BPF stack, ensure this is quadword aligned */
@@ -39,10 +39,8 @@
#ifndef __ASSEMBLY__
/* BPF register usage */
#define SKB_HLEN_REG (MAX_BPF_JIT_REG + 0)
#define SKB_DATA_REG (MAX_BPF_JIT_REG + 1)
#define TMP_REG_1 (MAX_BPF_JIT_REG + 2)
#define TMP_REG_2 (MAX_BPF_JIT_REG + 3)
#define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
#define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
/* BPF to ppc register mappings */
static const int b2p[] = {
@@ -63,40 +61,23 @@ static const int b2p[] = {
[BPF_REG_FP] = 31,
/* eBPF jit internal registers */
[BPF_REG_AX] = 2,
[SKB_HLEN_REG] = 25,
[SKB_DATA_REG] = 26,
[TMP_REG_1] = 9,
[TMP_REG_2] = 10
};
/* PPC NVR range -- update this if we ever use NVRs below r24 */
#define BPF_PPC_NVR_MIN 24
/* Assembly helpers */
#define DECLARE_LOAD_FUNC(func) u64 func(u64 r3, u64 r4); \
u64 func##_negative_offset(u64 r3, u64 r4); \
u64 func##_positive_offset(u64 r3, u64 r4);
DECLARE_LOAD_FUNC(sk_load_word);
DECLARE_LOAD_FUNC(sk_load_half);
DECLARE_LOAD_FUNC(sk_load_byte);
#define CHOOSE_LOAD_FUNC(imm, func) \
(imm < 0 ? \
(imm >= SKF_LL_OFF ? func##_negative_offset : func) : \
func##_positive_offset)
/* PPC NVR range -- update this if we ever use NVRs below r27 */
#define BPF_PPC_NVR_MIN 27
#define SEEN_FUNC 0x1000 /* might call external helpers */
#define SEEN_STACK 0x2000 /* uses BPF stack */
#define SEEN_SKB 0x4000 /* uses sk_buff */
#define SEEN_TAILCALL 0x8000 /* uses tail calls */
#define SEEN_TAILCALL 0x4000 /* uses tail calls */
struct codegen_context {
/*
* This is used to track register usage as well
* as calls to external helpers.
* - register usage is tracked with corresponding
* bits (r3-r10 and r25-r31)
* bits (r3-r10 and r27-r31)
* - rest of the bits can be used to track other
* things -- for now, we use bits 16 to 23
* encoded in SEEN_* macros above

@@ -1,180 +0,0 @@
/*
* bpf_jit_asm64.S: Packet/header access helper functions
* for PPC64 BPF compiler.
*
* Copyright 2016, Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
* IBM Corporation
*
* Based on bpf_jit_asm.S by Matt Evans
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; version 2
* of the License.
*/
#include <asm/ppc_asm.h>
#include <asm/ptrace.h>
#include "bpf_jit64.h"
/*
* All of these routines are called directly from generated code,
* with the below register usage:
* r27 skb pointer (ctx)
* r25 skb header length
* r26 skb->data pointer
* r4 offset
*
* Result is passed back in:
* r8 data read in host endian format (accumulator)
*
* r9 is used as a temporary register
*/
#define r_skb r27
#define r_hlen r25
#define r_data r26
#define r_off r4
#define r_val r8
#define r_tmp r9
_GLOBAL_TOC(sk_load_word)
cmpdi r_off, 0
blt bpf_slow_path_word_neg
b sk_load_word_positive_offset
_GLOBAL_TOC(sk_load_word_positive_offset)
/* Are we accessing past headlen? */
subi r_tmp, r_hlen, 4
cmpd r_tmp, r_off
blt bpf_slow_path_word
/* Nope, just hitting the header. cr0 here is eq or gt! */
LWZX_BE r_val, r_data, r_off
blr /* Return success, cr0 != LT */
_GLOBAL_TOC(sk_load_half)
cmpdi r_off, 0
blt bpf_slow_path_half_neg
b sk_load_half_positive_offset
_GLOBAL_TOC(sk_load_half_positive_offset)
subi r_tmp, r_hlen, 2
cmpd r_tmp, r_off
blt bpf_slow_path_half
LHZX_BE r_val, r_data, r_off
blr
_GLOBAL_TOC(sk_load_byte)
cmpdi r_off, 0
blt bpf_slow_path_byte_neg
b sk_load_byte_positive_offset
_GLOBAL_TOC(sk_load_byte_positive_offset)
cmpd r_hlen, r_off
ble bpf_slow_path_byte
lbzx r_val, r_data, r_off
blr
/*
* Call out to skb_copy_bits:
* Allocate a new stack frame here to remain ABI-compliant in
* stashing LR.
*/
#define bpf_slow_path_common(SIZE) \
mflr r0; \
std r0, PPC_LR_STKOFF(r1); \
stdu r1, -(STACK_FRAME_MIN_SIZE + BPF_PPC_STACK_LOCALS)(r1); \
mr r3, r_skb; \
/* r4 = r_off as passed */ \
addi r5, r1, STACK_FRAME_MIN_SIZE; \
li r6, SIZE; \
bl skb_copy_bits; \
nop; \
/* save r5 */ \
addi r5, r1, STACK_FRAME_MIN_SIZE; \
/* r3 = 0 on success */ \
addi r1, r1, STACK_FRAME_MIN_SIZE + BPF_PPC_STACK_LOCALS; \
ld r0, PPC_LR_STKOFF(r1); \
mtlr r0; \
cmpdi r3, 0; \
blt bpf_error; /* cr0 = LT */
bpf_slow_path_word:
bpf_slow_path_common(4)
/* Data value is on stack, and cr0 != LT */
LWZX_BE r_val, 0, r5
blr
bpf_slow_path_half:
bpf_slow_path_common(2)
LHZX_BE r_val, 0, r5
blr
bpf_slow_path_byte:
bpf_slow_path_common(1)
lbzx r_val, 0, r5
blr
/*
* Call out to bpf_internal_load_pointer_neg_helper
*/
#define sk_negative_common(SIZE) \
mflr r0; \
std r0, PPC_LR_STKOFF(r1); \
stdu r1, -STACK_FRAME_MIN_SIZE(r1); \
mr r3, r_skb; \
/* r4 = r_off, as passed */ \
li r5, SIZE; \
bl bpf_internal_load_pointer_neg_helper; \
nop; \
addi r1, r1, STACK_FRAME_MIN_SIZE; \
ld r0, PPC_LR_STKOFF(r1); \
mtlr r0; \
/* R3 != 0 on success */ \
cmpldi r3, 0; \
beq bpf_error_slow; /* cr0 = EQ */
bpf_slow_path_word_neg:
lis r_tmp, -32 /* SKF_LL_OFF */
cmpd r_off, r_tmp /* addr < SKF_* */
blt bpf_error /* cr0 = LT */
b sk_load_word_negative_offset
_GLOBAL_TOC(sk_load_word_negative_offset)
sk_negative_common(4)
LWZX_BE r_val, 0, r3
blr
bpf_slow_path_half_neg:
lis r_tmp, -32 /* SKF_LL_OFF */
cmpd r_off, r_tmp /* addr < SKF_* */
blt bpf_error /* cr0 = LT */
b sk_load_half_negative_offset
_GLOBAL_TOC(sk_load_half_negative_offset)
sk_negative_common(2)
LHZX_BE r_val, 0, r3
blr
bpf_slow_path_byte_neg:
lis r_tmp, -32 /* SKF_LL_OFF */
cmpd r_off, r_tmp /* addr < SKF_* */
blt bpf_error /* cr0 = LT */
b sk_load_byte_negative_offset
_GLOBAL_TOC(sk_load_byte_negative_offset)
sk_negative_common(1)
lbzx r_val, 0, r3
blr
bpf_error_slow:
/* fabricate a cr0 = lt */
li r_tmp, -1
cmpdi r_tmp, 0
bpf_error:
/*
* Entered with cr0 = lt
* Generated code will 'blt epilogue', returning 0.
*/
li r_val, 0
blr

@@ -59,7 +59,7 @@ static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
* [ prev sp ] <-------------
* [ ... ] |
* sp (r1) ---> [ stack pointer ] --------------
* [ nv gpr save area ] 8*8
* [ nv gpr save area ] 6*8
* [ tail_call_cnt ] 8
* [ local_tmp_var ] 8
* [ unused red zone ] 208 bytes protected
@@ -88,21 +88,6 @@ static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
BUG();
}
static void bpf_jit_emit_skb_loads(u32 *image, struct codegen_context *ctx)
{
/*
* Load skb->len and skb->data_len
* r3 points to skb
*/
PPC_LWZ(b2p[SKB_HLEN_REG], 3, offsetof(struct sk_buff, len));
PPC_LWZ(b2p[TMP_REG_1], 3, offsetof(struct sk_buff, data_len));
/* header_len = len - data_len */
PPC_SUB(b2p[SKB_HLEN_REG], b2p[SKB_HLEN_REG], b2p[TMP_REG_1]);
/* skb->data pointer */
PPC_BPF_LL(b2p[SKB_DATA_REG], 3, offsetof(struct sk_buff, data));
}
static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
{
int i;
@@ -145,18 +130,6 @@ static void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
if (bpf_is_seen_register(ctx, i))
PPC_BPF_STL(b2p[i], 1, bpf_jit_stack_offsetof(ctx, b2p[i]));
/*
* Save additional non-volatile regs if we cache skb
* Also, setup skb data
*/
if (ctx->seen & SEEN_SKB) {
PPC_BPF_STL(b2p[SKB_HLEN_REG], 1,
bpf_jit_stack_offsetof(ctx, b2p[SKB_HLEN_REG]));
PPC_BPF_STL(b2p[SKB_DATA_REG], 1,
bpf_jit_stack_offsetof(ctx, b2p[SKB_DATA_REG]));
bpf_jit_emit_skb_loads(image, ctx);
}
/* Setup frame pointer to point to the bpf stack area */
if (bpf_is_seen_register(ctx, BPF_REG_FP))
PPC_ADDI(b2p[BPF_REG_FP], 1,
@@ -172,14 +145,6 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx
if (bpf_is_seen_register(ctx, i))
PPC_BPF_LL(b2p[i], 1, bpf_jit_stack_offsetof(ctx, b2p[i]));
/* Restore non-volatile registers used for skb cache */
if (ctx->seen & SEEN_SKB) {
PPC_BPF_LL(b2p[SKB_HLEN_REG], 1,
bpf_jit_stack_offsetof(ctx, b2p[SKB_HLEN_REG]));
PPC_BPF_LL(b2p[SKB_DATA_REG], 1,
bpf_jit_stack_offsetof(ctx, b2p[SKB_DATA_REG]));
}
/* Tear down our stack frame */
if (bpf_has_stack_frame(ctx)) {
PPC_ADDI(1, 1, BPF_PPC_STACKFRAME + ctx->stack_size);
@@ -202,25 +167,37 @@ static void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
static void bpf_jit_emit_func_call(u32 *image, struct codegen_context *ctx, u64 func)
{
unsigned int i, ctx_idx = ctx->idx;
/* Load function address into r12 */
PPC_LI64(12, func);
/* For bpf-to-bpf function calls, the callee's address is unknown
* until the last extra pass. As seen above, we use PPC_LI64() to
* load the callee's address, but this may optimize the number of
* instructions required based on the nature of the address.
*
* Since we don't want the number of instructions emitted to change,
* we pad the optimized PPC_LI64() call with NOPs to guarantee that
* we always have a five-instruction sequence, which is the maximum
* that PPC_LI64() can emit.
*/
for (i = ctx->idx - ctx_idx; i < 5; i++)
PPC_NOP();
#ifdef PPC64_ELF_ABI_v1
/* func points to the function descriptor */
PPC_LI64(b2p[TMP_REG_2], func);
/* Load actual entry point from function descriptor */
PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_2], 0);
/* ... and move it to LR */
PPC_MTLR(b2p[TMP_REG_1]);
/*
* Load TOC from function descriptor at offset 8.
* We can clobber r2 since we get called through a
* function pointer (so caller will save/restore r2)
* and since we don't use a TOC ourself.
*/
PPC_BPF_LL(2, b2p[TMP_REG_2], 8);
#else
/* We can clobber r12 */
PPC_FUNC_ADDR(12, func);
PPC_MTLR(12);
PPC_BPF_LL(2, 12, 8);
/* Load actual entry point from function descriptor */
PPC_BPF_LL(12, 12, 0);
#endif
PPC_MTLR(12);
PPC_BLRL();
}
@@ -291,7 +268,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32
/* Assemble the body code between the prologue & epilogue */
static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
struct codegen_context *ctx,
u32 *addrs)
u32 *addrs, bool extra_pass)
{
const struct bpf_insn *insn = fp->insnsi;
int flen = fp->len;
@@ -747,29 +724,30 @@ emit_clear:
break;
/*
* Call kernel helper
* Call kernel helper or bpf function
*/
case BPF_JMP | BPF_CALL:
ctx->seen |= SEEN_FUNC;
func = (u8 *) __bpf_call_base + imm;
/* Save skb pointer if we need to re-cache skb data */
if ((ctx->seen & SEEN_SKB) &&
bpf_helper_changes_pkt_data(func))
PPC_BPF_STL(3, 1, bpf_jit_stack_local(ctx));
/* bpf function call */
if (insn[i].src_reg == BPF_PSEUDO_CALL)
if (!extra_pass)
func = NULL;
else if (fp->aux->func && off < fp->aux->func_cnt)
/* use the subprog id from the off
* field to lookup the callee address
*/
func = (u8 *) fp->aux->func[off]->bpf_func;
else
return -EINVAL;
/* kernel helper call */
else
func = (u8 *) __bpf_call_base + imm;
bpf_jit_emit_func_call(image, ctx, (u64)func);
/* move return value from r3 to BPF_REG_0 */
PPC_MR(b2p[BPF_REG_0], 3);
/* refresh skb cache */
if ((ctx->seen & SEEN_SKB) &&
bpf_helper_changes_pkt_data(func)) {
/* reload skb pointer to r3 */
PPC_BPF_LL(3, 1, bpf_jit_stack_local(ctx));
bpf_jit_emit_skb_loads(image, ctx);
}
break;
/*
@@ -886,65 +864,6 @@ cond_branch:
PPC_BCC(true_cond, addrs[i + 1 + off]);
break;
/*
* Loads from packet header/data
* Assume 32-bit input value in imm and X (src_reg)
*/
/* Absolute loads */
case BPF_LD | BPF_W | BPF_ABS:
func = (u8 *)CHOOSE_LOAD_FUNC(imm, sk_load_word);
goto common_load_abs;
case BPF_LD | BPF_H | BPF_ABS:
func = (u8 *)CHOOSE_LOAD_FUNC(imm, sk_load_half);
goto common_load_abs;
case BPF_LD | BPF_B | BPF_ABS:
func = (u8 *)CHOOSE_LOAD_FUNC(imm, sk_load_byte);
common_load_abs:
/*
* Load from [imm]
* Load into r4, which can just be passed onto
* skb load helpers as the second parameter
*/
PPC_LI32(4, imm);
goto common_load;
/* Indirect loads */
case BPF_LD | BPF_W | BPF_IND:
func = (u8 *)sk_load_word;
goto common_load_ind;
case BPF_LD | BPF_H | BPF_IND:
func = (u8 *)sk_load_half;
goto common_load_ind;
case BPF_LD | BPF_B | BPF_IND:
func = (u8 *)sk_load_byte;
common_load_ind:
/*
* Load from [src_reg + imm]
* Treat src_reg as a 32-bit value
*/
PPC_EXTSW(4, src_reg);
if (imm) {
if (imm >= -32768 && imm < 32768)
PPC_ADDI(4, 4, IMM_L(imm));
else {
PPC_LI32(b2p[TMP_REG_1], imm);
PPC_ADD(4, 4, b2p[TMP_REG_1]);
}
}
common_load:
ctx->seen |= SEEN_SKB;
ctx->seen |= SEEN_FUNC;
bpf_jit_emit_func_call(image, ctx, (u64)func);
/*
* Helper returns 'lt' condition on error, and an
* appropriate return value in BPF_REG_0
*/
PPC_BCC(COND_LT, exit_addr);
break;
/*
* Tail call
*/
@@ -971,6 +890,14 @@ common_load:
return 0;
}
struct powerpc64_jit_data {
struct bpf_binary_header *header;
u32 *addrs;
u8 *image;
u32 proglen;
struct codegen_context ctx;
};
struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
{
u32 proglen;
@@ -978,6 +905,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
u8 *image = NULL;
u32 *code_base;
u32 *addrs;
struct powerpc64_jit_data *jit_data;
struct codegen_context cgctx;
int pass;
int flen;
@@ -985,6 +913,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
struct bpf_prog *org_fp = fp;
struct bpf_prog *tmp_fp;
bool bpf_blinded = false;
bool extra_pass = false;
if (!fp->jit_requested)
return org_fp;
@@ -998,11 +927,32 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
fp = tmp_fp;
}
jit_data = fp->aux->jit_data;
if (!jit_data) {
jit_data = kzalloc(sizeof(*jit_data), GFP_KERNEL);
if (!jit_data) {
fp = org_fp;
goto out;
}
fp->aux->jit_data = jit_data;
}
flen = fp->len;
addrs = jit_data->addrs;
if (addrs) {
cgctx = jit_data->ctx;
image = jit_data->image;
bpf_hdr = jit_data->header;
proglen = jit_data->proglen;
alloclen = proglen + FUNCTION_DESCR_SIZE;
extra_pass = true;
goto skip_init_ctx;
}
addrs = kzalloc((flen+1) * sizeof(*addrs), GFP_KERNEL);
if (addrs == NULL) {
fp = org_fp;
goto out;
goto out_addrs;
}
memset(&cgctx, 0, sizeof(struct codegen_context));
@@ -1011,10 +961,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
/* Scouting faux-generate pass 0 */
if (bpf_jit_build_body(fp, 0, &cgctx, addrs)) {
if (bpf_jit_build_body(fp, 0, &cgctx, addrs, false)) {
/* We hit something illegal or unsupported. */
fp = org_fp;
goto out;
goto out_addrs;
}
/*
@@ -1032,9 +982,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
bpf_jit_fill_ill_insns);
if (!bpf_hdr) {
fp = org_fp;
goto out;
goto out_addrs;
}
skip_init_ctx:
code_base = (u32 *)(image + FUNCTION_DESCR_SIZE);
/* Code generation passes 1-2 */
@@ -1042,7 +993,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
/* Now build the prologue, body code & epilogue for real. */
cgctx.idx = 0;
bpf_jit_build_prologue(code_base, &cgctx);
bpf_jit_build_body(fp, code_base, &cgctx, addrs);
bpf_jit_build_body(fp, code_base, &cgctx, addrs, extra_pass);
bpf_jit_build_epilogue(code_base, &cgctx);
if (bpf_jit_enable > 1)
@@ -1068,10 +1019,20 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
fp->jited_len = alloclen;
bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + (bpf_hdr->pages * PAGE_SIZE));
if (!fp->is_func || extra_pass) {
out_addrs:
kfree(addrs);
kfree(jit_data);
fp->aux->jit_data = NULL;
} else {
jit_data->addrs = addrs;
jit_data->ctx = cgctx;
jit_data->proglen = proglen;
jit_data->image = image;
jit_data->header = bpf_hdr;
}
out:
kfree(addrs);
if (bpf_blinded)
bpf_jit_prog_release_other(fp, fp == org_fp ? tmp_fp : org_fp);

@@ -2,5 +2,5 @@
#
# Arch-specific network modules
#
obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o
obj-$(CONFIG_HAVE_PNETID) += pnet.o

@@ -1,120 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* BPF Jit compiler for s390, help functions.
*
* Copyright IBM Corp. 2012,2015
*
* Author(s): Martin Schwidefsky <schwidefsky@de.ibm.com>
* Michael Holzheu <holzheu@linux.vnet.ibm.com>
*/
#include <linux/linkage.h>
#include <asm/nospec-insn.h>
#include "bpf_jit.h"
/*
* Calling convention:
* registers %r7-%r10, %r11,%r13, and %r15 are call saved
*
* Input (64 bit):
* %r3 (%b2) = offset into skb data
* %r6 (%b5) = return address
* %r7 (%b6) = skb pointer
* %r12 = skb data pointer
*
* Output:
* %r14= %b0 = return value (read skb value)
*
* Work registers: %r2,%r4,%r5,%r14
*
* skb_copy_bits takes 4 parameters:
* %r2 = skb pointer
* %r3 = offset into skb data
* %r4 = pointer to temp buffer
* %r5 = length to copy
* Return value in %r2: 0 = ok
*
* bpf_internal_load_pointer_neg_helper takes 3 parameters:
* %r2 = skb pointer
* %r3 = offset into data
* %r4 = length to copy
* Return value in %r2: Pointer to data
*/
#define SKF_MAX_NEG_OFF -0x200000 /* SKF_LL_OFF from filter.h */
/*
* Load SIZE bytes from SKB
*/
#define sk_load_common(NAME, SIZE, LOAD) \
ENTRY(sk_load_##NAME); \
ltgr %r3,%r3; /* Is offset negative? */ \
jl sk_load_##NAME##_slow_neg; \
ENTRY(sk_load_##NAME##_pos); \
aghi %r3,SIZE; /* Offset + SIZE */ \
clg %r3,STK_OFF_HLEN(%r15); /* Offset + SIZE > hlen? */ \
jh sk_load_##NAME##_slow; \
LOAD %r14,-SIZE(%r3,%r12); /* Get data from skb */ \
B_EX OFF_OK,%r6; /* Return */ \
\
sk_load_##NAME##_slow:; \
lgr %r2,%r7; /* Arg1 = skb pointer */ \
aghi %r3,-SIZE; /* Arg2 = offset */ \
la %r4,STK_OFF_TMP(%r15); /* Arg3 = temp bufffer */ \
lghi %r5,SIZE; /* Arg4 = size */ \
brasl %r14,skb_copy_bits; /* Get data from skb */ \
LOAD %r14,STK_OFF_TMP(%r15); /* Load from temp bufffer */ \
ltgr %r2,%r2; /* Set cc to (%r2 != 0) */ \
BR_EX %r6; /* Return */
sk_load_common(word, 4, llgf) /* r14 = *(u32 *) (skb->data+offset) */
sk_load_common(half, 2, llgh) /* r14 = *(u16 *) (skb->data+offset) */
GEN_BR_THUNK %r6
GEN_B_THUNK OFF_OK,%r6
/*
* Load 1 byte from SKB (optimized version)
*/
/* r14 = *(u8 *) (skb->data+offset) */
ENTRY(sk_load_byte)
ltgr %r3,%r3 # Is offset negative?
jl sk_load_byte_slow_neg
ENTRY(sk_load_byte_pos)
clg %r3,STK_OFF_HLEN(%r15) # Offset >= hlen?
jnl sk_load_byte_slow
llgc %r14,0(%r3,%r12) # Get byte from skb
B_EX OFF_OK,%r6 # Return OK
sk_load_byte_slow:
lgr %r2,%r7 # Arg1 = skb pointer
# Arg2 = offset
la %r4,STK_OFF_TMP(%r15) # Arg3 = pointer to temp buffer
lghi %r5,1 # Arg4 = size (1 byte)
brasl %r14,skb_copy_bits # Get data from skb
llgc %r14,STK_OFF_TMP(%r15) # Load result from temp buffer
ltgr %r2,%r2 # Set cc to (%r2 != 0)
BR_EX %r6 # Return cc
#define sk_negative_common(NAME, SIZE, LOAD) \
sk_load_##NAME##_slow_neg:; \
cgfi %r3,SKF_MAX_NEG_OFF; \
jl bpf_error; \
lgr %r2,%r7; /* Arg1 = skb pointer */ \
/* Arg2 = offset */ \
lghi %r4,SIZE; /* Arg3 = size */ \
brasl %r14,bpf_internal_load_pointer_neg_helper; \
ltgr %r2,%r2; \
jz bpf_error; \
LOAD %r14,0(%r2); /* Get data from pointer */ \
xr %r3,%r3; /* Set cc to zero */ \
BR_EX %r6; /* Return cc */
sk_negative_common(word, 4, llgf)
sk_negative_common(half, 2, llgh)
sk_negative_common(byte, 1, llgc)
bpf_error:
# force a return 0 from jit handler
ltgr %r15,%r15 # Set condition code
BR_EX %r6

@@ -16,9 +16,6 @@
#include <linux/filter.h>
#include <linux/types.h>
extern u8 sk_load_word_pos[], sk_load_half_pos[], sk_load_byte_pos[];
extern u8 sk_load_word[], sk_load_half[], sk_load_byte[];
#endif /* __ASSEMBLY__ */
/*
@@ -36,15 +33,6 @@ extern u8 sk_load_word[], sk_load_half[], sk_load_byte[];
* | | |
* | BPF stack | |
* | | |
* +---------------+ |
* | 8 byte skbp | |
* R15+176 -> +---------------+ |
* | 8 byte hlen | |
* R15+168 -> +---------------+ |
* | 4 byte align | |
* +---------------+ |
* | 4 byte temp | |
* | for bpf_jit.S | |
* R15+160 -> +---------------+ |
* | new backchain | |
* R15+152 -> +---------------+ |
@@ -57,17 +45,11 @@ extern u8 sk_load_word[], sk_load_half[], sk_load_byte[];
* The stack size used by the BPF program ("BPF stack" above) is passed
* via "aux->stack_depth".
*/
#define STK_SPACE_ADD (8 + 8 + 4 + 4 + 160)
#define STK_SPACE_ADD (160)
#define STK_160_UNUSED (160 - 12 * 8)
#define STK_OFF (STK_SPACE_ADD - STK_160_UNUSED)
#define STK_OFF_TMP 160 /* Offset of tmp buffer on stack */
#define STK_OFF_HLEN 168 /* Offset of SKB header length on stack */
#define STK_OFF_SKBP 176 /* Offset of SKB pointer on stack */
#define STK_OFF_R6 (160 - 11 * 8) /* Offset of r6 on stack */
#define STK_OFF_TCCNT (160 - 12 * 8) /* Offset of tail_call_cnt on stack */
/* Offset to skip condition code check */
#define OFF_OK 4
#endif /* __ARCH_S390_NET_BPF_JIT_H */

@@ -51,23 +51,21 @@ struct bpf_jit {
#define BPF_SIZE_MAX 0xffff /* Max size for program (16 bit branches) */
#define SEEN_SKB 1 /* skb access */
#define SEEN_MEM 2 /* use mem[] for temporary storage */
#define SEEN_RET0 4 /* ret0_ip points to a valid return 0 */
#define SEEN_LITERAL 8 /* code uses literals */
#define SEEN_FUNC 16 /* calls C functions */
#define SEEN_TAIL_CALL 32 /* code uses tail calls */
#define SEEN_REG_AX 64 /* code uses constant blinding */
#define SEEN_STACK (SEEN_FUNC | SEEN_MEM | SEEN_SKB)
#define SEEN_MEM (1 << 0) /* use mem[] for temporary storage */
#define SEEN_RET0 (1 << 1) /* ret0_ip points to a valid return 0 */
#define SEEN_LITERAL (1 << 2) /* code uses literals */
#define SEEN_FUNC (1 << 3) /* calls C functions */
#define SEEN_TAIL_CALL (1 << 4) /* code uses tail calls */
#define SEEN_REG_AX (1 << 5) /* code uses constant blinding */
#define SEEN_STACK (SEEN_FUNC | SEEN_MEM)
/*
* s390 registers
*/
#define REG_W0 (MAX_BPF_JIT_REG + 0) /* Work register 1 (even) */
#define REG_W1 (MAX_BPF_JIT_REG + 1) /* Work register 2 (odd) */
#define REG_SKB_DATA (MAX_BPF_JIT_REG + 2) /* SKB data register */
#define REG_L (MAX_BPF_JIT_REG + 3) /* Literal pool register */
#define REG_15 (MAX_BPF_JIT_REG + 4) /* Register 15 */
#define REG_L (MAX_BPF_JIT_REG + 2) /* Literal pool register */
#define REG_15 (MAX_BPF_JIT_REG + 3) /* Register 15 */
#define REG_0 REG_W0 /* Register 0 */
#define REG_1 REG_W1 /* Register 1 */
#define REG_2 BPF_REG_1 /* Register 2 */
@@ -92,10 +90,8 @@ static const int reg2hex[] = {
[BPF_REG_9] = 10,
/* BPF stack pointer */
[BPF_REG_FP] = 13,
/* Register for blinding (shared with REG_SKB_DATA) */
/* Register for blinding */
[BPF_REG_AX] = 12,
/* SKB data pointer */
[REG_SKB_DATA] = 12,
/* Work registers for s390x backend */
[REG_W0] = 0,
[REG_W1] = 1,
@@ -401,27 +397,6 @@ static void save_restore_regs(struct bpf_jit *jit, int op, u32 stack_depth)
} while (re <= 15);
}
/*
* For SKB access %b1 contains the SKB pointer. For "bpf_jit.S"
* we store the SKB header length on the stack and the SKB data
* pointer in REG_SKB_DATA if BPF_REG_AX is not used.
*/
static void emit_load_skb_data_hlen(struct bpf_jit *jit)
{
/* Header length: llgf %w1,<len>(%b1) */
EMIT6_DISP_LH(0xe3000000, 0x0016, REG_W1, REG_0, BPF_REG_1,
offsetof(struct sk_buff, len));
/* s %w1,<data_len>(%b1) */
EMIT4_DISP(0x5b000000, REG_W1, BPF_REG_1,
offsetof(struct sk_buff, data_len));
/* stg %w1,ST_OFF_HLEN(%r0,%r15) */
EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W1, REG_0, REG_15, STK_OFF_HLEN);
if (!(jit->seen & SEEN_REG_AX))
/* lg %skb_data,data_off(%b1) */
EMIT6_DISP_LH(0xe3000000, 0x0004, REG_SKB_DATA, REG_0,
BPF_REG_1, offsetof(struct sk_buff, data));
}
/*
* Emit function prologue
*
@@ -462,12 +437,6 @@ static void bpf_jit_prologue(struct bpf_jit *jit, u32 stack_depth)
EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W1, REG_0,
REG_15, 152);
}
if (jit->seen & SEEN_SKB) {
emit_load_skb_data_hlen(jit);
/* stg %b1,ST_OFF_SKBP(%r0,%r15) */
EMIT6_DISP_LH(0xe3000000, 0x0024, BPF_REG_1, REG_0, REG_15,
STK_OFF_SKBP);
}
}
/*
@@ -537,12 +506,12 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
{
struct bpf_insn *insn = &fp->insnsi[i];
int jmp_off, last, insn_count = 1;
unsigned int func_addr, mask;
u32 dst_reg = insn->dst_reg;
u32 src_reg = insn->src_reg;
u32 *addrs = jit->addrs;
s32 imm = insn->imm;
s16 off = insn->off;
unsigned int mask;
if (dst_reg == BPF_REG_AX || src_reg == BPF_REG_AX)
jit->seen |= SEEN_REG_AX;
@@ -1029,13 +998,6 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
}
/* lgr %b0,%r2: load return value into %b0 */
EMIT4(0xb9040000, BPF_REG_0, REG_2);
if ((jit->seen & SEEN_SKB) &&
bpf_helper_changes_pkt_data((void *)func)) {
/* lg %b1,ST_OFF_SKBP(%r15) */
EMIT6_DISP_LH(0xe3000000, 0x0004, BPF_REG_1, REG_0,
REG_15, STK_OFF_SKBP);
emit_load_skb_data_hlen(jit);
}
break;
}
case BPF_JMP | BPF_TAIL_CALL:
@@ -1235,73 +1197,6 @@ branch_oc:
jmp_off = addrs[i + off + 1] - (addrs[i + 1] - 4);
EMIT4_PCREL(0xa7040000 | mask << 8, jmp_off);
break;
/*
* BPF_LD
*/
case BPF_LD | BPF_ABS | BPF_B: /* b0 = *(u8 *) (skb->data+imm) */
case BPF_LD | BPF_IND | BPF_B: /* b0 = *(u8 *) (skb->data+imm+src) */
if ((BPF_MODE(insn->code) == BPF_ABS) && (imm >= 0))
func_addr = __pa(sk_load_byte_pos);
else
func_addr = __pa(sk_load_byte);
goto call_fn;
case BPF_LD | BPF_ABS | BPF_H: /* b0 = *(u16 *) (skb->data+imm) */
case BPF_LD | BPF_IND | BPF_H: /* b0 = *(u16 *) (skb->data+imm+src) */
if ((BPF_MODE(insn->code) == BPF_ABS) && (imm >= 0))
func_addr = __pa(sk_load_half_pos);
else
func_addr = __pa(sk_load_half);
goto call_fn;
case BPF_LD | BPF_ABS | BPF_W: /* b0 = *(u32 *) (skb->data+imm) */
case BPF_LD | BPF_IND | BPF_W: /* b0 = *(u32 *) (skb->data+imm+src) */
if ((BPF_MODE(insn->code) == BPF_ABS) && (imm >= 0))
func_addr = __pa(sk_load_word_pos);
else
func_addr = __pa(sk_load_word);
goto call_fn;
call_fn:
jit->seen |= SEEN_SKB | SEEN_RET0 | SEEN_FUNC;
REG_SET_SEEN(REG_14); /* Return address of possible func call */
/*
* Implicit input:
* BPF_REG_6 (R7) : skb pointer
* REG_SKB_DATA (R12): skb data pointer (if no BPF_REG_AX)
*
* Calculated input:
* BPF_REG_2 (R3) : offset of byte(s) to fetch in skb
* BPF_REG_5 (R6) : return address
*
* Output:
* BPF_REG_0 (R14): data read from skb
*
* Scratch registers (BPF_REG_1-5)
*/
/* Call function: llilf %w1,func_addr */
EMIT6_IMM(0xc00f0000, REG_W1, func_addr);
/* Offset: lgfi %b2,imm */
EMIT6_IMM(0xc0010000, BPF_REG_2, imm);
if (BPF_MODE(insn->code) == BPF_IND)
/* agfr %b2,%src (%src is s32 here) */
EMIT4(0xb9180000, BPF_REG_2, src_reg);
/* Reload REG_SKB_DATA if BPF_REG_AX is used */
if (jit->seen & SEEN_REG_AX)
/* lg %skb_data,data_off(%b6) */
EMIT6_DISP_LH(0xe3000000, 0x0004, REG_SKB_DATA, REG_0,
BPF_REG_6, offsetof(struct sk_buff, data));
/* basr %b5,%w1 (%b5 is call saved) */
EMIT2(0x0d00, BPF_REG_5, REG_W1);
/*
* Note: For fast access we jump directly after the
* jnz instruction from bpf_jit.S
*/
/* jnz <ret0> */
EMIT4_PCREL(0xa7740000, jit->ret0_ip - jit->prg);
break;
default: /* too complex, give up */
pr_err("Unknown opcode %02x\n", insn->code);
return -1;

@@ -1,4 +1,7 @@
#
# Arch-specific network modules
#
obj-$(CONFIG_BPF_JIT) += bpf_jit_asm_$(BITS).o bpf_jit_comp_$(BITS).o
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp_$(BITS).o
ifeq ($(BITS),32)
obj-$(CONFIG_BPF_JIT) += bpf_jit_asm_32.o
endif

@@ -33,35 +33,6 @@
#define I5 0x1d
#define FP 0x1e
#define I7 0x1f
#define r_SKB L0
#define r_HEADLEN L4
#define r_SKB_DATA L5
#define r_TMP G1
#define r_TMP2 G3
/* assembly code in arch/sparc/net/bpf_jit_asm_64.S */
extern u32 bpf_jit_load_word[];
extern u32 bpf_jit_load_half[];
extern u32 bpf_jit_load_byte[];
extern u32 bpf_jit_load_byte_msh[];
extern u32 bpf_jit_load_word_positive_offset[];
extern u32 bpf_jit_load_half_positive_offset[];
extern u32 bpf_jit_load_byte_positive_offset[];
extern u32 bpf_jit_load_byte_msh_positive_offset[];
extern u32 bpf_jit_load_word_negative_offset[];
extern u32 bpf_jit_load_half_negative_offset[];
extern u32 bpf_jit_load_byte_negative_offset[];
extern u32 bpf_jit_load_byte_msh_negative_offset[];
#else
#define r_RESULT %o0
#define r_SKB %o0
#define r_OFF %o1
#define r_HEADLEN %l4
#define r_SKB_DATA %l5
#define r_TMP %g1
#define r_TMP2 %g3
#endif
#endif /* _BPF_JIT_H */

@@ -1,162 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
#include <asm/ptrace.h>
#include "bpf_jit_64.h"
#define SAVE_SZ 176
#define SCRATCH_OFF STACK_BIAS + 128
#define BE_PTR(label) be,pn %xcc, label
#define SIGN_EXTEND(reg) sra reg, 0, reg
#define SKF_MAX_NEG_OFF (-0x200000) /* SKF_LL_OFF from filter.h */
.text
.globl bpf_jit_load_word
bpf_jit_load_word:
cmp r_OFF, 0
bl bpf_slow_path_word_neg
nop
.globl bpf_jit_load_word_positive_offset
bpf_jit_load_word_positive_offset:
sub r_HEADLEN, r_OFF, r_TMP
cmp r_TMP, 3
ble bpf_slow_path_word
add r_SKB_DATA, r_OFF, r_TMP
andcc r_TMP, 3, %g0
bne load_word_unaligned
nop
retl
ld [r_TMP], r_RESULT
load_word_unaligned:
ldub [r_TMP + 0x0], r_OFF
ldub [r_TMP + 0x1], r_TMP2
sll r_OFF, 8, r_OFF
or r_OFF, r_TMP2, r_OFF
ldub [r_TMP + 0x2], r_TMP2
sll r_OFF, 8, r_OFF
or r_OFF, r_TMP2, r_OFF
ldub [r_TMP + 0x3], r_TMP2
sll r_OFF, 8, r_OFF
retl
or r_OFF, r_TMP2, r_RESULT
.globl bpf_jit_load_half
bpf_jit_load_half:
cmp r_OFF, 0
bl bpf_slow_path_half_neg
nop
.globl bpf_jit_load_half_positive_offset
bpf_jit_load_half_positive_offset:
sub r_HEADLEN, r_OFF, r_TMP
cmp r_TMP, 1
ble bpf_slow_path_half
add r_SKB_DATA, r_OFF, r_TMP
andcc r_TMP, 1, %g0
bne load_half_unaligned
nop
retl
lduh [r_TMP], r_RESULT
load_half_unaligned:
ldub [r_TMP + 0x0], r_OFF
ldub [r_TMP + 0x1], r_TMP2
sll r_OFF, 8, r_OFF
retl
or r_OFF, r_TMP2, r_RESULT
.globl bpf_jit_load_byte
bpf_jit_load_byte:
cmp r_OFF, 0
bl bpf_slow_path_byte_neg
nop
.globl bpf_jit_load_byte_positive_offset
bpf_jit_load_byte_positive_offset:
cmp r_OFF, r_HEADLEN
bge bpf_slow_path_byte
nop
retl
ldub [r_SKB_DATA + r_OFF], r_RESULT
#define bpf_slow_path_common(LEN) \
save %sp, -SAVE_SZ, %sp; \
mov %i0, %o0; \
mov %i1, %o1; \
add %fp, SCRATCH_OFF, %o2; \
call skb_copy_bits; \
mov (LEN), %o3; \
cmp %o0, 0; \
restore;
bpf_slow_path_word:
bpf_slow_path_common(4)
bl bpf_error
ld [%sp + SCRATCH_OFF], r_RESULT
retl
nop
bpf_slow_path_half:
bpf_slow_path_common(2)
bl bpf_error
lduh [%sp + SCRATCH_OFF], r_RESULT
retl
nop
bpf_slow_path_byte:
bpf_slow_path_common(1)
bl bpf_error
ldub [%sp + SCRATCH_OFF], r_RESULT
retl
nop
#define bpf_negative_common(LEN) \
save %sp, -SAVE_SZ, %sp; \
mov %i0, %o0; \
mov %i1, %o1; \
SIGN_EXTEND(%o1); \
call bpf_internal_load_pointer_neg_helper; \
mov (LEN), %o2; \
mov %o0, r_TMP; \
cmp %o0, 0; \
BE_PTR(bpf_error); \
restore;
bpf_slow_path_word_neg:
sethi %hi(SKF_MAX_NEG_OFF), r_TMP
cmp r_OFF, r_TMP
bl bpf_error
nop
.globl bpf_jit_load_word_negative_offset
bpf_jit_load_word_negative_offset:
bpf_negative_common(4)
andcc r_TMP, 3, %g0
bne load_word_unaligned
nop
retl
ld [r_TMP], r_RESULT
bpf_slow_path_half_neg:
sethi %hi(SKF_MAX_NEG_OFF), r_TMP
cmp r_OFF, r_TMP
bl bpf_error
nop
.globl bpf_jit_load_half_negative_offset
bpf_jit_load_half_negative_offset:
bpf_negative_common(2)
andcc r_TMP, 1, %g0
bne load_half_unaligned
nop
retl
lduh [r_TMP], r_RESULT
bpf_slow_path_byte_neg:
sethi %hi(SKF_MAX_NEG_OFF), r_TMP
cmp r_OFF, r_TMP
bl bpf_error
nop
.globl bpf_jit_load_byte_negative_offset
bpf_jit_load_byte_negative_offset:
bpf_negative_common(1)
retl
ldub [r_TMP], r_RESULT
bpf_error:
/* Make the JIT program itself return zero. */
ret
restore %g0, %g0, %o0

@ -48,10 +48,6 @@ static void bpf_flush_icache(void *start_, void *end_)
}
}
#define SEEN_DATAREF 1 /* might call external helpers */
#define SEEN_XREG 2 /* ebx is used */
#define SEEN_MEM 4 /* use mem[] for temporary storage */
#define S13(X) ((X) & 0x1fff)
#define S5(X) ((X) & 0x1f)
#define IMMED 0x00002000
@ -198,7 +194,6 @@ struct jit_ctx {
bool tmp_1_used;
bool tmp_2_used;
bool tmp_3_used;
bool saw_ld_abs_ind;
bool saw_frame_pointer;
bool saw_call;
bool saw_tail_call;
@ -207,9 +202,7 @@ struct jit_ctx {
#define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
#define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
#define SKB_HLEN_REG (MAX_BPF_JIT_REG + 2)
#define SKB_DATA_REG (MAX_BPF_JIT_REG + 3)
#define TMP_REG_3 (MAX_BPF_JIT_REG + 4)
#define TMP_REG_3 (MAX_BPF_JIT_REG + 2)
/* Map BPF registers to SPARC registers */
static const int bpf2sparc[] = {
@ -238,9 +231,6 @@ static const int bpf2sparc[] = {
[TMP_REG_1] = G1,
[TMP_REG_2] = G2,
[TMP_REG_3] = G3,
[SKB_HLEN_REG] = L4,
[SKB_DATA_REG] = L5,
};
static void emit(const u32 insn, struct jit_ctx *ctx)
@ -800,25 +790,6 @@ static int emit_compare_and_branch(const u8 code, const u8 dst, u8 src,
return 0;
}
static void load_skb_regs(struct jit_ctx *ctx, u8 r_skb)
{
const u8 r_headlen = bpf2sparc[SKB_HLEN_REG];
const u8 r_data = bpf2sparc[SKB_DATA_REG];
const u8 r_tmp = bpf2sparc[TMP_REG_1];
unsigned int off;
off = offsetof(struct sk_buff, len);
emit(LD32I | RS1(r_skb) | S13(off) | RD(r_headlen), ctx);
off = offsetof(struct sk_buff, data_len);
emit(LD32I | RS1(r_skb) | S13(off) | RD(r_tmp), ctx);
emit(SUB | RS1(r_headlen) | RS2(r_tmp) | RD(r_headlen), ctx);
off = offsetof(struct sk_buff, data);
emit(LDPTRI | RS1(r_skb) | S13(off) | RD(r_data), ctx);
}
/* Just skip the save instruction and the ctx register move. */
#define BPF_TAILCALL_PROLOGUE_SKIP 16
#define BPF_TAILCALL_CNT_SP_OFF (STACK_BIAS + 128)
@ -857,9 +828,6 @@ static void build_prologue(struct jit_ctx *ctx)
emit_reg_move(I0, O0, ctx);
/* If you add anything here, adjust BPF_TAILCALL_PROLOGUE_SKIP above. */
if (ctx->saw_ld_abs_ind)
load_skb_regs(ctx, bpf2sparc[BPF_REG_1]);
}
static void build_epilogue(struct jit_ctx *ctx)
@ -926,7 +894,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
const int i = insn - ctx->prog->insnsi;
const s16 off = insn->off;
const s32 imm = insn->imm;
u32 *func;
if (insn->src_reg == BPF_REG_FP)
ctx->saw_frame_pointer = true;
@ -1225,16 +1192,11 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
u8 *func = ((u8 *)__bpf_call_base) + imm;
ctx->saw_call = true;
if (ctx->saw_ld_abs_ind && bpf_helper_changes_pkt_data(func))
emit_reg_move(bpf2sparc[BPF_REG_1], L7, ctx);
emit_call((u32 *)func, ctx);
emit_nop(ctx);
emit_reg_move(O0, bpf2sparc[BPF_REG_0], ctx);
if (ctx->saw_ld_abs_ind && bpf_helper_changes_pkt_data(func))
load_skb_regs(ctx, L7);
break;
}
@ -1412,43 +1374,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
emit_nop(ctx);
break;
}
#define CHOOSE_LOAD_FUNC(K, func) \
((int)K < 0 ? ((int)K >= SKF_LL_OFF ? func##_negative_offset : func) : func##_positive_offset)
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
case BPF_LD | BPF_ABS | BPF_W:
func = CHOOSE_LOAD_FUNC(imm, bpf_jit_load_word);
goto common_load;
case BPF_LD | BPF_ABS | BPF_H:
func = CHOOSE_LOAD_FUNC(imm, bpf_jit_load_half);
goto common_load;
case BPF_LD | BPF_ABS | BPF_B:
func = CHOOSE_LOAD_FUNC(imm, bpf_jit_load_byte);
goto common_load;
/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + src + imm)) */
case BPF_LD | BPF_IND | BPF_W:
func = bpf_jit_load_word;
goto common_load;
case BPF_LD | BPF_IND | BPF_H:
func = bpf_jit_load_half;
goto common_load;
case BPF_LD | BPF_IND | BPF_B:
func = bpf_jit_load_byte;
common_load:
ctx->saw_ld_abs_ind = true;
emit_reg_move(bpf2sparc[BPF_REG_6], O0, ctx);
emit_loadimm(imm, O1, ctx);
if (BPF_MODE(code) == BPF_IND)
emit_alu(ADD, src, O1, ctx);
emit_call(func, ctx);
emit_alu_K(SRA, O1, 0, ctx);
emit_reg_move(O0, bpf2sparc[BPF_REG_0], ctx);
break;
default:
pr_err_once("unknown opcode %02x\n", code);
@ -1583,12 +1508,11 @@ skip_init_ctx:
build_epilogue(&ctx);
if (bpf_jit_enable > 1)
pr_info("Pass %d: shrink = %d, seen = [%c%c%c%c%c%c%c]\n", pass,
pr_info("Pass %d: shrink = %d, seen = [%c%c%c%c%c%c]\n", pass,
image_size - (ctx.idx * 4),
ctx.tmp_1_used ? '1' : ' ',
ctx.tmp_2_used ? '2' : ' ',
ctx.tmp_3_used ? '3' : ' ',
ctx.saw_ld_abs_ind ? 'L' : ' ',
ctx.saw_frame_pointer ? 'F' : ' ',
ctx.saw_call ? 'C' : ' ',
ctx.saw_tail_call ? 'T' : ' ');

@ -140,7 +140,7 @@ config X86
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_EBPF_JIT if X86_64
select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_EXIT_THREAD
select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE

@ -308,16 +308,20 @@ do { \
* lfence
* jmp spec_trap
* do_rop:
* mov %rax,(%rsp)
* mov %rax,(%rsp) for x86_64
* mov %edx,(%esp) for x86_32
* retq
*
* Without retpolines configured:
*
* jmp *%rax
* jmp *%rax for x86_64
* jmp *%edx for x86_32
*/
#ifdef CONFIG_RETPOLINE
# define RETPOLINE_RAX_BPF_JIT_SIZE 17
# define RETPOLINE_RAX_BPF_JIT() \
# ifdef CONFIG_X86_64
# define RETPOLINE_RAX_BPF_JIT_SIZE 17
# define RETPOLINE_RAX_BPF_JIT() \
do { \
EMIT1_off32(0xE8, 7); /* callq do_rop */ \
/* spec_trap: */ \
EMIT2(0xF3, 0x90); /* pause */ \
@ -325,11 +329,30 @@ do { \
EMIT2(0xEB, 0xF9); /* jmp spec_trap */ \
/* do_rop: */ \
EMIT4(0x48, 0x89, 0x04, 0x24); /* mov %rax,(%rsp) */ \
EMIT1(0xC3); /* retq */
#else
# define RETPOLINE_RAX_BPF_JIT_SIZE 2
# define RETPOLINE_RAX_BPF_JIT() \
EMIT2(0xFF, 0xE0); /* jmp *%rax */
EMIT1(0xC3); /* retq */ \
} while (0)
# else /* !CONFIG_X86_64 */
# define RETPOLINE_EDX_BPF_JIT() \
do { \
EMIT1_off32(0xE8, 7); /* call do_rop */ \
/* spec_trap: */ \
EMIT2(0xF3, 0x90); /* pause */ \
EMIT3(0x0F, 0xAE, 0xE8); /* lfence */ \
EMIT2(0xEB, 0xF9); /* jmp spec_trap */ \
/* do_rop: */ \
EMIT3(0x89, 0x14, 0x24); /* mov %edx,(%esp) */ \
EMIT1(0xC3); /* ret */ \
} while (0)
# endif
#else /* !CONFIG_RETPOLINE */
# ifdef CONFIG_X86_64
# define RETPOLINE_RAX_BPF_JIT_SIZE 2
# define RETPOLINE_RAX_BPF_JIT() \
EMIT2(0xFF, 0xE0); /* jmp *%rax */
# else /* !CONFIG_X86_64 */
# define RETPOLINE_EDX_BPF_JIT() \
EMIT2(0xFF, 0xE2) /* jmp *%edx */
# endif
#endif
#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
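The byte counts behind the retpoline macros above can be checked by hand. The sketch below is a userspace illustration, not kernel code: it lays the same opcode bytes out in plain arrays and confirms that the `call` displacement of 7 lands on `do_rop` and that the `0xEB, 0xF9` jump goes back to `spec_trap`. The array names and the two `*_target` helpers are hypothetical, and the x86_64 `lfence` bytes, which fall in an elided part of the hunk, are assumed to match the x86_32 ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86_64 thunk, byte for byte as emitted by RETPOLINE_RAX_BPF_JIT(). */
static const uint8_t retpoline_rax64[] = {
	0xE8, 0x07, 0x00, 0x00, 0x00,	/* callq do_rop (rel32 = 7) */
	0xF3, 0x90,			/* spec_trap: pause */
	0x0F, 0xAE, 0xE8,		/* lfence (assumed, elided in the diff) */
	0xEB, 0xF9,			/* jmp spec_trap (rel8 = -7) */
	0x48, 0x89, 0x04, 0x24,		/* do_rop: mov %rax,(%rsp) */
	0xC3,				/* retq */
};

/* x86_32 thunk, byte for byte as emitted by RETPOLINE_EDX_BPF_JIT(). */
static const uint8_t retpoline_edx32[] = {
	0xE8, 0x07, 0x00, 0x00, 0x00,	/* call do_rop (rel32 = 7) */
	0xF3, 0x90,			/* spec_trap: pause */
	0x0F, 0xAE, 0xE8,		/* lfence */
	0xEB, 0xF9,			/* jmp spec_trap (rel8 = -7) */
	0x89, 0x14, 0x24,		/* do_rop: mov %edx,(%esp) */
	0xC3,				/* ret */
};

/* Offset reached by the 2-byte short jump located at offset `at`. */
static size_t rel8_target(const uint8_t *code, size_t at)
{
	assert(code[at] == 0xEB);
	return (size_t)((ptrdiff_t)at + 2 + (int8_t)code[at + 1]);
}

/* Offset reached by the 5-byte near call located at offset `at`. */
static size_t rel32_target(const uint8_t *code, size_t at)
{
	uint32_t raw;

	assert(code[at] == 0xE8);
	raw = (uint32_t)code[at + 1] | (uint32_t)code[at + 2] << 8 |
	      (uint32_t)code[at + 3] << 16 | (uint32_t)code[at + 4] << 24;
	return (size_t)((ptrdiff_t)at + 5 + (int32_t)raw);
}
```

With this layout the x86_64 sequence is 17 bytes long, matching `RETPOLINE_RAX_BPF_JIT_SIZE`, and the x86_32 sequence comes out one byte shorter because its `mov` has no REX prefix.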

@ -1,6 +1,9 @@
#
# Arch-specific network modules
#
OBJECT_FILES_NON_STANDARD_bpf_jit.o += y
obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
ifeq ($(CONFIG_X86_32),y)
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp32.o
else
obj-$(CONFIG_BPF_JIT) += bpf_jit_comp.o
endif

@ -1,154 +0,0 @@
/* bpf_jit.S : BPF JIT helper functions
*
* Copyright (C) 2011 Eric Dumazet (eric.dumazet@gmail.com)
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; version 2
* of the License.
*/
#include <linux/linkage.h>
#include <asm/frame.h>
/*
* Calling convention :
* rbx : skb pointer (callee saved)
* esi : offset of byte(s) to fetch in skb (can be scratched)
* r10 : copy of skb->data
* r9d : hlen = skb->len - skb->data_len
*/
#define SKBDATA %r10
#define SKF_MAX_NEG_OFF $(-0x200000) /* SKF_LL_OFF from filter.h */
#define FUNC(name) \
.globl name; \
.type name, @function; \
name:
FUNC(sk_load_word)
test %esi,%esi
js bpf_slow_path_word_neg
FUNC(sk_load_word_positive_offset)
mov %r9d,%eax # hlen
sub %esi,%eax # hlen - offset
cmp $3,%eax
jle bpf_slow_path_word
mov (SKBDATA,%rsi),%eax
bswap %eax /* ntohl() */
ret
FUNC(sk_load_half)
test %esi,%esi
js bpf_slow_path_half_neg
FUNC(sk_load_half_positive_offset)
mov %r9d,%eax
sub %esi,%eax # hlen - offset
cmp $1,%eax
jle bpf_slow_path_half
movzwl (SKBDATA,%rsi),%eax
rol $8,%ax # ntohs()
ret
FUNC(sk_load_byte)
test %esi,%esi
js bpf_slow_path_byte_neg
FUNC(sk_load_byte_positive_offset)
cmp %esi,%r9d /* if (offset >= hlen) goto bpf_slow_path_byte */
jle bpf_slow_path_byte
movzbl (SKBDATA,%rsi),%eax
ret
/* rsi contains offset and can be scratched */
#define bpf_slow_path_common(LEN) \
lea 32(%rbp), %rdx;\
FRAME_BEGIN; \
mov %rbx, %rdi; /* arg1 == skb */ \
push %r9; \
push SKBDATA; \
/* rsi already has offset */ \
mov $LEN,%ecx; /* len */ \
call skb_copy_bits; \
test %eax,%eax; \
pop SKBDATA; \
pop %r9; \
FRAME_END
bpf_slow_path_word:
bpf_slow_path_common(4)
js bpf_error
mov 32(%rbp),%eax
bswap %eax
ret
bpf_slow_path_half:
bpf_slow_path_common(2)
js bpf_error
mov 32(%rbp),%ax
rol $8,%ax
movzwl %ax,%eax
ret
bpf_slow_path_byte:
bpf_slow_path_common(1)
js bpf_error
movzbl 32(%rbp),%eax
ret
#define sk_negative_common(SIZE) \
FRAME_BEGIN; \
mov %rbx, %rdi; /* arg1 == skb */ \
push %r9; \
push SKBDATA; \
/* rsi already has offset */ \
mov $SIZE,%edx; /* size */ \
call bpf_internal_load_pointer_neg_helper; \
test %rax,%rax; \
pop SKBDATA; \
pop %r9; \
FRAME_END; \
jz bpf_error
bpf_slow_path_word_neg:
cmp SKF_MAX_NEG_OFF, %esi /* test range */
jl bpf_error /* offset lower -> error */
FUNC(sk_load_word_negative_offset)
sk_negative_common(4)
mov (%rax), %eax
bswap %eax
ret
bpf_slow_path_half_neg:
cmp SKF_MAX_NEG_OFF, %esi
jl bpf_error
FUNC(sk_load_half_negative_offset)
sk_negative_common(2)
mov (%rax),%ax
rol $8,%ax
movzwl %ax,%eax
ret
bpf_slow_path_byte_neg:
cmp SKF_MAX_NEG_OFF, %esi
jl bpf_error
FUNC(sk_load_byte_negative_offset)
sk_negative_common(1)
movzbl (%rax), %eax
ret
bpf_error:
# force a return 0 from jit handler
xor %eax,%eax
mov (%rbp),%rbx
mov 8(%rbp),%r13
mov 16(%rbp),%r14
mov 24(%rbp),%r15
add $40, %rbp
leaveq
ret
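For readers unfamiliar with the helpers removed here, the decision they made for each `LD_ABS`/`LD_IND` access can be summarised in C. This is a simplified userspace sketch under stated assumptions: `classify_load`, `read_be32` and the `load_path` enum are hypothetical names, `headlen` stands for `skb->len - skb->data_len`, and the slow paths (which called `skb_copy_bits()` or `bpf_internal_load_pointer_neg_helper()` and could still fail) are only named, not implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKF_LL_OFF (-0x200000)	/* lowest valid negative offset */

enum load_path {
	LOAD_FAST,		/* read directly from the linear data */
	LOAD_SLOW_COPY,		/* fall back to a copy out of the skb */
	LOAD_NEG_HELPER,	/* negative offset, resolved by a helper */
	LOAD_ERROR		/* offset below SKF_LL_OFF: program returns 0 */
};

/* Pick the path for a `size`-byte load at `offset`. */
static enum load_path classify_load(int32_t offset, uint32_t headlen,
				    uint32_t size)
{
	if (offset < 0)
		return offset >= SKF_LL_OFF ? LOAD_NEG_HELPER : LOAD_ERROR;
	/* Fast path only if all `size` bytes lie inside the linear area. */
	if ((int64_t)headlen - offset >= (int64_t)size)
		return LOAD_FAST;
	return LOAD_SLOW_COPY;
}

/* Packet data is big endian; this is what mov + bswap computed on x86. */
static uint32_t read_be32(const uint8_t *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] << 8 | (uint32_t)p[3];
}
```

For a 14-byte linear area, a word load at offset 10 takes the fast path, while the same load at offset 11 falls back to the copy because only three bytes remain.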

@ -17,15 +17,6 @@
#include <asm/set_memory.h>
#include <asm/nospec-branch.h>
/*
* Assembly code in arch/x86/net/bpf_jit.S
*/
extern u8 sk_load_word[], sk_load_half[], sk_load_byte[];
extern u8 sk_load_word_positive_offset[], sk_load_half_positive_offset[];
extern u8 sk_load_byte_positive_offset[];
extern u8 sk_load_word_negative_offset[], sk_load_half_negative_offset[];
extern u8 sk_load_byte_negative_offset[];
static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
{
if (len == 1)
@ -107,9 +98,6 @@ static int bpf_size_to_x86_bytes(int bpf_size)
#define X86_JLE 0x7E
#define X86_JG 0x7F
#define CHOOSE_LOAD_FUNC(K, func) \
((int)K < 0 ? ((int)K >= SKF_LL_OFF ? func##_negative_offset : func) : func##_positive_offset)
/* Pick a register outside of BPF range for JIT internal work */
#define AUX_REG (MAX_BPF_JIT_REG + 1)
@ -120,8 +108,8 @@ static int bpf_size_to_x86_bytes(int bpf_size)
* register in load/store instructions, it always needs an
* extra byte of encoding and is callee saved.
*
* R9 caches skb->len - skb->data_len
* R10 caches skb->data, and used for blinding (if enabled)
* Also x86-64 register R9 is unused. x86-64 register R10 is
* used for blinding (if enabled).
*/
static const int reg2hex[] = {
[BPF_REG_0] = 0, /* RAX */
@ -196,19 +184,15 @@ static void jit_fill_hole(void *area, unsigned int size)
struct jit_context {
int cleanup_addr; /* Epilogue code offset */
bool seen_ld_abs;
bool seen_ax_reg;
};
/* Maximum number of bytes emitted while JITing one eBPF insn */
#define BPF_MAX_INSN_SIZE 128
#define BPF_INSN_SAFETY 64
#define AUX_STACK_SPACE \
(32 /* Space for RBX, R13, R14, R15 */ + \
8 /* Space for skb_copy_bits() buffer */)
#define AUX_STACK_SPACE 40 /* Space for RBX, R13, R14, R15, tailcnt */
#define PROLOGUE_SIZE 37
#define PROLOGUE_SIZE 37
/*
* Emit x86-64 prologue code for BPF program and check its size.
@ -232,20 +216,8 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf)
/* sub rbp, AUX_STACK_SPACE */
EMIT4(0x48, 0x83, 0xED, AUX_STACK_SPACE);
/* All classic BPF filters use R6(rbx) save it */
/* mov qword ptr [rbp+0],rbx */
EMIT4(0x48, 0x89, 0x5D, 0);
/*
* bpf_convert_filter() maps classic BPF register X to R7 and uses R8
* as temporary, so all tcpdump filters need to spill/fill R7(R13) and
* R8(R14). R9(R15) spill could be made conditional, but there is only
* one 'bpf_error' return path out of helper functions inside bpf_jit.S
* The overhead of extra spill is negligible for any filter other
* than synthetic ones. Therefore not worth adding complexity.
*/
/* mov qword ptr [rbp+8],r13 */
EMIT4(0x4C, 0x89, 0x6D, 8);
/* mov qword ptr [rbp+16],r14 */
@ -353,27 +325,6 @@ static void emit_bpf_tail_call(u8 **pprog)
*pprog = prog;
}
static void emit_load_skb_data_hlen(u8 **pprog)
{
u8 *prog = *pprog;
int cnt = 0;
/*
* r9d = skb->len - skb->data_len (headlen)
* r10 = skb->data
*/
/* mov %r9d, off32(%rdi) */
EMIT3_off32(0x44, 0x8b, 0x8f, offsetof(struct sk_buff, len));
/* sub %r9d, off32(%rdi) */
EMIT3_off32(0x44, 0x2b, 0x8f, offsetof(struct sk_buff, data_len));
/* mov %r10, off32(%rdi) */
EMIT3_off32(0x4c, 0x8b, 0x97, offsetof(struct sk_buff, data));
*pprog = prog;
}
static void emit_mov_imm32(u8 **pprog, bool sign_propagate,
u32 dst_reg, const u32 imm32)
{
@ -462,8 +413,6 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
{
struct bpf_insn *insn = bpf_prog->insnsi;
int insn_cnt = bpf_prog->len;
bool seen_ld_abs = ctx->seen_ld_abs | (oldproglen == 0);
bool seen_ax_reg = ctx->seen_ax_reg | (oldproglen == 0);
bool seen_exit = false;
u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
int i, cnt = 0;
@ -473,9 +422,6 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
emit_prologue(&prog, bpf_prog->aux->stack_depth,
bpf_prog_was_classic(bpf_prog));
if (seen_ld_abs)
emit_load_skb_data_hlen(&prog);
for (i = 0; i < insn_cnt; i++, insn++) {
const s32 imm32 = insn->imm;
u32 dst_reg = insn->dst_reg;
@ -483,13 +429,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
u8 b2 = 0, b3 = 0;
s64 jmp_offset;
u8 jmp_cond;
bool reload_skb_data;
int ilen;
u8 *func;
if (dst_reg == BPF_REG_AX || src_reg == BPF_REG_AX)
ctx->seen_ax_reg = seen_ax_reg = true;
switch (insn->code) {
/* ALU */
case BPF_ALU | BPF_ADD | BPF_X:
@ -916,36 +858,12 @@ xadd: if (is_imm8(insn->off))
case BPF_JMP | BPF_CALL:
func = (u8 *) __bpf_call_base + imm32;
jmp_offset = func - (image + addrs[i]);
if (seen_ld_abs) {
reload_skb_data = bpf_helper_changes_pkt_data(func);
if (reload_skb_data) {
EMIT1(0x57); /* push %rdi */
jmp_offset += 22; /* pop, mov, sub, mov */
} else {
EMIT2(0x41, 0x52); /* push %r10 */
EMIT2(0x41, 0x51); /* push %r9 */
/*
* We need to adjust jmp offset, since
* pop %r9, pop %r10 take 4 bytes after call insn
*/
jmp_offset += 4;
}
}
if (!imm32 || !is_simm32(jmp_offset)) {
pr_err("unsupported BPF func %d addr %p image %p\n",
imm32, func, image);
return -EINVAL;
}
EMIT1_off32(0xE8, jmp_offset);
if (seen_ld_abs) {
if (reload_skb_data) {
EMIT1(0x5F); /* pop %rdi */
emit_load_skb_data_hlen(&prog);
} else {
EMIT2(0x41, 0x59); /* pop %r9 */
EMIT2(0x41, 0x5A); /* pop %r10 */
}
}
break;
case BPF_JMP | BPF_TAIL_CALL:
@ -1080,60 +998,6 @@ emit_jmp:
}
break;
case BPF_LD | BPF_IND | BPF_W:
func = sk_load_word;
goto common_load;
case BPF_LD | BPF_ABS | BPF_W:
func = CHOOSE_LOAD_FUNC(imm32, sk_load_word);
common_load:
ctx->seen_ld_abs = seen_ld_abs = true;
jmp_offset = func - (image + addrs[i]);
if (!func || !is_simm32(jmp_offset)) {
pr_err("unsupported BPF func %d addr %p image %p\n",
imm32, func, image);
return -EINVAL;
}
if (BPF_MODE(insn->code) == BPF_ABS) {
/* mov %esi, imm32 */
EMIT1_off32(0xBE, imm32);
} else {
/* mov %rsi, src_reg */
EMIT_mov(BPF_REG_2, src_reg);
if (imm32) {
if (is_imm8(imm32))
/* add %esi, imm8 */
EMIT3(0x83, 0xC6, imm32);
else
/* add %esi, imm32 */
EMIT2_off32(0x81, 0xC6, imm32);
}
}
/*
* skb pointer is in R6 (%rbx), it will be copied into
* %rdi if skb_copy_bits() call is necessary.
* sk_load_* helpers also use %r10 and %r9d.
* See bpf_jit.S
*/
if (seen_ax_reg)
/* r10 = skb->data, mov %r10, off32(%rbx) */
EMIT3_off32(0x4c, 0x8b, 0x93,
offsetof(struct sk_buff, data));
EMIT1_off32(0xE8, jmp_offset); /* call */
break;
case BPF_LD | BPF_IND | BPF_H:
func = sk_load_half;
goto common_load;
case BPF_LD | BPF_ABS | BPF_H:
func = CHOOSE_LOAD_FUNC(imm32, sk_load_half);
goto common_load;
case BPF_LD | BPF_IND | BPF_B:
func = sk_load_byte;
goto common_load;
case BPF_LD | BPF_ABS | BPF_B:
func = CHOOSE_LOAD_FUNC(imm32, sk_load_byte);
goto common_load;
case BPF_JMP | BPF_EXIT:
if (seen_exit) {
jmp_offset = ctx->cleanup_addr - addrs[i];

File diff suppressed because it is too large

@ -197,6 +197,7 @@ config BT_HCIUART_BCM
config BT_HCIUART_QCA
bool "Qualcomm Atheros protocol support"
depends on BT_HCIUART
depends on BT_HCIUART_SERDEV
select BT_HCIUART_H4
select BT_QCA
help

@ -315,10 +315,12 @@ static int btbcm_read_info(struct hci_dev *hdev)
return 0;
}
static const struct {
struct bcm_subver_table {
u16 subver;
const char *name;
} bcm_uart_subver_table[] = {
};
static const struct bcm_subver_table bcm_uart_subver_table[] = {
{ 0x4103, "BCM4330B1" }, /* 002.001.003 */
{ 0x410e, "BCM43341B0" }, /* 002.001.014 */
{ 0x4406, "BCM4324B3" }, /* 002.004.006 */
@ -330,98 +332,7 @@ static const struct {
{ }
};
int btbcm_initialize(struct hci_dev *hdev, char *fw_name, size_t len)
{
u16 subver, rev;
const char *hw_name = NULL;
struct sk_buff *skb;
struct hci_rp_read_local_version *ver;
int i, err;
/* Reset */
err = btbcm_reset(hdev);
if (err)
return err;
/* Read Local Version Info */
skb = btbcm_read_local_version(hdev);
if (IS_ERR(skb))
return PTR_ERR(skb);
ver = (struct hci_rp_read_local_version *)skb->data;
rev = le16_to_cpu(ver->hci_rev);
subver = le16_to_cpu(ver->lmp_subver);
kfree_skb(skb);
/* Read controller information */
err = btbcm_read_info(hdev);
if (err)
return err;
switch ((rev & 0xf000) >> 12) {
case 0:
case 1:
case 2:
case 3:
for (i = 0; bcm_uart_subver_table[i].name; i++) {
if (subver == bcm_uart_subver_table[i].subver) {
hw_name = bcm_uart_subver_table[i].name;
break;
}
}
snprintf(fw_name, len, "brcm/%s.hcd", hw_name ? : "BCM");
break;
default:
return 0;
}
bt_dev_info(hdev, "%s (%3.3u.%3.3u.%3.3u) build %4.4u",
hw_name ? : "BCM", (subver & 0xe000) >> 13,
(subver & 0x1f00) >> 8, (subver & 0x00ff), rev & 0x0fff);
return 0;
}
EXPORT_SYMBOL_GPL(btbcm_initialize);
int btbcm_finalize(struct hci_dev *hdev)
{
struct sk_buff *skb;
struct hci_rp_read_local_version *ver;
u16 subver, rev;
int err;
/* Reset */
err = btbcm_reset(hdev);
if (err)
return err;
/* Read Local Version Info */
skb = btbcm_read_local_version(hdev);
if (IS_ERR(skb))
return PTR_ERR(skb);
ver = (struct hci_rp_read_local_version *)skb->data;
rev = le16_to_cpu(ver->hci_rev);
subver = le16_to_cpu(ver->lmp_subver);
kfree_skb(skb);
bt_dev_info(hdev, "BCM (%3.3u.%3.3u.%3.3u) build %4.4u",
(subver & 0xe000) >> 13, (subver & 0x1f00) >> 8,
(subver & 0x00ff), rev & 0x0fff);
btbcm_check_bdaddr(hdev);
set_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks);
return 0;
}
EXPORT_SYMBOL_GPL(btbcm_finalize);
static const struct {
u16 subver;
const char *name;
} bcm_usb_subver_table[] = {
static const struct bcm_subver_table bcm_usb_subver_table[] = {
{ 0x210b, "BCM43142A0" }, /* 001.001.011 */
{ 0x2112, "BCM4314A0" }, /* 001.001.018 */
{ 0x2118, "BCM20702A0" }, /* 001.001.024 */
@ -435,14 +346,14 @@ static const struct {
{ }
};
int btbcm_setup_patchram(struct hci_dev *hdev)
int btbcm_initialize(struct hci_dev *hdev, char *fw_name, size_t len,
bool reinit)
{
char fw_name[64];
const struct firmware *fw;
u16 subver, rev, pid, vid;
const char *hw_name = NULL;
const char *hw_name = "BCM";
struct sk_buff *skb;
struct hci_rp_read_local_version *ver;
const struct bcm_subver_table *bcm_subver_table;
int i, err;
/* Reset */
@ -461,25 +372,27 @@ int btbcm_setup_patchram(struct hci_dev *hdev)
kfree_skb(skb);
/* Read controller information */
err = btbcm_read_info(hdev);
if (err)
return err;
if (!reinit) {
err = btbcm_read_info(hdev);
if (err)
return err;
}
switch ((rev & 0xf000) >> 12) {
case 0:
case 3:
for (i = 0; bcm_uart_subver_table[i].name; i++) {
if (subver == bcm_uart_subver_table[i].subver) {
hw_name = bcm_uart_subver_table[i].name;
break;
}
/* Upper nibble of rev should be between 0 and 3? */
if (((rev & 0xf000) >> 12) > 3)
return 0;
bcm_subver_table = (hdev->bus == HCI_USB) ? bcm_usb_subver_table :
bcm_uart_subver_table;
for (i = 0; bcm_subver_table[i].name; i++) {
if (subver == bcm_subver_table[i].subver) {
hw_name = bcm_subver_table[i].name;
break;
}
}
snprintf(fw_name, sizeof(fw_name), "brcm/%s.hcd",
hw_name ? : "BCM");
break;
case 1:
case 2:
if (hdev->bus == HCI_USB) {
/* Read USB Product Info */
skb = btbcm_read_usb_product(hdev);
if (IS_ERR(skb))
@ -489,24 +402,50 @@ int btbcm_setup_patchram(struct hci_dev *hdev)
pid = get_unaligned_le16(skb->data + 3);
kfree_skb(skb);
for (i = 0; bcm_usb_subver_table[i].name; i++) {
if (subver == bcm_usb_subver_table[i].subver) {
hw_name = bcm_usb_subver_table[i].name;
break;
}
}
snprintf(fw_name, sizeof(fw_name), "brcm/%s-%4.4x-%4.4x.hcd",
hw_name ? : "BCM", vid, pid);
break;
default:
return 0;
snprintf(fw_name, len, "brcm/%s-%4.4x-%4.4x.hcd",
hw_name, vid, pid);
} else {
snprintf(fw_name, len, "brcm/%s.hcd", hw_name);
}
bt_dev_info(hdev, "%s (%3.3u.%3.3u.%3.3u) build %4.4u",
hw_name ? : "BCM", (subver & 0xe000) >> 13,
hw_name, (subver & 0xe000) >> 13,
(subver & 0x1f00) >> 8, (subver & 0x00ff), rev & 0x0fff);
return 0;
}
EXPORT_SYMBOL_GPL(btbcm_initialize);
int btbcm_finalize(struct hci_dev *hdev)
{
char fw_name[64];
int err;
/* Re-initialize */
err = btbcm_initialize(hdev, fw_name, sizeof(fw_name), true);
if (err)
return err;
btbcm_check_bdaddr(hdev);
set_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks);
return 0;
}
EXPORT_SYMBOL_GPL(btbcm_finalize);
int btbcm_setup_patchram(struct hci_dev *hdev)
{
char fw_name[64];
const struct firmware *fw;
struct sk_buff *skb;
int err;
/* Initialize */
err = btbcm_initialize(hdev, fw_name, sizeof(fw_name), false);
if (err)
return err;
err = request_firmware(&fw, fw_name, &hdev->dev);
if (err < 0) {
bt_dev_info(hdev, "BCM: Patch %s not found", fw_name);
@ -517,25 +456,11 @@ int btbcm_setup_patchram(struct hci_dev *hdev)
release_firmware(fw);
/* Reset */
err = btbcm_reset(hdev);
/* Re-initialize */
err = btbcm_initialize(hdev, fw_name, sizeof(fw_name), true);
if (err)
return err;
/* Read Local Version Info */
skb = btbcm_read_local_version(hdev);
if (IS_ERR(skb))
return PTR_ERR(skb);
ver = (struct hci_rp_read_local_version *)skb->data;
rev = le16_to_cpu(ver->hci_rev);
subver = le16_to_cpu(ver->lmp_subver);
kfree_skb(skb);
bt_dev_info(hdev, "%s (%3.3u.%3.3u.%3.3u) build %4.4u",
hw_name ? : "BCM", (subver & 0xe000) >> 13,
(subver & 0x1f00) >> 8, (subver & 0x00ff), rev & 0x0fff);
/* Read Local Name */
skb = btbcm_read_local_name(hdev);
if (IS_ERR(skb))
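The `bt_dev_info()` format in `btbcm_initialize()` splits the 16-bit LMP subversion into three bit fields, which is also how the `/* 002.001.003 */` style comments in the subver tables are derived. A minimal userspace sketch of that decoding follows; the `bcm_version` struct, its field names and the two functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct bcm_version {
	unsigned int hi3;	/* bits 15..13 */
	unsigned int mid5;	/* bits 12..8 */
	unsigned int lo8;	/* bits 7..0 */
};

/* Split the subversion with the same masks and shifts as the driver. */
static struct bcm_version bcm_decode_subver(uint16_t subver)
{
	struct bcm_version v = {
		.hi3  = (subver & 0xe000) >> 13,
		.mid5 = (subver & 0x1f00) >> 8,
		.lo8  = subver & 0x00ff,
	};

	return v;
}

/* Format as in the driver log line: "%3.3u.%3.3u.%3.3u". */
static int bcm_format_subver(uint16_t subver, char *buf, size_t len)
{
	struct bcm_version v = bcm_decode_subver(subver);

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%3.3u.%3.3u.%3.3u", v.hi3, v.mid5, v.lo8);
}
```

Decoding `0x4103` this way gives the fields 2, 1 and 3, which agrees with the `002.001.003` comment next to `BCM4330B1` in the UART table.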

@ -73,7 +73,8 @@ int btbcm_patchram(struct hci_dev *hdev, const struct firmware *fw);
int btbcm_setup_patchram(struct hci_dev *hdev);
int btbcm_setup_apple(struct hci_dev *hdev);
int btbcm_initialize(struct hci_dev *hdev, char *fw_name, size_t len);
int btbcm_initialize(struct hci_dev *hdev, char *fw_name, size_t len,
bool reinit);
int btbcm_finalize(struct hci_dev *hdev);
#else
@ -104,7 +105,7 @@ static inline int btbcm_setup_apple(struct hci_dev *hdev)
}
static inline int btbcm_initialize(struct hci_dev *hdev, char *fw_name,
size_t len)
size_t len, bool reinit)
{
return 0;
}

@ -35,15 +35,9 @@ static ssize_t btmrvl_hscfgcmd_write(struct file *file,
const char __user *ubuf, size_t count, loff_t *ppos)
{
struct btmrvl_private *priv = file->private_data;
char buf[16];
long result, ret;
memset(buf, 0, sizeof(buf));
if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
return -EFAULT;
ret = kstrtol(buf, 10, &result);
ret = kstrtol_from_user(ubuf, count, 10, &result);
if (ret)
return ret;
@ -81,15 +75,9 @@ static ssize_t btmrvl_pscmd_write(struct file *file, const char __user *ubuf,
size_t count, loff_t *ppos)
{
struct btmrvl_private *priv = file->private_data;
char buf[16];
long result, ret;
memset(buf, 0, sizeof(buf));
if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
return -EFAULT;
ret = kstrtol(buf, 10, &result);
ret = kstrtol_from_user(ubuf, count, 10, &result);
if (ret)
return ret;
@ -127,15 +115,9 @@ static ssize_t btmrvl_hscmd_write(struct file *file, const char __user *ubuf,
size_t count, loff_t *ppos)
{
struct btmrvl_private *priv = file->private_data;
char buf[16];
long result, ret;
memset(buf, 0, sizeof(buf));
if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
return -EFAULT;
ret = kstrtol(buf, 10, &result);
ret = kstrtol_from_user(ubuf, count, 10, &result);
if (ret)
return ret;
@ -167,35 +149,6 @@ static const struct file_operations btmrvl_hscmd_fops = {
.llseek = default_llseek,
};
static ssize_t btmrvl_fwdump_write(struct file *file, const char __user *ubuf,
size_t count, loff_t *ppos)
{
struct btmrvl_private *priv = file->private_data;
char buf[16];
bool result;
memset(buf, 0, sizeof(buf));
if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
return -EFAULT;
if (strtobool(buf, &result))
return -EINVAL;
if (!result)
return -EINVAL;
btmrvl_firmware_dump(priv);
return count;
}
static const struct file_operations btmrvl_fwdump_fops = {
.write = btmrvl_fwdump_write,
.open = simple_open,
.llseek = default_llseek,
};
void btmrvl_debugfs_init(struct hci_dev *hdev)
{
struct btmrvl_private *priv = hci_get_drvdata(hdev);
@ -226,8 +179,6 @@ void btmrvl_debugfs_init(struct hci_dev *hdev)
priv, &btmrvl_hscmd_fops);
debugfs_create_file("hscfgcmd", 0644, dbg->config_dir,
priv, &btmrvl_hscfgcmd_fops);
debugfs_create_file("fw_dump", 0200, dbg->config_dir,
priv, &btmrvl_fwdump_fops);
dbg->status_dir = debugfs_create_dir("status", hdev->debugfs);
debugfs_create_u8("curpsmode", 0444, dbg->status_dir,

@ -110,7 +110,6 @@ struct btmrvl_private {
u8 *payload, u16 nb);
int (*hw_wakeup_firmware)(struct btmrvl_private *priv);
int (*hw_process_int_status)(struct btmrvl_private *priv);
void (*firmware_dump)(struct btmrvl_private *priv);
spinlock_t driver_lock; /* spinlock used by driver */
#ifdef CONFIG_DEBUG_FS
void *debugfs_data;
@ -183,7 +182,6 @@ int btmrvl_send_hscfg_cmd(struct btmrvl_private *priv);
int btmrvl_enable_ps(struct btmrvl_private *priv);
int btmrvl_prepare_command(struct btmrvl_private *priv);
int btmrvl_enable_hs(struct btmrvl_private *priv);
void btmrvl_firmware_dump(struct btmrvl_private *priv);
#ifdef CONFIG_DEBUG_FS
void btmrvl_debugfs_init(struct hci_dev *hdev);

@ -358,12 +358,6 @@ int btmrvl_prepare_command(struct btmrvl_private *priv)
return ret;
}
void btmrvl_firmware_dump(struct btmrvl_private *priv)
{
if (priv->firmware_dump)
priv->firmware_dump(priv);
}
static int btmrvl_tx_pkt(struct btmrvl_private *priv, struct sk_buff *skb)
{
int ret = 0;

@ -1311,9 +1311,11 @@ rdwr_status btmrvl_sdio_rdwr_firmware(struct btmrvl_private *priv,
}
/* This function dumps the SDIO registers and memory data */
static void btmrvl_sdio_dump_firmware(struct btmrvl_private *priv)
static void btmrvl_sdio_coredump(struct device *dev)
{
struct btmrvl_sdio_card *card = priv->btmrvl_dev.card;
struct sdio_func *func = dev_to_sdio_func(dev);
struct btmrvl_sdio_card *card;
struct btmrvl_private *priv;
int ret = 0;
unsigned int reg, reg_start, reg_end;
enum rdwr_status stat;
@ -1321,6 +1323,9 @@ static void btmrvl_sdio_dump_firmware(struct btmrvl_private *priv)
u8 dump_num = 0, idx, i, read_reg, doneflag = 0;
u32 memory_size, fw_dump_len = 0;
card = sdio_get_drvdata(func);
priv = card->priv;
/* dump sdio register first */
btmrvl_sdio_dump_regs(priv);
@ -1547,7 +1552,6 @@ static int btmrvl_sdio_probe(struct sdio_func *func,
priv->hw_host_to_card = btmrvl_sdio_host_to_card;
priv->hw_wakeup_firmware = btmrvl_sdio_wakeup_fw;
priv->hw_process_int_status = btmrvl_sdio_process_int_status;
priv->firmware_dump = btmrvl_sdio_dump_firmware;
if (btmrvl_register_hdev(priv)) {
BT_ERR("Register hdev failed!");
@ -1717,6 +1721,7 @@ static struct sdio_driver bt_mrvl_sdio = {
.remove = btmrvl_sdio_remove,
.drv = {
.owner = THIS_MODULE,
.coredump = btmrvl_sdio_coredump,
.pm = &btmrvl_sdio_pm_ops,
}
};

@ -127,28 +127,41 @@ static void rome_tlv_check_data(struct rome_config *config,
BT_DBG("TLV Type\t\t : 0x%x", type_len & 0x000000ff);
BT_DBG("Length\t\t : %d bytes", length);
config->dnld_mode = ROME_SKIP_EVT_NONE;
switch (config->type) {
case TLV_TYPE_PATCH:
tlv_patch = (struct tlv_type_patch *)tlv->data;
BT_DBG("Total Length\t\t : %d bytes",
/* For Rome version 1.1 to 3.1, all segment commands
* are acked by a vendor specific event (VSE).
* For Rome >= 3.2, the download mode field indicates
* if VSE is skipped by the controller.
* In case VSE is skipped, only the last segment is acked.
*/
config->dnld_mode = tlv_patch->download_mode;
BT_DBG("Total Length : %d bytes",
le32_to_cpu(tlv_patch->total_size));
BT_DBG("Patch Data Length\t : %d bytes",
BT_DBG("Patch Data Length : %d bytes",
le32_to_cpu(tlv_patch->data_length));
BT_DBG("Signing Format Version : 0x%x",
tlv_patch->format_version);
BT_DBG("Signature Algorithm\t : 0x%x",
BT_DBG("Signature Algorithm : 0x%x",
tlv_patch->signature);
BT_DBG("Reserved\t\t : 0x%x",
le16_to_cpu(tlv_patch->reserved1));
BT_DBG("Product ID\t\t : 0x%04x",
BT_DBG("Download mode : 0x%x",
tlv_patch->download_mode);
BT_DBG("Reserved : 0x%x",
tlv_patch->reserved1);
BT_DBG("Product ID : 0x%04x",
le16_to_cpu(tlv_patch->product_id));
BT_DBG("Rom Build Version\t : 0x%04x",
BT_DBG("Rom Build Version : 0x%04x",
le16_to_cpu(tlv_patch->rom_build));
BT_DBG("Patch Version\t\t : 0x%04x",
BT_DBG("Patch Version : 0x%04x",
le16_to_cpu(tlv_patch->patch_version));
BT_DBG("Reserved\t\t : 0x%x",
BT_DBG("Reserved : 0x%x",
le16_to_cpu(tlv_patch->reserved2));
BT_DBG("Patch Entry Address\t : 0x%x",
BT_DBG("Patch Entry Address : 0x%x",
le32_to_cpu(tlv_patch->entry));
break;
@ -194,8 +207,8 @@ static void rome_tlv_check_data(struct rome_config *config,
}
}
static int rome_tlv_send_segment(struct hci_dev *hdev, int idx, int seg_size,
const u8 *data)
static int rome_tlv_send_segment(struct hci_dev *hdev, int seg_size,
const u8 *data, enum rome_tlv_dnld_mode mode)
{
struct sk_buff *skb;
struct edl_event_hdr *edl;
@ -203,12 +216,14 @@ static int rome_tlv_send_segment(struct hci_dev *hdev, int idx, int seg_size,
u8 cmd[MAX_SIZE_PER_TLV_SEGMENT + 2];
int err = 0;
BT_DBG("%s: Download segment #%d size %d", hdev->name, idx, seg_size);
cmd[0] = EDL_PATCH_TLV_REQ_CMD;
cmd[1] = seg_size;
memcpy(cmd + 2, data, seg_size);
if (mode == ROME_SKIP_EVT_VSE_CC || mode == ROME_SKIP_EVT_VSE)
return __hci_cmd_send(hdev, EDL_PATCH_CMD_OPCODE, seg_size + 2,
cmd);
skb = __hci_cmd_sync_ev(hdev, EDL_PATCH_CMD_OPCODE, seg_size + 2, cmd,
HCI_VENDOR_PKT, HCI_INIT_TIMEOUT);
if (IS_ERR(skb)) {
@ -245,47 +260,12 @@ out:
return err;
}
static int rome_tlv_download_request(struct hci_dev *hdev,
const struct firmware *fw)
{
const u8 *buffer, *data;
int total_segment, remain_size;
int ret, i;
if (!fw || !fw->data)
return -EINVAL;
total_segment = fw->size / MAX_SIZE_PER_TLV_SEGMENT;
remain_size = fw->size % MAX_SIZE_PER_TLV_SEGMENT;
BT_DBG("%s: Total segment num %d remain size %d total size %zu",
hdev->name, total_segment, remain_size, fw->size);
data = fw->data;
for (i = 0; i < total_segment; i++) {
buffer = data + i * MAX_SIZE_PER_TLV_SEGMENT;
ret = rome_tlv_send_segment(hdev, i, MAX_SIZE_PER_TLV_SEGMENT,
buffer);
if (ret < 0)
return -EIO;
}
if (remain_size) {
buffer = data + total_segment * MAX_SIZE_PER_TLV_SEGMENT;
ret = rome_tlv_send_segment(hdev, total_segment, remain_size,
buffer);
if (ret < 0)
return -EIO;
}
return 0;
}
static int rome_download_firmware(struct hci_dev *hdev,
struct rome_config *config)
{
const struct firmware *fw;
int ret;
const u8 *segment;
int ret, remain, i = 0;
bt_dev_info(hdev, "ROME Downloading %s", config->fwname);
@ -298,10 +278,24 @@ static int rome_download_firmware(struct hci_dev *hdev,
rome_tlv_check_data(config, fw);
ret = rome_tlv_download_request(hdev, fw);
if (ret) {
BT_ERR("%s: Failed to download file: %s (%d)", hdev->name,
config->fwname, ret);
segment = fw->data;
remain = fw->size;
while (remain > 0) {
int segsize = min(MAX_SIZE_PER_TLV_SEGMENT, remain);
bt_dev_dbg(hdev, "Send segment %d, size %d", i++, segsize);
remain -= segsize;
/* The last segment is always acked regardless download mode */
if (!remain || segsize < MAX_SIZE_PER_TLV_SEGMENT)
config->dnld_mode = ROME_SKIP_EVT_NONE;
ret = rome_tlv_send_segment(hdev, segsize, segment,
config->dnld_mode);
if (ret)
break;
segment += segsize;
}
release_firmware(fw);
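
The new download loop in rome_download_firmware() above can be sketched outside the kernel as follows. This is a minimal illustration, not the driver code: `SEG_MAX` is an assumed stand-in for MAX_SIZE_PER_TLV_SEGMENT (its real value lives in the driver header), and `send_in_segments`, `seg_log` and `log_segment` are hypothetical names introduced only for this example.

```c
#include <stddef.h>

/* Assumed segment limit, for illustration only. */
#define SEG_MAX 243

enum dnld_mode { SKIP_EVT_NONE, SKIP_EVT_VSE, SKIP_EVT_CC, SKIP_EVT_VSE_CC };

/* Hypothetical callback standing in for rome_tlv_send_segment(). */
typedef int (*emit_fn)(const unsigned char *seg, size_t len,
		       enum dnld_mode mode, void *ctx);

/* Hypothetical recorder used to observe what the loop emits. */
struct seg_log {
	int count;
	size_t last_len;
	enum dnld_mode last_mode;
};

static int log_segment(const unsigned char *seg, size_t len,
		       enum dnld_mode mode, void *ctx)
{
	struct seg_log *log = ctx;

	(void)seg;
	log->count++;
	log->last_len = len;
	log->last_mode = mode;
	return 0;
}

/* Walk the buffer in chunks of at most SEG_MAX bytes. */
static int send_in_segments(const unsigned char *data, size_t size,
			    enum dnld_mode mode, emit_fn emit, void *ctx)
{
	size_t remain = size;

	while (remain > 0) {
		size_t segsize = remain < SEG_MAX ? remain : SEG_MAX;
		int ret;

		remain -= segsize;

		/* The last segment is always acked, whatever the mode. */
		if (remain == 0 || segsize < SEG_MAX)
			mode = SKIP_EVT_NONE;

		ret = emit(data, segsize, mode, ctx);
		if (ret)
			return ret;

		data += segsize;
	}

	return 0;
}
```

With a 500-byte buffer and the assumed limit of 243, the sketch emits three segments; only the final 14-byte segment is sent in SKIP_EVT_NONE mode.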


@ -61,6 +61,13 @@ enum qca_bardrate {
QCA_BAUDRATE_RESERVED
};
enum rome_tlv_dnld_mode {
ROME_SKIP_EVT_NONE,
ROME_SKIP_EVT_VSE,
ROME_SKIP_EVT_CC,
ROME_SKIP_EVT_VSE_CC
};
enum rome_tlv_type {
TLV_TYPE_PATCH = 1,
TLV_TYPE_NVM
@ -70,6 +77,7 @@ struct rome_config {
u8 type;
char fwname[64];
uint8_t user_baud_rate;
enum rome_tlv_dnld_mode dnld_mode;
};
struct edl_event_hdr {
@ -94,7 +102,8 @@ struct tlv_type_patch {
__le32 data_length;
__u8 format_version;
__u8 signature;
__le16 reserved1;
__u8 download_mode;
__u8 reserved1;
__le16 product_id;
__le16 rom_build;
__le16 patch_version;


@ -65,6 +65,7 @@ static int btqcomsmd_cmd_callback(struct rpmsg_device *rpdev, void *data,
{
struct btqcomsmd *btq = priv;
btq->hdev->stat.byte_rx += count;
return btqcomsmd_recv(btq->hdev, HCI_EVENT_PKT, data, count);
}
@ -76,12 +77,21 @@ static int btqcomsmd_send(struct hci_dev *hdev, struct sk_buff *skb)
switch (hci_skb_pkt_type(skb)) {
case HCI_ACLDATA_PKT:
ret = rpmsg_send(btq->acl_channel, skb->data, skb->len);
if (ret) {
hdev->stat.err_tx++;
break;
}
hdev->stat.acl_tx++;
hdev->stat.byte_tx += skb->len;
break;
case HCI_COMMAND_PKT:
ret = rpmsg_send(btq->cmd_channel, skb->data, skb->len);
if (ret) {
hdev->stat.err_tx++;
break;
}
hdev->stat.cmd_tx++;
hdev->stat.byte_tx += skb->len;
break;
default:
ret = -EILSEQ;


@ -276,6 +276,8 @@ static const struct usb_device_id blacklist_table[] = {
{ USB_DEVICE(0x04ca, 0x3011), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x04ca, 0x3015), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x04ca, 0x3016), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x04ca, 0x301a), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x13d3, 0x3496), .driver_info = BTUSB_QCA_ROME },
/* Broadcom BCM2035 */
{ USB_DEVICE(0x0a5c, 0x2009), .driver_info = BTUSB_BCM92035 },
@ -371,6 +373,9 @@ static const struct usb_device_id blacklist_table[] = {
/* Additional Realtek 8723BU Bluetooth devices */
{ USB_DEVICE(0x7392, 0xa611), .driver_info = BTUSB_REALTEK },
/* Additional Realtek 8723DE Bluetooth devices */
{ USB_DEVICE(0x2ff8, 0xb011), .driver_info = BTUSB_REALTEK },
/* Additional Realtek 8821AE Bluetooth devices */
{ USB_DEVICE(0x0b05, 0x17dc), .driver_info = BTUSB_REALTEK },
{ USB_DEVICE(0x13d3, 0x3414), .driver_info = BTUSB_REALTEK },
@ -379,6 +384,7 @@ static const struct usb_device_id blacklist_table[] = {
{ USB_DEVICE(0x13d3, 0x3462), .driver_info = BTUSB_REALTEK },
/* Additional Realtek 8822BE Bluetooth devices */
{ USB_DEVICE(0x13d3, 0x3526), .driver_info = BTUSB_REALTEK },
{ USB_DEVICE(0x0b05, 0x185c), .driver_info = BTUSB_REALTEK },
/* Silicon Wave based devices */
@ -406,6 +412,13 @@ static const struct dmi_system_id btusb_needs_reset_resume_table[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "XPS 13 9360"),
},
},
{
/* Dell Inspiron 5565 (QCA ROME device 0cf3:e009) */
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "Inspiron 5565"),
},
},
{}
};
@ -2497,11 +2510,9 @@ static const struct qca_device_info qca_devices_table[] = {
{ 0x00000302, 28, 4, 18 }, /* Rome 3.2 */
};
static int btusb_qca_send_vendor_req(struct hci_dev *hdev, u8 request,
static int btusb_qca_send_vendor_req(struct usb_device *udev, u8 request,
void *data, u16 size)
{
struct btusb_data *btdata = hci_get_drvdata(hdev);
struct usb_device *udev = btdata->udev;
int pipe, err;
u8 *buf;
@ -2516,7 +2527,7 @@ static int btusb_qca_send_vendor_req(struct hci_dev *hdev, u8 request,
err = usb_control_msg(udev, pipe, request, USB_TYPE_VENDOR | USB_DIR_IN,
0, 0, buf, size, USB_CTRL_SET_TIMEOUT);
if (err < 0) {
bt_dev_err(hdev, "Failed to access otp area (%d)", err);
dev_err(&udev->dev, "Failed to access otp area (%d)", err);
goto done;
}
@ -2666,20 +2677,38 @@ static int btusb_setup_qca_load_nvm(struct hci_dev *hdev,
return err;
}
/* identify the ROM version and check whether patches are needed */
static bool btusb_qca_need_patch(struct usb_device *udev)
{
struct qca_version ver;
if (btusb_qca_send_vendor_req(udev, QCA_GET_TARGET_VERSION, &ver,
sizeof(ver)) < 0)
return false;
/* only low ROM versions need patches */
return !(le32_to_cpu(ver.rom_version) & ~0xffffU);
}
static int btusb_setup_qca(struct hci_dev *hdev)
{
struct btusb_data *btdata = hci_get_drvdata(hdev);
struct usb_device *udev = btdata->udev;
const struct qca_device_info *info = NULL;
struct qca_version ver;
u32 ver_rom;
u8 status;
int i, err;
err = btusb_qca_send_vendor_req(hdev, QCA_GET_TARGET_VERSION, &ver,
err = btusb_qca_send_vendor_req(udev, QCA_GET_TARGET_VERSION, &ver,
sizeof(ver));
if (err < 0)
return err;
ver_rom = le32_to_cpu(ver.rom_version);
/* Don't care about high ROM versions */
if (ver_rom & ~0xffffU)
return 0;
for (i = 0; i < ARRAY_SIZE(qca_devices_table); i++) {
if (ver_rom == qca_devices_table[i].rom_version)
info = &qca_devices_table[i];
@ -2689,7 +2718,7 @@ static int btusb_setup_qca(struct hci_dev *hdev)
return -ENODEV;
}
err = btusb_qca_send_vendor_req(hdev, QCA_CHECK_STATUS, &status,
err = btusb_qca_send_vendor_req(udev, QCA_CHECK_STATUS, &status,
sizeof(status));
if (err < 0)
return err;
@ -2903,7 +2932,8 @@ static int btusb_probe(struct usb_interface *intf,
/* Old firmware would otherwise let ath3k driver load
* patch and sysconfig files
*/
if (le16_to_cpu(udev->descriptor.bcdDevice) <= 0x0001)
if (le16_to_cpu(udev->descriptor.bcdDevice) <= 0x0001 &&
!btusb_qca_need_patch(udev))
return -ENODEV;
}
@ -3065,6 +3095,7 @@ static int btusb_probe(struct usb_interface *intf,
}
if (id->driver_info & BTUSB_ATH3012) {
data->setup_on_usb = btusb_setup_qca;
hdev->set_bdaddr = btusb_set_bdaddr_ath3012;
set_bit(HCI_QUIRK_SIMULTANEOUS_DISCOVERY, &hdev->quirks);
set_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks);
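
The ROM version test shared by btusb_qca_need_patch() and btusb_setup_qca() above reduces to a single mask check. A minimal standalone sketch, assuming the version has already been converted to host byte order; `rom_version_is_low` is a hypothetical name used only here:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A ROM version counts as "low" when no bit above the lower 16 bits
 * is set. Per the diff, only low ROM versions need patches.
 */
static bool rom_version_is_low(uint32_t ver_rom)
{
	return !(ver_rom & ~0xffffU);
}
```

For example, 0x00000302 (the "Rome 3.2" table entry) passes the check, while any value with a bit set at position 16 or higher does not.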


@ -380,10 +380,6 @@ static int bcm_open(struct hci_uart *hu)
mutex_lock(&bcm_device_lock);
if (hu->serdev) {
err = serdev_device_open(hu->serdev);
if (err)
goto err_free;
bcm->dev = serdev_device_get_drvdata(hu->serdev);
goto out;
}
@ -420,13 +416,10 @@ out:
return 0;
err_unset_hu:
if (hu->serdev)
serdev_device_close(hu->serdev);
#ifdef CONFIG_PM
else
if (!hu->serdev)
bcm->dev->hu = NULL;
#endif
err_free:
mutex_unlock(&bcm_device_lock);
hu->priv = NULL;
kfree(bcm);
@ -445,7 +438,6 @@ static int bcm_close(struct hci_uart *hu)
mutex_lock(&bcm_device_lock);
if (hu->serdev) {
serdev_device_close(hu->serdev);
bdev = serdev_device_get_drvdata(hu->serdev);
} else if (bcm_device_exists(bcm->dev)) {
bdev = bcm->dev;
@ -501,7 +493,7 @@ static int bcm_setup(struct hci_uart *hu)
hu->hdev->set_diag = bcm_set_diag;
hu->hdev->set_bdaddr = btbcm_set_bdaddr;
err = btbcm_initialize(hu->hdev, fw_name, sizeof(fw_name));
err = btbcm_initialize(hu->hdev, fw_name, sizeof(fw_name), false);
if (err)
return err;
@ -794,19 +786,21 @@ static const struct acpi_gpio_mapping acpi_bcm_int_first_gpios[] = {
{ },
};
#ifdef CONFIG_ACPI
/* IRQ polarity of some chipsets are not defined correctly in ACPI table. */
static const struct dmi_system_id bcm_active_low_irq_dmi_table[] = {
{ /* Handle ThinkPad 8 tablets with BCM2E55 chipset ACPI ID */
.ident = "Lenovo ThinkPad 8",
/* Some firmware reports an IRQ which does not work (wrong pin in fw table?) */
static const struct dmi_system_id bcm_broken_irq_dmi_table[] = {
{
.ident = "Meegopad T08",
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LENOVO"),
DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "ThinkPad 8"),
DMI_EXACT_MATCH(DMI_BOARD_VENDOR,
"To be filled by OEM."),
DMI_EXACT_MATCH(DMI_BOARD_NAME, "T3 MRD"),
DMI_EXACT_MATCH(DMI_BOARD_VERSION, "V1.1"),
},
},
{ }
};
#ifdef CONFIG_ACPI
static int bcm_resource(struct acpi_resource *ares, void *data)
{
struct bcm_device *dev = data;
@ -904,6 +898,8 @@ static int bcm_gpio_set_shutdown(struct bcm_device *dev, bool powered)
static int bcm_get_resources(struct bcm_device *dev)
{
const struct dmi_system_id *dmi_id;
dev->name = dev_name(dev->dev);
if (x86_apple_machine && !bcm_apple_get_resources(dev))
@ -936,6 +932,13 @@ static int bcm_get_resources(struct bcm_device *dev)
dev->irq = gpiod_to_irq(gpio);
}
dmi_id = dmi_first_match(bcm_broken_irq_dmi_table);
if (dmi_id) {
dev_info(dev->dev, "%s: Has a broken IRQ config, disabling IRQ support / runtime-pm\n",
dmi_id->ident);
dev->irq = 0;
}
dev_dbg(dev->dev, "BCM irq: %d\n", dev->irq);
return 0;
}
@ -944,7 +947,6 @@ static int bcm_get_resources(struct bcm_device *dev)
static int bcm_acpi_probe(struct bcm_device *dev)
{
LIST_HEAD(resources);
const struct dmi_system_id *dmi_id;
const struct acpi_gpio_mapping *gpio_mapping = acpi_bcm_int_last_gpios;
struct resource_entry *entry;
int ret;
@ -991,13 +993,6 @@ static int bcm_acpi_probe(struct bcm_device *dev)
dev->irq_active_low = irq_polarity;
dev_warn(dev->dev, "Overwriting IRQ polarity to active %s by module-param\n",
dev->irq_active_low ? "low" : "high");
} else {
dmi_id = dmi_first_match(bcm_active_low_irq_dmi_table);
if (dmi_id) {
dev_warn(dev->dev, "%s: Overwriting IRQ polarity to active low",
dmi_id->ident);
dev->irq_active_low = true;
}
}
return 0;


@ -195,7 +195,7 @@ restart:
clear_bit(HCI_UART_SENDING, &hu->tx_state);
}
static void hci_uart_init_work(struct work_struct *work)
void hci_uart_init_work(struct work_struct *work)
{
struct hci_uart *hu = container_of(work, struct hci_uart, init_ready);
int err;
@ -229,15 +229,6 @@ int hci_uart_init_ready(struct hci_uart *hu)
}
/* ------- Interface to HCI layer ------ */
/* Initialize device */
static int hci_uart_open(struct hci_dev *hdev)
{
BT_DBG("%s %p", hdev->name, hdev);
/* Nothing to do for UART driver */
return 0;
}
/* Reset device */
static int hci_uart_flush(struct hci_dev *hdev)
{
@ -264,6 +255,17 @@ static int hci_uart_flush(struct hci_dev *hdev)
return 0;
}
/* Initialize device */
static int hci_uart_open(struct hci_dev *hdev)
{
BT_DBG("%s %p", hdev->name, hdev);
/* Undo clearing this from hci_uart_close() */
hdev->flush = hci_uart_flush;
return 0;
}
/* Close device */
static int hci_uart_close(struct hci_dev *hdev)
{
@ -447,6 +449,8 @@ static int hci_uart_setup(struct hci_dev *hdev)
btbcm_check_bdaddr(hdev);
break;
#endif
default:
break;
}
done:


@ -141,7 +141,6 @@ static int ll_open(struct hci_uart *hu)
if (hu->serdev) {
struct ll_device *lldev = serdev_device_get_drvdata(hu->serdev);
serdev_device_open(hu->serdev);
if (!IS_ERR(lldev->ext_clk))
clk_prepare_enable(lldev->ext_clk);
}
@ -179,8 +178,6 @@ static int ll_close(struct hci_uart *hu)
gpiod_set_value_cansleep(lldev->enable_gpio, 0);
clk_disable_unprepare(lldev->ext_clk);
serdev_device_close(hu->serdev);
}
hu->priv = NULL;


@ -477,8 +477,6 @@ static int nokia_open(struct hci_uart *hu)
dev_dbg(dev, "protocol open");
serdev_device_open(hu->serdev);
pm_runtime_enable(dev);
return 0;
@ -513,7 +511,6 @@ static int nokia_close(struct hci_uart *hu)
gpiod_set_value(btdev->wakeup_bt, 0);
pm_runtime_disable(&btdev->serdev->dev);
serdev_device_close(btdev->serdev);
return 0;
}


@ -29,7 +29,12 @@
*/
#include <linux/kernel.h>
#include <linux/clk.h>
#include <linux/debugfs.h>
#include <linux/gpio/consumer.h>
#include <linux/mod_devicetable.h>
#include <linux/module.h>
#include <linux/serdev.h>
#include <net/bluetooth/bluetooth.h>
#include <net/bluetooth/hci_core.h>
@ -50,6 +55,9 @@
#define IBS_TX_IDLE_TIMEOUT_MS 2000
#define BAUDRATE_SETTLE_TIMEOUT_MS 300
/* susclk rate */
#define SUSCLK_RATE_32KHZ 32768
/* HCI_IBS transmit side sleep protocol states */
enum tx_ibs_states {
HCI_IBS_TX_ASLEEP,
@ -111,6 +119,12 @@ struct qca_data {
u64 votes_off;
};
struct qca_serdev {
struct hci_uart serdev_hu;
struct gpio_desc *bt_en;
struct clk *susclk;
};
static void __serial_clock_on(struct tty_struct *tty)
{
/* TODO: Some chipset requires to enable UART clock on client
@ -386,6 +400,7 @@ static void hci_ibs_wake_retrans_timeout(struct timer_list *t)
/* Initialize protocol */
static int qca_open(struct hci_uart *hu)
{
struct qca_serdev *qcadev;
struct qca_data *qca;
BT_DBG("hu %p qca_open", hu);
@ -444,6 +459,13 @@ static int qca_open(struct hci_uart *hu)
timer_setup(&qca->tx_idle_timer, hci_ibs_tx_idle_timeout, 0);
qca->tx_idle_delay = IBS_TX_IDLE_TIMEOUT_MS;
if (hu->serdev) {
serdev_device_open(hu->serdev);
qcadev = serdev_device_get_drvdata(hu->serdev);
gpiod_set_value_cansleep(qcadev->bt_en, 1);
}
BT_DBG("HCI_UART_QCA open, tx_idle_delay=%u, wake_retrans=%u",
qca->tx_idle_delay, qca->wake_retrans);
@ -512,6 +534,7 @@ static int qca_flush(struct hci_uart *hu)
/* Close protocol */
static int qca_close(struct hci_uart *hu)
{
struct qca_serdev *qcadev;
struct qca_data *qca = hu->priv;
BT_DBG("hu %p qca close", hu);
@ -525,6 +548,13 @@ static int qca_close(struct hci_uart *hu)
destroy_workqueue(qca->workqueue);
qca->hu = NULL;
if (hu->serdev) {
serdev_device_close(hu->serdev);
qcadev = serdev_device_get_drvdata(hu->serdev);
gpiod_set_value_cansleep(qcadev->bt_en, 0);
}
kfree_skb(qca->rx_skb);
hu->priv = NULL;
@ -880,11 +910,19 @@ static int qca_set_baudrate(struct hci_dev *hdev, uint8_t baudrate)
*/
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(msecs_to_jiffies(BAUDRATE_SETTLE_TIMEOUT_MS));
set_current_state(TASK_INTERRUPTIBLE);
set_current_state(TASK_RUNNING);
return 0;
}
static inline void host_set_baudrate(struct hci_uart *hu, unsigned int speed)
{
if (hu->serdev)
serdev_device_set_baudrate(hu->serdev, speed);
else
hci_uart_set_baudrate(hu, speed);
}
static int qca_setup(struct hci_uart *hu)
{
struct hci_dev *hdev = hu->hdev;
@ -905,7 +943,7 @@ static int qca_setup(struct hci_uart *hu)
speed = hu->proto->init_speed;
if (speed)
hci_uart_set_baudrate(hu, speed);
host_set_baudrate(hu, speed);
/* Setup user speed if needed */
speed = 0;
@ -924,7 +962,7 @@ static int qca_setup(struct hci_uart *hu)
ret);
return ret;
}
hci_uart_set_baudrate(hu, speed);
host_set_baudrate(hu, speed);
}
/* Setup patch / NVM configurations */
@ -935,6 +973,12 @@ static int qca_setup(struct hci_uart *hu)
} else if (ret == -ENOENT) {
/* No patch/nvm-config found, run with original fw/config */
ret = 0;
} else if (ret == -EAGAIN) {
/*
* Userspace firmware loader will return -EAGAIN in case no
* patch/nvm-config is found, so run with original fw/config.
*/
ret = 0;
}
/* Setup bdaddr */
@ -958,12 +1002,80 @@ static struct hci_uart_proto qca_proto = {
.dequeue = qca_dequeue,
};
static int qca_serdev_probe(struct serdev_device *serdev)
{
struct qca_serdev *qcadev;
int err;
qcadev = devm_kzalloc(&serdev->dev, sizeof(*qcadev), GFP_KERNEL);
if (!qcadev)
return -ENOMEM;
qcadev->serdev_hu.serdev = serdev;
serdev_device_set_drvdata(serdev, qcadev);
qcadev->bt_en = devm_gpiod_get(&serdev->dev, "enable",
GPIOD_OUT_LOW);
if (IS_ERR(qcadev->bt_en)) {
dev_err(&serdev->dev, "failed to acquire enable gpio\n");
return PTR_ERR(qcadev->bt_en);
}
qcadev->susclk = devm_clk_get(&serdev->dev, NULL);
if (IS_ERR(qcadev->susclk)) {
dev_err(&serdev->dev, "failed to acquire clk\n");
return PTR_ERR(qcadev->susclk);
}
err = clk_set_rate(qcadev->susclk, SUSCLK_RATE_32KHZ);
if (err)
return err;
err = clk_prepare_enable(qcadev->susclk);
if (err)
return err;
err = hci_uart_register_device(&qcadev->serdev_hu, &qca_proto);
if (err)
clk_disable_unprepare(qcadev->susclk);
return err;
}
static void qca_serdev_remove(struct serdev_device *serdev)
{
struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
hci_uart_unregister_device(&qcadev->serdev_hu);
clk_disable_unprepare(qcadev->susclk);
}
static const struct of_device_id qca_bluetooth_of_match[] = {
{ .compatible = "qcom,qca6174-bt" },
{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, qca_bluetooth_of_match);
static struct serdev_device_driver qca_serdev_driver = {
.probe = qca_serdev_probe,
.remove = qca_serdev_remove,
.driver = {
.name = "hci_uart_qca",
.of_match_table = qca_bluetooth_of_match,
},
};
int __init qca_init(void)
{
serdev_device_driver_register(&qca_serdev_driver);
return hci_uart_register_proto(&qca_proto);
}
int __exit qca_deinit(void)
{
serdev_device_driver_unregister(&qca_serdev_driver);
return hci_uart_unregister_proto(&qca_proto);
}


@ -101,14 +101,6 @@ static void hci_uart_write_work(struct work_struct *work)
/* ------- Interface to HCI layer ------ */
/* Initialize device */
static int hci_uart_open(struct hci_dev *hdev)
{
BT_DBG("%s %p", hdev->name, hdev);
return 0;
}
/* Reset device */
static int hci_uart_flush(struct hci_dev *hdev)
{
@ -129,6 +121,17 @@ static int hci_uart_flush(struct hci_dev *hdev)
return 0;
}
/* Initialize device */
static int hci_uart_open(struct hci_dev *hdev)
{
BT_DBG("%s %p", hdev->name, hdev);
/* Undo clearing this from hci_uart_close() */
hdev->flush = hci_uart_flush;
return 0;
}
/* Close device */
static int hci_uart_close(struct hci_dev *hdev)
{
@ -204,9 +207,8 @@ static int hci_uart_setup(struct hci_dev *hdev)
return 0;
}
if (skb->len != sizeof(*ver)) {
if (skb->len != sizeof(*ver))
bt_dev_err(hdev, "Event length mismatch for version info");
}
kfree_skb(skb);
return 0;
@ -282,10 +284,14 @@ int hci_uart_register_device(struct hci_uart *hu,
serdev_device_set_client_ops(hu->serdev, &hci_serdev_client_ops);
err = p->open(hu);
err = serdev_device_open(hu->serdev);
if (err)
return err;
err = p->open(hu);
if (err)
goto err_open;
hu->proto = p;
set_bit(HCI_UART_PROTO_READY, &hu->flags);
@ -302,6 +308,7 @@ int hci_uart_register_device(struct hci_uart *hu,
hdev->bus = HCI_UART;
hci_set_drvdata(hdev, hu);
INIT_WORK(&hu->init_ready, hci_uart_init_work);
INIT_WORK(&hu->write_work, hci_uart_write_work);
percpu_init_rwsem(&hu->proto_lock);
@ -351,6 +358,8 @@ err_register:
err_alloc:
clear_bit(HCI_UART_PROTO_READY, &hu->flags);
p->close(hu);
err_open:
serdev_device_close(hu->serdev);
return err;
}
EXPORT_SYMBOL_GPL(hci_uart_register_device);
@ -365,5 +374,6 @@ void hci_uart_unregister_device(struct hci_uart *hu)
cancel_work_sync(&hu->write_work);
hu->proto->close(hu);
serdev_device_close(hu->serdev);
}
EXPORT_SYMBOL_GPL(hci_uart_unregister_device);
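
The hunks above move serdev_device_open() and serdev_device_close() out of the individual protocol drivers and into hci_uart_register_device() and hci_uart_unregister_device(), with a new err_open label for unwinding. The ordering can be sketched with plain flags instead of real devices. Everything in this block (`struct port`, `register_device`, the `fail_*` switches) is hypothetical and exists only to show the goto-based unwind order:

```c
/* Hypothetical stand-in for the device state touched during registration. */
struct port {
	int serdev_open;   /* set after the "serdev open" step */
	int proto_open;    /* set after the "protocol open" step */
	int fail_proto;    /* force the protocol open step to fail */
	int fail_register; /* force the later registration step to fail */
};

static int register_device(struct port *p)
{
	int err;

	/* Step 1: open the serial device first. */
	p->serdev_open = 1;

	/* Step 2: open the protocol; on failure only step 1 is undone. */
	err = p->fail_proto ? -1 : 0;
	if (err)
		goto err_open;
	p->proto_open = 1;

	/* Step 3: a later step; on failure steps 2 and 1 are undone. */
	err = p->fail_register ? -1 : 0;
	if (err)
		goto err_register;

	return 0;

err_register:
	p->proto_open = 0;  /* mirrors p->close(hu) */
err_open:
	p->serdev_open = 0; /* mirrors serdev_device_close(hu->serdev) */
	return err;
}
```

Each label undoes exactly the steps that completed before the failure, in reverse order, so a failed registration leaves nothing open.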


@ -116,6 +116,7 @@ void hci_uart_unregister_device(struct hci_uart *hu);
int hci_uart_tx_wakeup(struct hci_uart *hu);
int hci_uart_init_ready(struct hci_uart *hu);
void hci_uart_init_work(struct work_struct *work);
void hci_uart_set_baudrate(struct hci_uart *hu, unsigned int speed);
void hci_uart_set_flow_control(struct hci_uart *hu, bool enable);
void hci_uart_set_speeds(struct hci_uart *hu, unsigned int init_speed,


@ -262,6 +262,8 @@ void proc_coredump_connector(struct task_struct *task)
ev->what = PROC_EVENT_COREDUMP;
ev->event_data.coredump.process_pid = task->pid;
ev->event_data.coredump.process_tgid = task->tgid;
ev->event_data.coredump.parent_pid = task->real_parent->pid;
ev->event_data.coredump.parent_tgid = task->real_parent->tgid;
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */
@ -288,6 +290,8 @@ void proc_exit_connector(struct task_struct *task)
ev->event_data.exit.process_tgid = task->tgid;
ev->event_data.exit.exit_code = task->exit_code;
ev->event_data.exit.exit_signal = task->exit_signal;
ev->event_data.exit.parent_pid = task->real_parent->pid;
ev->event_data.exit.parent_tgid = task->real_parent->tgid;
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */


@ -270,7 +270,7 @@ EXPORT_SYMBOL_GPL(dca_remove_requester);
* @dev - the device that wants dca service
* @cpu - the cpuid as returned by get_cpu()
*/
u8 dca_common_get_tag(struct device *dev, int cpu)
static u8 dca_common_get_tag(struct device *dev, int cpu)
{
struct dca_provider *dca;
u8 tag;


@ -849,7 +849,7 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
return 0;
err_cqb:
kfree(*cqb);
kvfree(*cqb);
err_db:
mlx5_ib_db_unmap_user(to_mucontext(context), &cq->db);


@ -116,6 +116,7 @@ enum rdma_cqe_requester_status_enum {
RDMA_CQE_REQ_STS_TRANSPORT_RETRY_CNT_ERR,
RDMA_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR,
RDMA_CQE_REQ_STS_XRC_VOILATION_ERR,
RDMA_CQE_REQ_STS_SIG_ERR,
MAX_RDMA_CQE_REQUESTER_STATUS_ENUM
};
@ -152,12 +153,12 @@ struct rdma_rq_sge {
struct regpair addr;
__le32 length;
__le32 flags;
#define RDMA_RQ_SGE_L_KEY_MASK 0x3FFFFFF
#define RDMA_RQ_SGE_L_KEY_SHIFT 0
#define RDMA_RQ_SGE_L_KEY_LO_MASK 0x3FFFFFF
#define RDMA_RQ_SGE_L_KEY_LO_SHIFT 0
#define RDMA_RQ_SGE_NUM_SGES_MASK 0x7
#define RDMA_RQ_SGE_NUM_SGES_SHIFT 26
#define RDMA_RQ_SGE_RESERVED0_MASK 0x7
#define RDMA_RQ_SGE_RESERVED0_SHIFT 29
#define RDMA_RQ_SGE_L_KEY_HI_MASK 0x7
#define RDMA_RQ_SGE_L_KEY_HI_SHIFT 29
};
struct rdma_srq_sge {
@ -241,18 +242,39 @@ enum rdma_dif_io_direction_flg {
MAX_RDMA_DIF_IO_DIRECTION_FLG
};
/* RDMA DIF Runt Result Structure */
struct rdma_dif_runt_result {
__le16 guard_tag;
__le16 reserved[3];
struct rdma_dif_params {
__le32 base_ref_tag;
__le16 app_tag;
__le16 app_tag_mask;
__le16 runt_crc_value;
__le16 flags;
#define RDMA_DIF_PARAMS_IO_DIRECTION_FLG_MASK 0x1
#define RDMA_DIF_PARAMS_IO_DIRECTION_FLG_SHIFT 0
#define RDMA_DIF_PARAMS_BLOCK_SIZE_MASK 0x1
#define RDMA_DIF_PARAMS_BLOCK_SIZE_SHIFT 1
#define RDMA_DIF_PARAMS_RUNT_VALID_FLG_MASK 0x1
#define RDMA_DIF_PARAMS_RUNT_VALID_FLG_SHIFT 2
#define RDMA_DIF_PARAMS_VALIDATE_CRC_GUARD_MASK 0x1
#define RDMA_DIF_PARAMS_VALIDATE_CRC_GUARD_SHIFT 3
#define RDMA_DIF_PARAMS_VALIDATE_REF_TAG_MASK 0x1
#define RDMA_DIF_PARAMS_VALIDATE_REF_TAG_SHIFT 4
#define RDMA_DIF_PARAMS_VALIDATE_APP_TAG_MASK 0x1
#define RDMA_DIF_PARAMS_VALIDATE_APP_TAG_SHIFT 5
#define RDMA_DIF_PARAMS_CRC_SEED_MASK 0x1
#define RDMA_DIF_PARAMS_CRC_SEED_SHIFT 6
#define RDMA_DIF_PARAMS_RX_REF_TAG_CONST_MASK 0x1
#define RDMA_DIF_PARAMS_RX_REF_TAG_CONST_SHIFT 7
#define RDMA_DIF_PARAMS_BLOCK_GUARD_TYPE_MASK 0x1
#define RDMA_DIF_PARAMS_BLOCK_GUARD_TYPE_SHIFT 8
#define RDMA_DIF_PARAMS_APP_ESCAPE_MASK 0x1
#define RDMA_DIF_PARAMS_APP_ESCAPE_SHIFT 9
#define RDMA_DIF_PARAMS_REF_ESCAPE_MASK 0x1
#define RDMA_DIF_PARAMS_REF_ESCAPE_SHIFT 10
#define RDMA_DIF_PARAMS_RESERVED4_MASK 0x1F
#define RDMA_DIF_PARAMS_RESERVED4_SHIFT 11
__le32 reserved5;
};
/* Memory window type enumeration */
enum rdma_mw_type {
RDMA_MW_TYPE_1,
RDMA_MW_TYPE_2A,
MAX_RDMA_MW_TYPE
};
struct rdma_sq_atomic_wqe {
__le32 reserved1;
@ -334,17 +356,17 @@ struct rdma_sq_bind_wqe {
#define RDMA_SQ_BIND_WQE_SE_FLG_SHIFT 3
#define RDMA_SQ_BIND_WQE_INLINE_FLG_MASK 0x1
#define RDMA_SQ_BIND_WQE_INLINE_FLG_SHIFT 4
#define RDMA_SQ_BIND_WQE_RESERVED0_MASK 0x7
#define RDMA_SQ_BIND_WQE_RESERVED0_SHIFT 5
#define RDMA_SQ_BIND_WQE_DIF_ON_HOST_FLG_MASK 0x1
#define RDMA_SQ_BIND_WQE_DIF_ON_HOST_FLG_SHIFT 5
#define RDMA_SQ_BIND_WQE_RESERVED0_MASK 0x3
#define RDMA_SQ_BIND_WQE_RESERVED0_SHIFT 6
u8 wqe_size;
u8 prev_wqe_size;
u8 bind_ctrl;
#define RDMA_SQ_BIND_WQE_ZERO_BASED_MASK 0x1
#define RDMA_SQ_BIND_WQE_ZERO_BASED_SHIFT 0
#define RDMA_SQ_BIND_WQE_MW_TYPE_MASK 0x1
#define RDMA_SQ_BIND_WQE_MW_TYPE_SHIFT 1
#define RDMA_SQ_BIND_WQE_RESERVED1_MASK 0x3F
#define RDMA_SQ_BIND_WQE_RESERVED1_SHIFT 2
#define RDMA_SQ_BIND_WQE_RESERVED1_MASK 0x7F
#define RDMA_SQ_BIND_WQE_RESERVED1_SHIFT 1
u8 access_ctrl;
#define RDMA_SQ_BIND_WQE_REMOTE_READ_MASK 0x1
#define RDMA_SQ_BIND_WQE_REMOTE_READ_SHIFT 0
@ -363,6 +385,7 @@ struct rdma_sq_bind_wqe {
__le32 length_lo;
__le32 parent_l_key;
__le32 reserved4;
struct rdma_dif_params dif_params;
};
/* First element (16 bytes) of bind wqe */
@ -392,10 +415,8 @@ struct rdma_sq_bind_wqe_2nd {
u8 bind_ctrl;
#define RDMA_SQ_BIND_WQE_2ND_ZERO_BASED_MASK 0x1
#define RDMA_SQ_BIND_WQE_2ND_ZERO_BASED_SHIFT 0
#define RDMA_SQ_BIND_WQE_2ND_MW_TYPE_MASK 0x1
#define RDMA_SQ_BIND_WQE_2ND_MW_TYPE_SHIFT 1
#define RDMA_SQ_BIND_WQE_2ND_RESERVED1_MASK 0x3F
#define RDMA_SQ_BIND_WQE_2ND_RESERVED1_SHIFT 2
#define RDMA_SQ_BIND_WQE_2ND_RESERVED1_MASK 0x7F
#define RDMA_SQ_BIND_WQE_2ND_RESERVED1_SHIFT 1
u8 access_ctrl;
#define RDMA_SQ_BIND_WQE_2ND_REMOTE_READ_MASK 0x1
#define RDMA_SQ_BIND_WQE_2ND_REMOTE_READ_SHIFT 0
@ -416,6 +437,11 @@ struct rdma_sq_bind_wqe_2nd {
__le32 reserved4;
};
/* Third element (16 bytes) of bind wqe */
struct rdma_sq_bind_wqe_3rd {
struct rdma_dif_params dif_params;
};
/* Structure with only the SQ WQE common
* fields. Size is of one SQ element (16B)
*/
@ -486,30 +512,6 @@ struct rdma_sq_fmr_wqe {
u8 length_hi;
__le32 length_lo;
struct regpair pbl_addr;
__le32 dif_base_ref_tag;
__le16 dif_app_tag;
__le16 dif_app_tag_mask;
__le16 dif_runt_crc_value;
__le16 dif_flags;
#define RDMA_SQ_FMR_WQE_DIF_IO_DIRECTION_FLG_MASK 0x1
#define RDMA_SQ_FMR_WQE_DIF_IO_DIRECTION_FLG_SHIFT 0
#define RDMA_SQ_FMR_WQE_DIF_BLOCK_SIZE_MASK 0x1
#define RDMA_SQ_FMR_WQE_DIF_BLOCK_SIZE_SHIFT 1
#define RDMA_SQ_FMR_WQE_DIF_RUNT_VALID_FLG_MASK 0x1
#define RDMA_SQ_FMR_WQE_DIF_RUNT_VALID_FLG_SHIFT 2
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_CRC_GUARD_MASK 0x1
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_CRC_GUARD_SHIFT 3
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_REF_TAG_MASK 0x1
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_REF_TAG_SHIFT 4
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_APP_TAG_MASK 0x1
#define RDMA_SQ_FMR_WQE_DIF_VALIDATE_APP_TAG_SHIFT 5
#define RDMA_SQ_FMR_WQE_DIF_CRC_SEED_MASK 0x1
#define RDMA_SQ_FMR_WQE_DIF_CRC_SEED_SHIFT 6
#define RDMA_SQ_FMR_WQE_DIF_RX_REF_TAG_CONST_MASK 0x1
#define RDMA_SQ_FMR_WQE_DIF_RX_REF_TAG_CONST_SHIFT 7
#define RDMA_SQ_FMR_WQE_RESERVED4_MASK 0xFF
#define RDMA_SQ_FMR_WQE_RESERVED4_SHIFT 8
__le32 reserved5;
};
/* First element (16 bytes) of fmr wqe */
@ -566,33 +568,6 @@ struct rdma_sq_fmr_wqe_2nd {
struct regpair pbl_addr;
};
/* Third element (16 bytes) of fmr wqe */
struct rdma_sq_fmr_wqe_3rd {
__le32 dif_base_ref_tag;
__le16 dif_app_tag;
__le16 dif_app_tag_mask;
__le16 dif_runt_crc_value;
__le16 dif_flags;
#define RDMA_SQ_FMR_WQE_3RD_DIF_IO_DIRECTION_FLG_MASK 0x1
#define RDMA_SQ_FMR_WQE_3RD_DIF_IO_DIRECTION_FLG_SHIFT 0
#define RDMA_SQ_FMR_WQE_3RD_DIF_BLOCK_SIZE_MASK 0x1
#define RDMA_SQ_FMR_WQE_3RD_DIF_BLOCK_SIZE_SHIFT 1
#define RDMA_SQ_FMR_WQE_3RD_DIF_RUNT_VALID_FLG_MASK 0x1
#define RDMA_SQ_FMR_WQE_3RD_DIF_RUNT_VALID_FLG_SHIFT 2
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_CRC_GUARD_MASK 0x1
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_CRC_GUARD_SHIFT 3
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_REF_TAG_MASK 0x1
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_REF_TAG_SHIFT 4
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_APP_TAG_MASK 0x1
#define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_APP_TAG_SHIFT 5
#define RDMA_SQ_FMR_WQE_3RD_DIF_CRC_SEED_MASK 0x1
#define RDMA_SQ_FMR_WQE_3RD_DIF_CRC_SEED_SHIFT 6
#define RDMA_SQ_FMR_WQE_3RD_DIF_RX_REF_TAG_CONST_MASK 0x1
#define RDMA_SQ_FMR_WQE_3RD_DIF_RX_REF_TAG_CONST_SHIFT 7
#define RDMA_SQ_FMR_WQE_3RD_RESERVED4_MASK 0xFF
#define RDMA_SQ_FMR_WQE_RESERVED4_SHIFT 8
__le32 reserved5;
};
struct rdma_sq_local_inv_wqe {
struct regpair reserved;
@ -637,8 +612,8 @@ struct rdma_sq_rdma_wqe {
#define RDMA_SQ_RDMA_WQE_DIF_ON_HOST_FLG_SHIFT 5
#define RDMA_SQ_RDMA_WQE_READ_INV_FLG_MASK 0x1
#define RDMA_SQ_RDMA_WQE_READ_INV_FLG_SHIFT 6
#define RDMA_SQ_RDMA_WQE_RESERVED0_MASK 0x1
#define RDMA_SQ_RDMA_WQE_RESERVED0_SHIFT 7
#define RDMA_SQ_RDMA_WQE_RESERVED1_MASK 0x1
#define RDMA_SQ_RDMA_WQE_RESERVED1_SHIFT 7
u8 wqe_size;
u8 prev_wqe_size;
struct regpair remote_va;
@ -646,13 +621,9 @@ struct rdma_sq_rdma_wqe {
u8 dif_flags;
#define RDMA_SQ_RDMA_WQE_DIF_BLOCK_SIZE_MASK 0x1
#define RDMA_SQ_RDMA_WQE_DIF_BLOCK_SIZE_SHIFT 0
#define RDMA_SQ_RDMA_WQE_DIF_FIRST_RDMA_IN_IO_FLG_MASK 0x1
#define RDMA_SQ_RDMA_WQE_DIF_FIRST_RDMA_IN_IO_FLG_SHIFT 1
#define RDMA_SQ_RDMA_WQE_DIF_LAST_RDMA_IN_IO_FLG_MASK 0x1
#define RDMA_SQ_RDMA_WQE_DIF_LAST_RDMA_IN_IO_FLG_SHIFT 2
#define RDMA_SQ_RDMA_WQE_RESERVED1_MASK 0x1F
#define RDMA_SQ_RDMA_WQE_RESERVED1_SHIFT 3
u8 reserved2[3];
#define RDMA_SQ_RDMA_WQE_RESERVED2_MASK 0x7F
#define RDMA_SQ_RDMA_WQE_RESERVED2_SHIFT 1
u8 reserved3[3];
};
/* First element (16 bytes) of rdma wqe */


@ -3276,7 +3276,7 @@ int qedr_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
SET_FIELD(flags, RDMA_RQ_SGE_NUM_SGES,
wr->num_sge);
SET_FIELD(flags, RDMA_RQ_SGE_L_KEY,
SET_FIELD(flags, RDMA_RQ_SGE_L_KEY_LO,
wr->sg_list[i].lkey);
RQ_SGE_SET(rqe, wr->sg_list[i].addr,
@ -3295,7 +3295,7 @@ int qedr_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
/* First one must include the number
* of SGE in the list
*/
SET_FIELD(flags, RDMA_RQ_SGE_L_KEY, 0);
SET_FIELD(flags, RDMA_RQ_SGE_L_KEY_LO, 0);
SET_FIELD(flags, RDMA_RQ_SGE_NUM_SGES, 1);
RQ_SGE_SET(rqe, 0, 0, flags);
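
The renamed RDMA_RQ_SGE_L_KEY_LO field and the NUM_SGES field share one 32-bit flags word, each described by a mask and a shift. A minimal sketch of that packing follows. The mask and shift values are copied from the header hunk above, but `SET_FIELD_SKETCH` and `pack_rq_sge_flags` are hypothetical; the driver's own SET_FIELD macro is not shown in this diff and may differ, for instance in whether it masks the value.

```c
#include <stdint.h>

/* Values taken from the rdma_rq_sge definitions in the diff. */
#define RQ_SGE_L_KEY_LO_MASK  0x3FFFFFF
#define RQ_SGE_L_KEY_LO_SHIFT 0
#define RQ_SGE_NUM_SGES_MASK  0x7
#define RQ_SGE_NUM_SGES_SHIFT 26

/* Hypothetical helper: clear the field, then store the masked value. */
#define SET_FIELD_SKETCH(value, name, field)				\
	do {								\
		(value) &= ~((uint32_t)name##_MASK << name##_SHIFT);	\
		(value) |= ((uint32_t)(field) & name##_MASK)		\
			   << name##_SHIFT;				\
	} while (0)

/* Pack the low 26 bits of an lkey and a 3-bit SGE count into one word. */
static uint32_t pack_rq_sge_flags(uint32_t lkey, uint32_t num_sges)
{
	uint32_t flags = 0;

	SET_FIELD_SKETCH(flags, RQ_SGE_NUM_SGES, num_sges);
	SET_FIELD_SKETCH(flags, RQ_SGE_L_KEY_LO, lkey);
	return flags;
}
```

In this sketch an lkey wider than 26 bits is truncated to its low part, which is consistent with the header now describing the top three bits as a separate L_KEY_HI field.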


@ -443,17 +443,16 @@ static u8 opa_vnic_get_rc(struct __opa_veswport_info *info,
}
/* opa_vnic_calc_entropy - calculate the packet entropy */
u8 opa_vnic_calc_entropy(struct opa_vnic_adapter *adapter, struct sk_buff *skb)
u8 opa_vnic_calc_entropy(struct sk_buff *skb)
{
u16 hash16;
u32 hash = skb_get_hash(skb);
/*
* Get flow based 16-bit hash and then XOR the upper and lower bytes
* to get the entropy.
* __skb_tx_hash limits qcount to 16 bits. Hence, get 15-bit hash.
*/
hash16 = __skb_tx_hash(adapter->netdev, skb, BIT(15));
return (u8)((hash16 >> 8) ^ (hash16 & 0xff));
/* store XOR of all bytes in lower 8 bits */
hash ^= hash >> 8;
hash ^= hash >> 16;
/* return lower 8 bits as entropy */
return (u8)(hash & 0xFF);
}
/* opa_vnic_get_def_port - get default port based on entropy */
@ -490,7 +489,7 @@ void opa_vnic_encap_skb(struct opa_vnic_adapter *adapter, struct sk_buff *skb)
hdr = skb_push(skb, OPA_VNIC_HDR_LEN);
entropy = opa_vnic_calc_entropy(adapter, skb);
entropy = opa_vnic_calc_entropy(skb);
def_port = opa_vnic_get_def_port(adapter, entropy);
len = opa_vnic_wire_length(skb);
dlid = opa_vnic_get_dlid(adapter, skb, def_port);
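
The rewritten opa_vnic_calc_entropy() above folds a 32-bit flow hash into 8 bits with two shift-and-XOR steps. A standalone sketch of the same fold, taking the hash as a plain integer instead of reading it from an skb (`fold_hash_to_u8` is a hypothetical name):

```c
#include <stdint.h>

/* XOR all four bytes of a 32-bit hash together into one byte. */
static uint8_t fold_hash_to_u8(uint32_t hash)
{
	/* After this step, byte 0 holds b0 ^ b1 and byte 2 holds b2 ^ b3. */
	hash ^= hash >> 8;
	/* After this step, byte 0 holds b0 ^ b1 ^ b2 ^ b3. */
	hash ^= hash >> 16;
	/* Return the lower 8 bits as the entropy value. */
	return (uint8_t)(hash & 0xFF);
}
```

For the input 0x12345678 the result is 0x12 ^ 0x34 ^ 0x56 ^ 0x78 = 0x08.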


@ -299,7 +299,7 @@ struct opa_vnic_adapter *opa_vnic_add_netdev(struct ib_device *ibdev,
void opa_vnic_rem_netdev(struct opa_vnic_adapter *adapter);
void opa_vnic_encap_skb(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
u8 opa_vnic_get_vl(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
u8 opa_vnic_calc_entropy(struct opa_vnic_adapter *adapter, struct sk_buff *skb);
u8 opa_vnic_calc_entropy(struct sk_buff *skb);
void opa_vnic_process_vema_config(struct opa_vnic_adapter *adapter);
void opa_vnic_release_mac_tbl(struct opa_vnic_adapter *adapter);
void opa_vnic_query_mac_tbl(struct opa_vnic_adapter *adapter,


@ -104,7 +104,7 @@ static u16 opa_vnic_select_queue(struct net_device *netdev, struct sk_buff *skb,
/* pass entropy and vl as metadata in skb */
mdata = skb_push(skb, sizeof(*mdata));
mdata->entropy = opa_vnic_calc_entropy(adapter, skb);
mdata->entropy = opa_vnic_calc_entropy(skb);
mdata->vl = opa_vnic_get_vl(adapter, skb);
rc = adapter->rn_ops->ndo_select_queue(netdev, skb,
accel_priv, fallback);


@ -25,6 +25,19 @@ config LIRC
passes raw IR to and from userspace, which is needed for
IR transmitting (aka "blasting") and for the lirc daemon.
config BPF_LIRC_MODE2
bool "Support for eBPF programs attached to lirc devices"
depends on BPF_SYSCALL
depends on RC_CORE=y
depends on LIRC
help
Allow attaching eBPF programs to a lirc device using the bpf(2)
syscall command BPF_PROG_ATTACH. This is supported for raw IR
receivers.
These eBPF programs can be used to decode IR into scancodes, for
IR protocols not supported by the kernel decoders.
menuconfig RC_DECODERS
bool "Remote controller decoders"
depends on RC_CORE


@ -5,6 +5,7 @@ obj-y += keymaps/
obj-$(CONFIG_RC_CORE) += rc-core.o
rc-core-y := rc-main.o rc-ir-raw.o
rc-core-$(CONFIG_LIRC) += lirc_dev.o
rc-core-$(CONFIG_BPF_LIRC_MODE2) += bpf-lirc.o
obj-$(CONFIG_IR_NEC_DECODER) += ir-nec-decoder.o
obj-$(CONFIG_IR_RC5_DECODER) += ir-rc5-decoder.o
obj-$(CONFIG_IR_RC6_DECODER) += ir-rc6-decoder.o

drivers/media/rc/bpf-lirc.c (new file, 313 lines)

@ -0,0 +1,313 @@
// SPDX-License-Identifier: GPL-2.0
// bpf-lirc.c - handles bpf
//
// Copyright (C) 2018 Sean Young <sean@mess.org>
#include <linux/bpf.h>
#include <linux/filter.h>
#include <linux/bpf_lirc.h>
#include "rc-core-priv.h"
/*
* BPF interface for raw IR
*/
const struct bpf_prog_ops lirc_mode2_prog_ops = {
};
BPF_CALL_1(bpf_rc_repeat, u32*, sample)
{
struct ir_raw_event_ctrl *ctrl;
ctrl = container_of(sample, struct ir_raw_event_ctrl, bpf_sample);
rc_repeat(ctrl->dev);
return 0;
}
static const struct bpf_func_proto rc_repeat_proto = {
.func = bpf_rc_repeat,
.gpl_only = true, /* rc_repeat is EXPORT_SYMBOL_GPL */
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_CTX,
};
/*
* Currently rc-core does not support 64-bit scancodes, but there are many
* known protocols with more than 32 bits. So, define the interface as u64
* to future-proof it.
*/
BPF_CALL_4(bpf_rc_keydown, u32*, sample, u32, protocol, u64, scancode,
u32, toggle)
{
struct ir_raw_event_ctrl *ctrl;
ctrl = container_of(sample, struct ir_raw_event_ctrl, bpf_sample);
rc_keydown(ctrl->dev, protocol, scancode, toggle != 0);
return 0;
}
static const struct bpf_func_proto rc_keydown_proto = {
.func = bpf_rc_keydown,
.gpl_only = true, /* rc_keydown is EXPORT_SYMBOL_GPL */
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_CTX,
.arg2_type = ARG_ANYTHING,
.arg3_type = ARG_ANYTHING,
.arg4_type = ARG_ANYTHING,
};
static const struct bpf_func_proto *
lirc_mode2_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
switch (func_id) {
case BPF_FUNC_rc_repeat:
return &rc_repeat_proto;
case BPF_FUNC_rc_keydown:
return &rc_keydown_proto;
case BPF_FUNC_map_lookup_elem:
return &bpf_map_lookup_elem_proto;
case BPF_FUNC_map_update_elem:
return &bpf_map_update_elem_proto;
case BPF_FUNC_map_delete_elem:
return &bpf_map_delete_elem_proto;
case BPF_FUNC_ktime_get_ns:
return &bpf_ktime_get_ns_proto;
case BPF_FUNC_tail_call:
return &bpf_tail_call_proto;
case BPF_FUNC_get_prandom_u32:
return &bpf_get_prandom_u32_proto;
case BPF_FUNC_trace_printk:
if (capable(CAP_SYS_ADMIN))
return bpf_get_trace_printk_proto();
/* fall through */
default:
return NULL;
}
}
static bool lirc_mode2_is_valid_access(int off, int size,
enum bpf_access_type type,
const struct bpf_prog *prog,
struct bpf_insn_access_aux *info)
{
/* We have one field of u32 */
return type == BPF_READ && off == 0 && size == sizeof(u32);
}
const struct bpf_verifier_ops lirc_mode2_verifier_ops = {
.get_func_proto = lirc_mode2_func_proto,
.is_valid_access = lirc_mode2_is_valid_access
};
#define BPF_MAX_PROGS 64
static int lirc_bpf_attach(struct rc_dev *rcdev, struct bpf_prog *prog)
{
struct bpf_prog_array __rcu *old_array;
struct bpf_prog_array *new_array;
struct ir_raw_event_ctrl *raw;
int ret;
if (rcdev->driver_type != RC_DRIVER_IR_RAW)
return -EINVAL;
ret = mutex_lock_interruptible(&ir_raw_handler_lock);
if (ret)
return ret;
raw = rcdev->raw;
if (!raw) {
ret = -ENODEV;
goto unlock;
}
if (raw->progs && bpf_prog_array_length(raw->progs) >= BPF_MAX_PROGS) {
ret = -E2BIG;
goto unlock;
}
old_array = raw->progs;
ret = bpf_prog_array_copy(old_array, NULL, prog, &new_array);
if (ret < 0)
goto unlock;
rcu_assign_pointer(raw->progs, new_array);
bpf_prog_array_free(old_array);
unlock:
mutex_unlock(&ir_raw_handler_lock);
return ret;
}
static int lirc_bpf_detach(struct rc_dev *rcdev, struct bpf_prog *prog)
{
struct bpf_prog_array __rcu *old_array;
struct bpf_prog_array *new_array;
struct ir_raw_event_ctrl *raw;
int ret;
if (rcdev->driver_type != RC_DRIVER_IR_RAW)
return -EINVAL;
ret = mutex_lock_interruptible(&ir_raw_handler_lock);
if (ret)
return ret;
raw = rcdev->raw;
if (!raw) {
ret = -ENODEV;
goto unlock;
}
old_array = raw->progs;
ret = bpf_prog_array_copy(old_array, prog, NULL, &new_array);
/*
* Do not use bpf_prog_array_delete_safe() as we would end up
* with a dummy entry in the array, and then we would free the
* dummy in lirc_bpf_free()
*/
if (ret)
goto unlock;
rcu_assign_pointer(raw->progs, new_array);
bpf_prog_array_free(old_array);
unlock:
mutex_unlock(&ir_raw_handler_lock);
return ret;
}
void lirc_bpf_run(struct rc_dev *rcdev, u32 sample)
{
struct ir_raw_event_ctrl *raw = rcdev->raw;
raw->bpf_sample = sample;
if (raw->progs)
BPF_PROG_RUN_ARRAY(raw->progs, &raw->bpf_sample, BPF_PROG_RUN);
}
/*
* This should be called once the rc thread has been stopped, so there can be
* no concurrent bpf execution.
*/
void lirc_bpf_free(struct rc_dev *rcdev)
{
struct bpf_prog **progs;
if (!rcdev->raw->progs)
return;
progs = rcu_dereference(rcdev->raw->progs)->progs;
while (*progs)
bpf_prog_put(*progs++);
bpf_prog_array_free(rcdev->raw->progs);
}
int lirc_prog_attach(const union bpf_attr *attr)
{
struct bpf_prog *prog;
struct rc_dev *rcdev;
int ret;
if (attr->attach_flags)
return -EINVAL;
prog = bpf_prog_get_type(attr->attach_bpf_fd,
BPF_PROG_TYPE_LIRC_MODE2);
if (IS_ERR(prog))
return PTR_ERR(prog);
rcdev = rc_dev_get_from_fd(attr->target_fd);
if (IS_ERR(rcdev)) {
bpf_prog_put(prog);
return PTR_ERR(rcdev);
}
ret = lirc_bpf_attach(rcdev, prog);
if (ret)
bpf_prog_put(prog);
put_device(&rcdev->dev);
return ret;
}
int lirc_prog_detach(const union bpf_attr *attr)
{
struct bpf_prog *prog;
struct rc_dev *rcdev;
int ret;
if (attr->attach_flags)
return -EINVAL;
prog = bpf_prog_get_type(attr->attach_bpf_fd,
BPF_PROG_TYPE_LIRC_MODE2);
if (IS_ERR(prog))
return PTR_ERR(prog);
rcdev = rc_dev_get_from_fd(attr->target_fd);
if (IS_ERR(rcdev)) {
bpf_prog_put(prog);
return PTR_ERR(rcdev);
}
ret = lirc_bpf_detach(rcdev, prog);
bpf_prog_put(prog);
put_device(&rcdev->dev);
return ret;
}
int lirc_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr)
{
__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
struct bpf_prog_array __rcu *progs;
struct rc_dev *rcdev;
u32 cnt, flags = 0;
int ret;
if (attr->query.query_flags)
return -EINVAL;
rcdev = rc_dev_get_from_fd(attr->query.target_fd);
if (IS_ERR(rcdev))
return PTR_ERR(rcdev);
if (rcdev->driver_type != RC_DRIVER_IR_RAW) {
ret = -EINVAL;
goto put;
}
ret = mutex_lock_interruptible(&ir_raw_handler_lock);
if (ret)
goto put;
progs = rcdev->raw->progs;
cnt = progs ? bpf_prog_array_length(progs) : 0;
if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt))) {
ret = -EFAULT;
goto unlock;
}
if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags))) {
ret = -EFAULT;
goto unlock;
}
if (attr->query.prog_cnt != 0 && prog_ids && cnt)
ret = bpf_prog_array_copy_to_user(progs, prog_ids, cnt);
unlock:
mutex_unlock(&ir_raw_handler_lock);
put:
put_device(&rcdev->dev);
return ret;
}
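
The context passed to an attached program is a single u32 sample, as enforced by lirc_mode2_is_valid_access(). The sketch below shows, in plain C, how such a sample could be split into its type and duration. It assumes the LIRC mode2 encoding from the uapi <linux/lirc.h> header: the top 8 bits carry the event type and the low 24 bits carry the value, which is a duration in microseconds for pulses and spaces. The constants are redefined locally so the example builds on its own, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the mode2 layout; assumed to mirror uapi <linux/lirc.h>. */
#define MODE2_VALUE_MASK 0x00FFFFFFu
#define MODE2_MODE_MASK  0xFF000000u
#define MODE2_SPACE      0x00000000u
#define MODE2_PULSE      0x01000000u

/* True if the sample describes a pulse (IR carrier on). */
bool mode2_is_pulse(uint32_t sample)
{
	return (sample & MODE2_MODE_MASK) == MODE2_PULSE;
}

/* Duration of a pulse or space sample, in microseconds. */
uint32_t mode2_duration_us(uint32_t sample)
{
	return sample & MODE2_VALUE_MASK;
}

/*
 * Hypothetical decoder building block: is this sample a pulse whose
 * duration lies within margin_us of expect_us?
 */
bool mode2_pulse_near(uint32_t sample, uint32_t expect_us, uint32_t margin_us)
{
	uint32_t d = mode2_duration_us(sample);
	/* Clamp the lower bound at zero to avoid unsigned wrap-around. */
	uint32_t lo = expect_us > margin_us ? expect_us - margin_us : 0;
	uint32_t hi = expect_us + margin_us;

	return mode2_is_pulse(sample) && d >= lo && d <= hi;
}
```

A real decoder would run logic like this inside the BPF program, keep protocol state in a map between samples, and report a completed scancode through the bpf_rc_keydown helper added in this file.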


@ -20,6 +20,7 @@
#include <linux/module.h>
#include <linux/mutex.h>
#include <linux/device.h>
#include <linux/file.h>
#include <linux/idr.h>
#include <linux/poll.h>
#include <linux/sched.h>
@ -104,6 +105,12 @@ void ir_lirc_raw_event(struct rc_dev *dev, struct ir_raw_event ev)
TO_US(ev.duration), TO_STR(ev.pulse));
}
/*
* bpf does not care about the gap generated above; that exists
* for backwards compatibility
*/
lirc_bpf_run(dev, sample);
spin_lock_irqsave(&dev->lirc_fh_lock, flags);
list_for_each_entry(fh, &dev->lirc_fh, list) {
if (LIRC_IS_TIMEOUT(sample) && !fh->send_timeout_reports)
@ -816,4 +823,27 @@ void __exit lirc_dev_exit(void)
unregister_chrdev_region(lirc_base_dev, RC_DEV_MAX);
}
struct rc_dev *rc_dev_get_from_fd(int fd)
{
struct fd f = fdget(fd);
struct lirc_fh *fh;
struct rc_dev *dev;
if (!f.file)
return ERR_PTR(-EBADF);
if (f.file->f_op != &lirc_fops) {
fdput(f);
return ERR_PTR(-EINVAL);
}
fh = f.file->private_data;
dev = fh->rc;
get_device(&dev->dev);
fdput(f);
return dev;
}
MODULE_ALIAS("lirc_dev");

Some files were not shown because too many files have changed in this diff.