From 6d632bbcda73d9c08534cad279a11d9a985deacc Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Sun, 11 Aug 2024 09:43:45 -0700 Subject: [PATCH 1/6] man/ip-xfrm: fix dangling quote The man page had a dangling quote character in the usage text which can confuse auto-color/format code like Emacs and Vim. Signed-off-by: Stephen Hemminger --- man/man8/ip-xfrm.8 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/man/man8/ip-xfrm.8 b/man/man8/ip-xfrm.8 index 960779dd..3efd6172 100644 --- a/man/man8/ip-xfrm.8 +++ b/man/man8/ip-xfrm.8 @@ -71,7 +71,7 @@ ip-xfrm \- transform configuration .RB "[ " offload .RB "[ " crypto | packet " ]" .RB dev -.IR DEV " +.I DEV .RB dir .IR DIR " ]" .RB "[ " tfcpad From 6e4c3ffb82278c7abeac7c4896db703ce57d958e Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Sun, 11 Aug 2024 09:53:29 -0700 Subject: [PATCH 2/6] man/tc-codel: cleanup man page Instead of pre-formatted bullet list, use the man macros. Make sure same sentence format is used in all options. Signed-off-by: Stephen Hemminger --- man/man8/tc-codel.8 | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/man/man8/tc-codel.8 b/man/man8/tc-codel.8 index e538e940..7bf08667 100644 --- a/man/man8/tc-codel.8 +++ b/man/man8/tc-codel.8 @@ -22,12 +22,17 @@ CoDel (pronounced "coddle") is an adaptive "no-knobs" active queue management algorithm (AQM) scheme that was developed to address the shortcomings of RED and its variants. It was developed with the following goals in mind: - o It should be parameterless. - o It should keep delays low while permitting bursts of traffic. - o It should control delay. - o It should adapt dynamically to changing link rates with no impact on +.IP * 4 +It should be parameterless. +.IP * +It should keep delays low while permitting bursts of traffic. +.IP * +It should control delay. +.IP * +It should adapt dynamically to changing link rates with no impact on utilization. 
- o It should be simple and efficient and should scale from simple to +.IP * +It should be simple and efficient and should scale from simple to complex routers. .SH ALGORITHM @@ -57,7 +62,7 @@ Additional details can be found in the paper cited below. .SH PARAMETERS .SS limit -hard limit on the real queue size. When this limit is reached, incoming packets +is the hard limit on the real queue size. When this limit is reached, incoming packets are dropped. If the value is lowered, packets are dropped so that the new limit is met. Default is 1000 packets. @@ -113,7 +118,7 @@ interval 30.0ms ecn .BR tc-red (8) .SH SOURCES -o Kathleen Nichols and Van Jacobson, "Controlling Queue Delay", ACM Queue, +Kathleen Nichols and Van Jacobson, "Controlling Queue Delay", ACM Queue, http://queue.acm.org/detail.cfm?id=2209336 .SH AUTHORS From f7c21530587705b7af69f55da1d522ecd8d71d2e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?L=C6=B0=C6=A1ng=20Vi=E1=BB=87t=20Ho=C3=A0ng?= Date: Mon, 12 Aug 2024 11:41:37 +0700 Subject: [PATCH 3/6] tc-cake: document 'ingress' MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Linux kernel commit 7298de9cd7255a783ba ("sch_cake: Add ingress mode") added an ingress mode for CAKE, which can be enabled with the 'ingress' parameter. Document the changes in CAKE's behavior when ingress mode is enabled. Signed-off-by: Lương Việt Hoàng Reviewed-by: Toke Høiland-Jørgensen Signed-off-by: Stephen Hemminger --- man/man8/tc-cake.8 | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/man/man8/tc-cake.8 b/man/man8/tc-cake.8 index ced9ac78..6d77d7d2 100644 --- a/man/man8/tc-cake.8 +++ b/man/man8/tc-cake.8 @@ -541,6 +541,21 @@ This can be used to set policies in a firewall script that will override CAKE's built-in tin selection. .SH OTHER PARAMETERS +.B ingress +.br + Indicates that CAKE is running in ingress mode (i.e. running on the downlink +of a connection). 
This changes the shaper to also count dropped packets as data +transferred, as these will have already traversed the link before CAKE can +choose what to do with them. + + In addition, the AQM will be tuned to always keep at least two packets +queued per flow. The reason for this is that retransmits are more expensive in +ingress mode, since dropped packets have to traverse the link again; thus, +keeping a minimum number of packets queued will improve throughput in cases +where the number of active flows are so large that they saturate the link even +at their minimum window size. + +.PP .B memlimit LIMIT .br From 72c13bc5d48c16e91fc438f75a1d93340b479503 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?L=C6=B0=C6=A1ng=20Vi=E1=BB=87t=20Ho=C3=A0ng?= Date: Mon, 12 Aug 2024 11:41:38 +0700 Subject: [PATCH 4/6] tc-cake: reformat MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reformat tc-cake to use man format (nroff) instead of pre-formatting. Signed-off-by: Lương Việt Hoàng Acked-by: Toke Høiland-Jørgensen Signed-off-by: Stephen Hemminger --- man/man8/tc-cake.8 | 429 +++++++++++++++++++++++++-------------------- 1 file changed, 237 insertions(+), 192 deletions(-) diff --git a/man/man8/tc-cake.8 b/man/man8/tc-cake.8 index 6d77d7d2..47f8f985 100644 --- a/man/man8/tc-cake.8 +++ b/man/man8/tc-cake.8 @@ -146,22 +146,23 @@ CAKE uses a deficit-mode shaper, which does not exhibit the initial burst typical of token-bucket shapers. It will automatically burst precisely as much as required to maintain the configured throughput. As such, it is very straightforward to configure. -.PP -.B unlimited -(default) + +.TP +\fBunlimited\fR (default) .br - No limit on the bandwidth. -.PP -.B bandwidth -RATE +No limit on the bandwidth. + +.TP +\fBbandwidth\fR RATE .br - Set the shaper bandwidth. See +Set the shaper bandwidth. See .BR tc(8) or examples below for details of the RATE value. 
-.PP + +.TP .B autorate-ingress .br - Automatic capacity estimation based on traffic arriving at this qdisc. +Automatic capacity estimation based on traffic arriving at this qdisc. This is most likely to be useful with cellular links, which tend to change quality randomly. A .B bandwidth @@ -177,58 +178,61 @@ are not expert network engineers, keywords have been provided to represent a number of common link technologies. .SS Manual Overhead Specification -.B overhead -BYTES +.TP +\fBoverhead\fR BYTES .br - Adds BYTES to the size of each packet. BYTES may be negative; values +Adds BYTES to the size of each packet. BYTES may be negative; values between -64 and 256 (inclusive) are accepted. -.PP -.B mpu -BYTES + +.TP +\fBmpu\fR BYTES .br - Rounds each packet (including overhead) up to a minimum length +Rounds each packet (including overhead) up to a minimum length BYTES. BYTES may not be negative; values between 0 and 256 (inclusive) are accepted. -.PP + +.TP .B atm .br - Compensates for ATM cell framing, which is normally found on ADSL links. +Compensates for ATM cell framing, which is normally found on ADSL links. This is performed after the .B overhead parameter above. ATM uses fixed 53-byte cells, each of which can carry 48 bytes payload. -.PP + +.TP .B ptm .br - Compensates for PTM encoding, which is normally found on VDSL2 links and +Compensates for PTM encoding, which is normally found on VDSL2 links and uses a 64b/65b encoding scheme. It is even more efficient to simply derate the specified shaper bandwidth by a factor of 64/65 or 0.984. See ITU G.992.3 Annex N and IEEE 802.3 Section 61.3 for details. -.PP + +.TP .B noatm .br - Disables ATM and PTM compensation. +Disables ATM and PTM compensation. -.SS Failsafe Overhead Keywords +.SS Failsafe Overhead Keywords These two keywords are provided for quick-and-dirty setup. Use them if you can't be bothered to read the rest of this section. 
-.PP -.B raw -(default) + +.TP +\fBraw\fR (default) .br - Turns off all overhead compensation in CAKE. The packet size reported +Turns off all overhead compensation in CAKE. The packet size reported by Linux will be used directly. -.PP - Other overhead keywords may be added after "raw". The effect of this is + +Other overhead keywords may be added after "raw". The effect of this is to make the overhead compensation operate relative to the reported packet size, not the underlying IP packet size. -.PP + +.TP .B conservative .br - Compensates for more overhead than is likely to occur on any +Compensates for more overhead than is likely to occur on any widely-deployed link technology. -.br - Equivalent to +Equivalent to .B overhead 48 atm. .SS ADSL Overhead Keywords @@ -238,77 +242,86 @@ this section are intended to correspond with these sources of information. All of them implicitly set the .B atm flag. -.PP + +.TP .B pppoa-vcmux .br - Equivalent to +Equivalent to .B overhead 10 atm -.PP + +.TP .B pppoa-llc .br - Equivalent to +Equivalent to .B overhead 14 atm -.PP + +.TP .B pppoe-vcmux .br - Equivalent to +Equivalent to .B overhead 32 atm -.PP + +.TP .B pppoe-llcsnap .br - Equivalent to +Equivalent to .B overhead 40 atm -.PP + +.TP .B bridged-vcmux .br - Equivalent to +Equivalent to .B overhead 24 atm -.PP + +.TP .B bridged-llcsnap .br - Equivalent to +Equivalent to .B overhead 32 atm -.PP + +.TP .B ipoa-vcmux .br - Equivalent to +Equivalent to .B overhead 8 atm -.PP + +.TP .B ipoa-llcsnap .br - Equivalent to +Equivalent to .B overhead 16 atm -.PP + +.P See also the Ethernet Correction Factors section below. .SS VDSL2 Overhead Keywords ATM was dropped from VDSL2 in favour of PTM, which is a much more straightforward framing scheme. Some ISPs retained PPPoE for compatibility with their existing back-end systems. 
-.PP + +.TP .B pppoe-ptm .br - Equivalent to +Equivalent to .B overhead 30 ptm +PPPoE: 2B PPP + 6B PPPoE + .br - PPPoE: 2B PPP + 6B PPPoE + +ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence + .br - ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence + -.br - PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS) -.br -.PP +PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS) + +.TP .B bridged-ptm .br - Equivalent to +Equivalent to .B overhead 22 ptm + +ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence + .br - ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence + -.br - PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS) -.br -.PP +PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS) + +.P See also the Ethernet Correction Factors section below. .SS DOCSIS Cable Overhead Keyword @@ -318,26 +331,28 @@ infrastructure. In this case, the actual on-wire overhead is less important than the packet size the head-end equipment uses for shaping and metering. This is specified to be an Ethernet frame including the CRC (aka FCS). -.PP + +.TP .B docsis .br - Equivalent to +Equivalent to .B overhead 18 mpu 64 noatm .SS Ethernet Overhead Keywords -.PP + +.TP .B ethernet .br - Accounts for Ethernet's preamble, inter-frame gap, and Frame Check +Accounts for Ethernet's preamble, inter-frame gap, and Frame Check Sequence. Use this keyword when the bottleneck being shaped for is an actual Ethernet cable. -.br - Equivalent to +Equivalent to .B overhead 38 mpu 84 noatm -.PP + +.TP .B ether-vlan .br - Adds 4 bytes to the overhead compensation, accounting for an IEEE 802.1Q +Adds 4 bytes to the overhead compensation, accounting for an IEEE 802.1Q VLAN header appended to the Ethernet frame header. NB: Some ISPs use one or even two of these within PPPoE; this keyword may be repeated as necessary to express this. 
@@ -360,54 +375,77 @@ the jitter in the Linux kernel itself, so congestion might be signalled prematurely. The flows will then become sparse and total throughput reduced, leaving little or no back-pressure for the fairness logic to work against. Use the "metro" setting for local lans unless you have a custom kernel. -.PP -.B rtt -TIME + +.TP +\fBrtt\fR TIME .br - Manually specify an RTT. -.PP +Manually specify an RTT. + +.TP .B datacentre .br - For extremely high-performance 10GigE+ networks only. Equivalent to +For extremely high-performance 10GigE+ networks only. +.br +Equivalent to .B rtt 100us. -.PP + +.TP .B lan .br - For pure Ethernet (not Wi-Fi) networks, at home or in the office. Don't -use this when shaping for an Internet access link. Equivalent to +For pure Ethernet (not Wi-Fi) networks, at home or in the office. Don't +use this when shaping for an Internet access link. +.br +Equivalent to .B rtt 1ms. -.PP + +.TP .B metro .br - For traffic mostly within a single city. Equivalent to +For traffic mostly within a single city. +.br +Equivalent to .B rtt 10ms. -.PP + +.TP .B regional .br - For traffic mostly within a European-sized country. Equivalent to -.B rtt 30ms. -.PP -.B internet -(default) +For traffic mostly within a European-sized country. .br - This is suitable for most Internet traffic. Equivalent to +Equivalent to +.B rtt 30ms. + +.TP +\fBinternet\fR (default) +.br +This is suitable for most Internet traffic. +.br +Equivalent to .B rtt 100ms. -.PP + +.TP .B oceanic .br - For Internet traffic with generally above-average latency, such as that -suffered by Australasian residents. Equivalent to +For Internet traffic with generally above-average latency, such as that +suffered by Australasian residents. +.br +Equivalent to .B rtt 300ms. -.PP + +.TP .B satellite .br - For traffic via geostationary satellites. Equivalent to +For traffic via geostationary satellites. +.br +Equivalent to .B rtt 1000ms. 
-.PP + +.TP .B interplanetary .br - So named because Jupiter is about 1 light-hour from Earth. Use this to -(almost) completely disable AQM actions. Equivalent to +So named because Jupiter is about 1 light-hour from Earth. Use this to +(almost) completely disable AQM actions. +.br +Equivalent to .B rtt 3600s. .SH FLOW ISOLATION PARAMETERS @@ -419,68 +457,76 @@ minimize flow collisions. These keywords specify whether fairness based on source address, destination address, individual flows, or any combination of those is desired. -.PP + +.TP .B flowblind .br - Disables flow isolation; all traffic passes through a single queue for +Disables flow isolation; all traffic passes through a single queue for each tin. -.PP + +.TP .B srchost .br - Flows are defined only by source address. Could be useful on the egress +Flows are defined only by source address. Could be useful on the egress path of an ISP backhaul. -.PP + +.TP .B dsthost .br - Flows are defined only by destination address. Could be useful on the +Flows are defined only by destination address. Could be useful on the ingress path of an ISP backhaul. -.PP + +.TP .B hosts .br - Flows are defined by source-destination host pairs. This is host +Flows are defined by source-destination host pairs. This is host isolation, rather than flow isolation. -.PP + +.TP .B flows .br - Flows are defined by the entire 5-tuple of source address, destination +Flows are defined by the entire 5-tuple of source address, destination address, transport protocol, source port and destination port. This is the type of flow isolation performed by SFQ and fq_codel. -.PP + +.TP .B dual-srchost .br - Flows are defined by the 5-tuple, and fairness is applied first over +Flows are defined by the 5-tuple, and fairness is applied first over source addresses, then over individual flows. 
Good for use on egress traffic from a LAN to the internet, where it'll prevent any one LAN host from monopolising the uplink, regardless of the number of flows they use. -.PP + +.TP .B dual-dsthost .br - Flows are defined by the 5-tuple, and fairness is applied first over +Flows are defined by the 5-tuple, and fairness is applied first over destination addresses, then over individual flows. Good for use on ingress traffic to a LAN from the internet, where it'll prevent any one LAN host from monopolising the downlink, regardless of the number of flows they use. -.PP -.B triple-isolate -(default) + +.TP +\fBtriple-isolate\fR (default) .br - Flows are defined by the 5-tuple, and fairness is applied over source +Flows are defined by the 5-tuple, and fairness is applied over source *and* destination addresses intelligently (ie. not merely by host-pairs), and also over individual flows. Use this if you're not certain whether to use dual-srchost or dual-dsthost; it'll do both jobs at once, preventing any one host on *either* side of the link from monopolising it with a large number of flows. -.PP + +.TP .B nat .br - Instructs Cake to perform a NAT lookup before applying flow-isolation +Instructs Cake to perform a NAT lookup before applying flow-isolation rules, to determine the true addresses and port numbers of the packet, to improve fairness between hosts "inside" the NAT. This has no practical effect in "flowblind" or "flows" modes, or if NAT is performed on a different host. -.PP -.B nonat -(default) + +.TP +\fBnonat\fR (default) .br - Cake will not perform a NAT lookup. Flow isolation will be performed +Cake will not perform a NAT lookup. Flow isolation will be performed using the addresses and port numbers directly visible to the interface Cake is attached to. @@ -495,44 +541,46 @@ the threshold using the same algorithm as the deficit-mode shaper. Detailed customisation of tin parameters is not provided. 
The following presets perform all necessary tuning, relative to the current shaper bandwidth and RTT settings. -.PP + +.TP .B besteffort .br - Disables priority queuing by placing all traffic in one tin. -.PP +Disables priority queuing by placing all traffic in one tin. + +.TP .B precedence .br - Enables legacy interpretation of TOS "Precedence" field. Use of this +Enables legacy interpretation of TOS "Precedence" field. Use of this preset on the modern Internet is firmly discouraged. -.PP + +.TP .B diffserv4 .br - Provides a general-purpose Diffserv implementation with four tins: -.br - Bulk (CS1, LE in kernel v5.9+), 6.25% threshold, generally low priority. -.br - Best Effort (general), 100% threshold. -.br - Video (AF4x, AF3x, CS3, AF2x, CS2, TOS4, TOS1), 50% threshold. -.br - Voice (CS7, CS6, EF, VA, CS5, CS4), 25% threshold. -.PP -.B diffserv3 -(default) -.br - Provides a simple, general-purpose Diffserv implementation with three tins: -.br - Bulk (CS1, LE in kernel v5.9+), 6.25% threshold, generally low priority. -.br - Best Effort (general), 100% threshold. -.br - Voice (CS7, CS6, EF, VA, TOS4), 25% threshold, reduced Codel interval. +Provides a general-purpose Diffserv implementation with four tins: -.PP -.B fwmark -MASK +\(bu Bulk (CS1, LE in kernel v5.9+), 6.25% threshold, generally low priority. .br - This options turns on fwmark-based overriding of CAKE's tin selection. +\(bu Best Effort (general), 100% threshold. +.br +\(bu Video (AF4x, AF3x, CS3, AF2x, CS2, TOS4, TOS1), 50% threshold. +.br +\(bu Voice (CS7, CS6, EF, VA, CS5, CS4), 25% threshold. + +.TP +\fBdiffserv3\fR (default) +.br +Provides a simple, general-purpose Diffserv implementation with three tins: + +\(bu Bulk (CS1, LE in kernel v5.9+), 6.25% threshold, generally low priority. +.br +\(bu Best Effort (general), 100% threshold. +.br +\(bu Voice (CS7, CS6, EF, VA, TOS4), 25% threshold, reduced Codel interval. 
+ +.TP +\fBfwmark\fR MASK +.br +This options turns on fwmark-based overriding of CAKE's tin selection. If set, the option specifies a bitmask that will be applied to the fwmark associated with each packet. If the result of this masking is non-zero, the result will be right-shifted by the number of least-significant unset bits in @@ -541,38 +589,38 @@ This can be used to set policies in a firewall script that will override CAKE's built-in tin selection. .SH OTHER PARAMETERS + +.TP .B ingress .br - Indicates that CAKE is running in ingress mode (i.e. running on the downlink -of a connection). This changes the shaper to also count dropped packets as data +Indicates that CAKE is running in ingress mode (i.e. running on the downlink of +a connection). This changes the shaper to also count dropped packets as data transferred, as these will have already traversed the link before CAKE can choose what to do with them. - In addition, the AQM will be tuned to always keep at least two packets +In addition, the AQM will be tuned to always keep at least two packets queued per flow. The reason for this is that retransmits are more expensive in ingress mode, since dropped packets have to traverse the link again; thus, keeping a minimum number of packets queued will improve throughput in cases where the number of active flows are so large that they saturate the link even at their minimum window size. -.PP -.B memlimit -LIMIT +.TP +\fBmemlimit\fR LIMIT .br - Limit the memory consumed by Cake to LIMIT bytes. Note that this does +Limit the memory consumed by Cake to LIMIT bytes. Note that this does not translate directly to queue size (so do not size this based on bandwidth delay product considerations, but rather on worst case acceptable memory consumption), as there is some overhead in the data structures containing the packets, especially for small packets. 
- By default, the limit is calculated based on the bandwidth and RTT +By default, the limit is calculated based on the bandwidth and RTT settings. -.PP +.TP .B wash - .br - Traffic entering your diffserv domain is frequently mis-marked in +Traffic entering your diffserv domain is frequently mis-marked in transit from the perspective of your network, and traffic exiting yours may be mis-marked from the perspective of the transiting provider. @@ -583,15 +631,13 @@ If you are shaping inbound, and cannot trust the diffserv markings (as is the case for Comcast Cable, among others), it is best to use a single queue "besteffort" mode with wash. -.PP +.TP .B split-gso - .br - This option controls whether CAKE will split General Segmentation +This option controls whether CAKE will split General Segmentation Offload (GSO) super-packets into their on-the-wire components and dequeue them individually. -.br Super-packets are created by the networking stack to improve efficiency. However, because they are larger they take longer to dequeue, which translates to higher latency for competing flows, especially at lower @@ -610,25 +656,22 @@ field on the skb, and the flow hashing can be overridden by setting the .B classid parameter. -.PP -.B Tin override - -.br - To assign a priority tin, the major number of the priority field needs +.SS Tin override +To assign a priority tin, the major number of the priority field needs to match the qdisc handle of the cake instance; if it does, the minor number will be interpreted as the tin index. 
For example, to classify all ICMP packets as 'bulk', the following filter can be used: -.br - # tc qdisc replace dev eth0 handle 1: root cake diffserv3 - # tc filter add dev eth0 parent 1: protocol ip prio 1 \\ - u32 match icmp type 0 0 action skbedit priority 1:1 +.RS +.EX +# tc qdisc replace dev eth0 handle 1: root cake diffserv3 +# tc filter add dev eth0 parent 1: protocol ip prio 1 \\ + u32 match icmp type 0 0 action skbedit priority 1:1 +.EE +.RE -.PP -.B Flow hash override - -.br - To override flow hashing, the classid can be set. CAKE will interpret +.SS Flow hash override +To override flow hashing, the classid can be set. CAKE will interpret the major number of the classid as the host hash used in host isolation mode, and the minor number as the flow hash used for flow-based queueing. One or both of those can be set, and will be used if the relevant flow isolation parameter @@ -636,15 +679,16 @@ is set (i.e., the major number will be ignored if CAKE is not configured in hosts mode, and the minor number will be ignored if CAKE is not configured in flows mode). -.br This example will assign all ICMP packets to the first queue: -.br - # tc qdisc replace dev eth0 handle 1: root cake - # tc filter add dev eth0 parent 1: protocol ip prio 1 \\ - u32 match icmp type 0 0 classid 0:1 +.RS +.EX +# tc qdisc replace dev eth0 handle 1: root cake +# tc filter add dev eth0 parent 1: protocol ip prio 1 \\ + u32 match icmp type 0 0 classid 0:1 +.EE +.RE -.br If only one of the host and flow overrides is set, CAKE will compute the other hash from the packet as normal. Note, however, that the host isolation mode works by assigning a host ID to the flow queue; so if overriding both host and @@ -656,12 +700,11 @@ destination host. 
.SH EXAMPLES +.EX # tc qdisc delete root dev eth0 -.br # tc qdisc add root dev eth0 cake bandwidth 100Mbit ethernet -.br # tc -s qdisc show dev eth0 -.br + qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 mpu 84 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 @@ -691,9 +734,10 @@ qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0 un_flows 0 0 0 max_len 0 0 0 quantum 300 1514 762 +.EE -After some use: -.br +.SS After some use: +.EX # tc -s qdisc show dev eth0 qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 mpu 84 @@ -725,6 +769,7 @@ qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0 un_flows 0 0 0 max_len 1514 1514 1514 quantum 300 1514 762 +.EE .SH SEE ALSO .BR tc (8), From 0ddadc93e54f20e35cb742701413ede88ea00471 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Stefan=20M=C3=A4tje?= Date: Mon, 12 Aug 2024 00:31:34 +0200 Subject: [PATCH 5/6] configure: provide surrogates for possibly missing libbpf_version.h MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Old libbpf library versions (< 0.7.x) may not have the libbpf_version.h header packaged. This header would provide LIBBPF_MAJOR_VERSION and LIBBPF_MINOR_VERSION which are then missing to control conditional compilation in some source files. Provide surrogates for these defines via CFLAGS that are derived from the LIBBPF_VERSION determined with $(${PKG_CONFIG} libbpf --modversion). 
Signed-off-by: Stefan Mätje Signed-off-by: Stephen Hemminger --- configure | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/configure b/configure index 928048b3..7437db4f 100755 --- a/configure +++ b/configure @@ -315,6 +315,12 @@ check_libbpf() echo "HAVE_LIBBPF:=y" >> $CONFIG echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG echo "CFLAGS += -DLIBBPF_VERSION=\\\"$LIBBPF_VERSION\\\"" >> $CONFIG + LIBBPF_MAJOR=$(IFS="."; set $LIBBPF_VERSION; echo $1) + LIBBPF_MINOR=$(IFS="."; set $LIBBPF_VERSION; echo $2) + if [ "$LIBBPF_MAJOR" -eq 0 -a "$LIBBPF_MINOR" -lt 7 ]; then + # Newer libbpf versions provide these defines in the bpf/libbpf_version.h header. + echo "CFLAGS += -DLIBBPF_MAJOR_VERSION=$LIBBPF_MAJOR -DLIBBPF_MINOR_VERSION=$LIBBPF_MINOR" >> $CONFIG + fi echo 'LDLIBS += ' $LIBBPF_LDLIBS >> $CONFIG if [ -z "$LIBBPF_DIR" ]; then From e9096586e0701d5ae031df2f2708d20d34ae7bd4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Stefan=20M=C3=A4tje?= Date: Mon, 12 Aug 2024 00:31:35 +0200 Subject: [PATCH 6/6] ss: fix libbpf version check for ENABLE_BPF_SKSTORAGE_SUPPORT MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This patch fixes a problem with the libbpf version comparison to decide if ENABLE_BPF_SKSTORAGE_SUPPORT could be enabled. - The code enabled by ENABLE_BPF_SKSTORAGE_SUPPORT uses the function btf_dump__new with an API that was introduced in libbpf 0.6.0. So check now against libbpf version to be >= 0.6.x instead of 0.5.x. - This code still depends on the necessity to have LIBBPF_MAJOR_VERSION and LIBBPF_MINOR_VERSION defined, even if libbpf_version.h is not present in the library development package. This was ensured with the previous patch for the configure script. 
Fixes: e3ecf048 ("ss: pretty-print BPF socket-local storage") Signed-off-by: Stefan Mätje Signed-off-by: Stephen Hemminger --- misc/ss.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/misc/ss.c b/misc/ss.c index 620f4c8f..aef1a714 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -53,7 +53,7 @@ #include #ifdef HAVE_LIBBPF -/* If libbpf is new enough (0.5+), support for pretty-printing BPF socket-local +/* If libbpf is new enough (0.6+), support for pretty-printing BPF socket-local * storage is enabled, otherwise we emit a warning and disable it. * ENABLE_BPF_SKSTORAGE_SUPPORT is only used to gate the socket-local storage * feature, so this wouldn't prevent any feature relying on HAVE_LIBBPF to be @@ -66,8 +66,8 @@ #include #include -#if (LIBBPF_MAJOR_VERSION == 0) && (LIBBPF_MINOR_VERSION < 5) -#warning "libbpf version 0.5 or later is required, disabling BPF socket-local storage support" +#if ((LIBBPF_MAJOR_VERSION == 0) && (LIBBPF_MINOR_VERSION < 6)) +#warning "libbpf version 0.6 or later is required, disabling BPF socket-local storage support" #undef ENABLE_BPF_SKSTORAGE_SUPPORT #endif #endif
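
Reviewer note on PATCH 5/6 and 6/6: the version split added to configure can be tried in isolation with the minimal sketch below. It is an illustration under stated assumptions, not the code from the series: the function name libbpf_surrogate_flags is hypothetical (configure does this inline in check_libbpf), it uses "set --" and two "[ ]" tests joined by "&&" where the patch uses "set" and "-a", and it prints the flags instead of appending them to $CONFIG.

```shell
#!/bin/sh
# Sketch of the version split from PATCH 5/6.
# libbpf_surrogate_flags is a hypothetical helper name; the real
# configure script performs these steps inline in check_libbpf().
libbpf_surrogate_flags()
{
	version=$1

	# Split "MAJOR.MINOR.PATCH" on dots. The command substitution runs
	# in a subshell, so the changed IFS and the reset positional
	# parameters do not leak into the caller.
	major=$(IFS="."; set -- $version; echo "$1")
	minor=$(IFS="."; set -- $version; echo "$2")

	# Per the commit message, libbpf < 0.7 may be packaged without
	# bpf/libbpf_version.h, so only those versions get the defines.
	if [ "$major" -eq 0 ] && [ "$minor" -lt 7 ]; then
		echo "-DLIBBPF_MAJOR_VERSION=$major -DLIBBPF_MINOR_VERSION=$minor"
	fi
}
```

Calling "libbpf_surrogate_flags 0.6.1" prints both defines, while "libbpf_surrogate_flags 0.7.0" and "libbpf_surrogate_flags 1.2.0" print nothing, which matches the intent that newer libbpf versions take the values from their own header. With the defines in place, the "< 6" comparison in misc/ss.c has the macros it needs even when the header is absent.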