linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-11 04:18:39 +08:00

Author	SHA1	Message	Date
stephen hemminger	ff704050f2	netem: free skb's in tree on reset Netem can leak memory because packets get stored in red-black tree and it is not cleared on reset. Reported by: Сергеев Сергей <adron@yapic.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 17:30:38 -04:00
stephen hemminger	638a52b801	netem: update backlog after drop When packet is dropped from rb-tree netem the backlog statistic should also be updated. Reported-by: Сергеев Сергей <adron@yapic.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 17:29:29 -04:00
Eric Dumazet	f2f872f927	netem: Introduce skb_orphan_partial() helper Commit `547669d483` ("tcp: xps: fix reordering issues") added unexpected reorders in case netem is used in a MQ setup for high performance test bed. ETH=eth0 tc qd del dev $ETH root 2>/dev/null tc qd add dev $ETH root handle 1: mq for i in `seq 1 32` do tc qd add dev $ETH parent 1:$i netem delay 100ms done As all tcp packets are orphaned by netem, TCP stack believes it can set skb->ooo_okay on all packets. In order to allow producers to send more packets, we want to keep sk_wmem_alloc from reaching sk_sndbuf limit. We can do that by accounting one byte per skb in netem queues, so that TCP stack is not fooled too much. Tested: With above MQ/netem setup, scaling number of concurrent flows gives linear results and no reorders/retransmits lpq83:~# for n in 1 10 20 30 40 50 60 70 80 90 100 do echo -n "n:$n " ; ./super_netperf $n -H 10.7.7.84; done n:1 198.46 n:10 2002.69 n:20 4000.98 n:30 6006.35 n:40 8020.93 n:50 10032.3 n:60 12081.9 n:70 13971.3 n:80 16009.7 n:90 17117.3 n:100 17425.5 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-31 14:59:49 -07:00
Eric Dumazet	36b7bfe09b	netem: fix possible NULL deref in netem_dequeue() commit `aec0a40a6f` ("netem: use rb tree to implement the time queue") added a regression if a child qdisc is attached to netem, as we perform a NULL dereference. Fix this by adding a temporary variable to cache netem_skb_cb(skb)->time_to_send. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 16:52:10 -07:00
Eric Dumazet	aec0a40a6f	netem: use rb tree to implement the time queue Following typical setup to implement a ~100 ms RTT and big amount of reorders has very poor performance because netem implements the time queue using a linked list. ----------------------------------------------------------- ETH=eth0 IFB=ifb0 modprobe ifb ip link set dev $IFB up tc qdisc add dev $ETH ingress 2>/dev/null tc filter add dev $ETH parent ffff: \ protocol ip u32 match u32 0 0 flowid 1:1 action mirred egress \ redirect dev $IFB ethtool -K $ETH gro off tso off gso off tc qdisc add dev $IFB root netem delay 50ms 10ms limit 100000 tc qd add dev $ETH root netem delay 50ms limit 100000 --------------------------------------------------------- Switch netem time queue to a rb tree, so this kind of setup can work at high speed. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 18:07:15 -07:00
Johannes Naab	a13d310471	netem: fix delay calculation in rate extension The delay calculation with the rate extension introduces in v3.3 does not properly work, if other packets are still queued for transmission. For the delay calculation to work, both delay types (latency and delay introduces by rate limitation) have to be handled differently. The latency delay for a packet can overlap with the delay of other packets. The delay introduced by the rate however is separate, and can only start, once all other rate-introduced delays finished. Latency delay is from same distribution for each packet, rate delay depends on the packet size. .: latency delay -: rate delay x: additional delay we have to wait since another packet is currently transmitted .....---- Packet 1 .....xx------ Packet 2 .....------ Packet 3 ^^^^^ latency stacks ^^ rate delay doesn't stack ^^ latency stacks -----> time When a packet is enqueued, we first consider the latency delay. If other packets are already queued, we can reduce the latency delay until the last packet in the queue is send, however the latency delay cannot be <0, since this would mean that the rate is overcommitted. The new reference point is the time at which the last packet will be send. To find the time, when the packet should be send, the rate introduces delay has to be added on top of that. Signed-off-by: Johannes Naab <jn@stusta.de> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-29 15:43:02 -05:00
Eric Dumazet	5a308f40bf	netem: refine early skb orphaning netem does an early orphaning of skbs. Doing so breaks TCP Small Queue or any mechanism relying on socket sk_wmem_alloc feedback. Ideally, we should perform this orphaning after the rate module and before the delay module, to mimic what happens on a real link : skb orphaning is indeed normally done at TX completion, before the transit on the link. +-------+ +--------+ +---------------+ +-----------------+ + Qdisc +---> Device +--> TX completion +--> links / hops +-> + + + xmit + + skb orphaning + + propagation + +-------+ +--------+ +---------------+ +-----------------+ < rate limiting > < delay, drops, reorders > If netem is used without delay feature (drops, reorders, rate limiting), then we should avoid early skb orphaning, to keep pressure on sockets as long as packets are still in qdisc queue. Ideally, netem should be refactored to implement delay module as the last stage. Current algorithm merges the two phases (rate limiting + delay) so its not correct. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Mark Gordon <msg@google.com> Cc: Andreas Terzis <aterzis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 23:08:33 -07:00
Eric Dumazet	960fb66e52	netem: add limitation to reordered packets Fix two netem bugs : 1) When a frame was dropped by tfifo_enqueue(), drop counter was incremented twice. 2) When reordering is triggered, we enqueue a packet without checking queue limit. This can OOM pretty fast when this is repeated enough, since skbs are orphaned, no socket limit can help in this situation. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mark Gordon <msg@google.com> Cc: Andreas Terzis <aterzis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-09 00:01:49 -07:00
David S. Miller	0d6c4a2e46	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/intel/e1000e/param.c drivers/net/wireless/iwlwifi/iwl-agn-rx.c drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c drivers/net/wireless/iwlwifi/iwl-trans.h Resolved the iwlwifi conflict with mainline using 3-way diff posted by John Linville and Stephen Rothwell. In 'net' we added a bug fix to make iwlwifi report a more accurate skb->truesize but this conflicted with RX path changes that happened meanwhile in net-next. In e1000e a conflict arose in the validation code for settings of adapter->itr. 'net-next' had more sophisticated logic so that logic was used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-07 23:35:40 -04:00
Eric Dumazet	116a0fc31c	netem: fix possible skb leak skb_checksum_help(skb) can return an error, we must free skb in this case. qdisc_drop(skb, sch) can also be feeded with a NULL skb (if skb_unshare() failed), so lets use this generic helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 13:40:48 -04:00
Eric Dumazet	e4ae004b84	netem: add ECN capability Add ECN (Explicit Congestion Notification) marking capability to netem tc qdisc add dev eth0 root netem drop 0.5 ecn Instead of dropping packets, try to ECN mark them. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:39:48 -04:00
David S. Miller	1b34ec43c9	pkt_sched: Stop using NLA_PUT*(). These macros contain a hidden goto, and are thus extremely error prone and make code hard to audit. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-01 18:11:37 -04:00
Eric Dumazet	cd961c2ca9	netem: fix dequeue commit `50612537e9` (netem: fix classful handling) added two errors in netem_dequeue() 1) After checking skb at the head of tfifo queue for time constraints, it dequeues tail skb, thus adding unwanted reordering. 2) qdisc stats are updated twice per packet (one when packet dequeued from tfifo, once when delivered) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-19 18:57:50 -05:00
David S. Miller	16bda13d90	net: Make qdisc_skb_cb upper size bound explicit. Just like skb->cb[], so that qdisc_skb_cb can be encapsulated inside of other data structures. This is intended to be used by IPoIB so that it can remember addressing information stored at hard_header_ops->create() time that it can fetch when the packet gets to the transmit routine. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-09 13:50:34 -05:00
Vijay Subramanian	a42b4799c6	netem: Fix off-by-one bug in reordering With netem reordering, a gap of N is supposed to reorder every Nth packet with given reorder probability. However, the code currently skips N packets and reorders every (N+1)th packet. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-22 15:08:44 -05:00
Hagen Paul Pfeifer	eb10192447	net_sched: Bug in netem reordering Not now, but it looks you are correct. q->qdisc is NULL until another additional qdisc is attached (beside tfifo). See `50612537e9`. The following patch should work. From: Hagen Paul Pfeifer <hagen@jauu.net> netem: catch NULL pointer by updating the real qdisc statistic Reported-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-05 13:27:39 -05:00
Eric Dumazet	50612537e9	netem: fix classful handling Commit `10f6dfcfde` (Revert "sch_netem: Remove classful functionality") reintroduced classful functionality to netem, but broke basic netem behavior : netem uses an t(ime)fifo queue, and store timestamps in skb->cb[] If qdisc is changed, time constraints are not respected and other qdisc can destroy skb->cb[] and block netem at dequeue time. Fix this by always using internal tfifo, and optionally attach a child qdisc to netem (or a tree of qdiscs) Example of use : DEV=eth3 tc qdisc del dev $DEV root tc qdisc add dev $DEV root handle 30: est 1sec 8sec netem delay 20ms 10ms tc qdisc add dev $DEV handle 40:0 parent 30:0 tbf \ burst 20480 limit 20480 mtu 1514 rate 32000bps qdisc netem 30: root refcnt 18 limit 1000 delay 20.0ms 10.0ms Sent 190792 bytes 413 pkt (dropped 0, overlimits 0 requeues 0) rate 18416bit 3pps backlog 0b 0p requeues 0 qdisc tbf 40: parent 30: rate 256000bit burst 20Kb/8 mpu 0b lat 0us Sent 190792 bytes 413 pkt (dropped 6, overlimits 10 requeues 0) backlog 0b 5p requeues 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-30 17:12:23 -05:00
David S. Miller	7f8e3234c5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2011-12-30 13:04:14 -05:00
Eric Dumazet	bb52c7acf8	netem: dont call vfree() under spinlock and BH disabled commit `6373a9a286` (netem: use vmalloc for distribution table) added a regression, since vfree() is called while holding a spinlock and BH being disabled. Fix this by doing the pointers swap in critical section, and freeing after spinlock release. Also add __GFP_NOWARN to the kmalloc() try, since we fallback to vmalloc(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-24 16:08:50 -05:00
stephen hemminger	2494654d48	netem: loss model API sizes The new netem loss model is configured with nested netlink messages. This code is being overly strict about sizes, and is easily confused by padding (or possible future expansion). Also message for gemodel is incorrect. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-23 16:51:18 -05:00
Hagen Paul Pfeifer	90b41a1cd4	netem: add cell concept to simulate special MAC behavior This extension can be used to simulate special link layer characteristics. Simulate because packet data is not modified, only the calculation base is changed to delay a packet based on the original packet size and artificial cell information. packet_overhead can be used to simulate a link layer header compression scheme (e.g. set packet_overhead to -20) or with a positive packet_overhead value an additional MAC header can be simulated. It is also possible to "replace" the 14 byte Ethernet header with something else. cell_size and cell_overhead can be used to simulate link layer schemes, based on cells, like some TDMA schemes. Another application area are MAC schemes using a link layer fragmentation with a (small) header each. Cell size is the maximum amount of data bytes within one cell. Cell overhead is an additional variable to change the per-cell-overhead (e.g. 5 byte header per fragment). Example (5 kbit/s, 20 byte per packet overhead, cell-size 100 byte, per cell overhead 5 byte): tc qdisc add dev eth0 root netem rate 5kbit 20 100 5 Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-12 19:44:48 -05:00
Eric Dumazet	fc33cc7242	netem: fix build error on 32bit arches ERROR: "__udivdi3" [net/sched/sch_netem.ko] undefined! Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-01 11:40:19 -05:00
Hagen Paul Pfeifer	7bc0f28c7a	netem: rate extension Currently netem is not in the ability to emulate channel bandwidth. Only static delay (and optional random jitter) can be configured. To emulate the channel rate the token bucket filter (sch_tbf) can be used. But TBF has some major emulation flaws. The buffer (token bucket depth/rate) cannot be 0. Also the idea behind TBF is that the credit (token in buckets) fills if no packet is transmitted. So that there is always a "positive" credit for new packets. In real life this behavior contradicts the law of nature where nothing can travel faster as speed of light. E.g.: on an emulated 1000 byte/s link a small IPv4/TCP SYN packet with ~50 byte require ~0.05 seconds - not 0 seconds. Netem is an excellent place to implement a rate limiting feature: static delay is already implemented, tfifo already has time information and the user can skip TBF configuration completely. This patch implement rate feature which can be configured via tc. e.g: tc qdisc add dev eth0 root netem rate 10kbit To emulate a link of 5000byte/s and add an additional static delay of 10ms: tc qdisc add dev eth0 root netem delay 10ms rate 5KBps Note: similar to TBF the rate extension is bounded to the kernel timing system. Depending on the architecture timer granularity, higher rates (e.g. 10mbit/s and higher) tend to transmission bursts. Also note: further queues living in network adaptors; see ethtool(8). Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@drr.davemloft.net>	2011-11-30 23:18:35 -05:00
Alexey Dobriyan	b7f080cfe2	net: remove mm.h inclusion from netdevice.h Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually). To prevent mm.h inclusion via other channels also extract "enum dma_data_direction" definition into separate header. This tiny piece is what gluing netdevice.h with mm.h via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h". Removal of mm.h from scatterlist.h was tried and was found not feasible on most archs, so the link was cutoff earlier. Hope people are OK with tiny include file. Note, that mm_types.h is still dragged in, but it is a separate story. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-06-21 19:17:20 -07:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
David S. Miller	78776d3f2b	sch_netem: Need to include vmalloc.h Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-24 22:48:13 -08:00
stephen hemminger	250a65f782	netem: update version and cleanup Get rid of debug message that are not useful, and enable the log messages in case of error. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-24 22:11:56 -08:00
stephen hemminger	661b79725f	netem: revised correlated loss generator This is a patch originated with Stefano Salsano and Fabio Ludovici. It provides several alternative loss models for use with netem. This patch adds two state machine based loss models. See: http://netgroup.uniroma2.it/twiki/bin/view.cgi/Main/NetemCLG Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-24 22:11:56 -08:00
stephen hemminger	10f6dfcfde	Revert "sch_netem: Remove classful functionality" Many users have wanted the old functionality that was lost to be able to use pfifo as inner qdisc for netem. The reason that netem could not be classful with the older API was because of the limitations of the old dequeue/requeue interface; now that qdisc API has a peek function, there is no longer a problem with using any inner qdisc's. This reverts commit `0220146411`. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-24 22:11:55 -08:00
stephen hemminger	df173bda26	netem: define NETEM_DIST_MAX Rather than magic constant in code, expose the maximum size of packet distribution table in API. In iproute2, q_netem defines MAX_DIST as 16K already. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-24 22:11:54 -08:00
stephen hemminger	6373a9a286	netem: use vmalloc for distribution table The netem probability table can be large (up to 64K bytes) which may be too large to allocate in one contiguous chunk. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-24 22:11:54 -08:00
stephen hemminger	861d7f745f	netem: cleanup dump code Use nla_put_nested to update netlink attribute value. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-24 22:11:53 -08:00
David S. Miller	5bdc22a565	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/sched/sch_hfsc.c net/sched/sch_htb.c net/sched/sch_tbf.c	2011-01-24 14:09:35 -08:00
Eric Dumazet	9190b3b320	net_sched: accurate bytes/packets stats/rates In commit `44b8288308` (net_sched: pfifo_head_drop problem), we fixed a problem with pfifo_head drops that incorrectly decreased sch->bstats.bytes and sch->bstats.packets Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a previously enqueued packet, and bstats cannot be changed, so bstats/rates are not accurate (over estimated) This patch changes the qdisc_bstats updates to be done at dequeue() time instead of enqueue() time. bstats counters no longer account for dropped frames, and rates are more correct, since enqueue() bursts dont have effect on dequeue() rate. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 23:31:33 -08:00
Eric Dumazet	fd245a4adb	net_sched: move TCQ_F_THROTTLED flag In commit `3711210576` (net: QDISC_STATE_RUNNING dont need atomic bit ops) I moved QDISC_STATE_RUNNING flag to __state container, located in the cache line containing qdisc lock and often dirtied fields. I now move TCQ_F_THROTTLED bit too, so that we let first cache line read mostly, and shared by all cpus. This should speedup HTB/CBQ for example. Not using test_bit()/__clear_bit()/__test_and_set_bit allows to use an "unsigned int" for __state container, reducing by 8 bytes Qdisc size. Introduce helpers to hide implementation details. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> CC: Jesper Dangaard Brouer <hawk@diku.dk> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Jamal Hadi Salim <hadi@cyberus.ca> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-20 16:59:32 -08:00
Eric Dumazet	cc7ec456f8	net_sched: cleanups Cleanup net/sched code to current CodingStyle and practices. Reduce inline abuse Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-19 23:31:12 -08:00
Eric Dumazet	bfe0d0298f	net_sched: factorize qdisc stats handling HTB takes into account skb is segmented in stats updates. Generalize this to all schedulers. They should use qdisc_bstats_update() helper instead of manipulating bstats.bytes and bstats.packets Add bstats_update() helper too for classes that use gnet_stats_basic_packed fields. Note : Right now, TCQ_F_CAN_BYPASS shortcurt can be taken only if no stab is setup on qdisc. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-01-10 16:07:54 -08:00
Changli Gao	3511c9132f	net_sched: remove the unused parameter of qdisc_create_dflt() The first parameter dev isn't in use in qdisc_create_dflt(). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-21 03:09:47 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Joe Perches	f64f9e7192	net: Move && and \|\| to end of previous line Not including net/atm/ Compiled tested x86 allyesconfig only Added a > 80 column line or two, which I ignored. Existing checkpatch plaints willfully, cheerfully ignored. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-29 16:55:45 -08:00
Jarek Poplawski	8caf153974	net: sch_netem: Fix an inconsistency in ingress netem timestamps. Alex Sidorenko reported: "while experimenting with 'netem' we have found some strange behaviour. It seemed that ingress delay as measured by 'ping' command shows up on some hosts but not on others. After some investigation I have found that the problem is that skbuff->tstamp field value depends on whether there are any packet sniffers enabled. That is: - if any ptype_all handler is registered, the tstamp field is as expected - if there are no ptype_all handlers, the tstamp field does not show the delay" This patch prevents unnecessary update of tstamp in dev_queue_xmit_nit() on ingress path (with act_mirred) adding a check, so minimal overhead on the fast path, but only when sniffers etc. are active. Since netem at ingress seems to logically emulate a network before a host, tstamp is zeroed to trigger the update and pretend delays are from the outside. Reported-by: Alex Sidorenko <alexandre.sidorenko@hp.com> Tested-by: Alex Sidorenko <alexandre.sidorenko@hp.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-20 02:14:59 -07:00
Jarek Poplawski	05a8c1cbfe	pkt_sched: Remove smp_wmb() in qdisc_watchdog() While implementing a TCQ_F_THROTTLED flag there was used an smp_wmb() in qdisc_watchdog(), but since this flag is practically used only in sch_netem(), and since it's not even clear what reordering is avoided here (TCQ_F_THROTTLED vs. __QDISC_STATE_SCHED?) it seems the barrier could be safely removed. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-22 19:44:13 -08:00
David S. Miller	eb14f01959	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c	2008-12-15 20:03:50 -08:00
Jesper Dangaard Brouer	eb9b851b98	SCHED: netem: Correct documentation comment in code. The netem simulator is no longer limited by Linux timer resolution HZ. Not since Patrick McHardy changed the QoS system to use hrtimer. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-15 00:39:17 -08:00
Patrick McHardy	b94c8afcba	pkt_sched: remove unnecessary xchg() in packet schedulers The use of xchg() hasn't been necessary since 2.2.something when proper locking was added to packet schedulers. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 04:11:36 -08:00
Jarek Poplawski	f30ab418a1	pkt_sched: Remove qdisc->ops->requeue() etc. After implementing qdisc->ops->peek() and changing sch_netem into classless qdisc there are no more qdisc->ops->requeue() users. This patch removes this method with its wrappers (qdisc_requeue()), and also unused qdisc->requeue structure. There are a few minor fixes of warnings (htb_enqueue()) and comments btw. The idea to kill ->requeue() and a similar patch were first developed by David S. Miller. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-13 22:56:30 -08:00
Stephen Hemminger	265eb67fb4	netem: eliminate unneeded return values All these individual parsing functions never return an error, so they can be void. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-03 21:13:26 -08:00
Jarek Poplawski	8ba25dad0a	sch_netem: Replace ->requeue() method with open code After removing netem classful functionality we are sure its inner qdisc is tfifo, so we can replace qdisc->ops->requeue() method with open code. After this patch there are no more ops->requeue() users. The idea of this patch is by Patrick McHardy. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-02 00:36:03 -07:00
Jarek Poplawski	0220146411	sch_netem: Remove classful functionality Patrick McHardy noticed that: "a lot of the functionality of netem requires the inner tfifo anyways and rate-limiting is usually done on top of netem. So I would suggest so either hard-wire the tfifo qdisc or at least make the assumption that inner qdiscs are work-conserving.", and later: "- a lot of other qdiscs still don't work as inner qdiscs of netem [...]". So, according to his suggestion, this patch removes classful options of netem. The main reason of this change is to remove ops->requeue() method, which is currently used only by netem. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-02 00:35:24 -07:00
Jarek Poplawski	77be155cba	pkt_sched: Add peek emulation for non-work-conserving qdiscs. This patch adds qdisc_peek_dequeued() wrapper to emulate peek method with qdisc->dequeue() and storing "peeked" skb in qdisc->gso_skb until dequeuing. This is mainly for compatibility reasons not to break some strange configs because peeking is expected for non-work-conserving parent qdiscs to query work-conserving child qdiscs. This implementation requires using qdisc_dequeue_peeked() wrapper instead of directly calling qdisc->dequeue() for all qdiscs ever querried with qdisc->ops->peek() or qdisc_peek_dequeued(). Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:47:01 -07:00
Jarek Poplawski	03c05f0d4b	pkt_sched: Use qdisc->ops->peek() instead of ->dequeue() & ->requeue() Use qdisc->ops->peek() instead of ->dequeue() & ->requeue() pair. After this patch the only remaining user of qdisc->ops->requeue() is netem_enqueue(). Based on ideas of Herbert Xu, Patrick McHardy and David S. Miller. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:46:19 -07:00
Jarek Poplawski	8e3af97899	pkt_sched: Add qdisc->ops->peek() implementation. Add qdisc->ops->peek() implementation for work-conserving qdiscs. With feedback from Patrick McHardy. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:45:55 -07:00
Thomas Graf	2c10b32bf5	netlink: Remove compat API for nested attributes Removes all _nested_compat() functions from the API. The prio qdisc no longer requires them and netem has its own format anyway. Their existance is only confusing. Resend: Also remove the wrapper macro. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-09-02 17:30:27 -07:00
Jarek Poplawski	102396ae65	pkt_sched: Fix locking of qdisc_root with qdisc_root_sleeping_lock() Use qdisc_root_sleeping_lock() instead of qdisc_root_lock() where appropriate. The only difference is while dev is deactivated, when currently we can use a sleeping qdisc with the lock of noop_qdisc. This shouldn't be dangerous since after deactivation root lock could be used only by gen_estimator code, but looks wrong anyway. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-29 14:27:52 -07:00
Jarek Poplawski	c27f339af9	net_sched: Add qdisc __NET_XMIT_BYPASS flag Patrick McHardy <kaber@trash.net> noticed that it would be nice to handle NET_XMIT_BYPASS by NET_XMIT_SUCCESS with an internal qdisc flag __NET_XMIT_BYPASS and to remove the mapping from dev_queue_xmit(). David Miller <davem@davemloft.net> spotted a serious bug in the first version of this patch. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-04 22:39:11 -07:00
Jarek Poplawski	378a2f090f	net_sched: Add qdisc __NET_XMIT_STOLEN flag Patrick McHardy <kaber@trash.net> noticed: "The other problem that affects all qdiscs supporting actions is TC_ACT_QUEUED/TC_ACT_STOLEN getting mapped to NET_XMIT_SUCCESS even though the packet is not queued, corrupting upper qdiscs' qlen counters." and later explained: "The reason why it translates it at all seems to be to not increase the drops counter. Within a single qdisc this could be avoided by other means easily, upper qdiscs would still increase the counter when we return anything besides NET_XMIT_SUCCESS though. This means we need a new NET_XMIT return value to indicate this to the upper qdiscs. So I'd suggest to introduce NET_XMIT_STOLEN, return that to upper qdiscs and translate it to NET_XMIT_SUCCESS in dev_queue_xmit, similar to NET_XMIT_BYPASS." David Miller <davem@davemloft.net> noticed: "Maybe these NET_XMIT_* values being passed around should be a set of bits. They could be composed of base meanings, combined with specific attributes. So you could say "NET_XMIT_DROP \| __NET_XMIT_NO_DROP_COUNT" The attributes get masked out by the top-level ->enqueue() caller, such that the base meanings are the only thing that make their way up into the stack. If it's only about communication within the qdisc tree, let's simply code it that way." This patch is trying to realize these ideas. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-04 22:31:03 -07:00
Jussi Kivilinna	175f9c1bba	net_sched: Add size table for qdiscs Add size table functions for qdiscs and calculate packet size in qdisc_enqueue(). Based on patch by Patrick McHardy http://marc.info/?l=linux-netdev&m=115201979221729&w=2 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-20 00:08:47 -07:00
Jussi Kivilinna	0abf77e55a	net_sched: Add accessor function for packet length for qdiscs Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-20 00:08:27 -07:00
Jussi Kivilinna	5f86173bdf	net_sched: Add qdisc_enqueue wrapper Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-20 00:08:04 -07:00
David S. Miller	7698b4fcab	pkt_sched: Add and use qdisc_root() and qdisc_root_lock(). When code wants to lock the qdisc tree state, the logic operation it's doing is locking the top-level qdisc that sits of the root of the netdev_queue. Add qdisc_root_lock() to represent this and convert the easiest cases. In order for this to work out in all cases, we have to hook up the noop_qdisc to a dummy netdev_queue. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-17 19:21:19 -07:00
David S. Miller	74d58a0c1d	pkt_sched: Make netem queue agnostic. It just wants the root qdisc given an arbitrary qdisc, and that is simply qdisc->dev_queue->qdisc Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Stephen Hemminger <shemminger@vyatta.com>	2008-07-08 22:57:51 -07:00
David S. Miller	b0e1e6462d	netdev: Move rest of qdisc state into struct netdev_queue Now qdisc, qdisc_sleeping, and qdisc_list also live there. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:42:10 -07:00
David S. Miller	dc2b48475a	netdev: Move queue_lock into struct netdev_queue. The lock is now an attribute of the device queue. One thing to notice is that "suspicious" places emerge which will need specific training about multiple queue handling. They are so marked with explicit "netdev->rx_queue" and "netdev->tx_queue" references. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:18:23 -07:00
David S. Miller	5ce2d488fe	pkt_sched: Remove 'dev' member of struct Qdisc. It can be obtained via the netdev_queue. So create a helper routine, qdisc_dev(), to make the transformations nicer looking. Now, qdisc_alloc() now no longer needs a net_device pointer argument. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:06:30 -07:00
David S. Miller	bb949fbd18	netdev: Create netdev_queue abstraction. A netdev_queue is an entity managed by a qdisc. Currently there is one RX and one TX queue, and a netdev_queue merely contains a backpointer to the net_device. The Qdisc struct is augmented with a netdev_queue pointer as well. Eventually the 'dev' Qdisc member will go away and we will have the resulting hierarchy: net_device --> netdev_queue --> Qdisc Also, qdisc_alloc() and qdisc_create_dflt() now take a netdev_queue pointer argument. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 16:55:56 -07:00
Patrick McHardy	fb0305ce1b	net-sched: consolidate default fifo qdisc setup Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-05 23:40:21 -07:00
Patrick McHardy	27a3421e48	[NET_SCHED]: Use nla_policy for attribute validation in packet schedulers Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:11:22 -08:00
Patrick McHardy	b03f467200	[NET_SCHED]: sch_netem: use nla_parse_nested_compat Replace open coded equivalent of nla_parse_nested_compat(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:11:15 -08:00
Patrick McHardy	1e90474c37	[NET_SCHED]: Convert packet schedulers from rtnetlink to new netlink API Convert packet schedulers to use the netlink API. Unfortunately a gradual conversion is not possible without breaking compilation in the middle or adding lots of casts, so this patch converts them all in one step. The patch has been mostly generated automatically with some minor edits to at least allow seperate conversion of classifiers and actions. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:11:10 -08:00
Eric Dumazet	20fea08b5f	[NET]: Move Qdisc_class_ops and Qdisc_ops in appropriate sections. Qdisc_class_ops are const, and Qdisc_ops are mostly read. Using "const" and "__read_mostly" qualifiers helps to reduce false sharing. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:58 -08:00
Patrick McHardy	0ba4805383	[NET_SCHED]: Remove unnecessary includes Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:41 -07:00
Patrick McHardy	3bebcda280	[NET_SCHED]: turn PSCHED_GET_TIME into inline function Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:55 -07:00
Patrick McHardy	a084980dcb	[NET_SCHED]: kill PSCHED_SET_PASTPERFECT/PSCHED_IS_PASTPERFECT Use direct assignment and comparison instead. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:51 -07:00
Patrick McHardy	104e087898	[NET_SCHED]: kill PSCHED_TLESS Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:50 -07:00
Patrick McHardy	7c59e25f31	[NET_SCHED]: kill PSCHED_TADD/PSCHED_TADD2 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:49 -07:00
Patrick McHardy	76d643cd3b	[NET_SCHED]: sch_netem: fix off-by-one in send time comparison netem checks PSCHED_TLESS(cb->time_to_send, now) to find out whether it is allowed to send a packet, which is equivalent to cb->time_to_send < now. Use !PSCHED_TLESS(now, cb->time_to_send) instead to properly handle cb->time_to_send == now. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:47 -07:00
Stephen Hemminger	bb2f8cc0ec	[NETEM]: spelling errors Get rid of some of my creative spelling. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:33 -07:00
Stephen Hemminger	11274e5a43	[NETEM]: avoid excessive requeues The netem code would call getnstimeofday() and dequeue/requeue after every packet, even if it was waiting. Avoid this overhead by using the throttled flag. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:22 -07:00
Stephen Hemminger	075aa573b7	[NETEM]: Optimize tfifo In most cases, the next packet will be sent after the last one. So optimize that case. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:21 -07:00
Stephen Hemminger	b407621c35	[NETEM]: use better types for time values The random number generator always generates 32 bit values. The time values are limited by psched_tdiff_t Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:20 -07:00
Stephen Hemminger	a362e0a789	[NETEM]: report reorder percent correctly. If you setup netem to just delay packets; "tc qdisc ls" will report the reordering as 100%. Well it's a lie, reorder isn't used unless gap is set, so just set value to 0 so the output of utility is correct. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:20 -07:00
Arnaldo Carvalho de Melo	dc5fc579b9	[NETLINK]: Use nlmsg_trim() where appropriate Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:37 -07:00
Arnaldo Carvalho de Melo	27a884dc3c	[SK_BUFF]: Convert skb->tail to sk_buff_data_t So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes on 64bit architectures, allowing us to combine the 4 bytes hole left by the layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4 64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN... :-) Many calculations that previously required that skb->{transport,network, mac}_header be first converted to a pointer now can be done directly, being meaningful as offsets or pointers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:28 -07:00
Patrick McHardy	59cb5c6734	[NET_SCHED]: sch_netem: use hrtimer based watchdog Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:08 -07:00
YOSHIFUJI Hideaki	10297b9931	[NET] SCHED: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:08 -08:00
Patrick McHardy	e488eafcc5	[NET_SCHED]: Fix endless loops (part 5): netem/tbf/hfsc ->requeue failures When peeking at the next packet in a child qdisc by calling dequeue/requeue, the upper qdisc qlen counter may get out of sync in case the requeue fails. The qdisc and the child qdisc both have their counter decremented, but since no packet is given to the upper qdisc it won't decrement its counter itself. requeue should not fail, so this is mostly for "correctness". Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:31:46 -08:00
Patrick McHardy	5e50da01d0	[NET_SCHED]: Fix endless loops (part 2): "simple" qdiscs Convert the "simple" qdiscs to use qdisc_tree_decrease_qlen() where necessary: - all graft operations - destruction of old child qdiscs in prio, red and tbf change operation - purging of queue in sfq change operation Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:31:43 -08:00
Patrick McHardy	9f9afec482	[NET_SCHED]: Set parent classid in default qdiscs Set parent classids in default qdiscs to allow walking up the tree from outside the qdiscs. This is needed by the next patch. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:31:41 -08:00
Stephen Hemminger	798b6b19d7	[PATCH] skge, sky2, et all. gplv2 only I don't want my code to downgraded to GPLv3 because of cut-n-pasted the comments. These files which I hold copyright on were started before it was clear what GPLv3 was going to be. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-10-31 20:22:06 -05:00
David S. Miller	4e8a520150	[PKT_SCHED] netem: Orphan SKB when adding to queue. The networking emulator can queue SKBs for a very long time, so if you're using netem on the sender side for large bandwidth/delay product testing, the SKB socket send queue sizes become artificially larger. Correct this by calling skb_orphan() in netem_enqueue(). Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-22 21:00:33 -07:00
Patrick McHardy	84fa7933a3	[NET]: Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose checksum still needs to be completed) and CHECKSUM_COMPLETE (for incoming packets, device supplied full checksum). Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:53 -07:00
Guillaume Chazarain	89e1df74f8	[PKT_SCHED] netem: Fix slab corruption with netem (2nd try) CONFIG_DEBUG_SLAB found the following bug: netem_enqueue() in sch_netem.c gets a pointer inside a slab object: struct netem_skb_cb cb = (struct netem_skb_cb )skb->cb; But then, the slab object may be freed: skb = skb_unshare(skb, GFP_ATOMIC) cb is still pointing inside the freed skb, so here is a patch to initialize cb later, and make it clear that initializing it sooner is a bad idea. [From Stephen Hemminger: leave cb unitialized in order to let gcc complain in case of use before initialization] Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-21 14:45:25 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Stephen Hemminger	89bbb0a361	[PKT_SCHED] netem: fix loss The following one line fix is needed to make loss function of netem work right when doing loss on the local host. Otherwise, higher layers just recover. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-29 18:33:12 -07:00
Patrick McHardy	6d037a26f0	[PKT_SCHED]: Qdisc drop operation is optional The drop operation is optional and qdiscs must check if childs support it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:00:49 -08:00
Stephen Hemminger	c865e5d99e	[PKT_SCHED] netem: packet corruption option Here is a new feature for netem in 2.6.16. It adds the ability to randomly corrupt packets with netem. A version was done by Hagen Paul Pfeifer, but I redid it to handle the cases of backwards compatibility with netlink interface and presence of hardware checksum offload. It is useful for testing hardware offload in devices. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:05 -08:00
Andrea Bittau	aa8751667d	[PKT_SCHED]: sch_netem: correctly order packets to be sent simultaneously If two packets were queued to be sent at the same time in the future, their order would be reversed. This would occur because the queue is traversed back to front, and a position is found by checking whether the new packet needs to be sent before the packet being examined. If the new packet is to be sent at the same time of a previous packet, it would end up before the old packet in the queue. This patch places packets in the correct order when they are queued to be sent at a same time in the future. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-20 13:41:05 -08:00
Stephen Hemminger	eb229c4cdc	[NETEM]: Add version string Add a version string to help support issues. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-05 20:59:21 -02:00
Stephen Hemminger	300ce174eb	[NETEM]: Support time based reordering Change netem to support packets getting reordered because of variations in delay. Introduce a special case version of FIFO that queues packets in order based on the netem delay. Since netem is classful, those users that don't want jitter based reordering can just insert a pfifo instead of the default. This required changes to generic skbuff code to allow finer grain manipulation of sk_buff_head. Insertion into the middle and reverse walk. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-05 20:56:41 -02:00
Stephen Hemminger	07aaa11540	[NETEM]: use PSCHED_LESS Convert netem to use PSCHED_LESS and warn if requeue fails. With some of the psched clock sources, the subtraction doesn't work always work right without wrapping. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-05 17:03:46 -02:00
Stephen Hemminger	0dca51d362	[PKT_SCHED] netem: allow random reordering (with fix) Here is a fixed up version of the reorder feature of netem. It is the same as the earlier patch plus with the bugfix from Julio merged in. Has expected backwards compatibility behaviour. Go ahead and merge this one, the TCP strangeness I was seeing was due to the reordering bug, and previous version of TSO patch. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:55:48 -07:00
Stephen Hemminger	0f9f32ac65	[PKT_SCHED] netem: use only inner qdisc -- no private skbuff queue Netem works better if there if packets are just queued in the inner discipline rather than having a separate delayed queue. Change to use the dequeue/requeue to peek like TBF does. By doing this potential qlen problems with the old method are avoided. The problems happened when the netem_run that moved packets from the inner discipline to the nested discipline failed (because inner queue was full). This happened in dequeue, so the effective qlen of the netem would be decreased (because of the drop), but there was no way to keep the outer qdisc (caller of netem dequeue) in sync. The problem window is still there since this patch doesn't address the issue of requeue failing in netem_dequeue, but that shouldn't happen since the sequence dequeue/requeue should always work. Long term correct fix is to implement qdisc->peek in all the qdisc's to allow for this (needed by several other qdisc's as well). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:55:01 -07:00
Stephen Hemminger	0afb51e728	[PKT_SCHED]: netem: reinsert for duplication Handle duplication of packets in netem by re-inserting at top of qdisc tree. This avoid problems with qlen accounting with nested qdisc. This recursion requires no additional locking but will potentially increase stack depth. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:53:49 -07:00
Stephen Hemminger	d5d75cd6b1	[PKT_SCHED]: netetm: adjust parent qlen when duplicating Fix qlen underrun when doing duplication with netem. If netem is used as leaf discipline, then the parent needs to be tweaked when packets are duplicated. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:24:57 -07:00
Stephen Hemminger	771018e76a	[PKT_SCHED]: netetm: make qdisc friendly to outer disciplines Netem currently dumps packets into the queue when timer expires. This patch makes work by self-clocking (more like TBF). It fixes a bug when 0 delay is requested (only doing loss or duplication). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:24:32 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

1 2 3 4 5

206 Commits