29 Oct, 2009
3 commits
-
Adding an accessor to existing dst_entry feautres field and
refactor the only supported feature (allfrag) to use it.Signed-off-by: Gilad Ben-Yossef
Sigend-off-by: Ori Finkelman
Sigend-off-by: Yony Amit
Signed-off-by: David S. Miller -
We need tcp_parse_options to be aware of dst_entry to
take into account per dst_entry TCP options settingsSigned-off-by: Gilad Ben-Yossef
Sigend-off-by: Ori Finkelman
Sigend-off-by: Yony Amit
Signed-off-by: David S. Miller
28 Oct, 2009
3 commits
-
Adding a list_head parameter to rtnl_link_ops->dellink() methods
allow us to queue devices on a list, in order to dismantle
them all at once.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
ieee80211_rx() must be called with bottom halves disabled. To simplify
driver development implement ieee80211_rx_ni() which disables BH. This
function must be used when in process context.Signed-off-by: Kalle Valo
Signed-off-by: John W. Linville -
Get rid of cookies in cfg80211_send_XXX() functions.
Signed-off-by: Holger Schurig
Signed-off-by: John W. Linville
27 Oct, 2009
1 commit
-
Conflicts:
drivers/net/sh_eth.c
24 Oct, 2009
2 commits
-
When handling large number of netdevice, rtnl_dump_ifinfo()
is very slow because it has O(N^2) complexity.Instead of scanning one single list, we can use the 256 sub lists
of the dev_index hash table.This considerably speedups "ip link" operations
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
SIT tunnels use one rwlock to protect their prl entries.
This first patch adds RCU locking for prl management,
with standard call_rcu() calls.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
23 Oct, 2009
1 commit
-
This adds support for setting the skb mark.
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
21 Oct, 2009
2 commits
-
dst_negative_advice() should check for changed dst and reset
sk_tx_queue_mapping accordingly. Pass sock to the callers of
dst_negative_advice.(sk_reset_txq is defined just for use by dst_negative_advice. The
only way I could find to get around this is to move dst_negative_()
from dst.h to dst.c, include sock.h in dst.c, etc)Signed-off-by: Krishna Kumar
Signed-off-by: David S. Miller -
Introduce sk_tx_queue_mapping; and functions that set, test and
get this value. Reset sk_tx_queue_mapping to -1 whenever the dst
cache is set/reset, and in socket alloc. Setting txq to -1 and
using valid txq= allows the tx path to use the value
of sk_tx_queue_mapping directly instead of subtracting 1 on every
tx.Signed-off-by: Krishna Kumar
Signed-off-by: David S. Miller
20 Oct, 2009
2 commits
-
commit 9e337b0f (net: annotate inet_timewait_sock bitfields)
added 4/8 bytes in struct inet_timewait_sock.Fix this by declaring tw_ipv6_offset in the 'flags' bitfield
The 14 bits hole is named tw_pad to make it cleary apparent.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Reported-by: Stephen Rothwell
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller
19 Oct, 2009
4 commits
-
The last users of skb_icv_walk are converted to ahash now,
so skb_icv_walk is unused and can be removed.Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller -
ah4 and ah6 are converted to ahash now, so we can remove the
code for the obsolete hash algorithm.Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller -
To support for ahash algorithms, we add a pointer to a
crypto_ahash to ah_data.Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller -
In order to have better cache layouts of struct sock (separate zones
for rx/tx paths), we need this preliminary patch.Goal is to transfert fields used at lookup time in the first
read-mostly cache line (inside struct sock_common) and move sk_refcnt
to a separate cache line (only written by rx path)This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
sport and id fields. This allows a future patch to define these
fields as macros, like sk_refcnt, without name clashes.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
15 Oct, 2009
3 commits
-
Signed-off-by: Rémi Denis-Courmont
Signed-off-by: David S. Miller -
The Phonet "universe" only has 64 addresses, so we keep a trivial flat
routing table.Signed-off-by: Rémi Denis-Courmont
Signed-off-by: David S. Miller -
Signed-off-by: Rémi Denis-Courmont
Signed-off-by: David S. Miller
14 Oct, 2009
2 commits
13 Oct, 2009
4 commits
-
Storing the mask (size - 1) instead of the size allows fast path to be
a bit faster.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Meaning receive multiple messages, reducing the number of syscalls and
net stack entry/exit operations.Next patches will introduce mechanisms where protocols that want to
optimize this operation will provide an unlocked_recvmsg operation.This takes into account comments made by:
. Paul Moore: sock_recvmsg is called only for the first datagram,
sock_recvmsg_nosec is used for the rest.. Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
works in the same fashion as the ppoll one.If the underlying protocol returns a datagram with MSG_OOB set, this
will make recvmmsg return right away with as many datagrams (+ the OOB
one) it has received so far.. Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
datagrams and then recvmsg returns an error, recvmmsg will return
the successfully received datagrams, store the error and return it
in the next call.This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
where we will be able to acquire the lock only at batch start and end, not at
every underlying recvmsg call.Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller -
Create a new socket level option to report number of queue overflows
Recently I augmented the AF_PACKET protocol to report the number of frames lost
on the socket receive queue between any two enqueued frames. This value was
exported via a SOL_PACKET level cmsg. AFter I completed that work it was
requested that this feature be generalized so that any datagram oriented socket
could make use of this option. As such I've created this patch, It creates a
new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a
SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue
overflowed between any two given frames. It also augments the AF_PACKET
protocol to take advantage of this new feature (as it previously did not touch
sk->sk_drops, which this patch uses to record the overflow count). Tested
successfully by me.Notes:
1) Unlike my previous patch, this patch simply records the sk_drops value, which
is not a number of drops between packets, but rather a total number of drops.
Deltas must be computed in user space.2) While this patch currently works with datagram oriented protocols, it will
also be accepted by non-datagram oriented protocols. I'm not sure if thats
agreeable to everyone, but my argument in favor of doing so is that, for those
protocols which aren't applicable to this option, sk_drops will always be zero,
and reporting no drops on a receive queue that isn't used for those
non-participating protocols seems reasonable to me. This also saves us having
to code in a per-protocol opt in mechanism.3) This applies cleanly to net-next assuming that commit
977750076d98c7ff6cbda51858bb5a5894a9d9ab (my af packet cmsg patch) is revertedSigned-off-by: Neil Horman
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
ieee80211_rx() must be called with softirqs disabled
since the networking stack requires this for netif_rx()
and some code in mac80211 can assume that it can not
be processing its own tasklet and this call at the same
time.It may be possible to remove this requirement after a
careful audit of mac80211 and doing any needed locking
improvements in it along with disabling softirqs around
netif_rx(). An alternative might be to push all packet
processing to process context in mac80211, instead of
to the tasklet, and add other synchronisation.Signed-off-by: Johannes Berg
Signed-off-by: John W. Linville
12 Oct, 2009
1 commit
-
Since commit a98b65a3 (net: annotate struct sock bitfield), we lost
8 bytes in struct sock on 64bit arches because of
kmemcheck_bitfield_end(flags) misplacement.Fix this by putting together sk_shutdown, sk_no_check, sk_userlocks,
sk_protocol and sk_type in the 'flags' 32bits bitfieldSigned-off-by: Eric Dumazet
Signed-off-by: David S. Miller
10 Oct, 2009
1 commit
08 Oct, 2009
3 commits
-
UDP_HTABLE_SIZE was initialy defined to 128, which is a bit small for
several setups.4000 active UDP sockets -> 32 sockets per chain in average. An
incoming frame has to lookup all sockets to find best match, so long
chains hurt latency.Instead of a fixed size hash table that cant be perfect for every
needs, let UDP stack choose its table size at boot time like tcp/ip
route, using alloc_large_system_hash() helperAdd an optional boot parameter, uhash_entries=x so that an admin can
force a size between 256 and 65536 if needed, like thash_entries and
rhash_entries.dmesg logs two new lines :
[ 0.647039] UDP hash table entries: 512 (order: 0, 4096 bytes)
[ 0.647099] UDP Lite hash table entries: 512 (order: 0, 4096 bytes)Maximal size on 64bit arches would be 65536 slots, ie 1 MBytes for non
debugging spinlocks.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
It's useful to provide firmware and hardware version to user space and have a
generic interface to retrieve them. Users can provide the version information
in bug reports etc.Add fields for firmware and hardware version to struct wiphy.
(Dropped nl80211 bits for now and modified remaining bits in favor of
ethtool. -- JWL)Cc: Kalle Valo
Signed-off-by: John W. Linville -
Refactor wext to
* split out iwpriv handling
* split out iwspy handling
* split out procfs support
* allow cfg80211 to have wireless extensions compat code
w/o CONFIG_WIRELESS_EXTAfter this, drivers need to
- select WIRELESS_EXT - for wext support
- select WEXT_PRIV - for iwpriv support
- select WEXT_SPY - for iwspy supportexcept cfg80211 -- which gets new hooks in wext-core.c
and can then get wext handlers without CONFIG_WIRELESS_EXT.Wireless extensions procfs support is auto-selected
based on PROC_FS and anything that requires the wext core
(i.e. WIRELESS_EXT or CFG80211_WEXT).Signed-off-by: Johannes Berg
Signed-off-by: John W. Linville
07 Oct, 2009
4 commits
-
All usages of structure net_proto_ops should be declared const.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
Jarek Poplawski a écrit :
>
>
> Hmm... So you made me to do some "real" work here, and guess what?:
> there is one serious checkpatch warning! ;-) Plus, this new parameter
> should be added to the function description. Otherwise:
> Signed-off-by: Jarek Poplawski
>
> Thanks,
> Jarek P.
>
> PS: I guess full "Don't" would show we really mean it...Okay :) Here is the last round, before the night !
Thanks again
[RFC] pkt_sched: gen_estimator: Don't report fake rate estimators
We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
is running.# tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
one (because no estimator is active)After this patch, tc command output is :
$ tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0We add a parameter to gnet_stats_copy_rate_est() function so that
it can use gen_estimator_active(bstats, r), as suggested by Jarek.This parameter can be NULL if check is not necessary, (htb for
example has a mandatory rate estimator)Signed-off-by: Eric Dumazet
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller -
IPv6 Rapid Deployment (6rd; draft-ietf-softwire-ipv6-6rd) builds upon
mechanisms of 6to4 (RFC3056) to enable a service provider to rapidly
deploy IPv6 unicast service to IPv4 sites to which it provides
customer premise equipment. Like 6to4, it utilizes stateless IPv6 in
IPv4 encapsulation in order to transit IPv4-only network
infrastructure. Unlike 6to4, a 6rd service provider uses an IPv6
prefix of its own in place of the fixed 6to4 prefix.With this option enabled, the SIT driver offers 6rd functionality by
providing additional ioctl API to configure the IPv6 Prefix for in
stead of static 2002::/16 for 6to4.Original patch was done by Alexandre Cassen
based on old Internet-Draft.Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller -
An incoming datagram must bring into cpu cache *lot* of cache lines,
in particular : (other parts omitted (hash chains, ip route cache...))On 32bit arches :
offsetof(struct sock, sk_rcvbuf) =0x30 (read)
offsetof(struct sock, sk_lock) =0x34 (rw)offsetof(struct sock, sk_sleep) =0x50 (read)
offsetof(struct sock, sk_rmem_alloc) =0x64 (rw)
offsetof(struct sock, sk_receive_queue)=0x74 (rw)offsetof(struct sock, sk_forward_alloc)=0x98 (rw)
offsetof(struct sock, sk_callback_lock)=0xcc (rw)
offsetof(struct sock, sk_drops) =0xd8 (read if we add dropcount support, rw if frame dropped)
offsetof(struct sock, sk_filter) =0xf8 (read)offsetof(struct sock, sk_socket) =0x138 (read)
offsetof(struct sock, sk_data_ready) =0x15c (read)
We can avoid sk->sk_socket and socket->fasync_list referencing on sockets
with no fasync() structures. (socket->fasync_list ptr is probably already in cache
because it shares a cache line with socket->wait, ie location pointed by sk->sk_sleep)This avoids one cache line load per incoming packet for common cases (no fasync())
We can leave (or even move in a future patch) sk->sk_socket in a cold location
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
05 Oct, 2009
2 commits
-
We currently dirty a cache line to update tunnel device stats
(tx_packets/tx_bytes). We better use the txq->tx_bytes/tx_packets
counters that already are present in cpu cache, in the cache
line shared with txq->_xmit_lockThis patch extends IPTUNNEL_XMIT() macro to use txq pointer
provided by the caller.Also &tunnel->dev->stats can be replaced by &dev->stats
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
The FIB algorithim for IPV4 is set at compile time, but kernel goes through
the overhead of function call indirection at runtime. Save some
cycles by turning the indirect calls to direct calls to either
hash or trie code.Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
01 Oct, 2009
1 commit
-
This provides safety against negative optlen at the type
level instead of depending upon (sometimes non-trivial)
checks against this sprinkled all over the the place, in
each and every implementation.Based upon work done by Arjan van de Ven and feedback
from Linus Torvalds.Signed-off-by: David S. Miller
29 Sep, 2009
1 commit
-
The move away from having drivers assign wireless handlers,
in favour of making cfg80211 assign them, broke the sysfs
registration (the wireless/ dir went missing) because the
handlers are now assigned only after registration, which is
too late.Fix this by special-casing cfg80211-based devices, all
of which are required to have an ieee80211_ptr, in the
sysfs code, and also using get_wireless_stats() to have
the same values reported as in procfs.Signed-off-by: Johannes Berg
Reported-by: Hugh Dickins
Tested-by: Hugh Dickins
Signed-off-by: John W. Linville