14 Jan, 2008
1 commit
-
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
12 Jan, 2008
1 commit
-
The bridge code incorrectly causes two POST_ROUTING hook invocations
for DNATed packets that end up on the same bridge device. This
happens because packets with a changed destination address are passed
to dst_output() to make them go through the neighbour output function
again to build a new destination MAC address, before they will continue
through the IP hooks simulated by bridge netfilter.The resulting hook order is:
PREROUTING (bridge netfilter)
POSTROUTING (dst_output -> ip_output)
FORWARD (bridge netfilter)
POSTROUTING (bridge netfilter)The deferred hooks used to abort the first POST_ROUTING invocation,
but since the only thing bridge netfilter actually really wants is
a new MAC address, we can avoid going through the IP stack completely
by simply calling the neighbour output function directly.Tested, reported and lots of data provided by: Damien Thebault
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller
11 Jan, 2008
8 commits
-
Use the @helper variable that was just obtained.
Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
RFC2464 says that the next to lowerst order bit of the first octet
of the Interface Identifier is formed by complementing
the Universal/Local bit of the EUI-64. But ip6t_eui64 uses OR not XOR.Thanks Peter Ivancik for reporing this bug and posting a patch
for it.Signed-off-by: Yasuyuki Kozakai
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Don't allow to nest macvlan devices since it will cause lockdep
warnings and isn't really useful for anything.Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Allow vlans nesting other vlans without lockdep's warnings (max. 2 levels
i.e. parent + child). Thanks to Patrick McHardy for pointing a bug in the
first version of this patch.Reported-by: Benny Amorsen
Signed-off-by: Jarek Poplawski
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
In dn_rt_cache_get_next(), no need to guard seq->private by a
rcu_dereference() since seq is private to the thread running this
function. Reading seq.private once (as guaranted bu rcu_dereference())
or several time if compiler really is dumb enough wont change the
result.But we miss real spots where rcu_dereference() are needed, both in
dn_rt_cache_get_first() and dn_rt_cache_get_next()Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
In the (rare) event of simultaneous mutual wake up requests,
do send the chip an explicit wake-up ack. This is required
for Texas Instruments's BRF6350 chip.Signed-off-by: Ohad Ben-Cohen
Signed-off-by: David S. Miller -
1) In tty.c the BUG_ON at line 115 will never be called, because the the
before list_del_init in this same function.
115 BUG_ON(!list_empty(&dev->list));
So move the list_del_init to rfcomm_dev_del2) The rfcomm_dev_del could be called from diffrent path
(rfcomm_tty_hangup/rfcomm_dev_state_change/rfcomm_release_dev),So add another BUG_ON when the rfcomm_dev_del is called more than
one time.Signed-off-by: Dave Young
Signed-off-by: David S. Miller -
Bernard Pidoux F6BVP reported:
> When I killall kissattach I can see the following message.
>
> This happens on kernel 2.6.24-rc5 already patched with the 6 previously
> patches I sent recently.
>
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.23.9 #1
> -------------------------------------------------------
> kissattach/2906 is trying to acquire lock:
> (linkfail_lock){-+..}, at: [] ax25_link_failed+0x11/0x39 [ax25]
>
> but task is already holding lock:
> (ax25_list_lock){-+..}, at: [] ax25_device_event+0x38/0x84
> [ax25]
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
...lockdep is worried about the different order here:
#1 (rose_neigh_list_lock){-+..}:
#3 (ax25_list_lock){-+..}:#0 (linkfail_lock){-+..}:
#1 (rose_neigh_list_lock){-+..}:#3 (ax25_list_lock){-+..}:
#0 (linkfail_lock){-+..}:So, ax25_list_lock could be taken before and after linkfail_lock.
I don't know if this three-thread clutch is very probable (or
possible at all), but it seems another bug reported by Bernard
("[...] system impossible to reboot with linux-2.6.24-rc5")
could have similar source - namely ax25_list_lock held by
ax25_kill_by_device() during ax25_disconnect(). It looks like the
only place which calls ax25_disconnect() this way, so I guess, it
isn't necessary.This patch is breaking the lock for ax25_disconnect().
Reported-and-tested-by: Bernard Pidoux
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller
10 Jan, 2008
6 commits
-
sfuzz can easily trigger any of those.
move the printk message to the corresponding comment: makes the
intention of the code clear and easy to pick up on an scheduled
removal. as bonus simplify the braces placement.Signed-off-by: maximilian attems
Signed-off-by: David S. Miller -
In rt_cache_get_next(), no need to guard seq->private by a
rcu_dereference() since seq is private to the thread running this
function. Reading seq.private once (as guaranted bu rcu_dereference())
or several time if compiler really is dumb enough wont change the
result.But we miss real spots where rcu_dereference() are needed, both in
rt_cache_get_first() and rt_cache_get_next()Signed-off-by: Eric Dumazet
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
The neightbl_fill_parms() is called under the write-locked tbl->lock
and accesses the parms->dev. The negh_parm_release() calls the
dev_put(parms->dev) without this lock. This creates a tiny race window
on which the parms contains potentially stale dev pointer.To fix this race it's enough to move the dev_put() upper under the
tbl->lock, but note, that the parms are held by neighbors and thus can
live after the neigh_parms_release() is called, so we still can have a
parm with bad dev pointer.I didn't find where the neigh->parms->dev is accessed, but still think
that putting the dev is to be done in a place, where the parms are
really freed. Am I right with that?Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller -
From: Mirko Lindner
This patch makes necessary changes in the Neptune driver to support
the new Marvell PHY. It also adds support for the LED blinking
on Neptune cards with Marvell PHY. All registers are using defines
in the niu.h header file as is already done for the BCM8704 registers.[ Coding style, etc. cleanups -DaveM ]
Signed-off-by: David S. Miller
-
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (36 commits)
[ATM]: Check IP header validity in mpc_send_packet
[IPV6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()
[CONNECTOR]: Don't touch queue dev after decrement of ref count.
[SOCK]: Adds a rcu_dereference() in sk_filter
[XFRM]: xfrm_algo_clone() allocates too much memory
[FORCEDETH]: Fix reversing the MAC address on suspend.
[NET]: mcs7830 passes msecs instead of jiffies to usb_control_msg
[LRO] Fix lro_mgr->features checks
[NET]: Clone the sk_buff 'iif' field in __skb_clone()
[IPV4] ROUTE: ip_rt_dump() is unecessary slow
[NET]: kaweth was forgotten in msec switchover of usb_start_wait_urb
[NET] Intel ethernet drivers: update MAINTAINERS
[NET]: Make ->poll() breakout consistent in Intel ethernet drivers.
[NET]: Stop polling when napi_disable() is pending.
[NET]: Fix drivers to handle napi_disable() disabling interrupts.
[NETXEN]: Fix ->poll() done logic.
mac80211: return an error when SIWRATE doesn't match any rate
ssb: Fix probing of PCI cores if PCI and PCIE core is available
[NET]: Do not check netif_running() and carrier state in ->poll()
[NET]: Add NAPI_STATE_DISABLE.
... -
The show_task function invoked by sysrq-t et al displays the
pid and parent's pid of each task. It seems more useful to
show the actual process hierarchy here than who is using
ptrace on each process.Signed-off-by: Roland McGrath
Signed-off-by: Linus Torvalds
09 Jan, 2008
24 commits
-
Al went through the ip_fast_csum callers and found this piece of code
that did not validate the IP header. While root crashing the machine
by sending bogus packets through raw or AF_PACKET sockets isn't that
serious, it is still nice to react gracefully.This patch ensures that the skb has enough data for an IP header and
that the header length field is valid.Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
Signed-off-by: Brian Haley
Acked-by: David L Stevens
Signed-off-by: David S. Miller -
cn_queue_free_callback() will touch 'dev'(i.e. cbq->pdev), so it
should be called before atomic_dec(&dev->refcnt).Signed-off-by: Li Zefan
Signed-off-by: David S. Miller -
It seems commit fda9ef5d679b07c9d9097aaf6ef7f069d794a8f9 introduced a RCU
protection for sk_filter(), without a rcu_dereference()Either we need a rcu_dereference(), either a comment should explain why we
dont need it. I vote for the former.Signed-off-by: Eric Dumazet
Acked-by: Herbert Xu
Signed-off-by: David S. Miller -
alg_key_len is the length in bits of the key, not in bytes.
Best way to fix this is to move alg_len() function from net/xfrm/xfrm_user.c
to include/net/xfrm.h, and to use it in xfrm_algo_clone()alg_len() is renamed to xfrm_alg_len() because of its global exposition.
Signed-off-by: Eric Dumazet
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
For cards that initially have the MAC address stored in reverse order,
the forcedeth driver uses a flag to signal whether the address was
already corrected, so that it is not reversed again on a subsequent
probe.Unfortunately this flag, which is stored in a register of the card,
seems to get lost during suspend, resulting in the MAC address being
reversed again. To fix that, the MAC address needs to be written back
in reversed order before we suspend and the flag needs to be reset.The flag is still required because at least kexec will never write
back the reversed address and thus needs to know what state the card
is in.Signed-off-by: Björn Steinbrink
Signed-off-by: David S. Miller -
usb_control_msg was changed long ago (2.6.12-pre) to take milliseconds
instead of jiffies. Oddly, mcs7830 wasn't added until 2.6.19-rc3.Signed-off-by: Russ Dill
Signed-off-by: David S. Miller -
lro_mgr->features contains a bitmask of LRO_F_* values which are
defined as power of two, not as bit indexes.
They must be checked with x&LRO_F_FOO, not with test_bit(LRO_F_FOO,&x).Signed-off-by: Brice Goglin
Acked-by: Andrew Gallatin
Signed-off-by: David S. Miller -
Both NetLabel and SELinux (other LSMs may grow to use it as well) rely
on the 'iif' field to determine the receiving network interface of
inbound packets. Unfortunately, at present this field is not
preserved across a skb clone operation which can lead to garbage
values if the cloned skb is sent back through the network stack. This
patch corrects this problem by properly copying the 'iif' field in
__skb_clone() and removing the 'iif' field assignment from
skb_act_clone() since it is no longer needed.Also, while we are here, put the assignments in the same order as the
offsets to reduce cacheline bounces.Signed-off-by: Paul Moore
Signed-off-by: David S. Miller -
I noticed "ip route list cache x.y.z.t" can be *very* slow.
While strace-ing -T it I also noticed that first part of route cache
is fetched quite fast :recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"p\0\0\0\30\0\2\0\254i\202
GXm\0\0\2 \0\376\0\0\2\0\2\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3772
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\234\0\0\0\30\0\2\0\254i\
202GXm\0\0\2 \0\376\0\0\1\0\2"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3736
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\204\0\0\0\30\0\2\0\254i\
202GXm\0\0\2 \0\376\0\0\1\0\2"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3740
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\234\0\0\0\30\0\2\0\254i\
202GXm\0\0\2 \0\376\0\0\1\0\2"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3712
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\204\0\0\0\30\0\2\0\254i\
202GXm\0\0\2 \0\376\0\0\1\0\2"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3732
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"p\0\0\0\30\0\2\0\254i\202
GXm\0\0\2 \0\376\0\0\2\0\2\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3708
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"p\0\0\0\30\0\2\0\254i\202
GXm\0\0\2 \0\376\0\0\2\0\2\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3680while the part at the end of the table is more expensive:
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\204\0\0\0\30\0\2\0\254i\202GXm\0\0\2 \0\376\0\0\1\0\2"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3656
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\204\0\0\0\30\0\2\0\254i\202GXm\0\0\2 \0\376\0\0\1\0\2"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3772
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"p\0\0\0\30\0\2\0\254i\202GXm\0\0\2 \0\376\0\0\2\0\2\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3712
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"p\0\0\0\30\0\2\0\254i\202GXm\0\0\2 \0\376\0\0\2\0\2\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3700
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"p\0\0\0\30\0\2\0\254i\202GXm\0\0\2 \0\376\0\0\2\0\2\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3676
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"p\0\0\0\30\0\2\0\254i\202GXm\0\0\2 \0\376\0\0\2\0\2\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3724
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\234\0\0\0\30\0\2\0\254i\202GXm\0\0\2 \0\376\0\0\1\0\2"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 3736The following patch corrects this performance/latency problem,
removing quadratic behavior.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Back in 2.6.12-pre, usb_start_wait_urb was switched over to take
milliseconds instead of jiffies. kaweth.c was never updated to match.Signed-off-by: Russ Dill
Signed-off-by: David S. Miller -
Unfortunately Jeb decided to move away from our group. We wish Jeb
good luck with his new group!Reordered people a bit so most active team members are on top.
Signed-off-by: Auke Kok
Signed-off-by: David S. Miller -
This makes the ->poll() routines of the E100, E1000, E1000E, IXGB, and
IXGBE drivers complete ->poll() consistently.Now they will all break out when the amount of RX work done is less
than 'budget'.At a later time, we may want put back code to include the TX work as
well (as at least one other NAPI driver does, but by in large NAPI
drivers do not do this). But if so, it should be done consistently
across the board to all of these drivers.Signed-off-by: David S. Miller
Acked-by: Auke Kok -
This finally adds the code in net_rx_action() to break out of the
->poll()'ing loop when a napi_disable() is found to be pending.Now, even if a device is being flooded with packets it can be cleanly
brought down.Signed-off-by: David S. Miller
-
When we add the generic napi_disable_pending() breakout
logic to net_rx_action() it means that napi_disable()
can cause NAPI poll interrupt events to be disabled.And this is exactly what we want. If a napi_disable()
is pending, and we are looping in the ->poll(), we want
->poll() event interrupts to stay disabled and we want
to complete the NAPI poll ASAP.When ->poll() break out during device down was being handled on a
per-driver basis, often these drivers would turn interrupts back on
when '!netif_running()' was detected.And this would just cause a reschedule of the NAPI ->poll() in the
interrupt handler before the napi_disable() could get in there and
grab the NAPI_STATE_SCHED bit.The vast majority of drivers don't care if napi_disable() might have
the side effect of disabling NAPI ->poll() event interrupts. In all
such cases, when a napi_disable() is performed, the driver just
disabled interrupts or is about to.However there were three exceptions to this in PCNET32, R8169, and
SKY2. To fix those cases, at the subsequent napi_enable() points, I
added code to ensure that the ->poll() interrupt events are enabled in
the hardware.Signed-off-by: David S. Miller
Acked-by: Don Fry -
If work_done >= budget we should always elide the NAPI
completion.Signed-off-by: David S. Miller
-
Currently mac80211 fails silently when trying to set a nonexistent
rate. Return an error instead.Signed-Off-By: Andy Lutomirski
Signed-off-by: John W. Linville -
This will make sure that always the correct core is selected, even if
there are both a PCI and PCI-E core on a PCI or PCI-E card.Signed-off-by: Michael Buesch
Signed-off-by: John W. Linville -
Drivers do this to try to break out of the ->poll()'ing loop
when the device is being brought administratively down.Now that we have a napi_disable() "pending" state we are going
to solve that problem generically.Signed-off-by: David S. Miller
-
Create a bit to signal that a napi_disable() is in progress.
This sets up infrastructure such that net_rx_action() can generically
break out of the ->poll() loop on a NAPI context that has a pending
napi_disable() yet is being bombed with packets (and thus would
otherwise poll endlessly and not allow the napi_disable() to finish).Now, what napi_disable() does is first set the NAPI_STATE_DISABLE bit
(to indicate that a disable is pending), then it polls for the
NAPI_STATE_SCHED bit, and once the NAPI_STATE_SCHED bit is acquired
the NAPI_STATE_DISABLE bit is cleared. Here, the test_and_set_bit()
provides the necessary memory barrier between the various bitops.napi_schedule_prep() now tests for a pending disable as it's first
action and won't try to obtain the NAPI_STATE_SCHED bit if a disable
is pending.As a result, we can remove the netif_running() check in
netif_rx_schedule_prep() because the NAPI disable pending state serves
this purpose. And, it does so in a NAPI centric manner which is what
we really want.Signed-off-by: David S. Miller
-
It is pointless, because everything that can make a device go away
will do a napi_disable() first.The main impetus behind this is that now we can legally do a NAPI
completion in generic code like net_rx_action() which a following
changeset needs to do. net_rx_action() can only perform actions
in NAPI centric ways, because there may be a one to many mapping
between NAPI contexts and network devices (SKY2 is one example).We also want to get rid of this because it's an extra atomic in the
NAPI paths, and also because it is one of the last instances where the
NAPI interfaces care about net devices.The one remaining netdev detail the NAPI stuff cares about is the
netif_running() check which will be killed off in a subsequent
changeset.Signed-off-by: David S. Miller
-
This patch fixes the parsing of the RX data header channel field.
The current code parses the header incorrectly and passes a wrong
channel number and frequency for each frame to mac80211.
The FIXMEs added by this patch don't matter for now as the code
where they live won't get executed anyway. They will be fixed later.Signed-off-by: Michael Buesch
Signed-off-by: John W. Linville -
easy to trigger as user with sfuzz.
irda_create() is quiet on unknown sock->type,
match this behaviour for SOCK_DGRAM unknown protocolSigned-off-by: maximilian attems
Signed-off-by: David S. Miller -
Some recent changes completely removed accounting for the FORWARD_TSN
parameter length in the INIT and INIT-ACK chunk. This is wrong and
should be restored.Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller