Doug / smarc-fsl-linux-kernel | Embedian Git Server

18 Jun, 2009

2 commits

31e6d363a net: correct off-by-one write allocations reports

commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.

We need to take into account this offset when reporting
sk_wmem_alloc to user, in PROC_FS files or various
ioctls (SIOCOUTQ/TIOCOUTQ)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-06-18 15:29:12 +0800
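
The reporting fix described in this commit can be illustrated with a small userspace sketch. The names below (`toy_sock`, `toy_wmem_alloc_get` and so on) are hypothetical stand-ins, not the kernel's definitions. The commit message only says that the counter now starts at a non-zero value and that the offset must be taken into account when reporting; that the offset is exactly 1 is an assumption drawn from the "off-by-one" in the title.

```c
/* Hypothetical stand-in for the socket field involved. */
struct toy_sock {
    int wmem_alloc;   /* bytes queued for transmit, plus the initial offset */
};

/* Assumed offset: the counter starts at 1 rather than 0. */
#define TOY_WMEM_OFFSET 1

static void toy_sock_init(struct toy_sock *sk)
{
    sk->wmem_alloc = TOY_WMEM_OFFSET;
}

/* Account for a packet of 'len' bytes entering the transmit queue. */
static void toy_charge(struct toy_sock *sk, int len)
{
    sk->wmem_alloc += len;
}

/* Value to show to user space (proc files, ioctls): strip the offset. */
static int toy_wmem_alloc_get(const struct toy_sock *sk)
{
    return sk->wmem_alloc - TOY_WMEM_OFFSET;
}
```

Routing every user-visible read through one helper keeps the offset from leaking into any single report path.
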
b96475805 pkt_sched: Update drops stats in act_police

Action police statistics could be misleading because drops are not
shown when expected.

With feedback from: Jamal Hadi Salim

Reported-by: Pawel Staszewski
Signed-off-by: Jarek Poplawski
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller

Jarek Poplawski
2009-06-18 09:56:45 +0800

15 Jun, 2009

2 commits

9cbc1cb8c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:
Documentation/feature-removal-schedule.txt
drivers/scsi/fcoe/fcoe.c
net/core/drop_monitor.c
net/core/net-traces.c

David S. Miller
2009-06-15 18:02:23 +0800
ca44d6e60 pkt_sched: Rename PSCHED_US2NS and PSCHED_NS2US

Let's use TICKS instead of US, so PSCHED_TICKS2NS and PSCHED_NS2TICKS
(like in PSCHED_TICKS_PER_SEC already) to avoid misleading.

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-06-15 17:31:47 +0800

13 Jun, 2009

1 commit

5b5481402 net: use symbolic values for ndo_start_xmit() return codes

Convert magic values 1 and -1 to NETDEV_TX_BUSY and NETDEV_TX_LOCKED respectively.

0 (NETDEV_TX_OK) is not changed to keep the noise down, except in very few cases
where it's in direct proximity to one of the other values.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2009-06-13 16:18:50 +0800
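
As a rough illustration of the conversion, the sketch below gives the three return codes the values stated in the commit message and uses them in a toy transmit function. `toy_netdev_tx`, `toy_dev` and `toy_start_xmit` are hypothetical names for this example only; a real driver's conditions for each code will differ.

```c
/* Return codes named in the commit message, with the values it gives. */
enum toy_netdev_tx {
    NETDEV_TX_OK     = 0,   /* packet accepted */
    NETDEV_TX_BUSY   = 1,   /* formerly the magic value 1 */
    NETDEV_TX_LOCKED = -1   /* formerly the magic value -1 */
};

/* Hypothetical device state, for illustration only. */
struct toy_dev {
    int ring_free;   /* free transmit descriptors */
    int locked;      /* non-zero if another context holds the tx lock */
};

static int toy_start_xmit(struct toy_dev *dev)
{
    if (dev->locked)
        return NETDEV_TX_LOCKED;   /* instead of "return -1;" */
    if (dev->ring_free == 0)
        return NETDEV_TX_BUSY;     /* instead of "return 1;" */
    dev->ring_free--;
    return NETDEV_TX_OK;
}
```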

09 Jun, 2009

2 commits

728bf0982 pkt_sched: Use PSCHED_SHIFT in PSCHED time conversion

Use PSCHED_SHIFT constant instead of '10' in PSCHED_US2NS() and
PSCHED_NS2US() macros to enable changing this value later.

Additionally use PSCHED_SHIFT in sch_hfsc SM_SHIFT and ISM_SHIFT
definitions. This part of the patch is based on feedback from
Patrick McHardy.

Reported-by: Antonio Almeida
Tested-by: Antonio Almeida
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-06-09 20:25:29 +0800
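
A minimal sketch of the idea, assuming the two macros are plain shifts by the constant (the commit message confirms the value 10 and the macro names, but the macro bodies here are an assumption):

```c
#include <stdint.h>

#define PSCHED_SHIFT 10   /* the value previously hard-coded as '10' */

/* Assumed form: one scheduler "microsecond" is 2^PSCHED_SHIFT nanoseconds. */
#define PSCHED_US2NS(x) ((int64_t)(x) << PSCHED_SHIFT)
#define PSCHED_NS2US(x) ((x) >> PSCHED_SHIFT)
```

With a shift of 10, one unit is 1024 ns rather than 1000 ns, so the "US" in these names is only approximate.
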
52ea3a56a cls_cgroup: Fix oops when user send improperly 'tc filter add' request

I found a bug in cls_cgroup_change() in cls_cgroup.c.
cls_cgroup_change() expected that tca[TCA_OPTIONS] was set properly from user space,
but tc in iproute2-2.6.29-1 (which I used) didn't set it.

In the current source code of tc in git, it sets tca[TCA_OPTIONS].

git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

If we always use the newest iproute2 from git when we use cls_cgroup,
we probably won't hit this oops.
But I think the kernel shouldn't panic regardless of the user program's behaviour.

Signed-off-by: Minoru Usui
Signed-off-by: David S. Miller

Minoru Usui
2009-06-09 19:03:09 +0800
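
The kind of defensive check this message argues for might look like the userspace sketch below. The attribute table, the index value and the function name are hypothetical, and returning `-EINVAL` is an assumption about how such a request would be rejected; the point is only that a missing options attribute is tested before it is dereferenced.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_TCA_OPTIONS 2   /* hypothetical attribute index */
#define TOY_TCA_MAX     4

/* Hypothetical stand-in for a parsed netlink attribute. */
struct toy_attr {
    int len;
};

/* Reject the request instead of dereferencing a missing attribute. */
static int toy_cls_change(struct toy_attr *tca[TOY_TCA_MAX + 1])
{
    if (tca[TOY_TCA_OPTIONS] == NULL)
        return -EINVAL;

    /* ... options would be parsed here ... */
    return 0;
}
```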

03 Jun, 2009

3 commits

adf30907d net: skb->dst accessors

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-06-03 17:51:04 +0800
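
The three accessors can be mimicked in userspace to show the pattern. The types below (`toy_dst`, `toy_skb`) and the plain integer reference count are simplifications invented for this sketch, not the kernel's structures; only the get/set/drop shape and the "release, then set to NULL" behaviour come from the commit message.

```c
#include <stddef.h>

/* Simplified stand-ins for struct dst_entry and struct sk_buff. */
struct toy_dst {
    int refcnt;
};

struct toy_skb {
    struct toy_dst *dst_private;   /* only touched through the accessors */
};

/* Simplified release: drop one reference if there is a dst at all. */
static void toy_dst_release(struct toy_dst *dst)
{
    if (dst)
        dst->refcnt--;
}

static struct toy_dst *toy_skb_dst(const struct toy_skb *skb)
{
    return skb->dst_private;
}

static void toy_skb_dst_set(struct toy_skb *skb, struct toy_dst *dst)
{
    skb->dst_private = dst;
}

/* Replaces the open-coded pair: release the dst, then clear the pointer. */
static void toy_skb_dst_drop(struct toy_skb *skb)
{
    toy_dst_release(skb->dst_private);
    skb->dst_private = NULL;
}
```
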
511c3f92a net: skb->rtable accessor

Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb

Delete skb->rtable field

Setting rtable is not allowed, just set dst instead as rtable is an alias.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-06-03 17:51:02 +0800
b2f8f7525 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/forcedeth.c

David S. Miller
2009-06-03 17:43:41 +0800

02 Jun, 2009

1 commit

12186be7d net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup

This patch fixes a bug in which an unconfigured struct tcf_proto stays
chained in tc_ctl_tfilter(), and avoids a kernel panic in
cls_cgroup_classify() when we use cls_cgroup.

When we execute 'tc filter add', tcf_proto is allocated, initialized
by the classifier's init(), and chained. After it's chained,
tc_ctl_tfilter() calls the classifier's change(). When the classifier's
change() fails, tc_ctl_tfilter() does not free tcf_proto and keeps it chained.

In addition, cls_cgroup is initialized in change(), not in init(). It
accesses the unconfigured struct tcf_proto, which is chained before
change(), and then hits an oops.

Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Tested-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

Minoru Usui
2009-06-02 17:17:34 +0800

27 May, 2009

1 commit

e65fcfd63 cls_cgroup: read classid atomically in classifier

Avoid reading the unsynchronized value cs->classid multiple times,
since it could change concurrently from non-zero to zero; this would
result in the classifier returning a positive result with a bogus
(zero) classid.

Signed-off-by: Paul Menage
Reviewed-by: Li Zefan
Signed-off-by: David S. Miller

Paul Menage
2009-05-27 11:47:02 +0800
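
The single-read pattern can be sketched in userspace with C11 atomics. The struct and function names are hypothetical, and the kernel code may use a different primitive for the read; what the sketch shows is that the value is loaded once into a local, and both the zero test and the result use that local.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the cgroup state and the classify result. */
struct toy_cgroup_state {
    _Atomic unsigned int classid;   /* may be changed concurrently */
};

struct toy_result {
    unsigned int classid;
};

/* Returns 0 on a match, -1 when no classid is configured. */
static int toy_classify(struct toy_cgroup_state *cs, struct toy_result *res)
{
    /* Read the shared value exactly once. */
    unsigned int classid = atomic_load(&cs->classid);

    if (classid == 0)
        return -1;

    /* Use the local copy, so a concurrent reset to zero cannot leak in. */
    res->classid = classid;
    return 0;
}
```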

26 May, 2009

1 commit

08baf5610 net: txq_trans_update() helper

We would like to get rid of netdev->trans_start = jiffies; that nearly all net
drivers have to use in their start_xmit() function, and use txq->trans_start
instead.

This can be done generically in core network, as suggested by David.

Some devices (particularly loopback) don't need a trans_start update, because
they don't have a transmit watchdog. We could add a new device flag, or rely
on the fact that txq->trans_start can be updated if txq->xmit_lock_owner is
different from -1. Use a helper function to hide our choice.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-05-26 13:58:01 +0800
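
A minimal userspace sketch of such a helper, following the rule stated in the message (update only when `xmit_lock_owner` is not -1). The struct is a hypothetical reduction of the queue to the two fields involved, and the current time is passed in as a parameter rather than read from `jiffies`.

```c
/* Hypothetical reduction of a transmit queue to the fields involved. */
struct toy_txq {
    int xmit_lock_owner;         /* -1 when no CPU holds the xmit lock */
    unsigned long trans_start;   /* time of the last transmit */
};

/* Update trans_start only for queues that are locked for transmit. */
static void toy_txq_trans_update(struct toy_txq *txq, unsigned long now)
{
    if (txq->xmit_lock_owner != -1)
        txq->trans_start = now;
}
```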

20 May, 2009

1 commit

ab35cd4b8 sch_teql: Use net_device internal stats

We can slightly reduce size of teqlN structure, not duplicating stats
structure in teql_master but using stats field from net_device.stats
for tx_errors and from netdev_queue for tx_bytes/tx_packets/tx_dropped
values.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-05-20 06:36:15 +0800

19 May, 2009

2 commits

bb803cfbe Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/scsi/fcoe/fcoe.c

David S. Miller
2009-05-19 12:08:20 +0800
c0f84d0d4 sch_teql: should not dereference skb after ndo_start_xmit()

It is illegal to dereference a skb after a successful ndo_start_xmit()
call. We must store skb length in a local variable instead.

Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb
(net_sched: Add accessor function for packet length for qdiscs)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-05-19 06:12:31 +0800
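
The rule can be shown with a userspace analogy in which the transmit function frees the buffer on success. All names here (`toy_pkt`, `toy_driver_xmit`, `toy_xmit_and_account`) are hypothetical; the sketch only demonstrates saving the length in a local variable before ownership of the buffer is handed over.

```c
#include <stdlib.h>

/* Hypothetical packet buffer and statistics. */
struct toy_pkt {
    unsigned int len;
};

struct toy_stats {
    unsigned long tx_bytes;
    unsigned long tx_packets;
};

/* Toy transmit: on success the buffer is consumed (freed here). */
static int toy_driver_xmit(struct toy_pkt *pkt)
{
    free(pkt);
    return 0;
}

static int toy_xmit_and_account(struct toy_pkt *pkt, struct toy_stats *stats)
{
    /* Save the length first: pkt must not be touched after a successful xmit. */
    unsigned int length = pkt->len;

    if (toy_driver_xmit(pkt) != 0)
        return -1;

    stats->tx_bytes += length;   /* uses the local copy, not pkt->len */
    stats->tx_packets++;
    return 0;
}
```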

18 May, 2009

2 commits

9d21493b4 net: tx scalability works : trans_start

struct net_device trans_start field is a hot spot on SMP and high performance
devices, particularly multi-queue ones, because every transmitter dirties
it. Its main use is tx watchdog and bonding alive checks.

But as most devices don't use NETIF_F_LLTX, we have to lock
a netdev_queue before calling their ndo_start_xmit(). So it makes
sense to move trans_start from net_device to netdev_queue. Its update
will occur on an already present (and in exclusive state) cache line, for
free.

We can do this transition smoothly. An old driver continues to
update dev->trans_start, while an updated one updates txq->trans_start.

Further patches could also put tx_bytes/tx_packets counters in
netdev_queue to avoid dirtying dev->stats (vlan device comes to mind)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-05-18 11:55:16 +0800
cb1c4b71f cls_cgroup: remove unneeded cgroup_lock

We can remove this lock here, since we are in cgroup write handler and
thus the cgrp is guaranteed to be valid, and no lock is needed when
writing a u32 variable.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Li Zefan
2009-05-18 02:59:48 +0800

07 May, 2009

1 commit

6473990c7 net-sched: fix bfifo default limit

When no limit is given, the bfifo uses a default of tx_queue_len * mtu.
Packets handled by qdiscs include the link layer header, so this should
be taken into account, similar to what other qdiscs do.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2009-05-07 07:45:07 +0800
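
As a worked illustration of the arithmetic, the sketch below computes a default byte limit that includes the link layer header. The struct is a hypothetical reduction of a network device to three fields, and adding `hard_header_len` to the MTU per packet is an assumption about how the header is accounted for, not a quote of the kernel code.

```c
/* Hypothetical reduction of a network device to the fields involved. */
struct toy_netdev {
    unsigned int tx_queue_len;      /* packets */
    unsigned int mtu;               /* bytes, without link layer header */
    unsigned int hard_header_len;   /* bytes of link layer header */
};

/* Assumed default: queue length times full frame size (MTU plus header). */
static unsigned int toy_bfifo_default_limit(const struct toy_netdev *dev)
{
    return dev->tx_queue_len * (dev->mtu + dev->hard_header_len);
}
```

For a queue length of 1000, an MTU of 1500 and a 14-byte header this gives 1,514,000 bytes, compared with 1,500,000 bytes when the header is ignored.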

03 May, 2009

1 commit

d0ab8ff81 net: Only store high 16 bits of kernel generated filter priorities

The kernel should only be using the high 16 bits of a kernel
generated priority. Filter priorities in all other cases only
use the upper 16 bits of the u32 'prio' field of 'struct tcf_proto',
but when the kernel generates the priority of a filter it saves all
32 bits, which can result in incorrect lookup failures when a filter
needs to be deleted or modified.

Signed-off-by: Robert Love
Signed-off-by: David S. Miller

Robert Love
2009-05-03 04:48:32 +0800
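
The lookup failure can be reproduced with plain integers. The mask name and both helper functions are hypothetical; the sketch assumes that a stored priority is compared for equality with a requested priority that only ever has its upper 16 bits set, as the message describes.

```c
#include <stdint.h>

/* Upper 16 bits of the 32-bit 'prio' field. */
#define TOY_PRIO_HIGH_MASK 0xFFFF0000u

/* Keep only the high 16 bits of a kernel-generated priority. */
static uint32_t toy_store_prio(uint32_t generated)
{
    return generated & TOY_PRIO_HIGH_MASK;
}

/* A later delete or modify request carries only the high 16 bits. */
static int toy_prio_matches(uint32_t stored, uint32_t requested)
{
    return stored == requested;
}
```

If all 32 bits of a generated value such as `0xC0001234` were stored, a request for `0xC0000000` would not match; masking before storing makes the two agree.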

20 Apr, 2009

1 commit

8caf15397 net: sch_netem: Fix an inconsistency in ingress netem timestamps.

Alex Sidorenko reported:

"while experimenting with 'netem' we have found some strange behaviour. It
seemed that ingress delay as measured by 'ping' command shows up on some
hosts but not on others.

After some investigation I have found that the problem is that skbuff->tstamp
field value depends on whether there are any packet sniffers enabled. That
is:

- if any ptype_all handler is registered, the tstamp field is as expected
- if there are no ptype_all handlers, the tstamp field does not show the delay"

This patch prevents unnecessary update of tstamp in dev_queue_xmit_nit()
on ingress path (with act_mirred) adding a check, so minimal overhead on
the fast path, but only when sniffers etc. are active.

Since netem at ingress seems to logically emulate a network before a host,
tstamp is zeroed to trigger the update and pretend delays are from the
outside.

Reported-by: Alex Sidorenko
Tested-by: Alex Sidorenko
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-04-20 17:14:59 +0800

14 Apr, 2009

1 commit

1a31f2042 netsched: Allow meta match on vlan tag on receive

When vlan acceleration is used on receive, the vlan tag is maintained
outside of the skb data. The existing vlan tag match only works on TX
path because it uses vlan_get_tag which tests for VLAN_HW_TX_ACCEL.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2009-04-14 09:12:57 +0800

22 Mar, 2009

1 commit

a0bffffc1 net/*: use linux/kernel.h swap()

tcp_sack_swap seems unnecessary so I pushed swap to the caller.
Also removed comment that seemed then pointless, and added include
when not already there. Compile tested.

Signed-off-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Ilpo Järvinen
2009-03-22 04:36:17 +0800
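
For readers unfamiliar with this kind of helper, a generic swap macro can be sketched as follows. This is an illustrative userspace version under the hypothetical name `toy_swap`, not a copy of the kernel header; it relies on the `__typeof__` extension available in GCC and Clang.

```c
/* Swap two lvalues of the same type (GCC/Clang __typeof__ extension). */
#define toy_swap(a, b)                \
    do {                              \
        __typeof__(a) tmp_ = (a);     \
        (a) = (b);                    \
        (b) = tmp_;                   \
    } while (0)
```

Pushing the swap to the caller through one generic macro removes the need for type-specific helpers such as the `tcp_sack_swap` mentioned above.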

16 Mar, 2009

1 commit

7cd0a6387 pkt_sched: Change misleading code in class delete.

While looking for a possible reason of bugzilla report on HTB oops:
http://bugzilla.kernel.org/show_bug.cgi?id=12858
I found the code in htb_delete calling htb_destroy_class on zero
refcount is very misleading: it can suggest this is a common path, and
destroy is called under sch_tree_lock. Actually, this can never happen
like this because before deletion cops->get() is done, and after
delete a class is still used by tclass_notify. The class destroy is
always called from cops->put(), so without sch_tree_lock.

This doesn't mean much now (since 2.6.27) because all vulnerable calls
were moved from htb_destroy_class to htb_delete, but there was a bug
in older kernels. The same change is done for other classful scheds,
which, it seems, didn't have similar locking problems here.

Reported-by: m0sia
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-03-16 11:00:19 +0800

05 Mar, 2009

2 commits

508827ff0 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/tokenring/tmspci.c
drivers/net/ucc_geth_mii.c

David S. Miller
2009-03-05 18:06:47 +0800
a883bf564 pkt_sched: act_police: Fix a rate estimator test.

A commit c1b56878fb68e9c14070939ea4537ad4db79ffae "tc: policing requires
a rate estimator" introduced a test which invalidates previously working
configs, based on examples from iproute2: doc/actions/actions-general.
This is too rigorous: a rate estimator is needed only when police's
"avrate" option is used.

Reported-by: Joao Correia
Diagnosed-by: John Dykstra
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-03-05 09:38:10 +0800

02 Mar, 2009

1 commit

aa4abc9bc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/wireless/iwlwifi/iwl-tx.c
net/8021q/vlan_core.c
net/core/dev.c

David S. Miller
2009-03-02 13:35:16 +0800

27 Feb, 2009

1 commit

1844f7479 pkt_sched: sch_drr: Fix oops in drr_change_class.

drr_change_class lacks a check for NULL of tca[TCA_OPTIONS], so oops
is possible.

Reported-by: Denys Fedoryschenko
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-02-27 18:42:38 +0800

10 Feb, 2009

1 commit

149490f13 pkt_sched: sch_multiq: Change errno on non-multiqueue devices use.

Current "RTNETLINK answers: Invalid argument" warning, while trying to
add multiq qdisc to non-multiqueue device, isn't very helpful and some
of these devs can be changed btw., so let's use a better errno.

With feedback from Stephen Hemminger

Reported-by: Badalian Vyacheslav
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-02-10 16:11:21 +0800

01 Feb, 2009

3 commits

1224736d9 pkt_sched: sch_htb: Use workqueue to schedule after too many events.

Patrick McHardy suggested using a workqueue instead
of hrtimers to trigger netif_schedule() when there is a problem with
setting exact time of this event: 'The difference - yeah, it shouldn't
make much, mainly wake up the qdisc earlier (but not too early) after
"too many events" occurred _and_ no further enqueue events wake up the
qdisc anyways.'

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-02-01 17:13:22 +0800
e82181de5 pkt_sched: sch_htb: Warn on too many events.

Let's get some info on possible config problems. This patch brings
back an old warning, but is printed only once now.

With feedback from Patrick McHardy

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-02-01 17:13:05 +0800
b00355db3 pkt_sched: sch_hfsc: sch_htb: Add non-work-conserving warning handler.

Patrick McHardy suggested:
> How about making this flag and the warning message (in a out-of-line
> function) globally available? Other qdiscs (f.i. HFSC) can't deal with
> inner non-work-conserving qdiscs as well.

This patch uses qdisc->flags field of "suspected" child qdisc.

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-02-01 17:12:42 +0800

13 Jan, 2009

2 commits

a73be0406 pkt_sched: sch_htb: Break all htb_do_events() after 2 jiffies

Currently htb_do_events() breaks events recounting for a level after 2
jiffies, but there is no reason to repeat this for next levels and
increase delays even more (with softirqs disabled). htb_dequeue_tree()
can add to this too, btw. In such a case q->now time is invalid anyway.

Thanks to Patrick McHardy for spotting an error around earlier version
of this patch.

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-01-13 13:54:40 +0800
c08513471 pkt_sched: sch_htb: Consider used jiffies in htb_do_events()

Next event time should consider jiffies used for recounting. Otherwise
qdisc_watchdog_schedule() triggers hrtimer immediately with the event
in the past, and may cause very high ksoftirqd cpu usage (if highres
is on).

The check of "event" for zero in htb_dequeue() is also removed: it's
always true at this point.

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-01-13 13:54:16 +0800

09 Jan, 2009

2 commits

5fbbf5f64 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (84 commits)
wimax: fix kernel-doc for debufs_dentry member of struct wimax_dev
net: convert pegasus driver to net_device_ops
bnx2x: Prevent eeprom set when driver is down
net: switch kaweth driver to netdevops
pcnet32: round off carrier watch timer
i2400m/usb: wrap USB power saving in #ifdef CONFIG_PM
wimax: testing for rfkill support should also test for CONFIG_RFKILL_MODULE
wimax: fix kconfig interactions with rfkill and input layers
wimax: fix '#ifndef CONFIG_BUG' layout to avoid warning
r6040: bump release number to 0.20
r6040: warn about MAC address being unset
r6040: check PHY status when bringing interface up
r6040: make printks consistent with DRV_NAME
gianfar: Fixup use of BUS_ID_SIZE
mlx4_en: Returning real Max in get_ringparam
mlx4_en: Consider inline packets on completion
netdev: bfin_mac: enable bfin_mac net dev driver for BF51x
qeth: convert to net_device_ops
vlan: add neigh_setup
dm9601: warn on invalid mac address
...

Linus Torvalds
2009-01-09 06:25:41 +0800
c19a28e11 remove lots of double-semicolons

Cc: Ingo Molnar
Cc: Thomas Gleixner
Acked-by: Theodore Ts'o
Acked-by: Mark Fasheh
Acked-by: David S. Miller
Cc: James Morris
Acked-by: Casey Schaufler
Acked-by: Takashi Iwai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fernando Carrijo
2009-01-09 00:31:14 +0800

07 Jan, 2009

1 commit

61294e2e2 sch_teql: convert to net_device_ops

Convert this driver to net_device_ops.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2009-01-07 02:45:57 +0800

06 Jan, 2009

2 commits

6f5732142 pkt_sched: cls_u32: Fix locking in u32_change()

New nodes are inserted in u32_change() under rtnl_lock() with wmb(),
so without tcf_tree_lock() like in other classifiers (e.g. cls_fw).
This isn't enough without rmb() on the read side, but on the other
hand adding such barriers doesn't give any savings, so the lock is
added instead.

Reported-by: m0sia
Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2009-01-06 10:14:19 +0800
c276e098d Revert "net: Fix for initial link state in 2.6.28"

This reverts commit 22604c866889c4b2e12b73cbf1683bda1b72a313.

We can't fix this issue in this way, because we now can try
to take the dev_base_lock rwlock as a writer in software interrupt
context and that is not allowed without major surgery elsewhere.

This initial link state problem needs to be solved in some other
way.

Signed-off-by: David S. Miller

David S. Miller
2009-01-06 08:01:51 +0800

05 Jan, 2009

1 commit

22604c866 net: Fix for initial link state in 2.6.28

From: Michael Marineau

Commit b47300168e770b60ab96c8924854c3b0eb4260eb "Do not fire linkwatch
events until the device is registered." was made as a workaround for
drivers that call netif_carrier_off before registering the device.
Unfortunately this causes these drivers to incorrectly report their
link status as IF_OPER_UNKNOWN which can falsely set the IFF_RUNNING
flag when the interface is first brought up. This issue was
previously pointed out[1] but was dismissed saying that IFF_RUNNING is
not related to the link status. From my digging IFF_RUNNING, as
reported to userspace, is based on the link state. It is set based on
__LINK_STATE_START and IF_OPER_UP or IF_OPER_UNKNOWN. See [2], [3],
and [4]. (Whether or not the kernel has IFF_RUNNING set in flags is
not reported to user space so it may well be independent of the link,
I don't know if and when it may get set.)

The end result depends slightly on the driver. The two I
tested were e1000e and b44. With e1000e if the system is booted
without a network cable attached the interface will falsely report
RUNNING when it is brought up causing NetworkManager to attempt to
start it and eventually time out. With b44 when the system is booted
with a network cable attached and brought up with dhcpcd it will time
out the first time.

The attached patch will still set the operstate variable
correctly to IF_OPER_UP/DOWN/etc when linkwatch_fire_event is called
but then return rather than skipping the linkwatch_fire_event call
entirely as the previous fix did. (sorry it isn't inline, I don't have
a patch friendly email client at the moment)

Signed-off-by: David S. Miller

Michael Marineau
2009-01-05 09:18:51 +0800