03 Oct, 2012

1 commit

  • Pull networking changes from David Miller:

    1) GRE now works over ipv6, from Dmitry Kozlov.

    2) Make SCTP more network namespace aware, from Eric Biederman.

    3) TEAM driver now works with non-ethernet devices, from Jiri Pirko.

    4) Make openvswitch network namespace aware, from Pravin B Shelar.

    5) IPV6 NAT implementation, from Patrick McHardy.

    6) Server side support for TCP Fast Open, from Jerry Chu and others
    (see the example sketch after this list).

    7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel
    Borkmann.

    8) Increase the loopback default MTU to 64K, from Eric Dumazet.

    9) Use a per-task rather than per-socket page fragment allocator for
    outgoing networking traffic. This benefits processes that have very
    many mostly idle sockets, which is quite common.

    From Eric Dumazet.

    10) Use up to 32K for page fragment allocations, with fallbacks to
    smaller sizes when higher order page allocations fail. Benefits are
    a) fewer segments for the driver to process, b) fewer calls to the
    page allocator, c) less waste of space.

    From Eric Dumazet.

    11) Allow GRO to be used on GRE tunnels, from Eric Dumazet.

    12) VXLAN device driver, one way to handle VLAN issues such as the
    limitation of 4096 VLAN IDs yet still have some level of isolation.
    From Stephen Hemminger.

    13) As usual there is a large boatload of driver changes, with the scale
    perhaps tilted towards the wireless side this time around.
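
    For item 6, server-side TCP Fast Open is enabled per listening
    socket. A minimal user-space sketch, assuming a toolchain that
    defines TCP_FASTOPEN; the queue length and address handling are
    illustrative only:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <string.h>
    #include <sys/socket.h>

    /* Create a TCP listener that accepts Fast Open requests.
     * Error handling trimmed for brevity; server-side TFO must also
     * be permitted by the net.ipv4.tcp_fastopen sysctl. */
    static int make_tfo_listener(unsigned short port)
    {
            struct sockaddr_in addr;
            int fd = socket(AF_INET, SOCK_STREAM, 0);
            int qlen = 16;  /* max queued TFO requests (example value) */

            memset(&addr, 0, sizeof(addr));
            addr.sin_family = AF_INET;
            addr.sin_addr.s_addr = htonl(INADDR_ANY);
            addr.sin_port = htons(port);

            bind(fd, (struct sockaddr *)&addr, sizeof(addr));
            setsockopt(fd, IPPROTO_TCP, TCP_FASTOPEN, &qlen, sizeof(qlen));
            listen(fd, 128);
            return fd;
    }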

    Fix up various fairly trivial conflicts, mostly caused by the user
    namespace changes.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits)
    hyperv: Add buffer for extended info after the RNDIS response message.
    hyperv: Report actual status in receive completion packet
    hyperv: Remove extra allocated space for recv_pkt_list elements
    hyperv: Fix page buffer handling in rndis_filter_send_request()
    hyperv: Fix the missing return value in rndis_filter_set_packet_filter()
    hyperv: Fix the max_xfer_size in RNDIS initialization
    vxlan: put UDP socket in correct namespace
    vxlan: Depend on CONFIG_INET
    sfc: Fix the reported priorities of different filter types
    sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP
    sfc: Fix loopback self-test with separate_tx_channels=1
    sfc: Fix MCDI structure field lookup
    sfc: Add parentheses around use of bitfield macro arguments
    sfc: Fix null function pointer in efx_sriov_channel_type
    vxlan: virtual extensible lan
    igmp: export symbol ip_mc_leave_group
    netlink: add attributes to fdb interface
    tg3: unconditionally select HWMON support when tg3 is enabled.
    Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT"
    gre: fix sparse warning
    ...

    Linus Torvalds
     

25 Aug, 2012

1 commit

  • The operstate of a device is initially IF_OPER_UNKNOWN and is updated
    asynchronously by linkwatch after each change of carrier state
    reported by the driver. The default carrier state of a net device is
    on, and this will never be changed on drivers that do not support
    carrier detection, thus the operstate remains IF_OPER_UNKNOWN.

    For devices that do support carrier detection, the driver must set the
    carrier state to off initially, then poll the hardware state when the
    device is opened. However, we must not activate linkwatch for an
    unregistered device, and commit b473001 ('net: Do not fire linkwatch
    events until the device is registered.') ensured that we don't. But
    this means that the operstate for many devices that support carrier
    detection remains IF_OPER_UNKNOWN when it should be IF_OPER_DOWN.

    The same issue exists with the dormant state.

    The proper initialisation sequence, avoiding a race with opening of
    the device, is:

    rtnl_lock();
    rc = register_netdevice(dev);
    if (rc)
        goto out_unlock;
    netif_carrier_off(dev); /* or netif_dormant_on(dev) */
    rtnl_unlock();

    but it seems silly that this should have to be repeated in so many
    drivers. Further, the operstate seen immediately after opening the
    device may still be IF_OPER_UNKNOWN due to the asynchronous nature of
    linkwatch.

    Commit 22604c8 ('net: Fix for initial link state in 2.6.28') attempted
    to fix this by setting the operstate synchronously, but it was
    reverted as it could lead to deadlock.

    This initialises the operstate synchronously at registration time
    only.
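
    For reference, the state discussed above is visible from user space
    via sysfs; a small illustrative C program (the device name is an
    example):

    #include <stdio.h>

    int main(void)
    {
            /* /sys/class/net/<dev>/operstate reports the RFC 2863
             * operstate: "unknown", "down", "dormant", "up", ... */
            FILE *f = fopen("/sys/class/net/eth0/operstate", "r");
            char buf[32];

            if (!f) {
                    perror("operstate");
                    return 1;
            }
            if (fgets(buf, sizeof(buf), f))
                    printf("operstate: %s", buf); /* e.g. "down" rather than "unknown" */
            fclose(f);
            return 0;
    }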

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     

22 Aug, 2012

1 commit

  • Now that mod_delayed_work() is safe to call from IRQ handlers,
    __cancel_delayed_work() followed by queue_delayed_work() can be
    replaced with mod_delayed_work().

    Most conversions are straightforward except for the following.

    * net/core/link_watch.c: linkwatch_schedule_work() was doing quite an
    elaborate dance around its delayed_work. Collapse it so that
    linkwatch_work is queued for immediate execution if LW_URGENT is set
    and the existing timer is kept otherwise.
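
    In outline, the conversion replaces the cancel-and-requeue pair with
    a single call. An illustrative fragment using the workqueue API
    names of that era, not the exact net/core/link_watch.c diff:

    /* before: cancel the pending timer (without waiting) and queue again */
    static void reschedule_old(struct delayed_work *dwork, unsigned long delay)
    {
            __cancel_delayed_work(dwork);
            queue_delayed_work(system_wq, dwork, delay);
    }

    /* after: one call that moves the timer whether or not the work is
     * already pending, and is safe to call from IRQ handlers */
    static void reschedule_new(struct delayed_work *dwork, unsigned long delay)
    {
            mod_delayed_work(system_wq, dwork, delay);
    }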

    Signed-off-by: Tejun Heo
    Cc: "David S. Miller"
    Cc: Tomi Valkeinen

    Tejun Heo
     

16 Sep, 2011

1 commit

  • There is a time lag in IFF_RUNNING flag consistency between vlan and
    real devices when the real device has a problem such as a broken link
    or cable.

    This degrades availability, for example by delaying failover in HA
    systems using vlan, since detection of the problem at the real device
    is delayed.

    We can avoid the linkwatch delay (~1 sec) for devices stacked on
    another one, since the delay has already been applied to the real
    device.
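
    A minimal sketch of the kind of check this implies, assuming the
    urgency test simply treats stacked devices (where dev->ifindex
    differs from dev->iflink) as urgent; this is not the exact patch:

    /* Sketch only: carrier events on stacked devices (e.g. vlans) are
     * treated as urgent, since the underlying real device has already
     * absorbed the linkwatch holdoff. */
    static bool stacked_dev_event_is_urgent(const struct net_device *dev)
    {
            return dev->ifindex != dev->iflink;
    }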

    Based on a previous patch from Mitsuo Hayasaka

    Reported-by: Mitsuo Hayasaka
    Signed-off-by: Eric Dumazet
    Cc: Herbert Xu
    Cc: Patrick McHardy
    Cc: "Michał Mirosław"
    Cc: Tom Herbert
    Cc: Stephen Hemminger
    Cc: Jesse Gross
    Tested-by: Mitsuo Hayasaka
    Signed-off-by: David S. Miller

    Eric Dumazet
     

23 Jul, 2011

1 commit

  • As reported by Ben Greer and Francois Romieu, the code path in
    the netif_carrier code leads it to try to cancel a delayed work
    item only to re-enable it immediately:
    netif_carrier_on
    -> linkwatch_fire_event
    -> linkwatch_schedule_work
    -> cancel_delayed_work
    -> del_timer_sync

    If __cancel_delayed_work is used instead, then there is no problem
    of waiting for a running linkwatch_event.

    There is a race between linkwatch_event running and re-scheduling,
    but it is harmless to schedule an extra scan of the linkwatch queue.
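
    A minimal sketch of the resulting scheduling step, assuming
    linkwatch_work is the file's existing delayed_work and using the
    pre-3.7 workqueue names; not the exact link_watch.c code:

    static void linkwatch_requeue_sketch(unsigned long delay)
    {
            /* non-waiting cancel: plain del_timer() underneath, so a
             * running linkwatch_event is never waited for; at worst a
             * concurrent run causes one extra, harmless queue scan */
            __cancel_delayed_work(&linkwatch_work);
            schedule_delayed_work(&linkwatch_work, delay);
    }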

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     

31 Mar, 2011

1 commit


13 Jul, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h. (A short example of the end state
    follows after this list.)

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It is put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, reverse Christmas tree - or at the end
    if there doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.
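
    A hypothetical example of the end state for a file that uses both
    the allocator and gfp flags:

    /* The file states its own dependencies instead of relying on
     * percpu.h pulling in slab.h. */
    #include <linux/gfp.h>
    #include <linux/slab.h>

    static void *example_alloc(size_t len)
    {
            return kmalloc(len, GFP_KERNEL); /* needs slab.h and gfp.h */
    }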

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, and for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as a bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

18 Nov, 2009

1 commit

  • Herbert Xu wrote:
    > On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote:
    >> Really, the link watch stuff is just due for a redesign. I don't
    >> think a simple hack is going to cut it this time, sorry Eric :-)
    >
    > I have no objections against any redesigns, but since the only
    > caller of linkwatch_forget_dev runs in process context with the
    > RTNL, it could also legally emit those events.

    Thanks guys, here is an updated version then, before linkwatch surgery?

    In this version, I force the event to be sent synchronously.

    [PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle

    time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105

    real 0m0.266s
    user 0m0.000s
    sys 0m0.001s

    real 0m0.770s
    user 0m0.000s
    sys 0m0.000s

    real 0m1.022s
    user 0m0.000s
    sys 0m0.000s

    One problem with the current scheme in the vlan dismantle phase is
    that a reference to the device is held by the following chain:

    vlan_dev_stop() ->
    netif_carrier_off(dev) ->
    linkwatch_fire_event(dev) ->
    dev_hold() ...

    And __linkwatch_run_queue() runs up to one second later...

    A generic fix to this problem is to add a linkwatch_forget_dev() method
    to unlink the device from the list of watched devices.

    dev->link_watch_next becomes dev->link_watch_list (and uses a bit more
    memory), so that the device can be unlinked in O(1).
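
    A minimal sketch of what the O(1) unlink looks like with a list_head;
    illustrative only, not the exact linkwatch_forget_dev() body (the
    real function also deals with the pending event):

    static void forget_dev_sketch(struct net_device *dev)
    {
            if (!list_empty(&dev->link_watch_list)) {
                    list_del_init(&dev->link_watch_list);
                    dev_put(dev); /* drop the reference taken when queued */
            }
    }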

    After the patch:
    time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105

    real 0m0.024s
    user 0m0.000s
    sys 0m0.000s

    real 0m0.032s
    user 0m0.000s
    sys 0m0.001s

    real 0m0.033s
    user 0m0.000s
    sys 0m0.000s

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 Jul, 2008

2 commits


11 May, 2007

4 commits

  • Urgent events may be delayed if we already have a non-urgent event
    queued for that device. This patch changes this by making sure that
    an urgent event is always looked at immediately.

    I've replaced the LW_RUNNING flag by LW_URGENT since whether work
    is scheduled is already tracked by the work queue system.

    The only complication is that we have to provide some exclusion for
    the setting of linkwatch_nextevent, which is also accessed in the
    actual work function.
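
    Roughly, the scheduling decision becomes the following (a sketch,
    assuming linkwatch_nextevent and linkwatch_work are the file's
    existing state; not the exact link_watch.c code):

    static void schedule_sketch(bool urgent)
    {
            unsigned long delay = linkwatch_nextevent - jiffies;

            if (urgent)
                    delay = 0; /* urgent events are looked at immediately */

            schedule_delayed_work(&linkwatch_work, delay);
    }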

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • When the jiffies wrap around or when the system boots up for the first
    time, down events can be delayed indefinitely since we no longer
    update linkwatch_nextevent when only urgent events are processed.

    This patch fixes this by setting linkwatch_nextevent when a
    wrap-around occurs.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Currently all link carrier events are delayed by up to a second
    before they're processed to prevent link storms. This causes
    unnecessary packet loss during that interval.

    In fact, we can achieve the same effect in preventing storms by
    only delaying down events and unnecessary up events. The latter
    is defined as up events when we're already up.
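
    A sketch of that policy; illustrative only, not the exact linkwatch
    predicate:

    /* Only a genuine transition to "up" is urgent; down events and
     * redundant up events keep the holdoff. */
    static bool urgent_sketch(const struct net_device *dev)
    {
            return netif_carrier_ok(dev) && dev->operstate != IF_OPER_UP;
    }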

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • These days the link watch mechanism is an integral part of the
    network subsystem as it manages the carrier status. So it now
    makes sense to allocate some memory for it in net_device rather
    than allocating it on demand.

    In fact, this is necessary because we can't tolerate a memory
    allocation failure since that means we'd have to potentially
    throw a link up event away.

    It also simplifies the code greatly.

    In doing so I discovered a subtle race condition in the use
    of singleevent. This race condition still exists (and is
    somewhat magnified) without singleevent but it's now plugged
    thanks to an smp_mb__before_clear_bit.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

26 Apr, 2007

1 commit

  • Spring cleaning time...

    There seems to be a lot of places in the network code that have
    extra bogus semicolons after conditionals. Most common is a bogus
    semicolon after: switch() { }
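
    The pattern being removed, in miniature (illustrative):

    static int handle(int err)
    {
            switch (err) {
            case 0:
                    break;
            default:
                    return err;
            };  /* <-- the bogus trailing semicolon */

            return 0;
    }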

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

11 Feb, 2007

1 commit


22 Nov, 2006

2 commits

  • Pass the work_struct pointer to the work function rather than context data.
    The work function can use container_of() to work out the data.

    For the cases where the container of the work_struct may go away the moment the
    pending bit is cleared, it is made possible to defer the release of the
    structure by deferring the clearing of the pending bit.

    To make this work, an extra flag is introduced into the management side of the
    work_struct. This governs auto-release of the structure upon execution.

    Ordinarily, the work queue executor would release the work_struct for further
    scheduling or deallocation by clearing the pending bit prior to jumping to the
    work function. This means that, unless the driver makes some guarantee itself
    that the work_struct won't go away, the work function may not access anything
    else in the work_struct or its container lest they be deallocated. This
    is a problem if the auxiliary data is taken away (as done by the last
    patch).

    However, if the pending bit is *not* cleared before jumping to the work
    function, then the work function *may* access the work_struct and its container
    with no problems. But then the work function must itself release the
    work_struct by calling work_release().

    In most cases, automatic release is fine, so this is the default. Special
    initiators exist for the non-auto-release case (ending in _NAR).
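
    An illustrative use of the new convention, with hypothetical names:

    struct my_adapter {
            struct work_struct reset_work;
            int resets;
    };

    /* The work function receives the work_struct pointer and recovers
     * its container with container_of(). */
    static void my_adapter_reset(struct work_struct *work)
    {
            struct my_adapter *ap =
                    container_of(work, struct my_adapter, reset_work);

            ap->resets++;
            /* ... perform the actual reset ... */
    }

    /* set up with: INIT_WORK(&ap->reset_work, my_adapter_reset); */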

    Signed-Off-By: David Howells

    David Howells
     
  • Separate delayable work items from non-delayable work items by splitting
    them into a separate structure (delayed_work), which incorporates a
    work_struct and the timer_list removed from work_struct.

    The work_struct struct is huge, and this limits its usefulness. On a 64-bit
    architecture it's nearly 100 bytes in size. This reduces that by half for
    the non-delayable type of event.
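
    The split in outline, with the field lists simplified:

    struct work_struct {
            atomic_long_t data;
            struct list_head entry;
            void (*func)(struct work_struct *work);
    };

    /* only delayed work carries the timer */
    struct delayed_work {
            struct work_struct work;
            struct timer_list timer;
    };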

    Signed-Off-By: David Howells

    David Howells
     

01 Jul, 2006

1 commit


23 Jun, 2006

1 commit


10 May, 2006

1 commit

  • The test used in the linkwatch does not handle wrap-arounds correctly.
    Since the intention of the code is to eliminate bursts of messages we
    can afford to delay things up to a second. Using that fact we can
    easily handle wrap-arounds by making sure that we don't delay things
    by more than one second.
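
    A sketch of the bound described above; illustrative, not the exact
    code:

    static unsigned long bounded_delay(unsigned long nextevent)
    {
            unsigned long delay = nextevent - jiffies;

            /* never defer by more than one second; a wrapped (huge,
             * unsigned) difference falls into the same branch */
            if (delay > HZ)
                    delay = 0;

            return delay;
    }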

    This is based on diagnosis and a patch by Stefan Rompf.

    Signed-off-by: Herbert Xu
    Acked-by: Stefan Rompf
    Signed-off-by: David S. Miller

    Herbert Xu
     

21 Mar, 2006

2 commits

  • This patch turns the RTNL from a semaphore to a new 2.6.16 mutex and
    gets rid of some of the leftover legacy.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This patch adds a dormant flag to network devices, an RFC 2863 operstate
    derived from these flags, and the possibility of userspace interaction. It
    allows drivers
    to signal that a device is unusable for user traffic without disabling
    queueing (and therefore the possibility for protocol establishment traffic to
    flow) and a userspace supplicant (WPA, 802.1X) to mark a device unusable
    without changes to the driver.
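
    Illustrative driver-side use, with hypothetical callback names:

    /* Keep the device dormant while the supplicant authenticates,
     * without stopping the queue, then clear the flag when the link
     * becomes usable for normal traffic. */
    static void my_drv_auth_start(struct net_device *dev)
    {
            netif_dormant_on(dev);  /* operstate reports IF_OPER_DORMANT */
    }

    static void my_drv_auth_done(struct net_device *dev)
    {
            netif_dormant_off(dev); /* operstate can now move to IF_OPER_UP */
    }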

    It is the result of our long discussion. However I must admit that it
    represents what Jamal and I agreed on with compromises towards Krzysztof, but
    Thomas and Krzysztof still disagree with some parts. Anyway I think it should
    be applied.

    Signed-off-by: Stefan Rompf
    Signed-off-by: David S. Miller

    Stefan Rompf
     

04 May, 2005

1 commit

  • Some network drivers call netif_stop_queue() when detecting loss of
    carrier. This leads to packets being queued up at the qdisc level for
    an unbound period of time. In order to prevent this effect, the core
    networking stack will now cease to queue packets for any device that
    is operationally down (i.e. the queue is flushed and disabled).

    Signed-off-by: Tommy S. Christensen
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Tommy S. Christensen
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds