Eric Lee / smarc-fsl-linux-kernel

04 Nov, 2016

1 commit

346da62cc dccp: do not send reset to already closed sockets ... Browse Code »

Andrey reported following warning while fuzzing with syzkaller

WARNING: CPU: 1 PID: 21072 at net/dccp/proto.c:83 dccp_set_state+0x229/0x290
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 21072 Comm: syz-executor Not tainted 4.9.0-rc1+ #293
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88003d4c7738 ffffffff81b474f4 0000000000000003 dffffc0000000000
ffffffff844f8b00 ffff88003d4c7804 ffff88003d4c7800 ffffffff8140c06a
0000000041b58ab3 ffffffff8479ab7d ffffffff8140beae ffffffff8140cd00
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[] dump_stack+0xb3/0x10f lib/dump_stack.c:51
[] panic+0x1bc/0x39d kernel/panic.c:179
[] __warn+0x1cc/0x1f0 kernel/panic.c:542
[] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
[] dccp_set_state+0x229/0x290 net/dccp/proto.c:83
[] dccp_close+0x612/0xc10 net/dccp/proto.c:1016
[] inet_release+0xef/0x1c0 net/ipv4/af_inet.c:415
[] sock_release+0x8e/0x1d0 net/socket.c:570
[] sock_close+0x16/0x20 net/socket.c:1017
[] __fput+0x29d/0x720 fs/file_table.c:208
[] ____fput+0x15/0x20 fs/file_table.c:244
[] task_work_run+0xf8/0x170 kernel/task_work.c:116
[< inline >] exit_task_work include/linux/task_work.h:21
[] do_exit+0x883/0x2ac0 kernel/exit.c:828
[] do_group_exit+0x10e/0x340 kernel/exit.c:931
[] get_signal+0x634/0x15a0 kernel/signal.c:2307
[] do_signal+0x8d/0x1a30 arch/x86/kernel/signal.c:807
[] exit_to_usermode_loop+0xe5/0x130
arch/x86/entry/common.c:156
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[] syscall_return_slowpath+0x1a8/0x1e0
arch/x86/entry/common.c:259
[] entry_SYSCALL_64_fastpath+0xc0/0xc2
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled

Fix this the same way we did for TCP in commit 565b7b2d2e63
("tcp: do not send reset to already closed sockets")

Signed-off-by: Eric Dumazet
Reported-by: Andrey Konovalov
Tested-by: Andrey Konovalov
Signed-off-by: David S. Miller

Eric Dumazet
2016-11-04 04:16:51 +0800

02 Dec, 2015

1 commit

9cd3e072b net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA ... Browse Code »

This patch is a cleanup to make following patch easier to
review.

Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
from (struct socket)->flags to a (struct socket_wq)->flags
to benefit from RCU protection in sock_wake_async()

To ease backports, we rename both constants.

Two new helpers, sk_set_bit(int nr, struct sock *sk)
and sk_clear_bit(int net, struct sock *sk) are added so that
following patch can change their implementation.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2015-12-02 04:45:05 +0800

27 Jul, 2015

1 commit

dfbafc995 tcp: fix recv with flags MSG_WAITALL | MSG_PEEK ... Browse Code »

Currently, tcp_recvmsg enters a busy loop in sk_wait_data if called
with flags = MSG_WAITALL | MSG_PEEK.

sk_wait_data waits for sk_receive_queue not empty, but in this case,
the receive queue is not empty, but does not contain any skb that we
can use.

Add a "last skb seen on receive queue" argument to sk_wait_data, so
that it sleeps until the receive queue has new skbs.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=99461
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=18493
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1205258
Reported-by: Enrico Scholz
Reported-by: Dan Searle
Signed-off-by: Sabrina Dubroca
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Sabrina Dubroca
2015-07-27 16:06:53 +0800

03 Mar, 2015

1 commit

1b7841404 net: Remove iocb argument from sendmsg and recvmsg ... Browse Code »

After TIPC doesn't depend on iocb argument in its internal
implementations of sendmsg() and recvmsg() hooks defined in proto
structure, no any user is using iocb argument in them at all now.
Then we can drop the redundant iocb argument completely from kinds of
implementations of both sendmsg() and recvmsg() in the entire
networking stack.

Cc: Christoph Hellwig
Suggested-by: Al Viro
Signed-off-by: Ying Xue
Signed-off-by: David S. Miller

Ying Xue
2015-03-03 02:06:31 +0800

11 Dec, 2014

1 commit

f95b414ed net: introduce helper macro for_each_cmsghdr ... Browse Code »

Introduce helper macro for_each_cmsghdr as a wrapper of the enumerating
cmsghdr from msghdr, just cleanup.

Signed-off-by: Gu Zheng
Signed-off-by: David S. Miller

Gu Zheng
2014-12-11 11:41:55 +0800

24 Nov, 2014

1 commit

6ce8e9ce5 new helper: memcpy_from_msg() ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2014-11-24 17:28:48 +0800

06 Nov, 2014

1 commit

51f3d02b9 net: Add and use skb_copy_datagram_msg() helper. ... Browse Code »

This encapsulates all of the skb_copy_datagram_iovec() callers
with call argument signature "skb, offset, msghdr->msg_iov, length".

When we move to iov_iters in the networking, the iov_iter object will
sit in the msghdr.

Having a helper like this means there will be less places to touch
during that transformation.

Based upon descriptions and patch from Al Viro.

Signed-off-by: David S. Miller

David S. Miller
2014-11-06 05:46:40 +0800

10 Oct, 2014

1 commit

c798360cd Merge branch 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu ... Browse Code »

Pull percpu updates from Tejun Heo:
"A lot of activities on percpu front. Notable changes are...

- percpu allocator now can take @gfp. If @gfp doesn't contain
GFP_KERNEL, it tries to allocate from what's already available to
the allocator and a work item tries to keep the reserve around
certain level so that these atomic allocations usually succeed.

This will replace the ad-hoc percpu memory pool used by
blk-throttle and also be used by the planned blkcg support for
writeback IOs.

Please note that I noticed a bug in how @gfp is interpreted while
preparing this pull request and applied the fix 6ae833c7fe0c
("percpu: fix how @gfp is interpreted by the percpu allocator")
just now.

- percpu_ref now uses longs for percpu and global counters instead of
ints. It leads to more sparse packing of the percpu counters on
64bit machines but the overhead should be negligible and this
allows using percpu_ref for refcnting pages and in-memory objects
directly.

- The switching between percpu and single counter modes of a
percpu_ref is made independent of putting the base ref and a
percpu_ref can now optionally be initialized in single or killed
mode. This allows avoiding percpu shutdown latency for cases where
the refcounted objects may be synchronously created and destroyed
in rapid succession with only a fraction of them reaching fully
operational status (SCSI probing does this when combined with
blk-mq support). It's also planned to be used to implement forced
single mode to detect underflow more timely for debugging.

There's a separate branch percpu/for-3.18-consistent-ops which cleans
up the duplicate percpu accessors. That branch causes a number of
conflicts with s390 and other trees. I'll send a separate pull
request w/ resolutions once other branches are merged"

* 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (33 commits)
percpu: fix how @gfp is interpreted by the percpu allocator
blk-mq, percpu_ref: start q->mq_usage_counter in atomic mode
percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky
percpu_ref: add PERCPU_REF_INIT_* flags
percpu_ref: decouple switching to percpu mode and reinit
percpu_ref: decouple switching to atomic mode and killing
percpu_ref: add PCPU_REF_DEAD
percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch
percpu_ref: replace pcpu_ prefix with percpu_
percpu_ref: minor code and comment updates
percpu_ref: relocate percpu_ref_reinit()
Revert "blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe"
Revert "percpu: free percpu allocation info for uniprocessor system"
percpu-refcount: make percpu_ref based on longs instead of ints
percpu-refcount: improve WARN messages
percpu: fix locking regression in the failure path of pcpu_alloc()
percpu-refcount: add @gfp to percpu_ref_init()
proportions: add @gfp to init functions
percpu_counter: add @gfp to percpu_counter_init()
percpu_counter: make percpu_counters_lock irq-safe
...

Linus Torvalds
2014-10-10 19:26:02 +0800

09 Oct, 2014

1 commit

35a9ad8af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:
"Most notable changes in here:

1) By far the biggest accomplishment, thanks to a large range of
contributors, is the addition of multi-send for transmit. This is
the result of discussions back in Chicago, and the hard work of
several individuals.

Now, when the ->ndo_start_xmit() method of a driver sees
skb->xmit_more as true, it can choose to defer the doorbell
telling the driver to start processing the new TX queue entires.

skb->xmit_more means that the generic networking is guaranteed to
call the driver immediately with another SKB to send.

There is logic added to the qdisc layer to dequeue multiple
packets at a time, and the handling mis-predicted offloads in
software is now done with no locks held.

Finally, pktgen is extended to have a "burst" parameter that can
be used to test a multi-send implementation.

Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4,
virtio_net

Adding support is almost trivial, so export more drivers to
support this optimization soon.

I want to thank, in no particular or implied order, Jesper
Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal
Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann,
David Tat, Hannes Frederic Sowa, and Rusty Russell.

2) PTP and timestamping support in bnx2x, from Michal Kalderon.

3) Allow adjusting the rx_copybreak threshold for a driver via
ethtool, and add rx_copybreak support to enic driver. From
Govindarajulu Varadarajan.

4) Significant enhancements to the generic PHY layer and the bcm7xxx
driver in particular (EEE support, auto power down, etc.) from
Florian Fainelli.

5) Allow raw buffers to be used for flow dissection, allowing drivers
to determine the optimal "linear pull" size for devices that DMA
into pools of pages. The objective is to get exactly the
necessary amount of headers into the linear SKB area pre-pulled,
but no more. The new interface drivers use is eth_get_headlen().
From WANG Cong, with driver conversions (several had their own
by-hand duplicated implementations) by Alexander Duyck and Eric
Dumazet.

6) Support checksumming more smoothly and efficiently for
encapsulations, and add "foo over UDP" facility. From Tom
Herbert.

7) Add Broadcom SF2 switch driver to DSA layer, from Florian
Fainelli.

8) eBPF now can load programs via a system call and has an extensive
testsuite. Alexei Starovoitov and Daniel Borkmann.

9) Major overhaul of the packet scheduler to use RCU in several major
areas such as the classifiers and rate estimators. From John
Fastabend.

10) Add driver for Intel FM10000 Ethernet Switch, from Alexander
Duyck.

11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric
Dumazet.

12) Add Datacenter TCP congestion control algorithm support, From
Florian Westphal.

13) Reorganize sk_buff so that __copy_skb_header() is significantly
faster. From Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits)
netlabel: directly return netlbl_unlabel_genl_init()
net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers
net: description of dma_cookie cause make xmldocs warning
cxgb4: clean up a type issue
cxgb4: potential shift wrapping bug
i40e: skb->xmit_more support
net: fs_enet: Add NAPI TX
net: fs_enet: Remove non NAPI RX
r8169:add support for RTL8168EP
net_sched: copy exts->type in tcf_exts_change()
wimax: convert printk to pr_foo()
af_unix: remove 0 assignment on static
ipv6: Do not warn for informational ICMP messages, regardless of type.
Update Intel Ethernet Driver maintainers list
bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING
tipc: fix bug in multicast congestion handling
net: better IFF_XMIT_DST_RELEASE support
net/mlx4_en: remove NETDEV_TX_BUSY
3c59x: fix bad split of cpu_to_le32(pci_map_single())
net: bcmgenet: fix Tx ring priority programming
...

Linus Torvalds
2014-10-09 09:40:54 +0800

08 Oct, 2014

1 commit

d0cd84817 Merge tag 'dmaengine-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine ... Browse Code »

Pull dmaengine updates from Dan Williams:
"Even though this has fixes marked for -stable, given the size and the
needed conflict resolutions this is 3.18-rc1/merge-window material.

These patches have been languishing in my tree for a long while. The
fact that I do not have the time to do proper/prompt maintenance of
this tree is a primary factor in the decision to step down as
dmaengine maintainer. That and the fact that the bulk of drivers/dma/
activity is going through Vinod these days.

The net_dma removal has not been in -next. It has developed simple
conflicts against mainline and net-next (for-3.18).

Continuing thanks to Vinod for staying on top of drivers/dma/.

Summary:

1/ Step down as dmaengine maintainer see commit 08223d80df38
"dmaengine maintainer update"

2/ Removal of net_dma, as it has been marked 'broken' since 3.13
(commit 77873803363c "net_dma: mark broken"), without reports of
performance regression.

3/ Miscellaneous fixes"

* tag 'dmaengine-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
net: make tcp_cleanup_rbuf private
net_dma: revert 'copied_early'
net_dma: simple removal
dmaengine maintainer update
dmatest: prevent memory leakage on error path in thread
ioat: Use time_before_jiffies()
dmaengine: fix xor sources continuation
dma: mv_xor: Rename __mv_xor_slot_cleanup() to mv_xor_slot_cleanup()
dma: mv_xor: Remove all callers of mv_xor_slot_cleanup()
dma: mv_xor: Remove unneeded mv_xor_clean_completed_slots() call
ioat: Use pci_enable_msix_exact() instead of pci_enable_msix()
drivers: dma: Include appropriate header file in dca.c
drivers: dma: Mark functions as static in dma_v3.c
dma: mv_xor: Add DMA API error checks
ioat/dca: Use dev_is_pci() to check whether it is pci device

Linus Torvalds
2014-10-08 08:39:25 +0800

02 Oct, 2014

1 commit

0c5b8a462 net/dccp/proto.c: add __init to dccp_mib_init ... Browse Code »

dccp_mib_init is only called by __init dccp_init in same module.

Signed-off-by: Fabian Frederick
Signed-off-by: David S. Miller

Fabian Frederick
2014-10-02 06:33:13 +0800

28 Sep, 2014

1 commit

7bced3975 net_dma: simple removal ... Browse Code »

Per commit "77873803363c net_dma: mark broken" net_dma is no longer used
and there is no plan to fix it.

This is the mechanical removal of bits in CONFIG_NET_DMA ifdef guards.
Reverting the remainder of the net_dma induced changes is deferred to
subsequent patches.

Marked for stable due to Roman's report of a memory leak in
dma_pin_iovec_pages():

https://lkml.org/lkml/2014/9/3/177

Cc: Dave Jiang
Cc: Vinod Koul
Cc: David Whipple
Cc: Alexander Duyck
Cc:
Reported-by: Roman Gushchin
Acked-by: David S. Miller
Signed-off-by: Dan Williams

Dan Williams
2014-09-28 22:05:16 +0800

08 Sep, 2014

1 commit

908c7f194 percpu_counter: add @gfp to percpu_counter_init() ... Browse Code »

Percpu allocator now supports allocation mask. Add @gfp to
percpu_counter_init() so that !GFP_KERNEL allocation masks can be used
with percpu_counters too.

We could have left percpu_counter_init() alone and added
percpu_counter_init_gfp(); however, the number of users isn't that
high and introducing _gfp variants to all percpu data structures would
be quite ugly, so let's just do the conversion. This is the one with
the most users. Other percpu data structures are a lot easier to
convert.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo
Acked-by: Jan Kara
Acked-by: "David S. Miller"
Cc: x86@kernel.org
Cc: Jens Axboe
Cc: "Theodore Ts'o"
Cc: Alexander Viro
Cc: Andrew Morton

Tejun Heo
2014-09-08 08:51:29 +0800

08 May, 2014

1 commit

698365fa1 net: clean up snmp stats code ... Browse Code »

commit 8f0ea0fe3a036a47767f9c80e (snmp: reduce percpu needs by 50%)
reduced snmp array size to 1, so technically it doesn't have to be
an array any more. What's more, after the following commit:

commit 933393f58fef9963eac61db8093689544e29a600
Date: Thu Dec 22 11:58:51 2011 -0600

percpu: Remove irqsafe_cpu_xxx variants

We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state. That has no material change for x86
and s390 implementations of this_cpu operations. However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.

probably no arch wants to have SNMP_ARRAY_SZ == 2. At least after
almost 3 years, no one complains.

So, just convert the array to a single pointer and remove snmp_mib_init()
and snmp_mib_free() as well.

Cc: Christoph Lameter
Cc: Eric Dumazet
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

WANG Cong
2014-05-08 04:06:05 +0800

09 Oct, 2013

1 commit

05dbc7b59 tcp/dccp: remove twchain ... Browse Code »

TCP listener refactoring, part 3 :

Our goal is to hash SYN_RECV sockets into main ehash for fast lookup,
and parallel SYN processing.

Current inet_ehash_bucket contains two chains, one for ESTABLISH (and
friend states) sockets, another for TIME_WAIT sockets only.

As the hash table is sized to get at most one socket per bucket, it
makes little sense to have separate twchain, as it makes the lookup
slightly more complicated, and doubles hash table memory usage.

If we make sure all socket types have the lookup keys at the same
offsets, we can use a generic and faster lookup. It turns out TIME_WAIT
and ESTABLISHED sockets already have common lookup fields for IPv4.

[ INET_TW_MATCH() is no longer needed ]

I'll provide a follow-up to factorize IPv6 lookup as well, to remove
INET6_TW_MATCH()

This way, SYN_RECV pseudo sockets will be supported the same.

A new sock_gen_put() helper is added, doing either a sock_put() or
inet_twsk_put() [ and will support SYN_RECV later ].

Note this helper should only be called in real slow path, when rcu
lookup found a socket that was moved to another identity (freed/reused
immediately), but could eventually be used in other contexts, like
sock_edemux()

Before patch :

dmesg | grep "TCP established"

TCP established hash table entries: 524288 (order: 11, 8388608 bytes)

After patch :

TCP established hash table entries: 524288 (order: 10, 4194304 bytes)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2013-10-09 11:19:24 +0800

25 Jul, 2013

1 commit

64dc61306 net: add sk_stream_is_writeable() helper ... Browse Code »

Several call sites use the hardcoded following condition :

sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)

Lets use a helper because TCP_NOTSENT_LOWAT support will change this
condition for TCP sockets.

Signed-off-by: Eric Dumazet
Cc: Neal Cardwell
Cc: Yuchung Cheng
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller

Eric Dumazet
2013-07-25 08:54:48 +0800

17 May, 2012

1 commit

dc6b9b782 net: include/net/sock.h cleanup ... Browse Code »

bool/const conversions where possible

__inline__ -> inline

space cleanups

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2012-05-17 16:50:21 +0800

20 Dec, 2011

1 commit

eb9399220 module_param: make bool parameters really bool (net & drivers/net) ... Browse Code »

module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.

(Thanks to Joe Perches for suggesting coccinelle for 0/1 -> true/false).

Cc: "David S. Miller"
Cc: netdev@vger.kernel.org
Signed-off-by: Rusty Russell
Signed-off-by: David S. Miller

Rusty Russell
2011-12-20 11:27:29 +0800

01 Aug, 2011

1 commit

31daf0393 dccp ccid-2: use feature-negotiation to report Ack Ratio changes ... Browse Code »

This uses the new feature-negotiation framework to signal Ack Ratio changes,
as required by RFC 4341, sec. 6.1.2.

That raises some problems with CCID-2, which at the moment can not cope
gracefully with Ack Ratios > 1. Since these issues are not directly related
to feature negotiation, they are marked by a FIXME.

Signed-off-by: Gerrit Renker
Signed-off-by: Samuel Jero
Acked-by: Ian McDonald

Gerrit Renker
2011-08-01 21:52:35 +0800

07 Dec, 2010

2 commits

049102650 dccp qpolicy: Parameter checking of cmsg qpolicy parameters ... Browse Code »

Ensure that cmsg->cmsg_type value is valid for qpolicy
that is currently in use.

Signed-off-by: Tomasz Grobelny
Signed-off-by: Gerrit Renker

Tomasz Grobelny
2010-12-07 20:47:12 +0800
871a2c16c dccp: Policy-based packet dequeueing infrastructure ... Browse Code »

This patch adds a generic infrastructure for policy-based dequeueing of
TX packets and provides two policies:
* a simple FIFO policy (which is the default) and
* a priority based policy (set via socket options).
Both policies honour the tx_qlen sysctl for the maximum size of the write
queue (can be overridden via socket options).

The priority policy uses skb->priority internally to assign an u32 priority
identifier, using the same ranking as SO_PRIORITY. The skb->priority field
is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
data using cmsg(3), the patch also provides the requisite parsing routines.

Signed-off-by: Tomasz Grobelny
Signed-off-by: Gerrit Renker

Tomasz Grobelny
2010-12-07 20:47:12 +0800

29 Oct, 2010

1 commit

b1fcf55ee dccp: Refine the wait-for-ccid mechanism ... Browse Code »

This extends the existing wait-for-ccid routine so that it may be used with
different types of CCID, addressing the following problems:

1) The queue-drain mechanism only works with rate-based CCIDs. If CCID-2 for
example has a full TX queue and becomes network-limited just as the
application wants to close, then waiting for CCID-2 to become unblocked
could lead to an indefinite delay (i.e., application "hangs").
2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
in its sending policy while the queue is being drained. This can lead to
further delays during which the application will not be able to terminate.
3) The minimum wait time for CCID-3/4 can be expected to be the queue length
times the current inter-packet delay. For example if tx_qlen=100 and a delay
of 15 ms is used for each packet, then the application would have to wait
for a minimum of 1.5 seconds before being allowed to exit.
4) There is no way for the user/application to control this behaviour. It would
be good to use the timeout argument of dccp_close() as an upper bound. Then
the maximum time that an application is willing to wait for its CCIDs to can
be set via the SO_LINGER option.

These problems are addressed by giving the CCID a grace period of up to the
`timeout' value.

The wait-for-ccid function is, as before, used when the application
(a) has read all the data in its receive buffer and
(b) if SO_LINGER was set with a non-zero linger time, or
(c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
state (client application closes after receiving CloseReq).

In addition, there is a catch-all case of __skb_queue_purge() after waiting for
the CCID. This is necessary since the write queue may still have data when
(a) the host has been passively-closed,
(b) abnormal termination (unread data, zero linger time),
(c) wait-for-ccid could not finish within the given time limit.

Signed-off-by: Gerrit Renker
Signed-off-by: David S. Miller

Gerrit Renker
2010-10-29 01:27:01 +0800

12 Oct, 2010

1 commit

2f34b3297 dccp: cosmetics - warning format ... Browse Code »

This omits the redundant "DCCP:" in warning messages, since DCCP_WARN() already
echoes the function name, avoiding messages like

kernel: [10988.766503] dccp_close: DCCP: ABORT -- 209 bytes unread

Signed-off-by: Gerrit Renker

Gerrit Renker
2010-10-12 12:57:43 +0800

07 Oct, 2010

1 commit

1f4f0f645 dccp: Kill dead code and add static markers. ... Browse Code »

Remove dead code and make some functions static.
Compile tested only.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

stephen hemminger
2010-10-07 14:12:07 +0800

26 Jun, 2010

1 commit

1823e4c80 snmp: add align parameter to snmp_mib_init() ... Browse Code »

In preparation for 64bit snmp counters for some mibs,
add an 'align' parameter to snmp_mib_init(), instead
of assuming mibs only contain 'unsigned long' fields.

Callers can use __alignof__(type) to provide correct
alignment.

Signed-off-by: Eric Dumazet
CC: Herbert Xu
CC: Arnaldo Carvalho de Melo
CC: Hideaki YOSHIFUJI
CC: Vlad Yasevich
Signed-off-by: David S. Miller

Eric Dumazet
2010-06-26 12:33:17 +0800

31 May, 2010

1 commit

042604d2a net/dccp: Use memdup_user ... Browse Code »

Use memdup_user when user data is immediately copied into the
allocated region.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

//
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

- to = $kmalloc@p\|kzalloc@p$(size,flag);
+ to = memdup_user(from,size);
if (
- to==NULL
+ IS_ERR(to)
|| ...) {

}
- if (copy_from_user(to, from, size) != 0) {
-
- }
//

Signed-off-by: Julia Lawall
Acked-by: Gerrit Renker
Signed-off-by: David S. Miller

Julia Lawall
2010-05-31 15:24:14 +0800

21 Apr, 2010

1 commit

aa3951451 net: sk_sleep() helper ... Browse Code »

Define a new function to return the waitqueue of a "struct sock".

static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
return sk->sk_sleep;
}

Change all read occurrences of sk_sleep by a call to this function.

Needed for a future RCU conversion. sk_sleep wont be a field directly
available.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-04-21 07:37:13 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implic… ... Browse Code »

…it slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

16 Mar, 2010

1 commit

d14a0ebda net-2.6 [Bug-Fix][dccp]: fix oops caused after failed initialisation ... Browse Code »

dccp: fix panic caused by failed initialisation

This fixes a kernel panic reported thanks to Andre Noll:

if DCCP is compiled into the kernel and any out of the initialisation
steps in net/dccp/proto.c:dccp_init() fail, a subsequent attempt to create
a SOCK_DCCP socket will panic, since inet{,6}_create() are not prevented
from creating DCCP sockets.

This patch fixes the problem by propagating a failure in dccp_init() to
dccp_v{4,6}_init_net(), and from there to dccp_v{4,6}_init(), so that the
DCCP protocol is not made available if its initialisation fails.

Signed-off-by: Gerrit Renker
Signed-off-by: David S. Miller

Gerrit Renker
2010-03-16 07:00:50 +0800

17 Feb, 2010

1 commit

7d720c3e4 percpu: add __percpu sparse annotations to net ... Browse Code »

Add __percpu sparse annotations to net.

These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.

The macro and type tricks around snmp stats make things a bit
interesting. DEFINE/DECLARE_SNMP_STAT() macros mark the target field
as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly. All
snmp_mib_*() users which used to cast the argument to (void **) are
updated to cast it to (void __percpu **).

Signed-off-by: Tejun Heo
Acked-by: David S. Miller
Cc: Patrick McHardy
Cc: Arnaldo Carvalho de Melo
Cc: Vlad Yasevich
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller

Tejun Heo
2010-02-17 15:05:38 +0800

13 Feb, 2010

1 commit

55d955902 dccp: support for passing MSG_TRUNC ... Browse Code »

DCCP is datagram-oriented but lacks UDP's support for MSG_TRUNC as defined in
recvmsg(2)/recv(2). Hence the following 'Hello world\0' receiver

len = recv(fd, buf, 10, MSG_PEEK | MSG_TRUNC);

wrongly (always) returns 10, while in UDP it returns 12 as expected.
This patch adds the missing MSG_TRUNC support to recvmsg().

Signed-off-by: Gerrit Renker
Signed-off-by: David S. Miller

Gerrit Renker
2010-02-13 08:51:10 +0800

19 Oct, 2009

1 commit

c720c7e83 inet: rename some inet_sock fields ... Browse Code »

In order to have better cache layouts of struct sock (separate zones
for rx/tx paths), we need this preliminary patch.

Goal is to transfert fields used at lookup time in the first
read-mostly cache line (inside struct sock_common) and move sk_refcnt
to a separate cache line (only written by rx path)

This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
sport and id fields. This allows a future patch to define these
fields as macros, like sk_refcnt, without name clashes.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-10-19 09:52:53 +0800

13 Oct, 2009

1 commit

f373b53b5 tcp: replace ehash_size by ehash_mask ... Browse Code »

Storing the mask (size - 1) instead of the size allows fast path to be
a bit faster.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-10-13 18:44:02 +0800

01 Oct, 2009

1 commit

b7058842c net: Make setsockopt() optlen be unsigned. ... Browse Code »

This provides safety against negative optlen at the type
level instead of depending upon (sometimes non-trivial)
checks against this sprinkled all over the the place, in
each and every implementation.

Based upon work done by Arjan van de Ven and feedback
from Linus Torvalds.

Signed-off-by: David S. Miller

David S. Miller
2009-10-01 07:12:20 +0800

22 Sep, 2009

1 commit

4481374ce mm: replace various uses of num_physpages by totalram_pages ... Browse Code »

Sizing of memory allocations shouldn't depend on the number of physical
pages found in a system, as that generally includes (perhaps a huge amount
of) non-RAM pages. The amount of what actually is usable as storage
should instead be used as a basis here.

Some of the calculations (i.e. those not intending to use high memory)
should likely even use (totalram_pages - totalhigh_pages).

Signed-off-by: Jan Beulich
Acked-by: Rusty Russell
Acked-by: Ingo Molnar
Cc: Dave Airlie
Cc: Kyle McMartin
Cc: Jeremy Fitzhardinge
Cc: Pekka Enberg
Cc: Hugh Dickins
Cc: "David S. Miller"
Cc: Patrick McHardy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2009-09-22 22:17:38 +0800

13 Aug, 2009

1 commit

aa11d958d Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 ... Browse Code »

Conflicts:
arch/microblaze/include/asm/socket.h

David S. Miller
2009-08-13 08:44:53 +0800

10 Aug, 2009

1 commit

f222e8b40 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ Browse Code »

David S. Miller
2009-08-10 12:29:47 +0800

06 Aug, 2009

2 commits

36cbd3dcc net: mark read-only arrays as const ... Browse Code »

String literals are constant, and usually, we can also tag the array
of pointers const too, moving it to the .rodata section.

Signed-off-by: Jan Engelhardt
Signed-off-by: David S. Miller

Jan Engelhardt
2009-08-06 01:42:58 +0800
476181cb0 dccp: missing destroy of percpu counter variable while unload module ... Browse Code »

percpu counter dccp_orphan_count is init in dccp_init() by
percpu_counter_init() while dccp module is loaded, but the
destroy of it is missing while dccp module is unloaded. We
can get the kernel WARNING about this. Reproduct by the
following commands:

$ modprobe dccp
$ rmmod dccp
$ modprobe dccp

WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
Hardware name: VMware Virtual Platform
list_add corruption. next->prev should be prev (c080c0c4), but was (null). (next
=ca7188cc).
Modules linked in: dccp(+) nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc
Pid: 1956, comm: modprobe Not tainted 2.6.31-rc5 #55
Call Trace:
[] warn_slowpath_common+0x6a/0x81
[] ? __list_add+0x27/0x5c
[] warn_slowpath_fmt+0x29/0x2c
[] __list_add+0x27/0x5c
[] __percpu_counter_init+0x4d/0x5d
[] dccp_init+0x19/0x2ed [dccp]
[] do_one_initcall+0x4f/0x111
[] ? dccp_init+0x0/0x2ed [dccp]
[] ? notifier_call_chain+0x26/0x48
[] ? __blocking_notifier_call_chain+0x45/0x51
[] sys_init_module+0xac/0x1bd
[] sysenter_do_call+0x12/0x22

Signed-off-by: Wei Yongjun
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Wei Yongjun
2009-08-06 01:22:03 +0800

30 Jul, 2009

1 commit

1c29b3ff4 net-dccp: suppress warning about large allocations from DCCP ... Browse Code »

The DCCP protocol tries to allocate some large hash tables during
initialisation using the largest size possible. This can be larger than
what the page allocator can provide so it prints a warning. However, the
caller is able to handle the situation so this patch suppresses the
warning.

Signed-off-by: Mel Gorman
Acked-by: Arnaldo Carvalho de Melo
Cc: "David S. Miller"
Cc: "Rafael J. Wysocki"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2009-07-30 10:10:36 +0800