Eric Lee / smarc-fsl-linux-kernel

01 Jul, 2017

3 commits

41c6d650f net: convert sock.sk_refcnt from atomic_t to refcount_t

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

This patch uses refcount_inc_not_zero() instead of
atomic_inc_not_zero_hint() due to the absence of a _hint()
version in the refcount API. If the _hint() version must
be used, we might need to revisit the API.
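The overflow protection mentioned above can be pictured with a small userspace sketch. It is only an illustration built on C11 atomics: the my_refcount_* names are hypothetical, and the code does not reproduce the kernel's refcount_t implementation. It shows an "increment unless zero" operation that saturates instead of wrapping around to zero.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a reference counter type. */
typedef struct {
    atomic_uint refs;
} my_refcount_t;

/* Assumption for this sketch: the top value means "saturated". */
#define MY_REFCOUNT_SATURATED UINT_MAX

void my_refcount_set(my_refcount_t *r, unsigned int n)
{
    atomic_store(&r->refs, n);
}

unsigned int my_refcount_read(my_refcount_t *r)
{
    return atomic_load(&r->refs);
}

/*
 * Take a reference only if the object is still live (count != 0).
 * A saturated counter stays saturated instead of wrapping to 0,
 * so the object is leaked rather than freed while still in use.
 */
bool my_refcount_inc_not_zero(my_refcount_t *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        if (old == 0)
            return false;
        if (old == MY_REFCOUNT_SATURATED)
            return true;
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

    return true;
}

/* Drop a reference; returns true when the last reference was released. */
bool my_refcount_dec_and_test(my_refcount_t *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        /* Never go below zero and never leave the saturated state. */
        if (old == 0 || old == MY_REFCOUNT_SATURATED)
            return false;
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));

    return old == 1;
}
```

With a plain atomic counter, enough increments would wrap the value to zero and a later decrement could free an object that is still referenced; in this sketch the counter simply stops moving once it reaches the saturation value.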

Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller

Reshetova, Elena
2017-07-01 22:39:08 +0800
14afee4b6 net: convert sock.sk_wmem_alloc from atomic_t to refcount_t

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller

Reshetova, Elena
2017-07-01 22:39:08 +0800
633547973 net: convert sk_buff.users from atomic_t to refcount_t

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller

Reshetova, Elena
2017-07-01 22:39:07 +0800

16 Jun, 2017

2 commits

4df864c1d networking: make skb_put & friends return void pointers

It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.

Make these functions (skb_put, __skb_put and pskb_put) return void *
and remove all the casts across the tree, adding a (u8 *) cast only
where the unsigned char pointer was used directly, all done with the
following spatch:

@@
expression SKB, LEN;
typedef u8;
identifier fn = { skb_put, __skb_put };
@@
- *(fn(SKB, LEN))
+ *(u8 *)fn(SKB, LEN)

@@
expression E, SKB, LEN;
identifier fn = { skb_put, __skb_put };
type T;
@@
- E = ((T *)(fn(SKB, LEN)))
+ E = fn(SKB, LEN)

which actually doesn't cover pskb_put since there are only three
users overall.

A handful of stragglers were converted manually, notably a macro in
drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
instances in net/bluetooth/hci_sock.c. In the former file, I also
had to fix one whitespace problem spatch introduced.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2017-06-16 23:48:39 +0800
59ae1d127 networking: introduce and use skb_put_data()

A common pattern with skb_put() is to just want to memcpy()
some data into the new space; introduce skb_put_data() for
this.
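As a rough illustration of the pattern, here is a userspace sketch with a toy buffer. The struct toy_buf type and the toy_put()/toy_put_data() helpers are hypothetical stand-ins, not the kernel's sk_buff API; the assert() merely marks where a real implementation would have to handle running out of tailroom.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much simplified stand-in for a packet buffer. */
struct toy_buf {
    unsigned char data[64];
    size_t len;             /* bytes currently in use */
};

/*
 * Reserve len bytes at the tail and return a pointer to them.
 * Returning void * lets callers assign the result to any
 * pointer type without a cast.
 */
void *toy_put(struct toy_buf *b, size_t len)
{
    void *tail = b->data + b->len;

    assert(b->len + len <= sizeof(b->data));
    b->len += len;
    return tail;
}

/* Reserve len bytes at the tail and copy data into them. */
void *toy_put_data(struct toy_buf *b, const void *data, size_t len)
{
    void *tail = toy_put(b, len);

    memcpy(tail, data, len);
    return tail;
}
```

A caller that previously wrote memcpy(toy_put(b, len), data, len) can then write toy_put_data(b, data, len), which is the same shape of rewrite the spatch in this commit performs on skb_put() users.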

An spatch similar to the one for skb_put_zero() converts many
of the places using it:

@@
identifier p, p2;
expression len, skb, data;
type t, t2;
@@
(
-p = skb_put(skb, len);
+p = skb_put_data(skb, data, len);
|
-p = (t)skb_put(skb, len);
+p = skb_put_data(skb, data, len);
)
(
p2 = (t2)p;
-memcpy(p2, data, len);
|
-memcpy(p, data, len);
)

@@
type t, t2;
identifier p, p2;
expression skb, data;
@@
t *p;
...
(
-p = skb_put(skb, sizeof(t));
+p = skb_put_data(skb, data, sizeof(t));
|
-p = (t *)skb_put(skb, sizeof(t));
+p = skb_put_data(skb, data, sizeof(t));
)
(
p2 = (t2)p;
-memcpy(p2, data, sizeof(*p));
|
-memcpy(p, data, sizeof(*p));
)

@@
expression skb, len, data;
@@
-memcpy(skb_put(skb, len), data, len);
+skb_put_data(skb, data, len);

(again, manually post-processed to retain some comments)

Reviewed-by: Stephen Hemminger
Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2017-06-16 23:48:37 +0800

01 Jun, 2017

1 commit

7212462fa netlink: don't send unknown nsid

The NETLINK_F_LISTEN_ALL_NSID option makes it possible to listen to all netns
that have a nsid assigned in the netns where the netlink socket is opened.
The nsid is sent as metadata to userland, but the existence of this nsid is
checked only for netns that are different from the socket netns. Thus, if
no nsid is assigned to the socket netns, NETNSA_NSID_NOT_ASSIGNED is
reported to userland. This value is confusing and useless.
After this patch, only valid nsids are sent to userland.

Reported-by: Flavio Leitner
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Nicolas Dichtel
2017-06-01 23:49:39 +0800

14 Apr, 2017

5 commits

fe52145f9 netlink: pass extended ACK struct where available

This is an add-on to the previous patch that passes the extended ACK
structure where it's already available by existing genl_info or extack
function arguments.

This was done with this spatch (with some manual adjustment of
indentation):

@@
expression A, B, C, D, E;
identifier fn, info;
@@
fn(..., struct genl_info *info, ...) {
...
-nlmsg_parse(A, B, C, D, E, NULL)
+nlmsg_parse(A, B, C, D, E, info->extack)
...
}

@@
expression A, B, C, D, E;
identifier fn, info;
@@
fn(..., struct genl_info *info, ...) {
extack)
...>
}

@@
expression A, B, C, D, E;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {

}

@@
expression A, B, C, D, E;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {

}

@@
expression A, B, C, D, E;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {
...
-nlmsg_parse(A, B, C, D, E, NULL)
+nlmsg_parse(A, B, C, D, E, extack)
...
}

@@
expression A, B, C, D;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {

}

@@
expression A, B, C, D;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {

}

@@
expression A, B, C, D;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {

}

@@
expression A, B, C;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {

}

Signed-off-by: Johannes Berg
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller

Johannes Berg
2017-04-14 01:58:22 +0800
fceb6435e netlink: pass extended ACK struct to parsing functions

Pass the new extended ACK reporting struct to all of the generic
netlink parsing functions. For now, pass NULL in almost all callers
(except for some in the core.)

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2017-04-14 01:58:22 +0800
ba0dc5f6e netlink: allow sending extended ACK with cookie on success

Now that we have extended error reporting and a new message format for
netlink ACK messages, also extend this to be able to return arbitrary
cookie data on success.

This will allow, for example, nl80211 to not send an extra message for
cookies identifying newly created objects, but return those directly
in the ACK message.

The cookie data size is currently limited to 20 bytes (since Jamal
talked about using SHA1 for identifiers.)

Thanks to Jamal Hadi Salim for bringing up this idea during the
discussions.

Signed-off-by: Johannes Berg
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller

Johannes Berg
2017-04-14 01:58:21 +0800
7ab606d16 genetlink: pass extended ACK report down

Pass the extended ACK reporting struct down from generic netlink to
the families, using the existing struct genl_info for simplicity.

Also add support to set the extended ACK information from generic
netlink users.

Signed-off-by: Johannes Berg
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller

Johannes Berg
2017-04-14 01:58:21 +0800
2d4bc9336 netlink: extended ACK reporting

Add the base infrastructure and UAPI for netlink extended ACK
reporting. All "manual" calls to netlink_ack() pass NULL for now and
thus don't get extended ACK reporting.

Big thanks goes to Pablo Neira Ayuso for not only bringing up the
whole topic at netconf (again) but also coming up with the nlattr
passing trick and various other ideas.

Signed-off-by: Johannes Berg
Reviewed-by: David Ahern
Signed-off-by: David S. Miller

Johannes Berg
2017-04-14 01:58:20 +0800

05 Apr, 2017

1 commit

457c79e54 netlink/diag: report flags for netlink sockets

cb_running is reported in /proc/self/net/netlink and it is reported by
the ss tool, when it gets information from the proc files.

sock_diag is a new interface which is used instead of proc files, so it
looks reasonable that this interface has to report no less information
about sockets than proc files.

We use these flags to dump and restore netlink sockets.

Signed-off-by: Andrei Vagin
Signed-off-by: David S. Miller

Andrey Vagin
2017-04-05 22:13:56 +0800

23 Mar, 2017

1 commit

1d2a6a5e4 genetlink: fix counting regression on ctrl_dumpfamily()

Commit 2ae0f17df1cd ("genetlink: use idr to track families") changed

	if (++n < fams_to_skip)
		continue;

into:

	if (n++ < fams_to_skip)
		continue;

Because of this subtle change, a retried ctrl_dumpfamily() call omits
the one family for which ctrl_fill_info() failed on the previous call,
since cb->args[0] = n also counts the family that failed
ctrl_fill_info().

This patch fixes the problem and, to avoid confusion in the future,
simply decrements the n counter when ctrl_fill_info() fails.

The user-visible problem caused by this bug is a failure to access
some genetlink families, e.g. nl80211. However, the problem is
reproducible only if the number of registered genetlink families is
big enough to cause a second call of ctrl_dumpfamily().
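The off-by-one on resume can be reproduced with a small userspace model. Everything here is hypothetical (toy_cb, toy_dump(), a fixed list of five families, and a "room" limit standing in for ctrl_fill_info() failing when the message is full); it is a sketch of the counting logic only, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

#define NFAMS 5

/* Hypothetical dump state: args0 plays the role of cb->args[0]. */
struct toy_cb {
    int args0;
};

/*
 * One dump pass. Emits at most room family ids into out[] and returns
 * how many were emitted. When fix is true, the counter is decremented
 * if the "fill" step fails, so the failed family is retried next time.
 */
int toy_dump(struct toy_cb *cb, int room, int *out, bool fix)
{
    int fams_to_skip = cb->args0;
    int n = 0, emitted = 0;

    for (int id = 0; id < NFAMS; id++) {
        if (n++ < fams_to_skip)
            continue;
        if (emitted == room) {  /* stand-in for a failed fill */
            if (fix)
                n--;            /* this family was not dumped */
            break;
        }
        out[emitted++] = id;
    }
    cb->args0 = n;              /* where the next pass resumes */
    return emitted;
}

/* Run passes until one emits nothing; returns the total emitted. */
int toy_dump_all(int room, bool fix)
{
    struct toy_cb cb = { 0 };
    int out[NFAMS], total = 0, got;

    while ((got = toy_dump(&cb, room, out, fix)) > 0)
        total += got;
    return total;
}
```

With room for two entries per pass, the unfixed variant reports only four of the five families, because the family that did not fit is counted as already dumped; with the decrement in place all five are reported.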

Cc: Xose Vazquez Perez
Cc: Larry Finger
Cc: Johannes Berg
Fixes: 2ae0f17df1cd ("genetlink: use idr to track families")
Signed-off-by: Stanislaw Gruszka
Acked-by: Johannes Berg
Signed-off-by: David S. Miller

Stanislaw Gruszka
2017-03-23 06:38:43 +0800

22 Mar, 2017

1 commit

8a0f5ccfb crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

On Tue, Mar 14, 2017 at 10:44:10AM +0100, Dmitry Vyukov wrote:
>
> Yes, please.
> Disregarding some reports is not a good way long term.

Please try this patch.

---8cb_mutex are annotated by lockdep
as a single class. This causes a false lockdep cycle involving
genl and crypto_user.

This patch fixes it by dividing cb_mutex into individual classes
based on the netlink protocol. As genl and crypto_user do not
use the same netlink protocol this breaks the false dependency
loop.

Reported-by: Dmitry Vyukov
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2017-03-22 05:38:15 +0800

28 Jan, 2017

1 commit

158f323b9 net: adjust skb->truesize in pskb_expand_head()

Slava Shwartsman reported a warning in skb_try_coalesce(), when we
detect skb->truesize is completely wrong.

In his case, issue came from IPv6 reassembly coping with malicious
datagrams, that forced various pskb_may_pull() to reallocate a bigger
skb->head than the one allocated by NIC driver before entering GRO
layer.

Current code does not change skb->truesize, leaving this burden to
callers if they care enough.

Blindly changing skb->truesize in pskb_expand_head() is not
easy, as some producers might track skb->truesize, for example
in xmit path for back pressure feedback (sk->sk_wmem_alloc)

We can detect the cases where it should be safe to change
skb->truesize :

1) skb is not attached to a socket.
2) If it is attached to a socket, destructor is sock_edemux()

My audit gave only two callers doing their own skb->truesize
manipulation.

I had to remove skb parameter in sock_edemux macro when
CONFIG_INET is not set to avoid a compile error.

Signed-off-by: Eric Dumazet
Reported-by: Slava Shwartsman
Signed-off-by: David S. Miller

Eric Dumazet
2017-01-28 01:03:29 +0800

17 Jan, 2017

1 commit

e89df8131 netlink: do not enter direct reclaim from netlink_trim()

In commit d35c99ff77ecb ("netlink: do not enter direct reclaim from
netlink_dump()") we made sure to not trigger expensive memory reclaim.

Problem is that a bit later, netlink_trim() might be called and
trigger memory reclaim.

netlink_trim() should be best effort, and really as fast as possible.
Under memory pressure, it is fine to not trim this skb.

Signed-off-by: Eric Dumazet
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Eric Dumazet
2017-01-17 02:39:35 +0800

25 Dec, 2016

1 commit

7c0f6ba68 Replace <asm/uaccess.h> with <linux/uaccess.h> globally

This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
	$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-12-25 03:46:01 +0800

11 Dec, 2016

1 commit

efa172f42 netlink: use blocking notifier

netlink_chain is called in ->release(), which is apparently
a process context, so we don't have to use an atomic notifier
here.

Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

WANG Cong
2016-12-11 06:25:58 +0800

07 Dec, 2016

1 commit

c63d352f0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

David S. Miller
2016-12-07 10:33:19 +0800

06 Dec, 2016

1 commit

ed5d7788a netlink: Do not schedule work from sk_destruct

It is wrong to schedule a work from sk_destruct using the socket
as the memory reserve because the socket will be freed immediately
after the return from sk_destruct.

Instead we should do the deferral prior to sk_free.

This patch does just that.

Fixes: 707693c8a498 ("netlink: Call cb->done from a worker thread")
Signed-off-by: Herbert Xu
Tested-by: Andrey Konovalov
Signed-off-by: David S. Miller

Herbert Xu
2016-12-06 08:43:42 +0800

04 Dec, 2016

1 commit

2745529ac Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Couple conflicts resolved here:

1) In the MACB driver, a bug fix to properly initialize the
RX tail pointer properly overlapped with some changes
to support variable sized rings.

2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
overlapping with a reorganization of the driver to support
ACPI, OF, as well as PCI variants of the chip.

3) In 'net' we had several probe error path bug fixes to the
stmmac driver, meanwhile a lot of this code was cleaned up
and reorganized in 'net-next'.

4) The cls_flower classifier obtained a helper function in
'net-next' called __fl_delete() and this overlapped with
Daniel Borkmann's bug fix to use RCU for object destruction
in 'net'. It also overlapped with Jiri's change to guard
the rhashtable_remove_fast() call with a check against
tc_skip_sw().

5) In mlx4, a revert bug fix in 'net' overlapped with some
unrelated changes in 'net-next'.

6) In geneve, a stale header pointer after pskb_expand_head()
bug fix in 'net' overlapped with a large reorganization of
the same code in 'net-next'. Since the 'net-next' code no
longer had the bug in question, there was nothing to do
other than to simply take the 'net-next' hunks.

Signed-off-by: David S. Miller

David S. Miller
2016-12-04 01:29:53 +0800

30 Nov, 2016

1 commit

707693c8a netlink: Call cb->done from a worker thread

The cb->done interface expects to be called in process context.
This was broken by the netlink RCU conversion. This patch fixes
it by adding a worker struct to make the cb->done call where
necessary.

Fixes: 21e4902aea80 ("netlink: Lockless lookup with RCU grace...")
Reported-by: Subash Abhinov Kasiviswanathan
Signed-off-by: Herbert Xu
Acked-by: Cong Wang
Signed-off-by: David S. Miller

Herbert Xu
2016-11-30 08:48:38 +0800

15 Nov, 2016

1 commit

bb598c1b8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Several cases of bug fixes in 'net' overlapping other changes in
'net-next'.

Signed-off-by: David S. Miller

David S. Miller
2016-11-15 23:54:36 +0800

04 Nov, 2016

2 commits

00ffc1ba0 genetlink: fix a memory leak on error path

In __genl_register_family(), when genl_validate_assign_mc_groups()
fails, we forget to free the memory we possibly allocate for
family->attrbuf.

Note that some callers call genl_unregister_family() to clean up
on the error path; that doesn't work because the family is only
inserted into the global list in nearly the last step.

Cc: Jakub Kicinski
Cc: Johannes Berg
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

WANG Cong
2016-11-04 04:52:29 +0800
93636d1f1 netlink: netlink_diag_dump() runs without locks

A recent commit removed locking from netlink_diag_dump() but forgot
one error case.

=====================================
[ BUG: bad unlock balance detected! ]
4.9.0-rc3+ #336 Not tainted
-------------------------------------
syz-executor/4018 is trying to release lock ([ 36.220068] nl_table_lock
) at:
[] netlink_diag_dump+0x1a3/0x250 net/netlink/diag.c:182
but there are no more locks to release!

other info that might help us debug this:
3 locks held by syz-executor/4018:
#0: [ 36.220068] (
sock_diag_mutex[ 36.220068] ){+.+.+.}
, at: [ 36.220068] [] sock_diag_rcv+0x1b/0x40
#1: [ 36.220068] (
sock_diag_table_mutex[ 36.220068] ){+.+.+.}
, at: [ 36.220068] [] sock_diag_rcv_msg+0x140/0x3a0
#2: [ 36.220068] (
nlk->cb_mutex[ 36.220068] ){+.+.+.}
, at: [ 36.220068] [] netlink_dump+0x50/0xac0

stack backtrace:
CPU: 1 PID: 4018 Comm: syz-executor Not tainted 4.9.0-rc3+ #336
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff8800645df688 ffffffff81b46934 ffffffff84eb3e78 ffff88006ad85800
ffffffff82dc8683 ffffffff84eb3e78 ffff8800645df6b8 ffffffff812043ca
dffffc0000000000 ffff88006ad85ff8 ffff88006ad85fd0 00000000ffffffff
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[] dump_stack+0xb3/0x10f lib/dump_stack.c:51
[] print_unlock_imbalance_bug+0x17a/0x1a0
kernel/locking/lockdep.c:3388
[< inline >] __lock_release kernel/locking/lockdep.c:3512
[] lock_release+0x8e8/0xc60 kernel/locking/lockdep.c:3765
[< inline >] __raw_read_unlock ./include/linux/rwlock_api_smp.h:225
[] _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
[] netlink_diag_dump+0x1a3/0x250 net/netlink/diag.c:182
[] netlink_dump+0x397/0xac0 net/netlink/af_netlink.c:2110

Fixes: ad202074320c ("netlink: Use rhashtable walk interface in diag dump")
Signed-off-by: Eric Dumazet
Reported-by: Andrey Konovalov
Tested-by: Andrey Konovalov
Signed-off-by: David S. Miller

Eric Dumazet
2016-11-04 04:16:51 +0800

02 Nov, 2016

1 commit

22ca904ad genetlink: fix error return code in genl_register_family()

Fix to return a negative error code from the idr_alloc() error handling
case instead of 0, as done elsewhere in this function.

Also fix the return value check of idr_alloc() since idr_alloc() returns
negative errors on failure, not zero.
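The "negative errno on failure, id >= 0 on success" convention can be sketched in userspace as follows. The toy_alloc_id() allocator and toy_register() caller are hypothetical and only mimic the convention; they are not the idr API.

```c
#include <assert.h>
#include <errno.h>

#define TOY_MAX_IDS 4

static int toy_used[TOY_MAX_IDS];

/*
 * Hypothetical allocator following the convention discussed above:
 * returns the allocated id (>= 0) or a negative errno value.
 */
int toy_alloc_id(void)
{
    for (int id = 0; id < TOY_MAX_IDS; id++) {
        if (!toy_used[id]) {
            toy_used[id] = 1;
            return id;
        }
    }
    return -ENOSPC;
}

/* Register something and report its id through id_out. */
int toy_register(int *id_out)
{
    int id = toy_alloc_id();

    /*
     * Checking "if (!id)" would be wrong twice over: 0 is a valid id,
     * and real failures are negative. Propagate the error code as is.
     */
    if (id < 0)
        return id;

    *id_out = id;
    return 0;
}
```

The caller returns the allocator's negative value unchanged, so an exhausted id space is reported as an error instead of being mistaken for success.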

Fixes: 2ae0f17df1cd ("genetlink: use idr to track families")
Signed-off-by: Wei Yongjun
Signed-off-by: David S. Miller

Wei Yongjun
2016-11-02 00:13:13 +0800

30 Oct, 2016

1 commit

0e82c7635 genetlink: Fix generic netlink family unregister

This patch fixes a typo in the unregister operation.

The following crash is fixed by this patch. It can be easily reproduced
by repeatedly running modprobe and rmmod on a module that uses genetlink.

[ 261.446686] BUG: unable to handle kernel paging request at ffffffffa0264088
[ 261.448921] IP: [] strcmp+0xe/0x30
[ 261.450494] PGD 1c09067
[ 261.451266] PUD 1c0a063
[ 261.452091] PMD 8068d5067
[ 261.452525] PTE 0
[ 261.453164]
[ 261.453618] Oops: 0000 [#1] SMP
[ 261.454577] Modules linked in: openvswitch(+) ...
[ 261.480753] RIP: 0010:[] [] strcmp+0xe/0x30
[ 261.483069] RSP: 0018:ffffc90003c0bc28 EFLAGS: 00010282
[ 261.510145] Call Trace:
[ 261.510896] [] genl_family_find_byname+0x5a/0x70
[ 261.512819] [] genl_register_family+0xb9/0x630
[ 261.514805] [] dp_init+0xbc/0x120 [openvswitch]
[ 261.518268] [] do_one_initcall+0x3d/0x160
[ 261.525041] [] do_init_module+0x60/0x1f1
[ 261.526754] [] load_module+0x22af/0x2860
[ 261.530144] [] SYSC_finit_module+0x96/0xd0
[ 261.531901] [] SyS_finit_module+0xe/0x10
[ 261.533605] [] do_syscall_64+0x6e/0x180
[ 261.535284] [] entry_SYSCALL64_slow_path+0x25/0x25
[ 261.546512] RIP [] strcmp+0xe/0x30
[ 261.550198] ---[ end trace 76505a814dd68770 ]---

Fixes: 2ae0f17df1c ("genetlink: use idr to track families").

Reported-by: Jarno Rajahalme
CC: Johannes Berg
Signed-off-by: Pravin B Shelar
Reviewed-by: Johannes Berg
Signed-off-by: David S. Miller

pravin shelar
2016-10-30 08:58:15 +0800

28 Oct, 2016

5 commits

56989f6d8 genetlink: mark families as __ro_after_init

Now genl_register_family() is the only thing (other than the
users themselves, perhaps, but I didn't find any doing that)
writing to the family struct.

In all families that I found, genl_register_family() is only
called from __init functions (some indirectly, in which case
I've added __init annotations to clarify things), so all can
actually be marked __ro_after_init.

This protects the data structure from accidental corruption.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2016-10-28 04:16:09 +0800
2ae0f17df genetlink: use idr to track families

Since generic netlink family IDs are small integers, allocated
densely, IDR is an ideal match for lookups. Replace the existing
hand-written hash-table with IDR for allocation and lookup.

This lets the families only be written to once, during register,
since the list_head can be removed and removal of a family won't
cause any writes.

It also slightly reduces the code size (by about 1.3k on x86-64).

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2016-10-28 04:16:09 +0800
489111e5c genetlink: statically initialize families

Instead of providing macros/inline functions to initialize
the families, make all users initialize them statically and
get rid of the macros.

This reduces the kernel code size by about 1.6k on x86-64
(with allyesconfig).

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2016-10-28 04:16:09 +0800
a07ea4d99 genetlink: no longer support using static family IDs

Static family IDs have never really been used, the only
use case was the workaround I introduced for those users
that assumed their family ID was also their multicast
group ID.

Additionally, because static family IDs would never be
reserved by the generic netlink code, using a relatively
low ID would only work for built-in families that can be
registered immediately after generic netlink is started,
which is basically only the control family (apart from
the workaround code, which I also had to add code for so
it would reserve those IDs)

Thus, anything other than GENL_ID_GENERATE is flawed and
luckily not used except in the cases I mentioned. Move
those workarounds into a few lines of code, and then get
rid of GENL_ID_GENERATE entirely, making it more robust.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2016-10-28 04:16:09 +0800
c90c39dab genetlink: introduce and use genl_family_attrbuf()

This helper function allows family implementations to access
their family's attrbuf. This gets rid of the attrbuf usage
in families, and also adds locking validation, since it's not
valid to use the attrbuf with parallel_ops or outside of the
dumpit callback.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2016-10-28 04:16:08 +0800

07 Oct, 2016

1 commit

d35c99ff7 netlink: do not enter direct reclaim from netlink_dump()

Since linux-3.15, netlink_dump() can use up to 16384 bytes skb
allocations.

Due to struct skb_shared_info ~320 bytes overhead, we end up using
order-3 (on x86) page allocations, that might trigger direct reclaim and
add stress.

The intent was really to attempt a large allocation but immediately
fall back to a smaller one (order-1 on x86) in case of memory stress.

On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to
meet the goal. Old kernels would need to remove __GFP_WAIT.
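The "try big without reclaim, then fall back" idea can be sketched in userspace. The toy_alloc_fn hook, its may_reclaim flag, and the two example allocators are hypothetical; malloc() stands in for the kernel allocator and the flag only imitates the effect of clearing a reclaim flag, so this is a model of the control flow rather than of the real allocation code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical allocator hook. may_reclaim == 0 means "fail fast
 * instead of working hard to free memory".
 */
typedef void *(*toy_alloc_fn)(size_t size, int may_reclaim);

/*
 * Try an opportunistic large allocation first; if it fails, fall back
 * to a smaller one that is allowed to work harder. The size actually
 * obtained is stored in *got (0 if both attempts failed).
 */
void *toy_alloc_dump_buf(toy_alloc_fn alloc, size_t large, size_t small,
                         size_t *got)
{
    void *buf = alloc(large, 0);

    if (buf) {
        *got = large;
        return buf;
    }

    buf = alloc(small, 1);
    *got = buf ? small : 0;
    return buf;
}

/* Example hook: memory is plentiful, every request succeeds. */
void *toy_alloc_plenty(size_t size, int may_reclaim)
{
    (void)may_reclaim;
    return malloc(size);
}

/* Example hook: under pressure, only reclaim-capable requests succeed. */
void *toy_alloc_pressure(size_t size, int may_reclaim)
{
    return may_reclaim ? malloc(size) : NULL;
}
```

The point of the design is that the expensive path is never taken for the large request: under pressure the caller quickly ends up with the smaller buffer instead of stalling.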

While we are at it, since we do an order-3 allocation, allow the use
of all the allocated bytes instead of 16384, to reduce syscalls during
large dumps.

iproute2 already uses 32KB recvmsg() buffer sizes.

Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384)

Fixes: 9063e21fb026 ("netlink: autosize skb lengthes")
Signed-off-by: Eric Dumazet
Reported-by: Alexei Starovoitov
Cc: Greg Thelen
Reviewed-by: Greg Rose
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Eric Dumazet
2016-10-07 08:53:13 +0800

08 Sep, 2016

1 commit

733ade23d netlink: don't forget to release a rhashtable_iter structure

This bug was detected by kmemleak:
unreferenced object 0xffff8804269cc3c0 (size 64):
comm "criu", pid 1042, jiffies 4294907360 (age 13.713s)
hex dump (first 32 bytes):
a0 32 cc 2c 04 88 ff ff 00 00 00 00 00 00 00 00 .2.,............
00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................
backtrace:
[] kmemleak_alloc+0x4a/0xa0
[] kmem_cache_alloc_trace+0x10f/0x280
[] __netlink_diag_dump+0x26c/0x290 [netlink_diag]

v2: don't remove a reference on a rhashtable_iter structure to
release it from netlink_diag_dump_done

Cc: Herbert Xu
Fixes: ad202074320c ("netlink: Use rhashtable walk interface in diag dump")
Signed-off-by: Andrei Vagin
Acked-by: Herbert Xu
Signed-off-by: David S. Miller

Andrey Vagin
2016-09-08 08:29:38 +0800

02 Sep, 2016

1 commit

12d8de6d9 net: make genetlink ctrl ops const

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

stephen hemminger
2016-09-02 05:09:00 +0800

20 Aug, 2016

1 commit

ad2020743 netlink: Use rhashtable walk interface in diag dump

This patch converts the diag dumping code to use the rhashtable
walk code instead of going through rhashtable by hand. The lock
nl_table_lock is now only taken while we process the multicast
list as it's not needed for the rhashtable walk.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2016-08-20 05:40:25 +0800

10 Jun, 2016

1 commit

21aff3b90 net/netlink/af_netlink.h: Remove unused structure.

Signed-off-by: Fabien Siron
Signed-off-by: David S. Miller

Fabien Siron
2016-06-10 13:26:24 +0800

17 May, 2016

1 commit

92964c79b netlink: Fix dump skb leak/double free

When we free cb->skb after a dump, we do it after releasing the
lock. This means that a new dump could have started in the
meantime and we'll end up freeing their skb instead of ours.

This patch saves the skb and module before we unlock so we free
the right memory.
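The "save before unlock" rule can be modelled deterministically in userspace. In this sketch the toy_sock type, its on_unlock hook, and toy_start_new_dump() are hypothetical: the hook simulates another task starting a new dump the instant the lock is dropped, which is the window described above. It is not the netlink code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket with a single in-flight dump buffer. */
struct toy_sock {
    int locked;
    int *cb_skb;                          /* stand-in for cb->skb */
    void (*on_unlock)(struct toy_sock *); /* simulates a racing task */
};

static int toy_new_buf;

/* Simulated racing task: installs the buffer of a brand-new dump. */
void toy_start_new_dump(struct toy_sock *sk)
{
    sk->cb_skb = &toy_new_buf;
}

void toy_unlock(struct toy_sock *sk)
{
    sk->locked = 0;
    if (sk->on_unlock)
        sk->on_unlock(sk);   /* a new dump may start right here */
}

/* Finish the current dump and return the buffer the caller must free. */
int *toy_finish_dump(struct toy_sock *sk)
{
    int *skb;

    sk->locked = 1;
    skb = sk->cb_skb;        /* save while still holding the lock */
    sk->cb_skb = NULL;
    toy_unlock(sk);

    /* Re-reading sk->cb_skb here would pick up the new dump's buffer. */
    return skb;
}
```

Because the pointer is copied into a local before the unlock, the caller releases its own buffer even though the shared field already points at the next dump's buffer by the time the function returns.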

Fixes: 16b304f3404f ("netlink: Eliminate kmalloc in netlink dump operation.")
Reported-by: Baozeng Ding
Signed-off-by: Herbert Xu
Acked-by: Cong Wang
Signed-off-by: David S. Miller

Herbert Xu
2016-05-17 10:05:15 +0800

24 Apr, 2016

1 commit

1602f49b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts were two cases of simple overlapping changes,
nothing serious.

In the UDP case, we need to add a hlist_add_tail_rcu()
to linux/rculist.h, because we've moved UDP socket handling
away from using nulls lists.

Signed-off-by: David S. Miller

David S. Miller
2016-04-24 06:51:33 +0800

11 Apr, 2016

1 commit

e27260203 netlink: don't send NETLINK_URELEASE for unbound sockets

All existing users of NETLINK_URELEASE use it to clean up resources that
were previously allocated to a socket via some command. As a result, no
users require getting this notification for unbound sockets.

Sending it for unbound sockets, however, is a problem because any user
(including unprivileged users) can create a socket that uses the same ID
as an existing socket. Binding this new socket will fail, but if the
NETLINK_URELEASE notification is generated for such sockets, the users
thereof will be tricked into thinking the socket that they allocated the
resources for is closed.

In the nl80211 case, this will cause destruction of virtual interfaces
that still belong to an existing hostapd process; this is the case that
Dmitry noticed. In the NFC case, it will cause a poll abort. In the case
of netlink log/queue it will cause them to stop reporting events, as if
NFULNL_CFG_CMD_UNBIND/NFQNL_CFG_CMD_UNBIND had been called.

Fix this problem by checking that the socket is bound before generating
the NETLINK_URELEASE notification.

Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Ivanov
Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Dmitry Ivanov
2016-04-11 11:32:23 +0800