Eric Lee / smarc-fsl-linux-kernel

22 Oct, 2020

1 commit

ebfe3c518 rtnetlink: fix data overflow in rtnl_calcit() ... Browse Code »

"ip addr show" command execute error when we have a physical
network card with a large number of VFs

The return value of if_nlmsg_size() in rtnl_calcit() will exceed
range of u16 data type when any network cards has a larger number of
VFs. rtnl_vfinfo_size() will significant increase needed dump size when
the value of num_vfs is larger.

Eventually we get a wrong value of min_ifinfo_dump_size because of overflow
which decides the memory size needed by netlink dump and netlink_dump()
will return -EMSGSIZE because of not enough memory was allocated.

So fix it by promoting min_dump_alloc data type to u32 to
avoid whole netlink message size overflow and it's also align
with the data type of struct netlink_callback{}.min_dump_alloc
which is assigned by return value of rtnl_calcit()

Signed-off-by: Di Zhu
Link: https://lore.kernel.org/r/20201021020053.1401-1-zhudi21@huawei.com
Signed-off-by: Jakub Kicinski

Di Zhu
2020-10-22 09:24:08 +0800

10 Oct, 2020

1 commit

44f3625bc netlink: export policy in extended ACK ... Browse Code »

Add a new attribute NLMSGERR_ATTR_POLICY to the extended ACK
to advertise the policy, e.g. if an attribute was out of range,
you'll know the range that's permissible.

Add new NL_SET_ERR_MSG_ATTR_POL() and NL_SET_ERR_MSG_ATTR_POL()
macros to set this, since realistically it's only useful to do
this when the bad attribute (offset) is also returned.

Use it in lib/nlattr.c which practically does all the policy
validation.

v2:
- add and use netlink_policy_dump_attr_size_estimate()
v3:
- remove redundant break
v4:
- really remove redundant break ... sorry

Reviewed-by: Jakub Kicinski
Signed-off-by: Johannes Berg
Signed-off-by: Jakub Kicinski

Johannes Berg
2020-10-10 11:22:32 +0800

26 Mar, 2020

1 commit

9fb16955f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Overlapping header include additions in macsec.c

A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c

Overlapping test additions in selftests Makefile

Overlapping PCI ID table adjustments in iwlwifi driver.

Signed-off-by: David S. Miller

David S. Miller
2020-03-26 09:58:11 +0800

24 Mar, 2020

1 commit

55b474c41 netlink: check for null extack in cookie helpers ... Browse Code »

Unlike NL_SET_ERR_* macros, nl_set_extack_cookie_u64() and
nl_set_extack_cookie_u32() helpers do not check extack argument for null
and neither do their callers, as syzbot recently discovered for
ethnl_parse_header().

Instead of fixing the callers and leaving the trap in place, add check of
null extack to both helpers to make them consistent with NL_SET_ERR_*
macros.

v2: drop incorrect second Fixes tag

Fixes: 2363d73a2f3e ("ethtool: reject unrecognized request flags")
Reported-by: syzbot+258a9089477493cea67b@syzkaller.appspotmail.com
Signed-off-by: Michal Kubecek
Signed-off-by: David S. Miller

Michal Kubecek
2020-03-24 12:19:20 +0800

16 Mar, 2020

1 commit

f1388ec4a netlink: add nl_set_extack_cookie_u32() ... Browse Code »

Similar to existing nl_set_extack_cookie_u64(), add new helper
nl_set_extack_cookie_u32() which sets extack cookie to a u32 value.

Signed-off-by: Michal Kubecek
Signed-off-by: David S. Miller

Michal Kubecek
2020-03-16 17:04:24 +0800

28 Feb, 2020

1 commit

085c20cac bpf: inet_diag: Dump bpf_sk_storages in inet_diag_dump() ... Browse Code »

This patch will dump out the bpf_sk_storages of a sk
if the request has the INET_DIAG_REQ_SK_BPF_STORAGES nlattr.

An array of SK_DIAG_BPF_STORAGE_REQ_MAP_FD can be specified in
INET_DIAG_REQ_SK_BPF_STORAGES to select which bpf_sk_storage to dump.
If no map_fd is specified, all bpf_sk_storages of a sk will be dumped.

bpf_sk_storages can be added to the system at runtime. It is difficult
to find a proper static value for cb->min_dump_alloc.

This patch learns the nlattr size required to dump the bpf_sk_storages
of a sk. If it happens to be the very first nlmsg of a dump and it
cannot fit the needed bpf_sk_storages, it will try to expand the
skb by "pskb_expand_head()".

Instead of expanding it in inet_sk_diag_fill(), it is expanded at a
sleepable context in __inet_diag_dump() so __GFP_DIRECT_RECLAIM can
be used. In __inet_diag_dump(), it will retry as long as the
skb is empty and the cb->min_dump_alloc becomes larger than before.
cb->min_dump_alloc is bounded by KMALLOC_MAX_SIZE. The min_dump_alloc
is also changed from 'u16' to 'u32' to accommodate a sk that may have
a few large bpf_sk_storages.

The updated cb->min_dump_alloc will also be used to allocate the skb in
the next dump. This logic already exists in netlink_dump().

Here is the sample output of a locally modified 'ss' and it could be made
more readable by using BTF later:
[root@arch-fb-vm1 ~]# ss --bpf-map-id 14 --bpf-map-id 13 -t6an 'dst [::1]:8989'
State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
ESTAB 0 0 [::1]:51072 [::1]:8989
bpf_map_id:14 value:[ 3feb ]
bpf_map_id:13 value:[ 3f ]
ESTAB 0 0 [::1]:51070 [::1]:8989
bpf_map_id:14 value:[ 3feb ]
bpf_map_id:13 value:[ 3f ]

[root@arch-fb-vm1 ~]# ~/devshare/github/iproute2/misc/ss --bpf-maps -t6an 'dst [::1]:8989'
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 0 0 [::1]:51072 [::1]:8989
bpf_map_id:14 value:[ 3feb ]
bpf_map_id:13 value:[ 3f ]
bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ]
ESTAB 0 0 [::1]:51070 [::1]:8989
bpf_map_id:14 value:[ 3feb ]
bpf_map_id:13 value:[ 3f ]
bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ]

Signed-off-by: Martin KaFai Lau
Signed-off-by: Alexei Starovoitov
Acked-by: Song Liu
Link: https://lore.kernel.org/bpf/20200225230427.1976129-1-kafai@fb.com

Martin KaFai Lau
2020-02-28 10:50:19 +0800

02 Jul, 2019

1 commit

362b87f5b netlink: use 48 byte ctx instead of 6 signed longs for callback ... Browse Code »

People are inclined to stuff random things into cb->args[n] because it
looks like an array of integers. Sometimes people even put u64s in there
with comments noting that a certain member takes up two slots. The
horror! Really this should mirror the usage of skb->cb, which are just
48 opaque bytes suitable for casting a struct. Then people can create
their usual casting macros for accessing strongly typed members of a
struct.

As a plus, this also gives us the same amount of space on 32bit and 64bit.

Signed-off-by: Jason A. Donenfeld
Reviewed-by: Johannes Berg
Signed-off-by: David S. Miller

Jason A. Donenfeld
2019-07-02 10:12:10 +0800

20 Jan, 2019

1 commit

59c28058f net: netlink: add helper to retrieve NETLINK_F_STRICT_CHK ... Browse Code »

Dumps can read state of the NETLINK_F_STRICT_CHK flag from
a field in the callback structure. For non-dump GET requests
we need a way to access the state of that flag from a socket.

Signed-off-by: Jakub Kicinski
Signed-off-by: David S. Miller

Jakub Kicinski
2019-01-20 02:09:58 +0800

21 Dec, 2018

1 commit

aa9d6e0f3 linux/netlink.h: drop unnecessary extern prefix ... Browse Code »

Don't need extern prefix before function prototypes.
Checkpatch has complained about this for a couple of years.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2018-12-21 08:43:54 +0800

09 Nov, 2018

1 commit

801f87469 netlink: add nl_set_extack_cookie_u64() ... Browse Code »

Add a helper function nl_set_extack_cookie_u64() to use a u64 as
the netlink extended ACK cookie, to avoid having to open-code it
in any users of the cookie.

A u64 should be sufficient for most subsystems though we allow
for up to 20 bytes right now. This also matches the cookies in
nl80211 where I intend to use this.

Signed-off-by: Johannes Berg
Acked-by: David S. Miller
Signed-off-by: Johannes Berg

Johannes Berg
2018-11-09 18:20:07 +0800

16 Oct, 2018

1 commit

22e6c58b8 netlink: Add answer_flags to netlink_callback ... Browse Code »

With dump filtering we need a way to ensure the NLM_F_DUMP_FILTERED
flag is set on a message back to the user if the data returned is
influenced by some input attributes. Normally this can be done as
messages are added to the skb, but if the filter results in no data
being returned, the user could be confused as to why.

This patch adds answer_flags to the netlink_callback allowing dump
handlers to set the NLM_F_DUMP_FILTERED at a minimum in the
NLMSG_DONE message ensuring the flag gets back to the user.

The netlink_callback space is initialized to 0 via a memset in
__netlink_dump_start, so init of the new answer_flags is covered.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2018-10-16 15:13:12 +0800

09 Oct, 2018

2 commits

89d35528d netlink: Add new socket option to enable strict checking on dumps ... Browse Code »

Add a new socket option, NETLINK_DUMP_STRICT_CHK, that userspace
can use via setsockopt to request strict checking of headers and
attributes on dump requests.

To get dump features such as kernel side filtering based on data in
the header or attributes appended to the dump request, userspace
must call setsockopt() for NETLINK_DUMP_STRICT_CHK and a non-zero
value. Since the netlink sock and its flags are private to the
af_netlink code, the strict checking flag is passed to dump handlers
via a flag in the netlink_callback struct.

For old userspace on new kernel there is no impact as all of the data
checks in later patches are wrapped in a check on the new strict flag.

For new userspace on old kernel, the setsockopt will fail and even if
new userspace sets data in the headers and appended attributes the
kernel will silently ignore it. Moving forward when the setsockopt
succeeds, the new userspace on old kernel means the dump request can
pass an attribute the kernel does not understand. The dump will then
fail as the older kernel does not understand it.

New userspace on new kernel setting the socket option gets the benefit
of the improved data dump.

Kernel side the NETLINK_DUMP_STRICT_CHK uapi is converted to a generic
NETLINK_F_STRICT_CHK flag which can potentially be leveraged for tighter
checking on the NEW, DEL, and SET commands.

Signed-off-by: David Ahern
Acked-by: Christian Brauner
Signed-off-by: David S. Miller

David Ahern
2018-10-09 01:39:04 +0800
4a19edb60 netlink: Pass extack to dump handlers ... Browse Code »

Declare extack in netlink_dump and pass to dump handlers via
netlink_callback. Add any extack message after the dump_done_errno
allowing error messages to be returned. This will be useful when
strict checking is done on dump requests, returning why the dump
fails EINVAL.

Signed-off-by: David Ahern
Acked-by: Christian Brauner
Signed-off-by: David S. Miller

David Ahern
2018-10-09 01:39:03 +0800

25 Jul, 2018

1 commit

3730cf4dd netlink: do not store start function in netlink_cb ... Browse Code »

->start() is called once when dump is being initialized, there is no
need to store it in netlink_cb.

Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2018-07-25 01:04:49 +0800

16 Jan, 2018

1 commit

6311b7ce4 netlink: extack: avoid parenthesized string constant warning ... Browse Code »

NL_SET_ERR_MSG() and NL_SET_ERR_MSG_ATTR() lead to the following warning
in newer versions of gcc:
warning: array initialized from parenthesized string constant

Just remove the parentheses, they're not needed in this context since
anyway since there can be no operator precendence issues or similar.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2018-01-16 04:15:23 +0800

14 Nov, 2017

1 commit

096d1dd0f netlink: remove unused NETLINK SKB flags ... Browse Code »

These flags are unused, remove them to be less confusing.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2017-11-14 20:51:14 +0800

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license ... Browse Code »

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

30 May, 2017

1 commit

9ae287274 net: add extack arg to lwtunnel build state ... Browse Code »

Pass extack arg down to lwtunnel_build_state and the build_state callbacks.
Add messages for failures in lwtunnel_build_state, and add the extarg to
nla_parse where possible in the build_state callbacks.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2017-05-30 23:55:32 +0800

23 May, 2017

1 commit

c3ab2b4ec net: ipv4: Add extack messages for route add failures ... Browse Code »

Add messages for non-obvious errors (e.g, no need to add text for malloc
failures or ENODEV failures). This mostly covers the annoying EINVAL errors
Some message strings violate the 80-columns but searchable strings need to
trump that rule.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2017-05-23 00:12:20 +0800

03 May, 2017

1 commit

4d463c4db xdp: use common helper for netlink extended ack reporting ... Browse Code »

Small follow-up to d74a32acd59a ("xdp: use netlink extended ACK reporting")
in order to let drivers all use the same NL_SET_ERR_MSG_MOD() helper macro
for reporting. This also ensures that we consistently add the driver's
prefix for dumping the report in user space to indicate that the error
message is driver specific and not coming from core code. Furthermore,
NL_SET_ERR_MSG_MOD() now reuses NL_SET_ERR_MSG() and thus makes all macros
check the pointer as suggested.

References: https://www.spinics.net/lists/netdev/msg433267.html
Signed-off-by: Daniel Borkmann
Acked-by: Jakub Kicinski
Reviewed-by: Johannes Berg
Signed-off-by: David S. Miller

Daniel Borkmann
2017-05-03 21:51:24 +0800

01 May, 2017

1 commit

45d9b378e netlink: add NULL-friendly helper for setting extended ACK message ... Browse Code »

As we propagate extended ack reporting throughout various paths in
the kernel it may be that the same function is called with the
extended ack parameter passed as NULL. One place where that happens
is in drivers which have a centralized reconfiguration function
called both from ndos and from ethtool_ops. Add a new helper for
setting the error message in such conditions.

Existing helper is left as is to encourage propagating the ext act
fully wherever possible. It also makes it clear in the code which
messages may be lost due to ext ack being NULL.

Signed-off-by: Jakub Kicinski
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller

Jakub Kicinski
2017-05-01 22:35:47 +0800

14 Apr, 2017

2 commits

ba0dc5f6e netlink: allow sending extended ACK with cookie on success ... Browse Code »

Now that we have extended error reporting and a new message format for
netlink ACK messages, also extend this to be able to return arbitrary
cookie data on success.

This will allow, for example, nl80211 to not send an extra message for
cookies identifying newly created objects, but return those directly
in the ACK message.

The cookie data size is currently limited to 20 bytes (since Jamal
talked about using SHA1 for identifiers.)

Thanks to Jamal Hadi Salim for bringing up this idea during the
discussions.

Signed-off-by: Johannes Berg
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller

Johannes Berg
2017-04-14 01:58:21 +0800
2d4bc9336 netlink: extended ACK reporting ... Browse Code »

Add the base infrastructure and UAPI for netlink extended ACK
reporting. All "manual" calls to netlink_ack() pass NULL for now and
thus don't get extended ACK reporting.

Big thanks goes to Pablo Neira Ayuso for not only bringing up the
whole topic at netconf (again) but also coming up with the nlattr
passing trick and various other ideas.

Signed-off-by: Johannes Berg
Reviewed-by: David Ahern
Signed-off-by: David S. Miller

Johannes Berg
2017-04-14 01:58:20 +0800

19 Feb, 2016

1 commit

c5b0db326 nfnetlink: Revert "nfnetlink: add support for memory mapped netlink" ... Browse Code »

reverts commit 3ab1f683bf8b ("nfnetlink: add support for memory mapped
netlink")'

Like previous commits in the series, remove wrappers that are not needed
after mmapped netlink removal.

Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2016-02-19 00:42:22 +0800

16 Dec, 2015

1 commit

fc9e50f5a netlink: add a start callback for starting a netlink dump ... Browse Code »

The start callback allows the caller to set up a context for the
dump callbacks. Presumably, the context can then be destroyed in
the done callback.

Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller

Tom Herbert
2015-12-16 12:25:20 +0800

10 Sep, 2015

1 commit

6bb0fef48 netlink, mmap: fix edge-case leakages in nf queue zero-copy ... Browse Code »

When netlink mmap on receive side is the consumer of nf queue data,
it can happen that in some edge cases, we write skb shared info into
the user space mmap buffer:

Assume a possible rx ring frame size of only 4096, and the network skb,
which is being zero-copied into the netlink skb, contains page frags
with an overall skb->len larger than the linear part of the netlink
skb.

skb_zerocopy(), which is generic and thus not aware of the fact that
shared info cannot be accessed for such skbs then tries to write and
fill frags, thus leaking kernel data/pointers and in some corner cases
possibly writing out of bounds of the mmap area (when filling the
last slot in the ring buffer this way).

I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has
an advertised length larger than 4096, where the linear part is visible
at the slot beginning, and the leaked sizeof(struct skb_shared_info)
has been written to the beginning of the next slot (also corrupting
the struct nl_mmap_hdr slot header incl. status etc), since skb->end
points to skb->data + ring->frame_size - NL_MMAP_HDRLEN.

The fix adds and lets __netlink_alloc_skb() take the actual needed
linear room for the network skb + meta data into account. It's completely
irrelevant for non-mmaped netlink sockets, but in case mmap sockets
are used, it can be decided whether the available skb_tailroom() is
really large enough for the buffer, or whether it needs to internally
fallback to a normal alloc_skb().

>From nf queue side, the information whether the destination port is
an mmap RX ring is not really available without extra port-to-socket
lookup, thus it can only be determined in lower layers i.e. when
__netlink_alloc_skb() is called that checks internally for this. I
chose to add the extra ldiff parameter as mmap will then still work:
We have data_len and hlen in nfqnl_build_packet_message(), data_len
is the full length (capped at queue->copy_range) for skb_zerocopy()
and hlen some possible part of data_len that needs to be copied; the
rem_len variable indicates the needed remaining linear mmap space.

The only other workaround in nf queue internally would be after
allocation time by f.e. cap'ing the data_len to the skb_tailroom()
iff we deal with an mmap skb, but that would 1) expose the fact that
we use a mmap skb to upper layers, and 2) trim the skb where we
otherwise could just have moved the full skb into the normal receive
queue.

After the patch, in my test case the ring slot doesn't fit and therefore
shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and
thus needs to be picked up via recv().

Fixes: 3ab1f683bf8b ("nfnetlink: add support for memory mapped netlink")
Signed-off-by: Daniel Borkmann
Signed-off-by: David S. Miller

Daniel Borkmann
2015-09-10 12:43:22 +0800

10 May, 2015

1 commit

59324cf35 netlink: allow to listen "all" netns ... Browse Code »

More accurately, listen all netns that have a nsid assigned into the netns
where the netlink socket is opened.
For this purpose, a netlink socket option is added:
NETLINK_LISTEN_ALL_NSID. When this option is set on a netlink socket, this
socket will receive netlink notifications from all netns that have a nsid
assigned into the netns where the socket has been opened. The nsid is sent
to userland via an anscillary data.

With this patch, a daemon needs only one socket to listen many netns. This
is useful when the number of netns is high.

Because 0 is a valid value for a nsid, the field nsid_is_set indicates if
the field nsid is valid or not. skb->cb is initialized to 0 on skb
allocation, thus we are sure that we will never send a nsid 0 by error to
the userland.

Signed-off-by: Nicolas Dichtel
Acked-by: Thomas Graf
Signed-off-by: David S. Miller

Nicolas Dichtel
2015-05-10 10:15:31 +0800

14 Apr, 2015

1 commit

0392d099a netlink: Fix portid type in netlink_notify ... Browse Code »

portid is an unsigned integer. Fix netlink_notify to
match all other portid user in the kernel.

Signed-off-by: Richard Weinberger
Signed-off-by: David S. Miller

Richard Weinberger
2015-04-14 04:35:16 +0800

27 Dec, 2014

1 commit

023e2cfa3 netlink/genetlink: pass network namespace to bind/unbind ... Browse Code »

Netlink families can exist in multiple namespaces, and for the most
part multicast subscriptions are per network namespace. Thus it only
makes sense to have bind/unbind notifications per network namespace.

To achieve this, pass the network namespace of a given client socket
to the bind/unbind functions.

Also do this in generic netlink, and there also make sure that any
bind for multicast groups that only exist in init_net is rejected.
This isn't really a problem if it is accepted since a client in a
different namespace will never receive any notifications from such
a group, but it can confuse the family if not rejected (it's also
possible to silently (without telling the family) accept it, but it
would also have to be ignored on unbind so families that take any
kind of action on bind/unbind won't do unnecessary work for invalid
clients like that.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2014-12-27 16:07:50 +0800

04 Jun, 2014

1 commit

c99f7abf0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Conflicts:
include/net/inetpeer.h
net/ipv6/output_core.c

Changes in net were fixing bugs in code removed in net-next.

Signed-off-by: David S. Miller

David S. Miller
2014-06-04 14:32:12 +0800

03 Jun, 2014

1 commit

2d7a85f4b netlink: Only check file credentials for implicit destinations ... Browse Code »

It was possible to get a setuid root or setcap executable to write to
it's stdout or stderr (which has been set made a netlink socket) and
inadvertently reconfigure the networking stack.

To prevent this we check that both the creator of the socket and
the currentl applications has permission to reconfigure the network
stack.

Unfortunately this breaks Zebra which always uses sendto/sendmsg
and creates it's socket without any privileges.

To keep Zebra working don't bother checking if the creator of the
socket has privilege when a destination address is specified. Instead
rely exclusively on the privileges of the sender of the socket.

Note from Andy: This is exactly Eric's code except for some comment
clarifications and formatting fixes. Neither I nor, I think, anyone
else is thrilled with this approach, but I'm hesitant to wait on a
better fix since 3.15 is almost here.

Note to stable maintainers: This is a mess. An earlier series of
patches in 3.15 fix a rather serious security issue (CVE-2014-0181),
but they did so in a way that breaks Zebra. The offending series
includes:

commit aa4cf9452f469f16cea8c96283b641b4576d4a7b
Author: Eric W. Biederman
Date: Wed Apr 23 14:28:03 2014 -0700

net: Add variants of capable for use on netlink messages

If a given kernel version is missing that series of fixes, it's
probably worth backporting it and this patch. if that series is
present, then this fix is critical if you care about Zebra.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"
Signed-off-by: Andy Lutomirski
Signed-off-by: David S. Miller

Eric W. Biederman
2014-06-03 07:34:09 +0800

13 May, 2014

1 commit

5f013c9bc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Conflicts:
drivers/net/ethernet/altera/altera_sgdma.c
net/netlink/af_netlink.c
net/sched/cls_api.c
net/sched/sch_api.c

The netlink conflict dealt with moving to netlink_capable() and
netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations
in non-init namespaces. These were simple transformations from
netlink_capable to netlink_ns_capable.

The Altera driver conflict was simply code removal overlapping some
void pointer cast cleanups in net-next.

Signed-off-by: David S. Miller

David S. Miller
2014-05-13 01:19:14 +0800

25 Apr, 2014

1 commit

aa4cf9452 net: Add variants of capable for use on netlink messages ... Browse Code »

netlink_net_capable - The common case use, for operations that are safe on a network namespace
netlink_capable - For operations that are only known to be safe for the global root
netlink_ns_capable - The general case of capable used to handle special cases

__netlink_ns_capable - Same as netlink_ns_capable except taking a netlink_skb_parms instead of
the skbuff of a netlink message.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2014-04-25 01:44:54 +0800

23 Apr, 2014

1 commit

4f5209005 netlink: have netlink per-protocol bind function return an error code. ... Browse Code »

Have the netlink per-protocol optional bind function return an int error code
rather than void to signal a failure.

This will enable netlink protocols to perform extra checks including
capabilities and permissions verifications when updating memberships in
multicast groups.

In netlink_bind() and netlink_setsockopt() the call to the per-protocol bind
function was moved above the multicast group update to prevent any access to
the multicast socket groups before checking with the per-protocol bind
function. This will enable the per-protocol bind function to be used to check
permissions which could be denied before making them available, and to avoid
the messy job of undoing the addition should the per-protocol bind function
fail.

The netfilter subsystem seems to be the only one currently using the
per-protocol bind function.

Signed-off-by: Richard Guy Briggs
Signed-off-by: David S. Miller

Richard Guy Briggs
2014-04-23 09:42:26 +0800

02 Jan, 2014

1 commit

2173f8d95 netlink: cleanup tap related functions ... Browse Code »

Cleanups in netlink_tap code
* remove unused function netlink_clear_multicast_users
* make local function static

Signed-off-by: Stephen Hemminger
Reviewed-by: Johannes Berg
Signed-off-by: David S. Miller

stephen hemminger
2014-01-02 12:43:36 +0800

28 Jun, 2013

1 commit

3a36515f7 netlink: fix splat in skb_clone with large messages ... Browse Code »

Since (c05cdb1 netlink: allow large data transfers from user-space),
netlink splats if it invokes skb_clone on large netlink skbs since:

* skb_shared_info was not correctly initialized.
* skb->destructor is not set in the cloned skb.

This was spotted by trinity:

[ 894.990671] BUG: unable to handle kernel paging request at ffffc9000047b001
[ 894.991034] IP: [] skb_clone+0x24/0xc0
[...]
[ 894.991034] Call Trace:
[ 894.991034] [] nl_fib_input+0x6a/0x240
[ 894.991034] [] ? _raw_read_unlock+0x26/0x40
[ 894.991034] [] netlink_unicast+0x169/0x1e0
[ 894.991034] [] netlink_sendmsg+0x251/0x3d0

Fix it by:

1) introducing a new netlink_skb_clone function that is used in nl_fib_input,
that sets our special skb->destructor in the cloned skb. Moreover, handle
the release of the large cloned skb head area in the destructor path.

2) not allowing large skbuffs in the netlink broadcast path. I cannot find
any reasonable use of the large data transfer using netlink in that path,
moreover this helps to skip extra skb_clone handling.

I found two more netlink clients that are cloning the skbs, but they are
not in the sendmsg path. Therefore, the sole client cloning that I found
seems to be the fib frontend.

Thanks to Eric Dumazet for helping to address this issue.

Reported-by: Fengguang Wu
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: David S. Miller

Pablo Neira
2013-06-28 13:44:16 +0800

25 Jun, 2013

1 commit

bcbde0d44 net: netlink: virtual tap device management ... Browse Code »

Similarly to the networking receive path with ptype_all taps, we add
the possibility to register netdevices that are for ARPHRD_NETLINK to
the netlink subsystem, so that those can be used for netlink analyzers
resp. debuggers. We do not offer a direct callback function as out-of-tree
modules could do crap with it. Instead, a netdevice must be registered
properly and only receives a clone, managed by the netlink layer. Symbols
are exported as GPL-only.

Signed-off-by: Daniel Borkmann
Signed-off-by: David S. Miller

Daniel Borkmann
2013-06-25 07:39:05 +0800

11 Jun, 2013

1 commit

da12c90e0 netlink: Add compare function for netlink_table ... Browse Code »

As we know, netlink sockets are private resource of
net namespace, they can communicate with each other
only when they in the same net namespace. this works
well until we try to add namespace support for other
subsystems which use netlink.

Don't like ipv4 and route table.., it is not suited to
make these subsytems belong to net namespace, Such as
audit and crypto subsystems,they are more suitable to
user namespace.

So we must have the ability to make the netlink sockets
in same user namespace can communicate with each other.

This patch adds a new function pointer "compare" for
netlink_table, we can decide if the netlink sockets can
communicate with each other through this netlink_table
self-defined compare function.

The behavior isn't changed if we don't provide the compare
function for netlink_table.

Signed-off-by: Gao feng
Acked-by: Serge E. Hallyn
Signed-off-by: David S. Miller

Gao feng
2013-06-11 17:39:42 +0800

20 Apr, 2013

2 commits

f9c228883 netlink: implement memory mapped recvmsg() ... Browse Code »

Add support for mmap'ed recvmsg(). To allow the kernel to construct messages
into the mapped area, a dataless skb is allocated and the data pointer is
set to point into the ring frame. This means frames will be delivered to
userspace in order of allocation instead of order of transmission. This
usually doesn't matter since the order is either not determinable by
userspace or message creation/transmission is serialized. The only case
where this can have a visible difference is nfnetlink_queue. Userspace
can't assume mmap'ed messages have ordered IDs anymore and needs to check
this if using batched verdicts.

For non-mapped sockets, nothing changes.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2013-04-20 02:57:58 +0800
9652e931e netlink: add mmap'ed netlink helper functions ... Browse Code »

Add helper functions for looking up mmap'ed frame headers, reading and
writing their status, allocating skbs with mmap'ed data areas and a poll
function.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2013-04-20 02:57:57 +0800