Eric Lee / smarc-fsl-linux-kernel

08 Apr, 2015

1 commit

7026b1ddb netfilter: Pass socket pointer down through okfn().

On the output paths in particular, we have to sometimes deal with two
socket contexts. First, and usually skb->sk, is the local socket that
generated the frame.

And second, is potentially the socket used to control a tunneling
socket, such as one that encapsulates using UDP.

We do not want to disassociate skb->sk when encapsulating in order
to fix this, because that would break socket memory accounting.

The most extreme case where this can cause huge problems is an
AF_PACKET socket transmitting over a vxlan device. We hit code
paths doing checks that assume they are dealing with an ipv4
socket, but are actually operating upon the AF_PACKET one.
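
The two-socket situation described above can be sketched in userspace C. This is only a model under stated assumptions: `struct sock`, `struct sk_buff`, `okfn_t`, `ipv4_finish()` and `run_hook()` are simplified stand-ins written for illustration, not the kernel definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields used here. */
enum family { FAMILY_PACKET, FAMILY_INET };

struct sock { enum family family; };

struct sk_buff {
    struct sock *sk;   /* originating socket, kept for memory accounting */
};

/* Continuation callback that is handed the socket to act on explicitly. */
typedef int (*okfn_t)(struct sock *sk, struct sk_buff *skb);

/* Hypothetical IPv4-only step: it inspects the socket it was given
 * instead of skb->sk, which may belong to a different family. */
static int ipv4_finish(struct sock *sk, struct sk_buff *skb)
{
    (void)skb;
    return (sk != NULL && sk->family == FAMILY_INET) ? 0 : -1;
}

/* Hypothetical hook runner: forwards the caller-chosen socket to okfn. */
static int run_hook(struct sock *sk, struct sk_buff *skb, okfn_t okfn)
{
    return okfn(sk, skb);
}
```

In this model an AF_PACKET-style socket can stay attached to the buffer for accounting while the tunnel's own socket is passed down the call chain, so family-specific checks see the socket they expect.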

Signed-off-by: David S. Miller

David Miller
2015-04-08 03:25:55 +0800

07 Apr, 2015

1 commit

c85d6975e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/mellanox/mlx4/cmd.c
net/core/fib_rules.c
net/ipv4/fib_frontend.c

The fib_rules.c and fib_frontend.c conflicts were locking adjustments
in 'net' overlapping addition and removal of code in 'net-next'.

The mlx4 conflict was a bug fix in 'net' happening in the same
place a constant was being replaced with a more suitable macro.

Signed-off-by: David S. Miller

David S. Miller
2015-04-07 10:34:15 +0800

05 Apr, 2015

1 commit

238e54c9c netfilter: Make nf_hookfn use nf_hook_state.

Pass the nf_hook_state all the way down into the hook
functions themselves.

Signed-off-by: David S. Miller

David S. Miller
2015-04-05 00:31:38 +0800

03 Apr, 2015

1 commit

419df12fb net: move fib_rules_unregister() under rtnl lock

We have to hold rtnl lock for fib_rules_unregister()
otherwise the following race could happen:

fib_rules_unregister():             fib_nl_delrule():
    ...                                 ...
    ...                                 ops = lookup_rules_ops();
    list_del_rcu(&ops->list);
                                        list_for_each_entry(ops->rules) {
    fib_rules_cleanup_ops(ops);             ...
      list_del_rcu();                       list_del_rcu();
                                        }

Note, net->rules_mod_lock is actually not needed at all,
either upper layer netns code or rtnl lock guarantees
we are safe.

Cc: Alexander Duyck
Cc: Thomas Graf
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

WANG Cong
2015-04-03 08:52:34 +0800

10 Mar, 2015

1 commit

ddb3b6033 net: Remove protocol from struct dst_ops

After my change to neigh_hh_init to obtain the protocol from the
neigh_table there are no more users of protocol in struct dst_ops.
Remove the protocol field from dst_ops and all of its initializers.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2015-03-10 04:06:10 +0800

07 Mar, 2015

1 commit

aaa4e7040 DECnet: Only use neigh_ops for adding the link layer header

Other users of the neighbour table use neigh->output as the method
to decide when and which link-layer header to place on a packet.
DECnet has been using neigh->output to decide which DECnet headers to
place on a packet depending on which neighbour the packet is destined for.

The DECnet usage isn't totally wrong but it can run into problems if the
neighbour output function is run for a second time as the teql driver
and the bridge netfilter code can do.

Therefore, to avoid pathological problems later down the line and to make
the neighbour code easier to understand, refactor the decnet output
code to only use a neighbour method to add a link layer header to a
packet.

This is done by moving the neighbour operations lookup from
dn_to_neigh_output to dn_neigh_output_packet.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2015-03-07 03:54:22 +0800

04 Mar, 2015

2 commits

60395a20f neigh: Factor out ___neigh_lookup_noref

While looking at the mpls code I found myself writing yet another
version of neigh_lookup_noref. We currently have __ipv4_lookup_noref
and __ipv6_lookup_noref.

So to make my work a little easier and to make it a smidge easier to
verify/maintain the mpls code in the future I stopped and wrote
___neigh_lookup_noref. Then I rewrote __ipv4_lookup_noref and
__ipv6_lookup_noref in terms of this new function. I tested my new
version by verifying that the same code is generated in
ip_finish_output2 and ip6_finish_output2 where these functions are
inlined.

To get to ___neigh_lookup_noref I added a new neighbour cache table
function key_eq. So that the static size of the key would be
available.

I also added __neigh_lookup_noref for people who want to look up
a neighbour table entry quickly but don't know which neighbour table
they are going to look up.
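
The factoring described above can be sketched in userspace C as one generic lookup that takes the hash and key-comparison functions as arguments, with thin wrappers on top. This is a sketch under assumptions: `struct entry`, `struct table`, `lookup_generic()`, `lookup_v4()`, `lookup_any()` and the hash function are hypothetical names for illustration and do not mirror the kernel's neighbour structures or locking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBUCKETS 8u

struct entry {
    struct entry *next;
    const void *key;
};

struct table {
    uint32_t (*hash)(const void *key);
    bool (*key_eq)(const struct entry *e, const void *key);
    struct entry *buckets[NBUCKETS];
};

/* Generic lookup: hash and key_eq are parameters, so a wrapper that
 * passes fixed functions can be inlined with a constant key size. */
static inline struct entry *lookup_generic(
        struct table *t,
        bool (*key_eq)(const struct entry *, const void *),
        uint32_t (*hash)(const void *),
        const void *key)
{
    struct entry *e;

    for (e = t->buckets[hash(key) % NBUCKETS]; e != NULL; e = e->next)
        if (key_eq(e, key))
            return e;
    return NULL;
}

/* 32-bit key helpers (an IPv4-like address in this model). */
static uint32_t hash32(const void *key)
{
    uint32_t k;

    memcpy(&k, key, sizeof(k));
    return k * 2654435761u;
}

static bool key_eq32(const struct entry *e, const void *key)
{
    return memcmp(e->key, key, sizeof(uint32_t)) == 0;
}

/* Wrapper for callers that know the key type at compile time. */
static struct entry *lookup_v4(struct table *t, uint32_t addr)
{
    return lookup_generic(t, key_eq32, hash32, &addr);
}

/* Wrapper for callers that only know the table at run time. */
static struct entry *lookup_any(struct table *t, const void *key)
{
    return lookup_generic(t, t->key_eq, t->hash, key);
}

static void insert(struct table *t, struct entry *e)
{
    uint32_t b = t->hash(e->key) % NBUCKETS;

    e->next = t->buckets[b];
    t->buckets[b] = e;
}
```

The per-table `key_eq` pointer plays the role the message gives to the new table function: it lets the run-time wrapper compare keys without knowing their size in advance.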

Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2015-03-04 13:23:23 +0800

71a83a6db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/rocker/rocker.c

The rocker commit was two overlapping changes, one to rename
the ->vport member to ->pport, and another making the bitmask
expression use '1ULL' instead of plain '1'.

Signed-off-by: David S. Miller

David S. Miller
2015-03-04 10:16:48 +0800

03 Mar, 2015

2 commits

bdf53c584 neigh: Don't require dst in neigh_hh_init

- Add protocol to neigh_tbl so that dst->ops->protocol is not needed
- Acquire the device from neigh->dev

This results in a neigh_hh_init that will cache the same values
regardless of the packets flowing through it.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2015-03-03 05:43:41 +0800

1b7841404 net: Remove iocb argument from sendmsg and recvmsg

Now that TIPC no longer depends on the iocb argument in its internal
implementations of the sendmsg() and recvmsg() hooks defined in the proto
structure, nothing uses the iocb argument in them any more.
We can therefore drop the redundant iocb argument from all
implementations of sendmsg() and recvmsg() in the entire
networking stack.

Cc: Christoph Hellwig
Suggested-by: Al Viro
Signed-off-by: Ying Xue
Signed-off-by: David S. Miller

Ying Xue
2015-03-03 02:06:31 +0800

24 Feb, 2015

1 commit

46b9e4bb7 decnet: Fix obvious o/0 typo

Signed-off-by: Rasmus Villemoes
Signed-off-by: David S. Miller

Rasmus Villemoes
2015-02-24 04:28:50 +0800

19 Jan, 2015

1 commit

7b46a644a netlink: Fix bugs in nlmsg_end() conversions.

Commit 053c095a82cf ("netlink: make nlmsg_end() and genlmsg_end()
void") didn't catch all of the cases where callers were breaking out
on the return value being equal to zero, which they no longer should
when zero means success.

Fix all such cases.

Reported-by: Marcel Holtmann
Reported-by: Scott Feldman
Signed-off-by: David S. Miller

David S. Miller
2015-01-19 12:36:08 +0800

18 Jan, 2015

1 commit

053c095a8 netlink: make nlmsg_end() and genlmsg_end() void

Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.

This makes the very common pattern of

if (genlmsg_end(...) < 0) { ... }

be a whole bunch of dead code. Many places also simply do

return nlmsg_end(...);

and the caller is expected to deal with it.

This also commonly (at least for me) causes errors, because it is very
common to write

if (my_function(...))
/* error condition */

and if my_function() does "return nlmsg_end()" this is of course wrong.
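
The pitfall can be reproduced with a small userspace model. Everything here is an assumption made for illustration: `struct msg`, `HDR_LEN`, `end_msg_old()`, `end_msg()`, the two fill functions and `send_ok()` are hypothetical stand-ins, not the netlink API.

```c
#include <stddef.h>

#define HDR_LEN 16u   /* assumed fixed header size in this model */

struct msg { size_t len; };

/* Old style: finalize the message and return its length, always > 0. */
static int end_msg_old(struct msg *m, size_t payload)
{
    m->len = HDR_LEN + payload;
    return (int)m->len;
}

/* New style: finalize the message and return nothing. */
static void end_msg(struct msg *m, size_t payload)
{
    m->len = HDR_LEN + payload;
}

/* Fill function that forwards the old return value: "success" is 20. */
static int fill_old(struct msg *m)
{
    return end_msg_old(m, 4);
}

/* Fill function written against the void version: success is 0. */
static int fill_new(struct msg *m)
{
    end_msg(m, 4);
    return 0;
}

/* Caller using the common "non-zero means error" convention.
 * Returns 1 if it considers the message successfully built. */
static int send_ok(int (*fill)(struct msg *))
{
    struct msg m = { 0 };

    if (fill(&m))
        return 0;   /* treated as an error */
    return 1;
}
```

With `fill_old` the caller takes the error branch even though the message was built correctly; with `fill_new` the same caller behaves as intended.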

Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.

Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did

- return nlmsg_end(...);
+ nlmsg_end(...);
+ return 0;

I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with < 0 with no change in behaviour, so I opted for the more
efficient version.

One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for
Signed-off-by: David S. Miller

Johannes Berg
2015-01-18 14:03:45 +0800

06 Jan, 2015

1 commit

ea6976399 net: tcp: add RTAX_CC_ALGO fib handling

This patch adds the minimum necessary for the RTAX_CC_ALGO congestion
control metric to be set up and dumped back to user space.

While the internal representation of RTAX_CC_ALGO is handled as a u32
key, we avoided exposing this implementation detail to user space.
Instead, the netlink attribute exchanged with user space is the actual
congestion control algorithm name, as in the setsockopt(2) API, in
order to allow for maximum flexibility, even for 3rd party modules.

It is a bit unfortunate that RTAX_QUICKACK used up a whole RTAX slot, as
it should have been stored in RTAX_FEATURES instead. We first thought
about reusing it for the congestion control key, but that brings more
complications and/or confusion than it is worth.

Joint work with Florian Westphal.

Signed-off-by: Florian Westphal
Signed-off-by: Daniel Borkmann
Signed-off-by: David S. Miller

Daniel Borkmann
2015-01-06 11:55:24 +0800

24 Nov, 2014

2 commits

7eab8d9e8 new helper: memcpy_to_msg()

Signed-off-by: Al Viro

Al Viro
2014-11-24 17:28:51 +0800

6ce8e9ce5 new helper: memcpy_from_msg()

Signed-off-by: Al Viro

Al Viro
2014-11-24 17:28:48 +0800

12 Nov, 2014

1 commit

d7480fd3b neigh: remove dynamic neigh table registration support

Currently there are only three neigh tables in the whole kernel:
arp table, ndisc table and decnet neigh table. What's more,
we don't support registering multiple tables per family.
Therefore we can just make these tables statically built-in.

Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

WANG Cong
2014-11-12 04:23:54 +0800

23 Aug, 2014

3 commits

c0b802367 af_decnet: Use time_after_eq

The functions time_before, time_before_eq, time_after, and time_after_eq
are more robust for comparing jiffies against other values.

A simplified version of the Coccinelle semantic patch making this change
is as follows:

@change@
expression E1,E2,E3;
@@
- jiffies - E1 >= (E2*E3)
+ time_after_eq(jiffies, E1+E2*E3)
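
A userspace sketch of how such a comparison stays correct across counter wraparound. The helper names (`tick_after_eq`, `tick_before`, `naive_after_eq`) are hypothetical, and the sketch assumes a two's-complement platform where converting the unsigned difference to `long` wraps as expected.

```c
#include <limits.h>
#include <stdbool.h>

/* Wraparound-safe "a is at or after b": the unsigned difference is
 * reinterpreted as signed, which gives the right answer as long as the
 * two values are less than half the counter range apart. */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* Wraparound-safe "a is strictly before b". */
static bool tick_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/* Direct comparison of absolute values, shown only for contrast:
 * it gives the wrong answer once one of the values has wrapped. */
static bool naive_after_eq(unsigned long a, unsigned long b)
{
    return a >= b;
}
```

For example, with a counter at `ULONG_MAX - 5` and a deadline 10 ticks later (which wraps to 4), `tick_after_eq` reports that the deadline has not been reached, while the direct comparison of absolute values reports that it has.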

Signed-off-by: Himangi Saraogi
Acked-by: Julia Lawall
Signed-off-by: David S. Miller

Himangi Saraogi
2014-08-23 03:23:11 +0800

8b1b1eb52 decnet: Use time_after_eq

The functions time_before, time_before_eq, time_after, and time_after_eq
are more robust for comparing jiffies against other values.

A simplified version of the Coccinelle semantic patch making this change
is as follows:

@change@
expression E1,E2;
@@
- (jiffies - E1) >= E2
+ time_after_eq(jiffies, E1+E2)

Signed-off-by: Himangi Saraogi
Acked-by: Julia Lawall
Signed-off-by: David S. Miller

Himangi Saraogi
2014-08-23 03:23:11 +0800

b5c5c36d3 dn_dev: Use time_before

The functions time_before, time_before_eq, time_after, and time_after_eq
are more robust for comparing jiffies against other values.

A simplified version of the Coccinelle semantic patch making this change
is as follows:

@change@
expression E1,E2;
@@

(
- (jiffies - E1) < E2
+ time_before(jiffies, E1+E2)
)

Signed-off-by: Himangi Saraogi
Acked-by: Julia Lawall
Signed-off-by: David S. Miller

Himangi Saraogi
2014-08-23 03:23:11 +0800

24 May, 2014

1 commit

28448b804 net: Split sk_no_check into sk_no_check_{rx,tx}

Define separate fields in the sock structure for disabling
checksums on TX and RX: sk_no_check_tx and sk_no_check_rx.
The SO_NO_CHECK socket option only affects sk_no_check_tx. Also
remove the UDP_CSUM_* defines since they are no longer necessary.

Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller

Tom Herbert
2014-05-24 04:28:53 +0800

25 Apr, 2014

1 commit

90f62cf30 net: Use netlink_ns_capable to verify the permissions of netlink messages

By passing a netlink socket to a more privileged executable and then
fooling that executable into writing data to the socket that happens
to be a valid netlink message, it is possible to make the privileged
executable do something it did not intend to do.

To keep this from happening, replace bare capable and ns_capable calls
with netlink_capable, netlink_net_capable and netlink_ns_capable calls,
which act the same as the previous calls except that they verify that
the opener of the socket had the desired permissions as well.
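
The difference between the two kinds of check can be modelled in a few lines of userspace C. The types and functions below (`struct cred`, `struct nl_socket`, `capable_now()`, `socket_capable()`) are hypothetical simplifications for illustration, not the kernel's credential or netlink structures.

```c
#include <stdbool.h>

/* Minimal credential model: a single capability bit. */
struct cred { bool cap_net_admin; };

/* A socket remembers the credentials of whoever opened it. */
struct nl_socket { const struct cred *opener; };

/* Bare check: only the task currently writing is examined. */
static bool capable_now(const struct cred *writer)
{
    return writer->cap_net_admin;
}

/* Socket-aware check: the opener and the current writer must both
 * hold the capability. */
static bool socket_capable(const struct nl_socket *sk,
                           const struct cred *writer)
{
    return sk->opener->cap_net_admin && capable_now(writer);
}
```

In the attack scenario the opener is unprivileged and the writer is privileged: the bare check passes, while the socket-aware check refuses the request.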

Reported-by: Andy Lutomirski
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2014-04-25 01:44:54 +0800

16 Apr, 2014

1 commit

aad88724c ipv4: add a sock pointer to dst->output() path.

In the dst->output() path for ipv4, the code assumes the skb it has to
transmit is attached to an inet socket, specifically via
ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the
provider of the packet is an AF_PACKET socket.

The dst->output() method gets an additional 'struct sock *sk'
parameter. This needs a cascade of changes so that this parameter can
be propagated from vxlan to final consumer.

Fixes: 8f646c922d55 ("vxlan: keep original skb ownership")
Reported-by: lucien xin
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2014-04-16 01:47:15 +0800

12 Apr, 2014

1 commit

676d23690 net: Fix use after free by removing length arg from sk_data_ready callbacks.

Several spots in the kernel perform a sequence like:

skb_queue_tail(&sk->sk_receive_queue, skb);
sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up. So this skb->len access potentially
touches freed memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument. And since nobody actually cared about its
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.
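
The shape of the fix can be illustrated with a userspace model in which the callback takes only the socket. The names below (`struct buf`, `struct sock`, `data_ready_t`, `consume_all()`, `deliver()`) are hypothetical, and the queue is a plain singly linked list rather than the kernel's skb queue.

```c
#include <stddef.h>
#include <stdlib.h>

struct buf {
    struct buf *next;
    size_t len;
};

struct sock;

/* The callback receives only the socket, never a buffer length. */
typedef void (*data_ready_t)(struct sock *sk);

struct sock {
    struct buf *head;
    data_ready_t data_ready;
    size_t consumed;   /* total bytes seen by the consumer */
};

/* Consumer: drains the queue and frees every buffer it finds. */
static void consume_all(struct sock *sk)
{
    while (sk->head != NULL) {
        struct buf *b = sk->head;

        sk->head = b->next;
        sk->consumed += b->len;
        free(b);
    }
}

/* Producer: queue a buffer, then notify. Once the buffer is on the
 * queue it may be freed at any time, so it is not touched again. */
static int deliver(struct sock *sk, size_t len)
{
    struct buf *b = malloc(sizeof(*b));

    if (b == NULL)
        return -1;
    b->len = len;
    b->next = sk->head;
    sk->head = b;
    sk->data_ready(sk);   /* no b->len access after queueing */
    return 0;
}
```

Because the notification carries no length, the producer has no reason to read the buffer after handing it over, which is the property the commit relies on.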

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller

David S. Miller
2014-04-12 04:15:36 +0800

10 Feb, 2014

2 commits

ab3301bd9 net: Move prototype declaration to header file include/net/dn.h from net/decnet/af_decnet.c

Move prototype declaration of functions to header file include/net/dn.h
from net/decnet/af_decnet.c because they are used by more than one file.

This eliminates the following warning in net/decnet/af_decnet.c:
net/decnet/sysctl_net_decnet.c:354:6: warning: no previous prototype for ‘dn_register_sysctl’ [-Wmissing-prototypes]
net/decnet/sysctl_net_decnet.c:359:6: warning: no previous prototype for ‘dn_unregister_sysctl’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria
Reviewed-by: Josh Triplett
Signed-off-by: David S. Miller

Rashika Kheria
2014-02-10 09:32:49 +0800

f56b8bf6e net: Move prototype declaration to appropriate header file from decnet/af_decnet.c

Move prototype declaration of functions to header file include/net/dn_route.h
from net/decnet/af_decnet.c because it is used by more than one file.

This eliminates the following warning in net/decnet/dn_route.c:
net/decnet/dn_route.c:629:5: warning: no previous prototype for ‘dn_route_rcv’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria
Reviewed-by: Josh Triplett
Signed-off-by: David S. Miller

Rashika Kheria
2014-02-10 09:32:49 +0800

19 Jan, 2014

1 commit

342dfc306 net: add build-time checks for msg->msg_name size

This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
handler msg_name and msg_namelen logic").

DECLARE_SOCKADDR validates that the structure we use for writing the
name information to is not larger than the buffer which is reserved
for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
consistently in sendmsg code paths.
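
A userspace analogue of such a build-time size check can be written with C11 `_Static_assert`. This is a sketch only: `DECLARE_ADDR`, `MSG_NAME_MAX` and `fill_name()` are hypothetical names, and the macro does not reproduce the kernel's DECLARE_SOCKADDR definition.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Size reserved for the name buffer, taken from the text above. */
#define MSG_NAME_MAX 128

/* Declare a typed pointer into the name buffer and fail the build if
 * the address type could not fit into that buffer. */
#define DECLARE_ADDR(type, dst, src)                          \
    _Static_assert(sizeof(type) <= MSG_NAME_MAX,              \
                   "address type larger than name buffer");   \
    type *dst = (type *)(src)

/* Example writer: fills an IPv4 address into the name buffer.
 * The buffer must be suitably aligned for struct sockaddr_in. */
static unsigned short fill_name(void *msg_name, unsigned short port_net)
{
    DECLARE_ADDR(struct sockaddr_in, sin, msg_name);

    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = port_net;
    return sin->sin_port;
}
```

If someone later used the macro with a structure larger than `MSG_NAME_MAX`, the translation unit would stop compiling instead of overflowing the buffer at run time.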

Signed-off-by: Steffen Hurrle
Suggested-by: Hannes Frederic Sowa
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Steffen Hurrle
2014-01-19 15:04:16 +0800

15 Jan, 2014

1 commit

d4c5fba2f decnet: use __dev_get_by_index instead of dev_get_by_index to find interface

From the following call chain we can see that dn_cache_getroute() is
protected by rtnl_lock. So if we use __dev_get_by_index() instead
of dev_get_by_index() to find the interface in it, we avoid
changing the interface reference counter.

rtnetlink_rcv()
rtnl_lock()
netlink_rcv_skb()
dn_cache_getroute()
rtnl_unlock()
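
The trade-off between the two lookup styles can be modelled in userspace C. The names here (`struct netdev`, `dev_lookup_locked()`, `dev_lookup_ref()`, `dev_release()`) are hypothetical, and the lock itself is not modelled: the sketch simply assumes the caller of the non-counting variant already holds whatever lock protects the table.

```c
#include <stddef.h>

struct netdev {
    int ifindex;
    int refcnt;
};

static struct netdev *find_dev(struct netdev *tbl, size_t n, int ifindex)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].ifindex == ifindex)
            return &tbl[i];
    return NULL;
}

/* Non-counting lookup: valid only while the caller holds the lock
 * that keeps entries from disappearing. */
static struct netdev *dev_lookup_locked(struct netdev *tbl, size_t n,
                                        int ifindex)
{
    return find_dev(tbl, n, ifindex);
}

/* Counting lookup: takes a reference that must be dropped later. */
static struct netdev *dev_lookup_ref(struct netdev *tbl, size_t n,
                                     int ifindex)
{
    struct netdev *d = find_dev(tbl, n, ifindex);

    if (d != NULL)
        d->refcnt++;
    return d;
}

static void dev_release(struct netdev *d)
{
    d->refcnt--;
}
```

When the whole operation already runs under the lock, the non-counting variant saves the matching get/put pair and removes one way to leak a reference on an error path.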

Signed-off-by: Ying Xue
Signed-off-by: David S. Miller

Ying Xue
2014-01-15 10:50:46 +0800

20 Dec, 2013

1 commit

1669cb985 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2013-12-19

1) Use the user supplied policy index instead of a generated one
if present. From Fan Du.

2) Make xfrm migration namespace aware. From Fan Du.

3) Make the xfrm state and policy locks namespace aware. From Fan Du.

4) Remove ancient sleeping when the SA is in acquire state,
we now queue packets to the policy instead. This replaces the
sleeping code.

5) Remove FLOWI_FLAG_CAN_SLEEP. This was used to notify xfrm about the
possibility to sleep. The sleeping code is gone, so remove it.

6) Check user specified spi for IPComp. The spi for IPcomp is only
16 bit wide, so check for a valid value. From Fan Du.

7) Export verify_userspi_info to check for valid user supplied spi ranges
with pfkey and netlink. From Fan Du.

8) RFC3173 states that if the total size of a compressed payload and the IPComp
header is not smaller than the size of the original payload, the IP datagram
must be sent in the original non-compressed form. These packets are dropped
by the inbound policy check because they are not transformed. Document the need
to set 'level use' for IPcomp to receive such packets anyway. From Fan Du.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller

David S. Miller
2013-12-20 07:37:49 +0800

11 Dec, 2013

1 commit

9a32b8604 dn_dev: add support for IFA_FLAGS nl attribute

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2013-12-11 10:50:00 +0800

10 Dec, 2013

1 commit

1f9248e56 neigh: convert parms to an array

This patch converts the neigh param members to an array. This allows easier
manipulation which will be needed later on to provide better management of
default values.
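
A small userspace sketch of why an array makes this easier. The enum values, `struct parms`, the `PARM()` accessor and `parms_copy_defaults()` are hypothetical names chosen for illustration and are not the kernel's definitions.

```c
#include <stddef.h>

/* One index per tunable; VAR_MAX gives the array size. */
enum {
    VAR_MCAST_PROBES,
    VAR_UCAST_PROBES,
    VAR_RETRANS_TIME,
    VAR_MAX
};

struct parms {
    int data[VAR_MAX];
};

/* Accessor that keeps call sites as readable as named members. */
#define PARM(p, attr) ((p)->data[VAR_##attr])

/* With an array, applying defaults is a single loop rather than one
 * assignment per named member. */
static void parms_copy_defaults(struct parms *dst, const struct parms *def)
{
    for (size_t i = 0; i < VAR_MAX; i++)
        dst->data[i] = def->data[i];
}
```

Generic operations such as copying defaults, or tracking which values were set explicitly, can then iterate over indices instead of naming every field.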

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2013-12-10 09:56:12 +0800

06 Dec, 2013

1 commit

0e0d44ab4 net: Remove FLOWI_FLAG_CAN_SLEEP

FLOWI_FLAG_CAN_SLEEP was used to notify xfrm about the possibility
to sleep until the needed states are resolved. This code is gone,
so FLOWI_FLAG_CAN_SLEEP is not needed anymore.

Signed-off-by: Steffen Klassert

Steffen Klassert
2013-12-06 14:24:39 +0800

14 Oct, 2013

1 commit

795aa6ef6 netfilter: pass hook ops to hookfn

Pass the hook ops to the hookfn to allow for generic hook
functions. This change is required by nf_tables.

Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso

Patrick McHardy
2013-10-14 17:29:31 +0800

13 Jun, 2013

1 commit

fe2c6338f net: Convert uses of typedef ctl_table to struct ctl_table

Reduce the uses of this unnecessary typedef.

Done via perl script:

$ git grep --name-only -w ctl_table net | \
xargs perl -p -i -e '\
sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

Reflow the modified lines that now exceed 80 columns.

Signed-off-by: Joe Perches
Signed-off-by: David S. Miller

Joe Perches
2013-06-13 17:36:09 +0800

29 May, 2013

1 commit

351638e7d net: pass info struct via netdevice notifier

So far, only a net_device * could be passed along with a netdevice
notifier event. This patch makes it possible to pass a custom structure
that provides the info an event listener needs to know.
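
The idea can be sketched in userspace C as a common info header embedded at the start of event-specific structures. All names below (`struct notifier_info`, `struct change_info`, `info_to_dev()`, `listener()`, the `EVENT_*` constants) are hypothetical and only model the approach.

```c
#include <stddef.h>

#define EVENT_UP     1UL
#define EVENT_CHANGE 2UL

struct net_device { const char *name; };

/* Common header: every notification carries at least the device. */
struct notifier_info {
    struct net_device *dev;
};

/* Event-specific payload that extends the common header. */
struct change_info {
    struct notifier_info info;   /* must stay the first member */
    unsigned int flags_changed;
};

static struct net_device *info_to_dev(const struct notifier_info *info)
{
    return info->dev;
}

/* State recorded by the example listener. */
static const char *last_name;
static unsigned int last_changed;

/* Listener: reads the device through the common header and, for
 * EVENT_CHANGE only, the extra payload behind it. */
static int listener(unsigned long event, void *ptr)
{
    struct notifier_info *info = ptr;

    last_name = info_to_dev(info)->name;
    if (event == EVENT_CHANGE)
        last_changed = ((struct change_info *)ptr)->flags_changed;
    return 0;
}
```

Listeners that only care about the device keep working unchanged through the header, while listeners for a specific event can reach the additional fields.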

Signed-off-by: Jiri Pirko

v2->v3: fix typo on simeth
shortened dev_getter
shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller

Jiri Pirko
2013-05-29 04:11:01 +0800

08 Apr, 2013

1 commit

8303e699f decnet: remove duplicated include from dn_table.c

Remove duplicated include.

Signed-off-by: Wei Yongjun
Signed-off-by: David S. Miller

Wei Yongjun
2013-04-08 05:12:01 +0800

29 Mar, 2013

1 commit

573ce260b net-next: replace obsolete NLMSG_* with type safe nlmsg_*

Signed-off-by: Hong Zhiguo
Signed-off-by: David S. Miller

Hong zhi guo
2013-03-29 02:25:25 +0800

23 Mar, 2013

1 commit

2fa70df93 decnet: Move rtm_dn_policy to dn_route to make it available if !CONFIG_DECNET_ROUTER

Otherwise build fails with CONFIG_DECNET && !CONFIG_DECNET_ROUTER

Reported-by: kbuild test robot
Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller

Thomas Graf
2013-03-23 00:51:59 +0800

22 Mar, 2013

2 commits

661d2967b rtnetlink: Remove passing of attributes into rtnl_doit functions

With decnet converted, we can finally get rid of rta_buf and the
computations around it. It also gets rid of the minimal header
length verification since all message handlers do that explicitly
anyway.

Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller

Thomas Graf
2013-03-22 22:31:16 +0800

58d7d8f9b decnet: Parse netlink attributes on our own

decnet is the only subsystem left that is relying on the global
netlink attribute buffer rta_buf. It's horrible design and we
want to get rid of it.

This converts all of decnet to do implicit attribute parsing. It
also gets rid of the error prone struct dn_kern_rta.

Yes, the fib_magic() stuff is not pretty.

It's compile tested, but I need someone with appropriate hardware
to test the patch since I don't have access to it.

Cc: linux-decnet-user@lists.sourceforge.net
Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller

Thomas Graf
2013-03-22 22:31:16 +0800