Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

14 Nov, 2014

5 commits

5cf520370 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) sunhme driver lacks DMA mapping error checks, based upon a report by
Meelis Roos.

2) Fix memory leak in mvpp2 driver, from Sudip Mukherjee.

3) DMA memory allocation sizes are wrong in systemport ethernet driver,
fix from Florian Fainelli.

4) Fix use after free in mac80211 defragmentation code, from Johannes
Berg.

5) Some networking uapi headers missing from Kbuild file, from Stephen
Hemminger.

6) TUN driver gets csum_start offset wrong when VLAN accel is enabled,
and macvtap has a similar bug, from Herbert Xu.

7) Adjust several tunneling drivers to set dev->iflink after registration,
because registration sets it to -1, overwriting whatever we did. From
Steffen Klassert.

8) Geneve forgets to set inner tunneling type, causing GSO segmentation
to fail on some NICs. From Jesse Gross.

9) Fix several locking bugs in stmmac driver, from Fabrice Gasnier and
Giuseppe CAVALLARO.

10) Fix spurious timeouts with NewReno on low traffic connections, from
Marcelo Leitner.

11) Fix descriptor updates in enic driver, from Govindarajulu
Varadarajan.

12) PPP calls bpf_prog_create() with locks held, which isn't kosher.
Fix from Takashi Iwai.

13) Fix NULL deref in SCTP with malformed INIT packets, from Daniel
Borkmann.

14) psock_fanout selftest accesses past the end of the mmap ring, fix
from Shuah Khan.

15) Fix PTP timestamping for VLAN packets, from Richard Cochran.

16) netlink_unbind() calls in netlink pass wrong initial argument, from
Hiroaki SHIMODA.

17) vxlan socket reuse accidentally reuses a socket when the address
family is different, so we have to explicitly check this, from
Marcelo Leitner.

18) Fix missing include in nft_reject_bridge.c breaking the build on ppc
and other architectures, from Guenter Roeck.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
vxlan: Do not reuse sockets for a different address family
smsc911x: power-up phydev before doing a software reset.
lib: rhashtable - Remove weird non-ASCII characters from comments
net/smsc911x: Fix delays in the PHY enable/disable routines
net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode
netlink: Properly unbind in error conditions.
net: ptp: fix time stamp matching logic for VLAN packets.
cxgb4 : dcb open-lldp interop fixes
selftests/net: psock_fanout seg faults in sock_fanout_read_ring()
net: bcmgenet: apply MII configuration in bcmgenet_open()
net: bcmgenet: connect and disconnect from the PHY state machine
net: qualcomm: Fix dependency
ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx
net: phy: Correctly handle MII ioctl which changes autonegotiation.
ipv6: fix IPV6_PKTINFO with v4 mapped
net: sctp: fix memory leak in auth key management
net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
net: ppp: Don't call bpf_prog_create() in ppp_lock
net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set
cxgb4 : Fix bug in DCB app deletion
...

Linus Torvalds
2014-11-14 09:54:08 +0800
cc9f1f518 libceph: change from BUG to WARN for __remove_osd() asserts

No reason to use BUG_ON for osd request list assertions.

Signed-off-by: Ilya Dryomov
Reviewed-by: Alex Elder

Ilya Dryomov
2014-11-14 03:26:34 +0800
ba9d114ec libceph: clear r_req_lru_item in __unregister_linger_request()

kick_requests() can put linger requests on the notarget list. This
means we need to clear the much-overloaded req->r_req_lru_item in
__unregister_linger_request() as well, or we get an assertion failure
in ceph_osdc_release_request() - !list_empty(&req->r_req_lru_item).

AFAICT the assumption was that registered linger requests cannot be on
any of req->r_req_lru_item lists, but that's clearly not the case.

Signed-off-by: Ilya Dryomov
Reviewed-by: Alex Elder

Ilya Dryomov
2014-11-14 03:21:14 +0800
a390de020 libceph: unlink from o_linger_requests when clearing r_osd

Requests have to be unlinked from both osd->o_requests (normal
requests) and osd->o_linger_requests (linger requests) lists when
clearing req->r_osd. Otherwise __unregister_linger_request() gets
confused and we trip over a !list_empty(&osd->o_linger_requests)
assert in __remove_osd().

MON=1 OSD=1:

# cat remove-osd.sh
#!/bin/bash
rbd create --size 1 test
DEV=$(rbd map test)
ceph osd out 0
sleep 3
rbd map dne/dne # obtain a new osdmap as a side effect
rbd unmap $DEV & # will block
sleep 3
ceph osd in 0

Signed-off-by: Ilya Dryomov
Reviewed-by: Alex Elder

Ilya Dryomov
2014-11-14 03:21:13 +0800
aaef31703 libceph: do not crash on large auth tickets

Large (greater than 32k, the allocation size at PAGE_ALLOC_COSTLY_ORDER
with 4k pages) auth tickets will have their buffers vmalloc'ed, which
leads to the following crash in crypto:

[ 28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
[ 28.686032] IP: [] scatterwalk_pagedone+0x22/0x80
[ 28.686032] PGD 0
[ 28.688088] Oops: 0000 [#1] PREEMPT SMP
[ 28.688088] Modules linked in:
[ 28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
[ 28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 28.688088] Workqueue: ceph-msgr con_work
[ 28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
[ 28.688088] RIP: 0010:[] [] scatterwalk_pagedone+0x22/0x80
[ 28.688088] RSP: 0018:ffff8800d903f688 EFLAGS: 00010286
[ 28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
[ 28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
[ 28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
[ 28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
[ 28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
[ 28.688088] FS: 00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[ 28.688088] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
[ 28.688088] Stack:
[ 28.688088] ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
[ 28.688088] ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
[ 28.688088] ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
[ 28.688088] Call Trace:
[ 28.688088] [] scatterwalk_done+0x38/0x40
[ 28.688088] [] scatterwalk_done+0x38/0x40
[ 28.688088] [] blkcipher_walk_done+0x182/0x220
[ 28.688088] [] crypto_cbc_encrypt+0x15f/0x180
[ 28.688088] [] ? crypto_aes_set_key+0x30/0x30
[ 28.688088] [] ceph_aes_encrypt2+0x29c/0x2e0
[ 28.688088] [] ceph_encrypt2+0x93/0xb0
[ 28.688088] [] ceph_x_encrypt+0x4a/0x60
[ 28.688088] [] ? ceph_buffer_new+0x5d/0xf0
[ 28.688088] [] ceph_x_build_authorizer.isra.6+0x297/0x360
[ 28.688088] [] ? kmem_cache_alloc_trace+0x11b/0x1c0
[ 28.688088] [] ? ceph_auth_create_authorizer+0x36/0x80
[ 28.688088] [] ceph_x_create_authorizer+0x63/0xd0
[ 28.688088] [] ceph_auth_create_authorizer+0x54/0x80
[ 28.688088] [] get_authorizer+0x80/0xd0
[ 28.688088] [] prepare_write_connect+0x18b/0x2b0
[ 28.688088] [] try_read+0x1e59/0x1f10

This is because we set up crypto scatterlists as if all buffers were
kmalloc'ed. Fix it.

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2014-11-14 03:21:12 +0800

13 Nov, 2014

1 commit

6251edd93 netlink: Properly unbind in error conditions.

Even if netlink_kernel_cfg::unbind is implemented, the unbind() method is
not called, because cfg->unbind is omitted in __netlink_kernel_create().
Also fix the wrong argument to test_bit() and an off-by-one problem.

At this point, no unbind() method is implemented, so there is no real
issue.

Fixes: 4f520900522f ("netlink: have netlink per-protocol bind function return an error code.")
Signed-off-by: Hiroaki SHIMODA
Cc: Richard Guy Briggs
Acked-by: Richard Guy Briggs
Signed-off-by: David S. Miller

Hiroaki SHIMODA
2014-11-13 04:12:06 +0800

12 Nov, 2014

3 commits

5337b5b75 ipv6: fix IPV6_PKTINFO with v4 mapped

Use IS_ENABLED(CONFIG_IPV6) so that this code is also enabled when IPv6
is built as a module.

Signed-off-by: Eric Dumazet
Fixes: c8e6ad0829a7 ("ipv6: honor IPV6_PKTINFO with v4 mapped addresses on sendmsg")
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Eric Dumazet
2014-11-12 04:32:45 +0800
4184b2a79 net: sctp: fix memory leak in auth key management

A very minimal and simple user space application allocating an SCTP
socket, setting SCTP_AUTH_KEY setsockopt(2) on it and then closing
the socket again will leak the memory containing the authentication
key from user space:

unreferenced object 0xffff8800837047c0 (size 16):
comm "a.out", pid 2789, jiffies 4296954322 (age 192.258s)
hex dump (first 16 bytes):
01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[] kmemleak_alloc+0x4e/0xb0
[] __kmalloc+0xe8/0x270
[] sctp_auth_create_key+0x23/0x50 [sctp]
[] sctp_auth_set_key+0xa1/0x140 [sctp]
[] sctp_setsockopt+0xd03/0x1180 [sctp]
[] sock_common_setsockopt+0x14/0x20
[] SyS_setsockopt+0x71/0xd0
[] system_call_fastpath+0x12/0x17
[] 0xffffffffffffffff

This is bad for two reasons: we can bring down a machine from
user space when auth_enable=1, and we also leave security-sensitive
keying material in memory without clearing it after use. The issue is
that sctp_auth_create_key() already sets the refcount to 1, but after
allocation sctp_auth_set_key() takes an additional reference on it,
thus leaving the key around when we free the socket.
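The imbalance described above can be illustrated with a small user-space model. This is only a sketch in Python, not the kernel code: the class `AuthKey` and the helpers `set_key()` and `close_socket()` are hypothetical names, and the zeroing on the final put is an assumption that mirrors the "clearing it after use" concern in the message.

```python
# User-space model of the reference-count imbalance; not kernel code.
class AuthKey:
    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.refcnt = 1            # the allocator hands back one reference
        self.freed = False

    def hold(self) -> None:
        self.refcnt += 1

    def put(self) -> None:
        self.refcnt -= 1
        if self.refcnt == 0:
            # Assumed behaviour: wipe the key material before releasing it.
            for i in range(len(self.data)):
                self.data[i] = 0
            self.freed = True


def set_key(data: bytes, extra_hold: bool) -> AuthKey:
    key = AuthKey(data)
    if extra_hold:                 # the buggy path takes a second reference
        key.hold()
    return key


def close_socket(key: AuthKey) -> bool:
    key.put()                      # the socket drops the one reference it owns
    return key.freed


buggy_freed = close_socket(set_key(b"secret", extra_hold=True))
fixed_freed = close_socket(set_key(b"secret", extra_hold=False))
```

In this model `buggy_freed` is `False` (the key survives the socket with its contents intact) and `fixed_freed` is `True`.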

Fixes: 65b07e5d0d0 ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Signed-off-by: Daniel Borkmann
Cc: Vlad Yasevich
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Daniel Borkmann
2014-11-12 04:19:11 +0800
e40607cbe net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet

An SCTP server doing ASCONF will panic on malformed INIT ping-of-death
in the form of:

------------ INIT[PARAM: SET_PRIMARY_IP] ------------>

While the INIT chunk parameter verification dissects many things
in order to detect malformed input, it fails to check parameters
nested inside other parameters. E.g. RFC5061, section 4.2.4 proposes a
'set primary IP address' parameter in ASCONF, which has an address
parameter as a subparameter.

So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS
or SCTP_PARAM_IPV6_ADDRESS, param_type2af() will subsequently return 0
and thus sctp_get_af_specific() returns NULL, too, which we then happily
dereference unconditionally through af->from_addr_param().

The trace for the log:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
IP: [] sctp_process_init+0x492/0x990 [sctp]
PGD 0
Oops: 0000 [#1] SMP
[...]
Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs
RIP: 0010:[] [] sctp_process_init+0x492/0x990 [sctp]
[...]
Call Trace:

[] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp]
[] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp]
[] sctp_do_sm+0x71/0x1210 [sctp]
[] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp]
[] sctp_endpoint_bh_rcv+0x116/0x230 [sctp]
[] sctp_inq_push+0x56/0x80 [sctp]
[] sctp_rcv+0x982/0xa10 [sctp]
[] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
[] ? nf_iterate+0x69/0xb0
[] ? ip_local_deliver_finish+0x0/0x2d0
[] ? nf_hook_slow+0x76/0x120
[] ? ip_local_deliver_finish+0x0/0x2d0
[...]

A minimal way to address this is to check for NULL as we do on all
other such occasions where we know sctp_get_af_specific() could
possibly return with NULL.
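As a rough illustration of the lookup chain and the added check, here is a user-space sketch in Python. It is not the kernel implementation: the handler table `AF_HANDLERS` and the function `process_set_primary()` are hypothetical, and only the names `param_type2af` and `get_af_specific` echo the functions mentioned above. The two parameter type values are the IPv4 and IPv6 address parameter types from RFC 4960.

```python
import socket

SCTP_PARAM_IPV4_ADDRESS = 5    # address parameter types from RFC 4960
SCTP_PARAM_IPV6_ADDRESS = 6


def param_type2af(param_type: int) -> int:
    """Map an address parameter type to an address family, 0 if unknown."""
    if param_type == SCTP_PARAM_IPV4_ADDRESS:
        return socket.AF_INET
    if param_type == SCTP_PARAM_IPV6_ADDRESS:
        return socket.AF_INET6
    return 0


# Hypothetical per-family handlers standing in for af->from_addr_param().
AF_HANDLERS = {
    socket.AF_INET: lambda raw: ("ipv4", raw),
    socket.AF_INET6: lambda raw: ("ipv6", raw),
}


def get_af_specific(family: int):
    """Return the handler for a family, or None when there is none."""
    return AF_HANDLERS.get(family)


def process_set_primary(param_type: int, raw: bytes):
    af = get_af_specific(param_type2af(param_type))
    if af is None:
        return None                # the added check: reject instead of crashing
    return af(raw)
```

With the check in place, `process_set_primary(5, raw)` returns the parsed address, while any other subparameter type yields `None` rather than a call through a missing handler.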

Fixes: d6de3097592b ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
Signed-off-by: Daniel Borkmann
Cc: Vlad Yasevich
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Daniel Borkmann
2014-11-12 04:19:10 +0800

11 Nov, 2014

1 commit

cfdf1e1ba udptunnel: Add SKB_GSO_UDP_TUNNEL during gro_complete.

When doing GRO processing for UDP tunnels, we never add
SKB_GSO_UDP_TUNNEL to gso_type - only the type of the inner protocol
is added (such as SKB_GSO_TCPV4). The result is that if the packet is
later resegmented we will do GSO but not treat it as a tunnel. This
results in UDP fragmentation of the outer header instead of (i.e.) TCP
segmentation of the inner header as was originally on the wire.
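The effect of the missing flag can be sketched with a toy model. This is an illustration in Python only: the bit values in `Gso` are made up and do not match the kernel's SKB_GSO_* constants, and `gro_complete()` and `resegment()` are hypothetical stand-ins for the real code paths.

```python
from enum import IntFlag


class Gso(IntFlag):
    # Hypothetical bit values, chosen only for this illustration.
    TCPV4 = 1
    UDP_TUNNEL = 2


def gro_complete(inner: Gso, patched: bool) -> Gso:
    """Return the gso_type recorded after GRO of a UDP-tunnelled packet."""
    if patched:
        return inner | Gso.UDP_TUNNEL
    return inner                   # before the fix: inner protocol type only


def resegment(gso_type: Gso) -> str:
    """Pick the segmentation strategy from the recorded gso_type."""
    if gso_type & Gso.UDP_TUNNEL:
        return "segment inner TCP"
    return "fragment outer UDP"
```

In this model the unpatched path ends in `"fragment outer UDP"` and the patched path in `"segment inner TCP"`, which is the difference the commit message describes.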

Signed-off-by: Jesse Gross
Signed-off-by: David S. Miller

Jesse Gross
2014-11-11 04:09:45 +0800

07 Nov, 2014

2 commits

1f5623106 Merge tag 'master-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
pull request: wireless 2014-11-06

Please pull this batch of fixes intended for the 3.18 stream...

For the mac80211 bits, Johannes says:

"This contains another small set of fixes for 3.18, these are all
over the place and most of the bugs are old, one even dates back
to the original mac80211 we merged into the kernel."

For the iwlwifi bits, Emmanuel says:

"I fix here two issues that are related to the firmware
loading flow. A user reported that he couldn't load the
driver because the rfkill line was pulled up while we
were running the calibrations. This was happening while
booting the system: systemd was restoring the "disable
wifi settings" and that raised an RFKILL interrupt during
the calibration. Our driver didn't handle that properly
and this is now fixed."

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller

David S. Miller
2014-11-07 11:15:20 +0800
b31f65fb4 net: dsa: slave: Fix autoneg for phys on switch MDIO bus

When the ports' PHYs are connected to the switch's internal MDIO bus,
we need to connect the PHY to the slave netdev; otherwise
auto-negotiation etc. does not work.

Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller

Andrew Lunn
2014-11-07 04:06:28 +0800

06 Nov, 2014

3 commits

1f37bf87a tcp: zero retrans_stamp if all retrans were acked

Ueki Kohei reported that when we are using NewReno on connections with
very low traffic, we may time out the connection too early if a
second loss occurs after the first one was successfully acked but no
data was transferred afterwards. Below is his description of it:

When SACK is disabled, and a socket suffers multiple separate TCP
retransmissions, that socket's ETIMEDOUT value is calculated from the
time of the *first* retransmission instead of the *latest*
retransmission.

This happens because the tcp_sock's retrans_stamp is set once then never
cleared.

Take the following connection:

             Linux                  remote-machine
               |                           |
send#1---->(*1)|--------> data#1 --------->|
         |     |                           |
        RTO    :                           :
         |     |                           |
        ---(*2)|----> data#1(retrans) ---->|
         | (*3)|(*4)|--------> data#2 --------->|
         |     |                           |
        RTO    :                           :
         |     |                           |
        ---(*5)|----> data#2(retrans) ---->|
         |     |                           |
         |     |                           |
       RTO*2   :                           :
         |     |                           |
         |     |                           |
ETIMEDOUT

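The timing problem can be reproduced with a small model. The following Python sketch is not the kernel's TCP code: the class `Conn`, the abstract time units and the default `gap` and `limit` values are all hypothetical, and only the field name `retrans_stamp` comes from the description above.

```python
class Conn:
    """Toy connection that tracks when the current loss episode began."""

    def __init__(self, clear_on_ack: bool):
        self.retrans_stamp = 0          # 0 means no retransmission outstanding
        self.clear_on_ack = clear_on_ack

    def retransmit(self, now: int) -> None:
        if self.retrans_stamp == 0:     # only the first retransmission stamps
            self.retrans_stamp = now

    def all_retrans_acked(self) -> None:
        if self.clear_on_ack:           # the fix: forget the old episode
            self.retrans_stamp = 0

    def timed_out(self, now: int, limit: int) -> bool:
        return self.retrans_stamp != 0 and now - self.retrans_stamp >= limit


def second_loss_times_out(clear_on_ack: bool, gap: int = 1000,
                          limit: int = 900) -> bool:
    conn = Conn(clear_on_ack)
    conn.retransmit(now=1)              # first loss
    conn.all_retrans_acked()            # its retransmission is acked
    conn.retransmit(now=1 + gap)        # second loss, long after the first
    return conn.timed_out(now=2 + gap, limit=limit)
```

Without clearing, `second_loss_times_out(False)` is `True` because the timeout is measured from the first retransmission; with clearing, `second_loss_times_out(True)` is `False`.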
Cc: Yuchung Cheng
Signed-off-by: Marcelo Ricardo Leitner
Acked-by: Neal Cardwell
Tested-by: Neal Cardwell
Signed-off-by: David S. Miller

Marcelo Leitner
2014-11-06 05:59:49 +0800
d3ca9eafc geneve: Unregister pernet subsys on module unload.

The pernet ops are never unregistered, which causes a memory
leak and an oops if the module is ever reinserted.

Fixes: 0b5e8b8eeae4 ("net: Add Geneve tunneling protocol driver")
CC: Andy Zhou
Signed-off-by: Jesse Gross
Acked-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Jesse Gross
2014-11-06 04:00:51 +0800
45cac46e5 geneve: Set GSO type on transmit.

Geneve does not currently set the inner protocol type when
transmitting packets. This causes GSO segmentation to fail on NICs
that do not support Geneve offloading.

CC: Andy Zhou
Signed-off-by: Jesse Gross
Signed-off-by: David S. Miller

Jesse Gross
2014-11-06 04:00:51 +0800

05 Nov, 2014

1 commit

0c9a67c8f Merge tag 'mac80211-for-john-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg <johannes@sipsolutions.net> says:

"This contains another small set of fixes for 3.18, these are all
over the place and most of the bugs are old, one even dates back
to the original mac80211 we merged into the kernel."

Signed-off-by: John W. Linville <linville@tuxdriver.com>

John W. Linville
2014-11-05 04:56:33 +0800

04 Nov, 2014

6 commits

ce1928da8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull ceph fixes from Sage Weil:
"There is a GFP flag fix from Mike Christie, an error code fix from
Jan, and fixes for two unnecessary allocations (kmalloc and workqueue)
from Ilya. All are well tested.

Ilya has one other fix on the way but it didn't get tested in time"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: eliminate unnecessary allocation in process_one_ticket()
rbd: Fix error recovery in rbd_obj_read_sync()
libceph: use memalloc flags for net IO
rbd: use a single workqueue for all devices

Linus Torvalds
2014-11-04 07:04:26 +0800
f03eb128e gre6: Move the setting of dev->iflink into the ndo_init functions.

Otherwise it gets overwritten by register_netdev().

Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2014-11-04 04:42:24 +0800
ebe084aaf sit: Use ipip6_tunnel_init as the ndo_init function.

ipip6_tunnel_init() sets the dev->iflink via a call to
ipip6_tunnel_bind_dev(). After that, register_netdevice()
sets dev->iflink = -1. So we lose the iflink configuration
for ipv6 tunnels. Fix this by using ipip6_tunnel_init() as the
ndo_init function. Then ipip6_tunnel_init() is called after
dev->iflink is set to -1 from register_netdevice().
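The ordering issue shared by these tunnel fixes can be shown with a minimal model. This Python sketch is not kernel code: `NetDev`, `tunnel_init()` and `LOWER_IFINDEX` are hypothetical, and `register_netdevice()` here models only the two steps the message mentions (resetting iflink to -1, then calling ndo_init).

```python
LOWER_IFINDEX = 7                  # hypothetical index of the underlying device


class NetDev:
    def __init__(self, ndo_init=None):
        self.iflink = 0
        self.ndo_init = ndo_init   # optional init callback


def register_netdevice(dev: NetDev) -> None:
    dev.iflink = -1                # registration resets the link index
    if dev.ndo_init is not None:
        dev.ndo_init(dev)          # ndo_init runs after the reset


def tunnel_init(dev: NetDev) -> None:
    dev.iflink = LOWER_IFINDEX     # bind the tunnel to its lower device


def old_flow() -> int:
    dev = NetDev()
    tunnel_init(dev)               # configured first ...
    register_netdevice(dev)        # ... then overwritten with -1
    return dev.iflink


def new_flow() -> int:
    dev = NetDev(ndo_init=tunnel_init)
    register_netdevice(dev)        # init now runs after the reset
    return dev.iflink
```

Here `old_flow()` returns -1 and `new_flow()` returns 7, so the configuration survives only when the init function is invoked as the ndo_init callback.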

Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2014-11-04 04:42:24 +0800
16a0231bf vti6: Use vti6_dev_init as the ndo_init function.

vti6_dev_init() sets the dev->iflink via a call to
vti6_link_config(). After that, register_netdevice()
sets dev->iflink = -1. So we lose the iflink configuration
for vti6 tunnels. Fix this by using vti6_dev_init() as the
ndo_init function. Then vti6_dev_init() is called after
dev->iflink is set to -1 from register_netdevice().

Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2014-11-04 04:42:24 +0800
6c6151daa ip6_tunnel: Use ip6_tnl_dev_init as the ndo_init function.

ip6_tnl_dev_init() sets the dev->iflink via a call to
ip6_tnl_link_config(). After that, register_netdevice()
sets dev->iflink = -1. So we lose the iflink configuration
for ipv6 tunnels. Fix this by using ip6_tnl_dev_init() as the
ndo_init function. Then ip6_tnl_dev_init() is called after
dev->iflink is set to -1 from register_netdevice().

Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2014-11-04 04:42:24 +0800
c1207c049 netfilter: nft_reject_bridge: Fix powerpc build error

Fix:
net/bridge/netfilter/nft_reject_bridge.c:
In function 'nft_reject_br_send_v6_unreach':
net/bridge/netfilter/nft_reject_bridge.c:240:3:
error: implicit declaration of function 'csum_ipv6_magic'
csum_ipv6_magic(&nip6h->saddr, &nip6h->daddr,
^
make[3]: *** [net/bridge/netfilter/nft_reject_bridge.o] Error 1

Seen with powerpc:allmodconfig.

Fixes: 523b929d5446 ("netfilter: nft_reject_bridge: don't use IP stack to reject traffic")
Cc: Pablo Neira Ayuso
Signed-off-by: Guenter Roeck
Signed-off-by: David S. Miller

Guenter Roeck
2014-11-04 01:12:34 +0800

03 Nov, 2014

2 commits

b8fff407a mac80211: fix use-after-free in defragmentation

Upon receiving the last fragment, all but the first fragment
are freed, but the multicast check for statistics at the end
of the function refers to the current skb (the last fragment)
causing a use-after-free bug.

Since multicast frames cannot be fragmented and we check for
this early in the function, just modify that check to also
do the accounting to fix the issue.

Cc: stable@vger.kernel.org
Reported-by: Yosef Khyal
Signed-off-by: Johannes Berg

Johannes Berg
2014-11-03 21:28:50 +0800
4cb8c3593 irda: stop calling sk_prot->disconnect() on connection failure

The sk_prot is irda's own set of protocol handlers, so irda should
statically know what that function is anyway, without using an indirect
pointer. And as it happens, we know *exactly* what that pointer is
statically: it's NULL, because irda doesn't define a disconnect
operation.

So calling that function is doubly wrong, and will just cause an oops.

Reported-by: Martin Lang
Cc: Samuel Ortiz
Cc: David Miller
Signed-off-by: Linus Torvalds

Linus Torvalds
2014-11-03 02:20:26 +0800

01 Nov, 2014

5 commits

e9226d7c9 libceph: eliminate unnecessary allocation in process_one_ticket()

Commit c27a3e4d667f ("libceph: do not hard code max auth ticket len"),
while fixing a buffer overflow, tried to keep as much of the
surrounding code unchanged as possible and introduced an unnecessary
kmalloc() in the unencrypted ticket path. It is likely to fail on huge
tickets, so get rid of it.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2014-11-01 04:43:08 +0800
e0fb6fb6d net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0

If a driver supports reading EEPROM but no EEPROM is installed in the system,
the driver's get_eeprom_len function returns 0. ethtool will subsequently
try to read that zero-length EEPROM anyway. If the driver does not support
EEPROM access at all, this operation will return -EOPNOTSUPP. If the driver
does support EEPROM access but no EEPROM is installed, the operation will
return -EINVAL. Return -EOPNOTSUPP in both cases for consistency.
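The two cases and the unified result can be written down as a small decision function. This is a Python sketch of the behaviour described above, not the ethtool core code; `get_eeprom()` and its parameters are hypothetical names.

```python
import errno


def get_eeprom(supports_eeprom: bool, eeprom_len: int, patched: bool) -> int:
    """Return 0 on success or a negative errno, as a kernel function would."""
    if not supports_eeprom:
        return -errno.EOPNOTSUPP       # driver has no EEPROM access at all
    if eeprom_len == 0:
        # Driver supports EEPROM access, but no EEPROM is installed.
        return -errno.EOPNOTSUPP if patched else -errno.EINVAL
    return 0
```

Before the change the zero-length case yields `-EINVAL`; afterwards both "not supported" situations report `-EOPNOTSUPP`.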

Signed-off-by: Guenter Roeck
Tested-by: Andrew Lunn
Signed-off-by: David S. Miller

Guenter Roeck
2014-11-01 04:12:34 +0800
de05c400f mpls: Allow mpls_gso to be built as module

Kconfig already allows mpls to be built as a module. The following patch
fixes the Makefile to do the same.

CC: Simon Horman
Signed-off-by: Pravin B Shelar
Acked-by: Simon Horman
Signed-off-by: David S. Miller

Pravin B Shelar
2014-11-01 03:47:21 +0800
f7065f4bd mpls: Fix mpls_gso handler.

The MPLS GSO handler needs to pull the skb after segmenting it.

CC: Simon Horman
Signed-off-by: Pravin B Shelar
Acked-by: Simon Horman
Signed-off-by: David S. Miller

Pravin B Shelar
2014-11-01 03:47:21 +0800
e3a88f9c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
netfilter/ipvs fixes for net

The following patchset contains fixes for netfilter/ipvs. This round of
fixes is larger than usual at this stage, specifically because of the
nf_tables bridge reject fixes that I would like to see in 3.18. The
patches are:

1) Fix a null-pointer dereference that may occur when logging
errors. This problem was introduced by 4a4739d56b0 ("ipvs: Pull
out crosses_local_route_boundary logic") in v3.17-rc5.

2) Update hook mask in nft_reject_bridge so we can also filter out
packets from there. This fixes 36d2af5 ("netfilter: nf_tables: allow
to filter from prerouting and postrouting"), which needs this chunk
to work.

3) Two patches to refactor common code to forge the IPv4 and IPv6
reject packets from the bridge. These are required by the nf_tables
reject bridge fix.

4) Fix nft_reject_bridge by avoiding the use of the IP stack to reject
packets from the bridge. The idea is to forge the reject packets and
inject them to the original port via br_deliver() which is now
exported for that purpose.

5) Restrict nft_reject_bridge to bridge prerouting and input hooks.
The original skbuff may be cloned after prerouting when the bridge stack
needs to flood it to several bridge ports; by then it is too late to
reject the traffic.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-11-01 00:29:42 +0800

31 Oct, 2014

9 commits

127917c29 netfilter: nft_reject_bridge: restrict reject to prerouting and input

Restrict the reject expression to the prerouting and input bridge
hooks. If we allowed this to be used from forward or any later
bridge hook and the frame were flooded to several ports, we would end up
sending several reject packets, one per cloned packet.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-31 19:50:09 +0800
523b929d5 netfilter: nft_reject_bridge: don't use IP stack to reject traffic

If the packet is received via the bridge stack, this cannot reject
packets from the IP stack.

This adds functions to build the reject packet and send it from the
bridge stack. Comments and assumptions on this patch:

1) Validate the IPv4 and IPv6 headers before further processing.
Given that the packet comes from the bridge stack, we cannot assume
they are clean. Truncated packets are dropped; we follow an approach
similar to the existing iptables match/target extensions that need
to inspect layer 4 headers that are not available. This also includes
packets that are directed to multicast and broadcast ethernet
addresses.

2) br_deliver() is exported to inject the reject packet via
bridge localout -> postrouting. So the approach is similar to what
we already do in the iptables reject target. The reject packet is
sent to the bridge port from which we have received the original
packet.

3) The reject packet is forged based on the original packet. The TTL
is set based on sysctl_ip_default_ttl for IPv4 and per-net
ipv6.devconf_all hoplimit for IPv6.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-31 19:50:08 +0800
8bfcdf667 netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions

The smaller functions can be reused by the reject bridge expression to
build the reject packet. The new functions are:

* nf_reject_ip6_tcphdr_get(): to sanitize and to obtain the TCP header.
* nf_reject_ip6hdr_put(): to build the IPv6 header.
* nf_reject_ip6_tcphdr_put(): to build the TCP header.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-31 19:49:57 +0800
052b9498e netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions

The smaller functions can be reused by the reject bridge expression to
build the reject packet. The new functions are:

* nf_reject_ip_tcphdr_get(): to sanitize and to obtain the TCP header.
* nf_reject_iphdr_put(): to build the IPv4 header.
* nf_reject_ip_tcphdr_put(): to build the TCP header.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-31 19:49:05 +0800
4d87716cd netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing

Fixes: 36d2af5 ("netfilter: nf_tables: allow to filter from prerouting and postrouting")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-31 19:44:56 +0800
5188cd44c drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets

UFO is now disabled on all drivers that work with virtio net headers,
but userland may try to send UFO/IPv6 packets anyway. Instead of
sending with ID=0, we should select identifiers on their behalf (as we
used to).

Signed-off-by: Ben Hutchings
Fixes: 916e4cf46d02 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data")
Signed-off-by: David S. Miller

Ben Hutchings
2014-10-31 08:01:18 +0800
39bb5e628 net: skb_fclone_busy() needs to detect orphaned skb

Some drivers are unable to perform TX completions in a bounded time.
They instead call skb_orphan().

The problem is that skb_fclone_busy() has to detect this case; otherwise
we block TCP retransmits and can freeze unlucky TCP sessions on
mostly idle hosts.

Signed-off-by: Eric Dumazet
Fixes: 1f3279ae0c13 ("tcp: avoid retransmits of TCP packets hanging in host queues")
Signed-off-by: David S. Miller

Eric Dumazet
2014-10-31 07:58:30 +0800
14051f045 gre: Use inner mac length when computing tunnel length

Currently, skb_inner_network_header is used but this does not account
for Ethernet header for ETH_P_TEB. Use skb_inner_mac_header which
handles TEB and also should work with IP encapsulation in which case
inner mac and inner network headers are the same.
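A worked example with concrete offsets may make the difference clearer. The Python sketch below is only arithmetic under stated assumptions: an outer Ethernet header of 14 bytes, an outer IPv4 header of 20 bytes without options, and a base GRE header of 4 bytes without optional fields. The function `tunnel_len()` is a hypothetical helper that subtracts the outer transport header offset from the chosen inner header offset.

```python
ETH_HLEN = 14                      # Ethernet header length
IPV4_HLEN = 20                     # IPv4 header without options (assumed)
GRE_HLEN = 4                       # base GRE header, no optional fields (assumed)


def tunnel_len(outer_transport_off: int, inner_off: int) -> int:
    """Bytes between the outer transport (GRE) header and an inner header."""
    return inner_off - outer_transport_off


# Offsets from the start of the frame for an Ethernet-in-GRE (TEB) packet.
outer_transport = ETH_HLEN + IPV4_HLEN            # GRE header starts here
inner_mac_teb = outer_transport + GRE_HLEN        # inner Ethernet header
inner_net_teb = inner_mac_teb + ETH_HLEN          # inner IP header

old_teb = tunnel_len(outer_transport, inner_net_teb)   # counts inner Ethernet
new_teb = tunnel_len(outer_transport, inner_mac_teb)   # GRE header only

# For plain IP-in-GRE the inner mac and inner network offsets coincide.
inner_ip = outer_transport + GRE_HLEN
old_ip = tunnel_len(outer_transport, inner_ip)
new_ip = tunnel_len(outer_transport, inner_ip)
```

Under these assumptions the inner-network-based length for TEB is 18 bytes, 14 more than the 4-byte GRE header, while for IP encapsulation both computations give 4.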

Tested: Ran TCP_STREAM over GRE, worked as expected.

Signed-off-by: Tom Herbert
Acked-by: Alexander Duyck
Signed-off-by: David S. Miller

Tom Herbert
2014-10-31 07:51:56 +0800
fa19c2b05 ipv4: Do not cache routing failures due to disabled forwarding.

If we cache them, the kernel will reuse them, independently of
whether forwarding is enabled or not. Which means that if forwarding is
disabled on the input interface where the first routing request comes
from, then that unreachable result will be cached and reused for
other interfaces, even if forwarding is enabled on them. The opposite
is also true.

This can be verified with two input interfaces A and B and an output
interface C, where B has forwarding enabled but A does not, by running:
ip route get $dst iif A from $src && ip route get $dst iif B from $src
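The caching problem can be modelled in a few lines. This Python sketch is a deliberately simplified illustration, not the kernel's routing code: the cache keyed only by destination, the function `route_lookup()` and the result strings are all hypothetical.

```python
def route_lookup(dst, iif, forwarding, cache, patched):
    """Look up a route for dst arriving on iif, using a shared cache."""
    if patched and not forwarding[iif]:
        return "unreachable"       # decided per interface, never cached
    if dst in cache:
        return cache[dst]          # shared result ignores the input interface
    result = "forward via C" if forwarding[iif] else "unreachable"
    cache[dst] = result
    return result


def lookup_sequence(order, patched):
    """Run lookups for the same destination from the given interfaces."""
    forwarding = {"A": False, "B": True}
    cache = {}
    return [route_lookup("dst", iif, forwarding, cache, patched)
            for iif in order]
```

In the unpatched model, `lookup_sequence(["A", "B"], False)` gives two "unreachable" results even though B forwards, and the reverse order gives two "forward via C" results even though A does not. With `patched=True`, each interface gets the result that matches its own forwarding setting.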

Signed-off-by: Nicolas Cavallari
Reviewed-by: Julian Anastasov
Signed-off-by: David S. Miller

Nicolas Cavallari
2014-10-31 07:20:40 +0800

30 Oct, 2014

2 commits

46238845b mac80211: properly flush delayed scan work on interface removal

When an interface is deleted, an ongoing hardware scan is canceled and
the driver must abort the scan, at the very least reporting completion
while the interface is removed.

However, if it scheduled the work, that work might only run after
everything is said and done, which leads to cfg80211 warning that the
scan isn't reported as finished yet. This is no fault of the driver: it
already reported completion, but mac80211 hasn't processed it.

To fix this situation, flush the delayed work when the interface being
removed is the one that was executing the scan.

Cc: stable@vger.kernel.org
Reported-by: Sujith Manoharan
Tested-by: Sujith Manoharan
Signed-off-by: Johannes Berg

Johannes Berg
2014-10-30 22:48:32 +0800
89baaa570 libceph: use memalloc flags for net IO

This patch has ceph's lib code use the memalloc flags.

If the VM layer needs to write data out to free up memory to handle new
allocation requests, the block layer must be able to make forward progress.
To handle that requirement we use structs like mempools to reserve memory for
objects like bios and requests.

The problem is that when we send/receive block layer requests over the
network layer, net skb allocations can fail and the system can lock up.
To solve this, the memalloc related flags were added. NBD, iSCSI
and NFS use these flags to tell the network/vm layer that it should
use memory reserves to fulfill allocation requests for structs like
skbs.

I am running ceph in a bunch of VMs on my laptop, so this patch was
not tested very harshly.

Signed-off-by: Mike Christie
Reviewed-by: Ilya Dryomov

Mike Christie
2014-10-30 18:11:50 +0800