Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

28 Oct, 2014

2 commits

3d53666b4 ipvs: Avoid null-pointer deref in debug code ... Browse Code »

Use daddr instead of reaching into dest.

Reported-by: Dan Carpenter
Signed-off-by: Alex Gartrell
Signed-off-by: Simon Horman

Alex Gartrell
2014-10-28 08:48:31 +0800
7965ee937 netfilter: nft_compat: fix wrong target lookup in nft_target_select_ops() ... Browse Code »
5

The code looks for an already loaded target, and the correct list to search
is nft_target_list, not nft_match_list.

Signed-off-by: Arturo Borrero Gonzalez
Signed-off-by: Pablo Neira Ayuso

Arturo Borrero
2014-10-28 05:17:46 +0800

24 Oct, 2014

3 commits

b51d3fa36 netfilter: nf_log: release skbuff on nlmsg put failure ... Browse Code »
5

The kernel should reserve enough room in the skb so that the DONE
message can always be appended. However, in case of e.g. new attribute
erronously not being size-accounted for, __nfulnl_send() will still
try to put next nlmsg into this full skbuf, causing the skb to be stuck
forever and blocking delivery of further messages.

Fix issue by releasing skb immediately after nlmsg_put error and
WARN() so we can track down the cause of such size mismatch.

[ fw@strlen.de: add tailroom/len info to WARN ]

Signed-off-by: Houcheng Lin
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Houcheng Lin
2014-10-24 20:34:11 +0800
c1e7dc91e netfilter: nfnetlink_log: fix maximum packet length logged to userspace ... Browse Code »
5

don't try to queue payloads > 0xffff - NLA_HDRLEN, it does not work.
The nla length includes the size of the nla struct, so anything larger
results in u16 integer overflow.

This patch is similar to
9cefbbc9c8f9abe (netfilter: nfnetlink_queue: cleanup copy_range usage).

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2014-10-24 20:32:27 +0800
9dfa1dfe4 netfilter: nf_log: account for size of NLMSG_DONE attribute ... Browse Code »
5

We currently neither account for the nlattr size, nor do we consider
the size of the trailing NLMSG_DONE when allocating nlmsg skb.

This can result in nflog to stop working, as __nfulnl_send() re-tries
sending forever if it failed to append NLMSG_DONE (which will never
work if buffer is not large enough).

Reported-by: Houcheng Lin
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2014-10-24 20:30:15 +0800

22 Oct, 2014

3 commits

c123bb716 netfilter: nf_tables: check for NULL in nf_tables_newchain pcpu stats allocation ... Browse Code »

alloc_percpu returns NULL on failure, not a negative error code.

Fixes: ff3cd7b3c922 ("netfilter: nf_tables: refactor chain statistic routines")
Signed-off-by: Sabrina Dubroca
Signed-off-by: Pablo Neira Ayuso

Sabrina Dubroca
2014-10-22 20:12:51 +0800
0f9f5e1b8 netfilter: ipset: off by one in ip_set_nfnl_get_byindex() ... Browse Code »
5

The ->ip_set_list[] array is initialized in ip_set_net_init() and it
has ->ip_set_max elements so this check should be >= instead of >
otherwise we are off by one.

Signed-off-by: Dan Carpenter
Acked-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso

Dan Carpenter
2014-10-22 20:12:50 +0800
e37ad9fd6 netfilter: nf_conntrack: allow server to become a client in TW handling ... Browse Code »

When a port that was used to listen for inbound connections gets closed
and reused for outgoing connections (like rsh ends up doing for stderr
flow), current we may reject the SYN/ACK packet for the new connection
because tcp_conntracks states forbirds a port to become a client while
there is still a TIME_WAIT entry in there for it.

As TCP may expire the TIME_WAIT socket in 60s and conntrack's timeout
for it is 120s, there is a ~60s window that the application can end up
opening a port that conntrack will end up blocking.

This patch fixes this by simply allowing such state transition: if we
see a SYN, in TIME_WAIT state, on REPLY direction, move it to sSS. Note
that the rest of the code already handles this situation, more
specificly in tcp_packet(), first switch clause.

Signed-off-by: Marcelo Ricardo Leitner
Acked-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso

Marcelo Leitner
2014-10-22 20:12:50 +0800

21 Oct, 2014

1 commit

330966e50 net: make skb_gso_segment error handling more robust ... Browse Code »

skb_gso_segment has three possible return values:
1. a pointer to the first segmented skb
2. an errno value (IS_ERR())
3. NULL. This can happen when GSO is used for header verification.

However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
and would oops when NULL is returned.

Note that these call sites should never actually see such a NULL return
value; all callers mask out the GSO bits in the feature argument.

However, there have been issues with some protocol handlers erronously not
respecting the specified feature mask in some cases.

It is preferable to get 'have to turn off hw offloading, else slow' reports
rather than 'kernel crashes'.

Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2014-10-21 00:38:13 +0800

20 Oct, 2014

1 commit

ce8ec4896 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
netfilter fixes for net

The following patchset contains netfilter fixes for your net tree,
they are:

1) Fix missing MODULE_LICENSE() in the new nf_reject_ipv{4,6} modules.

2) Restrict nat and masq expressions to the nat chain type. Otherwise,
users may crash their kernel if they attach a nat/masq rule to a non
nat chain.

3) Fix hook validation in nft_compat when non-base chains are used.
Basically, initialize hook_mask to zero.

4) Make sure you use match/targets in nft_compat from the right chain
type. The existing validation relies on the table name which can be
avoided by

5) Better netlink attribute validation in nft_nat. This expression has
to reject the configuration when no address and proto configurations
are specified.

6) Interpret NFTA_NAT_REG_*_MAX if only if NFTA_NAT_REG_*_MIN is set.
Yet another sanity check to reject incorrect configurations from
userspace.

7) Conditional NAT attribute dumping depending on the existing
configuration.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-10-20 23:57:47 +0800

18 Oct, 2014

4 commits

1e2d56a5d netfilter: nft_nat: dump attributes if they are set ... Browse Code »

Dump NFTA_NAT_REG_ADDR_MIN if this is non-zero. Same thing with
NFTA_NAT_REG_PROTO_MIN.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-18 20:16:13 +0800
61cfac6b4 netfilter: nft_nat: NFTA_NAT_REG_ADDR_MAX depends on NFTA_NAT_REG_ADDR_MIN ... Browse Code »

Interpret NFTA_NAT_REG_ADDR_MAX if NFTA_NAT_REG_ADDR_MIN is present,
otherwise, skip it. Same thing with NFTA_NAT_REG_PROTO_MAX.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-18 20:16:12 +0800
5c819a397 netfilter: nft_nat: insufficient attribute validation ... Browse Code »

We have to validate that we at least get an NFTA_NAT_REG_ADDR_MIN or
NFTA_NFT_REG_PROTO_MIN attribute. Reject the configuration if none
of them are present.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-18 20:16:11 +0800
f3f5ddedd netfilter: nft_compat: validate chain type in match/target ... Browse Code »
26

We have to validate the real chain type to ensure that matches/targets
are not used out from their scope (eg. MASQUERADE in nat chain type).
The existing validation relies on the table name, but this is not
sufficient since userspace can fool us by using the appropriate table
name with a different chain type.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-18 20:14:07 +0800

14 Oct, 2014

3 commits

493618a92 netfilter: nft_compat: fix hook validation for non-base chains ... Browse Code »

Set hook_mask to zero for non-base chains, otherwise people may hit
bogus errors from the xt_check_target() and xt_check_match() when
validating the uninitialized hook_mask.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-14 18:52:40 +0800
18082746a netfilter: replace strnicmp with strncasecmp ... Browse Code »
13

The kernel used to contain two functions for length-delimited,
case-insensitive string comparison, strnicmp with correct semantics and
a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp
was renamed to strncasecmp, and strnicmp made into a wrapper for the new
strncasecmp to avoid breaking existing users.

To allow the compat wrapper strnicmp to be removed at some point in the
future, and to avoid the extra indirection cost, do
s/strnicmp/strncasecmp/g.

Signed-off-by: Rasmus Villemoes
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rasmus Villemoes
2014-10-14 08:18:24 +0800
7210e4e38 netfilter: nf_tables: restrict nat/masq expressions to nat chain type ... Browse Code »

This adds the missing validation code to avoid the use of nat/masq from
non-nat chains. The validation assumes two possible configuration
scenarios:

1) Use of nat from base chain that is not of nat type. Reject this
configuration from the nft_*_init() path of the expression.

2) Use of nat from non-base chain. In this case, we have to wait until
the non-base chain is referenced by at least one base chain via
jump/goto. This is resolved from the nft_*_validate() path which is
called from nf_tables_check_loops().

The user gets an -EOPNOTSUPP in both cases.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-14 02:42:00 +0800

11 Oct, 2014

1 commit

7b6fa1eef Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter fixes for net-next

This batch contains two fixes for what you have in your net-next,
they are:

1) Remove nf_send_reset6() from header file. This function now resides
in the nf_reject_ipv6 module. Reported by Eric Dumazet.

2) Fix wrong NFT_REJECT_ICMPX_MAX definition and adjust code to fix
errors reported by Dan Carpenter's static analysis tools.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-10-11 03:01:09 +0800

09 Oct, 2014

1 commit

35a9ad8af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:
"Most notable changes in here:

1) By far the biggest accomplishment, thanks to a large range of
contributors, is the addition of multi-send for transmit. This is
the result of discussions back in Chicago, and the hard work of
several individuals.

Now, when the ->ndo_start_xmit() method of a driver sees
skb->xmit_more as true, it can choose to defer the doorbell
telling the driver to start processing the new TX queue entires.

skb->xmit_more means that the generic networking is guaranteed to
call the driver immediately with another SKB to send.

There is logic added to the qdisc layer to dequeue multiple
packets at a time, and the handling mis-predicted offloads in
software is now done with no locks held.

Finally, pktgen is extended to have a "burst" parameter that can
be used to test a multi-send implementation.

Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4,
virtio_net

Adding support is almost trivial, so export more drivers to
support this optimization soon.

I want to thank, in no particular or implied order, Jesper
Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal
Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann,
David Tat, Hannes Frederic Sowa, and Rusty Russell.

2) PTP and timestamping support in bnx2x, from Michal Kalderon.

3) Allow adjusting the rx_copybreak threshold for a driver via
ethtool, and add rx_copybreak support to enic driver. From
Govindarajulu Varadarajan.

4) Significant enhancements to the generic PHY layer and the bcm7xxx
driver in particular (EEE support, auto power down, etc.) from
Florian Fainelli.

5) Allow raw buffers to be used for flow dissection, allowing drivers
to determine the optimal "linear pull" size for devices that DMA
into pools of pages. The objective is to get exactly the
necessary amount of headers into the linear SKB area pre-pulled,
but no more. The new interface drivers use is eth_get_headlen().
From WANG Cong, with driver conversions (several had their own
by-hand duplicated implementations) by Alexander Duyck and Eric
Dumazet.

6) Support checksumming more smoothly and efficiently for
encapsulations, and add "foo over UDP" facility. From Tom
Herbert.

7) Add Broadcom SF2 switch driver to DSA layer, from Florian
Fainelli.

8) eBPF now can load programs via a system call and has an extensive
testsuite. Alexei Starovoitov and Daniel Borkmann.

9) Major overhaul of the packet scheduler to use RCU in several major
areas such as the classifiers and rate estimators. From John
Fastabend.

10) Add driver for Intel FM10000 Ethernet Switch, from Alexander
Duyck.

11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric
Dumazet.

12) Add Datacenter TCP congestion control algorithm support, From
Florian Westphal.

13) Reorganize sk_buff so that __copy_skb_header() is significantly
faster. From Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits)
netlabel: directly return netlbl_unlabel_genl_init()
net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers
net: description of dma_cookie cause make xmldocs warning
cxgb4: clean up a type issue
cxgb4: potential shift wrapping bug
i40e: skb->xmit_more support
net: fs_enet: Add NAPI TX
net: fs_enet: Remove non NAPI RX
r8169:add support for RTL8168EP
net_sched: copy exts->type in tcf_exts_change()
wimax: convert printk to pr_foo()
af_unix: remove 0 assignment on static
ipv6: Do not warn for informational ICMP messages, regardless of type.
Update Intel Ethernet Driver maintainers list
bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING
tipc: fix bug in multicast congestion handling
net: better IFF_XMIT_DST_RELEASE support
net/mlx4_en: remove NETDEV_TX_BUSY
3c59x: fix bad split of cpu_to_le32(pci_map_single())
net: bcmgenet: fix Tx ring priority programming
...

Linus Torvalds
2014-10-09 09:40:54 +0800

08 Oct, 2014

2 commits

28596c972 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial ... Browse Code »

Pull "trivial tree" updates from Jiri Kosina:
"Usual pile from trivial tree everyone is so eagerly waiting for"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
Remove MN10300_PROC_MN2WS0038
mei: fix comments
treewide: Fix typos in Kconfig
kprobes: update jprobe_example.c for do_fork() change
Documentation: change "&" to "and" in Documentation/applying-patches.txt
Documentation: remove obsolete pcmcia-cs from Changes
Documentation: update links in Changes
Documentation: Docbook: Fix generated DocBook/kernel-api.xml
score: Remove GENERIC_HAS_IOMAP
gpio: fix 'CONFIG_GPIO_IRQCHIP' comments
tty: doc: Fix grammar in serial/tty
dma-debug: modify check_for_stack output
treewide: fix errors in printk
genirq: fix reference in devm_request_threaded_irq comment
treewide: fix synchronize_rcu() in comments
checkstack.pl: port to AArch64
doc: queue-sysfs: minor fixes
init/do_mounts: better syntax description
MIPS: fix comment spelling
powerpc/simpleboot: fix comment
...

Linus Torvalds
2014-10-08 09:16:26 +0800
f0d1f04f0 netfilter: fix wrong arithmetics regarding NFT_REJECT_ICMPX_MAX ... Browse Code »

NFT_REJECT_ICMPX_MAX should be __NFT_REJECT_ICMPX_MAX - 1.

nft_reject_icmp_code() and nft_reject_icmpv6_code() are called from the
packet path, so BUG_ON in case we try to access an unknown abstracted
ICMP code. This should not happen since we already validate this from
nft_reject_{inet,bridge}_init().

Fixes: 51b0a5d ("netfilter: nft_reject: introduce icmp code abstraction for inet and bridge")
Reported-by: Dan Carpenter
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-08 02:16:31 +0800

06 Oct, 2014

1 commit

61b37d2f5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains another batch with Netfilter/IPVS updates
for net-next, they are:

1) Add abstracted ICMP codes to the nf_tables reject expression. We
introduce four reasons to reject using ICMP that overlap in IPv4
and IPv6 from the semantic point of view. This should simplify the
maintainance of dual stack rule-sets through the inet table.

2) Move nf_send_reset() functions from header files to per-family
nf_reject modules, suggested by Patrick McHardy.

3) We have to use IS_ENABLED(CONFIG_BRIDGE_NETFILTER) everywhere in the
code now that br_netfilter can be modularized. Convert remaining spots
in the network stack code.

4) Use rcu_barrier() in the nf_tables module removal path to ensure that
we don't leave object that are still pending to be released via
call_rcu (that may likely result in a crash).

5) Remove incomplete arch 32/64 compat from nft_compat. The original (bad)
idea was to probe the word size based on the xtables match/target info
size, but this assumption is wrong when you have to dump the information
back to userspace.

6) Allow to filter from prerouting and postrouting in the nf_tables bridge.
In order to emulate the ebtables NAT chains (which are actually simple
filter chains with no special semantics), we have support filtering from
this hooks too.

7) Add explicit module dependency between xt_physdev and br_netfilter.
This provides a way to detect if the user needs br_netfilter from
the configuration path. This should reduce the breakage of the
br_netfilter modularization.

8) Cleanup coding style in ip_vs.h, from Simon Horman.

9) Fix crash in the recently added nf_tables masq expression. We have
to register/unregister the notifiers to clean up the conntrack table
entries from the module init/exit path, not from the rule addition /
deletion path. From Arturo Borrero.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-10-06 09:32:37 +0800

03 Oct, 2014

6 commits

739e4a758 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Conflicts:
drivers/net/usb/r8152.c
net/netfilter/nfnetlink.c

Both r8152 and nfnetlink conflicts were simple overlapping changes.

Signed-off-by: David S. Miller

David S. Miller
2014-10-03 02:25:43 +0800
4b7fd5d97 netfilter: explicit module dependency between br_netfilter and physdev ... Browse Code »

You can use physdev to match the physical interface enslaved to the
bridge device. This information is stored in skb->nf_bridge and it is
set up by br_netfilter. So, this is only available when iptables is
used from the bridge netfilter path.

Since 34666d4 ("netfilter: bridge: move br_netfilter out of the core"),
the br_netfilter code is modular. To reduce the impact of this change,
we can autoload the br_netfilter if the physdev match is used since
we assume that the users need br_netfilter in place.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-03 00:30:57 +0800
756c1b1a7 netfilter: nft_compat: remove incomplete 32/64 bits arch compat code ... Browse Code »

This code was based on the wrong asumption that you can probe based
on the match/target private size that we get from userspace. This
doesn't work at all when you have to dump the info back to userspace
since you don't know what word size the userspace utility is using.

Currently, the extensions that require arch compat are limit match
and the ebt_mark match/target. The standard targets are not used by
the nft-xt compat layer, so they are not affected. We can work around
this limitation with a new revision that uses arch agnostic types.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-03 00:30:55 +0800
1b1bc49c0 netfilter: nf_tables: wait for call_rcu completion on module removal ... Browse Code »

Make sure the objects have been released before the nf_tables modules
is removed.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-03 00:30:54 +0800
1109a90c0 netfilter: use IS_ENABLED(CONFIG_BRIDGE_NETFILTER) ... Browse Code »

In 34666d4 ("netfilter: bridge: move br_netfilter out of the core"),
the bridge netfilter code has been modularized.

Use IS_ENABLED instead of ifdef to cover the module case.

Fixes: 34666d4 ("netfilter: bridge: move br_netfilter out of the core")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-03 00:30:54 +0800
51b0a5d8c netfilter: nft_reject: introduce icmp code abstraction for inet and bridge ... Browse Code »
13

This patch introduces the NFT_REJECT_ICMPX_UNREACH type which provides
an abstraction to the ICMP and ICMPv6 codes that you can use from the
inet and bridge tables, they are:

* NFT_REJECT_ICMPX_NO_ROUTE: no route to host - network unreachable
* NFT_REJECT_ICMPX_PORT_UNREACH: port unreachable
* NFT_REJECT_ICMPX_HOST_UNREACH: host unreachable
* NFT_REJECT_ICMPX_ADMIN_PROHIBITED: administratevely prohibited

You can still use the specific codes when restricting the rule to match
the corresponding layer 3 protocol.

I decided to not overload the existing NFT_REJECT_ICMP_UNREACH to have
different semantics depending on the table family and to allow the user
to specify ICMP family specific codes if they restrict it to the
corresponding family.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-03 00:29:57 +0800

30 Sep, 2014

1 commit

22e0f8b93 net: sched: make bstats per cpu and estimator RCU safe ... Browse Code »
41

In order to run qdisc's without locking statistics and estimators
need to be handled correctly.

To resolve bstats make the statistics per cpu. And because this is
only needed for qdiscs that are running without locks which is not
the case for most qdiscs in the near future only create percpu
stats when qdiscs set the TCQ_F_CPUSTATS flag.

Next because estimators use the bstats to calculate packets per
second and bytes per second the estimator code paths are updated
to use the per cpu statistics.

Signed-off-by: John Fastabend
Signed-off-by: David S. Miller

John Fastabend
2014-09-30 13:02:26 +0800

29 Sep, 2014

2 commits

db29a9508 netfilter: conntrack: disable generic tracking for known protocols ... Browse Code »
7

Given following iptables ruleset:

-P FORWARD DROP
-A FORWARD -m sctp --dport 9 -j ACCEPT
-A FORWARD -p tcp --dport 80 -j ACCEPT
-A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT

One would assume that this allows SCTP on port 9 and TCP on port 80.
Unfortunately, if the SCTP conntrack module is not loaded, this allows
*all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
which we think is a security issue.

This is because on the first SCTP packet on port 9, we create a dummy
"generic l4" conntrack entry without any port information (since
conntrack doesn't know how to extract this information).

All subsequent packets that are unknown will then be in established
state since they will fallback to proto_generic and will match the
'generic' entry.

Our originally proposed version [1] completely disabled generic protocol
tracking, but Jozsef suggests to not track protocols for which a more
suitable helper is available, hence we now mitigate the issue for in
tree known ct protocol helpers only, so that at least NAT and direction
information will still be preserved for others.

[1] http://www.spinics.net/lists/netfilter-devel/msg33430.html

Joint work with Daniel Borkmann.

Signed-off-by: Florian Westphal
Signed-off-by: Daniel Borkmann
Acked-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2014-09-29 18:17:49 +0800
9363dc4b5 netfilter: nf_tables: store and dump set policy ... Browse Code »

We want to know in which cases the user explicitly sets the policy
options. In that case, we also want to dump back the info.

Signed-off-by: Arturo Borrero Gonzalez
Signed-off-by: Pablo Neira Ayuso

Arturo Borrero
2014-09-29 17:28:03 +0800

27 Sep, 2014

2 commits

e7af85db5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
nf pull request for net

This series contains netfilter fixes for net, they are:

1) Fix lockdep splat in nft_hash when releasing sets from the
rcu_callback context. We don't the mutex there anymore.

2) Remove unnecessary spinlock_bh in the destroy path of the nf_tables
rbtree set type from rcu_callback context.

3) Fix another lockdep splat in rhashtable. None of the callers hold
a mutex when calling rhashtable_destroy.

4) Fix duplicated error reporting from nfnetlink when aborting and
replaying a batch.

5) Fix a Kconfig issue reported by kbuild robot.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-09-27 04:21:29 +0800
772476df7 net/netfilter/x_tables.c: use __seq_open_private() ... Browse Code »

Reduce boilerplate code by using __seq_open_private() instead of seq_open()
in xt_match_open() and xt_target_open().

Signed-off-by: Rob Jones
Signed-off-by: Pablo Neira Ayuso

Rob Jones
2014-09-27 00:42:29 +0800

19 Sep, 2014

2 commits

84d7fce69 netfilter: nf_tables: export rule-set generation ID ... Browse Code »

This patch exposes the ruleset generation ID in three ways:

1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset
generation ID. This ID is incremented in every commit and it
should be large enough to avoid wraparound problems.

2) The less significant 16-bits of the generation ID are exposed through
the nfgenmsg->res_id header field. This allows us to quickly catch
if the ruleset has change between two consecutive list dumps from
different object lists (in this specific case I think the risk of
wraparound is unlikely).

3) Userspace subscribers may receive notifications of new rule-set
generation after every commit. This also provides an alternative
way to monitor the generation ID. If the events are lost, the
userspace process hits a overrun error, so it knows that it is
working with a stale ruleset anyway.

Patrick spotted that rule-set transformations in userspace may take
quite some time. In that case, it annotates the 32-bits generation ID
before fetching the rule-set, then:

1) it compares it to what we obtain after the transformation to
make sure it is not working with a stale rule-set and no wraparound
has ocurred.

2) it subscribes to ruleset notifications, so it can watch for new
generation ID.

This is complementary to the NLM_F_DUMP_INTR approach, which allows
us to detect an interference in the middle one single list dumping.
There is no way to explicitly check that an interference has occurred
between two list dumps from the kernel, since it doesn't know how
many lists the userspace client is actually going to dump.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-09-19 17:14:43 +0800
fc04733a1 netfilter: nfnetlink: use original skbuff when committing/aborting ... Browse Code »

This allows us to access the original content of the batch from
the commit and the abort paths.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-09-19 17:14:42 +0800

18 Sep, 2014

4 commits

fcfa8f493 Merge branch 'ipvs-next' ... Browse Code »

Simon Horman says:

====================
This pull requests makes the following changes:

* Add simple weighted fail-over scheduler.
- Unlike other IPVS schedulers this offers fail-over rather than load
balancing. Connections are directed to the appropriate server based
solely on highest weight value and server availability.
- Thanks to Kenny Mathis

* Support IPv6 real servers in IPv4 virtual-services and vice versa
- This feature is supported in conjunction with the tunnel (IPIP)
forwarding mechanism. That is, IPv4 may be forwarded in IPv6 and
vice versa.
- The motivation for this is to allow more flexibility in the
choice of IP version offered by both virtual-servers and
real-servers as they no longer need to match: An IPv4 connection from an
end-user may be forwarded to a real-server using IPv6 and vice versa.
- Further work need to be done to support this feature in conjunction
with connection synchronisation. For now such configurations are
not allowed.
- This change includes update to netlink protocol, adding a new
destination address family attribute. And the necessary changes
to plumb this information throughout IPVS.
- Thanks to Alex Gartrell and Julian Anastasov
====================

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-09-18 16:59:33 +0800
bc18d37f6 ipvs: Allow heterogeneous pools now that we support them ... Browse Code »
13

Remove the temporary consistency check and add a case statement to only
allow ipip mixed dests.

Signed-off-by: Alex Gartrell
Acked-by: Julian Anastasov
Signed-off-by: Simon Horman

Alex Gartrell
2014-09-18 07:59:29 +0800
f18ae7206 ipvs: use the new dest addr family field ... Browse Code »

Use the new address family field cp->daf when printing
cp->daddr in logs or connection listing.

Signed-off-by: Julian Anastasov
Signed-off-by: Alex Gartrell
Signed-off-by: Simon Horman

Julian Anastasov
2014-09-18 07:59:28 +0800
4d316f3f9 ipvs: use correct address family in scheduler logs ... Browse Code »

Needed to support svc->af != dest->af.

Signed-off-by: Julian Anastasov
Signed-off-by: Alex Gartrell
Signed-off-by: Simon Horman

Julian Anastasov
2014-09-18 07:59:23 +0800

16 Sep, 2014

1 commit

cf34e646d ipvs: address family of LBLCR entry depends on svc family ... Browse Code »

The LBLCR entries should use svc->af, not dest->af.
Needed to support svc->af != dest->af.

Signed-off-by: Julian Anastasov
Signed-off-by: Alex Gartrell
Signed-off-by: Simon Horman

Julian Anastasov
2014-09-16 08:03:38 +0800