Eric Lee / linux-smarc-t335x-v3.2

18 Sep, 2009

1 commit

f205ce83a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (66 commits)
be2net: fix some cmds to use mccq instead of mbox
atl1e: fix 2.6.31-git4 -- ATL1E 0000:03:00.0: DMA-API: device driver frees DMA
pkt_sched: Fix qstats.qlen updating in dump_stats
ipv6: Log the affected address when DAD failure occurs
wl12xx: Fix print_mac() conversion.
af_iucv: fix race when queueing skbs on the backlog queue
af_iucv: do not call iucv_sock_kill() twice
af_iucv: handle non-accepted sockets after resuming from suspend
af_iucv: fix race in __iucv_sock_wait()
iucv: use correct output register in iucv_query_maxconn()
iucv: fix iucv_buffer_cpumask check when calling IUCV functions
iucv: suspend/resume error msg for left over pathes
wl12xx: switch to %pM to print the mac address
b44: the poll handler b44_poll must not enable IRQ unconditionally
ipv6: Ignore route option with ROUTER_PREF_INVALID
bonding: make ab_arp select active slaves as other modes
cfg80211: fix SME connect
rc80211_minstrel: fix contention window calculation
ssb/sdio: fix printk format warnings
p54usb: add Zcomax XG-705A usbid
...

Linus Torvalds
2009-09-18 11:53:52 +0800

16 Sep, 2009

2 commits

657e9649e tcp: fix CONFIG_TCP_MD5SIG + CONFIG_PREEMPT timer BUG() ... Browse Code »

I have recently came across a preemption imbalance detected by:

huh, entered ffffffff80644630 with preempt_count 00000102, exited with 00000101?
------------[ cut here ]------------
kernel BUG at /usr/src/linux/kernel/timer.c:664!
invalid opcode: 0000 [1] PREEMPT SMP

with ffffffff80644630 being inet_twdr_hangman().

This appeared after I enabled CONFIG_TCP_MD5SIG and played with it a
bit, so I looked at what might have caused it.

One thing that struck me as strange is tcp_twsk_destructor(), as it
calls tcp_put_md5sig_pool() -- which entails a put_cpu(), causing the
detected imbalance. Found on 2.6.23.9, but 2.6.31 is affected as well,
as far as I can tell.

Signed-off-by: Robert Varga
Signed-off-by: David S. Miller

Robert Varga
2009-09-16 14:49:21 +0800
ada3fa150 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu ... Browse Code »

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits)
powerpc64: convert to dynamic percpu allocator
sparc64: use embedding percpu first chunk allocator
percpu: kill lpage first chunk allocator
x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA
percpu: update embedding first chunk allocator to handle sparse units
percpu: use group information to allocate vmap areas sparsely
vmalloc: implement pcpu_get_vm_areas()
vmalloc: separate out insert_vmalloc_vm()
percpu: add chunk->base_addr
percpu: add pcpu_unit_offsets[]
percpu: introduce pcpu_alloc_info and pcpu_group_info
percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward
percpu: add @align to pcpu_fc_alloc_fn_t
percpu: make @dyn_size mandatory for pcpu_setup_first_chunk()
percpu: drop @static_size from first chunk allocators
percpu: generalize first chunk allocator selection
percpu: build first chunk allocators selectively
percpu: rename 4k first chunk allocator to page
percpu: improve boot messages
percpu: fix pcpu_reclaim() locking
...

Fix trivial conflict as by Tejun Heo in kernel/sched.c

Linus Torvalds
2009-09-16 00:39:44 +0800

15 Sep, 2009

4 commits

75c78500d bonding: remap muticast addresses without using dev_close() and dev_open() ... Browse Code »

This patch fixes commit e36b9d16c6a6d0f59803b3ef04ff3c22c3844c10. The approach
there is to call dev_close()/dev_open() whenever the device type is changed in
order to remap the device IP multicast addresses to HW multicast addresses.
This approach suffers from 2 drawbacks:

*. It assumes tha the device is UP when calling dev_close(), or otherwise
dev_close() has no affect. It is worth to mention that initscripts (Redhat)
and sysconfig (Suse) doesn't act the same in this matter.
*. dev_close() has other side affects, like deleting entries from the routing
table, which might be unnecessary.

The fix here is to directly remap the IP multicast addresses to HW multicast
addresses for a bonding device that changes its type, and nothing else.

Reported-by: Jason Gunthorpe
Signed-off-by: Moni Shoua
Signed-off-by: David S. Miller

Moni Shoua
2009-09-15 17:37:40 +0800
0b6a05c1d tcp: fix ssthresh u16 leftover ... Browse Code »

It was once upon time so that snd_sthresh was a 16-bit quantity.
...That has not been true for long period of time. I run across
some ancient compares which still seem to trust such legacy.
Put all that magic into a single place, I hopefully found all
of them.

Compile tested, though linking of allyesconfig is ridiculous
nowadays it seems.

Signed-off-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Ilpo Järvinen
2009-09-15 16:30:10 +0800
32613090a net: constify struct net_protocol ... Browse Code »

Remove long removed "inet_protocol_base" declaration.

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller

Alexey Dobriyan
2009-09-15 08:03:01 +0800
d7e9660ad Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
netxen: update copyright
netxen: fix tx timeout recovery
netxen: fix file firmware leak
netxen: improve pci memory access
netxen: change firmware write size
tg3: Fix return ring size breakage
netxen: build fix for INET=n
cdc-phonet: autoconfigure Phonet address
Phonet: back-end for autoconfigured addresses
Phonet: fix netlink address dump error handling
ipv6: Add IFA_F_DADFAILED flag
net: Add DEVTYPE support for Ethernet based devices
mv643xx_eth.c: remove unused txq_set_wrr()
ucc_geth: Fix hangs after switching from full to half duplex
ucc_geth: Rearrange some code to avoid forward declarations
phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
drivers/net/phy: introduce missing kfree
drivers/net/wan: introduce missing kfree
net: force bridge module(s) to be GPL
Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
...

Fixed up trivial conflicts:

- arch/x86/include/asm/socket.h

converted to in the x86 tree. The generic
header has the same new #define's, so that works out fine.

- drivers/net/tun.c

fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that
switched over to using 'tun->socket.sk' instead of the redundantly
available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks
to the TUN driver") which added a new 'tun->sk' use.

Noted in 'next' by Stephen Rothwell.

Linus Torvalds
2009-09-15 01:37:28 +0800

11 Sep, 2009

2 commits

9a0da0d19 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 Browse Code »

David S. Miller
2009-09-11 09:17:09 +0800
a3c8b9739 Merge branch 'next' into for-linus Browse Code »

James Morris
2009-09-11 06:04:49 +0800

09 Sep, 2009

1 commit

fa1a9c681 headers: net/ipv[46]/protocol.c header trim ... Browse Code »

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller

Alexey Dobriyan
2009-09-09 18:43:50 +0800

03 Sep, 2009

2 commits

aa1330766 tcp: replace hard coded GFP_KERNEL with sk_allocation ... Browse Code »

This fixed a lockdep warning which appeared when doing stress
memory tests over NFS:

inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.

page reclaim => nfs_writepage => tcp_sendmsg => lock sk_lock

mount_root => nfs_root_data => tcp_close => lock sk_lock =>
tcp_send_fin => alloc_skb_fclone => page reclaim

David raised a concern that if the allocation fails in tcp_send_fin(), and it's
GFP_ATOMIC, we are going to yield() (which sleeps) and loop endlessly waiting
for the allocation to succeed.

But fact is, the original GFP_KERNEL also sleeps. GFP_ATOMIC+yield() looks
weird, but it is no worse the implicit sleep inside GFP_KERNEL. Both could
loop endlessly under memory pressure.

CC: Arnaldo Carvalho de Melo
CC: David S. Miller
CC: Herbert Xu
Signed-off-by: Wu Fengguang
Signed-off-by: David S. Miller

Wu Fengguang
2009-09-03 14:45:45 +0800
6ce9e7b5f ip: Report qdisc packet drops ... Browse Code »

Christoph Lameter pointed out that packet drops at qdisc level where not
accounted in SNMP counters. Only if application sets IP_RECVERR, drops
are reported to user (-ENOBUFS errors) and SNMP counters updated.

IP_RECVERR is used to enable extended reliable error message passing,
but these are not needed to update system wide SNMP stats.

This patch changes things a bit to allow SNMP counters to be updated,
regardless of IP_RECVERR being set or not on the socket.

Example after an UDP tx flood
# netstat -s
...
IP:
1487048 outgoing packets dropped
...
Udp:
...
SndbufErrors: 1487048

send() syscalls, do however still return an OK status, to not
break applications.

Note : send() manual page explicitly says for -ENOBUFS error :

"The output queue for a network interface was full.
This generally indicates that the interface has stopped sending,
but may be caused by transient congestion.
(Normally, this does not occur in Linux. Packets are just silently
dropped when a device queue overflows.) "

This is not true for IP_RECVERR enabled sockets : a send() syscall
that hit a qdisc drop returns an ENOBUFS error.

Many thanks to Christoph, David, and last but not least, Alexey !

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-09-03 09:05:33 +0800

02 Sep, 2009

4 commits

3b401a81c inet: inet_connection_sock_af_ops const ... Browse Code »

The function block inet_connect_sock_af_ops contains no data
make it constant.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2009-09-02 16:03:49 +0800
b2e4b3deb tcp: MD5 operations should be const ... Browse Code »

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2009-09-02 16:03:43 +0800
6cdee2f96 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 ... Browse Code »

Conflicts:
drivers/net/yellowfin.c

David S. Miller
2009-09-02 15:32:56 +0800
89d69d2b7 net: make neigh_ops constant ... Browse Code »

These tables are never modified at runtime. Move to read-only
section.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2009-09-02 08:40:57 +0800

01 Sep, 2009

4 commits

6fa12c850 Revert Backoff [v3]: Calculate TCP's connection close threshold as a time value. ... Browse Code »

RFC 1122 specifies two threshold values R1 and R2 for connection timeouts,
which may represent a number of allowed retransmissions or a timeout value.
Currently linux uses sysctl_tcp_retries{1,2} to specify the thresholds
in number of allowed retransmissions.

For any desired threshold R2 (by means of time) one can specify tcp_retries2
(by means of number of retransmissions) such that TCP will not time out
earlier than R2. This is the case, because the RTO schedule follows a fixed
pattern, namely exponential backoff.

However, the RTO behaviour is not predictable any more if RTO backoffs can be
reverted, as it is the case in the draft
"Make TCP more Robust to Long Connectivity Disruptions"
(http://tools.ietf.org/html/draft-zimmermann-tcp-lcd).

In the worst case TCP would time out a connection after 3.2 seconds, if the
initial RTO equaled MIN_RTO and each backoff has been reverted.

This patch introduces a function retransmits_timed_out(N),
which calculates the timeout of a TCP connection, assuming an initial
RTO of MIN_RTO and N unsuccessful, exponentially backed-off retransmissions.

Whenever timeout decisions are made by comparing the retransmission counter
to some value N, this function can be used, instead.

The meaning of tcp_retries2 will be changed, as many more RTO retransmissions
can occur than the value indicates. However, it yields a timeout which is
similar to the one of an unpatched, exponentially backing off TCP in the same
scenario. As no application could rely on an RTO greater than MIN_RTO, there
should be no risk of a regression.

Signed-off-by: Damian Lukowski
Acked-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Damian Lukowski
2009-09-01 17:45:47 +0800
f1ecd5d9e Revert Backoff [v3]: Revert RTO on ICMP destination unreachable ... Browse Code »

Here, an ICMP host/network unreachable message, whose payload fits to
TCP's SND.UNA, is taken as an indication that the RTO retransmission has
not been lost due to congestion, but because of a route failure
somewhere along the path.
With true congestion, a router won't trigger such a message and the
patched TCP will operate as standard TCP.

This patch reverts one RTO backoff, if an ICMP host/network unreachable
message, whose payload fits to TCP's SND.UNA, arrives.
Based on the new RTO, the retransmission timer is reset to reflect the
remaining time, or - if the revert clocked out the timer - a retransmission
is sent out immediately.
Backoffs are only reverted, if TCP is in RTO loss recovery, i.e. if
there have been retransmissions and reversible backoffs, already.

Changes from v2:
1) Renaming of skb in tcp_v4_err() moved to another patch.
2) Reintroduced tcp_bound_rto() and __tcp_set_rto().
3) Fixed code comments.

Signed-off-by: Damian Lukowski
Acked-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Damian Lukowski
2009-09-01 17:45:42 +0800
4d1a2d9ec Revert Backoff [v3]: Rename skb to icmp_skb in tcp_v4_err() ... Browse Code »

This supplementary patch renames skb to icmp_skb in tcp_v4_err() in order to
disambiguate from another sk_buff variable, which will be introduced
in a separate patch.

Signed-off-by: Damian Lukowski
Acked-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Damian Lukowski
2009-09-01 17:45:38 +0800
6fef4c0c8 netdev: convert pseudo-devices to netdev_tx_t ... Browse Code »

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2009-09-01 16:13:07 +0800

29 Aug, 2009

6 commits

9a7030b76 tcp: Remove redundant copy of MD5 authentication key ... Browse Code »

Remove the copy of the MD5 authentication key from tcp_check_req().
This key has already been copied by tcp_v4_syn_recv_sock() or
tcp_v6_syn_recv_sock().

Signed-off-by: John Dykstra
Signed-off-by: David S. Miller

John Dykstra
2009-08-29 15:19:25 +0800
80a1096ba tcp: fix premature termination of FIN_WAIT2 time-wait sockets ... Browse Code »

There is a race condition in the time-wait sockets code that can lead
to premature termination of FIN_WAIT2 and, subsequently, to RST
generation when the FIN,ACK from the peer finally arrives:

Time TCP header
0.000000 30755 > http [SYN] Seq=0 Win=2920 Len=0 MSS=1460 TSV=282912 TSER=0
0.000008 http > 30755 aSYN, ACK] Seq=0 Ack=1 Win=2896 Len=0 MSS=1460 TSV=...
0.136899 HEAD /1b.html?n1Lg=v1 HTTP/1.0 [Packet size limited during capture]
0.136934 HTTP/1.0 200 OK [Packet size limited during capture]
0.136945 http > 30755 [FIN, ACK] Seq=187 Ack=207 Win=2690 Len=0 TSV=270521...
0.136974 30755 > http [ACK] Seq=207 Ack=187 Win=2734 Len=0 TSV=283049 TSER=...
0.177983 30755 > http [ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283089 TSER=...
0.238618 30755 > http [FIN, ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283151...
0.238625 http > 30755 [RST] Seq=188 Win=0 Len=0

Say twdr->slot = 1 and we are running inet_twdr_hangman and in this
instance inet_twdr_do_twkill_work returns 1. At that point we will
mark slot 1 and schedule inet_twdr_twkill_work. We will also make
twdr->slot = 2.

Next, a connection is closed and tcp_time_wait(TCP_FIN_WAIT2, timeo)
is called which will create a new FIN_WAIT2 time-wait socket and will
place it in the last to be reached slot, i.e. twdr->slot = 1.

At this point say inet_twdr_twkill_work will run which will start
destroying the time-wait sockets in slot 1, including the just added
TCP_FIN_WAIT2 one.

To avoid this issue we increment the slot only if all entries in the
slot have been purged.

This change may delay the slots cleanup by a time-wait death row
period but only if the worker thread didn't had the time to run/purge
the current slot in the next period (6 seconds with default sysctl
settings). However, on such a busy system even without this change we
would probably see delays...

Signed-off-by: Octavian Purdila
Signed-off-by: David S. Miller

Octavian Purdila
2009-08-29 15:00:35 +0800
80b71b80d fib_trie: resize rework ... Browse Code »

Here is rework and cleanup of the resize function.

Some bugs we had. We were using ->parent when we should use
node_parent(). Also we used ->parent which is not assigned by
inflate in inflate loop.

Also a fix to set thresholds to power 2 to fit halve
and double strategy.

max_resize is renamed to max_work which better indicates
it's function.

Reaching max_work is not an error, so warning is removed.
max_work only limits amount of work done per resize.
(limits CPU-usage, outstanding memory etc).

The clean-up makes it relatively easy to add fixed sized
root-nodes if we would like to decrease the memory pressure
on routers with large routing tables and dynamic routing.
If we'll need that...

Its been tested with 280k routes.

Work done together with Robert Olsson.

Signed-off-by: Jens Låås
Signed-off-by: Robert Olsson
Signed-off-by: David S. Miller

Jens Låås
2009-08-29 14:57:15 +0800
30038fc61 net: ip_rt_send_redirect() optimization ... Browse Code »

While doing some forwarding benchmarks, I noticed
ip_rt_send_redirect() is rather expensive, even if send_redirects is
false for the device.

Fix is to avoid two atomic ops, we dont really need to take a
reference on in_dev

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-08-29 14:52:01 +0800
df19a6267 tcp: keepalive cleanups ... Browse Code »

Introduce keepalive_probes(tp) helper, and use it, like
keepalive_time_when(tp) and keepalive_intvl_when(tp)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-08-29 14:48:54 +0800
3d1427f87 ipv4: af_inet.c cleanups ... Browse Code »

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-08-29 14:45:21 +0800

28 Aug, 2009

1 commit

788d908f2 ipv4: make ip_append_data() handle NULL routing table ... Browse Code »

Add a check in ip_append_data() for NULL *rtp to prevent future bugs in
callers from being exploitable.

Signed-off-by: Julien Tinnes
Signed-off-by: Tavis Ormandy
Acked-by: David S. Miller
Signed-off-by: Linus Torvalds

Julien TINNES
2009-08-28 03:23:43 +0800

25 Aug, 2009

3 commits

399383246 netfilter: nfnetlink: constify message attributes and headers ... Browse Code »

Signed-off-by: Patrick McHardy

Patrick McHardy
2009-08-25 22:07:58 +0800
74f7a6552 netfilter: nf_conntrack: log packets dropped by helpers ... Browse Code »

Log packets dropped by helpers using the netfilter logging API. This
is useful in combination with nfnetlink_log to analyze those packets
in userspace for debugging.

Signed-off-by: Patrick McHardy

Patrick McHardy
2009-08-25 21:33:08 +0800
cce5a5c30 netfilter: nf_nat: fix inverted logic for persistent NAT mappings ... Browse Code »

Kernel 2.6.30 introduced a patch [1] for the persistent option in the
netfilter SNAT target. This is exactly what we need here so I had a quick look
at the code and noticed that the patch is wrong. The logic is simply inverted.
The patch below fixes this.

Also note that because of this the default behavior of the SNAT target has
changed since kernel 2.6.30 as it now ignores the destination IP in choosing
the source IP for nating (which should only be the case if the persistent
option is set).

[1] http://git.eu.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=98d500d66cb7940747b424b245fc6a51ecfbf005

Signed-off-by: Maximilian Engelhardt
Signed-off-by: Patrick McHardy

Maximilian Engelhardt
2009-08-25 01:24:54 +0800

24 Aug, 2009

1 commit

35aad0ffd netfilter: xtables: mark initial tables constant ... Browse Code »

The inputted table is never modified, so should be considered const.

Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy

Jan Engelhardt
2009-08-24 20:56:30 +0800

20 Aug, 2009

1 commit

ece13879e Merge branch 'master' into next ... Browse Code »

Conflicts:
security/Kconfig

Manual fix.

Signed-off-by: James Morris

James Morris
2009-08-20 07:18:42 +0800

15 Aug, 2009

1 commit

8cdb04563 gre: Fix MTU calculation for bound GRE tunnels ... Browse Code »

The GRE header length should be subtracted when the tunnel MTU is
calculated. This just corrects for the associativity change
introduced by commit 42aa916265d740d66ac1f17290366e9494c884c2
("gre: Move MTU setting out of ipgre_tunnel_bind_dev").

Signed-off-by: Tom Goff
Signed-off-by: David S. Miller

Tom Goff
2009-08-15 07:41:18 +0800

14 Aug, 2009

2 commits

384be2b18 Merge branch 'percpu-for-linus' into percpu-for-next ... Browse Code »

Conflicts:
arch/sparc/kernel/smp_64.c
arch/x86/kernel/cpu/perf_counter.c
arch/x86/kernel/setup_percpu.c
drivers/cpufreq/cpufreq_ondemand.c
mm/percpu.c

Conflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.

Signed-off-by: Tejun Heo

Tejun Heo
2009-08-14 13:45:31 +0800
a8f80e8ff Networking: use CAP_NET_ADMIN when deciding to call request_module ... Browse Code »

The networking code checks CAP_SYS_MODULE before using request_module() to
try to load a kernel module. While this seems reasonable it's actually
weakening system security since we have to allow CAP_SYS_MODULE for things
like /sbin/ip and bluetoothd which need to be able to trigger module loads.
CAP_SYS_MODULE actually grants those binaries the ability to directly load
any code into the kernel. We should instead be protecting modprobe and the
modules on disk, rather than granting random programs the ability to load code
directly into the kernel. Instead we are going to gate those networking checks
on CAP_NET_ADMIN which still limits them to root but which does not grant
those processes the ability to load arbitrary code into the kernel.

Signed-off-by: Eric Paris
Acked-by: Serge Hallyn
Acked-by: Paul Moore
Acked-by: David S. Miller
Signed-off-by: James Morris

Eric Paris
2009-08-14 09:18:34 +0800

10 Aug, 2009

5 commits

dc05a564a Merge branch 'master' of git://dev.medozas.de/linux Browse Code »

Patrick McHardy
2009-08-10 23:14:59 +0800
e2fe35c17 netfilter: xtables: check for standard verdicts in policies ... Browse Code »

This adds the second check that Rusty wanted to have a long time ago. :-)

Base chain policies must have absolute verdicts that cease processing
in the table, otherwise rule execution may continue in an unexpected
spurious fashion (e.g. next chain that follows in memory).

Signed-off-by: Jan Engelhardt

Jan Engelhardt
2009-08-10 19:35:31 +0800
90e7d4ab5 netfilter: xtables: check for unconditionality of policies ... Browse Code »

This adds a check that iptables's original author Rusty set forth in
a FIXME comment.

Underflows in iptables are better known as chain policies, and are
required to be unconditional or there would be a stochastical chance
for the policy rule to be skipped if it does not match. If that were
to happen, rule execution would continue in an unexpected spurious
fashion.

Signed-off-by: Jan Engelhardt

Jan Engelhardt
2009-08-10 19:35:29 +0800
a7d51738e netfilter: xtables: ignore unassigned hooks in check_entry_size_and_hooks ... Browse Code »

The "hook_entry" and "underflow" array contains values even for hooks
not provided, such as PREROUTING in conjunction with the "filter"
table. Usually, the values point to whatever the next rule is. For
the upcoming unconditionality and underflow checking patches however,
we must not inspect that arbitrary rule.

Skipping unassigned hooks seems like a good idea, also because
newinfo->hook_entry and newinfo->underflow will then continue to have
the poison value for detecting abnormalities.

Signed-off-by: Jan Engelhardt

Jan Engelhardt
2009-08-10 19:35:28 +0800
47901dc2c netfilter: xtables: use memcmp in unconditional check ... Browse Code »

Instead of inspecting each u32/char open-coded, clean up and make use
of memcmp. On some arches, memcmp is implemented as assembly or GCC's
__builtin_memcmp which can possibly take advantages of known
alignment.

Signed-off-by: Jan Engelhardt

Jan Engelhardt
2009-08-10 19:35:27 +0800