24 Nov, 2011
1 commit
-
We can not update iph->daddr in ip_options_rcv_srr(), It is too early.
When some exception ocurred later (eg. in ip_forward() when goto
sr_failed) we need the ip header be identical to the original one as
ICMP need it.Add a field 'nexthop' in struct ip_options to save nexthop of LSRR
or SSRR option.Signed-off-by: Li Wei
Signed-off-by: David S. Miller
10 Nov, 2011
1 commit
-
When opt->srr_is_hit is set skb_rtable(skb) has been updated for
'nexthop' and iph->daddr should always equals to skb_rtable->rt_dst
holds, We need update iph->daddr either.Signed-off-by: Li Wei
Signed-off-by: David S. Miller
01 Jun, 2011
1 commit
-
The current code takes an unaligned pointer and does htonl() on it to
make it big-endian, then does a memcpy(). The problem is that the
compiler decides that since the pointer is to a __be32, it is legal
to optimize the copy into a processor word store. However, on an
architecture that does not handled unaligned writes in kernel space,
this produces an unaligned exception fault.The solution is to track the pointer as a "char *" (which removes a bunch
of unpleasant casts in any case), and then just use put_unaligned_be32()
to write the value to memory.Signed-off-by: Chris Metcalf
Signed-off-by: David S. Miller
14 May, 2011
3 commits
-
At this point iph->daddr equals what rt->rt_dst would hold.
Signed-off-by: David S. Miller
-
Pass in the sk_buff so that we can fetch the necessary keys from
the packet header when working with input routes.Signed-off-by: David S. Miller
-
This code block executes when opt->srr_is_hit is set. It will be
set only by ip_options_rcv_srr().ip_options_rcv_srr() walks until it hits a matching nexthop in the SRR
option addresses, and when it matches one 1) looks up the route for
that nexthop and 2) on route lookup success it writes that nexthop
value into iph->daddr.ip_forward_options() runs later, and again walks the SRR option
addresses looking for the option matching the destination of the route
stored in skb_rtable(). This route will be the same exact one looked
up for the nexthop by ip_options_rcv_srr().Therefore "rt->rt_dst == iph->daddr" must be true.
All it really needs to do is record the route's source address in the
matching SRR option adddress. It need not write iph->daddr again,
since that has already been done by ip_options_rcv_srr() as detailed
above.Signed-off-by: David S. Miller
13 May, 2011
2 commits
-
We already copy the 4-byte nexthop from the options block into
local variable "nexthop" for the route lookup.Re-use that variable instead of memcpy()'ing again when assigning
to iph->daddr after the route lookup succeeds.Signed-off-by: David S. Miller
-
All call sites conditionalize the call to ip_options_rcv_srr()
with a check of opt->srr, so no need to check it again there.Signed-off-by: David S. Miller
29 Apr, 2011
1 commit
-
We lack proper synchronization to manipulate inet->opt ip_options
Problem is ip_make_skb() calls ip_setup_cork() and
ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options),
without any protection against another thread manipulating inet->opt.Another thread can change inet->opt pointer and free old one under us.
Use RCU to protect inet->opt (changed to inet->inet_opt).
Instead of handling atomic refcounts, just copy ip_options when
necessary, to avoid cache line dirtying.We cant insert an rcu_head in struct ip_options since its included in
skb->cb[], so this patch is large because I had to introduce a new
ip_options_rcu structure.Signed-off-by: Eric Dumazet
Cc: Herbert Xu
Signed-off-by: David S. Miller
15 Apr, 2011
1 commit
-
Scot Doyle demonstrated ip_options_compile() could be called with an skb
without an attached route, using a setup involving a bridge, netfilter,
and forged IP packets.Let's make ip_options_compile() and ip_options_rcv_srr() a bit more
robust, instead of changing bridge/netfilter code.With help from Hiroaki SHIMODA.
Reported-by: Scot Doyle
Tested-by: Scot Doyle
Signed-off-by: Eric Dumazet
Cc: Stephen Hemminger
Acked-by: Hiroaki SHIMODA
Signed-off-by: David S. Miller
28 Mar, 2011
1 commit
-
The current handling of echoed IP timestamp options with prespecified
addresses is rather broken since the 2.2.x kernels. As far as i understand
it, it should behave like when originating packets.Currently it will only timestamp the next free slot if:
- there is space for *two* timestamps
- some random data from the echoed packet taken as an IP is *not* a local IPThis first is caused by an off-by-one error. 'soffset' points to the next
free slot and so we only need to have 'soffset + 7
Signed-off-by: David S. Miller
20 Sep, 2010
1 commit
-
Related dicussion here : http://lkml.org/lkml/2010/9/3/16
Introduce a function br_parse_ip_options that will audit the
skb and possibly refill IP options before a packet enters the
IP stack. If no options are present, the function will zero out
the skb cb area so that it is not misinterpreted as options by some
unsuspecting IP layer routine. If packet consistency fails, drop it.Signed-off-by: Bandan Das
Signed-off-by: David S. Miller
18 May, 2010
2 commits
-
This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'Signed-off-by: Joe Perches
Signed-off-by: David S. Miller -
Use low order bit of skb->_skb_dst to tell dst is not refcounted.
Change _skb_dst to _skb_refdst to make sure all uses are catched.
skb_dst() returns the dst, regardless of noref bit set or not, but
with a lockdep check to make sure a noref dst is not given if current
user is not rcu protected.New skb_dst_set_noref() helper to set an notrefcounted dst on a skb.
(with lockdep check)skb_dst_drop() drops a reference only if skb dst was refcounted.
skb_dst_force() helper is used to force a refcount on dst, when skb
is queued and not anymore RCU protected.Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if
!IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in
sock_queue_rcv_skb(), in __nf_queue().Use skb_dst_force() in dev_requeue_skb().
Note: dst_use_noref() still dirties dst, we might transform it
later to do one dirtying per jiffies.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
30 Mar, 2010
1 commit
-
…it slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
03 Jun, 2009
2 commits
-
Define three accessors to get/set dst attached to a skb
struct dst_entry *skb_dst(const struct sk_buff *skb)
void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;Delete skb->dst field
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb
Delete skb->rtable field
Setting rtable is not allowed, just set dst instead as rtable is an alias.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
10 Oct, 2008
1 commit
-
This patch accomplishes three minor tasks: add a new tag type for local
labeling, rename the CIPSO_V4_MAP_STD define to CIPSO_V4_MAP_TRANS and
replace some of the CIPSO "magic numbers" with constants from the header
file. The first change allows CIPSO to support full LSM labels/contexts,
not just MLS attributes. The second change brings the mapping names inline
with what userspace is using, compatibility is preserved since we don't
actually change the value. The last change is to aid readability and help
prevent mistakes.Signed-off-by: Paul Moore
12 Jun, 2008
1 commit
-
This patch removes CVS keywords that weren't updated for a long time
from comments.Signed-off-by: Adrian Bunk
Signed-off-by: David S. Miller
21 Apr, 2008
1 commit
-
What do_gettimeofday() does is to call getnstimeofday() and
to convert the result from timespec{} to timeval{}.
After that, these callers convert the result again to msec.
Use getnstimeofday() and convert the units at once.Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller
26 Mar, 2008
1 commit
-
Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.Signed-off-by: YOSHIFUJI Hideaki
25 Mar, 2008
3 commits
-
Replace all the rest of the init_net with a proper net on the IP layer.
Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller -
Pass the init_net there for now.
Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller -
ip_options_compile uses inet_addr_type which requires a namespace. The
packet argument is optional, so parameter is the only way to obtain
it. Pass the init_net there for now.Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller
23 Mar, 2008
3 commits
-
This makes code a bit more uniform and straigthforward.
Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller -
ip_options->is_data is assigned only and never checked. The structure is
not a part of kernel interface to the userspace. So, it is safe to remove
this field.Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller -
There is the only way to reach ip_options compile with opt != NULL:
ip_options_get_finish
opt->is_data = 1;
ip_options_compile(opt, NULL)So, checking for is_data inside opt != NULL branch is not needed.
Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller
06 Mar, 2008
1 commit
-
(Anonymous) unions can help us to avoid ugly casts.
A common cast it the (struct rtable *)skb->dst one.
Defining an union like :
union {
struct dst_entry *dst;
struct rtable *rtable;
};
permits to use skb->rtable in place.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
04 Mar, 2008
1 commit
-
ip_options_echo is called on the packet input path after the initial
routing. The dst entry on the packet is cleared only in the several
very specific places and immidiately assigned back (may be new).Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller
29 Jan, 2008
1 commit
-
The patch extends the inet_addr_type and inet_dev_addr_type with the
network namespace pointer. That allows to access the different tables
relatively to the network namespace.The modification of the signature function is reported in all the
callers of the inet_addr_type using the pointer to the well known
init_net.Acked-by: Benjamin Thery
Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller
01 Aug, 2007
1 commit
-
Signed-off-by: Mariusz Kozlowski
Signed-off-by: David S. Miller
26 Apr, 2007
2 commits
-
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller -
For the places where we need a pointer to the network header, it is still legal
to touch skb->nh.raw directly if just adding to, subtracting from or setting it
to another layer header.Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller
11 Feb, 2007
1 commit
-
Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller
31 Oct, 2006
1 commit
-
This patch makes two changes to protect applications from either removing or
tampering with the CIPSOv4 IP option on a socket. The first is the requirement
that applications have the CAP_NET_RAW capability to set an IPOPT_CIPSO option
on a socket; this prevents untrusted applications from setting their own
CIPSOv4 security attributes on the packets they send. The second change is to
SELinux and it prevents applications from setting any IPv4 options when there
is an IPOPT_CIPSO option already present on the socket; this prevents
applications from removing CIPSOv4 security attributes from the packets they
send.Signed-off-by: Paul Moore
Signed-off-by: James Morris
Signed-off-by: David S. Miller
29 Sep, 2006
5 commits
-
Signed-off-by: Al Viro
Signed-off-by: David S. Miller -
->faddr is net-endian; annotated as such, variables inferred to be net-endian
annotated.Signed-off-by: Al Viro
Signed-off-by: David S. Miller -
daddr is net-endian
Signed-off-by: Al Viro
Signed-off-by: David S. Miller -
argument and inferred net-endian variables in callers annotated.
Signed-off-by: Al Viro
Signed-off-by: David S. Miller -
ip_route_input() takes net-endian source and destination address.
* Annotated as such.
* arguments of its invocations annotated where needed.
* local helpers getting the same values passed to by it (ip_route_input_mc(),
ip_route_input_slow(), ip_handle_martian_source(), ip_mkroute_input(),
ip_mkroute_input_def(), __mkroute_input()) annotatedSigned-off-by: Al Viro
Signed-off-by: David S. Miller