31 Mar, 2006
1 commit
-
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).From the splice.c comments:
"splice": joining two ropes together by interweaving their strands.
This is the "extended pipe" functionality, where a pipe is used as
an arbitrary in-memory buffer. Think of a pipe as a small kernel
buffer that you can use to transfer data from one end to the other.The traditional unix read/write is extended with a "splice()" operation
that transfers data buffers to or from a pipe buffer.Named by Larry McVoy, original implementation from Linus, extended by
Jens to support splicing to files and fixing the initial implementation
bugs.Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds
30 Mar, 2006
1 commit
29 Mar, 2006
9 commits
-
Every netfilter module uses `init' for its module_init() function and
`fini' or `cleanup' for its module_exit() function.Problem is, this creates uninformative initcall_debug output and makes
ctags rather useless.So go through and rename them all to $(filename)_init and
$(filename)_fini.Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller -
Signed-off-by: S P
Signed-off-by: David S. Miller -
Basically this patch moves the generic tunnel protocol stuff out of
xfrm4_tunnel/xfrm6_tunnel and moves it into the new files of tunnel4.c
and tunnel6 respectively.The reason for this is that the problem that Hugo uncovered is only
the tip of the iceberg. The real problem is that when we removed the
dependency of ipip on xfrm4_tunnel we didn't really consider the module
case at all.For instance, as it is it's possible to build both ipip and xfrm4_tunnel
as modules and if the latter is loaded then ipip simply won't load.After considering the alternatives I've decided that the best way out of
this is to restore the dependency of ipip on the non-xfrm-specific part
of xfrm4_tunnel. This is acceptable IMHO because the intention of the
removal was really to be able to use ipip without the xfrm subsystem.
This is still preserved by this patch.So now both ipip/xfrm4_tunnel depend on the new tunnel4.c which handles
the arbitration between the two. The order of processing is determined
by a simple integer which ensures that ipip gets processed before
xfrm4_tunnel.The situation for ICMP handling is a little bit more complicated since
we may not have enough information to determine who it's for. It's not
a big deal at the moment since the xfrm ICMP handlers are basically
no-ops. In future we can deal with this when we look at ICMP caching
in general.The user-visible change to this is the removal of the TUNNEL Kconfig
prompts. This makes sense because it can only be used through IPCOMP
as it stands.The addition of the new modules shouldn't introduce any problems since
module dependency will cause them to be loaded.Oh and I also turned some unnecessary pskb's in IPv6 related to this
patch to skb's.Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
Sizes in bytes (allyesconfig, i386) and files where those inlines
are used:238 sock_queue_rcv_skb 2.6.16/net/x25/x25_in.o
238 sock_queue_rcv_skb 2.6.16/net/rose/rose_in.o
238 sock_queue_rcv_skb 2.6.16/net/packet/af_packet.o
238 sock_queue_rcv_skb 2.6.16/net/netrom/nr_in.o
238 sock_queue_rcv_skb 2.6.16/net/llc/llc_sap.o
238 sock_queue_rcv_skb 2.6.16/net/llc/llc_conn.o
238 sock_queue_rcv_skb 2.6.16/net/irda/af_irda.o
238 sock_queue_rcv_skb 2.6.16/net/ipx/af_ipx.o
238 sock_queue_rcv_skb 2.6.16/net/ipv6/udp.o
238 sock_queue_rcv_skb 2.6.16/net/ipv6/raw.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/udp.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/raw.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/ipmr.o
238 sock_queue_rcv_skb 2.6.16/net/econet/econet.o
238 sock_queue_rcv_skb 2.6.16/net/econet/af_econet.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/sco.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/l2cap.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/hci_sock.o
238 sock_queue_rcv_skb 2.6.16/net/ax25/ax25_in.o
238 sock_queue_rcv_skb 2.6.16/net/ax25/af_ax25.o
238 sock_queue_rcv_skb 2.6.16/net/appletalk/ddp.o
238 sock_queue_rcv_skb 2.6.16/drivers/net/pppoe.o276 sk_receive_skb 2.6.16/net/decnet/dn_nsp_in.o
276 sk_receive_skb 2.6.16/net/dccp/ipv6.o
276 sk_receive_skb 2.6.16/net/dccp/ipv4.o
276 sk_receive_skb 2.6.16/net/dccp/dccp_ipv6.o
276 sk_receive_skb 2.6.16/drivers/net/pppoe.o209 sk_dst_check 2.6.16/net/ipv6/ip6_output.o
209 sk_dst_check 2.6.16/net/ipv4/udp.o
209 sk_dst_check 2.6.16/net/decnet/dn_nsp_out.oLarge inlines with multiple callers:
Size Uses Wasted Name and definition
===== ==== ====== ================================================
238 21 4360 sock_queue_rcv_skb include/net/sock.h
109 10 801 sock_recv_timestamp include/net/sock.h
276 4 768 sk_receive_skb include/net/sock.h
94 8 518 __sk_dst_check include/net/sock.h
209 3 378 sk_dst_check include/net/sock.h
131 4 333 sk_setup_caps include/net/sock.h
152 2 132 sk_stream_alloc_pskb include/net/sock.h
125 2 105 sk_stream_writequeue_purge include/net/sock.hSigned-off-by: Andrew Morton
Signed-off-by: David S. Miller -
Just use a local econet_mutex instead.
Signed-off-by: David S. Miller
-
Fix kernel oopses whenever somebody issues compatible ioctl on AppleTalk,
Econet, IPX or IRDA socket. For AppleTalk/Econet/IRDA it restores state
in which these sockets were before compat_ioctl was introduced to the socket
ops, for IPX it implements support for 4 ioctls which were not implemented
before - as these ioctls use structures which match between 32bit and 64bit
userspace, no special code is needed, just call 64bit ioctl handler.Signed-off-by: Petr Vandrovec
Signed-off-by: David S. Miller -
Fix a lot of typos. Eyeballed by jmc@ in OpenBSD.
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is a conversion to make the various file_operations structs in fs/
const. Basically a regexp job, with a few manual fixupsThe goal is both to increase correctness (harder to accidentally write to
shared datastructures) and reducing the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read only and thus
cache clean)Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Mark the f_ops members of inodes as const, as well as fix the
ripple-through this causes by places that copy this f_ops and then "do
stuff" with it.Signed-off-by: Arjan van de Ven
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 Mar, 2006
16 commits
-
We don't make much of an attempt to fall back to lower rates, and 54M
just isn't reliable enough for many people. In fact, it's not clear we
even set it to 11M if we're trying to associate with an 802.11b AP.This patch makes us default to 11M, which ought to work for most people.
When we actually handle dynamic rate adjustment, we can reconsider the
defaults -- but even then, probably it makes as much sense to start at
11M and adjust it upwards as it does to start at 54M and reduce it.Signed-off-by: David Woodhouse
Signed-off-by: John W. Linville -
It currently takes something like 8 seconds to do a scan, because we
spend half a second on each channel. Reduce that time to 20ms per
channel.Signed-off-by: David Woodhouse
Signed-off-by: John W. Linville -
The attached patch removes a potential problem from ieee80211_wx.c, by changing the name of routine
ipw2100_translate_scan to ieee80211_translate_scan. The problem is minor as the routine is declared
static; however, if it were made global, it would pollute the namespace.Signed-Off-By: Larry Finger
Signed-off-by: John W. Linville -
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NET]: drop duplicate assignment in request_sock
[IPSEC]: Fix tunnel error handling in ipcomp6 -
The kernel's implementation of notifier chains is unsafe. There is no
protection against entries being added to or removed from a chain while the
chain is in use. The issues were discussed in this thread:http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2
We noticed that notifier chains in the kernel fall into two basic usage
classes:"Blocking" chains are always called from a process context
and the callout routines are allowed to sleep;"Atomic" chains can be called from an atomic context and
the callout routines are not allowed to sleep.We decided to codify this distinction and make it part of the API. Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name). New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain. The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed. For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections. (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)There are some limitations, which should not be too hard to live with. For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem. Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain. (This did happen in a couple of places and the code
had to be changed to avoid it.)Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization. Instead we use RCU. The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.Here is the list of chains that we adjusted and their classifications. None
of them use the raw API, so for the moment it is only a placeholder.ATOMIC CHAINS
-------------
arch/i386/kernel/traps.c: i386die_chain
arch/ia64/kernel/traps.c: ia64die_chain
arch/powerpc/kernel/traps.c: powerpc_die_chain
arch/sparc64/kernel/traps.c: sparc64die_chain
arch/x86_64/kernel/traps.c: die_chain
drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list
kernel/panic.c: panic_notifier_list
kernel/profile.c: task_free_notifier
net/bluetooth/hci_core.c: hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain
net/ipv6/addrconf.c: inet6addr_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain
net/netlink/af_netlink.c: netlink_chainBLOCKING CHAINS
---------------
arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain
arch/s390/kernel/process.c: idle_chain
arch/x86_64/kernel/process.c idle_notifier
drivers/base/memory.c: memory_chain
drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list
drivers/macintosh/adb.c: adb_client_list
drivers/macintosh/via-pmu.c sleep_notifier_list
drivers/macintosh/via-pmu68k.c sleep_notifier_list
drivers/macintosh/windfarm_core.c wf_client_list
drivers/usb/core/notify.c usb_notifier_list
drivers/video/fbmem.c fb_notifier_list
kernel/cpu.c cpu_chain
kernel/module.c module_notify_list
kernel/profile.c munmap_notifier
kernel/profile.c task_exit_notifier
kernel/sys.c reboot_notifier_list
net/core/dev.c netdev_chain
net/decnet/dn_dev.c: dnaddr_chain
net/ipv4/devinet.c: inetaddr_chainIt's possible that some of these classifications are wrong. If they are,
please let us know or submit a patch to fix them. Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern
Signed-off-by: Chandra Seetharaman
Signed-off-by: Jes Sorensen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We shouldn't really compare &new->h with anything when new ==NULL, and gather
three different if statements that all startif (rv ...
into one large if.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We can now make some code static.
Signed-off-by: Adrian Bunk
Cc: Neil Brown
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
.. it makes some of the code nicer.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Cache_fresh is now only used in cache.c, so unexport it.
Part of cache_fresh (setting CACHE_VALID) should really be done under the
lock, while part (calling cache_revisit_request etc) must be done outside the
lock. So we split it up appropriately.Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
- in cache_check, h must be non-NULL as it has been de-referenced,
so don't bother checking for NULL.- When a cache-item is updated, we need to call cache_revisit_request to see
if there is a pending request waiting for that item. We were using
a transition to CACHE_VALID to see if that was needed, however that is
wrong as an expired entry will still be marked 'valid' (as the data is valid
and will need to be released). So instead use an off transition for
CACHE_PENDING which is exactly the right thing to test.- Add a little bit more debugging info.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The C++-like 'template' approach proves to be too ugly and hard to work with.
The old 'template' won't go away until all users are updated.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
These were an unnecessary wart. Also only have one 'DefineSimpleCache..'
instead of two.Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The 'auth_domain's are simply handles on internal data structures. They do
not cache information from user-space, and forcing them into the mold of a
'cache' misrepresents their true nature and causes confusion.Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 Mar, 2006
4 commits
-
Just noticed that request_sock.[ch] contain a useless assignment of
rskq_accept_head to itself. I assume this is a typo and the 2nd one
was supposed to be _tail. However, setting _tail to NULL is not
needed, so the patch below just drops the 2nd assignment.Signed-off-By: Norbert Kiesel
Signed-off-by: Adrian Bunk
Signed-off-by: David S. Miller -
The error handling in ipcomp6_tunnel_create is broken in two ways:
1) If we fail to allocate an SPI (this should never happen in practice
since there are plenty of 32-bit SPI values for us to use), we will
still go ahead and create the SA.2) When xfrm_init_state fails, we first of all may trigger the BUG_TRAP
in __xfrm_state_destroy because we didn't set the state to DEAD. More
importantly we end up returning the freed state as if we succeeded!This patch fixes them both.
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
Modify well over a dozen mempool users to call mempool_create_slab_pool()
rather than calling mempool_create() with extra arguments, saving about 30
lines of code and increasing readability.Signed-off-by: Matthew Dobson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.Signed-off-by: Ingo Molnar
Cc: Dave Jones
Cc: Paul Mackerras
Cc: Ralf Baechle
Cc: Jens Axboe
Cc: Neil Brown
Acked-by: Alasdair G Kergon
Cc: Greg KH
Cc: Dominik Brodowski
Cc: Adam Belay
Cc: Martin Schwidefsky
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 Mar, 2006
6 commits
-
* 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
[PATCH] fix audit_init failure path
[PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
[PATCH] sem2mutex: audit_netlink_sem
[PATCH] simplify audit_free() locking
[PATCH] Fix audit operators
[PATCH] promiscuous mode
[PATCH] Add tty to syscall audit records
[PATCH] add/remove rule update
[PATCH] audit string fields interface + consumer
[PATCH] SE Linux audit events
[PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
[PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
[PATCH] Fix IA64 success/failure indication in syscall auditing.
[PATCH] Miscellaneous bug and warning fixes
[PATCH] Capture selinux subject/object context information.
[PATCH] Exclude messages by message type
[PATCH] Collect more inode information during syscall processing.
[PATCH] Pass dentry, not just name, in fsnotify creation hooks.
[PATCH] Define new range of userspace messages.
[PATCH] Filter rule comparators
...Fixed trivial conflict in security/selinux/hooks.c
-
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (103 commits)
SUNRPC,RPCSEC_GSS: spkm3--fix config dependencies
SUNRPC,RPCSEC_GSS: spkm3: import contexts using NID_cast5_cbc
LOCKD: Make nlmsvc_traverse_shares return void
LOCKD: nlmsvc_traverse_blocks return is unused
SUNRPC,RPCSEC_GSS: fix krb5 sequence numbers.
NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.
SUNRPC,RPCSEC_GSS: remove unnecessary kmalloc of a checksum
SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
SUNRPC: Fix memory barriers for req->rq_received
NFS: Fix a race in nfs_sync_inode()
NFS: Clean up nfs_flush_list()
NFS: Fix a race with PG_private and nfs_release_page()
NFSv4: Ensure the callback daemon flushes signals
SUNRPC: Fix a 'Busy inodes' error in rpc_pipefs
NFS, NLM: Allow blocking locks to respect signals
NFS: Make nfs_fhget() return appropriate error values
NFSv4: Fix an oops in nfs4_fill_super
lockd: blocks should hold a reference to the nlm_file
NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE
NFSv4: Send the delegation stateid for SETATTR calls
... -
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NETFILTER] x_table.c: sem2mutex
[IPV4]: Aggregate route entries with different TOS values
[TCP]: Mark tcp_*mem[] __read_mostly.
[TCP]: Set default max buffers from memory pool size
[SCTP]: Fix up sctp_rcv return value
[NET]: Take RTNL when unregistering notifier
[WIRELESS]: Fix config dependencies.
[NET]: Fill in a 32-bit hole in struct sock on 64-bit platforms.
[NET]: Ensure device name passed to SO_BINDTODEVICE is NULL terminated.
[MODULES]: Don't allow statically declared exports
[BRIDGE]: Unaligned accesses in the ethernet bridge -
Implement the half-closed devices notifiation, by adding a new POLLRDHUP
(and its alias EPOLLRDHUP) bit to the existing poll/select sets. Since the
existing POLLHUP handling, that does not report correctly half-closed
devices, was feared to be changed, this implementation leaves the current
POLLHUP reporting unchanged and simply add a new bit that is set in the few
places where it makes sense. The same thing was discussed and conceptually
agreed quite some time ago:http://lkml.org/lkml/2003/7/12/116
Since this new event bit is added to the existing Linux poll infrastruture,
even the existing poll/select system calls will be able to use it. As far
as the existing POLLHUP handling, the patch leaves it as is. The
pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
archs and sets the bit in the six relevant files. The other attached diff
is the simple change required to sys/epoll.h to add the EPOLLRDHUP
definition.There is "a stupid program" to test POLLRDHUP delivery here:
http://www.xmailserver.org/pollrdhup-test.c
It tests poll(2), but since the delivery is same epoll(2) will work equally.
Signed-off-by: Davide Libenzi
Cc: "David S. Miller"
Cc: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
net/rxrpc/main.c: In function `rxrpc_initialise':
net/rxrpc/main.c:83: warning: label `error_proc' defined but not usedSigned-off-by: Jesper Juhl
Acked-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Implement /proc/slab_allocators. It produces output like:
idr_layer_cache: 80 idr_pre_get+0x33/0x4e
buffer_head: 2555 alloc_buffer_head+0x20/0x75
mm_struct: 9 mm_alloc+0x1e/0x42
mm_struct: 20 dup_mm+0x36/0x370
vm_area_struct: 384 dup_mm+0x18f/0x370
vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
vm_area_struct: 1 split_vma+0x5a/0x10e
vm_area_struct: 11 do_brk+0x206/0x2e2
vm_area_struct: 2 copy_vma+0xda/0x142
vm_area_struct: 9 setup_arg_pages+0x99/0x214
fs_cache: 8 copy_fs_struct+0x21/0x133
fs_cache: 29 copy_process+0xf38/0x10e3
files_cache: 30 alloc_files+0x1b/0xcf
signal_cache: 81 copy_process+0xbaa/0x10e3
sighand_cache: 77 copy_process+0xe65/0x10e3
sighand_cache: 1 de_thread+0x4d/0x5f8
anon_vma: 241 anon_vma_prepare+0xd9/0xf3
size-2048: 1 add_sect_attrs+0x5f/0x145
size-2048: 2 journal_init_revoke+0x99/0x302
size-2048: 2 journal_init_revoke+0x137/0x302
size-2048: 2 journal_init_inode+0xf9/0x1c4Cc: Manfred Spraul
Cc: Alexander Nyberg
Cc: Pekka Enberg
Cc: Christoph Lameter
Cc: Ravikiran Thirumalai
Signed-off-by: Al Viro
DESC
slab-leaks3-locking-fix
EDESC
From: Andrew MortonUpdate for slab-remove-cachep-spinlock.patch
Cc: Al Viro
Cc: Manfred Spraul
Cc: Alexander Nyberg
Cc: Pekka Enberg
Cc: Christoph Lameter
Cc: Ravikiran Thirumalai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
25 Mar, 2006
3 commits
-
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.Signed-off-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller -
When we get an ICMP need-to-frag message, the original TOS value in the
ICMP payload cannot be used as a key to look up the routes to update.
This is because the TOS field may have been modified by routers on the
way. Similarly, ip_rt_redirect should also ignore the TOS as the router
that gave us the message may have modified the TOS value.The patch achieves this objective by aggregating entries with different
TOS values (but are otherwise identical) into the same bucket. This
makes it easy to update them at the same time when an ICMP message is
received.In future we should use a twin-hashing scheme where teh aggregation
occurs at the entry level. That is, the TOS goes back into the hash
for normal lookups while ICMP lookups will end up with a node that
gives us a list that contains all other route entries that differ
only by TOS.Signed-off-by: Ilia Sotnikov
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
Suggested by Stephen Hemminger.
Signed-off-by: David S. Miller