27 Jun, 2005

5 commits

  • From:

    $subject was fixed in 2.4 already, 2.6 needs it as well.

    The impact of the bugs is a kernel stack overflow and privilege escalation
    from CAP_NET_ADMIN via the IP_VS_SO_SET_STARTDAEMON/IP_VS_SO_GET_DAEMON
    ioctls. People running with 'root=all caps' (i.e., most users) are not
    really affected (there's nothing to escalate), but SELinux and similar
    users should take it seriously if they grant CAP_NET_ADMIN to other users.

    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    pageexec
     
  • 1) netlink_release() should only decrement the hash entry
    count if the socket was actually hashed.

    This was causing hash->entries to underflow, which
    resulted in all kinds of trouble.

    On 64-bit systems, this would cause the following
    conditional to erroneously trigger:

        err = -ENOMEM;
        if (BITS_PER_LONG > 32 && unlikely(hash->entries >= UINT_MAX))
                goto err;

    2) netlink_autobind() needs to propagate the error return from
    netlink_insert(). Otherwise, callers will not see the error
    as they should and thus try to operate on a socket with a zero pid,
    which is very bad.

    However, it should not propagate -EBUSY. If two threads race
    to autobind the socket, that is fine. This is consistent with the
    autobind behavior in other protocols.

    So bug #1 above, combined with this one, resulted in hangs
    on netlink_sendmsg() calls to the rtnetlink socket. We'd try
    to do the user sendmsg() with the socket's pid set to zero,
    later we do a socket lookup using that pid (via the value we
    stashed away in NETLINK_CB(skb).pid), but that won't give us the
    user socket, it will give us the rtnetlink socket. So when we
    try to wake up the receive queue, we dive back into rtnetlink_rcv()
    which tries to recursively take the rtnetlink semaphore.

    Thanks to Jakub Jelinek for providing backtraces. Also, thanks to
    Herbert Xu for supplying debugging patches to help track this down,
    and also finding a mistake in an earlier version of this fix.

    Signed-off-by: David S. Miller

    David S. Miller
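A minimal userspace sketch of the two fixes, with hypothetical simplified stand-ins for the kernel structures (the real code lives in net/netlink/af_netlink.c): release only decrements the entry count when the socket was actually hashed, and autobind propagates every insert error except -EBUSY.

```c
#include <errno.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct nl_table { unsigned int entries; };
struct nl_sock  { int hashed; int pid; };

/* Fix #1: only decrement the entry count if the socket was hashed,
 * so table->entries can no longer underflow. */
static void nl_release(struct nl_table *table, struct nl_sock *sk)
{
        if (sk->hashed) {
                table->entries--;
                sk->hashed = 0;
        }
}

/* Fix #2: propagate the error from insert, except -EBUSY --
 * losing an autobind race to another thread is fine. */
static int nl_autobind(int insert_err)
{
        if (insert_err == -EBUSY)
                return 0;
        return insert_err;
}

/* Quick demonstration: start with one hashed entry, release an
 * unhashed socket (no-op) and a hashed one (decrements to zero). */
static unsigned int demo_entries(void)
{
        struct nl_table t = { 1 };
        struct nl_sock unhashed = { 0, 0 }, hashed = { 1, 0 };

        nl_release(&t, &unhashed);
        nl_release(&t, &hashed);
        return t.entries;
}
```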
     
  • Signed-off-by: Robert Olsson
    Signed-off-by: David S. Miller

    Robert Olsson
     
  • It doesn't seem to make much sense to let an "If unsure, say N." option
    default to y.

    Signed-off-by: Adrian Bunk
    Signed-off-by: David S. Miller

    Adrian Bunk
     
    Since it is tristate when we offer it as a choice, we should
    define it as tristate when forcing it as the default as well.
    Otherwise kconfig warns.

    Signed-off-by: David S. Miller

    David S. Miller
     

26 Jun, 2005

2 commits

  • Linus Torvalds
     
  • 1. Establish a simple API for process freezing defined in linux/include/sched.h:

    frozen(process)          Check whether a process is frozen
    freezing(process)        Check whether a process is being frozen
    freeze(process)          Tell a process to freeze (go to the refrigerator)
    thaw_process(process)    Restart a process
    frozen_process(process)  Mark a process as now frozen

    2. Remove all references to PF_FREEZE and PF_FROZEN from all
    kernel sources except sched.h

    3. Fix numerous locations where try_to_freeze is manually done by a driver

    4. Remove the argument that is no longer necessary from two function calls.

    5. Some whitespace cleanup

    6. Close a potential race in the refrigerator (there was an open window in
    which PF_FREEZE was cleared before PF_FROZEN was set, and
    recalc_sigpending does not check PF_FROZEN).

    This patch does not address the problem of freeze_processes() violating the rule
    that a task may only modify its own flags by setting PF_FREEZE. This is not clean
    in an SMP environment. freeze(process) is therefore not SMP safe!

    Signed-off-by: Christoph Lameter
    Signed-off-by: Linus Torvalds

    Christoph Lameter
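The API above can be modeled in plain C with two flag bits. This is a userspace sketch with hypothetical flag values, not the kernel's definitions in sched.h:

```c
/* Hypothetical flag values, for illustration only. */
#define PF_FREEZE 0x1   /* process has been told to freeze */
#define PF_FROZEN 0x2   /* process is now in the refrigerator */

struct task { unsigned long flags; };

static int frozen(struct task *p)        { return (p->flags & PF_FROZEN) != 0; }
static int freezing(struct task *p)      { return (p->flags & PF_FREEZE) != 0; }
static void freeze(struct task *p)       { p->flags |= PF_FREEZE; }
static void thaw_process(struct task *p) { p->flags &= ~PF_FROZEN; }

/* Clear FREEZE and set FROZEN in one step: the race the patch closes
 * was a window where PF_FREEZE was cleared before PF_FROZEN was set. */
static void frozen_process(struct task *p)
{
        p->flags = (p->flags & ~PF_FREEZE) | PF_FROZEN;
}

/* Walk a task through one freeze/thaw cycle; returns 0 if each
 * predicate reports the expected state at each step. */
static int demo_freeze_cycle(void)
{
        struct task t = { 0 };

        freeze(&t);
        if (!freezing(&t) || frozen(&t))
                return -1;
        frozen_process(&t);
        if (freezing(&t) || !frozen(&t))
                return -1;
        thaw_process(&t);
        if (frozen(&t))
                return -1;
        return 0;
}
```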
     

25 Jun, 2005

4 commits


24 Jun, 2005

23 commits

  • Linus Torvalds
     
  • Another rollup of patches which give various symbols static scope

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • rpc_create_client was modified recently to do its own (synchronous) NULL ping
    of the server. We'd rather do that on our own, asynchronously, so that we
    don't have to block the nfsd thread doing the probe, and so that setclientid
    handling (hence, client mounts) can proceed normally whether the callback is
    successful or not. (We can still function fine without the callback
    channel--we just won't be able to give out delegations till it's verified to
    work.)

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     
  • Finds a pattern in the skb data according to the specified
    textsearch configuration. Use textsearch_next() to retrieve
    subsequent occurrences of the pattern. Returns the offset
    to the first occurrence or UINT_MAX if no match was found.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     
    Implements sequential reading for both linear and non-linear
    skb data at zerocopy cost. The data is returned in chunks of
    arbitrary length; random access is therefore not possible.

    Usage:
        from := 0
        to := 128
        state := undef
        data := undef
        len := undef
        consumed := 0

        skb_prepare_seq_read(skb, from, to, &state)
        while (len = skb_seq_read(consumed, &data, &state)) != 0 do
                /* do something with 'data' of length 'len' */
                if abort then
                        /* abort read if we don't wait for
                         * skb_seq_read() to return 0 */
                        skb_abort_seq_read(&state)
                        return
                endif
                /* not necessary to consume all of 'len' */
                consumed += len
        done

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
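The same sequential-read pattern over fragmented data can be sketched in userspace. The names here are ours; the real kernel API is skb_prepare_seq_read() / skb_seq_read() / skb_abort_seq_read():

```c
#include <stddef.h>

/* Fragmented buffer: an array of chunks, a stand-in for a
 * non-linear skb. */
struct frag { const char *data; size_t len; };

struct seq_state {
        const struct frag *frags;
        size_t nfrags;
        size_t idx;             /* current fragment */
};

static void seq_prepare(struct seq_state *st,
                        const struct frag *frags, size_t nfrags)
{
        st->frags = frags;
        st->nfrags = nfrags;
        st->idx = 0;
}

/* Return the next chunk. Chunks have arbitrary length, so random
 * access is not possible. Returns 0 when the data is exhausted. */
static size_t seq_read(struct seq_state *st, const char **data)
{
        if (st->idx >= st->nfrags)
                return 0;
        *data = st->frags[st->idx].data;
        return st->frags[st->idx++].len;
}

/* Consume a three-fragment buffer and report the total bytes seen. */
static size_t demo_total(void)
{
        static const struct frag frags[] = {
                { "abc", 3 }, { "de", 2 }, { "fghi", 4 },
        };
        struct seq_state st;
        const char *data;
        size_t len, total = 0;

        seq_prepare(&st, frags, 3);
        while ((len = seq_read(&st, &data)) != 0)
                total += len;   /* 'data' points at each chunk in turn */
        return total;
}
```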
     
  • Allow using setsockopt to set TCP congestion control to use on a per
    socket basis.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
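A minimal userspace example of the new per-socket setting; the TCP_CONGESTION socket option later appeared in the uapi headers (value 13 in linux/tcp.h). The helper names are ours, and "reno" is assumed to be an allowed algorithm:

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

#ifndef TCP_CONGESTION
#define TCP_CONGESTION 13       /* value from linux/tcp.h */
#endif

/* Set the congestion control algorithm on one socket.
 * Returns 0 on success, -1 on failure. */
static int set_congestion(int fd, const char *name)
{
        return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
                          name, strlen(name));
}

/* Try the option on a fresh TCP socket. Returns 0 on success, and
 * also 0 when sockets or the option are unavailable in this
 * environment, so the sketch degrades safely. */
static int demo(void)
{
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int ret;

        if (fd < 0)
                return 0;       /* no networking here: skip */
        ret = set_congestion(fd, "reno");
        close(fd);
        if (ret != 0 && (errno == ENOPROTOOPT || errno == EPERM))
                return 0;       /* option unsupported here: skip */
        return ret;
}
```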
     
  • Separate out the two uses of netdev_max_backlog. One controls the
    upper bound on packets processed per softirq, the new name for this is
    netdev_budget; the other controls the limit on packets queued via
    netif_rx.

    Increase the max_backlog default to account for faster processors.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
    Eliminate the throttling behaviour when the netif receive queue fills,
    because it behaves badly when using high-speed networks under load.
    The throttling causes multiple packet drops that force TCP into
    slow-start mode. The same effective patch has been part of BIC TCP and
    H-TCP as well as part of Web100.

    The existing code drops hundreds of packets when the queue fills;
    this changes it to individual-packet drop-tail.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Remove the congestion sensing mechanism from netif_rx, and always
    return either full or empty. Almost no driver checks the return value
    from netif_rx, and those that do only use it for debug messages.

    The original design of netif_rx was to do flow control based on the
    receive queue, but NAPI has supplanted this and no driver uses the
    feedback.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
    Remove the last vestiges of the fastroute code, which is no longer used.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This patch implements Tom Kelly's Scalable TCP congestion control algorithm
    for the modular framework.

    The algorithm has some nice scaling properties, and has been used a fair bit
    in research, though is known to have significant fairness issues, so it's not
    really suitable for general purpose use.

    Signed-off-by: John Heffner
    Signed-off-by: David S. Miller

    John Heffner
     
    H-TCP is a congestion control algorithm developed at the Hamilton Institute by
    Douglas Leith and Robert Shorten. It extends the standard Reno algorithm
    with mode switching and is thus a relatively simple modification.

    H-TCP is defined in a layered manner, as it is still a research platform. The
    basic form includes the modification of beta according to the ratio of maxRTT
    to minRTT and the alpha = 2*factor*(1-beta) relation, where factor is dependent
    on the time since the last congestion event.

    The other layers improve convergence by adding appropriate factors to alpha.

    The following patch implements the H-TCP algorithm in its basic form.

    Signed-off-by: Baruch Even
    Signed-off-by: David S. Miller

    Baruch Even
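The basic-form relation quoted above, alpha = 2*factor*(1-beta), as a small numeric sketch; the factor and beta values in the usage note are illustrative only:

```c
/* H-TCP's basic form: alpha = 2 * factor * (1 - beta), where factor
 * grows with the time since the last congestion event and beta is
 * derived from the ratio of maxRTT to minRTT. */
static double htcp_alpha(double factor, double beta)
{
        return 2.0 * factor * (1.0 - beta);
}
```

For example, with factor = 1 and beta = 0.5 (Reno-like backoff), alpha comes out to 1, matching Reno's additive increase.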
     
  • TCP Vegas code modified for the new TCP infrastructure.
    Vegas now uses microsecond resolution timestamps for
    better estimation of performance over higher speed links.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • TCP Hybla congestion avoidance.

    - "In heterogeneous networks, TCP connections that incorporate a
    terrestrial or satellite radio link are greatly disadvantaged with
    respect to entirely wired connections, because of their longer round
    trip times (RTTs). To cope with this problem, a new TCP proposal, the
    TCP Hybla, is presented and discussed in the paper[1]. It stems from an
    analytical evaluation of the congestion window dynamics in the TCP
    standard versions (Tahoe, Reno, NewReno), which suggests the necessary
    modifications to remove the performance dependence on RTT.[...]"[1]

    [1]: Carlo Caini, Rosario Firrincieli, "TCP Hybla: a TCP enhancement for
    heterogeneous networks",
    International Journal of Satellite Communications and Networking
    Volume 22, Issue 5 , Pages 547 - 566. September 2004.

    Signed-off-by: Daniele Lacamera (root at danielinux.net)
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Daniele Lacamera
     
  • Sally Floyd's high speed TCP congestion control.
    This is useful for comparison and research.

    Signed-off-by: John Heffner
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    John Heffner
     
  • This is the existing 2.6.12 Westwood code moved from tcp_input
    to the new congestion framework. A lot of the inline functions
    have been eliminated to try to make it clearer.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • TCP BIC congestion control reworked to use the new congestion control
    infrastructure. This version is more up to date than the BIC
    code in 2.6.12; it incorporates enhancements from BICTCP 1.1,
    to handle low latency links.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Enhancement to the tcp_diag interface used by the iproute2 ss command
    to report the tcp congestion control being used by a socket.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Allow TCP to have multiple pluggable congestion control algorithms.
    Algorithms are defined by a set of operations and can be built in
    or modules. The legacy "new RENO" algorithm is used as a starting
    point and fallback.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This patch creates a new kstrdup library function and changes the "local"
    implementations in several places to use this function.

    Most of the changes come from the sound and net subsystems. The sound part
    had already been acknowledged by Takashi Iwai and the net part by David S.
    Miller.

    I left UML alone for now because I would need more time to read the code
    carefully before making changes there.

    Signed-off-by: Paulo Marques
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paulo Marques
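The new helper is essentially strlen + allocate + memcpy. A userspace sketch using malloc in place of kmalloc (the kernel version also takes a gfp_t flags argument):

```c
#include <stdlib.h>
#include <string.h>

/* Userspace model of kstrdup(): duplicate a NUL-terminated string
 * into freshly allocated memory; returns NULL on failure or when
 * given a NULL pointer. */
static char *kstrdup_model(const char *s)
{
        size_t len;
        char *buf;

        if (!s)
                return NULL;
        len = strlen(s) + 1;    /* include the terminating NUL */
        buf = malloc(len);
        if (buf)
                memcpy(buf, s, len);
        return buf;
}
```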
     

23 Jun, 2005

6 commits

  • Linus Torvalds
     
  • This patch is a follow up to patch 1 regarding "Selective Sub Address
    matching with call user data". It allows use of the Fast-Select-Acceptance
    optional user facility for X.25.

    This patch just implements fast select with no restriction on response
    (NRR). What this means (according to ITU-T Recommendation 10/96 section
    6.16) is that if in an incoming call packet, the relevant facility bits are
    set for fast-select-NRR, then the called DTE can issue a direct response to
    the incoming packet using a call-accepted packet that contains
    call-user-data. This patch allows such a response.

    The called DTE can also respond with a clear-request packet that contains
    call-user-data. However, this feature is currently not implemented by the
    patch.

    How is Fast Select Acceptance used?

    By default, the system does not allow fast select acceptance (as before).
    To enable a response to fast select acceptance, first create and bind a
    listen socket:

        call_soc = socket(AF_X25, SOCK_SEQPACKET, 0);
        bind(call_soc, (struct sockaddr *)&locl_addr, sizeof(locl_addr));

    Before the listen system call is made, approve call acceptance with:

        ioctl(call_soc, SIOCX25CALLACCPTAPPRV);

    Now the listen system call can be made:

        listen(call_soc, 4);

    After this, an incoming-call packet will be accepted, but no call-accepted
    packet will be sent back until the following ioctl is issued on the socket
    that accepts the call:

        ioctl(vc_soc, SIOCX25SENDCALLACCPT);

    The network (or the cisco xot router used for testing here) will allow the
    application server's call-user-data in the call-accepted packet,
    provided the call-request was made with Fast-select NRR.

    Signed-off-by: Shaun Pereira
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Shaun Pereira
     
  • From: Shaun Pereira

    This is the first (independent of the second) patch of two that I am
    working on with x25 on linux (tested with xot on a cisco router). Details
    are as follows.

    Current state of module:

    A server using the current implementation (2.6.11.7) of the x25 module will
    accept a call request/ incoming call packet at the listening x.25 address,
    from all callers to that address, as long as NO call user data is present
    in the packet header.

    If the server needs to choose to accept a particular call request/ incoming
    call packet arriving at its listening x25 address, then the kernel has to
    allow a match of call user data present in the call request packet with its
    own. This is required when multiple servers listen at the same x25 address
    and device interface. The kernel currently matches ALL call user data, if
    present.

    Current Changes:

    This patch is a follow up to the patch submitted previously by Andrew
    Hendry, and allows the user to selectively control the number of octets of
    call user data in the call request packet, that the kernel will match. By
    default no call user data is matched, even if call user data is present.
    To allow call user data matching, a cudmatchlength > 0 has to be passed
    into the kernel after which the passed number of octets will be matched.
    Otherwise the kernel behaves exactly as in the original implementation.

    This patch also ensures that as is normally the case, no call user data
    will be present in the Call accepted / call connected packet sent back to
    the caller.

    Future Changes on next patch:

    There are cases however when call user data may be present in the call
    accepted packet. According to the X.25 recommendation (ITU-T 10/96)
    section 5.2.3.2 call user data may be present in the call accepted packet
    provided the fast select facility is used. My next patch will include this
    fast select utility and the ability to send up to 128 octets call user data
    in the call accepted packet provided the fast select facility is used. I
    am currently testing this, again with xot on linux and cisco.

    Signed-off-by: Shaun Pereira

    (With a fix from Alexey Dobriyan )
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Shaun Pereira
     
  • From: jlamanna@gmail.com

    ebtables.c vfree() checking cleanups.

    Signed-off-by: James Lamanna
    Signed-off-by: Domen Puncer
    Signed-off-by: David S. Miller

    James Lamanna
     
  • From: Nishanth Aravamudan

    Use msleep() instead of schedule_timeout() to guarantee the task
    delays as expected. The current code is not wrong, but it does not account for
    early return due to signals, so I think msleep() should be appropriate.

    Signed-off-by: Nishanth Aravamudan
    Signed-off-by: Domen Puncer
    Signed-off-by: David S. Miller

    Nishanth Aravamudan
     
    Signed-off-by: Chuck Short
    Signed-off-by: David S. Miller

    Chuck Short