04 Jan, 2006

33 commits

  • Upcoming patches will make, for instance, ip_sockglue.c need just this enum
    and not all of tcp.h.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Renaming it to inet6_hash_connect, making it possible to ditch
    dccp_v6_hash_connect and share the same code with TCP instead.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Renaming it to inet_hash_connect, making it possible to ditch
    dccp_v4_hash_connect and share the same code with TCP instead.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • So that we can share several timewait sockets related functions and
    make the timewait mini sockets infrastructure closer to the request
    mini sockets one.

    Next changesets will take advantage of this, moving more code out of
    TCP and DCCP v4 and v6 to common infrastructure.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Now we have the destructor (dccp_v4_reqsk_destructor) in our
    request_sock_ops vtable.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Still needs mucho polishing, especially in the checksum code, but works
    just fine, inet_diag/iproute2 and all 8)

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • It was already non-TCP specific, will be used by DCCPv6.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Basically exports a similar set of functions as the one exported by
    the non-AF specific TCP code.

    In the process moved some non-AF specific code from dccp_v4_connect to
    dccp_connect_init and moved the checksum verification from
    dccp_invalid_packet to dccp_v4_rcv, so as to use it in dccp_v6_rcv
    too.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • To match TCP equivalent.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Out of tcp6_timewait_sock, that now is just an aggregation of
    inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct
    inet_timewait_sock, that is common to the IPv6 transport protocols that use
    timewait sockets, like DCCP and TCP.

    tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic
    code to find the IPv6 area in a timewait sock.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
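
The offset idea described in this entry can be illustrated with a minimal userland sketch. The struct names, fields, and the convention that an offset of 0 means "no IPv6 area" are hypothetical stand-ins, not the kernel definitions; only the technique (a stored byte offset that lets generic code find the IPv6 part of an aggregated timewait sock) comes from the commit text.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures (hypothetical fields). */
struct inet_tw_sketch {
    uint16_t tw_ipv6_offset;   /* assumption: 0 means "no IPv6 area" */
    uint32_t tw_daddr;
};

struct inet6_tw_sketch {
    uint8_t tw_v6_daddr[16];
};

/* The protocol-specific timewait sock is just an aggregation of the two. */
struct tcp6_tw_sketch {
    struct inet_tw_sketch  inet;   /* generic part comes first */
    struct inet6_tw_sketch inet6;  /* IPv6 area */
};

/* Generic code locates the IPv6 area without knowing the outer type. */
static struct inet6_tw_sketch *tw_ipv6_area(struct inet_tw_sketch *tw)
{
    if (tw->tw_ipv6_offset == 0)
        return NULL;
    return (struct inet6_tw_sketch *)((unsigned char *)tw + tw->tw_ipv6_offset);
}

/* Protocol code records the offset once, when the timewait sock is set up. */
static void tcp6_tw_init(struct tcp6_tw_sketch *tw)
{
    tw->inet.tw_ipv6_offset = (uint16_t)offsetof(struct tcp6_tw_sketch, inet6);
}
```

A DCCP equivalent would aggregate the same two parts in its own struct and record its own offset, so the generic helper works unchanged.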
     
  • Using sk->sk_protocol instead of IPPROTO_TCP.

    Will be used by DCCPv6 in the next changesets.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • AF_UNIX stream socket performance on P4 CPUs tends to suffer due to a
    lot of pipeline flushes from atomic operations. The patch below
    removes the sock_hold() and sock_put() in unix_stream_sendmsg(). This
    should be safe as the socket still holds a reference to its peer which
    is only released after the file descriptor's final user invokes
    unix_release_sock(). The only consideration is that we must add a
    memory barrier before setting the peer initially.

    Signed-off-by: Benjamin LaHaise
    Signed-off-by: David S. Miller

    Benjamin LaHaise
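
As a rough userland illustration of the ordering concern in this entry, the sketch below publishes a peer pointer with a release store, so the peer's initialization is visible before the pointer is, and lets the send path use the peer without taking an extra reference. All names are hypothetical and C11 atomics stand in for the kernel's barrier primitives; this is not the patch itself.

```c
#include <stdatomic.h>
#include <stddef.h>

struct peer {
    int refs;      /* references held on this peer */
    int rx_msgs;   /* messages delivered to it */
};

struct usock {
    _Atomic(struct peer *) peer;
};

/* Connect time: take the long-lived reference, then publish the pointer.
 * The release store orders the writes above it before the publication. */
static void set_peer(struct usock *s, struct peer *p)
{
    p->refs++;
    atomic_store_explicit(&s->peer, p, memory_order_release);
}

/* Send path: relies on the reference taken at connect time, so it does
 * not touch the reference count at all. Returns 0 on success. */
static int stream_send(struct usock *s)
{
    struct peer *p = atomic_load_explicit(&s->peer, memory_order_acquire);

    if (p == NULL)
        return -1;
    p->rx_msgs++;
    return 0;
}
```

The point of the design is that the per-message path performs no atomic read-modify-write on the reference count; the cost is paid once at connect time.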
     
  • It also looks like there were 2 places where the test on sk_err was
    missing from the event wait logic (in sk_stream_wait_connect and
    sk_stream_wait_memory), while the rest of the sock_error() users look
    to be doing the right thing. This version of the patch fixes those,
    and cleans up a few places that were testing ->sk_err directly.

    Signed-off-by: Benjamin LaHaise
    Signed-off-by: David S. Miller

    Benjamin LaHaise
     
  • This patch removes dead code. I don't see any reason to keep this cruft
    around; it only clutters the nice, functionally working code.

    Signed-off-by: Roberto Nibali
    Signed-off-by: Horms
    Signed-off-by: David S. Miller

    Roberto Nibali
     
  • Since udp_checksum_init always returns 0 there is no point in
    having it return a value.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled
    it is left on the socket receive queue. This means that when we detect
    a checksum error we have to be careful when trying to free the packet
    as someone could have dequeued it in the time being.

    Currently this delicate logic is duplicated three times between UDPv4,
    UDPv6 and RAWv6. This patch moves it into one place and simplifies
    the code somewhat.

    This is based on a suggestion by Eric Dumazet.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
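
The care needed when freeing a peeked packet can be sketched with a toy queue. This is a simplified model under stated assumptions (a single-linked queue, plain integer reference counts, no locking, hypothetical names), not the helper the patch introduces: a peeked packet is only unlinked if it is still at the head of the queue, because another reader may already have dequeued it.

```c
#include <stdbool.h>
#include <stddef.h>

struct pkt {
    struct pkt *next;
    int refs;            /* the queue holds one reference while linked */
};

struct rcv_queue {
    struct pkt *head;
};

/* Drop a packet that failed checksum verification.
 * Returns true if this call removed it from the queue. */
static bool kill_datagram(struct rcv_queue *q, struct pkt *p, bool peeked)
{
    bool unlinked = false;

    if (peeked) {
        /* Real code would hold the queue lock around this check. */
        if (q->head == p) {
            q->head = p->next;
            p->next = NULL;
            p->refs--;   /* drop the queue's reference */
            unlinked = true;
        }
    }
    p->refs--;           /* drop the caller's own reference */
    return unlinked;
}
```

If the packet was already dequeued by someone else, only the caller's reference is dropped and the other reader remains responsible for the packet.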
     
  • And make the core DCCP code AF agnostic, just like TCP. Now it's time
    to work on net/dccp/ipv6.c, we are close to the end!

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Renaming it to inet_csk_addr2sockaddr.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • And move it to struct inet_connection_sock. DCCP will use it in the
    upcoming changesets.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • And inet6_rsk_offset in inet_request_sock, for the same reasons as
    inet_sock's pinfo6 member.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • More work is needed, though, to introduce inet6_request_sock from
    tcp6_request_sock, with the same layout considerations as ipv6_pinfo in
    inet_sock; the next changeset will do that.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Another spin of Herbert Xu's "safer ip reassembly" patch
    for 2.6.16.

    (The original patch is here:
    http://marc.theaimsgroup.com/?l=linux-netdev&m=112281936522415&w=2
    and my only contribution is to have tested it.)

    This patch (optionally) does additional checks before accepting IP
    fragments, which can greatly reduce the possibility of reassembling
    fragments which originated from different IP datagrams.

    Signed-off-by: Herbert Xu
    Signed-off-by: Arthur Kepner
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This makes ebt_log and ebt_ulog use the new nf_log api. This enables
    the bridging packet filter to log packets e.g. via nfnetlink_log.

    Signed-off-by: Bart De Schuymer
    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Bart De Schuymer
     
  • Part of a performance problem with ip_tables is that memory allocation
    is not NUMA aware, but 'only' SMP aware (i.e. each CPU normally touches
    separate cache lines).

    Even with small iptables rules, the cost of this misplacement can be
    high on common workloads. Instead of using one vmalloc() area
    (located in the node of the iptables process), we now allocate an area
    for each possible CPU, using vmalloc_node() so that memory should be
    allocated in the CPU's node if possible.

    Port to arp_tables and ip6_tables by Harald Welte.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
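
The allocation pattern described in this entry (one area per possible CPU, each requested from that CPU's node) might look roughly like the userland sketch below. The CPU count, the CPU-to-node mapping, and sketch_alloc_on_node() are hypothetical; the stand-in allocator only records the requested node and calls malloc(), whereas the commit uses vmalloc_node() in the kernel.

```c
#include <stdlib.h>

#define SKETCH_NR_CPUS 4            /* hypothetical number of possible CPUs */

struct table_area {
    int   node;                     /* node the memory was requested from */
    void *mem;                      /* this CPU's private copy of the table */
};

/* Hypothetical topology: two CPUs per node. */
static int sketch_cpu_to_node(int cpu)
{
    return cpu / 2;
}

/* Userland stand-in for a node-aware allocator; the node is ignored here. */
static void *sketch_alloc_on_node(size_t size, int node)
{
    (void)node;
    return malloc(size);
}

/* Allocate one area per CPU. Returns 0 on success, -1 on failure
 * (in which case everything allocated so far is released). */
static int alloc_per_cpu_tables(struct table_area areas[SKETCH_NR_CPUS], size_t size)
{
    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        areas[cpu].node = sketch_cpu_to_node(cpu);
        areas[cpu].mem = sketch_alloc_on_node(size, areas[cpu].node);
        if (areas[cpu].mem == NULL) {
            while (cpu-- > 0)
                free(areas[cpu].mem);
            return -1;
        }
    }
    return 0;
}
```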
     
  • Replace existing BIC version 1.1 with new version 2.0.
    The main change is to replace the window growth function
    with a cubic function as described in:
    http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/cubic-paper.pdf

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
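
For readers unfamiliar with the cubic growth function mentioned in this entry, it is usually written W(t) = C*(t - K)^3 + W_max, where W_max is the window at the last loss and K is the time needed to climb back to it. The sketch below is floating-point userland code with illustrative parameters, not the fixed-point kernel implementation; the function names and the convention that `beta` is the fraction by which the window is cut are assumptions.

```c
/* Cube root by Newton iteration, so the sketch does not depend on libm. */
static double cube_root(double a)
{
    double x = a > 1.0 ? a : 1.0;   /* start at or above the root */

    if (a <= 0.0)
        return 0.0;
    for (int i = 0; i < 200; i++)
        x = (2.0 * x + a / (x * x)) / 3.0;
    return x;
}

/* Time at which the window returns to w_max, assuming the window was
 * reduced to w_max * (1 - beta) at t = 0. */
static double cubic_k(double w_max, double beta, double c)
{
    return cube_root(w_max * beta / c);
}

/* Window t time units after the last reduction: C*(t - K)^3 + w_max. */
static double cubic_window(double t, double w_max, double beta, double c)
{
    double d = t - cubic_k(w_max, beta, c);

    return c * d * d * d + w_max;
}
```

The curve is concave while approaching W_max (growth slows near the previous loss point) and convex beyond it (growth accelerates while probing for more bandwidth).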
     
  • Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • The latest BICTCP patch at:
    http://www.csc.ncsu.edu:8080/faculty/rhee/export/bitcp/index_files/Page546.htm

    disables the low_utilization feature of BICTCP because it doesn't work
    in some cases. This patch removes it.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This patch series implements per packet access control via the
    extension of the Linux Security Modules (LSM) interface by hooks in
    the XFRM and pfkey subsystems that leverage IPSec security
    associations to label packets. Extensions to the SELinux LSM are
    included that leverage the patch for this purpose.

    This patch implements the changes necessary to the SELinux LSM to
    create, deallocate, and use security contexts for policies
    (xfrm_policy) and security associations (xfrm_state) that enable
    control of a socket's ability to send and receive packets.

    Patch purpose:

    The patch is designed to enable the SELinux LSM to implement access
    control on individual packets based on the strongly authenticated
    IPSec security association. Such access controls augment the existing
    ones in SELinux based on network interface and IP address. The former
    are very coarse-grained, and the latter can be spoofed. By using
    IPSec, the SELinux can control access to remote hosts based on
    cryptographic keys generated using the IPSec mechanism. This enables
    access control on a per-machine basis or per-application if the remote
    machine is running the same mechanism and trusted to enforce the
    access control policy.

    Patch design approach:

    The patch's main function is to authorize a socket's access to an IPSec
    policy based on their security contexts. Since the communication is
    implemented by a security association, the patch ensures that the
    security associations negotiated and used have the same security
    context. The patch enables allocation and deallocation of such
    security contexts for policies and security associations. It also
    enables copying of the security context when policies are cloned.
    Lastly, the patch ensures that packets that are sent without using an
    IPSec security association with a security context are allowed to be
    sent in that manner.

    A presentation available at
    www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
    from the SELinux symposium describes the overall approach.

    Patch implementation details:

    The function which authorizes a socket to perform a requested
    operation (send/receive) on an IPSec policy (xfrm_policy) is
    selinux_xfrm_policy_lookup. The Netfilter and rcv_skb hooks ensure
    that if an IPSec SA with a security association has not been used,
    then the socket is allowed to send or receive the packet,
    respectively.

    The patch implements SELinux function for allocating security contexts
    when policies (xfrm_policy) are created via the pfkey or xfrm_user
    interfaces via selinux_xfrm_policy_alloc. When a security association
    is built, SELinux allocates the security context designated by the
    XFRM subsystem which is based on that of the authorized policy via
    selinux_xfrm_state_alloc.

    When an xfrm_policy is cloned, the security context of that policy, if
    any, is copied to the clone via selinux_xfrm_policy_clone.

    When an xfrm_policy or xfrm_state is freed, its security context, if
    any, is also freed by selinux_xfrm_policy_free or
    selinux_xfrm_state_free.

    Testing:

    The SELinux authorization function is tested using ipsec-tools. We
    created policies and security associations with particular security
    contexts and added SELinux access control policy entries to verify the
    authorization decision. We also made sure that packets for which no
    security context was supplied (which either did or did not use
    security associations) were authorized using an unlabelled context.

    Signed-off-by: Trent Jaeger
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Trent Jaeger
     
  • This patch series implements per packet access control via the
    extension of the Linux Security Modules (LSM) interface by hooks in
    the XFRM and pfkey subsystems that leverage IPSec security
    associations to label packets. Extensions to the SELinux LSM are
    included that leverage the patch for this purpose.

    This patch implements the changes necessary to the XFRM subsystem,
    pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a
    socket to use only authorized security associations (or no security
    association) to send/receive network packets.

    Patch purpose:

    The patch is designed to enable per-packet access control based on
    the strongly authenticated IPSec security association. Such access
    controls augment the existing ones based on network interface and IP
    address. The former are very coarse-grained, and the latter can be
    spoofed. By using IPSec, the system can control access to remote
    hosts based on cryptographic keys generated using the IPSec mechanism.
    This enables access control on a per-machine basis or per-application
    if the remote machine is running the same mechanism and trusted to
    enforce the access control policy.

    Patch design approach:

    The overall approach is that policy (xfrm_policy) entries set by
    user-level programs (e.g., setkey for ipsec-tools) are extended with a
    security context that is used at policy selection time in the XFRM
    subsystem to restrict the sockets that can send/receive packets via
    security associations (xfrm_states) that are built from those
    policies.

    A presentation available at
    www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
    from the SELinux symposium describes the overall approach.

    Patch implementation details:

    On output, the policy retrieved (via xfrm_policy_lookup or
    xfrm_sk_policy_lookup) must be authorized for the security context of
    the socket and the same security context is required for resultant
    security association (retrieved or negotiated via racoon in
    ipsec-tools). This is enforced in xfrm_state_find.

    On input, the policy retrieved must also be authorized for the socket
    (at __xfrm_policy_check), and the security context of the policy must
    also match the security association being used.

    The patch has virtually no impact on packets that do not use IPSec.
    The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as
    before.

    Also, if IPSec is used without security contexts, the impact is
    minimal. The LSM must allow such policies to be selected for the
    combination of socket and remote machine, but subsequent IPSec
    processing proceeds as in the original case.

    Testing:

    The pfkey interface is tested using ipsec-tools. ipsec-tools has
    been modified (a separate ipsec-tools patch is available for version
    0.5) to support assignment of security contexts to xfrm_policy entries
    and security associations via setkey, and negotiation using the
    security contexts via racoon.

    The xfrm_user interface is tested via ad hoc programs that set
    security contexts. These programs are also available from me, and
    contain programs for setting, getting, and deleting policy for testing
    this interface. Testing of sa functions was done by tracing kernel
    behavior.

    Signed-off-by: Trent Jaeger
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Trent Jaeger
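
The output-side rule described in this entry (the policy must be authorized for the socket's security context, and the security association must carry the same context as the policy) can be modeled in a few lines. This is a toy model, not the XFRM code: contexts are plain strings, NULL stands for "no security context", and demo_authorize() is a hypothetical stand-in for the LSM decision.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical LSM decision: may a socket labeled sock_ctx use a policy
 * labeled policy_ctx? */
typedef bool (*authorize_fn)(const char *sock_ctx, const char *policy_ctx);

/* Two contexts match if both are absent or both are the same string. */
static bool ctx_equal(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* Example decision for this sketch only: unlabeled policies are allowed,
 * labeled ones require an identical socket label. */
static bool demo_authorize(const char *sock_ctx, const char *policy_ctx)
{
    if (policy_ctx == NULL)
        return true;
    return sock_ctx != NULL && strcmp(sock_ctx, policy_ctx) == 0;
}

/* The socket may use the SA only if the policy is authorized for the
 * socket and the SA's context matches the policy's context. */
static bool may_use_sa(const char *sock_ctx, const char *policy_ctx,
                       const char *sa_ctx, authorize_fn authorize)
{
    if (!authorize(sock_ctx, policy_ctx))
        return false;
    return ctx_equal(policy_ctx, sa_ctx);
}
```

With demo_authorize(), a socket labeled "app_t" can use an "app_t" policy backed by an "app_t" SA, is refused when the SA is labeled differently, and can still use IPSec configured without any security context.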
     

03 Jan, 2006

4 commits

  • Hey, it's fifteen years today since I bought the machine that got Linux
    started. January 2nd is a good date.

    Linus Torvalds
     
  • Otherwise a bad mem policy system call can confuse the interleaving
    code into referencing undefined nodes.

    Originally reported by Doug Chapman

    I was told it's CVE-2005-3358
    (one has to love these security people - they make everything sound important)

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • In commit 59121003721a8fad11ee72e646fd9d3076b5679c, the x86 and x86-64
    asm/param.h was changed to include the kernel configuration header for
    the configurable timer frequency.

    However, asm/param.h is sometimes used in userland (it is included
    indirectly from other headers), so your commit pollutes the userland
    namespace with tons of CONFIG_FOO macros. This greatly confuses
    software packages (such as BusyBox) which use CONFIG_FOO macros
    themselves to control the inclusion of optional features.

    After a short exchange, Christoph approved this patch

    Signed-off-by: Linus Torvalds

    Dag-Erling Smørgrav
     
  • Some G5s still occasionally experience shutdowns due to overtemp
    conditions despite the recent fix. After analyzing logs from such
    machines, it appears that the overtemp code is a bit too quick at
    shutting the machine down when reaching the critical temperature (tmax +
    8) and doesn't leave the fan enough time to actually cool it down. This
    happens if the temperature of a CPU suddenly rises too high in a very
    short period of time, or occasionally on boot (that is the CPUs are
    already overtemp by the time the driver loads).

    This patch makes the code a bit more relaxed, giving the fans a few
    seconds to do their job before kicking off the machine shutdown.

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
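
One way to picture the "more relaxed" behaviour described in this entry is a counter of consecutive critical samples, with shutdown triggered only once the counter exceeds a grace value. The sketch below is an assumption about the general shape, not the driver's code: the grace length, the sampling model, and all names are hypothetical; only the critical threshold of tmax + 8 comes from the commit text.

```c
#include <stdbool.h>

#define OVERTEMP_GRACE_TICKS 5   /* hypothetical: critical samples tolerated */

struct overtemp_state {
    int critical_ticks;          /* consecutive samples at or above critical */
};

/* Called once per temperature sample. Returns true when the machine
 * should start shutting down. */
static bool overtemp_tick(struct overtemp_state *st, int temp, int tmax)
{
    if (temp < tmax + 8) {       /* below the critical temperature */
        st->critical_ticks = 0;
        return false;
    }
    st->critical_ticks++;
    return st->critical_ticks > OVERTEMP_GRACE_TICKS;
}
```

A brief spike, or a CPU that is already hot when the driver loads, then gives the fans a few samples to bring the temperature down before any shutdown is started.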
     

01 Jan, 2006

3 commits