14 Oct, 2010

2 commits


05 Oct, 2010

2 commits


04 Oct, 2010

6 commits


29 Sep, 2010

1 commit

  • This patch adds the basic infrastructure to support user-space
    expectation helpers via ctnetlink and the netfilter queuing
    infrastructure NFQUEUE. Basically, this patch:

    * adds the NF_CT_EXPECT_USERSPACE flag to identify expectations
    created from user-space. I have also added a sanity check in
    __nf_ct_expect_check() to prevent kernel-space helpers from
    creating an expectation if the master conntrack has no
    helper assigned.
    * adds some branches to check if the master conntrack helper
    exists; otherwise we skip the code that refers to the kernel-space
    helper, such as the local expectation list and the expectation
    policy.
    * allows setting the timeout for user-space expectations with
    no helper assigned.
    * adds a list of expectations created from user-space that depends
    on ctnetlink (if this module is removed, they are deleted).
    * includes USERSPACE in the /proc output for expectations
    that have been created by a user-space helper.

    This patch also modifies ctnetlink to skip including the helper
    name in the Netlink messages if no kernel-space helper is set
    (since user-space expectations have no kernel-space helper
    assigned).

    You can access an example user-space FTP conntrack helper at:
    http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Patrick McHardy

    Pablo Neira Ayuso
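
The sanity check and the helper-name rule described in this commit can be modelled in user space. This is a minimal sketch, not the kernel code: every `ex_` name, the flag value, and the choice of `-EINVAL` as the error are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
#define EX_EXPECT_USERSPACE 0x4u /* illustrative bit value */

struct ex_helper { const char *name; };
struct ex_conntrack { const struct ex_helper *helper; };
struct ex_expect {
    const struct ex_conntrack *master;
    unsigned int flags;
};

/*
 * Returns 0 if the expectation may be registered, a negative errno
 * otherwise. An expectation without a master helper is only accepted
 * when it is flagged as created from user space.
 */
static int ex_expect_check(const struct ex_expect *exp)
{
    bool has_helper = exp->master->helper != NULL;
    bool from_userspace = (exp->flags & EX_EXPECT_USERSPACE) != 0;

    if (!has_helper && !from_userspace)
        return -EINVAL; /* illustrative error code */
    return 0;
}

/* The helper name is only reported when a kernel-space helper exists. */
static const char *ex_helper_name(const struct ex_expect *exp)
{
    return exp->master->helper ? exp->master->helper->name : NULL;
}
```

In this model a caller building a message would simply omit the helper attribute whenever `ex_helper_name()` returns NULL.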
     

22 Sep, 2010

1 commit


21 Sep, 2010

2 commits

  • Add new sysctl flag "snat_reroute". Recent kernels use
    ip_route_me_harder() to route LVS-NAT responses properly by
    VIP when there are multiple paths to client. But setups
    that do not have alternative default routes can skip this
    routing lookup by using snat_reroute=0.

    Signed-off-by: Julian Anastasov
    Signed-off-by: Patrick McHardy

    Julian Anastasov
     
  • Add more code to IPVS to work with Netfilter connection
    tracking and fix some problems.

    - Allow IPVS to be compiled without connection tracking as in
    2.6.35 and before. This can avoid keeping conntracks for all
    IPVS connections because this costs memory. ip_vs_ftp still
    depends on connection tracking and NAT as implemented for 2.6.36.

    - Add sysctl var "conntrack" to enable connection tracking for
    all IPVS connections. For loaded IPVS directors it needs
    tuning of nf_conntrack_max limit.

    - Add IP_VS_CONN_F_NFCT connection flag to request the connection
    to use connection tracking. This allows user space to provide this
    flag, for example, in dest->conn_flags. This can be useful to
    request connection tracking per real server instead of forcing it
    for all connections with the "conntrack" sysctl. This flag is
    set currently only by ip_vs_ftp and of course by "conntrack" sysctl.

    - Add the ip_vs_nfct.c file to hold all connection tracking code,
    so that the main code does not depend on netfilter conntrack
    support.

    - Restore the ip_vs_post_routing handler as in 2.6.35 and use
    skb->ipvs_property=1 to allow IPVS to work without connection
    tracking.

    Connection tracking:

    - most of the code is already in 2.6.36-rc

    - alter conntrack reply tuple for LVS-NAT connections when first packet
    from client is forwarded and conntrack state is NEW or RELATED.
    Additionally, alter reply for RELATED connections from real server,
    again for packet in original direction.

    - add IP_VS_XMIT_TUNNEL to confirm conntrack (without altering
    reply) for LVS-TUN early because we want to call nf_reset. It is
    needed because we add IPIP header and the original conntrack
    should be preserved, not destroyed. The transmitted IPIP packets
    can reuse same conntrack, so we do not set skb->ipvs_property.

    - try to destroy conntrack when the IPVS connection is destroyed.
    It is not fatal if conntrack disappears before that, it depends
    on the used timers.

    Fix long-standing problems:

    - add skb->ip_summed = CHECKSUM_NONE for the LVS-TUN transmitters

    Signed-off-by: Julian Anastasov
    Signed-off-by: Patrick McHardy

    Julian Anastasov
     

17 Sep, 2010

1 commit

  • - the sync protocol supports 16 bits only, so bits 0..15 should be
    used only for flags that should go to backup server, bits 16 and
    above should be allocated for flags not sent to backup.

    - use IP_VS_CONN_F_DEST_MASK as mask of connection flags in
    destination that can be changed by user space

    - allow IP_VS_CONN_F_ONE_PACKET to be set in destination

    Signed-off-by: Julian Anastasov
    Signed-off-by: Patrick McHardy

    Julian Anastasov
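
The flag layout described in this commit can be sketched with plain bit masks. The constants below are hypothetical stand-ins (the `EX_` names and bit values are assumptions, not the kernel's definitions): the low 16 bits are what a 16-bit sync message can carry, and a destination mask limits which bits user space may change.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define EX_SYNC_MASK    0x0000FFFFu /* bits 0..15: sent to the backup */
#define EX_F_FWD_MASK   0x0007u     /* forwarding-method bits */
#define EX_F_ONE_PACKET 0x2000u     /* synced, settable per destination */
#define EX_F_LOCAL_ONLY (1u << 16)  /* bit 16 and above: never synced */

/* Flags that user space is allowed to change on a destination. */
#define EX_DEST_MASK (EX_F_FWD_MASK | EX_F_ONE_PACKET | EX_F_LOCAL_ONLY)

/* Keep only the bits that fit in the 16-bit sync protocol field. */
static uint16_t ex_flags_for_sync(uint32_t flags)
{
    return (uint16_t)(flags & EX_SYNC_MASK);
}

/* Apply a user-space request, touching only the bits in EX_DEST_MASK. */
static uint32_t ex_set_dest_flags(uint32_t current, uint32_t requested)
{
    return (current & ~EX_DEST_MASK) | (requested & EX_DEST_MASK);
}
```

Bits outside `EX_DEST_MASK` in the request are ignored, so a destination update cannot disturb internal state flags.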
     

10 Sep, 2010

3 commits


09 Sep, 2010

11 commits

  • David S. Miller
     
  • commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
    added a secondary hash on UDP, hashed on (local addr, local port).

    The problem is that the following sequence:

    fd = socket(...)
    connect(fd, &remote, ...)

    not only selects the remote endpoint (address and port), but also sets
    the local address, whereas the UDP stack inserted the socket into the
    secondary hash table while its local address was still INADDR_ANY (or
    the IPv6 equivalent).

    The sequence is:
    - autobind() : choose a random local port, insert socket in hash tables
    [while local address is INADDR_ANY]
    - connect() : set remote address and port, change local address to IP
    given by a route lookup.

    When an incoming UDP frame arrives, if more than 10 sockets are found in
    the primary hash table, we switch to the secondary table, and fail to
    find the socket because its local address changed.

    One solution to this problem is to rehash datagram socket if needed.

    We add a new rehash(struct sock *) method in "struct proto", and
    implement this method for UDP v4 & v6, using a common helper.

    This rehashing only takes care of secondary hash table, since primary
    hash (based on local port only) is not changed.

    Reported-by: Krzysztof Piotr Oledzki
    Signed-off-by: Eric Dumazet
    Tested-by: Krzysztof Piotr Oledzki
    Signed-off-by: David S. Miller

    Eric Dumazet
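
The user-visible half of this sequence can be observed from user space: after connect() on an unbound UDP socket, getsockname() reports a concrete local address and an automatically chosen port instead of INADDR_ANY. The sketch below assumes a Linux-like host with a loopback interface; the function name and the choice of 127.0.0.1 port 9 are illustrative.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Connects an unbound UDP socket to 127.0.0.1:9 and stores the local
 * address the kernel selected. No packet is sent, because connect()
 * on a datagram socket only records the peer.
 * Returns 0 on success, -1 on error.
 */
static int ex_udp_connect_local(struct sockaddr_in *local)
{
    struct sockaddr_in remote;
    socklen_t len = sizeof(*local);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int ret = -1;

    if (fd < 0)
        return -1;

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(9);
    remote.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* connect() autobinds a port and fixes the local address. */
    if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) == 0 &&
        getsockname(fd, (struct sockaddr *)local, &len) == 0)
        ret = 0;

    close(fd);
    return ret;
}
```

Because the local address changes after the socket was first hashed, any table keyed on (local addr, local port) has to be updated at that point, which is what the rehash method addresses.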
     
  • flows are an obsolete data type.

    Signed-off-by: Andy Grover

    Andy Grover
     
  • Replace e.g. u_int32_t types with the more common uint32_t.

    Reported-by: Matthew Wilcox
    Signed-off-by: Andy Grover

    Andy Grover
     
  • Also, a number of changes were made based on the assumption that
    rds.h wasn't exported, so roll these back.

    Signed-off-by: Andy Grover

    Andy Grover
     
  • Add two CMSGs for masked versions of cswp and fadd. The args
    struct is modified to use a union for the arguments of the
    different atomic op types. Change IB to do masked atomic ops. The
    atomic op type in rds_message is similarly turned into a union.

    Signed-off-by: Andy Grover

    Andy Grover
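
A union keyed by operation type, as this commit describes, might look like the sketch below. The struct layout and all `ex_` names are hypothetical, not the actual RDS definitions. The masked compare-and-swap follows the usual definition (compare only the bits under the compare mask, replace only the bits under the swap mask) and is modelled here as a single-threaded, non-atomic update.

```c
#include <stdint.h>

enum ex_atomic_type { EX_CSWP, EX_FADD, EX_MASKED_CSWP, EX_MASKED_FADD };

/* Hypothetical argument block: one union member per operation type. */
struct ex_atomic_args {
    enum ex_atomic_type type;
    union {
        struct { uint64_t compare, swap; } cswp;
        struct { uint64_t add; } fadd;
        struct { uint64_t compare, swap, compare_mask, swap_mask; } m_cswp;
        struct { uint64_t add, mask; } m_fadd;
    } u;
};

/*
 * Masked compare-and-swap model: if the target matches `compare` in
 * the bits selected by compare_mask, replace the bits selected by
 * swap_mask with those of `swap`. Always returns the old value.
 */
static uint64_t ex_masked_cswp(uint64_t *target,
                               const struct ex_atomic_args *a)
{
    uint64_t old = *target;

    if (((old ^ a->u.m_cswp.compare) & a->u.m_cswp.compare_mask) == 0)
        *target = (old & ~a->u.m_cswp.swap_mask) |
                  (a->u.m_cswp.swap & a->u.m_cswp.swap_mask);
    return old;
}
```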
     
  • Add a flag to the API so users can indicate they want
    silent operations. This is needed because silent ops
    cannot be used with USE_ONCE MRs, so we can't just
    assume silent.

    Also, change send_xmit to do atomic op before rdma op if
    both are present, and centralize the hairy logic to determine if
    we want to attempt silent, or not.

    Signed-off-by: Andy Grover

    Andy Grover
     
  • Implement a CMSG-based interface to do FADD and CSWP ops.

    Alter send routines to handle atomic ops.

    Add atomic counters to stats.

    Add xmit_atomic() to struct rds_transport

    Inline rds_ib_send_unmap_rdma into unmap_rm

    Signed-off-by: Andy Grover

    Andy Grover
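
For readers unfamiliar with the two operations, FADD (fetch-and-add) and CSWP (compare-and-swap) have the same semantics as the C11 atomics below. This is only a local-memory illustration with hypothetical function names; the commit itself is about issuing these operations against remote memory through RDS.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Fetch-and-add: add to the target and return the previous value. */
static uint64_t ex_fadd(_Atomic uint64_t *target, uint64_t add)
{
    return atomic_fetch_add(target, add);
}

/*
 * Compare-and-swap: store `swap` only if the target equals `compare`.
 * Returns the value the target held before the operation.
 */
static uint64_t ex_cswp(_Atomic uint64_t *target, uint64_t compare,
                        uint64_t swap)
{
    uint64_t expected = compare;

    /* On failure, `expected` is overwritten with the current value. */
    atomic_compare_exchange_strong(target, &expected, swap);
    return expected;
}
```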
     
  • We use rcu_dereference_check(p, rcu_read_lock_held() ||
    lockdep_rtnl_is_held()) several times in network stack.

    More usages to come too, so it's time to create a helper.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
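
The idea of wrapping a repeated check condition in one helper macro can be shown with a user-space mock. Nothing here is the kernel implementation: the `ex_` names are hypothetical, the lock predicates are plain booleans, and a failed check only increments a counter instead of triggering a lockdep warning.

```c
#include <stdbool.h>

/* Mock lock state; in the kernel these would be lockdep predicates. */
static bool ex_rcu_read_lock_held;
static bool ex_rtnl_held;
static int ex_check_failures;

/* Evaluate to p, counting a failure when the condition is false. */
#define ex_dereference_check(p, cond) \
    (((cond) ? (void)0 : (void)++ex_check_failures), (p))

/* The helper: callers no longer spell out the combined condition. */
#define ex_dereference_rtnl(p) \
    ex_dereference_check((p), ex_rcu_read_lock_held || ex_rtnl_held)
```

Each call site then shrinks to `ex_dereference_rtnl(ptr)`, and the condition lives in one place if it ever needs to change.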
     
  • Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • - Do not create expectation when forwarding the PORT
    command to avoid blocking the connection. The problem is that
    nf_conntrack_ftp.c:help() tries to create the same expectation later in
    POST_ROUTING and drops the packet with "dropping packet" message after
    failure in nf_ct_expect_related.

    - Change ip_vs_update_conntrack to alter the conntrack
    for related connections from real server. If we do not alter the reply in
    this direction, the next packet from the client sent to vport 20 arrives
    as a NEW connection. We alter it, but a collision may then happen between
    the two conntracks and the second conntrack gets destroyed immediately.
    The connection gets stuck too.

    Signed-off-by: Julian Anastasov
    Signed-off-by: Simon Horman
    Signed-off-by: David S. Miller

    Julian Anastasov
     

08 Sep, 2010

5 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    PCI: bus speed strings should be const
    PCI hotplug: Fix build with CONFIG_ACPI unset
    PCI: PCIe: Remove the port driver module exit routine
    PCI: PCIe: Move PCIe PME code to the pcie directory
    PCI: PCIe: Disable PCIe port services during port initialization
    PCI: PCIe: Ask BIOS for control of all native services at once
    ACPI/PCI: Negotiate _OSC control bits before requesting them
    ACPI/PCI: Do not preserve _OSC control bits returned by a query
    ACPI/PCI: Make acpi_pci_query_osc() return control bits
    ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
    PCI: PCIe: Introduce command line switch for disabling port services
    PCI: PCIe AER: Introduce pci_aer_available()
    x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
    PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs

    Linus Torvalds
     
  • * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
    powerpc/pseries: Correct rtas_data_buf locking in dlpar code
    powerpc/85xx: Add P1021 PCI IDs and quirks
    arch/powerpc/sysdev/qe_lib/qe.c: Add of_node_put to avoid memory leak
    arch/powerpc/platforms/83xx/mpc837x_mds.c: Add missing iounmap
    fsl_rio: fix compile errors
    powerpc/85xx: Fix compile issue with p1022_ds due to lmb rename to memblock
    powerpc/85xx: Fix compilation of mpc85xx_mds.c
    powerpc: Don't use kernel stack with translation off
    powerpc/perf_event: Reduce latency of calling perf_event_do_pending
    powerpc/kexec: Adds correct calling convention for kexec purgatory

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: fix a mismatch between code and comment
    percpu: fix a memory leak in pcpu_extend_area_map()
    percpu: add __percpu notations to UP allocator
    percpu: handle __percpu notations in UP accessors

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: use zalloc_cpumask_var() for gcwq->mayday_mask
    workqueue: fix GCWQ_DISASSOCIATED initialization
    workqueue: Add a workqueue chapter to the tracepoint docbook
    workqueue: fix cwq->nr_active underflow
    workqueue: improve destroy_workqueue() debuggability
    workqueue: mark lock acquisition on worker_maybe_bind_and_lock()
    workqueue: annotate lock context change
    workqueue: free rescuer on destroy_workqueue

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
    tty: fix tty_line must not be equal to number of allocated tty pointers in tty driver
    serial: bfin_sport_uart: restore transmit frame sync fix
    serial: fix port type conflict between NS16550A & U6_16550A
    MAINTAINERS: orphan isicom
    vt: Fix console corruption on driver hand-over.

    Linus Torvalds
     

07 Sep, 2010

1 commit

  • Sandybridge GTT has new cache control bits in PTE, which control
    graphics page cache in LLC or LLC/MLC, so we need to extend the mask
    function to respect the new bits.

    And set cache control to always LLC only by default on Gen6.

    Signed-off-by: Zhenyu Wang
    Cc: stable@kernel.org
    Signed-off-by: Chris Wilson

    Zhenyu Wang
     

05 Sep, 2010

1 commit

  • The cgroup_attach_task_current_cg API that we have upstream is backwards: we
    really need an API to attach to the cgroups from another process A to
    the current one.

    In our case (vhost), a privileged user wants to attach its task to cgroups
    from a less privileged one; the API makes us run it in the other
    task's context, and this fails.

    So let's make the API generic and just pass in 'from' and 'to' tasks.
    Add an inline wrapper for cgroup_attach_task_current_cg to avoid
    breaking bisect.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Li Zefan
    Acked-by: Paul Menage

    Michael S. Tsirkin
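
The shape of this change, a generic function taking explicit 'from' and 'to' tasks plus an inline wrapper that keeps the old entry point working, can be sketched as follows. The task structure, the `ex_` names, and the fixed number of hierarchies are all assumptions for illustration; the real cgroup code is considerably more involved.

```c
#include <stddef.h>

#define EX_MAX_HIER 4 /* hypothetical number of cgroup hierarchies */

/* Mock task: one cgroup id per hierarchy. */
struct ex_task { int cgroup[EX_MAX_HIER]; };

/* Stand-in for the kernel's notion of the current task. */
static struct ex_task ex_current_task;
#define ex_current (&ex_current_task)

/* Generic form: attach `to` to every cgroup that `from` belongs to. */
static int ex_attach_task_all(const struct ex_task *from, struct ex_task *to)
{
    for (size_t i = 0; i < EX_MAX_HIER; i++)
        to->cgroup[i] = from->cgroup[i];
    return 0;
}

/* Old entry point kept as a thin wrapper so existing callers build. */
static inline int ex_attach_task_current_cg(struct ex_task *tsk)
{
    return ex_attach_task_all(ex_current, tsk);
}
```

With the generic form, the vhost case described above becomes a call with the other process as 'from' and the current task as 'to', for example `ex_attach_task_all(&owner, ex_current)`, without running in the other task's context.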
     

04 Sep, 2010

2 commits


03 Sep, 2010

2 commits