05 Jan, 2012

8 commits

  • …wireless-next into for-davem

    Conflicts:
    drivers/net/wireless/b43legacy/dma.c

    John W. Linville
     
  • David S. Miller
     
  • Recently Dave noticed that a test we did in ipv6_add_addr to see if we next hop
    route for the interface we're adding an addres to was wrong (see commit
    7ffbcecbeed91e5874e9a1cfc4c0cbb07dac3069). for one, it never triggers, and two,
    it was completely wrong to begin with. This test was meant to cover this
    section of RFC 4429:

    3.3 Modifications to RFC 2462 Stateless Address Autoconfiguration

    * (modifies section 5.5) A host MAY choose to configure a new address
    as an Optimistic Address. A host that does not know the SLLAO
    of its router SHOULD NOT configure a new address as Optimistic.
    A router SHOULD NOT configure an Optimistic Address.

    This patch should bring us into proper compliance with the above clause. Since
    we only add a SLAAC address after we've received a RA which may or may not
    contain a source link layer address option, we can pass a pointer to that option
    to addrconf_prefix_rcv (which may be null if the option is not present), and
    only set the optimistic flag if the option was found in the RA.

    Change notes:
    (v2) modified the new parameter to addrconf_prefix_rcv to be a bool rather than
    a pointer to make its use more clear as per request from davem.

    Signed-off-by: Neil Horman
    CC: "David S. Miller"
    CC: Hideaki YOSHIFUJI
    Signed-off-by: David S. Miller

    Neil Horman
     
  • The nfcid1 is the NFC-A identifier.
    It is exported as an attribute of the target info
    (returned as a response to NFC_CMD_GET_TARGET).

    Signed-off-by: Ilan Elias
    Acked-by: Samuel Ortiz
    Signed-off-by: John W. Linville

    Ilan Elias
     
  • Add support for NCI Interface Error Notification.
    When this notification is received and we're during a
    data exchange transaction, indicate an error to the NFC
    core layer via the data exchange callback.

    Signed-off-by: Ilan Elias
    Signed-off-by: John W. Linville

    Ilan Elias
     
  • Addition, deletion, and modification of NCI constants.
    Changes in NCI commands, responses, and notifications structures.

    Signed-off-by: Ilan Elias
    Signed-off-by: John W. Linville

    Ilan Elias
     
  • All implementations have been converted to implement set_rxnfc
    instead.

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • Define special location values for RX NFC that request the driver to
    select the actual rule location. This allows for implementation on
    devices that use hash-based filter lookup, whereas currently the API is
    more suited to devices with TCAM lookup or linear search.

    In ethtool_set_rxnfc() and the compat wrapper ethtool_ioctl(), copy
    the structure back to user-space after insertion so that the actual
    location is returned.

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     

04 Jan, 2012

3 commits

  • SMSC generation 4 LAN chips integrate an IEEE 802.3 ethernet physical layer.
    The ethernet driver for this family of devices needs to access the SMSC PHY
    registers and bit-fields.

    So, this patch moves these constants to a place where it can be used for both
    the PHY and LAN drivers.

    Signed-off-by: Javier Martinez Canillas
    Signed-off-by: David S. Miller

    Javier Martinez Canillas
     
  • Commit 1e39f384bb01 ("evm: fix build problems") makes the stub version
    of security_old_inode_init_security() return 0 when CONFIG_SECURITY is
    not set.

    But that makes callers such as reiserfs_security_init() assume that
    security_old_inode_init_security() has set name, value, and len
    arguments properly - but security_old_inode_init_security() left them
    uninitialized which then results in interesting failures.

    Revert security_old_inode_init_security() to the old behavior of
    returning EOPNOTSUPP since both callers (reiserfs and ocfs2) handle this
    just fine.

    [ Also fixed the S_PRIVATE(inode) case of the actual non-stub
    security_old_inode_init_security() function to return EOPNOTSUPP
    for the same reason, as pointed out by Mimi Zohar.

    It got incorrectly changed to match the new function in commit
    fb88c2b6cbb1: "evm: fix security/security_old_init_security return
    code". - Linus ]

    Reported-by: Jorge Bastos
    Acked-by: James Morris
    Acked-by: Mimi Zohar
    Signed-off-by: Jan Kara
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • …wireless-next into for-davem

    Conflicts:
    drivers/net/wireless/b43/dma.c
    drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c

    John W. Linville
     

03 Jan, 2012

1 commit


31 Dec, 2011

10 commits

  • We should not forget to try for real server with port 0
    in the backup server when processing the sync message. We should
    do it in all cases because the backup server can use different
    forwarding method.

    Signed-off-by: Julian Anastasov
    Signed-off-by: Simon Horman
    Signed-off-by: Pablo Neira Ayuso

    Julian Anastasov
     
  • During some debugging I needed to look into how /proc/net/ipv6_route
    operated and in my digging I found its calling fib6_clean_all() which uses
    "write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
    found this on 2.6.32, but reading the code I believe the same basic idea
    exists currently. Looking at the rtnetlink code they are only calling
    "read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
    reading from proc isn't the recommended way of fetching the ipv6 route
    table; taking a write lock seems unnecessary and would probably cause
    network performance issues.

    To verify this I loaded up the ipv6 route table and then ran iperf in 3
    cases:
    * doing nothing
    * reading ipv6 route table via proc
    (while :; do cat /proc/net/ipv6_route > /dev/null; done)
    * reading ipv6 route table via rtnetlink
    (while :; do ip -6 route show table all > /dev/null; done)

    * Load the ipv6 route table up with:
    * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done

    * iperf commands:
    * client: iperf -i 1 -V -c
    * server: iperf -V -s

    * iperf results - 3 runs each (in Mbits/sec)
    * nothing: client: 927,927,927 server: 927,927,927
    * proc: client: 179,97,96,113 server: 142,112,133
    * iproute: client: 928,927,928 server: 927,927,927

    lock_stat shows taking the write lock is causing the slowdown. Using this
    info I decided to write a version of fib6_clean_all() which replaces
    write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
    this new function I see the same results as with my rtnetlink iperf test.

    Signed-off-by: Josh Hunt
    Signed-off-by: David S. Miller

    Josh Hunt
     
  • While it's not too late fix the recently added RQLEN diag extension
    to report rqlen and wqlen in the same way as TCP does.

    I.e. for listening sockets the ack backlog length (which is the input
    queue length for socket) in rqlen and the max ack backlog length in
    wqlen, and what the CINQ/OUTQ ioctls do for established.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Currently tcp diag reports rqlen and wqlen values similar to how
    the CINQ/COUTQ iotcls do. To make unix diag report these values
    in the same way move the respective code into helpers.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • [ Fix indentation of sock_diag*() calls. -DaveM ]

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Add a routine that dumps memory-related values of a socket.
    It's made as an array to make it possible to add more stuff
    here later without breaking compatibility.

    Since v1: The SK_MEMINFO_ constants are in userspace
    visible part of sock_diag.h, the rest is under __KERNEL__.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • The headers check complains it should include the linux/types.h
    withing, thus add this one.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Properly toss existing components around the ifdef __KERNEL__
    and include the header into the header-y target.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • David S. Miller
     

30 Dec, 2011

1 commit

  • Commit 2a95ea6c0d129b4 ("procfs: do not overflow get_{idle,iowait}_time
    for nohz") did not take into account that one some architectures jiffies
    and cputime use different units.

    This causes get_idle_time() to return numbers in the wrong units, making
    the idle time fields in /proc/stat wrong.

    Instead of converting the usec value returned by
    get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
    usecs_to_cputime64 to convert it to the correct unit of cputime64_t.

    Signed-off-by: Andreas Schwab
    Acked-by: Michal Hocko
    Cc: Arnd Bergmann
    Cc: "Artem S. Tashkinov"
    Cc: Dave Jones
    Cc: Alexey Dobriyan
    Cc: Thomas Gleixner
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Schwab
     

29 Dec, 2011

3 commits


28 Dec, 2011

2 commits


26 Dec, 2011

1 commit

  • Unlike all of the other cpuid bits, the TSC deadline timer bit is set
    unconditionally, regardless of what userspace wants.

    This is broken in several ways:
    - if userspace doesn't use KVM_CREATE_IRQCHIP, and doesn't emulate the TSC
    deadline timer feature, a guest that uses the feature will break
    - live migration to older host kernels that don't support the TSC deadline
    timer will cause the feature to be pulled from under the guest's feet;
    breaking it
    - guests that are broken wrt the feature will fail.

    Fix by not enabling the feature automatically; instead report it to userspace.
    Because the feature depends on KVM_CREATE_IRQCHIP, which we cannot guarantee
    will be called, we expose it via a KVM_CAP_TSC_DEADLINE_TIMER and not
    KVM_GET_SUPPORTED_CPUID.

    Fixes the Illumos guest kernel, which uses the TSC deadline timer feature.

    [avi: add the KVM_CAP + documentation]

    Reported-by: Alexey Zaytsev
    Tested-by: Alexey Zaytsev
    Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity

    Jan Kiszka
     

25 Dec, 2011

4 commits

  • David S. Miller
     
  • This patch adds the match that allows to perform extended
    accounting. It requires the new nfnetlink_acct infrastructure.

    # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
    # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • We currently have two ways to account traffic in netfilter:

    - iptables chain and rule counters:

    # iptables -L -n -v
    Chain INPUT (policy DROP 3 packets, 867 bytes)
    pkts bytes target prot opt in out source destination
    8 1104 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0

    - use flow-based accounting provided by ctnetlink:

    # conntrack -L
    tcp 6 431999 ESTABLISHED src=192.168.1.130 dst=212.106.219.168 sport=58152 dport=80 packets=47 bytes=7654 src=212.106.219.168 dst=192.168.1.130 sport=80 dport=58152 packets=49 bytes=66340 [ASSURED] mark=0 use=1

    While trying to display real-time accounting statistics, we require
    to pool the kernel periodically to obtain this information. This is
    OK if the number of flows is relatively low. However, in case that
    the number of flows is huge, we can spend a considerable amount of
    cycles to iterate over the list of flows that have been obtained.

    Moreover, if we want to obtain the sum of the flow accounting results
    that match some criteria, we have to iterate over the whole list of
    existing flows, look for matchings and update the counters.

    This patch adds the extended accounting infrastructure for
    nfnetlink which aims to allow displaying real-time traffic accounting
    without the need of complicated and resource-consuming implementation
    in user-space. Basically, this new infrastructure allows you to create
    accounting objects. One accounting object is composed of packet and
    byte counters.

    In order to manipulate create accounting objects, you require the
    new libnetfilter_acct library. It contains several examples of use:

    libnetfilter_acct/examples# ./nfacct-add http-traffic
    libnetfilter_acct/examples# ./nfacct-get
    http-traffic = { pkts = 000000000000, bytes = 000000000000 };

    Then, you can use one of this accounting objects in several iptables
    rules using the new nfacct match (which comes in a follow-up patch):

    # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
    # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

    The idea is simple: if one packet matches the rule, the nfacct match
    updates the counters.

    Thanks to Patrick McHardy, Eric Dumazet, Changli Gao for reviewing and
    providing feedback for this contribution.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • Aim of this patch is to provide full range of rps_flow_cnt on 64bit arches.

    Theorical limit on number of flows is 2^32

    Fix some buggy RPS/RFS macros as well.

    Signed-off-by: Eric Dumazet
    CC: Tom Herbert
    CC: Xi Wang
    CC: Laurent Chavey
    Signed-off-by: David S. Miller

    Eric Dumazet
     

24 Dec, 2011

4 commits


23 Dec, 2011

3 commits

  • The NAT range to nlattr conversation callbacks and helpers are entirely
    dead code and are also useless since there are no NAT ranges in conntrack
    context, they are only used for initially selecting a tuple. The final NAT
    information is contained in the selected tuples of the conntrack entry.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The only remaining user of NAT protocol module reference counting is NAT
    ctnetlink support. Since this is a fairly short sequence of code, convert
    over to use RCU and remove module reference counting.

    Module unregistration is already protected by RCU using synchronize_rcu(),
    so no further changes are necessary.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Export the NAT definitions to userspace. So far userspace (specifically,
    iptables) has been copying the headers files from include/net. Also
    rename some structures and definitions in preparation for IPv6 NAT.
    Since these have never been officially exported, this doesn't affect
    existing userspace code.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy