08 Mar, 2012

12 commits


07 Mar, 2012

28 commits

  • When a CPU is taken out of reset, either cold booted or hotplugged in,
    some of its PMU registers can contain UNKNOWN values.

    This patch adds a hotplug notifier to ARM core perf code so that upon
    CPU restart the PMU unit is reset and becomes ready to use again.

    Signed-off-by: Lorenzo Pieralisi
    Signed-off-by: Will Deacon
    Signed-off-by: Russell King

    Lorenzo Pieralisi
     
  • xscale2 PMUs indicate overflow not via the PMU control register, but by
    a separate overflow FLAG register instead.

    This patch fixes the xscale2 PMU code to use this register to detect
    overflow and ensures that we clear any pending overflow when
    disabling a counter.

    Cc:
    Signed-off-by: Will Deacon
    Signed-off-by: Russell King

    Will Deacon
     
  • The PMU IRQ handlers in perf assume that if a counter has overflowed
    then perf must be responsible. In the paranoid world of crazy hardware,
    this could be false, so check that we do have a valid event before
    attempting to dereference NULL in the interrupt path.

    Cc:
    Signed-off-by: Ming Lei
    Signed-off-by: Will Deacon
    Signed-off-by: Russell King

    Will Deacon
     
  • When disabling a counter on an ARMv7 PMU, we should also clear the
    overflow flag in case an overflow occurred whilst stopping the counter.
    This prevents a spurious overflow being picked up later and leading to
    either false accounting or a NULL dereference.

    Cc:
    Reported-by: Ming Lei
    Signed-off-by: Will Deacon
    Signed-off-by: Russell King

    Will Deacon
     
  • On ARM, the PMU does not stop counting after an overflow and therefore
    IRQ latency affects the new counter value read by the kernel. This is
    significant for non-sampling runs where it is possible for the new value
    to overtake the previous one, causing the delta to be out by up to
    max_period events.

    Commit a737823d ("ARM: 6835/1: perf: ensure overflows aren't missed due
    to IRQ latency") attempted to fix this problem by allowing interrupt
    handlers to pass an overflow flag to the event update function, causing
    the overflow calculation to assume that the counter passed through zero
    when going from prev to new. Unfortunately, this doesn't work when
    overflow occurs on the perf_task_tick path because we have the flag
    cleared and end up computing a large negative delta.

    This patch removes the overflow flag from armpmu_event_update and
    instead limits the sample_period to half of the max_period for
    non-sampling profiling runs.

    Cc:
    Signed-off-by: Ming Lei
    Signed-off-by: Will Deacon
    Signed-off-by: Russell King

    Will Deacon
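
The wraparound arithmetic behind the commit above can be sketched in userspace C. This is a toy model under stated assumptions, not the kernel's armpmu_event_update: the name counter_delta and the constant MAX_PERIOD are hypothetical, and a 32-bit counter is assumed.

```c
#include <stdint.h>

/* Assumption: a 32-bit counter. MAX_PERIOD is a hypothetical stand-in
 * for the max_period mentioned in the commit message. */
#define MAX_PERIOD 0xFFFFFFFFull

/* Events counted between two raw counter reads, modulo the counter width.
 * No overflow flag is consulted; the result is only right if fewer than
 * one full counter period elapsed between the two reads. */
uint64_t counter_delta(uint64_t prev, uint64_t now)
{
    return (now - prev) & MAX_PERIOD;
}
```

If the counter is programmed half a period from overflow (prev = 0x80000000) and the IRQ handler reads it 0x10 events after the wrap, the masked delta is still correct. If it is programmed with almost the full period (prev = 0x5) and read at 0x10, the new value has overtaken the previous one and a whole period is lost from the delta.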
     
  • The message count field uses three bits of storage, not two.

    Signed-off-by: Jason Gerecke
    Acked-by: Chris Bagwell
    Signed-off-by: Dmitry Torokhov

    Jason Gerecke
     
  • Pull networking fixes from David Miller:

    1) TCP can chop up SACK'd SKBs below the unacked send sequence and
    that breaks lots of stuff. Fix from Neal Cardwell.

    2) There is code in ipv6 to properly join and leave the all-routers
    multicast group when the forwarding setting is changed, but once
    forwarding is turned on, we don't do the join for newly registered
    devices. Fix from Li Wei.

    3) Netfilter's NAT module autoload in ctnetlink drops a spinlock around
    a sleeping call; the problem is that this code path doesn't actually
    hold that lock. Fix from Pablo Neira Ayuso.

    4) TG3 uses the wrong interfaces to hook into the new byte queue limit
    support. It uses the device level interfaces, which is fine for
    single queue devices, but on more recent chips this driver supports
    multiqueue so we have to use the multiqueue BQL APIs. Fix from Tom
    Herbert.

    5) r8169 resume fix from Francois Romieu.

    6) Add some cxgb4 device IDs, from Vipul Pandya.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    IPv6: Fix not join all-router mcast group when forwarding set.
    caif-hsi: Set default MTU to 4096
    cxgb4vf: Add support for Chelsio's T480-CR and T440-LP-CR adapters
    cxgb4: Add support for Chelsio's T480-CR and T440-LP-CR adapters
    mlx4_core: remove buggy sched_queue masking
    netfilter: nf_conntrack: fix early_drop with reliable event delivery
    bridge: netfilter: don't call iptables on vlan packets if sysctl is off
    netfilter: bridge: fix wrong pointer dereference
    netfilter: ctnetlink: remove incorrect spin_[un]lock_bh on NAT module autoload
    netfilter: ebtables: fix wrong name length while copying to user-space
    r8169: runtime resume before shutdown.
    tcp: fix tcp_shift_skb_data() to not shift SACKed data below snd_una
    tg3: Fix to use multi queue BQL interfaces

    Linus Torvalds
     
  • It turns out that test-compiling this file on x86-64 doesn't really
    help, because much of it is x86-32-specific. And so I hadn't noticed
    the slightly over-eager removal of the 'r' from 'addr' variable despite
    thinking I had tested it.

    Signed-off-by: Linus "oopsie" Torvalds

    Linus Torvalds
     
  • Several users of "find_vma_prev()" were not in fact interested in the
    previous vma if there was no primary vma to be found either. And in
    those cases, we're much better off just using the regular "find_vma()",
    and then "prev" can be looked up by just checking vma->vm_prev.

    The find_vma_prev() semantics are fairly subtle (see Mikulas' recent
    commit 83cd904d271b: "mm: fix find_vma_prev"), and the whole "return
    prev by reference" means that it generates worse code too.

    Thus this "let's avoid using this inconvenient and clearly too subtle
    interface when we don't really have to" patch.

    Cc: Mikulas Patocka
    Cc: KOSAKI Motohiro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
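
The lookup pattern preferred by the commit above can be sketched with a toy list of address ranges. This is a simplified model, not kernel code: struct toy_vma and toy_find_vma are hypothetical names, and only the vm_prev link mirrors the field named in the commit message.

```c
#include <stddef.h>

/* Toy stand-in for a VMA: a sorted, doubly linked list of [start, end)
 * address ranges. All names here are hypothetical. */
struct toy_vma {
    unsigned long start, end;
    struct toy_vma *vm_next, *vm_prev;
};

/* Return the first area whose end lies above addr, or NULL if addr is
 * above every area. The caller reads the predecessor from ->vm_prev
 * instead of receiving it through a by-reference argument. */
struct toy_vma *toy_find_vma(struct toy_vma *head, unsigned long addr)
{
    for (struct toy_vma *v = head; v; v = v->vm_next)
        if (addr < v->end)
            return v;
    return NULL;
}
```

A caller that only cares about the previous area when a primary area exists can check the return value for NULL and then follow vm_prev, with no extra output parameter.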
     
  • Pull CIFS fixes from Steve French

    * git://git.samba.org/sfrench/cifs-2.6:
    cifs: fix dentry refcount leak when opening a FIFO on lookup
    CIFS: Fix mkdir/rmdir bug for the non-POSIX case

    Linus Torvalds
     
  • Commit 6bd4837de96e ("mm: simplify find_vma_prev()") broke memory
    management on PA-RISC.

    After application of the patch, programs that allocate big arrays on the
    stack crash with a segfault. For example, this will crash if compiled
    without optimization:

    int main()
    {
            char array[200000];
            array[199999] = 0;
            return 0;
    }

    The reason is that PA-RISC has up-growing stack and the stack is usually
    the last memory area. In the above example, a page fault happens above
    the stack.

    Previously, if we passed too high an address to find_vma_prev, it returned
    NULL and stored the last VMA in *pprev. After "simplify find_vma_prev"
    change, it stores NULL in *pprev. Consequently, the stack area is not
    found and it is not expanded, as it used to be before the change.

    This patch restores the old behavior and makes it return the last VMA in
    *pprev if the requested address is higher than the address of any VMA.

    Signed-off-by: Mikulas Patocka
    Acked-by: KOSAKI Motohiro
    Signed-off-by: Linus Torvalds

    Mikulas Patocka
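
The restored behavior described above can be sketched on a toy list of address ranges. This is an illustration under stated assumptions, not the kernel implementation: struct toy_vma and toy_find_vma_prev are hypothetical names.

```c
#include <stddef.h>

/* Toy stand-in for a VMA: a sorted, doubly linked list of [start, end)
 * address ranges. All names here are hypothetical. */
struct toy_vma {
    unsigned long start, end;
    struct toy_vma *vm_next, *vm_prev;
};

/* Return the first area ending above addr and store its predecessor in
 * *pprev. If addr is above every area, return NULL but still store the
 * last area in *pprev, so a caller can find an up-growing stack. */
struct toy_vma *toy_find_vma_prev(struct toy_vma *head, unsigned long addr,
                                  struct toy_vma **pprev)
{
    struct toy_vma *last = NULL;

    for (struct toy_vma *v = head; v; v = v->vm_next) {
        if (addr < v->end) {
            *pprev = v->vm_prev;
            return v;
        }
        last = v;
    }
    *pprev = last;
    return NULL;
}
```

With the stack as the last area, a fault address above it yields a NULL return and the stack in *pprev, which is the case the PA-RISC fault path relies on.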
     
  • Commit ac5637611 (genirq: Unmask oneshot irqs when thread was not woken)
    fails to unmask when a !IRQ_ONESHOT threaded handler is handled by
    handle_level_irq.

    This happens because thread_mask is or'ed unconditionally in
    irq_wake_thread(), but is never cleared for !IRQ_ONESHOT interrupts. So
    the check for !desc->thread_active fails and keeps the interrupt
    disabled.

    Keep the thread_mask zero for !IRQ_ONESHOT interrupts.

    Document the thread_mask magic while at it.

    Reported-and-tested-by: Sven Joachim
    Reported-and-tested-by: Stefan Lippers-Hollmann
    Cc: stable@vger.kernel.org
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
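
The bookkeeping described in the commit above can be modelled with a plain bitmask. This is a toy sketch based only on the commit message, not the genirq code: every type and function name below is hypothetical, and the pending-thread field is an assumed stand-in for the state the flow handler checks.

```c
#include <stdbool.h>

/* Hypothetical toy types; they only model the bit bookkeeping. */
struct toy_action { unsigned long thread_mask; bool oneshot; };
struct toy_desc { unsigned long pending_threads; };

/* Setup as after the fix: only oneshot handlers get a bit, all others
 * keep a zero mask. */
void toy_setup(struct toy_action *a, unsigned int bit, bool oneshot)
{
    a->oneshot = oneshot;
    a->thread_mask = oneshot ? 1UL << bit : 0;
}

/* Waking the handler thread ORs the mask in unconditionally. */
void toy_wake_thread(struct toy_desc *d, const struct toy_action *a)
{
    d->pending_threads |= a->thread_mask;
}

/* Only oneshot threads clear their bit when they finish. */
void toy_thread_done(struct toy_desc *d, const struct toy_action *a)
{
    if (a->oneshot)
        d->pending_threads &= ~a->thread_mask;
}

/* The line may be unmasked only when no thread bit is pending. */
bool toy_may_unmask(const struct toy_desc *d)
{
    return d->pending_threads == 0;
}
```

Because a non-oneshot action contributes a zero mask, waking its thread leaves no stale bit behind and the unmask check keeps passing.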
     
  • When OVS_VPORT_ATTR_NAME is specified and dp_ifindex is nonzero, the
    logical behavior would be for the vport name lookup scope to be limited
    to the specified datapath, but in fact the dp_ifindex value was ignored.
    This commit causes the search scope to be honored.

    Signed-off-by: Ben Pfaff
    Signed-off-by: Jesse Gross

    Ben Pfaff
     
  • When forwarding is set and a new net device is registered,
    we need to add this device to the all-routers mcast group.

    Signed-off-by: Li Wei
    Signed-off-by: David S. Miller

    Li Wei
     
  • Currently the error is -ENOMEM when rejecting VM_GROWSDOWN|VM_GROWSUP
    from shared anonymous: hoist the file case's -EINVAL up for both.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Default MTU for CAIF HSI was wrongly set to 15 * 4092 bytes.
    The patch sets default MTU size to 4096.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Sjur Brændeland
     
  • This patch adds PCI device ids for Chelsio's T480-CR and T440-LP-CR
    adapters.

    Signed-off-by: Vipul Pandya
    Signed-off-by: David S. Miller

    Vipul Pandya
     
  • This patch adds PCI device ids for Chelsio's T480-CR and T440-LP-CR
    adapters.

    Signed-off-by: Vipul Pandya
    Signed-off-by: David S. Miller

    Vipul Pandya
     
  • Fixes a bug introduced by commit fe9a2603c, where the priority bits
    in the schedule queue field were masked out.

    Signed-off-by: Amir Vadai
    Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • If reliable event delivery is enabled and ctnetlink fails to deliver
    the destroy event in early_drop, the conntrack subsystem cannot
    drop any candidate flow that was planned to be evicted.

    Reported-by: Kerin Millar
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     
  • When net.bridge.bridge-nf-filter-vlan-tagged is 0 (default), vlan packets
    arriving should not be sent to ip(6)tables by bridge netfilter.

    However, it turns out that we currently always send VLAN packets to
    netfilter if:
    a) CONFIG_VLAN_8021Q is enabled; or
    b) CONFIG_VLAN_8021Q is not set but rx vlan offload is enabled
    on the bridge port.

    This is because bridge netfilter treats skb with
    skb->protocol == ETH_P_IP{V6} as "non-vlan packet".

    With rx vlan offload on or CONFIG_VLAN_8021Q=y, the vlan header has
    already been removed here, and we cannot rely on skb->protocol alone.

    Fix this by only using skb->protocol if the skb has no vlan tag,
    or if a vlan tag is present and filter-vlan-tagged bridge netfilter
    sysctl is enabled.

    We cannot remove the skb->protocol == htons(ETH_P_8021Q) test
    because the vlan tag is still around in the CONFIG_VLAN_8021Q=n &&
    "ethtool -K $itf rxvlan off" case.

    reproducer:
    iptables -t raw -I PREROUTING -i br0
    iptables -t raw -I PREROUTING -i br0.1

    Then send packets to an ip address configured on br0.1 interface.
    Even with net.bridge.bridge-nf-filter-vlan-tagged=0, the 1st rule
    will match instead of the 2nd one.

    With this patch applied, the 2nd rule will match instead.
    In the non-local address case, netfilter won't be consulted after
    this patch unless the sysctl is switched on.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Florian Westphal
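
The decision rule stated in the commit above ("only use skb->protocol if the skb has no vlan tag, or if a tag is present and the sysctl is enabled") can be sketched as a single predicate. This is a toy model, not the bridge netfilter code: struct toy_skb, toy_is_ip and TOY_ETH_P_IP are hypothetical names, and the protocol is kept in host byte order for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_ETH_P_IP 0x0800 /* IPv4 EtherType */

/* Hypothetical, heavily reduced packet descriptor. */
struct toy_skb {
    uint16_t protocol;     /* EtherType, host order in this sketch */
    bool vlan_tag_present; /* tag already stripped into metadata */
};

/* Treat the packet as IPv4 for filtering only when the protocol field
 * can be trusted: no VLAN tag, or a tag with filter-vlan-tagged on. */
bool toy_is_ip(const struct toy_skb *skb, bool filter_vlan_tagged)
{
    if (skb->vlan_tag_present && !filter_vlan_tagged)
        return false;
    return skb->protocol == TOY_ETH_P_IP;
}
```

With the sysctl off, a tagged packet whose header was already removed no longer passes as a plain IPv4 packet, even though its protocol field says IPv4.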
     
  • In adf7ff8, an invalid dereference was added in ebt_make_names.

    CC [M] net/bridge/netfilter/ebtables.o
    net/bridge/netfilter/ebtables.c: In function `ebt_make_names':
    net/bridge/netfilter/ebtables.c:1371:20: warning: `t' may be used uninitialized in this function [-Wuninitialized]

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     
  • Since 7d367e0, ctnetlink_new_conntrack is called without holding
    the nf_conntrack_lock spinlock. Thus, ctnetlink_parse_nat_setup
    no longer needs to release that spinlock in the NAT module
    autoload case.

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     
  • User-space ebtables expects 32-byte names, but xt_match names
    use 29 bytes. We have to copy only 29 bytes and then make sure we
    fill the remaining bytes with zeroes.

    Signed-off-by: Santosh Nayak
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Santosh Nayak
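
The copy described in the commit above can be sketched in userspace C. This is a minimal illustration, not the ebtables code: toy_copy_name and the two length macros are hypothetical names, and the user-space copy itself is replaced by a plain buffer.

```c
#include <string.h>

#define TOY_USER_NAMELEN 32   /* length user-space ebtables expects */
#define TOY_KERNEL_NAMELEN 29 /* length an xt_match name occupies */

/* Copy a 29-byte kernel name into a 32-byte user-space field, zeroing
 * the field first so the three trailing bytes never carry stale data. */
void toy_copy_name(char dst[TOY_USER_NAMELEN],
                   const char src[TOY_KERNEL_NAMELEN])
{
    memset(dst, 0, TOY_USER_NAMELEN);
    memcpy(dst, src, TOY_KERNEL_NAMELEN);
}
```

Reading 32 bytes straight from the 29-byte source would run past its end; copying 29 bytes into a zeroed 32-byte field avoids that and keeps the layout user space expects.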
     
  • With runtime PM, if the ethernet cable is disconnected, the device is
    transitioned to D3 state to conserve energy. If the system is shut down
    in this state, any register accesses in rtl_shutdown are dropped on
    the floor. As the device was programmed by .runtime_suspend() to wake
    on link changes, it is thus brought back up as soon as the link recovers.

    Resuming every suspended device through the driver core would slow things
    down and it is not clear how many devices really need it now.

    Original report and D0 transition patch by Sameer Nanda. Patch has been
    changed to comply with advice from Rafael J. Wysocki and the PM folks.

    Reported-by: Sameer Nanda
    Signed-off-by: Francois Romieu
    Cc: Rafael J. Wysocki
    Cc: Hayes Wang
    Cc: Alan Stern
    Acked-by: Rafael J. Wysocki
    Signed-off-by: David S. Miller

    françois romieu
     
  • This commit fixes tcp_shift_skb_data() so that it does not shift
    SACKed data below snd_una.

    This fixes an issue whose symptoms exactly match reports showing
    tp->sacked_out going negative since 3.3.0-rc4 (see "WARNING: at
    net/ipv4/tcp_input.c:3418" thread on netdev).

    Since 2008 (832d11c5cd076abc0aa1eaf7be96c81d1a59ce41)
    tcp_shift_skb_data() had been shifting SACKed ranges that were below
    snd_una. It checked that the *end* of the skb it was about to shift
    from was above snd_una, but did not check that the end of the actual
    shifted range was above snd_una; this commit adds that check.

    Shifting SACKed ranges below snd_una is problematic because for such
    ranges tcp_sacktag_one() short-circuits: it does not declare anything
    as SACKed and does not increase sacked_out.

    Before the fixes in commits cc9a672ee522d4805495b98680f4a3db5d0a0af9
    and daef52bab1fd26e24e8e9578f8fb33ba1d0cb412, shifting SACKed ranges
    below snd_una happened to work because tcp_shifted_skb() was always
    (incorrectly) passing in to tcp_sacktag_one() an skb whose end_seq
    tcp_shift_skb_data() had already guaranteed was beyond snd_una. Hence
    tcp_sacktag_one() never short-circuited and always increased
    tp->sacked_out in this case.

    After those two fixes, my testing has verified that shifting SACKed
    ranges below snd_una could cause tp->sacked_out to go negative with
    the following sequence of events:

    (1) tcp_shift_skb_data() sees an skb whose end_seq is beyond snd_una,
    then shifts a prefix of that skb that is below snd_una

    (2) tcp_shifted_skb() increments the packet count of the
    already-SACKed prev sk_buff

    (3) tcp_sacktag_one() sees the end of the new SACKed range is below
    snd_una, so it short-circuits and doesn't increase tp->sacked_out

    (5) tcp_clean_rtx_queue() sees the SACKed skb has been ACKed,
    decrements tp->sacked_out by this "inflated" pcount that was
    missing a matching increase in tp->sacked_out, and hence
    tp->sacked_out underflows to a u32 like 0xFFFFFFFF, which, cast
    to s32, is negative.

    (6) this leads to the warnings seen in the recent "WARNING: at
    net/ipv4/tcp_input.c:3418" thread on the netdev list; e.g.:
    tcp_input.c:3418 WARN_ON((int)tp->sacked_out < 0);

    More generally, I think this bug can be tickled in some cases where
    two or more ACKs from the receiver are lost and then a DSACK arrives
    that is immediately above an existing SACKed skb in the write queue.

    This fix changes tcp_shift_skb_data() to abort this sequence at step
    (1) in the scenario above by noticing that the bytes are below snd_una
    and not shifting them.

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
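
Two pieces of arithmetic from the commit above can be sketched in userspace C: a wraparound-safe sequence comparison used as the guard at step (1), and the unsigned underflow that shows up as a negative value when viewed as signed. This is a toy model, not the TCP stack: all function names are hypothetical, and the guard is an assumed simplification of the check the commit adds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe comparison: true if seq1 is later than seq2. */
bool toy_seq_after(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq2 - seq1) < 0;
}

/* Hypothetical guard: shift a SACKed range only if its end lies
 * above snd_una. */
bool toy_may_shift(uint32_t range_end_seq, uint32_t snd_una)
{
    return toy_seq_after(range_end_seq, snd_una);
}

/* Decrement an unsigned packet counter and view the result as signed,
 * the way the warning in the commit message inspects sacked_out. */
int32_t toy_sacked_out_after_ack(uint32_t sacked_out, uint32_t pcount)
{
    return (int32_t)(sacked_out - pcount);
}
```

A range ending below snd_una is rejected by the guard, so the later decrement never has an unmatched packet count to subtract. Without the guard, subtracting a count of 1 from a counter that is already 0 wraps to 0xFFFFFFFF, which reads as -1 when viewed as a signed 32-bit value.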
     
  • …wireless into for-davem

    John W. Linville
     
  • Pull arm-soc bug fixes from Arnd Bergmann:
    "Here are all the fixes I got after sending the last pull request.
    These fix mostly regressions on exynos, at91, pxa and ep93xx.

    Signed-off-by: Arnd Bergmann"

    * tag 'fixes-3.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    ARM: ep93xx: convert vision_ep9307 to MULTI_IRQ_HANDLER
    ARM: EXYNOS: fix touchscreen IRQ setup on Universal C210 board
    ARM: pxa: fix invalid mfp pin issue
    ARM: pxa: remove duplicated registeration on pxa-gpio
    ARM: pxa: add dummy clock for pxa25x and pxa27x
    ARM: S3C24XX: DMA resume regression fix
    ARM: S3C24XX: Fix restart on S3C2442
    ARM: SAMSUNG: Fix memory size for hsotg
    ARM: at91/dma: DMA controller registering with DT support
    ARM: at91/dma: remove platform data from DMA controller

    Linus Torvalds