19 Mar, 2012

2 commits

  • Linus Torvalds
     
  • Commit 28d82dc1c4ed ("epoll: limit paths") that I did to limit the
    number of possible wakeup paths in epoll is causing a few applications
    to no longer work (dovecot for one).

    The original patch is really about limiting the amount of epoll nesting
    (since epoll fds can be attached to other fds). Thus, we probably can
    allow an unlimited number of paths of depth 1 (my current patch limits
    them to 1000) and enforce the limits only on paths of greater depth.

    This is captured in: https://bugzilla.redhat.com/show_bug.cgi?id=681578

    Signed-off-by: Jason Baron
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jason Baron
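
    The depth rule described above can be sketched in user-space C. This is
    only an illustration, not the kernel code: path_count_inc(), the counts
    array and every cap other than the 1000 figure quoted in the message are
    assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical per-depth caps; index 0 corresponds to depth 1. Only the
 * value 1000 appears in the commit message, the others are made up. */
static const int path_limits[] = { 1000, 500, 100, 50, 10 };
#define MAX_DEPTH (sizeof(path_limits) / sizeof(path_limits[0]))

/* Record one more wakeup path of the given depth.
 * counts[d - 1] holds the number of paths already accepted at depth d.
 * Returns 0 if the path is allowed, -1 if it is rejected. */
static int path_count_inc(int *counts, size_t depth)
{
    if (depth == 0 || depth > MAX_DEPTH)
        return -1;              /* nesting too deep (or invalid) */
    if (depth == 1)
        return 0;               /* depth 1 is no longer limited */
    if (counts[depth - 1] >= path_limits[depth - 1])
        return -1;              /* cap reached for this depth */
    counts[depth - 1]++;
    return 0;
}
```

    With this shape, an application that attaches thousands of plain fds to
    one epoll set is never rejected, while nested epoll sets stay bounded.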
     

18 Mar, 2012

2 commits

  • Pull networking changes from David Miller:
    "1) icmp6_dst_alloc() returns NULL instead of ERR_PTR() leading to
    crashes, particularly during shutdown. Reported by Dave Jones and
    fixed by Eric Dumazet.

    2) hyperv and wimax/i2400m return NETDEV_TX_BUSY when they have
    already freed the SKB, which causes crashes as to the caller this
    means requeue the packet. Fixes from Eric Dumazet.

    3) usbnet driver doesn't allocate the right amount of headroom on
    fresh RX SKBs, fix from Eric Dumazet.

    4) Fix regression in ip6_mc_find_dev_rcu(), as an RCU lookup it
    absolutely should not take a reference to 'dev', this leads to
    leaks. Fix from RongQing Li.

    5) Fix netfilter ctnetlink race between delete and timeout expiration.
    From Pablo Neira Ayuso.

    6) Revert SFQ change which causes regressions, specifically queueing
    to tail can lead to unavoidable flow starvation. From Eric
    Dumazet.

    7) Fix a memory leak and a crash on corrupt firmware files in bnx2x,
    from Michal Schmidt."

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    netfilter: ctnetlink: fix race between delete and timeout expiration
    ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.
    wimax/i2400m: fix erroneous NETDEV_TX_BUSY use
    net/hyperv: fix erroneous NETDEV_TX_BUSY use
    net/usbnet: reserve headroom on rx skbs
    bnx2x: fix memory leak in bnx2x_init_firmware()
    bnx2x: fix a crash on corrupt firmware file
    sch_sfq: revert dont put new flow at the end of flows
    ipv6: fix icmp6_dst_alloc()

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar.

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf tools, x86: Build perf on older user-space as well
    perf tools: Use scnprintf where applicable
    perf tools: Incorrect use of snprintf results in SEGV

    Linus Torvalds
     

17 Mar, 2012

9 commits

  • Kerin Millar reported hardlockups while running `conntrackd -c'
    in a busy firewall. That system (with several processors) was
    acting as backup in a primary-backup setup.

    After several tries, I found a race condition between the deletion
    operation of ctnetlink and timeout expiration. This patch fixes
    this problem.

    Tested-by: Kerin Millar
    Reported-by: Kerin Millar
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     
  • ip6_mc_find_dev_rcu() is called with rcu_read_lock() held, so there is
    no need to call dev_hold().
    Calling dev_hold() without a corresponding dev_put() leads to a leak.

    [ bug introduced in 96b52e61be1 (ipv6: mcast: RCU conversions) ]

    Signed-off-by: RongQing.Li
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    RongQing.Li
     
  • Merge some more email patches from Andrew Morton:
    "A couple of nilfs fixes"

    * emailed from Andrew Morton:
    nilfs2: fix NULL pointer dereference in nilfs_load_super_block()
    nilfs2: clamp ns_r_segments_percentage to [1, 99]

    Linus Torvalds
     
  • According to the report from Slicky Devil, nilfs caused kernel oops at
    nilfs_load_super_block function during mount after he shrank the
    partition without resizing the filesystem:

    BUG: unable to handle kernel NULL pointer dereference at 00000048
    IP: [] nilfs_load_super_block+0x17e/0x280 [nilfs2]
    *pde = 00000000
    Oops: 0000 [#1] PREEMPT SMP
    ...
    Call Trace:
    [] init_nilfs+0x4b/0x2e0 [nilfs2]
    [] nilfs_mount+0x447/0x5b0 [nilfs2]
    [] mount_fs+0x36/0x180
    [] vfs_kern_mount+0x51/0xa0
    [] do_kern_mount+0x3e/0xe0
    [] do_mount+0x169/0x700
    [] sys_mount+0x6b/0xa0
    [] sysenter_do_call+0x12/0x28
    Code: 53 18 8b 43 20 89 4b 18 8b 4b 24 89 53 1c 89 43 24 89 4b 20 8b 43
    20 c7 43 2c 00 00 00 00 23 75 e8 8b 50 68 89 53 28 8b 54 b3 20 72
    48 8b 7a 4c 8b 55 08 89 b3 84 00 00 00 89 bb 88 00 00 00
    EIP: [] nilfs_load_super_block+0x17e/0x280 [nilfs2] SS:ESP 0068:ca9bbdcc
    CR2: 0000000000000048

    This turned out to be due to a defect in an error path which runs if
    the calculated location of the secondary super block was invalid.

    This patch fixes it and eliminates the reported oops.

    Reported-by: Slicky Devil
    Signed-off-by: Ryusuke Konishi
    Tested-by: Slicky Devil
    Cc: [2.6.30+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • ns_r_segments_percentage is read from the disk. A bogus or malicious
    value could cause integer overflow and malfunction due to a meaningless
    disk usage calculation. This patch reports an error when mounting such
    bogus volumes.

    Signed-off-by: Haogang Chen
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Haogang Chen
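
    A minimal sketch of the kind of range check described above. The
    function and macro names are hypothetical; the bounds come from the
    "[1, 99]" in the patch title, and rejecting with -EINVAL is an
    assumption about how the error is reported.

```c
#include <errno.h>
#include <stdint.h>

/* Bounds taken from the patch title "clamp ns_r_segments_percentage
 * to [1, 99]"; the names are illustrative only. */
#define R_SEGMENTS_PCT_MIN 1u
#define R_SEGMENTS_PCT_MAX 99u

/* Validate an on-disk percentage before using it in any arithmetic.
 * Returns 0 if the value is usable, -EINVAL otherwise. */
static int check_r_segments_percentage(uint32_t pct)
{
    if (pct < R_SEGMENTS_PCT_MIN || pct > R_SEGMENTS_PCT_MAX)
        return -EINVAL;
    return 0;
}
```

    Checking once at mount time means later usage calculations can assume
    the percentage is sane.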
     
  • Pull maintainer update from James Morris:
    "Please pull this patch which adds Serge as maintainer of the
    capabilities code, as discussed on lwn and the lsm list.

    New capabilities must be signed off by the maintainer, and new uses of
    any capabilities should at least be cc'd to the maintainer."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    MAINTAINERS: Add Serge as maintainer of capabilities

    Linus Torvalds
     
  • Pull c6x bugfix from Mark Salter:
    "Remove dead code from entry.S which causes a build failure when using
    a newer assembler (v2.22 complains about it, v2.20 ignores it)."

    * tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
    C6X: remove dead code from entry.S

    Linus Torvalds
     
  • When writing files to afs I sometimes hit a BUG:

    kernel BUG at fs/afs/rxrpc.c:179!

    With a backtrace of:

    afs_free_call
    afs_make_call
    afs_fs_store_data
    afs_vnode_store_data
    afs_write_back_from_locked_page
    afs_writepages_region
    afs_writepages

    The cause is:

    ASSERT(skb_queue_empty(&call->rx_queue));

    Looking at a tcpdump of the session the abort happens because we
    are exceeding our disk quota:

    rx abort fs reply store-data error diskquota exceeded (32)

    So the abort error is valid. We hit the BUG because we haven't
    freed all the resources for the call.

    By freeing any skbs in call->rx_queue before calling afs_free_call
    we avoid leaking memory and avoid hitting the BUG.

    Signed-off-by: Anton Blanchard
    Signed-off-by: David Howells
    Cc:
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     
  • A read of a large file on an afs mount failed:

    # cat junk.file > /dev/null
    cat: junk.file: Bad message

    Looking at the trace, call->offset wrapped since it is only an
    unsigned short. In afs_extract_data:

    _enter("{%u},{%zu},%d,,%zu", call->offset, len, last, count);
    ...

    if (call->offset < count) {
            if (last) {
                    _leave(" = -EBADMSG [%d < %zu]", call->offset, count);
                    return -EBADMSG;
            }

    Which matches the trace:

    [cat ] ==> afs_extract_data({65132},{524},1,,65536)
    [cat ] <== afs_extract_data() = -EBADMSG [0 < 65536]

    call->offset went from 65132 to 0. Fix this by making call->offset an
    unsigned int.

    Signed-off-by: Anton Blanchard
    Signed-off-by: David Howells
    Cc:
    Signed-off-by: Linus Torvalds

    Anton Blanchard
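
    The wraparound can be reproduced in isolation. The sketch below is not
    the afs code: the helper names are made up, uint16_t stands in for the
    unsigned short field, and the 404-byte step assumes the copy is capped
    at count - offset (65536 - 65132).

```c
#include <stddef.h>
#include <stdint.h>

/* Advance a 16-bit offset, as the old field type would. */
static uint16_t advance_u16(uint16_t offset, size_t n)
{
    return (uint16_t)(offset + n);   /* truncated modulo 65536 */
}

/* Advance a 32-bit offset, as after the fix. */
static uint32_t advance_u32(uint32_t offset, size_t n)
{
    return (uint32_t)(offset + n);
}
```

    advance_u16(65132, 404) yields 0, which is then smaller than the count
    of 65536 and triggers the -EBADMSG path, while advance_u32(65132, 404)
    yields 65536 as intended.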
     

16 Mar, 2012

26 commits

  • The ENDPROC() on sys_fadvise64_c6x() in arch/c6x/kernel/entry.S is
    outside of the conditional block with the matching ENTRY() macro. This
    leads a newer (v2.22 vs. v2.20) assembler to complain:

    /tmp/ccGZBaPT.s: Assembler messages:
    /tmp/ccGZBaPT.s: Error: .size expression for sys_fadvise64_c6x does not evaluate to a constant

    The conditional block became dead code when c6x switched to generic
    unistd.h and should be removed along with the offending ENDPROC().

    Signed-off-by: Mark Salter
    Acked-by: David Howells

    Mark Salter
     
  • A driver start_xmit() method cannot free the skb and return
    NETDEV_TX_BUSY, since the caller is going to reuse the freed skb.

    In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
    returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.

    In case of a memory allocation error, the only safe way is to drop the
    packet and return NETDEV_TX_OK.

    Also increment the tx_dropped counter.

    Signed-off-by: Eric Dumazet
    Cc: Inaky Perez-Gonzalez
    Signed-off-by: David S. Miller

    Eric Dumazet
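
    The ownership rule can be illustrated with a user-space mock. Nothing
    here is driver code: struct pkt, struct dev_stats, xmit() and the
    alloc_ok flag are stand-ins invented for the example, and TX_OK / TX_BUSY
    merely play the roles of NETDEV_TX_OK / NETDEV_TX_BUSY.

```c
#include <stdlib.h>

/* Stand-ins for the two return codes discussed above. */
enum tx_status { TX_OK, TX_BUSY };

struct pkt { size_t len; };
struct dev_stats { unsigned long tx_dropped; };

/* Hypothetical transmit hook. alloc_ok simulates whether the driver's
 * own allocation succeeded. The hook always consumes the packet, so it
 * must never return TX_BUSY: that would tell the caller to requeue a
 * packet that has already been freed. */
static enum tx_status xmit(struct pkt *p, struct dev_stats *st, int alloc_ok)
{
    if (!alloc_ok) {
        free(p);              /* drop the packet ...            */
        st->tx_dropped++;     /* ... account for the drop ...   */
        return TX_OK;         /* ... and report it as consumed  */
    }

    /* Normal path: hand the data to the (imaginary) hardware. */
    free(p);
    return TX_OK;
}
```

    Returning TX_BUSY would only be valid if the packet were left untouched
    and the queue stopped first, so that the caller does not retry in a
    tight loop.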
     
  • A driver start_xmit() method cannot free the skb and return
    NETDEV_TX_BUSY, since the caller is going to reuse the freed skb.

    This is mostly a revert of commit bf769375c (staging: hv: fix the return
    status of netvsc_start_xmit())

    In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
    returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.

    In case of a memory allocation error, the only safe way is to drop the
    packet and return NETDEV_TX_OK.

    Signed-off-by: Eric Dumazet
    Cc: "K. Y. Srinivasan"
    Cc: Haiyang Zhang
    Cc: Greg Kroah-Hartman
    Reviewed-by: Haiyang Zhang
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Network drivers should reserve some headroom on incoming skbs so that we
    don't need expensive reallocations, e.g. when forwarding packets in
    tunnels.

    This NET_SKB_PAD padding is done in various helpers, like
    __netdev_alloc_skb_ip_align() in this patch, combining NET_SKB_PAD and
    NET_IP_ALIGN magic.

    Signed-off-by: Eric Dumazet
    Cc: Oliver Neukum
    Cc: Greg Kroah-Hartman
    Acked-by: Oliver Neukum
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • When cycling the interface down and up, bnx2x_init_firmware() knows that
    the firmware is already loaded, but nevertheless it allocates certain
    arrays anew (init_data, init_ops, init_ops_offsets, iro_arr). The old
    arrays are leaked.

    Fix the leaks by returning early if the firmware was already loaded:
    if the firmware is loaded, so are the arrays.

    Signed-off-by: Michal Schmidt
    Acked-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Michal Schmidt
     
  • If the requested firmware is deemed corrupt and then released, reset the
    pointer to NULL in order to avoid double-freeing it in
    bnx2x_release_firmware() or dereferencing it in bnx2x_init_firmware().

    Signed-off-by: Michal Schmidt
    Acked-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Michal Schmidt
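
    The two bnx2x fixes above follow a common pattern: return early when the
    resource is already loaded, and clear the pointer whenever it is
    released. A user-space sketch, with a hypothetical struct adapter and a
    corrupt flag standing in for the real firmware validation:

```c
#include <stdlib.h>

struct adapter { void *firmware; };

/* Release the firmware and forget the pointer, so that a second call
 * (or a later init) cannot free or dereference stale memory. */
static void release_fw(struct adapter *a)
{
    free(a->firmware);
    a->firmware = NULL;
}

/* Load the firmware once. corrupt simulates a failed sanity check. */
static int init_fw(struct adapter *a, size_t len, int corrupt)
{
    if (a->firmware)
        return 0;             /* already loaded: allocate nothing anew */

    a->firmware = malloc(len);
    if (!a->firmware)
        return -1;

    if (corrupt) {
        release_fw(a);        /* frees and resets the pointer to NULL */
        return -1;
    }
    return 0;
}
```

    Because release_fw() leaves the pointer NULL, calling it again after a
    failed init_fw() is harmless (free(NULL) is a no-op).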
     
  • This reverts commit d47a0ac7b6 (sch_sfq: dont put new flow at the end of
    flows)

    As Jesper found out, the patch sounded great but has bad side effects.

    Under stress, pushing new flows to the front of the queue can prevent
    old flows from making any progress. Packets can stay in the SFQ queue
    for an unlimited amount of time.

    It's possible to add heuristics to limit this problem, but this would
    add complexity outside of SFQ scope.

    A more sensible answer to Dave Taht's concerns (he reported the issue I
    tried to solve in the original commit) is probably to use a qdisc
    hierarchy so that high prio packets don't enter a potentially crowded
    SFQ qdisc.

    Reported-by: Jesper Dangaard Brouer
    Cc: Dave Taht
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Commit 87a115783 (ipv6: Move xfrm_lookup() call down into
    icmp6_dst_alloc().) forgot to convert one error path, leading
    to crashes in mld_sendpack().

    Many thanks to Dave Jones for providing a very complete bug report.

    Reported-by: Dave Jones
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
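
    Why an unconverted error path crashes can be shown with a user-space
    re-creation of the error-pointer convention. The three helpers below
    mimic the kernel's ERR_PTR() / PTR_ERR() / IS_ERR() but are local
    definitions written for this example; the two alloc_* functions and
    caller_sees_failure() are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Local imitations of the kernel helpers: small negative errno values
 * are encoded in the top MAX_ERRNO addresses of the pointer range. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* An error path that was not converted: it still returns NULL. */
static void *alloc_fail_null(void) { return NULL; }

/* A converted error path: it returns an encoded errno. */
static void *alloc_fail_errptr(void) { return ERR_PTR(-ENOMEM); }

/* A caller that only tests IS_ERR(), as callers of a converted
 * function would. Returns 1 if the failure was noticed. */
static int caller_sees_failure(void *dst)
{
    return IS_ERR(dst);
}
```

    IS_ERR(NULL) is false, so a caller that relies on IS_ERR() treats the
    NULL as a valid object and dereferences it.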
     
  • Add Serge as maintainer of capabilities, per suggestion on LWN:
    http://lwn.net/Articles/486306/

    Signed-off-by: James Morris

    James Morris
     
  • Merge patches from Andrew Morton:
    "Nine patches - some bug fixes and some MAINTAINERS fiddling."

    * emailed from Andrew Morton:
    drivers/video/backlight/s6e63m0.c: fix corruption storing gamma mode
    MAINTAINERS: add entry for exynos mipi display drivers
    MAINTAINERS: fix link to Gustavo Padovans tree
    MAINTAINERS: add Johan to Bluetooth maintainers
    MAINTAINERS: Gustavo has moved
    prctl: use CAP_SYS_RESOURCE for PR_SET_MM option
    rapidio/tsi721: fix bug in register offset definitions
    MAINTAINERS: update ST's Mailing list for SPEAr
    memcg: free mem_cgroup by RCU to fix oops

    Linus Torvalds
     
  • Pull i2c subsystem fixes from Jean Delvare.

    * 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
    i2c-algo-bit: Fix spurious SCL timeouts under heavy load
    i2c-core: Comment says "transmitted" but means "received"

    Linus Torvalds
     
  • Pull hwmon fixes from Guenter Roeck.

    * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    hwmon: (zl6100) Enable interval between chip accesses for all chips
    hwmon: (w83627ehf) Describe undocumented pwm attributes
    hwmon: (w83627ehf) Fix temp2 source for W83627UHG
    hwmon: (w83627ehf) Fix memory leak in probe function
    hwmon: (w83627ehf) Fix writing into fan_stop_time for NCT6775F/NCT6776F

    Linus Torvalds
     
  • Pull drm exynos/intel updates from Dave Airlie:
    "Two minor updates from Jesse for Intel SNB fixes, and a few fixes from
    Samsung for exynos. The pull req has Alan's commit in it since Intel
    based their tree on my tree at that time, but it all seems fine wrt
    merging."

    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
    drm exynos: use drm_fb_helper_set_par directly
    drm/exynos: Fix fb_videomode drm_mode_modeinfo conversion
    drm/exynos: fix runtime_pm fimd device state on probe
    drm/exynos: use correct 'exynos-drm' name for platform device
    drm/i915: support 32 bit BGR formats in sprite planes
    drm/i915: fix color order for BGR formats on SNB
    drm/gma500: Fix Cedarview boot failures in 3.3-rc

    Linus Torvalds
     
  • Pull media fixes from Mauro Carvalho Chehab:
    "4 fixes for 3.3 (all trivial):
    - uvc video driver: fixes a division by zero;
    - davinci: add module.h to fix compilation;
    - smsusb: fix the delivery system setting;
    - smsdvb: the get_frontend implementation there is broken.

    The smsdvb patch has 127 lines, but it is trivial: instead of
    returning a cache of the set_frontend (which is wrong, as it doesn't
    have the updated values for the data, and the implementation there is
    buggy), it copies the information of the detected DVB parameters from
    the smsdvb private structures into the corresponding DVBv5 struct
    fields."

    * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
    [media] smsdvb: fix get_frontend
    [media] smsusb: fix the default delivery system setting
    [media] media: davinci: added module.h to resolve unresolved macros
    [media] [FOR,v3.3] uvcvideo: Avoid division by 0 in timestamp calculation

    Linus Torvalds
     
  • Pull target fixes from Nicholas Bellinger:
    "This series addresses two recently reported regression bugs related to
    legacy SCSI reservation usage in target core, and iscsi-target
    reservation conflict handling.

    The second patch in particular addresses possible data-corruption with
    SCSI reservations that is specific to iscsi-target fabric LUNs with
    multiple client writers. Both patches need to go into v3.2 stable
    ASAP, and the branch is based on the last target-pending/3.3-rc-fixes
    HEAD.

    Again, thanks to Martin Svec for his help to identify and address this
    regression bug with iscsi-target."

    * '3.3-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
    iscsi-target: Fix reservation conflict -EBUSY response handling bug
    target: Fix compatible reservation handling (CRH=1) with legacy RESERVE/RELEASE

    Linus Torvalds
     
  • strict_strtoul() writes a long but ->gamma_mode only has space to store an
    int, so on 64 bit systems we end up scribbling over ->gamma_table_count as
    well. I've changed it to use kstrtouint() instead.

    Signed-off-by: Dan Carpenter
    Acked-by: Inki Dae
    Signed-off-by: Florian Tobias Schandinat
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter
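
    The underlying rule is to parse into a variable of the destination's
    width, or to range-check before storing. A user-space sketch of that
    idea follows; parse_uint() and struct panel are hypothetical, and
    strtoul() stands in for the kernel parsing helpers.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Two adjacent int-sized fields, mirroring the layout problem above. */
struct panel {
    unsigned int gamma_mode;
    unsigned int gamma_table_count;
};

/* Parse s into an unsigned int. The value is first read into a local
 * unsigned long and only stored once it is known to fit, so nothing
 * wider than *out is ever written through the pointer.
 * Returns 0 on success, -EINVAL or -ERANGE on failure. */
static int parse_uint(const char *s, unsigned int *out)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(s, &end, 0);
    if (errno || end == s || *end != '\0')
        return -EINVAL;
    if (v > UINT_MAX)
        return -ERANGE;

    *out = (unsigned int)v;
    return 0;
}
```

    Calling parse_uint("2", &p.gamma_mode) updates only gamma_mode and
    leaves the neighbouring gamma_table_count untouched.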
     
  • I'd like to add Inki Dae, Donghwa Lee and Kyungmin Park as maintainers
    of the exynos mipi display drivers, covering
    video/driver/exynos/exynos_mipi* and include/video/exynos_mipi*.

    Signed-off-by: Donghwa Lee
    Signed-off-by: Inki Dae
    Signed-off-by: Kyungmin Park
    Cc: Florian Tobias Schandinat
    Cc: Richard Purdie
    Cc: Kukjin Kim
    Cc: Jingoo Han
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Donghwa Lee
     
  • Gustavo's tree is called just bluetooth.git and not bluetooth-2.6.git
    anymore.

    Signed-off-by: Johan Hedberg
    Cc: Marcel Holtmann
    Cc: "Gustavo F. Padovan"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johan Hedberg
     
  • I've been coordinating Bluetooth patches in my tree for some time and
    it's possible I'll do it in the future too, so add myself to the
    Bluetooth sections as well as mention my tree there.

    Signed-off-by: Johan Hedberg
    Cc: Marcel Holtmann
    Cc: "Gustavo F. Padovan"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johan Hedberg
     
  • This is going to be the primary e-mail for kernel development.

    Signed-off-by: Gustavo Padovan
    Cc: Johan Hedberg
    Cc: Marcel Holtmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gustavo Padovan
     
  • CAP_SYS_ADMIN is already overloaded left and right, so to have more
    fine-grained access control use CAP_SYS_RESOURCE here.

    CAP_SYS_RESOURCE is chosen because this prctl option allows the current
    process to adjust some fields of the memory map descriptor, which
    represent what the process owns: pointers to code, data, stack
    segments, command line, auxiliary vector data, etc.

    Suggested-by: Michael Kerrisk
    Acked-by: Kees Cook
    Acked-by: Michael Kerrisk
    Cc: Pavel Emelyanov
    Cc: Tejun Heo
    Cc: Oleg Nesterov
    Cc: Paul Bolle
    Cc: KOSAKI Motohiro
    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
     
  • Fix indexed register offset definitions that use decimal (wrong) instead
    of hexadecimal (correct) notation for indexing multipliers.

    The incorrect definitions do not affect the Tsi721 driver in its current
    default configuration because it uses only IDB queue 0. Loss of inbound
    doorbell functionality should be observed if a queue other than 0 is
    used.

    Signed-off-by: Alexandre Bounine
    Cc: Matt Porter
    Cc: Chul Kim
    Cc: [3.2+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexandre Bounine
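
    The effect of a decimal multiplier in an indexed offset macro is easy to
    see in a small example. The base address and macro names below are
    invented for illustration and are not the Tsi721 definitions.

```c
/* Hypothetical register block: one bank per queue index. */
#define IDQ_BASE            0x20000u

/* Wrong: the stride is written in decimal (1000 = 0x3e8). */
#define IDQ_REG_WRONG(x)    (IDQ_BASE + (x) * 1000)

/* Right: the stride is written in hexadecimal (0x1000 = 4096). */
#define IDQ_REG_RIGHT(x)    (IDQ_BASE + (x) * 0x1000)
```

    Both macros agree for index 0, which is why a configuration that only
    uses queue 0 never notices the bug; for index 1 they give 0x203e8 and
    0x21000 respectively.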
     
  • We have created an ST mailing list for SPEAr. It can be accessed
    from non-ST email ids. I want people to cc this list when they have
    changes specific to SPEAr. So, it's better to get this updated in
    the MAINTAINERS file.

    linux-arm-kernel@lists.infradead.org is also added for SPEAr.

    Signed-off-by: Viresh Kumar
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Viresh Kumar
     
  • After fixing the GPF in mem_cgroup_lru_del_list(), three times one
    machine running a similar load (moving and removing memcgs while
    swapping) has oopsed in mem_cgroup_zone_nr_lru_pages(), when retrieving
    memcg zone numbers for get_scan_count() for shrink_mem_cgroup_zone():
    this is where a struct mem_cgroup is first accessed after being chosen
    by mem_cgroup_iter().

    Just what protects a struct mem_cgroup from being freed, in between
    mem_cgroup_iter()'s css_get_next() and its css_tryget()? css_tryget()
    fails once css->refcnt is zero with CSS_REMOVED set in flags, yes: but
    what if that memory is freed and reused for something else, which sets
    "refcnt" non-zero? Hmm, and scope for an indefinite freeze if refcnt is
    left at zero but flags are cleared.

    It's tempting to move the css_tryget() into css_get_next(), to make it
    really "get" the css, but I don't think that actually solves anything:
    the same difficulty in moving from css_id found to stable css remains.

    But we already have rcu_read_lock() around the two, so it's easily fixed
    if __mem_cgroup_free() just uses kfree_rcu() to free mem_cgroup.

    However, a big struct mem_cgroup is allocated with vzalloc() instead of
    kzalloc(), and we're not allowed to vfree() at interrupt time: there
    doesn't appear to be a general vfree_rcu() to help with this, so roll
    our own using schedule_work(). The compiler decently removes
    vfree_work() and vfree_rcu() when the config doesn't need them.

    Signed-off-by: Hugh Dickins
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Johannes Weiner
    Cc: Konstantin Khlebnikov
    Cc: Tejun Heo
    Cc: Ying Han
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • When the system is under heavy load, there can be a significant delay
    between the getscl() and time_after() calls inside sclhi(). That delay
    may cause the time_after() check to trigger after SCL has gone high,
    causing sclhi() to return -ETIMEDOUT.

    To fix the problem, double check that SCL is still low after the
    timeout has been reached, before deciding to return -ETIMEDOUT.

    Signed-off-by: Ville Syrjala
    Cc: stable@vger.kernel.org
    Signed-off-by: Jean Delvare

    Ville Syrjala
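
    The double check can be sketched with a mocked bus. This is not the
    i2c-algo-bit code: struct bus, wait_scl_high() and the scripted mock
    callbacks are all made up so that the race can be replayed
    deterministically in user space.

```c
#include <errno.h>

/* Minimal bus abstraction: a line sampler and a clock. */
struct bus {
    int (*getscl)(void *ctx);
    unsigned long (*now)(void *ctx);
    void *ctx;
    unsigned long timeout;
};

/* Wait for SCL to go high. After the deadline has passed, sample SCL
 * once more before giving up: the caller may have been delayed between
 * reading the line and reading the clock. */
static int wait_scl_high(const struct bus *b)
{
    unsigned long start = b->now(b->ctx);

    while (!b->getscl(b->ctx)) {
        if (b->now(b->ctx) - start > b->timeout) {
            if (b->getscl(b->ctx))
                break;              /* line went high in the meantime */
            return -ETIMEDOUT;      /* still low: a real timeout */
        }
    }
    return 0;
}

/* Scripted mock: SCL reads low for the first high_after samples, and
 * every clock read jumps forward by step ticks (a heavily loaded box). */
struct script {
    int scl_reads;
    int high_after;
    unsigned long t;
    unsigned long step;
};

static int mock_getscl(void *ctx)
{
    struct script *s = ctx;
    return ++s->scl_reads > s->high_after;
}

static unsigned long mock_now(void *ctx)
{
    struct script *s = ctx;
    s->t += s->step;
    return s->t;
}
```

    With a script where SCL rises right after the first sample and the
    clock jumps past the deadline, the re-check turns a spurious timeout
    into success; a line that stays low still times out.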
     
  • The comment says "transmitted" but means "received". Fix that. Also
    convert this and the related comment to proper commenting style.

    Signed-off-by: Wolfram Sang
    Signed-off-by: Jean Delvare

    Wolfram Sang
     

15 Mar, 2012

1 commit