22 Sep, 2011

1 commit

  • * 'for-linus' of git://git.kernel.dk/linux-block:
    floppy: use del_timer_sync() in init cleanup
    blk-cgroup: be able to remove the record of unplugged device
    block: Don't check QUEUE_FLAG_SAME_COMP in __blk_complete_request
    mm: Add comment explaining task state setting in bdi_forker_thread()
    mm: Cleanup clearing of BDI_pending bit in bdi_forker_thread()
    block: simplify force plug flush code a little bit
    block: change force plug flush call order
    block: Fix queue_flag update when rq_affinity goes from 2 to 1
    block: separate priority boosting from REQ_META
    block: remove READ_META and WRITE_META
    xen-blkback: fixed indentation and comments
    xen-blkback: Don't disconnect backend until state switched to XenbusStateClosed.

    Linus Torvalds
     

19 Sep, 2011

4 commits

  • * git://github.com/davem330/net:
    tcp: fix validation of D-SACK
    tcp: fix build error if !CONFIG_SYN_COOKIES

    Linus Torvalds
     
  • commit 946cedccbd7387 (tcp: Change possible SYN flooding messages)
    added a build error if CONFIG_SYN_COOKIES=n

    Reported-by: Markus Trippelsdorf
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • * 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6:
    mfd: Fix omap-usb-host build failure
    mfd: Make omap-usb-host TLL mode work again
    mfd: Set MAX8997 irq pointer
    mfd: Fix initialisation of tps65910 interrupts
    mfd: Check for twl4030-madc NULL pointer
    mfd: Copy the device pointer to the twl4030-madc structure
    mfd: Rename wm8350 static gpio_set_debounce()
    mfd: Fix value of WM8994_CONFIGURE_GPIO

    Linus Torvalds
     
  • * git://github.com/davem330/net: (62 commits)
    ipv6: don't use inetpeer to store metrics for routes.
    can: ti_hecc: include linux/io.h
    IRDA: Fix global type conflicts in net/irda/irsysctl.c v2
    net: Handle different key sizes between address families in flow cache
    net: Align AF-specific flowi structs to long
    ipv4: Fix fib_info->fib_metrics leak
    caif: fix a potential NULL dereference
    sctp: deal with multiple COOKIE_ECHO chunks
    ibmveth: Fix checksum offload failure handling
    ibmveth: Checksum offload is always disabled
    ibmveth: Fix issue with DMA mapping failure
    ibmveth: Fix DMA unmap error
    pch_gbe: support ML7831 IOH
    pch_gbe: added the process of FIFO over run error
    pch_gbe: fixed the issue which receives an unnecessary packet.
    sfc: Use 64-bit writes for TX push where possible
    Revert "sfc: Use write-combining to reduce TX latency" and follow-ups
    bnx2x: Fix ethtool advertisement
    bnx2x: Fix 578xx link LED
    bnx2x: Fix XMAC loopback test
    ...

    Linus Torvalds
     

17 Sep, 2011

3 commits

  • With the conversion of struct flowi to a union of AF-specific structs, some
    operations on the flow cache need to account for the exact size of the key.

    Signed-off-by: David Ward
    Signed-off-by: David S. Miller

    dpward
     
  • AF-specific flowi structs are now passed to flow_key_compare, so they must
    also be aligned to a long.

    Signed-off-by: David Ward
    Signed-off-by: David S. Miller

    David Ward
     
  • An attempt to reduce the number of IP packets emitted in response to a single
    SCTP packet (2e3216cd) introduced a complication: if a packet contains
    two COOKIE_ECHO chunks and nothing else, then the SCTP state machine corks the
    socket while processing the first COOKIE_ECHO, then loses the association
    and forgets to uncork the socket. To deal with the issue, add a new SCTP
    command which can be used to set the association explicitly. Use this new
    command when processing the second COOKIE_ECHO chunk to restore the context
    for the SCTP state machine.

    Signed-off-by: Max Matveev
    Signed-off-by: David S. Miller

    Max Matveev
     

16 Sep, 2011

3 commits

  • David S. Miller
     
  • dev_forward_skb loops an skb back into host networking
    stack which might hang on the memory indefinitely.
    In particular, this can happen in macvtap in bridged mode.
    Copy the userspace fragments to avoid blocking the
    sender in that case.

    As this patch makes skb_copy_ubufs extern now,
    I also added some documentation and changed it to clear
    the SKBTX_DEV_ZEROCOPY flag automatically instead
    of having every caller do so. This can be made into a separate
    patch if people feel it's worth it.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     
  • "Possible SYN flooding on port xxxx " messages can fill logs on servers.

    Change logic to log the message only once per listener, and add two new
    SNMP counters to track:

    TCPReqQFullDoCookies: number of times a SYNCOOKIE was sent in reply to a client

    TCPReqQFullDrop: number of times a SYN request was dropped because
    syncookies were not enabled.

    Based on a prior patch from Tom Herbert, and suggestions from David.

    Signed-off-by: Eric Dumazet
    CC: Tom Herbert
    Signed-off-by: David S. Miller

    Eric Dumazet
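
The log-once-per-listener behaviour described in the entry above can be sketched in Python. This is only an illustration, not the kernel's C code: the `Listener` class, its `warned` flag, and `on_queue_full` are hypothetical names, and only the two counter names and the message text come from the commit message.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Listener:
    port: int
    warned: bool = False  # hypothetical flag: flood message already logged


def on_queue_full(listener, syncookies_enabled, stats, log):
    """Account for one SYN arriving while the request queue is full."""
    if syncookies_enabled:
        stats["TCPReqQFullDoCookies"] += 1  # a SYN cookie is sent back
    else:
        stats["TCPReqQFullDrop"] += 1  # the SYN request is dropped
    if not listener.warned:
        # Log only the first time for this listener; later events
        # are visible through the counters instead of the log.
        listener.warned = True
        log.append(f"Possible SYN flooding on port {listener.port}")


stats, log = Counter(), []
web = Listener(port=80)
for _ in range(1000):
    on_queue_full(web, True, stats, log)
print(len(log), stats["TCPReqQFullDoCookies"])
```

However many SYNs arrive, the log receives a single line per listener, while the counters keep the full tally.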
     

15 Sep, 2011

2 commits

  • Building a kernel with hotplug disabled results in a link failure:

    `bgpio_remove' referenced in section `___ksymtab_gpl+bgpio_remove' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o

    This is because bgpio_remove() is exported. It is illegal to export
    symbols which are discarded either at link time or as part of an
    init/exit section.

    Fix this by dropping the __devexit attribute from bgpio_remove().
    Also drop the __devinit attribute from bgpio_init().

    Signed-off-by: Russell King
    Cc: Grant Likely
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Russell King
     
  • Revert the post-3.0 commit 82f9d486e59f5 ("memcg: add
    memory.vmscan_stat").

    The implementation of per-memcg reclaim statistics violates how memcg
    hierarchies usually behave: hierarchically.

    The reclaim statistics are accounted to child memcgs and the parent
    hitting the limit, but not to hierarchy levels in between. Usually,
    hierarchical statistics are perfectly recursive, with each level
    representing the sum of itself and all its children.

    Since this exports statistics to userspace, this may lead to confusion
    and problems with changing things after the release, so revert it now;
    we can try again later.

    Signed-off-by: Johannes Weiner
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Daisuke Nishimura
    Cc: Michal Hocko
    Cc: Ying Han
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
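
The "perfectly recursive" property mentioned in the entry above, where each level reports the sum of itself and all its children, can be sketched as follows. The `Group` class and its fields are hypothetical and stand in for a memcg hierarchy only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    local: int = 0  # events accounted directly to this group
    children: list = field(default_factory=list)

    def hierarchical(self):
        """Own count plus the hierarchical count of every child."""
        return self.local + sum(c.hierarchical() for c in self.children)


leaf = Group("leaf", local=5)
mid = Group("mid", local=2, children=[leaf])
root = Group("root", local=1, children=[mid])
for g in (root, mid, leaf):
    print(g.name, g.hierarchical())
```

In this sketch the intermediate level `mid` includes the events of `leaf`, which is the behaviour the reverted statistics did not provide for levels between the child and the parent that hit its limit.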
     

09 Sep, 2011

2 commits


08 Sep, 2011

1 commit


06 Sep, 2011

3 commits


31 Aug, 2011

1 commit


30 Aug, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (42 commits)
    netpoll: fix incorrect access to skb data in __netpoll_rx
    cassini: init before use in cas_interruptN.
    can: ti_hecc: Fix uninitialized spinlock in probe
    can: ti_hecc: Fix unintialized variable
    net: sh_eth: fix the compile error
    net/phy: fix DP83865 phy interrupt handler
    sendmmsg/sendmsg: fix unsafe user pointer access
    ibmveth: Fix leak when recycling skb and hypervisor returns error
    arp: fix rcu lockdep splat in arp_process()
    bridge: fix a possible use after free
    bridge: Pseudo-header required for the checksum of ICMPv6
    mcast: Fix source address selection for multicast listener report
    MAINTAINERS: Update GIT trees for network development
    ath9k: Fix PS wrappers in ath9k_set_coverage_class
    carl9170: Fix mismatch in carl9170_op_set_key mutex lock-unlock
    wl12xx: add max_sched_scan_ssids value to the hw description
    wl12xx: Fix validation of pm_runtime_get_sync return value
    wl12xx: Remove obsolete testmode NVS push command
    bcma: add uevent to the bus, to autoload drivers
    ath9k_hw: Fix STA (AR9485) bringup issue due to incorrect MAC address
    ...

    Linus Torvalds
     

29 Aug, 2011

1 commit

  • The current cgroup context switch code was incorrect leading
    to bogus counts. Furthermore, as soon as there was an active
    cgroup event on a CPU, the context switch cost on that CPU
    would increase by a significant amount as demonstrated by a
    simple ping/pong example:

    $ ./pong
    Both processes pinned to CPU1, running for 10s
    10684.51 ctxsw/s

    Now start a cgroup perf stat:
    $ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 100

    $ ./pong
    Both processes pinned to CPU1, running for 10s
    6674.61 ctxsw/s

    That's a 37% penalty.

    Note that pong is not even in the monitored cgroup.

    The results shown by perf stat are bogus:
    $ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 100

    Performance counter stats for 'sleep 100':

    CPU1 cycles test
    CPU1 16,984,189,138 cycles # 0.000 GHz

    The second 'cycles' event should report a count @ CPU clock
    (here 2.4GHz) as it is counting across all cgroups.

    The patch below fixes the bogus accounting and bypasses any
    cgroup switches in case the outgoing and incoming tasks are
    in the same cgroup.

    With this patch the same test now yields:
    $ ./pong
    Both processes pinned to CPU1, running for 10s
    10775.30 ctxsw/s

    Start perf stat with cgroup:

    $ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 10

    Run pong outside the cgroup:
    $ ./pong
    Both processes pinned to CPU1, running for 10s
    10687.80 ctxsw/s

    The penalty is now less than 2%.

    And the results for perf stat are correct:

    $ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 10

    Performance counter stats for 'sleep 10':

    CPU1 cycles test # 0.000 GHz
    CPU1 23,933,981,448 cycles # 0.000 GHz

    Now perf stat reports the correct counts for
    the non-cgroup event.

    If we run pong inside the cgroup, then we also get the
    correct counts:

    $ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 10

    Performance counter stats for 'sleep 10':

    CPU1 22,297,726,205 cycles test # 0.000 GHz
    CPU1 23,933,981,448 cycles # 0.000 GHz

    10.001457237 seconds time elapsed

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20110825135803.GA4697@quad
    Signed-off-by: Ingo Molnar

    Stephane Eranian
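
The penalty figures quoted in the entry above follow from the context-switch rates it lists. A small sketch of the arithmetic (the function name `ctxsw_penalty` is hypothetical; the four rates are copied from the commit message):

```python
def ctxsw_penalty(baseline, measured):
    """Percentage drop in context switches per second against a baseline."""
    return (baseline - measured) / baseline * 100.0


# Rates without and with an active cgroup event, before the patch.
before = ctxsw_penalty(10684.51, 6674.61)
# The same comparison with the patch applied.
after = ctxsw_penalty(10775.30, 10687.80)

print(f"before patch: {before:.1f}%")
print(f"after patch:  {after:.1f}%")
```

The first value is consistent with the 37% penalty stated in the message, and the second stays well under the 2% bound it reports.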
     

27 Aug, 2011

3 commits

  • The nfsservctl system call is now gone, so we should remove all
    linkage for it.

    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • * 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
    omap-serial: Allow IXON and IXOFF to be disabled.
    TTY: serial, document ignoring of uart->ops->startup error
    TTY: pty, fix pty counting
    8250: Fix race condition in serial8250_backup_timeout().
    serial/8250_pci: delete duplicate data definition
    8250_pci: add support for Rosewill RC-305 4x serial port card
    tty: Add "spi:" prefix for spi modalias
    atmel_serial: fix atmel_default_console_device
    serial: 8250_pnp: add Intermec CV60 touchscreen device
    drivers/serial/ucc_uart.c: Fix compiler warning
    pch_uart: Set PCIe bus number using probe parameter
    serial: samsung: Fix build error

    Linus Torvalds
     
  • …t/gregkh/driver-core-2.6

    * 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
    drivers:misc: ti-st: fix unexpected UART close
    drivers:misc: ti-st: free skb on firmware download
    drivers:misc: ti-st: wait for completion at fail
    drivers:misc: ti-st: reinit completion before send
    drivers:misc: ti-st: fail-safe on wrong pkt type
    drivers:misc: ti-st: reinit completion on ver read
    drivers:misc:ti-st: platform hooks for chip states
    drivers:misc: ti-st: avoid a misleading dbg msg
    base/devres.c: quiet sparse noise about context imbalance
    pti: add missing CONFIG_PCI dependency
    drivers/base/devtmpfs.c: correct annotation of `setup_done'
    driver core: fix kernel-doc warning in platform.c
    firmware: fix google/gsmi.c build warning

    Linus Torvalds
     

26 Aug, 2011

8 commits

  • …wireless into for-davem

    John W. Linville
     
  • We need a callback to do some things after pwm_enable, pwm_disable
    and pwm_config.

    Signed-off-by: Dilan Lee
    Reviewed-by: Robert Morell
    Reviewed-by: Arun Murthy
    Cc: Richard Purdie
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dilan Lee
     
  • Replace/remove use of RIO v.1.2 registers/bits that are not
    forward-compatible with newer versions of RapidIO specification.

    RapidIO specification v.1.3 removed Write Port CSR, Doorbell CSR,
    Mailbox CSR and Mailbox and Doorbell bits of the PEF CAR.

    Use of removed (since RIO v.1.3) register bits affects users of
    currently available 1.3 and 2.x compliant devices who may use not so
    recent kernel versions.

    Removing checks for unsupported bits makes corresponding routines
    compatible with all versions of RapidIO specification. Therefore,
    backporting makes stable kernel versions compliant with RIO v.1.3 and
    later as well.

    Signed-off-by: Alexandre Bounine
    Cc: Kumar Gala
    Cc: Matt Porter
    Cc: Li Yang
    Cc: Thomas Moll
    Cc: Chul Kim
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexandre Bounine
     
  • Signed-off-by: Evgeniy Polyakov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Evgeniy Polyakov
     
  • Purely in-memory filesystems do not use the inode hash as the dcache
    tells us if an entry already exists. As a result, they do not call
    unlock_new_inode, and thus directory inodes do not get put into a
    different lockdep class for i_sem.

    We need the different lockdep classes, because the locking order for
    i_mutex is different for directory inodes and regular inodes. Directory
    inodes can do "readdir()", which takes i_mutex *before* possibly taking
    mm->mmap_sem (due to a page fault while copying the directory entry to
    user space).

    In contrast, regular inodes can be mmap'ed, which takes mm->mmap_sem
    before accessing i_mutex.

    The two cases can never happen for the same inode, so no real deadlock
    can occur, but without the different lockdep classes, lockdep cannot
    understand that. As a result, if CONFIG_DEBUG_LOCK_ALLOC is set, this
    can lead to false positives from lockdep like below:

    find/645 is trying to acquire lock:
    (&mm->mmap_sem){++++++}, at: might_fault+0x5c/0xac

    but task is already holding lock:
    (&sb->s_type->i_mutex_key#15){+.+.+.}, at: vfs_readdir+0x5b/0xb4

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&sb->s_type->i_mutex_key#15){+.+.+.}:
    lock_acquire+0xbf/0x103
    __mutex_lock_common+0x4c/0x361
    mutex_lock_nested+0x40/0x45
    hugetlbfs_file_mmap+0x82/0x110
    mmap_region+0x258/0x432
    do_mmap_pgoff+0x2ac/0x306
    sys_mmap_pgoff+0x118/0x16a
    sys_mmap+0x22/0x24
    system_call_fastpath+0x16/0x1b

    -> #0 (&mm->mmap_sem){++++++}:
    __lock_acquire+0xa1a/0xcf7
    lock_acquire+0xbf/0x103
    might_fault+0x89/0xac
    filldir+0x6f/0xc7
    dcache_readdir+0x67/0x205
    vfs_readdir+0x7b/0xb4
    sys_getdents+0x7e/0xd1
    system_call_fastpath+0x16/0x1b

    This patch moves the directory vs file lockdep annotation into a helper
    function that can be called by in-memory filesystems and has hugetlbfs
    call it.

    Signed-off-by: Josh Boyer
    Acked-by: Peter Zijlstra
    Signed-off-by: Linus Torvalds

    Josh Boyer
     
  • * 'urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback:
    squeeze max-pause area and drop pass-good area

    Linus Torvalds
     
  • * '3.1-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (21 commits)
    target: Convert acl_node_lock to be IRQ-disabling
    target: Make locking in transport_deregister_session() IRQ safe
    tcm_fc: init/exit functions should not be protected by "#ifdef MODULE"
    target: Print subpage too for unhandled MODE SENSE pages
    iscsi-target: Fix iscsit_allocate_se_cmd_for_tmr failure path bugs
    iscsi-target: Implement iSCSI target IPv6 address printing.
    target: Fix task SGL chaining breakage with transport_allocate_data_tasks
    target: Fix task count > 1 handling breakage and use max_sector page alignment
    target: Add missing DATA_SG_IO transport_cmd_get_valid_sectors check
    target: Fix SYNCHRONIZE_CACHE zero LBA + range breakage
    target: Remove duplicate task completions in transport_emulate_control_cdb
    target: Fix WRITE_SAME usage with transport_get_size
    target: Add WRITE_SAME (10) parsing and refactor passthrough checks
    target: Fix write payload exception handling with ->new_cmd_map
    iscsi-target: forever loop bug in iscsit_attach_ooo_cmdsn()
    iscsi-target: remove duplicate return
    target: Convert target_core_rd.c to use use BUG_ON
    iscsi-target: Fix leak on failure in iscsi_copy_param_list()
    target: Use ERR_CAST inlined function
    target: Make standard INQUIRY return 'not connected' for tpg_virt_lun0
    ...

    Linus Torvalds
     
  • I ran into a couple of programs which broke with the new Linux 3.0
    version. Some of those were binary only. I tried to use LD_PRELOAD to
    work around it, but it was quite difficult and in one case impossible
    because of a mix of 32bit and 64bit executables.

    For example, all kinds of management software from HP do not work unless
    we pretend to run a 2.6 kernel.

    $ uname -a
    Linux svivoipvnx001 3.0.0-08107-g97cd98f #1062 SMP Fri Aug 12 18:11:45 CEST 2011 i686 i686 i386 GNU/Linux

    $ hpacucli ctrl all show

    Error: No controllers detected.

    $ rpm -qf /usr/sbin/hpacucli
    hpacucli-8.75-12.0

    Another notable case is that Python now reports "linux3" from
    sys.platform, which in turn can break things that were checking
    sys.platform == "linux2":

    https://bugzilla.mozilla.org/show_bug.cgi?id=664564

    It seems pretty clear to me though it's a bug in the apps that are using
    '==' instead of .startswith(), but this allows us to unbreak broken
    programs.

    This patch adds a UNAME26 personality that makes the kernel report a
    2.6.40+x version number instead. The x is the x in 3.x.

    I know this is somewhat ugly, but I didn't find a better workaround, and
    compatibility to existing programs is important.

    Some programs also read /proc/sys/kernel/osrelease. This can be worked
    around in user space with mount --bind (and a mount namespace).

    To use:

    wget ftp://ftp.kernel.org/pub/linux/kernel/people/ak/uname26/uname26.c
    gcc -o uname26 uname26.c
    ./uname26 program

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
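
Two points from the entry above can be sketched in Python: the prefix check that the message recommends over an exact comparison, and the version mapping the UNAME26 personality is described as applying. Both function names are hypothetical. `uname26_version` only illustrates the stated 2.6.(40+x) arithmetic; it returns the mapped version alone, makes no claim about how the kernel handles the rest of the release string, and leaves non-3.x input unchanged as an assumption of this sketch.

```python
import re


def is_linux(platform):
    """Accept "linux2", "linux3", and any other "linux..." value."""
    return platform.startswith("linux")


def uname26_version(release):
    """Map a 3.x release to the 2.6.(40+x) form described in the message."""
    match = re.match(r"3\.(\d+)", release)
    if match is None:
        return release  # assumption: anything that is not 3.x is untouched
    return f"2.6.{40 + int(match.group(1))}"


print(is_linux("linux3"), "linux3" == "linux2")
print(uname26_version("3.0.0-08107-g97cd98f"))
```

A program written with the prefix check keeps working across the 3.0 version bump, whereas the equality test is what the personality flag has to work around.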
     

25 Aug, 2011

1 commit


24 Aug, 2011

2 commits

  • Clean up the code a little bit. attempt_plug_merge() traverses the plug
    list anyway, so we can do the request counting there, which reduces stack
    size a little bit.
    The motivation is that I suspect we should count the requests for each
    queue (a task could handle multiple disks in the meantime), but my tests don't
    show it's worth doing. If somebody proves we should do it, the change below
    will make that easier.

    Signed-off-by: Shaohua Li
    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe

    Shaohua Li
     
  • tty_operations->remove is normally called like:
    queue_release_one_tty
    ->tty_shutdown
    ->tty_driver_remove_tty
    ->tty_operations->remove

    However tty_shutdown() is called from queue_release_one_tty() only if
    tty_operations->shutdown is NULL. But for pty, it is not.
    pty_unix98_shutdown() is used there as ->shutdown.

    So tty_operations->remove of pty (i.e. pty_unix98_remove()) is never
    called. This results in an invalid pty_count, i.e. the value that can be
    seen in /proc/sys/kernel/pty/nr.

    I see this was already reported at:
    https://lkml.org/lkml/2009/11/5/370
    But it was not fixed since then.

    This patch is a somewhat hackish fix. The problem lies in ->install. We
    allocate there another tty (so-called tty->link). So ->install is
    called once, but ->remove twice, for both tty and tty->link. The fix
    here is to count both tty and tty->link and divide the count by 2 for
    the user.

    And to have ->remove called, let's make tty_driver_remove_tty() global
    and call that from pty_unix98_shutdown() (tty_operations->shutdown).

    While at it, let's document that when ->shutdown is defined,
    tty_shutdown() is not called.

    Signed-off-by: Jiri Slaby
    Cc: Alan Cox
    Cc: "H. Peter Anvin"
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Jiri Slaby
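
The counting scheme described in the entry above (count both tty and tty->link, then halve the value shown to the user) can be sketched as follows. `PtyCounter` and its method names are hypothetical and only model the bookkeeping; they are not the kernel's code.

```python
class PtyCounter:
    """Hypothetical model of the pty_count bookkeeping described above."""

    def __init__(self):
        self.raw = 0  # counts tty and tty->link as separate entries

    def install(self):
        # ->install runs once per pair but sets up both ends,
        # so both are counted here.
        self.raw += 2

    def remove(self):
        # ->remove runs once for each end of the pair.
        self.raw -= 1

    def user_visible(self):
        # Value reported to user space: pairs rather than ends.
        return self.raw // 2


counter = PtyCounter()
counter.install()
counter.remove()  # tty
counter.remove()  # tty->link
print(counter.user_visible())
```

Counting both ends keeps the single ->install balanced against the two ->remove calls, so the counter returns to zero once a pair is fully released.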
     

23 Aug, 2011

4 commits

  • Add a new REQ_PRIO to let requests preempt others in the cfq I/O scheduler,
    and leave REQ_META purely for marking requests as metadata in blktrace.

    All existing callers of REQ_META except for XFS are updated to also
    set REQ_PRIO for now.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Namhyung Kim
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Replace all occurrences of the undocumented READ_META with READ | REQ_META
    and remove the unused WRITE_META define.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Certain platform specific or Host-WiLink Interface specific actions would be
    required to be taken when the chip is being enabled and after the chip is
    disabled such as configuration of the mux modes for the GPIO of host connected
    to the nshutdown of the chip or relinquishing UART after the chip is disabled.

    Similar actions can also be taken when the chip is in deep sleep or when the
    chip is awake. Performance enhancements such as configuring the host to run
    faster when chip is awake and slower when chip is asleep can also be made
    here.

    Signed-off-by: Pavan Savoy
    Signed-off-by: Greg Kroah-Hartman

    Pavan Savoy
     
  • This patch changes target_emulate_inquiry_std() to set the 'not connected'
    (0x35) bit in standard INQUIRY response data when we are processing a
    request to a virtual LUN=0 mapping from struct se_device *g_lun0_dev that
    has been set up for us in transport_lookup_cmd_lun().

    This addresses an issue where qla2xxx FC clients need to be able
    to create demo-mode I_T FC Nexuses by default, but should not be
    exposing the default set of TPG LUNs to all FC clients. This includes
    adding a new optional target_core_fabric_ops->tpg_check_demo_mode_login_only()
    caller to allow demo_mode nexuses to skip the old default of building
    a demo-mode MappedLUNs list via core_tpg_add_node_to_devs().

    (roland: Add missing tpg_check_demo_mode_login_only check in core_dev_add_lun)

    Reported-by: Roland Dreier
    Cc: Andrew Vasquez
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger