11 Nov, 2010

1 commit

  • This patch brings the Ack Vector interface up to date. Its main purpose is
    to lay the basis for the subsequent patches of this set, which will use the
    new data structure fields and routines.

    There are no real algorithmic changes, rather an adaptation:

    (1) Replaced the static Ack Vector size (2) with a #define so that it can
    be adapted (with low loss / Ack Ratio, a value of 1 works, so 2 seems
    to be sufficient for the moment) and added a solution so that computing
    the ECN nonce will continue to work - even with larger Ack Vectors.

    (2) Replaced the #defines for Ack Vector states with a complete enum.

    (3) Replaced #defines to compute Ack Vector length and state with general
    purpose routines (inlines), and updated code to use these.

    (4) Added a `tail' field (conversion to circular buffer in subsequent patch).

    (5) Updated the (outdated) documentation for Ack Vector struct.

    (6) All sequence number containers now trimmed to 48 bits.

    (7) Removal of unused bits:
    * removed dccpav_ack_nonce from struct dccp_ackvec, since this is already
    redundantly stored in the `dccpavr_ack_nonce' (of Ack Vector record);
    * removed Elapsed Time for Ack Vectors (it was not used anywhere);
    * replaced semantics of dccpavr_sent_len with dccpavr_ack_runlen, since
    the code needs to be able to remember the old run length;
    * reduced the de-/allocation routines (redundant / duplicate tests).

    Signed-off-by: Gerrit Renker

    Gerrit Renker
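
    Items (2) and (3) above can be pictured with a minimal user-space sketch.
    It assumes the Ack Vector cell layout of RFC 4340 (state in the top two
    bits, run length in the low six). The identifiers ackvec_state_t,
    AV_* and ackvec_*() are illustrative and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Cell states as an enum instead of loose #defines (illustrative names).
 * The values occupy the top two bits of an Ack Vector cell. */
typedef enum {
    AV_RECEIVED     = 0x00,
    AV_ECN_MARKED   = 0x40,
    AV_RESERVED     = 0x80,
    AV_NOT_RECEIVED = 0xC0
} ackvec_state_t;

/* The low six bits of a cell hold the run length. */
#define AV_MAX_RUNLEN 0x3F

/* Extract the run length of a cell. */
static inline uint8_t ackvec_runlen(const uint8_t *cell)
{
    return *cell & AV_MAX_RUNLEN;
}

/* Extract the state of a cell (still shifted into the top two bits). */
static inline uint8_t ackvec_state(const uint8_t *cell)
{
    return *cell & ~AV_MAX_RUNLEN;
}

/* Build a cell from a state and a run length (helper for the sketch only). */
static inline uint8_t ackvec_make_cell(ackvec_state_t state, uint8_t runlen)
{
    assert(runlen <= AV_MAX_RUNLEN);
    return (uint8_t)(state | runlen);
}
```

    Inline helpers of this kind are type-checked by the compiler, which is
    the usual reason to prefer them over function-like macros.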
     

09 Nov, 2010

28 commits

  • unix_dgram_poll() is pretty expensive to check POLLOUT status, because
    it has to lock the socket to get its peer, take a reference on the peer
    to check its receive queue status, and queue another poll_wait on
    peer_wait. This all can be avoided if the process calling
    unix_dgram_poll() is not interested in POLLOUT status. It makes
    unix_dgram_recvmsg() faster by not queueing irrelevant pollers in
    peer_wait.

    On a test program provided by Alban Crequy:

    Before:

    real 0m0.211s
    user 0m0.000s
    sys 0m0.208s

    After:

    real 0m0.044s
    user 0m0.000s
    sys 0m0.040s

    Suggested-by: Davide Libenzi
    Reported-by: Alban Crequy
    Acked-by: Davide Libenzi
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
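
    The idea of skipping the peer check when the caller is not interested
    in POLLOUT can be sketched in user space as follows. This is a model
    only: struct model_sock, model_dgram_poll() and the peer_checks counter
    are hypothetical and stand in for the kernel's socket, poll routine and
    peer locking.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>

/* Hypothetical model of a connected datagram socket. */
struct model_sock {
    bool readable;         /* data waiting in our own receive queue */
    bool peer_queue_full;  /* the peer cannot accept more datagrams */
    int  peer_checks;      /* how often the expensive path was taken */
};

/* Return the ready events; 'wanted' is the set the caller asked about. */
static short model_dgram_poll(struct model_sock *sk, short wanted)
{
    short mask = 0;

    assert(sk);
    if (sk->readable)
        mask |= POLLIN;

    /* Caller does not care about writability: skip the peer entirely. */
    if (!(wanted & POLLOUT))
        return mask;

    /* Stands in for locking the socket and inspecting the peer's queue. */
    sk->peer_checks++;
    if (!sk->peer_queue_full)
        mask |= POLLOUT;
    return mask;
}
```

    A reader that polls only for POLLIN never reaches the peer check in
    this model, which mirrors the saving described in the commit message.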
     
  • Alban Crequy reported a problem with connected dgram af_unix sockets and
    provided a test program. epoll() would fail to deliver an EPOLLOUT event
    when a thread dequeues a packet from the other peer, so that the peer's
    receive queue is no longer full.

    This is because unix_dgram_poll() fails to call sock_poll_wait(file,
    &unix_sk(other)->peer_wait, wait);
    if the socket is not writeable at the time epoll_ctl(ADD) is called.

    We must call sock_poll_wait(), regardless of 'writable' status, so that
    epoll can be notified later of state changes.

    Misc: avoids testing twice (sk->sk_shutdown & RCV_SHUTDOWN)

    Reported-by: Alban Crequy
    Cc: Davide Libenzi
    Signed-off-by: Eric Dumazet
    Acked-by: Davide Libenzi
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Instead of waking up all sleepers, use wake_up_interruptible_sync_poll() to
    wake up only those interested in writing to the socket.

    This patch is a specialization of commit 37e5540b3c9d (epoll keyed
    wakeups: make sockets use keyed wakeups).

    On a test program provided by Alban Crequy:

    Before:
    real 0m3.101s
    user 0m0.000s
    sys 0m6.104s

    After:

    real 0m0.211s
    user 0m0.000s
    sys 0m0.208s

    Reported-by: Alban Crequy
    Signed-off-by: Eric Dumazet
    Cc: Davide Libenzi
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • While tracking dev_base_lock users, I found decnet used it in
    dnet_select_source(), but for a wrong purpose:

    Writers only hold RTNL, not dev_base_lock, so readers must use RCU if
    they cannot use RTNL.

    Adds an rcu_head to struct dn_ifaddr and handles RCU management properly.

    Adds __rcu annotation in dn_route as well.

    Signed-off-by: Eric Dumazet
    Acked-by: Steven Whitehouse
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • bond_info_seq_start() uses a read_lock(&dev_base_lock) to make sure the
    device doesn't disappear. The same goal can be achieved using RCU.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • dev_base_lock is the legacy way to lock the device list, and is planned
    to disappear. (writers hold RTNL, readers hold RCU lock)

    Convert aoecmd_cfg_pkts() to RCU locking.

    Signed-off-by: Eric Dumazet
    Cc: "Ed L. Cashin"
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Add suspend/resume support using default open/stop interface methods
    to do hardware dependent operations.

    On suspend, same low power state (soft power mode) will be kept, the
    following blocks will be disabled:

    - Internal PLL Clock
    - Tx/Rx PHY
    - MAC
    - SPI Interface

    Signed-off-by: Abraham Arce
    Signed-off-by: David S. Miller

    Arce, Abraham
     
  • David S. Miller
     
  • The sgs allocation error path leaks the allocated message.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Andy Grover
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • QDIO runs independently of the netdevice state. We are not
    allowed to schedule NAPI in case the netdevice is not open.

    Signed-off-by: Frank Blaschka
    Signed-off-by: David S. Miller

    Frank Blaschka
     
  • For a certain Hipersockets specific error code in the xmit path, the
    qeth driver tries to invoke dev_queue_xmit again.
    Commit 79640a4ca6955e3ebdb7038508fa7a0cd7fa5527 introduces a busylock
    causing locking problems in case of re-invoked dev_queue_xmit by qeth.
    This patch removes the attempts to retry packet sending with
    dev_queue_xmit from the qeth driver.

    Signed-off-by: Ursula Braun
    Signed-off-by: Frank Blaschka
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • This fixes a bug reported by backyes.
    The very first time pktgen uses queue_map, it has not yet been
    initialized by set_cur_queue_map(pkt_dev).

    Signed-off-by: Junchang Wang
    Signed-off-by: Backyes
    Signed-off-by: David S. Miller

    Junchang Wang
     
  • After e6484930d7c73d324bccda7d43d131088da697b9: net: allocate tx queues in register_netdevice
    These calls make net drivers oops at load time, so let's avoid people
    git-bisect'ing known problems.

    Signed-off-by: Guillaume Chazarain
    Signed-off-by: David S. Miller

    Guillaume Chazarain
     
  • After e6484930d7c73d324bccda7d43d131088da697b9: net: allocate tx queues in register_netdevice
    It causes an Oops at skge_probe() time.

    Signed-off-by: Guillaume Chazarain
    Signed-off-by: David S. Miller

    Guillaume Chazarain
     
  • The type of FRAG6_CB(prev)->offset is int, skb->len is *unsigned* int,
    and offset is int.

    Without this patch, the expression combining them is evaluated as
    unsigned int, so it gives the wrong result when
    (FRAG6_CB(prev)->offset + prev->len) is less than offset.

    Signed-off-by: Shan Wei
    Signed-off-by: David S. Miller

    Shan Wei
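
    The pitfall can be reproduced in plain C. The sketch below assumes the
    check has the shape "(prev_offset + prev_len) - offset > 0"; that
    shape, the function names, and the signed variant shown as a remedy are
    illustrative and not quoted from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Mixed form: int + unsigned int is evaluated as unsigned int, so the
 * difference can never be negative and "> 0" degenerates to "!= 0". */
static bool overlaps_mixed(int prev_offset, unsigned int prev_len, int offset)
{
    return (prev_offset + prev_len) - offset > 0;
}

/* Signed form: convert the length first, then do the arithmetic in int. */
static bool overlaps_signed(int prev_offset, unsigned int prev_len, int offset)
{
    assert(prev_len <= INT_MAX);
    int end = prev_offset + (int)prev_len;
    return end - offset > 0;
}
```

    With prev_offset = 0, prev_len = 100 and offset = 200 the previous
    fragment ends well before the new one starts, yet the mixed form
    reports an overlap because the unsigned subtraction wraps around.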
     
  • The basic classifier keeps statistics but does not report them to user space.
    This showed up when using basic classifier (with police) as a default catch
    all on ingress; no statistics were reported.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • Signed-off-by: David Woodhouse
    Signed-off-by: David S. Miller

    David Woodhouse
     
  • The existing 'FirmwareVersion' attribute only covers the DSP firmware as
    provided by Conexant; not the overall version of the device firmware. We
    do want to be able to see the full version number too.

    Signed-off-by: David Woodhouse
    Signed-off-by: David S. Miller

    David Woodhouse
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: Add new ext4 inode tracepoints
    ext4: Don't call sb_issue_discard() in ext4_free_blocks()
    ext4: do not try to grab the s_umount semaphore in ext4_quota_off
    ext4: fix potential race when freeing ext4_io_page structures
    ext4: handle writeback of inodes which are being freed
    ext4: initialize the percpu counters before replaying the journal
    ext4: "ret" may be used uninitialized in ext4_lazyinit_thread()
    ext4: fix lazyinit hang after removing request

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
    TTY: move .gitignore from drivers/char/ to drivers/tty/vt/
    TTY: create drivers/tty/vt and move the vt code there
    TTY: create drivers/tty and move the tty core files there

    Linus Torvalds
     
  • …egkh/staging-next-2.6

    * 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-next-2.6:
    Staging: ath6kl: remove empty files that mess with 'distclean'
    staging: ath6kl: Fixing the driver to use modified mmc_host structure
    Staging: solo6x10: fix build problem

    Linus Torvalds
     
  • …nel/git/lethal/sh-2.6

    * 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
    mmc: sh_mmcif: Convert extern inline to static inline.
    ARM: mach-shmobile: Allow GPIO chips to register IRQ mappings.
    ARM: mach-shmobile: fix sh7372 after a recent clock framework rework
    ARM: mach-shmobile: include drivers/sh/Kconfig
    ARM: mach-shmobile: ap4evb: Add HDMI sound support
    ARM: mach-shmobile: clock-sh7372: Add FSIDIV clock support
    ARM: shmobile: remove sh_timer_config clk member

    Linus Torvalds
     
  • * 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
    sh: clkfwk: Fix up checkpatch warnings.
    sh: make some needlessly global sh7724 clocks static
    sh: add clk_round_parent() to optimize parent clock rate
    sh: Simplify phys_addr_mask()/PTE_PHYS_MASK for 29/32-bit.
    sh: nommu: Support building without an uncached mapping.
    sh: nommu: use 32-bit phys mode.
    sh: mach-se: Fix up SE7206 no ioport build.
    sh: intc: Update for single IRQ reservation helper.
    sh: clkfwk: Fix up rate rounding error handling.
    sh: mach-se: Rip out superfluous 7751 PIO routines.
    sh: mach-se: Rip out superfluous 770x PIO routines.
    sh: mach-edosk7705: Kill off machtype, consolidate board def.
    sh: mach-edosk7705: update for this century, kill off PIO trapping.
    sh: mach-se: Rip out superfluous 7206 PIO routines.
    sh: mach-systemh: Kill off dead board.
    sh: mach-snapgear: Kill off machtype, consolidate board def.
    sh: mach-snapgear: Rip out superfluous PIO routines.
    sh: mach-microdev: SuperIO-relative ioport mapping.

    Linus Torvalds
     
  • Add ext4_evict_inode, ext4_drop_inode, ext4_mark_inode_dirty, and
    ext4_begin_ordered_truncate()

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Commit 5c521830cf (ext4: Support discard requests when running in
    no-journal mode) attempts to add sb_issue_discard() for data blocks
    (in data=writeback mode) and in no-journal mode. Unfortunately, this
    no longer works, because in commit dd3932eddf (block: remove
    BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous
    interface, and there are times when we call ext4_free_blocks() when we
    are holding a spinlock, or are otherwise in an atomic context.

    For now, I've removed the call to sb_issue_discard() to prevent a
    deadlock or (if spinlock debugging is enabled) failures like this:

    BUG: scheduling while atomic: rc.sysinit/1376/0x00000002
    Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1
    Call Trace:
    [] __schedule_bug+0x5e/0x70
    [] schedule+0x950/0xa70
    [] ? insert_work+0x7d/0x90
    [] ? queue_work_on+0x1d/0x30
    [] ? queue_work+0x37/0x60
    [] schedule_timeout+0x21d/0x360
    [] ? generic_make_request+0x2c3/0x540
    [] wait_for_common+0xc0/0x150
    [] ? default_wake_function+0x0/0x10
    [] ? submit_bio+0x7c/0x100
    [] ? wake_bit_function+0x0/0x40
    [] wait_for_completion+0x18/0x20
    [] blkdev_issue_discard+0x1b9/0x210
    [] ext4_free_blocks+0x68e/0xb60
    [] ? __ext4_handle_dirty_metadata+0x110/0x120
    [] ext4_ext_truncate+0x8cc/0xa70
    [] ? pagevec_lookup+0x1e/0x30
    [] ext4_truncate+0x178/0x5d0
    [] ? unmap_mapping_range+0xab/0x280
    [] vmtruncate+0x56/0x70
    [] ext4_setattr+0x14b/0x460
    [] notify_change+0x194/0x380
    [] do_truncate+0x60/0x90
    [] ? security_inode_permission+0x1a/0x20
    [] ? tomoyo_path_truncate+0x11/0x20
    [] do_last+0x5d9/0x770
    [] do_filp_open+0x1ed/0x680
    [] ? page_fault+0x1f/0x30
    [] ? alloc_fd+0xec/0x140
    [] do_sys_open+0x61/0x120
    [] sys_open+0x1b/0x20
    [] system_call_fastpath+0x16/0x1b

    https://bugzilla.kernel.org/show_bug.cgi?id=22302

    Reported-by: Mathias Burén
    Signed-off-by: "Theodore Ts'o"
    Cc: jiayingz@google.com

    Theodore Ts'o
     
  • It's not necessary to sync the filesystem, and this fixes a lockdep complaint.

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara

    Dmitry Monakhov
     
  • Use an atomic_t and make sure we don't free the structure while we
    might still be submitting I/O for that page.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • The following BUG can occur when an inode is being freed while it still
    has dirty pages outstanding, and it gets deleted (in this case because
    it was the target of a rename). In ordered mode, we need to
    make sure the data pages are written just in case we crash before the
    rename (or unlink) is committed. If the inode is being freed then
    when we try to igrab the inode, we end up tripping the BUG_ON at
    fs/ext4/page-io.c:146.

    To solve this problem, we need to keep track of the number of io
    callbacks which are pending, and avoid destroying the inode until they
    have all been completed. That way we don't have to bump the inode
    count to keep the inode from being destroyed; an approach which
    doesn't work because the count could have already been dropped down to
    zero before the inode writeback has started (at which point we're not
    allowed to bump the count back up to 1, since it's already started
    getting freed).

    Thanks to Dave Chinner for suggesting this approach, which is also
    used by XFS.

    kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146!
    Call Trace:
    [] ext4_bio_write_page+0x172/0x307
    [] mpage_da_submit_io+0x2f9/0x37b
    [] mpage_da_map_and_submit+0x2cc/0x2e2
    [] mpage_add_bh_to_extent+0xc6/0xd5
    [] write_cache_pages_da+0x2a4/0x3ac
    [] ext4_da_writepages+0x2d6/0x44d
    [] do_writepages+0x1c/0x25
    [] __filemap_fdatawrite_range+0x4b/0x4d
    [] filemap_fdatawrite_range+0xe/0x10
    [] jbd2_journal_begin_ordered_truncate+0x7b/0xa2
    [] ext4_evict_inode+0x57/0x24c
    [] evict+0x22/0x92
    [] iput+0x212/0x249
    [] dentry_iput+0xa1/0xb9
    [] d_kill+0x3d/0x5d
    [] dput+0x13a/0x147
    [] sys_renameat+0x1b5/0x258
    [] ? _atomic_dec_and_lock+0x2d/0x4c
    [] ? cp_new_stat+0xde/0xea
    [] ? sys_newlstat+0x2d/0x38
    [] sys_rename+0x16/0x18
    [] system_call_fastpath+0x16/0x1b

    Reported-by: Nick Bowler
    Signed-off-by: "Theodore Ts'o"
    Tested-by: Nick Bowler

    Theodore Ts'o
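
    The approach of counting pending I/O callbacks instead of bumping the
    inode reference count can be sketched with C11 atomics. This is a
    user-space model under stated assumptions: struct model_inode and the
    io_begin(), io_end() and may_destroy() helpers are hypothetical names,
    and a real implementation would wait for the count to drain rather
    than just test it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical inode model: only the pending-callback counter matters. */
struct model_inode {
    atomic_int pending_io;
};

/* Called when an I/O completion callback is queued for the inode. */
static void io_begin(struct model_inode *inode)
{
    atomic_fetch_add(&inode->pending_io, 1);
}

/* Called when a completion callback has finished running. */
static void io_end(struct model_inode *inode)
{
    int old = atomic_fetch_sub(&inode->pending_io, 1);
    assert(old > 0);   /* an end without a matching begin is a bug */
    (void)old;         /* keeps builds with NDEBUG free of warnings */
}

/* The inode may be torn down only once no callbacks are outstanding. */
static bool may_destroy(struct model_inode *inode)
{
    return atomic_load(&inode->pending_io) == 0;
}
```

    Because the counter is separate from the reference count, it can be
    raised even after the reference count has already dropped to zero,
    which is the case the commit message describes.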
     

08 Nov, 2010

6 commits


07 Nov, 2010

2 commits


06 Nov, 2010

3 commits

  • While scanning the floppy code due to c093ee4f07f4 ("floppy: fix
    use-after-free in module load failure path"), I found one more instance
    of trying to access disk->queue pointer after doing put_disk() on
    gendisk. For some reason, the floppy module still loads/unloads fine. The
    object is probably still around with the right pointer values.

    o There seems to be one more instance of trying to cleanup the request
    queue after we have called put_disk() on associated gendisk.

    o This fix is more out of code inspection. Even without this fix for
    some reason I am able to load/unload floppy module without any
    issues.

    o Floppy module loads/unloads fine after the fix.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • The autogenerated files (consolemap_deftbl.c and defkeymap.c) need to
    be ignored by git, so move the .gitignore file that was doing it to the
    proper location now that the files have moved as well.

    Cc: Arnd Bergmann
    Cc: Jiri Slaby
    Cc: Alan Cox
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • Commit 27ae60f8f7aa ("ipw2x00: replace "ieee80211" with "libipw" where
    appropriate") changed DRV_NAME to be "libipw", but didn't properly fix
    up the places where it was used to specify the name for the /proc/net/
    directory.

    For backwards compatibility reasons, that directory name remained
    "ieee80211", but due to the DRV_NAME change, the error case printouts
    and the cleanup functions now used "libipw" instead. Which made it all
    fail badly.

    For example, on module unload as reported by Randy:

    WARNING: at fs/proc/generic.c:816 remove_proc_entry+0x156/0x35e()
    name 'libipw'

    because it's trying to unregister a /proc directory that obviously
    doesn't even exist.

    Clean it all up to use DRV_PROCNAME for the actual /proc directory name.

    Reported-and-tested-by: Randy Dunlap
    Cc: Pavel Roskin
    Cc: John W. Linville
    Signed-off-by: Linus Torvalds

    Linus Torvalds
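
    The name mismatch can be modelled in a few lines of user-space C. The
    two macro values come from the commit message; the one-slot registry
    and the model_proc_mkdir() and model_remove_proc_entry() helpers are
    hypothetical stand-ins for the /proc interface.

```c
#include <assert.h>
#include <string.h>

#define DRV_NAME     "libipw"     /* driver name after the rename */
#define DRV_PROCNAME "ieee80211"  /* /proc/net/ directory, kept for compatibility */

/* Name of the single directory that currently exists, or NULL. */
static const char *registered;

/* Model of creating the directory. */
static void model_proc_mkdir(const char *name)
{
    assert(name);
    registered = name;
}

/* Model of removing it: fails with -1 if no such directory exists. */
static int model_remove_proc_entry(const char *name)
{
    if (!registered || strcmp(registered, name) != 0)
        return -1;
    registered = NULL;
    return 0;
}
```

    Creating the directory under one name and removing it under the other
    fails in this model, which corresponds to the warning quoted above;
    using DRV_PROCNAME on both sides keeps the two in step.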