02 Apr, 2013

2 commits

  • Signed-off-by: Michal Kubecek
    Signed-off-by: Pablo Neira Ayuso

    Michal Kubeček
     
  • The current NFQUEUE target uses a hash, computed over source and
    destination address (and other parameters), for steering the packet
    to the actual NFQUEUE. This, however, ignores the fact that the
    packet is eventually handled by a particular CPU on user request.

    If, e.g.,

    1) IRQ affinity is used to handle packets on a particular CPU already
    (in either the single-queue or the multi-queue case)

    and/or

    2) RPS is used to steer packets to a specific softirq

    the target easily chooses an NFQUEUE which is not handled by a process
    pinned to the same CPU.

    The idea is therefore to use the CPU index for determining the
    NFQUEUE handling the packet.

    E.g., on a system with 4 CPUs, 4 MQ queues and 4 NFQUEUEs it
    looks like this:

    +-----+ +-----+ +-----+ +-----+
    |NFQ#0| |NFQ#1| |NFQ#2| |NFQ#3|
    +-----+ +-----+ +-----+ +-----+
       ^       ^       ^       ^
       |       |NFQUEUE|       |
       +       +       +       +
    +-----+ +-----+ +-----+ +-----+
    |rx-0 | |rx-1 | |rx-2 | |rx-3 |
    +-----+ +-----+ +-----+ +-----+

    The NFQUEUEs do not necessarily have to start with number 0, and setups
    with fewer NFQUEUEs than packet-handling CPUs are not a problem either.

    This patch extends the NFQUEUE target to accept a new
    NFQ_FLAG_CPU_FANOUT flag. If this is specified the target uses the
    CPU index for determining the NFQUEUE being used. I have to introduce
    rev3 for this. The 'flags' are folded into _v2 'bypass'.

    By changing the way the queue is assigned, I'm able to improve
    performance if the processes reading from the NFQUEUEs are pinned
    correctly.

    Signed-off-by: Holger Eitzenberger
    Signed-off-by: Pablo Neira Ayuso

    holger@eitzenberger.org
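
    The CPU-based selection described above can be sketched in a few lines.
    This is a minimal illustration, not the patch itself: the helper name and
    its parameters are hypothetical, and the modulo wrap is an assumption that
    matches the statement that fewer NFQUEUEs than CPUs are fine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map the index of the CPU handling the packet to a
 * queue number. first_queue is the lowest configured NFQUEUE number and
 * queues_total is how many consecutive queues are configured.
 */
static uint16_t nfq_pick_queue_by_cpu(uint16_t first_queue,
                                      uint16_t queues_total,
                                      unsigned int cpu)
{
    assert(queues_total > 0);
    /* Wrap around so that more CPUs than queues still map to a valid queue. */
    return (uint16_t)(first_queue + cpu % queues_total);
}
```

    With 4 queues starting at 0, CPU 2 maps to queue 2. With 2 queues starting
    at 8, CPU 3 wraps to queue 9, so a process pinned to that CPU should read
    from queue 9.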
     

28 Mar, 2013

1 commit

  • Add a new constant ETH_P_802_3_MIN, the minimum ethernet type for
    an 802.3 frame. Frames with a lower value in the ethernet type field
    are Ethernet II.

    Also update all the users of this value that David Miller and
    I could find to use the new constant.

    Also correct a bug in util.c. The comparison with ETH_P_802_3_MIN
    should be >= not >.

    As suggested by Jesse Gross.

    Compile tested only.

    Cc: David Miller
    Cc: Jesse Gross
    Cc: Karsten Keil
    Cc: John W. Linville
    Cc: Johannes Berg
    Cc: Bart De Schuymer
    Cc: Stephen Hemminger
    Cc: Patrick McHardy
    Cc: Marcel Holtmann
    Cc: Gustavo Padovan
    Cc: Johan Hedberg
    Cc: linux-bluetooth@vger.kernel.org
    Cc: netfilter-devel@vger.kernel.org
    Cc: bridge@lists.linux-foundation.org
    Cc: linux-wireless@vger.kernel.org
    Cc: linux1394-devel@lists.sourceforge.net
    Cc: linux-media@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Cc: dev@openvswitch.org
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Stefan Richter
    Signed-off-by: Simon Horman
    Signed-off-by: David S. Miller

    Simon Horman
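
    The > versus >= fix mentioned above is easiest to see in isolation. The
    following is a minimal sketch, not code from the patch: the helper name is
    hypothetical, and the value 0x0600 for the constant is not stated in this
    message and is supplied here as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value; the commit message above does not spell it out. */
#define ETH_P_802_3_MIN 0x0600

/*
 * Hypothetical helper: returns true when the 16-bit type/length field
 * (already converted to host byte order) is at or above the threshold.
 * The threshold value itself must pass, hence >= rather than >.
 */
static bool proto_at_or_above_802_3_min(uint16_t proto_host_order)
{
    return proto_host_order >= ETH_P_802_3_MIN;
}
```

    With a strict > comparison, a field holding exactly ETH_P_802_3_MIN would
    land on the wrong side of the test.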
     

27 Mar, 2013

1 commit


23 Mar, 2013

1 commit


22 Mar, 2013

2 commits

  • The netlink_diag can be built as a module, just like it's done in
    unix sockets.

    The core dumping message carries the basic info about netlink sockets:
    family, type and protocol, portid, dst_group, dst_portid, state.

    Groups can be received as an optional parameter NETLINK_DIAG_GROUPS.

    Netlink sockets can be filtered by protocols.

    The socket inode number and cookie are reserved for future per-socket info
    retrieval. The per-protocol filtering is also reserved for future use by
    requiring the sdiag_protocol to be zero.

    The file /proc/net/netlink doesn't provide enough information for
    dumping netlink sockets. It doesn't provide dst_group, dst_portid,
    or groups above 32.

    v2: fix NETLINK_DIAG_MAX. Now it's equal to the last constant.

    Acked-by: Pavel Emelyanov
    Cc: "David S. Miller"
    Cc: Eric Dumazet
    Cc: Pablo Neira Ayuso
    Cc: "Eric W. Biederman"
    Cc: Gao feng
    Cc: Thomas Graf
    Signed-off-by: Andrey Vagin
    Signed-off-by: David S. Miller

    Andrey Vagin
     
  • Follow the common pattern and define *_DIAG_MAX like:

    [...]
    __XXX_DIAG_MAX,
    };

    Because everyone is used to doing:

    struct nlattr *attrs[XXX_DIAG_MAX+1];

    nla_parse([...], XXX_DIAG_MAX, [...]

    Reported-by: Thomas Graf
    Cc: "David S. Miller"
    Cc: Pavel Emelyanov
    Cc: Eric Dumazet
    Cc: "Paul E. McKenney"
    Cc: David Howells
    Signed-off-by: Andrey Vagin
    Signed-off-by: David S. Miller

    Andrey Vagin
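
    The pattern above can be written out in full as follows. This is a sketch
    with placeholder attribute names (XXX_DIAG_FIRST and friends are
    hypothetical); only the __XXX_DIAG_MAX sentinel and the "+ 1" array sizing
    come from the message itself.

```c
/* Placeholder attribute list; the real names depend on the diag family. */
enum {
    XXX_DIAG_FIRST,
    XXX_DIAG_SECOND,
    XXX_DIAG_LAST,
    __XXX_DIAG_MAX,   /* sentinel: always one past the last real attribute */
};

/* XXX_DIAG_MAX therefore equals the last real attribute. */
#define XXX_DIAG_MAX (__XXX_DIAG_MAX - 1)

/* Number of slots a caller needs when indexing by attribute type. */
static int xxx_attr_table_size(void)
{
    return XXX_DIAG_MAX + 1;
}
```

    Adding a new attribute before the sentinel updates XXX_DIAG_MAX
    automatically, so arrays declared with XXX_DIAG_MAX + 1 entries stay
    correctly sized.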
     

21 Mar, 2013

4 commits


20 Mar, 2013

1 commit

  • Changes:
    v3->v2: rebase (no other changes)
            passes selftest
    v2->v1: read f->num_members only once
            fix bug: test rollover mode + flag

    Minimize packet drop in a fanout group. If one socket is full,
    roll over packets to another from the group. Maintain flow
    affinity during normal load using an rxhash fanout policy, while
    dispersing unexpected traffic storms that hit a single cpu, such
    as spoofed-source DoS flows. Rollover breaks affinity for flows
    arriving at saturated sockets during those conditions.

    The patch adds a fanout policy ROLLOVER that rotates between sockets,
    filling each socket before moving to the next. It also adds a fanout
    flag ROLLOVER. If passed along with any other fanout policy, the
    primary policy is applied until the chosen socket is full. Then,
    rollover selects another socket, to delay packet drop until the
    entire system is saturated.

    Probing sockets is not free. Selecting the last used socket, as
    rollover does, is a greedy approach that maximizes chance of
    success, at the cost of extreme load imbalance. In practice, with
    sufficiently long queues to absorb bursts, sockets are drained in
    parallel and load balance looks uniform in `top`.

    To avoid contention, the patch scales counters with the number of
    sockets and accesses them lock-free. Values are bounds-checked to
    ensure correctness.

    Tested using an application with 9 threads pinned to CPUs, one socket
    per thread and sufficient busywork per packet operation to limit each
    thread to handling 32 Kpps. When sent a single UDP stream at 500 Kpps,
    a FANOUT_CPU setup processes 32 Kpps in total without this
    patch, 270 Kpps with the patch. Tested with read() and with a packet
    ring (V1).

    Also, passes psock_fanout.c unit test added to selftests.

    Signed-off-by: Willem de Bruijn
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Willem de Bruijn
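
    The rollover flag described above (apply the primary policy first, then
    probe other sockets once the chosen one is full) can be sketched in
    userspace C. This is an illustration under stated assumptions, not the
    kernel code: the struct, the helper names and the "queued < capacity"
    notion of a full socket are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket in the fanout group. */
struct sketch_sock {
    size_t queued;    /* packets currently queued */
    size_t capacity;  /* queue limit */
};

static bool sock_has_room(const struct sketch_sock *s)
{
    return s->queued < s->capacity;
}

/*
 * primary is the index chosen by the primary fanout policy; *next remembers
 * the socket that rollover used last, so probing starts there (the greedy
 * choice). Requires num > 0. Returns primary if every socket is full, in
 * which case the packet would be dropped there.
 */
static size_t pick_with_rollover(const struct sketch_sock *socks, size_t num,
                                 size_t primary, size_t *next)
{
    size_t start, i;

    if (sock_has_room(&socks[primary]))
        return primary;

    /* Bounds-check the remembered index before using it. */
    start = (*next < num) ? *next : num - 1;
    i = start;
    do {
        if (i != primary && sock_has_room(&socks[i])) {
            *next = i;
            return i;
        }
        if (++i == num)
            i = 0;
    } while (i != start);

    return primary;
}
```

    Flow affinity is preserved as long as the primary socket has room; it is
    only broken for packets that would otherwise be dropped.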
     

18 Mar, 2013

2 commits

  • TCPCT uses option number 253, which is reserved for experimental use
    and should not be used in production environments.
    Further, TCPCT does not fully implement RFC 6013.

    As a nice side-effect, removing TCPCT increases TCP's performance for
    very short flows:

    Doing an apache-benchmark with -c 100 -n 100000, sending HTTP-requests
    for files of 1KB size.

    before this patch:
    average (among 7 runs) of 20845.5 Requests/Second
    after:
    average (among 7 runs) of 21403.6 Requests/Second

    Signed-off-by: Christoph Paasch
    Signed-off-by: David S. Miller

    Christoph Paasch
     
  • This patch generalizes VXLAN forwarding table entries allowing an administrator
    to:
    1) specify multiple destinations for a given MAC
    2) specify alternate vni's in the VXLAN header
    3) specify alternate destination UDP ports
    4) use multicast MAC addresses as fdb lookup keys
    5) specify multicast destinations
    6) specify the outgoing interface for forwarded packets

    The combination allows configuration of more complex topologies using VXLAN
    encapsulation.

    Changes since v1: rebase to 3.9.0-rc2

    Signed-Off-By: David L Stevens

    Signed-off-by: David S. Miller

    David Stevens
     

14 Mar, 2013

4 commits

  • Merge misc fixes from Andrew Morton:

    - A bunch of fixes

    - Finish off the idr API conversions before someone starts to use the
    old interfaces again.

    * emailed patches from Andrew Morton:
    idr: idr_alloc() shouldn't trigger lowmem warning when preloaded
    UAPI: fix endianness conditionals in M32R's asm/stat.h
    UAPI: fix endianness conditionals in linux/raid/md_p.h
    UAPI: fix endianness conditionals in linux/acct.h
    UAPI: fix endianness conditionals in linux/aio_abi.h
    decompressors: fix typo "POWERPC"
    mm/fremap.c: fix oops on error path
    idr: deprecate idr_pre_get() and idr_get_new[_above]()
    tidspbridge: convert to idr_alloc()
    zcache: convert to idr_alloc()
    mlx4: remove leftover idr_pre_get() call
    workqueue: convert to idr_alloc()
    nfsd: convert to idr_alloc()
    nfsd: remove unused get_new_stid()
    kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER
    signal: always clear sa_restorer on execve
    mm: remove_memory(): fix end_pfn setting
    include/linux/res_counter.h needs errno.h

    Linus Torvalds
     
  • In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
    compared against __BYTE_ORDER in preprocessor conditionals where these are
    exposed to userspace (that is they're not inside __KERNEL__ conditionals).

    However, in the main kernel the norm is to check for
    "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
    this has incorrectly leaked into the userspace headers.

    The definition of struct mdp_superblock_s in linux/raid/md_p.h is wrong in
    this way. Note that userspace will likely interpret the ordering of the
    fields incorrectly as the big-endian variant on little-endian machines -
    depending on header inclusion order.

    [!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
    be better to fix the ordering of events_hi, events_lo, cp_events_hi and
    cp_events_lo in struct mdp_superblock_s / typedef mdp_super_t.

    Signed-off-by: David Howells
    Acked-by: NeilBrown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
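
    The difference between the two styles of conditional can be demonstrated
    from userspace. This is a minimal sketch assuming a libc whose <endian.h>
    defines __BYTE_ORDER, __BIG_ENDIAN and __LITTLE_ENDIAN all at once (as
    glibc does); the FRAGILE_/ROBUST_ macro names and the runtime probe are
    hypothetical and only serve the illustration.

```c
#include <endian.h>   /* assumed to define __BYTE_ORDER and both *_ENDIAN macros */
#include <stdint.h>

/* Kernel-style test: true whenever the macro exists, regardless of the host. */
#if defined(__BIG_ENDIAN)
#define FRAGILE_SAYS_BIG 1
#else
#define FRAGILE_SAYS_BIG 0
#endif

/* Userspace-safe test: compares the actual byte order against the constant. */
#if __BYTE_ORDER == __BIG_ENDIAN
#define ROBUST_SAYS_BIG 1
#else
#define ROBUST_SAYS_BIG 0
#endif

/* Runtime probe used to cross-check the compile-time answer. */
static int runtime_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return *(const unsigned char *)&probe == 0x01;
}
```

    Once <endian.h> has been included, FRAGILE_SAYS_BIG is 1 even on a
    little-endian host, while ROBUST_SAYS_BIG agrees with the runtime probe.
    This is why the outcome depends on header inclusion order.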
     
  • In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
    compared against __BYTE_ORDER in preprocessor conditionals where these are
    exposed to userspace (that is they're not inside __KERNEL__ conditionals).

    However, in the main kernel the norm is to check for
    "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
    this has incorrectly leaked into the userspace headers.

    The definition of ACCT_BYTEORDER in linux/acct.h is wrong in this way.
    Note that userspace will likely interpret this incorrectly as the
    big-endian variant on little-endian machines - depending on header
    inclusion order.

    [!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
    be better to fix the value of ACCT_BYTEORDER.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
    compared against __BYTE_ORDER in preprocessor conditionals where these are
    exposed to userspace (that is they're not inside __KERNEL__ conditionals).

    However, in the main kernel the norm is to check for
    "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
    this has incorrectly leaked into the userspace headers.

    The definition of PADDED() in linux/aio_abi.h is wrong in this way. Note
    that userspace will likely interpret this and thus the order of fields in
    struct iocb incorrectly as the little-endian variant on big-endian
    machines - depending on header inclusion order.

    [!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
    be better to fix the ordering of aio_key and aio_reserved1 in struct iocb.

    Signed-off-by: David Howells
    Acked-by: Benjamin LaHaise
    Acked-by: Jeff Moyer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

12 Mar, 2013

3 commits

  • Add support for Altera 8250/16550 compatible serial port.

    Signed-off-by: Ley Foon Tan
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Ley Foon Tan
     
  • This is the second of the TLP patch series; it augments the basic TLP
    algorithm with a loss detection scheme.

    This patch implements a mechanism for loss detection when a Tail
    loss probe retransmission plugs a hole thereby masking packet loss
    from the sender. The loss detection algorithm relies on counting
    TLP dupacks as outlined in Sec. 3 of:
    http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01

    The basic idea is: Sender keeps track of TLP "episode" upon
    retransmission of a TLP packet. An episode ends when the sender receives
    an ACK above the SND.NXT (tracked by tlp_high_seq) at the time of the
    episode. We want to make sure that before the episode ends the sender
    receives a "TLP dupack", indicating that the TLP retransmission was
    unnecessary, so there was no loss/hole that needed plugging. If the
    sender gets no TLP dupack before the end of the episode, then it reduces
    ssthresh and the congestion window, because the TLP packet arriving at
    the receiver probably plugged a hole.

    Signed-off-by: Nandita Dukkipati
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Nandita Dukkipati
     
  • This patch series implements the Tail loss probe (TLP) algorithm described
    in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
    first patch implements the basic algorithm.

    TLP's goal is to reduce tail latency of short transactions. It achieves
    this by converting retransmission timeouts (RTOs) occurring due
    to tail losses (losses at end of transactions) into fast recovery.
    TLP transmits one packet in two round-trips when a connection is in
    Open state and isn't receiving any ACKs. The transmitted packet, aka
    loss probe, can be either new or a retransmission. When there is tail
    loss, the ACK from a loss probe triggers FACK/early-retransmit based
    fast recovery, thus avoiding a costly RTO. In the absence of loss,
    there is no change in the connection state.

    PTO stands for probe timeout. It is a timer event indicating
    that an ACK is overdue and triggers a loss probe packet. The PTO value
    is set to max(2*SRTT, 10ms) and is adjusted to account for the delayed
    ACK timer when there is only one outstanding packet.

    TLP Algorithm

    On transmission of new data in Open state:
    -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
    -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
    -> PTO = min(PTO, RTO)

    Conditions for scheduling PTO:
    -> Connection is in Open state.
    -> Connection is either cwnd limited or no new data to send.
    -> Number of probes per tail loss episode is limited to one.
    -> Connection is SACK enabled.

    When PTO fires:
    new_segment_exists:
    -> transmit new segment.
    -> packets_out++. cwnd remains same.

    no_new_packet:
    -> retransmit the last segment.
    Its ACK triggers FACK or early retransmit based recovery.

    ACK path:
    -> rearm RTO at start of ACK processing.
    -> reschedule PTO if need be.

    In addition, the patch includes a small variation to the Early Retransmit
    (ER) algorithm, such that ER and TLP together can in principle recover any
    N-degree of tail loss through fast recovery. TLP is controlled by the same
    sysctl as ER, tcp_early_retrans sysctl.
    tcp_early_retrans==0; disables TLP and ER.
                     ==1; enables RFC5827 ER.
                     ==2; delayed ER.
                     ==3; TLP and delayed ER. [DEFAULT]
                     ==4; TLP only.

    The TLP patch series has been extensively tested on Google Web servers.
    It is most effective for short Web transactions, where it reduced RTOs by 15%
    and improved HTTP response time (average by 6%, 99th percentile by 10%).
    The transmitted probes account for
    Acked-by: Neal Cardwell
    Acked-by: Yuchung Cheng
    Signed-off-by: David S. Miller

    Nandita Dukkipati
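
    The PTO scheduling rules listed above reduce to a small calculation. The
    following is a sketch under stated assumptions rather than the kernel
    implementation: the function name is hypothetical, all times are in
    milliseconds, the smoothed RTT is used wherever the rules say RTT, and
    1.5*RTT is computed with integer arithmetic.

```c
#include <stdint.h>

static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }
static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/*
 * Hypothetical helper returning the probe timeout in milliseconds.
 *   packets_out > 1:  max(2*SRTT, 10ms)
 *   packets_out == 1: max(2*RTT, 1.5*RTT + 200ms), leaving room for a
 *                     delayed ACK from the receiver
 * The result is then capped at the RTO.
 */
static uint32_t tlp_probe_timeout_ms(uint32_t srtt_ms, uint32_t rto_ms,
                                     uint32_t packets_out)
{
    uint32_t pto = max_u32(2 * srtt_ms, 10);

    if (packets_out == 1)
        pto = max_u32(2 * srtt_ms, srtt_ms + srtt_ms / 2 + 200);

    return min_u32(pto, rto_ms);
}
```

    For example, with SRTT = 20 ms and RTO = 1000 ms, several outstanding
    packets give a 40 ms PTO, while a single outstanding packet gives 230 ms.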
     

11 Mar, 2013

2 commits

  • This adds a netlink interface for service name lookup support.
    Multiple URIs can be passed nested into the NFC_ATTR_LLC_SDP attribute
    using the NFC_CMD_LLC_SDREQ netlink command.
    When the SNL reply is received, an NFC_EVENT_LLC_SDRES event is sent to
    user space. URI and SAP tuples are passed back, nested into the
    NFC_ATTR_LLC_SDP attribute.

    Signed-off-by: Thierry Escande
    Signed-off-by: Samuel Ortiz

    Thierry Escande
     
  • Some LLCP services (e.g. the validation ones) require some control over
    the LLCP link parameters like the receive window (RW) or the MIU extension
    (MIUX). This can only be done through socket options.

    Signed-off-by: Samuel Ortiz

    Samuel Ortiz
     

09 Mar, 2013

1 commit

  • Split the vSockets header into kernel and UAPI parts. The former gets the bits
    that used to be in __KERNEL__ guards, while the latter gets everything that is
    user-visible. Tested by compiling vsock (+transport) and a simple user-mode
    vSockets application.

    Reported-by: David Howells
    Acked-by: Dmitry Torokhov
    Signed-off-by: Andy King
    Acked-by: David Howells
    Signed-off-by: David S. Miller

    Andy King
     

07 Mar, 2013

1 commit

  • HTB uses an internal pfifo queue whose limit is not reported
    to userland tools (tc); its value is inherited from the device's
    tx_queue_len at setup time.

    Introduce TCA_HTB_DIRECT_QLEN attribute to allow finer control.

    Remove two obsolete pr_err() calls as well.

    Signed-off-by: Eric Dumazet
    Cc: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Eric Dumazet
     

06 Mar, 2013

7 commits

  • If the user requested a userspace MPM, automatically
    disable auto_open_plinks to fully disable the kernel MPM.

    Signed-off-by: Thomas Pedersen
    Signed-off-by: Johannes Berg

    Thomas Pedersen
     
  • Secure mesh had the implicit requirement that the Mesh
    Peering Management entity be in userspace. However
    userspace might want to implement an open MPM as well, so
    specify a mesh setup parameter to indicate this.

    Signed-off-by: Thomas Pedersen
    Signed-off-by: Johannes Berg

    Thomas Pedersen
     
  • Add NL80211_CMD_UPDATE_FT_IES to support update of FT IEs to the WLAN
    driver and NL80211_CMD_FT_EVENT to send FT events from the WLAN driver.
    This will carry the target AP's MAC address along with the relevant
    Information Elements. This event is used to report received FT IEs
    (MDIE, FTIE, RSN IE, TIE, RICIE). These changes allow FT to be supported
    with drivers that use an internal SME instead of a user space option (like
    the FT implementation in wpa_supplicant with mac80211-based drivers).

    Signed-off-by: Jouni Malinen
    Signed-off-by: Johannes Berg

    Jouni Malinen
     
  • For testing it's sometimes useful to be able to
    override certain VHT capability advertisements;
    add the ability to do that in cfg80211.

    Signed-off-by: Johannes Berg

    Johannes Berg
     
  • The per-wiphy information is getting large, to the point
    where with more than the typical number of channels it's
    too large and overflows, and userspace can't get any of
    the information at all.

    To address this (in a way that doesn't require making all
    messages bigger) allow userspace to specify that it can
    deal with wiphy information split across multiple parts
    of the dump, and, if it can, split up the data. This also
    splits up each channel separately so an arbitrary number
    of channels can be supported.

    Additionally, since GET_WIPHY has the same problem, add
    support for filtering the wiphy dump to get information
    for a single wiphy only; this allows userspace apps to
    use dump in this case to retrieve all data from a single
    device.

    As userspace needs to know if all this is supported,
    add a global nl80211 feature set and include a bit for
    this behaviour in it.

    Cc: Dennis H Jensen
    Signed-off-by: Johannes Berg

    Johannes Berg
     
  • The station change API isn't being checked properly before
    drivers are called, and as a result it is difficult to see
    what should be allowed and what not.

    In order to comprehensively check the API parameters, parse
    everything first, and then have the driver call a function
    (cfg80211_check_station_change()) with the additional
    information about the kind of station that is being changed;
    this allows the function to make better decisions than the
    old code could.

    While at it, also add a few checks, particularly in mesh,
    and clarify the TDLS station lifetime in documentation.

    To be able to reduce a few checks, ignore any flag set bits
    when the mask isn't set; they shouldn't be applied then.

    Signed-off-by: Johannes Berg

    Johannes Berg
     
  • Make the ability to leave the plink_state unchanged not use a
    magic -1 variable that isn't in the enum, but an explicit change
    flag; reject invalid plink states or actions and move the needed
    constants for plink actions to the right header file. Also
    reject plink_state changes for non-mesh interfaces.

    Signed-off-by: Johannes Berg

    Johannes Berg
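
    The move from a magic -1 value to an explicit change flag can be sketched
    as follows. Everything here is hypothetical (the sketch_ names, the
    reduced state list and the return convention are assumptions made for the
    illustration); it only shows the shape of the checks described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced, hypothetical state list; the real enum has more entries. */
enum sketch_plink_state {
    SKETCH_PLINK_LISTEN,
    SKETCH_PLINK_ESTAB,
    SKETCH_PLINK_BLOCKED,
    SKETCH_NUM_PLINK_STATES,
};

/* Explicit "apply this field" bit instead of a sentinel outside the enum. */
#define SKETCH_APPLY_PLINK_STATE (1u << 0)

struct sketch_sta_params {
    uint32_t apply_mask;
    enum sketch_plink_state plink_state;
};

/* Returns 0 on success (including "nothing to change"), -1 if rejected. */
static int sketch_apply_plink(enum sketch_plink_state *current,
                              const struct sketch_sta_params *p,
                              bool is_mesh)
{
    if (!(p->apply_mask & SKETCH_APPLY_PLINK_STATE))
        return 0;               /* flag not set: leave state unchanged */
    if (!is_mesh)
        return -1;              /* plink changes only make sense for mesh */
    if ((unsigned int)p->plink_state >= SKETCH_NUM_PLINK_STATES)
        return -1;              /* reject values outside the enum */

    *current = p->plink_state;
    return 0;
}
```

    Because "unchanged" is carried by the mask, every value stored in
    plink_state can be validated against the enum range.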
     

04 Mar, 2013

2 commits

  • Pull fbdev UAPI disintegration from David Howells:
    "You'll be glad to hear that the end is nigh for the UAPI patches.
    Only the fbdev/framebuffer piece remains now that the SCSI stuff has
    gone in.

    Here are the UAPI disintegration bits for the fbdev drivers. It
    appears that Florian hasn't had time to deal with my patch, but back
    in December he did say he didn't mind if I pushed it forward."

    Yay. No more uapi movement. And hopefully no more big header file
    cleanups coming up either, it just tends to be very painful.

    * tag 'disintegrate-fbdev-20121220' of git://git.infradead.org/users/dhowells/linux-headers:
    UAPI: (Scripted) Disintegrate include/video

    Linus Torvalds
     
  • Pull new ImgTec Meta architecture from James Hogan:
    "This adds core architecture support for Imagination's Meta processor
    cores, followed by some later miscellaneous arch/metag cleanups and
    fixes which I kept separate to ease review:

    - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
    - A few fixes all over, particularly for symbol prefixes
    - A few privilege protection fixes
    - Several cleanups (setup.c includes, split out a lot of
    metag_ksyms.c)
    - Fix some missing exports
    - Convert hugetlb to use vm_unmapped_area()
    - Copy device tree to non-init memory
    - Provide dma_get_sgtable()"

    * tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
    metag: Provide dma_get_sgtable()
    metag: prom.h: remove declaration of metag_dt_memblock_reserve()
    metag: copy devicetree to non-init memory
    metag: cleanup metag_ksyms.c includes
    metag: move mm/init.c exports out of metag_ksyms.c
    metag: move usercopy.c exports out of metag_ksyms.c
    metag: move setup.c exports out of metag_ksyms.c
    metag: move kick.c exports out of metag_ksyms.c
    metag: move traps.c exports out of metag_ksyms.c
    metag: move irq enable out of irqflags.h on SMP
    genksyms: fix metag symbol prefix on crc symbols
    metag: hugetlb: convert to vm_unmapped_area()
    metag: export clear_page and copy_page
    metag: export metag_code_cache_flush_all
    metag: protect more non-MMU memory regions
    metag: make TXPRIVEXT bits explicit
    metag: kernel/setup.c: sort includes
    perf: Enable building perf tools for Meta
    metag: add boot time LNKGET/LNKSET check
    metag: add __init to metag_cache_probe()
    ...

    Linus Torvalds
     

03 Mar, 2013

4 commits

  • Pull btrfs update from Chris Mason:
    "The biggest feature in the pull is the new (and still experimental)
    raid56 code that David Woodhouse started long ago. I'm still working
    on the parity logging setup that will avoid inconsistent parity after
    a crash, so this is only for testing right now. But, I'd really like
    to get it out to a broader audience to hammer out any performance
    issues or other problems.

    scrub does not yet correct errors on raid5/6 either.

    Josef has another pass at fsync performance. The big change here is
    to combine waiting for metadata with waiting for data, which is a big
    latency win. It is also step one toward using atomics from the
    hardware during a commit.

    Mark Fasheh has a new way to use btrfs send/receive to send only the
    metadata changes. SUSE is using this to make snapper more efficient
    at finding changes between snapshots.

    Snapshot-aware defrag is also included.

    Otherwise we have a large number of fixes and cleanups. Eric Sandeen
    wins the award for removing the most lines, and I'm hoping we steal
    this idea from XFS over and over again."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (118 commits)
    btrfs: fixup/remove module.h usage as required
    Btrfs: delete inline extents when we find them during logging
    btrfs: try harder to allocate raid56 stripe cache
    Btrfs: cleanup to make the function btrfs_delalloc_reserve_metadata more logic
    Btrfs: don't call btrfs_qgroup_free if just btrfs_qgroup_reserve fails
    Btrfs: remove reduplicate check about root in the function btrfs_clean_quota_tree
    Btrfs: return ENOMEM rather than use BUG_ON when btrfs_alloc_path fails
    Btrfs: fix missing deleted items in btrfs_clean_quota_tree
    btrfs: use only inline_pages from extent buffer
    Btrfs: fix wrong reserved space when deleting a snapshot/subvolume
    Btrfs: fix wrong reserved space in qgroup during snap/subv creation
    Btrfs: remove unnecessary dget_parent/dput when creating the pending snapshot
    btrfs: remove a printk from scan_one_device
    Btrfs: fix NULL pointer after aborting a transaction
    Btrfs: fix memory leak of log roots
    Btrfs: copy everything if we've created an inline extent
    btrfs: cleanup for open-coded alignment
    Btrfs: do not change inode flags in rename
    Btrfs: use reserved space for creating a snapshot
    clear chunk_alloc flag on retryable failure
    ...

    Linus Torvalds
     
  • The ptrace interface for metag provides access to some core register
    sets using the PTRACE_GETREGSET and PTRACE_SETREGSET operations. The
    details of the internal context structures are abstracted into user API
    structures to both ease use and allow flexibility to change the internal
    context layouts. Copyin and copyout functions for these register sets
    are exposed to allow signal handling code to use them to copy to and
    from the signal context.

    struct user_gp_regs (NT_PRSTATUS) provides access to the core general
    purpose register context.

    struct user_cb_regs (NT_METAG_CBUF) provides access to the TXCATCH*
    registers, which contain information about a memory fault, unaligned
    access error or watchpoint. This can be modified to alter the way the
    fault is replayed on resume ("catch replay"), or to prevent the replay
    taking place.

    struct user_rp_state (NT_METAG_RPIPE) provides access to the state of
    the Meta read pipeline which can be used to hide memory latencies in
    hand optimised data loops.

    Extended DSP register state, DSP RAM, and hardware breakpoint registers
    aren't yet exposed through ptrace.

    Signed-off-by: James Hogan
    Cc: Andrew Morton
    Cc: Denys Vlasenko
    Cc: Arnd Bergmann
    Cc: Tony Lindgren
    Cc: "Paul E. McKenney"

    James Hogan
     
  • Pull device-mapper update from Alasdair G Kergon:
    "The main addition here is a long-desired target framework to allow an
    SSD to be used as a cache in front of a slower device. Cache tuning
    is delegated to interchangeable policy modules so these can be
    developed independently of the mechanics needed to shuffle the data
    around.

    Other than that, kcopyd users acquire a throttling parameter, ioctl
    buffer usage gets streamlined, more mempool reliance is reduced and
    there are a few other bug fixes and tidy-ups."

    * tag 'dm-3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: (30 commits)
    dm cache: add cleaner policy
    dm cache: add mq policy
    dm: add cache target
    dm persistent data: add bitset
    dm persistent data: add transactional array
    dm thin: remove cells from stack
    dm bio prison: pass cell memory in
    dm persistent data: add btree_walk
    dm: add target num_write_bios fn
    dm kcopyd: introduce configurable throttling
    dm ioctl: allow message to return data
    dm ioctl: optimize functions without variable params
    dm ioctl: introduce ioctl_flags
    dm: merge io_pool and tio_pool
    dm: remove unused _rq_bio_info_cache
    dm: fix limits initialization when there are no data devices
    dm snapshot: add missing module aliases
    dm persistent data: set some btree fn parms const
    dm: refactor bio cloning
    dm: rename bio cloning functions
    ...

    Linus Torvalds
     
  • Pull SCSI updates from James Bottomley:
    "This is an assorted set of stragglers into the merge window with
    driver updates for qla2xxx, megaraid_sas, storvsc and ufs.

    It also includes pulls of the uapi tree (all the remaining SCSI
    pieces) and the fcoe tree (updates to fcoe and libfc)"

    * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (81 commits)
    [SCSI] ufs: Separate PCI code into glue driver
    [SCSI] ufs: Segregate PCI Specific Code
    [SCSI] scsi: fix lpfc build when wmb() is defined as mb()
    [SCSI] storvsc: Handle dynamic resizing of the device
    [SCSI] storvsc: Restructure error handling code on command completion
    [SCSI] storvsc: avoid usage of WRITE_SAME
    [SCSI] aacraid: suppress two GCC warnings
    [SCSI] hpsa: check for dma_mapping_error in hpsa_passthru ioctls
    [SCSI] hpsa: reorganize error handling in hpsa_passthru_ioctl
    [SCSI] hpsa: check for dma_mapping_error in hpsa_map_sg_chain_block
    [SCSI] hpsa: Check for dma_mapping_error for all code paths using fill_cmd
    [SCSI] hpsa: Check for dma_mapping_error in hpsa_map_one
    [SCSI] dc395x: uninitialized variable in device_alloc()
    [SCSI] Fix range check in scsi_host_dif_capable()
    [SCSI] storvsc: Initialize the sglist
    [SCSI] mpt2sas: Add support for OEM specific controller
    [SCSI] ipr: Fix oops while resetting an ipr adapter
    [SCSI] fnic: Fnic Trace Utility
    [SCSI] fnic: New debug flags and debug log messages
    [SCSI] fnic: fnic driver may hit BUG_ON on device reset
    ...

    Linus Torvalds
     

02 Mar, 2013

2 commits

  • This patch introduces enhanced message support that allows the
    device-mapper core to recognise messages that are common to all devices,
    and for messages to return data to userspace.

    Core messages are processed by the function "message_for_md". If the
    device mapper doesn't support the message, it is passed to the target
    driver.

    If the message returns data, the kernel sets the flag
    DM_MESSAGE_OUT_FLAG.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Device-mapper ioctls receive and send data in a buffer supplied
    by userspace. The buffer has two parts. The first part contains
    a 'struct dm_ioctl' and has a fixed size. The second part depends
    on the ioctl and has a variable size.

    This patch recognises the specific ioctls that do not use the variable
    part of the buffer and skips allocating memory for it.

    In particular, when a device is suspended and a resume ioctl is sent,
    this now avoids memory allocation completely.

    The variable "struct dm_ioctl tmp" is moved from the function
    copy_params to its caller ctl_ioctl and renamed to param_kernel.
    It is used directly when the ioctl function doesn't need any arguments.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka