12 Dec, 2013

1 commit

  • This patch significantly updates the BPF documentation and describes
    its internal architecture, Linux extensions, and handling of the
    kernel's BPF and JIT engine, plus documents how development can be
    facilitated with the help of bpf_dbg, bpf_asm, bpf_jit_disasm.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

10 Dec, 2013

4 commits

  • There are quite a lot of drivers touching a PHY device MII_BMCR
    register to reset the PHY without taking care of:

    1) ensuring that BMCR_RESET is cleared after a given timeout
    2) the PHY state machine resuming to the proper state and re-applying
    potentially changed settings such as auto-negotiation

    Introduce phy_poll_reset() which will take care of polling the MII_BMCR
    for the BMCR_RESET bit to be cleared after a given timeout or return a
    timeout error code.

    In order to make sure the PHY is in a correct state, phy_init_hw() first
    issues a software reset through MII_BMCR and then applies any fixups.
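
    The polling described above can be sketched in user space as follows.
    This is a minimal illustration, not the kernel's phy_poll_reset(): the
    register accessor type, the mock PHY, and the poll budget are all
    hypothetical, and a real driver would sleep between reads.

```c
#include <errno.h>

#define MII_BMCR   0x00    /* basic mode control register */
#define BMCR_RESET 0x8000  /* self-clearing software reset bit */

/* Hypothetical accessor: returns the register value or a negative errno. */
typedef int (*mii_read_fn)(void *ctx, int reg);

/*
 * Poll MII_BMCR until BMCR_RESET clears. Returns 0 on success, the
 * accessor's error on a failed read, or -ETIMEDOUT once max_polls
 * reads have all shown the bit still set.
 */
static int poll_reset_sketch(mii_read_fn read_reg, void *ctx,
                             unsigned int max_polls)
{
    while (max_polls--) {
        int val = read_reg(ctx, MII_BMCR);

        if (val < 0)
            return val;
        if (!(val & BMCR_RESET))
            return 0;
        /* A real driver would sleep here before the next read. */
    }
    return -ETIMEDOUT;
}

/* Mock PHY for illustration: reset completes after a fixed number of reads. */
struct mock_phy {
    unsigned int reads_until_clear;
};

static int mock_read(void *ctx, int reg)
{
    struct mock_phy *phy = ctx;

    (void)reg;
    if (phy->reads_until_clear > 0) {
        phy->reads_until_clear--;
        return BMCR_RESET;
    }
    return 0;
}
```

    With the mock, a PHY that clears the bit within the budget yields 0,
    and one that does not yields -ETIMEDOUT.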

    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • This patch introduces a PACKET_QDISC_BYPASS socket option, that
    allows for using a similar xmit() function as in pktgen instead
    of taking the dev_queue_xmit() path. This can be very useful when
    PF_PACKET applications are required to be used in a similar
    scenario as pktgen, but with full, flexible packet payload that
    needs to be provided, for example.

    By default, nothing changes in behaviour for normal PF_PACKET
    TX users, so everything stays as is for applications. New users,
    however, can now set PACKET_QDISC_BYPASS if needed to prevent
    own packets from i) reentering packet_rcv() and ii) to directly
    push the frame to the driver.
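
    A user-space sketch of opting in might look like the following. The
    helper name open_bypass_socket() is hypothetical; the fallback define
    assumes the option number used by linux/if_packet.h on kernels that
    carry this patch. Creating the socket normally requires CAP_NET_RAW,
    so the helper reports failures as a negative errno.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if_packet.h>

#ifndef PACKET_QDISC_BYPASS
#define PACKET_QDISC_BYPASS 20  /* assumed value for older headers */
#endif

/*
 * Open a PF_PACKET TX socket with protocol 0, so that our own frames do
 * not reenter packet_rcv(), and request the qdisc bypass.
 * Returns the fd on success or a negative errno on failure.
 */
static int open_bypass_socket(void)
{
    int one = 1;
    int fd = socket(PF_PACKET, SOCK_RAW, 0);

    if (fd < 0)
        return -errno;

    if (setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS,
                   &one, sizeof(one)) < 0) {
        int err = errno;

        close(fd);
        return -err;
    }
    return fd;
}
```

    Sockets that never set the option keep the dev_queue_xmit() path, as
    the text states.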

    In doing so we can increase pps (here 64 byte packets) for
    PF_PACKET a bit:

    # CPUs  -- QDISC_BYPASS   -- qdisc path -- qdisc path[**]
     1 CPU  ==  1,509,628 pps --  1,208,708 -- 1,247,436
     2 CPUs ==  3,198,659 pps --  2,536,012 -- 1,605,779
     3 CPUs ==  4,787,992 pps --  3,788,740 -- 1,735,610
     4 CPUs ==  6,173,956 pps --  4,907,799 -- 1,909,114
     5 CPUs ==  7,495,676 pps --  5,956,499 -- 2,014,422
     6 CPUs ==  9,001,496 pps --  7,145,064 -- 2,155,261
     7 CPUs == 10,229,776 pps --  8,190,596 -- 2,220,619
     8 CPUs == 11,040,732 pps --  9,188,544 -- 2,241,879
     9 CPUs == 12,009,076 pps -- 10,275,936 -- 2,068,447
    10 CPUs == 11,380,052 pps -- 11,265,337 -- 1,578,689
    11 CPUs == 11,672,676 pps -- 11,845,344 -- 1,297,412
    [...]
    20 CPUs == 11,363,192 pps -- 11,014,933 -- 1,245,081

    [**]: qdisc path with packet_rcv(), how probably most people
    seem to use it (hopefully not anymore if not needed)

    The test was done using a modified trafgen, sending a simple
    static 64 bytes packet, on all CPUs. The trick in the fast
    "qdisc path" case, is to avoid reentering packet_rcv() by
    setting the RAW socket protocol to zero, like:
    socket(PF_PACKET, SOCK_RAW, 0);

    Tradeoffs are documented as well in this patch, clearly, if
    queues are busy, we will drop more packets, tc disciplines are
    ignored, and these packets are not visible to taps anymore. For
    a pktgen like scenario, we argue that this is acceptable.

    The pointer to the xmit function has been placed in packet
    socket structure hole between cached_dev and prot_hook that
    is hot anyway as we're working on cached_dev in each send path.

    Done in joint work together with Jesper Dangaard Brouer.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Merge 'net' into 'net-next' to get the AF_PACKET bug fix that
    Daniel's direct transmit changes depend upon.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Commit e40526cb20b5 introduced a cached dev pointer, that gets
    hooked into register_prot_hook(), __unregister_prot_hook() to
    update the device used for the send path.

    We need to fix this up, as otherwise this will not work with
    sockets created with protocol = 0, plus with sll_protocol = 0
    passed via sockaddr_ll when doing the bind.

    So instead, assign the pointer directly. The compiler can inline
    these helper functions automagically.

    While at it, also assume the cached dev fast-path as likely(),
    and document this variant of socket creation as it seems it is
    not widely used (seems not even the author of TX_RING was aware
    of that in his reference example [1]). Tested with reproducer
    from e40526cb20b5.

    [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example

    Fixes: e40526cb20b5 ("packet: fix use after free race in send path when dev is released")
    Signed-off-by: Daniel Borkmann
    Tested-by: Salam Noureddine
    Tested-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

07 Dec, 2013

4 commits

  • Add a new check for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to reduce
    the number of or's used in the ether_addr_equal comparison to very
    slightly improve function performance.
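
    The saving can be illustrated in user space. The sketch below is not
    the kernel source; both function names are hypothetical, and memcpy()
    stands in for the direct loads the kernel can use when unaligned
    access is cheap. The first variant needs two ORs, the second only one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Baseline: three 16-bit XORs combined with two ORs. */
static bool ether_addr_equal_16(const uint8_t *a, const uint8_t *b)
{
    uint16_t a16[3], b16[3];

    memcpy(a16, a, 6);
    memcpy(b16, b, 6);
    return ((a16[0] ^ b16[0]) | (a16[1] ^ b16[1]) | (a16[2] ^ b16[2])) == 0;
}

/* Cheap-unaligned-load variant: one 32-bit XOR, one 16-bit XOR, one OR. */
static bool ether_addr_equal_32(const uint8_t *a, const uint8_t *b)
{
    uint32_t a32, b32;
    uint16_t a16, b16;

    memcpy(&a32, a, 4);
    memcpy(&b32, b, 4);
    memcpy(&a16, a + 4, 2);
    memcpy(&b16, b + 4, 2);
    return ((a32 ^ b32) | (uint32_t)(a16 ^ b16)) == 0;
}
```

    Both variants return the same answer for any pair of 6-byte addresses;
    only the number of operations differs.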

    Simplify the ether_addr_equal_64bits implementation.
    Integrate and remove the zap_last_2bytes helper as it's now
    used only once.

    Remove the now unused compare_ether_addr function.

    Update the unaligned-memory-access documentation to remove the
    compare_ether_addr description and show how unaligned accesses
    could occur with ether_addr_equal.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • The 'max-speed' property is optional but defined in the ePAPR
    specification and now supported by the Linux Device Tree parsing
    infrastructure.

    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • John W. Linville says:

    ====================
    Please pull this batch of updates intended for the 3.14 stream...

    For the mac80211 bits, Johannes says:

    "I have various improvements/cleanups/fixes all over, but the shortlog
    shows that Luis's regulatory work and mesh work from the cozybit folks
    are the biggest ones, along with the CSA fixes."

    Along with that, we have big batches of updates to brcmfmac, rtlwifi,
    and ath9k. There are updates to wcn36xx, rt2x00, and a handful of
    others as well.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • With the introduction of TCP Small Queues, TSO auto sizing, and TCP
    pacing, we can implement Automatic Corking in the kernel, to help
    applications doing small write()/sendmsg() to TCP sockets.

    The idea is to change tcp_push() to check whether the current skb payload
    is under the optimal skb size (a multiple of MSS bytes).

    If under 'size_goal', and at least one packet is still in Qdisc or
    NIC TX queues, set the TCP Small Queue Throttled bit, so that the push
    will be delayed up to TX completion time.
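
    The decision described above can be written down as a small predicate.
    This is only a sketch of the stated conditions: the struct and its
    field names are hypothetical and do not mirror kernel data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state the commit message refers to. */
struct autocork_state {
    bool         autocorking_enabled; /* mirrors the tcp_autocorking sysctl */
    size_t       skb_len;             /* payload bytes in the current skb */
    size_t       size_goal;           /* optimal skb size, a multiple of MSS */
    unsigned int pkts_queued;         /* packets of this flow still in Qdisc
                                         or NIC TX queues */
};

/* True when the push should be deferred until TX completion. */
static bool should_defer_push(const struct autocork_state *s)
{
    return s->autocorking_enabled &&
           s->skb_len < s->size_goal &&
           s->pkts_queued > 0;
}
```

    When nothing is queued for the flow, the predicate is false and the
    push goes out immediately, which matches the zero-delay case.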

    This delay might allow the application to coalesce more bytes
    in the skb in following write()/sendmsg()/sendfile() system calls.

    The exact duration of the delay depends on the dynamics
    of the system, and might be zero if no packet for this flow
    is actually held in Qdisc or NIC TX ring.

    Using FQ/pacing is a way to increase the probability of
    autocorking being triggered.

    Add a new sysctl (/proc/sys/net/ipv4/tcp_autocorking) to control
    this feature and default it to 1 (enabled)

    Add a new SNMP counter : nstat -a | grep TcpExtTCPAutoCorking
    This counter is incremented every time we detect that an skb was underused
    and its flush was deferred.

    Tested:

    Interesting effects when using line buffered commands under ssh.

    Excellent performance results in terms of CPU usage and total throughput.

    lpq83:~# echo 1 >/proc/sys/net/ipv4/tcp_autocorking
    lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128
    9410.39

    Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128':

    35209.439626 task-clock # 2.901 CPUs utilized
    2,294 context-switches # 0.065 K/sec
    101 CPU-migrations # 0.003 K/sec
    4,079 page-faults # 0.116 K/sec
    97,923,241,298 cycles # 2.781 GHz [83.31%]
    51,832,908,236 stalled-cycles-frontend # 52.93% frontend cycles idle [83.30%]
    25,697,986,603 stalled-cycles-backend # 26.24% backend cycles idle [66.70%]
    102,225,978,536 instructions # 1.04 insns per cycle
    # 0.51 stalled cycles per insn [83.38%]
    18,657,696,819 branches # 529.906 M/sec [83.29%]
    91,679,646 branch-misses # 0.49% of all branches [83.40%]

    12.136204899 seconds time elapsed

    lpq83:~# echo 0 >/proc/sys/net/ipv4/tcp_autocorking
    lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128
    6624.89

    Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128':
    40045.864494 task-clock # 3.301 CPUs utilized
    171 context-switches # 0.004 K/sec
    53 CPU-migrations # 0.001 K/sec
    4,080 page-faults # 0.102 K/sec
    111,340,458,645 cycles # 2.780 GHz [83.34%]
    61,778,039,277 stalled-cycles-frontend # 55.49% frontend cycles idle [83.31%]
    29,295,522,759 stalled-cycles-backend # 26.31% backend cycles idle [66.67%]
    108,654,349,355 instructions # 0.98 insns per cycle
    # 0.57 stalled cycles per insn [83.34%]
    19,552,170,748 branches # 488.244 M/sec [83.34%]
    157,875,417 branch-misses # 0.81% of all branches [83.34%]

    12.130267788 seconds time elapsed

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

06 Dec, 2013

3 commits


03 Dec, 2013

1 commit


29 Nov, 2013

1 commit

  • Pull GPIO fixes from Linus Walleij:
    "Here is a bunch of patches for the v3.13 series. Most important stuff
    is related to fixes and documentation for the new GPIO descriptor API.
    If the diffstat is scary you'll notice most of it is to
    Documentation/*:

    - A big slew of documentation for the gpiod transition that happened
    in the merge window, no semantic effect, but we should provide
    proper documentation with the new API.

    - Fix flags related to the new API.

    - Fix to the find_chip_by_name() lookup function related to the new
    API.

    - Fix of_find_gpio() when not using device tree.

    - Bug fix for the TB10x direction setting.

    - Error path fixes from Dan Carpenter.

    - Nasty IRQdomain bug relating to taking an uninitialized spinlock.

    - Minor fixes here and there"

    * tag 'gpio-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
    gpio: bcm281xx: Fix return value of bcm_kona_gpio_get()
    gpio: pl061: move irqdomain initialization
    gpio: ucb1400: Add MODULE_ALIAS
    gpiolib: fix of_find_gpio() when OF not defined
    gpio: fix memory leak in error path
    gpio: rcar: NULL dereference on error in probe()
    gpio: msm: make msm_gpio.summary_irq signed for error handling
    gpio: mvebu: make mvchip->irqbase signed for error handling
    gpiolib: use dedicated flags for GPIO properties
    gpiolib: fix find_chip_by_name()
    Documentation: gpiolib: document new interface
    gpio: tb10x: Set output value before setting direction to output

    Linus Torvalds
     

28 Nov, 2013

2 commits


27 Nov, 2013

1 commit

  • Pull ARM SoC fixes from Olof Johansson:
    "Mostly bugfixes and a few small code removals. Worth pointing out is:

    - A handful of more fixes to get DT enablement working properly on
    OMAP, finding new breakage of things that don't work quite right
    yet without the traditional board files. I expect a bit more of
    this to come in this release as people test on their hardware.
    - Implementation of power_down_finish() on vexpress, to make kexec
    work and to stop the MCPM core to produce a warning (the warning
    was new to 3.13-rc1).
    - A handful of minor fixes for various platforms"

    * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    ARM: bcm2835: add missing #xxx-cells to I2C nodes
    ARM: dts: Add max77686 RTC interrupt to cros5250-common
    ARM: vexpress/TC2: Implement MCPM power_down_finish()
    ARM: tegra: Provide dummy powergate implementation
    ARM: omap: fix warning with LPAE build
    ARM: OMAP2+: Remove legacy omap4_twl6030_hsmmc_init
    ARM: OMAP2+: Remove legacy mux code for display.c
    ARM: OMAP2+: Fix undefined reference to set_cntfreq
    gpio: twl4030: Fix passing of pdata in the device tree case
    gpio: twl4030: Fix regression for twl gpio output
    ARM: OMAP2+: More randconfig fixes for reconfigure_io_chain
    ARM: dts: imx6qdl: disable spdif "rxtx5" clock option
    ARM: dts: Fix omap2 specific dtsi files by adding the missing entries
    ARM: OMAP2+: Fix GPMC and simplify bootloader timings for 8250 and smc91x
    i2c: omap: Fix missing device tree flags for omap2

    Linus Torvalds
     

26 Nov, 2013

2 commits

  • ….org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

    From Tony Lindgren:
    Few more fixes for issues found booting older omaps using device tree.
    Also few randconfig build fixes and removal of some dead code for omap4
    as it no longer has legacy platform data based booting support.

    * tag 'omap-for-v3.13/more-fixes-for-merge-window-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
    ARM: OMAP2+: Remove legacy omap4_twl6030_hsmmc_init
    ARM: OMAP2+: Remove legacy mux code for display.c
    ARM: OMAP2+: Fix undefined reference to set_cntfreq
    gpio: twl4030: Fix passing of pdata in the device tree case
    gpio: twl4030: Fix regression for twl gpio output
    ARM: OMAP2+: More randconfig fixes for reconfigure_io_chain
    ARM: dts: Fix omap2 specific dtsi files by adding the missing entries
    ARM: OMAP2+: Fix GPMC and simplify bootloader timings for 8250 and smc91x
    i2c: omap: Fix missing device tree flags for omap2

    Olof Johansson
     
  • These two flags are used for the same purpose, just
    combine them into a no-ir flag to annotate no initiating
    radiation is allowed.

    Old userspace sending either flag will have it treated as
    the no-ir flag. To be considerate to older userspace we
    also send both the no-ir flag and the old no-ibss flags.
    Newer userspace will have to be aware of older kernels.
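
    The compatibility rule can be sketched as two small helpers. The bit
    positions and all names below are illustrative assumptions; the real
    flag values are defined in nl80211.h and are not reproduced here.

```c
#include <stdint.h>

/* Illustrative bit positions only, not the nl80211.h values. */
#define RRF_NO_IR       (1u << 0)  /* new combined flag */
#define RRF_OLD_PASSIVE (1u << 1)  /* stands in for the old passive-scan flag */
#define RRF_OLD_NO_IBSS (1u << 2)  /* stands in for the old no-IBSS flag */

/* Flags from userspace: either old flag is treated as the no-ir flag. */
static uint32_t rrf_from_userspace(uint32_t flags)
{
    if (flags & (RRF_OLD_PASSIVE | RRF_OLD_NO_IBSS | RRF_NO_IR)) {
        flags &= ~(RRF_OLD_PASSIVE | RRF_OLD_NO_IBSS);
        flags |= RRF_NO_IR;
    }
    return flags;
}

/* Flags to userspace: send the old no-IBSS flag alongside no-ir. */
static uint32_t rrf_to_userspace(uint32_t flags)
{
    if (flags & RRF_NO_IR)
        flags |= RRF_OLD_NO_IBSS;
    return flags;
}
```

    Unrelated flag bits pass through both helpers unchanged.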

    Update all places in the tree using these flags with the
    following semantic patch:

    @@
    @@
    -NL80211_RRF_PASSIVE_SCAN
    +NL80211_RRF_NO_IR
    @@
    @@
    -NL80211_RRF_NO_IBSS
    +NL80211_RRF_NO_IR
    @@
    @@
    -IEEE80211_CHAN_PASSIVE_SCAN
    +IEEE80211_CHAN_NO_IR
    @@
    @@
    -IEEE80211_CHAN_NO_IBSS
    +IEEE80211_CHAN_NO_IR
    @@
    @@
    -NL80211_RRF_NO_IR | NL80211_RRF_NO_IR
    +NL80211_RRF_NO_IR
    @@
    @@
    -IEEE80211_CHAN_NO_IR | IEEE80211_CHAN_NO_IR
    +IEEE80211_CHAN_NO_IR
    @@
    @@
    -(NL80211_RRF_NO_IR)
    +NL80211_RRF_NO_IR
    @@
    @@
    -(IEEE80211_CHAN_NO_IR)
    +IEEE80211_CHAN_NO_IR

    Along with some hand-optimisations in documentation, to
    remove duplicates and to fix some indentation.

    Signed-off-by: Luis R. Rodriguez
    [do all the driver updates in one go]
    Signed-off-by: Johannes Berg

    Luis R. Rodriguez
     

25 Nov, 2013

1 commit

  • gpiolib now exports a new descriptor-based interface which deprecates
    the older integer-based one. This patch documents this new interface and
    also takes the opportunity to brush-up the GPIO documentation a little
    bit.

    The new descriptor-based interface follows the same consumer/driver
    model as many other kernel subsystems (e.g. clock, regulator), so its
    documentation has similarly been split into different files.

    The content of the former documentation has been reused whenever it
    made sense; however, some of its content did not apply to the new
    interface anymore and has thus been removed. Likewise, new sections
    like the mapping of GPIOs to devices have been written from scratch.

    The deprecated legacy-based documentation is still available, untouched,
    under Documentation/gpio/gpio-legacy.txt.

    Signed-off-by: Alexandre Courbot
    Signed-off-by: Linus Walleij

    Alexandre Courbot
     

24 Nov, 2013

1 commit

  • Pull crypto update from Herbert Xu:
    - Made x86 ablk_helper generic for ARM
    - Phase out chainiv in favour of eseqiv (affects IPsec)
    - Fixed aes-cbc IV corruption on s390
    - Added constant-time crypto_memneq which replaces memcmp
    - Fixed aes-ctr in omap-aes
    - Added OMAP3 ROM RNG support
    - Add PRNG support for MSM SoC's
    - Add and use Job Ring API in caam
    - Misc fixes
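
    For readers unfamiliar with the constant-time comparison item in the
    list above, the general technique can be sketched as follows. This is
    not the kernel's crypto_memneq(); ct_memneq_sketch() is a hypothetical
    name, and a production version needs more care to stop the compiler
    from reintroducing an early exit.

```c
#include <stddef.h>

/*
 * Compare len bytes without returning early on the first mismatch, so
 * the running time does not depend on where the buffers differ.
 * Returns 0 if the buffers are equal and nonzero otherwise.
 */
static unsigned long ct_memneq_sketch(const void *a, const void *b, size_t len)
{
    const volatile unsigned char *pa = a;
    const volatile unsigned char *pb = b;
    unsigned long diff = 0;
    size_t i;

    for (i = 0; i < len; i++)
        diff |= (unsigned long)(pa[i] ^ pb[i]);

    return diff;
}
```

    Unlike memcmp(), the result carries no ordering information; it only
    says whether the buffers differ.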

    [ NOTE! This pull request was sent within the merge window, but Herbert
    has some questionable email sending setup that makes him public enemy
    #1 as far as gmail is concerned. So most of his emails seem to be
    trapped by gmail as spam, resulting in me not seeing them. - Linus ]

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (49 commits)
    crypto: s390 - Fix aes-cbc IV corruption
    crypto: omap-aes - Fix CTR mode counter length
    crypto: omap-sham - Add missing modalias
    padata: make the sequence counter an atomic_t
    crypto: caam - Modify the interface layers to use JR API's
    crypto: caam - Add API's to allocate/free Job Rings
    crypto: caam - Add Platform driver for Job Ring
    hwrng: msm - Add PRNG support for MSM SoC's
    ARM: DT: msm: Add Qualcomm's PRNG driver binding document
    crypto: skcipher - Use eseqiv even on UP machines
    crypto: talitos - Simplify key parsing
    crypto: picoxcell - Simplify and harden key parsing
    crypto: ixp4xx - Simplify and harden key parsing
    crypto: authencesn - Simplify key parsing
    crypto: authenc - Export key parsing helper function
    crypto: mv_cesa: remove deprecated IRQF_DISABLED
    hwrng: OMAP3 ROM Random Number Generator support
    crypto: sha256_ssse3 - also test for BMI2
    crypto: mv_cesa - Remove redundant of_match_ptr
    crypto: sahara - Remove redundant of_match_ptr
    ...

    Linus Torvalds
     

23 Nov, 2013

4 commits

  • Signed-off-by: Ben Hutchings

    Ben Hutchings
     
  • Pull SCSI target updates from Nicholas Bellinger:
    "Things have been quiet this round with mostly bugfixes, percpu
    conversions, and other minor iscsi-target conformance testing changes.

    The highlights include:

    - Add demo_mode_discovery attribute for iscsi-target (Thomas)
    - Convert tcm_fc(FCoE) to use percpu-ida pre-allocation
    - Add send completion interrupt coalescing for ib_isert
    - Convert target-core to use percpu-refcounting for se_lun
    - Fix mutex_trylock usage bug in iscsit_increment_maxcmdsn
    - tcm_loop updates (Hannes)
    - target-core ALUA cleanups + prep for v3.14 SCSI Referrals support (Hannes)

    v3.14 is currently shaping to be a busy development cycle in target
    land, with initial support for T10 Referrals and T10 DIF currently on
    the roadmap"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (40 commits)
    iscsi-target: chap auth shouldn't match username with trailing garbage
    iscsi-target: fix extract_param to handle buffer length corner case
    iscsi-target: Expose default_erl as TPG attribute
    target_core_configfs: split up ALUA supported states
    target_core_alua: Make supported states configurable
    target_core_alua: Store supported ALUA states
    target_core_alua: Rename ALUA_ACCESS_STATE_OPTIMIZED
    target_core_alua: spellcheck
    target core: rename (ex,im)plict -> (ex,im)plicit
    percpu-refcount: Add percpu-refcount.o to obj-y
    iscsi-target: Do not reject non-immediate CmdSNs exceeding MaxCmdSN
    iscsi-target: Convert iscsi_session statistics to atomic_long_t
    target: Convert se_device statistics to atomic_long_t
    target: Fix delayed Task Aborted Status (TAS) handling bug
    iscsi-target: Reject unsupported multi PDU text command sequence
    ib_isert: Avoid duplicate iscsit_increment_maxcmdsn call
    iscsi-target: Fix mutex_trylock usage in iscsit_increment_maxcmdsn
    target: Core does not need blkdev.h
    target: Pass through I/O topology for block backstores
    iser-target: Avoid using FRMR for single dma entry requests
    ...

    Linus Torvalds
     
  • Pull hwmon fixes from Guenter Roeck:
    - acpi_power_meter: Fix return value check from call to
    acpi_bus_get_device
    - nct6775: Fix/improve NCT6791 support
    - lm75: Add support for GMT G751

    * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    hwmon: (acpi_power_meter) Fix acpi_bus_get_device() return value check
    hwmon: (nct6775) NCT6791 supports weight control only for CPUFAN
    hwmon: (nct6775) Monitor additional temperature registers
    hwmon: (lm75) Add support for GMT G751 chip

    Linus Torvalds
     
  • Pull btrfs fixes from Chris Mason:
    "Almost all of these are bug fixes. Dave Sterba's documentation update
    is the big exception because he removed our promises to set any
    machine running Btrfs on fire"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Documentation: filesystems: update btrfs tools section
    Documentation: filesystems: add new btrfs mount options
    btrfs: update kconfig help text
    btrfs: fix bio_size_ok() for max_sectors > 0xffff
    btrfs: Use trace condition for get_extent tracepoint
    btrfs: fix typo in the log message
    Btrfs: fix list delete warning when removing ordered root from the list
    Btrfs: print bytenr instead of page pointer in check-int
    Btrfs: remove dead codes from ctree.h
    Btrfs: don't wait for ordered data outside desired range
    Btrfs: fix lockdep error in async commit
    Btrfs: avoid heavy operations in btrfs_commit_super
    Btrfs: fix __btrfs_start_workers retval
    Btrfs: disable online raid-repair on ro mounts
    Btrfs: do not inc uncorrectable_errors counter on ro scrubs
    Btrfs: only drop modified extents if we logged the whole inode
    Btrfs: make sure to copy everything if we rename
    Btrfs: don't BUG_ON() if we get an error walking backrefs

    Linus Torvalds
     

22 Nov, 2013

5 commits

  • Merge patches from Andrew Morton:
    "13 fixes"

    * emailed patches from Andrew Morton:
    mm: place page->pmd_huge_pte to right union
    MAINTAINERS: add keyboard driver to Hyper-V file list
    x86, mm: do not leak page->ptl for pmd page tables
    ipc,shm: correct error return value in shmctl (SHM_UNLOCK)
    mm, mempolicy: silence gcc warning
    block/partitions/efi.c: fix bound check
    ARM: drivers/rtc/rtc-at91rm9200.c: disable interrupts at shutdown
    mm: hugetlbfs: fix hugetlbfs optimization
    kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS cleanly
    ipc,shm: fix shm_file deletion races
    mm: thp: give transparent hugepage code a separate copy_page
    checkpatch: fix "Use of uninitialized value" warnings
    configfs: fix race between dentry put and lookup

    Linus Torvalds
     
  • Pull security subsystem updates from James Morris:
    "In this patchset, we finally get an SELinux update, with Paul Moore
    taking over as maintainer of that code.

    Also a significant update for the Keys subsystem, as well as
    maintenance updates to Smack, IMA, TPM, and Apparmor"

    and since I wanted to know more about the updates to key handling,
    here's the explanation from David Howells on that:

    "Okay. There are a number of separate bits. I'll go over the big bits
    and the odd important other bit, most of the smaller bits are just
    fixes and cleanups. If you want the small bits accounting for, I can
    do that too.

    (1) Keyring capacity expansion.

    KEYS: Consolidate the concept of an 'index key' for key access
    KEYS: Introduce a search context structure
    KEYS: Search for auth-key by name rather than target key ID
    Add a generic associative array implementation.
    KEYS: Expand the capacity of a keyring

    Several of the patches are providing an expansion of the capacity of a
    keyring. Currently, the maximum size of a keyring payload is one page.
    Subtract a small header and then divide up into pointers, that only gives
    you ~500 pointers on an x86_64 box. However, since the NFS idmapper uses
    a keyring to store ID mapping data, that has proven to be insufficient to
    the cause.

    Whatever data structure I use to handle the keyring payload, it can only
    store pointers to keys, not the keys themselves because several keyrings
    may point to a single key. This precludes inserting, say, an rb_node
    struct into the key struct for this purpose.

    I could make an rbtree of records such that each record has an rb_node
    and a key pointer, but that would use four words of space per key stored
    in the keyring. It would, however, be able to use much existing code.

    I selected instead a non-rebalancing radix-tree type approach as that
    could have a better space-used/key-pointer ratio. I could have used the
    radix tree implementation that we already have and insert keys into it by
    their serial numbers, but that means any sort of search must iterate over
    the whole radix tree. Further, its nodes are a bit on the capacious side
    for what I want - especially given that key serial numbers are randomly
    allocated, thus leaving a lot of empty space in the tree.

    So what I have is an associative array that internally is a radix-tree
    with 16 pointers per node where the index key is constructed from the key
    type pointer and the key description. This means that an exact lookup by
    type+description is very fast as this tells us how to navigate directly to
    the target key.
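
    With 16 pointers per node, each level of the tree consumes four bits
    of the index key. The sketch below shows how a slot could be picked
    per level; it is not the lib/assoc_array.c code, the function name is
    hypothetical, and taking the high nibble of each byte first is an
    assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOTS_PER_NODE 16  /* pointers per node, from the text */
#define BITS_PER_LEVEL 4   /* log2(SLOTS_PER_NODE) */

/*
 * Return the slot (0..15) to follow at a given tree level for an index
 * key treated as a plain sequence of bits. Levels past the end of the
 * key are treated as zero bits.
 */
static unsigned int slot_at_level(const uint8_t *key, size_t key_len,
                                  size_t level)
{
    size_t byte = level / 2;

    if (byte >= key_len)
        return 0;
    if (level % 2 == 0)
        return key[byte] >> BITS_PER_LEVEL;     /* high nibble first */
    return key[byte] & (SLOTS_PER_NODE - 1);    /* then the low nibble */
}
```

    An exact lookup walks one slot per level and never scans sibling
    nodes, which is why a type+description match is fast.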

    I made the data structure general in lib/assoc_array.c as far as it is
    concerned, its index key is just a sequence of bits that leads to a
    pointer. It's possible that someone else will be able to make use of it
    also. FS-Cache might, for example.

    (2) Mark keys as 'trusted' and keyrings as 'trusted only'.

    KEYS: verify a certificate is signed by a 'trusted' key
    KEYS: Make the system 'trusted' keyring viewable by userspace
    KEYS: Add a 'trusted' flag and a 'trusted only' flag
    KEYS: Separate the kernel signature checking keyring from module signing

    These patches allow keys carrying asymmetric public keys to be marked as
    being 'trusted' and allow keyrings to be marked as only permitting the
    addition or linkage of trusted keys.

    Keys loaded from hardware during kernel boot or compiled into the kernel
    during build are marked as being trusted automatically. New keys can be
    loaded at runtime with add_key(). They are checked against the system
    keyring contents and if their signatures can be validated with keys that
    are already marked trusted, then they are marked trusted also and can
    thus be added into the master keyring.

    Patches from Mimi Zohar make this usable with the IMA keyrings also.

    (3) Remove the date checks on the key used to validate a module signature.

    X.509: Remove certificate date checks

    It's not reasonable to reject a signature just because the key that it was
    generated with is no longer valid datewise - especially if the kernel
    hasn't yet managed to set the system clock when the first module is
    loaded - so just remove those checks.

    (4) Make it simpler to deal with additional X.509 being loaded into the kernel.

    KEYS: Load *.x509 files into kernel keyring
    KEYS: Have make canonicalise the paths of the X.509 certs better to deduplicate

    The builder of the kernel now just places files with the extension ".x509"
    into the kernel source or build trees and they're concatenated by the
    kernel build and stuffed into the appropriate section.

    (5) Add support for userspace kerberos to use keyrings.

    KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches
    KEYS: Implement a big key type that can save to tmpfs

    Fedora went to, by default, storing kerberos tickets and tokens in tmpfs.
    We looked at storing it in keyrings instead as that confers certain
    advantages such as tickets being automatically deleted after a certain
    amount of time and the ability for the kernel to get at these tokens more
    easily.

    To make this work, two things were needed:

    (a) A way for the tickets to persist beyond the lifetime of all a user's
    sessions so that cron-driven processes can still use them.

    The problem is that a user's session keyrings are deleted when the
    session that spawned them logs out and the user's user keyring is
    deleted when the UID is deleted (typically when the last log out
    happens), so neither of these places is suitable.

    I've added a system keyring into which a 'persistent' keyring is
    created for each UID on request. Each time a user requests their
    persistent keyring, the expiry time on it is set anew. If the user
    doesn't ask for it for, say, three days, the keyring is automatically
    expired and garbage collected using the existing gc. All the kerberos
    tokens it held are then also gc'd.

    (b) A key type that can hold really big tickets (up to 1MB in size).

    The problem is that Active Directory can return huge tickets with lots
    of auxiliary data attached. We don't, however, want to eat up huge
    tracts of unswappable kernel space for this, so if the ticket is
    greater than a certain size, we create a swappable shmem file and dump
    the contents in there and just live with the fact we then have an
    inode and a dentry overhead. If the ticket is smaller than that, we
    slap it in a kmalloc()'d buffer"
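
    The storage choice in (b) can be sketched as a simple size check. The
    1MB ceiling comes from the text; the cut-over threshold, the enum, and
    the function name are hypothetical, since the message only says "a
    certain size".

```c
#include <stddef.h>

#define BIG_KEY_MAX_SKETCH    (1024 * 1024) /* "up to 1MB in size" */
#define BIG_KEY_CUTOVER_SKETCH 4096         /* hypothetical threshold */

enum big_key_backing {
    BACKING_REJECT,   /* larger than the key type allows */
    BACKING_KMALLOC,  /* small: unswappable kernel buffer */
    BACKING_SHMEM     /* large: swappable shmem file */
};

/* Pick where a ticket of len bytes would be stored. */
static enum big_key_backing choose_backing(size_t len)
{
    if (len > BIG_KEY_MAX_SKETCH)
        return BACKING_REJECT;
    if (len > BIG_KEY_CUTOVER_SKETCH)
        return BACKING_SHMEM;
    return BACKING_KMALLOC;
}
```

    The shmem path trades an inode and a dentry of overhead for keeping
    large tickets out of unswappable memory.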

    * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (121 commits)
    KEYS: Fix keyring content gc scanner
    KEYS: Fix error handling in big_key instantiation
    KEYS: Fix UID check in keyctl_get_persistent()
    KEYS: The RSA public key algorithm needs to select MPILIB
    ima: define '_ima' as a builtin 'trusted' keyring
    ima: extend the measurement list to include the file signature
    kernel/system_certificate.S: use real contents instead of macro GLOBAL()
    KEYS: fix error return code in big_key_instantiate()
    KEYS: Fix keyring quota misaccounting on key replacement and unlink
    KEYS: Fix a race between negating a key and reading the error set
    KEYS: Make BIG_KEYS boolean
    apparmor: remove the "task" arg from may_change_ptraced_domain()
    apparmor: remove parent task info from audit logging
    apparmor: remove tsk field from the apparmor_audit_struct
    apparmor: fix capability to not use the current task, during reporting
    Smack: Ptrace access check mode
    ima: provide hash algo info in the xattr
    ima: enable support for larger default filedata hash algorithms
    ima: define kernel parameter 'ima_template=' to change configured default
    ima: add Kconfig default measurement list template
    ...

    Linus Torvalds
     
  • There are two code paths by which a page holding a pmd page table can be freed:
    pmd_free() and pmd_free_tlb().

    I've missed the second one and didn't add page table destructor call
    there. It leads to leak of page->ptl for pmd page tables, if
    dynamically allocated page->ptl is in use.

    The patch adds the missed destructor and modifies documentation
    accordingly.

    Signed-off-by: Kirill A. Shutemov
    Reported-by: Andrey Vagin
    Tested-by: Andrey Vagin
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • The tools mentioned were obsoleted long ago; replace them
    with the current ones.

    CC: linux-doc@vger.kernel.org
    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    David Sterba
     
  • Two new options were added in 3.12: commit and rescan_uuid_tree

    CC: linux-doc@vger.kernel.org
    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    David Sterba
     

21 Nov, 2013

2 commits

  • Pull more ACPI and power management updates from Rafael Wysocki:

    - ACPI-based device hotplug fixes for issues introduced recently and a
    fix for an older error code path bug in the ACPI PCI host bridge
    driver

    - Fix for recently broken OMAP cpufreq build from Viresh Kumar

    - Fix for a recent hibernation regression related to s2disk

    - Fix for a locking-related regression in the ACPI EC driver from
    Puneet Kumar

    - System suspend error code path fix related to runtime PM and runtime
    PM documentation update from Ulf Hansson

    - cpufreq's conservative governor fix from Xiaoguang Chen

    - New processor IDs for intel_idle and turbostat and removal of an
    obsolete Kconfig option from Len Brown

    - New device IDs for the ACPI LPSS (Low-Power Subsystem) driver and
    ACPI-based PCI hotplug (ACPIPHP) cleanup from Mika Westerberg

    - Removal of several ACPI video DMI blacklist entries that are not
    necessary any more from Aaron Lu

    - Rework of the ACPI companion representation in struct device and code
    cleanup related to that change from Rafael J Wysocki, Lan Tianyu and
    Jarkko Nikula

    - Fixes for assigning names to ACPI-enumerated I2C and SPI devices from
    Jarkko Nikula

    * tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (24 commits)
    PCI / hotplug / ACPI: Drop unused acpiphp_debug declaration
    ACPI / scan: Set flags.match_driver in acpi_bus_scan_fixed()
    ACPI / PCI root: Clear driver_data before failing enumeration
    ACPI / hotplug: Fix PCI host bridge hot removal
    ACPI / hotplug: Fix acpi_bus_get_device() return value check
    cpufreq: governor: Remove fossil comment in the cpufreq_governor_dbs()
    ACPI / video: clean up DMI table for initial black screen problem
    ACPI / EC: Ensure lock is acquired before accessing ec struct members
    PM / Hibernate: Do not crash kernel in free_basic_memory_bitmaps()
    ACPI / AC: Remove struct acpi_device pointer from struct acpi_ac
    spi: Use stable dev_name for ACPI enumerated SPI slaves
    i2c: Use stable dev_name for ACPI enumerated I2C slaves
    ACPI: Provide acpi_dev_name accessor for struct acpi_device device name
    ACPI / bind: Use (put|get)_device() on ACPI device objects too
    ACPI: Eliminate the DEVICE_ACPI_HANDLE() macro
    ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node
    cpufreq: OMAP: Fix compilation error 'r & ret undeclared'
    PM / Runtime: Fix error path for prepare
    PM / Runtime: Update documentation around probe|remove|suspend
    cpufreq: conservative: set requested_freq to policy max when it is over policy max
    ...

    Linus Torvalds
     
  • Pull slave-dmaengine changes from Vinod Koul:
    "This brings for slave dmaengine:

    - Change dma notification flag to DMA_COMPLETE from DMA_SUCCESS as
    dmaengine can only transfer and not verify validity of dma
    transfers

    - Bunch of fixes across drivers:

    - cppi41 driver fixes from Daniel

    - 8 channel freescale dma engine support and updated bindings from
    Hongbo

    - mxs-dma fixes and cleanup by Markus

    - DMAengine updates from Dan:

    - Bartlomiej and Dan finalized a rework of the dma address unmap
    implementation.

    - In the course of testing 1/ a collection of enhancements to
    dmatest fell out. Notably basic performance statistics, and
    fixed / enhanced test control through new module parameters
    'run', 'wait', 'noverify', and 'verbose'. Thanks to Andriy and
    Linus [Walleij] for their review.

    - Testing the raid related corner cases of 1/ triggered bugs in
    the recently added 16-source operation support in the ioatdma
    driver.

    - Some minor fixes / cleanups to mv_xor and ioatdma"

    * 'next' of git://git.infradead.org/users/vkoul/slave-dma: (99 commits)
    dma: mv_xor: Fix mis-usage of mmio 'base' and 'high_base' registers
    dma: mv_xor: Remove unneeded NULL address check
    ioat: fix ioat3_irq_reinit
    ioat: kill msix_single_vector support
    raid6test: add new corner case for ioatdma driver
    ioatdma: clean up sed pool kmem_cache
    ioatdma: fix selection of 16 vs 8 source path
    ioatdma: fix sed pool selection
    ioatdma: Fix bug in selftest after removal of DMA_MEMSET.
    dmatest: verbose mode
    dmatest: convert to dmaengine_unmap_data
    dmatest: add a 'wait' parameter
    dmatest: add basic performance metrics
    dmatest: add support for skipping verification and random data setup
    dmatest: use pseudo random numbers
    dmatest: support xor-only, or pq-only channels in tests
    dmatest: restore ability to start test at module load and init
    dmatest: cleanup redundant "dmatest: " prefixes
    dmatest: replace stored results mechanism, with uniform messages
    Revert "dmatest: append verify result to results"
    ...

    Linus Torvalds
     

20 Nov, 2013

3 commits

  • Pull networking fixes from David Miller:
    "Mostly these are fixes for fallout due to merge window changes, as
    well as cures for problems that have been with us for a much longer
    period of time"

    1) Johannes Berg noticed two major deficiencies in our genetlink
    registration. Some genetlink protocols were passing in constant
    counts for their ops array rather than something like
    ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
    using fixed IDs for their multicast groups.

    We have to retain these fixed IDs to keep existing userland tools
    working, but reserve them so that other multicast groups used by
    other protocols can not possibly conflict.

    In dealing with these two problems, we actually now use less state
    management for genetlink operations and multicast groups.

    2) When configuring interface hardware timestamping, fix several
    drivers that simply do not validate that the hwtstamp_config value
    is one the driver actually supports. From Ben Hutchings.

    3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.

    4) In dev_forward_skb(), set the skb->protocol in the right order
    relative to skb_scrub_packet(). From Alexei Starovoitov.

    5) Bridge erroneously fails to use the proper wrapper functions to make
    calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
    Makita.

    6) When detaching a bridge port, make sure to flush all VLAN IDs to
    prevent them from leaking, also from Toshiaki Makita.

    7) Put in a compromise for TCP Small Queues so that deep queued devices
    that delay TX reclaim non-trivially don't have such a performance
    decrease. One particularly problematic area is 802.11 AMPDU in
    wireless. From Eric Dumazet.

    8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
    here. Fix from Eric Dumazet, reported by Dave Jones.

    9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.

    10) When computing mergeable buffer sizes, virtio-net fails to take the
    virtio-net header into account. From Michael Dalton.

    11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
    bumping, this one has been with us for a while. From Eric Dumazet.

    12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
    Hugne.

    13) 6lowpan bit used for traffic classification was wrong, from Jukka
    Rissanen.

    14) macvlan has the same issue as normal vlans did wrt. propagating LRO
    disabling down to the real device, fix it the same way. From Michal
    Kubecek.

    15) CPSW driver needs to soft reset all slaves during suspend, from
    Daniel Mack.

    16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.

    17) The xen-netfront RX buffer refill timer isn't properly scheduled on
    partial RX allocation success, from Ma JieYue.

    18) When ipv6 ping protocol support was added, the AF_INET6 protocol
    initialization cleanup path on failure was borked a little. Fix
    from Vlad Yasevich.

    19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
    blocks we can do the wrong thing with the msg_name we write back to
    userspace. From Hannes Frederic Sowa. There is another fix in the
    works from Hannes which will prevent future problems of this nature.

    20) Fix route leak in VTI tunnel transmit, from Fan Du.
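
    Item 1 above contrasts hand-written constant counts with something
    like ARRAY_SIZE(ops). The following is a minimal userspace sketch of
    that idea only. It is not the genetlink API: struct demo_op,
    demo_ops, demo_register_ops() and demo_register_all() are
    hypothetical names, and ARRAY_SIZE() is redefined locally as a
    stand-in for the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's ARRAY_SIZE() macro. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical operation descriptor; NOT the kernel's struct genl_ops. */
struct demo_op {
	int cmd;
	const char *name;
};

/* Adding or removing an entry here changes ARRAY_SIZE(demo_ops) with it. */
static const struct demo_op demo_ops[] = {
	{ 1, "get" },
	{ 2, "set" },
	{ 3, "dump" },
};

/*
 * Hypothetical registration helper: walks n_ops entries and returns how
 * many carry a valid (positive) command number. A caller passing a stale
 * hand-written count would either skip entries or read past the array.
 */
size_t demo_register_ops(const struct demo_op *ops, size_t n_ops)
{
	size_t i, registered = 0;

	assert(ops != NULL);
	for (i = 0; i < n_ops; i++) {
		if (ops[i].cmd > 0)
			registered++;
	}
	return registered;
}

/* The count is derived from the array itself, so it cannot go stale. */
size_t demo_register_all(void)
{
	return demo_register_ops(demo_ops, ARRAY_SIZE(demo_ops));
}
```

    Deriving the count from the array means the two cannot drift apart
    when an operation is added later, which is the failure mode the
    constant counts invited.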

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
    genetlink: make multicast groups const, prevent abuse
    genetlink: pass family to functions using groups
    genetlink: add and use genl_set_err()
    genetlink: remove family pointer from genl_multicast_group
    genetlink: remove genl_unregister_mc_group()
    hsr: don't call genl_unregister_mc_group()
    quota/genetlink: use proper genetlink multicast APIs
    drop_monitor/genetlink: use proper genetlink multicast APIs
    genetlink: only pass array to genl_register_family_with_ops()
    tcp: don't update snd_nxt, when a socket is switched from repair mode
    atm: idt77252: fix dev refcnt leak
    xfrm: Release dst if this dst is improper for vti tunnel
    netlink: fix documentation typo in netlink_set_err()
    be2net: Delete secondary unicast MAC addresses during be_close
    be2net: Fix unconditional enabling of Rx interface options
    net, virtio_net: replace the magic value
    ping: prevent NULL pointer dereference on write to msg_name
    bnx2x: Prevent "timeout waiting for state X"
    bnx2x: prevent CFC attention
    bnx2x: Prevent panic during DMAE timeout
    ...

    Linus Torvalds
     
  • Pull second set of ARC changes from Vineet Gupta:
    - Support for Perf from Mischa
    - Enabling GPIO/Pinctrl drivers for Abilis TB10x platform
    - New defconfig for buildroot

    * tag 'arc-v3.13-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    ARC: [plat-arcfpga] Add defconfig without initramfs location
    ARC: perf: ARC 700 PMU doesn't support sampling events
    ARC: Add documentation on DT binding for ARC700 PMU
    ARC: Add perf support for ARC700 cores
    ARC: [TB10x] Updates for GPIO and pinctrl

    Linus Torvalds
     
  • SIOCSHWTSTAMP returns the real configuration to the application
    using it, but there is currently no way for any other
    application to find out the configuration non-destructively.
    Add a new ioctl for this, making it unprivileged.

    Signed-off-by: Ben Hutchings

    Ben Hutchings
     

19 Nov, 2013

5 commits

  • * pm-runtime:
    PM / Runtime: Fix error path for prepare
    PM / Runtime: Update documentation around probe|remove|suspend

    Rafael J. Wysocki
     
  • Pull watchdog changes from Wim Van Sebroeck:
    - addition of MOXA ART watchdog driver (moxart_wdt)
    - addition of CSR SiRFprimaII and SiRFatlasVI watchdog driver
    (sirfsoc_wdt)
    - addition of ralink watchdog driver (rt2880_wdt)
    - various fixes and cleanups (__user annotation, ioctl return codes,
    removal of redundant of_match_ptr, removal of unnecessary
    amba_set_drvdata(), use allocated buffer for usb_control_msg, ...)
    - removal of MODULE_ALIAS_MISCDEV statements
    - watchdog related DT bindings
    - first set of improvements on the w83627hf_wdt driver

    * git://www.linux-watchdog.org/linux-watchdog: (26 commits)
    watchdog: w83627hf: Use helper functions to access superio registers
    watchdog: w83627hf: Enable watchdog device only if not already enabled
    watchdog: w83627hf: Enable watchdog only once
    watchdog: w83627hf: Convert to watchdog infrastructure
    watchdog: omap_wdt: raw read and write endian fix
    watchdog: sirf: don't depend on dummy value of CLOCK_TICK_RATE
    watchdog: pcwd_usb: overflow in usb_pcwd_send_command()
    watchdog: rt2880_wdt: fix return value check in rt288x_wdt_probe()
    watchdog: watchdog_core: Fix a trivial typo
    watchdog: dw: Enable OF support for DW watchdog timer
    watchdog: Get rid of MODULE_ALIAS_MISCDEV statements
    watchdog: ts72xx_wdt: Propagate return value from timeout_to_regval
    watchdog: pcwd_usb: Use allocated buffer for usb_control_msg
    watchdog: sp805_wdt: Remove unnecessary amba_set_drvdata()
    watchdog: sirf: add watchdog driver of CSR SiRFprimaII and SiRFatlasVI
    watchdog: Remove redundant of_match_ptr
    watchdog: ts72xx_wdt: cleanup return codes in ioctl
    documentation/devicetree: Move DT bindings from gpio to watchdog
    watchdog: add ralink watchdog driver
    watchdog: Add MOXA ART watchdog driver
    ...

    Linus Torvalds
     
  • Pull i2c changes from Wolfram Sang:
    - new drivers for exynos5, bcm kona, and st micro
    - bigger overhauls for drivers mxs and rcar
    - typical driver bugfixes, cleanups, improvements
    - got rid of the superfluous 'driver' member in i2c_client struct. This
    touches a few drivers in other subsystems. All acked.

    * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
    i2c: bcm-kona: fix error return code in bcm_kona_i2c_probe()
    i2c: i2c-eg20t: do not print error message in syslog if no ACK received
    i2c: bcm-kona: Introduce Broadcom I2C Driver
    i2c: cbus-gpio: Fix device tree binding
    i2c: wmt: add missing clk_disable_unprepare() on error
    i2c: designware: add new ACPI IDs
    i2c: i801: Add Device IDs for Intel Wildcat Point-LP PCH
    i2c: exynos5: Remove incorrect clk_disable_unprepare
    i2c: i2c-st: Add ST I2C controller
    i2c: exynos5: add High Speed I2C controller driver
    i2c: rcar: fixup rcar type naming
    i2c: scmi: remove some bogus NULL checks
    i2c: sh_mobile & rcar: Enable the driver on all ARM platforms
    i2c: sh_mobile: Convert to clk_prepare/unprepare
    i2c: mux: gpio: use reg value for i2c_add_mux_adapter
    i2c: mux: gpio: use gpio_set_value_cansleep()
    i2c: Include linux/of.h header
    i2c: mxs: Fix PIO mode on i.MX23
    i2c: mxs: Rework the PIO mode operation
    i2c: mxs: distinguish i.MX23 and i.MX28 based I2C controller
    ...

    Linus Torvalds
     
  • Pull infiniband/rdma updates from Roland Dreier:
    - Re-enable flow steering verbs with new improved userspace ABI
    - Fixes for slow connection due to GID lookup scalability
    - IPoIB fixes
    - Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
    - Further improvements to SRP error handling
    - Add new transport type for Cisco usNIC

    * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
    IB/core: Re-enable create_flow/destroy_flow uverbs
    IB/core: extended command: an improved infrastructure for uverbs commands
    IB/core: Remove ib_uverbs_flow_spec structure from userspace
    IB/core: Use a common header for uverbs flow_specs
    IB/core: Make uverbs flow structure use names like verbs ones
    IB/core: Rename 'flow' structs to match other uverbs structs
    IB/core: clarify overflow/underflow checks on ib_create/destroy_flow
    IB/ucma: Convert use of typedef ctl_table to struct ctl_table
    IB/cm: Convert to using idr_alloc_cyclic()
    IB/mlx5: Fix page shift in create CQ for userspace
    IB/mlx4: Fix device max capabilities check
    IB/mlx5: Fix list_del of empty list
    IB/mlx5: Remove dead code
    IB/core: Encorce MR access rights rules on kernel consumers
    IB/mlx4: Fix endless loop in resize CQ
    RDMA/cma: Remove unused argument and minor dead code
    RDMA/ucma: Discard events for IDs not yet claimed by user space
    IB/core: Add Cisco usNIC rdma node and transport types
    RDMA/nes: Remove self-assignment from nes_query_qp()
    IB/srp: Report receive errors correctly
    ...

    Linus Torvalds
     
  • Pull battery updates from Anton Vorontsov:
    "Highlights:
    - A new driver for TI BQ24735 Battery Chargers, courtesy of NVidia.
    - Device tree bindings for TWL4030 chips.
    - Random fixes and cleanups"

    * tag 'for-v3.13' of git://git.infradead.org/battery-2.6:
    pm2301-charger: Remove unneeded NULL checks
    twl4030_charger: Add devicetree support
    power_supply: Fix documentation for TEMP_*ALERT* properties
    max17042_battery: Support regmap to access device's registers
    max17042_battery: Use SIMPLE_DEV_PM_OPS
    charger-manager : Replace kzalloc to devm_kzalloc and remove uneccessary code
    bq2415x_charger: Fix max battery regulation voltage
    tps65090-charger: Use "IS_ENABLED(CONFIG_OF)" for DT code
    tps65090-charger: Drop devm_free_irq of devm_ allocated irq
    power_supply: Add support for bq24735 charger
    pm2301-charger: Staticize pm2xxx_charger_die_therm_mngt
    pm2301-charger: Check return value of regulator_enable
    ab8500-charger: Remove redundant break
    ab8500-charger: Check return value of regulator_enable
    isp1704_charger: Fix driver to work with changes introduced in v3.5

    Linus Torvalds