03 Jan, 2013

9 commits

  • Notice that acpi_bus_add() uses only 2 of its 4 arguments and
    redefine its header to match the body. Update all of its callers as
    necessary and observe that this leads to quite a number of removed
    lines of code (Linus will like that).

    Add a kerneldoc comment documenting acpi_bus_add() and wonder how
    its callers make wrong assumptions about the second argument (make
    note to self to take care of that later).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Yinghai Lu
    Acked-by: Toshi Kani

    Rafael J. Wysocki
     
  • The ACPI PCI root bridge driver was the only ACPI driver implementing
    the .start() callback, which isn't used by any ACPI drivers any more
    now.

    For this reason, acpi_start_single_object() has no purpose any more,
    so remove it and all references to it. Also remove
    acpi_bus_start_device(), whose only purpose was to call
    acpi_start_single_object().

    Moreover, since after the removal of acpi_bus_start_device() the
    only purpose of acpi_bus_start() remains to call
    acpi_update_all_gpes(), move that into acpi_bus_add() and drop
    acpi_bus_start() too, remove its header from acpi_bus.h and
    update all of its former users accordingly.

    This change was previously proposed in a different form by
    Yinghai Lu.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Yinghai Lu
    Acked-by: Toshi Kani

    Rafael J. Wysocki
     
  • Notice that one member of struct acpi_bus_ops, acpi_op_add, is not
    used anywhere any more and the relationship between its remaining
    members, acpi_op_match and acpi_op_start, is such that it doesn't
    make sense to set the latter without setting the former at the same
    time. Therefore, replace struct acpi_bus_ops with a new enum type,
    enum acpi_bus_add_type, with three values, ACPI_BUS_ADD_BASIC,
    ACPI_BUS_ADD_MATCH, ACPI_BUS_ADD_START, corresponding to
    both acpi_op_match and acpi_op_start unset, acpi_op_match set and
    acpi_op_start unset, and both acpi_op_match and acpi_op_start set,
    respectively.
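
    The mapping from the old flag pair to the new enum can be sketched
    roughly as below. This is an illustrative user-space model, not the
    kernel code: the helper ex_to_add_type() and the order of the
    enumerators are assumptions; only the three enumerator names come
    from the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Enumerator names are from the commit message; their order is assumed. */
enum acpi_bus_add_type {
    ACPI_BUS_ADD_BASIC, /* acpi_op_match unset, acpi_op_start unset */
    ACPI_BUS_ADD_MATCH, /* acpi_op_match set,   acpi_op_start unset */
    ACPI_BUS_ADD_START, /* acpi_op_match set,   acpi_op_start set   */
};

/* Hypothetical helper: collapse the two old flags into one enum value. */
static enum acpi_bus_add_type ex_to_add_type(bool op_match, bool op_start)
{
    /* Setting start without match made no sense, so it has no enum value. */
    assert(op_match || !op_start);

    if (op_start)
        return ACPI_BUS_ADD_START;
    return op_match ? ACPI_BUS_ADD_MATCH : ACPI_BUS_ADD_BASIC;
}
```

    The fourth combination (start without match) is exactly the one the
    message calls meaningless, which is why three values are enough.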

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Yinghai Lu
    Acked-by: Toshi Kani

    Rafael J. Wysocki
     
  • Split the ACPI namespace scanning for devices into two passes, such
    that struct acpi_device objects are registered in the first pass
    without probing ACPI drivers and the drivers are probed against them
    directly in the second pass.

    There are two main reasons for doing that.

    First, the ACPI PCI root bridge driver's .add() routine,
    acpi_pci_root_add(), causes struct pci_dev objects to be created for
    all PCI devices under the given root bridge. Usually, there are
    corresponding ACPI device nodes in the ACPI namespace for some of
    those devices and therefore there should be "companion" struct
    acpi_device objects to attach those struct pci_dev objects to. These
    struct acpi_device objects should exist when the corresponding
    struct pci_dev objects are created, but that is only guaranteed
    during boot and not during hotplug. This leads to a number of
    functional differences between the boot and the hotplug cases which
    are not strictly necessary and make the code more complicated.

    For example, this forces the ACPI PCI root bridge driver to defer the
    registration of the just created struct pci_dev objects and to use a
    special .start() callback routine, acpi_pci_root_start(), to make
    sure that all of the "companion" struct acpi_device objects will be
    present at PCI devices registration time during hotplug.

    If those differences can be eliminated, we will be able to
    consolidate the boot and hotplug code paths for the enumeration and
    registration of PCI devices and to reduce the complexity of that
    code quite a bit.

    The second reason is that, in general, it should be possible to
    resolve conflicts of resources assigned by the BIOS to different
    devices represented by ACPI namespace nodes before any drivers bind
    to them and before they are attached to "companion" objects
    representing physical devices (such as struct pci_dev). However, for
    this purpose we first need to enumerate all ACPI device nodes in the
    given namespace scope.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Yinghai Lu
    Acked-by: Toshi Kani

    Rafael J. Wysocki
     
  • Pull PCI updates from Bjorn Helgaas:
    "Some fixes for v3.8. They include a fix for the new SR-IOV sysfs
    management support, an expanded quirk for Ricoh SD card readers, a
    Stratus DMI quirk fix, and a PME polling fix."

    * tag '3.8-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
    PCI: Reduce Ricoh 0xe822 SD card reader base clock frequency to 50MHz
    PCI/PM: Do not suspend port if any subordinate device needs PME polling
    PCI: Add PCIe Link Capability link speed and width names
    PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)
    PCI: Remove spurious error for sriov_numvfs store and simplify flow

    Linus Torvalds
     
  • Empty files can get deleted by the patch program, so remove empty Kbuild
    files and their links from the parent Kbuilds.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Sasha was fuzzing with trinity and reported the following problem:

    BUG: sleeping function called from invalid context at kernel/mutex.c:269
    in_atomic(): 1, irqs_disabled(): 0, pid: 6361, name: trinity-main
    2 locks held by trinity-main/6361:
    #0: (&mm->mmap_sem){++++++}, at: [] __do_page_fault+0x1e4/0x4f0
    #1: (&(&mm->page_table_lock)->rlock){+.+...}, at: [] handle_pte_fault+0x3f7/0x6a0
    Pid: 6361, comm: trinity-main Tainted: G W
    3.7.0-rc2-next-20121024-sasha-00001-gd95ef01-dirty #74
    Call Trace:
    __might_sleep+0x1c3/0x1e0
    mutex_lock_nested+0x29/0x50
    mpol_shared_policy_lookup+0x2e/0x90
    shmem_get_policy+0x2e/0x30
    get_vma_policy+0x5a/0xa0
    mpol_misplaced+0x41/0x1d0
    handle_pte_fault+0x465/0x6a0

    This was triggered by a different version of automatic NUMA balancing
    but in theory the current version is vulnerable to the same problem.

    do_numa_page
    -> numa_migrate_prep
    -> mpol_misplaced
    -> get_vma_policy
    -> shmem_get_policy

    It's very unlikely this will happen as shared pages are not marked
    pte_numa -- see the page_mapcount() check in change_pte_range() -- but
    it is possible.

    To address this, this patch restores sp->lock as originally implemented
    by Kosaki Motohiro. In the path where get_vma_policy() is called, it
    should not be calling sp_alloc() so it is not necessary to treat the PTL
    specially.

    Signed-off-by: KOSAKI Motohiro
    Tested-by: KOSAKI Motohiro
    Signed-off-by: Mel Gorman
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Pull ext4 bug fixes from Ted Ts'o:
    "Various bug fixes for ext4. Perhaps the most serious bug fixed is one
    which could cause file system corruptions when performing file punch
    operations."

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: avoid hang when mounting non-journal filesystems with orphan list
    ext4: lock i_mutex when truncating orphan inodes
    ext4: do not try to write superblock on ro remount w/o journal
    ext4: include journal blocks in df overhead calcs
    ext4: remove unaligned AIO warning printk
    ext4: fix an incorrect comment about i_mutex
    ext4: fix deadlock in journal_unmap_buffer()
    ext4: split off ext4_journalled_invalidatepage()
    jbd2: fix assertion failure in jbd2_journal_flush()
    ext4: check dioread_nolock on remount
    ext4: fix extent tree corruption caused by hole punch

    Linus Torvalds
     
  • Remove the unused argument (formerly no_context) from mpol_parse_str()
    and from mpol_to_str().

    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

31 Dec, 2012

1 commit

  • Pull DRM update from Dave Airlie:
    "This is a bit larger due to me not bothering to do anything since
    before Xmas, and other people working too hard after I had clearly
    given up.

    It's got the 3 main x86 driver fixes pulls, and a bunch of tegra
    fixes, doesn't fix the Ironlake bug yet, but that does seem to be
    getting closer.

    - radeon: gpu reset fixes and userspace packet support
    - i915: watermark fixes, workarounds, i830/845 fix,
    - nouveau: nvd9/kepler microcode fixes, accel is now enabled and
    working, gk106 support
    - tegra: misc fixes."

    * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (34 commits)
    Revert "drm: tegra: protect DC register access with mutex"
    drm: tegra: program only one window during modeset
    drm: tegra: clean out old gem prototypes
    drm: tegra: remove redundant tegra2_tmds_config entry
    drm: tegra: protect DC register access with mutex
    drm: tegra: don't leave clients host1x member uninitialized
    drm: tegra: fix front_porch back_porch mixup
    drm/nve0/graph: fix fuc, and enable acceleration on all known chipsets
    drm/nvc0/graph: fix fuc, and enable acceleration on GF119
    drm/nouveau/bios: cache ramcfg strap on later chipsets
    drm/nouveau/mxm: silence output if no bios data
    drm/nouveau/bios: parse/display extra version component
    drm/nouveau/bios: implement opcode 0xa9
    drm/nouveau/bios: update gpio parsing apis to match current design
    drm/nouveau: initial support for GK106
    drm/radeon: add WAIT_UNTIL to evergreen VM safe reg list
    drm/i915: disable shrinker lock stealing for create_mmap_offset
    drm/i915: optionally disable shrinker lock stealing
    drm/i915: fix flags in dma buf exporting
    drm/radeon: add support for MEM_WRITE packet
    ...

    Linus Torvalds
     

30 Dec, 2012

1 commit

  • Some fixes for 3.8:
    - Watermark fixups from Chris Wilson (4 pieces).
    - 2 snb workarounds, seem to be recently added to our internal DB.
    - workaround for the infamous i830/i845 hang, seems now finally solid!
    Based on Chris' fix for SNA, now also for UXA/mesa&old SNA.
    - Some more fixlets for shrinker-pulls-the-rug issues (Chris&me).
    - Fix dma-buf flags when exporting (you).
    - Disable the VGA plane if it's enabled on lid open - similar fix in
    spirit to the one I've sent you last week, BIOS' really like to mess
    with the display when closing the lid (awesome debug work from Krzysztof
    Mazur).

    * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
    drm/i915: disable shrinker lock stealing for create_mmap_offset
    drm/i915: optionally disable shrinker lock stealing
    drm/i915: fix flags in dma buf exporting
    i915: ensure that VGA plane is disabled
    drm/i915: Preallocate the drm_mm_node prior to manipulating the GTT drm_mm manager
    drm: Export routines for inserting preallocated nodes into the mm manager
    drm/i915: don't disable disconnected outputs
    drm/i915: Implement workaround for broken CS tlb on i830/845
    drm/i915: Implement WaSetupGtModeTdRowDispatch
    drm/i915: Implement WaDisableHiZPlanesWhenMSAAEnabled
    drm/i915: Prefer CRTC 'active' rather than 'enabled' during WM computations
    drm/i915: Clear self-refresh watermarks when disabled
    drm/i915: Double the cursor self-refresh latency on Valleyview
    drm/i915: Fixup cursor latency used for IVB lp3 watermarks

    Dave Airlie
     

28 Dec, 2012

2 commits

  • Pull namespace fixes from Eric Biederman:
    "This tree includes two bug fixes for problems Oleg spotted on his
    review of the recent pid namespace work. A small fix to not enable
    bottom halves with irqs disabled, and a trivial build fix for f2fs
    with user namespaces enabled."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    f2fs: Don't assign e_id in f2fs_acl_from_disk
    proc: Allow proc_free_inum to be called from any context
    pidns: Stop pid allocation when init dies
    pidns: Outlaw thread creation after unshare(CLONE_NEWPID)

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) GRE tunnel drivers don't set the transport header properly, they also
    blindly deref the inner protocol ipv4 and needs some checks. Fixes
    from Isaku Yamahata.

    2) Fix sleeps while atomic in netdevice rename code, from Eric Dumazet.

    3) Fix double-spinlock in solos-pci driver, from Dan Carpenter.

    4) More ARP bug fixes. Fix lockdep splat in arp_solicit() and then the
    bug accidentally added by that fix. From Eric Dumazet and Cong Wang.

    5) Remove some __dev* annotations that slipped back in, as well as all
    HOTPLUG references. From Greg KH

    6) RDS protocol uses wrong interfaces to access scatter-gather elements,
    causing a regression. From Mike Marciniszyn.

    7) Fix build error in cpts driver, from Richard Cochran.

    8) Fix arithmetic in packet scheduler, from Stefan Hasko.

    9) Similarly, fix association during calculation of random backoff in
    batman-adv. From Akinobu Mita.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
    ipv6/ip6_gre: set transport header correctly
    ipv4/ip_gre: set transport header correctly to gre header
    IB/rds: suppress incompatible protocol when version is known
    IB/rds: Correct ib_api use with gs_dma_address/sg_dma_len
    net/vxlan: Use the underlying device index when joining/leaving multicast groups
    tcp: should drop incoming frames without ACK flag set
    netprio_cgroup: define sk_cgrp_prioidx only if NETPRIO_CGROUP is enabled
    cpts: fix a run time warn_on.
    cpts: fix build error by removing useless code.
    batman-adv: fix random jitter calculation
    arp: fix a regression in arp_solicit()
    net: sched: integer overflow fix
    CONFIG_HOTPLUG removal from networking core
    Drivers: network: more __dev* removal
    bridge: call br_netpoll_disable in br_add_if
    ipv4: arp: fix a lockdep splat in arp_solicit()
    tuntap: dont use a private kmem_cache
    net: devnet_rename_seq should be a seqcount
    ip_gre: fix possible use after free
    ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally
    ...

    Linus Torvalds
     

27 Dec, 2012

4 commits

  • Unfortunately with !CONFIG_PAGEFLAGS_EXTENDED, (!PageHead) is false, and
    (PageHead) is true, for tail pages. If this is indeed the intended
    behavior, which I doubt because it breaks cache cleaning on some ARM
    systems, then the nomenclature is highly problematic.

    This patch makes sure PageHead is only true for head pages and PageTail
    is only true for tail pages, and neither is true for non-compound pages.

    [ This buglet seems ancient - seems to have been introduced back in Apr
    2008 in commit 6a1e7f777f61: "pageflags: convert to the use of new
    macros". And the reason nobody noticed is because the PageHead()
    tests are almost all about just sanity-checking, and only used on
    pages that are actual page heads. The fact that the old code returned
    true for tail pages too was thus not really noticeable. - Linus ]
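
    A simplified model of the bug and the fix, assuming that every page
    of a compound page carries one "compound" bit and that tail pages
    carry one additional marker bit. The bit positions and all EX_/ex_
    names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_PG_COMPOUND (1ul << 0) /* assumed: set on head and tail pages */
#define EX_PG_TAILMARK (1ul << 1) /* assumed: additionally set on tail pages */

#define EX_HEAD_MASK EX_PG_COMPOUND
#define EX_TAIL_MASK (EX_PG_COMPOUND | EX_PG_TAILMARK)

/* Old-style test: only looks at the compound bit, so tail pages match too. */
static bool ex_page_head_old(unsigned long flags)
{
    return (flags & EX_PG_COMPOUND) != 0;
}

/* Fixed tests: compare against the full mask so head and tail are disjoint. */
static bool ex_page_head(unsigned long flags)
{
    return (flags & EX_TAIL_MASK) == EX_HEAD_MASK;
}

static bool ex_page_tail(unsigned long flags)
{
    return (flags & EX_TAIL_MASK) == EX_TAIL_MASK;
}
```

    In this model ex_page_head_old() returns true for a tail page, while
    ex_page_head() and ex_page_tail() are mutually exclusive and both
    false for a page with neither bit set.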

    Signed-off-by: Christoffer Dall
    Acked-by: Andrea Arcangeli
    Cc: Andrew Morton
    Cc: Will Deacon
    Cc: Steve Capper
    Cc: Christoph Lameter
    Cc: stable@kernel.org # 2.6.26+
    Signed-off-by: Linus Torvalds

    Christoffer Dall
     
  • sock->sk_cgrp_prioidx won't be used at all if CONFIG_NETPRIO_CGROUP=n.

    Signed-off-by: Li Zefan
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Li Zefan
     
  • Otherwise it fails like this on cards like the Transcend 16GB SDHC card:

    mmc0: new SDHC card at address b368
    mmcblk0: mmc0:b368 SDC 15.0 GiB
    mmcblk0: error -110 sending status command, retrying
    mmcblk0: error -84 transferring data, sector 0, nr 8, cmd response 0x900, card status 0xb0

    Tested on my Lenovo x200 laptop.

    [bhelgaas: changelog]
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Bjorn Helgaas
    Acked-by: Chris Ball
    CC: Manoj Iyer
    CC: stable@vger.kernel.org

    Andy Lutomirski
     
  • Add standard #defines for the Supported Link Speeds field in the PCIe
    Link Capabilities register.

    Note that prior to PCIe spec r3.0, these encodings were defined:

    0001b 2.5GT/s Link speed supported
    0010b 5.0GT/s and 2.5GT/s Link speed supported

    Starting with spec r3.0, these encodings refer to bits 0 and 1 in the
    Supported Link Speeds Vector in the Link Capabilities 2 register, and bits
    0 and 1 there mean 2.5 GT/s and 5.0 GT/s, respectively. Therefore, code
    that followed r2.0 and interpreted 0x1 as 2.5GT/s and 0x2 as 5.0GT/s will
    continue to work, and we can identify a device using the new encodings
    because it will have a non-zero Link Capabilities 2 register.
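
    A minimal decoding sketch for the two encodings listed above. The
    EX_ macro names and ex_max_link_speed() are hypothetical; the values
    0x1 and 0x2 and the 4-bit Supported Link Speeds field are as
    described in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the field occupies the low 4 bits of Link Capabilities. */
#define EX_LNKCAP_SLS       0x0000000fu /* Supported Link Speeds field */
#define EX_LNKCAP_SLS_2_5GB 0x1u        /* 0001b: 2.5GT/s */
#define EX_LNKCAP_SLS_5_0GB 0x2u        /* 0010b: 5.0GT/s and 2.5GT/s */

/*
 * Return the highest supported link speed in units of 0.1 GT/s,
 * or 0 for an encoding this sketch does not know about.
 */
static unsigned int ex_max_link_speed(uint32_t lnkcap)
{
    switch (lnkcap & EX_LNKCAP_SLS) {
    case EX_LNKCAP_SLS_2_5GB:
        return 25;
    case EX_LNKCAP_SLS_5_0GB:
        return 50;
    default:
        return 0;
    }
}
```

    Bits outside the low nibble are masked off, so the full register
    value can be passed in directly.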

    Signed-off-by: Bjorn Helgaas

    Bjorn Helgaas
     

26 Dec, 2012

3 commits

  • Oleg pointed out that in a pid namespace the sequence:
    - pid 1 becomes a zombie
    - setns(thepidns), fork,...
    - reaping pid 1.
    - The injected processes exiting.

    can lead to processes attempting to access their child reaper and
    instead following a stale pointer.

    That waitpid for init can return before all of the processes in
    the pid namespace have exited is also unfortunate.

    Avoid these problems by disabling the allocation of new pids in a pid
    namespace when init dies, instead of when the last process in a pid
    namespace is reaped.

    Pointed-out-by: Oleg Nesterov
    Reviewed-by: Oleg Nesterov
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • We cannot wait for transaction commit in journal_unmap_buffer()
    because we hold page lock which ranks below transaction start. We
    solve the issue by bailing out of journal_unmap_buffer() and
    jbd2_journal_invalidatepage() with -EBUSY. The caller is then responsible
    for waiting for the transaction commit to finish and trying the invalidation
    again. Since the issue can happen only for a page straddling i_size, it
    is simple enough to manually call jbd2_journal_invalidatepage() for
    such page from ext4_setattr(), check the return value and wait if
    necessary.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • In data=journal mode we don't need delalloc or DIO handling in invalidatepage
    and similarly in other modes we don't need the journal handling. So split
    invalidatepage implementations.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     

22 Dec, 2012

10 commits

  • Pull watchdog updates from Wim Van Sebroeck:
    "This includes some fixes and code improvements (like
    clk_prepare_enable and clk_disable_unprepare), conversion from the
    omap_wdt and twl4030_wdt drivers to the watchdog framework, addition
    of the SB8x0 chipset support and the DA9055 Watchdog driver and some
    OF support for the davinci_wdt driver."

    * git://www.linux-watchdog.org/linux-watchdog: (22 commits)
    watchdog: mei: avoid oops in watchdog unregister code path
    watchdog: Orion: Fix possible null-deference in orion_wdt_probe
    watchdog: sp5100_tco: Add SB8x0 chipset support
    watchdog: davinci_wdt: add OF support
    watchdog: da9052: Fix invalid free of devm_ allocated data
    watchdog: twl4030_wdt: Change TWL4030_MODULE_PM_RECEIVER to TWL_MODULE_PM_RECEIVER
    watchdog: remove depends on CONFIG_EXPERIMENTAL
    watchdog: Convert dev_printk(KERN_ to dev_(
    watchdog: DA9055 Watchdog driver
    watchdog: omap_wdt: eliminate goto
    watchdog: omap_wdt: delete redundant platform_set_drvdata() calls
    watchdog: omap_wdt: convert to devm_ functions
    watchdog: omap_wdt: convert to new watchdog core
    watchdog: WatchDog Timer Driver Core: fix comment
    watchdog: s3c2410_wdt: use clk_prepare_enable and clk_disable_unprepare
    watchdog: imx2_wdt: Select the driver via ARCH_MXC
    watchdog: cpu5wdt.c: add missing del_timer call
    watchdog: hpwdt.c: Increase version string
    watchdog: Convert twl4030_wdt to watchdog core
    davinci_wdt: preparation for switch to common clock framework
    ...

    Linus Torvalds
     
  • Pull dm update from Alasdair G Kergon:
    "Miscellaneous device-mapper fixes, cleanups and performance
    improvements.

    Of particular note:
    - Disable broken WRITE SAME support in all targets except linear and
    striped. Use it when kcopyd is zeroing blocks.
    - Remove several mempools from targets by moving the data into the
    bio's new front_pad area (which dm calls 'per_bio_data').
    - Fix a race in thin provisioning if discards are misused.
    - Prevent userspace from interfering with the ioctl parameters and
    use kmalloc for the data buffer if it's small instead of vmalloc.
    - Throttle some annoying error messages when I/O fails."

    * tag 'dm-3.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: (36 commits)
    dm stripe: add WRITE SAME support
    dm: remove map_info
    dm snapshot: do not use map_context
    dm thin: dont use map_context
    dm raid1: dont use map_context
    dm flakey: dont use map_context
    dm raid1: rename read_record to bio_record
    dm: move target request nr to dm_target_io
    dm snapshot: use per_bio_data
    dm verity: use per_bio_data
    dm raid1: use per_bio_data
    dm: introduce per_bio_data
    dm kcopyd: add WRITE SAME support to dm_kcopyd_zero
    dm linear: add WRITE SAME support
    dm: add WRITE SAME support
    dm: prepare to support WRITE SAME
    dm ioctl: use kmalloc if possible
    dm ioctl: remove PF_MEMALLOC
    dm persistent data: improve improve space map block alloc failure message
    dm thin: use DMERR_LIMIT for errors
    ...

    Linus Torvalds
     
  • Pull more infiniband changes from Roland Dreier:
    "Second batch of InfiniBand/RDMA changes for 3.8:
    - cxgb4 changes to fix lookup engine hash collisions
    - mlx4 changes to make flow steering usable
    - fix to IPoIB to avoid pinning dst reference for too long"

    * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
    RDMA/cxgb4: Fix bug for active and passive LE hash collision path
    RDMA/cxgb4: Fix LE hash collision bug for passive open connection
    RDMA/cxgb4: Fix LE hash collision bug for active open connection
    mlx4_core: Allow choosing flow steering mode
    mlx4_core: Adjustments to Flow Steering activation logic for SR-IOV
    mlx4_core: Fix error flow in the flow steering wrapper
    mlx4_core: Add QPN enforcement for flow steering rules set by VFs
    cxgb4: Add LE hash collision bug fix path in LLD driver
    cxgb4: Add T4 filter support
    IPoIB: Call skb_dst_drop() once skb is enqueued for sending

    Linus Torvalds
     
  • Pull asm-generic cleanup from Arnd Bergmann:
    "These are a few cleanups for asm-generic:

    - a set of patches from Lars-Peter Clausen to generalize asm/mmu.h
    and use it in the architectures that don't need any special
    handling.
    - A patch from Will Deacon to remove the {read,write}s{b,w,l} as
    discussed during the arm64 review
    - A patch from James Hogan that helps with the meta architecture
    series."

    * tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
    xtensa: Use generic asm/mmu.h for nommu
    h8300: Use generic asm/mmu.h
    c6x: Use generic asm/mmu.h
    asm-generic/mmu.h: Add support for FDPIC
    asm-generic/mmu.h: Remove unused vmlist field from mm_context_t
    asm-generic: io: remove {read,write} string functions
    asm-generic/io.h: remove asm/cacheflush.h include

    Linus Torvalds
     
  • Using a seqlock for devnet_rename_seq is not a good idea,
    as device_rename() can sleep.

    As we hold RTNL, we don't need a protection for writers,
    and only need a seqcount so that readers can catch a change done
    by a writer.
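
    The difference from a seqlock is that a bare sequence counter has no
    writer-side lock, so the writer is free to sleep. A user-space
    sketch of the idea using C11 atomics follows; it is not the kernel's
    seqcount implementation, and all ex_ names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define EX_NAMSIZ 16

static atomic_uint ex_rename_seq;        /* even: stable, odd: write in progress */
static char ex_name[EX_NAMSIZ] = "eth0"; /* the data guarded by the counter */

/*
 * Writer side. There is no lock here: writers are assumed to be
 * serialized by something else (RTNL in the commit above).
 */
static void ex_rename(const char *newname)
{
    assert(strlen(newname) < EX_NAMSIZ);
    atomic_fetch_add(&ex_rename_seq, 1); /* counter becomes odd */
    strcpy(ex_name, newname);
    atomic_fetch_add(&ex_rename_seq, 1); /* counter becomes even again */
}

/*
 * Reader side: copy the name and retry if a writer ran in between.
 * The plain accesses to ex_name are only acceptable because this
 * sketch is exercised from a single thread.
 */
static void ex_get_name(char out[EX_NAMSIZ])
{
    unsigned int start;

    do {
        while ((start = atomic_load(&ex_rename_seq)) & 1u)
            ; /* a write is in progress, wait for it to finish */
        memcpy(out, ex_name, EX_NAMSIZ);
    } while (atomic_load(&ex_rename_seq) != start);
}
```

    Readers never block the writer; they only detect that a change
    happened and repeat the copy.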

    Bug added in commit c91f6df2db4972d3 (sockopt: Change getsockopt() of
    SO_BINDTODEVICE to return an interface name)

    Reported-by: Dave Jones
    Signed-off-by: Eric Dumazet
    Cc: Brian Haley
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This patch removes map_info from bio-based device mapper targets.
    map_info is still used for request-based targets.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • This patch moves target_request_nr from map_info to dm_target_io and
    makes it accessible with dm_bio_get_target_request_nr.

    This patch is a preparation for the next patch that removes map_info.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Introduce a field per_bio_data_size in struct dm_target.

    Targets can set this field in the constructor. If a target sets this
    field to a non-zero value, "per_bio_data_size" bytes of auxiliary data
    are allocated for each bio submitted to the target. These data can be
    used for any purpose by the target and help us improve performance by
    removing some per-target mempools.

    Per-bio data is accessed with dm_per_bio_data. The
    argument data_size must be the same as the value per_bio_data_size in
    dm_target.

    If the target has a pointer to per_bio_data, it can get a pointer to
    the bio with dm_bio_from_per_bio_data() function (data_size must be the
    same as the value passed to dm_per_bio_data).
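
    The two accessors are essentially pointer arithmetic over one
    allocation that holds the auxiliary data in front of the bio. A
    user-space sketch of that layout, assuming the data area is padded
    to the platform's maximum alignment; struct ex_bio and every ex_
    function are hypothetical stand-ins, not the dm API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ex_bio {
    long sector; /* placeholder payload */
};

/* Size of the area in front of the bio, rounded up to keep the bio aligned. */
static size_t ex_front_pad(size_t data_size)
{
    size_t align = _Alignof(max_align_t);

    return (data_size + align - 1) / align * align;
}

/* One allocation: [ per-bio data | padding ][ struct ex_bio ] */
static struct ex_bio *ex_bio_alloc(size_t per_bio_data_size)
{
    size_t pad = ex_front_pad(per_bio_data_size);
    char *base = malloc(pad + sizeof(struct ex_bio));

    return base ? (struct ex_bio *)(base + pad) : NULL;
}

/* bio -> data: step back over the front pad. */
static void *ex_per_bio_data(struct ex_bio *bio, size_t data_size)
{
    assert(bio != NULL);
    return (char *)bio - ex_front_pad(data_size);
}

/* data -> bio: step forward by the same amount. */
static struct ex_bio *ex_bio_from_per_bio_data(void *data, size_t data_size)
{
    assert(data != NULL);
    return (struct ex_bio *)((char *)data + ex_front_pad(data_size));
}

static void ex_bio_free(struct ex_bio *bio, size_t per_bio_data_size)
{
    free(ex_per_bio_data(bio, per_bio_data_size));
}
```

    Both directions depend on the same size, which is why the message
    insists that data_size match per_bio_data_size everywhere.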

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Allow targets to opt in to WRITE SAME support by setting
    'num_write_same_requests' in the dm_target structure.

    A dm device will only advertise WRITE SAME support if all its
    targets and all its underlying devices support it.
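
    The "all targets and all underlying devices" rule amounts to a
    conjunction over the table. A small sketch, in which struct
    ex_target, its devices_support_write_same field and the function
    name are hypothetical; only num_write_same_requests is named in the
    message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ex_target {
    unsigned int num_write_same_requests; /* 0: target has not opted in */
    bool devices_support_write_same;      /* assumed per-target summary */
};

/* The device advertises WRITE SAME only if no target vetoes it. */
static bool ex_table_supports_write_same(const struct ex_target *targets,
                                         size_t n)
{
    assert(targets != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (targets[i].num_write_same_requests == 0 ||
            !targets[i].devices_support_write_same)
            return false;
    }
    return true;
}
```

    A single target that leaves the field at zero is enough to turn the
    feature off for the whole device.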

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • When allocating memory for the userspace ioctl data, set some
    appropriate GFP flags directly instead of using PF_MEMALLOC.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     

21 Dec, 2012

10 commits

  • Pull filesystem notification updates from Eric Paris:
    "This pull mostly is about locking changes in the fsnotify system. By
    switching the group lock from a spin_lock() to a mutex() we can now
    hold the lock across things like iput(). This fixes a problem
    involving unmounting a fs and having inodes be busy, first pointed out
    by FAT, but reproducible with tmpfs.

    This also restores signal driven I/O for inotify, which has been
    broken since about 2.6.32."

    Ugh. I *hate* the timing of this. It was rebased after the merge
    window opened, and then left to sit with the pull request coming the day
    before the merge window closes. That's just crap. But apparently the
    patches themselves have been around for over a year, just gathering
    dust, so now it's suddenly critical.

    Fixed up semantic conflict in fs/notify/fdinfo.c as per Stephen
    Rothwell's fixes from -next.

    * 'for-next' of git://git.infradead.org/users/eparis/notify:
    inotify: automatically restart syscalls
    inotify: dont skip removal of watch descriptor if creation of ignored event failed
    fanotify: dont merge permission events
    fsnotify: make fasync generic for both inotify and fanotify
    fsnotify: change locking order
    fsnotify: dont put marks on temporary list when clearing marks by group
    fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark()
    fsnotify: pass group to fsnotify_destroy_mark()
    fsnotify: use a mutex instead of a spinlock to protect a groups mark list
    fanotify: add an extra flag to mark_remove_from_mask that indicates wheather a mark should be destroyed
    fsnotify: take groups mark_lock before mark lock
    fsnotify: use reference counting for groups
    fsnotify: introduce fsnotify_get_group()
    inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group()

    Linus Torvalds
     
  • Merge the rest of Andrew's patches for -rc1:
    "A bunch of fixes and misc missed-out-on things.

    That'll do for -rc1. I still have a batch of IPC patches which still
    have a possible bug report which I'm chasing down."

    * emailed patches from Andrew Morton: (25 commits)
    keys: use keyring_alloc() to create module signing keyring
    keys: fix unreachable code
    sendfile: allows bypassing of notifier events
    SGI-XP: handle non-fatal traps
    fat: fix incorrect function comment
    Documentation: ABI: remove testing/sysfs-devices-node
    proc: fix inconsistent lock state
    linux/kernel.h: fix DIV_ROUND_CLOSEST with unsigned divisors
    memcg: don't register hotcpu notifier from ->css_alloc()
    checkpatch: warn on uapi #includes that #include
    mm: cma: WARN if freed memory is still in use
    exec: do not leave bprm->interp on stack
    ...

    Linus Torvalds
     
  • Pull VFS update from Al Viro:
    "fscache fixes, ESTALE patchset, vmtruncate removal series, assorted
    misc stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (79 commits)
    vfs: make lremovexattr retry once on ESTALE error
    vfs: make removexattr retry once on ESTALE
    vfs: make llistxattr retry once on ESTALE error
    vfs: make listxattr retry once on ESTALE error
    vfs: make lgetxattr retry once on ESTALE
    vfs: make getxattr retry once on an ESTALE error
    vfs: allow lsetxattr() to retry once on ESTALE errors
    vfs: allow setxattr to retry once on ESTALE errors
    vfs: allow utimensat() calls to retry once on an ESTALE error
    vfs: fix user_statfs to retry once on ESTALE errors
    vfs: make fchownat retry once on ESTALE errors
    vfs: make fchmodat retry once on ESTALE errors
    vfs: have chroot retry once on ESTALE error
    vfs: have chdir retry lookup and call once on ESTALE error
    vfs: have faccessat retry once on an ESTALE error
    vfs: have do_sys_truncate retry once on an ESTALE error
    vfs: fix renameat to retry on ESTALE errors
    vfs: make do_unlinkat retry once on ESTALE errors
    vfs: make do_rmdir retry once on ESTALE errors
    vfs: add a flags argument to user_path_parent
    ...

    Linus Torvalds
     
  • Pull signal handling cleanups from Al Viro:
    "sigaltstack infrastructure + conversion for x86, alpha and um,
    COMPAT_SYSCALL_DEFINE infrastructure.

    Note that there are several conflicts between "unify
    SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline;
    resolution is trivial - just remove definitions of SS_ONSTACK and
    SS_DISABLE from arch/*/uapi/asm/signal.h; they are all identical and
    include/uapi/linux/signal.h contains the unified variant."

    Fixed up conflicts as per Al.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
    alpha: switch to generic sigaltstack
    new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
    generic compat_sys_sigaltstack()
    introduce generic sys_sigaltstack(), switch x86 and um to it
    new helper: compat_user_stack_pointer()
    new helper: restore_altstack()
    unify SS_ONSTACK/SS_DISABLE definitions
    new helper: current_user_stack_pointer()
    missing user_stack_pointer() instances
    Bury the conditionals from kernel_thread/kernel_execve series
    COMPAT_SYSCALL_DEFINE: infrastructure

    Linus Torvalds
     
  • Commit 263a523d18bc ("linux/kernel.h: Fix warning seen with W=1 due to
    change in DIV_ROUND_CLOSEST") fixes a warning seen with W=1 due to
    change in DIV_ROUND_CLOSEST.

    Unfortunately, the C compiler converts divide operations with unsigned
    divisors to unsigned, even if the dividend is signed and negative (for
    example, -10 / 5U = 858993457). The C standard says "If one operand has
    unsigned int type, the other operand is converted to unsigned int", so
    the compiler is not to blame. As a result, DIV_ROUND_CLOSEST(0, 2U) and
    similar operations now return bad values, since the automatic conversion
    of expressions such as "0 - 2U/2" to unsigned was not taken into
    account.

    Fix by checking for the divisor variable type when deciding which
    operation to perform. This fixes DIV_ROUND_CLOSEST(0, 2U), but still
    returns bad values for negative dividends divided by unsigned divisors.
    Mark the latter case as unsupported.

    One observed effect of this problem is that the s2c_hwmon driver reports
    a value of 4198403 instead of 0 if the ADC reads 0.

    Other impact is unpredictable. Problem is seen if the divisor is an
    unsigned variable or constant and the dividend is less than (divisor/2).
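
    The conversion can be reproduced with a few lines of plain C. The
    functions below are illustrative only and are not the kernel macro;
    all ex_ names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Mixed division: the int dividend is converted to unsigned first. */
static unsigned int ex_div_mixed(int dividend, unsigned int divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}

/* The "subtract half the divisor" rounding step with an unsigned divisor. */
static unsigned int ex_round_sub_path(int x, unsigned int d)
{
    assert(d != 0);
    return (x - d / 2) / d; /* x - d / 2 wraps around when x < d / 2 */
}

/*
 * With both operands unsigned, always adding half the divisor rounds
 * to the closest value (overflow near UINT_MAX is not handled here).
 */
static unsigned int ex_round_closest_unsigned(unsigned int x, unsigned int d)
{
    assert(d != 0);
    return (x + d / 2) / d;
}
```

    With a 32-bit unsigned int, ex_div_mixed(-10, 5U) yields 858993457,
    the value quoted above, and ex_round_sub_path(0, 2U) yields
    2147483647 instead of 0, whereas ex_round_closest_unsigned(0, 2U)
    yields 0.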

    Signed-off-by: Guenter Roeck
    Reported-by: Juergen Beisert
    Tested-by: Juergen Beisert
    Cc: Jean Delvare
    Cc: [3.7.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Guenter Roeck
     
  • If a series of scripts are executed, each triggering module loading via
    unprintable bytes in the script header, kernel stack contents can leak
    into the command line.

    Normally execution of binfmt_script and binfmt_misc happens recursively.
    However, when modules are enabled, and unprintable bytes exist in the
    bprm->buf, execution will restart after attempting to load matching
    binfmt modules. Unfortunately, the logic in binfmt_script and
    binfmt_misc does not expect to get restarted. They leave bprm->interp
    pointing to their local stack. This means on restart bprm->interp is
    left pointing into unused stack memory which can then be copied into the
    userspace argv areas.

    After additional study, it seems that both recursion and restart remain
    the desirable way to handle exec with scripts, misc, and modules. As
    such, we need to protect the changes to interp.

    This changes the logic to require allocation for any changes to the
    bprm->interp. To avoid adding a new kmalloc to every exec, the default
    value is left as-is. Only when passing through binfmt_script or
    binfmt_misc does an allocation take place.
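
    The allocate-on-change rule can be modelled in user space as below.
    This is a sketch under stated assumptions, not the kernel code:
    struct ex_binprm and both helpers are hypothetical, and malloc/free
    stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct ex_binprm {
    const char *filename;
    const char *interp; /* equals filename by default: no allocation */
};

/* Replace interp with a private heap copy so it never points at a stack buffer. */
static int ex_change_interp(struct ex_binprm *bprm, const char *interp)
{
    size_t len;
    char *copy;

    assert(bprm != NULL && interp != NULL);
    len = strlen(interp) + 1;
    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, interp, len);

    /* Free a previous copy, but never the default that aliases filename. */
    if (bprm->interp != bprm->filename)
        free((char *)bprm->interp);
    bprm->interp = copy;
    return 0;
}

/* Drop any private copy and fall back to the default. */
static void ex_release_interp(struct ex_binprm *bprm)
{
    assert(bprm != NULL);
    if (bprm->interp != bprm->filename)
        free((char *)bprm->interp);
    bprm->interp = bprm->filename;
}
```

    Because the copy is taken before the old value is freed, the caller's
    buffer may be reused or go out of scope without affecting interp.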

    For a proof of concept, see DoTest.sh from:

    http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/

    Signed-off-by: Kees Cook
    Cc: halfdog
    Cc: P J P
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Where we can pass in LOOKUP_DIRECTORY or LOOKUP_REVAL. Any other flags
    passed in here are currently ignored.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     
  • This function is expected to be called from path-based syscalls to help
    them decide whether to try the lookup and call again in the event that
    they got an -ESTALE return back on an earlier try.

    Currently, we only retry the call once on an ESTALE error, but in the
    event that we decide that that's not enough in the future, we should be
    able to change the logic in this helper without too much effort.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     
  • …/linux-fs into for-linus

    Al Viro
     
  • Commit 8e22cc88d68ca1a46d7d582938f979eb640ed30f removes the (un)lock_super
    function definitions but forgets to remove their prototypes.

    Signed-off-by: Alessio Igor Bogani
    Signed-off-by: Al Viro

    Alessio Igor Bogani