01 Feb, 2012

2 commits

  • Linus Torvalds
     
  • There are a few important bug fixes for LogFS

    * tag 'for-linus' of git://github.com/prasad-joshi/logfs_upstream:
    Logfs: Allow NULL block_isbad() methods
    logfs: Grow inode in delete path
    logfs: Free areas before calling generic_shutdown_super()
    logfs: remove useless BUG_ON
    MAINTAINERS: Add Prasad Joshi in LogFS maintiners
    logfs: Propagate page parameter to __logfs_write_inode
    logfs: set superblock shutdown flag after generic sb shutdown
    logfs: take write mutex lock during fsync and sync
    logfs: Prevent memory corruption
    logfs: update page reference count for pined pages

    Fix up conflict in fs/logfs/dev_mtd.c due to semantic change in what
    "mtd->block_isbad" means in commit f2933e86ad93: "Logfs: Allow NULL
    block_isbad() methods" clashing with the abstraction changes in the
    commits 7086c19d0742: "mtd: introduce mtd_block_isbad interface" and
    d58b27ed58a3: "logfs: do not use 'mtd->block_isbad' directly".

    This resolution takes the semantics from commit f2933e86ad93, and just
    makes mtd_block_isbad() return zero (false) if the 'block_isbad'
    function is NULL. But that also means that now "mtd_can_have_bb()"
    always returns 0.

    Now, "mtd_block_markbad()" will obviously return an error if the
    low-level driver doesn't support bad blocks, so this is somewhat
    non-symmetric, but it actually makes sense if a NULL "block_isbad"
    function is considered to mean "I assume that all my blocks are always
    good".

    Linus Torvalds
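
The asymmetry described in the merge message above can be sketched in plain C. This is only an illustration under stated assumptions: `struct toy_mtd` and the `toy_*` helpers are hypothetical stand-ins, not the kernel's `struct mtd_info` or its wrappers, and the `-EOPNOTSUPP` return value is chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an MTD device; only the two hooks matter here. */
struct toy_mtd {
    int (*block_isbad)(struct toy_mtd *mtd, long long ofs);
    int (*block_markbad)(struct toy_mtd *mtd, long long ofs);
};

/* A NULL hook is read as "all my blocks are always good": report not bad. */
static int toy_block_isbad(struct toy_mtd *mtd, long long ofs)
{
    assert(mtd != NULL);
    if (!mtd->block_isbad)
        return 0;
    return mtd->block_isbad(mtd, ofs);
}

/* Marking is not symmetric: without driver support it is an error.
 * The specific error code is an assumption made for this sketch. */
static int toy_block_markbad(struct toy_mtd *mtd, long long ofs)
{
    assert(mtd != NULL);
    if (!mtd->block_markbad)
        return -EOPNOTSUPP;
    return mtd->block_markbad(mtd, ofs);
}
```

With both hooks NULL, a query reports "good" while an attempt to mark a block bad fails, which is the non-symmetric behaviour the message calls out.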
     

31 Jan, 2012

14 commits

  • * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    hwmon: (w83627ehf) Disable setting DC mode for pwm2, pwm3 on NCT6776F
    hwmon: (sht15) fix bad error code
    MAINTAINERS: Drop maintainer for MAX1668 hwmon driver
    MAINTAINERS: Add hwmon entries for Wolfson
    hwmon: (f71805f) Fix clamping of temperature limits

    Linus Torvalds
     
  • Here are some fixes to the pin control system that have accumulated since
    -rc1. Mainly Tony Lindgren fixed the module load/unload logic and the
    rest are minor fixes and documentation.

    * 'for-torvalds' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
    pinctrl: add checks for empty function names
    pinctrl: fix pinmux_hog_maps when ctrl_dev_name is not set
    pinctrl: fix some pinmux typos
    pinctrl: free debugfs entries when unloading a pinmux driver
    pinctrl: unbreak error messages
    Documentation/pinctrl: fix a few syntax errors in code examples
    pinctrl: fix pinconf_pins_show iteration

    Linus Torvalds
     
  • Here are some tty/serial patches for 3.3-rc1

    Big thing here is the movement of the 8250 serial drivers to their own
    directory, now that the patch churn has calmed down.

    Other than that, only minor stuff (omap patches were reverted as they
    were found to be wrong), and another broken driver removed from the
    system.

    Signed-off-by: Greg Kroah-Hartman

    * tag 'tty-3.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
    serial: Kill off Moorestown code
    Revert "tty: serial: OMAP: ensure FIFO levels are set correctly in non-DMA mode"
    Revert "tty: serial: OMAP: transmit FIFO threshold interrupts don't wake the chip"
    serial: Fix wakeup init logic to speed up startup
    docbook: don't use serial_core.h in device-drivers book
    serial: amba-pl011: lock console writes against interrupts
    amba-pl011: do not disable RTS during shutdown
    tty: serial: OMAP: transmit FIFO threshold interrupts don't wake the chip
    tty: serial: OMAP: ensure FIFO levels are set correctly in non-DMA mode
    omap-serial: make serial_omap_restore_context depend on CONFIG_PM_RUNTIME
    omap-serial :Make the suspend/resume functions depend on CONFIG_PM_SLEEP.
    TTY: fix UV serial console regression
    jsm: Fixed EEH recovery error
    Updated TTY MAINTAINERS info
    serial: group all the 8250 related code together

    Linus Torvalds
     
  • Here are a bunch of USB patches for 3.3-rc1.

    Nothing major, largest thing here is the removal of some drivers that
    did not work at all. Other than that, the normal collection of bugfixes
    and new device ids.

    Signed-off-by: Greg Kroah-Hartman

    * tag 'usb-3.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (52 commits)
    uwb & wusb: fix kconfig error
    USB: Realtek cr: fix autopm scheduling while atomic
    USB: ftdi_sio: Add more identifiers
    xHCI: Cleanup isoc transfer ring when TD length mismatch found
    usb: musb: omap2430: minor cleanups.
    qcaux: add more Pantech UML190 and UML290 ports
    Revert "drivers: usb: Fix dependency for USB_HWA_HCD"
    usb: mv-otg - Fix build if CONFIG_USB is not set
    USB: cdc-wdm: Avoid hanging on interface with no USB_CDC_DMM_TYPE
    usb: add support for STA2X11 host driver
    drivers: usb: Fix dependency for USB_HWA_HCD
    kernel-doc: fix new warning in usb.h
    USB: OHCI: fix new compiler warnings
    usb: serial: kobil_sct: fix compile warning:
    drivers/usb/host/ehci-fsl.c: add missing iounmap
    USB: cdc-wdm: better allocate a buffer that is at least as big as we tell the USB core
    USB: cdc-wdm: call wake_up_all to allow driver to shutdown on device removal
    USB: cdc-wdm: use two mutexes to allow simultaneous read and write
    USB: cdc-wdm: updating desc->length must be protected by spin_lock
    USB: usbsevseg: fix max length
    ...

    Linus Torvalds
     
  • 1) Setting link attributes can modify the size of the attributes that
    would be reported on a subsequent getlink netlink operation,
    therefore min_ifinfo_dump_size needs to be adjusted. From Stefan
    Gula.

    2) Resegmentation of TSO frames while trimming can violate invariants
    expected by callers, namely that the number of segments can only stay
    the same or decrease, never increase. If MSS changes, however, we
    can trim data but then end up with more segments. Fix this by only
    segmenting to the MSS already recorded in the SKB. That's the
    simplest fix for now and if we want to get more fancy in the future
    that's a more involved change.

    This probably explains some retransmit counter inaccuracies.

    From Neal Cardwell.

    3) Fix too-many-wakeups in POLL with AF_UNIX sockets, from Eric Dumazet.

    4) Fix CAIF crashes wrt. namespace handling. From Eric Dumazet and
    Eric W. Biederman.

    5) TCP port selection fixes from Flavio Leitner.

    6) More socket memory cgroup build fixes in certain randconfig
    situations. From Glauber Costa.

    7) Fix TCP memory sysctl regression reported by Ingo Molnar, also from
    Glauber Costa.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    af_unix: fix EPOLLET regression for stream sockets
    tcp: fix tcp_trim_head() to adjust segment count with skb MSS
    net/tcp: Fix tcp memory limits initialization when !CONFIG_SYSCTL
    net caif: Register properly as a pernet subsystem.
    netns: Fail conspicously if someone uses net_generic at an inappropriate time.
    net: explicitly add jump_label.h header to sock.h
    net: RTNETLINK adjusting values of min_ifinfo_dump_size
    ipv6: Fix ip_gre lockless xmits.
    xen-netfront: correct MAX_TX_TARGET calculation.
    netns: fix net_alloc_generic()
    tcp: bind() optimize port allocation
    tcp: bind() fix autoselection to share ports
    l2tp: l2tp_ip - fix possible oops on packet receive
    iwlwifi: fix PCI-E transport "inta" race
    mac80211: set bss_conf.idle when vif is connected
    mac80211: update oper_channel on ibss join

    Linus Torvalds
     
  • This fixes an integration issue with the regulator device tree bindings
    which shook out in -rc. The bindings were overly enthusiastic when
    deciding to set a voltage on a regulator and would try to set zero volts
    on an unconfigured regulator which isn't supported.

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
    regulator: Set apply_uV only when min and max voltages are defined

    Linus Torvalds
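
A minimal sketch of the decision described above, assuming a hypothetical `struct toy_constraints` in which "unconfigured" is modelled by two presence flags. None of these names come from the regulator framework; they exist only to show why comparing the raw values alone would ask for zero volts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constraints record; has_min/has_max say whether the
 * corresponding property was present in the device tree. */
struct toy_constraints {
    bool has_min, has_max;
    int min_uV, max_uV;
    bool apply_uV;
};

/* Overly enthusiastic shape: an unconfigured regulator has min == max == 0,
 * so this would request that 0 V be applied. */
static void toy_parse_old(struct toy_constraints *c)
{
    assert(c != NULL);
    c->apply_uV = (c->min_uV == c->max_uV);
}

/* Shape after the fix: apply only when both limits are actually defined. */
static void toy_parse_new(struct toy_constraints *c)
{
    assert(c != NULL);
    c->apply_uV = c->has_min && c->has_max && c->min_uV == c->max_uV;
}
```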
     
  • Commit 0884d7aa24 (AF_UNIX: Fix poll blocking problem when reading from
    a stream socket) added a regression for epoll() in Edge Triggered mode
    (EPOLLET).

    The appropriate fix is to use skb_peek()/skb_unlink() instead of
    skb_dequeue(), and only call skb_unlink() when skb is fully consumed.

    This removes the need to requeue a partial skb into sk_receive_queue head
    and the extra sk->sk_data_ready() calls that added the regression.

    This is safe because once skb is given to sk_receive_queue, it is not
    modified by a writer, and readers are serialized by u->readlock mutex.

    This also reduces the number of spinlock acquisitions for small reads or
    MSG_PEEK users, so it should improve overall performance.

    Reported-by: Nick Mathewson
    Signed-off-by: Eric Dumazet
    Cc: Alexey Moiseytsev
    Signed-off-by: David S. Miller

    Eric Dumazet
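
The peek-then-unlink pattern from the entry above can be illustrated in userspace C. This is a sketch, not the af_unix code: `struct toy_buf`, `struct toy_queue` and the `toy_*` functions are hypothetical stand-ins for the skb and sk_receive_queue, and no locking is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy buffer and queue; hypothetical stand-ins for an skb and its queue. */
struct toy_buf {
    struct toy_buf *next;
    const char *data;
    size_t len;              /* bytes not yet consumed */
};

struct toy_queue {
    struct toy_buf *head;
};

static struct toy_buf *toy_peek(struct toy_queue *q)
{
    return q->head;
}

static void toy_unlink(struct toy_queue *q, struct toy_buf *b)
{
    assert(q->head == b);    /* this sketch only ever unlinks the head */
    q->head = b->next;
    b->next = NULL;
}

/* Copy up to 'want' bytes. A partially read buffer simply stays at the
 * head, so nothing has to be requeued and no extra wakeup is needed. */
static size_t toy_read(struct toy_queue *q, char *dst, size_t want)
{
    size_t copied = 0;

    while (copied < want) {
        struct toy_buf *b = toy_peek(q);
        size_t chunk;

        if (!b)
            break;
        chunk = (want - copied < b->len) ? want - copied : b->len;
        memcpy(dst + copied, b->data, chunk);
        b->data += chunk;
        b->len -= chunk;
        copied += chunk;
        if (b->len == 0)
            toy_unlink(q, b);    /* only once fully consumed */
    }
    return copied;
}
```

A short read leaves the head buffer in place with its remaining length reduced; the next read picks up from there.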
     
  • This commit fixes tcp_trim_head() to recalculate the number of
    segments in the skb with the skb's existing MSS, so trimming the head
    causes the skb segment count to be monotonically non-increasing - it
    should stay the same or go down, but not increase.

    Previously tcp_trim_head() used the current MSS of the connection. But
    if there was a decrease in MSS between original transmission and ACK
    (e.g. due to PMTUD), this could cause tcp_trim_head() to
    counter-intuitively increase the segment count when trimming bytes off
    the head of an skb. This violated assumptions in tcp_tso_acked() that
    tcp_trim_head() only decreases the packet count, so that packets_acked
    in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
    pass u32 pkts_acked values as large as 0xffffffff to
    ca_ops->pkts_acked().

    As an aside, if tcp_trim_head() had really wanted the skb to reflect
    the current MSS, it should have called tcp_set_skb_tso_segs()
    unconditionally, since a decrease in MSS would mean that a
    single-packet skb should now be sliced into multiple segments.

    Signed-off-by: Neal Cardwell
    Acked-by: Nandita Dukkipati
    Acked-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Neal Cardwell
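
The underflow described above is easy to reproduce with plain arithmetic. The helpers below are hypothetical (`toy_segs`, `toy_packets_acked`) and only model the rounding-up segment count and the unsigned subtraction; the byte counts in the usage note are made-up numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Segment count for 'len' bytes at a given MSS, rounding up. */
static uint32_t toy_segs(uint32_t len, uint32_t mss)
{
    assert(mss > 0);
    return (len + mss - 1) / mss;
}

/* Packets acked as "count before trimming minus count after trimming".
 * The subtraction is unsigned, so it wraps if the count went up. */
static uint32_t toy_packets_acked(uint32_t segs_before, uint32_t segs_after)
{
    return segs_before - segs_after;
}
```

Take an skb of 3000 bytes sent with an MSS of 1000 (3 segments), and suppose the connection MSS later drops to 500 and 500 bytes are trimmed off the head. Recounting the remaining 2500 bytes with the current MSS gives 5 segments, so the difference wraps to a huge value; recounting with the skb's own MSS gives 3 segments and a difference of 0.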
     
  • sysctl_tcp_mem() initialization was moved to sysctl_tcp_ipv4.c
    in commit 3dc43e3e4d0b52197d3205214fe8f162f9e0c334, since it
    became a per-ns value.

    That code, however, will never run when CONFIG_SYSCTL is
    disabled, leading to bogus values on those fields - causing hung
    TCP sockets.

    This patch fixes it by keeping initialization code in
    tcp_init(). It will be overwritten by the first net namespace
    init if CONFIG_SYSCTL is compiled in, and do the right thing if
    it is compiled out.

    It is also named properly as tcp_init_mem(), to properly signal
    its non-sysctl side effect on TCP limits.

    Reported-by: Ingo Molnar
    Signed-off-by: Glauber Costa
    Cc: David S. Miller
    Link: http://lkml.kernel.org/r/4F22D05A.8030604@parallels.com
    [ renamed the function, tidied up the changelog a bit ]
    Signed-off-by: Ingo Molnar
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    [S390] dasd: revalidate server for new pathgroup
    [S390] dasd: revert LCU optimization
    [S390] cleanup entry point definition

    Linus Torvalds
     
  • * 'next' of git://git.monstr.eu/linux-2.6-microblaze:
    microblaze: generic atomic64 support

    Linus Torvalds
     
  • * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
    vmwgfx: Fix assignment in vmw_framebuffer_create_handle
    drm/radeon/kms: Fix device tree linkage of i2c buses
    drm: Pass the real error code back during GEM bo initialisation
    Revert "drm/i810: cleanup reclaim_buffers"

    Linus Torvalds
     
  • NFS client bugfixes for Linux 3.3 (pull 3)

    * tag 'nfs-for-3.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    SUNRPC: Fix machine creds in generic_create_cred and generic_match

    Linus Torvalds
     
  • Power management fix for 3.3-rc2

    Fix for a hibernate (s2disk) regression introduced during the 3.2
    merge window that causes s2disk to trigger BUG_ON() in
    freeze_workqueues_begin() if there is not enough swap space to save
    the image.

    * tag 'pm-fix-for-3.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM / Hibernate: Fix s2disk regression related to freezing workqueues

    Linus Torvalds
     

30 Jan, 2012

7 commits

  • The assignment of handle in vmw_framebuffer_create_handle doesn't actually do anything useful and is incorrectly assigning an integer value to a pointer argument. It appears that this is a typo and should be dereferencing handle rather than assigning to it directly. This fixes a bug where an undefined handle value is potentially returned to user-space.

    Signed-off-by: Ryan Mallon
    Reviewed-by: Jakob Bornecrantz
    Cc: stable@vger.kernel.org
    Signed-off-by: Dave Airlie

    Ryan Mallon
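
The difference between assigning to the pointer and assigning through it, as described in the entry above, can be shown with two hypothetical functions. These are not the vmwgfx functions; they only mirror the shape of the bug and of the fix.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy shape: overwrites the local copy of the pointer, so the
 * caller's variable is never written. */
static int toy_create_handle_buggy(unsigned int *handle)
{
    if (handle)
        handle = 0;      /* only changes the local pointer */
    return 0;
}

/* Fixed shape: dereference, so the value reaches the caller. */
static int toy_create_handle_fixed(unsigned int *handle)
{
    if (handle)
        *handle = 0;     /* writes through to the caller's variable */
    return 0;
}
```

After the buggy variant the caller's handle keeps whatever value it had before the call, which is how an undefined value could end up being returned to user space.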
     
  • Properly set the parent device of i2c buses before registering them so
    that they will show at the right place in the device tree (rather than
    in /sys/devices directly.)

    Signed-off-by: Jean Delvare
    Cc: Dave Airlie
    Reviewed-by: Alex Deucher
    Signed-off-by: Dave Airlie

    Jean Delvare
     
  • In particular, I found I was hitting the max-file limit in the VFS,
    and the ENFILE was being magically transformed into ENOMEM. Confusion
    reigns.

    Signed-off-by: Chris Wilson
    Reviewed-by: Daniel Vetter
    Signed-off-by: Dave Airlie

    Chris Wilson
     
  • This reverts commit 87499ffdcb1c70f66988cd8febc4ead0ba2f9118.

    Where is that paper bag ... ah here.

    I've failed to take an odd interaction between my other cleanups and
    this reclaim_buffers patch into account and also failed to properly
    test it. Looks like there are more dragons and hidden trapdoors in the
    drm release path than actual lines of code.

    Until I get a clue, let's just revert this.

    Signed-Off-by: Daniel Vetter

    Signed-off-by: Dave Airlie

    Daniel Vetter
     
  • NCT6776F only supports PWM mode for pwm2 and pwm3. Return an error if an attempt
    is made to set those pwm channels to DC mode.

    Signed-off-by: Guenter Roeck
    Acked-by: Jean Delvare
    Cc: stable@vger.kernel.org # 3.0+
    Signed-off-by: Guenter Roeck

    Guenter Roeck
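
A sketch of the check described above, assuming the usual hwmon sysfs convention that mode 0 means DC and mode 1 means PWM. The enum, the function name and the zero-based channel numbering are hypothetical and are not taken from the w83627ehf driver.

```c
#include <assert.h>
#include <errno.h>

enum toy_chip { TOY_OTHER_CHIP, TOY_NCT6776F };

/* channel 0 is pwm1, channel 1 is pwm2, channel 2 is pwm3.
 * mode 0 is DC, mode 1 is PWM (assumed sysfs convention). */
static int toy_store_pwm_mode(enum toy_chip chip, int channel, int mode)
{
    assert(channel >= 0);
    if (mode != 0 && mode != 1)
        return -EINVAL;
    /* On this chip pwm2 and pwm3 cannot be switched to DC mode. */
    if (chip == TOY_NCT6776F && (channel == 1 || channel == 2) && mode == 0)
        return -EINVAL;
    return 0;
}
```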
     
  • Commit 2aede851ddf08666f68ffc17be446420e9d2a056

    PM / Hibernate: Freeze kernel threads after preallocating memory

    introduced a mechanism by which kernel threads were frozen after
    the preallocation of hibernate image memory to avoid problems with
    frozen kernel threads not responding to memory freeing requests.
    However, it overlooked the s2disk code path in which the
    SNAPSHOT_CREATE_IMAGE ioctl was run directly after SNAPSHOT_FREE,
    which caused freeze_workqueues_begin() to BUG(), because it saw
    that workqueues had already been frozen.

    Although in principle this issue might be addressed by removing
    the relevant BUG_ON() from freeze_workqueues_begin(), that would
    reintroduce the very problem that commit 2aede851ddf08666f68ffc17be4
    attempted to avoid into that particular code path. For this reason,
    to fix the issue at hand, introduce thaw_kernel_threads() and make
    the SNAPSHOT_FREE ioctl execute it.

    Special thanks to Srivatsa S. Bhat for detailed analysis of the
    problem.

    Reported-and-tested-by: Jiri Slaby
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Srivatsa S. Bhat
    Cc: stable@kernel.org

    Rafael J. Wysocki
     
  • When no platform data was supplied, the returned error code was 0.

    Signed-off-by: Vivien Didelot
    Cc: stable@vger.kernel.org # 2.6.32+
    Signed-off-by: Guenter Roeck

    Vivien Didelot
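
The kind of bug named in the entry above typically comes from a shared error path that returns a status variable nobody set. The sketch below assumes that shape; `struct toy_dev`, both probe functions and the choice of `-EINVAL` are hypothetical and not taken from the sht15 driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_dev {
    const void *platform_data;
};

/* Buggy shape: 'ret' is still 0 when the error path is taken,
 * so the caller sees success. */
static int toy_probe_buggy(const struct toy_dev *dev)
{
    int ret = 0;

    assert(dev != NULL);
    if (!dev->platform_data)
        goto err;            /* failure, but ret was never set */
    return 0;
err:
    return ret;
}

/* Fixed shape: set a real error code before jumping. */
static int toy_probe_fixed(const struct toy_dev *dev)
{
    int ret = 0;

    assert(dev != NULL);
    if (!dev->platform_data) {
        ret = -EINVAL;       /* error code chosen for illustration */
        goto err;
    }
    return 0;
err:
    return ret;
}
```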
     

29 Jan, 2012

7 commits

  • …ernel/git/gregkh/driver-core

    Here are some patches for the 3.3-rc1 tree.

    It contains the removal of the sysdev code, now that all users of it are
    gone, as well as some sysfs bugfixes that have been reported by users.
    There are also some documentation updates here as well.

    * tag 'driver-core-3.3-rc1-bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    sysfs: Complain bitterly about attempts to remove files from nonexistent directories.
    stable: update documentation to ask for kernel version
    base/core.c:fix typo in comment in function device_add
    Documentation: devres: add allocation functions to list of supported calls
    Documentation update for the driver model core
    kernel-doc: fix new warnings in driver-core
    kernel-doc: fix new warnings in debugfs
    kernel-doc: fix new warnings in device.h
    driver core: remove drivers/base/sys.c and include/linux/sysdev.h

    Linus Torvalds
     
  • * tag 'for-linus' of git://github.com/rustyrussell/linux:
    lguest: remove reference from Documentation/virtual/00-INDEX
    virtio: correct the memory barrier in virtqueue_kick_prepare()
    virtio: fix typos of memory barriers

    Linus Torvalds
     
  • …kernel/git/konrad/xen

    * 'stable/for-linus-fixes-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/granttable: Disable grant v2 for HVM domains.
    x86: xen: size struct xen_spinlock to always fit in arch_spinlock_t

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: fix reservations in btrfs_page_mkwrite
    Btrfs: advance window_start if we're using a bitmap
    btrfs: mask out gfp flags in releasepage
    Btrfs: fix enospc error caused by wrong checks of the chunk
    Btrfs: do not defrag a file partially
    Btrfs: fix warning for 32-bit build of fs/btrfs/check-integrity.c
    Btrfs: use cluster->window_start when allocating from a cluster bitmap
    Btrfs: Check for NULL page in extent_range_uptodate
    btrfs: Fix busyloops in transaction waiting code
    Btrfs: make sure a bitmap has enough bytes
    Btrfs: fix uninit warning in backref.c

    Linus Torvalds
     
  • * git://www.linux-watchdog.org/linux-watchdog:
    watchdog: iTCO_wdt: add Intel Lynx Point DeviceIDs
    watchdog: via_wdt: Set min_timeout and max_timeout for wdt_dev
    watchdog: Fix typo "unexpectdly"
    watchdog: wafer5823wdt: Fix handling WDIOS_DISABLECARD/WDIOS_ENABLECARD options
    watchdog: wm8350_wdt: Fix handling WDIOS_DISABLECARD/WDIOS_ENABLECARD options
    watchdog: Return proper error in nuc900wdt_probe if misc_register fails
    watchdog: Staticise nuc900_wdt
    watchdog: via_wdt: Staticise wdt_pci_table
    watchdog: omap_wdt.c: Fix the mismatch of pm_runtime enable and disable
    watchdog: dw_wdt.c: use devm_request_and_ioremap
    watchdog: imx2_wdt.c: use devm_request_and_ioremap

    Linus Torvalds
     
  • * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: (31 commits)
    ARM: 7304/1: ioremap: fix boundary check when reusing static mapping
    ARM: 7301/1: Rename the T() macro to TUSER() to avoid namespace conflicts
    ARM: 7299/1: ftrace: clear zero bit in reported IPs for Thumb-2
    ARM: 7298/1: realview: fix mapping of MPCore private memory region
    PCMCIA: fix sa1111 oops on remove
    ARM: 7288/1: mach-sa1100: add missing module_init() call
    ARM: 7297/1: smp_twd: make sure timer is stopped before registering it
    ARM: 7296/1: proc-v7.S: remove HARVARD_CACHE preprocessor guards
    ARM: 7295/1: cortex-a7: move proc_info out of !CONFIG_ARM_LPAE block
    ARM: 7293/1: logical_cpu_map: decouple CPU mapping from SMP
    ARM: 7291/1: cache: assume 64-byte L1 cachelines for ARMv7 CPUs
    ARM: 7290/1: vmlinux.lds.S: align the exception fixup table to a 4-byte boundary
    ARM: 7289/1: vmlinux.lds.S: do not hardcode cacheline size as 32 bytes
    MFD: ucb1x00-ts: fix resume failure
    MFD: ucb1x00-core: fix gpiolib direction_output handling
    MFD: ucb1x00-core: fix missing restore of io output data on resume
    MFD: mcp-core: fix mcp_priv() to be more type safe
    MFD: mcp-core: fix complaints from the genirq layer
    Revert "ARM: sa11x0: Implement autoloading of codec and codec pdata for mcp bus."
    Revert "ARM: sa1100: Refactor mcp-sa11x0 to use platform resources."
    ...

    Fix up conflict due to arch/arm/mach-mx5/Kconfig having been merged into
    mach-imx5 (commit 784a90c0a7d8: "ARM i.MX: Merge i.MX5 support into
    mach-imx"), but the ARM_L1_CACHE_SHIFT_6 entry was moved to be driven by
    the CPU_V7 logic from it in the old location in rmk's branch (commit
    a092f2b15399: "ARM: 7291/1: cache: assume 64-byte L1 cachelines for
    ARMv7 CPUs").

    Linus Torvalds
     
  • arm-soc fixes for 3.3-rc:

    AT91 needed reset fixes which resulted in some minor code refactoring;
    it also adds a feature-removal for one of their platforms for 3.4.
    The USB patches have been acked by Greg K-H.

    i.MX and ux500 both have some minor fixes, nothing controversial.

    * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    arch/arm/mach-imx/mach-mx53_ard.c: add missing iounmap
    ARM: imx: iomux-v1.h: Fix build error due to __init annotation
    ARM: at91: Fix at91sam9g45 and at91cap9 reset
    ARM: at91: make rstc soc independent
    ARM: at91: introduce AT91_SAM9_ALT_RESET to select the at91sam9 alternative reset
    ARM: at91: merge at91cap9_ddrsdr.h in at91sam9_ddrsdr.h
    ARM: at91: fix cap9 ddrsdr register
    ARM/USB: at91/ohci-at91: rename vbus_pin_inverted to vbus_pin_active_low
    USB: at91: fix clk_get error handling
    ARM: at91: removal of CAP9 SoC family
    ARM: at91: fix at91rm9200 soc subtype handling
    mach-ux500: no MMC_CAP_SD_HIGHSPEED on Snowball
    mach-ux500: enable ARM errata 764369
    mach-ux500: do not override outer.inv_all
    mach-ux500: musb: now musb is always in OTG mode
    ARM: imx6: add missing twd_clk for imx6q clock

    Linus Torvalds
     

28 Jan, 2012

10 commits

  • Not all mtd drivers define block_isbad(). Let's assume no bad blocks
    instead of refusing to mount.

    Signed-off-by: Joern Engel

    Joern Engel
     
  • Can be necessary if an inode gets deleted (through -ENOSPC) before being
    written. Might be better to move this into logfs_write_rec(), but for
    now go with the stupid&safe patch.

    Signed-off-by: Joern Engel

    Joern Engel
     
  • Or hit an assertion in map_invalidatepage() instead.

    Signed-off-by: Joern Engel

    Joern Engel
     
  • It prevents write sizes >4k.

    Signed-off-by: Joern Engel

    Joern Engel
     
  • Acked-by: Joern Engel
    Signed-off-by: Prasad Joshi

    Prasad Joshi
     
  • During GC, LogFS has to rewrite each valid block to a separate segment.
    The rewrite operation reads data from an old segment and writes it to a
    newly allocated segment. Since every write operation changes the data
    block pointers maintained in the inode, the inode should also be rewritten.

    In the GC path, to avoid an AB-BA deadlock, LogFS marks a page with
    PG_pre_locked in addition to locking the page (PG_locked). The page
    lock is ignored iff the page is pre-locked.

    LogFS uses a special file called the segment file. The segment file
    maintains an 8-byte entry for every segment. It keeps track of the erase
    count, level, etc. for every segment.

    Bad things happen when a segment belonging to the segment file is GCed:

    ------------[ cut here ]------------
    kernel BUG at /home/prasad/logfs/readwrite.c:297!
    invalid opcode: 0000 [#1] SMP
    Modules linked in: logfs joydev usbhid hid psmouse e1000 i2c_piix4
    serio_raw [last unloaded: logfs]
    Pid: 20161, comm: mount Not tainted 3.1.0-rc3+ #3 innotek GmbH
    VirtualBox
    EIP: 0060:[] EFLAGS: 00010292 CPU: 0
    EIP is at logfs_lock_write_page+0x6a/0x70 [logfs]
    EAX: 00000027 EBX: f73f5b20 ECX: c16007c8 EDX: 00000094
    ESI: 00000000 EDI: e59be6e4 EBP: c7337b28 ESP: c7337b18
    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    Process mount (pid: 20161, ti=c7336000 task=eb323f70 task.ti=c7336000)
    Stack:
    f8099a3d c7337b24 f73f5b20 00001002 c7337b50 f8091f6d f8099a4d f80994e4
    00000003 00000000 c7337b68 00000000 c67e4400 00001000 c7337b80 f80935e5
    00000000 00000000 00000000 00000000 e1fcf000 0000000f e59be618 c70bf900
    Call Trace:
    [] logfs_get_write_page.clone.16+0xdd/0x100 [logfs]
    [] logfs_mod_segment_entry+0x55/0x110 [logfs]
    [] logfs_get_segment_entry+0x1d/0x20 [logfs]
    [] ? logfs_cleanup_journal+0x50/0x50 [logfs]
    [] ostore_get_erase_count+0x1b/0x40 [logfs]
    [] logfs_open_area+0xc8/0x150 [logfs]
    [] ? kmemleak_alloc+0x2c/0x60
    [] __logfs_segment_write.clone.16+0x4e/0x1b0 [logfs]
    [] ? mempool_kmalloc+0x13/0x20
    [] ? mempool_kmalloc+0x13/0x20
    [] logfs_segment_write+0x17f/0x1d0 [logfs]
    [] logfs_write_i0+0x11c/0x180 [logfs]
    [] logfs_write_direct+0x45/0x90 [logfs]
    [] __logfs_write_buf+0xbd/0xf0 [logfs]
    [] ? kmap_atomic_prot+0x4e/0xe0
    [] logfs_write_buf+0x3b/0x60 [logfs]
    [] __logfs_write_inode+0xa9/0x110 [logfs]
    [] logfs_rewrite_block+0xc0/0x110 [logfs]
    [] ? get_mapping_page+0x10/0x60 [logfs]
    [] ? logfs_load_object_aliases+0x2e0/0x2f0 [logfs]
    [] logfs_gc_segment+0x2ad/0x310 [logfs]
    [] __logfs_gc_once+0x4a/0x80 [logfs]
    [] logfs_gc_pass+0x683/0x6a0 [logfs]
    [] logfs_mount+0x5a9/0x680 [logfs]
    [] mount_fs+0x21/0xd0
    [] ? __alloc_percpu+0xf/0x20
    [] ? alloc_vfsmnt+0xb1/0x130
    [] vfs_kern_mount+0x4b/0xa0
    [] do_kern_mount+0x3e/0xe0
    [] do_mount+0x34d/0x670
    [] ? strndup_user+0x49/0x70
    [] sys_mount+0x6b/0xa0
    [] syscall_call+0x7/0xb
    Code: f8 e8 8b 93 39 c9 8b 45 f8 3e 0f ba 28 00 19 d2 85 d2 74 ca eb d0 0f 0b 8d 45 fc 89 44 24 04 c7 04 24 3d 9a 09 f8 e8 09 92 39 c9 0b 8d 74 26 00 55 89 e5 3e 8d 74 26 00 8b 10 80 e6 01 74 09
    EIP: [] logfs_lock_write_page+0x6a/0x70 [logfs] SS:ESP 0068:c7337b18
    ---[ end trace 96e67d5b3aa3d6ca ]---

    The patch passes the locked page to __logfs_write_inode. It calls the
    function logfs_get_wblocks() to pre-lock the page. This ensures that any
    further attempts to lock the page are ignored (especially from get_erase_count).

    Acked-by: Joern Engel
    Signed-off-by: Prasad Joshi

    Prasad Joshi
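
The "lock is ignored iff the page is pre-locked" rule from the entry above can be modelled with two flags. This is a heavily simplified sketch: `struct toy_page` and `toy_lock_write_page()` are hypothetical, nothing here sleeps or retries, and the "would block" result stands in for the situation that ended in the BUG shown in the trace.

```c
#include <assert.h>
#include <stddef.h>

/* Two flags standing in for PG_locked and PG_pre_locked. */
struct toy_page {
    int locked;
    int pre_locked;
};

enum toy_lock_result {
    TOY_LOCKED,               /* lock taken by this call */
    TOY_SKIPPED_PRE_LOCKED,   /* already held by our own GC path */
    TOY_WOULD_BLOCK           /* held by someone else: wait, or deadlock */
};

static enum toy_lock_result toy_lock_write_page(struct toy_page *p)
{
    assert(p != NULL);
    if (!p->locked) {
        p->locked = 1;
        return TOY_LOCKED;
    }
    if (p->pre_locked)
        return TOY_SKIPPED_PRE_LOCKED;
    return TOY_WOULD_BLOCK;
}
```

A page that is locked but not marked pre-locked is the problematic case; once the GC path marks it pre-locked, a nested lock attempt on the same page is simply skipped.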
     
  • While unmounting the file system, LogFS calls generic_shutdown_super().
    The function does file-system-independent superblock shutdown.
    However, it might result in a call to file-system-specific inode eviction.

    LogFS marks the FS as shutting down by setting the LOGFS_SB_FLAG_SHUTDOWN bit in
    super->s_flags. Since inode eviction might call truncate on the inode, the
    following BUG is observed when the file system is unmounted:

    ------------[ cut here ]------------
    kernel BUG at /home/prasad/logfs/segment.c:362!
    invalid opcode: 0000 [#1] PREEMPT SMP
    CPU 3
    Modules linked in: logfs binfmt_misc ppdev virtio_blk parport_pc lp
    parport psmouse floppy virtio_pci serio_raw virtio_ring virtio

    Pid: 1933, comm: umount Not tainted 3.0.0+ #4 Bochs Bochs
    RIP: 0010:[] []
    logfs_segment_write+0x211/0x230 [logfs]
    RSP: 0018:ffff880062d7b9e8 EFLAGS: 00010202
    RAX: 000000000000000e RBX: ffff88006eca9000 RCX: 0000000000000000
    RDX: ffff88006fd87c40 RSI: ffffea00014ff468 RDI: ffff88007b68e000
    RBP: ffff880062d7ba48 R08: 8000000020451430 R09: 0000000000000000
    R10: dead000000100100 R11: 0000000000000000 R12: ffff88006fd87c40
    R13: ffffea00014ff468 R14: ffff88005ad0a460 R15: 0000000000000000
    FS: 00007f25d50ea760(0000) GS:ffff88007fd80000(0000)
    knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000d05e48 CR3: 0000000062c72000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process umount (pid: 1933, threadinfo ffff880062d7a000,
    task ffff880070b44500)
    Stack:
    ffff880062d7ba38 ffff88005ad0a508 0000000000001000 0000000000000000
    8000000020451430 ffffea00014ff468 ffff880062d7ba48 ffff88005ad0a460
    ffff880062d7bad8 ffffea00014ff468 ffff88006fd87c40 0000000000000000
    Call Trace:
    [] logfs_write_i0+0x12e/0x190 [logfs]
    [] __logfs_write_rec+0x140/0x220 [logfs]
    [] __logfs_write_rec+0xf2/0x220 [logfs]
    [] logfs_write_rec+0x64/0xd0 [logfs]
    [] __logfs_write_buf+0x106/0x110 [logfs]
    [] logfs_write_buf+0x4e/0x80 [logfs]
    [] __logfs_write_inode+0x98/0x110 [logfs]
    [] logfs_truncate+0x54/0x290 [logfs]
    [] logfs_evict_inode+0xdc/0x190 [logfs]
    [] evict+0x85/0x170
    [] iput+0xe6/0x1b0
    [] shrink_dcache_for_umount_subtree+0x218/0x280
    [] shrink_dcache_for_umount+0x51/0x90
    [] generic_shutdown_super+0x2c/0x100
    [] logfs_kill_sb+0x57/0xf0 [logfs]
    [] deactivate_locked_super+0x45/0x70
    [] deactivate_super+0x4a/0x70
    [] mntput_no_expire+0xa4/0xf0
    [] sys_umount+0x6f/0x380
    [] system_call_fastpath+0x16/0x1b
    Code: 55 c8 49 8d b6 a8 00 00 00 45 89 f9 45 89 e8 4c 89 e1 4c 89 55
    b8 c7 04 24 00 00 00 00 e8 68 fc ff ff 4c 8b 55 b8 e9 3c ff ff ff
    0b 0f 0b c7 45 c0 00 00 00 00 e9 44 fe ff ff 66 66 66 66 66
    RIP [] logfs_segment_write+0x211/0x230 [logfs]
    RSP
    ---[ end trace fe6b040cea952290 ]---

    Therefore, move the super->s_flags setting to after the fs-independent work
    has finished.

    Reviewed-by: Joern Engel
    Signed-off-by: Prasad Joshi

    Prasad Joshi
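
The ordering problem in the entry above can be reduced to a few lines. In this sketch `struct toy_super` and all `toy_*` functions are hypothetical; a counter takes the place of the BUG that fires when a write happens after the shutdown flag has been set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the LogFS superblock state. */
struct toy_super {
    int shutdown;                /* models LOGFS_SB_FLAG_SHUTDOWN */
    int writes_after_shutdown;   /* each one would be a BUG in the kernel */
};

static void toy_segment_write(struct toy_super *s)
{
    if (s->shutdown)
        s->writes_after_shutdown++;
}

/* Models the generic shutdown: evicting inodes may truncate and write. */
static void toy_generic_shutdown(struct toy_super *s)
{
    toy_segment_write(s);
}

/* Old order: flag first, then the generic shutdown writes anyway. */
static void toy_kill_sb_old(struct toy_super *s)
{
    assert(s != NULL);
    s->shutdown = 1;
    toy_generic_shutdown(s);
}

/* New order: let the generic work finish, then set the flag. */
static void toy_kill_sb_new(struct toy_super *s)
{
    assert(s != NULL);
    toy_generic_shutdown(s);
    s->shutdown = 1;
}
```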
     
  • LogFS uses super->s_write_mutex while writing data to disk. Taking the
    same mutex in the sync and fsync code paths solves the following BUG:

    ------------[ cut here ]------------
    kernel BUG at /home/prasad/logfs/dev_bdev.c:134!

    Pid: 2387, comm: flush-253:16 Not tainted 3.0.0+ #4 Bochs Bochs
    RIP: 0010:[] []
    bdev_writeseg+0x25d/0x270 [logfs]
    Call Trace:
    [] logfs_open_area+0x91/0x150 [logfs]
    [] ? find_level.clone.9+0x62/0x100
    [] __logfs_segment_write.clone.20+0x5c/0x190 [logfs]
    [] ? mempool_kmalloc+0x15/0x20
    [] ? mempool_alloc+0x53/0x130
    [] logfs_segment_write+0x1d4/0x230 [logfs]
    [] logfs_write_i0+0x12e/0x190 [logfs]
    [] __logfs_write_rec+0x140/0x220 [logfs]
    [] logfs_write_rec+0x64/0xd0 [logfs]
    [] __logfs_write_buf+0x106/0x110 [logfs]
    [] logfs_write_buf+0x4e/0x80 [logfs]
    [] __logfs_writepage+0x23/0x80 [logfs]
    [] logfs_writepage+0xdc/0x110 [logfs]
    [] __writepage+0x17/0x40
    [] write_cache_pages+0x208/0x4f0
    [] ? set_page_dirty+0x70/0x70
    [] generic_writepages+0x4a/0x70
    [] do_writepages+0x21/0x40
    [] writeback_single_inode+0x101/0x250
    [] writeback_sb_inodes+0xed/0x1c0
    [] writeback_inodes_wb+0x7b/0x1e0
    [] wb_writeback+0x4c3/0x530
    [] ? sub_preempt_count+0x9d/0xd0
    [] wb_do_writeback+0xdb/0x290
    [] ? sub_preempt_count+0x9d/0xd0
    [] ? _raw_spin_unlock_irqrestore+0x18/0x40
    [] ? del_timer+0x8a/0x120
    [] bdi_writeback_thread+0x8c/0x2e0
    [] ? wb_do_writeback+0x290/0x290
    [] kthread+0x96/0xa0
    [] kernel_thread_helper+0x4/0x10
    [] ? kthread_worker_fn+0x190/0x190
    [] ? gs_change+0xb/0xb
    RIP [] bdev_writeseg+0x25d/0x270 [logfs]
    ---[ end trace 0211ad60a57657c4 ]---

    Reviewed-by: Joern Engel
    Signed-off-by: Prasad Joshi

    Prasad Joshi
     
  • This is a bad one. I wonder whether we were so far protected by
    no_free_segments(sb) usually being smaller than LOGFS_NO_AREAS.

    Found by Dan Carpenter using smatch.

    Signed-off-by: Joern Engel
    Signed-off-by: Prasad Joshi

    Joern Engel
     
  • LogFS sets the PG_private flag to indicate a pinned page. We assumed that
    marking a page as private is enough to ensure its existence. But
    instead it is necessary to hold a reference count to the page.

    The change resolves the following BUG:

    BUG: Bad page state in process flush-253:16 pfn:6a6d0
    page flags: 0x100000000000808(uptodate|private)

    Suggested-and-Acked-by: Joern Engel
    Signed-off-by: Prasad Joshi

    Prasad Joshi
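
Why a flag alone does not keep a page alive can be sketched with a toy reference count. `struct toy_page` and the `toy_*` helpers are hypothetical and do not correspond to the kernel's page API; the point is only that pinning also takes a reference and unpinning drops it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page with a reference count and a "private" marker. */
struct toy_page {
    int refcount;
    int is_private;   /* stands in for PG_private, i.e. "pinned" */
};

/* Pinning sets the flag and also takes a reference. */
static void toy_pin(struct toy_page *p)
{
    assert(p != NULL);
    if (!p->is_private) {
        p->is_private = 1;
        p->refcount++;
    }
}

/* Unpinning clears the flag and drops that reference again. */
static void toy_unpin(struct toy_page *p)
{
    assert(p != NULL);
    if (p->is_private) {
        p->is_private = 0;
        p->refcount--;
    }
}

/* The page may only be freed once nobody holds a reference. */
static int toy_may_free(const struct toy_page *p)
{
    return p->refcount == 0;
}
```

If another holder drops its reference while the page is pinned, the count stays above zero until the pin is released.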