22 Dec, 2011

1 commit

  • Currently, the *_global_[un]lock_online() routines are not at all synchronized
    with CPU hotplug. Soft-lockups detected as a consequence of this race were
    reported earlier at https://lkml.org/lkml/2011/8/24/185. (Thanks to Cong Meng
    for finding out that the root-cause of this issue is the race condition
    between br_write_[un]lock() and CPU hotplug, which results in the lock states
    getting messed up).

    Fixing this race by just adding {get,put}_online_cpus() at appropriate places
    in *_global_[un]lock_online() is not a good option, because br_write_[un]lock()
    would then suddenly become blocking, whereas they have been kept
    non-blocking all this time, and we want to keep them that way.

    So, overall, we want to ensure 3 things:
    1. br_write_lock() and br_write_unlock() must remain as non-blocking.
    2. The corresponding lock and unlock of the per-cpu spinlocks must not happen
    for different sets of CPUs.
    3. Either prevent any new CPU online operation in between this lock-unlock, or
    ensure that the newly onlined CPU does not proceed with its corresponding
    per-cpu spinlock unlocked.

    To achieve all this:
    (a) We introduce a new spinlock that is taken by the *_global_lock_online()
    routine and released by the *_global_unlock_online() routine.
    (b) We register a callback for CPU hotplug notifications, and this callback
    takes the same spinlock as above.
    (c) We maintain a bitmap which is close to the cpu_online_mask, and once it is
    initialized in the lock_init() code, all future updates to it are done in
    the callback, under the above spinlock.
    (d) The above bitmap is used (instead of cpu_online_mask) while locking and
    unlocking the per-cpu locks.

    The callback takes the spinlock upon the CPU_UP_PREPARE event. So, if the
    br_write_lock-unlock sequence is in progress, the callback keeps spinning,
    thus preventing the CPU online operation till the lock-unlock sequence is
    complete. This takes care of requirement (3).

    The bitmap that we maintain remains unmodified throughout the lock-unlock
    sequence, since all updates to it are managed by the callback, which takes
    the same spinlock as the one taken by the lock code and released only by the
    unlock routine. Combining this with (d) above satisfies requirement (2).

    Overall, since we use a spinlock (mentioned in (a)) to prevent CPU hotplug
    operations from racing with br_write_lock-unlock, requirement (1) is also
    taken care of.

    By the way, it is to be noted that a CPU offline operation can actually run
    in parallel with our lock-unlock sequence, because our callback doesn't react
    to notifications earlier than CPU_DEAD (in order to maintain our bitmap
    properly). And this means, since we use our own bitmap (which is stale, on
    purpose) during the lock-unlock sequence, we could end up unlocking the
    per-cpu lock of an offline CPU (because we had locked it earlier, when the
    CPU was online), in order to satisfy requirement (2). But this is harmless,
    though it looks a bit awkward.
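
    The scheme in (a) to (d) can be sketched as a single-threaded userspace
    model. This is not the kernel code: plain flags stand in for spinlocks,
    and every name below (brlock_model, model_*, NR_CPUS_MODEL) is
    hypothetical. Where the real callback would keep spinning on the
    spinlock, the model simply returns false.

```c
#include <stdbool.h>

#define NR_CPUS_MODEL 8 /* illustrative size, not taken from the commit */

/* Hypothetical model of one brlock; bool flags stand in for spinlocks. */
struct brlock_model {
    bool hotplug_lock_held;            /* (a) the new spinlock */
    bool tracked[NR_CPUS_MODEL];       /* (c) private bitmap, close to cpu_online_mask */
    bool percpu_locked[NR_CPUS_MODEL]; /* the per-cpu spinlocks */
};

/* (a) + (d): take the new lock, then lock every CPU in the private bitmap. */
static void model_global_lock_online(struct brlock_model *l)
{
    l->hotplug_lock_held = true;
    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
        if (l->tracked[cpu])
            l->percpu_locked[cpu] = true;
}

/* Unlock exactly the set that was locked: the bitmap cannot have changed. */
static void model_global_unlock_online(struct brlock_model *l)
{
    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
        if (l->tracked[cpu])
            l->percpu_locked[cpu] = false;
    l->hotplug_lock_held = false;
}

/* (b): CPU_UP_PREPARE. Returns false where the real callback would spin. */
static bool model_cpu_up_prepare(struct brlock_model *l, int cpu)
{
    if (l->hotplug_lock_held)
        return false;
    l->tracked[cpu] = true;
    return true;
}

/* (b): CPU_DEAD. The bitmap is only updated outside a lock-unlock sequence. */
static bool model_cpu_dead(struct brlock_model *l, int cpu)
{
    if (l->hotplug_lock_held)
        return false;
    l->tracked[cpu] = false;
    return true;
}
```

    In the model, a CPU that tries to come online between lock and unlock is
    refused until the unlock has run, so the lock and unlock loops always
    walk the same set of CPUs.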

    Debugged-by: Cong Meng
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Al Viro
    Cc: stable@vger.kernel.org

    Srivatsa S. Bhat
     

21 Dec, 2011

1 commit


19 Dec, 2011

1 commit


17 Dec, 2011

3 commits

  • * 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux:
    drm/i915/dp: Dither down to 6bpc if it makes the mode fit
    drm/i915: enable semaphores on per-device defaults
    drm/i915: don't set unpin_work if vblank_get fails
    drm/i915: By default, enable RC6 on IVB and SNB when reasonable
    iommu: Export intel_iommu_enabled to signal when iommu is in use
    drm/i915/sdvo: Include LVDS panels for the IS_DIGITAL check
    drm/i915: prevent division by zero when asking for chipset power
    drm/i915: add PCH info to i915_capabilities
    drm/i915: set the right SDVO transcoder for CPT
    drm/i915: no-lvds quirk for ASUS AT5NM10T-I
    drm/i915: Treat pre-gen4 backlight duty cycle value consistently
    drm/i915: Hook up Ivybridge eDP
    drm/i915: add multi-threaded forcewake support

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.dk/linux-block:
    block: don't kick empty queue in blk_drain_queue()
    block/swim3: Locking fixes
    loop: Fix discard_alignment default setting
    cfq-iosched: fix cfq_cic_link() race confition
    cfq-iosched: free cic_index if blkio_alloc_blkg_stats fails
    cciss: fix flush cache transfer length
    cciss: Add IRQF_SHARED back in for the non-MSI(X) interrupt handler
    loop: fix loop block driver discard and encryption comment
    block: initialize request_queue's numa node during

    Linus Torvalds
     
  • In i915 driver, we do not enable either rc6 or semaphores on SNB when dmar
    is enabled. The new 'intel_iommu_enabled' variable signals when the
    iommu code is in operation.

    Cc: Ted Phelps
    Cc: Peter
    Cc: Lukas Hejtmanek
    Cc: Andrew Lutomirski
    CC: Daniel Vetter
    Cc: Eugeni Dodonov
    Signed-off-by: Keith Packard

    Eugeni Dodonov
     

13 Dec, 2011

2 commits

  • Exactly like roundup_pow_of_two(1), the rounddown version was buggy for
    the case of a compile-time constant '1' argument. Probably because it
    originated from the same code, sharing history with the roundup version
    from before the bugfix (for that one, see commit 1a06a52ee1b0: "Fix
    roundup_pow_of_two(1)").

    However, unlike the roundup version, the fix for rounddown is to just
    remove the broken special case entirely. It's simply not needed - the
    generic code

    1UL << ilog2(n)

    does the right thing for the constant '1' argument too. The only reason
    roundup needed that special case was because rounding up does so by
    subtracting one from the argument (and then adding one to the result)
    causing the obvious problems with "ilog2(0)".

    But rounddown doesn't do any of that, since ilog2() naturally truncates
    (ie "rounds down") to the right rounded down value. And without the
    ilog2(0) case, there's no reason for the special case that had the wrong
    value.

    tl;dr: rounddown_pow_of_two(1) should be 1, not 0.
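
    A minimal userspace sketch of the arithmetic described above. The
    helper names (ilog2_ul and the *_ul variants) are hypothetical
    stand-ins for the kernel macros, and the loop-based ilog2 is an
    assumption made only to keep the example portable.

```c
/* Floor of log2(n) for n > 0; a simple stand-in for the kernel's ilog2(). */
static unsigned int ilog2_ul(unsigned long n)
{
    unsigned int log = 0;

    while (n >>= 1)
        log++;
    return log;
}

/* Rounding down needs no special case: ilog2 already truncates. */
static unsigned long rounddown_pow_of_two_ul(unsigned long n)
{
    return 1UL << ilog2_ul(n);
}

/*
 * Rounding up subtracts one first, so n == 1 would hit ilog2(0).
 * That is why only this direction keeps a special case.
 */
static unsigned long roundup_pow_of_two_ul(unsigned long n)
{
    if (n == 1)
        return 1UL;
    return 1UL << (ilog2_ul(n - 1) + 1);
}
```

    With these helpers, rounding 1 down gives 1UL << 0, which is 1, without
    any special handling.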

    Acked-by: Dmitry Torokhov
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
    mmc: core: Fix deadlock when the CONFIG_MMC_UNSAFE_RESUME is not defined
    mmc: sdhci-s3c: Remove old and misprototyped suspend operations
    mmc: tmio: fix clock gating on platforms with a .set_pwr() method
    mmc: sh_mmcif: fix clock gating on platforms with a .down_pwr() method
    mmc: core: Fix typo at mmc_card_sleep
    mmc: core: Fix power_off_notify during suspend
    mmc: core: Fix setting power notify state variable for non-eMMC
    mmc: core: Add quirk for long data read time
    mmc: Add module.h include to sdhci-cns3xxx.c
    mmc: mxcmmc: fix falling back to PIO
    mmc: omap_hsmmc: DMA unmap only once in case of MMC error

    Linus Torvalds
     

11 Dec, 2011

1 commit

  • Adds a quirk that sets the data read timeout to a fixed value instead
    of relying on the information in the CSD. The timeout value chosen
    is 300ms since that has proven enough for the problematic cards found,
    but could be increased if other cards require this.

    This patch also enables this quirk for certain Micron cards known to
    have this problem.
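
    As a rough sketch of what such a quirk amounts to: the struct, the flag
    name and the function below are hypothetical and not the mmc core API.
    Only the 300ms value comes from the description above.

```c
#define QUIRK_LONG_READ_TIME (1u << 0)   /* hypothetical quirk flag */
#define LONG_READ_TIMEOUT_NS 300000000u  /* 300 ms, as chosen in the patch */

/* Hypothetical card model: quirk bits plus the timeout derived from the CSD. */
struct card_model {
    unsigned int quirks;
    unsigned int csd_timeout_ns;
};

/* Use the fixed timeout for quirky cards, the CSD-derived one otherwise. */
static unsigned int read_timeout_ns(const struct card_model *card)
{
    if (card->quirks & QUIRK_LONG_READ_TIME)
        return LONG_READ_TIMEOUT_NS;
    return card->csd_timeout_ns;
}
```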

    Signed-off-by: Stefan Nilsson XK
    Signed-off-by: Ulf Hansson
    Acked-by: Linus Walleij
    Signed-off-by: Chris Ball

    Stefan Nilsson XK
     

10 Dec, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
    arch/tile: use new generic {enable,disable}_percpu_irq() routines
    drivers/net/ethernet/tile: use skb_frag_page() API
    asm-generic/unistd.h: support new process_vm_{readv,write} syscalls
    arch/tile: fix double-free bug in homecache_free_pages()
    arch/tile: add a few #includes and an EXPORT to catch up with kernel changes.

    Linus Torvalds
     

09 Dec, 2011

1 commit


08 Dec, 2011

1 commit


07 Dec, 2011

2 commits

  • __d_path() API is asking for trouble, and in the case of apparmor's
    d_namespace_path() it is getting just that. The root cause is that when __d_path() misses the root
    it had been told to look for, it stores the location of the most remote ancestor
    in *root. Without grabbing references. Sure, at the moment of call it had
    been pinned down by what we have in *path. And if we raced with umount -l, we
    could have very well stopped at vfsmount/dentry that got freed as soon as
    prepend_path() dropped vfsmount_lock.

    It is safe to compare these pointers with pre-existing (and known to be still
    alive) vfsmount and dentry, as long as all we are asking is "is it the same
    address?". Dereferencing is not safe and apparmor ended up stepping into
    that. d_namespace_path() really wants to examine the place where we stopped,
    even if it's not connected to our namespace. As the result, it looked
    at ->d_sb->s_magic of a dentry that might've been already freed by that point.
    All other callers had been careful enough to avoid that, but it's really
    a bad interface - it invites that kind of trouble.

    The fix is fairly straightforward, even though it's bigger than I'd like:
    * prepend_path() root argument becomes const.
    * __d_path() is never called with NULL/NULL root. It was a kludge
    to start with. Instead, we have an explicit function - d_absolute_path().
    Same as __d_path(), except that it doesn't get root passed and stops where
    it stops. apparmor and tomoyo are using it.
    * __d_path() returns NULL on path outside of root. The main
    caller is show_mountinfo() and that's precisely what we pass root for - to
    skip those outside chroot jail. Those who don't want that can (and do)
    use d_path().
    * __d_path() root argument becomes const. Everyone agrees, I hope.
    * apparmor does *NOT* try to use __d_path() or any of its variants
    when it sees that path->mnt is an internal vfsmount. In that case it's
    definitely not mounted anywhere and dentry_path() is exactly what we want
    there. Handling of sysctl()-triggered weirdness is moved to that place.
    * if apparmor is asked to do pathname relative to chroot jail
    and __d_path() tells it it's not in that jail, the sucker just calls
    d_absolute_path() instead. That's the other remaining caller of __d_path(),
    BTW.
    * seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway -
    the normal seq_file logics will take care of growing the buffer and redoing
    the call of ->show() just fine). However, if it gets path not reachable
    from root, it returns SEQ_SKIP. The only caller adjusted (i.e. stopped
    ignoring the return value as it used to do).

    Reviewed-by: John Johansen
    ACKed-by: John Johansen
    Signed-off-by: Al Viro
    Cc: stable@vger.kernel.org

    Al Viro
     
  • * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    ftrace: Fix hash record accounting bug
    perf: Fix parsing of __print_flags() in TP_printk()
    jump_label: jump_label_inc may return before the code is patched
    ftrace: Remove force undef config value left for testing
    tracing: Restore system filter behavior
    tracing: fix event_subsystem ref counting

    Linus Torvalds
     

06 Dec, 2011

5 commits

  • * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    intr_remapping: Fix section mismatch in ir_dev_scope_init()
    intel-iommu: Fix section mismatch in dmar_parse_rmrr_atsr_dev()
    x86, amd: Fix up numa_node information for AMD CPU family 15h model 0-0fh northbridge functions
    x86, AMD: Correct align_va_addr documentation
    x86/rtc, mrst: Don't register a platform RTC device for for Intel MID platforms
    x86/mrst: Battery fixes
    x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode
    x86: Fix "Acer Aspire 1" reboot hang
    x86/mtrr: Resolve inconsistency with Intel processor manual
    x86: Document rdmsr_safe restrictions
    x86, microcode: Fix the failure path of microcode update driver init code
    Add TAINT_FIRMWARE_WORKAROUND on MTRR fixup
    x86/mpparse: Account for bus types other than ISA and PCI
    x86, mrst: Change the pmic_gpio device type to IPC
    mrst: Added some platform data for the SFI translations
    x86,mrst: Power control commands update
    x86/reboot: Blacklist Dell OptiPlex 990 known to require PCI reboot
    x86, UV: Fix UV2 hub part number
    x86: Add user_mode_vm check in stack_overflow_check

    Linus Torvalds
     
  • * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf: Fix loss of notification with multi-event
    perf, x86: Force IBS LVT offset assignment for family 10h
    perf, x86: Disable PEBS on SandyBridge chips
    trace_events_filter: Use rcu_assign_pointer() when setting ftrace_event_call->filter
    perf session: Fix crash with invalid CPU list
    perf python: Fix undefined symbol problem
    perf/x86: Enable raw event access to Intel offcore events
    perf: Don't use -ENOSPC for out of PMU resources
    perf: Do not set task_ctx pointer in cpuctx if there are no events in the context
    perf/x86: Fix PEBS instruction unwind
    oprofile, x86: Fix crash when unloading module (nmi timer mode)
    oprofile: Fix crash when unloading module (hr timer mode)

    Linus Torvalds
     
  • * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched, x86: Avoid unnecessary overflow in sched_clock
    sched: Fix buglet in return_cfs_rq_runtime()
    sched: Avoid SMT siblings in select_idle_sibling() if possible
    sched: Set the command name of the idle tasks in SMP kernels
    sched, rt: Provide means of disabling cross-cpu bandwidth sharing
    sched: Document wait_for_completion_*() return values
    sched_fair: Fix a typo in the comment describing update_sd_lb_stats
    sched: Add a comment to effective_load() since it's a pain

    Linus Torvalds
     
  • Though not all events have field 'prev_pid', it was allowed to do this:

    # echo 'prev_pid == 100' > events/sched/filter

    but commit 75b8e98263fdb0bfbdeba60d4db463259f1fe8a2 (tracing/filter: Swap
    entire filter of events) broke it without any reason.

    Link: http://lkml.kernel.org/r/4EAF46CF.8040408@cn.fujitsu.com

    Signed-off-by: Li Zefan
    Signed-off-by: Steven Rostedt

    Li Zefan
     
  • I've received complaints that the numa_node attribute for family
    15h model 00-0fh (e.g. Interlagos) northbridge functions shows
    -1 instead of the proper node ID.

    Correct this with attached quirks (similar to quirks for other
    AMD CPU families used in multi-socket systems).

    Signed-off-by: Andreas Herrmann
    Cc: Frank Arnold
    Cc: Borislav Petkov
    Link: http://lkml.kernel.org/r/20111202072143.GA31916@alberich.amd.com
    Signed-off-by: Ingo Molnar

    Andreas Herrmann
     

05 Dec, 2011

1 commit

  • When you do:
    $ perf record -e cycles,cycles,cycles noploop 10

    You expect about 10,000 samples for each event, i.e., 10s at
    1000 samples/sec. However, this is not what's happening. You
    get far fewer samples, maybe 3700 samples/event:

    $ perf report -D | tail -15
    Aggregated stats:
    TOTAL events: 10998
    MMAP events: 66
    COMM events: 2
    SAMPLE events: 10930
    cycles stats:
    TOTAL events: 3644
    SAMPLE events: 3644
    cycles stats:
    TOTAL events: 3642
    SAMPLE events: 3642
    cycles stats:
    TOTAL events: 3644
    SAMPLE events: 3644

    On an Intel Nehalem or even AMD64, there are 4 counters capable
    of measuring cycles, so there is plenty of space to measure those
    events without multiplexing (even with the NMI watchdog active).
    And even with multiplexing, we'd expect roughly the same number
    of samples per event.

    The root of the problem was that when the event that caused the buffer
    to become full was not the first event passed on the cmdline, the user
    notification would get lost. The notification was sent to the file
    descriptor of the overflowed event but the perf tool was not polling
    on it. The perf tool aggregates all samples into a single buffer,
    i.e., the buffer of the first event. Consequently, it assumes
    notifications for any event will come via that descriptor.

    The seemingly straightforward solution of moving the waitq into the
    ringbuffer object doesn't work because of life-time issues. One could
    perf_event_set_output() on a fd that you're also blocking on and cause
    the old rb object to be freed while its waitq would still be
    referenced by the blocked thread -> FAIL.

    Therefore link all events to the ringbuffer and broadcast the wakeup
    from the ringbuffer object to all possible events that could be waited
    upon. This is rather ugly, and we're open to better solutions but it
    works for now.
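
    The broadcast idea can be sketched in userspace as follows. The structs
    and function names are hypothetical, and a bool flag stands in for
    waking a wait queue. The real patch works on the kernel's perf event
    and ring buffer objects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event: the flag stands in for waking its wait queue. */
struct event_model {
    bool wakeup_pending;
    struct event_model *next;
};

/* Hypothetical ring buffer that keeps a list of every event writing to it. */
struct ring_buffer_model {
    struct event_model *events;
};

/* Link an event to the buffer it outputs into. */
static void rb_attach_event(struct ring_buffer_model *rb, struct event_model *ev)
{
    ev->next = rb->events;
    rb->events = ev;
}

/* Wake every attached event, whichever one actually overflowed. */
static unsigned int rb_broadcast_wakeup(struct ring_buffer_model *rb)
{
    unsigned int woken = 0;

    for (struct event_model *ev = rb->events; ev != NULL; ev = ev->next) {
        ev->wakeup_pending = true;
        woken++;
    }
    return woken;
}
```

    Because the wakeup reaches every attached event, a poller watching only
    the first event's descriptor is still notified.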

    Reported-by: Stephane Eranian
    Finished-by: Stephane Eranian
    Reviewed-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20111126014731.GA7030@quad
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

04 Dec, 2011

1 commit


03 Dec, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: hda - Fix S3/S4 problem on machines with VREF-pin mute-LED
    ALSA: hda_intel - revert a quirk that affect VIA chipsets
    ALSA: hda - Avoid touching mute-VREF pin for IDT codecs
    firmware: Sigma: Fix endianess issues
    firmware: Sigma: Skip header during CRC generation
    firmware: Sigma: Prevent out of bounds memory access
    ALSA: usb-audio - Support for Roland GAIA SH-01 Synthesizer
    ASoC: Supply dcs_codes for newer WM1811 revisions
    ASoC: Error out if we can't generate a LRCLK at all for WM8994
    ASoC: Correct name of Speyside Main Speaker widget
    ASoC: skip resume of soc-audio devices without codecs
    ASoC: cs42l51: Fix off-by-one for reg_cache_size
    ASoC: drop support for PlayPaq with WM8510
    ASoC: mpc8610: tell the CS4270 codec that it's the master
    ASoC: cs4720: use snd_soc_cache_sync()
    ASoC: SAMSUNG: Fix build error
    ASoC: max9877: Update register if either val or val2 is changed
    ASoC: Fix wrong define for AD1836_ADC_WORD_OFFSET

    Linus Torvalds
     

02 Dec, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
    netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS
    ipv4: flush route cache after change accept_local
    sch_red: fix red_change
    Revert "udp: remove redundant variable"
    bridge: master device stuck in no-carrier state forever when in user-stp mode
    ipv4: Perform peer validation on cached route lookup.
    net/core: fix rollback handler in register_netdevice_notifier
    sch_red: fix red_calc_qavg_from_idle_time
    bonding: only use primary address for ARP
    ipv4: fix lockdep splat in rt_cache_seq_show
    sch_teql: fix lockdep splat
    net: fec: Select the FEC driver by default for i.MX SoCs
    isdn: avoid copying too long drvid
    isdn: make sure strings are null terminated
    netlabel: Fix build problems when IPv6 is not enabled
    sctp: better integer overflow check in sctp_auth_create_key()
    sctp: integer overflow in sctp_auth_create_key()
    ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given.
    net: Fix corruption in /proc/*/net/dev_mcast
    mac80211: fix race between the AGG SM and the Tx data path
    ...

    Linus Torvalds
     

30 Nov, 2011

1 commit

  • * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM: Update comments describing device power management callbacks
    PM / Sleep: Update documentation related to system wakeup
    PM / Runtime: Make documentation follow the new behavior of irq_safe
    PM / Sleep: Correct inaccurate information in devices.txt
    PM / Domains: Document how PM domains are used by the PM core
    PM / Hibernate: Do not leak memory in error/test code paths

    Linus Torvalds
     

29 Nov, 2011

6 commits

  • Currently the SigmaDSP firmware loader only works correctly on little-endian
    systems. Fix this by using the proper endianness conversion functions.
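
    For illustration only, a portable userspace way to decode little-endian
    fields. In the kernel this is the job of helpers such as le32_to_cpu().
    The read_le16/read_le32 names below are hypothetical, and the sketch
    assumes the firmware fields are stored little-endian.

```c
#include <stdint.h>

/* Decode a little-endian 16-bit value byte by byte, on any host. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

/* Decode a little-endian 32-bit value byte by byte, on any host. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

    Reading the bytes explicitly gives the same result on little-endian and
    big-endian hosts, unlike casting the buffer to an integer pointer.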

    Signed-off-by: Lars-Peter Clausen
    Acked-by: Mike Frysinger
    Signed-off-by: Mark Brown
    Cc: stable@kernel.org

    Lars-Peter Clausen
     
  • The SigmaDSP firmware loader currently does not perform enough boundary size
    checks when processing the firmware. As a result it is possible that a
    malformed firmware can cause an out of bounds memory access.

    This patch adds checks which ensure that both the action header and the payload
    are completely inside the firmware data boundaries before processing them.
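
    A sketch of this kind of check, under stated assumptions: the header
    size and the function name are hypothetical, and the real loader's
    layout may differ. The subtraction-based form avoids integer overflow
    when the payload length comes from untrusted firmware data.

```c
#include <stdbool.h>
#include <stddef.h>

#define ACTION_HEADER_SIZE 8 /* hypothetical header size for illustration */

/*
 * Return true only if an action starting at offset pos, with the given
 * payload length, lies completely inside a firmware image of fw_size bytes.
 */
static bool action_fits(size_t fw_size, size_t pos, size_t payload_len)
{
    /* The header itself must fit. */
    if (pos > fw_size || fw_size - pos < ACTION_HEADER_SIZE)
        return false;
    /* The payload must fit in what remains after the header. */
    return payload_len <= fw_size - pos - ACTION_HEADER_SIZE;
}
```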

    Signed-off-by: Lars-Peter Clausen
    Acked-by: Mike Frysinger
    Signed-off-by: Mark Brown
    Cc: stable@kernel.org

    Lars-Peter Clausen
     
  • I just hit this during my testing. Isn't there another bug lurking?

    BUG kmalloc-8: Redzone overwritten

    INFO: 0xc0000000de9dec48-0xc0000000de9dec4b. First byte 0x0 instead of 0xcc
    INFO: Allocated in .__seq_open_private+0x30/0xa0 age=0 cpu=5 pid=3896
    .__kmalloc+0x1e0/0x2d0
    .__seq_open_private+0x30/0xa0
    .seq_open_net+0x60/0xe0
    .dev_mc_seq_open+0x4c/0x70
    .proc_reg_open+0xd8/0x260
    .__dentry_open.clone.11+0x2b8/0x400
    .do_last+0xf4/0x950
    .path_openat+0xf8/0x480
    .do_filp_open+0x48/0xc0
    .do_sys_open+0x140/0x250
    syscall_exit+0x0/0x40

    dev_mc_seq_ops uses dev_seq_start/next/stop but only allocates
    sizeof(struct seq_net_private) of private data, whereas it expects
    sizeof(struct dev_iter_state):

    struct dev_iter_state {
            struct seq_net_private p;
            unsigned int pos; /* bucket << BUCKET_SPACE + offset */
    };

    Create dev_seq_open_ops and use it so we don't have to expose
    struct dev_iter_state.
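
    The size mismatch can be illustrated with a userspace model. The
    *_model structs are hypothetical stand-ins (the real struct
    seq_net_private has its own layout), and dev_iter_alloc() is not a
    kernel function.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct seq_net_private. */
struct seq_net_private_model {
    void *net;
};

/* Mirrors the layout quoted above: base private data plus the position. */
struct dev_iter_state_model {
    struct seq_net_private_model p;
    unsigned int pos;
};

/*
 * Allocate what the iterator actually uses. Allocating only
 * sizeof(struct seq_net_private_model) would leave pos outside the
 * allocation, so writing it would run past the end of the object.
 */
static struct dev_iter_state_model *dev_iter_alloc(void)
{
    return calloc(1, sizeof(struct dev_iter_state_model));
}
```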

    [ Problem added by commit f04565ddf52e4 (dev: use name hash for
    dev_seq_ops) -Eric ]

    Signed-off-by: Anton Blanchard
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Anton Blanchard
     
  • The comments describing device power management callbacks in
    include/linux/pm.h are outdated and somewhat confusing, so make them
    reflect the reality more accurately.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
    pstore: pass allocated memory region back to caller

    Linus Torvalds
     
  • * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    hrtimer: Fix extra wakeups from __remove_hrtimer()
    timekeeping: add arch_offset hook to ktime_get functions
    clocksource: Avoid selecting mult values that might overflow when adjusted
    time: Improve documentation of timekeeeping_adjust()

    Linus Torvalds
     

24 Nov, 2011

6 commits


23 Nov, 2011

4 commits

  • Count of selector voltage is required for regulator_set_voltage
    to work via set_voltage_sel. VDD1/2 currently have it as zero,
    so regulator_set_voltage won't work for VDD1/2.
    Update count (n_voltages) for VDD1/2.

    Output Voltage = (step value * 12.5 mV + 562.5 mV) * gain

    With the above expression, the number of voltages that can be selected is
    step value count * gain count

    The constant for the gain count will be called VDD1_2_NUM_VOLT_COARSE.

    The existing constant for the step value count is VDD1_2_NUM_VOLTS;
    use VDD1_2_NUM_VOLT_FINE instead to make clear that the step value
    is not the only component in deciding the selectable voltage count.
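
    A worked sketch of the formula in integer microvolts. The counts used
    here (73 fine steps, 3 gain settings) and the selector layout are
    assumptions for illustration, and the names are hypothetical rather
    than the driver's own.

```c
#define NUM_VOLT_FINE   73 /* assumed step value count */
#define NUM_VOLT_COARSE 3  /* assumed gain count */
#define NUM_VOLTAGES    (NUM_VOLT_FINE * NUM_VOLT_COARSE)

/* Output voltage = (step * 12.5 mV + 562.5 mV) * gain, in microvolts. */
static long vdd_microvolts(unsigned int step, unsigned int gain)
{
    return (long)(step * 12500u + 562500u) * (long)gain;
}

/*
 * Assumed selector layout: the gain index is in the high part and the
 * step in the low part, with gains numbered from 1.
 */
static long vdd_sel_to_microvolts(unsigned int sel)
{
    unsigned int gain = sel / NUM_VOLT_FINE + 1;
    unsigned int step = sel % NUM_VOLT_FINE;

    return vdd_microvolts(step, gain);
}
```

    Under these assumptions there are 73 * 3 = 219 selectors, which is the
    kind of value n_voltages has to report for set_voltage_sel to work.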

    Signed-off-by: Afzal Mohammed
    Signed-off-by: Mark Brown

    Afzal Mohammed
     
  • Last piece of code using ANY_I2C_BUS was deleted almost 2 years ago,
    so ANY_I2C_BUS can go away as well.

    Signed-off-by: Jean Delvare

    Jean Delvare
     
  • struct request_queue is allocated with __GFP_ZERO so its "node" field is
    zero before initialization. This causes an oops if node 0 is offline in
    the page allocator because its zonelists are not initialized. From Dave
    Young's dmesg:

    SRAT: Node 1 PXM 2 0-d0000000
    SRAT: Node 1 PXM 2 100000000-330000000
    SRAT: Node 0 PXM 1 330000000-630000000
    Initmem setup node 1 0000000000000000-000000000affb000
    ...
    Built 1 zonelists in Node order, mobility grouping on.
    ...
    BUG: unable to handle kernel paging request at 0000000000001c08
    IP: [] __alloc_pages_nodemask+0xb5/0x870

    and __alloc_pages_nodemask+0xb5 translates to a NULL pointer on
    zonelist->_zonerefs.

    The fix is to initialize q->node at the time of allocation so the correct
    node is passed to the slab allocator later.

    Since blk_init_allocated_queue_node() is no longer needed, merge it with
    blk_init_allocated_queue().

    [rientjes@google.com: changelog, initializing q->node]
    Cc: stable@vger.kernel.org [2.6.37+]
    Reported-by: Dave Young
    Signed-off-by: Mike Snitzer
    Signed-off-by: David Rientjes
    Tested-by: Dave Young
    Signed-off-by: Jens Axboe

    Mike Snitzer
     
  • Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger