24 Nov, 2014

11 commits

  • Linus Torvalds
     
  • x86 calls do_notify_resume on paranoid returns if TIF_UPROBE is set but
    not on non-paranoid returns. I suspect that this is a mistake and that
    the code only works because int3 is paranoid.

    Setting _TIF_NOTIFY_RESUME in the uprobe code was probably a workaround
    for the x86 bug. With that bug fixed, we can remove _TIF_NOTIFY_RESUME
    from the uprobes code.

    Reported-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju
    Acked-by: Borislav Petkov
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     
  • Chris bisected a NULL pointer dereference in task_sched_runtime() to
    commit 6e998916dfe3 'sched/cputime: Fix clock_nanosleep()/clock_gettime()
    inconsistency'.

    Chris observed crashes in atop or other /proc walking programs when he
    started fork bombs on his machine. He assumed that this is a new exit
    race, but that does not make any sense when looking at that commit.

    What's interesting is that the commit provides update_curr callbacks
    for all scheduling classes except stop_task and idle_task.

    While nothing can ever hit that via the clock_nanosleep() and
    clock_gettime() interfaces, which have been the target of the commit in
    question, the author obviously forgot that there are other code paths
    which invoke task_sched_runtime():

    do_task_stat()
     thread_group_cputime_adjusted()
      thread_group_cputime()
       task_cputime()
        task_sched_runtime()
         if (task_current(rq, p) && task_on_rq_queued(p)) {
           update_rq_clock(rq);
           p->sched_class->update_curr(rq);
         }

    If the stats are read for a stop machine task, aka 'migration/N' and
    that task is current on its cpu, this will happily call the NULL pointer
    of stop_task->update_curr. Ooops.

    Chris' observation that this happens faster when he runs the fork bomb
    makes sense as the fork bomb will kick migration threads more often so
    the probability to hit the issue will increase.

    Add the missing update_curr callbacks to the scheduler classes stop_task
    and idle_task. While idle tasks cannot be monitored via /proc we have
    other means to hit the idle case.
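
    A minimal sketch of the failure mode and the fix described above, not
    the kernel code itself. The struct layouts and the names
    refresh_runtime, update_curr_fair and update_curr_stop are hypothetical
    stand-ins; only the idea (an unconditional indirect call through
    update_curr, made safe by giving every class a callback) comes from the
    commit text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel's types. */
struct rq {
    int updates;                          /* counts accounting updates */
};

struct sched_class {
    const char *name;
    void (*update_curr)(struct rq *rq);   /* was NULL for stop/idle before the fix */
};

struct task {
    const struct sched_class *sched_class;
};

/* A class that really accounts runtime; here it only counts calls. */
static void update_curr_fair(struct rq *rq)
{
    rq->updates++;
}

/* The shape of the fix: an empty callback instead of a NULL pointer. */
static void update_curr_stop(struct rq *rq)
{
    (void)rq;
}

static const struct sched_class fair_class = { "fair", update_curr_fair };
static const struct sched_class stop_class = { "stop", update_curr_stop };

/* Mirrors the unconditional indirect call made from task_sched_runtime(). */
static void refresh_runtime(struct rq *rq, const struct task *p)
{
    assert(p->sched_class->update_curr != NULL);
    p->sched_class->update_curr(rq);
}
```

    With the empty callback in place, calling refresh_runtime() for a task
    of the stop class is a harmless no-op rather than a jump through NULL.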

    Fixes: 6e998916dfe3 'sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency'
    Reported-by: Chris Mason
    Reported-and-tested-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Stanislaw Gruszka
    Cc: Peter Zijlstra
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Merge x86-64 iret fixes from Andy Lutomirski:
    "This addresses the following issues:

    - an unrecoverable double-fault triggerable with modify_ldt.
    - invalid stack usage in espfix64 failed IRET recovery from IST
    context.
    - invalid stack usage in non-espfix64 failed IRET recovery from IST
    context.

    It also makes a good but IMO scary change: non-espfix64 failed IRET
    will now report the correct error. Hopefully nothing depended on the
    old incorrect behavior, but maybe Wine will get confused in some
    obscure corner case"

    * emailed patches from Andy Lutomirski:
    x86_64, traps: Rework bad_iret
    x86_64, traps: Stop using IST for #SS
    x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C

    Linus Torvalds
     
  • It's possible for iretq to userspace to fail. This can happen because
    of a bad CS, SS, or RIP.

    Historically, we've handled it by fixing up an exception from iretq to
    land at bad_iret, which pretends that the failed iret frame was really
    the hardware part of #GP(0) from userspace. To make this work, there's
    an extra fixup to fudge the gs base into a usable state.

    This is suboptimal because it loses the original exception. It's also
    buggy because there's no guarantee that we were on the kernel stack to
    begin with. For example, if the failing iret happened on return from an
    NMI, then we'll end up executing general_protection on the NMI stack.
    This is bad for several reasons, the most immediate of which is that
    general_protection, as a non-paranoid idtentry, will try to deliver
    signals and/or schedule from the wrong stack.

    This patch throws out bad_iret entirely. As a replacement, it augments
    the existing swapgs fudge into a full-blown iret fixup, mostly written
    in C. It should be clearer and more correct.

    Signed-off-by: Andy Lutomirski
    Reviewed-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     
  • On a 32-bit kernel, this has no effect, since there are no IST stacks.

    On a 64-bit kernel, #SS can only happen in user code, on a failed iret
    to user space, a canonical violation on access via RSP or RBP, or a
    genuine stack segment violation in 32-bit kernel code. The first two
    cases don't need IST, and the latter two cases are unlikely fatal bugs,
    and promoting them to double faults would be fine.

    This fixes a bug in which the espfix64 code mishandles a stack segment
    violation.

    This saves 4k of memory per CPU and a tiny bit of code.

    Signed-off-by: Andy Lutomirski
    Reviewed-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     
  • There's nothing special enough about the espfix64 double fault fixup to
    justify writing it in assembly. Move it to C.

    This also fixes a bug: if the double fault came from an IST stack, the
    old asm code would return to a partially uninitialized stack frame.

    Fixes: 3891a04aafd668686239349ea58f3314ea2af86b
    Signed-off-by: Andy Lutomirski
    Reviewed-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     
  • Pull ARM SoC fixes from Olof Johansson:
    "A collection of fixes this week:

    - A set of clock fixes for shmobile platforms
    - A fix for tegra that moves serial port labels to be per board.
    We're choosing to merge this for 3.18 because the labels will start
    being parsed in 3.19, and without this change serial port numbers
    that used to be stable since the dawn of time will change numbers.
    - A few other DT tweaks for Tegra.
    - A fix for multi_v7_defconfig that makes it stop spewing cpufreq
    errors on Arndale (Exynos)"

    * tag 'armsoc-for-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    ARM: multi_v7_defconfig: fix failure setting CPU voltage by enabling dependent I2C controller
    ARM: tegra: roth: Fix SD card VDD_IO regulator
    ARM: tegra: Remove eMMC vmmc property for roth/tn7
    ARM: dts: tegra: move serial aliases to per-board
    ARM: tegra: Add serial port labels to Tegra124 DT
    ARM: shmobile: kzm9g legacy: Set i2c clks_per_count to 2
    ARM: shmobile: r8a7740 dtsi: Correct IIC0 parent clock
    ARM: shmobile: r8a7790: Fix SD3CKCR address to device tree
    ARM: shmobile: r8a7740 legacy: Correct IIC0 parent clock
    ARM: shmobile: r8a7740 legacy: Add missing INTCA clock for irqpin module
    ARM: shmobile: r8a7790: Fix SD3CKCR address
    ARM: dts: sun6i: Re-parent ahb1_mux to pll6 as required by dma controller

    Linus Torvalds
     
  • Pull percpu fix from Tejun Heo:
    "This contains one patch to fix a race condition which can lead to
    percpu_ref using a percpu pointer which is corrupted with a set DEAD
    bit. The bug was introduced while separating out the ATOMIC mode flag
    from the DEAD flag. The fix is pretty straightforward.

    I just committed the patch to the percpu tree but am sending out the
    pull request early as I'll be on vacation for a week. The patch
    should be fairly safe and while the latency will be higher I'll be
    checking emails"

    * 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu-ref: fix DEAD flag contamination of percpu pointer

    Linus Torvalds
     
  • Pull btrfs deadlock fix from Chris Mason:
    "This has a fix for a long standing deadlock that we've been trying to
    nail down for a while. It ended up being a bad interaction with the
    fair reader/writer locks and the order btrfs reacquires locks in the
    btree"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    btrfs: fix lockups from btrfs_clear_path_blocking

    Linus Torvalds
     
  • While decoupling ATOMIC and DEAD flags, f47ad4578461 ("percpu_ref:
    decouple switching to percpu mode and reinit") updated
    __ref_is_percpu() so that it only tests ATOMIC flag to determine
    whether the ref is in percpu mode or not; however, while DEAD implies
    ATOMIC, the two flags are set separately during percpu_ref_kill() and
    if __ref_is_percpu() races percpu_ref_kill(), it may see DEAD w/o
    ATOMIC. Because __ref_is_percpu() returns @ref->percpu_count_ptr
    value verbatim as the percpu pointer after testing ATOMIC, the pointer
    may now be contaminated with the DEAD flag.

    This can be fixed by clearing the flag bits before returning the
    pointer which was the fix proposed by Shaohua; however, as DEAD
    implies ATOMIC, we can just test for both flags at once and avoid the
    explicit masking.

    Update __ref_is_percpu() so that it tests that both ATOMIC and DEAD
    are clear before returning @ref->percpu_count_ptr as the percpu
    pointer.
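
    The difference between the two tests can be sketched as follows. This
    is an illustration under stated assumptions, not the percpu_ref
    implementation: the flag values, the REF_* names and both helper
    functions are hypothetical, and the pointer word is modelled as a
    plain uintptr_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits kept in the low bits of the pointer word. */
#define REF_ATOMIC       ((uintptr_t)1 << 0)
#define REF_DEAD         ((uintptr_t)1 << 1)
#define REF_ATOMIC_DEAD  (REF_ATOMIC | REF_DEAD)

/*
 * Racy shape: only ATOMIC is tested, so a word that momentarily has
 * DEAD set but not ATOMIC is handed back verbatim as a pointer.
 */
static bool is_percpu_atomic_only(uintptr_t word, uintptr_t *ptr_out)
{
    assert(ptr_out != NULL);
    if (word & REF_ATOMIC)
        return false;
    *ptr_out = word;
    return true;
}

/*
 * Fixed shape: the word is used as a pointer only when both flags are
 * clear, so no explicit masking of the flag bits is needed.
 */
static bool is_percpu_both_clear(uintptr_t word, uintptr_t *ptr_out)
{
    assert(ptr_out != NULL);
    if (word & REF_ATOMIC_DEAD)
        return false;
    *ptr_out = word;
    return true;
}
```

    For a word with only DEAD set, the first helper returns a contaminated
    pointer while the second reports that the ref is not in percpu mode.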

    Signed-off-by: Tejun Heo
    Reported-and-Reviewed-by: Shaohua Li
    Link: http://lkml.kernel.org/r/995deb699f5b873c45d667df4add3b06f73c2c25.1416638887.git.shli@kernel.org
    Fixes: f47ad4578461 ("percpu_ref: decouple switching to percpu mode and reinit")

    Tejun Heo
     

23 Nov, 2014

2 commits


22 Nov, 2014

17 commits

  • Pull networking fixes from David Miller:

    1) Fix BUG when decrypting empty packets in mac80211, from Ronald Wahl.

    2) nf_nat_range is not fully initialized and this is copied back to
    userspace, from Daniel Borkmann.

    3) Fix read past end of buffer in netfilter ipset, also from Dan
    Carpenter.

    4) Signed integer overflow in ipv4 address mask creation helper
    inet_make_mask(), from Vincent BENAYOUN.

    5) VXLAN, be2net, mlx4_en, and qlcnic need ->ndo_gso_check() methods to
    properly describe the device's capabilities, from Joe Stringer.

    6) Fix memory leaks and checksum miscalculations in openvswitch, from
    Pravin B Shelar and Jesse Gross.

    7) FIB rules pass back ambiguous error code for unreachable routes,
    making behavior confusing for userspace. Fix from Panu Matilainen.

    8) ieee802154fake_probe() doesn't release resources properly on error,
    from Alexey Khoroshilov.

    9) Fix skb_over_panic in add_grhead(), from Daniel Borkmann.

    10) Fix access of stale slave pointers in bonding code, from Nikolay
    Aleksandrov.

    11) Fix stack info leak in PPP pptp code, from Mathias Krause.

    12) Cure locking bug in IPX stack, from Jiri Bohac.

    13) Revert SKB fclone memory freeing optimization that is racy and can
    allow accesses to freed up memory, from Eric Dumazet.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (71 commits)
    tcp: Restore RFC5961-compliant behavior for SYN packets
    net: Revert "net: avoid one atomic operation in skb_clone()"
    virtio-net: validate features during probe
    cxgb4 : Fix DCB priority groups being returned in wrong order
    ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg
    openvswitch: Don't validate IPv6 label masks.
    pptp: fix stack info leak in pptp_getname()
    brcmfmac: don't include linux/unaligned/access_ok.h
    cxgb4i : Don't block unload/cxgb4 unload when remote closes TCP connection
    ipv6: delete protocol and unregister rtnetlink when cleanup
    net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too
    bonding: fix curr_active_slave/carrier with loadbalance arp monitoring
    mac80211: minstrel_ht: fix a crash in rate sorting
    vxlan: Inline vxlan_gso_check().
    can: m_can: update to support CAN FD features
    can: m_can: fix incorrect error messages
    can: m_can: add missing delay after setting CCCR_INIT bit
    can: m_can: fix not set can_dlc for remote frame
    can: m_can: fix possible sleep in napi poll
    can: m_can: add missing message RAM initialization
    ...

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "Just two radeon and two intel fixes: endian and regression fixes"

    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
    drm/radeon: fix endian swapping in vbios fetch for tdp table
    drm/radeon: disable native backlight control on pre-r6xx asics (v2)
    drm/i915: Kick fbdev before vgacon
    drm/i915: drop WaSetupGtModeTdRowDispatch:snb

    Linus Torvalds
     
  • Pull sound fixes from Takashi Iwai:
    "This batch ended up as a relatively high volume due to pending ASoC
    fixes. But most of the fixes there are trivial and/or device-specific
    fixes and quirks, so safe to apply. The only (ASoC) core fixes are
    the DPCM race fix and the machine-driver matching fix for
    componentization"

    * tag 'sound-3.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: hda - fix the mic mute led problem for Latitude E5550
    ALSA: hda - move DELL_WMI_MIC_MUTE_LED to the tail in the quirk chain
    ASoC: wm_adsp: Avoid attempt to free buffers that might still be in use
    ALSA: usb-audio: Set the Control Selector to SU_SELECTOR_CONTROL for UAC2
    ALSA: usb-audio: Add ctrl message delay quirk for Marantz/Denon devices
    ASoC: sgtl5000: Fix SMALL_POP bit definition
    ASoC: cs42l51: re-hook of_match_table pointer
    ASoC: rt5670: change dapm routes of PLL connection
    ASoC: rt5670: correct the incorrect default values
    ASoC: samsung: Add MODULE_DEVICE_TABLE for Snow
    ASoC: max98090: Correct pclk divisor settings
    ASoC: dpcm: Fix race between FE/BE updates and trigger
    ASoC: Fix snd_soc_find_dai() matching component by name
    ASoC: rsnd: remove unsupported PAUSE flag
    ASoC: fsi: remove unsupported PAUSE flag
    ASoC: rt5645: Mark RT5645_TDM_CTRL_3 as readable
    ASoC: rockchip-i2s: fix infinite loop in rockchip_snd_rxctrl
    ASoC: es8328-i2c: Fix i2c_device_id name field in es8328_id
    ASoC: fsl_asrc: Add reg_defaults for regmap to fix kernel dump

    Linus Torvalds
     
  • Pull ACPI power management fix from Rafael Wysocki:
    "This is just a one-liner fixing a regression introduced in 3.13 that
    broke system suspend on some Chromebooks.

    On those machines there are ACPI device objects for some I2C devices
    that can wake up the system from sleep states, but that is done via a
    platform-specific mechanism and the ACPI objects don't contain any
    wakeup-related information. When we started to use ACPI power
    management with those devices (which happened during the 3.13 cycle),
    their configuration confused the ACPI PM layer that returned error
    codes from suspend callbacks for them causing system suspend to fail.

    However, the ACPI PM layer can safely ignore the wakeup setting from a
    device driver if the ACPI object corresponding to the device in
    question doesn't contain wakeup information in which case the driver
    itself is responsible for setting up the device for system wakeup"

    * tag 'pm+acpi-3.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPI / PM: Ignore wakeup setting if the ACPI companion can't wake up

    Linus Torvalds
     
  • Pull devicetree fixes from Rob Herring:
    "DeviceTree fixes for 3.18:

    - two fixes for OF selftest code
    - fix for PowerPC address parsing to disable work-around except on
    old PowerMACs
    - fix a crash when earlycon is enabled, but no device is found
    - DT documentation fixes and missing vendor prefixes

    All but the doc updates are also for stable"

    * tag 'devicetree-fixes-for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
    of/selftest: Fix testing when /aliases is missing
    of/selftest: Fix off-by-one error in removal path
    documentation: pinctrl bindings: Fix trivial typo 'abitrary'
    devicetree: bindings: Add vendor prefix for Micron Technology, Inc.
    of: Add vendor prefix for Chips&Media, Inc.
    of/base: Fix PowerPC address parsing hack
    devicetree: vendor-prefixes.txt: fix whitespace
    of: Fix crash if an earlycon driver is not found
    of/irq: Drop obsolete 'interrupts' vs 'interrupts-extended' text
    of: Spelling s/stucture/structure/
    devicetree: bindings: add sandisk to the vendor prefixes

    Linus Torvalds
     
  • Pull PCI fixes from Bjorn Helgaas:
    "These are fixes for an issue with 64-bit PCI bus addresses on 32-bit
    PAE kernels, an APM X-Gene problem (it depended on a generic change we
    removed before merging), a fix for my hotplug device configuration
    changes, and a devicetree documentation update.

    Resource management:
    - Support 64-bit bridge windows if we have 64-bit dma_addr_t (Yinghai Lu)

    PCI device hotplug:
    - Apply _HPX Link Control settings to all devices with a link (Yinghai Lu)

    Generic host bridge driver:
    - Add DT binding for "linux,pci-domain" property (Lucas Stach)

    APM X-Gene:
    - Assign resources to bus before adding new devices (Duc Dang)"

    * tag 'pci-v3.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
    PCI: Support 64-bit bridge windows if we have 64-bit dma_addr_t
    PCI: Apply _HPX Link Control settings to all devices with a link
    PCI: Add missing DT binding for "linux,pci-domain" property
    PCI: xgene: Assign resources to bus before adding new devices

    Linus Torvalds
     
  • Pull SCSI target fixes from Nicholas Bellinger:
    "Here are the target-pending fixes queued for v3.18-rc6.

    The highlights include:

    - target-core OOPs fix with tcm_qla2xxx + vxworks FC initiators +
    zero length SCSI commands having a transfer direction set. (Roland
    + Craig Watson)

    - vhost-scsi OOPs fix to explicitly prevent WWPN endpoint configfs
    group removal while qemu still has an active reference. (Paolo +
    nab)

    - ib_srpt fix for RDMA hardware with lower srp_sq_size limits.
    (Bart)

    - two ib_isert work-arounds for running on ocrdma hardware (Or + Sagi
    + Chris)

    - iscsi-target discovery portal typo + SPC-3 PR Preempt SA key
    matching fix (Steve)"

    * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
    IB/isert: Adjust CQ size to HW limits
    target: return CONFLICT only when SA key unmatched
    iser-target: Handle DEVICE_REMOVAL event on network portal listener correctly
    ib_isert: Add max_send_sge=2 minimum for control PDU responses
    srp-target: Retry when QP creation fails with ENOMEM
    iscsi-target: return the correct port in SendTargets
    vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT
    target: Don't call TFO->write_pending if data_length == 0

    Linus Torvalds
     
  • Pull dmaengine fixes from Vinod Koul:
    "We have a couple of fixes for dmaengine queued up:
    - dma memcpy fix for dma configuration of sun6i by Maxime
    - pl330 fixes: First the fix for the allocation of data buffers by Liviu
    and then Jon's fixes for fifo width and usage"

    * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
    dmaengine: Fix allocation size for PL330 data buffer depth.
    dmaengine: pl330: Limit MFIFO usage for memcpy to avoid exhausting entries
    dmaengine: pl330: Align DMA memcpy operations to MFIFO width
    dmaengine: sun6i: Fix memcpy operation

    Linus Torvalds
     
  • Pull MIPS fixes from Ralf Baechle:
    "More 3.18 fixes for MIPS:

    - backtraces were not quite working on 64-bit kernels
    - loongson needs a different cache coherency setting
    - Loongson 3 is a MIPS64 R2 version but due to an erratum we treat it
    as an older architecture revision.
    - fix build errors due to undefined references to __node_distances
    for certain configurations.
    - fix instruction decoding in the jump label code.
    - for certain configurations copy_{from,to}_user destroy the content
    of $3 so that register needs to be marked as clobbered by the calling
    code.
    - Hardware Table Walker fixes.
    - fill the delay slot of the last instruction of memcpy otherwise
    whatever ends up there randomly might have undesirable effects.
    - ensure get_user/__get_user always zero the variable to be read even
    in case of an error"

    * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
    MIPS: jump_label.c: Handle the microMIPS J instruction encoding
    MIPS: jump_label.c: Correct the span of the J instruction
    MIPS: Zero variable read by get_user / __get_user in case of an error.
    MIPS: lib: memcpy: Restore NOP on delay slot before returning to caller
    MIPS: tlb-r4k: Add missing HTW stop/start sequences
    MIPS: asm: uaccess: Add v1 register to clobber list on EVA
    MIPS: oprofile: Fix backtrace on 64-bit kernel
    MIPS: Loongson: Set Loongson-3's ISA level to MIPS64R1
    MIPS: Loongson: Fix the write-combine CCA value setting
    MIPS: IP27: Fix __node_distances undefined error
    MIPS: Loongson3: Fix __node_distances undefined error

    Linus Torvalds
     
  • Pull powerpc fix from Michael Ellerman:
    "One fix from Scott, he says:

    This patch fixes a crash (introduced in v3.18-rc1) in the FSL MSI driver
    when threaded IRQs are enabled"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
    powerpc/fsl_msi: mark the msi cascade handler IRQF_NO_THREAD

    Linus Torvalds
     
  • Pull x86 fixes from Thomas Gleixner:
    "Misc fixes:
    - gold linker build fix
    - noxsave command line parsing fix
    - bugfix for NX setup
    - microcode resume path bug fix
    - _TIF_NOHZ versus TIF_NOHZ bugfix as discussed in the mysterious
    lockup thread"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, syscall: Fix _TIF_NOHZ handling in syscall_trace_enter_phase1
    x86, kaslr: Handle Gold linker for finding bss/brk
    x86, mm: Set NX across entire PMD at boot
    x86, microcode: Update BSPs microcode on resume
    x86: Require exact match for 'noxsave' command line option

    Linus Torvalds
     
  • Pull scheduler fixes from Ingo Molnar:
    "Misc fixes: two NUMA fixes, two cputime fixes and an RCU/lockdep fix"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency
    sched/cputime: Fix cpu_timer_sample_group() double accounting
    sched/numa: Avoid selecting oneself as swap target
    sched/numa: Fix out of bounds read in sched_init_numa()
    sched: Remove lockdep check in sched_move_task()

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Misc fixes: two Intel uncore driver fixes, a CPU-hotplug fix and a
    build dependencies fix"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/x86/intel/uncore: Fix boot crash on SBOX PMU on Haswell-EP
    perf/x86/intel/uncore: Fix IRP uncore register offsets on Haswell EP
    perf: Fix corruption of sibling list with hotplug
    perf/x86: Fix embarrasing typo

    Linus Torvalds
     
  • Pull core fix from Ingo Molnar:
    "Fix GENMASK macro shift overflow"

    Nobody seems to currently use GENMASK() to fill every single last bit
    (which is what overflows) in-tree, and gcc would warn about it, so we
    have that going for us. But apparently there are pending changes that
    want this.
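
    To illustrate the overflow mentioned above: a mask built as
    ((1 << width) - 1) << l shifts by the full type width when every bit
    is requested, which C leaves undefined. The sketch below shows one
    overflow-free way to build the same mask. The names genmask_ul and
    BITS_PER_ULONG are hypothetical and this is not a copy of the kernel
    macro.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_ULONG ((unsigned)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Build a mask with bits l..h (inclusive) set.
 *
 * Instead of computing a width and shifting 1 by it, intersect
 * "all bits from l upward" with "all bits from h downward". Both
 * shift counts stay strictly below the type width, even when the
 * caller asks for every bit.
 */
static unsigned long genmask_ul(unsigned h, unsigned l)
{
    assert(l <= h && h < BITS_PER_ULONG);
    return (~0UL << l) & (~0UL >> (BITS_PER_ULONG - 1 - h));
}
```

    Asking for the full range, genmask_ul(BITS_PER_ULONG - 1, 0), yields
    all ones with both shift counts equal to zero.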

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    bitops: Fix shift overflow in GENMASK macros

    Linus Torvalds
     
  • Commit c3ae62af8e755 ("tcp: should drop incoming frames without ACK
    flag set") was created to mitigate a security vulnerability in which a
    local attacker is able to inject data into locally-opened sockets by
    using TCP protocol statistics in procfs to quickly find the correct
    sequence number.

    This broke the RFC5961 requirement to send a challenge ACK in response
    to spurious RST packets, which was subsequently fixed by commit
    7b514a886ba50 ("tcp: accept RST without ACK flag").

    Unfortunately, the RFC5961 requirement that spurious SYN packets be
    handled in a similar manner remains broken.

    RFC5961 section 4 states that:

    ... the handling of the SYN in the synchronized state SHOULD be
    performed as follows:

    1) If the SYN bit is set, irrespective of the sequence number, TCP
    MUST send an ACK (also referred to as challenge ACK) to the remote
    peer:

    After sending the acknowledgment, TCP MUST drop the unacceptable
    segment and stop processing further.

    By sending an ACK, the remote peer is challenged to confirm the loss
    of the previous connection and the request to start a new connection.
    A legitimate peer, after restart, would not have a TCB in the
    synchronized state. Thus, when the ACK arrives, the peer should send
    a RST segment back with the sequence number derived from the ACK
    field that caused the RST.

    This RST will confirm that the remote peer has indeed closed the
    previous connection. Upon receipt of a valid RST, the local TCP
    endpoint MUST terminate its connection. The local TCP endpoint
    should then rely on SYN retransmission from the remote end to
    re-establish the connection.

    This patch lets SYN packets through the discard added in c3ae62af8e755,
    so that spurious SYN packets are properly dealt with as per the RFC.

    The challenge ACK is sent unconditionally and is rate-limited, so the
    original vulnerability is not reintroduced by this patch.
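
    The flag handling described above can be sketched as a small decision
    function. This is a simplified model, not the kernel's TCP input path:
    the types and the name classify_segment are hypothetical, and sequence
    number validation, RST processing and the rate limit on challenge ACKs
    are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the TCP header flags that matter here. */
struct seg_flags {
    bool ack;
    bool rst;
    bool syn;
};

enum verdict {
    VERDICT_DISCARD,        /* silently drop the segment */
    VERDICT_CHALLENGE_ACK,  /* send a challenge ACK, then drop */
    VERDICT_CONTINUE        /* hand over to further processing */
};

/* Decide what to do with a segment received in a synchronized state. */
static enum verdict classify_segment(const struct seg_flags *f)
{
    assert(f != NULL);

    /*
     * The earlier mitigation: drop segments that carry no ACK.
     * RST was exempted by a later fix; SYN is exempted by this patch.
     */
    if (!f->ack && !f->rst && !f->syn)
        return VERDICT_DISCARD;

    /* RST handling is outside the scope of this sketch. */
    if (f->rst)
        return VERDICT_CONTINUE;

    /* RFC 5961 section 4: answer any SYN with a challenge ACK. */
    if (f->syn)
        return VERDICT_CHALLENGE_ACK;

    return VERDICT_CONTINUE;
}
```

    In this model a bare SYN is no longer discarded silently; it reaches
    the branch that produces the challenge ACK.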

    Signed-off-by: Calvin Owens
    Acked-by: Eric Dumazet
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Calvin Owens
     
  • Not sure what I was thinking, but doing anything after
    releasing a refcount is suicidal or/and embarrassing.

    By the time we set skb->fclone to SKB_FCLONE_FREE, another cpu
    could have released last reference and freed whole skb.

    We potentially corrupt memory or trap if CONFIG_DEBUG_PAGEALLOC is set.

    Reported-by: Chris Mason
    Fixes: ce1a4ea3f1258 ("net: avoid one atomic operation in skb_clone()")
    Signed-off-by: Eric Dumazet
    Cc: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • …t/mszeredi/vfs into for-linus

    "The biggest change is to rename the filesystem from "overlayfs" to "overlay".
    This will allow legacy overlayfs to be easily carried by distros alongside the
    new mainline one. Also fix a couple of copy-up races and allow escaping comma
    character in filenames."

    The last bit is about commas in pathname mount options...

    Al Viro
     

21 Nov, 2014

10 commits

  • We currently trigger BUG when VIRTIO_NET_F_CTRL_VQ
    is not set but one of features depending on it is.
    That's not a friendly way to report errors to
    hypervisors.
    Let's check, and fail probe instead.
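
    A minimal sketch of such a probe-time check, assuming the dependent
    features are the control-queue based ones listed in the enum below.
    The feature bit numbers follow the virtio network device
    specification; the helper names fbit, ctrl_vq_dependents and
    features_consistent are hypothetical and not the driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers from the virtio network device specification. */
enum {
    VIRTIO_NET_F_CTRL_VQ        = 17,
    VIRTIO_NET_F_CTRL_RX        = 18,
    VIRTIO_NET_F_CTRL_VLAN      = 19,
    VIRTIO_NET_F_GUEST_ANNOUNCE = 21,
    VIRTIO_NET_F_MQ             = 22,
    VIRTIO_NET_F_CTRL_MAC_ADDR  = 23
};

/* Turn a feature bit number into a 64-bit mask. */
static uint64_t fbit(unsigned n)
{
    assert(n < 64);
    return (uint64_t)1 << n;
}

/* Features that only make sense when a control virtqueue exists. */
static uint64_t ctrl_vq_dependents(void)
{
    return fbit(VIRTIO_NET_F_CTRL_RX) |
           fbit(VIRTIO_NET_F_CTRL_VLAN) |
           fbit(VIRTIO_NET_F_GUEST_ANNOUNCE) |
           fbit(VIRTIO_NET_F_MQ) |
           fbit(VIRTIO_NET_F_CTRL_MAC_ADDR);
}

/*
 * Returns false for an inconsistent feature set, so the caller can
 * fail the probe with an error instead of crashing later.
 */
static bool features_consistent(uint64_t features)
{
    if (features & fbit(VIRTIO_NET_F_CTRL_VQ))
        return true;
    return (features & ctrl_vq_dependents()) == 0;
}
```

    A device offering VIRTIO_NET_F_MQ without VIRTIO_NET_F_CTRL_VQ is
    rejected by this check, while the same feature together with the
    control queue is accepted.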

    Cc: Rusty Russell
    Cc: Cornelia Huck
    Cc: Wanlong Gao
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Jason Wang
    Acked-by: Cornelia Huck
    Acked-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Jason Wang
     
  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    The following patchset contains two bugfixes for your net tree, they are:

    1) Validate netlink group from nfnetlink to avoid an out of bound array
    access. This should only happen with superuser privileges though.
    Discovered by Andrey Ryabinin using trinity.

    2) Don't push ethernet header before calling the netfilter output hook
    for multicast traffic, this breaks ebtables since it expects to see
    skb->data pointing to the network header, patch from Linus Luessing.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • John W. Linville says:

    ====================
    pull request: wireless 2014-11-20

    Please pull this little batch of fixes intended for the 3.18 stream!

    For the mac80211 patch, Johannes says:

    "Here's another last minute fix, for minstrel HT crashing
    depending on the value of some uninitialised stack."

    On top of that...

    Ben Greear fixes an ath9k regression in which a BSSID mask is
    miscalculated.

    Dmitry Torokhov corrects an error handling routine in brcmfmac which
    was checking an unsigned variable for a negative value.

    Johannes Berg avoids a build problem in brcmfmac for arches where
    linux/unaligned/access_ok.h and asm/unaligned.h conflict.

    Mathy Vanhoef addresses another brcmfmac issue so as to eliminate a
    use-after-free of the URB transfer buffer if a timeout occurs.

    Please let me know if there are problems!
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Peer priority groups were being reversed, but this was missed in the previous
    fix sent out for this issue.

    v2 : Previous patch was doing extra unnecessary work, result is the same.
    Please ignore previous patch

    Fixes : ee7bc3cdc270 ('cxgb4 : dcb open-lldp interop fixes')

    Signed-off-by: Anish Bhatt
    Signed-off-by: David S. Miller

    Anish Bhatt
     
  • This fixes an old regression introduced by commit
    b0d0d915 (ipx: remove the BKL).

    When a recvmsg syscall blocks waiting for new data, no data can be sent on the
    same socket with sendmsg because ipx_recvmsg() sleeps with the socket locked.

    This breaks mars-nwe (NetWare emulator):
    - the ncpserv process reads the request using recvmsg
    - ncpserv forks and spawns nwconn
    - ncpserv calls a (blocking) recvmsg and waits for new requests
    - nwconn deadlocks in sendmsg on the same socket

    Commit b0d0d915 has simply replaced BKL locking with
    lock_sock/release_sock. Unlike now, BKL got unlocked while
    sleeping, so a blocking recvmsg did not block a concurrent
    sendmsg.

    Only keep the socket locked while actually working with the socket data and
    release it prior to calling skb_recv_datagram().

    Signed-off-by: Jiri Bohac
    Reviewed-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Jiri Bohac
     
  • When userspace doesn't provide a mask, OVS datapath generates a fully
    unwildcarded mask for the flow by copying the flow and setting all bits
    in all fields. For IPv6 label, this creates a mask that matches on the
    upper 12 bits, causing the following error:

    openvswitch: netlink: Invalid IPv6 flow label value (value=ffffffff, max=fffff)

    This patch ignores the label validation check for masks, avoiding this
    error.

    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
     
  • pptp_getname() only partially initializes the stack variable sa,
    particularly only fills the pptp part of the sa_addr union. The code
    thereby discloses 16 bytes of kernel stack memory via getsockname().

    Fix this by memset(0)'ing the union before.
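
    The pattern can be sketched as follows. The structure layout and the
    names demo_sockaddr, small_addr and fill_name are hypothetical and do
    not match the real pptp socket address, and this sketch zeroes the
    whole structure rather than only the union. It shows why clearing the
    storage first matters when only the smaller union member is filled in
    before the whole object is copied out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical small member that shares a union with a larger one. */
struct small_addr {
    uint16_t call_id;
    uint32_t sin_addr;
};

/* Hypothetical address structure that is copied out as a whole. */
struct demo_sockaddr {
    uint16_t family;
    uint16_t protocol;
    union {
        struct small_addr small;
        unsigned char raw[24];
    } u;
};

/*
 * Fill the address. Zeroing every byte first means the bytes that the
 * member stores below never write (the tail of the union and any
 * padding) do not carry stale stack contents to the caller.
 */
static void fill_name(struct demo_sockaddr *sa, uint16_t call_id, uint32_t addr)
{
    assert(sa != NULL);
    memset(sa, 0, sizeof(*sa));
    sa->family = 1;            /* placeholder value for the sketch */
    sa->protocol = 2;          /* placeholder value for the sketch */
    sa->u.small.call_id = call_id;
    sa->u.small.sin_addr = addr;
}
```

    Without the memset, every byte of u.raw beyond sizeof(struct
    small_addr) would keep whatever was on the stack before the call.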

    Cc: Dmitry Kozlov
    Signed-off-by: Mathias Krause
    Signed-off-by: David S. Miller

    Mathias Krause
     
  • fix one regression and one endian issue.

    * 'drm-fixes-3.18' of git://people.freedesktop.org/~agd5f/linux:
    drm/radeon: fix endian swapping in vbios fetch for tdp table
    drm/radeon: disable native backlight control on pre-r6xx asics (v2)

    Dave Airlie
     
  • TIF_NOHZ is 19 (i.e. _TIF_SYSCALL_TRACE | _TIF_NOTIFY_RESUME |
    _TIF_SINGLESTEP), not (1<<19).

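
    The mix-up between a bit number and a bit mask can be sketched like
    this. TIF_NOHZ being 19 comes from the commit text; the other bit
    numbers are the ones implied by the decomposition it gives
    (19 = 1 + 2 + 16), and the helper tif_mask is hypothetical.

```c
#include <assert.h>

/* Bit numbers (positions), not masks. */
#define TIF_SYSCALL_TRACE  0
#define TIF_NOTIFY_RESUME  1
#define TIF_SINGLESTEP     4
#define TIF_NOHZ           19

/* Convert a bit number into the mask that tests that bit. */
static unsigned long tif_mask(unsigned bit)
{
    assert(bit < 32);
    return 1UL << bit;
}

/*
 * Wrong: using the bit number as if it were a mask. The value 19 is
 * binary 10011, so this tests bits 0, 1 and 4 instead of bit 19.
 */
static int nohz_set_wrong(unsigned long flags)
{
    return (flags & TIF_NOHZ) != 0;
}

/* Right: test the single bit at position 19. */
static int nohz_set_right(unsigned long flags)
{
    return (flags & tif_mask(TIF_NOHZ)) != 0;
}
```

    A task with only the syscall-trace bit set is misreported by the
    first test and correctly reported by the second.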
    Cc: Don Zickus
    Cc: Peter Zijlstra
    Cc: Dave Jones
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/a6cd3b60a3f53afb6e1c8081b0ec30ff19003dd7.1416434075.git.luto@amacapital.net
    Signed-off-by: Thomas Gleixner

    Andy Lutomirski
     
  • This is a specific implementation, <asm/unaligned.h> is the
    multiplexer that has the arch-specific knowledge of which
    of the implementations needs to be used, so include that.

    This issue was revealed by kbuild testing
    when <asm/unaligned.h> was added in <linux/ieee80211.h>
    resulting in redefinition of get_unaligned_be16 (and
    probably others).

    Cc: stable@vger.kernel.org # v3.17
    Reported-by: Fengguang Wu
    Signed-off-by: Johannes Berg
    Signed-off-by: Arend van Spriel
    Signed-off-by: John W. Linville

    Johannes Berg