25 Sep, 2013

1 commit

  • In order to prepare to per-arch implementations of preempt_count move
    the required bits into an asm-generic header and use this for all
    archs.

    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

13 Sep, 2013

2 commits

  • After the last architecture switched to generic hard irqs the config
    options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
    for !CONFIG_GENERIC_HARDIRQS can be removed.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Unlike global OOM handling, memory cgroup code will invoke the OOM killer
    in any OOM situation because it has no way of telling faults occuring in
    kernel context - which could be handled more gracefully - from
    user-triggered faults.

    Pass a flag that identifies faults originating in user space from the
    architecture-specific fault handlers to generic code so that memcg OOM
    handling can be improved.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Michal Hocko
    Cc: David Rientjes
    Cc: KAMEZAWA Hiroyuki
    Cc: azurIt
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

12 Sep, 2013

1 commit

  • I found the following pattern that leads in to interesting findings:

    grep -r "ret.*|=.*__put_user" *
    grep -r "ret.*|=.*__get_user" *
    grep -r "ret.*|=.*__copy" *

    The __put_user() calls in compat_ioctl.c, ptrace compat, signal compat,
    since those appear in compat code, we could probably expect the kernel
    addresses not to be reachable in the lower 32-bit range, so I think they
    might not be exploitable.

    For the "__get_user" cases, I don't think those are exploitable: the worse
    that can happen is that the kernel will copy kernel memory into in-kernel
    buffers, and will fail immediately afterward.

    The alpha csum_partial_copy_from_user() seems to be missing the
    access_ok() check entirely. The fix is inspired from x86. This could
    lead to information leak on alpha. I also noticed that many architectures
    map csum_partial_copy_from_user() to csum_partial_copy_generic(), but I
    wonder if the latter is performing the access checks on every
    architectures.

    Signed-off-by: Mathieu Desnoyers
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Jens Axboe
    Cc: Oleg Nesterov
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     

04 Sep, 2013

3 commits


20 Jul, 2013

13 commits


15 Jul, 2013

2 commits

  • The __cpuinit type of throwaway sections might have made sense
    some time ago when RAM was more constrained, but now the savings
    do not offset the cost and complications. For example, the fix in
    commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
    is a good example of the nasty type of bugs that can be created
    with improper use of the various __init prefixes.

    After a discussion on LKML[1] it was decided that cpuinit should go
    the way of devinit and be phased out. Once all the users are gone,
    we can then finally remove the macros themselves from linux/init.h.

    This removes all the alpha uses of the __cpuinit macros.

    [1] https://lkml.org/lkml/2013/5/20/589

    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     
  • Pull more vfs stuff from Al Viro:
    "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including
    making simple_lookup() usable for filesystems with non-NULL s_d_op,
    which allows us to get rid of quite a bit of ugliness"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    sunrpc: now we can just set ->s_d_op
    cgroup: we can use simple_lookup() now
    efivarfs: we can use simple_lookup() now
    make simple_lookup() usable for filesystems that set ->s_d_op
    configfs: don't open-code d_alloc_name()
    __rpc_lookup_create_exclusive: pass string instead of qstr
    rpc_create_*_dir: don't bother with qstr
    llist: llist_add() can use llist_add_batch()
    llist: fix/simplify llist_add() and llist_add_batch()
    fput: turn "list_head delayed_fput_list" into llist_head
    fs/file_table.c:fput(): add comment
    Safer ABI for O_TMPFILE

    Linus Torvalds
     

13 Jul, 2013

1 commit

  • [suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
    that will fail on old kernels in a lot more cases than what I came up with.
    And make sure O_CREAT doesn't get there...

    Signed-off-by: Al Viro

    Al Viro
     

11 Jul, 2013

1 commit

  • Rename LL_SO to BUSY_POLL_SO
    Rename sysctl_net_ll_{read,poll} to sysctl_busy_{read,poll}
    Fix up users of these variables.
    Fix documentation for sysctl.

    a patch for the socket.7 man page will follow separately,
    because of limitations of my mail setup.

    Signed-off-by: Eliezer Tamir
    Signed-off-by: David S. Miller

    Eliezer Tamir
     

10 Jul, 2013

1 commit

  • Pull networking updates from David Miller:
    "This is a re-do of the net-next pull request for the current merge
    window. The only difference from the one I made the other day is that
    this has Eliezer's interface renames and the timeout handling changes
    made based upon your feedback, as well as a few bug fixes that have
    trickeled in.

    Highlights:

    1) Low latency device polling, eliminating the cost of interrupt
    handling and context switches. Allows direct polling of a network
    device from socket operations, such as recvmsg() and poll().

    Currently ixgbe, mlx4, and bnx2x support this feature.

    Full high level description, performance numbers, and design in
    commit 0a4db187a999 ("Merge branch 'll_poll'")

    From Eliezer Tamir.

    2) With the routing cache removed, ip_check_mc_rcu() gets exercised
    more than ever before in the case where we have lots of multicast
    addresses. Use a hash table instead of a simple linked list, from
    Eric Dumazet.

    3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from
    Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
    Marek Puzyniak, Michal Kazior, and Sujith Manoharan.

    4) Support reporting the TUN device persist flag to userspace, from
    Pavel Emelyanov.

    5) Allow controlling network device VF link state using netlink, from
    Rony Efraim.

    6) Support GRE tunneling in openvswitch, from Pravin B Shelar.

    7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
    Daniel Borkmann and Eric Dumazet.

    8) Allow controlling of TCP quickack behavior on a per-route basis,
    from Cong Wang.

    9) Several bug fixes and improvements to vxlan from Stephen
    Hemminger, Pravin B Shelar, and Mike Rapoport. In particular,
    support receiving on multiple UDP ports.

    10) Major cleanups, particular in the area of debugging and cookie
    lifetime handline, to the SCTP protocol code. From Daniel
    Borkmann.

    11) Allow packets to cross network namespaces when traversing tunnel
    devices. From Nicolas Dichtel.

    12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
    manner akin to how we monitor real network traffic via ptype_all.
    From Daniel Borkmann.

    13) Several bug fixes and improvements for the new alx device driver,
    from Johannes Berg.

    14) Fix scalability issues in the netem packet scheduler's time queue,
    by using an rbtree. From Eric Dumazet.

    15) Several bug fixes in TCP loss recovery handling, from Yuchung
    Cheng.

    16) Add support for GSO segmentation of MPLS packets, from Simon
    Horman.

    17) Make network notifiers have a real data type for the opaque
    pointer that's passed into them. Use this to properly handle
    network device flag changes in arp_netdev_event(). From Jiri
    Pirko and Timo Teräs.

    18) Convert several drivers over to module_pci_driver(), from Peter
    Huewe.

    19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
    O(1) calculation instead. From Eric Dumazet.

    20) Support setting of explicit tunnel peer addresses in ipv6, just
    like ipv4. From Nicolas Dichtel.

    21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.

    22) Prevent a single high rate flow from overruning an individual cpu
    during RX packet processing via selective flow shedding. From
    Willem de Bruijn.

    23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
    Dumazet.

    24) Don't just drop GSO packets which are above the TBF scheduler's
    burst limit, chop them up so they are in-bounds instead. Also
    from Eric Dumazet.

    25) VLAN offloads are missed when configured on top of a bridge, fix
    from Vlad Yasevich.

    26) Support IPV6 in ping sockets. From Lorenzo Colitti.

    27) Receive flow steering targets should be updated at poll() time
    too, from David Majnemer.

    28) Fix several corner case regressions in PMTU/redirect handling due
    to the routing cache removal, from Timo Teräs.

    29) We have to be mindful of ipv4 mapped ipv6 sockets in
    upd_v6_push_pending_frames(). From Hannes Frederic Sowa.

    30) Fix L2TP sequence number handling bugs, from James Chapman."

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
    drivers/net: caif: fix wrong rtnl_is_locked() usage
    drivers/net: enic: release rtnl_lock on error-path
    vhost-net: fix use-after-free in vhost_net_flush
    net: mv643xx_eth: do not use port number as platform device id
    net: sctp: confirm route during forward progress
    virtio_net: fix race in RX VQ processing
    virtio: support unlocked queue poll
    net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
    Documentation: Fix references to defunct linux-net@vger.kernel.org
    net/fs: change busy poll time accounting
    net: rename low latency sockets functions to busy poll
    bridge: fix some kernel warning in multicast timer
    sfc: Fix memory leak when discarding scattered packets
    sit: fix tunnel update via netlink
    dt:net:stmmac: Add dt specific phy reset callback support.
    dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
    dt:net:stmmac: Allocate platform data only if its NULL.
    net:stmmac: fix memleak in the open method
    ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
    net: ipv6: fix wrong ping_v6_sendmsg return value
    ...

    Linus Torvalds
     

04 Jul, 2013

7 commits

  • VALID_PAGE() has been removed from kernel long time ago, so clean up it.

    Signed-off-by: Jiang Liu
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Nadia Yvette Chambers
    Cc: Jiri Kosina
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Now mem_init() for both Alpha UMA and Alpha NUMA are the same, so unify it
    to reduce duplicated code.

    Signed-off-by: Jiang Liu
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Prepare for removing num_physpages and simplify mem_init().

    Signed-off-by: Jiang Liu
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Concentrate code to modify totalram_pages into the mm core, so the arch
    memory initialized code doesn't need to take care of it. With these
    changes applied, only following functions from mm core modify global
    variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
    free_all_bootmem_node(), adjust_managed_page_count().

    With this patch applied, it will be much more easier for us to keep
    totalram_pages and zone->managed_pages in consistence.

    Signed-off-by: Jiang Liu
    Acked-by: David Howells
    Cc: "H. Peter Anvin"
    Cc: "Michael S. Tsirkin"
    Cc:
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: Geert Uytterhoeven
    Cc: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Joonsoo Kim
    Cc: Kamezawa Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Marek Szyprowski
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Cc: Minchan Kim
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tang Chen
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wen Congyang
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Yinghai Lu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Address more review comments from last round of code review.
    1) Enhance free_reserved_area() to support poisoning freed memory with
    pattern '0'. This could be used to get rid of poison_init_mem()
    on ARM64.
    2) A previous patch has disabled memory poison for initmem on s390
    by mistake, so restore to the original behavior.
    3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().

    Signed-off-by: Jiang Liu
    Cc: Geert Uytterhoeven
    Cc: "H. Peter Anvin"
    Cc: "Michael S. Tsirkin"
    Cc:
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: David Howells
    Cc: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Joonsoo Kim
    Cc: Kamezawa Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Marek Szyprowski
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Cc: Minchan Kim
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tang Chen
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wen Congyang
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Yinghai Lu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Change signature of free_reserved_area() according to Russell King's
    suggestion to fix following build warnings:

    arch/arm/mm/init.c: In function 'mem_init':
    arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
    free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
    ^
    In file included from include/linux/mman.h:4:0,
    from arch/arm/mm/init.c:15:
    include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
    extern unsigned long free_reserved_area(unsigned long start, unsigned long end,

    mm/page_alloc.c: In function 'free_reserved_area':
    >> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
    In file included from arch/mips/include/asm/page.h:49:0,
    from include/linux/mmzone.h:20,
    from include/linux/gfp.h:4,
    from include/linux/mm.h:8,
    from mm/page_alloc.c:18:
    arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
    mm/page_alloc.c: In function 'free_area_init_nodes':
    mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]

    Also address some minor code review comments.

    Signed-off-by: Jiang Liu
    Reported-by: Arnd Bergmann
    Cc: "H. Peter Anvin"
    Cc: "Michael S. Tsirkin"
    Cc:
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Joonsoo Kim
    Cc: Kamezawa Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Marek Szyprowski
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Cc: Minchan Kim
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tang Chen
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wen Congyang
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Yinghai Lu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Pull second set of VFS changes from Al Viro:
    "Assorted f_pos race fixes, making do_splice_direct() safe to call with
    i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
    ->d_hash/->d_compare calling conventions changes from Linus, misc
    stuff all over the place."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    Document ->tmpfile()
    ext4: ->tmpfile() support
    vfs: export lseek_execute() to modules
    lseek_execute() doesn't need an inode passed to it
    block_dev: switch to fixed_size_llseek()
    cpqphp_sysfs: switch to fixed_size_llseek()
    tile-srom: switch to fixed_size_llseek()
    proc_powerpc: switch to fixed_size_llseek()
    ubi/cdev: switch to fixed_size_llseek()
    pci/proc: switch to fixed_size_llseek()
    isapnp: switch to fixed_size_llseek()
    lpfc: switch to fixed_size_llseek()
    locks: give the blocked_hash its own spinlock
    locks: add a new "lm_owner_key" lock operation
    locks: turn the blocked_list into a hashtable
    locks: convert fl_link to a hlist_node
    locks: avoid taking global lock if possible when waking up blocked waiters
    locks: protect most of the file_lock handling with i_lock
    locks: encapsulate the fl_link list handling
    locks: make "added" in __posix_lock_file a bool
    ...

    Linus Torvalds
     

03 Jul, 2013

1 commit

  • Pull tty/serial updates from Greg KH:
    "Here is the big TTY / Serial driver merge for 3.11-rc1.

    It's not all that big, nothing major changed in the tty api, which is
    a nice change, just a number of serial driver fixes and updates and
    new drivers, along with some n_tty fixes to help resolve some reported
    issues.

    All of these have been in the linux-next releases for a while, with
    the exception of the last revert patch, which was reported this past
    weekend by two different people as being needed."

    * tag 'tty-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (51 commits)
    Revert "serial: 8250_pci: add support for another kind of NetMos Technology PCI 9835 Multi-I/O Controller"
    pch_uart: Add uart_clk selection for the MinnowBoard
    tty: atmel_serial: prepare clk before calling enable
    tty: Reset itty for other pty
    n_tty: Buffer work should not reschedule itself
    n_tty: Fix unsafe update of available buffer space
    n_tty: Untangle read completion variables
    n_tty: Encapsulate minimum_to_wake within N_TTY
    serial: omap: Fix device tree based PM runtime
    serial: imx: Fix serial clock unbalance
    serial/mpc52xx_uart: fix kernel panic when system reboot
    serial: mfd: Add sysrq support
    serial: imx: enable the clocks for console
    tty: serial: add Freescale lpuart driver support
    serial: imx: Improve Kconfig text
    serial: imx: Allow module build
    serial: imx: Fix warning when !CONFIG_SERIAL_IMX_CONSOLE
    tty/serial/sirf: fix error propagation in sirfsoc_uart_probe()
    serial: omap: fix potential NULL pointer dereference in serial_omap_runtime_suspend()
    tty: serial: Enable uartlite for ARM zynq
    ...

    Linus Torvalds
     

29 Jun, 2013

5 commits


18 Jun, 2013

1 commit


21 May, 2013

1 commit

  • Impact:
    1:convert all remain take_over_console to do_take_over_console
    2:update take_over_console to do_take_over_console in comment

    Commit dc9641895abb ("vt: delete unneeded functions
    register_con_driver|take_over_console") delete take_over_console,
    but forget to convert remain take_over_console's users to new API
    do_take_over_console, this patch fix it.

    Reported-by: Stephen Rothwell
    Signed-off-by: Wang YanQing
    Signed-off-by: Greg Kroah-Hartman

    Wang YanQing