01 Jun, 2013

4 commits

  • 'boot_args' is an input argument, and 'boot_command_line' has a fixed
    length. So use strlcpy() instead of strcpy() to avoid a buffer overflow.

    Signed-off-by: Chen Gang
    Acked-by: Kyle McMartin
    Signed-off-by: Helge Deller

    Chen Gang
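
    The difference between an unbounded and a bounded copy can be sketched in
    userspace C. bounded_copy() below is a hypothetical stand-in that mimics
    strlcpy()-style semantics; it is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper with strlcpy()-like semantics: copy at most size - 1
 * bytes, always NUL-terminate when size > 0, and return strlen(src) so the
 * caller can detect truncation (return value >= size).
 */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

    With a fixed-size destination, an oversized input is truncated instead of
    being written past the end of the buffer.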
     
  • There's a Makefile line setting cflags for CONFIG_PA7100. But that
    Kconfig macro doesn't exist. There is a Kconfig symbol PA7000, which
    covers both PA7000 and PA7100 processors. So let's use the corresponding
    Kconfig macro.

    Signed-off-by: Paul Bolle
    Signed-off-by: Helge Deller

    Paul Bolle
     
  • With CONFIG_DISCONTIGMEM=y and multiple physical memory areas,
    cat /proc/kpageflags triggers this kernel bug:

    kernel BUG at arch/parisc/include/asm/mmzone.h:50!
    CPU: 2 PID: 7848 Comm: cat Tainted: G D W 3.10.0-rc3-64bit #44
    IAOQ[0]: kpageflags_read+0x128/0x238
    IAOQ[1]: kpageflags_read+0x12c/0x238
    RP(r2): proc_reg_read+0xbc/0x130
    Backtrace:
    [] proc_reg_read+0xbc/0x130
    [] vfs_read+0xc4/0x1d0
    [] SyS_read+0x94/0xf0
    [] syscall_exit+0x0/0x14

    kpageflags_read() walks through the whole memory, even if some memory
    areas are physically not available. So we had better not BUG on an
    unavailable pfn in pfn_to_nid(), but just return the expected value,
    -1 or 0.

    Signed-off-by: Helge Deller

    Helge Deller
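
    The idea of returning a sentinel for an unavailable pfn, rather than
    stopping the kernel, might look roughly like the following userspace
    sketch. The struct mem_range table and pfn_to_nid_sketch() are
    hypothetical; the real lookup in mmzone.h is organised differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one physically present memory area. */
struct mem_range {
    unsigned long start_pfn;
    unsigned long pages;
    int nid;
};

/*
 * Return the node id for pfn, or -1 when pfn falls into a hole between the
 * ranges. A caller that scans every pfn can then skip the hole.
 */
int pfn_to_nid_sketch(const struct mem_range *ranges, size_t count,
                      unsigned long pfn)
{
    assert(ranges != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (pfn >= ranges[i].start_pfn &&
            pfn - ranges[i].start_pfn < ranges[i].pages)
            return ranges[i].nid;
    }
    return -1;
}
```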
     
  • 'path.bc[i]' can be assigned by PCI_SLOT(), which can be greater than
    10, so the formatted length of 6 * "%u:" + "%u" + '\0' may be 21.

    Since the length of 'name' is 20, this may overflow the buffer.

    And since 'path.bc[i]' is an 'unsigned char' when printed, we can be
    sure the maximum length of 'name' must be less than 28.

    So, to keep things simple, we can use 28 instead of 20 directly, and
    need not consider whether 'path.bc[i]' can be greater than 100.

    Signed-off-by: Chen Gang
    Signed-off-by: Helge Deller

    Chen Gang
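
    The buffer arithmetic above can be checked with a small userspace sketch.
    format_path_sketch() and PATH_ELEMS are hypothetical names; the sketch
    assumes six "%u:" fields followed by one "%u" field, as the commit text
    describes, and uses snprintf() so that a short buffer is truncated rather
    than overrun.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PATH_ELEMS 6  /* number of bc[] entries, as in the commit text */

/*
 * Format "bc0:bc1:...:bc5:mod" into buf and return the number of characters
 * the full string needs, excluding the terminating NUL. A return value
 * >= size means the output was truncated.
 */
size_t format_path_sketch(char *buf, size_t size,
                          const unsigned char bc[PATH_ELEMS],
                          unsigned char mod)
{
    size_t need = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (int i = 0; i <= PATH_ELEMS; i++) {
        size_t room = (need < size) ? size - need : 0;
        char *dst = room ? buf + need : NULL;
        int n;

        if (i < PATH_ELEMS)
            n = snprintf(dst, room, "%u:", (unsigned int)bc[i]);
        else
            n = snprintf(dst, room, "%u", (unsigned int)mod);
        need += (size_t)n;
    }
    return need;
}
```

    With two-digit values everywhere the string needs 20 characters plus the
    NUL, which no longer fits in 20 bytes; with 255 everywhere it needs 27
    characters plus the NUL, which fits in 28 bytes.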
     

25 May, 2013

5 commits

  • The logic to detect if the irq stack was already in use with
    raw_spin_trylock() is wrong, because it will generate a "trylock failure
    on UP" error message with CONFIG_SMP=n and CONFIG_DEBUG_SPINLOCK=y.

    arch_spin_trylock() can't be used either since in the CONFIG_SMP=n case
    no atomic protection is given and we are reentrant here. A mutex didn't
    work either and brings more overhead by turning off interrupts.

    So, let's use the fastest path for parisc which is the ldcw instruction.

    Counting how often the irq stack was used is pretty useless, so just
    drop this piece of code.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • The get_stack_use_cr30 and get_stack_use_r30 macros allocate a stack
    frame for external interrupts and interruptions requiring a stack frame.
    They are currently not reentrant in that they save register context
    before the stack is set or adjusted.

    I have observed a number of system crashes where there was clear
    evidence of stack corruption during interrupt processing, and as a
    result register corruption. Some interruptions can still occur during
    interruption processing, however external interrupts are disabled and
    data TLB misses don't occur for absolute accesses. So, it's not entirely
    clear what triggers this issue. Also, if an interruption occurs when
    Q=0, it is generally not possible to recover as the shadowed registers
    are not copied.

    The attached patch reworks the get_stack_use_cr30 and get_stack_use_r30
    macros to allocate stack before doing register saves. The new code is a
    couple of instructions shorter than the old implementation. Thus, it's
    an improvement even if it doesn't fully resolve the stack corruption
    issue. Based on limited testing, it improves SMP system stability.

    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller

    John David Anglin
     
  • Show the number of floating point assist and unaligned access fixup
    handler calls in the /proc/interrupts file.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • additionally clean up some whitespaces & tabs.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • Signed-off-by: Helge Deller

    Helge Deller
     

14 May, 2013

1 commit

  • People/distros vary in how they prefix the toolchain name for 64bit builds.
    Rather than enforce one convention over another, add a for loop which
    does a search for all the general prefixes.

    For 64bit builds, we now search for (in order):
    hppa64-unknown-linux-gnu
    hppa64-linux-gnu
    hppa64-linux

    For 32bit builds, we look for:
    hppa-unknown-linux-gnu
    hppa-linux-gnu
    hppa-linux
    hppa2.0-unknown-linux-gnu
    hppa2.0-linux-gnu
    hppa2.0-linux
    hppa1.1-unknown-linux-gnu
    hppa1.1-linux-gnu
    hppa1.1-linux

    This patch was initiated by Mike Frysinger, with feedback from Jeroen
    Roovers, John David Anglin and Helge Deller.

    Signed-off-by: Mike Frysinger
    Signed-off-by: Jeroen Roovers
    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller

    Helge Deller
     

12 May, 2013

2 commits

  • Currently, race conditions exist in the handling of TLB interruptions in
    entry.S. In particular, dirty bit updates can be lost if an accessed
    interruption occurs just after the dirty bit interruption on a different
    cpu. Lost dirty bit updates result in user pages not being flushed and
    general system instability. This change adds lock and unlock macros to
    synchronize all PTE and TLB updates done in entry.S. As a result,
    userspace stability is significantly improved.

    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller

    John David Anglin
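
    The lost update described above can be illustrated with a deterministic
    userspace simulation. The bit values and function names are hypothetical;
    the point is only the ordering of loads and stores, not the actual entry.S
    code.

```c
#include <assert.h>

#define PTE_ACCESSED_SKETCH 0x1u  /* hypothetical bit values */
#define PTE_DIRTY_SKETCH    0x2u

/*
 * Unsynchronised interleaving: both handlers load the old PTE before either
 * one stores, so the later store overwrites the earlier one.
 */
unsigned int racy_update(unsigned int pte)
{
    unsigned int seen_by_dirty = pte;     /* CPU A: load */
    unsigned int seen_by_accessed = pte;  /* CPU B: load */

    pte = seen_by_dirty | PTE_DIRTY_SKETCH;        /* CPU A: store */
    pte = seen_by_accessed | PTE_ACCESSED_SKETCH;  /* CPU B: store, dirty lost */
    return pte;
}

/* With a lock, each load/modify/store sequence finishes before the next. */
unsigned int locked_update(unsigned int pte)
{
    pte |= PTE_DIRTY_SKETCH;     /* CPU A, holding the lock */
    pte |= PTE_ACCESSED_SKETCH;  /* CPU B, holding the lock */
    return pte;
}
```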
     
  • This patch fixes a few build issues which were introduced with the last
    irq stack patch, e.g. the combination of stack overflow check and irq
    stack.

    Furthermore we now do proper locking and change the irq bh handler
    to use the irq stack as well.

    In /proc/interrupts one can now monitor how large the irq stack has grown
    and how often it was preferred over the kernel stack.

    IRQ stacks are now enabled by default just to make sure that we do not
    overflow the kernel stack by accident.

    Signed-off-by: Helge Deller

    Helge Deller
     

11 May, 2013

1 commit


10 May, 2013

1 commit


08 May, 2013

5 commits

  • Fix up build error on UP and correctly show the number of function call
    (ipi) irqs.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • Add framework and initial values for more fine grained statistics in
    /proc/interrupts.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • Default kernel stack size on parisc is 16k. During tests we found that the
    kernel stack can easily grow beyond 13k, which leaves only 3k for irq
    processing.

    This patch adds the possibility to activate an additional stack of 16k per CPU
    which is used during irq processing. This implementation does not yet
    use this irq stack for the irq bh handler.

    The assembler code for call_on_stack was heavily cleaned up by John
    David Anglin.

    CC: John David Anglin
    Signed-off-by: Helge Deller

    Helge Deller
     
  • Add the CONFIG_DEBUG_STACKOVERFLOW config option to enable checks to
    detect kernel stack overflows.

    Stack overflows cannot be detected reliably since we do not want to
    introduce too much overhead.

    Instead, during irq processing in do_cpu_irq_mask() we check kernel
    stack usage of the interrupted kernel process. Kernel threads can be
    easily detected by checking the value of space register 7 (sr7) which
    is zero when running inside the kernel.

    Since THREAD_SIZE is 16k and PAGE_SIZE is 4k, reduce the alignment of
    the init thread to the lower value (PAGE_SIZE) in the kernel
    vmlinux.ld.S linker script.

    Signed-off-by: Helge Deller

    Helge Deller
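
    A check of this kind could be sketched as follows. The 16k size and the
    sr7 test come from the commit text; the margin value and all names are
    hypothetical, and the sketch assumes the stack grows upwards from
    stack_start.

```c
#include <assert.h>

#define THREAD_SIZE_SKETCH   (16u * 1024u)  /* 16k, from the commit text */
#define STACK_MARGIN_SKETCH  1024u          /* hypothetical warning margin */

/*
 * Returns 1 when an interrupted kernel context has less than the margin left
 * on its stack, 0 otherwise. sr7 == 0 is taken to mean "was running inside
 * the kernel"; any other value means the interrupted context was userspace.
 */
int stack_nearly_full(unsigned long stack_start, unsigned long sp,
                      unsigned long sr7)
{
    if (sr7 != 0)
        return 0;  /* interrupted userspace: not a kernel stack */
    assert(sp >= stack_start);
    return (sp - stack_start) >= (THREAD_SIZE_SKETCH - STACK_MARGIN_SKETCH);
}
```

    Because the check only runs at interrupt time, an overflow that happens
    and unwinds between two interrupts goes unnoticed, which is the
    reliability limit the commit text mentions.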
     
  • … returning to userspace

    Helge and I have found that we have a kernel stack overflow problem
    which causes a variety of random failures. Currently, we re-enable
    interrupts when returning from an external interrupt in case we need to
    schedule or deliver signals. As a result, a potentially unlimited number
    of interrupts can occur while we are running on the kernel stack. It is
    very limited in space (currently, 16k). This change defers enabling
    interrupts until we have actually decided to schedule or deliver
    signals. This only occurs when we are about to return to userspace. This
    limits the number of interrupts on the kernel stack to one. In other
    cases, interrupts remain disabled until the final return from interrupt
    (rfi).

    Signed-off-by: John David Anglin <dave.anglin@bell.net>
    Signed-off-by: Helge Deller <deller@gmx.de>

    John David Anglin
     

07 May, 2013

8 commits

  • Signed-off-by: Helge Deller

    Helge Deller
     
  • The "b" branch instruction used in the fork_like macro can only handle
    17-bit pc-relative offsets. This fails with an out of range offset with
    some .config files. Rewrite to use the "be" instruction which can branch
    to any address in a space.

    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller

    John David Anglin
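
    Why a 17-bit field runs out of range can be seen from a generic check for
    whether a value fits in a signed field of a given width. This is a
    simplified sketch: fits_signed_field() is a hypothetical helper, and the
    scaling of the displacement and the base address used by the real
    instruction encoding are deliberately not modelled.

```c
#include <assert.h>

/* Returns 1 if value fits in a two's-complement field of the given width. */
int fits_signed_field(long value, unsigned int bits)
{
    long limit;

    assert(bits >= 2 && bits < 8 * sizeof(long));
    limit = 1L << (bits - 1);
    return value >= -limit && value < limit;
}
```

    Once the linker places the target further away than the field can
    express, the short branch form can no longer be assembled.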
     
  • The ifeq operator does not accept globs, so this little bit of code will
    never match (unless uname literally prints out "parisc*"). Rewrite to
    use a pattern matching operator so that NATIVE is set to 1 on parisc.

    Signed-off-by: Mike Frysinger
    Signed-off-by: Helge Deller

    Mike Frysinger
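
    The same distinction between a literal comparison and a pattern match
    exists in C, which may make the bug easier to see. This is only an
    analogy using POSIX fnmatch(); the actual fix is in the Makefile and uses
    make's own pattern matching. The function names are hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Literal comparison: "parisc*" only equals the exact string "parisc*". */
int matches_literally(const char *pattern, const char *machine)
{
    assert(pattern != NULL && machine != NULL);
    return strcmp(pattern, machine) == 0;
}

/* Glob comparison: "parisc*" matches "parisc", "parisc64", and so on. */
int matches_as_glob(const char *pattern, const char *machine)
{
    assert(pattern != NULL && machine != NULL);
    return fnmatch(pattern, machine, 0) == 0;
}
```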
     
  • Include some documentation about how the parisc gateway page technically
    works and how it is used from userspace.

    James Bottomley is the original author of this description and it was
    copied here out of an email thread from Apr 12 2013 titled:
    man2 : syscall.2 : document syscall calling conventions

    CC: James Bottomley
    Signed-off-by: Helge Deller

    Helge Deller
     
  • This patch partly fixes PAGE_SIZEs of 16K or 64K by adjusting the
    assembler PTE lookup code and the assembler TEMPALIAS code. Furthermore
    some data alignments for PAGE_SIZE have been limited to 4K (or less) to
    not waste too much memory with greater page sizes. As a side note, the
    palo loader can (currently) only handle up to 10 ELF segments which is
    fixed with tighter aligning as well.

    My testing indicated that the ldci command in the sba iommu code
    needed adjustment by the PAGE_SHIFT value and that the I/O PDIR Page
    size was only set to 4K for my machine (C3000).

    All this partly fixes the boot, but there are still quite a few caching
    problems left. Examples are the symbios logic driver, which is
    failing:

    sym0: rev 0x7 at pci 0000:00:0f.0 irq 69
    sym0: PA-RISC Firmware, ID 7, Fast-40, SE, parity checking
    CACHE TEST FAILED: DMA error (dstat=0x81).sym0: CACHE INCORRECTLY CONFIGURED.

    and the tulip network driver which doesn't seem to work correctly
    either:

    Sending BOOTP requests .net eth0: Setting full-duplex based on MII#1
    link partner capability of 05e1
    ..... timed out!

    Besides those kernel fixes, glibc will need fixes too to be able to
    handle >4K page sizes.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • Most architectures that define CONFIG_HAVE_DMA have implementations for
    both dma_alloc_attrs() and dma_free_attrs(). All architectures that do
    not define CONFIG_HAVE_DMA also have both of these definitions provided by
    dma-mapping-broken.h.

    Add default implementations for these functions on parisc.

    Signed-off-by: Damian Hobson-Garcia
    Signed-off-by: Helge Deller

    Damian Hobson-Garcia
     
  • Things like " \t" and whitespace at end of line. I'm leaving all the other
    coding style errors here alone.

    Signed-off-by: Rolf Eike Beer
    Signed-off-by: Helge Deller

    Rolf Eike Beer
     
  • kmap_atomic allows only one argument now, just remove the second.

    Signed-off-by: Zhao Hongjiang
    Signed-off-by: Helge Deller

    Zhao Hongjiang
     

05 May, 2013

1 commit


02 May, 2013

2 commits

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     
  • Pull networking updates from David Miller:
    "Highlights (1721 non-merge commits, this has to be a record of some
    sort):

    1) Add 'random' mode to team driver, from Jiri Pirko and Eric
    Dumazet.

    2) Make it so that any driver that supports configuration of multiple
    MAC addresses can provide the forwarding database add and del
    calls by providing a default implementation and hooking that up if
    the driver doesn't have an explicit set of handlers. From Vlad
    Yasevich.

    3) Support GSO segmentation over tunnels and other encapsulating
    devices such as VXLAN, from Pravin B Shelar.

    4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

    5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
    Dukkipati.

    6) In the PHY layer, allow supporting wake-on-lan in situations where
    the PHY registers have to be written for it to be configured.

    Use it to support wake-on-lan in mv643xx_eth.

    From Michael Stapelberg.

    7) Significantly improve firewire IPV6 support, from YOSHIFUJI
    Hideaki.

    8) Allow multiple packets to be sent in a single transmission using
    network coding in batman-adv, from Martin Hundebøll.

    9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

    10) Generalize the VXLAN forwarding tables so that there is more
    flexibility in configuring various aspects of the endpoints.
    From David Stevens.

    11) Support RSS and TSO in hardware over GRE tunnels in bnx2x driver,
    from Dmitry Kravkov.

    12) Zero copy support in nfnetlink_queue, from Eric Dumazet and Pablo
    Neira Ayuso.

    13) Start adding networking selftests.

    14) In situations of overload on the same AF_PACKET fanout socket, or
    per-cpu packet receive queue, minimize drop by distributing the
    load to other cpus/fanouts. From Willem de Bruijn and Eric
    Dumazet.

    15) Add support for new payload offset BPF instruction, from Daniel
    Borkmann.

    16) Convert several drivers over to module_platform_driver(), from
    Sachin Kamat.

    17) Provide a minimal BPF JIT image disassembler userspace tool, from
    Daniel Borkmann.

    18) Rewrite F-RTO implementation in TCP to match the final
    specification of it in RFC4138 and RFC5682. From Yuchung Cheng.

    19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
    you like netlink, so I implemented netlink dumping of netlink
    sockets.") From Andrey Vagin.

    20) Remove ugly passing of rtnetlink attributes into rtnl_doit
    functions, from Thomas Graf.

    21) Allow userspace to be able to see if a configuration change occurs
    in the middle of an address or device list dump, from Nicolas
    Dichtel.

    22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
    Frederic Sowa.

    23) Increase accuracy of packet length used by packet scheduler, from
    Jason Wang.

    24) Beginning set of changes to make ipv4/ipv6 fragment handling more
    scalable and less susceptible to overload and locking contention,
    from Jesper Dangaard Brouer.

    25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
    instead. From Hong Zhiguo.

    26) Optimize route usage in IPVS by avoiding reference counting where
    possible, from Julian Anastasov.

    27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

    28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
    Eitzenberger.

    29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
    nfnetlink_log, and nfnetlink_queue. From Gao feng.

    30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

    31) Support several new r8169 chips, from Hayes Wang.

    32) Support tokenized interface identifiers in ipv6, from Daniel
    Borkmann.

    33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

    34) Add 802.1ad vlan offload support, from Patrick McHardy.

    35) Support mmap() based netlink communication, also from Patrick
    McHardy.

    36) Support HW timestamping in mlx4 driver, from Amir Vadai.

    37) Rationalize AF_PACKET packet timestamping when transmitting, from
    Willem de Bruijn and Daniel Borkmann.

    38) Bring parity to what's provided by /proc/net/packet socket dumping
    and the info provided by netlink socket dumping of AF_PACKET
    sockets. From Nicolas Dichtel.

    39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
    Poirier"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
    filter: fix va_list build error
    af_unix: fix a fatal race with bit fields
    bnx2x: Prevent memory leak when cnic is absent
    bnx2x: correct reading of speed capabilities
    net: sctp: attribute printl with __printf for gcc fmt checks
    netlink: kconfig: move mmap i/o into netlink kconfig
    netpoll: convert mutex into a semaphore
    netlink: Fix skb ref counting.
    net_sched: act_ipt forward compat with xtables
    mlx4_en: fix a build error on 32bit arches
    Revert "bnx2x: allow nvram test to run when device is down"
    bridge: avoid OOPS if root port not found
    drivers: net: cpsw: fix kernel warn on cpsw irq enable
    sh_eth: use random MAC address if no valid one supplied
    3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
    tg3: fix to append hardware time stamping flags
    unix/stream: fix peeking with an offset larger than data in queue
    unix/dgram: fix peeking with an offset larger than data in queue
    unix/dgram: peek beyond 0-sized skbs
    openvswitch: Remove unneeded ovs_netdev_get_ifindex()
    ...

    Linus Torvalds
     

01 May, 2013

4 commits

  • Pull compat cleanup from Al Viro:
    "Mostly about syscall wrappers this time; there will be another pile
    with patches in the same general area from various people, but I'd
    rather push those after both that and vfs.git pile are in."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
    syscalls.h: slightly reduce the jungles of macros
    get rid of union semop in sys_semctl(2) arguments
    make do_mremap() static
    sparc: no need to sign-extend in sync_file_range() wrapper
    ppc compat wrappers for add_key(2) and request_key(2) are pointless
    x86: trim sys_ia32.h
    x86: sys32_kill and sys32_mprotect are pointless
    get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC
    merge compat sys_ipc instances
    consolidate compat lookup_dcookie()
    convert vmsplice to COMPAT_SYSCALL_DEFINE
    switch getrusage() to COMPAT_SYSCALL_DEFINE
    switch epoll_pwait to COMPAT_SYSCALL_DEFINE
    convert sendfile{,64} to COMPAT_SYSCALL_DEFINE
    switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE
    make SYSCALL_DEFINE-generated wrappers do asmlinkage_protect
    make HAVE_SYSCALL_WRAPPERS unconditional
    consolidate cond_syscall and SYSCALL_ALIAS declarations
    teach SYSCALL_DEFINE how to deal with long long/unsigned long long
    get rid of duplicate logics in __SC_....[1-6] definitions

    Linus Torvalds
     
  • The help text for this config is duplicated across the x86, parisc, and
    s390 Kconfig.debug files. Arnd Bergmann noted that the help text was
    slightly misleading and should be fixed to state that enabling this
    option isn't a problem when using pre 4.4 gcc.

    To simplify the rewording, consolidate the text into lib/Kconfig.debug
    and modify it there to be more explicit about when you should say N to
    this config.

    Also, make the text a bit more generic by stating that this option
    enables compile time checks so we can cover architectures which emit
    warnings vs. ones which emit errors. The details of how an
    architecture decided to implement the checks isn't as important as the
    concept of compile time checking of copy_from_user() calls.

    While we're doing this, remove all the copy_from_user_overflow() code
    that's duplicated many times and place it into lib/ so that any
    architecture supporting this option can get the function for free.

    Signed-off-by: Stephen Boyd
    Acked-by: Arnd Bergmann
    Acked-by: Ingo Molnar
    Acked-by: H. Peter Anvin
    Cc: Arjan van de Ven
    Acked-by: Helge Deller
    Cc: Heiko Carstens
    Cc: Stephen Rothwell
    Cc: Chris Metcalf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     
  • show_regs() is inherently arch-dependent but it does make sense to print
    generic debug information and some archs already do albeit in slightly
    different forms. This patch introduces a generic function to print debug
    information from show_regs() so that different archs print out the same
    information and it's much easier to modify what's printed.

    show_regs_print_info() prints out the same debug info as dump_stack()
    does plus task and thread_info pointers.

    * Archs which didn't print debug info now do.

    alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
    metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
    um, xtensa

    * Already prints debug info. Replaced with show_regs_print_info().
    The printed information is superset of what used to be there.

    arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86

    * s390 is special in that it used to print arch-specific information
    along with generic debug info. Heiko and Martin think that the
    arch-specific extra isn't worth keeping the s390-specific implementation.
    Converted to use the generic version.

    Note that now all archs print the debug info before actual register
    dumps.

    An example BUG() dump follows.

    kernel BUG at /work/os/work/kernel/workqueue.c:4841!
    invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
    Hardware name: empty empty/S3992, BIOS 080011 10/26/2007
    task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
    RIP: 0010:[] [] init_workqueues+0x4/0x6
    RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246
    RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
    RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
    RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Stack:
    ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
    0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
    ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
    Call Trace:
    [] do_one_initcall+0x122/0x170
    [] kernel_init_freeable+0x9b/0x1c8
    [] ? rest_init+0x140/0x140
    [] kernel_init+0xe/0xf0
    [] ret_from_fork+0x7c/0xb0
    [] ? rest_init+0x140/0x140
    ...

    v2: Typo fix in x86-32.

    v3: CPU number dropped from show_regs_print_info() as
    dump_stack_print_info() has been updated to print it. s390
    specific implementation dropped as requested by s390 maintainers.

    Signed-off-by: Tejun Heo
    Acked-by: David S. Miller
    Acked-by: Jesper Nilsson
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: Bjorn Helgaas
    Cc: Fengguang Wu
    Cc: Mike Frysinger
    Cc: Vineet Gupta
    Cc: Sam Ravnborg
    Acked-by: Chris Metcalf [tile bits]
    Acked-by: Richard Kuo [hexagon bits]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Both dump_stack() and show_stack() are currently implemented by each
    architecture. show_stack(NULL, NULL) dumps the backtrace for the
    current task as does dump_stack(). On some archs, dump_stack() prints
    extra information - pid, utsname and so on - in addition to the
    backtrace while the two are identical on other archs.

    The usages in arch-independent code of the two functions indicate
    show_stack(NULL, NULL) should print out bare backtrace while
    dump_stack() is used for debugging purposes when something went wrong,
    so it does make sense to print additional information on the task which
    triggered dump_stack().

    There's no reason to require archs to implement two separate but mostly
    identical functions. It leads to unnecessary subtle information.

    This patch expands the dummy fallback dump_stack() implementation in
    lib/dump_stack.c such that it prints out debug information (taken from
    x86) and invokes show_stack(NULL, NULL) and drops arch-specific
    dump_stack() implementations in all archs except blackfin. Blackfin's
    dump_stack() does something wonky that I don't understand.

    Debug information can be printed separately by calling
    dump_stack_print_info() so that arch-specific dump_stack()
    implementation can still emit the same debug information. This is used
    in blackfin.

    This patch brings the following behavior changes.

    * On some archs, an extra level in backtrace for show_stack() could be
    printed. This is because the top frame was determined in
    dump_stack() on those archs while generic dump_stack() can't do that
    reliably. It can be compensated by inlining dump_stack() but not
    sure whether that'd be necessary.

    * Most archs didn't use to print debug info on dump_stack(). They do
    now.

    An example WARN dump follows.

    WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
    Hardware name: empty
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9
    0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
    ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c
    0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
    Call Trace:
    [] dump_stack+0x19/0x1b
    [] warn_slowpath_common+0x7f/0xc0
    [] warn_slowpath_null+0x1a/0x20
    [] init_workqueues+0x35/0x505
    ...

    v2: CPU number added to the generic debug info as requested by s390
    folks and dropped the s390 specific dump_stack(). This loses %ksp
    from the debug message which the maintainers think isn't important
    enough to keep the s390-specific dump_stack() implementation.

    dump_stack_print_info() is moved to kernel/printk.c from
    lib/dump_stack.c. Because linkage is per object file,
    dump_stack_print_info() living in the same lib file as generic
    dump_stack() means that archs which implement custom dump_stack()
    - at this point, only blackfin - can't use dump_stack_print_info()
    as that will bring in the generic version of dump_stack() too.
    The v1 patch broke the build on blackfin due to this issue. The build
    breakage was reported by Fengguang Wu.

    Signed-off-by: Tejun Heo
    Acked-by: David S. Miller
    Acked-by: Vineet Gupta
    Acked-by: Jesper Nilsson
    Acked-by: Vineet Gupta
    Acked-by: Martin Schwidefsky [s390 bits]
    Cc: Heiko Carstens
    Cc: Mike Frysinger
    Cc: Fengguang Wu
    Cc: Bjorn Helgaas
    Cc: Sam Ravnborg
    Acked-by: Richard Kuo [hexagon bits]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

30 Apr, 2013

4 commits

  • Pull SMP/hotplug changes from Ingo Molnar:
    "This is a pretty large, multi-arch series unifying and generalizing
    the various disjunct pieces of idle routines that architectures have
    historically copied from each other and have grown in random, wildly
    inconsistent and sometimes buggy directions:

    101 files changed, 455 insertions(+), 1328 deletions(-)

    this went through a number of review and test iterations before it was
    committed, it was tested on various architectures, was exposed to
    linux-next for quite some time - nevertheless it might cause problems
    on architectures that don't read the mailing lists and don't regularly
    test linux-next.

    This cat herding exercise was motivated by the -rt kernel, and was
    brought to you by Thomas "the Whip" Gleixner."

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
    idle: Remove GENERIC_IDLE_LOOP config switch
    um: Use generic idle loop
    ia64: Make sure interrupts enabled when we "safe_halt()"
    sparc: Use generic idle loop
    idle: Remove unused ARCH_HAS_DEFAULT_IDLE
    bfin: Fix typo in arch_cpu_idle()
    xtensa: Use generic idle loop
    x86: Use generic idle loop
    unicore: Use generic idle loop
    tile: Use generic idle loop
    tile: Enter idle with preemption disabled
    sh: Use generic idle loop
    score: Use generic idle loop
    s390: Use generic idle loop
    powerpc: Use generic idle loop
    parisc: Use generic idle loop
    openrisc: Use generic idle loop
    mn10300: Use generic idle loop
    mips: Use generic idle loop
    microblaze: Use generic idle loop
    ...

    Linus Torvalds
     
  • Use common helper functions to free reserved pages.

    Signed-off-by: Jiang Liu
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • On large systems with a lot of memory, walking all RAM to determine page
    types may take a half second or even more.

    In non-blockable contexts, the page allocator will emit a page allocation
    failure warning unless __GFP_NOWARN is specified. In such contexts, irqs
    are typically disabled and such a lengthy delay may even result in NMI
    watchdog timeouts.

    To fix this, suppress the page walk in such contexts when printing the
    page allocation failure warning.

    Signed-off-by: David Rientjes
    Cc: Mel Gorman
    Acked-by: Michal Hocko
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
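
    The decision described above might be sketched like this. The flag and
    function names are hypothetical stand-ins, not the kernel's own
    identifiers; the sketch only shows that the expensive page walk is
    filtered out whenever the failing context cannot block.

```c
#include <assert.h>

#define FILTER_PAGE_COUNT_SKETCH 0x1u  /* hypothetical: skip the walk over RAM */

/*
 * Decide which parts of the memory report to suppress when an allocation
 * fails. can_block is 0 for contexts that must not sleep (irqs may be off);
 * in_interrupt is non-zero when called from interrupt context.
 */
unsigned int alloc_failure_filter(int can_block, int in_interrupt)
{
    unsigned int filter = 0;

    assert(can_block == 0 || can_block == 1);
    if (in_interrupt || !can_block)
        filter |= FILTER_PAGE_COUNT_SKETCH;
    return filter;
}
```

    The warning itself is still printed in every case; only the per-page
    statistics are dropped when gathering them could stall the system.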
     
  • Don't use create_proc_read_entry() as that is deprecated, but rather use
    proc_create_data() and seq_file instead.

    Signed-off-by: David Howells
    cc: "James E.J. Bottomley"
    cc: Helge Deller
    cc: linux-parisc@vger.kernel.org
    Signed-off-by: Al Viro

    David Howells
     

26 Apr, 2013

2 commits

  • User applications running on SMP kernels have long suffered from instability
    and random segmentation faults. This patch improves the situation although
    there is more work to be done.

    One of the problems is the various routines in pgtable.h that update page table
    entries use different locking mechanisms, or no lock at all (set_pte_at). This
    change modifies the routines to all use the same lock pa_dbit_lock. This lock
    is used for dirty bit updates in the interruption code. The patch also purges
    the TLB entries associated with the PTE to ensure that inconsistent values are
    not used after the page table entry is updated. The UP and SMP code are now
    identical.

    The change also includes a minor update to the purge_tlb_entries function in
    cache.c to improve its efficiency.

    Signed-off-by: John David Anglin
    Cc: Helge Deller
    Signed-off-by: Helge Deller

    John David Anglin
     
  • CONFIG_MLONGCALLS was introduced in commit
    ec758f98328da3eb933a25dc7a2eed01ef44d849 to overcome linker issues when linking
    huge linux kernels, e.g. with many modules linked in.

    But in the kernel module loader there is no support yet for the new relocation
    types, which is why modules built with -mlong-calls can't be loaded.
    Furthermore, for modules long calls are not really necessary, since we already
    use stub sections which resolve long distance calls.

    So, let's just disable this compiler option when compiling kernel modules.

    Signed-off-by: Helge Deller

    Helge Deller