06 May, 2011

1 commit

  • This patch adds a multiple message send syscall and is the send
    version of the existing recvmmsg syscall. This is heavily
    based on the patch by Arnaldo that added recvmmsg.

    I wrote a microbenchmark to test the performance gains of using
    this new syscall:

    http://ozlabs.org/~anton/junkcode/sendmmsg_test.c

    The test was run on a ppc64 box with a 10 Gbit network card. The
    benchmark can send both UDP and RAW ethernet packets.

    64B UDP

    batch   pkts/sec
        1     804570
        2     872800 (+ 8 %)
        4     916556 (+14 %)
        8     939712 (+17 %)
       16     952688 (+18 %)
       32     956448 (+19 %)
       64     964800 (+20 %)

    64B raw socket

    batch   pkts/sec
        1    1201449
        2    1350028 (+12 %)
        4    1461416 (+22 %)
        8    1513080 (+26 %)
       16    1541216 (+28 %)
       32    1553440 (+29 %)
       64    1557888 (+30 %)

    At a batch size of 64 we see a 20% improvement in throughput on
    UDP send and 30% on raw socket send.

    [ Add sparc syscall entries. -DaveM ]

    Signed-off-by: Anton Blanchard
    Signed-off-by: David S. Miller

    Anton Blanchard
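
    To make the batching idea above concrete, here is a minimal userspace
    sketch of one sendmmsg() call that submits several UDP datagrams at
    once. It is not taken from the linked benchmark: send_batch(),
    loopback_demo() and MAX_BATCH are hypothetical names chosen for
    illustration, and the sketch assumes a glibc that exposes sendmmsg()
    under _GNU_SOURCE.

```c
#define _GNU_SOURCE            /* exposes sendmmsg() and struct mmsghdr in glibc */
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_BATCH 64           /* hypothetical cap, matches the largest batch tested */

/* Send `count` copies of `payload` to `dst` with a single sendmmsg() call.
 * Returns the number of datagrams the kernel accepted, or -1 on error. */
int send_batch(int fd, const struct sockaddr_in *dst,
               const void *payload, size_t len, unsigned int count)
{
    struct mmsghdr msgs[MAX_BATCH];
    struct iovec iov[MAX_BATCH];

    assert(count >= 1 && count <= MAX_BATCH);
    memset(msgs, 0, sizeof(msgs));

    for (unsigned int i = 0; i < count; i++) {
        iov[i].iov_base = (void *)payload;
        iov[i].iov_len = len;
        msgs[i].msg_hdr.msg_name = (void *)dst;
        msgs[i].msg_hdr.msg_namelen = sizeof(*dst);
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }
    /* One syscall for the whole batch instead of `count` sendmsg() calls. */
    return sendmmsg(fd, msgs, count, 0);
}

/* Hypothetical self-check: send `count` 64-byte datagrams to a UDP socket
 * bound to an ephemeral loopback port and report how many were accepted. */
int loopback_demo(unsigned int count)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    char payload[64] = { 0 };
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    int sent = -1;

    if (rx >= 0 && tx >= 0) {
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = 0;     /* let the kernel pick a free port */
        if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
            getsockname(rx, (struct sockaddr *)&addr, &alen) == 0)
            sent = send_batch(tx, &addr, payload, sizeof(payload), count);
    }
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return sent;
}
```

    A caller would typically compare the return value with the requested
    batch size and resubmit the remainder if fewer messages were accepted.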
     

05 May, 2011

1 commit


30 Apr, 2011

1 commit

  • This makes sure that when a driver calls the ethtool's
    get/set_settings() callback of another driver, the data passed to it
    is clean. This guarantees that speed_hi will be zeroed correctly if
    the called callback doesn't explicitly set it: we are sure we don't
    get a corrupted speed from the underlying driver. We also take care of
    setting the cmd field appropriately (ETHTOOL_GSET/SSET).

    This applies to dev_ethtool_get_settings(), which now makes sure it
    sets up that ethtool command parameter correctly before passing it to
    drivers. This also means that whoever calls dev_ethtool_get_settings()
    does not have to clean the ethtool command parameter. This function
    also becomes an exported symbol instead of an inline.

    All drivers visible to make allyesconfig under x86_64 have been
    updated.

    Signed-off-by: David Decotigny
    Signed-off-by: David S. Miller

    David Decotigny
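
    The effect of cleaning the command before calling into another driver
    can be sketched in plain C. This is only an illustration: struct
    demo_ethtool_cmd is a hypothetical, cut-down stand-in for the real
    struct ethtool_cmd, and legacy_get_settings() stands for any callback
    that fills in the low 16 bits of the speed but never touches speed_hi.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_GSET 0x00000001u  /* same numeric value as ETHTOOL_GSET */

/* Hypothetical, reduced version of struct ethtool_cmd. */
struct demo_ethtool_cmd {
    uint32_t cmd;
    uint16_t speed;     /* low 16 bits of the link speed in Mb/s */
    uint16_t speed_hi;  /* high 16 bits of the link speed */
};

/* A callback that sets speed but forgets speed_hi. */
int legacy_get_settings(struct demo_ethtool_cmd *cmd)
{
    cmd->speed = 1000;
    return 0;
}

/* Combine both halves of the speed, as a consumer of the struct would. */
uint32_t demo_cmd_speed(const struct demo_ethtool_cmd *cmd)
{
    return ((uint32_t)cmd->speed_hi << 16) | cmd->speed;
}

/* Wrapper in the spirit of the commit: zero the struct and set the cmd
 * field before handing it to the callback. */
int demo_get_settings(struct demo_ethtool_cmd *cmd)
{
    assert(cmd != NULL);
    memset(cmd, 0, sizeof(*cmd));
    cmd->cmd = DEMO_GSET;
    return legacy_get_settings(cmd);
}

/* Speed seen by a caller that starts from a dirty struct and uses the wrapper. */
uint32_t speed_via_wrapper(void)
{
    struct demo_ethtool_cmd cmd;

    memset(&cmd, 0xff, sizeof(cmd));   /* simulate stale stack contents */
    demo_get_settings(&cmd);
    return demo_cmd_speed(&cmd);
}

/* Same caller without the wrapper: speed_hi keeps its stale value. */
uint32_t speed_without_wrapper(void)
{
    struct demo_ethtool_cmd cmd;

    memset(&cmd, 0xff, sizeof(cmd));
    legacy_get_settings(&cmd);
    return demo_cmd_speed(&cmd);
}
```

    With the wrapper the caller reads 1000 Mb/s; without it the stale
    upper half leaks into the result.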
     

28 Apr, 2011

1 commit

  • In order to speed up packet filtering, here is an implementation of a
    JIT compiler for x86_64.

    It is disabled by default, and must be enabled by the admin.

    echo 1 >/proc/sys/net/core/bpf_jit_enable

    It uses module_alloc() and module_free() to get memory in the 2GB text
    kernel range since we call helper functions from the generated code.

    EAX : BPF A accumulator
    EBX : BPF X index register
    RDI : pointer to skb (first argument given to JIT function)
    RBP : frame pointer (even if CONFIG_FRAME_POINTER=n)
    r9d : skb->len - skb->data_len (headlen)
    r8 : skb->data

    To get a trace of generated code, use :

    echo 2 >/proc/sys/net/core/bpf_jit_enable

    Example of generated code :

    # tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24

    flen=18 proglen=147 pass=3 image=ffffffffa00b5000
    JIT code: ffffffffa00b5000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 60
    JIT code: ffffffffa00b5010: 44 2b 4f 64 4c 8b 87 b8 00 00 00 be 0c 00 00 00
    JIT code: ffffffffa00b5020: e8 24 7b f7 e0 3d 00 08 00 00 75 28 be 1a 00 00
    JIT code: ffffffffa00b5030: 00 e8 fe 7a f7 e0 24 00 3d 00 14 a8 c0 74 49 be
    JIT code: ffffffffa00b5040: 1e 00 00 00 e8 eb 7a f7 e0 24 00 3d 00 14 a8 c0
    JIT code: ffffffffa00b5050: 74 36 eb 3b 3d 06 08 00 00 74 07 3d 35 80 00 00
    JIT code: ffffffffa00b5060: 75 2d be 1c 00 00 00 e8 c8 7a f7 e0 24 00 3d 00
    JIT code: ffffffffa00b5070: 14 a8 c0 74 13 be 26 00 00 00 e8 b5 7a f7 e0 24
    JIT code: ffffffffa00b5080: 00 3d 00 14 a8 c0 75 07 b8 ff ff 00 00 eb 02 31
    JIT code: ffffffffa00b5090: c0 c9 c3

    The BPF program is 144 bytes long (18 instructions of 8 bytes each),
    so the native program (147 bytes) is almost the same size ;)

    (000) ldh [12]
    (001) jeq #0x800 jt 2 jf 8
    (002) ld [26]
    (003) and #0xffffff00
    (004) jeq #0xc0a81400 jt 16 jf 5
    (005) ld [30]
    (006) and #0xffffff00
    (007) jeq #0xc0a81400 jt 16 jf 17
    (008) jeq #0x806 jt 10 jf 9
    (009) jeq #0x8035 jt 10 jf 17
    (010) ld [28]
    (011) and #0xffffff00
    (012) jeq #0xc0a81400 jt 16 jf 13
    (013) ld [38]
    (014) and #0xffffff00
    (015) jeq #0xc0a81400 jt 16 jf 17
    (016) ret #65535
    (017) ret #0

    Signed-off-by: Eric Dumazet
    Cc: Arnaldo Carvalho de Melo
    Cc: Ben Hutchings
    Cc: Hagen Paul Pfeifer
    Signed-off-by: David S. Miller

    Eric Dumazet
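
    For readers who want to follow what the filter listed above computes,
    here is a small interpreter sketch for just the opcodes it uses (ldh,
    ld, and, jeq, ret). It is an illustration, not the kernel's
    interpreter or the JIT: struct demo_insn is a hypothetical format
    whose jump targets are the absolute instruction numbers printed in
    the listing, whereas a real BPF instruction stores relative offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_op { OP_LDH, OP_LD, OP_AND, OP_JEQ, OP_RET };

/* Hypothetical instruction format with absolute jump targets. */
struct demo_insn {
    enum demo_op op;
    uint32_t k;
    uint8_t jt, jf;
};

#define NET  0xc0a81400u   /* 192.168.20.0 */
#define MASK 0xffffff00u   /* /24 netmask */

static const struct demo_insn prog[] = {
    /* 000 */ { OP_LDH, 12, 0, 0 },
    /* 001 */ { OP_JEQ, 0x800, 2, 8 },
    /* 002 */ { OP_LD, 26, 0, 0 },
    /* 003 */ { OP_AND, MASK, 0, 0 },
    /* 004 */ { OP_JEQ, NET, 16, 5 },
    /* 005 */ { OP_LD, 30, 0, 0 },
    /* 006 */ { OP_AND, MASK, 0, 0 },
    /* 007 */ { OP_JEQ, NET, 16, 17 },
    /* 008 */ { OP_JEQ, 0x806, 10, 9 },
    /* 009 */ { OP_JEQ, 0x8035, 10, 17 },
    /* 010 */ { OP_LD, 28, 0, 0 },
    /* 011 */ { OP_AND, MASK, 0, 0 },
    /* 012 */ { OP_JEQ, NET, 16, 13 },
    /* 013 */ { OP_LD, 38, 0, 0 },
    /* 014 */ { OP_AND, MASK, 0, 0 },
    /* 015 */ { OP_JEQ, NET, 16, 17 },
    /* 016 */ { OP_RET, 65535, 0, 0 },
    /* 017 */ { OP_RET, 0, 0, 0 },
};

/* Run the filter over a packet. Returns the number of bytes to accept,
 * or 0 to drop; an out-of-bounds load also drops the packet. */
uint32_t run_filter(const uint8_t *pkt, size_t len)
{
    const size_t n = sizeof(prog) / sizeof(prog[0]);
    uint32_t a = 0;            /* BPF accumulator A */
    size_t pc = 0;

    assert(pkt != NULL);
    while (pc < n) {
        const struct demo_insn *in = &prog[pc];

        switch (in->op) {
        case OP_LDH:           /* 16-bit big-endian load */
            if ((size_t)in->k + 2 > len)
                return 0;
            a = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            pc++;
            break;
        case OP_LD:            /* 32-bit big-endian load */
            if ((size_t)in->k + 4 > len)
                return 0;
            a = ((uint32_t)pkt[in->k] << 24) | ((uint32_t)pkt[in->k + 1] << 16) |
                ((uint32_t)pkt[in->k + 2] << 8) | pkt[in->k + 3];
            pc++;
            break;
        case OP_AND:
            a &= in->k;
            pc++;
            break;
        case OP_JEQ:
            pc = (a == in->k) ? in->jt : in->jf;
            break;
        case OP_RET:
            return in->k;
        }
    }
    return 0;
}

/* Build a minimal Ethernet + IPv4 frame with the given source address
 * (all other bytes zero) and run the filter on it. */
uint32_t filter_ipv4_from(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    uint8_t pkt[64] = { 0 };

    pkt[12] = 0x08;            /* EtherType 0x0800 (IPv4) */
    pkt[13] = 0x00;
    pkt[26] = a;               /* IPv4 source address starts at offset 26 */
    pkt[27] = b;
    pkt[28] = c;
    pkt[29] = d;
    return run_filter(pkt, sizeof(pkt));
}
```

    A frame from 192.168.20.5 is accepted with a snap length of 65535,
    while one from 10.0.0.1 falls through to instruction 17 and is dropped.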
     

22 Apr, 2011

1 commit

  • mac-fec.c was setting individual UDP address registers instead of multicast
    group address registers when joining a multicast group.
    This prevented UDP multicast packets from being received correctly.
    According to datasheet, replaced hash_table_high and hash_table_low
    with grp_hash_table_high and grp_hash_table_low respectively.
    Also renamed hash_table_* with grp_hash_table_* in struct fec declaration
    for 8xx: these registers are used only for multicast there.

    Tested on a MPC5121 based board.
    Build tested also against mpc866_ads_defconfig.

    Signed-off-by: Andrea Galbusera
    Signed-off-by: David S. Miller

    Andrea Galbusera
     

08 Apr, 2011

7 commits

  • * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
    [S390] compile fix for latest binutils
    [S390] cio: prevent purging of CCW devices in the online state
    [S390] qdio: fix init sequence
    [S390] Fix parameter passing for smp_switch_to_cpu()
    [S390] oprofile s390: prevent stack corruption

    Linus Torvalds
     
  • …l/git/lethal/fbdev-2.6

    * 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
    efifb: Add override for 11" Macbook Air 3,1
    efifb: Support overriding fields FW tells us with the DMI data.
    fb: Reduce priority of resource conflict message
    savagefb: Remove obsolete else clause in savage_setup_i2c_bus
    savagefb: Set up I2C based on chip family instead of card id
    savagefb: Replace magic register address with define
    drivers/video/bfin-lq035q1-fb.c: introduce missing kfree
    video: s3c-fb: fix checkpatch errors and warning
    efifb: support AMD Radeon HD 6490
    s3fb: fix Virge/GX2
    fbcon: Remove unused 'display *p' variable from fb_flashcursor()
    fbdev: sh_mobile_lcdcfb: fix module lock acquisition
    fbdev: sh_mobile_lcdcfb: add blanking support
    viafb: initialize margins correct
    viafb: refresh rate bug collection
    sh: mach-ap325rxa: move backlight control code
    sh: mach-ecovec24: support for main lcd backlight

    Linus Torvalds
     
  • …nel/git/lethal/sh-2.6

    * 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
    ARM: arch-shmobile: only run FSI init on respective boards
    ARM: arch-shmobile: only run HDMI init on respective boards
    ARM: mach-shmobile: Correctly check for CONFIG_MACH_MACKEREL

    Linus Torvalds
     
  • * 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
    sh: select ARCH_NO_SYSDEV_OPS.
    sh: fix build error in board-sh7757lcr.c
    sh: landisk: Remove whitespace
    sh: landisk: Remove mv_nr_irqs
    sh: sh-sci: Fix double initialization by serial_console_setup
    serial: sh-sci: prevent setup of uninitialized serial console
    dma: shdma: add checking the DMAOR_AE in sh_dmae_err

    Linus Torvalds
     
  • …-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86-32, fpu: Fix FPU exception handling on non-SSE systems
    x86, hibernate: Initialize mmu_cr4_features during boot
    x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
    x86: visws: Fixup irq overhaul fallout

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Clean up rebalance_domains() load-balance interval calculation

    * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
    rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()

    * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    genirq: Fix cpumask leak in __setup_irq()

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf probe: Fix listing incorrect line number with inline function
    perf probe: Fix to find recursively inlined function
    perf probe: Fix multiple --vars options behavior
    perf probe: Fix to remove redundant close
    perf probe: Fix to ensure function declared file

    Linus Torvalds
     
  • * 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: move and fix substitue search for missing CPUID entries
    KVM: fix XSAVE bit scanning
    KVM: Enable async page fault processing
    KVM: fix crash on irqfd deassign

    Linus Torvalds
     
  • * 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
    Fix common misspellings

    Linus Torvalds
     

07 Apr, 2011

7 commits

  • The sfi_mrtc_array[] only gets initialized when the sfi mrtc
    table is parsed, so the vrtc_paddr should be initialized after it
    too.

    Signed-off-by: Feng Tang
    Signed-off-by: Alan Cox
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1302140389-27603-1-git-send-email-feng.tang@intel.com
    Signed-off-by: Ingo Molnar

    Feng Tang
     
  • On 32bit systems without SSE (that is, they use FSAVE/FRSTOR for FPU
    context switches), FPU exceptions in user mode cause Oopses, BUGs,
    recursive faults and other nasty things:

    fpu exception: 0000 [#1]
    last sysfs file: /sys/power/state
    Modules linked in: psmouse evdev pcspkr serio_raw [last unloaded: scsi_wait_scan]

    Pid: 1638, comm: fxsave-32-excep Not tainted 2.6.35-07798-g58a992b-dirty #633 VP3-596B-DD/VT82C597
    EIP: 0060:[] EFLAGS: 00010202 CPU: 0
    EIP is at math_error+0x1b4/0x1c8
    EAX: 00000003 EBX: cf9be7e0 ECX: 00000000 EDX: cf9c5c00
    ESI: cf9d9fb4 EDI: c1372db3 EBP: 00000010 ESP: cf9d9f1c
    DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
    Process fxsave-32-excep (pid: 1638, ti=cf9d8000 task=cf9be7e0 task.ti=cf9d8000)
    Stack:
    00000000 00000301 00000004 00000000 00000000 cf9d3000 cf9da8f0 00000001
    00000004 cf9b6b60 c1019a6b c1019a79 00000020 00000242 000001b6 cf9c5380
    cf806b40 cf791880 00000000 00000282 00000282 c108a213 00000020 cf9c5380
    Call Trace:
    [] ? need_resched+0x11/0x1a
    [] ? should_resched+0x5/0x1f
    [] ? do_sys_open+0xbd/0xc7
    [] ? do_sys_open+0xbd/0xc7
    [] ? do_coprocessor_error+0x0/0x11
    [] ? error_code+0x65/0x70
    Code: a8 20 74 30 c7 44 24 0c 06 00 03 00 8d 54 24 04 89 d9 b8 08 00 00 00 e8 9b 6d 02 00 eb 16 8b 93 5c 02 00 00 eb 05 e9 04 ff ff ff dd 32 9b e9 16 ff ff ff 81 c4 84 00 00 00 5b 5e 5f 5d c3 c6
    EIP: [] math_error+0x1b4/0x1c8 SS:ESP 0068:cf9d9f1c

    This usually continues in slight variations until the system is reset.

    This bug was introduced by commit 58a992b9cbaf449aeebd3575c3695a9eb5d95b5e:
    x86-32, fpu: Rewrite fpu_save_init()

    Signed-off-by: Hans Rosenfeld
    Link: http://lkml.kernel.org/r/1302106003-366952-1-git-send-email-hans.rosenfeld@amd.com
    Signed-off-by: H. Peter Anvin

    Hans Rosenfeld
     
  • Restore the initialization of mmu_cr4_features during boot, which was
    removed without comment in checkin e5f15b45ddf3afa2bbbb10c7ea34fb32b6de0a0e

    x86: Cleanup highmap after brk is concluded

    thereby breaking resume from hibernate. This restores previous
    functionality in approximately the same place, and corrects the
    reading of %cr4 on pre-CPUID hardware (%cr4 exists if and only if
    CPUID is supported.)

    However, part of the problem is that the hibernate suspend/resume
    sequence should manage the save/restore of %cr4 explicitly.

    Signed-off-by: H. Peter Anvin
    Cc: Rafael J. Wysocki
    Cc: Stefano Stabellini
    Cc: Yinghai Lu
    LKML-Reference:

    H. Peter Anvin
     
  • Now that everything that was using these interfaces has been converted to
    the syscore ops, prevent new code from using the old API.

    Signed-off-by: Paul Mundt

    Paul Mundt
     
  • If several boards are enabled in the kernel configuration,
    fsi_init_pm_clock() functions from board-ap4evb.c
    will run on any of them. Prevent this by calling these functions from the
    .init_machine() callback instead of using device_initcall().

    Signed-off-by: Kuninori Morimoto
    Cc: Magnus Damm
    Signed-off-by: Paul Mundt

    Kuninori Morimoto
     
  • If several boards are enabled in the kernel configuration,
    hdmi_init_pm_clock() functions from board-ap4evb.c and board-mackerel.c
    will run on any of them. Prevent this by calling these functions from the
    .init_machine() callback instead of using device_initcall().

    Signed-off-by: Guennadi Liakhovetski
    Cc: Magnus Damm
    Tested-by: Kuninori Morimoto
    Signed-off-by: Paul Mundt

    Guennadi Liakhovetski
     
  • I made a bit of a thinko when adding Mackerel to the boards
    that support zboot using MMCIF.

    Reported-by: Magnus Damm
    Signed-off-by: Simon Horman
    Signed-off-by: Paul Mundt

    Simon Horman
     

06 Apr, 2011

2 commits

  • If KVM cannot find an exact match for a requested CPUID leaf, the
    code will try to find the closest match instead of simply confessing
    its failure.
    The implementation was meant to satisfy the CPUID specification, but
    did not properly check for extended and standard leaves and also
    didn't account for the index subleaf.
    Besides that, this rule only applies to CPUID intercepts, which are not
    the only users of the kvm_find_cpuid_entry() function.

    So fix this algorithm and call it from kvm_emulate_cpuid().
    This fixes a crash of newer Linux kernels as KVM guests on
    AMD Bulldozer CPUs, where bogus values were returned in response to
    a CPUID intercept.

    Signed-off-by: Andre Przywara
    Signed-off-by: Avi Kivity

    Andre Przywara
     
  • When KVM scans the 0xD CPUID leaf for propagating the XSAVE save area
    leaves, it assumes that the leaves are contiguous and stops at the
    first zero one. On AMD hardware there is a gap, though, as LWP uses
    leaf 62 to announce its state save area.
    So let's iterate through all 64 possible leaves and simply skip zero
    ones to also cover later features.

    Signed-off-by: Andre Przywara
    Signed-off-by: Avi Kivity

    Andre Przywara
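
    The difference between stopping at the first empty XSAVE sub-leaf and
    skipping empty ones can be shown with a small sketch. It does not
    reproduce the KVM code: the table of per-sub-leaf values and all
    function names are hypothetical, and the non-zero entries are
    placeholder sizes, with only the gap before index 62 taken from the
    description above.

```c
#include <assert.h>
#include <stdint.h>

#define XSAVE_SUBLEAVES 64

/* Old behaviour: treat the first zero entry as the end of the list. */
unsigned int count_until_first_gap(const uint32_t eax[XSAVE_SUBLEAVES])
{
    unsigned int found = 0;

    assert(eax != 0);
    for (unsigned int idx = 0; idx < XSAVE_SUBLEAVES; idx++) {
        if (eax[idx] == 0)
            break;             /* misses everything behind a gap */
        found++;
    }
    return found;
}

/* New behaviour: walk all 64 entries and skip the empty ones. */
unsigned int count_skipping_gaps(const uint32_t eax[XSAVE_SUBLEAVES])
{
    unsigned int found = 0;

    assert(eax != 0);
    for (unsigned int idx = 0; idx < XSAVE_SUBLEAVES; idx++) {
        if (eax[idx] == 0)
            continue;          /* empty sub-leaf, keep scanning */
        found++;
    }
    return found;
}

/* Hypothetical table: three populated entries at the start, then a gap,
 * then one more entry at index 62 as described for LWP. */
unsigned int demo_count(int skip_gaps)
{
    uint32_t eax[XSAVE_SUBLEAVES] = { 0 };

    eax[0] = 576;              /* placeholder values, not real CPUID output */
    eax[1] = 1;
    eax[2] = 256;
    eax[62] = 128;
    return skip_gaps ? count_skipping_gaps(eax) : count_until_first_gap(eax);
}
```

    With this table the early-exit scan reports three entries and never
    reaches index 62, while the skipping scan reports all four.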
     

05 Apr, 2011

7 commits


04 Apr, 2011

10 commits


02 Apr, 2011

2 commits