20 Mar, 2014

7 commits

  • lib/audit.c provides a generic function for auditing system calls.
    This patch extends it for compat syscall support on bi-architectures
    (32/64-bit) by adding lib/compat_audit.c.
    Supporting this feature requires:
    * add asm/unistd32.h for compat system call names
    * select CONFIG_AUDIT_ARCH_COMPAT_GENERIC

    Signed-off-by: AKASHI Takahiro
    Acked-by: Richard Guy Briggs
    Signed-off-by: Eric Paris

    AKASHI Takahiro
     
  • Currently AUDITSYSCALL has a long list of architecture dependencies:
    depends on AUDIT && (X86 || PARISC || PPC || S390 || IA64 || UML ||
    SPARC64 || SUPERH || (ARM && AEABI && !OABI_COMPAT) || ALPHA)
    The purpose of this patch is to replace it with HAVE_ARCH_AUDITSYSCALL
    for simplicity.

    Signed-off-by: AKASHI Takahiro
    Acked-by: Will Deacon (arm)
    Acked-by: Richard Guy Briggs (audit)
    Acked-by: Matt Turner (alpha)
    Acked-by: Michael Ellerman (powerpc)
    Signed-off-by: Eric Paris

    AKASHI Takahiro
     
  • Signed-off-by: Zhenglong.cai
    Signed-off-by: Matt Turner

    蔡正龙
     
  • In perverse cases of file descriptor passing the current network
    namespace of a process and the network namespace of a socket used by
    that process may differ. Therefore use the network namespace of the
    appropriate socket to ensure replies always go to the appropriate
    socket.

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Richard Guy Briggs
    Signed-off-by: Eric Paris

    Eric W. Biederman
     
  • While reading through 3.14-rc1 I found a pretty significant mishandling
    of network namespaces in the recent audit changes.

    In struct audit_netlink_list and audit_reply add a reference to the
    network namespace of the caller and remove the userspace pid of the
    caller. This cleanly remembers the caller's network namespace, and
    removes a huge class of races and nasty failure modes that can occur
    when attempting to relook up the caller's network namespace from a pid_t
    (including the caller's network namespace changing, pid wraparound, and
    the pid simply not being present).

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Richard Guy Briggs
    Signed-off-by: Eric Paris

    Eric W. Biederman
     
  • During an audit event, cache and print the process's
    proctitle value (/proc/<pid>/cmdline). This is useful in situations
    where processes are started via fork'd virtual machines where the
    comm field is incorrect. Often times, setting the comm field still
    is insufficient as the comm width is not very wide and most
    virtual machine "package names" do not fit. Also, during execution,
    many threads have their comm field set as well. By tying it back to
    the global cmdline value for the process, audit records will be more
    complete in systems with these properties. An example of where this
    is useful and applicable is in the realm of Android. With Android,
    there is no fork/exec for VM instances. The bare, preloaded Dalvik
    VM listens for a fork and specialize request. When this request comes
    in, the VM forks, and then loads the specific application (specializing).
    This was done to take advantage of COW and to not require a load of
    basic packages by the VM on every app spawn. When this spawn occurs,
    the package name is set via setproctitle() and shows up in procfs.
    Many of these package names are longer than 16 bytes, the historical
    width of task->comm. Having the cmdline in the audit records will
    couple the application back to the record directly. Also, on my
    Debian development box, some audit records were more useful than
    what was printed under comm.

    The cached proctitle is tied to the life-cycle of the audit_context
    structure and is built on demand.

    Proctitle is controllable by userspace, and thus should not be trusted.
    It is meant as an aid to assist in debugging. The proctitle event is
    emitted during syscall audits, and can be filtered with auditctl.

    Example:
    type=AVC msg=audit(1391217013.924:386): avc: denied { getattr } for pid=1971 comm="mkdir" name="/" dev="selinuxfs" ino=1 scontext=system_u:system_r:consolekit_t:s0-s0:c0.c255 tcontext=system_u:object_r:security_t:s0 tclass=filesystem
    type=SYSCALL msg=audit(1391217013.924:386): arch=c000003e syscall=137 success=yes exit=0 a0=7f019dfc8bd7 a1=7fffa6aed2c0 a2=fffffffffff4bd25 a3=7fffa6aed050 items=0 ppid=1967 pid=1971 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="mkdir" exe="/bin/mkdir" subj=system_u:system_r:consolekit_t:s0-s0:c0.c255 key=(null)
    type=UNKNOWN[1327] msg=audit(1391217013.924:386): proctitle=6D6B646972002D70002F7661722F72756E2F636F6E736F6C65

    Acked-by: Steve Grubb (wrt record formatting)

    Signed-off-by: William Roberts
    Signed-off-by: Eric Paris

    William Roberts
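
The proctitle value in the example record above is hex-encoded, with NUL bytes separating the arguments as in the process command line. A minimal sketch of decoding it in userspace follows; the helper name decode_proctitle is hypothetical and not part of any audit tool, and the sketch assumes the field is hex-encoded as it is in this example.

```python
from typing import List


def decode_proctitle(hex_value: str) -> List[str]:
    """Decode a hex-encoded proctitle field into its argument list."""
    raw = bytes.fromhex(hex_value)
    # Arguments are NUL-separated; skip empty pieces such as the one
    # produced by a trailing NUL.
    return [part.decode("utf-8", errors="replace")
            for part in raw.split(b"\x00") if part]


# Value copied from the example record above.
args = decode_proctitle("6D6B646972002D70002F7661722F72756E2F636F6E736F6C65")
print(args)  # ['mkdir', '-p', '/var/run/console']
```

Because proctitle is controllable by userspace, the decoded value should be treated as a debugging aid only.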
     
  • Re-factor proc_pid_cmdline() to use get_cmdline() helper
    from mm.h.

    Acked-by: David Rientjes
    Acked-by: Stephen Smalley
    Acked-by: Richard Guy Briggs

    Signed-off-by: William Roberts
    Acked-by: Richard Guy Briggs
    Signed-off-by: Eric Paris

    William Roberts
     

08 Mar, 2014

4 commits


20 Jan, 2014

4 commits


18 Jan, 2014

13 commits

  • This reverts commit f6308b36c411 (ACPI: Add BayTrail SoC GPIO and LPSS
    ACPI IDs), because it causes Alan Cox's ASUS T100TA to "crash and
    burn" during boot if the Baytrail pinctrl driver is compiled in.

    Fixes: f6308b36c411 (ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs)
    Reported-by: One Thousand Gnomes
    Requested-by: Linus Walleij
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • Pull networking fixes from David Miller:

    1) The value chosen for the new SO_MAX_PACING_RATE socket option on
    parisc was very poorly chosen, let's fix it while we still can.
    From Eric Dumazet.

    2) Our generic reciprocal divide was found to handle some edge cases
    incorrectly, part of this is encoded into the BPF as deep as the JIT
    engines themselves. Just use a real divide throughout for now.
    From Eric Dumazet.

    3) Because the initial lookup is lockless, the TCP metrics engine can
    end up creating two entries for the same lookup key. Fix this by
    doing a second lookup under the lock before we actually create the
    new entry. From Christoph Paasch.

    4) Fix scatter-gather list init in usbnet driver, from Bjørn Mork.

    5) Fix unintended 32-bit truncation in cxgb4 driver's bit shifting.
    From Dan Carpenter.

    6) Netlink socket dumping uses the wrong socket state for timewait
    sockets. Fix from Neal Cardwell.

    7) Fix netlink memory leak in ieee802154_add_iface(), from Christian
    Engelmayer.

    8) Multicast forwarding in ipv4 can overflow the per-rule reference
    counts, causing all multicast traffic to cease. Fix from Hannes
    Frederic Sowa.

    9) via-rhine needs to stop all TX queues when it resets the device,
    from Richard Weinberger.

    10) Fix RDS per-cpu accesses broken by the this_cpu_* conversions. From
    Gerald Schaefer.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions
    parisc: fix SO_MAX_PACING_RATE typo
    ipv6: simplify detection of first operational link-local address on interface
    tcp: metrics: Avoid duplicate entries with the same destination-IP
    net: rds: fix per-cpu helper usage
    e1000e: Fix compilation warning when !CONFIG_PM_SLEEP
    bpf: do not use reciprocal divide
    be2net: add dma_mapping_error() check for dma_map_page()
    bnx2x: Don't release PCI bars on shutdown
    net,via-rhine: Fix tx_timeout handling
    batman-adv: fix batman-adv header overhead calculation
    qlge: Fix vlan netdev features.
    net: avoid reference counter overflows on fib_rules in multicast forwarding
    dm9601: add USB IDs for new dm96xx variants
    MAINTAINERS: add virtio-dev ML for virtio
    ieee802154: Fix memory leak in ieee802154_add_iface()
    net: usbnet: fix SG initialisation
    inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets
    cxgb4: silence shift wrapping static checker warning

    Linus Torvalds
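
Item 2 above mentions edge cases in the generic reciprocal divide. One such edge case can be sketched in Python, assuming the classic 32-bit multiply-and-shift formulation. The function names and the formula are assumptions of this sketch and are not quoted from the commits listed above.

```python
U32_MASK = 0xFFFFFFFF


def reciprocal_value(k: int) -> int:
    """Precompute ceil(2**32 / k), truncated to 32 bits (assumed formula)."""
    return (((1 << 32) + (k - 1)) // k) & U32_MASK


def reciprocal_divide(a: int, r: int) -> int:
    """Approximate a / k as (a * r) >> 32 using the precomputed reciprocal."""
    return ((a & U32_MASK) * r) >> 32


# For most divisors the approximation matches a real divide.
print(reciprocal_divide(100, reciprocal_value(10)))  # 10

# For k == 1 the reciprocal is 2**32, which truncates to 0 in 32 bits,
# so every quotient comes out as 0 instead of the dividend.
print(reciprocal_divide(10, reciprocal_value(1)))  # 0
```

Under these assumptions a divide by 1 silently yields 0, which is the kind of result a real divide avoids.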
     
  • The s390 bpf jit compiler emits the signed divide instructions "dr" and "d"
    for unsigned divisions.
    This can cause problems: the dividend will be zero extended to a 64 bit value
    and the divisor is the 32 bit signed value as specified by the A or X
    accumulator, even though A and X are supposed to be treated as unsigned values.

    The divide instructions will generate an exception if the result cannot be
    expressed with a 32 bit signed value.
    This is the case if e.g. the dividend is 0xffffffff and the divisor either 1
    or also 0xffffffff (signed: -1).

    To avoid all these issues simply use unsigned divide instructions.

    Signed-off-by: Heiko Carstens
    Signed-off-by: David S. Miller

    Heiko Carstens
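
The failure mode described above can be modelled with plain integer arithmetic. The sketch below is a simplified model, not the s390 instruction semantics: it assumes a signed divide takes a zero-extended 64-bit dividend and a signed 32-bit divisor, and faults when the quotient does not fit in a signed 32-bit value. All function names are hypothetical.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def to_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value & 0x80000000 else value


def signed_quotient_fits(dividend: int, divisor: int) -> bool:
    """Return True if the modelled signed divide would not fault."""
    n = dividend & 0xFFFFFFFF          # zero-extended, so never negative
    d = to_signed32(divisor)
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = n // abs(d)                    # truncate toward zero
    if d < 0:
        q = -q
    return INT32_MIN <= q <= INT32_MAX


def unsigned_quotient(dividend: int, divisor: int) -> int:
    """Treat both operands as unsigned 32-bit values, as A and X should be."""
    return (dividend & 0xFFFFFFFF) // (divisor & 0xFFFFFFFF)


# The two cases from the commit message overflow in the signed model...
print(signed_quotient_fits(0xFFFFFFFF, 1))           # False
print(signed_quotient_fits(0xFFFFFFFF, 0xFFFFFFFF))  # False
# ...but are well defined as unsigned divisions.
print(unsigned_quotient(0xFFFFFFFF, 1))              # 4294967295
print(unsigned_quotient(0xFFFFFFFF, 0xFFFFFFFF))     # 1
```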
     
  • SO_MAX_PACING_RATE definition on parisc got a typo.
    It's not too late to fix it, before 3.13 is official.

    Fixes: 62748f32d501 ("net: introduce SO_MAX_PACING_RATE")
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • In commit 1ec047eb4751e3 ("ipv6: introduce per-interface counter for
    dad-completed ipv6 addresses") I built the detection of the first
    operational link-local address much too complex. Additionally this code
    now has a race condition.

    Replace it with a much simpler variant, which just scans the address
    list when duplicate address detection completes, to check if this is
    the first valid link local address and send RS and MLD reports then.

    Fixes: 1ec047eb4751e3 ("ipv6: introduce per-interface counter for dad-completed ipv6 addresses")
    Reported-by: Jiri Pirko
    Cc: Flavio Leitner
    Signed-off-by: Hannes Frederic Sowa
    Acked-by: Flavio Leitner
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • Because the tcp-metrics is an RCU-list, it may be that two
    soft-interrupts are inside __tcp_get_metrics() for the same
    destination-IP at the same time. If this destination-IP is not yet part of
    the tcp-metrics, both soft-interrupts will end up in tcpm_new and create
    a new entry for this IP.
    So, we will have two tcp-metrics with the same destination-IP in the list.

    This patch calls __tcp_get_metrics() twice: first without holding the
    lock, then while holding the lock. The second call is there to confirm
    that the entry has not been added by another soft-irq while waiting for
    the spin-lock.

    Fixes: 51c5d0c4b169b (tcp: Maintain dynamic metrics in local cache.)
    Signed-off-by: Christoph Paasch
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Christoph Paasch
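
The lookup-then-recheck pattern described above can be illustrated outside the kernel. The following is a minimal userspace sketch in Python, using a dictionary and threading.Lock in place of the RCU list and spin-lock. The class and method names are hypothetical and do not mirror the kernel code.

```python
import threading


class MetricsCache:
    """Toy cache keyed by destination IP, created on first use."""

    def __init__(self):
        self._entries = {}
        self._lock = threading.Lock()
        self.created = 0  # number of entries ever created

    def _lookup(self, dest_ip):
        return self._entries.get(dest_ip)

    def get_or_create(self, dest_ip):
        # First lookup: no lock held, so two callers may both miss.
        entry = self._lookup(dest_ip)
        if entry is not None:
            return entry
        with self._lock:
            # Second lookup: another caller may have added the entry
            # while this one was waiting for the lock.
            entry = self._lookup(dest_ip)
            if entry is None:
                entry = {"dest_ip": dest_ip}
                self._entries[dest_ip] = entry
                self.created += 1
            return entry


cache = MetricsCache()
workers = [threading.Thread(target=cache.get_or_create, args=("192.0.2.1",))
           for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(cache.created)  # 1
```

Without the second lookup inside the locked section, two callers that both missed on the first lookup would each insert their own entry for the same key.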
     
  • commit ae4b46e9d "net: rds: use this_cpu_* per-cpu helper" broke per-cpu
    handling for rds. chpfirst is the result of __this_cpu_read(), so it is
    an absolute pointer and not __percpu. Therefore, __this_cpu_write()
    should not operate on chpfirst, but rather on cache->percpu->first, just
    like __this_cpu_read() did before.

    Cc: # 3.8+
    Signed-off-by: Gerald Schaefer

    Signed-off-by: David S. Miller

    Gerald Schaefer
     
  • Pull namespace fixes from Eric Biederman:
    "This is a set of 3 regression fixes.

    This fixes /proc/mounts when using "ip netns add " to display
    the actual mount point.

    This fixes a regression in clone that broke lxc-attach.

    This fixes a regression in the permission checks for mounting /proc
    that made proc unmountable if binfmt_misc was in use. Oops.

    My apologies for sending this pull request so late. Al Viro gave
    interesting review comments about the d_path fix that I wanted to
    address in detail before I sent this pull request. Unfortunately a
    bad round of colds kept me from addressing that in detail until today.
    The executive summary of the review was:

    Al: Is patching d_path really sufficient?
    The prepend_path, d_path, d_absolute_path, and __d_path family of
    functions is a real mess.

    Me: Yes, patching d_path is really sufficient. Yes, the code is a mess.
    No it is not appropriate to rewrite all of d_path for a regression
    that has existed for entirely too long already, when a two line
    change will do"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    vfs: Fix a regression in mounting proc
    fork: Allow CLONE_PARENT after setns(CLONE_NEWPID)
    vfs: In d_path don't call d_dname on a mount point

    Linus Torvalds
     
  • Pull KVM fix from Paolo Bonzini:
    "Fix for a brown paper bag bug. Thanks to Drew Jones for noticing"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    kvm: x86: fix apic_base enable check

    Linus Torvalds
     
  • Fixup caught by checkpatch.

    Signed-off-by: Richard Guy Briggs
    Signed-off-by: Eric Paris

    Richard Guy Briggs
     
  • Fixup caught by checkpatch.

    Signed-off-by: Richard Guy Briggs
    Signed-off-by: Eric Paris

    Richard Guy Briggs
     
  • A message about creating the audit socket might be fine at startup, but
    a pr_info for every single network namespace created on a system isn't
    useful.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Each asm-generic/audit_xx.h defines a set of system calls for respective
    audit permission class (read, write, change attribute or exec).
    This patch changes two entries:

    1) fchown in audit_change_attr.h
    Make fchown included on its own because in asm-generic/unistd.h, for example,
    fchown always exists while chown is optional. This change is necessary at
    least for arm64.

    2) truncate64 in audit_write.h
    Add missing truncate64/ftruncate64 as well as truncate/ftruncate

    Signed-off-by: AKASHI Takahiro
    Acked-by: Will Deacon
    Signed-off-by: Eric Paris

    AKASHI Takahiro
     

17 Jan, 2014

5 commits

  • Included change:
    - properly compute the batman-adv header overhead. The
    result is later used to initialize the hard_header_len
    member of the soft-interface netdev object

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull arm64 fix from Catalin Marinas:
    "Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"

    We noticed that it breaks ioremap (and earlyprintk) with 64K page
    configuration"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"

    Linus Torvalds
     
  • Commit 74e72f894d56 ("lib/percpu_counter.c: fix __percpu_counter_add()")
    looked very plausible, but its arithmetic was badly wrong: obvious once
    you see the fix, but maddening to get there from the weird tmpfs ENOSPCs

    Signed-off-by: Hugh Dickins
    Cc: Ming Lei
    Cc: Paul Gortmaker
    Cc: Shaohua Li
    Cc: Jens Axboe
    Cc: Fan Du
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Commit 7509963c703b (e1000e: Fix a compile flag mis-match for
    suspend/resume) moved suspend and resume hooks to be available when
    CONFIG_PM is set. However, it can be set even if CONFIG_PM_SLEEP is not set,
    causing the following warnings to be emitted:

    drivers/net/ethernet/intel/e1000e/netdev.c:6178:12: warning:
    ‘e1000_suspend’ defined but not used [-Wunused-function]

    drivers/net/ethernet/intel/e1000e/netdev.c:6185:12: warning:
    ‘e1000_resume’ defined but not used [-Wunused-function]

    To fix this, make the hooks available only when CONFIG_PM_SLEEP is set
    and remove CONFIG_PM wrapping from driver ops because this is already
    handled by SET_SYSTEM_SLEEP_PM_OPS() and SET_RUNTIME_PM_OPS().

    Signed-off-by: Mika Westerberg
    Cc: Dave Ertman
    Cc: Aaron Brown
    Cc: Jeff Kirsher
    Signed-off-by: David S. Miller

    Mika Westerberg
     
  • This reverts commit 2f7dc6027522499582a520807cb9ffda589de47e.

    The above commit breaks the mapping type for Device memory because
    pgprot_default already contains a Normal memory type. pgprot_default is
    also not initialised early enough for earlyprintk resulting in an
    inconsistent memory mapping with 64K PAGE_SIZE configuration.

    Signed-off-by: Catalin Marinas
    Reported-by: Will Deacon
    Acked-by: Will Deacon

    Catalin Marinas
     

16 Jan, 2014

7 commits

  • On AMD family 10h we see the following error messages while waking up from
    S3 for all non-boot CPUs leading to a failed IBS initialization:

    Enabling non-boot CPUs ...
    smpboot: Booting Node 0 Processor 1 APIC 0x1
    [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
    perf: IBS APIC setup failed on cpu #1
    process: Switch to broadcast mode on CPU1
    CPU1 is up
    ...
    ACPI: Waking up from system sleep state S3

    The reason for this is that during suspend the LVT offset for the IBS
    vector gets lost and needs to be reinitialized while resuming.

    The offset is read from the IBSCTL msr. On family 10h the offset needs
    to be 1 as offset 0 is used for the MCE threshold interrupt, but
    firmware assigns it for IBS to 0 too. The kernel needs to reprogram
    the vector. The msr is a readonly node msr, but a new value can be
    written via pci config space access. The reinitialization is
    implemented for family 10h in setup_ibs_ctl() which is forced during
    IBS setup.

    This patch fixes IBS setup after waking up from S3 by adding
    resume/suspend hooks for the boot cpu which does the offset
    reinitialization.

    Marking it as stable to let distros pick up this fix.

    Signed-off-by: Robert Richter
    Signed-off-by: Peter Zijlstra
    Cc: v3.2..
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1389797849-5565-1-git-send-email-rric.net@gmail.com
    Signed-off-by: Ingo Molnar

    Robert Richter
     
  • Waiman managed to trigger a PMI while in an emulate_vsyscall() fault,
    the PMI in turn managed to trigger a fault while obtaining a stack
    trace. This triggered the sig_on_uaccess_error recursive fault logic
    and killed the process dead.

    Fix this by explicitly excluding interrupts from the recursive fault
    logic.

    Reported-and-Tested-by: Waiman Long
    Fixes: e00b12e64be9 ("perf/x86: Further optimize copy_from_user_nmi()")
    Cc: Aswin Chandramouleeswaran
    Cc: Scott J Norton
    Cc: Linus Torvalds
    Cc: Andy Lutomirski
    Cc: Arnaldo Carvalho de Melo
    Cc: Andrew Morton
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140110200603.GJ7572@laptop.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • …it.kernel.org/pub/scm/linux/kernel/git/tip/tip

    Pull scheduler and timer fixes from Ingo Molnar:
    "Contains a fix for a scheduler bug that manifested itself as a 3D
    performance regression and a crash fix for the ARM Cadence TTC clock
    driver"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched: Calculate effective load even if local weight is 0

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    clocksource: cadence_ttc: Fix mutex taken inside interrupt context

    Linus Torvalds
     
  • Pull locking fixes from Ingo Molnar:
    "Two fixes from lockdep coverage of seqlocks, which fix deadlocks on
    lockdep-enabled ARM systems"

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched_clock: Disable seqlock lockdep usage in sched_clock()
    seqlock: Use raw_ prefix instead of _no_lockdep

    Linus Torvalds
     
  • Pull hwmon fix from Guenter Roeck:
    "Fix attribute length problem in coretemp driver"

    * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    hwmon: (coretemp) Fix truncated name of alarm attributes

    Linus Torvalds
     
  • Pull ARM fixes from Russell King:
    "Another few fixes for ARM, nothing major here"

    * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
    ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling
    ARM: 7939/1: traps: fix opcode endianness when read from user memory
    ARM: 7937/1: perf_event: Silence sparse warning
    ARM: 7934/1: DT/kernel: fix arch_match_cpu_phys_id to avoid erroneous match
    Revert "ARM: 7908/1: mm: Fix the arm_dma_limit calculation"

    Linus Torvalds
     
  • Pull writeback fix from Wu Fengguang:
    "Fix data corruption on NFS writeback.

    It has been in linux-next for one month"

    * tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
    writeback: Fix data corruption on NFS

    Linus Torvalds