Doug / smarc-fsl-linux-kernel | Embedian Git Server

09 Dec, 2011

18 commits

2a95ea6c0 procfs: do not overflow get_{idle,iowait}_time for nohz ... Browse Code »

Since commit a25cac5198d4 ("proc: Consider NO_HZ when printing idle and
iowait times") we are reporting idle/io_wait time also while a CPU is
tickless. We rely on get_{idle,iowait}_time functions to retrieve
proper data.

These functions, however, use usecs_to_cputime to translate micro
seconds time to cputime64_t. This is just an alias to usecs_to_jiffies
which reduces the data type from u64 to unsigned int and also checks
whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
and returns MAX_JIFFY_OFFSET in that case.

When we overflow depends on CONFIG_HZ but especially for CONFIG_HZ_300
it is quite low (1431649781) so we are getting MAX_JIFFY_OFFSET for
>3000s! until we overflow unsigned int. Just for reference
CONFIG_HZ_100 has an overflow window around 20s, CONFIG_HZ_250 ~8s and
CONFIG_HZ_1000 ~2s.

This results in a bug when people saw [h]top going mad reporting 100%
CPU usage even though there was basically no CPU load. The reason was
simply that /proc/stat stopped reporting idle/io_wait changes (and
reported MAX_JIFFY_OFFSET) and so the only change happening was for user
system time.

Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
to 32b type and it is much more appropriate for cumulative time values
(unlike usecs_to_jiffies which intended for timeout calculations).

Signed-off-by: Michal Hocko
Tested-by: Artem S. Tashkinov
Cc: Dave Jones
Cc: Arnd Bergmann
Cc: Alexey Dobriyan
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2011-12-09 23:50:29 +0800
1368edf06 mm: vmalloc: check for page allocation failure before vmlist insertion ... Browse Code »

Commit f5252e00 ("mm: avoid null pointer access in vm_struct via
/proc/vmallocinfo") adds newly allocated vm_structs to the vmlist after
it is fully initialised. Unfortunately, it did not check that
__vmalloc_area_node() successfully populated the area. In the event of
allocation failure, the vmalloc area is freed but the pointer to freed
memory is inserted into the vmlist leading to a a crash later in
get_vmalloc_info().

This patch adds a check for ____vmalloc_area_node() failure within
__vmalloc_node_range. It does not use "goto fail" as in the previous
error path as a warning was already displayed by __vmalloc_area_node()
before it called vfree in its failure path.

Credit goes to Luciano Chavez for doing all the real work of identifying
exactly where the problem was.

Signed-off-by: Mel Gorman
Reported-by: Luciano Chavez
Tested-by: Luciano Chavez
Reviewed-by: Rik van Riel
Acked-by: David Rientjes
Cc: [3.1.x+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2011-12-09 23:50:29 +0800
d02156388 mm: Ensure that pfn_valid() is called once per pageblock when reserving pageblocks ... Browse Code »

setup_zone_migrate_reserve() expects that zone->start_pfn starts at
pageblock_nr_pages aligned pfn otherwise we could access beyond an
existing memblock resulting in the following panic if
CONFIG_HOLES_IN_ZONE is not configured and we do not check pfn_valid:

IP: [] setup_zone_migrate_reserve+0xcd/0x180
*pdpt = 0000000000000000 *pde = f000ff53f000ff53
Oops: 0000 [#1] SMP
Pid: 1, comm: swapper Not tainted 3.0.7-0.7-pae #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
EIP: 0060:[] EFLAGS: 00010006 CPU: 0
EIP is at setup_zone_migrate_reserve+0xcd/0x180
EAX: 000c0000 EBX: f5801fc0 ECX: 000c0000 EDX: 00000000
ESI: 000c01fe EDI: 000c01fe EBP: 00140000 ESP: f2475f58
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 1, ti=f2474000 task=f2472cd0 task.ti=f2474000)
Call Trace:
[] __setup_per_zone_wmarks+0xec/0x160
[] setup_per_zone_wmarks+0xf/0x20
[] init_per_zone_wmark_min+0x27/0x86
[] do_one_initcall+0x2b/0x160
[] kernel_init+0xbe/0x157
[] kernel_thread_helper+0x6/0xd
Code: a5 39 f5 89 f7 0f 46 fd 39 cf 76 40 8b 03 f6 c4 08 74 32 eb 91 90 89 c8 c1 e8 0e 0f be 80 80 2f 86 c0 8b 14 85 60 2f 86 c0 89 c8 82 b4 12 00 00 c1 e0 05 03 82 ac 12 00 00 8b 00 f6 c4 08 0f
EIP: [] setup_zone_migrate_reserve+0xcd/0x180 SS:ESP 0068:f2475f58
CR2: 00000000000012b4

We crashed in pageblock_is_reserved() when accessing pfn 0xc0000 because
highstart_pfn = 0x36ffe.

The issue was introduced in 3.0-rc1 by 6d3163ce ("mm: check if any page
in a pageblock is reserved before marking it MIGRATE_RESERVE").

Make sure that start_pfn is always aligned to pageblock_nr_pages to
ensure that pfn_valid s always called at the start of each pageblock.
Architectures with holes in pageblocks will be correctly handled by
pfn_valid_within in pageblock_is_reserved.

Signed-off-by: Michal Hocko
Signed-off-by: Mel Gorman
Tested-by: Dang Bo
Reviewed-by: KAMEZAWA Hiroyuki
Cc: Andrea Arcangeli
Cc: David Rientjes
Cc: Arve Hjnnevg
Cc: KOSAKI Motohiro
Cc: John Stultz
Cc: Dave Hansen
Cc: [3.0+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2011-12-09 23:50:28 +0800
09761333e mm/migrate.c: pair unlock_page() and lock_page() when migrating huge pages ... Browse Code »

Avoid unlocking and unlocked page if we failed to lock it.

Signed-off-by: Hillf Danton
Cc: Naoya Horiguchi
Cc: Andrea Arcangeli
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hillf Danton
2011-12-09 23:50:28 +0800
58a84aa92 thp: set compound tail page _count to zero ... Browse Code »

Commit 70b50f94f1644 ("mm: thp: tail page refcounting fix") keeps all
page_tail->_count zero at all times. But the current kernel does not
set page_tail->_count to zero if a 1GB page is utilized. So when an
IOMMU 1GB page is used by KVM, it wil result in a kernel oops because a
tail page's _count does not equal zero.

kernel BUG at include/linux/mm.h:386!
invalid opcode: 0000 [#1] SMP
Call Trace:
gup_pud_range+0xb8/0x19d
get_user_pages_fast+0xcb/0x192
? trace_hardirqs_off+0xd/0xf
hva_to_pfn+0x119/0x2f2
gfn_to_pfn_memslot+0x2c/0x2e
kvm_iommu_map_pages+0xfd/0x1c1
kvm_iommu_map_memslots+0x7c/0xbd
kvm_iommu_map_guest+0xaa/0xbf
kvm_vm_ioctl_assigned_device+0x2ef/0xa47
kvm_vm_ioctl+0x36c/0x3a2
do_vfs_ioctl+0x49e/0x4e4
sys_ioctl+0x5a/0x7c
system_call_fastpath+0x16/0x1b
RIP gup_huge_pud+0xf2/0x159

Signed-off-by: Youquan Song
Reviewed-by: Andrea Arcangeli
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Youquan Song
2011-12-09 23:50:28 +0800
b6999b191 thp: add compound tail page _mapcount when mapped ... Browse Code »

With the 3.2-rc kernel, IOMMU 2M pages in KVM works. But when I tried
to use IOMMU 1GB pages in KVM, I encountered an oops and the 1GB page
failed to be used.

The root cause is that 1GB page allocation calls gup_huge_pud() while 2M
page calls gup_huge_pmd. If compound pages are used and the page is a
tail page, gup_huge_pmd() increases _mapcount to record tail page are
mapped while gup_huge_pud does not do that.

So when the mapped page is relesed, it will result in kernel oops
because the page is not marked mapped.

This patch add tail process for compound page in 1GB huge page which
keeps the same process as 2M page.

Reproduce like:
1. Add grub boot option: hugepagesz=1G hugepages=8
2. mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages
3. qemu-kvm -m 2048 -hda os-kvm.img -cpu kvm64 -smp 4 -mem-path /dev/hugepages
-net none -device pci-assign,host=07:00.1

kernel BUG at mm/swap.c:114!
invalid opcode: 0000 [#1] SMP
Call Trace:
put_page+0x15/0x37
kvm_release_pfn_clean+0x31/0x36
kvm_iommu_put_pages+0x94/0xb1
kvm_iommu_unmap_memslots+0x80/0xb6
kvm_assign_device+0xba/0x117
kvm_vm_ioctl_assigned_device+0x301/0xa47
kvm_vm_ioctl+0x36c/0x3a2
do_vfs_ioctl+0x49e/0x4e4
sys_ioctl+0x5a/0x7c
system_call_fastpath+0x16/0x1b
RIP put_compound_page+0xd4/0x168

Signed-off-by: Youquan Song
Reviewed-by: Andrea Arcangeli
Cc: Andi Kleen
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Youquan Song
2011-12-09 23:50:28 +0800
09dc3cf93 printk: avoid double lock acquire ... Browse Code »

Commit 4f2a8d3cf5e ("printk: Fix console_sem vs logbuf_lock unlock race")
introduced another silly bug where we would want to acquire an already
held lock. Avoid this.

Reported-by: Andrea Arcangeli
Signed-off-by: Peter Zijlstra
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2011-12-09 23:50:28 +0800
c193c82f0 memcg: update maintainers ... Browse Code »

More players joined to memory cgroup developments and Johannes' great work
changed internal design of memory cgroup dramatically. And he will do
more works. Michal Hokko did many bug fixes and know memory cgroup very
well. Daisuke Nishimura helped us very much but he seems busy now.
Thanks to his works.

Signed-off-by: KAMEZAWA Hiroyuki
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Acked-by: Daisuke Nishimura
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2011-12-09 23:50:28 +0800
2dbcd05f1 drivers/rtc/rtc-s3c.c: fix driver clock enable/disable balance issues ... Browse Code »

If an error occurs after the clock is enabled, the enable/disable state
can become unbalanced.

Signed-off-by: Jonghwan Choi
Cc: Alessandro Zummo
Acked-by: Kukjin Kim
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jonghwan Choi
2011-12-09 23:50:28 +0800
1de8ad43d CREDITS: update Kees's expired fingerprint and fix details ... Browse Code »

Small clean-up for my CREDITS entry; the GPG fingerprint was not up to
date, so I fixed other details at the same time too.

Signed-off-by: Kees Cook
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2011-12-09 23:50:28 +0800
1dfb059b9 thp: reduce khugepaged freezing latency ... Browse Code »

khugepaged can sometimes cause suspend to fail, requiring that the user
retry the suspend operation.

Use wait_event_freezable_timeout() instead of
schedule_timeout_interruptible() to avoid missing freezer wakeups. A
try_to_freeze() would have been needed in the khugepaged_alloc_hugepage
tight loop too in case of the allocation failing repeatedly, and
wait_event_freezable_timeout will provide it too.

khugepaged would still freeze just fine by trying again the next minute
but it's better if it freezes immediately.

Reported-by: Jiri Slaby
Signed-off-by: Andrea Arcangeli
Tested-by: Jiri Slaby
Cc: Tejun Heo
Cc: Oleg Nesterov
Cc: "Srivatsa S. Bhat"
Cc: "Rafael J. Wysocki"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrea Arcangeli
2011-12-09 23:50:28 +0800
b53fc7c29 fs/proc/meminfo.c: fix compilation error ... Browse Code »

Fix the error message "directives may not be used inside a macro argument"
which appears when the kernel is compiled for the cris architecture.

Signed-off-by: Claudio Scordino
Cc: Andrea Arcangeli
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Claudio Scordino
2011-12-09 23:50:28 +0800
83aeeada7 vmscan: use atomic-long for shrinker batching ... Browse Code »

Use atomic-long operations instead of looping around cmpxchg().

[akpm@linux-foundation.org: massage atomic.h inclusions]
Signed-off-by: Konstantin Khlebnikov
Cc: Dave Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Konstantin Khlebnikov
2011-12-09 23:50:27 +0800
635697c66 vmscan: fix initial shrinker size handling ... Browse Code »

A shrinker function can return -1, means that it cannot do anything
without a risk of deadlock. For example prune_super() does this if it
cannot grab a superblock refrence, even if nr_to_scan=0. Currently we
interpret this -1 as a ULONG_MAX size shrinker and evaluate `total_scan'
according to this. So the next time around this shrinker can cause
really big pressure. Let's skip such shrinkers instead.

Also make total_scan signed, otherwise the check (total_scan < 0) below
never works.

Signed-off-by: Konstantin Khlebnikov
Cc: Dave Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Konstantin Khlebnikov
2011-12-09 23:50:27 +0800
09d9673d5 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
alarmtimers: Fix time comparison
ptp: Fix clock_getres() implementation

Linus Torvalds
2011-12-09 05:21:28 +0800
fb38f9b8f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: drop spin lock when memory alloc fails
Btrfs: check if the to-be-added device is writable
Btrfs: try cluster but don't advance in search list
Btrfs: try to allocate from cluster even at LOOP_NO_EMPTY_SIZE

Linus Torvalds
2011-12-09 05:18:59 +0800
8bd1c8815 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc ... Browse Code »

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits)
ARM: sa1100: fix build error
ARM: OMAP1: recalculate loops per jiffy after dpll1 reprogram
ARM: davinci: dm365 evm: align nand partition table to u-boot
ARM: davinci: da850 evm: change audio edma event queue to EVENTQ_0
ARM: davinci: dm646x evm: wrong register used in setup_vpif_input_channel_mode
ARM: davinci: dm646x does not have a DSP domain
ARM: davinci: psc: fix incorrect offsets
ARM: davinci: psc: fix incorrect mask
ARM: mx28: LRADC macro rename
arm: mx23: recognise stmp378x as mx23
ARM: mxs: fix machines' initializers order
ARM: mxs/tx28: add __initconst for fec pdata
ARM: S3C64XX: Staticise s3c6400_sysclass
ARM: S3C64XX: Add linux/export.h to dev-spi.c
ARM: S3C64XX: Remove extern from definition of framebuffer setup call
MAINTAINERS: Extend Samsung patterns to cover SPI and ASoC drivers
MAINTAINERS: Add linux-samsung-soc mailing list for Samsung
MAINTAINERS: Consolidate Samsung MAINTAINERS
ARM: CSR: PM: fix build error due to undeclared 'THIS_MODULE'
ARM: CSR: fix build error due to new mdesc->dma_zone_size
...

Linus Torvalds
2011-12-09 05:18:38 +0800
1418a3e5a TOMOYO: Fix pathname handling of disconnected paths. ... Browse Code »

Current tomoyo_realpath_from_path() implementation returns strange pathname
when calculating pathname of a file which belongs to lazy unmounted tree.
Use local pathname rather than strange absolute pathname in that case.

Also, this patch fixes a regression by commit 02125a82 "fix apparmor
dereferencing potentially freed dentry, sanitize __d_path() API".

Signed-off-by: Tetsuo Handa
Acked-by: Al Viro
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds

Tetsuo Handa
2011-12-09 05:18:12 +0800

08 Dec, 2011

15 commits

073c46031 Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixes Browse Code »

Arnd Bergmann
2011-12-08 23:52:23 +0800
1cf4ffdb3 Btrfs: drop spin lock when memory alloc fails ... Browse Code »

Drop spin lock in convert_extent_bit() when memory alloc fails,
otherwise, it will be a deadlock.

Signed-off-by: Liu Bo
Signed-off-by: Chris Mason

Liu Bo
2011-12-08 21:55:47 +0800
a5d163336 Btrfs: check if the to-be-added device is writable ... Browse Code »

If we call ioctl(BTRFS_IOC_ADD_DEV) directly, we'll succeed in adding
a readonly device to a btrfs filesystem, and btrfs will write to
that device, emitting kernel errors:

[ 3109.833692] lost page write due to I/O error on loop2
[ 3109.833720] lost page write due to I/O error on loop2
...

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason

Li Zefan
2011-12-08 21:55:46 +0800
274bd4fb3 Btrfs: try cluster but don't advance in search list ... Browse Code »

When we find an existing cluster, we switch to its block group as the
current block group, possibly skipping multiple blocks in the process.
Furthermore, under heavy contention, multiple threads may fail to
allocate from a cluster and then release just-created clusters just to
proceed to create new ones in a different block group.

This patch tries to allocate from an existing cluster regardless of its
block group, and doesn't switch to that group, instead proceeding to
try to allocate a cluster from the group it was iterating before the
attempt.

Signed-off-by: Alexandre Oliva
Signed-off-by: Chris Mason

Alexandre Oliva
2011-12-08 21:55:40 +0800
c564a0cb9 ARM: sa1100: fix build error ... Browse Code »

arm-eabi-4.4.3-ld:--defsym zreladdr=: syntax error
make[2]: *** [arch/arm/boot/compressed/vmlinux] Error 1
make[1]: *** [arch/arm/boot/compressed/vmlinux] Error 2
make: *** [uImage] Error 2

Signed-off-by: Haojian Zhuang
Signed-off-by: Jett.Zhou

Jett.Zhou
2011-12-08 14:55:57 +0800
b981f980b Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Browse Code »

Olof Johansson
2011-12-08 12:36:27 +0800
34a9d2c39 Merge branch '3.2-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending ... Browse Code »

* '3.2-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (25 commits)
iscsi-target: Fix hex2bin warn_unused compile message
target: Don't return an error if disabling unsupported features
target/rd: fix or rewrite the copy routine
target/rd: simplify the page/offset computation
target: remove the unused se_dev_list
target/file: walk properly over sg list
target: remove unused struct fields
target: Fix page length in emulated INQUIRY VPD page 86h
target: Handle 0 correctly in transport_get_sectors_6()
target: Don't return an error status for 0-length READ and WRITE
iscsi-target: Use kmemdup rather than duplicating its implementation
iscsi-target: Add missing F_BIT for iscsi_tm_rsp
iscsi-target: Fix residual count hanlding + remove iscsi_cmd->residual_count
target: Reject SCSI data overflow for fabrics using transport_generic_map_mem_to_cmd
target: remove the unused t_task_pt_sgl and t_task_pt_sgl_num se_cmd fields
target: remove the t_tasks_bidi se_cmd field
target: remove the t_tasks_fua se_cmd field
target: remove the se_ordered_node se_cmd field
target: remove the se_obj_ptr and se_orig_obj_ptr se_cmd fields
target: Drop config_item_name usage in fabric TFO->free_wwn()
...

Linus Torvalds
2011-12-08 10:18:27 +0800
062c05c46 Btrfs: try to allocate from cluster even at LOOP_NO_EMPTY_SIZE ... Browse Code »

If we reach LOOP_NO_EMPTY_SIZE, we won't even try to use a cluster that
others might have set up. Odds are that there won't be one, but if
someone else succeeded in setting it up, we might as well use it, even
if we don't try to set up a cluster again.

Signed-off-by: Alexandre Oliva
Signed-off-by: Chris Mason

Alexandre Oliva
2011-12-08 08:50:42 +0800
a694ad94b Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs ... Browse Code »

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: fix the logspace waiting algorithm
xfs: fix nfs export of 64-bit inodes numbers on 32-bit kernels
xfs: fix allocation length overflow in xfs_bmapi_write()

Linus Torvalds
2011-12-08 08:13:54 +0800
1c70132ff Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Driver core: leave runtime PM enabled during system shutdown

Linus Torvalds
2011-12-08 08:12:29 +0800
22c6b32d8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k ... Browse Code »

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Wire up process_vm_{read,write}v

Linus Torvalds
2011-12-08 08:12:14 +0800
fe6b91f47 PM / Driver core: leave runtime PM enabled during system shutdown ... Browse Code »

Disabling all runtime PM during system shutdown turns out not to be a
good idea, because some devices may need to be woken up from a
low-power state at that time.

The whole point of disabling runtime PM for system shutdown was to
prevent untimely runtime-suspend method calls. This patch (as1504)
accomplishes the same result by incrementing the usage count for each
device and waiting for ongoing runtime-PM callbacks to finish. This
is what we already do during system suspend and hibernation, which
makes sense since the shutdown method is pretty much a legacy analog
of the pm->poweroff method.

This fixes a recent regression on some OMAP systems introduced by
commit af8db1508f2c9f3b6e633e2d2d906c6557c617f9 (PM / driver core:
disable device's runtime PM during shutdown).

Reported-and-tested-by: NeilBrown
Signed-off-by: Alan Stern
Acked-by: Greg Kroah-Hartman
Signed-off-by: Rafael J. Wysocki

Alan Stern
2011-12-08 05:26:56 +0800
77a7300ab of/irq: Get rid of NO_IRQ usage ... Browse Code »

PPC32/64 defines NO_IRQ to zero, so no problems expected.
ARM defines NO_IRQ to -1, but OF code relies on IRQ domains support,
which returns correct ('0') value in 'no irq' case. So everything
should be fine.

Other arches might break if some of their OF drivers rely on NO_IRQ
being not 0. If so, the drivers must be fixed, finally.

[ Rob Herring points out that microblaze should be fixed, and has posted
a patch for testing for that. - Linus ]

Signed-off-by: Anton Vorontsov
Acked-by: Wolfram Sang
Signed-off-by: Linus Torvalds

Anton Vorontsov
2011-12-08 01:06:37 +0800
10ec5e6c0 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux ... Browse Code »

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
vmwgfx: Use kcalloc instead of kzalloc to allocate array
drm/i915: fix infinite recursion on unbind due to ilk vt-d w/a
drm/radeon/kms: fix return type for radeon_encoder_get_dp_bridge_encoder_id

Linus Torvalds
2011-12-08 00:20:01 +0800
3172f8fe1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API

Linus Torvalds
2011-12-08 00:14:42 +0800

07 Dec, 2011

7 commits

24bb5a0ce vmwgfx: Use kcalloc instead of kzalloc to allocate array ... Browse Code »

The advantage of kcalloc is, that will prevent integer overflows which could
result from the multiplication of number of elements and size and it is also
a bit nicer to read.

The semantic patch that makes this change is available
in https://lkml.org/lkml/2011/11/25/107

Signed-off-by: Thomas Meyer
Reviewed-by: Jakob Bornecrantz
Signed-off-by: Dave Airlie

Thomas Meyer
2011-12-07 18:44:41 +0800
eb1711bb9 drm/i915: fix infinite recursion on unbind due to ilk vt-d w/a ... Browse Code »

The recursion loop goes retire_requests->unbind->gpu_idle->retire_reqeusts.

Every time we go through this we need a
- active object that can be retired
- and there are no other references to that object than the one from
the active list, so that it gets unbound and freed immediately.
Otherwise the recursion stops. So the recursion is only limited by the
number of objects that fit these requirements sitting in the active list
any time retire_request is called.

Issue exercised by tests/gem_unref_active_buffers from i-g-t.

There's been a decent bikeshed discussion whether it wouldn't be
better to pass around a flag, but imo this is o.k. for such a limited
case that only supports a w/a.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42180

Signed-Off-by: Daniel Vetter
Reviewed-by: Chris Wilson
[ickle- we built better bikesheds, but this keeps the rain off for now]
Tested-by: Dave Airlie
Signed-off-by: Dave Airlie

Daniel Vetter
2011-12-07 18:44:40 +0800
dc87cd5c2 drm/radeon/kms: fix return type for radeon_encoder_get_dp_bridge_encoder_id ... Browse Code »

Seems like something got mis-merged here.

Noticed by kallisti5 on IRC.

Signed-off-by: Alex Deucher
Signed-off-by: Dave Airlie

Alex Deucher
2011-12-07 18:44:38 +0800
e3a36c415 Merge branch 'samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/… ... Browse Code »

…kgene/linux-samsung into fixes

Arnd Bergmann
2011-12-07 18:27:55 +0800
02125a826 fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API ... Browse Code »

__d_path() API is asking for trouble and in case of apparmor d_namespace_path()
getting just that. The root cause is that when __d_path() misses the root
it had been told to look for, it stores the location of the most remote ancestor
in *root. Without grabbing references. Sure, at the moment of call it had
been pinned down by what we have in *path. And if we raced with umount -l, we
could have very well stopped at vfsmount/dentry that got freed as soon as
prepend_path() dropped vfsmount_lock.

It is safe to compare these pointers with pre-existing (and known to be still
alive) vfsmount and dentry, as long as all we are asking is "is it the same
address?". Dereferencing is not safe and apparmor ended up stepping into
that. d_namespace_path() really wants to examine the place where we stopped,
even if it's not connected to our namespace. As the result, it looked
at ->d_sb->s_magic of a dentry that might've been already freed by that point.
All other callers had been careful enough to avoid that, but it's really
a bad interface - it invites that kind of trouble.

The fix is fairly straightforward, even though it's bigger than I'd like:
* prepend_path() root argument becomes const.
* __d_path() is never called with NULL/NULL root. It was a kludge
to start with. Instead, we have an explicit function - d_absolute_root().
Same as __d_path(), except that it doesn't get root passed and stops where
it stops. apparmor and tomoyo are using it.
* __d_path() returns NULL on path outside of root. The main
caller is show_mountinfo() and that's precisely what we pass root for - to
skip those outside chroot jail. Those who don't want that can (and do)
use d_path().
* __d_path() root argument becomes const. Everyone agrees, I hope.
* apparmor does *NOT* try to use __d_path() or any of its variants
when it sees that path->mnt is an internal vfsmount. In that case it's
definitely not mounted anywhere and dentry_path() is exactly what we want
there. Handling of sysctl()-triggered weirdness is moved to that place.
* if apparmor is asked to do pathname relative to chroot jail
and __d_path() tells it we it's not in that jail, the sucker just calls
d_absolute_path() instead. That's the other remaining caller of __d_path(),
BTW.
* seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway -
the normal seq_file logics will take care of growing the buffer and redoing
the call of ->show() just fine). However, if it gets path not reachable
from root, it returns SEQ_SKIP. The only caller adjusted (i.e. stopped
ignoring the return value as it used to do).

Reviewed-by: John Johansen
ACKed-by: John Johansen
Signed-off-by: Al Viro
Cc: stable@vger.kernel.org

Al Viro
2011-12-07 12:57:18 +0800
9f9c19ec1 xfs: fix the logspace waiting algorithm ... Browse Code »

Apply the scheme used in log_regrant_write_log_space to wake up any other
threads waiting for log space before the newly added one to
log_regrant_write_log_space as well, and factor the code into readable
helpers. For each of the queues we have add two helpers:

- one to try to wake up all waiting threads. This helper will also be
usable by xfs_log_move_tail once we remove the current opportunistic
wakeups in it.
- one to sleep on t_wait until enough log space is available, loosely
modelled after Linux waitqueues.

And use them to reimplement the guts of log_regrant_write_log_space and
log_regrant_write_log_space. These two function now use one and the same
algorithm for waiting on log space instead of subtly different ones before,
with an option to completely unify them in the near future.

Also move the filesystem shutdown handling to the common caller given
that we had to touch it anyway.

Based on hard debugging and an earlier patch from
Chandra Seetharaman .

Signed-off-by: Christoph Hellwig
Reviewed-by: Chandra Seetharaman
Tested-by: Chandra Seetharaman
Signed-off-by: Ben Myers

Christoph Hellwig
2011-12-07 04:19:47 +0800
b835c0f47 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Silence seq_scale() unused warning
ipv4:correct description for tcp_max_syn_backlog
pasemi_mac: Fix building as module
netback: Fix alert message.
r8169: fix Rx index race between FIFO overflow recovery and NAPI handler.
r8169: Rx FIFO overflow fixes.
ipv4: Fix peer validation on cached lookup.
ipv4: make sure RTO_ONLINK is saved in routing cache
iwlwifi: change the default behavior of watchdog timer
iwlwifi: do not re-configure HT40 after associated
iwlagn: fix HW crypto for TX-only keys
Revert "mac80211: clear sta.drv_priv on reconfiguration"
mac80211: fill rate filter for internal scan requests
cfg80211: amend regulatory NULL dereference fix
cfg80211: fix race on init and driver registration

Linus Torvalds
2011-12-07 04:03:54 +0800