15 Apr, 2015
40 commits
-
A small cleanup. Seems in e3239ff9 ("memblock: Rename memblock_region to
memblock_type and memblock_property to memblock_region") this one was
missed.Signed-off-by: Baoquan He
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It's odd that we have populate_vma_page_range() and __mm_populate() in
mm/mlock.c. It's implementation of generic memory population and mlocking
is one of possible side effect, if VM_LOCKED is set.__get_user_pages() is core of the implementation. Let's move the code
into mm/gup.c.Signed-off-by: Kirill A. Shutemov
Acked-by: Linus Torvalds
Acked-by: David Rientjes
Cc: Michel Lespinasse
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is praparation to moving mm_populate()-related code out of
mm/mlock.c.Signed-off-by: Kirill A. Shutemov
Acked-by: Linus Torvalds
Acked-by: David Rientjes
Cc: Michel Lespinasse
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
__mlock_vma_pages_range() doesn't necessarily mlock pages. It depends on
vma flags. The same codepath is used for MAP_POPULATE.Let's rename __mlock_vma_pages_range() to populate_vma_page_range().
This patch also drops mlock_vma_pages_range() references from
documentation. It has gone in cea10a19b797 ("mm: directly use
__mlock_vma_pages_range() in find_extend_vma()").Signed-off-by: Kirill A. Shutemov
Acked-by: Linus Torvalds
Acked-by: David Rientjes
Cc: Michel Lespinasse
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
After commit a1fde08c74e9 ("VM: skip the stack guard page lookup in
get_user_pages only for mlock") FOLL_MLOCK has lost its original
meaning: we don't necessarily mlock the page if the flags is set -- we
also take VM_LOCKED into consideration.Since we use the same codepath for __mm_populate(), let's rename
FOLL_MLOCK to FOLL_POPULATE.Signed-off-by: Kirill A. Shutemov
Acked-by: Linus Torvalds
Acked-by: David Rientjes
Cc: Michel Lespinasse
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
slob_alloc_node() is only used in slob.c. Remove the EXPORT_SYMBOL and
make slob_alloc_node() static.Signed-off-by: Fabian Frederick
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use the normal return values for bool functions
Signed-off-by: Joe Perches
Cc: Christoph Lameter
Cc: Pekka Enberg
Acked-by: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
CONFIG_SLAB_DEBUG doesn't exist, CONFIG_DEBUG_SLAB does.
Signed-off-by: David Rientjes
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
By moving the O option detection into the switch statement, we allow this
parameter to be combined with other options correctly. Previously options
like slub_debug=OFZ would only detect the 'o' and use DEBUG_DEFAULT_FLAGS
to fill in the rest of the flags.Signed-off-by: Chris J Arges
Cc: Christoph Lameter
Cc: Pekka Enberg
Acked-by: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
With gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-12ubuntu1) :
mm/migrate.c: In function `migrate_pages':
mm/migrate.c:1148:1: internal compiler error: in push_minipool_fix, at config/arm/arm.c:13500
Please submit a full bug report,
with preprocessed source if appropriate.
See for instructions.
Preprocessed source stored into /tmp/ccPoM1tr.out file, please attach this to your bugreport.
make[1]: *** [mm/migrate.o] Error 1
make: *** [mm/migrate.o] Error 2Mark unmap_and_move() (which is used in a single place only) "noinline"
to work around this compiler bug.[akpm@linux-foundation.org: make it conditional on gcc-4.7.3 and arm]
[khilman@kernel.org: fine-tune compiler versions]
[akpm@linux-foundation.org: fix comment]
Signed-off-by: Geert Uytterhoeven
Reported-by: Kevin Hilman
Cc: Marc Zyngier
Tested-by: Kevin Hilman
Tested-by: Lina Iyer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Have kvm_guest_init() use hardlockup_detector_disable() instead of
watchdog_enable_hardlockup_detector(false).Remove the watchdog_hardlockup_detector_is_enabled() and the
watchdog_enable_hardlockup_detector() function which are no longer needed.Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Rename the update_timers*() functions to update_watchdog*().
Remove the boolean argument from watchdog_enable_all_cpus() because
update_watchdog_all_cpus() is now a generic function to change the run
state of the lockup detectors and to have the lockup detectors use a new
sample period.Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
With the current user interface of the watchdog mechanism it is only
possible to disable or enable both lockup detectors at the same time.
This series introduces new kernel parameters and changes the semantics of
some existing kernel parameters, so that the hard lockup detector and the
soft lockup detector can be disabled or enabled individually. With this
series applied, the user interface is as follows.- parameters in /proc/sys/kernel
. soft_watchdog
This is a new parameter to control and examine the run state of
the soft lockup detector.. nmi_watchdog
The semantics of this parameter have changed. It can now be used
to control and examine the run state of the hard lockup detector.. watchdog
This parameter is still available to control the run state of both
lockup detectors at the same time. If this parameter is examined,
it shows the logical OR of soft_watchdog and nmi_watchdog.. watchdog_thresh
The semantics of this parameter are not affected by the patch.- kernel command line parameters
. nosoftlockup
The semantics of this parameter have changed. It can now be used
to disable the soft lockup detector at boot time.. nmi_watchdog=0 or nmi_watchdog=1
Disable or enable the hard lockup detector at boot time. The patch
introduces '=1' as a new option.. nowatchdog
The semantics of this parameter are not affected by the patch. It
is still available to disable both lockup detectors at boot time.Also, remove the proc_dowatchdog() function which is no longer needed.
[dzickus@redhat.com: wrote changelog]
[dzickus@redhat.com: update documentation for kernel params and sysctl]
Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If watchdog_nmi_enable() fails to set up the hardware perf event of one
CPU, the entire hard lockup detector is deemed unreliable. Hence, disable
the hard lockup detector and shut down the hardware perf events on all
CPUs.[dzickus@redhat.com: update comments to explain some code]
Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Separate handlers for each watchdog parameter in /proc/sys/kernel replace
the proc_dowatchdog() function. Three of those handlers merely call
proc_watchdog_common() with one different argument.Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Three of four handlers for the watchdog parameters in /proc/sys/kernel
essentially have to do the same thing.if the parameter is being read {
return the state of the corresponding bit(s) in 'watchdog_enabled'
} else {
set/clear the state of the corresponding bit(s) in 'watchdog_enabled'
update the run state of the lockup detector(s)
}Hence, introduce a common function that can be called by those handlers.
The callers pass a 'bit mask' to this function to indicate which bit(s)
should be set/cleared in 'watchdog_enabled'.This function handles an uncommon race with watchdog_nmi_enable() where a
concurrent update of 'watchdog_enabled' is possible. We use 'cmpxchg' to
detect the concurrency. [This avoids introducing a new spinlock or a
mutex to synchronize updates of 'watchdog_enabled'. Using the same lock
or mutex in watchdog thread context and in system call context needs to be
considered carefully because it can make the code prone to deadlock
situations in connection with parking/unparking the watchdog threads.]Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This series removes proc_dowatchdog(). Since multiple new functions need
the 'watchdog_proc_mutex' to serialize access to the watchdog parameters
in /proc/sys/kernel, move the mutex outside of any function.Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This series introduces a separate handler for each watchdog parameter in
/proc/sys/kernel. The separate handlers need a common function that they
can call to update the run state of the lockup detectors, or to have the
lockup detectors use a new sample period.Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The hardlockup and softockup had always been tied together. Due to the
request of KVM folks, they had a need to have one enabled but not the
other. Internally rework the code to split things apart more cleanly.There is a bunch of churn here, but the end result should be code that
should be easier to maintain and fix without knowing the internals of what
is going on.This patch (of 9):
Introduce new definitions and variables to separate the user interface in
/proc/sys/kernel from the internal run state of the lockup detectors. The
internal run state is represented by two bits in a new variable that is
named 'watchdog_enabled'. This helps simplify the code, for example:- In order to check if any of the two lockup detectors is enabled,
it is sufficient to check if 'watchdog_enabled' is not zero.- In order to enable/disable one or both lockup detectors,
it is sufficient to set/clear one or both bits in 'watchdog_enabled'.- Concurrent updates of 'watchdog_enabled' need not be synchronized via
a spinlock or a mutex. Updates can either be atomic or concurrency can
be detected by using 'cmpxchg'.Signed-off-by: Ulrich Obergfell
Signed-off-by: Don Zickus
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
ocfs2 does
mlog_errno(v);
return v;in many places. Change mlog_errno() so we can do
return mlog_errno(v);
For some weird reason this patch reduces the size of ocfs2 by 6k:
akpm3:/usr/src/25> size fs/ocfs2/ocfs2.ko
text data bss dec hex filename
1146613 82767 832192 2061572 1f7504 fs/ocfs2/ocfs2.ko-before
1140857 82767 832192 2055816 1f5e88 fs/ocfs2/ocfs2.ko-after[dan.carpenter@oracle.com: double evaluation concerns in mlog_errno()]
Cc: Mark Fasheh
Cc: Joel Becker
Cc: alex chen
Signed-off-by: Dan Carpenter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If ocfs2 lockres has not been initialized before calling ocfs2_dlm_lock,
the lock won't be dropped and then will lead umount hung. The case is
described below:ocfs2_mknod
ocfs2_mknod_locked
__ocfs2_mknod_locked
ocfs2_journal_access_di
Failed because of -ENOMEM or other reasons, the inode lockres
has not been initialized yet.iput(inode)
ocfs2_evict_inode
ocfs2_delete_inode
ocfs2_inode_lock
ocfs2_inode_lock_full_nested
__ocfs2_cluster_lock
Succeeds and allocates a new dlm lockres.
ocfs2_clear_inode
ocfs2_open_unlock
ocfs2_drop_inode_locks
ocfs2_drop_lock
Since lockres has not been initialized, the lock
can't be dropped and the lockres can't be
migrated, thus umount will hang forever.Signed-off-by: Alex Chen
Reviewed-by: Joseph Qi
Reviewed-by: joyce.xue
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use the vsprintf %pV extension to avoid using a static buffer and remove
the now unnecessary buffer.Signed-off-by: Joe Perches
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
debugfs_create_dir and debugfs_create_file may return -ENODEV when debugfs
is not configured, so the return value should be checked against
ERROR_VALUE as well, otherwise the later dereference of the dentry pointer
would crash the kernel.This patch tries to solve this problem by fixing certain checks. However,
I have that found other call sites are protected by #ifdef CONFIG_DEBUG_FS.
In current implementation, if CONFIG_DEBUG_FS is defined, then the above
two functions will never return any ERROR_VALUE. So another possibility
to fix this is to surround all the buggy checks/functions with the same
#ifdef CONFIG_DEBUG_FS. But I'm not sure if this would break any functionality,
as only OCFS2_FS_STATS declares dependency on DEBUG_FS.Signed-off-by: Chengyu Song
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Jakub Wilk
Reviewed-by: Eric Ren
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In ocfs2_local_alloc_find_clear_bits and ocfs2_get_dentry, variable
numfound and set may be uninitialized and then used in tracepoint. In
ocfs2_xattr_block_get and ocfs2_delete_xattr_in_bucket, variable block_off
and xv may be uninitialized and then used in the following logic due to
unchecked return value.This patch fixes these possible issues.
Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Daeseok Youn
Reviewed-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
ocfs2_block_group_clear_bits will clear bits in block group bitmap.
Once it succeeds but fails in the following step, it will cause block
group bitmap mismatch the corresponding count recorded in dinode.
So rollback the cleared bits if error occurs.Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When ocfs2_get_system_file_inode fails, it is obscure to set the return
value to -EEXIST. So change it to -ENOENT.Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If the namelen is 20 and name only has actual length 16, it will fail in
ocfs2_find_entry because of mismatch. So use actual name length when find
entry.Signed-off-by: Joseph Qi
Signed-off-by: Yiwen Jiang
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The code at the "out" label assumes that "default_acl" and "acl" are NULL,
but actually the pointers can be NULL, unitialized, or freed.Signed-off-by: Dan Carpenter
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In ocfs2_reserve_local_alloc_bits, it calls ocfs2_error if local alloc
inode bitmap used bits mismatch, but the log mistakes it as free bits.Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In ocfs2_direct_IO_write, we use ocfs2_zero_extend to zero allocated
clusters in case of cluster not aligned. But ocfs2_zero_extend uses page
cache, this may happen that it clears the data which blockdev_direct_IO
has already written.We should use blkdev_issue_zeroout instead of ocfs2_zero_extend during
direct IO.So fix this issue by introducing ocfs2_direct_IO_zero_extend and
ocfs2_direct_IO_extend_no_holes.Reported-by: Yiwen Jiang
Signed-off-by: Joseph Qi
Tested-by: Yiwen Jiang
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We need take inode lock when calling ocfs2_get_clusters.
And use GFP_NOFS instead of GFP_KERNEL.Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Since di_bh won't be used when zeroing extend, set it to NULL.
Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Only when direct IO succeeds we need consider zeroing out in case of
cluster not aligned.Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix an off-by-one when attempting to avoid an msleep() on the final loop
iteration.Signed-off-by: Daeseok Youn
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
kfree() was called by user_cluster_connect() even if a previous call of
the kzalloc() function failed.Return from this implementation directly after failure detection.
Signed-off-by: Markus Elfring
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
__ocfs2_free_slot_info() was called by ocfs2_init_slot_info() even if a
call of the kzalloc() function failed.Return from this implementation directly after corresponding
exception handling.Signed-off-by: Markus Elfring
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
ocfs2_free_path() was called by ocfs2_merge_rec_right() even if a call of
the ocfs2_get_right_path() function failed.Return from this implementation directly after corresponding
exception handling.Signed-off-by: Markus Elfring
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
ocfs2_free_path() was called by ocfs2_merge_rec_left() even if a call of
the ocfs2_get_left_path() function failed.Return from this implementation directly after corresponding
exception handling.Signed-off-by: Markus Elfring
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds