05 Aug, 2016
1 commit
-
* pm-sleep:
x86/power/64: Do not refer to __PAGE_OFFSET from assembly code* pm-cpufreq:
cpufreq: Do not default-yes CPU_FREQ_STAT
cpufreq: intel_pstate: Add more out-of-band IDs* pm-core:
PM-wakeup: Delete unnecessary checks before three function calls* pm-opp:
PM / OPP: optimize dev_pm_opp_set_rate() performance a bit
03 Aug, 2016
2 commits
-
When CONFIG_RANDOMIZE_MEMORY is set on x86-64, __PAGE_OFFSET becomes
a variable and using it as a symbol in the image memory restoration
assembly code under core_restore_code is not correct any more.To avoid that problem, modify set_up_temporary_mappings() to compute
the physical address of the temporary page tables and store it in
temp_level4_pgt, so that the value of that variable is ready to be
written into CR3. Then, the assembly code doesn't have to worry
about converting that value into a physical address and things work
regardless of whether or not CONFIG_RANDOMIZE_MEMORY is set.Reported-and-tested-by: Thomas Garnier
Signed-off-by: Rafael J. Wysocki -
CPU frequency transition statistics are not absolutely required for
proper cpufreq operation on the system AFAICT so remove the default-yes
setting in Kconfig.Signed-off-by: Borislav Petkov
Acked-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki
30 Jul, 2016
16 commits
-
Pull power management fix from Rafael Wysocki:
"Fix a nasty (and really hard to debug) memory corruption during resume
from hibernation on x86-64 (that leads to a kernel panic most of the
time) due to the use of a stale stack pointer value in FRAME_BEGIN
(Josh Poimboeuf)"* tag 'pm-urgent-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
x86/power/64: Fix hibernation return address corruption -
Pull more cgroup updates from Tejun Heo:
"I forgot to include the patches which got applied to for-4.7-fixes
late during last cycle.Eric's three patches fix bugs introduced with the namespace support"
* 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroupns: Only allow creation of hierarchies in the initial cgroup namespace
cgroupns: Close race between cgroup_post_fork and copy_cgroup_ns
cgroupns: Fix the locking in copy_cgroup_ns -
Pull smp hotplug updates from Thomas Gleixner:
"This is the next part of the hotplug rework.- Convert all notifiers with a priority assigned
- Convert all CPU_STARTING/DYING notifiers
The final removal of the STARTING/DYING infrastructure will happen
when the merge window closes.Another 700 hundred line of unpenetrable maze gone :)"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
timers/core: Correct callback order during CPU hot plug
leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
powerpc/numa: Convert to hotplug state machine
arm/perf: Fix hotplug state machine conversion
irqchip/armada: Avoid unused function warnings
ARC/time: Convert to hotplug state machine
clocksource/atlas7: Convert to hotplug state machine
clocksource/armada-370-xp: Convert to hotplug state machine
clocksource/exynos_mct: Convert to hotplug state machine
clocksource/arm_global_timer: Convert to hotplug state machine
rcu: Convert rcutree to hotplug state machine
KVM/arm/arm64/vgic-new: Convert to hotplug state machine
smp/cfd: Convert core to hotplug state machine
x86/x2apic: Convert to CPU hotplug state machine
profile: Convert to hotplug state machine
timers/core: Convert to hotplug state machine
hrtimer: Convert to hotplug state machine
x86/tboot: Convert to hotplug state machine
arm64/armv8 deprecated: Convert to hotplug state machine
hwtracing/coresight-etm4x: Convert to hotplug state machine
... -
Pull IDE updates from David Miller:
"Just a couple small bug fixes, nothing overly exciting in here"* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
ide: missing break statement in set_timings_mdma()
ide: hpt366: fix incorrect mask when checking at cmd_high_time
ide-tape: fix misprint in failure handling in idetape_init()
cmd640: add __init attribute -
Pull sparc updates from David Miller:
1) Double spin lock bug in sunhv serial driver, from Dan Carpenter.
2) Use correct RSS estimate when determining whether to grow the huge
TSB or not, from Mike Kravetz.3) Don't use full three level page tables for hugepages, PMD level is
sufficient. From Nitin Gupta.4) Mask out extraneous bits from TSB_TAG_ACCESS register, we only want
the address bits.* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Trim page tables for 8M hugepages
sparc64 mm: Fix base TSB sizing when hugetlb pages are used
sparc: serial: sunhv: fix a double lock bug
sparc32: off by ones in BUG_ON()
sparc: Don't leak context bits into thread->fault_address -
Pull ARC updates from Vineet Gupta:
"Things have been calm here - nothing much except for a few fixes"* tag 'arc-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: mm: don't loose PTE_SPECIAL in pte_modify()
ARC: dma: fix address translation in arc_dma_free
ARC: typo fix in mm/ioremap.c
ARC: fix linux-next build breakage -
* pm-sleep:
x86/power/64: Fix hibernation return address corruption -
Pull AVR32 updates from Hans-Christian Noren Egtvedt.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
avr32: off by one in at32_init_pio()
avr32: fixup code style in unistd.h and syscall_table.S
avr32: wire up preadv2 and pwritev2 syscalls -
Pull ARM updates from Russell King:
"Included in this update are:- Patches from Gregory Clement to fix the coherent DMA cases in our
dma-mapping code.- A number of CPU errata updates and fixes.
- ARM cpuidle improvements from Jisheng Zhang.
- Fix from Kees for the location of _etext.
- Cleanups from Masahiro Yamada to avoid duplicated messages during
the kernel build, and remove CONFIG_ARCH_HAS_BARRIERS.- Remove a udelay loop limitation, allowing for faster CPUs to
calibrate the delay correctly.- Cleanup some left-overs from the SW PAN implementation.
- Ensure that a modified address limit is not visible to exception
handlers"* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (21 commits)
ARM: 8586/1: cpuidle: make arm_cpuidle_suspend() a bit more efficient
ARM: 8585/1: cpuidle: fix !cpuidle_ops[cpu].init case during init
ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is used
ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherent
ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421
ARM: 8559/1: errata: Workaround erratum A12 821420
ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423
ARM: save and reset the address limit when entering an exception
ARM: 8577/1: Fix Cortex-A15 798181 errata initialization
ARM: 8584/1: floppy: avoid gcc-6 warning
ARM: 8583/1: mm: fix location of _etext
ARM: 8582/1: remove unused CONFIG_ARCH_HAS_BARRIERS
ARM: 8306/1: loop_udelay: remove bogomips value limitation
ARM: 8581/1: add missing to arch/arm/kernel/devtree.c
ARM: 8576/1: avoid duplicating "Kernel: arch/arm/boot/*Image is ready"
ARM: 8556/1: on a generic DT system: do not touch l2x0
ARM: uaccess: remove put_user() code duplication
ARM: 8580/1: Remove orphaned __addr_ok() definition
ARM: get rid of horrible *(unsigned int *)(regs + 1)
ARM: introduce svc_pt_regs structure
... -
Pull fuse updates from Miklos Szeredi:
"This fixes error propagation from writeback to fsync/close for
writeback cache mode as well as adding a missing capability flag to
the INIT message. The rest are cleanups.(The commits are recent but all the code actually sat in -next for a
while now. The recommits are due to conflict avoidance and the
addition of Cc: stable@...)"* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: use filemap_check_errors()
mm: export filemap_check_errors() to modules
fuse: fix wrong assignment of ->flags in fuse_send_init()
fuse: fuse_flush must check mapping->flags for errors
fuse: fsync() did not return IO errors
fuse: don't mess with blocking signals
new helper: wait_event_killable_exclusive()
fuse: improve aio directIO write performance for size extending writes -
This reverts commit 3c9fe8cdff1b889a059a30d22f130372f2b3885f.
As Miklos points out in commit c1b2cc1a765a, the "lookup_hash()" helper
is now unused, and in fact, with the hash salting changes, since the
hash of a dentry name now depends on the directory dentry it is in, the
helper function isn't even really likely to be useful.So rather than keep it around in case somebody else might end up finding
a use for it, let's just remove the helper and not trick people into
thinking it might be a useful thing.For example, I had obviously completely missed how the helper didn't
follow the normal dentry hashing patterns, and how the hash salting
patch broke overlayfs. Things would quietly build and look sane, but
not work.Suggested-by: Miklos Szeredi
Signed-off-by: Linus Torvalds -
Pull overlayfs update from Miklos Szeredi:
"First of all, this fixes a regression in overlayfs introduced by the
dentry hash salting. I've moved the patch fixing this to the front of
the queue, so if (god forbid) something needs to be bisected in
overlayfs this regression won't interfere with that.The biggest part is preparation for selinux support, done by Vivek
Goyal. Essentially this makes all operations on underlying
filesystems be done with credentials of mounter. This makes
everything nicely consistent.There are also fixes for a number of known and recently discovered
non-standard behavior (thanks to Eryu Guan for testing and improving
the test suites)"* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (23 commits)
ovl: simplify empty checking
qstr: constify instances in overlayfs
ovl: clear nlink on rmdir
ovl: disallow overlayfs as upperdir
ovl: fix warning
ovl: remove duplicated include from super.c
ovl: append MAY_READ when diluting write checks
ovl: dilute permission checks on lower only if not special file
ovl: fix POSIX ACL setting
ovl: share inode for hard link
ovl: store real inode pointer in ->i_private
ovl: permission: return ECHILD instead of ENOENT
ovl: update atime on upper
ovl: fix sgid on directory
ovl: simplify permission checking
ovl: do not require mounter to have MAY_WRITE on lower
ovl: do operations on underlying file system in mounter's context
ovl: modify ovl_permission() to do checks on two inodes
ovl: define ->get_acl() for overlay inodes
ovl: move some common code in a function
... -
Pull freevxfs updates from Christoph Hellwig:
"Support for foreign endianess and HP-UP superblocks from
Krzysztof Błaszkowski"* tag 'freevxfs-for-4.8' of git://git.infradead.org/users/hch/freevxfs:
freevxfs: update Kconfig information
freevxfs: refactor readdir and lookup code
freevxfs: fix lack of inode initialization
freevxfs: fix memory leak in vxfs_read_fshead()
freevxfs: update documentation and cresdits for HP-UX support
freevxfs: implement ->alloc_inode and ->destroy_inode
freevxfs: avoid the need for forward declaring the super operations
freevxfs: move VFS inode allocation into vxfs_blkiget and vxfs_stiget
freevxfs: remove vxfs_put_fake_inode
freevxfs: handle big endian HP-UX file systems -
Pull configfs update from Christoph Hellwig:
"A simple error handling fix from Tal Shorer"* tag 'configfs-for-4.8' of git://git.infradead.org/users/hch/configfs:
configfs: don't set buffer_needs_fill to zero if show() returns error -
Pull CIFS/SMB3 fixes from Steve French:
"Various CIFS/SMB3 fixes, most for stable"* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Fix a possible invalid memory access in smb2_query_symlink()
fs/cifs: make share unaccessible at root level mountable
cifs: fix crash due to race in hmac(md5) handling
cifs: unbreak TCP session reuse
cifs: Check for existing directory when opening file with O_CREAT
Add MF-Symlinks support for SMB 2.0 -
For PMD aligned (8M) hugepages, we currently allocate
all four page table levels which is wasteful. We now
allocate till PMD level only which saves memory usage
from page tables.Also, when freeing page table for 8M hugepage backed region,
make sure we don't try to access non-existent PTE level.Orabug: 22630259
Signed-off-by: Nitin Gupta
Signed-off-by: David S. Miller
29 Jul, 2016
21 commits
-
Signed-off-by: Miklos Szeredi
-
Can be used by fuse, btrfs and f2fs to replace opencoded variants.
Signed-off-by: Miklos Szeredi
-
FUSE_HAS_IOCTL_DIR should be assigned to ->flags, it may be a typo.
Signed-off-by: Wei Fang
Signed-off-by: Miklos Szeredi
Fixes: 69fe05c90ed5 ("fuse: add missing INIT flags")
Cc: -
fuse_flush() calls write_inode_now() that triggers writeback, but actual
writeback will happen later, on fuse_sync_writes(). If an error happens,
fuse_writepage_end() will set error bit in mapping->flags. So, we have to
check mapping->flags after fuse_sync_writes().Signed-off-by: Maxim Patlasov
Signed-off-by: Miklos Szeredi
Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on")
Cc: # v3.15+ -
Due to implementation of fuse writeback filemap_write_and_wait_range() does
not catch errors. We have to do this directly after fuse_sync_writes()Signed-off-by: Alexey Kuznetsov
Signed-off-by: Maxim Patlasov
Signed-off-by: Miklos Szeredi
Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on")
Cc: # v3.15+ -
In kernel bug 150021, a kernel panic was reported when restoring a
hibernate image. Only a picture of the oops was reported, so I can't
paste the whole thing here. But here are the most interesting parts:kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at ffff8804615cfd78
...
RIP: ffff8804615cfd78
RSP: ffff8804615f0000
RBP: ffff8804615cfdc0
...
Call Trace:
do_signal+0x23
exit_to_usermode_loop+0x64
...The RIP is on the same page as RBP, so it apparently started executing
on the stack.The bug was bisected to commit ef0f3ed5a4ac (x86/asm/power: Create
stack frames in hibernate_asm_64.S), which in retrospect seems quite
dangerous, since that code saves and restores the stack pointer from a
global variable ('saved_context').There are a lot of moving parts in the hibernate save and restore paths,
so I don't know exactly what caused the panic. Presumably, a FRAME_END
was executed without the corresponding FRAME_BEGIN, or vice versa. That
would corrupt the return address on the stack and would be consistent
with the details of the above panic.[ rjw: One major problem is that by the time the FRAME_BEGIN in
restore_registers() is executed, the stack pointer value may not
be valid any more. Namely, the stack area pointed to by it
previously may have been overwritten by some image memory contents
and that page frame may now be used for whatever different purpose
it had been allocated for before hibernation. In that case, the
FRAME_BEGIN will corrupt that memory. ]Instead of doing the frame pointer save/restore around the bounds of the
affected functions, just do it around the call to swsusp_save().That has the same effect of ensuring that if swsusp_save() sleeps, the
frame pointers will be correct. It's also a much more obviously safe
way to do it than the original patch. And objtool still doesn't report
any warnings.Fixes: ef0f3ed5a4ac (x86/asm/power: Create stack frames in hibernate_asm_64.S)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=150021
Cc: 4.6+ # 4.6+
Reported-by: Andre Reinke
Tested-by: Andre Reinke
Signed-off-by: Josh Poimboeuf
Acked-by: Ingo Molnar
Signed-off-by: Rafael J. Wysocki -
The empty checking logic is duplicated in ovl_check_empty_and_clear() and
ovl_remove_and_whiteout(), except the condition for clearing whiteouts is
different:ovl_check_empty_and_clear() checked for being upper
ovl_remove_and_whiteout() checked for merge OR lower
Move the intersection of those checks (upper AND merge) into
ovl_check_empty_and_clear() and simplify ovl_remove_and_whiteout().Signed-off-by: Miklos Szeredi
-
Signed-off-by: Al Viro
Signed-off-by: Miklos Szeredi -
To make delete notification work on fa/inotify.
Signed-off-by: Miklos Szeredi
-
This does not work and does not make sense. So instead of fixing it
(probably not hard) just disallow.Reported-by: Andrei Vagin
Signed-off-by: Miklos Szeredi
Cc: -
There's a superfluous newline in the warning message in ovl_d_real().
Signed-off-by: Miklos Szeredi
-
Remove duplicated include.
Signed-off-by: Wei Yongjun
Signed-off-by: Miklos Szeredi -
Right now we remove MAY_WRITE/MAY_APPEND bits from mask if realfile is on
lower/. This is done as files on lower will never be written and will be
copied up. But to copy up a file, mounter should have MAY_READ permission
otherwise copy up will fail. So set MAY_READ in mask when MAY_WRITE is
reset.Dan Walsh noticed this when he did access(lowerfile, W_OK) and it returned
True (context mounts) but when he tried to actually write to file, it
failed as mounter did not have permission on lower file.[SzM] don't set MAY_READ if only MAY_APPEND is set without MAY_WRITE; this
won't trigger a copy-up.Reported-by: Dan Walsh
Signed-off-by: Vivek Goyal
Signed-off-by: Miklos Szeredi -
Right now if file is on lower/, we remove MAY_WRITE/MAY_APPEND bits from
mask as lower/ will never be written and file will be copied up. But this
is not true for special files. These files are not copied up and are opened
in place. So don't dilute the checks for these types of files.Reported-by: Dan Walsh
Signed-off-by: Vivek Goyal
Signed-off-by: Miklos Szeredi -
Setting POSIX ACL needs special handling:
1) Some permission checks are done by ->setxattr() which now uses mounter's
creds ("ovl: do operations on underlying file system in mounter's
context"). These permission checks need to be done with current cred as
well.2) Setting ACL can fail for various reasons. We do not need to copy up in
these cases.In the mean time switch to using generic_setxattr.
[Arnd Bergmann] Fix link error without POSIX ACL. posix_acl_from_xattr()
doesn't have a 'static inline' implementation when CONFIG_FS_POSIX_ACL is
disabled, and I could not come up with an obvious way to do it.This instead avoids the link error by defining two sets of ACL operations
and letting the compiler drop one of the two at compile time depending
on CONFIG_FS_POSIX_ACL. This avoids all references to the ACL code,
also leading to smaller code.Signed-off-by: Miklos Szeredi
-
Inode attributes are copied up to overlay inode (uid, gid, mode, atime,
mtime, ctime) so generic code using these fields works correcty. If a hard
link is created in overlayfs separate inodes are allocated for each link.
If chmod/chown/etc. is performed on one of the links then the inode
belonging to the other ones won't be updated.This patch attempts to fix this by sharing inodes for hard links.
Use inode hash (with real inode pointer as a key) to make sure overlay
inodes are shared for hard links on upper. Hard links on lower are still
split (which is not user observable until the copy-up happens, see
Documentation/filesystems/overlayfs.txt under "Non-standard behavior").The inode is only inserted in the hash if it is non-directoy and upper.
Signed-off-by: Miklos Szeredi
-
To get from overlay inode to real inode we currently use 'struct
ovl_entry', which has lifetime connected to overlay dentry. This is okay,
since each overlay dentry had a new overlay inode allocated.Following patch will break that assumption, so need to leave out ovl_entry.
This patch stores the real inode directly in i_private, with the lowest bit
used to indicate whether the inode is upper or lower.Lifetime rules remain, using ovl_inode_real() must only be done while
caller holds ref on overlay dentry (and hence on real dentry), or within
RCU protected regions.Signed-off-by: Miklos Szeredi
-
The error is due to RCU and is temporary.
Signed-off-by: Miklos Szeredi
-
Fix atime update logic in overlayfs.
This patch adds an i_op->update_time() handler to overlayfs inodes. This
forwards atime updates to the upper layer only. No atime updates are done
on lower layers.Remove implicit atime updates to underlying files and directories with
O_NOATIME. Remove explicit atime update in ovl_readlink().Clear atime related mnt flags from cloned upper mount. This means atime
updates are controlled purely by overlayfs mount options.Reported-by: Konstantin Khlebnikov
Signed-off-by: Miklos Szeredi -
When creating directory in workdir, the group/sgid inheritance from the
parent dir was omitted completely. Fix this by calling inode_init_owner()
on overlay inode and using the resulting uid/gid/mode to create the file.Unfortunately the sgid bit can be stripped off due to umask, so need to
reset the mode in this case in workdir before moving the directory in
place.Reported-by: Eryu Guan
Signed-off-by: Miklos Szeredi -
The fact that we always do permission checking on the overlay inode and
clear MAY_WRITE for checking access to the lower inode allows cruft to be
removed from ovl_permission().1) "default_permissions" option effectively did generic_permission() on the
overlay inode with i_mode, i_uid and i_gid updated from underlying
filesystem. This is what we do by default now. It did the update using
vfs_getattr() but that's only needed if the underlying filesystem can
change (which is not allowed). We may later introduce a "paranoia_mode"
that verifies that mode/uid/gid are not changed.2) splitting out the IS_RDONLY() check from inode_permission() also becomes
unnecessary once we remove the MAY_WRITE from the lower inode check.Signed-off-by: Miklos Szeredi