17 Feb, 2007

31 commits

  • Use mask_ack_irq() where possible.

    Signed-off-by: Jan Beulich
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     
  • Fix kernel-doc warnings in IRQ management.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Never mask interrupts immediately upon request. Disabling interrupts in
    high-performance codepaths is rare, and on the other hand this change could
    recover lost edges (or even other types of lost interrupts) by conservatively
    only masking interrupts after they happen. (NOTE: with this change the
    highlevel irq-disable code still soft-disables this IRQ line - and if such an
    interrupt happens then the IRQ flow handler keeps the IRQ masked.)

    Mark i8259A controllers as 'never loses an edge'.
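
    The delayed masking described above can be modelled in a few lines of
    userspace C. This is only a sketch: the struct, its fields and the model_*
    functions are hypothetical names, not kernel API, and the replay on enable is
    an assumption of the model rather than something this commit states.

```c
#include <stdbool.h>

/* Hypothetical model of a single IRQ line (illustrative names only). */
struct irq_line_model {
    bool soft_disabled; /* set by the high-level disable call */
    bool hw_masked;     /* mask bit in the interrupt controller */
    bool pending;       /* an interrupt arrived while soft-disabled */
    int handled;        /* how many times the handler has run */
};

/* Disabling only marks the line; the hardware is NOT masked yet. */
static void model_disable_irq(struct irq_line_model *l)
{
    l->soft_disabled = true;
}

/* Flow handler: mask lazily, only once an interrupt really arrives. */
static void model_irq_arrives(struct irq_line_model *l)
{
    if (l->hw_masked)
        return; /* masked hardware delivers nothing */
    if (l->soft_disabled) {
        l->hw_masked = true; /* keep the line masked from now on */
        l->pending = true;   /* remember the event instead of losing it */
        return;
    }
    l->handled++;
}

/* Enabling unmasks and replays a remembered event (model assumption). */
static void model_enable_irq(struct irq_line_model *l)
{
    l->soft_disabled = false;
    l->hw_masked = false;
    if (l->pending) {
        l->pending = false;
        l->handled++;
    }
}
```

    In this model a disable/enable pair with no interrupt in between never
    touches the mask bit, which is the cheap common case the commit aims for.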

    Signed-off-by: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Use RCU to avoid the need to acquire tasklist_lock in the single-threaded
    case of clock_gettime(). It still acquires tasklist_lock when called for a
    (potentially multithreaded) process. This change allows realtime
    applications to frequently monitor CPU consumption of individual tasks, as
    requested (and now deployed) by some off-list users.

    This has been in Ingo Molnar's -rt patchset since late 2005 with no
    problems reported, and tests successfully on 2.6.20-rc6, so I believe that
    it is long-since ready for mainline adoption.
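
    For context, a minimal userspace sketch of the kind of monitoring this is
    meant to make cheap: reading the CPU time consumed by a single thread via
    clock_gettime(). The helper name thread_cpu_time_ns() is hypothetical, and
    whether a given call takes the RCU path changed here is not something this
    sketch can show.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/*
 * Return the CPU time consumed so far by the calling thread, in
 * nanoseconds, or -1 if the clock cannot be read.
 */
static int64_t thread_cpu_time_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

    A monitoring loop would call the helper periodically and compare successive
    readings to estimate per-thread CPU usage over the interval.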

    [paulmck@linux.vnet.ibm.com: fix exit()/posix_cpu_clock_get() race spotted by Oleg]
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Oleg Nesterov
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • In preparation for the x86_64 generic time conversion, this patch splits out
    TSC and HPET related code from arch/x86_64/kernel/time.c into respective
    hpet.c and tsc.c files.

    [akpm@osdl.org: fix printk timestamps]
    [akpm@osdl.org: cleanup]
    Signed-off-by: John Stultz
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Andi Kleen
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • Provides generic infrastructure for vsyscall-gtod.

    [akpm@osdl.org: cleanup]
    Signed-off-by: John Stultz
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Andi Kleen
    Cc: Roman Zippel

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • Add /proc/timer_list, which prints all currently pending (high-res) timers,
    all clock-event sources and their parameters in a human-readable form.

    Sample output:

    Timer List Version: v0.1
    HRTIMER_MAX_CLOCK_BASES: 2
    now at 4246046273872 nsecs

    cpu: 0
    clock 0:
    .index: 0
    .resolution: 1 nsecs
    .get_time: ktime_get_real
    .offset: 1273998312645738432 nsecs
    active timers:
    clock 1:
    .index: 1
    .resolution: 1 nsecs
    .get_time: ktime_get
    .offset: 0 nsecs
    active timers:
    #0: , hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0
    # expires at 4246432689566 nsecs [in 386415694 nsecs]
    #1: , hrtimer_wakeup, do_nanosleep, pcscd/2050
    # expires at 4247018194689 nsecs [in 971920817 nsecs]
    #2: , hrtimer_wakeup, do_nanosleep, irqbalance/1909
    # expires at 4247351358392 nsecs [in 1305084520 nsecs]
    #3: , hrtimer_wakeup, do_nanosleep, crond/2157
    # expires at 4249097614968 nsecs [in 3051341096 nsecs]
    #4: , it_real_fn, do_setitimer, syslogd/1888
    # expires at 4251329900926 nsecs [in 5283627054 nsecs]
    .expires_next : 4246432689566 nsecs
    .hres_active : 1
    .check_clocks : 0
    .nr_events : 31306
    .idle_tick : 4246020791890 nsecs
    .tick_stopped : 1
    .idle_jiffies : 986504
    .idle_calls : 40700
    .idle_sleeps : 36014
    .idle_entrytime : 4246019418883 nsecs
    .idle_sleeptime : 4178181972709 nsecs

    cpu: 1
    clock 0:
    .index: 0
    .resolution: 1 nsecs
    .get_time: ktime_get_real
    .offset: 1273998312645738432 nsecs
    active timers:
    clock 1:
    .index: 1
    .resolution: 1 nsecs
    .get_time: ktime_get
    .offset: 0 nsecs
    active timers:
    #0: , hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0
    # expires at 4246050084568 nsecs [in 3810696 nsecs]
    #1: , hrtimer_wakeup, do_nanosleep, atd/2227
    # expires at 4261010635003 nsecs [in 14964361131 nsecs]
    #2: , hrtimer_wakeup, do_nanosleep, smartd/2332
    # expires at 5469485798970 nsecs [in 1223439525098 nsecs]
    .expires_next : 4246050084568 nsecs
    .hres_active : 1
    .check_clocks : 0
    .nr_events : 24043
    .idle_tick : 4246046084568 nsecs
    .tick_stopped : 0
    .idle_jiffies : 986510
    .idle_calls : 26360
    .idle_sleeps : 22551
    .idle_entrytime : 4246043874339 nsecs
    .idle_sleeptime : 4170763761184 nsecs

    tick_broadcast_mask: 00000003
    event_broadcast_mask: 00000001

    CPU#0's local event device:

    Clock Event Device: lapic
    capabilities: 0000000e
    max_delta_ns: 807385544
    min_delta_ns: 1443
    mult: 44624025
    shift: 32
    set_next_event: lapic_next_event
    set_mode: lapic_timer_setup
    event_handler: hrtimer_interrupt
    .installed: 1
    .expires: 4246432689566 nsecs

    CPU#1's local event device:

    Clock Event Device: lapic
    capabilities: 0000000e
    max_delta_ns: 807385544
    min_delta_ns: 1443
    mult: 44624025
    shift: 32
    set_next_event: lapic_next_event
    set_mode: lapic_timer_setup
    event_handler: hrtimer_interrupt
    .installed: 1
    .expires: 4246050084568 nsecs

    Clock Event Device: hpet
    capabilities: 00000007
    max_delta_ns: 2147483647
    min_delta_ns: 3352
    mult: 61496110
    shift: 32
    set_next_event: hpet_next_event
    set_mode: hpet_set_mode
    event_handler: handle_nextevt_broadcast

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Add /proc/timer_stats support: debugging feature to profile timer expiration.
    The starting site, the process/PID and the expiration function are captured.
    This allows the quick identification of timer event sources in a system.

    Sample output:

    # echo 1 > /proc/timer_stats
    # cat /proc/timer_stats
    Timer Stats Version: v0.1
    Sample period: 4.010 s
    24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
    11, 0 swapper sk_reset_timer (tcp_delack_timer)
    6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
    2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
    2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    4, 2050 pcscd do_nanosleep (hrtimer_wakeup)
    5, 4179 sshd sk_reset_timer (tcp_write_timer)
    4, 2248 yum-updatesd schedule_timeout (process_timeout)
    18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
    3, 0 swapper sk_reset_timer (tcp_delack_timer)
    1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer)
    2, 1 swapper e1000_up (e1000_watchdog)
    1, 1 init schedule_timeout (process_timeout)
    100 total events, 25.24 events/sec

    [ cleanups and hrtimers support from Thomas Gleixner ]
    [bunk@stusta.de: nr_entries can become static]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Andi Kleen
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Fix potential setitimer DoS with high-res timers by pushing itimer rearm
    processing to process context.

    [Fixes from: Ingo Molnar]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Implement high resolution timers on top of the hrtimers infrastructure and the
    clockevents / tick-management framework. This provides accurate timers for
    all hrtimer subsystem users.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • With Ingo Molnar

    Add functions to provide dynamic ticks and high resolution timers. The code
    which keeps track of jiffies and handles the long idle periods is shared
    between tick based and high resolution timer based dynticks. The dyntick
    functionality can be disabled on the kernel command line. Also provide the
    infrastructure to support high resolution timers.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • With Ingo Molnar

    Add broadcast functionality, so per cpu clock event devices can be registered
    as dummy devices or switched from/to broadcast on demand. The broadcast
    function distributes the events via the broadcast function of the clock event
    device. This is primarily designed to replace switching the apic timer to /
    from IPI in power states where the apic stops.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • With Ingo Molnar

    The tick-management code is the first user of the clockevents layer. It takes
    clock event devices from the clock events core and uses them to provide the
    periodic tick.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Architectures register their clock event devices in the clock events core.
    Users of the clockevents core can get clock event devices for their use. The
    clockevents core code provides notification mechanisms for various clock
    related management events.

    This allows control of the clock event devices without the architectures
    having to worry about the details of function assignment. This is also a
    prerequisite for high resolution timers and dynamic ticks, allowing the core
    code to control the clock functionality without intrusive changes to the
    architecture code.

    [Fixes-by: Ingo Molnar]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Roman Zippel
    Cc: john stultz
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Reintroduce ktimers feature "optimized away" by the ktimers review process:
    remove the curr_timer pointer from the cpu-base and use the hrtimer state.

    No functional changes.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Roman Zippel
    Cc: john stultz
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Reintroduce ktimers feature "optimized away" by the ktimers review process:
    multiple hrtimer states to enable the running of hrtimers without holding the
    cpu-base-lock.

    (The "optimized" rbtree hack carried only 2 states worth of information and we
    need 4 for high resolution timers and dynamic ticks.)

    No functional changes.

    Build-fixes-from: Andrew Morton
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Roman Zippel
    Cc: john stultz
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Improve kernel/hrtimers.c locking: use a per-CPU base with a lock to control
    locking of all clocks belonging to a CPU. This simplifies code that needs to
    lock all clocks at once. This makes life easier for high-res timers and
    dyntick.

    No functional changes.

    [ optimization change from Andrew Morton ]

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • - hrtimers did not use the hrtimer_restart enum and relied on the implicit
      int representation. Fix the prototypes and the functions using the enums.
    - Use separate name spaces for the enumerations
    - Convert hrtimer_restart macro to inline function
    - Add comments
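
    As an illustration of the first item, here is a small userspace model of a
    timer callback whose return type is the enum rather than a bare int. The
    enumerator names mirror the kernel's hrtimer_restart values; struct
    timer_model, countdown_cb() and run_timer_model() are hypothetical and exist
    only for this sketch.

```c
/* Enumerator names follow the kernel; everything else is a model. */
enum hrtimer_restart {
    HRTIMER_NORESTART, /* timer is not restarted */
    HRTIMER_RESTART    /* timer must be restarted */
};

struct timer_model {
    int remaining; /* expirations the callback still wants */
    enum hrtimer_restart (*function)(struct timer_model *);
};

/* Callback typed with the enum, so returning a stray int is visible. */
static enum hrtimer_restart countdown_cb(struct timer_model *t)
{
    if (t->remaining > 0)
        t->remaining--;
    return t->remaining > 0 ? HRTIMER_RESTART : HRTIMER_NORESTART;
}

/* Fire the timer until the callback declines a restart; count firings. */
static int run_timer_model(struct timer_model *t)
{
    int fired = 0;

    do {
        fired++;
    } while (t->function(t) == HRTIMER_RESTART);
    return fired;
}
```

    With the prototype spelled out this way, the compiler and readers both see
    which of the two outcomes a callback intends.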

    No functional changes.

    [akpm@osdl.org: fix input driver]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Dmitry Torokhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • For CONFIG_NO_HZ we need to calculate the next timer wheel event based on a
    given jiffies value. Extend the existing code to allow the extra 'now'
    argument. Provide a compatibility function for the existing implementations to
    call the function with now == jiffies. (This also solves the raciness of the
    original code vs. jiffies changing during the iteration.)

    No functional changes to existing users of this infrastructure.

    [ remove WARN_ON() that triggered on s390, by Carsten Otte ]
    [ made new helper static, Adrian Bunk ]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • When searching for the next pending timer in the timer wheel we need to take
    the cascade into account. The current code has several problems:

    1. it looks into the previous cascade
    2. it ignores a pending cascade
    3. it ignores multiple cascades

    Change the cascade lookup so that it calculates the array index from the point
    of the next cascade and always looks at the cascade buckets when the cascade
    is pending, i.e. gets executed in the next timer softirq. When multiple
    cascades are pending, look up the next buckets too.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Uninline irq_enter(). [dynticks adds more stuff to it]

    No functional changes.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • The TSC needs to be verified against another clocksource. Instead of using
    hardwired assumptions of available hardware, provide a generic verification
    mechanism. The verification uses the best available clocksource and handles
    the usability for high resolution timers / dynticks of the clocksource which
    needs to be verified.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • The clocksource code allows direct updates of the rating of a given
    clocksource now. Change TSC unstable tracking to use this interface and
    remove the update callback.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Using a flag field allows more than one piece of information to be encoded in
    a single variable.
    Preparatory patch for the generic clocksource verification.
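
    A minimal sketch of the idea, assuming two independent properties packed
    into one unsigned field. The flag names and the source_* helpers are
    hypothetical, chosen for illustration rather than taken from the clocksource
    code.

```c
/* Hypothetical flag bits; one bit per independent property. */
#define SRC_FLAG_CONTINUOUS  (1u << 0)
#define SRC_FLAG_MUST_VERIFY (1u << 1)

struct source_model {
    const char *name;
    unsigned int flags; /* replaces several separate boolean members */
};

/* Set every bit in mask. */
static void source_set(struct source_model *s, unsigned int mask)
{
    s->flags |= mask;
}

/* Clear every bit in mask. */
static void source_clear(struct source_model *s, unsigned int mask)
{
    s->flags &= ~mask;
}

/* Return 1 only if all bits in mask are set. */
static int source_has(const struct source_model *s, unsigned int mask)
{
    return (s->flags & mask) == mask;
}
```

    Adding a further property later then costs one more bit instead of one more
    struct member.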

    [mingo@elte.hu: convert vmitime.c to the new clocksource flag]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Enqueue clocksources in rating order to make selection of the clocksource
    easier. Also check the match with a user override at enqueue time.

    Preparatory patch for the generic clocksource verification.
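
    The enqueue scheme can be sketched as a sorted insert into a singly linked
    list, with the override name compared as each entry is added. All names
    here (cs_model, cs_list, cs_enqueue, cs_select) and the ratings used in any
    example are hypothetical, not the kernel's.

```c
#include <stddef.h>
#include <string.h>

struct cs_model {
    const char *name;
    int rating; /* higher means preferred */
    struct cs_model *next;
};

struct cs_list {
    struct cs_model *head;     /* kept sorted, best rating first */
    const char *override_name; /* user request, or NULL */
    struct cs_model *override; /* matching entry once it is enqueued */
};

/* Insert in descending rating order; check the override while enqueuing. */
static void cs_enqueue(struct cs_list *l, struct cs_model *cs)
{
    struct cs_model **pos = &l->head;

    while (*pos && (*pos)->rating >= cs->rating)
        pos = &(*pos)->next;
    cs->next = *pos;
    *pos = cs;

    if (l->override_name && strcmp(cs->name, l->override_name) == 0)
        l->override = cs;
}

/* Selection is trivial: the override if present, else the list head. */
static struct cs_model *cs_select(const struct cs_list *l)
{
    return l->override ? l->override : l->head;
}
```

    Because the list is always ordered, picking the best source never requires
    a scan, which is the simplification the commit message refers to.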

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Persistent clock support: do proper timekeeping across suspend/resume.

    [bunk@stusta.de: cleanup]
    Signed-off-by: John Stultz
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Roman Zippel
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Stultz
     
  • Fix multiple conversion bugs in msecs_to_jiffies().

    The main problem is that this condition:

    if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET))

    overflows if HZ is smaller than 1000!

    This change is user-visible: without it, for HZ=250 a SUS-compliant poll()
    timeout value of -20 is mistakenly converted to 'immediate timeout'.

    (The new dyntick code also triggered this, that's how we noticed.)
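
    The overflow in the bound can be reproduced with fixed-width arithmetic. The
    sketch below assumes HZ=250 and models MAX_JIFFY_OFFSET as
    ((~0UL >> 1) - 1) on a 32-bit unsigned long; the MODEL_* macros and model_*
    functions are hypothetical and are not the kernel's code or its fix.

```c
#include <stdint.h>

#define MODEL_HZ 250u
#define MODEL_MSEC_PER_JIFFY (1000u / MODEL_HZ)         /* 4 ms per jiffy */
#define MODEL_MAX_JIFFY_OFFSET ((UINT32_MAX >> 1) - 1u) /* 0x7FFFFFFE */

/* 32-bit conversion: the product wraps modulo 2^32. */
static uint32_t model_jiffies_to_msecs32(uint32_t j)
{
    return MODEL_MSEC_PER_JIFFY * j;
}

/* Same conversion in 64-bit arithmetic: the true value, no wrap. */
static uint64_t model_jiffies_to_msecs64(uint32_t j)
{
    return (uint64_t)MODEL_MSEC_PER_JIFFY * j;
}

/* Return 1 when the bound no longer fits into 32 bits. */
static int model_bound_overflows(void)
{
    return model_jiffies_to_msecs64(MODEL_MAX_JIFFY_OFFSET) > UINT32_MAX;
}
```

    In this model the true bound is 8589934584 ms, while the 32-bit result wraps
    to 0xFFFFFFF8, so a comparison against it no longer means what the code
    intends. With a multiplier of 1 (HZ=1000) the product would still fit.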

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • There are loads of fat functions hidden in jiffies.h. Uninline them. No code
    changes.

    [jeremy@goop.org: export fix]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Disentangle the NTP update from HZ. This is necessary for dynamic tick enabled
    kernels.

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • Provide functions to:
    - check whether the affinity of an interrupt can be set
    - pin the interrupt to a given cpu

    Necessary for the ability to set up clocksources more flexibly (e.g. use the
    different HPET channels per CPU).

    [akpm@osdl.org: alpha build fix]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Add a flag so we can prevent the irq balancing of an interrupt. Move the
    bits, so we have room for more :)

    Necessary for the ability to set up clocksources more flexibly (e.g. use the
    different HPET channels per CPU).

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

15 Feb, 2007

9 commits

  • * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (94 commits)
    [PATCH] x86-64: Remove mk_pte_phys()
    [PATCH] i386: Fix broken CONFIG_COMPAT_VDSO on i386
    [PATCH] i386: fix 32-bit ioctls on x64_32
    [PATCH] x86: Unify pcspeaker platform device code between i386/x86-64
    [PATCH] i386: Remove extern declaration from mm/discontig.c, put in header.
    [PATCH] i386: Rename cpu_gdt_descr and remove extern declaration from smpboot.c
    [PATCH] i386: Move mce_disabled to asm/mce.h
    [PATCH] i386: paravirt unhandled fallthrough
    [PATCH] x86_64: Wire up compat epoll_pwait
    [PATCH] x86: Don't require the vDSO for handling a.out signals
    [PATCH] i386: Fix Cyrix MediaGX detection
    [PATCH] i386: Fix warning in cpu initialization
    [PATCH] i386: Fix warning in microcode.c
    [PATCH] x86: Enable NMI watchdog for AMD Family 0x10 CPUs
    [PATCH] x86: Add new CPUID bits for AMD Family 10 CPUs in /proc/cpuinfo
    [PATCH] i386: Remove fastcall in paravirt.[ch]
    [PATCH] x86-64: Fix wrong gcc check in bitops.h
    [PATCH] x86-64: survive having no irq mapping for a vector
    [PATCH] i386: geode configuration fixes
    [PATCH] i386: add option to show more code in oops reports
    ...

    Linus Torvalds
     
  • Add a parent entry into the ctl_table so you can walk the list of parents and
    find the entire path to a ctl_table entry.
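
    Walking the parent links to recover a full path might look like the sketch
    below. struct table_model and table_path() are hypothetical stand-ins for
    illustration; only the idea of a parent pointer comes from the commit.

```c
#include <stddef.h>
#include <stdio.h>

struct table_model {
    const char *procname;
    const struct table_model *parent; /* NULL for a top-level entry */
};

/*
 * Write the '/'-separated path from the top-level entry down to t into
 * buf. Returns the path length, or -1 if it does not fit in len bytes.
 */
static int table_path(const struct table_model *t, char *buf, size_t len)
{
    int used = 0;
    int n;

    if (t->parent) {
        /* Emit the ancestors first so the path reads root to leaf. */
        used = table_path(t->parent, buf, len);
        if (used < 0)
            return -1;
    }
    n = snprintf(buf + used, len - (size_t)used, "%s%s",
                 used ? "/" : "", t->procname);
    if (n < 0 || (size_t)n >= len - (size_t)used)
        return -1;
    return used + n;
}
```

    Without the parent pointer the same lookup would need a search from the
    root of every registered table.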

    Signed-off-by: Eric W. Biederman
    Cc: Stephen Smalley
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • With this change the sysctl inodes can be cached and nothing needs to be done
    when removing a sysctl table.

    For a cost of 2K code we will save about 4K of static tables (when we remove
    de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or
    about half that on a 32bit arch.

    The speed feels about the same, even though we can now cache the sysctl
    dentries :(

    We get the core advantage that we don't need to have a 1 to 1 mapping between
    ctl table entries and proc files, making it possible to have /proc/sys vary
    depending on the namespace you are in. The currently merged namespaces don't
    have an issue here, but the network namespace under /proc/sys/net needs to have
    different directories depending on which network adapters are visible. Because
    /proc/sys is now simply a cache, making different directories visible depending
    on who you are is trivial to implement.

    [akpm@osdl.org: fix uninitialised var]
    [akpm@osdl.org: fix ARM build]
    [bunk@stusta.de: make things static]
    Signed-off-by: Eric W. Biederman
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The current logic to walk through the list of sysctl table headers is slightly
    painful and implemented in a way that cannot be used by code outside sysctl.c.

    I am in the process of implementing a version of the sysctl proc support that
    instead of using the proc generic non-caching monster, just uses the existing
    sysctl data structure as backing store for building the dcache entries and for
    doing directory reads. To use the existing data structures however I need a
    way to get at them.

    [akpm@osdl.org: warning fix]
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The semantic effect of insert_at_head is that it would allow newly registered
    sysctl entries to override existing sysctl entries of the same name, which is
    a pain for caching and which the proc interface never implemented.

    I have done an audit and discovered that none of the current users of
    register_sysctl care as (except for directories) they do not register
    duplicate sysctl entries.

    So this patch simply removes the support for overriding existing entries in
    the sys_sysctl interface since no one uses it or cares and it makes future
    enhancements harder.

    Signed-off-by: Eric W. Biederman
    Acked-by: Ralf Baechle
    Acked-by: Martin Schwidefsky
    Cc: Russell King
    Cc: David Howells
    Cc: "Luck, Tony"
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Andi Kleen
    Cc: Jens Axboe
    Cc: Corey Minyard
    Cc: Neil Brown
    Cc: "John W. Linville"
    Cc: James Bottomley
    Cc: Jan Kara
    Cc: Trond Myklebust
    Cc: Mark Fasheh
    Cc: David Chinner
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • parse_table has support for calling a strategy routine when descending into a
    directory. To date no one has used this functionality and the /proc/sys
    interface has no analog to it.

    Since no one is using this functionality, kill it and make the binary sysctl
    code easier to follow.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • There are currently no users in the kernel for CTL_ANY and it only has effect
    on the binary interface which is practically unused.

    It complicates sysctl lookups for no good reason, so just remove it.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • binfmt_misc has a mount point in the middle of the sysctl and that mount point
    is created as a proc_generic directory.

    Doing it that way gets in the way of cleaning up the sysctl proc support as it
    continues the existence of a horrible hack. So instead simply create the
    directory as an ordinary sysctl directory. At least that removes the magic
    special case.

    [akpm@osdl.org: warning fix]
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman