05 Jan, 2016

1 commit


07 Nov, 2015

1 commit

  • …d avoiding waking kswapd

    __GFP_WAIT has been used to identify atomic context in callers that hold
    spinlocks or are in interrupts. They are expected to be high priority and
    have access to one of two watermarks lower than "min" which can be referred
    to as the "atomic reserve". __GFP_HIGH users get access to the first
    lower watermark and can be called the "high priority reserve".

    Over time, callers had a requirement to not block when fallback options
    were available. Some have abused __GFP_WAIT leading to a situation where
    an optimistic allocation with a fallback option can access atomic
    reserves.

    This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
    cannot sleep and have no alternative. High priority users continue to use
    __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
    are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM identifies
    callers that want to wake kswapd for background reclaim. __GFP_WAIT is
    redefined as a caller that is willing to enter direct reclaim and wake
    kswapd for background reclaim.

    This patch then converts a number of sites:

    o __GFP_ATOMIC is used by callers that are high priority and have memory
    pools for those requests. GFP_ATOMIC uses this flag.

    o Callers that have a limited mempool to guarantee forward progress clear
    __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
    into this category where kswapd will still be woken but atomic reserves
    are not used as there is a one-entry mempool to guarantee progress.

    o Callers that are checking if they are non-blocking should use the
    helper gfpflags_allow_blocking() where possible. This is because
    checking for __GFP_WAIT as was done historically now can trigger false
    positives. Some exceptions like dm-crypt.c exist where the code intent
    is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
    flag manipulations.

    o Callers that built their own GFP flags instead of starting with GFP_KERNEL
    and friends now also need to specify __GFP_KSWAPD_RECLAIM.

    The first key hazard to watch out for is callers that removed __GFP_WAIT
    and were depending on access to atomic reserves for inconspicuous reasons.
    In some cases it may be appropriate for them to use __GFP_HIGH.

    The second key hazard is callers that assembled their own combination of
    GFP flags instead of starting with something like GFP_KERNEL. They may
    now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
    if it's missed in most cases as other activity will wake kswapd.
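
    The flag split described above can be sketched in userspace C. This is
    only a model: the DEMO_ names and bit values are assumptions made for
    illustration and do not match the kernel's definitions, and the
    composition of the atomic and mempool requests is inferred from the
    text above. Only the relationships between the flags are taken from
    the commit message.

```c
#include <stdbool.h>

typedef unsigned int demo_gfp_t;

/* Hypothetical bit values chosen for this sketch; the kernel's differ. */
#define DEMO_GFP_HIGH           0x1u /* models __GFP_HIGH */
#define DEMO_GFP_ATOMIC         0x2u /* models __GFP_ATOMIC */
#define DEMO_GFP_DIRECT_RECLAIM 0x4u /* models __GFP_DIRECT_RECLAIM */
#define DEMO_GFP_KSWAPD_RECLAIM 0x8u /* models __GFP_KSWAPD_RECLAIM */

/* __GFP_WAIT is redefined as direct reclaim plus waking kswapd. */
#define DEMO_GFP_WAIT (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)

/* Assumed shape of an atomic request: high priority, truly atomic,
 * may wake kswapd, never enters direct reclaim. */
#define DEMO_ATOMIC_REQUEST \
    (DEMO_GFP_HIGH | DEMO_GFP_ATOMIC | DEMO_GFP_KSWAPD_RECLAIM)

/* Mempool-backed caller: clears direct reclaim, keeps the kswapd wakeup. */
#define DEMO_MEMPOOL_REQUEST (DEMO_GFP_WAIT & ~DEMO_GFP_DIRECT_RECLAIM)

/* Models gfpflags_allow_blocking(): only direct reclaim implies sleeping. */
static bool demo_gfpflags_allow_blocking(demo_gfp_t flags)
{
    return (flags & DEMO_GFP_DIRECT_RECLAIM) != 0;
}

/* The historical test; it now fires for callers that only wake kswapd. */
static bool demo_old_wait_check(demo_gfp_t flags)
{
    return (flags & DEMO_GFP_WAIT) != 0;
}
```

    In this model the mempool-backed request passes the old __GFP_WAIT
    style test even though it cannot block, which is the false positive
    the helper is meant to avoid.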

    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Vitaly Wool <vitalywool@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Mel Gorman
     

27 Nov, 2014

1 commit

  • The struct cn_msg len field comes from userspace and needs to be
    validated. It is more logical to do so here, where the cn_msg pointer
    is pulled out of the sk_buff, than in the callback, which is passed a
    cn_msg * and might assume no validation is needed.
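
    A minimal userspace sketch of this kind of check follows. The struct
    demo_cn_msg layout and the helper name are assumptions for
    illustration, not the kernel's definitions or the code added by this
    commit; the point is only that the claimed payload length is compared
    against what was actually received before anything reads the payload.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct cn_msg; the layout is an assumption. */
struct demo_cn_msg {
    uint32_t idx;
    uint32_t val;
    uint32_t seq;
    uint32_t ack;
    uint16_t len;   /* payload length claimed by the sender */
    uint16_t flags;
    /* payload bytes follow the header */
};

/* Return true only if the header and the claimed payload both fit
 * inside the buf_len bytes that were actually received. */
static bool demo_cn_msg_len_ok(const void *buf, size_t buf_len)
{
    struct demo_cn_msg hdr;

    if (buf == NULL || buf_len < sizeof(hdr))
        return false;

    /* Copy the header out so the check does not depend on alignment. */
    memcpy(&hdr, buf, sizeof(hdr));
    return hdr.len <= buf_len - sizeof(hdr);
}
```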

    Reported-by: Dan Carpenter
    Acked-by: Evgeniy Polyakov
    Signed-off-by: David Fries
    Signed-off-by: Greg Kroah-Hartman

    David Fries
     

28 May, 2014

1 commit

  • This increases the amount of bundling to reduce the number of packets
    sent. For one wire use there can be multiple struct w1_netlink_cmd in
    a struct w1_netlink_msg and multiple of those in a struct cn_msg, and
    with this change there can be multiple struct cn_msg in a struct
    nlmsghdr. At each level the len field indicates whether multiple
    instances of the next level are present.
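
    The length-driven nesting can be illustrated with a generic walker.
    The two-byte struct demo_hdr below is a hypothetical header invented
    for this sketch; the real w1 and connector headers carry more fields.
    The same loop shape applies at each level: consume records while the
    parent's len still has room for another complete one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical minimal header: a payload length, then the payload. */
struct demo_hdr {
    uint16_t len;
};

/* Count the consecutive records that fit entirely in buf[0..buf_len). */
static size_t demo_count_records(const unsigned char *buf, size_t buf_len)
{
    size_t count = 0;

    while (buf_len >= sizeof(struct demo_hdr)) {
        struct demo_hdr hdr;
        size_t total;

        memcpy(&hdr, buf, sizeof(hdr));
        total = sizeof(hdr) + hdr.len;
        if (total > buf_len)
            break;          /* truncated record: stop here */

        buf += total;
        buf_len -= total;
        count++;
    }
    return count;
}
```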

    Signed-off-by: David Fries
    Acked-by: Evgeniy Polyakov
    Signed-off-by: Greg Kroah-Hartman

    David Fries
     

03 Apr, 2014

1 commit

  • Pull networking updates from David Miller:
    "Here is my initial pull request for the networking subsystem during
    this merge window:

    1) Support for ESN in AH (RFC 4302) from Fan Du.

    2) Add full kernel doc for ethtool command structures, from Ben
    Hutchings.

    3) Add BCM7xxx PHY driver, from Florian Fainelli.

    4) Export computed TCP rate information in netlink socket dumps, from
    Eric Dumazet.

    5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
    Dichtel.

    6) Convert many drivers to pci_enable_msix_range(), from Alexander
    Gordeev.

    7) Record SKB timestamps more efficiently, from Eric Dumazet.

    8) Switch to microsecond resolution for TCP round trip times, also
    from Eric Dumazet.

    9) Clean up and fix 6lowpan fragmentation handling by making use of
    the existing inet_frag api for its implementation.

    10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.

    11) Auto size SKB lengths when composing netlink messages based upon
    past message sizes used, from Eric Dumazet.

    12) qdisc dumps can take a long time, add a cond_resched(), from Eric
    Dumazet.

    13) Sanitize netpoll core and drivers wrt. SKB handling semantics.
    Get rid of never-used-in-tree netpoll RX handling. From Eric W
    Biederman.

    14) Support inter-address-family and namespace changing in VTI tunnel
    driver(s). From Steffen Klassert.

    15) Add Altera TSE driver, from Vince Bridgers.

    16) Optimizing csum_replace2() so that it doesn't adjust the checksum
    by checksumming the entire header, from Eric Dumazet.

    17) Expand BPF internal implementation for faster interpreting, more
    direct translations into JIT'd code, and much cleaner uses of BPF
    filtering in non-socket contexts. From Daniel Borkmann and Alexei
    Starovoitov"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
    netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
    net: Add a test to see if a skb is freeable in irq context
    qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
    net: ptp: move PTP classifier in its own file
    net: sxgbe: make "core_ops" static
    net: sxgbe: fix logical vs bitwise operation
    net: sxgbe: sxgbe_mdio_register() frees the bus
    Call efx_set_channels() before efx->type->dimension_resources()
    xen-netback: disable rogue vif in kthread context
    net/mlx4: Set proper build dependancy with vxlan
    be2net: fix build dependency on VxLAN
    mac802154: make csma/cca parameters per-wpan
    mac802154: allow only one WPAN to be up at any given time
    net: filter: minor: fix kdoc in __sk_run_filter
    netlink: don't compare the nul-termination in nla_strcmp
    can: c_can: Avoid led toggling for every packet.
    can: c_can: Simplify TX interrupt cleanup
    can: c_can: Store dlc private
    can: c_can: Reduce register access
    can: c_can: Make the code readable
    ...

    Linus Torvalds
     

04 Mar, 2014

1 commit


08 Feb, 2014

1 commit


03 Oct, 2013

2 commits


29 Mar, 2013

1 commit


19 Feb, 2013

2 commits

  • proc_net_remove is only used to remove proc entries
    that are under /proc/net; it is not a general function for
    removing proc entries of netns. If we want to remove
    some proc entries which are under /proc/net/stat/, we still
    need to call remove_proc_entry.

    This patch uses remove_proc_entry to replace proc_net_remove.
    We can remove proc_net_remove after this patch.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • Right now, some modules such as bonding use proc_create
    to create proc entries under /proc/net/, and other modules
    such as ipv4 use proc_net_fops_create.

    This looks a little chaotic. This patch changes all of
    proc_net_fops_create to proc_create. We can remove
    proc_net_fops_create after this patch.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     

04 Jan, 2013

1 commit

  • CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
    markings need to be removed.

    This change removes the use of __devinit, __devexit_p, __devinitdata,
    __devinitconst, and __devexit from these drivers.

    Based on patches originally written by Bill Pemberton, but redone by me
    in order to handle some of the coding style issues better, by hand.

    Cc: Bill Pemberton
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

09 Sep, 2012

1 commit


17 Jul, 2012

1 commit


30 Jun, 2012

1 commit

  • This patch adds the following structure:

    struct netlink_kernel_cfg {
            unsigned int groups;
            void (*input)(struct sk_buff *skb);
            struct mutex *cb_mutex;
    };

    That can be passed to netlink_kernel_create to set optional configurations
    for netlink kernel sockets.

    I've populated this structure by looking for NULL and zero parameters in the
    existing code. The remaining parameters that always need to be set are still
    left in the original interface.

    That includes optional parameters for the netlink socket creation. This allows
    easy extensibility of this interface in the future.

    This patch also adapts all callers to use this new interface.

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     

27 Jun, 2012

1 commit


08 Jun, 2011

1 commit


13 Apr, 2011

1 commit

  • When a skb is delivered to a registered callback, cn_call_callback()
    incorrectly returns -ENODEV after freeing the skb, causing cn_rx_skb()
    to free the skb a second time.

    Reported-by: Eric B Munson
    Signed-off-by: Patrick McHardy
    Tested-by: Eric B Munson
    Signed-off-by: David S. Miller

    Patrick McHardy
     

31 Mar, 2011

1 commit

  • Commits 01a16b21 (netlink: kill eff_cap from struct netlink_skb_parms)
    and c53fa1ed (netlink: kill loginuid/sessionid/sid members from struct
    netlink_skb_parms) removed some members from struct netlink_skb_parms
    that depend on the current context. As a result, all netlink users are
    now required to do synchronous message processing.

    connector however queues received messages and processes them in a work
    queue, which is not valid anymore. This patch converts connector to do
    synchronous message processing by invoking the registered callback handler
    directly from the netlink receive function.

    In order to avoid invoking the callback with connector locks held, a
    reference count is added to struct cn_callback_entry. The reference
    is taken when finding a matching callback entry on the device's queue_list
    and released after the callback handler has been invoked.
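
    The locking pattern can be sketched in single-threaded userspace C.
    Everything here is a hypothetical model: the demo_ names, the boolean
    that stands in for the queue lock, and the freed flag that stands in
    for releasing the entry are not the connector's actual code. The
    sketch only shows the order of operations: take the reference under
    the lock, drop the lock, invoke the callback, then release the
    reference.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_callback_entry {
    int refcnt;                     /* models the new reference count */
    unsigned int id;
    void (*callback)(void *data);
    bool freed;                     /* set where real code would free */
};

struct demo_queue {
    bool locked;                    /* stands in for the queue lock */
    struct demo_callback_entry *entries;
    size_t count;
};

/* Probe used to observe whether the lock is held during the callback. */
struct demo_probe {
    const struct demo_queue *q;
    bool ran;
    bool lock_was_held;
};

static void demo_probe_callback(void *data)
{
    struct demo_probe *p = data;

    p->ran = true;
    p->lock_was_held = p->q->locked;
}

static void demo_entry_put(struct demo_callback_entry *e)
{
    if (--e->refcnt == 0)
        e->freed = true;
}

/* Look up the entry and pin it under the lock, then call it unlocked. */
static int demo_call_callback(struct demo_queue *q, unsigned int id,
                              void *data)
{
    struct demo_callback_entry *found = NULL;
    size_t i;

    q->locked = true;
    for (i = 0; i < q->count; i++) {
        if (q->entries[i].id == id) {
            found = &q->entries[i];
            found->refcnt++;        /* reference taken while locked */
            break;
        }
    }
    q->locked = false;

    if (found == NULL)
        return -1;

    found->callback(data);          /* no lock held here */
    demo_entry_put(found);          /* reference released afterwards */
    return 0;
}
```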

    Signed-off-by: Patrick McHardy
    Acked-by: Evgeniy Polyakov
    Signed-off-by: David S. Miller

    Patrick McHardy
     

24 Feb, 2011

1 commit


11 Dec, 2010

1 commit

  • Since connector can be built as a module and uses a netlink socket
    to communicate, the module should have an alias so that it is autoloaded
    when a socket of NETLINK_CONNECTOR type is requested.

    Signed-off-by: Stephen Hemminger
    Acked-by: Evgeniy Polyakov
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

25 Oct, 2010

1 commit

  • Commit 1a5645bc (connector: create connector workqueue only while
    needed once) implements lazy workqueue creation for connector
    workqueue. With cmwq now in place, lazy workqueue creation doesn't
    make much sense while adding a lot of complexity. Remove it and
    allocate an ordered workqueue during initialization.

    This also removes a call to flush_scheduled_work() which is deprecated
    and scheduled to be removed.

    Signed-off-by: Tejun Heo
    Cc: Frederic Weisbecker
    Signed-off-by: David S. Miller

    Tejun Heo
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the following
    script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

03 Feb, 2010

1 commit

  • On Tue, Feb 02, 2010 at 02:57:14PM -0800, Greg KH (gregkh@suse.de) wrote:
    > > There are at least two ways to fix it: using a big cannon and a small
    > > one. The former way is to disable notification registration, since it is
    > > not used by anyone at all. Second way is to check whether calling
    > > process is root and its destination group is -1 (kind of privileged
    > > one) before command is dispatched to workqueue.
    >
    > Well if no one is using it, removing it makes the most sense, right?
    >
    > No objection from me, care to make up a patch either way for this?

    Given that it is not used, let's drop support for notifications about
    (un)registered events from connector.
    Another option was to check credentials on receipt. We can always
    restore the feature without bugs if needed, but genetlink has a wider
    code base and no one has complained that userspace cannot get
    notifications when other clients are (un)registered.

    Kudos to Sebastian Krahmer, who found a bug in the code.

    Signed-off-by: Evgeniy Polyakov
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Evgeniy Polyakov
     

03 Oct, 2009

3 commits


24 Jul, 2009

1 commit


22 Jul, 2009

1 commit


18 Jul, 2009

1 commit

  • The connector documentation states that the argument to the callback
    function is always a pointer to a struct cn_msg, but rather than encode it
    in the API itself, it uses a void pointer everywhere. It doesn't make
    much sense to encode the pointer type only in documentation, as that
    prevents proper C type checking from occurring and can easily allow people
    to use the wrong pointer type. So convert the argument type to an explicit
    struct cn_msg pointer.
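
    The benefit of the typed signature can be shown with a small
    userspace sketch. The struct demo_cn_msg stand-in, the typedef and the
    dispatch helper are hypothetical names used only for illustration and
    are not the connector's API.

```c
#include <stdint.h>

/* Simplified stand-in for struct cn_msg; the fields are an assumption. */
struct demo_cn_msg {
    uint32_t seq;
    uint16_t len;
};

/* With a void * parameter, any pointer would be accepted silently.
 * With the explicit type, the compiler can diagnose a handler that
 * expects some other structure. */
typedef void (*demo_cn_callback_t)(struct demo_cn_msg *msg);

static uint32_t demo_last_seq;

/* A handler that can use the message fields without a cast. */
static void demo_handler(struct demo_cn_msg *msg)
{
    demo_last_seq = msg->seq;
}

static void demo_dispatch(demo_cn_callback_t cb, struct demo_cn_msg *msg)
{
    cb(msg);
}
```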

    Signed-off-by: Mike Frysinger
    Signed-off-by: David S. Miller

    Mike Frysinger
     

03 Feb, 2009

1 commit

  • The netlink connector uses its own workqueue to relay the data sent
    from userspace to the appropriate callback. If you launch the test
    from Documentation/connector and change it a bit to send a high flow
    of data, you will see thousands of events coming to the "cqueue"
    workqueue by looking at the workqueue tracer.

    This flow of events can be sent very quickly. So, to not encumber the
    kevent workqueue and delay other jobs, the "cqueue" workqueue should
    remain.

    But this workqueue is pointless most of the time: it will always be
    created (assuming you have built it of course) although only
    developers with specific needs will use it.

    So to avoid this "most of the time useless task", this patch proposes to
    create this workqueue only once, when it is needed. The first jobs to be
    sent to connector callbacks will be sent to kevent while the "cqueue"
    thread creation will be scheduled to kevent too.

    The following jobs will continue to be scheduled to keventd until the
    cqueue workqueue is created, and then the rest of the jobs will
    continue to perform as usual, through this dedicated workqueue.

    Each time I tested this patch, only the first event was sent to
    keventd; the rest were sent to cqueue, which was created
    quickly.

    Also, this patch fixes some trailing whitespaces on the connector files.

    Signed-off-by: Frederic Weisbecker
    Acked-by: Evgeniy Polyakov
    Signed-off-by: David S. Miller

    Frederic Weisbecker
     

28 Jun, 2008

1 commit

  • I ran into a problem when I wanted to check if the kernel supports the
    process event connector, and it seems there's no way to do this check.

    At best I can check if the kernel supports connector or not, by looking
    into /proc/net/netlink, or maybe checking the return value of bind() to
    see if it's ENOENT.

    So it would be useful to add /proc/net/connector to list all supported
    connectors:
    # cat /proc/net/connector
    Name ID
    connector 4294967295:4294967295
    cn_proc 1:1
    w1 3:1
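
    A userspace tool could read such a listing one line at a time. The
    parser below is a sketch that assumes each entry is a name followed by
    two unsigned numbers separated by a colon, as in the sample output
    above; the struct and function names are hypothetical.

```c
#include <stdio.h>

struct demo_connector_entry {
    char name[32];
    unsigned int idx;
    unsigned int val;
};

/* Parse one "name idx:val" line. Returns 0 on success and -1 if the
 * line does not match, which is also the case for the header line. */
static int demo_parse_connector_line(const char *line,
                                     struct demo_connector_entry *out)
{
    if (sscanf(line, "%31s %u:%u", out->name, &out->idx, &out->val) != 3)
        return -1;
    return 0;
}
```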

    Changelog:
    - fix memory leak: s/seq_release/single_release
    - use spin_lock_bh instead of spin_lock_irqsave

    Signed-off-by: Li Zefan
    Acked-by: Evgeniy Polyakov
    Signed-off-by: David S. Miller

    Li Zefan
     

27 Feb, 2008

1 commit


29 Jan, 2008

3 commits


04 Jan, 2008

1 commit


31 Oct, 2007

1 commit

  • Remove a spurious call to kfree_skb() in the connector rx_skb handler.

    This fixes a regression introduced by the '[NET]: make netlink user ->
    kernel interface synchronious' patch (cd40b7d3983c708aabe3d3008ec64ffce56d33b0)

    Signed-off-by: Michal Januszewski
    Signed-off-by: David S. Miller

    Michal Januszewski
     

11 Oct, 2007

1 commit

  • This patch makes processing of netlink user -> kernel messages synchronous.
    This change was inspired by a talk with Alexey Kuznetsov about current
    netlink message processing. He says that he was badly wrong when he
    introduced asynchronous user -> kernel communication.

    The netlink_unicast call is the only path to send a message to the kernel
    netlink socket. But, unfortunately, it is also used to send data to the
    user.

    Before this change the user message was attached to the socket queue
    and sk->sk_data_ready was called. The process was blocked until all
    pending messages were processed. The bad thing is that this processing
    may occur in an arbitrary process context.

    This patch changes nlk->data_ready callback to get 1 skb and force packet
    processing right in the netlink_unicast.

    Kernel -> user path in netlink_unicast remains untouched.

    EINTR processing in netlink_run_queue was changed. It forces rtnl_lock
    drop, but the process remains in the cycle until the message is fully
    processed. So there is no need to use these kludges now.

    Signed-off-by: Denis V. Lunev
    Acked-by: Alexey Kuznetsov
    Signed-off-by: David S. Miller

    Denis V. Lunev