04 May, 2016

1 commit

  • In the sendmsg function of UDP, raw, ICMP and l2tp sockets, we use local
    variables like hlimits, tclass, opt and dontfrag and pass them to corresponding
    functions like ip6_make_skb, ip6_append_data and xxx_push_pending_frames.
    This is not a good practice and makes it hard to add new parameters.
    This fix introduces a new struct ipcm6_cookie, similar to ipcm_cookie in
    ipv4, that includes the above-mentioned variables, and we pass only a
    pointer to this structure to the corresponding functions. This makes it
    easier to add new parameters in the future and makes the functions cleaner.

    Signed-off-by: Wei Wang
    Signed-off-by: David S. Miller

    Wei Wang
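
The refactoring described in this commit can be pictured with a small userspace sketch. The names `cookie6`, `cookie6_init` and `effective_hlimit` are hypothetical stand-ins, not the kernel's `ipcm6_cookie` or its helpers, and the "-1 means unset" convention is an assumption made for illustration. Only the four settings named in the commit message are modeled.

```c
#include <stddef.h>

/* Hypothetical bundle of per-call send settings; the fields mirror the
 * local variables listed in the commit message. */
struct cookie6 {
    int hlimit;      /* hop limit, -1 = not set by the caller (assumed) */
    int tclass;      /* traffic class, -1 = not set by the caller (assumed) */
    int dontfrag;    /* don't-fragment setting, -1 = not set (assumed) */
    const void *opt; /* opaque pointer to extension-header options */
};

/* Put every field into its "not set" state. */
static void cookie6_init(struct cookie6 *c)
{
    c->hlimit = -1;
    c->tclass = -1;
    c->dontfrag = -1;
    c->opt = NULL;
}

/* Callee takes one pointer instead of four separate arguments, so a new
 * field can be added to the struct without touching this signature. */
static int effective_hlimit(const struct cookie6 *c, int route_default)
{
    return c->hlimit < 0 ? route_default : c->hlimit;
}
```

A caller would initialize one `struct cookie6`, overwrite the fields it cares about, and hand the same pointer to each function in the send path.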
     

05 Apr, 2016

1 commit

  • Process socket-level control messages by invoking
    __sock_cmsg_send in ip6_datagram_send_ctl for control messages on
    the SOL_SOCKET layer.

    This makes sure whenever ip6_datagram_send_ctl is called for
    udp and raw, we also process socket-level control messages.

    This is a bit uglier than IPv4, since IPv6 does not have
    something like ipcm_cookie. Perhaps we can later create
    a control message cookie for IPv6?

    Note that this commit interprets new control messages that
    were ignored before. As such, this commit does not change
    the behavior of IPv6 control messages.

    Signed-off-by: Soheil Hassas Yeganeh
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Soheil Hassas Yeganeh
     

08 Feb, 2016

1 commit

  • Silence lockdep false positive about rcu_dereference() being
    used in the wrong context.

    First one should use rcu_dereference_protected() as we own the spinlock.

    Second one should be a normal assignment, as no barrier is needed.

    Fixes: 18367681a10bd ("ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.")
    Reported-by: Dave Jones
    Signed-off-by: Eric Dumazet
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Eric Dumazet
     

04 May, 2015

1 commit

  • This patch divides the IPv6 flow label space into two ranges:
    0-7ffff is reserved for flow label manager, 80000-fffff will be
    used for creating auto flow labels (per RFC6438). This only affects how
    labels are set on transmit; it does not affect receive. This range split
    can be disabled by sysctl.

    Background:

    IPv6 flow labels have been an unmitigated disappointment thus far
    in the lifetime of IPv6. Support in HW devices to use them for ECMP
    is lacking, and OSes don't turn them on by default. If we had these
    we could get much better hashing in IPv6 networks without resorting
    to DPI, possibly eliminating some of the motivations to define new
    encaps in UDP just for getting ECMP.

    Unfortunately, the initial specifications of IPv6 did not clarify
    how they are to be used. There has always been a vague concept that
    these can be used for ECMP, flow hashing, etc. and we do now have a
    good standard for how to do this in RFC6438. The problem is that flow labels
    can be either stateful or stateless (as in RFC6438), and we are
    presented with the possibility that a stateless label may collide
    with a stateful one. Attempts to split the flow label space were
    rejected in IETF. When we added support in Linux for RFC6438, we
    could not turn on flow labels by default due to this conflict.

    This patch splits the flow label space and should give us
    a path to enabling auto flow labels by default for all IPv6 packets.
    This is an API change so we need to consider compatibility with
    existing deployment. The stateful range is chosen to be the lower
    values in hopes that most uses would have chosen small numbers.

    Once we resolve the stateless/stateful issue, we can proceed to
    look at enabling RFC6438 flow labels by default (starting with
    scaled testing).

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
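
The range split described in this commit amounts to reserving the top bit of the 20-bit label. A minimal sketch follows, assuming host byte order and hypothetical macro and function names (`FLOWLABEL_MASK`, `FLOWLABEL_AUTO_BIT`, `label_is_managed`, `auto_label_from_hash`); the kernel's own identifiers and byte-order handling may differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLOWLABEL_MASK     0x000FFFFFu /* flow labels are 20 bits wide */
#define FLOWLABEL_AUTO_BIT 0x00080000u /* top bit of the 20-bit label */

/* True if the label lies in 0x00000-0x7ffff, the range the commit
 * message reserves for the flow label manager. */
static bool label_is_managed(uint32_t label)
{
    return (label & FLOWLABEL_MASK) == label &&
           !(label & FLOWLABEL_AUTO_BIT);
}

/* Fold an arbitrary flow hash into the auto range 0x80000-0xfffff. */
static uint32_t auto_label_from_hash(uint32_t hash)
{
    return (hash & FLOWLABEL_MASK) | FLOWLABEL_AUTO_BIT;
}
```

With this layout a stateless label can never equal a label handed out by the manager, which is the collision the background section worries about.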
     

01 Apr, 2015

2 commits

  • The ipv6 code uses a mixture of coding styles. In some instances a check for a
    NULL pointer is done as x != NULL and sometimes as x. x is preferred according
    to checkpatch and this patch makes the code consistent by adopting the latter
    form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     
  • The ipv6 code uses a mixture of coding styles. In some instances a check for a
    NULL pointer is done as x == NULL and sometimes as !x. !x is preferred according
    to checkpatch and this patch makes the code consistent by adopting the latter
    form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

13 Mar, 2015

1 commit

  • hold_net and release_net were an idea that turned out to be useless.
    The code has been disabled since 2008. Kill the code; it is long past due.

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

12 Feb, 2015

1 commit

  • Use spin_lock_bh in ip6_fl_purge() to prevent the following potential
    deadlock scenario between ip6_fl_purge() and the ip6_fl_gc() timer.

    =================================
    [ INFO: inconsistent lock state ]
    3.19.0 #1 Not tainted
    ---------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    swapper/5/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
    (ip6_fl_lock){+.?...}, at: [] ip6_fl_gc+0x2d/0x180
    {SOFTIRQ-ON-W} state was registered at:
    [] __lock_acquire+0x4a0/0x10b0
    [] lock_acquire+0xc4/0x2b0
    [] _raw_spin_lock+0x3d/0x80
    [] ip6_flowlabel_net_exit+0x28/0x110
    [] ops_exit_list.isra.1+0x39/0x60
    [] cleanup_net+0x100/0x1e0
    [] process_one_work+0x20a/0x830
    [] worker_thread+0x11b/0x460
    [] kthread+0x104/0x120
    [] ret_from_fork+0x7c/0xb0
    irq event stamp: 84640
    hardirqs last enabled at (84640): [] _raw_spin_unlock_irq+0x30/0x50
    hardirqs last disabled at (84639): [] _raw_spin_lock_irq+0x1f/0x80
    softirqs last enabled at (84628): [] _local_bh_enable+0x21/0x50
    softirqs last disabled at (84629): [] irq_exit+0x12d/0x150

    other info that might help us debug this:
    Possible unsafe locking scenario:

    CPU0
    ----
    lock(ip6_fl_lock);
    <Interrupt>
    lock(ip6_fl_lock);

    *** DEADLOCK ***

    Signed-off-by: Jan Stancek
    Signed-off-by: David S. Miller

    Jan Stancek
     

24 Nov, 2014

1 commit


06 Nov, 2014

1 commit

  • Using a single fixed string is smaller code size than using
    a format and many string arguments.

    Reduces overall code size a little.

    $ size net/ipv4/igmp.o* net/ipv6/mcast.o* net/ipv6/ip6_flowlabel.o*
    text data bss dec hex filename
    34269 7012 14824 56105 db29 net/ipv4/igmp.o.new
    34315 7012 14824 56151 db57 net/ipv4/igmp.o.old
    30078 7869 13200 51147 c7cb net/ipv6/mcast.o.new
    30105 7869 13200 51174 c7e6 net/ipv6/mcast.o.old
    11434 3748 8580 23762 5cd2 net/ipv6/ip6_flowlabel.o.new
    11491 3748 8580 23819 5d0b net/ipv6/ip6_flowlabel.o.old

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
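
The size saving comes from replacing a format string plus several string arguments with a single literal. The userspace sketch below illustrates the idea on a shortened, made-up table header; the function names are hypothetical and the header text is not the one used by the kernel files listed above.

```c
#include <stdio.h>
#include <string.h>

/* Header built from a format plus several string arguments: the call
 * site must load each argument and the format is parsed at run time. */
static int header_with_format(char *buf, size_t len)
{
    return snprintf(buf, len, "%-5s %-1s %-6s\n", "Label", "S", "Owner");
}

/* Same header as one fixed string: a single constant, no format parsing. */
static int header_fixed(char *buf, size_t len)
{
    static const char hdr[] = "Label S Owner \n";

    if (len < sizeof(hdr))
        return -1;                 /* caller's buffer is too small */
    memcpy(buf, hdr, sizeof(hdr)); /* copies the terminating NUL too */
    return (int)(sizeof(hdr) - 1);
}
```

Both functions produce the same bytes; only the second avoids the extra arguments at the call site.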
     

05 Nov, 2014

1 commit


25 Aug, 2014

2 commits

  • This patch makes no changes to the logic of the code but simply addresses
    coding style issues as detected by checkpatch.

    Both objdump and diff -w show no differences.

    This patch removes some blank lines between the end of a function
    definition and the EXPORT_SYMBOL_GPL macro in order to prevent
    checkpatch warning that EXPORT_SYMBOL must immediately follow
    a function.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     
  • This patch makes no changes to the logic of the code but simply addresses
    coding style issues as detected by checkpatch.

    Both objdump and diff -w show no differences.

    A number of items are addressed in this patch:
    * Multiple spaces converted to tabs
    * Spaces before tabs removed.
    * Spaces in pointer typing cleansed (char *)foo etc.
    * Remove space after sizeof
    * Ensure spacing around comparators such as if statements.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

01 May, 2014

1 commit


19 Feb, 2014

1 commit


20 Jan, 2014

3 commits

  • With the introduction of IPV6_FL_F_REFLECT, there is no guarantee of
    flow label uniqueness. This patch introduces a new sysctl to protect the old
    behaviour, enabled by default.

    Changelog of V3:
    * rename ip6_flowlabel_consistency to flowlabel_consistency
    * use net_info_ratelimited()
    * checkpatch cleanups

    Signed-off-by: Florent Fourcot
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
     
  • This information is already available via IPV6_FLOWINFO
    of IPV6_2292PKTOPTIONS, followed by filtering to get the flow label
    information. But it is probably logical and easier for users to add this
    here, and to control both sent/received flow label values with the
    IPV6_FLOWLABEL_MGR option.

    Signed-off-by: Florent Fourcot
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
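
The "filtering" mentioned in this commit is a mask over the 32-bit flow information word. A minimal sketch, assuming the word arrives in network byte order and using hypothetical names (`FLOWLABEL_BITS`, `flowlabel_from_flowinfo`):

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Low 20 bits of the first IPv6 header word, in host byte order. */
#define FLOWLABEL_BITS 0x000FFFFFu

/* flowinfo_be is the first 32-bit word of the IPv6 header in network
 * byte order: 4 bits version, 8 bits traffic class, 20 bits flow label.
 * Returns the flow label in host byte order. */
static uint32_t flowlabel_from_flowinfo(uint32_t flowinfo_be)
{
    return ntohl(flowinfo_be) & FLOWLABEL_BITS;
}
```

An application reading the flow information from a control message would pass the received word through a helper like this to drop the version and traffic class bits.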
     
  • With this option, the socket will reply with the flow label value read
    on received packets.

    The goal is to have a connection with the same flow label in both
    directions of the communication.

    Changelog of V4:
    * Do not erase the flow label on the listening socket. Use pktopts to
    store the received value

    Signed-off-by: Florent Fourcot
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
     

15 Jan, 2014

1 commit


11 Nov, 2013

1 commit


09 Nov, 2013

3 commits

  • Take ip6_fl_lock before reading and updating
    a label.

    v2: protect only the relevant code

    Reported-by: Hannes Frederic Sowa
    Signed-off-by: Florent Fourcot
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
     
  • Although the latest RFC 6437 does not give any constraints
    on the lifetime of flow labels, the previous RFC 3697
    spoke of a minimum of 120 seconds between
    reattributions of a flow label.

    The maximum linger is currently set to 60 seconds
    and does not allow this configuration without
    CAP_NET_ADMIN right.

    This patch increases the maximum linger to 150
    seconds, allowing more flexibility to standard
    users.

    Signed-off-by: Florent Fourcot
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
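
The effect of raising the limit can be shown with a small check. This is a sketch under assumptions: the macro and function names are hypothetical, and the rule "privileged callers bypass the maximum" is inferred from the commit message's mention of CAP_NET_ADMIN rather than copied from the kernel source.

```c
#include <stdbool.h>

#define MAX_LINGER_OLD 60  /* seconds, limit before the patch */
#define MAX_LINGER_NEW 150 /* seconds, limit after the patch */

/* True if the requested linger time is acceptable: privileged callers
 * are not bound by the maximum, everyone else must stay within it. */
static bool linger_allowed(unsigned long linger_s, unsigned long max_s,
                           bool privileged)
{
    return privileged || linger_s <= max_s;
}
```

With the old limit an unprivileged request for the 120 seconds mentioned in RFC 3697 is refused; with the new limit it is accepted.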
     
  • It is already possible to set/put/renew a label
    with IPV6_FLOWLABEL_MGR and setsockopt. This patch
    adds the possibility to get information about this
    label (current value, time before expiration, etc).

    It helps applications decide whether to renew
    or release the label.

    v2:
    * Add spin_lock to prevent race condition
    * return -ENOENT if no result found
    * check if flr_action is GET

    v3:
    * move the spin_lock to protect only the
    relevant code

    Signed-off-by: Florent Fourcot
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
     

06 Nov, 2013

1 commit

  • The code of flow label in Linux Kernel follows
    the rules of RFC 1809 (an informational one) for
    conditions on flow label sharing. These rules are
    not in the last proposed standard for flow label
    (RFC 6437), or in the previous one (RFC 3697).

    Since this code does not follow any current or
    old standard, we can remove it.

    With this removal, the ipv6_opt_cmp function is
    now dead code and it can be removed too.

    Changelog to v1:
    * add justification for the change
    * remove the condition on IPv6 options

    [ Remove ipv6_hdr_cmp as it is now unused as well. -DaveM ]

    Signed-off-by: Florent Fourcot
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
     

08 Mar, 2013

1 commit


19 Feb, 2013

2 commits

  • proc_net_remove is only used to remove proc entries
    that are under /proc/net; it is not a general function for
    removing proc entries of netns. If we want to remove
    some proc entries which are under /proc/net/stat/, we still
    need to call remove_proc_entry.

    This patch uses remove_proc_entry to replace proc_net_remove.
    We can remove proc_net_remove after this patch.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • Right now, some modules such as bonding use proc_create
    to create proc entries under /proc/net/, and other modules
    such as ipv4 use proc_net_fops_create.

    It looks a little chaotic. This patch changes all of
    proc_net_fops_create to proc_create. We can remove
    proc_net_fops_create after this patch.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     

09 Feb, 2013

1 commit

  • This patch fixes the following RCU warning:

    [ 51.680236] ===============================
    [ 51.681914] [ INFO: suspicious RCU usage. ]
    [ 51.683610] 3.8.0-rc6-next-20130206-sasha-00028-g83214f7-dirty #276 Tainted: G W
    [ 51.686703] -------------------------------
    [ 51.688281] net/ipv6/ip6_flowlabel.c:671 suspicious rcu_dereference_check() usage!

    We should use rcu_dereference_bh() when we hold rcu_read_lock_bh().

    Reported-by: Sasha Levin
    Cc: David S. Miller
    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Amerigo Wang
     

06 Feb, 2013

1 commit

  • Conflicts:
    drivers/net/ethernet/intel/e1000e/ethtool.c
    drivers/net/vmxnet3/vmxnet3_drv.c
    drivers/net/wireless/iwlwifi/dvm/tx.c
    net/ipv6/route.c

    The ipv6 route.c conflict is simple, just ignore the 'net' side change
    as we fixed the same problem in 'net-next' by eliminating cached
    neighbours from ipv6 routes.

    The e1000e conflict is an addition of a new statistic in the ethtool
    code, trivial.

    The vmxnet3 conflict is about one change in 'net' removing a guarding
    conditional, whilst in 'net-next' we had a netdev_info() conversion.

    The iwlwifi conflict is dealing with a WARN_ON() conversion in
    'net-next' vs. a revert happening in 'net'.

    Signed-off-by: David S. Miller

    David S. Miller
     

01 Feb, 2013

1 commit


31 Jan, 2013

3 commits


19 Nov, 2012

1 commit

  • Allow an unprivileged user who has created a user namespace, and then
    created a network namespace to effectively use the new network
    namespace, by reducing capable(CAP_NET_ADMIN) and
    capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
    CAP_NET_ADMIN), or ns_capable(net->user_ns, CAP_NET_RAW) calls.

    Settings that merely control a single network device are allowed.
    Either the network device is a logical network device where
    restrictions make no difference or the network device is a hardware NIC
    that has been explicitly moved from the initial network namespace.

    In general policy and network stack state changes are allowed while
    resource control is left unchanged.

    Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
    Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
    Allow the SIOCADDRT ioctl to add ipv6 routes.
    Allow the SIOCDELRT ioctl to delete ipv6 routes.

    Allow creation of ipv6 raw sockets.

    Allow setting the IPV6_JOIN_ANYCAST socket option.
    Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
    socket option.

    Allow setting the IPV6_TRANSPARENT socket option.
    Allow setting the IPV6_HOPOPTS socket option.
    Allow setting the IPV6_RTHDRDSTOPTS socket option.
    Allow setting the IPV6_DSTOPTS socket option.
    Allow setting the IPV6_IPSEC_POLICY socket option.
    Allow setting the IPV6_XFRM_POLICY socket option.

    Allow sending packets with the IPV6_2292HOPOPTS control message.
    Allow sending packets with the IPV6_2292DSTOPTS control message.
    Allow sending packets with the IPV6_RTHDRDSTOPTS control message.

    Allow setting the multicast routing socket options on non multicast
    routing sockets.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
    setting up, changing and deleting tunnels over ipv6.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
    setting up, changing and deleting ipv6 over ipv4 tunnels.

    Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
    deleting, and changing the potential router list for ISATAP tunnels.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

17 Aug, 2012

1 commit


15 Aug, 2012

1 commit

  • Correct a long standing omission and use struct pid in the owner
    field of struct ip6_flowlabel when the share type is IPV6_FL_S_PROCESS.
    This guarantees we don't have issues when pid wraparound occurs.

    Use a kuid_t in the owner field of struct ip6_flowlabel when the
    share type is IPV6_FL_S_USER to add user namespace support.

    In /proc/net/ip6_flowlabel capture the current pid namespace when
    opening the file and release the pid namespace when the file is
    closed, ensuring we print the pid owner value that is meaningful to
    the reader of the file. Similarly use from_kuid_munged to print
    uid values that are meaningful to the reader of the file.

    This requires exporting pid_nr_ns so that ipv6 can continue to be built
    as a module. Yoiks what silliness.

    Acked-by: David S. Miller
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

19 May, 2012

1 commit


01 May, 2012

1 commit


16 Apr, 2012

1 commit


23 Nov, 2011

1 commit