18 Mar, 2015
1 commit
-
as a follow on to patch 70006af95515 ("bpf: allow eBPF access skb fields")
this patch allows 'protocol' and 'vlan_tci' fields to be accessible
from extended BPF programs.

The usage of 'protocol', 'vlan_present' and 'vlan_tci' fields is the same as
corresponding SKF_AD_PROTOCOL, SKF_AD_VLAN_TAG_PRESENT and SKF_AD_VLAN_TAG
accesses in classic BPF.

Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller
17 Mar, 2015
2 commits
-
reqsk_put() is the generic function that should be used
to release a refcount (and automatically call reqsk_free())

reqsk_free() might be called if refcount is known to be 0
or undefined.

refcnt is set to one in inet_csk_reqsk_queue_add()

As request socks are not yet in global ehash table,
I added temporary debugging checks in reqsk_put() and reqsk_free()

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
We have many places where we want to check if a socket is
not a timewait or request socket. Use a helper to avoid
hard coding this.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
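A minimal sketch of what such a helper could look like, assuming socket states are small integers that can be tested with a bitmask. The enum values, struct mini_sock and the name sk_is_full_sock() are hypothetical stand-ins for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified socket states; the values are illustrative only. */
enum sk_state {
	ST_ESTABLISHED  = 1,
	ST_TIME_WAIT    = 2,
	ST_LISTEN       = 3,
	ST_NEW_SYN_RECV = 4,
	ST_MAX
};

/* One bit per state, so several states can be tested with a single AND. */
#define STF(state)   (1u << (state))
#define STF_NOT_FULL (STF(ST_TIME_WAIT) | STF(ST_NEW_SYN_RECV))

struct mini_sock {
	enum sk_state state;
};

/* Returns true unless the socket is a timewait or request socket. */
static bool sk_is_full_sock(const struct mini_sock *sk)
{
	assert(sk->state > 0 && sk->state < ST_MAX);
	return (STF(sk->state) & STF_NOT_FULL) == 0;
}
```

Callers then write one readable test instead of repeating the two state comparisons at every site.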
16 Mar, 2015
5 commits
-
Signed-off-by: Scott Feldman
Signed-off-by: David S. Miller
-
As discussed at netconf, introduce swdev_ops as first step to move switchdev
ops from ndo to swdev. This will keep switchdev from cluttering up ndo ops
space.

Signed-off-by: Scott Feldman
Signed-off-by: David S. Miller
-
introduce user accessible mirror of in-kernel 'struct sk_buff':
struct __sk_buff {
    __u32 len;
    __u32 pkt_type;
    __u32 mark;
    __u32 queue_mapping;
};

bpf programs can do:
int bpf_prog(struct __sk_buff *skb)
{
    __u32 var = skb->pkt_type;

which will be compiled to bpf assembler as:
dst_reg = *(u32 *)(src_reg + 4) // 4 == offsetof(struct __sk_buff, pkt_type)

bpf verifier will check validity of access and will convert it to:
dst_reg = *(u8 *)(src_reg + offsetof(struct sk_buff, __pkt_type_offset))
dst_reg &= 7

since skb->pkt_type is a bitfield.
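The offset arithmetic and the byte-load-plus-mask rewrite described above can be tried in user space. This is only a sketch: struct mirror_skb copies the four fields listed in the message, while struct kernel_skb and load_pkt_type() are hypothetical stand-ins that assume the packet type sits in the low 3 bits of a shared byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-visible mirror, with the four fields listed in the commit message. */
struct mirror_skb {
	uint32_t len;
	uint32_t pkt_type;
	uint32_t mark;
	uint32_t queue_mapping;
};

/* Hypothetical stand-in for the in-kernel struct: the packet type is assumed
 * to occupy the low 3 bits of a byte that is shared with other flags. */
struct kernel_skb {
	uint32_t len;
	uint8_t  pkt_type_byte;
};

/* What the rewritten access does: load one byte, then mask with 7. */
static uint32_t load_pkt_type(const struct kernel_skb *skb)
{
	const uint8_t *base = (const uint8_t *)skb;
	uint32_t dst;

	assert(skb != NULL);
	dst = base[offsetof(struct kernel_skb, pkt_type_byte)];
	dst &= 7;
	return dst;
}
```

With the byte set to 0xAB, the mask leaves 3, and offsetof(struct mirror_skb, pkt_type) evaluates to 4, matching the offset in the assembler line above.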
Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller
-
This patch adds the possibility to obtain raw_smp_processor_id() in
eBPF. Currently, this is only possible in classic BPF where commit
da2033c28226 ("filter: add SKF_AD_RXHASH and SKF_AD_CPU") has added
facilities for this.

Perhaps most importantly, this would also allow us to track per CPU
statistics with eBPF maps, or to implement a poor-man's per CPU data
structure through eBPF maps.

Example function proto-type looks like:
u32 (*smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id;
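The per CPU statistics idea can be sketched in plain C, assuming an array with one slot per CPU stands in for the map and the CPU id is passed in as a parameter instead of coming from the helper. NR_SLOTS, struct percpu_counter and both functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SLOTS 4   /* hypothetical number of CPUs */

/* Stand-in for an array-style map with one slot per CPU. */
struct percpu_counter {
	uint64_t slot[NR_SLOTS];
};

/* Each CPU only ever touches its own slot, so writers never share a slot. */
static void count_packet(struct percpu_counter *c, uint32_t cpu)
{
	assert(cpu < NR_SLOTS);
	c->slot[cpu]++;
}

/* A reader sums the slots to get the global total. */
static uint64_t counter_total(const struct percpu_counter *c)
{
	uint64_t sum = 0;

	for (uint32_t i = 0; i < NR_SLOTS; i++)
		sum += c->slot[i];
	return sum;
}
```

Indexing by CPU id keeps the update path free of cross-CPU contention; the cost is that a reader has to walk every slot.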
Signed-off-by: Daniel Borkmann
Signed-off-by: David S. Miller
-
This work is similar to commit 4cd3675ebf74 ("filter: added BPF
random opcode") and adds a possibility for packet sampling in eBPF.

Currently, this is only possible in classic BPF and useful to
combine sampling with e.g. packet sockets, possibly also with tc.

Example function proto-type looks like:
u32 (*prandom_u32)(void) = (void *)BPF_FUNC_get_prandom_u32;
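A sketch of how a sampling decision might be built on such a random value, assuming a simple modulo test. sample_packet() and its rate parameter are hypothetical; rnd stands in for whatever the helper returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keep roughly one packet in every `rate` packets. `rnd` stands in for the
 * value a program would obtain from the random helper. */
static bool sample_packet(uint32_t rnd, uint32_t rate)
{
	assert(rate > 0);
	return rnd % rate == 0;
}
```

A rate of 1 keeps every packet; larger rates keep proportionally fewer, as long as the random values are close to uniform.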
Signed-off-by: Daniel Borkmann
Signed-off-by: David S. Miller
15 Mar, 2015
6 commits
-
This patch moves future_tbl to open up the possibility of having
multiple rehashes on the same table.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
-
This patch adds a rehash counter to bucket_table to indicate
the last bucket that has been rehashed. This serves two purposes:

1. Any bucket that has been rehashed can never gain a new object.
2. If the rehash counter reaches the size of the table, the table
   will forever remain empty.

This patch also downsizes bucket_table->size to an unsigned int
since we do not support sizes greater than 32 bits yet.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
-
There is in fact no need to wait for an RCU grace period in the
rehash function, since all insertions are guaranteed to go into
the new table through spin locks.

This patch uses call_rcu to free the old/rehashed table at our
leisure.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
-
Previously whenever the walker encountered a resize it simply
snapped back to the beginning and started again. However, this only
works if the rehash started and completed while the walker was
idle.

If the walker attempts to restart while the rehash is still ongoing,
we may miss objects that we shouldn't have.

This patch fixes this by making the walker walk the old table
followed by the new table just like all other readers. If a
rehash is detected we will still signal our caller of the fact
so they can prepare for duplicates but we will simply continue
the walk onto the new table after the old one is finished either
by us or by the rehasher.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
-
…etooth/bluetooth-next
Johan Hedberg says:
====================
Here's another set of Bluetooth & ieee802154 patches intended for 4.1:

- Added support for QCA ROME chipset family in the btusb driver
- at86rf230 driver fixes & cleanups
- ieee802154 cleanups
- Refactoring of Bluetooth mgmt API to allow new users
- New setting for static Bluetooth address exposed to user space
- Refactoring of hci_dev flags to remove limit of 32
- Remove unnecessary fast-connectable setting usage restrictions
- Fix behavior to be consistent when trying to pair already paired device
- Service discovery corner-case fixes

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
This patch fixes the max sifs size correction when the
IEEE802154_HW_TX_OMIT_CKSUM flag is set. With this flag the sk_buff
doesn't contain the CRC, because the transceiver will add the CRC
during transmission.

Also add some defines for the max sifs frame size value and frame check
sequence according to the 802.15.4 standard.

Signed-off-by: Alexander Aring
Acked-by: Marc Kleine-Budde
Signed-off-by: Marcel Holtmann
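The size correction can be illustrated with a short sketch. It assumes 18 octets as the largest frame that still gets the short interframe spacing and 2 octets for the frame check sequence; the macro names, enum ifs_kind and pick_ifs() are hypothetical and are not the defines added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values: 18 octets for the largest short frame, 2 for the FCS. */
#define MAX_SIFS_FRAME_SIZE 18
#define FCS_LEN              2

enum ifs_kind { IFS_SHORT, IFS_LONG };

/* buf_len is the frame length as held in the buffer. When the transceiver
 * appends the CRC itself, the buffer is FCS_LEN octets shorter than the frame
 * on air, so the threshold is lowered by the same amount. */
static enum ifs_kind pick_ifs(size_t buf_len, bool hw_adds_crc)
{
	size_t max_sifs = MAX_SIFS_FRAME_SIZE;

	assert(buf_len > 0);   /* an empty buffer is not a frame */
	if (hw_adds_crc)
		max_sifs -= FCS_LEN;

	return buf_len > max_sifs ? IFS_LONG : IFS_SHORT;
}
```

Without the correction, a 17-octet buffer with a hardware-added CRC would be treated as a short frame even though it is 19 octets on air.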
14 Mar, 2015
2 commits
-
With the extension of hdev->dev_flags utilizing a bitmap now, the space
is no longer restricted. Merge the hdev->dbg_flags into hdev->dev_flags
to save space on 64-bit architectures. On 32-bit architectures no size
reduction happens.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
The hdev->dev_flags field has outgrown itself on 32-bit systems. So
instead of hacking around it, switch to using DECLARE_BITMAP.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
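A user-space sketch of the bitmap idea, assuming a flag array sized from the number of flags rather than a single word. NUM_FLAGS, struct mini_dev and the dev_*_flag() helpers are hypothetical, and unlike the kernel bit operations these are not atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(bits) (((bits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

#define NUM_FLAGS 40   /* hypothetical: more flags than fit in 32 bits */

struct mini_dev {
	/* Enough words for NUM_FLAGS bits on both 32-bit and 64-bit builds. */
	unsigned long flags[WORDS_FOR(NUM_FLAGS)];
};

static void dev_set_flag(struct mini_dev *d, unsigned int nr)
{
	assert(nr < NUM_FLAGS);
	d->flags[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

static void dev_clear_flag(struct mini_dev *d, unsigned int nr)
{
	assert(nr < NUM_FLAGS);
	d->flags[nr / BITS_PER_WORD] &= ~(1UL << (nr % BITS_PER_WORD));
}

static bool dev_test_flag(const struct mini_dev *d, unsigned int nr)
{
	assert(nr < NUM_FLAGS);
	return (d->flags[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}
```

Because callers go through the helpers, the storage can grow past one word without touching any call site.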
13 Mar, 2015
19 commits
-
Instead of manually coding test_and_set_bit on hdev->dev_flags all the
time, use hci_dev_test_and_set_flag helper macro.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
Instead of manually coding test_and_clear_bit on hdev->dev_flags all the
time, use hci_dev_test_and_clear_flag helper macro.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
Instead of manually coding test_and_change_bit on hdev->dev_flags all the
time, use hci_dev_test_and_change_flag helper macro.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
Instead of manually coding change_bit on hdev->dev_flags all the time,
use hci_dev_change_flag helper macro.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
Instead of manually coding clear_bit on hdev->dev_flags all the time,
use hci_dev_clear_flag helper macro.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
Instead of manually coding set_bit on hdev->dev_flags all the time,
use hci_dev_set_flag helper macro.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
Instead of manually coding test_bit on hdev->dev_flags all the time,
use hci_dev_test_flag helper macro.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
The patch adds a second advertising setting that allows switching of the
controller into connectable mode independent of the global connectable
setting.

Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
-
Now that all of the operations are safe on a single hash table
across network namespaces, allocate a single global hash table
and update the code to use it.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
-
Commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker
queue") changed ht->shift to be atomic, which is actually unnecessary.

Instead of leaving the current shift in the core rhashtable structure,
it can be cached inside the individual bucket tables.

There, it will only be initialized once during a new table allocation
in the shrink/expansion slow path, and from then onward it stays immutable
for the rest of the bucket table lifetime.

That allows shift to be non-atomic. The patch also moves hash_rnd
management into the table setup. The rhashtable structure now consumes
3 instead of 4 cachelines.

Signed-off-by: Daniel Borkmann
Cc: Ying Xue
Acked-by: Thomas Graf
Signed-off-by: David S. Miller
-
Before inserting request socks into general hash table,
fill their socket family.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
sock_edemux() & sock_gen_put() should be ready to cope with request socks.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
When request socks will be in ehash, they'll need to be refcounted.
This patch adds rsk_refcnt/ireq_refcnt macros, and adds
reqsk_put() function, but nothing uses them yet.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
We need to identify request sock when they'll be visible in
global ehash table.

ireq_state is an alias to req.__req_common.skc_state.
Its value is set to TCP_NEW_SYN_RECV
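The alias can be pictured as a field-name macro over nested structs. The sketch below assumes simplified stand-in structures; only the member path req.__req_common.skc_state is taken from the message, while the struct names, mark_new_request() and the numeric state value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the nested kernel structures. */
struct common_part  { unsigned char skc_state; };
struct request_part { struct common_part __req_common; };
struct inet_request { struct request_part req; };

/* Alias: ireq->ireq_state expands to ireq->req.__req_common.skc_state. */
#define ireq_state req.__req_common.skc_state

#define NEW_SYN_RECV_STATE 12   /* hypothetical numeric value */

/* Tag a freshly allocated request so lookups can tell it apart later. */
static void mark_new_request(struct inet_request *ireq)
{
	assert(ireq != NULL);
	ireq->ireq_state = NEW_SYN_RECV_STATE;
}
```

Since the state lives in the common part, code that only knows about the common layout can read it without knowing the object is a request sock.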
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
TCP_SYN_RECV state is currently used by fast open sockets.
Initial TCP requests (the pseudo sockets created when a SYN is received)
are not yet associated to a state. They are attached to their parent,
and the parent is in TCP_LISTEN state.

This commit adds TCP_NEW_SYN_RECV state, so that we can convert
TCP stack to a different scheme gradually.

This state is not exported to user space.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
I forgot to update dccp_v6_conn_request() & cookie_v6_check().
They both need to set ireq->ireq_net and ireq->ir_cookie

Let's clear ireq->ir_cookie in inet_reqsk_alloc()

Signed-off-by: Eric Dumazet
Fixes: 33cf7c90fe2f ("net: add real socket cookies")
Signed-off-by: David S. Miller
-
I noticed that a helper function with argument type ARG_ANYTHING does
not need to have an initialized value (register).

In the worst case, this can lead to unintended stack memory leakage in future
helper functions if they are not carefully designed, or unintended
application behaviour in case the application developer was not careful
enough to match a correct helper function signature in the API.

The underlying issue is that ARG_ANYTHING should actually be split
into two different semantics:

1) ARG_DONTCARE for function arguments that the helper function
   does not care about (in other words: the default for unused
   function arguments), and

2) ARG_ANYTHING that is an argument actually being used by a
   helper function and *guaranteed* to be an initialized register.

The current risk is low: ARG_ANYTHING is only used for the 'flags'
argument (r4) in bpf_map_update_elem() that internally does strict
checking.

Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
Signed-off-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller
-
Having to say
> #ifdef CONFIG_NET_NS
>     struct net *net;
> #endif

in structures is a little bit wordy and a little bit error prone.

Instead it is possible to say:
> typedef struct {
> #ifdef CONFIG_NET_NS
>     struct net *net;
> #endif
> } possible_net_t;

And then in a header say:
>     possible_net_t net;

Which is cleaner and easier to use and easier to test, as the
possible_net_t is always there no matter what the compile options.

Further this allows read_pnet and write_pnet to be functions in all
cases which is better at catching typos.

This change adds possible_net_t, updates the definitions of read_pnet
and write_pnet, updates optional struct net * variables that
write_pnet is used on to have the type possible_net_t, and finally fixes
up the b0rked users of read_pnet and write_pnet.

Signed-off-by: "Eric W. Biederman"
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
hold_net and release_net were an idea that turned out to be useless.
The code has been disabled since 2008. Kill the code, it is long past due.

Signed-off-by: "Eric W. Biederman"
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
12 Mar, 2015
5 commits
-
This makes it possible to retain the route preference when RAs are handled in
userspace.

Signed-off-by: Lubomir Rintel
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller
-
Flags are used in the return path rather than the return patch.
Fixes: af33c1adae1e ("vxlan: Eliminate dependency on UDP socket in transmit path")
Signed-off-by: Simon Horman
Signed-off-by: David S. Miller
-
A long standing problem in netlink socket dumps is the use
of kernel socket addresses as cookies.

1) It is a security concern.

2) Sockets can be reused quite quickly, so there is
   no guarantee a cookie is used once and identifies
   a flow.

3) request sock, establish sock, and timewait socks
   for a given flow have different cookies.

Part of our effort to bring better TCP statistics requires
switching to a different allocator.

In this patch, I chose to use a per network namespace 64bit generator,
and to use it only in the case a socket needs to be dumped to netlink.
(This might be refined later if needed)

Note that I tried to carry cookies from request sock, to establish sock,
then timewait sockets.

Signed-off-by: Eric Dumazet
Cc: Eric Salo
Signed-off-by: David S. Miller
-
Export of_mdio_parse_addr(), which parses a given Ethernet PHY
node MDIO address, verifies it is within the allowed range, and returns
its value. This is going to be useful for the DSA code which needs to
deal with multiple layers of MDIO buses.

Signed-off-by: Florian Fainelli
Acked-by: Rob Herring
Signed-off-by: David S. Miller
-
Currently hash_rnd is a parameter that users can set. However,
no existing users set this parameter. It is also something that
people are unlikely to want to set directly since it's just a
random number.

In preparation for allowing the reseeding/rehashing of rhashtable,
this patch moves hash_rnd into bucket_table so that it's now an
internal state rather than a parameter.

Signed-off-by: Herbert Xu
Acked-by: Thomas Graf
Signed-off-by: David S. Miller
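A sketch of why keeping the seed inside the table helps, assuming a power-of-two bucket count. struct mini_table, mix() and bucket_of() are hypothetical, and mix() is a toy integer mixer, not the hash function rhashtable uses.

```c
#include <assert.h>
#include <stdint.h>

struct mini_table {
	uint32_t size;      /* number of buckets, a power of two */
	uint32_t hash_rnd;  /* seed owned by this table, not by the caller */
};

/* Toy integer mixer, used here only to make the seed matter. */
static uint32_t mix(uint32_t key, uint32_t seed)
{
	uint32_t h = key ^ seed;

	h ^= h >> 16;
	h *= 0x45d9f3bu;
	h ^= h >> 16;
	return h;
}

/* The bucket depends on the table's own seed, so a replacement table can
 * carry a fresh seed without the caller passing anything new. */
static uint32_t bucket_of(const struct mini_table *t, uint32_t key)
{
	assert(t->size && (t->size & (t->size - 1)) == 0);
	return mix(key, t->hash_rnd) & (t->size - 1);
}
```

Reseeding then amounts to allocating a new table with a different hash_rnd and moving entries over, which is the same path a resize already takes.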