11 Sep, 2012

1 commit

  • It is a frequent mistake to confuse the netlink port identifier with a
    process identifier. Try to reduce this confusion by renaming fields
    that hold port identifiers to portid instead of pid.

    I have carefully avoided changing the structures exported to
    userspace to avoid changing the userspace API.

    I have successfully built an allyesconfig kernel with this change.

    Signed-off-by: Eric W. Biederman
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric W. Biederman
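
    The distinction between a netlink port ID and a process ID can be seen from userspace. The following is a minimal sketch, assuming a Linux system where unprivileged NETLINK_ROUTE sockets are allowed; the helper names open_and_query_portid() and two_portids() are made up for illustration. It binds two netlink sockets in one process with nl_pid set to 0, so the kernel assigns each a port ID, and reads the IDs back with getsockname(). Both sockets belong to the same process, yet they cannot share a port ID, so at most one of them can match getpid().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

/* Open a NETLINK_ROUTE socket, let the kernel choose the port ID, and
 * report that ID through *portid. Returns 0 on success, -1 on error. */
static int open_and_query_portid(int *fd_out, uint32_t *portid)
{
    struct sockaddr_nl addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0; /* 0 asks the kernel to assign a port ID */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *fd_out = fd;
    *portid = addr.nl_pid; /* the field is named nl_pid, but it is a port ID */
    return 0;
}

/* Keep two sockets open at the same time in one process and return
 * the port ID of each. Returns 0 on success, -1 on error. */
static int two_portids(uint32_t *first, uint32_t *second)
{
    int fd1, fd2;

    if (open_and_query_portid(&fd1, first) < 0)
        return -1;
    if (open_and_query_portid(&fd2, second) < 0) {
        close(fd1);
        return -1;
    }
    close(fd1);
    close(fd2);
    return 0;
}
```

    The userspace structure keeps the nl_pid name, which is consistent with the commit's note that exported structures were left unchanged.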
     

15 Aug, 2012

1 commit


18 Jun, 2012

1 commit


21 Apr, 2012

2 commits

  • This results in code with less boiler plate that is a bit easier
    to read.

    Additionally, this stops us from using compatibility code in the sysctl
    core, hastening the day when the compatibility code can be removed.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This makes it clearer which sysctls are relative to your current network
    namespace.

    This makes it a little less error prone by not exposing sysctls for the
    initial network namespace in other namespaces.

    This is the same way we handle all of our other network interfaces to
    userspace and I can't honestly remember why we didn't do this for
    sysctls right from the start.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

16 Apr, 2012

1 commit


14 Apr, 2012

2 commits


13 Apr, 2012

2 commits

  • Pull in the 'net' tree to get CAIF bug fixes upon which
    the following set of CAIF feature patches depend.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Recently an oops was reported in phonet if there was a failure during
    network namespace creation.

    [ 163.733755] ------------[ cut here ]------------
    [ 163.734501] kernel BUG at include/net/netns/generic.h:45!
    [ 163.734501] invalid opcode: 0000 [#1] PREEMPT SMP
    [ 163.734501] CPU 2
    [ 163.734501] Pid: 19145, comm: trinity Tainted: G W 3.4.0-rc1-next-20120405-sasha-dirty #57
    [ 163.734501] RIP: 0010:phonet_pernet+0x182/0x1a0
    [ 163.734501] RSP: 0018:ffff8800674d5ca8 EFLAGS: 00010246
    [ 163.734501] RAX: 000000003fffffff RBX: 0000000000000000 RCX: ffff8800678c88d8
    [ 163.734501] RDX: 00000000003f4000 RSI: ffff8800678c8910 RDI: 0000000000000282
    [ 163.734501] RBP: ffff8800674d5cc8 R08: 0000000000000000 R09: 0000000000000000
    [ 163.734501] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880068bec920
    [ 163.734501] R13: ffffffff836b90c0 R14: 0000000000000000 R15: 0000000000000000
    [ 163.734501] FS: 00007f055e8de700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000
    [ 163.734501] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [ 163.734501] CR2: 00007f055e6bb518 CR3: 0000000070c16000 CR4: 00000000000406e0
    [ 163.734501] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 163.734501] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [ 163.734501] Process trinity (pid: 19145, threadinfo ffff8800674d4000, task ffff8800678c8000)
    [ 163.734501] Stack:
    [ 163.734501] ffffffff824d5f00 ffffffff810e2ec1 ffff880067ae0000 00000000ffffffd4
    [ 163.734501] ffff8800674d5cf8 ffffffff824d667a ffff880067ae0000 00000000ffffffd4
    [ 163.734501] ffffffff836b90c0 0000000000000000 ffff8800674d5d18 ffffffff824d707d
    [ 163.734501] Call Trace:
    [ 163.734501] ? phonet_pernet+0x20/0x1a0
    [ 163.734501] ? get_parent_ip+0x11/0x50
    [ 163.734501] phonet_device_destroy+0x1a/0x100
    [ 163.734501] phonet_device_notify+0x3d/0x50
    [ 163.734501] notifier_call_chain+0xee/0x130
    [ 163.734501] raw_notifier_call_chain+0x11/0x20
    [ 163.734501] call_netdevice_notifiers+0x52/0x60
    [ 163.734501] rollback_registered_many+0x185/0x270
    [ 163.734501] unregister_netdevice_many+0x14/0x60
    [ 163.734501] ipip_exit_net+0x1b3/0x1d0
    [ 163.734501] ? ipip_rcv+0x420/0x420
    [ 163.734501] ops_exit_list+0x35/0x70
    [ 163.734501] setup_net+0xab/0xe0
    [ 163.734501] copy_net_ns+0x76/0x100
    [ 163.734501] create_new_namespaces+0xfb/0x190
    [ 163.734501] unshare_nsproxy_namespaces+0x61/0x80
    [ 163.734501] sys_unshare+0xff/0x290
    [ 163.734501] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [ 163.734501] system_call_fastpath+0x16/0x1b
    [ 163.734501] Code: e0 c3 fe 66 0f 1f 44 00 00 48 c7 c2 40 60 4d 82 be 01 00 00 00 48 c7 c7 80 d1 23 83 e8 48 2a c4 fe e8 73 06 c8 fe 48 85 db 75 0e 0b 0f 1f 40 00 eb fe 66 0f 1f 44 00 00 48 83 c4 10 48 89 d8
    [ 163.734501] RIP phonet_pernet+0x182/0x1a0
    [ 163.734501] RSP
    [ 163.861289] ---[ end trace fb5615826c548066 ]---

    After investigation it turns out there were two issues.

    1) Phonet does not implement network devices, but it was using
    register_pernet_device instead of register_pernet_subsys.

    This allowed cases where phonet was not initialized and the phonet
    net_generic was not set for a network namespace while network device
    events were being reported on the netdevice_notifier for that
    namespace, leading to the oops above.

    2) phonet_exit_net implemented a confusing special case for handling all
    network devices going away. It was hard to see that it was correct, and it
    would only occur when the phonet module was removed.

    Now that unregister_netdevice_notifier has been modified to synthesize
    unregistration events for the network devices that are extant when it is
    called, this confusing special case in phonet_exit_net is no longer needed.

    Signed-off-by: Eric W. Biederman
    Acked-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Eric W. Biederman
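
    The ordering problem in the first issue can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: it assumes that per-namespace subsystem operations are initialized before device operations, and that a failed namespace setup is rolled back in reverse order, with device teardown generating netdevice events. All names here (toy_ops, toy_setup_net, the "failing" entry) are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OPS 8

enum pernet_kind { PERNET_SUBSYS, PERNET_DEVICE };

struct toy_ops {
    const char *name;
    enum pernet_kind kind;
    bool fails_init;          /* simulate a failure during namespace setup */
    bool listens_for_netdev;  /* has a netdevice notifier */
    bool initialized;         /* per-namespace state currently exists */
};

/* Assumed ordering: subsystems first, then devices, each group in
 * registration order. */
static size_t order_ops(struct toy_ops *reg, size_t n, struct toy_ops **out)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++)
        if (reg[i].kind == PERNET_SUBSYS)
            out[k++] = &reg[i];
    for (size_t i = 0; i < n; i++)
        if (reg[i].kind == PERNET_DEVICE)
            out[k++] = &reg[i];
    return k;
}

/* Deliver a netdevice event; false means a listener was called without
 * per-namespace state, which corresponds to the oops. */
static bool notify_listeners(struct toy_ops **list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (list[i]->listens_for_netdev && !list[i]->initialized)
            return false;
    return true;
}

/* Set up one namespace; on failure roll back in reverse order.
 * Returns true if no listener was ever notified while uninitialized. */
static bool toy_setup_net(struct toy_ops *reg, size_t n)
{
    struct toy_ops *list[MAX_OPS];
    size_t count, i;
    bool ok = true;

    if (n > MAX_OPS)
        return false;
    count = order_ops(reg, n, list);

    for (i = 0; i < count; i++) {
        if (list[i]->fails_init)
            break;
        list[i]->initialized = true;
    }
    if (i == count)
        return true; /* setup succeeded, nothing to roll back */

    while (i-- > 0) {
        /* Tearing down a device op unregisters its devices,
         * which fires netdevice events. */
        if (list[i]->kind == PERNET_DEVICE)
            ok = notify_listeners(list, count) && ok;
        list[i]->initialized = false;
    }
    return ok;
}

/* Scenario: a tunnel device op, a hypothetical failing op, then phonet. */
static bool phonet_survives_failed_setup(enum pernet_kind phonet_kind)
{
    struct toy_ops reg[] = {
        { "ipip",    PERNET_DEVICE, false, false, false },
        { "failing", PERNET_DEVICE, true,  false, false },
        { "phonet",  phonet_kind,   false, true,  false },
    };

    return toy_setup_net(reg, sizeof(reg) / sizeof(reg[0]));
}
```

    In this model, registering phonet as a device leaves it uninitialized when the rollback fires events, while registering it as a subsystem guarantees its state exists first.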
     

11 Apr, 2012

1 commit


06 Apr, 2012

1 commit

  • A phonet packet is limited to USHRT_MAX bytes, but this is never checked
    during tx, which means that the user can specify any size they wish and
    the kernel will attempt to allocate that size.

    In the good case, this leads to the following warning, but it may also cause
    the kernel to invoke the OOM killer and kill a random task on the server.

    [ 8921.744094] WARNING: at mm/page_alloc.c:2255 __alloc_pages_slowpath+0x65/0x730()
    [ 8921.749770] Pid: 5081, comm: trinity Tainted: G W 3.4.0-rc1-next-20120402-sasha #46
    [ 8921.756672] Call Trace:
    [ 8921.758185] warn_slowpath_common+0x87/0xb0
    [ 8921.762868] warn_slowpath_null+0x15/0x20
    [ 8921.765399] __alloc_pages_slowpath+0x65/0x730
    [ 8921.769226] ? zone_watermark_ok+0x1a/0x20
    [ 8921.771686] ? get_page_from_freelist+0x625/0x660
    [ 8921.773919] __alloc_pages_nodemask+0x1f8/0x240
    [ 8921.776248] kmalloc_large_node+0x70/0xc0
    [ 8921.778294] __kmalloc_node_track_caller+0x34/0x1c0
    [ 8921.780847] ? sock_alloc_send_pskb+0xbc/0x260
    [ 8921.783179] __alloc_skb+0x75/0x170
    [ 8921.784971] sock_alloc_send_pskb+0xbc/0x260
    [ 8921.787111] ? release_sock+0x7e/0x90
    [ 8921.788973] sock_alloc_send_skb+0x10/0x20
    [ 8921.791052] pep_sendmsg+0x60/0x380
    [ 8921.792931] ? pn_socket_bind+0x156/0x180
    [ 8921.794917] ? pn_socket_autobind+0x3f/0x90
    [ 8921.797053] pn_socket_sendmsg+0x4f/0x70
    [ 8921.798992] sock_aio_write+0x187/0x1b0
    [ 8921.801395] ? sub_preempt_count+0xae/0xf0
    [ 8921.803501] ? __lock_acquire+0x42c/0x4b0
    [ 8921.805505] ? __sock_recv_ts_and_drops+0x140/0x140
    [ 8921.807860] do_sync_readv_writev+0xbc/0x110
    [ 8921.809986] ? might_fault+0x97/0xa0
    [ 8921.811998] ? security_file_permission+0x1e/0x90
    [ 8921.814595] do_readv_writev+0xe2/0x1e0
    [ 8921.816702] ? do_setitimer+0x1ac/0x200
    [ 8921.818819] ? get_parent_ip+0x11/0x50
    [ 8921.820863] ? sub_preempt_count+0xae/0xf0
    [ 8921.823318] vfs_writev+0x46/0x60
    [ 8921.825219] sys_writev+0x4f/0xb0
    [ 8921.827127] system_call_fastpath+0x16/0x1b
    [ 8921.829384] ---[ end trace dffe390f30db9eb7 ]---

    Signed-off-by: Sasha Levin
    Acked-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Sasha Levin
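
    The missing check described in this commit amounts to validating the user-supplied length before any allocation is attempted. A minimal userspace sketch follows; the function name check_tx_len() is hypothetical, and returning -EMSGSIZE for an oversized message is an assumption about the error code rather than something stated in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Reject a transmit request whose length cannot fit in a phonet packet.
 * Returns 0 if the length is acceptable, or a negative errno value
 * (assumed here to be -EMSGSIZE) before any buffer is allocated. */
static int check_tx_len(size_t len)
{
    if (len > USHRT_MAX)
        return -EMSGSIZE;
    return 0;
}
```

    Doing the comparison before the allocation means an arbitrarily large request fails cheaply instead of reaching the page allocator.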
     

02 Apr, 2012

1 commit


13 Jan, 2012

1 commit

  • Commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
    RCU_INIT_POINTER) made many incorrect changes, since it did a
    complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
    y).

    We miss needed barriers, even on x86, when y is not NULL.

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    CC: Paul E. McKenney
    Signed-off-by: David S. Miller

    Eric Dumazet
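
    The difference between the two primitives can be approximated in userspace with C11 atomics. This is an analogy, not the kernel implementation: a release store stands in for rcu_assign_pointer(), a relaxed store stands in for RCU_INIT_POINTER(), and an acquire load stands in for the reader side. The names struct config, publish_config(), clear_config() and read_value_or() are invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct config {
    int value;
};

static _Atomic(struct config *) current_config;

/* Analogue of rcu_assign_pointer(): the release ordering ensures the
 * fields of *c are visible before the pointer itself is. It also stops
 * the compiler from moving the initialization past the store, which
 * matters even on strongly ordered CPUs. */
static void publish_config(struct config *c)
{
    atomic_store_explicit(&current_config, c, memory_order_release);
}

/* Analogue of RCU_INIT_POINTER(p, NULL): nothing is published behind a
 * NULL pointer, so no ordering is required. */
static void clear_config(void)
{
    atomic_store_explicit(&current_config, NULL, memory_order_relaxed);
}

/* Reader side: load the pointer, then dereference it only if non-NULL. */
static int read_value_or(int fallback)
{
    struct config *c =
        atomic_load_explicit(&current_config, memory_order_acquire);

    return c ? c->value : fallback;
}
```

    Using the relaxed form to publish a non-NULL pointer would allow a reader to see the pointer before the data it points to, which is the class of bug this commit fixes.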
     

19 Nov, 2011

1 commit


01 Nov, 2011

2 commits


02 Aug, 2011

1 commit

  • When assigning a NULL value to an RCU protected pointer, no barrier
    is needed. rcu_assign_pointer used to handle that case specially, but
    will soon change to no longer do so.

    Convert all uses of rcu_assign_pointer with a NULL value.

    //smpl
    @@ expression P; @@

    - rcu_assign_pointer(P, NULL)
    + RCU_INIT_POINTER(P, NULL)

    //

    Signed-off-by: Stephen Hemminger
    Acked-by: Paul E. McKenney
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

10 Jun, 2011

1 commit

  • The message size allocated for rtnl ifinfo dumps was limited to
    a single page. This is not enough for additional interface info
    available with devices that support SR-IOV and caused a bug in
    which VF info would not be displayed if more than approximately
    40 VFs were created per interface.

    Implement a new function pointer for the rtnl_register service that will
    calculate the amount of data required for the ifinfo dump and allocate
    enough data to satisfy the request.

    Signed-off-by: Greg Rose
    Signed-off-by: Jeff Kirsher

    Greg Rose
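
    The idea of sizing the dump buffer through a callback can be sketched in userspace as follows. Every name and constant here is hypothetical (toy_calcit_func, TOY_BASE_IFINFO_SIZE, TOY_PER_VF_SIZE), chosen only to show the shape of the approach: a per-message-type function estimates the space needed, and the allocation uses that estimate when it exceeds the one-page default.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE        4096u
#define TOY_BASE_IFINFO_SIZE 1024u /* hypothetical fixed cost per interface */
#define TOY_PER_VF_SIZE      96u   /* hypothetical cost per virtual function */

struct toy_dev {
    unsigned int num_vfs;
};

/* Callback type: estimate the bytes needed to dump one interface. */
typedef size_t (*toy_calcit_func)(const struct toy_dev *dev);

static size_t toy_ifinfo_calcit(const struct toy_dev *dev)
{
    return TOY_BASE_IFINFO_SIZE + (size_t)dev->num_vfs * TOY_PER_VF_SIZE;
}

/* Choose the dump buffer size: at least one page, more if the callback
 * reports that the interface needs it. A NULL callback keeps the old
 * single-page behaviour. */
static size_t toy_dump_alloc_size(const struct toy_dev *dev,
                                  toy_calcit_func calcit)
{
    size_t need = calcit ? calcit(dev) : 0;

    return need > TOY_PAGE_SIZE ? need : TOY_PAGE_SIZE;
}
```

    Message types that register no callback are unaffected, so only dumps that can outgrow a page pay for the larger allocation.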
     

24 May, 2011

1 commit

  • The %pK format specifier is designed to hide exposed kernel pointers,
    specifically via /proc interfaces. Exposing these pointers provides an
    easy target for kernel write vulnerabilities, since they reveal the
    locations of writable structures containing easily triggerable function
    pointers. The behavior of %pK depends on the kptr_restrict sysctl.

    If kptr_restrict is set to 0, no deviation from the standard %p behavior
    occurs. If kptr_restrict is set to 1 (the default) and the current user
    (intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
    (currently in the LSM tree), kernel pointers using %pK are printed as 0's.
    If kptr_restrict is set to 2, kernel pointers using %pK are printed as
    0's regardless of privileges. Replacing with 0's was chosen over the
    default "(null)", which cannot be parsed by userland %p, which expects
    "(nil)".

    The supporting code for kptr_restrict and %pK is currently in the -mm
    tree. This patch converts users of %p in net/ to %pK. Cases of printing
    pointers to the syslog are not covered, since this would eliminate useful
    information for postmortem debugging and the reading of the syslog is
    already optionally protected by the dmesg_restrict sysctl.

    Signed-off-by: Dan Rosenberg
    Cc: James Morris
    Cc: Eric Dumazet
    Cc: Thomas Graf
    Cc: Eugene Teo
    Cc: Kees Cook
    Cc: Ingo Molnar
    Cc: David S. Miller
    Cc: Peter Zijlstra
    Cc: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Dan Rosenberg
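
    The three kptr_restrict settings described above reduce to a small decision. The sketch below models that decision in userspace; toy_format_pK() is a hypothetical helper, the capability check is passed in as a plain boolean, and the zero-padded hexadecimal output width is an assumption about how the value is rendered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a pointer the way the commit message describes %pK:
 *   kptr_restrict == 0: print the real value
 *   kptr_restrict == 1: print zeros unless the reader has CAP_SYSLOG
 *   kptr_restrict == 2: always print zeros */
static void toy_format_pK(char *buf, size_t len, const void *ptr,
                          int kptr_restrict, bool has_cap_syslog)
{
    bool hide = (kptr_restrict == 1 && !has_cap_syslog) ||
                kptr_restrict >= 2;
    uintptr_t shown = hide ? 0 : (uintptr_t)ptr;

    /* Zero-padded hex, two digits per byte of pointer width. */
    snprintf(buf, len, "%0*llx", (int)(2 * sizeof(void *)),
             (unsigned long long)shown);
}
```

    Printing zeros rather than a textual placeholder keeps the field parseable as a hexadecimal number by existing userland tools.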
     

21 May, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
    macvlan: fix panic if lowerdev in a bond
    tg3: Add braces around 5906 workaround.
    tg3: Fix NETIF_F_LOOPBACK error
    macvlan: remove one synchronize_rcu() call
    networking: NET_CLS_ROUTE4 depends on INET
    irda: Fix error propagation in ircomm_lmp_connect_response()
    irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
    irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
    rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
    be2net: Kill set but unused variable 'req' in lancer_fw_download()
    irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
    atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
    rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
    rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
    rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
    rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
    pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
    isdn: capi: Use pr_debug() instead of ifdefs.
    tg3: Update version to 3.119
    tg3: Apply rx_discards fix to 5719/5720
    ...

    Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
    as per Davem.

    Linus Torvalds
     

08 May, 2011

1 commit


03 May, 2011

1 commit

  • Four years ago, Patrick made a change to hold rtnl mutex during netlink
    dump callbacks.

    I believe it was a wrong move. This slows down concurrent dumps, making
    good old /proc/net/ files faster than rtnetlink in some situations.

    This occurred to me because one "ip link show dev ..." was _very_ slow
    on a workload adding/removing network devices in background.

    All dump callbacks are able to use RCU locking now, so this patch is
    roughly a revert of these commits:

    1c2d670f366 : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
    6313c1e0992 : [RTNETLINK]: Remove unnecessary locking in dump callbacks

    This lets writers fight for the rtnl mutex while readers go at full speed.

    It also takes care of phonet: phonet_route_get() is now called from an RCU
    read-side section, so I renamed it to phonet_route_get_rcu().

    Signed-off-by: Eric Dumazet
    Cc: Patrick McHardy
    Cc: Remi Denis-Courmont
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Apr, 2011

1 commit


16 Mar, 2011

1 commit


10 Mar, 2011

8 commits

  • This is now a run-time choice so that a single kernel can support both
    old and new generation ISI modems. Support for manually enabling the
    pipe flow is removed as it did not work properly, does not fit well
    with the socket API, and I am not aware of any use at the moment.

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     
  • This provides support for newer ISI modems with no need for the
    earlier experimental compile-time alternative choice. With this,
    we can now use the same kernel and userspace with both types of
    modems.

    This also avoids confusing two different and incompatible state
    machines, actively connected vs accepted sockets, and adds
    connection response error handling (processing "SYN/RST" of sorts).

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     
  • User-space sometimes needs this information. In particular, the GPRS
    context or the AT commands pipe setups may use the pipe handle as a
    reference.

    This removes the settable pipe handle with CONFIG_PHONET_PIPECTRLR.
    It did not handle error cases correctly. Furthermore, the kernel
    *could* implement a smart scheme for allocating handles (if ever
    needed), but userspace really cannot.

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     
  • This moves most of the accept logic to process context like other
    socket stacks do. Then we can use a few more common socket helpers
    and simplify a bit.

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     
  • With the addition of the pipe controller, there is now quite a bit
    of repetitive code for small signaling messages. Let's factor it out.

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     
  • In some cases, the Phonet pipe backlog callbacks returned negative
    errno instead of NET_RX_* values.

    In other cases, NET_RX_DROP was returned for invalid packets, even
    though it seems only intended for buffering problems (not for
    deliberately discarded packets).

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
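
    The return-value convention described in this commit can be sketched as a small userspace model. The NET_RX_SUCCESS and NET_RX_DROP values are redefined locally so the example stands alone; the types toy_pkt and toy_queue and the function toy_backlog_rcv() are hypothetical. The point is that a deliberately discarded packet is not reported as a drop, and a negative errno is never returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local copies of the receive status codes, so the sketch is standalone. */
#define NET_RX_SUCCESS 0
#define NET_RX_DROP    1

struct toy_pkt {
    size_t len;
    bool valid; /* false for a malformed or unexpected packet */
};

struct toy_queue {
    size_t used;
    size_t limit;
};

/* Backlog receive callback: always returns a NET_RX_* value. */
static int toy_backlog_rcv(struct toy_queue *q, const struct toy_pkt *pkt)
{
    if (!pkt->valid)
        return NET_RX_SUCCESS; /* discarded on purpose, not a buffering problem */

    if (q->used + pkt->len > q->limit)
        return NET_RX_DROP;    /* no room left in the receive queue */

    q->used += pkt->len;
    return NET_RX_SUCCESS;
}
```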
     
  • Phonet assumes that packets are never dropped. We try our best to
    avoid this situation. But let's return ENOBUFS if queueing to the
    network device fails so that the caller knows things went wrong.

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     
  • The previous Phonet patch series introduced a per-socket implicit
    destination (i.e. connect()). In that case, the destination
    socket address is NULL in the transmit function.
    However, commit a8059512b120362b15424f152b2548fe8b11bd0c
    ("Phonet: implement per-socket destination/peer address")
    is incomplete and would trigger a NULL dereference.
    (Fortunately, the code is not in any released kernel, and in fact
    is currently not reachable.)

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     

26 Feb, 2011

7 commits