27 Jun, 2014

5 commits

  • This includes the special handling for NFPROTO_INET. There is
    no real inet logger since we don't see packets of this family.
    However, rules are loaded using this special family type. So
    let's just request both IPV4 and IPV6 loggers.
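
    A minimal userspace sketch of the family handling described above. The
    enum and the function name are hypothetical and do not come from the
    kernel tree; the sketch only shows the idea that an "inet" rule asks
    for both real loggers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical family tags for this sketch; not the kernel's NFPROTO_* values. */
enum demo_family { DEMO_INET, DEMO_IPV4, DEMO_IPV6 };

/*
 * Store in out[] the families whose loggers should be requested for a
 * rule of family f, and return how many entries were written (1 or 2).
 */
static size_t demo_logger_families(enum demo_family f, enum demo_family out[2])
{
    assert(out != NULL);
    if (f == DEMO_INET) {
        /* No packet is ever seen as "inet", so ask for both real loggers. */
        out[0] = DEMO_IPV4;
        out[1] = DEMO_IPV6;
        return 2;
    }
    out[0] = f;
    return 1;
}
```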

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • This adds the generic plain text packet logger for bridged packets.
    It routes the logging message to the real protocol packet logger.
    I decided not to refactor the ebt_log code for two reasons:

    1) The ebt_log output is not consistent with the IPv4 and IPv6
    Netfilter packet loggers. The output is different for no good
    reason and it adds redundant code to handle packet logging.

    2) To avoid breaking backward compatibility for applications
    out there that are parsing the specific ebt_log output, the ebt_log
    output has been left as is. So only nftables will use the new
    consistent logging format for logged bridged packets.

    More decisions coming in this patch:

    1) This also removes ebt_log as default logger for bridged packets.
    Thus, nf_log_packet() routes packet to this new packet logger
    instead. This doesn't break backward compatibility since
    nf_log_packet() is not used to log packets in plain text format
    from anywhere in the ebtables/netfilter bridge code.

    2) The new bridge packet logger also performs a lazy request to
    register the real IPv4, ARP and IPv6 netfilter packet loggers.
    If the real protocol logger is not available (not compiled or the
    module is not available in the system), no packet logging happens.
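
    The routing idea can be sketched in userspace as a dispatch on the
    EtherType. All demo_* names are hypothetical, and the table of function
    pointers merely stands in for the lazily requested protocol loggers; an
    empty slot models a logger that is not compiled or not loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*demo_log_fn)(const char *prefix);

enum demo_proto { DEMO_PROTO_IPV4, DEMO_PROTO_ARP, DEMO_PROTO_IPV6, DEMO_PROTO_MAX };

/* One slot per real protocol logger; NULL means "not available". */
static demo_log_fn demo_loggers[DEMO_PROTO_MAX];

/* Trivial logger used for illustration: it only counts its invocations. */
static int demo_calls;
static void demo_count_logger(const char *prefix) { (void)prefix; demo_calls++; }

/* Route a bridged frame to the real logger; returns 1 if it was logged, else 0. */
static int demo_bridge_log(uint16_t ethertype, const char *prefix)
{
    enum demo_proto p;

    assert(prefix != NULL);
    switch (ethertype) {
    case 0x0800: p = DEMO_PROTO_IPV4; break;   /* IPv4 EtherType */
    case 0x0806: p = DEMO_PROTO_ARP;  break;   /* ARP EtherType */
    case 0x86DD: p = DEMO_PROTO_IPV6; break;   /* IPv6 EtherType */
    default:     return 0;                     /* no matching real logger */
    }
    if (demo_loggers[p] == NULL)
        return 0;                              /* logger missing: nothing is logged */
    demo_loggers[p](prefix);
    return 1;
}
```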

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • This adds the generic plain text packet logger for ARP packets. It is
    based on the ebt_log code. Nevertheless, the output has been modified
    to make it consistent with the original xt_LOG output.

    This is an example output:

    IN=wlan0 OUT= ARP HTYPE=1 PTYPE=0x0800 OPCODE=2 MACSRC=00:ab:12:34:55:63 IPSRC=192.168.10.1 MACDST=80:09:12:70:4f:50 IPDST=192.168.10.150

    This patch enables packet logging from ARP chains, eg.

    nft add rule arp filter input log prefix "input: "
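
    For illustration, a userspace sketch that reproduces the field layout
    of the example output above. The struct and function names are
    hypothetical, and the format string is inferred from that single
    sample line rather than taken from the kernel source.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, already-parsed view of an ARP packet (host byte order). */
struct demo_arp {
    unsigned int htype, ptype, opcode;
    unsigned char mac_src[6], ip_src[4];
    unsigned char mac_dst[6], ip_dst[4];
};

/* Format one log line into buf; returns what snprintf() returns. */
static int demo_format_arp(char *buf, size_t len, const char *in,
                           const char *out, const struct demo_arp *a)
{
    assert(buf != NULL && in != NULL && out != NULL && a != NULL);
    return snprintf(buf, len,
        "IN=%s OUT=%s ARP HTYPE=%u PTYPE=0x%04x OPCODE=%u "
        "MACSRC=%02x:%02x:%02x:%02x:%02x:%02x IPSRC=%u.%u.%u.%u "
        "MACDST=%02x:%02x:%02x:%02x:%02x:%02x IPDST=%u.%u.%u.%u",
        in, out, a->htype, a->ptype, a->opcode,
        a->mac_src[0], a->mac_src[1], a->mac_src[2],
        a->mac_src[3], a->mac_src[4], a->mac_src[5],
        a->ip_src[0], a->ip_src[1], a->ip_src[2], a->ip_src[3],
        a->mac_dst[0], a->mac_dst[1], a->mac_dst[2],
        a->mac_dst[3], a->mac_dst[4], a->mac_dst[5],
        a->ip_dst[0], a->ip_dst[1], a->ip_dst[2], a->ip_dst[3]);
}
```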

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • Before this patch, the nf_loginfo parameter specified the logging
    configuration in case the specified default logger was loaded. This
    patch updates the semantics of the nf_loginfo parameter in
    nf_log_packet() which now indicates the logger that you explicitly
    want to use.

    Thus, nf_log_packet() is exposed as a unified interface which
    internally routes the log message to the corresponding logger type
    by family.

    The module dependencies are expressed by the new nf_logger_find_get()
    and nf_logger_put() functions which bump the logger module refcount.
    Thus, you cannot remove logger modules that are used by rules anymore.

    Another important effect of this change is that the family specific
    module is only loaded when required. Therefore, xt_LOG and nft_log
    will just trigger the autoload of the nf_log_{ip,ip6} modules
    according to the family.
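
    The reference counting can be pictured with the following userspace
    sketch. The demo_* names and the table layout are assumptions made for
    illustration; they are not the signatures of nf_logger_find_get() and
    nf_logger_put(), and a plain counter stands in for the module refcount.

```c
#include <assert.h>
#include <stddef.h>

struct demo_logger {
    const char *name;   /* NULL means "module not loaded" in this sketch */
    int refcnt;         /* number of rules currently holding the logger */
};

/* Look up the logger in a family slot and take a reference; NULL if absent. */
static struct demo_logger *demo_logger_find_get(struct demo_logger *tbl,
                                                size_t n, size_t family)
{
    assert(tbl != NULL);
    if (family >= n || tbl[family].name == NULL)
        return NULL;
    tbl[family].refcnt++;
    return &tbl[family];
}

/* Drop a reference previously taken with demo_logger_find_get(). */
static void demo_logger_put(struct demo_logger *lg)
{
    assert(lg != NULL && lg->refcnt > 0);
    lg->refcnt--;
}

/* A logger may only go away once no rule references it. */
static int demo_logger_can_unload(const struct demo_logger *lg)
{
    assert(lg != NULL);
    return lg->refcnt == 0;
}
```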

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • The plain text logging is currently embedded into the xt_LOG target.
    In order to be able to use the plain text logging from nft_log, as a
    first step, this patch moves the family specific code to the following
    files and Kconfig symbols:

    1) net/ipv4/netfilter/nf_log_ip.c: CONFIG_NF_LOG_IPV4
    2) net/ipv6/netfilter/nf_log_ip6.c: CONFIG_NF_LOG_IPV6
    3) net/netfilter/nf_log_common.c: CONFIG_NF_LOG_COMMON

    These new modules will be required by xt_LOG and nft_log. This patch
    is based on original patch from Arturo Borrero Gonzalez.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

26 Jun, 2014

4 commits

  • This patch moves Eric Dumazet's log buffer implementation from the
    xt_log.h header file to the core net/netfilter/nf_log.c. This also
    includes the renaming of the structure and functions to avoid possible
    undesired namespace clashes.

    This change allows us to use it from the arp and bridge packet logging
    implementation in follow up patches.

    Pablo Neira Ayuso
     
  • Now that legacy ulog targets are not available anymore in the tree, we
    can have up to two possible loggers:

    1) The plain text logging via kernel logging ring.
    2) The nfnetlink_log infrastructure which delivers log messages
    to userspace.

    This patch replaces the list of loggers by an array of two pointers
    per family for each possible logger and it also introduces a new field
    to the nf_logger structure which indicates the position in the logger
    array (based on the logger type).
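
    A small userspace sketch of such a table: one slot per (family, logger
    type) pair, with the type field giving the position. The names, the
    number of families and the "slot already taken" error are assumptions
    of this sketch, not details confirmed by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical logger types: plain text log and netlink delivery. */
enum demo_log_type { DEMO_TYPE_LOG, DEMO_TYPE_ULOG, DEMO_TYPE_MAX };

#define DEMO_FAMILIES 4   /* arbitrary number of families for the sketch */

struct demo_slot_logger {
    const char *name;
    enum demo_log_type type;   /* position in the per-family array */
};

/* Two pointers per family, replacing a list of loggers. */
static const struct demo_slot_logger *demo_slots[DEMO_FAMILIES][DEMO_TYPE_MAX];

/* Register lg for one family; 0 on success, -1 on bad input or taken slot. */
static int demo_register(size_t family, const struct demo_slot_logger *lg)
{
    assert(lg != NULL);
    if (family >= DEMO_FAMILIES || lg->type >= DEMO_TYPE_MAX)
        return -1;
    if (demo_slots[family][lg->type] != NULL)
        return -1;
    demo_slots[family][lg->type] = lg;
    return 0;
}

/* Constant-time lookup by family and type; NULL if nothing is registered. */
static const struct demo_slot_logger *demo_lookup(size_t family,
                                                  enum demo_log_type type)
{
    if (family >= DEMO_FAMILIES || type >= DEMO_TYPE_MAX)
        return NULL;
    return demo_slots[family][type];
}
```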

    This prepares a follow up patch that consolidates the nf_log_packet()
    interface by allowing to specify the logger as parameter.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • This has been marked as deprecated for quite some time and the NFLOG
    target replacement has been also available since 2006.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • This brings the (per-conntrack) ecache extension back to 24 bytes in size
    (was 152 bytes on x86_64 with lockdep on).

    When event delivery fails, re-delivery is attempted via work queue.

    Redelivery is attempted at least every 0.1 seconds, but can happen
    more frequently if userspace is not congested.

    The nf_ct_release_dying_list() function is removed.
    With this patch, ownership of the to-be-redelivered conntracks
    (on-dying-list-with-DYING-bit not yet set) is with the work queue,
    which will release the references once event is out.

    Joint work with Pablo Neira Ayuso.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

25 Jun, 2014

3 commits


24 Jun, 2014

11 commits

  • Signed-off-by: Rasmus Villemoes
    Signed-off-by: David S. Miller

    Rasmus Villemoes
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Govindarajulu Varadarajan says:

    ====================
    enic updates

    This series fixes minor bugs and adds new features like Accelerated RFS,
    busy_poll, tx clean-up in napi_poll.

    v3:
    * While doing tx cleanup in napi, ignore budget and clean up all desc possible.

    v2:
    * Fix #ifdef coding style issue in '[PATCH 4/8] enic: alloc/free rx_cpu_rmap'
    and '[PATCH 5/8] enic: Add Accelerated RFS support'
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Until now, enic has been doing tx cleanup in the isr.

    Use the napi infrastructure to move the tx cleanup out of the isr and into softirq.
    Now the wq isr schedules napi poll. In enic_poll_msix_wq we clean up the tx queues.

    This is applicable only on MSIX. In INTx and MSI we use single napi to clean
    both rx & tx queues.

    Signed-off-by: Govindarajulu Varadarajan
    Signed-off-by: David S. Miller

    Govindarajulu Varadarajan
     
  • This patch adds support for low latency busy_poll.

    * Introduce the driver's ndo_busy_poll function enic_busy_poll, which is called by
    a socket waiting for data.

    * Introduce locking between napi_poll and busy_poll

    * enic_busy_poll cleans up all the rx pkts possible. While in busy_poll, rq
    holds the state ENIC_POLL_STATE_POLL. While in napi_poll, rq holds the state
    ENIC_POLL_STATE_NAPI.

    * In napi_poll we return if we are in busy_poll. In case of INTx & MSI-X, we just
    service the wq and return if busy_poll is going on.
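
    The ownership rule in the list above can be sketched as a tiny state
    machine. This is a single-threaded userspace model with hypothetical
    names; a real driver would have to protect the state with a spinlock
    or atomic operations, which the sketch deliberately leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the idle, ENIC_POLL_STATE_NAPI and
 * ENIC_POLL_STATE_POLL states of a receive queue. */
enum demo_poll_state { DEMO_STATE_IDLE, DEMO_STATE_NAPI, DEMO_STATE_POLL };

struct demo_rq { enum demo_poll_state state; };

/* Try to claim the rq for owner; returns 1 on success, 0 if it is taken. */
static int demo_lock(struct demo_rq *rq, enum demo_poll_state owner)
{
    assert(rq != NULL && owner != DEMO_STATE_IDLE);
    if (rq->state != DEMO_STATE_IDLE)
        return 0;
    rq->state = owner;
    return 1;
}

static void demo_unlock(struct demo_rq *rq)
{
    assert(rq != NULL);
    rq->state = DEMO_STATE_IDLE;
}

/* napi path: process `pending` rx packets, or back off while busy_poll owns the rq. */
static int demo_napi_poll(struct demo_rq *rq, int pending)
{
    if (!demo_lock(rq, DEMO_STATE_NAPI))
        return 0;               /* busy_poll in progress: do no rx work */
    demo_unlock(rq);
    return pending;
}
```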

    Signed-off-by: Govindarajulu Varadarajan
    Signed-off-by: David S. Miller

    Govindarajulu Varadarajan
     
  • We were experiencing occasional "BUG: scheduling while atomic" splats
    in our testing. Enabling DEBUG_SPINLOCK and DEBUG_LOCKDEP in the kernel
    exposed a lockdep in the enic driver.

    enic 0000:0b:00.0 eth2: Link UP

    ======================================================
    [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
    3.12.0-rc1.x86_64-dbg+ #2 Tainted: GF W
    ------------------------------------------------------
    NetworkManager/4209 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
    (&(&enic->devcmd_lock)->rlock){+.+...}, at: [] enic_dev_packet_filter+0x44/0x90 [enic]

    The fix was to replace spin_lock with spin_lock_bh for the enic
    devcmd_lock, so that soft irqs would be disabled while the lock
    is held.

    Signed-off-by: Sujith Sankar
    Signed-off-by: Tony Camuso
    Signed-off-by: Govindarajulu Varadarajan
    Signed-off-by: David S. Miller

    Tony Camuso
     
  • This patch adds support for Accelerated Receive Flow Steering.

    When the desired rx is different from the current rq for a flow, the kernel calls the
    driver function enic_rx_flow_steer(). enic_rx_flow_steer adds an IP-TCP/UDP
    hardware filter.

    The driver registers a timer function enic_flow_may_expire. This function is called
    every HZ/4 jiffies (a quarter of a second). In this function we check if the added
    filter has expired by calling rps_may_expire_flow(). If the flow has expired, it
    removes the hw filter.

    As of now adaptor supports only IPv4 - TCP/UDP filters.

    Signed-off-by: Govindarajulu Varadarajan
    Signed-off-by: David S. Miller

    Govindarajulu Varadarajan
     
  • rx_cpu_rmap provides the reverse irq cpu affinity. This patch allocates and
    sets the driver's netdev->rx_cpu_rmap accordingly.

    rx_cpu_rmap is set in enic_request_intr() which is called by enic_open and
    rx_cpu_rmap is freed in enic_free_intr() which is called by enic_stop.

    This is used by Accelerated RFS.

    Signed-off-by: Govindarajulu Varadarajan
    Signed-off-by: David S. Miller

    Govindarajulu Varadarajan
     
  • This patch adds interface to add and delete IP 5 tuple filter. This interface
    is used by Accelerated RFS code to steer a flow to corresponding receive
    queue.

    As of now adaptor supports only ipv4 + tcp/udp packet steering.

    Signed-off-by: Govindarajulu Varadarajan
    Signed-off-by: David S. Miller

    Govindarajulu Varadarajan
     
  • Hardware (in readq(&devcmd->args[0])) returns a positive number in case of error.
    But _vnic_dev_cmd should return a negative value in case of error.
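
    A minimal sketch of the sign convention, with a hypothetical helper
    that is not the driver's code: the positive value reported by the
    hardware is negated before being handed back to the caller.

```c
#include <assert.h>
#include <stdint.h>

/*
 * hw_failed: nonzero when the device flagged the command as failed.
 * hw_arg0:   the positive error number the device left in its first argument.
 * Returns 0 on success or a negative error number on failure.
 */
static int demo_cmd_result(int hw_failed, uint64_t hw_arg0)
{
    if (!hw_failed)
        return 0;
    /* The sketch assumes a small positive error number from the hardware. */
    assert(hw_arg0 > 0 && hw_arg0 <= INT32_MAX);
    return -(int)hw_arg0;
}
```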

    Signed-off-by: Govindarajulu Varadarajan
    Signed-off-by: David S. Miller

    Govindarajulu Varadarajan
     
  • skb_flow_dissect() dissects only the transport header type in ip_proto. It does not
    give any information about IPv4 or IPv6.

    This patch adds a new member, n_proto, to struct flow_keys, which records the
    IP layer type, i.e. IPv4 or IPv6.

    This can be used in netdev->ndo_rx_flow_steer driver function to dissect flow.

    Adding a new member to flow_keys increases the struct size by around 4 bytes.
    This causes BUILD_BUG_ON(sizeof(qcb->data) < sz); to fail in
    qdisc_cb_private_validate().

    So increase the data size by 4 bytes.
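
    Why one 2-byte member can cost 4 bytes is easiest to see with a
    sketch. The two structs below are hypothetical layouts that only
    resemble flow_keys, and DEMO_CB_DATA_LEN is an assumed value; on
    common ABIs where uint32_t is 4-byte aligned, the new member pushes
    the struct past a padding boundary. The static_assert plays the role
    of BUILD_BUG_ON here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout before the change. */
struct demo_keys_old {
    uint32_t src, dst, ports;
    uint16_t thoff;
    uint8_t  ip_proto;
};

/* Hypothetical layout after adding a 16-bit n_proto member. */
struct demo_keys_new {
    uint32_t src, dst, ports;
    uint16_t thoff;
    uint16_t n_proto;
    uint8_t  ip_proto;
};

#define DEMO_CB_DATA_LEN 20   /* assumed size of the private data area */
struct demo_cb { unsigned char data[DEMO_CB_DATA_LEN]; };

/* Compile-time guard: the build fails if the keys no longer fit. */
static_assert(sizeof(struct demo_cb) >= sizeof(struct demo_keys_new),
              "private data area too small for the flow keys");

/* Growth in bytes caused by the new member, padding included. */
static size_t demo_keys_growth(void)
{
    return sizeof(struct demo_keys_new) - sizeof(struct demo_keys_old);
}
```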

    Signed-off-by: Govindarajulu Varadarajan
    Signed-off-by: David S. Miller

    Govindarajulu Varadarajan
     

23 Jun, 2014

11 commits


22 Jun, 2014

3 commits

  • tcf_ematch is allocated by kzalloc in function tcf_em_tree_validate(),
    so cm_old is always NULL.

    Signed-off-by: Duan Jiong
    Signed-off-by: David S. Miller

    Duan Jiong
     
  • Use list_for_each_entry_continue_reverse to roll back in fdb_add_hw
    when adding an address fails.
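
    The rollback pattern can be sketched in userspace with an array in
    place of the kernel list. All names are hypothetical. The backward
    walk starts at the entry before the one that failed, which is also
    where list_for_each_entry_continue_reverse resumes.

```c
#include <assert.h>
#include <stddef.h>

struct demo_port {
    int fail;         /* set to make the add fail on this port */
    int programmed;   /* 1 once the address is installed */
};

static int demo_hw_add(struct demo_port *p)
{
    if (p->fail)
        return -1;
    p->programmed = 1;
    return 0;
}

static void demo_hw_del(struct demo_port *p)
{
    p->programmed = 0;
}

/* Install the address on every port, or on none of them if any add fails. */
static int demo_add_all(struct demo_port *ports, size_t n)
{
    size_t i;

    assert(ports != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (demo_hw_add(&ports[i]) != 0)
            goto rollback;
    }
    return 0;

rollback:
    /* Undo in reverse order, starting just before the failed entry. */
    while (i-- > 0)
        demo_hw_del(&ports[i]);
    return -1;
}
```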

    Signed-off-by: Li RongQing
    Signed-off-by: David S. Miller

    Li RongQing
     
  • Jeff Kirsher says:

    ====================
    Intel Wired LAN Driver Updates 2014-06-20

    This series contains updates to i40e and i40evf.

    Anjali provides an update to the registers to handle the updates from the
    hardware. Also provides a fix so that we do not try to access the rings
    through the qvectors at the time of freeing the qvectors.

    Jesse provides a workaround for some older NVM versions where the NVM
    was not filling in the GLQF_HKEY register, so made sure that the
    critical register is initialized.

    Michal provides a fix to reset the head and tail on admin queue
    initialization where head and tail are not reset by the hardware.

    Neerav adds a helper routine that would wait for the Rx/Tx queue to reach
    the enable or disable state that is requested. Also provides a fix
    to the debugfs command "lldp get remote" which was dumping the local
    LLDPDU instead of the peer's LLDPDU. Fixed a bug when all the Tx hang
    recovery mechanisms have failed and the driver tries to bring down the
    interface in the interrupt context.

    Shannon provides a patch to clear the Virtual Ethernet Bridge (VEB) stats
    when the PF stats are cleared. Also cleans the service tasks so that
    they do not run while a reset is in progress.

    Mitch fixes an issue in i40evf_get_rxfh() where only fifteen registers
    were being read instead of all sixteen.

    Carolyn provides a change to the RSS configuration to set table size and
    write to the hardware to confirm the RSS table size being used.

    Kamil makes a change to the admin queue debug prints so that they will not
    cause segmentation faults in some of our tool applications.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

20 Jun, 2014

3 commits