03 Oct, 2020

7 commits

  • drivers/s390/net/ctcm_fsms.h: fsm_action_nop - only declaration left
    after commit 04885948b101 ("ctc: removal of the old ctc driver")

    drivers/s390/net/ctcm_mpc.h: ctcmpc_open - only declaration left after
    commit 293d984f0e36 ("ctcm: infrastructure for replaced ctc driver")

    Reviewed-by: Julian Wiedmann
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Vasily Gorbik
     
  • - Add/delete some blanks, white spaces and braces.
    - Fix misindentations.
    - Adjust a deprecated header include, and htons() conversion.
    - Remove extra 'return' statements.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Replace our custom version of netdev_name().

    Once we started to allocate the netdev at probe time with
    commit d3d1b205e89f ("s390/qeth: allocate netdevice early"), this
    stopped working as intended anyway.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • The discipline struct is a fixed group of function pointers.
    So declare the L2 and L3 disciplines as constant.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
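
The idea of a constant discipline table can be sketched in user-space C roughly as follows. This is a minimal illustration, not the driver's code: `struct card`, `struct discipline`, `l2_discipline` and `bring_online()` are hypothetical, simplified stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the driver's real types. */
struct card {
	int online;
};

/* A fixed group of function pointers, one instance per discipline. */
struct discipline {
	const char *name;
	int (*set_online)(struct card *card);
	void (*set_offline)(struct card *card);
};

static int l2_set_online(struct card *card)
{
	card->online = 1;
	return 0;
}

static void l2_set_offline(struct card *card)
{
	card->online = 0;
}

/* 'const' allows the table to be placed in read-only data. */
const struct discipline l2_discipline = {
	.name = "layer2",
	.set_online = l2_set_online,
	.set_offline = l2_set_offline,
};

/* Callers only hold a pointer-to-const and cannot patch the ops. */
int bring_online(const struct discipline *disc, struct card *card)
{
	assert(disc != NULL && card != NULL);
	return disc->set_online(card);
}
```

Since the table never changes after build time, marking it `const` documents that intent and lets the compiler reject accidental writes to the function pointers.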
     
  • For OSA devices that are _not_ configured in prio-queue mode, give users
    the option of selecting the number of active TX queues.
    This requires setting up the HW queues with a reasonable default QoS
    value in the QIB's PQUE parm area.

    As with the other device types, we bring up the device with a minimal
    number of TX queues for compatibility reasons.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Use a proper struct, and only program the QIB extensions for devices
    where they are supported.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • When re-initializing a device, we can hit a situation where
    qeth_osa_set_output_queues() detects that it supports more or less
    HW TX queues than before. Right now we adjust dev->real_num_tx_queues
    from right there, but
    1. it's getting more & more complicated to cover all cases, and
    2. we can't re-enable the actually expected number of TX queues later
    because we lost the needed information.

    So keep track of the wanted TX queues (on initial setup, and whenever
    it's changed via .set_channels), and later use that information when
    re-enabling the netdevice.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
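
One way to picture the bookkeeping described in this commit is the user-space sketch below. All names (`struct txq_state`, `txq_set_wanted()`, and so on) are hypothetical, and taking the minimum of the wanted and supported counts is an assumption made for illustration; the commit only states that the wanted count is remembered and used when re-enabling the netdevice.

```c
#include <assert.h>

/* Hypothetical, simplified device state; field names are illustrative. */
struct txq_state {
	unsigned int wanted;    /* what initial setup or the user asked for */
	unsigned int supported; /* what the HW currently offers */
	unsigned int active;    /* what is actually enabled right now */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Called on initial setup and from a set_channels-style callback. */
void txq_set_wanted(struct txq_state *s, unsigned int wanted)
{
	assert(wanted > 0);
	s->wanted = wanted;
}

/* Called when (re-)initialization detects the HW queue count. */
void txq_set_supported(struct txq_state *s, unsigned int supported)
{
	assert(supported > 0);
	s->supported = supported;
}

/* Called when re-enabling the device: derive 'active' from both inputs. */
unsigned int txq_enable(struct txq_state *s)
{
	s->active = min_uint(s->wanted, s->supported);
	return s->active;
}
```

Because `wanted` survives a re-initialization that temporarily offers fewer queues, a later re-initialization with more queues can restore the originally requested count.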
     

29 Sep, 2020

2 commits


24 Sep, 2020

9 commits

  • Shuffle some code around (primarily all the discipline-related stuff) to
    get rid of all the unnecessary forward declarations.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Clarify which discipline-specific steps are needed to roll back after
    error in qeth_l?_set_online(), and which are common to roll back
    from qeth_hardsetup_card().

    Some steps (cancelling the RX modeset, draining the TX queues) are only
    necessary if the netdev was potentially UP before, so move them to the
    common qeth_set_offline().

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Move duplicated code from the disciplines into the core path.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Originators of cmd IO typically hold the rtnl or conf_mutex to protect
    against a concurrent teardown.
    Since qeth_set_offline() already holds the conf_mutex, the main reason
    why we still care about cancelling pending cmds is so that they release
    the rtnl when we need it ourselves.

    So move this step a little earlier into the teardown sequence.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • The programming of ucast IPs via qeth_l3_modify_ip() is driven
    independently from any of our typical locking mechanisms (eg. detaching
    the netdevice, or holding the conf_mutex).
    So when we inspect the card state to check whether the required cmd IO
    should be deferred, there is no protection against concurrent state
    changes.

    But by slightly re-ordering the teardown sequence, we can rely on the
    ip_lock to sufficiently serialize things:

    1. when running concurrently to qeth_l3_set_online(), any instance of
    qeth_l3_modify_ip() that acquires the ip_lock _after_
    qeth_l3_recover_ip() will observe the state as CARD_STATE_SOFTSETUP
    and not defer the IO.
    2. when running concurrently to qeth_l3_set_offline(), any instance of
    qeth_l3_modify_ip() that acquires the ip_lock _after_
    qeth_l3_clear_ip_htable() will observe the state as CARD_STATE_DOWN
    and defer the IO.

    With these guarantees in mind, we can now drop the conf_mutex from the
    qeth_l3_modify_rxip_vipa() wrapper.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Convert the remaining occurrences in sysfs code to kstrtouint().

    While at it move some input parsing out of locked sections, replace an
    open-coded clamp() and remove some unnecessary run-time checks for
    ipatoe->mask_bits that are already enforced when creating the object.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
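
The pattern this commit describes (strict parsing first, outside any lock, then clamping into the accepted range) might look like this in user space. `parse_uint()` is only an approximation of a strict parser built on `strtoul()`, not the kernel's `kstrtouint()`; `clamp_uint()`, `store_limit()` and the range 1..128 are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Strict unsigned parser: the whole string must be a number (one trailing
 * newline is tolerated). Returns 0 on success or a negative errno value.
 */
int parse_uint(const char *s, int base, unsigned int *res)
{
	char *end;
	unsigned long val;

	assert(s != NULL && res != NULL);
	/* strtoul() would silently accept these, so reject them up front. */
	if (isspace((unsigned char)*s) || *s == '-')
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, base);
	if (end == s)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (errno == ERANGE || val > UINT_MAX)
		return -ERANGE;

	*res = (unsigned int)val;
	return 0;
}

/* Replacement for an open-coded "if below min ... else if above max". */
unsigned int clamp_uint(unsigned int val, unsigned int lo, unsigned int hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/* Parse before taking any lock; only the final update needs protection. */
int store_limit(const char *buf, unsigned int *limit)
{
	unsigned int val;
	int rc = parse_uint(buf, 10, &val);

	if (rc)
		return rc;
	*limit = clamp_uint(val, 1, 128); /* hypothetical accepted range */
	return 0;
}
```

Rejecting malformed input before entering the locked section keeps the critical section short and makes the error path trivially lock-free.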
     
  • Indicate the max number of to-be-parsed characters, and avoid copying
    the address sub-string.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • card->ipato is currently protected by the conf_mutex. But most users
    also hold the ip_lock - in particular qeth_l3_add_ip().

    So slightly expand the sections under ip_lock in a few places (to
    effectively cover a few error & no-op cases), and then drop the
    conf_mutex where it's no longer needed.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • mcast IP objects are allocated within qeth_l3_add_mcast_rtnl(),
    with .ref_counter already set to 1 via qeth_l3_init_ipaddr().

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     

23 Sep, 2020

3 commits

  • Two minor conflicts:

    1) net/ipv4/route.c, adding a new local variable while
    moving another local variable and removing its
    initial assignment.

    2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
    One pretty prints the port mode differently, whilst another
    changes the driver to try and obtain the port mode from
    the port node rather than the switch node.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull networking fixes from Jakub Kicinski:

    - fix failure to add bond interfaces to a bridge, the offload-handling
    code was too defensive there and recent refactoring unearthed that.
    Users complained (Ido)

    - fix unnecessarily reflecting ECN bits within TOS values / QoS marking
    in TCP ACK and reset packets (Wei)

    - fix a deadlock with bpf iterator. Hopefully we're in the clear on
    this front now... (Yonghong)

    - BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel)

    - fix AQL on mt76 devices with FW rate control and address a couple of
    AQL issues in mac80211 code (Felix)

    - fix authentication issue with mwifiex (Maximilian)

    - WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro)

    - fix exception handling for multipath routes via same device (David
    Ahern)

    - revert back to a BH spin lock flavor for nsid_lock: there are paths
    which do require the BH context protection (Taehee)

    - fix interrupt / queue / NAPI handling in the lantiq driver (Hauke)

    - fix ife module load deadlock (Cong)

    - make an adjustment to netlink reply message type for code added in
    this release (the sole change touching uAPI here) (Michal)

    - a number of fixes for small NXP and Microchip switches (Vladimir)

    [ Pull request acked by David: "you can expect more of this in the
    future as I try to delegate more things to Jakub" ]

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits)
    net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
    net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
    net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
    inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
    net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
    net: Update MAINTAINERS for MediaTek switch driver
    net/mlx5e: mlx5e_fec_in_caps() returns a boolean
    net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
    net/mlx5e: kTLS, Fix leak on resync error flow
    net/mlx5e: kTLS, Add missing dma_unmap in RX resync
    net/mlx5e: kTLS, Fix napi sync and possible use-after-free
    net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
    net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
    net/mlx5e: Fix multicast counter not up-to-date in "ip -s"
    net/mlx5e: Fix endianness when calculating pedit mask first bit
    net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
    net/mlx5e: CT: Fix freeing ct_label mapping
    net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
    net/mlx5e: Use synchronize_rcu to sync with NAPI
    net/mlx5e: Use RCU to protect rq->xdp_prog
    ...

    Linus Torvalds
     
  • Pull block fixes from Jens Axboe:
    "A few NVMe fixes, and a dasd write zero fix"

    * tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
    nvmet: get transport reference for passthru ctrl
    nvme-core: get/put ctrl and transport module in nvme_dev_open/release()
    nvme-tcp: fix kconfig dependency warning when !CRYPTO
    nvme-pci: disable the write zeros command for Intel 600P/P3100
    s390/dasd: Fix zero write for FBA devices

    Linus Torvalds
     

16 Sep, 2020

7 commits

  • Documentation/networking/switchdev.txt and 'man bridge' indicate that the
    learning_sync bridge attribute is used to control whether a given
    device will sync MAC addresses learned on its device port to a master
    bridge FDB, where they will show up as 'extern_learn offload'. So we map
    qeth_l2_dev2br_an_set() to the learning_sync bridge link attribute.

    Turning off learning_sync will flush all extern_learn entries from the
    bridge fdb and all pending events from the card's work queue.

    When the hardware interface goes offline with learning_sync on
    (e.g. for HW recovery), all extern_learn entries will be flushed from the
    bridge fdb and all pending events from the card's work queue. When the
    interface goes online again, it will send new notifications for all then
    valid MACs. learning_sync attribute can not be modified while interface is
    offline. See
    'commit e6e771b3d897 ("s390/qeth: detach netdevice while card is offline")'

    An alternative implementation would be to always offload the 'learning'
    attribute of a software bridge to the hardware interface attached to it
    and thus implicitly enable fdb notification. This was not chosen for 2
    reasons:
    1) In our case the software bridge is NOT a representation of a hardware
    switch. It is just connected to a smart NIC that is able to inform
    about the addresses attached to it. It is not necessarily using source
    MAC learning for this and other bridgeports can be attached to other
    NICs with different properties.
    2) We want a means to enable this notification explicitly. There may be
    cases where a bridgeport is set to 'learning', but we do not want to
    enable the notification.

    Signed-off-by: Alexandra Winter
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Alexandra Winter
     
  • Documentation/networking/switchdev.txt and 'man bridge' indicate that the
    learning_sync bridge attribute is used to indicate whether a given
    device will sync MAC addresses learned on its device port to a master
    bridge FDB.

    learning_sync attribute can not be read while interface is offline (down).
    See
    'commit e6e771b3d897 ("s390/qeth: detach netdevice while card is offline")'
    We return EOPNOTSUPP and not ENODEV in this case, because EOPNOTSUPP is the
    only rc that is tolerated by 'bridge -d link show'.

    Signed-off-by: Alexandra Winter
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Alexandra Winter
     
  • In case hardware sends more device-to-bridge-address-change notifications
    than the qeth-l2 driver can handle, the hardware will send an overflow
    event and then stop sending any events. It expects software to flush its
    FDB and start over again. Re-enabling address-change-notification will
    report all current addresses.

    In order to re-enable address-change-notification this patch defines
    the functions qeth_l2_dev2br_an_set() and qeth_l2_dev2br_an_set_cb
    to enable or disable dev-to-bridge-address-notification.

    A following patch will use the learning_sync bridgeport flag to trigger
    enabling or disabling of address-change-notification, so we define
    priv->brport_features to store the current setting. BRIDGE_INFO and
    ADDR_INFO functionality are mutually exclusive, whereas ADDR_INFO and
    qeth_l2_vnicc* can be used together.

    Alternative implementations to handle buffer overflow:
    Just re-enabling notification and adding all newly reported addresses
    would cover any lost 'add' events, but not the lost 'delete' events.
    Then these invalid addresses would stay in the bridge FDB as long as the
    device exists.
    Setting the net device down and up, would be an alternative, but is a bit
    drastic. If the net device has many secondary addresses this will create
    many delete/add events at its peers which could de-stabilize the
    network segment.

    Signed-off-by: Alexandra Winter
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Alexandra Winter
     
  • A qeth-l2 HiperSockets card can show switch-ish behaviour in the sense,
    that it can report all MACs that are reachable via this interface. Just
    like a switch device, it can notify the software bridge about changes
    to its fdb. This patch exploits this device-to-bridge-notification and
    extracts the relevant information from the hardware events to generate
    notifications to an attached software bridge.

    There are 2 sources for this information:
    1) The reply message of Perform-Network-Subchannel-Operations (PNSO)
    (operation code ADDR_INFO) reports all addresses that are currently
    reachable (implemented in a later patch).
    2) As long as device-to-bridge-notification is enabled, hardware will
    generate address change notification events, whenever the content of
    the hardware fdb changes (this patch).

    The bridge_hostnotify feature (PNSO operation code BRIDGE_INFO) uses
    the same address change notification events. We need to distinguish
    between qeth_pnso_mode QETH_PNSO_BRIDGEPORT and QETH_PNSO_ADDR_INFO
    and call a different handler. In both cases deadlocks must be
    prevented, if the workqueue is drained under lock and QETH_PNSO_NONE,
    when notification is disabled.

    bridge_hostnotify generates udev events, there is no intent to do the same
    for dev2br. Instead this patch will generate SWITCHDEV_FDB_ADD_TO_BRIDGE
    and SWITCHDEV_FDB_DEL_TO_BRIDGE notifications, that will cause the
    software bridge to add (or delete) entries to its fdb as 'extern_learn
    offload'.

    Documentation/networking/switchdev.txt proposes to add
    "depends NET_SWITCHDEV" to driver's Kconfig. This is not done here,
    so even in absence of the NET_SWITCHDEV module, the QETH_L2 module will
    still be built, but then the switchdev notifiers will have no effect.

    No VLAN filtering is done on the entries and VLAN information is not
    passed on to the bridge fdb entries. This could be added later.
    For now VLAN interfaces can be defined on the upper bridge interface.

    Multicast entries are not passed on to the bridge fdb.
    This could be added later. For now mcast flooding can be used in the
    bridge.

    The card reports all MACs that are in its FDB, but we must not pass on
    MACs that are registered for this interface.

    Signed-off-by: Alexandra Winter
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Alexandra Winter
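
The filtering rules stated in this commit (no multicast entries, no MACs registered for this interface) can be sketched as a small predicate. `struct addr_entry` and its `own` flag are hypothetical simplifications; how the driver actually recognizes its own entries is not described here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LEN 6

/* Hypothetical, simplified address entry as reported by the card. */
struct addr_entry {
	uint8_t mac[MAC_LEN];
	bool own; /* assumed flag: address is registered for this interface */
};

/* The multicast (group) bit is the least-significant bit of octet 0. */
bool mac_is_multicast(const uint8_t mac[MAC_LEN])
{
	return (mac[0] & 0x01) != 0;
}

/* Decide whether an entry should become a bridge fdb notification. */
bool should_notify_bridge(const struct addr_entry *e)
{
	assert(e != NULL);
	if (mac_is_multicast(e->mac))
		return false; /* mcast entries are not passed on */
	if (e->own)
		return false; /* our own MACs must not be passed on */
	return true;
}
```

Only entries that pass this check would then be turned into add or delete notifications for the software bridge.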
     
  • This patch detects whether device-to-bridge-notification, provided
    by the Perform Network Subchannel Operation (PNSO) operation code
    ADDR_INFO (OC3), is supported by this card. A following patch will
    map this to the learning_sync bridgeport flag, so we store it in
    priv->brport_hw_features in bridgeport flag format.

    Only IQD cards provide PNSO.
    There is a feature bit to indicate whether the machine provides OC3,
    unfortunately it is not set on old machines.
    So PNSO is called to find out. As this will disable notification
    and is exclusive with bridgeport_notification, this must be done
    during card initialisation before previous settings are restored.

    PNSO functionality requires some configuration values that are added to
    the qeth_card.info structure. Some helper functions are defined to fill
    them out when the card is brought online and some other places are
    adapted, that can also benefit from these fields.

    Signed-off-by: Alexandra Winter
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Alexandra Winter
     
  • Add helper functions to expose Channel Subsystem ID (CSSID), MIF Image Id
    (IID), Channel ID (CHID) and Channel Path ID (CHPID).
    These values are required by the qeth driver's exploitation of network-
    address-change-notifications to determine which entries belong to this
    interface.

    Store the Partition identifier in System log, as this may be used to map
    a Linux view to a Hardware view for debugging purpose.

    Signed-off-by: Alexandra Winter
    Reviewed-by: Vineeth Vijayan
    Signed-off-by: Julian Wiedmann
    Acked-by: Heiko Carstens
    Signed-off-by: David S. Miller

    Alexandra Winter
     
  • Add support for operation code 3 (OC3) of the
    Perform-Network-Subchannel-Operations (PNSO) function
    of the Channel-Subsystem-Call (CHSC) instruction.

    PNSO provides 2 operation codes:
    OC0 - BRIDGE_INFO
    OC3 - ADDR_INFO (new)

    Extend the function calls to *pnso* to pass the OC and
    add new response code 0108.

    Support for OC3 is indicated by a flag in the css_general_characteristics.

    Signed-off-by: Alexandra Winter
    Reviewed-by: Julian Wiedmann
    Reviewed-by: Peter Oberparleiter
    Reviewed-by: Vineeth Vijayan
    Signed-off-by: Julian Wiedmann
    Acked-by: Heiko Carstens
    Signed-off-by: David S. Miller

    Alexandra Winter
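
As a small illustration of passing the operation code explicitly, the two codes listed in this commit could be modelled as an enum. The enumerator and function names below are illustrative and not necessarily those used in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Operation codes as listed in the commit; names are illustrative. */
enum pnso_oc {
	PNSO_OC_BRIDGE_INFO = 0,
	PNSO_OC_ADDR_INFO = 3,
};

/* Only OC0 and OC3 are defined; everything else is rejected. */
bool pnso_oc_valid(int oc)
{
	return oc == PNSO_OC_BRIDGE_INFO || oc == PNSO_OC_ADDR_INFO;
}

/* Human-readable name, e.g. for debug output. */
const char *pnso_oc_name(enum pnso_oc oc)
{
	assert(pnso_oc_valid(oc));
	switch (oc) {
	case PNSO_OC_BRIDGE_INFO:
		return "BRIDGE_INFO";
	case PNSO_OC_ADDR_INFO:
		return "ADDR_INFO";
	}
	return NULL;
}
```

A caller that previously assumed BRIDGE_INFO implicitly would now pass one of these values along with its request.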
     

15 Sep, 2020

1 commit

  • A discard request that writes zeros using the global kernel internal
    ZERO_PAGE will fail for machines with more than 2GB of memory due to the
    location of the ZERO_PAGE.

    Fix this by using a driver owned global zero page allocated with GFP_DMA
    flag set.

    Fixes: 28b841b3a7cb ("s390/dasd: Add discard support for FBA devices")
    Signed-off-by: Jan Höppner
    Reviewed-by: Stefan Haberland
    Cc: # 4.14+
    Signed-off-by: Jens Axboe

    Jan Höppner
     

14 Sep, 2020

1 commit

  • Tests showed that under stress conditions the kernel may
    temporarily fail to allocate 256k with kmalloc. This fix
    therefore reworks the related code in the cca_findcard2()
    function to use kvmalloc instead.

    Signed-off-by: Harald Freudenberger
    Reviewed-by: Ingo Franzki
    Cc: Stable
    Signed-off-by: Vasily Gorbik

    Harald Freudenberger
     

11 Sep, 2020

2 commits

  • arch/s390/net/pnet.c uses ccwgroup function dev_is_ccwgroup()
    in pnetid_by_dev_port().
    For s390 the net/smc code makes use of function pnetid_by_dev_port().
    Make sure ccwgroup is built into the kernel, if smc is to be built
    into the kernel.

    Signed-off-by: Guvenc Gulce
    Reviewed-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Guvenc Gulce
     
  • Wait until the QDIO data connection is severed. Otherwise the device
    might still be processing the buffers, and end up accessing skb data
    that we already freed.

    Fixes: 8b5026bc1693 ("s390/qeth: fix qdio teardown after early init error")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     

05 Sep, 2020

1 commit

  • We got slightly different patches removing a double word
    in a comment in net/ipv4/raw.c - picked the version from net.

    Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
    values instead of VNIC login response buffer (following what
    commit 507ebe6444a4 ("ibmvnic: Fix use-after-free of VNIC login
    response buffer") did).

    Signed-off-by: Jakub Kicinski

    Jakub Kicinski
     

27 Aug, 2020

7 commits

  • The current code for bridge address events has two shortcomings in its
    control sequence:

    1. after disabling address events via PNSO, we don't flush the remaining
    events from the event_wq. So if the feature is re-enabled fast
    enough, stale events could leak over.
    2. PNSO and the events' arrival via the READ ccw device are unordered.
    So even if we flushed the workqueue, it's difficult to say whether
    the READ device might produce more events onto the workqueue
    afterwards.

    Fix this by
    1. explicitly fencing off the events when we no longer care, in the
    READ device's event handler. This ensures that once we flush the
    workqueue, it doesn't get additional address events.
    2. Flush the workqueue after disabling the events & fencing them off.
    As the code that triggers the flush will typically hold the sbp_lock,
    we need to rework the worker code to avoid a deadlock here in case
    of a 'notifications-stopped' event. In case of lock contention,
    requeue such an event with a delay. We'll eventually acquire the lock,
    or spot that the feature has been disabled and the event can thus be
    discarded.

    This leaves the theoretical race that a stale event could arrive
    _after_ we re-enabled ourselves to receive events again. Such an event
    would be impossible to distinguish from a 'good' event, nothing we can
    do about it.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Signed-off-by: David S. Miller

    Julian Wiedmann
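
The fence-then-flush scheme from this commit can be modelled in a single-threaded user-space sketch. Everything here is a simplification under assumptions: `struct port`, the boolean standing in for a contended lock, and the function names are hypothetical, and real code would use an actual lock and workqueue.

```c
#include <assert.h>
#include <stdbool.h>

enum event_result { EVENT_HANDLED, EVENT_DISCARDED, EVENT_REQUEUED };

/* Hypothetical, simplified model; not the driver's real state. */
struct port {
	bool events_enabled; /* the fence: cleared when we stop caring */
	bool lock_held;      /* stands in for a contended lock */
	unsigned int handled;
};

/* Non-blocking lock attempt, like a mutex trylock. */
static bool port_trylock(struct port *p)
{
	if (p->lock_held)
		return false;
	p->lock_held = true;
	return true;
}

static void port_unlock(struct port *p)
{
	p->lock_held = false;
}

/* Event handler side: drop events at the source once fenced off. */
bool port_accept_event(const struct port *p)
{
	return p->events_enabled;
}

/* Worker side: never block on the lock, the flusher may be holding it. */
enum event_result port_process_event(struct port *p)
{
	if (!p->events_enabled)
		return EVENT_DISCARDED;
	if (!port_trylock(p))
		return EVENT_REQUEUED; /* caller re-queues with a delay */
	p->handled++;
	port_unlock(p);
	return EVENT_HANDLED;
}
```

The key property is that the worker never sleeps on the lock: whoever holds it while flushing the workqueue cannot be deadlocked by a worker waiting for that same lock.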
     
  • The data returned from IPA_SBP_QUERY_BRIDGE_PORTS and
    IPA_SBP_BRIDGE_PORT_STATE_CHANGE has the same format. Use a single
    struct definition for it.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Current code copies _all_ entries from the event into a worker, when we
    later only need specific data from the first entry.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • The only time that our Bridgeport role should change is when we change
    the configuration ourselves. In which case we also adjust our internal
    state tracking, no need to do it again when we receive the corresponding
    event.

    Removing the locked section helps a subsequent patch that needs to flush
    the workqueue while under sbp_lock.

    It would be nice to raise a warning here in case HW does weird things
    after all, but this could end up generating false-positives when we
    change the configuration ourselves.

    Suggested-by: Alexandra Winter
    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • A newly initialized device is disabled for address events, there's no
    need to explicitly disable them.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • queue->state is a ternary spinlock in disguise, used by
    OSA's TX completion path to lock the Output Queue and flush any pending
    packets on it to the device. If the Queue is already locked by our TX
    code, setting the lock word to QETH_OUT_Q_LOCKED_FLUSH lets the TX
    completion code move on - the TX path will later take care of things
    when it unlocks the Queue.

    This sort of DIY locking is a non-starter of course, just let the
    TX completion path block on the spinlock when necessary. If that ends up
    causing additional latency due to lock contention, then converting
    the OSA path to use xmit_more is the right way to go forward.

    Also slightly expand the locked section and capture all of
    qeth_do_send_packet(), so that the update for the 'bufs_pack' statistics
    is done race-free.

    While reworking the TX completion path's code, remove a barrier() that
    doesn't make any sense.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
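
To make the "ternary spinlock in disguise" concrete, here is a user-space model of such a three-state lock word, written with C11 atomics. It is an illustrative reconstruction of the kind of scheme this commit removes, not the removed code itself; all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names for the three states of the lock word. */
enum { Q_UNLOCKED = 0, Q_LOCKED = 1, Q_LOCKED_FLUSH = 2 };

struct out_queue {
	atomic_int state;
	unsigned int flushes;
};

void out_queue_init(struct out_queue *q)
{
	atomic_init(&q->state, Q_UNLOCKED);
	q->flushes = 0;
}

/* TX path: take the lock word; fails if anyone else holds it. */
bool tx_trylock(struct out_queue *q)
{
	int expected = Q_UNLOCKED;

	return atomic_compare_exchange_strong(&q->state, &expected, Q_LOCKED);
}

/* TX path: step the word down; a pending FLUSH request is served first. */
void tx_unlock(struct out_queue *q)
{
	/* FLUSH(2) -> LOCKED(1): flush while still locked, then retry. */
	while (atomic_fetch_sub(&q->state, 1) - 1 != Q_UNLOCKED)
		q->flushes++;
}

/* Completion path: flush now if the queue is free, else leave a request. */
bool completion_flush(struct out_queue *q)
{
	if (atomic_exchange(&q->state, Q_LOCKED_FLUSH) != Q_UNLOCKED)
		return false; /* TX path owns the queue and flushes on unlock */
	q->flushes++;
	atomic_store(&q->state, Q_UNLOCKED);
	return true;
}
```

The model shows why the commit calls this DIY locking: the lock word encodes both ownership and deferred work, which an ordinary spinlock plus a plain flush in the completion path expresses far more directly.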
     
  • Avoid poking around in the delayed_work struct's internals.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann