
zebra, bgpd: EVPN VXLAN Multihome support with extern mode #19438

Open: mrinmoyg wants to merge 9 commits into master


mrinmoyg

Introduces the "--kernel-ext-learn" mode of operation in zebra.
This PR delivers a set of enhancements and fixes to support and validate EVPN VXLAN multihoming in external learning mode.

EVPN Multihoming External Mode Support:

  • New zebra startup option "--kernel-ext-learn".
  • Both data plane and control plane MACs are marked with extern_learn to prevent kernel aging.
  • The new 'protocol' field in bridge fdb differentiates hardware-learnt ('hw') MACs from control-plane-learnt ('zebra') MACs.
  • 'extern_mode' handling of netlink messages debounces 'zebra' protocol messages and accepts 'hw' protocol messages for data-plane-learnt MACs.
  • 'ARP Suppression' is presently not supported in 'extern only' mode.
  • Sync MAC Update and Delete are handled specifically in extern mode: there is no kernel aging, so explicit cleanup is necessary.
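
The receive-side rule in the list above can be modeled roughly as follows. This is a minimal Python sketch of the decision only, not the zebra C code: `FdbEvent` and `accept_fdb_event` are hypothetical names, the string protocol values simply mirror the 'hw' and 'zebra' labels used here, and the two extra checks (drop entries lacking the extern_learn flag, ignore updates on a down interface) are taken from the commit messages in this PR. Treating only 'hw' entries as data-plane learns is an assumption.

```python
from dataclasses import dataclass

# NTF_EXT_LEARNED as defined in linux/neighbour.h.
NTF_EXT_LEARNED = 0x10


@dataclass
class FdbEvent:
    """Hypothetical stand-in for a parsed AF_BRIDGE neighbor message."""
    mac: str
    protocol: str       # 'hw' or 'zebra', per the new bridge fdb 'protocol' field
    flags: int          # ndm_flags carried by the message
    if_up: bool = True  # operational state of the interface the MAC was seen on


def accept_fdb_event(ev: FdbEvent, extern_mode: bool) -> bool:
    """Return True if the event should be processed as a local MAC update."""
    if not extern_mode:
        # This sketch only models extern mode; nothing is filtered otherwise.
        return True
    if ev.protocol == "zebra":
        # Echo of an entry zebra programmed itself: debounce it.
        return False
    if not ev.flags & NTF_EXT_LEARNED:
        # In extern-only mode every MAC is expected to carry extern_learn.
        return False
    if not ev.if_up:
        # MAC updates on a down interface are ignored in extern mode.
        return False
    # Assumption: only 'hw' entries count as data-plane learns.
    return ev.protocol == "hw"
```

For example, a 'hw' entry carrying NTF_EXT_LEARNED on an up interface is accepted, while the same entry with protocol 'zebra' is dropped as zebra's own echo.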

Test Enhancements and New Scenarios:
The test framework is updated to cover EVPN VXLAN multihoming with both L2VNI and L3VNI, specifically targeting external learning mode. This includes new logic to handle and verify the correct operation of Ethernet Segments (ES), Designated Forwarder (DF) election, and VTEP peer relationships.
Added and updated topotests under bgp_evpn_mh_l2l3vni_ext_learn to:

  • Validate correct ES discovery and advertisement for both local and remote PEs.
  • Check VTEP peer lists for accuracy, including handling of downed VTEPs and ES state transitions.
  • Ensure L2VNI and L3VNI are correctly instantiated and associated with the appropriate VRFs and VXLAN interfaces.
  • Test orphaned hosts, dual-attached hosts, and single-attached hosts in various failure and recovery scenarios.

Utility and Parser Functions:
Utility functions are added in the new lib/bgp_evpn.py, such as a parser for the show evpn vni json command. This parser extracts VNI-to-type mappings, simplifying validation of VNI configuration and type (L2/L3) in automated tests. These changes add test coverage and improve reliability for EVPN VXLAN multihoming in external mode, making it easier to detect regressions and validate new features.
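
A parser of the kind described above might look like the following. This is a hedged sketch rather than the code in lib/bgp_evpn.py: the function name `parse_evpn_vni_types` is hypothetical, and it assumes the JSON is a top-level object keyed by VNI whose entries carry a "type" field. The sample data is illustrative, not captured output.

```python
import json


def parse_evpn_vni_types(show_output: str) -> dict:
    """Map each VNI (as int) to its type string, e.g. 'L2' or 'L3'."""
    data = json.loads(show_output)
    vni_types = {}
    for key, entry in data.items():
        # Assumed shape: {"<vni>": {"vni": <int>, "type": "L2"|"L3", ...}, ...}
        if isinstance(entry, dict) and "type" in entry:
            vni_types[int(entry.get("vni", key))] = entry["type"]
    return vni_types


# Illustrative input only; the field values are made up for this example.
sample = json.dumps({
    "1000": {"vni": 1000, "type": "L2", "vxlanIf": "vx-1000", "tenantVrf": "vrf1"},
    "4001": {"vni": 4001, "type": "L3", "vxlanIf": "vx-4001", "tenantVrf": "vrf1"},
})
vni_types = parse_evpn_vni_types(sample)
```

A test can then assert on `vni_types` directly instead of walking the raw JSON each time it needs to know whether a VNI is L2 or L3.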

Per File change summary:

  • bgpd/bgp_evpn.c:
    Add flog_err for failures installing VNI MAC and IP into zebra. Non-functional change.

  • include/linux/rtnetlink.h:
    Define RTPROT_HW, used as the 'protocol' value in bridge fdb

  • zebra/debug_nl.c:
    Updated Netlink print infra to include 'protocol' field in netlink msg dump

  • zebra/main.c:
    Add option '--kernel-ext-learn' in zebra startup.

  • zebra/rt_netlink.c:
    Debounce/accept MAC updates from the kernel based on the 'protocol' field (i.e. NDA_PROTOCOL)
    Add ext_flags, e.g. NTF_E_MH_PEER_SYNC as Peer Sync for 'extern_only' mode.
    Add the NTF_EXT_LEARNED flag for MACs in dplane programming; also drop netlink messages from the kernel that lack the flag

  • zebra/zebra_evpn.c:
    ARP ND Suppression 'set' and 'get' function definition

  • zebra/zebra_evpn.h:
    Corresponding ARP ND Suppression 'set' and 'get' function declarations

  • zebra/zebra_evpn_mac.c:
    In zebra_evpn_sync_mac_dp_install: For Sync MAC DPLANE installation, make sure the MAC is inactive in extern mode
    In zebra_evpn_mac_hold_exp_cb: In extern only mode, an explicit flush of the MAC is required
    In zebra_evpn_sync_mac_del: In extern mode, only reprogram the MAC if it does not have any PEER flags or neighbor and is LOCAL INACTIVE. This helps to clear the "static" state for PEER_PROXY on proxy withdrawal

  • zebra/zebra_evpn_mh.c:
    ES Info Set: In extern only mode, use the API zebra_evpn_es_bypass_update_macs to clean MAC from interfaces and ES

  • zebra/zebra_evpn_mh.h:
    Declare API zebra_evpn_flush_local_mac, as it is used in zebra_evpn_mac.c for hold timer expiry cleanup and sync delete of MAC

  • zebra/zebra_evpn_neigh.c:
    Prevent neighbor entry installation in DPLANE for remote neighbors if ARP/ND suppression is not enabled. In extern only mode, ARP/ND suppression is not enabled

  • zebra/zebra_router.c:
    For extern only mode, disable ARP suppression

  • zebra/zebra_router.h:
    new bool field 'kernel_ext_learn' in zebra_router, to store extern only mode

  • zebra/zebra_vxlan.c:
    Add print for EVPN ARP/ND Suppression in "show evpn"

  • zebra/zebra_vxlan.h:
    Accessor for Extern Only mode, i.e. zebra_mac_ext_learn_mode

Review for Kernel Change for Protocol field Addition: https://lore.kernel.org/netdev/[email protected]/T/#u
Review for iproute2 change for protocol field Addition: https://lore.kernel.org/netdev/[email protected]/T/#u

@donaldsharp
Member

This commit needs to be broken up into small logical units of work such that a reviewer can look at it and understand the logical steps that get you to this new feature. I have given some suggestions on how to do this already. I am sure that there are other places it can be broken up some.

Additionally I see no documentation at all. This needs to be added too. I can review further once this is broken up

mrinmoyg and others added 4 commits August 20, 2025 00:35
Introduction of "--kernel-ext-learn" mode of operation in zebra.

This part of the change enables the
 option '--kernel-ext-learn' in zebra.

Rationale:
EVPN Multihoming External Mode Support is to enable platforms doing
Hardware Based MAC Learning and Aging to support EVPN VXLAN Multihome.

MACs in this mode, for both data and control plane, will be
 marked and programmed as 'extern_only' in the kernel, so that kernel aging
 is disabled for these MACs. Zebra and the HW will control these MACs
 for the control plane and data plane respectively.

Per File change summary:
zebra/main.c:
  Add option '--kernel-ext-learn' in zebra startup.
zebra/zebra_router.c:
   - zebra_router_init definition updated to pass kernel_ext_learn mode
   - store kernel_ext_learn in zrouter structure
zebra/zebra_router.h:
   - new bool field 'kernel_ext_learn' in zebra_router, to
     store extern only mode
zebra/zebra_vxlan.h:
   - Accessor for Extern Only mode i.e zebra_mac_ext_learn_mode

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Patrice Brissette <[email protected]>
Protocol field is added in bridge FDB, to distinguish between
 MAC addresses learned via the control plane and those learned
 via the data plane with hardware aging.

Protocol 'hw' (i.e. RTPROT_HW, aka hardware) will be used for
 MACs learnt by the data plane (hardware), while the
 existing protocol 'zebra' will be used for control-plane-learnt ones.

Kernel Patch in review:
https://lore.kernel.org/netdev/[email protected]/
iproute2 patch in review:
https://lore.kernel.org/netdev/[email protected]/

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Patrice Brissette <[email protected]>
Doc update for new zebra option '--kernel-ext-learn'

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
Update the netlink debug print
 to include the 'protocol' field (i.e. NDA_PROTOCOL) of netlink messages.

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
@mrinmoyg mrinmoyg force-pushed the evpn_vxlan_mh_extern_mode branch from e2285f0 to 559ccc8 Compare August 20, 2025 00:42
In external mode, ARP/ND suppression is presently not supported
 and is therefore disabled. Remote neighbor programming is also skipped,
 based on zebra_evpn_get_arp_nd_suppress.

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Patrice Brissette <[email protected]>
Receive handling:
- Debounce of AF_BRIDGE netlink message based on 'protocol' field ZEBRA
- Update zlog to include protocol
- Ignore VXLAN Info message in extern mode
- Ignore if "NTF_EXT_LEARNED" flag is not marked in the netlink message
  for extern only mode of operation
- Ignore MAC netlink update if interface is down.
  Presently done only for extern mode
Send Operation:
- Always mark MAC as NTF_EXT_LEARNED in extern only mode
- Mark Sync MAC as NTF_E_MH_PEER_SYNC, only in extern only mode
- Peer-Sync flag (i.e NTF_E_MH_PEER_SYNC) state print in Tx zlog

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Patrice Brissette <[email protected]>
Dataplane Sync MAC Update:
- Install sync local MAC only if it is inactive
On hold timer expiry:
- Flush the MAC without reprogramming it, as there is no dynamic learning while the entry is static
Sync Del:
- Explicit MAC flush if MAC is inactive, if it has no PEER flags
Sync MAC update:
- In Peer Proxy, no additional BGP update computation

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Patrice Brissette <[email protected]>
Signed-off-by: Nitin Kalavala <[email protected]>
@frrbot frrbot bot added bgp tests Topotests, make check, etc labels Aug 20, 2025
@github-actions github-actions bot added size/XXL and removed size/M labels Aug 20, 2025
Topotests added under bgp_evpn_mh_l2l3vni_ext_learn to:
- Validate correct ES discovery and advertisement for both local and
 remote PEs.
- Check VTEP peer lists for accuracy, including handling of downed VTEPs
  and ES state transitions.
- Ensure L2VNI and L3VNI are correctly instantiated and associated with the
 appropriate VRFs and VXLAN interfaces.
- Test orphaned hosts, dual-attached hosts, and single-attached hosts
 in various failure and recovery scenarios.
- Test MAC 'protocol' state transitions, i.e. data plane learnt to
  control plane learnt and vice versa, and delete and relearn
  sequences on both peers
Utility and Parser Functions:
Utility functions in lib/bgp_evpn.py(new) are added.
These changes add test coverage and improve reliability for EVPN VXLAN
 multihoming in external mode, making it easier to detect regressions and
 validate new features.

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Patrice Brissette <[email protected]>
flog_err added for evpn install to zebra failure for
 ip and mac.

Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mrinmoy Ghosh <[email protected]>
Signed-off-by: Mike Mallin <[email protected]>
@mrinmoyg mrinmoyg force-pushed the evpn_vxlan_mh_extern_mode branch from f2b42df to bf18b37 Compare August 20, 2025 15:58
@mrinmoyg mrinmoyg marked this pull request as ready for review August 20, 2025 15:59
@mrinmoyg
Author

It's broken down now, and the zebra.rst file is updated with documentation for the new option.
