I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please treat these comments just like any other last call comments.

For more information, please see the FAQ at .

Document: draft-ietf-bess-evpn-optimized-ir-??
Reviewer: Gyan Mishra
Review Date: 2021-10-02
IETF LC End Date: 2021-09-07
IESG Telechat date: Not scheduled for a telechat

Summary:

I am the Gen-ART reviewer for this draft, and I am reviewing it as a BESS WG member who is familiar with EVPN technology and the issues that exist with IR, and who understands the need for an optimized IR solution for BUM replication. This draft clearly defines the problem to be solved with IR BUM replication and the proposed EVPN Optimized IR solution, which is technically sound. My comments, considerations and recommendations relate to rewriting some of the technical verbiage to help improve the draft.

The draft is well written and clearly describes the problem with EVPN IR PTA and how the Optimized IR solution with AR replication and RT-11 can be used to provide an optimized Selective P-Tree, so that not all PEs have to receive the BUM traffic as they do today with RT-3 I-PMSI.

This draft provides an EVPN procedure optimization for IR PTA RT-3 X-PMSI. It uses the new RT-11 Leaf A-D route introduced in Jeffrey Zhang’s EVPN BUM procedure update, draft-ietf-bess-evpn-bum-procedure-updates-10, which in turn uses the RFC 6513 Leaf A-D route to create a new Selective tree Leaf A-D route for optimized EVPN BUM procedures with inter-AS segmentation, for any PTA P-Tree being instantiated, including IR.

Leaf Auto-Discovery (A-D) routes [RFC6513]: For explicit leaf tracking purpose.
The Leaf A-D concept comes from the RFC 6514 Leaf A-D route. It is used for Multicast in VPLS (RFC 7117 Section 8.3, bottom of page 33) for optimized selective and inclusive P-Tree X-PMSI tunnels, with or without inter-AS segmentation, and by draft-ietf-bess-evpn-bum-procedure-updates-10 for P-Tree multicast. Both specifications use the RFC 7524 Section 4 Inter-Area P2MP Segmented Next-Hop Extended Community (S-NH-EC), which is used for tunnel segmentation in seamless MPLS MVPN multicast, together with the setting of the “Leaf Information Required” (L) flag in the PTA. This mechanism is now used in the EVPN BUM procedure updates (draft-ietf-bess-evpn-bum-procedure-updates-10, Section 6.3) and also in this EVPN IR optimization draft for the Assisted Replication function in RT-11, with the caveat that S-NH-EC is not used. That is a change from RFC 7524 and should be reflected in the verbiage.

RFC 7524 Section 4 (S-NH-EC):

4. Inter-Area P2MP Segmented Next-Hop Extended Community

This document defines a new Transitive IPv4-Address-Specific Extended Community Sub-Type: "Inter-Area P2MP Next-Hop". This document also defines a new BGP Transitive IPv6-Address-Specific Extended Community Sub-Type: "Inter-Area P2MP Next-Hop".

A PE, an ABR, or an ASBR constructs the Inter-Area P2MP Segmented Next-Hop Extended Community as follows:

- The Global Administrator field MUST be set to an IP address of the PE, ABR, or ASBR that originates or advertises the route carrying the P2MP Next-Hop Extended Community. For example this address may be the loopback address or the PE, ABR, or ASBR that advertises the route.

- The Local Administrator field MUST be set to 0.

If the Global Administrator field is an IPv4 address, the IPv4-Address-Specific Extended Community is used; if the Global Administrator field is an IPv6 address, the IPv6-Address-Specific Extended Community is used.

The detailed usage of these Extended Communities is described in the following sections.
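The construction rules in the RFC 7524 excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not text from RFC 7524 or from the draft: the function name `build_s_nh_ec` is hypothetical, the Sub-Type code point is passed in by the caller because the excerpt does not give its value, and the type octets and field widths are taken from the generic address-specific extended community layouts in RFC 4360 (IPv4) and RFC 5701 (IPv6).

```python
import ipaddress
import struct

# First octet of the transitive address-specific extended communities:
# 0x01 for IPv4 (RFC 4360), 0x00 for IPv6 (RFC 5701).
TRANSITIVE_IPV4_ADDRESS_SPECIFIC = 0x01
TRANSITIVE_IPV6_ADDRESS_SPECIFIC = 0x00


def build_s_nh_ec(advertiser_ip: str, sub_type: int) -> bytes:
    """Encode an address-specific extended community per the quoted rules.

    advertiser_ip: address of the PE, ABR or ASBR that originates or
                   advertises the route (goes in Global Administrator).
    sub_type:      "Inter-Area P2MP Next-Hop" Sub-Type code point; supplied
                   by the caller (hypothetical parameter, value not assumed).
    """
    addr = ipaddress.ip_address(advertiser_ip)
    # IPv4 address -> IPv4-Address-Specific EC, IPv6 -> IPv6-Address-Specific EC.
    if addr.version == 4:
        type_octet = TRANSITIVE_IPV4_ADDRESS_SPECIFIC
    else:
        type_octet = TRANSITIVE_IPV6_ADDRESS_SPECIFIC
    local_administrator = 0  # MUST be set to 0 per the excerpt
    return (
        struct.pack("!BB", type_octet, sub_type)
        + addr.packed  # Global Administrator: 4 or 16 octets
        + struct.pack("!H", local_administrator)
    )
```

With an IPv4 address the result is 8 octets long and with an IPv6 address it is 20 octets long; in both cases the last two octets (Local Administrator) are zero.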
Excerpt from the BUM procedure update Section 6.3, which discusses the RFC 7524 S-NH-EC; its route-target verbiage is also used in Section 4, page 9, of this EVPN IR optimization draft:

6.3. Use of S-NH-EC

[RFC7524] specifies the use of S-NH-EC because it does not allow ABRs to change the BGP next hop when they re-advertise I/S-PMSI A-D routes to downstream areas. That is only to be consistent with the MVPN Inter-AS I-PMSI A-D routes, whose next hop must not be changed when they're re-advertised by the segmenting ABRs for reasons specific to MVPN. For EVPN, it is perfectly fine to change the next hop when RBRs re-advertise the I/S-PMSI A-D routes, instead of relying on S-NH-EC. As a result, this document specifies that RBRs change the BGP next hop when they re-advertise I/S-PMSI A-D routes and do not use S-NH-EC.

If a downstream PE/RBR needs to originate Leaf A-D routes, it constructs an IP-based Route Target Extended Community by placing the IP address carried in the Next Hop of the received I/S-PMSI A-D route in the Global Administrator field of the Community, with the Local Administrator field of this Community set to 0 and setting the Extended Communities attribute of the Leaf A-D route to that Community.

RFC 7117 excerpt, bottom of Section 8.3:

The PE constructs an IP-address-specific RT by placing the IP address carried in the Next Hop field of the received S-PMSI A-D route in the Global Administrator field of the Community, with the Local Administrator field of this Community set to 0 and setting the Extended Communities attribute of the leaf A-D route to that Community.

This draft, EVPN IR Optimization Section 4, page 9:

The AR-LEAF constructs an IP-address-specific route-target as indicated in [I-D.ietf-bess-evpn-bum-procedure-updates], by placing the IP address carried in the Next-Hop field of the received Replicator-AR route in the Global Administrator field of the Community, with the Local Administrator field of this Community set to 0.
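The route-target construction that the three excerpts above share can be sketched as follows. This is a hedged illustration only: the helper names `leaf_ad_route_target` and `replicator_accepts` are hypothetical, the encoding assumes the generic Route Target layouts of RFC 4360 (IPv4) and RFC 5701 (IPv6), and the acceptance check models the draft's statement that the AR-REPLICATOR auto-configures the same IP-address-specific import route-target.

```python
import ipaddress
import struct

ROUTE_TARGET_SUB_TYPE = 0x02  # Route Target sub-type (RFC 4360 / RFC 5701)


def leaf_ad_route_target(received_next_hop: str) -> bytes:
    """Build the IP-address-specific RT attached to a Leaf A-D route.

    Global Administrator = Next-Hop of the received route,
    Local Administrator  = 0.
    """
    addr = ipaddress.ip_address(received_next_hop)
    # Transitive IPv4-address-specific type is 0x01; IPv6 is 0x00.
    type_octet = 0x01 if addr.version == 4 else 0x00
    return (
        struct.pack("!BB", type_octet, ROUTE_TARGET_SUB_TYPE)
        + addr.packed
        + struct.pack("!H", 0)
    )


def replicator_accepts(leaf_rt: bytes, own_next_hop: str) -> bool:
    """Model the auto-configured import RT on the advertiser.

    The advertiser derives the same RT from the Next-Hop it sent and only
    accepts Leaf A-D routes that carry it.
    """
    return leaf_rt == leaf_ad_route_target(own_next_hop)


# Example with a documentation address standing in for the AR-IP.
rt = leaf_ad_route_target("192.0.2.1")
print(rt.hex())                              # 0102c00002010000
print(replicator_accepts(rt, "192.0.2.1"))   # True
print(replicator_accepts(rt, "192.0.2.2"))   # False
```

Because the route-target is derived purely from the Next-Hop, a Leaf A-D route is imported only by the node that advertised the triggering route, which is why the excerpt notes that S-NH-EC is not needed for EVPN.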
Note that the same IP-address-specific import route-target is auto-configured by the AR-REPLICATOR that sent the Replicator-AR, in order to control the acceptance of the Leaf A-D routes.

The RFC 6514 Leaf A-D route is used in the EVPN RT-11 procedures to build the selective tree optimization through a new Assisted Replication (AR) procedure, which is the EVPN IR optimization defined in this draft.

The confusing part of this draft is that it mentions both NVO3 and MPLS PTA. In general, NVO3 overlay encapsulations are used in Data Centers, typically with an IP-based underlay; however, the RFC 7432 MPLS EVPN procedures apply to both DC and Core, over any underlay (IP, MPLS or SR). As written, this draft applies to an NVO3 overlay IR procedure optimization used in Data Centers. However, the Data Center underlay, like the Core, can be MPLS or IP based, and both can carry an NVO3 overlay, although the Data Center is generally where the NVO3 VTEP tunnel endpoints reside while the Core carries the EVPN control plane between DCs. RFC 7432 MPLS/IP EVPN supports both IP and MPLS underlays, with the IP underlay supporting only the IR PTA and the MPLS underlay supporting all PTAs for the RT-3 I-PMSI inclusive tree.

Here are a few scenarios for the authors to think about where the EVPN IR replication optimization could be used. The point I would like to make is that, for BUM, a multicast P2MP mLDP or RSVP-TE PTA is always the preferred method in both Core and DC scenarios, and there are only certain scenarios where multicast would not be preferred. As NVO3 and MPLS can be used in both DC and Core, I will mention both scenarios, as both pertain to this draft.

If MPLS is used in the DC or Core, then the DC or Core could be “PIM” free and “BGP” free in the underlay, and the mLDP or RSVP-TE PTA options could be used as the optimal BUM solution.
If MPLS is used in the DC or Core, then the DC or Core could be “BGP” free but with PIM enabled in the core for PIM Rosen MDT (RFC 6037).

In the above use cases, the IR optimization would not be the optimal or preferred solution. This IR optimization could be used only if IP is used in the DC or Core, in which case the MVPN PTA options are not possible because MVPN is only used with an MPLS underlay, and “PIM” is not desired in the underlay. In that use case, where the underlay is IP only and “PIM” is not desired, this IR optimization would be the most desirable solution for BUM. The caveat is that, because the MVPN procedures of RFC 6513 and RFC 6514 are used with an MPLS underlay for the PTA, the only viable PTA would be IR, as all the other PTAs have an MPLS underlay dependency.

So, in summary, if MPLS exists then there are many viable X-PMSI PTA options in both DC and Core for EVPN NVO3 BUM, and IR would not be the desired option; IR is desirable only in the unique case of an IP underlay where “PIM” is not desired.

I believe the IR optimization AR replication solution can be used with an MPLS underlay as well, since there could be use cases where, even though other X-PMSI PTAs are available, IR is preferred because PTAs that use MPLS-based multicast are not desired. In those cases the IR optimization could serve both DC and Core, and could apply to the Core “Non NVO3” use case of EVPN PE-CE AC MLAG All-Active Multi-homing.

This solution breaks up the BUM 3-tuple “Broadcast, Unknown Unicast, Multicast” into BM “Broadcast/Multicast” and keeps Unknown Unicast separate, treating it the same as known unicast.
MPLS EVPN has a ubiquitous framework and thus ubiquitous use cases: it can be used in the DC or Core over any underlay (IP, MPLS or SR). Its two primary use cases are the NVO3 encapsulation overlay for DC multi-tenant environments and the NG L2 VPN PE-CE L2 AC advancement, which addresses the VPLS/H-VPLS gaps with NG MPLS L2 VPN “EVPN” E-LINE, E-LAN and E-TREE. This IR optimization draft should therefore apply to any EVPN use case and not be limited to NVO3.

BUM, and why Unknown Unicast handling is separated from Broadcast and Multicast (“BM”) traffic in the BUM 3-tuple (Broadcast, Unknown Unicast, Multicast):

With regard to BUM Broadcast / Unknown Unicast: with Proxy ARP / Proxy ND, the first ARP packet goes out as an all-Fs broadcast. When the ARP request is sent and the reply is received, the MAC/IP state is created and the Type 2 route changes from unknown MAC/IP to a MAC-IP route. After that point, no further ARPs are sent for the device. However, most implementations have an ARP/ND refresh that keeps the MAC/IP state current and purges old entries to save on MAC VRF URIB state, so ARP traffic is constant and does not necessarily stop even with Proxy ARP. The tradeoff is that, without the ARP/ND refresh, a larger MAC VRF would have to be maintained, which is worse because you do not want to hit the ceiling on the MAC VRF.

The draft states that Broadcast is greatly reduced by the Proxy ARP / Proxy ND capability and that Unknown Unicast is greatly reduced in virtualized NVO3 networks where the MAC/IP is learned in the control plane. Even with Proxy ARP / ND, as stated above, the first ARP packet is flooded as an all-Fs broadcast until control plane MAC learning generates the Type 2 MAC-IP route. However, since most implementations track the MAC-IP control plane state with a refresh timer to age out and purge old entries in the MAC-IP VRF, the all-Fs ARP broadcast ends up being sent more often than just once.
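The effect of the refresh timers can be put into rough numbers. The sketch below is a back-of-the-envelope model only: it assumes, as argued above, that every refresh of a MAC-IP entry results in one flooded ARP/ND request, and the function name, entry count and timer values are hypothetical rather than taken from the draft or from any implementation.

```python
def flooded_arp_requests(entries: int, window_s: int, refresh_s: int) -> int:
    """Estimate flooded ARP/ND requests seen in an observation window.

    Each MAC-IP entry causes one initial flood, plus one more flood for
    every refresh interval that elapses within the window (assumption).
    """
    if entries < 0 or window_s < 0:
        raise ValueError("entries and window_s must be non-negative")
    if refresh_s <= 0:
        raise ValueError("refresh_s must be positive")
    refreshes_per_entry = window_s // refresh_s
    return entries * (1 + refreshes_per_entry)


# Hypothetical numbers: 1000 MAC-IP entries observed for one hour.
print(flooded_arp_requests(1000, 3600, 300))    # 13000 with a 5-minute refresh
print(flooded_arp_requests(1000, 3600, 7200))   # 1000 if no refresh falls in the window
```

Under these assumptions the broadcast load scales with the number of entries times the number of refresh intervals, which is the sense in which Proxy ARP / ND reduces broadcast only moderately.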
Unknown unicast is the situation where the switch does not have the MAC address in its CAM table or, in the EVPN scenario, the MAC/IP does not exist on a leaf within the fabric. In an L2 switch environment, unknown unicast due to “out of sync” bridge tables can occur when the first hop routing protocol is salt-and-peppered even/odd, such that only the Active router has the MAC and the Standby router does not. In the EVPN All-Active Multi-home MHD/MHN MLAG scenario for host endpoint connections, both leafs are active, so there is never an out-of-sync situation where one leaf has the MAC and the other does not. In addition, EVPN backup path, aliasing, uniform load balancing over MLAG and local bias may take care of Unknown Unicast, making it nil or very rare in an EVPN NVO3 environment. BUM Broadcast ARP/ND traffic, by contrast, definitely exists even with Proxy ARP / Proxy ND, and it can be quite substantial due to the refresh/purge timers. Is the reason for treating Unknown Unicast differently, broken out from “BM”, that none exists in an NVO3 environment?

With regard to EVPN IR optimization for BUM traffic: this draft addresses BUM optimization when using IR, while draft-ietf-bess-evpn-igmp-mld-proxy defines a new SMET A-D RT-6 route for IR optimization for BUM, which is equivalent to this draft's Leaf A-D route but unsolicited and untargeted. This draft must normatively mention draft-ietf-bess-evpn-igmp-mld-proxy as an alternative solution for BUM IR optimization and explain why this solution should be used for BUM IR optimization over the SMET RT-6 style optimization. Also, how does this draft's RT-11 selective tree AR replication solution interoperate with the draft-ietf-bess-evpn-igmp-mld-proxy SMET route? Is that possible, or do you have to implement one or the other?
Major issues: None

Minor issues:

Abstract

OLD TXT

Network Virtualization Overlay (NVO) networks using EVPN as control plane may use Ingress Replication (IR) or PIM (Protocol Independent Multicast) based trees to convey the overlay Broadcast, Unknown unicast and Multicast (BUM) traffic. PIM provides an efficient solution to avoid sending multiple copies of the same packet over the same physical link, however it may not always be deployed in the NVO core network. IR avoids the dependency on PIM in the NVO network core. While IR provides a simple multicast transport, some NVO networks with demanding multicast applications require a more efficient solution without PIM in the core. This document describes a solution to optimize the efficiency of IR in NVO networks.

NEW TXT

Network Virtualization Overlay (NVO) networks and BGP MPLS-based L2 VPN E-LINE, E-LAN and E-TREE flavor Ethernet VPNs in Service Provider Core and Data Center networks using EVPN as the control plane may use any available PMSI Tunnel Attribute (PTA), such as Ingress Replication (IR) RFC 7988, PIM (Protocol Independent Multicast) MDT SAFI RFC 6037, mLDP P2MP/MP2MP RFC 6388 or RSVP-TE P2MP RFC 4875 based P-Trees, to replicate the overlay Broadcast, Unknown unicast and Multicast (BUM) traffic. Multicast-based PTA tunnel types provide an efficient solution to avoid sending multiple copies of the same packet over the same physical link. However, not all PTA tunnel types may be available: in a Data Center with an IP-based underlay, native PIM may not be desirable; and in an MPLS-based underlay with a “BGP” and “PIM” free core, where the operator is migrating to Segment Routing and is in the process of eliminating LDP, an RSVP-TE P2MP PTA may not be desirable. In these use cases, the only option available is to use IR.
While IR provides a simple multicast transport, a Service Provider Core migrating to Segment Routing, or Data Center NVO networks with an IP-based underlay and demanding multicast applications, require a more efficient solution than IR. This document describes a solution to optimize the efficiency of IR in a Service Provider Core in transition to Segment Routing or a Data Center NVO network with an IP-based underlay.

Introduction

OLD TXT

Ethernet Virtual Private Networks (EVPN) may be used as the control plane for a Network Virtualization Overlay (NVO) network. Network Virtualization Edge (NVE) devices and Provider Edges (PEs) that are part of the same EVPN Instance (EVI) use Ingress Replication (IR) or PIM-based trees to transport the tenant's Broadcast, Unknown unicast and Multicast (BUM) traffic. In NVO networks where PIM-based trees cannot be used, IR is the only option. Examples of these situations are NVO networks where the core nodes don't support PIM or the network operator does not want to run PIM in the core.

In some use-cases, the amount of replication for BUM (Broadcat, Unkown Unicast, Multicast) traffic is kept under control on the NVEs due to the following fairly common assumptions:

a. Broadcast is greatly reduced due to the proxy ARP (Address Resolution Protocol) and proxy ND (Neighbor Discovery) capabilities supported by EVPN on the NVEs. Some NVEs can even provide Dynamic Host Configuration Protocol (DHCP) server functions for the attached Tenant Systems (TS) reducing the broadcast even further.

b. Unknown unicast traffic is greatly reduced in virtualized NVO networks where all the MAC and IP addresses are learned in the control plane.

c. Multicast applications are not used.

If the above assumptions are true for a given NVO network, then IR provides a simple solution for multi-destination traffic. However, the statement c) above is not always true and multicast applications are required in many use-cases.
When the multicast sources are attached to NVEs residing in hypervisors or low-performance-replication TORs (Top Of Rack switches), the ingress replication of a large amount of multicast traffic to a significant number of remote NVEs/PEs can seriously degrade the performance of the NVE and impact the application.

NEW TXT

Service Provider Core and Data Center networks may use Ethernet Virtual Private Networks (EVPN) as the control plane for a Network Virtualization Overlay (NVO) network with an IP-based underlay, or for BGP MPLS-based L2 VPN E-LINE, E-LAN and E-TREE flavor Ethernet VPNs. Network Virtualization Edge (NVE) devices and Provider Edges (PEs) that are part of the same EVPN Instance (EVI) can use Ingress Replication (IR) or any available MPLS-based PTA for P-Tree instantiation to transport the tenant's Broadcast, Unknown unicast and Multicast (BUM) traffic.

In Service Provider Core or Data Center NVO networks where MPLS-based PTAs are not available, such as a Service Provider core migrating to Segment Routing where LDP is being eliminated and RSVP-TE P2MP is not desirable, or a Data Center network with an IP-based underlay where native PIM is not desirable, IR is the only option. Examples of these situations are NVO networks where the core nodes don't support MPLS-based PTAs with a dependency on mLDP, and where neither native PIM nor RSVP-TE P2MP LSM is desirable.

In some use-cases, the amount of replication for BUM traffic is kept under control on the NVEs due to the following fairly common assumptions:

a. Broadcast is moderately reduced due to the proxy ARP (Address Resolution Protocol) and proxy ND (Neighbor Discovery) capabilities supported by EVPN on the NVEs, together with the Selective IR tunnel optimization defined in draft-ietf-bess-evpn-igmp-mld-proxy. Some NVEs can even provide Dynamic Host Configuration Protocol (DHCP) server functions for the attached Tenant Systems (TS) reducing the broadcast even further.
During the Proxy ARP/ND process, the first ARP packet is still sent as an all-Fs broadcast. When the ARP/ND request is sent and the reply is received, the MAC VRF MAC-IP state is created and the Type 2 route changes from an unknown MAC-IP to a MAC-IP route. Proxy ARP/ND then suppresses or proxies all ARP/ND sent by the local hosts. However, due to the ARP/ND refresh required to keep the MAC-IP state current and to purge old entries, which saves on MAC VRF URIB state as a tradeoff, there may be additional ARP/ND packets sent for each MAC VRF MAC-IP entry. The IGMP-MLD proxy Selective IR tunnel optimization draft improves the performance of IR using the SMET route and may be used in conjunction with this draft. Even though Proxy ARP/ND suppression techniques are used, the refresh/purge must be implemented to age out old entries and control the MAC VRF size, so the broadcast traffic is only moderately reduced. Thus RFC 7432 EVPN IR for BUM is not a viable solution without the IR optimization defined in this draft and/or draft-ietf-bess-evpn-igmp-mld-proxy.

***Please investigate whether both EVPN IR optimizations can be used together and what all the caveats are, and, if they cannot be used together, why.***

The main point that should be mentioned here is that Broadcast traffic is reduced, but there is still a considerable amount of broadcast traffic that needs to be optimized.

b. Unknown unicast traffic is eliminated in virtualized NVO networks, because all the MAC and IP addresses are learned in the control plane, for the All-Active Multi-home LAG scenario, and reduced for the Single-Active Multi-home EVPN scenario. Unknown unicast is the situation where the packet has the IP and MAC but the switch is missing the MAC entry. This occurs when the Layer 2 switch BD tables become unsynchronized due to salt-and-pepper placement of the first hop router redundancy active router per VLAN between L2 switches, resulting in Unknown unicast.
In an EVPN scenario with All-Active Multi-home, the MAC-IP remains synchronized with ESI auto-discovery; however, with Single-Active Multi-home the MAC-IP may not be synchronized, resulting in Unknown unicast. As a result, there is minimal to no Unknown Unicast in an NVO network.

c. Multicast applications are not used.

If the above assumptions are true for a given NVO network, then IR provides a simple solution for multi-destination traffic. However, the statement c) above is not always true and multicast applications are required in many use-cases.

When the multicast sources are attached to NVEs residing in hypervisors or low-performance-replication TORs (Top Of Rack switches), the ingress replication of a large amount of multicast traffic to a significant number of remote NVEs/PEs can seriously degrade the performance of the NVE and impact the application.

The draft should mention the reason why BM (Broadcast and Multicast) traffic is treated differently from Unknown Unicast by this solution. My answer is that Unknown Unicast is minimal to none, so it does not need the optimization.

Terminology section:

OLD TXT

- Regular-IR: Refers to Regular Ingress Replication, where the source NVE/PE sends a copy to each remote NVE/PE part of the BD.

- IR-IP: IP address used for Ingress Replication as in [RFC7432].

- AR-IP: IP address owned by the AR-REPLICATOR and used to differentiate the ingress traffic that must follow the AR procedures.

NEW TXT

- Regular-IR: Refers to EVPN RT-3 (Route Type 3) Regular Ingress Replication, where the source NVE/PE sends a copy to each remote NVE/PE part of the BD.

- IR-IP: PTA Tunnel endpoint identifier which carries the unicast tunnel endpoint (loopback) IP address of the Non-AR-Replicator local PE used for Ingress Replication as defined in RFC 6514.
- AR-IP: PTA Tunnel endpoint identifier which carries the unicast tunnel endpoint (loopback) IP address of the AR-REPLICATOR local PE as defined in RFC 6514 and used to differentiate the ingress traffic that must follow the AR procedures.

I updated the definitions to clarify what the AR-IP and IR-IP are: basically the PMSI Tunnel Attribute (PTA) termination endpoint ID, AR-IP for the AR node and IR-IP for the Non-AR node. RFC 7432 Section 11.2 references RFC 6514: the PMSI Tunnel attribute must contain the identity of the tree.

RFC 7432 Section 11.2

11.2. P-Tunnel Identification

In order to identify the P-tunnel used for sending broadcast, unknown unicast, or multicast traffic, the Inclusive Multicast Ethernet Tag route MUST carry a Provider Multicast Service Interface (PMSI) Tunnel attribute as specified in [RFC6514].

+ If the PE that originates the advertisement uses ingress replication for the P-tunnel for EVPN, the route MUST include the PMSI Tunnel attribute with the Tunnel Type set to Ingress Replication and the Tunnel Identifier set to a routable address of the PE.

RFC 6514 Section 5

When the Tunnel Type is set to Ingress Replication, the Tunnel Identifier carries the unicast tunnel endpoint IP address of the local PE that is to be this PE's receiving endpoint address for the tunnel.

Section 3 Solution Requirements

OLD TXT

a. It provides an IR optimization for BM (Broadcast and Multicast) traffic without the need for PIM, while preserving the packet order for unicast applications, i.e., known and unknown unicast traffic should follow the same path. This optimization is required in low-performance NVEs.

NEW TXT

a. It provides an IR optimization for BM (Broadcast and Multicast) traffic without the need for PTAs with MPLS or PIM based dependencies, while preserving the packet order for unicast applications, i.e., known and unknown unicast traffic should follow the same path. This optimization is required in low-performance NVEs.

How is the IR optimization preserving unicast ordering?
Normal Unicast traffic is not BUM and thus would not use the EVPN IR optimization AR mechanism.

Section 4: Type 3 is being extended to support optimized IR (a new Type 3), so that it is part of the capability exchange.

4. EVPN BGP Attributes for optimized-IR

This solution extends the [RFC7432] Inclusive Multicast Ethernet Tag routes and attributes so that an NVE/PE can signal its optimized-IR capabilities.

RFC 7432 Section 7.3

7.3. Inclusive Multicast Ethernet Tag Route

An Inclusive Multicast Ethernet Tag route type specific EVPN NLRI consists of the following:

+---------------------------------------+
| RD (8 octets)                         |
+---------------------------------------+
| Ethernet Tag ID (4 octets)            |
+---------------------------------------+
| IP Address Length (1 octet)           |
+---------------------------------------+
| Originating Router's IP Address       |
| (4 or 16 octets)                      |
+---------------------------------------+

Please reference the below from RFC 6514 (BGP Encodings and Procedures for MVPNs, February 2012), Section 5:

5. PMSI Tunnel Attribute

This document defines and uses a new BGP attribute called the "P-Multicast Service Interface Tunnel (PMSI Tunnel) attribute". This is an optional transitive BGP attribute. The format of this attribute is defined as follows:

+---------------------------------+
| Flags (1 octet)                 |
+---------------------------------+
| Tunnel Type (1 octets)          |
+---------------------------------+
| MPLS Label (3 octets)           |
+---------------------------------+
| Tunnel Identifier (variable)    |
+---------------------------------+

Section 4 top of page 8

As described in the summary section of the review, this section should reference RFC 7524 Section 4, which is referenced by draft-ietf-bess-evpn-bum-procedure-updates Section 6.3 (S-NH-EC), as well as the reference used by RFC 7117 Section 8.3. It should also describe that, per draft-ietf-bess-evpn-bum-procedure-updates, for EVPN the S-NH-EC in the Leaf A-D routes is not necessary for the response to the Replicator-AR route (RT-3).
This should be included in the verbiage. I updated some normative language – please check.

OLD TXT

In this document, the above RT-3 and PTA can be used in two different modes for the same BD:

- Regular-IR route: in this route, Originating Router's IP Address, Tunnel Type (0x06), MPLS Label and Tunnel Identifier MUST be used as described in [RFC7432] when Ingress Replication is in use. The NVE/PE that advertises the route will set the Next-Hop to an IP address that we denominate IR-IP in this document. When advertised by an AR-LEAF node, the Regular-IR route SHOULD be advertised with type T= AR-LEAF.

- Replicator-AR route: this route is used by the AR-REPLICATOR to advertise its AR capabilities, with the fields set as follows:

o Originating Router's IP Address MUST be set to an IP address of the PE that should be common to all the EVIs on the PE (usually this is the PE's loopback address). The Tunnel Identifier and Next-Hop SHOULD be set to the same IP address as the Originating Router's IP address when the NVE/PE originates the route. The Next-Hop address is referred to as the AR-IP and SHOULD be different than the IR-IP for a given PE/NVE.

o Tunnel Type = Assisted-Replication Tunnel. Section 11 provides the allocated type value.

o T (AR role type) = 01 (AR-REPLICATOR).

o L (Leaf Information Required) = 0 (for non-selective AR) or 1 (for selective AR).

In addition, this document also uses the Leaf A-D route (RT-11) defined in [I-D.ietf-bess-evpn-bum-procedure-updates] in case the selective AR mode is used. The Leaf A-D route MAY be used by the AR-LEAF in response to a Replicator-AR route (with the L flag set) to advertise its desire to receive the BM traffic from a specific AR-REPLICATOR. It is only used for selective AR and its fields are set as follows:

o Originating Router's IP Address is set to the advertising PE's IP address (same IP used by the AR-LEAF in regular-IR routes). The Next-Hop address is set to the IR-IP.
o Route Key is the "Route Type Specific" NLRI of the Replicator-AR route for which this Leaf A-D route is generated.

o The AR-LEAF constructs an IP-address-specific route-target as indicated in [I-D.ietf-bess-evpn-bum-procedure-updates], by placing the IP address carried in the Next-Hop field of the received Replicator-AR route in the Global Administrator field of the Community, with the Local Administrator field of this Community set to 0. Note that the same IP-address-specific import route-target is auto-configured by the AR-REPLICATOR that sent the Replicator-AR, in order to control the acceptance of the Leaf A-D routes.

o The Leaf A-D route MUST include the PMSI Tunnel attribute with the Tunnel Type set to AR, type set to AR-LEAF and the Tunnel Identifier set to the IP of the advertising AR-LEAF. The PMSI Tunnel attribute MUST carry a downstream-assigned MPLS label or VNI that is used by the AR-REPLICATOR to send traffic to the AR-LEAF.

Each AR-enabled node MUST understand and process the AR type field in the PTA (Flags field) of the routes, and MUST signal the corresponding type (1 or 2) according to its administrative choice.

NEW TXT

Since the PTA builds the PMSI tunnel per RFC 6514, I changed what was called the IR-IP to PTA-ID to make it easier for the reader, as the source/destination of the PMSI tunnel termination endpoints is the PMSI Tunnel Attribute (PTA) Tunnel Identifier.

**start of the new txt**

In this document, the above RT-3 and PTA can be used in two different modes for the same BD:

- Regular-IR route: This route is the regular RT-3 I-PMSI route. The Originating Router's unicast IP address, called the IR-IP, MUST be set to the PMSI Tunnel Identifier for the PTA Tunnel Type (0x06) used for IR, as described in RFC 6514, when Ingress Replication is used. The NVE/PE that advertises the route will set the Next-Hop to the remote tunnel endpoint PMSI Tunnel Identifier IR-IP as defined in RFC 6514.
When advertised by an AR-LEAF node, the Regular-IR route MUST be advertised with type T= AR-LEAF.

o Tunnel Type = Assisted-Replication Tunnel. Section 11 provides the allocated type value.

o T (AR role type) = 10 (AR-LEAF).

o L (Leaf Information Required) = 0 (for non-selective AR) or 1 (for selective AR).

The Regular-IR route is only used for the Non-Selective P-Tree.

- Replicator-AR route: This route is used by the AR-REPLICATOR to advertise its AR capabilities, with the fields set as follows:

o The Originating Router's unicast IP address, called the AR-IP, MUST be set to the PMSI Tunnel Identifier for the PTA Tunnel Type (0x06), which is the IP address of the PE that should be common to all the EVIs on the PE, as defined in RFC 6514. The Tunnel Identifier and Next-Hop MUST be set to the same IP address as the Originating Router's IP address (PTA Tunnel ID) when the NVE/PE originates the route, as described in RFC 6514. The Next-Hop address of the Replicator-AR route as seen on the AR-LEAF is referred to as the AR-IP and MUST be unique; it cannot be the same as the IR-IP for a given PE/NVE.

o Tunnel Type = Assisted-Replication Tunnel. Section 11 provides the allocated type value.

o T (AR role type) = 01 (AR-REPLICATOR).

o L (Leaf Information Required) = 0 (for non-selective AR) or 1 (for selective AR).

The Replicator-AR route is only used for the Selective P-Tree.

In addition, this document also uses the Leaf A-D route (RT-11) defined in [I-D.ietf-bess-evpn-bum-procedure-updates] in case the selective AR mode is used.
draft-ietf-bess-evpn-bum-procedure-updates updates the EVPN BUM procedures for EVPN multicast optimized selective trees, introducing three new route types: RT-9 Per-Region I-PMSI A-D, RT-10 S-PMSI A-D and RT-11 Leaf A-D, used for Selective P-Tree PTA inter-AS segmentation optimizations. It uses the RFC 7117 concept of the selective tree optimization procedure, signaling the Leaf A-D route to instantiate the inter-AS P-Tree framework from the intra-AS and inter-AS VPLS multicast I/S-PMSI A-D and Leaf A-D solution. That concept is now also leveraged by the AR-REPLICATOR for IR optimization, using RT-11 to build the selective tree IR optimization for BUM traffic. Section 6 of bess-evpn-bum-procedure-updates defines the RT-11 Leaf A-D route selective tree optimization concept from the RFC 7117 response to the I-PMSI route, and discusses the RFC 7524 Inter-Area P2MP Segmented Next-Hop Extended Community (S-NH-EC), which is used for inter-AS P2MP segmented LSP stitching. RFC 7524 Section 6 requires the ABRs to keep the next hop unchanged when re-advertising I/S-PMSI A-D routes, which is only to be consistent with the MVPN Inter-AS I-PMSI A-D routes, whose next hop MUST be unchanged. For EVPN, the next hop can be changed on inter-AS re-advertisement of the I/S-PMSI A-D route, so EVPN does not need to rely on S-NH-EC.

The Leaf A-D route MAY be used by the AR-LEAF in response to a Replicator-AR route (with the L flag set) to advertise its desire to receive the BM traffic from a specific AR-REPLICATOR. It is only used for selective AR and its fields are set as follows:

o Originating Router's IP Address is set to the advertising PE's IP address (same IP used by the AR-LEAF in regular-IR routes). The Next-Hop address is set to the IR-IP.

o Route Key is the "Route Type Specific" NLRI of the Replicator-AR route for which this Leaf A-D route is generated.
o The AR-LEAF constructs an IP-address-specific route-target as indicated in [I-D.ietf-bess-evpn-bum-procedure-updates], by placing the IP address carried in the Next-Hop field of the received Replicator-AR route in the Global Administrator field of the Community, with the Local Administrator field of this Community set to 0. Note that the same IP-address-specific import route-target is auto-configured by the AR-REPLICATOR that sent the Replicator-AR, in order to control the acceptance of the Leaf A-D routes.

o The Leaf A-D route MUST include the PMSI Tunnel attribute with the Tunnel Type set to AR, type set to AR-LEAF and the Tunnel Identifier set to the IP of the advertising AR-LEAF. The PMSI Tunnel attribute MUST carry a downstream-assigned MPLS label or VNI that is used by the AR-REPLICATOR to send traffic to the AR-LEAF.

Each AR-enabled node MUST understand and process the AR type field in the PTA (Flags field) of the routes, and MUST signal the corresponding type (1 or 2) according to its administrative choice.

**There are a few different flags and new flags defined in the PTA. Please be specific as to the type 1 and type 2 flags.**

***Implementation considerations section: this is important and should also detail how backwards compatibility works.*** As RT-3 introduces a mode and RT-11 is new in this draft, what devices need to be upgraded, and do all of them need to be upgraded to support the solution?

***Implementation section: please list any vendor implementations thus far.*** Also mention any issues found with any implementations, and any operators that have deployed the implementation.
Nits/editorial comments:

Normative references should be added, per the rewritten text provided in the Minor issues section, for the following: RFC 7524 Inter-Area P2MP Segmented LSP, RFC 7117 Multicast VPLS, draft-ietf-bess-evpn-igmp-mld-proxy, RFC 6388 mLDP, RFC 6037 MDT SAFI and RFC 4875 P2MP TE.

Informative references should be added for: RFC 6513 MVPN procedures, RFC 7988 Ingress Replication, RFC 7348 VXLAN and RFC 8926 GENEVE.