I have reviewed this document, draft-ietf-ccamp-rsvp-te-bandwidth-availability, as part of the security directorate’s ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other last call comments.

This draft provides a new TLV for the Ethernet SENDER_TSPEC object that will carry availability requirements for RSVP-TE signaling of GMPLS LSPs.

The draft is thorough. I do have some comments.

I reviewed the normative references RFCs 2205, 3209, 3473, 6003, as well as RFC3945 and RFC5920. I can’t claim that I understood everything in that stack, so the following could easily be wrong.

Computing the LSP path:

Page 4, section 2 discusses obtaining network topology, calculating the LSP route, and RFC8330’s extensions for availability in OSPF TE LSA messages. Does this draft assume that this extension will always be used with an EXPLICIT_ROUTE object? Is this draft not applicable without that explicit LSP route calculation?

Availability TLV vs CLASSTYPE objects:

The definition in RFC6003 of the Bandwidth Profile TLV has certain constraints on the values of the Index:

    The Index field value MUST correspond to at least one of the
    Class-Type values included either in the CLASSTYPE object
    [RFC4124] or in the EXTENDED_CLASSTYPE object [MCOS].

I am not certain if this means that the presence of an Ethernet SENDER_TSPEC Object with a Bandwidth Profile TLV means there must be a CLASSTYPE object in the RSVP-TE message as well, or that the Index field values are taken from the set of defined Class-Type values.

But if the first, does the inclusion of the Availability TLV induce a requirement that these other CLASSTYPE objects must be included as well? Or are you intending to update RFC6003 to eliminate that constraint?
If the second, does the RFC6003 constraint also constrain the index values used in the Availability TLV? Should that be mentioned? (Or am I just confused?)

Bandwidth TLV to Availability TLV association:

Page 4, Section 3.1 says

    When the Availability TLV is included, it MUST be present along
    with the Ethernet Bandwidth Profile TLV. If the bandwidth
    requirements in the multiple Ethernet Bandwidth Profile TLVs have
    different Availability requirements, multiple Availability TLVs
    SHOULD be carried. In such a case, the Availability TLV has a one
    to one correspondence with the Ethernet Bandwidth Profile TLV by
    having the same value of Index field. If all the bandwidth
    requirements in the Ethernet Bandwidth Profile have the same
    Availability requirement, one Availability TLV SHOULD be carried.
    In this case, the Index field is set to 0.

I find that the description is not clear in all cases. I found a message in the working group discussion of this association saying that the association is either “n:n” or “n:1”. I think this description sounds more like n 1:1 associations or one n:1 association. Is that what is intended?

Can the associations be mixed in the same message? Suppose there were 3 Bandwidth TLVs that needed the same availability and one that had a different availability need. Could there be 3 Bandwidth TLVs with index 0 and one Availability TLV with index 0, and also a Bandwidth TLV / Availability TLV pair with matching index values?

error checking:

Other documents in the references (RFC2205, 3209, 3473, 6003, etc) have made a point of explicitly describing the error handling: when PathErr, ResvErr, and Notify messages are sent, to whom, the error codes, the error value sub-codes, etc. I don’t see that here for the bandwidth-tlv-to-availability-tlv associations.

Is a mix of index-zero and index-non-zero bandwidth-tlv-to-availability-tlv associations (like above) an error? Is the message dropped? Is an error sent?
If the message is not dropped, are any of the bandwidth-tlv to availability-tlv associations retained?

If there are availability-tlvs with non-zero indexes with no matching index value among the bandwidth-tlvs, that surely is an error? Is the message dropped? Or is the availability-tlv dropped? Is a PathErr/ResvErr message sent?

Suppose all availability-tlvs have a matching (zero or non-zero) index value among the bandwidth-tlvs, but there are extra bandwidth-tlvs (no availability-tlv with a matching index value). Is that an error? Are the extra bandwidth-tlvs dropped? Ignored? Propagated?

(RFC3209 has several cases where there might be extra objects or sub-objects, and the language is “can be/MAY be/SHOULD be/are ignored and SHOULD NOT be/are not/need not be propagated”.)

multiplicity:

RFC3209 says it does not apply to multicast, but it does talk about multiple parallel LSP tunnels between two nodes, about multipoint-to-point LSPs for WF and SE style reservations when there are multiple senders, and about the merging rules of WF reservations. Does availability work in those reservation styles?

availability vs “variable discrete bandwidth”:

I believe I understood the discussion of the need to signal availability requirements in order for the system to determine when an LSP was feasible. I can dimly understand that there might be links that have “variable discrete bandwidth”. Section 2 says “The Availability TLV can be applicable to any kind of physical links with variable discrete bandwidth, such as microwave or DSL.” Why not other link types? Do only “variable discrete bandwidth” links support availability?

calculating availability:

On page 9, Appendix A: Perhaps I don’t understand how the availability metric is used. In the following:

    On a sunny day, the modulation level 3 can be used to achieve
    400 Mbps link bandwidth.
    A light rain with X mm/h rate triggers the system to change the
    modulation level from level 3 to level 2, with bandwidth changing
    from 400 Mbps to 200 Mbps. The probability of X mm/h rain in the
    local area is 52 minutes in a year. Then the dropped 200 Mbps
    bandwidth has 99.99% availability.

I would say that the 400Mbps bandwidth is available whenever it is not raining. It lightly rains 52 min a year, which means it is not raining 99.99% of the time, so the 400Mbps availability is 99.99%. The 200Mbps is available during that 52 min, so 99.99% is not the 200Mbps availability. Right?

The analogous comment applies to the next two paragraphs. Does that explain why the table shows the 100Mbps bandwidth having two different availabilities?

security:

The draft’s (*) security considerations section points to RSVP-TE, but without an RFC reference, and to RFC5920. Because this is a GMPLS related feature, it should refer to the GMPLS extensions to RSVP-TE in RFC3473. As an extension to RFC6003, it could refer to that RFC’s security considerations section, but that only gets the reader to RFC3473, RFC3209, and RFC5920.

The security considerations for RSVP-TE itself (RFC3209) point to RFC2205. RFC2205 uses an Integrity object (defined in RFC2747) that carries a keyed cryptographic digest based on a shared key, providing hop-by-hop protection between two RSVP nodes. However, PATH messages are directed toward the traffic destination address, not the next RSVP node. There could be clouds of non-RSVP nodes between two RSVP nodes that the PATH encounters. This makes it difficult to share a key between individual pairs of RSVP nodes, and could motivate operators to configure the same key in large numbers of RSVP nodes.

RFC3473 points to RFC2747’s protection of RSVP messages. It also notes that it introduces a Notify message that is not sent to the traffic destination address but instead to a node that requested notification.
One transmission option is that the Notify message is encapsulated in an IP packet and forwarded directly to the requesting node. That complicates the Integrity object protection, unless the shared key is widely shared.

RFC3945 notes that authentication in GMPLS systems may use the authentication mechanisms of the component protocols, pointing to RFC2747 (as well as others for LDP, LMP, etc that don’t apply here).

RFC5920 discusses threats, attacks, and protections for MPLS/GMPLS data and control planes. Section 7.1.2 in particular talks about “Control-Plane Protection with RSVP-TE”, and could be mentioned here explicitly. It talks about network border configuration to limit external attacks, and mentions RFC2747 authentication protections, making some of the same points about non-RSVP clouds and shared key configuration. It also points to RFC4230, which is a very detailed look at RSVP security, and probably deserves to be mentioned here.

So all told, at the end of all the reference chains, the only defined authentication and integrity protection in RFCs 2205, 3209, and 3473 is based on shared keys that are very difficult to configure with fine granularity.

However, as was said in reply to a different MPLS related draft review yesterday:

    The MPLS network is often considered to be a closed network such
    that insertion, modification, or inspection of packets by an
    outside party is not possible.

So maybe that is accepted as sufficient in deployment.

MPLS documents are also typically granted an exception from more rigorous security requirements because MPLS is used only within one routing domain / ISP / provider / etc, under a single administrative control, so errors made would not be global in impact. In particular, errors that might result from one legitimate but faulty/mis-configured/subverted/malicious MPLS node should not propagate out to the general Internet. (**)

Nits:

floating numbers

Page 5, Section 3.1, says “a 32-bit floating number”.
I believe you mean a floating-point number. I checked other IETF RFCs (e.g., RFC8330), and it is common to mention the IEEE 754-2008 standard when including a floating-point value in the spec.

But is a floating-point value needed? The draft says that the values are typically in a small set of known values. The intro sounds like a small set of classes is used for “efficient planning”. Just curious. OTOH, RFC8330 uses floating point, and the ITU documents’ calculation of availability makes it seem like full floating point is needed.

the Availability TLV format:

Page 5, section 3.1 says:

    The Availability TLV has the following format:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     Index     |                   Reserved                    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Availability                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 1: Availability TLV

I presume that this is just the Value portion of the TLV format that is defined for the Ethernet SENDER_TSPEC Object in Section 4 of RFC6003.

Page 1, Abstract:

    typically used for describing these links when during network
    planning

“when during” - is that deliberate? It sounds redundant, maybe due to editing. Or maybe it was supposed to be “when doing”?

    signaling. This extension can be used to set up a Generalized
    Multi-Protocol Label Switching (GMPLS) Label Switched Path (LSP)
    using the Ethernet SENDER_TSPEC object.

Not sure - what is using the SENDER_TSPEC - the LSP or this extension?

Page 3, Section 1:

    bandwidth availability. For example, the bandwidth with 99.999%
    availability of a link is 100 Mbps; the bandwidth with 99.99%
    availability is 200 Mbps.

maybe:

    bandwidth availability. Suppose, for example, the bandwidth with
    99.999%

Page 5, section 3.2:

    TLVs and one or more Availability TLVs. Each Ethernet Bandwidth
    Profile TLV corresponds to an availability parameter in the
    Availability TLV.
… “in an Availability TLV”? or “in the associated Availability TLV”? There’s more than one.

Page 6, section 3.2:

    link), it SHOULD reserve the bandwidth resource from each

“it” -> “the node”

    this LSP. Optionally, the higher availability bandwidth can be

“the higher” -> “a higher” (there’s more than one, right?)

    request cannot be satisfied, it SHOULD generate PathErr message

“it” -> “the node”

    generate PathErr message with the error code "Extended Class-Type

“PathErr” -> “a PathErr” or “PathErr messages”

postscripts:

(**) [[[ I will note that RFC3209 includes an AS number subobject among the subobjects of the EXPLICIT_ROUTE object, with the idea that you might set up explicit routes that go through multiple ASNs. Ouch. I know there are providers who have different ASNs under single administrative control, from acquisitions or business use cases, but this just makes it possible for an explicit route for an LSP to be misconfigured to include your (external) neighbor ASN. And RFC5920 talks about “ASBR-ASBR communication for inter-AS LSPs”. Better have good outbound filters on your border routers. ]]]

(*) As is typical for specifications that extend other published RFCs, this draft says it “does not introduce any new security considerations”. In general, I am skeptical of extension drafts that make such claims. Surely the existing security considerations should be examined to see how they apply to the new feature or object being introduced? Do current protections adequately protect the new feature/object? Does the new feature/object carry new information or produce new behaviors? Etc. But this is so very common that I could hardly request that more be said here. Just saying.
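
To make the arithmetic behind the “calculating availability” comment concrete, here is a minimal Python sketch. It assumes a 365-day year and reads “52 minutes in a year” as the total time the light-rain condition holds. The function names are hypothetical and do not come from the draft or from RFC8330. The second helper only illustrates how such a value might be carried as a 32-bit IEEE 754 floating-point number; expressing availability as a 0-to-1 fraction and using network byte order are assumptions made for this sketch.

```python
import struct

# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60


def availability_from_outage(outage_minutes_per_year: float) -> float:
    """Return the fraction of the year during which the condition does NOT hold."""
    return 1.0 - outage_minutes_per_year / MINUTES_PER_YEAR


def encode_availability(value: float) -> bytes:
    """Pack an availability fraction as a 4-byte IEEE 754 single-precision float.

    Assumption: big-endian (network) byte order and a value between 0 and 1.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return struct.pack(">f", value)


if __name__ == "__main__":
    no_rain = availability_from_outage(52)
    # Share of the year without the light-rain condition, as a percentage.
    print(f"{no_rain * 100:.2f}%")
    # The same value as it might appear in a 32-bit field.
    print(encode_availability(no_rain).hex())
```

Under the reading in the comment above, the first value is the share of the year during which the full 400 Mbps is usable, which is why attaching 99.99% to the 200 Mbps figure looks surprising.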