I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, please see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments you may receive.

Document: draft-ietf-bmwg-protection-meth-09.txt
          Methodology for benchmarking MPLS protection mechanisms
Reviewer: Joel M. Halpern
Review Date: 9-March-2012
IETF LC End Date: 20-March-2012
IESG Telechat date: N/A

Summary: This document is almost ready for publication as an Informational RFC.

Major issues:

The definitions section (Section 3) says "This document also uses existing terminology defined in other BMWG work." and follows this with examples, but provides neither a complete list of terms nor a complete list of documents. This approach is unclear: if readers encounter terms they do not know, they have no good indication of which document(s) they should read to repair the gap.

The description of the General Reference Topology (Section 4) seems unclear to me. The text starts out discussing, and the diagram explicitly shows, a Traffic Generator and a Traffic Analyzer. In the diagram these are two separate devices, connected to routers R1 and R5 respectively. So far, so good. However, the text then says that "the Tester" is made up of the Traffic Generator and the Traffic Analyzer, and describes "the Tester" as being directly connected to the Device Under Test. It is exceedingly unclear whether this is supposed to mean that the full collection of routers R1-R6 is the device under test, or whether R1, R5, or some other specific router is the device under test.

Could an effort be made to reword Section 5.7? First it says "one or more traffic streams". Then it says "16 flows". Then it talks about traffic spreading across some set of prefixes. And the explanation of why round-robin across the prefixes is not used leaves me even more confused about what one actually should set up.

Section 5.8, describing the capabilities of "the Tester", seems to contradict Section 4, where "the Tester" is comprised of the Traffic Generator and the Traffic Analyzer. The capabilities listed in Section 5.8 go well beyond that.

The 8 scenarios shown in Section 6 all have Mid-Point PLRs, as far as I can tell. Section 7 says that the test it describes can be applied to all 8 cases from Section 6, but then it carefully describes cases of Headend, Mid-Point, and Egress PLR. No examples of the first or third have been shown. Thus, I do not see how one can, as described in Section 7.1, select a scenario from Section 6 and then establish a Headend PLR.

This reviewer would like to verify that the test procedures described produce a meaningful value for items like Failover Packet Loss and Failover Time. Is there a specific reference for these, since the actual calculations are not described here?

Finding the definition of the Failover Time calculation methods hidden in the reporting format (Section 8) was quite surprising. Given that these are important definitions for the meaning of the tests, they should occur before the test descriptions, not in the reporting format.
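For concreteness, the kind of definition I expected to find before the procedures is roughly the packet-loss-based calculation sketched below. This is my own wording and my own function and parameter names, not text from the draft:

    def failover_time_packet_loss(lost_packets: int, offered_pps: float) -> float:
        """Estimate the outage as the number of packets lost divided by
        the constant offered load, in packets per second."""
        return lost_packets / offered_pps

    # Example: 2,500 packets lost at a constant 50,000 packets/second
    print(failover_time_packet_loss(2_500, 50_000.0))   # 0.05 s, i.e. 50 ms

If that is indeed the intended calculation, stating it (and the timestamp-based alternative) before the test descriptions would make the procedures self-contained.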
Minor issues:

As noted by idnits, Section 3 references [TERM-ID] as a document defining terminology, but there is no such ID in the list of references. And why do the section headers for 5.1, 5.2, and 5.6 also carry "[TERM-ID]"? Even if those section headers are defined terms, it is stylistically unusual to put the reference into a section header. (It almost looks like "TERM-ID" was a marker for things that still needed a proper reference.)

In Section 5.1 a set of example failure events is listed. It is unclear whether this list is the set to be tested for, or just "some" events. In addition, it is unclear why the descriptions of the failures are inconsistent in coverage: the three different monitoring methods are mentioned explicitly for the Interface Shutdown failures but are not even mentioned for the other failures, and while most of the failures list a local or remote side, the last two failures do not indicate a side. Why?

Some of the abbreviations in Section 6 are unclear. For example, since there is no real provider, it is not clear which router(s) are meant by PE as distinct from P routers. Also, while I am familiar with Layer3 VPN, I am not familiar with the usage "Layer2 VC". Further, given that VPNs have different label usages, I suspect that both "Layer3 VPN" and "Layer2 VC" are insufficiently specific to match to a specific size of label stack.

As an example of the above confusion, in the figure in Section 6.1.2, the number of labels in the Layer3 VPN from the PE to the P router is described as going from 2 to 3 upon failure. The PE->P link in the diagram is R1->R2, which is upstream of the failure, so the number of labels on that link won't change. The number of labels on the R6->R3 P->PE link (assuming I have properly guessed what PE is) does go from 2 to 3, but the lines refer simply to PE-P. Similarly, while I suspect that the numbers are accurate, it is very hard to map the pre-failure label counts to the diagrams in a way that explains the difference between the numbers in Sections 6.1.1/6.1.2 and those in Section 6.1.3 and onward. Assuming PE-PE traffic is HE-TE traffic, the internal topology should not affect the label count on that traffic, so you probably mean something else. But I don't know what.

It is unclear what Section 7 means by "Select an overlay technology (e.g. IGP, VPN, or VC)." Please clarify.

Why is Section 7.1.3 (determining tailend performance) included in the document, when no test cases include tailend failure?

Is it an issue that the timestamp-based method for determining the failover time will, on average, overestimate the failover time by one inter-packet interval? (This is based on assuming that the failure and the recovery each fall, on average, uniformly within an inter-packet interval; see the sketch at the end of this review.)

Nits/editorial comments:

Section 7.1.2 item A refers to 9 scenarios from Section 6. There are only 8.

Section 7.4 leaves out the number of scenarios from Section 6, leading to a surprising, but otherwise probably meaningless, difference in wording.
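Postscript on the inter-packet interval question above: a quick back-of-the-envelope simulation, entirely mine and not from the draft, assuming packets sent at a fixed interval, an outage whose start falls uniformly within an interval, and a timestamp method that measures from the last packet received before the gap to the first packet received after it:

    import math
    import random

    def timestamp_overestimate(interval=1.0, outage=10.0, trials=100_000):
        """Average bias of the timestamp-based failover time measurement.

        Packets are sent at t = 0, interval, 2*interval, ...  The outage
        starts uniformly within the first interval and lasts `outage`
        seconds; packets sent during the outage are lost.  The measured
        value is (first packet received after the gap) minus (last packet
        received before the gap)."""
        total_error = 0.0
        for _ in range(trials):
            start = random.uniform(0.0, interval)     # outage begins
            end = start + outage                      # outage ends
            last_rx_before = 0.0                      # packet at t=0 arrived
            first_rx_after = math.ceil(end / interval) * interval
            measured = first_rx_after - last_rx_before
            total_error += measured - outage
        return total_error / trials

    print(timestamp_overestimate())   # roughly 1.0, i.e. about one interval

Over many trials the average overestimate comes out at about one full inter-packet interval (half from the failure side, half from the recovery side), which is why I ask whether this matters for the reported numbers.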