I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, please see the FAQ at < http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>. Please resolve these comments along with any other Last Call comments you may receive. Document: draft-ietf-mmusic-rfc2326bis-34 Reviewer: Robert Sparks Review Date: 05-Jun-2013 IETF LC End Date: 04-Jun-2013 IESG Telechat date: not yet on a telechat Summary: This draft is on the right track but has open issues, described in the review. I have not reviewed this document at the level of detail I prefer for Gen-art reviews, but have tried to be thorough in the sections I have reviewed. In particular, I didn't verify the tables and the text agree I didn't carefully check the ABNF I haven't thought about possible edge cases and race conditions as much as I would have liked I didn't closely review the security mechanisms in section 19 or the specialized MIKEY keying mechanism in the appendix - both need careful review from security experts. One observation I would like to make before calling out issues and nits: The document is very long, and the structure is unusual - much of the definition of the protocol itself is in the appendices. You are missing an opportunity to make this document much shorter (thereby likely increasing its effectiveness). Much of the jump in from RFC2326 was importing the description of headers and responses from HTTP, tailoring them to RTSP. That was a good exercise in that it highlights some issues that would otherwise be non-obvious (and raises a few questions below). But you followed the style from HTTP perhaps too closely - much shorter descriptions without examples might have done the job better. Overall, separating exposition and examples from the protocol definition would make it much easier for an implementer to find the definition of the protocol, and use the document as a reference when diagnosing a failure to interoperate. Major Issues I'm not seeing what instructs an RTSP element on how to find the server it would try to open a connection to given an rtsp or rtsps URI? Are you assuming it would be doing A or AAAA DNS lookups? Should this protocol use NAPTR/SRV? The document should be explicit. On a related note, (and maybe I missed this), but where do you talk about what an element should check in the server's certificate when connecting over TLS? Are you assuming a common name match? Or are you expecting some SubjectAltName processing? The document should say more about when connection reuse is appropriate, particularly when the connections happened because of an rtsps uri. Two different names might resolve to the same IP address - reusing a connection formed when looking at the first name (and verifying the server's cert) is dangerous. A new connection needs to be formed (and verified) instead. The text talks about the option to queue S->C requests if there isn't a connection to the client available. Ostensibly, this means the request can go down some future connection. It's not clear how the server can tell the right client connected. In particular, when using rtsps, how do you prevent a malicious client from getting such a queued message that wasn't meant for him? Given that the text talks about queuing S-C requests, it should explicitly call out whether a server can queue a response if the connection the associated request arrived on is no longer available. I think what you want to say is that such a response must not be queued, and must be dropped. There are several places in the text that talk about using a 503 response with a Retry-After header to push back on traffic from an element (the first is section 10.7). * It's not clear what the subject of a 503 is. Is it intended to be scoped to requests to the resource in the RURI of the associated request? Is it intended to be scoped to the domain in that resource? Or is it, like in SIP, intended to be scoped to the server issuing the response, independent of what the RURI contained? * If the intent is that the 503 talks about the server, then you should clarify that a proxy doesn't simply forward 503 responses (after rewriting CSeqs), and should probably have a response of its own. Otherwise, clients that might be talking to two different servers through one proxy would lose access to both of them when one of the servers 503ed. Section 4.9.1 lists values the Seek-Style header can take, but 18.45 lists a completely different set (which most of the rest of the document uses). Should 4.9.1 be changed to use the values in 18.45? Are the right strings being used in sections 13.4.4 through 13.4.6? Those appear to be using strings from 4.9.1. The document will not stand if you delete some of the appendices. They aren't supplementary material. Please consider restructuring the normative sections back into the main document, or clearly identifying which ones define protocol and which are background information. Section 15 talks about a "transparent' proxy, but there is no description in this document of what protocol such a thing should follow. What bad thing happens if you remove all mention of "transparent" proxies from the document? As far as I can tell, that won't affect the protocol definition at all. The list of steps for establishing an SRTP cryptographic context in C.1.4.1 says "This specification does not specify an explicit method for indicating this SRTP cryptographic context establishment method, but future specifications may." in the context of looking at the SDP. SDP _has_ methods of indicating what keying mechanism is used with SRTP - are you talking about something different? Why doesn't this section say something like "the use of SRTP with RTSP is only defined with MIKEY with keys established as defined in this section. Future documents may talk about how an RTSP implementation might treat SDP that indicates some other key mechanism be used"? Could you provide an example in the document of an RTSP message carrying SDP reflecting the use of SRTP as defined in this document? Minor Issues The document uses the notion of a "control connection" (it appears first in section 2.5) without defining what that is, or what kind of connections you might have that aren't control connections. The update to the registration of the rtsp and rtsps schema call out that there are changes that can result in interoperability issues. It doesn't say what the issues are, or who might encounter them. I can infer that the point is that there might be problems if a URI following this updated registration were to be processed by an RTSP 1.0 implementation (or any other application that relied on the previous definition). It would be better to say that explicitly, and talk about how a previous implementation is likely to act when presented with a URI that exercised these new changes. (It's not clear to me that all of the thread at www.ietf.org—msg01567.html < http://www.ietf.org/mail-archive/web/uri-review/current/msg01567.html > concluded - I see how the fragments question ended, but what about things like the addition of IPV6 literals? In particular, I'm not finding a response where Ted Hardie agreed that the potential for interoperability failure had been adequately addressed). I think you've said something different than you mean in 5.4 item 1. I think you mean to say that an RTSP implementation would ignore any body that happened to come in message that MUST NOT contain one. What you've said is that the part of the parser that's doing framing has to stop framing at the end of the headers, and that such an errant body would be parsed as a separate message. As written this would keep someone from using an Internet Message Format parser since it would have to ignore any Content-Length that might have appeared, even though it wasn't supposed to. In section 10.4, it looks like a server can keep an RTSP transaction pending forever if it sends 100 responses often enough. I assume the client's recourse if it doesn't want to remain involved is to close the connection. If that's right, it's worth calling it out explicitly in this section. Somewhere, probably in section 15.2, you should be explicit that a proxy that is multiplexing requests MUST keep the requests in the same order as they are folded together. You can infer that, or things simply break, but saying it explicitly would be better. Both GET_PARAMETER and SET_PARAMETER are listed as keep-alive methods in 10.5, but the note in section 13 on table 7 only uses that as a reason for requiring SET_PARAMETER. Why doesn't it also apply to GET_PARAMETER? And why is this only required at servers? "Required" in this section means Mandatory to Implement, I think. If that's right, it should be made explicit. If that's not right, what does it mean? Section 13.8 makes a good argument for always putting parameters to be retrieved in a body. Why are you keeping the complexity of also allowing them in headers? Section 13.10: "That should not provide any benefit." is not clear - can you expand it a little? In Section 13.10 you talk about clients _sending_ media (search for "send or receive media"). Do you mean anything more than possibly RTCP? Can you make that clear in the section? Section 15.2's second paragraph appears to be the only place the proxy rewriting of CSeq in order to multiplex requests is defined, and it is very loose. It would be easy for a reader to fail to recognize this as an interoperability requirement. Please consider expanding how this is specified when addressing the comment about above about keeping requests in relative order when multiplexing. You follow the principle of saying a client should treat a response with an unknown response code, the same as it would treat the x00 of that class (e.g. 400 for an unknown 4xx). However, you do not define what a response code of 300 means. How is a client supposed to handle an unknown 3xx response? As written, a client and server can use HTTP Basic authentication over TCP that is not protected with TLS. Consider restricting it's use to TLS only. Are you sure you don't need to say anything RTSP specific about DIGEST computations (I don't think you need to go as far as SIP did talking about DIGEST, but I'm surprised you haven't run into issues relying only on 2617 literally. How does 411 Length Required for this protocol make sense? Given that you've restricted the protocol to TCP and require the Content-Length header to be present if there is a body, in what circumstance would a server need to send a 411? Section 18.19 requires senders to increment CSeq by 1. We have a reliable transport with no request retransmission, so there should never be gaps at the receiver. Should the receiver check and react some way if there are gaps? I think you should explicitly tell them not to even look. If you agree, why is incrementing by one important as long as you are always strictly increasing? You call out several places where RTCP is essential to allow RTSP using RTP to work correctly, yet section C.1.2 sets up conditions where RTCP MUST NOT be sent. Why are you allowing those instead of treating the conditions you describe as errors? Nits The definition of Message still talks about connectionless transports 22.14 says "These URI schemes are defined in existing registries which are not created by RTSP." Should it say that they are _registered_ rather than defined, and "not created by this document" instead of by RTSP? In section 4.2 in the paragraph discussing ports 554 and 322, why is the language different for rtsp and rtsps? "MUST be used" vs "is registered and MUST be assumed"? Consider a reference in 4.4 to where SMPTE 30 drop is defined, particularly for where the 1/100th frame measurement comes from. Section 4.5 introduces NPT as indicating "the absolute position" and then defines something that can carry a point or a range. I suggest adjusting the first sentence to allow for a range. Section 4.9.1's description of Conditional Random Access contains some ambiguity. It would help be more explicit. I think it's trying to say "would move the client's play point further away from the requested play point than it's play point currently is."? Section 5 : what do you mean by "tricky" switching? Section 7: consider octets instead of bytes Section 10.3: what is "take appropriate action" trying to say? The document mixes using "3rr" and "3xx" to talk about redirect responses. Why are there two ways to say the same thing? (Total nit, but) The examples show dates in 1997, when there could be no RTSP/2.0. Are there other aspects of the examples that might not have been updated? Is "will need to use" at the bottom of page 85 in section 13.5.2 trying to say something that should be said with 2119? Consider saying that a client must not put a Terminate-Reason in a C->S Teardown request, or tweak the definition of Terminate-Reason to discuss client use and the possibility of C->S specific values. These two sentences in 13.7.1 are particularly hard to parse together: "The time period is considered extended when it is 10 times the Session Timeout period. Consideration of the application of the server and its content should be performed when configuring what is considered as extended period of time." I suspect they were not written at the same time? Can you simplify what they are trying to say, noting that the suggested "10 times" is a default, and that if you want to change that default, you need to consider the context? Section 14: "Interleaved binary data SHOULD only be used if RTSP is carried over TCP". Is this leftover from when UDP was an option? (small nit) Section 14, you mention an ASCII dollar sign - maybe you should just say an octet with value 0x24? It's not clear how far up to go with "upper-layer protocol headers" in section 14, given that you are trying to speak generally about carrying things in interleaved data. Your example for RTP may cover the only case that matters. Consider speaking more normatively (instead of with an example) for RTP and provide some guidelines about where to cut the prefixes for other things you might want to carry (lest you end up with the v6 headers too when tunneling some new foo protocol?) In section 17, you say "Status codes that have the same meaning are not repeated here." But you did repeat them (see 200 OK). Should that sentence be deleted? 304 (section 17.3.4) probably should have been a different class response, maybe even a 2xx. The text does call it out as "unusual" for the 3xx, but it would be better to say why this ended up in this response class. I might be confused, but I'm not sure the redirection semantics defined in this document lead to the "black hole" case discussed in 17.4.14. If I'm confused, could you send me an example of such a redirection black hole happening? Section 18.11 first paragraph should be split into two sentences. Is "MUST be transported over TCP" in C.2.1 stale now that no connectionless transport is defined? The end of Appendix G still implies that CSeq was primarily for unreliable transports. There are other places that out and point to proxy multiplexing. All of them should acknowledge that CSeq is necessary for associating responses with requests.