I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please treat these comments just like any other last call comments. For more information, please see the FAQ at . Document: draft-ietf-ace-revoked-token-notification-06 Reviewer: Dale R. Worley Review Date: 2024-04-03 IETF LC End Date: 2024-04-05 IESG Telechat date: not known Summary: This draft is on the right track but has open issues, described in the review. ** Technical issues: * Section 5.1 includes: For each requester, the AS maintains an update collection of maximum MAX_N series items, where MAX_N is a pre-defined, constant positive integer. The AS MUST keep track of the MAX_N most recent updates to the subset of the TRL that pertains to each requester. This suggests that that the AS needs to maintain either separate collections for each requester or engage in some rather complicated bookkeeping to tell when an item can be reaped (since sometimes the oldest item cannot be reaped but some newer item can be). Is this what is intended? * The Observe mechanism requires that the AS maintain per-observer state for current subscriptions. There are some complications: - Is there a limitation to one subscription per registered client? - The notifications sent for a subscription depend on the parameters of the GET which establishes the subscription. Additionally, the URL to which notifications are to be sent must be recorded. I suspect that these questions are answered (explicitly or implicitly) by the specification of the CoAP Observe mechanism. But I notice that the data that the AS must store to support subscriptions is not stated in Appendix B. - More subtly, a subscription with Cursor implicitly saves the "cursor" value sent with a notification to be used as the implicit input "cursor" value used to construct the payload of the next notification. This processing is "obvious" but I didn't notice it being specified anywhere. And it likely isn't implicit in the general CoAP Observe specification because it isn't data from the GET that is stored to be used as input for generating each successive notification. So it seems that some additional specification is needed. Also, Appendix B doesn't specify this as part of the AS state. * If I were writing this specification, I would replace MAX_INDEX with "INDEX_LIMIT", defined as MAX_INDEX + 1. That is, instead of the constraint "index <= MAX_INDEX" would be "index < INDEX_LIMIT". Then the numerous places where "MAX_INDEX + 1" is used could be changed to just "INDEX_LIMIT": The AS defines INDEX_LIMIT <= (2^64). If i_X is the value of 'index' associated with a series item X, then the following series item Y will take 'index' with value i_Y = (i_X + 1) % INDEX_LIMIT. - ( i_B = INDEX_LIMIT - 2, i_C = INDEX_LIMIT -1, i_D = 0 ) a value (not less than)/(greater than or equal to) INDEX_LIMIT Although then there are a number of places where "MAX_INDEX" is now used that would have to be revised to "INDEX_LIMIT - 1". * Section 3, "Token Hash" specifies two alternative ways of constructing HASH_INPUT from an access token, depending on whether the token value is a byte string or a text string. If the token is a byte string, HASH_INPUT is the CBOR serialization of the byte string, but if the token is a (UTF-8) text string, HASH_INPUT is the text string itself. The document does not describe the motivation for this construction. It cannot be to restrict the bytes in HASH_INPUT to a particular set, because the CBOR serialization of a byte string can contain any byte values. I suspect that the motivation is to ensure that the hash of a byte string is never the same as the hash of a character string. However, this differentiation is not accomplished by the described process. For example, the byte sequence 0x41 0x41 is both the character string "AA" as well as a CBOR serialization of the single-byte string 0x41. For that matter, the byte sequence 0x40 is both the character string "@" as well as the most common CBOR serialization of the null byte string. In addition, for the overall processing to work, a receiver must be able to recognize when a hash matches a token that it possesses. But the CBOR serialization of a byte string is not unique. In the example in section 3, the first byte of the encoding could be 58, 59, 5a, or 5b (corresponding to using a 1-, 2-, 4, or 8-byte extended count) or 5f (corresponding to using an indefinite length encoding). Indeed, since an indefinite-length byte string encoding may contain zero-length chunks, there are an infinite number of valid CBOR serializations of any byte string and the number of HASH_INPUT values and derived hashes for any particular byte string token is infinite. This makes it impossible to determine in general whether a particular hash refers to a particular byte string token. However, these problems can be eliminated (and the overall process simplified) in this manner: To distinguish byte strings from text strings, set HASH_STRING to be the byte 0x80 prepended to the byte string. Since 0x80 can never be the initial byte of a UTF-8 text string, this HASH_STRING value cannot be generated by any text string. * In regard to section 5.2, "Query Parameters": * 'cursor': if included, it indicates to perform a diff query of the TRL together with the "Cursor" extension, as defined in Section 8.2. Its value MUST be either 0 or a positive integer. The way the document is written suggests that "cursor" is an optional extension of "diff", which is itself an optional feature. Naively, that suggests that a query with "cursor" must necessarily include "diff". But that is not stated in the document. However, I'm having a hard time making out the design concept behind the stated requirements. This paragraph says that if the AS *does not* support Cursor, then it MUST *process* GETs with 'cursor' and without 'diff', as a full query (i.e., without reporting an error): If the AS does not support the "Cursor" extension, it ignores the 'cursor' query parameter when present in the GET request. In such a case, the AS proceeds: i) like when processing a diff query of the TRL (see Section 7), if it supports diff queries and the 'diff' query parameter is present in the GET request; or ii) like when processing a full query of the TRL (see Section 6) otherwise. These paragraphs say that if the AS *does* support Cursor, then if it receives a GET with 'cursor' and without 'diff' it MUST *report a 4.00 error*: If the AS supports both diff queries and the "Cursor" extension, and the GET request specifies the 'cursor' query parameter, then the AS MUST return a 4.00 (Bad Request) response in case any of the conditions below holds. ... - The GET request does not specify the 'diff' query parameter, irrespective of the value of the 'cursor' parameter. It seems that whether such a query is valid depends on whether the AS supports Cursor or not; certain ASs are required to treat such a query as valid and other ASs are required to treat it as an error. I suggest rectifying the ontology by saying * 'cursor': if included, it indicates to perform a diff query of the TRL together with the "Cursor" extension, as defined in Section 8.2. Its value MUST be either 0 or a positive integer. If included, the 'diff' parameter MUST also be included. and simplifying the above paragraph to: If the AS does not support the "Cursor" extension, it ignores the 'cursor' query parameter when present in the GET request. In such a case, the AS proceeds as specified elsewhere in this document. That change doesn't change the inconsistent responses to "cursor without diff", but it makes it clear that the requester is making an error. * In regard to section 13.5, "Dishonest Clients": I think that the substance of this section is critical, but the discussion is unnecessarily restricted to situations where the client has dishonest intent. The problem to be addressed is broader, and is essentially stated in the third paragraph: This makes the RS vulnerable during a time interval that starts when [the access token is revoked] ends when the RS expunges the access token, e.g., after having gained knowledge of its revocation. And during that time interval, it is a security violation if a client gains access to the resource, whether or not the client's intention is dishonest or not. I suggest a revision like If a Client attempts to access a protected resource after the access token is revoked but before the RS hosting the resource is aware of the revocation, it will be able to do so even though it should not. Such an access is a security violation even if the Client is not attempting to be malicious. In order to minimize such risk, if an RS relies solely on polling through individual requests to the TRL endpoint to learn of revoked access tokens, the RS SHOULD implement an adequate trade-off between the polling frequency and the maximum length of the vulnerable time window. ** Editorial issues: 1. Introduction It might be worth noting that the process(es) by which a token is declared revoked, and the method by which the AS is notified of that (and consequently updates the TRL), is out of scope. That fact is implicit in this document, but stating it ensures someone doesn't hunt through this document looking for a specification of the revocation process. 2. Protocol Overview After the registration procedure is finished [...] device can send an Observation Request to the TRL endpoint as [...] At any time, the registered device can send a GET request to the CTRL endpoint. When doing so, it can request for: the current [...] It seems that these two paragraphs are separate items and should be marked with "*" like other items in the list. Depending on the specific subscription established through the observation request, the notification provides the current updated list of revoked access tokens in the subset of the TRL pertaining to that device (see Section 6), or rather the most recent TRL updates occurred over that list of pertaining revoked access tokens (see Section 7). Instead of "or rather", the wording probably should be "or alternatively" or "or, if the 'diff' parameter is in effect,". * An administrator can access and subscribe to the TRL like a registered device, while getting the full updated content of the TRL. Perhaps better phrased * An administrator can access and subscribe to the TRL like a registered device, while getting updates of the full content of the TRL. since there is no "un-updated content of the TRL". Consistently, the AS adds the three token hashes to the TRL at once, Does "Consistently" carry any specific meaning here? Actually, I think you mean "Consequently", that is, in consequence of the simultaneous revocation of three tokens. And if so, combining this paragraph with the preceding one would clarify that the actions in this sentence were consequent to the actions in the preceding sentence. 3. Token Hash * If the content of the 'access_token' parameter from step 1 is a CBOR byte string, then HASH_INPUT takes the binary serialization of that CBOR byte string. This is the case where CBOR was used to transport the Access Token (as a CWT or JWT). With reference to the example in Figure 2, and assuming the string's length in bytes to be 119 (i.e., 0x77 in hexadecimal), then HASH_INPUT takes the bytes {0x58 0x77 0xd0 0x83 0x44 0xa1 ...}, i.e., the raw content of the 'access_token' parameter. I would change "HASH_INPUT takes" to "HASH_INPUT is". And similarly in the next item. Probably better phrased "... HASH_INPUT is the CBOR encoding of the byte string." and "... i.e., the CBOR-encoded content of the 'access_token' parameter." (Unless the process is revised as discussed in "Technical issues".) 5. The TRL Endpoint * Diff query: [...] The entry associated with one of such updates contains a list of token hashes, such that: i) the corresponding revoked access tokens pertain to the requester; and ii) they were added to or removed from the TRL at that update. This seems awkward to me. Perhaps better: The entry associated with one of such updates contains the list of token hashes which were added to or removed from the TRL at that update and for which the corresponding revoked access tokens pertain to the requester. -- First, the AS can avoid excessively big latencies when several diff entries have to be transferred, by delivering one adjacent subset at the time, in different diff query responses. This seems awkward to me. Perhaps better: First, the AS can avoid excessively long messages when several diff entries have to be transferred, by delivering several diff query responses, each containing one adjacent subset at a time. 5.1. Supporting Diff Queries The AS MUST keep track of the MAX_N most recent updates to the subset of the TRL that pertains to each requester. Clearer to say For each requester, the AS MUST keep track of the MAX_N most recent updates to the subset of the TRL that pertain to the requester. -- 6. The AS adds the series item to the update collection associated with the requester, as the most recent one. Better 6. The AS adds the series item to the update collection associated with the requester, as the last (most recent) one. 5.1.1. Supporting the "Cursor" Extension The AS defines the constant, unsigned integer MAX_INDEX <= ((2^64) - 1), where "^" is the exponentiation operator. In particular, the value of MAX_INDEX is REQUIRED to be at least (MAX_N - 1), and is RECOMMENDED to be at least ((2^32) - 1). "In particular" isn't quite right when specifying a further constraint. Just start "The value of MAX_INDEX ..." Note that MAX_INDEX is practically expected to be order of magnitudes greater than MAX_N. The phrase you want is either "an order of magnitude", meaning MAX_INDEX >= 10 * MAX_N, or "orders of magnitude", meaning (more intense but less precise) MAX_INDEX is expected to be >= 100 * MAX_N. But you probably want to use "SHOULD", perhaps like this MAX_INDEX SHOULD be (at least an order of magnitude)/(orders of magnitude) greater than MAX_N. Given the error processing in the last-but-one paragraph of 5.2, you should add a warning: MAX_INDEX SHOULD be (at least an order of magnitude)/(orders of magnitude) greater than MAX_N. Note that section 5.2 requires "Cursor" processing to return errors to a requester after index wrap-around has occurred for that requester, so a small value of MAX_INDEX will cause GETs to return errors when otherwise they would not. -- When maintaining the history of updates to the TRL, the following applies separately for each update collection. Clearer to say When maintaining the history of updates to the TRL, the following applies separately for each requester's update collection. -- If the update collection is not empty, 'last_index' has the value of 'index' currently associated with the latest added series item in the update collection. Note that "the latest added series item" is the same as "the last series item". As long as a wrap-around of the 'index' value has not occurred, the value of 'last_index' is the absolute counter of series items added to that update collection until and including V, minus 1. I think "until and including V" should be removed, as "V" isn't the name of any thing, and the sentence has the right meaning without those words. That is, As long as a wrap-around of the 'index' value has not occurred, the value of 'last_index' is the absolute counter of series items added to that update collection, minus 1. -- The specific value depends on the specific registered device or administrator associated with the update collection in question. Strictly, you want to add "MAY", because some implementations may use the same value for all devices/administrators: The specific value MAY depend on the specific registered device or administrator associated with the update collection in question. 6. Full Query of the TRL * The 'full_set' parameter MUST be included and specifies a CBOR array 'full_set_value'. There are quotes around 'full_set_value' as if the value is a parameter name, but "full_set_value" is only used in this document to describe how the response is assembled (similarly to "HASHES", "HASH_INPUT", etc.) and there's no need to quote its name. Similarly for 'diff_entry', 'diff_set_value' and some other variables in various places. 7. Diff Query of the TRL * The 'diff_set' parameter MUST be present and specifies a CBOR array 'diff_set_value' of U elements. Each element of 'diff_set_value' specifies one of the CBOR arrays 'diff_entry' prepared above as diff entry. s/as diff entry/as a diff entry/ Their values are specified according to what is defined in Section 8.2, [...] Can be simplified to "Their values are specified in Section 8.2, [...]". Figure 7 shows an example of response from the AS, [...] Can be simplified to "[...] shows an example response [...]". 8. Response Messages when Using the "Cursor" Extension If it supports both diff queries and the "Cursor" extension, the AS composes a response [...] Clearer to say If the AS supports both diff queries and the "Cursor" extension, it composes a response [...] 8.2.2. Cursor Not Specified in the Diff Query Request [...] the AS performs the same actions defined in Section 7, with the following differences. Can be simplified as [...] the AS performs the actions defined in Section 7, with the following differences. -- If the 'more' parameter has value "true", the requester can send [...] This paragraph should be at the normal left margin (which is outdented from its current position). 9. Registration at the Authorization Server * The hash function used to compute token hashes. This is specified as an integer or a text string, taking value from the "ID" or "Hash Name String" column of the "Named Information Hash Algorithm" Registry [Named.Information.Hash.Algorithm], respectively. Actually, given that the registration process is seemingly not specified anywhere, the only thing that can be specified in this document is the overall semantic effect, which is that some entry in the registry is specified by some method: * The hash function used to compute token hashes. This is specified by identifying an entry in the "Named Information Hash Algorithm" Registry [Named.Information.Hash.Algorithm]. The specific means for this is outside the scope of this document. 10. Notification of Revoked Access Tokens In what way does the AS record the types of observation responses that the client is capable of understanding? There seem to be three varieties, full, diff, and diff-with-cursor. I would guess that the absence of diff in the GET-with-Observe indicates that full observation responses must be sent, the presence of diff without cursor indicates that diff responses (using the saved value of the 'diff' parameter) must be sent, and the presence of diff and cursor indicates that cursor responses must be sent (with the AS caching the cursor value from one response to generate the next response). But all of this seems to be only implicit in this document and needs to be specified. Once completed the registration procedure at the AS, [...] Maybe better "After completing the registration procedure at the AS, [...]". When doing so, the requester towards the TRL endpoint can perform a full query (see Section 6) or a diff query (see Section 7) of the TRL. Or for that matter, a diff-with-cursor query (see Section 8). 10.1. Handling of Access Tokens and Token Hashes An RS stores a token hash th1 corresponding to an access token t1 until both the following conditions hold. It seems like this should be "MUST store". 11. ACE Token Revocation List Parameters The table below summarizes them, and specifies the CBOR value to use as abbreviation instead of the full descriptive name. This sort of thing is probably conventional in CBOR, but to the naive reader a more explicit explanation is useful. Perhaps The table below summarizes the parameters, and specifies the CBOR value to use as the key in the map pair for the parameter instead of the string that is used as the parameter's name in this document. 13. Security Considerations Security considerations are inherited from the ACE framework for Authentication and Authorization [RFC9200], [...] It would be clearer to state ACE token revocation notification inherits the security considerations from the ACE framework for Authentication and Authorization [RFC9200], [...] 13.1. Content Retrieval from the TRL To this end, the AS can perform the required filtering [...] Probably better to be normative: To this end, the AS MAY perform the required filtering [...] unless the intention is to assert that an AS can always use this method to perform the required filtering, in which case "can" should be "can always". Disclosing any information about revoked access tokens to entities other than the intended registered devices [...] "intended" is not well defined here. "pertain" should be used instead. I think you want to say Disclosing any information about revoked access tokens to entities to which they do not pertain [...] 13.2. Size of the TRL Issuing access tokens with not too long expiration time could help reduce [...] Less awkward to say Issuing access tokens with limited expiration time could help reduce [...] 13.4. Request of New Access Tokens [...] the Client might anyway receive an unprotected 4.01 (Unauthorized) response from the RS. Probably better to say [...] the Client may receive an unprotected 4.01 (Unauthorized) response from the RS. since the next paragraph explains why it might receive such a response. -- This can be due to different reasons. For example, the access token has actually been revoked and the Client is not aware about that yet, while the RS has gained knowledge about that and has expunged the access token. Probably better to say This can be due to a number of causes. For example, the access token has been revoked, the RS is aware of it and expunged the token, but the Client is not aware of it (yet). 14.3. ACE Token Revocation List Parameters Registry * CBOR Value: This field contains the value used as CBOR abbreviation of the item. Parallel to the comment on section 11, * CBOR Value: This field contains the value used as CBOR abbreviation of the item, specifically as the key in CBOR maps that are values of media type application/ace-trl+cbor. -- The value can be an unsigned integer or a negative integer. I assume the intention is The value MUST be an unsigned integer or a negative integer. rather than MAY. Similarly for "Value" in the registry of section 14.4. Appendix B. Parameters of the TRL Endpoint "parameters" isn't quite the right word here. Compare with the use of "parameters" in section 5.2. This list seems to combine two things: "configuration parameters", values which have a single instance and are constant over long periods of time for a single AS. For example, MAX_N, MAX_DIFF_BATCH, and MAX_INDEX. It is this uniqueness and constancy that allows the values to be communicated to the client only "upon registration", an event which seems to be intended to happen infrequently. "state variables", values which the AS stores and updates as it performs its various operations. I notice that in this situation, a parameter is a state variable if and only if it has multiple (per-client) instances, but the fundamental difference is whether the parameter changes during normal operation, and that difference isn't clearly noted in this section. There seem to be a lot of state variables which are not listed here. The leading example is the TRL itself, but there is also a lot of information about the observation subscriptions that needs to be maintained. Appendix C. Interaction Examples The details of the registration process are omitted, but it is assumed that the RS sends an unspecified payload to the AS, which replies with a 2.01 (Created) response. The payload of the registration response is a CBOR map, which includes the following entries: Given that the registration process is (seems to be) entirely unspecified, the wording should be a little broader: Registration is assumed to be done by the RS sending a POST request with an unspecified payload to the AS, which replies with a 2.01 (Created) response. The payload of the registration response is assumed to be a CBOR map which includes the following entries: Note the absence of a comma in the last sentence, because we also assume that the response includes the specified entries. If a comma was present, we would be asserting that we are *assuming* that the response is a map, but that assumption *implies* that the map contains the specified entries. * a 'trl_hash' parameter, specifying the hash function used to compute token hashes as defined in Section 3; might as well flesh out to: * a 'trl_hash' parameter, specifying the "Hash Name String" of the hash function used to compute token hashes as defined in Section 3; C.1. Full Query with Observe |<---------------------------------------------------+ | 2.05 CONTENT Observe: 42 | | Content-Format: "application/ace-trl+cbor" | | Payload: { | | "full_set" : [] | | } | | . | | . | | . | | | | (Access tokens t1 and t2 issued | | and successfully submitted to RS) | | . | | . | | . | | | | | | (Access token t1 is revoked) | This might made easier to draw by (1) making sure that every response body is followed by an empty line (I initially mistook that the "Payload" above was followed by more data in the response than is shown.), and (2) replacing the vertical "..." with a horizontal "..." (to save space): |<---------------------------------------------------+ | 2.05 CONTENT Observe: 42 | | Content-Format: "application/ace-trl+cbor" | | Payload: { | | "full_set" : [] | | } | | | | ... | | (Access tokens t1 and t2 issued | | and successfully submitted to RS) | | ... | | (Access token t1 is revoked) | C.3. Full Query with Observe plus Diff Query The example also considers one of the notifications from the AS to get lost in transmission, and thus not reaching the RS. [...] | X<------------------------------------------------+ | 2.05 CONTENT Observe: 86 | | Content-Format: "application/ace-trl+cbor" | You might want to make the message loss more obvious. (I overlooked it the first time I read this section.) Perhaps | Lost <-------------------------------------------+ | 2.05 CONTENT Observe: 86 | | Content-Format: "application/ace-trl+cbor" | Similarly in section C.5. [END]