I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair.  Please treat these comments just
like any other last call comments.

For more information, please see the FAQ at

<https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.

Document:  draft-ietf-ace-revoked-token-notification-06
Reviewer:  Dale R. Worley
Review Date:  2024-04-03
IETF LC End Date:  2024-04-05
IESG Telechat date:  not known

Summary:

    This draft is on the right track but has open issues, described in
    the review.

** Technical issues:

* Section 5.1 includes:

   For each requester, the AS maintains an update collection of maximum
   MAX_N series items, where MAX_N is a pre-defined, constant positive
   integer.  The AS MUST keep track of the MAX_N most recent updates to
   the subset of the TRL that pertains to each requester.

This suggests that that the AS needs to maintain either separate
collections for each requester or engage in some rather complicated
bookkeeping to tell when an item can be reaped (since sometimes the
oldest item cannot be reaped but some newer item can be).  Is this
what is intended?

* The Observe mechanism requires that the AS maintain per-observer
state for current subscriptions.  There are some complications:

- Is there a limitation to one subscription per registered client?

- The notifications sent for a subscription depend on the parameters
  of the GET which establishes the subscription.  Additionally, the
  URL to which notifications are to be sent must be recorded.

I suspect that these questions are answered (explicitly or implicitly)
by the specification of the CoAP Observe mechanism.  But I notice that
the data that the AS must store to support subscriptions is not stated
in Appendix B.

- More subtly, a subscription with Cursor implicitly saves the
"cursor" value sent with a notification to be used as the implicit
input "cursor" value used to construct the payload of the next
notification.  This processing is "obvious" but I didn't notice it
being specified anywhere.  And it likely isn't implicit in the general
CoAP Observe specification because it isn't data from the GET that is
stored to be used as input for generating each successive
notification.  So it seems that some additional specification is
needed.  Also, Appendix B doesn't specify this as part of the AS
state.

* If I were writing this specification, I would replace MAX_INDEX with
"INDEX_LIMIT", defined as MAX_INDEX + 1.  That is, instead of the
constraint "index <= MAX_INDEX" would be "index < INDEX_LIMIT".  Then
the numerous places where "MAX_INDEX + 1" is used could be changed to
just "INDEX_LIMIT":

    The AS defines INDEX_LIMIT <= (2^64).

    If i_X is the value of 'index' associated with a series item X, then
    the following series item Y will take 'index' with value i_Y = (i_X +
    1) % INDEX_LIMIT.

    -  ( i_B = INDEX_LIMIT - 2, i_C = INDEX_LIMIT -1, i_D = 0 )

    a value (not less than)/(greater than or equal to) INDEX_LIMIT

Although then there are a number of places where "MAX_INDEX" is now
used that would have to be revised to "INDEX_LIMIT - 1".

* Section 3, "Token Hash" specifies two alternative ways of
constructing HASH_INPUT from an access token, depending on whether the
token value is a byte string or a text string.  If the token is a byte
string, HASH_INPUT is the CBOR serialization of the byte string, but
if the token is a (UTF-8) text string, HASH_INPUT is the text string
itself.

The document does not describe the motivation for this construction.
It cannot be to restrict the bytes in HASH_INPUT to a particular set,
because the CBOR serialization of a byte string can contain any byte
values.  I suspect that the motivation is to ensure that the hash of a
byte string is never the same as the hash of a character string.

However, this differentiation is not accomplished by the described
process.  For example, the byte sequence 0x41 0x41 is both the
character string "AA" as well as a CBOR serialization of the
single-byte string 0x41.  For that matter, the byte sequence 0x40 is
both the character string "@" as well as the most common CBOR
serialization of the null byte string.

In addition, for the overall processing to work, a receiver must be
able to recognize when a hash matches a token that it possesses.  But
the CBOR serialization of a byte string is not unique.  In the example
in section 3, the first byte of the encoding could be 58, 59, 5a, or
5b (corresponding to using a 1-, 2-, 4, or 8-byte extended count) or
5f (corresponding to using an indefinite length encoding).  Indeed,
since an indefinite-length byte string encoding may contain
zero-length chunks, there are an infinite number of valid CBOR
serializations of any byte string and the number of HASH_INPUT values
and derived hashes for any particular byte string token is infinite.
This makes it impossible to determine in general whether a particular
hash refers to a particular byte string token.

However, these problems can be eliminated (and the overall process
simplified) in this manner:  To distinguish byte strings from text
strings, set HASH_STRING to be the byte 0x80 prepended to the byte
string.  Since 0x80 can never be the initial byte of a UTF-8 text
string, this HASH_STRING value cannot be generated by any text string.

* In regard to section 5.2, "Query Parameters":

   *  'cursor': if included, it indicates to perform a diff query of the
      TRL together with the "Cursor" extension, as defined in
      Section 8.2.  Its value MUST be either 0 or a positive integer.

The way the document is written suggests that "cursor" is an optional
extension of "diff", which is itself an optional feature.  Naively,
that suggests that a query with "cursor" must necessarily include
"diff".  But that is not stated in the document.

However, I'm having a hard time making out the design concept behind
the stated requirements.

This paragraph says that if the AS *does not* support Cursor, then it
MUST *process* GETs with 'cursor' and without 'diff', as a full query
(i.e., without reporting an error):

      If the AS does not support the "Cursor" extension, it ignores the
      'cursor' query parameter when present in the GET request.  In such
      a case, the AS proceeds: i) like when processing a diff query of
      the TRL (see Section 7), if it supports diff queries and the
      'diff' query parameter is present in the GET request; or ii) like
      when processing a full query of the TRL (see Section 6) otherwise.

These paragraphs say that if the AS *does* support Cursor, then if it
receives a GET with 'cursor' and without 'diff' it MUST *report a 4.00
error*:

      If the AS supports both diff queries and the "Cursor" extension,
      and the GET request specifies the 'cursor' query parameter, then
      the AS MUST return a 4.00 (Bad Request) response in case any of
      the conditions below holds.

      ...

      -  The GET request does not specify the 'diff' query parameter,
         irrespective of the value of the 'cursor' parameter.

It seems that whether such a query is valid depends on whether the AS
supports Cursor or not; certain ASs are required to treat such a query
as valid and other ASs are required to treat it as an error.

I suggest rectifying the ontology by saying

   *  'cursor': if included, it indicates to perform a diff query of the
      TRL together with the "Cursor" extension, as defined in
      Section 8.2.  Its value MUST be either 0 or a positive integer.
      If included, the 'diff' parameter MUST also be included.

and simplifying the above paragraph to:

      If the AS does not support the "Cursor" extension, it ignores the
      'cursor' query parameter when present in the GET request.  In such
      a case, the AS proceeds as specified elsewhere in this document.

That change doesn't change the inconsistent responses to "cursor
without diff", but it makes it clear that the requester is making an
error.

* In regard to section 13.5, "Dishonest Clients":

I think that the substance of this section is critical, but the
discussion is unnecessarily restricted to situations where the client
has dishonest intent.  The problem to be addressed is broader, and is
essentially stated in the third paragraph:

   This makes the RS vulnerable during a time interval that starts
   when [the access token is revoked] ends when the RS expunges the
   access token, e.g., after having gained knowledge of its
   revocation.

And during that time interval, it is a security violation if a client
gains access to the resource, whether or not the client's intention is
dishonest or not.

I suggest a revision like

   If a Client attempts to access a protected resource after the
   access token is revoked but before the RS hosting the resource is
   aware of the revocation, it will be able to do so even though it
   should not.  Such an access is a security violation even if the
   Client is not attempting to be malicious.

   In order to minimize such risk, if an RS relies solely on polling
   through individual requests to the TRL endpoint to learn of revoked
   access tokens, the RS SHOULD implement an adequate trade-off
   between the polling frequency and the maximum length of the
   vulnerable time window.

** Editorial issues:

1.  Introduction

It might be worth noting that the process(es) by which a token is
declared revoked, and the method by which the AS is notified of that
(and consequently updates the TRL), is out of scope.  That fact is
implicit in this document, but stating it ensures someone doesn't hunt
through this document looking for a specification of the revocation
process.

2.  Protocol Overview

      After the registration procedure is finished [...]  device can
      send an Observation Request to the TRL endpoint as [...]

      At any time, the registered device can send a GET request to the
      CTRL endpoint.  When doing so, it can request for: the current
      [...]

It seems that these two paragraphs are separate items and should be
marked with "*" like other items in the list.

      Depending on the specific subscription established through the
      observation request, the notification provides the current updated
      list of revoked access tokens in the subset of the TRL pertaining
      to that device (see Section 6), or rather the most recent TRL
      updates occurred over that list of pertaining revoked access
      tokens (see Section 7).

Instead of "or rather", the wording probably should be "or
alternatively" or "or, if the 'diff' parameter is in effect,".

   *  An administrator can access and subscribe to the TRL like a
      registered device, while getting the full updated content of the
      TRL.

Perhaps better phrased

   *  An administrator can access and subscribe to the TRL like a
      registered device, while getting updates of the full
      content of the TRL.

since there is no "un-updated content of the TRL".

   Consistently, the AS adds the three token hashes to the TRL at once,

Does "Consistently" carry any specific meaning here?  Actually, I
think you mean "Consequently", that is, in consequence of the
simultaneous revocation of three tokens.  And if so, combining this
paragraph with the preceding one would clarify that the actions in
this sentence were consequent to the actions in the preceding
sentence.

3.  Token Hash

       *  If the content of the 'access_token' parameter from step 1 is
          a CBOR byte string, then HASH_INPUT takes the binary
          serialization of that CBOR byte string.  This is the case
          where CBOR was used to transport the Access Token (as a CWT or
          JWT).

          With reference to the example in Figure 2, and assuming the
          string's length in bytes to be 119 (i.e., 0x77 in
          hexadecimal), then HASH_INPUT takes the bytes {0x58 0x77 0xd0
          0x83 0x44 0xa1 ...}, i.e., the raw content of the
          'access_token' parameter.

I would change "HASH_INPUT takes" to "HASH_INPUT is".  And similarly
in the next item.

Probably better phrased "... HASH_INPUT is the CBOR encoding of the
byte string." and "... i.e., the CBOR-encoded content of the
'access_token' parameter."  (Unless the process is revised as
discussed in "Technical issues".)

5.  The TRL Endpoint

   *  Diff query: [...]

      The entry associated with one of such updates contains a list of
      token hashes, such that: i) the corresponding revoked access
      tokens pertain to the requester; and ii) they were added to or
      removed from the TRL at that update.

This seems awkward to me.  Perhaps better:

      The entry associated with one of such updates contains the list
      of token hashes which were added to or removed from the TRL at
      that update and for which the corresponding revoked access
      tokens pertain to the requester.

--

   First, the AS can avoid
   excessively big latencies when several diff entries have to be
   transferred, by delivering one adjacent subset at the time, in
   different diff query responses.

This seems awkward to me.  Perhaps better:

   First, the AS can avoid excessively long messages when several diff
   entries have to be transferred, by delivering several diff query
   responses, each containing one adjacent subset at a time.

5.1.  Supporting Diff Queries

   The AS MUST keep track of the MAX_N most recent updates to
   the subset of the TRL that pertains to each requester.

Clearer to say

   For each requester, the AS MUST keep track of the MAX_N most recent
   updates to the subset of the TRL that pertain to the requester.

--

   6.  The AS adds the series item to the update collection associated
       with the requester, as the most recent one.

Better

   6.  The AS adds the series item to the update collection associated
       with the requester, as the last (most recent) one.

5.1.1.  Supporting the "Cursor" Extension

   The AS defines the constant, unsigned integer MAX_INDEX <= ((2^64) -
   1), where "^" is the exponentiation operator.  In particular, the
   value of MAX_INDEX is REQUIRED to be at least (MAX_N - 1), and is
   RECOMMENDED to be at least ((2^32) - 1).

"In particular" isn't quite right when specifying a further
constraint.  Just start "The value of MAX_INDEX ..."

   Note that MAX_INDEX is practically expected to be order of
   magnitudes greater than MAX_N.

The phrase you want is either "an order of magnitude", meaning
MAX_INDEX >= 10 * MAX_N, or "orders of magnitude", meaning (more
intense but less precise) MAX_INDEX is expected to be >= 100 * MAX_N.
But you probably want to use "SHOULD", perhaps like this

   MAX_INDEX SHOULD be (at least an order of magnitude)/(orders of
   magnitude) greater than MAX_N.

Given the error processing in the last-but-one paragraph of 5.2, you
should add a warning:

   MAX_INDEX SHOULD be (at least an order of magnitude)/(orders of
   magnitude) greater than MAX_N.  Note that section 5.2 requires
   "Cursor" processing to return errors to a requester after index
   wrap-around has occurred for that requester, so a small value of
   MAX_INDEX will cause GETs to return errors when otherwise they
   would not.

--

   When maintaining the history of updates to the TRL, the following
   applies separately for each update collection.

Clearer to say

   When maintaining the history of updates to the TRL, the following
   applies separately for each requester's update collection.

--

      If the update collection is not empty, 'last_index' has the
      value of 'index' currently associated with the latest added
      series item in the update collection.

Note that "the latest added series item" is the same as "the last
series item".

      As long as a wrap-around of the 'index' value has not occurred,
      the value of 'last_index' is the absolute counter of series items
      added to that update collection until and including V, minus 1.

I think "until and including V" should be removed, as "V" isn't the
name of any thing, and the sentence has the right meaning without
those words.  That is,

      As long as a wrap-around of the 'index' value has not occurred,
      the value of 'last_index' is the absolute counter of series items
      added to that update collection, minus 1.

--

   The specific value depends on the specific registered
   device or administrator associated with the update collection in
   question.

Strictly, you want to add "MAY", because some implementations may use
the same value for all devices/administrators:

   The specific value MAY depend on the specific registered
   device or administrator associated with the update collection in
   question.

6.  Full Query of the TRL

       *  The 'full_set' parameter MUST be included and specifies a CBOR
          array 'full_set_value'.

There are quotes around 'full_set_value' as if the value is a
parameter name, but "full_set_value" is only used in this document to
describe how the response is assembled (similarly to "HASHES",
"HASH_INPUT", etc.) and there's no need to quote its name.

Similarly for 'diff_entry', 'diff_set_value' and some other variables
in various places.

7.  Diff Query of the TRL

       *  The 'diff_set' parameter MUST be present and specifies a CBOR
          array 'diff_set_value' of U elements.  Each element of
          'diff_set_value' specifies one of the CBOR arrays 'diff_entry'
          prepared above as diff entry.

s/as diff entry/as a diff entry/

          Their values are specified according to what is defined in
          Section 8.2, [...]

Can be simplified to "Their values are specified in Section 8.2, [...]".

   Figure 7 shows an example of response from the AS, [...]

Can be simplified to "[...] shows an example response [...]".

8.  Response Messages when Using the "Cursor" Extension

   If it supports both diff queries and the "Cursor" extension, the AS
   composes a response [...]

Clearer to say

   If the AS supports both diff queries and the "Cursor" extension, it
   composes a response [...]

8.2.2.  Cursor Not Specified in the Diff Query Request

   [...] the AS performs the same actions defined in Section 7,
   with the following differences.

Can be simplified as

   [...] the AS performs the actions defined in Section 7,
   with the following differences.

--

         If the 'more' parameter has value "true", the requester can
         send [...]

This paragraph should be at the normal left margin (which is outdented
from its current position).

9.  Registration at the Authorization Server

   *  The hash function used to compute token hashes.  This is specified
      as an integer or a text string, taking value from the "ID" or
      "Hash Name String" column of the "Named Information Hash
      Algorithm" Registry [Named.Information.Hash.Algorithm],
      respectively.

Actually, given that the registration process is seemingly not
specified anywhere, the only thing that can be specified in this
document is the overall semantic effect, which is that some entry in
the registry is specified by some method:

   *  The hash function used to compute token hashes.  This is specified
      by identifying an entry in the "Named Information Hash
      Algorithm" Registry [Named.Information.Hash.Algorithm].  The
      specific means for this is outside the scope of this document.

10.  Notification of Revoked Access Tokens

In what way does the AS record the types of observation responses that
the client is capable of understanding?  There seem to be three
varieties, full, diff, and diff-with-cursor.  I would guess that the
absence of diff in the GET-with-Observe indicates that full
observation responses must be sent, the presence of diff without
cursor indicates that diff responses (using the saved value of the
'diff' parameter) must be sent, and the presence of diff and cursor
indicates that cursor responses must be sent (with the AS caching the
cursor value from one response to generate the next response).  But
all of this seems to be only implicit in this document and needs to be
specified.

   Once completed the registration procedure at the AS, [...]

Maybe better "After completing the registration procedure at the AS,
[...]".

   When doing so, the requester towards the TRL endpoint can perform a
   full query (see Section 6) or a diff query (see Section 7) of the
   TRL.

Or for that matter, a diff-with-cursor query (see Section 8).

10.1.  Handling of Access Tokens and Token Hashes

   An RS stores a token hash th1 corresponding to an access token t1
   until both the following conditions hold.

It seems like this should be "MUST store".

11.  ACE Token Revocation List Parameters

   The table below summarizes them, and specifies the CBOR value to use
   as abbreviation instead of the full descriptive name.

This sort of thing is probably conventional in CBOR, but to the naive
reader a more explicit explanation is useful.  Perhaps

   The table below summarizes the parameters, and specifies the CBOR
   value to use as the key in the map pair for the parameter instead
   of the string that is used as the parameter's name in this document.

13.  Security Considerations

   Security considerations are inherited from the ACE framework for
   Authentication and Authorization [RFC9200], [...]

It would be clearer to state

   ACE token revocation notification inherits the security
   considerations from the ACE framework for Authentication and
   Authorization [RFC9200], [...]

13.1.  Content Retrieval from the TRL

   To this end, the AS can perform the required filtering [...]

Probably better to be normative:

   To this end, the AS MAY perform the required filtering [...]

unless the intention is to assert that an AS can always use this
method to perform the required filtering, in which case "can" should
be "can always".

   Disclosing any information about revoked access tokens to entities
   other than the intended registered devices [...]

"intended" is not well defined here.  "pertain" should be used
instead.  I think you want to say

   Disclosing any information about revoked access tokens to entities
   to which they do not pertain [...]

13.2.  Size of the TRL

   Issuing access tokens with not too long expiration time could help
   reduce [...]

Less awkward to say

   Issuing access tokens with limited expiration time could help
   reduce [...]

13.4.  Request of New Access Tokens

   [...] the Client might anyway receive an unprotected 4.01
   (Unauthorized) response from the RS.

Probably better to say

   [...] the Client may receive an unprotected 4.01 (Unauthorized)
   response from the RS.

since the next paragraph explains why it might receive such a
response.

--

   This can be due to different reasons.  For example, the access token
   has actually been revoked and the Client is not aware about that yet,
   while the RS has gained knowledge about that and has expunged the
   access token.

Probably better to say

   This can be due to a number of causes.  For example, the access
   token has been revoked, the RS is aware of it and expunged the
   token, but the Client is not aware of it (yet).

14.3.  ACE Token Revocation List Parameters Registry

   *  CBOR Value: This field contains the value used as CBOR
      abbreviation of the item.

Parallel to the comment on section 11, 

   *  CBOR Value: This field contains the value used as CBOR
      abbreviation of the item, specifically as the key in CBOR maps
      that are values of media type application/ace-trl+cbor.

--

      The value can be an unsigned integer or a negative integer.

I assume the intention is

      The value MUST be an unsigned integer or a negative integer.

rather than MAY.  Similarly for "Value" in the registry of section 14.4.

Appendix B.  Parameters of the TRL Endpoint

"parameters" isn't quite the right word here.  Compare with the use of
"parameters" in section 5.2.

This list seems to combine two things:

"configuration parameters", values which have a single instance and
are constant over long periods of time for a single AS.  For example,
MAX_N, MAX_DIFF_BATCH, and MAX_INDEX.  It is this uniqueness and
constancy that allows the values to be communicated to the client only
"upon registration", an event which seems to be intended to happen
infrequently.

"state variables", values which the AS stores and updates as it
performs its various operations.  I notice that in this situation, a
parameter is a state variable if and only if it has multiple
(per-client) instances, but the fundamental difference is whether the
parameter changes during normal operation, and that difference isn't
clearly noted in this section.

There seem to be a lot of state variables which are not listed here.
The leading example is the TRL itself, but there is also a lot of
information about the observation subscriptions that needs to be
maintained.

Appendix C.  Interaction Examples

   The details of the registration process are omitted, but it is
   assumed that the RS sends an unspecified payload to the AS, which
   replies with a 2.01 (Created) response.

   The payload of the registration response is a CBOR map, which
   includes the following entries:

Given that the registration process is (seems to be) entirely
unspecified, the wording should be a little broader:

   Registration is assumed to be done by the RS sending a POST request
   with an unspecified payload to the AS, which replies with a 2.01
   (Created) response.  The payload of the registration response is
   assumed to be a CBOR map which includes the following entries:

Note the absence of a comma in the last sentence, because we also
assume that the response includes the specified entries.  If a comma
was present, we would be asserting that we are *assuming* that the
response is a map, but that assumption *implies* that the map contains
the specified entries.

   *  a 'trl_hash' parameter, specifying the hash function used to
      compute token hashes as defined in Section 3;

might as well flesh out to:

   *  a 'trl_hash' parameter, specifying the "Hash Name String" of the
      hash function used to compute token hashes as defined in
      Section 3;

C.1.  Full Query with Observe

          |<---------------------------------------------------+
          |      2.05 CONTENT Observe: 42                      |
          |        Content-Format: "application/ace-trl+cbor"  |
          |        Payload: {                                  |
          |          "full_set" : []                           |
          |        }                                           |
          |                         .                          |
          |                         .                          |
          |                         .                          |
          |                                                    |
          |          (Access tokens t1 and t2 issued           |
          |          and successfully submitted to RS)         |
          |                         .                          |
          |                         .                          |
          |                         .                          |
          |                                                    |
          |                                                    |
          |             (Access token t1 is revoked)           |

This might made easier to draw by (1) making sure that every response
body is followed by an empty line (I initially mistook that the
"Payload" above was followed by more data in the response than is
shown.), and (2) replacing the vertical "..." with a horizontal "..."
(to save space):

          |<---------------------------------------------------+
          |      2.05 CONTENT Observe: 42                      |
          |        Content-Format: "application/ace-trl+cbor"  |
          |        Payload: {                                  |
          |          "full_set" : []                           |
          |        }                                           |
          |                                                    |
          |                        ...                         |
          |          (Access tokens t1 and t2 issued           |
          |          and successfully submitted to RS)         |
          |                        ...                         |
          |             (Access token t1 is revoked)           |

C.3.  Full Query with Observe plus Diff Query

   The example also considers one of the notifications from the AS to
   get lost in transmission, and thus not reaching the RS.
   [...]
          |  X<------------------------------------------------+
          |      2.05 CONTENT Observe: 86                      |
          |        Content-Format: "application/ace-trl+cbor"  |

You might want to make the message loss more obvious.  (I overlooked
it the first time I read this section.)  Perhaps

          |   Lost <-------------------------------------------+
          |      2.05 CONTENT Observe: 86                      |
          |        Content-Format: "application/ace-trl+cbor"  |

Similarly in section C.5.

[END]