I have reviewed this document as part of the security directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other last call comments.

This document describes an extension to RTP that indicates the audio level of a stream without the need to decode and measure the received stream. This is needed so that a conference call mixer does not have to decode all streams to detect which of them contain audio and should be forwarded to participants.

The security considerations section seems to include a good analysis of the security properties this extension could have (including denial-of-service attacks and passive listeners inferring information about the conversation).

I see no issues with this document.

-- kivinen at iki.fi