The ESP Goal
[This TLA conflicts with one for Security and will be changing. raj 12/12]

Internet Engineering Task Force - San Jose
Benchmarking Methodology WG - Rick Jones

Goal
Provide a common framework for the measurement and reporting of network performance for end systems.

Motivations
"TVO Corporation is pleased to announce their new SuperTurbo BUN adaptor, offering 1.5 gigabit/s BUN connectivity. This adaptor replaces our previous 0.155 gigabit/s BUN adaptor, giving our users a 10X improvement in performance." - TVO Marketing

"This link is 100 Mbit/s and that one is 10 Mbit/s, so it must have one-tenth the latency, right?" - Naive User

"I just installed an FDDI adaptor in my TVO WonderWorkstation 4000. Why doesn't FTP go at 12 Mbytes/s?" - Another Naive User

Overview
- Definitions
- Metrics
- Reporting
- Issues

Definitions
End-system definitions should be driven by application-level considerations. As such, they may differ from those for network interconnect devices.

One-Way Latency. The time between the initiation of a message to a remote application and the receipt of the complete message by the remote application.

Round-Trip Latency. The time between the initiation of a message to a remote application and the receipt of the complete response from the remote application.

Throughput. The rate at which data flows from application to application. This MUST NOT include non-application data (e.g. Transport, Network, or Link headers).

Definitions - Cont.
Host. The end-system, including CPU(s) and I/O bus(ses); Transport, Network, and Driver software; excluding the network interface card(s).

Link. The cable(s), fibres, or spectrum used to convey data between end-points. From the perspective of an end-system, switches, hubs, bridges, and the like are all part of the "link."

Card. The I/O card or ICs which connect the Host to the Link.

Host Limited. Replacing the Card and Link with ones of infinite capacity and infinitesimal latency would not (significantly?) improve measured performance. (E.g. when the Host CPU is 100% utilized.)

Card Limited. Replacing the Host and Link with ones of infinite capacity and infinitesimal latency would not (significantly) improve measured performance. (E.g. when the Card is maxed-out on Trans./s or DMA bandwidth.)

Link Limited. Replacing the Card and Host with ones of infinite capacity and infinitesimal latency would not (significantly) improve measured performance. (E.g. when the Link is at its theoretical Frames/s or Bits/s.)

Metrics
End-system metrics should be expressed in units familiar to users of end-systems. PageFaults/Fortnight, while useful and interesting to developers, is completely alien to an end-user. At the same time, metrics should not be abstracted beyond that which is easily measured by a simple component benchmark.

Unidirectional Throughput. People just love to talk about megasomethings per second. Measure the rate of EFFECTIVE (received and useable) throughput. This is especially important for UDP and other "unreliable" transports (or link-level APIs).

Latency. Not many people are familiar with latency. They are especially unfamiliar with its effects. However, it is important that they have some feel for the latency of an end-system solution. It is too easy to hide a card's horrendous latency behind large TCP windows or a UDP stream test. Given that microseconds are hard to fathom, have the units be analogous to those for Throughput - Transactions/s. Seconds per transaction (s/Trans.) can also be reported. A minimal request/response sketch follows this slide.
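As an illustration of the Transactions/s unit, here is a minimal, non-normative sketch of a TCP request/response client. It assumes an echo-style responder is already listening on the given host and port; the request and response sizes, the 60-second run time, and the output format are illustrative choices, not part of this proposal.

/* tcp_rr_sketch.c - minimal TCP request/response client.
 * Assumes an echo-style responder is listening on host-ip:port and
 * returns RSP_SIZE bytes for every REQ_SIZE-byte request.
 * REQ_SIZE, RSP_SIZE, and RUN_SECONDS are illustrative, not normative. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define REQ_SIZE    1      /* application bytes per request        */
#define RSP_SIZE    1      /* application bytes per response       */
#define RUN_SECONDS 60     /* one possible "long enough" test time */

int main(int argc, char **argv)
{
    char req[REQ_SIZE], rsp[RSP_SIZE];
    struct sockaddr_in sin;
    struct timeval start, now;
    double elapsed = 0.0;
    long trans = 0;
    int s;

    if (argc != 3) {
        fprintf(stderr, "usage: %s host-ip port\n", argv[0]);
        return 1;
    }

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = inet_addr(argv[1]);
    sin.sin_port = htons((unsigned short)atoi(argv[2]));

    s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0 || connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        perror("connect");
        return 1;
    }

    memset(req, 'x', sizeof(req));
    gettimeofday(&start, NULL);
    do {
        ssize_t got = 0, n;

        /* one transaction: send the request, read the complete response */
        if (send(s, req, sizeof(req), 0) != (ssize_t)sizeof(req))
            break;
        while (got < RSP_SIZE) {
            n = recv(s, rsp + got, RSP_SIZE - got, 0);
            if (n <= 0)
                break;
            got += n;
        }
        if (got < RSP_SIZE)
            break;
        trans++;
        gettimeofday(&now, NULL);
        elapsed = (now.tv_sec - start.tv_sec) +
                  (now.tv_usec - start.tv_usec) / 1e6;
    } while (elapsed < RUN_SECONDS);

    if (trans > 0 && elapsed > 0.0)
        printf("%ld transactions in %.2f s: %.1f Trans/s, %.3f ms/Trans\n",
               trans, elapsed, trans / elapsed, 1000.0 * elapsed / trans);
    close(s);
    return 0;
}

Because only one request is ever outstanding, the reported Trans/s is simply the reciprocal of the mean round-trip latency, which keeps the figure of merit in units that parallel those for Throughput.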
Metrics - Cont.
Speed. The performance of a single instance of a benchmark. Examples would be a single TCP connection, a single VC, and so on.

Capacity. The aggregate performance of multiple instances of a benchmark. Examples would be multiple TCP connections or VCs.

Reporting
If performance cannot be reported in a consistent manner which allows reproduction of results, there is not much sense in reporting it at all.
- At least one figure of merit of the Speed type from each of Throughput and Latency MUST be reported. At least five from each SHOULD be available.
- At least one figure of merit of the Capacity type from each of Throughput and Latency SHOULD be reported. At least five from each MAY be available.
- Software/Hardware/Firmware revisions for all components SHOULD be reported. Test parameters, Vendor, Model, and OS MUST be reported.

Issues
There are lots of issues.
- Where does Loopback (127.0.0.1) loop back?
- Homogeneous or heterogeneous capacity metrics? (Prefer homogeneous - it is simpler.)
- Explicit synchronization of capacity tests, or "long enough" test times?
- More than 2 systems? N systems in a star? N systems in a mesh?
- Should the benchmark application use select/poll? (Go with plain blocking sockets for simplicity.)
- What is a sufficient run time? 60 seconds? 600 seconds?
- Should the end-systems and networks be completely isolated?
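To make the "effective throughput" metric above concrete, and to reflect the working answers suggested for two of the issues (plain blocking sockets, a fixed "long enough" run time), here is a minimal, non-normative sketch of a UDP receiver that counts only the application bytes it actually receives during a fixed interval. The port number, buffer size, and 60-second run time are illustrative assumptions, not requirements of this proposal.

/* udp_effective_sketch.c - receiving-side measurement of EFFECTIVE UDP
 * throughput: only application payload bytes that actually arrive and
 * are read by the application are counted - never transport, network,
 * or link headers, and never datagrams that were sent but lost.
 * RECV_PORT, BUF_SIZE, and RUN_SECONDS are illustrative, not normative. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define RECV_PORT   12865  /* hypothetical data port                */
#define BUF_SIZE    65536  /* large enough for any single datagram  */
#define RUN_SECONDS 60     /* one possible "long enough" test time  */

int main(void)
{
    char buf[BUF_SIZE];
    struct sockaddr_in sin;
    struct timeval start, now, tv;
    double elapsed = 0.0, bytes = 0.0;
    long messages = 0;
    int s;

    s = socket(AF_INET, SOCK_DGRAM, 0);
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(RECV_PORT);
    if (s < 0 || bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        perror("bind");
        return 1;
    }

    /* a one-second receive timeout keeps the blocking loop from
     * hanging forever if the sender stops early */
    tv.tv_sec = 1;
    tv.tv_usec = 0;
    setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    gettimeofday(&start, NULL);
    do {
        ssize_t n = recv(s, buf, sizeof(buf), 0);
        if (n > 0) {
            bytes += n;        /* application payload only */
            messages++;
        }
        gettimeofday(&now, NULL);
        elapsed = (now.tv_sec - start.tv_sec) +
                  (now.tv_usec - start.tv_usec) / 1e6;
    } while (elapsed < RUN_SECONDS);

    printf("%ld messages, %.0f bytes in %.2f s: %.3f Mbit/s effective\n",
           messages, bytes, elapsed, bytes * 8.0 / elapsed / 1e6);
    close(s);
    return 0;
}

Whatever the sender transmits, only the payload bytes that survive to the receiving application are counted, so headers and lost datagrams never inflate the reported figure.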