
Apdex-G and Apdex-R Composite Sections 1-5

I’m writing a series of posts about Generalizing Apdex. This is #25. To minimize confusion, section numbers in the current spec are accompanied by the section symbol, like this: §1. The corresponding section numbers in the generalized spec documents, Apdex-G and Apdex-R, are enclosed in square brackets and prefixed by the document type, like this: [G 1] and [R 1].

Drafts of the Apdex-G spec were posted as Section [1] Introduction, Section [2] Index Overview, Section [3] Calculation Inputs, Section [4] Calculating the Index, Section [5] Reporting, and Section [6] References.

Drafts of the Apdex-R spec were posted as Section [1] Introduction and [2] Overview, Section [3] Calculation Inputs, Section [4] Calculating the Index, and Section [5] Reporting. Those posts show how the proposed Apdex-R spec is derived from the present Apdex spec.

This post consolidates the Apdex-G and Apdex-R proposals side-by-side, without annotations, allowing sections 1-5 of both documents to be viewed together.

Apdex-G, second draft:

Apdex-R, first draft:

[R 1] Introduction

Apdex (Application Performance Index) is a metric that reflects the degree to which a set of performance measurements achieves designated targets.

This specification defines Apdex-R, the Apdex index of customer satisfaction with transactional application response times.

[G 1] Introduction

Apdex (Application Performance Index) is a metric that reflects the degree to which a set of performance measurements achieves designated targets.

Originally designed as an index of user satisfaction with the responsiveness of enterprise applications, Apdex can also be used to report on many other quality measures, such as the degree to which a product conforms to technical standards, or the degree to which a service achieves business objectives.

[R 1.1] Apdex Specification Documents

A master specification document, Apdex-G, specifies rules for calculating and reporting the Apdex index. The suffix ‘G’ denotes generic, indicating that the master document specifies domain-independent rules.

This document, Apdex-R, is a supplemental document, termed an Addendum. The suffix ‘R’ is the Addendum Type or Document Type. An addendum contains rules that apply only to applications of Apdex within a specific measurement domain.

Note that Apdex-R contains supplemental rules for applying Apdex to response times; except where specifically stated, an addendum does not replace any provision of Apdex-G.

References to sections or paragraphs of Apdex documents are distinguished by the inclusion of the document type with the paragraph number, for example [G 1.1] or [R 1.1].

[G 1.1] Apdex Specification Documents

This document, Apdex-G, specifies rules for calculating and reporting the Apdex index.

The suffix ‘G’ denotes generic, indicating that this document specifies domain-independent rules. A supplemental document, termed an addendum, may contain rules that apply only to applications of Apdex within a specific measurement domain. An addendum is named Apdex-X, where the suffix ‘X’ is the Addendum Type or Document Type, and denotes a specific measurement domain. For example, the addendum Apdex-R contains supplemental rules for applying Apdex to response times.

References to sections or paragraphs of Apdex documents are distinguished by the inclusion of the document type with the paragraph number, for example [G 1.1] or [R 1.1].

[R 1.2] Document Status

This document was developed by the Apdex Alliance, a group of companies and individuals collaborating to promote Apdex. It was ratified by …

[ Ratification history/status of Apdex-R will be inserted here ]

Apdex specifications may be updated from time to time to reflect feedback based on practical experience with Apdex. Current versions of all Apdex specifications are available from the Alliance web site, www.apdex.org.

[G 1.2] Document Status

This document was developed by the Apdex Alliance, a group of companies and individuals collaborating to promote Apdex. It was ratified by …

[ Ratification history/status of Apdex-G will be inserted here ]

Apdex specifications may be updated from time to time to reflect feedback based on practical experience with Apdex. Current versions of all Apdex specifications are available from the Alliance web site, www.apdex.org.

[R 1.3] Broader Apdex Goals

See [G 1.3] for broader Apdex goals that fall outside the scope of this specification.

[G 1.3] Broader Apdex Goals

The Apdex Alliance has adopted the following statements of intent:

    Tools: Members of the Alliance will implement tools and/or services that adhere to this specification.
    Research: Members of the Alliance support an ongoing process of inquiry into the relationship between application or process performance and customer satisfaction.
    Education: Members of the Alliance will develop and maintain additional documents and educational material to help enterprises understand the Apdex metric and put it to productive use within their organizations.
    Documentation: Members of the Alliance will contribute to an Apdex Technical Guide, which will provide detailed information about implementing Apdex.

These statements reflect goals that fall outside the scope of this specification. Current information about Apdex Alliance goals and activities can be found at the Alliance web site, www.apdex.org

[R 2] Index Overview

See [G 2] for a general statement of the requirement for a performance index; this section focuses on the requirement to report response times using an index.

Many aspects of performance affect the delivery and management of information technology. One critical performance factor is the responsiveness of the human-computer interface from the perspective of the human user. This responsiveness is a core quality of a transactional application. The speed by which an application reacts to the needs of the user directly affects the user’s productivity and satisfaction with the application experience. It is a critical metric of application performance that has direct implications for business revenue, customer retention, and user productivity.

Therefore, measuring, reporting, and tracking application response time is important to any enterprise that values the opinions and productivity of its users. The Application Performance Index (“Apdex”) defines a methodology for reporting the responsiveness of human-application transactions, in terms of its effect on user satisfaction and productivity.

[G 2] Index Overview

Performance is a critical metric of application or process quality that has direct implications for business revenue, customer retention, and user productivity. Users of any application or process are directly affected by its performance. An enterprise that values the opinion of its customers, or the productivity of its internal users, must measure and track the performance of its applications and processes.

Collecting measurements is only one step in the proper management of user experience. Managers must set quality objectives, and track whether those objectives are being met. Such tracking requires that large numbers of individual performance measurements be summarized for easy review. To address this requirement, the Apdex standard defines a method for reporting the degree to which a set of performance measurements meet designated targets.

[R 2.1] Index Objectives

See [G 2.1] for the goals of the Apdex metric.

[G 2.1] Index Objectives

The fundamental objective of Apdex is to simplify the reporting of any application or process measurements by making it possible to represent a set of such measurements using a common metric. Measurements can describe many applications or processes, and their magnitudes can vary widely. The Apdex metric normalizes this variability, producing values that lie within a fixed range for all measurement inputs.

The goals of the Apdex metric are:

  1. To be easy to understand
  2. To be a summary of the quality or performance of an application or process, based on a set of measurements
  3. To be a performance indicator for any application or process that is subject to performance targets
  4. To be dimensionless, irrespective of the measurement domain being reported
  5. To lie within a fixed range of values, irrespective of the measurement data being summarized
  6. To operate so that index values of 1 and 0 indicate the best and worst performance respectively
  7. To operate so that a particular value of the index indicates a comparable level of quality for different applications or processes
  8. To operate so that different Apdex tools will report similar index values for user experiences having comparable levels of quality

This document defines a standard by which creators of measurement and reporting tools can achieve these objectives.

[R 2.2] Terminology

See [G 2.2] and [G 7] for Apdex terminology. The following terms are also defined within Apdex-R:

Each Apdex term below is followed by its general definition and its HTML/HTTP equivalent:

Session
General definition: The period of time that a user is “connected” to an application; a continuous period of time during which the user is interacting with the application.
HTML/HTTP equivalent: No specific HTML or HTTP markers define a session. One proxy is an IP address connection (open-to-close); another is the assignment of a session identification cookie followed by a period of inactivity.

Process
General definition: A multi-step series of user interactions (Tasks, see below), together with the corresponding user Think Times, that may (buy a book, look up an address, get a stock quote) or may not (no clear end) be well defined. Often called the “transaction” or “application use case.”
HTML/HTTP equivalent: A series of web pages that allow a user to perform a defined application function, together with the user interaction times (“Think Times”) required to interact with those pages.

Task
General definition: Each user interaction with the application during a Process. Task time is measured from the moment the user enters an application query, command, function, etc., that requires a server response, to the moment the user receives the response such that they can proceed with the application. Often called the “user wait time” or “application response time.”
HTML/HTTP equivalent: A page download, consisting of a container and zero or more component objects, and the time required to perform that download, measured from the time of a user interface action until the page download completes and the page is available for use.

Task Chain
General definition: A defined sequence of Tasks. The Tasks comprising a Task Chain may or may not correspond to a complete Process (as defined above). Furthermore, the time of a Task Chain corresponds to the sum of the Task times only, and does not include any user Think Time that may occur between Tasks.
HTML/HTTP equivalent: A defined sequence of Web pages, and their associated download times, excluding any Think Times between user interface actions.

Turn
General definition: Each application client-server software interaction needed to generate a user response or Task (see above). These software-level client-server interactions add to the time it takes the software to complete a Task; the user does not see Turns operating. A Turn is a client-server request-driven round trip. Often called application “chattiness.”
HTML/HTTP equivalent: HTTP GETs for parts of a Web page or frame. In HTML, each object, such as a container (index.html), component (image.gif), or stand-alone document (program.zip), generates a Turn.

Protocol
General definition: Turns are further deconstructed into the transport protocol events required to reliably move information among computers. These include DNS look-ups, TCP opens, TCP transmission, TCP ACKs, retransmissions, etc.
HTML/HTTP equivalent: TCP is the protocol supporting HTTP. HTML/HTTP do not generate additional events at this layer of the transactional model.

Packet
General definition: The smallest unit of interaction and transmission between the user’s client and the application server. Packets are routed and transported by networks from source to destination.
HTML/HTTP equivalent: HTML/HTTP do not generate additional events at this layer of the transactional model.

Table 1 – Taxonomy of Transactional Computer Applications

[G 2.2] Terminology

The following terms are defined:

Measurement Domain
The set of all possible values from which the input to an Apdex calculation is drawn. Unless limited by a domain-specific Addendum, this will be the real number domain (-∞, ∞).
Apdex Tool
A system that calculates Apdex values, or reports Apdex values, or does both. The calculation and reporting functions may be separated between an Apdex analysis tool and an Apdex reporting tool. Apdex tools may employ software or hardware components, or both.
Tool Creator
The developer(s) of an Apdex tool. In practice, a tool creator may refer to a tool vendor, an open-source community, or an individual developer.
Technician
The person who controls an Apdex tool, or who reads and interprets Apdex values. Typically, a technician is an enterprise staff member responsible for application or process quality.
User or Customer
A person served by an application or process that is the subject of an Apdex index.

Other Apdex terms are defined in context within this document and its addenda. Section [G 7] Glossary collects together all Apdex terminology.

[R 2.3] Measurement Types: User-Application Interaction Model

See [G 2.3] for an introduction to Measurement Types in Apdex; this section focuses on response time measurements.

In order for the index to properly report on users’ perceptions of application responsiveness, the measurements that are used as input must measure the actual time that a user is waiting for the application to respond. Transactional applications are characterized by a query-response interaction process. A user types, clicks, or makes some entry followed by a period of time when the system (network, computers, databases, etc.) handles the entry and generates a response. This period is typically called “user wait time.” The critical factor regarding the duration of this time is that the user is typically incapable – even prevented from – proceeding with the application until the response is delivered. The user will then read, react, or in some way think about the response that will then lead him or her to make the next entry. This period of time where the system is waiting for the user to make the next entry is typically called “user think time.” This enter-wait-think cycle repeats many times during the user’s session with the application.

The user’s perception of the application’s responsiveness or speed is formed by an accumulated number of views regarding the lengths of wait times or response times. Many components of a client-server interaction make up response time. Transactional applications have complex behaviors that operate at many levels. Section [R 2.2] defines a taxonomy comprising seven layers of interaction, and illustrates that taxonomy by identifying the layers of a web-based application built using HTTP/HTML. Transactional applications based upon other protocols or proprietary schema follow the same taxonomy and may therefore be reported by tools using Apdex-R.

[G 2.3] Measurement Types

This document specifies domain-independent rules that apply to all measurement types. An addendum may define a class of measurements within a specific measurement domain, and specify supplemental rules that apply only to that class of measurements. The addendum type (see [G 1.1]) may also be referred to as the Document Type, or the Measurement Type of that class of measurements.

[R 2.4] Measurement Subtypes: Processes, Tasks and Task Chains

Within the taxonomy described in Section [R 2.2], application response time is defined to be the time that elapses between a user completing a Task entry and the computer system responding with sufficient information to allow the user to proceed to the next Task. Furthermore, human users are most aware of the responsiveness of an application delivery system at its Task and Process layers, because these layers usually correspond to two distinct and readily identifiable aspects of a user’s interaction with a computer system, as follows:

Task Time
An instance of application response time a human user experiences while the computer system is performing one operation (or one step) of an application.
Process Time
The sum of the Task Times a user experiences while performing a sequence of related Tasks necessary to complete a single application function, including user think time. This is the actual time a user experiences while completing a Process.
Task Chain Time
The sum of the Task Times of a defined series of Tasks. Task Chain Time differs from Process time in two ways: (a) the total time does not include user think time, and (b) the set of Tasks does not necessarily correspond to a complete Process.

In Section [R 2.5.1], this specification defines the Apdex method for computing and reporting measurements of Task Time and Task Chain Time.
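The distinction between Process Time and Task Chain Time reduces to whether user Think Time is included in the sum. A minimal sketch in Python (which this post does not otherwise use), with entirely hypothetical task and think times:

```python
# Hypothetical three-task interaction: each tuple is
# (task_time, think_time_after_task) in seconds.
steps = [(2.0, 5.0), (3.5, 8.0), (1.5, 0.0)]

# Task Chain Time: the sum of Task Times only ([R 2.4]).
task_chain_time = sum(task for task, _ in steps)

# Process Time: Task Times plus the Think Times between them.
process_time = sum(task + think for task, think in steps)

print(task_chain_time)  # 7.0
print(process_time)     # 20.0
```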

TBD: When reporting response-time measurements, is Apdex-R mandatory, or an alternative to Apdex-G? Is the answer the same for Tasks, Task Chains, and other response times?

[G 2.4] Measurement Subtypes

This document specifies domain-independent rules that apply to all measurement types. An addendum that addresses a particular measurement type may also identify subtypes of that measurement type, and specify supplemental rules that apply to each of those subtypes.

[R 2.5] Satisfaction Levels: How Users Experience Application Response Times

Apdex satisfaction levels are defined in [G 2.5]. For interactive applications, the three levels of user satisfaction with responsiveness are further defined as follows:

Satisfied
The response time is fast enough to satisfy the user, who is therefore able to concentrate fully on the work at hand.
Tolerating
The response time is slower than desired, potentially impairing user productivity, but not slow enough to make the system or application unusable. A user notices how long it takes to interact with the system, but can tolerate the slower response.
Frustrated
As response times increase, at some threshold the user becomes unhappy with slow performance, becoming frustrated. A casual user is likely to abandon a course of action and a production user is likely to cancel a task or process.

[G 2.5] Satisfaction Levels

Apdex works by assigning a satisfaction level to each measurement, as defined below:

Satisfaction Level
One of three categories (Satisfied, Tolerating, or Frustrated) that reflects how an organization using Apdex evaluates the performance of the application or process being reported. For the Apdex method to be applicable to a Measurement Domain, it must be possible to assign a Satisfaction Level to any value within the Measurement Domain.
Quality Level
An alternative term for Satisfaction Level.
Satisfied
A Satisfaction Level that reflects the desired level of performance for the application or process being reported. Measurements classified as Satisfied usually indicate that the application or process being measured is meeting its targets and needs little or no attention.
Tolerating
A Satisfaction Level that reflects a level of performance below the desired level, but which can be tolerated. Measurements classified as Tolerating may signal the need for caution or greater attention to the application or process being measured.
Frustrated
A Satisfaction Level that reflects an unsatisfactory level of performance for the application or process being reported. Measurements classified as Frustrated may be associated with undesirable outcomes, and lead to remedial or abnormal actions.

[TBD, include these examples of abnormal actions? For example, a customer may abandon an unsatisfactory transaction, an executive may assign more resources to a critical business process that is not meeting its assigned targets, or a supervisor may halt a manufacturing process whose product is outside the specified tolerance.]

[R 2.5.1] Thresholds and Performance Intervals

See [G 2.5.1] for general definitions of Apdex Thresholds and Performance Intervals. When using Apdex to report response time measurements, Apdex-R simplifies the general method by eliminating the need to specify Performance Intervals, and by specifying just one or two Thresholds, as follows:

Threshold
A response time value that defines a boundary between two adjacent performance zones. Apdex-R defines two thresholds, the tolerating threshold (T) and the frustrated threshold (F).
T
The Tolerating Threshold. The value T is contained in the Satisfied Zone, and is the boundary between the Satisfied Zone and the Tolerating Zone.
F
The Frustrated Threshold. The value F is contained in the Tolerating Zone, and is the boundary between the Tolerating Zone and the Frustrated Zone. The default value of F is four times the value of T. When this default is adopted, only the value of T has to be specified.

Applying The Default F=4xT

In Apdex-R, the default value of F is four times the value of T. This means, for example, that if users perceive response time as tolerable beginning at 4 seconds, then they will be frustrated at response times slower than 16 seconds. However, tool creators are expected to support separate specification of T and F, and users of Apdex-R are encouraged to review the applicability of the default rule in their own reporting environment.
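Under these definitions, assigning a response time to a performance zone reduces to two comparisons. A minimal sketch in Python, applying the default F=4xT and the rule that T lies in the Satisfied Zone and F in the Tolerating Zone:

```python
def classify(response_time, t, f=None):
    """Assign a response time to an Apdex-R performance zone.

    F defaults to 4 * T per [R 2.5.1], though tools must still allow
    F to be specified independently. A threshold belongs to the zone
    it bounds from above: a sample exactly at T is Satisfied, and a
    sample exactly at F is Tolerating.
    """
    if f is None:
        f = 4 * t
    if response_time <= t:
        return "Satisfied"
    if response_time <= f:
        return "Tolerating"
    return "Frustrated"

print(classify(3.0, t=4.0))   # Satisfied
print(classify(10.0, t=4.0))  # Tolerating (F defaults to 16)
print(classify(20.0, t=4.0))  # Frustrated
```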

Basis for The Default F=4xT

Research supporting the default assertion F=4xT is described in Reference [G 6.1]. Further background information is available in References [G 6.2], [G 6.3], and [G 6.4] and on the Apdex web site www.apdex.org.

Most research into application response times has focused on user attitudes and behaviors when exposed to differing Task times. Since Processes and Task Chains are simply sequences of Tasks, this Apdex specification assumes that applicable research conclusions obtained for Tasks can be generalized to apply to Task Chains also.

Because the set of Task times comprising a Task Chain time are likely either to be independent (because environmental conditions affecting one Task have no effect on other tasks in a chain), or to be positively correlated (because some or all of the Tasks share a common system environment) it is reasonable to generalize the conclusions obtained for single Task times to Task Chain times.

Such a generalization would not be valid if the Task times comprising a Task Chain tended to be negatively correlated, because in that case (for example) user frustration with one task could be offset by obtaining rapid response time from another task in the chain. Users accustomed to such offsetting behavior might well expect smaller levels of variance in total Task Chain times, and thus develop thresholds for Task Chains that were lower than the sum of the individual Task thresholds, and which did not adhere to the rule that F is four times T. In this situation, the proposed Apdex mathematics would still work, but the resulting Apdex score could overestimate the users’ perceived quality of a Task Chain.

At present, Apdex considers such negatively correlated Task times unlikely. Therefore the default relationship F=4xT may be employed when reporting measurements of a single Task or of a Task Chain.

[G 2.5.1] Thresholds and Performance Intervals

A performance interval is a range of values within the measurement domain that share a common satisfaction level. Thresholds define the boundaries of performance intervals:

Threshold
One of N distinct values within the Measurement Domain, which partition the Measurement Domain into N+1 contiguous non-empty Performance Intervals in such a way that measurement values within a Performance Interval have identical Satisfaction Levels. Each Threshold lies within one and only one Performance Interval. Unless otherwise specified by a domain-specific Addendum, a Threshold lies within the Performance Interval for which it acts as the upper bound.
Threshold Name
Thresholds can have user-assigned names. Tools may also assign default generic names (T1, T2, …) to thresholds based on their order, from low to high. An Addendum may specify domain-specific names and orderings.
Performance Interval
A partition of the Measurement Domain defined by Thresholds. All measurement values within a Performance Interval have the same Satisfaction Level, which is therefore considered to be an attribute of the Performance Interval. There must be at least three Performance Intervals, one for each of the three Satisfaction Levels. When the context is unambiguous, the term ‘Performance Interval’ may be shortened to Interval.
Performance Interval Name
Performance Intervals can have user-assigned names. Tools may also assign default generic names (such as PI1, PI2, …) or names signifying the Satisfaction Level of the Interval (such as PIS, PIT1, PIT2, PIF1, …). Default names should be assigned to Intervals based on their order, from low to high. An Addendum may specify domain-specific names and orderings.
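The general rule of [G 2.5.1], N sorted thresholds partitioning the measurement domain into N+1 intervals with each threshold belonging to the interval it bounds from above, can be sketched in Python using the standard library's bisect module; the example threshold values and satisfaction levels here are purely illustrative:

```python
import bisect

def interval_index(value, thresholds):
    """Map a measurement to one of N+1 performance intervals defined
    by N sorted thresholds. Per [G 2.5.1] a threshold lies in the
    interval for which it acts as the upper bound, so a value equal
    to a threshold maps to the lower interval; bisect_left gives
    exactly that behavior.
    """
    return bisect.bisect_left(thresholds, value)

thresholds = [4.0, 8.0, 16.0]  # N = 3 thresholds -> 4 intervals
# One satisfaction level per interval, ordered low to high (illustrative).
levels = ["Satisfied", "Tolerating", "Tolerating", "Frustrated"]

print(levels[interval_index(4.0, thresholds)])  # Satisfied (4.0 bounds interval 0)
print(levels[interval_index(9.0, thresholds)])  # Tolerating
```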

[R 2.5.2] Performance Zones

See [G 2.5.2] for general definitions of Apdex Performance Zones. When using Apdex to report response time measurements, Apdex-R uses the thresholds T and F to define three Performance Zones, one for each satisfaction level.

Performance Zone
A range of response times within which the user has a common satisfaction level. See Satisfied Zone, Tolerating Zone, Frustrated Zone.

Satisfied Zone
The range of application response times from 0 to T, which reflects the desired level of performance for the application or process being reported. The user is not adversely affected by the response time.
Tolerating Zone
The range of application response times greater than T, up to F, which reflects a level of performance below the desired level, but which can be tolerated. The user is somewhat negatively affected by the response time.
Frustrated Zone
The range of application response times greater than F, which reflects an unsatisfactory level of performance for the application or process being reported. The user is very negatively affected by the response time.

[G 2.5.2] Performance Zones

Apdex defines three Performance Zones, one for each satisfaction level. The performance zone is the union of all Performance Intervals having that satisfaction level:

Satisfied Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Satisfied’.
Tolerating Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Tolerating’.
Frustrated Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Frustrated’.

[R 2.6] How the Index Works

Applying the Apdex method converts a set of response times into a unitless index value in the range 0 to 1. See [G 2.6] for an overview of how the Apdex index works.

[G 2.6] How the Index Works

The Apdex method converts measurements into an index by counting the samples in each performance zone and computing a weighted proportion of satisfied samples. Satisfied samples carry more weight than tolerating samples, which in turn carry more weight than frustrated samples.

The index is a ratio that always lies in the range 0 to 1. An index of zero means that all of the samples were within the frustrated zone. An index of 1 means that all of the samples were within the satisfied zone. An index that is greater than zero and less than one means that the samples were from a mix of the performance zones. The higher the index value, the higher the overall satisfaction level of the measurements.
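The weighted-proportion calculation described above is conventionally written as the satisfied count, plus half the tolerating count, divided by the total count. A minimal sketch, assuming the weights conventionally associated with Apdex (Satisfied = 1, Tolerating = 0.5, Frustrated = 0):

```python
def apdex_score(satisfied, tolerating, frustrated):
    """Weighted proportion of satisfied samples in a report group.

    Uses the conventional Apdex weights (Satisfied = 1,
    Tolerating = 0.5, Frustrated = 0). The result always lies
    in the range 0 to 1.
    """
    total = satisfied + tolerating + frustrated
    if total == 0:
        raise ValueError("a report group must contain at least one sample")
    return (satisfied + tolerating / 2) / total

print(apdex_score(100, 0, 0))   # 1.0  (all samples in the Satisfied Zone)
print(apdex_score(0, 0, 100))   # 0.0  (all samples in the Frustrated Zone)
print(apdex_score(60, 30, 10))  # 0.75 (a mix of zones)
```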

[R 3] Apdex Calculation Inputs

See [G 3] for a general introduction to Apdex calculation inputs.

[G 3] Apdex Calculation Inputs

Apdex does not favor any tool or method; a wide variety of tools and methods may be used to gather the necessary information for an Apdex calculation and report.

This section specifies the minimal requirements governing input to any Apdex calculation. A tool creator may add features and capabilities above these minimal requirements. An addendum for a particular measurement type may also specify domain-specific rules governing the methods and tools used to obtain measurements of that type.

[R 3.1] Measurements: Tasks and Task Chains

See [G 3.1] for general rules regarding measurements.

[R 3.1.1] Task Measurements

Tool creators must identify which applications their tool is capable of interpreting at the Task level. For example, the ability to mark the start and end of a Web page by interpreting the HTTP and HTML protocols is sufficient for Web-based applications. The tool must clearly identify, by protocol (e.g., HTTP/HTML) or by software maker and product name (e.g., Microsoft Exchange), the applications for which it is capable of properly interpreting Tasks and measuring their response times.

Implementation of the Task measurement will vary. However, Task measurements must be a reasonable approximation of response time measurements that could be performed by the user with a stopwatch.

[R 3.1.2] Task Chain Measurements

Because Task Chain Time is defined to be the sum of a set of Task Times, and does not include any Think Time between Tasks, obtaining a Task Chain Time measurement will probably necessitate measuring and accumulating individual Task times. However, the implementation is left to the measurement tool developer, provided that the result is a reasonable approximation of a measurement of Task Chain Time a user could obtain by measuring and summing the component Task Times.

[G 3.1] Measurements

Tool creators must identify the measurement types their tool is capable of analyzing. When a tool analyzes a measurement type that is covered by an addendum, the tool creator must apply the rules specified in that addendum, in addition to those specified in this document.

Implementations of Apdex-based reporting may vary, provided that individual measurements are assigned to a performance zone that is factored into an Apdex calculation and reported as further defined in this specification.

[R 3.2] Report Groups

The report group is a named set of measurement samples that will be input to an Apdex calculation. Tools must provide a way for the technician to specify a report group. See [G 3.2] for general rules regarding the specification of report groups. Tools implementing Apdex-R must also allow the technician to specify the following values for filters:

Measurement Type (R)
If the tool analyzes measurements of more than one type, it must provide a filter that limits a report group to response time data only. The value “R” (signifying “response time data”) may be the default measurement type.
Measurement Subtypes
Task and Task Chain must be distinguished; measurements of Task Time and Task Chain Time may not be combined in one report group.

[G 3.2] Report Groups

The report group is a named set of measurement samples that will be input to an Apdex calculation. Tools must provide a way for the technician to specify a report group.

Report groups can be defined in many ways, but tools must allow the technician to specify values for the following filters:

Measurement Type
Required if the tool analyzes measurements of more than one type. Measurements of different types may not be included in the same report group.
Measurement Subtype
Required if the measurement type is covered by an addendum that defines distinct subtypes, and the tool analyzes measurements of more than one subtype. Measurements of different subtypes may not be included in the same report group.
Application(s) or Process(es)
Required if the tool analyzes measurements of more than one application or process. Measurements of more than one application or process may be included in the same report group if they are subject to the same thresholds.
Usage Segment(s)
Required if the tool analyzes measurements of more than one usage segment. A usage segment is any identifiable subset of the measurements of an application or process, such as a demographic segment, a geographic segment, an organizational segment, etc. Measurements of different usage segments may be included in the same report group if they are subject to the same thresholds.
Time Period(s)
Required. The technician must be able to define the time period(s) for which the index will be calculated.

Filters can be applied alone or in combination with other filters to select a report group from a larger database of measurement data. An addendum governing a particular measurement type may define additional filters to be provided for measurements of that type.
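The filter rules above can be sketched in code. This is an illustrative sketch only: the `Sample` record and its field names are assumptions, not part of the specification, which deliberately leaves data representation to the tool creator.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical sample record; the field names are illustrative,
# not prescribed by the [G 3.2] rules.
@dataclass
class Sample:
    measurement_type: str   # e.g. "R" for response time data
    application: str
    usage_segment: str
    timestamp: datetime
    value: float

def select_report_group(samples, measurement_type, applications=None,
                        segments=None, start=None, end=None):
    """Apply the [G 3.2] filters, alone or in combination."""
    group = []
    for s in samples:
        if s.measurement_type != measurement_type:
            continue  # different types may not share a report group
        if applications is not None and s.application not in applications:
            continue  # applications sharing thresholds may be combined
        if segments is not None and s.usage_segment not in segments:
            continue
        if start is not None and s.timestamp < start:
            continue  # time period filter is always required
        if end is not None and s.timestamp >= end:
            continue
        group.append(s)
    return group
```

Each optional argument left as `None` simply leaves that filter unapplied, which mirrors the rule that filters may be combined as needed.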

[R 3.3] Report Group Size

See [G 3.3] for rules regarding report group size.

[G 3.3] Report Group Size

A set of measurement samples comprising a report group may be unique or may overlap with the measurement samples in other report groups. A report group may be defined as broadly as all of the samples for an application, or as narrowly as a single sample.

For normal reporting, the default minimum sample count for a report group is 100 samples; addenda may specify domain-specific minima. Any report group containing fewer than the specified minimum sample count is considered a small group.

Small groups may not be reliable as performance indicators, but may occur when reporting low traffic periods, or when reporting diagnostic measurements. Special handling is required for reporting small groups; this is specified in section [G 5.2].

A report group must contain at least one sample for an Apdex value to be calculated. Single-sample report groups may be useful for diagnostic purposes. Single-sample Apdex results are also permitted when reporting low traffic periods, but discouraged.

[R 3.4] Thresholds

See [G 3.4] for rules regarding threshold settings. For tools supporting Apdex-R, it is recommended that the default target threshold value, T, be set to 4 seconds.

[G 3.4] Thresholds

Tools must provide a way for the technician to specify the thresholds that define the boundaries of performance intervals to be used when calculating an Apdex index for a report group. A single set of thresholds is applied to all samples in a report group.

Thresholds are decimal values having no more than two significant digits of precision. This means that the following types of values are permitted, for any measurement unit, where a 'unit' refers to the dimension of the measurement type, such as seconds, meters, or kilograms:

0 < |T| < 10
Values greater than 0 units and below 10 units can be defined to a tenth of a unit.
Examples: 0.5, 1.2, 5.8, 9.9
10 ≤ |T| < 100
Values equal to or greater than 10 units but below 100 units can be defined to one unit.
Examples: 10, 19, 56, 85, 99
100 ≤ |T| < 1,000
Values equal to or greater than 100 units but below 1,000 units can be defined to 10 units.
Examples: 100, 190, 560, 850, 990
|T| ≥ 1,000
Values equal to or greater than 1,000 units follow the same two-significant-digit restriction.
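The precision bands above amount to snapping a threshold to a band-dependent step size. A minimal sketch, assuming positive thresholds (the function name and structure are illustrative, not part of the specification):

```python
import math

def quantize_threshold(t: float) -> float:
    """Snap a positive threshold to the precision permitted by [G 3.4]:
    two significant digits, expressed as a band-dependent step size."""
    if t <= 0:
        raise ValueError("threshold must be positive")
    if t < 10:
        step = 0.1      # definable to a tenth of a unit
    elif t < 100:
        step = 1        # definable to one unit
    elif t < 1000:
        step = 10       # definable to 10 units
    else:               # >= 1,000: keep two significant digits
        step = 10 ** (math.floor(math.log10(t)) - 1)
    return round(t / step) * step
```

For example, 5.84 snaps to 5.8, 56.7 to 57, and 432 to 430, matching the examples in the bands above.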

A tool that supports a specific measurement type will ship with default threshold values selected by the tool creator, which technicians can change. These defaults enable the tool to begin supplying information with minimal set-up by the technician. An addendum may specify additional rules governing default threshold values, relationships among threshold values, and the required precision of threshold values, as appropriate for a specific measurement type.

A goal of Apdex is to operate so that different Apdex tools will report similar index values for user experiences having comparable levels of quality (see [G 2.1] item 8). However, tools use different measurement methods, and measure from different vantage points within a system. As a result, data purporting to measure the same quantity can have different values. If such differences are systematic, the technician can adjust the Apdex thresholds used with measurements from different tools so that the Apdex calculation produces comparable index values.

[R 4] Calculating the Index

See [G 4] for a general introduction to the Apdex index calculation.

[G 4] Calculating the Index

Apdex does not entail new measurements; rather, it is a new way to represent an existing set of measurements, reflecting the degree to which those measurements achieve designated targets.

[R 4.1] The Apdex Formula

See [G 4.1] for the Apdex formula, and corresponding input requirements:

    Apdex = (Satisfied_Count + Tolerating_Count / 2) / Total_Samples

The following sections address Apdex-R input requirements:

    Measurement data: [R 3.1]

    Report group: [R 3.2] and [R 3.3]

    Performance zones: [R 2.5], [R 2.5.1], [R 2.5.2], and [R 3.4]

[G 4.1] The Standard Apdex Formula

The Apdex index is calculated as follows. Given:

    Measurement data that meets the requirements of section [G 3.1]

    A report group comprising measurement samples, defined as specified in sections [G 3.2] and [G 3.3]

    Three performance zones (Satisfied, Tolerating, Frustrated), defined as specified in sections [G 2.5], [G 2.5.1], [G 2.5.2], and [G 3.4]

    An allocation process that assigns each sample to a performance zone, and counts all samples, so that:

    Total_Samples is the number of all samples in the report group

    Satisfied_Count is the number of report group samples in the Satisfied Zone

    Tolerating_Count is the number of report group samples in the Tolerating Zone

Then the Apdex index for the report group is:

    Apdex = (Satisfied_Count + Tolerating_Count / 2) / Total_Samples

[R 4.1.1] The Standard Formula in Action

See [G 4.1.1] for a general overview of the Apdex formula in action.

[G 4.1.1] The Standard Formula in Action

Note that measurements in the Frustrated zone are counted in Total_Samples, the denominator of the formula, but do not contribute to the numerator.

Another way to describe the Apdex index is as the weighted proportion of satisfactory samples in the report group. Samples in the Satisfied Zone have weights of 1, those in the Tolerating Zone have weights of ½, and those in the Frustrated Zone have weights of 0.

[ TBD: ] Therefore to achieve the optimal Apdex value of 1.00, all samples must represent the Satisfaction Level of ‘Satisfied’. If any samples reflect Tolerating or Frustrated performance, the index drops below 1.00. For example, if 80% of samples are Satisfied and 10% are Tolerating, while the remaining 10% are Frustrated, the index is 0.85.
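The standard formula and the worked example above can be expressed directly in code; this is a minimal sketch of the calculation defined in [G 4.1], not a prescribed implementation:

```python
def apdex(satisfied_count: int, tolerating_count: int, total_samples: int) -> float:
    """Standard Apdex formula from [G 4.1]: Satisfied samples weigh 1,
    Tolerating samples weigh 1/2, Frustrated samples weigh 0. Frustrated
    samples appear only in the denominator, via total_samples."""
    if total_samples <= 0:
        raise ValueError("a report group needs at least one sample")
    return (satisfied_count + tolerating_count / 2) / total_samples
```

With 80 Satisfied, 10 Tolerating, and 10 Frustrated samples out of 100, this yields (80 + 10/2) / 100 = 0.85, the value given in the example above.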

[R 4.1.2] Alternative Scoring in the Tolerating Zone

See [G 4.1.2] for the rules for optional alternative scoring.

Apdex-R permits tool creators to substitute an alternative scoring function in the tolerating zone, provided that sample values (or value ranges) closest to the satisfied zone receive the greatest weight.

[G 4.1.2] Alternative Scoring in the Tolerating Zone

If the characteristics of a particular measurement and reporting domain justify using graduated weights within the Tolerating zone, an Addendum may substitute an alternative scoring function for the factor Tolerating_Count/2 in the standard formula. Any alternative scoring function (such as a sliding scale) must have the property that sample values (or value ranges) closest to the satisfied zone receive the greatest weight, in accordance with the index objectives (see [G 2.1] item 6).
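One admissible alternative scoring function is a linear sliding scale across the Tolerating zone. The sketch below is an example of a function satisfying the rule above, not a form mandated by any addendum; the thresholds T and F bound the Tolerating zone as in Apdex-R:

```python
def tolerating_weight(value: float, t: float, f: float) -> float:
    """Example sliding scale for the Tolerating zone: the weight falls
    linearly from 1 at the Satisfied boundary (T) to 0 at the Frustrated
    boundary (F), so values closest to the Satisfied zone receive the
    greatest weight, as [G 4.1.2] requires."""
    if not t < f:
        raise ValueError("T must be less than F")
    if value <= t:
        return 1.0
    if value >= f:
        return 0.0
    return (f - value) / (f - t)
```

Summing these weights over the Tolerating samples would replace the fixed factor Tolerating_Count/2 in the standard formula.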

[R 4.2] Dealing with Exceptions

See [G 4.2] for general rules for dealing with exceptions. The following additional notes may apply to measurements of Task (and Task Chain) times.

User aborts: A user abort occurs when a user enters a new inquiry before the system has responded to the original inquiry. A user-generated abort stops the timing of the Task; therefore, user aborts can fall into any of the Satisfied, Tolerating, or Frustrated zones.

Server failure: If a tool can detect a clear server-generated abort, it is handled differently: server aborts (e.g., TCP closes within a Task) are counted as Frustrated samples regardless of the Task time measurement.

Application failure: Some tools may have the optional capability to interpret the application at a greater level of detail than the minimal Task boundary. For example, they may be able to detect user-relevant information at the application-logic layer. If the tool can detect Task errors, then these application errors (e.g., Web page 404 replies) are counted as Frustrated samples.

[G 4.2] Dealing with Exceptions

When a Report Group contains samples marked as errors or exceptions, tools performing the Apdex calculation should classify those measurements as follows:

User Error: Exceptions caused by user actions may be classified as Satisfied, Tolerating, or Frustrated in the same way as any other measurement, if the necessary field(s) are present in the sample. If the field(s) required to perform classification are absent, user-generated exceptions should be classified as Satisfied.

Warning: Exceptions indicating abnormal system or process behavior may be classified as Tolerating if the system or process returned immediately to normal operation without requiring abnormal intervention.

Failure: Exceptions indicating a system or process failure that is experienced by the user should be classified as Frustrated.

Measurement Error: Exceptions indicating measurement errors should be discarded from the Report Group.

Unknown: Exceptions not amenable to classification should be discarded from the Report Group. However, tool creators must attempt to minimize the number of error types assigned to this category.

Because all exceptions are specific to a particular measurement domain, an addendum may specify domain-specific refinements to these general guidelines.
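The classification rules of [G 4.2] can be summarized as a lookup. The category names below are taken from the rules above; the function structure and the `DISCARD`/`by_fields` markers are illustrative assumptions:

```python
DISCARD = "discard"  # sample leaves the report group entirely

def classify_exception(category: str, has_classification_fields: bool = False):
    """Map a [G 4.2] exception category to its zone handling."""
    if category == "user_error":
        # classified like any other measurement when the fields are present;
        # otherwise classified as Satisfied
        return "by_fields" if has_classification_fields else "satisfied"
    if category == "warning":
        return "tolerating"   # abnormal behavior, normal operation resumed
    if category == "failure":
        return "frustrated"   # failure experienced by the user
    # measurement errors and unclassifiable exceptions are discarded
    return DISCARD
```

An addendum could refine this mapping for a particular measurement domain, as the paragraph above notes.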

[R 5] Reporting the Index

In this section, the terms Report(s) and Reporting refer to any representation of an Apdex index, whether in print, on a computer screen, or elsewhere.

[G 5] Reporting the Index

In this section, the terms Report(s) and Reporting refer to any representation of an Apdex index, whether in print, on a computer screen, or elsewhere.

[R 5.1] Reporting the Apdex Value

See [G 5.1] for general rules about reporting the Apdex value. Apdex-R adds the following rule for reporting performance zones:

All Apdex-R values are calculated using two thresholds, T and F, which must be reported in association with the corresponding Apdex score for a report group. The value of T must always be reported. If the specification of the report group applied a default rule for the value of F (see [R 2.5.1]), the value of F may be omitted.

[G 5.1] Reporting the Apdex Value

Reports must adhere to the following rules:

  1. The index is a decimal value between 0 and 1, rounded to a precision of two decimal places. Values equal to or greater than 0.995 round to 1.00.
  2. Unless its value is 1.00, the index is always reported with a zero in the ones place, followed by a decimal point, followed by no more than two decimal places.
  3. Apdex values must be identified as such. When several Apdex values are presented in tabular form, a single Apdex identifier can identify a group of values in a row, column, or table.
  4. All Apdex values are calculated based on specified performance zones. The performance zone specification must be clearly displayed in association with the Apdex score.
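Rules 1 and 2 above amount to a small formatting function. A sketch (the explicit 0.995 check is needed because ordinary float rounding does not guarantee that boundary):

```python
def format_apdex(value: float) -> str:
    """Render an Apdex value per [G 5.1] rules 1 and 2: two decimal
    places, a zero in the ones place, and values of 0.995 or greater
    rounding up to 1.00."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("Apdex values lie between 0 and 1")
    if value >= 0.995:
        return "1.00"   # rule 1: 0.995 and above round to 1.00
    return f"{value:.2f}"   # rule 2: leading zero, two decimal places
```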

[R 5.1.1] Describing General Cases

See [G 5.1.1] for rules about describing unspecified Apdex thresholds.

[G 5.1.1] Describing General Cases

In general discussions of Apdex values, references to unspecified performance thresholds are written using the notation “[T]”, as shown in the following examples.

“Everyone should understand that 0.90 [T] is a better value than 0.80 [T].”
“Apdex scores in the range 0.85 to 0.93 [T] are rated Good.”

Sections [G 5.3] and [G 5.4] also use this notation when referring to threshold reporting.

[R 5.2] Uniform Output File (Mandatory)

To support the exchange of index values between Apdex analysis and reporting tools, all Apdex analysis tools must support the Apdex Uniform Output format specified in [G 5.2]. In Apdex-R, the following additional rules apply:

[R 5.2.1] Bracketed Output Format (Mandatory)

Tools implementing Apdex-R must be capable of reporting Apdex-R values using the bracketed format described in Table 2. When the value of F is set using a default rule (see [R 2.5.1]), elements 5 and 6 in Table 2 may be omitted, so that only the value of the threshold T accompanies the reported Apdex value.

Element Number | Definition | Type | Content
1 | Apdex index value | Number | Decimal in range [0.00, 1.00]
2 | Space | Literal | (a single space)
3 | Left bracket designating start of threshold(s) | Literal | [
4 | Tolerating threshold, T | Number | Decimal
5 | Comma | Literal | ,
6 | Frustrated threshold, F | Number | Decimal
7 | Right bracket designating end of threshold(s) | Literal | ]
8 | Small Group Indicator | Literal | * or NS
Table 2. Bracketed Output Format

Examples of bracketed output format are: 0.85 [5.5], 1.00 [8.0]*, 0.90 [4.0], and 0.77 [450].
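Assembling the bracketed format from Table 2 is straightforward; the sketch below covers the value, thresholds, and small-group asterisk (the function name and argument defaults are illustrative assumptions):

```python
def bracketed_output(apdex_value: float, t, f=None, small_group: bool = False) -> str:
    """Assemble the Table 2 bracketed format. F may be omitted (passed as
    None) when it was set by the default rule of [R 2.5.1]; a small-group
    result carries a trailing asterisk."""
    thresholds = str(t) if f is None else f"{t},{f}"
    suffix = "*" if small_group else ""
    return f"{apdex_value:.2f} [{thresholds}]{suffix}"
```

This reproduces outputs of the shape shown above, such as 0.85 [5.5] and 1.00 [8.0]*.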

[R 5.2.2] Subscripted Output Format (Optional)

When the value of F is set using a default rule (see [R 2.5.1]), tools may display Apdex values in which the value of T is shown as a mathematical subscript (rendered here with an underscore). For example, an Apdex value of 0.75 that is based upon a T value of 4 seconds is shown as 0.75_4.

When T has a decimal component, as defined in section [G 3.4], then the exact value of T must be shown (example: 0.75_4.5). To increase readability, tools may drop the decimal portion of T if it is zero (example: 0.75_4.0 becomes 0.75_4).

[G 5.2] Uniform Output File (Mandatory)

Separate tools may be used to calculate Apdex index values from measurement data and to report Apdex index values. To support the exchange of index values between Apdex analysis and reporting tools, all Apdex analysis tools must support the Apdex Uniform Output format. A tool shall display, print, and export Apdex-G values to an ASCII file having a comma-separated values (CSV) format. For this purpose, Apdex-G incorporates the IETF specification of CSV published in RFC 4180 (see section [G 6] References).

A Uniform Output File is composed of a uniform output header record, followed by one or more uniform output data records. The header record contains the same number of comma-separated fields as all data records in the file. Fields in the header record contain fixed literal values, or names that correspond to each field in the data records. The content of the uniform output header and data records is described in Tables 2 and 3 below. In Table 2, each element except the last is followed by a comma.

Element Number | Definition | Header Record (Type: Content) | Data Record (Type: Content)
1 | Apdex data record identifier | Literal: Apdex Header | Literal: Apdex
Report Group Metadata Section
2 | Apdex Report Group section identifier | Literal: ARG | Literal: ARG
3 | Report Group Name | Literal: Report Group Name | Defined by tool or user
4 | Report Group Description | Literal: Description | Text String: Defined by tool or user. See note 1 below
5 | Measurement Type | Literal: Type | Name: G, or addendum type (e.g. R for Apdex-R)
6 | Measurement Subtype | Literal: Subtype | Name: Values specified in an addendum for this measurement type
7 | Application | Literal: Application | Name: Defined by tool or user
8 | User Group | Literal: User Group | Name: Defined by tool or user
9 | Time Period Start | Literal: Start Time | Timestamp: ISO 8601 [YYYY][MM][DD]T[hh][mm][ss]Z
10 | Time Period End | Literal: End Time | Timestamp: ISO 8601 [YYYY][MM][DD]T[hh][mm][ss]Z
Input Data Summary Section
11 | Apdex Data Summary section identifier | Literal: ADS | Literal: ADS
12 | Sample Count | Literal: Total Samples | Number: Integer
13 | Satisfied Zone Count | Literal: Satisfied Count | Number: Integer
14 | Tolerating Zone Count | Literal: Tolerating Count | Number: Integer
15 | Frustrated Zone Count | Literal: Frustrated Count | Number: Integer
16 | Earliest Sample Timestamp | Literal: First Sample | Timestamp: ISO 8601 [YYYY][MM][DD]T[hh][mm][ss]Z
17 | Latest Sample Timestamp | Literal: Last Sample | Timestamp: ISO 8601 [YYYY][MM][DD]T[hh][mm][ss]Z
Apdex Index Section
18 | Apdex IndeX section identifier | Literal: AIX | Literal: AIX
19 | Apdex index value | Literal: Apdex Index | Number: Decimal in range [0.00, 1.00]
20 | Satisfied Zone Identifier | Literal: S | Literal: S
21 | Satisfied Threshold(s) Group | Group: Performance Interval Names | Group: Interval Group, see Table 3
22 | Tolerating Zone Identifier | Literal: T | Literal: T
23 | Tolerating Threshold(s) Group | Group: Performance Interval Names | Group: Interval Group, see Table 3
24 | Frustrated Zone Identifier | Literal: F | Literal: F
25 | Frustrated Threshold(s) Group | Group: Performance Interval Names | Group: Interval Group, see Table 3
26 | Small Group Indicator | Literal: SGI | Literal: * or NS
Table 2. Layout of a Uniform Output Data Record

Note 1: To avoid ambiguity, tool creators must ensure that Report Group description fields do not contain commas, because commas separate fields in the Uniform Output format.

Performance Interval Names

If a tool has obtained user-specified names for performance intervals, those names should be included in the corresponding fields of the header record (elements 21, 23, and 25 in Table 2). If user-specified names are not available, distinct default names must be supplied for each performance interval field in the header record. See also section [G 2.5.1] Thresholds and Performance Intervals.

Performance Interval Groups

In Table 2, each Interval Group (elements 21, 23, and 25) is composed of one or more Performance Interval elements, separated by commas. Table 3 shows the structure of a single Performance Interval element; the components shown in Table 3 are therefore not separated by commas.

Component | Definition | Header Record (Type: Content) | Data Record (Type: Content)
1 | Opening bracket | Name: Performance Interval Name | Literal: ( or [
2 | Lower threshold | | Number: Decimal number, see [G 3.4]
3 | Colon | | Literal: : (see note 2 below)
4 | Upper threshold | | Number: Decimal number, see [G 3.4]
5 | Closing bracket | | Literal: ) or ]
Table 3. Components of a Performance Interval within a Uniform Output Data Record

Note 2: Interval Notation, described in References [G 6.8], uses a comma to separate the upper and lower bounds of an interval. To avoid ambiguity, Apdex substitutes a colon, because commas separate fields in the Uniform Output format.
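Emitting one Performance Interval component per Table 3, with the colon substitution of note 2, can be sketched as follows (the function name and the use of `None` for an unbounded upper limit are illustrative assumptions):

```python
def performance_interval(lower, upper, lower_closed: bool, upper_closed: bool) -> str:
    """Emit one Table 3 interval component: a square bracket marks a
    closed bound, a parenthesis an open bound, a colon (not a comma)
    separates the bounds, and an unbounded upper limit is written INF."""
    open_bracket = "[" if lower_closed else "("
    close_bracket = "]" if upper_closed else ")"
    upper_text = "INF" if upper is None else str(upper)
    return f"{open_bracket}{lower}:{upper_text}{close_bracket}"
```

This produces interval strings of the shape seen in the uniform output examples below, such as [0:4], (4:16], and (16:INF).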

Optional Sections

The Uniform Output File format comprises three sections, each having its own identifier (fields 2, 11, and 18). The Report Group Metadata Section and the Apdex Index Section are required, because they contain the necessary data for reporting Apdex scores. The Input Data Summary Section is optional, but tool creators are encouraged to produce this data at the analysis stage, and to pass it to Reporting tools. Reporting tools can use this data to provide additional context for Apdex scores. If the Input Data Summary Section is omitted, fields 11 through 17 must be omitted from the header and data records.

Examples of the uniform output are:

Apdex Header,ARG,,,,,,,,,AIX,Apdex Index,S,Green,T,Yellow,F,Red,SGI CRLF user labels
Apdex,ARG,,,,,,,,,AIX,0.72,S,[0:4],T,(4:16],F,(16:INF) CRLF
Apdex,ARG,,,,,,,,,AIX,0.85,S,[0:6],T,(6:20],F,(20:INF),* CRLF
Apdex,ARG,,,,,,,,,AIX,0.58,S,[0:3],T,(3:10],F,(10:INF) CRLF
Apdex,ARG,,,,,,,,,AIX,0.91,S,[0:10],T,(10:30],F,(30:INF) CRLF

Apdex Header,ARG,,,,,,,,,AIX,Apdex Index,S,PI3,T,PI2,PI4,F,PI1,PI5,SGI CRLF generic labels
Apdex,ARG,,,,,,,,,AIX,0.88,S,(10:12],T,(6:10],(12:16],F,[0:6],(16:INF) CRLF
Apdex,ARG,,,,,,,,,AIX,0.96,S,(9:15],T,(6:9],(15:18],F,[0:6],(18:INF) CRLF

[ More details and examples will be added here ]

[R 5.3] Indicating Sample Size

See [G 5.3] for rules about reporting the sample size of an Apdex report group.

[G 5.3] Indicating Sample Size

Apdex values are calculated based upon a set of measurements (samples) in the report group. If there is only a small number of samples, the tool must still present a result. However, a result for such a small report group must be clearly marked.

A small report group is defined as one having fewer than 100 samples. An addendum may modify this definition to be appropriate for a particular measurement domain. Apdex tools will clearly indicate that the result is based upon one of the following scenarios:

No Samples
The Apdex calculation could not be performed because there were no samples (NS) within the report group. Where the calculated Apdex value would normally appear, the tool will show an output of NS [T], where [T] is the normal threshold display (see section [G 5.1.1]).
Small Group
When an Apdex value is the output of a small group calculation, an asterisk (*) must be appended to that value, for example: 0.84* [T], where [T] is the normal threshold display (see section [G 5.1.1]).
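The two scenarios above reduce to a simple display rule applied after the index is formatted. A sketch, assuming the default 100-sample boundary (an addendum may change it):

```python
MINIMUM_SAMPLES = 100  # default small-group boundary from [G 3.3]

def display_value(apdex_text: str, sample_count: int) -> str:
    """Apply the [G 5.3] markings to an already formatted Apdex value;
    the threshold display [T] is appended separately per [G 5.1.1]."""
    if sample_count == 0:
        return "NS"              # no samples: no index can be calculated
    if sample_count < MINIMUM_SAMPLES:
        return apdex_text + "*"  # small group: append an asterisk
    return apdex_text
```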

[R 5.4] Apdex Quality Ratings

See [G 5.4] for rules about assigning and reporting Apdex quality ratings.

In Apdex-R, when the value of F is set using a default rule (see [R 2.5.1]), the target threshold T may be shown as a subscript (rendered here with an underscore), as illustrated in Table 3. The table shows examples where T is 4 seconds.

Apdex Value Range | Rating | Color Indication
0.94_4 to 1.00_4 | Excellent_4 | Determined by creator (with a 4 plus a color indication)
0.85_4 to 0.93_4 | Good_4 | Determined by creator (with a 4 plus a color indication)
0.70_4 to 0.84_4 | Fair_4 | Determined by creator (with a 4 plus a color indication)
0.50_4 to 0.69_4 | Poor_4 | Determined by creator (with a 4 plus a color indication)
0.00_4 to 0.49_4 | Unacceptable_4 or UNAX_4 | Determined by creator (with a 4 plus a color indication)
Low Sample Cases
NS_4 | NoSample_4 | Determined by creator (with a 4 plus an NS inside the color indication)
0.85_4* | Good_4* | Determined by creator (with a 4 plus an * inside the color indication)
Table 3. Apdex Qualitative Reporting Rules (examples where T = 4; a value such as 0.94_4 denotes 0.94 with the subscript 4)

Note: In the current specification, the 'Apdex Value Range' column of Table 3 contains a typo in the 'NoSample' row, which reads '0.NS_4'. That cell should read 'NS_4' (as shown above).

[G 5.4] Apdex Quality Ratings

Some tool creators may wish to assign quality ratings to Apdex value ranges, and to present those ratings graphically. This is an optional feature, but, if implemented, it must follow these guidelines.

Two alternative representations are permitted for quality ratings: a rating word or a color indication. Table 6 below lists the value ranges to be used when assigning a rating to an Apdex value. The table shows examples for the target threshold [T], where [T] is the normal threshold display as described in section [G 5.1.1].

Colors may be selected by the creator for consistency with other products, or based on user-supplied preferences. However, a legend must clearly indicate which color represents each Apdex rating.

Apdex Value Range | Rating Word | Color Indication
0.94 to 1.00 [T] | Excellent [T] | Determined by creator (with [T] plus a color indication)
0.85 to 0.93 [T] | Good [T] | Determined by creator (with [T] plus a color indication)
0.70 to 0.84 [T] | Fair [T] | Determined by creator (with [T] plus a color indication)
0.50 to 0.69 [T] | Poor [T] | Determined by creator (with [T] plus a color indication)
0.00 to 0.49 [T] | Unacceptable [T] or UNAX [T] | Determined by creator (with [T] plus a color indication)
Low Sample Cases
NS [T] | NoSample [T] | Determined by creator (with [T] plus an NS inside the color indication)
0.85* [T] | Good* [T] | Determined by creator (with [T] plus an * inside the color indication)
Table 6. Rules for Apdex Quality Ratings

Note: An addendum may specify alternative colors [TBD: and/or value ranges?] to be applied when reporting Apdex scores for a particular measurement domain.
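The value ranges in Table 6 map directly to a rating lookup. A sketch, applied to the already rounded index value (the list-of-floors structure is an illustrative choice, not part of the specification):

```python
# Table 6 value ranges: each entry is (lower bound, rating word).
RATINGS = [
    (0.94, "Excellent"),
    (0.85, "Good"),
    (0.70, "Fair"),
    (0.50, "Poor"),
    (0.00, "Unacceptable"),
]

def quality_rating(value: float) -> str:
    """Return the Table 6 rating word for a rounded Apdex value."""
    for floor_value, word in RATINGS:
        if value >= floor_value:
            return word
    raise ValueError("Apdex values cannot be negative")
```

A tool would pair the returned word with its chosen color indication and the [T] threshold display, per the rules above.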
