APDEX

Apdex-R Section [1] Introduction and [2] Overview

I’m writing a series of posts about Generalizing Apdex. This is #21. To minimize confusion, section numbers in the current spec are accompanied by the section symbol, like this: §1. The corresponding section numbers in the generalized spec documents, Apdex-G and Apdex-R, are enclosed in square brackets and prefixed by the document type, like this: [G 1] and [R 1].

Drafts of the Apdex-G spec were posted as Section [1] Introduction, Section [2] Index Overview, Section [3] Calculation Inputs, Section [4] Calculating the Index, Section [5] Reporting, and Section [6] References.

This post presents the first draft of Apdex-R, Sections 1 and 2. Apdex-R specifies the Apdex rules for reporting response-time measurements, so my first draft draws heavily on the present Apdex spec. My goal is to reformat the present spec’s rules within the context established by Apdex-G. To allow both documents to be viewed together, I show Apdex-G (draft #2, plus a few subsequent amendments) on the left, and Apdex-R (draft #1) on the right.

Differences between Apdex-R and the corresponding sections of the present spec are highlighted, as follows:

  • text deleted from the present spec
  • new text added to Apdex-R (plus a few additions to Apdex-G draft #2)
  • new text created for Apdex-G, now being repeated in Apdex-R for readability
  • notes, questions, issues to be resolved

While major deletions, additions, and changes to the present spec are marked, many minor wording changes to the present spec’s text, to improve continuity and clarity of meaning, are not marked. I believe those changes do not substantially alter the meaning and intent of the present spec.

Apdex-R, first draft:

Apdex-G, second draft:

[R 1] Introduction

Apdex (Application Performance Index) is a metric that reflects the degree to which a set of performance measurements achieves designated targets.

The Apdex Alliance is a group of companies collaborating to promote an application performance metric called Application Performance Index (Apdex). Apdex is a numerical measure of user satisfaction with the performance of enterprise applications, intended to reflect the effectiveness of IT investments in contributing to business objectives. The Apdex metric may be used by any organization seeking insight into their IT investments. This specification defines Apdex, which is a method for calculating and reporting a metric of transactional application response time in the form of an index with a value of 0 to 1.

This specification defines Apdex-R, the Apdex index of customer satisfaction with transactional application response times.

[G 1] Introduction

Apdex (Application Performance Index) is a metric that reflects the degree to which a set of performance measurements achieves designated targets.

Originally designed as an index of user satisfaction with the responsiveness of enterprise applications, Apdex can also be used to report on many other quality measures, such as the degree to which a product conforms to technical standards, or the degree to which a service achieves business objectives.

[R 1.1] Apdex Specification Documents

A master specification document, Apdex-G, specifies rules for calculating and reporting the Apdex index. The suffix ‘G’ denotes generic, indicating that the master document specifies domain-independent rules.

This document, Apdex-R, is a supplemental document, termed an Addendum. The suffix ‘R’ is the Addendum Type or Document Type. An addendum contains rules that apply only to applications of Apdex within a specific measurement domain.

Note that Apdex-R contains supplemental rules for applying Apdex to transactional application response times; except where specifically stated, an addendum does not replace any provision of Apdex-G.

References to sections or paragraphs of Apdex documents are distinguished by the inclusion of the document type with the paragraph number, for example [G 1.1] or [R 1.1].

[G 1.1] Apdex Specification Documents

This document, Apdex-G, specifies rules for calculating and reporting the Apdex index.

The suffix ‘G’ denotes generic, indicating that this document specifies domain-independent rules. A supplemental document, termed an addendum, may contain rules that apply only to applications of Apdex within a specific measurement domain. An addendum is named Apdex-X, where the suffix ‘X’ is the Addendum Type or Document Type, and denotes a specific measurement domain. For example, the addendum Apdex-R contains supplemental rules for applying Apdex to response times.

References to sections or paragraphs of Apdex documents are distinguished by the inclusion of the document type with the paragraph number, for example [G 1.1] or [R 1.1].

[R 1.2] Document Status

This document was developed by the Apdex Alliance, a group of companies and individuals collaborating to promote Apdex. It was ratified by …

[ Ratification history/status of Apdex-R will be inserted here ]

Apdex specifications may be updated from time to time to reflect feedback based on practical experience with Apdex. Current versions of all Apdex specifications are available from the Alliance web site, www.apdex.org.

[G 1.2] Document Status

This document was developed by the Apdex Alliance, a group of companies and individuals collaborating to promote Apdex. It was ratified by …

[ Ratification history/status of Apdex-G will be inserted here ].

Apdex specifications may be updated from time to time to reflect feedback based on practical experience with Apdex. Current versions of all Apdex specifications are available from the Alliance web site, www.apdex.org.

[R 1.3] Broader Apdex Goals

See [G 1.3] for broader Apdex goals that fall outside the scope of this specification.

The Apdex Alliance plans to develop and maintain additional documents and educational material to help enterprises understand the Apdex metric and put it to productive use within their organizations. The Apdex Technical Guide will provide detailed information on defining Apdex parameters within an enterprise. The most recent information about the Alliance and Apdex documents will be made available at the Alliance web site, www.apdex.org.

Members of the Alliance have made a commitment to implement tools or services that adhere to this specification. The Alliance is also committed to supporting an ongoing process of inquiry into the relationship between application responsiveness and user satisfaction.

[G 1.3] Broader Apdex Goals

The Apdex Alliance has adopted the following statements of intent:

    Tools: Members of the Alliance will implement tools and/or services that adhere to this specification.
    Research: Members of the Alliance support an ongoing process of inquiry into the relationship between application or process performance and customer satisfaction.
    Education: Members of the Alliance will develop and maintain additional documents and educational material to help enterprises understand the Apdex metric and put it to productive use within their organizations.
    Documentation: Members of the Alliance will contribute to an Apdex Technical Guide, which will provide detailed information about implementing Apdex.

These statements reflect goals that fall outside the scope of this specification. Current information about Apdex Alliance goals and activities can be found at the Alliance web site, www.apdex.org.

[R 2] Index Overview

See [G 2] for a general statement of the requirement for a performance index; this section focuses on the requirement to report response times using an index.

Many aspects of performance affect the delivery and management of information technology. One critical performance factor is the responsiveness of the human-computer interface from the perspective of the human user. This responsiveness is a core quality of a transactional application. The speed with which an application reacts to the needs of the user directly affects the user’s productivity and satisfaction with the application experience. It is a critical metric of application performance that has direct implications for business revenue, customer retention, and user productivity.

Therefore, measuring, reporting, and tracking application response time is important to any enterprise that values the opinions and productivity of its users. However, measurement alone is a necessary but insufficient step in proper management of the user experience. Meaningful reporting of the measurements is equally important. The Application Performance Index (“Apdex”) defines a methodology for reporting the responsiveness of human-application transactions in terms of its effect on user satisfaction and productivity.

[G 2] Index Overview

Performance is a critical metric of application or process quality that has direct implications for business revenue, customer retention, and user productivity. Users of any application or process are directly affected by its performance. An enterprise that values the opinion of its customers, or the productivity of its internal users, must measure and track the performance of its applications and processes.

Collecting measurements is only one step in the proper management of user experience. Managers must set quality objectives, and track whether those objectives are being met. Such tracking requires that large numbers of individual performance measurements be summarized for easy review. To address this requirement, the Apdex standard defines a method for reporting the degree to which a set of performance measurements meet designated targets.

[R 2.1] Index Objectives

See [G 2.1] for the goals of the Apdex metric.


The fundamental objective of Apdex is to simplify the reporting of application response time measurements by making it possible to represent any such measurement using a common metric. Response time data can describe a wide range of targets, and its magnitude can vary widely. The index is designed to normalize for this variability of time (wide range of seconds) and measurement requirements (many distinct targets), producing a single metric that always has the same meaning. The goals of this metric are:

  • To provide a useful summary of an application’s responsiveness
  • To make it easy to understand the significance of values produced by the index
  • To work for all transactional applications
  • To operate within a fixed range (0-to-1) and be unit-neutral
  • To indicate application performance directly so that 0 is the worst performance and 1 is the best performance
  • To operate in such a way that specific values of the index (e.g., 0.5) report the same user experience across any application, user group, or enterprise
  • To operate in such a way that equivalent user experiences observed by different measurement and reporting tools will report the same value

The purpose of this document is to provide a standard by which many vendors of measurement and reporting tools can achieve these seven objectives.

[G 2.1] Index Objectives

The fundamental objective of Apdex is to simplify the reporting of any application or process measurements by making it possible to represent a set of such measurements using a common metric. Measurements can describe many applications or processes, and their magnitudes can vary widely. The Apdex metric normalizes this variability, producing values that lie within a fixed range for all measurement inputs.

The goals of the Apdex metric are:

  1. To be easy to understand
  2. To be a summary of the quality or performance of an application or process, based on a set of measurements
  3. To be a performance indicator for any application or process that is subject to performance targets
  4. To be dimensionless, irrespective of the measurement domain being reported
  5. To lie within a fixed range of values, irrespective of the measurement data being summarized
  6. To operate so that index values of 1 and 0 indicate the best and worst performance respectively
  7. To operate so that a particular value of the index indicates a comparable level of quality for different applications or processes
  8. To operate so that different Apdex tools will report similar index values for user experiences having comparable levels of quality

This document defines a standard by which creators of measurement and reporting tools can achieve these objectives.

[R 2.2] Terminology

This document makes a clear distinction between the user of an application that is being measured and the user of the tool that is performing the measurement and generating index values. The latter is called the technician. Other unique terms are defined in the glossary at the end of this document.

See [G 2.2] and [G 7] for Apdex terminology. The following terms are also defined within Apdex-R:

Session
General Definition: The period of time that a user is “connected” to an application; a continuous period of time during which the user is interacting with the application.
HTML/HTTP Equivalent: No specific HTML or HTTP markers define a session. One proxy is an IP address connection (open-to-close); another is the assignment of a session identification cookie followed by a period of inactivity.

Process
General Definition: A multi-step series of user interactions (Tasks, see below), together with the corresponding user Think Times, that may (buy a book, look up an address, get a stock quote) or may not (no clear end) be well defined. Often called the “transaction” or “application use case.”
HTML/HTTP Equivalent: A series of web pages that allow a user to perform a defined application function, together with the user interaction times (“Think Times”) required to interact with those pages.

Task
General Definition: Each user interaction with the application during a Process. Task time is measured from the moment the user enters an application query, command, or function that requires a server response, to the moment the user receives a response sufficient to proceed with the application. Often called the “user wait time” or “application response time.”
HTML/HTTP Equivalent: A page download, consisting of a container and zero or more component objects, and the time required to perform that download, measured from the user interface action until the page download completes and the page is available for use.

Task Chain
General Definition: A defined sequence of Tasks. The Tasks comprising a Task Chain may or may not correspond to a complete Process (as defined above). Furthermore, the time of a Task Chain is the sum of the Task times only, and does not include any user Think Time that may occur between Tasks.
HTML/HTTP Equivalent: A defined sequence of web pages, and their associated download times, excluding any Think Times between user interface actions.

Turn
General Definition: Each client-server software interaction needed to generate a user response or Task (see above). These software-level client-server interactions add to the time it takes for the software to complete a Task; the user does not see Turns operating. A Turn is a request-driven client-server round trip. Often called application “chattiness.”
HTML/HTTP Equivalent: HTTP GETs for parts of a web page or frame. In HTML, each object, such as a container (index.html), a component (image.gif), or a stand-alone document (program.zip), generates a Turn.

Protocol
General Definition: Turns are further deconstructed into the transport protocol events required to reliably move information among computers, including DNS look-ups, TCP opens, TCP transmissions, TCP ACKs, retransmissions, and so on.
HTML/HTTP Equivalent: TCP is the protocol supporting HTTP. HTML/HTTP do not generate additional events at this layer of the transactional model.

Packet
General Definition: The packet is the smallest unit of interaction and transmission between the user’s client and the application server. Packets are routed and transported by networks from source to destination.
HTML/HTTP Equivalent: HTML/HTTP do not generate additional events at this layer of the transactional model.

Table 1 – Taxonomy of Transactional Computer Applications

[G 2.2] Terminology

The following terms are defined:

Measurement Domain
The set of all possible values from which the input to an Apdex calculation is drawn. Unless limited by a domain-specific Addendum, this will be the real number domain (-∞, ∞).
Apdex Tool
A system that calculates Apdex values, or reports Apdex values, or does both. The calculation and reporting functions may be separated between an Apdex analysis tool and an Apdex reporting tool. Apdex tools may employ software or hardware components, or both.
Tool Creator
The developer(s) of an Apdex tool. In practice, a tool creator may refer to a tool vendor, an open-source community, or an individual developer.
Technician
The person who controls an Apdex tool, or who reads and interprets Apdex values. Typically, a technician is an enterprise staff member responsible for application or process quality.
User or Customer
A person served by an application or process that is the subject of an Apdex index.

Other Apdex terms are defined in context within this document and its addenda. Section [G 7] Glossary collects together all Apdex terminology.

[R 2.3] Measurement Types: User-Application Interaction Model

See [G 2.3] for an introduction to Measurement Types in Apdex; this section focuses on response time measurements.

In order for the index to properly report on users’ perceptions of application responsiveness, the measurements used as input must capture the actual time that a user waits for the application to respond. Transactional applications are characterized by a query-response interaction process. A user types, clicks, or makes some entry, followed by a period of time during which the system (network, computers, databases, etc.) handles the entry and generates a response. This period is typically called “user wait time.” The critical factor regarding the duration of this time is that the user is typically incapable of, and often actively prevented from, proceeding with the application until the response is delivered. The user will then read, react, or in some way think about the response, which leads him or her to make the next entry. This period of time, in which the system is waiting for the user to make the next entry, is typically called “user think time.” This enter-wait-think cycle repeats many times during the user’s session with the application.

The user’s perception of the application’s responsiveness or speed is formed by an accumulated number of views regarding the lengths of wait times or response times. Many components of a client-server interaction make up response time. Transactional applications have complex behaviors that operate at many levels. Section [R 2.2] defines a taxonomy comprising seven layers of interaction, and illustrates that taxonomy by identifying the layers of a web-based application built using HTTP/HTML. Transactional applications based upon other protocols or proprietary schema follow the same taxonomy and may therefore be measured and reported by tools using Apdex-R.

[G 2.3] Measurement Types

This document specifies domain-independent rules that apply to all measurement types. An addendum may define a class of measurements within a specific measurement domain, and specify supplemental rules that apply only to that class of measurements. The addendum type (see [G 1.1]) may also be referred to as the Document Type, or the Measurement Type of that class of measurements.

[R 2.4] Measurement Subtypes: Processes, Tasks and Task Chains

Within the taxonomy described in Section [R 2.2], application response time is defined to be the time that elapses between a user completing a Task entry and the computer system responding with sufficient information to allow the user to proceed to the next Task. Furthermore, human users are most aware of the responsiveness of an application delivery system at its Task and Process layers, because these layers usually correspond to two distinct and readily identifiable aspects of a user’s interaction with a computer system, as follows:

Task Time
An instance of application response time a human user experiences while the computer system is performing one operation (or one step) of an application.
Process Time
The total time a user experiences while performing the sequence of related Tasks necessary to complete a single application function, comprising the Task Times plus the user Think Times between them. This is the actual elapsed time a user experiences while completing a Process.
Task Chain Time
The sum of the Task Times of a defined series of Tasks. Task Chain Time differs from Process time in two ways: (a) the total time does not include user think time, and (b) the set of Tasks does not necessarily correspond to a complete Process.
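The distinction between these subtypes can be sketched in code. This is an illustrative model only, not part of the specification; the `Task` class and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Task:
    """One user interaction: a Task Time plus any Think Time that follows."""
    task_time: float   # seconds the user waits for the system to respond
    think_time: float  # seconds the user spends before making the next entry

def task_chain_time(tasks):
    """Task Chain Time: the sum of the Task Times only, excluding Think Times."""
    return sum(t.task_time for t in tasks)

def process_time(tasks):
    """Process Time: the Task Times plus the user Think Times between them."""
    return sum(t.task_time + t.think_time for t in tasks)

# A three-step Process, e.g. log in, search, check out
steps = [Task(1.0, 5.0), Task(0.5, 3.0), Task(2.5, 0.0)]
print(task_chain_time(steps))  # 4.0 seconds of waiting
print(process_time(steps))     # 12.0 seconds elapsed, thinking included
```

The difference matters because the two subtypes answer different questions: Task Chain Time isolates system responsiveness, while Process Time reflects the user's total elapsed experience.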

In Section [R 2.5.1], this specification defines the Apdex method for computing and reporting measurements of Task Time and Task Chain Time.

TBD: When reporting response-time measurements, is Apdex-R mandatory, or an alternative to Apdex-G? Is the answer the same for Tasks, Task Chains, and other response times?

[G 2.4] Measurement Subtypes

This document specifies domain-independent rules that apply to all measurement types. An addendum that addresses a particular measurement type may also identify subtypes of that measurement type, and specify supplemental rules that apply to each of those subtypes.

[R 2.5] Satisfaction Levels: How Users Experience Application Response Times

Users have a finite set of reactions or views by which they characterize application response time. Each group of time durations that produces a common reaction is called a performance zone. Performance zones are bounded by thresholds, the times where a zone begins and ends. Apdex defines three such performance zones.

Apdex satisfaction levels are defined in [G 2.5]. For interactive applications, the three levels of user satisfaction with responsiveness are further defined as follows:

Satisfied
The response time is fast enough to satisfy the user, who is therefore able to concentrate fully on the work at hand, with minimal negative impact on his/her thought process.
Tolerating
Responses in the Tolerating Zone are longer than those of the Satisfied Zone, exceeding the threshold at which the user notices how long it takes to interact with the system. The response time is slower than desired, potentially impairing the user’s productivity, but not slow enough to make the system or application unusable; responses in this zone are less than ideal but do not by themselves threaten the usability of the application.
Frustrated
As response times increase, at some threshold the user becomes unhappy with slow performance, becoming frustrated. A casual user is likely to abandon a course of action and a production user is likely to cancel a task or process.

[G 2.5] Satisfaction Levels

Apdex works by assigning a satisfaction level to each measurement, as defined below:

Satisfaction Level
One of three categories (Satisfied, Tolerating, or Frustrated) that reflects how an organization using Apdex evaluates the performance of the application or process being reported. For the Apdex method to be applicable to a Measurement Domain, it must be possible to assign a Satisfaction Level to any value within the Measurement Domain.
Quality Level
An alternative term for Satisfaction Level.
Satisfied
A Satisfaction Level that reflects the desired level of performance for the application or process being reported. Measurements classified as Satisfied usually indicate that the application or process being measured is meeting its targets and needs little or no attention.
Tolerating
A Satisfaction Level that reflects a level of performance below the desired level, but which can be tolerated. Measurements classified as Tolerating may signal the need for caution or greater attention to the application or process being measured.
Frustrated
A Satisfaction Level that reflects an unsatisfactory level of performance for the application or process being reported. Measurements classified as Frustrated may be associated with undesirable outcomes, and lead to remedial or abnormal actions.

[TBD, include these examples of abnormal actions? For example, a customer may abandon an unsatisfactory transaction, an executive may assign more resources to a critical business process that is not meeting its assigned targets, or a supervisor may halt a manufacturing process whose product is outside the specified tolerance.]

[R 2.5.1] Thresholds and Performance Intervals

See [G 2.5.1] for general definitions of Apdex Thresholds and Performance Intervals. When using Apdex to report response time measurements, Apdex-R simplifies the general method by eliminating the need to specify Performance Intervals, and by specifying just one or two Thresholds, as follows:

Threshold
A response time value that defines a boundary between two adjacent performance zones. Apdex-R defines two thresholds, the tolerating threshold (T) and the frustrated threshold (F).
T
The Tolerating Threshold. The value T is contained in the Satisfied Zone, and is the boundary between the Satisfied Zone and the Tolerating Zone.
F
The Frustrated Threshold. The value F is contained in the Tolerating Zone, and is the boundary between the Tolerating Zone and the Frustrated Zone. The default value of F is four times the value of T. When this default is adopted, only the value of T has to be specified.

Applying The Default F=4xT

In Apdex-R, the default value of F is four times the value T. This means, for example, that if users perceive response time as tolerable beginning at 4 seconds then they will be frustrated at response times slower than 16 seconds. However, tool creators are expected to support separate specification of T and F, and users of Apdex-R are encouraged to review the applicability of the default rule in their own reporting environment.
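As a concrete sketch of these rules, the following function assigns a satisfaction level to a single response time, applying the default F = 4xT when F is not given explicitly. The function name and level labels are illustrative, not part of the specification. Note that, per [R 2.5.1], a sample exactly equal to T is Satisfied and one exactly equal to F is Tolerating, because each threshold is contained in the lower zone.

```python
def satisfaction_level(response_time, t, f=None):
    """Classify one response time (seconds) into an Apdex-R performance zone.

    t is the Tolerating Threshold; f is the Frustrated Threshold, defaulting
    to 4 * t per the Apdex-R default relationship.
    """
    if f is None:
        f = 4 * t                # default rule F = 4 x T
    if response_time <= t:       # T itself lies in the Satisfied Zone
        return "Satisfied"
    if response_time <= f:       # F itself lies in the Tolerating Zone
        return "Tolerating"
    return "Frustrated"

print(satisfaction_level(4.0, t=4.0))   # Satisfied
print(satisfaction_level(16.0, t=4.0))  # Tolerating
print(satisfaction_level(16.1, t=4.0))  # Frustrated
```

Passing an explicit `f` overrides the default, as tool creators are expected to allow.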

Basis for The Default F=4xT

Research supporting the default assertion F=4xT is described in Reference [G 6.1]. Further background information is available in References [G 6.2], [G 6.3], and [G 6.4] and on the Apdex web site www.apdex.org.

Most research into application response times has focused on user attitudes and behaviors when exposed to differing Task times. Since Processes and Task Chains are simply sequences of Tasks, this Apdex specification assumes that applicable research conclusions obtained for Tasks can be generalized to apply to Task Chains also.

Because the set of Task times comprising a Task Chain time are likely either to be independent (because environmental conditions affecting one Task have no effect on other tasks in a chain), or to be positively correlated (because some or all of the Tasks share a common system environment) it is reasonable to generalize the conclusions obtained for single Task times to Task Chain times.

Such a generalization would not be valid if the Task times comprising a Task Chain tended to be negatively correlated, because in that case (for example) user frustration with one task could be offset by obtaining rapid response time from another task in the chain. Users accustomed to such offsetting behavior might well expect smaller levels of variance in total Task Chain times, and thus develop thresholds for Task Chains that were lower than the sum of the individual Task thresholds, and which did not adhere to the rule that F is four times T. In this situation, the proposed Apdex mathematics would still work, but the resulting Apdex score could overestimate the users’ perceived quality of a Task Chain.

At present, Apdex considers such negatively correlated Task times unlikely. Therefore the default relationship F=4xT may be employed when reporting measurements of a single Task or of a Task Chain.

[G 2.5.1] Thresholds and Performance Intervals

A performance interval is a range of values within the measurement domain that share a common satisfaction level. Thresholds define the boundaries of performance intervals:

Threshold
One of N distinct values within the Measurement Domain, which partition the Measurement Domain into N+1 contiguous non-empty Performance Intervals in such a way that measurement values within a Performance Interval have identical Satisfaction Levels. Each Threshold lies within one and only one Performance Interval. Unless otherwise specified by a domain-specific Addendum, a Threshold lies within the Performance Interval for which it acts as the upper bound.
Threshold Name
Thresholds can have user-assigned names. Tools may also assign default generic names (T1, T2, …) to thresholds based on their order, from low to high. An Addendum may specify domain-specific names and orderings.
Performance Interval
A partition of the Measurement Domain defined by Thresholds. All measurement values within a Performance Interval have the same Satisfaction Level, which is therefore considered to be an attribute of the Performance Interval. There must be at least three Performance Intervals, one for each of the three Satisfaction Levels. When the context is unambiguous, the term ‘Performance Interval’ may be shortened to Interval.
Performance Interval Name
Performance Intervals can have user-assigned names. Tools may also assign default generic names (such as PI1, PI2, …) or names signifying the Satisfaction Level of the Interval (such as PIS, PIT1, PIT2, PIF1, …). Default names should be assigned to Intervals based on their order, from low to high. An Addendum may specify domain-specific names and orderings.
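In the general case, N sorted thresholds partition the measurement domain into N+1 intervals, with each threshold (by default) belonging to the interval for which it is the upper bound. A minimal sketch of that lookup, using Python’s bisect module; the names and example values here are illustrative, not part of the specification:

```python
import bisect

def interval_index(x, thresholds):
    """Index (0..N) of the Performance Interval containing measurement x.

    thresholds is a sorted list of N distinct values. Because each threshold
    lies in the interval for which it is the upper bound, x == thresholds[i]
    maps to interval i, which is exactly what bisect_left computes.
    """
    return bisect.bisect_left(thresholds, x)

# Two thresholds partition the domain into three intervals,
# one per Satisfaction Level in this simple case.
thresholds = [4.0, 16.0]
levels = ["Satisfied", "Tolerating", "Frustrated"]
for x in (4.0, 16.0, 16.1):
    print(x, levels[interval_index(x, thresholds)])
```

With more than three intervals, several intervals would map onto the same Satisfaction Level, which is how the zones of [G 2.5.2] arise as unions of intervals.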

[R 2.5.2] Performance Zones

See [G 2.5.2] for general definitions of Apdex Performance Zones. When using Apdex to report response time measurements, Apdex-R uses the thresholds T and F to define three Performance Zones, one for each satisfaction level.

Performance Zone
A range of response times (between two time values) within which a user’s perception of the application’s responsiveness does not change; that is, a response time range within which users share a common satisfaction level. See Satisfied Zone, Tolerating Zone, Frustrated Zone.

Satisfied Zone
The range of application response times from 0 to T, which reflects the desired level of performance for the application or process being reported. The user is not adversely affected by the response time.
Tolerating Zone
The range of application response times greater than T, up to F, which reflects a level of performance below the desired level, but which can be tolerated. The user is somewhat negatively affected by the response time.
Frustrated Zone
Application response times greater than F, which reflect an unsatisfactory level of performance for the application or process being reported. The user is very negatively affected by the response time.

[G 2.5.2] Performance Zones

Apdex defines three Performance Zones, one for each satisfaction level. The performance zone is the union of all Performance Intervals having that satisfaction level:

Satisfied Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Satisfied’.
Tolerating Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Tolerating’.
Frustrated Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Frustrated’.

[R 2.6] How the Index Works

Applying the Apdex method converts a set of response times into a unitless index value in the range 0 to 1. See [G 2.6] for an overview of how the Apdex index works.

The Apdex method converts response time (seconds) into an index (unitless) by counting the number of samples in each performance zone relative to all samples. The result is a ratio that is therefore within the range of 0-to-1. An index of zero means that all of the samples were within the frustrated zone. An index of 1 means that all of the samples were within the satisfied zone. An index that is greater than zero and less than one means that the samples were from a mix of the performance zones. The higher the index value, the more satisfied the user population was for that group of application performance response samples.

[G 2.6] How the Index Works

The Apdex method converts measurements into an index by counting the samples in each performance zone and computing a weighted proportion of satisfied samples. Satisfied samples carry more weight than tolerating samples, which in turn carry more weight than frustrated samples.

The index is a ratio that always lies in the range 0 to 1. An index of zero means that all of the samples were within the frustrated zone. An index of 1 means that all of the samples were within the satisfied zone. An index that is greater than zero and less than one means that the samples were from a mix of the performance zones. The higher the index value, the higher the overall satisfaction level of the measurements.
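The calculation itself is specified in Section [G 4] (not reproduced in this post), but the weighting described above corresponds to the widely published Apdex formula, in which Satisfied samples count fully, Tolerating samples count half, and Frustrated samples count zero. A sketch, assuming those standard weights and the Apdex-R thresholds:

```python
def apdex(samples, t, f=None):
    """Apdex index for a set of response-time samples (seconds).

    t is the Tolerating Threshold; f is the Frustrated Threshold,
    defaulting to the Apdex-R rule f = 4 * t. Satisfied samples weigh 1,
    Tolerating samples 1/2, Frustrated samples 0, so the result always
    lies in the range 0 to 1.
    """
    if f is None:
        f = 4 * t
    satisfied = sum(1 for s in samples if s <= t)
    tolerating = sum(1 for s in samples if t < s <= f)
    return (satisfied + tolerating / 2) / len(samples)

# Ten samples against T = 4 s: seven Satisfied, two Tolerating, one Frustrated
times = [0.5, 1.1, 2.0, 2.2, 3.0, 3.5, 4.0, 5.0, 12.0, 30.0]
print(apdex(times, t=4.0))  # 0.8 = (7 + 2/2) / 10
```

An all-Satisfied sample set yields 1, an all-Frustrated set yields 0, and mixed sets fall in between, matching the behavior described above.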
