Apdex-G Section [2] Index Overview

I’m writing a series of posts about Generalizing Apdex. This is #16. To minimize confusion, section numbers in the current spec are accompanied by the section symbol, like this: §1. The corresponding section numbers in the generalized spec, Apdex-G, are enclosed in square brackets, like this: [1].

I have been working systematically through the process of generalizing the current Apdex spec. Along the way, I have been skipping over parts of the current spec to work on the paragraphs that presented challenges. I am now going back to fill in those less contentious paragraphs. Over the next week, I plan to post updated drafts of each section of the Apdex-G spec. I began yesterday, with Section [1] Introduction; today I continue with Section [2] Index Overview.


Summary

In this section of Apdex-G, I have generalized and reorganized the material in Apdex section §2 as follows:

  • The introduction contains a brief description of the reasons for reducing a set of measurements to an index.
  • Section [2.1] corresponds to Apdex section §2.1. I have generalized the language, and made the last two goals (comparability among tools and applications) more realistic. This section was originally discussed in Core Apdex Qualities.
  • Section [2.2] corresponds to Apdex section §2.2. I have added definitions for some terms used but not defined in the current Apdex spec, and introduced Measurement Domain and Tool Creator.
  • Section [2.3] replaces all the material currently found in Apdex section §2.3, which will be reviewed later for inclusion in Apdex-R.
  • Section [2.4] replaces all the material currently found in Apdex section §2.4, which will be reviewed later for inclusion in Apdex-R.
  • Section [2.5] presents the most significant changes in section [2]. The difference is illustrated in Figure 1 below. First, section [2.5] defines three Satisfaction Levels. Next, section [2.5.1] defines Thresholds as the boundaries of Performance Intervals, which are ranges of values within the measurement domain that share a common satisfaction level. Finally, section [2.5.2] defines Performance Zones as a collection (union) of one or more performance intervals. The arguments for these proposals were presented in Configurable Zone Alignment in Apdex-G and Generalizing the Apdex Thresholds. The extensive discussion of threshold relationships at the end of section §2.5 will be reviewed later for inclusion in Apdex-R.
  • Section [2.6] corresponds to Apdex section §2.6.

Figure 1. Conceptual Models of Apdex and Apdex-G

Because this material includes some drafts posted previously, some of which I have edited, everything in this series of posts is considered the second draft of Apdex-G. After the second draft is posted, I will begin work on Apdex-R, the addendum addressing measurements of response times. Apdex-R will cover all the domain-specific content of the current spec that has been excised from Apdex-G.

Apdex, current spec:

Apdex-G, second draft:

§2. Index Overview
There are many aspects of performance relating to the delivery and management of information technology. One critical performance factor is the responsiveness of the human-computer interface from the perspective of the human user. This responsiveness is a core quality of a transactional application. The speed by which an application reacts to the needs of the user directly affects the user’s productivity and satisfaction with the application experience. It is a critical metric of application performance that has direct implications for business revenue, customer retention, and user productivity.

Therefore, measuring and tracking application response time is important to an enterprise that values the opinion and productivity of its users’ experiences. However, measuring is a necessary but insufficient step in proper management of the user experience. Meaningful reporting of the measurements is equally important. The Application Performance Index (“Apdex”) defines a methodology for reporting the responsiveness of human-application transactions, in terms of its effect on user productivity.

[2] Index Overview

Performance is a critical metric of application or process quality that has direct implications for business revenue, customer retention, and user productivity. Users of any application or process are directly affected by its performance. An enterprise that values the opinion of its customers, or the productivity of its internal users, must measure and track the performance of its applications and processes.

Collecting measurements is only one step in the proper management of user experience. Managers must set quality objectives, and track whether those objectives are being met. Such tracking requires that large numbers of individual performance measurements be summarized for easy review. To address this requirement, the Apdex standard defines a method for reporting the degree to which a set of performance measurements meet designated targets.

§2.1 Index Objectives

The fundamental objective of Apdex is to simplify the reporting of application response time measurements by making it possible to represent any such measurement using a common metric. Response time data can describe a wide range of targets, and its magnitude can vary widely. The index is designed to normalize for this variability of time (wide range of seconds) and measurement requirements (many distinct targets), producing a single metric that always has the same meaning. The goals of this metric are:

  • To provide a useful summary of an application’s responsiveness
  • To make it easy to understand the significance of values produced by the index
  • To work for all transactional applications
  • To operate within a fixed range (0-to-1) and be unit-neutral
  • To indicate application performance directly so that 0 is the worst performance and 1 is the best performance
  • To operate in such a way that specific values of the index (e.g., 0.5) report the same user experience across any application, user group, or enterprise
  • To operate in such a way that equivalent user experiences observed by different measurement and reporting tools will report the same value

The purpose of this document is to provide a standard by which many vendors of measurement and reporting tools can achieve these seven objectives.

[2.1] Index Objectives

The fundamental objective of Apdex is to simplify the reporting of any application or process measurements by making it possible to represent a set of such measurements using a common metric. Measurements can describe many applications or processes, and their magnitudes can vary widely. The Apdex metric normalizes this variability, producing values that lie within a fixed range, for all measurement input.

The goals of the Apdex metric are:

  1. To be easy to understand
  2. To be a summary of the quality or performance of an application or process, based on a set of measurements
  3. To be a performance indicator for any application or process that is subject to performance targets
  4. To be dimensionless, irrespective of the measurement domain being reported
  5. To lie within a fixed range of values, irrespective of the measurement data being summarized
  6. To operate so that index values of 1 and 0 indicate the best and worst performance respectively
  7. To operate so that a particular value of the index indicates a comparable level of quality for different applications or processes
  8. To operate so that different Apdex tools will report similar index values for user experiences having comparable levels of quality

This document defines a standard by which creators of measurement and reporting tools can achieve these objectives.

§2.2 Terms in This Document

This document makes a clear distinction between the user of an application that is being measured and the user of the tool that is performing the measurement and generating index values. The latter is called the technician. Other unique terms are defined in the glossary at the end of this document.

[2.2] Terminology

The following terms are defined:

Measurement Domain
The set of all possible values from which the input to an Apdex calculation is drawn. Unless limited by a domain-specific Addendum, this will be the real number domain (-∞, ∞).
Apdex Tool
A system that calculates Apdex values, or reports Apdex values, or does both. The calculation and reporting functions may be separated between an Apdex analysis tool and an Apdex reporting tool. Apdex tools may employ software or hardware components, or both.
Tool Creator
The developer(s) of an Apdex tool. In practice, a tool creator may refer to a tool vendor, an open-source community, or an individual developer.
Technician
The person who controls an Apdex tool, or who reads and interprets Apdex values. Typically, a technician is an enterprise staff member responsible for application or process quality.
User or Customer
A person served by an application or process that is the subject of an Apdex index.

Other Apdex terms are defined in context within this document and its addenda. Section [7] Glossary collects together all Apdex terminology.

§2.3 User-Application Interaction Model

[ Domain specific content about users’ perceptions of application responsiveness ]

[2.3] Measurement Types

This document specifies domain-independent rules that apply to all measurement types. An addendum may define a class of measurements within a specific measurement domain, and specify supplemental rules that apply only to that class of measurements. The addendum type (see [1.1]) may also be referred to as the Measurement Type of that class of measurements.

§2.4 Processes, Tasks and Task Chains

[ Domain specific content about components of application responsiveness ]

[2.4] Measurement Subtypes

This document specifies domain-independent rules that apply to all measurement types. An addendum that addresses a particular measurement type may also identify subtypes of that measurement type, and specify supplemental rules that apply to each of those subtypes.

§2.5 How Users Interpret Application Response Times.

Users have a finite set of reactions or views by which they characterize application response time. Each such group of time durations is called a performance zone. Performance zones are defined by two thresholds – times where the zone begins and ends. Apdex defines three such performance zones:

Satisfied
Response times that are fast enough to satisfy the user, who is therefore able to concentrate fully on the work at hand, with minimal negative impact on his/her thought process.
Tolerating
Responses in the tolerating zone are longer than those of the satisfied zone, exceeding the threshold at which the user notices how long it takes to interact with the system, and potentially impairing the user’s productivity. Application responses in this zone are less than ideal but don’t by themselves threaten the usability of the application.
Frustrated
As response times increase, at some threshold the user becomes unhappy with slow performance, entering the frustrated zone. With response times in the frustrated zone, a casual user is likely to abandon a course of action and a production user is likely to cancel a Task.

[2.5] Satisfaction Levels

Apdex works by assigning a satisfaction level to each measurement, as defined below:

Satisfaction Level
One of three categories (Satisfied, Tolerating, or Frustrated) that reflects how an organization using Apdex evaluates the performance of the application or process being reported. For the Apdex method to be applicable to a Measurement Domain, it must be possible to assign a Satisfaction Level to any value within the Measurement Domain.
Quality Level
An alternative term for Satisfaction Level.
Satisfied
A Satisfaction Level that reflects the desired level of performance for the application or process being reported. Measurements classified as Satisfied usually indicate that the application or process being measured is meeting its targets and needs little or no attention.
Tolerating
A Satisfaction Level that reflects a level of performance below the desired level, but which can be tolerated. Measurements classified as Tolerating may signal the need for caution or greater attention to the application or process being measured.
Frustrated
A Satisfaction Level that reflects an unsatisfactory level of performance for the application or process being reported. Measurements classified as Frustrated may be associated with undesirable outcomes, and lead to remedial or abnormal actions.

[TBD, include these examples of abnormal actions? For example, a customer may abandon an unsatisfactory transaction, an executive may assign more resources to a critical business process that is not meeting its assigned targets, or a supervisor may halt a manufacturing process whose product is outside the specified tolerance.]

[ Section §2.5 (continued): ]
The three zones are defined by two thresholds: T and F, in seconds, as follows.

  • Satisfied Zone = zero to T.
  • Tolerating Zone = Greater than T to F.
  • Frustrated Zone = Greater than F

The value of F is four times the value T. For example, if users perceive response time as tolerable beginning at 4 seconds then they will be frustrated at greater than 16 seconds. The research that supports this model is described in Reference 1. Further background information is available in References 2-4.
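The zone rules above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the spec: the function name `classify` is mine, and it assumes the current spec's rule that F is derived as 4 × T.

```python
def classify(response_time, t):
    """Classify a response time (seconds) into an Apdex performance zone,
    assuming the current spec's rule that F = 4 * T."""
    f = 4 * t  # frustrated threshold derived from T
    if response_time <= t:
        return "satisfied"   # zero to T
    elif response_time <= f:
        return "tolerating"  # greater than T, up to F
    else:
        return "frustrated"  # greater than F
```

For example, with T = 4 seconds, a 10-second response falls in the tolerating zone and a 17-second response in the frustrated zone.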

Most research into application response times has focused on user attitudes and behaviors when exposed to differing Task times. Since Processes and Task Chains are simply sequences of Tasks, this Apdex specification assumes that applicable research conclusions obtained for Tasks can be generalized to apply to Task Chains also. Therefore Apdex considers that the above discussion of performance zones and thresholds applies either to a single Task or to a Task Chain (a sequence of related Tasks).

Because the set of Task times comprising a Task Chain time are likely either to be independent (because environmental conditions affecting one Task have no effect on other tasks in a chain), or to be positively correlated (because Tasks share a common system environment) it is reasonable to generalize the conclusions obtained for single Task times to Task Chain times. Such a generalization would not be valid if the Task times comprising a Task Chain tended to be negatively correlated, because in that case (for example) user frustration with one task could be offset by obtaining rapid response time from another task in the chain. Users accustomed to such offsetting behavior might well expect smaller levels of variance in total Task Chain times, and thus develop thresholds for Task Chains that were lower than the sum of the individual Task thresholds, and which did not adhere to the rule that F is four times T. In this situation, the proposed Apdex mathematics would still work, but the resulting Apdex score might well overestimate the users’ perceived quality of a Task Chain. At present, Apdex considers that such negatively correlated Task times are unlikely to occur in practice.

[2.5.1] Thresholds and Performance Intervals

A performance interval is a range of values within the measurement domain that share a common satisfaction level. Thresholds define the boundaries of performance intervals:

Threshold
One of N distinct values within the Measurement Domain, which partition the Measurement Domain into N+1 contiguous non-empty Performance Intervals in such a way that measurement values within a Performance Interval have identical Satisfaction Levels. Each Threshold lies within one and only one Performance Interval. Unless otherwise specified by a domain-specific Addendum, a Threshold lies within the Performance Interval for which it acts as the upper bound.
Threshold Name
Thresholds can have user-assigned names. Tools may also assign default generic names (T1, T2, …) to thresholds based on their order, from low to high. An Addendum may specify domain-specific names and orderings.
Performance Interval
A partition of the Measurement Domain defined by Thresholds. All measurement values within a Performance Interval have the same Satisfaction Level, which is therefore considered to be an attribute of the Performance Interval. There must be at least three Performance Intervals, one for each of the three Satisfaction Levels. When the context is unambiguous, the term ‘Performance Interval’ may be shortened to Interval.
Performance Interval Name
Performance Intervals can have user-assigned names. Tools may also assign default generic names (such as PI1, PI2, …) or names signifying the Satisfaction Level of the Interval (such as PIS, PIT1, PIT2, PIF1, …). Default names should be assigned to Intervals based on their order, from low to high. An Addendum may specify domain-specific names and orderings.
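The partitioning rule in the Threshold definition has a compact computational reading. In this hypothetical sketch (names and helper are mine, not from the spec), N sorted thresholds split the measurement domain into N + 1 intervals, and the convention that a threshold belongs to the interval it bounds from above maps directly onto a left bisection:

```python
import bisect

def interval_index(value, thresholds):
    """Return the index (0..N) of the Performance Interval containing `value`.
    `thresholds` is a sorted list of N distinct values that partition the
    Measurement Domain into N+1 contiguous intervals. A value equal to a
    threshold lands in the interval that threshold bounds from above."""
    return bisect.bisect_left(thresholds, value)
```

With thresholds [4, 16], the value 4 falls in interval 0 (it is interval 0's upper bound), 5 in interval 1, and 17 in interval 2.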

[2.5.2] Performance Zones

Apdex defines three Performance Zones, one for each satisfaction level. The performance zone is the union of all Performance Intervals having that satisfaction level:

Satisfied Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Satisfied’.
Tolerating Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Tolerating’.
Frustrated Zone
The union of all Performance Intervals that have the Satisfaction Level of ‘Frustrated’.
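Putting intervals and zones together: if each interval carries a satisfaction level, a zone is just the set of intervals sharing that level, so zones need not be contiguous. The following sketch is illustrative only; the function name and the example threshold/level configuration are hypothetical, chosen to show a Satisfied Zone that is the union of two separate intervals:

```python
import bisect

def satisfaction_level(value, thresholds, levels):
    """Map a measurement to its Satisfaction Level.
    `thresholds` holds N sorted boundary values; `levels` holds N+1 level
    names, one per Performance Interval. A Performance Zone is the union of
    all intervals sharing a level, so a zone may be non-contiguous."""
    return levels[bisect.bisect_left(thresholds, value)]

# Hypothetical configuration: both low and mid-range values satisfy,
# so the Satisfied Zone is the union of intervals 0 and 2.
thresholds = [10, 20, 40]
levels = ["satisfied", "tolerating", "satisfied", "frustrated"]
```

Here a measurement of 5 and a measurement of 30 both fall in the Satisfied Zone, even though the values 10–20 between them are only Tolerating.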

§2.6 How the Index Works

The Apdex method converts response time (seconds) into an index (unitless) by counting the number of samples in each performance zone relative to all samples. The result is a ratio that is therefore within the range of 0-to-1. An index of zero means that all of the samples were within the frustrated zone. An index of 1 means that all of the samples were within the satisfied zone. An index that is greater than zero and less than one means that the samples were from a mix of the performance zones. The higher the index value, the more satisfied the user population was for that group of application performance response samples.

[2.6] How the Index Works

The Apdex method converts measurements into an index by counting the samples in each performance zone and computing a weighted proportion of satisfied samples. Satisfied samples carry more weight than tolerating samples, which in turn carry more weight than frustrated samples.

The index is a ratio that always lies in the range 0 to 1. An index of zero means that all of the samples were within the frustrated zone. An index of 1 means that all of the samples were within the satisfied zone. An index that is greater than zero and less than one means that the samples were from a mix of the performance zones. The higher the index value, the higher the overall satisfaction level of the measurements.
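As a concrete illustration of the weighting described above, here is a minimal Python sketch of the classic response-time calculation, assuming the standard weights of 1 for satisfied, 0.5 for tolerating, and 0 for frustrated samples, and the current spec's F = 4 × T rule. The function name is mine:

```python
def apdex(samples, t):
    """Compute a classic response-time Apdex index for a list of samples
    (seconds). Satisfied samples count fully, tolerating samples count
    half, frustrated samples count zero. Assumes F = 4 * T."""
    f = 4 * t
    satisfied = sum(1 for s in samples if s <= t)
    tolerating = sum(1 for s in samples if t < s <= f)
    return (satisfied + tolerating / 2) / len(samples)
```

For example, with T = 4, the samples [1, 5, 20] yield (1 + 0.5) / 3 = 0.5: one satisfied, one tolerating, one frustrated.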

Open Issues and Public Review

I am undecided on some text formatting questions. As a result, this draft deliberately illustrates different possibilities, rather than adopting a single consistent style throughout:

  • I am unsure whether to capitalize and/or bold Apdex terminology (a) when it is first introduced, and (b) when it is referenced later. In section [2.5] above I experimented with using bold face for terms of art that are referenced within definitions.
  • I am considering using glossary-style definitions when introducing terms within the body of the document. This approach, illustrated in [2.2] and [2.5] above, allows definitions in the glossary to be made identical to their counterparts in the earlier sections of the spec.

As usual, all these proposals are open for public discussion. Please use the comment form below to contribute any comments, suggestions, or questions.
