I’m writing a series of posts about Generalizing Apdex. This is #5.
“The question is,” said Alice, “whether you can make words mean so many different things.”
“The question is,” said Humpty Dumpty, “which is to be master — that’s all.”
— Through the Looking Glass, Chapter VI, Lewis Carroll
Specifications need to be precise. Therefore, to create a more general version of the Apdex spec–and before that, to discuss what the spec should contain–we need to define our terminology precisely. This is especially vital in the “Looking Glass” world of information technology, where people like to make words mean so many different things. This post is my attempt to master the meanings of the words I need to use, to make them mean just what I choose them to mean — neither more nor less.
The solution, of course, is to create a glossary. The Apdex spec already contains one, in Section 7, but since we’re working on generalizing the spec, we need to generalize the glossary too. At this stage in the process, I can’t predict all the additional terms we’re going to need, but I can already see some obvious omissions in the current glossary. So I’m going to post an extensible glossary here, and update it whenever necessary. That way, by the time we complete the process of rewriting the spec, we’ll have the new glossary for that spec already written.
To seed this process, my first version of this extensible glossary appears below. It comprises the terms defined in Section 7 of the current Apdex spec, together with many placeholders for additional terms that I foresee we may need to define and use while discussing and writing the new generalized version. For example, my recent study of Performance Indicators and KPIs has shown me that we need precise meanings for sample, measure, metric, statistic, index, performance indicator, and KPI. So those terms are all in the glossary now, waiting to be defined (except “sample,” which came from the current spec).
With this approach, terms can be added, or deleted, as necessary. Definitions can be added, or refined, as we proceed. I’ve started with an assortment of terms from the fields of measurement, data analysis, and statistics; some I know we’re going to need, others may turn out to be superfluous. Some may not be needed in the spec but could be useful in a potential companion document on Apdex Usage Guidelines.
- Addendum
- A supplemental Apdex specification containing rules that apply only to applications of the Apdex method within a specific domain.
- Addendum Name
- An addendum is named Apdex-X, where X is a suffix appropriate to a particular domain. Examples might be Apdex-R for response time, Apdex-V for VOIP quality, and so on.
- Confidence Interval
- Descriptive Statistics
- [Apdex-R] The threshold which defines the boundary between tolerating and frustrated performance zones.
- Frustrated
- A Satisfaction Level that reflects an unsatisfactory level of performance for the application or process being reported. Measurements classified as Frustrated may be associated with undesirable outcomes, and lead to remedial or abnormal actions.
- Frustrated Zone
- The union of all Performance Intervals that have the Satisfaction Level of ‘Frustrated’.
- Geometric Mean
- Log-normal Distribution
- Measure (noun)
- Measurement Domain
- The set of all possible values from which the input to an Apdex calculation is drawn. Unless limited by a domain-specific Addendum, this will be the real number domain (-∞, ∞).
- Normal Distribution
- Performance Indicator
- Performance Interval
- A partition of the Measurement Domain defined by Thresholds. All measurement values within a Performance Interval have the same Satisfaction Level, which is therefore considered to be an attribute of the Performance Interval. There must be at least three Performance Intervals, one for each of the three Satisfaction Levels. When the context is unambiguous, the term ‘Performance Interval’ may be shortened to Interval.
- Performance Interval Name
- Performance Intervals have user-assigned names. Within Apdex-G, references to Performance Intervals may use generic names (such as PI1, PI2, … or, in subscript form, PI₁, PI₂, …) or names signifying the Satisfaction Level of the Interval (such as PIS, PIT1, PIT2, PIF1, … or PIs, PIt1, PIt2, PIf1 …). Within Apdex-G, generic names are assigned to Intervals based on their order, from low to high. An Addendum may use domain-specific names and orderings.
- Performance Zone
- The Satisfied Zone, the Tolerating Zone, or the Frustrated Zone.
- Process
- [Apdex-R] A sequence of related Tasks necessary to complete a single application function.
- Process Time
- [Apdex-R] The time needed to complete the single application function defined by the process. This time is the sum of the task times for all tasks necessary to complete the process and the sum of all think times for the user between tasks while he/she determines the next step.
- Quality Level
- An alternative term for Satisfaction Level.
- Report Group
- The parameters that define a set of measurement samples that are used in an Apdex calculation. These parameters are meaningful to the user and typically relate to the way the user’s enterprise is managing performance.
- Response Time
- [Apdex-R] The elapsed time beginning when a user completes a Task entry and ending when the system responds with all the information needed in order for the user to proceed to the next Task.
- Sample
- [Apdex-R] One distinct user Task or Task Chain time measurement.
- Satisfaction Level
- One of three categories (Satisfied, Tolerating, or Frustrated) that reflects how an organization using Apdex evaluates the performance of the application or process being reported. For the Apdex method to be applicable to a Measurement Domain, it must be possible to assign a Satisfaction Level to any value within the Measurement Domain.
- Satisfied
- A Satisfaction Level that reflects the desired level of performance for the application or process being reported. Measurements classified as Satisfied usually indicate that the application or process being measured is meeting its targets and needs little or no attention.
- Satisfied Zone
- The union of all Performance Intervals that have the Satisfaction Level of ‘Satisfied’.
- Standard Deviation
- Standard Error
- Task
- [Apdex-R] Each user-application interface interaction that requires a user entry and an application response.
- [Apdex-R] The target threshold which defines the boundary between satisfied and tolerating performance zones.
- Task Chain Time
- [Apdex-R] The sum of the Task Times a user experiences while performing the sequence of related Tasks necessary to complete a single application function. Task Chain Time does not include Think Time.
- Task Time
- [Apdex-R] An instance of application response time that a human user experiences while the computer system is performing one operation (or one step) of an application.
- Technician
- The person who controls the measurement and reporting tool. This person also reads and interprets the index. Typically, this is a member of an enterprise application, data center, or network staff.
- Think Time
- [Apdex-R] The time interval between Tasks, during which a user may, for example, be reviewing the previous system response, thinking, or entering data.
- Threshold
- One of N distinct values, all of which lie within the Measurement Domain, and which partition the Measurement Domain into N+1 contiguous non-empty Performance Intervals in such a way that measurement values within a Performance Interval have identical Satisfaction Levels. Each Threshold lies within one and only one Performance Interval. Unless otherwise specified by a domain-specific Addendum, a Threshold lies within the Performance Interval for which it acts as the upper bound.
- Threshold Name
- Thresholds have user-assigned names. Within Apdex-G, references to Thresholds use generic names (T1, T2, … or, in subscript form, T₁, T₂, …). Within Apdex-G, names are assigned to Thresholds based on their order, from low to high. An Addendum may use domain-specific names and orderings.
- Tolerating
- A Satisfaction Level that reflects a level of performance below the desired level, but which can be tolerated. Measurements classified as Tolerating may signal the need for caution or greater attention to the application or process being measured.
- Tolerating Zone
- The union of all Performance Intervals that have the Satisfaction Level of ‘Tolerating’.
- Tool
- The measurement and reporting system (devices, software, etc.) that generates Apdex values.
- User
- The human user of an enterprise transactional application. The human may be accessing the application through “client” software in a client-server application architecture or he/she may be using a much simpler termination device.
- Zone
- See Performance Zone, Satisfied Zone, Tolerating Zone, Frustrated Zone.
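To make the Threshold and Performance Interval definitions above more concrete, here is a minimal Python sketch of how a measurement might be mapped to a Satisfaction Level. It is only an illustration, not part of any Apdex spec: the threshold values (4.0 and 16.0), the three-interval layout, and the function names `interval_index` and `satisfaction_level` are all assumptions made for the example. It relies on the default rule that a Threshold lies within the Performance Interval for which it acts as the upper bound.

```python
from bisect import bisect_left

# Hypothetical example values: N = 2 Thresholds, ordered from low to high,
# partition the Measurement Domain into N + 1 = 3 Performance Intervals.
THRESHOLDS = [4.0, 16.0]

# Assumed Satisfaction Level of each Interval, also ordered from low to high.
LEVELS = ["Satisfied", "Tolerating", "Frustrated"]


def interval_index(value, thresholds):
    """Return the 0-based index of the Performance Interval containing value.

    bisect_left places a value equal to a Threshold in the lower Interval,
    so each Threshold belongs to the Interval for which it is the upper bound.
    """
    return bisect_left(thresholds, value)


def satisfaction_level(value, thresholds=THRESHOLDS, levels=LEVELS):
    """Return the Satisfaction Level of the Interval that contains value."""
    if len(levels) != len(thresholds) + 1:
        raise ValueError("N Thresholds define N+1 Performance Intervals")
    return levels[interval_index(value, thresholds)]


if __name__ == "__main__":
    for measurement in (2.5, 4.0, 4.1, 16.0, 30.0):
        print(measurement, satisfaction_level(measurement))
```

With more than three Intervals, the same lookup works unchanged: supply N Thresholds and N+1 levels, several of which may share a Satisfaction Level. An Addendum that places a Threshold in the upper Interval instead would call for `bisect_right` rather than `bisect_left`.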
Explanatory Notes: When I add new terms or definitions, I will post any relevant explanatory details in a comment. That way, the comments will serve as a log of the evolution of the glossary. If you have suggestions for additions or changes, or links to relevant and useful resources, please feel free to post a comment, and I will respond with a comment or a change to the glossary, or both.
Useful Online Glossaries
Where do we look for definitions to fill the gaps? In some cases, I expect to replace those missing entries with links to existing definitions online, rather than composing new ones myself. There are plenty of online glossaries to choose from; here are some that seem to be reliable:
- The ESS EduNet Methodological glossary. The European Social Survey Education Net is a training resource mainly developed for use in higher education. The authors claim that: “In this glossary you will find only a very short description of each concept.” Of course, “very short” is an imprecise term; here it appears to mean “not encyclopedic.” Many of the definitions are much longer–over 100 words–than I had in mind for the Apdex spec. But an entry here could provide an insight that leads to a more concise definition.
- Statsoft’s Statistics Glossary. This comprehensive list is a useful resource when you need precise definitions of statistical terminology. However, because it aims to provide rigorous definitions, it does not even offer definitions for common but imprecise terms like accuracy and precision. Also, many of the entries are too technical for use within the Apdex spec, where I would prefer to limit the use of definitions containing mathematical formulae.
- The US FDA Glossary of Computer Systems Software Development Terminology (8/95). This is more accessible than Statsoft’s glossary. Its definitions, when present, seem to be at about the right level for the Apdex spec.
I will extend this list if I find other useful ones. I will only link to an existing definition if I’m convinced that it is accurate and precise–and that it will remain so. One of the challenges of Web content is its instability. While crowd-sourcing can be a useful feature in some contexts, it’s not appropriate when you’re defining a specification.
Avoiding the Devotees of Humpty Dumpty
Some glossaries, it seems, were compiled by devotees of Humpty Dumpty–that’s why this exercise is necessary. For an example, browse The Fundamentals of Web Analytics Glossary, published by Webtrends in the Education area of their site. Here are a few definitions that I would not want to include in the Apdex spec:
- Accuracy: The ability of a measurement to match the actual value of the quantity being measured. Accuracy is the foundation upon which your marketing analytics should be built. If you can’t trust that your data is accurate, you can’t make confident decisions. In statistical terms, accuracy is the width of the confidence interval for a desired confidence level. See also unique visitors.
- Index: The collection of information (contained in a large database) a search engine has that searchers can query against. With crawler-based search engines, the index is typically copies of all the web pages they have found from crawling the web. With human-powered directories, the index contains the summaries of all web sites that have been categorized.
- KPI: Key Performance Indicators. Key Performance Indicators are typically kept in dashboards and provide customers with an understanding of how the site is performing.
- Latency: The average number of days between visits for a given visitor during a reporting period. For example, those who visit on average every seven days. See also recency and frequency.
- Metrics: Metrics are a system of parameters or ways of quantitative assessment of a process that is to be measured, along with the processes to carry out such measurement. Metrics define what is to be measured.
- Parameters: These are located in the URL immediately after a question mark and followed by an equal sign and a return value, known as name=value.
- Performance Indicators: See KPIs.
- Query: A question or inquiry used to find answers about certain metrics.
- Traffic: On the web, traffic refers to the amount of data sent and received by visitors to a website.
These definitions typify the imprecise and muddled usage of technical terms that is common on the Web. Even Wikipedia, which does suffer from instability and inconsistent quality, is usually a much better source than that Webtrends glossary. At least there’s a good chance that someone will fix glaring errors and omissions. And articles about technical or scientific concepts often contain readable introductions, before diving into arcane graduate-level discussions and mathematical derivations. See for example the article on Accuracy and precision. I will probably cite that article in a future post about that topic, which is on my list of issues to address as we generalize Apdex.