White Paper
Analysis of Erlanger Data Series:
Option Data Smoothing - Medians
By: Philip B. Erlanger, CMT
Simple moving averages are the customary tool for smoothing
data such as options trading activity. We have found that medians
give a better representation of such data.
I. Introduction
Simple moving averages take into account all data within a period.
For example, a 10-day average would factor in all data for a ten-day
period. If there are errors or outlier samples in the data, these
would be reflected in the average. Options data is particularly
prone to errors in the data feeds reported from the exchange.
Illiquid issues in particular are prone to vast swings that can
often be viewed as outliers.
Instead of factoring all values in a 10-day average, we have found
it advantageous to measure the middle or center of such a distribution.
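
As a rough illustration of the point above, the sketch below compares a
10-day simple average with the middle value of the same ten days, before
and after one bad print. The volume figures and the size of the error are
hypothetical values chosen only for illustration; they are not taken from
any data feed.

```python
from statistics import mean, median

# Hypothetical 10-day series of daily option volume (illustrative values only).
clean = [100, 104, 98, 102, 101, 99, 103, 97, 105, 100]

# The same series with one assumed data-feed error on the sixth day.
with_error = list(clean)
with_error[5] = 9900

avg_clean = mean(clean)
avg_error = mean(with_error)
mid_clean = median(clean)
mid_error = median(with_error)

print("10-day average:", avg_clean, "->", avg_error)
print("10-day median: ", mid_clean, "->", mid_error)
```

With these assumed values the average moves from 100.9 to 1081, while the
middle value moves only from 100.5 to 101.5.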
II. Medians
The middle value in an ordered list of numbers is called the median.
The specific value for the median depends on whether the data set
contains an even or odd number of observations and, in the even
case, on whether the two middle values are the same or different.
To find the median of a data set:
- Arrange the numbers in numerical order from smallest to largest.
- If the number of data points, n, is odd, the median is
the middle value in the ordered list. It is located by counting
(n+1)/2 positions from either end of the ordered list.
- If n is even, the median is the average of the two middle
data points.
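
The steps above can be sketched in a few lines of Python. The function
name median_of is a hypothetical helper introduced here for illustration;
Python's standard library offers statistics.median for the same purpose.

```python
def median_of(values):
    """Return the median of a data set using the steps listed above."""
    if not values:
        raise ValueError("median of an empty data set is undefined")

    # Step 1: arrange the numbers from smallest to largest.
    ordered = sorted(values)
    n = len(ordered)

    if n % 2 == 1:
        # Odd n: the (n+1)/2-th value, which is zero-based index (n-1)//2.
        return ordered[(n - 1) // 2]

    # Even n: the average of the two middle data points.
    upper = n // 2
    return (ordered[upper - 1] + ordered[upper]) / 2
```

For example, median_of([3, 1, 2]) picks the single middle value, while
median_of([4, 1, 3, 2]) averages the two middle values 2 and 3.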
The median has the property that, as nearly as possible, half the
data are below and half the data are above the median value.
III. Influence of Extreme Values on the Median
The median uses order information in the data but does not use
the actual numbers to any large extent. The extremely large numbers
that can occur with one day of options data have essentially no effect
on the median level of options trading. Similarly, extremely small
values have essentially no effect on the median. At most, a single
extreme observation shifts the median by one position in the ordered
list. This makes the median valuable as a measure of options trading:
changes in it over time reflect overall changes in activity, not
outliers or the rare error that can occur.
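
To make this concrete, the following sketch smooths a short series with
both a trailing average and a trailing median. The readings, the single
bad print of 4.00, the 5-day window (shortened from the 10-day example
for brevity), and the helper name rolling are all assumptions made for
illustration.

```python
from statistics import mean, median

def rolling(values, window, func):
    """Apply func to each full trailing window of the series."""
    return [func(values[i - window + 1:i + 1])
            for i in range(window - 1, len(values))]

# Hypothetical daily readings of option activity; 4.00 is an assumed bad print.
series = [0.80, 0.82, 0.79, 0.81, 0.83, 0.80,
          4.00, 0.82, 0.81, 0.80, 0.79, 0.82]

sma = rolling(series, 5, mean)    # 5-day simple moving average
med = rolling(series, 5, median)  # 5-day moving median

for day, (a, m) in enumerate(zip(sma, med), start=5):
    print(f"day {day:2d}  average={a:.3f}  median={m:.3f}")
```

In this sketch every 5-day average that contains the bad print rises
above 1.4, while the moving median stays at 0.81 or 0.82 throughout.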