White Paper

Analysis of Erlanger Data Series:

Option Data Smoothing - Medians

By: Philip B. Erlanger, CMT

 

Simple moving averages are the customary tool for smoothing data such as options trading data. We have found that medians give a better representation of such data.

 

I. Introduction

Simple moving averages take into account all data within a period. For example, a 10-day average factors in every value from the ten-day period. If the data contain errors or outlier samples, these are reflected in the average. Options data, as reported by the exchange data feeds, is particularly prone to errors. Illiquid issues in particular are prone to vast swings that can often be viewed as outliers.

Instead of factoring all values into a 10-day average, we have found it advantageous to measure the middle, or center, of the distribution of those values.
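The contrast between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not the method used to build the Erlanger data series: the `rolling` helper and the `volumes` figures are hypothetical, and the 9,500 reading stands in for a bad print from a data feed.

```python
from statistics import mean, median

def rolling(values, window, func):
    """Apply func to each trailing window of the given length."""
    return [func(values[i - window:i]) for i in range(window, len(values) + 1)]

# Made-up daily option volumes; the 9500 on day 6 stands in for a bad print.
volumes = [120, 135, 110, 140, 125, 9500, 130, 115, 145, 128, 132, 118]

sma_10 = rolling(volumes, 10, mean)    # 10-day simple moving average
med_10 = rolling(volumes, 10, median)  # 10-day rolling median

for avg, med in zip(sma_10, med_10):
    print(f"average: {avg:8.1f}   median: {med:6.1f}")
```

With these illustrative numbers, every 10-day average that contains the bad print is pulled above 1,000, while the rolling median stays near 130, the level of a typical day.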

 

II. Medians

The middle value in an ordered list of numbers is called the median. The specific value of the median depends on whether the data set contains an even or odd number of observations and, in the even case, on whether the two middle values are equal. To find the median of a data set:

  • Arrange the numbers in numerical order from smallest to largest.
  • If the number of data points, n, is odd, the median is the middle value in the ordered list. It is located by counting (n+1)/2 positions from either end of the ordered list.
  • If n is even, the median is the average of the two middle data points.

The median has the property that, as nearly as possible, half the data are below and half the data are above the median value.
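The procedure above can be written as a short Python function. This is a sketch for illustration only; the function name `median_of` is an assumption, and Python's standard library already provides an equivalent as `statistics.median`.

```python
def median_of(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the middle value, position (n+1)/2 counting from 1.
        return ordered[mid]
    # Even n: the average of the two middle data points.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([7, 3, 5]))      # odd number of observations
print(median_of([7, 3, 5, 9]))   # even number of observations
```

For the three values 3, 5, 7 the median is the middle value, 5. For the four values 3, 5, 7, 9 it is the average of 5 and 7, which is 6.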

 

III. Influence of Extreme Values on the Median

The median uses the order of the data but makes little use of the actual magnitudes. The extremely large values that can occur in a single day of options data have essentially no effect on the median level of options trading. Similarly, extremely small values have essentially no effect on the median. In general, the median is not influenced by how extreme the outlying observations in a data set are. This makes it a valuable measure of options trading: changes in the median over time reflect overall changes in activity rather than outliers or the rare data error.
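This property can be checked directly. The sketch below uses made-up values and a hypothetical helper, `replace_value`, to swap the largest observation for a far larger one and the smallest for a far smaller one, then compares the median and the mean of each data set.

```python
from statistics import mean, median

# Made-up sample of ten daily readings.
base = [110, 115, 120, 125, 128, 130, 135, 140, 145, 150]

def replace_value(values, old, new):
    """Return a copy of values with the first occurrence of old swapped for new."""
    out = list(values)
    out[out.index(old)] = new
    return out

high_spike = replace_value(base, max(base), 150_000)  # largest value made extreme
low_spike = replace_value(base, min(base), 1)         # smallest value made extreme

medians = [median(s) for s in (base, high_spike, low_spike)]
means = [mean(s) for s in (base, high_spike, low_spike)]

print("medians:", medians)
print("means:  ", means)
```

In all three cases the two middle values are 128 and 130, so the median stays at 129. The mean, by contrast, moves with the size of the extreme value.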