Practical Methods for Identifying Anomalies in Large Datasets

Feb 20, 2015

An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism*.

Today, I gave a talk at O'Reilly's Strata that used three case studies to introduce some practical methods for identifying anomalies in large datasets.

You can find the slides here.


*D. Hawkins. Identification of Outliers. Chapman and Hall, 1980.