Black swans, outliers and couscous

by Codewiz51 April 27, 2010 20:38

John Cook has made some excellent blog posts regarding outliers.  The general point has been that we tend to underestimate improbable events when distributions have long tails.  (At least I think that was the point.)

I had a personal epiphany tonight with an outlier observation while I was preparing dinner.  I discovered couscous a few years ago.  I like to cook, and it has become one of my favorite side dishes.  My Mom and family never ate Middle Eastern food, and I do not recall ever being exposed to couscous until I was an adult who could purchase my own groceries.  The observation is that, of all my genetic cousins, I am the only one who likes couscous.  I know there are a million very good reasons why I might like couscous, but looking at my family, I am an outlier when it comes to couscous.

My taste for couscous makes me a genuine outlier, maybe, probably?

Which leads me to my real point.  I am required to analyze reams and reams of data in my job as an analyst.  Unlike people in many industries and endeavors, I am covered up with data that would be the envy of many statisticians.  What I have learned from so much data is that it is what it is.  There are so many outliers in the data that it does not make sense to discard them.  Outliers occur often enough that they are normal.

It is important to learn to accept outliers in an analysis, rather than trying to exclude them from consideration.  Perhaps we should reflect inwardly on the state of our models.  Maybe they are not correct?
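As a rough illustration of why the model matters, here is a small sketch comparing how often a "k sigma" event turns up under a normal model versus a heavier-tailed one.  The Laplace distribution is my own pick for the heavier-tailed stand-in (it is not taken from any real data set), and the function names are made up for this example.

```python
import math

def normal_tail(k):
    """P(|X - mean| > k * sigma) when X is normally distributed."""
    return math.erfc(k / math.sqrt(2))

def laplace_tail(k):
    """P(|X - mean| > k * sigma) when X is Laplace distributed.

    A Laplace distribution with scale b has sigma = b * sqrt(2),
    and P(|X - mean| > x) = exp(-x / b), so the tail at k sigma
    is exp(-k * sqrt(2)).
    """
    return math.exp(-k * math.sqrt(2))

if __name__ == "__main__":
    # Compare the two models at a few thresholds (assumed, illustrative values).
    for k in (2, 3, 4, 5):
        n, l = normal_tail(k), laplace_tail(k)
        print(f"{k} sigma: normal {n:.2e}, laplace {l:.2e}, ratio {l / n:.1f}")
```

Both distributions have the same mean and standard deviation here, yet the heavier-tailed one produces far more points beyond 3 sigma.  Points that look like outliers under the normal model are simply expected under the other.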

Comments

4/27/2010 10:05:54 PM #

Thanks, Gene.

One problem with throwing out outliers is that the reasoning can be circular.  How do you decide what data are outliers? Those that are unlikely.  But unlikely assuming what, a normal distribution?  If you implicitly assume a normal distribution, you'll throw out data that don't fit a normal.  Then, lo and behold, the data that are left are normal!

One way out of such circular reasoning is to use robust methods that discount outliers automatically without throwing them away.

John Cook United States
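One common example of a robust method (not necessarily the one the commenter had in mind) is to summarize data with the median and the median absolute deviation (MAD) instead of the mean and standard deviation.  The sketch below uses made-up numbers; the helper name mad_scale and the sample values are assumptions for illustration only.  The factor 1.4826 is the usual constant that makes the MAD comparable to a standard deviation when the data really are normal.

```python
import statistics

def mad_scale(data):
    """Median absolute deviation, scaled to match sigma for normal data."""
    med = statistics.median(data)
    deviations = [abs(x - med) for x in data]
    return 1.4826 * statistics.median(deviations)

# Hypothetical measurements with one extreme value; nothing is discarded.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0]

if __name__ == "__main__":
    print("mean:  ", statistics.mean(data))
    print("stdev: ", statistics.stdev(data))
    print("median:", statistics.median(data))
    print("MAD:   ", mad_scale(data))
```

The extreme value stays in the data set, but it barely moves the median or the MAD, while it drags the mean and standard deviation a long way.  No decision about what counts as an outlier is needed up front, which sidesteps the circular reasoning.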

Disclaimer

This blog represents my personal hobby, observations and views. It does not represent the views of my employer, clients, especially my wife, children, in-laws, clergy, the dog, the cats or my daughter's horse. In fact, I am not even sure it represents my views when I take the time to reread postings.

© Copyright 2008-2011