The IV AMMCS International Conference

Waterloo, Ontario, Canada | August 20-25, 2017

AMMCS 2017 Plenary Talk

Big data's dirty secret

Harvey Stein (Bloomberg LP and Columbia University)

"Let the data speak for themselves."
"We apply machine learning to the problem of..."
These are two commonly heard phrases these days. But what data exactly are we speaking about, and what do we intend to do with them? What is ignored all too often is the quality of the data being used and how it impacts the analyses being done. Are there holes in the data? Are there anomalies? Given how dirty data can be, a more apt phrase might be "Garbage in, garbage out".
In this talk we will discuss some of the data problems we've encountered in financial data, and approaches that can be used to address them. Our particular focus will be on techniques we've employed to deal with missing data and bad data in credit default swap (CDS) spread histories.
Dr. Stein is the Head of Quantitative Risk Analytics at Bloomberg LP and an Adjunct Professor at Columbia University in the City of New York. He is responsible for all quantitative components of Bloomberg's Enterprise Risk product, including regulatory capital calculations, CCAR scenarios, FRTB support, VaR, stressed VaR, ES, predictive stress and exposure calculations (PFE, EPE, etc.). Harvey Stein is a world-class quantitative researcher, technologist and accomplished manager. He has extensive experience in quantitative modeling and research, risk analytics, derivatives pricing, stochastic processes, pure and applied mathematics, and numerical methods. He is a well-known author in finance (ranked in the top 0.5% on SSRN), an experienced software architect and systems designer, and a pioneer in cluster computing. His specialties include CVA, credit risk, default modeling, regulation, VaR, derivative valuation, interest rate modeling, MBS/CMO valuation, numerical methods, and software engineering.