The expression “garbage in, garbage out” is known to all. But what does this mean for machine learning and data mining? This talk will explore several cases where applications of machine learning approaches have failed (or misled practitioners with erroneous success) because of deficiencies in training data, poor documentation of feature sets, violation of tacit mathematical assumptions, or naive application of techniques.
Jeff Chen is Deputy Chief Data Officer of the US Department of Commerce, where he leads the Commerce Data Service, working to integrate data science into policy and operations. A statistician by trade, he has led and deployed data-driven efforts in 35 fields in a dozen countries, and has led and contributed to efforts at NASA, the White House Office of Science and Technology Policy, the NYC Fire Department, the Clinton Health Access Initiative, and the NYC Mayor’s Office.