What is Data Science?

Data science can generate products that provide actionable information.

Big Data, together with the opportunities and challenges it creates in business analytics, government analytics, healthcare, and other fields, is considered a potential game changer for the US economy. Broad applications of Big Data are estimated to increase the annual gross domestic product of the United States by several hundred billion dollars by 2020. There is an enormous need for talent to sustain this revolution in business and industry. A Forbes web posting (May 2017) reports that annual demand for data scientists, including data developers and engineers, will reach 700,000 by 2020.

What is Data Science?

Data Science is a new interdisciplinary field that incorporates computer science, statistics, and mathematical modeling, with applications in business, government, the life sciences, social sciences, and many other areas. It capitalizes on the enormous explosion in available data that the world has seen over the last decades and that will continue.

The amount of data flowing over the Internet is now on the order of hundreds of exabytes per year (one exabyte is a million terabytes), and the amount of digital data produced each year is even higher. The coming “Internet of Things” will only accelerate this deluge of “Big Data”. Harnessing these vast amounts of data by analyzing them, extracting valuable information, and ultimately using the knowledge thus gained to make actionable recommendations can be of great benefit to individuals, businesses, and entire societies. For example, it is estimated that the effective use of big data in the US healthcare sector could reduce healthcare costs by US $300 billion per year.

There is an enormous need for talent in data science to sustain this revolution in business and industry. A recent report of the McKinsey Global Institute estimates that by 2018 the US could face a shortage of 150,000 people with deep analytical skills, plus a shortage of 1.5 million managers and analysts who are familiar with data science methodologies and applications.

Who is a Data Scientist?

Data scientists combine skills from the mathematical sciences, primarily statistics and linear algebra, with computing skills, including programming and infrastructure design. They must also communicate well, so that they can talk to people with domain knowledge, ask the right questions, and produce work that leads to action. They are instrumental in helping their organization acquire, process, and leverage data in a timely fashion, in order to enable new processes, generate new insights, and create entirely new products.

Data scientists work in all sorts of organizations, from tiny startups and small research teams to established companies and government institutions, all the way to hugely successful enterprises that have thrived in the digital revolution happening around us. They collect data (e.g. traffic data and related weather records), describe them (“how does traffic move at certain times?”), discover patterns in the data (“how does inclement weather influence car traffic?”), use such information to predict (“which roads will likely be especially busy during this afternoon’s thunderstorms?”), and finally advise and recommend (“which lanes should be opened this afternoon?”). Ideally, data scientists work at all these stages to advance their organization’s goals.
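The traffic example can be sketched in a few lines of Python. This is only a toy illustration of the describe, discover, predict, and recommend stages, not a real traffic model: the observations are made up, the function names are hypothetical, and the per-lane capacity of 500 vehicles per hour is an assumption chosen for the example.

```python
import math
from statistics import mean

# Hypothetical hourly observations: (hour of day, rainfall in mm, vehicles counted).
# These numbers are invented purely for illustration.
observations = [
    (8, 0.0, 1200),
    (8, 5.0, 1500),
    (12, 0.0, 800),
    (12, 4.0, 1000),
    (17, 0.0, 1400),
    (17, 6.0, 1800),
]


def describe(rows):
    """Describe: average vehicle count for each hour of the day."""
    by_hour = {}
    for hour, _rain, vehicles in rows:
        by_hour.setdefault(hour, []).append(vehicles)
    return {hour: mean(counts) for hour, counts in by_hour.items()}


def rain_effect(rows):
    """Discover: ratio of average traffic in wet hours to average traffic in dry hours."""
    wet = [vehicles for _hour, rain, vehicles in rows if rain > 0]
    dry = [vehicles for _hour, rain, vehicles in rows if rain == 0]
    return mean(wet) / mean(dry)


def predict(rows, hour, raining):
    """Predict: dry-weather average for that hour, scaled by the rain effect if it rains."""
    dry_counts = [v for h, rain, v in rows if h == hour and rain == 0]
    baseline = mean(dry_counts)
    return baseline * rain_effect(rows) if raining else baseline


def recommend(expected_vehicles, capacity_per_lane=500):
    """Recommend: lanes to open, given an assumed capacity per lane (vehicles per hour)."""
    return math.ceil(expected_vehicles / capacity_per_lane)


if __name__ == "__main__":
    print("Average traffic by hour:", describe(observations))
    print("Wet-to-dry traffic ratio:", round(rain_effect(observations), 2))
    expected = predict(observations, hour=17, raining=True)
    print("Lanes to open at 17:00 in rain:", recommend(expected))
```

In practice each stage would draw on far more data and on proper statistical models; the sketch only shows how the output of one stage feeds the next.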

In a recent survey, 800 data scientists from around the world reported that they worked as analysts (including coding), statisticians, software developers, technical leads, managers, and product developers, with most respondents listing more than one job function. The median salary was about $98,000 per year worldwide and about $115,000 per year in the US. The respondents worked in areas such as software and applications development, IT solutions, science and technology, banking and finance, retail/e-commerce, education, healthcare, and many more. More than 50% of all respondents were between 21 and 35.