Data science is an interdisciplinary field whose goal is to extract knowledge and enable discovery from complex data. The field combines principles from statistics and computer science to perform data-intensive tasks in domain-specific contexts. The program trains analysts in a broad spectrum of computational and statistical techniques for handling data-intensive tasks in a variety of quantitative, computational, and scientific disciplines. Additionally, students learn how to work and communicate effectively and efficiently in collaborative environments.
Students will learn how to work with large, complex data sets, analyze data with appropriate statistical and computational tools to extract relevant knowledge, create effective displays and visualizations of the data, communicate results clearly to both experts and nonexperts, understand and communicate the scientific generalizability of their models, and respect the relevant ethical and legal issues pertaining to data analytics.
While there are no official specializations, students can pursue individual goals through elective offerings. Some electives will deepen technical skills, and others will give students practical experience leading technical teams and initiatives. Alternatively, many students take domain-specific classes in other schools or departments at Vanderbilt, such as the Owen Graduate School of Management or the Biostatistics and Computer Science departments.
The program's learning objectives are to: (1) acquire, clean, and manage (massive) data; (2) design computational pipelines to collect and process large-scale data; (3) visualize data and highlight data patterns graphically; (4) build and interpret a statistical model for large-scale data; (5) explain the advantages and limitations of competing statistical models; (6) implement machine learning algorithms to make predictions and optimize decisions; (7) understand and explain the difference between inference and prediction goals; (8) generate reproducible code and analyses, and work in a reproducible manner; (9) communicate data science methods and results clearly and concisely to varied audiences; and (10) recognize the ethical, policy, and privacy implications of data science research.


