Data Science

E. A. RUNDENSTEINER, PROGRAM DIRECTOR

PROFESSORS: E. A. Rundensteiner, C. Ruiz, D. M. Strong, S. A. Zekavat

ASSOCIATE PROFESSORS: M. Y. Eltabakh, L. T. Harrison, X. Kong, K. Lee, Y. Li, X. Liu, R. Paffenroth, A. Trapp, J. Zou

ASSISTANT PROFESSORS: N. Kordzadeh, O. Mangoubi, R. Shraga

TEACHING PROFESSOR: F. Emdad

ASSISTANT TEACHING PROFESSORS: T. Ghoshal, C. K. Ngan

Mission Statement

The Data Science program equips WPI undergraduates with the skills to understand, apply, and develop models, algorithms, and statistical techniques to gather large amounts of data, draw new insights from them, and formulate appropriate action plans. Through courses and hands-on project work, students in the Data Science program will master foundational and advanced topics, including state-of-the-art data analytic technologies such as machine learning, deep learning, artificial intelligence, and big data. This prepares students to work in interdisciplinary teams with diverse perspectives on the most critical data challenges of an increasingly digital world, from climate change, self-driving cars, and digital healthcare to social justice. In addition to being a discipline in and of itself, Data Science complements many of the existing undergraduate majors at WPI. Disciplines from the sciences to engineering increasingly grapple with large data sets using computational and statistical techniques and tools.

Students interested in Data Science, both majors and minors, should check with the Data Science program as early as possible in their academic career to develop a plan of study. Students will be assigned a Data Science advisor after completing a major/minor declaration form.

Program Educational Objectives

In support of its goals and mission, the WPI Data Science undergraduate program’s educational objectives are to graduate students who will: 

  • Bring together a community of diverse disciplinary backgrounds and experiential perspectives to promote creative solutions to critical real-world problems and advance knowledge at the cutting edge
  • Achieve professional success due to their mastery of Data Science theory and practice
  • Conduct impactful research and project work in data science tackling the world’s most challenging problems
  • Engage in discovery through purpose-driven project-based learning
  • Collaborate with partners both internally and externally in interdisciplinary projects
  • Become leaders in business, academia, and society due to a broad preparation in data science, computational thinking, mathematics, science & engineering, communication, and social issues 
  • Pursue lifelong learning and continuing professional development
  • Use their understanding of the impact of data science on society for the benefit of humankind

Theme:

“Gather Information, Form Insights, Impact the World!”

Program Outcomes

Students graduating with a Bachelor of Science degree in Data Science: 

  • Have mastered foundational studies in business, computer science, and mathematical sciences 
  • Have mastered advanced principles and techniques in at least one of the three disciplines 
  • Can apply computational and mathematical knowledge to the solution of big data problems 
  • Can communicate effectively across disciplines both verbally and in writing 
  • Can locate, read, and interpret primary literature in data science 
  • Can function effectively as members of an interdisciplinary team 
  • Have an understanding of accepted standards of ethical and professional behavior 
  • Have the ability to be a lifelong independent learner

Classes

CS 4433/DS 4433: Big Data Management and Analytics

Category I (offered at least 1x per Year)
Units 1/3

This course introduces the emerging techniques and infrastructures for big data management and analytics, including parallel and distributed database systems, MapReduce, Spark, and NoSQL infrastructures, data stream processing systems, scalable analytics and mining, and cloud-based computing. Query processing and optimization, access methods, and storage layouts developed on these infrastructures will be covered. Students are expected to engage in hands-on projects using one or more of these technologies.

CS 4804: Data Visualization

Units 1/3

This course trains students in data visualization, the graphical communication of data and information for presentation, confirmation, and exploration. Students learn the stages of the visualization pipeline, including data characterization, mapping data attributes to graphical attributes, user task abstraction, visual display techniques, tools, paradigms, and perceptual issues. Students evaluate the effectiveness of visualizations for specific data, task, and user types. Students implement visualization algorithms and undertake projects involving the use of commercial and public-domain visualization tools.

DS 1010: Data Science I: Introduction to Data Science

Category I (offered at least 1x per Year)
Units 1/3

This course provides an introduction to the core concepts in Data Science. It covers a broad range of methodologies for working with and making informed decisions based on real-world data. Core topics introduced in this course include basic statistics, data exploration, data cleaning, data visualization, business intelligence, and data analysis. Students will utilize various techniques and tools to explore, understand and visualize real-world data sets from various domains and learn how to communicate data results to decision makers.

DS 2010: Data Science II: Modeling and Data Analysis

Category I (offered at least 1x per Year)
Units 1/3

This course focuses on model- and data-driven approaches in Data Science. It covers methods from applied statistics (regression), optimization, and machine learning to analyze and make predictions and inferences from real-world data sets. Topics introduced in this course include basic statistics (regression), analytics (explanatory and predictive), basics of machine learning (classification and clustering), eigenvalues and singular matrices, data exploration, data cleaning, data visualization, and business intelligence. Students will utilize various techniques and tools to explore and understand real-world data sets from various domains.

DS 3010: Data Science III: Computational Data Intelligence

Category I (offered at least 1x per Year)
Units 1/3

This course introduces core methods in Data Science. It covers a broad range of methodologies for working with large and/or high-dimensional data sets to make informed decisions based on real-world data. Core topics introduced in this course include the data life cycle from collection through use, data management of large-scale data, cloud computing, machine learning, and deep learning. Students will acquire experience with big data problems through hands-on projects using real-world data sets.

DS 4635/MA 4635: Data Analytics and Statistical Learning

Category I (offered at least 1x per Year)
Units 1/3

The focus of this class will be on statistical learning, the intersection of applied statistics and modeling techniques used to analyze and to make predictions and inferences from complex real-world data. Topics covered include: regression; classification/clustering; sampling methods (bootstrap and cross-validation); and decision tree learning. Students may not receive credit for both MA 463X and MA 4635.