E. A. RUNDENSTEINER, PROGRAM DIRECTOR
PROFESSORS: E. A. Rundensteiner, C. Ruiz, D. M. Strong, S. A. Zekavat
ASSOCIATE PROFESSORS: M. Y. Eltabakh, L. T. Harrison, X. Kong, K. Lee, Y. Li, X. Liu, R. Paffenroth, A. Trapp, J. Zou
ASSISTANT PROFESSORS: N. Kordzadeh, O. Mangoubi, R. Shraga
TEACHING PROFESSOR: F. Emdad
ASSISTANT TEACHING PROFESSORS: T. Ghoshal, C. K. Ngan
Data Science prepares WPI undergraduates with the skills to understand, apply, and develop models, algorithms, and statistical techniques to gather huge amounts of data, draw new insights from it, and formulate appropriate action plans. Through courses and hands-on project work, students in the Data Science program will master foundational and advanced topics, including state-of-the-art data analytic technologies like machine/deep learning, artificial intelligence, and big data. This prepares students to work in interdisciplinary teams with diverse perspectives on the most critical data challenges of an increasingly digital world, from climate change and self-driving cars to digital healthcare and social justice. In addition to being a discipline in and of itself, Data Science complements many of the existing undergraduate majors at WPI. Disciplines from the sciences to engineering increasingly grapple with large data sets using computational and statistical techniques and tools.
Students interested in Data Science, both majors and minors, should check with the Data Science program as early as possible in their academic career to develop a plan of study. Students will be assigned a Data Science advisor after completing a major/minor declaration form.
Program Educational Objectives
In support of its goals and mission, the WPI Data Science undergraduate program’s educational objectives are to graduate students who will:
- Bring together a community of diverse disciplinary backgrounds and experiential perspectives to promote creative solutions to critical real-world problems and advance knowledge at the cutting edge
- Achieve professional success due to their mastery of Data Science theory and practice
- Conduct impactful research and project work in data science tackling the world’s most challenging problems
- Engage in discovery through purpose-driven project-based learning
- Collaborate with partners both internally and externally in interdisciplinary projects
- Become leaders in business, academia, and society due to a broad preparation in data science, computational thinking, mathematics, science & engineering, communication, and social issues
- Pursue lifelong learning and continuing professional development
- Use their understanding of the impact of data science on society for the benefit of humankind
“Gather Information, Form Insights, Impact the World”!
Students graduating with a Bachelor of Science degree in Data Science:
- Have mastered foundational studies in business, computer science, and mathematical sciences
- Have mastered advanced principles and techniques in at least one of the three disciplines
- Can apply computational and mathematical knowledge to the solution of big data problems
- Can communicate effectively across disciplines both verbally and in writing
- Can locate, read, and interpret primary literature in data science
- Can function effectively as members of an interdisciplinary team
- Have an understanding of accepted standards of ethical and professional behavior
- Have the ability to be a lifelong independent learner
Data Science Major, Bachelor of Science
This course introduces the emerging techniques and infrastructures for big data management and analytics including parallel and distributed database systems, map-reduce, Spark, and NoSQL infrastructures, data stream processing systems, scalable analytics and mining, and cloud-based computing. Query processing and optimization, access methods, and storage layouts developed on these infrastructures will be covered. Students are expected to engage in hands-on projects using one or more of these technologies.
Knowledge of database systems at the level of CS 4432 and programming experience are assumed.
This course trains students in data visualization, the graphical communication of data and information for presentation, confirmation, and exploration. Students learn the stages of the visualization pipeline, including data characterization, mapping data attributes to graphical attributes, user task abstraction, visual display techniques, tools, paradigms, and perceptual issues. Students evaluate the effectiveness of visualizations for specific data, task, and user types. Students implement visualization algorithms and undertake projects involving the use of commercial and public-domain visualization tools.
CS 2102 or CS 2103, and CS 2223.
This course provides an introduction to the core concepts in Data Science. It covers a broad range of methodologies for working with and making informed decisions based on real-world data. Core topics introduced in this course include basic statistics, data exploration, data cleaning, data visualization, business intelligence, and data analysis. Students will utilize various techniques and tools to explore, understand and visualize real-world data sets from various domains and learn how to communicate data results to decision makers.
This course focuses on model- and data-driven approaches in Data Science. It covers methods from applied statistics (regression), optimization, and machine learning to analyze and make predictions and inferences from real-world data sets. Topics introduced in this course include basic statistics (regression), analytics (explanatory and predictive), basics of machine learning (classification and clustering), eigenvalues and singular matrices, data exploration, data cleaning, data visualization, and business intelligence. Students will utilize various techniques and tools to explore and understand real-world data sets from various domains.
Data science basics equivalent to DS 1010, applied statistics and regression equivalent to MA 2611 and MA 2612, and the ability to write computer programs in a scientific language equivalent to a CS programming course at the CS 1000 or CS 2000 level are assumed.
This course introduces core methods in Data Science. It covers a broad range of methodologies for working with large and/or high-dimensional data sets to make informed decisions based on real-world data. Core topics introduced in this course include the data cycle from collection through use, data management of large-scale data, cloud computing, machine learning, and deep learning. Students will acquire experience with big data problems through hands-on projects using real-world data sets.
Data science basics equivalent to DS 1010, data analysis principles and modeling equivalent to DS 2010, knowledge of basic statistics equivalent to MA 2611 and MA 2612, the ability to program equivalent to (CS 1004, CS 1101, or CS 1102) and (CS 2102, CS 2103, or CS 2119), and an understanding of databases equivalent to CS 3431 or MIS 3720 are assumed.
This class focuses on statistical learning, the intersection of applied statistics and modeling techniques used to analyze and to make predictions and inferences from complex real-world data. Topics covered include regression, classification/clustering, sampling methods (bootstrap and cross-validation), and decision tree learning. Students may not receive credit for both MA 463X and MA 4635.
Linear Algebra (MA 2071 or equivalent), Applied Statistics and Regression (MA 2612 or equivalent), Probability (MA 2631 or equivalent). The ability to write computer programs in a scientific language is assumed.