Big Data, Machine Learning, and their Real World Applications

I - June 28–July 16, 2021 (Course Filled)
II - July 20–August 6, 2021 (Course Filled)
Modality, Day & Time:
Monday–Friday, 9:10–11:00 a.m. and 1:10–3:00 p.m.

Instructor(s):
Cynthia Clement, Elena Dubova, Rajeev Nair, Amanda Yang, Yiqiao Yin
Prerequisites:
Algebra 1 and geometry. Some background in statistics and computer programming is recommended but not required.

“I have gained invaluable skills in AI and Data Science.” – Rom F. | Glencoe, Illinois

Course Description

The exponential growth of data, advances in cloud computing, and machine learning have transformed every industry, from retail and banking to healthcare and education. This introductory-level course enables participants to navigate the new reality of the “data economy,” in which data is “the new oil”: a ubiquitous and invaluable asset.

We focus on the strategic use of data and innovative technologies to derive actionable business insights. Participants develop a strong foundation in data-driven thinking for solving real-world problems. They are introduced to a variety of popular technologies for data analytics and gain a familiarity with programming in either R, a software environment for statistical computing and graphics, or Python. Much of the in-class work involves working with one of these two languages. Students learn how to import, export, manipulate, transform, and visualize data; use statistical summaries; and run and evaluate machine learning models.
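As a rough illustration of the import, transform, summarize, and export steps named above, here is a minimal Python sketch. It is not taken from the course materials: the sample data, column names, and helper functions (load_rows, add_revenue, summarize, export_csv) are hypothetical and use only the standard library.

```python
import csv
import io
import statistics

# Hypothetical sample data; a real exercise would read a file from disk.
RAW_CSV = """store,units_sold,unit_price
A,10,2.50
B,4,3.00
C,7,2.00
"""


def load_rows(text):
    """Import: parse CSV text into a list of dicts with typed values."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "store": row["store"],
            "units_sold": int(row["units_sold"]),
            "unit_price": float(row["unit_price"]),
        })
    return rows


def add_revenue(rows):
    """Transform: derive a revenue column from units sold and unit price."""
    return [{**r, "revenue": r["units_sold"] * r["unit_price"]} for r in rows]


def summarize(rows, column):
    """Statistical summary: mean, median, and maximum of one column."""
    values = [r[column] for r in rows]
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
    }


def export_csv(rows):
    """Export: write the rows back out as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = add_revenue(load_rows(RAW_CSV))
summary = summarize(rows, "revenue")
print(summary)
print(export_csv(rows))
```

In practice, a library such as pandas would likely handle these steps more concisely; the standard-library version is used here only to keep the sketch self-contained.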

From the start of the course, participants are immersed in the world of data: they are introduced to the concepts of big data, artificial intelligence, the internet of things, cloud computing, and data ethics in the context of real-world business scenarios. Through hands-on experience and practice, they study data harvesting and exploration, as well as the basics of data visualization. Once they are comfortable with data manipulation and transformation, they gain familiarity with statistical frameworks and methods designed to extract practical insights from data. Participants learn and implement common machine learning techniques, then develop and evaluate analytical solutions.
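To make "implement and evaluate a machine learning technique" concrete, the sketch below fits a one-variable least-squares line on a training split and measures its error on held-out data. It is an assumed illustration, not the course's own exercise: the data points are invented (they follow y = 2x + 1 exactly, so the held-out error is essentially zero), and fit_line and mean_absolute_error are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical (x, y) pairs, e.g. ad spend versus sales.
data = [(1, 3), (2, 5), (3, 7), (4, 9), (5, 11), (6, 13)]

# Hold out the last two points so the model is evaluated on unseen data.
train, test = data[:4], data[4:]

slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
predictions = [slope * x + intercept for x, _ in test]
mae = mean_absolute_error([y for _, y in test], predictions)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, held-out MAE={mae:.2f}")
```

Real data is noisy, so the held-out error would normally be greater than zero; comparing it with the training error is one simple way to judge whether a model generalizes.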

Toward the conclusion of the course, students work in groups on a final project and presentation, thereby (a) solidifying their newly acquired analytical and programming skills and (b) practicing storytelling with data.

Participants should expect a dynamic and interactive environment: hands-on exercises, teamwork, continuous in-class dialogue, demonstrations, and interactive presentations. The course features real-world applications of data analytics across industries and challenges students to think in terms of the business value of data and machine learning.

Registration Guidance & Call Number(s)

Please note that this course may offer multiple classes in a given session. Students should register for only one class, using one call number.

To view detailed information on a particular offering, click on the call number to be directed to the Directory of Courses catalogue.

Session 1 Classes

  • Section 01 | Call Number: 10333
  • Section 02 | Call Number: 10334
  • Section 03 | Call Number: 14275
  • Section 04 | Call Number: 10335

Session 2 Classes

  • Section 05 | Call Number: 10336
  • Section 06 | Call Number: 10566
  • Section 07 | Call Number: 10337
  • Section 08 | Call Number: 12370

Further guidance on the registration process can be found here.


    Cynthia Clement

    Cynthia Clement is a data science manager at Munich Reinsurance, where she leads various initiatives to incorporate new data sources and implement machine learning-based solutions to improve risk assessment. Prior to Munich Re, she was a founding member of a branch of Datorama’s professional services team that analyzed social media data to drive marketing campaigns. Cynthia has advised executive teams from various companies on incorporating data-driven solutions, model governance, and data science best practices. She holds a bachelor's degree in mathematics from Carnegie Mellon University and a master's in data science from Columbia University.

    Elena Dubova

    Elena Dubova built her career at Microsoft working with enterprise businesses across different industries: retail, transportation, and professional services. She held leadership positions across a number of business departments; operated in various functions, including sales, marketing, strategy, and operations; and developed and executed transformational projects, from bringing new business verticals to success to transforming the partner ecosystem. Elena holds an M.S. in applied analytics from Columbia University and M.A.s in economics and international relations from Ivanovo State University. For several years she was a board member and director for the Model United Nations, an educational program that provides students with opportunities to find solutions for real-world issues. Elena is currently a faculty member at Columbia University’s School of Professional Studies.

    Rajeev Nair

    Rajeev Nair has led the Predictive Modeling and AI/Decision Sciences team for CreditOne Bank, has been associated with the private equity Comcraft Group, and has served as an advisor to the Northeast Big Data Innovation Hub, a think tank created to apply AI/NLP/Machine Learning technologies to solve business problems in finance and healthcare. Rajeev earned his MBA from Columbia Business School. He has completed executive education at MIT on artificial intelligence (AI) for business strategy and holds a B.Tech. from the Indian Institute of Technology (IIT), Kharagpur.

    Amanda Yang

    Amanda Yang is a data analytics manager for the Walt Disney Company, where she leads a team of data analysts who study Disney+ user behavior. Through her work at Disney, she has become an expert on creating compelling stories that inform executives and team members on key strategies for driving growth. Prior to Disney, she was a founding member of the data team at New York Magazine, where she focused on trends in editorial content and contextual ad targeting. Outside of the media and entertainment space, Amanda has worked in data analytics across multiple industries, including advertising, home services, retail, and software. She holds a bachelor's in political science from DePaul University and has completed executive coursework at the University of Illinois-Chicago and Northwestern University.

    Yiqiao Yin

    Yiqiao Yin holds an M.A. in statistics from Columbia University, an M.S. in finance from the University of Rochester, and a B.A. in mathematics from the University of Rochester. He is currently a Ph.D. candidate in statistics at Columbia. His research interests include feature learning and representation learning, deep learning, computer vision (CV), natural language processing (NLP), and reinforcement learning (RL). He has held professional appointments as an enterprise-level data scientist at Bayer Crop Science, a quantitative researcher at AQR working on alternative quantitative approaches to portfolio management and factor-based trading, and a trader at T3 Trading on Wall Street. Yiqiao has two years of teaching experience, supervises a small fund specializing in algorithmic trading, and runs his own YouTube channel, on which he discusses topics in data science, machine learning, and artificial intelligence.

    Specific course details, such as hours and instructors, are subject to change at the discretion of the University. Not all instructors listed for a course teach all sections of that course.