CMSC 152

Introduction to Programming for Data Science II

Coordinator: Nazli Hardy

Credits: 4.0

Description

Continuation of CSCI 151 covering more advanced computer programming techniques, with an emphasis on developing programs to manipulate and analyze real-world data from various domains, including business, science, and the humanities. Topics include creating appropriate data visualizations, acquiring data from numerous sources, analyzing and cleaning data sets, drawing advanced conclusions from data, and the ethics of data collection and analysis. The language currently used is Python.

Prerequisites

C or higher in CSCI 151, or B or higher in CSCI 161; and C- or better in MATH 101, MATH 120, or MATH 130, or an MPT score of 160 or above

Course Outcomes

At the end of this course, a successful student will be able to:

  1. Acquire data from websites, files, databases, and other sources and convert it to a usable form.
  2. Create appropriate visualizations of data sets, and explain why they are appropriate.
  3. Analyze the quality of a data set and fix deficiencies in it.
  4. Draw advanced conclusions from data.
  5. Discuss the ethics of data collection and analysis.
  6. Find relations between data sets.
  7. Form hypotheses about relations between data sets and test whether these hypotheses are supported by the data.
  8. Manipulate and analyze a variety of types of data, potentially including time series, geospatial, text, or images.
  9. Use intermediate programming techniques such as collection processing, regular expressions, object-oriented methodology, and leveraging appropriate libraries to complete programming objectives.
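As a rough illustration of the kind of work outcomes 3 and 9 describe, the sketch below cleans a small set of text records with a regular expression, stores the valid ones in a simple class, and summarizes them with collection processing. It is not taken from the course materials: the sample rows, the `Reading` class, and the function names are all hypothetical and chosen only for this example.

```python
import re
from dataclasses import dataclass
from statistics import mean

# Hypothetical raw records, invented for illustration only.
RAW_ROWS = [
    "2021-03-01, Lancaster , 41.5",
    "2021-03-02,lancaster,n/a",   # missing measurement
    "2021-03-03, York, 39.0",
    "bad line",                   # malformed record
]

# Expected shape: date, city name, numeric value (extra spaces tolerated).
ROW_PATTERN = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2})\s*,\s*([A-Za-z ]+?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$"
)


@dataclass
class Reading:
    """One cleaned record (hypothetical structure)."""
    date: str
    city: str
    value: float


def parse_rows(rows):
    """Keep only rows that match the expected pattern and normalize them."""
    readings = []
    for row in rows:
        match = ROW_PATTERN.match(row)
        if match is None:
            continue  # skip rows with missing or malformed fields
        date, city, value = match.groups()
        readings.append(Reading(date, city.title(), float(value)))
    return readings


def mean_by_city(readings):
    """Group values by city and return the mean for each group."""
    grouped = {}
    for reading in readings:
        grouped.setdefault(reading.city, []).append(reading.value)
    return {city: mean(values) for city, values in grouped.items()}


readings = parse_rows(RAW_ROWS)
summary = mean_by_city(readings)
print(summary)
```

In this sketch, rows that fail the pattern are silently dropped; a fuller data-quality analysis would likely also count or log the rejected rows so the deficiencies can be reported.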