Course Description:
Overview and Learning Objectives
Data Science is a rapidly evolving field without a uniformly agreed-upon definition. It is an interdisciplinary field that uses scientific methods, together with concepts and processes from statistics and computer science, to extract and communicate knowledge and insights from data. What chiefly differentiates Data Science from Computer Science or Statistics is its connection to other domains, such as Biology, Environmental Science, and Economics, and the need to understand the contexts those domains provide.
In this course, we will introduce and use critical concepts and skills in computer programming and statistical inference to analyze real-world datasets and interpret real-world phenomena. The purpose of this course is to develop some of the foundational skills needed to consume data and create information. The main theme in the course is understanding the sources of data, the variability inherent in data, biases and fallacies, and the inherent uncertainty associated with conclusions drawn from data.
By the end of the course, students will have learned and practiced the following concepts:
- Foundational programming concepts:
- Memory concepts: variables, name, type, value, assignment statements, scope of variables
- Control structures: for loops, if/else, while loops
- Basic data structures: lists
- Functions: function call vs. function definition, formal vs. actual parameters (arguments)
- Input/output concepts: printing output, reading from files
- Problem solving strategies and code design:
- breaking down a problem into a sequence of steps
- abstracting specific problems into general ones, and
- finding general solutions
- Debugging programs by:
- reading code and predicting the output of programs or parts of code
- using print statements to localize program bugs
- Data Science concepts:
- Basic operations on tables: loading data, extracting columns, extracting rows
- Selecting rows that match certain criteria, composite queries, join / aggregate
- Computing summary statistics
- Exploratory Data Analysis (EDA)
- Simulating experiments
- Probability basics
- Analysis and critical thinking with special attention to biases and fallacies, ethics and fairness
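As a rough illustration of how several of the listed concepts fit together, here is a minimal Python sketch. It is not taken from the course materials. The function name `simulate_coin_flips` and its parameters `num_flips` and `seed` are hypothetical, chosen only for this example. It shows a function definition and a function call, a for loop, an if/else, a list, a simulated experiment, and a summary statistic.

```python
import random
import statistics


def simulate_coin_flips(num_flips, seed=None):
    """Simulate flipping a fair coin; return a list of 1 (heads) and 0 (tails).

    Hypothetical helper written for this illustration only.
    """
    rng = random.Random(seed)  # a seeded generator makes the run repeatable
    flips = []                 # list that collects the outcome of each flip
    for _ in range(num_flips):
        if rng.random() < 0.5:
            flips.append(1)    # heads
        else:
            flips.append(0)    # tails
    return flips


# Function call: 1000 is the actual parameter (argument) bound to num_flips.
flips = simulate_coin_flips(1000, seed=42)

# Summary statistic: the mean of a 0/1 list is the proportion of heads.
proportion_heads = statistics.mean(flips)

print("Number of flips:", len(flips))
print("Proportion of heads:", proportion_heads)
```

Running the simulation again with a different seed gives a slightly different proportion, which is one simple way to see the variability inherent in data.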
Prerequisites
CMPSC 5A does not have any prerequisites beyond high-school algebra (and a desire to learn). The curriculum and format are designed specifically for students who have not previously taken statistics or computer science courses. The course was created to target first- and second-year students.
CMPSC 5A is unavailable to students who have taken CMPSC 8, ENGR 3, or ECE 3.
Students who have taken several statistics or computer science courses should instead take a more advanced course, such as Data Science 2, CMPSC 9 (aka Intermediate Python Programming), or PSTAT 100 (Principles and Techniques of Data Science).
Prior to Fall 2019, this course was taught as INT 5. In 2020-2021, it was offered as CMPSC 90DA.
Sample syllabus and Website: Fall 2020
Additional Information:
Description (from the General Catalog): Introduction to data science methods and Python programming language for students with little to no experience in the subjects. Topics include foundational programming concepts, problem solving strategies and code design, and such data science concepts as table operations, exploratory data analysis, basic probability, and more.
Course Level:
- Undergraduate
Course Number:
Course Time:
Offered every quarter:
Spring 2024: CMPSC 5A - Prof. Solis
Tues/Thurs: 9:30-10:45 am, ILP 1101
Lab 1: Weds at 1:00 pm, SH 1430
Lab 2: Weds at 2:00 pm, PHELP 1440
Lab 3: Weds at 3:00 pm, PHELP 1440
Lab 4: Weds at 4:00 pm, PHELP 1440
Spring 2024: CMPSC 5A - Prof. Tanna
Mon/Weds: 3:30-4:45 pm, ILP 2101
Lab 1: Weds at 6:00 pm, ILP 4101
Lab 2: Weds at 12:00 pm, ILP 4209
Lab 3: Weds at 1:00 pm, ILP 3209
Lab 4: Weds at 2:00 pm, GIRV 1119
Lab 5: Weds at 3:00 pm, PHELP 1448
Winter 2024: CMPSC 5A - Prof. Solis
Tues/Thurs: 3:30-4:45 pm, BUCHN 1920
Lab 1: Weds at 3:00 pm, SSMS 1301&
Lab 2: Weds at 4:00 pm, SSMS 1301&
Lab 3: Weds at 5:00 pm, SSMS 1301&
Lab 4: Weds at 10:00 am, SSMS 1301&
Winter 2024: CMPSC 5A - Prof. Tanna
Mon/Weds: 3:30-4:45 pm, LSB 1001
Lab 1: Fridays at 11:00 am, ILP 4101
Lab 2: Fridays at 12:00 pm, ILP 4101
Lab 3: Fridays at 1:00 pm, ILP 4101
Lab 4: Fridays at 2:00 pm, ILP 4101
Lab 5: Fridays at 3:00 pm, ILP 4101
Fall 2023: CMPSC 5A
Tues/Thurs: 2:00-3:15 pm, EMBARHALL
Lab 1: Weds at 10:00, PHELPS 1445
Lab 2: Weds at 11:00, NH 1111
Lab 3: Weds at 12:00, NH 1109
Lab 4: Weds at 1:00, ARTS 1349
Lab 5: Weds at 2:00, PHELPS 2514
Lab 6: Weds at 3:00, PHELPS 2514
Spring 2023: CMPSC 5A
Tues/Thurs: 5:00-6:15 pm, GIRV 1004
Lab 1: Weds at 10:00, PHELPS 2514
Lab 2: Weds at 11:00, PHELPS 2514
Lab 3: Weds at 12:00, PHELPS 2514
Lab 4: Weds at 1:00, GIRV 1115
Lab 5: Weds at 2:00, PHELPS 1440
Lab 6: Weds at 3:00, PHELPS 1445
Winter 2023: CMPSC 5A
Mon/Weds: 6:30-7:45 pm, GIRV 1004
Lab 1: Weds at 10:00, SSMS 1301&
Lab 2: Weds at 11:00, SSMS 1301&
Lab 3: Weds at 12:00, SSMS 1301&
Lab 4: Weds at 1:00, NH 1111
Lab 5: Weds at 2:00, 387 1015
Fall 2022: CMPSC 5A
Tue/Thur: 5:00-6:15 pm, LSB 1001
Lab 1: Weds at 10:00, PHELP 1530
Lab 2: Weds at 11:00, PHELP 1530
Lab 3: Weds at 12:00, PHELP 1530
Lab 4: Weds at 1:00, SSMS 1303
Lab 5: Weds at 2:00, SSMS 1303
Spring 2022: CMPSC 5A
Tue/Thur: 5:00-6:15 pm, ELLSN 2617
Lab 1: Weds at 10:00, GIRV 1119
Lab 2: Weds at 11:00, 387 1011
Lab 3: Weds at 2:00, GIRV 2119
Winter 2022: CMPSC 5A
Tue/Thur: 11:00-12:15 pm, BUCHN 1930
Lab 1: Weds at 10:00, GIRV 2127
Lab 2: Weds at 11:00 am, ARTS 1356
Fall 2021: CMPSC 5A
Tue/Thur: 11:00-12:15 pm, Phelps 1260
Lab 1: Weds at 10:00, Bld 387, Rm. 1011
Lab 2: Weds at 11:00 am, Girvetz 2116
Summer 2021: CMPSC 90DA
Session C (June 21 - Aug 10)
Tue/Wed/Thur: 9:30-10:20 am
Lab: Thur at 11:00 am
Spring 2021
Mon/Wed: 11:00 - 12:15 pm
Labs: Wed at 3:00 and 4:00 pm
Winter 2021
Mon/Wed: 11:00 - 12:15 pm
Labs: Wed at 5:00 and 6:00 pm