Data Science Tools and Applications

Supplemental material can be found on YouTube

Getting Started Checklist

  1. Join Piazza and Discord
  2. Create a GitHub account
  3. Create a Kaggle account
  4. Fill out this form (requires BU email) with your GitHub and Kaggle usernames
  5. Install Python and Jupyter Notebook
  6. Sign up for Gradescope (code: TBD)

About

The goal of this course is to give students a hands-on understanding of classical data analysis techniques and to develop proficiency in applying these techniques in a modern programming language (Python).

The course introduces students to a wide range of techniques that are commonly used in the analysis of data such as clustering, classification, regression, and neural networks.

Note that this is not a Python (or an introduction to programming) course, so self-study will be necessary for those students who do not already know the language.

There is no textbook for this course. All material will be made available online.

Prerequisites

Students taking this class must have some prior familiarity with programming at the level of CS 105, 108, or 111, or equivalent. CS 132 or equivalent (MA 242, MA 442) is required. CS 112 is also helpful.

Workload

There are 4 components to this course:

  1. Weekly assignments
  2. A midterm
  3. A final project
  4. Cold Calls in class

Assignments

A programming assignment will be released every Wednesday and will be due on Sunday at 11:59 PM. These are individual assignments. The use of tools like ChatGPT is encouraged.

Midterm

The midterm will be a Kaggle Data Science competition among the students in the class with a live leaderboard. Students will need to submit predictions based on a training dataset and a report detailing the methods used and decisions made.

Final Project

The final project can be done individually or in a group of up to 5 students.

A project proposal will need to be submitted at the end of the first month of the semester. Details will be provided at the start of the semester.

You can select among a number of BU Spark curated projects or you can create your own. A list of example projects will be provided.

At the end of the semester, some teams will be selected to present a poster of their project on Demo Day (details to follow).

Cold Call

Reading will be assigned prior to some classes and you may be cold called to answer a question specific to the reading.

Labs

The first few labs will aim to help students get up to speed with Python and with tools like Git/GitHub. After that, labs will serve as extra office hours.

Grading

  • 40% assignments
  • 20% midterm
  • 35% final project
  • 5% cold call

  Letter   Grade
  A        95% +
  A-       90% - 95%
  B+       87% - 90%
  B        83% - 87%
  B-       80% - 83%
  C+       77% - 80%
  C        73% - 77%
  C-       70% - 73%
  D        60% - 70%
  F        below 60%
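
The weights and letter cutoffs above can be combined into a quick estimate of a course grade. The following is only a rough sketch, not the official grade calculation: the function names, the dictionary keys, and the example scores are all hypothetical. It assumes that each component is scored from 0 to 100 and that a score exactly on a boundary (for example, 90%) earns the higher letter. Extra credit is not modeled.

```python
# Component weights from the grading breakdown (fractions of the final grade).
WEIGHTS = {
    "assignments": 0.40,
    "midterm": 0.20,
    "final_project": 0.35,
    "cold_call": 0.05,
}

# Lower bound of each letter band, highest first.
# Assumption: a score exactly on a boundary earns the higher letter.
LETTER_CUTOFFS = [
    (95, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (60, "D"),
]


def weighted_score(scores):
    """Combine per-component percentages (0-100) into one overall percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map an overall percentage to a letter using the cutoffs above."""
    for cutoff, letter in LETTER_CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example = {"assignments": 92, "midterm": 85, "final_project": 90, "cold_call": 100}
total = weighted_score(example)
print(f"{total:.1f}% -> {letter_grade(total)}")
```

With these made-up scores the weighted total is 0.40 × 92 + 0.20 × 85 + 0.35 × 90 + 0.05 × 100 = 90.3%, which falls in the A- band.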

Extra Credit

Extra Credit can be earned by consistently:

  • Asking and answering questions on Piazza/Discord
  • Contributing to our class repository or course website via PRs (e.g. by fixing typos, providing clarification edits, sharing class notes, etc.)

Re-Grades

If you notice an issue with a grade you’ve received, please don’t email the teaching staff. Instead, submit a regrade request on Gradescope within 48 hours of receiving the grade. Requests submitted after 48 hours will not be accepted.

Emails

When emailing the CS506 staff or creating a private Piazza post, please always CC or include the instructor, the TF, and all TAs.