
CS 506: Data Science Tools and Applications

Spring 2025
Instructor: Lance Galletti
Boston University

Welcome to CS 506! This course provides a comprehensive introduction to data science tools and applications, combining theoretical foundations with hands-on practice.

🤖 AI-Powered Course Assistant

Need help? Ask our AI assistant! Click the chat icon in the bottom-right corner to get instant answers about:

  • Course content and assignments
  • Deadlines and important dates
  • Office hours and contact information
  • Technical concepts and algorithms
  • Student perspectives and tips

🔑 API Key Required

Important: To use the AI assistant, you'll need to provide your own OpenAI API key:

  1. Get a free key at OpenAI Platform
  2. Click the chat icon and enter your key when prompted
  3. Your key stays private - it's stored locally and never sent to our servers
  4. Change your key anytime - use the 🔑 button in the chat header

The assistant learns from course materials and student contributions, providing personalized help throughout your learning journey.

📚 Course Overview

This course covers the complete data science workflow, from data collection to model deployment. You'll learn practical skills using real-world datasets and tools commonly used in industry.

What You'll Learn

  • Data Science Workflow - CRISP-DM methodology and best practices
  • Data Preprocessing - Cleaning, transformation, and feature engineering
  • Machine Learning - Clustering, classification, regression, and neural networks
  • Data Visualization - Creating compelling visualizations and storytelling
  • Real-world Applications - Industry-standard tools and techniques

Course Outcomes

By the end of this course, you will be able to:

  • ✅ Apply the complete data science workflow to real problems
  • ✅ Implement and evaluate machine learning algorithms
  • ✅ Create compelling data visualizations and reports
  • ✅ Use industry-standard tools (Python, pandas, scikit-learn)
  • ✅ Communicate findings effectively to stakeholders
  • ✅ Build a portfolio of data science projects

📋 Prerequisites

Required Background:

  • Basic programming experience (any language)
  • Familiarity with Python is highly recommended
  • Basic statistics knowledge (mean, median, standard deviation)

What You'll Learn:

  • Data science workflow and methodology
  • Data cleaning and preprocessing techniques
  • Machine learning algorithms and implementation
  • Data visualization and storytelling
  • Real-world project development

If You're Unsure:

  • Contact the instructor if you have concerns about prerequisites
  • We can provide additional resources for students who need them
  • The course is designed to accommodate various backgrounds

Recommended Preparation:

  • Review basic Python syntax if you're new to it
  • Familiarize yourself with Jupyter notebooks
  • Brush up on basic statistics concepts

📊 Grading

Component      | Weight | Description
Participation  | 10%    | Active engagement in discussions and labs
Assignments    | 30%    | Weekly lab assignments and exercises
Midterm Report | 25%    | Individual project report (due March 31)
Final Project  | 35%    | Real-world data science application (due May 1)
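
As a rough illustration of how these weights combine, the sketch below computes a weighted course score. Only the weights come from the grading table; the component scores are invented placeholders, and the function name `weighted_score` is hypothetical rather than part of any course tooling.

```python
# Weights taken from the grading table; the example scores below are invented.
WEIGHTS = {
    "Participation": 0.10,
    "Assignments": 0.30,
    "Midterm Report": 0.25,
    "Final Project": 0.35,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Return the weighted course score (0-100) from per-component scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 10 + 27 + 20 + 29.75 weighted points.
example_scores = {
    "Participation": 100.0,
    "Assignments": 90.0,
    "Midterm Report": 80.0,
    "Final Project": 85.0,
}
print(round(weighted_score(example_scores), 2))  # 86.75
```

How a weighted score maps to a letter grade is not specified here, so the sketch stops at the numeric total.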

Detailed Breakdown

Participation (10%):

  • Active participation in class discussions
  • Engagement in lab sessions
  • Contribution to group activities
  • Regular attendance and preparation

Assignments (30%):

  • Weekly lab assignments
  • Data analysis exercises
  • Code implementation tasks
  • Submission through Gradescope

Midterm Report (25%):

  • Due: March 31, 2025
  • Individual project report
  • Data analysis and visualization
  • Written report with findings

Final Project (35%):

  • Due: May 1, 2025
  • Individual or group project (up to 5 students)
  • Real-world data science application
  • Final presentation and report

📅 Course Schedule

Topics Covered

  • Introduction to Data Science - Workflow, tools, and methodology (Lectures 1-2)
  • Clustering - K-means, hierarchical, and density-based algorithms (Lectures 3-8)
  • Singular Value Decomposition - Dimensionality reduction and feature extraction (Lectures 9-10)
  • Classification - KNN, decision trees, Naive Bayes, and SVM (Lectures 11-15)
  • Regression - Linear and logistic regression with evaluation (Lectures 16-20)
  • Neural Networks - Deep learning fundamentals and applications (Lectures 21-23)

Semester Structure

  • Duration: 12 weeks (Spring 2025)
  • Total Lectures: 23
  • Topics: 6 major topics with flexible duration
  • Flexible Pacing: Topics can take 1-3 weeks depending on complexity

🎯 Course Components

Lectures

  • Format: Interactive lectures with live coding
  • Materials: Slides, worksheets, and code examples
  • Location: Check your schedule
  • Office Hours: See staff page for times and locations

Labs

  • Format: Hands-on coding sessions
  • Focus: Practical implementation of concepts
  • Submission: Through Gradescope
  • Support: TA assistance and peer collaboration

Projects

  • Midterm Project: Individual analysis and report
  • Final Project: Real-world application of your choice
  • Portfolio: Build a collection of your best work
  • Presentation: Share your findings with the class

🚀 Getting Started

1. Set Up Your Environment

  • Install Python and Jupyter notebooks
  • Set up required packages (pandas, numpy, matplotlib, scikit-learn)
  • Configure your development environment
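
One way to confirm the setup worked is to try importing each package and print its version. This is a minimal sketch, not an official course script: the helper name `check_packages` is hypothetical, and it assumes the packages were installed into the same Python environment you run it from (for example with `pip install pandas numpy matplotlib scikit-learn`).

```python
import importlib

# Packages named in the setup list; scikit-learn is imported as "sklearn".
REQUIRED = ["pandas", "numpy", "matplotlib", "sklearn"]


def check_packages(names):
    """Return {import_name: version string, or None if the import fails}."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for name, version in check_packages(REQUIRED).items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

Running it inside a Jupyter notebook cell as well as from the terminal helps catch the common case where the notebook kernel uses a different environment.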

2. Join Our Community

3. Start Learning

📝 Contributing to Course Knowledge

Share Your Insights

Students can contribute their understanding and tips to help others learn:

  1. Create a Pull Request with your notes in the student_notes/ directory
  2. Follow the format with sections for your thoughts, challenges, and tips
  3. Get instant feedback - PRs are automatically validated and merged
  4. Help future students - Your insights become part of the AI assistant's knowledge

Note Format

---
title: "Your Topic - Your Understanding"
student_name: "Your Name"
topic: "Topic Name"
difficulty_level: "beginner/intermediate/advanced"
---

## What I Think
Your understanding of the concept...

## What I Found Challenging
Specific difficulties you encountered...

## My Tips for Other Students
Helpful advice and strategies...
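
The course's actual PR validation is not shown here. As a rough sketch of the kind of check it might perform, the code below confirms that a note has the front-matter fields and section headings from the template above. The function `validate_note`, the rules it applies, and the sample note are all assumptions for illustration.

```python
REQUIRED_FIELDS = ["title", "student_name", "topic", "difficulty_level"]
REQUIRED_SECTIONS = [
    "## What I Think",
    "## What I Found Challenging",
    "## My Tips for Other Students",
]


def validate_note(text: str) -> list[str]:
    """Return a list of problems found in a note; an empty list means it passed."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---' front-matter delimiter"]
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        return ["missing closing '---' front-matter delimiter"]

    # Parse simple `key: value` pairs by hand (no YAML library needed for this sketch).
    fields = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not fields.get(name)]
    body = [line.strip() for line in lines[end + 1:]]
    problems += [f"missing section: {h}" for h in REQUIRED_SECTIONS if h not in body]
    return problems


# Invented sample note that follows the template.
note = """---
title: "K-means - My Understanding"
student_name: "Example Student"
topic: "Clustering"
difficulty_level: "beginner"
---

## What I Think
K-means groups points around centroids.

## What I Found Challenging
Choosing the number of clusters.

## My Tips for Other Students
Plot the data before clustering.
"""
print(validate_note(note))  # []
```

Running a check like this locally before opening a Pull Request can catch a missing field or heading early, though the real validator may enforce different rules.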

📚 Course Resources

Learning Materials

  • Lecture Slides: Available in each module
  • Worksheets: Interactive Jupyter notebooks
  • Code Examples: Complete implementations
  • Student Notes: Peer insights and tips

Getting Help

  • AI Assistant: Click the chat icon for instant help (requires OpenAI API key)
  • Office Hours: See staff page for times
  • Discord: Ask questions in our community
  • Email: Contact the instructor for private concerns

🗓️ Important Dates

Date           | Event
Early Semester | Project proposal due
March 31       | Midterm report due
May 1          | Final project due
May 7          | Final exam

🎓 Support and Communication

Office Hours

  • Instructor: See staff page for times
  • TAs: Available during lab sessions
  • Discord: 24/7 community support

Communication Channels

  • Course Questions: Use the AI assistant or Discord
  • Private Concerns: Email the instructor
  • Technical Issues: Discord or office hours
  • Assignment Help: Gradescope or Discord

Ready to start your data science journey? Use the AI assistant to ask any questions, and don't hesitate to reach out for help! 🚀