The Data Science Toolbox

Welcome to The Data Science Toolbox! This course is part of the Key Capabilities in Data Science program and covers the workflows, platforms, and tools used in data analysis.

In this course, we will dive into the world of data science tools and utilities. While these are not strictly required for data analysis, they are necessary for efficient, reproducible, and collaborative data science practices, and are all important building blocks for a successful and sustained data science career.

Course prerequisites: Programming in Python for Data Science

Module 0: The Data Science Toolbox

Course introduction, summary of course learning outcomes and prerequisite validation.

Module 1: Introduction to the Data Science Toolbox

In this module we will introduce you to several of the tools that we will be using in this course, as well as to computing in general.

Module 2: The shell

In this module, you will learn how to use the shell to navigate your filesystem and execute commands.

Module 3: Git and GitHub intro

This module covers the basics of version control with Git and GitHub.

Module 4: Getting groovy with Git and GitHub

View your Git history, travel back in time, resolve merge conflicts, and explore other useful tools.

Module 5: Branches, forks, and streams… Welcome to the Git nature walk!

Discover how to collaborate efficiently with Git and GitHub by using branches, forks, and pull requests.

Module 6: File Names, Project Organization, Virtual Environments

An overview of how to effectively manage files, projects, and virtual environments.

Module 7: JupyterLab

In this module, you will learn about JupyterLab, one of the most popular development environments for data science projects.

Module 8: Jupyter Book

In this module you will learn how to create beautiful, publication-ready books and websites using Jupyter Book.

Module Closing Remarks

Well done on finishing The Data Science Toolbox introduction.

About this course

In this course, we will dive into the world of data science tools and utilities. While these tools are not strictly required for data analysis, they are necessary for maintaining efficient, reproducible, and collaborative workflows, and are essential building blocks for a successful and sustained data science career.

About the program

The University of British Columbia (UBC) is a comprehensive research-intensive university, consistently ranked among the 40 best universities in the world. The Key Capabilities in Data Science program was launched in September 2020 and is developed and taught by many of the same instructors as the UBC Master of Data Science program.