20 Completely FREE DataCamp Courses To Take In 2021

In this article, I will list 20 completely free courses that are available on DataCamp right now.

DataCamp is an incredible learning resource for those wanting to learn how to code and develop data science skills. However, their material requires a paid membership to access the full course (if you are interested in their premium content, be sure to check out their latest promotions here or have a read of my DataCamp review).

Fear not! There are numerous courses on their website that are completely free to take. Some are developed by the guys at DataCamp themselves, while most are known as open courses, which are created by other individuals and groups. All you need to access these courses is a free DataCamp account, and off you go.

Below, you will find a link (disclaimer: these are affiliate links) to each of the free courses on DataCamp, broken down by the corresponding platform, as well as a brief overview of what to expect in each one.

Happy learning!

The complete list of free DataCamp courses

Python

Kaggle, owned by Google, is an online community of data scientists who use machine learning to come up with the best code to win competitions. Those who rise to the top of the leaderboards can earn some respectable prizes, including cash!

What is covered in the course?

  • A Kaggle-style exercise to predict the survival rate in the Titanic competition.
  • A brief introduction to Python.
  • Learn about the decision tree and random forest techniques to fine-tune your machine learning capabilities.

Data.world is a community where you can catalog, access and collaborate on all kinds of data in one place. The course is delivered by Rafael Pereira, a Director of Engineering over at data.world, who will introduce you to the platform.

What is covered in the course?

  • A brief introduction to the data.world platform.
  • Learn how to use Python to pull data from data.world.

This is a comprehensive free course (6 chapters!) that takes you from the very basics of Python all the way to creating machine learning models. The aim of the course is to encourage students to participate in exciting machine learning competitions over at Analytics Vidhya.

What is covered in the course?

  • A brief introduction to the Python interface and writing your first line of code.
  • Exploring data with the use of Pandas.
  • Building predictive learning models to submit to the DataHack platform.

In this short, but sweet, course you will learn how to tidy messy data – an issue every Data Scientist has to battle with very frequently. The course is inspired by Hadley Wickham’s paper on Tidy Data which defines tidy and messy data.

What is covered in the course?

  • Tidying messy data with the melt function in Pandas.
  • How to rename columns in a dataset, through Pandas.
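
To give a flavour of the two Pandas techniques listed above, here is a minimal sketch. The table and its column names are made up for illustration and are not taken from the course.

```python
import pandas as pd

# Hypothetical "messy" table: one column per year instead of one row per observation.
messy = pd.DataFrame({
    "country": ["A", "B"],
    "2019": [10, 20],
    "2020": [11, 21],
})

# melt gathers the year columns into (year, cases) pairs, one row per observation.
tidy = messy.melt(id_vars="country", var_name="year", value_name="cases")

# rename takes a mapping from old column names to new ones.
tidy = tidy.rename(columns={"country": "nation"})

print(tidy)
```

The result has one row per country and year, which is the "tidy" shape described in Hadley Wickham's paper.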

R

This course complements the material from the University of Amsterdam’s Basic Statistics course at Coursera. It is aimed at those who are interested in using R to perform statistical analyses. Oh, and don’t worry if you are a complete beginner to R, Basic Statistics assumes you have no or very little knowledge of the programming language. I certainly recommend taking this free DataCamp course.

What is covered in the course?

  • The basics of R and writing your first lines of code.
  • How to import data and packages into R.
  • Exploring data through data visualisation.
  • Performing summary statistics (e.g. mean, median, mode).
  • How to perform correlation and regression analyses.
  • Probability and hypothesis testing.

This course was developed by the guys at Microsoft. In it, you will be introduced to the RevoScaleR package, which is a machine learning package specifically designed to deal with big data.

What is covered in the course?

  • How to summarize, cross-tabulate and visualise variables from a large dataset.
  • Learn how to manipulate and transform variables using RevoScaleR.
  • Performing statistical analyses; including linear and logistic regression, k-means clustering and decision tree estimation.

The next course in the Causal Inference series focuses on regression analysis in R.

What is covered in the course?

  • An introduction to regression analysis to find causal effects.
  • How to compute regression analyses.
  • How to perform matching methods to find causal effects.

This course focuses on instrumental variable and regression discontinuity analysis to find causality through indirect inference.

What is covered in the course?

  • The ability to practice instrumental variable analysis.
  • An introduction to regression discontinuity design (RDD), which is considered more flexible than instrumental variables.

This course complements the Coursera course Data Analysis and Statistical Inference by Mine Çetinkaya-Rundel. Over nine chapters, this comprehensive course starts from the very basics of R and ends with a chapter on regression analyses. This is another perfect course for those who are interested in statistical analysis but have zero R knowledge.

What is covered in the course?

  • An introduction to R and data visualisation.
  • Learn about probability and sampling distributions.
  • Perform different statistical tests on numerical and categorical data.
  • An introduction to performing linear and multiple linear regression analysis.

In this course, you will use data from the 2013 American Community Survey to ultimately decide whether it is useful to pursue a PhD or not – the results of which will be of interest to many of you! Combined with data visualization, these Kaggle challenges make learning R fun and interesting. You will soon be building up your own Kaggle scripts to use on your own Kaggle account.

What is covered in the course?

  • Using interesting datasets to answer specific questions through R.
  • Look deeper into pigeon racing data and assess the optimal size for effective chopsticks (yes, you read that right!).

This is a relatively short course – only two chapters long – so it will only take a few minutes to complete. The course covers two powerful packages in R: dplyr and Plotly. The former is useful when sorting data files, while Plotly is a nifty tool used to create interactive graphs and figures. Have a play around with it in the course to see its potential.

What is covered in the course?

  • Learn how to tidy messy data with the dplyr package.
  • Use Plotly to create fancy graphs, including heatmaps.

Quandl is a website that offers free and open financial, economic and social datasets. This resource is incredibly useful when wanting to flex your data science knowledge.

What is covered in the course?

  • Understand how to import data from the Quandl server into your own R workspace.
  • Learn how to manipulate the Quandl datasets.

This DataCamp course complements the Inferential Statistics Coursera course. This is another course aimed at beginners to R who are interested in statistical analysis, which I totally recommend starting out with.

What is covered in the course?

  • An introduction to the very basics of R. Start out with baby steps of simple arithmetic tasks before venturing into multiple lines of code.
  • Performing the appropriate statistical tests on a range of data types.
  • Understand simple and multiple regression models.
  • Learn how to perform ANOVA tests to investigate differences between multiple experimental groups.
  • Performing non-parametric testing on datasets which are not normally distributed.

This course complements the Introduction to Probability and Data Coursera course, delivered by Mine Çetinkaya-Rundel. Unlike many of the DataCamp courses which have a clean, interactive interface, the Introduction to Probability and Data course looks exactly like RStudio.

What is covered in the course?

  • Learn about the concept of probability testing.

The Intro to Computational Finance course packs a lot of material into eight chapters. It has a strong mathematical basis and makes use of financial market data.

What is covered in the course?

  • Learn about return calculations by analyzing and plotting graphs.
  • Compute probability distributions.
  • Understand what bivariate distributions are by performing correlation analyses.
  • Use the PerformanceAnalytics, zoo and tseries packages to analyze and visualise stock returns.
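
If "return calculations" sounds abstract, the following sketch shows two common definitions, simple and log (continuously compounded) returns. The course itself works in R; Python is used here only for illustration, and the prices are invented.

```python
import math

# Hypothetical closing prices, for illustration only.
prices = [100.0, 110.0, 99.0]


def simple_returns(p):
    """Period-over-period percentage change: P_t / P_{t-1} - 1."""
    return [p[t] / p[t - 1] - 1 for t in range(1, len(p))]


def log_returns(p):
    """Continuously compounded return: ln(P_t / P_{t-1})."""
    return [math.log(p[t] / p[t - 1]) for t in range(1, len(p))]


print(simple_returns(prices))
print(log_returns(prices))
```

One handy property is that log returns add up across periods: their sum equals the log of the last price divided by the first.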

This course is identical to the Python version. In it, you will use aspects of machine learning from Kaggle’s Titanic exercise to predict survival rate. Both decision trees and random forests, which are fundamentals of machine learning, will be covered.

What is covered in the course?

  • A Kaggle-style exercise to predict the survival rate in the Titanic competition.
  • A brief introduction to R.
  • Learn about decision tree and random forest techniques to fine-tune your machine learning capabilities.

As the name suggests, this is the perfect course for absolute beginners to R, with no experience required! The course instructor, Annika Salzberg, does a fantastic job of breaking down the R language. Annika covers installing packages, understanding the types of data structures, as well as data cleaning and visualisation. However, note that this course is purely video-based, so no exercises can be performed through DataCamp’s interface.

What is covered in the course?

  • An introduction to RStudio and its interface.
  • How to install packages in R.
  • Learn about importing and cleaning data.
  • Understand how to use ggplot2 to create graphs.

As the name suggests, this is a fun short course which uses the power of R and the knowledge of data science to search through Yelp reviews for restaurants to decide where to eat. This course is based on a popular Springboard blog post regarding Yelp Review Modifications. You will quickly learn to explore the raw data and then generate weighted star reviews to provide a better metric when making that all-important restaurant choice.

What is covered in the course?

  • An introduction to R in DataCamp and how to explore data.
  • Learn how to create new variables to generate weighted star review scores.
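
To illustrate the general idea of a weighted star score, here is a rough sketch. The weighting scheme (each review counts in proportion to how many reviews its author has written) is my own assumption for illustration and is not necessarily the one used in the course or the Springboard post. The course works in R; Python is used here only as an example.

```python
# Hypothetical reviews: (stars, number of reviews the reviewer has written).
reviews = [(5, 1), (3, 9)]


def plain_mean(reviews):
    """Ordinary average star rating, every review counted equally."""
    return sum(stars for stars, _ in reviews) / len(reviews)


def weighted_mean(reviews):
    """Average star rating where each review is weighted by the reviewer's activity."""
    total_weight = sum(weight for _, weight in reviews)
    return sum(stars * weight for stars, weight in reviews) / total_weight


print(plain_mean(reviews))     # 4.0
print(weighted_mean(reviews))  # 3.2
```

Here the experienced reviewer's 3-star rating pulls the score down from 4.0 to 3.2.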

This course will introduce you to Hadley Wickham’s readr package. Delivered by the man himself, Hadley will talk you through importing and exporting datasets from R using readr.

What is covered in the course?

  • Learn how to import CSV and txt files into R.
  • Understand how importing can sometimes go wrong and how you can go about changing column types.

SQL

Karlijn Willems, a Data Scientist from DataCamp, is the instructor for this SQL course. The focus is on using SQL to undertake database manipulations in a marketing context.

What is covered in the course?

  • Explore how to retrieve data into the SQL workspace and summarize it.
  • Learn how to filter the parts of the data that you are interested in, as well as how to combine data.
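
As a taste of what retrieving, filtering, combining and summarizing look like in SQL, here is a small sketch run through Python's built-in sqlite3 module. The tables, columns and figures are hypothetical marketing data invented for this example, not the course's dataset.

```python
import sqlite3

# In-memory database with two made-up marketing tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaigns (id INTEGER PRIMARY KEY, channel TEXT);
CREATE TABLE leads (id INTEGER PRIMARY KEY, campaign_id INTEGER, revenue REAL);
INSERT INTO campaigns VALUES (1, 'email'), (2, 'social');
INSERT INTO leads VALUES (1, 1, 50.0), (2, 1, 0.0), (3, 2, 120.0), (4, 2, 30.0);
""")

# Combine (JOIN), filter (WHERE) and summarize (GROUP BY with COUNT and SUM).
rows = conn.execute("""
SELECT c.channel, COUNT(*) AS paying_leads, SUM(l.revenue) AS total_revenue
FROM leads AS l
JOIN campaigns AS c ON c.id = l.campaign_id
WHERE l.revenue > 0
GROUP BY c.channel
ORDER BY c.channel
""").fetchall()

for channel, paying_leads, total_revenue in rows:
    print(channel, paying_leads, total_revenue)
```

The query keeps only leads that generated revenue, then reports the count and total per channel.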

Conclusion

There are more free DataCamp courses out there than you may be aware of. I hope this list has provided you with an idea of some alternative resources to take advantage of in your data science journey.

*Disclaimer: this article contains affiliate links. These links do not affect your experience with DataCamp; we simply earn a commission if a sale results from a click.
