32 Completely FREE DataCamp Courses To Take In 2019

In this article, I will list 32 completely free courses that are available on DataCamp right now.

DataCamp is an incredible learning resource for those wanting to learn how to code and develop data science skills. However, their material requires a paid membership to access the full course (if you are interested in their premium content, be sure to check out their latest promotions in our dedicated article).

Fear not! There are numerous courses on their website that are completely free to take. Some are developed by the guys at DataCamp themselves, while most are known as open courses, which are created by other individuals and groups. All you need to access these courses is a free DataCamp account, and off you go.

Below, you will find a link (disclaimer: these are affiliate links) to each of the free courses on DataCamp, broken down by the corresponding platform, as well as a brief overview of what to expect in each one.

Happy learning!

The complete list of free DataCamp courses

Python

Kaggle, owned by Google, is an online community of data scientists who use machine learning to come up with the best code to win competitions. Those who rise to the top of the leaderboards can earn some respectable prizes, including cash!

What is covered in the course?

  • A Kaggle-style exercise to predict the survival rate in the Titanic competition.
  • A brief introduction to Python.
  • Learn about the decision tree and random forest techniques to fine-tune your machine learning capabilities.

Data.world is a community where you can catalog, access and collaborate on all kinds of data in one place. The course is delivered by Rafael Pereira, a Director of Engineering over at data.world, who will introduce you to the platform.

What is covered in the course?

  • A brief introduction to the data.world platform.
  • Learn how to use Python to pull data from data.world.

This is a comprehensive free course (6 chapters!) which goes from the very basics of Python all the way to creating machine learning models. The aim of the course is to encourage students to participate in exciting machine learning competitions over at Analytics Vidhya.

What is covered in the course?

  • A brief introduction to the Python interface and writing your first line of code.
  • Exploring data with the use of Pandas.
  • Building predictive learning models to submit to the DataHack platform.

In this short, but sweet, course you will learn how to tidy messy data – an issue every Data Scientist has to battle with very frequently. The course is inspired by Hadley Wickham’s paper on Tidy Data which defines tidy and messy data.

What is covered in the course?

  • Tidying messy data with the melt function in Pandas.
  • How to rename columns in a dataset, through Pandas.
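
To give a feel for what "melting" means, here is a minimal sketch of the wide-to-long reshaping idea in plain Python, so it runs without installing anything. The `melt` and `rename_columns` helpers and the toy table are made up for illustration; in the course itself you would use the Pandas `melt` function on a DataFrame instead.

```python
def melt(rows, id_vars, var_name="variable", value_name="value"):
    """Reshape wide rows (a list of dicts) into long rows, one per measurement.

    Hypothetical helper that mimics the idea behind pandas' melt:
    columns not listed in id_vars become (var_name, value_name) pairs.
    """
    if not rows:
        return []
    value_vars = [column for column in rows[0] if column not in id_vars]
    long_rows = []
    for column in value_vars:  # one pass per melted column
        for row in rows:
            record = {key: row[key] for key in id_vars}
            record[var_name] = column
            record[value_name] = row[column]
            long_rows.append(record)
    return long_rows


def rename_columns(rows, mapping):
    """Return new rows with column names replaced according to mapping."""
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]


# Made-up wide table: one column per year, which is "messy" in the tidy-data sense
# because the year is stored in the column names rather than in a column of its own.
wide = [
    {"country": "A", "2017": 10, "2018": 12},
    {"country": "B", "2017": 7, "2018": 9},
]

tidy = melt(wide, id_vars=["country"], var_name="year", value_name="cases")
tidy = rename_columns(tidy, {"cases": "n_cases"})

for row in tidy:
    print(row)
```

After melting, each row holds exactly one observation (a country, a year and a count), which is the layout the tidy data paper argues for.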

R

This course complements the material from the University of Amsterdam’s Basic Statistics course at Coursera. It is aimed at those who are interested in using R to perform statistical analyses. Oh, and don’t worry if you are a complete beginner to R, Basic Statistics assumes you have no or very little knowledge of the programming language. I certainly recommend taking this free DataCamp course.

What is covered in the course?

  • The basics of R and writing your first lines of code.
  • How to import data and packages into R.
  • Exploring data through data visualisation.
  • Performing summary statistics (e.g. mean, median, mode).
  • How to perform correlation and regression analyses.
  • Probability and hypothesis testing.
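
If you want a quick preview of the kind of calculations involved, the sketch below computes summary statistics, a Pearson correlation and a simple least-squares regression line. It is written with Python's standard library rather than R, purely to keep the examples in this article in one language; the data and the helper names `pearson_r` and `least_squares` are invented for illustration and are not taken from the course.

```python
import math
import statistics

# Made-up sample: hours studied and exam score for six students.
hours = [1, 2, 2, 3, 4, 6]
scores = [52, 58, 60, 65, 70, 85]

# Summary statistics
print("mean:", statistics.mean(hours))
print("median:", statistics.median(hours))
print("mode:", statistics.mode(hours))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares(x, y):
    """Slope and intercept of the simple linear regression of y on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


r = pearson_r(hours, scores)
slope, intercept = least_squares(hours, scores)
print("correlation:", round(r, 3))
print("regression: score = %.4f * hours + %.4f" % (slope, intercept))
```

In R, the equivalent steps are typically done with built-in functions such as `mean()`, `median()`, `cor()` and `lm()`.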

This course was developed by the guys at Microsoft. In it, you will be introduced to the RevoScaleR package, which is a machine learning package specifically designed to deal with big data.

What is covered in the course?

  • How to summarize, cross-tabulate and visualise variables from a large dataset.
  • Learn how to manipulate and transform variables using RevoScaleR.
  • Performing statistical analyses; including linear and logistic regression, k-means clustering and decision tree estimation.

The Causal Inference with R – Introduction is the first course in a 7-part series delivered by Duke University (with some help from eBay). Causal inference is the process of drawing conclusions about whether one variable actually causes a change in another, rather than the two merely being associated.

What is covered in the course?

  • An introduction to the basic concepts behind causal inference.
  • Exploring the issues of confounders, counterfactuals and p-hacking.

The next Causal Inference course in the series focuses on regression analysis in R.

What is covered in the course?

  • An introduction to regression analysis to find causal effects.
  • How to compute regression analyses.
  • How to perform matching methods to find causal effects.

This course focuses on instrumental variable and regression discontinuity analysis to find causality through indirect inference.

What is covered in the course?

  • The ability to practice instrumental variable analysis.
  • An introduction to regression discontinuity design (RDD), which is considered more flexible than instrumental variables.

This is the fourth course in the series on causal inference by Duke University, in which you will be introduced to panel data. Panel data are multi-dimensional data that contain measurements over time.

What is covered in the course?

  • A beginner's look into what panel data is.
  • A deeper look into panel data and the factors to be aware of when working with it.

This course complements the Coursera course Data Analysis and Statistical Inference by Mine Çetinkaya-Rundel. Over nine chapters, this comprehensive course starts from the very basics of R and ends with a chapter on regression analyses. This is another perfect course for those who are interested in statistical analysis but have zero R knowledge.

What is covered in the course?

  • An introduction to R and data visualisation.
  • Learn about probability and sampling distributions.
  • Perform different statistical tests on numerical and categorical data.
  • An introduction to performing linear and multiple linear regression analysis.

In this course, you will use data from the 2013 American Community Survey to ultimately decide whether it is useful to pursue a PhD or not – the results of which will be of interest to many of you! Combined with data visualization, these Kaggle challenges make learning R fun and interesting. You will soon be building up your own Kaggle scripts to use on your own Kaggle account.

What is covered in the course?

  • Using interesting datasets to answer specific questions through R.
  • Look deeper into pigeon racing data and assess the optimal size for effective chopsticks (yes, you read that right!).

DrivenData is a website where data scientists compete to come up with the best statistical models to predict some difficult challenges. In this course, you will start to use machine learning in R to help predict which water pumps throughout Tanzania are functional, in need of repair or do not work at all.

What is covered in the course?

  • An introduction to the R interface in DataCamp and the DrivenData dataset.
  • Perform your first machine learning script by using random forest analysis.

This is a relatively short course – only two chapters long – so it will only take a few minutes to complete. The course covers two powerful packages in R: dplyr and Plotly. The former is useful when sorting data files, while Plotly is a nifty tool used to create interactive graphs and figures. Have a play around with it in the course to see its potential.

What is covered in the course?

  • Learn how to tidy messy data with the dplyr package.
  • Use Plotly to create fancy graphs, including heatmaps.

Swirl works directly in the R console to teach you aspects of coding. Unlike the simple interface of DataCamp, swirl works within R Studio, so this may look slightly strange if you have never used it before.

In this course, there are three lessons which are focussed on the principle of presenting data effectively and displaying analytical results.

What is covered in the course?

  • Understand the basic plotting systems available in R.
  • Use the ggplot2 package to enhance your graph creations.
  • Learn how to display multidimensional data through clustering techniques.

In the Exploring Polling Data short course, you will refine and visualise polling data from the 2016 Republican and Democratic campaigns in the USA.

What is covered in the course?

  • Use R’s base plot to create simple graphs.
  • An introduction to ggplot2 for creating graphs with more flexibility than the base plot.
  • Learn how to use the googleVis package to create an interactive map.

A major job a data scientist must do on a regular basis is to prepare big datasets for further analyses. In this course, you will learn how to use the dplyr and tidyr packages to clean up messy data.

What is covered in the course?

  • Learn how to manipulate data in R by using dplyr.
  • Understand the capabilities of the tidyr and lubridate packages to clean up messy data.

In this short course, you will learn about the googleVis package in R. Unlike standard graphs, googleVis enables the creation of interactive online graphs to enhance the visualisation of the data. After watching an introductory TED talk from Hans Rosling, a world-leading statistician and public speaker, you will soon be loading data to create your own interactive graphs.

What is covered in the course?

  • Learn how to create your very own interactive graphs by using googleVis.

Quandl is a website that offers free and open financial, economic and social datasets. This resource is incredibly useful when wanting to flex your data science knowledge.

What is covered in the course?

  • Understand how to import data from the Quandl server into your own R workspace.
  • Learn how to manipulate the Quandl datasets.

This DataCamp course complements the Inferential Statistics Coursera course. This is another course aimed at beginners to R who are interested in statistical analysis, which I totally recommend starting out with.

What is covered in the course?

  • An introduction to the very basics of R. Start out with baby steps of simple arithmetic tasks before venturing into multiple lines of code.
  • Performing the appropriate statistical tests on a range of data types.
  • Understand simple and multiple regression models.
  • Learn how to perform ANOVA tests to investigate differences between multiple experimental groups.
  • Performing non-parametric testing on datasets which are not normally distributed.

This course complements the Introduction to Probability and Data Coursera course, delivered by Mine Çetinkaya-Rundel. Unlike many of the DataCamp courses which have a clean, interactive interface, the Introduction to Probability and Data course looks exactly like R Studio.

What is covered in the course?

  • Learn about the concept of probability testing.

The Intro to Computational Finance course includes a lot of material, spanning eight chapters. This course has a strong mathematical basis and draws its examples from financial markets.

What is covered in the course?

  • Learn about return calculations by analyzing and plotting graphs.
  • Compute probability distributions.
  • Understand what bivariate distributions are by performing correlation analyses.
  • Use the PerformanceAnalytics, zoo and tseries packages to analyze and visualise stock returns.

This course is identical to the Python version. In it, you will use aspects of machine learning from Kaggle’s Titanic exercise to predict survival rate. Both decision trees and random forests will be covered, which are both fundamentals of machine learning.

What is covered in the course?

  • A Kaggle-style exercise to predict the survival rate in the Titanic competition.
  • A brief introduction to R.
  • Learn about decision tree and random forest techniques to fine-tune your machine learning capabilities.

As the name suggests, this is the perfect course for absolute beginners to R: no experience is required! The course instructor, Annika Salzberg, does a fantastic job of breaking down the R language. Annika covers installing packages, understanding the types of data structures, as well as data cleaning and visualisation. However, note that this course is purely video-based, so no exercises through DataCamp’s interface can be performed.

What is covered in the course?

  • An introduction to R Studio and its interface.
  • How to install packages in R.
  • Learn about importing and cleaning data.
  • Understand how to use ggplot2 to create graphs.

In this swirl course, you will start at the very start of your R journey. This is another place to start for those who have not used R before. In the R Studio workspace, you will create your first scripts and ultimately start working on real-world datasets.

What is covered in the course?

  • An introduction to R Studio and how files are used.
  • Exploring the different types of data structures, such as vectors and data frames.
  • Start working with real-world datasets to put your newly acquired coding language to good use.

As the name suggests, this is a fun short course which uses the power of R and the knowledge of data science to search through Yelp reviews for restaurants to decide where to eat. This course is based on a popular Springboard blog post regarding Yelp Review Modifications. You will quickly learn to explore the raw data and then generate weighted star reviews to provide a better metric when making that all-important restaurant choice.

What is covered in the course?

  • An introduction to R in DataCamp and how to explore data.
  • Learn how to create new variables to generate weighted star review scores.

This course will introduce you to Hadley Wickham’s readr package. Delivered by the man himself, Hadley will talk you through importing and exporting datasets from R using readr.

What is covered in the course?

  • Learn how to import CSV and txt files into R.
  • Understand how importing can sometimes go wrong and how you can go about changing column types.

Regression analysis is a set of statistical processes to estimate the relationship between variables. Through the use of the R Studio interface, this course provided by swirl focusses on all things regression, including simple and multivariate types. In addition, in-depth topics such as overfitting, residuals and binary models are explored.

What is covered in the course?

  • An introduction to what regression analysis is.
  • Learn how to perform and interpret multivariate regressions in R.
  • Understand how models can be under- and over-fitted.

This course is swirl’s version of statistical inference. Over four chapters, you will soon understand the basics of probability, characterising data variance, hypothesis testing and power calculations. This is another useful learning resource for academics who need to understand statistical testing in R.

What is covered in the course?

  • An introduction to probability testing and how to look at variances in data.
  • Understand what hypothesis testing is and that all-important p-value.
  • The concept of power analysis will be explored to understand the reliability of data.

Spreadsheets (e.g. Microsoft Excel)

In this course, you will learn the basics of spreadsheet programs such as Microsoft Excel and Google Sheets. Most of this may seem a little trivial if you have ever used Excel before, since its purpose is to provide a first introduction to spreadsheets.

What is covered in the course?

  • Understand how spreadsheet software works by organising data into cells.
  • Write your own formulas, such as calculating percentages.
  • Learn how these programs use cell references.

This course goes a little further in understanding how spreadsheets can be used to enhance your data science skills. Understand the predefined formulas already at your disposal, which will make working with large datasets a whole lot easier.

What is covered in the course?

  • An overview of the different formulas available in spreadsheet programs, such as ROUND and SQRT.
  • Learn about conditional functions and lookups.

SQL

Karlijn Willems, a Data Scientist from DataCamp, is the instructor for this SQL course. The focus is on using SQL to undertake database manipulations in a marketing context.

What is covered in the course?

  • Explore how to retrieve data into the SQL workspace and summarize it.
  • Learn how to filter the parts of the data that you are interested in, as well as how to combine data.
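
As a rough illustration of filtering, summarizing and combining data with SQL, here is a small sketch that runs queries against an in-memory SQLite database from Python. The `campaigns` and `orders` tables and their contents are invented for this example and are not the dataset used in the course.

```python
import sqlite3

# In-memory database with two made-up marketing tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaigns (id INTEGER PRIMARY KEY, channel TEXT);
CREATE TABLE orders (campaign_id INTEGER, amount REAL);
INSERT INTO campaigns VALUES (1, 'email'), (2, 'social'), (3, 'search');
INSERT INTO orders VALUES (1, 20.0), (1, 35.0), (2, 15.0), (3, 50.0), (3, 10.0);
""")

# Filtering: keep only the orders worth at least 20.
large_orders = conn.execute(
    "SELECT campaign_id, amount FROM orders WHERE amount >= 20 ORDER BY amount"
).fetchall()

# Combining and summarizing: join orders to their campaign,
# then count orders and total revenue per channel.
revenue = conn.execute("""
    SELECT c.channel, COUNT(*) AS n_orders, SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN campaigns AS c ON c.id = o.campaign_id
    GROUP BY c.channel
    ORDER BY revenue DESC
""").fetchall()

print(large_orders)
for channel, n_orders, total in revenue:
    print(channel, n_orders, total)

conn.close()
```

The `WHERE` clause does the filtering, `JOIN` combines the two tables, and `GROUP BY` with `COUNT` and `SUM` produces the summary.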

Conclusion

There are more free DataCamp courses out there than you may be aware of. I hope this list has provided you with an idea of some alternative resources to take advantage of in your data science journey.
