In this Udacity Data Engineering nanodegree review, I will share my experience of taking this course along with its pros and cons.
Planning to enroll in Udacity’s Data Engineering Nanodegree? A few months back I had the same thought. I hope my feedback on the program will help you make the right choice. Here’s my story.
I studied Computer Science Engineering, and during college I became interested in the world of Data Science. So I focused my internships on this topic and learned the essential skills required for the roles of Data Scientist, Data Analyst, and Data Engineer, the three magic roles in the world of Data Science.
After I graduated, I received a job opportunity as a Data Engineer at a biotechnology company. Working there, I came to know the tools and techniques they used, such as Airflow, Spark, and Apache Cassandra.
These tools appealed to me because my background was mostly in Software Engineering and Development. I realized that I had a knowledge gap in Data Modeling, data processing engines, and some of the best practices for working with data.
Since all of these topics were perfectly covered in the Data Engineering Nanodegree from Udacity, I decided to apply.
In this Udacity Data Engineering Nanodegree review, I will go through the syllabus in detail and describe my experience with the projects.
If you buy the course through the links in this article, we may earn an affiliate commission. It helps us keep this blog running. 🙏🙏
How Much Did I Pay For Udacity’s Data Engineering Nanodegree?
As for the pricing, I was not that lucky. When I wanted to enroll, there was no valid offer or discount available. That’s why I paid the full price for a 5-month subscription.
It was €1400, which is around $1600. Without an offer, it is very overpriced, since you are paying for just 5 months, assuming you don’t finish earlier.
You do get access to paid services, but I don’t recommend paying such a high price. If you get a discount of 50% or even 70%, go for it! The outcome will be worth it.
Duration: 5 months (5 hours/week)
Let me shed some light on the course structure and the projects.
Which Topics Are Covered In Udacity’s Data Engineer Nanodegree?
Overall, I liked the syllabus and how the modules are structured, each ending with a realistic project.
They go from the basics to the most demanding and complex projects, and you always stay connected to the main topic.
Lesson 1: Introduction to Data Modeling
In this module, you will learn the key differences between a relational and a non-relational data model. It is a great start to the NoSQL world.
This is very important. You need to fully understand this concept to make good decisions when working as a Data Engineer in a team.
The required task was to create a table for our database. In the first query, which follows standard SQL, we build a table to fit our data model.
In the second query, the objective is to create a table that fits the query. That is one of the main differences between SQL and NoSQL databases.
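To make the contrast concrete, here is a minimal sketch of the two kinds of table definitions. It is not the course’s actual exercise: the table and column names are hypothetical, SQLite from Python’s standard library stands in for a relational database, and the Cassandra statement is only shown as a CQL string because running it would need a live cluster.

```python
import sqlite3

# Relational approach: the table is shaped by the data model.
# Table and column names are hypothetical.
SQL_CREATE = """
CREATE TABLE IF NOT EXISTS songs (
    song_id   INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    artist_id INTEGER NOT NULL,
    year      INTEGER
)
"""

# NoSQL (Cassandra) approach: the table is shaped by the query it serves,
# here "give me all songs by a given artist, newest first".
# Shown as a CQL string only; it is not executed in this sketch.
CQL_CREATE = """
CREATE TABLE IF NOT EXISTS songs_by_artist (
    artist_name text,
    year        int,
    title       text,
    PRIMARY KEY ((artist_name), year, title)
) WITH CLUSTERING ORDER BY (year DESC, title ASC)
"""

# Run the relational statement against an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(SQL_CREATE)
conn.execute("INSERT INTO songs VALUES (1, 'Example Song', 10, 2020)")
rows = conn.execute(
    "SELECT title, year FROM songs WHERE artist_id = 10"
).fetchall()
print(rows)
```

In the CQL version, the partition key (`artist_name`) is chosen because the query filters on it, which is the "table that fits the query" idea.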
Project 1: Data Modeling with Postgres
This is the first project of the Nanodegree. From this project onwards, you model the data with a schema pattern that is common in Business Intelligence: dimension tables around one fact table (a star schema).
This project uses a relational data model. It is a beginner-friendly project where you can familiarize yourself with Udacity’s working methodology.
If you have experience with SQL and Python, this project will be pretty straightforward. You just need to read the documentation of the Python library psycopg2.
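As a rough illustration of the fact-and-dimension idea, the sketch below builds one fact table and two dimension tables and joins them. The project itself uses Postgres through psycopg2; here SQLite from the Python standard library is swapped in so the example runs without a database server, and all table names, columns, and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two dimension tables describe the "who" and the "what".
cur.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE songs (song_id INTEGER PRIMARY KEY, title TEXT)")

# The fact table records events and points at the dimensions.
cur.execute(
    """
    CREATE TABLE plays (
        play_id   INTEGER PRIMARY KEY,
        user_id   INTEGER REFERENCES users (user_id),
        song_id   INTEGER REFERENCES songs (song_id),
        played_at TEXT
    )
    """
)

# Made-up sample rows.
cur.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ana"), (2, "Luis")])
cur.executemany("INSERT INTO songs VALUES (?, ?)", [(10, "Song A"), (11, "Song B")])
cur.executemany(
    "INSERT INTO plays VALUES (?, ?, ?, ?)",
    [
        (100, 1, 10, "2020-11-01T10:00:00"),
        (101, 1, 11, "2020-11-01T10:05:00"),
        (102, 2, 10, "2020-11-02T09:30:00"),
    ],
)
conn.commit()

# A typical analytical query: plays per user, resolved through the dimension.
play_counts = cur.execute(
    """
    SELECT u.name, COUNT(*) AS n_plays
    FROM plays AS p
    JOIN users AS u ON u.user_id = p.user_id
    GROUP BY u.name
    ORDER BY u.name
    """
).fetchall()
print(play_counts)
```

With psycopg2 the flow is similar (connect, get a cursor, execute, commit), although its placeholder style is `%s` rather than `?`.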
Project 2: Data Modeling with Apache Cassandra
The second project is Data Modeling with Apache Cassandra. Here you do the same task as in the previous project, but this time you have to follow the best practices for creating a NoSQL database.
The results of the two projects might be the same. The point is to learn how to reach them in different ways, using the best practices of each approach.
I liked the fact that the difficult part of these projects was not the database or the data. It was the steps you need to follow to change your mindset: “It’s NoSQL, don’t think in SQL”.
Lesson 2: Cloud Data Warehouses
Lesson 2 is about the cloud and data warehouses. In this lesson you get in touch with cloud computing on AWS infrastructure. You will learn about EC2 machines, S3 buckets, and Redshift.
Later, whenever I was learning something on my own, I would end up testing it on AWS, thanks to what I did and learned in this course.
Also, Udacity gives you $25 in credits to use on AWS. Cool, isn’t it?
Project: Build a Cloud Data Warehouse
The project is to build a data warehouse on the AWS cloud. To do so, you need to orchestrate the interaction between S3 buckets and your Redshift database.
To be honest, this project was a breakthrough for me: after it, I was no longer afraid of building a cloud data warehouse for my own projects.
You can also check out the screenshots above. You will need to configure the AWS instance and connect it to your project workspace.
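The interaction between S3 and Redshift usually comes down to a COPY statement that tells Redshift which files to load and which role to use. The sketch below only builds such a statement as a string. The helper name `build_copy_statement`, the table, bucket, and role ARN are all placeholders I made up, not values from the course.

```python
def build_copy_statement(table, s3_path, iam_role_arn, region="us-west-2"):
    """Return a Redshift COPY statement that loads JSON files from S3.

    Hypothetical helper for illustration only. It interpolates its
    arguments directly, so never pass untrusted input to it.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"REGION '{region}'\n"
        "FORMAT AS JSON 'auto';"
    )


# Placeholder values, not real AWS resources.
copy_sql = build_copy_statement(
    table="staging_events",
    s3_path="s3://example-bucket/log_data",
    iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-role",
)
print(copy_sql)
```

In a real project you would send this statement to Redshift through a database connection, for example with psycopg2. That part is left out here because it needs a running cluster.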
Lesson 3: Spark and Data Lakes
Lesson 3 covers Spark and data lakes. In this lesson, you will learn about the advantages of using Spark as your data processing engine.
You will learn to use it for cleaning and aggregating data. It is one of the most popular tools for working with data.
I was not familiar with technologies like Spark and Hadoop. It was great to learn about them and about the state-of-the-art technologies used in the Data Science industry.
Project: Build a Data Lake
The project is to build a data lake, and the task itself is quite simple. You load data from S3 into in-memory tables, model it, and write it back to S3. In this project, you will experience the power of Spark.
I found this project quite useful. However, you can barely see how fast Spark is, or how big the difference between plain Python and Spark becomes when modeling big datasets.
As you can see in the images, the project template description specifies that you have to use Spark. This technology sits on top of the other technologies and tools you learned in the previous lessons. By the end, you will have built a different mindset for structuring and developing your future projects.
Lesson 4: Data Pipelines with Airflow
In this lesson you will learn about Airflow, a tool with a simple interface for designing and monitoring pipelines of automated tasks.
It is an essential tool when your work focuses on designing, developing, and maintaining pipelines.
Project: Data Pipelines with Airflow
In this project you use Airflow to structure and monitor your pipelines. You also build a data warehouse similar to the one from the Lesson 2 project, with the extra complexity of using Airflow.
I found this project quite challenging because it requires the knowledge from the previous projects, and you have to add Airflow on top of Redshift and S3 buckets. In the end, though, it was very satisfying to complete.
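Airflow’s core idea is that a pipeline is a directed acyclic graph (DAG) of tasks, and a task runs only after the tasks it depends on have finished. The sketch below shows just that ordering idea with Python’s standard-library `graphlib` (Python 3.9+). It does not use Airflow’s API, and the task names are hypothetical, loosely mirroring an S3 to Redshift flow.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
# Task names are hypothetical.
dependencies = {
    "stage_events_to_redshift": {"begin"},
    "stage_songs_to_redshift": {"begin"},
    "load_fact_table": {"stage_events_to_redshift", "stage_songs_to_redshift"},
    "load_dimension_tables": {"load_fact_table"},
    "run_quality_checks": {"load_dimension_tables"},
}

# static_order() yields every task after all of its dependencies.
# The two staging tasks are independent, so their relative order may vary.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

A scheduler like Airflow adds what this sketch lacks: retries, scheduling, parallel execution of independent tasks, and a web interface for monitoring the runs.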
Lesson 5: Capstone Project
Project: Data Engineering Capstone Project
This was the final project, the Capstone, and I chose the topic myself. I decided to create a report with agricultural data. The project covered an entire closed-loop scenario. For simplicity, I used just one Python notebook. The project can be divided into 5 steps:
- Gather data from public sources (in my case, http://www.fao.org/faostat/en/#data)
- Load the tabular data, clean it, rearrange it and create the data model in-memory
- Load the data model into the Redshift data warehouse using the in-memory tables and the S3 buckets for batch processing the fact table
- Ensure the process was done correctly by running data quality checks on your database data model
- Create a visual report using a reporting tool (I chose Power BI)
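The data quality step in the list above can be as simple as a couple of assertions on each table. Here is a minimal sketch, assuming two typical checks (the table is not empty, and a key column has no NULLs). It is not my actual capstone code: SQLite stands in for Redshift, and the function name, table, and rows are made up for illustration.

```python
import sqlite3


def run_quality_checks(conn, table, not_null_column):
    """Return a list of human-readable failures; an empty list means success.

    Hypothetical helper: names are interpolated directly into the SQL,
    so only call it with trusted table and column names.
    """
    failures = []

    # Check 1: the load produced at least one row.
    row_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if row_count == 0:
        failures.append(f"{table} is empty")

    # Check 2: the given column contains no NULL values.
    null_count = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {not_null_column} IS NULL"
    ).fetchone()[0]
    if null_count > 0:
        failures.append(f"{table}.{not_null_column} has {null_count} NULL value(s)")

    return failures


# Made-up sample data with one deliberately missing value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crop_production (country TEXT, crop TEXT, tonnes REAL)")
conn.executemany(
    "INSERT INTO crop_production VALUES (?, ?, ?)",
    [("Spain", "Olives", 100.0), ("Spain", "Wheat", None)],
)

failures = run_quality_checks(conn, "crop_production", "tonnes")
print(failures)
```

Returning a list of failures instead of raising on the first one makes it easier to report everything that went wrong in a single run.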
The resulting report is shown below.
That covers the lessons and projects of the Udacity Data Engineering Nanodegree.
How was my project experience?
Every project added one more step on top of the previous ones.
Personally, the most challenging part of the lessons and projects was the Airflow project, because it requires you to combine your knowledge of Apache Airflow with Redshift and S3 buckets. You can also add Spark to it if you wish.
How much time did it take me to complete this Nanodegree?
It took me a bit longer than expected to complete this Data Engineering Nanodegree. I enrolled in October 2020, but because of work I was not able to focus on the course until January. To finish the Nanodegree in March, I paid for one extra month with an offer, an additional 80€.
Honestly, the actual work took me around 2 to 3 months. After my experience, I think that if you are willing to go 100% for the course, you should choose a monthly payment plan instead of the 5-month subscription.
If you are studying this Nanodegree in parallel with anything else, I strongly recommend the 5-month subscription, as speedrunning the projects will only make you struggle with the project requirements.
There are several features of Udacity that I should mention in this Udacity Data Engineering Nanodegree review.
First, the mentorship. I did not approach a mentor directly for advice, but I did submit my GitHub repository and LinkedIn profile for review, and I don’t think the feedback was anything really special.
If you want to improve your GitHub repo or your LinkedIn profile, you can do 1 to 2 hours of research and look at 20 or more profiles. You will learn as much as your mentor would teach you. Still, the overall mentorship was very straightforward, and I appreciate the effort the mentors put in.
The project reviewers were outstanding communicators. They reviewed our projects with a very positive attitude and pointed out the areas that needed improvement, especially when a submission did not pass the requirements.
I was crystal clear about what I wanted to do before, during, and after the course, so I did not opt for the career services.
Pros and Cons of Udacity’s Data Engineering Nanodegree
My experience with the Udacity Data Engineering Nanodegree was filled with positivity, the kind that radiates when you communicate with your mentor, and of course, let’s not forget the tools and technologies that Udacity lets you work with.
Given the current circumstances, most of us are at home, learning from our bedrooms through online tutorials or platforms that offer little more than a bunch of videos.
It is very reassuring to have personal reviews and monitoring, to ensure we are on the right path and not messing around with random tools.
That said, this Nanodegree is quite expensive if we think about the service itself. We are just buying videos and predefined guidelines.
In the end, it’s worth it because of the knowledge you get, the flexibility for studying, and the personal monitoring. Thinking just about the content, I am still a bit skeptical.
Is Udacity’s Data Engineer Nanodegree Worth it?
Before concluding this Udacity Data Engineering review, I want to ask you a few basic questions:
- Is your professional background lacking specific knowledge of technologies, techniques, and software that are widely used nowadays?
- Do you feel that you want to work as a Data Scientist, Data Analyst, or Data Engineer?
If yes, then you should go for this Data Engineering Nanodegree.
But you should look for an offer if you don’t want to pay the full price of the Nanodegree, because €1400 is too expensive. Even before finishing, I could see the difference when applying to Data Engineer job offers.
I hope you found this Udacity Data Engineer Nanodegree review useful.
Alberto Baraza Barnes
I am a Data Engineer and Computer science student passionate about Data Science. I have 2 years of experience working for public and private entities.
I can say this Nanodegree has played a vital role in shaping my career as a Data Engineer. Today I am working at Glovo as a Data Engineer. If I can find a job, surely you can too.
The average data engineer earns up to $129,000/year in the United States (source: Indeed).
It’s possible to complete this Nanodegree in a month, provided you are familiar with data engineering. A beginner should dedicate 3-4 months.
It all depends on which role you want to go with. I suggest looking at the job roles first and then deciding. Both Nanodegrees are among Udacity’s most enrolled courses.