Data Scientist or Data Engineer? Choose your path on Udacity

Nothing can stop the flow of data – 2.5 million terabytes are created every day, so storing, organizing and analyzing data is more important than ever. Udacity has updated the Nanodegree programs offered by its School of Data Science, and they start on September 23.


Udacity's School of Data Science now has a total of thirteen Nanodegree programs, eleven of which are flagged as new. It also identifies paths to specific careers, with two divergent routes for developers who are interested in data: one leading to the role of Data Scientist and the other to Data Engineer.

Data Scientists can be seen as those who make sense of data, present the information it contains, and help make decisions based on it. In paving the way for this career, Udacity reminds us:

There is a shortage of qualified Data Scientists in the job market, and people with these skills are in high demand. Build skills in programming, data processing, machine learning, experiment design, and data visualization, and launch a career in data science.

The advanced-level Udacity Data Scientist Nanodegree is the last step on the path if you want to pursue a career in this role. Learning begins with Programming for Data Science with Python. This Nanodegree, whose duration is estimated at 3 months, is at the beginner level. It covers Python fundamentals and introduces you to basic data programming tools, including SQL, as well as version control with Git.

The intermediate step is the Data Analyst Nanodegree, a 4-month program in which you use Python, SQL, and statistics to uncover insights, communicate critical findings, and build data-driven solutions. Its modules and projects (project titles are in capitals) are as follows:

  • Introduction to Data Analysis
    Learn the data analysis process of wrangling, exploring, analyzing and communicating data. Work with data in Python, using libraries like NumPy and Pandas.
    EXPLORE WEATHER TRENDS
    INVESTIGATE A DATA SET
  • Practical Statistics
    Learn how to apply inferential statistics and probability to real-world scenarios, such as analyzing A/B tests and building supervised learning models.
    ANALYZE EXPERIMENT RESULTS
  • Data Wrangling
    Learn the process of gathering, assessing and cleaning data. Learn how to use Python to wrangle data programmatically and prepare it for analysis.
    WRANGLE AND ANALYZE DATA
  • Data Visualization with Python
    Learn how to apply visualization principles to the data analysis process. Visually explore data at multiple levels to find insights and create a compelling story.
    COMMUNICATE DATA RESULTS

Before embarking on the 4-month Data Scientist Nanodegree you also need a foundation in machine learning. The program contains these modules and projects, as well as a final capstone project to bring everything together:

  • Solve data science problems
    Learn the data science process, including how to create effective data visualizations and how to communicate with different stakeholders.
    WRITING A DATA SCIENCE BLOG POST
  • Software Engineering for Data Scientists
    Develop software engineering skills that are essential for data scientists, such as writing unit tests and building classes.
  • Data Engineering for Data Scientists
    Learn how to work with data throughout the data science process, from running pipelines and transforming data to building models and deploying solutions to the cloud.
    BUILD PIPELINES TO CLASSIFY MESSAGES WITH FIGURE EIGHT
  • Experimental Design and Recommendations
    Learn how to design experiments and analyze A/B test results. Explore approaches to building recommender systems.
    DESIGNING A RECOMMENDATION ENGINE WITH IBM


The alternative career path you might want to follow leads to the role of Data Engineer. According to Sam Nelson, Product Manager of Udacity’s School of Data Science:

Data engineers build the engines that help companies make sense of all of this. They are essential to any company’s data strategy. Without the right infrastructure, you can collect data, but it just sits there taking up space.

The first step on this path is again the beginner-level Nanodegree Programming for Data Science with Python. The second, at the intermediate level, is the Data Engineer Nanodegree, which is designed to show you how to understand the data ecosystem, give you the right tools to navigate it, and enable you to apply what you learn through practical, portfolio-ready projects. This is a 5-month program with the following modules and projects, plus a final capstone project to bring it all together:

  • Data Modeling
    Learn how to build relational and NoSQL data models to meet the diverse needs of data consumers. Use ETL to create databases in PostgreSQL and Apache Cassandra.
    DATA MODELING WITH POSTGRES
    DATA MODELING WITH APACHE CASSANDRA
  • Cloud Data Warehouses
    Sharpen your data warehousing skills and deepen your understanding of data infrastructure. Build cloud-based data warehouses on Amazon Web Services (AWS).
    CREATE A DATA WAREHOUSE IN THE CLOUD
  • Spark and Data Lakes
    Understand the Big Data ecosystem and how to use Spark to work with large data sets. Store big data in a data lake and query it with Spark.
    CREATE A DATA LAKE
  • Data Pipelines with Airflow
    Schedule, automate, and monitor data pipelines using Apache Airflow. Run data quality checks, track data lineage, and work with data pipelines in production.
    DATA PIPELINES WITH AIRFLOW

The third stage of the path, at the advanced level, is the Data Streaming Nanodegree, which we reported on when it was initially launched in March 2020. Estimated at 2 months and with two courses and two projects, it is designed to teach you how to process data in real time by developing proficiency in modern data engineering tools such as Apache Spark, Kafka, Spark Streaming and Kafka Streaming.


More information

Udacity School of Data Science

Data Visualization Nanodegree

Data Engineer Nanodegree

Data Streaming Nanodegree

Data Scientist Nanodegree

Data Analyst Nanodegree

Programming for Data Science with Python

Programming for Data Science with R

SQL Nanodegree

Sean N. Ayres