Exploratory Data Analysis for Machine Learning Free Beginners Course

Artificial Intelligence

Courses

Beginner

Introduction

This course introduces learners to machine learning. It explains why data quality matters and why most projects depend on filtering data. It also provides an in-depth understanding of the techniques used today for retrieving, sorting, and cleaning data and for using it in feature engineering. The same techniques support preliminary analysis and hypothesis testing.

This is a five-module course that takes 14 hours to complete, and learners can set their own pace. Note, however, that this is not a beginner-level course: students need a fundamental understanding of machine learning concepts and data analysis theory.

What Will You Gain from This Course?

By the end of this course, students will be able to:

Retrieve and filter data from multiple sources, such as SQL and NoSQL databases, APIs, and the cloud.
Understand some of the fundamental feature selection techniques in depth.
Implement feature engineering techniques for data sorting and analysis.
Organize categorical and feature-based information and fill in missing values as required.
Use different techniques for highlighting variables and handling outliers.
Detect errors and clean data.
Understand why feature scaling is an essential task in data analysis.
Use different scaling techniques so that data set values fall within a specific range.

Skills Acquired:

Artificial Intelligence (AI)
Machine Learning
Feature Engineering
Statistical Hypothesis Testing
Exploratory Data Analysis
Data Scaling
Data sorting in SQL
Using NoSQL databases
Data filtering in APIs and Cloud

Who Can Benefit From This Course?

This course is designed for:

Aspiring data scientists with prior experience in the industry.
Data analysts who want to improve their productivity.
Data analysts wishing to have hands-on experience with machine learning and artificial intelligence.
Individuals working in financial institutions who aim to assess risk, develop investment strategies, and manage portfolios.
Healthcare enthusiasts researching drug development and diagnostics.
Marketing experts trying to understand customer demographics, target advertising campaigns, and measure the success of marketing initiatives.
Educators trying to track student progress, identify at-risk students, and develop personalized learning plans.

Course Content

5 Modules - 48 Videos - 11 Readings - 12 Quizzes - 2 Discussion Prompts - 9 App Items - 1 Peer Review - Certificate of Completion

A Brief History of AI and Its Applications

This module covers the overall timeline of AI and its real-life applications. It introduces some basic concepts of artificial intelligence, how it works, and some of the top AI tools in use today. It also looks at how machine learning has contributed to the overall development of AI. By the end, the learner will have an overview of AI and machine learning and a basic history of modern AI.

10 - Videos
2 - Readings
3 - Quizzes
1 - Discussion Prompt

Retrieving and Cleaning Data

The second module teaches some basic ways to filter and clean data. Good data is essential to any machine learning or AI project. In this module, students will learn different ways to retrieve data from multiple data sources and libraries.

Important concepts include retrieving data from sources such as CSV files, JSON files, databases, APIs, and the cloud. Students will also polish their data preparation skills, including data cleaning and handling missing values and outliers.
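The retrieval and cleaning steps named above can be illustrated with a minimal sketch. It is not taken from the course material: the sample data, the `id` and `age` column names, and the helper functions are all hypothetical, and median imputation and the 1.5 x IQR rule are just one common choice for handling missing values and outliers. Only the Python standard library is used.

```python
import csv
import io
import json
import statistics

# Hypothetical sample data; the "id" and "age" columns are illustrative only.
CSV_TEXT = "id,age\n1,34\n2,\n3,29\n4,31\n5,120\n"
JSON_TEXT = '[{"id": 6, "age": 33}, {"id": 7, "age": null}]'


def load_rows(csv_text, json_text):
    """Combine rows from a CSV string and a JSON string into one list of dicts."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        # An empty CSV cell is treated as a missing value.
        age = float(rec["age"]) if rec["age"] else None
        rows.append({"id": int(rec["id"]), "age": age})
    for rec in json.loads(json_text):
        # JSON null arrives as None, which already marks a missing value.
        age = None if rec["age"] is None else float(rec["age"])
        rows.append({"id": int(rec["id"]), "age": age})
    return rows


def fill_missing_with_median(rows, key):
    """Return new rows where missing values of `key` are replaced by the median."""
    known = [r[key] for r in rows if r[key] is not None]
    median = statistics.median(known)
    return [{**r, key: median if r[key] is None else r[key]} for r in rows]


def iqr_outliers(rows, key, k=1.5):
    """Return rows whose `key` lies more than k * IQR outside the quartiles."""
    values = [r[key] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [r for r in rows if r[key] < low or r[key] > high]


rows = fill_missing_with_median(load_rows(CSV_TEXT, JSON_TEXT), "age")
outliers = iqr_outliers(rows, "age")
print(outliers)
```

In practice a library such as pandas would usually handle these steps, but the plain-Python version makes each decision (what counts as missing, what replaces it, what counts as an outlier) explicit.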

7 - Videos
3 - Readings
3 - Quizzes
3 - App Items

Exploratory Data Analysis and Feature Engineering

This is the third module of this course and requires four hours to complete. The student will learn about conducting exploratory analysis to visually justify whether a dataset is ready for machine learning modeling through feature engineering and transformations.

This section focuses mainly on data visualization, analysis, and grouping. Some of the basic concepts in this module include Exploratory Data Analysis (EDA), EDA with Visualization, Grouping Data for EDA, Feature Engineering and Variable Transformation, Feature Encoding, and Feature Scaling.
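As a rough illustration of feature encoding and feature scaling, the sketch below implements one-hot encoding, min-max scaling, and standardization in plain Python. The function names and the `is_` prefix for encoded columns are assumptions made for this example and do not come from the course.

```python
import statistics


def one_hot(values):
    """Encode each categorical value as a dict of 0/1 indicator columns."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def min_max_scale(values):
    """Rescale values linearly so they fall in the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


colors = ["red", "blue", "red"]
incomes = [10, 20, 30]
print(one_hot(colors))
print(min_max_scale(incomes))
print(standardize(incomes))
```

Min-max scaling suits cases where values must fall in a fixed range, while standardization is often preferred when a feature has no natural bounds; which one fits depends on the model and the data.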

15 - Videos
3 - Readings
3 - Quizzes
4 - App Items

Inferential Statistics and Hypothesis Testing

The fourth module of this course focuses on inferential statistics and hypothesis testing. It will take just one hour to complete. This section of the course is all about quality assurance, so by the end of this module, students will be able to analyze whether the quality of the data is good enough.

Inferential statistics and hypothesis testing are important data analysis tools, yet they are often overlooked, which eventually affects the quality of the data. This module teaches both so that students can gauge data quality for themselves.

Both tools also help validate business intuition and indicate what to analyze next using machine learning. By the end of this module, the learner will have a much better understanding of key data analysis terms, along with their application and use. Students will also get hands-on experience in forming hypotheses around business problems and testing them.
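To make the idea of testing a hypothesis around a business problem concrete, here is a minimal sketch of a two-sided permutation test for a difference in means. The permutation test is chosen here only because it needs nothing beyond the standard library; the course may well use other tests. The order values and the two page variants are invented for illustration.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Null hypothesis: both groups come from the same distribution, so the
    group labels are interchangeable.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign labels at random and recompute the difference in means.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical order values for two checkout page variants.
variant_a = [10, 11, 12, 13, 14, 15]
variant_b = [20, 21, 22, 23, 24, 25]
p_value = permutation_test(variant_a, variant_b)
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the observed difference would be unlikely if the variant made no difference; the significance threshold is a choice the analyst has to justify.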

16 - Videos
2 - Readings
3 - Quizzes
2 - App Items
1 - Discussion Prompt

Optional Honors Project

This is the last, optional part of the course. Students who complete this module earn Honors recognition, which can be helpful in their careers. Within this module, the student will work with datasets, data-cleaning techniques, feature engineering, exploratory data visualization, and hypothesis testing.

This is an entirely hands-on, practical module. The student chooses a data set, then applies all the newly learned skills and runs tests on it.

The work starts with basic data organization, retrieving information, and filtering the data, and then moves on to using the data in different business niches.

1 - Reading
1 - Peer Review

Description

This five-module course teaches data analysis using machine learning. The first section provides a historical overview of AI, tracing its development from its beginnings to modern capabilities. Here, you'll learn how AI can deliver accurate and timely responses, automate simple tasks, and ensure consistent results.

The second module teaches students how to access and retrieve information from various data libraries before cleaning them for use. This module is particularly valuable for aspiring data analysts who find manual data cleaning challenging.

The third module focuses on exploratory data analysis (EDA), data visualization, and data grouping. Students will learn the essential tasks for data analysis and gain hands-on experience applying them. The fourth module emphasizes making data error-free and maintaining its quality. This section focuses on inferential statistics and hypothesis testing.

Finally, the last module offers students the opportunity to apply what they have learned through a practical test. While this module is optional, it is highly recommended for a more comprehensive understanding.

Meet the Instructor

This course is a joint venture of IBM and Coursera. Two instructors teach this course:

Joseph Santarcangelo - Ph.D., Data Scientist at IBM – IBM

With a Ph.D. in Electrical Engineering, Joseph has a keen interest in machine learning, signal processing, and computer vision. He has worked in multiple related fields and has also done research on computer vision and the impact of videos on human cognition. He has since been working at IBM and has made some major contributions to machine learning as well.

Svitlana (Lana) Kramar - Data Science Content Developer – IBM

Kramar is the second instructor of this course. She currently works as a Data Science Content Developer at IBM. She earned her Master's degree in Data Science at the University of Calgary and is currently an Analytics student as well. Kramar has a keen interest in learning new languages, working across cultures, and spreading a passion for data science.
