Introduction
This course aims to introduce users to machine learning. It explores the importance of data quality and why most projects rely on filtering data. It also provides an in-depth understanding of the techniques used today for retrieving, sorting, cleaning, and using data for feature engineering. These techniques also support preliminary analysis and hypothesis testing.
This is a five-module course that takes 14 hours to complete. Learners can set their own pace for the course. However, it is important to note that this is not a beginner-level course. Students need to have a fundamental understanding of machine learning concepts and data analysis theories.
What Will You Gain from This Course?
By the end of this course, students will be able to:
Skills Acquired:
Who Can Benefit From This Course?
This course is designed for:
Course Content
5 Modules - 48 Videos - 11 Readings - 12 Quizzes - 2 Discussion Prompts - 9 App Items - 1 Peer Review - Certificate of Completion
A Brief History of AI and Its Applications
This module explores the overall timeline of AI and its real-life applications. It covers some basic concepts of artificial intelligence, how it works, and some of the top AI tools in use today. Students will learn what makes AI novel and how machine learning has contributed to its overall development. By the end, the learner will have an overview of AI, machine learning, and the basic history of modern AI.
Retrieving and Cleaning Data
The second module of this course teaches the student some basic ways to filter and clean data. Good data is essential to any machine learning or AI project. Within this module, students will learn different ways to retrieve data from multiple data sources and libraries.
Important concepts in this module include retrieving data from different sources such as CSV files, JSON files, databases, APIs, and the cloud. Students will also sharpen their data preparation skills, including data cleaning and handling missing values and outliers.
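To make the retrieval and cleaning steps above more concrete, here is a minimal sketch using only the Python standard library. It is not taken from the course materials. The sample data, the column names (id, price), and the helper functions are all hypothetical, and filling gaps with the median and flagging outliers with the 1.5 x IQR rule are just two common choices among several.

```python
import csv
import io
import json
import statistics

# Hypothetical sample data; the column names are made up for illustration.
CSV_TEXT = "id,price\n1,10.0\n2,\n3,12.0\n4,11.0\n5,250.0\n"
JSON_TEXT = '[{"id": 6, "price": 9.5}, {"id": 7, "price": null}]'


def load_rows(csv_text, json_text):
    """Read rows from a CSV string and a JSON string into one list of dicts."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        # An empty CSV field is treated as a missing value.
        price = float(rec["price"]) if rec["price"] else None
        rows.append({"id": int(rec["id"]), "price": price})
    for rec in json.loads(json_text):
        # JSON null is parsed as None, which also marks a missing value.
        rows.append({"id": int(rec["id"]), "price": rec["price"]})
    return rows


def fill_missing_with_median(rows, key):
    """Replace None values with the median of the observed values."""
    observed = [r[key] for r in rows if r[key] is not None]
    median = statistics.median(observed)
    return [{**r, key: median if r[key] is None else r[key]} for r in rows]


def flag_outliers_iqr(rows, key, k=1.5):
    """Return ids of rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [r[key] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r["id"] for r in rows if not low <= r[key] <= high]


rows = load_rows(CSV_TEXT, JSON_TEXT)
cleaned = fill_missing_with_median(rows, "price")
outlier_ids = flag_outliers_iqr(cleaned, "price")
print(outlier_ids)
```

In practice the same pattern applies to files on disk, database queries, or API responses: load everything into one common row format first, then clean it in a separate step so each stage can be checked on its own.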
Exploratory Data Analysis and Feature Engineering
This is the third module of this course and requires four hours to complete. The student will learn to conduct exploratory analysis to visually confirm whether a dataset is ready for machine learning modeling, and to prepare it through feature engineering and transformations.
This section focuses mainly on data visualization, analysis, and grouping. Some of the basic concepts in this module include Exploratory Data Analysis (EDA), EDA with Visualization, Grouping Data for EDA, Feature Engineering and Variable Transformation, Feature Encoding, and Feature Scaling.
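As a rough illustration of three of the concepts listed above (grouping data, feature encoding, and feature scaling), the sketch below uses only the Python standard library. The records, the column names (city, income), and the helper functions are hypothetical and not part of the course. One-hot encoding and min-max scaling are shown as common examples of encoding and scaling, not as the specific methods the course teaches.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; "city" and "income" are illustrative column names.
records = [
    {"city": "Calgary", "income": 50.0},
    {"city": "Toronto", "income": 70.0},
    {"city": "Calgary", "income": 60.0},
    {"city": "Ottawa", "income": 90.0},
]


def group_mean(rows, by, value):
    """Group rows by one column and average another (a basic EDA summary)."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[by]].append(r[value])
    return {k: mean(v) for k, v in groups.items()}


def one_hot(rows, column):
    """Feature encoding: one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    return [{f"{column}_{c}": int(r[column] == c) for c in categories} for r in rows]


def min_max_scale(values):
    """Feature scaling: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


mean_income = group_mean(records, "city", "income")
encoded = one_hot(records, "city")
scaled = min_max_scale([r["income"] for r in records])
print(mean_income, encoded[0], scaled)
```

Grouped summaries like mean_income are typically what gets plotted during EDA, while the encoded and scaled columns are what a model would eventually consume.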
Inferential Statistics and Hypothesis Testing
The fourth module of this course focuses on inferential statistics and hypothesis testing. It will take just one hour to complete. This section of the course is all about quality assurance, so by the end of this module, students will be able to analyze whether the quality of the data is good enough.
Inferential statistics and hypothesis testing are important data analysis tools, yet they are often overlooked, and skipping them eventually affects the quality of the data. This module teaches both so that students can gauge data quality for themselves.
Both tools also help build business intuition and suggest what to analyze next using machine learning. By the end of this module, the learner will have a much better understanding of key data analysis terms, along with their application and use. Students will also get hands-on experience creating hypotheses around business problems and testing them.
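The idea of framing a hypothesis around a business problem and testing it can be sketched as follows. This example is an assumption-laden illustration, not the course's own method: the business scenario (two checkout page designs), the order values, and the function name are hypothetical, and an exact permutation test is used here simply because it needs nothing beyond the Python standard library.

```python
from itertools import combinations
from statistics import mean

# Hypothetical order values (in dollars) for two checkout page designs.
old_page = [20, 22, 19, 24, 21]
new_page = [30, 29, 31, 28, 32]


def permutation_test(a, b):
    """Exact two-sided permutation test for a difference in means.

    Null hypothesis: group labels are irrelevant, so every way of splitting
    the pooled values into groups of the original sizes is equally likely.
    """
    pooled = list(a) + list(b)
    observed = abs(mean(b) - mean(a))
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [v for i, v in enumerate(pooled) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_b) - mean(group_a)) >= observed - 1e-12:
            extreme += 1
    return observed, extreme / total


diff, p_value = permutation_test(old_page, new_page)
reject_null = p_value < 0.05  # 0.05 is a conventional, not mandatory, threshold
print(diff, p_value, reject_null)
```

Enumerating every split is only practical for small samples like this one. For larger datasets, a random sample of permutations or a standard parametric test would be the more usual choice.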
Optional Honors Project
This is the last, optional part of this course. After completing this module, the student will earn Honors recognition, which can help in their career. Within this module, the student will work with datasets, data-cleaning techniques, feature engineering, exploratory data visualization, and hypothesis testing.
This is an entirely hands-on module. The student chooses a data set, then applies all the newly learned skills to it and runs tests on it.
The work starts with basic data organization, retrieving information, and filtering the data, and then moves on to using the data in different business niches.
Description
This five-module course teaches data analysis using machine learning. The first section provides a historical overview of AI, tracing its development from its beginnings to modern capabilities. Here, you'll learn how AI can deliver accurate and timely responses, automate simple tasks, and ensure consistent results.
The second module teaches students how to access and retrieve information from various data libraries and then clean it for use. This module is particularly valuable for aspiring data analysts who find manual data cleaning challenging.
The third module focuses on exploratory data analysis (EDA), data visualization, and data grouping. Students will learn the essential tasks for data analysis and gain hands-on experience applying them. The fourth module emphasizes making data error-free and maintaining its quality. This section focuses on inferential statistics and hypothesis testing.
Finally, the last module offers students the opportunity to apply what they have learned through a practical test. While this module is optional, it is highly recommended for a more comprehensive understanding.
Meet the Instructors
This course is offered by IBM on Coursera. Two instructors teach this course:
Joseph Santarcangelo - Ph.D., Data Scientist at IBM
With a Ph.D. in Electrical Engineering, Joseph has a keen interest in machine learning, signal processing, and computer vision. He has worked in multiple related fields and has also done research on computer vision and the impact of videos on human cognition. He has since been working at IBM and has made some major contributions to machine learning as well.
Svitlana (Lana) Kramar - Data Science Content Developer at IBM
Kramar is the second instructor of this course. She currently works as a Data Science Content Developer at IBM. She earned her Master's degree in Data Science at the University of Calgary and is currently an Analytics student as well. Kramar has a keen interest in learning new languages, working across cultures, and spreading a passion for data science.