Name: Exploratory Data Analysis (EDA) in Python for Machine Learning in Bioinformatics
Price: 29.99 USD
Availability: InStock
Rating: 3 (1 review)

17 Video lessons

3h 43m Total content

≈4 hrs/week Master it in ~1 week

About Course

Massive amounts of biological data are difficult to interpret in written or tabular form. Representing the data graphically exposes trends, correlations, outliers, and patterns that would otherwise stay hidden. Biological data visualization is the graphical representation of biological data and information.

Biological data visualization is an important part of bioinformatics and applies to both structured and unstructured biological data. It helps you make well-informed decisions during your research and produce publishable figures for your research papers.

Exploratory Data Analysis (EDA) is an approach to analyzing biological datasets to summarize their main characteristics. It is used to understand biological data, get some context, understand the variables and the relationships between them, and formulate hypotheses that could be useful when building predictive models. EDA is performed with the help of biological data visualization.
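As a rough illustration of what "summarizing the main characteristics" of a dataset can look like in practice, here is a minimal sketch using Pandas and NumPy. The data is synthetic and the column names (`marker_a`, `marker_b`, `group`) are hypothetical; they are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two made-up measurements for 100 samples.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({
    "marker_a": rng.normal(loc=5.0, scale=1.0, size=100),
    "marker_b": rng.normal(loc=2.0, scale=0.5, size=100),
    "group": rng.choice(["healthy", "patient"], size=100),
})
# Blank out a few values so the missing-value check has something to find.
df.loc[[3, 17, 42], "marker_b"] = np.nan

summary = df.describe()      # count, mean, std, and quartiles per numeric column
missing = df.isna().sum()    # number of missing values per column
group_means = df.groupby("group")[["marker_a", "marker_b"]].mean()
correlation = df[["marker_a", "marker_b"]].corr()  # pairwise Pearson correlation

print(summary)
print(missing)
print(group_means)
print(correlation)
```

These four views (summary statistics, missing values, per-group averages, and correlations) are a common first pass before any plotting or modeling.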

BioCode is offering a detailed hands-on course on Exploratory Data Analysis for machine learning and data pre-processing in Python. Python provides various libraries with different features for visualizing biological data and information. This course will help students understand the concept and purpose of exploratory data analysis and its importance in machine learning. Students will also learn use cases for Pandas, NumPy, Seaborn, Matplotlib, Jupyter Notebook, and Anaconda in EDA.

Students will also learn how to retrieve bioinformatics, genomics, and health informatics datasets and develop machine learning models after performing the EDA.

In this course, students will identify useful features from the dataset that can be used for machine learning. Students will learn how to perform end-to-end exploratory data analysis of their biological datasets and create charts such as joint plots, bar plots, line plots, swarm plots, scatter plots, correlation plots, and histograms. Students will learn how to analyze trends, distributions, and relations between biological features. This course is for absolute beginners in bioinformatics scripting, and you don't need any prior knowledge of scripting or bioinformatics to get started.

This course will include the following sections:

Section 1: Introduction to Exploratory Data Analysis and Visualization in Python

Description: This section introduces exploratory data analysis and its importance for identifying trends, patterns, distributions, and correlations in biological data. Students will learn about the Python libraries used to perform exploratory data analysis and will be able to retrieve raw datasets for machine learning.

Learning Outcomes: Upon completion of this section, students will be able to:

Discuss Exploratory Data Analysis.
Understand the Importance of Exploratory Data Analysis in Machine Learning for Bioinformatics.
Explain Pandas Structures.
Explain NumPy Structures.
Describe Matplotlib.
Describe Seaborn.
Retrieve Datasets for Machine Learning.
Explain the Raw Breast Cancer Dataset.
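
The outcomes above on retrieving and inspecting a raw dataset could look something like the following sketch. This description does not say where the course's breast cancer dataset comes from, so the example assumes scikit-learn's bundled Wisconsin breast cancer data as a stand-in; the course's own file may have different columns.

```python
from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled Wisconsin breast cancer data stands in
# for the course's dataset, which is not identified in this description.
bunch = load_breast_cancer(as_frame=True)
df = bunch.frame.copy()  # 30 numeric feature columns plus an integer "target" column

# Map the integer labels to their names for readability.
df["diagnosis"] = df["target"].map(dict(enumerate(bunch.target_names)))

print(df.shape)                        # rows and columns
print(df.dtypes.value_counts())        # how many columns of each type
print(df["diagnosis"].value_counts())  # class balance
print(df.isna().sum().sum())           # total number of missing values
```

Checking the shape, column types, class balance, and missing values first tells you how much cleaning the raw data needs before plotting.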

Section 2: Hands-on Exploratory Data Analysis of Cancer Dataset

Description: This section covers hands-on exploratory data analysis of the cancer dataset. Students will learn how to make several types of graphs and plots, including line plots, joint plots, density plots, swarm plots, scatter plots, histograms, correlation plots, linear model plots, and bar charts. Students will use these plots to identify biological factors and the relations between them, and will learn how each graph supports biological data analysis.

Learning Outcomes: Upon completion of this section, students will be able to:

Create a Line Plot to Understand the Trends in Cancer Datasets.
Create a Joint Plot to Visualize Features from Multiple Angles.
Understand the Density Plot to Evaluate the Enzyme Levels in Cancer Individuals.
Compare the Serum Levels in Healthy Individuals and Patients Using a Swarm Plot.
Evaluate the Distribution of Features Using a Histogram.
Elucidate the Relation Between Two Features Using a Scatter Plot.
Understand the Correlation Between Features Using Correlation Plot and Heatmap Visualizations.
Create a Linear Model Between Two or More Features to Understand their Relation Using a Linear Model Plot.
Draw a Regression Line Between Two Features for Regression Analysis.
Identify the Frequency of Patients Using Bar Charts.

Tools & technologies you'll use

Python
Conda
Jupyter
Pandas
NumPy
Machine Learning

Course Content

Introduction to Exploratory Data Analysis and Visualization in Python

Introduction to EDA in Machine Learning for Bioinformatics

18:13
Introduction to Pandas Structures

08:39
Introduction to Numpy Structures

07:35
Introduction to Matplotlib

06:25
Introduction to Seaborn

05:00
How to Retrieve Datasets for Machine Learning

05:41
Raw Breast Cancer Dataset Explanation

06:46

Hands-on EDA of Cancer Dataset

Creating a Line Plot to Understand the Trends in Cancer Datasets

19:18
Assignments Question
Creating a Joint Plot to Visualize Features from Multiple Angles

16:52
Assignment Question
Understanding the Density Plot to Evaluate the Enzyme Levels in Cancer Individuals

20:39
Assignments Question
Comparison of Serum Levels in Healthy Individuals and Patients Through Swarm Plots

08:34
Assignments Question
Evaluating the Distribution of Features Using a Histogram

16:01
Assignments Question
Elucidating the Relation Between Two Features Using Scatter Plot

15:53
Assignments Question
Understanding the Correlation Between Features Using Correlation Plot and Heatmap Visualizations

17:53
Assignments Question
Creating a Linear Model Between Two or More Features to Understand Their Relation Using a Linear Model Plot

14:48
Assignments Question
Drawing a Regression Line Between Two Features for Regression Analysis

17:49
Assignments Question
Identifying Frequency of Patients Using Bar Charts

13:57
Assignments Question

Exercise

Add this certificate to your resume to demonstrate your skills & increase your chances of getting noticed.

Student Ratings & Reviews

3.0 (1 rating)

3 months ago

This course was exactly what I needed to bridge the gap between biology and data science. The way the instructor broke down how to visualize massive datasets with Python was super helpful. I finally feel like I can actually make sense of complex genomic data for my own projects.
