Advanced Bioinformatics Scripting in Python, BioPython, R & BioConductor (Subscription)

About Course

Writing short scripts and programs, or developing software, for biological data analysis (sequence alignment and analysis, genome analysis, proteome analysis, phylogenetic analysis, biological data visualization, microarray gene expression analysis, and so on) requires a solid understanding of the programming languages used in biology and of how to apply them when writing scripts.

BioCode offers the Advanced Bioinformatics Scripting in Python, BioPython, R & BioConductor course to take you from the very basics of biological programming in Python, BioPython, and R to an advanced understanding of bioinformatics scripting, even if you have no prior knowledge. You will learn how to write programs for microarray gene expression analysis, biological data visualization with ggplot2, and sequence retrieval, alignment, BLAST database searching, and phylogenetic analysis in BioPython. You will also learn Linux (BASH) for bioinformatics from end to end.

This course is for absolute beginners in bioinformatics scripting; you do not need any prior knowledge of scripting, or even of bioinformatics, to get started. Everyday bioinformatics analysis involves studying and analyzing huge biological datasets, and Linux together with programming languages such as Python and R has made it easy to analyze such datasets.

This course will include the following sections:

Section 1: Python

Description: This section ensures that students understand scripting in the Python language and the basic functions that can be used to manipulate biological data.

Learning Outcomes: Upon completion of this section, students will be able to:

Learn the Importance of Python in Bioinformatics.
Understand Python Programming Language.
Install Python Language.
Discuss Comments in Programming Language.
Perform Basic Input and Output Functions.
Perform Mathematical Operations.
Explain Strings Data Structure.
Explain Dictionaries.
Discuss Lists in Python.
Describe Tuples.
Explain Sets.
Execute If-Else Conditions in Scripts.
Execute While Loop and Perform Biological Data Analysis.
Explain CSV Files.
Read Files.
Write Files.
Consolidate (merge) Multiple DNA and Protein Sequences into one FASTA File.
Describe OS Module.
Explain Functions in Python.
Use the “With” Statement in Python.
Perform Error Handling.
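
As an illustration of how several of these outcomes fit together (reading files, writing files, the "with" statement, the OS module, functions, and basic error handling), here is a minimal sketch of merging several FASTA files into one. It is not taken from the course material; the function name merge_fasta and the file names in the usage note are hypothetical.

```python
import os


def merge_fasta(input_paths, output_path):
    """Concatenate several FASTA files into one file.

    Returns the number of records (header lines starting with '>') written.
    """
    record_count = 0
    # "with" closes both files automatically, even if an error occurs.
    with open(output_path, "w") as merged:
        for path in input_paths:
            if not os.path.isfile(path):
                # Simple error handling: report and skip missing files.
                print(f"Skipping missing file: {path}")
                continue
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # drop blank lines between records
                    if line.startswith(">"):
                        record_count += 1  # each header starts a new record
                    merged.write(line + "\n")
    return record_count
```

A call such as merge_fasta(["dna.fasta", "protein.fasta"], "merged.fasta") would write both inputs to merged.fasta and return the number of records copied. Skipping missing files rather than raising an exception is a design choice made for this sketch only.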

Section 2: BioPython

Description: This section teaches the functions for biological data analysis available in BioPython, a third-party package for the Python programming language.

Learning Outcomes: Upon completion of this section, students will be able to:

Understand the BioPython module.
Install BioPython.
Create a Sequence Object Using the Seq Class from Bio.Seq.
Explain How a Sequence Object Behaves like a String.
Perform the Central Dogma Steps (Transcription and Translation) in BioPython.
Import UnknownSeq and MutableSeq Objects from the Bio.Seq Module.
Understand the Alphabets of Biology Using the Bio.Alphabet Module.
Explain IUPAC Module and Types of Sequence Representation.
Concatenate Multiple Sequence Records Using Generic Alphabets.
Create Sequence Records Using SeqRecord Module.
Utilize the SeqRecord Module to Demonstrate the Representation of FASTA File Within BioPython.
Utilize the SeqRecord Module to Demonstrate the Representation of GenBank File Within BioPython.
Utilize the Formatting Feature of the SeqRecord Module.
Compare and Read Multiple FASTA Files from Directory Using SeqRecord Module in BioPython.
Read a Sequence File Using the SeqIO Module.
Parse a Sequence File Using the SeqIO Module.
Parse a Compressed Sequence File and Create a Dictionary of Sequences.
Write Sequences and SeqRecords into Files.
Extract Annotations and Perform Pattern-wise Sequence Data Extraction Using SeqIO module.
Read and Parse a Multiple Sequence Alignment File using AlignIO Module.
Write Alignment and Multiple Sequence Alignment Records using AlignIO Module.
Convert Alignment Formats.
Manipulate Alignments.
Align Multiple Sequences Using the ClustalW Python Wrapper.
Align Two Sequences Using the pairwise2 Module in BioPython.
Read Multiple Sequence Alignment Files of a Particular Format and Map Information of Alignments.
Format Alignments.
Truncate the Specific Regions from the Entire Alignment (Slice Alignments).
Query NCBI BLAST Through Python.
Access ENTREZ Using Python.
Parse the BLAST Results using the Bio.Blast module.
Get the Summary of Accessions Using the ESummary Function of the Entrez Module in BioPython.
Download Complete Records Using EFetch Function.
Use EGQuery Function to do Global Queries for Search Count.
Search for Database Links of Records Using ELink.
Search the Entrez Database Using ESearch Function.
Use ESpell Function to Get the Correct Spellings for your Search Terms.
Download GenBank and Entrez Records.
Search Taxonomy Database.
Download PubMed Articles.
Read a PDB (3D Structure) File Using Bio.PDB Module.
Calculate the Distance Matrix Between Sequences for Phylogenetic Analysis.
Convert Phylogenetic Tree Data Formats.
Print Out the Phylogenetic Tree in ASCII.
Read Phylogenetic Trees.
Visualize and Manipulate Phylogenetic Trees.
Create a Web Logo of Motifs.
Perform MEME Analysis.
Write Out Phylogenetic Data.
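
To make the "central dogma" outcome above concrete, the sketch below shows what transcription, reverse complementation, and translation of a DNA string involve. It deliberately uses only the Python standard library rather than BioPython, whose Seq objects provide equivalent transcribe(), reverse_complement(), and translate() methods; the function names and the example sequence here are illustrative assumptions, not course code.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transcribe(dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna, to_stop=False):
    """Translate coding DNA codon by codon; '*' marks a stop codon.

    Codons containing unexpected characters are reported as 'X'.
    Trailing bases that do not form a full codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE.get(dna[start:start + 3], "X")
        if to_stop and amino_acid == "*":
            break  # stop at the first in-frame stop codon
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    coding_dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    print(transcribe(coding_dna))
    print(reverse_complement(coding_dna))
    print(translate(coding_dna))
```

In practice the BioPython methods would be preferred, since they also handle alternative codon tables and ambiguous bases; the plain-Python version is only meant to show the underlying logic.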

Section 3: R Language

Description: This section covers the R language, the biological analyses that are performed using R, and how R is used to visualize biological data with the ggplot2 package.

Learning Outcomes: Upon completion of this section, students will be able to:

Discuss R Language.
Install R.
Explain Comments.
Declare Variables and Objects.
Use Built-in Functions and ARGS.
Explain Samples and Replacement.
Write their own Functions and Arguments.
Create Customized Scripts.
Discuss Packages in R.
Install Bioinformatics Packages in R.
Initialize Library to Perform R Functions.
Get Help from Help Packages.
Explain Atomic Vectors in R.
Explain Integers, Doubles, Logicals, and Factors in R.
Discuss Dim and Dimensions in R.
Explain Attributes and Names.
Describe Matrix and Matrices.
Explain Arrays and Lists.
Describe Coercion.
Explain Data Frames.
Load Biological Data.
Save Biological Data.
Perform R Notation and Select Values from Biological Datasets.
Discuss Positive Integers for Subsetting Biological Datasets.
Discuss Negative Integers for Subsetting Biological Datasets.
Describe Zero Notation for Subsetting Biological Datasets.
Explain Blank Spaces for Subsetting Biological Datasets.
Explain Dollar Signs for Subsetting Biological Datasets.
Modify Values in Existing Datasets.
Explain NA (Not Available) Values in Biological Datasets.
Figure out NA Values in Biological Datasets.
Perform Logical Subsetting in Biological Datasets.
Use If Else Statement in Code.
Use For Loops and Perform Biological Data Binding.
Use While Loops and Read Multiple Biological Datasets.
Explain ggplot2 and its Use in Biological Data Representation.
Describe Key Components in ggplot2.
Visualize Human Mitochondrial Proteome.
Facet the Human Chromosome Dataset.
Smooth Out the Biological Data.
Create Box Plot for Human Mitochondrial Proteome.
Create Histograms for Human Mitochondrial Pattern Finding.
Create Frequency Plots for Human Mitochondrial Information Frequency Mining.
Create Bar Charts for Human Mitochondrial Knowledge Mining.
Scale and Limit Data Visualization.
Visualize Phylogenetic Tree.
Save Visualizations in High Resolution.

Section 4: Linux

Description: This section ensures that students learn the Linux operating system and the command-line (BASH) tools used to retrieve, inspect, and process biological data.

Learning Outcomes: Upon completion of this section, students will be able to:

Discuss Linux Operating System.
Print Working Directory in Linux.
Make Directories in Linux.
Change Directories in Linux.
Move Files, Directories, and Data.
Delete Files and Directories in Linux.
Find the Programs Installed by the User.
Find the Files Created by the User.
List Files and Directories on Linux.
Pipe and Redirect Data.
Visualize and Inspect Text Data.
Read the Specified Number of Lines from the Bottom.
Modify File Statistics and Create Files.
See the Statistics of Files & Directories.
Retrieve Genome Assemblies.
Retrieve Bioinformatics Files.
Create and Edit Text Files.
Find Sequence Differences in Files.
Compress and Archive Files Efficiently.
Extract Compressed Content.
Create Archives of Genome Data.
Find Uncharacterized Proteins in the Human Genome.
Subset Required Textual Data from Text Files.
Sort Data.
Find Unique Data Items.
See the Statistics of Data Within the File.
Copy Files and File Contents.
Properly Visualize Delimited Datasets.

English

Introduction
Iterable Objects
Control Flow
File Handling
Functions & Modules
Error Handling
Sequence Analysis
Sequence Data Parsing
Sequence Data Extraction
Alignment Parsing and Analysis
BLAST Database Searching
Parsing BLAST results
Biological Data Retrieval
Parsing a PDB Structure file
Phylogenetic Analysis
Protein Sequence Analysis
Variables & Functions
Packages
Vectors & Data Types
Biological Data Analysis
Data Visualization: ggplot2
Getting Familiar With Linux
Piping and Control Data Flow
Pre-processing Biological Datasets
Processing and Analysis of Biological Datasets
BioConductor
Sequence Retrieval
Bioinformatics File Parsing and Writing
Sequence Alignment
Database Searching
Gene Enrichment Analysis
Data Transformation with dplyr
Tidy Data with tidyr
MicroArray Analysis: BioConductor

Course Content

Python

Why Python in Bioinformatics

09:16
Introduction to Python and Its Installation

08:25
Comments

05:43
Basic Input and Output

15:38
Mathematical Operations

07:20
Strings

21:51
Dictionaries

10:57
Lists

28:48
Lists (Part 2) and Tuples

10:38
Sets

07:36
If-Else

09:19
For Loop and Calculation of Molecular Weight of Proteins

10:56
While Loop and Biological Data Analysis

09:37
CSV (A special kind of file in Bioinformatics)

08:42
Reading Files

13:45
Writing Files

07:18
Consolidate (merge) multiple DNA and Protein Sequences into one FASTA file

09:25
OS Module

31:47
Function

26:41
With

08:50
Error Handling

15:31

BioPython

Introduction to BioPython & Installation

10:19
Bio.Seq Create a Seq Object

07:34
Bio.Seq Seq Object Behaves Like a String

09:54
Bio.Seq Central Dogma in Play Through Python

08:41
Bio.Seq Unknown & Mutable Sequences

06:54
Bio.Alphabet Understanding the Alphabets of Biology

07:38
Bio.Alphabet IUPAC and Types of Sequence Representations

10:34
Bio.Alphabet Concatenation of Multiple Seq Records Using Generic Alphabets

09:47
SeqRecord Creating Seq Records

12:28
SeqRecords & FASTA

04:36
SeqRecords & GenBank

03:29
SeqRecord Formatting Records

03:06
SeqRecord Comparison & Reading Multiple FASTA Files from Directory

05:47
SeqIO Reading a Sequence File

10:32
SeqIO Parsing a Sequence File

07:17
SeqIO Parsing a Compressed Sequence File & Creating a Dictionary of Sequences

06:11
SeqIO – Write Sequences and SeqRecords Into Files

11:43
SeqIO Extracting Annotations and Pattern wise Sequence Data Extraction

10:35
AlignIO – Reading and Parsing a Multiple Sequence Alignment File

08:19
AlignIO – Writing Alignments and Multiple Sequence Alignment Records

05:29
AlignIO – Conversion of Alignment Formats

04:02
AlignIO – Manipulating Alignments

02:57
AlignIO – ClustalW Python Wrapper – Align Multiple Sequences

07:47
AlignIO – Pairwise2 – Align Two Sequences

07:31
AlignIO – Information Mapping of Alignments

02:33
AlignIO – Format Alignments

03:36
AlignIO – Slicing Alignments

06:06
Bio.Blast – Querying NCBI BLAST Through Python

11:15
Bio.Entrez – Accessing ENTREZ Using Python

09:32
Bio.Blast – Parsing BLAST Results

14:52
Bio.Entrez – Use ESummary to Get a Summary of Your Accessions

08:59
Bio.Entrez – Use EFetch to Download Complete Records

13:57
Bio.Entrez – Use EGQuery to Do Global Queries for Search Counts

07:24
Bio.Entrez – Use ELink to Search for Database Links of Records

03:42
Bio.Entrez – Use ESearch to Search the Entrez Databases

08:20
Bio.Entrez – Use ESpell to Get Correct Spellings for Your Search Terms

05:21
Bio.Entrez – Download GenBank and Entrez Records

14:17
Bio.Entrez – Taxonomy Database Searching

07:05
Bio.Entrez – Download PubMed Articles

08:28
Bio.PDB – Reading a PDB (3D Structure) File

11:59
Bio.Phylo – Calculating Distance Matrix Between Sequences for Phylogenetic Analysis

04:18
Bio.Phylo – Converting Phylogenetic Tree Data Formats

03:29
Bio.Phylo – Printing Out Phylogenetic Tree in ASCII

02:17
Bio.Phylo – Reading Phylogenetic Trees

06:29
Bio.Phylo – Visualization And Manipulation Of Phylogenetic Trees

09:36
Bio.motifs – Creating a WebLogo of Motifs

10:47
Bio.motifs – MEME Analysis

09:49
Bio.Phylo – Writing Out Phylogenetic Data

04:04

R

Introduction to R in Bioinformatics & R Installation

09:48
The R Studio Interface

06:23
Comments

04:17
Variable Declaration and Objects

05:24
Built-in Functions & ARGS

04:32
Sample & Replacement

09:09
Write Your Own Functions And Arguments

05:39
Scripts

07:36
Packages

04:00
Install Packages

05:25
Library & Initialize Packages

02:28
Getting Help with Help Packages

03:43
Atomic Vectors

02:43
Doubles

03:31
Integers

03:23
Characters

04:43
Logicals

02:27
Dim & Dimensions

05:46
Attributes and Names

04:46
Matrix & Matrices

04:43
Arrays

03:42
Factors

06:41
Coercion

04:27
Lists

06:42
Data Frames

06:30
Loading Biological Data

07:56
Saving Biological Data

05:27
R Notation & Selecting Values from Biological Dataset

04:09
Positive Integers for subsetting Biological Dataset (DataFrame)

05:26
Negative Integers for subsetting Biological Dataset (DataFrame)

05:28
Zero Notation for subsetting Biological Datasets (DataFrames)

01:09
Blank Spaces For Biological Data Subsetting

03:21
Dollar Signs for Biological Dataset Subsetting

02:58
Modifying Values in Existing DataFrames/Datasets

07:06
NA Values in Biological Dataset

05:25
Figuring out NA Values in Biological Dataset

02:06
Logical Subsetting in Biological Datasets

09:46
If Else Statements

04:15
For Loops & Biological Data Binding

16:30
While Loops & Reading Multiple Biological Datasets

16:16
Introduction to ggplot2 for Biological Datasets

10:46
ggplot2: Key components

08:26
ggplot2: Human Mitochondrial Proteome & Aesthetics (Size, Shape, Color)

26:06
ggplot2: Faceting of Human Genome

22:25
ggplot2: Smooth Out the Biological Data

08:43
ggplot2: Boxplots for Human Mitochondrial Proteome

07:56
ggplot2: Histograms for Human Mitochondrial Pattern Finding

06:02
ggplot2: Frequency Plots for Human Mitochondrial Information Frequency Mining

06:13
ggplot2: Bar Charts for Human Mitochondrial Knowledge Mining

10:43
ggplot2 – Scaling and Limiting Data Visualization

03:53
ggplot2 – Changing Labels and Finalizing Visualization

08:42
ggtree – Phylogenetic Tree Visualization

05:41
ggsave – Saving the Visualizations in High Resolution

04:45

Biological Data Analysis and Manipulation Using dplyr and tidyr

Introduction to dplyr

15:38
Filter Rows with filter()

20:13
Select Columns with select()

28:30
Add New Variables with mutate()

21:19
Grouped Summaries with summarize()

18:30
Grouped Mutates (and Filters)

19:58
Introduction to tidyr

11:35
Data Spreading Function

13:51
Data Gathering Function

19:30
Data Separating & Pull

17:24
Missing Values

29:58

Linux

Introduction to Linux for Bioinformatics

22:32
PWD – Print Working Directory

01:26
CD – Changing Directories

05:03
MKDIR – Making Directories

08:13
MV – Moving Files, Directories and Data

05:11
RM – Deleting Files and Directories

01:24
Which & Whereis – Find Programs You Installed

03:43
Find – Finding User Created Files

03:39
LS – Listing Files and Directories on Linux

06:46
Piping and Redirection of Data

06:35
Cat – Visualization and Inspection of Text Data

03:56
Head – Reading Specified Number of Lines from Top

03:50
Tail – Reading Specified Number of Lines from Bottom

02:23
Touch – Modifying File Statistics and Creating Files

07:04
Stat – Statistics of File & Directories

02:43
Wget – Retrieval of Genome Assemblies

06:48
Curl – Retrieval of Bioinformatics Files

02:25
Vim – Create and Edit Text Files

05:59
Diff – Find Sequence Differences in Files

02:35
GZIP – Compress and Archive Files Efficiently

06:05
GUNZIP – Extract Compressed Content

02:14
Tar – Create Archives of Genome Data

04:19
Grep – Finding Uncharacterized Proteins in Human Genome

08:55
Cut – Subsetting Required Textual Data from Text Files

05:49
Sort – Sorting Data

04:23
Uniq – Finding Unique Data Items

10:33
WC – Statistics of the Data Within File

02:46
CP – Copying Files and File Contents

03:43
Column – Proper Visualization of Delimited Datasets

04:38

Microarray

Introduction to ArrayExpress – Getting Started With MicroArray Analysis

09:56
Introduction to BioConductor – Installing MicroArray Packages

05:06
Getting Started with R Studio Project for MicroArray Analysis

04:51
Downloading MicroArray Raw Data from ArrayExpress

04:19
Creating Raw Intensities MicroArray Data Structure and Log2 Transformation

14:41
Principal Component Analysis of Raw Expression Dataset

15:44
Box Plot Visualization of Raw Intensity Data to Interpret the Median Intensities of the Samples

03:11
ArrayQualityMetrics – Automated Quality Control for Microarray Datasets

05:38
Annotating the Probe IDs with Gene Symbols and Names

04:19
Excluding Probe IDs with Multiple Mappings from the ExpressionSet

04:39
Filtering out the Genes that are Above Threshold

06:02
Heatmap Visualization of the Normalized Gene Expression Values

11:52
Intensity-based Filtering of Low-Intensity Transcripts

06:19
Normalization of Raw Intensities Values

04:36
Relative Log Expression Analysis and Visualization

08:57
Removal of the Probe IDs that Match to Multiple Genes

04:04
Robust Multi-Array Summarization and Background Correction of the Raw MicroArray Data

03:47
LIMMA – Data Preparation for Linear Modelling

11:49
Factors Preparation

10:26
Analysis of Gene Expression Levels of a Single Gene Among Different Conditions

11:60
LIMMA – Applying Linear Model on a Single Gene Expression Data

05:34
Applying a t-test to Find Whether Genes Are Differentially Expressed

06:51

Exercise

Add this certificate to your resume to demonstrate your skills & increase your chances of getting noticed.
