Pathway enrichment analysis

Intro

In the previous section, we introduced how to quantify RNA, extract normalized data, and perform differential analysis. We also discussed which pathways might be associated with up- and down-regulated genes. In this section, we demonstrate how to conduct pathway enrichment analysis on the differentially expressed genes identified.

Go enrichment analysis

The enrich_analysis function can be applied to the differential analysis results obtained previously. It utilizes the clusterProfiler package to carry out pathway enrichment and returns a list containing both the GO and KEGG enrichment objects as well as corresponding tables.

For example, the following code snippet selects up-regulated genes for Biological Process (BP) pathway enrichment:

library(clusterProfiler)
library(org.Hs.eg.db)

en <- enrich_analysis(diff_data = te.rex$deg_anno,
                      log2fc_col = "log2FoldChange",
                      pval_col = "pvalue",
                      type = "go",
                      trend = "up",
                      ont = "BP",
                      organism = "hsa",
                      OrgDb = org.Hs.eg.db)

After obtaining the enrichment results, you can use the enrichplot package to produce various visualizations:

# dotplot
enrichplot::dotplot(en$enrich.go.obj)

GSEA enrichment analysis

Specifying type = “gsea” includes all genes in the GSEA-based enrichment analysis:

en2 <- enrich_analysis(diff_data = te.rex$deg_anno,
                      log2fc_col = "log2FoldChange",
                      pval_col = "pvalue",
                      type = "gsea",
                      ont = "BP",
                      organism = "hsa",
                      OrgDb = org.Hs.eg.db)

GSEA visualization:

head(en2$enrich.go.df$ID)
# [1] "GO:0003044" "GO:0050818" "GO:1900118" "GO:0007030" "GO:0030193" "GO:1900046"

enrichplot::gseaplot2(x = en2$enrich.go.obj,
                      geneSetID = "GO:0003044")

Multiple pathway visualization:

enrichplot::gseaplot2(x = en2$enrich.go.obj,
                      geneSetID = c("GO:0003044", "GO:0050818", "GO:1900118"))