2  Raw counts example

2.1 Introduction

omicScope accepts several input formats: aligned BAM files, count matrices generated by featureCounts, or any standard gene expression matrix with genes as rows and samples as columns. The examples in this chapter show how to import count data.

2.2 Counts matrix input

Here we load the built-in example count matrix provided by omicScope (the package must already be attached for data() to find it):

data("counts")

# Inspect the first rows
head(counts)
#                    ../test-bam/0a.sorted.bam ../test-bam/0b.sorted.bam ../test-bam/4a.sorted.bam ../test-bam/4b.sorted.bam
# ENSMUSG00000102693                         0                         1                         0                         0
# ENSMUSG00000051951                        13                        16                       109                       147
# ENSMUSG00000102851                         0                         1                         2                         2
# ENSMUSG00000103377                         3                         4                        13                        26
# ENSMUSG00000104017                         0                         4                        13                        15
# ENSMUSG00000103025                         0                         0                         7                        14
#                    ../test-bam/10a.sorted.bam ../test-bam/10b.sorted.bam
# ENSMUSG00000102693                          0                          0
# ENSMUSG00000051951                         65                        105
# ENSMUSG00000102851                          0                          0
# ENSMUSG00000103377                          6                          9
# ENSMUSG00000104017                         13                         10
# ENSMUSG00000103025                          3                          3
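
Before building the object, it can help to confirm that a matrix has the layout described in the introduction: genes as rows, samples as columns, and non-negative whole-number counts. The following is a minimal base-R sketch, not part of omicScope. It uses a toy matrix copied from the first three rows and the two day0 columns printed above; the names `toy_counts`, `is_count_like`, `has_unique_genes`, and `lib_size` are illustrative, and the sketch assumes the input is a numeric matrix rather than a data frame.

```r
# Toy matrix copied from the first three rows and first two columns shown above
# (matrix() fills column by column)
toy_counts <- matrix(
  c(0, 13, 0,    # ../test-bam/0a.sorted.bam
    1, 16, 1),   # ../test-bam/0b.sorted.bam
  nrow = 3,
  dimnames = list(
    c("ENSMUSG00000102693", "ENSMUSG00000051951", "ENSMUSG00000102851"),
    c("../test-bam/0a.sorted.bam", "../test-bam/0b.sorted.bam")
  )
)

# Raw counts should be numeric, non-negative whole numbers
is_count_like <- is.numeric(toy_counts) &&
  all(toy_counts >= 0) &&
  all(toy_counts == round(toy_counts))

# Gene identifiers (row names) should be unique
has_unique_genes <- !anyDuplicated(rownames(toy_counts))

# Per-sample library size (sum of counts in each column)
lib_size <- colSums(toy_counts)
```

If either flag is FALSE, the matrix is probably transposed, normalized, or contains duplicated gene IDs, and is worth inspecting before import.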

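Next, we construct an omicscope object from the loaded count matrix and a metadata data frame.

⚠️ Important: The metadata data frame must contain a sample column that identifies each sample. In this example, its values are the column names of the count matrix (the BAM file paths).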

mta <- data.frame(sample = c("../test-bam/0a.sorted.bam","../test-bam/0b.sorted.bam",
                             "../test-bam/4a.sorted.bam","../test-bam/4b.sorted.bam",
                             "../test-bam/10a.sorted.bam","../test-bam/10b.sorted.bam"),
                  sample_name = c("day0-rep1","day0-rep2","day4-rep1","day4-rep2",
                                  "day10-rep1","day10-rep2"),
                  group = rep(c("day0","day4","day10"),each = 2))
mta
#                       sample sample_name group
# 1  ../test-bam/0a.sorted.bam   day0-rep1  day0
# 2  ../test-bam/0b.sorted.bam   day0-rep2  day0
# 3  ../test-bam/4a.sorted.bam   day4-rep1  day4
# 4  ../test-bam/4b.sorted.bam   day4-rep2  day4
# 5 ../test-bam/10a.sorted.bam  day10-rep1 day10
# 6 ../test-bam/10b.sorted.bam  day10-rep2 day10
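
Because the sample column ties each metadata row to a column of the count matrix, a quick consistency check before calling omicscope() can catch typos in the paths. The sketch below is a base-R illustration under stated assumptions: whether omicscope() matches samples by name or by position is not documented in this chapter, so the sketch simply verifies that both sets agree and reorders the metadata to follow the count columns. `count_cols` and `toy_mta` are hypothetical stand-ins for `colnames(counts)` and `mta`.

```r
# Toy stand-ins: column names of a count matrix and a shuffled metadata table
count_cols <- c("../test-bam/0a.sorted.bam", "../test-bam/0b.sorted.bam",
                "../test-bam/4a.sorted.bam")
toy_mta <- data.frame(
  sample = c("../test-bam/4a.sorted.bam", "../test-bam/0a.sorted.bam",
             "../test-bam/0b.sorted.bam"),
  sample_name = c("day4-rep1", "day0-rep1", "day0-rep2"),
  group = c("day4", "day0", "day0")
)

# Samples present in one table but not the other (both should be empty)
missing_in_mta <- setdiff(count_cols, toy_mta$sample)
missing_in_counts <- setdiff(toy_mta$sample, count_cols)
stopifnot(length(missing_in_mta) == 0, length(missing_in_counts) == 0)

# Reorder the metadata rows to follow the column order of the counts
toy_mta <- toy_mta[match(count_cols, toy_mta$sample), ]
rownames(toy_mta) <- NULL
```

With the real objects, the same check would use `colnames(counts)` in place of `count_cols` and `mta` in place of `toy_mta`.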

os <- omicscope(gtfAnno = "../test-bam/Mus_musculus.GRCm38.102.gtf.gz",
                counts = counts,
                metadata = mta)

os
# class: omicscope 
# dim: 39732 6 
# metadata(0):
# assays(1): counts
# rownames(39732): ENSMUSG00000102693 ENSMUSG00000051951 ... ENSMUSG00000094621 ENSMUSG00000095742
# rowData names(3): gene_id gene_name gene_biotype
# colnames(6): day0-rep1 day0-rep2 ... day10-rep1 day10-rep2
# colData names(3): sample sample_name group

Alternatively, omicScope can directly parse the output file generated by featureCounts when the parameter featureCountOutput = TRUE is set. In that case, counts is the path to the featureCounts output file:

# Don't run
# 
# os <- omicscope(gtfAnno = "../test-bam/Mus_musculus.GRCm38.102.gtf.gz",
#                 counts = "featurecounts_output/counts.info.txt",
#                 featureCountOutput = TRUE,
#                 metadata = mta)
# os
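
For orientation, featureCounts writes a tab-delimited table whose first line is a comment recording the command, followed by a header row with six annotation columns (Geneid, Chr, Start, End, Strand, Length) and then one count column per BAM file. The sketch below reads such a table into a plain count matrix with base R. It only illustrates the file layout and is not a description of how omicScope parses the file internally; the file contents are toy values written to a temporary file, and `fc_file`, `fc`, and `fc_counts` are illustrative names.

```r
# Write a toy featureCounts-style table to a temporary file
fc_file <- tempfile(fileext = ".txt")
writeLines(c(
  "# Program:featureCounts; Command: (toy example)",
  paste("Geneid", "Chr", "Start", "End", "Strand", "Length",
        "a.bam", "b.bam", sep = "\t"),
  paste("geneA", "1", "100", "500", "+", "401", "13", "16", sep = "\t"),
  paste("geneB", "1", "900", "1200", "-", "301", "0", "4", sep = "\t")
), fc_file)

# Skip the leading comment line; keep column names exactly as written
fc <- read.delim(fc_file, comment.char = "#", check.names = FALSE,
                 stringsAsFactors = FALSE)

# Drop the six annotation columns and use Geneid as row names
fc_counts <- as.matrix(fc[, -(1:6)])
rownames(fc_counts) <- fc$Geneid
```

The resulting `fc_counts` has genes as rows and one column per BAM file, which is the same layout as the built-in counts matrix shown earlier.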

Once the omicscope object is successfully created, users can proceed with downstream analyses including differential expression, pathway enrichment, and visualization.