Motif occupancy

Intro

To investigate how ribosome occupancy varies across treatment conditions, cumulative distribution curves can be used to visualize ribosome enrichment at specific amino acids or peptide motifs in Ribo-seq data.

Cumulative curve

The motif_occupancy function quantifies ribosome occupancy at specific motifs or codons on individual transcripts and plots cumulative distribution curves that summarize global translational trends across treatment conditions. In the example below, loss of eIF5A is associated with an overall increase in occupancy at the following motifs:

motif_occupancy(object = obj0, 
                cds_fa = "./sac_cds.fa",
                search_type = "amino",
                do_offset_correct = TRUE,
                motif_pattern = c("PP","PPP","PPD","DPP"))
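The motifs requested above overlap (every "PPP" also contains "PP"), so it can help to see where each one falls in a sequence. The following is a minimal sketch in base R, not the package's own search code: find_motif_starts is a hypothetical helper, the peptide is made up, and whether motif_occupancy counts overlapping matches in the same way is an assumption.

```r
# Return the 1-based start position of every occurrence of `motif` in a
# protein sequence, including overlapping ones.
# Hypothetical helper for illustration only; not part of the package.
find_motif_starts <- function(protein, motif) {
  # A zero-width lookahead reports a hit at each position where the motif begins
  hits <- gregexpr(paste0("(?=", motif, ")"), protein, perl = TRUE)[[1]]
  if (hits[1] == -1) integer(0) else as.integer(hits)
}

protein <- "MAPPPDKPPD"                      # made-up peptide, not real data
motifs  <- c("PP", "PPP", "PPD", "DPP")

# One vector of start positions per motif
motif_hits <- lapply(setNames(motifs, motifs), find_motif_starts, protein = protein)
motif_hits
```

In this toy peptide "PP" is found three times and "DPP" not at all, which illustrates why occupancy at nested motifs is not independent.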

By setting return_data = TRUE, the function returns the result as a data frame:

co <- motif_occupancy(object = obj0, 
                      cds_fa = "./sac_cds.fa",
                      search_type = "amino",
                      do_offset_correct = TRUE,
                      motif_pattern = c("PP","PPP","PPD","DPP"),
                      return_data = TRUE)

# inspect the first rows of the returned table
head(co)
# # A tibble: 6 × 7
#   sample       sample_group rname             codon_pos value motif   pos
#   <chr>        <chr>        <chr>                 <dbl> <dbl> <chr> <int>
# 1 sgeIF5A-rep1 sgeIF5A      YAL002W_mRNA|VPS8        47 34.0  PP       47
# 2 sgeIF5A-rep1 sgeIF5A      YAL005C_mRNA|SSA1       462  5.69 PP      462
# 3 sgeIF5A-rep1 sgeIF5A      YAL011W_mRNA|SWC3       339 10.9  PP      339
# 4 sgeIF5A-rep1 sgeIF5A      YAL013W_mRNA|DEP1       164 13.8  PP      164
# 5 sgeIF5A-rep1 sgeIF5A      YAL014C_mRNA|SYN8       149 12.0  PP      149
# 6 sgeIF5A-rep1 sgeIF5A      YAL015C_mRNA|NTG1       327  3.22 PP      327
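A table of this shape can also be summarized outside the function. The following is a minimal sketch in base R, assuming a data frame with the same sample_group, motif, and value columns as co. The values are simulated, and the control group label "WT" is a hypothetical name that does not come from the data above.

```r
# Simulated stand-in for the returned table (values are NOT real data)
set.seed(1)
co_demo <- data.frame(
  sample_group = rep(c("WT", "sgeIF5A"), each = 200),  # "WT" is a hypothetical control label
  motif        = "PP",
  value        = c(rlnorm(200, meanlog = 1.0), rlnorm(200, meanlog = 1.5))
)

# One empirical cumulative distribution function per treatment group
ecdf_by_group <- lapply(split(co_demo$value, co_demo$sample_group), ecdf)

# Draw both curves on a log2 axis; a curve shifted to the right means higher occupancy
x_grid <- seq(min(log2(co_demo$value)), max(log2(co_demo$value)), length.out = 200)
plot(x_grid, ecdf_by_group[["WT"]](2^x_grid), type = "l",
     xlab = "log2(occupancy)", ylab = "Cumulative fraction")
lines(x_grid, ecdf_by_group[["sgeIF5A"]](2^x_grid), lty = 2)
legend("bottomright", legend = c("WT", "sgeIF5A"), lty = c(1, 2))

# Two-sample Kolmogorov-Smirnov test for a difference between the two distributions
wt <- co_demo$value[co_demo$sample_group == "WT"]
kd <- co_demo$value[co_demo$sample_group == "sgeIF5A"]
ks_res <- ks.test(kd, wt)
ks_res$p.value
```

The Kolmogorov-Smirnov test is one common choice for comparing cumulative curves; it is used here for illustration and is not necessarily what motif_occupancy reports.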

To plot cumulative distribution curves at the codon level instead, set search_type = "codon" and pass codons as motif_pattern; the example below uses the four proline codons:

motif_occupancy(object = obj0, 
                cds_fa = "./sac_cds.fa",
                search_type = "codon",
                do_offset_correct = TRUE,
                motif_pattern = c("CCA","CCT","CCG","CCC"))
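The four codons in the call above all encode proline, so a codon-level search separates positions that an amino-acid search for "P" would pool. The following is a minimal sketch of how codon positions can be located in an in-frame CDS. split_codons is a hypothetical helper, the sequence is made up, and the package's internal implementation may differ.

```r
# Split an in-frame CDS string into consecutive codons.
# Hypothetical helper for illustration only; not part of the package.
split_codons <- function(cds) {
  starts <- seq(1, nchar(cds) - 2, by = 3)
  substring(cds, starts, starts + 2)
}

cds        <- "ATGCCACCTGATCCGCCCTAA"        # made-up CDS: ATG CCA CCT GAT CCG CCC TAA
codons     <- split_codons(cds)
pro_codons <- c("CCA", "CCT", "CCG", "CCC")  # the four proline codons

# Codon positions (1-based) that match any requested codon
pro_pos <- which(codons %in% pro_codons)

# The same positions, grouped by codon identity
pos_by_codon <- split(pro_pos, codons[pro_pos])
pos_by_codon
```

Grouping positions by codon identity is what makes it possible to compare occupancy between synonymous codons rather than only between amino acids.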