This page gives a compact mental model for misha.
Use it as the first quick read before the full Manual vignette.
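Most analyses follow the same pattern: choose a track expression (what to compute), a scope (where to compute it), and an iterator (in what chunks).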
In misha this is usually one call to gextract, gscreen, or gsummary.
You are not limited to raw track names. You can pass full expressions, for example
log(dense_track + 1), dense_track / (chip.sum + 1e-6), or pmin(dense_track, 2).
All examples below assume the bundled examples database:
library(misha)
gdb.init_examples()
A track is genomic signal organized over coordinates.
One example is dense_track in the examples DB.
Useful starter commands:
gtrack.ls() # list tracks in the examples DB
gtrack.info("dense_track") # inspect type/metadata
gtrack.info("sparse_track")
For intuition, you can think of dense_track as a ChIP-seq-like coverage signal.
An interval set defines genomic regions (chrom, start, end) where you want to work.
regions <- gintervals(1, c(0, 250000), c(100000, 260000))
The iterator is the stepping policy inside the scope.
iterator = 100 -> fixed 100 bp bins
iterator = "some_sparse_track" -> iterate over that track's intervals
iterator = some_intervals_df -> iterate over explicit regions
iterator = "my_intervals_set" -> iterate directly over an intervals set
Think of it as: scope says where, iterator says in what chunks.
out <- gextract("dense_track", regions, iterator = 100)
log_out <- gextract("log(dense_track + 1)", regions, iterator = 100)
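One way to see the chunks an iterator actually produces is giterator.intervals, which returns the iterator intervals for a given expression, scope, and iterator. The following is a minimal sketch, assuming the examples database; the scope (named scope here) and the 100 bp bin size are arbitrary choices made only to keep the output short.

```r
library(misha)
gdb.init_examples()

# Hypothetical small scope, chosen only for illustration.
scope <- gintervals(1, 0, 500)

# Intervals that a fixed 100 bp iterator yields inside the scope.
bins <- giterator.intervals("dense_track", scope, iterator = 100)
print(bins)
```

Each row of bins is one chunk over which a track expression would be evaluated in a call such as gextract.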
Create and use an intervals set as an iterator:
gintervals.save("my_intervals_set", regions) # name of the new set first, then the intervals
out2 <- gextract("dense_track", gintervals.all(), iterator = "my_intervals_set")
A virtual track is a named on-the-fly transformation, not stored as a physical track file.
Examples:
gvtrack.create("chip.sum", "dense_track", "sum")
out <- gextract("chip.sum", regions, iterator = 200)
You can also shift the iterator window used by the virtual track:
gvtrack.create("chip.shifted", "dense_track", "sum")
gvtrack.iterator("chip.shifted", sshift = -100, eshift = 100)
out <- gextract("chip.shifted", regions, iterator = 200)
Here, each iterator interval is expanded by 100 bp on both sides before evaluating dense_track.
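To make the effect of the shift visible, the unshifted and shifted virtual tracks can be extracted side by side. This is a sketch under the same assumptions as the examples above (examples database, dense_track as source); the short scope is an arbitrary choice for illustration.

```r
library(misha)
gdb.init_examples()

# Same aggregation, with and without an expanded window.
gvtrack.create("chip.sum", "dense_track", "sum")
gvtrack.create("chip.shifted", "dense_track", "sum")
gvtrack.iterator("chip.shifted", sshift = -100, eshift = 100)

# Extract both virtual tracks over the same 200 bp iterator bins.
cmp <- gextract("chip.sum", "chip.shifted", gintervals(1, 0, 2000), iterator = 200)
head(cmp)
```

The start and end columns still describe the 200 bp iterator bins; only the window over which chip.shifted is evaluated is widened.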
Virtual tracks are session objects (easy to list with gvtrack.ls and delete with gvtrack.rm).
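A short sketch of that lifecycle, assuming the examples database; the virtual track name chip.max is made up for this illustration.

```r
library(misha)
gdb.init_examples()

# Define a virtual track that takes the maximum of dense_track per iterator bin.
gvtrack.create("chip.max", "dense_track", "max")

# List the virtual tracks defined in this session and inspect one of them.
print(gvtrack.ls())
print(gvtrack.info("chip.max"))

# Remove it again; the underlying dense_track is untouched.
gvtrack.rm("chip.max")
remaining <- gvtrack.ls()
```

Because virtual tracks live in the session, a fresh R session starts without them and they need to be created again.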
library(misha)
gdb.init_examples()

# 1) pick scope
regions <- gintervals(1, 0, 50000)

# 2) inspect available tracks
print(gtrack.ls())

# 3) extract signal with a chosen iterator
chip <- gextract("dense_track", regions, iterator = 100)

# 4) screen high-signal bins (as a simple peak-like filter)
hi_chip <- gscreen("dense_track > 0.6", regions, iterator = 100)

# 5) summarize distribution/coverage
stats <- gsummary("dense_track", regions, iterator = 100)
A PWM/PSSM is a motif model over A/C/G/T. In misha, a common pattern is:
regions <- gintervals(1, c(1000, 2000), c(1020, 2020))
seqs <- gseq.extract(regions)
pssm <- matrix(c(
  0.80, 0.05, 0.10, 0.05,
  0.10, 0.10, 0.70, 0.10,
  0.05, 0.80, 0.05, 0.10,
  0.10, 0.10, 0.10, 0.70
), ncol = 4, byrow = TRUE)
colnames(pssm) <- c("A", "C", "G", "T")
scores <- gseq.pwm(seqs, pssm, mode = "lse")
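Before scoring, it can help to sanity-check the matrix itself. The sketch below uses only base R and assumes that each row of the PSSM is one motif position and is meant to be a probability distribution over A/C/G/T; the variable consensus is introduced here for illustration.

```r
# Same illustrative PSSM as above: rows are motif positions, columns are bases.
pssm <- matrix(c(
  0.80, 0.05, 0.10, 0.05,
  0.10, 0.10, 0.70, 0.10,
  0.05, 0.80, 0.05, 0.10,
  0.10, 0.10, 0.10, 0.70
), ncol = 4, byrow = TRUE)
colnames(pssm) <- c("A", "C", "G", "T")

# Each position should sum to 1 (within floating-point tolerance).
stopifnot(all(abs(rowSums(pssm) - 1) < 1e-8))

# Most likely base at each position, joined into a consensus string.
consensus <- paste(colnames(pssm)[apply(pssm, 1, which.max)], collapse = "")
print(consensus) # "AGCT"
```

A consensus string like this is only a summary of the motif; the PWM score still uses the full per-position probabilities.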
If your database has motif files under pssms/, you can create a genome-wide PWM-energy track with gtrack.create_pwm_energy(...).