View source: R/run_genespace.R
| run_genespace | R Documentation | 
run_genespace Run the entire GENESPACE pipeline from beginning to end
with one function call.
run_genespace(
  gsParam,
  overwrite = FALSE,
  overwriteBed = overwrite,
  overwriteSynHits = overwrite,
  overwriteInBlkOF = TRUE,
  makePairwiseFiles = FALSE
)
gsParam | 
 A list of genespace parameters created by init_genespace.  | 
overwrite | 
 logical, should all raw files, except orthofinder results, be overwritten?  | 
overwriteBed | 
 logical, should the bed file be re-created and overwritten?  | 
overwriteSynHits | 
 logical, should the annotated blast files be overwritten?  | 
overwriteInBlkOF | 
 logical, should in-block orthogroups be overwritten?  | 
makePairwiseFiles | 
 logical, should pairwise hits in blocks files be generated?  | 
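The usage signature above sets overwriteBed and overwriteSynHits to default to the value of overwrite, while overwriteInBlkOF defaults to TRUE on its own. The sketch below illustrates how those defaults resolve, using a hypothetical stand-in function (resolve_overwrite is not part of GENESPACE) that copies only the argument defaults and runs no pipeline step.

```r
# Hypothetical helper: mirrors the default arguments of run_genespace()
# and simply returns the resolved flags. It does not touch any files.
resolve_overwrite <- function(overwrite = FALSE,
                              overwriteBed = overwrite,
                              overwriteSynHits = overwrite,
                              overwriteInBlkOF = TRUE,
                              makePairwiseFiles = FALSE) {
  c(overwrite = overwrite,
    overwriteBed = overwriteBed,
    overwriteSynHits = overwriteSynHits,
    overwriteInBlkOF = overwriteInBlkOF,
    makePairwiseFiles = makePairwiseFiles)
}

# All defaults: only overwriteInBlkOF is switched on.
defaults <- resolve_overwrite()

# Setting overwrite alone also switches on the two flags that default to it.
all_on <- resolve_overwrite(overwrite = TRUE)

# An explicit value takes precedence over the inherited default.
keep_bed <- resolve_overwrite(overwrite = TRUE, overwriteBed = FALSE)

print(rbind(defaults, all_on, keep_bed))
```

In practice this means that run_genespace(gsParam, overwrite = TRUE) would also request fresh bed and annotated blast files unless those flags are set to FALSE explicitly.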
The function calls required to run the full genespace pipeline are listed below, in order. See each function for a detailed description. Also see 'init_genespace' for details on parameter specifications.
'run_orthofinder' runs orthofinder or finds and copies over data from a previous run.
'set_syntenyParams' converts parameters in the gsParam list into a matrix of file paths and parameters for each pairwise combination of query and target genomes
'annotate_bed' reads in all of the bed files, concatenates them and adds some important additional information, including gene rank order, orthofinder IDs, orthogroup information, tandem array identity etc.
'annotate_blast' reads in all the blast files and adds information from the annotated/combined bed file
'synteny' is the main engine for genespace. It flags syntenic blocks and makes dotplots
'build_synOGs' integrates syntenic orthogroups across all blast files
'run_orthofinderInBlk' optionally re-runs orthofinder within each syntenic block, returning phylogenetically hierarchical orthogroups (HOGs)
'integrate_synteny' interpolates syntenic position of all genes across all genomes
'pangenes' combines positional and orthogroup information into a single matrix anchored to the gene order coordinates of a single reference
'plot_riparian' is the primary genespace plotting routine, which stacks the genomes and connects syntenic regions to color-coded reference chromosomes
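As a quick reference, the step names above can be kept as an ordered vector and checked against an installed copy of the package. This is only a sketch: it assumes the package is installed under the name "GENESPACE", makes no claim about each function's arguments or return value, and reports every step as unavailable when the package is missing.

```r
# Step names copied from the list above, in the order they are described.
pipeline_steps <- c(
  "run_orthofinder", "set_syntenyParams", "annotate_bed", "annotate_blast",
  "synteny", "build_synOGs", "run_orthofinderInBlk", "integrate_synteny",
  "pangenes", "plot_riparian"
)

# Assumption: the package namespace is called "GENESPACE".
has_pkg <- requireNamespace("GENESPACE", quietly = TRUE)

# Look each name up in the package namespace; skipped if not installed.
available <- vapply(
  pipeline_steps,
  function(f) {
    has_pkg && exists(f, envir = asNamespace("GENESPACE"), inherits = FALSE)
  },
  logical(1)
)

step_table <- data.frame(
  order = seq_along(pipeline_steps),
  step = pipeline_steps,
  available = available,
  row.names = NULL
)
print(step_table)
```

For any step marked as available, help(step_name, package = "GENESPACE") opens its own documentation page, for example help("synteny", package = "GENESPACE").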
a gsParam list.
## Not run: 
###############################################
# -- change paths to those valid on your system
genomeRepo <- "~/path/to/store/rawGenomes"
wd <- "~/path/to/genespace/workingDirectory"
path2mcscanx <- "~/path/to/MCScanX/"
###############################################
# recursive = TRUE also creates any missing parent directories
dir.create(genomeRepo, recursive = TRUE)
dir.create(wd, recursive = TRUE)
rawFiles <- download_exampleData(filepath = genomeRepo)
parsedPaths <- parse_annotations(
  rawGenomeRepo = genomeRepo,
  genomeDirs = c("human", "chicken"),
  genomeIDs = c("human", "chicken"),
  presets = "ncbi",
  genespaceWd = wd)
gpar <- init_genespace(
  wd = wd, nCores = 4,
  path2mcscanx = path2mcscanx)
out <- run_genespace(gpar)
## End(Not run)