synteny: flag and split syntenic hits

View source: R/synteny.R

synteny: R Documentation

flag and split syntenic hits

Description

synteny: From an annotated BLAST file, assigns each hit to a syntenic block or marks it as non-syntenic.

synteny: The main engine to call syntenic blocks and regions.

find_selfSyn: Self hits in which the genes are array representatives serve as the anchors. Also pulls all in-buffer hits around the anchors.

synteny_engine: The main engine for synteny discovery. Takes a "hits" data.table and assigns block IDs, flags whether a gene is an anchor in the block, and flags whether it is within a syntenic buffer of the block anchor hits. The steps are as follows:

1) Filter to initial hits: onlyOgAnchors (if TRUE) and topN hits depending on ploidy. Build global collinear hits by piping these hits into MCScanX.

2) Cluster global collinear hits into large regions by searching within synRad, then pull all hits (or only OG hits) within synRad of the global anchor hits.

3) Re-run MCScanX within the large regions, then re-cluster and re-cull region hits within synRad. Split overlapping regions. These are the "regions" named in the "regID" column.

4) For each region, pull all hits (regardless of score, OG, etc.) and re-run MCScanX. Cluster these 'potential' anchors into blocks with blkRadius as the search radius and blkSize as the minimum size. The resulting hits are the anchors, flagged 'isAnchor = TRUE'.

5) Split anchors of interleaved blocks, then extract hits within the physical bounds of the blocks and within blkRadius distance of an anchor (within-block distance). These hits are flagged 'isSyntenic = TRUE'.
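The radius test in step 5 can be illustrated with a small base R sketch. This is not the GENESPACE implementation: the helper flag_in_buffer, the column names ord1 and ord2, and the use of Euclidean distance in gene rank-order space are all assumptions made for illustration, and the sketch covers only the "within a radius of an anchor" part, not the block-bounds check.

```r
# Toy hits table. ord1/ord2 are assumed to be gene rank-order positions
# in genome 1 and genome 2 (hypothetical column names).
hits <- data.frame(
  ord1 = c(1, 2, 3, 4, 10, 50),
  ord2 = c(1, 2, 3, 6, 11, 5),
  isAnchor = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

# Hypothetical helper: TRUE where a hit lies within `radius` of at least
# one anchor hit. Euclidean distance is an assumption, not taken from
# the package documentation.
flag_in_buffer <- function(hits, radius) {
  anchors <- hits[hits$isAnchor, c("ord1", "ord2")]
  vapply(seq_len(nrow(hits)), function(i) {
    d <- sqrt((anchors$ord1 - hits$ord1[i])^2 +
              (anchors$ord2 - hits$ord2[i])^2)
    any(d <= radius)
  }, logical(1))
}

hits$isSyntenic <- flag_in_buffer(hits, radius = 5)
print(hits)
```

Anchors are at distance zero from themselves, so they are always flagged. The fourth hit sits close to the anchor diagonal and is flagged, while the last two hits fall outside the radius and are not.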

Usage

synteny(gsParam, verbose = TRUE)

find_selfSyn(hits, synRad)

synteny_engine(
  hits,
  onlyOgAnchors,
  blkSize,
  blkRadius,
  tmpDir,
  topn1,
  topn2,
  nGaps,
  synRad,
  MCScanX_hCall,
  onlySameChrs,
  verbose = FALSE
)

Arguments

gsParam

A list of genespace parameters. This should be created by init_genespace.

verbose

logical, should updates be printed to the console?

hits

data.table of hits, see read_allBlast

synRad

see init_genespace

onlyOgAnchors

see init_genespace

blkSize

see init_genespace

blkRadius

see init_genespace

tmpDir

see init_genespace

topn1

integer, the number of best scoring hits per gene in genome 1

topn2

integer, the number of best scoring hits per gene in genome 2

nGaps

see init_genespace

MCScanX_hCall

see init_genespace


If called, synteny returns its own arguments.

onlySameChrs

logical, should only hits on chromosomes with the same name be permitted?
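As a rough illustration of how topn1, topn2 and onlySameChrs could act on a hits table, the base R sketch below keeps the best scoring hits per gene and optionally drops hits between differently named chromosomes. The helpers rank_within and filter_hits, the column names (id1, id2, chr1, chr2, score), and the choice to require both rank conditions at once are assumptions for illustration only; the package may combine these filters differently.

```r
# Toy hits table with hypothetical column names.
hits <- data.frame(
  id1   = c("a1", "a1", "a1", "a2", "a2"),
  id2   = c("b1", "b2", "b3", "b1", "b4"),
  chr1  = c("Chr1", "Chr1", "Chr1", "Chr1", "Chr1"),
  chr2  = c("Chr1", "Chr2", "Chr1", "Chr1", "Chr1"),
  score = c(100, 90, 80, 70, 60),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rank of each score within its group (1 = best).
rank_within <- function(score, group) {
  ave(-score, group, FUN = function(x) rank(x, ties.method = "first"))
}

# Hypothetical filter: keep a hit only if it is among the topn1 best hits
# of its genome 1 gene AND among the topn2 best hits of its genome 2 gene.
filter_hits <- function(hits, topn1, topn2, onlySameChrs) {
  r1 <- rank_within(hits$score, hits$id1)
  r2 <- rank_within(hits$score, hits$id2)
  keep <- r1 <= topn1 & r2 <= topn2
  if (onlySameChrs) {
    # Drop hits whose two chromosomes have different names.
    keep <- keep & hits$chr1 == hits$chr2
  }
  hits[keep, , drop = FALSE]
}

out <- filter_hits(hits, topn1 = 2, topn2 = 1, onlySameChrs = TRUE)
print(out)
```

With these toy values, the a1/b3 hit is dropped as the third best hit for a1, the a2/b1 hit is dropped as the second best hit for b1, and the a1/b2 hit is dropped because its chromosomes differ.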


jtlovell/GENESPACE documentation built on Jan. 25, 2025, 6:39 a.m.