AHRD_on_gene_clusters-package: Automated Assignment of Human Readable Descriptions on Gene...

Description

AHRD.on.gene.clusters annotates gene clusters with human readable descriptions. Gene clusters are sets of amino acid sequences that share significant similarity.

Gene clusters, interpreted as gene families, support several kinds of conclusions and interpretations, such as detecting gene expansion or loss in a certain organism, or reconstructing the evolutionary history of your genes of interest through phylogenetic analysis.

The problem is that no cluster comes with a short, concise, and trustworthy human readable description that gives you a quick overview of what kind of gene family you are dealing with. And typically you have many such clusters: in the tomato genome, for instance, we had 17,490!

This version of AHRD (see https://github.com/groupschoof/AHRD for the original used to describe single query proteins) provides a simple method to annotate such gene clusters.

Details

Package: AHRD.on.gene.clusters
Type: Package
Version: 1.0
Date: 2014-06-12
License: GPL 3.0

Author(s)

Asis Hallab

Maintainer: <hallab@mpipz.mpg.de>

Examples

# Annotate a set of gene families with short human readable descriptions

# Have your gene clusters ready in a named list of character vectors, in which
# each gene family is represented by a single character vector of gene
# accessions. The list names are the family names.
mcl.fams <- list( 'group1'=c( 'AT2G98278', 'Solyc4G29337' ), 'group2'=c( ...

# Have the InterPro annotations ready in a data.frame. The data.frame must have
# two columns: the first holds the gene accession and the second the annotated
# InterPro entry. To get correct frequencies, each protein's InterPro annotation
# must appear only once. To ensure this, apply unique(...) after reading in the
# InterPro annotations.
gene.2.ipr.df <- unique( read.table( 'my.ipr.annos.csv', stringsAsFactors=FALSE ) )

# Load the downloaded InterPro database into memory:
ipr.db <- parseInterProXML( './interpro.xml' ) 

# Annotate your gene families:
mcl.fams.hrds <- lapply( mcl.fams, annotateCluster, ipr.annos=gene.2.ipr.df,
  interpro.database=ipr.db )
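# Alternatively, run the annotation in parallel; usage of mclapply is
# recommended. Note that mclapply forks and requires mc.cores=1 on Windows.
library( parallel )
options('mc.cores'=detectCores())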
mcl.fams.hrds <- mclapply( mcl.fams, annotateCluster, ipr.annos=gene.2.ipr.df,
  interpro.database=ipr.db )
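
The input shapes described in the comments above can be illustrated with a small, self-contained sketch in base R. It does not call any AHRD.on.gene.clusters function. All data, the two-column family table layout, and the helper `countIprInFamily` are hypothetical and exist only for illustration. The sketch shows one way to build the named list of gene families with `split()`, and why duplicated annotation rows distort InterPro frequencies unless `unique()` is applied.

```r
# Hypothetical two-column table of family memberships (toy data).
fam.tbl <- data.frame(
  family = c('group1', 'group1', 'group2'),
  gene   = c('geneA', 'geneB', 'geneC'),
  stringsAsFactors = FALSE
)

# split() yields a named list of character vectors, one per family.
toy.fams <- split(fam.tbl$gene, fam.tbl$family)

# Hypothetical InterPro annotations; the first row is duplicated on purpose.
toy.ipr <- data.frame(
  gene = c('geneA', 'geneA', 'geneB', 'geneC'),
  ipr  = c('IPR000001', 'IPR000001', 'IPR000002', 'IPR000001'),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of the package): counts the annotation rows
# per InterPro entry among the genes of one family.
countIprInFamily <- function(genes, annos) {
  table(annos$ipr[annos$gene %in% genes])
}

# Without de-duplication, IPR000001 is counted twice for geneA.
raw.counts <- countIprInFamily(toy.fams[['group1']], toy.ipr)

# After unique(), each gene-to-entry pair contributes exactly once.
dedup.counts <- countIprInFamily(toy.fams[['group1']], unique(toy.ipr))

print(raw.counts)
print(dedup.counts)
```

Building the list with `split()` is just one convenient option; any named list of character vectors with the same shape should serve equally well as input.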

groupschoof/AHRD_on_gene_clusters documentation built on May 17, 2019, 8:38 a.m.