similarity: Compute genome content similarity index

Description

This function calculates the genome content similarity index for two genomes.

Usage

similarity(gene_sets1, gene_sets2, threshold = 0.05, size = 4)

Arguments

gene_sets1

list of vectors of 1s and 0s that represent the presence (1) and absence (0) of gene families from each of the gene sets in the first genome.

gene_sets2

list of vectors of 1s and 0s that represent the presence (1) and absence (0) of gene families from each of the gene sets in the second genome.

threshold

fraction of gene families from a gene set that must be present in at least one of the genomes for the gene set to contribute to the genome content similarity index calculation (default is 0.05).

size

minimum size a gene set must have to be considered (default is 4).
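
To make the expected input shape and the two filtering arguments more concrete, here is a minimal sketch on toy data. Everything in it is hypothetical: the set names, the toy vectors, and the helper passes_filters, which is not part of the package. It assumes one possible reading of the argument descriptions, namely that a gene set is kept when it has at least size families and the fraction of its families present reaches threshold in at least one of the two genomes. The package may apply these criteria differently.

```r
# Toy inputs (hypothetical): one list element per gene set,
# one 0/1 entry per gene family in that set.
toy_sets1 <- list(
  setA = c(1, 1, 0, 0, 1),
  setB = c(0, 0, 0, 0),
  setC = c(1, 0, 1)
)
toy_sets2 <- list(
  setA = c(1, 0, 0, 0, 1),
  setB = c(0, 0, 0, 0),
  setC = c(1, 1, 1)
)

# Hypothetical helper, NOT the package's implementation: flags gene sets
# that pass one reading of the `size` and `threshold` criteria.
passes_filters <- function(sets1, sets2, threshold = 0.05, size = 4) {
  mapply(function(v1, v2) {
    # Assumed: the set must contain at least `size` gene families.
    big_enough <- length(v1) >= size
    # Assumed: the fraction of families present must reach `threshold`
    # in at least one of the two genomes.
    present_enough <- mean(v1) >= threshold || mean(v2) >= threshold
    big_enough && present_enough
  }, sets1, sets2)
}

kept <- passes_filters(toy_sets1, toy_sets2)
kept
```

Under these assumptions only setA is kept: setB has no families present in either genome, and setC has fewer than four families.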

Value

Genome content similarity index computed for the two genomes encoding gene_sets1 and gene_sets2, respectively.

References

  1. Kamneva OK. Genome composition of microbes predicts their co-occurrence in the environment. In review.

See Also

similarity_by_set, functional_association, set_representation, reference_gene_sets.

Examples

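# Load the example gene family data for the two genomes
data(families_Bf)
data(families_Er)
# Convert each genome's gene families into per-gene-set presence/absence vectors
sets_Bf = set_representation(families = families_Bf, gene_sets = reference_gene_sets)
sets_Er = set_representation(families = families_Er, gene_sets = reference_gene_sets)
# Compute the genome content similarity index for the two genomes
sim_Bf_Er = similarity(gene_sets1 = sets_Bf, gene_sets2 = sets_Er, threshold = 0.05, size = 4)
sim_Bf_Er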

olgakamneva/genomics2ecology documentation built on May 24, 2019, 12:51 p.m.