similarity_by_set: Compute genome content similarity by gene set


Description

This function calculates the genome content similarity index between two genomes, one gene set at a time.

Usage

similarity_by_set(gene_sets1, gene_sets2)

Arguments

gene_sets1

List of vectors of 1s and 0s which represent the presence and absence of gene families from each of the gene sets in the first genome.

gene_sets2

List of vectors of 1s and 0s which represent the presence and absence of gene families from each of the gene sets in the second genome.

Value

A data frame with 5 variables: set_ID, the gene set identifier; score, the genome content similarity computed between gene_sets1 and gene_sets2 for each individual gene set; set_size, the number of gene families in the gene set; in_genome1 and in_genome2, the number of gene families from the gene set present in the first and second genomes, respectively.
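
The sketch below illustrates the input shape and the three count-based columns using base R only. It is not the package's implementation: the objects toy_sets1, toy_sets2 and toy_counts and the set names setA and setB are hypothetical, and it assumes both inputs are named lists covering the same gene sets in the same order. The score column is left out because its formula is not given on this page.

```r
# Hypothetical toy inputs: each vector marks presence (1) or absence (0)
# of every gene family belonging to one gene set.
toy_sets1 <- list(setA = c(1, 0, 1, 1), setB = c(0, 0, 1))
toy_sets2 <- list(setA = c(1, 1, 0, 1), setB = c(0, 1, 1))

# Assumption: both genomes are described over the same gene sets.
stopifnot(identical(names(toy_sets1), names(toy_sets2)))

# Count-based columns as described under Value (score omitted).
toy_counts <- data.frame(
  set_ID     = names(toy_sets1),
  set_size   = unname(lengths(toy_sets1)),                    # families per set
  in_genome1 = unname(vapply(toy_sets1, sum, numeric(1))),    # present in genome 1
  in_genome2 = unname(vapply(toy_sets2, sum, numeric(1)))     # present in genome 2
)
print(toy_counts)
```

Each row corresponds to one gene set, so the result has as many rows as there are elements in the input lists.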

References

  1. Kamneva OK. Genome composition of microbes predicts their co-occurrence in the environment. In review.

See Also

similarity, functional_association, set_representation, reference_gene_sets.

Examples

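# Load the example gene family data for the two genomes
data(families_Bf)
data(families_Er)
# Convert each genome into its gene set representation
sets_Bf = set_representation(families = families_Bf, gene_sets = reference_gene_sets)
sets_Er = set_representation(families = families_Er, gene_sets = reference_gene_sets)
# Compute the similarity for every gene set and inspect the first rows
sim_Bf_Er = similarity_by_set(gene_sets1 = sets_Bf, gene_sets2 = sets_Er)
head(sim_Bf_Er)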

olgakamneva/genomics2ecology documentation built on May 24, 2019, 12:51 p.m.