GeneConvert | R Documentation |
This function can convert different gene ID types within one species or between two species using the biomart service.
GeneConvert(
geneID,
geneID_from_IDtype = "symbol",
geneID_to_IDtype = "entrez_id",
species_from = "Homo_sapiens",
species_to = NULL,
Ensembl_version = 103,
biomart = NULL,
mirror = NULL,
max_tries = 5
)
geneID |
A vector of the geneID character. |
geneID_from_IDtype |
Gene ID type of the input |
geneID_to_IDtype |
Gene ID type(s) to convert to. e.g. "symbol", "ensembl_id", "entrez_id" |
species_from |
Latin names for animals of the input geneID. e.g. "Homo_sapiens","Mus_musculus" |
species_to |
Latin names for animals of the output geneID. e.g. "Homo_sapiens","Mus_musculus" |
Ensembl_version |
Ensembl database version. If NULL, use the current release version. |
biomart |
The name of the BioMart database that you want to connect to. Possible options include "ensembl", "protists_mart", "fungi_mart", and "plants_mart". |
mirror |
Specify an Ensembl mirror to connect to. The valid options here are 'www', 'uswest', 'useast', 'asia'. |
max_tries |
The maximum number of attempts to connect with the BioMart service. |
A list with the following elements:
geneID_res:
A data.frame contains the all gene IDs mapped in the database with columns: 'from_IDtype','from_geneID','to_IDtype','to_geneID'
geneID_collapse:
The data.frame contains all the successfully converted gene IDs, and the output gene IDs are collapsed into a list. As a result, the 'from_geneID' column (which is set as the row names) of the data.frame is unique.
geneID_expand:
The data.frame contains all the successfully converted gene IDs, and the output gene IDs are expanded.
Ensembl_version:
Ensembl database version.
Datasets:
Datasets available in the selected BioMart database.
Attributes:
Attributes available in the selected BioMart database.
geneID_unmapped:
A character vector of gene IDs that are unmapped in the database.
res <- GeneConvert(
geneID = c("CDK1", "MKI67", "TOP2A", "AURKA", "CTCF"),
geneID_from_IDtype = "symbol",
geneID_to_IDtype = "entrez_id",
species_from = "Homo_sapiens",
species_to = "Mus_musculus",
Ensembl_version = 103
)
str(res)
# Convert the human genes to mouse homologs and replace the raw counts in a Seurat object.
data("pancreas_sub")
counts <- pancreas_sub@assays$RNA@counts
res <- GeneConvert(
geneID = rownames(counts),
geneID_from_IDtype = "symbol",
geneID_to_IDtype = "symbol",
species_from = "Mus_musculus",
species_to = "Homo_sapiens",
Ensembl_version = 103
)
# Check the number of input and converted gene IDs
input_genes <- length(rownames(counts))
db_genes <- length(unique(res$geneID_res$from_geneID))
converted_genes_input <- length(unique(res$geneID_collapse$from_geneID))
converted_genes_output <- length(unique(res$geneID_expand$symbol))
cat("Number of input gene IDs:", input_genes, "\n")
cat("Number of gene IDs mapped in the database:", db_genes, "\n")
cat("Number of input gene IDs that were successfully converted:", converted_genes_input, "\n")
cat("Number of converted gene IDs:", converted_genes_output, "\n")
homologs_counts <- aggregate(
x = counts[res$geneID_expand[, "from_geneID"], ],
by = list(res$geneID_expand[, "symbol"]), FUN = sum
)
rownames(homologs_counts) <- homologs_counts[, 1]
homologs_counts <- as(as_matrix(homologs_counts[, -1]), "dgCMatrix")
homologs_counts
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.