View source: R/Accession_Conversion_Functions.R
convertAccession converts a vector of accessions (all belonging to the same accession type) into all possible accession types within SRA and GEO. If no SRA/GEO conversion is possible, the missing accession types are marked as NA.
convertAccession(acc_vector)
acc_vector - A vector of accessions (all must belong to the same type)
A data frame with conversion between all accession types
convertAccession accepts any of the 4 SRA or 2 GEO accession types (see section 'Background Information on Accession Types').
convertAccession accepts only one accession type at a time.
For example, the following queries are NOT allowed:
convertAccession(c("SRR_____", "SRP_____"))
convertAccession(c("GSE_____", "SRP_____"))
To obtain the above results, it is necessary to run separate queries for each accession type and, if desired, bind the data frames together (e.g. rbind(convertAccession("SRR_____"), convertAccession("SRP_____"))).
SRA accessions differing only by the first letter belong to the same type, hence it is possible to run: convertAccession(c("SRP_____", "ERP_____"))
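The separate-queries approach described above can be sketched as a small wrapper. Note that convert_mixed and stub_converter are hypothetical names introduced here for illustration and are not part of SpiderSeqR. The sketch assumes that accessions consist of a three-letter prefix followed by digits, and that the converter returns data frames with identical columns so that rbind works.

```r
# Hypothetical helper (not part of SpiderSeqR): convert a mixed vector of
# accessions by querying each accession type separately and binding the results.
convert_mixed <- function(acc_vector, converter) {
  # Reject anything that does not look like an SRA or GEO accession
  ok <- grepl("^([SED]R[PSXR]|GS[EM])[0-9]+$", acc_vector)
  if (!all(ok)) {
    stop("Unrecognised accession(s): ", paste(acc_vector[!ok], collapse = ", "))
  }
  # Group by the three-letter prefix, treating S/E/D variants as one type
  type_key <- sub("^[SED]R", "SR", substr(acc_vector, 1, 3))
  groups <- split(acc_vector, type_key)
  # One query per accession type, then stack the resulting data frames
  do.call(rbind, unname(lapply(groups, converter)))
}

# Stand-in converter so the sketch runs without SpiderSeqR or its databases
stub_converter <- function(x) {
  data.frame(input = x, batch_size = length(x), stringsAsFactors = FALSE)
}

mixed <- c("SRR3707942", "SRP134708", "DRP003157")
result <- convert_mixed(mixed, stub_converter)
print(result)

# With the SpiderSeqR environment set up, the real call would be:
# convert_mixed(mixed, convertAccession)
```

Passing the converter as an argument keeps the grouping logic testable on its own. The two study-level accessions (SRP and DRP) end up in one query, while the run accession is queried separately.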
The function outputs a data frame with conversion of the input accessions into all possible types.
In the best case scenario, i.e. if an accession exists in both SRA and GEO databases, these would include all 6 accession types (SRR, SRX, SRS, SRP, GSM, GSE).
If an accession exists only in one of the databases, the conversion will be limited to that one database. For example, if an accession only exists in SRA, only SRA accessions will be provided, whilst the GEO columns will be populated with NAs.
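As a rough illustration of how the NA-filled columns might be used, the sketch below separates SRA-only rows from rows converted across both databases. The data frame is a toy stand-in: its column names are assumptions and its values are placeholders, not real accessions or the actual output of convertAccession.

```r
# Toy stand-in for a convertAccession() result. Column names and values are
# placeholders for illustration, not real accessions or the real output.
converted <- data.frame(
  run_accession = c("SRR_A", "SRR_B", "SRR_C"),
  sample        = c("GSM_A", NA, NA),
  series_id     = c("GSE_A", NA, NA),
  stringsAsFactors = FALSE
)

geo_cols <- c("sample", "series_id")  # assumed names of the GEO columns

# TRUE for rows where every GEO column is NA, i.e. SRA-only entries
sra_only <- rowSums(!is.na(converted[geo_cols])) == 0

converted[sra_only, ]   # rows with no GEO counterpart
converted[!sra_only, ]  # rows converted across both databases
```

With a real result, geo_cols would need to be adjusted to the column names actually returned.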
The conversion between SRA and GEO databases is based on a custom database generated by the startSpiderSeqR() function. To ensure best results, make sure that you are using the most up-to-date versions of the databases.
To improve results, you can do the following:
Download the most up-to-date versions of the SRAmetadb.sqlite and GEOmetadb.sqlite files - this is done by running startSpiderSeqR, specifying an appropriate argument for the expiry period of database files (e.g. startSpiderSeqR(path = getwd(), general_expiry = 1))
Generate a fresh custom database for conversion between accessions (SRR_GSM.sqlite) - this is also done by running startSpiderSeqR, specifying an appropriate argument for the expiry period of the database file
As a last resort, manually search for the missing conversions online
NOTE: because the SRR_GSM.sqlite database is machine-generated, there is some risk that it does not include conversions that were recorded in the source databases in a non-standard way. If in doubt, it is worth checking the accession page online. However, users should be aware that the overlap between SRA and GEO is only about 20% (at the time of writing), so most entries will not have corresponding accession numbers in the other database.
The two lists below include accession types within SRA and GEO respectively.
All of these are supported by the convertAccession function.
SRA
SRP or DRP or ERP - project_accession
SRS or DRS or ERS - sample_accession
SRX or DRX or ERX - experiment_accession
SRR or DRR or ERR - run_accession
NOTE: depending on the location of the database (NCBI, EBI or DDBJ), these accessions might begin with a different letter (S, E or D), so the accession levels can be either SRP/SRX/SRS/SRR or ERP/ERX/ERS/ERR or DRP/DRX/DRS/DRR. Accessions beginning with 'S' are by far the most common.
GEO
GSE - series_id
GSM - sample
NOTE: the GEO accession system is further complicated by the existence of 'superseries', which act as higher-level series. In these cases a given GSM belongs to multiple (at least two) GSEs: its series_id and its superseries.
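The mapping in the two lists above can be expressed as a small lookup, which may help when checking that a vector contains only one accession type before querying. accession_level is a hypothetical helper, not a SpiderSeqR function. It assumes accessions are a three-letter prefix followed by digits, and returns NA for anything else.

```r
# Hypothetical helper (not part of SpiderSeqR): map each accession to the
# level name used in the lists above, based on its three-letter prefix.
accession_level <- function(acc) {
  sra_levels <- c(
    P = "project_accession",
    S = "sample_accession",
    X = "experiment_accession",
    R = "run_accession"
  )
  out <- rep(NA_character_, length(acc))

  # SRA: first letter is S, E or D; the third letter encodes the level
  is_sra <- grepl("^[SED]R[PSXR][0-9]+$", acc)
  out[is_sra] <- unname(sra_levels[substr(acc[is_sra], 3, 3)])

  # GEO: series and samples
  out[grepl("^GSE[0-9]+$", acc)] <- "series_id"
  out[grepl("^GSM[0-9]+$", acc)] <- "sample"
  out
}

accs <- c("SRP134708", "DRP003157", "SRR3707942", "GSM2027840")
lvl <- accession_level(accs)
print(lvl)

# A vector is safe to pass in a single query if it maps to exactly one level
length(unique(lvl)) == 1
```

For the four example accessions the final check is FALSE, since they span project, run and GEO sample levels.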
Other Workflow functions: addMissingSamples(), filterByTermByAccessionLevel(), filterByTerm(), searchForAccession()
Other Core functions: searchForAccession()
# Setup SpiderSeqR environment first (please use non-demo version)
startSpiderSeqRDemo()
convertAccession("SRP134708")
convertAccession("SRR3707942")
convertAccession("GSM2027840")
# Note that DRP, ERP and SRP are of the same accession type (study level)
convertAccession(c("DRP003157", "SRP061795"))