lfmSingleSequence: Show significant clusters of mutations of every gene in a...
In LowMACA: LowMACA - Low frequency Mutation Analysis via Consensus Alignment


The method lfmSingleSequence (low frequency mutations in Single Sequence) launches the lfm method on every gene or domain inside a LowMACA object without aligning the sequences.

lfmSingleSequence(object, metric='qvalue', threshold=0.05,
                  conservation=0.1,
                  BPPARAM=bpparam("SerialParam"),
                  mail=NULL,
                  perlCommand="perl",
                  verbose=FALSE)

`object`	a LowMACA class object
`metric`	a character that defines whether to use 'pvalue' or 'qvalue' to select significant positions. Default: 'qvalue'
`threshold`	a numeric element between 0 and 1 defining the threshold of significance for the defined metric. Default: 0.05
`conservation`	a numeric value in the range of 0-1 that defines the threshold of trident conservation score to include the specified position. Default: 0.1
`BPPARAM`	an object of class `BiocParallelParam-class` specifying parameters related to the parallel execution of some of the tasks and calculations within this function. See function `register` from the `BiocParallel` package.
`mail`	if not NULL, it must be a valid email address, which is used to access the EBI clustalo web service. Default: NULL, which uses a local clustalo installation
`perlCommand`	a character string containing the path to the Perl executable. If missing, "perl" is used as default. Only used in web mode
`verbose`	logical. Whether to print verbose output. Default: FALSE
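To make the interplay of `metric` and `threshold` concrete, here is a minimal base-R sketch on invented p-values. The helper `select_significant` is hypothetical and not part of LowMACA. Two assumptions are made only for this illustration and are not confirmed by this page: that the q-values are Benjamini-Hochberg adjusted p-values, and that the comparison with the threshold is strict.

```r
# Invented per-position p-values, for illustration only
pvalue <- c(0.001, 0.01, 0.04, 0.2, 0.6)

# ASSUMPTION: q-values are modelled here as Benjamini-Hochberg adjusted p-values
qvalue <- p.adjust(pvalue, method = "BH")

# Hypothetical helper (not a LowMACA function): indices of the positions
# whose chosen metric falls below the threshold
select_significant <- function(pvalue, qvalue,
                               metric = c("qvalue", "pvalue"),
                               threshold = 0.05) {
  metric <- match.arg(metric)
  stopifnot(threshold >= 0, threshold <= 1)
  score <- if (metric == "qvalue") qvalue else pvalue
  which(score < threshold)
}

by_qvalue <- select_significant(pvalue, qvalue, metric = "qvalue")
by_pvalue <- select_significant(pvalue, qvalue, metric = "pvalue")
print(by_qvalue)
print(by_pvalue)
```

In this toy input the 'pvalue' metric keeps one more position than 'qvalue', because the multiple-testing correction raises the third value above 0.05.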

This function completes a LowMACA analysis by analyzing every gene or domain in the LowMACA object as if a 'single sequence' analysis had been started in the first place. The result is a data.frame showing all the significant positions of every gene. If you have a LowMACA object composed of 100 genes, it will launch 100 LowMACA single gene analyses and aggregate the results of every lfm launched on these 100 objects.

The output looks very similar to that of lfm, but in this case the column Multiple_Aln_pos has a different meaning. While in lfm it shows where the mutation falls in the consensus sequence, here it refers to the position in the consensus within the gene. If the original LowMACA object had mode equal to 'gene', the column Multiple_Aln_pos will always be equal to Amino_Acid_Position. If mode is 'pfam', the two columns are also equal unless a gene harbors more than one domain of the same type within its sequence. In that case, an internal alignment of every domain inside the protein is performed.
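The relationship between Multiple_Aln_pos and Amino_Acid_Position can be checked directly on the returned data.frame. The sketch below uses a small invented table (`toy`, with placeholder gene names) in place of real output; only the column names are taken from this page.

```r
# Toy stand-in for the data.frame returned by lfmSingleSequence();
# every value here is invented for illustration
toy <- data.frame(
  Gene_Symbol         = c("GENE_A", "GENE_A", "GENE_B"),
  Amino_Acid_Position = c(12L, 45L, 101L),
  Multiple_Aln_pos    = c(12L, 45L, 7L),
  stringsAsFactors    = FALSE
)

# Rows where the within-gene consensus position differs from the
# position in the original protein
realigned <- toy[toy$Multiple_Aln_pos != toy$Amino_Acid_Position, ]

# Genes for which the two columns differ
genes_with_internal_alignment <- unique(realigned$Gene_Symbol)
print(genes_with_internal_alignment)
```

Following the description above, an empty result would be expected for an object in 'gene' mode, while in 'pfam' mode the genes listed would be those carrying repeated domains.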

A data.frame with the following columns describing the mutations retrieved:

`Gene_Symbol`	gene symbols of the analyzed genes
`Amino_Acid_Position`	amino acid positions relative to the original protein
`Amino_Acid_Change`	amino acid changes in HGVS format
`Sample`	sample barcode where the mutation was found
`Tumor_Type`	tumor type of the sample
`Envelope_Start`	start of the pfam domain in the protein
`Envelope_End`	end of the pfam domain in the protein
`Multiple_Aln_pos`	positions in the consensus relative to the sequence analyzed. See the Details section
`Entrez`	Entrez ids of the mutated genes
`Entry`	UniProt entry of the protein
`UNIPROT`	other protein names from UniProt
`Chromosome`	cytobands of the genes
`Protein.name`	extended protein names
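Since the result has one row per mutation, a per-gene summary is often the first thing to compute. The following is a minimal base-R sketch on an invented table (`toy`) that reuses a few of the column names listed above; the gene, sample and tumor type values are placeholders.

```r
# Toy stand-in for a few columns of the returned data.frame;
# all values are invented for illustration
toy <- data.frame(
  Gene_Symbol         = c("GENE_A", "GENE_A", "GENE_A", "GENE_B"),
  Amino_Acid_Position = c(12L, 12L, 45L, 101L),
  Sample              = c("S1", "S2", "S3", "S4"),
  Tumor_Type          = c("T1", "T1", "T2", "T1"),
  stringsAsFactors    = FALSE
)

# Number of distinct significant positions per gene
hits <- unique(toy[, c("Gene_Symbol", "Amino_Acid_Position")])
positions_per_gene <- table(hits$Gene_Symbol)

# Number of mutated samples at each significant position
mutations_per_position <- aggregate(
  Sample ~ Gene_Symbol + Amino_Acid_Position,
  data = toy, FUN = length
)

print(positions_per_gene)
print(mutations_per_position)
```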

Stefano de Pretis, Giorgio Melloni

lfm

# Load the package that provides lfmSingleSequence and the example data
library(LowMACA)
# Load homeobox example
data(lmObj)
# Run lfmSingleSequence with default settings
significant_muts <- lfmSingleSequence(lmObj)
# Show the result
head(significant_muts)
# Show all the genes that harbor significant mutations without the alignment
unique(significant_muts$Gene_Symbol)