multi_aln: Compute Multiple Sequence Alignments

View source: R/multi_aln.R

multi_alnR Documentation

Compute Multiple Sequence Alignments

Description

This function takes a FASTA file containing DNA or amino acid sequences that shall be aligned and computes a multiple alignment using a defined multiple alignment tool.

Usage

multi_aln(
  file,
  tool,
  get_aln = FALSE,
  path = NULL,
  multi_aln_name = NULL,
  params = NULL,
  quiet = FALSE,
  clean_folders = FALSE
)

Arguments

file

a character string specifying the path to the file storing the sequences in FASTA format.

tool

a character string specifying the program that should be used: "clustalw", "t_coffee", "muscle", "clustalo", and "mafft".

get_aln

a logical value indicating whether the produced alignment should be returned.

path

a character string specifying the path to the multiple alignment program (in case you don't use the default path).

multi_aln_name

a character string specifying the name of the stored alignment file. Default is multi_aln_name = NULL denoting a default name: 'toolname.aln' .

params

a character string listing the input paramters that shall be passed to corresponding alignment tool specified in the tool argument. Default is NULL, implicating that a set of default parameters is used when running the correponding tool, e.g. clustalw. Example: params = "-PWMATRIX=BLOSUM -TYPE=PROTEIN" for tool = "clustalw.

quiet

a logical value specifying whether the output of the corresponding multiple alignment tool shall be printed out to the console. Default is quiet = FALSE.

clean_folders

a boolean value spefiying whether all internall folders storing the output of used programs shall be removed. Default is clean_folders = FALSE.

Details

This function provides an interface between R and common multiple alignment programs such as "clustalw", "t_coffee", "muscle", "clustalo", and "mafft".

  • CLUSTALW :

    Different operating systems perform different execution calls to the clustalw program:

    • MacOS: 'clustalw2'

    • Linux: 'clustalw'

    • Windows: 'clustalw2.exe'

    In case you use the default path to the clustalw program, depending on your operating system, the following calls to clustalw should work properly on your system:

    • MacOS: system("clustalw2 -help")

    • Linux: system("clustalw -help")

    • Windows: system("clustalw2.exe -help")

    In case these procedures don't work properly, please use the path argument to specify the 'clustalw' execution path on your system:

    • MacOS: system("path/to/clustalw/clustalw2 -help")

    • Linux: system("path/to/clustalw/clustalw -help")

    • Windows: system("path/to/clustalw/clustalw2.exe -help")

  • T_COFFEE :

    In case you use the default path to the t_coffee program, the following calls to clustalw should work properly on your system:

    system("t_coffee -version")

    In case this procedures doesn't work properly, please use the path argument to specify the 't_coffee' execution path on your system:

    system("path/to/t_coffee/t_coffee -version")

  • MUSCLE :

    In case you use the default path to the muscle program, the following calls to muscle should work properly on your system:

    system("muscle -help")

    In case this procedures doesn't work properly, please use the path argument to specify the 'muscle' execution path on your system:

    system("path/to/muscle/muscle -help")

  • CLUSTALO :

    In case you use the default path to the clustalo program, the following calls to clustalo should work properly on your system:

    system("clustalo --help")

    In case this procedures doesn't work properly, please use the path argument to specify the 'clustalo' execution path on your system:

    system("path/to/clustalo/clustalo --help")

  • MAFFT :

    In case you use the default path to the mafft program, the following calls to mafft should work properly on your system:

    system("mafft -help")

    In case this procedures doesn't work properly, please use the path argument to specify the 'mafft' execution path on your system:

    system("path/to/mafft/mafft -help")

Value

In case the argument get_aln is set TRUE, an object of class alignment of the seqinr package is returned.

Note

Note that when using the clustalw.params, ... parameters, make sure the corresponding alignment tool returns a file in clustal format *.aln. This is only important when get_aln = TRUE.

Author(s)

Hajk-Georg Drost and Sarah Scharfenberg

References

Examples

## Not run: 

# CLUSTALW Example:

# in case the default execution path of clustalw runs properly on your system
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "clustalw", 
          get_aln = TRUE)

# in case the default execution path of clustalw is not set within the default path
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'), 
          tool    = "clustalw", 
          get_aln = TRUE, 
          path    = "path/to/clustalw/")

# running clustalw using additional parameters
# details: system("clustalw2 -help")
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "clustalw", 
          get_aln = TRUE, 
          params  = "-PWMATRIX=BLOSUM -TYPE=PROTEIN")
          
          
# T_COFFEE Example:

# in case the default execution path of t_coffee runs properly on your system
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "t_coffee", 
          get_aln = TRUE)

# in case the default execution path of t_coffee is not set within the default path
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "t_coffee", 
          get_aln = TRUE, 
          path    = "path/to/t_coffee/")

# running t_coffee using additional parameters
# details: http://www.tcoffee.org/Projects/tcoffee/#DOCUMENTATION
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "t_coffee", 
          get_aln = TRUE,
          params  = "-mode expresso")
          


# MUSCLE Example:  

# in case the default execution path of muscle runs properly on your system
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "muscle", 
          get_aln = TRUE)

# in case the default execution path of muscle is not set within the default path
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "muscle", 
          get_aln = TRUE, 
          path    = "path/to/muscle/")

# running muscle using additional parameters
# details: http://www.drive5.com/muscle/manual/
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "muscle", 
          get_aln = TRUE,
          params  = "-diags -clwstrict") 
          

# CLUSTALO Example:  

# in case the default execution path of clustalo runs properly on your system
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "clustalo", 
          get_aln = TRUE)

# in case the default execution path of clustalo is not set within the default path
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "clustalo", 
          get_aln = TRUE, 
          path    = "path/to/clustalo/")

# running clustalo using additional parameters
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "clustalo", 
          get_aln = TRUE,
          params  = "--outfmt clu")         
          
          
                                            
# MAFFT Example:  

# in case the default execution path of mafft runs properly on your system
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "mafft", 
          get_aln = TRUE)

# in case the default execution path of mafft is not set within the default path
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "mafft", 
          get_aln = TRUE, 
          path    = "path/to/mafft/")

# running mafft using additional parameters
# details: http://www.drive5.com/mafft/manual/
multi_aln(file    = system.file('seqs/aa_seqs.fasta', package = 'orthologr'),
          tool    = "mafft", 
          get_aln = TRUE,
          params  = "--maxiterate 1 --clustalout")         
          
                                                                                  

## End(Not run)

HajkD/orthologr documentation built on Oct. 13, 2023, 12:11 a.m.