split_dat: grouping the data frame containing sequences and names and...


Description

Split a data frame of sequences into groups according to a reference table.

Usage

split_dat(dat, ref_table)

Arguments

dat

data frame generated by read.phylip or read.fasta

ref_table

data frame whose first column holds the sequence names and whose second column holds the group each sequence belongs to.

Details

Each group of sequences is saved to its own fasta file. Sequences not listed in ref_table are saved in "Ungrouped.fasta".
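The grouping described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the column names seq.name and seq.text, the toy data, and the temporary output directory are assumptions made for illustration only.

```r
# Toy stand-ins for the two inputs. The column names are assumptions.
dat <- data.frame(
  seq.name = c("seq_1", "seq_2", "seq_3", "seq_4"),
  seq.text = c("--TTACA", "GATTACA", "GATTACA", "GAT-ACA"),
  stringsAsFactors = FALSE
)
ref_table <- data.frame(
  name  = c("seq_1", "seq_2", "seq_3"),
  group = c("group1", "group1", "group2"),
  stringsAsFactors = FALSE
)

# Look up each sequence's group by name; names absent from ref_table give NA.
grp <- ref_table$group[match(dat$seq.name, ref_table$name)]
grp[is.na(grp)] <- "Ungrouped"

# One data frame per group label.
pieces <- split(dat, grp)

# Write each piece as a fasta file in a temporary directory (hypothetical
# location, chosen so the sketch does not touch the working directory).
out_dir <- tempfile("split_sketch_")
dir.create(out_dir)
for (g in names(pieces)) {
  # Interleave header lines (">name") with the sequence lines.
  lines <- as.vector(rbind(paste0(">", pieces[[g]]$seq.name),
                           pieces[[g]]$seq.text))
  writeLines(lines, file.path(out_dir, paste0(g, ".fasta")))
}
list.files(out_dir)
```

Here seq_4 has no entry in the reference table, so it ends up in the piece labelled "Ungrouped", mirroring the behaviour described for sequences missing from ref_table.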

Value

This is a subroutine called for its side effect of writing files; there is no return value.

Author(s)

Jinlong Zhang <jinlongzhang01@gmail.com>

References

http://www.genomatix.de/online_help/help/sequence_formats.html

See Also

rename.fasta

Examples

library(phylotools)

# Write a small example alignment to a fasta file
cat(
  ">seq_1",   "--TTACAAATTGACTTATTATA",
  ">seq_2",   "GATTACAAATTGACTTATTATA",
  ">seq_3",   "GATTACAAATTGACTTATTATA",
  ">seq_5",   "GATTACAAATTGACTTATTATA",
  ">seq_8",   "GATTACAAATTGACTTATTATA",
  ">seq_10",  "---TACAAATTGAATTATTATA",
  ">seq_11",  "--TTACAAATTGACTTATTATA",
  ">seq_12",  "GATTACAAATTGACTTATTATA",
  ">seq_13",  "GATTACAAATTGACTTATTATA",
  ">seq_15",  "GATTACAAATTGACTTATTATA",
  ">seq_16",  "GATTACAAATTGACTTATTATA",
  ">seq_17",  "---TACAAATTGAATTATTATA",
  file = "trnh.fasta", sep = "\n")

# Build the reference table: sequence names first, group labels second
sequence_name <- get.fasta.name("trnh.fasta")
sequence_group <- c("group1", "group1", "group1", "group1", "group1",
                    "group2", "group2", "group2",
                    "group3", "group3", "group3", "group3")
group <- data.frame(sequence_name, sequence_group)

# Read the sequences and write one fasta file per group
fasta <- read.fasta("trnh.fasta")
split_dat(fasta, group)

# Clean up the files created by this example
unlink("trnh.fasta")
unlink("ungrouped.fasta")
unlink("group1.fasta")
unlink("group2.fasta")
unlink("group3.fasta")

phylotools documentation built on May 2, 2019, 3:25 a.m.