split_dat: grouping the data frame containing sequences and names and...


View source: R/split_dat.R

Description

Split a data frame of sequences into groups according to a reference table of group assignments.

Usage

split_dat(dat, ref_table)

Arguments

dat

data frame generated by read.phylip or read.fasta

ref_table

data frame whose first column holds the sequence name and whose second column holds the group that sequence belongs to.
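As a rough illustration of the two inputs, the sketch below builds small stand-ins in base R. The column names `seq.name` and `seq.text` are an assumption about what read.fasta returns, and the column names of `ref_table` are arbitrary, since only the column order is documented above.

```r
# Hypothetical stand-in for the output of read.fasta(); the column names
# seq.name and seq.text are an assumption, not confirmed by this page.
dat <- data.frame(
  seq.name = c("seq_1", "seq_2", "seq_3"),
  seq.text = c("--TTACAAATTG", "GATTACAAATTG", "GATTACAAATTG"),
  stringsAsFactors = FALSE
)

# Reference table: first column is the sequence name, second is its group.
ref_table <- data.frame(
  sequence_name  = c("seq_1", "seq_2"),
  sequence_group = c("group1", "group2"),
  stringsAsFactors = FALSE
)

# Names present in dat but missing from ref_table have no group assignment.
ungrouped <- setdiff(dat$seq.name, ref_table[[1]])
```

Here `seq_3` has no row in `ref_table`, so it is the kind of sequence that would have no group.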

Details

Each group of sequences is saved to its own FASTA file, named after the group. Sequences not listed in ref_table are saved in "Ungrouped.fasta".
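The behaviour described here can be approximated in base R. This is only a sketch under stated assumptions, not the phylotools source: `split_sketch` and its `out_dir` argument are hypothetical, the input is assumed to have `seq.name` and `seq.text` columns, and the "Ungrouped" label is taken from the text above.

```r
# Illustrative only: a base-R approximation of the grouping step,
# not the package's implementation.
split_sketch <- function(dat, ref_table, out_dir = tempdir()) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  # Look up each sequence's group by name; unmatched names give NA.
  idx <- match(dat$seq.name, as.character(ref_table[[1]]))
  grp <- as.character(ref_table[[2]])[idx]
  grp[is.na(grp)] <- "Ungrouped"
  # Write one FASTA file per group and collect the file paths.
  parts <- split(dat, grp)
  paths <- character(0)
  for (g in names(parts)) {
    path <- file.path(out_dir, paste0(g, ".fasta"))
    # Interleave ">name" header lines with the sequence lines.
    lines <- as.vector(rbind(paste0(">", parts[[g]]$seq.name),
                             parts[[g]]$seq.text))
    writeLines(lines, path)
    paths[g] <- path
  }
  invisible(paths)
}

dat <- data.frame(seq.name = c("seq_1", "seq_2", "seq_3"),
                  seq.text = c("ACGT", "ACGA", "ACGC"),
                  stringsAsFactors = FALSE)
ref_table <- data.frame(name = c("seq_1", "seq_2"),
                        group = c("group1", "group1"),
                        stringsAsFactors = FALSE)
written <- split_sketch(dat, ref_table, out_dir = tempfile("split_"))
```

Unlike split_dat, this sketch writes into a temporary directory and returns the file paths invisibly, which makes the result easier to inspect and clean up.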

Value

None. The function is called for its side effect of writing FASTA files.

Author(s)

Jinlong Zhang <jinlongzhang01@gmail.com>

References

http://www.genomatix.de/online_help/help/sequence_formats.html

See Also

rename.fasta

Examples

library(phylotools)

# Write a small example alignment to a FASTA file
cat(
  ">seq_1",   "--TTACAAATTGACTTATTATA",
  ">seq_2",   "GATTACAAATTGACTTATTATA",
  ">seq_3",   "GATTACAAATTGACTTATTATA",
  ">seq_5",   "GATTACAAATTGACTTATTATA",
  ">seq_8",   "GATTACAAATTGACTTATTATA",
  ">seq_10",  "---TACAAATTGAATTATTATA",
  ">seq_11",  "--TTACAAATTGACTTATTATA",
  ">seq_12",  "GATTACAAATTGACTTATTATA",
  ">seq_13",  "GATTACAAATTGACTTATTATA",
  ">seq_15",  "GATTACAAATTGACTTATTATA",
  ">seq_16",  "GATTACAAATTGACTTATTATA",
  ">seq_17",  "---TACAAATTGAATTATTATA",
  file = "trnh.fasta", sep = "\n")

# Build the reference table: one group label per sequence name
sequence_name <- get.fasta.name("trnh.fasta")
sequence_group <- c("group1", "group1", "group1", "group1", "group1",
                    "group2", "group2", "group2",
                    "group3", "group3", "group3", "group3")
group <- data.frame(sequence_name, sequence_group)

# Read the sequences and write one FASTA file per group
fasta <- read.fasta("trnh.fasta")
split_dat(fasta, group)

# Remove the files created by this example
unlink("trnh.fasta")
unlink("ungrouped.fasta")
unlink("group1.fasta")
unlink("group2.fasta")
unlink("group3.fasta")

helixcn/phylotools documentation built on March 31, 2021, 5:45 a.m.