merge_samples-methods: Merge samples based on a sample variable or factor.


Description

This method merges (agglomerates) the samples of a phyloseq object according to a categorical variable contained in its sample_data, or according to a user-provided factor.

Usage

merge_samples(x, group, fun=mean)

## S4 method for signature 'sample_data'
merge_samples(x, group, fun = mean)

## S4 method for signature 'otu_table'
merge_samples(x, group)

## S4 method for signature 'phyloseq'
merge_samples(x, group, fun = mean)

Arguments

x

(Required). An instance of a phyloseq class that has sample indices. This includes sample_data-class, otu_table-class, and phyloseq-class.

group

(Required). Either a single character string matching a variable name in the corresponding sample_data of x, or a factor with the same length as the number of samples in x.

fun

(Optional). The function used to merge the values that correspond to the same group for each variable. It must take a numeric vector as its first argument and return a single value. Default is mean. Note that fun is (currently) ignored for the otu_table, where abundances in the same group are always summed; the sum is evaluated via rowsum for efficiency.
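The difference between the two merging rules described for fun can be illustrated with base R alone. The following is a minimal sketch on made-up data (the matrix counts, the factor grp, and the vector depth are hypothetical and not part of phyloseq); it shows what a grouped sum via rowsum and a grouped mean look like, not the package's internal code.

```r
# Hypothetical toy data: 3 samples (rows) by 3 taxa (columns)
counts <- matrix(c(1, 2, 3,
                   4, 5, 6,
                   7, 8, 9),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("s1", "s2", "s3"),
                                 c("otuA", "otuB", "otuC")))

# Grouping factor, one entry per sample
grp <- factor(c("Ocean", "Ocean", "Soil"))

# Abundances: rows that share a group label are summed
summed <- rowsum(counts, grp)
print(summed)

# The same result computed explicitly, one group at a time
manual <- t(sapply(levels(grp), function(g) {
  colSums(counts[grp == g, , drop = FALSE])
}))
all.equal(summed, manual)

# A numeric sample variable merged with the default rule (mean)
depth <- c(10, 20, 5)
merged_depth <- tapply(depth, grp, mean)
print(merged_depth)
```

Here the "Ocean" row of summed is the element-wise sum of s1 and s2, while merged_depth holds the per-group mean of depth.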

Details

NOTE: (phylo) trees and taxonomyTable-class are not modified by this function, but returned in the output object as-is.

Value

A phyloseq object that has had its sample indices merged according to the factor indicated by the group argument. The output class matches x.
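As a quick check of the statement above, the following sketch passes group as a factor rather than a variable name and compares the class and sample count of the result. It assumes phyloseq is installed and uses the GlobalPatterns dataset from the Examples; the object names grp and merged are chosen here for illustration.

```r
library(phyloseq)
data(GlobalPatterns)

# Build the grouping factor explicitly instead of passing "SampleType"
grp <- factor(get_variable(GlobalPatterns, "SampleType"))

# Merge using the factor; it must have one entry per sample
stopifnot(length(grp) == nsamples(GlobalPatterns))
merged <- merge_samples(GlobalPatterns, grp)

# The output class should match the input class
class(GlobalPatterns)
class(merged)

# One merged sample is expected per level of the grouping factor
nsamples(merged)
nlevels(grp)
```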

See Also

merge_taxa, merge_phyloseq

Examples

#
data(GlobalPatterns)
GP = GlobalPatterns
mergedGP = merge_samples(GlobalPatterns, "SampleType")
SD = merge_samples(sample_data(GlobalPatterns), "SampleType")
print(SD)
print(mergedGP)
sample_names(GlobalPatterns)
sample_names(mergedGP)
identical(SD, sample_data(mergedGP))
# The OTU abundances of merged samples are summed
# Let's investigate this ourselves, looking at just the top 10 most abundant OTUs...
OTUnames10 = names(sort(taxa_sums(GP), TRUE)[1:10])
GP10  = prune_taxa(OTUnames10,  GP)
mGP10 = prune_taxa(OTUnames10, mergedGP)
ocean_samples = sample_names(subset(sample_data(GP), SampleType=="Ocean"))
print(ocean_samples)
otu_table(GP10)[, ocean_samples]
rowSums(otu_table(GP10)[, ocean_samples])
otu_table(mGP10)["Ocean", ]

Example output

                   X.SampleID Primer Final_Barcode Barcode_truncated_plus_T
Feces                    19.0   13.5          13.5                16.500000
Freshwater               15.0   11.5          11.5                12.000000
Freshwater (creek)        2.0   14.0          14.0                13.000000
Mock                      7.0   25.0          25.0                12.333333
Ocean                    18.0   17.0          17.0                13.666667
Sediment (estuary)       23.0   20.0          20.0                15.000000
Skin                     12.0    7.0           7.0                 9.666667
Soil                     10.0    2.0           2.0                13.333333
Tongue                   14.5    9.5           9.5                15.000000
                   Barcode_full_length SampleType Description
Feces                        13.750000          1   18.500000
Freshwater                    4.500000          2   15.500000
Freshwater (creek)            6.666667          3    2.000000
Mock                         16.000000          4    7.000000
Ocean                        17.000000          5   18.000000
Sediment (estuary)           14.666667          6   22.666667
Skin                         14.666667          7   12.000000
Soil                         11.333333          8    9.666667
Tongue                       23.000000          9   14.500000
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 19216 taxa and 9 samples ]
sample_data() Sample Data:       [ 9 samples by 7 sample variables ]
tax_table()   Taxonomy Table:    [ 19216 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 19216 tips and 19215 internal nodes ]
 [1] "CL3"      "CC1"      "SV1"      "M31Fcsw"  "M11Fcsw"  "M31Plmr" 
 [7] "M11Plmr"  "F21Plmr"  "M31Tong"  "M11Tong"  "LMEpi24M" "SLEpi20M"
[13] "AQC1cm"   "AQC4cm"   "AQC7cm"   "NP2"      "NP3"      "NP5"     
[19] "TRRsed1"  "TRRsed2"  "TRRsed3"  "TS28"     "TS29"     "Even1"   
[25] "Even2"    "Even3"   
[1] "Feces"              "Freshwater"         "Freshwater (creek)"
[4] "Mock"               "Ocean"              "Sediment (estuary)"
[7] "Skin"               "Soil"               "Tongue"            
[1] TRUE
[1] "NP2" "NP3" "NP5"
OTU Table:          [10 taxa and 3 samples]
                     taxa are rows
        NP2   NP3   NP5
329744   91   126   120
317182 3148 12370 63084
549656 5045 10713  1784
279599  113   114   126
360229   16    83   786
94166    49   128   709
550960   11    86    65
158660   13    39    28
331820   24   101   105
189047    4    33    29
329744 317182 549656 279599 360229  94166 550960 158660 331820 189047 
   337  78602  17542    353    885    886    162     80    230     66 
OTU Table:          [10 taxa and 1 samples]
                     taxa are columns
      329744 317182 549656 279599 360229 94166 550960 158660 331820 189047
Ocean    337  78602  17542    353    885   886    162     80    230     66

phyloseq documentation built on Nov. 8, 2020, 6:41 p.m.