colonyzer.read: Read raw cell density timecourse data from Colonyzer output...

View source: R/ColonyzerImport.R

colonyzer.read    R Documentation

Read raw cell density timecourse data from Colonyzer output files

Description

Reads in and binds together all of the Colonyzer output files in a directory and assembles a data.frame suitable as input for qfa.fit2. Colonyzer is an open-source image analysis tool that quantifies photographic images by calculating cell densities of bacterial or eukaryotic spots and colonies on agar plates. We recommend the easy-to-install and easy-to-use version called BaColonyzer, available at https://github.com/judithbergada/bacolonyzer. The original version and some documentation about it can be found at http://research.ncl.ac.uk/colonyzer/. Less meta-data file processing is required than with the read.colonyzer function of the original QFAr package.

Usage

colonyzer.read(
  path = ".",
  libraries = "LibraryDescriptions.txt",
  Growth = "Intensity",
  files = c(),
  experiment = NA,
  ORF2gene = NA,
  screenID = ""
)

Arguments

path

String. The path to the folder containing the Colonyzer .dat files to be read. Set to working directory by default.

libraries

String. The only necessary meta-data file in the new QFA iteration. Tab-delimited text file giving the row-column coordinate of each well on each plate in a series of rectangular arrayed libraries. Header row format is: "Library ORF Plate Row Column Notes". Columns are:

  • Library - Library identifier (e.g. SAU1)

  • ORF - Systematic strain identifier (e.g. Cowan)

  • Plate - Plate number

  • Row - Row number

  • Column - Column number

  • Notes - Optional strain notes
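
As an illustration, a library description file with this header could be generated from R roughly as follows. This is only a sketch: the 2 x 2 layout, the strain names other than the examples given above, the Notes text and the temporary file location are all placeholders, not values required by the package.

```r
# Hypothetical 2 x 2 layout on a single plate; all values are placeholders.
lib <- data.frame(
  Library = "SAU1",
  ORF     = c("Cowan", "Cowan", "StrainB", "StrainB"),
  Plate   = 1,
  Row     = c(1, 1, 2, 2),
  Column  = c(1, 2, 1, 2),
  Notes   = "placeholder",
  stringsAsFactors = FALSE
)

# Write a tab-delimited file with the header row described above.
lib_file <- file.path(tempdir(), "LibraryDescriptions.txt")
write.table(lib, lib_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to check the layout.
lib_check <- read.delim(lib_file, stringsAsFactors = FALSE)
```

The resulting path (here lib_file) is what would be passed as the libraries argument.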

Growth

String. (Optional). Rowname of the .out data files to use as the target Growth parameter in subsequent analysis. Defaults to "Intensity".

files

String vector. (Optional). Vector giving locations of Colonyzer .dat files to be read (overrides path specified).
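
A vector for the files argument can be assembled with base R, for example as in the sketch below. The folder name and the two empty placeholder files are assumptions made purely so that the snippet runs on its own; the final call is left commented out because it needs real Colonyzer output and the package to be loaded.

```r
# Hypothetical folder holding Colonyzer output; replace with your own path.
dat_dir <- file.path(tempdir(), "colonyzer_output")
dir.create(dat_dir, showWarnings = FALSE)

# Create two empty placeholder files so the listing below has something to find.
file.create(file.path(dat_dir, c("plateA.dat", "plateB.dat")))

# Collect the full paths of all .dat files in the folder.
dat_files <- list.files(dat_dir, pattern = "\\.dat$", full.names = TRUE)

# The vector could then be passed on (not run here):
# res <- colonyzer.read(files = dat_files, libraries = "LibraryDescriptions.txt")
```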

experiment

String. (Optional). Name of a text file describing the inoculation times, library and plate number for unique plates. If this file is not specified, the variables are taken from the data in the .dat files where possible; where that is not possible, an arbitrary "UNKNOWN" value is assigned (see the information in brackets below). The filename is taken relative to path if path is specified. The file must be a tab-delimited text file with no header, containing the following columns:

  • Barcode - unique identifier for each plate (no file specified: Barcodes from libraries file)

  • Start.time - Time of inoculation of the plate in format YYYY-MM-DD_hh_mm_ss (no file specified: first image date)

  • Treatment - Whatever treatment you applied to the plate (no file specified: UNKNOWN)

  • Medium - Whatever medium you used (no file specified: UNKNOWN)

  • Screen - Screening ID (no file specified: use screenID)

  • Library - Library ID (no file specified: Library values from libraries file)

  • Plate - ID replicates (no file specified: Plate values from libraries file)

  • RepQuad - Which quadrant was used for inoculation (no file specified: UNKNOWN)
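
A file in this layout could be written from R along the following lines. The sketch assumes a single plate, and every value (barcode, treatment, medium and so on) is a made-up placeholder; only the column order and the Start.time format are taken from the list above.

```r
# Hypothetical single-plate description; every value is a placeholder.
expt <- data.frame(
  Barcode    = "PLATE0001",
  Start.time = "2023-01-15_09_30_00",  # YYYY-MM-DD_hh_mm_ss
  Treatment  = "37C",
  Medium     = "LB",
  Screen     = "SAUtest1",
  Library    = "SAU1",
  Plate      = 1,
  RepQuad    = 1,
  stringsAsFactors = FALSE
)

expt_file <- file.path(tempdir(), "ExptDescription.txt")
# col.names = FALSE because the file is described as having no header row.
write.table(expt, expt_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read it back to check that eight tab-separated fields were written.
expt_check <- read.delim(expt_file, header = FALSE, stringsAsFactors = FALSE)
```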

ORF2gene

String. (Optional). Filename of a tab-delimited text file containing two columns (with no headers) associating unique, systematic strain identifiers (e.g. yeast ORF Y-numbers) with human-readable gene names (e.g. standard names from SGD). If not specified, Gene and ORF are set to the same value.
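
The sketch below writes such a two-column dictionary and shows one way the lookup could work. It illustrates the file format only and is not the package's internal code; in particular, falling back to the ORF for identifiers missing from the dictionary is an assumption made here, and the column names ORF and Gene are assigned on reading purely for convenience.

```r
# Two systematic yeast identifiers with their standard names.
orf2gene <- data.frame(
  ORF  = c("YAL001C", "YAL002W"),
  Gene = c("TFC3", "VPS8"),
  stringsAsFactors = FALSE
)

# Tab-delimited, two columns, no header row.
o2g_file <- file.path(tempdir(), "ORF2GENE.txt")
write.table(orf2gene, o2g_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Illustrative lookup; YAL003W is deliberately absent from the dictionary.
dict  <- read.delim(o2g_file, header = FALSE, col.names = c("ORF", "Gene"),
                    stringsAsFactors = FALSE)
orfs  <- c("YAL001C", "YAL002W", "YAL003W")
genes <- dict$Gene[match(orfs, dict$ORF)]
# Assumed fallback: reuse the ORF where no gene name is found.
genes[is.na(genes)] <- orfs[is.na(genes)]
```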

screenID

String. Unique experiment identifier (not the same as the Barcode, which is a unique plate identifier). For example, we use it to identify the grid layout for competition assays (Grid vs. Single strain vs. Block).

Details

The .dat files should contain the following parameters (they are present automatically if you used BaColonyzer or Colonyzer):

  • Image.Name - Full name at image capture (includes barcode and date-time) of image from which data are derived

  • Row - Row number (counting from top of image) of culture in rectangular gridded array

  • Col - Column number (counting from left of image) of culture in rectangular gridded array

  • X.Offset - x-coordinate of top left corner of rectangular tile bounding culture (number of pixels from left of image)

  • Y.Offset - y-coordinate of top left corner of rectangular tile bounding culture (number of pixels from top of image)

  • Area - Culture area (pixels)

  • Trimmed - Integrated Optical Density, sum of pixel intensities within culture area

  • Threshold - Global pixel intensity threshold used for image segmentation (after lighting correction)

  • Intensity - Total pixel intensity for square tile containing culture

  • Edge Pixels - Number of pixels classified as culture on edge of square tile

  • Colony.Color.R - Culture red channel intensity

  • Colony.Color.G - Culture green channel intensity

  • Colony.Color.B - Culture blue channel intensity

  • Background.Color.R - Background red channel intensity (for current tile)

  • Background.Color.G - Background green channel intensity (for current tile)

  • Background.Color.B - Background blue channel intensity (for current tile)

  • Edge.length - Number of culture pixels classified as being microcolony edge pixels (useful for classifying contaminants in cultures grown from dilute inoculum)

  • Tile.Dimensions.X - Culture tile width (pixels)

  • Tile.Dimensions.Y - Culture tile height (pixels)

  • Growth - Default measure of cell density (direct copy of one of Trimmed, Threshold or Intensity)

  • Barcode - Unique plate identifier

  • Date.Time - Timestamp of image capture (extracted from image filename)

  • Inoc.Time - User specified date and time of inoculation (specified in ExptDescription.txt file)

  • Treatments - Conditions applied externally to plates (e.g. temperature(s) at which cultures were grown, UV irradiation applied, etc.)

  • Medium - Nutrients/drugs in plate agar

  • Screen.Name - Name of screen (identifies biological repeats, and experiment)

  • RepQuad - Integer identifying which of the quadrants of a 1536 plate was used to inoculate the current 384 plate (for example, set to 1 for all cultures in 1536 format)

  • MasterPlate Number - Library plate identifier

  • Timeseries order - Sequential photograph number

  • Library.Name - Name of library, specifying particular culture location

  • ORF - Systematic, unique identifier for genotype in this position in arrayed library

  • Gene - Standard gene name for genotype in this position in arrayed library. Note that this can be set equal to ORF for example

  • ScreenID - Unique identifier for this QFA screen

  • Client - Client for whom screen was carried out

  • ExptDate - A representative/approximate date for the experiment (note that genome-wide QFA screens typically take weeks to complete)

  • User - Person who actually carried out screen

  • PI - Principal investigator leading project that screen is part of

  • Condition - The most important defining characteristic of screen, as specified by user (e.g. the temperature screen was carried out at if screen is part of multi-temperature set of screens, or the query mutation if part of a set of screens comparing query mutations, or the drugs present in the medium if part of a set of drug screens)

  • Inoc - Qualitative identifier of inoculation type (e.g. "DIL" for dilute inoculum, "CONC" for concentrated). Used to distinguish between experiments carried out with different methods of inoculation.

  • Expt.Time - Time (days) since user-specified inoculation date (Inoc.Time) that current image was captured

Value

An R data.frame where each row corresponds to a single observation on a single colony, with the value of the growth measurement in 'Growth', and the date and time of the measurement in 'Date.Time'. Other information about the observation is stored in the other columns. Several columns returned are direct copies of Colonyzer output and mapped as follows:

  • Image.Name - Image Name

  • Row - Spot Row

  • Col - Spot Column

  • X.Offset - X Offset

  • Y.Offset - Y Offset

  • Area - Area

  • Trimmed - Trimmed Area

  • Threshold - Threshold

  • Intensity - Intensity

  • Edge.Pixels - Edge Pixels

  • Colony.Color.R - Colony Color R

  • Colony.Color.G - Colony Color G

  • Colony.Color.B - Colony Color B

  • Background.Color.R - Background Color R

  • Background.Color.G - Background Color G

  • Background.Color.B - Background Color B

  • Edge.length - Edge length

  • Tile.Dimensions.X - Tile Dimensions X

  • Tile.Dimensions.Y - Tile Dimensions Y

Extra columns are automatically added as follows. Some of this information is derived from auxiliary files passed to the function such as the experimental description file, the orf-gene dictionary and the library description file:

  • Growth - A cell density surrogate built from trimmed Area normalised by tile area and maximum achievable pixel intensity: Trimmed/(Tile.Dimensions.X*Tile.Dimensions.Y*255)

  • Barcode - Plate identifier, essentially image name with date time and file extension stripped

  • Date.Time - Date time of image capture in YYYY-MM-DD_hh-mm-ss format

  • Inoc.Time - Date time that plate was inoculated. If the plate is grown at a high temperature, the date time at which the plate was moved into the high temperature incubator. The assumption in this case is that negligible growth occurred before the plate temperature was shifted to the target temperature.

  • Treatments - Treatments applied to plate (e.g. temperature)

  • Medium - Medium contained in agar (e.g. nutrients or drugs added to agar)

  • Screen.Name - Unique identifier for experiment (usually identifies repeat number also if multiple repeats carried out).

  • RepQuad - Identifier for experiments scaling down from 1536 format plates to 384, indicating which quadrant on the original 1536 source plate the current 384 format plate belongs to.

  • MasterPlate.Number - Identifies which plate in the source library (as described in the library description file) corresponds to the current plate

  • Timeseries.order - Ordinal giving the position of the current photograph in the timeseries

  • Library.Name - Identifies which of the libraries identified in the library description file was used to construct this plate

  • ORF - Unique systematic identifier for the genotype of the strain at this location (e.g. yeast Y-number), as defined by library description file

  • Gene - Standard, human readable genotype identifier for the strain at this location, as defined by the ORF-Gene dictionary

  • Background - Tag identifying experiment, typically used to construct file names and axes titles in plots

  • Expt.Time - Number of days passed between inoculation (start of experiment) and current time

Finally, as well as returning the object above, this function prints a small report to screen, summarising the data returned. This includes number of unique barcodes read, number of photos read, number of genotypes in experiment, number of unique culture observations made, a list of treatments applied, a list of media used, a list of unique screen names (e.g. replicates carried out), the plate dimensions (e.g. 1536, 384 or 96 format) and a list of unique inoculation dates.
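
To make the Growth and Expt.Time definitions above concrete, the following sketch recomputes both on a toy data.frame. All numbers are made up, and parsing both timestamps with the YYYY-MM-DD_hh-mm-ss pattern is an assumption; this is not the package's own code.

```r
# Toy observations; all numbers are made up.
obs <- data.frame(
  Trimmed           = c(120000, 480000),
  Tile.Dimensions.X = c(80, 80),
  Tile.Dimensions.Y = c(80, 80),
  Inoc.Time         = c("2023-01-15_09-30-00", "2023-01-15_09-30-00"),
  Date.Time         = c("2023-01-15_21-30-00", "2023-01-16_09-30-00"),
  stringsAsFactors  = FALSE
)

# Growth as given in the text: Trimmed / (tile width * tile height * 255).
obs$Growth <- obs$Trimmed /
  (obs$Tile.Dimensions.X * obs$Tile.Dimensions.Y * 255)

# Expt.Time: days elapsed between inoculation and image capture.
fmt  <- "%Y-%m-%d_%H-%M-%S"
inoc <- as.POSIXct(obs$Inoc.Time, format = fmt, tz = "UTC")
shot <- as.POSIXct(obs$Date.Time, format = fmt, tz = "UTC")
obs$Expt.Time <- as.numeric(difftime(shot, inoc, units = "days"))
```

Dividing by the tile area times 255 keeps Growth between 0 and 1 when Trimmed is a sum of 8-bit pixel intensities within the tile, which makes values comparable across tiles of different sizes.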

Examples

# qfa.testdata was generated with the call in the Not run section
# (files not included in package)
data(qfa.testdata)
# Strip non-experimental edge cultures
qfa.testdata = qfa.testdata[(qfa.testdata$Row != 1) & (qfa.testdata$Col != 1) &
                            (qfa.testdata$Row != 8) & (qfa.testdata$Col != 12), ]
# Define which measure of cell density to use
qfa.testdata$Growth = qfa.testdata$Intensity
# Fit growth curves using the "Gmp" model
GmpFit = qfa.fit2(qfa.testdata, inocguess = NULL, detectThresh = 0, globalOpt = FALSE,
                  AUCLim = NA, TimeFormat = "h", Model = "Gmp")
# Construct fitness measures
GmpFit = makeFitness2(GmpFit, AUCLim = NA, plotFitness = "All",
                      filename = "Example_Gmp_fitness.pdf")
# Create plot
qfa.plot2("Example_Gmp_GrowthCurves.pdf", GmpFit, qfa.testdata, maxt = 30)

## Not run: 
qfa.testdata = try(colonyzer.read(experiment = "SAU1ExptDescription.txt",
                                  ORF2gene = "ORF2GENE.txt",
                                  libraries = "LibraryDescription1.txt",
                                  screenID = "SAUtest1"))

## End(Not run)

JulBaer/baQFA documentation built on Feb. 19, 2023, 10:32 p.m.