rmd_tab_iterator()
Consider option to print the variable name with the first tab:
[Expression], [Correlation]
[Cohort], [Age]
[Data: Expression], [Correlation]
[Centering: Cohort], [Age]
design2colors()
class_colnames
appears to be broken, it does not seem to assign
class colors as expected.
class_group_color
and class_group_lightness_color
,
but there should be class_color
that corresponds to the one-or-more
values in class_colnames.
Consider adding a function to plot the matrix separately, in table format,
as done by default in design2colors()
. The showColors()
and
print_color_list()
are not great when there is this much data.
It needs more intuitive "default behavior", for example define
class_colnames
then group_colnames
and have it use the last value
as lightness_colnames
?
class_colnames
, group_colnames
, then lightness_colnames.
Consider new function import_omics_data()
Generic data matrix import routine.
curation_txt
, consistent with other import functions.
rowData().
Consider function to display a data.frame
using sample_color_list
output from design2colors()
.
It should essentially do the same thing as design2colors()
except
re-use the sample_color_list
Consider generic importer for tab-delimited, csv-delimited data
that produces a SummarizedExperiment
object.
Bonus points for some mechanism to recognize multiple header rows,
so they can be added to colData()
.
rowData()
or colData().
nmatlist2heatmaps()
Consider new argument max_partition_rows
to limit the number
of rows for any given partition. This option needs to be done here
particularly so it can be used together with k_clusters
which
produces clusters whose sizes are not known beforehand.
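A minimal sketch of how a per-partition row cap might work, applied after clusters are known. The helper name limit_partition_rows() is hypothetical, and it assumes partitions are supplied as a vector of labels named by row identifier. It keeps the first rows of each partition in input order; the real argument might use a different rule, such as ranking rows by signal.

```r
# Hypothetical helper: keep at most max_partition_rows rows per partition.
# `partition` is a vector of partition labels, named by row identifier.
limit_partition_rows <- function(partition, max_partition_rows = Inf) {
   # split row names by partition label, preserving input order within each
   rows_by_partition <- split(names(partition), partition)
   kept <- lapply(rows_by_partition, function(rows) {
      rows[seq_len(min(length(rows), max_partition_rows))]
   })
   # subset the partition vector to retained rows, in the original order
   partition[names(partition) %in% unlist(kept)]
}

partition <- c(r1 = "up", r2 = "up", r3 = "down", r4 = "up", r5 = "down")
limited <- limit_partition_rows(partition, max_partition_rows = 2)
limited
```

Because the cap is applied to the final partition labels, the same step could follow k_clusters output, where cluster sizes are only known after clustering.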
Coverage heatmap automation functions. So far they include:
make_cov_config()
- create config_df
for matrix files
auto_cov_heatmaps()
- given config_df
, "draw the rest of the owl"
(Do all the things required to make a consistent set of heatmaps.)
rbind_cov_config()
- to combine multiple independent config_df
process_cov_files()
make_heatmap
to create matrix files
No doubt this is a stop-gap specific to certain systems, e.g.
system()
or system2()
sys::exec_wait()
or sys::exec_background()
processx::run()
- preferred for multiple concurrent processes
parse_ucsc_gokey()
DONE. Consider mechanism to define custom default track settings.
For example get_track_defaults()
could be used to define
settings in the package environment, which could then be edited.
Similar to igraph::add_shapes()
and igraph:::.igraph.shapes
,
an environment
which is created during package loading.
This environment
can be adjusted.
Add nmat2coverage_matrix()
Intended to save a matrix file in tab-delimited format.
Workflow would be to manipulate the matrix, possibly calculating sum, mean, subtraction, etc. Then save the result to a file for easier, more consistent re-use.
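A rough sketch of that workflow using plain matrices in place of normalizedMatrix objects. The helper name save_matrix_tsv() is hypothetical, and this version does not preserve any normalizedMatrix attributes, which the real function would need to consider.

```r
# Hypothetical sketch: save a numeric matrix as a tab-delimited file.
save_matrix_tsv <- function(x, file) {
   # col.names=NA writes a blank header cell above the rownames column
   write.table(x, file = file, sep = "\t", quote = FALSE, col.names = NA)
   invisible(file)
}

m1 <- matrix(1:6, nrow = 2,
   dimnames = list(c("peak1", "peak2"), c("u1", "t1", "d1")))
m2 <- matrix(6:1, nrow = 2, dimnames = dimnames(m1))

# derive a difference matrix, then save it for later re-use
diff_file <- save_matrix_tsv(m1 - m2, tempfile(fileext = ".tsv"))

# read it back to confirm the round trip
m_diff <- as.matrix(read.table(diff_file, sep = "\t", header = TRUE,
   row.names = 1, check.names = FALSE))
```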
nmatlist2heatmaps()
DONE. Debug why the k_colors
, color_sub
usage is not working properly
with row_split
.
Handle empty matrix (entirely NULL) as if it were zero.
coverage_matrix2nmat()
Consider handling missing file by creating a matrix with zeros? Use case is to intentionally create empty matrix to be filled with derived data, for example matrix 3 minus matrix 2; or mean of matrix 1 through matrix 8.
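One possible shape for this, as a sketch only. The helper read_matrix_or_zeros() is hypothetical and assumes the caller already knows the expected row and column names, which the real coverage_matrix2nmat() would need to obtain from another matrix or from its arguments. The file reader shown is a generic tab-delimited reader, not the actual coverage file parser.

```r
# Hypothetical sketch: return a zero-filled matrix when the file is missing.
read_matrix_or_zeros <- function(file, rows, columns) {
   if (is.null(file) || is.na(file) || !file.exists(file)) {
      # intentionally empty matrix, to be filled later with derived data
      return(matrix(0, nrow = length(rows), ncol = length(columns),
         dimnames = list(rows, columns)))
   }
   # generic tab-delimited import; a stand-in for the real parser
   as.matrix(read.table(file, sep = "\t", header = TRUE,
      row.names = 1, check.names = FALSE))
}

empty_nmat <- read_matrix_or_zeros(NA, rows = c("a", "b"),
   columns = c("x", "y", "z"))
```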
DONE. nmatlist2heatmaps()
DONE. Consider applying word-wrap to color legend titles, so the title does not become unreasonably long.
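For reference, a sketch of word-wrap using base strwrap(). The helper name wrap_legend_title() and the default width are assumptions, not the implementation that was actually added.

```r
# Hypothetical helper: wrap legend titles at roughly `width` characters.
wrap_legend_title <- function(title, width = 20) {
   vapply(title, function(x) {
      # strwrap() breaks at whitespace; rejoin the pieces with newlines
      paste(strwrap(x, width = width), collapse = "\n")
   }, character(1), USE.NAMES = FALSE)
}

legend_title <- "H3K27ac coverage difference versus control"
wrapped_title <- wrap_legend_title(legend_title, width = 20)
cat(wrapped_title, "\n")
```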
DONE. rmd_tab_iterator()
DONE. The test=TRUE
functionality is not working properly for nested tabs.
design2colors()
Consider method to assign categorical colors to a column name,
for example Sample_ID="rainbowJam"
would create categorical colors
for all values in this column, preventing it from assuming the
same color as the sample group.
color_sub
to be a list
which may contain
a function
to use for color assignment of column values.
get_salmon_meta()
when a JSON file is missing, give an error with that information,
for example when meta_info.json
is absent, the error should
say that, to help the user find and fix the problem.
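A sketch of the kind of check intended, using a hypothetical helper check_salmon_json() that would run before any JSON parsing. The error wording is only a suggestion.

```r
# Hypothetical sketch: stop with the names of any missing Salmon JSON files.
check_salmon_json <- function(json_files) {
   missing_files <- json_files[!file.exists(json_files)]
   if (length(missing_files) > 0) {
      # name each absent file so the user can find and fix the problem
      stop("Required JSON file(s) not found: ",
         paste(missing_files, collapse = ", "),
         call. = FALSE)
   }
   invisible(json_files)
}

# example with a path that does not exist
absent_file <- file.path(tempdir(), "no_such_dir", "meta_info.json")
error_message <- tryCatch(
   check_salmon_json(absent_file),
   error = function(e) conditionMessage(e))
```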
rmd_tab_iterator()
Consider option to define the active tab for each layer of tabs,
see https://bookdown.org/yihui/rmarkdown-cookbook/html-tabs.html
### Tab name {.active}
import_salmon_quant()
DONE. Debug and fix occasional error "duplicate 'row.names' are not allowed"
migrate slicejam::import_featurecounts()
here
add argument curation_txt
to populate colData()
consistent with
other uses in this package.
rmd_tab_iterator()
allow tabs to be hidden
DONE. design idea: optional argument to base_fn
test=TRUE
will return
logical
indicating whether to display the tab.
DONE. When test
argument is defined for base_fn()
, for example
base_fn <- function(..., test=TRUE){ 1 }
:
base_fn(test=TRUE, ...)
FALSE
the tab is hidden and proceeds to the next iteration.
TRUE
the tab is shown, then call: base_fn(x, test=FALSE, ...)
logical
, we assume the tab contents
have already been displayed.
When test
argument is not defined for base_fn()
, for example
base_fn <- function(...){ 1 }
base_fn(...)
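The decision logic above could be sketched as follows. The helper run_one_tab() is hypothetical, it omits printing the tab heading, and it reads the third case as "any return value that is not TRUE or FALSE", which is an interpretation of the notes rather than confirmed behavior.

```r
# Hypothetical sketch of the decision logic for one tab.
# Returns TRUE when the tab is displayed, FALSE when it is hidden.
run_one_tab <- function(base_fn, ...) {
   has_test <- "test" %in% names(formals(base_fn))
   if (!has_test) {
      # no test argument: call once, assume contents were displayed
      base_fn(...)
      return(TRUE)
   }
   show_tab <- base_fn(..., test = TRUE)
   if (isFALSE(show_tab)) {
      # hidden: the caller proceeds to the next iteration
      return(FALSE)
   }
   if (isTRUE(show_tab)) {
      # shown: call again to render the tab contents
      base_fn(..., test = FALSE)
   }
   # any other return value: assume contents were already displayed
   TRUE
}

# example base_fn which only shows tabs for x > 1
drawn <- character(0)
example_fn <- function(x, test = TRUE) {
   if (test) {
      return(x > 1)
   }
   drawn <<- c(drawn, paste0("plot_", x))
   invisible(NULL)
}
shown <- c(run_one_tab(example_fn, 1), run_one_tab(example_fn, 2))
```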
catch errors in base_fn
so the tabs will continue to iterate
consider returning logical
for success (TRUE
) or any error (FALSE
),
or integer
number of errors, so 0
indicates zero errors, 4
indicates
four total tabs caused an error.
tryCatch()
around the base_fn()
calls.
rmd_tab_iterator()
it has been useful to wrap base_fn
inside tryCatch()
which
continues and prints the error, without crashing the RMarkdown.
Consider how this may be added into this function as a default action.
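A sketch of the wrapper pattern, combined with the earlier idea of returning the number of errors. The function name iterate_tabs_safely() is hypothetical and it handles one layer of tabs only.

```r
# Hypothetical sketch: iterate tab values, print errors, keep going.
iterate_tabs_safely <- function(tab_values, base_fn, ...) {
   n_errors <- 0L
   for (tab_value in tab_values) {
      tryCatch({
         base_fn(tab_value, ...)
      }, error = function(e) {
         # print the error in the document instead of stopping the render
         cat("Error in tab '", tab_value, "': ",
            conditionMessage(e), "\n", sep = "")
         n_errors <<- n_errors + 1L
      })
   }
   # 0 indicates that no tab caused an error
   invisible(n_errors)
}

# example: even-numbered tabs fail, the others succeed
n_failed <- iterate_tabs_safely(1:4, function(x) {
   if (x %% 2 == 0) stop("no suitable plot")
   x
})
```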
Need option to hide a tab, most commonly when there is no suitable plot to be produced given the combination of parameters.
FALSE
indicates the tab should not
be displayed? Unclear how the order of steps might permit
hiding a tab...
curate_se_colData()
Add an option to reorder se
samples based upon the order found
in the curation df
data.
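A sketch of the reordering step using a plain matrix in place of the se object. The helper order_by_curation() and the column name Sample_ID are assumptions for illustration; the same column index should also work for subsetting a SummarizedExperiment by column.

```r
# Hypothetical sketch: column order that follows the curation table.
order_by_curation <- function(sample_names, curation_names) {
   # position of each curation entry among the samples
   idx <- match(curation_names, sample_names)
   # drop curation entries with no matching sample
   idx[!is.na(idx)]
}

counts <- matrix(1:6, nrow = 2,
   dimnames = list(c("g1", "g2"), c("s3", "s1", "s2")))
curation_df <- data.frame(Sample_ID = c("s1", "s2", "s3", "s4"))

idx <- order_by_curation(colnames(counts), curation_df$Sample_ID)
counts_ordered <- counts[, idx, drop = FALSE]
colnames(counts_ordered)
```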
import_nanostring_csv()
DONE. Update import_nanostring_csv()
to subset by matched rows
in curation_txt
, and reorder samples to match curation_txt
.
import_nanostring_rcc()
add argument curation_txt
to behave as import_nanostring_csv()
nmatlist2heatmaps()
Migrate into coverjam
"jmw86069/coverjam"
more examples of customizing font sizes
legend_base_nrow
logic that inspects the label width,
to avoid multiple columns when labels are already very wide, it squishes
the coverage heatmap size making the heatmaps too narrow.
investigate whether figure size can be calculated/predicted upfront,
instead of having to calculate by length(nmatlist)
and estimating the
padding required for row annotations, heatmap gap spacing, and however
large the color legends might be.
new functions
subset_nmatlist()
- convenience function to subset all matrices
in the list. Intended for row subsetting rownames(nmatlist[[1]])
but could also use nmat zoom options.
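A first-pass sketch of subset_nmatlist() for row subsetting only, shown with plain matrices. Whether normalizedMatrix attributes survive depends on the subsetting method for that class, which should be verified before relying on this.

```r
# Hypothetical sketch of subset_nmatlist(): same rows from every matrix.
subset_nmatlist <- function(nmatlist, rows) {
   lapply(nmatlist, function(nmat) {
      # drop=FALSE keeps matrix form even when one row is requested
      nmat[rows, , drop = FALSE]
   })
}

row_ids <- c("r1", "r2", "r3")
nmatlist <- list(
   signal = matrix(1:6, nrow = 3, dimnames = list(row_ids, NULL)),
   control = matrix(7:12, nrow = 3, dimnames = list(row_ids, NULL)))

nmatlist_sub <- subset_nmatlist(nmatlist, rows = c("r3", "r1"))
```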
consider normalizedMatrixList
object class?
it would extend EnrichedHeatmap::normalizedMatrix
and enable subsetting
functions such as nmatlist[x, ]
which thereby subsets rows in all matrices
nmatlist2heatmaps()
it might be able to store associated heatmap parameters:
signal_ceiling
ylim
used by anno_enriched()
nmat_color
color function or color ramp
label
used as title above the heatmap
it could simplify pre-processing, such as creating "diff" matrices,
or any combination of matrices in the original nmatlist
.
nmatlist2heatmaps()
DONE. consider storing important function arguments in the returned object
k_clusters
, k_method
rows
anno_df
(?) it could be a large data object, or colnames(anno_df)
color_list
- the list of colors used for anno_df
nmat_colors
(as the final color function for each heatmap)
panel_groups
and the parameters for each panel group:
ylims
signal_ceiling
DONE. Consider figure caption with clustering information, similar to
the caption used by multienrichjam::mem_gene_path_heatmap()
:
attr(hm, "caption")
attr(hm, "draw_caption")
grid::grid.textbox()
suggested caption contents:
"N rows displayed"
OR
"N rows partitioned into M groups"
"k-means method='correlation' using heatmap X"
OR
"k-means method='correlation' using heatmaps X,Y,Z"
rmd_tab_iterator()
Need way to "hide" tabs:
when no plot is produced
certain combinations of tab values that should be hidden
base_fn.
driving example: MA-plots with useRank=TRUE
are identical
for raw and normalized data, should be shown once, for raw data,
optionally for batch-adjusted data. Either need the MA-plot code to
"hide" unnecessary plots, or provide criteria for which tabs to hide.
consider option to display heading label before each set of tabs
Previous markdown example does not include the tab type, just the tab value:
```
# figures
## tab_list_1 value_1 {.tabset}
### tab_list_2 value_1
{figure 1}
### tab_list_2 value_2
{figure 2}
## tab_list_1 value_2 {.tabset}
### tab_list_2 value_1
{figure 3}
```
New markdown example includes the tab type first, then the next layer uses each tab value:
```
# figures
## tab_list_1 label {.tabset}
### tab_list_1 value_1
#### tab_list_2 label {.tabset}
##### tab_list_2 value_1
{figure 1}
##### tab_list_2 value_2
{figure 2}
### tab_list_1 value_2
#### tab_list_2 label {.tabset}
##### tab_list_2 value_1
{figure 3}
```
migrate slicejam::import_featurecounts()
currently only imports the file format into data.frame
SummarizedExperiment
using rowRanges()
for genomic coordinates.
add argument curation_txt
migrate slicejam::fc_to_curation_txt()
convert input data into a reasonably good "curation.txt"
design2colors()
consistent with colorjam 0.0.26.900 changes.
import_salmon_quant()
be more tolerant of being given paths to quant.sf
files which
may be renamed to different filenames.
be more tolerant when GTF and tx2gene are not supplied; the import can still load transcript-level data without adding gene-level annotation and summaries.
design2colors()
DONE. accept DataFrame
input, as from SummarizedExperiment::colData()
.
"-"
prefix for colnames to reverse the sort order
group_colnames
group_colnames
, then number of unique entries, in order to impose
some reproducibility even when input column order is changed.
class_colnames
values, which would
then impose constraints on group color hues within reasonable range
of those colors. It would effectively split the color hue into a slightly
wider range of hues, centered on the class color hue.
DONE: add importer for metabolomics data
import_metabolomics_niehs()
- import LC-MS or LC-MS/MS data
processed by the NIEHS Metabolomics Core Facility, with defined file
formats.
nmatlist2heatmaps()
add method to display group labels above heatmaps:
signal
- (H3K27ac, H3K4me3, etc) across contiguous panel_groups
type
- (signal, difference) across contiguous signal types
within panel_groups
label
- trimmed label which no longer requires the signal
and
type
encoded into the name.
heatmap_column_group_labels()
style
add method to adjust padding (whitespace) around the overall heatmap layout; between heatmaps and color legend; between annotation stripes.
jamses::heatmap_se()
,
in the form of color color_list
named by column.
design2colors()
Error is thrown when there are empty values in group_colnames
fields.
DONE: Allow defining a color_max
, maximum value for numeric
columns that
will use a color gradient.
color_max
value upfront,
perhaps in the form of a named vector similar to color_sub
,
named by colnames(x).
Allow "rownames"
as valid input to group_colnames
,
essentially assigning one value per row.
group_colnames
is NULL
.
Assign colors to rownames?
class_group_color
and class_group_lightness_color
from output, or renaming each to use the combination of actual colnames
assembled to form those colors.
get_salmon_meta()
- please fix the file ordering as described below.
get_salmon_meta()
should return data in the same order the filenames
were provided, in fact it's weird that it would not already do that.
design2colors()
make it work even without group_colnames
, which is effectively
just colorizing a data.frame
by each column.
numeric
columns apply a gradient as if a categorical
color were supplied. Basic idea is that instead of assigning
categorical colors to each numeric value, for numeric columns
assign the categorical color to the column name. Then it should
proceed as it does now, when the user supplies a color_sub
for that column name.
goal would be to mimic import_proteomics_PD()
by importing
quant.sf
files, incorporate curation.txt
for sample annotations.
requires GTF
and/or tx2gene
file for transcript-to-gene.
list
with TxSE
and GeneSE
objects.
assays()
will include counts
and abundance.
COMPLETE: Salmon produces a useful file "flenDist.txt"
that includes the
distribution of fragment lengths observed, from 1 to 1000 length.
Goal is to parse this file to determine the weighted mean
fragment length, likely extending get_salmon_meta()
.
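For reference, the weighted mean itself is a short calculation. This sketch assumes the file contents have already been read into a numeric vector whose element i is the weight observed for fragment length i; the file layout should be confirmed before parsing, and the helper name weighted_mean_flen() is hypothetical.

```r
# Hypothetical sketch: weighted mean fragment length.
# Assumes weights[i] is the weight observed for length lengths[i].
weighted_mean_flen <- function(weights, lengths = seq_along(weights)) {
   sum(lengths * weights) / sum(weights)
}

# toy distribution across fragment lengths 1 through 4
toy_weights <- c(0, 1, 2, 1)
toy_mean <- weighted_mean_flen(toy_weights)
toy_mean
```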
parse_ucsc_gokey()
updates:
currently the compositeTrack output requires editing to become visible by default. It appears now to export a mix of superTrack/compositeTrack. The apparent changes required:
"subGroups view=COV"
Unclear if this line is required for visibility.
Option autoScale group
is useful for sets of tracks, and it should
be easy to configure upfront.
Not all track arguments are being honored in the final track hub entry. For example autoScale=group is not propagating through the workflow.
nmatlist2heatmaps()
transform
is used, alter the y-axis numeric labels,
and color legend numeric labels accordingly.
Add example for parse_ucsc_gokey()
showing three kinds of tracks
as input
bigWig with pos/neg or F/R or some track grouping -> multiWig overlay
bigBed -> composite Tracks
COMPLETE: new arguments for bigBed tracks
scoreFilter=0, scoreFilterLimits="0:1000"
For panel_groups
:
COMPLETE: when all heatmaps in a set of panels share the
same color gradient, the color legend should be displayed once per group,
and should be labeled using the panel_groups
value.
ht_gap
between adjacent heatmaps in the same group smaller
EnrichedHeatmap::anno_enriched(... yaxis=FALSE)
anno_df
should probably use Heatmap()
and not HeatmapAnnotation()
so that it can enable use_raster=TRUE.
Make coverage heatmaps easier for other scientists.
tab-delimited table input with the following columns:
coverage matrix file path - full path to coverage file
NA
, for example:
-250kb,center,+250kb
panel_groups
control_name - optional name of a control coverage file to be used to create a difference matrix
tab-delimited anno_df file whose first column values should match rownames in each coverage matrix file.
Make jam color gradients available:
jam_linear
- color blind friendly linear gradients
from white to color
jam_divergent
- color blind friendly divergent gradients
with black as the central color
Make it possible to define coverage differences
Currently k-means clustering and partitioning works together, but the k-means is performed on everything, then clusters are divided by partition.
The effect when partitioning by "up" and "down" is that some k-means clusters are dominated by one partition, for example cluster 1 has 95% of its members from "up" and very few from "down". For k=4, it ends up making 8 clusters, which are mostly just 4 proper clusters and 4 junk clusters (for example the "up" cluster with few "down" members).
Would be better to k-means within each partition, to make proper subclusters for each partition.
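A sketch of the per-partition approach with base kmeans() on a toy matrix. The helper kmeans_by_partition() is hypothetical, it uses the default Euclidean k-means rather than any correlation-based method, and it assumes every partition has more rows than k.

```r
# Hypothetical sketch: run k-means separately within each partition.
kmeans_by_partition <- function(x, partition, k = 2) {
   rows_by_partition <- split(rownames(x), partition)
   clusters <- lapply(names(rows_by_partition), function(p) {
      rows <- rows_by_partition[[p]]
      km <- kmeans(x[rows, , drop = FALSE], centers = k)
      # prefix with the partition so subclusters never mix partitions
      setNames(paste0(p, "_", km$cluster), rows)
   })
   # return cluster labels in the original row order
   unlist(clusters)[rownames(x)]
}

set.seed(1)
x <- matrix(rnorm(40), nrow = 20,
   dimnames = list(paste0("row", 1:20), c("a", "b")))
partition <- rep(c("up", "down"), each = 10)
row_clusters <- kmeans_by_partition(x, partition, k = 2)
table(row_clusters)
```

With k=2 and two partitions this yields four subclusters, each drawn entirely from one partition, instead of clusters split after the fact.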
nmatlist2heatmaps()
should allow optional colorSub
argument as a named vector of colors. Any categorical column
all of whose values match names(colorSub)
will use colors
in colorSub
instead of generating its own new categorical
colors. This option will help keep colors consistent.
nmatlist2heatmaps()
argument for data.frame
of annotations,
used to display alongside heatmaps, and/or used to sort the heatmap
rows. For example one column could contain "log2foldchange"
,
or a categorical variable.
normalizedMatrix
for each sample with consistent
rownames and colnames.
nmatlist2heatmaps()
except that it focuses on
just the profile plots, including optional error bars and
statistical testing by position.
nmatlist
are not required to share
the same x-axis range, so this function would probably
need to be applied to individual normalizedMatrix
entries.