inst/extdata/test_slides_2.md

<h1>Converts Xaringan Slides Into Markdown Notes</h1>
<h3>Hao Ye</h3>
<h3>Health Science Center Libraries, University of Florida</h3>
<h3>(updated: 2021-01-28)</h3>
<hr />
<h2>"Any fool can write code that a computer can understand. <br />Good programmers write code that humans can understand."</h2>
<p style="text-align:right">from "Refactoring: Improving the Design of Existing Code" by Martin Fowler</p>
<hr />
<h1>Motivations</h1>
<ul>
<li>Beyond simply working, code should ideally <strong>also</strong> be:
<ul>
<li>easy to read and understand</li>
<li>easy to maintain or change</li>
<li>obvious in its correctness</li>
<li>aesthetically pleasing?</li>
</ul>
</li>
</ul>
<hr />
<h1>Learning Outcomes</h1>
<p>By the end of the workshop, participants will be able to:</p>
<ul>
<li>implement functions to make code more modular and increase reproducibility</li>
<li>use data structures to manage inputs and outputs</li>
<li>recognize and fix basic code smells</li>
</ul>
<hr />
<h1>Disclaimer</h1>
<ul>
<li>Concepts are intended to be universal, but code examples will be in <strong><code>R</code></strong>.</li>
<li>Like any other skill, <em>effective</em> practice matters!
<ul>
<li>working code != easy-to-read code (e.g. answers from Stack Overflow)</li>
<li>when reading, ask yourself "is it clear what is going on?" (even if you don't know the detailed mechanisms)</li>
</ul>
</li>
</ul>
<hr />
<h1>Breaking Code into Functions</h1>
<hr />
<h1>What are functions?</h1>
<ul>
<li><strong>Functions</strong> let you refer to another piece of code by a (hopefully informative!) 
name</li>
</ul>
<pre><code class="r">mean() # computes arithmetic mean of input
</code></pre>
<ul>
<li>You can write your own functions, too!</li>
</ul>
<pre><code class="r">celsius_to_fahrenheit &lt;- function(x) {
  9/5 * x + 32
}
</code></pre>
<hr />
<h1>Why write your own functions?</h1>
<ul>
<li>It seems like extra work...</li>
<li>BUT, it enables:
<ul>
<li>repeating the same operation on different inputs (e.g. datasets, variables, parameter values)</li>
<li>clarifying the larger organizational structure of the code</li>
</ul>
</li>
</ul>
<hr />
<h1>Duplication</h1>
<pre><code class="r">df &lt;- data.frame(a = rnorm(10),
                 b = rnorm(10),
                 c = rnorm(10))

# rescale all the columns of df
df$a &lt;- (df$a - min(df$a)) / (max(df$a) - min(df$a))
df$b &lt;- (df$b - min(df$b)) / (max(df$b) - min(df$a))
df$c &lt;- (df$c - min(df$c)) / (max(df$c) - min(df$c))
</code></pre>
<hr />
<h1>Define a function!</h1>
<pre><code class="r">rescale01 &lt;- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# rescale all the columns of df
df$a &lt;- rescale01(df$a)
df$b &lt;- rescale01(df$b)
df$c &lt;- rescale01(df$c)

# or with dplyr
library(dplyr)
df &lt;- df %&gt;%
  mutate_at(c("a", "b", "c"), rescale01)
</code></pre>
<hr />
<h1>Notes</h1>
<ul>
<li>calculation defined once and re-used</li>
<li>changes or corrections only need to happen in one place</li>
<li>fewer chances for error when modifying after copy-paste</li>
</ul>
<hr />
<h1>Workflow structure</h1>
<p><img src="project-diagram.svg" title="Diagram of the workflow in a hypothetical data analysis project with boxes representing code and data/output files. &quot;Raw data&quot; goes into &quot;Pre-processing&quot; and then &quot;Pre-processed data&quot;. &quot;Pre-processed data&quot; goes directly into &quot;Figure 3&quot; (code) and then &quot;Figure 3&quot; (Data file), but also into &quot;Analysis/Modelling&quot;. 
The &quot;Model&quot; output from &quot;Analysis/Modelling&quot; is used in code for &quot;Figure 1&quot; and &quot;Figure 2&quot;, which generate files &quot;Figure 1&quot; and &quot;Figure 2&quot;." alt="Diagram of the workflow in a hypothetical data analysis project with boxes representing code and data/output files. &quot;Raw data&quot; goes into &quot;Pre-processing&quot; and then &quot;Pre-processed data&quot;. &quot;Pre-processed data&quot; goes directly into &quot;Figure 3&quot; (code) and then &quot;Figure 3&quot; (Data file), but also into &quot;Analysis/Modelling&quot;. The &quot;Model&quot; output from &quot;Analysis/Modelling&quot; is used in code for &quot;Figure 1&quot; and &quot;Figure 2&quot;, which generate files &quot;Figure 1&quot; and &quot;Figure 2&quot;." width="60%" /></p>
<p>modified from "Reproducible research best practices @JupyterCon" (version 18) by Rachael Tatman, https://www.kaggle.com/rtatman/reproducible-research-best-practices-jupytercon</p>
<hr />
<h1>Example Code</h1>
<pre><code class="r">data_raw &lt;- readRDS("readings.dat")
data_proc &lt;- preprocess_data(data_raw)
fitted_model &lt;- run_model(data_proc)

plot_model_forecasts(fitted_model, "figure-1_abundance-forecasts.pdf")
plot_abundance_residuals(fitted_model, "figure-2_abundance-residuals.pdf")
plot_abundance_histogram(data_proc, "figure-3_abundance-histogram.pdf")
</code></pre>
<hr />
<h1>Notes</h1>
<ul>
<li>The steps of the analysis are clear.</li>
<li>When making changes/additions:
<ul>
<li>you know which function to change</li>
<li>adding a new analysis: write a new function, include it in the workflow script</li>
</ul>
</li>
<li>Ideally save <code>data_proc</code> and <code>fitted_model</code> objects to a file. 
Then figure code can be changed without re-running the model.</li>
</ul>
<hr />
<h1>Tips for writing functions</h1>
<ul>
<li>name things well</li>
<li>plan for flexibility</li>
<li>split large tasks into smaller units</li>
</ul>
<hr />
<h1>Tip 1: Naming Things</h1>
<ul>
<li>function names should be verbs</li>
</ul>
<pre><code class="r"># bad
row_adder()
permutation()

# good
add_row()
permute()
</code></pre>
<p>examples from https://style.tidyverse.org/functions.html#naming</p>
<hr />
<h1>Tip 2: Plan for flexibility</h1>
<pre><code class="r">plot_abundance_histogram &lt;- function(data_proc, filename,
                                     width = 6, height = 6) {
  # {{code}}
}
</code></pre>
<ul>
<li>input data and location of output are easily changed</li>
<li>width and height are adjustable, but have reasonable defaults</li>
</ul>
<hr />
<h1>Tip 3: Function size</h1>
<ul>
<li>Each function should have a single well-defined task
<ul>
<li>this makes testing and debugging easier</li>
</ul>
</li>
<li>Functions should ideally be 50 lines or fewer
<ul>
<li>not a hard rule; divide work into functions sensibly!</li>
</ul>
</li>
</ul>
<hr />
<h1>Tip 3 (cont'd)</h1>
<ul>
<li>If a line or set of lines of code is complicated, it might need to be its own function (with a good name)</li>
</ul>
<pre><code class="r"># bad
if (class(x)[[1]] == "numeric" || class(x)[[1]] == "integer")

# good
if (is.numeric(x))
</code></pre>
<p>examples from https://speakerdeck.com/jennybc/code-smells-and-feels?slide=36</p>
<hr />
<h1>Code Smells</h1>
<hr />
<h1>What are "code smells"?</h1>
<ul>
<li>aspects of the code that make it appear less than ideal</li>
<li>the code is not necessarily buggy</li>
<li>but it is hard to tell!</li>
</ul>
<hr />
<h1>Example (code smell)</h1>
<hr />
<h1>Solution</h1>
<ul>
<li>follow a coding style guide</li>
<li>use technological tools like automatic indentation and linters</li>
<li>understand patterns and how to fix them</li>
<li><strong>refactoring</strong>: rewriting code 
without changing its behavior (i.e. make it faster, cleaner, easier to use)</li>
</ul>
<hr />
<ul>
<li>indentation</li>
<li>spaces</li>
</ul>
<hr />
<h1>Comments</h1>
<hr />
<h1>Thanks</h1>
<ul>
<li>Let me know what content you'd like to see</li>
<li>Contact me for additional questions or consultation requests!</li>
<li>Check back in on the libguide for more modules and contact info:</li>
<li>https://guides.uflib.ufl.edu/reproducibility</li>
</ul>
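<p>The <code>rescale01()</code> function from the "Define a function!" slide can also be applied to every column at once, without naming each column. The following is a minimal base-R sketch, assuming every column of <code>df</code> is numeric and not constant; the <code>set.seed()</code> call is an addition for reproducibility and is not part of the original slides.</p>

```r
# Definition repeated from the "Define a function!" slide.
rescale01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

set.seed(42)  # assumption: fixed seed so the example is reproducible
df <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))

# Assigning into df[] replaces every column while keeping the data.frame structure.
df[] <- lapply(df, rescale01)

# Show the minimum and maximum of each rescaled column.
sapply(df, range)
```

<p>Because the calculation lives in one function, there is no per-column formula left to mistype after copy-paste.</p>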

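<p>The workflow notes suggest saving <code>data_proc</code> and <code>fitted_model</code> to a file so that figure code can change without re-running the model. One possible way to sketch that idea is shown here; <code>load_or_compute()</code> and <code>slow_step()</code> are hypothetical names introduced for illustration (they do not appear in the slides), and <code>slow_step()</code> is only a trivial stand-in for a step such as <code>run_model(data_proc)</code>.</p>

```r
# Hypothetical helper: read a saved object if the file exists,
# otherwise compute it, save it, and return it.
load_or_compute <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

# Stand-in for a slow step; the counter records how often it actually runs.
n_runs <- 0
slow_step <- function() {
  n_runs <<- n_runs + 1
  mean(c(1, 2, 3))
}

cache_file <- tempfile(fileext = ".rds")  # temporary path, for the example only
first <- load_or_compute(cache_file, slow_step)   # computes and saves
second <- load_or_compute(cache_file, slow_step)  # reads the saved file instead
```

<p>In a real project the path would point to a file inside the project rather than a temporary file, and the saved file would need to be deleted whenever the upstream code or data changes.</p>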



ha0ye/RMDconverter documentation built on Feb. 4, 2021, 8:55 p.m.