Batch save plots (Advanced; skip for now)

knitr::opts_chunk$set(
  echo = TRUE,
  results = 'show',
  fig.path = "OutputPlots/",  # every figure is saved into this folder
  dev = c('svg', 'png'),      # each figure is written in both formats
  fig.width = 10,             # knitr always reads figure sizes in inches
  fig.height = 8
)

Load data


Preview data structure



Generate dummy data to revisit join()
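
A minimal sketch of such dummy data, assuming two small tibbles that share an `id` key. The names `df_left` and `df_right` and their values are illustrative, not the tutorial's own objects.

```r
library(dplyr)

# Hypothetical tables: ids 1 and 2 appear in both,
# 3 only on the left, 4 only on the right
df_left  <- tibble(id = c(1, 2, 3), x = c("x1", "x2", "x3"))
df_right <- tibble(id = c(1, 2, 4), y = c("y1", "y2", "y4"))

# Keys shared by both tables
shared_ids <- intersect(df_left$id, df_right$id)
shared_ids
```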


join()

Mutating join()

Inner join() == intersection

![](join GIF\inner-join.gif)
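
As a rough illustration, `inner_join()` keeps only the rows whose key appears in both tables. The tibbles below are hypothetical stand-ins.

```r
library(dplyr)

# Hypothetical tables sharing the key column `id`
df_left  <- tibble(id = c(1, 2, 3), x = c("x1", "x2", "x3"))
df_right <- tibble(id = c(1, 2, 4), y = c("y1", "y2", "y4"))

# Only ids 1 and 2 exist on both sides, so only those rows survive
inner <- inner_join(df_left, df_right, by = "id")
inner
```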


Outer join()

left_join()

![](join GIF\left-join.gif)
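
A small sketch with hypothetical tibbles: `left_join()` keeps every row of the left table and fills the right-hand columns with `NA` where no match exists.

```r
library(dplyr)

# Hypothetical tables sharing the key column `id`
df_left  <- tibble(id = c(1, 2, 3), x = c("x1", "x2", "x3"))
df_right <- tibble(id = c(1, 2, 4), y = c("y1", "y2", "y4"))

# id 3 has no partner on the right, so its `y` becomes NA
left <- left_join(df_left, df_right, by = "id")
left
```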


right_join()

![](join GIF\right-join.gif)
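
The mirror case, again with hypothetical tibbles: `right_join()` keeps every row of the right table and fills the left-hand columns with `NA` where needed.

```r
library(dplyr)

# Hypothetical tables sharing the key column `id`
df_left  <- tibble(id = c(1, 2, 3), x = c("x1", "x2", "x3"))
df_right <- tibble(id = c(1, 2, 4), y = c("y1", "y2", "y4"))

# id 4 has no partner on the left, so its `x` becomes NA
right <- right_join(df_left, df_right, by = "id")
right
```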


full_join() == union

![](join GIF\full-join.gif)
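
A sketch with hypothetical tibbles: `full_join()` keeps every key found in either table, leaving `NA` wherever one side has no match.

```r
library(dplyr)

# Hypothetical tables sharing the key column `id`
df_left  <- tibble(id = c(1, 2, 3), x = c("x1", "x2", "x3"))
df_right <- tibble(id = c(1, 2, 4), y = c("y1", "y2", "y4"))

# All four distinct ids appear in the result
full <- full_join(df_left, df_right, by = "id")
full
```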


Filtering join()

semi_join()

![](join GIF\semi-join.gif)


semi_join() uses the keys from the right df to filter the rows of the left df, keeping only the rows whose ID also exists in the right df. No columns from the right df are added. Is there a more direct method?
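
One more direct route, sketched here on hypothetical tibbles, is to filter the left table on its key with `%in%`. For a single key column this should give the same rows as `semi_join()`.

```r
library(dplyr)

# Hypothetical tables sharing the key column `id`
df_left  <- tibble(id = c(1, 2, 3), x = c("x1", "x2", "x3"))
df_right <- tibble(id = c(1, 2, 4), y = c("y1", "y2", "y4"))

# Filtering join: keep left rows that have a match on the right
semi <- semi_join(df_left, df_right, by = "id")

# Direct alternative: test key membership with %in%
semi_direct <- df_left %>% filter(id %in% df_right$id)

# Both approaches keep the same ids
identical(semi$id, semi_direct$id)
```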


anti_join()

![](join GIF\anti-join.gif)


anti_join() uses the keys from the right df to filter the rows of the left df, keeping only the rows whose ID does not exist in the right df. Is there a more direct method?

`%!in%` <- Negate(`%in%`) #Negate() returns a negated function of the target function 
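
A sketch of how the negated operator could stand in for `anti_join()` on a single key column. The tibbles are hypothetical.

```r
library(dplyr)

# Negated membership operator, as defined above
`%!in%` <- Negate(`%in%`)

# Hypothetical tables sharing the key column `id`
df_left  <- tibble(id = c(1, 2, 3), x = c("x1", "x2", "x3"))
df_right <- tibble(id = c(1, 2, 4), y = c("y1", "y2", "y4"))

# Filtering join: keep left rows with no match on the right
anti <- anti_join(df_left, df_right, by = "id")

# Direct alternative: keep ids that are not in the right table
anti_direct <- df_left %>% filter(id %!in% df_right$id)

# Both approaches keep the same ids
identical(anti$id, anti_direct$id)
```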

DataFrames binding/assembly

bind_rows()

Returns tables one on top of the other as a single table.
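
For illustration, with two hypothetical batches: `bind_rows()` matches columns by name and fills columns missing from one table with `NA`.

```r
library(dplyr)

# Hypothetical batches; the second one carries an extra column
batch1 <- tibble(id = c(1, 2), x = c("x1", "x2"))
batch2 <- tibble(id = c(3, 4), x = c("x3", "x4"), note = c("new", "new"))

# Rows are stacked; `note` is NA for the rows that came from batch1
stacked <- bind_rows(batch1, batch2)
stacked
```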


bind_cols()

Returns tables placed side by side as a single table.
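
A short sketch with hypothetical tibbles. Note that `bind_cols()` pairs rows purely by position, so both tables need the same number of rows in the same order; when that is not guaranteed, a join on a key is usually the safer choice.

```r
library(dplyr)

# Hypothetical tables with the same number of rows, in the same order
ids    <- tibble(id = c(1, 2, 3))
values <- tibble(x = c("x1", "x2", "x3"))

# Columns are placed side by side; row 1 pairs with row 1, and so on
side_by_side <- bind_cols(ids, values)
side_by_side
```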


Data transformation on the fly for ggplot2


Data filtering by df subsetting

Suppose I want to investigate the potential relationship between organ and GC content, but I'm only interested in certain organs and RNA types. I can subset the df before handing it to ggplot2.


`%!in%` <- Negate(`%in%`)
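
A minimal sketch of this kind of subsetting. The data frame `rna` and its columns `organ`, `rna_type`, and `gc_content` are hypothetical stand-ins for the tutorial's data.

```r
library(dplyr)
library(ggplot2)

`%!in%` <- Negate(`%in%`)

# Hypothetical data, not the tutorial's dataset
rna <- tibble(
  organ      = c("brain", "brain", "liver", "liver", "heart", "heart"),
  rna_type   = c("mRNA", "lncRNA", "mRNA", "rRNA", "mRNA", "lncRNA"),
  gc_content = c(0.52, 0.41, 0.49, 0.60, 0.47, 0.44)
)

# Keep two organs of interest and drop one RNA type
subset_rna <- rna %>%
  filter(organ %in% c("brain", "liver"), rna_type %!in% c("rRNA"))

# Plot only the subset; the original df is left untouched
p <- ggplot(subset_rna, aes(x = organ, y = gc_content, colour = rna_type)) +
  geom_point()
```

Printing `p` draws the plot. Because the filter happens in the pipeline, `rna` itself still holds all six rows.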

Conditional styling by ifelse()


If you want to filter out the 'other' category, ifelse() must also be evaluated on the filtered df rather than the original one, because the original df no longer matches the rows being plotted.
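
The sketch below shows one way this can play out, assuming a hypothetical data frame `rna` and a made-up `highlight` column. Computing the label inside the pipeline keeps it in step with whatever rows remain.

```r
library(dplyr)
library(ggplot2)

# Hypothetical data, not the tutorial's dataset
rna <- tibble(
  organ      = c("brain", "brain", "liver", "heart"),
  gc_content = c(0.52, 0.41, 0.49, 0.47)
)

# Label rows inside the pipeline so labels always match the current rows
labelled <- rna %>%
  mutate(highlight = ifelse(organ == "brain", "brain", "other"))

p_all <- ggplot(labelled, aes(x = organ, y = gc_content, colour = highlight)) +
  geom_point() +
  scale_colour_manual(values = c(brain = "red", other = "grey60"))

# After dropping 'other', a vector built from the original df no longer fits
brain_only <- labelled %>% filter(highlight != "other")
length(ifelse(rna$organ == "brain", "brain", "other"))  # 4, one per original row
nrow(brain_only)                                        # 2 rows remain
```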


Text processing with stringr

library('stringr')

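str_view_all('abcdefg','bc|f')        # 'bc' or 'f'
str_view_all('abcdefg','[bdf]')       # any one of b, d, f
str_view_all('abcdefg','[^bdf]')      # any character except b, d, f
str_view_all('abcdefg','[b-f]')       # any character in the range b to f
str_view_all(c('abc','def'),'^a')     # 'a' at the start of the string
str_view_all(c('abc','def'),'f$')     # 'f' at the end of the string
str_view_all('loooloolol','o?')       # zero or one 'o'
str_view_all('loooloolol','o*')       # zero or more 'o'
str_view_all('loooloolol','o+')       # one or more 'o'
str_view_all('loooloolol','o{2,}')    # two or more 'o' in a row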
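
The same patterns can also be checked outside the viewer. This is a small sketch using `str_extract_all()`, `str_detect()`, and `str_count()` on the strings above.

```r
library(stringr)

# Alternation: every match of 'bc' or 'f'
alt <- str_extract_all("abcdefg", "bc|f")[[1]]
alt                                    # "bc" "f"

# Negated character class: everything except b, d, f
not_bdf <- str_extract_all("abcdefg", "[^bdf]")[[1]]
not_bdf                                # "a" "c" "e" "g"

# Anchors: which strings start with 'a' or end with 'f'?
starts_a <- str_detect(c("abc", "def"), "^a")
ends_f   <- str_detect(c("abc", "def"), "f$")

# Quantifiers: count runs of 'o'
runs_1plus <- str_count("loooloolol", "o+")     # 3 runs: "ooo", "oo", "o"
runs_2plus <- str_count("loooloolol", "o{2,}")  # 2 runs: "ooo", "oo"
```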

stringr with ggplot2
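
One possible way to combine the two, sketched with hypothetical sample names: derive a grouping variable from a text column with `str_extract()` and pass the result straight to ggplot2. The data frame `samples` and its columns are assumptions for illustration.

```r
library(dplyr)
library(stringr)
library(ggplot2)

# Hypothetical data: the organ is encoded at the start of the sample name
samples <- tibble(
  sample     = c("brain_rep1", "brain_rep2", "liver_rep1", "liver_rep2"),
  gc_content = c(0.52, 0.50, 0.47, 0.49)
)

# Pull the leading lowercase letters out of each sample name
samples <- samples %>%
  mutate(organ = str_extract(sample, "^[a-z]+"))

# Use the extracted text as the x axis
p <- ggplot(samples, aes(x = organ, y = gc_content)) +
  geom_point()
```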




hirscheylab/tidybiology documentation built on May 20, 2022, 10:55 p.m.