Base Graphics

We will continue to investigate the yeast data from earlier. Make sure that you have the data and ggplot2 loaded into the session as part of your new script (if you started one).

library("ggplot2")
data(yeast, package = "jrIntroBio")

Scatter plots

Let's start with a basic scatter plot of McGeoch's method (mcg) and von Heijne's method (gvh) for signal sequence recognition.

ggplot(yeast, aes(x = mcg, y = gvh)) +
  geom_point()

Whilst this can be informative at the data exploration stage, it isn't very aesthetically pleasing. First off, the default axis labels are just the column names, which are not very informative.

(1) Use the x, y and title arguments^[Arguments are the things we pass to our function inside the () to control the behavior of that function.] of the labs() function to change the axis labels to something more sensible.

ggplot(yeast, aes(x = mcg, y = gvh)) +
  geom_point() +
  labs(x = "McGeoch's method",
       y = "von Heijne's method",
       title = "Correlation of signal sequence detection measures")

(1) Each measure takes values between 0 and 1. Use the ylim() and xlim() functions to specify a new axis range. For example, for the x-axis it would be + xlim(0, 1).

ggplot(yeast, aes(x = mcg, y = gvh)) +
  geom_point() +
  labs(x = "McGeoch's method",
       y = "von Heijne's method") +
  xlim(0, 1) +
  ylim(0, 1)

(1) Try changing the colour of your points using the colour argument of geom_point(). The colors() function lists the colour names that R recognises. For instance, colour = "red" would change the points to red.

ggplot(yeast, aes(x = mcg, y = gvh)) +
  geom_point(colour = "red") +
  labs(x = "McGeoch's method",
       y = "von Heijne's method") +
  xlim(0, 1) +
  ylim(0, 1)

(1) We could make this even neater by colouring points by a column in our data. Try replacing colour = "red" with aes(colour = class). Why do you think we need the aes() function?

```r
ggplot(yeast, aes(x = mcg, y = gvh)) +
  geom_point(aes(colour = class)) +
  labs(x = "McGeoch's method",
       y = "von Heijne's method") +
  xlim(0, 1) +
  ylim(0, 1)

# need the aes() function because we are mapping
# a variable in the data set to an aesthetic
```

We should now have a plot that looks like Figure 2.

ggplot(yeast, aes(x = mcg, y = gvh)) +
  geom_point(aes(colour = class)) +
  labs(x = "McGeoch's method",
       y = "von Heijne's method") +
  xlim(0, 1) +
  ylim(0, 1)
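
The difference between setting a colour and mapping one, raised in the last exercise, can be checked directly. The following is a minimal sketch, assuming only that ggplot2 is installed. The toy data frame and its values are invented for illustration and stand in for yeast; the object names (toy, p_set, p_map) are likewise hypothetical.

```r
library("ggplot2")

# Toy stand-in for the yeast data; these values are invented for illustration
toy = data.frame(
  mcg = c(0.2, 0.4, 0.6, 0.8),
  gvh = c(0.3, 0.5, 0.4, 0.7),
  class = c("A", "A", "B", "B")
)

# Setting: colour is a fixed value, so it goes outside aes()
p_set = ggplot(toy, aes(x = mcg, y = gvh)) +
  geom_point(colour = "red")

# Mapping: colour depends on the class column, so it goes inside aes()
p_map = ggplot(toy, aes(x = mcg, y = gvh)) +
  geom_point(aes(colour = class))

# layer_data() returns the data frame ggplot2 uses to draw a layer
set_colours = unique(layer_data(p_set)$colour)
map_colours = unique(layer_data(p_map)$colour)

length(set_colours)  # number of distinct colours when setting
length(map_colours)  # number of distinct colours when mapping
```

A common slip is to write aes(colour = "red"). That treats "red" as a variable with a single level, so the points are drawn in the first colour of the default palette and a legend appears.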

Extras

Here is the code for a histogram of McGeoch's method (mcg) for signal sequence recognition.

ggplot(yeast, aes(x = mcg)) +
  geom_histogram()

(1) Try changing the bin width using the binwidth argument of geom_histogram(). For instance, binwidth = 0.02.

ggplot(yeast, aes(x = mcg)) +
  geom_histogram(binwidth = 0.02)

(1) Try adding fill = class to the aesthetic mapping function, aes(). What happens?

ggplot(yeast, aes(x = mcg, fill = class)) +
  geom_histogram(binwidth = 0.02)
# each class gets a different fill colour

(1) Try adding the argument alpha = 0.2 to the geom_histogram() function. What happens?

ggplot(yeast, aes(x = mcg, fill = class)) +
  geom_histogram(binwidth = 0.02, alpha = 0.2)
# alpha controls the transparency

(1) Try changing geom_histogram() to geom_density(). The binwidth argument is dropped here, because geom_density() does not use it and would ignore it with a warning.

ggplot(yeast, aes(x = mcg, fill = class)) +
  geom_density(alpha = 0.2)
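
To see what binwidth does in the histogram exercises above, it can help to count the bars that ggplot2 draws. This is a rough sketch, assuming only that ggplot2 is installed. The simulated values stand in for the mcg column and are not the yeast data; the object names are hypothetical.

```r
library("ggplot2")

# Simulated values between 0 and 1, standing in for yeast$mcg
set.seed(1)
toy = data.frame(mcg = runif(200))

# Same data, two different bin widths
p_wide = ggplot(toy, aes(x = mcg)) +
  geom_histogram(binwidth = 0.1)
p_narrow = ggplot(toy, aes(x = mcg)) +
  geom_histogram(binwidth = 0.02)

# Each row of layer_data() corresponds to one bar of the histogram
n_wide = nrow(layer_data(p_wide))
n_narrow = nrow(layer_data(p_narrow))

n_wide    # fewer, wider bars
n_narrow  # more, narrower bars
```

A smaller binwidth shows more detail but also more noise, so it is worth trying a few values before settling on one.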



jr-packages/jrIntroBio documentation built on Dec. 24, 2019, 8:03 a.m.