Basic plots

Also see: https://github.com/biovcnet/topic-R/blob/master/Lesson-6/lesson-06-bvcn-full.R

0.1 Bar

library(tidyverse)
head(starwars)
# A tibble: 6 × 14
  name      height  mass hair_color skin_color eye_color birth_year sex   gender
  <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> 
1 Luke Sky…    172    77 blond      fair       blue            19   male  mascu…
2 C-3PO        167    75 <NA>       gold       yellow         112   none  mascu…
3 R2-D2         96    32 <NA>       white, bl… red             33   none  mascu…
4 Darth Va…    202   136 none       white      yellow          41.9 male  mascu…
5 Leia Org…    150    49 brown      light      brown           19   fema… femin…
6 Owen Lars    178   120 brown, gr… light      blue            52   male  mascu…
# ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,
#   vehicles <list>, starships <list>

0.1.1 Star Wars visuals

What species are present on each planet?

Create a bar plot where the x axis shows homeworlds and the y axis shows the number of different species on each planet.
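One possible approach to this exercise is sketched below. It assumes that rows with a missing homeworld or species should be dropped before counting, which the exercise does not specify; the object names `species_per_world` and `p_species` are placeholders chosen for illustration.

```r
library(dplyr)
library(ggplot2)

# Count the distinct species recorded for each homeworld.
# Assumption: rows with an unknown homeworld or species are excluded.
species_per_world <- starwars %>%
  filter(!is.na(homeworld), !is.na(species)) %>%
  group_by(homeworld) %>%
  summarise(n_species = n_distinct(species), .groups = "drop")

# geom_col() plots the precomputed counts directly as bar heights.
p_species <- ggplot(species_per_world, aes(x = homeworld, y = n_species)) +
  geom_col() +
  labs(x = "Homeworld", y = "Number of distinct species") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))  # rotate crowded labels

p_species
```

Because the counts are computed before plotting, `geom_col()` is used rather than `geom_bar()`, which by default counts rows itself.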

0.2 Point & line

For all humans in Star Wars, what is the relationship between height and mass? Is it affected by their planet of origin?

starwars %>%
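The pipeline above is left unfinished as a starting point. One way it might be completed is sketched below, assuming that characters with a missing height or mass should be dropped and that colouring points by homeworld is an acceptable way to look at planet of origin. The names `humans` and `p_humans` are placeholders.

```r
library(dplyr)
library(ggplot2)

# Keep only humans with both measurements available (an assumption;
# the exercise does not say how to treat missing values).
humans <- starwars %>%
  filter(species == "Human", !is.na(height), !is.na(mass))

p_humans <- ggplot(humans, aes(x = height, y = mass, colour = homeworld)) +
  geom_point() +
  # One overall linear trend line across all humans, not one per homeworld.
  geom_smooth(aes(group = 1), method = "lm", formula = y ~ x,
              se = FALSE, colour = "grey40") +
  labs(x = "Height (cm)", y = "Mass (kg)", colour = "Homeworld")

p_humans
```

Setting `group = 1` in `geom_smooth()` keeps the trend line from being fitted separately for every homeworld, many of which have only one human.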

0.3 Boxplot

1 First, what data am I looking at on Tatooine?

# View(starwars)

tatooine <- starwars %>% 
  filter(homeworld == "Tatooine") %>% # select only those from Tatooine
  data.frame

2 How many species on Tatooine?

unique(starwars$species); length(unique(starwars$species))
 [1] "Human"          "Droid"          "Wookiee"        "Rodian"        
 [5] "Hutt"           NA               "Yoda's species" "Trandoshan"    
 [9] "Mon Calamari"   "Ewok"           "Sullustan"      "Neimodian"     
[13] "Gungan"         "Toydarian"      "Dug"            "Zabrak"        
[17] "Twi'lek"        "Aleena"         "Vulptereen"     "Xexto"         
[21] "Toong"          "Cerean"         "Nautolan"       "Tholothian"    
[25] "Iktotchi"       "Quermian"       "Kel Dor"        "Chagrian"      
[29] "Geonosian"      "Mirialan"       "Clawdite"       "Besalisk"      
[33] "Kaminoan"       "Skakoan"        "Muun"           "Togruta"       
[37] "Kaleesh"        "Pau'an"        
[1] 38
unique(tatooine$species); length(unique(tatooine$species))
[1] "Human" "Droid"
[1] 2
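Note that `unique()` treats NA as a value, so the count of 38 for the full data set includes one entry for missing species. As a small sketch, `n_distinct()` with `na.rm = TRUE` counts only known species, and `count()` shows how many individuals belong to each species on Tatooine. The names `n_all`, `n_known` and `tatooine_counts` are placeholders.

```r
library(dplyr)

# unique() keeps NA as its own entry, so this count includes it.
n_all <- length(unique(starwars$species))

# n_distinct() can ignore missing values.
n_known <- n_distinct(starwars$species, na.rm = TRUE)

# Number of individuals per species on Tatooine.
tatooine_counts <- starwars %>%
  filter(homeworld == "Tatooine") %>%
  count(species)

n_all
n_known
tatooine_counts
```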

3 Bar chart example

3.1 First, an example of a bar chart, but something isn’t right

ggplot(tatooine, aes(x = species, y = height)) +
  geom_bar(stat = "identity")

4 What’s wrong with this?
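A hint: with `stat = "identity"`, `geom_bar()` draws one segment per row and stacks segments that share an x value, so the top of each bar is the sum of the heights of every individual of that species, not a typical height. The sketch below computes those totals directly; `stacked_totals` is a placeholder name.

```r
library(dplyr)

# Reproduce what the stacked bars actually display:
# the number of individuals and their summed heights per species.
stacked_totals <- starwars %>%
  filter(homeworld == "Tatooine") %>%
  group_by(species) %>%
  summarise(
    n_individuals = n(),
    total_height = sum(height, na.rm = TRUE),  # ignore any missing heights
    .groups = "drop"
  )

stacked_totals
```

A species with more individuals gets a taller bar regardless of how tall those individuals are, which is why the chart is misleading.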

starwars %>% 
  filter(homeworld == "Tatooine") %>%
  group_by(species, homeworld) %>%
  summarise(MEAN_height = mean(height), MEDIAN_height = median(height),
            MAX_height = max(height), MIN_height = min(height),
            MEAN_mass = mean(mass), MEDIAN_mass = median(mass))
`summarise()` has grouped output by 'species'. You can override using the
`.groups` argument.
# A tibble: 2 × 8
# Groups:   species [2]
  species homeworld MEAN_height MEDIAN_height MAX_height MIN_height MEAN_mass
  <chr>   <chr>           <dbl>         <dbl>      <int>      <int>     <dbl>
1 Droid   Tatooine         132           132         167         97      53.5
2 Human   Tatooine         179.          180.        202        163      NA  
# ℹ 1 more variable: MEDIAN_mass <dbl>
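The NA in `MEAN_mass` for humans arises because `mean()` returns NA as soon as any value in the group is missing. A sketch of the same summary with `na.rm = TRUE`, plus a count of the missing values so they are not silently hidden, could look like this. The column `n_missing_mass` and the name `tatooine_summary` are added for illustration.

```r
library(dplyr)

tatooine_summary <- starwars %>%
  filter(homeworld == "Tatooine") %>%
  group_by(species) %>%
  summarise(
    MEAN_height = mean(height, na.rm = TRUE),
    MEDIAN_height = median(height, na.rm = TRUE),
    MEAN_mass = mean(mass, na.rm = TRUE),
    MEDIAN_mass = median(mass, na.rm = TRUE),
    n_missing_mass = sum(is.na(mass)),  # how many masses were skipped
    .groups = "drop"                    # also silences the grouping message
  )

tatooine_summary
```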
hist((filter(tatooine, species == "Human"))$height)  

hist((filter(tatooine, species == "Droid"))$mass) 

A better way to show these data is with a box plot. Let’s revisit the same question with that representation.

ggplot(tatooine, aes(x = species, y = height)) +
  geom_boxplot()

4.0.1 Boxplots in ggplot:

  • middle line = median
  • upper/lower hinges = 3rd and 1st quartiles (75th and 25th percentiles)
  • whiskers = extend from each hinge to the most extreme value that lies within 1.5 * inter-quartile range (IQR, the distance between the hinges) of that hinge
  • outliers (values beyond the whiskers) are shown as points
  • NOTE: the hinges are computed slightly differently from base R ‘boxplot()’

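The quantities in the list above can be computed by hand, which may help when reading a box plot. The sketch below does this for human heights on Tatooine, assuming missing heights are dropped and that quartiles from `quantile()` with its default settings are an acceptable stand-in for the hinges. All variable names are placeholders.

```r
library(dplyr)

heights <- starwars %>%
  filter(homeworld == "Tatooine", species == "Human", !is.na(height)) %>%
  pull(height)

# Hinges: 1st and 3rd quartiles.
hinges <- quantile(heights, probs = c(0.25, 0.75), names = FALSE)
iqr <- hinges[2] - hinges[1]

# Whiskers may reach at most 1.5 * IQR beyond each hinge.
lower_limit <- hinges[1] - 1.5 * iqr
upper_limit <- hinges[2] + 1.5 * iqr

# Whisker ends are the most extreme observed values inside those limits.
whisker_low <- min(heights[heights >= lower_limit])
whisker_high <- max(heights[heights <= upper_limit])

# Anything outside the limits would be drawn as an individual point.
outliers <- heights[heights < lower_limit | heights > upper_limit]

c(median = median(heights), lower_hinge = hinges[1], upper_hinge = hinges[2],
  whisker_low = whisker_low, whisker_high = whisker_high)
outliers
```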
ggplot(tatooine, aes(x = species, y = height)) +
  geom_boxplot() +
  geom_point()
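When raw points are overlaid on a box plot, outliers are drawn twice (once by each layer) and identical values overlap. A possible variation, sketched below, hides the box plot's own outlier points and jitters the raw points horizontally. The jitter width and transparency are arbitrary choices, and `tatooine_heights` and `p_box` are placeholder names.

```r
library(dplyr)
library(ggplot2)

# Assumption: individuals without a recorded height are dropped.
tatooine_heights <- starwars %>%
  filter(homeworld == "Tatooine", !is.na(height))

p_box <- ggplot(tatooine_heights, aes(x = species, y = height)) +
  geom_boxplot(outlier.shape = NA) +                  # do not draw outliers twice
  geom_jitter(width = 0.15, height = 0, alpha = 0.7)  # spread points sideways only

p_box
```

Setting `height = 0` in `geom_jitter()` keeps the vertical positions exact, so the points still show the true heights.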