R Exercises for Beginners – 1-10

Practicing is a crucial part of learning a new language. Statistical languages like R are no exception of that rule. Many of my students think the same and would love to see more exercises. Therefore, I decided to write an R exercise sheet for beginners and blog it over here. These R exercises are an add-on to the already existing exercise videos available in each and every R-Tutorial product.

On this sheet you will find 10 R exercises corresponding to the material taught in R Basics and R Level 1. The exercise solutions can be found here in an extra blog post.

I will first write down the exercise question and below the question you can see how the console output should look like. The corresponding code is to be found on the solutions page.

Exercises

 

1a. Get the length of the “lynx” dataset

[1] 114

1b. Create a vector of ordered “lynx” index numbers (hint: order, increasing)

[1]  69  22  70  71  23  68  99  98  12  78 100  21  79  13  72  59  24  32  60 101  41  42  49
[24]   1  14  40  58   2  88  51  33  30  31  73  89  80  67 102  15  20  61  50 109  11 108  25
[47]  43   3  77 110  97  39  48  34  62  57  81  52  90   4  29 111  26 103  74  82  91  56   5
[70] 107 112  53  44  35  54  19  87  63  38  27  55  16 104  66  28  10 113  17  92  36  64   6
[93]  37 106  95  94  45 114  18  83  76 105  96  86  93   7  75  47  65   9   8  85  46  84

1c. Get 2 vectors (index positions and absolute values) with all “lynx” observations smaller than 500 (hint: which, subset)

[1]   1   2  12  13  14  15  20  21  22  23  24  30  31  32  33  40  41  42  49  50  51  58  59
[24]  60  61  67  68  69  70  71  72  73  78  79  80  88  89  98  99 100 101 102 109

and

[1] 269 321  98 184 279 409 409 151  45  68 213 361 377 225 360 299 236 245 255 473 358 299 201
[24] 229 469 389  73  39  49  59 188 377 105 153 387 345 382  81  80 108 229 399 485
length(lynx)
order(lynx)
which(lynx < 500)
subset(lynx, lynx < 500)

 

2a. Get a Histogram of the “lynx” dataset

2b. Adjust the bin size to 7 bins

2c. Remove the labs and change the bins to 2 alternating colors

2d. Add a subtitle and divide the main title in 2 lines

exercise beginners 1

 

hist(lynx, col=c("salmon2", "darkblue"), breaks=7,
sub="r-tutorials.com", xlab="", ylab="",
main="Exercise Question\nHistogram")


Hint: use \n to get text to next level

 

3a. Extract Sepal.Length from the “iris” dataset and call the resulting vector mysepal

3b. Get the summation, mean, median, max and min of mysepal

[1] 876.5
[1] 5.843333
[1] 5.8
[1] 4.3
[1] 7.9

3c. Get the summary of mysepal and compare the results with 3b

Min.  1st Qu.  Median    Mean 3rd Qu.    Max.
4.300   5.100   5.800   5.843   6.400   7.900
mysepal = iris$Sepal.Length
sum(mysepal);
mean(mysepal);
median(mysepal);
min(mysepal);
max(mysepal)
summary(mysepal)

 

4a. Install and load library “dplyr”

4b. Call help for function arrange of “dplyr”

4c. Deinstall “dplyr”

install.packages("dplyr")
library(dplyr)
?arrange
remove.packages("dplyr")

5. Data for this exercise:

x = c(3,6,9)

5a. Repeat x 4 times, but there should be an extra 1 at the end and beginning

[1] 1 3 6 9 3 6 9 3 6 9 3 6 9 1

5b. Call the vector of 5a myvec and extract the 5th value (hint: dplyr, nth)

[1] 3
myvec = c(1, rep(x, times=4), 1)
library(dplyr)
nth(myvec, 5)

 

6a. Get a subset of “mtcars” with transmission type: manual, and call it mysubset

                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora  15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

6b. Extract the first 2 variables of the first 2 observations of mysubset

              mpg cyl
Mazda RX4      21   6
Mazda RX4 Wag  21   6
mysubset = subset(mtcars, am == 1)
mysubset[c(1,2), c(1,2)]

 

7a. Get the first 9 observations of “mtcars”

                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2

7b. Get the “mtcars” dataset ordered according to the increasing amount of “carb”
hint for 7b: library dplyr, arrange

 

head(mtcars, 9)
library(dplyr)
arrange(mtcars, carb)

 

8a. Get the means of the first 2 columns in the “iris” dataset by Species

iris$Species: setosa
Sepal.Length  Sepal.Width
       5.006        3.428

--------------------------
iris$Species: versicolor
Sepal.Length  Sepal.Width
       5.936        2.770
--------------------------
iris$Species: virginica
Sepal.Length  Sepal.Width
       6.588        2.974

8b. Create vector x which is the alternation of 1 and 2, 75 times, check the length

[1] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2
[47] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2
[93] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2
[139] 1 2 1 2 1 2 1 2 1 2 1 2

8c. Attach x to iris as dataframe “irisx”, check the head

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species x
1          5.1         3.5          1.4         0.2  setosa 1
2          4.9         3.0          1.4         0.2  setosa 2
3          4.7         3.2          1.3         0.2  setosa 1
4          4.6         3.1          1.5         0.2  setosa 2
5          5.0         3.6          1.4         0.2  setosa 1
6          5.4         3.9          1.7         0.4  setosa 2

8d. Get the colsums of columns 1,3,4 in the “irisx” dataset by the new x variable

irisx$x: 1
Sepal.Length Petal.Length  Petal.Width
       438.0        283.2         91.4
---------------------------------------
irisx$x: 2
Sepal.Length Petal.Length  Petal.Width
       438.5        280.5         88.5
by(iris[,1:2], iris$Species, colMeans)
x = rep(1:2, times=75); length(x)
irisx = data.frame(iris, x); head(irisx)
by(irisx[,c(1,3,4)], irisx$x, colSums)

 

9a. How many observations in “lynx” are smaller than 500?

[1] 43

9b. How many observations in “iris” have a Sepal.Length greater or equal 5?

[1] 128
sum(lynx < 500)
sum(iris$Sepal.Length >= 5)

 

10a. Plot a simple xy plot with “iris” Sepal.Length vs. Sepal.Width
10b. Enlarge the scale limits: y from 0 – 5, x from 3 – 9
10c. Add a text of your choosing, in red, at the lower part of the plot

exercise beginners 2

 

attach(iris)
plot(Sepal.Length, Sepal.Width)
plot(Sepal.Length, Sepal.Width, ylim=c(0,5), xlim=c(3,9))
text(6,1, "r-tutorials.com", col="red", cex=2, lwd=2)
Quality R Training for You