Matrices, Data Frames, Functions, Conditionals, Loops with R

 

Guest post by Slaviana Pavlovich, Microsoft Student Partner

My name is Slaviana Pavlovich. I am an IT and Management student at University College London with a passion for data science. I recently completed the Microsoft Professional Program for Data Science, where I developed core skills to work with data. If you are also interested in this career, but not sure where to start - I strongly encourage you to check it out. I also have a wide range of interests including 3D bioprinting, public speaking, and politics. Additionally, I enjoy swimming and photography to balance out my studies. I became a Microsoft Student Partner at the end of my first year and I absolutely enjoy being part of such a vibrant community. If you have any questions, feel free to ask!

Introduction

In today’s article, I am going to continue talking about R. In the second part of this two-part introduction to R (the first part is available here), we are going to consider:

• Matrices

• Data Frames

• Functions

• Conditionals

• Loops

Matrices

A matrix is another data type that we are going to look at. It is a two-dimensional data set in which all elements have the same type. A matrix is created using the function matrix():

 > # creating a matrix
 > example <- matrix(c(99,45,4,47,2,5), nrow = 3, ncol = 2, byrow = TRUE)
 > example
      [,1] [,2]
 [1,]   99   45
 [2,]    4   47
 [3,]    2    5

As the example above shows, nrow and ncol set the number of rows and columns. The argument byrow = TRUE fills the matrix row by row, while byrow = FALSE (the default) fills it column by column. Let’s look at the following example:

 > # creating a 2x3 matrix that contains the numbers from 1 to 6 and is filled by columns
 > example.2 <- matrix(1:6, nrow = 2, ncol = 3, byrow = FALSE)
 > example.2
      [,1] [,2] [,3]
 [1,]    1    3    5
 [2,]    2    4    6
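
To see the effect of byrow on the same data, here is a small sketch that fills two matrices from one vector. The names values, by.rows and by.cols are only illustrative.

```r
# Same six values, two fill orders (variable names are illustrative)
values <- 1:6
by.rows <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)
by.cols <- matrix(values, nrow = 2, ncol = 3, byrow = FALSE)

by.rows[1, ]   # first row when filled by rows:    1 2 3
by.cols[1, ]   # first row when filled by columns: 1 3 5
dim(by.rows)   # both matrices are 2 x 3
```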

To change the names of rows and columns of the matrix use dimnames:

 > # creating a matrix
 > A <- matrix(1:6, nrow = 3, byrow = TRUE)
 > A
      [,1] [,2]
 [1,]    1    2
 [2,]    3    4
 [3,]    5    6
 > # setting row and column names
 > dimnames(A) <- list(c("1row", "2row", "3row"), c("1col", "2col"))
 > A
      1col 2col
 1row    1    2
 2row    3    4
 3row    5    6
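
As an alternative sketch, the base functions rownames() and colnames() set the two sets of names separately. Once names are set, they can also be used to pick out elements.

```r
A <- matrix(1:6, nrow = 3, byrow = TRUE)

# Set row and column names one at a time
rownames(A) <- c("1row", "2row", "3row")
colnames(A) <- c("1col", "2col")

A["2row", "1col"]   # element in row "2row" and column "1col": 3
A[, "2col"]         # whole column "2col" as a named vector
```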

There are certain operations you can do with matrices. You can transpose a matrix using the function t():

 > M <- matrix(c(14,2,4,3,2,5), nrow = 2, ncol = 3, byrow = TRUE)
 > M
      [,1] [,2] [,3]
 [1,]   14    2    4
 [2,]    3    2    5
 > t(M)
      [,1] [,2]
 [1,]   14    3
 [2,]    2    2
 [3,]    4    5

Furthermore, use the solve() function to find the inverse of a square matrix:

 > X <- matrix(1:4, nrow = 2, byrow = TRUE)
 > X
      [,1] [,2]
 [1,]    1    2
 [2,]    3    4
 > solve(X)
      [,1] [,2]
 [1,] -2.0  1.0
 [2,]  1.5 -0.5
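
A quick way to check an inverse is to multiply it by the original matrix, which should give the identity matrix. The sketch below uses the matrix multiplication operator %*% and compares with all.equal(), because floating-point rounding can make an exact comparison fail. The names X.inv and product are illustrative.

```r
X <- matrix(1:4, nrow = 2, byrow = TRUE)
X.inv <- solve(X)

# A matrix times its inverse should be the identity matrix,
# up to floating-point rounding (hence all.equal rather than ==)
product <- X %*% X.inv
all.equal(product, diag(2))

det(X)   # a non-zero determinant means that the inverse exists
```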

Arithmetic operations such as +, -, * and / are done element-wise:

 > A <- matrix(1:6, nrow = 3, byrow = TRUE)
 > B <- matrix(1:6, nrow = 2, byrow = TRUE)
 > A
      [,1] [,2]
 [1,]    1    2
 [2,]    3    4
 [3,]    5    6
 > B
      [,1] [,2] [,3]
 [1,]    1    2    3
 [2,]    4    5    6
 > A + 2
      [,1] [,2]
 [1,]    3    4
 [2,]    5    6
 [3,]    7    8
 > B / 2
      [,1] [,2] [,3]
 [1,]  0.5  1.0  1.5
 [2,]  2.0  2.5  3.0

For matrix multiplication use “%*%”:

 > A %*% B
      [,1] [,2] [,3]
 [1,]    9   12   15
 [2,]   19   26   33
 [3,]   29   40   51
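
It is easy to mix up * and %*%, so here is a small sketch with a square matrix (the name S is illustrative) that shows the two side by side.

```r
S <- matrix(1:4, nrow = 2, byrow = TRUE)

# Element-wise product: every entry is multiplied by itself
elementwise <- S * S
elementwise        # rows: 1 4 / 9 16

# Matrix product: rows of the first matrix times columns of the second
matrix.product <- S %*% S
matrix.product     # rows: 7 10 / 15 22
```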

To select elements of a matrix, use square brackets with the row and column indices:

 > K <- matrix(4:7, nrow = 2, byrow = TRUE)
 > K
      [,1] [,2]
 [1,]    4    5
 [2,]    6    7
 > K[1,2] # element at 1st row and 2nd column
 [1] 5
 > K[1,] # first row
 [1] 4 5
 > K[,2] # second column
 [1] 5 7
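
Note that selecting a single row or column returns a plain vector. If you need the result to stay a matrix, one option is the drop = FALSE argument, as in this sketch (row.vec and row.mat are illustrative names).

```r
K <- matrix(4:7, nrow = 2, byrow = TRUE)

row.vec <- K[1, ]                 # plain vector, dimensions are dropped
row.mat <- K[1, , drop = FALSE]   # still a matrix with 1 row and 2 columns

is.matrix(row.vec)   # FALSE
dim(row.mat)         # 1 2

K[, c(2, 1)]         # several columns at once, here in reversed order
```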

Finally, we can always modify a matrix:

 > V <- matrix(1:9, ncol = 3)
 > V
      [,1] [,2] [,3]
 [1,]    1    4    7
 [2,]    2    5    8
 [3,]    3    6    9
 > V[1,3] <- 0; V    # modify a single element at 1st row and 3rd column to 0
      [,1] [,2] [,3]
 [1,]    1    4    0
 [2,]    2    5    8
 [3,]    3    6    9
 > V[V>2] <- 1; V    # change all elements greater than 2 to 1
      [,1] [,2] [,3]
 [1,]    1    1    0
 [2,]    2    1    1
 [3,]    1    1    1

Data Frames

After looking at matrices, I suggest learning about data frames. A data frame is a special case of a list (another data object in R that was considered in the first part of the article). Data frames are used for storing tables. Unlike in a matrix, each column is a vector that can hold its own type of data (logical, numeric, character, complex, etc.), as long as all columns have the same length. The function data.frame() is used to create a data frame:

 > # creating a data frame
 > table <- data.frame(name=c("Jack", "Karan", "Thomas", "Vito", "Kristine"),
 +                     age=c(19, 20, 19, 19, 19),
 +                     sex=c("M", "M", "M", "M", "F"),
 +                     colour=c("yellow", "red", "green", "blue", "pink"))
 > table
       name age sex colour
 1     Jack  19   M yellow
 2    Karan  20   M    red
 3   Thomas  19   M  green
 4     Vito  19   M   blue
 5 Kristine  19   F   pink
 > typeof(table)
 [1] "list"
 > class(table)
 [1] "data.frame"
 > # useful functions for a data frame
 > names(table)
 [1] "name"   "age"    "sex"    "colour"
 > nrow(table)
 [1] 5
 > ncol(table)
 [1] 4
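
To check that columns really can hold different types, you can look at the class of each column. This is a minimal sketch with a made-up data frame called people; the student column is hypothetical. I set stringsAsFactors = FALSE explicitly because, before R 4.0.0, data.frame() converted character columns to factors by default.

```r
# A small illustrative data frame with three different column types
people <- data.frame(name = c("Jack", "Karan"),
                     age = c(19, 20),
                     student = c(TRUE, FALSE),
                     stringsAsFactors = FALSE)

column.classes <- sapply(people, class)   # class of every column
column.classes

str(people)   # compact overview: dimensions, column names and types
```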

There are several ways of selecting columns of a data frame:

 > table[2:4] # columns starting from 2nd to 4th of data frame
   age sex colour
 1  19   M yellow
 2  20   M    red
 3  19   M  green
 4  19   M   blue
 5  19   F   pink
 > table[c("colour","age")] # columns with the titles colour and age from data frame
   colour age
 1 yellow  19
 2    red  20
 3  green  19
 4   blue  19
 5   pink  19
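
Columns, single cells and rows can also be reached in other ways. The sketch below rebuilds part of the data under the illustrative name people (which also avoids masking the base function table()) and shows the $ and [[ ]] operators together with a row filter.

```r
people <- data.frame(name = c("Jack", "Karan", "Thomas", "Vito", "Kristine"),
                     age = c(19, 20, 19, 19, 19),
                     sex = c("M", "M", "M", "M", "F"),
                     stringsAsFactors = FALSE)

people$age                    # one column as a vector
people[["name"]]              # the same idea, column chosen by its name
people[2, "age"]              # a single cell: 20
older <- people[people$age > 19, ]   # only the rows where age is above 19
older
```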

In a similar way to matrices, it is possible to change the values of the elements:

 > table[3,"age"] <- 20; table # modify the element at 3rd row and column age to 20
       name age sex colour
 1     Jack  19   M yellow
 2    Karan  20   M    red
 3   Thomas  20   M  green
 4     Vito  19   M   blue
 5 Kristine  19   F   pink
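
The same kind of assignment can add a column or update rows chosen by a condition. This is only a sketch: the data frame people and the column age.next.year are hypothetical names used for illustration.

```r
people <- data.frame(name = c("Jack", "Karan", "Thomas"),
                     age = c(19, 20, 19),
                     stringsAsFactors = FALSE)

people$age[3] <- 20                          # change one value via $ and [ ]
people$age.next.year <- people$age + 1       # add a derived column
people[people$name == "Jack", "age"] <- 21   # update rows chosen by a condition
people
```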

Functions

There is a straightforward way of creating your own functions in R. Let’s consider an example where our function finds the difference between two numbers:

 > example <- function (a, b) {
 +          c <- a - b
 +          c
 +      }
 >  example(15, 1)
 [1] 14
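
Function arguments can also have default values and can be passed by name. The sketch below is a variation on the example above; the function name difference and the default b = 1 are my own illustrative choices.

```r
# b has a default value, so it may be left out of the call
difference <- function(a, b = 1) {
    return(a - b)   # return() makes the result explicit
}

difference(15)              # uses the default b = 1, giving 14
difference(15, 5)           # positional arguments, giving 10
difference(b = 2, a = 10)   # named arguments in any order, giving 8
```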

As shown above, the keyword function is used to declare a function in R. Now we are going to create a function that prints the type and class of its argument:

 > example <- function(X) {
 +     print(typeof(X))
 +     print(class(X))
 +     print(paste("The type is", typeof(X), "and class is", class(X)))
 + }
 > Y<-c("Vito")
 > example(Y)
 [1] "character"
 [1] "character"
 [1] "The type is character and class is character"
 > Z<-c(11)
 > example(Z)
 [1] "double"
 [1] "numeric"
 [1] "The type is double and class is numeric"

If you want to take an input from the user, use the function readline() in R:

 > read.example <- function()
 + {
 +     str <- readline(prompt="Your name: ")
 +     return(as.character(str))
 + }
 > print(paste("Nice to meet you,", read.example(), "!"))
 Your name: Dre
 [1] "Nice to meet you, Dre !"
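
One thing to keep in mind is that readline() only waits for input in an interactive session. When a script is run non-interactively, it returns an empty string straight away. The sketch below is one possible way to handle that; the function greet and its default argument are hypothetical. It also uses paste0() to avoid the extra space before the exclamation mark.

```r
greet <- function(default = "stranger") {
    # Ask for a name only when someone can actually type one
    name <- if (interactive()) readline(prompt = "Your name: ") else ""
    # Fall back to the default when nothing was entered
    if (name == "") {
        name <- default
    }
    paste0("Nice to meet you, ", name, "!")
}

greet()
```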

Conditionals

To use conditional execution in R, we are going to use the if…else statement:

 > x <- 4
 > if (x < 0) {
 +     print("It is a negative number!")
 + } else if (x > 0) {
 +     print("It is a positive number!")
 + } else
 +     print("Zero!")
 [1] "It is a positive number!"
 
 > x <- -10
 > if (x < 0) {
 +     print("It is a negative number!")
 + } else if (x > 0) {
 +     print("It is a positive number!")
 + } else
 +     print("Zero!")
 [1] "It is a negative number!"
 
 > x <- 0
 > if (x < 0) {
 +     print("It is a negative number!")
 + } else if (x > 0) {
 +     print("It is a positive number!")
 + } else
 +     print("Zero!")
 [1] "Zero!"
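
Instead of repeating the same if…else block for every value, it can be wrapped in a function. In this sketch the name describe.number is illustrative. Since if expects a single TRUE or FALSE, the function is applied to each element of a vector with sapply(), and ifelse() is shown as a vectorised alternative for a simple two-way choice.

```r
describe.number <- function(x) {
    if (x < 0) {
        "negative"
    } else if (x > 0) {
        "positive"
    } else {
        "zero"
    }
}

describe.number(4)                       # "positive"
labels <- sapply(c(4, -10, 0), describe.number)
labels                                   # "positive" "negative" "zero"

# ifelse() tests a whole vector at once
ifelse(c(4, -10, 0) < 0, "negative", "not negative")
```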

Loops

Now we are going to consider the control statements in R, such as for{}, repeat{} and while{}.

· A for{} loop in the example below is going to print the first three numbers in the vector Y:

 > Y <- c(17, 25, 19, 33, 11, 51, 55)
 > for(i in 1:3) {
 + print(Y[i])
 + }
 [1] 17
 [1] 25
 [1] 19
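
A for{} loop can also run over the values of a vector directly, or over all of its positions with seq_along(). This is a minimal sketch; total and big are illustrative names.

```r
Y <- c(17, 25, 19, 33, 11, 51, 55)

# Loop over the values themselves to build a running total
total <- 0
for (value in Y) {
    total <- total + value
}
total   # 211, the same as sum(Y)

# Loop over positions with seq_along(), which also works for empty vectors
big <- c()
for (i in seq_along(Y)) {
    if (Y[i] > 50) {
        big <- c(big, i)
    }
}
big     # positions 6 and 7 hold values above 50
```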

· In this example, a repeat{} loop prints the value of task and breaks after three iterations:

 > task <- c("R is great!")
 > i <- 3
 > repeat {
 +     i <- i + 1
 +     print(task)
 +     if(i > 5) {
 +         break
 +     }
 + }
 [1] "R is great!"
 [1] "R is great!"
 [1] "R is great!"

· A while{} loop repeats the commands in its body as long as the condition is true:

 > i <- 1
 > while(i < 5) {
 +     print(i)
 +     i <- i + 1
 + }
 [1] 1
 [1] 2
 [1] 3
 [1] 4
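
Inside any of these loops, break leaves the loop and next skips to the following iteration. Here is a small sketch that collects the odd numbers below 10; the name odd.numbers is illustrative.

```r
i <- 0
odd.numbers <- c()
while (TRUE) {
    i <- i + 1
    if (i > 9) {
        break              # stop the loop completely
    }
    if (i %% 2 == 0) {
        next               # skip even numbers
    }
    odd.numbers <- c(odd.numbers, i)
}
odd.numbers   # 1 3 5 7 9
```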

Resources

There are many interesting resources online that can help you further with R, and I strongly recommend checking them out. In the next article, I am going to cover data visualisation, so stay tuned!

https://academy.microsoft.com/en-us/professional-program/ Microsoft professional programmes, Big Data, Data Science
https://imagine.microsoft.com/en-us/Catalog R Server Download for Students & Academics via Imagine Access
/en-us/r-server/ R Server and R Documentation
https://www.microsoft.com/en-gb/cloud-platform/r-server Microsoft R Server