# Data Structures in R

R provides several types of objects (data structures) for storing data. The most frequently used are:

- Vectors
- Matrices
- Arrays
- Factors
- Data Frames
- Lists
- Table

**Vectors**

A vector is the most basic data structure in R. It is a collection of elements of the same type, such as character, logical, integer or numeric.

*a <- c(1,2,3,4)*

*a*

Output :

[1] 1 2 3 4

We can check whether the object “**a**” is a vector:

*is.vector(a)*

Output:

[1] TRUE

A character vector “**b**” is created in the same way, by passing quoted strings to c().
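The original example for “**b**” is not shown, so the following is only a minimal sketch with made-up values. It assumes nothing beyond base R and checks the result the same way as for “**a**”.

```r
# Hypothetical values, chosen only for illustration
b <- c("red", "green", "blue")
b
is.vector(b)   # TRUE
class(b)       # "character"
```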

**Matrices**

A matrix is a collection of data elements arranged in a two-dimensional rectangular layout.

We can look up the documentation of the matrix function with:

*?matrix*

This opens the description and syntax of the matrix function in the Help window.

The syntax of matrix is:

matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)

data = the data used to fill the matrix

nrow = number of rows

ncol = number of columns

byrow = if TRUE, the matrix is filled by row; if FALSE (the default), it is filled by column

dimnames = a list of names to assign to the rows and columns

Number of elements = nrow * ncol

In the example below we supply six elements and set nrow = 3, so R finds ncol = 2.

The default value of byrow is FALSE and of dimnames is NULL.

*mat <- matrix(c(1,0,4,2,-1,5),nrow= 3)*

Here byrow = FALSE, so the elements of the matrix are filled column-wise.

[,1] [,2] # column indices

[1,] 1 2 # row indices

[2,] 0 -1

[3,] 4 5

We index an element of the matrix mat as mat[r,c], where r is the row index and c is the column index.

*mat[1,2] # element in the first row and second column*

Output :

*[1] 2*

*mat[2,2]*

Output :

*[1] -1*

We can show all elements of a whole row with mat[r,] and of a whole column with mat[,c].

*mat[1,] # all elements of the first row*

Output:

*[1] 1 2*

*mat[,2] # all elements of the second column*

Output:

*[1] 2 -1 5*

We can also fill the elements by row, by setting byrow = TRUE.

The elements are then placed in the matrix row by row.
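As a small sketch of row-wise filling, the code below reuses six values with byrow = TRUE. The name `mat_by_row` is an assumption made for this illustration and does not appear in the original text.

```r
# Six values filled row by row (mat_by_row is a hypothetical name)
mat_by_row <- matrix(c(1, 0, 4, 2, -1, 5), nrow = 3, byrow = TRUE)
mat_by_row
#      [,1] [,2]
# [1,]    1    0
# [2,]    4    2
# [3,]   -1    5
```

Compared with the column-wise default, the first two values now land in the first row instead of the first column.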

We check the class of the **mat** object with:

*class(mat)*

Output:

[1] "matrix" "array"

(In R versions before 4.0.0 the output is only "matrix".)

We can give names to the rows and columns by using **dimnames**:

*mat <- matrix(c(1,0,4,2,-1,5),nrow= 3,dimnames = list(c("a","b","c"),c("x","y")))*

*mat*

Recycling:

Recycling means reuse. When the data is shorter than the object to be filled, R reuses the data from the beginning to fill the remaining positions.

We create a matrix from 5 elements and set the number of rows to 2, so the number of columns is 3.

*x<-matrix(c(1,2,3,4,5),2)*

R prints a warning message saying that the data length is not a multiple of the number of rows. It then recycles the data, starting again with the first element, to fill the remaining cell.

In the same way, if we create a matrix of 10 elements from a shorter vector, the elements are repeated to fill the remaining positions.
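The recycling behaviour described above can be sketched as follows. The second matrix, `y`, is a hypothetical example added for illustration, and `suppressWarnings()` is used here only to keep the demonstration quiet.

```r
# 5 values into 2 rows: R needs 6 cells, so the first value is reused
# (suppressWarnings hides the length-mismatch warning for this demo)
x <- suppressWarnings(matrix(c(1, 2, 3, 4, 5), 2))
x
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    1

# 5 values into a 2 x 5 matrix: 10 cells, so every value is used twice
y <- matrix(1:5, nrow = 2, ncol = 5)
y
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    3    5    2    4
# [2,]    2    4    1    3    5
```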

We can also create a matrix from an existing data object:

*a<-c(5,3,8,7,11,9)*

We can create a matrix by using dim(): assigning dimensions to the object “**a**” with dim() turns it into a matrix.

*a <- 1:20*

We assign the dimensions as rows x columns.

*dim(a) <- c(4,5) # number of rows = 4 , number of columns =5*

*a*

Output :

[,1] [,2] [,3] [,4] [,5]

[1,] 1 5 9 13 17

[2,] 2 6 10 14 18

[3,] 3 7 11 15 19

[4,] 4 8 12 16 20

We can transpose a matrix by using t().

**%*% Operator**

This operator performs matrix multiplication. For example, it can be used to multiply a matrix by its transpose.
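A minimal sketch of both operations, using a small matrix `m` whose values are assumed for illustration:

```r
m <- matrix(1:6, nrow = 2)   # 2 x 3 matrix (hypothetical example values)
tm <- t(m)                   # 3 x 2 transpose
p <- m %*% tm                # (2 x 3) %*% (3 x 2) gives a 2 x 2 matrix
p
#      [,1] [,2]
# [1,]   35   44
# [2,]   44   56
```

Note that `%*%` requires the number of columns of the left matrix to equal the number of rows of the right one, which is why a matrix can always be multiplied by its own transpose.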

We can bind matrices by row or by column.

**cbind()**

We can bind two matrices column-wise. When we bind columns, the matrices must have the same number of rows.

*X<-c(1,2,5,7,8)*

*Y<-c(11,24,85,98,12)*

*cbind(X,Y)*

Output:

X Y

[1,] 1 11

[2,] 2 24

[3,] 5 85

[4,] 7 98

[5,] 8 12

When two matrices do not have the same number of rows, they cannot be joined column-wise: R raises an error when we try to bind them.

We can also bind vectors by using the following code:

*v<-c(1,2,4,5,9)*

*h<-c(2,8,9,4,7)*

*cbind(v,h)*
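To illustrate the row-count requirement, here is a short sketch with three made-up matrices (`m1`, `m2`, `m3` are hypothetical names). It uses `tryCatch()` so that the error from the mismatched case is captured instead of stopping the script.

```r
m1 <- matrix(1:6, nrow = 3)    # 3 x 2
m2 <- matrix(7:9, nrow = 3)    # 3 x 1
combined <- cbind(m1, m2)      # same number of rows, so this gives 3 x 3
dim(combined)

# Matrices with different numbers of rows cannot be bound column-wise
m3 <- matrix(1:4, nrow = 2)    # 2 x 2
msg <- tryCatch(
  {
    cbind(m1, m3)
    "no error"
  },
  error = function(e) "error: row counts differ"
)
msg
```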

**rbind()**

We can bind matrices row-wise. When we bind rows, the two matrices must have the same number of columns.

We can also bind vectors row-wise in the same way.
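The original code for this step is not shown. As a sketch, the vectors `v` and `h` from the cbind() example can be bound row-wise like this (the name `r` is an assumption for illustration):

```r
v <- c(1, 2, 4, 5, 9)
h <- c(2, 8, 9, 4, 7)
r <- rbind(v, h)   # each vector becomes one row, named after the vector
r
#   [,1] [,2] [,3] [,4] [,5]
# v    1    2    4    5    9
# h    2    8    9    4    7
dim(r)
```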

**Arrays**

We can store data in more than two dimensions. If we create an array of dimension (2,3,4), it contains 4 rectangular matrices, each with 2 rows and 3 columns.

An array is created by using array(). We use **dim** to assign the dimensions of the array.

Array data is recycled in the same way as matrix data. We create two vectors and pass them to array() to fill the elements of the array.

*vector1 <- c(5,9,3)*

*vector2 <- c(10,11,12,13,14,15,16)*

*result <- array(c(vector1,vector2),dim = c(3,3,2))*

*result*

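Output:

, , 1

[,1] [,2] [,3]

[1,] 5 10 13

[2,] 9 11 14

[3,] 3 12 15

, , 2

[,1] [,2] [,3]

[1,] 16 3 12

[2,] 5 10 13

[3,] 9 11 14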

We can give names to the rows, columns and matrices in the array by using the **dimnames** parameter.

*vector1 <- c(5,9,3)*

*vector2 <- c(10,11,12,13,14,15)*

*column.names <- c("first","second","third")*

*row.names <- c("first","second","third")*

*matrix.names <- c("Matrix1","Matrix2")*

*result <- array(c(vector1,vector2),dim = c(3,3,2),dimnames = list(row.names,column.names,matrix.names))*

We can show the third row of the second matrix of the array.

result[3,,2]

We can show the element in the 1st row and 2nd column of the 1st matrix.

result[1,2,1]

[1] 10

We can show the whole second matrix:

result[,,2]

We can create matrices from the array:

*mat1<-result[,,1]*

*mat2<-result[,,2]*

We can also add the two matrices.
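The addition itself is not shown in the original, so the following is a sketch that rebuilds `result` (without dimnames, for brevity) and adds the two slices element-wise. The name `total` is an assumption for this illustration.

```r
vector1 <- c(5, 9, 3)
vector2 <- c(10, 11, 12, 13, 14, 15)
result <- array(c(vector1, vector2), dim = c(3, 3, 2))

mat1 <- result[, , 1]
mat2 <- result[, , 2]   # identical to mat1, because the 9 values are recycled

total <- mat1 + mat2    # element-wise sum of the two 3 x 3 matrices
total
#      [,1] [,2] [,3]
# [1,]   10   20   26
# [2,]   18   22   28
# [3,]    6   24   30
```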

**Factors**

Factors are data objects used to categorize data and store it as levels. They can store both strings and integers.

Factors are created using the factor() function.

*l <- factor(c("male","female"))*

The levels are all the possible values of the given object. We can check the levels of the object:

*levels(l)*

[1] "female" "male"

We create another factor variable **Name** :

*Name<-c(1,2,1,1,2,1,2,1,2,1,2,1,2,1)*

*Name<-factor(Name)*

*levels(Name)*

*[1] "1" "2"*

*class(Name)*

*[1] "factor"*

To relabel the levels of the factor **Name** with Roman numerals, we use the assignment form of the **levels()** function:

*levels(Name) = c('I','II')*
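The following sketch shows how levels relate to the stored values. The `gender` vector and the shortened `Name` vector are made-up data for illustration only.

```r
gender <- factor(c("male", "female", "female", "male"))  # hypothetical data
levels(gender)        # "female" "male" (sorted alphabetically by default)
as.integer(gender)    # 2 1 1 2: the integer codes behind the labels

Name <- factor(c(1, 2, 1, 1, 2))
levels(Name) <- c("I", "II")   # relabel level "1" as "I" and "2" as "II"
as.character(Name)             # "I" "II" "I" "I" "II"
```

Relabelling through `levels()` changes only the labels; the underlying integer codes of the factor stay the same.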

**Table**

table() is used to build a contingency table of the counts at each combination of factor levels.

*mons = c("March","April","January","November","January",*

*"September","October","September","November","August",*

*"January","November","November","February","May","August",*

*"July","December","August","August","September","November",*

*"February","April")*

*mons = factor(mons)*

*table(mons)*

*mons*
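As a sketch of how the resulting table can be used, the code below stores it in a variable (the name `counts` is an assumption) and looks up individual counts by month name.

```r
mons <- factor(c("March", "April", "January", "November", "January",
                 "September", "October", "September", "November", "August",
                 "January", "November", "November", "February", "May", "August",
                 "July", "December", "August", "August", "September", "November",
                 "February", "April"))

counts <- table(mons)    # one count per level, levels in alphabetical order
counts[["November"]]     # 5
counts[["August"]]       # 4
sum(counts)              # 24, the total number of observations
names(counts)[1]         # "April"
```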

**Data Frames**

A data frame is a two-dimensional data structure. The characteristics of a data frame are:

- The column names should be non-empty: every column is assigned a name.
- The row names should be unique.
- The data can be of numeric, character or factor type.
- Each column should contain the same number of elements.

We create a data frame named “myFirstDataFrame” as:

*myFirstDataFrame <- data.frame(name = c("Bob", "Fred", "Barb", "Sue","Jeff"),*

*age = c(21,18,18,24,20),*

*hgt= c(70,67,64,66,72),*

*wgt= c(180,156,128,118,202),*

*race= c("Cauc", "Af.Am","Af.Am", "Cauc", "Asian"),*

*year= c("Jr","Fr","Fr","Sr","So"),*

*SAT= c(1080,1210,840,1340,880))*

*myFirstDataFrame*

We can view the data frame with:

*View(myFirstDataFrame)*

We can find the number of rows and columns by using nrow() and ncol().
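A minimal sketch of these inspection functions, using a cut-down data frame `df` whose name and contents are assumptions made for illustration:

```r
# A cut-down, hypothetical version of myFirstDataFrame
df <- data.frame(
  name = c("Bob", "Fred", "Barb"),
  age  = c(21, 18, 18)
)

nrow(df)        # 3
ncol(df)        # 2
dim(df)         # 3 2
df$age          # one column as a vector: 21 18 18
df[2, "name"]   # the value in row 2 of the column "name"
```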

**Lists**

A list is a generic vector. It can combine objects of different types.

We can create a list as:

*my_list<-list(1:4 ,"abc",TRUE)*

*my_list*

Output :

[[1]] # first object in the list

[1] 1 2 3 4 # elements of the first object

[[2]] # second object in the list

[1] "abc" # elements of the second object

[[3]] # third object in the list

[1] TRUE # elements of the third object

We can create a list by combining different objects:

*a<-c(1,5,4,7,8)*

*b<-c("Alec", "Dan", "Rob", "Rich")*

*c <- c(TRUE, TRUE, FALSE, FALSE)*

*list1<-list(a,b,c)*

*list1*

We create a list containing an integer vector, a matrix and a character string as:

*x <- 1:10*

*y <- matrix(1:12, nrow=3)*

*z <- "Hello"*

*mylist <- list(x,y,z)*

*mylist*
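The `[[1]]`, `[[2]]` labels in the printed output are also how individual components are accessed. The sketch below rebuilds `mylist` and reads its components by position.

```r
x <- 1:10
y <- matrix(1:12, nrow = 3)
z <- "Hello"
mylist <- list(x, y, z)

length(mylist)        # 3 components
mylist[[1]]           # the integer vector 1:10
mylist[[2]][2, 3]     # row 2, column 3 of the matrix: 8
mylist[[3]]           # "Hello"
class(mylist[1])      # "list": single brackets keep the list wrapper
```

Double brackets return the component itself, while single brackets return a shorter list that still contains it.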

We create a list that contains a character vector, a matrix and a list as:

*list_data <- list(c("January","February","March"), matrix(c(4,8,6,9,-5,3), nrow = 2),*

*list("yellow",15.4))*

We can give names to the list objects by using **names()** as:

*names(list_data) <- c("1st Quarter", "A_Matrix", "A Inner list")*

*list_data*

Output: