Working with Sparse Matrices in R Programming
A sparse matrix is a matrix in which most elements are zero and only a small number are non-zero. Storing such data in a fully dense matrix wastes memory and slows down computation, because every zero is stored and processed explicitly. Sparse matrix data structures store only the non-zero elements and their positions, which reduces memory use and speeds up many operations.
Creating a Sparse Matrix
R ships with the recommended package Matrix, which provides classes and methods for creating and working with sparse matrices.
library(Matrix)
The following code snippet illustrates the usage of the Matrix package:
R
# loading the Matrix package
library(Matrix)

# declaring a sparse matrix of 1000 rows and 1000 cols
mat1 <- Matrix(0, nrow = 1000, ncol = 1000, sparse = TRUE)

# setting the value at 1st row
# and 1st col to be 5
mat1[1, 1] <- 5

print("Size of sparse mat1")
print(object.size(mat1))
Output:
[1] "Size of sparse mat1"
5440 bytes
The sparse matrix occupies far less space because it stores only the non-zero values and their positions. A dense 1000 x 1000 numeric matrix, by comparison, needs about 8 MB (1,000,000 values at 8 bytes each).
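To make the saving concrete, the sketch below builds the same 1000 x 1000 matrix in dense and in sparse form and compares their sizes. The names dense_m and sparse_m are illustrative, and the exact byte counts can differ between R and Matrix versions, so no fixed output is shown.

```r
library(Matrix)

# Dense version: all 1,000,000 doubles are stored, zeros included
dense_m <- matrix(0, nrow = 1000, ncol = 1000)
dense_m[1, 1] <- 5

# Sparse version: zeros are not stored one by one
sparse_m <- Matrix(0, nrow = 1000, ncol = 1000, sparse = TRUE)
sparse_m[1, 1] <- 5

dense_bytes  <- as.numeric(object.size(dense_m))
sparse_bytes <- as.numeric(object.size(sparse_m))

print(dense_bytes)                 # size of the dense matrix in bytes
print(sparse_bytes)                # size of the sparse matrix in bytes
print(dense_bytes / sparse_bytes)  # how many times smaller the sparse form is
```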
Constructing Sparse Matrices From Dense
A dense matrix can be created with the built-in matrix() function. It is then passed to the as() function, which is provided by the methods package that R loads by default. The function is called as follows:
Syntax: as(dense_matrix, Class)
Parameters:
dense_matrix : A numeric or logical matrix.
Class : The name of the target class. Passing "sparseMatrix" converts a numeric matrix to a dgCMatrix, which uses the compressed sparse column (CSC) format. Another available class is dgRMatrix, which stores the matrix in compressed sparse row (CSR) format.
The following code snippet indicates the conversion of the dense matrix to sparse:
R
library(Matrix)

# construct a matrix with values
# 0 with probability 0.80
# 6 with probability 0.10
# 7 with probability 0.10
set.seed(0)
rows <- 4L
cols <- 6L
vals <- sample(
  x = c(0, 6, 7),
  prob = c(0.8, 0.1, 0.1),
  size = rows * cols,
  replace = TRUE
)
dense_mat <- matrix(vals, nrow = rows)
print("Dense Matrix")
print(dense_mat)

# Convert to sparse
sparse_mat <- as(dense_mat, "sparseMatrix")
print("Sparse Matrix")
print(sparse_mat)
Output:
[1] "Dense Matrix"
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    7    6    0    0    0    0
[2,]    0    0    0    0    0    6
[3,]    0    7    0    0    6    0
[4,]    0    6    0    0    0    0
[1] "Sparse Matrix"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 7 6 . . . .
[2,] . . . . . 6
[3,] . 7 . . 6 .
[4,] . 6 . . . .
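To see what the compressed sparse column format actually stores, the slots of a dgCMatrix can be inspected directly. The sketch below uses a small hand-written matrix (the names m and sp are illustrative) rather than random data, so the stored values are easy to follow.

```r
library(Matrix)

# Small fixed example: 4 rows, 2 columns, 4 non-zero values
m <- matrix(c(7, 0, 0, 0,
              6, 0, 7, 6), nrow = 4)
sp <- as(m, "sparseMatrix")

print(class(sp))
print(sp@x)  # stored non-zero values, column by column: 7 6 7 6
print(sp@i)  # zero-based row index of each stored value: 0 0 2 3
print(sp@p)  # zero-based offsets where each column starts: 0 1 4
```

Only the four non-zero values are kept in x. The i slot records their rows, and the p slot records where each column begins, so the four zeros cost no storage of their own.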
Operations on Sparse Matrices
Various arithmetic and binding operations can be performed on sparse matrices:
Addition and subtraction by Scalar Value
The scalar value is added to or subtracted from every element of the sparse matrix. The result is a dense matrix, because the zero elements also change and become non-zero. The following code illustrates the + and - operators:
R
# Loading Library
library(Matrix)

# construct a matrix with values
# 0 with probability 0.85
# 10 with probability 0.15
set.seed(0)
rows <- 4L
cols <- 6L
vals <- sample(
  x = c(0, 10),
  prob = c(0.85, 0.15),
  size = rows * cols,
  replace = TRUE
)
dense_mat <- matrix(vals, nrow = rows)

# Convert to sparse
sparse_mat <- as(dense_mat, "sparseMatrix")
print("Sparse Matrix")
print(sparse_mat)

print("Addition")
# adding a scalar value 5
# to the sparse matrix
print(sparse_mat + 5)

print("Subtraction")
# subtracting a scalar value 1
# from the sparse matrix
print(sparse_mat - 1)
Output:
[1] "Sparse Matrix"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 10 10 . .  .  .
[2,]  .  . . .  . 10
[3,]  . 10 . . 10  .
[4,]  . 10 . .  .  .
[1] "Addition"
4 x 6 Matrix of class "dgeMatrix"
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   15   15    5    5    5    5
[2,]    5    5    5    5    5   15
[3,]    5   15    5    5   15    5
[4,]    5   15    5    5    5    5
[1] "Subtraction"
4 x 6 Matrix of class "dgeMatrix"
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    9    9   -1   -1   -1   -1
[2,]   -1   -1   -1   -1   -1    9
[3,]   -1    9   -1   -1    9   -1
[4,]   -1    9   -1   -1   -1   -1
Multiplication or Division by Scalar
These operations change only the non-zero elements of the matrix, since zero multiplied or divided by a non-zero scalar stays zero. The result is therefore still a sparse matrix:
R
library(Matrix)

# construct a matrix with values
# 0 with probability 0.85
# 10 with probability 0.15
set.seed(0)
rows <- 4L
cols <- 6L
vals <- sample(
  x = c(0, 10),
  prob = c(0.85, 0.15),
  size = rows * cols,
  replace = TRUE
)
dense_mat <- matrix(vals, nrow = rows)

# Convert to sparse
sparse_mat <- as(dense_mat, "sparseMatrix")
print("Sparse Matrix")
print(sparse_mat)

print("Multiplication")
# multiplying the sparse matrix
# by a scalar value 10
print(sparse_mat * 10)

print("Division")
# dividing the sparse matrix
# by a scalar value 10
print(sparse_mat / 10)
Output:
[1] "Sparse Matrix"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 10 10 . .  .  .
[2,]  .  . . .  . 10
[3,]  . 10 . . 10  .
[4,]  . 10 . .  .  .
[1] "Multiplication"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 100 100 . .   .   .
[2,]   .   . . .   . 100
[3,]   . 100 . . 100   .
[4,]   . 100 . .   .   .
[1] "Division"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 1 1 . . . .
[2,] . . . . . 1
[3,] . 1 . . 1 .
[4,] . 1 . . . .
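Whether a scalar operation keeps the matrix sparse can also be checked programmatically. The sketch below rebuilds the example matrix from fixed values instead of random sampling, and is_sparse is a hypothetical helper name introduced only for this illustration.

```r
library(Matrix)

# Same pattern as the example matrix, written out explicitly (column by column)
dense_mat <- matrix(c(10, 0,  0,  0,
                      10, 0, 10, 10,
                       0, 0,  0,  0,
                       0, 0,  0,  0,
                       0, 0, 10,  0,
                       0, 10, 0,  0), nrow = 4)
sparse_mat <- as(dense_mat, "sparseMatrix")

# Hypothetical helper: TRUE when the result is still a sparse Matrix object
is_sparse <- function(x) is(x, "sparseMatrix")

print(is_sparse(sparse_mat + 5))   # addition turns zeros into non-zeros
print(is_sparse(sparse_mat - 1))   # so does subtraction
print(is_sparse(sparse_mat * 10))  # multiplication leaves zeros untouched
print(is_sparse(sparse_mat / 10))  # so does division by a non-zero scalar
```

A check like this is useful before running scalar arithmetic on a large sparse matrix, since a dense result may not fit in memory.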
Matrix Multiplication
Matrices can be multiplied with each other using the %*% operator, whether they are sparse or dense. However, the number of columns of the first matrix must equal the number of rows of the second.
R
library(Matrix)

# construct a matrix with values
# 0 with probability 0.85
# 10 with probability 0.15
set.seed(0)
rows <- 4L
cols <- 6L
vals <- sample(
  x = c(0, 10),
  prob = c(0.85, 0.15),
  size = rows * cols,
  replace = TRUE
)
dense_mat <- matrix(vals, nrow = rows)

# Convert to sparse
sparse_mat <- as(dense_mat, "sparseMatrix")
print("Sparse Matrix")
print(sparse_mat)

# computing transpose of matrix
transpose_mat <- t(sparse_mat)

# computing multiplication of matrix
# and its transpose
mul_mat <- sparse_mat %*% transpose_mat
print("Multiplication of Matrices")
print(mul_mat)
Output:
[1] "Sparse Matrix"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 10 10 . .  .  .
[2,]  .  . . .  . 10
[3,]  . 10 . . 10  .
[4,]  . 10 . .  .  .
[1] "Multiplication of Matrices"
4 x 4 sparse Matrix of class "dgCMatrix"

[1,] 200   . 100 100
[2,]   . 100   .   .
[3,] 100   . 200 100
[4,] 100   . 100 100
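The dimension rule also applies when a sparse matrix is multiplied by an ordinary dense matrix. The following sketch assumes a fixed copy of the example matrix and a hypothetical 6 x 2 dense matrix called dense_rhs, and checks the inner dimensions before multiplying.

```r
library(Matrix)

# Example matrix written out explicitly: 4 rows, 6 columns
sparse_mat <- as(matrix(c(10, 0,  0,  0,
                          10, 0, 10, 10,
                           0, 0,  0,  0,
                           0, 0,  0,  0,
                           0, 0, 10,  0,
                           0, 10, 0,  0), nrow = 4), "sparseMatrix")

# Hypothetical dense right-hand side: 6 rows, 2 columns, all ones
dense_rhs <- matrix(1, nrow = 6, ncol = 2)

# Inner dimensions must agree: 6 columns on the left, 6 rows on the right
stopifnot(ncol(sparse_mat) == nrow(dense_rhs))

prod_mat <- sparse_mat %*% dense_rhs
print(dim(prod_mat))  # 4 rows from the left operand, 2 columns from the right
print(prod_mat)       # each column holds the row sums 20, 10, 20, 10
```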
Multiplication by a Vector
Matrices can be multiplied element-wise by one-dimensional vectors using the * operator. The vector is recycled down the columns of the matrix. In the example below the vector has 2 elements and the matrix has 4 rows, so rows 1 and 3 are multiplied by the first element of the vector and rows 2 and 4 by the second. This is different from the matrix product %*%.
R
library(Matrix)

# construct a matrix with values
# 0 with probability 0.85
# 10 with probability 0.15
set.seed(0)
rows <- 4L
cols <- 6L
vals <- sample(
  x = c(0, 10),
  prob = c(0.85, 0.15),
  size = rows * cols,
  replace = TRUE
)
dense_mat <- matrix(vals, nrow = rows)

# Convert to sparse
sparse_mat <- as(dense_mat, "sparseMatrix")
print("Sparse Matrix")
print(sparse_mat)

# declaring a vector
vec <- c(3, 2)
print("Multiplication by vector")
print(sparse_mat * vec)
Output:
[1] "Sparse Matrix"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 10 10 . .  .  .
[2,]  .  . . .  . 10
[3,]  . 10 . . 10  .
[4,]  . 10 . .  .  .
[1] "Multiplication by vector"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 30 30 . .  .  .
[2,]  .  . . .  . 20
[3,]  . 30 . . 30  .
[4,]  . 20 . .  .  .
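Because * recycles the vector element by element, it is easy to confuse with the matrix-vector product. The sketch below contrasts the two on a fixed copy of the example matrix. The names scaled, ones and mat_vec are illustrative.

```r
library(Matrix)

sparse_mat <- as(matrix(c(10, 0,  0,  0,
                          10, 0, 10, 10,
                           0, 0,  0,  0,
                           0, 0,  0,  0,
                           0, 0, 10,  0,
                           0, 10, 0,  0), nrow = 4), "sparseMatrix")

# Element-wise: c(3, 2) is recycled down each column,
# so rows 1 and 3 are scaled by 3 and rows 2 and 4 by 2
scaled <- sparse_mat * c(3, 2)
print(scaled)

# Matrix-vector product: the vector needs one element per column (6 here)
ones <- rep(1, 6)
mat_vec <- sparse_mat %*% ones
print(mat_vec)  # a 4 x 1 result holding the row sums 20, 10, 20, 10
```

The element-wise result keeps the 4 x 6 shape and stays sparse, while the matrix-vector product collapses each row to a single number.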
Combination of Matrices
Matrices can be combined with vectors or other matrices using the column bind cbind() or row bind rbind() operations. With rbind(), the number of rows of the result is the sum of the rows of the input matrices. With cbind(), the number of columns of the result is the sum of the columns of the input matrices.
R
library(Matrix)

# construct a matrix with values
# 0 with probability 0.85
# 10 with probability 0.15
set.seed(0)
rows <- 4L
cols <- 6L
vals <- sample(
  x = c(0, 10),
  prob = c(0.85, 0.15),
  size = rows * cols,
  replace = TRUE
)
dense_mat <- matrix(vals, nrow = rows)

# Convert to sparse
sparse_mat <- as(dense_mat, "sparseMatrix")
print("Sparse Matrix")
print(sparse_mat)

# combining matrix through rows
row_bind <- rbind(sparse_mat, sparse_mat)

# printing matrix after row bind
print("Row Bind")
print(row_bind)
Output:
[1] "Sparse Matrix"
4 x 6 sparse Matrix of class "dgCMatrix"

[1,] 10 10 . .  .  .
[2,]  .  . . .  . 10
[3,]  . 10 . . 10  .
[4,]  . 10 . .  .  .
[1] "Row Bind"
8 x 6 sparse Matrix of class "dgCMatrix"

[1,] 10 10 . .  .  .
[2,]  .  . . .  . 10
[3,]  . 10 . . 10  .
[4,]  . 10 . .  .  .
[5,] 10 10 . .  .  .
[6,]  .  . . .  . 10
[7,]  . 10 . . 10  .
[8,]  . 10 . .  .  .
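The column-wise counterpart works the same way. The sketch below uses cbind() on a fixed copy of the example matrix, once with a second matrix and once with a plain vector. The names col_bind and with_vec are illustrative.

```r
library(Matrix)

sparse_mat <- as(matrix(c(10, 0,  0,  0,
                          10, 0, 10, 10,
                           0, 0,  0,  0,
                           0, 0,  0,  0,
                           0, 0, 10,  0,
                           0, 10, 0,  0), nrow = 4), "sparseMatrix")

# Binding two 4 x 6 matrices side by side gives 4 rows and 6 + 6 columns
col_bind <- cbind(sparse_mat, sparse_mat)
print(dim(col_bind))

# Binding a vector adds one column; it needs one element per row
with_vec <- cbind(sparse_mat, c(1, 2, 3, 4))
print(dim(with_vec))
```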
Properties of Sparse Matrices
- NA Values
NA values are not treated as zeros. They are stored explicitly, like non-zero values, and therefore take up space in the sparse representation. Arithmetic on them follows the usual R rules, so the result remains NA.
R
library(Matrix)

# declaring original matrix
mat <- matrix(data = c(5.5, 0, NA, 0, 0, NA), nrow = 3)
print("Original Matrix")
print(mat)

sparse_mat <- as(mat, "sparseMatrix")
print("Sparse Matrix")
print(sparse_mat)
Output:
[1] "Original Matrix"
     [,1] [,2]
[1,]  5.5    0
[2,]  0.0    0
[3,]   NA   NA
[1] "Sparse Matrix"
3 x 2 sparse Matrix of class "dgCMatrix"

[1,] 5.5  .
[2,]   .  .
[3,]  NA NA
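That NA values occupy storage can be confirmed by looking at the x slot, which holds the explicitly stored entries of a dgCMatrix. This is a minimal sketch on the same matrix. The name doubled is illustrative.

```r
library(Matrix)

mat <- matrix(data = c(5.5, 0, NA, 0, 0, NA), nrow = 3)
sparse_mat <- as(mat, "sparseMatrix")

# Three entries are stored: 5.5 and the two NA values; the three zeros are not
print(sparse_mat@x)
print(length(sparse_mat@x))
print(sum(is.na(sparse_mat@x)))

# NA propagates through arithmetic as in base R
doubled <- sparse_mat * 2
print(doubled@x)
```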
- Sparse matrix data can be written to an ordinary file in the MatrixMarket format (.mtx). The writeMM() function writes the data of a sparse matrix to a file.
writeMM(obj = sparse_mat, file = "fname.mtx")
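A quick round trip shows that the file preserves the matrix. The sketch below writes a small sparse matrix to a temporary file and reads it back with readMM(), which is also part of the Matrix package. The matrix values and the names path and restored are illustrative.

```r
library(Matrix)

sparse_mat <- as(matrix(c(5, 0, 0,
                          0, 0, 3), nrow = 3), "sparseMatrix")

# Write to a temporary .mtx file so nothing is left in the working directory
path <- tempfile(fileext = ".mtx")
writeMM(obj = sparse_mat, file = path)

# Read the file back; the result is a sparse matrix in triplet form
restored <- readMM(path)
print(class(restored))
print(all(as.matrix(restored) == as.matrix(sparse_mat)))

unlink(path)  # remove the temporary file
```

The restored object may use a different sparse class than the original, so comparing values after as.matrix() is a simple way to confirm that the contents match.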