Application Center - Maplesoft

App Preview:

Principal Component Analysis

You can switch back to the summary page by clicking here.

Learn about Maple
Download Application




Principal Component Analysis

Samir Khan

Introduction

Principal Component Analysis transforms a multi-dimensional data set to a new set of perpendicular axes (or components) that describe decreasing amounts of variance in the data.  

 

This worksheet performs a principal component analysis on multi-dimensional data. Those components that have the least impact on the variance can be discarded, and the simplified data reconstructed from the remaining components.

Load Packages

restartwith(LinearAlgebra)with(Statistics)with(plottools)with(plots)

Specify Data

data := Matrix(10, 3, {(1, 1) = 2.5, (1, 2) = 2.4, (1, 3) = 1.05, (2, 1) = .5, (2, 2) = .7, (2, 3) = .785, (3, 1) = 2.2, (3, 2) = 2.9, (3, 3) = 1.286, (4, 1) = 1.9, (4, 2) = 2.2, (4, 3) = 2.35, (5, 1) = 3.1, (5, 2) = 3.0, (5, 3) = 2.202, (6, 1) = 2.3, (6, 2) = 2.7, (6, 3) = 1.351, (7, 1) = 2, (7, 2) = 1.6, (7, 3) = 2.021, (8, 1) = 1, (8, 2) = 1.1, (8, 3) = 1.247, (9, 1) = 1.5, (9, 2) = 1.6, (9, 3) = 2.503, (10, 1) = 1.1, (10, 2) = .9, (10, 3) = 1.214})

Number of components to keep

numcomp := 2:

Centre the Data

cdim := ColumnDimension(data):rdim := RowDimension(data):

data_centered := Matrix(rdim, cdim):for i to cdim do mean[i] := Mean(Column(data, i)); for j to rdim do data_centered[j, i] := data[j, i]-mean[i] end do end do:

Calculate the Covariance, Eigvenvalues and Eigenvectors

Covariance Matrix

cov := CovarianceMatrix(data):

Eigenvectors and eigenvalues

evals := map(Re, Eigenvectors(cov)[1]):evecs := map(Re, Eigenvectors(cov)[2]):

Sorting the eigenvectors in order of decreasing eigenvalues (i.e. the most significant eigenvvectors are first)

SortMatByRow := proc (Mat, r) options operator, arrow; Matrix(sort([seq(Column(Mat, i), i = 1 .. LinearAlgebra[ColumnDimension](Mat))], proc (x, y) options operator, arrow; evalb(y[r] < x[r]) end proc)) end proc:

eigenVecsSorted := SubMatrix(SortMatByRow(convert(linalg[stackmatrix](evecs, evals), Matrix), RowDimension(evecs)+1), 1 .. RowDimension(evecs), 1 .. ColumnDimension(evecs)):

Calculate the Princpal Components

The feature contains the retained components

fv := SubMatrix(eigenVecsSorted, 1 .. cdim, 1 .. numcomp):

Hence the data in terms of the new coordinate system.

prinCom := Typesetting:-delayDotProduct(Transpose(fv), Transpose(data_centered)):

Reconstruct Data from Principal Components

recData := Matrix(rdim, cdim):temp1 := Transpose(Typesetting:-delayDotProduct(fv, prinCom)):

for i to rdim do for j to cdim do recData[i, j] := temp1[i, j]+mean[j] end do end do;

Plot Reconstruct Data

recdata := convert(recData, listlist):datalist := convert(data, listlist):

p1 := seq(sphere(datalist[i], 0.5e-1, color = gray), i = 1 .. rdim):p2 := seq(sphere(recdata[i], 0.5e-1, color = red), i = 1 .. rdim):

display(p1, p2, axes = boxed, scaling = constrained, style = patchnogrid, title = "Original and Reduced Complexity Data", labels = ["X", "Y", "Z"], labelfont = [Calibri], axesfont = [Calibri], titlefont = [Calibri, 18, Bold])