Application of the Modified Gram-Schmidt Algorithm in Maple, and how it can be used to solve a least squares curve fitting problem.
by Douglas Edan Lewit, currently enrolled in Real Analysis and Numerical Linear Algebra at Illinois Institute of Technology in Chicago, IL, Fall Semester of 2013.
I'm currently enrolled in a graduate level numerical linear algebra class at IIT in Chicago, IL. I often use Maple for various assignments and computing projects in this class. Maple has two excellent packages for manipulating matrices--the older linalg package and the newer LinearAlgebra package, the latter being preferable when working on large matrices containing floating point terms rather than symbolic terms. Maple's LinearAlgebra package contains a special command called QRDecomposition. The QRDecomposition command factors a matrix into the product of two matrices: the first, usually called the Q matrix, has orthonormal column vectors, and the second, usually called the R matrix, is upper triangular and non-singular (provided the columns of the input matrix are linearly independent). In fitting curves to data one often encounters overdetermined systems of linear equations. (An overdetermined system is one in which the number of linear equations is greater than the number of unknowns.) Solving such a system can be challenging. The QR decomposition method offers one convenient approach because it breaks down the coefficient matrix A into two matrices: the Q matrix, which has the convenient property that its inverse is the same as its transpose--assuming of course that the Q matrix is square--and the R matrix, which is easily inverted because of its upper triangular shape. Using the Q and R matrices, we can thus avoid taking the inverse of A^T A, which is required by the normal equations, another approach to solving this problem. A^T A may not be an easily inverted matrix, and even if A^T A is invertible, this matrix usually has a very large condition number, meaning that we may not necessarily be able to trust the accuracy of any numerical results found using this matrix. (The 2-norm condition number of a matrix is the ratio of its largest singular value to its smallest singular value--equivalently, the square root of the ratio of the largest to the smallest eigenvalue of A^T A.
If a non-singular matrix has a very large condition number, then the matrix is said to be ill-conditioned or almost singular.) The QR decomposition method provides a very powerful alternative to working directly with the matrix A^T A.
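The conditioning argument can be illustrated numerically. Here is a short Python/NumPy sketch (an illustration alongside the Maple worksheet, not part of it) showing that the condition number of A^T A is the square of the condition number of A, for a small Vandermonde matrix of the kind used later in this worksheet:

```python
import numpy as np

# A modest Vandermonde matrix: its columns are the powers 1, x, ..., x^5.
x = np.linspace(0.0, 1.0, 20)
A = np.vander(x, 6, increasing=True)          # 20 x 6, overdetermined

# For the 2-norm, cond(A^T A) = cond(A)^2, so forming the normal
# equations squares the conditioning of the original problem.
print(np.linalg.cond(A))                      # condition number of A
print(np.linalg.cond(A.T @ A))                # roughly the square of the above
```

This is why solving R x = Q^T b, which works with A directly, is generally safer than inverting A^T A.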
Maple's QRDecomposition command uses one of two routines for generating the Q and R matrices. If the matrix contains only integers and/or symbolic expressions, then Maple performs a QR decomposition using the Classical Gram-Schmidt algorithm. If, however, the matrix contains a mixture of integers and floating point decimals, or only floating point decimals, then Maple carries out the QR decomposition of the matrix using Householder transformations. My approach below uses a third alternative, the Modified Gram-Schmidt algorithm, which I read about in Chapter 8 of the textbook, NUMERICAL LINEAR ALGEBRA, by Lloyd N. Trefethen and David Bau III.
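For comparison, the Modified Gram-Schmidt iteration from Trefethen & Bau can also be sketched in Python with NumPy (a rough equivalent for illustration, not a line-for-line translation of the Maple procedure that follows):

```python
import numpy as np

def mgs(A):
    """Modified Gram-Schmidt QR factorization of an m x n matrix, m >= n.

    Returns Q (m x n, orthonormal columns) and R (n x n, upper triangular).
    Assumes the columns of A are linearly independent.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    if m < n:
        raise ValueError("need at least as many rows as columns")
    V = A.copy()                      # working copies of the columns
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        R[j, j] = np.linalg.norm(V[:, j])
        Q[:, j] = V[:, j] / R[j, j]
        # Immediately orthogonalize the REMAINING columns against q_j;
        # this is what distinguishes MGS from classical Gram-Schmidt.
        for k in range(j + 1, n):
            R[j, k] = Q[:, j] @ V[:, k]
            V[:, k] = V[:, k] - R[j, k] * Q[:, j]
    return Q, R
```

On a random 8 x 4 matrix, Q.T @ Q is the 4 x 4 identity to machine precision and Q @ R reproduces A, matching the map(round, ...) checks performed with the Maple version below.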
Computer problem 8.2 (using Maple).
restart;
mgs := proc(A::Matrix) # implement the Modified Gram-Schmidt algorithm in Maple, where A is a matrix.
    local m, n;           # these variables will store the dimensions of the matrix A.
    local i, j, k, total; # counter variables for the loops in the program.
    global V, Q;          # these variables will store the column vectors of A and the orthonormalized vectors of A, respectively.
    global R;             # an upper triangular matrix.
    total := time();      # initialize the variable that will calculate the program's execution time.
    m, n := LinearAlgebra:-Dimensions(A);
    if m < n then
        error "The number of rows must be greater than or equal to the number of columns to use this algorithm.";
    else
        for j to n do
            V[j] := convert(LinearAlgebra:-SubMatrix(A, 1..m, j..j), Vector[column]);
        end do;
        unassign('Q', 'R');
        R := Matrix(n, n, shape = triangular[upper]);
        R[1, 1] := LinearAlgebra:-VectorNorm(V[1], 2);
        Q[1] := V[1]/R[1, 1];
        for j from 2 to n do
            for i to j do
                if i <> j then
                    R[i, j] := LinearAlgebra:-DotProduct(Q[i], V[j]);
                    V[j] := V[j] - R[i, j]*Q[i];
                else
                    R[i, j] := LinearAlgebra:-VectorNorm(V[j], 2);
                    Q[j] := V[j]/R[i, j];
                end if;
            end do;
        end do;
        Q := [seq(entries(Q)[k][1], k = 1..n)]; # collect the orthonormal vectors...
        Q := convert(Q, Matrix);                # ...and assemble them into the Q matrix.
        total := time() - total;
        printf("The QR factorization of the matrix based on the MGS algorithm required %0.8f CPU seconds.", total);
    end if;
end proc;
showstat(mgs); # showstat displays the procedure in a more traditional format, with the program's statements numbered in order.
mgs := proc(A::Matrix)
local m, n, i, j, k, total;
global V, Q, R;
   1   total := time();
   2   m, n := LinearAlgebra:-Dimensions(A);
   3   if m < n then
   4     error "The number of rows must be greater than or equal to the number of columns to use this algorithm."
       else
   5     for j to n do
   6       V[j] := convert(LinearAlgebra:-SubMatrix(A,1 .. m,j .. j),Vector[column])
         end do;
   7     unassign('Q','R');
   8     R := Matrix(n,n,shape = triangular[upper]);
   9     R[1,1] := LinearAlgebra:-VectorNorm(V[1],2);
  10     Q[1] := V[1]/R[1,1];
  11     for j from 2 to n do
  12       for i to j do
  13         if i <> j then
  14           R[i,j] := LinearAlgebra:-DotProduct(Q[i],V[j]);
  15           V[j] := -Q[i]*R[i,j]+V[j]
             else
  16           R[i,j] := LinearAlgebra:-VectorNorm(V[j],2);
  17           Q[j] := V[j]/R[i,j]
             end if
           end do
         end do;
  18     Q := [seq(entries(Q)[k][1],k = 1 .. n)];
  19     Q := convert(Q,Matrix);
  20     total := time()-total;
  21     printf("The QR factorization of the matrix based on the MGS algorithm required %0.8f CPU seconds.",total)
       end if
end proc
Let's apply the program to a few examples to see how well it works!
f:=rand(1..10); # create a procedure that randomly generates an integer from 1 to 10, inclusive.
M:=Matrix(f,8,4);
LinearAlgebra:-Map(evalf,M);
mgs(M);
The QR factorization of the matrix based on the MGS algorithm required 0.05300000 CPU seconds.
Q;
R;
with(LinearAlgebra): # load Maple's library of Linear Algebra commands and routines.
map(round,Transpose(Q).Q);
map(round,Q.R);
Let's apply my program to solving a practical least squares problem. Consider the following problem taken from the public website, http://calculator.maconstate.edu/cubic_regression/: x = { -3, -2, -1, 0, 1, 2, 3 } and y = { 3, -8, -7, 0, 7, 8, -3 }. Use the MGS (modified Gram-Schmidt) algorithm to perform a QR factorization on the Vandermonde matrix for this problem, and then determine the unique least squares cubic polynomial that best fits these data.
X:=[seq(n,n=-3..3)];
Y:=[3,-8,-7,0,7,8,-3];
A:=VandermondeMatrix(X,7,4); # 7 rows because there are 7 points, and 4 columns because a cubic polynomial normally has 4 terms.
Map(evalf,A); # convert A's elements to floating point numbers.
A; # checking to make sure that my changes were saved.
mgs(A); # mgs is my own program for doing a QR factorization that implements the modified Gram-Schmidt algorithm.
The QR factorization of the matrix based on the MGS algorithm required 0.00500000 CPU seconds.
whattype(Y);
Y:=convert(Y,Vector[column]);
Y:=Transpose(Q).Y;
M:=<R|Y>; # augment matrix R with the product Transpose(Q).Y from the line above.
M:=ReducedRowEchelonForm(M);
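Reducing the augmented matrix to reduced row echelon form is equivalent to back substitution on the upper triangular system R x = Transpose(Q).Y. For illustration, here is a small Python/NumPy sketch of plain back substitution (the function name and sample data are my own, not from the worksheet):

```python
import numpy as np

def back_substitute(R, b):
    """Solve R x = b, where R is upper triangular and non-singular."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Subtract the already-known terms, then divide by the diagonal.
        x[i] = (b[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

R = np.array([[2., 1., 3.],
              [0., 1., 4.],
              [0., 0., 5.]])
b = R @ np.array([1., 2., 3.])     # so the true solution is (1, 2, 3)
print(back_substitute(R, b))       # [1. 2. 3.]
```

The ease of this step is exactly why the upper triangular shape of R is so convenient.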
p0:=M[1,5];
p1:=M[2,5];
p2:=M[3,5];
p3:=M[4,5];
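As an independent sanity check (in Python/NumPy rather than Maple, using a library least squares solver instead of MGS), the same fit can be recomputed. The coefficients come out as p0 = 0, p1 = 8, p2 = 0, p3 = -1, so the least squares cubic is y = 8x - x^3, which in fact passes exactly through all seven data points:

```python
import numpy as np

x = np.arange(-3, 4, dtype=float)
y = np.array([3, -8, -7, 0, 7, 8, -3], dtype=float)

# Least squares fit of p0 + p1*x + p2*x^2 + p3*x^3.
A = np.vander(x, 4, increasing=True)     # columns 1, x, x^2, x^3
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)                            # approximately [0, 8, 0, -1]
```

Because the residual is zero for this particular data set, the fitted curve in the plot below should pass through every plotted symbol.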
X:=[seq(n,n=-3..3)]; # redefine my X and Y data points.
Y:=[3,-8,-7,0,7,8,-3];
with(plots):
P:=x-> p0 + p1*x + p2*x^2 + p3*x^3; # define the least squares cubic polynomial for this problem.
Data:=[seq([X[n],Y[n]],n=1..7)];
display(listplot(Data,style=point,symbol=solidcircle,symbolsize=15,color=black,axesfont=[Arial,bold,9]),plot(P(x),x=-3..3,color=red,thickness=2,title="\nLeast Squares Cubic Polynomial\nfound through QR Decomposition\n",titlefont=[Arial,bold,12],caption="\nGraph by Douglas Lewit of Math-577",captionfont=[Arial,bold,10],labels=["x","y"],labelfont=[Times,11]));
What happens if we feed a matrix to the 'mgs' program where the matrix has fewer rows than columns?
A:=Matrix(f,4,5);
mgs(A);
Error, (in mgs) The number of rows must be greater than or equal to the number of columns to use this algorithm.
The modified Gram-Schmidt algorithm contains the assumption that the matrix has at least as many rows as columns. For example, the matrix above presents a sample of five vectors from R^4, but that doesn't make any sense as an independent set. Any basis of R^4 contains exactly four linearly independent vectors, and any subspace of R^4 has a basis composed of four or fewer linearly independent vectors. If we are looking at five vectors in R^4, then we are assured that at least one of those vectors is dependent, or in other words, is a linear combination of the others. The Gram-Schmidt orthogonalization algorithm, whether in its classical or modified form, can only be successfully applied to a linearly independent set of vectors.
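This dependence can be seen numerically. In the Python/NumPy sketch below (a self-contained illustration with data of my own choosing, whose first four columns are linearly independent), orthogonalizing five vectors in R^4 reduces the fifth column to the zero vector, so the normalization step Q[j] := V[j]/R[j,j] would divide by zero:

```python
import numpy as np

# Five vectors in R^4: the fifth MUST be a linear combination of the
# first four, since the first four columns here are linearly independent.
A = np.array([[1., 2., 0., 1., 3.],
              [0., 1., 1., 2., 1.],
              [2., 0., 1., 1., 4.],
              [1., 1., 2., 0., 2.]])

V = A.copy()
for j in range(4):                        # orthonormalize the first four columns
    q = V[:, j] / np.linalg.norm(V[:, j])
    V[:, j] = q
    for k in range(j + 1, 5):             # MGS update of the remaining columns
        V[:, k] -= (q @ V[:, k]) * q

# The fifth column has been reduced to the zero vector (up to rounding),
# so dividing by its norm would break down.
print(np.linalg.norm(V[:, 4]))            # essentially zero
```

This is exactly the situation that the m < n check in the mgs procedure guards against.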