Each question has some number of stars assigned to it based on how conceptually involved it is. You need to do questions that add up to 6 stars to get full marks. Each problem is graded on a point scale.
Submit your homeworks on Gradescope by the deadline (all enrolled students should be automatically added to Gradescope; make a private Piazza post if you're experiencing difficulties!)
You may work in groups of up to 4 people. You should
individually turn in your own unique-as-possible write-up (direct copying of any written/typed/TeXed equations is not allowed), and
list your collaborators' SUNet IDs in the spaces provided below. (A gentle reminder that failure to do so would be a violation of the honor code.)
You will have 5 late days in total to use on the written homeworks this quarter, and are allowed to use multiple late days per homework. If you’ve used up your late days, you may turn in your homework late with a flat penalty.
You may submit either typed-up solutions or legible scans of handwritten solutions. The LaTeX source is provided separately so you can use it if you choose to.
Each problem has a TA assigned to it, and the assigned TA is labelled next to each problem. You may want to read the suggested readings before attempting the question.
For each homework-related question, attend the office hours of the assigned TA. TAs will not handle homework-specific questions on problems other than their own. Additionally, if you post questions on Piazza about the assignments, please take care to not post parts of your solutions on public posts!
Please specify which solutions you'd like to count towards your homework grade below. You may submit solutions to problems that add up to more than 6 stars, but if you don't specify which questions you'd like to be graded for points, then we'll use the lowest-difficulty questions. Please also tag all questions on Gradescope that you'd like to receive feedback on!
In the box below, place an X under the questions you would like to count towards your grade. The total number of stars must add up to 6.
Collaborators:
(SUNet IDs)
Problem 3.1. (*) Properties of Singular Value Decomposition
Note: Part 1 is worth 0.5 point; Parts 2-3 are worth 0.25 each.
For this problem, consider the following definitions:
The singular value decomposition (SVD) for any matrix $A \in \mathbb{R}^{m \times n}$ is $A = U \Sigma V^T$, where $U$ and $V$ are orthonormal, and $\Sigma$ is diagonal. The diagonal values of $\Sigma$ are $A$'s singular values $\sigma_i$, and are ordered such that $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n \geq 0$.
An orthonormal matrix $Q$ is one where its columns $q_i$ satisfy $q_i^T q_j = 1$ if $i = j$ and $q_i^T q_j = 0$ if $i \neq j$.
The 2-norm of a vector $x \in \mathbb{R}^n$ is $\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2} = \sqrt{x^T x}$.
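Not for grading: the following minimal numpy sketch checks these definitions on an arbitrary example matrix (the values are placeholders, unrelated to any graded problem):

```python
# Sanity-checking the SVD definitions above on an arbitrary example matrix.
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])                        # any m x n matrix works

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # reduced SVD

print(np.allclose(A, U @ np.diag(s) @ Vt))        # A = U Sigma V^T
print(np.all(np.diff(s) <= 0))                    # singular values descending
print(np.allclose(U.T @ U, np.eye(U.shape[1])))   # columns of U orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0])))  # columns of V orthonormal
```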
Using just the definition of orthonormality given above, show that for any orthonormal matrix $Q$ and any vector $x$:
$$Q^T Q = I \quad \text{and} \quad \|Qx\|_2 = \|x\|_2.$$
Note: The latter relationship states that the linear transformation by an orthonormal matrix is an isometry, meaning it preserves the length of the vector.
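Not for grading: a quick numerical illustration of both claims, manufacturing an orthonormal $Q$ via the QR factorization of an arbitrary random matrix:

```python
# An orthonormal Q should satisfy Q^T Q = I and ||Qx||_2 = ||x||_2.
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # Q has orthonormal columns
x = rng.standard_normal(5)

print(np.allclose(Q.T @ Q, np.eye(5)))                       # Q^T Q = I
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # isometry
```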
Using the definition of the SVD and the properties of orthonormal matrices from Part 1, show either:
$$A v_i = \sigma_i u_i \quad \text{or} \quad A^T u_i = \sigma_i v_i,$$
where $u_i, v_i$ are the $i$-th columns of $U, V$ respectively. You only need to show one of the above, since a similar process shows the other.
What do the relationships mean conceptually?
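Not for grading: you can watch these relationships hold column by column with a short numpy sketch (the example matrix is arbitrary):

```python
# Checking A v_i = sigma_i u_i and A^T u_i = sigma_i v_i column by column.
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

for i in range(len(s)):
    print(np.allclose(A @ V[:, i], s[i] * U[:, i]))    # A v_i = sigma_i u_i
    print(np.allclose(A.T @ U[:, i], s[i] * V[:, i]))  # A^T u_i = sigma_i v_i
```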
For a square invertible matrix $A$, show that $\|Ax\|_2$ is bounded by:
$$\sigma_n \|x\|_2 \leq \|Ax\|_2 \leq \sigma_1 \|x\|_2.$$
What does this mean conceptually?
Hint: The summation form of the 2-norm definition as well as the definition of orthonormality should help. You only need to show one of the bounds (e.g. the upper bound), then write a sentence that argues how a similar process yields the other bound.
Your Solution.
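Not for grading: before writing up Part 3, you can sanity-check the bounds numerically; the random square matrix below is an arbitrary stand-in (and is invertible with probability 1):

```python
# Sampling random vectors to observe sigma_n ||x|| <= ||Ax|| <= sigma_1 ||x||.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))              # arbitrary; invertible w.p. 1
s = np.linalg.svd(A, compute_uv=False)       # singular values, descending

for _ in range(1000):
    x = rng.standard_normal(4)
    r = np.linalg.norm(A @ x) / np.linalg.norm(x)
    assert s[-1] * (1 - 1e-9) <= r <= s[0] * (1 + 1e-9)
print("all ratios lie in [sigma_min, sigma_max]")
```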
Problem 3.2. (**) Pseudoinverse from Singular Value Decomposition
The singular value decomposition (SVD) provides a robust framework for solving linear least squares problems where $A$ has any shape or rank.
When $A$ is square and nonsingular, the unique solution to $Ax = b$ is $x = A^{-1} b$.
But, when $A$ is nonsquare or singular, the linear system no longer has a guaranteed unique solution. In this case, it is common practice to choose the solution $x = A^{+} b$, where
$$A^{+} = V \Sigma^{+} U^T$$
and $\Sigma^{+}$ is obtained by transposing $\Sigma$ and inverting its nonzero diagonal entries. The matrix $A^{+}$ is defined as the Moore-Penrose inverse (colloquially referred to as the pseudoinverse) of a matrix.
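Not for grading: a minimal numpy sketch of this construction, comparing $V \Sigma^{+} U^T$ against numpy's built-in np.linalg.pinv on an arbitrary full-column-rank example (where $\Sigma^{+}$ reduces to $\Sigma^{-1}$):

```python
# Building A^+ = V Sigma^+ U^T from the reduced SVD; matrix values arbitrary.
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])                 # tall, full column rank
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T     # V Sigma^{-1} U^T (full-rank case)

print(np.allclose(A_pinv, np.linalg.pinv(A)))
b = np.array([1.0, 2.0, 3.0])
print(A_pinv @ b)                          # least-squares solution to Ax = b
```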
Show that when a tall matrix $A \in \mathbb{R}^{m \times n}$ ($m > n$) has full column rank $n$,
(a) $A^{+} A$ evaluates to the identity matrix.
(b) the $(n+1)$-th to $m$-th columns of $U$ are in the null space of $A^{+}$ (assume the singular values are in descending order).
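Not for grading: a quick numerical check of both claims on an arbitrary full-column-rank example (reading (b), as reconstructed above, as the null space of $A^{+}$):

```python
# Part 1 numerically: A^+ A = I, and trailing columns of the full U
# are annihilated by A^+.
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
U, s, Vt = np.linalg.svd(A)          # full SVD: U is 3 x 3
A_pinv = np.linalg.pinv(A)

print(np.allclose(A_pinv @ A, np.eye(2)))  # (a) A^+ A = I
print(np.allclose(A_pinv @ U[:, 2], 0))    # (b) last column of U in null(A^+)
```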
Given the matrix $A$ from the problem statement, compute its reduced SVD and pseudoinverse $A^{+}$ by hand. Solutions must be expressed in fractions/square roots. Credit will not be given for answers expressed in floating point!
Sanity check your answer using Part 1a.
Not for grading, but if you're interested: check for yourself that $A^{+} = A^{-1}$ when $A$ is invertible.
CA: Jane
Recommended Readings: Lecture 3, Heath
Your Solution.
Problem 3.3. (**) Reduced SVD of a Non-Square Matrix
We will compute the SVD for a matrix $A$ in this problem.
We can find the singular values and the matrix $V$ by solving the eigenproblem on the matrix $A^T A$.
To find the eigenvalues of a matrix $B$, one can solve the characteristic equation $\det(B - \lambda I) = 0$.
For a $2 \times 2$ matrix, it has the simple form of a quadratic equation: $\lambda^2 - \operatorname{tr}(B)\,\lambda + \det(B) = 0$.
For each eigenvalue $\lambda_i$, its corresponding eigenvector $v_i$ can be found by solving $(B - \lambda_i I) v_i = 0$. Note that if a unit vector $v$ is an eigenvector of $B$, then $-v$ is also a unit-length eigenvector with the same eigenvalue.
Please rationalize the denominator, and make sure the fractions are simplified.
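Not for grading: the sketch below mirrors these steps numerically so you can sanity-check your hand computation; the matrix used here is an arbitrary placeholder, not the $A$ from this problem:

```python
# Form A^T A, solve its 2x2 characteristic quadratic, normalize eigenvectors.
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])            # placeholder; use the problem's matrix
B = A.T @ A                           # symmetric 2 x 2

# characteristic equation: lambda^2 - tr(B) lambda + det(B) = 0
tr, det = np.trace(B), np.linalg.det(B)
lams = np.roots([1.0, -tr, det])
print(np.sort(lams)[::-1])            # eigenvalues, descending

w, V = np.linalg.eigh(B)              # numpy's answer for comparison
print(w[::-1])
```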
Find the eigenvalues and eigenvectors for $A^T A$ by following these steps:
(a) Write down and solve the characteristic equation for $A^T A$ to find the eigenvalues.
(b) Find the corresponding unit-length eigenvectors (i.e., the norm $\|v\|_2$, or the magnitude of $v$, is 1). You can do it by hand in this simple case. For the sake of this problem, let's choose the direction in which the first nonzero entry is positive.
Use Part 1 to find the SVD of $A$ in reduced form with the following steps:
(a) Write down the singular value matrix $\Sigma$ in reduced form ($\Sigma$ is a square matrix in reduced form). You should order the singular values such that $\sigma_1 \geq \sigma_2$ on the diagonal.
(b) Write down $V$ and $U$. Recall that columns in $V$ are eigenvectors of $A^T A$. Make sure that the columns in $V$ correspond to the correct eigenvalues in $\Sigma$. You can find the reduced form of $U$ via $U = A V \Sigma^{-1}$ (feel free to verify yourself that this is true by expanding the SVD of $A$).
You should verify that the columns in $U$ thus computed are indeed unit-length and orthogonal. (Also feel free to numerically verify that the columns of $U$ thus computed are indeed eigenvectors of $A A^T$.)
Your Solution.
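A sketch of that verification, again with an arbitrary placeholder in place of the problem's $A$:

```python
# Build U = A V Sigma^{-1} from eigenvectors of A^T A, then confirm its
# columns are orthonormal eigenvectors of A A^T.
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])            # placeholder matrix
w, V = np.linalg.eigh(A.T @ A)
order = np.argsort(w)[::-1]           # reorder eigenvalues descending
w, V = w[order], V[:, order]
Sigma = np.diag(np.sqrt(w))           # sigma_i = sqrt(lambda_i)
U = A @ V @ np.linalg.inv(Sigma)

print(np.allclose(U.T @ U, np.eye(2)))             # unit-length, orthogonal
print(np.allclose(A, U @ Sigma @ V.T))             # A = U Sigma V^T (reduced)
print(np.allclose((A @ A.T) @ U, U @ np.diag(w)))  # columns of U: eigvecs of A A^T
```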
Problem 3.4. (*) Symmetric Positive Definiteness
In class, we talked about symmetric positive definite (SPD) matrices as being a special class of matrices with nice properties. A matrix $A$ is symmetric if $A = A^T$ and is positive definite if $x^T A x > 0$ for all $x \neq 0$.
Prove that the singular values of an SPD matrix $A$ are also its eigenvalues.
Hint: This was briefly mentioned in class. Please write the proof out in full.
What conditions must a matrix $A$ satisfy for $A^T A$ to be SPD? Justify your answer.
Hint: Think about the shapes of the matrices and their ranks.
Prove that if $A$ is not SPD, then the Cholesky factorization $A = L L^T$ (as mentioned in class) does not exist.
Hint: It might be easier to prove the contrapositive.
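Not for grading: numpy's Cholesky routine illustrates Part 3 empirically; for a symmetric input it raises LinAlgError when the matrix is not positive definite. The example matrices are arbitrary:

```python
# Cholesky succeeds for an SPD matrix and fails for an indefinite one.
import numpy as np

spd = np.array([[4.0, 1.0],
                [1.0, 3.0]])      # symmetric, positive definite
not_spd = np.array([[1.0, 2.0],
                    [2.0, 1.0]])  # symmetric but indefinite (eigenvalues 3, -1)

print(np.linalg.cholesky(spd))    # lower-triangular L with A = L L^T
try:
    np.linalg.cholesky(not_spd)
except np.linalg.LinAlgError as e:
    print("Cholesky failed:", e)
```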
Your Solution.
Problem 3.5. (***) An Iterative Method for Solving Matrix Equations
Recently in lecture, we discussed the idea of using iterative methods to solve a matrix equation instead of using a method like LU decomposition to get a closed-form solution. As you might recall, we prefer iterative methods in situations where we have large, sparse matrices, since taking the inverse, the LU decomposition, or anything of the sort for a sparse matrix can produce a dense matrix. In those cases, we would need to allocate a lot of computer memory to store large, dense matrices, which not only is suboptimal, but might not even be feasible if we have limited space.
In this problem, we’re going to explore one such iterative method to solve the standard matrix equation, , involving matrix splitting:
Given that $A$ is an $n \times n$, non-singular matrix, we split:
$$A = M - N,$$
where $M$ is an invertible matrix (i.e. $M^{-1}$ exists).
As it turns out, we can use the iterative scheme:
$$x_{k+1} = M^{-1}(N x_k + b)$$
to solve for $x$, with $k$ being a count variable to keep track of our number of iterations. That is, we start with an initial guess $x_0$ for $x$; then, we compute $x_1, x_2,$ and so on with the formula above.
With the above scheme, you might be wondering: how can we be sure that it converges to the right solution? That's what you will figure out.
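(Not part of the proof, but to make the scheme concrete, here is a minimal sketch assuming the classic Jacobi splitting $M = \operatorname{diag}(A)$ as one illustrative choice of $M$; the matrix values are arbitrary.)

```python
# The scheme x_{k+1} = M^{-1}(N x_k + b) with the Jacobi splitting
# M = diag(A), N = M - A, so that A = M - N.
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])      # diagonally dominant, so this converges
b = np.array([1.0, 2.0, 3.0])

M = np.diag(np.diag(A))              # invertible: just the diagonal of A
N = M - A

x = np.zeros(3)                      # initial guess x_0
for k in range(1000):
    x, x_prev = np.linalg.solve(M, N @ x + b), x
    if np.linalg.norm(x - x_prev) < 1e-12:   # stop once estimates stop changing
        break

print(x, np.linalg.solve(A, b))      # compare against the direct solution
```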
First, for convenience, let’s define the following variables:
$x = A^{-1} b$, i.e. $x$ algebraically represents our theoretical, true solution, even though we don't actually know it.
$e_k = x_k - x$, i.e. $e_k$ algebraically represents the error between our $k$-th estimate of $x$ and our theoretical, true solution.
Keep in mind that we don't actually know the values of $x$ and $e_k$ – we're just using these algebraic statements for convenience.
1. Show that:
$$e_{k+1} = M^{-1} N e_k \qquad (1)$$
You may assume without proof the identity: $x = M^{-1}(N x + b)$.
2. Using Equation 1, prove that:
$$e_k = \left(M^{-1} N\right)^k e_0 \qquad (2)$$
Hint: Use induction.
3. Define the matrix $T = M^{-1} N$. Using Equation 2, prove: $e_k = T^k e_0$.
4. Assume without proof that the matrix $T$ has a full set of linearly independent eigenvectors. Show that $x_k$ converges to the theoretical, true solution, $x$, when:
$$\max_i |\lambda_i(T)| < 1,$$
i.e. when the magnitude of the largest eigenvalue of $T$ is less than 1. See the recommended reading if you're stuck.
Fun fact: it is actually possible to prove this convergence condition without using the above assumption for $T$, but to do so requires math that is outside the scope of what we've covered so far in class.
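Not for grading: you can check the convergence condition numerically. This sketch reuses the (assumed) Jacobi splitting from the earlier example and computes the largest eigenvalue magnitude of $T = M^{-1} N$:

```python
# Spectral radius of T = M^{-1} N; the iteration converges when it is < 1.
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
M = np.diag(np.diag(A))
N = M - A

T = np.linalg.solve(M, N)            # T = M^{-1} N without forming M^{-1}
rho = np.max(np.abs(np.linalg.eigvals(T)))
print(rho, rho < 1)                  # < 1 implies the scheme converges
```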
From this problem, we see that if we ever have a sparse matrix equation $Ax = b$ where we can decompose the non-singular matrix $A$ into $A = M - N$, where $M$ is invertible, then we can use the iterative scheme defined above to converge to a solution for $x$. Note that defining variables, such as $x$ and $e_k$, for our convenience is a common practice when analyzing iterative methods such as this one, especially since no matter what problem we have, we know that we'll always have a true solution and an error term.
In general, there are many iterative methods out there (e.g. the popular conjugate gradient method), and we should always be aware of their convergence conditions to make sure that they are applicable to our problem. Once we're fairly sure that the method will converge, it's just a simple matter of coding up a loop for the iterative scheme that terminates once our estimate of $x$ no longer differs from the previous estimate by a significant amount.