
1. CS205L Homework 3

Due: Jan 27th at 11:59 PM PST
  • Each question has some number of stars assigned to it based on how conceptually involved it is. You need to do questions that add up to 6 stars to get full marks. Each problem is graded on a point scale.
  • Submit your homeworks on Gradescope by the deadline (all enrolled students should be automatically added to Gradescope, make a private Piazza post if you’re experiencing difficulties!)
  • You may work in groups of up to 4 people. You should
  • individually turn in your own unique-as-possible write-up (direct copying of any written/typed/TeXed equations is not allowed), and
  • list your collaborators’ SUNet IDs in the spaces provided below. (A gentle reminder that failure to do so would be a violation of the honor code.)
  • You will have 5 late days in total to use on the written homeworks this quarter, and are allowed to use multiple late days per homework. If you’ve used up your late days, you may turn in your homework late with a flat
    penalty.
  • You may submit either typed-up solutions or legible scans of handwritten solutions. The LaTeX source is provided separately so you can use it if you choose to.
  • Each problem has a TA assigned to it, and the assigned TA is labelled next to each problem. You may want to read the suggested readings before attempting the question.
  • For each homework-related question, attend the office hours of the assigned TA. TAs will not handle homework-specific questions on problems other than their own. Additionally, if you post questions on Piazza about the assignments, please take care to not post parts of your solutions on public posts!
  • Please specify which solutions you’d like to count towards your homework grade below. You may submit solutions to problems that add up to more than 6 stars, but if you don’t specify which questions you’d like to be graded for points, then we’ll use the lowest difficulty questions. Please tag all questions on Gradescope that you’d like to receive feedback on too.
In the box below, place an X under the questions you would like to count towards your grade. The total number of stars must add up to 6.
Collaborators:










(SUNet IDs)

2. Problem 3.1. (*) Properties of Singular Value Decomposition

Note: Part 1 is worth 0.5 points; Parts 2-3 are worth 0.25 each.
For this problem, consider the following definitions:
  • The singular value decomposition (SVD) for any $m \times n$ matrix $A$ is $A = U \Sigma V^T$, where $U$ and $V$ are orthonormal and $\Sigma$ is diagonal. The diagonal values $\sigma_i$ of $\Sigma$ are $A$'s singular values, and are ordered such that $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$.
  • An orthonormal matrix $Q$ is one where its columns $q_i$ satisfy
    $q_i^T q_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}$
  • The 2-norm $\|x\|_2$ of a vector $x$ is $\|x\|_2 = \sqrt{x^T x} = \sqrt{\sum_i x_i^2}$.
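Not part of the assignment, but a quick NumPy sanity check of these definitions can be helpful; the orthonormal matrix below is just the Q factor of a QR factorization of a random matrix (an illustrative choice, not something the problem prescribes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an orthonormal matrix Q as the Q factor of a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

# Orthonormality: Q^T Q should be the identity.
print(np.allclose(Q.T @ Q, np.eye(5)))                        # True

# Isometry: ||Q x||_2 should equal ||x||_2 for any vector x.
x = rng.standard_normal(5)
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))   # True
```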
  1. Using just the definition of orthonormality given above, show that for any orthonormal matrix $Q$ and any vector $x$:
  • $Q^T Q = I$, and
  • $\|Qx\|_2 = \|x\|_2$.
Note: The latter relationship states that the linear transformation by an orthonormal matrix $Q$ is an isometry, meaning it preserves the length of the vector $x$.
  2. Using the definitions of $A = U \Sigma V^T$ and the properties of orthonormal matrices from Part 1, show either:
    $A v_i = \sigma_i u_i$ or $A^T u_i = \sigma_i v_i$,
where $u_i, v_i$ are the $i$-th columns of $U, V$. You only need to show one of the above, since a similar process shows the other.
What do the relationships mean conceptually?
  3. For a square invertible matrix $A$, show that $\|Ax\|_2$ is bounded by:
    $\sigma_n \|x\|_2 \le \|Ax\|_2 \le \sigma_1 \|x\|_2.$
What does this mean conceptually?
Hint: The summation form of the 2-norm definition as well as the definition of orthonormality should help. You only need to show one of the bounds (e.g. the upper bound), then write a sentence that argues how a similar process gets the other bound. (A quick numerical illustration of these bounds appears below.)
Your Solution.
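Not something you need to submit, but here is a minimal NumPy check of the Part 3 bounds, assuming a randomly generated square matrix purely for illustration (with probability 1 it is invertible):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))    # stand-in square matrix, assumed invertible
x = rng.standard_normal(4)

sigma = np.linalg.svd(A, compute_uv=False)   # singular values, largest first
lower = sigma[-1] * np.linalg.norm(x)        # sigma_n * ||x||_2
value = np.linalg.norm(A @ x)                # ||A x||_2
upper = sigma[0] * np.linalg.norm(x)         # sigma_1 * ||x||_2
print(lower <= value <= upper)               # True
```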

3. Problem 3.2. (**) Pseudoinverse from Singular Value Decomposition

The singular value decomposition (SVD) $A = U \Sigma V^T$ provides a robust framework for solving linear least squares problems $\min_x \|Ax - b\|_2$ where $A$ has any shape or rank.
When $A$ is square and nonsingular, the unique solution to $Ax = b$ is
$x = A^{-1} b.$
But, when $A$ is nonsquare or singular, the linear system $Ax = b$ no longer has a guaranteed unique solution. In this case, it is common practice to choose the solution
$x = A^{+} b,$
where
$A^{+} = V \Sigma^{+} U^T$ ($\Sigma^{+}$ is obtained by transposing $\Sigma$ and inverting its nonzero diagonal entries).
The matrix $A^{+}$ is defined as the Moore-Penrose inverse (colloquially referred to as the pseudoinverse) of a matrix.
  1. Show that when a tall $m \times n$ matrix $A$ (with $m > n$) has full column rank $n$,
(a) $A^{+} A$ evaluates to the $n \times n$ identity matrix.
(b) the $(n+1)$-th to $m$-th columns of $U$ are in the null space of $A^T$ (assume the singular values are in descending order).
  2. Given the matrix $A$, compute $A^{+}$ and $A^{+} A$ by hand. The solution must be expressed in fractions/square roots. Credit will not be given for answers expressed in floating point!
Sanity check your answer using Part 1a.
Not for grading, but if you’re interested: check for yourself that $A^{+} = A^{-1}$ when $A$ is invertible.
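Not a substitute for the hand computation, but a short NumPy sketch of the same construction; the tall matrix here is an illustrative stand-in, not the $A$ from Part 2:

```python
import numpy as np

# Illustrative tall matrix with full column rank (not the A from Part 2).
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
m, n = A.shape

# Full SVD: U is m x m, Vt is n x n, s holds the n singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Sigma^+ : transpose the shape of Sigma and invert the nonzero singular values.
Sigma_plus = np.zeros((n, m))
Sigma_plus[:n, :n] = np.diag(1.0 / s)

A_plus = Vt.T @ Sigma_plus @ U.T

print(np.allclose(A_plus, np.linalg.pinv(A)))   # matches NumPy's pseudoinverse
print(np.allclose(A_plus @ A, np.eye(n)))       # Part 1a: A^+ A is the n x n identity
print(np.allclose(A.T @ U[:, n:], 0.0))         # Part 1b: trailing columns of U lie in null(A^T)
```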
CA: Jane
Recommended Readings: Lecture 3, Heath

4. Your Solution.

Your Solution.

5. Problem 3.3. (**) Reduced SVD of a Non-Square Matrix

We will compute the SVD for a non-square matrix $A$ in this problem.

We can find the singular values and the right singular vectors (the columns of $V$) by solving the eigenproblem on the $2 \times 2$ matrix $A^T A$.
To find the eigenvalues $\lambda$ of a matrix $M$, one can solve the characteristic equation
$\det(M - \lambda I) = 0.$
For a $2 \times 2$ matrix, it has the simple form of a quadratic equation
$\lambda^2 - \operatorname{tr}(M)\,\lambda + \det(M) = 0.$
For each eigenvalue $\lambda$, its corresponding eigenvector can be found by solving $(M - \lambda I) v = 0$. Note that if a unit vector $v$ is an eigenvector of $M$, then $-v$ is also a unit-length eigenvector with the same eigenvalue.
Please rationalize the denominators, and make sure the fractions are simplified.
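For concreteness, here is what these steps look like on a made-up $2 \times 2$ symmetric matrix (not the one you will obtain in this problem):
$M = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad \lambda^2 - \operatorname{tr}(M)\,\lambda + \det(M) = \lambda^2 - 4\lambda + 3 = 0 \;\Rightarrow\; \lambda_1 = 3,\ \lambda_2 = 1,$
with unit-length eigenvectors (first nonzero entry positive, denominators rationalized)
$v_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad v_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$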
  1. Find the eigenvalues and eigenvectors for $A^T A$ by following these steps:
(a) Write down and solve the characteristic equation for $A^T A$ to find the eigenvalues.
(b) Find the corresponding unit-length eigenvectors (i.e., the 2-norm $\|v\|_2$, or the magnitude of $v$, is 1). You can do it by hand in this simple case. For the sake of this problem, let’s choose the direction in which the first nonzero entry is positive.
  2. Use Part 1 to find the SVD of $A$ in reduced form with the following steps:
(a) Write down the singular value matrix $\Sigma$ in reduced form ($\Sigma$ is a square matrix in reduced form). You should order the singular values such that $\sigma_1 \ge \sigma_2$ on the diagonal.
(b) Write down $V$ and $U$. Recall that columns in $V$ are eigenvectors of $A^T A$. Make sure that the columns in $V$ correspond to the correct eigenvalues in $\Sigma$. You can find the reduced form of $U$ via $U = A V \Sigma^{-1}$ (feel free to verify yourself that this is true by expanding the SVD of $A$).
You should verify that the columns in $U$ thus computed are indeed unit-length and orthogonal. (Also feel free to numerically verify that the columns of $U$ thus computed are indeed eigenvectors of $A A^T$.)
Your Solution.

6. Problem 3.4. (*) Symmetric Positive Definiteness

In class, we talked about symmetric positive definite (SPD) matrices as being a special class of matrices with nice properties. A matrix $A$ is symmetric if $A = A^T$ and is positive definite if $x^T A x > 0$ for all $x \ne 0$.
  1. Prove that the singular values of an SPD matrix $A$ are also its eigenvalues.
Hint: This was briefly mentioned in class. Please do the proof out.
  2. What conditions must the $m \times n$ matrix $A$ satisfy for $A^T A$ to be SPD? Justify your answer.
Hint: Think about the shapes of the matrices and their ranks.
  3. Prove that if $A$ is not SPD, then the Cholesky factorization $A = L L^T$ (as mentioned in class) does not exist.
Hint: It might be easier to prove the contrapositive.
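None of this replaces the proofs, but the claims are easy to poke at numerically; the matrices below are illustrative stand-ins:

```python
import numpy as np

# An illustrative SPD matrix: symmetric, with all eigenvalues positive.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Part 1: for an SPD matrix, the singular values coincide with the eigenvalues.
print(np.sort(np.linalg.svd(A, compute_uv=False)))   # [1. 3.]
print(np.sort(np.linalg.eigvalsh(A)))                # [1. 3.]

# Part 3: Cholesky succeeds for SPD matrices and fails otherwise.
print(np.linalg.cholesky(A))                         # works: A = L L^T
B = np.array([[1.0, 2.0],
              [2.0, 1.0]])                           # symmetric but indefinite
try:
    np.linalg.cholesky(B)
except np.linalg.LinAlgError:
    print("Cholesky failed: B is not positive definite")
```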

7. Your Solution.

Your Solution.

8. Problem 3.5. (***) An Iterative Method for Solving Matrix Equations

Recently in lecture, we discussed the idea of using iterative methods to solve a matrix equation instead of using a method like LU decomposition to get a closed-form solution. As you might recall, we prefer iterative methods in situations where we have large, sparse matrices, since computing the inverse, an LU decomposition, or anything of the sort for a sparse matrix might lead to a dense matrix. In those cases, we would need to allocate a lot of computer memory to store large, dense matrices, which not only isn’t optimal, but might not even be feasible if we have limited space.
In this problem, we’re going to explore one such iterative method to solve the standard matrix equation, $Ax = b$, involving matrix splitting:
  • Given that $A$ is an $n \times n$, non-singular matrix, we split:
    $A = M - N,$
    where $M$ is an invertible matrix (i.e. $M^{-1}$ exists).
  • As it turns out, we can use the iterative scheme:
    $x_{k+1} = M^{-1} \left( N x_k + b \right)$
    to solve for $x$, with $k$ being a count variable to keep track of our number of iterations. That is, we start with an initial guess $x_0$ for $x$; then, we compute $x_1, x_2, \ldots$ and so on with the formula above.
With the above scheme, you might be wondering: how can we be sure that the above iterative scheme converges to the right solution? That’s what you will figure out.
First, for convenience, let’s define the following variables:

  • $x^{*} = A^{-1} b$, i.e. $x^{*}$ algebraically represents our theoretical, true solution, even though we don’t actually know it.

  • $e_k = x_k - x^{*}$, i.e. $e_k$ algebraically represents the error between our $k$-th estimate of $x$ and our theoretical, true solution.
Keep in mind that we don’t actually know the value of $x^{*}$ – we’re just using these algebraic statements for convenience.

  1. Show that:
    $e_{k+1} = M^{-1} N \, e_k. \qquad (1)$
    You may assume without proof the identity $x^{*} = M^{-1} \left( N x^{*} + b \right)$.
  2. Using Equation 1, prove that:
    $e_k = \left( M^{-1} N \right)^{k} e_0. \qquad (2)$
    Hint: Use induction.
  3. Define the matrix $T = M^{-1} N$. Using Equation 2, prove:
    $e_k = T^{k} e_0. \qquad (3)$
  4. Assume without proof that the matrix $T$ has a full set of linearly independent eigenvectors. Show that $x_k$ converges to the theoretical, true solution, $x^{*}$, when:
    $\max_i |\lambda_i(T)| < 1,$
i.e. when the magnitude of the largest eigenvalue of $T$ is less than 1. See the recommended reading if you’re stuck.
Fun fact: it is actually possible to prove this convergence condition without using the above assumption for $T$, but to do so requires math that is outside the scope of what we’ve covered so far in class.
From this problem, we see that if we ever have a sparse matrix equation $Ax = b$ where we can decompose the $n \times n$ non-singular matrix $A$ into $A = M - N$, where $M$ is invertible, then we can use the iterative scheme defined above to converge to a solution for $x$. Note that defining variables, such as $x^{*}$ and $e_k$, for our convenience is a common practice when analyzing iterative methods such as this one, especially since no matter what problem we have, we know that we’ll always have a true solution and error term.
In general, there are many iterative methods out there (e.g. the popular conjugate gradient method), and we should always be aware of their convergence conditions to make sure that they are applicable to our problem. Once we’re fairly sure that the method will converge, it’s just a simple matter of coding up a loop for the iterative scheme that terminates once our estimate for $x$ no longer differs from the previous estimate by a significant amount.
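A minimal sketch of such a loop is below; it assumes the Jacobi-style splitting $M = \operatorname{diag}(A)$, $N = M - A$ purely for illustration (the problem does not prescribe a particular splitting), and checks the eigenvalue condition from Part 4 before iterating:

```python
import numpy as np

def splitting_solve(A, b, x0, tol=1e-10, max_iters=1000):
    """Solve Ax = b via the splitting A = M - N, with M = diag(A) (a Jacobi-style choice)."""
    M = np.diag(np.diag(A))
    N = M - A
    # Convergence condition: the largest |eigenvalue| of T = M^{-1} N must be below 1.
    T = np.linalg.solve(M, N)
    if np.max(np.abs(np.linalg.eigvals(T))) >= 1:
        raise ValueError("this splitting does not satisfy the convergence condition")
    x = x0
    for _ in range(max_iters):
        x_new = np.linalg.solve(M, N @ x + b)   # x_{k+1} = M^{-1}(N x_k + b)
        if np.linalg.norm(x_new - x) < tol:     # stop once the estimate stops changing
            return x_new
        x = x_new
    return x

# A small diagonally dominant example where this splitting converges.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(np.allclose(A @ splitting_solve(A, b, x0=np.zeros(2)), b))   # True
```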
CA: Kevin
Recommended Readings: Lecture 5, Heath 4.2.2

9. Your Solution.

Your Solution.