### ML4ENG Coursework Part 2


Deadline for submission (enforced by Keats): April 18, 2022 at 16:00.

#### 1 GENERAL GUIDELINES

In this coursework, you will work with

- Binary classification using linear and non-linear models.
- Principal component analysis (PCA).

To start, download the data sets data_xor.mat and dats_usps_5_6.mat, together with the template file cw_template.m, from the module's Keats website. Once this is done:

- Rename cw_template.m to your k-number. In the following, we will refer to this file as k12345678.m.
- Open the k12345678.m file with your MATLAB editor [1]. Note that the file contains a preamble, referred to as the main body, which you should **not** modify, and the definitions of several functions.
- Follow the instructions (Section 3 of this document) to fill in the details of the functions in the template file. The functions in the k12345678.m file are numbered according to the list in Section 3 (Instructions).
- Make sure each function's output has the correct size.
- You are encouraged to reuse, in each function, the functions that you wrote earlier. This will not always be possible.
- Once you have written the functions, verify that k12345678.m runs without errors when it is placed in a folder containing **only** the file itself and the data sets.
- Check that **no** MATLAB toolbox was used (type license('inuse') in the command window and verify that only standard MATLAB functions are used).
- Check that nothing is printed to the display other than the discussions. Each printed variable incurs a penalty of 2 points. As a reminder, place ; at the end of each assignment line.
- A coursework that raises any error while running may be graded 0.
- Submit only the k12345678.m file on Keats. No other files are allowed.

#### 2 DATA SETS

We will use two data sets, each stored in a separate .mat file. The file **data_xor.mat** is used in Sections 1 and 2. It contains a data set that is already split into training and test parts.

Specifically, the training data set $\mathcal{D} = \{(x_n, t_n)\}_{n=1}^{N}$ contains $N = 200$ pairs, while the test data set $\mathcal{D}^{te} = \{(x_n, t_n)\}_{n=1}^{N^{te}}$ consists of $N^{te} = 100$ examples. Each example pair consists of:

- an input vector $x = [x^{(1)}, x^{(2)}]^T \in \mathbb{R}^2$;
- its corresponding binary label $t \in \{0, 1\}$.

The data is loaded into the workspace as follows, where $N$ and $N^{te}$ denote the numbers of training and test samples.

| Name | Size | Type | Description |
|---|---|---|---|
| t_tr | $N \times 1$ | Logical | Training set binary labels: 0 and 1 |
| X_tr | $N \times 2$ | Double | Training set data matrix (sample vectors as rows) |
| t_te | $N^{te} \times 1$ | Logical | Test set binary labels: 0 and 1 |
| X_te | $N^{te} \times 2$ | Double | Test set data matrix (sample vectors as rows) |

The data is used via two feature mappings, named A (using $D_A = 2$ features) and B (using $D_B = 3$ features). The feature vectors are given as

$$u_A(x) = [x^{(1)}, x^{(2)}]^T \in \mathbb{R}^2, \qquad u_B(x) = [x^{(1)}, x^{(2)}, x^{(1)} x^{(2)}]^T \in \mathbb{R}^3.$$

For a set of $N$ samples, we stack the features as usual into the feature matrices

$$U_A = \begin{bmatrix} u_A(x_1)^T \\ \vdots \\ u_A(x_N)^T \end{bmatrix}, \qquad U_B = \begin{bmatrix} u_B(x_1)^T \\ \vdots \\ u_B(x_N)^T \end{bmatrix},$$

and denote the labels vector as $t = [t_1, \ldots, t_N]^T$.

The file **dats_usps_5_6.mat** is used for Section 4. It contains $N = 1550$ images of handwritten digits 5 and 6 from the USPS database [3] (which contains examples of all digits from 1 to 10). All images contain 16×16 pixels, stacked into vectors of length 256 that form the rows of the input matrix $X \in \mathbb{R}^{N \times 256}$. Their associated labels are in the vector $t \in \{0, 1\}^N$, with label 0 corresponding to digit 5 and label 1 corresponding to digit 6.

| Name | Size | Type | Description |
|---|---|---|---|
| X | $N \times 256$ | Double | Images as rows |
| t | $N \times 1$ | Logical | Binary labels: 0 and 1 |
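To make the two feature mappings concrete, the following minimal sketch builds the feature matrices for four made-up points. It is written in Python purely for illustration (the coursework itself must be completed in MATLAB), the points are not taken from data_xor.mat, and the third entry of mapping B is assumed to be the product of the two input coordinates. The helper names are hypothetical.

```python
def feature_map_a(x):
    """Mapping A: the raw input coordinates (2 features)."""
    return [x[0], x[1]]


def feature_map_b(x):
    """Mapping B (assumed form): mapping A plus the product x1 * x2 (3 features)."""
    return [x[0], x[1], x[0] * x[1]]


def feature_matrix(X, feature_map):
    """Stack the feature vectors of all samples as rows."""
    return [feature_map(x) for x in X]


# Made-up XOR-like points, one per quadrant (not the coursework data).
X = [[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
U_A = feature_matrix(X, feature_map_a)
U_B = feature_matrix(X, feature_map_b)
```

For these four points the extra feature in `U_B` is positive in the first and third quadrants and negative in the other two, which is the kind of structure a linear model on mapping A alone cannot express.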

#### 3 INSTRUCTIONS FOR COMPLETING THE COURSEWORK

Section 1 Perceptron

To complete this section, we will work on six functions that will enable the training of a discriminative linear model for binary classification via the perceptron algorithm. Two model classes will be investigated, one with feature mapping $u_A(x)$ and the other with feature mapping $u_B(x)$. Accordingly, the model parameter vector $\theta$ is of size $D_A$ or $D_B$, and all functions below must work for an arbitrary feature size $D$. The main body loads the XOR data set.

- [5 points] Design the function function t_hat = **perceptron_predict**(U, theta) that takes as inputs a feature matrix U and a model parameter vector theta. The function returns the perceptron output t_hat, as a vector of hard predictions $\hat{t}(x|\theta)$ applied to the input feature vectors.
- [5 points] Design the function function grad_theta = **perceptron_gradient**(u,t,theta) that takes as inputs a feature vector u, its binary label t $\in \{0, 1\}$, and a model parameter vector theta. The function returns a vector representing the gradient $\nabla_\theta \ell(t, \hat{t}(x|\theta))$ of the surrogate loss of the perceptron algorithm, that is, the gradient of the hinge-at-zero loss at the above sample pair.
- [5 points] Design the function function loss = **detection_error_loss**(U,t,theta) that takes as inputs a feature matrix U, its label vector t with entries in $\{0, 1\}$, and a model parameter vector theta. The function returns the empirical detection-error loss loss of the perceptron predictions using the given features and model parameter vector.
- [5 points] Design the function function loss = **hinge_at_zero_loss**(U,t,theta) that takes as inputs a feature matrix U, its label vector t with entries in $\{0, 1\}$, and a model parameter vector theta. The function returns the empirical hinge-at-zero loss loss of the perceptron predictions using the given features and model parameter vector.
- [5 points] Design the function function theta_mat = **perceptron_train_sgd**(U_tr,t_tr,theta_init,I,gamma) that takes as inputs: a training feature matrix U_tr and the corresponding label vector t_tr with entries in $\{0, 1\}$; an initial model parameter vector theta_init; the number I of training iterations; and the learning rate gamma > 0. The function runs I training iterations of the perceptron algorithm with mini-batches of size 1. It returns the learned model parameter vectors for each iteration, as columns in the matrix theta_mat $\in \mathbb{R}^{D \times (I+1)}$. The first column equals theta_init, the second column is the parameter vector after one update, and so on. The mini-batch for the $i$-th iteration has a single sample chosen by the cyclic scan

  $$n_i = 1 + (i - 1) \bmod N,$$

  meaning the first update uses the first row in U_tr and the first entry of t_tr, the second update uses the second sample, and so forth.
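The prediction, gradient and loss functions described above can be sketched as follows. This is a minimal Python illustration, not the MATLAB template code, and it rests on two assumptions that the text does not state: a hard prediction of 1 requires a strictly positive logit, and a sample with zero margin is treated as misclassified when choosing the (sub)gradient of the hinge-at-zero loss. The helper `dot` and the toy data are made up.

```python
def dot(u, theta):
    """Inner product theta^T u (the logit of a linear model)."""
    return sum(ui * wi for ui, wi in zip(u, theta))


def perceptron_predict(U, theta):
    """Hard predictions: 1 where the logit is positive, else 0 (assumed tie rule)."""
    return [1 if dot(u, theta) > 0 else 0 for u in U]


def perceptron_gradient(u, t, theta):
    """(Sub)gradient of the hinge-at-zero loss max(0, -t_pm * theta^T u)."""
    t_pm = 2 * t - 1  # map the label from {0, 1} to {-1, +1}
    if t_pm * dot(u, theta) <= 0:  # assumption: a zero margin counts as an error
        return [-t_pm * ui for ui in u]
    return [0.0 for _ in u]


def detection_error_loss(U, t, theta):
    """Fraction of samples whose hard prediction differs from the label."""
    t_hat = perceptron_predict(U, theta)
    return sum(int(p != ti) for p, ti in zip(t_hat, t)) / len(t)


def hinge_at_zero_loss(U, t, theta):
    """Average of max(0, -t_pm * theta^T u) over the samples."""
    total = sum(max(0.0, -(2 * ti - 1) * dot(u, theta)) for u, ti in zip(U, t))
    return total / len(t)


# Made-up toy data: three feature vectors with labels.
U = [[1.0, 2.0], [-1.0, 0.5], [2.0, -1.0]]
t = [1, 0, 0]
theta = [1.0, 0.0]
```

With these toy values only the third sample is misclassified, so it is the only one that contributes to either loss and the only one with a non-zero gradient.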

The main body trains a perceptron for each of the two proposed model classes, plots the losses and shows the decision rules for some representative iterations.
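The cyclic scan used for training can be illustrated with a short sketch. It is again written in Python under stated assumptions: the function returns a list of parameter vectors (the counterpart of the columns of theta_mat), the 1-based index rule is taken to be 1 + (i - 1) mod N, and a zero margin is treated as an error. The helper `cyclic_index` and the toy data are hypothetical.

```python
def cyclic_index(i, N):
    """1-based sample index for iteration i = 1, 2, ...: 1 + (i - 1) mod N."""
    return 1 + (i - 1) % N


def perceptron_train_sgd(U_tr, t_tr, theta_init, I, gamma):
    """Return [theta_0, theta_1, ..., theta_I], one parameter vector per iteration."""
    thetas = [list(theta_init)]
    N = len(U_tr)
    for i in range(1, I + 1):
        n = cyclic_index(i, N) - 1  # convert to Python's 0-based indexing
        u, t = U_tr[n], t_tr[n]
        theta = thetas[-1]
        logit = sum(ui * wi for ui, wi in zip(u, theta))
        t_pm = 2 * t - 1  # label in {-1, +1}
        # Hinge-at-zero (sub)gradient; assumption: a zero margin counts as an error.
        grad = [-t_pm * ui if t_pm * logit <= 0 else 0.0 for ui in u]
        thetas.append([wi - gamma * gi for wi, gi in zip(theta, grad)])
    return thetas


# Made-up, linearly separable toy data.
U_tr = [[1.0, 1.0], [-1.0, -1.0], [2.0, 0.5]]
t_tr = [1, 0, 1]
history = perceptron_train_sgd(U_tr, t_tr, [0.0, 0.0], I=5, gamma=0.1)
```

Here `history` has I + 1 entries, the first being the initial vector, and the scan wraps back to the first sample at the fourth iteration.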

- [10 points] Add up to two lines of text in the function **discussionA**() describing how useful the perceptron predictions under feature mappings A and B are for this data set.

Section 2 Logistic Regression

This section consists of four functions, which will enable the training of a discriminative probabilistic linear model for binary classification using logistic regression. The same two model classes with feature mappings $u_A(x)$ and $u_B(x)$ are considered again, alongside the XOR data set. All functions must be written for an arbitrary model parameter size $D$.

- [5 points] Design the function function logit = **logistic_regression_logit**(U, theta) that takes as inputs a feature matrix U and a model parameter vector theta. The function returns the logit vector logit of logistic regression outputs, with the $n$-th entry being the logit of the $n$-th feature vector, along the $n$-th row of U.
- [5 points] Design the function function grad_theta = **logistic_regression_gradient**(u,t,theta) that takes as inputs a feature vector u, its binary label t $\in \{0, 1\}$, and a model parameter vector theta. The function returns a vector representing the gradient $\nabla_\theta \ell(t, \hat{t}(x|\theta))$ of the logistic loss at the above sample pair.
- [5 points] Design the function function loss = **logistic_loss**(U,t,theta) that takes as inputs a feature matrix U, its label vector t with entries in $\{0, 1\}$, and a model parameter vector theta. The function returns the empirical logistic loss loss of logistic regression predictions using the given features and model parameter vector.
- [5 points] Design the function function theta_mat = **logistic_regression_train_sgd**(U_tr,t_tr,theta_init,I,gamma,S) having the same inputs and outputs as the fifth function perceptron_train_sgd(), with the difference that it uses the logistic loss of the considered logistic regression model. Moreover, the SGD now uses mini-batches of S samples, chosen at the $i$-th iteration by the cyclic scan rule

  $$\mathcal{S}^{(i)} = 1 + \big((i - 1)S + [0 : S - 1]\big) \bmod N.$$

The main body trains a logistic regression model for the two different model classes, plots the losses and shows the decision rules for some representative iterations. Use them to gain insights and validate your code. No need for discussion here.
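The logistic regression pieces and the mini-batch cyclic scan can be sketched as follows. This is a hedged Python illustration, not the MATLAB solution: it assumes that the logistic loss is log(1 + exp(-t_pm * logit)) with t_pm = 2t - 1, that the mini-batch gradient is the average of the per-sample gradients, and that the index rule is 1 + ((i - 1) S + [0 : S - 1]) mod N. The helper names `sigmoid` and `minibatch_indices` and the toy data are made up.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def logistic_regression_logit(U, theta):
    """Logit theta^T u for every row u of U."""
    return [sum(ui * wi for ui, wi in zip(u, theta)) for u in U]


def logistic_regression_gradient(u, t, theta):
    """Gradient of the logistic loss at one sample: (sigmoid(logit) - t) * u."""
    logit = sum(ui * wi for ui, wi in zip(u, theta))
    return [(sigmoid(logit) - t) * ui for ui in u]


def logistic_loss(U, t, theta):
    """Average of log(1 + exp(-t_pm * logit)) over the samples."""
    logits = logistic_regression_logit(U, theta)
    total = sum(math.log1p(math.exp(-(2 * ti - 1) * a)) for a, ti in zip(logits, t))
    return total / len(t)


def minibatch_indices(i, S, N):
    """1-based indices for iteration i: 1 + ((i - 1) * S + [0 : S - 1]) mod N."""
    return [1 + ((i - 1) * S + s) % N for s in range(S)]


def logistic_regression_train_sgd(U_tr, t_tr, theta_init, I, gamma, S):
    """Return [theta_0, ..., theta_I] using mini-batches of S samples."""
    thetas = [list(theta_init)]
    N = len(U_tr)
    for i in range(1, I + 1):
        theta = thetas[-1]
        grad = [0.0] * len(theta)
        for n in minibatch_indices(i, S, N):
            g = logistic_regression_gradient(U_tr[n - 1], t_tr[n - 1], theta)
            grad = [a + b / S for a, b in zip(grad, g)]  # assumed: mini-batch average
        thetas.append([wi - gamma * gi for wi, gi in zip(theta, grad)])
    return thetas


# Made-up, linearly separable toy data.
U_tr = [[1.0, 1.0], [-1.0, -1.0], [2.0, 0.5], [-2.0, -0.5]]
t_tr = [1, 0, 1, 0]
history = logistic_regression_train_sgd(U_tr, t_tr, [0.0, 0.0], I=20, gamma=0.5, S=2)
```

With S = 1 the index rule reduces to the single-sample cyclic scan of the perceptron section, which is a quick consistency check when porting the idea to MATLAB.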

Section 3 Neural network

We now consider a 3-layer neural network model for binary classification, with input features of size $D$, $D_1$ neurons in the first hidden layer and $D_2$ neurons in the second hidden layer, and ReLU activation in the hidden layers. To represent the model parameters $\theta = \{W^1, W^2, w^3\}$, we use a MATLAB struct [2] to group the matrices. The model parameters can be accessed using the struct fields: for a struct named theta, use theta.W1, theta.W2 and theta.w3 to access them as regular variables. The functions to be designed must account for general layer sizes. You can assume the number of layers is 3, without generalizing it. We use the lecture notation and denote the input features as $x$ rather than $u$ as in the previous sections. This section does not use any data set. The main body calls the listed functions for two different input feature vectors, using the same model parameters as provided within the main body.

- [5 points] Design the function function [logit, record] = **neural_network_logit**(x, theta) that takes as inputs a single feature vector x and a model parameter struct theta. The function returns the logit logit of the neural network for the feature vector x. You may use the second output record as a structure with fields of your choice, to record inner variables that can be useful when designing the upcoming function neural_network_gradient(). This structure is not marked: you can leave it empty until you design the next function, or permanently if it is not useful to you. You may use the function ReLU(), as provided in the auxiliary functions.
- [5 points] Design the function function out = **grad_ReLU**(in) that computes entry-wise the gradient of the nonlinear activation function ReLU over a vector or matrix in of any size, outputting a vector or matrix out of the same size. As the gradient is not defined when an input entry is 0, we arbitrarily choose the output to be 0 in this case.
- [10 points] Design the function function grad_theta = **neural_network_gradient**(x,t,theta) that takes as inputs one sample $x$, its label $t \in \{0, 1\}$ and a struct theta with fields W1, W2, w3. It produces a struct grad_theta, with inner fields (W1, W2, w3), containing the gradients $\nabla_\theta \ell(t, \hat{t}(x|\theta))$ of the logistic loss of the non-linear model with respect to the corresponding weights. You may find it useful to call neural_network_logit(), with properly recorded signals, but this is not compulsory.
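The forward pass and backpropagation described above can be sketched as follows. This Python illustration makes several assumptions that the text does not confirm: there are no bias terms, W1 has one row per first-layer neuron, W2 has one row per second-layer neuron, w3 is a vector with one entry per second-layer neuron, and a dictionary stands in for the MATLAB struct. The recorded field names (a1, h1, a2, h2) and the weight values are made up.

```python
import math


def relu(v):
    return [max(0.0, a) for a in v]


def grad_relu(v):
    """Entry-wise ReLU derivative; 0 is used at a == 0, as the text specifies."""
    return [1.0 if a > 0 else 0.0 for a in v]


def matvec(W, v):
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]


def neural_network_logit(x, theta):
    """Forward pass; returns the logit and the recorded inner signals."""
    a1 = matvec(theta["W1"], x)
    h1 = relu(a1)
    a2 = matvec(theta["W2"], h1)
    h2 = relu(a2)
    logit = sum(w * h for w, h in zip(theta["w3"], h2))
    return logit, {"a1": a1, "h1": h1, "a2": a2, "h2": h2}


def neural_network_gradient(x, t, theta):
    """Backpropagation for the logistic loss at one sample (x, t)."""
    logit, rec = neural_network_logit(x, theta)
    delta3 = 1.0 / (1.0 + math.exp(-logit)) - t  # d loss / d logit
    grad_w3 = [delta3 * h for h in rec["h2"]]
    delta2 = [delta3 * w * g for w, g in zip(theta["w3"], grad_relu(rec["a2"]))]
    grad_W2 = [[d * h for h in rec["h1"]] for d in delta2]
    # Back-propagate through W2 (transpose product), then through the first ReLU.
    back = [sum(theta["W2"][k][j] * delta2[k] for k in range(len(delta2)))
            for j in range(len(rec["h1"]))]
    delta1 = [b * g for b, g in zip(back, grad_relu(rec["a1"]))]
    grad_W1 = [[d * xi for xi in x] for d in delta1]
    return {"W1": grad_W1, "W2": grad_W2, "w3": grad_w3}


# Made-up weights for D = 2 inputs, D1 = 3 and D2 = 2 hidden neurons.
theta = {"W1": [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]],
         "W2": [[1.0, 0.5, -1.0], [-0.5, 1.0, 1.0]],
         "w3": [1.0, -2.0]}
x, t = [1.0, 2.0], 1
logit, record = neural_network_logit(x, theta)
grads = neural_network_gradient(x, t, theta)
```

A finite-difference comparison on a single weight is a simple way to validate a gradient of this kind before relying on it.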

Section 4 – PCA

From this point on, we consider a binary classification problem involving handwritten digits 5 and 6 from the USPS data set [3], which the main body loads and from which it shows some samples. The main body pre-processes the data set to have zero mean, forming the unbiased data set Xub.

- [5 points] Design the function function W = **pca_get_dict**(Xub,M) that takes as inputs an unlabelled unbiased data set Xub and a number of principal components M. By principal component analysis, it outputs the dictionary matrix W.
- [5 points] Design the function function z = **pca_encode**(x,W) that encodes an input x using the dictionary W, producing the encoded vector z.
- [5 points] Design the function function x_hat = **pca_decode**(z,W) that reconstructs x_hat using an encoded vector z and a dictionary W.

The main body runs over 4 values of the number of components $M$ and plots their encoding-decoding outcomes for a selected number of digits.
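The dictionary, encoding and decoding steps can be illustrated with a small sketch. It is written in Python with the standard library only, so it swaps in power iteration with deflation for the eigendecomposition that one would normally obtain from a built-in routine; this is an illustrative substitute, not the method the coursework prescribes. It assumes that the dictionary consists of the top M unit-norm eigenvectors of Xub^T Xub, that z = W^T x and x_hat = W z, and that every requested component has non-zero variance. The parameter `iters` and the toy data are hypothetical.

```python
import math


def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def pca_get_dict(Xub, M, iters=500):
    """Top-M eigenvectors of Xub^T Xub via power iteration with deflation.

    Returns W as a list of M unit-norm directions (the columns of a D x M matrix).
    """
    D = len(Xub[0])
    C = [[sum(x[i] * x[j] for x in Xub) for j in range(D)] for i in range(D)]
    W = []
    for _m in range(M):
        v = [1.0 + 0.1 * k for k in range(D)]  # fixed, arbitrary starting vector
        for _it in range(iters):
            v = matvec(C, v)
            norm = math.sqrt(sum(a * a for a in v))
            v = [a / norm for a in v]
        lam = sum(a * b for a, b in zip(v, matvec(C, v)))  # Rayleigh quotient
        # Deflate: remove the found component so the next pass finds the next one.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(D)] for i in range(D)]
        W.append(v)
    return W


def pca_encode(x, W):
    """z = W^T x: projections of x onto the principal directions."""
    return [sum(a * b for a, b in zip(w, x)) for w in W]


def pca_decode(z, W):
    """x_hat = W z: linear combination of the principal directions."""
    return [sum(zk * w[d] for zk, w in zip(z, W)) for d in range(len(W[0]))]


# Made-up zero-mean data lying exactly on the line spanned by (1, 2).
Xub = [[-2.0, -4.0], [-1.0, -2.0], [1.0, 2.0], [2.0, 4.0]]
W = pca_get_dict(Xub, M=1)
z = pca_encode(Xub[3], W)
x_hat = pca_decode(z, W)
```

Because the toy data lie on a single line, one component is enough to reconstruct each point up to rounding error; real images need more components for a comparable reconstruction.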

- [10 points] Add up to two lines of text in the function **discussionB**() describing how increasing the number of components $M$ affects accuracy and computational complexity.

#### 4 REFERENCES

[1] https://uk.mathworks.com/academia/tah-portal/kings-college-london-30860095.html

[2] https://www.mathworks.com/help/matlab/ref/struct.html

[3] https://www.kaggle.com/bistaumanga/usps-dataset