EECE5644 2022 Spring Assignment 2


Submit: Before Tuesday, 2022-March-15, 23:59 ET. Please submit your solutions on the assignments page in Canvas as a single PDF file that includes all math, numerical, and visual results. Also, to verify the existence of your own computer implementation, include a link to your online code repository, or include the code as an appendix/attachment in a ZIP file along with the PDF. The code is not graded, but it helps verify that your results are feasible as claimed. Only results and discussion presented in the PDF will be graded, so do not link to an external location where further results may be presented.

This is a graded assignment, and the entirety of your submission must contain only your own work. You may benefit from publicly available literature, including software (but not from classmates), as long as these sources are properly acknowledged in your submission. All discussions and materials shared during office periods are also acceptable resources, and these tend to be very useful, so participate in office periods or watch their recordings. Cite your sources as appropriate. Verbal discussion with classmates is acceptable, but there cannot be any exchange of written material. By submitting a PDF file in response to this take-home assignment, you declare that the contents of your submission and the associated code are your own work, except as noted in your citations to resources and as otherwise allowed above.

Question 1 (40%)

The probability density function (pdf) for a 2-dimensional real-valued random vector $\mathbf{x}$ is as follows: $p(\mathbf{x}) = P(L=0)\,p(\mathbf{x}|L=0) + P(L=1)\,p(\mathbf{x}|L=1)$. Here $L$ is the true class label that indicates which class-label-conditioned pdf generates the data. The class priors are $P(L=0) = 0.65$ and $P(L=1) = 0.35$. The class-conditional pdfs are $p(\mathbf{x}|L=0) = w_1\,g(\mathbf{x}|\mathbf{m}_{01},\mathbf{C}_{01}) + w_2\,g(\mathbf{x}|\mathbf{m}_{02},\mathbf{C}_{02})$ and $p(\mathbf{x}|L=1) = g(\mathbf{x}|\mathbf{m}_1,\mathbf{C}_1)$, where $g(\mathbf{x}|\mathbf{m},\mathbf{C})$ is a multivariate Gaussian probability density function with mean vector $\mathbf{m}$ and covariance matrix $\mathbf{C}$. The parameters of the class-conditional Gaussian pdfs are $w_1 = w_2 = 1/2$, and

$$\mathbf{m}_{01} = \begin{bmatrix}3\\0\end{bmatrix} \quad \mathbf{C}_{01} = \begin{bmatrix}2&0\\0&1\end{bmatrix} \quad \mathbf{m}_{02} = \begin{bmatrix}0\\3\end{bmatrix} \quad \mathbf{C}_{02} = \begin{bmatrix}1&0\\0&2\end{bmatrix} \quad \mathbf{m}_{1} = \begin{bmatrix}2\\2\end{bmatrix} \quad \mathbf{C}_{1} = \begin{bmatrix}1&0\\0&1\end{bmatrix}$$

For the numerical results requested below, generate the following independent datasets, each consisting of iid samples from the specified data distribution, and in each dataset make sure to include the true class label for each sample.

  • $D^{20}_{\text{train}}$ consists of 20 samples and their labels, for training;
  • $D^{200}_{\text{train}}$ consists of 200 samples and their labels, for training;
  • $D^{2000}_{\text{train}}$ consists of 2000 samples and their labels, for training;
  • $D^{10K}_{\text{validate}}$ consists of 10000 samples and their labels, for validation.

Part 1: (10%) Determine the theoretically optimal classifier that achieves minimum probability of error using knowledge of the true pdf. Specify the classifier mathematically and implement it; then apply it to all samples in $D^{10K}_{\text{validate}}$. From the decision results and true labels for this validation set, estimate and plot the ROC curve of this min-P(error) classifier, and on the ROC curve indicate, with a special marker, the location of the min-P(error) operating point. Also report an estimate of the min-P(error) achievable, based on counts of decision-truth label pairs on $D^{10K}_{\text{validate}}$. Optional: as supplementary visualization, generate a plot of the decision boundary of this classification rule overlaid on the validation dataset. This establishes an aspirational performance level on this data for the following approximations.

Part 2: (30%) (a) Using the maximum-likelihood parameter estimation technique, train three separate logistic-linear-function-based approximations of the class-label posterior function given a sample. For each approximation use one of the three training datasets $D^{20}_{\text{train}}$, $D^{200}_{\text{train}}$, $D^{2000}_{\text{train}}$. When optimizing the parameters, specify the optimization problem as minimization of the negative log-likelihood of the training dataset, and use your favorite numerical optimization approach, such as gradient descent or MATLAB's fminsearch. Determine how to use these class-label-posterior approximations to classify a sample in order to approximate the minimum-P(error) classification rule; apply these three approximations of the class-label posterior function to the samples in $D^{10K}_{\text{validate}}$, and estimate the probability of error that these three classification rules attain (using counts of decisions on the validation set).
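The data generation and the Part 1 likelihood-ratio test can be sketched as follows. This is an illustrative Python/NumPy stand-in for a MATLAB implementation; all function names and the validation-set size of 10000 follow the problem statement, but the random seed and code organization are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Class priors and Gaussian parameters from the problem statement.
priors = [0.65, 0.35]
m01, C01 = np.array([3.0, 0.0]), np.array([[2.0, 0.0], [0.0, 1.0]])
m02, C02 = np.array([0.0, 3.0]), np.array([[1.0, 0.0], [0.0, 2.0]])
m1,  C1  = np.array([2.0, 2.0]), np.eye(2)

def generate(n):
    """Draw n iid samples with true labels from the specified mixture."""
    labels = rng.random(n) >= priors[0]           # True -> class 1
    x = np.empty((n, 2))
    for i, l in enumerate(labels):
        if l:                                      # class 1: single Gaussian
            x[i] = rng.multivariate_normal(m1, C1)
        elif rng.random() < 0.5:                   # class 0: equal-weight mixture
            x[i] = rng.multivariate_normal(m01, C01)
        else:
            x[i] = rng.multivariate_normal(m02, C02)
    return x, labels.astype(int)

def gauss_pdf(x, m, C):
    """Evaluate a 2-D Gaussian pdf at each row of x."""
    d = x - m
    inv = np.linalg.inv(C)
    norm = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(C)))
    return norm * np.exp(-0.5 * np.einsum('ij,jk,ik->i', d, inv, d))

x, labels = generate(10000)
p_x_given_0 = 0.5 * gauss_pdf(x, m01, C01) + 0.5 * gauss_pdf(x, m02, C02)
p_x_given_1 = gauss_pdf(x, m1, C1)

# Likelihood-ratio test; the min-P(error) threshold is P(L=0)/P(L=1).
ratio = p_x_given_1 / p_x_given_0
decisions = ratio > priors[0] / priors[1]
p_error = np.mean(decisions != labels)
print(f"estimated min-P(error) ~ {p_error:.3f}")
```

Sweeping the threshold over a range of values instead of fixing it at the prior ratio yields the points of the requested ROC curve.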
Optional: as supplementary visualization, generate plots of the decision boundaries of these trained classifiers superimposed on their respective training datasets and on the validation dataset. (b) Repeat the process described in Part 2(a) using a logistic-quadratic-function-based approximation of the class-label posterior function given a sample. How does the performance of the classifiers trained in this part compare to one another, considering differences in the number of training samples and the function form? How do they compare to the theoretically optimal classifier from Part 1? Briefly discuss results and insights.
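A minimal sketch of the negative-log-likelihood minimization for the logistic-linear case, using scipy's Nelder-Mead as a stand-in for MATLAB's fminsearch. The toy training data below is a placeholder (samples from Question 1's pdf would be used instead), and the standard sigmoid convention $1/(1+e^{-\mathbf{w}^T\mathbf{z}})$ is used, which corresponds to the statement's convention with $\mathbf{w}$ negated.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

def z_linear(x):
    """Augmented feature z(x) = [1, x^T]^T for each row of x."""
    return np.hstack([np.ones((x.shape[0], 1)), x])

def nll(w, Z, labels):
    """Negative log-likelihood of Bernoulli labels under h = sigmoid(Z @ w)."""
    s = Z @ w
    log_h = -np.logaddexp(0.0, -s)      # log sigmoid(s), numerically stable
    log_1mh = -s + log_h                # log(1 - sigmoid(s))
    return -np.sum(labels * log_h + (1 - labels) * log_1mh)

# Toy stand-in for D^20_train: class-1 samples shifted by (2, 2).
labels = rng.integers(0, 2, size=20).astype(float)
x_train = rng.normal(size=(20, 2)) + labels[:, None] * np.array([2.0, 2.0])

Z = z_linear(x_train)
res = minimize(nll, x0=np.zeros(3), args=(Z, labels), method='Nelder-Mead')
w_hat = res.x

# Approximate the min-P(error) rule by thresholding the posterior at 1/2.
posterior = 1.0 / (1.0 + np.exp(-z_linear(x_train) @ w_hat))
decisions = (posterior > 0.5).astype(float)
print("training error:", np.mean(decisions != labels))
```

The same routine handles the quadratic case by swapping in the six-term feature map from Note 1 and a six-dimensional initial parameter vector.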
Note 1: With $\mathbf{x}$ representing the input sample vector and $\mathbf{w}$ denoting the model parameter vector, logistic-linear-function refers to $h(\mathbf{x},\mathbf{w}) = 1/\big(1+e^{\mathbf{w}^T \mathbf{z}(\mathbf{x})}\big)$, where $\mathbf{z}(\mathbf{x}) = [1, \mathbf{x}^T]^T$; and logistic-quadratic-function refers to $h(\mathbf{x},\mathbf{w}) = 1/\big(1+e^{\mathbf{w}^T \mathbf{z}(\mathbf{x})}\big)$, where $\mathbf{z}(\mathbf{x}) = [1, x_1, x_2, x_1^2, x_1 x_2, x_2^2]^T$.
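The two feature maps and the logistic function, exactly as written in Note 1, can be encoded directly (a Python stand-in; the example input $\mathbf{x} = (1, 2)$ is arbitrary):

```python
import numpy as np

def z_linear(x):
    # z(x) = [1, x1, x2]^T
    return np.array([1.0, x[0], x[1]])

def z_quadratic(x):
    # z(x) = [1, x1, x2, x1^2, x1*x2, x2^2]^T
    return np.array([1.0, x[0], x[1], x[0]**2, x[0]*x[1], x[1]**2])

def h(x, w, z):
    # Logistic function in the statement's convention: 1 / (1 + e^{w^T z(x)}).
    return 1.0 / (1.0 + np.exp(w @ z(x)))

x = np.array([1.0, 2.0])
print(h(x, np.zeros(3), z_linear))      # 0.5 when w = 0
print(h(x, np.zeros(6), z_quadratic))   # 0.5 when w = 0
```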

Question 2 (40%)

Assume that a real scalar $y$ and a two-dimensional real vector $\mathbf{x}$ are related to each other according to $y = c(\mathbf{x},\mathbf{w}) + v$, where $c(\cdot,\mathbf{w})$ is a cubic polynomial in $\mathbf{x}$ with coefficients $\mathbf{w}$, and $v$ is a Gaussian random scalar with zero mean and variance $\sigma^2$. Given a dataset $D = \{(\mathbf{x}_1,y_1),\ldots,(\mathbf{x}_N,y_N)\}$ with $N$ samples of $(\mathbf{x},y)$ pairs, with the assumption that these samples are independent and identically distributed according to the model, derive two estimators for $\mathbf{w}$ using the maximum-likelihood (ML) and maximum-a-posteriori (MAP) parameter estimation approaches, as a function of these data samples. For the MAP estimator, assume that $\mathbf{w}$ has a zero-mean Gaussian prior with covariance matrix $\gamma\mathbf{I}$.

Having derived the estimator expressions, implement them in code and apply them to the dataset generated by the attached MATLAB script. Using the training dataset, obtain the ML estimator and the MAP estimator for a variety of $\gamma$ values ranging from $10^{-4}$ to $10^{4}$. Evaluate each trained model by calculating the average squared error between the $y$ values of the validation samples and the model estimates of these using $c(\cdot,\mathbf{w}_{\text{trained}})$. How does your MAP-trained model perform on the validation set as $\gamma$ is varied? How is the MAP estimate related to the ML estimate? Describe your experiments, and visualize and quantify your analyses (e.g., average squared error on the validation dataset as a function of the hyperparameter) with data from these experiments. Note: the point split will be 20% for ML and 20% for MAP estimator results.
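Because $c(\mathbf{x},\mathbf{w})$ is linear in $\mathbf{w}$, the ML estimator reduces to least squares and the MAP estimator (zero-mean Gaussian prior, covariance $\gamma\mathbf{I}$) to a ridge-regularized solve with coefficient $\sigma^2/\gamma$. The sketch below is an illustrative Python stand-in for the MATLAB implementation; the 10-term monomial feature map, the noise level, and the toy data are all assumptions, not the attached script's data.

```python
import numpy as np

def cubic_features(x):
    """All monomials of [x1, x2] up to degree 3 (10 terms; assumed form)."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2,
                            x1**2, x1*x2, x2**2,
                            x1**3, x1**2*x2, x1*x2**2, x2**3])

def w_ml(X, y):
    """ML estimate: ordinary least squares on the cubic features."""
    Phi = cubic_features(X)
    return np.linalg.lstsq(Phi, y, rcond=None)[0]

def w_map(X, y, sigma2, gamma):
    """MAP estimate: (Phi^T Phi + (sigma2/gamma) I)^{-1} Phi^T y."""
    Phi = cubic_features(X)
    A = Phi.T @ Phi + (sigma2 / gamma) * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

# Toy check: MAP approaches ML as gamma grows (the prior becomes flat).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
w_true = rng.normal(size=10)
y = cubic_features(X) @ w_true + 0.1 * rng.normal(size=100)

wml = w_ml(X, y)
wmap = w_map(X, y, sigma2=0.01, gamma=1e6)
print(np.max(np.abs(wml - wmap)))  # small
```

Sweeping `gamma` over the requested range and recording the validation-set average squared error for each value produces the requested error-versus-hyperparameter plot.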

Question 3 (20%)

Let $Z$ be drawn from a categorical distribution (taking discrete values) with $K$ possible outcomes/states and parameter $\boldsymbol{\theta}$, denoted $\text{Cat}(\boldsymbol{\theta})$. Describe the value/state using a 1-of-$K$ scheme, $\mathbf{z} = [z_1,\ldots,z_K]^T$, where $z_k = 1$ if the variable is in state $k$ and $z_k = 0$ otherwise. Let the parameter vector for the pdf be $\boldsymbol{\theta} = [\theta_1,\ldots,\theta_K]^T$, where $P(z_k = 1) = \theta_k$, for $k \in \{1,\ldots,K\}$. Given $D = \{\mathbf{z}_1,\ldots,\mathbf{z}_N\}$ with iid samples $\mathbf{z}_n \sim \text{Cat}(\boldsymbol{\theta})$ for $n \in \{1,\ldots,N\}$:

  • What is the ML estimator for $\boldsymbol{\theta}$?
  • Assuming that the prior $p(\boldsymbol{\theta})$ for the parameters is a Dirichlet distribution with hyperparameter $\boldsymbol{\alpha}$, what is the MAP estimator for $\boldsymbol{\theta}$?
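The 1-of-$K$ setup above can be made concrete with a small sampling illustration (a Python stand-in; the values of $\boldsymbol{\theta}$ and $N$ are arbitrary examples). Both estimators turn out to depend on the data only through the per-state counts $N_k$ computed here.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([0.2, 0.5, 0.3])   # example K = 3 category probabilities
N = 1000

states = rng.choice(len(theta), size=N, p=theta)  # z_n ~ Cat(theta)
Z = np.eye(len(theta))[states]                    # rows are 1-of-K vectors

# Sufficient statistics: N_k = number of samples observed in state k.
counts = Z.sum(axis=0)
print(counts, counts / N)
```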
Hint: The Dirichlet distribution with parameter $\boldsymbol{\alpha}$ is
$$p(\boldsymbol{\theta}|\boldsymbol{\alpha}) = \frac{1}{B(\boldsymbol{\alpha})} \prod_{k=1}^{K} \theta_k^{\alpha_k - 1}, \quad \text{where the normalization constant is} \quad B(\boldsymbol{\alpha}) = \frac{\prod_{k=1}^{K} \Gamma(\alpha_k)}{\Gamma\!\left(\sum_{k=1}^{K} \alpha_k\right)}.$$