CISC 271, Winter 2021, Assignment 5
March 2022
1 Introduction
This work has two main goals. The first is to use the Perceptron Rule for classification and compare it with logistic regression. On low-dimensional data, the single-layer perceptron classifies the data well and matches the performance of logistic regression. The second is to use kernel PCA for dimensionality reduction and then apply k-means clustering to the reduced data. The final results show that kernel PCA separates the data well and that the resulting clusters are consistent with the ground-truth labels.
2 Results
2.1 Question 1: Perceptron Rule For Machine Learning
Four plots:
Figure 1: The ROC curves for the perceptron.
Figure 2: The ROC curves for logistic regression.
Figure 3: The dimensionally reduced data and the separating hyperplane for the perceptron.
Figure 4: The dimensionally reduced data and the separating hyperplane for logistic regression.
One table:
Table 1: Threshold vs. Accuracy of Perceptron and Logistic Regression
Threshold Perceptron Logistic Regression
-10 0.727156 0.
-9 0.728443 0.
-8 0.728443 0.
-7 0.731017 0.
-6 0.737452 0.
-5 0.740026 0.
-4 0.751609 0.
-3 0.785071 0.
-2 0.824968 0.
-1 0.884170 0.
0 0.911197 0.
1 0.732304 0.
2 0.429858 0.
3 0.310167 0.
4 0.274131 0.
5 0.272844 0.
6 0.272844 0.
7 0.272844 0.
8 0.272844 0.
9 0.272844 0.
10 0.272844 0.
2.2 Question 2: Kernel PCA and K-Means Clustering
Two plots:
Figure 5: The Iris data plotted using the true labels.
Figure 6: The Iris data plotted using the cluster indexes.
3 Discussion
3.1 Question 1: Perceptron Rule For Machine Learning
According to Table 1, the best threshold for both methods is 0, and the best accuracy for both methods is 0.911197. In addition, when the threshold equals -10, both the perceptron and logistic regression classify all cases as positive, and the accuracy is 0.727156. This shows that about 73% of the samples in the dataset are positive. Thus, even the simplest model, which classifies every sample as positive, achieves roughly 73% accuracy.
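The threshold sweep summarized in Table 1 can be sketched as follows. This is a minimal NumPy illustration, not the code used for the assignment: the function name `accuracy_at_thresholds`, the toy scores and labels, and the convention that a score at or above the threshold is predicted positive are all assumptions.

```python
import numpy as np

def accuracy_at_thresholds(scores, labels, thresholds):
    """Return the accuracy obtained at each threshold.

    Assumed convention: a sample is predicted positive (1)
    when its score is greater than or equal to the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    accuracies = []
    for t in thresholds:
        predictions = (scores >= t).astype(int)
        accuracies.append(np.mean(predictions == labels))
    return np.array(accuracies)

# Hypothetical toy data: three positives and one negative.
scores = np.array([2.0, 0.5, 1.5, -1.0])
labels = np.array([1, 1, 1, 0])
thresholds = np.arange(-10, 11)  # -10, -9, ..., 10 as in Table 1

acc = accuracy_at_thresholds(scores, labels, thresholds)
```

At the lowest threshold every sample is predicted positive, so the accuracy equals the fraction of positive samples; at the highest threshold it equals the fraction of negative samples. The two extreme rows of Table 1 follow the same pattern.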
The threshold has the following meaning on the ROC curve. At the point (0, 0), every sample is classified as 0 regardless of its features, so TP = FP = 0, TPR = TP/P = 0, and FPR = FP/N = 0. At the point (1, 1), every sample is classified as 1 regardless of its features, so FN = TN = 0, P = TP + FN = TP, TPR = TP/P = 1, N = FP + TN = FP, and FPR = FP/N = 1. The ROC curve therefore describes how the classifier's performance changes as its threshold changes. The performance of the perceptron and logistic regression is similar, as shown in Fig. 1 and Fig. 2. Fig. 3 and Fig. 4 show the separating hyperplanes of the perceptron and logistic regression when the threshold is 0. At this threshold the two methods have the same accuracy, but their hyperplanes differ. On this dataset, then, the two methods reach the same best performance with different hyperplanes: each found its own solution.
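The two endpoints of the ROC curve can be checked numerically. The sketch below is a hypothetical NumPy illustration (the function name `roc_points` and the toy data are assumptions, not taken from the assignment code), and it assumes that a score at or above the threshold is predicted as 1.

```python
import numpy as np

def roc_points(scores, labels, thresholds):
    """Return (FPR, TPR) arrays, one entry per threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    P = np.sum(labels == 1)  # number of actual positives
    N = np.sum(labels == 0)  # number of actual negatives
    fpr, tpr = [], []
    for t in thresholds:
        predicted_positive = scores >= t
        TP = np.sum(predicted_positive & (labels == 1))
        FP = np.sum(predicted_positive & (labels == 0))
        tpr.append(TP / P)
        fpr.append(FP / N)
    return np.array(fpr), np.array(tpr)

# Hypothetical toy data: three positives and two negatives.
scores = np.array([2.0, 0.5, 1.5, -1.0, 0.2])
labels = np.array([1, 1, 1, 0, 0])

# A very high threshold predicts all 0; a very low one predicts all 1.
fpr, tpr = roc_points(scores, labels, [10.0, 1.0, 0.0, -10.0])
```

The first threshold yields the point (0, 0) and the last yields (1, 1); the intermediate thresholds give the points in between.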
3.2 Question 2: Kernel PCA and K-Means Clustering
Fig. 5 plots the data after dimensionality reduction with kernel PCA, colored by the true labels. Fig. 6 plots the same reduced data, colored by the cluster indexes obtained by k-means. The two plots are identical except for the colors, which means that the clusters match the true classes and only the numbering of the groups differs. This indicates that kernel PCA is effective for this question: it makes data that are not linearly separable linearly separable.
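Because cluster indexes are arbitrary, agreement between the clusters and the true labels has to be measured up to a renaming of the clusters. One way to do this, sketched here with a hypothetical helper `best_label_agreement` and made-up labels (neither comes from the assignment), is to try every assignment of cluster indexes to class labels and keep the best match. The sketch assumes there are no more clusters than classes, which keeps the search small for three Iris classes.

```python
from itertools import permutations
import numpy as np

def best_label_agreement(true_labels, cluster_ids):
    """Best fraction of matching labels over all cluster-to-class mappings."""
    true_labels = np.asarray(true_labels)
    cluster_ids = np.asarray(cluster_ids)
    classes = np.unique(true_labels)
    clusters = np.unique(cluster_ids)
    best = 0.0
    # Try every way of assigning a distinct class to each cluster.
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        relabeled = np.array([mapping[c] for c in cluster_ids])
        best = max(best, float(np.mean(relabeled == true_labels)))
    return best

# Hypothetical example: same grouping, different numbering.
true_labels = [0, 0, 1, 1, 2, 2]
same_grouping = [2, 2, 0, 0, 1, 1]
one_mismatch = [2, 2, 0, 1, 1, 1]

full_agreement = best_label_agreement(true_labels, same_grouping)
partial_agreement = best_label_agreement(true_labels, one_mismatch)
```

A value of 1.0 corresponds to plots that differ only in color, as described for Fig. 5 and Fig. 6.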
4 How I Tested My Code
I ran the code for one question at a time and debugged it by inspecting the dimensions and values of the variables.
4.1 Question 1: Perceptron Rule For Machine Learning
First, I implemented the function sepbinary. I tried a vectorized version but could not get it to work, so I used a loop to update the weights and a variable to count the misclassified points. Because the labels in the dataset are {0, 1}, I introduced a variable signal that indicates whether the label is 0, for convenience of calculation.
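A loop-based perceptron of this kind can be sketched as follows. This is a Python illustration, not the assignment's sepbinary function: the name `perceptron_train`, the learning rate, the epoch limit, and the toy data are assumptions. The variable `sign` plays a role similar to the signal variable described above, mapping labels {0, 1} to {-1, +1}.

```python
import numpy as np

def perceptron_train(X, y, learning_rate=1.0, max_epochs=1000):
    """Train a perceptron with the per-sample update rule.

    X: (n, d) data matrix; y: labels in {0, 1}.
    Returns the augmented weight vector (last entry is the bias)
    and the number of misclassified points in the final epoch.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    # Append a column of ones so the bias is learned as a weight.
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xa.shape[1])
    mistakes = 0
    for _ in range(max_epochs):
        mistakes = 0  # counter of misclassified points this epoch
        for xi, yi in zip(Xa, y):
            sign = 1.0 if yi == 1 else -1.0  # map {0, 1} to {-1, +1}
            if sign * np.dot(w, xi) <= 0:    # wrong side or on the boundary
                w += learning_rate * sign * xi
                mistakes += 1
        if mistakes == 0:  # every point classified correctly
            break
    return w, mistakes

# Hypothetical linearly separable toy data.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, 0, 0])

w, mistakes = perceptron_train(X, y)
pred = (np.hstack([X, np.ones((4, 1))]) @ w > 0).astype(int)
```

On separable data the loop stops as soon as an epoch passes with no updates; on non-separable data it stops at the epoch limit, and the counter reports how many points were still misclassified in the last pass.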
4.2 Question 2: Kernel PCA and K-Means Clustering
First, I implemented the function gramgauss to compute a Gram matrix. I tried to use the built-in function pdist2, but I did not know how to pass in the parameter sigma2, so I used a loop to compute the Gram matrix instead. The remaining steps are relatively simple: spectral decomposition of the Gram matrix, followed by dimensionality reduction of the raw data.
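The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the assignment's gramgauss function: the kernel is assumed to have the form exp(-||x - y||^2 / (2 * sigma2)), which may differ from the definition used in the course, and the function names and random test data are hypothetical. The vectorized variant shows one way to avoid the double loop by computing all pairwise squared distances first and applying sigma2 afterwards.

```python
import numpy as np

def gram_gaussian_loop(X, sigma2):
    """Gaussian Gram matrix computed entry by entry."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            diff = X[i] - X[j]
            # Assumed kernel form; the course definition may differ.
            K[i, j] = np.exp(-np.dot(diff, diff) / (2.0 * sigma2))
    return K

def gram_gaussian_vectorized(X, sigma2):
    """Same matrix, using ||x||^2 + ||y||^2 - 2 x.y for the distances."""
    X = np.asarray(X, dtype=float)
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma2))

def kernel_pca_scores(K, n_components=2):
    """Project the data onto the leading kernel principal components."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kc = J @ K @ J                        # center the Gram matrix
    eigvals, eigvecs = np.linalg.eigh(Kc) # spectral decomposition
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[order], 0.0)
    return eigvecs[:, order] * np.sqrt(lam)

# Hypothetical data with four features, like the Iris measurements.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
K_loop = gram_gaussian_loop(X, sigma2=1.5)
K_vec = gram_gaussian_vectorized(X, sigma2=1.5)
Z = kernel_pca_scores(K_loop, n_components=2)
```

The rows of `Z` are the reduced coordinates that would then be passed to k-means. Comparing the loop and vectorized results on small inputs is a simple way to test either implementation.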