MiniProject 2: Classification of Image Data with Multilayer
Perceptrons and Convolutional Neural Networks
COMP 551, Winter 2023, McGill University
Contact TAs: Safa Alver
Please read this entire document before beginning the assignment.
Preamble
- This mini-project is due on March 7th at 11:59pm (EST, Montreal Time). There is a penalty of 2^k percent for k days of delay, which means your grade will be scaled to be out of 100 - 2^k. No submission will be accepted after 6 days of delay.
- This mini-project is to be completed in groups of three. All members of a group will receive the same grade except when a group member is not responding or contributing to the project. If this is the case and there are major conflicts, please reach out to the group TA for help and flag this in the submitted report. Please note that it is not expected that all team members will contribute equally. However every team member should make integral contributions to the project, be aware of the content of the submission and learn the full solution submitted.
- You will submit your assignment on MyCourses as a group. You must register your group on MyCourses and any group member can submit. See MyCourses for details.
- We recommend using Overleaf for writing your report and Google Colab for coding and running the experiments. The latter also gives access to the required computational resources. Both platforms enable remote collaboration.
- There are additional cloud compute resources available for this project. Please check MyCourses for instructions on how to access those.
- You should use Python for this mini-project. You are free to use libraries with general utilities, such as matplotlib, numpy and scipy for Python, unless stated otherwise in the description of the task. In particular, in most cases you should implement the models and evaluation functions yourself, which means you should not use pre-existing implementations of the algorithms or functions as found in scikit-learn and other packages. The description will specify this on a per-case basis.
Background
In this miniproject, you will implement a multilayer perceptron from scratch, and use it to classify image data. One of the goals is to implement a basic neural network and its training algorithm from scratch and get hands-on experience with important decisions that you have to make while training these models. You will also have a chance to experiment with convolutional neural networks and use pre-trained versions of them to perform image classification.
Figure 1: An MLP with 2 hidden layers each having 4 units.
Task 1: Acquire the data
Your first task is to acquire the image dataset. You will be using only one dataset in your experiments: CIFAR-10. Use the CIFAR-10 dataset with the default train and test partitions. You can use existing machine learning libraries to load the dataset. Note that while working with multilayer perceptrons, after loading the data, you will have to vectorize it so that it has the appropriate dimensions. Also, do not forget to normalize the training and test sets (see https://cs231n.github.io/neural-networks-2/#datapre).
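As a rough illustration of the vectorize-and-normalize step, the sketch below assumes the images are already loaded as NumPy arrays of shape (N, 32, 32, 3) with uint8 pixel values. The function name `vectorize_and_normalize`, the `eps` term, and the choice of per-feature mean and standard deviation computed on the training set (one of the options described on the cs231n page) are assumptions made for illustration, not requirements of this handout. Random arrays stand in for the real dataset so the block runs on its own.

```python
import numpy as np

def vectorize_and_normalize(x_train, x_test, eps=1e-8):
    """Flatten images to vectors and standardize with training-set statistics."""
    # (N, 32, 32, 3) -> (N, 3072), cast to float for the arithmetic below
    x_train = x_train.reshape(x_train.shape[0], -1).astype(np.float64)
    x_test = x_test.reshape(x_test.shape[0], -1).astype(np.float64)
    # Statistics come from the training set only and are reused for the test set
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0)
    x_train = (x_train - mean) / (std + eps)
    x_test = (x_test - mean) / (std + eps)
    return x_train, x_test

# Stand-in random data with CIFAR-10's image shape (not the real dataset)
rng = np.random.default_rng(0)
fake_train = rng.integers(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)
fake_test = rng.integers(0, 256, size=(20, 32, 32, 3), dtype=np.uint8)
x_tr, x_te = vectorize_and_normalize(fake_train, fake_test)
print(x_tr.shape, x_te.shape)
```

Computing the mean and standard deviation on the training split alone keeps information about the test images out of the preprocessing.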
Based on your previous miniproject, you might be asking the question: where are the features? Well, this is the whole point of using neural nets: instead of hand-designing the features, you train the model so that the feature extractor is also learned together with the classifier on top.
Task 2: Implement a Multilayer Perceptron
In this mini-project, you will implement a multilayer perceptron (MLP) to classify image data. An MLP is composed of three types of layers: (1) an input layer, (2) hidden layers, (3) an output layer (see Figure 1). You should implement it from scratch based on the code available in the slides. Your implementation should include backpropagation and the mini-batch gradient descent algorithm used for training (e.g., SGD).
You are free to implement the MLP as you see fit, but you should follow the equations that are presented in the lecture slides, and you must implement it from scratch (i.e., you cannot use TensorFlow or PyTorch or any other library). Using the NumPy package is encouraged. Regarding the implementation, we recommend the following approach:
- Implement the MLP as a Python class. The constructor for the class should take as input the activation function (e.g., ReLU), the number of hidden layers (e.g., 2) and the number of units in the hidden layers (e.g., [64, 64]) and it should initialize the weights and biases (with an initializer of your choice) as well as other important properties of the MLP.
- The class should have (at least) two functions:
  - A fit function, which takes the training data (i.e., X and y) as well as other hyperparameters (e.g., the learning rate and number of gradient descent iterations) as input. This function should train your model by modifying the model parameters.
  - A predict function, which takes a set of input points (i.e., X) as input and outputs predictions (i.e., ŷ) for these points.
- In addition to the model class, you should also define a function evaluate_acc to evaluate the model accuracy. This function should take the true labels (i.e., y) and the predicted labels (i.e., ŷ) as input, and it should output the accuracy score.
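The recommended interface could be sketched roughly as follows. This is a minimal illustration under several assumptions, not the reference solution. The activation is fixed to ReLU for brevity rather than passed to the constructor. The constructor arguments `n_in`, `n_out`, `hidden_units` and `seed`, the He-style initializer, and the mean cross-entropy loss are all choices made here. The lecture slides may use different equations or conventions, and those take precedence.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_grad(z):
    return (z > 0).astype(z.dtype)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class MLP:
    def __init__(self, n_in, n_out, hidden_units=(64, 64), seed=0):
        rng = np.random.default_rng(seed)
        sizes = [n_in, *hidden_units, n_out]
        # He-style initialization (an arbitrary choice for this sketch)
        self.W = [rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b))
                  for a, b in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros(b) for b in sizes[1:]]

    def _forward(self, X):
        acts, pre = [X], []  # layer inputs/outputs and pre-activations
        for i, (W, b) in enumerate(zip(self.W, self.b)):
            z = acts[-1] @ W + b
            pre.append(z)
            is_last = i == len(self.W) - 1
            acts.append(softmax(z) if is_last else relu(z))
        return acts, pre

    def fit(self, X, y, lr=0.1, epochs=10, batch_size=32, seed=0):
        rng = np.random.default_rng(seed)
        for _ in range(epochs):
            order = rng.permutation(len(X))  # reshuffle every epoch
            for start in range(0, len(X), batch_size):
                idx = order[start:start + batch_size]
                acts, pre = self._forward(X[idx])
                # Gradient of the mean cross-entropy w.r.t. the output logits
                delta = acts[-1].copy()
                delta[np.arange(len(idx)), y[idx]] -= 1.0
                delta /= len(idx)
                for i in reversed(range(len(self.W))):
                    grad_W = acts[i].T @ delta
                    grad_b = delta.sum(axis=0)
                    if i > 0:
                        # Propagate through W[i] before it is updated
                        delta = (delta @ self.W[i].T) * relu_grad(pre[i - 1])
                    self.W[i] -= lr * grad_W
                    self.b[i] -= lr * grad_b
        return self

    def predict(self, X):
        return self._forward(X)[0][-1].argmax(axis=1)

def evaluate_acc(y_true, y_pred):
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Toy two-class data (two Gaussian blobs), only to show the calls
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
model = MLP(n_in=2, n_out=2, hidden_units=(16, 16)).fit(X, y, lr=0.1, epochs=20)
acc = evaluate_acc(y, model.predict(X))
print(f"training accuracy on toy data: {acc:.2f}")
```

Passing `hidden_units=()` gives a model with no hidden layers, which reduces to softmax regression.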
You are also free to use any Python libraries you like to tune the hyper-parameters; see for example https://scikit-learn.org/stable/modules/grid_search.html.
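For a sense of what an exhaustive search over a small grid involves, here is a library-free sketch using only `itertools.product`. The helper `grid_search`, the grid values, and `toy_score` are hypothetical. In particular, `toy_score` is a placeholder objective so that the block runs on its own; in practice it would train a model with the given settings and return an accuracy.

```python
from itertools import product

def grid_search(score_fn, grid):
    """Evaluate score_fn on every combination in grid; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Placeholder objective (NOT a real training run): peaks at lr=0.01, batch_size=64
def toy_score(lr, batch_size):
    return -abs(lr - 0.01) - abs(batch_size - 64) / 1000

grid = {"lr": [0.1, 0.01, 0.001], "batch_size": [32, 64, 128]}
best_params, best_score = grid_search(toy_score, grid)
print(best_params)
```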
Task 3: Run the experiments and report
The goal of the experiments in this part is to have you explore the consequences of important decisions made while training neural networks. Split the dataset into training and test sets. Use the test set to estimate performance in all of the experiments after training the model with the training set. Evaluate the performance using accuracy. You are welcome to perform any experiments and analyses you see fit (e.g., the effect of data augmentation / dropout regularization / number of hidden layers /… on accuracy), but at a minimum you must complete the following experiments in the order stated below:
- First of all, create three different models: (1) an MLP with no hidden layers, i.e., it directly maps the inputs to outputs, (2) an MLP with a single hidden layer having 256 units and ReLU activations, (3) an MLP with 2 hidden layers each having 256 units with ReLU activations. It should be noted that since we want to perform classification, all of these models should have a softmax layer at the end. After training, compare the test accuracy of these three models on the CIFAR-10 dataset. Comment on how non-linearity and network depth affect the accuracy. Are the results that you obtain expected?
- Take the last model above, the one with 2 hidden layers, and create two different copies of it in which the activations are now tanh and Leaky-ReLU. After training these two models, compare their test accuracies with that of the model having ReLU activations. Comment on the performances of these models: which one is better and why? Are certain activations better than others? If the results are not as you expected, what could be the reason?
- Create an MLP with 2 hidden layers each having 256 units with ReLU activations as above. However, this time, independently add L1 and L2 regularization to the network and train the MLP in this way. How do these regularizations affect the accuracy? This proportion can be varied as a tunable hyperparameter that can be explored as part of other project requirements.
- Create an MLP with 2 hidden layers each having 256 units with ReLU activations as above. However, this time, train it with unnormalized images. How does this affect the accuracy?
- Using existing libraries such as TensorFlow or PyTorch, create a convolutional neural network (CNN) with 2 convolutional and 2 fully connected layers. Although you are free in your choice of the hyperparameters of the convolutional layers, set the number of units in the fully connected layers to be 256. Also, set the activations in all of the layers to be ReLU. Train this CNN on the CIFAR-10 dataset. Does using a CNN increase/decrease the accuracy compared to using MLPs? Provide comments on your results.
- Load a pre-trained model that you see fit (e.g., a ResNet) using existing libraries such as TensorFlow or PyTorch, and then freeze all the convolutional layers and remove all the fully connected ones. Add a number of fully connected layers of your choice right after the convolutional layers. Train only the fully connected layers of the pre-trained model on the CIFAR-10 dataset. How does this pre-trained model compare to the best MLP in part 1 and to the regular CNN in part 5 in terms of accuracy? How does it compare to the previous models in terms of the required training time? Justify, through careful experiments, your choice of how many fully connected layers you added to the pre-trained model.
- You can report your findings either in the form of a table or a plot in the write-up. However, include in your Colab notebooks the plots of the test and train performance of the MLPs / CNN / pre-trained model as a function of training epochs. This will allow you to see how much the network should be trained before it starts to overfit to the training data.
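For the activation and regularization experiments above, the pieces that change in a from-scratch implementation are the activation functions with their derivatives and the penalty terms with their gradients. The NumPy sketch below shows one common set of definitions. The leaky slope `alpha=0.01`, the factor 0.5 in the L2 penalty, and the regularization strength `lam` are conventions assumed here; the lecture slides may use different ones.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_grad(z):
    return (z > 0).astype(float)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def leaky_relu_grad(z, alpha=0.01):
    return np.where(z > 0, 1.0, alpha)

def tanh(z):
    return np.tanh(z)

def tanh_grad(z):
    return 1.0 - np.tanh(z) ** 2

def l2_penalty(W, lam):
    return 0.5 * lam * np.sum(W ** 2)

def l2_grad(W, lam):
    return lam * W

def l1_penalty(W, lam):
    return lam * np.sum(np.abs(W))

def l1_grad(W, lam):
    # Subgradient: np.sign returns 0 where W == 0
    return lam * np.sign(W)

z = np.array([-2.0, 0.0, 3.0])
W = np.array([[1.0, -2.0], [0.0, 3.0]])
print(relu(z), leaky_relu(z), tanh(z))
print(l1_penalty(W, 0.1), l2_penalty(W, 0.1))
```

During training, the penalty gradient is added to the data-loss gradient of each weight matrix before the update; biases are often left unregularized, which is also a convention rather than a requirement.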
Note 1: The above experiments are the minimum requirements that you must complete; however, this project is open-ended. For example, you might investigate the effect of the width (number of units in the hidden layers) of the MLP on its test accuracy or the effect of the CNN's convolutional layer hyperparameters (number of filters, kernel size, stride, padding, …) on its test accuracy. It is also possible to examine the effect of the usage of different pre-trained models on the final accuracy and training speed of the network. Another interesting thing to report might be training the MLP / CNN / pre-trained model with 10^k, k ∈ {0, 1, 2, 3, 4} images and plotting the test accuracy. You do not need to do all of these things or tune every parameter, but you should demonstrate creativity, rigour, and an understanding of the course material in how you run your chosen experiments and how you report on them in your write-up.
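Reading the subset sizes in Note 1 as 10^k for k in {0, 1, 2, 3, 4}, the training subsets could be drawn as in the sketch below. The helper name `training_subsets` and the use of uniform sampling without replacement are assumptions. The value 50,000 is the size of CIFAR-10's default training split.

```python
import numpy as np

def training_subsets(n_train, ks=(0, 1, 2, 3, 4), seed=0):
    """Return {k: indices} with 10**k indices sampled without replacement."""
    rng = np.random.default_rng(seed)
    return {k: rng.choice(n_train, size=10 ** k, replace=False) for k in ks}

subsets = training_subsets(50_000)
for k, idx in subsets.items():
    print(k, len(idx))
```

With only 1 or 10 images, a uniform sample cannot cover all ten classes evenly (and with a single image covers only one), which is worth keeping in mind when interpreting the low end of such a plot.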
Note 2: We expect you to provide plots/tables in your report that justify your choice of hyperparameters (the learning rates of the MLPs / CNNs / pre-trained models in parts 1-6, and the architectural parameters of the CNNs and pre-trained models in parts 5 & 6). You are not required to perform cross-validation in this project.
Deliverables
You must submit two separate files to MyCourses (using the exact filenames and file types outlined below):
- code.zip: Your model implementation, and its training and evaluation code (as some combination of .py and .ipynb files).
- writeup.pdf: Your (max 5-page) project write-up as a pdf (details below).
Write-up instructions
Your team must submit a project write-up that is a maximum of five pages (single-spaced, 11pt font or larger, minimum 0.5 inch margins; an extra page for references/bibliographical content can be used). We highly recommend that students use LaTeX to complete their write-ups. You have some flexibility in how you report your results, but you must adhere to the following structure and minimum requirements:
Abstract (100-250 words): Summarize the project task and your most important findings.
Introduction (5+ sentences): Summarize the project task, the datasets, and your most important findings. This should be similar to the abstract but more detailed. You should include background information and a few citations to relevant work (e.g., other papers analyzing these datasets).
Datasets (5+ sentences): Very briefly describe the dataset. Present the exploratory analysis you have done to understand the data, e.g. class distribution.
Results (7+ sentences, possibly with figures or tables): Describe the results of all the experiments mentioned in Task 3 (at a minimum) as well as any other interesting results you find (Note: demonstrating figures or tables would be an ideal way to report these results).
Discussion and Conclusion (5+ sentences): Summarize the key takeaways from the project and possibly directions for future investigation.
Statement of Contributions (1-3 sentences): State the breakdown of the workload across the team members.
Evaluation
The mini-project is out of 100 points, and the evaluation breakdown is as follows:
- Completeness (20 points)
  - Did you submit all the materials?
  - Did you run all the required experiments?
  - Did you follow the guidelines for the project write-up?
- Correctness (40 points)
  - Are your models implemented correctly?
  - Are your reported accuracies close to our solution?
  - Do you observe the correct trends in the experiments (e.g., how the accuracy changes as the depth of the MLP increases)?
  - Do you observe the correct impact of activation choice, regularization and normalization on the model performance?
- Writing quality (30 points)
  - Is your report clear and free of grammatical errors and typos?
  - Did you go beyond the bare minimum requirements for the write-up (e.g., by including a discussion of related work in the introduction)?
  - Do you effectively present numerical results (e.g., via tables or figures)?
- Originality / creativity (10 points)
  - Did you go beyond the bare minimum requirements for the experiments?
  - Note: Simply adding in a random new experiment will not guarantee a high grade on this section! You should be thoughtful and organized in your report.
Final remarks
You are expected to display initiative, creativity, scientific rigour, critical thinking, and good communication skills. You don't need to restrict yourself to the requirements listed above; feel free to go beyond, and explore further.
You can discuss methods and technical issues with members of other teams, but you cannot share any code or data with other teams.