
CSC401 Speech Recognition, Language Understanding, and Big Models



CSC401, 11 March 2023, St. George Campus, University of Toronto

Homework Assignment #3

Speech Recognition, Language Understanding, and Big Models

TAs: Amanjit Kainth ([email protected]) and Shuja Khalid ([email protected])

1 Introduction

This assignment introduces you to the pipelines of automatic speech recognition (ASR) and natural language understanding (NLU), and gives you hands-on experience in building and/or using these pipelines. The assignment is divided into three sections. In the first, you will experiment with hands-on speech recognition. In the second, you will implement a simplified version of a BERT sequential classifier and fine-tune this model to classify the sentiment polarity of movie reviews. In the third, you will use pre-trained large language models to classify sentiment polarity.

2 Hands-on Automatic Speech Recognition [10 marks]

The goal of this question is to get familiar with the ASR pipeline end-to-end. There are several steps to building this pipeline.

Step 1 Prepare 10 transcripts. Read out the transcripts, and record the sound files yourself; a minimal recording sketch follows the examples below. Here are some possible examples:

  • A word.
  • A sentence.
  • A sentence, but read very slowly.
  • A sentence read at normal speed, except speeding up at some words.
  • A sentence spoken very fast.
  • (Optional) A sentence containing slang or words from a second language. This is code-switching, a very challenging phenomenon for ASR.
  • A sentence with one word spelled out by alphabet.
  • A sentence where two neighboring words have blurred boundaries.
  • A sentence spoken with some stuttering, for example containing filler words (um, etc.) and repetitions.
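If you would like to capture the recordings programmatically rather than with a recording app, the following is a minimal sketch using the sounddevice and scipy packages. Both the package choice and the file name are assumptions for illustration, not requirements of this assignment:

    import sounddevice as sd
    from scipy.io import wavfile

    fs = 16000        # 16 kHz is a common sampling rate for ASR models
    seconds = 5       # adjust to the length of your sentence

    print("Recording...")
    audio = sd.rec(int(seconds * fs), samplerate=fs, channels=1)
    sd.wait()         # block until the recording is finished
    wavfile.write("sentence01.wav", fs, audio)  # hypothetical file name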

Copyright 2023, University of Toronto. All rights reserved.

Step 2 Find an ASR system of your choice. There are many systems that come with pre-trained models, for example:

  • Kaldi, the most popular system for ASR. There are many small (i.e., computationally efficient) pre-trained models with good WERs. Kaldi also provides many bash scripts (recipes) and tutorials to train and use their pre-trained models.
  • PyKaldi, a Python wrapper for Kaldi.
  • SpeechBrain, a Python package that serves as a Kaldi replacement. You don't need to play around with bash scripts (except for the pip install speechbrain). SpeechBrain also provides some ASR models through the Hugging Face model hub that can be used conveniently.
  • Online ASR transcription services such as AWS, Google, or Microsoft. Some of them have limits on how much audio you can transcribe for free; after that, users need to pay for the transcription. Note that if you choose to use online transcription services, we cannot support any associated cost.
  • Voice-to-text input methods provided by a computer system or some social messaging apps (Messenger, WhatsApp, WeChat, etc.). Note: if the ASR system of your choice does not involve coding at all, 3 marks of this question will be deducted because there is insufficient hands-on engineering experience.

Step 3 Transcribe the recorded sound files using the ASR system of your choice. Remember to delete the sound files after you finish this problem.
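For example, if you choose SpeechBrain, transcription can be as short as the sketch below. The model identifier here is one of SpeechBrain's pre-trained English LibriSpeech models on the Hugging Face hub, and the WAV file name is hypothetical; any other pre-trained ASR model works the same way:

    from speechbrain.pretrained import EncoderDecoderASR

    # Download (once) and load a pre-trained English ASR model
    asr = EncoderDecoderASR.from_hparams(
        source="speechbrain/asr-crdnn-rnnlm-librispeech",
        savedir="pretrained_models/asr-crdnn-rnnlm-librispeech",
    )
    print(asr.transcribe_file("sentence01.wav"))  # prints the hypothesis text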

Deliverables In asrPipeline.txt, briefly summarize the transcripts, your procedure for collecting the sounds, and the ASR system you used. Comment on the performance of the speech recognition: where does the system of your choice work well, and where does it not?

3 Language Understanding Pipeline [20 marks]

The goal of this section is to implement a simplified version of BERT and familiarize yourself with a common NLU pipeline. The starter code is included in /u/cs401/A3minBERT. The structure.md file contains detailed descriptions of the components of the model to be implemented.

3.1 Implementation of BERT [10 marks]

In bert.py, fill in the 4 missing places that are marked with todo:

  • bert.BertSelfAttention.attention
  • bert.BertLayer.addnorm
  • bert.BertLayer.forward
  • bert.BertModel.embed

After the implementations, make sure your code passes the test of sanity_check.py. Hint: if you are unsure about the details, please refer to the Transformer paper and Huggingface's GitHub repository. Note: for questions in this section, do NOT import the transformers library!
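As a reference point, the attention step in bert.BertSelfAttention.attention follows the standard scaled dot-product formulation from the Transformer paper. Below is a self-contained illustration of that formula only; the tensor shapes and the additive-mask convention are assumptions, and the starter code's exact signature may differ:

    import math
    import torch

    def scaled_dot_product_attention(query, key, value, attention_mask):
        # query, key, value: [batch, heads, seq_len, head_dim]
        # attention_mask: additive mask (large negative at padded positions)
        scores = torch.matmul(query, key.transpose(-1, -2))
        scores = scores / math.sqrt(query.size(-1))   # scale by sqrt(d_k)
        scores = scores + attention_mask              # mask out padding
        probs = torch.softmax(scores, dim=-1)         # attention distribution
        return torch.matmul(probs, value)             # weighted sum of values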

3.2 Implement and fine-tune a sequential classifier [10 marks]

In classifier.py, fill in the places that are marked with todo:

  • classifier.BertSentClassifier.__init__
  • classifier.BertSentClassifier.forward

Then, run the classifier.py script to fine-tune a BERT classifier on a small version of IMDB, which is provided in the data folder. This script saves a checkpoint of the best-performing model, and this checkpoint will be used in the next question. Train a model either with the BERT model frozen (i.e., only train the classifier head) or without; this can be specified by setting the option argument.
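Conceptually, such a classifier is a small head on top of BERT's pooled [CLS] representation. The sketch below shows the general shape only; the attribute names (hidden_size, num_labels), the option values, and the bert(...) return format are assumptions, so follow the actual starter code:

    import torch.nn as nn

    class SentClassifierSketch(nn.Module):
        def __init__(self, bert, hidden_size, num_labels, option="finetune"):
            super().__init__()
            self.bert = bert
            if option == "pretrain":          # freeze BERT, train head only
                for p in self.bert.parameters():
                    p.requires_grad = False
            self.dropout = nn.Dropout(0.1)
            self.classifier = nn.Linear(hidden_size, num_labels)

        def forward(self, input_ids, attention_mask):
            # Use the pooled [CLS] representation as the sentence embedding
            pooled = self.bert(input_ids, attention_mask)["pooler_output"]
            return self.classifier(self.dropout(pooled))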

Deliverables In bertReport.txt, report the option, the hyperparameters of your choice (as well as the hyperparameter tuning procedure), and other training run details (including the runtime and the computation hardware). Additionally, submit the following:

  • classifier.py
  • bert.py

4 Large Language Model [20 marks]

Questions in this section aim at inspecting the behavior of NLU models from a sentiment classification perspective. The problems in this section can be done in a notebook, LLM.ipynb.

4.1 The fine-tuned minBERT classifier [5 marks]

Instantiate a BertSentClassifier model. Load the state_dict from the checkpoint saved in the previous question. Specify a batch of 5 sentences containing:

  • A sentence that is very positive.
  • A sentence that is less positive.
  • A slightly negative sentence.
  • A very negative sentence.
  • An off-the-topic sentence, i.e., one that is not a movie review.

Pass them into the loaded model. For each sentence, print out the model's prediction and the probability of the predicted class. Comment on the model's performance: does the polarity match your expectation? Does the fine-grained polarity match your prediction? Does the model consider the off-the-topic sentence to be positive or negative?
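A minimal sketch of this loading-and-predicting loop might look as follows. The checkpoint path, the saved-dictionary keys, and the way the batch is tokenized are all assumptions; match whatever classifier.py actually saves:

    import torch
    from classifier import BertSentClassifier  # assumed import path

    saved = torch.load("best_model.pt")                # hypothetical path
    model = BertSentClassifier(saved["model_config"])  # assumed key names
    model.load_state_dict(saved["model"])
    model.eval()

    with torch.no_grad():
        logits = model(input_ids, attention_mask)      # your 5-sentence batch
        probs = logits.softmax(dim=-1)
        preds = probs.argmax(dim=-1)
    for sent, pred, prob in zip(sentences, preds, probs):
        print(sent, "->", pred.item(), "with prob", prob[pred].item())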

4.2 A general-purpose multi-task learner [10 marks]

Here we are going to repeat question 4.1, but using a general-purpose model. Use Huggingface's Transformers library to load an AutoModelForCausalLM (previously also called AutoModelWithLMHead) model. To allow comparison against the previous question, let's load the bert-base-uncased model. The CausalLM procedure predicts the probability of the next token given the sequence we pass in. This procedure is not specific to sentiment analysis, but as will be shown, it can recognize the polarity of the sentences to some reasonable extent.

For each sentence S, append This movie review is to it. Then pass the sentences into the CausalLM model. Query the probabilities of the tokens yes and no that are computed by the LM. Comment on the model's performance. Can this LM recognize the polarities of the sentences? Does changing the suffix to, e.g., This sentence is improve the out-of-domain generalization ability (i.e., the prediction correctness of the off-the-topic sentence)?
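One way to implement this query is sketched below. Note that loading bert-base-uncased as a causal LM is unusual (Transformers will warn that BERT was not pre-trained as a decoder), which is part of what makes the comparison interesting; the is_decoder flag and the last-position indexing here are assumptions to verify against your Transformers version:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    lm = AutoModelForCausalLM.from_pretrained("bert-base-uncased",
                                              is_decoder=True)
    lm.eval()

    sentence = "An instant classic; I was glued to the screen."  # example
    prompt = sentence + " This movie review is"
    # Skip special tokens so the last position is really the end of the prompt
    inputs = tok(prompt, add_special_tokens=False, return_tensors="pt")
    with torch.no_grad():
        logits = lm(**inputs).logits
    next_token_probs = logits[0, -1].softmax(dim=-1)
    for word in ("yes", "no"):
        print(word, next_token_probs[tok.convert_tokens_to_ids(word)].item())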

4.3 A large-scale multi-task learner [5 marks]

Nowadays, several large-scale language models demonstrate incredible learning abilities. Some models are openly accessible to the public (with some costs), for example:

  • Cohere, via either the playground or API.
  • GPT-3, via either the playground or API.
  • BLOOM, via the Huggingface hub.

Note: Cohere has a trial API, allowing free usage for non-commercial purposes with a rate limit. For this assignment, we do not sponsor the incurred cost. In this question, choose an LLM and use it in generation mode (i.e., let it continue writing given a prompt). Design a prompt format of your choice (e.g., S + This movie review is, or other forms) that allows the LLM to output its prediction of whether the 5 sentences you wrote are positive or not. If you use the API, include the scripts in your notebook. If you use the playgrounds, take screenshots and add them to the notebook. Describe your prompt format and the generation hyperparameters (e.g., temperature, max tokens, etc.). Comment on the LLM's performance: is it better than the smaller models in the previous sections?
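If you go the API route, the call itself is short. The sketch below assumes the Cohere Python SDK as of this assignment's writing (co.generate); method names may differ in newer SDK versions, so check the current documentation. The key and the example continuation are placeholders:

    import cohere

    co = cohere.Client("YOUR_TRIAL_API_KEY")   # placeholder key
    prompt = sentence + " This movie review is"
    response = co.generate(prompt=prompt, max_tokens=5, temperature=0.0)
    print(response.generations[0].text)        # e.g., " positive."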

Deliverables Convert your notebook, LLM.ipynb, to a PDF (LLM.pdf) and submit it.

5 Feedback [1 mark]

How long did you spend on each section of this assignment? Are any of the problems particularly interesting or challenging?

Deliverables In feedback.txt, write your comments. 1 mark will be given as long as the feedback.txt file is not empty.

6 Bonus [up to 10 marks]

We will give up to 10 bonus marks for innovative work going substantially beyond the minimal requirements. These marks can make up for marks lost in other sections of the assignment, but your overall mark for this assignment cannot exceed 100%. You may decide to pursue any number of tasks of your own design related to this assignment, although you should consult with the instructor or the TA before embarking on such exploration. Certainly, the rest of the assignment takes higher priority.

6.1 Commonsense reasoning using LLM [5 marks]

Many people claim that LLMs show human-like reasoning abilities. Given a question in a suitable format, their generations can include answers that appear to make sense. In this question, examine the commonsense reasoning ability of an LLM of your choice using the Winograd Schema Challenge. Following are some possibilities:

  • Winograd question: The trophy can't fit in the brown suitcase because it is too big. What is too big?
  • Prompt (original question): The trophy can't fit in the brown suitcase because it is too big. What is too big?
  • Prompt (question + request explanation): The trophy can't fit in the brown suitcase because it is too big. What is too big, and why?

Pick at least 3 questions from the Winograd Schema Challenge and analyze the LLM using your prompt formats. Comment on the commonsense reasoning ability of the LLM: does it give adequate answers and explanations? Is it possible to engineer some prompt formats so that the outputs make more sense?

Deliverable Include the questions you select, the prompts, the LLM's generations, and your comments in bonusCommonsense.txt.

6.2 Model attribution using SHAP [5 marks]

SHAP implements a number of methods to interpret machine learning models. In this question, choose a tutorial listed in the SHAP documentation and follow it through. Use a pre-trained model that is different from the one listed in the tutorial. Try several examples in a notebook, bonusSHAP.ipynb, and discuss the explanation results. Do the attributions make sense?
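For orientation, SHAP's text-classification tutorials typically wrap a Transformers pipeline directly, as in the sketch below. The model name here is only an illustration (and remember that the question asks you to swap in a model different from the tutorial's):

    import shap
    from transformers import pipeline

    clf = pipeline("sentiment-analysis",
                   model="distilbert-base-uncased-finetuned-sst-2-english",
                   return_all_scores=True)   # scores for every class
    explainer = shap.Explainer(clf)          # SHAP picks a text masker
    shap_values = explainer(["The plot was thin but the acting saved it."])
    # shap.plots.text(shap_values) renders token-level attributions in a notebook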

Deliverable Identify the tutorial you followed, the pre-trained model originally listed in the tutorial, and the pre-trained model you used in your notebook. Convert the Jupyter notebook into a PDF file, bonusSHAP.pdf, and include the PDF file in your submission.

6.3 Natural language explanation using LLM [5 marks]

One way to leverage the reasoning abilities of LLMs is to let them explain a phenomenon to us. We can prompt an LLM using a special question format, [phenomenon] + because, to let it generate natural language explanations. For example:

  • The sky is turning dark because
  • When releasing a ball, it falls onto the ground because
  • When searching for language model, the search engine returns ChatGPT because

The generated explanations fall into several types. The explanations might (1) highlight the true underlying causal mechanisms of the phenomenon, (2) identify a plausible mechanism that makes the phenomenon appear reasonable, or (3) describe some irrelevant information. In this question, select two LMs, a large model (e.g., ChatGPT, GPT-3, Cohere) and a smaller model (e.g., BERT, GPT-2, GPT-J), and prompt each of them with three phenomena. Report your choice of models, the phenomena, and their generated explanations. Comment on the quality of the explanations: are the explanations convincing? Do the explanations highlight the true causes? Include the relevant scripts and your discussion in bonusNLE.pdf, and include the PDF file in your submission.
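For the smaller model, the Transformers text-generation pipeline is enough. A minimal sketch, assuming GPT-2 and some arbitrary sampling settings (pick whichever small model and hyperparameters you prefer):

    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    out = generator("The sky is turning dark because",
                    max_new_tokens=30, do_sample=True, temperature=0.7)
    print(out[0]["generated_text"])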

6.4 Solving CSC401 assignment problems with LLM [5 marks]

Pick an LLM. For one of questions 6.1 to 6.3, prompt the LLM with the question, and report the answer. Does the LLM-generated answer have satisfying quality? If you are not satisfied with the quality, change the wording of the problem and try to get a higher-quality answer. How do you change the wording, and how does that change the LLM-generated answer?

Deliverable In bonusLLMProblem.pdf, include your prompts, the generated answers, and your discussions. For this question, please use appropriate methods to improve the readability of the PDF.

7 General specification

We may test your code on different training and testing data in addition to those specified above. Where possible, do not hardwire directory names into your code. As part of grading your assignment, the grader may run your programs using test-harness Python scripts. It is therefore important that your code precisely meets the specifications and formatting requirements, including arguments and file names.

If your code uses a file or helper script, it must read it either from the directory in which that code is being executed (i.e., probably pwd), or from a subdirectory of /u/cs401 whose absolute path is completely specified in the code. Do not hardwire the absolute address of files in your home directory; the grader does not have access to that directory.

All your programs must contain adequate internal documentation to be clear to the graders. External documentation is not required. This assignment is in Python 3.
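In code, this means building paths like the following sketch (the file names here are hypothetical; the point is to rely only on the working directory or on fully specified /u/cs401 paths):

    import os

    # Read a helper file relative to where the script is run (pwd)...
    local_path = os.path.join(os.getcwd(), "data", "train.csv")   # hypothetical
    # ...or from a fully specified /u/cs401 subdirectory:
    shared_path = "/u/cs401/A3minBERT/data/train.csv"             # hypothetical
    # Never do this -- the grader cannot read your home directory:
    # bad_path = "/h/u1/c1/yourname/data/train.csv"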

7.1 Submission requirements

This assignment is submitted electronically. Submit your assignment on MarkUs. You should submit the files requested in the corresponding questions, plus the ID file available from the course website. Do not tar or compress your files, and do not place your files in subdirectories.

7.2 Academic integrity

This is your last assignment of the semester. In past years, some students posted their implementations to their own GitHub repositories after the semester ended; we have requested that they change the visibility to private. Please do NOT publicize your solutions. Copying the solutions of other students violates academic integrity: your code should be the result of your honest efforts. The questions change from year to year anyway.