# AI, Ethics, and Society

Homework Project #

Readings:

• Chapter 7 of Weapons of Math Destruction (Sweating Bullets: On the Job)
• A Few Useful Things to Know about Machine Learning by Pedro Domingos: https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf

In this assignment, you'll apply AI/ML algorithms to two applications: word embeddings and facial recognition.

Task Set #1: Here you will use distributional vectors trained with Google's deep learning Word2vec system.

2. Install Gensim (for example, `pip install gensim`, or `pip install --upgrade gensim` to upgrade an existing installation).
3. Download the provided `reducedvector.bin` file from Canvas. It is a pre-trained Word2vec model based on the Google News dataset (https://code.google.com/archive/p/word2vec/). Load it as follows:
``````
from gensim.models import Word2Vec
import gensim.models
import nltk

# The path assumes reducedvector.bin is in the working directory; adjust as needed.
newmodel = gensim.models.KeyedVectors.load_word2vec_format('reducedvector.bin', binary=True)
``````
4. We can compute similarity measures between words in the model. For example:
``````
# Find the five nearest neighbors to the word "man"
newmodel.most_similar('man', topn=5)

# Compute a measure of similarity between "woman" and "man"
newmodel.similarity('woman', 'man')
``````
5. To complete analogies such as "man is to woman as king is to ??", we can use:
``````
newmodel.most_similar(positive=['king', 'woman'], negative=['man'], topn=1)
``````
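The analogy query can be pictured as vector arithmetic followed by a nearest-neighbor search. The sketch below illustrates that idea with NumPy on made-up three-dimensional vectors. `TOY_VECTORS`, `unit`, `cosine`, and `analogy` are hypothetical names invented for this illustration, and Gensim's own implementation may differ in its details.

```python
import numpy as np

# Toy 3-dimensional embeddings, invented purely for illustration.
# They are NOT taken from the Google News model.
TOY_VECTORS = {
    "man": np.array([1.0, 0.0, 0.0]),
    "woman": np.array([1.0, 1.0, 0.0]),
    "king": np.array([1.0, 0.0, 1.0]),
    "queen": np.array([1.0, 1.0, 1.0]),
    "apple": np.array([-1.0, 0.2, 0.1]),
}


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(unit(u), unit(v)))


def analogy(vectors, positive, negative, topn=1):
    """Rank words by cosine similarity to sum(positive) - sum(negative).

    The input words themselves are excluded from the candidates.
    """
    query = sum(unit(vectors[w]) for w in positive) - sum(
        unit(vectors[w]) for w in negative
    )
    exclude = set(positive) | set(negative)
    scored = [
        (word, cosine(query, vec))
        for word, vec in vectors.items()
        if word not in exclude
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


if __name__ == "__main__":
    # "man is to woman as king is to ??" on the toy vectors
    print(analogy(TOY_VECTORS, positive=["king", "woman"], negative=["man"]))
```

On the toy data, the query vector for king + woman - man lies closest to the vector for queen, which is the behavior the analogy call relies on.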

Q1: We will use the target words "man" and "woman". Use the pre-trained Word2vec model to rank the following 15 words from most similar to least similar to each target word. For each word and target word pair, provide the similarity score. Provide your results in table format.

wife, husband, child, queen, king, man, woman, birth, doctor, nurse, teacher, professor, engineer, scientist, president
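One possible way to produce the ranking for Q1 is sketched below. It assumes only that the model object exposes a Gensim-style `similarity(w1, w2)` method, as the `newmodel` object from the setup steps does. The helper names `rank_by_similarity` and `format_table` are hypothetical.

```python
WORDS = [
    "wife", "husband", "child", "queen", "king", "man", "woman", "birth",
    "doctor", "nurse", "teacher", "professor", "engineer", "scientist",
    "president",
]


def rank_by_similarity(model, target, words):
    """Return (word, score) pairs sorted from most to least similar to target.

    `model` is any object with a gensim-style similarity(w1, w2) method.
    """
    scores = [(word, float(model.similarity(target, word))) for word in words]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def format_table(target, ranked):
    """Render a ranking as a plain-text table, one row per word."""
    lines = [f"{'rank':<5}{'word':<12}similarity to '{target}'"]
    for rank, (word, score) in enumerate(ranked, start=1):
        lines.append(f"{rank:<5}{word:<12}{score:.4f}")
    return "\n".join(lines)


# Usage with the loaded model (not executed here):
# for target in ("man", "woman"):
#     print(format_table(target, rank_by_similarity(newmodel, target, WORDS)))
```

Both target words also appear in the list, so each target ranks first against itself with a similarity of about 1.0.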

Q2: The Bigger Analogy Test Set (BATS). The word analogy task has been one of the standard benchmarks for word embeddings since 2013 (https://vecto.space/projects/BATS/).

A) Select any file from the downloaded dataset (BATS_3.0.zip). For each row in your selected file, choose a target word from the row and provide the measure of similarity between your target word and the other words in the row. Remember to document the file used.

B) Think of three words that identify membership in one of the protected classes (choose only one class): race, color, religion, or national origin. For each row in your selected BATS_3.0 file, compute the similarity between your target word and each of your three words. Indicate when there are noticeable differences in the similarity scores based on membership in the protected class. Provide your results in table format.
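A rough sketch of the per-row computation for Q2 follows. It assumes each row of the chosen BATS file holds a word, then whitespace, then one or more related words separated by `/`. Check this against the file you actually pick. The names `parse_bats_line` and `row_similarities` are hypothetical, and words missing from the model's vocabulary are reported as `None` rather than scored.

```python
def parse_bats_line(line):
    """Split one row into (target, other_words), or return None if malformed.

    Assumed row layout: 'word<whitespace>answer1/answer2/...'.
    The first word is used as the target here; any word in the row would do.
    """
    parts = line.strip().split(maxsplit=1)
    if len(parts) < 2:
        return None
    target = parts[0]
    others = [w.strip() for w in parts[1].split("/") if w.strip()]
    return target, others


def row_similarities(model, target, others, class_words=()):
    """Score the target against the row's other words and the class words.

    `model` needs a gensim-style similarity(w1, w2) method that raises
    KeyError for out-of-vocabulary words.
    """
    def score(word):
        try:
            return float(model.similarity(target, word))
        except KeyError:
            return None  # word (or target) not in the vocabulary

    return {
        "row": {w: score(w) for w in others},
        "class": {w: score(w) for w in class_words},
    }


# Usage with the loaded model (file name and class words are placeholders):
# with open("chosen_bats_file.txt", encoding="utf-8") as fh:
#     for line in fh:
#         parsed = parse_bats_line(line)
#         if parsed:
#             print(parsed[0], row_similarities(newmodel, *parsed, class_words=[...]))
```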

Q3: Sentences:

1. king is to throne as judge is to ___?
2. giant is to dwarf as genius is to ___?
3. college is to dean as jail is to ___?
4. arc is to circle as line is to ___?
5. French is to France as Dutch is to ___?
6. man is to woman as king is to ___?
7. water is to ice as liquid is to ___?
8. bad is to good as sad is to ___?
9. nurse is to hospital as teacher is to ___?
10. usa is to pizza as japan is to ___?
11. human is to house as dog is to ___?
12. grass is to green as sky is to ___?
13. video is to cassette as computer is to ___?
14. universe is to planet as house is to ___?
15. poverty is to wealth as sickness is to ___?

a. Complete the above sentences with your own word analogies. Use the Word2Vec model to find the similarity score between the third word of each sentence and your answer.

Example:
``````
man is to woman as king is to queen?
newmodel.similarity('king', 'queen') -> 0.
``````
b. Use the Word2Vec model to find the word analogy and corresponding similarity score. Provide

Example:
``````
man is to woman as king is to ___?
newmodel.most_similar(positive=['king', 'woman'], negative=['man'], topn=1) -> queen, 0.
``````
c. Lastly, compute and print the correlation between the vector of similarity scores from your analogies and the vector of similarity scores from the Word2Vec-generated analogies. What is the strength of the correlation?

``````
.00-.19  very weak correlation
.20-.39  weak correlation
.40-.59  moderate correlation
.60-.79  strong correlation
.80-1.0  very strong correlation
``````
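The correlation in part c could be computed along the following lines. This is a sketch that assumes the Pearson correlation is the intended measure and that the strength scale applies to the absolute value of the coefficient. Neither point is stated in the assignment. The names `pearson` and `strength` are hypothetical.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.size < 2:
        raise ValueError("need two vectors of the same length (at least 2)")
    return float(np.corrcoef(x, y)[0, 1])


def strength(r):
    """Map a coefficient to the assignment's verbal scale.

    Assumption: the scale is applied to the absolute value of r.
    """
    magnitude = abs(r)
    if magnitude < 0.20:
        return "very weak"
    if magnitude < 0.40:
        return "weak"
    if magnitude < 0.60:
        return "moderate"
    if magnitude < 0.80:
        return "strong"
    return "very strong"


# Usage (my_scores from part a, model_scores from part b; not executed here):
# r = pearson(my_scores, model_scores)
# print(f"r = {r:.2f} ({strength(r)} correlation)")
```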

Task Set #2: For this part of the assignment, we will work with the UTK dataset (UTKface_cropped.tar.gz), available on Canvas and based on the original UTKFace dataset (https://susanqq.github.io/UTKFace/).

Q1: Each image in the dataset is labeled with age, gender, and race values according to the following legend:

• age: the age of the person in the picture, ranging from 0 to 116.
• gender: the gender of the person, either 0 (male) or 1 (female).
• race: the race of the person, ranging from 0 to 4, denoting White, Black, Asian, Indian, and Others (such as Hispanic, Latino, Middle Eastern).

• Compute and document the frequency of images associated with each subgroup for age (subdivided into 0-20, 21-40, 41-60, 61-80, and 81-116), gender (0, 1), and race (0 to 4).
• For age, which subgroup has the largest representation? Which subgroup has the least representation?
• For gender, which subgroup has the largest representation? Which subgroup has the least representation?
• For race, which subgroup has the largest representation? Which subgroup has the least representation?
• Recreate a table of the age group, gender, and race distributions of subjects based on the UTK dataset subgroups. Please see the table below as an example – inspired by the one discussed in the lecture.
• Based on what you've learned so far, if an algorithm is trained on this dataset, which group(s) will be impacted the most? Explain why.
http://biometrics.cse.msu.edu/Publications/Face/HanJain_UnconstrainedAgeGenderRaceEstimation_MSUTechReport2014.pdf
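The subgroup frequencies could be tallied roughly as follows. The sketch assumes the labels are encoded in each file name as `age_gender_race_timestamp`, which is how the original UTKFace release names its images. Verify this against the files provided on Canvas. The function names are hypothetical, and file names that do not fit the pattern are skipped and counted separately.

```python
from collections import Counter

AGE_BINS = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 116)]
GENDER = {0: "male", 1: "female"}
RACE = {0: "White", 1: "Black", 2: "Asian", 3: "Indian", 4: "Others"}


def parse_labels(filename):
    """Return (age, gender, race) from a name like '25_1_3_2017.jpg', else None.

    Assumed layout: age_gender_race_timestamp.
    """
    parts = filename.split("_")
    if len(parts) < 4:
        return None
    try:
        age, gender, race = (int(p) for p in parts[:3])
    except ValueError:
        return None
    if gender not in GENDER or race not in RACE:
        return None
    return age, gender, race


def age_group(age):
    """Map an age to its bin label, e.g. 25 -> '21-40'; None if out of range."""
    for low, high in AGE_BINS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None


def count_subgroups(filenames):
    """Count images per age bin, gender, and race; also return the skip count."""
    counts = {"age": Counter(), "gender": Counter(), "race": Counter()}
    skipped = 0
    for name in filenames:
        labels = parse_labels(name)
        if labels is None or age_group(labels[0]) is None:
            skipped += 1
            continue
        age, gender, race = labels
        counts["age"][age_group(age)] += 1
        counts["gender"][GENDER[gender]] += 1
        counts["race"][RACE[race]] += 1
    return counts, skipped


# Usage (directory name is a placeholder for wherever the archive was extracted):
# import os
# counts, skipped = count_subgroups(os.listdir("UTKFace"))
# print(counts["age"].most_common(), skipped)
```

`Counter.most_common()` lists subgroups from largest to smallest, which answers the largest and least representation questions directly.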

Turn in a report (in PDF format) documenting your outputs for each step. The report should follow the JDF format. Submitting a Jupyter notebook (.ipynb file) is optional, but a final PDF document in JDF format is required. The file name for submission is GTuserName_Assignment_4, for example, Joyner03_Assignment_4. Reports that are not neat and well organized will receive up to a 10-point deduction. All charts, graphs, and tables should be generated in Python or Excel, or any other suitable