# Data Science


Property of UNSW

### Fundamentals of Data Science

##### STATEMENT:
``````I declare that this submission is entirely my own original work.
``````
``````YOU CAN DELETE AND/OR RELOAD FILES UNTIL THE DEADLINE OR
UNTIL YOU HAVE CLICKED THE SUBMIT BUTTON.
``````

``````Start a new document clearly marked Statistics
``````

#### Statistics [19 marks]

1. [2 marks] The UEFA Euro 2020 was a 24-team soccer tournament in which Italy were crowned champions against England, while Denmark and Spain completed the top four. How many combinations for the top four teams exist when
``````i) [1 mark] taking into account the order of these top four teams?
``````
``````ii) [1 mark] not taking into account the order of these top four teams?
``````
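
The two counts asked for here correspond to ordered selections (permutations) and unordered selections (combinations) of 4 teams out of 24. A minimal sketch in Python, used for illustration only, assuming any of the 24 teams can finish in any of the top four places:

```python
import math

TEAMS = 24
TOP = 4

# Ordered top four: 24 * 23 * 22 * 21 arrangements.
ordered = math.perm(TEAMS, TOP)

# Unordered top four: divide the ordered count by the 4! orderings of each set.
unordered = math.comb(TEAMS, TOP)

print(ordered, unordered)
```

The ordered count is always `4! = 24` times the unordered one, which is a quick consistency check.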
2. [3 marks] Ben and Rhys are two friends who decide to go ten-pin bowling. Based on past experience, Ben usually gets a strike every second throw. Rhys is a little less experienced and gets four strikes every ten throws. We will focus on their first throw only.
``````i) [1 mark] What is the probability that at least one of Ben and Rhys gets
a strike?
``````
``````ii) [1 mark] What is the probability that exactly one of Ben and Rhys gets
a strike?
``````
``````iii) [1 mark] What is the probability that only Rhys gets a strike?
``````
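
The three events can be worked out from the two strike probabilities. The sketch below assumes that Ben's and Rhys's throws are independent, which the question does not state explicitly, and reads "every second throw" as probability 1/2 and "four strikes every ten throws" as 4/10.

```python
from fractions import Fraction

p_ben = Fraction(1, 2)    # a strike every second throw
p_rhys = Fraction(4, 10)  # four strikes every ten throws

# Assumption: the two first throws are independent events.
at_least_one = 1 - (1 - p_ben) * (1 - p_rhys)
exactly_one = p_ben * (1 - p_rhys) + (1 - p_ben) * p_rhys
only_rhys = (1 - p_ben) * p_rhys

print(at_least_one, exactly_one, only_rhys)
```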
3. [4 marks] A quality index summarizes different features of a product by means of a score. Different experts may assign different quality scores depending on their experience with the product. Let X be the quality index for a tablet. Suppose the respective probability density function is given as follows:
$$
f(x) =
\begin{cases}
c\,x(x - 3) & \text{if } 0 \le x \le 3 \\
0 & \text{elsewhere}
\end{cases}
$$
``````i) [1 mark] Determine c such that f(x) is a proper probability density
function.
``````
``````ii) [1 mark] Determine the cumulative distribution function.
``````
``````iii) [1 mark] Calculate the expected value of X.
``````

``````iv) [1 mark] Calculate the variance of X.
``````
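
The normalising constant, distribution function and moments of a polynomial density can be obtained by exact integration. The sketch below assumes the density reads f(x) = c·x(x − 3) on 0 ≤ x ≤ 3; the operators are garbled in the source, so this reading is an assumption. The helper `integrate_poly` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Integrate sum(coeffs[k] * x**k) exactly over [a, b]."""
    def antiderivative(x):
        x = Fraction(x)
        return sum(Fraction(ck, k + 1) * x ** (k + 1) for k, ck in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)

# Assumed shape of the density without the constant: x(x - 3) = -3x + x^2.
shape = [0, -3, 1]

# c makes the total probability over [0, 3] equal to one.
c = 1 / integrate_poly(shape, 0, 3)

def cdf(x):
    """Distribution function for 0 <= x <= 3 (0 below, 1 above)."""
    return c * integrate_poly(shape, 0, x)

# Multiplying by x shifts every coefficient up by one power.
mean_x = c * integrate_poly([0] + shape, 0, 3)
mean_x2 = c * integrate_poly([0, 0] + shape, 0, 3)
var_x = mean_x2 - mean_x ** 2

print(c, mean_x, var_x)
```

Under this reading c comes out negative, because x(x − 3) is negative on the interval while a density must be non-negative.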
4. [10 marks] Nadine wants to purchase a record by Frédéric Chopin. Her first thought is to buy it online, via an online auction. She discovers that she can also buy the record immediately, without bidding at an auction, from the same online store. She also looks at the price at an internet book store which was recommended to her by a friend. She notes down the following prices (in US\$): Internet book store: 16. Online store, no auction: 18.19, 16.98, 19.97, 16.98, 18.19, 15.99, 13.79, 15.90, 15.90, 15.90, 15.90, 15.90, 19.97, 17. Online store, auction: 10.50, 12.00, 9.54, 10.55, 11.99, 9.30, 10.59, 10.50, 10.01, 11.89, 11.03, 9.52, 15.49, 11. Assume that the price at the online store (no auction) is normally distributed with mean μ₁ and variance σ₁², and that the price at the online store (auction) is normally distributed with mean μ₂ and variance σ₂². It is strongly recommended to use R to answer the following questions.
``````i) [3 marks] Test the hypothesis that the mean price at the online store
(no auction) is not equal to US\$16.95 at the significance level α = 0.01.
Write down the null and alternative hypotheses, the test statistic and its
distribution under the null hypothesis, the observed test statistic, the
critical value and the conclusion.
``````
``````ii) [1 mark] Calculate a two-sided 99% confidence interval for the mean
price at the online store (no auction) and interpret your findings in the
light of the hypothesis in (i).
``````
``````iii) [2 marks] Calculate a two-sided 99% confidence interval for the ratio of
the price variances σ₁²/σ₂².
``````
``````iv) [4 marks] Assuming that the variances are equal, test the hypothesis
that the mean non-auction price is higher than the mean auction price,
at the significance level α = 0.01. Write down the null and alternative
hypotheses, the test statistic and its distribution under the null hypothesis,
the observed test statistic, the critical value and the conclusion. Provide the
p-value of the test.
``````
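
The test statistics behind parts (i), (iii) and (iv) can be sketched with the standard library alone. The paper recommends R; Python is used here purely for illustration. The function names and the toy samples are made up for this sketch and are not the prices from the question. Critical values and p-values would still have to come from a t or F table or from statistical software.

```python
from math import sqrt
from statistics import mean, variance

def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    t = (mean(sample) - mu0) / sqrt(variance(sample) / n)
    return t, n - 1

def variance_ratio(x, y):
    """Ratio of sample variances; F-distributed with (n1 - 1, n2 - 1) df
    when both samples are normal and the true variances are equal."""
    return variance(x) / variance(y)

def pooled_two_sample_t(x, y):
    """t statistic and degrees of freedom assuming equal variances."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical toy samples, chosen only so the arithmetic is easy to follow.
toy_a = [1, 2, 3, 4, 5]
toy_b = [1, 2, 3]
toy_c = [2, 3, 4]

t_one, df_one = one_sample_t(toy_a, mu0=3)
t_two, df_two = pooled_two_sample_t(toy_b, toy_c)
f_ratio = variance_ratio(toy_b, toy_c)
```

For a two-sided test the observed statistic is compared with the upper α/2 quantile of the reference distribution; for the one-sided test in (iv) the upper α quantile is used.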

``````Start a new document clearly marked Computer Science
``````

#### Computer Science [18 marks]

1. [6 marks] Consider the following students database:
``````ID       Name   Sex     CourseID  CourseName  Score  Grade
7392489  Tom    male    1015      math        80     D
7392489  Tom    male    2029      physics     60     P
7398743  Jack   male    1015      math        85     HD
7398743  Jack   male    2029      physics     90     HD
7398743  Jack   male    3008      chemistry   75     D
7393012  Rose   female  1015      math        90     HD
7393012  Rose   female  2029      physics     75     D
7393012  Rose   female  3008      chemistry   85     HD
7397654  Alice  female  1015      math        85     HD
7397654  Alice  female  3008      chemistry   85     HD
``````
``````For each of the following queries, write a single SQL statement and give
the output of the query:
``````
``````i) [1 mark] List the average score of students for maths and physics. Each
tuple of the result should contain the course name and average score.
ii) [2 marks] List all students where the student score is higher than the
average score of all students. Each tuple should include student name,
course name and score.
iii) [3 marks] List the female student with the highest score.
``````
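
One way to try out candidate SQL statements is to load the table into an in-memory SQLite database. The sketch below uses Python's `sqlite3` module; the table name `students` and the column types are assumptions, since the question shows only the column headers. Part (ii) is read here as "score above the average of all rows", which is one possible interpretation.

```python
import sqlite3

rows = [
    (7392489, "Tom", "male", 1015, "math", 80, "D"),
    (7392489, "Tom", "male", 2029, "physics", 60, "P"),
    (7398743, "Jack", "male", 1015, "math", 85, "HD"),
    (7398743, "Jack", "male", 2029, "physics", 90, "HD"),
    (7398743, "Jack", "male", 3008, "chemistry", 75, "D"),
    (7393012, "Rose", "female", 1015, "math", 90, "HD"),
    (7393012, "Rose", "female", 2029, "physics", 75, "D"),
    (7393012, "Rose", "female", 3008, "chemistry", 85, "HD"),
    (7397654, "Alice", "female", 1015, "math", 85, "HD"),
    (7397654, "Alice", "female", 3008, "chemistry", 85, "HD"),
]

conn = sqlite3.connect(":memory:")
# Assumed schema: table name and types are not given in the question.
conn.execute(
    "CREATE TABLE students (ID INTEGER, Name TEXT, Sex TEXT, "
    "CourseID INTEGER, CourseName TEXT, Score INTEGER, Grade TEXT)"
)
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# (i) Average score per course, restricted to math and physics.
avg_by_course = conn.execute(
    "SELECT CourseName, AVG(Score) FROM students "
    "WHERE CourseName IN ('math', 'physics') "
    "GROUP BY CourseName ORDER BY CourseName"
).fetchall()

# (ii) Rows whose score exceeds the average over all rows.
above_average = conn.execute(
    "SELECT Name, CourseName, Score FROM students "
    "WHERE Score > (SELECT AVG(Score) FROM students) "
    "ORDER BY ID, CourseID"
).fetchall()

# (iii) Female student(s) holding the highest score among female students.
top_female = conn.execute(
    "SELECT Name, CourseName, Score FROM students "
    "WHERE Sex = 'female' "
    "AND Score = (SELECT MAX(Score) FROM students WHERE Sex = 'female')"
).fetchall()

print(avg_by_course)
print(above_average)
print(top_female)
```

Using a subquery for the maximum in (iii), rather than `ORDER BY ... LIMIT 1`, keeps every student who shares the top score if there is a tie.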
2. [3 marks] Calculate the weights w0, w1 and w2 of a perceptron that correctly classifies the following data. In your calculation, use the perceptron equation given in the lecture, a(x) = xw.

   | Training example | x1 | x2 | class |
   |------------------|----|----|-------|
   | a                | 0  | 1  | 0     |
   | b                | 2  | 0  | 0     |
   | c                | 2  | 2  | 1     |

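
A set of weights for the three training examples can be found with the classic perceptron learning rule. The sketch below is not necessarily the equation from the lecture: it assumes a bias input fixed at 1 (paired with w0), a step activation that outputs 1 when the weighted sum is strictly positive, zero initial weights and a learning rate of 1.

```python
def predict(weights, x1, x2):
    """Step activation on w0 * 1 + w1 * x1 + w2 * x2 (threshold assumed at 0)."""
    w0, w1, w2 = weights
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0

def train_perceptron(examples, learning_rate=1, max_epochs=100):
    """Repeat passes over the data until an epoch produces no mistakes."""
    weights = [0, 0, 0]
    for _ in range(max_epochs):
        mistakes = 0
        for x1, x2, target in examples:
            error = target - predict(weights, x1, x2)
            if error != 0:
                mistakes += 1
                # Move each weight in the direction that reduces the error.
                for i, xi in enumerate((1, x1, x2)):
                    weights[i] += learning_rate * error * xi
        if mistakes == 0:
            break
    return weights

# Training examples a, b and c as (x1, x2, class).
examples = [(0, 1, 0), (2, 0, 0), (2, 2, 1)]
weights = train_perceptron(examples)
print(weights)
```

Because the three points are linearly separable, the rule is guaranteed to stop; other weight vectors that separate the classes are equally valid answers.
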
3. [2 marks] The following code was used to break up a dataset with 10000 instances into training (60%), validation (20%) and testing (20%) sets, which are acceptable proportions in machine learning.
``````Xtrain = X[1:7000]
Xvalidation = X[6001:8000]
Xtest = X[8001:10000]
``````
``````i) [0.5 mark] Identify at least two problems with this split.
ii) [1 mark] How do these problems affect the learned model?
iii) [0.5 mark] Write the correct version of the code.
``````
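
A 60/20/20 split needs three index ranges that do not overlap and together cover every instance. The sketch below is one possible way to build such a split in Python (0-based indexing). It assumes the instances are independent, so shuffling before splitting is acceptable; for time-ordered data a different scheme would be needed. The function name `split_indices` and the seed are illustrative choices.

```python
import random

def split_indices(n, train_frac=0.6, val_frac=0.2, seed=0):
    """Return disjoint train/validation/test index lists covering range(n)."""
    indices = list(range(n))
    # Shuffle so that any ordering in the original data does not leak into the split.
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(10000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The index lists can then be used to select rows, for example `[X[i] for i in train_idx]` for a list-like `X`.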
4. [5 marks] A two-class model was trained and then tested with a dataset of 100 instances. The test set contained 60 instances in negative class N, and 40 instances in positive class P. As a result of testing, the following counts were obtained:
``````50 instances of N were classified correctly,
10 instances of N were classified into P,
10 instances of P were classified correctly,
30 instances of P were classified into N.
``````
``````i) [1 mark] Construct the contingency table (also called a confusion matrix).
ii) [1 mark] Calculate the following micro metrics: precision, recall, F1.
iii) [1 mark] Calculate the following macro metrics: precision, recall, F1.
iv) [2 marks] Compare the micro metrics with the macro metrics and explain why
they differ.
``````
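
Micro and macro averages differ only in when the averaging happens: micro pools the counts of all classes first, while macro computes each metric per class and then takes the unweighted mean. The sketch below applies both to the counts given in the question. It assumes each class is treated in turn as the "positive" class and that macro F1 is the mean of the per-class F1 scores, which is one common convention.

```python
from fractions import Fraction

# Per-class counts derived from the question:
# tp = correctly classified, fp = other class predicted as this class,
# fn = this class predicted as the other class.
counts = {
    "P": {"tp": 10, "fp": 10, "fn": 30},
    "N": {"tp": 50, "fp": 30, "fn": 10},
}

def prf(tp, fp, fn):
    """Return (precision, recall, F1) as exact fractions."""
    precision = Fraction(tp, tp + fp)
    recall = Fraction(tp, tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

per_class = {label: prf(**c) for label, c in counts.items()}

# Macro: average the per-class metrics with equal weight per class.
macro = tuple(
    sum(metrics[i] for metrics in per_class.values()) / len(per_class)
    for i in range(3)
)

# Micro: pool the counts over classes, then compute the metrics once.
micro = prf(
    sum(c["tp"] for c in counts.values()),
    sum(c["fp"] for c in counts.values()),
    sum(c["fn"] for c in counts.values()),
)

print([float(m) for m in micro])
print([float(m) for m in macro])
```

With single-label data, every false positive for one class is a false negative for the other, so the three micro metrics coincide and equal the overall accuracy.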
5. [2 marks] In the process of training and validating a model, the following diagram was produced.

``````If model training stopped at the point marked with the red arrow, would
the model underfit, overfit or be an optimal fit? Briefly justify your answer.
``````