R Assignment Help | Lab Help | Statistics Help - Lab 1 R basics


Lab 1 R basics (50 points)

updated: 2020-08-20

Delivery

Please put your answers in a Word document and submit it on Compass. Insert screenshots into the Word document where a question requires them.

Load the dataset

Note: data downloaded from here

library(tidyverse)
adult_income <- read_csv("data/adult.csv")

Question 1. (5 points)

In the adult_income dataset, how many variables are of character type?
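One possible way to check column types is sketched below on a small made-up data frame. The name toy and its values are hypothetical and only stand in for adult_income; the same idea should carry over to the real data.

```r
# Toy data frame standing in for adult_income (hypothetical values).
toy <- data.frame(
  age = c(25, 38, 52),
  workclass = c("Private", "?", "State-gov"),
  education = c("Bachelors", "HS-grad", "Masters"),
  stringsAsFactors = FALSE
)

# sapply() applies is.character() to each column and returns a logical vector.
is_chr <- sapply(toy, is.character)

# Summing a logical vector counts the TRUE values.
n_chr <- sum(is_chr)
print(n_chr)
```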

Question 2. (5 points)

Let's create a simple bar plot to display how many records we have for each education group (use the column called education). However, the factor levels are not ordered properly. Please reorder education (Preschool < 1st-4th < 5th-6th < 7th-8th < 9th < 10th < 11th < 12th < HS-grad < Prof-school < Assoc-acdm < Assoc-voc < Some-college < Bachelors < Masters < Doctorate) and generate a new bar plot. Export your bar plot and insert it into the Word document for submission.

barplot(table(adult_income$education), main = "Barplot created by Fang")

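[Figure: bar plot titled "Barplot created by Fang". The x-axis shows the education groups in their default order (10th, 12th, 5th-6th, 9th, Bachelors, Masters, ...); the y-axis shows record counts from 0 to 10000.]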
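The reordering idea can be sketched on a toy vector as follows. The sample values and the shortened level list are assumptions made for illustration; for the assignment, the levels argument would need the full ordering given in the question.

```r
# Toy education values (hypothetical sample, not the real data).
edu <- c("Bachelors", "HS-grad", "9th", "HS-grad", "Masters", "9th", "HS-grad")

# Levels written in the desired order (shortened list for this toy example).
edu_levels <- c("9th", "HS-grad", "Bachelors", "Masters")

# factor() with an explicit levels argument fixes the display order.
edu_f <- factor(edu, levels = edu_levels)

# table() reports counts in the order of the factor levels.
counts <- table(edu_f)
print(counts)

# Uncomment to draw the plot; the title here is a placeholder.
# barplot(counts, main = "Toy education counts")
```

Without the levels argument, factor() sorts the levels alphabetically, which is why "10th" appears before "9th" in the default plot.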

Question 3 (5 points)

In the column called workclass, some missing values are labeled as ?. Rename ? to missing, and provide a summary table for this column using the table function.

library(tidyverse)
table(adult_income$workclass)

##                ?      Federal-gov        Local-gov     Never-worked
##             1836              960             2093                7
##          Private     Self-emp-inc Self-emp-not-inc        State-gov
##            22696             1116             2541             1298
##      Without-pay
##               14
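A minimal sketch of one way to relabel a value, shown on a made-up character vector (the values are hypothetical). It uses base R logical indexing; other approaches, such as dplyr's recode helpers, would work as well.

```r
# Toy workclass values (hypothetical sample).
workclass <- c("Private", "?", "State-gov", "?", "Private")

# Logical indexing selects the "?" entries and overwrites them in place.
workclass[workclass == "?"] <- "missing"

# Summary table after relabeling.
wc_table <- table(workclass)
print(wc_table)
```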

Question 4. (5 points)

How many people are under 20? You need to provide a screenshot of your script with the answer for delivery. Hint: you can use the sum function here.
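The hint works because R treats TRUE as 1 and FALSE as 0 when summing. A small sketch on made-up ages (the vector below is hypothetical):

```r
# Toy ages (hypothetical values).
age <- c(17, 19, 20, 35, 18)

# age < 20 is a logical vector; sum() counts how many entries are TRUE.
n_under_20 <- sum(age < 20)
print(n_under_20)
```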

Question 5. (10 points)

Among the 7 marital status groups (marital.status column), which group works the most hours per week on average (based on the hours.per.week column)? You need to provide a screenshot of your script. You also need to state the average working hours per week for this group.

Hint: for the average, use the mean function.
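A group-wise mean can be sketched with base R's tapply, as below. The data frame toy and its values are hypothetical; with the tidyverse loaded, group_by followed by summarise would be an equally reasonable route.

```r
# Toy data with the same column names as the assignment (hypothetical values).
toy <- data.frame(
  marital.status = c("Divorced", "Never-married", "Divorced", "Widowed", "Never-married"),
  hours.per.week = c(40, 30, 50, 20, 40)
)

# tapply() applies mean() to hours.per.week within each marital.status group.
avg_hours <- tapply(toy$hours.per.week, toy$marital.status, mean)
print(avg_hours)

# which.max() finds the position of the largest mean; names() gives its group label.
top_group <- names(which.max(avg_hours))
print(top_group)
```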

Question 6 (10 points)

Load the dataset called austin_weather.csv from Compass. This dataset includes historical daily temperature, precipitation, humidity, and wind speed for Austin, Texas from 2013 to 2017. For more details about the dataset, click here.

We are interested in calculating the average wind speed (use the column called WindAvgMPH) when TempAvgF is above 75. Check out the script below. Why does the script not work? How can you fix it? What is the average wind speed when the temperature is above 75?

library(tidyverse)
austin <- read_csv("data/austin_weather.csv")
austin_above75 <- austin$TempAvgF > 75  # To extract a subset which has average temperature above 75.
mean(austin_above75$WindAvgMPH)  # Calculate the average WindAvgMPH for the subset.
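When debugging a script like this, it may help to compare what a comparison returns with what a row subset returns. The sketch below uses a made-up data frame (austin_toy and its values are hypothetical) and is only one way to look at the difference.

```r
# Toy data with the same column names as the assignment (hypothetical values).
austin_toy <- data.frame(
  TempAvgF = c(60, 80, 90, 70),
  WindAvgMPH = c(4, 6, 8, 10)
)

# A comparison returns a logical vector, not a data frame.
is_warm <- austin_toy$TempAvgF > 75
print(class(is_warm))

# Using the logical vector as a row index keeps the whole rows that are TRUE.
warm_days <- austin_toy[is_warm, ]
print(class(warm_days))

# Columns can be extracted with $ from the data frame subset.
warm_wind_mean <- mean(warm_days$WindAvgMPH)
print(warm_wind_mean)
```

Note that mean() returns NA when its input contains NA values unless na.rm = TRUE is set, so it is worth checking the real column for missing values.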

Question 7 (10 points)

Examine the Austin weather dataset. Since the records for 2013 and 2017 are incomplete, you can focus on the data collected in 2014-2016.

According to the National Weather Service heat index chart, air temperature above 80F is labeled as caution. For simplicity, let's ignore humidity for this question.

During these 3 years (2014-2016), which year has the most weekdays (M-F) with a temperature above 80F? Hint: you can use the column called TempHighF to answer the question. Use count or table to summarise your data.

For more details of date/time in R, click here.

You need to provide your answers (which year and how many days), together with a screenshot of your script, for submission.
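As a rough sketch of the date handling involved, the example below counts hot weekdays per year on a made-up data frame. The name toy, the column name Date, and all values are assumptions; the real file may name or store its date column differently. The weekday is taken from as.POSIXlt()$wday, which numbers days from 0 (Sunday) to 6 (Saturday) and does not depend on the system language.

```r
# Toy data (hypothetical values); Date is assumed to be of class Date.
toy <- data.frame(
  Date = as.Date(c("2014-07-04", "2014-07-05", "2015-07-06", "2015-07-07", "2016-01-04")),
  TempHighF = c(95, 96, 90, 92, 55)
)

# Extract the year as text, e.g. "2014".
year <- format(toy$Date, "%Y")

# wday: 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
wday <- as.POSIXlt(toy$Date)$wday
is_weekday <- wday %in% 1:5

# Keep rows that are both weekdays and hotter than 80F.
keep <- is_weekday & toy$TempHighF > 80

# Count the remaining days per year.
hot_by_year <- table(year[keep])
print(hot_by_year)
```

For the assignment itself, the rows would also need to be limited to 2014-2016 before counting.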
