R Bootcamp Exam
Instructions: Create an R script file with sensible, code-based answers for the following questions, making sure to show your work with diligent comments throughout. Please submit both your R script and a compiled report in HTML format as part of your final submission to Blackboard.
To generate a report, with your R script open in RStudio, go to File -> Compile Report… and choose HTML as your output format; the compiled HTML file will appear in the same location as your R script file when the process finishes.
Before generating your report, be sure to: 1) double-check that your script is error-free; 2) comment out all install.packages commands and/or help calls (e.g., ?mean); 3) remove unnecessarily long printouts to the console (e.g., use head to show a long data.frame rather than printing it in its entirety).
Be sure to include your name, the date, the name of the file, and the assignment name/number in comments at the top of your R script file; please also name your submission files using the following convention:
7040_R_Bootcamp_Exam_Firstname_Lastname.R
7040_R_Bootcamp_Exam_Firstname_Lastname.html
(e.g., I would submit 7040_R_Bootcamp_Exam_David_Dobolyi.R and 7040_R_Bootcamp_Exam_David_Dobolyi.html, respectively)
Note that you may need to zip up your HTML report in order to upload it to Blackboard.
Data: A zip file named 7040_R_Bootcamp_Exam_Data.zip is provided with this assignment. Unzip it to find the data files you will need to answer several of the questions below.
- The file “App.txt” contains 4 columns of data without column names. Use read.table to read these data into an R object called appDat, providing the following names for the columns: AppName, Version, TimesOpened, Platform (HINT: remember to use the help function [e.g., ?read.table] and be sure to look at the examples; see the sep and col.names arguments in particular).
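As a rough illustration of the sep and col.names arguments mentioned in the hint, here is a minimal sketch. It does not use the real App.txt: the three rows are invented, the object name demoDat is hypothetical, and a comma delimiter is assumed, so check the actual file to see which separator it uses.

```r
# Write a tiny stand-in file so the sketch is self-contained.
# The rows below are invented for illustration, not taken from App.txt.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Maps,2.1,15,iOS",
             "Notes,1.0,42,Android",
             "Chess,3.4,7,iOS"), tmp)

# sep tells read.table which delimiter splits the fields (a comma is assumed
# here); col.names supplies headers because the file has none.
demoDat <- read.table(tmp,
                      sep = ",",
                      header = FALSE,
                      col.names = c("AppName", "Version", "TimesOpened", "Platform"))

head(demoDat)
str(demoDat)
```

Calling str afterwards is a quick way to confirm that each column was read with the type you expect.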
- Use a function in either the foreign or haven package to import the SPSS data file “empDat.sav” as a data frame, storing it into an object called empDat. Show the head of the data, the data structure, and a summary of the data (HINT: see ?summary).
- Use the read_csv function in the readr package to read in the adoption data file “AdoptData.csv” and store it into an object called adoptDat.
- Using the adoption data, show a frequency table of the unique values of Verbal IQ (i.e., column “VIQ”); for example, your table should show that a VIQ of 98 occurs 4 times, 99 occurs 2 times, 142 occurs once, etc. (HINT: for help with this, you might try a Google search on “count of unique values r” and look for Stack Overflow results, e.g., https://tinyurl.com/gcom7040).
- Create a new frequency table of Verbal IQ rounded to the nearest 10 (e.g., 12 becomes 10). NOTE: this one may be a little tricky; some suggestions for solving this are: (A) using a bit of algebra; (B) getting help through Google on how to “round to nearest 10 r”; or (C) looking for a rounding function in the plyr package that serves this exact purpose.
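The two frequency-table questions above can be sketched on a small invented vector. The scores below are made up for illustration and are not the adoption data; the names scores, nearest10, and nearest10_alt are hypothetical.

```r
# Invented scores for illustration only.
scores <- c(98, 98, 99, 104, 112, 112, 127)

# table() counts how often each distinct value occurs.
table(scores)

# Option A: a bit of algebra. Divide by 10, round, then multiply back.
nearest10 <- round(scores / 10) * 10

# Option B: round() with a negative digits argument rounds to the tens place.
nearest10_alt <- round(scores, digits = -1)

# Frequency table of the rounded values.
table(nearest10)
```

One caveat worth checking on real data: R rounds exact halves to the even digit (round(2.5) returns 2), so values ending in 5 may not round up the way you expect with Option A.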
- What is the birthweight (column “Bwt” [in grams]) of the 14th child in the adoption dataset? Use at least two different approaches to answer this question (e.g., brackets specifying rows/columns is one way, but there are many other options we’ve covered that would also work).
- Show how many children in the dataset have birthweights under 3000 grams. HINT/NOTE: given the question states how many, your answer should explicitly show the correct value, i.e., 11.
- Of those children that have birthweights over 3000 grams, show how many have VIQs over 115.
- What is the 37th percentile of birthweight? HINT: you’ll need to use the quantile function to answer this question.
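The indexing, counting, and percentile questions above rely on a few base R idioms, sketched here on a toy data frame. The values are invented and only the column names mirror the adoption data, so the results will not match the exam answers.

```r
# Toy data with invented values; column names mirror the adoption data.
toy <- data.frame(Bwt = c(2800, 3200, 3500, 2950, 4100, 3300),
                  VIQ = c(101, 120, 98, 130, 117, 110))

# Two ways to pull one value: row/column brackets, or $ plus a position.
third_a <- toy[3, "Bwt"]
third_b <- toy$Bwt[3]

# A logical comparison yields TRUE/FALSE values; sum() counts the TRUEs.
n_light <- sum(toy$Bwt < 3000)

# Combine conditions with & to count rows that satisfy both.
n_heavy_high <- sum(toy$Bwt > 3000 & toy$VIQ > 115)

# quantile() takes probabilities between 0 and 1, so 37% is 0.37.
q37 <- quantile(toy$Bwt, probs = 0.37)
```

Printing n_light or n_heavy_high shows the count explicitly, which is what a "how many" question asks for.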
- Create a histogram of birthweight using the hist function.
- Look up some of the arguments/options for hist and apply at least 3 of them to your histogram from the previous question.
- Plot birthweight as a density and color the line red. HINT: use the density function to answer this question (see the examples).
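For the histogram and density questions above, a minimal sketch with base graphics might look like the following. The built-in cars data set stands in for birthweight, and the particular arguments chosen (breaks, main, xlab, col) are just one possible selection.

```r
# 'cars' ships with base R; its dist column stands in for birthweight here.
x <- cars$dist

# breaks, main, xlab, and col are a few of the arguments hist() accepts.
h <- hist(x,
          breaks = 10,
          main = "Stopping distance",
          xlab = "Distance (ft)",
          col = "grey80")

# density() returns a kernel density estimate that plot() can draw directly;
# col sets the colour of the line.
d <- density(x)
plot(d, col = "red", main = "Density of stopping distance")
```

See ?hist and ?density for the full argument lists.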
- Find a function in the measurements package to convert birthweight from grams to pounds and store this in a new vector called birthweightPounds. HINT: see the package documentation and/or examples for the measurements package on CRAN (or pull up the documentation via the help function in R).
- Plot the relationship between birthweightPounds and VIQ using the plot function. Would the relationship change if you’d used birthweight in grams instead of pounds? Plot this too to verify and explain your answer. What is the name of this type of transformation that preserves relationships among variables? HINT: you might Google “type of transformation that preserves relationships among variables” to answer the last part of this question.
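One way to check whether a change of units alters a relationship is to compare correlations before and after converting. The sketch below uses invented values and converts by hand (dividing by 453.59237, the number of grams in a pound) rather than with the measurements package, so it runs without extra installs; the variable names are hypothetical.

```r
# Invented values for illustration.
grams <- c(2800, 3200, 3500, 2950, 4100, 3300)
viq   <- c(101, 120, 98, 130, 117, 110)

# One pound is defined as 453.59237 grams, so the conversion is a division.
pounds <- grams / 453.59237

# Rescaling by a positive constant changes the axis units, not the pattern.
r_grams  <- cor(grams, viq)
r_pounds <- cor(pounds, viq)
all.equal(r_grams, r_pounds)

# The two scatterplots differ only in the labelling of the x-axis.
plot(pounds, viq, xlab = "Birthweight (lb)", ylab = "VIQ")
plot(grams,  viq, xlab = "Birthweight (g)",  ylab = "VIQ")
```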
- Create a new vector called count containing the consecutive numbers from 101 to 200.
- Find the sum of the numbers within count.
- Using the count vector, display those numbers within count that are odd. Avoid hard-coding your answer (i.e., make sure that your code would work even if the values in count were changed; to test this, try your code with another sequence of numbers such as 10 to 150). HINT: one way to do this involves using the modulus operator.
- Write an expression (i.e., R code) to show how many of the numbers within count are evenly divisible by 13. HINT: your code should return an answer of 8.
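The sequence questions above can be tried out on the alternative range the text suggests (10 to 150). This is a sketch with a hypothetical vector name, nums, and the same expressions should carry over to any other sequence without changes.

```r
# The alternative sequence suggested for testing.
nums <- 10:150

# Sum of all values in the vector.
total <- sum(nums)

# %% is the modulus operator: the remainder after division.
# A remainder of 1 when dividing by 2 marks an odd number.
odds <- nums[nums %% 2 == 1]

# A remainder of 0 means evenly divisible; sum() counts the TRUEs.
n_div13 <- sum(nums %% 13 == 0)
```

Because nothing here depends on the specific endpoints, none of the answers is hard-coded.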
- Use the data function to load the Seatbelts data set that comes with R. Show the mean and median of each column in the data set. HINT: there are more efficient ways of answering this question than calling mean and median on each column in the data set one at a time (e.g., see the R Bootcamp for examples).
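As a sketch of the "more efficient" approach hinted at above, the apply family can compute several statistics for every column in one call. The built-in mtcars data frame is used here as a stand-in; the object name stats is hypothetical.

```r
# mtcars is a built-in data frame used here as a stand-in.
data(mtcars)

# sapply() applies the function to every column and binds the results,
# giving one row per statistic and one column per variable.
stats <- sapply(mtcars, function(col) c(mean = mean(col), median = median(col)))
round(stats, 2)

# For a matrix-like object, apply() over columns (MARGIN = 2) does the same job.
m <- as.matrix(mtcars)
apply(m, 2, median)
```

Which of the two fits best depends on whether the data set is stored as a data frame or as a matrix-like object, which str or class will tell you.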
- Using the plot function and the faithful data set that comes with R, plot waiting time on the x-axis and the duration of eruptions on the y-axis. Give the figure meaningful axis labels and a clear title. Finally, describe the relationship between waiting and eruption duration in practical terms.
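A labelled scatterplot in base graphics follows the pattern below. The sketch uses the built-in cars data set rather than faithful, and the label text is only an example.

```r
# Scatterplot with explicit axis labels and a title.
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)",
     main = "Stopping distance by speed")

# A correlation coefficient can support a verbal description of the trend.
r <- cor(cars$speed, cars$dist)
r
```

A positive r indicates that larger x values tend to go with larger y values, which is the kind of statement a practical description can build on.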
- Using the ggplot2 package, create a nicer looking version of your faithful plot from the previous question. Be sure to:
a) Color the points on the plot blue and make them larger than the default size.
b) Add a regression line to your plot and color it red. HINT: see the method argument within the geom_smooth function for an easy way to do this (or search Google for examples).
c) Add your title and axis labels from the previous question to your plot.
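The three ggplot2 requirements above map onto three layers, sketched here on the built-in cars data set rather than faithful. The sketch assumes ggplot2 is installed; the point size of 3 and the label text are arbitrary choices.

```r
library(ggplot2)

p <- ggplot(cars, aes(x = speed, y = dist)) +
  # a) blue points, larger than the default size
  geom_point(color = "blue", size = 3) +
  # b) method = "lm" fits a straight regression line; color sets the line colour
  geom_smooth(method = "lm", color = "red") +
  # c) title and axis labels
  labs(title = "Stopping distance by speed",
       x = "Speed (mph)",
       y = "Stopping distance (ft)")

# Printing the object draws the plot.
print(p)
```

By default geom_smooth also draws a shaded confidence band around the line; setting se = FALSE removes it.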