ISE-529 Predictive Analytics

Mid-Term Examination July 25, 2022

Instructions
  • You are to complete the exam by typing your answers into this PowerPoint as indicated.
  • You will have 90 minutes to complete the exam and submit it to GradeScope (in the same manner as done for homework assignments). Late submissions will be penalized.
  • The exam is open-book / open-notes. You may consult any resource except another person.
  • Good luck!
For this problem we will be working with the following dataset:
Linear Model Analysis

First, we create three models using X1, X2, and the combination of X1 & X2 to predict Y:

1A) For the two simple (single-predictor) models, are the predictors X1 and X2 significant?

  • XXX

1B) For the multiple regression model, which predictors are significant?

  • XXX

1C) How do you interpret what is going on here?

  • XXX
Now we incorporate the categorical variable into the model by creating a dummy variable Blue and adding it to the model as shown:

1D) Does adding this categorical variable to the model improve its overall performance? Why or why not?

  • XXX

1E) Looking at this color-coded scatterplot of X1 vs Y, do you see any indication of an interaction effect between X1 and X3? Why or why not?
  • XXX

1E) Looking at these model results, do you see any indication of an interaction effect between X1 and X3? Why or why not?

  • XXX

After completing your modeling analysis, you decide to use the model shown below:

1F) Write out the algebraic expression for this model (you do not need to include the error term):
  • xxx
1G) Write out the simplified algebraic expression for this model for the Blue observations:

  • xxx
1H) Write out the simplified algebraic expression for this model for the Red observations:

  • xxx
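The simplification asked for in 1G and 1H can be illustrated with a minimal sketch. The fitted model itself is not reproduced in this text, so the sketch assumes a generic form with X1, the dummy variable Blue (1 for Blue, 0 for Red), and their interaction. The coefficient values and the names `predict` and `group_line` are hypothetical and chosen only for illustration.

```python
# Hypothetical coefficients, chosen only for illustration (not taken from the exam).
B0, B1, B2, B3 = 2.0, 1.5, 4.0, -0.5


def predict(x1, blue, b0=B0, b1=B1, b2=B2, b3=B3):
    """Assumed model form: Y = b0 + b1*X1 + b2*Blue + b3*(X1*Blue), Blue in {0, 1}."""
    return b0 + b1 * x1 + b2 * blue + b3 * x1 * blue


def group_line(blue, b0=B0, b1=B1, b2=B2, b3=B3):
    """Return (intercept, slope) of the simplified line for one color group."""
    return (b0 + b2 * blue, b1 + b3 * blue)


blue_line = group_line(1)  # intercept b0 + b2, slope b1 + b3
red_line = group_line(0)   # intercept b0, slope b1
```

Setting Blue to 1 or 0 collapses the assumed model into one straight line per group. A nonzero interaction coefficient shows up as a difference in slope between the two lines, while the dummy coefficient alone only shifts the intercept.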
We have developed a model to predict the sales (in thousands of dollars) of a new store that our company may decide to open in a new city. We define and fit a model with five predictors:
  • Population of the city (in thousands of people)
  • Average income of the city (in thousands of dollars per adult)
  • Type of store (1 for a downtown store, 0 for a mall store)
  • Interaction between population and average income (in thousands)
  • Interaction between average income (in thousands) and store type
In the cities we are evaluating, the average income is generally less than $100,000 and the cities range in size from 0 to 500,000 people.
After fitting this model using linear regression, we get the following coefficients: , 20, 50, 350, 0.05, -5
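The role of the store-type terms can be explored with a short sketch. It assumes that the five listed values map to the five predictors in the order given; the coefficient symbols are missing from this text, so that mapping is an assumption. The intercept is not needed here because it cancels when two stores with the same population and income are compared. The function names are hypothetical.

```python
# Coefficients as listed in the problem, ASSUMED to map to the five
# predictors in the order given (the symbols are missing from the text).
B_POP, B_INC, B_TYPE, B_POP_INC, B_INC_TYPE = 20, 50, 350, 0.05, -5


def downtown_minus_mall(income_k):
    """Predicted sales gap (thousands of dollars) between a downtown store
    and a mall store at the same population and average income.

    The intercept, population, income, and population-by-income terms are
    identical for both stores and cancel, leaving only the store-type terms.
    """
    return B_TYPE + B_INC_TYPE * income_k


def break_even_income_k():
    """Average income (thousands of dollars) at which the gap is zero."""
    return -B_TYPE / B_INC_TYPE


gap_low_income = downtown_minus_mall(50)   # income of $50,000
gap_high_income = downtown_minus_mall(90)  # income of $90,000
```

Under this assumed mapping, the sign of the gap depends on average income, so the store-type coefficient cannot be read on its own.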

2A) Which answer is correct?

a) For a fixed value of population and average income, a downtown store would on average have greater sales than a mall store.
b) For a fixed value of population and average income, a mall store would on average have greater sales than a downtown store.
c) For a fixed value of population and average income, a downtown store would on average have more sales than a mall store provided that the average income is high enough.
d) For a fixed value of population and average income, a mall store would on average have more sales than a downtown store provided that the average income is high enough.
Response: XXX
2B) What are the predicted sales for a downtown store in a city with a population of 100,000 and an average income of $50,000?

  • $XXX

2C) Is this statement true or false, and why: "Since the coefficient of the interaction term between population and average income is very small, there is very little evidence of an interaction effect."
  • XXX

2D) Which predictor has the larger impact on sales, income or city population? Explain your answer.

  • XXX

You are assessing four candidate models (M1 through M4). You train the models ten different times with different population samples and then assess them against test partitions by calculating their mean squared errors (MSE). The results of those tests are summarized on the following page.

Complete the figure at the bottom of the following page with one model in each of the four boxes.

            Low Variance    High Variance
Low Bias    XXX             XXX
High Bias   XXX             XXX
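The two axes of this grid can be made concrete with a small simulation. The sketch below is not based on the exam's models M1 through M4, whose results are not reproduced here. It assumes a hypothetical true function, noise level, and two toy models (a constant predictor and a 1-nearest-neighbor predictor), retrains each on many fresh samples, and measures bias and variance of the prediction at one test point.

```python
import random
import statistics


def true_f(x):
    """Hypothetical true relationship, assumed for this illustration only."""
    return 4.0 * x


def sample_training_set(rng, n=30):
    """Draw a fresh training sample with Gaussian noise (standard deviation 1)."""
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_constant(xs, ys):
    """Rigid model: always predicts the training mean of y."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_nearest_neighbor(xs, ys):
    """Flexible model: predicts the y of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def bias_and_variance(fit, x0=0.9, repeats=200, seed=0):
    """Retrain on fresh samples and summarize the predictions at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(repeats):
        xs, ys = sample_training_set(rng)
        preds.append(fit(xs, ys)(x0))
    bias = statistics.fmean(preds) - true_f(x0)
    variance = statistics.pvariance(preds)
    return bias, variance


const_bias, const_var = bias_and_variance(fit_constant)
nn_bias, nn_var = bias_and_variance(fit_nearest_neighbor)
```

In this toy setup the constant model lands in the high-bias, low-variance corner and the nearest-neighbor model in the low-bias, high-variance corner. With real test results, only the MSE values across retrainings are available, so their level and spread have to stand in for these two quantities.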

4A) Explain in your own words how k-fold cross-validation is implemented.
  • xxx
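For reference, the general procedure can be sketched in a few lines. This is a minimal illustration, assuming a simple one-predictor least-squares line as the model and MSE as the error measure. The function names `fit_line` and `k_fold_mse` are hypothetical and the data are synthetic.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def k_fold_mse(xs, ys, k=5, seed=0):
    """Return (average MSE over folds, list of per-fold MSEs)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)          # randomize before splitting
    folds = [idx[i::k] for i in range(k)]     # k roughly equal, disjoint folds
    fold_mses = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the other k-1 folds, then score on the held-out fold.
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (b0 + b1 * xs[i])) ** 2 for i in held_out]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k, fold_mses


# Synthetic, noise-free data: every fold should give an MSE near zero.
xs = [float(x) for x in range(20)]
ys = [3.0 + 2.0 * x for x in xs]
cv_mse, fold_mses = k_fold_mse(xs, ys, k=5)
```

Each observation is held out exactly once, and the model is refit k times, which is the main cost compared with a single train/validation split.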

4B) Provide one advantage and one disadvantage of k-fold cross-validation relative to:
  • The validation set approach?
    • Advantage: xxx
    • Disadvantage: xxx
  • Leave-one-out cross-validation (LOOCV)?
    • Advantage: xxx
    • Disadvantage: xxx

The following pages present a residuals diagram and a residuals histogram for each of six different models. For each model, identify the apparent problem(s) with the model and provide one technique that you might use to remediate (correct) the problem.
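One pattern that a residuals diagram can reveal is a funnel shape, where the spread of the residuals changes with the fitted value. The sketch below is a crude numeric check of that single pattern on synthetic data, not a substitute for reading the plots. The helper names and the half-split comparison are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals_and_fitted(xs, ys):
    """Return (fitted values, residuals) for a fitted line."""
    b0, b1 = fit_line(xs, ys)
    fitted = [b0 + b1 * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    return fitted, residuals


def spread_ratio(fitted, residuals):
    """Mean |residual| in the upper half of fitted values divided by the
    mean |residual| in the lower half. Values far from 1 hint at
    non-constant variance."""
    ordered = [abs(r) for _, r in sorted(zip(fitted, residuals))]
    half = len(ordered) // 2
    lower = sum(ordered[:half]) / half
    upper = sum(ordered[half:]) / (len(ordered) - half)
    return upper / lower


# Synthetic data with deterministic, alternating-sign errors.
xs = list(range(1, 41))
sign = [1 if i % 2 == 0 else -1 for i in range(40)]
ys_constant = [2 * x + s for x, s in zip(xs, sign)]          # constant spread
ys_funnel = [2 * x + s * 0.1 * x for x, s in zip(xs, sign)]  # spread grows with x

ratio_constant = spread_ratio(*residuals_and_fitted(xs, ys_constant))
ratio_funnel = spread_ratio(*residuals_and_fitted(xs, ys_funnel))
```

The ratio stays close to 1 for the constant-spread data and rises well above 1 for the funnel-shaped data. Other issues, such as curvature in the residuals or a skewed histogram, need different checks.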
Residuals Analysis

5A Model 1

  • Model issue: xxx
  • Possible remediation: xxx

5A Model 2

  • Model issue: xxx
  • Possible remediation: xxx

5A Model 3

  • Model issue: xxx
  • Possible remediation: xxx

5A Model 4

  • Model issue: xxx
  • Possible remediation: xxx

5A Model 5

  • Model issue: xxx
  • Possible remediation: xxx

5A Model 6

  • Model issue: xxx
  • Possible remediation: xxx