Econ 104: Intro to Econometrics
Spring 2020
Final Exam: Answer Key
Problem 1 (10 points)
Your uncle gave you a proprietary data set that records the wind direction and speed for any place in the United States for every half-hour period from January 1, 2010 to December 31, 2019. You want to use these data to construct an instrument that can answer an interesting economic question that involves causality. You need to describe your research project, including additional data you may want to use. Specifically:
1. Define the economic question you want to address. Make sure your question contains a question mark.
2. Define the unit of observation (i.e., if your data are cross-sectional, define $i$; if your data are a time series, define $t$; if your data are panel, define both $i$ and $t$).
3. Define the outcome variable $y_{it}$. (This notation implies that your data are panel, but that does not need to be the case.)
4. Define the key regressor of interest $x_{it}$.
5. Define control variables $w_{it}$.
6. Describe possible unobservable factors that will be picked up by $u_{it}$.
7. Define the main estimated equation/model.
8. Explain why your regressor $x_{it}$ is likely to be endogenous.
9. Explain why your instrument $z_{it}$ is likely to be exogenous.
10. Explain why your instrument $z_{it}$ is likely to be relevant.
Answer and Grading:
1. What is the effect of electricity prices on smog?
2. $t$ is a month from January 1, 2010 to December 31, 2019.
3. $y_t$ is the average (over the respective month) density of PM2.5 particles over a candidate city that generates a significant amount of its electricity through wind power.
4. $x_t$ is the average electricity rate during that month (weighted by market share, in terms of production units, if multiple providers differ in prices) in that city.
5. Include seasonal dummies $w_{summer,t}$, $w_{fall,t}$, $w_{spring,t}$, a deterministic trend for the role of production technology evolution, and a one-period lag of smog to account for the slow dissipation of aggregate air pollution.
6. $u_t$ will pick up the effect of economic cycles.
7. $y_t = \beta_0 + \delta t + \rho_1 y_{t-1} + \beta_1 x_t + \beta_2 w_{summer,t} + \beta_3 w_{fall,t} + \beta_4 w_{spring,t} + u_t$
8. Economic cycles can affect both the price of electricity, via demand (assuming installed (potential) supply is relatively stable in the short term), and smog, since smog is positively correlated with economic activity.
9. Wind farms, due to their visual interference and the less optimal wind capture potential of the surrounding landscape, are likely to be placed far away from the cities where we would measure pollution. It is therefore sensible to suggest that wind conditions at the sites of power generation and of smog measurement are relatively uncorrelated. Thus, even if one believes, reasonably, that the wind conditions of a city affect its smog levels, the fact that we are using the conditions at an alternate location allows us to assume that these affect air pollution almost exclusively through their effect on energy costs.
10. Wind strength is the primary variable determining total energy generation by a wind turbine over a given time period. Electric power availability from wind farms shifts the supply curve and hence the equilibrium price at which the commodity is traded.
Grading scheme:
- 1 point for each part.
- Parts 1, 2, 3, 4, 6, 7, 8, 9, and 10 are straightforward.
- Getting full credit for part 5 requires proposing either obviously relevant variables or variables explicitly argued to be relevant.
Problem 2 (10 points)
The following code in R was written to illustrate the phenomenon of weak instruments discussed in Section 12.3. Unfortunately, the code has blanks (___).
1. (6 points) Fill in the eight blanks so that the code achieves its intended purpose.
2. (1 point) Run your code and record the output.
3. (3 points) Briefly explain why the two numbers that your code generates illustrate the phenomenon of weak instruments.
library(AER)
a = (___)
b = (___)
ro1 = (___)
ro2 = 1
alpha = (___)
n = 100
k = 1000
bhat1 = numeric(k)
bhat2 = numeric(k)
for (ii in 1:(___)) {
  e = rnorm((___))
  z = 1 + rnorm(n)
  x1 = ro1*z + e
  x2 = ro2*z + e
  y1 = a + b*x1 + alpha*e + (1-alpha)*rnorm(n)
  y2 = a + b*x2 + alpha*e + (1-alpha)*rnorm(n)
  bhat1[ii] = summary(ivreg(y1 ~ x1 | (___)))$coefficients[2,1]
  bhat2[ii] = summary(ivreg(y2 ~ x2 | (___)))$coefficients[2,1]
}
mean(bhat1)
mean(bhat2)
Answer and Grading:
1. Grading scheme:
- a and b are arbitrary but should not be random. In particular, you want to show that bhat2 is unbiased and close to the constant b.
- 1 point for filling in ro1 very large (ok) or close to 0 (better). Setting ro1 close to 0 is better for showing the impact of weak instruments because when ro1 is large, z is more strongly correlated with x1 than with x2, but the correlation between z and x2 still isn't weak. Note that z does not need to be independent of x1 to be a weak instrument for x1. The correlation should just be low.
- 1 point for alpha between 0 and 1, large enough to create enough correlation between the error in the regression and the regressor. If alpha = 0, there is no correlation between the regressor and the error in the regression and no need for an instrument. Alpha controls the proportion of the error in the regression that is correlated with the regressor.
- 1 point each for filling in k, n, z, and z for the last 4 blanks.
- As per the instructions, only the pdf is graded.
2. Grading scheme:
- 1 point for reporting mean(bhat1) and mean(bhat2), where bhat1 is further from b than bhat2 is from b.
3. Grading scheme:
- 1 point: ro1 is small, so the correlation between x2 and z is stronger than the correlation between x1 and z, and bhat2 is closer to b than bhat1.
- 1 point: z is a weak instrument for x1.
- 1 point for assessment of your understanding of what the exercise is showing. We observe this phenomenon because, when the instrument is weak, the normal distribution provides a poor approximation to the sampling distribution of the TSLS estimator, even if the sample size is large. The estimate will be biased towards OLS when the IV is weak. Note that this exercise illustrates that the TSLS estimate is unbiased under a strong instrument. Even though the TSLS estimate converges in probability to the true b, this code doesn't show that. The F-statistic can be used to evaluate the relevance of the instrument, but you get that from a first-stage regression of x1 on z and of x2 on z, not from mean(bhat1) and mean(bhat2).
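One possible way to fill in the blanks is sketched below. The specific values (a = 1, b = 2, ro1 = 0.01, alpha = 0.5) and the seed are illustrative assumptions that merely satisfy the grading criteria, not the official key. To stay self-contained, the sketch replaces `ivreg` from the AER package with a hypothetical helper `iv_slope`, which uses the closed form of the just-identified IV slope, cov(z, y) / cov(z, x).

```r
# Illustrative fill-ins (assumptions, not the official answer key)
set.seed(104)
a <- 1; b <- 2
ro1 <- 0.01   # weak first stage: z barely moves x1
ro2 <- 1      # strong first stage: z clearly moves x2
alpha <- 0.5  # share of the regression error that is correlated with x
n <- 100
k <- 1000

# Hypothetical helper: just-identified IV slope (model with an intercept)
iv_slope <- function(y, x, z) cov(z, y) / cov(z, x)

bhat1 <- numeric(k)
bhat2 <- numeric(k)
for (ii in 1:k) {
  e  <- rnorm(n)
  z  <- 1 + rnorm(n)
  x1 <- ro1 * z + e
  x2 <- ro2 * z + e
  y1 <- a + b * x1 + alpha * e + (1 - alpha) * rnorm(n)
  y2 <- a + b * x2 + alpha * e + (1 - alpha) * rnorm(n)
  bhat1[ii] <- iv_slope(y1, x1, z)
  bhat2[ii] <- iv_slope(y2, x2, z)
}

# Medians are printed next to the means because the weak-instrument
# draws have very heavy tails, which makes their mean erratic.
print(c(mean_weak = mean(bhat1), median_weak = median(bhat1)))
print(c(mean_strong = mean(bhat2), median_strong = median(bhat2)))
```

Under these assumptions the strong-instrument estimates should cluster near b, while the weak-instrument estimates are widely dispersed and centered away from b, in the direction of the OLS bias.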
Problem 3 (10 points)
Recently, the Supreme Court of the United States has had multiple opportunities to wrestle with various econometric methods. In this problem, we consider two recent cases. To get full credit, you must answer these two questions concisely and your answers must show that you understand and are able to apply the relevant concepts discussed in class.
1. (6 points) In the 2019 case Department of Commerce v. New York, the Court had to decide whether the inclusion of the citizenship question ("Are you a U.S. citizen?") was likely to decrease the number of respondents participating in the Census. During the oral argument, Justice Neil Gorsuch weighed in. "There could be multiple reasons why individuals don't complete the form." He continued: "We don't have any evidence disaggregating the reasons why the forms are left uncompleted. What do you do with that? I mean, normally we would have a regression analysis that would disaggregate the potential cause and identify to a 95th percentile degree of certainty what the reason is that persons are not filling out this form and we could attribute it to this question. We don't have anything like that here. So what are we supposed to do about that?" Suppose we conduct the regression analysis desired by Justice Gorsuch. Explain whether and how the regression estimates could "disaggregate the potential cause and identify to a 95th percentile degree of certainty" the effect of the citizenship question on census participation.
2. (4 points) In the 2018 case Ohio v. American Express Co., the Court had to decide whether a certain action of American Express decreased the volume of credit card transactions. The action at issue involved a specific restriction in the contract between American Express and its participating merchants: the Non-Discrimination Provision, which prohibited merchants from directly or indirectly steering customers to use a particular card in making their purchases. In a 5-4 decision, Justice Clarence Thomas, writing for the majority, observed: "The output of credit-card transactions grew dramatically from 2008 to 2013, increasing 30%." Based on this observation, the Court concluded that the Non-Discrimination Provision could not have decreased the volume of credit card transactions and ruled for American Express. Was the methodology adopted by the Court appropriate for the question it was after? Can you suggest a better way to determine the causal relationship between the Non-Discrimination Provision and the volume of credit card transactions?
Answer and Grading:
1. To get full credit, you need to make several points:
- A regression analysis alone cannot prove or disprove causality. What establishes causality is our assumptions about the reasons why the regressor varies from observation to observation. So, if the research question is how the citizenship question affects participation in the Census, then the inclusion of the citizenship question has to vary from observation to observation randomly or as-if randomly (i.e., exogenously).
- If there are multiple sources of exogenous variation in the data that are potentially correlated with each other, then regressions can be used to disaggregate the influence of each source. Ultimately, however, the regression estimates will not be able to tell us whether a given factor was the reason for not filling out this form, simply because the decision can be caused by multiple contributing factors.
- Once regressions are estimated, we can test whether the true value of a coefficient is zero. Our data can reject with 95% confidence that the effect of the citizenship question is zero. The 95% number is the share of cases where we would not make a mistake by falsely rejecting the null. The data cannot, however, determine with a "95th percentile degree of certainty" (degree of certainty is not a random variable, anyway) the reason why persons are not filling out this form. There may be multiple contributing reasons.
Grading scheme:
- 3 points for a correct discussion of the relationship between causality and regressions
- 2 points for analyzing multiple sources correctly
- 1 point for a correct comment on 95th percentile degree of certainty
2. The majority opinion commits a common fallacy: post hoc ergo propter hoc. The majority mistakenly believes that the fact that output didn't go down after the Provision was introduced implies that output didn't go down as a result of the Provision being introduced. That's not how we think about causality. It is equally possible that output could have been growing even faster had the Provision been absent. To establish the causal effect, we need to know what would have happened in a world without this Provision. Thus, the presented empirical evidence is unpersuasive and should have been discounted. Without knowing more about the industry, it is hard to give a detailed explanation of how one would estimate the causal effect, but, speaking in general terms, we need to find a control group unaffected by the Provision (maybe merchants in different countries or different time periods) and compare the growth of credit-card output for the control group with the growth of credit-card output for the treatment group (i.e., those affected by the Provision). Importantly, we need to make sure that the treatment is assigned randomly or as-if randomly: the adoption of the Non-Discrimination Provision is likely endogenous.
Grading scheme:
- 2 points for identifying the flaw with the existing methodology
- 2 points for a full and complete description of a correct alternative methodology
Problem 4 (10 points)
Researchers from Yale conducted a study that analyzed the growth of coffee chains in Canada (such as Starbucks, Tim Hortons, Second Cup). They collected data on the universe of chain coffee shops in Canada between 1970 and 2005, with their opening and closing years, as well as locations. To understand the factors that affect the chains' decisions to open or close an additional coffee shop in a given geographical market, they estimated the following model. They assumed that a coffee shop's profit $y^*_{imt}$ from chain $i$ in geographical market $m$ in year $t$ is:
$$y^*_{imt} = \beta_1 n_{imt} + \beta_2 n_{-imt} + z_{mt}\gamma + \varepsilon_{imt},$$
where $n_{imt}$ is the number of existing outlets of chain $i$, $n_{-imt}$ is the number of existing outlets of chain $i$'s competitors, $z_{mt}$ is a vector of factors affecting demand, and $\varepsilon_{imt}$ is an i.i.d. unobserved shock that is assumed to follow the standard normal distribution. The $\beta$s and $\gamma$ are the coefficients. Unfortunately, the researchers did not have data on the coffee chains' per-shop profits; they only knew whether in a given year and in a given market the number of shops went up, down, or stayed the same. Based on this information, they defined the outcome variable as follows:
$$y_{imt} = \begin{cases} \text{exit}, & \text{if } y^*_{imt} \le c_1, \\ \text{unchanged}, & \text{if } c_1 < y^*_{imt} \le c_2, \\ \text{enter}, & \text{if } c_2 < y^*_{imt}, \end{cases}$$
where $c_1$ and $c_2$ are the threshold parameters for the profit, to be estimated. Thus, if the chain's per-shop profit exceeds $c_2$, a new shop enters. If the chain's per-shop profit drops below $c_1$, an existing outlet closes, and if the profit is between $c_1$ and $c_2$, the number of shops of chain $i$ stays the same. Since the researchers assumed that the error is standard normal, they were able to estimate the following probit-like models (these models are called ordered probit, as you saw in Appendix 11.3):
$$\Pr(y_{imt} = \text{exit}) = \Phi\big(c_1 - (\beta_1 n_{imt} + \beta_2 n_{-imt} + z_{mt}\gamma)\big)$$
$$\Pr(y_{imt} = \text{unchanged}) = \Phi\big(c_2 - (\beta_1 n_{imt} + \beta_2 n_{-imt} + z_{mt}\gamma)\big) - \Phi\big(c_1 - (\beta_1 n_{imt} + \beta_2 n_{-imt} + z_{mt}\gamma)\big)$$
$$\Pr(y_{imt} = \text{enter}) = 1 - \Phi\big(c_2 - (\beta_1 n_{imt} + \beta_2 n_{-imt} + z_{mt}\gamma)\big)$$
The model was estimated using Maximum Likelihood. The estimates are summarized in the following table.
Dependent Variable: Decision to Enter/Exit ($y_{imt}$)
                                               (1)        (2)        (3)        (4)
Number of own-chain shops ($\beta_1$)       -0.3108    -0.6135    -0.7756    -0.9281
                                            (0.0243)   (0.0281)   (0.0294)   (0.0372)
Number of rival-chain shops ($\beta_2$)      0.0180    -0.2733    -0.2301    -0.2984
                                            (0.0112)   (0.0174)   (0.0178)   (0.0235)
Population (thousand, $\gamma_1$)            0.0017     0.0253     0.0259     0.0233
                                            (0.0009)   (0.0038)   (0.0039)   (0.0062)
Income (thousand $, $\gamma_2$)              0.0022     0.0165     0.0176     0.0149
                                            (0.0007)   (0.0021)   (0.0021)   (0.0058)
Forward population growth (%, $\gamma_3$)                                     0.0723
                                                                             (0.0215)
Market fixed effect                            No        Yes        Yes        Yes
Chain fixed effect                             No         No        Yes        Yes
Cutoff 1 ($c_1$)                            -2.7020     0.9237     0.8595    -0.8608
                                            (0.0460)   (0.2844)   (0.2951)   (0.4973)
Cutoff 2 ($c_2$)                             2.3736     4.3860     4.6121     4.842
                                            (0.0421)   (0.2886)   (0.3001)   (0.5016)
Observations                                 70,000     70,000     70,000     70,
Pseudo R^2                                   0.0139     0.0743     0.1128     0.
Note: The symbols ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively. Forward population growth is the actual annualized percent change between $t$ and $t+5$. Alternative time horizons such as 10 years generate similar results.
1. (6 points) Based on the estimates presented in the table, draw three distinct economic conclusions and briefly explain how each of your conclusions is supported by the estimates.
2. (2 points) In their report, the researchers extensively discuss the difference $(c_2 - c_1)$. What economic variable does this difference represent? What is your best estimate for this difference? Is your estimate consistent and why?
3. (2 points) Explain how you would test whether the difference $(c_2 - c_1)$ is equal to zero.
Answer and Grading:
1. Some examples of possible conclusions (only three are needed):
(a) The presence of own shops is negatively correlated with entry (i.e., the coefficient $\beta_1$ is negative and statistically significant across all specifications).
(b) The magnitude of the own-chain coefficient seems to be always larger than that of the rival-chain coefficient (although we cannot test it formally without knowing the correlation between the two estimates). This suggests that own-chain shops are likely to be better substitutes than rival-chain shops.
(c) The correlation between entry and the presence of rival-chain shops is also negative and statistically significant (i.e., $\beta_2 < 0$), but only after we include the market fixed effect in columns (2)-(4). This implies that in specification (1) we mistakenly conclude that the presence of rival-chain shops makes entry more likely because of the omitted variable bias caused by market-specific factors: some unobserved market-specific profitability attracts both the chain and its rivals.
(d) Chains seem to be forward-looking: the coefficient on the variable "forward population growth" is positive and statistically significant.
Grading scheme for each conclusion:
- 1 point is for the conclusion itself
- 1 point for referring to the correct part of the table with the estimation results
2. The difference $(c_2 - c_1)$ represents the adjustment costs of entry/exit decisions for the chain (e.g., entry costs and/or store-liquidation costs). If this difference were zero, that would imply that exit/entry is costless and the chain could adjust the number of stores without any frictions. (An alternative explanation: the difference could signify the discrete nature of shops, i.e., the fact that an additional entry/exit can cause a discrete change in profits.) The best estimate is $4.842 - (-0.8608) = 5.7028$. The estimates presented in the table are consistent because ML estimates are consistent. Therefore, this estimate is consistent as well because it is a difference of consistent estimators.
Grading scheme:
- 1 point for interpretation
- 1 point for explaining consistency
- incorrect statements result in a 1 point loss.
3. Define $d_1 = c_1$ and $d_2 = c_2 - c_1$; then the outcome variable can be written as:
$$y_{imt} = \begin{cases} \text{exit}, & \text{if } y^*_{imt} \le d_1, \\ \text{unchanged}, & \text{if } d_1 < y^*_{imt} \le d_1 + d_2, \\ \text{enter}, & \text{if } d_1 + d_2 < y^*_{imt}. \end{cases}$$
To test the hypothesis, we need to re-estimate the model with $d_1$ and $d_2$ instead of $c_1$ and $c_2$ and then test whether the parameter $d_2$ is zero.
Grading scheme:
- 1 point for acknowledging that standard errors are needed and that those are not available
- 1 point for outlining how to compute the standard error
- incorrect statements result in a 1 point loss.
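The reason the table alone is not enough can be made explicit with a short derivation. This is a sketch that assumes the usual asymptotic normality of the ML estimates; the hats denote estimates.

```latex
\begin{align*}
\hat d_2 &= \hat c_2 - \hat c_1, \\
\widehat{\operatorname{Var}}(\hat d_2)
  &= \widehat{\operatorname{Var}}(\hat c_2)
   + \widehat{\operatorname{Var}}(\hat c_1)
   - 2\,\widehat{\operatorname{Cov}}(\hat c_1, \hat c_2), \\
t &= \frac{\hat d_2}{\sqrt{\widehat{\operatorname{Var}}(\hat d_2)}} .
\end{align*}
```

The table reports the standard errors of the two cutoffs but not their covariance, so the denominator cannot be computed from the reported numbers. Re-estimating the model in terms of $d_1$ and $d_2$ yields the standard error of $\hat d_2$ directly.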
Problem 5 (10 points)
A friend of yours argues that it is impossible to have a regression with two highly insignificant regressors and a high $R^2$ at the same time: "It's just common sense. Two almost-zero coefficients cannot suddenly deliver a good fit. Remember in class we discussed the relationship between $R^2$ and the F-statistic? Suppose the p-value for the t-test is 0.5 for the first regressor and the p-value for the second t-test is 0.5. What's the p-value for the F-test for these two regressors? Well, it has to be $0.5^2 = 0.25$. It is smaller but still not small enough for the F-test to reject. That means that the value of the F-statistic is small. Therefore, $R^2$ must be small."
1. (2 points) Explain why your friend's argument is wrong.
2. (6 points) Seemingly convinced by your argument, your friend makes a counterpoint: "Fine, my reasoning was wrong, but the point can still be correct. Can you give me a real example that has two insignificant regressors and a high $R^2$, say, above 0.8?" You can. You offer to generate a large dataset (10,000 observations) that proves your side of the debate. The following code written in R generates 10,000 observations of $y$, $x_1$, and $x_2$ such that the coefficients on both $x_1$ and $x_2$ are statistically insignificant, while the F-test rejects the hypothesis that both coefficients are zero. Predictably, the code has blanks (___) that you need to complete in order to achieve the intended purpose. Fill in the seven blanks, run your code, and record the output.
3. (2 points) Briefly explain why your code achieves its intended purpose.
set.seed(___) # insert your numeric Penn ID here
n = 10000
a1 = (___)
a2 = (___)
b1 = (___)
b2 = (___)
p1 = (___)
p2 = (___)
z1 = rnorm(n)
z2 = rnorm(n)
x1 = a1 + p1*z1 + (1-p1)*z2
x2 = a2 + (1-p2)*z1 + p2*z2
y = b1*x1 + b2*x2 + rnorm(n)
summary(lm(y~x1+x2 -1))
Answer and Grading:
1. The discussion of the p-value is flawed. The p-value for the F-statistic would equal the product of the two p-values of the t-statistics only if the two coefficient estimates were independent, which is clearly not the case since these estimates are calculated on the same sample. Depending on the correlation between the two estimates, the p-value for the F-statistic can be smaller or larger than the product. Thus, p-values alone do not guarantee a low or high R-squared value.
Grading scheme:
- 1 point for stating that the p-value of the F-test is not equal to the product of the p-values of the t-tests
- 1 point for explaining why
2. $x_1$ and $x_2$ will have a high correlation if they are similarly related to $z_1$ and $z_2$, which can be achieved by setting $p_1$ close to $1 - p_2$ (but not exactly equal!). The values of $a_1$ and $a_2$ are irrelevant. Any $b_1$ or $b_2$ would work as long as at least one of them is nonzero (although having at least one of them very different from zero would make your job easier).
Another way to achieve the outcome is to set $p_1$ exactly equal to $1 - p_2$, but in this case $a_1$ must not equal $a_2$. Since the intercept is missing in the model, OLS can distinguish the two regressors through their constant terms even though they are identically related to the random variables.
Grading scheme:
- 6 points for producing two t-stats with p-values greater than 0.05 and [an F-test p-value less than 0.05 or an R-squared value greater than 0.8]
- If the replication of your code doesn't give the results in the pdf file, it will be an automatic 0 points.
3. When there is multicollinearity between two regressors, the coefficients for both regressors are estimated imprecisely. However, if at least one of those regressors is relevant, the F-test can still reject the joint hypothesis that both coefficients are zero, so we can still get a high F-statistic. For example, the OLS estimator may find it difficult to determine whether it is the height or the wingspan that makes a basketball player valuable but will conclude that at least one of the factors is important.
Grading scheme:
- 2 points for explaining imperfect multicollinearity
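The first approach described above (setting p1 close to, but not exactly equal to, 1 - p2) can be sketched as follows. All fill-in values and the seed are illustrative assumptions rather than the official key, and the x1 and x2 lines assume that the second shock in each is z2.

```r
set.seed(12345)        # placeholder seed, not a real Penn ID
n  <- 10000
a1 <- 0; a2 <- 0       # constants; irrelevant under this design
b1 <- 5; b2 <- 5       # both regressors truly matter
p1 <- 0.5
p2 <- 0.5001           # 1 - p2 = 0.4999: close to p1 but not equal

z1 <- rnorm(n)
z2 <- rnorm(n)
x1 <- a1 + p1 * z1 + (1 - p1) * z2
x2 <- a2 + (1 - p2) * z1 + p2 * z2
y  <- b1 * x1 + b2 * x2 + rnorm(n)

fit <- summary(lm(y ~ x1 + x2 - 1))

print(cor(x1, x2))                      # near 1 by construction
print(fit$coefficients)                 # large standard errors on both slopes
print(fit$r.squared)                    # high despite imprecise slopes

# p-value of the joint F-test that both coefficients are zero
f <- fit$fstatistic
f_pvalue <- pf(f[1], f[2], f[3], lower.tail = FALSE)
print(f_pvalue)
```

Because x1 and x2 are almost perfectly collinear, each slope has a standard error that dwarfs its true value, so the individual t-tests fail to reject for most seeds, while the sum of the two slopes is pinned down precisely and drives the high R-squared and the F-test rejection. A given seed can still produce a significant t-statistic by chance, so the output should be checked before it is recorded.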
Problem 6 (10 points)
Complaints about increased concentration in the United States, lax merger policy, and decreased competition are common among politicians and academics alike. An economy that is effectively controlled by a few too-big-to-fail corporations fails to deliver competitive prices to American consumers. When two airlines merge, an industry loses a competitor. The effect of the merger, however, may be different on different routes. On some routes, both merging airlines competed before the merger. Competition on these routes decreased. On other routes, only one of the merging airlines offered service, or neither of them did. These routes did not lose a competitor. Researchers from Penn have conducted a study that analyzed the effect of recent airline mergers on the prices of airline tickets. They collected a representative sample of all airline tickets sold in the United States and summarized the results of their regression analysis in the following table.
United-Continental Merger    American-US Airways Merger
(1) (2) (3) (4) (5) (6) (7) (8)
Post 24.19 21.45 0.75 1.
(1.19) (1.32) (0.43) (0.49)
Post × Merged 17.59 -3.
(2.99) (1.00)
Merged 29.98 31.23 19.45 12.
(2.25) (1.76) (0.77) (0.62)
Post × Overlap 35.40 19.31 -4.45 -12.
(2.82) (3.23) (0.77) (0.91)
Post × Overlap × Merged 48.34 22.
(4.40) (1.24)
Constant 207.22 201.03 216.68 210.46 231.28 226.43 232.45 229.
(0.86) (0.98) (0.65) (0.74) (0.32) (0.37) (0.25) (0.29)
Note: Results from regressions where an observation is a ticket and the dependent variable is price. Columns (1)-(4) use a total of 58,132,174 observations from 2008 and 2009 (pre-merger) and 2011 and 2012 (post-merger) for the United-Continental merger. Columns (5)-(8) use a total of 68,496,639 observations from 2011 and 2012 (pre-merger) and 2015 and 2016 (post-merger) for the American-US Airways merger. "Overlap" is a dummy variable that equals one if both merging airlines compete on the route and zero otherwise. "Post" is a dummy variable equal to one in the two years after the merger window and zero in the two years before the merger. "Merged" is a dummy variable equal to one for tickets on United or Continental in the first merger, and on American or US Airways in the second merger, and zero otherwise. All regressions include route fixed effects, and the regressions in columns (3), (4), (7), and (8) include time fixed effects. Robust standard errors are reported in parentheses.
1. (2 points) What is the main economic conclusion you can draw from these estimates? Explain how these estimates support your conclusion.
2. (8 points) The study drew some criticism. For each alleged drawback, respond professionally. If you disagree with the criticism, explain why. If you agree, explain how you would re-estimate the regressions to take this criticism into account.
(a) This analysis does not take into account the impact of oil prices on the industry. Airline prices can go up because oil prices go up.
(b) This analysis fails to consider the effect of congestion. On congested routes the merging airline will have more market power, and the analysis has to recognize it explicitly.
(c) This analysis misses the point. The effect of the merger was not on the airline prices but on the ancillary fees that consumers have to pay (baggage fee, seat assignment fee, change fee).
(d) This analysis cannot possibly get the causal effect of the merger on prices. Randomized control trials are the gold standard. Ain't no randomized control trials in this study for sure.
Answer and Grading:
1. One conclusion that can be drawn from the table: while routes that lost competition to the United-Continental merger saw higher prices (coefficient 35.40 is positive and statistically significant), routes that lost competition to the American-US Airways merger saw lower prices (coefficient -4.45 is negative and statistically significant). Following both mergers, the merged airlines offered higher prices on the overlap routes compared to their competitors (coefficients 48.34 and 22.91 are positive and statistically significant).
Grading scheme:
- 1 point for a correct conclusion
- 1 point for a correct reference to the estimates in the table
2. (a) The impact of oil prices on the entire industry will be captured by the time fixed effects. To incorporate a route-specific or airline-route-specific effect of oil prices, one could re-estimate an augmented model that includes interaction terms between oil prices and route fixed effects, or between oil prices, route fixed effects, and airlines.
(b) If congestion does not vary over time, then the effect of congestion on the price level will be captured by the route fixed effects. If the criticism is that the merger effect will be different for different levels of congestion, then one can re-estimate an augmented model that includes interaction terms between congestion, the overlap dummy, the post dummy, and, potentially, the merged dummy.
(c) If there is no time trend in ancillary fees, then the coefficient on Post will capture the effect on ancillary fees (assuming that the observed price of the ticket included these fees). If the effect is assumed to be positive for merged airlines only, then the coefficient on Post × Merged will capture the effect on ancillary fees. If, on the other hand, there is a time trend in ancillary fees, then one will have to re-estimate the model accounting for this trend explicitly. Since ancillary fees rarely vary from route to route, specifications (3), (4), (7), and (8) will not be informative about the effect on ancillary fees.
(d) This criticism is probably valid, but also too extreme. A merger between two airlines typically affects the entire industry. So, no group can be called a true "control". Unfortunately, conducting randomized trials for the entire industry is impossible. So, we need to make compromises if we want to learn the effect. If we know the mechanism underlying the effect (e.g., the lost direct competition story mentioned in the problem), we can compare routes that lost direct competition to routes that did not lose direct competition. It is true that we will not estimate the full effect of the merger, but knowing the effect of lost direct competition will help us evaluate the effect of the merger at least partially.
Grading scheme for each part:
- 1 point for a correct conclusion
- 1 point for reason/fix