**Question**

1. Annual earnings. You will need statistical software to complete this question. From Canvas assignment B, download the data set EarningsAge_350obs.csv. The data set covers 350 male full-time workers aged 25 to 65 and records their annual earnings in 2010 along with other variables of interest. The key variables are defined as follows:

earnings = Annual earnings in $

lnearnings = Natural logarithm of earnings

education = Educational attainment: years of schooling

exper = Experience in years

hours = Weekly hours worked

lnhours = Natural logarithm of hours

dself = Dummy variable equal to 1 if the worker is self-employed, 0 otherwise

(a) Fill in the table below with summary statistics for the following variables: Earnings (earnings), education (education), experience (exper), hours worked (hours) and an indicator of self-employment (dself).

(b) Run an OLS regression of lnearnings on education, exper and dself. On your answer sheet, write down the result in a compact format with the SRL, robust standard errors, t-statistics, p-values, R-squared and sample size.
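As a rough illustration of what such a regression computes, the sketch below fits OLS with heteroskedasticity-robust standard errors using NumPy. It is not the assignment's prescribed workflow. The data are synthetic stand-ins rather than EarningsAge_350obs.csv, the "true" coefficients are made up, the HC1 variant of the robust estimator is an assumption (your software's default may differ), and the p-values use a normal approximation.

```python
from math import erfc, sqrt

import numpy as np


def ols_robust(y, X):
    """OLS coefficients with HC1 robust standard errors.

    X must already include a column of ones for the intercept.
    """
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator (X'X)^-1 [X' diag(e^2) X] (X'X)^-1,
    # scaled by n / (n - k), which is the HC1 small-sample correction.
    meat = (X * resid[:, None] ** 2).T @ X
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    se = np.sqrt(np.diag(cov))
    t = beta / se
    # Two-sided p-values from the standard normal approximation (assumption).
    p = np.array([erfc(abs(tv) / sqrt(2)) for tv in t])
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, se, t, p, r2


# Synthetic stand-in data; NOT the assignment data set.
rng = np.random.default_rng(0)
n = 350
education = rng.integers(8, 21, size=n).astype(float)
exper = rng.integers(0, 41, size=n).astype(float)
dself = (rng.random(n) < 0.2).astype(float)
# Made-up coefficients used only to generate the illustration.
lnearnings = (9.0 + 0.10 * education + 0.02 * exper - 0.15 * dself
              + rng.normal(0.0, 0.5, size=n))

X = np.column_stack([np.ones(n), education, exper, dself])
beta, se, t, p, r2 = ols_robust(lnearnings, X)

names = ["const", "education", "exper", "dself"]
for name, b, s, tv, pv in zip(names, beta, se, t, p):
    print(f"{name:>10}: coef={b: .4f}  se={s:.4f}  t={tv: .2f}  p={pv:.4f}")
print(f"R-squared={r2:.3f}, n={n}")
```

The compact format asked for above is the fitted equation with the standard error, t-statistic and p-value reported under each coefficient, followed by R-squared and the sample size.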

(c) Briefly interpret all coefficients and their respective p-values in the above regression.

(d) (1pt) What are the predicted lnearnings for a worker with 15 years of education and the following characteristics? Fill in the following table and show your working.
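A prediction from a fitted linear regression is the intercept plus each coefficient multiplied by the worker's value of that regressor. The minimal sketch below shows the arithmetic. The coefficient values and the worker's experience and self-employment status are hypothetical placeholders, not estimates from the assignment data or values from the table.

```python
def predict_lnearnings(coefs, education, exper, dself):
    """Plug worker characteristics into the fitted regression line."""
    return (coefs["const"]
            + coefs["education"] * education
            + coefs["exper"] * exper
            + coefs["dself"] * dself)


# Hypothetical coefficients; replace with your own estimates from (b).
coefs = {"const": 9.0, "education": 0.10, "exper": 0.02, "dself": -0.15}

# Hypothetical worker: 15 years of education, 10 years of experience,
# not self-employed.
pred = predict_lnearnings(coefs, education=15, exper=10, dself=0)
print(f"predicted lnearnings = {pred:.3f}")
```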

2. Lead and infant mortality. Today it is well known that lead is an environmental toxin. Research has linked lead to higher infant mortality, loss of IQ and violence, among other things. However, at the beginning of the 20th century, the effects of lead on human health were not yet well understood. Back then, many cities in the US (and elsewhere) had lead water pipes. This assignment investigates the effect of lead water pipes on infant mortality in 1900.

You will need statistical software to complete this question. From Canvas assignment B, download the data set LeadMortality.csv. The data set contains information on infant mortality, whether the city had lead pipelines or not, and other variables that are potentially related to infant mortality. The table below presents the key variables and their definitions.

(a) Use EViews to compute descriptive statistics on the average rate of infant mortality (infrate) and average pH-index of water (ph) for cities without lead water pipes (i.e. lead=0) and cities with lead water pipes (i.e. lead=1) and fill in the table below on your answer sheet.
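For readers working outside EViews, the group comparison amounts to averaging each variable separately for lead=0 and lead=1. The sketch below does this in plain Python. The rows are invented placeholder values, not observations from LeadMortality.csv, and `group_means` is a hypothetical helper name.

```python
from statistics import mean

# Invented placeholder rows; NOT taken from LeadMortality.csv.
rows = [
    {"lead": 0, "infrate": 0.30, "ph": 7.0},
    {"lead": 0, "infrate": 0.50, "ph": 8.0},
    {"lead": 1, "infrate": 0.40, "ph": 6.0},
    {"lead": 1, "infrate": 0.60, "ph": 7.0},
]


def group_means(rows, key, cols):
    """Average each column in `cols` separately for every value of `key`."""
    result = {}
    for group in sorted({r[key] for r in rows}):
        subset = [r for r in rows if r[key] == group]
        result[group] = {c: mean(r[c] for r in subset) for c in cols}
    return result


means = group_means(rows, "lead", ["infrate", "ph"])
for group, stats in means.items():
    print(f"lead={group}: " + ", ".join(f"{c}={v:.3f}" for c, v in stats.items()))
```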

(b) Produce two scatter plots in EViews. In the first, plot pH (X-axis) against infrate (Y-axis) for cities without lead pipes. In the second, plot pH (X-axis) against infrate (Y-axis) for cities with lead pipes.

(c) Briefly describe how cities with and without lead pipelines differ, using the patterns found in (a) and (b) for infant mortality and the pH-index of water. Does the descriptive evidence suggest that lead increases infant mortality?

Research has shown that the amount of lead leached from lead pipes depends on the pH-index of the water running through the pipes. The more acidic the water is (that is, the lower its pH), the more lead is leached. Create a new series ph_lead that captures this fact and that is defined as ph_lead = lead × ph. The population model thus is:

$$\text{infrate} = \beta_0 + \beta_1\,\text{lead} + \beta_2\,\text{ph} + \beta_3\,\text{ph\_lead} + \beta_4\,\text{temperature} + \beta_5\,\text{typhoid\_rate} + u$$
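The interaction series is simply the product of the lead dummy and the pH value for each city, so it equals the city's pH when lead=1 and zero otherwise. A minimal sketch in plain Python, using invented city rows rather than the actual data:

```python
# Invented placeholder rows; NOT taken from LeadMortality.csv.
cities = [
    {"lead": 0, "ph": 7.6},
    {"lead": 1, "ph": 6.2},
    {"lead": 1, "ph": 8.1},
]

# ph_lead is the product of the lead dummy and the pH value.
for city in cities:
    city["ph_lead"] = city["lead"] * city["ph"]

for city in cities:
    print(city)
```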

(d) Run an OLS regression of infrate on lead, ph, ph_lead, temperature and typhoid_rate. On your answer sheet, write down the result in a compact format with the SRL, robust standard errors, t-statistics, p-values, R-squared and sample size.

(e) Briefly interpret the estimated intercept, as well as the slope coefficients associated with temperature and the typhoid rate in the above regression. Are the coefficients statistically significant at the 10% significance level?

(f) (1pt) What is the marginal effect of lead on infant mortality in the population model?

(g) Provide an estimate of the marginal effect of lead on infant mortality using your results from (f) and (d) for different pH-levels and fill in the following table:
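Because ph_lead is the product of lead and ph, the estimated effect of having lead pipes at a given pH is the coefficient on lead plus the coefficient on ph_lead multiplied by that pH. The sketch below evaluates this expression over a few pH levels. Both coefficient values and the pH grid are hypothetical placeholders, not results from the data or the levels listed in the table.

```python
def lead_effect(b_lead, b_ph_lead, ph):
    """Estimated effect of lead pipes on infrate at a given pH level."""
    return b_lead + b_ph_lead * ph


# Hypothetical estimates; replace with your own results from (d).
b_lead = 0.5
b_ph_lead = -0.06

# Hypothetical pH levels; use the ones given in the table.
effects = {ph: lead_effect(b_lead, b_ph_lead, ph) for ph in (5.5, 6.5, 7.5)}
for ph, effect in effects.items():
    print(f"pH={ph}: estimated effect of lead = {effect:.3f}")
```

With a negative coefficient on ph_lead, as in this made-up example, the estimated effect shrinks as the water becomes less acidic.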

