TOPIC 1 (2015)

Course notes (English)
University: Universidad Pompeu Fabra (UPF)
Degree: Business Administration and Management, 2nd year
Course: Econometrics I
Year: 2015
ECONOMETRICS I | 2015

INDEX

INTRODUCTION
TOPIC 1: REVIEW OF PROBABILITY AND REVIEW OF STATISTICS
   REVIEW OF PROBABILITY
   REVIEW OF STATISTICS
TOPIC 2: INTRODUCTION TO LINEAR REGRESSION
   LINEAR REGRESSION WITH ONE REGRESSOR
   REGRESSION WITH A SINGLE REGRESSOR: HYPOTHESIS TESTS AND CONFIDENCE INTERVALS
TOPIC 3: MULTIPLE REGRESSION
   LINEAR REGRESSION WITH MULTIPLE REGRESSORS: ESTIMATION
   HYPOTHESIS TESTS AND CONFIDENCE INTERVALS IN MULTIPLE REGRESSION: INFERENCE
TOPIC 4: NONLINEAR REGRESSION FUNCTIONS
TOPIC 5: ASSESSING STUDIES BASED ON MULTIPLE REGRESSION

Laura Aparicio

INTRODUCTION

ECONOMETRICS = is the science of using economic theory and statistical techniques to analyse economic data.
MAIN QUESTION OF THE COURSE: Does reducing class size improve elementary school education? (How does a change in one variable affect another variable?) To address this quantitative question we will need to use data, so our answers will always have some degree of uncertainty. Therefore, the conceptual framework for the analysis needs to provide both: ① a numerical answer to the question and ② a measure of how precise our answer is.
The conceptual framework that we will use is the MULTIPLE REGRESSION MODEL.
CAUSAL EFFECT = is the effect on an outcome of a given action or treatment, as measured in an ideal randomized controlled experiment (METHODOLOGY: there are both a control group that receives no treatment and a treatment group that receives it. Moreover, the treatment is assigned randomly, which eliminates any systematic relationship between treatment status and the subjects' other characteristics).
In such an experiment, the only systematic reason for differences in outcomes between the treatment and the control groups is the treatment itself.
PROBLEM? In practice it’s not possible to perform ideal experiments: they may be unethical, impractical or too expensive.
DATA SOURCES =
 EXPERIMENTAL DATA. Data from experiments designed to evaluate a treatment or policy or to investigate a causal effect.
 OBSERVATIONAL DATA. Data obtained by observing actual behaviour through surveys and administrative records → with observational data it’s more complicated to estimate the causal effect!

TYPES OF DATA =
 CROSS-SECTIONAL DATA. Gathered by observing multiple entities at a single point in time.
   o Example: test scores in 420 school districts from California in 1999.
 TIME SERIES DATA. Gathered by observing a single entity at multiple points in time.
   o Example: inflation in the United States in the period 1959-2004.
 PANEL DATA. Gathered by observing multiple entities, each of which is observed at multiple points in time.
   o Example: cigarette consumption for U.S. states in the period 1985-1995.
TOPIC 1: REVIEW OF PROBABILITY AND REVIEW OF STATISTICS

REVIEW OF PROBABILITY

PROBABILITY = is the proportion of the time that an outcome occurs in the long run.
RANDOM VARIABLE = numerical summary of a random outcome. Random variables can be: discrete or continuous.
KINDS OF PROBABILITY DISTRIBUTION =
I. Probability distribution function: list of all possible values of the variable and the probability that each value will occur [DISCRETE VARIABLES].
II. Probability density function: the area between two points is the probability that the random variable falls between those two points [CONTINUOUS VARIABLES].
III. Cumulative distribution function (Φ): the probability that the random variable is less than or equal to a particular value.
IV. Joint probability distribution: the probability that the random variables simultaneously take on certain values, say x and y → Pr(X = x, Y = y)
V. Marginal probability distribution: used to distinguish the distribution of Y alone (the marginal distribution) from the joint distribution of Y and another random variable → Pr(Y = y) = Σ_{i=1}^{l} Pr(X = x_i, Y = y)
VI. Conditional distribution: the distribution of a random variable, Y, conditional on another random variable, X, taking on a specific value → Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x)

MAIN FORMULAS =
 EXPECTED VALUE (μ_Y): long-run average value of the random variable over many repeated trials: E(Y) = μ_Y = Σ_i y_i Pr(Y = y_i)
 VARIANCE: measures the dispersion or “spread” of a probability distribution: Var(Y) = σ_Y² = E[(Y − μ_Y)²]
 STANDARD DEVIATION: σ_Y = √Var(Y)
 SKEWNESS: measures the lack of symmetry: E[(Y − μ_Y)³] / σ_Y³. In the case of a normal distribution it’s equal to 0.
 KURTOSIS: measures how much mass is in the tails, in other words, how much of the variance arises from extreme values: E[(Y − μ_Y)⁴] / σ_Y⁴. The greater the kurtosis, the more likely are outliers. In the case of a normal distribution it’s equal to 3.
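These moment formulas can be checked numerically. Below is a minimal Python sketch for a hypothetical discrete distribution (the values and probabilities are invented for illustration):

```python
import math

# Hypothetical discrete distribution over five values (probabilities sum to 1)
values = [0, 1, 2, 3, 4]
probs = [0.80, 0.10, 0.06, 0.03, 0.01]

mean = sum(v * p for v, p in zip(values, probs))                  # E(Y)
var = sum((v - mean) ** 2 * p for v, p in zip(values, probs))     # Var(Y)
sd = math.sqrt(var)                                               # sigma_Y
skew = sum((v - mean) ** 3 * p for v, p in zip(values, probs)) / sd ** 3
kurt = sum((v - mean) ** 4 * p for v, p in zip(values, probs)) / sd ** 4

print(mean, var, skew, kurt)
```

Because most of the mass sits at 0 with a long right tail, this distribution has positive skewness and kurtosis well above the normal benchmark of 3.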
 CONDITIONAL EXPECTATION: the mean of the conditional distribution of Y given X → E(Y | X = x) = Σ_i y_i Pr(Y = y_i | X = x)
 CONDITIONAL VARIANCE: the variance of the conditional distribution of Y given X → Var(Y | X = x) = E[(Y − E(Y | X = x))² | X = x]

LAW OF ITERATED EXPECTATIONS = the mean of Y is the weighted average of the conditional expectation of Y given X, weighted by the probability distribution of X: E(Y) = Σ_i E(Y | X = x_i) Pr(X = x_i), or more compactly E(Y) = E[E(Y | X)].
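The law of iterated expectations can be verified on a small joint distribution. A sketch with hypothetical probabilities for X in {0, 1} and Y in {0, 1, 2}:

```python
# Hypothetical joint distribution Pr(X = x, Y = y); probabilities sum to 1
joint = {(0, 0): 0.15, (0, 1): 0.25, (0, 2): 0.10,
         (1, 0): 0.05, (1, 1): 0.20, (1, 2): 0.25}

# Marginal distribution of X and conditional expectation E(Y | X = x)
px = {x: sum(p for (xi, y), p in joint.items() if xi == x) for x in (0, 1)}
cond_mean = {x: sum(y * p for (xi, y), p in joint.items() if xi == x) / px[x]
             for x in (0, 1)}

# E(Y) computed directly vs. via the law of iterated expectations
ey_direct = sum(y * p for (x, y), p in joint.items())
ey_iterated = sum(cond_mean[x] * px[x] for x in (0, 1))
```

Both computations give the same mean, which is exactly what the law asserts.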
CONCEPT OF INDEPENDENCE = two random variables X and Y are independently distributed, or independent, if knowing the value of one of the variables provides no information about the other: Pr(Y = y | X = x) = Pr(Y = y). If two variables are independent, their covariance is 0, so the variance of their sum is just the sum of their variances: Var(X + Y) = Var(X) + Var(Y).

COVARIANCE = is the measure of the extent to which two random variables move together: cov(X, Y) = σ_XY = E[(X − μ_X)(Y − μ_Y)]. It can take any value (positive, 0 or negative).
CORRELATION = is an alternative measure of dependence between X and Y that solves the “units” problem of the covariance: corr(X, Y) = cov(X, Y) / (σ_X σ_Y), which always lies between −1 and 1. The random variables X and Y are said to be uncorrelated if corr(X, Y) = 0. However, it is NOT necessarily true that if X and Y are uncorrelated, then the conditional mean of Y given X does not depend on X.
MAIN PROPERTIES = for constants a and b: E(a + bY) = a + bμ_Y and Var(a + bY) = b²σ_Y²; for two random variables: E(X + Y) = μ_X + μ_Y and Var(X + Y) = Var(X) + Var(Y) + 2cov(X, Y).

MOST KNOWN DISTRIBUTIONS =
 THE NORMAL DISTRIBUTION: a continuous random variable with a normal distribution has a bell-shaped probability density. The standard normal distribution, with mean 0 and variance 1, is denoted N(0,1). It’s symmetric, so its skewness is 0 and its kurtosis is 3. (1)
 THE CHI-SQUARED DISTRIBUTION: the distribution of the sum of m squared independent standard normal random variables. This distribution depends on m, which is called the degrees of freedom of the chi-squared distribution. It’s denoted χ²_m.
 THE STUDENT t DISTRIBUTION: with m degrees of freedom is defined to be the distribution of the ratio of a standard normal random variable to the square root of an independently distributed chi-squared random variable with m degrees of freedom divided by m. It is denoted t_m. It also has a bell shape similar to the normal distribution, but when m is small (20 or less) it has more mass in the tails (it is “flatter”). (2)
 THE F DISTRIBUTION: defined to be the distribution of the ratio of a chi-squared random variable W with m degrees of freedom, divided by m, to an independently distributed chi-squared random variable V with n degrees of freedom, divided by n: (W/m) / (V/n) ~ F_{m,n}.

(1) To calculate a probability associated with a normal random variable, first standardize the variable, then use the standard normal cumulative distribution.
(2) When m is 30 or more, the Student t distribution is well approximated by the standard normal distribution, and the t_∞ distribution equals the standard normal distribution.
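The chi-squared construction above can be illustrated by simulation. A sketch (seed and sample sizes are arbitrary) that draws sums of m squared standard normals and checks the known mean m and variance 2m of the χ²_m distribution:

```python
import random
import statistics

random.seed(1)
m = 5  # degrees of freedom

# Each draw is the sum of m squared independent standard normals -> chi-squared(m)
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(m)) for _ in range(20000)]

sim_mean = statistics.fmean(draws)     # should be near m = 5
sim_var = statistics.variance(draws)   # should be near 2m = 10
```

With 20,000 draws the simulated mean and variance land close to the theoretical values.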
RANDOM SAMPLING = produces n random observations Y1, …, Yn from a population (where each member is equally likely to be included in the sample) that are independently and identically distributed (i.i.d.).
The sample average, 𝑌̅, varies from one randomly chosen sample to the next and thus is a random variable with a sampling distribution.
If Y1, …, Yn are i.i.d., then: [1] the sampling distribution of Ȳ has mean μ_Y and variance σ²_Ȳ = σ²_Y / n. This result holds whatever the distribution of Yi is.
Moreover, when Y is normally distributed, Ȳ is distributed as N(μ_Y, σ²_Y / n).
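This sampling distribution can be checked by simulation. A sketch with a hypothetical normal population (μ_Y = 10, σ_Y = 4 are invented; the seed is arbitrary):

```python
import random
import statistics

random.seed(0)
n, reps = 25, 10000
pop_mean, pop_sd = 10.0, 4.0  # hypothetical population parameters

# Draw many samples of size n and record each sample average
means = [statistics.fmean(random.gauss(pop_mean, pop_sd) for _ in range(n))
         for _ in range(reps)]

sim_mean = statistics.fmean(means)     # should be near mu_Y = 10
sim_var = statistics.variance(means)   # should be near sigma_Y^2 / n = 16 / 25 = 0.64
```

The variance of the simulated sample averages shrinks by the factor 1/n relative to the population variance, exactly as the formula says.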
There are two approaches to characterizing sampling distributions: (I) an exact approach and (II) an “approximate” approach.
① EXACT APPROACH
If Y is normally distributed and the observations are i.i.d., then the exact distribution of Ȳ is normal with mean μ_Y and variance σ²_Y / n. Unfortunately, if the distribution of Y is not normal, then in general the exact sampling distribution of Ȳ is very complicated.

② APPROXIMATE APPROACH
It uses approximations that rely on the sample size being large. In order to approximate sampling distributions we will use two key tools:
[1] Law of large numbers: when the sample size is large, Ȳ will be close to μ_Y with very high probability.
[2] Central limit theorem: when the sample size is large, the sampling distribution of the standardized sample average is approximately normal.
[2] The LAW OF LARGE NUMBERS says that Ȳ converges in probability to μ_Y.

[3] The CENTRAL LIMIT THEOREM says that the standardized version of Ȳ, (Ȳ − μ_Y) / σ_Ȳ, has a standard normal distribution [N(0,1) distribution] when n is large, even if the observations themselves are not normally distributed!

HOW LARGE IS “LARGE ENOUGH”? It depends. Sometimes we may require n = 30 or even more but, in fact, for n ≥ 100 the approximation to the distribution of Ȳ typically is very good for a wide variety of population distributions.
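The central limit theorem can be illustrated with a decidedly non-normal population. A sketch using a Bernoulli(0.2) population (the success probability and seed are arbitrary choices): if the CLT holds, roughly 95% of standardized sample averages should fall within ±1.96.

```python
import math
import random
import statistics

random.seed(42)
n, reps = 100, 10000
mu, sigma = 0.2, math.sqrt(0.2 * 0.8)  # Bernoulli(0.2): mean and sd

standardized = []
for _ in range(reps):
    # Sample average of n Bernoulli(0.2) draws
    ybar = statistics.fmean(random.random() < 0.2 for _ in range(n))
    # Standardize: (Ybar - mu_Y) / (sigma_Y / sqrt(n))
    standardized.append((ybar - mu) / (sigma / math.sqrt(n)))

# Fraction of standardized sample means inside the normal 95% band
coverage = sum(abs(z) <= 1.96 for z in standardized) / reps
```

Despite the population being a two-point distribution, the coverage comes out close to the normal 95% figure (the discreteness of the Bernoulli keeps it slightly below).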
REVIEW OF STATISTICS

STATISTICS = the key insight of statistics is that one can learn about a population distribution by selecting a random sample from the population. Using statistical methods, we can use this sample to reach tentative conclusions (“to draw statistical inferences”) about characteristics of the full population.
Three types of statistical methods are used throughout econometrics: [1] estimation, [2] hypothesis testing and [3] confidence intervals.

ESTIMATOR VS ESTIMATE: an estimator (a random variable) is a function of a sample of data to be drawn randomly from a population, whereas an estimate (a non-random number) is the numerical value of the estimator when it is actually computed using data from a specific sample.
SAMPLE AVERAGE (Ȳ) = is an estimator of the population mean, μ_Y. When Y1, …, Yn are i.i.d., its properties are:
A. The sampling distribution of Ȳ has mean μ_Y and variance σ²_Ȳ = σ²_Y / n.
B. Ȳ is unbiased.
C. By the law of large numbers, Ȳ is consistent.
D. By the central limit theorem, Ȳ has an approximately normal sampling distribution when the sample size is large.
E. Ȳ is BLUE (Best Linear Unbiased Estimator); that is, it is the most efficient (best) estimator among all estimators that are unbiased and are linear functions of Y1, …, Yn.
F. Ȳ is the least squares estimator of μ_Y. It provides the best fit to the data in the sense that the average squared difference between the observations and Ȳ is the smallest of all possible estimators.
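The least squares property can be verified directly: the sum of squared deviations is smallest when the candidate value equals the sample average. A sketch with a small invented sample:

```python
import statistics

# Small hypothetical sample (values invented for illustration)
y = [3.1, 4.7, 2.8, 5.0, 4.4]
ybar = statistics.fmean(y)

def sse(m):
    """Sum of squared deviations of the data from a candidate value m."""
    return sum((yi - m) ** 2 for yi in y)

# sse() is minimized at m = ybar; any other candidate gives a larger value
best = sse(ybar)
```

Trying candidates on either side of Ȳ (e.g. `sse(ybar - 0.5)` or `sse(ybar + 0.5)`) always yields a larger sum of squares.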
 The importance of random sampling: so far, we’ve assumed that our observations are i.i.d. This assumption is important because non-random sampling can result in 𝑌̅ being biased.
HYPOTHESIS TEST = the challenge is to answer questions about the whole population based on sample evidence. We’ll focus on:
a) Hypothesis tests concerning the population mean.
   Example: Does the population mean of hourly earnings equal $20?
b) Hypothesis tests involving two populations.
   Example: Are mean earnings the same for men and women?

NULL HYPOTHESIS: H0: E(Y) = μ_Y,0
ALTERNATIVE HYPOTHESIS (TWO-SIDED): H1: E(Y) ≠ μ_Y,0

① STANDARD ERROR = an estimator of the standard deviation of Ȳ: SE(Ȳ) = s_Y / √n, where s_Y is the sample standard deviation.

② T-STATISTIC = t = (Ȳ − μ_Y,0) / SE(Ȳ), used to test the null hypothesis that the population mean takes on a particular value. If n is large, the t-statistic has a standard normal sampling distribution when the null hypothesis is true. It can also be used to calculate the p-value associated with the null hypothesis. A small p-value is evidence that the null hypothesis is false.
③ P-VALUE = is the probability of drawing a statistic at least as adverse to the null hypothesis as the one you actually computed in your sample, assuming the null hypothesis is correct.

We can make two kinds of mistakes in hypothesis testing:
 TYPE I ERROR: the null hypothesis is rejected when in fact it is true.
 TYPE II ERROR: the null hypothesis is not rejected when in fact it is false.
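The standard error, t-statistic and large-sample p-value take only a few lines to compute. A sketch for the hourly-earnings example with H0: μ = 20 (the sample values are invented; the p-value uses the standard normal CDF, written via `math.erf`, which is the large-n approximation):

```python
import math
import statistics

# Hypothetical sample of hourly earnings; null hypothesis: mu = 20
y = [18.5, 22.1, 19.8, 24.0, 17.2, 21.5, 20.3, 23.8, 16.9, 21.0,
     19.4, 22.7, 18.1, 20.9, 23.2, 17.8, 21.6, 19.2, 22.4, 20.5]
n = len(y)

ybar = statistics.fmean(y)
se = statistics.stdev(y) / math.sqrt(n)   # SE(Ybar) = s_Y / sqrt(n)
t = (ybar - 20) / se                      # t-statistic for H0: mu = 20

# Two-sided p-value via the standard normal CDF Phi(z) = (1 + erf(z / sqrt(2))) / 2
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))
```

A p-value below the chosen significance level (e.g. 0.05) would lead to rejecting H0.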
95% CONFIDENCE INTERVAL FOR μ_Y = is an interval constructed so that it contains the true value of μ_Y in 95% of all possible samples: Ȳ ± 1.96 SE(Ȳ).
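A large-sample 95% confidence interval is just Ȳ ± 1.96 SE(Ȳ). A sketch with invented data (with a sample this small a Student t critical value would strictly be more appropriate; 1.96 is used here to match the large-sample formula):

```python
import math
import statistics

# Hypothetical sample (values invented for illustration)
y = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.8, 3.9, 5.2]

ybar = statistics.fmean(y)
se = statistics.stdev(y) / math.sqrt(len(y))

# 95% confidence interval for mu_Y: Ybar +/- 1.96 * SE(Ybar)
lo, hi = ybar - 1.96 * se, ybar + 1.96 * se
```

In repeated sampling, intervals built this way contain the true μ_Y in about 95% of samples; any single interval either contains it or not.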
NOTE! Until now we’ve focused on a single population, but now we’re going to analyse the case of differences between two populations. Hypothesis tests and confidence intervals for the difference in the means of two populations are conceptually similar to tests and intervals for the mean of a single population.
COMPARING MEANS FROM DIFFERENT POPULATIONS =
[1] HYPOTHESIS TEST: H0: μ_m − μ_w = 0 vs. H1: μ_m − μ_w ≠ 0.
[2] Because the population means are unknown, they must be estimated from samples of men and women by the sample averages Ȳ_m and Ȳ_w.
[3] THE STANDARD ERROR OF THE DIFFERENCE IN MEANS IS: SE(Ȳ_m − Ȳ_w) = √(s²_m / n_m + s²_w / n_w).
[4] T-STATISTIC: t = (Ȳ_m − Ȳ_w) / SE(Ȳ_m − Ȳ_w). With a pre-specified significance level, simply calculate the t-statistic and compare it to the appropriate critical value. For example, the null hypothesis is rejected at the 5% significance level if the absolute value of the t-statistic exceeds 1.96.
[5] CONFIDENCE INTERVALS: (Ȳ_m − Ȳ_w) ± 1.96 SE(Ȳ_m − Ȳ_w).

KEY! LINK BETWEEN STATISTICS AND ECONOMETRICS
Recall that a randomized controlled experiment randomly selects subjects from a population of interest, then randomly assigns them either to a treatment group, which receives the experimental treatment, or to a control group, which does not receive the treatment.
The difference between the sample means of the treatment and control groups is an estimator of the causal effect of the treatment.
This effect can be expressed as the difference of two conditional expectations. Specifically, the causal effect on Y of treatment level x is the difference in the conditional expectations, E(Y | X = x) − E(Y | X = 0).
BINARY CASE: The causal effect can be estimated by the difference in the sample average outcomes between the treatment and control groups.
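The difference-in-means estimator, its standard error and its t-statistic can be sketched as follows (the outcome data for the two groups are invented):

```python
import math
import statistics

# Hypothetical outcomes for a treatment group and a control group
treated = [21.3, 19.8, 24.1, 22.5, 20.7, 23.4, 18.9, 22.0, 21.6, 20.2]
control = [19.1, 18.4, 21.7, 20.3, 17.8, 20.9, 18.2, 19.6, 20.0, 18.7]

# Difference in sample averages estimates the causal effect of the treatment
diff = statistics.fmean(treated) - statistics.fmean(control)

# SE of the difference: sqrt(s_t^2 / n_t + s_c^2 / n_c)
se = math.sqrt(statistics.variance(treated) / len(treated)
               + statistics.variance(control) / len(control))

t = diff / se  # compare |t| with 1.96 for a 5% significance level
```

The same computation applies to the men-vs-women earnings example: only the group labels change.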
T-STATISTIC WHEN N IS SMALL = when the sample size is small, the standard normal distribution can provide a poor approximation to the distribution of the t-statistic.
 CASE I: POPULATION WITH ANY KIND OF DISTRIBUTION AND N LARGE → NORMAL DISTRIBUTION
 CASE II: POPULATION IS NORMALLY DISTRIBUTED BUT N SMALL → STUDENT t DISTRIBUTION with n − 1 degrees of freedom
 CASE III: POPULATION WITH ANY KIND OF DISTRIBUTION BUT N SMALL → PROBLEMS!
 CASE IV (DIFFERENCE IN MEANS): IF BOTH POPULATIONS ARE NORMAL AND N LARGE → NORMAL DISTRIBUTION
 CASE V (DIFFERENCE IN MEANS): IF BOTH POPULATIONS ARE NORMAL AND THEIR VARIANCES ARE EQUAL → STUDENT t DISTRIBUTION with n_m + n_w − 2 degrees of freedom

MAIN IDEA! In practice, the difference between the Student t distribution and the standard normal distribution is negligible if the sample size is large enough. In all the applications of this book, the sample sizes are in the hundreds or thousands, so we will always use the large-sample standard normal approximation.
SAMPLE CORRELATION COEFFICIENT = is an estimator of the population correlation coefficient and measures the linear relationship between two variables; that is, how well their scatterplot is approximated by a straight line.
REMEMBER! A high correlation coefficient does not necessarily mean that the line has a steep slope; rather, it means that the points in the scatterplot fall very close to a straight line.
Finally, like the sample variance, the sample covariance is consistent (s_XY converges in probability to σ_XY) and, consequently, the sample correlation coefficient is consistent (r_XY converges in probability to corr(X, Y)).
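Consistency can be illustrated by simulation: with a large sample, the sample correlation lands close to the population correlation. A sketch where Y = 2X + noise, so that corr(X, Y) = 2/√5 ≈ 0.894 is known exactly (seed and sample size are arbitrary):

```python
import math
import random
import statistics

random.seed(7)

def sample_corr(x, y):
    """Sample correlation: sample covariance over the product of sample sds."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    cov = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

# Population: Y = 2X + standard normal noise, X standard normal,
# so cov(X, Y) = 2, sd(Y) = sqrt(5), and corr(X, Y) = 2 / sqrt(5)
n = 50000
x = [random.gauss(0, 1) for _ in range(n)]
y = [2 * xi + random.gauss(0, 1) for xi in x]

r = sample_corr(x, y)
```

Rerunning with a larger n drives r even closer to 2/√5, which is what consistency means in practice.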