TOPIC 1 (2015)
Notes language: English
University: Universidad Pompeu Fabra (UPF)
Degree: Business Administration and Management, 2nd year
Course: Econometrics I
Year of the notes: 2015
Pages: 13
Upload date: 10/04/2016
Uploaded by: laparicioimbuluzqueta
ECONOMETRICS I

2015
INDEX
INTRODUCTION
TOPIC 1: REVIEW OF PROBABILITY AND REVIEW OF STATISTICS
  REVIEW OF PROBABILITY
  REVIEW OF STATISTICS
TOPIC 2: INTRODUCTION TO LINEAR REGRESSION
  LINEAR REGRESSION WITH ONE REGRESSOR
  REGRESSION WITH A SINGLE REGRESSOR: HYPOTHESIS TESTS AND CONFIDENCE INTERVALS
TOPIC 3: MULTIPLE REGRESSION
  LINEAR REGRESSION WITH MULTIPLE REGRESSORS: ESTIMATION
  HYPOTHESIS TESTS AND CONFIDENCE INTERVALS IN MULTIPLE REGRESSION: INFERENCE
TOPIC 4: NONLINEAR REGRESSION FUNCTIONS
TOPIC 5: ASSESSING STUDIES BASED ON MULTIPLE REGRESSION
Laura Aparicio
INTRODUCTION
ECONOMETRICS = is the science of using economic theory and statistical techniques to analyse economic data.
MAIN QUESTION OF THE COURSE: Does reducing class size improve elementary school education? More generally: how does a change in one variable affect another variable?
To address this quantitative question we will need to use data, so our answers will always have some degree of uncertainty. Therefore, the conceptual framework for the analysis needs to provide both:
① A numerical answer to the question and,
② A measure of how precise our answer is.
The conceptual framework that we will use is the MULTIPLE REGRESSION MODEL.
CAUSAL EFFECT = is the effect on an outcome of a given action or treatment, as measured in an ideal randomized controlled experiment (METHODOLOGY: there is a control group that receives no treatment and a treatment group that receives it. Moreover, the treatment is assigned randomly, which eliminates any systematic relationship between the treatment and the other factors that affect the outcome).
In such an experiment, the only systematic reason for differences in outcomes between the treatment and the control groups is the treatment itself.
PROBLEM? In practice it is often not possible to perform ideal experiments: they can be unethical, impractical or too expensive.
DATA SOURCES =
EXPERIMENTAL DATA. Experiments designed to evaluate a treatment or policy or to investigate a causal effect.
OBSERVATIONAL DATA. Data obtained by observing actual behaviour through surveys and administrative records. With observational data it's more complicated to estimate the causal effect!
TYPES OF DATA =
CROSS-SECTIONAL DATA. Gathered by observing multiple entities at a single point in time.
o Example: test scores in 420 school districts from California in 1999.
TIME SERIES DATA. Gathered by observing a single entity at multiple points in time.
o Example: inflation in the United States in the period 1959-2004.
PANEL DATA. Gathered by observing multiple entities, each of which is observed at multiple points in time.
o Example: cigarette consumption for U.S. states in the period 1985-1995.
TOPIC 1: REVIEW OF PROBABILITY AND REVIEW OF STATISTICS
REVIEW OF PROBABILITY
PROBABILITY = is the proportion of the time that an outcome occurs in the long run.
RANDOM VARIABLE = numerical summary of a random outcome. Random variables can be: discrete or continuous.
KINDS OF PROBABILITY DISTRIBUTION =
I. Probability distribution function: list of all possible values of the variable and the probability that each value will occur [DISCRETE VARIABLES].
II. Probability density function: the area under the density between two points is the probability that the random variable falls between those two points [CONTINUOUS VARIABLES].
III. Cumulative distribution function (φ): is the probability that the random variable is less than or equal to a particular value.
IV. Joint probability distribution: is the probability that the random variables simultaneously take on certain values, say x and y: $\Pr(X = x, Y = y)$
V. Marginal probability distribution: is used to distinguish the distribution of Y alone (the marginal distribution) from the joint distribution of Y and another random variable: $\Pr(Y = y) = \sum_{i=1}^{l} \Pr(X = x_i, Y = y)$
VI. Conditional distribution: is the distribution of a random variable, Y, conditional on another random variable, X, taking on a specific value: $\Pr(Y = y \mid X = x) = \dfrac{\Pr(X = x, Y = y)}{\Pr(X = x)}$
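The marginal and conditional formulas above can be checked on a small table. The sketch below is only an illustration: the joint probabilities and the function names are made up for this example and do not come from the course.

```python
# Hypothetical joint distribution Pr(X = x, Y = y) for two binary variables.
# Keys are (x, y) pairs; the four probabilities sum to 1.
joint = {
    (0, 0): 0.15, (0, 1): 0.15,
    (1, 0): 0.07, (1, 1): 0.63,
}

def marginal_x(joint, x):
    """Pr(X = x): sum the joint probabilities over all values of Y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(joint, y):
    """Pr(Y = y): sum the joint probabilities over all values of X."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def conditional_y_given_x(joint, y, x):
    """Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x)."""
    return joint[(x, y)] / marginal_x(joint, x)

print("Pr(Y = 1)         =", marginal_y(joint, 1))
print("Pr(Y = 1 | X = 1) =", conditional_y_given_x(joint, 1, 1))
```

In this made-up table, knowing that X = 1 raises the probability of Y = 1 (about 0.9 against about 0.78), so X and Y are not independent.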
MAIN FORMULAS =
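EXPECTED VALUE ($\mu_Y$): long-run average value of the random variable over many repeated trials. For a discrete random variable taking values $y_1, \dots, y_k$ with probabilities $p_1, \dots, p_k$: $E(Y) = \sum_{i=1}^{k} y_i p_i$
VARIANCE: measures the dispersion, or “spread”, of a probability distribution: $\sigma_Y^2 = Var(Y) = E[(Y - \mu_Y)^2]$
STANDARD DEVIATION: $\sigma_Y = \sqrt{Var(Y)}$
SKEWNESS: measures the lack of symmetry: $E[(Y - \mu_Y)^3] / \sigma_Y^3$. In the case of a normal distribution it's equal to 0.
KURTOSIS: measures how much mass is in the tails, in other words, how much of the variance arises from extreme values: $E[(Y - \mu_Y)^4] / \sigma_Y^4$. The greater the kurtosis, the more likely are outliers. In the case of a normal distribution, it's equal to 3.
CONDITIONAL EXPECTATION: $E(Y \mid X = x) = \sum_{i=1}^{k} y_i \Pr(Y = y_i \mid X = x)$
CONDITIONAL VARIANCE: $Var(Y \mid X = x) = \sum_{i=1}^{k} [y_i - E(Y \mid X = x)]^2 \Pr(Y = y_i \mid X = x)$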
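As a quick numerical illustration of these moments, the sketch below computes them for a small discrete distribution using the usual definitions (mean $\sum y_i p_i$, variance $E[(Y-\mu_Y)^2]$, skewness $E[(Y-\mu_Y)^3]/\sigma_Y^3$, kurtosis $E[(Y-\mu_Y)^4]/\sigma_Y^4$). The values and probabilities are hypothetical.

```python
import math

# Hypothetical discrete distribution: possible values of Y and their probabilities.
values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]

def expectation(values, probs, g=lambda y: y):
    """E[g(Y)] for a discrete random variable: sum of g(y_i) * p_i."""
    return sum(g(y) * p for y, p in zip(values, probs))

mu = expectation(values, probs)                              # expected value
var = expectation(values, probs, lambda y: (y - mu) ** 2)    # variance
sd = math.sqrt(var)                                          # standard deviation
skew = expectation(values, probs, lambda y: (y - mu) ** 3) / sd ** 3
kurt = expectation(values, probs, lambda y: (y - mu) ** 4) / sd ** 4

print("mean:", mu, "variance:", var, "sd:", sd)
print("skewness:", skew, "kurtosis:", kurt)
```

Here the skewness is negative (the long tail is on the left) and the kurtosis is below 3, so this distribution has less mass in its tails than a normal distribution.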
LAW OF ITERATED EXPECTATIONS = the mean of Y is the weighted average of the conditional expectation of Y given X, weighted by
the probability distribution of X.
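In symbols, the law says $E(Y) = \sum_i E(Y \mid X = x_i) \Pr(X = x_i)$. A minimal check on a hypothetical joint distribution (the numbers are invented for illustration) computes the mean of Y both directly and through the conditional means:

```python
# Hypothetical joint distribution Pr(X = x, Y = y); keys are (x, y) pairs.
joint = {
    (0, 0): 0.15, (0, 1): 0.15,
    (1, 0): 0.07, (1, 1): 0.63,
}
x_values = sorted({x for x, _ in joint})
y_values = sorted({y for _, y in joint})

def pr_x(x):
    """Marginal probability Pr(X = x)."""
    return sum(joint[(x, y)] for y in y_values)

def cond_mean_y(x):
    """Conditional mean E(Y | X = x)."""
    return sum(y * joint[(x, y)] for y in y_values) / pr_x(x)

# Mean of Y computed directly from the joint distribution.
mean_direct = sum(y * p for (_, y), p in joint.items())
# Mean of Y as the weighted average of the conditional means.
mean_iterated = sum(cond_mean_y(x) * pr_x(x) for x in x_values)

print(mean_direct, mean_iterated)  # the two numbers agree up to rounding
```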
CONCEPT OF INDEPENDENCE = two random variables X and Y are independently distributed, or independent, if knowing the value of one of the variables provides no information about the other; that is, $\Pr(Y = y \mid X = x) = \Pr(Y = y)$ for all x and y.
This in turn means that if two variables are independent, their covariance is 0, so the variance of their sum is just the sum of their variances: $Var(X + Y) = Var(X) + Var(Y)$
COVARIANCE = is a measure of the extent to which two random variables move together: $cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$. It can take any value (positive, 0 or negative).
CORRELATION = is an alternative measure of dependence between X and Y that solves the “units” problem of the covariance: $corr(X, Y) = \dfrac{cov(X, Y)}{\sigma_X \sigma_Y}$, which always lies between -1 and 1. The random variables X and Y are said to be uncorrelated if $corr(X, Y) = 0$.
If the conditional mean of Y given X does not depend on X, then X and Y are uncorrelated. However, the converse is NOT necessarily true: if X and Y are uncorrelated, the conditional mean of Y given X may still depend on X.
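A standard textbook counterexample makes this concrete. The sketch below assumes X takes the values -1, 0 and 1 with equal probability and sets Y = X²; this choice is an illustration, not an example from the course. The covariance is exactly zero, yet the conditional mean of Y clearly depends on X.

```python
from fractions import Fraction

# Assumed example: X is -1, 0 or 1 with probability 1/3 each, and Y = X**2.
x_values = [-1, 0, 1]
p = Fraction(1, 3)

mean_x = sum(x * p for x in x_values)        # E(X) = 0
mean_y = sum(x ** 2 * p for x in x_values)   # E(Y) = 2/3

# cov(X, Y) = E[(X - mu_X)(Y - mu_Y)], computed exactly with fractions.
cov_xy = sum((x - mean_x) * (x ** 2 - mean_y) * p for x in x_values)

# Y is a function of X, so E(Y | X = x) is simply x**2.
cond_mean_y = {x: x ** 2 for x in x_values}

print("cov(X, Y) =", cov_xy)        # zero: X and Y are uncorrelated
print("E(Y | X = x):", cond_mean_y)  # but the conditional mean varies with x
```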
MAIN PROPERTIES =
MOST KNOWN DISTRIBUTIONS =
THE NORMAL DISTRIBUTION: a continuous random variable with a normal distribution has a bell-shaped probability density. The standard normal distribution, with mean 0 and variance 1, is denoted as N(0,1). It's symmetric, so its skewness is 0, and its kurtosis is 3 (1)
THE CHI-SQUARED DISTRIBUTION: is the distribution of the sum of m squared independent standard normal random variables. This distribution depends on m, which is called the degrees of freedom of the chi-squared distribution. It's denoted as $\chi^2_m$
THE STUDENT t DISTRIBUTION: with m degrees of freedom is defined to be the distribution of the ratio of a standard normal random variable to the square root of an independently distributed chi-squared random variable with m degrees of freedom divided by m. It is denoted as $t_m$. It also has a bell shape similar to the normal distribution, but when m is small (20 or less), it has more mass in the tails ("flatter") (2)
THE F DISTRIBUTION: is defined to be the distribution of the ratio of a chi-squared random variable with degrees of freedom m, divided by m, to an independently distributed chi-squared random variable with degrees of freedom n, divided by n: $\dfrac{W/m}{V/n} \sim F_{m,n}$
(1) To calculate a probability associated with a normal random variable, first standardize the variable, then use the standard normal cumulative distribution.
(2) When m is 30 or more, the Student t distribution is well approximated by the standard normal distribution, and the $t_\infty$ distribution equals the standard normal distribution.
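Note (1) can be sketched in a few lines with Python's standard library. The mean, standard deviation and cutoff below are hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical example: Y ~ N(1, 4), so mu = 1 and sigma = 2. Find Pr(Y <= 2).
mu, sigma = 1.0, 2.0
c = 2.0

# Step 1: standardize the cutoff.
z = (c - mu) / sigma

# Step 2: evaluate the standard normal cumulative distribution at z.
prob = NormalDist().cdf(z)

# Cross-check: the same probability computed without standardizing by hand.
prob_direct = NormalDist(mu, sigma).cdf(c)

print("z =", z)
print("Pr(Y <= 2) =", round(prob, 4), "(direct:", round(prob_direct, 4), ")")
```

Both routes give roughly 0.69, which is the value a standard normal table reports for z = 0.5.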
RANDOM SAMPLING = It produces n random observations $Y_1, \dots, Y_n$ from a population (where each member is equally likely to be included in the sample) that are independently and identically distributed (i.i.d.).
The sample average, $\bar{Y}$, varies from one randomly chosen sample to the next and thus is a random variable with a sampling distribution. If $Y_1, \dots, Y_n$ are i.i.d., then:
[1] The sampling distribution of $\bar{Y}$ has mean $\mu_Y$ and variance $\sigma_{\bar{Y}}^2 = \sigma_Y^2 / n$. These results hold whatever the distribution of $Y_i$ is. Moreover, when Y is normally distributed, $\bar{Y}$ is distributed as $N(\mu_Y, \sigma_Y^2 / n)$.
There are two approaches to characterizing sampling distributions: (I) an exact approach and (II) an “approximate” approach.
① EXACT APPROACH
If Y is normally distributed and the observations are i.i.d., then the exact distribution of $\bar{Y}$ is normal with mean $\mu_Y$ and variance $\sigma_Y^2 / n$. Unfortunately, if the distribution of Y is not normal, then in general the exact sampling distribution of $\bar{Y}$ is very complicated.
② APPROXIMATE APPROACH
It uses approximations that rely on the sample size being large. In order to approximate sampling distributions we will use two key tools:
[1] Law of large numbers: when the sample size is large, $\bar{Y}$ will be close to $\mu_Y$ with very high probability.
[2] Central limit theorem: when the sample size is large, the sampling distribution of the standardized sample average is approximately normal.
[2] The LAW OF LARGE NUMBERS says that 𝑌̅ converges in probability to 𝜇𝑌
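Results [1] and [2] can be illustrated with a small simulation. This is only a sketch: the Bernoulli population with success probability 0.78, the sample sizes and the seed are all arbitrary choices made for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

P = 0.78  # hypothetical Bernoulli population: mean P, variance P * (1 - P)

def sample_mean(n):
    """Average of n i.i.d. Bernoulli(P) draws."""
    return sum(random.random() < P for _ in range(n)) / n

# Law of large numbers: the sample average settles near the population mean.
for n in (10, 100, 10_000):
    print(n, sample_mean(n))

# Sampling distribution of the sample average for a fixed sample size.
n, reps = 25, 20_000
means = [sample_mean(n) for _ in range(reps)]
mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)

print(mean_of_means, P)               # close to the population mean
print(var_of_means, P * (1 - P) / n)  # close to sigma_Y^2 / n
```

Across the 20,000 simulated samples, the average of the sample means should land near 0.78 and their variance near 0.78 × 0.22 / 25, even though each individual observation is only 0 or 1.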
[3] The CENTRAL LIMIT THEOREM says that the standardized version of $\bar{Y}$, $(\bar{Y} - \mu_Y)/\sigma_{\bar{Y}}$, has approximately a standard normal distribution [N(0,1) distribution] when n is large, even if our observations are not themselves normally distributed!
HOW LARGE IS “LARGE ENOUGH”? It depends. Sometimes we can require n=30 or even more but, in fact, for 𝑛 ≥ 100, the
approximation to the distribution of 𝑌̅ typically is very good for a wide variety of population distributions.
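One way to see how the approximation improves with n is to simulate the standardized sample average for a clearly non-normal population and count how often it falls within ±1.96, which a standard normal variable does about 95% of the time. The exponential population, the sample sizes and the number of replications below are assumptions made for this sketch.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def standardized_mean(n):
    """Standardized sample average of n draws from an Exponential(1) population.

    This population is strongly skewed and has mean 1 and standard deviation 1,
    so the standard deviation of the sample average is 1 / sqrt(n).
    """
    y_bar = sum(random.expovariate(1.0) for _ in range(n)) / n
    return (y_bar - 1.0) / (1.0 / math.sqrt(n))

def share_within_196(n, reps=5_000):
    """Share of simulated standardized averages with absolute value <= 1.96."""
    return sum(abs(standardized_mean(n)) <= 1.96 for _ in range(reps)) / reps

results = {n: share_within_196(n) for n in (2, 30, 100)}
for n, share in results.items():
    print(n, share)  # should approach 0.95 as n grows
```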
REVIEW OF STATISTICS
STATISTICS = the key insight of statistics is that one can learn about a population distribution by selecting a random sample from the
population. Using statistical methods, we can use this sample to reach tentative conclusions (“to draw statistical inferences”) about
characteristics of the full population.
Three types of statistical methods are used throughout econometrics:
[1] Estimation
[2] Hypothesis testing
[3] Confidence intervals
ESTIMATOR VS ESTIMATE: An estimator (random variable) is a function of a sample of data to be drawn randomly from a population, whereas an estimate (nonrandom number) is the numerical value of the estimator when it is actually computed using data from a specific sample.
SAMPLE AVERAGE ($\bar{Y}$) = is an estimator of the population mean, $\mu_Y$. When $Y_1, \dots, Y_n$ are i.i.d., its properties are:
A. The sampling distribution of $\bar{Y}$ has mean $\mu_Y$ and variance $\sigma_{\bar{Y}}^2 = \sigma_Y^2 / n$
B. $\bar{Y}$ is unbiased
C. By the Law of large numbers, $\bar{Y}$ is consistent
D. By the Central limit theorem, $\bar{Y}$ has an approximately normal sampling distribution when the sample size is large.
E. $\bar{Y}$ is BLUE (Best Linear Unbiased Estimator); that is, it is the most efficient (best) estimator among all estimators that are unbiased and are linear functions of $Y_1, \dots, Y_n$.
F. $\bar{Y}$ is the least squares estimator of $\mu_Y$. It provides the best fit to the data in the sense that the average squared difference between the observations and $\bar{Y}$ is the smallest among all possible estimators.
The importance of random sampling: so far, we’ve assumed that our observations are i.i.d. This assumption
is important because nonrandom sampling can result in 𝑌̅ being biased.
HYPOTHESIS TEST = the challenge is to answer questions about the whole population based on sample evidence. We'll focus on:
a) Hypothesis tests concerning the population mean
   Example: Does the population mean of hourly earnings equal $20?
b) Hypothesis tests involving two populations
   Example: Are mean earnings the same for men and women?
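NULL HYPOTHESIS: $H_0: E(Y) = \mu_{Y,0}$
ALTERNATIVE HYPOTHESIS (TWO-SIDED): $H_1: E(Y) \neq \mu_{Y,0}$
① STANDARD ERROR = $SE(\bar{Y}) = \hat{\sigma}_{\bar{Y}} = s_Y / \sqrt{n}$
Where: $s_Y^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2$ is the sample variance.
② T-STATISTIC = $t = (\bar{Y} - \mu_{Y,0}) / SE(\bar{Y})$ is used to test the null hypothesis that the population mean takes on a particular value. If n is large, the t-statistic has a standard normal sampling distribution when the null hypothesis is true. It can also be used to calculate the p-value associated with the null hypothesis. A small p-value is evidence that the null hypothesis is false.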
③ P-VALUE = is the probability of drawing a statistic at least as adverse to the null hypothesis as the one you actually computed in your sample, assuming the null hypothesis is correct.
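Steps ① to ③ can be sketched as follows, using the usual large-sample formulas: $SE(\bar{Y}) = s_Y/\sqrt{n}$, $t = (\bar{Y} - \mu_{Y,0})/SE(\bar{Y})$ and a two-sided p-value of $2\Phi(-|t|)$. The earnings data are simulated, and the null value of $20 simply echoes the example question above; none of the numbers come from a real data set.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

random.seed(42)  # fixed seed so the sketch is reproducible

# Simulated (hypothetical) sample of hourly earnings, in dollars.
earnings = [random.gauss(21.0, 5.0) for _ in range(400)]
mu_0 = 20.0  # value of the population mean under the null hypothesis

n = len(earnings)
y_bar = fmean(earnings)        # sample average
s_y = stdev(earnings)          # sample standard deviation (n - 1 denominator)
se = s_y / math.sqrt(n)        # standard error of the sample average
t = (y_bar - mu_0) / se        # t-statistic

# Two-sided p-value from the large-sample standard normal approximation.
p_value = 2 * NormalDist().cdf(-abs(t))

print("sample mean:", round(y_bar, 3), "SE:", round(se, 3))
print("t-statistic:", round(t, 3), "p-value:", round(p_value, 5))
print("reject H0 at the 5% level:", p_value < 0.05)
```

Rejecting when the p-value is below 0.05 is the same decision rule as rejecting when |t| exceeds 1.96.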
We can make two kinds of mistakes in hypothesis testing:
TYPE I ERROR: the null hypothesis is rejected when in fact it is true.
TYPE II ERROR: the null hypothesis is not rejected when in fact it is false.
95% CONFIDENCE INTERVAL FOR 𝝁𝒀 = is an interval constructed so that it contains the true value of 𝜇𝑌 in 95% of all possible samples.
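The "95% of all possible samples" reading can be checked by simulation. The sketch below uses the standard large-sample interval $\bar{Y} \pm 1.96 \, SE(\bar{Y})$ and counts how often it contains the true mean; the normal population, its parameters and the sample size are assumptions made for this example.

```python
import math
import random
from statistics import fmean, stdev

random.seed(7)  # fixed seed so the sketch is reproducible

MU_TRUE = 20.0  # hypothetical true population mean

def ci_95(sample):
    """Large-sample 95% confidence interval: sample mean +/- 1.96 * SE."""
    se = stdev(sample) / math.sqrt(len(sample))
    y_bar = fmean(sample)
    return y_bar - 1.96 * se, y_bar + 1.96 * se

reps, n = 2_000, 200
hits = 0
for _ in range(reps):
    sample = [random.gauss(MU_TRUE, 5.0) for _ in range(n)]
    lower, upper = ci_95(sample)
    hits += lower <= MU_TRUE <= upper  # True counts as 1

coverage = hits / reps
print("share of intervals containing the true mean:", coverage)
```

Any single interval either contains the true mean or it does not; the 95% refers to the share of intervals that do across repeated samples.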
NOTE! Hypothesis tests and confidence intervals for the difference in the means of two populations are conceptually similar to tests and intervals for the mean of a single population. Until now we've focused on a single population; now we're going to analyse the difference between two populations.
COMPARING MEANS FROM DIFFERENT POPULATIONS =
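[1] HYPOTHESIS TEST: $H_0: \mu_m - \mu_w = d_0$ vs. $H_1: \mu_m - \mu_w \neq d_0$
[2] BECAUSE THE POPULATION MEANS ARE UNKNOWN, THEY MUST BE ESTIMATED FROM SAMPLES OF MEN AND WOMEN: $\bar{Y}_m - \bar{Y}_w$ is an estimator of $\mu_m - \mu_w$
[3] THE STANDARD ERROR OF THE DIFFERENCE IN MEANS IS: $SE(\bar{Y}_m - \bar{Y}_w) = \sqrt{\dfrac{s_m^2}{n_m} + \dfrac{s_w^2}{n_w}}$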
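[4] T-STATISTIC: $t = \dfrac{(\bar{Y}_m - \bar{Y}_w) - d_0}{SE(\bar{Y}_m - \bar{Y}_w)}$. With a prespecified significance level, simply calculate the t-statistic and compare it to the appropriate critical value. For example, the null hypothesis is rejected at the 5% significance level if the absolute value of the t-statistic exceeds 1.96.
[5] CONFIDENCE INTERVALS: the 95% confidence interval for $\mu_m - \mu_w$ is $(\bar{Y}_m - \bar{Y}_w) \pm 1.96 \, SE(\bar{Y}_m - \bar{Y}_w)$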
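Steps [1] to [5] can be put together in one small function. This is a sketch under the usual large-sample formulas (standard error $\sqrt{s_m^2/n_m + s_w^2/n_w}$, critical value 1.96); the function name and the simulated earnings samples are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, variance

def diff_in_means(sample_m, sample_w, d_0=0.0):
    """Test H0: mu_m - mu_w = d_0 using the large-sample normal approximation.

    Returns the estimated difference, its standard error, the t-statistic,
    the two-sided p-value and the 95% confidence interval.
    """
    n_m, n_w = len(sample_m), len(sample_w)
    diff = fmean(sample_m) - fmean(sample_w)
    # variance() uses the n - 1 denominator, i.e. the sample variance.
    se = math.sqrt(variance(sample_m) / n_m + variance(sample_w) / n_w)
    t = (diff - d_0) / se
    p_value = 2 * NormalDist().cdf(-abs(t))
    ci = (diff - 1.96 * se, diff + 1.96 * se)
    return diff, se, t, p_value, ci

random.seed(3)  # fixed seed so the sketch is reproducible
# Simulated (hypothetical) hourly earnings for two groups.
men = [random.gauss(22.0, 6.0) for _ in range(500)]
women = [random.gauss(20.0, 5.0) for _ in range(450)]

diff, se, t, p_value, ci = diff_in_means(men, women)
print("difference:", round(diff, 3), "SE:", round(se, 3))
print("t:", round(t, 3), "p-value:", round(p_value, 5))
print("95% CI:", tuple(round(b, 3) for b in ci))
```

The two samples are treated as independent and are allowed to have different variances and different sizes.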
KEY! LINK BETWEEN STATISTICS AND ECONOMETRICS
Recall that a randomized controlled experiment randomly selects subjects from a population of interest, then randomly assigns
them either to a treatment group, which receives the experimental treatment, or to a control group, which does not receive the
treatment.
The difference between the sample means of the treatment and control groups is an estimator of the causal effect of the treatment.
This effect can be expressed as the difference of two conditional expectations. Specifically, the causal effect on Y of treatment level x is the difference in the conditional expectations, $E(Y \mid X = x) - E(Y \mid X = 0)$.
BINARY CASE: The causal effect can be estimated by the difference in the sample average outcomes between the treatment and
control groups.
T-STATISTIC WHEN N IS SMALL = when the sample size is small, the standard normal distribution can provide a poor approximation to the distribution of the t-statistic.
CASE I: POPULATION WITH ANY KIND OF DISTRIBUTION AND N LARGE → NORMAL DISTRIBUTION
CASE II: POPULATION IS NORMALLY DISTRIBUTED BUT N SMALL → STUDENT t DISTRIBUTION with n - 1 degrees of freedom
CASE III: POPULATION WITH ANY KIND OF DISTRIBUTION BUT N SMALL → PROBLEMS!!
CASE IV (DIFFERENCE IN MEANS): IF BOTH POPULATIONS ARE NORMAL AND N LARGE → NORMAL DISTRIBUTION
CASE V (DIFFERENCE IN MEANS): IF BOTH POPULATIONS ARE NORMAL AND THEIR VARIANCES ARE EQUAL → STUDENT t DISTRIBUTION with $n_m + n_w - 2$ degrees of freedom
MAIN IDEA! In practice, the difference between the Student t distribution and the standard normal distribution is negligible if the sample size is large enough. In all the applications of this book, the sample sizes are in the hundreds or thousands, so we will always use the large-sample standard normal approximation.
SAMPLE CORRELATION COEFFICIENT = is an estimator of the population correlation coefficient and measures the linear relationship
between two variables; that is, how well their scatterplot is approximated by a straight line.
REMEMBER! A high correlation coefficient does not necessarily mean that the line has a steep slope; rather, it means that the points in the scatterplot fall very close to a straight line.
Finally, like the sample variance, the sample covariance is consistent ($s_{XY} \xrightarrow{p} \sigma_{XY}$) and, consequently, the sample correlation coefficient is consistent ($r_{XY} \xrightarrow{p} corr(X_i, Y_i)$).
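The point about slope versus fit can be illustrated with a short sketch. It uses the usual definitions, sample covariance $s_{XY} = \frac{1}{n-1}\sum (X_i - \bar{X})(Y_i - \bar{Y})$ and sample correlation $r_{XY} = s_{XY}/(s_X s_Y)$; the data and the function names are made up for the example.

```python
import math
from statistics import fmean

def sample_cov(x, y):
    """Sample covariance with the n - 1 denominator."""
    x_bar, y_bar = fmean(x), fmean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)

def sample_corr(x, y):
    """Sample correlation: covariance divided by both sample standard deviations."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))

# Hypothetical data: two exact straight lines with very different slopes.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_flat = [0.01 * xi for xi in x]    # slope 0.01
y_steep = [10.0 * xi for xi in x]   # slope 10

corr_flat = sample_corr(x, y_flat)
corr_steep = sample_corr(x, y_steep)
print(corr_flat, corr_steep)  # both essentially 1 despite the different slopes
```

Both correlations are 1 up to rounding, because in each case the points lie exactly on a line; the slope itself plays no role.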
...