Finding Good Resources

This past week I did not put as much time into my project as I would have liked, but I still managed to be somewhat productive. While searching for information on the advantages of panel data, I stumbled across a web page for a graduate-level course called “Applied Econometrics: Topics in the Analysis of Panel Data,” taught by William H. Greene.

William H. Greene is a very well-respected author in the field of econometrics, and apparently a professor of economics at NYU. The website has PowerPoint slides for all of his lectures, which will be incredibly valuable in helping me learn and explain panel data modeling.

Professor Hunnicutt from the ECON department here at PLU is also letting me borrow one of her graduate-level econometrics textbooks, which happens to be written by William H. Greene as well.

For my capstone I intend to look at the mathematics behind regression analysis using panel data, also called panel data modeling. A panel of data has two components: a cross-section and a time series. For instance, the data I am using consists of one observation per year for each of the 58 counties of California over a span of 8 years (2000-2007). This means that each variable in the regression has 464 ($58 \cdot 8$) observations. An advantage of this approach is the ability to account for variability over time as well as across the cross-section. It also allows for analysis of data with a limited number of observations over time (provided there are enough cross-sectional observations) or a limited number of observations over the cross-section (provided there are enough time-series observations).
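
To make the observation count concrete, here is a minimal sketch of how a balanced panel of this shape could be indexed. The county labels are hypothetical (the counties are simply numbered 1 through 58), and nothing here reflects the actual data set beyond its dimensions.

```python
from itertools import product

# Hypothetical labels: counties are just numbered 1..58 for illustration.
counties = range(1, 59)
years = range(2000, 2008)  # 2000 through 2007 inclusive

# A balanced panel has exactly one observation for every (county, year) pair.
panel_index = list(product(counties, years))

n_units = len(counties)
n_periods = len(years)
n_obs = len(panel_index)

print(n_units, n_periods, n_obs)  # 58 8 464
```

Each variable in the regression would then hold one value per entry of `panel_index`.
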
The general form of a panel data model is $y_{it} = \alpha_i + \beta'x_{it} + \epsilon_{it}$. In the model, $i$ indexes the cross-sectional units, $t$ indexes the time periods, $y_{it}$ is the dependent variable, $x_{it}$ is the vector of independent variables, $\alpha_i$ is the individual effect for unit $i$, $\beta$ is the vector of coefficients on the independent variables (the prime denotes its transpose, so $\beta'x_{it}$ is a single number), and $\epsilon_{it}$ is the error term. This is only the general form of the panel data model. The specific variations of the model that I will be looking at will be discussed in a later blog post.
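
As a rough illustration of how the pieces of the general form fit together, the sketch below evaluates $y_{it} = \alpha_i + \beta'x_{it} + \epsilon_{it}$ on a small simulated panel. Everything in it is an assumption made for illustration: the function name `panel_y`, the panel size, the coefficient values, and the normally distributed draws. It is not an estimation procedure and does not use the county data.

```python
import random

def panel_y(alpha_i, beta, x_it, eps_it):
    """Evaluate y_it = alpha_i + beta'x_it + eps_it for one unit i and period t."""
    return alpha_i + sum(b * x for b, x in zip(beta, x_it)) + eps_it

random.seed(0)  # fixed seed so the simulated numbers are reproducible

n_units, n_periods = 3, 4   # small hypothetical panel, not the 58 x 8 county data
beta = [2.0, -1.0]          # hypothetical coefficients for two independent variables
alpha = [random.gauss(0, 1) for _ in range(n_units)]  # one individual effect per unit

y = {}
for i in range(n_units):
    for t in range(n_periods):
        x_it = [random.gauss(0, 1) for _ in beta]  # independent variables for (i, t)
        eps_it = random.gauss(0, 0.1)              # error term for (i, t)
        y[(i, t)] = panel_y(alpha[i], beta, x_it, eps_it)

print(len(y))  # 12 observations: 3 units x 4 periods
```

Note that $\alpha_i$ is shared by every period of the same unit, while $\beta$ is shared by all units and periods.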