Econometric analysis of accounting choice

Spring 2014





Explore issues of (broad) interest raised in colloquium, papers, or formative ideas from a causal perspective.


1. Describe the general problem.

2. Identify a specific (causal) research question.

3. Formulate a research design.

4. Discuss interpretation of potential results.


Closely attend to the role of probability assignment and counterfactuals in devising an identification strategy and data analysis.




1. Identify a topic of interest.

2. Apply the above outline to the subject.

3. Simulate data and apply your research design.

4. If you have archival data, subject these data to your vetted design and interpret the results; otherwise, interpret the results for the simulated data.
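
As a minimal sketch of step 3 (simulate data and apply your research design), the example below generates data with a known causal effect and compares two probability assignments for treatment: randomization and self-selection driven by an omitted factor. It is written in Python using only the standard library, not the R used in the posted sample programs. The function names, the parameter values, and the data-generating process are all illustrative assumptions, not part of any assigned design.

```python
import random
import statistics


def simulate(n, true_effect, randomized, seed=0):
    """Simulate n units; return observed outcomes split by treatment status."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)              # omitted factor, unobserved by the analyst
        y0 = 1.0 + u + rng.gauss(0.0, 1.0)   # potential outcome without treatment
        y1 = y0 + true_effect                # potential outcome with treatment
        if randomized:
            d = rng.random() < 0.5           # assignment unrelated to u
        else:
            d = u + rng.gauss(0.0, 1.0) > 0  # self-selection driven partly by u
        if d:
            treated.append(y1)               # y1 observed; y0 is the counterfactual
        else:
            untreated.append(y0)             # y0 observed; y1 is the counterfactual
    return treated, untreated


def naive_estimate(treated, untreated):
    """Difference in observed group means."""
    return statistics.fmean(treated) - statistics.fmean(untreated)


TRUE_EFFECT = 2.0  # assumed, so the design can be checked against a known answer
est_randomized = naive_estimate(*simulate(20000, TRUE_EFFECT, randomized=True))
est_selected = naive_estimate(*simulate(20000, TRUE_EFFECT, randomized=False))
print("true effect:         ", TRUE_EFFECT)
print("randomized design:   ", round(est_randomized, 3))
print("self-selected design:", round(est_selected, 3))
```

Because the effect is known by construction, the simulation shows whether a design recovers it. Under these assumed parameters the difference in means should land near the true effect when assignment is randomized and should overstate it when the omitted factor drives both choice and outcome.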


"No successful mechanical algorithm for discovering causal or structural models has yet been produced, and it is unlikely that one will ever be found. At the same time, it is unlikely that the quest for a mechanical algorithm for determining causality from data will ever be abandoned. The tension between the use of tacit knowledge and formal algorithmic methods is likely to be a permanent feature of empirical research in economics. It arises because in most empirical studies there is always more knowledge about a problem being studied than appears in the sampling distributions of the measured variables being analyzed or in well-specified Bayesian priors. The best empirical work in economics uses economic theory as a framework for integrating all of the available evidence, tacit and algorithmic, to tell a convincing story."


– James Heckman, Quarterly Journal of Economics 115, no. 1 (Feb. 2000), p. 46.




The objectives are (i) to build a foundation for meaningful discussions, (ii) to complement econometrics (and statistics) courses (in my view, coursework provides an opportunity to survey what's out there – this is an opportunity to develop a deeper understanding), and (iii) to explore implications for accounting.


Monograph: Accounting and causal effects: Econometric challenges


Work in progress: Accounting, managerial experimentation and causal effects


AAA August 2012 slides: Accounting and causal effects


Fellingham monograph: Accounting as an information science


Imbens and Wooldridge’s NBER econometric minicourse



Hui, K., S. Klasa, and E. Yeung, 2011, “Corporate suppliers and customers and accounting conservatism,” working paper.


Kelly, J. “A new interpretation of information rate,” The Bell System Technical Journal, July 1956, 917-926.


Notes on maximum entropy probability assignment conditional on the Kelly criterion




Li, Chen, 2012, “Mutual monitoring within the management team: A structural modeling investigation,” Carnegie Mellon working paper.


Margiotta and Miller, 2000, “Managerial compensation and the cost of moral hazard,” International Economic Review 41 (3), 669-719.


Gayle and Miller, 2009, “Has moral hazard become a more important factor in managerial compensation?” American Economic Review 99 (5), 1740-1769.


Gayle and Miller, 2012, “Identifying and testing models of managerial compensation,” Carnegie Mellon working paper.


Ross, 2011, The Recovery Theorem.


Probability assignment from equilibrium prices: Kelly criterion investors assigning recovery theorem beliefs


Dybvig and Ross, 2003, Arbitrage, state prices and portfolio theory, Handbook of the Economics of Finance.


Beaver, Landsman, and Owens, “Asymmetry in earnings timeliness and persistence: A simultaneous equations approach,” Review of Accounting Studies, forthcoming.






Below is a list of potential topics for the summer seminar. We surely will not cover them all (some will naturally be combined in the course of our discussions). This is an ambitious, time- and energy-taxing endeavor, but the seminar is purely optional (it's an opportunity, not an obligation). If you decide to participate, I would like you to choose a topic and a tentative date for discussion. I believe you'll get more out of the seminar if you actively participate; therefore, my plan for this summer is to give you some latitude in leading the discussion. Don't think you have to exhaust the topic. That's not possible, as many of these topics warrant a course of their own. Do not expect closure.


There is some coverage of each of these issues in my monograph, Accounting and causal effects: Econometric challenges, but don't feel limited to those pages. The monograph is posted below, along with some sample R programs and supplemental material developed (and still under development) after the monograph was completed.

As you know, linear algebra is an indispensable tool for these explorations. Amongst the supplemental materials is an appendix (not listed with the topics below) which includes a brief survey of some important concepts from linear algebra along with some examples.


Make use of the best tools at your disposal (spreadsheets, Mathematica, Matlab, etc.). I find R, the open source statistical software, to be invaluable especially for computationally intensive work like McMC simulation. Since it's open source, it evolves rapidly and many of the contributors have fabulous programming skills that make R extraordinarily efficient. I plan to post a few sample programs to, hopefully, ease the learning curve but expect to discover your own tricks for gaining efficiency (sadly, not my forte).



Treatment effects (a special case of causal effects) receive a substantial amount of attention in the topics below as well as in the monograph. Some key features (illustrated in the Tuebingen-style and supplementary pages examples) are


omitted, correlated variables (personally, I devote 99% of my efforts to this problem; in this case it leads to potential selection bias)


counterfactuals & inferring population-level parameters (this distinguishes the treatment effect problem from, say, a run-of-the-mill endogenous regressor problem and is a source of controversy among scientists; see Dawid)


common support (how confident are you extrapolating outside the relevant range?)


there are numerous definitions of treatment effects (average treatment effect, average treatment effect on the treated, average treatment effect on the untreated, local average treatment effect, marginal treatment effect, policy-relevant treatment effect, ambiguity of treatment effects identified by linear IV, etc.); not all of them are accessible without full support, and they are not all equally well suited to the problem at hand


choice of instruments is even more challenging than usual and greater care is required in interpreting the results


higher explanatory power (in either the selection or outcome equations) does not necessarily lead to a well-identified model
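
To make the point about competing treatment effect definitions concrete, here is a small simulated sketch in Python (standard library only, rather than the R of the posted samples). It assumes a deliberately simple selection-on-gains setup: each unit has its own effect and chooses treatment when that effect exceeds a common cost. The function name, the normal distribution of gains, and the cost value are hypothetical choices for illustration. Individual gains are visible only because the data are simulated; with archival data one potential outcome is always counterfactual.

```python
import random
import statistics


def simulate_selection_on_gains(n=50000, cost=1.0, seed=1):
    """Return ATE, ATT, ATUT and the treated share when units select on their own gain."""
    rng = random.Random(seed)
    gains, gains_treated, gains_untreated = [], [], []
    for _ in range(n):
        gain = rng.gauss(1.0, 1.0)   # unit-specific effect y1 - y0 (assumed normal)
        d = gain > cost              # treatment chosen when the gain exceeds the cost
        gains.append(gain)
        if d:
            gains_treated.append(gain)
        else:
            gains_untreated.append(gain)
    return {
        "ATE": statistics.fmean(gains),             # average over everyone
        "ATT": statistics.fmean(gains_treated),     # average over the treated
        "ATUT": statistics.fmean(gains_untreated),  # average over the untreated
        "share_treated": len(gains_treated) / n,
    }


effects = simulate_selection_on_gains()
for name, value in effects.items():
    print(name, round(value, 3))
```

With heterogeneous effects and selection on gains, the three averages separate (ATT above ATE above ATUT in this setup), and ATE is the treated-share-weighted average of ATT and ATUT. Which one matters depends on the question being asked, which is why the choice of estimand belongs in the research design.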





Linear models (OLS, GLS, FWL, etc.)


       Equilibrium earnings management (chapters 2 & 3)


Fixed effects & differences-in-differences and random effects & random coefficients


Linear instrumental variables (IV)


Discrete choice models


Nonlinear regression


Maximum likelihood estimation (James-Stein shrinkage estimators)


Nonparametric & semiparametric regression


Bootstrapping & posterior simulation


McMC (Markov chain Monte Carlo) simulation (chapters 7 & 12 and supplemental materials Bayesian notes)


Strategic choice models


Treatment effects (chapter 8 Tuebingen-style examples)


Ignorable treatment effects (chapter 9 Tuebingen-style examples)


       Identification strategies: exogenous dummy variable regression, nonparametric regression, propensity score & propensity score matching, control functions, regression discontinuity design, etc. (mean conditional independence (ignorability) vis-a-vis conditional independence (strong ignorability))


       Asset revaluation regulation (limitations of outcome measures for inferring welfare effects; chapters 2 & 9)


Nonignorable (IV) treatment effects (chapter 10 Tuebingen-style examples)


       Generalized Roy model interpretation


       Identification strategies: endogenous dummy variable IV regression, propensity score IV, ordinate control function IV, inverse Mills control function IV, LATE and linear IV, etc.


       IV control functions & projections examples (ANOVA, ANCOVA to control functions examples in supplemental materials projections notes)


       Continuous treatment & (correlated) random coefficients


       Regulated report precision (nonignorable versus ignorable identification strategy effectiveness; chapters 2 & 10)


Marginal treatment effects


       Identification and LIV (local instrumental variables)


       Regulated report precision (apparent nonnormality & marginal treatment effects; chapters 2 & 11)


Bayesian treatment effects


       Identification via McMC data augmentation


       Identification & McMC IV (restricted) data augmentation (supplemental materials Bayesian notes)


       Regulated report precision (marginal & average treatment effects as well as policy-relevant treatment effects; chapters 2 & 12)


Informed priors & Bayesian data analysis (probability as logic)


       Maximum entropy probability assignment & Jaynes' widget problem


       Conjugate families (supplemental materials Bayesian notes)


       Inverting financial statements (to recover transactions)


       Smooth accruals (valuation and performance evaluation implications)


       Earnings management (stochastic & selective manipulation)


Strategic disclosure


Quantum production, synergy, and (product market) strategic disclosure


Other topics of interest to you



Monograph: Accounting and causal effects: Econometric challenges


Table of contents for Accounting and causal effects: Econometric challenges

1. Introduction

2. Accounting choice

3. Linear models

         - including OLS, GLS, FWL, and IV estimators

4. Loss functions & estimation

         - including MLE, James-Stein shrinkage estimators, and nonlinear regression

5. Discrete choice models

         - employed as propensity score in treatment effect analysis

6. Nonparametric regression models

         - employed with Heckman's MTE

7. Repeated-sampling inference

         - including McMC Bayesian analysis

8. Overview of endogeneity

9. Treatment effects: ignorability

10. Treatment effects: IV

11. Marginal treatment effects

12. Bayesian treatment effects

13. Informed priors

         - including widget example

         - including inferring transactions from financial statements

Appendix. Asymptotic theory

         - including convergence in probability, convergence in distribution, and rates of convergence





Errata (corrections for the 2010 Springer printed version)


Supplemental materials: (most of this is work in progress – expect updates)


1. Probability as logic

2. Accounting, managerial experimentation and causal effects


ch. 1 introduction


                  stewardship and the search for a better “mouse trap”


ch. 2 classical linear models


                  ANOVA, ANCOVA, linear regression


                  double residual regression (FWL)


ch. 3 classical causal effects strategies


                  ignorable treatment


                  limited common support


                  regression discontinuity designs


                  synthetic controls


                  dynamic treatment effects


                  LATE & 2SLS (linear) IV


                  control functions


                  continuous treatment effects


         Causal effects and difference-in-difference designs


         Causal effects and propensity-score matched regression


         Causal effects and saturated propensity-score matched regression


ch. 4 maximum entropy distributions


                  Bayes’ theorem & consistent reasoning


                  maximum entropy probability assignment


ch. 5 loss functions


ch. 6 conjugate families



ch. 7 Bayesian simulation


                  posterior simulation & conjugate families


                  independent & conditional posterior simulation


         Markov chain Monte Carlo (McMC) simulation


                  irreducibility, time reversibility, & stationarity


                  Gibbs sampler


                  Metropolis-Hastings (MH) algorithm


                  data augmented sampler for Probit


                  random walk MH logit


                  uniform data augmented Gibbs sampler for logit


ch. 8 Bayesian regression


                  ANOVA, ANCOVA, regression



ch. 9 Bayesian causal effects strategies


                  data augmented Gibbs sampler for treatment effects


                  data augmented IV restricted Gibbs sampler for treatment effects


ch. 10 Bayesian treatment effects without joint distribution of outcomes


                  Chib’s ATE and LATE


                  extensions (via bounding) to ATT and ATUT


ch. 11 Partial identification and missing data


                  Missing outcomes




                           expected values (point and partial identification)




                           respecting stochastic dominance




                           missing covariates and outcomes


                           missing at random


                           missing by choice


                           stochastic dominance and treatment effects


                  Selection problem


                           various instrumental variable strategies (missing at random, statistical independence, means missing at random, mean independence)


                           monotone instrumental variable strategies (mean monotonicity, exogenous treatment selection, monotone treatment selection, monotone treatment response, MM-MTR, MTR-MTS)


                           quantile treatment effects (point and partial identification)


                  Extrapolation and the mixing problem


                  Appendix: bounds on spreads




         linear algebra basics


         fundamental theorem of linear algebra




                  exactly, under-, and over-identified


         matrix decomposition


                  LU factorization


                  Cholesky decomposition


                  singular value decomposition (SVD) & pseudo inverse


                  spectral decomposition


         Gram-Schmidt orthogonalization


         some determinant identities (& LU factorization)


         iterated expectations


         multivariate normal theory


         generalized least squares (GLS)


         two stage least squares IV (2SLS-IV)


         seemingly unrelated regression (SUR)


         maximum likelihood estimation of discrete choice models


         quantum information


         common distributions




R Statistical Computing Package

R is a freely available, open-source implementation of the S language (the language behind S-PLUS).

Follow this link: R statistical computing package


R sample programs


Tutorials: R tutorial


find your favorite on the web


source files for data augmented Gibbs sampler bivariate probit:

You will need to install the bayesm package from the CRAN website if it is not included in your base installation.




source files for Metropolis-Hastings bivariate logit (self-contained code and code utilizing MCMCpack library which you may need to add)




source files for data augmented Gibbs sampler bivariate selection analysis of treatment effects:

You will need to install the MASS package from the CRAN website if it is not included in your base installation.









Another Bayesian McMC strategy for identifying treatment effects (perhaps, with panel data) can be found in Chib’s papers posted below:


Chib 2007

Chib and Jacobi 2007

Chib and Jacobi 2008

Chib and Jacobi 2009


Strategic disclosure papers include:

Shin, 1994, "News management and the value of firms," RAND Journal of Economics, 25(1), 58-71.

Shin, 2003, "Disclosures and asset returns," Econometrica, 71(1), 105-133.

Shin, 2006, “Disclosure risk and price drift,” Journal of Accounting Research, 44(2), 351-379.

Arya and Mittendorf, “The interaction between corporate tax structure and disclosure policy,” Annals of Finance, forthcoming.

Arya and Mittendorf, “Input markets and the strategic organization of the firm,” Foundations and Trends in Accounting, 5(1), 1-97.