What is the Consumption-CAPM Missing? An Information-Theoretic Framework for the Analysis of Asset Pricing Models

We study a broad class of asset pricing models in which the stochastic discount factor (SDF) can be factorized into an observable component (e.g., a parametric function of consumption) and a potentially unobservable one (e.g., habit level or the return on total wealth). Exploiting this decomposition we derive new entropy bounds that restrict the admissible regions for the SDF and its components. Without using this decomposition, we show that, to a second order approximation, entropy bounds are equivalent to the canonical Hansen-Jagannathan bounds. However, bounds based on our decomposition have higher information content, are generally tighter, and naturally exploit the restriction that the SDF is a positive random variable. In addition, our information-theoretic framework enables us to extract a non-parametric estimate of the unobservable component of the SDF. Empirically we find that this component, in addition to following a clear business cycle pattern, has significant correlation with financial market crashes unrelated to economy-wide contractions. We apply our methodology to the leading consumption-based asset pricing models, gaining new insights about their empirical performance and finding empirical support for the Long Run Risk framework.


Introduction
The absence of arbitrage opportunities implies the existence of a pricing kernel, also known as the stochastic discount factor (SDF), such that the equilibrium price of a traded security can be represented as the conditional expectation of the future payoff discounted by the pricing kernel. The standard consumption-based asset pricing model, within the representative agent and time-separable power utility framework, identifies the pricing kernel as a simple parametric function of consumption growth. However, pricing kernels based on consumption risk alone cannot explain (i) the historically observed levels of returns, giving rise to the Equity Premium and Risk Free Rate Puzzles (e.g. Mehra and Prescott (1985) and Weil (1989)), and (ii) the cross-sectional dispersion of returns between different classes of financial assets (e.g. Mankiw and Shapiro (1986), Breeden, Gibbons, and Litzenberger (1989), Campbell (1996), Cochrane (1996)).
Nevertheless, there is considerable empirical evidence that consumption risk does matter for explaining asset returns (e.g. Lettau and Ludvigson (2001), Parker and Julliard (2005)). Therefore, a burgeoning literature has developed based on modifying the preferences of investors and/or the structure of the economy. In such models the resulting pricing kernel can be factorized into an observable component consisting of a parametric function of consumption, and a potentially unobservable, model-specific, component. Prominent examples in this class include: the external habit model where the additional component consists of a function of the habit level (Campbell and Cochrane (1999); Menzly, Santos, and Veronesi (2004)); the long run risk model based on recursive preferences where the additional component consists of the return on total wealth (Bansal and Yaron (2004)); and models with housing risk where the additional component consists of growth in the expenditure share on non-housing consumption (Piazzesi, Schneider, and Tuzel (2007)). The additional, and potentially unobserved, component may also capture deviations from rational expectations (e.g. as in Brunnermeier and Julliard (2007)) and models with robust control (e.g. Hansen and Sargent (2010)).
In this paper, we propose a new way to analyze dynamic asset pricing models for which the SDF can be factorized into an observable component and a potentially unobservable one. Our analysis utilizes an information-theoretic entropy approach, which builds upon Stutzer (1995, 1996), to assess the empirical plausibility of candidate SDFs of this form. Firstly, we construct entropy bounds that restrict the admissible regions for the SDF and its unobservable component. Dynamic equilibrium asset pricing models generally impose strong assumptions on the preferences of consumers and the dynamics of the state variables driving asset prices in order to identify the SDF. In contrast, we rely on a model-free no-arbitrage approach to construct the bounds on the SDF and its component. Our results complement and improve upon the seminal work by Hansen and Jagannathan (1991), who provide minimum variance bounds for the SDF, and Stutzer (1995, 1996), who first suggested constructing entropy bounds for the SDF based on the asset pricing restriction for the risk neutral probability measure. The use of an entropy metric is also related to the work of Alvarez and Jermann (2005), who derive a lower bound for the volatility of the permanent component of investors' marginal utility of wealth. We show that, in the mean-standard deviation space, a second order approximation of the risk neutral entropy bounds (Q-bounds) has the canonical Hansen-Jagannathan bounds as a special case, but the Q-bounds are generally tighter since they impose the non-negativity restriction on the pricing kernel. Using the structure of the pricing kernel, we are able to provide bounds (M-bounds) that have higher information content, and are tighter, than both the Hansen and Jagannathan (1991) and Stutzer (1996) bounds. Moreover, our approach improves on Alvarez and Jermann (2005) in that we can accommodate an asset space of arbitrary dimension and include assets that have negative expected rates of return. We also show that our methodology can be used to construct bounds ($\Psi$-bounds) for the potentially unobserved components of the pricing kernel.
Secondly, we show how the relative entropy minimization approach used for the construction of the bounds can be used to extract nonparametrically the time series of both the SDF and its unobservable component. This methodology identifies the most likely, in an information-theoretic sense, time series of the SDF and its unobservable component. We find that the estimated SDF has a clear business cycle pattern, but also shows significant and sharp reactions to financial market crashes that do not result in economy-wide contractions.
Thirdly, we apply our methodology to some of the leading consumption-based asset pricing models, gaining new insights about their empirical performance. For the standard time separable power utility model, we show that the pricing kernel satisfies the Hansen and Jagannathan (1991) bound for large values of the risk aversion coefficient, and the Q- and M-bounds for even higher levels of risk aversion. However, the $\Psi$-bound, which is a bound on the unobservable component of the pricing kernel, is tighter and is not satisfied for any level of risk aversion. We show that these findings are robust to the use of the long run consumption risk measure of Parker and Julliard (2005), despite the fact that this measure of consumption risk is able to explain a large share of the cross-sectional variation in asset returns with a small risk aversion coefficient. Considering more general models of dynamic economies, such as models with habit formation, long run risks in consumption growth, and complementarities in consumption, we find substantial empirical support for the long run risks framework of Bansal and Yaron (2004).
In linking entropy and variance bounds, our work is also related to Kitamura and Stutzer (2002), who examine the connection between entropic and linear projections in asset pricing estimation. However, our paper differs from this study in that we (i) focus on a broader set of entropy measures, (ii) consider an empirically and theoretically relevant decomposition of the pricing kernel, and (iii) derive time series implications for the stochastic discount factor and its components.
Finally, the methodology developed in this paper has considerable generality and may be applied to any model that delivers well-defined Euler equations and for which the SDF can be factorized into an observable component and an unobservable one. These include investment-based asset pricing models, and models with heterogeneous agents, limited stock market participation, and fragile beliefs.
The remainder of the paper is organized as follows. Section 2 presents the information-theoretic methodology and Section 2.1 introduces the entropy bounds and their properties. A description of the data used in the empirical applications is provided in Section 3. Section 4 uses the Consumption-CAPM with power utility as an illustrative example of the application of our methodology. Section 5 applies the methodology developed in this paper to the analysis of more general models of dynamic economies. The models considered, and their mapping into our framework, are presented in Section 5.1, while the empirical results are presented in Section 5.2. Section 6 concludes and discusses extensions. The Appendix contains the proofs and additional details on the methodology.

Entropy and the Pricing Kernel
In the absence of arbitrage opportunities, there exists a pricing kernel, $M_{t+1}$, or stochastic discount factor (SDF), such that the equilibrium price, $P_{it}$, of any asset $i$ delivering a future payoff, $X_{it+1}$, is given by
$$P_{it} = E_t\left[M_{t+1} X_{it+1}\right], \tag{1}$$
where $E_t$ is the rational expectation operator conditional on the information available at time $t$. Generally, the SDF can be factorized as follows
$$M_t = m(\theta,t) \times \psi_t, \tag{2}$$
where $m(\theta,t)$ is a known non-negative function of data observable at time $t$ and the parameter vector $\theta \in \mathbb{R}^k$, and $\psi_t$ is a potentially unobservable component. In the most common case $m(\theta,t)$ is simply a function of consumption growth, i.e. $m(\theta,t) = m(\Delta c_t, \theta)$ where $\Delta c_t := \log\frac{C_t}{C_{t-1}}$ and $C_t$ denotes the time $t$ consumption flow. Equations (1) and (2) imply that for any set of tradable assets the following vector of Euler equations must hold in equilibrium
$$0 = E\left[m(\theta,t)\,\psi_t\,R^e_t\right] \equiv \int m(\theta,t)\,\psi_t\,R^e_t\,dP, \tag{3}$$
where $E$ is the unconditional rational expectation operator, $R^e_t \in \mathbb{R}^N$ is a vector of excess returns on different tradable assets, and $P$ is the unconditional physical probability measure. Under weak regularity conditions the above pricing restrictions for the SDF can be rewritten as
$$0 = \int m(\theta,t)\,\frac{\psi_t}{\bar{\psi}}\,R^e_t\,dP = \int m(\theta,t)\,R^e_t\,d\Psi \equiv E^{\Psi}\left[m(\theta,t)\,R^e_t\right],$$
where $\bar{x} := E[x_t]$, and $\frac{\psi_t}{\bar{\psi}} = \frac{d\Psi}{dP}$ is the Radon-Nikodym derivative of $\Psi$ with respect to $P$. For the above change of measure to be legitimate we need absolute continuity of the measures $\Psi$ and $P$.
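The transformation above implies that, given a set of consumption and asset returns data, for any $\theta$ we can estimate the probability measure $\Psi$ as
$$\hat{\Psi} = \arg\min_{\Psi} D(\Psi\|P) \equiv \arg\min_{\Psi} \int \frac{d\Psi}{dP}\ln\frac{d\Psi}{dP}\,dP \quad \text{s.t.} \quad 0 = E^{\Psi}\left[m(\theta,t)\,R^e_t\right]. \tag{4}$$
The above is a relative entropy (or Kullback-Leibler Information Criterion (KLIC)) minimization under the asset pricing restrictions coming from the Euler equations. That is, we can estimate the unknown measure $\Psi$ as the one that adds the minimum amount of additional information needed for the pricing kernel to price assets. Note also that $D(\Psi\|P)$ is always non-negative and has a minimum at zero that is reached when $\Psi$ is identical to $P$, that is, when all the information needed to price assets is contained in $m(\theta,t)$ and $\psi_t$ is simply a constant term. The above approach can also be used, as first suggested by Stutzer (1995), to recover the risk neutral probability measure ($Q$) from the data as
$$\hat{Q} = \arg\min_{Q} D(Q\|P) \equiv \arg\min_{Q} \int \frac{dQ}{dP}\ln\frac{dQ}{dP}\,dP \quad \text{s.t.} \quad 0 = \int R^e_t\,dQ \equiv E^{Q}\left[R^e_t\right], \tag{5}$$
under the restriction that $Q$ and $P$ are absolutely continuous. Moreover, since relative entropy is not symmetric, we could also recover $\Psi$ and $Q$ as
$$\hat{\Psi} = \arg\min_{\Psi} D(P\|\Psi) \equiv \arg\min_{\Psi} \int \ln\frac{dP}{d\Psi}\,dP \quad \text{s.t.} \quad 0 = E^{\Psi}\left[m(\theta,t)\,R^e_t\right], \tag{6}$$
$$\hat{Q} = \arg\min_{Q} D(P\|Q) \equiv \arg\min_{Q} \int \ln\frac{dP}{dQ}\,dP \quad \text{s.t.} \quad 0 = E^{Q}\left[R^e_t\right]. \tag{7}$$
Note that the approaches in Equations (4) and (6) can identify $\{\psi_t\}_{t=1}^{T}$ only up to a positive scale constant.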
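The two directions of relative entropy can be illustrated numerically. The following is a minimal sketch, assuming that each measure is represented by weights on $T$ sample dates and that $P$ is the empirical measure assigning $1/T$ to each date; the function names and the example weights are hypothetical and are not taken from the paper.

```python
import math

def kl_psi_p(psi):
    """D(Psi||P) = sum_t psi_t * ln(psi_t / (1/T)), with P uniform on T dates."""
    T = len(psi)
    return sum(p * math.log(p * T) for p in psi if p > 0.0)

def kl_p_psi(psi):
    """D(P||Psi) = (1/T) * sum_t ln((1/T) / psi_t); requires every psi_t > 0."""
    T = len(psi)
    return -sum(math.log(p * T) for p in psi) / T

uniform = [0.25, 0.25, 0.25, 0.25]   # Psi identical to P
tilted = [0.40, 0.30, 0.20, 0.10]    # hypothetical weights that differ from P

d_forward = kl_psi_p(tilted)   # the criterion minimized in Equation (4)
d_reverse = kl_p_psi(tilted)   # the criterion minimized in Equation (6)
```

Both measures are zero when the weights coincide with $P$ and strictly positive otherwise, and they generally differ from each other, which is why the two minimization problems can deliver different estimates.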
But why should relative entropy minimization be an appropriate criterion for recovering the unknown measures $\Psi$ and $Q$? There are several reasons for this choice.
First, this approach is numerically simple when implemented via duality (see e.g. Csiszar (1975)). That is, when implementing the entropy minimization in Equation (4), each element of the series $\{\psi_t\}_{t=1}^{T}$ can be estimated, up to a positive constant scale factor, as
$$\hat{\psi}_t = \frac{e^{\lambda(\theta)' m(\theta,t) R^e_t}}{\sum_{s=1}^{T} e^{\lambda(\theta)' m(\theta,s) R^e_s}}, \quad \forall t, \tag{8}$$
where $\lambda(\theta) \in \mathbb{R}^N$ is the solution to
$$\lambda(\theta) \equiv \arg\min_{\lambda} \frac{1}{T}\sum_{t=1}^{T} e^{\lambda' m(\theta,t) R^e_t}, \tag{9}$$
where this last expression is the dual formulation of the entropy minimization problem in Equation (4). Similarly, the entropy minimization in Equation (6) is solved by setting each $\psi_t$, up to a constant positive scale factor, equal to
$$\hat{\psi}_t = \frac{1}{T\left(1 + \lambda(\theta)' m(\theta,t) R^e_t\right)}, \quad \forall t, \tag{10}$$
where $\lambda(\theta) \in \mathbb{R}^N$ is the solution to the following unconstrained convex problem
$$\lambda(\theta) \equiv \arg\min_{\lambda} -\sum_{t=1}^{T} \log\left(1 + \lambda' m(\theta,t) R^e_t\right), \tag{11}$$
where this last expression is the dual formulation of the entropy minimization problem in Equation (6). Note also that the above duality results imply that the number of free parameters available in estimating $\{\psi_t\}_{t=1}^{T}$ is the dimension of the Lagrange multiplier $\lambda$, that is, it is simply equal to the number of assets considered in the Euler equation. Moreover, since the $\lambda(\theta)$ in Equations (9) and (11) are akin to Extremum Estimators (see e.g. Hayashi (2000, Ch. 7)), under standard regularity conditions (see e.g. Amemiya (1985, Theorem 4.1.3)) one can construct asymptotic confidence intervals for both $\{\psi_t\}_{t=1}^{T}$ and the entropy bounds presented in the next Section.
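The dual problems can be sketched in a few lines for the special case of a single test asset, so that the multiplier is a scalar. The sketch assumes the standard dual solutions: weights proportional to $\exp(\lambda z_t)$ with $\lambda$ minimizing the sample mean of $\exp(\lambda z_t)$ for the problem in Equation (4), and weights $1/(T(1+\lambda z_t))$ with $\lambda$ minimizing $-\sum_t \log(1+\lambda z_t)$ for the problem in Equation (6), where $z_t$ stands for $m(\theta,t)R^e_t$. The function names, the plain Newton iteration, and the data are illustrative assumptions, not the authors' implementation.

```python
import math

def et_weights(z, tol=1e-12, max_iter=200):
    """Exponential-tilting dual: minimize mean(exp(lam * z_t)) over a scalar lam.

    Plain Newton iteration without safeguards; z must take both signs,
    otherwise the objective has no interior minimum.
    """
    lam = 0.0
    for _ in range(max_iter):
        w = [math.exp(lam * zt) for zt in z]
        grad = sum(wt * zt for wt, zt in zip(w, z))
        hess = sum(wt * zt * zt for wt, zt in zip(w, z))
        step = grad / hess
        lam -= step
        if abs(step) < tol:
            break
    w = [math.exp(lam * zt) for zt in z]
    total = sum(w)
    return lam, [wt / total for wt in w]

def el_weights(z, tol=1e-12, max_iter=200):
    """Empirical-likelihood dual: minimize -sum(log(1 + lam * z_t)) over a scalar lam."""
    T = len(z)
    lam = 0.0
    for _ in range(max_iter):
        d = [1.0 + lam * zt for zt in z]
        grad = -sum(zt / dt for zt, dt in zip(z, d))
        hess = sum((zt / dt) ** 2 for zt, dt in zip(z, d))
        step = grad / hess
        # Shrink the step if it would leave the domain 1 + lam * z_t > 0.
        while min(1.0 + (lam - step) * zt for zt in z) <= 0.0:
            step *= 0.5
        lam -= step
        if abs(step) < tol:
            break
    return lam, [1.0 / (T * (1.0 + lam * zt)) for zt in z]

# Hypothetical values of z_t = m(theta, t) * R^e_t for one asset.
z = [0.08, -0.03, 0.12, -0.10, 0.05, 0.02]
lam_et, psi_et = et_weights(z)
lam_el, psi_el = el_weights(z)
```

In both cases the resulting weights are positive, sum to one, and satisfy the sample pricing restriction $\sum_t \hat{\psi}_t z_t = 0$. With $N$ assets the multiplier becomes an $N$-vector and the scalar Newton step would be replaced by a linear solve.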
Second, using entropy minimization to uncover the $\psi_t$ component of the pricing kernel satisfies Occam's razor, or the law of parsimony, since it adds the minimum amount of information needed for the pricing kernel to price assets. This is due to the fact that relative entropy is measured in units of information.
Third, the use of relative entropy, due to the presence of the logarithm in the objective functions in Equations (4)-(7), naturally imposes the non-negativity of the pricing kernel. This, for example, is not imposed in the identification of the minimum variance pricing kernel of Hansen and Jagannathan (1991).$^{1}$

Fourth, there is no ex-ante restriction on the number of assets that can be used in constructing $\psi_t$, and the approach can naturally handle assets with negative expected rates of return. This is an advantage, for example, with respect to the Alvarez and Jermann (2005) approach, which focuses on only three assets with positive expected returns.
Fifth, as implied by the work of Brown and Smith (1990), the use of entropy is desirable if we think that tail events are an important component of the risk measure.
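Sixth, and most importantly, the approaches in Equations (4) and (6) deliver the maximum likelihood estimate of the $\psi_t$ component of the pricing kernel, that is, the most likely estimate given the data at hand. To see this, consider the two minimization problems separately. Note first that, normalizing $\{\psi_t\}_{t=1}^{T}$ to lie in the unit simplex $\Delta^T$, the solution of the estimation problem in Equation (6) also solves the following optimization
$$\{\hat{\psi}_t\}_{t=1}^{T} = \arg\max_{\{\psi_t\} \in \Delta^T} \frac{1}{T}\sum_{t=1}^{T} \ln \psi_t \quad \text{s.t.} \quad \sum_{t=1}^{T} m(\theta,t)\,R^e_t\,\psi_t = 0.$$
But the objective function above is simply the nonparametric likelihood (aka empirical likelihood) of Owen (1988, 1991, 2001) maximized under the asset pricing restrictions for a vector of asset returns.

To see why the estimation problem in Equation (4) also delivers a maximum likelihood estimate of the $\psi_t$ component, consider the following procedure for constructing (up to a scale) the series $\{\psi_t\}_{t=1}^{T}$. First, given an integer $N \gg 0$, distribute to the various points in time $t = 1, \ldots, T$, at random and with equal probabilities, the value $1/N$ in $N$ independent draws. That is, draw a series of values (probability weights) $\{\tilde{\psi}_t\}_{t=1}^{T}$, where $\tilde{\psi}_t = n_t/N$ and $n_t$ is the number of draws assigned to time $t$. Second, retain the drawn series only if it satisfies the pricing restriction $\sum_{t} m(\theta,t)\,R^e_t\,\tilde{\psi}_t = 0$. A natural estimate of $\{\psi_t\}$ is then the most likely outcome of the above procedure. This can be easily found by noticing that the distribution of the $\tilde{\psi}_t$ is, by construction, the multinomial distribution with support given by the data sample. Therefore, the likelihood of any particular sequence $\{\tilde{\psi}_t\}_{t=1}^{T}$ is
$$L\left(\{\tilde{\psi}_t\}\right) = \frac{N!}{\prod_{t=1}^{T} (N\tilde{\psi}_t)!}\; T^{-N}.$$
This implies that the most likely value for $\{\tilde{\psi}_t\}$ would be the maximizer of the log likelihood
$$\ln L\left(\{\tilde{\psi}_t\}\right) = \ln N! - \sum_{t=1}^{T} \ln\left[(N\tilde{\psi}_t)!\right] - N \ln T.$$
Since the above procedure of assigning probability weights will become more and more accurate as $N$ grows bigger, we would ideally like to have $N \to \infty$. But in this case one can show$^{2}$ that
$$\lim_{N\to\infty} \frac{1}{N} \ln L\left(\{\tilde{\psi}_t\}\right) = -\sum_{t=1}^{T} \tilde{\psi}_t \ln \tilde{\psi}_t - \ln T.$$
Therefore, taking into account the constraint for the pricing kernel, the maximum likelihood estimate (MLE) of the time series of $\psi_t$ would solve
$$\{\hat{\psi}_t\}_{t=1}^{T} = \arg\max_{\{\psi_t\} \in \Delta^T} -\sum_{t=1}^{T} \psi_t \ln \psi_t \quad \text{s.t.} \quad \sum_{t=1}^{T} m(\theta,t)\,R^e_t\,\psi_t = 0.$$
But the solution of the above MLE problem is also the solution of the relative entropy minimization problem in Equation (4) (see e.g. Csiszar (1975)).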
That is, the KLIC minimization problem we propose is equivalent to maximizing the likelihood in an unbiased procedure for finding the $\psi_t$ component of the pricing kernel. Moreover, note that this is also the rationale behind the principle of maximum entropy (see e.g. Jaynes (1957a, 1957b)) in physical sciences and Bayesian probability, which states that, subject to known testable constraints (the asset pricing Euler restrictions in our case), the probability distribution that best represents our knowledge is the one with maximum entropy, or minimum relative entropy in our notation.
2 Recall that from Stirling's formula we have: $\lim_{n\to\infty} \frac{n!}{\sqrt{2\pi n}\,(n/e)^{n}} = 1$.
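The limiting argument based on Stirling's formula can be checked numerically. The sketch below assumes a multinomial with equal cell probabilities $1/T$ and compares $\frac{1}{N}\ln L$ with $-\sum_t \tilde{\psi}_t \ln \tilde{\psi}_t - \ln T$ for increasing $N$; the function names and the example weights are hypothetical.

```python
import math

def scaled_log_likelihood(counts):
    """(1/N) * ln of the multinomial probability of `counts` with equal cell probabilities 1/T."""
    N = sum(counts)
    T = len(counts)
    log_l = (math.lgamma(N + 1)
             - sum(math.lgamma(n + 1) for n in counts)
             - N * math.log(T))
    return log_l / N

def entropy_limit(weights):
    """-sum_t w_t * ln(w_t) - ln(T): the large-N limit implied by Stirling's formula."""
    T = len(weights)
    return -sum(w * math.log(w) for w in weights if w > 0.0) - math.log(T)

weights = [0.4, 0.3, 0.2, 0.1]   # hypothetical probability weights
gaps = []
for N in (10, 1000, 100000):
    counts = [round(w * N) for w in weights]
    gaps.append(abs(scaled_log_likelihood(counts) - entropy_limit(weights)))
```

The gap between the scaled log likelihood and the entropy expression shrinks as $N$ grows, at a rate of order $\ln N / N$.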

Entropy Bounds
Based on the relative entropy estimation of the pricing kernel and its component outlined in the previous section, we now turn our attention to the derivation of a set of entropy bounds for the SDF and its components. The absence of arbitrage opportunities implies the existence of a convex set of pricing operators, also called SDFs, $\mathcal{M}$, such that
$$0 = E\left[R^e_t M_t\right] \quad \text{for all } M_t \in \mathcal{M}.$$
Dynamic equilibrium asset pricing models identify the SDFs as parametric functions of variables determined by the consumers' preferences and the dynamics of state variables driving the economy. A substantial research effort has been devoted to developing diagnostic methods to assess the empirical plausibility of candidate SDFs in pricing assets, as well as to provide guidance for the construction and testing of other, more realistic, asset pricing theories.
The seminal work by Hansen and Jagannathan (1991) identifies, in a model-free no-arbitrage setting, a variance minimizing benchmark SDF, whose variance places a lower bound on the variances of other SDFs. In particular, the HJ-bounds are defined as follows.
Definition 1 (Canonical HJ-bound) For each $E[M_t] = \bar{M}$, the Hansen and Jagannathan (1991) minimum variance SDF is
$$M^*_t(\bar{M}) \equiv \arg\min_{M_t(\bar{M})} \sqrt{Var\left(M_t(\bar{M})\right)} \quad \text{s.t.} \quad 0 = E\left[R^e_t\,M_t(\bar{M})\right], \tag{12}$$
and any candidate stochastic discount factor $M_t$ must satisfy $Var(M_t) \geq Var\left(M^*_t(\bar{M})\right)$.

The HJ-bounds offer a natural benchmark for evaluating the potential of an equilibrium asset pricing model since, by construction, any SDF that is consistent with observed data should have a variance that is not smaller than the one identified by the bound. However, the identified minimum variance SDF does not impose the non-negativity constraint on the pricing kernel, and since $M^*_t(\bar{M})$ is a linear function of returns,$^{3}$ it does not generally satisfy the restriction.$^{4}$

3 The solution of the problem in Equation (12) is $M^*_t(\bar{M}) = \bar{M} + \left(R^e_t - E[R^e_t]\right)'\beta_{\bar{M}}$, where $\beta_{\bar{M}} = Cov(R^e_t)^{-1}\left(-\bar{M}\,E[R^e_t]\right)$.
4 We call the bound in Definition 1 the "canonical" HJ-bound since Hansen and Jagannathan (1991) also provide an alternative bound that imposes the non-negativity of the pricing kernel, but that is not generally used due to its computational complexity.
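A small numerical example makes the non-negativity issue concrete. The sketch below assumes a single test asset and the textbook closed form of the minimum variance SDF, $M^*_t = \bar{M}\left(1 - \frac{\mu}{\sigma^2}(R^e_t - \mu)\right)$, with $\mu$ and $\sigma^2$ the sample mean and variance of the excess return; the function name, the return series, and the value of $\bar{M}$ are hypothetical.

```python
import statistics

def hj_min_variance_sdf(excess_returns, mean_sdf):
    """Minimum-variance SDF for one excess return, linear in the return."""
    mu = statistics.fmean(excess_returns)
    var = statistics.pvariance(excess_returns)
    return [mean_sdf * (1.0 - (mu / var) * (r - mu)) for r in excess_returns]

# Hypothetical annual excess returns, including one very good year.
excess_returns = [0.45, -0.10, 0.10, -0.05, 0.20, 0.12]
m_star = hj_min_variance_sdf(excess_returns, mean_sdf=0.98)

# Sample Euler-equation error of the benchmark SDF: zero by construction.
pricing_error = statistics.fmean(m * r for m, r in zip(m_star, excess_returns))
# Lower bound on the standard deviation of any admissible SDF with this mean.
hj_bound = statistics.pstdev(m_star)
# Because M* is linear in returns, a large enough return pushes it below zero.
has_negative = min(m_star) < 0.0
```

In this example the benchmark SDF prices the asset exactly in sample, yet it takes a negative value in the year with the largest return, which is the violation of positivity that the entropy-based bounds rule out.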
To address this issue Stutzer (1995), using the Kullback-Leibler Information Criterion minimization in Equation (5), proposes an entropy bound for the risk neutral probability measure implied by the pricing kernel that naturally imposes the non-negativity constraint (see also Kitamura and Stutzer (2002)). In what follows, we build on the original idea in Stutzer (1995, 1996) and characterize a series of bounds that use an entropy minimization approach. But our approach differs from Stutzer's along several dimensions. First, we do not restrict our attention to only one definition of relative entropy. Second, and most importantly, our approach takes into account more information about the form of the pricing kernel, therefore delivering sharper bounds. Third, we are also able to construct information bounds for the individual components of the SDFs.
We refer to the first set of entropy bounds as Q-bounds, since they are based on the risk neutral probability measure implied by asset returns.
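Definition 2 (Q-bounds) We define the following probability bounds for any candidate stochastic discount factor $M_t$.
1. Q1-bound:
$$D\left(P \,\Big\|\, \frac{M_t}{\bar{M}}\right) \equiv -\int \ln\frac{M_t}{\bar{M}}\,dP \;\geq\; D\left(P\|Q^*\right),$$
where $Q^*$ solves Equation (7).
2. Q2-bound:
$$D\left(\frac{M_t}{\bar{M}} \,\Big\|\, P\right) \equiv \int \frac{M_t}{\bar{M}}\ln\frac{M_t}{\bar{M}}\,dP \;\geq\; D\left(Q^*\|P\right),$$
where $Q^*$ solves Equation (5).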
Note that the above bounds, like the HJ-bound, use only the information contained in asset returns but, differently from the latter, they impose the restriction that the pricing kernel must be positive. Moreover, we show in the next proposition that, to a second order approximation, the problems of constructing canonical HJ-bounds and Q-bounds are equivalent, in the sense that approximated Q-bounds identify the minimum variance bound for the SDF. We have that a second order approximation of the Q-bounds criterion is given by [...] where $g_1$ and $g_2$ are positive constants.
Proof. See Appendix A.1.

The above result implies that, replacing the second order approximation of the KLIC into the definition of Q-bounds above, the HJ-bound can be seen as an approximation to the Q-bounds, since the approximated Q-bounds would be equivalent to the HJ-bound. Note also that the (sufficient, but not necessary) conditions required for the approximation result stated in the proposition are extremely mild and generally satisfied in most consumption-based asset pricing models. The first assumption is a standard one that requires the SDF to have finite first and second moments. The second assumption basically requires $p(M)$ (the distribution of the normalized SDF) to be a smooth probability distribution. The assumption of log-concavity is also very weak since a sufficient requirement for this is that $p(M)$ is log-concave (since the product of log-concave functions is log-concave). The requirement that $\ln g(M)$ has a finite maximum is basically a requirement that the tails of the $p(M)$ distribution decay at a fast enough rate, but the required rate is very low (much lower than exponential), that is, much lower than what is normally required for the Central Limit Theorem arguments necessary for asymptotic Gaussian inference on the SDF. Probably the most restrictive assumption is the presence of the positive lower bound $K$ on the SDF. This assumption can be thought of as imposing a finite, but arbitrarily large, upper bound on the maximum consumption growth between two consecutive periods. Nevertheless, note that this requirement will always be satisfied in any finite sample application, and we did not use any asymptotic arguments in deriving the above approximation.$^{5}$

Note that both the HJ and Q bounds described above use only information about asset returns and no information about consumption growth or the structure of the pricing kernel.
Therefore, we now turn our attention to a set of tighter bounds that incorporate this type of information while also imposing the non-negativity of the pricing kernel. Consider an SDF that can be factorized in two components, that is
$$M_t = m(\theta,t) \times \psi_t, \tag{15}$$
where $m(\theta,t)$ is a non-negative known function of observable variables (generally consumption growth) and the parameter vector $\theta$, and $\psi_t$ is a potentially unobservable component. A large class of equilibrium asset pricing models, including ones with standard time separable power utility with a constant coefficient of relative risk aversion, external habit formation, recursive preferences, durable consumption goods, housing, and disappointment aversion, fall into this framework. Based on the above factorization of the SDF we can define the following bounds.
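Definition 3 (M-bounds) For any candidate stochastic discount factor of the form in Equation (15), given any choice of the parameters $\theta$ of $m(\theta,t)$, we define the following bounds:
1. M1-bound:
$$D\left(P \,\Big\|\, \frac{m(\theta,t)\psi_t}{\overline{m(\theta,t)\psi_t}}\right) \equiv -\int \ln\frac{m(\theta,t)\psi_t}{\overline{m(\theta,t)\psi_t}}\,dP \;\geq\; D\left(P \,\Big\|\, \frac{m(\theta,t)\psi^*_t}{\overline{m(\theta,t)\psi^*_t}}\right),$$
where $\psi^*_t$ solves Equation (6) and $\overline{m(\theta,t)\psi^*_t} := E\left[m(\theta,t)\psi^*_t\right]$.
2. M2-bound:
$$D\left(\frac{m(\theta,t)\psi_t}{\overline{m(\theta,t)\psi_t}} \,\Big\|\, P\right) \equiv \int \frac{m(\theta,t)\psi_t}{\overline{m(\theta,t)\psi_t}} \ln\frac{m(\theta,t)\psi_t}{\overline{m(\theta,t)\psi_t}}\,dP \;\geq\; D\left(\frac{m(\theta,t)\psi^*_t}{\overline{m(\theta,t)\psi^*_t}} \,\Big\|\, P\right),$$
where $\psi^*_t$ solves Equation (4).

The above bounds for the SDF are tighter than the Q-bounds since, by construction,
$$D\left(P \,\Big\|\, \frac{m(\theta,t)\psi^*_t}{\overline{m(\theta,t)\psi^*_t}}\right) \geq D\left(P\|Q^*\right) \quad \text{and} \quad D\left(\frac{m(\theta,t)\psi^*_t}{\overline{m(\theta,t)\psi^*_t}} \,\Big\|\, P\right) \geq D\left(Q^*\|P\right),$$
where $Q^*$ denotes the solution of Equation (7) and of Equation (5), respectively. They are also more informative, since their construction uses not only the information contained in asset returns but also the structure of the pricing kernel in Equation (15) and the information contained in $m(\theta,t)$.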
Information about the SDF can also be elicited by constructing bounds for the $\psi_t$ component itself. Given the $m(\theta,t)$ component, these bounds identify the minimum amount of information that $\psi_t$ should add for the pricing kernel $M_t$ to be able to price asset returns.
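Definition 4 ($\Psi$-bounds) For any candidate stochastic discount factor of the form in Equation (15), given any choice of the parameters $\theta$ of $m(\theta,t)$, two lower bounds for the relative entropy of $\psi_t$ are defined as follows:
1. $\Psi$1-bound:
$$D\left(P \,\Big\|\, \frac{\psi_t}{\bar{\psi}}\right) \equiv -\int \ln\frac{\psi_t}{\bar{\psi}}\,dP \;\geq\; D\left(P \,\Big\|\, \frac{\psi^*_t}{\bar{\psi}^*}\right),$$
where $\psi^*_t$ solves Equation (6);
2. $\Psi$2-bound:
$$D\left(\frac{\psi_t}{\bar{\psi}} \,\Big\|\, P\right) \equiv \int \frac{\psi_t}{\bar{\psi}}\ln\frac{\psi_t}{\bar{\psi}}\,dP \;\geq\; D\left(\frac{\psi^*_t}{\bar{\psi}^*} \,\Big\|\, P\right),$$
where $\psi^*_t$ solves Equation (4).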
Besides providing an additional check for any candidate SDF, the $\Psi$-bounds are useful in that a simple comparison of $D\left(\frac{\psi_t}{\bar{\psi}} \,\|\, P\right)$, $D\left(\frac{m(\theta,t)}{\overline{m(\theta,t)}} \,\|\, P\right)$ and $D(Q\|P)$ can provide a very informative decomposition in terms of the entropy contribution to the pricing kernel, which is logically similar to the widely used variance decomposition analysis. For example, if $D\left(\frac{\psi_t}{\bar{\psi}} \,\|\, P\right)$ happens to be close to $D(Q\|P)$, while $D\left(\frac{m(\theta,t)}{\overline{m(\theta,t)}} \,\|\, P\right)$ is substantially smaller, the decomposition would imply that most of the candidate SDF's ability to price assets comes from the $\psi_t$ component.
Moreover, note that if we want to evaluate a model of the form $M_t = m(\theta,t)$ (a model without the unobservable $\psi_t$ component), the $\Psi$-bounds will offer a tight selection criterion since, under the null of the model being true, we should have $D\left(\frac{\psi_t}{\bar{\psi}} \,\|\, P\right) = D\left(P \,\|\, \frac{\psi_t}{\bar{\psi}}\right) = 0$, and this is a tighter bound than the HJ, Q and M bounds defined above. The intuition for this is simple: Q-bounds (and HJ-bounds) require the model under test to deliver at least as much relative entropy (variance) as the minimum relative entropy (variance) SDF, but they do not require that the $m(\theta,t)$ under scrutiny should also be able to price the assets. That is, it might be the case (and we will show that in practice it is) that for some values of $\theta$ both the Q-bounds and the HJ-bounds are satisfied, but nevertheless the SDF grossly violates the pricing restrictions in the Euler Equations (3).
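Note that in principle a volatility bound, similar to the Hansen and Jagannathan (1991) bound for the pricing kernel, can be constructed for the $\psi_t$ component.

Definition 5 (Volatility bound for $\psi_t$) For each $E[\psi_t] = \bar{\psi}$, the minimum variance $\psi_t$ is
$$\psi^*_t(\bar{\psi}) \equiv \arg\min_{\psi_t(\bar{\psi})} \sqrt{Var\left(\psi_t(\bar{\psi})\right)} \quad \text{s.t.} \quad 0 = E\left[R^e_t\,m(\theta,t)\,\psi_t(\bar{\psi})\right],$$
and any candidate SDF must satisfy the condition $Var(\psi_t) \geq Var\left(\psi^*_t(\bar{\psi})\right)$.

The solution of the above minimization for a given $\theta$ is
$$\psi^*_t(\bar{\psi}) = \bar{\psi} + \left(R^e_t m(\theta,t) - E[R^e_t m(\theta,t)]\right)'\beta_{\bar{\psi}}, \qquad \beta_{\bar{\psi}} = Var\left(R^e_t m(\theta,t)\right)^{-1}\left(-\bar{\psi}\,E[R^e_t m(\theta,t)]\right),$$
and the lower volatility bound is given by
$$\sigma_{\psi^*} = \bar{\psi}\sqrt{E[R^e_t m(\theta,t)]'\,Var\left(R^e_t m(\theta,t)\right)^{-1} E[R^e_t m(\theta,t)]}, \tag{16}$$
where $\sigma_{\psi^*} := \sqrt{Var\left(\psi^*_t(\bar{\psi})\right)}$. This bound, like the entropy based $\Psi$-bound in Definition 4, uses information about the structure of the SDF but, differently from the latter, does not constrain $\psi_t$ and $M_t$ to be non-negative as implied by economic theory. Moreover, using the same approach employed in Proposition 1, one can show that this last bound can be obtained as a second order approximation of the entropy based $\Psi$-bound.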
Equation (16), viewed as a second order approximation to the entropy $\Psi$-bound, makes clear why bounds based on the decomposition of the pricing kernel as $M_t = m(\theta,t)\psi_t$ offer sharper information than bounds based on $M_t$ only. Consider for example the case in which the candidate SDF is of the form $M_t = m(\theta,t)$, that is $\psi_t = 1$ for any $t$. In this case, it can easily happen that there exists a $\theta_0$ such that $Var\left(m(\theta_0,t)\right) \geq Var\left(M^*_t(\bar{M})\right)$, where $M^*_t(\bar{M})$ is the Hansen and Jagannathan (1991) minimum variance SDF in Definition 1, that is, there exists a $\theta_0$ such that the HJ-bound is satisfied. Nevertheless, the existence of such a $\theta_0$ does not imply that the candidate SDF is able to price asset returns. This would be the case if and only if the volatility bound for $\psi_t$ in Definition 5 is also satisfied since, from Equation (16), $\sigma_{\psi^*} = 0$ if and only if $E\left[R^e_t\,m(\theta_0,t)\right] = 0$, that is, only if the candidate SDF is able to price asset returns.
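The point can be illustrated with a single test asset. The sketch below assumes the scalar version of the volatility bound, $\sigma_{\psi^*} = \bar{\psi}\,|E[z_t]| / \sqrt{Var(z_t)}$ with $z_t = m(\theta,t)R^e_t$, and a power-utility form for $m(\theta,t)$; the function name, the data, and the parameter values are hypothetical.

```python
import statistics

def psi_volatility_bound(z, mean_psi=1.0):
    """Lower bound on std(psi_t) for one asset, where z[t] = m(theta, t) * R^e_t."""
    return mean_psi * abs(statistics.fmean(z)) / statistics.pstdev(z)

# Hypothetical gross consumption growth and excess returns.
consumption_growth = [1.02, 0.99, 1.03, 1.01, 1.00, 1.02]
excess_returns = [0.25, -0.10, 0.30, -0.05, 0.20, 0.12]
gamma = 5.0
z = [g ** (-gamma) * r for g, r in zip(consumption_growth, excess_returns)]

bound = psi_volatility_bound(z)
# A model with psi_t = 1 has Var(psi_t) = 0, so it meets the bound only if
# the sample Euler-equation error mean(z) is exactly zero.
constant_psi_ok = bound == 0.0
```

With these numbers the bound is strictly positive, so a candidate SDF with constant $\psi_t$ fails it regardless of how volatile $m(\theta,t)$ itself is.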

Data Description
We focus on two data samples: an annual data sample starting at the onset of the Great Depression (1929-2009), and a quarterly data sample starting in the post World War II period (1947:Q1-2009:Q4). Note that the information bounds on the SDF and its unobservable component and the extracted time series of the SDF depend on the set of test assets used in their construction. Since the Euler equation holds for any traded asset as well as any adapted portfolio of the assets, this gives an infinitely large number of moment restrictions. Econometric considerations necessitate the choice of only a subset of assets to be used in the construction of the bounds. We compute the Q-, M-, and $\Psi$-bounds and extract the time series of the SDF and its components using a large variety of test assets. At the quarterly frequency, we use 6 different sets of assets: i) the market portfolio, ii) the 25 Fama-French portfolios, iii) the 10 size-sorted portfolios, iv) the 10 book-to-market-equity-sorted portfolios, v) the 10 momentum-sorted portfolios, and vi) the 10 industry-sorted portfolios. At the annual frequency, we use the same sets of assets, except that the 25 Fama-French portfolios are replaced by the 6 portfolios formed by sorting stocks on the basis of size and book-to-market-equity, because of the small time series dimension available at the annual frequency.
Our motivation for constructing bounds and extracting the most likely SDF using different sets of test assets is two-fold. First, our methodology is computationally simple and may be applied to an asset space of arbitrary dimension. Second, we show that the extracted time series of the SDF is quite robust to the set of test assets used.
Our proxy for the market return is the Center for Research in Security Prices (CRSP) value-weighted index of all stocks on the NYSE, AMEX, and NASDAQ. The proxy for the risk-free rate is the one-month Treasury Bill rate obtained from the CRSP files. The returns on all the portfolios are obtained from Kenneth French's data library. Quarterly (annual) returns for the above assets are computed by compounding monthly returns within each quarter (year), and converted to real using the personal consumption deflator. Excess returns on the assets are then computed by subtracting the risk free rate.
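The return construction described above can be sketched as follows. This is a minimal illustration, assuming simple monthly net returns, a deflator observed at the start and end of the period, and real returns deflated before the risk free rate is subtracted; the function names and all numbers are hypothetical.

```python
def compound(monthly_returns):
    """Compound simple monthly net returns into one net return for the period."""
    gross = 1.0
    for r in monthly_returns:
        gross *= 1.0 + r
    return gross - 1.0

def to_real(nominal_return, inflation):
    """Deflate a nominal net return by the inflation rate of the price deflator."""
    return (1.0 + nominal_return) / (1.0 + inflation) - 1.0

# Hypothetical monthly data for a single quarter.
portfolio_monthly = [0.02, -0.01, 0.03]
tbill_monthly = [0.003, 0.003, 0.004]
deflator_start, deflator_end = 100.0, 100.8   # consumption deflator levels

inflation = deflator_end / deflator_start - 1.0
real_portfolio = to_real(compound(portfolio_monthly), inflation)
real_riskfree = to_real(compound(tbill_monthly), inflation)
excess_return = real_portfolio - real_riskfree
```

Annual returns follow from the same functions applied to twelve monthly observations instead of three.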
Finally, for each dynamic asset pricing model, the information bounds and the nonparametrically extracted and model-implied time series of the SDF depend on consumption data. For the standard Consumption-CAPM of Breeden (1979) and Rubinstein (1976), the external habit models of Campbell and Cochrane (1999) and Menzly, Santos, and Veronesi (2004), and the long-run risks model of Bansal and Yaron (2004), we use per capita real personal consumption expenditures on nondurable goods from the National Income and Product Accounts (NIPA). We make the standard "end-of-period" timing assumption that consumption during quarter $t$ takes place at the end of the quarter. For the housing model of Piazzesi, Schneider, and Tuzel (2007), aggregate consumption is measured as expenditures on nondurables and services excluding housing services.

An Illustrative Example: the Consumption CAPM with Power Utility
We first illustrate our methodology for the Consumption-CAPM (C-CAPM) of Breeden (1979) and Rubinstein (1976) when the utility function is time and state separable with a constant coefficient of relative risk aversion. For this specification of preferences, the SDF takes the form
$$M_{t+1} = \delta\left(\frac{C_{t+1}}{C_t}\right)^{-\gamma}, \tag{17}$$
where $\delta$ denotes the subjective discount factor, $\gamma$ is the coefficient of relative risk aversion, and $\frac{C_{t+1}}{C_t}$ denotes the real per capita aggregate consumption growth. Empirically, the above pricing kernel fails to explain i) the historically observed levels of returns, giving rise to the Equity Premium and Risk Free Rate Puzzles (e.g. Mehra and Prescott (1985) and Weil (1989)), and ii) the cross-sectional dispersion of returns between different classes of financial assets (e.g. Mankiw and Shapiro (1986), Breeden, Gibbons, and Litzenberger (1989), Campbell (1996), Cochrane (1996)).

Parker and Julliard (2005) argue that the covariance between contemporaneous consumption growth and asset returns understates the true consumption risk of the stock market if consumption is slow to respond to returns. They propose measuring the risk of an asset by its ultimate risk to consumption, defined as the covariance of its return and consumption growth over the period of the return and many following periods. They show that while the ultimate consumption risk would correctly measure the risk of an asset if the C-CAPM were true, it may be a better measure of the true risk if consumption responds with a lag to changes in wealth. The ultimate consumption risk model implies the following SDF:
$$M^{S}_{t+1} = \delta^{1+S}\left(\frac{C_{t+1+S}}{C_t}\right)^{-\gamma} R^f_{t+1,t+1+S}, \tag{18}$$
where $S$ denotes the number of periods over which the consumption risk is measured and $R^f_{t+1,t+1+S}$ is the risk free rate between periods $t+1$ and $t+1+S$. Note that the standard C-CAPM obtains when $S = 0$.
Parker and Julliard (2005) show that the specification of the SDF in Equation (18), unlike the one in Equation (17), explains a large fraction of the variation in expected returns across assets for low levels of the risk aversion coefficient.
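As a minimal sketch of how the two kernels in Equations (17) and (18) can be computed from data, the Python snippet below builds both series from a level series of consumption and a series of one-period gross risk free rates. The function names, the array layout, and the convention that `gross_rf[j]` is the gross risk free return from period j to j+1 are assumptions made for illustration; the normalization by the discount factor raised to the power 1+S is likewise an assumption, obtained by iterating the Euler equation.

```python
import numpy as np

def ccapm_sdf(consumption, gamma, delta=1.0):
    """Contemporaneous kernel: delta * (C_{t+1} / C_t) ** (-gamma)."""
    c = np.asarray(consumption, dtype=float)
    return delta * (c[1:] / c[:-1]) ** (-gamma)

def ultimate_sdf(consumption, gross_rf, gamma, S, delta=1.0):
    """Ultimate consumption risk kernel over horizon S (hypothetical helper).

    gross_rf[j] is assumed to be the gross risk free return from j to j+1,
    so the multi-period rate from t+1 to t+1+S is the product of
    gross_rf[t+1], ..., gross_rf[t+S] (an empty product, i.e. 1, when S = 0).
    """
    c = np.asarray(consumption, dtype=float)
    rf = np.asarray(gross_rf, dtype=float)
    n = len(c) - 1 - S                      # dates t for which C_{t+1+S} exists
    growth = c[1 + S:] / c[:n]              # C_{t+1+S} / C_t
    rf_S = np.array([np.prod(rf[t + 1:t + 1 + S]) for t in range(n)])
    return delta ** (1 + S) * growth ** (-gamma) * rf_S
```

With `S = 0` the second function returns the same series as the first, mirroring the remark that the standard C-CAPM is the special case of no additional horizon.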
The functional forms of the above two SDFs fit into our framework in Equation (2). In both cases the observable component is a power function of consumption growth; the missing component, $\psi_t$, is a constant in the standard C-CAPM, while $\psi_t = R^f_{t+1,t+1+S}$ in the ultimate consumption risk model. Therefore, for each model, we construct entropy bounds for the SDF and its components using quarterly data on consumption and returns on the 25 Fama-French portfolios over the post-war period 1947:1-2009:4 and compare them with the HJ bound. We also obtain the nonparametrically extracted (called "filtered" hereafter) SDF and its components for $\gamma = 10$. For the ultimate consumption risk model, we set $S = 11$ quarters because the fit of the model is the greatest at this value, as shown in Parker and Julliard (2005).

Figure 1, Panel A reports the results for the standard C-CAPM in Equation (17). The black curve with circles shows the relative entropy of the model-implied SDF as a function of the risk aversion coefficient. For this model, the missing component of the SDF, $\psi_t$, is a constant and, therefore, has relative entropy zero for all values of $\gamma$, as shown by the orange straight line with inverted triangles. The blue curve with "+" signs and the yellow curve with inverted triangles show the relative entropy, as a function of the risk aversion coefficient, of the filtered SDF and its missing component, respectively. The model satisfies the HJ bound only for very high values of $\gamma > 64$, as shown by the green dotted-dashed vertical line. It satisfies the Q1 bound for even higher values of $\gamma > 72$, as shown by the red dashed vertical line. The minimum value of $\gamma$ at which the M1 bound is satisfied is given by the intersection of the black and blue curves, i.e. it is the minimum value of $\gamma$ for which the relative entropy of the model-implied SDF exceeds that of the filtered SDF. The figure shows that this corresponds to $\gamma = 107$. Finally, the $\Psi$1 bound is the minimum value of $\gamma$ for which the missing component of the model-implied SDF has a higher relative entropy than the missing component of the filtered SDF. Since the former has zero relative entropy while the latter has a strictly positive value for all values of $\gamma$, the model fails to satisfy the $\Psi$1 bound for any value of $\gamma$. That is, all the bounds reject the model for low RRA even though the best fitting level for the RRA coefficient is smaller than 10 ($\gamma = 1.5$) and at this value of the coefficient the model is able to explain about 60% of the cross-sectional variation across the 25 Fama-French portfolios.
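To make the mechanics of such a threshold concrete, the following sketch computes the canonical HJ lower bound on the volatility of the SDF from a panel of excess returns and scans a grid of risk aversion values for the smallest one at which the power utility kernel clears it. It covers only the HJ bound, not the entropy-based Q, M, and Ψ bounds; the function names and the grid search are illustrative assumptions rather than the procedure used to produce the figures.

```python
import numpy as np

def hj_bound(excess_returns):
    """HJ lower bound on std(M) / E[M] implied by excess returns:
    sqrt(mu' Sigma^{-1} mu), with mu and Sigma the sample mean and covariance."""
    r = np.asarray(excess_returns, dtype=float)
    r = r.reshape(len(r), -1)                     # T x N, also for a single asset
    mu = r.mean(axis=0)
    sigma = np.atleast_2d(np.cov(r, rowvar=False))
    return float(np.sqrt(mu @ np.linalg.solve(sigma, mu)))

def min_gamma_hj(cons_growth, excess_returns, gammas):
    """Smallest gamma on the grid for which the kernel (C_{t+1}/C_t)^(-gamma)
    is volatile enough to satisfy the HJ bound; None if no grid value does."""
    bound = hj_bound(excess_returns)
    g = np.asarray(cons_growth, dtype=float)      # gross consumption growth
    for gamma in gammas:
        m = g ** (-gamma)
        if m.std(ddof=1) / m.mean() >= bound:
            return gamma
    return None
```

The discount factor is omitted because it scales the mean and the standard deviation of the kernel equally and so drops out of the ratio.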
Panel B shows that very similar results are obtained for the Q2, M2, and $\Psi$2 bounds. The Q2 and M2 bounds are satisfied for values of $\gamma$ at least as large as 73 and 99, respectively, while the $\Psi$2 bound is not satisfied for any value of $\gamma$. Overall, as suggested by the theoretical predictions, the Q-bounds are tighter than the HJ-bound, the M-bounds are tighter than the Q-bounds, and the $\Psi$-bounds are tighter than the M-bounds.

Figure 2 presents analogous results to Figure 1 for the ultimate consumption risk model in Equation (18). Panel A shows that the HJ, Q1, and M1 bounds are satisfied for $\gamma >$ 22, 23, and 46, respectively. These are almost three times, more than three times, and more than two times smaller, respectively, than the corresponding values in Figure 1, Panel A for the contemporaneous consumption risk model. As for the latter model, the $\Psi$1 bound is not satisfied for any value of $\gamma$. Panel B shows that the Q2 and M2 bounds are satisfied for $\gamma >$ 24 and 47, respectively, while the $\Psi$2 bound is not satisfied for any value of $\gamma$.

Figure 3, Panel A plots the time series of the filtered SDF and its components estimated using Equation (6) for $\gamma = 10$. The grey shaded areas represent NBER-dated recessions while the green dashed vertical lines correspond to the major stock market crashes identified in Mishkin and White (2002). The figure reveals two main points. First, the estimated SDF has a clear business cycle pattern, but also shows significant and sharp reactions to financial market crashes that do not result in economy-wide contractions. Second, the time series of the SDF almost coincides with that of the unobservable component. In fact, the correlation between the two time series is 0.996. The observable, or consumption growth, component of the SDF, on the other hand, has a correlation of only 0.06 with the SDF. Therefore, most of the variation in the SDF comes from variation in the unobservable component, $\psi$, and not from the consumption growth component. In fact, the volatilities of the SDF and its unobservable component are very similar at 90.5% and 92.9%, respectively, while the volatility of the consumption growth component is an order of magnitude smaller at 8.0%. Similar results are obtained in Panel B, which plots the time series of the filtered SDF and its components estimated using Equation (4) for $\gamma = 10$.

Finally, Figure 4, Panel A plots the time series of the filtered SDF and its components estimated using Equation (6) for $\gamma = 10$ for the ultimate consumption risk model. The figure shows that, as in the contemporaneous consumption risk model, the estimated SDF has a clear business cycle pattern, but also shows significant and sharp reactions to financial market crashes that do not result in economy-wide contractions. However, differently from the latter model, the time series of the consumption growth component is much more volatile and more highly correlated with the SDF. The volatility of the consumption growth component is 21.7%, more than 2.5 times higher than that for the standard model. The correlation between the SDF and its consumption growth component is 0.37, an order of magnitude bigger than the correlation of 0.06 in the contemporaneous consumption risk model. This explains the ability of the model to account for a much larger fraction of the variation in expected returns across the 25 Fama-French portfolios for low levels of the risk aversion coefficient. In fact, the cross-sectional $R^2$ of the model is 54.1%, an order of magnitude higher than the value of 5.2% for the standard model. However, the correlation between the ultimate consumption risk SDF and its unobservable component is still very high at 0.92, showing that the model is missing important elements that would further improve its ability to explain the cross-section of returns. Similar results are obtained in Panel B, which plots the time series of the filtered SDF and its components estimated using Equation (4) for $\gamma = 10$.
Overall, the results show that our methodology provides useful diagnostics for dynamic asset pricing models. The very similar results obtained using Equations (4) and (6) also demonstrate the robustness of our approach.

Application to More General Models of Dynamic Economies
Our methodology provides useful diagnostics to assess the empirical plausibility of a large class of representative agent consumption-based asset pricing models where the SDF, $M_t$, can be factorized into an observable component consisting of a parametric function of consumption, $C_t$, as in the standard time-separable power utility model, and a potentially unobservable one, $\psi_t$, that is model-specific:

$$M_t = \left(\frac{C_t}{C_{t-1}}\right)^{-\gamma} \psi_t.$$

In this section, we apply it to a set of "winners" asset pricing models, i.e. frameworks that can successfully explain the Equity Premium and the Risk Free Rate Puzzles with "reasonable" calibrations. In particular, we consider the external habit formation models of Campbell and Cochrane (1999) and Menzly, Santos, and Veronesi (2004), the long-run risks model of Bansal and Yaron (2004), and the housing model of Piazzesi, Schneider, and Tuzel (2007). We apply our methodology to assess the empirical plausibility of these models in two ways. First, for each model we compute the values of the power coefficient, $\gamma$, at which the model-implied SDF satisfies the HJ, Q, M, and $\Psi$ bounds. To simplify the exposition, we focus on one-dimensional bounds as a function of the risk aversion parameter, $\gamma$, while fixing the other parameters at the authors' preferred values. We show that, as suggested by the theoretical predictions, the Q-bounds are generally tighter than the HJ-bound, and the M-bounds are always tighter than both the HJ and Q bounds. Second, since our methodology identifies the most likely time series of the SDF, we compare this time series with the model-implied time series of the SDF for each model. In the next subsection we present the models considered. The reader familiar with these models can go directly to Section 5.2 without loss of continuity.

External Habit Formation Model: Campbell and Cochrane (1999)
In this model, identical agents maximize power utility defined over the difference between consumption and a slow-moving habit or time-varying subsistence level. The SDF is given by

$$M_t = \delta \left(\frac{C_t}{C_{t-1}}\right)^{-\gamma} \left(\frac{S_t}{S_{t-1}}\right)^{-\gamma}, \tag{19}$$
where $\delta$ is the subjective time discount factor, $\gamma$ is the curvature parameter, and $S_t = (C_t - X_t)/C_t$ denotes the surplus consumption ratio. Taking logs we have

$$\ln M_t = \ln\delta - \gamma\,\Delta c_t - \gamma\,\Delta s_t,$$

where lower case letters denote the natural logarithms of the upper case letters. Therefore, in this model, the expression for $\ln(\psi_t)$ is given by:

$$\ln(\psi_t) = \ln\delta - \gamma\,\Delta s_t. \tag{20}$$

Note that the missing component, $\psi$, depends on the surplus consumption ratio, $S$, that is not observed. To obtain the time series of $\psi$, we extract the surplus consumption ratio from observed consumption data as follows. In this model, the aggregate consumption growth is assumed to follow an i.i.d. process:

$$\Delta c_t = g + \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d. } N(0, \sigma^2).$$

The log surplus consumption ratio evolves as a heteroskedastic AR(1) process:

$$s_t = (1-\phi)\,\bar{s} + \phi\, s_{t-1} + \lambda(s_{t-1})\,\varepsilon_t, \tag{21}$$

where $\bar{s} = \ln \bar{S}$, $\bar{S} = \sigma\sqrt{\gamma/(1-\phi)}$, and the sensitivity function is $\lambda(s) = \frac{1}{\bar{S}}\sqrt{1 - 2(s - \bar{s})} - 1$ for $s \le s_{max} = \bar{s} + (1 - \bar{S}^2)/2$ and zero otherwise. For each value of $\gamma$, we use the calibrated values of the model parameters ($\delta$, $g$, $\phi$, $\sigma$) in Campbell and Cochrane (1999) and the innovations in real consumption growth, $\hat{\varepsilon}_t = c_t - c_{t-1} - g$, to extract the time series of the surplus consumption ratio using Equation (21) and, therefore, obtain the time series of the model-implied SDF and its missing component from Equations (19) and (20), respectively.

External Habit Formation Model: Menzly, Santos, and Veronesi (2004)
In this model, the SDF and its missing component are analogous to those in the Campbell and Cochrane (1999) model. The aggregate consumption growth is also assumed to follow an i.i.d. process:

$$dc_t = \mu_c\, dt + \sigma_c\, dB_t,$$

where $\mu_c$ is the mean consumption growth, $\sigma_c > 0$ is a scalar, and $B_t$ is a Brownian motion. The point of departure from the Campbell and Cochrane (1999) model is that the Menzly, Santos, and Veronesi (2004) model assumes that the inverse surplus, $Y_t = 1/S_t$, follows a mean reverting process, perfectly negatively correlated with innovations in consumption growth:

$$dY_t = k\,(\bar{Y} - Y_t)\, dt - \alpha\,(Y_t - \lambda)\,(dc_t - E_t[dc_t]),$$

where $\bar{Y}$ is the long run mean of the inverse surplus and $k$ is the speed of the mean reversion. For each value of $\gamma$, we use the calibrated values of the model parameters ($\delta$, $\mu_c$, $\sigma_c$, $k$, $\bar{Y}$, $\alpha$, $\lambda$) in Menzly, Santos, and Veronesi (2004) and the innovations in real consumption growth to extract the time series of the surplus consumption ratio and, therefore, obtain the time series of the model-implied SDF and its missing component from Equations (19) and (20), respectively.
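The extraction of the surplus consumption ratio for the Campbell and Cochrane (1999) model described above can be sketched as a simple recursion on demeaned consumption growth. The snippet assumes the standard Campbell and Cochrane sensitivity function, starts the recursion at the steady-state value (an arbitrary initialization chosen here for illustration), and assumes that the missing component absorbs the discount factor; the function names are hypothetical.

```python
import numpy as np

def cc_log_surplus(dc, gamma, g, phi, sigma):
    """Iterate the heteroskedastic AR(1) for the log surplus ratio, feeding in
    the innovations dc[t] - g. The returned array has len(dc) + 1 entries,
    the first being the assumed starting value s_bar."""
    S_bar = sigma * np.sqrt(gamma / (1.0 - phi))   # steady-state surplus ratio
    s_bar = np.log(S_bar)
    s_max = s_bar + 0.5 * (1.0 - S_bar ** 2)       # upper bound on s
    s = np.empty(len(dc) + 1)
    s[0] = s_bar                                   # assumption: start at steady state
    for t, growth in enumerate(dc):
        # sensitivity function: positive below s_max, zero above it
        lam = np.sqrt(1.0 - 2.0 * (s[t] - s_bar)) / S_bar - 1.0 if s[t] <= s_max else 0.0
        s[t + 1] = (1.0 - phi) * s_bar + phi * s[t] + lam * (growth - g)
    return s

def cc_log_psi(s, gamma, delta):
    """Log missing component, assuming ln(psi_t) = ln(delta) - gamma * (s_t - s_{t-1})."""
    return np.log(delta) - gamma * np.diff(s)
```

When consumption growth equals its mean in every period, the recursion stays at the steady state and the log missing component collapses to the log discount factor, which is a convenient sanity check on any implementation.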

Long-Run Risks Model: Bansal and Yaron (2004)
The Bansal and Yaron (2004) long-run risks model assumes that the representative consumer has the version of Kreps and Porteus (1978) preferences adopted by Epstein and Zin (1989) and Weil (1989), for which the SDF is given by

$$M_{t+1} = \delta^{\theta} \left(\frac{C_{t+1}}{C_t}\right)^{-\theta/\rho} \exp\{(\theta - 1)\, r_{c,t+1}\}, \tag{23}$$

where $r_{c,t+1}$ is the unobservable log gross return on an asset that delivers aggregate consumption as its dividend each period, $\delta$ is the subjective time discount factor, $\rho$ is the elasticity of intertemporal substitution, $\theta = \frac{1-\gamma}{1-1/\rho}$, and $\gamma$ is the risk aversion coefficient.
The aggregate consumption and dividend growth rates, $\Delta c_{t+1}$ and $\Delta d_{t+1}$, respectively, are modeled as containing a small persistent expected growth rate component, $x_t$, and fluctuating variance, $\sigma_t^2$. The shocks $z_{x,t+1}$, $z_{\sigma,t+1}$, $z_{c,t+1}$, and $z_{d,t+1}$ driving these processes are assumed to be i.i.d. $N(0,1)$ and mutually independent.
For the log-linearized version of the model, the log price-consumption ratio, $z_t$, the log price-dividend ratio, $z_{m,t}$, and the log risk free rate, $r_{f,t}$, are affine functions of the state variables, $x_t$ and $\sigma_t^2$ (Equations (25)-(27)). Constantinides and Ghosh (2010) argue that Equations (26) and (27) express the observable variables, $z_{m,t}$ and $r_{f,t}$, as affine functions of the latent state variables, $x_t$ and $\sigma_t^2$. Therefore, these equations may be inverted to express the unobservable state variables, $x_t$ and $\sigma_t^2$, in terms of the observables, $z_{m,t}$ and $r_{f,t}$. Substituting the log-affine approximation $r_{c,t+1} = \kappa_0 + \kappa_1 z_{t+1} - z_t + \Delta c_{t+1}$ into the expression for the pricing kernel (Equation (23)), and noting that $z_t$ is given by Equation (25), we obtain the pricing kernel as a function of consumption growth and the state variables (Equation (28)). Expressing the state variables as affine functions of $z_{m,t}$ and $r_{f,t}$, we obtain Equation (29), in which the SDF is an exponentially affine function of log consumption growth, the log price-dividend ratio and its lag, and the log risk free rate and its lag, and where the parameters $c = (c_1, c_3, c_4)'$ are functions of the parameters of the time-series processes and the preference parameters. The model is calibrated at the monthly frequency. Since we assess the empirical plausibility of models at the quarterly and annual frequencies, we obtain the pricing kernels at these frequencies by aggregating the monthly kernels. For instance, the quarterly pricing kernel is obtained as the product of the monthly kernels within the quarter (Equation (30)), and the corresponding expression for $\ln(\psi_t)$ is given in Equation (31). For each value of $\gamma$, we use the calibrated parameter values from Bansal and Yaron (2004) and the time series of the price-dividend ratio and risk free rate to obtain the time series of the SDF and its missing component, $\psi$, in Equations (30) and (31), respectively.

Housing: Piazzesi, Schneider, and Tuzel (2007)
In this model, the pricing kernel is given by:

$$M_t = \delta \left(\frac{C_t}{C_{t-1}}\right)^{-1/\sigma} \left(\frac{A_t}{A_{t-1}}\right)^{\frac{\varepsilon-\sigma}{\sigma(\varepsilon-1)}}, \qquad A_t = \frac{P^c_t C_t}{P^c_t C_t + P^s_t S_t}, \tag{32}$$

where $A_t$ is the expenditure share on non-housing consumption, $P^s_t$ and $P^c_t$ are the prices of housing and non-housing consumption, respectively, and $S_t$ and $C_t$ are the housing and non-housing consumption, respectively. $\sigma$ is the intertemporal elasticity of substitution and $\varepsilon$ is the intratemporal elasticity of substitution between housing services and non-housing consumption. Taking logs we have:

$$\ln M_t = \ln\delta - \frac{1}{\sigma}\,\Delta c_t + \frac{\varepsilon-\sigma}{\sigma(\varepsilon-1)}\,\Delta a_t.$$

Therefore, in this model, the expression for $\ln(\psi_t)$ is given by:

$$\ln(\psi_t) = \ln\delta + \frac{\varepsilon-\sigma}{\sigma(\varepsilon-1)}\,\Delta a_t. \tag{33}$$

For each value of $\gamma = 1/\sigma$, we use the calibrated values of the model parameters ($\delta$, $\varepsilon$) in Piazzesi, Schneider, and Tuzel (2007) to obtain the time series of the model-implied SDF and its missing component from Equations (32) and (33), respectively.
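For the long-run risks model, which is calibrated at the monthly frequency, the time aggregation of the kernel mentioned above amounts to compounding one-period kernels, i.e. summing their logs within each lower-frequency period. The sketch below assumes the monthly log kernels are already available as an array aligned so that each consecutive block of three months forms a quarter; the function name and the choice to drop an incomplete trailing block are illustrative assumptions.

```python
import numpy as np

def aggregate_log_kernel(log_m_monthly, months_per_period=3):
    """Sum monthly log kernels over non-overlapping blocks, which is the log of
    the product of the monthly kernels within each block.
    Use months_per_period=12 for annual kernels."""
    x = np.asarray(log_m_monthly, dtype=float)
    n = (len(x) // months_per_period) * months_per_period  # drop incomplete block
    return x[:n].reshape(-1, months_per_period).sum(axis=1)

# Example: seven monthly observations give two complete quarters.
quarterly_log_m = aggregate_log_kernel([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
```

Exponentiating the result recovers the lower-frequency kernel in levels.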

Empirical Results
We apply our methodology to assess the empirical plausibility of the models just described in two ways. First, for each model we compute the minimum values of the power coefficient, $\gamma$, at which the model-implied SDF satisfies the HJ, Q, M, and $\Psi$ bounds. Table 1 reports the results at the quarterly frequency. Panels A, B, C, D, E, and F report results when the set of assets used in the construction of the bounds includes the market, 25 Fama-French, 10 size-sorted, 10 book-to-market-equity-sorted, 10 momentum-sorted, and 10 industry-sorted portfolios, respectively. Consider first the results for the HJ, Q1, M1, and $\Psi$1 bounds. The first row in each panel presents the bounds for the Campbell and Cochrane (1999) external habit model (henceforth referred to as CC). Panel A shows that when the excess return on the market portfolio is used in the construction of the bounds, the minimum value of $\gamma$ at which the pricing kernel satisfies the HJ, Q1, M1, and $\Psi$1 bounds is 1.4 in all four cases. However, when the set of test assets consists of the excess returns on the 25 Fama-French portfolios, Panel B shows that the HJ, Q1, M1, and $\Psi$1 bounds are satisfied for a minimum value of $\gamma$ = 7.3, 9.8, 9.9, and 13.9, respectively. Therefore, as suggested by the theoretical predictions, the Q-bound is tighter than the HJ-bound, and the M-bound is tighter than the Q-bound. Note that in this model, the coefficient of risk aversion is $\gamma/S_t$, where $S_t$ is the surplus consumption ratio. For $\gamma = 2$, the calibrated value in CC, the risk aversion varies over $[20, \infty)$. Panel B reveals that the Q-bound is satisfied for $\gamma > 9.8$, implying that the risk aversion varies over $[44.5, \infty)$, the M-bound is satisfied for $\gamma > 9.9$, implying that the risk aversion varies over $[43.0, \infty)$, and the $\Psi$-bound is satisfied for $\gamma > 13.9$, implying that the risk aversion varies over $[51.5, \infty)$.
A similar ordering of the bounds is obtained when the set of assets consists of the 10 size-sorted, 10 book-to-market-equity-sorted, 10 momentum-sorted, and 10 industry-sorted portfolios in Panels C, D, E, and F, respectively. Also, very similar results are obtained for the Q2, M2, and $\Psi$2 bounds, pointing to the robustness of our methodology.
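The lower end of the risk aversion interval in the CC model can be traced to the upper bound on the surplus consumption ratio. As a hedged sketch, assuming the steady-state and maximum surplus ratios take the form given in Campbell and Cochrane (1999), namely $\bar{S} = \sigma\sqrt{\gamma/(1-\phi)}$ and $s_{max} = \bar{s} + (1-\bar{S}^2)/2$, the smallest local risk aversion is the curvature parameter divided by $S_{max}$. The function name is hypothetical and the parameter values in the usage line are placeholders, not the calibration behind the tables.

```python
import math

def cc_min_risk_aversion(gamma, phi, sigma):
    """Lower end of the local risk aversion range gamma / S_t, attained at the
    maximum surplus consumption ratio S_max (Campbell-Cochrane functional form)."""
    S_bar = sigma * math.sqrt(gamma / (1.0 - phi))        # steady-state surplus ratio
    s_max = math.log(S_bar) + 0.5 * (1.0 - S_bar ** 2)    # log of the maximum ratio
    return gamma / math.exp(s_max)

# Placeholder inputs for illustration only.
lower_end = cc_min_risk_aversion(gamma=2.0, phi=0.87, sigma=0.015)
```

Because $S_{max}$ itself depends on the curvature parameter through $\bar{S}$, the lower end of the interval does not scale one-for-one with that parameter.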
The second row in each panel presents the bounds for the Menzly, Santos, and Veronesi (2004) external habit model (henceforth referred to as MSV). When the set of test assets consists of the excess return on the market portfolio, the HJ, Q1, M1, and $\Psi$1 bounds are satisfied for a minimum value of $\gamma$ = 11.4, 11.2, 12.4, and 15.7, respectively. For the 25 Fama-French portfolios, the bounds are much higher at 27.8, 31.7, 33.9, and 53.3, respectively. Therefore, this model requires very high values of the local curvature of the utility function to explain the equity premium and the cross-section of asset returns. In fact, this model requires much higher levels of risk aversion compared to the CC model for each of the sets of test assets. As in the case of the CC model, very similar results are obtained for the Q2, M2, and $\Psi$2 bounds.
The third row in each panel presents the bounds for the Bansal and Yaron (2004) long run risks model (henceforth referred to as BY). Panel A shows that when the excess return on the market portfolio is used in the construction of the bounds, the minimum value of $\gamma$ at which the pricing kernel satisfies the HJ, Q1, M1, and $\Psi$1 bounds is 3.0 in all four cases. When the set of test assets consists of the excess returns on the 25 Fama-French portfolios, Panel B shows that the HJ bound is satisfied for a minimum value of $\gamma = 4.0$ while the Q1, M1, and $\Psi$1 bounds are satisfied for a minimum value of $\gamma = 5.0$. Similar results are obtained for the other sets of portfolios and for the Q2, M2, and $\Psi$2 bounds. In this model, $\gamma$ represents the coefficient of relative risk aversion. Therefore, the results in Panels A-F reveal that the model-implied pricing kernel satisfies the HJ, Q, M, and $\Psi$ bounds for reasonable values of the risk aversion coefficient for all sets of test assets.
Overall, Table 1 demonstrates that, in line with the theoretical underpinnings of the various bounds, the Q-bound is generally tighter than the HJ-bound because it naturally exploits the restriction that the SDF is a strictly positive random variable. The M-bound is tighter than the Q-bound because it formally takes into account the ability of the SDF to price assets. This relative ordering holds for a variety of different dynamic asset pricing models. Furthermore, the results suggest that while the external habit models of CC and MSV and the housing model of PST require very high levels of risk aversion to satisfy the bounds, the long run risks model of BY satisfies the bounds for reasonable levels of risk aversion for all the sets of test assets. Table 2 reports bounds analogous to those in Table 1 at the annual frequency. The table shows that, at the annual frequency, the HJ, Q, M, and $\Psi$ bounds are satisfied for much smaller values of the utility curvature parameter, $\gamma$, for each of the models considered and for each set of test assets. There is also less dispersion between the bounds compared to the quarterly data in Table 1. However, in line with the theoretical predictions, the Q-bound is generally tighter than the HJ-bound, and the M-bound is generally tighter than the Q-bound.
Our second approach to assessing the empirical plausibility of these models is based on the observation that our methodology identifies the most likely time series of the SDF, which we call the filtered SDF. We compare the filtered SDF with the model-implied SDF for each model. Note that the filtered SDF and its missing component depend on the local curvature of the utility function, $\gamma$. Therefore, for each model, we fix $\gamma$ at its calibrated value and extract the time series of the SDF and its components. Table 3 reports the results at the quarterly frequency. In order to examine the models' ability to explain the cross-section of asset returns, we do not consider the market return on its own but focus instead on multiple test assets. Panels A, B, C, D, and E report results for the following sets of test assets: 25 Fama-French, 10 size-sorted, 10 book-to-market-equity-sorted, 10 momentum-sorted, and 10 industry-sorted portfolios, respectively. The first column reports the correlation between the filtered time series of the missing component of the SDF, $\{\psi_t\}_{t=1}^T$, and the corresponding model-implied time series, $\{\psi^m_t\}_{t=1}^T$. The second column shows the correlation between the filtered SDF, $\{M_t = m_t \psi_t\}_{t=1}^T$, where $m_t = (C_t/C_{t-1})^{-\gamma}$, and the model-implied SDF, $\{M^m_t\}_{t=1}^T$.

Consider first the results for the CC external habit model that are presented in the first row of each panel. For this model, the utility curvature parameter is set to the calibrated value of $\gamma = 2$. Panel A, Column 1 shows that when the 25 FF portfolios are used in the extraction of $\psi$, the correlation between the filtered and model-implied $\psi$ is only 0.02 when $\psi$ is estimated using Equation (6). Column 2 shows that the correlation between the filtered and model-implied SDFs is marginally higher at 0.05. When $\psi$ is estimated using Equation (4), the correlations are very similar at 0.06 and 0.08, respectively. Panels B-E show that the correlations between the filtered and model-implied SDFs and their missing components remain small for all the other sets of portfolios.
The second row in each panel presents the results for the MSV external habit model. In this case, $\gamma$ is set equal to 1, which is the calibrated value in the model. Row 2 in each panel shows that the results for the MSV model are very similar to those for the CC model. When $\psi$ is estimated using Equation (6), the correlations between the filtered and model-implied missing components of the SDFs are small, varying from 0.00 for the 25 FF portfolios to 0.20 for the size-sorted portfolios. The correlations between the filtered and model-implied SDFs are marginally higher, varying from 0.02 for the 25 FF portfolios to 0.24 for the size-sorted portfolios. Similar results are obtained when $\psi$ is estimated using Equation (4).
The third row in each panel presents the results for the BY long run risks model. Note that the long run risks model implies that the SDF is an exponentially affine function of the log aggregate consumption growth, the market-wide log price-dividend ratio and its lag, and the log risk free rate and its lag (Equation (29)), where the parameters $c = (c_1, c_3, c_4)'$ are functions of the underlying model parameters, some of which are not "deep" preference parameters but instead characterizations of the data generating processes. Since the parameters of the data generating processes could in principle be different in different samples, we present two types of results for the SDF of the BY model. First, we present results where the restrictions on the parameter vector, $c$, implied by the BY calibration are imposed (Row 3). Second, we provide results where the parameter vector $c$ is treated as free (in parentheses in Row 3). The parameter $\gamma$ is set equal to the BY calibrated value of 10. Row 3, Panel A, Column 1 shows that when the 25 FF portfolios are used in filtering the SDF, the correlation between the filtered and model-implied missing components of the SDFs is 0.10 (0.11) when the restrictions are imposed on the coefficients $c$ and $\psi$ is estimated using Equation (6) (Equation (4)). This is an order of magnitude higher than the values obtained for the CC and MSV models in Rows 1 and 2, respectively. When the coefficients $c$ are treated as free parameters, the correlation more than doubles from 0.10 (0.11) to 0.27 (0.29). Column 2 shows that the correlation between the filtered and model-implied SDFs is 0.11 (0.12) in the presence of the restrictions and is more than two times higher at 0.30 (0.31) when the restrictions are not imposed.
Similar results are obtained in Panels B-E for the other sets of test assets. The correlation between the filtered and model-implied missing components of the SDF varies from 0.12 (0.11) for the 10 momentum-sorted portfolios to 0.38 (0.38) for the size-sorted portfolios for the restricted specification. These are often an order of magnitude higher than the correlations obtained for the CC and MSV models. For the unrestricted specification, the correlations more than double, varying from 0.44 (0.44) for the 10 momentum-sorted portfolios to 0.80 (0.82) for the size-sorted portfolios. These results show that the SDF implied by the long run risks model correlates much more strongly with the non-parametrically extracted most likely time series of the SDF than the external habit models of CC and MSV.
The fourth row in each panel presents the results for the PST housing model. In this case, $\gamma$ is set equal to 16, which is the calibrated value in the original paper. Column 1 shows that the correlations between the filtered and model-implied missing components of the SDFs are very small and often have the wrong sign, varying from -0.22 (-0.20) for the size-sorted portfolios to 0.13 (0.12) for the industry-sorted portfolios when $\psi$ is estimated using Equation (6) (Equation (4)). The correlations between the filtered and model-implied SDFs are marginally higher, varying from -0.01 (-0.02) for the size-sorted portfolios to 0.19 (0.17) for the industry-sorted portfolios. Table 4 reports results analogous to those in Table 3 at the annual frequency. The results are largely similar to those in Table 3. The table shows that, at the annual frequency, the SDF implied by the long run risks model correlates even more strongly with the filtered SDF relative to the external habit and housing models.
The last two columns of Tables 3 and 4 report the cross-sectional $R^2$ implied by the model SDF for the different sets of test assets at the quarterly and annual frequencies, respectively. The cross-sectional $R^2$ is obtained by performing a cross-sectional regression of the historical average returns on the model-implied expected returns. Column 3 reports the cross-sectional $R^2$ when there is no intercept in the regression while Column 4 presents results when an intercept is included. The results reveal that, when evaluated using different sets of assets, the cross-sectional $R^2$ often varies wildly for the same model, and often takes on large negative values when an intercept is not allowed in the cross-sectional regression. This is in stark contrast with the results based on entropy bounds in Tables 1 and 2, which tend instead to give consistent results for each model across different sets of assets (even though all models seem to perform better, along this dimension, at the annual frequency).
A notable exception to the poor cross-sectional performance of the models considered is that, at the annual frequency, the BY model, unlike the CC, MSV, and PST models, has a stable cross-sectional $R^2$ for the size and BM-sorted portfolios both in the presence and absence of an intercept.
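The sensitivity of the cross-sectional $R^2$ to the intercept can be seen from how the statistic is computed. The sketch below assumes the $R^2$ is defined as one minus the ratio of the residual sum of squares to the demeaned total sum of squares, which is what allows negative values when the intercept is excluded; this definition and the function name are assumptions, since the exact formula is not stated here.

```python
import numpy as np

def cross_sectional_r2(avg_returns, model_expected, intercept=True):
    """R^2 from regressing average returns on model-implied expected returns.
    Without an intercept the fitted line is forced through the origin, so the
    residual sum of squares can exceed the demeaned total sum of squares."""
    y = np.asarray(avg_returns, dtype=float)
    x = np.asarray(model_expected, dtype=float)
    X = np.column_stack([np.ones_like(x), x]) if intercept else x[:, None]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

# A model that ranks assets perfectly but misses the level of returns:
y = [10.0, 11.0, 12.0]
x = [-1.0, 0.0, 1.0]
r2_with = cross_sectional_r2(y, x, intercept=True)
r2_without = cross_sectional_r2(y, x, intercept=False)
```

In this toy example the fit is perfect once an intercept absorbs the common level, while the no-intercept statistic is strongly negative, which is the pattern of instability described in the text.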
Overall, Tables 3 and 4 make two main points. First, they demonstrate the robustness of our estimation methodology: very similar results are obtained using Equations (6) and (4). Second, they show that the long run risks model implies an SDF that is the most highly correlated with the filtered SDF, i.e. the most likely SDF given the data.
Tables 5 and 6 report the correlations between the filtered and model-implied SDFs and the three Fama-French (FF) factors at the quarterly and annual frequencies, respectively. Column 1 presents the correlation between the model-implied SDF and the three FF factors. This is computed by performing a linear regression of the model-implied time series of the SDF, $\{M^m_t\}_{t=1}^T$, on the three FF factors and computing the correlation between $M^m$ and the fitted value from the regression. Similarly, Columns 4 and 5 present the correlation of the filtered SDF and its missing component with the three FF factors, respectively. These columns provide interesting results because the FF factors have been very successful at explaining the cross-sectional variation in returns between different classes of financial assets.
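The correlation between an SDF series and a set of factors, as described above, is the multiple correlation coefficient from a linear projection. A minimal sketch follows; the inclusion of a constant in the regression and the function name are assumptions.

```python
import numpy as np

def multiple_correlation(sdf, factors):
    """Correlation between a series and its fitted value from an OLS
    regression on the factors (a constant is included by assumption)."""
    y = np.asarray(sdf, dtype=float)
    F = np.asarray(factors, dtype=float).reshape(len(y), -1)   # T x K
    X = np.column_stack([np.ones(len(y)), F])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return float(np.corrcoef(y, fitted)[0, 1])
```

With a constant in the regression, this statistic equals the square root of the regression $R^2$, so it lies between zero and one and cannot reveal the sign of the relation with any individual factor.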
Consider first Table 5. Row 1 of each panel shows that for the CC model, the correlation between the model-implied SDF and the three FF factors is small at 0.18. Panel A, Row 1, Column 2 shows that, while the model-implied SDF correlates poorly with the FF factors, the filtered SDF correlates very highly with the factors, having a correlation coefficient of 0.54 and 0.59 when $\psi$ is estimated using Equations (6) and (4), respectively. This is reassuring for our methodology because, as is well known, the FF factors are successful in explaining a large fraction of the cross-sectional dispersion in asset returns. Moreover, Column 3 reveals that this high correlation is due almost entirely to the missing component, $\psi$, and not $m$: the correlation between the filtered SDF and the FF factors is the same as that between the filtered missing component of the SDF and the FF factors. The results in Panels B-E are largely similar: the filtered SDF and its missing component have high correlation with the FF factors for all the different sets of test assets, varying from 0.52 (0.52) for the momentum-sorted portfolios to 0.87 (0.89) for the size-sorted portfolios, and the high correlation is almost entirely due to the missing component $\psi$.
Row 2 in each panel shows that for the MSV model, the correlation between the model-implied SDF and the FF factors is small at 0.21. As with the CC model, the filtered SDF correlates strongly with the FF factors, and this correlation is almost entirely driven by the missing component of the SDF and not the consumption growth component.
Row 3 in each panel shows that for the BY model, the correlation between the model-implied SDF and the FF factors is 0.45 in the presence of the restrictions. This is more than double the correlations obtained for the CC and MSV models. Moreover, the correlation roughly doubles again when the restrictions are not imposed, varying from 0.87 to 0.92.
Finally, Row 4 in each panel shows that for the PST model, the correlation between the model-implied SDF and the FF factors is very small at 0.07. The filtered SDF, on the other hand, correlates strongly with the FF factors, and this correlation is almost entirely driven by the missing component of the SDF and not the consumption growth component. Table 6 reveals that very similar results are obtained at the annual frequency. Tables 5 and 6 demonstrate the robustness of our estimation methodology: the filtered time series of the SDF and its missing component are quite robust to the choice of the utility curvature parameter and the choice of the set of assets.

Conclusion
In this paper, we propose an information-theoretic approach to assess the empirical plausibility of candidate SDFs for a large class of dynamic asset pricing models. The models we consider are characterized by having a pricing kernel that can be factorized into an observable and component, consisting in general of a parametric function of consumption, and a potentially unobservable one that is model-speci…c.
Based on this decomposition of the pricing kernel, we provide three major contributions. First, we construct a new set of entropy bounds that build upon and improve the ones suggested in the previous litterature in that a) they naturally impose the non negativity of the pricing kernel (c.f. Hansen and Jagannathan (1991)), b) they are generally thigther and have higher information content (c.f. Hansen and Jagannathan (1991) and Stutzer (1995Stutzer ( , 1996), and c) allow to utilize the information contained in a large cross-section of asset returns (cf. Alvarez and Jermann (2005)).
Second, using a relative entropy minimization approach, we also extract nonparametrically the time series of both the SDF and its unobservable component. Given the data, this methodology identi…es the most likely -in the information theoretic sense -time series of the SDF and its unobservable component. Applying this methodology to the data we …nd that the estimated SDF has a clear business cycle pattern, but also shows signi…cant and sharp reactions to …nancial market crashes that do not result in economy wide contractions.
Third, applying the methodology developed in this paper to a large class of dynamic asset pricing models, we find that the external habit models of Campbell and Cochrane (1999) and Menzly, Santos, and Veronesi (2004) and the housing model of Piazzesi, Schneider, and Tuzel (2007) require very high levels of risk aversion to satisfy the bounds, while the long run risks model of Bansal and Yaron (2004) satisfies the bounds for reasonable levels of risk aversion. These results are robust to the choice of test assets used in the construction of the bounds as well as the frequency of the data. Moreover, comparing the non-parametrically extracted SDF with those implied by the above asset pricing models, we again find substantial empirical support for the long run risks framework.
The methodology developed in this paper is quite general and may be applied to any model that delivers well-defined Euler equations, such as models with heterogeneous agents, limited stock market participation, or fragile beliefs, as long as the SDF can be factorized into an observable component and a potentially unobservable one.

The table reports the values of the utility curvature parameter at which the model-implied SDF satisfies the HJ, Q, M, and bounds using quarterly data over 1947:2-2009:4. Panels A, B, C, D, E, and F report results when the set of assets used in the construction of the bounds includes the market, FF 25, 10 size-sorted, 10 BM-sorted, 10 momentum-sorted, and 10 industry-sorted portfolios, respectively. "CC" and "MSV" denote the Campbell and Cochrane (1999) and Menzly, Santos, and Veronesi (2004) external habit models, respectively. "BY" denotes the Bansal and Yaron (2004) long run risks model and "PST" denotes the Piazzesi, Schneider, and Tuzel (2007) housing model.

The table reports the values of the utility curvature parameter at which the model-implied SDF satisfies the HJ, Q, M, and bounds using annual data over 1930-2009. Panels A, B, C, D, E, and F report results when the set of assets used in the construction of the bounds includes the market, 6 size- and BM-sorted, 10 size-sorted, 10 BM-sorted, 10 momentum-sorted, and 10 industry-sorted portfolios, respectively. "CC" and "MSV" denote the Campbell and Cochrane (1999) and Menzly, Santos, and Veronesi (2004) external habit models, respectively. "BY" denotes the Bansal and Yaron (2004) long run risks model and "PST" denotes the Piazzesi, Schneider, and Tuzel (2007) housing model.

The table reports the correlations between the 3 Fama-French factors and the model-implied SDF, the filtered SDF, and the missing component of the filtered SDF using annual data over 1930-2009.