Volatility, Valuation Ratios, and Bubbles: An Empirical Measure of Market Sentiment

We define a sentiment indicator that exploits two contrasting views of return predictability, and study its properties. The indicator, which is based on option prices, valuation ratios and interest rates, was unusually high during the late 1990s, reflecting dividend growth expectations that in our view were unreasonably optimistic. We interpret it as helping to reveal irrational beliefs about fundamentals. We show that our measure is a leading indicator of detrended volume, and of various other measures associated with financial fragility. We also make two methodological contributions. First, we derive a new valuation-ratio decomposition that is related to the Campbell and Shiller (1988) loglinearization, but which resembles the traditional Gordon growth model more closely and has certain other advantages for our purposes. Second, we introduce a volatility index that provides a lower bound on the market's expected log return.

Furthermore, we show that while the Campbell-Shiller identity is highly accurate on average, the linearization is most problematic at times when the price-dividend ratio is far above its long-run mean. At such times (the late 1990s being a leading example), a researcher who uses the Campbell-Shiller loglinearization will conclude that long-run expected returns are even lower, and/or long-run expected dividend growth is even higher, than is actually the case. Thus the linearization may "cry bubble" too soon.
We therefore propose a new linearization that does not have this feature, but which also relates a measure of dividend yield to expected log returns and dividend growth. Our approach exploits a measure of dividend yield, $y_t = \log(1 + D_t/P_t)$, that has the advantage of being in "natural" units, unlike the quantity $dp_t = \log(D_t/P_t)$ that features in the Campbell-Shiller approach. As a further bonus, the resulting identity bears an even closer resemblance to the traditional Gordon growth model (which it generalizes to allow for time-varying expected returns and dividend growth) than does the Campbell-Shiller loglinearization.
The second ingredient of our paper is a lower bound on expected log returns that plays the role of $E\,R$ in the loose description above. The lower bound relies on an assumption on the form of the stochastic discount factor (SDF). This assumption, the modified negative correlation condition (mNCC), is satisfied, for example, if one takes the perspective of an unconstrained agent who maximizes expected utility over next-period wealth, who chooses to invest his or her wealth fully in the market, and whose relative risk aversion is at least one. An attractive feature of this approach is that it allows the investor in question to coexist with other agents who may or may not be rational. Under the mNCC, our lower bound on expected log returns can be computed directly from index option prices, so it is, broadly speaking, a measure of implied volatility.
The paper is organized as follows. Section 2 discusses the link between valuation ratios, returns, and dividend growth; it analyzes the properties of the Campbell-Shiller loglinearization, introduces our alternative loglinearization, and studies the predictive relationship between dividend yields and future log returns and log dividend growth. Section 3 derives the lower bound on expected returns. Section 4 combines the preceding sections to introduce the sentiment indicator. Section 5 documents the fact that our measure is a leading indicator of detrended volume, and of a long-term earnings growth forecast index that has been constructed by Nagel and Xu (2019); and explores its relationship with various other measures of financial conditions. Section 6 concludes.

Framework
Our approach has two ingredients. The first is the predictive relationship between valuation ratios, returns, and fundamentals that has been explored in the vast predictability literature that started from Keim and Stambaugh (1986), Campbell and Shiller (1988), and Fama and French (1988), among others. We introduce a novel loglinearization,
$$ y_t = (1-\rho)\sum_{i=0}^{\infty}\rho^i\left(r_{t+1+i} - g_{t+1+i}\right), \tag{1} $$
where $r_{t+1}$ is the log return on the market, $g_{t+1}$ is log dividend growth, $y_t$ is the market dividend yield, and $\rho \approx 0.97$ is a loglinearization constant. If, say, the dividend yield follows an AR(1) process,$^1$ then equation (1) implies that
$$ E_t(r_{t+1} - g_{t+1}) = a_0 + a_1 y_t \tag{2} $$
for some constants $a_0$ and $a_1$. (We derive these and related results in Section 2.)
The second ingredient exploits the information in option prices via a strategy introduced by Martin (2017). We assume that the inequality
$$ \operatorname{cov}_t(M_{t+1}R_{t+1}, \log R_{t+1}) \le 0 \tag{3} $$
holds; as this is closely related to the negative correlation condition (NCC) of Martin (2017), we refer to it as the modified negative correlation condition (mNCC).
Here $M_{t+1}$ denotes a stochastic discount factor (SDF) which prices payoffs delivered at time $t+1$ from the perspective of time $t$, and $R_{t+1}$ is the gross return on the market.
If one thinks from the perspective of an investor whose beliefs and risk preferences are consistent with (2) (or with the alternatives mentioned in footnote 1) and (3), then the mNCC holds if this investor, whom we refer to as a representative investor, maximizes utility $E_t\,u(W_{t+1})$, with relative risk aversion $-W u''(W)/u'(W)$ (which need not be constant) at least one, and chooses to invest his or her wealth fully in the market. This setup allows for the possibility that other investors are irrational and/or face trading constraints; but we emphasize that our representative investor is assumed to be unconstrained, and, in particular, to be able to trade in option markets, so that he or she is marginal for option prices. Thus we are ruling out extreme forms of market segmentation by assumption. (A representative investor in this sense is sufficient, but not necessary, for the mNCC to hold. We discuss more general conditions under which the mNCC holds in Section 3.)
We show, in Section 3, that this representative investor's beliefs must then respect the following lower bound on the expected log return on the market, which can be computed directly from option prices:
$$ E_t r_{t+1} - r_{f,t+1} \ge \mathrm{LVIX}_t, \tag{4} $$
where $\mathrm{LVIX}_t$ is the option-based volatility index defined in Section 3. Starting from the decomposition
$$ E_t g_{t+1} = E_t r_{t+1} - E_t(r_{t+1} - g_{t+1}), \tag{5} $$
we have a lower bound on the representative investor's expected dividend growth, from (2), (4), and (5):
$$ E_t g_{t+1} \ge \mathrm{LVIX}_t + r_{f,t+1} - (a_0 + a_1 y_t). \tag{6} $$
To implement this, we replace the population coefficients $a_0$ and $a_1$ by their sample counterparts $\hat a_0$ and $\hat a_1$, which we estimate by OLS.$^2$ We end up with the sentiment indicator $B_t = \mathrm{LVIX}_t + r_{f,t+1} - (\hat a_0 + \hat a_1 y_t)$.
$^2$ We discuss the issue of estimation uncertainty in Section 4.2.1.
We follow the convention in the literature in writing approximations such as (8) with equals signs. (A number of our results below are in fact exact. We emphasize these as they occur.) We also assume throughout the paper that there are no rational bubbles, as is standard in the literature. Thus, for example, in deriving (8) we are assuming that $\lim_{T\to\infty}\rho^T pd_T = 0$.
The approximation (8) is often loosely summarized by saying that high valuation ratios signal high expected dividend growth or low expected returns (or both). But expected log returns are not the same as expected returns:$^3$ we have
$$ \log E_t R_{t+1+i} = E_t r_{t+1+i} + \sum_{n=2}^{\infty}\frac{\kappa^{(n)}_t(r_{t+1+i})}{n!}, $$
where $\kappa^{(n)}_t(r_{t+1+i})$ is the $n$th conditional cumulant of the log return. (If returns are conditionally lognormal, then the higher cumulants $\kappa^{(n)}_t(r_{t+1+i})$ are zero for $n \ge 3$.) Thus high valuations, and low expected log returns, may be consistent with high expected arithmetic returns if log returns are highly volatile, right-skewed, or fat-tailed. Plausibly, all of these conditions were satisfied in the late 1990s. As they are all potential explanations for the rise in valuation ratios at that time,$^4$ we will need to be careful about the distinction between log returns and simple returns.
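To make the gap between expected log returns and log expected returns concrete, the following sketch compares the exact value of $\log E\,R$ with the cumulant series truncated at $n = 4$. The three-state distribution of log returns is a made-up assumption chosen purely for illustration; it is not calibrated to any data used in the paper.

```python
import numpy as np

# Hypothetical three-state distribution of one-period log returns (illustrative only).
log_returns = np.array([-0.30, 0.05, 0.25])
probs = np.array([0.10, 0.60, 0.30])

def cumulants_up_to_four(x, p):
    """First four cumulants of a discrete distribution with outcomes x and probabilities p."""
    mean = np.sum(p * x)
    d = x - mean
    m2 = np.sum(p * d**2)          # variance
    m3 = np.sum(p * d**3)          # third central moment
    m4 = np.sum(p * d**4)          # fourth central moment
    return np.array([mean, m2, m3, m4 - 3.0 * m2**2])

kappa = cumulants_up_to_four(log_returns, probs)
factorials = np.array([1.0, 2.0, 6.0, 24.0])

expected_log_return = kappa[0]                                      # E r
log_expected_return = np.log(np.sum(probs * np.exp(log_returns)))   # exact log E R
series_approx = np.sum(kappa / factorials)                          # series truncated at n = 4

print(expected_log_return, log_expected_return, series_approx)
```

In this example the log expected return exceeds the expected log return, and the first four cumulants account for almost all of the difference.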
Furthermore, the Campbell-Shiller first-order approximation is least accurate when the valuation ratio is far from its mean, as we now show.
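Result 1 (Campbell-Shiller revisited). The log price-dividend ratio $pd_t$ obeys the following exact decomposition:
$$ pd_t = \frac{k}{1-\rho} + \sum_{i=0}^{\infty}\rho^i\left(g_{t+1+i} - r_{t+1+i}\right) + \frac{1}{2}\sum_{i=0}^{\infty}\rho^i\,\psi_{t+1+i}(1-\psi_{t+1+i})\left(pd_{t+1+i} - \overline{pd}\right)^2, \tag{9} $$
where the constants $k$ and $\rho$ are defined as above, and the quantities $\psi_{t+1+i}$ lie between $\rho$ and $1/(1 + e^{dp_{t+1+i}})$.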
Equation (9) becomes a second-order Taylor approximation if $\psi_t$ is assumed equal to $\rho$ for all $t$, and reduces to the Campbell-Shiller loglinearization (8) if the final term on the right-hand side of (9) is neglected entirely.
Result 1 expresses the price-dividend ratio in terms of future log dividend growth and future log returns, as in the Campbell-Shiller approximation, plus a convexity correction.
This convexity correction is small on average. Take the unconditional expectation of the second-order approximation (10):
$$ E\,pd_t = \frac{k}{1-\rho} + \frac{E\,g_t - E\,r_t}{1-\rho} + \frac{\rho}{2}\operatorname{var} pd_t, $$
assuming that $pd_t$, $r_t$, and $g_t$ are stationary so that their unconditional means and variances are well defined. Using CRSP data from 1947 to 2019, the sample average of $pd_t$ is 3.469 (so that $\rho$ is 0.969) and the sample standard deviation is 0.434. Thus the unconditional average convexity correction $\frac{\rho}{2}\operatorname{var} pd_t$ is about 0.0913, that is, about 2.63% of the size of $E\,pd_t$.
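The arithmetic behind these figures can be reproduced in a few lines. The sketch below assumes the standard Campbell-Shiller definition $\rho = e^{\overline{pd}}/(1 + e^{\overline{pd}})$, which is not restated in this section, so the computed $\rho$ may differ from the reported 0.969 in the third decimal place. The function names are illustrative.

```python
import math

# Sample moments of the log price-dividend ratio reported in the text (CRSP, 1947-2019).
MEAN_PD = 3.469
SD_PD = 0.434

def loglinearization_constant(mean_pd):
    """Assumed definition: rho = e^{pd} / (1 + e^{pd}) at the mean log price-dividend ratio."""
    return math.exp(mean_pd) / (1.0 + math.exp(mean_pd))

def average_convexity_correction(rho, sd_pd):
    """Unconditional average convexity correction, (rho / 2) * var(pd)."""
    return 0.5 * rho * sd_pd**2

rho = loglinearization_constant(MEAN_PD)
correction = average_convexity_correction(rho, SD_PD)
share_of_mean = correction / MEAN_PD   # correction relative to the average level of pd

print(rho, correction, share_of_mean)
```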
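The convexity correction can sometimes be large, however. We have
$$ pd_t = \frac{k}{1-\rho} + \sum_{i=0}^{\infty}\rho^i E_t\left(g_{t+1+i} - r_{t+1+i}\right) + \frac{\rho(1-\rho)}{2}\sum_{i=0}^{\infty}\rho^i E_t\left[\left(pd_{t+1+i} - \overline{pd}\right)^2\right], $$
and the final term may be quantitatively important if the valuation ratio is far from its mean and persistent, so that it is expected to remain far from its mean for a significant length of time.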
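One way to gauge the size of this final term is to assume that $pd_t$ follows an AR(1) with autocorrelation $\phi$ and innovation volatility $\sigma$, in which case the discounted sum of expected squared deviations has a closed form. The sketch below implements that closed form, which is our own derivation under the AR(1) assumption rather than a formula quoted from the text, using the CRSP 1947-2019 values $\rho = 0.969$, $\phi = 0.923$, $\sigma = 0.167$ that appear in this section. The helper `per_year_equivalent` (a hypothetical name) converts a discounted total into a constant annual amount over a given horizon.

```python
def ar1_convexity_term(rho, phi, sigma, deviation_in_sd):
    """Second-order convexity term when pd_t follows an AR(1).

    Closed form of (rho (1 - rho) / 2) * sum_i rho^i E_t[(pd_{t+1+i} - mean)^2],
    where pd_t currently sits `deviation_in_sd` standard deviations from its mean.
    """
    var_pd = sigma**2 / (1.0 - phi**2)        # unconditional variance of pd_t
    dev_sq = deviation_in_sd**2 * var_pd      # (pd_t - mean)^2
    unconditional = 0.5 * rho * var_pd        # value of the term on average
    persistence = 0.5 * rho * (1.0 - rho) * phi**2 / (1.0 - rho * phi**2)
    return unconditional + persistence * (dev_sq - var_pd)

def per_year_equivalent(total, rho, years):
    """Constant annual amount over `years` years whose rho-discounted sum equals `total`."""
    annuity = (1.0 - rho**years) / (1.0 - rho)
    return total / annuity

peak = ar1_convexity_term(rho=0.969, phi=0.923, sigma=0.167, deviation_in_sd=2.2)
at_mean = ar1_convexity_term(rho=0.969, phi=0.923, sigma=0.167, deviation_in_sd=0.0)

print(peak, at_mean)
for horizon in (1, 5, 20):
    print(horizon, per_year_equivalent(peak, 0.969, horizon))
```

With a deviation of 2.2 standard deviations the term comes out close to the 0.145 reported for the late-1990s peak, and it is smaller than the unconditional average when $pd_t$ sits at its mean.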
For the sake of argument, suppose the log price-dividend ratio follows an AR(1), $pd_{t+1} - \overline{pd} = \phi(pd_t - \overline{pd}) + \varepsilon_{t+1}$, where $\operatorname{var}_t \varepsilon_{t+1} = \sigma^2$ so that $\operatorname{var} pd_t = \sigma^2/(1-\phi^2)$; and set $\sigma = 0.167$ and $\phi = 0.923$ to match the sample standard deviation and autocorrelation in CRSP data from 1947-2019. The above expression becomes
$$ pd_t = \frac{k}{1-\rho} + \sum_{i=0}^{\infty}\rho^i E_t\left(g_{t+1+i} - r_{t+1+i}\right) + \frac{\rho}{2}\operatorname{var} pd_t + \frac{\rho(1-\rho)\phi^2}{2(1-\rho\phi^2)}\left[\left(pd_t - \overline{pd}\right)^2 - \operatorname{var} pd_t\right]. $$
At its peak during the boom of the late 1990s, $pd_t$ was 2.2 standard deviations above its mean. The convexity term then equals 0.145: this is the amount by which a researcher using the Campbell-Shiller approximation would overstate $\sum_{i=0}^{\infty}\rho^i E_t(g_{t+1+i} - r_{t+1+i})$. With $\rho = 0.969$, this is equivalent to overstating $E_t(g_{t+1+i} - r_{t+1+i})$ by 14.5 percentage points for one year, 3.1 percentage points for five years, or 1.0 percentage points for 20 years.$^5$
The Campbell-Shiller approximation does not apply if $dp_t$ follows a random walk (i.e., $E_t\,dp_{t+1} = dp_t$). But in that case we can linearize (7) around the conditional mean $E_t\,dp_{t+1}$ to find$^6$
$$ E_t(r_{t+1} - g_{t+1}) = \log\left(1 + e^{dp_t}\right) = \log\left(1 + \frac{D_t}{P_t}\right). \tag{12} $$
Motivated by this fact,$^7$ we define $y_t = \log(1 + D_t/P_t)$. An appealing property of this definition, and one that $dp_t$ does not possess, is that $y_t = \log(1 + D_t/P_t) \approx D_t/P_t$. We can then rewrite the definition of the log return (7) as the (exact) relationship
$$ r_{t+1} - g_{t+1} = y_t + \log\left(1 - e^{-y_t}\right) - \log\left(1 - e^{-y_{t+1}}\right). \tag{13} $$
In these terms, equation (12) states that
$$ E_t(r_{t+1} - g_{t+1}) = y_t, \tag{14} $$
which is valid, as a first-order approximation, if $dp_t$ (or $y_t$) follows a random walk. Alternatively, if $y_t$ is stationary (as is almost always assumed in the literature) we have the following result. We write unconditional means as $\bar y = E\,y_t$, $\bar r = E\,r_t$ and $\bar g = E\,g_t$.
Result 2 (A variant of the Gordon growth model). Suppose $y_t$ is stationary. Then we have the loglinearization
$$ y_t = (1-\rho)\sum_{i=0}^{\infty}\rho^i\left(r_{t+1+i} - g_{t+1+i}\right), \tag{15} $$
where$^8$ $\rho = e^{-\bar y}$. As there is no constant in (15), and as $(1-\rho)\sum_{i=0}^{\infty}\rho^i = 1$, this is a variant of the Gordon growth model: $y_t$ is a weighted average of future $r - g$.
To second order, we have the approximation
$$ y_t = (1-\rho)\sum_{i=0}^{\infty}\rho^i\left(r_{t+1+i} - g_{t+1+i}\right) - \frac{\rho}{2(1-\rho)}\sum_{i=0}^{\infty}\rho^i\left[\left(y_{t+1+i} - \bar y\right)^2 - \left(y_{t+i} - \bar y\right)^2\right]. \tag{16} $$
We also have the exact relationship
$$ \bar y = \bar r - \bar g, \tag{17} $$
which does not rely on any approximation.
$^5$ The numbers are more dramatic if we use the long sample from 1871-2015 available on Robert Shiller's website. We find $\rho = 0.960$, $\sigma = 0.136$, and $\phi = 0.942$ in the long sample, so that the convexity correction is 0.0596 when $pd_t$ is at its mean, and 0.253 at the peak (which was 3.2 standard deviations above the mean). This last number corresponds to overstating $E_t(g_{t+1+i} - r_{t+1+i})$ by 25.3 percentage points for one year, 5.5 percentage points for five years, 1.8 percentage points for 20 years, or 1.0 percentage points for ever.
$^6$ Campbell (2008, 2018) derives the same result via a different route, but makes further assumptions (that the driving shocks are homoskedastic and conditionally Normal) that we do not require.
$^7$ Further motivation is provided in Martin (2013), where it is shown that this measure of dividend yield emerges naturally in i.i.d. models with power utility or Epstein-Zin preferences.
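Proof. Using Taylor's theorem to second order in equation (13), we have the second-order approximation
$$ r_{t+1} - g_{t+1} = y_t + \frac{\rho}{1-\rho}\left(y_t - y_{t+1}\right) - \frac{\rho}{2(1-\rho)^2}\left[\left(y_t - \bar y\right)^2 - \left(y_{t+1} - \bar y\right)^2\right], $$
which can be rewritten
$$ y_t = (1-\rho)\left(r_{t+1} - g_{t+1}\right) + \rho\,y_{t+1} - \frac{\rho}{2(1-\rho)}\left[\left(y_{t+1} - \bar y\right)^2 - \left(y_t - \bar y\right)^2\right] $$
and then solved forward, giving (15) and (16). Equation (17) follows by taking expectations of the identity (13) and noting that $E\log(1 - e^{-y_t}) = E\log(1 - e^{-y_{t+1}})$ by stationarity of $y_t$.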
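The exact relationship between $r_{t+1} - g_{t+1}$ and the yield measure can be checked numerically. The sketch below assumes the identity takes the form $r_{t+1} - g_{t+1} = y_t + \log(1 - e^{-y_t}) - \log(1 - e^{-y_{t+1}})$, which follows from the definitions $y_t = \log(1 + D_t/P_t)$ and $R_{t+1} = (P_{t+1} + D_{t+1})/P_t$. The simulated price and dividend paths are hypothetical and serve only to exercise the algebra.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical price and dividend paths (illustrative only, not data from the paper).
T = 50
dividends = 2.0 * np.exp(np.cumsum(rng.normal(0.02, 0.05, T)))
prices = dividends * np.exp(rng.normal(3.4, 0.3, T))   # price-dividend ratio near e^{3.4}

y = np.log1p(dividends / prices)                                   # y_t = log(1 + D_t / P_t)
log_return = np.log((prices[1:] + dividends[1:]) / prices[:-1])    # r_{t+1}
log_div_growth = np.log(dividends[1:] / dividends[:-1])            # g_{t+1}

def h(y):
    """h(y) = log(1 - e^{-y})."""
    return np.log(-np.expm1(-y))

lhs = log_return - log_div_growth
rhs = y[:-1] + h(y[:-1]) - h(y[1:])
max_identity_error = np.max(np.abs(lhs - rhs))

# y_t is close to the dividend yield D_t / P_t itself when the yield is small.
max_yield_gap = np.max(np.abs(y - dividends / prices))

print(max_identity_error, max_yield_gap)
```

The identity holds to floating-point precision on every date, whatever the paths look like, which is the sense in which the relationship is exact rather than approximate.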
We note in passing that equation (17) implies that the inequality $\bar r > \bar g$, which is discussed extensively by Piketty (2014), holds in any model in which $\bar y > 0$. Piketty (2015) writes that "the inequality r > g holds true in the steady-state equilibrium of the most common economic models, including representative-agent models where each individual owns an equal share of the capital stock." Our result shows that the inequality applies much more generally. It does not rely on equilibrium logic and is not in itself particularly interesting or significant.
Given our focus on bubbles, we are particularly interested in the accuracy of these loglinearizations$^9$ at times when valuation ratios are unusually high or, equivalently, when $dp_t$ and $y_t$ are unusually low. This motivates the following definition and result.
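Definition 1. We say that $y_t$ is far from its mean (at time $t$) if
$$ \left(y_t - \bar y\right)^2 \ge (1-\rho)\sum_{i=0}^{\infty}\rho^i E_t\left[\left(y_{t+1+i} - \bar y\right)^2\right]. \tag{18} $$
Example. If $y_t$ follows an AR(1), then a direct calculation shows that $y_t$ is far from its mean if and only if it is at least one standard deviation from its mean.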
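Result 3 (Signing the approximation errors). We can sign the approximation error in the Campbell-Shiller loglinearization (8):
$$ pd_t \ge \frac{k}{1-\rho} + \sum_{i=0}^{\infty}\rho^i\left(g_{t+1+i} - r_{t+1+i}\right). \tag{19} $$
The first-order approximation (15) is exact on average. That is,
$$ E\,y_t = (1-\rho)\sum_{i=0}^{\infty}\rho^i E\left(r_{t+1+i} - g_{t+1+i}\right) \tag{20} $$
holds exactly, without any approximation. But if $y_t$ is far from its mean then (up to a second-order approximation)
$$ y_t \ge (1-\rho)\sum_{i=0}^{\infty}\rho^i E_t\left(r_{t+1+i} - g_{t+1+i}\right). \tag{21} $$
Proof. The inequality (19) follows immediately from (9) and equation (20) follows directly from equation (17). To establish the inequality (21), rewrite
$$ \sum_{i=0}^{\infty}\rho^i\left[\left(y_{t+1+i} - \bar y\right)^2 - \left(y_{t+i} - \bar y\right)^2\right] = (1-\rho)\sum_{i=0}^{\infty}\rho^i\left(y_{t+1+i} - \bar y\right)^2 - \left(y_t - \bar y\right)^2. \tag{22} $$
The inequality then follows from (16), (18), and (22).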
Dividend yields, whether measured by $dp_t$ or by $y_t$, were unusually low around the turn of the millennium, indicating some combination of low future returns and high future dividend growth. Result 3 shows that an econometrician who uses the Campbell-Shiller approximation (8) at such a time (that is, who treats the inequality (19) as an equality) will overstate how low future returns, or how high future dividend growth, must be, and therefore may be too quick to conclude that the market is "bubbly." In contrast, an econometrician who uses the approximation (15) will understate how low future returns, or how high future dividend growth, must be. Thus $y_t$ is a conservative diagnostic for bubbles.
To place more structure on the relationship between valuation ratios and $r$ and $g$, we will make an assumption about the evolution of $dp_t$ and $y_t$ over time.
For now we will rely on an AR(1) assumption to keep things simple; in Section 4.1, we report the corresponding results assuming AR(2) or AR(3) processes.
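The Campbell-Shiller approximation over one period states that $r_{t+1} - g_{t+1} = k + dp_t - \rho\,dp_{t+1}$. If $dp_t$ follows an AR(1) with autocorrelation $\phi$ then $E_t\,dp_{t+1} - \overline{dp} = \phi\left(dp_t - \overline{dp}\right)$, so
$$ E_t(r_{t+1} - g_{t+1}) = c + (1 - \rho\phi)\,dp_t, \tag{23} $$
where we have absorbed constant terms into $c$.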
Conversely, the first-order approximation underlying Result 2 implies that
$$ E_t(r_{t+1} - g_{t+1}) = \frac{1}{1-\rho}\left(y_t - \rho\,E_t y_{t+1}\right). \tag{24} $$
If $y_t$ follows an AR(1) with autocorrelation $\phi_y$ then this reduces to
$$ E_t(r_{t+1} - g_{t+1}) = c + \frac{1-\rho\phi_y}{1-\rho}\,y_t, \tag{25} $$
where again we absorb constants into the intercept $c$. In view of (17), this can also be written without an intercept as
$$ E_t(r_{t+1} - g_{t+1}) - (\bar r - \bar g) = \frac{1-\rho\phi_y}{1-\rho}\left(y_t - \bar y\right), $$
so that the deviation of $y_t$ from its long-run mean is proportional to the deviation of conditionally expected $r_{t+1} - g_{t+1}$ from its long-run mean. A further advantage of $y_t$ over $dp_t$ is that the expression (25) is also meaningful if $y_t$ follows a random walk: in this case, the coefficient on $y_t$ equals one and the intercept is zero, by equation (14).
Equations (23) and (25) motivate regressions of realized $r_{t+1} - g_{t+1}$ onto $dp_t$ and a constant, or onto $y_t$ and a constant. The results are shown in Table 1, where we also report the results of regressing $r_{t+1}$ and $-g_{t+1}$ separately onto $y_t$ and onto $dp_t$. We use end-of-year observations of the price level and accumulated dividends.
[Table 1: predictive regressions of left-hand-side variables dated $t+1$ on the right-hand-side variables $y_t$ and $dp_t$, with Hansen-Hodrick standard errors shown in brackets.]
(Under the AR(1) assumption, we could also use (23) or (25) as estimates of $E_t(r_{t+1} - g_{t+1})$. This approach turns out to give very similar results, as we show in Table 9 of the appendix.) The variables $y_t$ and $dp_t$ have similar predictive performance and, consistent with the prior literature, we find, in the post-1947 sample, that valuation ratios help to forecast returns but have limited forecasting power for dividend growth.
Table 2 reports results using cash reinvested dividends in the full CRSP post-1926 sample. Tables 3 and 4 report results using market reinvested dividends in the post-1926 period. Tables 5-7 use the price and dividend data of Goyal and Welch (2008) (updated to 2018 and taken from Amit Goyal's webpage): this gives us a longer sample, as it incorporates Robert Shiller's data which goes back as far as 1871. The predictability of $r$ relative to $g$ is to some extent a feature of the post-war period. In the long sample, returns are substantially less predictable and dividends substantially more predictable, perhaps because of the post-war tendency of corporations to smooth dividends (Lintner, 1956). Encouragingly, though, we find that the predictive relationship between $y_t$ (or $dp_t$) and the difference $r_{t+1} - g_{t+1}$ is fairly stable across sample periods and data sources.
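As a rough guide to the magnitudes involved, the slopes implied by the AR(1) assumption can be computed directly. The sketch below assumes the slope on $dp_t$ is $1 - \rho\phi$ and the slope on $y_t$ is $(1 - \rho\phi_y)/(1 - \rho)$, which is what the one-period approximations deliver when the AR(1) forecast is substituted in. The inputs reuse the values $\rho = 0.969$ and $\phi = 0.923$ quoted earlier for $pd_t$ and apply them to $y_t$ as well, which is a simplifying assumption made only for this illustration.

```python
def campbell_shiller_slope(rho, phi):
    """Slope on dp_t in E_t(r_{t+1} - g_{t+1}) = c + (1 - rho * phi) * dp_t."""
    return 1.0 - rho * phi

def yield_slope(rho, phi_y):
    """Slope on y_t in E_t(r_{t+1} - g_{t+1}) = c + ((1 - rho * phi_y) / (1 - rho)) * y_t."""
    return (1.0 - rho * phi_y) / (1.0 - rho)

print(campbell_shiller_slope(0.969, 0.923))   # persistent dp_t: slope well below one
print(yield_slope(0.969, 0.923))              # same persistence applied to y_t
print(yield_slope(0.969, 1.0))                # random-walk limit: slope of exactly one
```

The random-walk case illustrates why the $y_t$ formulation remains meaningful when the yield is not mean-reverting: the slope tends to one rather than to zero.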

A lower bound on expected log returns
High valuation ratios are sometimes cited as direct evidence of a bubble. But valuation ratios can be high for good reasons if interest rates or rationally expected risk premia are low. In other words, if we use $y_t$ to measure $E_t(r_{t+1} - g_{t+1})$ as suggested above, we may find that $y_t$ is low simply because $E_t r_{t+1}$ is very low, which could reflect low interest rates $r_{f,t+1}$, low (log) risk premia $E_t r_{t+1} - r_{f,t+1}$, or both.
While interest rates are directly observable, risk premia are harder to measure. We start from the following identity, which generalizes an identity introduced by Martin (2017) in the case $X_{t+1} = R_{t+1}$:
$$ E_t X_{t+1} = \frac{1}{R_{f,t+1}}E^*_t\left(R_{t+1}X_{t+1}\right) - \operatorname{cov}_t\left(M_{t+1}R_{t+1}, X_{t+1}\right). $$
We have written $E^*_t$ for the time-$t$ conditional risk-neutral expectation operator, defined by the property that
$$ \frac{1}{R_{f,t+1}}E^*_t X_{t+1} = E_t\left(M_{t+1}X_{t+1}\right), $$
where $M_{t+1}$ denotes a stochastic discount factor that prices any tradable payoff $X_{t+1}$ received at time $t+1$. Assuming the absence of arbitrage, such an SDF must exist, and the identity above holds for any gross return $R_{t+1}$ such that the payoff $R_{t+1}X_{t+1}$ is tradable. Henceforth, however, $R_{t+1}$ will always denote the gross return on the market.
We are interested in expected log returns, $X_{t+1} = \log R_{t+1}$, in which case the identity becomes
$$ E_t \log R_{t+1} = \frac{1}{R_{f,t+1}}E^*_t\left(R_{t+1}\log R_{t+1}\right) - \operatorname{cov}_t\left(M_{t+1}R_{t+1}, \log R_{t+1}\right). \tag{26} $$
The first of the two terms on the right-hand side, as a risk-neutral expectation, is directly observable from asset prices, as it represents the price of a contract that pays $R_{t+1}\log R_{t+1}$ at time $t+1$. (Neuberger (2012) has studied this contract in a different context.) The second term can be controlled: we will argue below that it is reasonable to impose an assumption that the covariance is negative. Thus (26) implies a lower bound on expected log returns in terms of a quantity that is directly observable from asset prices.
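The decomposition of the expected log return into a risk-neutral term and a covariance term can be verified in a small discrete-state example. Everything in the sketch below is hypothetical: the three states, their probabilities, and an SDF proportional to $R^{-\gamma}$ with $\gamma = 3$, scaled so that the market is correctly priced. It is meant only to show the mechanics, not to represent the authors' calibration.

```python
import numpy as np

# Hypothetical three-state economy (illustrative numbers only).
probs = np.array([0.2, 0.5, 0.3])     # true probabilities
R = np.array([0.80, 1.05, 1.30])      # gross market return in each state
gamma = 3.0                           # hypothetical relative risk aversion

# SDF proportional to R^{-gamma}, scaled so that E[M R] = 1.
M = R**(-gamma) / np.sum(probs * R**(1.0 - gamma))
R_f = 1.0 / np.sum(probs * M)         # gross riskless rate, 1 / E[M]
q = probs * M * R_f                   # risk-neutral probabilities

X = np.log(R)                         # payoff of interest: the log return
expected_log_return = np.sum(probs * X)
risk_neutral_term = np.sum(q * R * X) / R_f      # (1 / R_f) E*[R log R]
covariance = np.sum(probs * M * R * X) - np.sum(probs * M * R) * np.sum(probs * X)

print(expected_log_return, risk_neutral_term - covariance, covariance)
```

Here the covariance is negative, so the observable risk-neutral term is a lower bound on the expected log return. Setting `gamma = 1.0` makes $M_{t+1}R_{t+1}$ constant, so the covariance vanishes and the bound holds with equality.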
To make further progress, we make two assumptions throughout the paper. As we will see below, we will use option prices to bound the first term on the right-hand side of the identity (26). Our first assumption addresses the minor$^{12}$ technical issue that we observe options on the ex-dividend value of the index, $P_{t+1}$, rather than on $P_{t+1} + D_{t+1}$.
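Assumption 1. If dispersion is measured using the function $f(x) = x\log x$, which is a convex function, then the dispersion of $R_{t+1}$ is at least as large as that of $P_{t+1}/P_t$:
$$ E^*_t f(R_{t+1}) - f\left(E^*_t R_{t+1}\right) \ge E^*_t f\left(\frac{P_{t+1}}{P_t}\right) - f\left(E^*_t \frac{P_{t+1}}{P_t}\right). \tag{27} $$
This condition is very mild. Expanding $f(x) = x\log x$ as a Taylor series to second order around $x = 1$, $f(x) \approx (x^2 - 1)/2$. Thus, to second order, Assumption 1 is equivalent to $\operatorname{var}^*_t R_{t+1} \ge \operatorname{var}^*_t\left(P_{t+1}/P_t\right)$.
Assumption 2. The modified negative correlation condition (mNCC) holds:
$$ \operatorname{cov}_t\left(M_{t+1}R_{t+1}, \log R_{t+1}\right) \le 0. \tag{28} $$
Martin (2017) imposed the closely related negative correlation condition (NCC) that $\operatorname{cov}_t(M_{t+1}R_{t+1}, R_{t+1}) \le 0$. The two conditions are plausible for similar reasons: in any reasonable model, $M_{t+1}$ will be negatively correlated with the return on the market, $R_{t+1}$, and we know from the bound of Hansen and Jagannathan (1991), coupled with the empirical fact that high Sharpe ratios are available, that $M_{t+1}$ is highly volatile.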
In fact, the two conditions are equivalent in the lognormal case. Suppose that the SDF $M_{t+1}$ and return $R_{t+1}$ are conditionally jointly lognormal and write $r_{f,t+1} = \log R_{f,t+1}$, $\mu_t = \log E_t R_{t+1}$, and $\sigma_t^2 = \operatorname{var}_t \log R_{t+1}$. Then the mNCC and NCC are both equivalent to the assumption that the conditional Sharpe ratio of the asset, $\lambda_t \equiv (\mu_t - r_{f,t+1})/\sigma_t$, exceeds its conditional volatility, $\sigma_t$.$^{13}$ The Sharpe ratio of the market is typically thought of as being on the order of 30-50%, while the volatility of the market is on the order of 16-20%. Thus the mNCC holds in the calibrated models of Campbell and Cochrane (1999), Bansal and Yaron (2004), Bansal et al. (2014) and Campbell et al. (2016), among many others. But Martin (2017) argues that option prices are inconsistent with the lognormality assumption. This motivates the following result, which provides a sufficient condition for the mNCC to hold without requiring lognormality.
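The lognormal criterion is simple enough to encode directly. In the sketch below, the closed form $\operatorname{cov}_t(M_{t+1}R_{t+1}, \log R_{t+1}) = \sigma_t^2 - (\mu_t - r_{f,t+1})$ is our own derivation under joint lognormality (using Stein's lemma and the pricing condition $E_t M_{t+1}R_{t+1} = 1$), with $\mu_t$ interpreted as $\log E_t R_{t+1}$. The numerical inputs are hypothetical values inside and outside the ranges mentioned in the text.

```python
def mncc_covariance_lognormal(mu, r_f, sigma):
    """cov_t(M R, log R) under joint lognormality (assumed closed form).

    mu = log E_t R, r_f = log R_f, sigma = conditional volatility of log R.
    """
    return sigma**2 - (mu - r_f)

def mncc_holds_lognormal(mu, r_f, sigma):
    """mNCC holds if and only if the Sharpe ratio (mu - r_f) / sigma is at least sigma."""
    sharpe_ratio = (mu - r_f) / sigma
    return sharpe_ratio >= sigma

# Hypothetical inputs: Sharpe ratio of 40% with 18% volatility, then 10% with 20% volatility.
typical = mncc_holds_lognormal(mu=0.02 + 0.40 * 0.18, r_f=0.02, sigma=0.18)
low_sharpe = mncc_holds_lognormal(mu=0.02 + 0.10 * 0.20, r_f=0.02, sigma=0.20)
print(typical, low_sharpe)
```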

Result 4. Suppose that an investor's SDF takes the form
$$ M_{t+1} = \beta\,\frac{V_W(W_{t+1}, z_{t+1})}{V_W(W_t, z_t)}, $$
where $V_W$ is the investor's marginal value of wealth, and $z_t$ is a vector of state variables, with signs chosen so that $V_W$ is weakly decreasing in each (just as it is decreasing in $W_{t+1}$). We allow time $t+1$ wealth, $W_{t+1}$, to be invested in the market and in some other asset or portfolio of assets with gross return $\tilde R_{t+1}$:
$$ W_{t+1} = (W_t - C_t)\left[\alpha_t R_{t+1} + (1-\alpha_t)\tilde R_{t+1}\right], \qquad W_{m,t+1} = \alpha_t (W_t - C_t) R_{t+1}. $$
If (i) $R_{t+1}$, $\tilde R_{t+1}$ and the elements of $z_{t+1}$ are associated random variables,$^{14}$ (ii) the investor ensures that the share of wealth in the market, $W_{m,t+1}/W_{t+1}$, is at least $\theta \in (0, 1]$, some fixed constant, and (iii) the investor's relative risk aversion $-W V_{WW}/V_W$ (which need not be constant) is at least $1/\theta$, then the mNCC holds.
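That is, we must prove that the covariance of two functions of $R_{t+1}$, $\tilde R_{t+1}$, and $z_{t+1}$ is positive.
The two functions are
$$ f(R_{t+1}, \tilde R_{t+1}, z_{t+1}) = -R_{t+1}\,V_W\!\left((W_t - C_t)\left[\alpha_t R_{t+1} + (1-\alpha_t)\tilde R_{t+1}\right], z_{t+1}\right) \quad\text{and}\quad g(R_{t+1}, \tilde R_{t+1}, z_{t+1}) = \log R_{t+1}. $$
(As the covariance is conditional on time $t$ information, we can treat $\alpha_t$ and $W_t - C_t$ as known constants.) As the random variables are associated, the result follows if $f$ and $g$ are each weakly increasing functions of their arguments. The assumptions above ensure that this is the case. For example, differentiating $f$ with respect to $R_{t+1}$, we need
$$ -V_W - \alpha_t (W_t - C_t) R_{t+1} V_{WW} \ge 0, \quad\text{that is,}\quad -\frac{W_{m,t+1}V_{WW}}{V_W} \ge 1. $$
But this holds, by assumptions (ii) and (iii):
$$ -\frac{W_{m,t+1}V_{WW}}{V_W} = \frac{W_{m,t+1}}{W_{t+1}}\left(-\frac{W_{t+1}V_{WW}}{V_W}\right) \ge \theta\cdot\frac{1}{\theta} = 1. $$
It is immediate that $f$ and $g$ are weakly increasing in their other arguments.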
Result 4 provides a flexible set of conditions under which the mNCC holds.
Example 1. Suppose that there is a representative investor who maximizes utility over next-period wealth and who chooses to invest her wealth fully in the stock market. Then by Result 4, the mNCC holds so long as her relative risk aversion (which need not be constant) is at least one at all levels of wealth. Furthermore, if the representative investor has log utility then the mNCC is tight (that is, the inequality holds with equality) because $M_{t+1}R_{t+1} = 1$ is a constant.
Example 2. Alternatively, if the investor keeps at least (say) a third of her wealth in the market, then her relative risk aversion must be at least three. We also require that the market and non-market returns are associated; in the lognormal case, this holds if they are nonnegatively correlated.$^{15}$
These examples make no assumption about the beliefs of other investors in the economy. We can therefore think from the perspective of a rational investor surrounded by other investors, some of whom are potentially irrational. We think that the assumption that the investor chooses to invest fully in the stock market represents a natural benchmark in such cases; but the possibility arises that the lower bound might be violated, say in the late 1990s, because no rational investor would want to hold the market. We discuss this possibility after introducing the sentiment measure in Section 4. We also provide further examples of situations in which the mNCC holds in Appendix E.
We can now state our lower bound on expected log returns.
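Result 5. Suppose Assumptions 1 and 2 hold. Write $\mathrm{call}_t(K)$ and $\mathrm{put}_t(K)$ for the time $t$ prices of call and put options on $P_{t+1}$ with strike $K$, and $F_t$ for the time $t$ forward price of the index for settlement at time $t+1$. Then we have
$$ E_t r_{t+1} - r_{f,t+1} \ge \frac{1}{P_t}\left\{\int_0^{F_t}\frac{\mathrm{put}_t(K)}{K}\,dK + \int_{F_t}^{\infty}\frac{\mathrm{call}_t(K)}{K}\,dK\right\}. \tag{29} $$
Proof. As $E^*_t R_{t+1} = R_{f,t+1}$ and $E^*_t P_{t+1} = F_t$, the inequality (27) can be rearranged as
$$ \frac{1}{R_{f,t+1}}E^*_t\left(R_{t+1}\log R_{t+1}\right) - r_{f,t+1} \ge \frac{1}{R_{f,t+1}}\left[E^*_t\left(\frac{P_{t+1}}{P_t}\log\frac{P_{t+1}}{P_t}\right) - \frac{F_t}{P_t}\log\frac{F_t}{P_t}\right]. \tag{30} $$
The right-hand side of this inequality can be measured directly from option prices using a result of Breeden and Litzenberger (1978) that can be rewritten, following Carr and Madan (2001), to give, for any sufficiently well behaved function $g(\cdot)$,
$$ \frac{1}{R_{f,t+1}}\left[E^*_t\,g(P_{t+1}) - g(F_t)\right] = \int_0^{F_t} g''(K)\,\mathrm{put}_t(K)\,dK + \int_{F_t}^{\infty} g''(K)\,\mathrm{call}_t(K)\,dK. \tag{31} $$
The result follows on combining the identity (26), the inequalities (28) and (30), and equation (31).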
We refer to the right-hand side of equation (29) as LVIX because it is reminiscent of the definition of the VIX index which, in our notation, is
$$ \mathrm{VIX}_t^2 = 2R_{f,t+1}\left\{\int_0^{F_t}\frac{\mathrm{put}_t(K)}{K^2}\,dK + \int_{F_t}^{\infty}\frac{\mathrm{call}_t(K)}{K^2}\,dK\right\}, $$
and of the SVIX index introduced by Martin (2017),
$$ \mathrm{SVIX}_t^2 = \frac{2}{R_{f,t+1}P_t^2}\left\{\int_0^{F_t}\mathrm{put}_t(K)\,dK + \int_{F_t}^{\infty}\mathrm{call}_t(K)\,dK\right\}. $$
We do not annualize our definition (29), so to avoid unnecessary clutter we have also not annualized the definitions of VIX and SVIX above. We will typically choose the period length from $t$ to $t+1$ to be 12 months. The forecasting horizon dictates the maturity of the options, so we use options expiring in 12 months to measure expectations of 12-month log returns.
VIX, SVIX, and LVIX place differing weights on option prices. VIX has a weighting function $1/K^2$ on the prices of options with strike $K$; LVIX has weighting function $1/K$; and SVIX has a constant weighting function. In this sense we can think of LVIX as lying half way between VIX and SVIX. (We could also introduce a factor of two into the definition of LVIX to make the indices look even more similar to one another, but have chosen not to.)
We calculate LVIX using end-of-month interest rates and S&P 500 index option (mid) prices from OptionMetrics. In practice, we do not observe option prices at all strikes between zero and infinity, so we have to truncate the integral on the right-hand side of (29) (as does the CBOE in its calculation of the VIX index). In doing so, we understate the idealized value of the integral. That is, our lower bound would be even higher if given perfect data: it is therefore conservative.
Figure 1 plots $\mathrm{LVIX}_t$, at the end of each month, over our sample period from January 1996 to June 2019. Under our maintained assumptions, the large spikes visible during 2008-9, for example, indicate that expected excess log returns were very high in the depths of the subprime crisis, consistent with the results of Martin (2017).
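To illustrate how the three indices are assembled from option prices, the sketch below integrates synthetic option prices on a truncated strike grid with weights $1/K$, $1/K^2$, and 1. It assumes that LVIX is $1/P_t$ times the $1/K$-weighted integral, with the VIX and SVIX scalings as in Martin (2017) without annualization. The option prices are generated from the Black-Scholes formula with hypothetical parameters and no dividends; this is an assumption made purely so that the numerical integration can be checked against known values, and it is not how the paper obtains its data (which come from OptionMetrics).

```python
import math
import numpy as np

norm_cdf = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))

def black_scholes_prices(K, F, R_f, sigma):
    """Time-t call and put prices at strikes K for a one-period horizon and forward price F."""
    d1 = (np.log(F / K) + 0.5 * sigma**2) / sigma
    d2 = d1 - sigma
    call = (F * norm_cdf(d1) - K * norm_cdf(d2)) / R_f
    put = (K * norm_cdf(-d2) - F * norm_cdf(-d1)) / R_f
    return call, put

def trapezoid(values, grid):
    """Trapezoidal rule on a (possibly uneven) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def option_integral(weight, F, R_f, sigma, K_min, K_max, n=5001):
    """Integral of weight(K) * put over [K_min, F] plus weight(K) * call over [F, K_max]."""
    K_low = np.linspace(K_min, F, n)
    K_high = np.linspace(F, K_max, n)
    _, put = black_scholes_prices(K_low, F, R_f, sigma)
    call, _ = black_scholes_prices(K_high, F, R_f, sigma)
    return trapezoid(weight(K_low) * put, K_low) + trapezoid(weight(K_high) * call, K_high)

# Hypothetical inputs: index level, gross riskless rate, volatility, and strike range.
P, R_f, sigma = 100.0, 1.02, 0.20
F = P * R_f                      # forward price, assuming no dividends
grid = (F, R_f, sigma, 5.0, 600.0)

lvix = option_integral(lambda K: 1.0 / K, *grid) / P
vix_sq = 2.0 * R_f * option_integral(lambda K: 1.0 / K**2, *grid)
svix_sq = 2.0 / (R_f * P**2) * option_integral(lambda K: np.ones_like(K), *grid)

print(lvix, vix_sq, svix_sq)
```

Under these lognormal assumptions the idealized values are $\sigma^2/2$ for LVIX, $\sigma^2$ for $\mathrm{VIX}^2$, and $e^{\sigma^2} - 1$ for $\mathrm{SVIX}^2$, which provides a check on the integration. With real data the strike range is narrower and the grid is coarser, so the truncation discussed in the text matters more than it does here.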
Of greater relevance for this paper, expected excess log returns were also relatively high around the turn of the millennium, despite the high valuation ratios that prevailed at the time.
One might worry that option markets were illiquid, or segmented from the broader stock market, during the late 1990s. Lamont and Thaler (2003) present evidence that this was indeed the case for certain individual stocks (most famously for options on Palm), and ascribe the anomalous behavior of prices of these stocks, and of options on the stocks, to the difficulty or impossibility of shorting the stocks. As short-selling the broader stock market was possible at low cost throughout this period (for example via the futures market) we do not expect this to be an issue for our approach. But to address the more general concern that option markets may have exhibited extreme bid-ask spreads at the time, we recompute the LVIX index using bid prices as opposed to mid prices. (We use bid rather than ask prices to be conservative, as this will drive our sentiment indicator down.) As shown in Figure 11 of the appendix, doing so has very little effect on our results.

Empirical evidence on the modified NCC
We motivated the inequality of Result 5 via a theoretical argument that the mNCC should hold. We can also assess the inequality empirically by examining the realized forecast errors $r_{t+1} - r_{f,t+1} - \mathrm{LVIX}_t$. To do so, we carry out a one-sided t-test of the hypothesis that the inequality (29) fails. Using a block bootstrap,$^{16}$ we find a p-value of 0.097. Thus despite our relatively short sample period (which is imposed on us by the availability of option price data) we can reject the hypothesis with moderate confidence. This supports our approach.
More optimistically, it is natural to wonder whether the inequality (29) might approximately hold with equality (though we emphasize that this does not need to be the case for our approach to make sense). For this to be the case, we would need both (27) and (28) to hold with approximate equality. As the conditional volatility of dividends is substantially lower than that of prices, it is reasonable to think that this is indeed the case for (27), and as noted in footnote 12, much of the literature implicitly makes that assumption. Meanwhile the mNCC (28) would hold with equality if (but not only if) one thinks from the perspective of an investor with log utility who chooses to hold the market, as is clear from Example 1 above. The perspective of such an investor has been shown to provide a useful benchmark for forecasting returns on the stock market (Martin, 2017), on individual stocks (Martin and Wagner, 2019), and on currencies (Kremens and Martin, 2019).
Table 8 in the Appendix reports the results of running the regression
$$ r_{t+1} - r_{f,t+1} = \alpha + \beta\,\mathrm{LVIX}_t + \varepsilon_{t+1} $$
at horizons of 3, 6, 9, and 12 months. Returns are computed by compounding the CRSP monthly gross return of the S&P 500. We report Hansen-Hodrick standard errors to allow for heteroskedasticity and for autocorrelation that arises due to overlapping observations. If the inequality (29) holds with equality, we should find $\alpha = 0$ and $\beta = 1$. We do not reject this hypothesis at any horizon; and at the six- and nine-month horizons we can reject the hypothesis that $\beta = 0$ at conventional significance levels.

A sentiment indicator
We now adopt the perspective of a hypothetical investor whose expectations and stochastic discount factor satisfy the mNCC so that the lower bound (29) of Section 3 applies. We will also assume that this hypothetical investor's beliefs are consistent with the predictive relationship (25) between valuation ratios, returns, and dividend growth, as studied in Section 2. We do so to force the investor's beliefs to be consistent with the historical evidence, in order to prevent him or her from "explaining" asset prices simply by concluding that "this time is different" (in the words of Reinhart and Rogoff, 2009). We can derive a lower bound on such an investor's subjective expectations about fundamentals by subtracting $E_t(r_{t+1} - g_{t+1})$, as revealed by valuation ratios, from $E_t r_{t+1}$, as revealed by interest rates and option prices:
$$ E_t g_{t+1} = E_t r_{t+1} - E_t(r_{t+1} - g_{t+1}) \ge \mathrm{LVIX}_t + r_{f,t+1} - E_t(r_{t+1} - g_{t+1}). $$
The inequality follows (under our maintained Assumptions 1 and 2) because $E_t r_{t+1} - r_{f,t+1} \ge \mathrm{LVIX}_t$, as shown in Result 5. We use $y_t$ to measure $E_t(r_{t+1} - g_{t+1})$ via the fitted value $\hat a_0 + \hat a_1 y_t$, as in
$$ B_t = \mathrm{LVIX}_t + r_{f,t+1} - \left(\hat a_0 + \hat a_1 y_t\right). $$
We estimate the coefficients $\hat a_0$ and $\hat a_1$ using an expanding window: at time $t$ they are estimated using data from 1947 until time $t$. Thus $B_t$ is observable at time $t$.
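The construction of the indicator can be sketched as follows, assuming $B_t$ is computed as $\mathrm{LVIX}_t + r_{f,t+1}$ minus the fitted value from an expanding-window OLS regression of realized $r_{s+1} - g_{s+1}$ on a constant and $y_s$. The function names, the `min_obs` burn-in parameter, and the synthetic inputs are all hypothetical. The synthetic relationship is deliberately noise-free so that the regression recovers its coefficients exactly, which real data would not do.

```python
import numpy as np

def expanding_ols(y_lag, target, min_obs=20):
    """Expanding-window OLS of target[s] on a constant and y_lag[s].

    target[s] stands for r_{s+1} - g_{s+1}, so entry t of the output uses only
    pairs with s < t, i.e. information available at time t. Entries before
    min_obs are NaN.
    """
    n = len(target)
    a0 = np.full(n, np.nan)
    a1 = np.full(n, np.nan)
    for t in range(min_obs, n):
        X = np.column_stack([np.ones(t), y_lag[:t]])
        coef, *_ = np.linalg.lstsq(X, target[:t], rcond=None)
        a0[t], a1[t] = coef
    return a0, a1

def sentiment_indicator(lvix, r_f, y, a0_hat, a1_hat):
    """B_t = LVIX_t + r_{f,t+1} - (a0_hat + a1_hat * y_t)."""
    return lvix + r_f - (a0_hat + a1_hat * y)

# Synthetic, noise-free inputs (illustrative only).
rng = np.random.default_rng(1)
n = 60
y = 0.03 + 0.01 * rng.standard_normal(n)     # dividend yield measure y_t
r_minus_g_next = -0.05 + 3.0 * y             # realized r_{t+1} - g_{t+1}
lvix = np.full(n, 0.03)
r_f = np.full(n, 0.02)

a0_hat, a1_hat = expanding_ols(y, r_minus_g_next)
B = sentiment_indicator(lvix, r_f, y, a0_hat, a1_hat)
print(B[-5:])
```

Because the coefficient estimates dated $t$ use only pairs observed before $t$, the resulting series could in principle have been computed in real time, which is the point of the expanding window.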
As we have discussed, $B_t$ can be interpreted as a lower bound on expected dividend growth, $E_t g_{t+1}$. If $E_t g_{t+1}$ itself follows an AR(1) (as in the work of Bansal and Yaron (2004) and many others) then $B_t$ can also be interpreted, after rescaling, as a lower bound on long-run dividend expectations. For if $E_{t+1}g_{t+2} - \bar g = \phi_g\left(E_t g_{t+1} - \bar g\right) + \varepsilon_{g,t+1}$ then we can define a measure of expected long-run dividend growth, at time $t$, as
$$ (1-\rho)\sum_{i=0}^{\infty}\rho^i E_t g_{t+1+i} = \bar g + \frac{1-\rho}{1-\rho\phi_g}\left(E_t g_{t+1} - \bar g\right). $$
(We have introduced a factor $1 - \rho$ so that long-run expected dividend growth can be interpreted as a weighted average of all future periods' expected growth, as the weights $(1-\rho)\rho^i$ sum to 1.)
Figure 2a plots $B_t$ over our sample period using the full sample from 1947 to 2019 to estimate the relationship between $y_t$ (or $dp_t$) and $r_{t+1} - g_{t+1}$. We work at an annual horizon,$^{17}$ so that the value of $B_t$ at a given point in time is (subject to our maintained assumptions) a lower bound on the expected dividend growth over the subsequent year. Figure 2b shows the corresponding results using an expanding window to estimate the relationship, so that the resulting series is observable in real time. Encouragingly, the indicator behaves stably as we move from full-sample information to real-time information. Unless otherwise indicated, we will henceforth work with the series that is observable in real time.
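The AR(1) rescaling of expected dividend growth described above can be checked by comparing a closed form with brute-force summation. The closed form used below, $\bar g + \frac{1-\rho}{1-\rho\phi_g}(E_t g_{t+1} - \bar g)$, is what the AR(1) assumption implies for the weighted average with weights $(1-\rho)\rho^i$. All numerical inputs are hypothetical.

```python
def long_run_growth(expected_growth_next, mean_growth, rho, phi_g):
    """(1 - rho) * sum_i rho^i E_t g_{t+1+i} when E_t g_{t+1} follows an AR(1)."""
    scale = (1.0 - rho) / (1.0 - rho * phi_g)
    return mean_growth + scale * (expected_growth_next - mean_growth)

def long_run_growth_by_summation(expected_growth_next, mean_growth, rho, phi_g, terms=5000):
    """Same quantity, computed by truncating the infinite sum after `terms` periods."""
    total = 0.0
    for i in range(terms):
        forecast = mean_growth + phi_g**i * (expected_growth_next - mean_growth)
        total += (1.0 - rho) * rho**i * forecast
    return total

# Hypothetical inputs: one-year-ahead expected growth of 12%, long-run mean of 5%.
closed_form = long_run_growth(0.12, 0.05, rho=0.97, phi_g=0.8)
summed = long_run_growth_by_summation(0.12, 0.05, rho=0.97, phi_g=0.8)
print(closed_form, summed)
```

The rescaling shrinks deviations of near-term expected growth toward the mean, and the shrinkage is stronger the less persistent expected growth is.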
The figures also show modified indicators, $B_{dp,t}$, that use $dp_t$ rather than $y_t$ to measure $E_t(r_{t+1} - g_{t+1})$, as in (23). These have the advantage of familiarity, since $dp_t$ has been widely used in the literature, but the disadvantage that they may err on the side of signalling a bubble too soon, as shown in Result 3. Consistent with this prediction, the two series line up fairly closely, but the $B_{dp,t}$ series are less conservative, in that they suggest even higher $E_t g_{t+1}$, during the period in the late 1990s when valuation ratios were far from their mean.
Note, moreover, that net dividend growth satisfies $E_t[D_{t+1}/D_t] - 1 > E_t g_{t+1}$, because $e^{g_{t+1}} - 1 > g_{t+1}$. Thus our lower bound on expected log dividend growth implies still higher expected arithmetic dividend growth. If dividend growth were conditionally lognormal, for example, we would have $\log E_t[D_{t+1}/D_t] = E_t g_{t+1} + \frac{1}{2}\operatorname{var}_t g_{t+1}$. The variance term is small unconditionally (in our sample period, $\operatorname{var} g_{t+1} \approx 0.005$), but it is plausible that during the late 1990s there was unusually high uncertainty about log dividend growth.

Figure 3 plots the three components of the sentiment indicator $B_t$ from 1996 to 2019. LVIX and $E_t(g_{t+1} - r_{t+1})$ moved in opposite directions for most of our sample period, with high valuation ratios occurring at times of low risk premia. But all three components were above their sample means during the late 1990s.
In particular, our approach implies that the expected annual log dividend growth perceived by our hypothetical representative investor rose above 12% around the turn of the millennium, a degree of optimism that we do not think was reasonable. If we reject this conclusion, we must reject at least one of the assumptions that delivered it. The first possibility is that there is no investor whose preferences and beliefs are such that the mNCC is satisfied. In particular, this would be a violation of the equilibrium models and of the various examples discussed in Section 3.
Alternatively, if the mNCC did hold then, for the hypothetical investor to perceive high expected log returns and, simultaneously, low expected log dividend growth during the bubble period, he or she must have believed that the historical forecasting relationship between dividend yield and $E_t(r_{t+1} - g_{t+1})$ had broken down, perhaps because of a "paradigm shift" or because the predictive coefficients estimated using historical data failed to reflect the true population values. ("This time is different!") To see this, write $\widehat{E}_t(r_{t+1} - g_{t+1})$ for the regression-implied time-$t$ forecast of $r_{t+1} - g_{t+1}$, which we now allow to differ from the agent's forecast $E_t(r_{t+1} - g_{t+1})$. Then, from inequality (33), we have

$$E_t g_{t+1} \ge B_t + \widehat{E}_t(r_{t+1} - g_{t+1}) - E_t(r_{t+1} - g_{t+1}).$$

An agent who believed, in the late 1990s, that $E_t g_{t+1}$ was lower than $B_t$ must therefore have concluded that $\widehat{E}_t(r_{t+1} - g_{t+1}) < E_t(r_{t+1} - g_{t+1})$. By the loglinearization (24), this is equivalent to $E_t y_{t+1} < \widehat{E}_t y_{t+1}$. On this interpretation, our hypothetical investor's beliefs were consistent only because he or she expected $y_{t+1}$ to remain, in the short run, lower (and valuations higher) than suggested by the historical evidence. We discuss this possibility further in Section 4.2.1, below.

Alternative stochastic processes for y t
We have modelled $y_t$ as following an AR(1) to avoid overfitting. Aside from the obvious advantages of parsimony, the partial autocorrelations of $y_t$, shown in Figure 14 of Appendix C, support this choice: the partial autocorrelations of $y_t$ at lags greater than one are close to zero. The question of how to model $y_t$ is not central to the point of this paper, however, so we also consider the possibility that $y_t$ follows an AR(2) or AR(3). If $y_t$ follows an AR(2) process, then from the linearization (24) we have

$$r_{t+1} - g_{t+1} = \alpha + \beta y_t + \gamma y_{t-1} + \varepsilon_{t+1},$$

while if $y_t$ follows an AR(3) process, then

$$r_{t+1} - g_{t+1} = \alpha + \beta y_t + \gamma y_{t-1} + \delta y_{t-2} + \varepsilon_{t+1}.$$
The results of these regressions are reported in Table 10 of Appendix C.
The corresponding lower bounds on $E_t g_{t+1}$ are very similar to our baseline measure during the late 1990s, but they are lower during the crisis of 2008-9 and higher in its aftermath. Once again, we note that the indicator behaves fairly stably as we move from full-sample information to real-time information. Figure IA.3, in the Internet Appendix, plots the minimum of the three series computed under AR(1), AR(2) and AR(3) assumptions; this serves as a conservative lower bound.
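The lagged regressions and the pointwise minimum across specifications can be sketched as follows. This is a full-sample illustration under assumed names (`lagged_fit`, `conservative_bound`); `z[t]` stands for the realized $r_{t+1} - g_{t+1}$ aligned with `y[t]`, and the three series are compared on the sample for which all lags are available.

```python
import numpy as np

def lagged_fit(y, z, n_lags):
    """OLS of z[t] on a constant and y[t], y[t-1], ..., y[t-n_lags+1].

    Returns fitted values for t = n_lags - 1, ..., len(y) - 1.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    T = len(y)
    cols = [np.ones(T - n_lags + 1)]
    for k in range(n_lags):
        cols.append(y[n_lags - 1 - k : T - k])  # k-th lag of y
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, z[n_lags - 1:], rcond=None)
    return X @ coef

def conservative_bound(y, z, lvix, rf, max_lags=3):
    """Pointwise minimum of LVIX + r_f - fitted(r - g) over 1..max_lags lags."""
    lvix = np.asarray(lvix, dtype=float)
    rf = np.asarray(rf, dtype=float)
    start = max_lags - 1  # first date at which every specification has a fit
    bounds = []
    for p in range(1, max_lags + 1):
        fitted = lagged_fit(y, z, p)[start - (p - 1):]
        bounds.append(lvix[start:] + rf[start:] - fitted)
    return np.min(bounds, axis=0)
```

Taking the minimum makes the bound weaker, and therefore more conservative, whenever the specifications disagree.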

Estimation uncertainty
The coefficients in the regression of $r_{t+1} - g_{t+1}$ onto $y_t$ (and its lags, in the AR(2) and AR(3) cases) are estimated with statistical uncertainty. To illustrate, Figure 5 plots block-bootstrapped one-sided 90% and 95% confidence intervals for our baseline measure $B_t$. At the edge of the 90% (95%) confidence intervals, the lower bounds on expected dividend growth peak at 9.3% (7.9%) for the AR(1) model and at 9.5% (7.8%) for the AR(3) model. While we think these numbers remain implausibly high, one might perhaps argue that they were reasonable forecasts of expected dividend growth. We emphasize, however, that when using the edge of the 90% or 95% confidence interval as the estimate of expected dividend growth, the implicit position taken is that the historical relationship between valuation ratios and $r_t - g_t$ (as embodied in the point estimates, the correct central measure) is misleading. Furthermore, a prudent policymaker should also entertain the symmetric possibility that in the presence of estimation uncertainty, the true $B_t$ is substantially higher than implied by the central point estimates of the predictive coefficients.

What if the valuation ratio follows a random walk?
A true believer in the New Economy might have argued that our measure of $E_t(r_{t+1} - g_{t+1})$, which is based on an assumption that $y_t$ follows an AR(1) (or AR(2) or AR(3)), had broken down during the late 1990s. Perhaps the most aggressive possibility our hypothetical investor could reasonably entertain is the "random walk" view that the price-dividend ratio had entirely ceased to mean-revert, as considered by Campbell (2008, 2018). Such a perspective might also be adopted by a cautious central banker to justify inaction on the basis that valuation ratios could remain very high indefinitely.¹⁹ We now show how to accommodate this possibility. If $y_t$ follows a random walk then, from equation (14),

$$E_t g_{t+1} \ge \widetilde{B}_t,$$

where we define a variant on our previous indicator,

$$\widetilde{B}_t = \mathrm{LVIX}_t + r_{f,t} - y_t, \qquad (34)$$

that has the further benefit of not requiring estimation of any free parameters.

¹⁹ It can also be interpreted as a conservative approach to dealing with Stambaugh bias. More generally, if all one knows is that $E_t y_{t+1} \ge y_t$, irrespective of the details of the evolution of $y_t$, then equation (24) implies that we have $E_t g_{t+1} \ge \mathrm{LVIX}_t + r_{f,t} - y_t$.

Figure 6 shows the time series of $\widetilde{B}_t$, the random walk variant of the indicator. Even if valuation ratios were expected to follow a random walk in the late 1990s (a dubious proposition in any case), the implied expectations about cashflow growth appear implausibly high.
Unlike our preferred indicator, $B_t$, the random walk version $\widetilde{B}_t$ spiked almost as high during the subprime crisis as it did around the turn of the millennium. This reflects the fact that implied volatility, and hence the LVIX index, rose dramatically during the last months of 2008, indicating that log returns were expected to be very high over the subsequent year (by Result 5). From the perspective of our notional policymaker who believed that valuation ratios follow a random walk, these high expected log returns could only have reflected high expected log dividend growth. This prediction is unreasonable, in our view, because the random walk assumption is unreasonable. The point is that even a policymaker who believed valuation ratios followed a random walk would have had to perceive unusually high expected dividend growth in the late 1990s.

What if dividend growth is unforecastable?
If dividend growth is unforecastable (in the sense that E t g t+k = g for all k ≥ 1, as in the work of Campbell and Cochrane (1999) and many others) then valuation ratios reveal long-run expectations of log returns while LVIX reveals the corresponding short-run expectations.
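Specifically, if dividend growth is unforecastable and $y_t$ is stationary, then from equation (15)

$$y_t = (1-\rho)\sum_{i=0}^{\infty} \rho^i E_t r_{t+1+i} - g.$$

This equation can be rearranged to give

$$(1-\rho)\sum_{i=0}^{\infty} \rho^i E_t r_{t+1+i} = y_t + g.$$

Exploiting the inequality $E_t r_{t+1} - r_{f,t} \ge \mathrm{LVIX}_t$ of Result 5, we can conclude that

$$E_t r_{t+1} - \underbrace{(1-\rho)\sum_{i=0}^{\infty} \rho^i E_t r_{t+1+i}}_{\text{long-run returns}} \ge \mathrm{LVIX}_t + r_{f,t} - y_t - g.$$

This inequality provides an alternative interpretation of the indicator $\widetilde{B}_t = \mathrm{LVIX}_t + r_{f,t} - y_t$ that we defined in equation (34) above, and which is plotted in Figure 6. If dividend growth is unforecastable, unusually high levels of $\widetilde{B}_t$ indicate that short-run expected log returns are unusually high relative to subsequent long-run expected log returns.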

Are valuation ratios alone enough?
Valuation ratios alone would make for an even simpler sentiment indicator. Are they enough? In theory, no: as we have argued, valuation ratios can be high for good reasons if interest rates are low or if risk premia are low (and are widely understood to be low) or both, and our measure embraces this fact by incorporating r f,t and LVIX t .
Nonetheless, theory aside, we do know, of course, that valuation ratios were very high during the late 1990s, so it is interesting from a purely empirical perspective to see how they compare with B t . We plot the valuation ratio measures −y t and pd t on the same axes as B t over our sample period in Figure 12 in the appendix. For ease of comparison, we standardize all three series to have zero mean and unit standard deviation and use the full-sample version of B t so that the predictive coefficients do not vary over the time series. The sentiment index B t gives a clearer indication of bubbliness in the market at the start of our sample, from 1996 to 2000, in the sense that it is generally around 0.5 to 1 standard deviations further above its mean than are the valuation ratio series.
In the opposite direction, valuation ratios have been very high in recent years. But our measure suggests that this does not represent a bubble, as the high valuation ratios have reflected unusually low interest rates (and also, for much of this period, low volatility).

Can the methodology be applied in other markets?
Our approach can be applied to other assets if their returns obey the mNCC. It is reasonable to expect that this is the case for stock market indices, for example. Figure 13 illustrates by constructing a sentiment index for the NASDAQ-100. To do so, we calculate LVIX using the mid prices of NASDAQ-100 options from OptionMetrics and estimate the predictive regression (25) over the period 1983-2019 using NASDAQ-100 dividend yield and price level data from Datastream. (As the predictive regression is estimated over a shorter time series, we present results using the full sample rather than using an expanding window.) The sentiment index for the NASDAQ-100 was substantially higher than that for the S&P 500 around and before the turn of the millennium, consistent with the conventional view that sentiment was particularly elevated in tech stocks at the time.
For "hedge" assets, such as gold, one would expect the direction of the inequality (28) to be reversed. This rules out using our approach to detect bubbles in such assets. The situation is more promising in the case of individual stocks: it may be possible to argue that the mNCC holds for stocks with betas sufficiently close to, or greater than, one, along the lines of Martin and Wagner (2019) and Kadan and Tang (2019), but we leave this extension for future research.

Nonlinearity in the functional form
We can also allow for a nonlinear relationship between $r_{t+1} - g_{t+1}$ and $y_t$. In Appendix D, we report the results of running regressions of the form

$$r_{t+1} - g_{t+1} = a_0 + a_1 y_t + a_2 y_t^2 + \varepsilon_{t+1}$$

and

$$r_{t+1} - g_{t+1} = a_0 + a_1 y_t + a_2 y_t^2 + a_3 y_t^3 + \varepsilon_{t+1}.$$

Figure 7a shows that these regressions deliver very similar results to the linear specification reported above when we use the full sample period to estimate the coefficients $a_i$. But the coefficient estimates in the higher order specifications are strikingly unstable when we estimate the regressions in real time using expanding windows (Figure 7b), even though the regressions are estimated on almost 50 years of data at the start of our sample period, that is, on data from 1947 to 1996.
In the late 1990s, for example, the estimated cubic specification implies a negative relationship between y t and forecast r t+1 −g t+1 around the then prevailing value of y t . That is, given the then recent association of unusually low dividend yield with high realized returns, the cubic specification predicts extremely high returns going forward, as shown in Figure 15a, Appendix D. (It is important that the low dividend yields at the time were unusual, because the cubic specification makes it possible to associate high returns with extremely low yields without materially altering the long established relationship between low returns and low yields that prevails over the usual range of yields.) We view this exercise as a cautionary tale. Given that bubbles occur fairly rarely, it is particularly important to avoid the possibility that an (over-)elaborate model achieves superior performance in-sample by overfitting the historical data. The ingredients of a bubble indicator should behave stably during historically unusual periods, as our simple linear specification does (Figure 15b, Appendix D).

Other indicators of market conditions
We now compare the sentiment indicator to some other indicators of financial conditions that have been proposed in the literature. We standardize all time series to have zero mean and unit standard deviation throughout this section, for ease of comparability and so that correlations can equivalently be interpreted as betas (noting that $\operatorname{corr}(X, Y) = \operatorname{cov}(X, Y)/\operatorname{var} X$ if $X$ and $Y$ have unit standard deviation).

Volume
We start by exploring the relationship with volume, which has been widely proposed as a signature of bubbles (see, for example, Harrison and Kreps, 1978; Duffie, Gârleanu and Pedersen, 2002; Cochrane, 2003; Lamont and Thaler, 2003; Ofek and Richardson, 2003; Scheinkman and Xiong, 2003; Hong, Scheinkman and Xiong, 2006; Barberis et al., 2018). We construct a daily measure of volume using Compustat data from January 1983 to December 2017, by summing the product of shares traded and daily low price over all S&P 500 stocks on each day. (We find essentially identical results if we use daily high prices to construct the measure.) As volume trended strongly upward during our sample period, we subtract a linear trend from log volume. We do so using an expanding window, so that our detrended log volume measure, which we call $v_t$, is (like $B_t$) observable at time $t$.
The left panel of Figure 8 plots detrended log volume, $v_t$, and $B_t$ over the sample period, with both series standardized to zero mean and unit variance. There is a remarkable similarity between the two series, so it is worth emphasizing that they are each based on entirely different input data. The sentiment index is a leading indicator of volume: the right panel shows the correlation between $B_{t+k}$ and $v_t$, where $k$ is measured in months. The shaded area indicates a block-bootstrapped 95% confidence interval. The correlation between the two is higher than 0.9 when $k$ is around $-10$ months. Thus Figure 8 shows that $B_{t-10}$ is highly statistically significant as a forecaster of $v_t$ (and, to a lesser extent, that $v_t$ is a statistically significant forecaster of $B_{t+10}$).
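The lead-lag correlation plotted in the right panel can be computed along the following lines. The sketch assumes two equally spaced series of the same length, with `k` counted in observations; the function names are illustrative, and the block-bootstrapped confidence band is not reproduced here.

```python
from statistics import mean, pstdev

def corr(x, y):
    """Sample correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def lead_lag_corr(b, v, k):
    """corr(b[t + k], v[t]) over the overlapping part of the sample.

    A peak at negative k means that b leads v by -k observations.
    """
    if k >= 0:
        bs, vs = b[k:], v[:len(v) - k]
    else:
        bs, vs = b[:len(b) + k], v[-k:]
    return corr(bs, vs)
```

Scanning `k` over a symmetric range, for instance `range(-24, 25)`, and locating the maximum gives the horizon at which one series best anticipates the other.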

Survey expectations of long-term earnings growth
We next compare B t to a quarterly time series of financial analysts' long-term earnings growth forecasts (LTG) that has been constructed by Nagel and Xu (2019). Figure 9 shows the LTG series against B t , with the latter computed as in our baseline measure (i.e., using an AR(1) and with an expanding window to compute predictive coefficients) and with an AR(3) using full sample data. There is a striking similarity between the two series-particularly when the sentiment indicator is computed using an AR(3) and full sample estimation of the predictive regression-but we note that B t rose more rapidly during the late 1990s.

The probability of a crash
One expects that the probability of a crash should be higher during a bubble episode; if not, the episode is perhaps not actually a bubble. We use a measure of the (time-$t$ conditional) probability of a crash derived by Martin (2017, Result 2) that can be computed in terms of option prices:

$$\mathbb{P}_t(R_{t+1} < \alpha) = \alpha\left[\mathrm{put}_t'(\alpha P_t) - \frac{\mathrm{put}_t(\alpha P_t)}{\alpha P_t}\right], \qquad (35)$$

where $\mathrm{put}_t'(K)$ is the first derivative of put price as a function of strike, evaluated at $K$. This represents the probability of a market decline perceived by an unconstrained log investor who chooses to hold the market; we also require that the investor is marginal in option markets, so that we are ruling out the possibility that these markets are segmented from the stock market. (In other words, the above calculation relies on a stronger assumption than the rest of the paper, namely that the SDF $M_{t+1}$ satisfies $M_{t+1} = 1/R_{t+1}$; this implies that the mNCC holds with equality.) The probability of a crash (35) is high when out-of-the-money put prices are highly convex, as a function of strike, at strikes at and below $\alpha P_t$. By contrast, the measure of volatility (29) that is relevant for our sentiment indicator is a function of option prices across the full range of strikes of out-of-the-money puts and calls.

The left panel of Figure 10 plots the crash probability over time. The probability of a crash was elevated during the late 1990s, consistent with standard intuition about bubbles. But it was also high in the aftermath of the subprime crisis, an episode that we would certainly not identify as bubbly. The right panel shows the correlation between the two series at different leads and lags. The sentiment measure is a leading indicator of crash probability at horizons of about two years.
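The crash probability measure is a simple function of the put price and its slope in strike at $K = \alpha P_t$. The sketch below evaluates it with a central finite difference on any callable that maps a strike to a put price. The Black-Scholes pricer is included only to generate smooth test prices (it is not part of the paper's method), and all names and default parameters are assumptions made for illustration. With real data, prices are observed on a discrete strike grid and the slope would have to be estimated from neighbouring strikes.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def crash_probability(put_price, spot, alpha, h=1e-4):
    """alpha * [put'(K) - put(K) / K] evaluated at K = alpha * spot.

    put_price is any callable mapping a strike to a put price;
    h is the (hypothetical) step used for the finite-difference slope.
    """
    k = alpha * spot
    slope = (put_price(k + h) - put_price(k - h)) / (2.0 * h)
    return alpha * (slope - put_price(k) / k)

def bs_put(strike, spot=100.0, rf=0.0, sigma=0.2, tau=1.0):
    """Black-Scholes put price, used here only to produce test inputs."""
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rf + 0.5 * sigma ** 2) * tau) / vol
    d2 = d1 - vol
    return strike * math.exp(-rf * tau) * norm_cdf(-d2) - spot * norm_cdf(-d1)
```

As a check on the arithmetic, with Black-Scholes input prices the expression collapses analytically to $N(-d_1)$ evaluated at the strike $\alpha P_t$, which the finite-difference version reproduces to high accuracy.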
The possibility that high valuation ratios, expected log returns, and expected log dividend growth can coexist with a high crash probability (in the mind of our representative investor) is reminiscent of the view of the world colorfully articulated by former Citigroup chief executive Chuck Prince in a July 2007 interview with the Financial Times: "When the music stops, in terms of liquidity, things will be complicated. But as long as the music is playing, you've got to get up and dance. We're still dancing."

Other measures
The panels of Figures 16 and 17     Harvey (2013). ER 1yr and ER 10yr are, respectively, the cross-sectional average subjective expectations of stock market returns over 1- and 10-year horizons, as reported by survey respondents; EER 1yr and EER 10yr are the corresponding average subjective expected excess returns; and ERstd 1yr and ERstd 10yr are disagreement measures at the same horizons (that is, the cross-sectional standard deviations of reported subjective expected returns). Figure IA.10 compares $B_t$ with a quarterly time series of average subjective expectations of dividend growth that has been constructed by De la O and Myers (2018). The measures of mean subjective expected returns, and of mean subjective expected dividend growth, are positively correlated with $B_t$, while the measures of mean subjective expected excess returns, and of disagreement, are negatively correlated with $B_t$. We are hesitant to draw firm conclusions from this evidence, however, as the comparison series do not include the period of greatest interest from 1996 to 2000.

Conclusion
We have presented a sentiment indicator based on interest rates, index option prices, and the market valuation ratio. The indicator can be interpreted as a lower bound on the expected dividend growth that must be perceived by an unconstrained, rational investor with risk aversion at least one who is happy to invest his or her wealth fully in the stock market, and whose beliefs are consistent with the historical evidence on the relationship between valuation ratios, returns, and dividend growth. The bound was very high during the late 1990s, reflecting dividend growth expectations that in our view were unreasonably optimistic (hence our description of it as a sentiment indicator) and that were not realized ex post. We also show that it is a leading indicator of detrended volume, of long-term earnings growth expectations, and of various measures of stress in the financial system.
In simple terms, we characterize the late 1990s as a bubble because valuation ratios and short-run expected returns (as revealed by interest rates and our LVIX measure) were simultaneously high. Both aspects are important. We would not view high valuation ratios at a time of low expected returns, or low valuation ratios at a time of high expected returns, as indicative of a bubble: on the contrary, the latter scenario occurred in the aftermath of the market crash in 2008.
Our measure does not point to an unreasonable level of market sentiment in recent years, as it interprets high valuation ratios as being justified by the low levels of interest rates and of implied volatility.
Volatility and valuation ratios have, of course, long been linked to bubbles. A novel feature of our approach is that we use some theory to motivate our definitions of volatility and of valuation ratios, and to make the link quantitative. There are various choices to be made regarding the details of the construction of the indicator; we have tried to make these choices in a conservative way to avoid "crying bubble" prematurely, in the hope that the indicator might be useful to cautious policymakers in practice. Our approach does ultimately require an appeal to the good judgment of policymakers, as we do not address the hard question of how to identify whether a given level of expected dividend growth is reasonable. We do not see a way to avoid some degree of expert judgment in identifying market-wide bubbles; but we believe that the approach proposed in this paper would make it easier for such judgment to be applied in a focussed and disciplined manner.

B AR(1) vs. linear regression
If $y_t$ follows an AR(1) with autocorrelation $\phi$, then the linear approximation (24) reduces to a relationship, equation (36), in which $E_t(r_{t+1} - g_{t+1})$ is the sum of a constant term and a multiple of $y_t$. In the body of the paper, we estimate the predictive relationship between $r_{t+1} - g_{t+1}$ and the predictor variable $y_t$ (and $dp_t$) via linear regression. Under our AR(1) assumption, we could also estimate the constant term and the coefficient on $y_t$ directly, as in (36), by estimating $\rho$ and the autocorrelation $\phi$. Table 9 shows that both approaches give similar results.
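To make the mapping from $(\rho, \phi)$ to the predictive coefficients concrete, the sketch below works under an explicit assumption about the form of the linearization, which is not reproduced in this section: that it implies $E_t(r_{t+1} - g_{t+1}) = (y_t - \rho E_t y_{t+1})/(1 - \rho)$. This form is consistent with the random walk indicator $\mathrm{LVIX}_t + r_{f,t} - y_t$ used earlier, but it should be checked against equations (24) and (36). The function name and the mean parameter `y_bar` are hypothetical.

```python
def ar1_implied_coefficients(rho, phi, y_bar):
    """Constant and slope in E_t(r - g) = a0 + a1 * y_t under an AR(1) for y.

    Assumes E_t(r_{t+1} - g_{t+1}) = (y_t - rho * E_t y_{t+1}) / (1 - rho)
    and E_t y_{t+1} = y_bar + phi * (y_t - y_bar).
    """
    a1 = (1.0 - rho * phi) / (1.0 - rho)
    a0 = -rho * (1.0 - phi) * y_bar / (1.0 - rho)
    return a0, a1
```

Two sanity checks follow from this form. With $\phi = 1$ the coefficients collapse to $a_0 = 0$ and $a_1 = 1$, the random walk case. And when $y_t$ equals its mean, the forecast equals that mean for any $\phi$.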

D Nonlinear specifications
In this section we consider the effect of allowing for quadratic or cubic functional relationships between $r_{t+1} - g_{t+1}$ and $y_t$. We run the regressions

$$r_{t+1} - g_{t+1} = a_0 + a_1 y_t + a_2 y_t^2 + \varepsilon_{t+1}$$

and

$$r_{t+1} - g_{t+1} = a_0 + a_1 y_t + a_2 y_t^2 + a_3 y_t^3 + \varepsilon_{t+1}.$$

[Figure caption, opening words lost: ", full-sample) and various measures of financial conditions. Shaded areas in the right panels indicate bootstrapped 95% confidence intervals. k is measured in months."]

E Further examples
We provide some other illustrations of situations in which the mNCC holds. These illustrations are intended as proof-of-concept rather than as fully fleshed out models, so we have simplified them as far as possible.
Example 3 (Heterogeneous preferences). This example is a simplification of Longstaff and Wang (2012), except that we will not need to make any assumptions on the distribution of aggregate consumption growth. Consider a two-period economy with complete markets and two agents with homogeneous beliefs and power utility, but with differing coefficients of risk aversion, $\gamma_2 > \gamma_1 \ge 1$. Agent $i$ therefore maximizes expected power utility over consumption at times $t$ and $t+1$, with risk aversion $\gamma_i$ and time discount factor $\beta$. As markets are complete and beliefs are homogeneous, the stochastic discount factor is unique, so that $\beta (C_{1,t+1}/C_{1,t})^{-\gamma_1} = \beta (C_{2,t+1}/C_{2,t})^{-\gamma_2}$. Following Longstaff and Wang (2012) by assuming that $\gamma_1 = \gamma$ and $\gamma_2 = 2\gamma$ to ensure a closed form solution, we therefore have

$$\frac{C_{1,t+1}}{C_{1,t}} = \left(\frac{C_{2,t+1}}{C_{2,t}}\right)^{2}.$$

Writing $Y_t = C_{1,t} + C_{2,t}$ for aggregate consumption, this implies that

$$C_{2,t+1} = \frac{2\left(\sqrt{1 + a Y_{t+1}} - 1\right)}{a},$$

where the constant $a = 4C_{1,t}/C_{2,t}^2$ reflects the relative wealth of the two agents. We wish to check whether the mNCC holds for the return on the market, i.e., the aggregate consumption claim. To do so, we construct a representative agent for whom the mNCC holds. (Although agents 1 and 2 are not representative, as neither invests only in the market, they have the same beliefs and SDF as the representative agent, so it will then follow that the mNCC holds for them too.) In the usual way, the representative agent consumes $Y_{t+1}$ and has marginal utility $v'(Y_{t+1})$ that is proportional to $C_{2,t+1}^{-2\gamma}$. Integrating gives the representative agent's utility function $v$. The representative agent's relative risk aversion is therefore low in good times and high in bad times, and it lies between $\gamma$ and $2\gamma$:

$$-\frac{Y_{t+1}\, v''(Y_{t+1})}{v'(Y_{t+1})} = \gamma\left(1 + \frac{1}{\sqrt{1 + a Y_{t+1}}}\right) \in (\gamma, 2\gamma).$$

As $\gamma \ge 1$, the mNCC holds.
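The claim about the representative agent's risk aversion can be verified numerically. With $\gamma_1 = \gamma$ and $\gamma_2 = 2\gamma$, equating the two agents' SDFs gives $C_{1,t+1}/C_{1,t} = (C_{2,t+1}/C_{2,t})^2$, so that $Y_{t+1} = C_{2,t+1} + (a/4)C_{2,t+1}^2$ with $a = 4C_{1,t}/C_{2,t}^2$. The sketch below solves this quadratic for agent 2's consumption and differentiates the log of marginal utility by finite differences. The function names, step size, and parameter values are illustrative assumptions.

```python
import math

def c2_of_y(y, a):
    """Agent 2's consumption, the positive root of Y = C2 + (a / 4) * C2**2."""
    return 2.0 * (math.sqrt(1.0 + a * y) - 1.0) / a

def rep_agent_rra(y, a, gamma, h=1e-5):
    """-Y v''(Y) / v'(Y) when v'(Y) is proportional to C2(Y)**(-2 * gamma).

    Computed as -Y * d(log v')/dY using a central finite difference.
    """
    def log_marginal_utility(x):
        return -2.0 * gamma * math.log(c2_of_y(x, a))
    slope = (log_marginal_utility(y + h) - log_marginal_utility(y - h)) / (2.0 * h)
    return -y * slope
```

For any positive aggregate consumption the result lies strictly between `gamma` and `2 * gamma`, falls as `y` rises, and agrees with the expression $\gamma\,(1 + 1/\sqrt{1 + aY})$ obtained by differentiating analytically.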
Example 4 (Heterogeneous beliefs). This example is based on Martin and Papadimitriou (2020). A continuum of investors with log utility over terminal wealth trade a risky asset in unit supply ("the market") and a riskless asset in zero net supply. The net riskless rate is zero. Uncertainty evolves on a binomial tree, so the risky asset's return, $R$, equals $R_u$ at the up-node and $R_d$ at the down-node; we choose labels so that $R_u > R_d$. Investors, indexed by $h \in (0, 1)$, have heterogeneous beliefs: investor $h$ believes that the probability of an up-move is $h$. On wealth-weighted average, the investors must hold one unit of the asset to clear the market. At any node, we can define a representative agent $H \in (0, 1)$ who invests fully in the risky asset with no borrowing or lending. We also define the risk-neutral probability of an up-move (on which all investors agree) as $p^* \in (0, 1)$. Optimists ($h > H$) lever up, while sufficiently pessimistic investors ($h < p^*$) go short, as they perceive that the market earns a negative risk premium. These assumptions imply that the representative agent perceives the market as growth-optimal, and hence that $\frac{H}{R_u} + \frac{1-H}{R_d} = 1$ (as the gross riskless rate is 1). On the other hand (using the fact that the gross riskless rate equals 1 once again) we must also have $p^* R_u + (1 - p^*) R_d = 1$ by the defining property of the risk-neutral probability. Combining these two equations,

$$R_u = \frac{H}{p^*} \quad\text{and}\quad R_d = \frac{1-H}{1-p^*}. \qquad (38)$$

We can now find the covariance $\operatorname{cov}^{(h)}(MR, \log R)$ from the perspective of an arbitrary investor $h \in (0, 1)$. We have

$$\operatorname{cov}^{(h)}(MR, \log R) = E^{(h)}[MR \log R] - E^{(h)}[MR]\, E^{(h)}[\log R]$$
$$= p^* R_u \log R_u + (1-p^*) R_d \log R_d - \left[h \log R_u + (1-h) \log R_d\right]$$
$$= (H - h)\left(\log R_u - \log R_d\right),$$

where we use (38) in the third line. The mNCC therefore holds when $h \ge H$: that is, for the representative investor and for all more optimistic investors.
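The sign of the covariance in this example can be checked by direct enumeration of the two states. The sketch assumes that, with a zero net riskless rate, investor $h$'s SDF in each state equals the risk-neutral probability divided by that investor's subjective probability, and that the returns are pinned down by the two conditions above, giving $R_u = H/p^*$ and $R_d = (1-H)/(1-p^*)$. The function name and the numerical values in the usage note are illustrative.

```python
import math

def mncc_covariance(h, H, p_star):
    """cov^(h)(M R, log R) on a one-step binomial tree with gross riskless rate 1."""
    # Returns implied by growth-optimality for agent H and by risk-neutral pricing
    r_u, r_d = H / p_star, (1.0 - H) / (1.0 - p_star)
    # Investor h's SDF: state price divided by subjective probability
    m_u, m_d = p_star / h, (1.0 - p_star) / (1.0 - h)
    mr_u, mr_d = m_u * r_u, m_d * r_d
    e_mr = h * mr_u + (1.0 - h) * mr_d
    e_log = h * math.log(r_u) + (1.0 - h) * math.log(r_d)
    e_prod = h * mr_u * math.log(r_u) + (1.0 - h) * mr_d * math.log(r_d)
    return e_prod - e_mr * e_log
```

With `H = 0.6` and `p_star = 0.5` (so that `r_u = 1.2` and `r_d = 0.8`), the covariance is zero at `h = 0.6`, negative for more optimistic investors, and positive for more pessimistic ones, matching the closed form $(H - h)\log(R_u/R_d)$.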
Example 5 (Heterogeneous preferences and beliefs). Consider a collection of investors who maximize next-period utility. Investor $i$ allocates a fraction $\theta_i$ of wealth to the risky asset, and $1 - \theta_i$ to the riskless asset, so as to maximize $E^{(i)}\left[W_{i,t+1}^{1-\gamma_i}/(1-\gamma_i)\right]$ where $W_{i,t+1} = W_{i,t} R_f + \theta_i W_{i,t} (R - R_f)$. Risk aversion $\gamma_i \ge 1$ may be heterogeneous across investors. Beliefs are also heterogeneous: we suppose that every investor $i$ perceives the return on the market as lognormal, $\log R \sim N(\mu_i, \sigma_i^2)$, and that $\mu_i - r_f + \frac{1}{2}\sigma_i^2 = \gamma_i \sigma_i^2$ where $r_f = \log R_f$ is the log riskless rate. This last assumption implies (together with the first order condition for optimal $\theta_i$) that every investor will set $\theta_i = 1$, which clears the market. Every investor is therefore representative, and as $\gamma_i \ge 1$, the mNCC holds for them all.

Figure IA.1: The sentiment indicator, computed at the two-year horizon using the full sample to estimate the relationship between $y_t$ and $r_{t+1} - g_{t+1}$ (left) or using an expanding window (right), AR(1) model. Note that $t$ and $t+1$ represent periods rather than years: the time interval between them is two years in this figure. We report the index in annualized terms (LVIX and the riskless rate are annualized by construction, and we calculate $E_t(g_{t+1} - r_{t+1})$ using un-annualized quantities throughout the estimation, then divide the end result by 2).

The sentiment indicator, computed at the two-year horizon using the full sample to estimate the relationship between $y_t$ and $r_{t+1} - g_{t+1}$ (left) or using an expanding window (right), AR(3) model. Note that $t$ and $t+1$ represent periods rather than years: the time interval between them is two years in this figure. We report the index in annualized terms (LVIX and the riskless rate are annualized by construction, and we calculate $E_t(g_{t+1} - r_{t+1})$ using un-annualized quantities throughout the estimation, then divide the end result by 2).