Wednesday, June 11, 2014

The Annual SoFiE 2014 Conference

The annual conference starts today with a strong and diverse program that includes contributions on the estimation of asset pricing models, inference from options, high-dimensional modeling and, of course, estimating components of quadratic variation using high-frequency data.

Which risks matter?

While it is difficult to choose a single paper that I am looking forward to hearing about, "Asset pricing in the frequency domain: theory and empirics" by Ian Dew-Becker and Stefano Giglio caught my eye.  This paper uses subtle but important differences in the frequencies that matter for risk -- for example, with Epstein-Zin utility the low-frequency components of risk are the most important -- to provide new insights into which models are more compatible with asset prices.  This is an important area since many modern models -- habit formation, long-run risk, or preferences for robustness against the unknown (Hansen-Sargent robust control or other forms of ambiguity aversion) -- can match a large number of common measures of asset prices such as the mean, volatility and persistence of returns.  The paper contributes to a rapidly expanding literature showing that low-frequency features, such as volatility over the past few years, are more useful for understanding expected returns than higher-frequency components, and it is interesting since it substantially challenges the common perception that risk is dominated by surprises.

Wednesday, April 30, 2014

What is a good volatility model?

While conditional volatility modeling has evolved substantially in the 30+ years since it was first documented in financial data, there is little consensus as to which volatility model is “best”.

Historical Volatility

The simplest model for volatility is simple historical volatility (HV), $$\sigma_t = \sqrt{\frac{1}{n}\sum _{j=1}^{n} r_{t-j}^2},$$ where \(n\) is the length of the estimation window.  Historical volatility is really easy to compute and only involves 1 parameter – the length of the window used (or none if using all available data, so that \(n=t-1\)).  Of course, historical volatility doesn’t react to news and so is unlikely to be satisfactory in practice.
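A minimal sketch of the calculation in Python (the function name and the simulated returns are purely illustrative):

    import numpy as np

    def historical_vol(returns, window=None):
        """Historical volatility: root mean squared return over the trailing window."""
        r = np.asarray(returns)
        window = len(r) if window is None else window
        return np.sqrt(np.mean(r[-window:] ** 2))

    r = np.random.standard_normal(1000) * 0.01           # simulated daily returns
    print(historical_vol(r, window=252) * np.sqrt(252))   # annualized estimate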

Exponentially Weighted Moving Averages

EWMA volatility goes beyond HV to reflect more recent information.  The basic EWMA can be expressed in one of two forms, $$\sigma_t^2 = (1-\lambda)r_{t-1}^2 + \lambda \sigma_{t-1}^2$$ or equivalently $$\sigma_t^2 = (1-\lambda)\sum_{i=0}^{\infty} \lambda^i r_{t-1-i}^2.$$
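In code the recursion is equally simple; the sketch below is illustrative only, using 0.94 (the familiar RiskMetrics value) as the default smoothing parameter and the sample variance as the starting value:

    import numpy as np

    def ewma_variance(returns, lam=0.94):
        """Recursive EWMA conditional variance; sigma2[t] uses returns through t-1."""
        r = np.asarray(returns)
        sigma2 = np.empty_like(r)
        sigma2[0] = r.var()                  # simple choice of starting value
        for t in range(1, len(r)):
            sigma2[t] = (1 - lam) * r[t - 1] ** 2 + lam * sigma2[t - 1]
        return sigma2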
I’ve often found EWMA-based volatility (and covariance/correlation/beta) to be a very popular choice among practitioners.  This popularity is often attributed to a few features of the EWMA model:
  • Simplicity: The entire model is summarized by 1 parameter which is easily interpretable in terms of the memory of the process.
  • Robustness: The model, at least when using common values for \(\lambda\), is very unlikely to produce absurd values.  This is particularly important in practice when volatility is required for hundreds of assets.

Combining: GARCH

EWMAs are far less popular among academics, primarily since they have some undesirable statistical properties (most notably that forecasts behave like a random walk and so do not mean revert), and ARCH-type models are commonly used to address these issues.  The conditional variance in the standard GARCH model is $$\sigma_{t+1}^2 = \omega + \alpha r_{t}^2 + \beta \sigma_t^2$$ and so has 3 free parameters.  However, a GARCH model can be equivalently expressed as a convex combination of HV and an EWMA-style update where $$\sigma^2_{t+1} = (1-w)\,\sigma^2_{HV} + w\left[(1-\lambda)r_t^2 + \lambda\sigma_t^2\right]. $$
In this strange reformulation, \(\sigma^2_{HV}=\omega/(1-\alpha-\beta)\) is the long-run (HV-style) variance, \(w=\alpha+\beta\) is the persistence and \(\lambda=\beta/(\alpha+\beta)\) plays the role of the EWMA smoothing parameter, which is close to the \(\beta\) in the usual GARCH specification when the persistence is near one.  When I hear GARCH models dismissed while EWMA and MA models are lauded, I find this somewhat paradoxical.  Of course, the combination model still has 3 parameters, which may be two too many.
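A quick numerical check of the reformulation, with illustrative parameter values (note that the EWMA-style update is applied to the combined variance itself, not to a standalone EWMA):

    import numpy as np

    omega, alpha, beta = 0.05, 0.08, 0.90     # illustrative GARCH(1,1) parameters
    w = alpha + beta                          # weight on the EWMA-style component
    lam = beta / (alpha + beta)               # implied smoothing parameter
    sigma2_hv = omega / (1 - alpha - beta)    # long-run (HV-style) variance

    rng = np.random.default_rng(0)
    r2 = rng.chisquare(1, 500)                # stand-in squared returns
    s2_garch = s2_combo = sigma2_hv
    for t in range(len(r2)):
        s2_garch = omega + alpha * r2[t] + beta * s2_garch
        s2_combo = (1 - w) * sigma2_hv + w * ((1 - lam) * r2[t] + lam * s2_combo)

    print(np.isclose(s2_garch, s2_combo))     # True: the two recursions coincide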

Assessing Volatility Models

The standard method to assess volatility models is to evaluate the forecast using a volatility proxy such as the squared return.  The non-observability of volatility poses some problems, and so only a subset of loss functions will consistently select the best forecast in the sense that, if the true model were included, it would necessarily be selected.  The two leading examples from this class are the Mean Square Error (MSE) loss function and the QLIK loss function, the latter defined as

\[L\left(r_{t+1}^2,\hat{\sigma}^2_{t+1}\right) = \ln\left(\hat{\sigma}^2_{t+1}\right)+ \frac{r_{t+1}^2}{\hat{\sigma}^2_{t+1}}\]

The name QLIK is derived from the similarity to the (negative) Gaussian log-likelihood and its use as a quasi-likelihood in mis-specified models. 
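For concreteness, a sketch of how the two losses could be computed from squared returns and variance forecasts (the function names are my own):

    import numpy as np

    def mse_loss(r2, sigma2_hat):
        """MSE loss with the squared return as the volatility proxy."""
        return (r2 - sigma2_hat) ** 2

    def qlik_loss(r2, sigma2_hat):
        """QLIK loss: up to constants, a negative Gaussian quasi log-likelihood."""
        return np.log(sigma2_hat) + r2 / sigma2_hat

    # Competing forecasts are ranked by average loss, e.g.
    # np.mean(qlik_loss(r ** 2, forecast_a)) vs. np.mean(qlik_loss(r ** 2, forecast_b))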

A New Criterion for Volatility Model Assessment

In some discussions with practitioners, I recently stumbled across a new consideration for volatility model assessment - one that is driven by the desire to avoid noise-induced trading.  Trading strategies involve both conditional mean predictions (or \(\alpha\)) as well as forecasts of the quantities required for risk management.  Volatility is almost always one component of the risk management forecast.  A simple strategy aims to maximize the return subject to a volatility limit - say 20% per year.  In this type of strategy, the role of the mean forecast is primarily to determine the direction of the position - long or short.  The volatility is used to scale the position.  As a result, a substantial amount of trading is induced by changes in volatility since the sign of the mean forecast is typically persistent. 
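A stylized sketch of this type of position sizing (the function and the 20% annualized target are purely illustrative):

    import numpy as np

    def target_position(mu_hat, sigma_hat, target_vol=0.20):
        """Directional bet scaled so that predicted volatility matches the target."""
        return np.sign(mu_hat) * target_vol / sigma_hat

    # If the sign of mu_hat is unchanged between t-1 and t, the trade is
    #   |position_t - position_{t-1}| = target_vol * |1/sigma_t - 1/sigma_{t-1}|,
    # so all of the turnover comes from revisions to the volatility forecast.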

Optimizing a volatility forecast for this scenario requires a different criterion, and one simple method to formalize this idea is to include a term that penalizes changes in the volatility forecast.  This could be measured using the absolute or squared difference in forecasts, so that a modified QLIK criterion could be constructed as 

\[QLIK + \gamma\left|\hat{\sigma}^2_{t+1}-\hat{\sigma}^2_{t}\right|\]

where \(\gamma\) is a weight used to control the sensitivity of the loss function to the smoothness penalty.
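A minimal sketch of the modified criterion (the default \(\gamma\) and the treatment of the first observation are arbitrary choices here):

    import numpy as np

    def smooth_qlik(r2, sigma2_hat, gamma=1.0):
        """Average QLIK loss plus a penalty on period-to-period forecast changes."""
        qlik = np.log(sigma2_hat) + r2 / sigma2_hat
        penalty = gamma * np.abs(np.diff(sigma2_hat, prepend=sigma2_hat[0]))
        return np.mean(qlik + penalty)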

Monday, April 21, 2014

Scaling up vs. scaling out

Simulations have a long history in econometrics; one of the most influential simulations was Granger and Newbold (1974), who demonstrated the dangers of spurious regression. These results are now 40 years old, and while the entire simulation can be readily replicated in less than a minute on a modern computer, it was state-of-the-art when it was originally conducted.
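For a sense of scale, a stripped-down version of the spurious-regression experiment (my own sketch, not Granger and Newbold's exact design) runs in well under a minute:

    import numpy as np

    rng = np.random.default_rng(0)
    T, n_sims, reject = 100, 1000, 0
    for _ in range(n_sims):
        y = np.cumsum(rng.standard_normal(T))   # two independent random walks
        x = np.cumsum(rng.standard_normal(T))
        X = np.column_stack([np.ones(T), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        s2 = resid @ resid / (T - 2)
        t_stat = beta[1] / np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
        reject += abs(t_stat) > 1.96
    # The rejection rate is far above the nominal 5% level
    print(f"Rejection rate of beta = 0: {reject / n_sims:.2f}")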

Scale Out

Scale out has traditionally been the model used to enable realistic simulation designs. Many academic institutions have clusters available for researchers, and anyone who is willing to invest some time can run their own cluster on Amazon Web Services (AWS) using the StarCluster toolkit. The scale out model allows for nearly unbounded scaling of computational resources for most simulations since most fall into the class of embarrassingly parallel problems.

However, the scale out model, while providing a highly scalable environment, comes with one important cost - researcher time. Most econometricians are not overly dependent on cluster environments and so the incentive to learn a substantially different programming environment is low. Most programming is done in a relatively interactive manner using MATLAB, GAUSS, Ox or R. The process of moving these to a cluster environment is non-trivial and requires porting to a batch system - where simulations are submitted to a queue - which is substantially different from the desktop environment used to develop the code.

Scale Up and a Missing Market

A simpler method to enable researchers to conduct complex simulations is to use a scale up model, which allows researchers to simply make use of a bigger version of their desktop. Moderately large computers can be configured with 24 to 30 cores in a single machine, and any code that runs well on a 4-core desktop can be trivially ported to run on a 24-core machine. More importantly, it is simple to replicate the environment that was used to develop the code in the scaled up environment so that the time costs, and the risk associated with changing environments, are minimized.

The challenge with the scale up model is the cost of provisioning a large computer. A capable scale up computer costs north of $10,000, which may be difficult for research budgets to accommodate. The other side of the cost is that the machine will probably not be highly utilized by a single researcher. Most simulation studies I've performed were designed to provide results within a few weeks, and even with multiple simulation studies per year, the machine would idle most of the time.

I am surprised that the "cloud" has not provided this mode of operation. Virtually all of the cloud has developed around the cluster model, and the largest instances currently available on either Google or AWS contain 16 physical cores. A single virtual desktop that doubled this core count would be very attractive to researchers in many fields, and so this seems like a missing market.

Saturday, March 8, 2014

Upcoming SoFiE Conference:
Skewness, Heavy Tails, Market Crashes, and Dynamics

The deadline for the upcoming SoFiE sponsored conference on extreme risks is next week.

Date: April 28 & 29, 2014
Submission Deadline: March 15, 2014
Location: Cambridge, UK

Topics include:
  • Estimation and inference in dynamic asset pricing models 
  • Characterization of financial risk in the presence of skewness and fat tails 
  • Modelling Bubbles and Crashes 
  • Multivariate non-Gaussian densities 
  • Measures of dependence – co-skewness
  • Conditional Skewness Models 

Invited Speakers

Paul Embrechts (ETH Zürich), Andrew Harvey (Cambridge), Eric Ghysels (UNC - Chapel Hill), Peter Christoffersen (Toronto)

Program Chairs

O. Linton and E. Renault


Papers can only be submitted electronically via e-mail, with the subject line “SoFiE Fac 2014 Submission”, and must consist of a single PDF file. No other formats will be accepted. Submissions must be received by March 15, 2014.

More details are available on the conference website.

Friday, March 7, 2014

Referee Reports (Lost in Translation)

I’ve come across a wide variety of referee report styles, ranging from a holistic, short-essay style to a simple list of short bullet points.   Unlike general academic writing, when someone first starts writing reports, often as a graduate student, there is little guidance.  I recall asking the person who requested the report for some guidance and being given one of their recent reports as a template – I have little doubt that this induced extreme path dependence and that my default template today still reflects this initial observation.

Internationalization and Report Language

I recently came across an HBR article on the difference, across cultures, between what is said and what is heard, and am wondering how sensitive this issue is in reading referee reports.  It might explain why I’ve heard complaints that the editorial decision did not match the (perceived) reports.

Popular Fiction as Academic Writing

If Harry Potter Was An Academic Work is a light-hearted take on the peer review process.  I found the following to be particularly insightful.
Dear Dr. Rowling 
I am pleased to say that the reviewers have returned their reports on your submission Harry Potter and the Half-Blood Prince and we are able to make an editorial decision, which is ACCEPT WITH MAJOR REVISION.
Reviewer 1 felt that the core point of your contribution could be made much more succinctly, and recommended that you remove the characters of Ron, Hermione, Draco, Hagrid and Snape. I concur with his assessment that the final version will be tighter and stronger for these cuts, and am confident that you can make them in a way that does not compromise the plot. 
Reviewer 2 was positive over all, but did not like being surprised by the ending, and felt that it should have been outlined in the abstract. She also felt that citation of earlier works including Lewis (1950, 1951, 1952, 1953, 1954, 1955, 1956) and Pullman (1995, 1997, 2000) would be appropriate, and noted an over-use of constructions such as “… said Hermione, warningly”.

Thursday, March 6, 2014

Time for WRDS 2.0?

In the beginning...

Managing financial data was very painful. Using CRSP required either using a clunky program to extract data or compiling some FORTRAN when more control was needed. Using TAQ meant spending a day rotating CDs through a reader (and also either using a clunky GUI or writing your own code to read a binary format). Wharton Research Data Services (WRDS) dramatically simplified the process of accessing financial data, whether it was simply extracting a large set of return data or accessing quarterly report information. WRDS has grown considerably in scope and covers a wide range of proprietary databases as well as offering a warehouse for free-to-use datasets.

The good, the bad and the SAS

The WRDS infrastructure seems to be built mostly on SAS, one of the grand-daddies of statistical software.  SAS was one of the first statistical packages I used as an undergraduate (along with Shazam, which I didn’t realize still exists and which possibly has the best domain name).  Back in those dark days it took 30 minutes to run a cross-section regression with 800,000 observations and a dozen or so variables on the shared Sun server.  Of course, the 800,000 observations had actually been read off of a 10.5 inch tape.  But this environment was revolutionary since it could run the regression at all, and so was very valuable.

A short 20 years after I ran my first regressions, I have no use for SAS – well, I would have no use for SAS were it not the only practical method to make non-trivial queries on WRDS.  I am sympathetic to the idea that SAS provides a simple abstraction for a wide range of data, from the (now) tiny monthly CRSP dataset to the large TAQ dataset.  However, this is a decidedly dated approach, especially for larger datasets.  I know a wide range of practitioners who work with high-frequency data and I am not aware of any who use SAS as an important component of their financial modeling toolkit.  Commercial products such as kdb exploit the structure of the dataset to be both faster and to require less storage for the same data.  I was recently introduced by a former computer science colleague to an alternative, widely used open data storage format, HDF, which can also achieve fantastic compression while providing direct access from MATLAB (or R, Python, C, Java, C#, inter alia).  It has been so successful at managing TAQ-type data that the entire TAQ database (1993-2013) can be stored on a $200 desktop hard drive. 
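As a small illustration, here is a sketch of storing and reading TAQ-style trade records in HDF5 from Python using pandas and PyTables (the file name, key and columns are hypothetical):

    import numpy as np
    import pandas as pd

    # Hypothetical TAQ-style trade records; the column names are illustrative only.
    n = 1_000_000
    trades = pd.DataFrame({
        "timestamp": pd.date_range("2013-01-02 09:30", periods=n, freq="ms"),
        "price": 100 + 0.01 * np.random.standard_normal(n).cumsum(),
        "size": np.random.randint(1, 1000, size=n),
    })

    # Store with compression (requires PyTables); blosc typically shrinks tick data substantially.
    trades.to_hdf("trades.h5", key="IBM", mode="w", format="table",
                  complevel=9, complib="blosc")

    # Direct access from Python -- no SAS export/import round trip needed.
    recovered = pd.read_hdf("trades.h5", key="IBM")
    print(recovered.head())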

The deeper issue is that the SAS data file format is not readily usable in many software packages, which creates an unnecessary cycle of using SAS to export data (possibly aggregating at some level) before re-importing it into the native format of a more appropriate statistical analysis package designed for rapidly iterating between model and results.

WRDS Cloud

In 2012 WRDS introduced the cloud, which provided a much needed speed boost to the now aging main server. The cloud operates as a batch processor where jobs – mostly SAS programs – are submitted and run in an orderly, balanced fashion. This is far superior, both in terms of fairness and in terms of long-run potential for growth, since it follows the scale-out model: as the use of WRDS increases, or as new, large datasets are introduced, new capacity can be brought on-line to reflect demand. The limitation of the cloud is that it is mostly still running SAS jobs, just on faster hardware, and so the deep issues about access remain. The WRDS cloud also supports R for a small minority of datasets which have been exported to text – also not a good format, since conversion from text to binary is slow, text files are verbose (although this can be mitigated using compression) and, if not carefully done, the conversion may not perfectly preserve fidelity.

Expectations of a modern data provider

What changes would I really like to see in WRDS? A brief list:

  • Use of more open data formats that support a wide range of tools, especially those which are free (e.g. R or Python, but also Octave, Julia, Java or C++ if needed) or free for academic use (e.g. Ox).
  • The ability to submit queries directly using a Web API. This is how the Oxford-Man Realized dB operates using Thomson Reuters’ TickHistory – a C# program manages submission requests, checks completion queues and downloads data, all using a Web Service.
  • The ability to execute small, short running queries directly from leading, popular software packages. MATLAB, for example, has an add-on that allows Bloomberg data to be directly pulled into MATLAB with essentially no effort and especially no importing code.

Thursday, February 27, 2014

SPOOCs: SPecialized Open Online Courses

MOOCs, Massive Open Online Courses, are the most talked about innovation in higher education in a long time.  Clearly they have the potential to offer education at a lower cost than a traditional University-taught course.  They have also, so far, found very limited success:  reported drop-out rates are often over 90%, they do not capture the socialization component of University education and they are seen as an existential threat by some academics. 

At the other end of the spectrum, I think the SPOOC market is ready to take off.  John Cochrane has recently run a Ph.D. level course in asset pricing on Coursera.  I am certain that I would have participated in Cochrane’s course were it available in 1999/2000.  Unfortunately I only had the non-dynamic paper version of a precursor to Cochrane’s course – first as a pre-print PDF and later in hardback.  See here for some analysis and a discussion of the challenges of the course – the comments are also worth reading.

This method of teaching and general communication could do wonders for bright Ph.D. students across the globe.  Some Finance Departments and Business Schools are large enough that they can regularly offer highly specialized courses in areas such as Theoretical or Empirical Market Microstructure.  Most others only offer these courses if they have faculty – permanent or visiting – with some spare teaching capacity and sufficient students in a cohort to justify the expense.  On the other hand, there are regularly first-rate courses taught in the area by field-leading academics like Maureen O’Hara, Hank Bessembinder or Terrence Hendershott.  The obvious solution is to match supply with demand.

These courses will never be Massive – the material is simply too difficult and specialized for most students, even many finance or economics Ph.D. students.  On the other hand, they could provide a much richer foundation and increasing breadth to students studying outside of the select Universities with the largest programs.  I would expect that the participation rates would also be much higher than for the typical MOOC since students will have more realistic expectations both of the course and of the effort required for success.  The economies of scale would still be meaningful, even with an audience far smaller than that of a typical MOOC.

The SoFiE Summer School

This leads to the natural question as to whether the SoFiE summer school should consider the SPOOC model. 

Preparing the course is a non-trivial endeavor, and I would suspect that it could not be executed this year.  However, the type of material – cutting edge – and the professors giving the course – first rate – are key building blocks for successful SPOOCs.  And if a fully open course is not possible, it may still be possible to offer the content to SoFiE members, including (especially) student members.

Note: The summer-school-as-a-SPOOC is pure conjecture at this point.  I have not discussed the topic with the SoFiE leadership or the course teachers.