## Monday, October 21, 2013

### DataVis and Public Communication

Data visualization, or simply DataVis, has seen a huge increase in popularity as the web has moved from 2.0 to 3.0. It can be a powerful tool for communicating complex ideas to a broad audience, something that is often difficult in financial econometrics.
Today I came across a good example of a data visualization that compactly expresses an idea in financial economics (and econometrics, since this is all about measurement). Of course, there are a number of important caveats to this depiction:

• These results depend heavily on the choice of subsample, and the past 10 years happen to have been relatively good for passive strategies. This isn't true of all 10-year periods (e.g. the one ending in November 2008).
• Hedge funds may not achieve the same average level of growth, but they may offer diversification, reduced risk or exposure to exotic $\beta$.
• Warren Buffett is only one manager.
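The first caveat, sensitivity to the choice of subsample, is easy to see in a small sketch. The returns below are made up for illustration (they are not data from the visualization), the function name `window_growth` is hypothetical, and a 3-year window stands in for the 10-year window to keep the example short.

```python
from math import prod


def window_growth(annual_returns, window):
    """Growth of $1 over every consecutive `window`-year span."""
    out = []
    for start in range(len(annual_returns) - window + 1):
        span = annual_returns[start:start + window]
        out.append(prod(1.0 + r for r in span))
    return out


# Made-up annual returns (NOT real data): steady gains with one crash year.
returns = [0.10, 0.10, 0.10, -0.40, 0.10, 0.10]

# One entry per possible window; the answer changes sharply depending on
# whether the crash year falls inside the window.
growth = window_growth(returns, window=3)
for start, g in enumerate(growth):
    print(f"window starting in year {start}: $1 grows to ${g:.3f}")
```

Only the first window avoids the crash year, so a comparison based on that window alone would look far more favorable than one based on any of the others.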

## Friday, October 18, 2013

### Between Fama and Shiller

This year's Nobel Memorial Prize in Economic Sciences was awarded on Monday to Eugene F. Fama, Lars Peter Hansen and Robert J. Shiller.

Most of the popular press has focused on the obvious dichotomy between the seminal contributions of Fama and Shiller, and has concluded that Shiller was right.

### Markets are Gray

This year’s prize has been more controversial than most. A substantial amount of the criticism of this year's Nobel has centered on the differences in the original contributions of Fama and Shiller. Justin Wolfers succinctly summarized this difference as:

… financial markets are efficient (Fama), except when they’re not (Shiller), …

One article from The New Yorker was particularly dismissive of efficient markets.

The black and white view adopted in the mainstream press is too simplistic to understand market efficiency. It is useful to consider a substantially more gray definition of market efficiency, first advanced by another Nobel laureate, Clive W. J. Granger, in a paper with Allan Timmermann. This extended definition adds two important dimensions to the definition of weak form efficiency.

The first is the horizon, $h$.  Actual arbitrage capital operates on frequencies ranging from microseconds to quarters, and so it is essential to consider the time scale when asking whether prices are weak form efficient.

The second extension is technology, which can be thought of as a combination of actual physical technology, for example the existence of Twitter, and the understanding of econometric and statistical techniques relevant for capturing arbitrage opportunities. Technology is constantly evolving, and existing arbitrage opportunities disappear as understanding of the risk/reward trade-off evolves. This has clearly occurred at the shortest horizons, where high-frequency trading has evolved from simple strategies trading the same asset on different markets (e.g. IBM in New York and Toronto) to complex strategies trading hundreds of assets to eliminate arbitrage between futures, ETFs and the underlying components of an index. Similarly, recent advances allow real-time sentiment analysis constructed from Twitter feeds to be used to detect price trends.

### Understanding the Risk

In addition to both horizon and technology, it is essential to understand the risk of these price trends. There is a growing list of examples where strategies that consistently generated profits for years experienced sharp reversals. In some examples, these reversals are so sharp that a decade or more of accumulated profit is eliminated in a couple of months. This was the case for simple momentum strategies in 2002 and for statistical arbitrage strategies in August 2007. This type of extremely skewed risk-return relationship substantially complicates the econometric analysis of market efficiency.

The Grossman-Stiglitz paradox states that the absence of arbitrage requires arbitrage. This contradiction, combined with a more nuanced view of efficient markets, leads to the relevant question:
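A minimal sketch of what such a skewed profile looks like in numbers. The return series is hypothetical (a steady 0.5% per month for ten years followed by two months of -30%), chosen only to mimic the pattern described above, and the helper names `skewness` and `max_drawdown` are mine.

```python
from math import prod


def skewness(xs):
    """Sample skewness: third central moment over variance to the power 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def max_drawdown(returns):
    """Largest peak-to-trough fall in cumulative wealth, as a fraction of the peak."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, 1.0 - wealth / peak)
    return worst


# Hypothetical monthly returns (NOT real data): ten calm years, then a two-month crash.
strategy = [0.005] * 120 + [-0.30, -0.30]

final_wealth = prod(1.0 + r for r in strategy)
print(f"skewness:     {skewness(strategy):.2f}")
print(f"max drawdown: {max_drawdown(strategy):.2%}")
print(f"final wealth: {final_wealth:.3f} per $1 invested")
```

In this made-up series the first 120 months contain no losses at all, so any estimate of risk formed before the crash would be badly misleading. That is the sense in which skewed payoffs complicate inference about efficiency.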

Under what conditions are markets efficient?

## Wednesday, October 16, 2013

### The challenges of high-frequency data

It has been nearly 20 years since the publication of some of the most influential research in trade and quote data.  The past seven years have seen an almost unbelievable growth in the number of quotes and a large increase in the number of transactions on major exchanges.

This video shows 10 seconds of trading of BlackBerry Ltd on October 2.  This flurry of activity was attributed (ex-post) to rumors of a second private equity suitor.   The flying objects are both trades (circles) and quotes (triangles) generated by participants in one particular market which are then transmitted to the other 10 exchanges pictured.  The NBBO is pictured at the 6 o’clock position.

With a span of 10 seconds, I would suspect that most of the limit orders were completely computer generated. It is also clear that the orders are being placed so quickly on different exchanges that the traditional practice of trade signing using the Lee & Ready algorithm cannot be relied upon.

The video was produced by Nanex, a specialist low-latency data provider.
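For readers unfamiliar with trade signing, the core of the Lee & Ready rule can be sketched as follows. This is a simplified illustration rather than a faithful implementation: it compares each trade with the midpoint of the quote assumed to prevail at that moment, and falls back to a tick test for trades at the midpoint. It omits the quote lag used in the original algorithm, the function name `sign_trades` is hypothetical, and prices are in integer cents to avoid floating-point ties.

```python
def sign_trades(trades):
    """Classify trades as buyer-initiated (+1) or seller-initiated (-1).

    `trades` is a time-ordered list of (price, bid, ask) tuples, where bid
    and ask are the quote ASSUMED to prevail at the time of the trade.
    Returns 0 for a midpoint trade with no earlier price change to consult.
    """
    signs = []
    last_price = None  # most recent trade price
    last_tick = 0      # sign of the most recent non-zero price change
    for price, bid, ask in trades:
        if last_price is not None and price != last_price:
            last_tick = 1 if price > last_price else -1
        mid = (bid + ask) / 2.0
        if price > mid:
            sign = 1           # quote rule: above the midpoint is a buy
        elif price < mid:
            sign = -1          # quote rule: below the midpoint is a sell
        else:
            sign = last_tick   # tick test for trades exactly at the midpoint
        signs.append(sign)
        last_price = price
    return signs


# Made-up trades in cents with a constant 1000/1002 quote (midpoint 1001).
example = [(1002, 1000, 1002), (1000, 1000, 1002),
           (1001, 1000, 1002), (1001, 1000, 1002)]
signs = sign_trades(example)
```

The whole procedure rests on knowing which quote was in force when each trade occurred. When quotes on a dozen venues change many times within a second, that matching is exactly the step that becomes unreliable.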

## Monday, September 9, 2013

### Which came first, the data or the econometrics?

The New York Times ran an article on Big Data and Economics yesterday. Big Data has become an unavoidable buzzword in the mainstream media (see BBC Documentary on Big Data) although most applications covered have been outside the traditional realm of economics and econometrics – crime prediction and prevention, disease control or discerning the mood of a country or region.

Applying Big Data to economic problems is clearly going to require new econometric approaches, especially with respect to model building. The broadly taught method of manual model selection will likely not be possible with billions of records and hundreds or thousands of variables per observation. Even common practices which fall under the heading of data cleaning won’t be possible on many of these datasets.

### Financial Econometrics and Big Data

Financial economists have been using Big Data for far longer than the expression has existed. Even the CRSP database, a tiny database by modern standards, was pushing the envelope when it first became available. More recently, the TAQ database – used by financial economists to understand microstructure and measure volatility and correlation – continues to push the limits of computing. As of some point in 2012, there were more than 1,000,000,000,000 (one trillion) quote records in the TAQ database, and about 10% as many trades. TAQ contains only the completed transactions, and using the raw message flow results in two orders of magnitude more data (see LOBSTER).

Whether TAQ is Big Data in the modern parlance is not completely clear. Big Data is typically used to refer to unstructured (or at best weakly structured) data, such as the information contained in Facebook. TAQ data and exchange message feeds are highly structured (although they typically contain many errors), and so they can be organized and analyzed without invoking a reference to a stuffed elephant.

### Say’s Law

Say’s Law, at least in one incarnation, states

Supply creates its own demand.

Say’s Law is especially true for econometrics and statistics – developing statistical techniques that can’t be applied to economically interesting data is usually a poor choice for an econometrician. On the other hand, the availability of data allows economists (and econometricians) to develop techniques that can lead to new insights. Recent examples of this include the research into realized variance, model-free implied volatility and their combination to provide new insights into risks which are actually compensated.
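As a concrete instance of a technique that exists only because the data do, realized variance is simply the sum of squared intraday returns. The sketch below assumes evenly sampled prices for a single day and ignores microstructure noise and overnight returns; the price path is hypothetical.

```python
import math


def realized_variance(prices):
    """Sum of squared log returns over one day's intraday price samples."""
    log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return sum(r * r for r in log_returns)


# Hypothetical intraday prices (NOT real data), e.g. sampled every few minutes.
prices = [100.0, 100.5, 100.2, 100.9, 100.4]
rv = realized_variance(prices)
print(f"realized variance: {rv:.6f}")
print(f"realized volatility (daily): {math.sqrt(rv):.4%}")
```

With only daily closing prices the same day would contribute a single squared return, which is why the estimator had to wait for intraday databases such as TAQ.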

It is not obvious that all data vendors understand that making financial data available to academics, typically on a delayed basis, is a fantastic way to get free press without undermining commercial viability. Moreover, new insights that come from analysis of the data are likely to increase the commercial value of the database.

## Friday, September 6, 2013

### The SoFiE Blog

Welcome to the new SoFiE Blog. This blog aims both to highlight the endeavors of the Society and to engage a wider audience than the academics who make up the Society. I am, in particular, interested in engaging the large practitioner community that consumes, modifies and in many instances advances financial econometrics.

Feel free to contact me via email at kevin.sheppard@economics.ox.ac.uk with any questions, comments or criticism – or leave your comments below.