Deepayan Chakrabarti

Me
Associate Professor,
Information, Risk, and Operations Management (IROM)
McCombs School of Business,
UT Austin
Other affiliation: Amazon Scholar
Office: CBA 6.462, 2110 Speedway Stop B6500,
Austin, TX 78712
Email: deepay <at> utexas <dot> edu

I work on a broad range of problems in Machine Learning and Data Mining, with a recent focus on:
  • strategic interactions between trading firms,
  • reliable portfolio construction from limited data, and
  • robust inference for low-rank models and machine learning.

My work spans model building, statistical inference, algorithm design, and theoretical proofs of consistency.

More details may be found in my resume and my Google Scholar and DBLP profiles.

Note for Prospective Students: I am looking for new Ph.D. students. Please do not contact me directly, as all admission decisions are made by the Ph.D. search committee. In your Statement of Purpose, you can mention your research areas and state your interest in working with me. Most of my projects are at the Ph.D. level, so unfortunately I cannot offer projects to non-Ph.D. students unless they have taken my classes.

Consider firms trading with each other. Each firm wants the trades with the best risk-return tradeoff. However, two firms will only trade if it is beneficial for both. How does this mix of cooperation and competition work?
  • Unique Equilibrium: There is only one set of trades where everyone is happy. Furthermore, the firms can find this equilibrium via bilateral negotiations.
  • Local changes have global effects: If one firm changes its mind, it changes its contracts. However, these local changes can cause even larger changes in the contracts between other firms. Such ripple effects make the financial regulator's job more challenging.
  • Minor news can cause drastic changes: Suppose my trading partners seek larger trades over time. Have they become more profitable or more risk-seeking? Even minor news that signals an answer can change my mind, thus affecting all my contracts.
  • What if firms lie during contract negotiations? Such "negotiating positions" are commonplace. I must choose my position knowing that everyone else is doing the same. There is a unique Nash equilibrium where no firm wants to change its position even after seeing the others' positions.
  • Trade deadlines can protect honest firms: Honest firms are at a disadvantage when trading with strategic firms. But trade deadlines can even the playing field. Any firm can walk away from a trade at the deadline. This implicit threat forces strategic firms to offer better terms to honest firms.
Why is stock-picking difficult? Because "past performance is not a guarantee of future results." Yet, past and future performance are not unrelated. Intuitively, data from the recent past should be more helpful in making predictions. However, recent data is also limited compared to historical data. How can we build reliable portfolios from limited data?
  • Minimum Variance Portfolios: We separated the well-estimated aspects of the data from the poorly estimated parts. Then, we performed classical optimization on the former and robust optimization on the latter. Combining the two gave a reliable and effective portfolio (a generic sketch of the classical minimum-variance step appears after this list).
  • Maximum Sharpe Portfolios: The Sharpe ratio measures risk-adjusted returns. To optimize it, we need estimates of future stock returns. However, estimates based on the recent past, while unbiased, can be noisy. We developed a robust method that overcomes the noise, giving high risk-adjusted returns.
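As a point of reference for the robust approach above, here is a minimal numpy sketch of the classical (non-robust) minimum-variance step, using simulated returns and the textbook closed-form solution. The asset count, sample size, and data are illustrative assumptions, not taken from the papers.

    import numpy as np

    # Classical minimum-variance portfolio from a sample covariance matrix.
    # With few observations, the covariance estimate is noisy -- the situation
    # that motivates the robust treatment described above.
    rng = np.random.default_rng(0)
    n_days, n_assets = 60, 5                       # deliberately scarce data (illustrative)
    returns = rng.normal(0.0005, 0.01, size=(n_days, n_assets))

    sigma = np.cov(returns, rowvar=False)          # sample covariance estimate
    ones = np.ones(n_assets)

    # Closed form of  min_w  w' Sigma w  subject to  sum(w) = 1  (shorting allowed):
    #   w* = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
    inv_sigma_ones = np.linalg.solve(sigma, ones)
    weights = inv_sigma_ones / (ones @ inv_sigma_ones)
    print("weights:", np.round(weights, 3), "sum to", round(weights.sum(), 3))

The robust methods above go further by treating the poorly estimated parts of sigma differently from the well-estimated parts, rather than plugging the noisy estimate directly into the optimization.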
Low-rank models assume that a few hidden factors explain most of the data. Such models are often used in network community detection, text mining, and financial analysis. How can we infer these hidden factors?
  • Nonnegative Matrix Factorization (NMF): We designed an efficient and provably optimal algorithm for network community detection. In contrast, general-purpose NMF algorithms can only find local optima.
  • Eigenvector recovery: Data often arrives in matrix form, and spectral decomposition of that matrix is typically the first step toward the hidden factors. We proved sharp bounds on spectral estimation errors in low-rank models (a generic spectral sketch follows this list).
  • One algorithm for all low-rank models: We developed a simple algorithm for all low-rank problems. For example, we showed theoretical and empirical results for several popular network and text mining models.
  • Fair classifiers: A classifier that is accurate on average may still underperform for "sensitive" subsets of people. In other words, the classifier may be unfair to these subsets. We detect and rectify the problem even when the sensitive subsets are unknown.
  • Network embeddings: Node embeddings are vectors, one for each node in a network, that reflect the linkage pattern between the nodes. However, most nodes have few links, so we know little about them. This lack of information leads to poor-quality embeddings. Existing methods resort to assumptions, such as links these nodes "should" have. Instead, we use robust methods to build embeddings based solely on the data.
  • Classification from limited data: To classify, we need to estimate aspects of the data distribution. Scarce data leads to noisy estimates and underperforming classifiers. We develop robust approaches to overcome the noise. These approaches perform well empirically and have theoretical justifications.
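As a generic illustration of the spectral "first step" mentioned in the eigenvector-recovery bullet, here is a short numpy sketch of community detection under an assumed two-block stochastic block model. It is not the provably optimal NMF algorithm or the specific models from the papers above, and all sizes and edge probabilities are illustrative.

    import numpy as np

    # Spectral community detection under a planted two-block model: the expected
    # adjacency matrix has rank 2, so its top eigenvectors reveal the communities.
    rng = np.random.default_rng(1)
    n, p_in, p_out = 400, 0.10, 0.02               # nodes and edge probabilities (illustrative)
    labels = np.repeat([0, 1], n // 2)             # planted ground-truth communities

    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)          # symmetric adjacency matrix, zero diagonal

    # eigh returns eigenvalues in ascending order; take the eigenvector of the
    # second-largest eigenvalue and split nodes by its sign.
    _, eigvecs = np.linalg.eigh(adj)
    pred = (eigvecs[:, -2] > 0).astype(int)

    accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
    print(f"community recovery accuracy: {accuracy:.2f}")

The sign split works here only because the gap between within- and between-block edge probabilities is large relative to the sampling noise; the error bounds and algorithms in the work above address when and how such spectral estimates can be trusted.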