Papers You Should Write

Future plans: Update post with links to papers I’ve found addressing parts of these questions. There are lots!

Causality and Machine Learning

  1. Do RL agents perform interventions? Is an action chosen from a policy provably equivalent to an intervention? Do interventions allow agents to learn faster?
  2. In what situations is a causal model useful, and in which is it not? (For example, can we build a causal model of a model parameter space, or is there no useful causal information?)
  3. Can we make VAEs learn causal or independent latent representations? Do they help to generalize?
  4. Can we make deep learning perform causal feature engineering, not just associational FE? (I.e. deriving features that are either causally related or independently controllable mechanisms.)
  5. Can we use causal interventions to guard against adversarial attacks? (Credit for this one goes to my father.)
  6. Is interventional surprise (the divergence between the predicted and actual interventional distributions, \(D(p(\hat{Y} | do(X_{i}=x_{i})) || p(Y | do(X_{i}=x_{i})))\)) a useful metric for uncertainty / model fragility?
  7. Is the Object-Relation model of the world as general a framework for evolving intelligence as, say, deep learning?
  8. Can we regularize models according to how causal they are – and does that improve generalization performance?
  9. Can we use the Object-Relation causal model to significantly improve object detection and entity tracking?
  10. What distance metric is useful for comparing interventional distributions? For example, can we cluster units based on how similar their learned interventional distributions, or full causal models, are?
  11. How do we ensure a measured variable has no causal effect on an outcome? (E.g. gender on insurance premiums.) More precisely: what is the state of using machine learning to ensure no back-door paths exist between a dangerous variable and an outcome? (I assume there is work on developing appropriate sets that satisfy the back-door criterion. Even better: can we learn a latent variable that satisfies the back-door criterion?)
  12. How can we detect changes in causal regimes? Causal modelling is often interested in non-stationary, non-iid cases. (Suggested long-term solution: developing basic causal-learning primitives that efficiently learn new causal models in new situations.)
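
To make question 6 concrete, here is a minimal sketch of interventional surprise on a toy model. Everything in it is an assumption made for illustration: the three binary variables (a confounder Z, a treatment X, an outcome Y), the probability tables, the choice of KL divergence for \(D\), and the choice of a purely associational model \(p(Y | X=x)\) as the "predicted" interventional distribution. It only shows how the quantity could be computed, not whether it is a good metric.

```python
import math

# Hypothetical toy model with binary Z, X, Y and edges Z -> X, Z -> Y, X -> Y.
P_Z = [0.5, 0.5]                          # P_Z[z] = p(Z=z)
P_X_GIVEN_Z = [[0.9, 0.1], [0.2, 0.8]]    # P_X_GIVEN_Z[z][x] = p(X=x | Z=z)
P_Y1 = [[0.1, 0.5], [0.4, 0.9]]           # P_Y1[x][z] = p(Y=1 | X=x, Z=z)


def bernoulli(p1):
    """Return the distribution [p(Y=0), p(Y=1)]."""
    return [1.0 - p1, p1]


def actual_interventional(x):
    """p(Y | do(X=x)): average p(Y | x, z) over the marginal p(Z)."""
    return bernoulli(sum(P_Z[z] * P_Y1[x][z] for z in (0, 1)))


def predicted_interventional(x):
    """An associational guess, p(Y | X=x), which averages over p(Z | X=x)."""
    joint = [P_Z[z] * P_X_GIVEN_Z[z][x] for z in (0, 1)]   # p(Z=z, X=x)
    p_z_given_x = [j / sum(joint) for j in joint]
    return bernoulli(sum(p_z_given_x[z] * P_Y1[x][z] for z in (0, 1)))


def kl_divergence(p, q):
    """D(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def interventional_surprise(x):
    """D(predicted || actual) for the intervention do(X=x)."""
    return kl_divergence(predicted_interventional(x), actual_interventional(x))


for x in (0, 1):
    print(x, interventional_surprise(x))
```

Because Z confounds X and Y in this toy model, the associational prediction differs from the true interventional distribution, so the surprise is strictly positive for both interventions. With no confounding the two distributions would coincide and the surprise would be zero.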

Machine Learning

  1. What is a better Machine Learning technique than propensity score matching? (And in which cases?)
  2. How can we use ML to recreate a random, balanced sample ex-post?
  3. Can we use ML to recreate and improve on generally-accepted empirical results in economics?
  4. How do we test for identification in probabilistic graphical models?
  5. How do we automatically find and interpret patterns in streams of large data (in particular economic and political data)?
  6. Can we identify (unobserved) treatment cases using latent variable models?
  7. Can we create human-interpretable Gaussian Process kernels?
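
As a reference point for questions 1 and 2, the sketch below shows one standard propensity-based baseline: inverse propensity weighting, which reweights an unbalanced observational sample so that it behaves like a randomized one. It is not propensity score matching itself, and it is not proposed as the "better technique" the question asks for. The data, the single binary covariate, and the outcome rule `y = 2*t + 3*c` are all hypothetical, chosen so that the true treatment effect is known to be 2.

```python
from collections import defaultdict


def make_units():
    """Hypothetical data: a list of (covariate c, treatment t, outcome y)."""
    counts = {(0, 0): 80, (0, 1): 20, (1, 0): 20, (1, 1): 80}  # (c, t) -> n
    # Outcome is y = 2*t + 3*c, so the true treatment effect is 2 by construction.
    return [(c, t, 2 * t + 3 * c) for (c, t), n in counts.items() for _ in range(n)]


def naive_difference(units):
    """Difference in mean outcomes between treated and control, ignoring c."""
    treated = [y for _, t, y in units if t == 1]
    control = [y for _, t, y in units if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def estimate_propensity(units):
    """Estimate e(c) = p(T=1 | C=c) as the treated share in each covariate cell."""
    n, n_treated = defaultdict(int), defaultdict(int)
    for c, t, _ in units:
        n[c] += 1
        n_treated[c] += t
    return {c: n_treated[c] / n[c] for c in n}


def ipw_effect(units):
    """Average treatment effect with inverse propensity weights."""
    e = estimate_propensity(units)
    total = 0.0
    for c, t, y in units:
        total += y / e[c] if t == 1 else -y / (1 - e[c])
    return total / len(units)


units = make_units()
print(naive_difference(units))  # biased: treated units mostly have c = 1
print(ipw_effect(units))        # close to the true effect of 2
```

On this toy sample the naive difference is about 3.8, while the reweighted estimate recovers the true effect of 2. In practice the propensity model is where Machine Learning would enter: the per-cell frequency used here only works because the covariate is a single binary variable.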


  1. Assume we create Artificial General Intelligence. What would the economy look like? (Would humans work? Would private businesses be nationalized? Who do we tax? What is scarce? What would humans do?)
  2. Does Machine Learning’s emphasis on out-of-sample performance contribute to economics’ emphasis on structure discovery and explanation?
  3. How much user-level data would we need in order to feel confident about using other individuals as the counterfactual when estimating an individual's treatment effect?
  4. Is the rate of job-switching, up-skilling, or redundancy higher in jobs with high automatability?
  5. How do we predict which industries will be important employers in 50 years?
  6. What does the “transition” to an “AI-based economy” mean? (Is it a rough patch of unemployment? Is it a fundamental shift in education? Is it enduring poverty?)
  7. How do we disentangle automation from globalization?
  8. Given automation, how much manufacturing can Trump ‘bring back’?
  9. What will be the effect of automation on inequality?
  10. Are we in a crisis of living standards (not enough people reach the baseline living standard needed for happiness), or a crisis of psychology (people are inherently discontent with inequality, no matter how good their lives are)?
  11. Where can Machine Learning help economics the most?
  12. What is the effect of automation on skills and employment in developing economies, at different stages of development?
  13. How do we recreate the human/economist model-creation procedure algorithmically?
  14. Can we create a system that automatically spits out structural equation models?
  15. What is an efficient skills-required-for-job measure (besides “college degree”) and how many more people would be employable if we used it?
  16. How would a sovereign wealth fund that socializes the return to automation work?
  17. How do we measure economic growth from artificial intelligence?