I’m at NeurIPS this week. I’ll update this post continually: thought-provoking papers, lessons from the main lectures, and observations on research and the state of ML. This year, I’m particularly interested in three themes: (1) Causal Models, (2) Intelligence for Scientific Processes, and (3) Machine Learning and Economics.
Update: the conference is over. I had a fantastic time. I’m now more convinced of the opportunity to combine causal reasoning with relational representation learning for efficient, generalizable RL and scientific discovery. Intelligent research processes still seem confined to the natural rather than the social sciences.
- Generalisation of structural knowledge in the hippocampal-entorhinal system. Combines “fast Hebbian” and “slow statistical” learning to recover generalizable spatial cells in agents.
- Hierarchical Graph Representation Learning via Differentiable Pooling. Forms representations of graphs at multiple resolutions by learning to pool nodes into clusters, capturing hierarchical structure such as groups of nodes.
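The core pooling step is easy to sketch. A minimal NumPy version, for illustration only: in the actual paper the assignment matrix S is produced by a GNN, whereas here the logits are just hand-set placeholders.

```python
import numpy as np

def diffpool_coarsen(A, X, S_logits):
    """One differentiable pooling step: softly assign n nodes to k clusters.

    A: (n, n) adjacency; X: (n, d) node features;
    S_logits: (n, k) cluster-assignment logits (learned by a GNN in the paper).
    """
    # Row-wise softmax gives a soft assignment matrix S.
    S = np.exp(S_logits - S_logits.max(axis=1, keepdims=True))
    S /= S.sum(axis=1, keepdims=True)
    A_coarse = S.T @ A @ S   # (k, k) coarsened adjacency between clusters
    X_coarse = S.T @ X       # (k, d) coarsened cluster features
    return A_coarse, X_coarse

# Toy example: a 4-node path graph pooled into 2 clusters.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)
S_logits = np.array([[5., 0.], [5., 0.], [0., 5.], [0., 5.]])
A2, X2 = diffpool_coarsen(A, X, S_logits)
print(A2.shape, X2.shape)  # (2, 2) (2, 4)
```

Because every operation is a softmax or a matrix product, gradients flow through the pooling itself, so the cluster assignments can be trained end-to-end.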
- Neural Architecture Search with Bayesian Optimisation and Optimal Transport. Develops OTMANN, an optimal-transport-based distance metric between neural architectures, which makes Bayesian optimisation over architecture space tractable.
- Joelle Pineau’s invited talk on reproducible, reusable, and robust RL was great. Main suggestions: adopt a reproducibility checklist; run experiments across many RNG seeds and report the variation, since RL results can differ dramatically between seeds; stress-test robustness, e.g. with live video as background noise; and be honest: optimise the hyperparameters of your baseline comparison methods as carefully as your own.
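The multi-seed point is easy to operationalize. A minimal pattern (my sketch, with a noisy stand-in for a real training run): report mean and standard deviation over seeds instead of a single run.

```python
import numpy as np

def train_and_eval(seed):
    """Stand-in for a full training run; swap in your actual experiment."""
    rng = np.random.default_rng(seed)
    # Noisy "episode return" mimicking the seed-to-seed variance of deep RL.
    return 100.0 + 15.0 * rng.normal()

seeds = range(10)
returns = np.array([train_and_eval(s) for s in seeds])
print(f"return: {returns.mean():.1f} +/- {returns.std(ddof=1):.1f} "
      f"over {len(seeds)} seeds")
```

A single lucky seed can make a mediocre method look state-of-the-art; the spread across seeds is the honest headline number.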
- End-to-End Differentiable Physics for Learning and Control. Develops an analytically differentiable physics simulator, allowing, for example, an agent to learn the parameters of a physical system as well as more complex relationships. [Code Here]
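As a toy illustration of the idea (not the paper’s simulator): if the simulator is differentiable in a physical parameter, plain gradient descent can recover it from an observed trajectory. Here I assume we want to recover gravity g from free fall; since this Euler integrator is linear in g, its gradient has a simple closed form.

```python
import numpy as np

def simulate(g, steps=50, dt=0.1):
    """Euler-integrate free fall from rest; the output is differentiable in g."""
    v, y = 0.0, 0.0
    ys = []
    for _ in range(steps):
        v += g * dt
        y += v * dt
        ys.append(y)
    return np.array(ys)

# "Observed" trajectory generated with the true parameter.
true_g = 9.81
observed = simulate(true_g)

# Gradient descent on g. Since the simulator is linear in g,
# d(trajectory)/dg is just the trajectory simulated with g = 1.
g_hat, lr = 5.0, 1e-4
sensitivity = simulate(1.0)
for _ in range(200):
    residual = simulate(g_hat) - observed
    grad = 2.0 * (residual * sensitivity).sum()  # gradient of squared loss
    g_hat -= lr * grad
print(round(g_hat, 3))  # converges toward 9.81
```

The paper’s contribution is making this kind of gradient flow work through contact-rich, constraint-based dynamics, where the derivatives are far less trivial than in this free-fall sketch.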
- Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects. A generative model for images that uses latent variables to represent the location, number, and properties of objects. Allows for quicker object tracking as well as higher-fidelity forward simulation of object paths.
- Object-Level Reinforcement Learning. No paper yet; Pedro Domingos showed how using a simple object-relational inductive bias in reinforcement learning leads to superhuman performance in some game tasks in as few as a thousand frames.
- Susan Athey’s Causal Inference tutorial on Monday was generally good, if similar to other overviews of causal inference techniques. A stronger tutorial would have decomposed real-life machine learning tasks into their causal components, but it was a great introduction for non-causal ML researchers.
- Olivier Bousquet and Leon Bottou won the Test of Time award for this paper. At the end of the presentation, Bousquet mentioned three frontiers of ML research; the latter two were “compositionality” and “causality”. A good indication that major minds are turning towards causality, and indeed Bottou and Bousquet have done good work in this area.
- I chatted with the people at the iSee booth. iSee is a self-driving-car company started by researchers from Josh Tenenbaum’s lab. They told me a key component of their “common sense engine” is a combination of compositionality, intuitive physics, and theory of mind.
- So far, causality has been mentioned a lot in talks as a tool for stability and explanation. I’d like to see more about generalization and efficiency. The Causal Learning workshop is happening Friday, which I’m looking forward to.
- A lot of companies exhibiting here have said something along the lines of “causality is on our medium-term roadmap”. I don’t know how serious they are.
- Again, more talk of causality as explanation. Causality is not a silver bullet for interpretability: a causal effect can be mediated by 20 different channels, which hardly makes for a digestible explanation.
- A summary of the Causal Learning workshop: (a) reinforcement learning is where the community should aim to apply causal learning techniques; (b) counterfactual reasoning, interventional exploration, and object-relational causal inductive biases are believed to be at the heart of perfecting RL (to quote Elias Bareinboim’s slides: “is AI essentially solved?”); (c) the key challenge is to learn causal world models through interventions, eliminating the need to specify SCMs beforehand. (All of this was echoed throughout the day in the workshop on Relational Representation Learning.)
- Blei presented the Blessings of Multiple Causes. Initially convincing, though I still need to read Alex D’Amour’s response (here). The essential idea: if you have a “multi-cause confounder” – a confounder that causes multiple sibling variables as well as the outcome – you can learn a latent representation of the factor driving the siblings’ shared variation; controlling for that representation faithfully captures the latent multi-cause confounders. If there are no single-cause confounders, they argue, this allows you to identify the treatment effect of the siblings on the outcome.
Intelligence for Scientific Processes
- Nothing much yet. Check back later.
- A lot of papers study graph-based deep learning methods. To the extent that a lot of scientific knowledge is graph-structured – molecules, for example, or causal models – graph-based methods represent meta-infrastructure for intelligent scientific processes. Deep Graph Library is being released today.
- A Probabilistic U-Net for Segmentation of Ambiguous Images. Interesting paper from the DeepMind science and health-focused teams. Addresses the issue that humans (like doctors) may label image areas (e.g. tumors) differently. Develops a deep probabilistic model to represent this variation.
- Attended the Molecules and Materials workshop. Most of it went over my head. General themes included: (a) encoding useful physicochemical properties / invariances in deep learning; (b) using RL in drug discovery, because we have relatively good models for evaluating the properties of proposed candidates; (c) using graph neural networks.
- Question: is scientific automation better achieved by developing general-purpose inductive biases for scientific questions (e.g. relational ML), or by targeting each vertical (drug design, physics, diagnosis) separately?
ML and Economics
- I had a fascinating chat with Uber AI Labs. One of their goals is to accurately model and calibrate systems of agents – in theory an RL challenge, in practice a cars/drivers/marketplace challenge – so they can experiment with incentives (rewards, in the RL sense). Doing this at larger scale, across many real-world problems, would amount to a simulation lab for mechanism design with suitably intelligent agents.
- JP Morgan is building ROAR, a market for buying and selling predictions. Think Numerai for any problem. Buyers specify the prediction task, the performance metric, and the reward; these can be updated dynamically. Sellers’ predictions are rewarded in proportion to how useful they are to the buyer. The equilibrium of sufficiently well-specified tasks should push the cost of prediction towards zero. Second- and third-order effects: (a) an incentive towards modular model-building: separate sellers might 1. provide better data, 2. select features, 3. perform feature engineering, 4. model, and 5. ensemble models; (b) autonomous meta-learning systems that can be deployed generally and develop solutions to different tasks without human specification.
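One way to make “rewarded in proportion to how useful they are” concrete is a toy rule of my own (not ROAR’s actual mechanism): score each seller’s submission against the buyer’s loss and split the bounty in proportion to improvement over the buyer’s baseline.

```python
def split_reward(reward, baseline_loss, seller_losses):
    """Split a bounty among sellers in proportion to how much each
    prediction improves on the buyer's baseline loss.
    A toy allocation rule, not ROAR's actual mechanism.
    """
    gains = {s: max(baseline_loss - loss, 0.0)
             for s, loss in seller_losses.items()}
    total = sum(gains.values())
    if total == 0:  # nobody beat the baseline; the bounty rolls over
        return {s: 0.0 for s in seller_losses}
    return {s: reward * g / total for s, g in gains.items()}

payouts = split_reward(
    reward=100.0,
    baseline_loss=1.0,
    seller_losses={"alice": 0.4, "bob": 0.7, "carol": 1.2},
)
print(payouts)  # alice ~66.7, bob ~33.3, carol 0.0
```

A real mechanism would have to handle redundancy between submissions (two sellers uploading near-identical models shouldn’t each earn a full share), which is where the marginal-usefulness framing gets interesting.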
- The conference has been overshadowed by the Canadian government doing a miserable job approving visas for people from underrepresented geographies. This disproportionately impacts the Black in AI crowd. Timnit Gebru and others have been pulling Herculean efforts to rectify this.
- I’m seeing nametags with lime green preferred pronoun stickers. Awesome.
- A lot of companies here have an insane amount of money to do consulting with no scalable product. Is this a bubble, or will they all find products? They all say they do fundamental research; I’d guess very little of that research makes it into deployment, or even marginally informs products.
- Companies claiming they “do AI for social good” are transparently using it as (a) a recruitment tool, (b) a way to ingratiate themselves with government customers, and (c) a way to help shape the legal frameworks that would likely benefit them.