NEW PREPRINT (PDF). The lottery-ticket metaphor is a misleading explanation for the success of overparameterization; we propose a new one: escape dimensions.

Prelude

Why do neural networks need to be so large to train well? A popular explanation uses the metaphor of lottery tickets:

"If you want to win the lottery, just buy a lot of tickets and some will likely win. Buying a lot of tickets = having an overparameterized neural network for your task." (Slides from Princeton CS-598D course)

where a ticket is intended to be a subnetwork together with its initialization. Given that there exist subnetworks that can be trained successfully in isolation (Frankle & Carbin, 2019), this metaphor gained popularity to explain the need for overparameterized models: large networks are needed because they contain more potential winning subnetworks (tickets). Even Wikipedia states:

"the chance of any given ticket winning is tiny, but if you buy enough of them you are certain to win, and the number of possible subnetworks increases exponentially as the power set of the set of connections, making the number of possible subnetworks astronomical for any reasonably large network." (Wikipedia: "Lottery ticket hypothesis")

We show that part of the community interprets this metaphor too literally, leading to wrong intuitions about the mechanisms of optimization in deep neural networks; in particular, we see no evidence of these combinatorially fast scaling laws. So we wrote an opinion piece proposing an alternative intuition, grounded in loss landscape theory: Escape Dimensions.

In short, escape dimensions are new dimensions that are added to the optimization landscape when we make our networks wider. These dimensions serve as escape routes for gradient descent, letting it avoid getting trapped in bad, high-loss local minima.
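To make the idea concrete, here is a toy sketch in Python. It is not taken from the preprint: the loss functions, starting points, and all names are assumptions chosen for illustration only. A one-parameter loss has a bad local minimum near x = 0.96. Adding a second parameter y, without changing the loss along y = 0, turns that minimum into a saddle point, so gradient descent can slide around the barrier and reach the low-loss minimum.

```python
# Toy illustration only: a hypothetical loss, not a neural network
# and not the construction used in the preprint.

def loss_1d(x):
    # Tilted double well: bad local minimum near x = +0.96,
    # better minimum near x = -1.04.
    return (x**2 - 1.0) ** 2 + 0.3 * x

def grad_1d(x):
    return 4.0 * x * (x**2 - 1.0) + 0.3

def loss_2d(x, y):
    # Same loss along y = 0, but the extra coordinate y joins the two
    # wells through a ring-shaped valley around the origin.
    return (x**2 + y**2 - 1.0) ** 2 + 0.3 * x

def grad_2d(x, y):
    radial = 4.0 * (x**2 + y**2 - 1.0)
    return radial * x + 0.3, radial * y

def descend_1d(x, lr=0.05, steps=5000):
    # Plain gradient descent on the one-parameter loss.
    for _ in range(steps):
        x -= lr * grad_1d(x)
    return x

def descend_2d(x, y, lr=0.05, steps=5000):
    # Plain gradient descent on the two-parameter loss.
    for _ in range(steps):
        gx, gy = grad_2d(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y

# Same starting x in both cases; the wider model gets a tiny nonzero y.
x_narrow = descend_1d(1.5)
x_wide, y_wide = descend_2d(1.5, 0.01)

print("1D final loss:", loss_1d(x_narrow))        # stuck in the bad minimum
print("2D final loss:", loss_2d(x_wide, y_wide))  # escaped via the y direction
```

In this sketch the one-parameter run ends with a loss of roughly 0.29, while the two-parameter run ends near -0.31. The small nonzero initial y plays the role of a random initialization: starting at exactly y = 0 would keep the two-parameter run on the original one-dimensional slice, where it would stall at the saddle.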

We collect relevant theoretical and empirical results on loss landscapes under a new, intuitive lens: Escape Dimensions Theory.


Citation

BibTeX:

@article{martinelli2026lottery,
  title={The Puzzling Success of Overparameterization: Lottery Tickets or Escape Dimensions?},
  author={Martinelli, Flavio and Brea, Johanni and Gerstner, Wulfram},
  year={2026},
  month={May},
  note={Preprint, EPFL Infoscience},
  doi={10.5075/epfl.20.500.14299/263577},
  url={https://infoscience.epfl.ch/handle/20.500.14299/263577}
}


