‘Theory of Vibe’ Open Questions: Generative Models

Peli Grietzer
Jan 19, 2021


Let’s say M is a parametric family of distributions, and I’m doing MLE in M on samples from some distribution D that isn’t in M. I want to know whether there are natural conditions on M and D that guarantee (exactly or approximately) the following (a toy numerical sketch follows the list below):

1) For every D not in M, there is some D* such that MLE in M on n samples from D converges to D* as n goes to infinity.

2) MLE in M on n samples from D* converges to D* as n goes to infinity.

3) The convergence to D* is faster when sampling from D* than when sampling from D. (That is, n samples from D* will get you further toward convergence than n samples from D will.)
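
As a concrete illustration of conditions (1) and (2), here is a minimal numerical sketch under assumptions of my own choosing: D is a two-component Gaussian mixture and M is the family of univariate Gaussians. In that special case Gaussian MLE just fits the sample mean and variance, so the conjectured limit D* is the Gaussian with D’s mean and variance (the KL projection of D onto M).

```python
# A minimal numerical sketch of conditions (1) and (2), assuming D is a
# two-component Gaussian mixture and M is the family of univariate Gaussians.
# In this special case the MLE limit D* is the moment-matched Gaussian:
# Gaussian MLE fits the sample mean and variance, so it converges to the
# Gaussian with the mean and variance of whatever distribution it samples from.
import numpy as np

rng = np.random.default_rng(0)

def sample_D(n):
    """Sample from D = 0.5*N(-2, 1) + 0.5*N(+2, 1), which is not in M."""
    components = rng.integers(0, 2, size=n)        # pick a mixture component
    means = np.where(components == 0, -2.0, 2.0)
    return rng.normal(means, 1.0)

# D* (the conjectured limit of MLE on D-samples): the moment-matched Gaussian.
mu_star, var_star = 0.0, 1.0 + 4.0                 # mean 0, variance 1 + 2^2 = 5

def sample_Dstar(n):
    return rng.normal(mu_star, np.sqrt(var_star), size=n)

def gaussian_mle(x):
    """MLE in M: sample mean and (biased) sample variance."""
    return x.mean(), x.var()

for n in [100, 1_000, 10_000, 100_000]:
    mu_D, var_D = gaussian_mle(sample_D(n))        # condition (1): fit on D-samples
    mu_S, var_S = gaussian_mle(sample_Dstar(n))    # condition (2): fit on D*-samples
    print(f"n={n:>7}  from D : mu={mu_D:+.3f} var={var_D:.3f}   "
          f"from D*: mu={mu_S:+.3f} var={var_S:.3f}   (target: mu=0, var=5)")
```

By the law of large numbers both fits should drift toward mu=0, var=5 as n grows, which is conditions (1) and (2) in miniature; the sketch says nothing yet about the relative speed in condition (3).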

Informally, this captures the intuition that in many machine learning setups, training a generative modelling algorithm on n samples from its best possible model of a distribution D would be like training it on n samples from D, only more effective, at least for a ‘medium-sized’ n.
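
To probe condition (3) and the ‘more effective’ intuition empirically, here is a companion sketch in the same toy setup (again my own illustrative assumptions, not a result): fix a medium-sized n, fit the Gaussian MLE many times on n samples from D and on n samples from D*, and compare how close the fits land to D* on average, measured by KL(D* || fit), which has a closed form between univariate Gaussians.

```python
# A sketch for probing condition (3) in the same toy setup: at a fixed
# "medium-sized" n, compare how close Gaussian MLE gets to D* when the n
# samples come from D versus from D* itself, averaged over many trials.
# Closeness is measured by KL(D* || fitted model). This is only a probe,
# not a confirmation: which sampling distribution wins depends on D.
import numpy as np

rng = np.random.default_rng(1)

mu_star, var_star = 0.0, 5.0                       # D* from the sketch above

def sample_D(n):
    components = rng.integers(0, 2, size=n)
    return rng.normal(np.where(components == 0, -2.0, 2.0), 1.0)

def sample_Dstar(n):
    return rng.normal(mu_star, np.sqrt(var_star), size=n)

def kl_from_Dstar(mu, var):
    """KL( N(mu_star, var_star) || N(mu, var) ), closed form for Gaussians."""
    return (np.log(var / var_star) + (var_star + (mu_star - mu) ** 2) / var - 1.0) / 2.0

def mean_kl(sampler, n, trials=2_000):
    """Average KL(D* || MLE fit) over many independent draws of n samples."""
    kls = []
    for _ in range(trials):
        x = sampler(n)
        kls.append(kl_from_Dstar(x.mean(), x.var()))
    return float(np.mean(kls))

n = 200                                            # a "medium-sized" n
print("avg KL(D* || MLE fit) from n samples of D :", mean_kl(sample_D, n))
print("avg KL(D* || MLE fit) from n samples of D*:", mean_kl(sample_Dstar, n))
```

In this one-dimensional family the comparison turns on higher moments of the sampling distribution (the sample variance converges at a rate governed by the fourth moment), so the sketch only probes the conjecture; pinning down which M and D make the D*-samples win is exactly the open question.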

(The choice to focus on MLE here is meant to make the question easier, since I know there are results about the consistency of MLE estimators. If it’s actually better to think in more general terms, or to explicitly introduce neural networks or gradient descent into the setup, that’s more than welcome.)

Peli Grietzer, Harvard Comp. Lit; visitor, Einstein Institute of Mathematics