THE LOGIC, OR LACK THEREOF, OF DEEP LEARNING
In this paper, we examine the logic of a supervised neural network in deep learning. In so doing, we find that, despite its many variants and its numerous claims of success, a supervised neural network consists of a usually highly over-parameterized deterministic dynamic model coupled with a statistical model. Like what a physicist would call a “mean-field” model, the deterministic dynamic model describes how the expected values of certain joint random variables evolve over an often arbitrary number of layers, each with an often arbitrary number of neurons. The statistical model relates these expected values to a set of data through certain sufficient statistics. Successful training of a supervised neural network yields a single solution, out of infinitely many, to a system of p estimating equations. Because of this heavy over-parameterization, as far as predictive power is concerned, what really matters in a neural network model is not its deterministic dynamic model but its associated statistical model. Indeed, for prediction it is unnecessary to convert the sufficient statistics into a subset of the weights of a neural network by solving a system of p estimating equations. One can therefore simply relinquish the deterministic dynamic model and use the statistical model directly for prediction. Consequently, as a modelling approach, the neural network model in deep learning can be, and should be, relinquished, with its predictions provided instead by such models as generalized linear models and generalized additive models.
Keywords: deep learning, supervised neural network, difference equation, fixed point problem, estimating equations, sufficient statistics.
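As a schematic illustration of this decomposition, consider a plain feed-forward network with L layers; the notation below is generic and serves only to fix ideas, not to reproduce the conventions of the paper. The deterministic dynamic model is the layer-wise difference equation

\[
h^{(l)} = \sigma\!\left( W^{(l)} h^{(l-1)} + b^{(l)} \right), \qquad l = 1, \dots, L, \qquad h^{(0)} = x,
\]

which propagates the expected values of the joint random variables from layer to layer, while the associated statistical model links the final layer to the data through a link function g, as in a generalized linear model,

\[
\mathrm{E}(y \mid x) = g^{-1}\!\left( \eta^{\top} h^{(L)}(x) \right).
\]

Writing T_1, \dots, T_p for the sufficient statistics, training then amounts to selecting one solution of the p estimating equations

\[
\sum_{i=1}^{n} T_j(x_i) \left\{ y_i - \mathrm{E}(y_i \mid x_i; \theta) \right\} = 0, \qquad j = 1, \dots, p,
\]

in a weight vector \theta whose dimension far exceeds p; this is why infinitely many solutions exist, and why, for prediction, \mathrm{E}(y \mid x) can be modelled directly, for example by a generalized linear or generalized additive model, without recovering \theta at all.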