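Gradient descent algorithms take the partial derivative of the loss function with respect to each parameter in the network (its weights and biases) to determine how much that parameter contributed to the loss value, then adjust each parameter in the direction that reduces the loss.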
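To make the role of the partial derivatives concrete, here is a minimal sketch for the smallest possible "network": a single linear neuron y_hat = w * x + b trained with a mean squared error loss. The model, the toy data, the learning rate, and the function names `loss_and_gradients` and `train` are all assumptions chosen for illustration and do not come from the text above.

```python
# Minimal sketch: gradient descent on one linear neuron, y_hat = w * x + b.
# The data, learning rate, and step count are illustrative assumptions.

def loss_and_gradients(w, b, xs, ys):
    """Return the mean squared error and its partial derivatives w.r.t. w and b."""
    n = len(xs)
    loss = 0.0
    dloss_dw = 0.0
    dloss_db = 0.0
    for x, y in zip(xs, ys):
        error = (w * x + b) - y
        loss += error ** 2 / n
        dloss_dw += 2 * error * x / n  # contribution of the weight to the loss
        dloss_db += 2 * error / n      # contribution of the bias to the loss
    return loss, dloss_dw, dloss_db


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly step each parameter against its partial derivative."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        _, dw, db = loss_and_gradients(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b = train(xs, ys)
final_loss, _, _ = loss_and_gradients(w, b, xs, ys)
print(f"w={w:.4f}, b={b:.4f}, loss={final_loss:.2e}")
```

In a real multi-layer network the same partial derivatives are usually obtained by backpropagation rather than written out by hand, but the update rule applied to each weight and bias has the same form.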
Zhimin Zhang, Derivative Superconvergent Points in Finite Element Solutions of Harmonic Functions: A Theoretical Justification, Mathematics of Computation, Vol. 71, No. 240 (Oct., 2002), pp. 1421-1430 ...