Modified quasi-Newton methods for training neural networks
Abstract

The backpropagation algorithm is the most popular procedure for training self-learning feed-forward neural networks. However, the convergence of this algorithm is slow, since it is essentially a steepest-descent method. Several researchers have proposed other approaches to improve convergence: conjugate gradient methods, dynamic modification of learning parameters, quasi-Newton or Newton methods, stochastic methods, etc. Quasi-Newton methods have been criticized because they require significant computation time and memory space to update the Hessian matrix, which limits their use to medium-sized problems. This paper proposes three variations of the classical quasi-Newton approach that take the structure of the network into account. By neglecting some second-order interactions, the sizes of the resulting approximate Hessian matrices are not proportional to the square of the total number of weights in the network but instead depend on the number of neurons in each layer. The modified quasi-Newton methods are tested on two examples and compared to classical approaches such as regular quasi-Newton methods, backpropagation, and conjugate gradient methods. The numerical results show that one of these approaches, named BFGS-N, yields a clear gain in computational time on large-scale problems over the traditional methods, without requiring large memory space.
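To make the memory argument concrete, the following is a minimal sketch of the standard BFGS inverse-Hessian update, together with a block-wise (per-layer) variant in the spirit of the structured methods the abstract describes. The layer sizes, function names, and the exact block partition are illustrative assumptions; the paper's actual BFGS-N construction, whose block sizes depend on the number of neurons per layer, is not reproduced here.

```python
import numpy as np

def bfgs_update(H, s, y):
    """Standard BFGS update of the inverse Hessian approximation H,
    given the step s = w_new - w_old and gradient change y = g_new - g_old.
    Satisfies the secant condition H_new @ y == s."""
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

# Block-wise variant (illustrative assumption, not the paper's exact scheme):
# keep one small approximation per layer instead of one W x W matrix over
# all W weights, ignoring cross-layer second-order interactions. Storage
# drops from O(W^2) to the sum of the squared block sizes.
layer_sizes = [100, 50, 10]                 # hypothetical weight counts per layer
H_blocks = [np.eye(n) for n in layer_sizes]

def blockwise_direction(H_blocks, grads):
    """Search direction d_l = -H_l @ g_l, computed independently per layer."""
    return [-H @ g for H, g in zip(H_blocks, grads)]
```

With `layer_sizes = [100, 50, 10]`, the block approach stores 100² + 50² + 10² = 12,600 entries instead of 160² = 25,600 for the full matrix; the gap widens rapidly as the network grows.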