A Comparative Study Of Optimization Techniques In Deep Learning Using

Leo Migdal

Indian Journal of Science and Technology, Year: 2025, Volume: 18, Issue: 10, Pages: 803-810.
¹Research Scholar, Department of Computer Science, Bharathidasan University, Tiruchirappalli, 620 024, Tamil Nadu, India. ²Associate Professor, Department of Computer Science, Bharathidasan University, Tiruchirappalli, 620 024, Tamil Nadu, India. *Corresponding Author Email: [email protected]
Received: 17 January 2025, Accepted: 19 March 2025, Published: 30 March 2025.

IJRASET Journal for Research in Applied Science and Engineering Technology. Authors: Sujay Bashetty, Kalyan Raja, Sahiti Adepu, Ajeet Jain. DOI Link: https://doi.org/10.22214/ijraset.2022.48050

Machine learning has contributed enormously to optimization, introducing new optimization algorithms. These approaches have wide application in deep learning, with a resurgence of novel methods ranging from Stochastic Gradient Descent to convex and non-convex ones.

Selecting an optimizer is a vital choice in deep learning, as it determines the training speed and the final performance of the DL model. The choice becomes harder as networks grow deeper, hyper-parameter tuning becomes more involved, and data sets become larger. In this work, we empirically analyze the most popular and widely used optimization algorithms. Their behavior is tested on the MNIST and autoencoder data sets. We compare them, pointing out their similarities, their differences, and their likely suitability for a given application. Recent variants of these optimizers are also highlighted.
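To make such a comparison concrete, the minimal tf.keras sketch below trains the same small network on MNIST with several common optimizers; the architecture, learning rates, and epoch count are illustrative assumptions, not the authors' exact setup.

```python
import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

# Optimizers under comparison (hyper-parameters are illustrative).
optimizers = {
    "sgd": tf.keras.optimizers.SGD(learning_rate=0.01),
    "sgd_momentum": tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    "rmsprop": tf.keras.optimizers.RMSprop(learning_rate=0.001),
    "adam": tf.keras.optimizers.Adam(learning_rate=0.001),
}

for name, opt in optimizers.items():
    model = build_model()
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=5, batch_size=128, verbose=0)
    _, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{name}: test accuracy = {test_acc:.4f}")
```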

The article focuses on their critical role and pinpoints which one would be the better option when making a trade-off. Deep learning (DL) algorithms are essential in statistical computation because of their efficiency as data sets grow in size. One of the pillars of DL is the mathematical machinery of the optimization process, which makes decisions on previously unseen data. This is achieved through carefully chosen parameters for a given learning problem (an intuitively near-optimal solution). Hyper-parameters are parameters of the learning algorithm itself, not of a given model. The aim, evidently, is to find the optimization algorithm that trains well and predicts accurately [1, 2, 3, 4].
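As a concrete illustration of that distinction, the hypothetical snippet below separates the model's trainable parameters from the optimizer's hyper-parameters; the specific values are arbitrary, not recommendations from the study.

```python
import tensorflow as tf

# Model parameters: weights and biases that the optimizer learns from data.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,)),
])
print("trainable parameters:", model.count_params())

# Hyper-parameters: settings of the learning algorithm, fixed before training.
optimizer = tf.keras.optimizers.SGD(
    learning_rate=0.01,  # step size of each update
    momentum=0.9,        # weight on the accumulated velocity
    nesterov=True,       # use the look-ahead variant
)
batch_size = 128         # samples per stochastic update
epochs = 10              # passes over the training set
```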

Text classification has received much attention in ML because it is a fundamental instance of learning from examples. Similarly, speech and image recognition have been handled with great success and accuracy, yet still leave room for improvement. In pursuit of these goals, optimization techniques based on convexity principles are increasingly cited [5, 6, 7], often together with logistic and other regression techniques. Stochastic Gradient Descent (SGD) has been very popular over the past many years, but it suffers from ill-conditioning and takes longer to compute on larger data sets. In some cases it also requires hyper-parameter tuning and different learning rates.
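For reference, a plain single-sample SGD update can be sketched as follows; the quadratic toy problem and `toy_grad` are hypothetical stand-ins used only to show the update rule and why the learning rate needs tuning.

```python
import numpy as np

def sgd_step(w, grad_fn, x_i, y_i, lr=0.05):
    """One stochastic update using a single sample (x_i, y_i)."""
    g = grad_fn(w, x_i, y_i)   # gradient of the loss at this one sample
    return w - lr * g          # move against the gradient

# Hypothetical toy problem: squared error 0.5 * (w.x - y)^2, whose
# gradient with respect to w is (w.x - y) * x.
def toy_grad(w, x, y):
    return (w @ x - y) * x

rng = np.random.default_rng(0)
w = rng.normal(size=3)
for _ in range(200):
    x_i = rng.normal(size=3)
    y_i = x_i.sum()            # targets generated by true weights [1, 1, 1]
    w = sgd_step(w, toy_grad, x_i, y_i, lr=0.05)
print(w)                       # approaches [1, 1, 1]; a much larger lr can diverge
```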

A comparative study of deep learning optimization algorithms using a simple neural network: it evaluates Gradient Descent variants and adaptive optimizers on classification performance using accuracy and F1 score.
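The two metrics themselves are standard; a minimal scikit-learn sketch with dummy labels (not results from the study) shows how they might be computed.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 2, 1, 0, 2, 1]   # ground-truth class labels (dummy data)
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]   # labels predicted by some trained model

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```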

Compare the performance of the optimization algorithms in solving the selected problem and, finally, rank all 7 algorithms based on their performance.

Best: Stochastic GD. Why? Single-sample updates enabled escape from poor local minima that trapped Batch GD.

Best: SGD + Nesterov (see the sketch below). Why? Predictive evaluation at the look-ahead parameter position yielded smoother convergence than standard momentum.
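The look-ahead idea behind Nesterov momentum can be sketched as follows; the gradient function and hyper-parameter values are illustrative assumptions, not the study's settings.

```python
import numpy as np

def momentum_step(w, v, grad_fn, lr=0.01, mu=0.9):
    """Classical momentum: gradient evaluated at the current position w."""
    v = mu * v - lr * grad_fn(w)
    return w + v, v

def nesterov_step(w, v, grad_fn, lr=0.01, mu=0.9):
    """Nesterov momentum: gradient evaluated at the predicted position w + mu * v."""
    v = mu * v - lr * grad_fn(w + mu * v)
    return w + v, v

# Toy usage on f(w) = w^2, whose gradient is 2w (illustrative only).
grad_fn = lambda w: 2.0 * w
w, v = np.array([5.0]), np.array([0.0])
for _ in range(200):
    w, v = nesterov_step(w, v, grad_fn)
print(w)  # very close to the minimum at 0
```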
