Learning an Adaptive Learning Rate Schedule (arXiv:1909.09712v1)
PyTorch implementation of the "Learning an Adaptive Learning Rate Schedule" paper found here: https://arxiv.org/abs/1909.09712.
Work in progress! A controller is optimized by PPO to generate adaptive learning rate schedules. Both the actor and the critic are MLPs with 2 hidden layers of size 32. Three distinct child network architectures are used: 1) an MLP with 3 hidden layers, 2) LeNet-5 and 3) ResNet-18. Learning rate schedules are evaluated on three different datasets: 1) MNIST, 2) Fashion-MNIST and 3) CIFAR10. The original paper experiments only with combinations of Fashion-MNIST, CIFAR10, LeNet-5 and ResNet-18.
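As a rough sketch of what such a controller might look like in PyTorch: the README only states that both the actor and the critic are MLPs with 2 hidden layers of size 32, so the activation function, observation dimension, and class layout below are assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class Controller(nn.Module):
    """Actor-critic controller: both heads are MLPs with 2 hidden layers of 32 units."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(obs_dim, 32), nn.Tanh(),
            nn.Linear(32, 32), nn.Tanh(),
            nn.Linear(32, n_actions),   # logits over the learning rate actions
        )
        self.critic = nn.Sequential(
            nn.Linear(obs_dim, 32), nn.Tanh(),
            nn.Linear(32, 32), nn.Tanh(),
            nn.Linear(32, 1),           # state-value estimate used by PPO
        )

    def forward(self, obs: torch.Tensor):
        return self.actor(obs), self.critic(obs)
```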
In each of the three settings, child networks are optimized using Adam with an initial learning rate in (1e-2, 1e-3, 1e-4) and are trained for 1000 steps on the full training set (40-50k samples), corresponding to roughly 20-25 epochs. Learning rate schedules are evaluated based on validation loss over the course of training. Test loss and test accuracy are in the pipeline. Experiments are run in both a discrete and a continuous setting. In the discrete setting, the controller controls the learning rate by proposing one of the following actions every 10 steps: 1) increase the learning rate, 2) decrease the learning rate, 3) do nothing.
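A minimal sketch of how these discrete actions could be applied to an optimizer's learning rate. The README only specifies the three actions and the 10-step interval; the ±5% multiplicative step size and the function name here are assumptions for illustration.

```python
LR_STEP = 0.05  # relative change per action (assumed, not stated for the discrete case)

def apply_discrete_action(optimizer, action: int) -> None:
    """action: 0 = increase LR, 1 = decrease LR, 2 = do nothing."""
    for group in optimizer.param_groups:
        if action == 0:
            group["lr"] *= (1.0 + LR_STEP)
        elif action == 1:
            group["lr"] *= (1.0 - LR_STEP)
        # action == 2: leave the learning rate unchanged
```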
In the continuous setting, the controller instead proposes a real-valued scaling factor, which allows it to modify learning rates with finer granularity. The maximum change per LR update has been set to 5% for simplicity (the action space is not stated in the paper). In both the discrete and the continuous setting, Gaussian noise is optionally applied to learning rate updates. Observations for the controller contain information about the current training loss, validation loss, variance of predictions, variance of prediction changes, mean and variance of the weights of the output layer, as well as the previous... To make credit assignment easier, the validation loss at each step is used as the reward signal rather than the final validation loss. Both observations and rewards are normalized by a running mean.
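The exact form of the continuous action, the noise injection, and the running-mean normalization are not fully specified beyond the description above, so the following is only a sketch under those assumptions (clipping the proposed relative change to ±5%, additive Gaussian noise, and a simple exponential running mean).

```python
import numpy as np

MAX_CHANGE = 0.05  # maximum relative LR change per update (from the README)

def apply_continuous_action(optimizer, scale: float, noise_std: float = 0.0) -> None:
    """Scale the LR by a real-valued factor, clipped to +/-5%, with optional Gaussian noise."""
    scale = float(np.clip(scale, -MAX_CHANGE, MAX_CHANGE))
    if noise_std > 0.0:
        scale += float(np.random.normal(0.0, noise_std))
    for group in optimizer.param_groups:
        group["lr"] *= (1.0 + scale)

class RunningMean:
    """Normalize observations/rewards by a running mean (assumed: exponential average)."""
    def __init__(self, momentum: float = 0.99, eps: float = 1e-8):
        self.mean, self.momentum, self.eps = None, momentum, eps

    def __call__(self, x):
        x = np.asarray(x, dtype=np.float64)
        if self.mean is None:
            self.mean = x
        else:
            self.mean = self.momentum * self.mean + (1.0 - self.momentum) * x
        return x / (np.abs(self.mean) + self.eps)
```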
Abstract: The learning rate is one of the most important hyper-parameters for model training and generalization. However, current hand-designed parametric learning rate schedules offer limited flexibility and the predefined schedule may not match the training dynamics of high dimensional and non-convex optimization problems. In this paper, we propose a reinforcement learning based framework that can automatically learn an adaptive learning rate schedule by leveraging the information from past training histories. The learning rate dynamically changes based on the current training dynamics. To validate this framework, we conduct experiments with different neural network architectures on the Fashion-MNIST and CIFAR10 datasets. Experimental results show that the auto-learned learning rate controller can achieve better test results.
In addition, the trained controller network is generalizable -- able to be trained on one data set and transferred to new problems.
When it comes to training deep neural networks, one of the crucial factors that significantly influences model performance is the learning rate. The learning rate determines the size of the steps taken during the optimization process and plays a pivotal role in determining how quickly or slowly a model converges to the optimal solution. In recent years, adaptive learning rate scheduling techniques have gained prominence for their effectiveness in optimizing the training process and improving model performance. Before delving into adaptive learning rate scheduling, let's first understand why the learning rate is so important in training deep neural networks. In essence, the learning rate controls the amount by which we update the parameters of the model during each iteration of the optimization algorithm, such as stochastic gradient descent (SGD) or its variants.
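For concreteness, a minimal sketch of the plain SGD update that the learning rate scales (real optimizers add momentum, weight decay, and so on):

```python
# Vanilla SGD: the learning rate lr directly scales how far each parameter
# moves along the negative gradient direction in one step.
def sgd_update(params, grads, lr):
    return [p - lr * g for p, g in zip(params, grads)]
```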
The learning rate is often regarded as the single most important hyper-parameter to tune and strongly influences model training with gradient descent algorithms [1, 2]. Researchers have developed several learning rate schedules such as linear decay, cosine decay, exponential decay, inverse square root decay, etc., sometimes with warm-up steps, for different optimization problems [3, 4]. However, there is limited intuition about which learning rate schedule best suits a given problem. In practice, researchers adopt a trial-and-error approach for different learning rate schedules along with different hyper-parameters, which is very time consuming [5].
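These hand-designed schedules are simple functions of the training step. A sketch of common forms follows; exact parameterizations and warm-up handling vary across papers, so the formulas below are representative rather than canonical.

```python
import math

# Parametric schedules as functions of step t, total steps T, and base rate lr0.
def linear_decay(t, T, lr0):
    return lr0 * (1.0 - t / T)

def cosine_decay(t, T, lr0):
    return lr0 * 0.5 * (1.0 + math.cos(math.pi * t / T))

def exponential_decay(t, lr0, gamma=0.999):
    return lr0 * gamma ** t

def inverse_sqrt_decay(t, lr0, warmup=1000):
    # Linear warm-up to lr0, then decay proportional to 1/sqrt(t).
    return lr0 * min((t + 1) / warmup, math.sqrt(warmup / (t + 1)))
```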
In this paper, we would like to automatically learn a controller that adapts the learning rate schedule by incorporating information from past training dynamics. In addition, current learning rate schedules assume predefined parametric learning rate changes, which are fixed regardless of the actual training dynamics. The optimization landscape can be very complex [6], and these parametric schedules have limited flexibility and may not be optimized for the training dynamics of different high dimensional and non-convex optimization problems. In comparison, our framework offers an auto-learned or meta-learned adaptive learning rate schedule that adapts dynamically based on current training dynamics. There are several related works proposing better update schedules for gradient descent algorithms. [7] propose to directly learn the gradient descent updates using a long short-term memory (LSTM) network.
Our work only learns the learning rate and is therefore more efficient. Hypergradient descent computes the derivative of the objective with respect to the learning rate and updates the learning rate based on this gradient [8]. In addition to the current state, our approach also considers the entire training history and so has a more comprehensive view. [9] propose to use reinforcement learning (RL) to adapt the learning rate. In comparison, we use the validation loss as the reward signal and a learning rate scaling function as the action, which improves generalization capability and stability.
There is a family of widely used optimizers that dynamically adapt the learning rate on a per-parameter basis. For example, Adagrad adapts the learning rate per weight based on the sums of the squares of the gradients [10], while Adam uses an exponentially decayed average of past gradients [11]. However, these optimizers still require a global learning rate, which is important to tune. Our work is complementary to these works. This paper makes three main contributions: First, we propose a reinforcement learning based framework to automatically learn an adaptive learning rate schedule based on past training histories. This schedule can adjust the learning rate dynamically to adapt to current training dynamics.
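A minimal NumPy sketch of the Adagrad update illustrates the point: the per-parameter denominator adapts automatically, but a global learning rate still scales every step. Variable names are mine, not from the paper.

```python
import numpy as np

def adagrad_step(theta, grad, accum, global_lr=0.01, eps=1e-8):
    """One Adagrad update: per-parameter scaling from accumulated squared
    gradients, while the global learning rate still multiplies the step."""
    accum = accum + grad ** 2
    theta = theta - global_lr * grad / (np.sqrt(accum) + eps)
    return theta, accum
```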
Second, we present an effective set of state observation features, reward functions, and actions for the learning rate decision problem. Specifically, unlike previous work, we use the validation loss as the reward signal and a learning rate scaling function as the action. Third, we conduct experiments on Fashion-MNIST and CIFAR10 datasets with convolutional neural networks (CNN) [12] and residual networks (ResNet) [13, 14] to show the effectiveness and generalization capability of our framework. The auto-learned learning rate schedule can achieve better results and generalize to different datasets. In our framework, we use RL to train a learning rate controller, which proposes learning rates using features from the training dynamics of the trainee network. The trainee network is trained for a certain number of steps using a proposed learning rate, then reports the observations of its training dynamics to the controller, which returns a new learning rate.
The whole process repeats until a stopping criterion is reached.
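A rough sketch of this interaction loop is given below. The trainee-side operations are abstracted behind hypothetical callables (`train_steps`, `observe`, `set_lr`) and hypothetical controller methods (`act`, `record`, `ppo_update`); none of these names come from the paper or the repository, and the per-step reward is assumed to be the negative validation loss as described earlier.

```python
def run_episode(controller, train_steps, observe, set_lr,
                steps_per_action=10, max_steps=1000):
    """One episode of controller/trainee interaction (sketch, not the paper's code)."""
    obs = observe()                                   # training-dynamics features
    for _ in range(max_steps // steps_per_action):
        action = controller.act(obs)                  # proposed learning rate update
        set_lr(action)                                # apply it to the trainee's optimizer
        train_steps(steps_per_action)                 # train the child network for 10 steps
        obs = observe()                               # fresh observations after training
        reward = -obs["val_loss"]                     # per-step reward for easier credit assignment
        controller.record(obs, action, reward)        # store the transition
    controller.ppo_update()                           # PPO update on the collected trajectory
```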