Document Type

Dissertation

Degree

Doctor of Philosophy

Major

Computer Science

Date of Defense

4-19-2023

Graduate Advisor

Sanjiv Bhatia

Committee

Badri Adhikari

Sharlee Climer

Henry Kang

Abstract

Training deep learning models consumes ever-increasing time and resources
because of the complexity of the model, the number of updates needed to reach
good results, and both the amount and dimensionality of the data. This
dissertation makes training more efficient by focusing on the step size, with the
aim of reducing the number of computations for parameters in each update.
We achieve this objective in two new ways: we use loss scaling as a proxy for
the learning rate, and we use learnable layer-wise optimizers. Although our work
is perhaps not the first to point out the equivalence of loss scaling and learning
rate in deep learning optimization, it is the first to leverage this relationship
for more efficient training. We apply it not only to simple gradient descent but
also extend it to adaptive algorithms. Finally, we use metalearning to shed light
on relevant aspects, including learnable losses and optimizers. In this regard,
we developed a novel learnable optimizer and used it to learn an adaptive
rescaling factor and learning rate, resulting in a significant reduction in the
memory required during training.
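
The equivalence between loss scaling and learning rate mentioned in the abstract can be illustrated for plain gradient descent, where multiplying the loss by a factor s gives the same update as multiplying the learning rate by s. The following is a minimal sketch under stated assumptions, not the dissertation's implementation: the least-squares problem, the helper names (loss_grad, gradient_descent), and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np

def loss_grad(w, X, y):
    """Gradient of the loss 0.5 * mean((X @ w - y) ** 2) with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def gradient_descent(w0, X, y, lr, loss_scale=1.0, steps=50):
    """Plain gradient descent on loss_scale * loss (hypothetical helper)."""
    w = w0.copy()
    for _ in range(steps):
        # Scaling the loss by a constant scales its gradient by the same constant.
        g = loss_scale * loss_grad(w, X, y)
        w = w - lr * g
    return w

# Synthetic regression data (assumed values, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0])
w0 = np.zeros(4)

# Run A: small learning rate, loss scaled by 8.
w_scaled_loss = gradient_descent(w0, X, y, lr=0.01, loss_scale=8.0)
# Run B: learning rate scaled by 8, loss left unscaled.
w_scaled_lr = gradient_descent(w0, X, y, lr=0.08, loss_scale=1.0)

# Both runs follow the same trajectory up to floating-point rounding.
same_trajectory = np.allclose(w_scaled_loss, w_scaled_lr)
print(same_trajectory)
```

This identity holds only for the plain gradient descent update shown here. Adaptive methods normalize the gradient, so the relationship takes a different form there, and the abstract does not state how the dissertation carries it over.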

Recommended Citation

Alosily, Nora, "Loss Scaling and Step Size in Deep Learning Optimization" (2023). Dissertations. 1286.
https://irl.umsl.edu/dissertation/1286

Included in

Artificial Intelligence and Robotics Commons, Theory and Algorithms Commons