Investigation into Training Dynamics of Learned Optimizers (Student Abstract)
Authors
Sobotka, J.; Šimánek, P.
Year
2024
Published
Proceedings of the 38th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2024. p. 23657-23658. vol. 38. ISSN 2374-3468. ISBN 978-1-57735-887-9.
Type
Proceedings paper
Annotation
Modern machine learning heavily relies on optimization, and as deep learning models grow more complex and data-hungry, the search for efficient learning becomes crucial. Learned optimizers disrupt traditional handcrafted methods such as SGD and Adam by learning the optimization strategy itself, potentially speeding up training. However, the learned optimizers' dynamics are still not well understood. To remedy this, our work explores their optimization trajectories from the perspective of network architecture symmetries and proposed parameter update distributions.
Weather4cast at NeurIPS 2022: Super-Resolution Rain Movie Prediction under Spatio-temporal Shifts
Authors
Gruca, A.; Serva, F.; Lliso, L.; Pihrt, J.; Raevskiy, R.; Šimánek, P.
Year
2023
Published
Proceedings of the NeurIPS 2022 Competitions Track. Proceedings of Machine Learning Research, 2023. p. 292-312. Proceedings of Machine Learning Research. vol. 220. ISSN 2640-3498.
Type
Proceedings paper
Annotation
Weather4cast again advanced modern algorithms in AI and machine learning through a highly topical interdisciplinary competition challenge: The prediction of hi-res rain radar movies from multi-band satellite sensors, requiring data fusion, multi-channel video frame prediction, and super-resolution. Accurate predictions of rain events are becoming ever more critical, with climate change increasing the frequency of unexpected rainfall. The resulting models will have a particular impact where costly weather radar is not available. We here present highlights and insights emerging from the thirty teams participating from over a dozen countries. To extract relevant patterns, models were challenged by spatio-temporal shifts. Geometric data augmentation and test-time ensemble models with a suitable smoother loss helped this transfer learning. Even though, in ablation, static information like geographical location and elevation was not linked to performance, the general success of models incorporating physics in this competition suggests that approaches combining machine learning with application domain knowledge seem a promising avenue for future research. Weather4cast will continue to explore the powerful benchmark reference data set introduced here, advancing competition tasks to quantitative predictions, and exploring the effects of metric choice on model performance and qualitative prediction properties.
Learning to Optimize with Dynamic Mode Decomposition
Authors
Year
2022
Published
2022 International Joint Conference on Neural Networks (IJCNN). Vienna: IEEE Industrial Electronic Society, 2022. p. 1-8. ISSN 2161-4407. ISBN 978-1-7281-8671-9.
Type
Proceedings paper
Annotation
Designing faster optimization algorithms is of ever-growing interest. In recent years, learning-to-learn methods that learn how to optimize have demonstrated very encouraging results. Current approaches usually do not effectively include the dynamics of the optimization process during training. They either omit these dynamics entirely or only implicitly assume the dynamics of an isolated parameter. In this paper, we show how to utilize the dynamic mode decomposition method for extracting informative features about optimization dynamics. By employing those features, we show that our learned optimizer generalizes much better to unseen optimization problems in short. The improved generalization is illustrated on multiple tasks where training the optimizer on one neural network generalizes to different architectures and distinct datasets.
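As a rough illustration of the dynamic mode decomposition mentioned in the annotation above, the following minimal sketch applies standard exact DMD to a short parameter trajectory. It is not the feature pipeline from the paper: the function name `dmd`, the toy quadratic objective, the learning rate, and the choice of eigenvalues as "features" are all assumptions made here for illustration.

```python
import numpy as np


def dmd(snapshots, rank):
    """Exact DMD of a (n_params, n_steps) snapshot matrix (illustrative helper)."""
    # Pair each snapshot with its successor: Y is approximately A @ X.
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    # Truncated SVD of X gives a low-rank basis for the trajectory.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    # Dividing by s scales column j by 1/s[j], i.e. right-multiplies by S^{-1}.
    Y_V_Sinv = Y @ Vh.conj().T / s
    # Projection of the linear operator onto the leading singular vectors.
    A_tilde = U.conj().T @ Y_V_Sinv
    eigvals, W = np.linalg.eig(A_tilde)
    # Exact DMD modes in the original parameter space.
    modes = Y_V_Sinv @ W
    return eigvals, modes


# Toy trajectory (assumption): gradient descent on 0.5 * theta^T H theta,
# so theta_{k+1} = (I - lr * H) theta_k is exactly linear.
H = np.diag([1.0, 5.0])
lr = 0.1
theta = np.array([1.0, 1.0])
trajectory = [theta]
for _ in range(10):
    theta = theta - lr * (H @ theta)
    trajectory.append(theta)
snapshots = np.stack(trajectory, axis=1)  # shape (2, 11)

eigvals, modes = dmd(snapshots, rank=2)
# For this linear system the DMD eigenvalues match those of I - lr * H,
# i.e. approximately 0.9 and 0.5 (the per-step contraction rates).
print(np.sort(eigvals.real))
```

In this toy case the eigenvalue magnitudes summarize how quickly each direction of the trajectory contracts, which is the kind of dynamics information a learned optimizer could in principle consume as an input feature.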
Spatiotemporal Prediction of Vehicle Movement Using Artificial Neural Networks
Authors
Year
2022
Published
Proceedings of 2022 IEEE Intelligent Vehicles Symposium (IV). Piscataway: IEEE, 2022. p. 734-739. ISSN 1931-0587. ISBN 978-1-6654-8821-1.
Type
Proceedings paper
Annotation
Prediction of the movement of all traffic participants is a very important task in autonomous driving. Well-predicted behavior of other cars and actors is crucial for safety. A sequence of artificially rasterized bird’s-eye-view frames is used as input to neural networks, which are trained to predict the future behavior of the participants. The Lyft Motion Prediction for Autonomous Vehicles dataset is explored and adapted for this task. We developed and applied a novel approach in which the prediction problem is viewed as a problem of spatiotemporal prediction, and we use methods based on convolutional recurrent neural networks.