Laboratory experiments of model-based reinforcement learning for adaptive optics control
Access rights
openAccess
A1 Original article in a scientific journal
This publication is imported from Aalto University research portal.
Date
2024
Language
en
Series
Journal of Astronomical Telescopes, Instruments, and Systems, Volume 10, Issue 1
Abstract
Direct imaging of Earth-like exoplanets is one of the most prominent scientific drivers of the next generation of ground-based telescopes. Typically, Earth-like exoplanets are located at small angular separations from their host stars, making their detection difficult. Consequently, the adaptive optics (AO) system's control algorithm must be carefully designed to distinguish the exoplanet from the residual light produced by the host star. A promising avenue of research to improve AO control builds on data-driven control methods, such as reinforcement learning (RL). RL is an active branch of machine learning research in which control of a system is learned through interaction with the environment. RL can therefore be seen as an automated approach to AO control whose usage is essentially a turnkey operation. In particular, model-based RL has been shown to cope with temporal and misregistration errors, and it has been demonstrated to adapt to nonlinear wavefront sensing while remaining efficient in training and execution. In this work, we implement and adapt an RL method called policy optimization for AO (PO4AO) on the GPU-based high-order adaptive optics testbench (GHOST) at ESO headquarters, where we demonstrate strong performance of the method in a laboratory environment. Our implementation allows training to run in parallel with inference, which is crucial for on-sky operation. In particular, we study the predictive and self-calibrating aspects of the method. The new implementation on GHOST, running PyTorch, introduces only around 700 μs of latency in addition to the hardware, pipeline, and Python interface latency. We open-source well-documented code for the implementation and specify the requirements for the RTC pipeline. We also discuss the important hyperparameters of the method and how they affect its performance. Further, we discuss the sources of the latency and possible paths toward a lower-latency implementation.
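
To illustrate the training-in-parallel-with-inference architecture mentioned in the abstract, here is a minimal PyTorch sketch; it is not the authors' PO4AO code. The network architecture, the flat replay list, the 32-sample batch, and the squared-output training objective are all illustrative assumptions. The point is only that the latency-critical control path performs a single forward pass while gradient updates run in a background thread behind a lock.

import threading
import time

import torch
import torch.nn as nn

N_MODES = 100  # number of controlled DM modes (illustrative assumption)

class PolicyNet(nn.Module):
    """Small MLP mapping WFS residual measurements to DM commands."""
    def __init__(self, n_modes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_modes, 256),
            nn.ReLU(),
            nn.Linear(256, n_modes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

policy = PolicyNet(N_MODES)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
lock = threading.Lock()          # guards the shared policy weights
replay: list[torch.Tensor] = []  # recorded measurements (placeholder buffer)

def trainer() -> None:
    """Background thread: update the policy from recorded interaction data."""
    while True:
        if len(replay) < 32:
            time.sleep(1e-3)
            continue
        batch = torch.stack(replay[-32:])  # most recent measurements
        with lock:
            # Placeholder objective; a real model-based RL method would
            # fit a dynamics model and optimize the policy through it.
            loss = policy(batch).pow(2).mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

threading.Thread(target=trainer, daemon=True).start()

def control_step(wfs_measurement: torch.Tensor) -> torch.Tensor:
    """Latency-critical path: record the measurement, run one forward pass."""
    replay.append(wfs_measurement)
    with lock, torch.no_grad():
        return policy(wfs_measurement)

# Example: one control-loop iteration on a synthetic measurement.
command = control_step(torch.randn(N_MODES))

The lock is the simplest way to keep weight updates atomic with respect to inference; a real-time implementation would more likely swap in a frozen copy of the weights so the control path never contends with the trainer.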
Description
Publisher Copyright: © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
Keywords
adaptive optics, high contrast imaging, reinforcement learning, system identification
Other note
Citation
Nousiainen, J., Engler, B., Kasper, M., Rajani, C., Helin, T., Heritier, C. T., Quanz, S. P. & Glauser, A. M. 2024, 'Laboratory experiments of model-based reinforcement learning for adaptive optics control', Journal of Astronomical Telescopes, Instruments, and Systems, vol. 10, no. 1, 019001. https://doi.org/10.1117/1.JATIS.10.1.019001