ORBIT: Cross‑Episode Meta‑RL for In‑Context Online Adaptation of LLMs
ORBIT trains LLMs with cross-episode meta-RL so that models learn from interaction traces at inference time; the authors report that, after meta-training, Qwen3-14B matches GPT-5.2 on unseen environments.