
ARIA Spotlight: Rosa Chen – Department of Computer Science

Rosa Chen's Research Poster

My Arts Undergraduate Research Internship (ARIA) project is called "Improving Successor Feature Learning with Efficient Optimization Techniques," supervised by Dr. Isabeau Prémont-Schwarz. In simple terms, I worked on helping a learning program (an "agent") learn useful patterns faster and reuse them across tasks, instead of starting from zero each time. This is practical because goals can change, and we want systems that adapt quickly without wasting compute.

I chose ARIA because I wanted hands-on research: write code that runs, design clean experiments, and explain results in plain language. I focused on a method called "successor features" (SF). In plain terms, SF separates two things: general knowledge about the world and the current goal. With that separation, an agent can switch goals while keeping what it already knows, which can save training time.
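The separation can be sketched in a few lines of NumPy. In the standard SF formulation, the successor features ψ(s, a) summarize knowledge about the world, and each task enters only through a reward-weight vector w, so that Q(s, a) = ψ(s, a) · w. All numbers below are made up for illustration and are not from the project itself:

```python
import numpy as np

# Hypothetical learned successor features psi(s, a) for one state:
# one row per action, each row an expected discounted sum of a
# 4-dimensional state feature phi (values invented for this sketch).
psi = np.array([
    [1.0, 0.2, 0.0, 0.5],   # action 0
    [0.1, 1.5, 0.3, 0.0],   # action 1
    [0.0, 0.4, 1.2, 0.8],   # action 2
])

def q_values(psi, w):
    """Q(s, a) = psi(s, a) . w -- the task enters only through w."""
    return psi @ w

# Task A rewards the first feature; task B rewards the third.
# Switching goals means swapping w; psi (the world knowledge) is reused.
w_task_a = np.array([1.0, 0.0, 0.0, 0.0])
w_task_b = np.array([0.0, 0.0, 1.0, 0.0])

print(np.argmax(q_values(psi, w_task_a)))  # -> 0 (best action under task A)
print(np.argmax(q_values(psi, w_task_b)))  # -> 2 (best action under task B)
```

The point of the decomposition is visible in the last two lines: the same ψ answers both tasks, and only the cheap dot product with a new w changes.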

My learning goals were simple. First, try small, practical changes that make training steadier and faster. Second, run fair tests: keep settings fixed, try multiple random seeds, and track not only scores but also the helper losses we add during training. Third, practice clear communication: short notes, readable plots, and simple takeaways.

I added two ideas to the training loop. The first was to mix a slow, steady signal from a target copy of the model with the live model's features when computing helper losses. This keeps the model from overreacting and makes learning smoother. The second was to ask the model to predict the immediate reward from its own internal features. This nudges the model to learn features that are truly useful for decisions. Both ideas are lightweight and easy to tune.
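A minimal NumPy sketch of the two ideas, assuming a feature dimension of 8; the feature vectors, the mixing weight, and the linear reward-prediction head are illustrative assumptions, not the project's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features for one transition (values invented for this sketch):
# phi_online comes from the live network, phi_target from a slowly-updated
# target copy of the same network.
phi_online = rng.normal(size=8)
phi_target = rng.normal(size=8)

# Idea 1: blend the slow target signal with the live features before
# computing the helper loss, so the learning target changes smoothly
# between updates instead of jumping with every gradient step.
mix = 0.9  # weight on the slow signal; one of the tuning knobs
phi_mixed = mix * phi_target + (1.0 - mix) * phi_online

# Idea 2: predict the immediate reward from the internal features with a
# small linear head, and penalize the squared error.
w_reward = rng.normal(size=8) * 0.1  # hypothetical reward-prediction head
reward = 1.0                         # observed immediate reward
reward_loss = (w_reward @ phi_online - reward) ** 2

# Both terms join the main loss with small coefficients, which is what
# keeps them lightweight and easy to tune.
aux_weight = 0.05
aux_loss = aux_weight * reward_loss
```

In a real training loop the mixed features would feed the successor-feature helper loss and the reward head would be trained jointly with the rest of the model; this fragment only shows the shape of the two terms.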

The main value was the process. I built a reliable training pipeline, learned to keep runs reproducible, and saw steadier learning when the slow signal was tuned well. The curves were not dramatically better in every case, but the system became easier to debug and understand. Plotting both the main score and the helper losses made it clearer why certain settings behaved better or worse.

At the beginning I was new to this codebase and to RL in practice. I made mistakes and spent time understanding how the parts fit together. When experiments stalled, it was discouraging, but I learned this is normal in research. I made progress by testing in smaller steps, changing one thing at a time, and keeping careful notes. I also tuned the strength of the helper loss and the slow鈥憇ignal mix to keep training stable.

ARIA gave me the full cycle: idea, build, test, explain. I learned that small, well-motivated changes can improve stability without adding heavy complexity. This experience grew my interest in practical, reliable training methods, which I plan to carry into future research or an applied ML role. Next, I hope to turn this work into a short workshop paper or extended report with clearer ablations. I also plan to share the code and simple run scripts so that other students can easily reproduce the results.

Lastly, I would like to thank the ARIA award, specifically the Undergraduate Experiential Learning Opportunities Support Fund, which covered my basic costs (especially rent) and allowed me to focus on research instead of taking extra jobs. That time and focus turned into better experiments and clearer results. I am grateful for the support.
