MODERN SYSTEMS 2025, International Conference of Modern Systems Engineering Solutions


Q-Learning Performance on the CartPole Environment Under Observation Noise and Reward Variants

Authors:
Steven Ren
Taieba Tasnim
Berkeley Wu
Mohammad Rahman
Fan Wu

Keywords: reinforcement learning; q-learning; noise; reward; cyber-physical systems

Abstract:
This paper evaluates Q-learning performance in the CartPole reinforcement learning environment under varying levels of observation noise and two distinct reward functions, in the broader context of designing robust learning-based controllers for cyber-physical systems. Specifically, we compare the standard step-based reward with a cosine-based reward designed to encourage upright pole balance. Observation noise is modeled as Gaussian noise whose standard deviation is scaled to the range of each observation variable. Across multiple training runs at different noise levels, we evaluate convergence behavior, pole angle stability, and cumulative reward. Our results show that observation noise significantly impairs learning under the standard reward, whereas the cosine-based reward improves robustness and promotes more stable policies. By linking reinforcement learning with noise-robust control design, this work contributes to the understanding of Q-learning in noisy environments and represents a step toward applying reinforcement learning to real-world cyber-physical systems, where noise and variability are inherent.
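
The noise model and the two reward variants described in the abstract might be sketched as follows. This is a minimal illustration and not the authors' implementation: the observation ranges in `OBS_RANGES`, the `noise_level` parameter, and the exact form of the cosine reward (the cosine of the pole angle in radians) are assumptions made for this example, since the abstract does not specify them.

```python
import math
import random

# Assumed (hypothetical) ranges for the four CartPole observations:
# cart position, cart velocity, pole angle (rad), pole angular velocity.
# The velocity bounds in particular are illustrative guesses.
OBS_RANGES = [(-2.4, 2.4), (-3.0, 3.0), (-0.2095, 0.2095), (-3.5, 3.5)]


def add_observation_noise(obs, noise_level, rng):
    """Add zero-mean Gaussian noise to each observation variable.

    The standard deviation for each variable is noise_level times the
    width of that variable's assumed range.
    """
    noisy = []
    for value, (low, high) in zip(obs, OBS_RANGES):
        sigma = noise_level * (high - low)
        noisy.append(value + rng.gauss(0.0, sigma))
    return noisy


def step_reward():
    """Standard step-based reward: +1 for every step the episode survives."""
    return 1.0


def cosine_reward(pole_angle):
    """Assumed cosine-based reward: largest (1.0) when the pole is upright."""
    return math.cos(pole_angle)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    obs = [0.0, 0.1, 0.05, -0.2]
    print(add_observation_noise(obs, 0.05, rng))
    print(step_reward(), cosine_reward(obs[2]))
```

Under these assumptions, the step-based reward is identical for every non-terminal state, while the cosine-based reward decreases as the pole tilts away from vertical, giving the agent a graded signal that favors upright balance.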

Pages: 23 to 28

Copyright: Copyright (c) IARIA, 2025

Publication date: October 26, 2025

Published in: conference

ISBN: 978-1-68558-316-3

Location: Barcelona, Spain

Dates: from October 26, 2025 to October 30, 2025