An ML Agent using the Policy Gradient Method to win a SoccerTwos Game

Victor Pugliese

2022

Abstract

We conducted an investigative study of Policy Gradient methods combined with Curriculum Learning in video games, motivated by a customized SoccerTwos environment that professors at the Federal University of Goiás created to evaluate students' Machine Learning agents in a Reinforcement Learning course. We employed PPO and SAC as state-of-the-art methods in the on-policy and off-policy settings, respectively. Curriculum Learning could improve performance, on the premise that it is easier to teach people in a gradual order of increasing complexity than in a random one. Combining the two, we aim for our agents to win more matches than their adversaries. We measured results at checkpoints by the minimum, maximum, and mean rewards and by the mean episode length. PPO achieved the best result with Curriculum Learning, which modified the players' settings (position and rotation) and the ball's settings (speed and position) at time intervals. It also required fewer training hours than the other experiments.
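
The checkpoint metrics named in the abstract (minimum, maximum, and mean reward, plus mean episode length) can be sketched as below. This is a minimal Python illustration and not the author's code: the function name `summarize_checkpoint`, the dictionary keys, and the sample numbers are all hypothetical, and it assumes each checkpoint provides one reward and one length per episode.

```python
from statistics import mean


def summarize_checkpoint(episode_rewards, episode_lengths):
    """Summarize one checkpoint from per-episode rewards and lengths.

    Hypothetical helper: the inputs are assumed to be equal-length lists,
    one entry per episode played since the previous checkpoint.
    """
    if not episode_rewards or len(episode_rewards) != len(episode_lengths):
        raise ValueError("need equal-length, non-empty episode lists")
    return {
        "reward_min": min(episode_rewards),
        "reward_max": max(episode_rewards),
        "reward_mean": mean(episode_rewards),
        "episode_len_mean": mean(episode_lengths),
    }


# Made-up numbers for illustration only; these are not results from the paper.
summary = summarize_checkpoint([-1.0, 0.5, 2.0, 0.5], [120, 95, 60, 105])
print(summary)
```

Comparing such summaries across checkpoints is one way to see whether the mean reward rises and the mean episode length changes as training progresses.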

Paper Citation


in Harvard Style

Pugliese V. (2022). An ML Agent using the Policy Gradient Method to win a SoccerTwos Game. In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-758-569-2, pages 628-633. DOI: 10.5220/0011108400003179


in Bibtex Style

@conference{iceis22,
author={Victor Pugliese},
title={An ML Agent using the Policy Gradient Method to win a SoccerTwos Game},
booktitle={Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2022},
pages={628-633},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011108400003179},
isbn={978-989-758-569-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - An ML Agent using the Policy Gradient Method to win a SoccerTwos Game
SN - 978-989-758-569-2
AU - Pugliese V.
PY - 2022
SP - 628
EP - 633
DO - 10.5220/0011108400003179