This repository is an example for AlphaRTC Gym. By converting the raw statistics of packet traces into features and leveraging the PPO algorithm, this example trains a simple bandwidth estimator.
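The idea behind the feature conversion can be sketched as follows: per-packet statistics from a reporting interval are reduced to a few normalized numbers (for example receiving rate, average delay, and loss ratio) that the PPO policy consumes. The sketch below is only an illustration; the field names, feature set, and normalization constants are assumptions, not this repository's actual code.

```python
# Illustrative only: a hypothetical feature extractor for a list of packet
# reports. The field names and normalization ceilings are assumptions.
from typing import Dict, List

MAX_BANDWIDTH_BPS = 8_000_000.0  # assumed normalization ceiling
MAX_DELAY_MS = 1000.0            # assumed normalization ceiling


def packet_stats_to_features(packet_reports: List[Dict]) -> List[float]:
    """Reduce raw per-packet stats to a small normalized feature vector:
    receiving rate, average delay, and loss ratio over the window."""
    if not packet_reports:
        return [0.0, 0.0, 0.0]

    window_ms = max(
        packet_reports[-1]["receive_timestamp_ms"]
        - packet_reports[0]["receive_timestamp_ms"],
        1,
    )
    received_bytes = sum(p["payload_size"] for p in packet_reports)
    receiving_rate_bps = received_bytes * 8 * 1000.0 / window_ms

    delays = [p["receive_timestamp_ms"] - p["send_timestamp_ms"]
              for p in packet_reports]
    avg_delay_ms = sum(delays) / len(delays)

    expected = (max(p["sequence_number"] for p in packet_reports)
                - min(p["sequence_number"] for p in packet_reports) + 1)
    loss_ratio = max(0.0, 1.0 - len(packet_reports) / expected)

    # Normalize each feature into [0, 1] so the policy sees comparable scales.
    return [
        min(receiving_rate_bps / MAX_BANDWIDTH_BPS, 1.0),
        min(avg_delay_ms / MAX_DELAY_MS, 1.0),
        min(loss_ratio, 1.0),
    ]


if __name__ == "__main__":
    demo_reports = [
        {"send_timestamp_ms": 0, "receive_timestamp_ms": 40,
         "payload_size": 1200, "sequence_number": 1},
        {"send_timestamp_ms": 20, "receive_timestamp_ms": 65,
         "payload_size": 1200, "sequence_number": 2},
        {"send_timestamp_ms": 40, "receive_timestamp_ms": 90,
         "payload_size": 1200, "sequence_number": 4},
    ]
    print(packet_stats_to_features(demo_reports))
```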
- Fetch all submodules
git submodule init
git submodule update
- Please visit the AlphaRTC Gym repository for instructions on installing AlphaRTC Gym in alphartc_gym
- Install the example dependencies (OpenAI Gym)
python3 -m pip install -r requirements.txt
- Run this example
python3 main.py
If you see something like
Episode 0 Average policy loss, value loss, reward -0.001012914622997811, 1749.7713505035483, -0.5917726188465685
Episode 1 Average policy loss, value loss, reward -0.003164666119424294, 1643.378693441069, -0.5696511401323291
Episode 2 Average policy loss, value loss, reward -0.0006242794975365944, 1503.4368403712722, -0.5418709073877055
Episode 3 Average policy loss, value loss, reward -0.0013577909024748813, 1396.8935836247986, -0.5149282318393551
Episode 4 Average policy loss, value loss, reward -0.0002334891452391077, 1349.4928827780047, -0.5091586053527624
then this example is working in your environment, and you can find your trained model under the data folder.
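The three numbers logged per episode are the average PPO policy loss, critic value loss, and reward. As a rough illustration of where the two loss terms come from, here is a minimal PPO-style clipped-surrogate update on dummy data; the Gaussian policy, network sizes, and hyperparameters are assumptions, not this repository's actual implementation.

```python
# Illustrative only: a minimal PPO clipped-surrogate update, assuming a
# 3-dimensional feature vector and a single continuous bandwidth action.
import torch
import torch.nn as nn

STATE_DIM, CLIP_EPS = 3, 0.2


class ActorCritic(nn.Module):
    def __init__(self):
        super().__init__()
        self.actor_mean = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(),
                                        nn.Linear(64, 1))
        self.log_std = nn.Parameter(torch.zeros(1))
        self.critic = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(),
                                    nn.Linear(64, 1))

    def dist(self, states):
        # Gaussian policy over one continuous action (the bandwidth estimate).
        return torch.distributions.Normal(self.actor_mean(states),
                                          self.log_std.exp())


def ppo_update(model, optimizer, states, actions, old_log_probs, returns):
    """One clipped-surrogate PPO step; returns (policy_loss, value_loss)."""
    values = model.critic(states).squeeze(-1)
    advantages = returns - values.detach()

    new_log_probs = model.dist(states).log_prob(actions).squeeze(-1)
    ratio = (new_log_probs - old_log_probs).exp()
    clipped = ratio.clamp(1.0 - CLIP_EPS, 1.0 + CLIP_EPS)
    policy_loss = -torch.min(ratio * advantages, clipped * advantages).mean()
    value_loss = (returns - values).pow(2).mean()

    optimizer.zero_grad()
    (policy_loss + 0.5 * value_loss).backward()
    optimizer.step()
    return policy_loss.item(), value_loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = ActorCritic()
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
    # Dummy rollout: 16 transitions with random feature vectors and returns.
    states = torch.rand(16, STATE_DIM)
    actions = torch.rand(16, 1)
    with torch.no_grad():
        old_log_probs = model.dist(states).log_prob(actions).squeeze(-1)
    returns = torch.rand(16)
    p_loss, v_loss = ppo_update(model, optimizer, states, actions,
                                old_log_probs, returns)
    print(f"policy loss {p_loss:.6f}, value loss {v_loss:.6f}")
```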
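If you want to reuse the trained model outside of main.py, the usual pattern is to rebuild the network and load the saved weights. The sketch below assumes a PyTorch state dict; the checkpoint file name, network architecture, and feature layout are placeholders and must be replaced with whatever main.py actually saves under data.

```python
# Illustrative only: loading a saved estimator for inference, assuming a
# PyTorch state dict. The path and network below are hypothetical; they must
# match what your training run actually wrote to the data folder.
import torch
import torch.nn as nn

STATE_DIM = 3   # assumed: [receiving rate, delay, loss ratio]


class TinyPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)


def load_estimator(checkpoint_path: str = "data/ppo_model.pth") -> TinyPolicy:
    """Rebuild the (placeholder) network and load the trained weights."""
    model = TinyPolicy()
    model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    model.eval()
    return model


if __name__ == "__main__":
    estimator = load_estimator()
    features = torch.tensor([[0.1, 0.05, 0.0]])  # normalized feature vector
    with torch.no_grad():
        print("estimator output:", estimator(features).item())
```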