Reinforcement learning approach to coordinate real-world multi-agent dynamic routing and scheduling