Q-Learning in a simple environment
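
As a concrete starting point, here is a minimal sketch of tabular Q-learning. The environment is an assumption made for illustration, not something the title specifies: a deterministic five-state chain where the agent starts at state 0 and earns a reward of 1 on reaching state 4. The names `step`, `train`, and `greedy_policy`, and the values chosen for `alpha`, `gamma`, and the episode count, are likewise hypothetical. Exploration here is a uniformly random behaviour policy rather than the more common epsilon-greedy rule; because Q-learning is off-policy, it can still learn greedy action values from this kind of data.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (assumed layout)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics: returns (next_state, reward, done)."""
    # Moving left from state 0 leaves the agent in state 0.
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Run tabular Q-learning and return the Q-table as a list of [left, right]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behaviour policy: pick an action uniformly at random.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action,
            # except on terminal transitions, which have no future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action (0 = left, 1 = right) for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    for s, (left, right) in enumerate(q_table):
        print(f"state {s}: left={left:.3f} right={right:.3f}")
    print("greedy policy:", greedy_policy(q_table))
```

In this setup the learned values for moving right should approach 1 in the state next to the goal and shrink by a factor of `gamma` for each extra step away from it, so the greedy policy moves right in every non-terminal state. Swapping the random behaviour policy for epsilon-greedy changes only the line that picks `action`.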