What is a Non-Episodic Environment in Reinforcement Learning?