The objective of reinforcement learning is to learn a policy, a mapping from states to actions, that maximizes the expected cumulative reward over time.

Principal component analysis (PCA) lets us reduce the dimensionality of the data without losing much information. It reduces the dimension by finding a few orthogonal linear com