Developing a practical motion planning and control algorithm for under-actuated robots subject to unknown disturbances is an important problem in robotics research. For under-actuated underwater vehicles, developing such an algorithm has been particularly difficult for three reasons. First, the motion planning calculation must account for the dynamic as well as the kinematic characteristics of the vehicle's motion. Second, it is very difficult to determine, before the mission, the exact velocity distribution of the non-uniform sea flow around obstacles on the seabed. Third, the sea flow strongly affects the vehicle's motion because the flow speed is high relative to the vehicle's own speed. This paper proposes a new method that applies reinforcement learning to address these problems. Reinforcement learning based on the Markov decision process (MDP) is known to be suitable for acquiring motion control algorithms for robots acting in a stochastic environment with disturbances. However, to apply reinforcement learning, the robot's motion must be suitably digitized and the learning environment must match the robot's mission environment. This paper introduces a motion digitizing method based on an artificial neuron model and a method for compensating for the difference between the learning and mission environments. The performance of the proposed algorithm is examined through dynamical simulation of an under-actuated underwater vehicle cruising in an environment containing an obstacle and an unknown non-uniform flow, which is modeled as a potential flow.
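As a generic illustration of MDP-based reinforcement learning with digitized motions in a stochastic environment, the following minimal sketch runs tabular Q-learning on a small grid. It is not the method proposed in this paper: the grid size, obstacle position, reward values, drift probability, and all function names are hypothetical assumptions chosen only to make the idea concrete, and a simple random drift stands in for the sea flow.

```python
import random

# Hypothetical toy world; none of these values come from the paper.
WIDTH, HEIGHT = 6, 5
START, GOAL, OBSTACLE = (0, 2), (5, 2), (3, 2)
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # digitized motions: +x, -x, +y, -y
DRIFT = (0, -1)    # assumed displacement caused by the flow
DRIFT_PROB = 0.2   # assumed chance that the flow overrides the commanded motion


def step(state, action, rng):
    """Apply one digitized motion, possibly overridden by the random drift."""
    dx, dy = DRIFT if rng.random() < DRIFT_PROB else action
    # Clamp the new position to the grid boundaries.
    nx = min(max(state[0] + dx, 0), WIDTH - 1)
    ny = min(max(state[1] + dy, 0), HEIGHT - 1)
    if (nx, ny) == OBSTACLE:
        return state, -5.0, False      # collision: stay in place, pay a penalty
    if (nx, ny) == GOAL:
        return (nx, ny), 10.0, True    # goal reached: episode ends
    return (nx, ny), -1.0, False       # ordinary move: small time cost


def train(episodes=3000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(WIDTH) for y in range(HEIGHT) for a in range(n)}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(n)
            else:
                a = max(range(n), key=lambda i: q[(state, i)])
            nxt, reward, done = step(state, ACTIONS[a], rng)
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(n))
            # Standard Q-learning update toward the one-step bootstrapped target.
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
            if done:
                break
    return q


def greedy_action(q, state):
    """Index into ACTIONS of the action with the highest learned value."""
    return max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
```

Calling `q = train()` and then `ACTIONS[greedy_action(q, START)]` gives the learned motion at the start cell. The agent learns here in the same stochastic world in which it is evaluated, so the sketch does not address the case where the learning and mission environments differ.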