Learning to control a joint driven double inverted pendulum using nested actor/critic algorithm | IEEE Conference Publication | IEEE Xplore