Learning First-to-Spike Policies for Neuromorphic Control Using Policy Gradients | IEEE Conference Publication | IEEE Xplore