Floating-point multiply-add-fused with reduced latency | IEEE Journals & Magazine | IEEE Xplore