Floating-point fused multiply-add: reduced latency for floating-point addition (IEEE Conference Publication)