Improved Architectures For Fused Floating-Point Arithmetic Units