International Journal of Reconfigurable Computing 2013

Analysis of Fast Radix-10 Digit Recurrence Algorithms for Fixed-Point and Floating-Point Dividers on FPGAs

DOI: 10.1155/2013/453173

Malte Baesler, Sven-Ole Voigt

Abstract:

Decimal floating-point operations are important for applications that cannot tolerate errors from conversions between binary and decimal formats, for instance, commercial, financial, and insurance applications. In this paper we present five different radix-10 digit recurrence dividers for FPGA architectures. The first one implements a simple restoring shift-and-subtract algorithm, whereas each of the other four implementations performs a nonrestoring digit recurrence algorithm with signed-digit redundant quotient calculation and carry-save representation of the residuals. More precisely, the quotient digit selection function of the second divider is implemented fully by means of a ROM, the quotient digit selection functions of the third and fourth dividers are based on carry-propagate adders, and the fifth divider decomposes each digit into three components and requires neither a ROM nor a multiplexer. Furthermore, the fixed-point divider is extended to support IEEE 754-2008 compliant decimal floating-point division for the decimal64 data format. Finally, the algorithms have been synthesized on a Xilinx Virtex-5 FPGA, and implementation results are given.

1. Introduction

Many applications, particularly commercial and financial applications, require decimal floating-point operations to avoid errors from conversions between binary and decimal formats. This paper presents five different decimal fixed-point dividers and analyzes their performance and resource requirements on FPGA platforms. All five architectures apply a radix-10 digit recurrence algorithm but differ in the quotient digit selection (QDS) function. The first fixed-point divider (type1) implements a simple shift-and-subtract algorithm. It is characterized by an unsigned and nonredundant quotient digit calculation. Nine divisor multiples are precomputed, and in each iteration step nine carry-propagate subtractions are performed on the residual.
Finally, the smallest nonnegative difference is selected by a large fan-in multiplexer. This type1 implementation is characterized by high area use.

The second divider (type2) uses a signed-digit quotient calculation with a redundancy of and operand scaling to get a normalized divisor in the range of . The quotient digit selection (QDS) function can be implemented fully by a ROM because it depends only on the two most significant digits (MSDs) of the residual and of the divisor. The residual uses a redundant carry-save representation but, for performance reasons, the two MSDs are implemented in a nonredundant radix-2 representation. The
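
The type1 scheme described above (nine precomputed divisor multiples, nine subtractions per step, selection of the smallest nonnegative difference) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the function name `restoring_radix10_divide`, the use of plain Python integers for the operands, and the precondition 0 <= x < d are all hypothetical choices made for illustration.

```python
def restoring_radix10_divide(x, d, num_digits):
    """Software model of a restoring radix-10 shift-and-subtract divider.

    Assumption (not from the paper): x and d are nonnegative integers with
    0 <= x < d, so the quotient is a pure fraction 0.q1 q2 ... qn.
    Returns (quotient_digits, final_residual).
    """
    if d <= 0 or not 0 <= x < d:
        raise ValueError("requires d > 0 and 0 <= x < d")

    # The nine divisor multiples 1*d .. 9*d are computed once, up front.
    multiples = [k * d for k in range(1, 10)]

    residual = x
    digits = []
    for _ in range(num_digits):
        shifted = 10 * residual  # shift the residual one decimal digit left

        # Nine subtractions per step; in hardware these would run in
        # parallel on carry-propagate subtractors.
        differences = [shifted - m for m in multiples]

        # Select the smallest nonnegative difference. The differences
        # decrease with k, so the last nonnegative one is the smallest.
        # If none is nonnegative, the digit is 0 and the shifted residual
        # is kept unchanged (the "restoring" case).
        digit, residual = 0, shifted
        for k, diff in enumerate(differences, start=1):
            if diff >= 0:
                digit, residual = k, diff

        digits.append(digit)

    return digits, residual
```

For example, `restoring_radix10_divide(1, 7, 6)` returns `([1, 4, 2, 8, 5, 7], 1)`, which matches 1/7 = 0.142857... with a remaining residual of 1. Because the residual stays below `d` after every step, the shifted residual stays below `10 * d`, so each quotient digit is an unsigned, nonredundant value in 0..9.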
