IP-Enabled C/C++ Based High Level Synthesis: A Step towards Better Designer Productivity and Design Performance

DOI: 10.1155/2014/418750


Abstract:

Intellectual property (IP) core based design is an emerging design methodology for dealing with increasing chip design complexity. C/C++ based high level synthesis (HLS) is also gaining traction for the same purpose. In this work, we present a design methodology that combines the two and is therefore more powerful than either alone. We discuss the proposed methodology in the context of supporting efficient hardware synthesis of a class of mathematical functions without altering the original C/C++ source code. We also propose methods to integrate legacy IP cores into existing HLS flows. Relying on concepts from program recognition and on optimized low level implementations of such arithmetic functions, the described methodology is a step towards intelligent synthesis, in which application characteristics are matched with specific architectural resources and relevant IP cores in a transparent manner for improved area-delay results. The combined methodology is more aware of the target hardware architecture than the conventional HLS flow. Implementation results for certain compute kernels from the commercial tool Vivado-HLS and from the proposed flow are compared to show that the proposed flow gives better results.

1. Introduction

C/C++ based high level synthesis has been gaining momentum as a way to deal with increasing design complexity. Various academic [1, 2] and commercial [3, 4] tools have been introduced. Similarly, intellectual property (IP) core based design has also been proposed to deal with increasing design complexity. Because IP cores are preverified and optimized for a specific task, and in some cases can also be configured to support different compute modes, they ease the task of a designer. IP reuse is another dominant factor in evolving design methodologies, helping to deal with design complexity as well as smaller time to market (TTM) windows.
In this paper, we propose a design methodology that combines IP based design with HLS. We focus on smaller IP cores, which typically implement some arithmetic function. A representative list of the standard built-in arithmetic functions available in C/C++ is shown in Table 1; the complete list can be found in [5, 6]. It is IP cores at this level of granularity that we use in the current work. In Section 5, we discuss extending our approach to larger IP cores.

Table 1: Representative list of standard arithmetic functions in C/C++.

Traditional HLS flow involves the processes of resource allocation, scheduling, and hardware

References

[1]  R. Gupta, S. Gupta, N. D. Dutt, and A. Nicolau, SPARK: A Parallelizing Approach to the High Level Synthesis of Digital Circuits, Kluwer Academic, New York, NY, USA, 2004.
[2]  GAUT, “GAUT: high-level synthesis tool from C to RTL,” 2012, http://www-labsticc.univ-ubs.fr/www-gaut.
[3]  Vivado HLS, “Vivado high level synthesis,” 2012, http://www.xilinx.com/products/design-tools/vivado/integration/esl-design/index.htm.
[4]  Catapult-C SYNTHESIS, “Catapult-C synthesis,” 2012, http://calypto.com/en/products/catapult/overview.
[5]  “ISO/IEC C 11 STANDARD,” ISO/IEC 9899—Programming languages—C, 2011.
[6]  ISO/IEC C++ 11 STANDARD, ISO/IEC 14882:2011 Programming Language C++.
[7]  J. Detrey and F. de Dinechin, “Table-based polynomials for fast hardware function evaluation,” LIP Research Report 2004-52, 2004.
[8]  Xilinx LogiCORE IP Linear Algebra Toolkit (LAT) v1.0, DS829, March 1, 2011.
[9]  P. Coussy, D. D. Gajski, M. Meredith, and A. Takach, “An introduction to high-level synthesis,” IEEE Design and Test of Computers, vol. 26, no. 4, pp. 8–17, 2009.
[10]  A. Sangiovanni-Vincentelli, “Quo vadis, SLD? Reasoning about the trends and challenges of system level design,” Proceedings of the IEEE, vol. 95, no. 3, pp. 467–506, 2007.
[11]  J. M. P. Cardoso, P. Diniz, and M. Weinhardt, “Compiling for reconfigurable computing: a survey,” ACM Computing Surveys, vol. 42, no. 4, article 13, 2010.
[12]  D. D. Gajski, N. D. Dutt, A. C. H. Wu, and S. Y. L. Lin, High-Level Synthesis: Introduction to Chip and System Design, Kluwer Academic, New York, NY, USA, 1992.
[13]  C. Bobda, Introduction to Reconfigurable Computing: Architectures, Algorithms and Applications, Springer, New York, NY, USA, 2007.
[14]  J. Cong and Z. Zhang, “An efficient and versatile scheduling algorithm based on SDC formulation,” in Proceedings of the 43rd IEEE/ACM Design Automation Conference, pp. 433–438, ACM, New York, NY, USA, 2006.
[15]  A. Canis, J. Choi, M. Aldham, V. Zhang, A. Kammoona, T. Czajkowski, et al., “LegUp: an open source high-level synthesis tool for FPGA-based processor/accelerator systems,” ACM Transactions on Embedded Computing Systems, vol. 1, no. 1, article 1, 2012.
[16]  C.-Y. Huang, Y.-S. Chen, Y.-L. Lin, and Y.-C. Hsu, “Data path allocation based on bipartite weighted matching,” in Proceedings of the 27th ACM/IEEE Design Automation Conference, pp. 499–504, June 1990.
[17]  J. Cong and J. Xu, “Simultaneous FU and register binding based on network flow method,” in Proceedings of the Design, Automation and Test in Europe (DATE '08), pp. 1057–1062, IEEE, Los Alamitos, CA, USA, March 2008.
[18]  T. Kim and X. Liu, “Compatibility path based binding algorithm for interconnect reduction in high level synthesis,” in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '07), pp. 435–441, IEEE, Los Alamitos, CA, USA, November 2007.
[19]  U. Dhawan, S. Sinha, S.-K. Lam, and T. Srikanthan, “Extended compatibility path based hardware binding algorithm for area-time efficient designs,” in Proceedings of the 2nd Asia Symposium on Quality Electronic Design (ASQED '10), pp. 151–156, IEEE, Los Alamitos, CA, USA, August 2010.
[20]  J. Lach, W. H. Mangione-Smith, and M. Potkonjak, “Robust FPGA intellectual property protection through multiple small watermarks,” in Proceedings of the 36th Annual Design Automation Conference (DAC '99), pp. 831–836, IEEE, Los Alamitos, CA, USA, June 1999.
[21]  A. L. Oliveira, “Robust techniques for watermarking sequential circuit designs,” in Proceedings of the 36th Annual Design Automation Conference (DAC '99), pp. 837–842, IEEE, Los Alamitos, CA, USA, June 1999.
[22]  G. Qu and M. Potkonjak, “Fingerprinting intellectual property using constraint-addition,” in Proceedings of the 37th Design Automation Conference (DAC '00), pp. 587–592, IEEE, Los Alamitos, CA, USA, June 2000.
[23]  A. Deshpande, “Verification of IP-Core based SoC's,” in Proceedings of the 9th IEEE International Symposium on Quality Electronic Design (ISQED ’08), pp. 433–436, IEEE, Los Alamitos, CA, USA, 2008.
[24]  M. S. McCorquodale and R. B. Brown, “UMIPS: a semiconductor IP repository for IC design research and education,” in Proceedings of the American Society for Engineering Education Annual Conference & Exposition, June 2004.
[25]  IP-XACT Standard, IEEE 1685–2009. IEEE standard for IP-XACT, standard structure for packaging, integrating and reusing IP within tool flows.
[26]  Y. Lu and H. Zhou, “Efficient design space exploration for component-based system design,” in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD ’12), pp. 466–472, IEEE, Los Alamitos, CA, USA, 2012.
[27]  Y. Liu, Y. Yang, and J. Hu, “Clustering-based simultaneous task and voltage scheduling for NoC systems,” in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '10), pp. 277–283, IEEE, Los Alamitos, CA, USA, November 2010.
[28]  A. Baganne, I. Bennour, M. Elmarzougui, R. Gaeich, and E. Martin, “A multi-level design flow for incorporating IP cores- case study of 1D wavelet IP integration,” in Proceedings of the IEEE Design Automation and Test in Europe (DATE ’03), pp. 250–255, Los Alamitos, CA, USA, 2003.
[29]  C. Trummer, C. Ruggenthaler, C. M. Kirchsteiger et al., “Searching extended IP-XACT components for SoC design based on requirements similarity,” IEEE Systems Journal, vol. 5, no. 1, pp. 70–79, 2010.
[30]  R. Metzger and Z. Wen, Automatic Algorithm Recognition and Replacement, MIT Press, Boston, Mass, USA, 2000.
[31]  D. Batten, S. Jinturkar, J. Glossner, M. Schulte, and P. D'Arcy, “New approach to DSP intrinsic functions,” in Proceedings of the 33rd Annual Hawaii International Conference on System Sciences (HICSS-33), IEEE, Los Alamitos, CA, USA, January 2000.
[32]  R. Stallman, Using and Porting GNU CC, version 2.7.2.1, Free Software Foundation, 1996.
[33]  H. Li, W. He, Y. Chen, L. Eeckhout, O. Temam, and C. Wu, “SWAP: parallelization through algorithm substitution,” IEEE Micro, vol. 32, no. 4, pp. 54–67, 2012.
[34]  C. Alias and D. Barthou, “Algorithm recognition based on demand-driven data-flow analysis,” in Proceedings of the 10th Working Conference on Reverse Engineering, pp. 296–305, IEEE, Los Alamitos, CA, USA, November 2003.
[35]  J. Cong and W. Jiang, “Pattern-based behavior synthesis for FPGA resource reduction,” in Proceedings of the 16th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA '08), pp. 107–116, New York, NY, USA, February 2008.
[36]  T. Ly, D. Knapp, R. Miller, and D. MacMillen, “Scheduling using behavioral templates,” in Proceedings of the 32nd Design Automation Conference, pp. 101–106, IEEE, Los Alamitos, CA, USA, June 1995.
[37]  A. Prakash, S. K. Lam, C. T. Clarke, and T. Srikanthan, “FPGA-aware techniques for rapid generation of profitable custom instructions,” Microprocessors and Microsystems, vol. 37, no. 3, pp. 259–269, 2013.
[38]  K. Atasu, L. Pozzi, and P. Ienne, “Automatic application-specific instruction-set extensions under microarchitectural constraints,” in Proceedings of the 40th Design Automation Conference, pp. 256–261, IEEE, Los Alamitos, CA, USA, June 2003.
[39]  P. Bonzini and L. Pozzi, “Polynomial-time subgraph enumeration for automated instruction set extension,” in Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, pp. 1331–1336, IEEE, Los Alamitos, CA, USA, April 2007.
[40]  P. Brisk, A. Kaplan, R. Kastner, and M. Sarrafzadeh, “Instruction generation and regularity extraction for reconfigurable processors,” in Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES '02), pp. 262–269, ACM, New York, NY, USA, October 2002.
[41]  P. Yu and T. Mitra, “Scalable custom instructions identification for instruction-set extensible processors,” in Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES '04), pp. 69–78, ACM, New York, NY, USA, September 2004.
[42]  “LLVM Compiler Infrastructure,” 2013, http://www.llvm.org.
[43]  A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools, Addison Wesley, New York, NY, USA, 2nd edition, 2006.
[44]  R. Metzger, “Automated recognition of parallel algorithms in scientific applications,” 1995.
[45]  Xilinx LogiCORE IP CORDIC v4.0, DS249, March 1, 2011.
[46]  R. W. Sinnott, “Virtues of the haversine,” Sky and Telescope, vol. 68, no. 2, article 159, 1984.
