A C++11 implementation of arbitrary-rank tensors for high-performance computing
This article discusses an efficient implementation of tensors of arbitrary rank using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aim of providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested on both CPU and GPU.

New version program summary
Program title: cpp-array
Catalogue identifier: AESA_v1_1
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AESA_v1_1.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU Lesser General Public License, version 3
No. of lines in distributed program, including test data, etc.: 12 376
No. of bytes in distributed program, including test data, etc.: 81 669
Distribution format: tar.gz
Programming language: C++.
Computer: All modern architectures.
Operating system: Linux/Unix/Mac OS.
RAM: Problem dependent
Classification: 5.
External routines: GNU CMake build system and a BLAS implementation. NVIDIA CUBLAS for GPU computing.
Does the new version supersede the previous version?: Yes
Catalogue identifier of previous version: AESA_v1_0
Journal reference of previous version: Comput. Phys. Comm. 185 (2014) 1681
Nature of problem: Tensors are a basic building block for any program in scientific computing. Yet, tensors are not a built-in component of the C++ programming language.
Solution method: An arbitrary-rank tensor class template is crafted by using the new features introduced by the C++11 standard.
In addition, an entire expression template facility is built on top, to provide straightforward mathematical notation without sacrificing performance.
Reasons for new version: The reason for this version is to make the library more portable.
Summary of revisions: The first version of the library relied on the presence of a C interface to the BLAS library (CBLAS). This new version gives priority to a Fortran BLAS implementation when a Fortran compiler is provided during the configuration process. Some minor changes have also been made, including improved Doxygen documentation, a fixed installation when using CUDA, and other minor bug fixes.
Running time: Problem dependent. The tests provided take only seconds. The examples take approximately 15 min.
(C) 2014 Elsevier B.V. All rights reserved.
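
The two ideas named in the summary, a single class template parameterized by rank and an expression template layer on top of it, can be illustrated with a minimal C++11 sketch. Everything in it is an assumption made for illustration: the names `Tensor`, `Expr`, `Sum`, `Scaled`, and `demo_scaled_sum` are hypothetical and are not the cpp-array interface, storage is assumed to be row-major `double`, and no BLAS or GPU back end is involved.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: every node of an expression tree derives from Expr<Node>.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Hypothetical rank-generic tensor: Rank extents, contiguous row-major storage.
template <std::size_t Rank>
class Tensor : public Expr<Tensor<Rank>> {
    static_assert(Rank > 0, "sketch covers rank >= 1 only");
public:
    template <class... Dims>
    explicit Tensor(Dims... dims)
        : extents_{{static_cast<std::size_t>(dims)...}}, data_(count(), 0.0) {
        static_assert(sizeof...(Dims) == Rank, "one extent per dimension");
    }

    // Multi-index access, e.g. A(i, j) for Rank == 2.
    template <class... Idx>
    double& operator()(Idx... idx) {
        static_assert(sizeof...(Idx) == Rank, "one index per dimension");
        const std::size_t ids[Rank] = {static_cast<std::size_t>(idx)...};
        std::size_t flat = 0;
        for (std::size_t d = 0; d < Rank; ++d) flat = flat * extents_[d] + ids[d];
        return data_[flat];
    }

    // Flat access used by the expression nodes.
    double operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }

    // Assigning an expression evaluates it in one loop, with no temporary tensors.
    template <class E>
    Tensor& operator=(const Expr<E>& e) {
        assert(e.self().size() == data_.size());
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = e.self()[i];
        return *this;
    }

private:
    std::size_t count() const {
        std::size_t n = 1;
        for (std::size_t d = 0; d < Rank; ++d) n *= extents_[d];
        return n;
    }
    std::array<std::size_t, Rank> extents_;  // declared before data_ on purpose
    std::vector<double> data_;
};

// Lazy element-wise sum of two expressions.
template <class L, class R>
class Sum : public Expr<Sum<L, R>> {
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) {}
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }
private:
    const L& l_;
    const R& r_;
};

// Lazy scalar multiple of an expression.
template <class E>
class Scaled : public Expr<Scaled<E>> {
public:
    Scaled(double s, const E& e) : s_(s), e_(e) {}
    double operator[](std::size_t i) const { return s_ * e_[i]; }
    std::size_t size() const { return e_.size(); }
private:
    double s_;
    const E& e_;
};

template <class L, class R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <class E>
Scaled<E> operator*(double s, const Expr<E>& e) {
    return Scaled<E>(s, e.self());
}

// Builds R = 2*A + B on 2x3 matrices and returns the entry R(1, 2).
double demo_scaled_sum() {
    Tensor<2> A(2, 3), B(2, 3), R(2, 3);
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 3; ++j) {
            A(i, j) = static_cast<double>(i + j);
            B(i, j) = 1.0;
        }
    R = 2.0 * A + B;  // the whole right-hand side is evaluated inside operator=
    return R(1, 2);
}
```

In this sketch the expression nodes hold references to their operands, so an expression should be assigned to a tensor within the same statement rather than stored in an `auto` variable. A production library would additionally dispatch suitable expressions to BLAS routines, which this sketch does not attempt.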