| Parameter | Expected Values | Description |
|-----------|-----------------|-------------|
| \c EIGEN_BUILD_AOCL_BENCH | \c ON, \c OFF | Enable/disable AOCL benchmark compilation |
| \c EIGEN_AOCL_BENCH_FLAGS | Compiler flags string | Additional compiler flags for the benchmark, for example \c "-O3 -mavx512f -fveclib=AMDLIBM" |
| \c EIGEN_AOCL_BENCH_USE_MT | \c ON, \c OFF | Use multi-threaded AOCL libraries (\c ON recommended for performance) |
| \c EIGEN_AOCL_BENCH_ARCH | \c znver3, \c znver4, \c znver5, \c native, \c generic | Target AMD architecture (match your CPU generation) |
| \c CMAKE_BUILD_TYPE | \c Release, \c Debug, \c RelWithDebInfo | Build configuration (\c Release recommended for benchmarks) |
| \c CMAKE_C_COMPILER | \c clang, \c gcc | C compiler (clang recommended for AOCL) |
| \c CMAKE_CXX_COMPILER | \c clang++, \c g++ | C++ compiler (clang++ recommended for AOCL) |
| \c CMAKE_INSTALL_PREFIX | Installation path | Where to install Eigen headers |
| \c INCLUDE_INSTALL_DIR | Header path | Specific path for Eigen headers |
**Architecture Selection Guide:**
- \c znver3: AMD Zen 3 (EPYC 7003, Ryzen 5000 series)
- \c znver4: AMD Zen 4 (EPYC 9004, Ryzen 7000 series)
- \c znver5: AMD Zen 5 (EPYC 9005, Ryzen 9000 series)
- \c native: Auto-detect current CPU architecture
- \c generic: Generic x86-64 without specific optimizations
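As a rough illustration of the guide above, the following shell sketch maps a CPU model string (for instance the "Model name" line printed by \c lscpu) to one of the listed values. The function name \c aocl_arch_from_model and the EPYC name patterns are assumptions made for this example and are not part of Eigen or AOCL. Anything unrecognized falls back to \c native.

```shell
# Hypothetical helper: map a CPU model name to an EIGEN_AOCL_BENCH_ARCH value.
# Only the EPYC naming scheme (7xx3 / 9xx4 / 9xx5) is matched; every other
# model falls back to "native" so the compiler detects the build machine's CPU.
aocl_arch_from_model() {
  case "$1" in
    *"EPYC 7"??3*) echo znver3 ;;
    *"EPYC 9"??4*) echo znver4 ;;
    *"EPYC 9"??5*) echo znver5 ;;
    *)             echo native ;;
  esac
}

# Usage sketch (Linux, not executed here):
#   aocl_arch_from_model "$(lscpu | sed -n 's/^Model name: *//p')"
```

Ryzen model names are deliberately not matched, because a series number alone does not always identify the core generation. For those systems \c native is the safer choice when building on the machine that runs the benchmark.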
**Custom Compiler Flags Explanation:**
- \c -O3: Maximum optimization level
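- \c -mavx512f: Enable AVX-512 Foundation instructions; use it only when the target CPU supports them (Zen 4 and later)
- \c -fveclib=AMDLIBM: Let the compiler vectorize calls to math functions using AMD LibM (requires a compiler that accepts this option, such as clang)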
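Because \c -mavx512f only makes sense on CPUs that report AVX-512, the flag string can be assembled conditionally. The sketch below is one possible way to do that. The helper name \c aocl_bench_flags is hypothetical, and it assumes a compiler that accepts \c -fveclib=AMDLIBM.

```shell
# Hypothetical helper: build a value for EIGEN_AOCL_BENCH_FLAGS.
# $1: space-separated CPU feature flags, e.g. the "flags" line of /proc/cpuinfo.
aocl_bench_flags() {
  bench_flags="-O3"
  case " $1 " in
    # Add -mavx512f only when the CPU reports the avx512f feature.
    *" avx512f "*) bench_flags="$bench_flags -mavx512f" ;;
  esac
  # Assumption: the chosen compiler accepts -fveclib=AMDLIBM.
  echo "$bench_flags -fveclib=AMDLIBM"
}

# Usage sketch (Linux, not executed here), using the parameters from the table:
#   cmake .. -DEIGEN_BUILD_AOCL_BENCH=ON -DCMAKE_BUILD_TYPE=Release \
#     -DEIGEN_AOCL_BENCH_ARCH=native \
#     -DEIGEN_AOCL_BENCH_FLAGS="$(aocl_bench_flags "$(grep -m1 '^flags' /proc/cpuinfo)")"
```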
\subsection TopicUsingAOCL_Benchmark Building the AOCL Benchmark
After configuring Eigen, build the AOCL benchmark executable:
\code
cmake --build . --target benchmark_aocl -j$(nproc)
\endcode
This produces the \c benchmark_aocl executable, which exercises AOCL acceleration across a range of matrix sizes and operations.
**Running the Benchmark:**
\code
./benchmark_aocl
\endcode
The benchmark will automatically compare:
- Eigen's native performance vs AOCL-accelerated operations
- Matrix multiplication performance (BLIS vs Eigen)
- Vector math functions performance (LibM vs Eigen)
- Memory bandwidth utilization and cache efficiency
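To see how the results change with the thread count, the benchmark can be run several times with different \c OMP_NUM_THREADS values. The following is a minimal sketch. The function name \c run_bench_threads and the \c BENCH override variable are hypothetical, and it assumes the multi-threaded AOCL libraries honor \c OMP_NUM_THREADS.

```shell
# Hypothetical helper: run the benchmark once per thread count given as argument.
# BENCH may be set to another executable; it defaults to ./benchmark_aocl.
run_bench_threads() {
  bench="${BENCH:-./benchmark_aocl}"
  for t in "$@"; do
    echo "== OMP_NUM_THREADS=$t =="
    # The variable is set for this single invocation only.
    OMP_NUM_THREADS="$t" "$bench" || return 1
  done
}

# Usage sketch (not executed here):
#   run_bench_threads 1 8 "$(nproc)"
```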
\section TopicUsingAOCL_CMake CMake Integration
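When using CMake, a FindAOCL module (made visible to CMake through \c CMAKE_MODULE_PATH) can locate the AOCL libraries and provide imported targets to link against: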
\code
find_package(AOCL REQUIRED)
target_compile_definitions(my_target PRIVATE EIGEN_USE_AOCL_MT)
target_link_libraries(my_target PRIVATE AOCL::BLIS_MT AOCL::FLAME AOCL::LIBM)
\endcode
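For context, the snippet above could sit in a small consumer project like the one generated by the shell sketch below. The project name, the \c main.cpp source file, the \c cmake/ directory holding \c FindAOCL.cmake, and the helper name \c write_aocl_cmakelists are all assumptions made for illustration. Only the \c AOCL targets and the \c EIGEN_USE_AOCL_MT definition come from the snippet above.

```shell
# Hypothetical helper: write a minimal CMakeLists.txt into the directory $1.
write_aocl_cmakelists() {
  mkdir -p "$1" || return 1
  # The quoted 'EOF' keeps ${...} intact so CMake, not the shell, expands it.
  cat > "$1/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.10)
project(aocl_example CXX)

# Assumption: FindAOCL.cmake was copied into ./cmake of this project.
list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake")

find_package(Eigen3 REQUIRED NO_MODULE)
find_package(AOCL REQUIRED)

add_executable(my_target main.cpp)
target_compile_definitions(my_target PRIVATE EIGEN_USE_AOCL_MT)
target_link_libraries(my_target PRIVATE
  Eigen3::Eigen AOCL::BLIS_MT AOCL::FLAME AOCL::LIBM)
EOF
}

# Usage sketch (not executed here):
#   write_aocl_cmakelists ./aocl_example
```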
\section TopicUsingAOCL_Troubleshooting Troubleshooting
Common issues and solutions:
- **Link errors**: Ensure \c AOCL_ROOT is set and that the AOCL library directory is listed in \c LD_LIBRARY_PATH
- **No performance improvement**: Verify that your matrices and vectors are larger than the size threshold for AOCL dispatch
- **Thread contention**: Set \c OMP_NUM_THREADS to match your CPU core count
- **Architecture mismatch**: Use the \c -march value that matches your AMD processor (for example \c -march=znver4 on Zen 4)
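The environment-related items in this list can be checked before building or running. The sketch below prints one warning per setting that looks wrong and nothing otherwise. The function name \c aocl_env_check is hypothetical, and the library location \c $AOCL_ROOT/lib is an assumption that may differ between AOCL installations.

```shell
# Hypothetical helper: warn about environment settings from the list above.
# Assumption: the AOCL shared libraries live in "$AOCL_ROOT/lib".
aocl_env_check() {
  if [ -z "${AOCL_ROOT:-}" ]; then
    echo "warning: AOCL_ROOT is not set"
  else
    # Surround the path list with ':' so each entry can be matched exactly.
    case ":${LD_LIBRARY_PATH:-}:" in
      *":${AOCL_ROOT}/lib:"*) ;;
      *) echo "warning: ${AOCL_ROOT}/lib is not in LD_LIBRARY_PATH" ;;
    esac
  fi
  if [ -z "${OMP_NUM_THREADS:-}" ]; then
    echo "warning: OMP_NUM_THREADS is not set"
  fi
}

# Usage sketch: call aocl_env_check before running ./benchmark_aocl
```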
\section TopicUsingAOCL_Links Links
- AMD AOCL can be downloaded for free