Building AOCL-DLP (Deep Learning Primitives)

This document provides instructions for building the AOCL-DLP library from source code.

📋 System Requirements

Before building AOCL-DLP, ensure your system meets the following requirements:

Software

- CMake (≥ 3.26)
- C/C++ compiler with C11/C++17 support (e.g., GCC 11+, Clang 14+)
- OpenMP library (for multi-threading)
- ninja-build (optional, for Ninja generator support)

Note: GCC 11 introduced AVX512_BF16 support, which is required for bfloat16 GEMM.
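
The version floors above can be checked before configuring. The following is a minimal sketch, assuming a POSIX shell and a `sort` that supports `-V` (as in GNU coreutils). The helper name `version_ge` is hypothetical and not part of AOCL-DLP.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V` (version sort), as provided by GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed CMake against the 3.26 minimum, if CMake is present.
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" 3.26; then
        echo "CMake $cmake_version: OK"
    else
        echo "CMake $cmake_version: too old, need 3.26 or newer"
    fi
else
    echo "CMake not found"
fi

# Compare the installed GCC against the suggested GCC 11 floor, if GCC is present.
if command -v gcc >/dev/null 2>&1; then
    gcc_version=$(gcc -dumpversion)
    if version_ge "$gcc_version" 11; then
        echo "GCC $gcc_version: OK"
    else
        echo "GCC $gcc_version: older than 11"
    fi
fi
```

The same helper can be reused for Clang by comparing its reported version against 14.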

Hardware

- x86 CPU with AVX2/FMA3 support
- AVX512 support for enhanced performance
- AVX512_VNNI support for int8 GEMM
- AVX512_BF16 support for bfloat16 GEMM
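
On Linux, the features listed above can be looked up in `/proc/cpuinfo`. This is a rough sketch: the helper `has_cpu_flag` is hypothetical, and the kernel flag names (`avx2`, `fma`, `avx512f`, `avx512_vnni`, `avx512_bf16`) are assumed to be the ones that correspond to the requirements in this list.

```shell
# Hypothetical helper: succeed when CPU flag $1 appears in the cpuinfo file $2
# (defaults to /proc/cpuinfo, which exists on Linux only).
has_cpu_flag() {
    grep -m 1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null \
        | tr ' \t' '\n\n' \
        | grep -qxF "$1"
}

# Report each feature; a missing /proc/cpuinfo simply yields "no" for all of them.
for flag in avx2 fma avx512f avx512_vnni avx512_bf16; do
    if has_cpu_flag "$flag"; then
        echo "$flag: yes"
    else
        echo "$flag: no"
    fi
done
```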

Build Configuration Options

AOCL-DLP uses CMake for its build system with several configurable options:

| Option | Default | Description |
|--------|---------|-------------|
| **General Build Options** | | |
| BUILD_EXAMPLES | OFF | Build example programs |
| BUILD_BENCHMARKS | OFF | Build benchmark programs |
| BUILD_TESTING | OFF | Build test programs (requires DLP_TESTING_CTEST_DISABLED=OFF for CTest) |
| BUILD_DOXYGEN | OFF | Build Doxygen documentation |
| BUILD_SPHINX | OFF | Build Sphinx documentation |
| CMAKE_EXPORT_COMPILE_COMMANDS | OFF | Generate compile_commands.json for tooling |
| CMAKE_BUILD_TYPE | Release | Build type ("Release", "Debug", "RelWithDebInfo", "Coverage") |
| CMAKE_INSTALL_PREFIX | /usr/local | Installation directory |
| **Compiler Options** | | |
| CMAKE_CXX_COMPILER | system | Specify C++ compiler (e.g., g++) |
| CMAKE_C_COMPILER | system | Specify C compiler (e.g., gcc) |
| **Threading & Sanitizers** | | |
| DLP_THREADING_MODEL | "none" | Threading model ("none", "openmp") |
| DLP_ENABLE_OPENMP | ON | Override OpenMP support (auto-enabled by threading model) |
| DLP_OPENMP_ROOT | "" | Custom path to OpenMP installation |
| DLP_USE_LLVM_OPENMP | OFF | Force using LLVM OpenMP implementation |
| DLP_ENABLE_ASAN | OFF | Enable AddressSanitizer |
| DLP_ENABLE_TSAN | OFF | Enable ThreadSanitizer |
| DLP_ENABLE_UBSAN | OFF | Enable UndefinedBehaviorSanitizer |
| DLP_TESTING_CTEST_DISABLED | ON | Disable CTest integration (set to OFF to enable with BUILD_TESTING) |
| **Testing Options** | | |
| DLP_TESTING_LINK_STATIC | ON | Link tests with static AOCL-DLP library for better performance |
| DLP_TESTING_ENABLE_DETAILED_DEBUG | OFF | Enable detailed debug information for tests |
| **Benchmarking Options** | | |
| DLP_BENCHMARKS_LINK_STATIC | ON | Link benchmarks with static AOCL-DLP library for better performance |
| **Build Target Options** | | |
| DLP_EXAMPLES_LINK_STATIC | ON | Link examples with static AOCL-DLP library for better performance |
| **Kernel Dispatch Table** | | |
| DLP_KDT_TABLE_SIZE | 16 | Set table size for the Kernel Dispatch Table |
| DLP_KDT_CHAIN_SIZE | 128 | Set table chain size for the Kernel Dispatch Table |

Note:

- Options can be set via `-D<option>=<value>` when invoking cmake.
- Some settings, such as the generator (-G Ninja), are cmake command-line arguments rather than -D variables.
- For a full list of options, see the modular cmake files: cmake/dlp_core_options.cmake, cmake/dlp_testing.cmake, cmake/dlp_benchmark.cmake, cmake/dlp_build_options.cmake, and cmake/dlp_documentation.cmake.
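
To confirm which values a configure step actually recorded, the build directory's `CMakeCache.txt` can be inspected. The sketch below assumes the standard cache entry layout `NAME:TYPE=VALUE`. The helper name `cache_value` is hypothetical.

```shell
# Hypothetical helper: print the cached value of option $1 from a CMake cache
# file ($2, defaulting to ./CMakeCache.txt). Entries look like NAME:TYPE=VALUE.
cache_value() {
    sed -n "s/^$1:[A-Z]*=//p" "${2:-CMakeCache.txt}"
}

# Example, run from inside the build directory after configuring:
#   cache_value DLP_THREADING_MODEL
#   cache_value CMAKE_BUILD_TYPE
```

CMake can also list cached options with their help strings without reconfiguring, using `cmake -N -LH <build-dir>`.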

Quick Start Build

Linux

Clone and enter project:

```
git clone <repository-url> && cd aocl-dlp
```

Create an out-of-source build directory:
```
mkdir -p build && cd build
```

Configure (choose generator):

```
# Default (GNU Make)
cmake ..

# Ninja (fast incremental builds)
cmake -G Ninja ..
```

Build:
```
# Make
make -j$(nproc)

# Ninja
ninja
```
For installation instructions, see INSTALL.md.
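
The same configure and build steps can also be driven from the project root through CMake's generator-independent front end (`cmake -S`, `cmake -B`, `cmake --build`). The sketch below is illustrative only: the `run` and `quick_build` helpers and the `DRY_RUN` variable are hypothetical names, not part of the project.

```shell
# Hypothetical helper: print each command, and run it unless DRY_RUN=1 is set.
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Configure and build without changing directories.
# $1: source directory (default .), $2: build directory (default build),
# $3: parallel jobs (default: output of nproc).
quick_build() (
    src=${1:-.}
    bld=${2:-build}
    jobs=${3:-$(nproc)}
    run cmake -S "$src" -B "$bld" &&
    run cmake --build "$bld" --parallel "$jobs"
)

# Usage from the project root:
#   quick_build                 # configure and build into ./build
#   DRY_RUN=1 quick_build       # only print the commands
```

Because `cmake --build` calls whichever native tool was configured, the same function works for both the Make and Ninja generators.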

Advanced Build Configuration

Enabling Additional Components

To enable benchmarks:

```
cmake -DBUILD_BENCHMARKS=ON ..
```

To enable testing with full CTest integration:

```
cmake -DBUILD_TESTING=ON -DDLP_TESTING_CTEST_DISABLED=OFF ..
```

Note: Both BUILD_TESTING=ON and DLP_TESTING_CTEST_DISABLED=OFF are required for full CTest integration. Using only BUILD_TESTING=ON builds tests but uses traditional testing instead of Google Test discovery.

Threading Model Configuration

AOCL-DLP supports the following threading models:

```
# No threading (default)
cmake -DDLP_THREADING_MODEL=none ..

# Enable OpenMP threading
cmake -DDLP_THREADING_MODEL=openmp ..
```

Note: Setting DLP_THREADING_MODEL=openmp automatically enables OpenMP support. The separate DLP_ENABLE_OPENMP option (default: ON) provides additional control and can disable OpenMP entirely with -DDLP_ENABLE_OPENMP=OFF.
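
After an OpenMP-enabled build, it can be useful to see which OpenMP runtime a binary was linked against, for example when experimenting with DLP_USE_LLVM_OPENMP. The following is a small sketch that classifies `ldd` output. The helper name `openmp_runtime` is hypothetical, and the library names it looks for (libgomp, libomp, libiomp5) are the usual GNU, LLVM, and Intel runtime names rather than anything specific to AOCL-DLP.

```shell
# Hypothetical helper: read `ldd` output on stdin and name the OpenMP runtime it lists.
openmp_runtime() {
    awk '
        /libgomp\.so/  { found = "GNU libgomp" }
        /libomp\.so/   { found = "LLVM libomp" }
        /libiomp5\.so/ { found = "Intel libiomp5" }
        END { print (found ? found : "none") }
    '
}

# Usage (the path to the built binary or shared library is up to your build tree):
#   ldd path/to/binary | openmp_runtime
```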

For custom OpenMP installation:

```
cmake -DDLP_THREADING_MODEL=openmp -DDLP_OPENMP_ROOT=/path/to/openmp ..
```

Static vs Dynamic Linking

By default, tests, benchmarks, and examples are linked with the static AOCL-DLP library for better performance. You can control this behavior:

Enable static linking (default):

```
cmake -DBUILD_EXAMPLES=ON -DDLP_EXAMPLES_LINK_STATIC=ON ..
cmake -DBUILD_TESTING=ON -DDLP_TESTING_LINK_STATIC=ON ..
cmake -DBUILD_BENCHMARKS=ON -DDLP_BENCHMARKS_LINK_STATIC=ON ..
```

Use dynamic linking:

```
cmake -DBUILD_EXAMPLES=ON -DDLP_EXAMPLES_LINK_STATIC=OFF ..
cmake -DBUILD_TESTING=ON -DDLP_TESTING_LINK_STATIC=OFF ..
cmake -DBUILD_BENCHMARKS=ON -DDLP_BENCHMARKS_LINK_STATIC=OFF ..
```

Verify linking with ldd:

```
# Static linking - no libaocl-dlp.so should appear
ldd examples/classic/simple_gemm_f32

# Dynamic linking - libaocl-dlp.so should appear
ldd examples/classic/simple_gemm_f32
```

Note: Static linking provides better performance by eliminating dynamic library loading overhead, which is especially beneficial for benchmarks and performance testing.
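
The `ldd` check can be scripted so that it prints a single verdict. This is a minimal sketch; the helper name `dlp_linkage` is hypothetical, and it only looks for the `libaocl-dlp.so` name mentioned above.

```shell
# Hypothetical helper: read `ldd` output on stdin and report how AOCL-DLP was linked.
dlp_linkage() {
    if grep -q 'libaocl-dlp\.so'; then
        echo dynamic
    else
        echo static
    fi
}

# Usage, from the build directory:
#   ldd examples/classic/simple_gemm_f32 | dlp_linkage
```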

Specifying Build Type

You can specify different build types:

```
# Debug build
cmake -DCMAKE_BUILD_TYPE=Debug ..

# Release build (default)
cmake -DCMAKE_BUILD_TYPE=Release ..

# Release with debug info
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..

# Coverage build (for code coverage analysis)
cmake -DCMAKE_BUILD_TYPE=Coverage ..
```

Configuring Kernel Dispatch Table Size

By default, the Kernel Dispatch Table (KDT) is configured with 16 buckets and 128 chains for optimal memory usage and quick kernel queries. If necessary, it can be configured manually as follows:

```
cmake -DDLP_KDT_TABLE_SIZE=<number_of_buckets> -DDLP_KDT_CHAIN_SIZE=<number_of_chains> ..
```

Benchmarking

Enable and run tests and benchmarks in one place:
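
One way this could look is sketched below, using only the options documented above plus the standard `ctest` front end. The `run` and `tests_and_benchmarks` helpers and the `DRY_RUN` variable are hypothetical. The benchmark executables' names and locations are not stated in this document, so running them is left out.

```shell
# Hypothetical helper: print each command, and run it unless DRY_RUN=1 is set.
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Configure with tests (CTest enabled) and benchmarks, build, then run the tests.
# $1: build directory (default build), $2: parallel jobs (default: output of nproc).
tests_and_benchmarks() (
    bld=${1:-build}
    jobs=${2:-$(nproc)}
    run cmake -S . -B "$bld" \
        -DBUILD_TESTING=ON -DDLP_TESTING_CTEST_DISABLED=OFF -DBUILD_BENCHMARKS=ON &&
    run cmake --build "$bld" --parallel "$jobs" &&
    run ctest --test-dir "$bld" --output-on-failure
)

# Usage from the project root:
#   tests_and_benchmarks
```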

Developer Tips

Out-of-tree builds: Always build in a separate build/ directory to keep sources clean.

Custom install prefix:

```
cmake -DCMAKE_INSTALL_PREFIX=/opt/aocl-dlp ..
```

Verbose output:
```
# Make
make VERBOSE=1

# Ninja
ninja -v
```
Clean cache:
```
rm -rf build/* && cd build && cmake ..
```
Parallel builds: Leverage all cores with -j$(nproc) (Make) or default Ninja parallelism.
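
On systems where `nproc` is not available, the job count can be derived with a fallback. This is a small sketch; the helper name `cpu_count` is hypothetical.

```shell
# Hypothetical helper: print the number of online CPUs, falling back to 1.
cpu_count() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Usage:
#   make -j"$(cpu_count)"
```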

CMake Build System Overview

AOCL-DLP uses a modern CMake build system structured as follows:

- Main CMakeLists.txt: Orchestrates the overall build process
- cmake/dlp_variables.cmake: Sets project variables, languages and standards
- cmake/dlp_core_options.cmake: Defines core library options and threading models
- cmake/dlp_testing.cmake: Defines testing options and infrastructure
- cmake/dlp_benchmark.cmake: Defines benchmarking options and infrastructure
- cmake/dlp_build_options.cmake: Defines build target options (examples, sanitizers)
- cmake/dlp_documentation.cmake: Defines documentation options and generation
- cmake/dlp_dependencies.cmake: Manages OpenMP dependencies
- cmake/dlp_compiler_flags_linux.cmake: Sets compiler flags for Linux
- cmake/dlp_compiler_flags_windows.cmake: Sets compiler flags for Windows
- cmake/dlp_install.cmake: Manages installation rules
- cmake/dlp_extensions.cmake: Defines file extensions

Troubleshooting

Threading Model Issues

If you encounter issues with the selected threading model:

Make sure the required libraries are installed on your system:
- For OpenMP: OpenMP development libraries

If OpenMP is installed but not detected, point the build at its location explicitly:

```
cmake -DDLP_THREADING_MODEL=openmp -DDLP_OPENMP_ROOT=/path/to/openmp ..
```
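
Before digging into CMake detection, it can help to confirm that the compiler itself can build a trivial OpenMP program. The sketch below assumes a GCC- or Clang-style driver that accepts `-fopenmp`. The helper name `compiler_supports_openmp` is hypothetical.

```shell
# Hypothetical helper: succeed when compiler $1 (default: cc) can compile and
# link a minimal OpenMP program with -fopenmp.
compiler_supports_openmp() (
    cc=${1:-cc}
    tmp=$(mktemp -d) || exit 2
    printf '%s\n' \
        '#include <omp.h>' \
        'int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }' \
        > "$tmp/omp_check.c"
    "$cc" -fopenmp "$tmp/omp_check.c" -o "$tmp/omp_check" >/dev/null 2>&1
    status=$?
    rm -rf "$tmp"
    exit "$status"
)

# Report the result for the compiler named in CC, or the system default.
if compiler_supports_openmp "${CC:-cc}"; then
    echo "OpenMP: compiler check passed"
else
    echo "OpenMP: compiler check failed (missing compiler, headers, or runtime?)"
fi
```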

Compiler Requirements

Make sure your compiler supports:

- C11 standard for C code
- C++17 standard for C++ code

Build Performance

To speed up the build process, use parallel compilation:

```
make -j$(nproc)  # Linux
```

Known Issues

- Warnings may appear during compilation (-Werror is currently disabled)
- Some platforms may require specific environment setup for threading model detection
