
Commit

Merge pull request #18 from JPenuchot/jfalcou-remarks
Joel's remarks
JPenuchot authored Jan 26, 2023
2 parents 430189e + 99bbee4 commit e9f9ee7
Showing 1 changed file with 24 additions and 26 deletions.
50 changes: 24 additions & 26 deletions paper.md
@@ -2,7 +2,7 @@
title: 'ctbench - compile-time benchmarking and analysis'
tags:
- C++
- meta-programming
- compilation
- benchmarking
- library
@@ -26,20 +26,20 @@ bibliography: paper.bib

# Summary

With libraries like Eigen[@eigen], Blaze[@blazelib], or CTRE[@ctre] being
developed with large template meta-programmed implementations, we are seeing
increasing computing needs at compile
time. These needs might grow even larger as C++ embeds more features over time
to support and extend this kind of practice, like compile-time
containers[@more-constexpr-containers] or static reflection[@static-reflection].
However, there is still no clear-cut methodology to compare the performance
impact of different meta-programming strategies. And as new C++ features
allow for new techniques with claimed better compile-time performance, no
proper methodology is available to back up those claims.

This paper introduces **ctbench**, which is a set of tools for compile-time
benchmarking and analysis in C++. It aims to provide developer-friendly tools to
declare and run benchmarks, then aggregate, filter out, and plot the data to
analyze it. As such, **ctbench** is meant to become the first layer for a proper
scientific methodology for analyzing compile-time program behavior.

<!-- Plan -->
@@ -49,14 +49,14 @@ benchmarking and establish the limits of what these tools can do.

# Statement of need

C++ template meta-programming raised interest for allowing computing libraries to
offer great performance with a very high level of abstraction. Instead of
interpreting representations of calculations at runtime, these representations
are built at compile time and transformed directly into their own programs.

As meta-programming became easier with C++11 and C++17, it became more mainstream
and consequently, developers have to put up with longer compilation times without
being able to explain them. Therefore, being able to measure compilation times is
increasingly important, and being able to explain them as well. A first
generation of tools aims to tackle this issue with their own specific
methodologies:
@@ -65,7 +65,7 @@ methodologies:
A-B compile-time comparisons in a web browser,
- Metabench[@metabench] instantiates variably sized benchmarks using embedded
Ruby (ERB) templating and plots compiler execution time, allowing scaling
analyses of meta-programs,
- Templight[@templight] adds Clang template instantiation inspection
capabilities with debugging and profiling tools.

@@ -75,20 +75,18 @@ the `-ftime-trace` flag. Its output contains data that can be directly linked to
symbols in the source code, making it easier to study the impact of specific
symbols on various stages of compilation. The output format is a JSON file meant
to be compatible with Chrome's flame graph visualizer, that contains a series of
time events with optional metadata like the mangled C++ symbol or the file
related to an event. The profiling data can then be visualized using tools such
as Google's [Perfetto UI](https://ui.perfetto.dev/).
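
To make the event structure concrete, here is a minimal sketch of the
information one such time event carries, modeled as a C++ record. The JSON
field names shown in the comment are an illustrative assumption based on the
Chrome trace event format, not a normative description of Clang's output:

```cpp
// Sketch of one -ftime-trace event as described above. A simplified event in
// the JSON file looks roughly like (field names assumed for illustration):
//   { "name": "InstantiateFunction", "ts": 1234, "dur": 567,
//     "args": { "detail": "some_mangled_or_pretty_symbol" } }

#include <cstdint>
#include <optional>
#include <string>

struct time_trace_event {
  std::string name;                  // timer section, e.g. "InstantiateFunction"
  std::uint64_t timestamp_us;        // "ts": when the section started
  std::uint64_t duration_us;         // "dur": how long it lasted
  std::optional<std::string> detail; // optional symbol or file metadata
};
```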

![Perfetto UI displaying a sample Clang time trace file](docs/images/perfetto-ui.png)

Clang's profiler data is very exhaustive and insightful; however, there is no
tooling to make sense of it in the context of variable size compile-time
benchmarks. **ctbench** tries to bridge the gap by providing a tool to analyze
this valuable data. It also improves upon existing tools by providing a solution
that's easy to integrate into existing CMake projects, and generates graphs in
various formats that are trivially embeddable in documents like research papers,
web pages, or documentation. Additionally, relying on persistent configuration,
benchmark declaration, and description files provides strong guarantees for
benchmark reproducibility, as opposed to web tools or interactive profilers.
@@ -98,7 +96,7 @@ benchmark reproducibility, as opposed to web tools or interactive profilers.
Originally inspired by Metabench[@metabench], **ctbench** development was
driven by the need for a similar tool that allows the observation of Clang's
time-trace files to help get a more comprehensive view on the impact of
meta-programming techniques on compile times. A strong emphasis was put on
developer friendliness, project integration, and component reusability.

**ctbench** provides:
@@ -109,14 +107,14 @@ developer friendliness, project integration, and component reusability.
Although CMake is not a proper programming language, it is used as the main
API for **ctbench** as most C++ developers are already familiar with it.

- a set of JSON-configurable plotters with customizable data aggregation
features, which can be reused as a C++ library

The core library provides data representations to handle benchmark cases
instantiated at several sizes, each instance being repeated at least once. It
also provides tools to aggregate, filter, and sort data from time-trace
events, as well as various plotters that provide different aggregation and
visualisation strategies. The plotters can generate files in various formats
thanks to the Sciplot[@sciplot] library, and they are highly configurable
through JSON configuration files that are well documented. Default
configuration files can be generated using a dedicated CLI tool.
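
As a minimal sketch of what a variably sized benchmark case can look like, the
snippet below assumes the driver recompiles one translation unit per size and
passes the size as a preprocessor definition; the `BENCHMARK_SIZE` macro name
and the fallback default are illustrative assumptions, not **ctbench**'s
documented interface:

```cpp
// Illustrative benchmark translation unit, recompiled once per size.
// Assumption: the size is injected on the command line, e.g.
//   clang++ -ftime-trace -DBENCHMARK_SIZE=16 -c benchmark.cpp

#include <cstddef>

#ifndef BENCHMARK_SIZE
#define BENCHMARK_SIZE 1 // fallback so the sketch compiles standalone
#endif

// A toy metaprogram whose template instantiation work grows with the size.
template <std::size_t N> struct sum_upto {
  static constexpr std::size_t value = N + sum_upto<N - 1>::value;
};

template <> struct sum_upto<0> {
  static constexpr std::size_t value = 0;
};

// Force the instantiation chain; the resulting object file stays empty on
// purpose, so only compile-time work is measured.
static_assert(sum_upto<BENCHMARK_SIZE>::value ==
              (BENCHMARK_SIZE * (BENCHMARK_SIZE + 1)) / 2);
```
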
@@ -127,7 +125,7 @@ time-trace file, making it partially compatible with GCC as well.

In addition to **ctbench**'s time-trace handling, it has a compatibility mode
for compilers that do not support it like GCC. This mode works by measuring
compiler execution time just like Metabench[@metabench] and generating a time-trace file
that contains compiler execution time. Moreover, the tooling makes defining
compilers per-target possible within a CMake project, allowing black-box
compiler performance comparisons between GCC and Clang for example or
@@ -274,13 +272,13 @@ hierarchy to take a look at frontend and backend execution times.

The backend is not impacted here, supposedly because this is purely a
compile-time program, and the output program is empty. However, this might not be
the case for all meta-programs, and meta-programs might have different impacts on
the backend as they may generate programs in different ways (i.e. generate more
symbols, larger symbols, more data structures, etc.).
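
As a hedged illustration of that difference (this example is not taken from
the benchmark discussed here), compare a computation that is folded away
entirely at compile time with one whose instantiations each emit a symbol for
the backend to process:

```cpp
#include <array>
#include <cstddef>

// (a) Purely compile-time: the result folds into a constant and the backend
// sees an essentially empty program.
template <std::size_t N> constexpr std::size_t triangular() {
  std::size_t acc = 0;
  for (std::size_t i = 0; i <= N; ++i)
    acc += i;
  return acc;
}
static_assert(triangular<64>() == 64 * 65 / 2);

// (b) Code-generating: each explicit instantiation emits a distinct function
// symbol, so backend work grows with the number of instantiations.
template <std::size_t N>
std::size_t sum(std::array<std::size_t, N> const &values) {
  std::size_t acc = 0;
  for (auto v : values)
    acc += v;
  return acc;
}

template std::size_t sum<8>(std::array<std::size_t, 8> const &);
template std::size_t sum<16>(std::array<std::size_t, 16> const &);
template std::size_t sum<32>(std::array<std::size_t, 32> const &);
```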

![Total InstantiateFunction](docs/images/Total_InstantiateFunction.svg){width=100%}

The Total Instantiate function timer is an interesting one as it explicitly
targets function instantiation time. Note that timers that are prefixed with
"Total" measure the total time spent in a timer section, regardless of the
specific symbol or source associated to its individual timer events.
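
A small sketch of that aggregation rule, assuming events expose only a name
and a duration (this illustrates the rule itself, not **ctbench**'s or Clang's
actual implementation):

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct event {
  std::string name;     // e.g. "InstantiateFunction"
  std::uint64_t dur_us; // duration of one individual timer section
};

// A "Total X" value sums the durations of every event named X, regardless of
// which symbol or source file each individual event refers to.
std::unordered_map<std::string, std::uint64_t>
totals_per_timer(std::vector<event> const &events) {
  std::unordered_map<std::string, std::uint64_t> totals;
  for (auto const &e : events)
    totals[e.name] += e.dur_us;
  return totals;
}
```
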
@@ -296,7 +294,7 @@ instantiated within this function.

This level of detail and granularity in the analysis of compile-time benchmarks
was never reached before, and may help us set good practices to improve the
compile-time performance of meta-programs.

# Statement of interest

@@ -312,7 +310,7 @@ presented at the Meeting C++ 2022[@meetingcpp22] technical conference.
C++20

- [Rule of Cheese](https://github.com/jpenuchot/rule-of-cheese): A collection of
compile-time microbenchmarks to help set better C++ meta-programming guidelines
to improve compile-time performance

# Acknowledgements
