Efficient Testing with VUnit – Test Selection, Test Prioritization, and Load-Balancing
The latest VUnit pre-releases introduce a set of features aimed at improving test efficiency by reducing execution time, increasing coverage, and optimizing the use of available compute resources. This is the first in a series of blog posts that describe these new features, beginning with test selection, test prioritization, and load-balancing.
📌Note: The next official VUnit release will be v5.0.0, a major version that allows for backward-incompatible changes. While these changes are minor, we aim to group them into one major release to avoid needing v6.0.0 soon after. To support this, we're making a series of pre-releases that have the same level of quality as the official releases. The only difference is that you have to use the --pre option when installing via pip (see installation instructions for details).
Background
Before diving into the details of these new features, let’s take a step back and look at the context that brought us here.
Both intuition and decades of experience in FPGA/ASIC development, and engineering as a whole, make one thing clear: finding defects early is far better than discovering them late. This principle is not just common sense. It's also supported by several studies showing that the cost of fixing bugs can increase exponentially the longer they go undetected.
Undetected defects don't simply sit idle but accumulate impact over time. What begins as a minor oversight can rapidly grow into a costly and complex problem. It gets built upon by other components, replicated through reuse, and eventually hidden deep within the system. As the design evolves, diagnosing the root cause becomes harder, and fixing it may require rework across multiple teams or subsystems.
Despite this, EDA companies have centered their verification strategies around dedicated system-level verification teams, often working late in the development process. While this model has its value, it places verification on the right-hand side of the exponential cost curve. To improve on this, the first step is to shift verification efforts leftward. This means involving code developers directly in the verification process, allowing defects to be caught earlier when they are less expensive and easier to correct.
However, this shift isn't just about process; it's also about tools and accessibility. To engage developers in the first place, we need test frameworks that align with the languages they use. The industry's heavy focus on SystemVerilog and UVM-based verification creates a significant barrier, especially for teams working in VHDL. For these developers, being asked to step outside their language and tool ecosystem to write tests in UVM is often a non-starter.
Taking verification seriously during the coding phase is a critical first step, but we should also push further left within that phase. Instead of writing large blocks of functionality before testing, we should design, implement, and verify code in small, manageable increments. Each iteration forms a feedback loop, providing immediate insight into whether the design behaves as intended.
But it's not just about functional correctness. Iterative development also helps ensure the design is clean, understandable, testable, integrable, and maintainable. And by working in short cycles, we stay more responsive to change, whether it's a shift in requirements, constraints, or assumptions.
This highly iterative approach to verification places new demands on the test framework, and these demands have shaped every aspect of VUnit's design. For example:
Full test automation: Frequent iterations mean frequent test runs. This is only practical with full automation, from test and code discovery through compilation, simulation, and reporting (see the minimal run script after this list). Without automation, the quick feedback loop breaks down.
Error-type agnostic detection: The framework must reliably capture errors from all sources, including compilation, elaboration, simulator internals, newly developed code, and legacy or third-party code that may use error mechanisms other than those provided by VUnit. If developers can't trust the framework to catch these errors automatically, true automation is lost. So is the iterative workflow.
Parallel/multi-threaded simulation execution: Efficient use of available CPU resources is critical. A machine with 8 cores and 16 threads can, in theory, achieve up to a 16x speed-up. However, the benefits go beyond raw performance. Reduced test times encourage more frequent execution. A one-minute test suite is more likely to be run often than one that takes 16 minutes.
Support for both open source and commercial simulators: Many VUnit users rely on a mix of simulators. Open source simulators offer an unlimited number of licenses, making them well suited for multi-threaded test execution. Commercial licenses are limited in number but can offer unique capabilities such as support for mixed-language simulation and advanced debugging. It is important that jumping back and forth between simulators doesn't require any changes to the code base, i.e., the test framework must provide an environment that is agnostic of the simulator being used.
Minimal testbench overhead: Developers need to create and iterate on testbenches quickly, without dealing with heavy and complex boilerplate code.
Independent test cases: Testbenches should support multiple, independently executable test cases. This increases parallelism and allows test outcomes to be reported individually.
Focused error reporting: Traditional EDA workflows often produce overwhelming log dumps. VUnit provides clear, actionable feedback, removing the need for manual log-sifting or custom scripts.
Incremental and minimal compilation: Recompiling only the files that have changed among those required to run a test suite helps reduce test turnaround times and supports faster development iterations.
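To make the automation point concrete, here is a minimal run script using the standard, documented VUnit Python API. The library name and file locations are placeholders; everything else is the regular interface:

```python
# run.py -- a minimal VUnit run script (standard VUnit API).
# A single main() call covers discovery, incremental compilation,
# (optionally multi-threaded) simulation, and reporting.
from pathlib import Path
from vunit import VUnit

# Parse the command line: test patterns, -p/--num-threads, --clean, etc.
vu = VUnit.from_argv()

# Create a library; VUnit scans the files for testbenches and test cases.
lib = vu.add_library("lib")
lib.add_source_files(Path(__file__).parent / "src" / "*.vhd")
lib.add_source_files(Path(__file__).parent / "test" / "*.vhd")

# Compile what changed, simulate, and report -- all automated.
vu.main()
```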
This brief overview highlights the importance of test framework design in supporting an iterative development process. With this context in place, we now turn to the new features introduced in the latest pre-releases.
Test Prioritization
Before the latest features, VUnit executed tests in the order they were added, as shown in the single-threaded test run in Figure 1. The tests in testbench 1 (t1.1 to t1.5) are executed first, followed by the single test case in testbench 2 (t2.1).
This behavior remains unchanged during the first run with the latest VUnit version. However, differences appear in the second run, once timing information from the previous run is available. VUnit uses this data to sort tests by execution time, running the shortest tests first. The reasoning is to prioritize the most efficient tests, those that cover a lot of functional ground in a short time, thereby reducing the time to potential bug discovery.
Short tests are often more efficient in this context. While there are no strict rules, shorter tests are typically directed unit tests designed to quickly verify specific aspects of the design at a low level. Among the longer tests we find higher-level tests, which often take more time to reach the point where they begin adding verification value. Randomized tests also fall into this category. They are valuable for uncovering unanticipated issues but are less efficient by nature, as they explore the design space without knowing exactly what to look for. When such tests uncover defects, it is common practice to create more focused, directed tests that address the issue explicitly. As a result, the original randomized test also becomes less efficient over time.
Running tests in execution time order is one strategy. VUnit also uses higher-level strategies based on test categories. In the first run, there was no historical data available and all tests were treated as new. In the second run, all tests had a history of passing, placing them in a different category. When multiple categories are present, VUnit assigns a relative priority to each and executes tests category by category. Within each category, tests are ordered by execution time, from shortest to longest.
In Figure 3, a new test, t3.1, has been added. Since new tests are more likely to fail than previously passing ones, they are given a higher execution priority.
When code changes are introduced, there is a risk of breaking existing functionality. VUnit analyzes code dependencies to determine which tests are likely to be affected and prioritizes them accordingly. In Figure 4, changes impacted tests t2.1 and t3.1. The red color indicates that both tests failed due to these changes.
Suppose a code change is made to fix t3.1 while all other tests remain unaffected. In this scenario, three categories of tests are present: tests that passed, tests that failed and are affected by changes, and tests that failed but are unaffected. Tests that have failed in the past are assumed to be more likely to fail again and are executed first. Since t2.1 was not affected by the latest changes, it is expected to continue failing and is prioritized ahead of t3.1, which may now pass. This behavior is shown in Figure 5.
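The ordering rules described in this section can be summarized in a few lines of Python. This is a simplified sketch of the described behavior, not VUnit's actual implementation, and the exact rank of new tests relative to failing ones is an assumption on my part:

```python
# Simplified sketch of the prioritization described above -- an
# illustration, not VUnit's actual implementation.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Test:
    name: str
    duration: Optional[float]  # seconds in the previous run, None if new
    passed: Optional[bool]     # None if the test has never run
    affected: bool             # touched by the latest code changes

def category_rank(t: Test) -> int:
    if t.passed is False and not t.affected:
        return 0  # failed and unaffected by fixes: expected to fail again
    if t.passed is False:
        return 1  # failed, but a fix may now be in place
    if t.passed is None:
        return 2  # new test: more likely to fail than a passing one
    return 3      # previously passing

def schedule(tests: List[Test]) -> List[Test]:
    # Category by category; shortest execution time first within each.
    return sorted(tests, key=lambda t: (category_rank(t), t.duration or 0.0))
```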
Test Selection
When modifying code, it is often sufficient to run only the tests affected by the change. This can be done by passing test name patterns as arguments to the run script. Alternatively, the new --changed option can be used. For example, if the fix required to pass t2.1 only affects its corresponding testbench, using --changed will result in a single test being executed, as shown in Figure 6.
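Both mechanisms are driven from the run script's command line. The test name pattern below is illustrative; --changed is the new option described above:

```sh
# Run only the tests whose names match a pattern (illustrative name)
python run.py "lib.tb_fifo.*"

# Run only the tests affected by HDL changes since the last run
python run.py --changed
```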
📌Note: --changed only considers HDL file changes and the tests affected by them. Changes to input files or top-level generics are not currently taken into account. Improvements in this area are planned for future releases.
📌Note: Test selection is not new to the VUnit ecosystem. In 2017, the University of Texas at Austin published a paper on Eksen, demonstrating how VUnit could support change-based test selection to reduce regression time. What is new is the integration of this and other techniques into the core of VUnit, making them available to all users without the need for custom extensions.
Load-Balancing
Until now, the shortest-test-first strategy has been used within each test category. This approach works well in single-threaded runs but may lead to imbalanced workloads when using multiple threads, as illustrated in Figure 7.
The total execution time for all tests is approximately 30 seconds, making 15 seconds the ideal execution time per thread when using two threads. While this ideal is not always achievable, it provides a useful target. To improve efficiency in multi-threaded runs, VUnit applies a combined scheduling strategy. It begins by executing the shortest tests first, then transitions to a load-balancing strategy as the 15-second mark is approached.
Scheduling tests for optimal load-balancing is an NP-hard problem, even when execution times are known. VUnit therefore applies a simplified method that adapts to observed execution times, which may vary between runs even when no code changes have been made. In this case, the strategy is sufficient to achieve perfect load-balancing, as shown in Figure 8.
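A simplified sketch of such a combined strategy is shown below, assuming known per-test durations. It is an illustration of the idea, not VUnit's actual algorithm, and the switch-over point is arbitrary:

```python
# Simplified sketch of a combined shortest-first/load-balancing schedule
# for one test category -- an illustration, not VUnit's actual algorithm.
import heapq
from typing import Dict, List, Tuple

def plan_threads(tests: List[Tuple[str, float]],
                 num_threads: int) -> Dict[int, List[str]]:
    """tests are (name, expected_duration) pairs; returns thread -> names."""
    target = sum(d for _, d in tests) / num_threads  # e.g. 30 s / 2 = 15 s
    pending = sorted(tests, key=lambda t: t[1])      # shortest first

    heap = [(0.0, tid) for tid in range(num_threads)]  # (load, thread)
    plan: Dict[int, List[str]] = {tid: [] for tid in range(num_threads)}

    while pending:
        load, tid = heapq.heappop(heap)  # least-loaded thread goes next
        # Early on, pick the shortest remaining test for fast feedback;
        # past an (arbitrary, illustrative) threshold, switch to placing
        # the longest remaining test to even out the final loads.
        name, dur = pending.pop(0) if load < 0.5 * target else pending.pop()
        plan[tid].append(name)
        heapq.heappush(heap, (load + dur, tid))
    return plan
```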
Test History Export/Import
All features demonstrated so far rely on the presence of a test history. However, this history is not always available. For example, it is removed when using the --clean option, and continuous integration pipelines often run regression tests from a clean state. To address this, VUnit supports exporting and importing items from its internal database, including the test history.
The test history can be exported using the post_run function, which is an optional argument to the final main method call in the run script. This process is illustrated in Figure 9. The export_items method takes a list of items to export and a directory that is reserved for the database and should not be used for other purposes.
📌Note: The post_run function receives a results object, which provides access to post-simulation functionality such as export_items.

Importing a previously exported test history is done with the corresponding import_items method, as shown in Figure 10.
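Sketched below is what a run script with history export and import could look like. Note that the export_items/import_items signatures, the item name "test_history", and the objects these methods are called on are assumptions drawn from this post's description and figures, not a confirmed API:

```python
# Sketch only: export_items/import_items signatures, the item name
# "test_history", and the objects they are called on are assumptions
# drawn from this post, not a confirmed API.
from vunit import VUnit

vu = VUnit.from_argv()
vu.add_library("lib").add_source_files("src/*.vhd")

# Restore a previously exported history before the run (assumed API).
vu.import_items(["test_history"], "vunit_db")

def post_run(results):
    # Export after the run; the directory is reserved for the database.
    results.export_items(["test_history"], "vunit_db")

# post_run is a documented optional argument to main().
vu.main(post_run=post_run)
```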
It can also be useful to control import and export operations through the command line interface (CLI). VUnit supports custom additions to its built-in CLI, and Figure 11 illustrates how this can be used to control the import of test histories.
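VUnitCLI is the documented way to add such custom options. Below is a sketch with a hypothetical --import-history flag; the flag name and the import_items call are assumptions, not VUnit's actual interface:

```python
# Sketch: extend VUnit's CLI with a custom option that triggers a
# history import. VUnitCLI is documented VUnit API; the flag name
# --import-history and the import_items call are assumptions.
from vunit import VUnit, VUnitCLI

cli = VUnitCLI()
cli.parser.add_argument(
    "--import-history",
    default=None,
    help="Directory to import a previously exported test history from",
)
args = cli.parse_args()

vu = VUnit.from_args(args=args)
vu.add_library("lib").add_source_files("src/*.vhd")

if args.import_history:
    vu.import_items(["test_history"], args.import_history)  # assumed API

vu.main()
```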
This concludes the first blog on VUnit's new features for improving test efficiency. In the next blog, we will describe how constrained random testing can be made more efficient.
Comments

Main author and maintainer of VUnit. ASIC/FPGA/DSP Developer consulting for Q-Free ASA:
FAQ: How do these features compare with similar features provided by premium simulator licenses?
Answer: I haven't had the chance to try these premium features so I can't give a definitive answer. Perhaps others in the community can share their experience? However, I've watched some presentations from Siemens on their "Smart Regression" capabilities. These features target similar areas as VUnit (test selection, test prioritization, and load-balancing) but use different terminology such as Failure Predictor and Schedule Predictor. The key differences I've noticed are:
* Siemens uses machine learning for decision making. It would be interesting to see some benchmarks on how much extra performance you get from using ML instead of simpler heuristics.
* Their algorithms incorporate coverage information. The initial approach for VUnit avoids relying on coverage data since that isn't a feature available to all users.
* The tool analyzes compiled and elaborated designs, rather than the source code. This likely allows them to ignore refactorings that don't change functional behavior. I'm not sure how delta cycle differences are handled.
Main author and maintainer of VUnit. ASIC/FPGA/DSP Developer consulting for Q-Free ASA:
FAQ: Does it work with VUnit testbenches written in SystemVerilog (UVM or non-UVM)?
Answer: Yes!
Senior FPGA Engineer at Lumacron | Optical Networking | Ex-Xilinx/AMD DSP & Comms IP Libraries | Volunteer Fencing Coach:
This looks awesome! Have there been any updates on OSVVM integration, particularly for getting the OSVVM reports out?
Lead FPGA/SoC Design and Verification Engineer at Enclustra GmbH:
Hi Lars, great addition to VUnit! I see the point of starting the shortest tests first, but I often end up in situations where I have multiple short unit tests and one long top-level test that takes significantly more time than the others. In multi-threaded mode, it would be advantageous to start the longest test first by assigning it to a thread from the start. While this long test executes, the short tests are shared among the other available threads, and hopefully they all complete before the long test finishes, so the total simulation time is set by the long test alone. The load-balancing feature seems to help but still adds some extra time to the total: the time to execute some short tests plus the long one. I think having an option to enable this behavior (start the longest tests first) would also be useful where that is desired. What do you think?
FPGA/Embedded System Architect | Maintainer of Open Logic HDL Standard Library | FPGA Lecturer:
The --changed option is a really great addition.