RAPID EXPERIMENTATION WITH PYTHON CONSIDERING
OPTIONAL AND HIERARCHICAL INPUTS
A PREPRINT
Neil C. Ranly
Department of Operational Sciences
Air Force Institute of Technology
Wright-Patterson AFB, OH 45433
neil.ranly@afit.edu
Torrey D. Wagner
Department of Systems Engineering and Management
Air Force Institute of Technology
Wright-Patterson AFB, OH 45433
January 8, 2025
ABSTRACT
Space-filling experimental design techniques are commonly used in many computer modeling and
simulation studies to explore the effects of inputs on outputs. This research presents raxpy, a Python
package that leverages expressive annotation of Python functions and classes to simplify space-filling
experimentation. It incorporates code introspection to derive a Python function’s input space and
novel algorithms to automate the design of space-filling experiments for spaces with optional and
hierarchical input dimensions. In this paper, we review the criteria for design evaluation given
these types of dimensions and compare the proposed algorithms with numerical experiments. The
results demonstrate the ability of the proposed algorithms to create improved space-filling experiment
designs. The package includes support for parallelism and distributed execution. raxpy is available as free and open-source software under an MIT license.
Keywords space-filling design · computer experiment · maximum projection criterion · hierarchical experimentation factors
1 Introduction
Design of experiments (DOE) is a branch of applied statistics that generates a set of values, or simply the design points,
as inputs to a function. By passing each point through the function and assembling a dataset with the function-returned
outputs, experimenters perform analysis to gain insights into how the inputs affect the outputs. In many situations, given limited resources and non-trivial execution durations of these functions, experimenters are constrained in the number of points they can consider. The goal in designing these experiments is to generate a set of points that maximizes the insight gained from the experiment's outputs and meets the experimenter's objectives given this point-set-size constraint.
This research focuses on DOEs to support exploration-based objectives of computer-based executable functions
that possess either non-linear response surfaces, multiple output variables, or system-of-system emergent dynamics.
Computer-executable-functions, such as computer-based simulation models [Sanchez et al., 2020], are a growing source
of data for deep learning [Makoviychuk et al., 2021] and a proven method for generating insights into rare and unrealized
situations. DOEs for exploration use cases include the exploration of an input space at the start of an optimization
effort, the discovery of non-linear relationships between inputs and outputs [Box and Lucas, 1959], the validation
of a complex simulation model with multiple outputs [Sanchez and Sánchez, 2017], comparing different versions of
computer-executable-functions in a black-box manner, and the generation of a dataset supporting downstream machine
learning, such as surrogate modeling [Arboretti et al., 2022, Fontana et al., 2023].
Common DOE techniques such as grid sampling and random sampling have undesirable properties for exploration
use cases. Researchers may find that only a small subset of input dimensions has an effect on the outputs, known as the effect sparsity principle [Young et al., 2024]. In these cases, grid-based designs and many factorial-based
design-variations possess projection properties that inefficiently duplicate the execution of points that collapse to the same lower-dimensional point after dropping the input dimensions that lack practical significance for the outputs [Joseph, 2016]. For example, consider the two-dimension, two-level full-factorial design {(0, 0), (0, 1), (1, 0), (1, 1)} projecting to the design {(0), (0), (1), (1)}, with duplicate points, after removing the second dimension. In addition, grid-based and factorial-based design sizes increase exponentially as the number of dimensions increases, making them infeasible for high-dimensional input spaces. Random sampling designs, by virtue of being stochastic, often fail to exhibit space-filling properties and may possess highly correlated design points given design-size constraints. Correlations among the inputs confound the determination of the true cause of output differences. These undesirable theoretical properties can cause significant practical differences when compared to alternative design-generation algorithms that attempt to evenly or uniformly sample the design space; this paper refers to these alternatives as space-filling DOEs. Evaluations of space-filling DOEs have shown better performance in many surrogate modeling benchmarks [Arboretti et al., 2022, Fontana et al., 2023].
Initial research into space-filling designs, such as Latin-hypercube-sampling-based designs (LHDs), assumed bounded, continuous input dimensions, i.e., rectangular input spaces [Johnson et al., 1990, Owen, 1992, Damblin et al., 2013]. LHDs ensure that any subsequent dropping of dimensions from an input dataset retains the uniqueness of all points. They do so by generating n unique values for each input dimension, where n is the desired point-set size given resource limitations and function-execution budgets. An LHD partitions the range of each input dimension into n evenly sized, mutually exclusive sub-regions and chooses either the centered value or a random value within each sub-region. To avoid undesirable correlations and distances between points, optimization algorithms derive the final input value combinations from these single-dimension value sets [Damblin et al., 2013, Joseph, 2016].
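As an illustration of this construction, the sketch below builds a basic random or centered LHD with plain NumPy; it is a minimal example for exposition, not one of the optimized LHD implementations cited above, and the dimension count and point budget are arbitrary.

import numpy as np

def simple_lhd(n, d, rng=None, centered=False):
    # Sketch of Latin-hypercube sampling: n points in [0, 1]^d.
    rng = np.random.default_rng(rng)
    # One value per bin: either the bin center or a uniform draw within the bin.
    offsets = np.full((n, d), 0.5) if centered else rng.uniform(size=(n, d))
    values = (np.arange(n)[:, None] + offsets) / n
    # Permute each column independently so row pairings are decoupled.
    for k in range(d):
        values[:, k] = rng.permutation(values[:, k])
    return values

design = simple_lhd(n=8, d=3, rng=0)

Each column of the result contains exactly one value per sub-region, so dropping any subset of columns preserves the uniqueness of the remaining one-dimensional projections.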
Additional space-filling design research investigates non-rectangular input spaces, discrete-valued dimensions, and
additional constraints on the input space. Maximum projection space-filling design techniques support discrete-level
numeric inputs while maximizing the sub-space projection distance of points [Joseph et al., 2020, Lekivetz and Jones,
2019]. Fast-flexible filling techniques [Lekivetz and Jones, 2015, Chen et al., 2019], adaptive optimization techniques
[Wu et al., 2019], mixed integer non-linear programming techniques [Özdemir and Turkoz, 2023], and particle swarm optimization techniques [Chen et al., 2022] address non-linear constraints and non-convex volumes. Orthogonal uniform
composite designs provide techniques for flexible design sizes while retaining orthogonal properties of the design
[Zhang et al., 2020]. Sliced Latin hypercube designs [Qian, 2012], marginally coupled designs [Deng et al., 2015], and
maximum-projection techniques [Joseph et al., 2020] address qualitative input dimensions, such as categorical input factors. Techniques for sequential space-filling DOEs remove the requirement to specify the design size upfront, enabling dynamic consideration of time constraints or other insight criteria discovered during experimentation [Crombecq et al., 2011, Sheikholeslami and Razavi, 2017, Parker et al., 2024]. Wang et al.
[2021] provide sequential space-filling design techniques while considering input spaces with non-linear constraints.
Lu et al. [2022] discuss methods to prioritize regions of the input space.
A focus of this research is algorithms to generate space-filling designs for input spaces with optional and hierarchical dimensions. For example, consider experimenting with an object-detection computer simulation whose input space comprises the scene context and the inclusion of different types of objects, with each type of object having additional input dimensions that the experimenter wants to vary to explore their influence on the system's detection performance. In this scenario, the inclusion of each object type represents an optional dimension of a simulation run. In addition, if the object type is included in the scene, then attributes related to the object, such as color, movement, position, and other object attributes, can be varied as part of the DOE.
Input spaces with these types of dimensions also occur in hyper-parameter search spaces [Bischl et al., 2017]. The Tree-
structured Parzen Estimator is a hyper-parameter optimization technique supporting hierarchical spaces, studied with an
initial random exploration design [Bergstra et al., 2011]. Decomposition search strategies to conduct optimization over
similar input spaces have also been proposed [Li et al., 2023]. From our literature review, a gap exists in algorithms that support space-filling designs for input spaces with optional and hierarchical dimensions. This research proposes space-filling DOE algorithms for these types of input spaces given a user-specified design-point budget, n.
Space-filling DOE algorithms are provided in commercial and open-source software. scipy [Virtanen et al., 2020] and UniDOE provide algorithms that minimize designs' discrepancy [Zhou et al., 2013]. MaxPro [Shan Ba and V. Roshan Joseph, 2015] provides algorithms that maximize the multi-dimensional sub-projection properties of designs. SlicedLHD provides an implementation of sliced Latin hypercube designs [Kumar et al., 2024]. JMP [SAS Institute Inc., 1989] provides a variety of algorithms, including fast-flexible filling [Lekivetz and Jones, 2019], to support designs with non-linear constraints. See Lucas and Parker [2023] for a comparison of the space-filling properties of designs created with popular experimentation-support software. In addition to the perceived algorithm gap, this research also addresses a software gap in implementing these techniques for computer-executable functions as experimentation subjects.
This research presents raxpy, a Python library that contributes novel methods to express experiments' input spaces with Python type specifications and novel extensions to space-filling DOE algorithms that address optional and hierarchical input dimensions. raxpy's use of Python enables instrumented execution of many types of computer-based executable functions not natively expressed in Python, such as command-line-based simulation programs, generative artificial intelligence web services, and other types of external web services. The proposed library enables rapid execution of space-filling experiments with the capabilities of Python and its ecosystem of libraries, such that generated data can be quickly processed to enable downstream analysis activities.
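For example, an external command-line simulation can be wrapped in an annotated Python function and handed to raxpy; in the sketch below the executable name, flags, and output parsing are hypothetical placeholders, while the annotation style follows the examples in section 2.1.

import subprocess
from typing import Annotated, Optional

import raxpy

def run_sim(
    speed: Annotated[float, raxpy.Float(lb=0.0, ub=50.0)],
    altitude: Annotated[Optional[float], raxpy.Float(lb=100.0, ub=5000.0)],
) -> float:
    # Hypothetical external simulation; substitute the real executable and flags.
    args = ["./sim", "--speed", str(speed)]
    if altitude is not None:
        args += ["--altitude", str(altitude)]
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    # Hypothetical output convention: the simulation prints a single score on stdout.
    return float(result.stdout.strip())

inputs, outputs = raxpy.perform_experiment(run_sim, 20)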
Section 2 discusses how raxpy is used to perform experiments and create designs. Section 3 discusses how raxpy
evaluates and creates DOEs given optional-and-hierarchical dimensions. Finally, Section 4 discusses the results of
numerical experiments comparing the space-filling properties for designs generated with the proposed algorithms.
2 Performing Experiments
The following code demonstrates how users of raxpy can create and execute a space-filling experiment with n = 10
points on an annotated Python function, such as the example function f5 listed in section 2.1.
import raxpy

inputs, outputs = raxpy.perform_experiment(f5, 10)
The method perform_experiment executes the following steps. First, it derives the input space from the annotated parameters of the passed-in function; function annotations are described in section 2.1. The function's input space is represented as a Space object composed of zero-or-many Dimension objects; the raxpy.spaces module is described in section 2.2. Second, the Space object for the function's input space is passed to a space-filling DOE algorithm that creates a DOE; the DOE algorithm is described in section 3.2. Next, the design points are mapped to the function's parameters, and the function is executed for each point. Finally, it returns the points and the corresponding outputs. This automated mapping from function signature to input space, to DOE, to function arguments, and finally to function execution can increase the efficiency of conducting exploration experiments compared to manually mapping the results between each step.
raxpy can also create an experiment design without executing it, given an annotated function:
doe = raxpy.design_experiment(f5, 10)
2.1 Function Annotation
raxpy utilizes the type specifications and annotations of Python functions and classes to express and to derive an
experiment design’s input and output space. Type specifications in programming languages have a long tradition of
supporting expressive expectations of variables’ types. Programmers have long valued type specification in large
code bases to clearly express interfaces’ input and output specifications to themselves and other programmers. Code
compiling and static-code-analysis linting techniques also use type specifications to identify interface violations before
software is executed. In many scripting languages, including Python 2, type specification is neither required nor supported. As the use of Python grew, community efforts and demands to enable code-quality processes resulted in type specifications being incorporated into Python in version 3.5 as part of Python Enhancement Proposal (PEP) 484 [van Rossum et al., 2014]. Since then, many Python libraries have incorporated type specifications, supplying common documentation of variable type expectations and enabling automated pre-runtime code-quality processes. The following code demonstrates a function with three parameters, written with and without type specifications.
from typing import Optional

# simple function without type hints
def f1(x1, x2, x3):
    return x1 * x2 if x3 is None else x1 * x2 * x3

# simple function with type hints
def f2(x1: float, x2: int, x3: Optional[float]) -> float:
    return x1 * x2 if x3 is None else x1 * x2 * x3
With the implementation of PEP 593 in Python 3.9, Python extends the concept of type specification to permit annotation with additional meta-data [Varoquaux and Kashin, 2019]. The following code demonstrates a Python function's
parameter type specifications with annotated meta-data; the first two parameters have a lower and upper bound specified,
while the x3 parameter has a value set specified to indicate the discrete-numeric values for this parameter.
import raxpy
from typing import Annotated, Optional

# simple function with type hints and annotations
def f3(
    x1: Annotated[float, raxpy.Float(lb=0.0, ub=10.0)],
    x2: Annotated[float, raxpy.Float(lb=0.0, ub=2.0)],
    x3: Annotated[Optional[float], raxpy.Float(value_set=[0.0, 1.5, 3.0])]
) -> Annotated[float, raxpy.Float(tags=[raxpy.tags.MAXIMIZE])]:
    return x1 * x2 if x3 is None else x1 * x2 * x3
In addition to improved documentation, there are further benefits to annotating variables. Annotations enable runtime validation of values beyond validation of a value's type; the code below shows how raxpy can enable this. The main benefit, which we utilize in raxpy, is the introspection of a function's signature to dynamically extract the function's design spaces. With the derived input and output space specification, raxpy analyzes the space's dimensions to design experiments, execute designs, and gather the results. Examples also demonstrate how this introspection of raxpy annotations can be adapted for use with other Python modules to perform optimization experiments. Expressing the experimentation subject as a Python function also enables software-engineering best practices, such as test-driven development and unit testing.
# simple function with annotations and runtime validation
@raxpy.validate_at_runtime(check_outputs=False)
def f4(
    x1: Annotated[float, raxpy.Float(lb=0.0, ub=10.0)],
    x2: Annotated[int, raxpy.Float(lb=0.0, ub=2.0)],
    x3: Annotated[Optional[float], raxpy.Float(value_set=[0.0, 1.5, 3.0])]
) -> float:
    return x1 * x2 if x3 is None else x1 * x2 * x3

f4(3.14, 1, None)   # no error
f4(3.14, 11, None)  # runtime error: 11 does not fall within the specified range
To specify the input space of a Python function, each parameter of the function is annotated. raxpy's code-introspection algorithm checks for standard type specifications from the typing package to identify Optional and Union parameters. This encourages experimenters to reuse existing type-specification best practices rather than specifying the input space in an additional configuration section of the code or in an external file. It also minimizes the experimenter's need to employ a different space-specification method. raxpy uses dataclasses to support the expression of hierarchical dimensions. The following code demonstrates how raxpy creates and executes a space-filling experiment with 25 points on an annotated Python function with optional and hierarchical dimensions expressed with typing.Union, typing.Optional, and dataclasses.
import raxpy
from dataclasses import dataclass
from typing import Annotated, Optional, Union

@dataclass
class HierarchicalFactorOne:
    x4: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)]
    x5: Annotated[float, raxpy.Float(lb=0.0, ub=2.0)]

@dataclass
class HierarchicalFactorTwo:
    x6: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)]
    x7: Annotated[float, raxpy.Float(lb=-1.0, ub=1.0)]

def f5(
    x1: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)],
    x2: Annotated[Optional[float], raxpy.Float(lb=-1.0, ub=1.0)],
    x3: Union[HierarchicalFactorOne, HierarchicalFactorTwo],
):
    # placeholder for f5 logic
    return 1

inputs, outputs = raxpy.perform_experiment(f5, 25)
The annotation code for the raxpy type specification is organized in the raxpy.annotations module. Although we expect
many uses of raxpy to employ the raxpy.annotations module or existing annotation libraries, the raxpy.spaces module
provides common data structures for DOE algorithms and external specifications of a space. For example, a user
interface may permit an experimenter to specify the input dimensions of a space for a Python function that dynamically
configures a simulation model given the input provided.
2.2 Design Spaces
A Python function corresponds to two design spaces, an input space derived from the input parameters of the function,
and an output space derived from the return type specification. The code below shows the ability to specify an InputSpace without deriving it from a function's signature. A major difference between raxpy and the ConfigSpace Python package [Lindauer et al., 2019] is the incorporation of a nullable attribute for dimensions. The nullable attribute enables a clear mapping from Python function parameters specified as Optional.
space = raxpy.spaces.InputSpace(
    dimensions=[
        raxpy.spaces.Float(id="x1", lb=0.0, ub=1.0, portion_null=0.0),
        raxpy.spaces.Float(id="x2", lb=0.0, ub=1.0, portion_null=0.0),
        raxpy.spaces.Float(id="x3", lb=0.0, ub=1.0, nullable=True, portion_null=0.1),
    ]
)
The following dimension-types are also provided to support the direct mapping from complex function parameters:
• Composite objects represent dimensions mapped from dataclass type specifications;
• Variant objects represent dimensions mapped from typing.Union type specifications;
• ListDim objects represent dimensions mapped from typing.List type specifications.
3 Designing Experiments with Optional and Hierarchical Inputs
A common approach to creating space-filling designs is to measure and optimize a design with respect to a space-filling
metric. Optional and hierarchical dimensions complicate the direct application of these methods. Section 3.1 suggests
extensions to past space-filling measurement criteria and concepts to address these more complex space attributes.
raxpy code examples are also provided demonstrating techniques to measure and analyze space-filling designs. Section
3.2 presents DOE algorithms employing these concepts.
3.1 Space Filling Metrics
Seven space-filling metrics are extended and used in this work, as described in this section:

• $M_{ocov}$: full-sub-space optionality-coverage percentage
• $M_{idis}$: minimum interpoint distance
• $M_{adis}$: average minimum single-dimension projection distance
• $M_{wdsr}$: weighted sum of full-sub-space discrepancies
• $M_{sdsr}$: star discrepancy, incorporating a null region within dimensions
• $M_{maxpro}$: variation of the MaxPro metric extended to support optional and hierarchical dimensions
• DOE target allocation difference: count of point allocations diverging from the target full-sub-space allocation counts
Let $X$ denote a DOE with $n$ $d$-dimensional points and $S$ denote the input space such that $X \subseteq S$ and $S = (D_1, D_2, \ldots, D_d)$, where $D_k$ denotes the acceptable range of values for dimension $k$. Let $D = \{1, 2, \ldots, d\}$ represent the set of indices corresponding to the dimensions. Let $D^{optional} \subseteq D$ represent the set of indices corresponding to the optional dimensions in $S$ such that $\{null\} \subseteq D_k, \forall k \in D^{optional}$. Let $D^{parent} \subseteq D$ indicate the set of indices for hierarchical activation dimensions, with $\{0, 1\} \subseteq D_k, \forall k \in D^{parent}$. Let $D^{real} \subseteq D$ denote the set of indices corresponding to bounded real numbers, such that $[l_k, u_k] \subseteq D_k, \forall k \in D^{real}$. All numeric experiments and subsequent notation fix $l_k = 0$ and $u_k = 1$. Let $x_i$ represent the $i$th point of $X$ and $x_{ik}$ represent point $i$'s value for dimension $k$. Let $P \subseteq D^{parent} \times D \times D_p$ denote a set of hierarchical constraints such that, if $(p, c, v) \in P$, then the child dimension $c$ for a point is constrained to be null if the parent dimension $p$'s value is not $v$. This implies $c \in D^{optional}, \forall (p, c, v) \in P$.

Figure 1: $X$ to $X^{opt}$ point mapping
The first criterion we consider is the extent of a design's coverage over the different possible combinations of optional parameters. To compute it, we define the optional-definition design projection, $X^{opt} \subseteq \{0, 1\}^{n \times d}$, a projection of $X$ to a $d$-dimensional binary space indicating the non-null specification of values in points, where each element $x^{opt}_{ij}$ is defined as:

$$
x^{opt}_{ij} = \begin{cases} 0, & \text{if } x_{ij} = null \\ 1, & \text{if } x_{ij} \neq null \end{cases} \qquad (1)
$$
Figure 1 visualizes this mapping for an example DOE with five dimensions, where $D^{optional} = \{2, 3, 4, 5\}$, $D^{parent} = \{3\}$, $D^{real} = \{1, 2, 4, 5\}$, and $P = \{(3, 4, 1), (3, 5, 1)\}$. We can compute the optionality-coverage percentage by taking the set size of unique projected points, $|\bigcup_{i=1}^{n} \{x^{opt}_i\}|$, over the size of the power set of $D^{optional}$, $|P_P(D^{optional})|$. $P_P$ is a slightly modified power-set operator, since some optional dimensions are only feasible given the hierarchical constraints specified with $P$, and some dimensions are constrained to be not-null and thus are active in every sub-space. raxpy uses a tree-traversal technique to compute $P_P(D^{optional})$. For the example depicted in Figure 1, $P_P(D^{optional}) = \{(1), (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2, 3, 5), (1, 2, 3, 4, 5), (1, 3), (1, 3, 4), (1, 3, 5), (1, 3, 4, 5)\}$. The DOE depicted in Figure 1 samples only six of these possible sub-spaces. We define full-sub-spaces (FSSs) as the set of sub-spaces indicated by $P_P(D^{optional})$. Each point of $S$ maps to a single element of the $P_P(D^{optional})$ set. For example, points (rows) two and three in Figure 1 map to the full-sub-space $\{1, 2\}$.
Given a DesignOfExperiment object named doe, raxpy can compute the optionality-coverage percentage:

opt_coverage = raxpy.measure.compute_opt_coverage(doe)
Given an InputSpace object named space, raxpy can compute $P_P(D^{optional})$:

fss_dim_sets = space.derive_full_subspaces()
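To make the mapping concrete, the following toy sketch (plain Python, independent of raxpy's internals) applies equation (1) to a three-point design in the style of Figure 1 and computes the coverage ratio, using the ten feasible full-sub-spaces stated above.

# Toy design with five dimensions; None marks a null (inactive) value.
X = [
    [0.2, 0.7, None, None, None],  # active dimensions {1, 2}
    [0.5, 0.1, None, None, None],  # active dimensions {1, 2}
    [0.9, None, 1.0, 0.3, 0.8],    # active dimensions {1, 3, 4, 5}
]

# Equation (1): project each point to a 0/1 vector marking non-null values.
X_opt = [tuple(0 if v is None else 1 for v in point) for point in X]

# Coverage: distinct active-dimension combinations sampled over the number of
# feasible full-sub-spaces (10 for the Figure 1 example space).
coverage = len(set(X_opt)) / 10
print(X_opt, coverage)  # two distinct combinations -> coverage of 0.2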
The second criterion we employ is a design's minimum interpoint distance, $M_{idis}$ [Johnson et al., 1990]:

$$
\min_{x_i, x_j \in X,\; i \neq j} \left( \sum_{k=1}^{d} dist_k(x_i, x_j)^p \right)^{1/p} \qquad (2)
$$
This criterion is motivated by the objective of maximizing the minimal distance between all points. To support measurement of designs with optional dimensions, we define the distance computations with a revised distance equation for a 0-1 normalized, encoded design:

$$
dist_k(x_i, x_j) = \begin{cases} |x_{ik} - x_{jk}| & \text{if } x_{ik} \neq null \text{ and } x_{jk} \neq null, \\ 1 & \text{if } (x_{ik} = null \text{ and } x_{jk} \neq null) \text{ or } (x_{jk} = null \text{ and } x_{ik} \neq null), \\ 0 & \text{if } x_{ik} = null \text{ and } x_{jk} = null. \end{cases} \qquad (3)
$$
Given a DesignOfExperiment object named doe, raxpy provides support for this:
min_interpoint_dist = raxpy.measure.compute_min_interpoint_dist(doe, p=2)
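The sketch below transcribes equations (2) and (3) directly with NumPy, encoding null values as NaN; it mirrors the definitions above rather than raxpy's internal implementation.

import numpy as np

def dist_k(a, b):
    # Equation (3): per-dimension distance for 0-1 scaled values, with NaN as null.
    if np.isnan(a) and np.isnan(b):
        return 0.0
    if np.isnan(a) or np.isnan(b):
        return 1.0
    return abs(a - b)

def min_interpoint_distance(X, p=2):
    # Equation (2): minimum L_p interpoint distance over all pairs of points.
    n, d = X.shape
    best = np.inf
    for i in range(n - 1):
        for j in range(i + 1, n):
            dist = sum(dist_k(X[i, k], X[j, k]) ** p for k in range(d)) ** (1 / p)
            best = min(best, dist)
    return best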
The discrepancy criterion evaluates the extent to which a design achieves a uniform distribution across a bounded, numeric-value input space [Zhou et al., 2013]. We consider two discrepancy measurement extensions. For the first extension, $M_{wdsr}$, we measure the design by decomposing it into well-bounded, non-null sub-input-spaces. We weight and sum the discrepancies of the sub-designs, with weights representing the target number of points allocated to each FSS divided by $n$. We consider this metric especially applicable for design evaluation when subsequent experimentation is expected to focus on one sub-space given the initial exploration results. We only employ this metric for comparisons between designs with the same FSS point allocations.
The second discrepancy extension we consider, $M_{sdsr}$, is the direct incorporation of the null regions of $D^{optional}$ dimensions into the uniform-distribution computation considered by discrepancy. Discrepancy is defined as the maximum difference between the portion of design points within a region and the volume of that region as a ratio of the volume of the whole space. To compute it with nullable dimensions, we consider null values to directly precede 0 on the number line, with the null-region size of a dimension specified as stated in section 3.3 or by the user:

$$
\sup_{u \in [null, 1]^d} \left| \frac{|\{x \in X : x \le u\}|}{n} - \prod_{k=1}^{d} \left( \bar{\alpha}_k + u_k (1 - \bar{\alpha}_k) \right) \right| \qquad (4)
$$

where $\bar{\alpha}_k$ indicates the size of the null region for dimension $k$. Future research is suggested to explore extensions based on mixture-discrepancy variations [Zhou et al., 2013].

discrepancy = raxpy.measure.compute_star_discrepancy(doe)
To measure the single-dimensional projection properties of designs, we compute the average minimum distance, $M_{adis}$, between single-dimension projections, where $I^{not\text{-}null}_k = \{i : x_{ik} \neq null\}$ represents the indices of points with non-null values for dimension $k$:

$$
\frac{1}{d} \sum_{k=1}^{d} \min_{i, j \in I^{not\text{-}null}_k,\; i \neq j} |x_{ik} - x_{jk}| \qquad (5)
$$

avg_min_proj_dist = raxpy.measure.compute_average_dim_dist(doe, p=2)
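Under the same NaN-for-null convention, equation (5) reduces to sorting each dimension's non-null projection and taking the smallest gap; the sketch below is illustrative only.

import numpy as np

def average_min_projection_distance(X):
    # Equation (5): mean over dimensions of the minimum pairwise distance
    # among the non-null single-dimension projections.
    per_dim = []
    for k in range(X.shape[1]):
        vals = np.sort(X[~np.isnan(X[:, k]), k])
        if len(vals) < 2:
            continue  # fewer than two non-null values: no pairwise distance exists
        per_dim.append(np.min(np.diff(vals)))
    return float(np.mean(per_dim))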
To assess the multi-dimensional projection properties of designs, we propose a variation of the MaxPro metric. If $k \in D^{optional} \cup D^{parent}$, $\bar{\alpha}_k$ denotes $1/\alpha_k$, where $\alpha_k$ denotes either $|D_k| + 1$ if $D_k$ is finite and optional, or an estimation of the dimension's complexity; see section 3.3. This avoids dividing by zero for designs that must contain duplicate values in dimensions with a finite set of values. Another effect is that differences corresponding to dimensions with less complexity are weighted less. Otherwise, $\bar{\alpha}_k = 0$ for bounded, non-optional real values to avoid duplicates.

$$
\left( \frac{1}{\binom{n}{2}} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{1}{\prod_{k=1}^{d} \left( dist_k(x_{ik}, x_{jk}) + \bar{\alpha}_k \right)^2} \right)^{1/d} \qquad (6)
$$

max_pro = raxpy.measure.compute_max_pro(doe)
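For reference, equation (6) can be transcribed as the sketch below, reusing the dist_k helper sketched earlier and an alpha_bar array holding the $\bar{\alpha}_k$ values; raxpy.measure.compute_max_pro is the supported implementation.

import numpy as np

def maxpro_criterion(X, alpha_bar):
    # Equation (6): extended MaxPro criterion; lower values indicate better
    # multi-dimensional projection properties.
    n, d = X.shape
    total = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            denom = 1.0
            for k in range(d):
                denom *= (dist_k(X[i, k], X[j, k]) + alpha_bar[k]) ** 2
            total += 1.0 / denom
    return (total / (n * (n - 1) / 2)) ** (1 / d)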
3.2 Design Algorithms
To support the construction of designs, we propose five algorithms and evaluate them alongside a baseline random-design algorithm. Some algorithms leverage a traditional space-filling design algorithm (TSFD) for a real, 0-1 bounded space. The TSFD we employ for the numerical experiments in section 4 is the scipy.stats.qmc LatinHypercube algorithm based on centered-discrepancy optimization. Note that each proposed algorithm's design can also be post-optimized with a MaxPro-based optimization algorithm, explained at the end of this section. This results in the comparison of 12 algorithms in section 4.
1. The Full-Subspace (FSS-LHD) algorithm mimics a manual approach of separately considering the individual full-sub-spaces and creating a separate space-filling design for each. Each FSS is allocated a portion of the n points in the design. In the absence of a user-specified allocation, heuristics determine the allocation from null-portion attributes; see section 3.3. raxpy also supports a feature that forces at least one point to be allocated to each FSS if possible, resulting in a full-factorial-inspired $X^{opt}$ design with at least one point for each element of $P_P(D^{optional})$. Next, a TSFD is used to generate a design for each full-sub-space.
2. A variation of the FSS-LHD algorithm above using random designs instead of space-filling designs (FSS-
Random).
3. The Full-Subspace-with-Value-Pool (FSS-LHD-VP) algorithm revises the FSS-LHD algorithm by computing the number of values needed for each dimension given the target FSS point allocations. Given this count for each dimension, it creates a value pool for the dimension using an LHD method to generate spaced values. Next, starting with the FSS with the largest point allocation, the appropriate number of values are pulled from the value pool across the quantiles of the remaining values in the pool. The pulled values are shuffled to create an initial sub-design and then passed to a space-filling centered-discrepancy optimization algorithm based on the scipy.stats.qmc LatinHypercube algorithm.
4. The Tree-Traversal-Design (TT-LHD) algorithm starts by creating a root design using a TSFD for the dimensions without any parents. The root 0-1 encoded design values are then projected to a [0-1]+null space: values under the null-portion threshold are projected to the null value, and values above it are rescaled to 0-1 values given the range from the null-portion threshold to 1. For $D^{parent}$ and finite numeric-value-set dimensions, these values are mapped to values representing the discrete values in the finite set, in a similar manner as suggested in Joseph et al. [2020]. Next, the algorithm creates sub-designs with a TSFD for children-dimension sets with common parents, repeating the same 0-1 mapping logic. The lower-level sub-designs are merged with the root design using a centered-discrepancy optimization algorithm, only revising the children-dimension values to avoid undoing previous merge optimizations. It repeats this recursively into the deeper nodes of the hierarchical space.
5. The Whole-Projection-Design (P-LHD) algorithm employs a TSFD to create an initial design for the flattened hierarchical space, initially ignoring the optional and hierarchical properties of dimensions. Given this initial design, and using value-mapping logic similar to the TT-LHD algorithm above, the values for optional dimensions are projected to a [0-1]+null space as needed. Values projected to null for parent dimensions are horizontally extended onto all children dimensions.
As a baseline for comparing the performance of these algorithms, we also consider a completely random design algorithm that assigns null values to optional dimensions according to their null-portion attributes. This baseline is labeled Random in section 4.
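A sketch of such a baseline is shown below; it assumes 0-1 scaled dimensions, NaN-encoded nulls, and a per-dimension null_portion list, and it omits hierarchical constraints for brevity.

import numpy as np

def random_design(n, null_portion, rng=None):
    # Random baseline: uniform draws, with each optional dimension independently
    # set to null (NaN) with probability equal to its null_portion.
    rng = np.random.default_rng(rng)
    d = len(null_portion)
    X = rng.uniform(size=(n, d))
    null_mask = rng.uniform(size=(n, d)) < np.asarray(null_portion)
    X[null_mask] = np.nan
    return X

# Example: three dimensions; the first is required, the other two are optional.
design = random_design(24, null_portion=[0.0, 0.25, 0.25], rng=0)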
A design Simulated Annealing MaxPro optimization is provided and outlined in Algorithm 1 below, where $I^{not\text{-}null}_k$ indicates the indices of rows with non-null values for dimension $k$. This algorithm is designed to retain the initial design's $X^{opt}$ structure, preserving the user's or the design heuristic's FSS allocations. It fulfills this by only swapping rows' values within $I^{not\text{-}null}_k$. An algorithm followed by MaxPro optimization is labeled with the suffix -MP in section 4.
Figure 2 presents the three-dimensional distribution of a 24-point experiment design with a series of two-dimensional scatter plots and histograms. A randomly generated design is shown on the left, and a design generated with the FSS-LHD-VP-MP algorithm is shown on the right. The three dimensions, x1, x2, and x3, were defined with a range of [0, 1]; dimensions x2 and x3 are defined as optional. Histograms for each of the variables are shown along the diagonal of each plot. The ideal histogram for a space-filling design is uniform, and the FSS-LHD-VP-MP histograms are visibly far more uniform, with an increased number of null values reflecting the null-region size.
Algorithm 1 Simulated Annealing MaxPro Design Optimization
Require: An initial design $X$
Require: Max iterations $M$
Require: MaxPro measurement function $c_{MaxPro}$
1: Initialize $X^{best} \leftarrow X$
2: for $m = 1$ to $M$ do
3:   $t \leftarrow 1.0 - m/M$
4:   $k \sim \mathrm{Uniform}(D^{real})$
5:   $i, j \sim \mathrm{Uniform}(I^{not\text{-}null}_k), \mathrm{Uniform}(I^{not\text{-}null}_k)$
6:   while $i = j$ do
7:     $j \sim \mathrm{Uniform}(I^{not\text{-}null}_k)$
8:   end while
9:   $X^{try} \leftarrow \mathrm{ColumnSwap}(X, i, j, k)$
10:  $P \leftarrow e^{-(c_{MaxPro}(X^{try}) - c_{MaxPro}(X^{best}))/t}$
11:  if $P > \mathrm{Uniform}([0, 1])$ then
12:    $X^{best} \leftarrow X^{try}$
13:  end if
14: end for
15: return $X^{best}$
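A condensed Python rendering of Algorithm 1 is sketched below; it assumes a NumPy design matrix with NaN-encoded nulls, reuses the hypothetical maxpro_criterion helper and alpha_bar values sketched in section 3.1, and, for simplicity, proposes swaps from the current best design. It illustrates the procedure rather than reproducing raxpy's implementation.

import numpy as np

def anneal_maxpro(X, alpha_bar, real_dims, max_iters=1000, rng=None):
    # Simulated-annealing MaxPro optimization via within-column value swaps,
    # preserving the design's X_opt (null/non-null) structure.
    rng = np.random.default_rng(rng)
    X_best = X.copy()
    c_best = maxpro_criterion(X_best, alpha_bar)
    for m in range(1, max_iters + 1):
        t = 1.0 - m / max_iters                   # linear cooling schedule
        k = rng.choice(real_dims)                 # pick a real-valued dimension
        not_null = np.flatnonzero(~np.isnan(X_best[:, k]))
        if len(not_null) < 2:
            continue
        i, j = rng.choice(not_null, size=2, replace=False)
        X_try = X_best.copy()
        X_try[[i, j], k] = X_try[[j, i], k]       # swap two non-null values in column k
        c_try = maxpro_criterion(X_try, alpha_bar)
        accept_prob = np.exp(-(c_try - c_best) / max(t, 1e-9))
        if accept_prob > rng.uniform():
            X_best, c_best = X_try, c_try
    return X_best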
Figure 2: Scatter-plots of the points from a random DOE (left) and a FSS-LHD-VP-MP DOE (right)
3.3 Design Heuristics

Each $D^{optional}$ dimension has a null_portion attribute. This attribute indicates the size, from an input-space uniform-distribution perspective, of the null region within the dimension, and it directly influences the percentage of points within a DOE that should have a null value for this dimension if the user does not explicitly specify FSS allocations. raxpy utilizes heuristics to specify this attribute if the user does not. First, the number of levels of a dimension is estimated; we denote this as the dimension's complexity, $\alpha_k$. If a dimension is a finite set of values, the complexity is the set size. If a dimension $k \in D^{real}$, we currently specify the complexity as three plus one, with the additional one representing the null value; one unit of the complexity represents the null region. We then set the null_portion, $\bar{\alpha}_k$, to $1/\alpha_k$.
For the algorithms that utilize FSS point-allocation targets, we compute the percentage of points to allocate to each FSS as needed. Let $S_{D^o}$ represent the FSS of $S$ given the dimensions specified in $D^o \in P_P(D^{optional})$. The percentage of points to allocate to each $S_{D^o}$ is computed as:

$$
\prod_{k \in D^o} (1 - \bar{\alpha}_k) \qquad (7)
$$

For design algorithms that utilize the full-sub-space allocation techniques, raxpy also allows users to directly provide their own FSS allocations.
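The heuristic of equation (7) and the default complexity can be sketched as follows; the explicit dictionaries are illustrative stand-ins for the attributes raxpy derives from an annotated space.

import numpy as np

def fss_allocation(fss_dims, alpha_bar, n):
    # Equation (7): fraction of the n-point budget targeted at the full-sub-space
    # whose active optional dimensions are fss_dims; alpha_bar maps dimension -> null portion.
    fraction = float(np.prod([1.0 - alpha_bar[k] for k in fss_dims]))
    return fraction, round(n * fraction)

# Example: two optional real dimensions; a real dimension defaults to a complexity
# of 3 + 1, so its null_portion is 1/4.
alpha_bar = {2: 0.25, 3: 0.25}
print(fss_allocation((2, 3), alpha_bar, n=24))  # (0.5625, 14)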
4 Analysis of Space-Filling Capability
4.1 Methodology
We demonstrate the capabilities of raxpy with numeric experiments that compare the space-filling properties of designs generated by the proposed algorithms. Each algorithm is considered by itself and also with the MaxPro optimization, for a total of twelve design algorithms. We investigate four different input spaces (labeled basic, simple, modest, and complex) and three design sizes for each input space, and we generate 30 designs, i.e., replications, with each algorithm to analyze the variability of design measures. This results in 360 designs evaluated for each input-space and design-size combination.
Every design is evaluated with respect to $M_{idis}$, $M_{adis}$, $M_{sdsr}$, and $M_{wdsr}$. The algorithms' replication-evaluation means are ranked within each input-space and design-size combination. Full-sub-space allocation differences from target allocations are also computed and summarized.
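For reference, the ranking summary can be reproduced with a few lines of pandas; this is a sketch that assumes a long-format results table with space, size, algorithm, and replication columns, which is not a data structure raxpy itself prescribes.

import pandas as pd

def rank_algorithms(results: pd.DataFrame, metric: str, higher_is_better: bool) -> pd.Series:
    # Average each algorithm's replication mean within every (space, size) combination,
    # rank the means per combination, then average the ranks per algorithm.
    means = results.groupby(["space", "size", "algorithm"])[metric].mean()
    ranks = means.groupby(["space", "size"]).rank(ascending=not higher_is_better)
    return ranks.groupby("algorithm").mean().sort_values()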
The first input space we consider, $S^{basic}$, consists of $D^{real} = \{1, 2, 3\}$ and $D^{optional} = \{2, 3\}$. For this input space, we consider design sizes $n \in \{4|S^{basic}|, 8|S^{basic}|, 12|S^{basic}|\} = \{12, 24, 36\}$. The input space is derived from the following function signature:
def f_basic(
    x1: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)],
    x2: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)],
    x3: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)]
):
The second input space we consider, $S^{simple}$, consists of $D^{real} = \{1, 2, 4, 5\}$, $D^{parent} = \{3\}$, and $D^{optional} = \{3, 4, 5\}$. For this input space, we consider $n \in \{20, 40, 60\}$. The input space is derived from the following function signature:
@dataclass
class HierarchicalFactorOne:
    x4: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)]
    x5: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)]

def f_simple(
    x1: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)],
    x2: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)],
    x3: Optional[HierarchicalFactorOne]
):
The third input space we consider, $S^{modest}$, consists of $D^{real} = \{1, 2, 4, 5, 7, 8\}$, $D^{parent} = \{3, 6\}$, and $D^{optional} = \{2, 3, 4, 5, 6, 7, 8\}$. For this input space, we consider $n \in \{32, 64, 96\}$. The input space is derived from the following function signature:
@dataclass
class HierarchicalFactorTwo:
    x7: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)]
    x8: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)]

def f_modest(
    x1: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)],
    x2: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)],
    x3: Optional[HierarchicalFactorOne],
    x6: Optional[HierarchicalFactorTwo],
):
The fourth input space we consider, $S^{complex}$, consists of $D^{real} = \{1, 2, 4, 5, 7, 8, 9, 10\}$, $D^{parent} = \{3, 5, 8\}$, and $D^{optional} = \{2, 3, 4, 5, 6, 7, 8, 9, 10\}$. For this input space, we consider $n \in \{32, 64, 96\}$. The input space is derived from the following function signature:
@dataclass
class HierarchicalFactorLevel2:
    x6: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)]
    x7: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)]

@dataclass
class HierarchicalFactorLevel1A:
    x4: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)]
    x5: Optional[HierarchicalFactorLevel2]

@dataclass
class HierarchicalFactorLevel1B:
    x9: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)]
    x10: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)]

def f_complex(
    x1: Annotated[float, raxpy.Float(lb=0.0, ub=1.0)],
    x2: Annotated[Optional[float], raxpy.Float(lb=0.0, ub=1.0)],
    x3: Optional[HierarchicalFactorLevel1A],
    x8: Optional[HierarchicalFactorLevel1B],
):
4.2 Results and Discussion
Table 1 shows the resulting rankings of the algorithms, obtained by taking the average rankings across input-space and design-size combinations; each algorithm is ranked on a 1-12 scale, where 1 is the best. The top-ranked algorithms are highlighted, with the best in the darkest shade of green. The FSS-LHD-VP-MP algorithm is among the top three performing algorithms for all four metrics and is the best for $M_{idis}$. The additional processing with MaxPro optimization, designated with a -MP suffix, shows a trend of improving the space-filling properties of the designs with respect to $M_{idis}$ and $M_{sdsr}$, while hurting $M_{wdsr}$ for the FSS-LHD and FSS-LHD-VP algorithms that originally optimized discrepancies by full-sub-space.
Algorithm        $M_{idis}$  $M_{adis}$  $M_{sdsr}$  $M_{wdsr}$
Random*          11.75       9.0         11.0        9.83
FSS-Random       11.08       10.17       9.83        11.33
FSS-LHD          5.67        7.83        7.42        1.58
FSS-LHD-VP       5.5         2.5         5.08        2.67
TT-LHD*          8.83        1.67        7.08        5.0
P-LHD*           9.83        4.33        7.17        5.17
Random-MP*       7.75        9.0         8.25        9.67
FSS-Random-MP    3.17        10.17       7.25        8.58
FSS-LHD-MP       2.33        7.83        5.33        6.0
FSS-LHD-VP-MP    1.0         2.5         2.5         4.25
TT-LHD-MP*       5.58        1.67        2.25        7.17
P-LHD-MP*        5.5         4.33        4.08        6.75

Table 1: Average algorithm ranking for each combination of input space and design size. Shading indicates the strongest algorithms, and an asterisk indicates common divergence from the target number of points for each full-sub-space, causing measurement biases.
Figure 3 shows the normalized $M_{wdsr}$ for the FSS-based algorithms; the objective is to attain small discrepancies from the uniform distribution from the perspective of every full-sub-space. The left side of the figure shows the algorithm performance without the MaxPro design optimization, and the right side shows the performance with it. The weighted-discrepancy results show that -MP may worsen this criterion. This suggests that if subsequent experimentation is expected in a single full-sub-space, then -MP whole-design optimization may not be desired.

Figure 3: Algorithm Space-Filling Results, normalized $M_{wdsr}$ (lower is better)
Three algorithms (Random, TT-LHD, P-LHD) have an asterisk in Table 1, which marks that they often fail to create designs with point allocations that match the target allocations derived from the null-portion attributes. The differences between FSS point allocations and target allocations for these algorithms are shown in Figure 4. Random designs inconsistently sample the FSSs relative to the target point allocations. The FSS-based algorithms, by design, create designs that match the target point allocations.
Figure 4: Algorithm Full-Sub-Space Point Allocation Differences from Target Counts (lower is better)
Figure 5 shows the FSS-based algorithms' normalized minimum interpoint distances over the 30 replications by input space and design size, with and without MaxPro processing. With the incorporation of MaxPro processing, designs' $M_{idis}$ increases. The FSS-LHD-VP-MP algorithm had the best average for every input space and design size.

Figure 5: Algorithm Space-Filling Results, normalized $M_{idis}$ (higher is better)
5 Conclusion
The Python programming language provides a rich set of type hinting and annotation capabilities. raxpy makes it easy
to extend these capabilities to design, evaluate, and execute space-filling experiments with optional and hierarchical
dimensions. raxpy provides newly proposed design algorithms and measurement techniques, extended to support spaces
with optional and hierarchical dimensions. Numeric experimentation highlights differences between the proposed
design algorithms and demonstrates the improved space-filling capabilities of the algorithms.
While raxpy supports creating designs with Union parameters, future research is suggested to study space-filling optimizations for this type of dimension. List function parameters provide a similar unexplored area of future research. Another future research area is the ability to design exploratory experiments
for sequential point executions given optional and hierarchical input dimensions. The expressive annotation capabilities of raxpy could also be used to simplify optimization experiments of Python functions. We plan to work with the open-source community to refine raxpy given user-desired use cases. For the latest examples, see https://github.com/neil-r/raxpy/tree/main/examples.
6 Acknowledgments
The authors declare no conflicts of interest and confirm that no external funding supported this research. The views
and interpretations presented are solely those of the authors and do not necessarily reflect the positions or policies of
their affiliated institution(s). We acknowledge and thank Kyle Daley for his support with code documentation and formatting.
References
Susan M. Sanchez, Paul J. Sanchez, and Hong Wan. Work Smarter, Not Harder: A Tutorial on Designing and
Conducting Simulation Experiments. In 2020 Winter Simulation Conference (WSC), pages 1128–1142, Orlando,
FL, USA, December 2020. IEEE. ISBN 978-1-72819-499-8. doi:10.1109/WSC48552.2020.9384057. URL
https://ieeexplore.ieee.org/document/9384057/.
Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller,
Nikita Rudin, Arthur Allshire, Ankur Handa, and Gavriel State. Isaac Gym: High Performance GPU-Based Physics
Simulation For Robot Learning, 2021. URL https://arxiv.org/abs/2108.10470. Version Number: 2.
G. E. P. Box and H. L. Lucas. Design of Experiments in Non-Linear Situations. Biometrika, 46(1/2):77, June 1959.
ISSN 00063444. doi:10.2307/2332810. URL https://www.jstor.org/stable/2332810?origin=crossref.
Susan M. Sanchez and Paul J. Sánchez. Better Big Data via Data Farming Experiments. In Andreas Tolk, John
Fowler, Guodong Shao, and Enver Yücesan, editors, Advances in Modeling and Simulation, pages 159–179. Springer
International Publishing, Cham, 2017. ISBN 978-3-319-64181-2 978-3-319-64182-9. doi:10.1007/978-3-319-
64182-9_9. URL http://link.springer.com/10.1007/978-3-319-64182-9_9. Series Title: Simulation
Foundations, Methods and Applications.
Rosa Arboretti, Riccardo Ceccato, Luca Pegoraro, and Luigi Salmaso. Design choice and machine learning model
performances. Quality and Reliability Engineering International, 38(7):3357–3378, November 2022. ISSN 0748-
8017, 1099-1638. doi:10.1002/qre.3123. URL https://onlinelibrary.wiley.com/doi/10.1002/qre.3123.
Roberto Fontana, Alberto Molena, Luca Pegoraro, and Luigi Salmaso. Design of experiments and machine learning with
application to industrial experiments. Statistical Papers, 64(4):1251–1274, August 2023. ISSN 0932-5026, 1613-9798.
doi:10.1007/s00362-023-01437-w. URL https://link.springer.com/10.1007/s00362-023-01437-w.
Kade Young, Maria L. Weese, Jonathan W. Stallrich, Byran J. Smucker, and David J. Edwards. A graphical comparison
of screening designs using support recovery probabilities. Journal of Quality Technology, 56(4):355–368, August
2024. ISSN 0022-4065, 2575-6230. doi:10.1080/00224065.2024.2356127. URL https://www.tandfonline.
com/doi/full/10.1080/00224065.2024.2356127.
V. Roshan Joseph. Space-filling designs for computer experiments: A review. Quality Engineering, 28(1):28–35, January
2016. ISSN 0898-2112, 1532-4222. doi:10.1080/08982112.2015.1100447. URL http://www.tandfonline.com/
doi/full/10.1080/08982112.2015.1100447.
M.E. Johnson, L.M. Moore, and D. Ylvisaker. Minimax and maximin distance designs. Journal of Statistical
Planning and Inference, 26(2):131–148, October 1990. ISSN 03783758. doi:10.1016/0378-3758(90)90122-B. URL
https://linkinghub.elsevier.com/retrieve/pii/037837589090122B.
Art B. Owen. ORTHOGONAL ARRAYS FOR COMPUTER EXPERIMENTS, INTEGRATION AND VISUAL-
IZATION. Statistica Sinica, 2(2):439–452, 1992. ISSN 10170405, 19968507. URL http://www.jstor.org/
stable/24304869. Publisher: Institute of Statistical Science, Academia Sinica.
G Damblin, M Couplet, and B Iooss. Numerical studies of space-filling designs: optimization of Latin Hypercube
Samples and subprojection properties. Journal of Simulation, 7(4):276–289, November 2013. ISSN 1747-7778, 1747-
7786. doi:10.1057/jos.2013.16. URL https://www.tandfonline.com/doi/full/10.1057/jos.2013.16.
V. Roshan Joseph, Evren Gul, and Shan Ba. Designing computer experiments with multiple types of factors: The
MaxPro approach. Journal of Quality Technology, 52(4):343–354, October 2020. ISSN 0022-4065, 2575-6230.
doi:10.1080/00224065.2019.1611351. URL https://www.tandfonline.com/doi/full/10.1080/00224065.
2019.1611351.
Ryan Lekivetz and Bradley Jones. Fast flexible space-filling designs with nominal factors for nonrectangular regions.
Quality and Reliability Engineering International, 35(2):677–684, March 2019. ISSN 0748-8017, 1099-1638.
doi:10.1002/qre.2429. URL https://onlinelibrary.wiley.com/doi/10.1002/qre.2429.
Ryan Lekivetz and Bradley Jones. Fast Flexible Space-Filling Designs for Nonrectangular Regions. Quality and
Reliability Engineering International, 31(5):829–837, July 2015. ISSN 07488017. doi:10.1002/qre.1640. URL
https://onlinelibrary.wiley.com/doi/10.1002/qre.1640.
Ray-Bing Chen, Chi-Hao Li, Ying Hung, and Weichung Wang. Optimal Noncollapsing Space-Filling Designs for
Irregular Experimental Regions. Journal of Computational and Graphical Statistics, 28(1):74–91, January 2019.
ISSN 1061-8600, 1537-2715. doi:10.1080/10618600.2018.1482760. URL https://www.tandfonline.com/
doi/full/10.1080/10618600.2018.1482760.
Zeping Wu, Donghui Wang, Wenjie Wang, Kun Zhao, Patrick N. Okolo, and Weihua Zhang. Space-filling experimental
designs for constrained design spaces. Engineering Optimization, 51(9):1495–1508, September 2019. ISSN 0305-
215X, 1029-0273. doi:10.1080/0305215X.2018.1542691. URL https://www.tandfonline.com/doi/full/10.
1080/0305215X.2018.1542691.
Akın Özdemir and Mehmet Turkoz. Robust design modeling and optimization for dealing with a non-convex design space. Computers & Industrial Engineering, 185:109688, November 2023. ISSN
03608352. doi:10.1016/j.cie.2023.109688. URL https://linkinghub.elsevier.com/retrieve/pii/
S036083522300712X.
Ping-Yang Chen, Ray-Bing Chen, and Weng Kee Wong. Particle swarm optimization for searching efficient experimental
designs: A review. WIREs Computational Statistics, 14(5):e1578, September 2022. ISSN 1939-5108, 1939-0068.
doi:10.1002/wics.1578. URL https://wires.onlinelibrary.wiley.com/doi/10.1002/wics.1578.
Xue-Ru Zhang, Min-Qian Liu, and Yong-Dao Zhou. Orthogonal uniform composite designs. Journal of Statistical
Planning and Inference, 206:100–110, May 2020. ISSN 03783758. doi:10.1016/j.jspi.2019.08.007. URL https:
//linkinghub.elsevier.com/retrieve/pii/S0378375819300898.
Peter Z. G. Qian. Sliced Latin Hypercube Designs. Journal of the American Statistical Association, 107(497):393–399,
March 2012. ISSN 0162-1459, 1537-274X. doi:10.1080/01621459.2011.644132. URL http://www.tandfonline.
com/doi/abs/10.1080/01621459.2011.644132.
Xinwei Deng, Ying Hung, and C. Devon Lin. Design for computer experiments with qualitative and quantitative factors.
Statistica Sinica, 2015. ISSN 10170405. doi:10.5705/ss.2013.388. URL http://www3.stat.sinica.edu.tw/
statistica/J25N4/J25N414/J25N414.html.
K. Crombecq, E. Laermans, and T. Dhaene. Efficient space-filling and non-collapsing sequential design strategies
for simulation-based modeling. European Journal of Operational Research, 214(3):683–696, November 2011.
ISSN 03772217. doi:10.1016/j.ejor.2011.05.032. URL https://linkinghub.elsevier.com/retrieve/pii/
S0377221711004577.
Razi Sheikholeslami and Saman Razavi. Progressive Latin Hypercube Sampling: An efficient approach for robust sampling-based analysis of environmental models. Environmental Modelling & Software, 93:109–126, July 2017.
ISSN 13648152. doi:10.1016/j.envsoft.2017.03.010. URL https://linkinghub.elsevier.com/retrieve/
pii/S1364815216305096.
J.D. Parker, T.W. Lucas, and W.M. Carlyle. Sequentially extending space-filling experimental designs by optimally
permuting and stacking columns of the design matrix. European Journal of Operational Research, 319(2):600–610,
December 2024. ISSN 03772217. doi:10.1016/j.ejor.2024.06.020. URL https://linkinghub.elsevier.com/
retrieve/pii/S0377221724004685.
Zhixiang Wang, Dapeng Zhang, Yongjun Lei, Zeping Wu, Jie Wang, Xing OuYang, and Jun Wang. Constrained
space-filling and non-collapsing sequential design of experiments and its application for the lightweight design
of cylindrical stiffened shells. Structural and Multidisciplinary Optimization, 64(6):3265–3286, December 2021.
ISSN 1615-147X, 1615-1488. doi:10.1007/s00158-021-02948-6. URL https://link.springer.com/10.1007/
s00158-021-02948-6.
Lu Lu, Christine M. Anderson-Cook, Miranda Martin, and Towfiq Ahmed. Practical choices for space-filling designs.
Quality and Reliability Engineering International, 38(3):1165–1188, April 2022. ISSN 0748-8017, 1099-1638.
doi:10.1002/qre.2884. URL https://onlinelibrary.wiley.com/doi/10.1002/qre.2884.
Bernd Bischl, Jakob Richter, Jakob Bossek, Daniel Horn, Janek Thomas, and Michel Lang. mlrMBO: A Modular
Framework for Model-Based Optimization of Expensive Black-Box Functions, 2017. URL https://arxiv.org/
abs/1703.03373. Version Number: 3.
James Bergstra, Rémi Bardenet, Yoshua Bengio, and Balázs Kégl. Algorithms for hyper-parameter optimization. In
J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, and K.Q. Weinberger, editors, Advances in Neural Information
Processing Systems, volume 24. Curran Associates, Inc., 2011. URL https://proceedings.neurips.cc/
paper_files/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf.
Yang Li, Yu Shen, Wentao Zhang, Ce Zhang, and Bin Cui. VolcanoML: speeding up end-to-end AutoML via scalable
search space decomposition. The VLDB Journal, 32(2):389–413, March 2023. ISSN 1066-8888, 0949-877X.
doi:10.1007/s00778-022-00752-2. URL https://link.springer.com/10.1007/s00778-022-00752-2.
Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski,
Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod
Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, İlhan Polat,
Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E. A.
Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and
SciPy 1.0 Contributors. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods,
17:261–272, 2020. doi:10.1038/s41592-019-0686-2.
Yong-Dao Zhou, Kai-Tai Fang, and Jian-Hui Ning. Mixture discrepancy for quasi-random point sets. Journal of
Complexity, 29(3-4):283–301, 2013.
Shan Ba and V. Roshan Joseph. MaxPro: Maximum Projection Designs, January 2015. URL https://CRAN.
R-project.org/package=MaxPro. Institution: Comprehensive R Archive Network Pages: 4.1-2.
A Anil Kumar, Baidya Nath Mandal, Rajender Parsad, Sukanta Dash, and Mukesh Kumar. SlicedLHD: Sliced Latin
Hypercube Designs, February 2024. URL https://CRAN.R-project.org/package=SlicedLHD. Institution:
Comprehensive R Archive Network Pages: 1.0.
SAS Institute Inc. JMP® Pro. Cary, NC, 1989.
Thomas W. Lucas and Jeffrey D. Parker. The Variability in Design-Quality Measures for Multiple Types of Space-Filling
Designs Created by Leading Software Packages. In 2023 Winter Simulation Conference (WSC), pages 516–527, San
Antonio, TX, USA, December 2023. IEEE. ISBN 9798350369663. doi:10.1109/WSC60868.2023.10407287. URL
https://ieeexplore.ieee.org/document/10407287/.
Guido van Rossum, Jukka Lehtosalo, and Łukasz Langa. PEP 484 – Type Hints. Python Software Foundation, 2014.
URL https://peps.python.org/pep-0484/. Accessed: 2024-09-03.
Till Varoquaux and Konstantin Kashin. PEP 593 – Flexible function and variable annotations. Python Software
Foundation, 2019. URL https://peps.python.org/pep-0593/. Accessed: 2024-09-03.
M. Lindauer, K. Eggensperger, M. Feurer, A. Biedenkapp, J. Marben, P. Müller, and F. Hutter. BOAH: A tool suite for multi-fidelity Bayesian optimization & analysis of hyperparameters. arXiv:1908.06756 [cs.LG], 2019.
16
