Tree-based formulation for the multi-commodity flow problem

Simon Spoorendonk
Flowty ApS
Denmark
simon@flowty.ai
&Bjørn Petersen
Flowty ApS
Denmark
bjorn@flowty.ai
Abstract

We introduce a tree-based formulation for the minimum-cost multi-commodity flow problem that addresses large-scale instances. The method decomposes the source-based model by representing flows as convex combinations of trees rooted at source nodes, and solves the resulting formulation with column generation. The number of demand constraints now depends on the number of sources |S||S|, not commodities |K||K|, yielding a compact master problem when |S||K||S|\ll|K|. We conduct a computational study comparing tree-based decomposition against path-based column generation and direct LP solving. The results show speed-ups of up to one order of magnitude over direct LP solving, and improved scalability compared to path-based formulations. Tree-based decomposition enables solving instances with millions of commodities and hundreds of thousands of nodes. This makes it well-suited for applications in transportation and logistics networks where multiple demands often share common origins.

Keywords Multi-commodity flow problem \cdot Column generation

1 Introduction

The minimum-cost multi-commodity flow (MCF) problem is a fundamental network optimization problem with applications in logistics, passenger transportation, telecommunication, and energy systems [1, 2]. It routes multiple origin–destination demands through a capacitated network at minimum cost. The MCF problem can be solved as a linear programming model, but the number of variables and constraints grows rapidly, making large-scale instances computationally challenging.

The classical edge-based formulation introduces O(|K||E|)O(|K||E|) variables and O(|K||V|+|E|)O(|K||V|+|E|) constraints. This becomes intractable when either the commodity set KK or the network size grows. Source-based formulations aggregate by source nodes, reducing the model to O(|S||E|)O(|S||E|) variables and O(|S||V|+|E|)O(|S||V|+|E|) constraints. When |S||K||S|\ll|K| this yields savings, but commodity-level flows must then be reconstructed [1]. Path-based formulations, obtained via Dantzig–Wolfe decomposition, replace edge variables by path variables. Each commodity kKk\in K has an exponential path set 𝒫k\mathcal{P}_{k}, and column generation dynamically adds paths [3, 4]. However, the master problem still contains O(|K|)O(|K|) demand constraints, and restricted master problems grow prohibitively large when |K||K| reaches millions.

Research has therefore focused on decomposition and relaxation approaches. Stabilized column generation methods [5], analytic center cutting-plane algorithms with active sets, and Lagrangian relaxation [6] have extended the tractable instance size. Recently, Zhang and Boyd [7] proposed a GPU–compatible primal–dual hybrid gradient method for the all-pair max-flow variant of the MCF problem, achieving orders-of-magnitude speed-ups over interior-point solvers. It remains to be investigated whether this is also true for the min-cost MCF problem. Destination-aggregated formulations have also been investigated in communication and optical networks [8]. These advances illustrate that scalability is based on exploiting the problem structure and avoiding the explicit introduction of unnecessary flow variables.

To address the master problem bottleneck, we study a tree-based decomposition of the MCF problem that are derived from the source-based formulation by representing flows as combinations of trees rooted at source nodes. This approach drastically reduces the size of the master problem in column generation while retaining the ability to model individual commodity demands. The master problem has now O(|S|)O(|S|) demand constraints and O(|E|)O(|E|) capacity constraints. The number of variables is still exponentially large but in practice much smaller than the path-based formulation since fewer tree variables are needed in an optimal basis. The approach pushes the scalability frontier of exact MCF optimization, solving instances with tens of thousands of nodes and millions of commodities.

The remainder of the paper is organized as follows. section 2 introduces the formulations of the MCF problem including the tree-based formulation introduced in this paper. section 3 describes the column generation algorithm used for the path- and tree-based formulations. section 4 presents computational experiments on large-scale instances, and section 5 concludes with directions for future research.

2 Problem statement

We present four formulations: the edge-based formulation, the source-based formulation, the path-based formulation commonly solved via column generation, and our novel tree-based formulation.

We consider a directed network G=(V,E)G=(V,E) with node set VV and edge set EE. Each edge eEe\in E has capacity ue0u_{e}\geq 0. The set of commodities is KK, with each commodity kKk\in K defined by source skVs_{k}\in V, sink tkVt_{k}\in V, and demand dk>0d_{k}>0. Sending one unit of commodity over edge ee incurs cost cec_{e}. The objective of the MCF problem is to route all demands at minimum cost without exceeding edge capacities.

2.1 Edge-based formulation

The classical edge-based linear program introduces variables fekf^{k}_{e} for the flow of commodity kk on edge ee. The model is

min\displaystyle\min\quad kKeEcefek\displaystyle\sum_{k\in K}\sum_{e\in E}c_{e}f^{k}_{e} (1)
s.t. eδ+(i)fekeδ(i)fek={dk,i=sk,dk,i=tk,0,otherwise,\displaystyle\sum_{e\in\delta^{+}(i)}f^{k}_{e}-\sum_{e\in\delta^{-}(i)}f^{k}_{e}=\begin{cases}d_{k},&i=s_{k},\\ -d_{k},&i=t_{k},\\ 0,&\text{otherwise},\end{cases} iV,kK,\displaystyle\forall i\in V,\;k\in K, (2)
kKfekue,\displaystyle\sum_{k\in K}f^{k}_{e}\leq u_{e}, eE,\displaystyle\forall e\in E, (3)
fek0,\displaystyle f^{k}_{e}\geq 0, eE,kK.\displaystyle\forall e\in E,\;\forall k\in K. (4)

where δ+(i)\delta^{+}(i) and δ(i)\delta^{-}(i) denote the outgoing and incoming edges of node ii, respectively. This formulation has O(|K||E|)O(|K||E|) variables and O(|K||V|+|E|)O(|K||V|+|E|) constraints.

2.2 Source-based formulation

Source-based formulations aggregate flows by source node. For each source sSs\in S, let fesf^{s}_{e} denote the total flow on edge ee originating at ss, and let dis=kK:sk=sdkd^{s}_{i}=\sum_{k\in K:s_{k}=s}d_{k} denote the total demand from ss to node ii. The formulation is

min\displaystyle\min\quad sSeEcefes\displaystyle\sum_{s\in S}\sum_{e\in E}c_{e}f^{s}_{e} (5)
s.t. eδ+(i)feseδ(i)fes={disif i=sdkif i=tkkK:sk=s0otherwise\displaystyle\sum_{e\in\delta^{+}(i)}f^{s}_{e}-\sum_{e\in\delta^{-}(i)}f^{s}_{e}=\begin{cases}d^{s}_{i}&\text{if }i=s\\ -d_{k}&\text{if }i=t_{k}\forall k\in K:s_{k}=s\\ 0&\text{otherwise}\end{cases} iV,sS,\displaystyle\forall i\in V,\;\forall s\in S, (6)
sSfesue,\displaystyle\sum_{s\in S}f^{s}_{e}\leq u_{e}, eE,\displaystyle\forall e\in E, (7)
fes0,\displaystyle f^{s}_{e}\geq 0, eE,sS.\displaystyle\forall e\in E,\;\forall s\in S. (8)

This model has O(|S||E|)O(|S||E|) variables and O(|S||V|+|E|)O(|S||V|+|E|) constraints, where |S||S| is the number of unique source nodes. When |S||K||S|\ll|K|, this yields a significant reduction in model size compared to the edge-based formulation. The trade-off is that the commodity-level flow decomposition is lost and must be reconstructed in a post-processing step.

Remark (on reconstruction).

The source-aggregated flows can be disaggregated back into individual commodity flows through standard flow decomposition algorithms. For each source-sink pair (s,t)(s,t), we iteratively extract sstt paths via augmenting-path search until the required demand is satisfied. This procedure yields a path-based decomposition with at most |E||E| paths per source-sink pair and runs in O(|E||V|)O(|E||V|) time per pair [1]. The resulting paths directly correspond to feasible commodity routings that respect capacity constraints [2].

2.3 Path-based formulation

Path-based formulations rely on a Dantzig–Wolfe decomposition of the edge-based formulation. For each commodity kKk\in K, let 𝒫k\mathcal{P}_{k} be the set of all feasible paths from sks_{k} to tkt_{k}. Introduce variables xpk0x^{k}_{p}\geq 0 denoting the flow of commodity kk on path p𝒫kp\in\mathcal{P}_{k}. The model is

min\displaystyle\min\quad kKp𝒫kcpxpk\displaystyle\sum_{k\in K}\sum_{p\in\mathcal{P}_{k}}c_{p}x^{k}_{p} (9)
s.t. p𝒫kxpk=dk,\displaystyle\sum_{p\in\mathcal{P}_{k}}x^{k}_{p}=d_{k}, kK,\displaystyle\forall k\in K, (10)
kKp𝒫k:epxpkue,\displaystyle\sum_{k\in K}\sum_{p\in\mathcal{P}_{k}:e\in p}x^{k}_{p}\leq u_{e}, eE,\displaystyle\forall e\in E, (11)
xpk0,\displaystyle x^{k}_{p}\geq 0, kK,p𝒫k.\displaystyle\forall k\in K,\;p\in\mathcal{P}_{k}. (12)

where cp=epcec_{p}=\sum_{e\in p}c_{e} is the cost of path pp.

This formulation arises as a Dantzig–Wolfe decomposition of the edge-based model. Each commodity kk defines a sub-problem whose feasible region is the set of all sks_{k}tkt_{k} flows of value dkd_{k}. The extreme points of this polytope correspond to simple sks_{k}tkt_{k} paths. Thus, path variables xpkx^{k}_{p} play the role of convex combination weights over extreme points in the decomposition [9, 10, 3, 4]. The formulation has O(|K|+|E|)O(|K|+|E|) constraints. The size of the path set |𝒫k||\mathcal{P}_{k}| is exponential in |V||V| and can be solved with column generation, see section 3.

2.4 Tree-based formulation

The tree-based formulation is the equivalent of a Dantzig–Wolfe decomposition of the source-based formulation by representing flows as convex combinations of shortest-path trees rooted at each source. Instead of sending flow independently along paths, we aggregate commodities with the same source into tree structures that simultaneously provide feasible routes to all sinks of that source. This yields a compact master problem, since the number of trees required can be much smaller than the number of paths.

For each source sSs\in S, let 𝒯s\mathcal{T}_{s} denote the set of trees rooted at ss that contain all sinks of commodities with sk=ss_{k}=s. We denote the sinks Ks={kK:sk=s}K_{s}=\{k\in K:s_{k}=s\}. A decision variable xτs0x^{s}_{\tau}\geq 0 is associated with each tree τ𝒯s\tau\in\mathcal{T}_{s}, indicating the fraction of flow routed through τ\tau. Let eτe\in\tau be an edge in the tree τ\tau and let pkp_{k} be the path from sks_{k} to tkt_{k} in the tree τ\tau for kKsk\in K_{s}. The model is

min\displaystyle\min\quad sSτ𝒯scτxτs\displaystyle\sum_{s\in S}\sum_{\tau\in\mathcal{T}_{s}}c^{\tau}x^{s}_{\tau} (13)
s.t. τ𝒯sxτs=1,\displaystyle\sum_{\tau\in\mathcal{T}_{s}}x^{s}_{\tau}=1, sS,\displaystyle\forall s\in S, (14)
sSτ𝒯s:eτf¯τexτsue,\displaystyle\sum_{s\in S}\sum_{\tau\in\mathcal{T}_{s}:\,e\in\tau}\bar{f}^{e}_{\tau}x^{s}_{\tau}\leq u_{e}, eE,\displaystyle\forall e\in E, (15)
xτs0,\displaystyle x^{s}_{\tau}\geq 0, sS,τ𝒯s.\displaystyle\forall s\in S,\;\tau\in\mathcal{T}_{s}. (16)

Here f¯τe=kKspkτ:epkdk\bar{f}^{e}_{\tau}=\sum_{k\in K_{s}}\sum_{p_{k}\in\tau:e\in p_{k}}d_{k} is the total flow on edge ee in tree τ\tau, and cτ=eτf¯τecec_{\tau}=\sum_{e\in\tau}\bar{f}^{e}_{\tau}c_{e} denotes the cost of tree τ\tau. The formulation has O(|S|+|E|)O(|S|+|E|) constraints. The tree set |𝒯s||\mathcal{T}_{s}| is exponential in size.

3 Column Generation

Column generation is a powerful decomposition technique for solving large-scale linear programming problems that would be intractable to solve directly. The method has been extensively studied and successfully applied to many problem classes including vehicle routing, crew scheduling, and network flow problems [3, 4].

The approach decomposes the original problem into a restricted master problem (RMP) containing only a subset of variables, and one or more pricing sub-problems that generate new variables (columns) as needed. The algorithm iterates between solving the RMP to obtain dual values, and using these to find columns with negative reduced cost in the pricing problems. When no such columns exist, the current solution is optimal.

For the MCF problem, we apply column generation to both the path-based and tree-based formulations described above. The key difference lies in the structure of the pricing problems - finding shortest paths versus shortest path trees. Below we detail the specific column generation procedures for each formulation, followed by implementation techniques that exploit the network structure to accelerate convergence.

3.1 Path generation

For the path-based formulation the column generation procedure alternates between:

  • Restricted master problem (RMP): solve (9)–(12) using only a subset of paths from kK𝒫k\cup_{k\in K}\mathcal{P}_{k}.

  • Pricing sub-problem: for each commodity kk, compute a shortest sks_{k}tkt_{k} path with respect to the reduced costs given by the dual variables of the RMP. If a path of negative reduced cost is found, it is added as a new column.

Let πk\pi_{k} denote the dual variable associated with the demand constraint of commodity kk (10), and let μe\mu_{e} denote the dual variable associated with the capacity constraint on edge ee (11). Then the reduced cost of a path p𝒫kp\in\mathcal{P}_{k} is

c¯p=ep(ceμe)πk.\displaystyle\bar{c}_{p}=\sum_{e\in p}\big(c_{e}-\mu_{e}\big)-\pi_{k}. (17)

Thus, finding a column of negative reduced cost for commodity kk amounts to computing an sks_{k}tkt_{k} path that minimizes ep(ceμe)\sum_{e\in p}(c_{e}-\mu_{e}) and then subtracting the demand dual πk\pi_{k}. Since all edge weights ce0c_{e}\geq 0 and μe0\mu_{e}\leq 0, the modified costs ceμec_{e}-\mu_{e} remain bounded below, so the pricing problem reduces to a shortest path problem in a weighted graph.

This allows the use of efficient single-source shortest path algorithms such as Dijkstra’s method. In particular, when many commodities share the same source ss, a single run of Dijkstra’s algorithm yields shortest paths to all sinks tkt_{k} with sk=ss_{k}=s, producing one pricing candidate for each commodity simultaneously. This reduces the number of Dijkstra runs per iteration from |K||K| to |S||S|.

3.2 Tree generation

For the tree-based formulation the procedure is:

  • Restricted master problem (RMP): solve (13)-(16) using only a subset of trees from kK𝒯s\cup_{k\in K}\mathcal{T}_{s}.

  • Pricing sub-problem: for each source sSs\in S, compute a shortest path tree rooted at ss that covers all sinks tkt_{k} where sk=ss_{k}=s.

Let πs\pi_{s} denote the dual variable of (14) and μe\mu_{e} the dual variable of (15). The reduced cost of a tree τ𝒯s\tau\in\mathcal{T}_{s} is

c¯τ=eτf¯τe(ceμe)πs.\displaystyle\bar{c}_{\tau}=\sum_{e\in\tau}\bar{f}^{e}_{\tau}\big(c_{e}-\mu_{e}\big)-\pi_{s}. (18)

Hence, pricing reduces to finding a minimum-cost tree rooted at ss with respect to modified edge costs cesμec^{s}_{e}-\mu_{e}. Since all edge weights are non-negative, this corresponds to computing a shortest-path tree (SPT). A single-source shortest path computation from source ss yields a set of shortest paths pkp_{k} to all sinks tkt_{k} with sk=ss_{k}=s where the unique set of edges τ=epk,kKse\tau=\cap_{e\in p_{k},k\in K_{s}}e is the tree. The cost of the paths corresponds to sending one unit of flow from ss to all sinks tkt_{k} with sk=ss_{k}=s. The edge flow values f¯τe\bar{f}^{e}_{\tau} are calculated in a post-processing step as described in subsection 2.4 which allows us to calculate the reduced cost of the tree.

Observe that every path in the tree is the minimum reduced cost path from ss to any sink tkt_{k} with sk=ss_{k}=s. Hence, the tree is the minimum reduced cost tree even when scaled by demands dkd_{k}

3.3 Implementation details

Bounded single-source pricing.

For a fixed source ss, define Πs:=maxk:sk=sπk\Pi_{s}:=\max_{k:\,s_{k}=s}\pi_{k}. During a single run of Dijkstra on adjusted edge weights c~e:=ceμe\tilde{c}_{e}:=c_{e}-\mu_{e}, extract-min keys are tentative distances d(v)d(v) from ss. If the current minimum key satisfies d(v)Πsd(v)\geq\Pi_{s}, then for every remaining destination tkt_{k} with sk=ss_{k}=s we have

d(tk)shortestπkd(v)πkΠsπk 0,\underbrace{d(t_{k})}_{\text{shortest}}-\pi_{k}\;\geq\;d(v)-\pi_{k}\;\geq\;\Pi_{s}-\pi_{k}\;\geq\;0,

so no path with negative reduced cost exists and the pricing for source ss can terminate early.

A* with reverse distance bounds.

Let h(v)h(v) be an admissible and consistent lower bound on the remaining distance from vv to any destination tkt_{k} with sk=ss_{k}=s under the adjusted costs c~e=ceμe\tilde{c}_{e}=c_{e}-\mu_{e} (e.g., obtained from a reverse single-source shortest path). Run A* on c~\tilde{c} with keys f(v)=g(v)+h(v)f(v)=g(v)+h(v), where g(v)g(v) is the tentative distance from ss. The same global stop test as above applies: if minvf(v)Πs\min_{v}f(v)\geq\Pi_{s}, no path of negative reduced cost exists for source ss. Exact reverse distances tighten hh but may be memory-intensive to store for large graphs when |S||S| is large.

Precomputing reverse multi-target bounds.

We compute a lower bound h(v)h(v) from any node vv to any destination tkt_{k} with skSs_{k}\in S. Running multi-source Dijkstra on the reverse graph yields

h(v)=mink:skSdistrev(v,tk),h(v)\;=\;\min_{k:\,s_{k}\in S}\;\text{dist}_{\text{rev}}(v,t_{k}),

which is an admissible and consistent heuristic for all commodities with source ss. This provides lower bounds that are worse than if we considered a single fixed source ss. However, in this scenario only |V||V| lower bounds are stored which is much more memory efficient.

Lazy addition of capacity constraints.

In the RMP, the demand constraints, (10) and (14) respectively, are always enforced, while the capacity constraints, (11) and (15) respectively, constraints can be introduced lazily. At optimality, only the binding edge constraints matter, since the dual multiplier of a non-binding edge constraint is zero. Hence, adding all |E||E| capacity constraints upfront is unnecessary. Instead, one begins with a subset (potentially empty) and monitors edge utilizations during column capacity constraints upfront is unnecessary. Instead, one begins with a subset (potentially empty) and monitors edge utilizations during column generation. Whenever an edge ee becomes violated, the corresponding capacity constraint is added to the RMP (row generation). This significantly reduces the initial model size and is essential for scalability on large networks [5, 6].

This mechanism is closely related to the active set strategy proposed in [6], where only those arc constraints that are likely to be saturated are kept in the RMP. Non-congested arcs are temporarily ignored until they become active, at which point their capacity constraints are inserted. Such strategies exploit the empirical fact that in optimal solutions of large multi-commodity problems, only a small fraction of edges are congested. By combining column generation (to handle the exponential number of paths) with active set row management (to handle the potentially huge set of edge capacities), the master problem is kept compact in both dimensions. This dual stabilization on rows and columns is crucial for solving instances with millions of commodities and tens of thousands of edges.

Balancing master and pricing problems.

The computational effort between solving the master problem and pricing problem can vary significantly based on instance characteristics and network structure. We adapt our strategy accordingly.

When the master RMP is computationally easier:

  • Prioritize re-optimizing the master problem whenever lazy cuts need to be added, before attempting any pricing

  • Apply the pricing filter technique from [11] - only price commodities whose previously generated columns use edges from newly added capacity constraints

  • Only remove the filter if fewer than ϵ\epsilon columns are found

  • In the final iteration, solve all pricing problems to prove optimality

When pricing is computationally easier:

  • In each iteration, attempt both lazy cut separation and pricing problem solutions

  • Continue solving pricing problems until finding NN columns with negative reduced cost

  • If fewer than NN columns are found, solve all remaining pricing problems to obtain valid lower bounds via reduced cost calculations

Slack variables.

To improve convergence we avoid infeasibility of the master problem by adding slack variables. When the number of demand constraints are less than the number of edges we add a slack variable to demand constraints (10) and (14) respectively. Otherwise we add a slack variable to edge capacity constraints (11) and (15). The cost of the slack variables is a sufficiently large value, e.g., sum of all edges for the path-based formulation or sum of all edges times the sum of demand for tree-based formulation.

Note, that since capacity constraints are added dynamically, adding edge slack may result in small infeasibilities until all relevant constraints are added. In this case restore feasibility through a phase 1 by adding artificial variables. This can mostly be avoided by initially adding a column for each pricing problem. We do this when edge slack variables are used.

4 Experiments

We evaluate the performance of the path-based and tree-based formulations on a diverse set of instances. For comparison, we compare against a commercial LP solver (MOSEK) using the source-based formulation.

All experiments were performed on a machine with an AMD Ryzen 9 3950X 16-Core Processor and 128GB RAM running Ubuntu 24.04. The algorithms were implemented in C++. We used MOSEK 11.0.27 interior point method without crossover as the LP solver for both the full formulation and the restricted master problems. We use up to 32 threads.

Throughout the experiments we set a relative optimality tolerance of 1e-4 and use a timeout of 7200 seconds. For grid, planar and transportation networks the pricing problem is considered "easy" and for intermodal transport networks the master problem is considered "easy".

4.1 Instances

We consider 4 families of instances, grid, planar, transportation networks, and intermodal transport networks. The grid and planar instances are from the MCF instances benchmark [12]. The transportation networks are from the transportation networks benchmark [13]. The intermodal transport networks are from the intermodal transport networks benchmark [11].

The provided transportation networks are non-linear maximum flow MFC problems, and as in [6] we convert them to min-cost MCF problems. The data is converted such that "free flow time" is used as the unit edge cost, and we divide the demands of a problem by a coefficient. As previously proposed, we increase the coefficient until the problems becomes feasible. Table 3 in Appendix A shows the coefficients used for the transportation network instances.

[6] addresses these problems (with less accuracy 1e-3 vs 1e-4 relative). However, we were not able to reproduce their result using their coefficient values with the current version of transportation network instances [13]. Hence, we have slightly different coefficients and optimal objective values.

The intermodal transport networks are derived from the Munich public transport network. We have used the code of [11] to generate the instances with seed 0. Preliminary tests showed that instances with seeds 1 to 9 were computationally equivalent to seed 0, so due to lack of difference in behaviors we only consider seed 0. Additionally we have filtered out any trips that were not connected in the generated network.

Table 1 shows the characteristics of the instance families. Grid and planar are relatively small but have many commodity sources per vertices. The transportation networks have a large number of commodities with many shared sources. The intermodal transport networks are large in number of vertices and edges but have a relatively small number of commodities with no shared sources. See the Table 4 in Appendix A for a full overview over instance characteristics.

Family Nodes Edges Commodities Sources Variables Constraints Non-zeros
Grid 523 2,011 4,967 463 1.6e+06 4.1e+05 4.7e+06
Planar 607 3,230 13,920 607 4.8e+06 9.2e+05 1.4e+07
Transportation 10,833 27,108 943,682 1,114 5.0e+07 2.0e+07 1.5e+08
Intermodal 177,179 1,893,620 22,195 22,195 6.4e+10 4.6e+09 1.9e+11
Table 1: Instance family characteristics (mean values). Number of variables, constraints, and non-zeroes are based on the source-based formulation.

4.2 Performance

The performance profile of the path-based, tree-based formulations, and the source-based formulation is shown in Figure 1. The tree-based formulation outperforms the path-based formulation on most instances, the exception being planar2500. Both column generation formulations solves all instances (43) within the time limit compared to the source-based formulation (26) that cannot solve the largest instances due to memory limitations. It is noteworthy, that the source-based formulation with MOSEK did not timeout on any of the instances if it was possible to construct the instance in memory. Detailed results for the instances are shown in Table 5 in Appendix A.

Refer to caption
Figure 1: Performance profile of the path-based and tree-based formulations against the source-based formulation

The cactus plot of the path-based, tree-based formulations, and the source-based formulation is shown in Figure 2 and Figure 3 for runtime and memory respectively. From the plots it is seen that both column generation algorithms are on par with each other, with the tree-based formulation being slightly faster and using a little less memory. The source-based formulation is significantly slower and as expected it uses much more memory.

Table 2 summarize the geometric mean results for instances for runtime and memory. Runtime means are shifted by 10 seconds to reduce the impact of very small runtimes. Memory means are not shifted. From the scaled results it is clear that both column generation formulations outperform the source-based formulation both in runtime and memory.

Refer to caption
Figure 2: Cactus plot with runtime performance
Refer to caption
Figure 3: Cactus plot with memory performance
Solver Time Memory Solved of 43
Unscaled Scaled Unscaled Scaled
Tree 15.0 1.00 0.7 1.00 43
Path 21.2 1.42 0.9 1.39 43
Source 313.4 20.96 12.6 18.57 26
Table 2: Shifted geometric mean runtime and memory results

Comparison to other work.

We compare our implementation against several recent approaches from the literature. For the planar2500 instance, which was the largest instance solved in [5], our implementation is approximately 7 times faster, solving it in under 900 seconds compared to their 6169 seconds on the same machine.

The transportation network instances from [6] include challenging cases like ChicagoRegional and Philadelphia that could not be solved to an accuracy of 1e-4 in their work. While our problem coefficients differ slightly, we successfully solve all instances, including the larger Sydney network, to 1e-4 accuracy in under 700 seconds.

For the intermodal transport networks, we compare against [11] who solved their largest instance (SBT-56295) in 2856 seconds on an Intel Core i9-9900 (3.1 GHz). Accounting for processor speed differences based on benchmark data, their solution time normalizes to approximately 1203 seconds on our AMD Ryzen 9 3950X (3.5 GHz). In comparison, our implementation solves the same instance in just 115 seconds on average - more than a 10-fold improvement in performance.

4.3 Path and tree-based formulations comparison

Figure 4 and Figure 5 show the runtime and memory usage on the y-axis and size of the instance (in the number of non-zeroes in the LP) on the x-axis. The source-based formulation is included for completeness. The transportation network instances are located in the middle of the plots and the intermodal transport network instances are located most right in the plots.

It is seen that the tree- and path-based behave identical on the intermodal transport network instances. This is to be expected since all commodities have a unique source and destination with a demand of 1, hence the formulations are equivalent as all shortest paths trees are paths.

For the transportation network instances the tree-based formulation is faster than the path-based formulation. The plots indicate that high memory usage is correlated with high runtime, and the tree-based formulation is expected to use less memory than the path-based formulation due to fewer columns in the master problem.

Refer to caption
Figure 4: Runtime vs size in number of non-zeroes of the source-based formulation
Refer to caption
Figure 5: Memory vs size in number of non-zeroes of the source-based formulation

The scatter plot of the path-based and tree-based formulations is shown in Figure 6. The formulations behave identical on the grid, planar, and intermodal transport network instances, except for the instance planar2500. The tree-based formulation is faster on the transportation network instances that have many shared sources.

Refer to caption
Figure 6: Scatter plot of the path-based and tree-based formulations against each other

The relation between the number of commodities and the number of shared sources is explored via the heatmap in Figure 7. It shows the relations between the number of commodities and the fraction of shared sources and is color coded according to the speed-up. A large speed-up (yellow) indicates that the tree-based formulation is faster than the path-based formulation. It is seen that the tree-based formulation is faster than the path-based formulation for instances with a large number of commodities and a small fraction of shared sources. This is expected as the number of demand constraints in the master problem is only O(|S|)O(|S|) for the tree-based formulation and O(|K|)O(|K|) for the path-based formulation.

Refer to caption
Figure 7: Heatmap of the path-based and tree-based formulations against each other

We observe that the path-based formulation is faster on the largest planar instances. We expect this to be due to too slow column generation convergence and many pricing iterations for the tree-based formulation. The tree-based formulation generates at most |S||S| columns per iteration, while the path-based formulation generates up to |K||K| columns per iteration. This may stabilize the column generation process faster for the path-based formulation even if each master problem iteration is slower.

5 Conclusion

In this paper, we introduced a novel tree-based formulation for the multi-commodity flow problem. Rather than decomposing by individual commodity paths, our key innovation is to aggregate flows by shared sources and represent them using trees. This reduces the master problem size from O(|K|) to O(|S|) constraints, where |S| << |K| in many practical applications.

We demonstrated the advantages of this formulation through extensive computational experiments comparing it against traditional path-based decomposition and direct LP solving with MOSEK across diverse instance types including grid, planar, transportation, and intermodal networks. Our implementation using column generation showed that:

  • The tree-based formulation achieves up to 5x reduction in memory usage compared to path-based approaches by maintaining a more compact master problem

  • On large instances with many shared sources, the tree-based method scales significantly better, solving problems with millions of commodities with 5-30x speed-ups over the path-based formulation

  • Both decomposition approaches substantially outperform direct LP solving, with the tree-based formulation showing particular advantages on transportation networks where source sharing is common

The results establish that our tree-based formulation represents a significant advance in solving large-scale multi-commodity flow problems. By exploiting the natural structure of shared sources, it overcomes key scalability limitations of existing methods. The dramatic reduction in memory requirements and improved solution times make it possible to tackle problem sizes that were beyond the reach of previous approaches.

This innovation is particularly impactful for real-world applications in transportation and logistics networks where multiple commodities frequently share origins. The ability to efficiently handle millions of commodities while maintaining individual demand constraints opens new possibilities for detailed network optimization at scale.

Several promising directions could further enhance this approach. First-order optimization methods and GPU acceleration could improve master problem solve times. The pricing problem could benefit from edge filtering, path generation heuristics, and warm-starting techniques. Strategic selection of source nodes through pattern analysis and partitioning could also yield additional performance gains. These extensions could help realize the full potential of the tree-based paradigm for extremely large-scale network optimization.

References

  • [1] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, Englewood Cliffs, NJ, 1993.
  • [2] Khodakaram Salimifard and Sara Bigharaz. The multicommodity network flow problem: state of the art classification, applications, and solution methods. Operational Research, 22:1–47, 2022.
  • [3] Jacques Desrosiers, Marco Lübbecke, Guy Desaulniers, and Jean Bertrand Gauthier. Branch-and-price. Les Cahiers du GERAD G-2024-36, Groupe d’études et de recherche en analyse des décisions, GERAD, Montréal QC H3T 2A7, Canada, June 2024.
  • [4] Eduardo Uchoa, Artur Pessoa, and Lorenza Moreno. Optimizing with Column Generation: Advanced branch-cut-and-price algorithms (Part I). Technical Report L-2024-3, Cadernos do LOGIS-UFF, Universidade Federal Fluminense, Engenharia de Produção, August 2024.
  • [5] Jacek Gondzio, Pablo González-Brevis, and Pedro Munari. Large-scale optimization with the primal–dual column generation method. Mathematical Programming Computation, 8:47–82, 2016.
  • [6] Frédéric Babonneau, Olivier du Merle, and Jean-Philippe Vial. Solving large-scale linear multicommodity flow problems with an active set strategy and proximal-accpm. Operations Research, 54(1):184–197, 2006.
  • [7] Fangzhao Zhang and Stephen Boyd. Solving large multicommodity network flow problems on gpus. arXiv preprint, 2025.
  • [8] Ping Yin, Steven Diamond, Bill Lin, and Stephen Boyd. Network optimization for unified packet and circuit switched networks. Optimization and Engineering, 21:159–180, 2020.
  • [9] Cynthia Barnhart, Charles A. Hane, Ellis L. Johnson, and George Sigismondi. A column generation and partitioning approach for multi-commodity flow problems. Telecommunication Systems, 3:239–258, 1994.
  • [10] Cynthia Barnhart, Charles A. Hane, and Patrick H. Vance. Using branch-and-price-and-cut to solve origin-destination integer multicommodity flow problems. Operations Research, 48(2):318–326, 2000.
  • [11] Benedikt Lienkamp and Maximilian Schiffer. Column generation for solving large scale multi-commodity flow problems for passenger transportation. European Journal of Operational Research, 314(2):703–717, 2024.
  • [12] University of Pisa CommadLab. Multicommodity flow problems, 2025. Accessed 2025-08-27, https://commalab.di.unipi.it/datasets/mmcf/.
  • [13] Transportation Networks for Research Core Team. Transportation networks for research, 2025. GitHub repository. commit #d1639b4, Accessed 2025-01-27, https://github.com/bstabler/TransportationNetworks.

Appendix A Appendix

A.1 Instance details

Table 3 shows the coefficients used for the generation of min-cost transportation network instances. Table 4 shows the instance details.

Instance Coefficient
Transportation
Austin 6.0
Barcelona 5050.0
BerlinCenter 0.5
Birmingham 0.9
ChicagoRegional 4.1
ChicagoSketch 2.4
Philadelphia 7.0
Sydney 1.9
Winnipeg 2000.0
Table 3: Coefficients used for the generation of min-cost transportation network instances
Instance Nodes Edges Commodities Sources Variables Constraints Non-zeros
Grid
grid1 25 80 50 22 1.8e+03 6.3e+02 5.4e+03
grid2 25 80 100 25 2.0e+03 7.0e+02 6.1e+03
grid3 100 360 50 40 1.4e+04 4.4e+03 4.3e+04
grid4 100 360 100 64 2.3e+04 6.8e+03 6.9e+04
grid5 225 840 100 82 6.9e+04 1.9e+04 2.1e+05
grid6 225 840 200 138 1.2e+05 3.2e+04 3.5e+05
grid7 400 1,520 400 251 3.8e+05 1.0e+05 1.1e+06
grid8 625 2,400 500 356 8.5e+05 2.2e+05 2.6e+06
grid9 625 2,400 1,000 495 1.2e+06 3.1e+05 3.6e+06
grid10 625 2,400 2,000 603 1.4e+06 3.8e+05 4.3e+06
grid11 625 2,400 4,000 625 1.5e+06 3.9e+05 4.5e+06
grid12 900 3,480 6,000 898 3.1e+06 8.1e+05 9.4e+06
grid13 900 3,480 12,000 900 3.1e+06 8.1e+05 9.4e+06
grid14 1,225 4,760 16,000 1,225 5.8e+06 1.5e+06 1.8e+07
grid15 1,225 4,760 32,000 1,225 5.8e+06 1.5e+06 1.8e+07
Planar
planar30 30 150 92 29 4.4e+03 1.0e+03 1.3e+04
planar80 80 440 543 80 3.5e+04 6.8e+03 1.1e+05
planar100 100 532 1,085 100 5.3e+04 1.1e+04 1.6e+05
planar150 150 850 2,239 150 1.3e+05 2.3e+04 3.8e+05
planar300 300 1,680 3,584 300 5.0e+05 9.2e+04 1.5e+06
planar500 500 2,842 3,525 500 1.4e+06 2.5e+05 4.3e+06
planar800 800 4,388 12,756 800 3.5e+06 6.4e+05 1.1e+07
planar1000 1,000 5,200 20,026 1,000 5.2e+06 1.0e+06 1.6e+07
planar2500 2,500 12,990 81,430 2,500 3.2e+07 6.3e+06 9.8e+07
Transportation
Austin 7,388 18,956 1,080,603 1,117 2.1e+07 8.3e+06 6.5e+07
Barcelona 1,020 2,522 7,922 108 2.7e+05 1.1e+05 8.3e+05
BerlinCenter 12,981 28,370 49,688 862 2.4e+07 1.1e+07 7.3e+07
Birmingham 14,639 33,937 470,805 898 3.0e+07 1.3e+07 9.2e+07
ChicagoRegional 12,982 39,018 2,296,227 1,768 6.9e+07 2.3e+07 2.1e+08
ChicagoSketch 933 2,950 93,135 386 1.1e+06 3.6e+05 3.5e+06
Philadelphia 13,389 40,003 1,149,795 1,489 6.0e+07 2.0e+07 1.8e+08
Sydney 33,113 75,379 3,340,619 3,257 2.5e+08 1.1e+08 7.4e+08
Winnipeg 1,052 2,836 4,344 138 3.9e+05 1.5e+05 1.2e+06
Intermodal
BUS-2632 119,857 397,233 2,628 2,628 1.0e+09 3.2e+08 3.1e+09
BUS-7896 130,367 710,464 7,883 7,883 5.6e+09 1.0e+09 1.7e+10
BUS-13160 140,871 1,022,654 13,135 13,135 1.3e+10 1.9e+09 4.0e+10
BUS-18424 151,377 1,336,774 18,388 18,388 2.5e+10 2.8e+09 7.4e+10
BUS-23688 161,887 1,651,215 23,643 23,643 3.9e+10 3.8e+09 1.2e+11
SBT-6255 163,479 809,333 6,251 6,251 5.1e+09 1.0e+09 1.5e+10
SBT-18765 188,467 1,782,451 18,745 18,745 3.3e+10 3.5e+09 1.0e+11
SBT-31275 213,481 2,760,176 31,252 31,252 8.6e+10 6.7e+09 2.6e+11
SBT-43785 238,491 3,741,011 43,757 43,757 1.6e+11 1.0e+10 4.9e+11
SBT-56295 263,515 4,724,888 56,269 56,269 2.7e+11 1.5e+10 8.0e+11
Table 4: Instance details. Number of variables, constraints, and non-zeroes are based on the source-based formulation.

A.2 Runtime details

Table 5 shows the runtime details for the instances.

Instance Tree Path MOSEK Objective
Time(s) Mem(GB) Time(s) Mem(GB) Time(s) Mem(GB)
Grid
grid1 0.0 0.0 0.0 0.0 - - 8.2732e+05
grid2 0.0 0.1 0.0 0.0 - - 1.7054e+06
grid3 0.0 0.0 0.0 0.0 - - 1.5246e+06
grid4 0.0 0.1 0.0 0.0 - - 3.0317e+06
grid5 0.0 0.1 0.1 0.1 - - 5.0497e+06
grid6 0.3 0.1 0.2 0.1 - - 1.0402e+07
grid7 0.4 0.1 0.3 0.1 - - 2.5864e+07
grid8 1.6 0.2 1.3 0.2 - - 4.1711e+07
grid9 3.4 0.2 2.8 0.2 - - 8.2653e+07
grid10 4.3 0.3 4.1 0.3 - - 1.6411e+08
grid11 3.9 0.2 4.4 0.3 - - 3.2926e+08
grid12 4.4 0.2 4.2 0.3 - - 5.7719e+08
grid13 10.2 0.3 12.4 0.4 - - 1.1593e+09
grid14 5.5 0.3 8.2 0.4 - - 1.8027e+09
grid15 10.6 0.4 18.8 0.6 - - 3.5935e+09
Planar
planar30 0.0 0.0 0.0 0.0 - - 4.4351e+07
planar80 0.4 0.1 0.3 0.1 - - 1.8244e+08
planar100 0.3 0.1 0.5 0.1 - - 2.3134e+08
planar150 4.4 0.2 4.1 0.3 - - 5.4809e+08
planar300 1.6 0.2 1.9 0.2 - - 6.8998e+08
planar500 0.7 0.2 1.0 0.2 - - 4.8198e+08
planar800 5.5 0.3 8.6 0.4 - - 1.1674e+09
planar1000 75.3 0.6 53.5 0.9 - - 3.4496e+09
planar2500 5894.7 10.0 866.9 19.3 - - 1.2662e+10
Transportation
Austin 670.6 14.5 4967.0 48.1 - - 5.4329e+06
Barcelona 0.1 0.1 2.5 0.2 - - 8.8588e+01
BerlinCenter 10.3 0.5 19.0 0.7 - - 2.7529e+07
Birmingham 196.2 3.3 399.3 16.7 - - 2.1565e+05
ChicagoRegional 68.2 6.1 981.8 33.4 - - 5.5131e+06
ChicagoSketch 0.5 0.2 3.1 0.6 - - 6.7059e+06
Philadelphia 380.6 7.4 1471.2 17.3 - - 2.5578e+07
Sydney 66.4 11.1 2520.9 59.7 - - 4.9653e+06
Winnipeg 0.0 0.1 2.9 0.2 - - 2.3767e+02
Intermodal
BUS-2632 1.5 0.7 1.6 0.7 - - 7.1027e+04
BUS-7896 4.7 2.3 4.3 2.3 - - 2.1060e+05
BUS-13160 8.4 4.6 8.2 4.6 - - 3.4836e+05
BUS-18424 12.9 7.8 13.7 7.8 - - 4.8503e+05
BUS-23688 20.9 11.7 20.5 11.7 - - 6.2488e+05
SBT-6255 3.7 2.2 3.6 2.2 - - 1.6061e+05
SBT-18765 15.4 10.4 15.5 10.4 - - 4.8111e+05
SBT-31275 38.0 24.5 37.3 24.5 - - 8.0133e+05
SBT-43785 68.5 44.5 68.5 44.5 - - 1.1236e+06
SBT-56295 114.3 70.6 112.6 70.6 - - 1.4479e+06
Table 5: Detailed runtime comparison