DocEng2013, September 10–13, 2013, Florence, Italy

Splitting Wide Tables Optimally
Mihai Bilauca

Patrick Healy

Department of Computer Science and Information Systems
University of Limerick, Ireland

Supported by Science Foundation Ireland under the research programme 01/P1.2/C009,
Mathematical Foundations, Practical Notations, and Tools for Reliable Flexible Software.
Why this paper?
• Tables are widely used for presenting logical relationships between data items;
• Widely used WYSIWYG tools have poor support for wide tables;
• Authoring tables is hard, time-consuming and error-prone;
• Style manual recommendations are not always supported;
• Very little research exists in this area.
A wide table split across multiple pages

Grouping of data items increases readability
Style recommendations from Chicago Manual of Style
“For a two-page broadside table – which should be presented
on facing pages if at all possible – column heads need not be
repeated; for broadside tables that run beyond two pages,
column heads are repeated only on each new verso.
Where column heads are repeated, the table number and
“continued” should also appear.
For any table that is likely to run to more than one page, the
editor should specify whether continued lines and repeated
column heads will be needed and where footnotes should
appear (usually at the end of the table as a whole).”

Overview
We present MIP solutions using OPL for three problems that occur
when splitting wide tables, with the aim of minimizing the effect
on the meaning of the data:
1. Minimize Page Count
2. Minimize Page Count and Column Positioning Changes
3. Minimize Page Count and Group Splitting

We then report experimental results with IBM CPLEX 12.3 and draw conclusions.

MIP – Mixed Integer Programming
OPL – Optimization Programming Language
1. Minimum Page Count – OPL Model
    dvar int+ pageSel[Pages] in 0..1;    // 1 if page p is used
    dvar int+ X[Pages][Cols] in 0..1;    // 1 if column j is placed on page p
    dexpr int pageCount = sum(p in Pages) pageSel[p];
    minimize pageCount;
    subject to
    {
      ct1: // select only one page for each column
        forall(j in Cols)
          sum(p in Pages) X[p][j] == 1;
      ct2: // only columns that fit in the page
        forall(p in Pages)
          sum(j in Cols)
            colW[j] / pageW * X[p][j] <= pageSel[p];
    }
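The model above is a bin-packing formulation: columns are items, pages are bins of width pageW. As a rough illustration only (not the authors' OPL/CPLEX approach), the following Python sketch compares splitting the columns in their original order with an exhaustive search for the minimum page count. The instance is the 10-column example used later in the slides; the function names are hypothetical, and the search assumes every column is no wider than the page.

```python
PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]

def split_in_order(col_w, page_w):
    """Keep the original column order; start a new page when a column does not fit."""
    pages = []
    for w in col_w:
        if pages and sum(pages[-1]) + w <= page_w:
            pages[-1].append(w)
        else:
            pages.append([w])
    return pages

def min_page_count(col_w, page_w):
    """Exact minimum page count by depth-first search (small tables only).

    Assumes each column width is <= page_w.
    """
    widths = sorted(col_w, reverse=True)
    best = len(widths)          # one column per page is always feasible
    loads = []                  # current total width on each open page

    def place(i):
        nonlocal best
        if len(loads) >= best:  # cannot improve on the best solution so far
            return
        if i == len(widths):
            best = len(loads)
            return
        tried = set()           # skip pages with identical load (symmetric)
        for k in range(len(loads)):
            if loads[k] + widths[i] <= page_w and loads[k] not in tried:
                tried.add(loads[k])
                loads[k] += widths[i]
                place(i + 1)
                loads[k] -= widths[i]
        loads.append(widths[i])  # or open a new page for this column
        place(i + 1)
        loads.pop()

    place(0)
    return best

print(len(split_in_order(COL_W, PAGE_W)))  # 7 pages in the original order
print(min_page_count(COL_W, PAGE_W))       # 5 pages when columns may be reordered
```

The exhaustive search is only practical for small tables; it is meant to make the objective concrete, not to replace a MIP solver.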
1. Minimum Page Count – Results
• Page count can be reduced by 14% to 25%
• The difficulty of the problem is not directly linked to the problem size but to the data itself

Columns   10      20      30      40      50      60
PC        7       16      19      29      34      48
OPC       6       12      15      23      26      39
%Imp      14.28%  25.00%  21.05%  20.68%  23.52%  18.75%
Time      2.25    0.13    0.17    1.18    04.30   1.52
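
The %Imp row appears to be the relative page-count reduction, (PC - OPC) / PC, truncated to two decimals. Reading PC as the page count before optimization and OPC as the optimized page count is an assumption, since the slide does not expand the abbreviations. A small Python check of that reading (the function name is hypothetical):

```python
PC = [7, 16, 19, 29, 34, 48]
OPC = [6, 12, 15, 23, 26, 39]

def improvement_percent(pc, opc):
    """Relative reduction (pc - opc) / pc as a percentage, truncated to 2 decimals."""
    hundredths = (pc - opc) * 10000 // pc   # integer arithmetic avoids rounding surprises
    return f"{hundredths // 100}.{hundredths % 100:02d}%"

print([improvement_percent(a, b) for a, b in zip(PC, OPC)])
# ['14.28%', '25.00%', '21.05%', '20.68%', '23.52%', '18.75%']
```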

2. Minimum Page Count & Column Positioning Changes
PageW: 490 points
colW : [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]
7 pages: {210,140} {210} {420} {280} {350,70} {140,140} {350}
Minimum 5 pages:
colIdx: [1, 7, 8, 5, 2, 9, 6, 10, 3, 4]
Pages : {210,280} {140,350} {420,70} {140,210} {350,140}
Minimum 5 pages and minimum column position changes (posDiff):
colIdx: [1, 2, 3, 5, 4, 7, 6, 8, 9, 10]
Pages : {210,140} {210,280} {420,70} {350,140} {140,350}
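
To make the example concrete, here is a hedged Python sketch that rebuilds the pages from each colIdx vector and counts how many column pairs change their relative order. It assumes colIdx[j] is the new 1-based position of original column j (which reproduces the pages listed on this slide) and that posDiff counts pairs whose order is reversed; the function names are hypothetical.

```python
PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]

def widths_in_new_order(col_idx, col_w):
    """col_idx[j] is the new 1-based position of original column j + 1."""
    ordered = [0] * len(col_w)
    for j, pos in enumerate(col_idx):
        ordered[pos - 1] = col_w[j]
    return ordered

def paginate(widths, page_w):
    """Fill pages left to right, opening a new page when a column does not fit."""
    pages = []
    for w in widths:
        if pages and sum(pages[-1]) + w <= page_w:
            pages[-1].append(w)
        else:
            pages.append([w])
    return pages

def pos_diff(col_idx):
    """Number of column pairs (a before b originally) that end up in reverse order."""
    n = len(col_idx)
    return sum(1 for a in range(n) for b in range(a + 1, n) if col_idx[a] > col_idx[b])

min_pages_only = [1, 7, 8, 5, 2, 9, 6, 10, 3, 4]
min_pages_and_moves = [1, 2, 3, 5, 4, 7, 6, 8, 9, 10]

for col_idx in (min_pages_only, min_pages_and_moves):
    pages = paginate(widths_in_new_order(col_idx, COL_W), PAGE_W)
    print(len(pages), pos_diff(col_idx), pages)
```

Under these assumptions both arrangements use 5 pages, but the first reverses 20 column pairs while the second reverses only 2 (columns 4 and 5, and columns 6 and 7, swap places).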

2. Minimum Page Count & Column Positioning Changes
    dvar int+ pageSel[Pages] in 0..1;
    // index-valued variables: the slide shows "in 0..1" for both, which cannot
    // hold a page or column index; the domains below assume that Pages and
    // Cols are integer ranges
    dvar int+ pageIdx[Cols] in Pages;
    dvar int+ colIdx[Cols] in Cols;
    // check if j1 is placed before j2 (posO: original order, posN: new order)
    dexpr int posO[j1,j2 in Cols] = j1 <= j2-1;
    dexpr int posN[j1,j2 in Cols] = (colIdx[j1] <= colIdx[j2]-1);
    dexpr float posDiff = sum(j1,j2 in Cols : j2 < j1)
                            abs(posO[j1,j2] - posN[j1,j2]);
    dexpr int pageCount = sum(p in Pages) pageSel[p];
    // a, b, obj1Val variables are used for OPL flow control
    minimize a * pageCount + b * posDiff;

2. Minimum Page Count & Column Positioning Changes
    subject to {
      ct1: // do not exceed page width
        forall(p in Pages)
          sum(j in Cols)
            colW[j] * (p == pageIdx[j]) / pageW <= pageSel[p];
      ct2: // page and column indexes relationship
           // (the operator between the two terms is missing on the slide)
        forall(ordered j1,j2 in Cols)
          (pageIdx[j1]<=pageIdx[j2]-1) (colIdx[j1]<=colIdx[j2]-1) == 0;
      ct3: // unique column index values
        forall(ordered j1,j2 in Cols)
          colIdx[j1] != colIdx[j2];
      // if the minimum page count obj1Val is set
      // maintain this value for subsequent searches
      ct4:
        if (obj1Val >= 0) pageCount == obj1Val;
    }
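
The comments about a, b and obj1Val suggest a two-phase solve: first minimize the page count, then fix that value and minimize posDiff. The Python sketch below mimics that flow by brute force over all column orders. It is not the authors' OPL script: the instance is shortened to the first seven column widths of the example (an assumption made to keep enumeration fast), and all function names are hypothetical.

```python
from itertools import permutations

PAGE_W = 490
SMALL_COL_W = [210, 140, 210, 420, 280, 350, 70]  # first 7 columns of the example

def page_count(widths, page_w):
    """Pages needed when columns are laid out left to right in the given order."""
    count, used = 0, 0
    for w in widths:
        if count == 0 or used + w > page_w:
            count, used = count + 1, 0
        used += w
    return count

def pos_diff(order):
    """Column pairs whose relative order differs from the original 1..n order."""
    n = len(order)
    return sum(1 for a in range(n) for b in range(a + 1, n) if order[a] > order[b])

def two_phase(col_w, page_w):
    orders = list(permutations(range(1, len(col_w) + 1)))
    pages = {o: page_count([col_w[j - 1] for j in o], page_w) for o in orders}
    # Phase 1: minimum page count over all column orders.
    obj1_val = min(pages.values())
    # Phase 2: keep pageCount == obj1_val and minimize the position changes.
    best = min((o for o in orders if pages[o] == obj1_val), key=pos_diff)
    return obj1_val, pos_diff(best), list(best)

obj1_val, changes, order = two_phase(SMALL_COL_W, PAGE_W)
print(obj1_val, changes, order)
```

On this shortened instance the original order needs 5 pages, while a single swap of two adjacent columns already reaches the minimum of 4 pages.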
2. Minimum Page Count & Column Positioning Changes – Results
• Promising performance:
  – 2.25s for a 10-column table: posDiff reduced from 33 to 4, page count from 9 to 8;
  – 89s for a 20-column table: posDiff reduced from 194 to 4, page count from 13 to 11;
• Computation time increases with the number of columns
• A data instance may have no better solution

3. Minimum Page Count & Group Splitting
User specifies which columns should preferably be kept together
PageW: 490 points
colW : [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]
7 pages: {210,140} {210} {420} {280} {350,70} {140,140} {350}
Minimum 5 pages:
colIdx: [3, 5, 4, 7, 10, 6, 8, 1, 2, 9]
Pages : {210,280} {420} {70,350} {350,140} {210,140,140}
Group columns 2, 3 and 7:
colIdx: [2, 3, 7, 4, 9, 10, 6, 8, 1, 5]
Pages : {140,210,70} {420} {140,350} {350,140} {210,280}
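
The two arrangements can be checked with a short Python sketch. It assumes that on this slide colIdx lists the original column numbers in their new left-to-right order (this reading reproduces the page widths shown), and it takes the page boundaries directly from the slide. The names are hypothetical.

```python
PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]

# Pages as lists of original (1-based) column numbers, read off the slide.
PAGES_MIN = [[3, 5], [4], [7, 10], [6, 8], [1, 2, 9]]
PAGES_GROUPED = [[2, 3, 7], [4], [9, 10], [6, 8], [1, 5]]
GROUP = [2, 3, 7]

def page_widths(pages, col_w):
    """Replace each column number by its width."""
    return [[col_w[c - 1] for c in page] for page in pages]

def all_pages_fit(pages, col_w, page_w):
    """True if no page is wider than page_w."""
    return all(sum(widths) <= page_w for widths in page_widths(pages, col_w))

def pages_used_by(group, pages):
    """1-based page numbers on which the grouped columns appear."""
    return sorted({p for p, page in enumerate(pages, start=1) for c in page if c in group})

for pages in (PAGES_MIN, PAGES_GROUPED):
    print(page_widths(pages, COL_W), all_pages_fit(pages, COL_W, PAGE_W),
          pages_used_by(GROUP, pages))
```

Both arrangements fit on 5 pages; in the first the group is spread over pages 1, 3 and 5, in the second it sits entirely on page 1.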
3. Minimum Page Count & Group Splitting
    int colG[Cols] = ...; // column groups
    dvar int+ pageSel[Pages] in 0..1;
    // page index of each column: the slide shows "in 0..1"; the domain below
    // assumes that Pages is an integer range
    dvar int+ pageIdx[Cols] in Pages;
    // find the first column of the group
    int gFirstCol[g in groups] =
      first({j | j in Cols : colG[j] == g});
    // counts how many columns of a group are on a
    // different page than the group's first column
    dexpr int gSplit[g in groups] =
      sum(j in Cols : colG[j] == g)
        (pageIdx[j] != pageIdx[gFirstCol[g]]);
    // number of groups that are split across pages
    dexpr int gSplitCount = sum(g in groups)
                              (gSplit[g] >= 1);
    dexpr int pageCount = sum(p in Pages) pageSel[p];
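
As a rough Python counterpart to gSplit and gSplitCount, the sketch below evaluates both for the two arrangements of the 10-column example. The page indices are read off the example slide; using group id 0 for ungrouped columns is an assumption of this sketch, not something the slides state.

```python
def group_splits(col_g, page_idx):
    """Return (gSplit per group, gSplitCount).

    col_g[j] and page_idx[j] are the group id and page of column j + 1.
    Group id 0 marks an ungrouped column (an assumption of this sketch).
    """
    g_split = {}
    for g in sorted({g for g in col_g if g != 0}):
        members = [j for j, gj in enumerate(col_g) if gj == g]
        first = members[0]  # plays the role of gFirstCol[g]
        g_split[g] = sum(1 for j in members if page_idx[j] != page_idx[first])
    g_split_count = sum(1 for v in g_split.values() if v >= 1)
    return g_split, g_split_count

COL_G = [0, 1, 1, 0, 0, 0, 1, 0, 0, 0]             # columns 2, 3 and 7 form group 1
PAGE_IDX_MIN = [5, 5, 1, 2, 1, 4, 3, 4, 5, 3]      # minimum-page arrangement
PAGE_IDX_GROUPED = [5, 1, 1, 2, 5, 4, 1, 4, 3, 3]  # arrangement keeping the group together

print(group_splits(COL_G, PAGE_IDX_MIN))      # ({1: 2}, 1)
print(group_splits(COL_G, PAGE_IDX_GROUPED))  # ({1: 0}, 0)
```

Note that gSplit counts columns away from the group's first column, while gSplitCount only records whether a group is split at all, so a group spread over three pages still contributes 1 to the count.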
3. Minimum Page Count & Group Splitting
    // a, b, obj1Val variables are used for OPL flow control
    // (the slide shows posDiff here; this model defines gSplitCount instead)
    minimize a * pageCount + b * gSplitCount;
    subject to {
      ct1: // do not exceed page width
        forall(p in Pages)
          sum(j in Cols)
            colW[j] * (p == pageIdx[j]) / pageW <= pageSel[p];
      // if the minimum page count obj1Val is set
      // maintain this value for subsequent searches
      ct2:
        if (obj1Val >= 0) pageCount == obj1Val;
    }

3. Minimum Page Count & Group Splitting Model – Results
• Promising performance:
  – 1m for a 20-column table with 3 groups, none split, page count from 12 down to 9;
  – 2m for 30-40 column tables, but time increased up to 12m when the number of groups increased;
• Computation time increases with the number of columns and groups
• Some relaxed solutions may be preferred

Conclusions
• An optimal arrangement of columns that minimizes the page count when splitting wide tables can be found in relatively short running time; for tables with 60 columns a solution was found in less than 2s;
• If additional criteria are added, for example minimizing the number of relative column position changes, the problems become harder as the number of columns increases;
• The difficulty of the problems depends not only on the problem size but also on the complexity of the data.

Ongoing work
Minimizing the overall page count when a large table
containing text is displayed on fixed size pages and
neither column widths nor row heights are known in
advance.

Thank you!

www.tabularlayout.org
