1
CATEGORICAL CALCULUS AND
ALGEBRA FOR MULTI-MODEL
DATA
Jiaheng Lu
Department of Computer Science
University of Helsinki, Finland
• Multi-model Databases
• Categorical Algebra and Calculus
• Algebraic Transformation Rules
• Conclusion
• Note: this talk relies mostly on database knowledge and uses only basic
notions from category theory, such as limits and thin categories.
2
OUTLINE
• Data come from different sources and have different formats
3
BIG DATA
Camera
Smart phone Social media
Acknowledgement: Icons created by Freepik - Flaticon
Variety challenge of big data
Customer
Social
Media
Gaming
Entertainment
Banking
Finance
Personal
Information
Purchase Customers
have different
types of data.
4
Customer-360-Degree-View
An example of different data and query
in databases
Social network:
  Nodes: Mary (1), John (2), William (3)
  Edges: Friend, Friend

Table:
  Customer_ID  Name     Credit_limits
  1            Mary     5,000
  2            John     3,000
  3            William  2,000

Persons who made the orders:
  "1" --> "34e5e759"
  "2" --> "0c6df508"

Order information:
  {"Order_no": "0c6df508",
   "Orderlines": [
     {"Product_no": "2724f",
      "Product_Name": "Toy",
      "Price": 66},
     {"Product_no": "3424g",
      "Product_Name": "Book",
      "Price": 40}]}
One application with different models of data
How can these heterogeneous data be integrated to provide a
unified service?
●Relational data: customer databases
●Graph data: social networks
●Hierarchical data: catalog, product
●Key-value data: orders by customers
7
Multi-Model Database Systems
Tabular
RDF
XML
Spatial
Text
Multi-model database
systems
JSON
• One unified database system for multi-model data
8
What is a DBMS?
• A Database Management System (DBMS) is software designed to
efficiently manage data, with traditional systems storing data in the form of
tables (RDBMS).
9
Students Relational Table
  Student ID  First name  Last name  Department
  001         John        Smith      Biology
  002         Emily       Johnson    Physics
  003         Michael     Brown      History
  004         Sarah       Davis      English
What is a multi-model database management
system?
● A multi-model database management system (MMDBMS) is
designed to support multiple data models against a single,
integrated backend.
● Document, graph, relational and key-value models are examples
of data models that may be supported by a multi-model database.
10
Four models of data
Graph:
  Nodes: Mary (1), John (2), William (3)
  Edges: Friend, Friend

Relation:
  Customer_ID  Name     Credit_limits
  1            Mary     5,000
  2            John     3,000
  3            William  2,000

Key-value:
  "1" --> "34e5e759"
  "2" --> "0c6df508"

Document:
  {"Order_no": "0c6df508",
   "Orderlines": [
     {"Product_no": "2724f",
      "Product_Name": "Toy",
      "Price": 66},
     {"Product_no": "3424g",
      "Product_Name": "Book",
      "Price": 40}]}
Advantages of MMDBMS over the traditional
relational database
• Handling diverse data types
• Handles graph, relational, document, key-value and other data models
in one system
• Enhanced query capabilities
• Supports content-based search over multi-model data and spatial queries
over geospatial data
• Flexible schema
• Offers greater flexibility in schema design and evolution, whereas a
relational DBMS relies on fixed schema definitions
12
Multi-model database products
…
• Supporting graph, document, key/value and object models.
Query: Return all products ordered by a friend of a customer
whose credit_limit > 4000
Answer: Mary is the only customer with credit_limit > 4000 and John is a friend of Mary, so the query returns the products in John's order "0c6df508" (2724f and 3424g)
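The query above crosses all four models, which can be illustrated with a minimal Python sketch. It is not how a multi-model engine evaluates the query: the dictionaries, the helper names friends_of and recommended_products, and the choice to treat Friend edges as undirected are assumptions made for illustration. Only the Mary-John edge is stated in the text, so the second Friend edge is left out.

```python
# Toy instance of the four models from the slides, as plain Python structures.
customers = {  # relational table: Customer_ID -> row
    1: {"Name": "Mary", "Credit_limits": 5000},
    2: {"Name": "John", "Credit_limits": 3000},
    3: {"Name": "William", "Credit_limits": 2000},
}
# Graph: only the Mary-John Friend edge is stated in the text; the second
# Friend edge is omitted because its endpoints are not given.
friend_edges = {(1, 2)}
orders_by_customer = {"1": "34e5e759", "2": "0c6df508"}  # key-value store
order_documents = {  # document store; only one order document is shown
    "0c6df508": {
        "Order_no": "0c6df508",
        "Orderlines": [
            {"Product_no": "2724f", "Product_Name": "Toy", "Price": 66},
            {"Product_no": "3424g", "Product_Name": "Book", "Price": 40},
        ],
    },
}


def friends_of(customer_id):
    """Neighbours in the Friend graph (edges treated as undirected: an assumption)."""
    outgoing = {b for a, b in friend_edges if a == customer_id}
    incoming = {a for a, b in friend_edges if b == customer_id}
    return outgoing | incoming


def recommended_products(min_credit):
    """Products ordered by a friend of a customer whose credit limit exceeds min_credit."""
    products = set()
    for customer_id, row in customers.items():
        if row["Credit_limits"] <= min_credit:
            continue  # relational selection
        for friend in friends_of(customer_id):  # graph traversal
            order_no = orders_by_customer.get(str(friend))  # key-value lookup
            document = order_documents.get(order_no)  # document lookup
            if document is None:
                continue  # no order, or order document not shown in the slides
            products.update(line["Product_no"] for line in document["Orderlines"])
    return products


print(recommended_products(4000))
```

Each step touches a different model (table, graph, key-value, document), which is the integration problem the talk addresses.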
Query language of OrientDB:
SELECT
expand(out("Knows").Orders.orderlines.Product_no)
FROM Customers
WHERE CreditLimit > 4000
Recommendation query:
Return all products which are ordered by any
friend of a customer whose credit_limit>4000
Challenge: a new theoretical foundation
Research goal:
Call for a unified model and theory for multi-model data!
The theory of traditional relations is not adequate to
mathematically describe modern database systems.
One possible theoretical foundation:
Category Theory
● Introduced to mathematics by Samuel Eilenberg
and Saunders Mac Lane in 1945
● Developed as a unified language for topology and algebra
Samuel Eilenberg Saunders Mac Lane
Set Category
● Databases are set categories:
○ Objects are sets and morphisms are functions
● We assume that it is a thin category (or posetal category)
○ A category C is thin if and only if, for every pair of objects X and Y in C,
any two morphisms f, g: X → Y are equal, i.e. there is at most one
morphism from X to Y.
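For a finite set category the thinness condition can be checked mechanically. The sketch below is an illustration under assumptions, not part of the talk: the representation (objects as Python sets, morphisms as dicts), the helper names freeze, hom_sets and is_thin, and the sample data are all hypothetical. It closes the generating functions under composition and verifies that every hom-set contains at most one function.

```python
def freeze(mapping):
    """Hashable form of a finite function given as a dict."""
    return tuple(sorted(mapping.items()))


def hom_sets(objects, generators):
    """Map (source, target) to the distinct functions generated by composition."""
    homs = {}
    for name, elements in objects.items():
        # Every object has an identity morphism.
        homs.setdefault((name, name), set()).add(freeze({e: e for e in elements}))
    worklist = []
    for src, dst, mapping in generators:
        f = freeze(mapping)
        if f not in homs.setdefault((src, dst), set()):
            homs[(src, dst)].add(f)
            worklist.append((src, dst, f))
    while worklist:
        src, mid, f = worklist.pop()
        for g_src, g_dst, g in generators:
            if g_src != mid:
                continue
            composed = freeze({x: g[y] for x, y in f})  # g after f
            if composed not in homs.setdefault((src, g_dst), set()):
                homs[(src, g_dst)].add(composed)
                worklist.append((src, g_dst, composed))
    return homs


def is_thin(objects, generators):
    """True if there is at most one morphism between any ordered pair of objects."""
    return all(len(fs) <= 1 for fs in hom_sets(objects, generators).values())


# Hypothetical sample data.
objects = {
    "Registration": {"r1", "r2"},
    "Student": {"s1", "s2"},
    "Department": {"d1", "d2"},
}
generators = [
    ("Registration", "Student", {"r1": "s1", "r2": "s2"}),
    ("Student", "Department", {"s1": "d1", "s2": "d1"}),
    ("Registration", "Department", {"r1": "d1", "r2": "d1"}),  # equals the composite
]
conflicting = ("Registration", "Department", {"r1": "d2", "r2": "d1"})

print(is_thin(objects, generators))
print(is_thin(objects, generators + [conflicting]))
```

The closure terminates because there are only finitely many functions between finite sets. The third generator is allowed because it coincides with the composite path, while the conflicting one introduces a second, different morphism between the same pair of objects.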
19
An example of Categorical Unification
20
Unified category for three types of data
Relational algebra and relational calculus
● In the field of relational databases, relational algebra and
relational calculus were developed as two formal languages for
querying databases.
● Similarly, categorical algebra and categorical calculus are
developed to query category databases.
21
Relational algebra
• Operators:
• Selection: σ (sigma)
• Projection: π
• Union: ∪
• Intersection: ∩
• Difference: −
• Cartesian product: ×
• Derived operators:
• Joins (equi-join) ⋈
• Examples of query trees:
Categorical algebra
• Set operators:
• Unary operator:
• Map: f
• Selection: σ
• Projection: π
• Binary operator:
• Division: ÷
• getParent(D1,D2) (tree data)
• getAncestor(D1,D2) (tree data)
• Ternary operator:
• getReach(S,T,E) (graph data)
• getNHop(S,T,E) (graph data)
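The slides do not spell out the semantics of getReach(S,T,E). A minimal sketch, assuming that S is a set of source nodes, T a set of target nodes and E a set of directed edges, and that the operator returns every pair (s, t) such that t is reachable from s along one or more edges. The function name get_reach and the sample graph are hypothetical.

```python
from collections import deque


def get_reach(sources, targets, edges):
    """Pairs (s, t), s in sources and t in targets, with a directed path from s to t."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
    result = set()
    for s in sources:
        seen = set()
        queue = deque(adjacency.get(s, ()))
        while queue:  # breadth-first search from s
            node = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            queue.extend(adjacency.get(node, ()))
        result.update((s, t) for t in seen if t in targets)
    return result


edges = {(1, 2), (2, 3)}
print(get_reach({1}, {3}, edges))
```

Under the same assumptions, getNHop would bound the path length instead of searching to a fixpoint.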
Categorical algebra
• Category operators:
• Sets and Functions to Category:
• Cat(S1,...,Sn, f1 : Si1 → Sj1 ,..., fm : Sim → Sjm )
• This operator, called categorification, constructs a category from a
given collection of sets (objects) and functions (morphisms).
• Category to Set
• Limit, which converts a category into a relational object (set).
• Lim(Cat(S1,...,Sn, f1 : Si1 → Sj1 ,..., fm : Sim → Sjm ))
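For finite sets, the limit of a diagram of sets and functions is the set of tuples, one element per object, that every function maps consistently. The sketch below computes it by brute force. It assumes that this standard construction is what Lim denotes here, and the function name lim and the sample data are hypothetical.

```python
from itertools import product


def lim(objects, morphisms):
    """Tuples (one component per object) on which every morphism agrees."""
    names = list(objects)
    position = {name: i for i, name in enumerate(names)}
    return {
        combo
        for combo in product(*(sorted(objects[n]) for n in names))
        if all(f[combo[position[src]]] == combo[position[dst]]
               for src, dst, f in morphisms)
    }


# Hypothetical sample data: Registration -> Student and Registration -> Course.
objects = {
    "Registration": {"r1", "r2", "r3"},
    "Student": {"001", "002"},
    "Course": {"c1", "c2"},
}
student_of = {"r1": "001", "r2": "001", "r3": "002"}
course_of = {"r1": "c1", "r2": "c2", "r3": "c1"}
morphisms = [
    ("Registration", "Student", student_of),
    ("Registration", "Course", course_of),
]

rows = lim(objects, morphisms)
print(sorted(rows))
```

Each resulting tuple reads like a row of a relation with columns (Registration, Student, Course), which is the sense in which the limit turns a category into a relational object.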
Example of categorical algebra:
Selection
Query: Find all courses taken by "Smith"
S1 := σ_{Student · Last_name = "Smith"}(Registration)
S2 := S1 · Course · Title
Return S2
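The two steps of this plan can be traced on a small instance. In the sketch below, functions are stored as dicts and the composition operator "·" becomes repeated lookup. The helper names select and map_along and all sample rows are hypothetical.

```python
# Hypothetical sample instance.
registration = {"r1", "r2", "r3"}
student = {"r1": "001", "r2": "001", "r3": "002"}   # Registration -> Student
course = {"r1": "c1", "r2": "c2", "r3": "c1"}       # Registration -> Course
last_name = {"001": "Smith", "002": "Johnson"}      # Student -> Last_name
title = {"c1": "Databases", "c2": "Algebra"}        # Course -> Title


def select(elements, predicate):
    """Selection: keep the elements that satisfy the predicate."""
    return {x for x in elements if predicate(x)}


def map_along(elements, *functions):
    """Map: push a set through a chain of functions, left to right."""
    result = set(elements)
    for f in functions:
        result = {f[x] for x in result}
    return result


s1 = select(registration, lambda r: last_name[student[r]] == "Smith")
s2 = map_along(s1, course, title)
print(sorted(s2))
```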
Example of categorical algebra: Division
Query: Find the titles of courses taken by all students
S1:= Registration[Student]÷Student
S2:= S1 · Course · Title
Return S2
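The slides do not define the division operator, so the following is only a sketch of the stated query in the style of relational division. It assumes that S1 is the set of registrations whose course has a registration from every student, which lets S1 · Course · Title be applied afterwards. The function name divide and the sample rows are hypothetical.

```python
# Hypothetical sample instance.
registration = {"r1", "r2", "r3"}
students = {"001", "002"}
student = {"r1": "001", "r2": "001", "r3": "002"}   # Registration -> Student
course = {"r1": "c1", "r2": "c2", "r3": "c1"}       # Registration -> Course
title = {"c1": "Databases", "c2": "Algebra"}        # Course -> Title


def divide(registrations, student_of, course_of, all_students):
    """Registrations whose course is taken by every student (assumed semantics)."""
    takers = {}
    for r in registrations:
        takers.setdefault(course_of[r], set()).add(student_of[r])
    complete = {c for c, who in takers.items() if who >= all_students}
    return {r for r in registrations if course_of[r] in complete}


s1 = divide(registration, student, course, students)
s2 = {title[course[r]] for r in s1}
print(sorted(s2))
```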
28
Two examples of query plan with
categorical algebra
Categorical calculus
• Categorical calculus is a declarative language for describing results in the
category, whereas categorical algebra is a procedural language for listing
operations in the category.
• The formulae of the Categorical Calculus
• Reachable predicate: holds when node x2 is reachable from node x1 (graph data).
• Ancestor predicate: determines the relationship between two nodes in trees.
• Unsafe variables: variables that have a possibly infinite number of values
or are unbounded.
Categorical calculus and categorical
algebra are equivalent (I)
Query: Find all courses taken by "Smith"
S1 := σ_{Student · Last_name = "Smith"}(Registration)
S2 := S1 · Course · Title
Equivalent calculus:
{ x | x ∈ Title, ∃y ∈ Registration, y · Student · Last_Name = "Smith" ∧ y · Course · Title = x }
Categorical calculus and categorical
algebra are equivalent (II)
Query: Find the titles of courses taken by all students
S1 := Registration[Student] ÷ Student
S2 := S1 · Course · Title
Equivalent calculus:
{ x | x ∈ Title, ∀y ∈ Student, ∃r ∈ Registration, r · Student = y ∧ r · Course · Title = x }
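Both calculus expressions translate almost literally into set comprehensions, with ∃ read as any and ∀ read as all. The sketch below evaluates them on a small instance. The sample rows are hypothetical and functions are stored as dicts, so "·" becomes nested lookup.

```python
# Hypothetical sample instance.
registration = {"r1", "r2", "r3"}
students = {"001", "002"}
student = {"r1": "001", "r2": "001", "r3": "002"}   # Registration -> Student
course = {"r1": "c1", "r2": "c2", "r3": "c1"}       # Registration -> Course
last_name = {"001": "Smith", "002": "Johnson"}      # Student -> Last_name
title = {"c1": "Databases", "c2": "Algebra"}        # Course -> Title
titles = set(title.values())

# { x | x in Title, exists y in Registration:
#       y.Student.Last_Name = "Smith" and y.Course.Title = x }
smith_titles = {
    x for x in titles
    if any(last_name[student[y]] == "Smith" and title[course[y]] == x
           for y in registration)
}

# { x | x in Title, for all y in Student, exists r in Registration:
#       r.Student = y and r.Course.Title = x }
titles_taken_by_all = {
    x for x in titles
    if all(any(student[r] == y and title[course[r]] == x for r in registration)
           for y in students)
}

print(sorted(smith_titles), sorted(titles_taken_by_all))
```

The comprehensions only describe which titles qualify; an algebra plan for the same query fixes the order of the operations.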
• Multi-model Databases
• Categorical Algebra and Calculus
• Algebraic Transformation Rules (Query optimization)
• Conclusion
35
OUTLINE
Algebraic transformation rules (I)
• Rewrite the algebraic operators for query optimization.
• Limit and Projection:
• Pushing σ to one or multiple objects in lim
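The rule formulas themselves are not reproduced in this text, so the following only sketches the general idea of pushing σ into Lim. It makes two assumptions: the selection predicate concerns a single object that appears only as the source of morphisms, and Lim is computed as the set of consistent tuples. The names lim and keep and the sample data are hypothetical.

```python
from itertools import product


def lim(objects, morphisms):
    """Tuples (one component per object) on which every morphism agrees."""
    names = list(objects)
    position = {name: i for i, name in enumerate(names)}
    return {
        combo
        for combo in product(*(sorted(objects[n]) for n in names))
        if all(f[combo[position[src]]] == combo[position[dst]]
               for src, dst, f in morphisms)
    }


# Hypothetical sample data.
objects = {
    "Registration": {"r1", "r2", "r3"},
    "Student": {"001", "002"},
    "Course": {"c1", "c2"},
}
student_of = {"r1": "001", "r2": "001", "r3": "002"}
course_of = {"r1": "c1", "r2": "c2", "r3": "c1"}
morphisms = [
    ("Registration", "Student", student_of),
    ("Registration", "Course", course_of),
]


def keep(r):
    """Selection predicate on Registration (hypothetical)."""
    return student_of[r] == "001"


# Plan A: compute the limit first, then select on the Registration component.
late = {row for row in lim(objects, morphisms) if keep(row[0])}

# Plan B: select on Registration first, restrict the functions, then take the limit.
selected = {r for r in objects["Registration"] if keep(r)}
pushed_objects = dict(objects, Registration=selected)
pushed_morphisms = [
    (src, dst, {x: y for x, y in f.items() if x in selected})
    for src, dst, f in morphisms
]
early = lim(pushed_objects, pushed_morphisms)

print(late == early)
```

Plan B enumerates fewer candidate tuples because Registration shrinks before the product is formed, which is the motivation for the pushdown.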
Algebraic transformation rules (II)
• Pushing σ to one or multiple objects in getReach:
• Commuting function mapping with the product operator.
• Commuting 𝜋 with the Lim operation.
Algebraic transformation rules (III)
• Commuting g with the Lim operation.
• If the following diagram holds,
then the two operators g and Lim can be commuted as follows:
40
An optimized query plan obtained through
algebraic operator transformation
• Theorem 13. Categorical calculus and categorical algebra can
express all of the following:
• Relational calculus and algebra queries;
• Graph pattern matching and graph reachability queries;
• XML twig pattern queries.
41
EXPRESSIVE POWER
42
TIME AND SPACE COMPLEXITY
• Previous work applies category theory to relational databases, whereas
this work focuses on multi-model databases.
• Libkin and Wong (1997) show the connection between database
operations and the categorical notion of a monad.
• Schultz et al. (2016) introduce a categorical query language
that serves as a data integration scripting language.
• …..
• There are existing algebra and calculus for relational data, graph
data, and object-oriented data, but not multi-model data.
43
RELATED WORK
• Model multi-model databases as a thin set category
• Define categorical algebra and calculus, and their equivalence
• Develop the algebraic transformation rules for query optimization
44
MAIN CONTRIBUTION AND CONCLUSION
Applied category theory (ACT) can contribute
to practical query processing and optimization
of multi-model databases.
• Jeremy Gibbons, Fritz Henglein, Ralf Hinze & Nicolas Wu (2018): Relational algebra
by way of adjunctions. Proc. ACM Program. Lang. 2(ICFP), pp. 86:1–86:28.
• Leonid Libkin & Limsoon Wong (1997): Query Languages for Bags and Aggregate
Functions. J. Comput. Syst. Sci. 55(2), pp. 241–272.
• Patrick Schultz, David I. Spivak, Christina Vasilakopoulou & Ryan Wisnesky (2016):
Algebraic Databases. arXiv:1602.03501.
• Allen Van Gelder & Rodney W. Topor (1991): Safety and translation of relational
calculus. ACM Trans. Database Syst. 16(2), pp. 235–278, doi:10.1145/114325.103712.
Available at https://doi.org/10.1145/114325.103712.
• Jiaheng Lu, Irena Holubová: Multi-model Databases: A New Journey to Handle the
Variety of Data. ACM Comput. Surv. 52(3): 55:1-55:38 (2019)
• Jiaheng Lu: A Categorical Unification for Multi-Model Data: Part I Categorical Model
and Normal Forms. CoRR abs/2502.19131 (2025)
45
REFERENCES
