http://kibeha.dk @kibeha
Uses of Row Pattern Matching
Kim Berg Hansen
About me
• Danish geek
• SQL & PL/SQL developer since 2000
• Developer at Trivadis since 2016 http://www.trivadis.com
• Oracle Certified Expert in SQL
• Oracle ACE Director
• SQL quizmaster http://devgym.oracle.com
• Blogger http://kibeha.dk
• Likes to cook and read sci-fi
• Member of Danish Beer Enthusiasts
@kibeha
3 Membership Tiers
• Oracle ACE Director
• Oracle ACE
• Oracle ACE Associate
bit.ly/OracleACEProgram
500+ Technical Experts
Helping Peers Globally
Connect:
Nominate yourself or someone you know: acenomination.oracle.com
@oracleace
Facebook.com/oracleaces
oracle-ace_ww@oracle.com
About Trivadis
• Founded 1994
• 16 locations: Switzerland, Germany, Austria, Denmark and Romania
• 700 specialists
• 260 Service Level Agreements
• Over 4,000 training participants
• Research and development budget: EUR 5.0 million
• More than 1,900 projects per year at over 800 customers
• Financially self-supporting and sustainably profitable
02-Oct-19 Uses of Row Pattern Matching4
TechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - Trivadis
Agenda for Pattern Matching
• Elements in the syntax
• Use cases:
• Stock ticker
• Grouping sequences
• Merge date ranges
• Tablespace growth
• Bin fitting with limited capacity
• Bin fitting in limited number of bins
• Hierarchical child count
• Brief summary
Elements in the syntax
• Example from Data Warehousing Guide chapter on SQL For Pattern Matching
SELECT *
FROM Ticker MATCH_RECOGNIZE (
PARTITION BY symbol
ORDER BY tstamp
MEASURES STRT.tstamp AS start_tstamp,
FINAL LAST(DOWN.tstamp) AS bottom_tstamp,
FINAL LAST(UP.tstamp) AS end_tstamp,
MATCH_NUMBER() AS match_num,
CLASSIFIER() AS var_match
ALL ROWS PER MATCH
AFTER MATCH SKIP TO LAST UP
PATTERN (STRT DOWN+ UP+)
DEFINE
DOWN AS DOWN.price < PREV(DOWN.price),
UP AS UP.price > PREV(UP.price)
) MR
ORDER BY MR.symbol, MR.match_num, MR.tstamp
What's it look like
Elements
• PARTITION BY – as with analytic functions, splits the data so matching works on one partition at a time
• ORDER BY – the order in which rows are tested against the pattern
• MEASURES – the information we want returned from the match
• ONE ROW / ALL ROWS PER MATCH – return aggregate or detailed info for each match
• AFTER MATCH SKIP … – when a match is found, where to start looking for the next match
• PATTERN – regexp-like syntax describing the sequence of row classifiers to match
• SUBSET – "union" a set of classifications into one classification variable
• DEFINE – the conditions that classify rows
• FIRST, LAST, PREV, NEXT – navigational functions
• CLASSIFIER(), MATCH_NUMBER() – identification functions
Stock ticker
• Example from Data Warehousing Guide chapter on SQL for Pattern Matching
create table ticker (
symbol varchar2(10)
, day date
, price number
);
insert into ticker values('PLCH', DATE '2011-04-01', 12);
insert into ticker values('PLCH', DATE '2011-04-02', 17);
insert into ticker values('PLCH', DATE '2011-04-03', 19);
insert into ticker values('PLCH', DATE '2011-04-04', 21);
insert into ticker values('PLCH', DATE '2011-04-05', 25);
insert into ticker values('PLCH', DATE '2011-04-06', 12);
insert into ticker values('PLCH', DATE '2011-04-07', 15);
insert into ticker values('PLCH', DATE '2011-04-08', 20);
insert into ticker values('PLCH', DATE '2011-04-09', 24);
insert into ticker values('PLCH', DATE '2011-04-10', 25);
insert into ticker values('PLCH', DATE '2011-04-11', 19);
insert into ticker values('PLCH', DATE '2011-04-12', 15);
insert into ticker values('PLCH', DATE '2011-04-13', 25);
insert into ticker values('PLCH', DATE '2011-04-14', 25);
insert into ticker values('PLCH', DATE '2011-04-15', 14);
insert into ticker values('PLCH', DATE '2011-04-16', 12);
insert into ticker values('PLCH', DATE '2011-04-17', 14);
insert into ticker values('PLCH', DATE '2011-04-18', 24);
insert into ticker values('PLCH', DATE '2011-04-19', 23);
insert into ticker values('PLCH', DATE '2011-04-20', 22);
Ticker table
• Look for V shapes = at least one “down” slope followed by at least one “up” slope
select *
from ticker match_recognize (
partition by symbol
order by day
measures strt.day as start_day,
final last(down.day) as bottom_day,
final last(up.day) as end_day,
match_number() as match_num,
classifier() as var_match
all rows per match
after match skip to last up
pattern (strt down+ up+)
define
down as down.price < prev(down.price),
up as up.price > prev(up.price)
) mr
order by mr.symbol, mr.match_num, mr.day;
Stock ticker
• Output of previous slide
SYMBOL DAY START_DAY BOTTOM_DA END_DAY MATCH_NUM VAR_MATCH PRICE
---------- --------- --------- --------- --------- ---------- --------- ----------
PLCH 05-APR-11 05-APR-11 06-APR-11 10-APR-11 1 STRT 25
PLCH 06-APR-11 05-APR-11 06-APR-11 10-APR-11 1 DOWN 12
PLCH 07-APR-11 05-APR-11 06-APR-11 10-APR-11 1 UP 15
PLCH 08-APR-11 05-APR-11 06-APR-11 10-APR-11 1 UP 20
PLCH 09-APR-11 05-APR-11 06-APR-11 10-APR-11 1 UP 24
PLCH 10-APR-11 05-APR-11 06-APR-11 10-APR-11 1 UP 25
PLCH 10-APR-11 10-APR-11 12-APR-11 13-APR-11 2 STRT 25
PLCH 11-APR-11 10-APR-11 12-APR-11 13-APR-11 2 DOWN 19
PLCH 12-APR-11 10-APR-11 12-APR-11 13-APR-11 2 DOWN 15
PLCH 13-APR-11 10-APR-11 12-APR-11 13-APR-11 2 UP 25
PLCH 14-APR-11 14-APR-11 16-APR-11 18-APR-11 3 STRT 25
PLCH 15-APR-11 14-APR-11 16-APR-11 18-APR-11 3 DOWN 14
PLCH 16-APR-11 14-APR-11 16-APR-11 18-APR-11 3 DOWN 12
PLCH 17-APR-11 14-APR-11 16-APR-11 18-APR-11 3 UP 14
PLCH 18-APR-11 14-APR-11 16-APR-11 18-APR-11 3 UP 24
Stock ticker
• The previous example used ALL ROWS PER MATCH; this one uses ONE ROW PER MATCH
select * from ticker match_recognize (
partition by symbol order by day
measures strt.day as start_day,
final last(down.day) as bottom_day,
final last(down.price) as bottom_price,
final last(up.day) as end_day,
match_number() as match_num
one row per match after match skip to last up
pattern (strt down+ up+)
define down as down.price < prev(down.price),
up as up.price > prev(up.price) ) mr
order by mr.symbol, mr.match_num;
SYMBOL START_DAY BOTTOM_DA BOTTOM_PRICE END_DAY MATCH_NUM
---------- --------- --------- ------------ --------- ----------
PLCH 05-APR-11 06-APR-11 12 10-APR-11 1
PLCH 10-APR-11 12-APR-11 15 13-APR-11 2
PLCH 14-APR-11 16-APR-11 12 18-APR-11 3
ONE ROW PER MATCH
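To make the matching logic concrete, here is a minimal procedural sketch in Python (not part of the original deck) of what PATTERN (strt down+ up+) with AFTER MATCH SKIP TO LAST UP does on the ticker data. The function name find_v_shapes and the tuple layout are assumptions made for illustration only; the database does not evaluate the pattern this way internally.

```python
from datetime import date

# Prices from the ticker table, one row per day from 2011-04-01 to 2011-04-20.
PRICES = [12, 17, 19, 21, 25, 12, 15, 20, 24, 25,
          19, 15, 25, 25, 14, 12, 14, 24, 23, 22]
ROWS = [(date(2011, 4, d), p) for d, p in enumerate(PRICES, start=1)]


def find_v_shapes(rows):
    """Return (start_day, bottom_day, bottom_price, end_day) for each V shape.

    Hypothetical helper: rows must already be in ORDER BY day order.
    """
    matches = []
    i, n = 0, len(rows)
    while i < n:
        j = i + 1
        # DOWN+ : each price strictly below the previous row's price.
        while j < n and rows[j][1] < rows[j - 1][1]:
            j += 1
        bottom = j - 1
        # UP+ : each price strictly above the previous row's price.
        while j < n and rows[j][1] > rows[j - 1][1]:
            j += 1
        last_up = j - 1
        if bottom > i and last_up > bottom:
            matches.append(
                (rows[i][0], rows[bottom][0], rows[bottom][1], rows[last_up][0])
            )
            i = last_up  # AFTER MATCH SKIP TO LAST UP: last UP row is the next STRT
        else:
            i += 1  # no match starting at this row, try the next one
    return matches


for match in find_v_shapes(ROWS):
    print(match)
```

For the 20 ticker rows this yields the same three matches as the ONE ROW PER MATCH output above. Greedy scanning is enough here because a row cannot be both DOWN and UP, so backtracking could never produce a different match.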
• Navigational functions in measure expressions (quiz from devgym.oracle.com)
select symbol, day, price
, up_day, up_avg, up_total
from ticker
match_recognize (
partition by symbol
order by day
measures
final count(up.*) as days_up
, up.price - prev(up.price) as up_day
, (final last(up.price) - strt.price)
/ final count(up.*) as up_avg
, up.price - strt.price as up_total
all rows per match
after match skip to last up
pattern ( strt up+ )
define up as up.price > prev(up.price)
)
order by day;
SYMB DAY       PRICE UP_DAY UP_AVG UP_TOTAL
---- --------- ----- ------ ------ --------
PLCH 01-APR-11    12          3.25
PLCH 02-APR-11    17      5   3.25        5
PLCH 03-APR-11    19      2   3.25        7
PLCH 04-APR-11    21      2   3.25        9
PLCH 05-APR-11    25      4   3.25       13
PLCH 06-APR-11    12          3.25
PLCH 07-APR-11    15      3   3.25        3
PLCH 08-APR-11    20      5   3.25        8
PLCH 09-APR-11    24      4   3.25       12
PLCH 10-APR-11    25      1   3.25       13
PLCH 12-APR-11    15         10.00
PLCH 13-APR-11    25     10  10.00       10
PLCH 16-APR-11    12          6.00
PLCH 17-APR-11    14      2   6.00        2
PLCH 18-APR-11    24     10   6.00       12
Measure expressions
Grouping sequences
• https://stewashton.wordpress.com/2014/03/05/12c-match_recognize-grouping-sequences/
• Table of numeric values in some sequential groups
create table ex1 (numval)
as
select 1 from dual union all
select 2 from dual union all
select 3 from dual union all
select 5 from dual union all
select 6 from dual union all
select 7 from dual union all
select 10 from dual union all
select 11 from dual union all
select 12 from dual union all
select 20 from dual;
Stew Ashton example
• A "b" row is a row where numval is exactly one greater than the previous row's numval
• The pattern states: any row followed by zero or more occurrences of a "b" row
select *
from ex1
match_recognize (
order by numval
measures
first(numval) firstval
, last(numval) lastval
, count(*) cnt
pattern (
a b*
)
define
b as numval = prev(numval) + 1
);
FIRSTVAL LASTVAL CNT
---------- ---------- ----------
1 3 3
5 7 3
10 12 3
20 20 1
DEFINE in relation to PREV row
• Analytic method by Aketi Jyuuzou – as efficient, but less self-documenting
select min(numval) firstval
, max(numval) lastval
, count(*) cnt
from (
select numval
, numval - row_number() over (
order by numval
) as grp
from ex1
)
group by grp
order by min(numval);
FIRSTVAL LASTVAL CNT
---------- ---------- ----------
1 3 3
5 7 3
10 12 3
20 20 1
Tabibitosan
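The two approaches can be compared side by side in a small Python sketch (an illustration added here, not from the deck). group_by_pattern mimics the "a b*" scan, and group_by_tabibitosan mimics the numval minus row_number() grouping key; both function names are hypothetical.

```python
from itertools import groupby

NUMVALS = [1, 2, 3, 5, 6, 7, 10, 11, 12, 20]


def group_by_pattern(values):
    """Mimic PATTERN (a b*): extend the group while numval = prev(numval) + 1."""
    groups = []
    for v in sorted(values):
        if groups and v == groups[-1][-1] + 1:
            groups[-1].append(v)  # a "b" row: exactly one more than the previous row
        else:
            groups.append([v])  # an "a" row: starts a new match
    return [(g[0], g[-1], len(g)) for g in groups]


def group_by_tabibitosan(values):
    """Mimic the analytic method: rows in one consecutive run share numval - row_number."""
    ordered = sorted(values)
    keyed = [(v - rn, v) for rn, v in enumerate(ordered, start=1)]
    result = []
    # For distinct sorted integers the key never decreases, so equal keys are adjacent.
    for _grp, rows in groupby(keyed, key=lambda kv: kv[0]):
        vals = [v for _key, v in rows]
        result.append((min(vals), max(vals), len(vals)))
    return result


print(group_by_pattern(NUMVALS))
print(group_by_tabibitosan(NUMVALS))
```

Both functions return (firstval, lastval, cnt) tuples that correspond to the four output rows shown on the slides.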
Merge date ranges
• https://stewashton.wordpress.com/2015/06/10/merging-overlapping-date-ranges-with-match_recognize/
• Table of date ranges – end_date is exclusive (each range runs up to but not including end_date)
create table t ( id int, start_date date, end_date date );
insert into t values ( 1, date '2014-01-01', date '2014-01-03');
insert into t values ( 2, date '2014-01-02', date '2014-01-05');
insert into t values ( 3, date '2014-01-02', date '2014-01-06');
insert into t values ( 4, date '2014-01-03', date '2014-01-05');
insert into t values ( 5, date '2014-01-05', date '2014-01-07');
insert into t values ( 6, date '2014-01-23', date '2014-02-01');
insert into t values ( 7, date '2014-01-25', date '2014-02-01');
insert into t values ( 8, date '2014-02-01', date '2014-02-10');
insert into t values ( 9, date '2014-02-01', date '2014-02-04');
insert into t values (10, date '2014-02-05', date '2014-02-12');
insert into t values (11, date '2014-02-10', date '2014-02-15');
Date Ranges
• As long as the start date of the next row is smaller than or equal to the highest end date seen
so far, the next row overlaps or adjoins and is merged (replace <= with < for just overlapping)
select *
from t
match_recognize(
order by start_date, end_date
measures
first(start_date) start_date
, max(end_date) end_date
, count(*) c
pattern(
a* b
)
define
a as next(start_date) <= max(end_date)
);
START_DAT END_DATE C
--------- --------- --
01-JAN-14 07-JAN-14 5
23-JAN-14 15-FEB-14 6
Merge overlapping and contiguous ranges
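The merge rule can be sketched procedurally in Python (an added illustration, not from the deck): keep the highest end date seen so far, and start a new group only when the next start date is later than it. The name merge_ranges is hypothetical, and the sketch covers only the non-NULL data from this slide.

```python
from datetime import date

RANGES = [
    (date(2014, 1, 1), date(2014, 1, 3)),
    (date(2014, 1, 2), date(2014, 1, 5)),
    (date(2014, 1, 2), date(2014, 1, 6)),
    (date(2014, 1, 3), date(2014, 1, 5)),
    (date(2014, 1, 5), date(2014, 1, 7)),
    (date(2014, 1, 23), date(2014, 2, 1)),
    (date(2014, 1, 25), date(2014, 2, 1)),
    (date(2014, 2, 1), date(2014, 2, 10)),
    (date(2014, 2, 1), date(2014, 2, 4)),
    (date(2014, 2, 5), date(2014, 2, 12)),
    (date(2014, 2, 10), date(2014, 2, 15)),
]


def merge_ranges(ranges):
    """Return (start_date, end_date, row_count) per merged group."""
    merged = []
    for start, end in sorted(ranges):  # ORDER BY start_date, end_date
        if merged and start <= merged[-1][1]:
            # Overlaps or adjoins: widen the group to the highest end date so far.
            first_start, max_end, count = merged[-1]
            merged[-1] = (first_start, max(max_end, end), count + 1)
        else:
            merged.append((start, end, 1))
    return merged


for row in merge_ranges(RANGES):
    print(row)
```

Changing `<=` to `<` in the comparison merges only ranges that truly overlap, mirroring the note on the slide.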
• Add some rows with NULL values
insert into t values (12, null, date '2014-01-01');
insert into t values (13, null, date '2014-01-02');
insert into t values (14, date '2014-02-19', date '2014-02-21');
insert into t values (14, date '2014-02-20', null);
insert into t values (15, date '2014-02-21', null);
NULL for infinity
• Handle null start date as minimum date -4712-01-01
• Handle null end date as maximum date 9999-12-31
select * from t
match_recognize(
order by start_date nulls first
, end_date nulls last
measures
first(start_date) start_date
, nullif(
max(nvl(end_date, date '9999-12-31'))
, date '9999-12-31'
) end_date
, count(*) c
pattern( a* b )
define a as
nvl(next(start_date), date '-4712-01-01')
<= max(nvl(end_date, date '9999-12-31'))
);
START_DAT END_DATE   C
--------- --------- --
          07-JAN-14  7
23-JAN-14 15-FEB-14  6
19-FEB-14            3
NULL for infinity
Tablespace growth
• Table storing tablespace size every midnight
create table plch_space (
tabspace varchar2(30)
, sampledate date
, gigabytes number
);
insert into plch_space values ('MYSPACE' , date '2014-02-01', 100);
insert into plch_space values ('MYSPACE' , date '2014-02-02', 103);
insert into plch_space values ('MYSPACE' , date '2014-02-03', 116);
insert into plch_space values ('MYSPACE' , date '2014-02-04', 129);
insert into plch_space values ('MYSPACE' , date '2014-02-05', 142);
insert into plch_space values ('MYSPACE' , date '2014-02-06', 160);
insert into plch_space values ('MYSPACE' , date '2014-02-07', 165);
insert into plch_space values ('MYSPACE' , date '2014-02-08', 210);
insert into plch_space values ('MYSPACE' , date '2014-02-09', 230);
insert into plch_space values ('MYSPACE' , date '2014-02-10', 239);
insert into plch_space values ('YOURSPACE', date '2014-02-06', 50);
insert into plch_space values ('YOURSPACE', date '2014-02-07', 53);
insert into plch_space values ('YOURSPACE', date '2014-02-08', 72);
insert into plch_space values ('YOURSPACE', date '2014-02-09', 97);
insert into plch_space values ('YOURSPACE', date '2014-02-10', 101);
insert into plch_space values ('HISSPACE', date '2014-02-06', 100);
insert into plch_space values ('HISSPACE', date '2014-02-07', 130);
insert into plch_space values ('HISSPACE', date '2014-02-08', 145);
insert into plch_space values ('HISSPACE', date '2014-02-09', 200);
insert into plch_space values ('HISSPACE', date '2014-02-10', 225);
insert into plch_space values ('HISSPACE', date '2014-02-11', 255);
insert into plch_space values ('HISSPACE', date '2014-02-12', 285);
insert into plch_space values ('HISSPACE', date '2014-02-13', 315);
From my quizzes on devgym.oracle.com
• FAST is defined as daily growth of at least 25%, SLOW as daily growth of at least 10% but less than 25%
• PATTERN states we want to see periods of at least 1 FAST day or at least 3 consecutive SLOW days
select tabspace, spurttype, startdate, startgb, enddate, endgb, avg_daily_gb
from plch_space
match_recognize (
partition by tabspace order by sampledate
measures
classifier() as spurttype
, first(sampledate) as startdate
, first(gigabytes) as startgb
, last(sampledate) as enddate
, next(gigabytes) as endgb
, (next(gigabytes) - first(gigabytes)) / count(*) as avg_daily_gb
one row per match after match skip past last row
pattern ( fast+ | slow{3,} )
define fast as next(gigabytes) / gigabytes >= 1.25
, slow as next(slow.gigabytes) / slow.gigabytes >= 1.10 and
next(slow.gigabytes) / slow.gigabytes < 1.25
)
order by tabspace, startdate;
OR in pattern is |
• Output of the previous slide
TABSPACE SPURTTYPE STARTDATE STARTGB ENDDATE ENDGB AVG_DAILY_GB
------------ ---------- --------- ---------- --------- ---------- ------------
HISSPACE FAST 06-FEB-14 100 06-FEB-14 130 30
HISSPACE FAST 08-FEB-14 145 08-FEB-14 200 55
HISSPACE SLOW 09-FEB-14 200 12-FEB-14 315 28.75
MYSPACE SLOW 02-FEB-14 103 05-FEB-14 160 14.25
MYSPACE FAST 07-FEB-14 165 07-FEB-14 210 45
YOURSPACE FAST 07-FEB-14 53 08-FEB-14 97 22
Growth alert report
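As a cross-check of the report, here is a hedged Python sketch (added for illustration, not from the deck) of the rule "fast+ | slow{3,}" with AFTER MATCH SKIP PAST LAST ROW. The helper names daily, classify and growth_spurts are hypothetical. Integer arithmetic is used for the percentage tests to avoid floating-point edge cases.

```python
from datetime import date, timedelta


def daily(tabspace, first_day, sizes):
    """Build (tabspace, sampledate, gigabytes) rows, one per consecutive day."""
    return [(tabspace, first_day + timedelta(days=i), gb) for i, gb in enumerate(sizes)]


SAMPLES = (
    daily("MYSPACE", date(2014, 2, 1), [100, 103, 116, 129, 142, 160, 165, 210, 230, 239])
    + daily("YOURSPACE", date(2014, 2, 6), [50, 53, 72, 97, 101])
    + daily("HISSPACE", date(2014, 2, 6), [100, 130, 145, 200, 225, 255, 285, 315])
)


def classify(gb, next_gb):
    """Classify a row by growth to the next sample; None if neither FAST nor SLOW."""
    if next_gb is None:  # last row of a partition: NEXT() is NULL
        return None
    if next_gb * 100 >= gb * 125:
        return "FAST"
    if next_gb * 100 >= gb * 110:
        return "SLOW"
    return None


def growth_spurts(samples):
    by_space = {}
    for tabspace, day, gb in sorted(samples):
        by_space.setdefault(tabspace, []).append((day, gb))
    report = []
    for tabspace in sorted(by_space):
        rows = by_space[tabspace]
        labels = [
            classify(gb, rows[k + 1][1] if k + 1 < len(rows) else None)
            for k, (_day, gb) in enumerate(rows)
        ]
        i = 0
        while i < len(rows):
            label = labels[i]
            j = i
            while label is not None and j < len(rows) and labels[j] == label:
                j += 1  # greedy run of identically classified rows
            length = j - i
            if label == "FAST" or (label == "SLOW" and length >= 3):
                start_day, start_gb = rows[i]
                end_day = rows[j - 1][0]
                end_gb = rows[j][1]  # NEXT(gigabytes) of the last matched row
                report.append((tabspace, label, start_day, start_gb,
                               end_day, end_gb, (end_gb - start_gb) / length))
                i = j  # skip past last row
            else:
                i += 1
    return report


for row in growth_spurts(SAMPLES):
    print(row)
```

On the sample data this produces the six rows of the growth alert report, in the same order.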
select tabspace, spurttype, startdate
, min(gigabytes) keep (dense_rank first order by sampledate) startgb
, max(sampledate) enddate
, max(nextgb) keep (dense_rank last order by sampledate) endgb
, avg(daily_gb) avg_daily_gb
from (
select tabspace, spurttype, sampledate, gigabytes, nextgb, daily_gb
, last_value(spurtstartdate ignore nulls) over (
partition by tabspace, spurttype order by sampledate
rows between unbounded preceding and current row
) startdate
from (
select tabspace, spurttype, sampledate, gigabytes, nextgb, daily_gb
, case
when spurttype is not null and
( lag(spurttype) over (
partition by tabspace order by sampledate
) is null
or
lag(spurttype) over (
partition by tabspace order by sampledate
) != spurttype
)
...
Analytic alternative
...
then sampledate
end spurtstartdate
from (
select tabspace, sampledate, gigabytes, nextgb, nextgb - gigabytes daily_gb
, case
when nextgb >= gigabytes * 1.25 then 'FAST'
when nextgb >= gigabytes * 1.10 then 'SLOW'
end spurttype
from (
select tabspace, sampledate, gigabytes
, lead(gigabytes) over (
partition by tabspace order by sampledate
) nextgb
from plch_space
) ) )
where spurttype is not null
)
group by tabspace, spurttype, startdate
having count(*) >= case spurttype
when 'FAST' then 1
when 'SLOW' then 3
end
order by tabspace, startdate;
Analytic alternative (continued)
Bin fitting – limited capacity
• https://stewashton.wordpress.com/2014/03/03/database-12c-match_recognize-for-all-sizes-of-data/
• Create groups of consecutive study_site with sum(cnt) at most 65,000
create table t (
study_site number
, cnt number
);
insert into t (study_site,cnt) values (1001,3407);
insert into t (study_site,cnt) values (1002,4323);
insert into t (study_site,cnt) values (1004,1623);
insert into t (study_site,cnt) values (1008,1991);
insert into t (study_site,cnt) values (1011,885);
insert into t (study_site,cnt) values (1012,11597);
insert into t (study_site,cnt) values (1014,1989);
insert into t (study_site,cnt) values (1015,5282);
insert into t (study_site,cnt) values (1017,2841);
insert into t (study_site,cnt) values (1018,5183);
insert into t (study_site,cnt) values (1020,6176);
insert into t (study_site,cnt) values (1022,2784);
insert into t (study_site,cnt) values (1023,25865);
insert into t (study_site,cnt) values (1024,3734);
insert into t (study_site,cnt) values (1026,137);
insert into t (study_site,cnt) values (1028,6005);
insert into t (study_site,cnt) values (1029,76);
insert into t (study_site,cnt) values (1031,4599);
insert into t (study_site,cnt) values (1032,1989);
insert into t (study_site,cnt) values (1034,3427);
insert into t (study_site,cnt) values (1036,879);
insert into t (study_site,cnt) values (1038,6485);
insert into t (study_site,cnt) values (1039,3);
insert into t (study_site,cnt) values (1040,1105);
insert into t (study_site,cnt) values (1041,6460);
insert into t (study_site,cnt) values (1042,968);
insert into t (study_site,cnt) values (1044,471);
insert into t (study_site,cnt) values (1045,3360);
Stew Ashton example
• An aggregate such as SUM in DEFINE has "running" semantics
• Pattern "a+" continues matching while the rolling sum(cnt) <= 65,000
select * from t
match_recognize (
order by study_site
measures
first(study_site) first_site
, last(study_site) last_site
, sum(cnt) sum_cnt
one row per match
after match skip past last row
pattern ( a+ )
define
a as sum(cnt) <= 65000
);
FIRST_SITE LAST_SITE SUM_CNT
---------- ---------- ----------
1001 1022 48081
1023 1044 62203
1045 1045 3360
Match until rolling sum reaches limit
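The running-sum behaviour can be sketched in Python as follows (an added illustration, not from the deck). The name pack_in_order is hypothetical, and the sketch assumes no single cnt exceeds the limit; in the SQL version such a row would simply not be matched.

```python
SITES = [
    (1001, 3407), (1002, 4323), (1004, 1623), (1008, 1991), (1011, 885),
    (1012, 11597), (1014, 1989), (1015, 5282), (1017, 2841), (1018, 5183),
    (1020, 6176), (1022, 2784), (1023, 25865), (1024, 3734), (1026, 137),
    (1028, 6005), (1029, 76), (1031, 4599), (1032, 1989), (1034, 3427),
    (1036, 879), (1038, 6485), (1039, 3), (1040, 1105), (1041, 6460),
    (1042, 968), (1044, 471), (1045, 3360),
]


def pack_in_order(rows, limit=65000):
    """Return (first_site, last_site, sum_cnt) per group, scanning in study_site order."""
    groups, current, running = [], [], 0
    for site, cnt in sorted(rows):  # ORDER BY study_site
        if cnt > limit:
            raise ValueError(f"site {site} alone exceeds the limit")  # assumption, see text
        if current and running + cnt > limit:
            # The running SUM(cnt) would pass the limit: close this match, start a new one.
            groups.append(current)
            current, running = [], 0
        current.append((site, cnt))
        running += cnt
    if current:
        groups.append(current)
    return [(g[0][0], g[-1][0], sum(c for _s, c in g)) for g in groups]


for row in pack_in_order(SITES):
    print(row)
```

The three tuples returned correspond to the FIRST_SITE, LAST_SITE and SUM_CNT rows above.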
• The previous slide assumed the groups had to follow STUDY_SITE order
• If they do not, ordering by CNT descending can "pack" the data a bit better
select * from t
match_recognize (
order by cnt desc, study_site
measures
count(*) sites
, sum(cnt) sum_cnt
, min(cnt) min_cnt
, max(cnt) max_cnt
one row per match
after match skip past last row
pattern ( a+ )
define
a as sum(cnt) <= 65000
);
SITES SUM_CNT MIN_CNT MAX_CNT
------ -------- -------- --------
6 62588 6005 25865
22 51056 3 5282
Match until rolling sum reaches limit
• Better (yet simple) "best fit" approximation by interleaved ordering of large/small
• Largest, smallest, second-largest, second-smallest, third-largest, third-smallest, etc.
select * from (
select study_site, cnt
, least(
row_number() over (
order by cnt
)
, row_number() over (
order by cnt desc
)
) rn
from t
)
match_recognize (
order by rn, cnt desc, study_site
...
SITES SUM_CNT MIN_CNT MAX_CNT
------ -------- -------- --------
11 64154 3 25865
17 49490 885 5282
Match until rolling sum reaches limit
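Since the rest of the query is elided on the slide, the following Python sketch (added here, not from the deck) assumes that it applies the same "a+" pattern and the same measures as the previous slide to the interleaved ordering. The names interleave_large_small and pack are hypothetical, and every cnt is assumed to be at most the limit.

```python
SITES = [
    (1001, 3407), (1002, 4323), (1004, 1623), (1008, 1991), (1011, 885),
    (1012, 11597), (1014, 1989), (1015, 5282), (1017, 2841), (1018, 5183),
    (1020, 6176), (1022, 2784), (1023, 25865), (1024, 3734), (1026, 137),
    (1028, 6005), (1029, 76), (1031, 4599), (1032, 1989), (1034, 3427),
    (1036, 879), (1038, 6485), (1039, 3), (1040, 1105), (1041, 6460),
    (1042, 968), (1044, 471), (1045, 3360),
]


def interleave_large_small(rows):
    """Order rows as largest, smallest, second-largest, second-smallest, ..."""
    ascending = sorted(rows, key=lambda r: r[1])
    n = len(ascending)
    keyed = []
    for pos, (site, cnt) in enumerate(ascending, start=1):
        # least(row_number ascending, row_number descending)
        rn = min(pos, n - pos + 1)
        keyed.append((rn, -cnt, site, cnt))  # ORDER BY rn, cnt DESC, study_site
    keyed.sort()
    return [(site, cnt) for _rn, _neg, site, cnt in keyed]


def pack(rows, limit=65000):
    """Return (sites, sum_cnt, min_cnt, max_cnt) per group for rows in the given order."""
    groups, current, running = [], [], 0
    for _site, cnt in rows:
        if current and running + cnt > limit:
            groups.append(current)
            current, running = [], 0
        current.append(cnt)
        running += cnt
    if current:
        groups.append(current)
    return [(len(g), sum(g), min(g), max(g)) for g in groups]


print(pack(interleave_large_small(SITES)))
print(pack(sorted(SITES, key=lambda r: (-r[1], r[0]))))  # plain descending order
```

Under that assumption the interleaved ordering gives the two groups shown above, and the plain descending ordering gives the two groups from the previous slide.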
Bin fitting – limited number of bins
• https://stewashton.wordpress.com/2014/06/06/bin-fitting-problems-with-sql/
• We want to fill 3 bins so each bin sum(item_value) is as near equal as possible
create table items
as
select level item_name, level item_value
from dual
connect by level <= 10;
select *
from items
order by item_name;
ITEM_NAME ITEM_VALUE
---------- ----------
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
Stew Ashton example
• First, order the items by value in descending order
• Then, assign each item to whatever bin has the smallest sum so far
select * from items
match_recognize (
order by item_value desc
measures
to_number(substr(classifier(),4)) bin#,
sum(bin1.item_value) bin1,
sum(bin2.item_value) bin2,
sum(bin3.item_value) bin3
all rows per match
pattern ( (bin1|bin2|bin3)* )
define
bin1 as count(bin1.*) = 1
or sum(bin1.item_value)-bin1.item_value
<= least(sum(bin2.item_value), sum(bin3.item_value))
, bin2 as count(bin2.*) = 1
or sum(bin2.item_value)-bin2.item_value
<= sum(bin3.item_value)
);
Fill 3 bins equally
• Output of previous slide
ITEM_VALUE       BIN#       BIN1       BIN2       BIN3  ITEM_NAME
---------- ---------- ---------- ---------- ---------- ----------
        10          1         10                               10
         9          2         10          9                     9
         8          3         10          9          8          8
         7          3         10          9         15          7
         6          2         10         15         15          6
         5          1         15         15         15          5
         4          1         19         15         15          4
         3          2         19         18         15          3
         2          3         19         18         17          2
         1          3         19         18         18          1
Almost equally filled
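The greedy rule from the slide (largest item first, always into the bin with the smallest sum so far) can be sketched in Python like this. It is an added illustration, the name fill_bins is hypothetical, and empty bins are represented as 0 where the SQL output shows NULL.

```python
def fill_bins(values, bin_count=3):
    """Assign each value, largest first, to the bin with the smallest running sum.

    Returns a list of (item_value, bin_number, running_sums) tuples.
    Ties go to the lowest-numbered bin, as in the DEFINE clause order.
    """
    sums = [0] * bin_count
    assignment = []
    for value in sorted(values, reverse=True):
        target = sums.index(min(sums))  # first bin holding the smallest sum
        sums[target] += value
        assignment.append((value, target + 1, tuple(sums)))
    return assignment


for row in fill_bins(range(1, 11)):
    print(row)
```

For the ten items 1 to 10 the bin numbers and running sums follow the BIN#, BIN1, BIN2 and BIN3 columns above, ending with sums of 19, 18 and 18.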
Hierarchical child count
• http://www.kibeha.dk/2015/07/row-pattern-matching-nested-within.html
• CONNECT BY in scalar subquery
select empno
, lpad(' ', (level-1)*2) || ename as ename
, (
select count(*)
from emp sub
start with sub.mgr = emp.empno
connect by sub.mgr = prior sub.empno
) subs
from emp
start with mgr is null
connect by mgr = prior empno
order siblings by empno;
EMPNO ENAME         SUBS
----- ------------ -----
 7839 KING            13
 7566   JONES          4
 7788     SCOTT        1
 7876       ADAMS      0
 7902     FORD         1
 7369       SMITH      0
 7698   BLAKE          5
 7499     ALLEN        0
 7521     WARD         0
 7654     MARTIN       0
 7844     TURNER       0
 7900     JAMES        0
 7782   CLARK          1
 7934     MILLER       0
How many subordinates for each employee
• Using AFTER MATCH SKIP TO NEXT ROW allows "nesting" of matches
• The output is identical to the previous slide
with hierarchy as (
select lvl, empno, ename, rownum as rn
from (
select level as lvl, empno, ename
from emp
start with mgr is null
connect by mgr = prior empno
order siblings by empno
)
)
select empno
, lpad(' ', (lvl-1)*2) || ename as ename
, subs
from hierarchy
match_recognize (
order by rn
measures
strt.rn as rn
, strt.lvl as lvl
, strt.empno as empno
, strt.ename as ename
, count(higher.lvl) as subs
one row per match
after match skip to next row
pattern ( strt higher* )
define higher as
higher.lvl > strt.lvl
)
order by rn;
Pattern matching instead of scalar subquery
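What "strt higher*" with AFTER MATCH SKIP TO NEXT ROW computes can be sketched in Python (added for illustration, not from the deck): for each row of the pre-ordered hierarchy, count the following rows until one appears at the same or a shallower level. The name count_subordinates is hypothetical, and the levels below are an assumption based on the standard EMP demo table, consistent with the SUBS values shown earlier.

```python
# (lvl, empno, ename) in hierarchical (ORDER SIBLINGS BY empno) order.
# Levels are assumed from the standard EMP demo data.
HIERARCHY = [
    (1, 7839, "KING"),
    (2, 7566, "JONES"),
    (3, 7788, "SCOTT"),
    (4, 7876, "ADAMS"),
    (3, 7902, "FORD"),
    (4, 7369, "SMITH"),
    (2, 7698, "BLAKE"),
    (3, 7499, "ALLEN"),
    (3, 7521, "WARD"),
    (3, 7654, "MARTIN"),
    (3, 7844, "TURNER"),
    (3, 7900, "JAMES"),
    (2, 7782, "CLARK"),
    (3, 7934, "MILLER"),
]


def count_subordinates(rows):
    """Return (empno, ename, subs): number of rows below each row in the tree."""
    result = []
    for i, (lvl, empno, ename) in enumerate(rows):
        subs = 0
        for other_lvl, _empno, _ename in rows[i + 1:]:
            if other_lvl <= lvl:
                break  # first row that is not HIGHER ends the match
            subs += 1
        result.append((empno, ename, subs))
    return result


for row in count_subordinates(HIERARCHY):
    print(row)
```

Every row starts its own match, which is why the matches nest: KING's match spans all 14 rows while JONES's match covers rows 2 to 6 inside it.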
• See details of what is happening with ALL ROWS PER MATCH
with hierarchy as (
select lvl, empno, ename, rownum as rn
from (
select level as lvl, empno, ename
from emp
start with mgr is null
connect by mgr = prior empno
order siblings by empno
) )
select mn, rn, empno
, lpad(' ', (lvl-1)*2) || ename as ename
, roll, subs, cls
, stno, stname, hino, hiname
from hierarchy
match_recognize (
order by rn
measures
match_number() as mn
, classifier() as cls
, strt.empno as stno
, strt.ename as stname
, higher.empno as hino
, higher.ename as hiname
, count(higher.lvl) as roll
, final count(higher.lvl) as subs
all rows per match
after match skip to next row
pattern ( strt higher* )
define higher as
higher.lvl > strt.lvl
)
order by mn, rn;
ALL ROWS PER MATCH
• Output of previous slide
 MN  RN EMPNO ENAME        ROLL SUBS CLS     STNO STNAME  HINO HINAME
--- --- ----- ------------ ---- ---- ------ ----- ------ ----- ------
  1   1  7839 KING            0   13 STRT    7839 KING
  1   2  7566   JONES         1   13 HIGHER  7839 KING    7566 JONES
  1   3  7788     SCOTT       2   13 HIGHER  7839 KING    7788 SCOTT
  1   4  7876       ADAMS     3   13 HIGHER  7839 KING    7876 ADAMS
  1   5  7902     FORD        4   13 HIGHER  7839 KING    7902 FORD
  1   6  7369       SMITH     5   13 HIGHER  7839 KING    7369 SMITH
  1   7  7698   BLAKE         6   13 HIGHER  7839 KING    7698 BLAKE
  1   8  7499     ALLEN       7   13 HIGHER  7839 KING    7499 ALLEN
  1   9  7521     WARD        8   13 HIGHER  7839 KING    7521 WARD
  1  10  7654     MARTIN      9   13 HIGHER  7839 KING    7654 MARTIN
  1  11  7844     TURNER     10   13 HIGHER  7839 KING    7844 TURNER
  1  12  7900     JAMES      11   13 HIGHER  7839 KING    7900 JAMES
  1  13  7782   CLARK        12   13 HIGHER  7839 KING    7782 CLARK
  1  14  7934     MILLER     13   13 HIGHER  7839 KING    7934 MILLER
  2   2  7566   JONES         0    4 STRT    7566 JONES
  2   3  7788     SCOTT       1    4 HIGHER  7566 JONES   7788 SCOTT
  2   4  7876       ADAMS     2    4 HIGHER  7566 JONES   7876 ADAMS
  2   5  7902     FORD        3    4 HIGHER  7566 JONES   7902 FORD
  2   6  7369       SMITH     4    4 HIGHER  7566 JONES   7369 SMITH
...
ALL ROWS PER MATCH
• PIVOT is used here only to visualize which rows are part of which match
with hierarchy as (
select lvl, empno, ename, rownum as rn
from (
select level as lvl, empno, ename
from emp
start with mgr is null
connect by mgr = prior empno
order siblings by empno
) )
select rn, empno, ename
, case "1" when 1 then 'XX' end "1"
, case "2" when 1 then 'XX' end "2"
...
, case "13" when 1 then 'XX' end "13"
, case "14" when 1 then 'XX' end "14"
from (
select mn, rn, empno
, lpad(' ', (lvl-1)*2) || ename as ename
from hierarchy
match_recognize (
order by rn
measures match_number() as mn
all rows per match
after match skip to next row
pattern ( strt higher* )
define higher as higher.lvl > strt.lvl
))
pivot (
count(*)
for mn in (1,2,3,4,5,6,7,8,9,10,11,12,13,14)
) order by rn;
PIVOT
• Output of the previous slide
 RN EMPNO ENAME        1  2  3  4  5  6  7  8  9  10 11 12 13 14
--- ----- ------------ -- -- -- -- -- -- -- -- -- -- -- -- -- --
  1  7839 KING         XX
  2  7566   JONES      XX XX
  3  7788     SCOTT    XX XX XX
  4  7876       ADAMS  XX XX XX XX
  5  7902     FORD     XX XX       XX
  6  7369       SMITH  XX XX       XX XX
  7  7698   BLAKE      XX                XX
  8  7499     ALLEN    XX                XX XX
  9  7521     WARD     XX                XX    XX
 10  7654     MARTIN   XX                XX       XX
 11  7844     TURNER   XX                XX          XX
 12  7900     JAMES    XX                XX             XX
 13  7782   CLARK      XX                                  XX
 14  7934     MILLER   XX                                  XX XX
PIVOT
• Could wrap entire thing in inline view and filter on “subs > 0”
• But much simpler just to change * into +
with hierarchy as (
select lvl, empno, ename, rownum as rn
from (
select level as lvl, empno, ename
from emp
start with mgr is null
connect by mgr = prior empno
order siblings by empno
)
)
select empno
, lpad(' ', (lvl-1)*2) || ename as ename
, subs
from hierarchy
match_recognize (
order by rn
measures
strt.rn as rn
, strt.lvl as lvl
, strt.empno as empno
, strt.ename as ename
, count(higher.lvl) as subs
one row per match
after match skip to next row
pattern ( strt higher+ )
define higher as
higher.lvl > strt.lvl
)
order by rn;
Only those with subordinates?
• Output of previous slide
EMPNO ENAME        SUBS
----- ------------ ----
 7839 KING           13
 7566   JONES         4
 7788     SCOTT       1
 7902     FORD        1
 7698   BLAKE         5
 7782   CLARK         1
Only those with subordinates!
• Create a BIGEMP table with employee LARRY on top of a pyramid of 14,001 employees
create table bigemp as
select 1 empno
, 'LARRY' ename
, cast(null as number) mgr
from dual
union all
select dum.dum * 10000 + empno empno
, ename || '#' || dum.dum ename
, coalesce(dum.dum * 10000 + mgr, 1) mgr
from emp
cross join (
select level dum
from dual
connect by level <= 1000
) dum;
Scalability
• Scalar subquery with CONNECT BY (first statistics block below) is roughly 33x slower (11.61 s vs. 0.35 s),
with 8455x more consistent gets and 9252x more sorts than the MATCH_RECOGNIZE method (second block)
14001 rows selected.
Elapsed: 00:00:11.61
Statistics
--------------------------------------------
0 recursive calls
0 db block gets
465005 consistent gets
0 physical reads
0 redo size
435280 bytes sent via SQL*Net to client
10763 bytes received via SQL*Net from...
935 SQL*Net roundtrips to/from client
37008 sorts (memory)
0 sorts (disk)
14001 rows processed
14001 rows selected.
Elapsed: 00:00:00.35
Statistics
--------------------------------------------
1 recursive calls
0 db block gets
55 consistent gets
0 physical reads
0 redo size
435280 bytes sent via SQL*Net to client
10763 bytes received via SQL*Net from...
935 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
14001 rows processed
Scalability
Brief summary
MATCH_RECOGNIZE - a "Swiss army knife" tool
• Brilliant when applied “BI style” like stock ticker analysis examples
• But applicable to many other cases too
• When you have some problem crossing row boundaries and feel you have to “stretch” even
the capabilities of analytics, try a pattern based approach:
• Rephrase (in natural language) your requirements in terms of what classifies the rows
you are looking for
• Turn that into pattern matching syntax classifying individual rows in DEFINE and how the
classified rows should appear in PATTERN
• As with analytics, it might feel daunting at first, but once you start using pattern matching, it
will become just another tool in your SQL toolbox
http://kibeha.dk @kibeha
Questions & Answers
This presentation http://bit.ly/kibeha_patmatch4_pptx
Script with all the code http://bit.ly/kibeha_patmatch4_sql
Webinar http://bit.ly/patternmatch
Webinar scripts http://bit.ly/patternmatchsamples
Stew Ashton https://stewashton.wordpress.com/category/match_recognize/
kim.berghansen@trivadis.com

Practical Graph Algorithms with Neo4j
Lecture 9.pptx
MariaDB Server 10.3 - Temporale Daten und neues zur DB-Kompatibilität
Ad

More from Trivadis (20)

PDF
Azure Days 2019: Azure Chatbot Development for Airline Irregularities (Remco ...
PDF
Azure Days 2019: Trivadis Azure Foundation – Das Fundament für den ... (Nisan...
PDF
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
PDF
Azure Days 2019: Master the Move to Azure (Konrad Brunner)
PDF
Azure Days 2019: Keynote Azure Switzerland – Status Quo und Ausblick (Primo A...
PDF
Azure Days 2019: Grösser und Komplexer ist nicht immer besser (Meinrad Weiss)
PDF
Azure Days 2019: Get Connected with Azure API Management (Gerry Keune & Stefa...
PDF
Azure Days 2019: Infrastructure as Code auf Azure (Jonas Wanninger & Daniel H...
PDF
Azure Days 2019: Wie bringt man eine Data Analytics Plattform in die Cloud? (...
PDF
Azure Days 2019: Azure@Helsana: Die Erweiterung von Dynamics CRM mit Azure Po...
PDF
TechEvent 2019: Kundenstory - Kein Angebot, kein Auftrag – Wie Du ein individ...
PDF
TechEvent 2019: Oracle Database Appliance M/L - Erfahrungen und Erfolgsmethod...
PDF
TechEvent 2019: Security 101 für Web Entwickler; Roland Krüger - Trivadis
PDF
TechEvent 2019: Trivadis & Swisscom Partner Angebote; Konrad Häfeli, Markus O...
PDF
TechEvent 2019: DBaaS from Swisscom Cloud powered by Trivadis; Konrad Häfeli ...
PDF
TechEvent 2019: Status of the partnership Trivadis and EDB - Comparing Postgr...
PDF
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
PDF
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...
PDF
TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...
PDF
TechEvent 2019: The sleeping Power of Data; Eberhard Lösch - Trivadis
Azure Days 2019: Azure Chatbot Development for Airline Irregularities (Remco ...
Azure Days 2019: Trivadis Azure Foundation – Das Fundament für den ... (Nisan...
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Master the Move to Azure (Konrad Brunner)
Azure Days 2019: Keynote Azure Switzerland – Status Quo und Ausblick (Primo A...
Azure Days 2019: Grösser und Komplexer ist nicht immer besser (Meinrad Weiss)
Azure Days 2019: Get Connected with Azure API Management (Gerry Keune & Stefa...
Azure Days 2019: Infrastructure as Code auf Azure (Jonas Wanninger & Daniel H...
Azure Days 2019: Wie bringt man eine Data Analytics Plattform in die Cloud? (...
Azure Days 2019: Azure@Helsana: Die Erweiterung von Dynamics CRM mit Azure Po...
TechEvent 2019: Kundenstory - Kein Angebot, kein Auftrag – Wie Du ein individ...
TechEvent 2019: Oracle Database Appliance M/L - Erfahrungen und Erfolgsmethod...
TechEvent 2019: Security 101 für Web Entwickler; Roland Krüger - Trivadis
TechEvent 2019: Trivadis & Swisscom Partner Angebote; Konrad Häfeli, Markus O...
TechEvent 2019: DBaaS from Swisscom Cloud powered by Trivadis; Konrad Häfeli ...
TechEvent 2019: Status of the partnership Trivadis and EDB - Comparing Postgr...
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...
TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...
TechEvent 2019: The sleeping Power of Data; Eberhard Lösch - Trivadis

Recently uploaded (20)

PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Spectral efficient network and resource selection model in 5G networks
PPT
Teaching material agriculture food technology
PDF
Machine learning based COVID-19 study performance prediction
PPTX
Spectroscopy.pptx food analysis technology
PDF
Approach and Philosophy of On baking technology
PDF
Encapsulation theory and applications.pdf
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Electronic commerce courselecture one. Pdf
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PDF
KodekX | Application Modernization Development
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
Empathic Computing: Creating Shared Understanding
PPTX
sap open course for s4hana steps from ECC to s4
Building Integrated photovoltaic BIPV_UPV.pdf
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
20250228 LYD VKU AI Blended-Learning.pptx
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Spectral efficient network and resource selection model in 5G networks
Teaching material agriculture food technology
Machine learning based COVID-19 study performance prediction
Spectroscopy.pptx food analysis technology
Approach and Philosophy of On baking technology
Encapsulation theory and applications.pdf
MIND Revenue Release Quarter 2 2025 Press Release
Electronic commerce courselecture one. Pdf
Reach Out and Touch Someone: Haptics and Empathic Computing
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
Dropbox Q2 2025 Financial Results & Investor Presentation
KodekX | Application Modernization Development
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Programs and apps: productivity, graphics, security and other tools
Empathic Computing: Creating Shared Understanding
sap open course for s4hana steps from ECC to s4

TechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - Trivadis

  • 2. About me
    • Danish geek
    • SQL & PL/SQL developer since 2000
    • Developer at Trivadis since 2016 http://www.trivadis.com
    • Oracle Certified Expert in SQL
    • Oracle ACE Director
    • SQL quizmaster http://devgym.oracle.com
    • Blogger http://kibeha.dk
    • Likes to cook and read sci-fi
    • Member of Danish Beer Enthusiasts
    @kibeha
  • 3. 3 Membership Tiers
    • Oracle ACE Director
    • Oracle ACE
    • Oracle ACE Associate
    bit.ly/OracleACEProgram
    500+ Technical Experts Helping Peers Globally
    Connect:
    Nominate yourself or someone you know: acenomination.oracle.com
    @oracleace
    Facebook.com/oracleaces
    oracle-ace_ww@oracle.com
  • 4. About Trivadis
    • Founded 1994
    • 16 locations: Switzerland, Germany, Austria, Denmark and Romania
    • 700 specialists
    • 260 Service Level Agreements
    • Over 4,000 training participants
    • Research and development budget: EUR 5.0 million
    • More than 1,900 projects per year at over 800 customers
    • Financially self-supporting and sustainably profitable
  • 6. Agenda for Pattern Matching
    • Elements in the syntax
    • Use cases:
      • Stock ticker
      • Grouping sequences
      • Merge date ranges
      • Tablespace growth
      • Bin fitting with limited capacity
      • Bin fitting in limited number of bins
      • Hierarchical child count
    • Brief summary
  • 7. Elements in the syntax
  • 8. What's it look like
    • Example from Data Warehousing Guide chapter on SQL For Pattern Matching
      SELECT *
      FROM Ticker MATCH_RECOGNIZE (
        PARTITION BY symbol
        ORDER BY tstamp
        MEASURES STRT.tstamp AS start_tstamp,
                 FINAL LAST(DOWN.tstamp) AS bottom_tstamp,
                 FINAL LAST(UP.tstamp) AS end_tstamp,
                 MATCH_NUMBER() AS match_num,
                 CLASSIFIER() AS var_match
        ALL ROWS PER MATCH
        AFTER MATCH SKIP TO LAST UP
        PATTERN (STRT DOWN+ UP+)
        DEFINE
          DOWN AS DOWN.price < PREV(DOWN.price),
          UP AS UP.price > PREV(UP.price)
      ) MR
      ORDER BY MR.symbol, MR.match_num, MR.tstamp
  • 9. Elements
    • PARTITION BY – as with analytic functions, splits the data so one partition is processed at a time
    • ORDER BY – the order in which rows are tested against the pattern
    • MEASURES – the information we want returned from the match
    • ALL ROWS / ONE ROW PER MATCH – return detailed or aggregate info for each match
    • AFTER MATCH SKIP … – when a match is found, where to start looking for the next match
    • PATTERN – regexp-like syntax describing the sequence of defined row classifiers to match
    • SUBSET – "union" a set of classifications into one classification variable
    • DEFINE – definition of the classification of rows
    • FIRST, LAST, PREV, NEXT – navigational functions
    • CLASSIFIER(), MATCH_NUMBER() – identification functions
  • 10. Stock ticker
  • 11. Ticker table
    • Example from Data Warehousing Guide chapter on SQL for Pattern Matching
      create table ticker (
         symbol varchar2(10)
       , day    date
       , price  number
      );
      insert into ticker values('PLCH', DATE '2011-04-01', 12);
      insert into ticker values('PLCH', DATE '2011-04-02', 17);
      insert into ticker values('PLCH', DATE '2011-04-03', 19);
      insert into ticker values('PLCH', DATE '2011-04-04', 21);
      insert into ticker values('PLCH', DATE '2011-04-05', 25);
      insert into ticker values('PLCH', DATE '2011-04-06', 12);
      insert into ticker values('PLCH', DATE '2011-04-07', 15);
      insert into ticker values('PLCH', DATE '2011-04-08', 20);
      insert into ticker values('PLCH', DATE '2011-04-09', 24);
      insert into ticker values('PLCH', DATE '2011-04-10', 25);
      insert into ticker values('PLCH', DATE '2011-04-11', 19);
      insert into ticker values('PLCH', DATE '2011-04-12', 15);
      insert into ticker values('PLCH', DATE '2011-04-13', 25);
      insert into ticker values('PLCH', DATE '2011-04-14', 25);
      insert into ticker values('PLCH', DATE '2011-04-15', 14);
      insert into ticker values('PLCH', DATE '2011-04-16', 12);
      insert into ticker values('PLCH', DATE '2011-04-17', 14);
      insert into ticker values('PLCH', DATE '2011-04-18', 24);
      insert into ticker values('PLCH', DATE '2011-04-19', 23);
      insert into ticker values('PLCH', DATE '2011-04-20', 22);
  • 12. Stock ticker
    • Look for V shapes = at least one "down" slope followed by at least one "up" slope
      select *
      from ticker
      match_recognize (
        partition by symbol
        order by day
        measures strt.day as start_day,
                 final last(down.day) as bottom_day,
                 final last(up.day) as end_day,
                 match_number() as match_num,
                 classifier() as var_match
        all rows per match
        after match skip to last up
        pattern (strt down+ up+)
        define
          down as down.price < prev(down.price),
          up as up.price > prev(up.price)
      ) mr
      order by mr.symbol, mr.match_num, mr.day;
  • 13. Stock ticker
    • Output of previous slide
      SYMBOL     DAY       START_DAY BOTTOM_DA END_DAY    MATCH_NUM VAR_MATCH      PRICE
      ---------- --------- --------- --------- --------- ---------- --------- ----------
      PLCH       05-APR-11 05-APR-11 06-APR-11 10-APR-11          1 STRT              25
      PLCH       06-APR-11 05-APR-11 06-APR-11 10-APR-11          1 DOWN              12
      PLCH       07-APR-11 05-APR-11 06-APR-11 10-APR-11          1 UP                15
      PLCH       08-APR-11 05-APR-11 06-APR-11 10-APR-11          1 UP                20
      PLCH       09-APR-11 05-APR-11 06-APR-11 10-APR-11          1 UP                24
      PLCH       10-APR-11 05-APR-11 06-APR-11 10-APR-11          1 UP                25
      PLCH       10-APR-11 10-APR-11 12-APR-11 13-APR-11          2 STRT              25
      PLCH       11-APR-11 10-APR-11 12-APR-11 13-APR-11          2 DOWN              19
      PLCH       12-APR-11 10-APR-11 12-APR-11 13-APR-11          2 DOWN              15
      PLCH       13-APR-11 10-APR-11 12-APR-11 13-APR-11          2 UP                25
      PLCH       14-APR-11 14-APR-11 16-APR-11 18-APR-11          3 STRT              25
      PLCH       15-APR-11 14-APR-11 16-APR-11 18-APR-11          3 DOWN              14
      PLCH       16-APR-11 14-APR-11 16-APR-11 18-APR-11          3 DOWN              12
      PLCH       17-APR-11 14-APR-11 16-APR-11 18-APR-11          3 UP                14
      PLCH       18-APR-11 14-APR-11 16-APR-11 18-APR-11          3 UP                24
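To make the matching rules above concrete, here is a minimal Python sketch that walks the same 20 prices by hand. It is not how Oracle evaluates MATCH_RECOGNIZE; the function name `find_v_shapes` and the list-based data layout are illustrative assumptions. It only mirrors `PATTERN (strt down+ up+)` together with `AFTER MATCH SKIP TO LAST UP` for a single partition.

```python
from datetime import date, timedelta

# Closing prices from the ticker table, one per day starting 2011-04-01.
PRICES = [12, 17, 19, 21, 25, 12, 15, 20, 24, 25,
          19, 15, 25, 25, 14, 12, 14, 24, 23, 22]
DAYS = [date(2011, 4, 1) + timedelta(days=i) for i in range(len(PRICES))]


def find_v_shapes(prices):
    """Return (start, bottom, end) index triples for strt down+ up+ (hypothetical helper)."""
    matches = []
    n = len(prices)
    i = 0
    while i < n:
        j = i + 1
        # down+: strictly falling prices right after the start row
        while j < n and prices[j] < prices[j - 1]:
            j += 1
        bottom = j - 1
        # up+: strictly rising prices right after the bottom row
        while j < n and prices[j] > prices[j - 1]:
            j += 1
        end = j - 1
        if bottom > i and end > bottom:
            matches.append((i, bottom, end))
            # AFTER MATCH SKIP TO LAST UP: the peak row may start the next V
            i = end
        else:
            i += 1
    return matches


v_shapes = [(DAYS[s], DAYS[b], DAYS[e]) for s, b, e in find_v_shapes(PRICES)]
for start_day, bottom_day, end_day in v_shapes:
    print(start_day, bottom_day, end_day)
```

Because the search resumes at the last UP row, 10-APR-11 closes match 1 and also opens match 2, which is why that day appears twice in the ALL ROWS PER MATCH output.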
  • 14. ONE ROW PER MATCH
    • Previous example ALL ROWS, here ONE ROW per match
      select *
      from ticker
      match_recognize (
        partition by symbol
        order by day
        measures strt.day as start_day,
                 final last(down.day) as bottom_day,
                 final last(down.price) as bottom_price,
                 final last(up.day) as end_day,
                 match_number() as match_num
        one row per match
        after match skip to last up
        pattern (strt down+ up+)
        define
          down as down.price < prev(down.price),
          up as up.price > prev(up.price)
      ) mr
      order by mr.symbol, mr.match_num;

      SYMBOL     START_DAY BOTTOM_DA BOTTOM_PRICE END_DAY    MATCH_NUM
      ---------- --------- --------- ------------ --------- ----------
      PLCH       05-APR-11 06-APR-11           12 10-APR-11          1
      PLCH       10-APR-11 12-APR-11           15 13-APR-11          2
      PLCH       14-APR-11 16-APR-11           12 18-APR-11          3
  • 15. Measure expressions
    • Navigational functions in measure expressions (quiz from devgym.oracle.com)
      select symbol, day, price
           , up_day, up_avg, up_total
      from ticker
      match_recognize (
        partition by symbol
        order by day
        measures
           final count(up.*) as days_up
         , up.price - prev(up.price) as up_day
         , (final last(up.price) - strt.price) / final count(up.*) as up_avg
         , up.price - strt.price as up_total
        all rows per match
        after match skip to last up
        pattern ( strt up+ )
        define
          up as up.price > prev(up.price)
      )
      order by day;

      SYMB DAY       PRICE UP_DAY UP_AVG UP_TOTAL
      ---- --------- ----- ------ ------ --------
      PLCH 01-APR-11    12          3.25
      PLCH 02-APR-11    17      5   3.25        5
      PLCH 03-APR-11    19      2   3.25        7
      PLCH 04-APR-11    21      2   3.25        9
      PLCH 05-APR-11    25      4   3.25       13
      PLCH 06-APR-11    12          3.25
      PLCH 07-APR-11    15      3   3.25        3
      PLCH 08-APR-11    20      5   3.25        8
      PLCH 09-APR-11    24      4   3.25       12
      PLCH 10-APR-11    25      1   3.25       13
      PLCH 12-APR-11    15         10.00
      PLCH 13-APR-11    25     10  10.00       10
      PLCH 16-APR-11    12          6.00
      PLCH 17-APR-11    14      2   6.00        2
      PLCH 18-APR-11    24     10   6.00       12
  • 16. Grouping sequences
  • 17. Stew Ashton example
    • https://stewashton.wordpress.com/2014/03/05/12c-match_recognize-grouping-sequences/
    • Table of numeric values in some sequential groups
      create table ex1 (numval) as
      select 1 from dual union all
      select 2 from dual union all
      select 3 from dual union all
      select 5 from dual union all
      select 6 from dual union all
      select 7 from dual union all
      select 10 from dual union all
      select 11 from dual union all
      select 12 from dual union all
      select 20 from dual;
  • 18. DEFINE in relation to PREV row
    • A "b" row is a row where numval is exactly one greater than the previous row's numval
    • The pattern matches any row followed by zero or more "b" rows
      select *
      from ex1
      match_recognize (
        order by numval
        measures
           first(numval) firstval
         , last(numval) lastval
         , count(*) cnt
        pattern ( a b* )
        define
          b as numval = prev(numval) + 1
      );

        FIRSTVAL    LASTVAL        CNT
      ---------- ---------- ----------
               1          3          3
               5          7          3
              10         12          3
              20         20          1
  • 19. Tabibitosan
    • Analytic method by Aketi Jyuuzou – as efficient, but less self-documenting
      select min(numval) firstval
           , max(numval) lastval
           , count(*) cnt
      from (
        select numval
             , numval - row_number() over ( order by numval ) as grp
        from ex1
      )
      group by grp
      order by min(numval);

        FIRSTVAL    LASTVAL        CNT
      ---------- ---------- ----------
               1          3          3
               5          7          3
              10         12          3
              20         20          1
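The Tabibitosan trick works because, within a run of consecutive integers, the value and its row number grow in lockstep, so their difference stays constant. A small Python sketch of the same idea follows, assuming distinct integer values; `group_sequences` is a hypothetical name, and `enumerate` stands in for `row_number()`.

```python
from itertools import groupby

# Same values as table ex1.
numvals = [1, 2, 3, 5, 6, 7, 10, 11, 12, 20]


def group_sequences(values):
    """Group consecutive integers; returns (firstval, lastval, cnt) per group."""
    ordered = sorted(values)
    # value - position is constant inside a run of consecutive values (the "grp" key)
    keyed = ((value - pos, value) for pos, value in enumerate(ordered, start=1))
    result = []
    for _, run in groupby(keyed, key=lambda pair: pair[0]):
        members = [value for _, value in run]
        result.append((members[0], members[-1], len(members)))
    return result


for firstval, lastval, cnt in group_sequences(numvals):
    print(firstval, lastval, cnt)
```

For the sample data the keys are 0, 0, 0, 1, 1, 1, 3, 3, 3, 10, which gives the same four groups as both SQL versions.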
  • 20. Merge date ranges
  • 21. Date Ranges
    • https://stewashton.wordpress.com/2015/06/10/merging-overlapping-date-ranges-with-match_recognize/
    • Table of date ranges – open-ended end_date (up to but not including)
      create table t (
        id int, start_date date, end_date date
      );
      insert into t values ( 1, date '2014-01-01', date '2014-01-03');
      insert into t values ( 2, date '2014-01-02', date '2014-01-05');
      insert into t values ( 3, date '2014-01-02', date '2014-01-06');
      insert into t values ( 4, date '2014-01-03', date '2014-01-05');
      insert into t values ( 5, date '2014-01-05', date '2014-01-07');
      insert into t values ( 6, date '2014-01-23', date '2014-02-01');
      insert into t values ( 7, date '2014-01-25', date '2014-02-01');
      insert into t values ( 8, date '2014-02-01', date '2014-02-10');
      insert into t values ( 9, date '2014-02-01', date '2014-02-04');
      insert into t values (10, date '2014-02-05', date '2014-02-12');
      insert into t values (11, date '2014-02-10', date '2014-02-15');
  • 22. Merge overlapping and contiguous ranges
    • As long as the start date of the next row is smaller than or equal to the highest end date seen so far, the next row overlaps or adjoins and is merged (replace <= with < for just overlapping)
      select *
      from t
      match_recognize(
        order by start_date, end_date
        measures
           first(start_date) start_date
         , max(end_date) end_date
         , count(*) c
        pattern( a* b )
        define
          a as next(start_date) <= max(end_date)
      );

      START_DAT END_DATE   C
      --------- --------- --
      01-JAN-14 07-JAN-14  5
      23-JAN-14 15-FEB-14  6
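The merge rule above (compare the next start date with the running maximum end date) can be sketched in plain Python. This is an illustration under the assumption that no dates are NULL; `merge_ranges` is a hypothetical helper name, and the tuples repeat the rows of table t.

```python
from datetime import date

# (start_date, end_date) pairs from table t, ids 1 to 11.
RANGES = [
    (date(2014, 1, 1), date(2014, 1, 3)),
    (date(2014, 1, 2), date(2014, 1, 5)),
    (date(2014, 1, 2), date(2014, 1, 6)),
    (date(2014, 1, 3), date(2014, 1, 5)),
    (date(2014, 1, 5), date(2014, 1, 7)),
    (date(2014, 1, 23), date(2014, 2, 1)),
    (date(2014, 1, 25), date(2014, 2, 1)),
    (date(2014, 2, 1), date(2014, 2, 10)),
    (date(2014, 2, 1), date(2014, 2, 4)),
    (date(2014, 2, 5), date(2014, 2, 12)),
    (date(2014, 2, 10), date(2014, 2, 15)),
]


def merge_ranges(ranges):
    """Merge overlapping or adjoining ranges; returns (start, end, row_count)."""
    merged = []
    # sorted() on tuples orders by start_date, then end_date, like the ORDER BY
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # start is within (or touches) the highest end seen so far: extend
            prev_start, prev_end, count = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), count + 1)
        else:
            merged.append((start, end, 1))
    return merged


for start, end, count in merge_ranges(RANGES):
    print(start, end, count)
```

Changing `<=` to `<` in the comparison would merge only truly overlapping ranges, matching the remark on the slide.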
  • 23. NULL for infinity
    • Add some rows with NULL values
      insert into t values (12, null, date '2014-01-01');
      insert into t values (13, null, date '2014-01-02');
      insert into t values (14, date '2014-02-19', date '2014-02-21');
      insert into t values (14, date '2014-02-20', null);
      insert into t values (15, date '2014-02-21', null);
  • 24. NULL for infinity
    • Handle null start date as minimum date -4712-01-01
    • Handle null end date as maximum date 9999-12-31
      select *
      from t
      match_recognize(
        order by start_date nulls first
               , end_date nulls last
        measures
           first(start_date) start_date
         , nullif(
             max(nvl(end_date, date '9999-12-31'))
           , date '9999-12-31'
           ) end_date
         , count(*) c
        pattern( a* b )
        define
          a as nvl(next(start_date), date '-4712-01-01')
               <= max(nvl(end_date, date '9999-12-31'))
      );

      START_DAT END_DATE   C
      --------- --------- --
                07-JAN-14  7
      23-JAN-14 15-FEB-14  6
      19-FEB-14            3
  • 25. Tablespace growth
  • 26. From my quizzes on devgym.oracle.com
    • Table storing tablespace size every midnight
      create table plch_space (
         tabspace   varchar2(30)
       , sampledate date
       , gigabytes  number
      );
      insert into plch_space values ('MYSPACE'  , date '2014-02-01', 100);
      insert into plch_space values ('MYSPACE'  , date '2014-02-02', 103);
      insert into plch_space values ('MYSPACE'  , date '2014-02-03', 116);
      insert into plch_space values ('MYSPACE'  , date '2014-02-04', 129);
      insert into plch_space values ('MYSPACE'  , date '2014-02-05', 142);
      insert into plch_space values ('MYSPACE'  , date '2014-02-06', 160);
      insert into plch_space values ('MYSPACE'  , date '2014-02-07', 165);
      insert into plch_space values ('MYSPACE'  , date '2014-02-08', 210);
      insert into plch_space values ('MYSPACE'  , date '2014-02-09', 230);
      insert into plch_space values ('MYSPACE'  , date '2014-02-10', 239);
      insert into plch_space values ('YOURSPACE', date '2014-02-06', 50);
      insert into plch_space values ('YOURSPACE', date '2014-02-07', 53);
      insert into plch_space values ('YOURSPACE', date '2014-02-08', 72);
      insert into plch_space values ('YOURSPACE', date '2014-02-09', 97);
      insert into plch_space values ('YOURSPACE', date '2014-02-10', 101);
      insert into plch_space values ('HISSPACE' , date '2014-02-06', 100);
      insert into plch_space values ('HISSPACE' , date '2014-02-07', 130);
      insert into plch_space values ('HISSPACE' , date '2014-02-08', 145);
      insert into plch_space values ('HISSPACE' , date '2014-02-09', 200);
      insert into plch_space values ('HISSPACE' , date '2014-02-10', 225);
      insert into plch_space values ('HISSPACE' , date '2014-02-11', 255);
      insert into plch_space values ('HISSPACE' , date '2014-02-12', 285);
      insert into plch_space values ('HISSPACE' , date '2014-02-13', 315);
  • 27. OR in pattern is |
    • FAST is defined as growth of at least 25% to the next day, SLOW as growth of at least 10% but less than 25%
    • PATTERN states we want to see periods of at least 1 FAST or at least 3 SLOW
      select tabspace, spurttype, startdate, startgb, enddate, endgb, avg_daily_gb
      from plch_space
      match_recognize (
        partition by tabspace
        order by sampledate
        measures
           classifier() as spurttype
         , first(sampledate) as startdate
         , first(gigabytes) as startgb
         , last(sampledate) as enddate
         , next(gigabytes) as endgb
         , (next(gigabytes) - first(gigabytes)) / count(*) as avg_daily_gb
        one row per match
        after match skip past last row
        pattern ( fast+ | slow{3,} )
        define
           fast as next(gigabytes) / gigabytes >= 1.25
         , slow as next(slow.gigabytes) / slow.gigabytes >= 1.10
               and next(slow.gigabytes) / slow.gigabytes < 1.25
      )
      order by tabspace, startdate;
  • 28. Growth alert report
    • Output of the previous slide
      TABSPACE     SPURTTYPE  STARTDATE    STARTGB ENDDATE        ENDGB AVG_DAILY_GB
      ------------ ---------- --------- ---------- --------- ---------- ------------
      HISSPACE     FAST       06-FEB-14        100 06-FEB-14        130           30
      HISSPACE     FAST       08-FEB-14        145 08-FEB-14        200           55
      HISSPACE     SLOW       09-FEB-14        200 12-FEB-14        315        28.75
      MYSPACE      SLOW       02-FEB-14        103 05-FEB-14        160        14.25
      MYSPACE      FAST       07-FEB-14        165 07-FEB-14        210           45
      YOURSPACE    FAST       07-FEB-14         53 08-FEB-14         97           22
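As a cross-check of the report above, the following Python sketch applies the same FAST/SLOW rules to one tablespace at a time. It is only an approximation of `PATTERN ( fast+ | slow{3,} )` with `AFTER MATCH SKIP PAST LAST ROW`; the names `classify` and `growth_spurts` are hypothetical, and it assumes one sample per day with no gaps so dates can be derived from list positions.

```python
from datetime import date, timedelta

# First sample date and daily sizes in GB per tablespace, from plch_space.
SAMPLES = {
    "MYSPACE": (date(2014, 2, 1), [100, 103, 116, 129, 142, 160, 165, 210, 230, 239]),
    "YOURSPACE": (date(2014, 2, 6), [50, 53, 72, 97, 101]),
    "HISSPACE": (date(2014, 2, 6), [100, 130, 145, 200, 225, 255, 285, 315]),
}


def classify(gb, next_gb):
    """Label a row by its growth to the next row (the DEFINE conditions)."""
    ratio = next_gb / gb
    if ratio >= 1.25:
        return "FAST"
    if ratio >= 1.10:
        return "SLOW"
    return None


def growth_spurts(first_day, sizes):
    """Return (type, startdate, startgb, enddate, endgb, avg_daily_gb) tuples."""
    # The last row has no next row, so it can never be FAST or SLOW.
    labels = [classify(a, b) for a, b in zip(sizes, sizes[1:])] + [None]
    spurts = []
    i = 0
    while i < len(sizes):
        kind = labels[i]
        j = i
        while kind is not None and j < len(sizes) and labels[j] == kind:
            j += 1
        length = j - i
        if (kind == "FAST" and length >= 1) or (kind == "SLOW" and length >= 3):
            # sizes[j] is next(gigabytes) evaluated on the last matched row
            spurts.append((kind,
                           first_day + timedelta(days=i), sizes[i],
                           first_day + timedelta(days=j - 1), sizes[j],
                           (sizes[j] - sizes[i]) / length))
            i = j  # skip past last row of the match
        else:
            i += 1
    return spurts


for tabspace in sorted(SAMPLES):
    for spurt in growth_spurts(*SAMPLES[tabspace]):
        print(tabspace, *spurt)
```

Note how ENDGB comes from the row after the last matched row, which is why a one-day FAST spurt still reports a start and an end size.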
  • 29. Analytic alternative
      select tabspace, spurttype, startdate
           , min(gigabytes) keep (dense_rank first order by sampledate) startgb
           , max(sampledate) enddate
           , max(nextgb) keep (dense_rank last order by sampledate) endgb
           , avg(daily_gb) avg_daily_gb
      from (
        select tabspace, spurttype, sampledate, gigabytes, nextgb, daily_gb
             , last_value(spurtstartdate ignore nulls) over (
                 partition by tabspace, spurttype
                 order by sampledate
                 rows between unbounded preceding and current row
               ) startdate
        from (
          select tabspace, spurttype, sampledate, gigabytes, nextgb, daily_gb
               , case
                   when spurttype is not null and (
                          lag(spurttype) over (
                            partition by tabspace order by sampledate
                          ) is null
                          or lag(spurttype) over (
                            partition by tabspace order by sampledate
                          ) != spurttype
                        )
      ...
  • 30. Analytic alternative (continued)
      ...
                   then sampledate
                 end spurtstartdate
          from (
            select tabspace, sampledate, gigabytes, nextgb
                 , nextgb - gigabytes daily_gb
                 , case
                     when nextgb >= gigabytes * 1.25 then 'FAST'
                     when nextgb >= gigabytes * 1.10 then 'SLOW'
                   end spurttype
            from (
              select tabspace, sampledate, gigabytes
                   , lead(gigabytes) over (
                       partition by tabspace order by sampledate
                     ) nextgb
              from plch_space
            )
          )
        )
        where spurttype is not null
      )
      group by tabspace, spurttype, startdate
      having count(*) >= case spurttype
                           when 'FAST' then 1
                           when 'SLOW' then 3
                         end
      order by tabspace, startdate;
  • 31. Bin fitting – limited capacity
  • 32. Stew Ashton example
    • https://stewashton.wordpress.com/2014/03/03/database-12c-match_recognize-for-all-sizes-of-data/
    • Create groups of consecutive study_site with sum(cnt) at most 65,000
      create table t (
         study_site number
       , cnt        number
      );
      insert into t (study_site,cnt) values (1001,3407);
      insert into t (study_site,cnt) values (1002,4323);
      insert into t (study_site,cnt) values (1004,1623);
      insert into t (study_site,cnt) values (1008,1991);
      insert into t (study_site,cnt) values (1011,885);
      insert into t (study_site,cnt) values (1012,11597);
      insert into t (study_site,cnt) values (1014,1989);
      insert into t (study_site,cnt) values (1015,5282);
      insert into t (study_site,cnt) values (1017,2841);
      insert into t (study_site,cnt) values (1018,5183);
      insert into t (study_site,cnt) values (1020,6176);
      insert into t (study_site,cnt) values (1022,2784);
      insert into t (study_site,cnt) values (1023,25865);
      insert into t (study_site,cnt) values (1024,3734);
      insert into t (study_site,cnt) values (1026,137);
      insert into t (study_site,cnt) values (1028,6005);
      insert into t (study_site,cnt) values (1029,76);
      insert into t (study_site,cnt) values (1031,4599);
      insert into t (study_site,cnt) values (1032,1989);
      insert into t (study_site,cnt) values (1034,3427);
      insert into t (study_site,cnt) values (1036,879);
      insert into t (study_site,cnt) values (1038,6485);
      insert into t (study_site,cnt) values (1039,3);
      insert into t (study_site,cnt) values (1040,1105);
      insert into t (study_site,cnt) values (1041,6460);
      insert into t (study_site,cnt) values (1042,968);
      insert into t (study_site,cnt) values (1044,471);
      insert into t (study_site,cnt) values (1045,3360);
  • 33. Match until rolling sum reaches limit
    • Aggregate SUM in DEFINE has "running" semantics
    • Pattern "a+" continues matching while rolling sum(cnt) <= 65,000
      select *
      from t
      match_recognize (
        order by study_site
        measures
           first(study_site) first_site
         , last(study_site) last_site
         , sum(cnt) sum_cnt
        one row per match
        after match skip past last row
        pattern ( a+ )
        define
          a as sum(cnt) <= 65000
      );

      FIRST_SITE  LAST_SITE    SUM_CNT
      ---------- ---------- ----------
            1001       1022      48081
            1023       1044      62203
            1045       1045       3360
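The running-sum behaviour of `a as sum(cnt) <= 65000` amounts to greedy packing in row order. Below is a minimal Python sketch of that reading, with `pack_in_order` as a hypothetical helper name. It assumes that a single row larger than the capacity simply matches nothing and is left out, which is how an unmatched row behaves with ONE ROW PER MATCH.

```python
# (study_site, cnt) rows from table t, already in study_site order.
SITES = [
    (1001, 3407), (1002, 4323), (1004, 1623), (1008, 1991), (1011, 885),
    (1012, 11597), (1014, 1989), (1015, 5282), (1017, 2841), (1018, 5183),
    (1020, 6176), (1022, 2784), (1023, 25865), (1024, 3734), (1026, 137),
    (1028, 6005), (1029, 76), (1031, 4599), (1032, 1989), (1034, 3427),
    (1036, 879), (1038, 6485), (1039, 3), (1040, 1105), (1041, 6460),
    (1042, 968), (1044, 471), (1045, 3360),
]


def pack_in_order(rows, capacity):
    """Greedy groups in input order; returns (first_site, last_site, sum_cnt)."""
    groups = []
    current = []  # sites in the group being built
    total = 0     # running sum(cnt) of that group
    for site, cnt in rows:
        if current and total + cnt > capacity:
            # adding this row would break the limit: close the current group
            groups.append((current[0], current[-1], total))
            current, total = [], 0
        if cnt <= capacity:
            current.append(site)
            total += cnt
        # a row with cnt > capacity on its own is skipped (assumption stated above)
    if current:
        groups.append((current[0], current[-1], total))
    return groups


for first_site, last_site, sum_cnt in pack_in_order(SITES, 65000):
    print(first_site, last_site, sum_cnt)
```

Passing the rows in a different order, for example `sorted(SITES, key=lambda r: -r[1])`, packs them differently, since the grouping depends entirely on the ordering.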
  • 34. Match until rolling sum reaches limit
    • The previous slide assumed the rows had to be grouped in STUDY_SITE order
    • Ordering by CNT descending can "pack" the data a bit better
      select *
      from t
      match_recognize (
        order by cnt desc, study_site
        measures
           count(*) sites
         , sum(cnt) sum_cnt
         , min(cnt) min_cnt
         , max(cnt) max_cnt
        one row per match
        after match skip past last row
        pattern ( a+ )
        define
          a as sum(cnt) <= 65000
      );

       SITES  SUM_CNT  MIN_CNT  MAX_CNT
      ------ -------- -------- --------
           6    62588     6005    25865
          22    51056        3     5282
  • 35. Match until rolling sum reaches limit
    • Better (yet simple) "best fit" approximation by interleaved ordering of large/small
    • Largest, smallest, second-largest, second-smallest, third-largest, third-smallest, etc.
      select *
      from (
        select study_site, cnt
             , least(
                 row_number() over ( order by cnt )
               , row_number() over ( order by cnt desc )
               ) rn
        from t
      )
      match_recognize (
        order by rn, cnt desc, study_site
      ...

       SITES  SUM_CNT  MIN_CNT  MAX_CNT
      ------ -------- -------- --------
          11    64154        3    25865
          17    49490      885     5282
  • 36. Bin fitting – limited number of bins
  • 37. Stew Ashton example
    • https://stewashton.wordpress.com/2014/06/06/bin-fitting-problems-with-sql/
    • We want to fill 3 bins so each bin sum(item_value) is as near equal as possible
      create table items as
      select level item_name, level item_value
      from dual
      connect by level <= 10;

      select * from items order by item_name;

       ITEM_NAME ITEM_VALUE
      ---------- ----------
               1          1
               2          2
               3          3
               4          4
               5          5
               6          6
               7          7
               8          8
               9          9
              10         10
  • 38. Fill 3 bins equally
    • First, order the items by value in descending order
    • Then, assign each item to whatever bin has the smallest sum so far
      select *
      from items
      match_recognize (
        order by item_value desc
        measures to_number(substr(classifier(),4)) bin#,
                 sum(bin1.item_value) bin1,
                 sum(bin2.item_value) bin2,
                 sum(bin3.item_value) bin3
        all rows per match
        pattern ( (bin1|bin2|bin3)* )
        define
           bin1 as count(bin1.*) = 1
                or sum(bin1.item_value)-bin1.item_value
                   <= least(sum(bin2.item_value), sum(bin3.item_value))
         , bin2 as count(bin2.*) = 1
                or sum(bin2.item_value)-bin2.item_value
                   <= sum(bin3.item_value)
      );
  • 39. Almost equally filled
    • Output of previous slide
      ITEM_VALUE       BIN#       BIN1       BIN2       BIN3  ITEM_NAME
      ---------- ---------- ---------- ---------- ---------- ----------
              10          1         10                               10
               9          2         10          9                     9
               8          3         10          9          8          8
               7          3         10          9         15          7
               6          2         10         15         15          6
               5          1         15         15         15          5
               4          1         19         15         15          4
               3          2         19         18         15          3
               2          3         19         18         17          2
               1          3         19         18         18          1
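The two-step rule (sort descending, then drop each item into the bin with the smallest sum so far) is a greedy heuristic that is easy to trace outside the database. Here is a minimal Python sketch, with `fill_bins` as a hypothetical name; ties go to the lowest-numbered bin, which corresponds to BIN1 being tried before BIN2 and BIN3 in the pattern alternation. Empty bins show as 0 here rather than NULL.

```python
def fill_bins(values, bin_count=3):
    """Greedy fill; returns (item_value, bin_number, running_sums) per item."""
    sums = [0] * bin_count
    assignment = []
    for value in sorted(values, reverse=True):
        # min() returns the first minimal index, so ties prefer the lowest bin
        target = min(range(bin_count), key=lambda b: sums[b])
        sums[target] += value
        assignment.append((value, target + 1, tuple(sums)))
    return assignment


# Items 1 to 10, as in the items table.
result = fill_bins(range(1, 11))
for item_value, bin_number, running in result:
    print(item_value, bin_number, *running)
```

Like the SQL, this gives a good but not guaranteed optimal distribution: the bins end at 19, 18 and 18 for a total of 55.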
  • 40. Hierarchical child count
  • 41. How many subordinates for each employee
    • http://www.kibeha.dk/2015/07/row-pattern-matching-nested-within.html
    • CONNECT BY in scalar subquery
      select empno
           , lpad(' ', (level-1)*2) || ename as ename
           , ( select count(*)
               from emp sub
               start with sub.mgr = emp.empno
               connect by sub.mgr = prior sub.empno
             ) subs
      from emp
      start with mgr is null
      connect by mgr = prior empno
      order siblings by empno;

      EMPNO ENAME         SUBS
      ----- ------------ -----
       7839 KING            13
       7566   JONES          4
       7788     SCOTT        1
       7876       ADAMS      0
       7902     FORD         1
       7369       SMITH      0
       7698   BLAKE          5
       7499     ALLEN        0
       7521     WARD         0
       7654     MARTIN       0
       7844     TURNER       0
       7900     JAMES        0
       7782   CLARK          1
       7934     MILLER       0
  • 42. Pattern matching instead of scalar subquery
    • Using AFTER MATCH SKIP TO NEXT ROW allows "nesting" of matches
    • Same output as the previous slide
      with hierarchy as (
        select lvl, empno, ename, rownum as rn
        from (
          select level as lvl, empno, ename
          from emp
          start with mgr is null
          connect by mgr = prior empno
          order siblings by empno
        )
      )
      select empno
           , lpad(' ', (lvl-1)*2) || ename as ename
           , subs
      from hierarchy
      match_recognize (
        order by rn
        measures
           strt.rn as rn
         , strt.lvl as lvl
         , strt.empno as empno
         , strt.ename as ename
         , count(higher.lvl) as subs
        one row per match
        after match skip to next row
        pattern ( strt higher* )
        define
          higher as higher.lvl > strt.lvl
      )
      order by rn;
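The idea behind `strt higher*` is that, once the hierarchy is listed in depth-first order, the subordinates of a row are exactly the rows that follow it until the level drops back to the row's own level or lower. A short Python sketch of that scan follows; `count_subordinates` is a hypothetical name, and the levels in `HIERARCHY` are reconstructed from the employee tree shown in these slides.

```python
# (lvl, empno, ename) in the depth-first order produced by ORDER SIBLINGS BY empno.
HIERARCHY = [
    (1, 7839, "KING"),
    (2, 7566, "JONES"),
    (3, 7788, "SCOTT"),
    (4, 7876, "ADAMS"),
    (3, 7902, "FORD"),
    (4, 7369, "SMITH"),
    (2, 7698, "BLAKE"),
    (3, 7499, "ALLEN"),
    (3, 7521, "WARD"),
    (3, 7654, "MARTIN"),
    (3, 7844, "TURNER"),
    (3, 7900, "JAMES"),
    (2, 7782, "CLARK"),
    (3, 7934, "MILLER"),
]


def count_subordinates(rows):
    """Return (empno, ename, subs) for every row, one 'match' per start row."""
    result = []
    for i, (lvl, empno, ename) in enumerate(rows):
        subs = 0
        # higher*: keep consuming rows while they sit deeper than the start row
        for later_lvl, _, _ in rows[i + 1:]:
            if later_lvl <= lvl:
                break
            subs += 1
        result.append((empno, ename, subs))
    return result


for empno, ename, subs in count_subordinates(HIERARCHY):
    print(empno, ename, subs)
```

Starting a new scan from every row is the counterpart of AFTER MATCH SKIP TO NEXT ROW: matches overlap, so each employee is counted once for every ancestor.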
  • 43. ALL ROWS PER MATCH
    • See details of what is happening with ALL ROWS PER MATCH
      with hierarchy as (
        select lvl, empno, ename, rownum as rn
        from (
          select level as lvl, empno, ename
          from emp
          start with mgr is null
          connect by mgr = prior empno
          order siblings by empno
        )
      )
      select mn, rn, empno
           , lpad(' ', (lvl-1)*2) || ename as ename
           , roll, subs, cls
           , stno, stname, hino, hiname
      from hierarchy
      match_recognize (
        order by rn
        measures
           match_number() as mn
         , classifier() as cls
         , strt.empno as stno
         , strt.ename as stname
         , higher.empno as hino
         , higher.ename as hiname
         , count(higher.lvl) as roll
         , final count(higher.lvl) as subs
        all rows per match
        after match skip to next row
        pattern ( strt higher* )
        define
          higher as higher.lvl > strt.lvl
      )
      order by mn, rn;
  • 44. ALL ROWS PER MATCH
    • Output of previous slide
       MN  RN EMPNO ENAME        ROLL SUBS CLS     STNO STNAME  HINO HINAME
      --- --- ----- ------------ ---- ---- ------ ----- ------ ----- ------
        1   1  7839 KING            0   13 STRT    7839 KING
        1   2  7566   JONES         1   13 HIGHER  7839 KING    7566 JONES
        1   3  7788     SCOTT       2   13 HIGHER  7839 KING    7788 SCOTT
        1   4  7876       ADAMS     3   13 HIGHER  7839 KING    7876 ADAMS
        1   5  7902     FORD        4   13 HIGHER  7839 KING    7902 FORD
        1   6  7369       SMITH     5   13 HIGHER  7839 KING    7369 SMITH
        1   7  7698   BLAKE         6   13 HIGHER  7839 KING    7698 BLAKE
        1   8  7499     ALLEN       7   13 HIGHER  7839 KING    7499 ALLEN
        1   9  7521     WARD        8   13 HIGHER  7839 KING    7521 WARD
        1  10  7654     MARTIN      9   13 HIGHER  7839 KING    7654 MARTIN
        1  11  7844     TURNER     10   13 HIGHER  7839 KING    7844 TURNER
        1  12  7900     JAMES      11   13 HIGHER  7839 KING    7900 JAMES
        1  13  7782   CLARK        12   13 HIGHER  7839 KING    7782 CLARK
        1  14  7934     MILLER     13   13 HIGHER  7839 KING    7934 MILLER
        2   2  7566   JONES         0    4 STRT    7566 JONES
        2   3  7788     SCOTT       1    4 HIGHER  7566 JONES   7788 SCOTT
        2   4  7876       ADAMS     2    4 HIGHER  7566 JONES   7876 ADAMS
        2   5  7902     FORD        3    4 HIGHER  7566 JONES   7902 FORD
        2   6  7369       SMITH     4    4 HIGHER  7566 JONES   7369 SMITH
      ...
  • 45. PIVOT
    • PIVOT just to visualize which rows are part of which match

    with hierarchy as (
       select lvl, empno, ename, rownum as rn
       from (
          select level as lvl, empno, ename
          from emp
          start with mgr is null
          connect by mgr = prior empno
          order siblings by empno
       )
    )
    select rn, empno, ename
         , case "1" when 1 then 'XX' end "1"
         , case "2" when 1 then 'XX' end "2"
         ...
         , case "13" when 1 then 'XX' end "13"
         , case "14" when 1 then 'XX' end "14"
    from (
       select mn, rn, empno
            , lpad(' ', (lvl-1)*2) || ename as ename
       from hierarchy
       match_recognize (
          order by rn
          measures
             match_number() as mn
          all rows per match
          after match skip to next row
          pattern ( strt higher* )
          define
             higher as higher.lvl > strt.lvl
       )
    )
    pivot (
       count(*)
       for mn in (1,2,3,4,5,6,7,8,9,10,11,12,13,14)
    )
    order by rn;
  • 46. PIVOT
    • Output of the previous slide

     RN EMPNO ENAME        1  2  3  4  5  6  7  8  9  10 11 12 13 14
    --- ----- ------------ -- -- -- -- -- -- -- -- -- -- -- -- -- --
      1  7839 KING         XX
      2  7566   JONES      XX XX
      3  7788     SCOTT    XX XX XX
      4  7876       ADAMS  XX XX XX XX
      5  7902     FORD     XX XX       XX
      6  7369       SMITH  XX XX       XX XX
      7  7698   BLAKE      XX                XX
      8  7499     ALLEN    XX                XX XX
      9  7521     WARD     XX                XX    XX
     10  7654     MARTIN   XX                XX       XX
     11  7844     TURNER   XX                XX          XX
     12  7900     JAMES    XX                XX             XX
     13  7782   CLARK      XX                                  XX
     14  7934     MILLER   XX                                  XX XX
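The XX grid can be reproduced outside the database. The Python sketch below is an illustration under assumptions: the level list comes from the standard EMP demo table in hierarchical order, and match_membership is a hypothetical helper. It lists, for each row number, the match numbers whose span contains that row.

```python
# Levels of the 14 EMP rows in hierarchical order (assumed from the standard demo table).
LVLS = [1, 2, 3, 4, 3, 4, 2, 3, 3, 3, 3, 3, 2, 3]


def match_membership(lvls):
    """Map each 1-based row number to the match numbers that include it.

    With PATTERN (strt higher*) and SKIP TO NEXT ROW every row starts a match,
    so match number mn always starts at row number mn.
    """
    members = {rn: [] for rn in range(1, len(lvls) + 1)}
    for mn, strt_lvl in enumerate(lvls, start=1):
        members[mn].append(mn)                    # the STRT row itself
        for rn in range(mn + 1, len(lvls) + 1):
            if lvls[rn - 1] <= strt_lvl:          # HIGHER fails, match ends
                break
            members[rn].append(mn)                # row rn is HIGHER in match mn
    return members


for rn, matches in match_membership(LVLS).items():
    print(rn, matches)
```

Each row belongs to its own match plus one match per ancestor, so the number of XX marks on a row equals that row's level in the hierarchy.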
  • 47. Only those with subordinates?
    • Could wrap entire thing in inline view and filter on “subs > 0”
    • But much simpler just to change * into +

    with hierarchy as (
       select lvl, empno, ename, rownum as rn
       from (
          select level as lvl, empno, ename
          from emp
          start with mgr is null
          connect by mgr = prior empno
          order siblings by empno
       )
    )
    select empno
         , lpad(' ', (lvl-1)*2) || ename as ename
         , subs
    from hierarchy
    match_recognize (
       order by rn
       measures
          strt.rn as rn
        , strt.lvl as lvl
        , strt.empno as empno
        , strt.ename as ename
        , count(higher.lvl) as subs
       one row per match
       after match skip to next row
       pattern ( strt higher+ )
       define
          higher as higher.lvl > strt.lvl
    )
    order by rn;
  • 48. Only those with subordinates!
    • Output of previous slide

    EMPNO ENAME        SUBS
    ----- ------------ ----
     7839 KING           13
     7566   JONES         4
     7788     SCOTT       1
     7902     FORD        1
     7698   BLAKE         5
     7782   CLARK         1
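The two options mentioned for this query (filtering on subs > 0, or changing * into +) give the same rows. A minimal Python sketch of that equivalence, assuming the standard EMP demo rows in hierarchical order and using a hypothetical min_higher parameter to stand in for the quantifier:

```python
# Rows in hierarchical order as (lvl, ename); assumed from the standard EMP demo table.
ROWS = [
    (1, "KING"), (2, "JONES"), (3, "SCOTT"), (4, "ADAMS"), (3, "FORD"),
    (4, "SMITH"), (2, "BLAKE"), (3, "ALLEN"), (3, "WARD"), (3, "MARTIN"),
    (3, "TURNER"), (3, "JAMES"), (2, "CLARK"), (3, "MILLER"),
]


def count_subs(rows, min_higher):
    """min_higher=0 mimics higher*, min_higher=1 mimics higher+."""
    out = []
    for start, (strt_lvl, ename) in enumerate(rows):
        subs = 0
        for lvl, _ in rows[start + 1:]:
            if lvl <= strt_lvl:
                break
            subs += 1
        # With higher+, a STRT row without any HIGHER row is simply not a match.
        if subs >= min_higher:
            out.append((ename, subs))
    return out


star_then_filter = [(e, s) for e, s in count_subs(ROWS, 0) if s > 0]
plus = count_subs(ROWS, 1)
print(plus == star_then_filter)
print(plus)
```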
  • 49. Scalability
    • Create a BIGEMP table with employee LARRY on top of a pyramid of 14,001 employees (LARRY plus 1,000 copies of the 14 EMP rows)

    create table bigemp as
    select 1 empno
         , 'LARRY' ename
         , cast(null as number) mgr
    from dual
    union all
    select dum.dum * 10000 + empno empno
         , ename || '#' || dum.dum ename
         , coalesce(dum.dum * 10000 + mgr, 1) mgr
    from emp
    cross join (
       select level dum
       from dual
       connect by level <= 1000
    ) dum;
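The numbering scheme in this statement is easy to check by hand. The following Python sketch rebuilds the same rows in memory; the EMP tuples are assumed from the standard demo table and build_bigemp is a hypothetical name. It confirms the row count and shows that each copied KING reports to LARRY.

```python
# (empno, ename, mgr) rows, assumed from the standard EMP demo table.
EMP = [
    (7369, "SMITH", 7902), (7499, "ALLEN", 7698), (7521, "WARD", 7698),
    (7566, "JONES", 7839), (7654, "MARTIN", 7698), (7698, "BLAKE", 7839),
    (7782, "CLARK", 7839), (7788, "SCOTT", 7566), (7839, "KING", None),
    (7844, "TURNER", 7698), (7876, "ADAMS", 7788), (7900, "JAMES", 7698),
    (7902, "FORD", 7566), (7934, "MILLER", 7782),
]


def build_bigemp(copies=1000):
    """Mirror the CREATE TABLE AS SELECT: LARRY plus `copies` renumbered EMP trees."""
    rows = [(1, "LARRY", None)]
    for dum in range(1, copies + 1):
        for empno, ename, mgr in EMP:
            # COALESCE(dum * 10000 + mgr, 1): a NULL mgr (KING) becomes LARRY's empno.
            new_mgr = dum * 10000 + mgr if mgr is not None else 1
            rows.append((dum * 10000 + empno, f"{ename}#{dum}", new_mgr))
    return rows


bigemp = build_bigemp()
print(len(bigemp))
```

Multiplying the copy number by 10000 keeps every copy's employee numbers in a separate range, since the original numbers all have four digits.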
  • 50. Scalability
    • Scalar subquery with CONNECT BY (first output) is about 33x slower, with 8455x more consistent gets and 9252x more sorts, than the MATCH_RECOGNIZE method (second output)

    Scalar subquery with CONNECT BY:

    14001 rows selected.
    Elapsed: 00:00:11.61

    Statistics
    --------------------------------------------
              0  recursive calls
              0  db block gets
         465005  consistent gets
              0  physical reads
              0  redo size
         435280  bytes sent via SQL*Net to client
          10763  bytes received via SQL*Net from...
            935  SQL*Net roundtrips to/from client
          37008  sorts (memory)
              0  sorts (disk)
          14001  rows processed

    MATCH_RECOGNIZE:

    14001 rows selected.
    Elapsed: 00:00:00.35

    Statistics
    --------------------------------------------
              1  recursive calls
              0  db block gets
             55  consistent gets
              0  physical reads
              0  redo size
         435280  bytes sent via SQL*Net to client
          10763  bytes received via SQL*Net from...
            935  SQL*Net roundtrips to/from client
              4  sorts (memory)
              0  sorts (disk)
          14001  rows processed
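The ratios quoted on this slide follow directly from the two sets of figures. A short Python check of the arithmetic, using only numbers shown in the autotrace output (the dictionary keys are illustrative names):

```python
# Figures copied from the two autotrace outputs on the slide.
connect_by = {"elapsed_s": 11.61, "consistent_gets": 465005, "sorts_memory": 37008}
match_recognize = {"elapsed_s": 0.35, "consistent_gets": 55, "sorts_memory": 4}

# Ratio of the scalar subquery method to the MATCH_RECOGNIZE method per statistic.
ratios = {name: connect_by[name] / match_recognize[name] for name in connect_by}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```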
  • 51. Brief summary
  • 52. MATCH_RECOGNIZE - A “Swiss Army knife” tool
    • Brilliant when applied “BI style” like stock ticker analysis examples
    • But applicable to many other cases too
    • When you have some problem crossing row boundaries and feel you have to “stretch” even the capabilities of analytics, try a pattern based approach:
      • Rephrase (in natural language) your requirements in terms of what classifies the rows you are looking for
      • Turn that into pattern matching syntax classifying individual rows in DEFINE and how the classified rows should appear in PATTERN
    • As with analytics, it might feel daunting at first, but once you start using pattern matching, it will become just another tool in your SQL toolbox
  • 53. Questions & Answers
    • http://kibeha.dk
    • @kibeha
    • This presentation: http://bit.ly/kibeha_patmatch4_pptx
    • Script with all the code: http://bit.ly/kibeha_patmatch4_sql
    • Webinar: http://bit.ly/patternmatch
    • Webinar scripts: http://bit.ly/patternmatchsamples
    • Stew Ashton: https://stewashton.wordpress.com/category/match_recognize/
    • kim.berghansen@trivadis.com