AI-BASED ANALYTICS FOR YOUR
EDISCOVERY NEEDS:
PERCEPTION AND INTELLIGENT SEARCH
ZyLAB POWER DEMO
TODAY’S SPEAKERS
Mary Mack
Executive Director ACEDS
Paul Starrett
Specialist in electronic
evidence and data science in
the legal profession
Johannes Scholtes
CSO at ZyLAB
Professor Text-Mining
University of Maastricht
 Tools from the field of Artificial Intelligence and Data Science
accelerate truth-finding missions in regulatory requests and
internal investigations.
 New AI-based analytics have drastically increased the speed
and improved the quality of the eDiscovery process.
 But what exactly are these new AI techniques and how do they
compare to all the other analytics we have been using for
years?
TODAY’S AGENDA
THE BUZZ
e-Discovery & Artificial Intelligence The new reality
AI becomes good business practice
WHAT ARE WE TALKING ABOUT?
“Analytics” is the discovery,
interpretation, and communication
of meaningful patterns in data.
The terms “analytics” or “analysis”
describe functions ranging from
reporting and review metrics to
sophisticated search and
advanced data, text-mining and
machine learning applications.
Benefits also range across various
dimensions.
Artificial Intelligence (AI) is a
broad, complex field of research.
AI includes tasks such as
reasoning, problem solving,
knowledge representation,
planning, machine learning,
natural language processing,
perception, motion, social
intelligence, and even creativity.
The ultimate goal is the creation
of some form of general
intelligence.
The Usual Suspects:
 Exploding data volumes;
 New types of data (multi-media, social, BYOD);
 Exploding eDiscovery costs;
 New regulations and compliance requirements
 GDPR
 Cyber-security requirements
 More enthusiastic regulators, especially outside of the US.
WHY WE SHOULD CARE
DEALING WITH THE EDISCOVERY DATA WAVE
In eDiscovery, you never know in
advance:
 How much data you will have;
 What type of data it will be and thus
what type of processing is required;
 What workflow and iterations you will
have;
 Automation, AI and Data Science are
very CPU- and memory-intensive.
So you need intelligent load balancing
and resource allocation to
prevent bottlenecks and deal effectively
with the “Data Wave” in eDiscovery.
 Better understand your data: the ability to make better strategic
decisions.
 Early Case Assessment: build and justify eDiscovery budget,
resources and timelines.
 Reduce data volumes: cut through the noise and zero in on
documents of interest.
 Take an investigative approach: organize and prioritize documents.
 Reduce your eDiscovery cost: improve productivity and precision of
your team.
 Better quality: see greater consistency in coding decisions across
similar documents.
 Speed up litigation.
WHY ANALYTICS?
 Humans have cognitive limitations when processing and deriving
insights from large-scale document sets; humans simply cannot
successfully synthesize large volumes of data.
 Technology will help lawyers work more efficiently, effectively, and
enjoyably.
 Grossman & Cormack* : “TAR was not only more effective than
human review at finding relevant documents, but also much cheaper
… Overall, the myth that exhaustive manual review is the most
effective—and therefore the most defensible—approach to
document review is strongly refuted.”
WHY AI-BASED ANALYTICS?
* Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review,” Richmond Journal of Law and Technology, Vol. XVII, Issue 3.
 Structural: aka syntactic analytics
 File-, Document and Forensic Property extraction, Meta-data
filtering, Saved (full-text) Searches, Email Thread detection,
Email Thread reduction, Missing emails in thread, Duplicate- and
Near Duplicate detection, Language identification,
Communication Analysis, Time-line Visualizations, Geo-mapping,
…
 Conceptual: aka semantic or meaning based analytics
 Keyword Expansion (taxonomy), Content Clustering, Content-
based Categorization, Conceptual Search, Sentiment & Emotion
Mining, Semantic Content Analysis, Word-Cloud, Topic Modeling,
…
 Machine Learning: data driven (predictive) analytics
 Technology Assisted Review, Contract clause detection &
classification, Privilege detection, …
WHAT KIND OF ANALYTICS HAVE WE SEEN?
STRUCTURE OF DATA
MEANING OF DATA
LEARN FROM DATA
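One of the structural techniques listed above, near-duplicate detection, is often approximated by comparing overlapping word sequences (shingles) between documents. The following is a minimal sketch of that idea, not ZyLAB's implementation: the function names, the shingle size of 3 words, and the 0.5 similarity threshold are all assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle (or none if the text is empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5, k=3):
    """Return pairs of document ids whose shingle overlap meets the threshold."""
    sets = {doc_id: shingles(text, k) for doc_id, text in docs.items()}
    ids = sorted(sets)
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            if jaccard(sets[left], sets[right]) >= threshold:
                pairs.append((left, right))
    return pairs


# Hypothetical sample documents, for illustration only.
docs = {
    "a": "please find attached the signed contract for review",
    "b": "please find attached the signed contract for your review",
    "c": "lunch on friday at noon",
}
print(near_duplicates(docs))  # [('a', 'b')]
```

Comparing every pair is quadratic in the number of documents, so this form only suits small sets; large collections typically rely on hashing schemes to avoid the full pairwise comparison.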
WHAT IS THE RELATION BETWEEN AI AND ANALYTICS?
eDiscovery needs:
 Perception
 Reading: OCR, handwriting detection, signature
recognition,
 Listening: Audio search
 Vision: Image classification
 Language: Machine Translation
 Intelligent Search
 Machine Learning for search
 Concept Clustering
 Data Visualization
 Text classification and categorization
 Document
 Paragraph (clause)
 Sentence or phrase
AI provides the algorithms and evaluation
methods:
 Machine Learning
 Decision trees
 Support Vector Machines
 Deep Learning (CNN)
 Topic Modeling / Concept Search
 Hierarchical Clustering
 LSI
 LDA
 NMF
 Natural Language Processing (NLP)
 Shallow Parsing
 Deep Parsing
 Co-reference resolution
PERCEPTION: AUDIO SEARCH
ZyLAB: automatic Audio
Search on all detected
(embedded) audio and
video files.
ZyLAB: embedded
machine translation
on every (embedded)
document or
document section.
PERCEPTION: MACHINE TRANSLATION
PERCEPTION: HANDWRITING & SIGNATURE DETECTION (R&D)
PERCEPTION: VISUAL CLASSIFICATION OF IMAGES FOR
EDISCOVERY (R&D)
PERCEPTION: OCR ON BITMAPS
ZyLAB: people often take screenshots or
pictures of such information, just in case
or as a reminder. ZyLAB picks up such
images, applies OCR, and makes them findable.
STRUCTURAL: UNPACK EMBEDDED CONTENT
ZyLAB:
• Every embedded item is extracted and OCR-ed if needed.
• Search & Find
• Show in document family
STRUCTURAL: ONE-ON-ONE COMMUNICATION
STRUCTURAL: MISSING EMAIL IN THREAD
ZyLAB:
 Identify gaps in
collected emails
 Compare gaps among
suspects
 Restore email from
backups
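A simple way to picture how gaps in collected emails can be identified: each reply normally names its parent message in the standard In-Reply-To header, so any parent that is referenced but absent from the collection marks a gap. The sketch below assumes emails have already been parsed into dictionaries with hypothetical "id" and "in_reply_to" keys; it illustrates the idea only and is not ZyLAB's method.

```python
def find_thread_gaps(emails):
    """Map each missing parent id to the collected emails that reply to it."""
    collected = {email["id"] for email in emails}
    missing = {}
    for email in emails:
        parent = email.get("in_reply_to")
        # A parent that is referenced but was never collected is a gap.
        if parent is not None and parent not in collected:
            missing.setdefault(parent, []).append(email["id"])
    return missing


# Hypothetical thread: m2 is referenced by m3 but is not in the collection.
emails = [
    {"id": "m1", "in_reply_to": None},
    {"id": "m3", "in_reply_to": "m2"},
    {"id": "m4", "in_reply_to": "m3"},
]
print(find_thread_gaps(emails))  # {'m2': ['m3']}
```

Running the same check per custodian and comparing the resulting sets of missing ids is one way to compare gaps among suspects.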
CONCEPTUAL: SEMANTICS AND SENTIMENTS
FIND EVEN IF YOU DO NOT KNOW WHAT TO LOOK FOR
Question | Entities or patterns to address this question
Who is it about? | PERSON, COMPANY, ORGANIZATION, EMAIL ADDRESS
What is it about? | Result of Topic Modeling and Concept Clustering
When did it happen? | DATE, TIME, MONTH, DAY, WEEK, YEAR
Where did it happen? | ADDRESS, CITY, COUNTRY, CONTINENT, DEPARTMENT and other geo-locations
Why did it happen? | Sentiments, emotions and cursing
How did it happen? | Combining entities and facts
How much/often did it happen? | Quantitative measures such as amounts, currencies, and other numbers. Also frequency and averages on entity occurrences.
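A few of the entity types above (EMAIL ADDRESS, DATE, amounts and currencies) can be illustrated with plain pattern matching. The sketch below is deliberately simplistic and is an assumption for illustration: the patterns, labels, and sample sentence are invented here, real entity extraction covers far more formats, and person or company names generally need trained models rather than regular expressions.

```python
import re

# Illustrative patterns only; each covers a single common format.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),  # ISO dates such as 2017-03-14
    "AMOUNT": re.compile(r"(?:USD|EUR|\$|€)\s?\d[\d,]*(?:\.\d+)?"),
}


def extract_entities(text):
    """Return every match for each entity label found in the text."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}


# Hypothetical sentence, for illustration only.
text = "On 2017-03-14 j.doe@example.com approved a payment of EUR 12,500.00."
for label, matches in extract_entities(text).items():
    print(label, matches)
```

Counting matches per label across a document set gives the kind of frequency measure mentioned for the "How much/often" question.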
MORE DETAILED INSIGHTS
It is more interesting to combine the W’s. For instance, why
not look for Who is Where, or What happened When?
Who – Who
Who – Why
When – What
The era of traditional keyword and Boolean search
seems to be over. Even the most brilliant query results
in too many hits. Reviewing these takes too much
time and resources.
 People do not know exactly what to look for, what
keywords to use or how to spell them.
 The quality of traditional search is much lower than
the searchers think (80% perceived versus 20-40%
actual quality).
 Only highly skilled searchers who manage all
(advanced) query options are able to get close to
80%. Even then, they cannot be sure that they did in
fact find 80% of all relevant documents. This is
another problem with measuring recall: you never know
what you miss.
MACHINE LEARNING: THE NEW SEARCH
ACEDS - ZyLAB webinar - AI Based eDiscovery Analytics
 Document Classification (TAR)
 Find responsive documents
 Boost recall
 Measure recall
 Paragraph Classification
 Privileged review
 Document clause classification
 Contract clause classification
 GDPR – Privacy detection – Redaction – Pseudonymization
DIFFERENT USE CASES OF MACHINE LEARNING
 Have we found all relevant
information? How complete
is the data we sent to the
regulator? Machine
learning!
 During this process, several
quantitative measures can
be calculated such as
precision, recall, F-values
and precision of the return
set. Based on these
measurements, one can
describe exactly how much
of the relevant information
has been found at which
moment in the process.
HOW CAN WE MEASURE RECALL
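The measures named above follow from three counts: documents that were retrieved and relevant, all retrieved documents, and all relevant documents. The sketch below computes precision, recall, and the F1 value (the balanced F-value) for a toy example. The function name and the document ids are assumptions, and in a real matter the full set of relevant documents is unknown, so these figures are usually estimated from a reviewed random sample.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute precision, recall and F1 from two collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of retrieved documents that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0
    total = precision + recall
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 10 relevant documents, 5 retrieved, 4 of them relevant.
relevant = range(1, 11)
retrieved = [1, 2, 3, 4, 11]
precision, recall, f1 = precision_recall_f1(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.80 recall=0.40 f1=0.53
```

In this example the result set looks clean (80% precision) while most relevant documents are still missing (40% recall), which is why precision alone says little about completeness.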
[Bar chart: review time in hours (axis 0 to 1600), ZyLAB Assisted Review versus Manual Review]
MACHINE LEARNING
 15-20 faster than manual review
 10-20% more accurate, fully defensible
 Privileged
information:
automatically identify
communications with
our lawyers.
 PII, PHI, and GDPR:
redaction and
pseudonymization
CLAUSE DETECTION
Detailed reporting
on content of
contracts, Reporting
on extraction of key
information, Higher
precision search
 ZyLAB’s Direct Collection saves a tremendous amount of time in getting data ready for early
case assessment and (first) pass review. Direct Collection drastically reduces the cost
and risks of downloading / uploading data or shipping tapes and hard disks around.
 ZyLAB’s Deep Processing allows you to automatically reduce your data volumes before
you send them on for review, without getting in trouble or being accused of data
spoliation. Only when every component of the data is searchable can automated
tools be used to reduce it.
 Using ZyLAB’s Review Accelerators you can minimize the most expensive and time
consuming part of the eDiscovery process. TAR, batch tagging, sampling, redaction,
email trails, …
 Litigants use ZyLAB’s Early Case Assessment to quickly understand the facts and
merits of a case, identify key custodians and recognize critical information so they can
develop an effective and realistic litigation strategy.
BENEFITS TO IN-HOUSE COUNSEL
THE ZYLAB BENEFIT TO LAW FIRMS
 ZyLAB covers multiple eDiscovery use
cases. One platform: More cases, more
volume, better pricing.
 No need to involve any 3rd parties.
 Bill the hours for project management and
data science (machine learning) as well.
 DIY: upload data and almost immediately
start reviewing with your team and bill the
hours.
 Find out what really happened with
ZyLAB’s deep search and analytics.
Expand review team.
 Replace the bottom of the traditional
earnings pyramid with “review robots”:
make more margin.
 Be more competitive.
 Do more work with your current team:
never have to pass on new opportunities
because of capacity problems.
 Less risk of errors and of missing key
issues, and therefore less risk of liability claims and
higher insurance premiums.
“ZYLAB TAKES CARE OF THE PROCESS, SUPPORTS THE LAWYER BY
THINKING COMMERCIALLY AND PROVIDES COMFORT WITH THE
USE OF ADVANCED TECHNOLOGY”
Ruben Elkerbout, anti-trust lawyer and partner with Stek Lawyers
MORE READING – WWW.ZYLAB.COM/RESOURCES/EBOOKS/
Q&A
MORE INFORMATION: WWW.ZYLAB.COM
More ZyLAB Webinars and events:
https://zylab.com/company/event-calendar/

