#evolve19
GOING BEYOND METADATA:
EXTRACTING MEANINGFUL
INFORMATION FROM YOUR
DIGITAL ASSETS
PAUL LEGAN
August 7th, 2019
#evolve19 2
DIGITAL ASSET MANAGEMENT
REALLY, IT MAKES THIS PROCESS EASIER.
Discovery: Find an existing asset or set of asset artifacts
Creation: Alter an existing or create a new creative asset
Automation: Generate variations for different audiences
Publication: Publish this asset for an appropriate duration
#evolve19 3
• Supports workflows that allow for content modification
• Reduces costs of asset creation and distribution
• Automates tedious tasks like thumbnail generation
• Increases marketing throughput for content variations and personalization
• Increases creative autonomy
DIGITAL ASSET MANAGEMENT
LET’S START WITH THE BENEFITS
#evolve19 4
IF IT’S SO GREAT, WHY ISN’T IT EASY?
WE CAN ALL PROBABLY NAME A FEW REASONS.
#evolve19 5
“Let’s all use in-progress folders.”
ISSUE #1: ORGANIZATION
NAMING CONVENTIONS AND FOLDER STRUCTURE
→
“We can delete this later.”
#evolve19 6
ISSUE #2: INCONSISTENCY
TRAINING + USAGE GUIDELINES
No validation
Poor Naming
Conventions
Number Duplication
Unused Fields
#evolve19 7
ISSUE #3: MYOPIA
THINK BEYOND THE CURRENT USE CASE
Tag Redundancy
Folder Mismatches
No Scheduled Cleanup
#evolve19 8
MULTI-TOOL OF CHOICE: METADATA
WE CAN ALL PROBABLY NAME A FEW REASONS.
#evolve19 9
THE GENRE PROBLEM
ID3, WINAMP, AND ITUNES – UNITE!
(for all of you who totally legally purchased music 20 years ago)
#evolve19 10
THE HUMBLE SCHEMA
YOUR ASSET DATA LAYER
#evolve19 11
INGESTION PROCESS
ASSET PROCESSING AT SCALE
Define a Schema
(Superset of Properties)
Define Ingestion Process
(IPTC, XMP, Validation)
Import Assets
(Auto-Tag, Pre-Fill)
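
A minimal sketch of the "define a schema, then validate on ingestion" steps above. The field names, the allowed values, and the `validate` helper are all hypothetical and only illustrate the idea; they are not an AEM API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    """One property in the (hypothetical) asset metadata schema."""
    name: str
    required: bool = False
    allowed: Optional[frozenset] = None  # None means any value is accepted


# Assumed schema: a superset of the properties any asset type may carry.
SCHEMA = [
    Field("dc:title", required=True),
    Field("photographer"),
    Field("usageRights", required=True,
          allowed=frozenset({"internal", "licensed", "public"})),
]


def validate(metadata: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    known = {f.name for f in SCHEMA}
    for field in SCHEMA:
        value = metadata.get(field.name)
        if value in (None, ""):
            if field.required:
                problems.append(f"missing required field: {field.name}")
            continue
        if field.allowed is not None and value not in field.allowed:
            problems.append(f"invalid value for {field.name}: {value!r}")
    # Flag properties that the schema does not know about.
    for name in metadata:
        if name not in known:
            problems.append(f"field not in schema: {name}")
    return problems
```

Running a check like this at import time is one way to catch the "no validation" and "unused fields" issues before assets land in the repository.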
#evolve19 12
INGESTION PROCESS
ASSET PROCESSING AT SCALE
Define a Schema
(Superset of Properties)
Define Ingestion Process
(IPTC, XMP, Validation)
Import Assets
(Auto-Tag, Pre-Fill)
Metadata Profiles
(Sensible Defaults)
Smart Organization
(Sort, Filter, Variants)
Smart Tags
(Auto-Tag, Pre-Fill)
#evolve19 14
• Level #1 Automation
• Helps alleviate tedious work
• Applying global tags
• Complementing IPTC/XMP data embedded in the binaries
• Photoshoot Location
• Photographer
• Type of Asset
• Digital Rights Management
• Easy to apply at the folder or file type level
METADATA PROFILES
SENSIBLE METADATA DEFAULTS
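
A rough sketch of how folder-level defaults can complement embedded metadata. It assumes that values already embedded in the file take precedence and the profile only fills gaps; that precedence, the `apply_profile` helper, and the sample profile values are illustrative assumptions, not AEM's documented behavior.

```python
def apply_profile(embedded: dict, profile: dict) -> dict:
    """Fill gaps in embedded metadata with folder-level defaults.

    Non-empty values already embedded in the file (for example from
    IPTC/XMP) win; the profile only supplies what is missing or empty.
    """
    merged = dict(profile)
    merged.update({k: v for k, v in embedded.items() if v not in (None, "")})
    return merged


# Hypothetical profile for a folder of photoshoot images.
PHOTOSHOOT_PROFILE = {
    "location": "Studio A",
    "photographer": "Staff",
    "assetType": "photo",
    "usageRights": "internal",
}
```

For example, `apply_profile({"photographer": "J. Doe", "location": ""}, PHOTOSHOOT_PROFILE)` keeps the embedded photographer and takes the location, asset type, and usage rights from the profile.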
#evolve19 15
SMART TAGS
ADOBE I/O SMART CONTENT SERVICE
Can be trained and
training can be run on a
schedule
Auto-tag based on
object recognition
#evolve19 16
SO… HOW CAN WE GO FURTHER?
LET’S SAY YOU WANT MORE AUTOMATION.
#evolve19 17
AMAZON TEXTRACT
AN INTRODUCTION
Uses Optical Character Recognition (OCR) to automatically detect printed text and numbers in a scan or rendering of a document.
Enables you to detect key-value pairs in documents to retain the inherent context of the document without any manual intervention.
Returns a confidence score for everything it identifies so you can make informed decisions about how you want to use the results.
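
One way to act on the confidence score, sketched under assumptions: each block is a dictionary with `BlockType`, `Text`, and a `Confidence` value between 0 and 100, as in Textract responses. The 90.0 threshold and the `split_by_confidence` helper are arbitrary illustrations.

```python
def split_by_confidence(blocks, threshold=90.0):
    """Partition LINE blocks into (accepted, needs_review) lists of text."""
    accepted, needs_review = [], []
    for block in blocks:
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks in this sketch
        if block.get("Confidence", 0.0) >= threshold:
            accepted.append(block.get("Text", ""))
        else:
            needs_review.append(block.get("Text", ""))
    return accepted, needs_review
```

Text below the threshold could be routed to a person for review rather than written straight into asset metadata.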
#evolve19 18
LOOKING INSIDE WITH OCR
JUDGE ASSETS BY MORE THAN THEIR COVER
#evolve19 21
STRUCTURED DATA
EMBEDDED DOCUMENT INFORMATION
driver-data.pdf
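
A sketch of turning form data into key-value pairs. It assumes a block list shaped like the response of Textract's `AnalyzeDocument` with the `FORMS` feature, where `KEY_VALUE_SET` blocks link to their value and to their words through `Relationships`. The helper names are hypothetical, and the sample in the usage note is hand-written, not real service output.

```python
def _child_text(block, by_id):
    """Join the text of the WORD blocks that a block points to as children."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_key_values(blocks):
    """Map each form label to the text of its linked value block(s)."""
    by_id = {b["Id"]: b for b in blocks}
    pairs = {}
    for block in blocks:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        value_parts = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    value_parts.append(_child_text(by_id[value_id], by_id))
        pairs[_child_text(block, by_id)] = " ".join(p for p in value_parts if p)
    return pairs
```

Given a key block labelled "Name:" whose value block points to the words "Jane" and "Doe", the function returns a dictionary mapping "Name:" to "Jane Doe".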
#evolve19 22
HOW IT WORKS
TECHNICAL PROCESS
Image Uploaded via API
(S3 or Base64 Bytes)
Service Analyzes Input
(Sync or Async)
ML Response Sent
(JSON Payload)
{
  "Document": {
    "Bytes": blob,
    "S3Object": {
      "Bucket": "string",
      "Name": "string",
      "Version": "string"
    }
  }
}
// SYNC
DetectDocumentText()
AnalyzeDocument()
// ASYNC
StartDocumentTextDetection()
GetDocumentTextDetection()
[Blocks]
[Geometry]
[Bounding Box]
[Confidence]
[Text]
[Block Type]
[ID]
[/Blocks]
→ →
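
The request and response shapes above can be exercised with a short sketch. It assumes the `boto3` SDK and configured AWS credentials for a real call; `build_document` and `detect_lines` are hypothetical helper names. The client can be injected, so the parsing logic can be tried without calling AWS.

```python
def build_document(data=None, bucket=None, name=None):
    """Build the Document argument from raw bytes or an S3 reference."""
    if (data is None) == (bucket is None):
        raise ValueError("pass exactly one of data or bucket/name")
    if data is not None:
        return {"Bytes": data}
    if not name:
        raise ValueError("name is required together with bucket")
    return {"S3Object": {"Bucket": bucket, "Name": name}}


def detect_lines(document, client=None):
    """Run synchronous text detection and return the text of LINE blocks."""
    if client is None:
        import boto3  # imported lazily so the helpers work without the SDK
        client = boto3.client("textract")
    response = client.detect_document_text(Document=document)
    return [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]
```

The asynchronous pair on the slide works differently: `StartDocumentTextDetection` takes an S3 location and returns a job ID, and `GetDocumentTextDetection` is then called with that ID to fetch the results.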
#evolve19 23
HOW IT FITS IN AEM
TECHNICAL PROCESS
Image Uploaded via API
(S3 or Base64 Bytes)
Service Analyzes Input
(Sync or Async)
ML Response Sent
(JSON Payload)
→ →
XMP Binary Writeback
(If applicable)
Property Validation
(Notification, Banner)
Properties Saved to JCR
(JSON Payload)
→ →
→
AEM Workflow
#evolve19 24
AEM Workflow
HOW IT FITS IN AEM
TECHNICAL PROCESS
Image Uploaded via API
(S3 or Base64 Bytes)
Service Analyzes Input
(Sync or Async)
ML Response Sent
(JSON Payload)
→ →
XMP Binary Writeback
(If applicable)
Property Validation
(Notification, Banner)
Properties Saved to JCR
(JSON Payload)
→ →
→
3rd-Party DB
(Search)
Amazon Comprehend
(NLP)
Amazon Translate
(Translation)
→ →
→
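
A sketch of the "property validation" and "properties saved to JCR" steps, written in Python for illustration only. It assumes extracted pairs arrive as a label mapped to a (value, confidence) tuple; the `ocr_` prefix, the 90.0 threshold, and both helper names are hypothetical.

```python
import re


def to_property_name(label, prefix="ocr_"):
    """Turn a form label such as 'License No:' into 'ocr_license_no'."""
    slug = re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")
    return prefix + slug


def to_jcr_properties(pairs, min_confidence=90.0):
    """Split extracted pairs into properties to save and labels to review.

    `pairs` maps a form label to a (value, confidence) tuple.
    """
    properties, review = {}, []
    for label, (value, confidence) in pairs.items():
        if not value or confidence < min_confidence:
            review.append(label)  # empty or low-confidence: ask a person
            continue
        properties[to_property_name(label)] = value
    return properties, review
```

In AEM, a workflow process step would typically write the resulting properties to the asset's `jcr:content/metadata` node and surface the review list as a notification; that step is Java and is not shown here.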
#evolve19 25
DEMO!
#evolve19 26
HOW DO THESE TOOLS HELP?
MORE THAN YOU THINK.
#evolve19 27
BENEFITS & IMPACT
HIGHLIGHTS
-75% Less Effort By Humans Per Ingested Asset
Reduces Margin of Error: Tedious Data Entry Increases the Risk of Human Error
-60% Reduction in Calls to IT to Deliver Assets
Better Discovery: Reduces the Time to Find Assets and Lessens the Dependency on IT
+80% User Adoption YoY Across Departments
Enterprise Scale: A Scalable System is a Usable System as Adoption Increases
#evolve19 28
FUTURE POSSIBILITIES
JUST THINKING OUT LOUD
Process Invoices
& Sales Receipts
Normalize Financial
Document Data
Automatically Redact
PII from a Claim
#evolve19 29
Links to Relevant Resources:
- https://aws.amazon.com/textract/
- https://github.com/aws-samples/amazon-textract-code-samples/
- https://github.com/aws-samples/amazon-textract-serverless-large-scale-document-processing
MORE INFORMATION
GETTING STARTED & BEYOND
#evolve19
THANK YOU!

Evolve 19 | Paul Legan | Going Beyond Metadata: Extracting Meaningful Information from Digital Assets Automatically in AEM
