Serving Information Needs
of Knowledge Workers
Debdoot Mukherjee, IBM Research-India
A knowledge worker is one who develops or applies knowledge in
the workplace -- Peter Drucker
Whom can I reach
out to for help?
How did we
handle such a
case before?
What best
practices
apply?
Information
Needs of a
Knowledge
Worker
Engineering Design
Customer Support
Sales / Pre-Sales
R&D
Spend 15% - 35% of their time
searching.
Successful only 50% of the time
Source: IDC
Potential Productivity Gain
20 – 25%
Source: McKinsey
Huge Cost of NOT
finding the RIGHT
information at
the RIGHT time
Sample Case Information created by Sales Teams
Web Portals,
Wikis, Forums
100s of structured fields
in Notes databases
Dense Documents
in Team Rooms
Going beyond keyword
search. Users expect deeper
insights or analyses.
Complicated Access Control
Handling a mix of structured
and unstructured data
Understanding results from
past cases is difficult
Challenges
Deal with each case domain
separately
Enumerate information
needs in that domain
Extract information entities
and drive semantic search
Leverage context of the case
being worked upon
Opportunities
Information
Retrieval
Information
Interaction
Information
Extraction
Understanding the case
domain – artifacts created,
different roles and their
information needs
Information
Retrieval
Information
Interaction
Information
Extraction
Knowledge workers need to
read multiple dense
documents and distill insights
thereof to arrive at a decision
Aggregate insights are often
more important than knowing
about a particular case
Need to facilitate information
exploration - guess what
one would want to know next
360-degree views of
information entities
pureflex
IBM PureFlex™ System is an infrastructure system with expertise for sensing and anticipating resource needs to optimize your infrastructure.
Key attributes include:
Factory integrated and optimized system infrastructure Management integration across physical and virtual resources
Automation and optimization expertise
Built for cloud, as a foundation for Infrastructure as a Service offering
Definition from SO Village Offering page Related Links: IBM PureFlex System (ibm.com)
Switch to Search
Customers
ACME Inc.
2 Opportunities (1 Win)
Processes / Lessons
PureFlex and PureAS
Solution Process
Source: CES Handbook
XYZ Ltd
2 Opportunities (1 Win)
ABCD Inc.
5 Opportunities
Apply Filters
Version 1 - PureFlex
Solution Checklist
Source: CES Handbook
PureSystems
Source: SO Village
Value Proposition
Apply Filters
People
Apply Filters
Add an Explorer View
Select Explorer Views
Add
Leave A Comment
Each exploration view
targeted toward a
unique information need
Decide which views
to show based on
query and role
Client: ACME Inc
Client Background
Sector : Distribution
Industry : Travel & Transportation
Contacts : John Will (CSE), Ray Harris (TSM)
Past Opportunities (Win, Loss, Unknown) :
2Y-337WW2 1Y-43HYFD 12KZ-52XZQQU 4D-DFREE
See 5 results from IBM Connections
Similar Clients: World Tour-Co Cosmos, Globus
Comments (1)
It seems like ACME Inc. is an early adopter of Pureflex. Is it willing
to act as a reference? @John Will: Any idea?
4/10/2013
Entity Profiles aggregated from information in
multiple documents across multiple sources
Add Comment
Recommendation View: Change
Client
Solution
Competition
Scope
Win Themes
Value Proposition
Delivery Model
Offerings & Asset
Architecture
Financials
RAID
Engagement
HR Solution
Transition & Trans
Section Selector Section View
Current Topics
Upload Documents
Standardisation
Recommendations
Apply Filters
Recommended Topics
Faster Provisioning
Improve Speed To Market
Reduce IT Operating Costs
Pay per use
Scalability
Flexibility
Standardization of Images
Enable collaboration with partners
ACME Inc
12Y-6774Y
Contact:
Steve Toll
Govt of XYZ
12Y-4YFFFT
Contact:
Mike Chang
Case Field being worked upon
(say, Value Proposition)
Recommended Value Props.
from similar past cases
Visually analyze
relevance of topics
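A minimal sketch of how such recommendations could be retrieved, assuming past cases are matched against the current case on the fields that have already been authored. The field names, the `id` key, the token-overlap similarity, and the averaging of per-field scores are all illustrative assumptions, not the system's actual implementation.

```python
def tokens(text):
    """Lower-cased word set of a field value."""
    return set(text.lower().split())


def field_similarity(a, b):
    """Jaccard overlap between two field values (an assumed similarity measure)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def recommend(current, past_cases, target_field, top_k=2):
    """Return (case id, target field value) for the past cases most similar
    to the current case on its already-authored fields."""
    authored = [f for f, v in current.items() if v and f != target_field]
    scored = []
    for case in past_cases:
        if not case.get(target_field):
            continue  # nothing to recommend from this case
        sims = [field_similarity(current[f], case.get(f, "")) for f in authored]
        score = sum(sims) / len(sims) if sims else 0.0
        scored.append((score, case))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(case["id"], case[target_field]) for _, case in scored[:top_k]]


# Hypothetical data for illustration only.
current_case = {
    "client": "travel transportation distributor",
    "scope": "cloud infrastructure provisioning",
    "value_proposition": "",
}
past = [
    {"id": "A", "client": "travel transportation company",
     "scope": "cloud infrastructure", "value_proposition": "Faster provisioning"},
    {"id": "B", "client": "government agency",
     "scope": "hr solution", "value_proposition": "Pay per use"},
]
recommendations = recommend(current_case, past, "value_proposition")
```

Averaging per-field similarities treats every authored field as equally informative; weighting the fields differently is a natural refinement.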
Software
Maintenance
Software
Development
Information
Retrieval
Information
Interaction
Information
Extraction
Summarize
Topic Modeling
MMR-based 2-3 line
segment summary
Segment & Annotate
Dictionary- and regex-based
Leverage Formatting
- Paragraphs, Tables
Crawl & Parse
Multiple platforms,
technologies
Parse formatting, not
just text. Export
thumbnails
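The MMR-based segment summary can be sketched as follows. This assumes MMR means maximal marginal relevance, with sentences represented as bag-of-words vectors compared by cosine similarity; the trade-off parameter `lam`, the tokenizer, and the use of the whole segment as the relevance target are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words vector of a text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mmr_summary(sentences, k=2, lam=0.7):
    """Pick k sentences that are relevant to the segment yet not redundant
    with each other, then return them in their original order."""
    segment = bow(" ".join(sentences))
    vectors = [bow(s) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * cosine(vectors[i], segment) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]


# Hypothetical segment for illustration only.
segment_sentences = [
    "PureFlex reduces provisioning time.",
    "PureFlex reduces provisioning time.",
    "Pay per use lowers operating costs.",
]
summary = mmr_summary(segment_sentences, k=2)
```

The redundancy term is what keeps the repeated sentence out of the summary even though it is as relevant as the first one.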
Common Ambiguities
Diagram Parsing
• Parse information about
diagram shapes
• Attributes such as
coordinates,
dimensions, text,
geometry
Structure Inference
• Precisely determine the
underlying flow graph
• Deal with structural
ambiguities
Semantic Interpretation
• Classify the semantics of
every node or edge based
on its structural, textual or
geometric features
• Unsupervised training of
such a classifier performs as
well as supervised training
Extracting Formal Models From Informal Diagrams
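To make the structure inference step concrete, here is a small sketch that recovers flow edges from parsed shapes and connector lines. It is not the authors' algorithm: the `Shape` record, the bounding-box distance, and the `tolerance` threshold for connectors that stop short of a shape are all assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Shape:
    name: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def distance_to_shape(px, py, shape):
    """Distance from a point to the shape's bounding box (0 if inside)."""
    dx = max(shape.x - px, 0.0, px - (shape.x + shape.w))
    dy = max(shape.y - py, 0.0, py - (shape.y + shape.h))
    return math.hypot(dx, dy)


def infer_edges(shapes, connectors, tolerance=10.0):
    """Attach each connector endpoint to the nearest shape within `tolerance`.
    connectors: list of ((x1, y1), (x2, y2)), ordered tail to head."""
    def nearest(point):
        best = min(shapes, key=lambda s: distance_to_shape(point[0], point[1], s))
        close = distance_to_shape(point[0], point[1], best) <= tolerance
        return best if close else None

    edges = []
    for tail, head in connectors:
        src, dst = nearest(tail), nearest(head)
        if src is not None and dst is not None and src is not dst:
            edges.append((src.name, dst.name))
    return edges


# Hypothetical diagram: the first connector stops 5 units short of each box.
boxes = [Shape("A", 0, 0, 100, 50), Shape("B", 200, 0, 100, 50)]
lines = [((105, 25), (195, 25)), ((105, 25), (150, 300))]
flow = infer_edges(boxes, lines)
```

Connectors whose endpoint lies outside the tolerance of every shape are dropped here; a fuller treatment would have to resolve such dangling connectors rather than discard them.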
Information
Retrieval
Information
Interaction
Information
Extraction
Create ER
network
Compute pairwise
Personalized PageRank
Leverage case context to
supplement user queries
Case
People
Technology
Risk
Customer
Create ER network
Pairwise
Personalized
PageRank
How do we
set the edge
weights?
Equally?
Suppose we want
recommendations for
field Xk for this case
Let’s initiate a random
walk here and try to
hit nodes of Xk
Which case fields provide
meaningful context for Xk?
Across cases, if similarity in Xi leads
to similarity in Xk, then Xi should be
used as context for generating
recommendations for Xk
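The random walk described above can be sketched as a Personalized PageRank computed by power iteration. This is a minimal illustration, assuming a directed graph stored as nested dictionaries, non-negative edge weights, a restart probability `alpha`, and dangling nodes that return the walker to the source; none of these details come from the slides.

```python
def personalized_pagerank(edges, source, alpha=0.15, iterations=100):
    """Scores of all nodes for a random walk that restarts at `source`.
    edges: {node: {neighbor: weight}}, weights assumed non-negative."""
    nodes = set(edges)
    for neighbors in edges.values():
        nodes.update(neighbors)
    rank = {n: 0.0 for n in nodes}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            out = edges.get(u, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: send the walker back to the source.
                nxt[source] += (1 - alpha) * rank[u]
                continue
            for v, w in out.items():
                nxt[v] += (1 - alpha) * rank[u] * w / total
        nxt[source] += alpha  # restart mass
        rank = nxt
    return rank


# Hypothetical ER network around one case, with equal edge weights.
graph = {
    "case": {"customer": 1.0, "technology": 1.0},
    "customer": {"risk_a": 1.0},
    "technology": {"risk_b": 1.0},
}
equal = personalized_pagerank(graph, "case")

# Trusting the customer field more shifts score toward what it links to.
graph["case"]["customer"] = 3.0
weighted = personalized_pagerank(graph, "case")
```

With equal weights the two risk nodes score the same; raising one edge weight is enough to separate them, which is why the choice of weights matters.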
Correspondence Analysis
1. Select a pair of cases, Ri = (Ci1, Ci2), from the
case repository.
2. For each case field Xk (k = 1, 2, …, n), compute the
similarity of the contents of Xk in Ci1 and Ci2; this gives Sik.
3. Repeat steps 1 and 2 for all pairs of cases in the
repository to populate the matrix S.
4. To compute Corr(Xi, Xk) for all i = 1, 2, …, n,
regress column k on the other columns of S.
5. The coefficient that the fitted linear
regression model assigns to column i
gives Corr(Xi, Xk).
Each row of S corresponds to a pair of cases R1, …, Rm and each column to a case field X1, …, Xn:
\[
S =
\begin{pmatrix}
S_{11} & S_{12} & \cdots & S_{1k} & \cdots & S_{1n} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
S_{m1} & S_{m2} & \cdots & S_{mk} & \cdots & S_{mn}
\end{pmatrix},
\qquad
S_{ik} = \mathrm{Similarity}(C_{i1}.X_k,\; C_{i2}.X_k)
\]
Correspondence, Corr(Xi, Xk), is calculated as the degree to which similarity in field Xi
corresponds to similarity in field Xk across pairs of cases.
An incoming edge to a node of type Xi is
weighted by Corr(Xi , Xk) whenever a
node of type Xk is a target for any
Personalized Page Rank calculation.
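The correspondence analysis steps can be sketched as below. This is an illustration under stated assumptions: cases are dictionaries mapping field names to token collections, field similarity is Jaccard overlap, and the regression is ordinary least squares with an intercept, solved through the normal equations in plain Python. The slides do not specify the similarity function or the regression variant.

```python
from itertools import combinations


def jaccard(a, b):
    """Assumed field similarity: overlap of two token collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def similarity_matrix(cases, fields, sim=jaccard):
    """One row per pair of cases, one column per field (the matrix S)."""
    return [[sim(c1[f], c2[f]) for f in fields]
            for c1, c2 in combinations(cases, 2)]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few pairs or collinear fields")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def correspondence(S, fields, target):
    """Regress the target column of S on the other columns.
    Returns {field Xi: coefficient}, read here as Corr(Xi, Xk)."""
    k = fields.index(target)
    others = [i for i in range(len(fields)) if i != k]
    X = [[1.0] + [row[i] for i in others] for row in S]  # 1.0 = intercept
    y = [row[k] for row in S]
    p = len(others) + 1
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return {fields[i]: beta[j + 1] for j, i in enumerate(others)}


# Hand-made S where the third column is exactly 0.5*X1 + 0.2*X2 + 0.1.
field_names = ["X1", "X2", "Xk"]
S = [[0.0, 0.0, 0.1], [1.0, 0.0, 0.6], [0.0, 1.0, 0.3],
     [1.0, 1.0, 0.8], [0.5, 0.2, 0.39]]
corr = correspondence(S, field_names, "Xk")
```

Regression coefficients can be negative, whereas random-walk edge weights cannot, so some clipping or rescaling would be needed before using them as weights; how that is done is not stated in the slides.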
SIM Architecture
[Architecture diagram. Component labels: Semi-structured Contents, Rich texts, Thumbnails, Crawlers, Parsers, Annotators, Document Processing & Analytics Pipeline, Document DB, Graph Builder, Graph DB, Entity Profile Creator, Index Creator, Search Indices (Primary, Secondary), Personalized Page Rank Calc., PPR DB, Query Generator, Search & Rank, Web UI / API, Front-end App, Project Context, Provisioning, Access Level Provisioning, Access Control, Usage Tracking.]
Related Publications
• Rohan Padhye, Debdoot Mukherjee, Vibha Sinha, API as a Social Glue, ICSE 2014 (In
Submission)
• Mu Qiao, Debdoot Mukherjee et al., Unleashing The Power of Expert Knowledge for IT
Services Sales with Graph Search, INFORMS 2013
• Debdoot Mukherjee, Jeanette Blomberg, Rama Akkiraju, Dinesh Raghu, Monika Gupta,
Sugata Ghosal, Mu Qiao, Taiga Nakamura: A Case Based Approach to Serve Information
Needs in Knowledge Intensive Processes. ICSOC 2013: 541-549
• Richard Goodwin, SweeFen Goh, Pietro Mazzoleni, Vibha Sinha, Debdoot Mukherjee, Senthil
Mani: Effective Content Reuse for Business Consulting Practices. SRII Global Conference
2012: 682-690
• Monika Gupta, Debdoot Mukherjee, Senthil Mani, Vibha Singhal Sinha, Saurabh Sinha:
Serving Information Needs in Business Process Consulting. BPM 2011: 231-247
• Debdoot Mukherjee, Senthil Mani, Vibha Singhal Sinha, Rema Ananthanarayanan, Biplav
Srivastava, Pankaj Dhoolia, Prahlad Chowdhury: AHA: Asset Harvester Assistant. IEEE SCC
2010: 425-432
• Debdoot Mukherjee, Pankaj Dhoolia, Saurabh Sinha, Aubrey J. Rembert, Mangala Gowri
Nanda: From Informal Process Diagrams to Formal Process Models. BPM 2010: 145-161
• Biplav Srivastava, Debdoot Mukherjee, Rema Ananthanarayanan, Vibha Sinha: From Model
Extraction to Model-based Reuse of Enterprise Documents. COMAD 2010: 171
• Pietro Mazzoleni, Debdoot Mukherjee, et al.: Consultant assistant: a tool for collaborative
requirements gathering and business process documentation. OOPSLA Companion 2009
Related Patents
• (Granted) US 8176412 – Generating Formatted Documents
• (Granted) US 8234570 – Harvesting Assets for packaged software application configuration
• (Granted) US 8356045 – Method to Identify Common Structures in Formatted Text Documents
• (Granted) US 8578346 – System and method to validate and repair process flow drawings
• (Granted) US 8589877 – Modeling and Linking Documents for Packaged Software Application Configuration
• US 2011/0106801 A1 – Systems and Methods for Organizing Documented Processes
• US 2011/0167070 A1 – Reusing assets for packaged s/w application configuration
• US 2011/0313932 A1 – Model Based Project Network
• US 2012/0062574 A1 – Automated Recognition of Process Modeling Semantics in Flow Diagrams
• US 2012/0078969 A1 – System and Method to extract models from semi-structured documents
• US 2013/0144872 – System and method to provision semantic and contextual search over knowledge repositories
• Knowledge Management for Solution Design during Sales and Pre-Sales
• Document Editors to Assimilate Documents Returned by a Search Engine
• System and method for socially enabled business risk management
• System and method for managing and using social search lists in a search engine
Editor's Notes

  • #3: The term “knowledge worker” is increasingly heard at research forums on services and BPM. It was coined by Peter Drucker to identify all those who apply knowledge at the workplace. So most workers in the professional services industry can be regarded as knowledge workers; some have stronger and more complex information needs than others. This research investigates the information needs of knowledge workers and how best to serve them. Some stereotypical questions come up when a knowledge worker is assigned a new case: How did we handle such a case before, and are there lessons learned from those past engagements? Whom can I reach out to for help? What best practices can be applied?
  • #8: 360-degree view; ranked list of entities, not just documents; understand context; explore the next thing you need to know or want to know (push based); compare-type analytics
  • #11: This is a snapshot of the Solution Information Management interface developed for the sales practice at a large IT Services company. For each of the case fields in the schema, one can contrast what has already been authored with recommendations from past cases. Here, we show the contents of the “Value proposition” field of the current case in the middle pane. Value propositions from past cases that are similar to the current case are shown as recommendations in the right-hand pane. Now, how do we retrieve similar past cases? We match the contents of already authored fields in the current case with the respective fields in past cases to retrieve the top matches.
  • #19: SIM is a domain-specific enterprise information interaction system targeted at serving the information needs of the solutioning community.
    Key features:
      • Fast, precise search of multiple trusted repositories
      • Recommendations on information entities (experts, products, different solution design elements) related to the search
      • Easy navigation to relevant regions in the result document
    Brief notes on the different modules in the SIM architecture:
      • Document DB: Storehouse of all text documents, metadata, pointers to files, etc. Different modules in the document processing pipeline and the index preparation stage read and write content in the Document DB. Currently, MongoDB (http://www.mongodb.org/) is being considered as the Document DB implementation for SIM.
      • Crawlers: Packaged crawler implementations for many DB technologies are available in ICA. A crawler may need to be custom written if no packaged implementation suits a repository. Crawlers must be scheduled to run incremental crawls at regular intervals. All crawled files are dumped in the file system where the document processing pipeline resides. A post-processor to a crawler records crawled entries in the Document DB. Available metadata should be exported as XML; the post-processor maps the XML data to Document DB fields and loads it.
      • Parsers: Extract formatted text and export thumbnails to enable quick browsing. Currently, some parsers have a dependency on MS Office.
      • Annotators: An array of annotators extracts information of different types and stores it in the Document DB. Examples of information: deal information (client, TCV, country, geo, people who worked in different roles), solution information (win themes, value proposition, scope, risks, assumptions), products, people, competition.
      • All document processing pipeline modules need to be scheduled in a workflow that runs periodically. Quartz (http://www.quartz-scheduler.org/) is a framework for scheduling Java jobs.
      • Graph Builder: Creates a large graph with nodes representing different information entities and edges representing their inter-relationships. This helps us derive insights from the associations captured from multiple documents across multiple sources.
      • Graph DB: The store for the graph described above. Candidate implementations: Neo4j, Titan.
      • Index Creator: Polls the Document DB to aggregate documents (for indexing) out of the information created in the DB by the different document processing modules.
      • Profile Creator: Creates summarized information profiles on key entities such as products, companies, and delivery capabilities. Profiles are also indexed.
      • Personalized Page Rank Calculator: Computes pairwise topic-specific page ranks between different nodes in the graph. This is used to drive proximity search in SIM.
      • Search Index: Developed on top of Apache Solr.
      • Query Generator: Augments the user query with project context (if available).
      • Search & Rank: Executes the query on the Solr index and leverages the PPRanks to update the results. Returns results of different kinds: documents, people, profiles. Also supports faceted search.
      • Web UI: Interactive visualization to support guided information exploration.
      • Usage Tracking: Both server-side and client-side actions are logged. Piwik is set up for JavaScript usage tracking and reporting.
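The Query Generator and Search & Rank steps described in note #19 might be sketched as follows. The notes do not say how project context is added to a query or how PPR scores change the ranking, so the term-appending rule, the linear `blend`, and the `(context entity, result entity)` lookup table are assumptions for illustration, and no real Solr calls are made.

```python
def augment_query(user_query, project_context, max_terms=3):
    """Append context terms that the user's query does not already mention."""
    lowered = user_query.lower()
    extra = [t for t in project_context if t.lower() not in lowered]
    if not extra:
        return user_query
    return user_query + " " + " ".join(extra[:max_terms])


def rerank(hits, ppr, context_entities, blend=0.5):
    """Blend each hit's text score with its graph proximity to the context.
    hits: list of (doc_id, text_score, entities mentioned in the document)
    ppr:  {(context_entity, entity): precomputed pairwise PPR score}"""
    def proximity(entities):
        return max(
            (ppr.get((c, e), 0.0) for c in context_entities for e in entities),
            default=0.0,
        )
    rescored = [
        (doc_id, (1 - blend) * score + blend * proximity(entities))
        for doc_id, score, entities in hits
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical index hits and PPR table for illustration only.
query = augment_query("pureflex checklist", ["ACME Inc", "PureFlex"])
hits = [("d1", 0.9, ["OtherProduct"]), ("d2", 0.8, ["PureFlex"])]
ppr_table = {("ACME Inc", "PureFlex"): 0.6}
ranked = rerank(hits, ppr_table, ["ACME Inc"])
```

Here the document with the lower text score moves to the top because it mentions an entity that is close to the project's client in the graph.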