Serving Information Needs of Knowledge Workers

Debdoot Mukherjee, IBM Research-India

A knowledge worker is one who develops or applies knowledge in the workplace -- Peter Drucker
Information Needs of a Knowledge Worker
• Who can I reach out to for help?
• How did we handle such a case before?
• What best practices apply?
Engg. Design, Customer Support, Sales / Pre-Sales, R&D

Knowledge workers spend 15% - 35% of their time searching, and are successful only 50% of the time. (Source: IDC)
Potential productivity gain: 20 – 25%. (Source: McKinsey)
There is a huge cost of NOT finding the RIGHT information at the RIGHT time.

Sample case information created by sales teams:
• Web portals, wikis, forums
• 100s of structured fields in Notes databases
• Dense documents in Team Rooms

Challenges
• Going beyond keyword search. Users expect deeper insights or analyses.
• Complicated access control
• Handling a mix of structured and unstructured data
• Understanding results from past cases is difficult
Opportunities
• Deal with each case domain separately
• Enumerate information needs in that domain
• Extract information entities and drive semantic search
• Leverage context of the case being worked upon

Information Retrieval, Information Interaction, Information Extraction
Understanding the case domain: artifacts created, different roles and their information needs

• Knowledge workers need to read multiple dense documents and distill insights from them to arrive at a decision
• Aggregate insights are often more important than knowing about a particular case
• Need to facilitate information exploration: guess what one would want to know next
• 360 degree views of information entities

Explorer views for the query "pureflex"
Definition: IBM PureFlex™ System is an infrastructure system with expertise for sensing and anticipating resource needs to optimize your infrastructure.
Key attributes include:
• Factory integrated and optimized system infrastructure
• Management integration across physical and virtual resources
• Automation and optimization expertise
• Built for cloud, as a foundation for Infrastructure as a Service offering
Definition from SO Village Offering page. Related Links: IBM PureFlex System (ibm.com)
Customers: ACME Inc. (2 Opportunities, 1 Win); XYZ Ltd; ABCD Inc.; 5 Opportunities
Processes / Lessons: PureFlex and PureAS Solution Process (Source: CES Handbook); Version 1 - PureFlex Solution Checklist; PureSystems (Source: SO Village)
Value Proposition
People
Each exploration view is targeted toward a unique information need.
Decide which views to show based on query and role.

Client: ACME Inc
Client Background
Sector: Distribution
Industry: Travel & Transportation
Contacts: John Will (CSE), Ray Harris (TSM)
Past Opportunities (Win, Loss, Unknown): 2Y-337WW2, 1Y-43HYFD, 12KZ-52XZQQU, 4D-DFREE
See 5 results from IBM Connections
Similar Clients: World Tour-Co Cosmos, Globus
Comments (1), 4/10/2013: "It seems like ACME Inc. is an early adopter of PureFlex. Is it willing to act as a reference? @John Will: Any idea?"
Entity profiles are aggregated from information in multiple documents across multiple sources.

Recommendation View
Section Selector: Client, Solution, Competition, Scope, Win Themes, Value Proposition, Delivery Model, Offerings & Asset, Architecture, Financials, RAID, Engagement, HR Solution, Transition & Trans
Section View: the case field being worked upon (say, Value Proposition)
Current Topics: Standardisation
Recommended Topics: Faster Provisioning; Improve Speed To Market; Reduce IT Operating Costs; Pay per use; Scalability, Flexibility; Standardization of Images; Enable collaboration with partners
Recommended Value Props. from similar past cases:
• ACME Inc, 12Y-6774Y, Contact: Steve Toll
• Govt of XYZ, 12Y-4YFFFT, Contact: Mike Chang
Visually analyze relevance of topics
Software Maintenance
Software Development

Summarize
• Topic Modeling
• MMR based 2-3 line segment summary
Segment & Annotate
• Dictionary, Regex-based
• Leverage Formatting - Paragraphs, Tables
Crawl & Parse
• Multiple platforms, technologies
• Parse formatting, not just text. Export thumbnails
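The "MMR based 2-3 line segment summary" step can be sketched as follows. This is a minimal illustration of Maximal Marginal Relevance, not the pipeline's actual code: the bag-of-words cosine similarity, the use of the whole segment as the query, the `trade_off` value and all function names are assumptions.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-cased token counts; a stand-in for a real text vectorizer."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_summary(sentences, max_sentences=3, trade_off=0.5):
    """Greedily pick sentences that are relevant to the segment but not redundant."""
    vectors = [bag_of_words(s) for s in sentences]
    segment = sum(vectors, Counter())  # assumption: the whole segment is the query
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < max_sentences:
        def score(i):
            relevance = cosine(vectors[i], segment)
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return trade_off * relevance - (1 - trade_off) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]  # keep document order


# Hypothetical segment: the second sentence repeats the first.
sentences = [
    "The client wants faster provisioning of servers.",
    "The client wants faster provisioning of servers.",
    "Pay per use pricing reduces operating costs.",
]
summary = mmr_summary(sentences, max_sentences=2)
```

A lower `trade_off` penalizes redundancy more heavily, so a repeated sentence is less likely to enter the summary twice.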

Extracting Formal Models From Informal Diagrams
Diagram Parsing
• Parse information about diagram shapes
• Attributes such as coordinates, dimensions, text, geometry
Structure Inference
• Precisely determine the underlying flow graph
• Deal with structural ambiguities
Semantic Interpretation
• Classify the semantics of every node or edge based on its structural, textual or geometric features
• Unsupervised training of such a classifier performs as well as supervised training
[Figure: Common Ambiguities]

• Create ER network
• Compute pair-wise Personalized Page Rank
• Leverage case context to supplement user queries

Create ER network
[Figure: ER network with node types Case, People, Technology, Risk, Customer]

Pair-wise Personalized Page Rank
• Suppose we want recommendations for field Xk for this case.
• Let's initiate a random walk from this case and try to hit nodes of type Xk.
• How do we set the edge weights? Equally?
• Which case fields provide meaningful context for Xk?
• Across cases, if similarity in Xi leads to similarity in Xk, then Xi should be used as context for generating recommendations for Xk.
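The random walk described above can be sketched with a small power iteration. This is an illustration under stated assumptions, not the system's implementation: the graph, the node names, the restart probability of 0.15 and the equal edge weights are all hypothetical.

```python
def personalized_pagerank(edges, source, restart=0.15, iterations=50):
    """Power iteration for a random walk that keeps restarting at `source`.

    `edges` maps each node to a dict of {neighbor: edge weight}.
    """
    nodes = set(edges)
    for neighbors in edges.values():
        nodes.update(neighbors)
    rank = {node: 0.0 for node in nodes}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        for node, mass in rank.items():
            neighbors = edges.get(node, {})
            total = sum(neighbors.values())
            if total == 0:
                # Dangling node: send its mass back to the source.
                nxt[source] += (1 - restart) * mass
                continue
            for neighbor, weight in neighbors.items():
                nxt[neighbor] += (1 - restart) * mass * weight / total
        nxt[source] += restart  # restart step; total rank stays at 1
        rank = nxt
    return rank


def undirected(pairs, weight=1.0):
    """Build a symmetric adjacency map with equal edge weights."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


# Hypothetical ER network: the new case shares a customer with one past case.
graph = undirected([
    ("case:new", "customer:ACME"),
    ("case:old1", "customer:ACME"),
    ("case:old1", "value_prop:faster provisioning"),
    ("case:old2", "customer:Other"),
    ("case:old2", "value_prop:pay per use"),
])
scores = personalized_pagerank(graph, "case:new")
recommended = sorted(
    (node for node in scores if node.startswith("value_prop:")),
    key=scores.get,
    reverse=True,
)
```

Nodes of the target type are ranked by their score, so value propositions reachable through shared context rank above those that are not connected to the current case.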

Correspondence Analysis
Correspondence, Corr(Xi, Xk), is calculated as the degree to which similarity in field Xi corresponds to similarity in field Xk across pairs of cases.
1. Select a pair of cases, Ri = (Ci1, Ci2), from the case repository.
2. For each case field Xk (k = 1, 2, …, n), compute the similarity of the contents of Xk in Ci1 and Ci2: Sik = Similarity(Ci1.Xk, Ci2.Xk).
3. Repeat steps 1 and 2 for all pairs of cases in the repository to populate the matrix S, with one row per case pair (R1 … Rm) and one column per field:

        X1    X2    …    Xk    …    Xn
  R1    S11   S12   …    S1k   …    S1n
  …     …     …     …    …     …    …
  Rm    Sm1   Sm2   …    Smk   …    Smn

4. To compute Corr(Xi, Xk) for all i = 1, 2, …, n, regress column k of S on the other columns.
5. The coefficient that the resulting linear regression model assigns to column i gives Corr(Xi, Xk).
An incoming edge to a node of type Xi is weighted by Corr(Xi, Xk) whenever a node of type Xk is a target for any Personalized Page Rank calculation.
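The correspondence steps can be sketched in plain Python. This is a sketch under assumptions: Jaccard overlap stands in for Similarity(), the regression is ordinary least squares with an intercept solved through the normal equations, and the field names and similarity rows are invented for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Set-overlap similarity; one possible choice of Similarity()."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def similarity_matrix(cases, fields):
    """Matrix S: one row per pair of cases, one column per field."""
    return [[jaccard(c1[f], c2[f]) for f in fields]
            for c1, c2 in combinations(cases, 2)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def correspondence(s, fields, target):
    """Regress the target column of S on the other columns (with intercept)."""
    k = fields.index(target)
    others = [i for i in range(len(fields)) if i != k]
    x = [[1.0] + [row[i] for i in others] for row in s]
    y = [row[k] for row in s]
    p = len(others) + 1
    xtx = [[sum(r[a] * r[b] for r in x) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * t for r, t in zip(x, y)) for a in range(p)]
    beta = solve(xtx, xty)  # beta[0] is the intercept
    return {fields[i]: beta[j + 1] for j, i in enumerate(others)}


# Toy similarity rows, invented for illustration (one row per case pair).
fields = ["customer_sector", "scope", "value_proposition"]
s = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.8],
    [0.0, 1.0, 0.1],
    [1.0, 1.0, 0.9],
    [0.5, 0.2, 0.42],
]
corr = correspondence(s, fields, "value_proposition")
```

In this toy data, similarity in `customer_sector` explains most of the similarity in `value_proposition`, so edges into customer-sector nodes would receive the larger weight when value propositions are the target. The sketch assumes the columns of S are not collinear; otherwise the normal equations have no unique solution.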

SIM Architecture
[Architecture diagram. Labeled components: Crawlers, Parsers, Annotators, Entity Profile Creator, Graph Builder, Index Creator, Personalized Page Rank Calc., Query Generator, Search & Rank, Web UI / API, Front-end App, Document Processing & Analytics Pipeline. Data stores: Document DB, Graph DB, PPR DB, Search Indices. Other labels: Semi-structured Contents, Rich texts, Thumbnails, Primary, Secondary, Read, Project Context, Provisioning, Access Level Provisioning, Access Control, Usage Tracking.]

Related Publications
• Rohan Padhye, Debdoot Mukherjee, Vibha Sinha: API as a Social Glue. ICSE 2014 (In Submission)
• Mu Qiao, Debdoot Mukherjee et al.: Unleashing The Power of Expert Knowledge for IT Services Sales with Graph Search. INFORMS 2013
• Debdoot Mukherjee, Jeanette Blomberg, Rama Akkiraju, Dinesh Raghu, Monika Gupta, Sugata Ghosal, Mu Qiao, Taiga Nakamura: A Case Based Approach to Serve Information Needs in Knowledge Intensive Processes. ICSOC 2013: 541-549
• Richard Goodwin, SweeFen Goh, Pietro Mazzoleni, Vibha Sinha, Debdoot Mukherjee, Senthil Mani: Effective Content Reuse for Business Consulting Practices. SRII Global Conference 2012: 682-690
• Monika Gupta, Debdoot Mukherjee, Senthil Mani, Vibha Singhal Sinha, Saurabh Sinha: Serving Information Needs in Business Process Consulting. BPM 2011: 231-247
• Debdoot Mukherjee, Senthil Mani, Vibha Singhal Sinha, Rema Ananthanarayanan, Biplav Srivastava, Pankaj Dhoolia, Prahlad Chowdhury: AHA: Asset Harvester Assistant. IEEE SCC 2010: 425-432
• Debdoot Mukherjee, Pankaj Dhoolia, Saurabh Sinha, Aubrey J. Rembert, Mangala Gowri Nanda: From Informal Process Diagrams to Formal Process Models. BPM 2010: 145-161
• Biplav Srivastava, Debdoot Mukherjee, Rema Ananthanarayanan, Vibha Sinha: From Model Extraction to Model-based Reuse of Enterprise Documents. COMAD 2010: 171
• Pietro Mazzoleni, Debdoot Mukherjee, et al.: Consultant assistant: a tool for collaborative requirements gathering and business process documentation. OOPSLA Companion 2009

Related Patents
• (Granted) US 8176412 – Generating Formatted Documents
• (Granted) US 8234570 – Harvesting Assets for packaged software application configuration
• (Granted) US 8356045 – Method to Identify Common Structures in Formatted Text Documents
• (Granted) US 8578346 – System and method to validate and repair process flow drawings
• (Granted) US 8589877 – Modeling and Linking Documents for Packaged Software Application Configuration
• US 2011/0106801 A1 – Systems and Methods for Organizing Documented Processes
• US 2011/0167070 A1 – Reusing assets for packaged s/w application configuration
• US 2011/0313932 A1 – Model Based Project Network
• US 2012/0062574 A1 – Automated Recognition of Process Modeling Semantics in Flow Diagrams
• US 2012/0078969 A1 – System and Method to extract models from semi-structured documents
• US 2013/0144872 – System and method to provision semantic and contextual search over knowledge repositories
• Knowledge Management for Solution Design during Sales and Pre-Sales
• Document Editors to Assimilate Documents Returned by a Search Engine
• System and method for socially enabled business risk management
• System and method for managing and using social search lists in a search engine
