Information Flow and Search in
Unstructured Keyword based
Social Networks

   Ankush Garg, Prantik Bhattacharyya,
   Charles U. Martel, S. (Shyhtsun) Felix Wu




                               SMW '09, August 30,
                                Vancouver, Canada
Introduction

  How can information be searched in online social
  networks (OSNs)?




Searching in OSNs

  Expectations for result relevance
    results from direct friends
    results from trustworthy friends

  Challenges in the absence of structure
    high node degree
    high clustering
    low diameter

Search Problem

 Find the set of users whose profile attributes
 include the queried keyword
   Output
     Relevant results first
   Constraints
     Minimum resource usage




Outline

  Introduction
  OSN architecture
  Information Flow Model
  Search Algorithm
  Simulation Methodology
  Results
  Related Work
  Concluding Remarks

OSN architecture

  Undirected social network user topology
  No node has information about the entire graph
  Identity of a node is available only to its friends
  Each edge has an associated trust value
  Each user v has profile attributes $K_v^{PAtt}$
  Profile attributes are built from a set of keywords
  Each keyword k has an associated Policy(k)
  Policy(k) = [D, T]
  Each user v has friendly attributes $K_v^{FAtt}$
OSN architecture (more)
  D: maximum distance from v at which k is
  visible and from which users can contact v
  T: minimum trust required on each edge of the
  social path for users to contact v

  [Figure: Example keyword based social network. Each edge has
  trust > 0.5. Each oval shows how keywords will flow and,
  conversely, can be searched.]
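
One way to read Policy(k) = [D, T] is as a check applied to a social path between the keyword owner v and another user. The sketch below is only an illustration under that reading: the names `Policy` and `can_reach` are hypothetical, and treating the trust bound as inclusive (edge trust >= T) is an assumption, not something the slides state.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Policy:
    """Hypothetical container for Policy(k) = [D, T]."""
    max_distance: int  # D: maximum distance at which the keyword is visible
    min_trust: float   # T: minimum trust required on each edge of the path


def can_reach(policy: Policy, path_trusts: Sequence[float]) -> bool:
    """Return True if a path with the given edge trusts satisfies the policy.

    path_trusts holds the trust value of each edge on the social path,
    so its length is the number of hops between the two users.
    """
    if len(path_trusts) > policy.max_distance:
        return False
    # Assumption: the bound is inclusive, i.e. trust >= T is acceptable.
    return all(trust >= policy.min_trust for trust in path_trusts)


policy = Policy(max_distance=2, min_trust=0.5)
two_hops_trusted = can_reach(policy, [0.9, 0.6])     # within D, all edges >= T
weak_edge = can_reach(policy, [0.9, 0.4])            # one edge below T
too_far = can_reach(policy, [0.9, 0.9, 0.9])         # three hops exceed D
```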

Information Flow Model
 Motivation
   Information maintenance and update
   Diffusion of non-inclusive keyword data
   Integral to search algorithm development
 Users diffuse keyword information
   subject to privacy considerations




Information Flow Model (more)
  Hr: hops remaining
  Hc: hops covered
  PID: propagation ID to avoid cycles
  Identity of source suppressed to non-direct friends
  Non-inclusive data propagation:
    ∆Hr > 0 || ∆T < 0 || ∆Hc < 0
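
The non-inclusive condition can be sketched as a comparison between the propagation data a node already holds for a keyword and newly received data. This is a guess at the intended semantics, not the paper's procedure: it assumes each ∆ means "incoming minus stored", and the names `Propagation`, `is_non_inclusive` and `next_hop` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Propagation:
    pid: str   # PID: propagation ID
    hr: int    # Hr: hops remaining
    hc: int    # Hc: hops covered
    t: float   # T: trust bound carried with the keyword


def is_non_inclusive(stored: Optional[Propagation], incoming: Propagation) -> bool:
    """True if the incoming data is not already covered by the stored data.

    Assumption: delta = incoming - stored, so the slide's condition
    (dHr > 0 || dT < 0 || dHc < 0) becomes the three comparisons below.
    """
    if stored is None:
        return True
    return (incoming.hr > stored.hr
            or incoming.t < stored.t
            or incoming.hc < stored.hc)


def next_hop(msg: Propagation) -> Optional[Propagation]:
    """Data as it would be passed one hop further, or None if no hops remain."""
    if msg.hr <= 0:
        return None
    return replace(msg, hr=msg.hr - 1, hc=msg.hc + 1)


stored = Propagation("PID1", hr=1, hc=2, t=0.5)
same = Propagation("PID1", hr=1, hc=2, t=0.5)
reaches_further = Propagation("PID1", hr=2, hc=2, t=0.5)
```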
Information Flow Model (more)

  Keyword Forwarding Table, FTw
    Keyword    Propagation Data          Friends
    K1         {PID1, Hr1, Hc1, T1}      (F1, F2, F3)

  Keyword Received Table, RTw
    Keyword    Max. Hops            (Friend, Min. Hops)
    K1         $H_{\max}^{K_1}$     (F9, $H_{\min}^{F_9,K_1}$), (F5, $H_{\min}^{F_5,K_1}$)

  [Figure: node W with direct friends F5, F9, F1, F2]

 Please see paper for details

Search Algorithm
  Fast lookup for boundary conditions
    No results possible (no entry exists for query
    keyword)
    Information providers are only direct
    friends (Hmax == 1)
  Components:
    Metric to define value of a direct friend
    Threshold function for dynamic network
    pruning
    Query message processing algorithm
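
The two boundary conditions can be illustrated with a lookup against a Keyword Received Table. This is a minimal sketch, assuming the table maps each keyword to its Max. Hops value and a per-friend Min. Hops map; `ReceivedEntry`, `fast_lookup` and the returned labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReceivedEntry:
    """Hypothetical row of the Keyword Received Table for one keyword."""
    h_max: int                                               # Max. Hops
    min_hops: Dict[str, int] = field(default_factory=dict)   # friend -> Min. Hops


def fast_lookup(received: Dict[str, ReceivedEntry], keyword: str) -> str:
    """Classify a query keyword before running the full search."""
    entry = received.get(keyword)
    if entry is None:
        return "no_results"            # no entry exists for the query keyword
    if entry.h_max == 1:
        return "direct_friends_only"   # only direct friends can provide results
    return "search"                    # fall through to the pruned search


rt_w = {
    "K1": ReceivedEntry(h_max=3, min_hops={"F9": 1, "F5": 2}),
    "K2": ReceivedEntry(h_max=1, min_hops={"F5": 1}),
}
```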
Search Algorithm (more)

  For user w
    Selecting Topologically Closer Nodes
      distance value: $DV(u, S_k) = \dfrac{\min_{k \in S_k} H_{\max}^{k} - \max_{k \in S_k} H_{\min}^{u,k}}{\min_{k \in S_k} H_{\max}^{k}}$

    Selecting Trustworthy Nodes
      trust value: $TV(u, S_k) = \dfrac{T_{wu}}{T_{\max}^{S_k}}$

    Value for keyword set for a direct friend
      $V(u, S_k) = \rho \times DV(u, S_k) + (1 - \rho) \times TV(u, S_k), \quad 0 \le \rho \le 1$
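
A small worked example may help show how the distance value, trust value and combined value fit together. The sketch below assumes the hop counts are available as per-keyword dictionaries and that the maximum trust for the keyword set is passed in directly; the function names and sample numbers are illustrative, not taken from the paper.

```python
from typing import Dict


def distance_value(h_max: Dict[str, int], h_min_u: Dict[str, int]) -> float:
    """DV(u, Sk): both dicts are keyed by the keywords in Sk.

    h_max[k] is Hmax for keyword k; h_min_u[k] is Hmin for k via friend u.
    """
    smallest_h_max = min(h_max.values())
    return (smallest_h_max - max(h_min_u.values())) / smallest_h_max


def trust_value(t_wu: float, t_max: float) -> float:
    """TV(u, Sk): trust of edge (w, u) relative to the maximum trust."""
    return t_wu / t_max


def friend_value(rho: float, dv: float, tv: float) -> float:
    """V(u, Sk) = rho * DV + (1 - rho) * TV with 0 <= rho <= 1."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return rho * dv + (1.0 - rho) * tv


# Illustrative numbers only.
h_max = {"k1": 4, "k2": 5}
h_min_u = {"k1": 1, "k2": 2}
dv = distance_value(h_max, h_min_u)   # (4 - 2) / 4
tv = trust_value(0.75, 1.0)
v = friend_value(0.5, dv, tv)
```

With ρ = 1 only the hop-based term contributes, and with ρ = 0 only the trust term does, which matches the two extremes compared in the results slides.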
Search Algorithm (more)

  Threshold Function
    $\Theta(u, Q_k) = \max_{w \in N_u^{Q_k}} V(w, Q_k) - f(N_u^{Q_k}) \times \left( \max_{w \in N_u^{Q_k}} V(w, Q_k) - \min_{w \in N_u^{Q_k}} V(w, Q_k) \right)$

  Pruning function f with properties:
    $f(1) = 1$ and $\lim_{N_u^{Q_k} \to \infty} f(N_u^{Q_k}) = 0$

  We use $g(x) = x^{-p}$ for $p \ge 0$ as f
    $p = 0$: $f(N_u^{Q_k}) = 1$, so $\Theta(u, Q_k) = \min_{w \in N_u^{Q_k}} V(w, Q_k)$
       Breadth First Search (BFS)
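
The threshold and pruning function can be sketched as follows. Two assumptions are made here and are not stated on the slide: the argument of f is taken to be the number of candidate friends, and a friend is kept when its value is at least the threshold. The names `pruning_factor`, `threshold` and `select_friends` are hypothetical.

```python
from typing import Dict, List


def pruning_factor(n: int, p: float) -> float:
    """g(x) = x^(-p); equals 1 when n == 1 or p == 0."""
    return n ** (-p)


def threshold(values: List[float], p: float) -> float:
    """Theta = max V - f(N) * (max V - min V) over the candidate friends."""
    highest, lowest = max(values), min(values)
    return highest - pruning_factor(len(values), p) * (highest - lowest)


def select_friends(values: Dict[str, float], p: float) -> List[str]:
    """Friends whose value reaches the threshold (assumed inclusive)."""
    theta = threshold(list(values.values()), p)
    return [friend for friend, value in values.items() if value >= theta]


# Illustrative friend values V(w, Qk).
friend_values = {"a": 1.0, "b": 0.5, "c": 0.25}
kept_bfs = select_friends(friend_values, p=0)     # threshold equals the minimum
kept_pruned = select_friends(friend_values, p=1)  # threshold moves toward the maximum
```

Larger values of p shrink f faster as the number of candidates grows, so the threshold approaches the maximum value and fewer friends receive the query.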

Search Algorithm (more)
  Query message structure
    <QID, Qk, Tmin, Hopsdone, Hopsleft>
  Starting Query Structure
    <qid, Qk, 1, 0, Hl>
  Query Serviced Table, QIDu:
    <QID, direct friend who forwarded the query>
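
The query message and the Query Serviced Table can be sketched as a small record plus a dictionary. This is a hedged illustration: it assumes Tmin is the running minimum of edge trust along the path (consistent with the starting value 1), that Hopsdone and Hopsleft change by one per hop, and that a node ignores a QID it has already serviced. The names `Query`, `start_query`, `forward` and `accept` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Query:
    qid: str                  # QID
    keywords: FrozenSet[str]  # Qk
    t_min: float              # Tmin
    hops_done: int            # Hopsdone
    hops_left: int            # Hopsleft


def start_query(qid: str, keywords: Iterable[str], hl: int) -> Query:
    """Starting query structure <qid, Qk, 1, 0, Hl>."""
    return Query(qid, frozenset(keywords), 1.0, 0, hl)


def forward(query: Query, edge_trust: float) -> Optional[Query]:
    """Query as seen one hop further, or None when no hops are left."""
    if query.hops_left <= 0:
        return None
    return replace(
        query,
        t_min=min(query.t_min, edge_trust),  # assumed: running minimum trust
        hops_done=query.hops_done + 1,
        hops_left=query.hops_left - 1,
    )


def accept(serviced: Dict[str, str], query: Query, sender: str) -> bool:
    """Record <QID, forwarding friend>; reject a QID already serviced."""
    if query.qid in serviced:
        return False
    serviced[query.qid] = sender
    return True


q0 = start_query("q1", {"music"}, hl=2)
q1 = forward(q0, edge_trust=0.9)
q2 = forward(q1, edge_trust=0.7)
q3 = forward(q2, edge_trust=0.9)      # hop budget exhausted

serviced_u: Dict[str, str] = {}
first_time = accept(serviced_u, q1, "F5")
duplicate = accept(serviced_u, q1, "F9")
```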


Search Algorithm (more)

  Advantage of Dynamic Pruning
    Reduces number of messages sent in network
    Makes unstructured social network scalable
    Selects targets of high value
      topologically closer
      and trustworthy
    Returns a set of good results from among all
    obtainable results


Simulation Methodology

 Newman-Watts Model
 Trust Distribution
   Five Categories of Trust
     From ‘Blind Trust’ (0.9) to ‘Don’t Know’ (0.1)
 Information propagation policy
   Restrictive policy
     Depth set to 2, i.e. friends of friends
   Liberal policy
     Depth set randomly between 1 and the graph diameter
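
A simulation setup of this kind could be sketched in plain Python as below. Several details are assumptions rather than the authors' configuration: the three intermediate trust levels (0.7, 0.5, 0.3) and the uniform choice among levels are guesses, since the slide gives only the endpoints, and the shortcut step is a simplified Newman-Watts construction that skips a shortcut instead of redrawing when it would duplicate an edge. All function names are hypothetical.

```python
import random
from typing import Dict, Set, Tuple

# Assumed levels: only 0.9 and 0.1 appear on the slide.
TRUST_LEVELS = (0.9, 0.7, 0.5, 0.3, 0.1)


def newman_watts(n: int, k: int, p: float, rng: random.Random) -> Dict[int, Set[int]]:
    """Ring lattice (k nearest neighbours) plus random shortcuts; no edge is removed."""
    adj: Dict[int, Set[int]] = {u: set() for u in range(n)}
    lattice = [(u, (u + j) % n) for u in range(n) for j in range(1, k // 2 + 1)]
    for u, v in lattice:
        adj[u].add(v)
        adj[v].add(u)
    for u, _ in lattice:
        if rng.random() < p:
            w = rng.randrange(n)
            if w != u and w not in adj[u]:   # simplified: skip instead of redraw
                adj[u].add(w)
                adj[w].add(u)
    return adj


def assign_trust(adj: Dict[int, Set[int]], rng: random.Random) -> Dict[Tuple[int, int], float]:
    """One trust value per undirected edge, keyed by (smaller id, larger id)."""
    return {(u, v): rng.choice(TRUST_LEVELS)
            for u in adj for v in adj[u] if u < v}


def policy_depth(liberal: bool, diameter: int, rng: random.Random) -> int:
    """Restrictive policy: depth 2. Liberal policy: random depth in [1, diameter]."""
    return rng.randint(1, diameter) if liberal else 2


rng = random.Random(0)
graph = newman_watts(n=20, k=4, p=0.1, rng=rng)
trust = assign_trust(graph, rng)
```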

Performance at ρ values

 Helps us understand system resource consumption
 Shows significant reduction in message generation
 when compared to BFS (12 compared to 121)

 [Figure: Analysis of network with restrictive propagation policy]

Performance at ρ values (more)

  Helps us understand how many queries
  succeed in finding results
  Significant improvement at ρ = 0.5 when
  compared to BFS

  [Figure: Analysis of network with restrictive propagation policy]

Performance at ρ values (more)

  Similar observation with higher levels of improvement
  Reduction in number of query messages generated,
  from 17855 to only hundreds

  [Figure: Analysis of network with liberal propagation policy]

Performance at ρ values (more)

  Similar results with higher levels of performance
  Best results when hop and trust are considered together
  Results dip when either parameter is considered separately

  [Figure: Analysis of network with liberal propagation policy]

Performance at hop values

  Computed as a fraction of the total number of
  results found by BFS
  Please see paper for more results

  [Figure: Analysis of network (liberal policy) for various hops and at ρ = 0.5]

Performance at hop values (more)

 Higher denominator in pruning function associated
 with increased number of successful query messages
 Dynamic pruning using friend values is effective

 [Figure: Analysis of network with liberal propagation policy]

Related Work

 Search in social networks
   Algorithms using structural knowledge of the
   network through geographical distance,
   organizational hierarchy, interests of users, etc.
 Search in decentralized and unstructured
 networks
   BFS, random BFS, intelligent BFS, directed
   BFS, iterative BFS, random walks, k random
   walks, other variations of random walks

Concluding Remarks

 Developed a Search Algorithm
   Using an Information Flow model
   With focus on decentralization and privacy
   Concentrates on finding a set of good results
   Dynamically prunes the network to search
   Improvement by orders of magnitude
 Next step
   Evaluation using larger graph topologies

Thanks!





