Text-similarity algorithm
 for the semantic annotation of Web Services (WS)
               SWAP research group - 27 July 2010
                     Michele Filannino, @bronko85
Outline
      The problem
           Reference scenario
           Similarity

      SAWA
           Word-to-word similarity
           Text-to-text similarity

      Experimental results
           Quality of the results
           Execution time

      Future work
      Demo session
The problem
How can the similarity between two texts be measured?
Reference scenario
     [Diagram: WSDL file -> CODEArchitects Annotation Tool (natural language descriptions) -> CODEArchitects Annotation Tool (to approve/reject suggested annotations) -> SAWSDL file]
Semantic similarity
      Assigning a meaning-based measure of likeness to a set of terms and/or
      documents;

      Similarity ≠ relatedness;
      “Bank” and “money” are related, although they are not similar at all;

      Similarity ⇒ relatedness;
      “Girl” and “maiden” are similar and therefore also related.
Semantic similarity in SWOP

    WS concepts:
      - RequestOrder
      - Order
      - BillingInformation
      - ...

    Ontology concepts:
      - Order
      - OrderNumber
      - OrderID
      - BillID
      - BillReference
      - BusinessFirm
      - Product
      - Catalog
      - ...
Computational cost

     Example:
      Ontology with 1,200 concepts
      WSDL with 15 annotations

      1,200 x 15 = 18,000 runs of SAWA :(
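
The count on this slide is a plain product: every annotation is compared against every ontology concept. A minimal sketch of that arithmetic follows; the function name `sawa_runs` is hypothetical and only illustrates the calculation.

```python
def sawa_runs(ontology_concepts: int, annotations: int) -> int:
    """Number of text-to-text comparisons when each annotation
    is scored against each ontology concept (hypothetical helper)."""
    if ontology_concepts < 0 or annotations < 0:
        raise ValueError("counts must be non-negative")
    return ontology_concepts * annotations


if __name__ == "__main__":
    # Figures taken from the slide: 1,200 concepts and 15 annotations.
    print(sawa_runs(1200, 15))
```

The cost grows linearly in both inputs, which is why the per-comparison time discussed later matters.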
SAWA
Similarity Algorithm Wikipedia-bAsed
Word-to-word similarity

      Given two words, determine how similar they are;
      Types of algorithms for computing word similarity:
        Corpus-based: pointwise mutual information, latent semantic analysis;

        Hierarchy-based: Leacock & Chodorow, Lesk, Wu & Palmer, Resnik, Lin, Jiang &
        Conrath;

      Input: two words;
      Output: a score between 0 and 1.
Lin's algorithm (1998)
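
The formula on this slide did not survive extraction. The information-content form of Lin's measure usually quoted for concept hierarchies is sim(c1, c2) = 2 log P(lcs) / (log P(c1) + log P(c2)), where lcs is the most specific concept that subsumes both. The sketch below implements that form. The function name and the probability inputs are illustrative, and it is an assumption that this is the variant shown on the slide or used inside DISCO.

```python
import math


def lin_similarity(p_c1: float, p_c2: float, p_lcs: float) -> float:
    """Lin (1998) similarity from concept probabilities.

    p_c1, p_c2: probabilities of the two concepts.
    p_lcs: probability of their lowest common subsumer.
    All three values are assumed to lie in (0, 1].
    """
    for p in (p_c1, p_c2, p_lcs):
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must be in (0, 1]")
    denominator = math.log(p_c1) + math.log(p_c2)
    if denominator == 0.0:
        # Both concepts are the root (probability 1): treat as identical.
        return 1.0
    return 2.0 * math.log(p_lcs) / denominator


if __name__ == "__main__":
    # Hypothetical probabilities, chosen only to exercise the function.
    print(lin_similarity(0.01, 0.02, 0.1))
```

The score is 1 when both concepts coincide with their common subsumer and 0 when the only shared ancestor is the root, which matches the 0 to 1 output range stated for word-to-word similarity.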
Word-to-word similarity tool

       Library used: LinguaTools DISCO;
       It uses Wikipedia as a hierarchy of concepts:
         202,578 concepts;

         updated as of 1 January 2008;

       It uses Lin's algorithm to compute similarity.
Examples

      Tiger, lion = 90%
      Doctor, nurse = 70%
      Stock, market = 47%
      Love, sex = 46%
      FBI, investigation = 35%
      Professor, cucumber = 0.006%
Quality of the algorithm
         Corpus used to measure quality: WordSim353;
         Correlation coefficients (Pearson):
           Wikipedia: 0.574;

           BNC: 0.415;

           PubMed: 0.105.
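
The quality figures above are Pearson correlations between the algorithm's scores and human judgements. A minimal sketch of how such a coefficient can be computed is given here. The sample values are invented for illustration and are not WordSim353 data; the function name `pearson` is hypothetical.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented example: human ratings vs. algorithm scores for four word pairs.
    human = [1.0, 2.0, 3.0, 4.0]
    algorithm = [2.0, 4.0, 6.0, 8.0]
    print(pearson(human, algorithm))
```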
Text-to-text similarity
       Given two texts, determine how similar they are;
       A suitable extension of the word-to-word similarity algorithms;
       Removal of stopwords, i.e. words with:
         low discriminative power;

         high frequency of occurrence;

       Input: two texts;
       Output: a score between 0 and 1.
Stopwords

      “Returns the first and last name of each customer who is categorized as an
                                  individual consumer”

                                   (stopword removal)

                  “name customer categorized individual consumer”
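
The transformation on this slide can be sketched as a simple token filter. The stopword list below is hypothetical: it contains exactly the words the slide drops, since the list actually used by SAWA is not given.

```python
import re
from typing import FrozenSet

# Hypothetical stopword list, chosen to reproduce the slide's example.
STOPWORDS: FrozenSet[str] = frozenset(
    {"returns", "the", "first", "and", "last", "of",
     "each", "who", "is", "as", "an"}
)


def remove_stopwords(text: str, stopwords: FrozenSet[str] = STOPWORDS) -> str:
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(token for token in tokens if token not in stopwords)


if __name__ == "__main__":
    sentence = ("Returns the first and last name of each customer "
                "who is categorized as an individual consumer")
    print(remove_stopwords(sentence))
```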
Corley & Mihalcea's algorithm (2005)
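
The formula on this slide was lost in extraction. The measure is usually described as follows: for each word of one text, take its best word-to-word similarity against the other text, weight it by the word's inverse document frequency, normalise, and average the two directions. The sketch below follows that description under simplifying assumptions: it ignores part-of-speech classes, takes the word similarity function and the idf table as inputs, and uses hypothetical names throughout. It is not SAWA's code.

```python
from typing import Callable, Mapping, Sequence

WordSim = Callable[[str, str], float]


def directional_similarity(source: Sequence[str], target: Sequence[str],
                           word_sim: WordSim, idf: Mapping[str, float]) -> float:
    """idf-weighted average of each source word's best match in target."""
    weighted_sum = 0.0
    total_weight = 0.0
    for word in source:
        weight = idf.get(word, 1.0)  # assumption: unknown words get weight 1
        best = max((word_sim(word, other) for other in target), default=0.0)
        weighted_sum += best * weight
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0


def text_similarity(text1: Sequence[str], text2: Sequence[str],
                    word_sim: WordSim, idf: Mapping[str, float]) -> float:
    """Symmetric score: mean of the two directional similarities."""
    return 0.5 * (directional_similarity(text1, text2, word_sim, idf)
                  + directional_similarity(text2, text1, word_sim, idf))


def exact_match(a: str, b: str) -> float:
    """Toy stand-in for a real word-to-word similarity function."""
    return 1.0 if a == b else 0.0


if __name__ == "__main__":
    print(text_similarity(["name", "customer"], ["customer", "address"],
                          exact_match, {}))
```

Because each directional score is a weighted average of values in [0, 1], the result stays in the 0 to 1 range stated for text-to-text similarity, provided the word-level function does.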
Optimizations (v1.2)

  Caching of the frequency of each term;
  Caching of the similarities between terms;
  Incremental learning;
  Fewer accesses to DISCO;

  Execution time reduced by a factor of 10.
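
The caching of term similarities listed above can be sketched as a small wrapper that remembers every pair it has already scored. The `backend` callable stands in for the expensive lookup (for example a call into DISCO); its signature, the class name, and the assumption that the measure is symmetric are all hypothetical.

```python
from typing import Callable, Dict, Tuple


class CachedSimilarity:
    """Memoise a symmetric word-to-word similarity function."""

    def __init__(self, backend: Callable[[str, str], float]) -> None:
        self._backend = backend
        self._cache: Dict[Tuple[str, str], float] = {}
        self.backend_calls = 0  # counts how often the slow lookup really ran

    def __call__(self, word1: str, word2: str) -> float:
        # Order the pair so that (a, b) and (b, a) share one cache entry.
        key = (word1, word2) if word1 <= word2 else (word2, word1)
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._backend(*key)
        return self._cache[key]


if __name__ == "__main__":
    def slow_lookup(a: str, b: str) -> float:
        """Toy stand-in for the real similarity source."""
        return 0.9 if {a, b} == {"tiger", "lion"} else 0.0

    similarity = CachedSimilarity(slow_lookup)
    similarity("tiger", "lion")
    similarity("lion", "tiger")
    print(similarity.backend_calls)
```

Since the same ontology terms recur across many comparisons, each cached pair saves a repeated access to the underlying library.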
Experimental results
Quality and execution time
SELECTED WSDL DOCUMENT DESCRIPTION:
     "returns the first and last name of each customer who is categorized as an individual consumer"

     RANKING OF SIMILAR ONTOLOGY CONCEPTS (with their scores):
     *---------------------------------------------------------------------------------------------------------------*
     | Description                                                                                          | Score |
     *---------------------------------------------------------------------------------------------------------------*
     | name: name of customer                                                                               | 62,85% |
     | customer: Current customer individual information                                                    | 56,91% |
     | customeraddress: Customer address                                                                    | 42,36% |
     | customercredicard: Customer credit card information                                                  | 35,08% |
     | salesreason: Reasons why a customer may purchase a particular product.                               | 30,35% |
     | customerstore:Stores of our Company (customer and resellers).                                        | 17,31% |
     | salesorderdetail: Product details associated with a specific sales order.                            | 2,99% |
     | productinventory: Product inventory information.                                                     | 2,59% |
     | salesrepresentativeperson: Contains current sales information for the sales representative persons. | 2,39% |
     | productlocation: Product manufacturing locations                                                     | 2,36% |
     | salestaxrate: Sales Tax rate.                                                                        | 2,36% |
     | salesterritory: Sales territory.                                                                     | 2,22% |
     | employeeaddress: Employee information such as salary, department, and title.                         | 2,18% |
     | product: Products sold or used in the manfacturing of sold products.                                 | 2,12% |
     | enterpricedepartment: Departments of Enterprise                                                      | 2,00% |
     | salesspecialoffer: Sales Special Offer (discounts).                                                  | 1,99% |
     | productlistpricehistory: Changes in the list price of a product over time.                           | 1,80% |
     | shipmethod: Shipping methods.                                                                        | 1,79% |
     | salesorder: General sales order information (header).                                                | 1,76% |
     | productdocument: Product Document                                                                    | 1,73% |
     | productcosthistory: Changes in the cost of a product over time.                                      | 1,68% |
     | productbillofmaterials: Bill Of Materials are items required to make products and product subassembl | 1,61% |
     | productmodel: Product model classification.                                                          | 1,48% |
     | currencyrate: Currency exchange rates.                                                               | 1,40% |
     | salesshoppingcartitem: Contains shopping cart items until the order is submitted or cancelled.       | 1,29% |
     | productcategory: High-level product categorization.                                                  | 1,27% |
     | addresstype: Types of addresses                                                                      | 0,95% |
     | unitmeasure: Unit of measure.                                                                        | 0,80% |
     | currency: Standard ISO currencies.                                                                   | 0,51% |
     | countryregion: ISO standard codes for countries and regions.                                         | 0,51% |
     | stateprovince: States and provinces                                                                  | 0,12% |
     *---------------------------------------------------------------------------------------------------------------*
     Time elapsed: 9.4 seconds.
SELECTED WSDL DOCUMENT DESCRIPTION:
     "lists the names and addresses of all individual customers"

     RANKING OF SIMILAR ONTOLOGY CONCEPTS (with their scores):
     *---------------------------------------------------------------------------------------------------------------*
     | Description                                                                                          | Score |
     *---------------------------------------------------------------------------------------------------------------*
     | addresstype: Types of addresses                                                                      | 51,77% |
     | customer: Current customer individual information                                                    | 24,03% |
     | customeraddress: Customer address                                                                    | 10,83% |
     | name: name of customer                                                                               | 6,32% |
     | productlistpricehistory: Changes in the list price of a product over time.                           | 4,91% |
     | customercredicard: Customer credit card information                                                  | 4,47% |
     | salesreason: Reasons why a customer may purchase a particular product.                               | 4,20% |
     | customerstore:Stores of our Company (customer and resellers).                                        | 3,21% |
     | salesorder: General sales order information (header).                                                | 2,72% |
     | salesspecialoffer: Sales Special Offer (discounts).                                                  | 2,53% |
     | salesorderdetail: Product details associated with a specific sales order.                            | 2,49% |
     | salesterritory: Sales territory.                                                                     | 2,14% |
     | salesrepresentativeperson: Contains current sales information for the sales representative persons. | 2,08% |
     | employeeaddress: Employee information such as salary, department, and title.                         | 1,81% |
     | salestaxrate: Sales Tax rate.                                                                        | 1,79% |
     | productlocation: Product manufacturing locations                                                     | 1,78% |
     | countryregion: ISO standard codes for countries and regions.                                         | 1,64% |
     | product: Products sold or used in the manfacturing of sold products.                                 | 1,62% |
     | productinventory: Product inventory information.                                                     | 1,60% |
     | currencyrate: Currency exchange rates.                                                               | 1,46% |
     | enterpricedepartment: Departments of Enterprise                                                      | 1,45% |
     | productmodel: Product model classification.                                                          | 1,38% |
     | shipmethod: Shipping methods.                                                                        | 1,37% |
     | salesshoppingcartitem: Contains shopping cart items until the order is submitted or cancelled.       | 1,36% |
     | productbillofmaterials: Bill Of Materials are items required to make products and product subassembl | 1,32% |
     | productdocument: Product Document                                                                    | 1,27% |
     | productcosthistory: Changes in the cost of a product over time.                                      | 1,26% |
     | productcategory: High-level product categorization.                                                  | 1,01% |
     | currency: Standard ISO currencies.                                                                   | 0,85% |
     | stateprovince: States and provinces                                                                  | 0,73% |
     | unitmeasure: Unit of measure.                                                                        | 0,71% |
     *---------------------------------------------------------------------------------------------------------------*
     Time elapsed: 4.177 seconds.
SELECTED WSDL DOCUMENT DESCRIPTION:
     "returns the name of each customer that is categorized as a store"

     RANKING OF SIMILAR ONTOLOGY CONCEPTS (with their scores):
     *---------------------------------------------------------------------------------------------------------------*
     | Description                                                                                          | Score |
     *---------------------------------------------------------------------------------------------------------------*
     | name: name of customer                                                                               | 64,29% |
     | customeraddress: Customer address                                                                    | 43,83% |
     | customer: Current customer individual information                                                    | 40,05% |
     | customercredicard: Customer credit card information                                                  | 36,52% |
     | salesreason: Reasons why a customer may purchase a particular product.                               | 31,74% |
     | customerstore:Stores of our Company (customer and resellers).                                        | 21,07% |
     | employeeaddress: Employee information such as salary, department, and title.                         | 2,75% |
     | salesorderdetail: Product details associated with a specific sales order.                            | 2,67% |
     | productinventory: Product inventory information.                                                     | 2,52% |
     | salestaxrate: Sales Tax rate.                                                                        | 2,22% |
     | salesterritory: Sales territory.                                                                     | 2,19% |
     | salesrepresentativeperson: Contains current sales information for the sales representative persons. | 2,09% |
     | productlocation: Product manufacturing locations                                                     | 1,91% |
     | enterpricedepartment: Departments of Enterprise                                                      | 1,87% |
     | salesorder: General sales order information (header).                                                | 1,84% |
     | product: Products sold or used in the manfacturing of sold products.                                 | 1,79% |
     | salesspecialoffer: Sales Special Offer (discounts).                                                  | 1,72% |
     | productlistpricehistory: Changes in the list price of a product over time.                           | 1,68% |
     | productdocument: Product Document                                                                    | 1,63% |
     | salesshoppingcartitem: Contains shopping cart items until the order is submitted or cancelled.       | 1,61% |
     | shipmethod: Shipping methods.                                                                        | 1,52% |
     | productbillofmaterials: Bill Of Materials are items required to make products and product subassembl | 1,47% |
     | productcosthistory: Changes in the cost of a product over time.                                      | 1,43% |
     | productmodel: Product model classification.                                                          | 1,42% |
     | currencyrate: Currency exchange rates.                                                               | 1,30% |
     | productcategory: High-level product categorization.                                                  | 1,15% |
     | addresstype: Types of addresses                                                                      | 1,02% |
     | unitmeasure: Unit of measure.                                                                        | 0,93% |
     | countryregion: ISO standard codes for countries and regions.                                         | 0,45% |
     | currency: Standard ISO currencies.                                                                   | 0,44% |
     | stateprovince: States and provinces                                                                  | 0,12% |
     *---------------------------------------------------------------------------------------------------------------*
     Time elapsed: 1.245 seconds.
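
The rankings above come from scoring one WSDL description against every ontology concept and sorting by score. A minimal sketch of that loop follows. The scoring function used here is plain token overlap (Jaccard), a deliberately simple stand-in and not the similarity measure used by SAWA; all names are hypothetical.

```python
from typing import Callable, List, Mapping, Tuple


def token_overlap(text1: str, text2: str) -> float:
    """Jaccard overlap of lowercase tokens (stand-in scorer, not SAWA's)."""
    tokens1, tokens2 = set(text1.lower().split()), set(text2.lower().split())
    union = tokens1 | tokens2
    return len(tokens1 & tokens2) / len(union) if union else 0.0


def rank_concepts(description: str, concepts: Mapping[str, str],
                  score: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Score every concept description and sort from best to worst."""
    scored = [(name, score(description, text)) for name, text in concepts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Two concept descriptions copied from the tables above.
    concepts = {
        "currency": "Standard ISO currencies.",
        "name": "name of customer",
    }
    query = "returns the name of each customer that is categorized as a store"
    for name, value in rank_concepts(query, concepts, token_overlap):
        print(f"{name}: {value:.2%}")
```

Any function returning a score between 0 and 1 can be passed as `score`, so a text-to-text measure can replace the stand-in without changing the ranking loop.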
Execution time

     No.   Optimized   Not optimized
     3     1.0 s       9.4 s
     6     1.7 s       9.8 s
     5     2.7 s       18.1 s
     7     3.6 s       21.8 s
     2     3.9 s       15.5 s
     8     5.6 s       23.1 s
     1     6.2 s       14.3 s
     4     9.4 s       39.4 s

     (Bar chart; horizontal axis from 0 to 50 s.)
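
The per-test speedup can be read off the timings above by dividing the non-optimized time by the optimized one. The sketch below does that for the eight value pairs on this slide; treating the leading number of each row as a test identifier is an assumption, and the names are hypothetical.

```python
from typing import Dict, Tuple

# test id -> (optimized seconds, non-optimized seconds), as shown on the slide
TIMINGS: Dict[int, Tuple[float, float]] = {
    3: (1.0, 9.4),
    6: (1.7, 9.8),
    5: (2.7, 18.1),
    7: (3.6, 21.8),
    2: (3.9, 15.5),
    8: (5.6, 23.1),
    1: (6.2, 14.3),
    4: (9.4, 39.4),
}


def speedups(timings: Dict[int, Tuple[float, float]]) -> Dict[int, float]:
    """Ratio of non-optimized to optimized time for each test."""
    return {test: slow / fast for test, (fast, slow) in timings.items()}


if __name__ == "__main__":
    for test, ratio in sorted(speedups(TIMINGS).items()):
        print(f"test {test}: {ratio:.1f}x faster")
```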
Future work
Near-term and longer-term
  Near-term:
    Implementation of the Web Service interface

    Implementation of the (free) Web interface

    Implementation of the network interface

    Scientific dissemination

  Other:
    Introduction of thresholds to improve performance

    Release of the source code under an open-source license
Demo session

More Related Content

PDF
Pricing Routine In Vofm
PDF
A Fuzzy Logic Multi-Criteria Decision Approach for Vendor Selection Manufactu...
PDF
Sap sd quest_answer_2009061511245119496
PDF
54145899 sap-sd-int-tips
PDF
Content Archaeology (Keynote for DocTrain West March 2009)
PDF
3 mystandards ug_editor_best_practices
PDF
Basic rules-in-sap-sd-module
PDF
Modulo di serendipità in un Item Recommender System
Pricing Routine In Vofm
A Fuzzy Logic Multi-Criteria Decision Approach for Vendor Selection Manufactu...
Sap sd quest_answer_2009061511245119496
54145899 sap-sd-int-tips
Content Archaeology (Keynote for DocTrain West March 2009)
3 mystandards ug_editor_best_practices
Basic rules-in-sap-sd-module
Modulo di serendipità in un Item Recommender System

Similar to Algoritmo di text-similarity per l'annotazione semantica di Web Service (20)

PDF
Sapperformancetestingbestpracticeguidev1 0-130121141448-phpapp02
PDF
Sap performance testing best practice guidev1 0-130121141448-phpapp02
PDF
SAP Performance Testing Best Practice Guide v1.0
PDF
Forecast 2014: SaaS Data Exchange
PDF
VMworld 2013: Create a Key Metrics-based Actionable Roadmap to Deliver IT as ...
PDF
Discover Data That Matters- Deep dive into WSO2 Analytics
PDF
Crystal Qube™ Presentation
PDF
Fusion Applications - PIM Deep Dive
PPTX
Agile in Medical Software Development
PPTX
Salesforce Fundamental para Certificação de Administrador e Plataforma App Bu...
PDF
WFX Cloud ERP
PDF
Distributed Caches: A Developer’s Guide to Unleashing Your Data in High-Perfo...
PPTX
Presentatie Duncan Rogers NMD2010 17 juni 2010
 
PPT
Data Warehouse-Final
PPTX
A wrapper for QuantLib and reference data
PPT
INWK Overview
PPTX
Successfully manage your business model transformation
PPTX
Designing a Future-proof API Program
PPTX
Why Generic Configurators dont work in the valve Industry
Sapperformancetestingbestpracticeguidev1 0-130121141448-phpapp02
Sap performance testing best practice guidev1 0-130121141448-phpapp02
SAP Performance Testing Best Practice Guide v1.0
Forecast 2014: SaaS Data Exchange
VMworld 2013: Create a Key Metrics-based Actionable Roadmap to Deliver IT as ...
Discover Data That Matters- Deep dive into WSO2 Analytics
Crystal Qube™ Presentation
Fusion Applications - PIM Deep Dive
Agile in Medical Software Development
Salesforce Fundamental para Certificação de Administrador e Plataforma App Bu...
WFX Cloud ERP
Distributed Caches: A Developer’s Guide to Unleashing Your Data in High-Perfo...
Presentatie Duncan Rogers NMD2010 17 juni 2010
 
Data Warehouse-Final
A wrapper for QuantLib and reference data
INWK Overview
Successfully manage your business model transformation
Designing a Future-proof API Program
Why Generic Configurators dont work in the valve Industry
Ad

More from Michele Filannino (16)

PDF
me_t3_october
PDF
Using machine learning to predict temporal orientation of search engines’ que...
PDF
Temporal information extraction in the general and clinical domain
PDF
Mining temporal footprints from Wikipedia
PDF
Can computers understand time?
PDF
Detecting novel associations in large data sets
PDF
Temporal expressions identification in biomedical texts
PDF
My research taster project
PDF
Nonlinear component analysis as a kernel eigenvalue problem
PDF
Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...
KEY
Tecniche fuzzy per l'elaborazione del linguaggio naturale
KEY
SWOP project and META software
KEY
Semantic Web Service Annotation
PDF
Orchestrazione delle risorse umane nel BPM
PDF
Serendipity module in Item Recommender System
PPTX
Orchestrazione di risorse umane nel BPM: Gestione dinamica feature-based dell...
me_t3_october
Using machine learning to predict temporal orientation of search engines’ que...
Temporal information extraction in the general and clinical domain
Mining temporal footprints from Wikipedia
Can computers understand time?
Detecting novel associations in large data sets
Temporal expressions identification in biomedical texts
My research taster project
Nonlinear component analysis as a kernel eigenvalue problem
Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...
Tecniche fuzzy per l'elaborazione del linguaggio naturale
SWOP project and META software
Semantic Web Service Annotation
Orchestrazione delle risorse umane nel BPM
Serendipity module in Item Recommender System
Orchestrazione di risorse umane nel BPM: Gestione dinamica feature-based dell...
Ad

Recently uploaded (20)

PDF
Encapsulation_ Review paper, used for researhc scholars
PPTX
Machine Learning_overview_presentation.pptx
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PPTX
Tartificialntelligence_presentation.pptx
PDF
Spectral efficient network and resource selection model in 5G networks
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Machine learning based COVID-19 study performance prediction
PDF
A comparative analysis of optical character recognition models for extracting...
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
NewMind AI Weekly Chronicles - August'25-Week II
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPTX
MYSQL Presentation for SQL database connectivity
Encapsulation_ Review paper, used for researhc scholars
Machine Learning_overview_presentation.pptx
Advanced methodologies resolving dimensionality complications for autism neur...
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
Diabetes mellitus diagnosis method based random forest with bat algorithm
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Tartificialntelligence_presentation.pptx
Spectral efficient network and resource selection model in 5G networks
Digital-Transformation-Roadmap-for-Companies.pptx
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Agricultural_Statistics_at_a_Glance_2022_0.pdf
gpt5_lecture_notes_comprehensive_20250812015547.pdf
Per capita expenditure prediction using model stacking based on satellite ima...
Building Integrated photovoltaic BIPV_UPV.pdf
Machine learning based COVID-19 study performance prediction
A comparative analysis of optical character recognition models for extracting...
Reach Out and Touch Someone: Haptics and Empathic Computing
NewMind AI Weekly Chronicles - August'25-Week II
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
MYSQL Presentation for SQL database connectivity

Algoritmo di text-similarity per l'annotazione semantica di Web Service

  • 1. Algoritmo di text-similarity per l’annotazione semantica di WS SWAP research group - 27 luglio 2010 Michele Filannino, @bronko85
  • 2. Outline Il problema Scenario di riferimento Similarità SAWA Word-to-word similarity Text-to-text similarity Risultati sperimentali Qualità dei risultati Tempo di esecuzione 2 Sviluppi futuri Sessione dimostrativa
  • 3. Il problema Come misurare la similarità tra due testi?
  • 4. 4 Scenario di riferimento Natural language To approve/reject descriptions suggested annotations WSDL file CODEArchitects CODEArchitects SAWSDL file Annotation Tool Annotation Tool
  • 5. 5 Similarità semantica Assegnare una metrica di somiglianza, basata sul significato, ad un insieme di termini e/o documenti; Similarità ≠ Correlatività; “Banca” e “denaro” sono correlati sebbene non siano affatto simili; Similarità Correlatività; “Ragazza” e “fanciulla” sono simili quindi anche correlati.
  • 6. 6 Similarità semantica in SWOP Concetti del WS Concetti ontologici - RequestOrder Order - - Order OrderNumber - - BillingInformation OrderID - - ... BillID - BillReference - BusinessFirm - Product - Catalog - ... -
  • 7. 7 Peso computazionale Esempio: Ontologia con 1200 concetti WSDL con 15 annotazioni 18.000 esecuzioni di SAWA :( 1.200 x 15 =
  • 9. 9 Word-to-word similarity Date due parole stabilire quanto esse sono simili; Tipi di algoritmi per il calcolo della similarità tra parole: Corpus-based: pointwise mutual information, latent semantic analysis; Hierarchy-based: Leacock & Chodorow, Lesk, Wu & Palmer, Resnik, Lin, Jiang & Conrath; Input: due parole; Output: score compreso tra 0 e 1.
  • 10. 10 Algoritmo di Lin (1998)
  • 11. 11 Tool di word-to-word similarity Libreria utilizzata: LinguaTools DISCO; Utilizza Wikipedia come gerarchia di concetti 202.578 concetti; Aggiornato al 1° gennaio 2008 Utilizza l’algoritmo di Lin per il calcolo della similarità.
  • 12. 12 Esempi Tiger, lion = 90% Doctor, nurse = 70% Stock, market = 47% Love, sex = 46% FBI, investigation = 35% Professor, cucumber = 0,006%
  • 13. Qualità dell’algoritmo Corpus per la misurazione della qualità: WordSim353; Coefficienti di correlazione (Pearson): Wikipedia: 0,574; BNC: 0,415; PubMed: 0,105; 90.000 67.500 45.000 22.500 0
  • 14. 14 Text-to-text similarity Dati due testi stabilire quanto essi sono simili; Estensione opportuna degli algoritmi di word-to-word similarity; Rimozione delle parole (stopword) basso potere discriminatorio; alta frequenza di occorrenza; Input: due testi; Output: score compreso tra 0 e 1.
  • 15. 15 Stopword “Returns the first and last name of each customer who is categorized as an individual consumer” STOPWORD “name customer categorized individual consumer”
  • 16. Algoritmo di Corley & Mihalcea 16 (2005)
  • 17. Ottimizzazioni (v1.2) Caching delle frequenze di ogni termine; Caching delle similarità tra termini; Apprendimento incrementale; Riduzione degli accessi a DISCO; Performance ridotte di 10 volte;
  • 18. Risultati sperimentali Qualità e tempo di esecuzione
  • 19. DESCRIZIONE DEL DOCUMENTO WSDL SCELTA: "returns the first and last name of each customer who is categorized as an individual consumer" RANKING DEI CONCETTI ONTOLOGICI SIMILI (con relativo score): *---------------------------------------------------------------------------------------------------------------* | Descrizione | Score | *---------------------------------------------------------------------------------------------------------------* | name: name of customer | 62,85% | | customer: Current customer individual information | 56,91% | | customeraddress: Customer address | 42,36% | | customercredicard: Customer credit card information | 35,08% | | salesreason: Reasons why a customer may purchase a particular product. | 30,35% | | customerstore:Stores of our Company (customer and resellers). | 17,31% | | salesorderdetail: Product details associated with a specific sales order. | 2,99% | | productinventory: Product inventory information. | 2,59% | | salesrepresentativeperson: Contains current sales information for the sales representative persons. | 2,39% | | productlocation: Product manufacturing locations | 2,36% | | salestaxrate: Sales Tax rate. | 2,36% | | salesterritory: Sales territory. | 2,22% | | employeeaddress: Employee information such as salary, department, and title. | 2,18% | | product: Products sold or used in the manfacturing of sold products. | 2,12% | | enterpricedepartment: Departments of Enterprise | 2,00% | | salesspecialoffer: Sales Special Offer (discounts). | 1,99% | | productlistpricehistory: Changes in the list price of a product over time. | 1,80% | | shipmethod: Shipping methods. | 1,79% | | salesorder: General sales order information (header). | 1,76% | | productdocument: Product Document | 1,73% | | productcosthistory: Changes in the cost of a product over time. 
| 1,68% | | productbillofmaterials: Bill Of Materials are items required to make products and product subassembl | 1,61% | | productmodel: Product model classification. | 1,48% | | currencyrate: Currency exchange rates. | 1,40% | | salesshoppingcartitem: Contains shopping cart items until the order is submitted or cancelled. | 1,29% | | productcategory: High-level product categorization. | 1,27% | | addresstype: Types of addresses | 0,95% | | unitmeasure: Unit of measure. | 0,80% | | currency: Standard ISO currencies. | 0,51% | 19 | countryregion: ISO standard codes for countries and regions. | 0,51% | | stateprovince: States and provinces | 0,12% | *---------------------------------------------------------------------------------------------------------------* Time elapsed: 9.4 seconds.
  • 20. SELECTED WSDL DOCUMENT DESCRIPTION: "lists the names and addresses of all individual customers"
    RANKING OF SIMILAR ONTOLOGY CONCEPTS (with their scores):
    *---------------------------------------------------------------------------------------------------------------*
    | Description | Score |
    *---------------------------------------------------------------------------------------------------------------*
    | addresstype: Types of addresses | 51,77% |
    | customer: Current customer individual information | 24,03% |
    | customeraddress: Customer address | 10,83% |
    | name: name of customer | 6,32% |
    | productlistpricehistory: Changes in the list price of a product over time. | 4,91% |
    | customercredicard: Customer credit card information | 4,47% |
    | salesreason: Reasons why a customer may purchase a particular product. | 4,20% |
    | customerstore: Stores of our Company (customer and resellers). | 3,21% |
    | salesorder: General sales order information (header). | 2,72% |
    | salesspecialoffer: Sales Special Offer (discounts). | 2,53% |
    | salesorderdetail: Product details associated with a specific sales order. | 2,49% |
    | salesterritory: Sales territory. | 2,14% |
    | salesrepresentativeperson: Contains current sales information for the sales representative persons. | 2,08% |
    | employeeaddress: Employee information such as salary, department, and title. | 1,81% |
    | salestaxrate: Sales Tax rate. | 1,79% |
    | productlocation: Product manufacturing locations | 1,78% |
    | countryregion: ISO standard codes for countries and regions. | 1,64% |
    | product: Products sold or used in the manfacturing of sold products. | 1,62% |
    | productinventory: Product inventory information. | 1,60% |
    | currencyrate: Currency exchange rates. | 1,46% |
    | enterpricedepartment: Departments of Enterprise | 1,45% |
    | productmodel: Product model classification. | 1,38% |
    | shipmethod: Shipping methods. | 1,37% |
    | salesshoppingcartitem: Contains shopping cart items until the order is submitted or cancelled. | 1,36% |
    | productbillofmaterials: Bill Of Materials are items required to make products and product subassembl | 1,32% |
    | productdocument: Product Document | 1,27% |
    | productcosthistory: Changes in the cost of a product over time. | 1,26% |
    | productcategory: High-level product categorization. | 1,01% |
    | currency: Standard ISO currencies. | 0,85% |
    | stateprovince: States and provinces | 0,73% |
    | unitmeasure: Unit of measure. | 0,71% |
    *---------------------------------------------------------------------------------------------------------------*
    Time elapsed: 4.177 seconds.
  • 21. SELECTED WSDL DOCUMENT DESCRIPTION: "returns the name of each customer that is categorized as a store"
    RANKING OF SIMILAR ONTOLOGY CONCEPTS (with their scores):
    *---------------------------------------------------------------------------------------------------------------*
    | Description | Score |
    *---------------------------------------------------------------------------------------------------------------*
    | name: name of customer | 64,29% |
    | customeraddress: Customer address | 43,83% |
    | customer: Current customer individual information | 40,05% |
    | customercredicard: Customer credit card information | 36,52% |
    | salesreason: Reasons why a customer may purchase a particular product. | 31,74% |
    | customerstore: Stores of our Company (customer and resellers). | 21,07% |
    | employeeaddress: Employee information such as salary, department, and title. | 2,75% |
    | salesorderdetail: Product details associated with a specific sales order. | 2,67% |
    | productinventory: Product inventory information. | 2,52% |
    | salestaxrate: Sales Tax rate. | 2,22% |
    | salesterritory: Sales territory. | 2,19% |
    | salesrepresentativeperson: Contains current sales information for the sales representative persons. | 2,09% |
    | productlocation: Product manufacturing locations | 1,91% |
    | enterpricedepartment: Departments of Enterprise | 1,87% |
    | salesorder: General sales order information (header). | 1,84% |
    | product: Products sold or used in the manfacturing of sold products. | 1,79% |
    | salesspecialoffer: Sales Special Offer (discounts). | 1,72% |
    | productlistpricehistory: Changes in the list price of a product over time. | 1,68% |
    | productdocument: Product Document | 1,63% |
    | salesshoppingcartitem: Contains shopping cart items until the order is submitted or cancelled. | 1,61% |
    | shipmethod: Shipping methods. | 1,52% |
    | productbillofmaterials: Bill Of Materials are items required to make products and product subassembl | 1,47% |
    | productcosthistory: Changes in the cost of a product over time. | 1,43% |
    | productmodel: Product model classification. | 1,42% |
    | currencyrate: Currency exchange rates. | 1,30% |
    | productcategory: High-level product categorization. | 1,15% |
    | addresstype: Types of addresses | 1,02% |
    | unitmeasure: Unit of measure. | 0,93% |
    | countryregion: ISO standard codes for countries and regions. | 0,45% |
    | currency: Standard ISO currencies. | 0,44% |
    | stateprovince: States and provinces | 0,12% |
    *---------------------------------------------------------------------------------------------------------------*
    Time elapsed: 1.245 seconds.
  • 22. Execution time
    | # | Optimized | Not optimized |
    | 3 | 1.0 s | 9.4 s |
    | 6 | 1.7 s | 9.8 s |
    | 5 | 2.7 s | 18.1 s |
    | 7 | 3.6 s | 21.8 s |
    | 2 | 3.9 s | 15.5 s |
    | 8 | 5.6 s | 23.1 s |
    | 1 | 6.2 s | 14.3 s |
    | 4 | 9.4 s | 39.4 s |
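
The optimized and non-optimized timings on slide 22 can be summarized as a speedup factor. The following is a minimal Python sketch with the timings copied from the slide; the variable names and the choice of ratio (non-optimized time divided by optimized time) are assumptions for illustration and are not part of SAWA.

```python
# Timings in seconds from slide 22: row number -> (optimized, non-optimized).
timings = {
    3: (1.0, 9.4),
    6: (1.7, 9.8),
    5: (2.7, 18.1),
    7: (3.6, 21.8),
    2: (3.9, 15.5),
    8: (5.6, 23.1),
    1: (6.2, 14.3),
    4: (9.4, 39.4),
}

# Speedup per row: how many times faster the optimized run is.
speedups = {row: slow / fast for row, (fast, slow) in timings.items()}

# Totals over all rows, and the overall speedup they imply.
total_fast = sum(fast for fast, _ in timings.values())
total_slow = sum(slow for _, slow in timings.values())
overall = total_slow / total_fast

for row, factor in sorted(speedups.items()):
    print(f"row {row}: {factor:.1f}x")
print(f"total: {total_fast:.1f} s vs {total_slow:.1f} s ({overall:.1f}x)")
```

Over the eight rows the totals are 34.1 s and 151.4 s, roughly a 4.4x speedup overall; the per-row factor ranges from about 2.3x (row 1) to 9.4x (row 3).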
  • 24. Future developments
    Upcoming:
    - Implementation of the Web Service interface
    - Implementation of the Web interface (free of charge)
    - Implementation of the network interface
    - Scientific dissemination
    Other:
    - Introduction of thresholds to improve performance
    - Release of the source code under an open-source license
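
The slide does not say where the proposed thresholds would apply. One possible reading, sketched below in Python, is a cut-off on the final ranking so that only candidates above a minimum score are shown to the person approving or rejecting annotations. The scores are the top rows of the slide 21 ranking; the helper names `parse_score` and `filter_ranking` and the 10% cut-off are hypothetical.

```python
def parse_score(text: str) -> float:
    """Convert a score such as '64,29%' (decimal comma) to a float."""
    return float(text.rstrip("%").replace(",", "."))


def filter_ranking(ranking, min_score):
    """Keep (concept, score) pairs with score >= min_score, best first."""
    kept = [(concept, score) for concept, score in ranking if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Top of the slide 21 ranking, as printed by the tool.
raw = [
    ("name", "64,29%"),
    ("customeraddress", "43,83%"),
    ("customer", "40,05%"),
    ("customercredicard", "36,52%"),
    ("salesreason", "31,74%"),
    ("customerstore", "21,07%"),
    ("employeeaddress", "2,75%"),
    ("salesorderdetail", "2,67%"),
]
ranking = [(concept, parse_score(score)) for concept, score in raw]

MIN_SCORE = 10.0  # hypothetical cut-off, in percent
shortlist = filter_ranking(ranking, MIN_SCORE)
for concept, score in shortlist:
    print(f"{concept}: {score:.2f}%")
```

With this cut-off the slide 21 ranking shrinks from 31 candidates to 6, which matches the visible gap between 21,07% and 2,75%. A cut-off on the final ranking reduces review effort but not computation time; pruning inside the algorithm itself would need details the slides do not give.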

Editor's Notes

  • #11: LCS = Least Common Subsumer (Italian: "ultimo sussuntore comune")