Facilitating Document Annotation Using Content and Querying Value
Contact: 9703109334, 9533694296
Email id: academicliveprojects@gmail.com, www.logicsystems.org.in
ABSTRACT:
A large number of organizations today generate and share textual descriptions of their products,
services, and actions. Such collections of textual data contain a significant amount of structured
information, which remains buried in the unstructured text. While information extraction
algorithms facilitate the extraction of structured relations, they are often expensive and
inaccurate, especially when operating on text that does not contain any instances of the
targeted structured information. We present a novel alternative approach that facilitates the
generation of structured metadata by identifying documents that are likely to contain
information of interest, where that information will subsequently be useful for querying the
database. Our approach relies on the idea that humans are more likely to add the necessary
metadata at creation time if prompted by the interface, and that it is much easier for humans
(and/or algorithms) to identify the metadata when such information actually exists in the
document, instead of naively prompting users to fill in forms with information that is not
available in the document. As a major contribution of this paper, we present algorithms that
identify structured attributes that are likely to appear within the document, by jointly utilizing the
content of the text and the query workload. Our experimental evaluation shows that our approach
generates superior results compared to approaches that rely only on the textual content or only on
the query workload to identify attributes of interest.
EXISTING SYSTEM:
Many annotation systems allow only “untyped” keyword annotation: for instance, a user may
annotate a weather report using a tag such as “Storm Category 3”. Annotation strategies that use
attribute-value pairs are generally more expressive, as they can contain more information than
untyped approaches. In such settings, the above information can be entered as
(StormCategory, 3). A recent line of work towards using more expressive queries that leverage
such annotations is the “pay-as-you-go” querying strategy in Dataspaces [2]: in Dataspaces,
users provide data integration hints at query time. The assumption in such systems is that the
data sources already contain structured information and the problem is to match the query
attributes with the source attributes. Many systems, though, do not even have the basic
“attribute-value” annotation that would make “pay-as-you-go” querying feasible. Annotations
that use “attribute-value” pairs require users to be more principled in their annotation efforts.
Users should know the underlying schema and field types to use; they should also know when to
use each of these fields. With schemas that often have tens or even hundreds of available fields
to fill, this task becomes complicated and cumbersome. As a result, data entry users often ignore
such annotation capabilities.
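To make the contrast between untyped tags and attribute-value annotations concrete, the following minimal Python sketch revisits the weather report example. The document ids, the `Location` field, and the `matches` helper are hypothetical and introduced only for illustration; they are not part of any system described here.

```python
# Illustrative only: document ids and field names are made up.

# Untyped annotation: each document carries a bag of free-text tags.
untyped = {
    "report-1": {"Storm Category 3"},
    "report-2": {"Category 3 storm shelter list"},
}

# Attribute-value annotation: each document carries typed fields.
typed = {
    "report-1": {"StormCategory": 3, "Location": "Gulf Coast"},
    "report-2": {"Location": "Gulf Coast"},
}


def matches(annotations, conditions):
    """Return ids of documents whose fields satisfy every (attribute, predicate) pair."""
    hits = []
    for doc_id, fields in annotations.items():
        if all(attr in fields and pred(fields[attr])
               for attr, pred in conditions.items()):
            hits.append(doc_id)
    return sorted(hits)


# A structured query such as "StormCategory >= 3" needs the typed form.
strong_storms = matches(typed, {"StormCategory": lambda v: v >= 3})

# With untyped tags, only text matching on the tag itself is possible,
# which also picks up the shelter list that merely mentions the phrase.
keyword_hits = sorted(
    doc_id for doc_id, tags in untyped.items()
    if any("category 3" in tag.lower() for tag in tags)
)
```

In this toy data the structured query returns only the actual storm report, while the keyword match also returns the shelter list, which is the kind of imprecision that typed annotations are meant to avoid.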
DISADVANTAGES OF EXISTING SYSTEM:
• The cost of creating annotation information is high.
• The existing system produces some errors in its suggestions.
PROPOSED SYSTEM:
In this paper, we propose CADS (Collaborative Adaptive Data Sharing platform), which is an
“annotate-as-you-create” infrastructure that facilitates fielded data annotation. A key contribution
of our system is the direct use of the query workload to direct the annotation process, in addition
to examining the content of the document. In other words, we try to prioritize the
annotation of documents towards generating attribute values for attributes that are often used by
querying users. The goal of CADS is to encourage and lower the cost of creating well-annotated
documents that can be immediately useful for commonly issued semi-structured
queries. Our key goal is to encourage the annotation of documents at
creation time, while the creator is still in the “document generation” phase, even though the
techniques can also be used for post-generation document annotation. In our scenario, the author
generates a new document and uploads it to the repository. After the upload, CADS analyzes the
text and creates an adaptive insertion form. The form contains the best attribute names given the
document text and the information need (query workload), and the most probable attribute values
given the document text. The author (creator) can inspect the form, modify the generated
metadata as necessary, and submit the annotated document for storage.
ADVANTAGES OF PROPOSED SYSTEM:
• We present an adaptive technique for automatically generating data input forms for
annotating unstructured textual documents, such that the utilization of the inserted data is
maximized, given the user information needs.
• We create principled probabilistic methods and algorithms to seamlessly integrate
information from the query workload into the data annotation process, in order to generate
metadata that are not just relevant to the annotated document, but also useful to the users
querying the database.
• We present extensive experiments with real data and real users, showing that our system
generates accurate suggestions that are significantly better than the suggestions from
alternative approaches.
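The idea of choosing form fields by jointly using the document content and the query workload can be sketched as follows. This is a simplified illustration and not the probabilistic model of the paper: the cue-word lists, the sample workload, the function names, and the choice to multiply the two scores are all assumptions made for the example.

```python
from collections import Counter


def querying_value(workload):
    """Fraction of workload queries that mention each attribute."""
    counts = Counter(attr for query in workload for attr in set(query))
    return {attr: n / len(workload) for attr, n in counts.items()}


def content_value(text, cue_words):
    """Fraction of an attribute's cue words that occur in the document text."""
    tokens = set(text.lower().split())
    return {attr: len(tokens & cues) / len(cues)
            for attr, cues in cue_words.items()}


def suggest_attributes(text, workload, cue_words, k=2):
    """Rank attributes by content value times querying value (an assumed combination)."""
    qv = querying_value(workload)
    cv = content_value(text, cue_words)
    scores = {attr: cv[attr] * qv.get(attr, 0.0) for attr in cue_words}
    ranked = sorted(scores, key=lambda a: (-scores[a], a))
    return [a for a in ranked[:k] if scores[a] > 0]


# Hypothetical query workload: each query is the list of attributes it uses.
workload = [
    ["StormCategory", "Location"],
    ["StormCategory"],
    ["Location", "Date"],
    ["StormCategory", "Date"],
]

# Hypothetical cue words that hint at an attribute being present in the text.
cue_words = {
    "StormCategory": {"hurricane", "category", "storm"},
    "Location": {"coast", "county"},
    "Date": {"monday", "tuesday"},
    "WindSpeed": {"mph", "wind"},
}

text = "Hurricane Gustav reached category 3 near the coast with wind of 115 mph"
form_fields = suggest_attributes(text, workload, cue_words, k=2)
```

In this toy setup `WindSpeed` is well supported by the text but never appears in the workload, so it is not suggested, while `Date` is queried but absent from the text and is also left out. Only attributes that score on both sides reach the insertion form.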
MODULES: 
1. Collaborative Annotation Module 
2. Dataspaces and pay-as-you-go integration Module
3. Content management product Module 
4. Information extraction Module 
5. Schema Evolution Module 
6. Query Forms Module 
MODULES DESCRIPTION: 
Collaborative Annotation Module: 
This module relates to the significant amount of existing work on predicting tags for documents
or other resources (web pages, images, videos). Depending on the object and the user involvement,
these approaches make different assumptions about what is expected as input; nevertheless, their
goals are similar, as they expect to find missing tags that are related to the object. We argue that our
approach is different, as we use the workload to augment document visibility after the tagging
process. Compared with the other approaches, precision is a secondary goal, as we expect that the
annotator can improve the annotations in the process. On the other hand, the discovered tags
assist in retrieval tasks instead of simple bookmarking.
Dataspaces and pay-as-you-go integration Module:
The integration model of CADS is similar to that of Dataspaces, where a loose integration
model is proposed for heterogeneous sources. The basic difference is that Dataspaces integrate
existing annotations for data sources in order to answer queries. Our work suggests the
appropriate annotation at insertion time, and also takes into consideration the query
workload to identify the most promising attributes to add. Another related data model is that of
Google Base, where users can specify their own attribute/value pairs, in addition to the ones
proposed by the system. However, the proposed attributes in Google Base are hard-coded for
each item category (e.g., real estate property). In CADS, the goal is to learn what attributes to
suggest. Pay-as-you-go integration techniques like PayGo are useful for suggesting candidate
matchings at query time.
Content management product Module: 
In this module, CADS improves content management platforms by learning the user information
demand and adjusting the insertion forms accordingly.
Information extraction Module: 
Information extraction is related to this effort, mainly in the context of value suggestion for the 
computed attributes. We can broadly separate the area into two main efforts: Closed IE and Open 
IE. Closed IE requires the user to define the schema, and then the system populates the tables 
with relations extracted from the text. Our work on attribute suggestion naturally complements 
closed IE, as we identify what attributes are likely to appear within a document. Once we have 
that information, we can then employ the IE system to extract the values for the attributes. Open 
IE is closer to the needs of CADS. In particular, Open IE generates RDF-like triplets, e.g.,
(Gustav, is category, 3), with no input from the user. However, Open IE leads to a very large number of
triplets, which means that even after the successful extraction of the attribute values, we still
have to deal with the problem of schema explosion, which prevents the successful execution of
structured queries that require knowledge of the attribute names and values that appear within a
document.
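The schema explosion problem can be illustrated with a small Python sketch. The triplets below are invented examples in the style of (Gustav, is category, 3), and the mapping from predicate phrases to schema attributes is hand-written for this illustration; neither comes from an actual IE system.

```python
# Hypothetical Open IE output: (subject, predicate, object) triplets.
triplets = [
    ("Gustav", "is category", "3"),
    ("Gustav", "was rated category", "3"),
    ("Gustav", "reached category", "3"),
    ("Gustav", "made landfall in", "Louisiana"),
    ("Gustav", "hit", "Louisiana"),
]

# Open IE: every distinct predicate string acts as its own attribute name.
open_schema = sorted({pred for _, pred, _ in triplets})

# Closed IE: the user fixes the schema up front. This phrase-to-attribute
# mapping is an assumption made for the example.
closed_schema = {
    "StormCategory": {"is category", "was rated category", "reached category"},
    "LandfallLocation": {"made landfall in", "hit"},
}


def to_closed(triplets, schema):
    """Collapse triplets into attribute -> set of values under a fixed schema."""
    record = {}
    for _, pred, obj in triplets:
        for attr, phrases in schema.items():
            if pred in phrases:
                record.setdefault(attr, set()).add(obj)
    return record


record = to_closed(triplets, closed_schema)
```

Here five triplets yield five different attribute names under Open IE but only two under the fixed schema. A structured query can target `StormCategory` directly, whereas with the open schema the querying user would need to know every predicate phrasing in advance.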
Schema Evolution Module:
In this module, the adaptive annotation in CADS can be viewed as semi-automatic schema
evolution. Previous work on schema evolution [27] did not address the problem of which attributes
to add to the schema, but rather how to support querying and other database operations when the
schema changes.
Query Forms Module:
Existing work on query forms uses schema information to auto-complete attribute or value names
in query forms, and uses keyword queries to select the most appropriate query forms. Our work
can be considered a dual approach: instead of generating query forms using the database
contents, we create the schema and contents of the database by considering the content of the
query workload (and the contents of the documents, of course).
SYSTEM CONFIGURATION:-
HARDWARE CONFIGURATION:-
• Processor - Pentium IV
• Speed - 1.1 GHz
• RAM - 256 MB (min)
• Hard Disk - 20 GB
• Keyboard - Standard Windows Keyboard
• Mouse - Two or Three Button Mouse
• Monitor - SVGA
SOFTWARE CONFIGURATION:-
• Operating System : Windows XP
• Programming Language : JAVA/J2EE
• Java Version : JDK 1.6 & above
• IDE : NetBeans 7.2.1
• Database : MySQL
REFERENCE:
Eduardo J. Ruiz, Vagelis Hristidis, and Panagiotis G. Ipeirotis, “Facilitating Document
Annotation Using Content and Querying Value”, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 26, NO. 2, FEBRUARY 2014.
