Facilitating Document Annotation Using Content and Querying Value
Abstract:
A large number of organizations today generate and share textual descriptions of
their products, services, and actions. Such collections of textual data contain a
significant amount of structured information, which remains buried in the
unstructured text. While information extraction algorithms facilitate the extraction
of structured relations, they are often expensive and inaccurate, especially when
operating on top of text that does not contain any instances of the targeted
structured information. We present a novel alternative approach that facilitates
the generation of the structured metadata by identifying documents that are likely
to contain information of interest, where that information will subsequently be
useful for querying the database. Our approach relies on the idea that humans are
more likely to add the necessary metadata during creation time, if prompted by
the interface; or that it is much easier for humans (and/or algorithms) to identify
the metadata when such information actually exists in the document, instead of
naively prompting users to fill in forms with information that is not available in the
document. As a major contribution of this paper, we present algorithms that
identify structured attributes that are likely to appear within the document, by
jointly utilizing the content of the text and the query workload. Our experimental
evaluation shows that our approach generates superior results compared to
approaches that rely only on the textual content or only on the query workload, to
identify attributes of interest.
Architecture:
EXISTING SYSTEM:
Many systems, though, do not even have the basic “attribute-value” annotation
that would make “pay-as-you-go” querying feasible. Existing work on query
forms can be leveraged in creating the CADS adaptive query forms. One line of
prior work proposes an algorithm to extract a query form that represents most of
the queries in the database using the “querability” of the columns, and extends
that work to cover form customization. Other work uses the schema information
to auto-complete attribute or value names in query forms. Keyword queries have
also been used to select the most appropriate query forms.
PROPOSED SYSTEM:
In this paper, we propose CADS (Collaborative Adaptive Data Sharing platform),
which is an “annotate-as-you-create” infrastructure that facilitates fielded data
annotation. A key contribution of our system is the direct use of the query
workload to direct the annotation process, in addition to examining the content of
the document. In other words, we are trying to prioritize the annotation of
documents towards generating attribute values for attributes that are often used
by querying users.
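The idea of prioritizing attributes that querying users often ask about can be illustrated with a minimal sketch. It assumes the query workload is available as a list of attribute sets, one set per past query; the function names and the sample attributes (`location`, `date`, `severity`) are hypothetical and are not taken from the CADS paper.

```python
from collections import Counter


def querying_value(workload):
    """Return, for each attribute, the fraction of workload queries that use it."""
    counts = Counter()
    for query_attrs in workload:
        # Count each attribute at most once per query.
        counts.update(set(query_attrs))
    total = len(workload)
    if total == 0:
        return {}
    return {attr: n / total for attr, n in counts.items()}


def rank_attributes(workload, top_k=3):
    """Rank attributes by how often they are queried (ties broken by name)."""
    scores = querying_value(workload)
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_k]


# Hypothetical workload: each entry lists the attributes one query filtered on.
workload = [
    {"location", "date"},
    {"location"},
    {"location", "severity"},
    {"date"},
]

print(rank_attributes(workload, top_k=2))
```

Under this simple frequency estimate, attributes used by more queries are suggested for annotation first.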
Modules:
1. Registration
2. Login
3. Document Upload
4. Search Techniques
5. Download Document
Modules Description
Registration:
In this module, an Author (Creator) or User must register first;
only then can he/she access the database.
Login:
In this module, any of the above-mentioned persons must
log in by giving their email ID and password.
Document Upload:
In this module, the Owner uploads an unstructured document as a file (along with
its metadata) into the database. With the help of this metadata and the document's
contents, the end user can download the file. He/She has to enter content or a
query to download the file.
Search Techniques:
Here we use two techniques for searching for a document:
1) Content Search, 2) Query Search.
Content Search:
The document is retrieved by supplying content that is present in the
corresponding document. If the content is present, the corresponding
document is downloaded; otherwise it is not.
Query Search:
The document is retrieved by using a query of the kind given in the
base paper. If the input matches, the document is downloaded;
otherwise it is not.
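The two search techniques can be sketched as follows. This is only an illustration under stated assumptions: documents are held in memory rather than in MySQL, content search is a case-insensitive substring match, and query search requires every attribute-value condition to equal the stored metadata. The `Document` class and both function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    text: str
    metadata: dict = field(default_factory=dict)


def content_search(docs, phrase):
    """Return documents whose text contains the phrase (case-insensitive)."""
    needle = phrase.lower()
    return [d for d in docs if needle in d.text.lower()]


def query_search(docs, conditions):
    """Return documents whose metadata matches every attribute=value condition."""
    return [
        d for d in docs
        if all(d.metadata.get(attr) == value for attr, value in conditions.items())
    ]


# Hypothetical sample data.
docs = [
    Document("report1.txt", "Flooding reported near the river.", {"type": "flood"}),
    Document("report2.txt", "Power outage in the city centre.", {"type": "outage"}),
]

print([d.name for d in content_search(docs, "river")])
print([d.name for d in query_search(docs, {"type": "outage"})])
```

A search that matches nothing returns an empty list, which corresponds to the case where no document is downloaded.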
Download Document:
The User downloads the document using query or content values
of the kind given in the base paper. He/She enters the data in the
text boxes; if it is correct, the file is downloaded; otherwise it is not.
System Configuration:-
H/W System Configuration:-
Processor - Pentium III
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
S/W System Configuration:-
Operating System : Windows 95/98/2000/XP
Application Server : Tomcat 5.0/6.x
Front End : HTML, Java, JSP
Scripts : JavaScript
Server-side Script : JavaServer Pages
Database : MySQL
Database Connectivity : JDBC
Conclusion:
We proposed adaptive techniques to suggest relevant attributes to
annotate a document, while trying to satisfy the user querying needs. Our solution
is based on a probabilistic framework that considers the evidence in the document
content and the query workload. We present two ways to combine these two
pieces of evidence, content value and querying value: a model that considers both
components conditionally independent and a linear weighted model. Experiments
show that using our techniques, we can suggest attributes that improve the
visibility of the documents with respect to the query workload by up to 50%. That
is, we show that using the query workload can greatly improve the annotation
process and increase the utility of shared data.
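The two ways of combining content value and querying value can be illustrated with a rough sketch. It makes simplifying assumptions that are not the paper's exact formulas: each attribute already has a content value and a querying value in [0, 1], the conditionally independent model is approximated by a plain product of the two scores, and the linear model uses a single hypothetical `weight` parameter. All names and numbers below are illustrative.

```python
def combine_independent(content_value, querying_value):
    """Simplified independence model: multiply the two scores (an assumption)."""
    return content_value * querying_value


def combine_linear(content_value, querying_value, weight=0.5):
    """Linear weighted model: weight on content, (1 - weight) on querying value."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * content_value + (1.0 - weight) * querying_value


def suggest(content_values, querying_values, combine, top_k=2):
    """Score every attribute with `combine` and return the top_k attribute names."""
    attrs = set(content_values) | set(querying_values)
    scores = {
        a: combine(content_values.get(a, 0.0), querying_values.get(a, 0.0))
        for a in attrs
    }
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_k]


# Hypothetical scores for one document.
content_values = {"location": 0.9, "date": 0.2, "severity": 0.6}
querying_values = {"location": 0.5, "date": 0.8, "severity": 0.1}

print(suggest(content_values, querying_values, combine_independent))
print(suggest(content_values, querying_values,
              lambda c, q: combine_linear(c, q, weight=0.9)))
```

With these sample numbers, a weight that strongly favours content promotes `severity` over `date`, while the product model keeps `date` because it is queried often.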
