Facilitating document annotation using content and querying value

Facilitating Document Annotation Using Content And
Querying Value
Abstract:
A large number of organizations today generate and share textual descriptions of
their products, services, and actions .Such collections of textual data contain
significant amount of structured information, which remains buried in the
unstructured text. While information extraction algorithms facilitate the extraction
of structured relations, they are often expensive and inaccurate, especially when
operating on top of text that does not contain any instances of the targeted
structured information. We present a novel alternative approach that facilitates
the generation of the structured metadata by identifying documents that are likely
to contain information of interest and this information is going to be subsequently
useful for querying the database. Our approach relies on the idea that humans are
more likely to add the necessary metadata during creation time, if prompted by
the interface; or that it is much easier for humans (and/or algorithms) to identify
the metadata when such information actually exists in the document, instead of
naively prompting users to fill in forms with information that is not available in the
document. As a major contribution of this paper, we present algorithms that
identify structured attributes that are likely to appear within the document ,by
jointly utilizing the content of the text and the query workload. Our experimental
evaluation shows that our approach generates superior results compared to
approaches that rely only on the textual content or only on the query workload, to
identify attributes of interest.
GLOBALSOFT TECHNOLOGIES
IEEE PROJECTS & SOFTWARE DEVELOPMENTS
IEEE FINAL YEAR PROJECTS|IEEE ENGINEERING PROJECTS|IEEE STUDENTS PROJECTS|IEEE
BULK PROJECTS|BE/BTECH/ME/MTECH/MS/MCA PROJECTS|CSE/IT/ECE/EEE PROJECTS
CELL: +91 98495 39085, +91 99662 35788, +91 98495 57908, +91 97014 40401
Visit: www.finalyearprojects.org Mail to:ieeefinalsemprojects@gmail.com

Architecture:
EXISTING SYSTEM:
Many systems, though, do not even have the basic “attribute-value” annotation
that would make a “pay-as-you-go” querying feasible. Existing work on query
forms can beleveraged in creating the CADS adaptive query forms. They propose
an algorithm to extract a query form that represents most of the queries in the
database using the ”querability” of the columns, while they extend their work
discussing forms customization. Some people use the schema information to auto-
complete attribute or value names in query forms. In keyword queries are used to
select the most appropriate query forms.

PROPOSED SYSTEM:
In this paper, we propose CADS (Collaborative Adaptive Data Sharing platform),
which is an “annotate-as-you-create” infrastructure that facilitates fielded data
annotation .A key contribution of our system is the direct use of the query
workload to direct the annotation process, in addition to examining the content of
the document. In other words, we are trying to prioritize the annotation of
documents towards generating attribute values for attributes that are often used
by querying users.
Modules :
1. Registration
2. Login
3. Document Upload
4. Search Techniques
5. Download Document
Modules Description
Registration:
In this module an Author(Creater) or User have to register
first,then only he/she has to access the data base.
Login:
In this module,any of the above mentioned person have
to login,they should login by giving their emailid and password .

Document Upload:
In this module Owner uploads an unstructured
document as file(along with meta data) into database,with the help of this
metadata and its contents,the end user has to download the file.He/She has to
enter content/query for download the file.
Search Techniques:
Here we are using two techniques for searching the document
1)Content Search,2)Query Search.
Content Search:
It means that the document will be downloaded by giving the
content which is present in the corresponding document. If its present the
corresponding document will be downloaded, otherwise it won’t.
Query Search:
It means that the document will be downloaded by using query
which has given in the base paper. If its input matches the document will get
download otherwise it won’t.
Download Document:
The User has to download the document using query/content
values which have given in the base paper. He/She enters the correct data in the
text boxes, if its correct it will download the file. Otherwise it won’t.

System Configuration:-
H/W System Configuration:-
Processor - Pentium –III
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Key Board - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
S/W System Configuration:-
 Operating System :Windows95/98/2000/XP
 Application Server : Tomcat5.0/6.X
 Front End : HTML, Java, Jsp
 Scripts : JavaScript.
 Server side Script : Java Server Pages.
 Database : My sql
 Database Connectivity : JDBC.

Conclusion:
We proposed adaptive techniques to suggest relevant at-tributes to
annotate a document, while trying to satisfy the user querying needs. Our solution
is based on a probabilistic framework that considers the evidence in the document
content and the query workload. We present two ways to combine these two
pieces of evidence, content value and Querying value: a model that considers both
components conditionally independent and a linear weighted model. Experiments
shows that using our techniques, we can suggest attributes that improve the
visibility of the documents with respect to the query workload by up to 50%. That
is, we show that using the query workload can greatly improve the annotation
process and increase the utility of shared data.

Facilitating document annotation using content and querying value

More Related Content

What's hot (17)

Similar to Facilitating document annotation using content and querying value (20)

More from IEEEFINALYEARPROJECTS (20)

Recently uploaded (20)

Facilitating document annotation using content and querying value