2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transformation

GLOBALSOFT TECHNOLOGIES
A Probabilistic Approach to String Transformation
Abstract:
Many problems in natural language processing, data mining, information retrieval, and
bioinformatics can be formalized as string transformation, which is a task as follows. Given an
input string, the system generates the k most likely output strings corresponding to the input
string. This paper proposes a novel and probabilistic approach to string transformation, which
is both accurate and efficient. The approach includes the use of a log linear model, a method
for training the model, and an algorithm for generating the top k candidates, whether there is or
is not a predefined dictionary. The log linear model is defined as a conditional probability
distribution of an output string and a rule set for the transformation conditioned on an input
string. The learning method employs maximum likelihood estimation for parameter estimation.
The string generation algorithm based on pruning is guaranteed to generate the optimal top k
candidates. The proposed method is applied to correction of spelling errors in queries as well
as reformulation of queries in web search. Experimental results on large scale data show that
the proposed approach is very accurate and efficient, improving upon existing methods in
terms of accuracy and efficiency in different settings.
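One common way to write a log linear model of the kind described above is sketched below. The notation is illustrative and is not taken from this summary: it assumes each transformation rule r carries a single real-valued weight \lambda_r, that R is the set of rules turning the input string s_i into the output string s_o, and that Z(s_i) is the set of all (output string, rule set) pairs reachable from s_i.

```latex
P(s_o, R \mid s_i) =
  \frac{\exp\left(\sum_{r \in R} \lambda_r\right)}
       {\sum_{(s', R') \in Z(s_i)} \exp\left(\sum_{r' \in R'} \lambda_{r'}\right)}

\lambda^{*} = \arg\max_{\lambda} \sum_{j=1}^{N}
  \log P\left(s_o^{(j)}, R^{(j)} \mid s_i^{(j)}\right)
```

Under these assumptions, the second line is the maximum likelihood objective over N training pairs, and generating the top k candidates amounts to finding the k pairs (s_o, R) with the largest summed rule weights, since the denominator is the same for every candidate of a given input.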
Architecture:

EXISTING SYSTEM:
Previous work on string transformation can be categorized into two groups. Some work
mainly considered efficient generation of strings. Other work tried to learn the model with
different approaches, but did not treat efficiency as an important factor. As a result, the
existing work does not focus on enhancing both the accuracy and the efficiency of string
transformation.
PROPOSED SYSTEM:
String transformation has many applications in data mining, natural language processing,
information retrieval, and bioinformatics. String transformation has been studied in different
specific tasks such as database record matching, spelling error correction, query reformulation
and synonym mining. The major difference between our work and the existing work is that we
focus on enhancement of both accuracy and efficiency of string transformation.
Modules:
1. Registration
2. Login
3. Spelling Error Correction
4. String Transformation
5. String mining

Modules Description
Registration:
In this module, an Author (Owner) or User must register first; only then
can he/she access the database.
Login:
In this module, any of the above-mentioned persons must log in by
giving their email ID and password.
Spelling Error Correction:
In this module, if a user wants to check the spelling, he/she can check
it and have it corrected automatically.
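As a rough illustration of automatic correction against a predefined dictionary, the sketch below ranks dictionary words by Levenshtein edit distance and returns the k closest. This is a simple stand-in technique, not the log linear model of the paper, and the class name SpellingSketch, the method names, and the sample words are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: edit-distance ranking, NOT the paper's log linear model.
final class SpellingSketch {

    // Classic dynamic-programming Levenshtein distance
    // (unit cost for insert, delete, and substitute).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns up to k dictionary words, nearest to the input first;
    // ties are broken alphabetically so the result is deterministic.
    static List<String> topK(String input, List<String> dictionary, int k) {
        List<String> sorted = new ArrayList<>(dictionary);
        sorted.sort(Comparator.comparingInt((String w) -> editDistance(input, w))
                .thenComparing(Comparator.naturalOrder()));
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    public static void main(String[] args) {
        List<String> dictionary = new ArrayList<>();
        dictionary.add("transformation");
        dictionary.add("translation");
        dictionary.add("transaction");
        System.out.println(topK("transfromation", dictionary, 2));
    }
}
```

Sorting the whole dictionary is fine for a small word list; for a large one, a bounded priority queue or an index over the dictionary would likely be needed to keep correction fast.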
String Transformation:
Here we use two techniques for searching the string: 1) String
Generation, 2) String Transformation.
String Generation:
It means we have generated 50,000 strings in alphabetical order, from a to z,
like a, aa, ….., z.
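The exact ordering used by the project is not stated, so the following is only a sketch under one assumption: strings are enumerated shortest first and alphabetically within each length (a, b, ..., z, aa, ab, ...), the way spreadsheet column labels are. The class name StringGenerationSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enumerates lowercase strings in length-then-alphabetical order.
final class StringGenerationSketch {

    // Maps n = 1, 2, 3, ... to "a", "b", ..., "z", "aa", "ab", ...
    // (bijective base-26 numbering over the letters a to z).
    static String nthString(int n) {
        StringBuilder sb = new StringBuilder();
        while (n > 0) {
            n--;                                  // shift to 0-based digit
            sb.append((char) ('a' + n % 26));     // least significant letter first
            n /= 26;
        }
        return sb.reverse().toString();
    }

    // Generates the first 'count' strings of the enumeration.
    static List<String> generate(int count) {
        List<String> out = new ArrayList<>(count);
        for (int n = 1; n <= count; n++) {
            out.add(nthString(n));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> strings = generate(50_000);
        System.out.println(strings.size() + " strings, first = " + strings.get(0)
                + ", 27th = " + strings.get(26));
    }
}
```

If strict dictionary order is wanted instead, the generated list can be sorted with `java.util.Collections.sort` afterwards.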
String Transformation:

It means we give the user the benefit of String Generation as well as
string aliases. This is useful to the user: for example, if the end user types “TKDE”, it is
equal to “Transactions on Knowledge and Data Engineering”.
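The alias behaviour in the TKDE example can be sketched as a simple lookup table. This is a minimal illustration that assumes aliases are matched case-insensitively and that unknown input is returned unchanged; the class name AliasSketch and its methods are hypothetical, and in the real project the table would presumably live in the database.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: in-memory alias table keyed by upper-cased alias.
final class AliasSketch {

    private final Map<String, String> aliases = new HashMap<>();

    // Stores an alias and the full string it stands for.
    void register(String alias, String expansion) {
        aliases.put(alias.trim().toUpperCase(Locale.ROOT), expansion);
    }

    // Returns the expansion for a known alias, or the input itself otherwise.
    String expand(String input) {
        return aliases.getOrDefault(input.trim().toUpperCase(Locale.ROOT), input);
    }

    public static void main(String[] args) {
        AliasSketch sketch = new AliasSketch();
        sketch.register("TKDE", "Transactions on Knowledge and Data Engineering");
        System.out.println(sketch.expand("TKDE"));
    }
}
```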
String mining:
The user can download the string with its meanings, and can also
download its substrings, its reverse, etc. The user can also check whether a given string is
present in the bunch of strings; if it is present, the result will be “String Found”, otherwise
“String NotFound”.
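The operations named in this module (reverse, substrings, and a membership check) can be sketched as follows. The class name StringMiningSketch and its method names are hypothetical; only the two result messages come from the description above, and "substrings" is assumed to mean all contiguous, non-empty substrings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch of the string mining operations.
final class StringMiningSketch {

    // Returns the characters of s in reverse order.
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // Returns every contiguous, non-empty substring of s, ordered by start index.
    static List<String> substrings(String s) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < s.length(); start++) {
            for (int end = start + 1; end <= s.length(); end++) {
                out.add(s.substring(start, end));
            }
        }
        return out;
    }

    // Reports whether s occurs in the given collection of strings.
    static String lookup(String s, Collection<String> bunch) {
        return bunch.contains(s) ? "String Found" : "String NotFound";
    }

    public static void main(String[] args) {
        List<String> bunch = Arrays.asList("a", "aa", "tkde");
        System.out.println(reverse("tkde"));
        System.out.println(substrings("abc"));
        System.out.println(lookup("aa", bunch));
    }
}
```

A string of length n has n(n+1)/2 such substrings, so for long strings it may be better to stream them to the download rather than hold the whole list in memory.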
System Configuration:
H/W System Configuration:
Processor - Pentium III
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Key Board - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
S/W System Configuration:
Operating System - Windows 95/98/2000/XP
Application Server - Tomcat 5.0/6.X
Front End - HTML, Java, JSP
Scripts - JavaScript
Server Side Script - Java Server Pages
Database - MySQL
Database Connectivity - JDBC
Conclusion:
In this paper, we have proposed a new statistical learning approach to string transformation.
Our method is novel and unique in its model, learning algorithm, and string generation
algorithm. Two specific applications are addressed with our method, namely spelling error
correction of queries and query reformulation in web search. Experimental results on two large
data sets and the Microsoft Speller Challenge show that our method improves upon the baselines
in terms of accuracy and efficiency. Our method is particularly useful when the problem
occurs on a large scale.
