Generic Framework for Knowledge Classification-1

2
Generic Framework for Knowledge Classification
By Venkata Vineel

3
Agenda
•  Introduction
•  Problem at Hand
•  How is it solved?
•  Challenges
•  Skills and Career alignment
•  Q & A

4
Introduction
•  Masters in Computer Science
University of Utah, Salt Lake City, UT
•  Systems Engineering Intern
Internal tools team - Knowledge Management
Interests:
Scalability challenges, Machine Learning and Visualization.

5
Problem at Hand
•  Generic Framework for classifying knowledge
•  Classifying questions in Answer Hub

6
How did I solve it?
•  Developed a generic algorithm.
•  Answer Hub Knowledge Base that learns.

7
Project High Points
•  72% accuracy has been achieved.
[Chart: Rank Statistics. Axes: No. of Questions (0 to 18,000) and Rank Categories (ticks 1 to 23).]

8
Confusion matrix
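| Categories | V3 | GBX | C3 | Hadoop | BES | DAL | Raptor | Stratus | Security Platform | General | User Tracking | Experimentation | Service Frameworks | Search Services | Sherlock | Batch Framework | Trinity | Commerce OS | Teradata | Analytics Platform | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| V3 | 1552 | 2 | 1 | 2 | 6 | 263 | 217 | 3 | 23 | 455 | 2 | 41 | 290 | 9 | 3 | 6 | 0 | 0 | 0 | 0 | 2875 |
| GBX | 1 | 68 | 0 | 0 | 0 | 6 | 37 | 0 | 1 | 9 | 1 | 26 | 4 | 8 | 0 | 0 | 0 | 1 | 0 | 0 | 162 |
| C3 | 0 | 0 | 318 | 1 | 1 | 25 | 27 | 54 | 5 | 32 | 1 | 6 | 1 | 4 | 0 | 1 | 0 | 1 | 1 | 0 | 478 |
| Hadoop | 0 | 0 | 2 | 173 | 1 | 10 | 8 | 0 | 0 | 20 | 1 | 3 | 4 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 225 |
| BES | 11 | 0 | 0 | 0 | 300 | 59 | 39 | 1 | 0 | 5 | 0 | 1 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 438 |
| DAL | 67 | 0 | 1 | 0 | 3 | 2307 | 89 | 0 | 2 | 16 | 0 | 13 | 99 | 5 | 0 | 1 | 0 | 0 | 0 | 0 | 2603 |
| Raptor | 11 | 10 | 5 | 2 | 25 | 396 | 5352 | 3 | 62 | 212 | 26 | 184 | 337 | 25 | 6 | 17 | 0 | 0 | 1 | 0 | 6674 |
| Stratus | 1 | 0 | 82 | 2 | 1 | 40 | 188 | 435 | 4 | 40 | 0 | 13 | 6 | 0 | 2 | 1 | 0 | 1 | 0 | 0 | 816 |
| Security Platform | 4 | 0 | 0 | 0 | 0 | 32 | 38 | 0 | 174 | 11 | 0 | 6 | 129 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 396 |
| General | 100 | 2 | 12 | 15 | 6 | 129 | 258 | 16 | 13 | 1200 | 3 | 88 | 64 | 29 | 4 | 3 | 0 | 0 | 5 | 0 | 1947 |
| User Tracking | 3 | 0 | 0 | 1 | 0 | 16 | 43 | 0 | 3 | 8 | 126 | 41 | 10 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 252 |
| Experimentation | 1 | 1 | 0 | 0 | 0 | 27 | 40 | 0 | 1 | 8 | 0 | 868 | 29 | 1 | 0 | 0 | 0 | 0 | 3 | 0 | 979 |
| Service Frameworks | 124 | 3 | 0 | 0 | 6 | 90 | 299 | 2 | 67 | 83 | 0 | 56 | 1977 | 38 | 5 | 3 | 0 | 11 | 0 | 0 | 2764 |
| Search Services | 0 | 1 | 1 | 0 | 1 | 5 | 9 | 1 | 2 | 8 | 0 | 4 | 32 | 163 | 0 | 0 | 0 | 0 | 0 | 0 | 227 |
| Sherlock | 2 | 0 | 0 | 4 | 0 | 67 | 31 | 2 | 0 | 17 | 0 | 29 | 19 | 0 | 85 | 0 | 0 | 0 | 0 | 0 | 256 |
| Batch Framework | 11 | 0 | 0 | 2 | 2 | 100 | 92 | 2 | 2 | 10 | 0 | 2 | 22 | 0 | 0 | 67 | 0 | 0 | 1 | 0 | 313 |
| Trinity | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 6 |
| Commerce OS | 0 | 0 | 0 | 0 | 0 | 10 | 48 | 0 | 4 | 15 | 0 | 14 | 15 | 8 | 0 | 0 | 0 | 103 | 0 | 0 | 217 |
| Teradata | 0 | 0 | 1 | 1 | 0 | 10 | 0 | 0 | 0 | 0 | 1 | 16 | 2 | 1 | 0 | 1 | 0 | 0 | 49 | 0 | 82 |
| Analytics Platform | 0 | 0 | 1 | 1 | 0 | 5 | 1 | 0 | 1 | 23 | 1 | 14 | 0 | 3 | 1 | 0 | 0 | 0 | 1 | 11 | 63 |
| Total | 1888 | 87 | 424 | 204 | 352 | 3597 | 6816 | 519 | 364 | 2172 | 162 | 1429 | 3063 | 297 | 109 | 101 | 0 | 117 | 61 | 11 | 21773 |
| Percentage correct | 82.20 | 78.16 | 75.00 | 84.80 | 85.23 | 64.14 | 78.52 | 83.82 | 47.80 | 55.25 | 77.78 | 60.74 | 64.54 | 54.88 | 77.98 | 66.34 | n/a | 88.03 | 80.33 | 100.00 | |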
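
The "Percentage correct" figures are consistent with dividing each diagonal count by its column total (for example, 1552 / 1888 ≈ 82.20 for V3); the Trinity entry is undefined because that column's total is 0. A minimal sketch of that calculation follows. It uses a small made-up 3×3 matrix rather than the slide's data, and the function names are illustrative assumptions.

```python
from typing import List, Optional


def column_percent_correct(matrix: List[List[int]]) -> List[Optional[float]]:
    """Return diagonal / column total as a percentage for each category.

    A column whose total is zero has no defined percentage, so None is
    returned for it instead of dividing by zero.
    """
    size = len(matrix)
    result: List[Optional[float]] = []
    for j in range(size):
        column_total = sum(matrix[i][j] for i in range(size))
        if column_total == 0:
            result.append(None)
        else:
            result.append(100.0 * matrix[j][j] / column_total)
    return result


def overall_accuracy(matrix: List[List[int]]) -> float:
    """Return the share of all items on the diagonal, as a percentage."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


# Hypothetical toy matrix (not the slide's data); the last column is empty,
# which mirrors the undefined Trinity entry.
toy = [
    [8, 2, 0],
    [1, 9, 0],
    [1, 1, 0],
]
per_category = column_percent_correct(toy)
accuracy = overall_accuracy(toy)
```

Returning `None` for an empty column keeps the undefined case explicit rather than reporting a misleading 0%.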

9
Challenges and How We Overcame Them
•  Sparse data.
•  Large number of features.
•  Chi-square test came to the rescue.
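
As one way to picture how a chi-square test can prune a large, sparse feature set, the sketch below scores each term against one category using a 2×2 contingency table and keeps the highest-scoring terms. The slides do not describe the actual implementation, so the function names, toy documents, and labels here are assumptions.

```python
from typing import Dict, List, Set, Tuple


def chi_square(n11: int, n10: int, n01: int, n00: int) -> float:
    """Chi-square statistic for a 2x2 term/category contingency table.

    n11: docs in the category that contain the term
    n10: docs outside the category that contain the term
    n01: docs in the category without the term
    n00: docs outside the category without the term
    """
    n = n11 + n10 + n01 + n00
    denominator = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denominator == 0:
        return 0.0  # a row or column is empty, so the term carries no signal
    return n * (n11 * n00 - n10 * n01) ** 2 / denominator


def top_terms(docs: List[Tuple[Set[str], str]], category: str, k: int) -> List[str]:
    """Rank terms by chi-square score against one category; return the top k."""
    vocabulary = set().union(*(terms for terms, _ in docs))
    scores: Dict[str, float] = {}
    for term in vocabulary:
        n11 = sum(1 for terms, label in docs if term in terms and label == category)
        n10 = sum(1 for terms, label in docs if term in terms and label != category)
        n01 = sum(1 for terms, label in docs if term not in terms and label == category)
        n00 = sum(1 for terms, label in docs if term not in terms and label != category)
        scores[term] = chi_square(n11, n10, n01, n00)
    # Sort by descending score, breaking ties alphabetically for determinism
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus: (set of terms in a question, category label)
docs = [
    ({"hadoop", "job", "fails"}, "Hadoop"),
    ({"hadoop", "cluster", "job"}, "Hadoop"),
    ({"raptor", "build", "fails"}, "Raptor"),
    ({"raptor", "deploy", "job"}, "Raptor"),
]
selected = top_terms(docs, "Hadoop", k=2)
```

Note that the statistic is symmetric: a term that reliably signals a different category scores as highly as one that signals the target category, because both help separate the two.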

10
Skills Obtained
•  Lucene
•  Literature survey of existing techniques
•  Machine Learning and NLP
•  Exposure to productizing research

11
Alignment With My Career Path
•  Interested in Text and Machine Learning.
•  eBay has tonnes of data.

12
Future Scope for Improvement
•  User profile
•  Support Vector Machine, TF-IDF and k-NN algorithms
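
For readers unfamiliar with the techniques named above, here is a rough sketch of how TF-IDF weighting and a k-NN vote could be combined to categorize a question. It is not taken from the project: the whitespace tokenization, the smoothed IDF formula, the function names, and the toy questions are all assumptions made for illustration, and the Support Vector Machine option is not shown.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def idf_weights(corpus: List[List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n_docs = len(corpus)
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(tokens: List[str], idf: Dict[str, float]) -> Vector:
    """Weight raw term counts by IDF; terms unseen in training are dropped."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(question: str, training: List[Tuple[str, str]], k: int = 3) -> str:
    """Return the majority label among the k most similar training questions."""
    tokenized = [text.lower().split() for text, _ in training]
    idf = idf_weights(tokenized)
    vectors = [tfidf(tokens, idf) for tokens in tokenized]
    query = tfidf(question.lower().split(), idf)
    ranked = sorted(
        zip(vectors, (label for _, label in training)),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled questions, not taken from Answer Hub
training = [
    ("hadoop job fails on cluster", "Hadoop"),
    ("hadoop cluster disk quota", "Hadoop"),
    ("raptor build fails locally", "Raptor"),
    ("raptor deploy configuration", "Raptor"),
]
predicted = knn_predict("why does my hadoop job hang", training)
```

A sparse dictionary per question keeps the sketch small; a real system with many questions would more likely rely on an inverted index or a library vectorizer.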
