Text Summarization
For Review And Feedback
By: Aman Sadhwani
Monday, May 18, 2015
What is Text Summarization?
And why do we need it?
• A summary is a text that reflects the main and most important sentences of the original text. In text summarization, the summary is generated by a computer.
• In recent years the amount of textual information has been growing rapidly, day by day. It becomes harder for users to read all of it, and they lose interest. Text summarization was introduced to address this problem.
Types of Text Summarization
• 1) Extraction: In extractive text summarization, the summary is generated by selecting a set of words, phrases, sentences or paragraphs from the original document.
• 2) Abstraction: Abstractive methods build a semantic representation of the text and then use natural language processing techniques to generate a summary that is closer to one written by a human. Such a summary may contain words that are not found in the original document. This method is an area of active research and is in greater demand.
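The difference between the two types can be illustrated with a small check. The sketch below is only an illustration and not part of the proposed system; the helper names `words` and `novel_words` are hypothetical. It lists the words of a summary that never occur in the source document: an extractive summary should produce none, while an abstractive one may produce several.

```python
import re

def words(text):
    """Return the set of lowercase words in a text."""
    return set(re.findall(r"[a-z']+", text.lower()))

def novel_words(summary, document):
    """Words used in the summary that do not appear in the document."""
    return words(summary) - words(document)

document = "The storm closed the airport. Flights resumed on Tuesday."
extractive = "The storm closed the airport."        # copied from the document
abstractive = "Bad weather halted flights briefly."  # reworded by a human

print(novel_words(extractive, document))
print(novel_words(abstractive, document))
```

An empty result does not prove a summary is extractive, but a non-empty one shows that new wording was introduced.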
Proposed System
• We have developed and compared two text summarization techniques:
1) Reduction based
2) Intersection based
How the Reduction Algorithm Works
• Step 1 - Takes a text as input.
• Step 2 - Splits it into one or more paragraphs.
• Step 3 - Splits each paragraph into one or more sentences.
• Step 4 - Splits each sentence into one or more words.
• Step 5 - Gives each sentence a weight (a floating-point value) by comparing its words to a predefined dictionary called "stopWords.txt".
• If a word of a sentence matches any word in the predefined dictionary, that word is considered low weighted.
• Step 6 - Prepares an ordered list of weighted sentences (relatively high-weighted sentences come first and low-weighted sentences come last).
• Step 7 - Stores each sentence from the ordered list in the output variable (a list) until it reaches the reduction ratio (it uses a formula to determine the maximum number of sentences to put in the output list).
• Step 8 - Returns the output list.
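The eight steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation behind the slides: `STOP_WORDS` is a tiny hypothetical stand-in for the contents of "stopWords.txt", the weight is assumed to be the number of non-stop words in a sentence, and the limit formula `ceil(ratio * number of sentences)` is a guess, since the slides do not give the formula.

```python
import math
import re

# Hypothetical stand-in for "stopWords.txt"; the real list is not shown in the slides.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "it", "on", "for"}

def split_sentences(text):
    """Steps 2-3: split text into paragraphs, then paragraphs into sentences."""
    sentences = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if sentence:
                sentences.append(sentence)
    return sentences

def sentence_weight(sentence):
    """Steps 4-5 (assumed weighting): count the words that are not stop words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return float(sum(1 for token in tokens if token not in STOP_WORDS))

def reduce_text(text, ratio=0.5):
    """Steps 6-8: rank sentences by weight and keep the top share."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    ranked = sorted(sentences, key=sentence_weight, reverse=True)
    # Assumed formula for the maximum number of output sentences.
    limit = max(1, int(math.ceil(len(sentences) * ratio)))
    return ranked[:limit]

sample = ("The cat is in the hat. Summarization systems rank sentences by "
          "content words. It is on.\n\nStop words carry little meaning.")
for line in reduce_text(sample, ratio=0.5):
    print(line)
```

With this weighting, a sentence made only of stop words (such as "It is on.") gets weight 0 and is the first to be dropped. Note that long sentences are favoured, so a real system might normalize the weight by sentence length.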
How the Intersection Algorithm Works
1. Split the input text into paragraphs.
2. Split each paragraph into sentences.
3. Split each sentence into words.
4. Calculate the intersection between two sentences.
5. Remove non-alphabetic characters from each sentence.
6. Convert the content into a dictionary.
7. Build the sentence dictionary.
8. Return the best sentence in each paragraph.
9. Get the best sentences according to the dictionary.
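One way to read these steps is sketched below. It is an illustration under assumptions rather than the code used for the slides: all function names are hypothetical, the intersection of two sentences is assumed to be the number of shared words divided by the average number of distinct words in the two sentences, and a sentence's score is assumed to be the sum of its intersections with every other sentence.

```python
import re

def split_paragraphs(text):
    """Step 1: split the input text into paragraphs on blank lines."""
    return [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]

def split_sentences(paragraph):
    """Step 2: split a paragraph into sentences."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

def intersection_score(sent1, sent2):
    """Steps 3-4: shared words, normalized by the average distinct-word count (assumed)."""
    words1 = set(re.findall(r"[a-z]+", sent1.lower()))
    words2 = set(re.findall(r"[a-z]+", sent2.lower()))
    if not words1 or not words2:
        return 0.0
    return len(words1 & words2) / ((len(words1) + len(words2)) / 2.0)

def clean(sentence):
    """Step 5: strip non-alphabetic characters to form a dictionary key."""
    return re.sub(r"[^A-Za-z]+", "", sentence)

def rank_sentences(text):
    """Steps 6-7: map each cleaned sentence to its total intersection score."""
    sentences = [s for p in split_paragraphs(text) for s in split_sentences(p)]
    scores = {}
    for i, s1 in enumerate(sentences):
        scores[clean(s1)] = sum(
            intersection_score(s1, s2) for j, s2 in enumerate(sentences) if i != j
        )
    return scores

def summarize(text):
    """Steps 8-9: keep the highest-scoring sentence of each paragraph."""
    scores = rank_sentences(text)
    summary = []
    for paragraph in split_paragraphs(text):
        sentences = split_sentences(paragraph)
        if sentences:
            summary.append(max(sentences, key=lambda s: scores.get(clean(s), 0.0)))
    return summary

sample = ("Cats chase mice. Mice fear cats and dogs. Birds sing.\n\n"
          "Dogs chase cats. Dogs bark loudly at night.")
for line in summarize(sample):
    print(line)
```

Because this sketch returns one sentence per paragraph, a document with only one or two paragraphs yields only a one- or two-line summary.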
Flow Chart
Screen shots
Conclusion
• From the last table we can say that intersection is faster than reduction.
• But reduction creates a better summary than intersection.
• Intersection works fine on some documents but generates only a one- or two-line summary on others.
• This is because intersection is the most basic algorithm for text summarization. Unlike reduction, it does not use any NLP libraries.
Hardware & Software Requirements
• Minimum Hardware Requirements
  • Processor: Intel Pentium II or higher
  • RAM: 128 MB or higher
  • Monitor, keyboard, mouse
  • Printer (optional)
  • Hard disk: 20 GB or higher
• Software Requirements
  • OS: Windows XP or higher
  • Java installed on the machine
  • Python 2.7 installed on the machine
Tools used
• NetBeans
• Python 2.7 IDLE
Future Enhancements
• Support summarization for multiple file types.
• User-wise document management.
• Multi-document summarization.
• Improved summarization algorithms.
THANK YOU