INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)
ISSN 0976-6367 (Print), ISSN 0976-6375 (Online)
Volume 4, Issue 2, March-April (2013), pp. 221-228
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI), www.jifactor.com

KEY FRAME EXTRACTION METHODOLOGY FOR VIDEO ANNOTATION

Ms. Khushboo Khurana¹, Dr. M. B. Chandak²
¹ M.Tech Scholar, CSE Department, SRCOEM, Nagpur, India
² Associate Professor and Head, CSE Department, SRCOEM, Nagpur, India



   ABSTRACT

Recent advances in technology have made a tremendous amount of multimedia content available. The amount of video content is increasing, creating a need for systems that improve access to videos. This can be achieved by video annotation, which facilitates faster access to the videos. The first step towards video annotation is the extraction of key frames: instead of analyzing all the frames in the video, only the frames that contain the important information of the video are used for further processing. In this paper, a key frame extraction method that assists the video annotation process is discussed. The key frames are found by computing the edge difference between consecutive frames; frames whose difference exceeds a threshold are considered key frames.

   KEYWORDS: Key frame extraction, edge difference, video annotation

   1.      INTRODUCTION

The world as a living space is shrinking; or rather, have we found a new horizon to live in? It is true that we are expanding by leaps and bounds in a gigabyte and terabyte world. Recent advances in technology have made tremendous amounts of multimedia information available to the general population. A video, in the simplest of words, is an agglomeration of data. With the ever-escalating number of videos, systems for processing these videos need to be developed. Analyzing these videos as small data packets, for the simplicity of human effort, is the need of the hour.




Video annotation is a promising and essential step for content-based video search and retrieval. It refers to attaching metadata to a video for faster and easier access. Extracting key frames from the video and analyzing only these frames, instead of all the frames present in the video, can greatly improve the performance of such systems. Analysis of these key frames can help in forming the annotations for the video.
        A key frame is a frame that can represent the salient content and information of the video. The key frames extracted must summarize the characteristics of the video, and the image characteristics of the video can be tracked through the key frames in time sequence. A basic rule of key frame extraction is that it is better to extract too many key frames than too few [1].
In this paper, we propose an algorithm for key frame extraction to facilitate the video annotation process. The algorithm uses the edge difference between two consecutive frames to measure the difference between their contents. Our approach is shot-based: shots of the original video are first detected, and then one or more key frames are extracted from each shot.
        Methods of shot transition detection include pixel-based comparison, template matching, and histogram-based methods [2-3]. Pixel-based methods are susceptible to the motion of objects, so camera and object movement may be falsely reported as transitions; moreover, since every pixel is compared, the time required is high. Template matching is apt to result in erroneous detection if it is used alone. Histogram-based methods entirely lose location information; for example, two images with similar histograms may have completely different content (see the sketch below). We have therefore used an edge-based method, which considers the content of the frames.
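
As a minimal illustration of this limitation (not part of the original paper), the following Python sketch compares a small grayscale image with its horizontal mirror: the two have identical intensity histograms even though their content layout differs.

import numpy as np

# Build a small grayscale "image" with a bright square in the top-left corner.
img = np.zeros((8, 8), dtype=np.uint8)
img[:4, :4] = 200

# Its horizontal mirror has the same pixels in a different layout.
mirrored = img[:, ::-1]

# The intensity histograms are identical, yet the contents differ:
h1, _ = np.histogram(img, bins=256, range=(0, 256))
h2, _ = np.histogram(mirrored, bins=256, range=(0, 256))
print(np.array_equal(h1, h2))         # True  -> histograms cannot tell them apart
print(np.array_equal(img, mirrored))  # False -> the pixel layout differs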
The rest of this paper is organized as follows. Section 2 describes the uses of key frame extraction. Section 3 presents related work in the field of key frame extraction. In Section 4, the proposed approach is described with the help of an algorithm and a flowchart. Section 5 presents the results, and finally, we conclude in Section 6.

    2.      USES OF KEY-FRAME EXTRACTION

•   Video transmission: In order to reduce the transfer stress on the network and invalid information transmission, the transmission, storage, and management techniques for video information become more and more important [1].
            When a video is being transmitted, the use of key frames reduces the amount of data required in video indexing and provides a framework for dealing with the video content [4].
            In [5], a key-frame-based online coding scheme for video transmission is proposed. Key frames are fixed in advance, and each frame can only choose the latest coded and reconstructed key frame as its reference frame. After coding and packetisation, compressed video packets are transmitted with differentiated service classes. The key frame, along with difference values, is sent from the source; using the key frame picture and the difference values, the picture is reconstructed at the destination (a toy sketch of this reconstruction idea follows this list).
•   Video summarization: Video summarization is a compact representation of a video sequence. It is useful for various video applications such as video browsing and retrieval systems. A video summary can be a preview sequence, which can be a collection of key frames, i.e., a set of chosen frames of the video. Although key-frame-based video summarization may lose the spatio-temporal properties and audio content of the original video sequence, it is the simplest and most common method. When temporal order is maintained in selecting the key frames, users can locate specific video segments of interest by choosing a particular key
frame using a browsing tool. Key frames are also effective in representing the visual content of a video sequence for retrieval purposes: video indexes may be constructed based on visual features of key frames, and queries may be directed at key frames using image retrieval techniques [6].
•   Video annotation: Video annotation is the extraction of information about a video and the attachment of this information to the video, which can help in browsing, searching, analysis, retrieval, comparison, and categorization. To annotate is to attach data to some other piece of data (i.e., to add metadata to data) [7].
    To speed up access to a video, it is annotated. It is not necessary to analyze each video frame for this, so key frames are found and only these are analyzed for annotation purposes.
•   Video indexing: Key frames reduce the amount of data required in video indexing and provide a framework for dealing with the video content.
•   Before downloading any video over the internet, if key frames are shown beside it, users can predict the content of the video and decide whether it is pertinent to their search.
•   Other applications include creating chapter titles in DVDs and producing prints from video.
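
The following toy Python sketch illustrates the key-frame-plus-difference reconstruction idea described under video transmission above; it is a simplified illustration, not the actual coding scheme of [5].

import numpy as np

rng = np.random.default_rng(0)

# A "key frame" and a later frame that differs from it only slightly.
key_frame = rng.integers(0, 256, size=(4, 4)).astype(np.int16)
frame = key_frame.copy()
frame[1, 2] += 5  # a small content change

# The source sends the key frame once, then only the difference values.
diff = frame - key_frame

# The destination reconstructs the frame from the key frame and the differences.
reconstructed = key_frame + diff
assert np.array_equal(reconstructed, frame)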

    3.      RELATED WORK

The work in the area of key frame extraction is either in the spatial domain or in the compressed domain. In [8], key frames are extracted using the histogram difference between two consecutive frames.
         Jin-Woo Jeong, Hyun-Ki Hong, and Dong-Ho Lee proposed an approach in which the detection of a video shot and its corresponding key frame is performed based on the visual similarity between adjacent video frames. They used the Euclidean distance measure to compute visual similarity between video frames, and the first frame of each shot is selected as the key frame [9].
Janko Calic and Ebroul Izquierdo proposed an algorithm for scene change detection and key frame extraction [10]. It generates frame difference metrics by analyzing statistics of the macro-block features extracted from MPEG videos, and temporal segmentation is used to detect the scene changes.
         A more elaborate method is employed in [11], which uses shot boundary detection to segment the video into shots and the k-means algorithm to determine cluster representatives for each shot, which are then used as key frames. The MPEG-7 Color Layout Descriptor (CLD) is used as the feature for computing differences between consecutive frames. Since k-means is employed after finding the shot boundaries, the complexity of this method is higher.
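
A rough Python sketch of this cluster-representative idea follows. It assumes generic per-frame feature vectors (e.g. flattened color histograms) instead of the MPEG-7 CLD, and uses scikit-learn's KMeans rather than the authors' implementation.

import numpy as np
from sklearn.cluster import KMeans

def cluster_key_frames(features, n_clusters=3):
    # `features` is an (n_frames, dim) array of per-frame feature vectors.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    key_indices = []
    for c in range(n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        # Representative = the member frame closest to the cluster centre.
        dists = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        key_indices.append(int(members[dists.argmin()]))
    return sorted(key_indices)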

    4.      THE PROPOSED APPROACH

The first step towards video annotation is the extraction of key frames. The set of key frames must contain the important frames, so that the contents of the video can be described in the later processing stages. After the extraction of these important frames, instead of analyzing the contents of all video frames, only the key frame images are analyzed to produce the annotation. The number of frames should not be reduced to an extent that important information is no longer covered by the key frames. Moreover, since the key frames are analyzed after the extraction process, the extraction algorithm should not be very complex or time consuming.





  4.1      ALGORITHM FOR KEY FRAME EXTRACTION FROM VIDEO

Not all frames in a video contain important information; each frame is typically a slight variation of the previous one. It is not meaningful to analyze all the frames, so we find those frames that contain important information.
         For the detection of key frames, we use the edge difference to calculate the difference between two consecutive frames. Only when the difference exceeds a threshold is one of the two frames considered a key frame. The reason we choose the edge difference is that edges are content dependent. The detailed procedure for key frame extraction from a video is as follows:

Input: Video V, consisting of N frames
Output: Key frames for the input video

Algorithm KeyFrameExtraction
{
Step 1:
      For k = 1 to (N-1)
      {
      1. Read frames V_k and V_(k+1)
      2. Obtain the gray-level images of V_k and V_(k+1):
               G_k     = gray image of V_k
               G_(k+1) = gray image of V_(k+1)
      3. Apply the Canny edge detector to G_k and G_(k+1), giving edge maps
         E_k and E_(k+1), and compute their edge difference:

               diff(k) = Σ_i Σ_j | E_k(i,j) - E_(k+1)(i,j) |

         where i and j are the row and column indices.
      }
Step 2:
      Compute the mean and standard deviation of the differences:
               Mean,               M = (1/(N-1)) Σ_{k=1}^{N-1} diff(k)
               Standard deviation, S = sqrt( (1/(N-1)) Σ_{k=1}^{N-1} (diff(k) - M)² )
Step 3:
      Compute the threshold value:
               Threshold = M + a × S
      where a is a constant.
Step 4:
      Find the key frames:
      For k = 1 to (N-1)
      {
           if diff(k) > Threshold
                write frame V_(k+1) as an output key frame
      }
}



Video V is given as the input; this video consists of N frames in total. We first read the 1st and 2nd frames, convert them to gray scale, and find their edge difference using the Canny edge detector. The difference is stored in diff(1). Next, the 2nd and 3rd frames are read and the edge difference of their gray scale images is computed; this difference is stored in diff(2). Then the 3rd and 4th frames are considered, then the 4th and 5th, and so on. The procedure is repeated over all N frames of the video, so diff(k) contains the differences between all consecutive frame pairs of the input video V. Fig. 1 demonstrates how the edge differences are computed; as shown in the figure, the last difference index is k = N - 1.

Fig.1. Computation of edge differences between consecutive frames
The Canny edge detector yields an edge map for each frame, and the difference of two edge maps is a matrix; hence diff(k) is calculated by summing over all rows and columns to obtain a single difference value:

         diff(k) = Σ_i Σ_j | E_k(i,j) - E_(k+1)(i,j) |

where i and j are the row and column indices.
After obtaining the frame differences, the mean and standard deviation are calculated (refer to Step 2 of the algorithm). The threshold is then calculated using the formula:

         Threshold = mean + a × standard deviation

where a is a constant. After trying various values, we used a = 2, as it gave the desired results.
         Only the differences that exceed the threshold are considered; such a difference indicates that the content has changed significantly and may contain important information. If the difference of two consecutive frames exceeds the threshold, the latter frame is considered a key frame. All the key frame images are stored in a folder.
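
A minimal Python sketch of the above procedure is given below, using OpenCV and NumPy. The paper's own implementation is in MATLAB R2012a, so this is an illustrative reconstruction rather than the authors' code; in particular, the Canny thresholds (100, 200), the file names, and the output folder are assumptions.

import os
import cv2
import numpy as np

def extract_key_frames(video_path, a=2.0):
    # Read every frame and compute its Canny edge map (thresholds are assumed).
    cap = cv2.VideoCapture(video_path)
    frames, edges = [], []
    ok, frame = cap.read()
    while ok:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        edges.append(cv2.Canny(gray, 100, 200))
        frames.append(frame)
        ok, frame = cap.read()
    cap.release()

    # diff(k): sum of absolute edge-map differences of consecutive frames (Step 1).
    diffs = np.array([
        np.abs(edges[k].astype(np.int32) - edges[k + 1].astype(np.int32)).sum()
        for k in range(len(edges) - 1)
    ])

    # Threshold = mean + a * standard deviation (Steps 2 and 3; the paper uses a = 2).
    threshold = diffs.mean() + a * diffs.std()

    # Step 4: keep the latter frame of each pair whose difference exceeds the threshold.
    return [(k + 1, frames[k + 1]) for k in range(len(diffs)) if diffs[k] > threshold]

if __name__ == "__main__":
    os.makedirs("keyframes", exist_ok=True)
    for idx, frame in extract_key_frames("input_video.mp4"):
        cv2.imwrite(os.path.join("keyframes", "frame_%04d.png" % idx), frame)

Holding all frames in memory keeps the sketch close to the paper's two-pass structure (all differences first, then thresholding); a production implementation would stream the video twice or buffer only candidate frames.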

4.2      FLOWCHART FOR KEY-FRAME EXTRACTION FROM VIDEO

  The flowchart for key frame extraction from a video is shown in Fig.2.





                                     Fig.2. Flowchart for key frame extraction


  5.       RESULTS
Videos mainly from the transport domain, containing an airplane, bus, car, or bike, were considered as input to the system. The videos were downloaded from YouTube; the audio part of each video is not considered. Videos with slight camera movement and with little or no background change were used. We implemented the algorithm in MATLAB R2012a.
          The input video containing an airplane had more than 500 frames; some of its frames are shown in Fig. 3.




                                          Fig.3. Frames of the input video


The edge difference between consecutive frames was then computed. The edge difference between the 1st and 2nd frames was 4138; between the 2nd and 3rd, 3352; between the 3rd and 4th, 4185; between the 4th and 5th, 3564; and so on. After finding the edge differences between all consecutive frames, the following values were computed:

Max                        5734
Min                         162
Median                     2725
Mean                       2822.2
Standard deviation         1357.5
Threshold                  5537.1
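
These values are consistent with the threshold formula of Section 4 with a = 2:

         Threshold = M + a × S ≈ 2822.2 + 2 × 1357.5 = 5537.2

which matches the reported threshold of 5537.1 up to rounding of the mean and standard deviation. Note that only differences near the maximum (5734) exceed this threshold, which is why only a few frames of this video are selected as key frames.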

Frames whose edge difference exceeds the threshold value are selected as key frames. Fig. 4 shows the frames extracted as key frames for the input video whose frames are shown in Fig. 3.




                                     Fig.4. Output key frames for airplane video


The result of key frame extraction on an input video containing a car and humans, along with the frame numbers, is shown in Fig. 5. This video had a still background with humans moving through it. Analysis of these key frames can yield semantic annotation of the video; the actions or events can also be analyzed.





                                  Fig.5. Output key frames for car video
Fig. 6 shows the result on a video where the change in content is high: many cars are moving on the road. The result shows that each car is captured by the key frames.




                Fig.6. Output key frames for video with more amount of content change.




  6.       CONCLUSION AND FUTURE WORK

Key frames are extracted depending on the contents of the video and on how much those contents change. As seen in the first video, the number of key frames is small because the change of content in this video was very small. In the third video example above, the change of content, and hence the amount of information in the video, is greater, so more frames are extracted as key frames.
         Since the key frames are subsequently processed for annotation, the important information must not be missed. Our algorithm can be improved by further reducing the number of key frames extracted. This can be done by adding one more pass: after the first pass, the extracted key frames can again be given as input to the algorithm. This will reduce the redundant frames, i.e., frames with similar contents, but the extra pass will increase the execution time. Since the frames need to be analyzed after key frame extraction for the purpose of annotation, some amount of redundancy can be tolerated rather than increasing the execution time.
         In the future, we plan to design a video annotation system that utilizes the key frames obtained from the above algorithm.
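
The second pass described above could be sketched in Python as follows, reapplying the same edge-difference selection to the key frames from the first pass. This is a hypothetical helper, assuming frames as BGR arrays and the same assumed Canny thresholds as the earlier sketch.

import cv2
import numpy as np

def second_pass(key_frames, a=2.0):
    # Recompute edge differences over the key frames only and re-threshold,
    # pruning frames with similar content (at the cost of another pass).
    edges = [cv2.Canny(cv2.cvtColor(f, cv2.COLOR_BGR2GRAY), 100, 200)
             for f in key_frames]
    diffs = np.array([
        np.abs(edges[k].astype(np.int32) - edges[k + 1].astype(np.int32)).sum()
        for k in range(len(edges) - 1)
    ])
    thr = diffs.mean() + a * diffs.std()
    return [key_frames[k + 1] for k in range(len(diffs)) if diffs[k] > thr]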

  REFERENCES

[1] G. Liu and J. Zhao, "Key Frame Extraction from MPEG Video Stream", Proceedings of the Second Symposium on International Computer Science and Computational Technology (ISCSCT '09), China, 26-28 Dec. 2009, pp. 7-11.
[2] C. F. Lam and M. C. Lee, "Video segmentation using color difference histogram", Lecture Notes in Computer Science, New York: Springer Press, pp. 159-174, 1998.
[3] A. Hampapur, R. Jain, and T. Weymouth, "Production model based digital video segmentation", Multimedia Tools and Applications, vol. 1, no. 1, pp. 9-46, 1995.
[4] T. Liu, H. Zhang, and F. Qi, "A novel video key-frame-extraction algorithm based on perceived motion energy model", IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 10, pp. 1006-1013, 2003.
[5] Q. Zhang and G. Liu, "A key-frame-based error resilient coding scheme for video transmission over differentiated services networks", in Proceedings of Packet Video 2007, 12-13 Nov. 2007, pp. 85-90.
[6] P. Mundur, Y. Rao, and Y. Yesha, "Keyframe-based Video Summarization using Delaunay Clustering", International Journal on Digital Libraries, vol. 6, no. 2, April 2006, pp. 219-232.
[7] K. Khurana and M. B. Chandak, "Study of Various Video Annotation Techniques", International Journal of Advanced Research in Computer and Communication Engineering, vol. 2, no. 1, January 2013.
[8] S. Thakare, "Intelligent Processing and Analysis of Image for Shot Boundary Detection", International Journal of Engineering Research and Applications, vol. 2, no. 2, Mar-Apr 2012, pp. 366-369.
[9] J. Jeong, H. Hong, and D. Lee, "Ontology-based Automatic Video Annotation Technique in Smart TV Environment", IEEE Transactions on Consumer Electronics, vol. 57, no. 4, November 2011.
[10] J. Calic and E. Izquierdo, "Efficient Key-frame Extraction and Video Analysis", International Symposium on Information Technology, IEEE, April 2002.
[11] D. Borth, A. Ulges, C. Schulze, and T. M. Breuel, "Keyframe Extraction for Video Tagging & Summarization", 2008.
[12] Reeja S R and N. P. Kavya, "Motion Detection for Video Denoising - The State of Art and the Challenges", International Journal of Computer Engineering & Technology (IJCET), vol. 3, no. 2, 2012, pp. 518-525, ISSN Print: 0976-6367, ISSN Online: 0976-6375.


