Functional Requirements for an Interlinear Text Editor
Baden Hughes (1), Catherine Bow (1) and Steven Bird (1,2)
(1) University of Melbourne
(2) Linguistic Data Consortium, University of Pennsylvania
Overview
  • Introduction
  • Motivation
  • Selection Process
  • Evaluation Process
  • Functional Requirements
  • Conclusion
Introduction
  • Interlinear text is a highly prevalent linguistic data type, both in field linguistic data and in collated corpora
Motivation
  • Previous work has provided an open interlinear encoding standard using XML technologies and demonstrated the flexibility of such an approach (Bow, Hughes & Bird 2003; Hughes, Bird & Bow 2003)
  • Survey-based results of common functionality across a range of interlinear text handling applications
  • Motivated by the need to build a new interlinear text editing tool and a reusable API for XML-based interlinear text
Selection Process
  • Discovered 40+ linguistically grounded applications with at least some interlinear functionality
  • Technically oriented selection criteria:
      - end-user applications rather than application development frameworks
      - obtainable at low or zero cost
      - require only a moderate level of technology literacy to install and use
      - applications which can be used in multiple contexts rather than a specialised single use
      - support for both unimodal and multimodal data
      - exclusion of presentation-oriented applications
Evaluation Process
  • Use of real linguistic data, motivated by the need to:
      - replicate typical use patterns
      - establish a data baseline for comparison
  • Cross-platform evaluation where possible
  • Linguistically oriented evaluation criteria from a functional perspective:
      - General editing
      - Structural segmentation and alignment
      - Flexible content model
      - Import and export capability
      - Non-Roman script / Unicode
      - Customisable presentation output
Functional Requirements
  • Seeking commonly implemented functions for working with interlinear text, and the degrees of granularity at which these functions can be implemented
  • Functions derived from previous work which has contributed to the definition of the range and type of operations performed on interlinear text (Bickford 1997; Kew & McConnell 1997; Maeda & Bird 2000; Bird et al 2002; Maeda et al 2002)
  • Functions derived from the selection process:
      - Application and API
      - Usable through whole project lifecycle
      - Multimodal and unimodal support
      - Cross-platform API
      - Freely redistributable
  • Functions derived from the evaluation process …
General Editing Functions
  • Text selection
      - one or more constituents at morph, word or phrase level
      - differentiate content from structure: select across morph/word/phrase cells and obtain content, structure or both
  • Cut, copy & paste
      - any unit of selected text, with or without rendered orthographic support
      - combinations will facilitate split and merge type actions
      - multiple selection clipboard
  • Search
      - regular expressions
      - within selection/range
      - multiple files
      - cache of previous searches
      - result navigation within text or index
  • Replace
      - as for search, with the addition of optional replacement within text or index
  • Multiple level redo and undo
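The search requirement (regular expressions, restricted to a selection or range, on a chosen tier) could be sketched roughly as follows. This is only an illustration: the `Word` class, the tier names and the toy English data are all assumptions made for this sketch and do not come from any surveyed application.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word cell: a surface form plus aligned morph and gloss tiers."""
    form: str
    morphs: list[str]
    glosses: list[str]


def search(words, pattern, tier="form", start=0, end=None):
    """Return (word_index, matched_value) pairs for regex hits on one tier.

    The search is limited to words[start:end], which stands in for a
    user selection or range.
    """
    rx = re.compile(pattern)
    hits = []
    for i, w in enumerate(words[start:end], start=start):
        # The "form" tier holds one string; the other tiers hold one value per morph.
        values = [w.form] if tier == "form" else getattr(w, tier)
        for value in values:
            if rx.search(value):
                hits.append((i, value))
    return hits


# Invented toy data, for illustration only.
words = [
    Word("dogs", ["dog", "-s"], ["dog", "PL"]),
    Word("barked", ["bark", "-ed"], ["bark", "PST"]),
]

suffix_hits = search(words, r"^-", tier="morphs")
range_hits = search(words, r"^-", tier="morphs", start=1)
```

Keeping the tier as a parameter means the same routine can serve search over content at morph, word or gloss level; a replace operation would walk the same hits.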
Segmentation and Alignment
  • Granularity of segmentation and alignment
      - Support for morph, word or phrase segmentation
      - Annotation attachment to range of morphs, words or phrases
  • Ontology support
      - Links to discipline standard (eg GOLD)
      - Links to user specified ontologies for annotations
  • Multimodal integration
      - Any combination of: text, text + audio, text + video, audio + video, text + audio + video
      - user extensible annotation tiers
  • Cross-resource linking (eg XML ID/IDREF construct)
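One way to picture annotation attachment to a range of morphs, using ID/IDREF-style linking, is the sketch below. The element and attribute names (`phrase`, `word`, `morph`, `annotation`, `start`, `end`) are hypothetical and are not taken from the encoding standard cited in the Motivation slide.

```python
import xml.etree.ElementTree as ET

# Build a phrase > word > morph hierarchy; every unit gets a unique id.
phrase = ET.Element("phrase", id="p1")
segmented = [("dogs", ["dog", "-s"]), ("barked", ["bark", "-ed"])]
for wi, (form, morphs) in enumerate(segmented, start=1):
    word = ET.SubElement(phrase, "word", id=f"p1.w{wi}", form=form)
    for mi, morph in enumerate(morphs, start=1):
        ET.SubElement(word, "morph", id=f"p1.w{wi}.m{mi}").text = morph

# An annotation refers to a range of morphs by id instead of containing them.
note = ET.SubElement(phrase, "annotation", start="p1.w1.m1", end="p1.w1.m2")
note.text = "plural noun"


def resolve(root, annotation):
    """Return the text of every morph in the annotation's start..end range."""
    morphs = list(root.iter("morph"))          # document order
    ids = [m.get("id") for m in morphs]
    lo = ids.index(annotation.get("start"))
    hi = ids.index(annotation.get("end"))
    return [m.text for m in morphs[lo:hi + 1]]


covered = resolve(phrase, note)
```

Because the annotation holds references only, the same mechanism could in principle point into another resource, which is the idea behind cross-resource linking.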
Flexible Content Models
  • Incomplete annotation
      - ambiguous (multi-segment)
      - partial annotations
      - free text annotations
  • Standoff annotation
      - open format
      - non-resource dependent
      - structurally constrained and linked
  • Ontology support
      - Links to discipline standard (eg GOLD)
      - Links to user specified ontologies
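Standoff annotation with partial and free text annotations might look like the following minimal sketch, in which the base text is never modified and each annotation records character offsets into it. The `Standoff` class, the tier names and the data are assumptions for illustration.

```python
from dataclasses import dataclass

TEXT = "dogs barked"  # base text; annotations never alter it


@dataclass(frozen=True)
class Standoff:
    """Hypothetical standoff record: a half-open [start, end) span plus a tier and value."""
    start: int
    end: int
    tier: str
    value: str


annotations = [
    Standoff(0, 3, "gloss", "dog"),
    Standoff(3, 4, "gloss", "PL"),
    Standoff(5, 9, "gloss", "bark"),
    # The final morph "ed" (9..11) has no gloss yet: a partial annotation.
    Standoff(0, 11, "free", "The dogs barked."),         # free text annotation
    Standoff(5, 11, "note", "segmentation uncertain"),   # overlaps other spans
]


def tier_view(text, anns, tier):
    """Pair each annotated substring with its value for one tier, in text order."""
    selected = sorted((a for a in anns if a.tier == tier), key=lambda a: a.start)
    return [(text[a.start:a.end], a.value) for a in selected]


def unannotated(text, anns, tier):
    """Return the character offsets that no annotation on the tier covers."""
    covered = set()
    for a in anns:
        if a.tier == tier:
            covered.update(range(a.start, a.end))
    return [i for i in range(len(text)) if i not in covered]


gloss_view = tier_view(TEXT, annotations, "gloss")
gaps = unannotated(TEXT, annotations, "gloss")
```

Overlapping and incomplete tiers coexist here without any change to the text, which is the property the slide asks of a flexible content model.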
Import and Export
  • Native XML data format
      - Support for DTD or schema based XML interlinearised materials
  • Format conversion
      - Support for common interlinear formats such as Shoebox/Toolbox, ELAN, TASX, AGTK/InterTrans
      - Parsers for SGML/HTML/XML
  • Change/Version control
      - Internal provenance tracking
      - Links to external change/version control systems, eg CVS/RCS/Subversion/MKS …
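Format conversion from a backslash-marker record, in the style of Shoebox/Toolbox files, into XML can be sketched as below. The marker names (`\tx`, `\mb`, `\ge`, `\ft`) follow a common convention but are project-specific in practice, and the output element names are invented for this example; real files need far more careful handling (continuation lines, column alignment, encodings).

```python
import xml.etree.ElementTree as ET

# Invented sample record; markers are assumed to be text, morpheme break,
# English gloss and free translation.
SAMPLE = (
    "\\tx dogs barked\n"
    "\\mb dog -s bark -ed\n"
    "\\ge dog PL bark PST\n"
    "\\ft The dogs barked.\n"
)


def parse_record(text):
    """Map each backslash marker to the rest of its line."""
    record = {}
    for line in text.splitlines():
        if not line.startswith("\\"):
            continue  # this sketch ignores continuation and blank lines
        marker, _, value = line[1:].partition(" ")
        record[marker] = value.strip()
    return record


def to_xml(record):
    """Serialise a parsed record as one <tier> element per marker."""
    root = ET.Element("record")
    for marker, value in record.items():
        ET.SubElement(root, "tier", marker=marker).text = value
    return ET.tostring(root, encoding="unicode")


record = parse_record(SAMPLE)
xml_text = to_xml(record)
```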
Non-Roman Scripts
  • Unicode from Day 1
  • Flexible encodings
      - UTF-8 and UTF-16
      - Retain support for legacy code pages
  • Rendering for NRS (non-Roman scripts)
      - Data entry using native keyboarding, glyph map, Unicode character codes
      - Using open-source off-the-shelf Unicode rendering toolkits rather than reimplementing
  • Directionality
      - Horizontal (L>R/R>L) support
      - Vertical (T>B/B>T) modality support
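The encoding and horizontal directionality points can be illustrated with Python's standard library. The sketch assumes Windows-1252 as the legacy code page, and its direction check is a simplification (the first strongly directional character decides), not a full implementation of the Unicode Bidirectional Algorithm.

```python
import unicodedata


def base_direction(text):
    """Guess the base direction from the first strongly directional character.

    Returns "rtl", "ltr", or None when the text has no strong character.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # right-to-left letters, including Arabic letters
            return "rtl"
        if bidi == "L":           # left-to-right letters
            return "ltr"
    return None


# Legacy data: "café" stored in a single-byte code page (assumed cp1252).
legacy_bytes = b"caf\xe9"
text = legacy_bytes.decode("cp1252")

# The same string re-encoded in the two Unicode encodings named above.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")   # little-endian, no byte order mark

hebrew_direction = base_direction("\u05e9\u05dc\u05d5\u05dd")  # Hebrew letters
latin_direction = base_direction("123 abc")
```

Vertical layout is a rendering concern and is not covered by character properties alone, so it is left out of this sketch.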
Presentation Output
  • Text as Image
      - Raster formats: GIF, JPEG, TIFF, EPS
      - Vector formats: SVG
  • Text in Presentation Format
      - PDF, RTF, HTML
  • Customisable Presentation
      - HTML + CSS (including user specified CSS)
      - XML + XSL (including user specified XSL; Hughes, Bird & Bow 2003 demonstrate a range of transformations for interlinear text using XSL)
      - Publisher's templates
      - Interface with 3rd party XSL engines
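As a rough sketch of the HTML + CSS option, each word can be rendered as an inline block with its form stacked above its gloss, so that alignment is carried by the stylesheet and a user specified CSS can restyle it. The class names and the `render` function are hypothetical.

```python
from html import escape

# Default stylesheet; a user specified CSS could replace it.
DEFAULT_CSS = (
    ".igt-word { display: inline-block; margin-right: 1em; vertical-align: top; }\n"
    ".igt-word span { display: block; }\n"
)


def render(words, free_translation, css=DEFAULT_CSS):
    """Render (form, gloss) pairs and a free translation as an HTML fragment."""
    columns = "".join(
        '<div class="igt-word">'
        f"<span>{escape(form)}</span><span>{escape(gloss)}</span>"
        "</div>"
        for form, gloss in words
    )
    return (
        f"<style>{css}</style>"
        f'<div class="igt">{columns}<p>{escape(free_translation)}</p></div>'
    )


fragment = render([("dog-s", "dog-PL"), ("bark-ed", "bark-PST")], "The dogs barked.")
```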
Conclusion
  • A survey-based approach to the specification of functional requirements allows us to build a best-of-breed interlinear application
  • Implementing within an open source framework, eg AGTK and NLTK
  • Additional resources at: http://www.cs.mu.oz.au/research/lt/projects/interlinear
Acknowledgements
The research reported here is supported by the National Science Foundation:
  • Grant #0094934 Electronic Metastructure for Endangered Language Data
  • Grant #998009 TalkBank
  • Grant #0317826 Querying Linguistic Databases
