Web-Harvesting
Web-Harvesting Concept Issues Prospects
Concept: Web resource
Reference to Web Resource
  Harnad, Stevan (2004). “The Self-Archiving Initiative.” http://www.ecs.soton.ac.uk/~harnad/Tp/Nature4.htm. Last accessed 15 September 2004.
The Web
The Web Harvester
3 Major Activities: Storage, Migration, Retrieval (diagram relating Imaging, Digitization, and Web Archiving)
Storage
Migration
Retrieval
Developments in Web Archiving
  Internet Archive
  NEDLIB
  Nordic Web Archive
  Amiga Realm
  Internet Archive
  WebArchivist.org
  September 11 Web Archive
  Eprints.org
Web-Harvesting Concept
  WWW - publishing venue
  Web resources - non-permanent
  Web harvester - to store, migrate, retrieve web resources
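The three harvester activities named on this slide (store, migrate, retrieve) can be illustrated with a small in-memory sketch. It is not taken from the presentation: the `Snapshot` and `Harvester` names, their fields, and the example URL are all assumptions made for illustration, and a real harvester would also fetch pages and write them to durable storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Snapshot:
    """One stored copy of a web resource (hypothetical structure)."""
    url: str
    content: bytes
    media_type: str
    captured_at: datetime


@dataclass
class Harvester:
    """Toy in-memory archive keyed by URL (illustrative only)."""
    snapshots: dict = field(default_factory=dict)

    def store(self, url, content, media_type="text/html", captured_at=None):
        # Storage: keep every capture, since the live page may change or vanish.
        snap = Snapshot(url, content, media_type,
                        captured_at or datetime.now(timezone.utc))
        self.snapshots.setdefault(url, []).append(snap)
        return snap

    def migrate(self, url, converter, new_media_type):
        # Migration: re-encode the stored copies into a newer format.
        self.snapshots[url] = [
            Snapshot(s.url, converter(s.content), new_media_type, s.captured_at)
            for s in self.snapshots.get(url, [])
        ]

    def retrieve(self, url):
        # Retrieval: return the most recent capture, or None if never harvested.
        history = self.snapshots.get(url)
        return history[-1] if history else None
```

For example, storing two captures of `http://example.org/a` and then calling `retrieve` returns the later one; `migrate` takes any function from bytes to bytes as its converter.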
Web-Harvesting Concept Issues Prospects
Issues – Storage
  Legal justification
  Non-permanency of materials
  Daily changes
  Checksum
  No consistency in citations
  Refinement of criteria
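The "Daily changes" and "Checksum" items suggest comparing a digest of each new capture against the previous one to decide whether a page has changed. The slide does not say which algorithm is meant, so the sketch below assumes SHA-256 from Python's standard `hashlib`; the function names are hypothetical.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Return a hex digest of the page content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def has_changed(previous_checksum: str, content: bytes) -> bool:
    """True when the newly fetched content differs from the stored capture."""
    return previous_checksum != checksum(content)
```

A harvester could store only the digest between visits and save a new copy when `has_changed` returns True. Pages with dynamic elements such as timestamps or advertisements would register as changed on every visit, so some normalization may be needed first.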
Issues – Storage (continued)
  Several systems providing information
  Continued development
  Inaccessibility of data in databases
  Several information formats
  Overload of information
  Sufficient storage space
Issues – Migration
  Developments in information formats
  Developments in hardware, operating systems, and software
Issues – Retrieval
  Need for registries
  Completeness of metadata
  Commercial vendor or not?
  Legal or illegal?
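"Completeness of metadata" can be checked mechanically once a required field set is agreed on. The slides do not define one, so the field names below (`url`, `title`, `captured_at`, `media_type`) are assumptions chosen only to make the sketch concrete.

```python
# Assumed required fields; the presentation does not specify a metadata schema.
REQUIRED_FIELDS = ("url", "title", "captured_at", "media_type")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in a record."""
    return [name for name in REQUIRED_FIELDS
            if not str(record.get(name) or "").strip()]


def is_complete(record: dict) -> bool:
    """A record is complete when no required field is missing."""
    return not missing_fields(record)
```

A registry could run such a check at ingest time and flag incomplete records for review rather than rejecting them outright.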
Web-Harvesting Concept Issues Prospects
Prospects
  Future harvesters will be more powerful
  Overflow, duplication
  Current options:
    Self-archiving by universities
    Self-archiving by authors
    Burning of cited web pages
    Printing of cited web pages
Have a nice day!

Web-Harvesting: concepts, issues, and prospects