TrendMachine:
Temporal Resilience of Web Pages
@WaybackMachine
IIPC Web Archiving Conference (WAC), May 03, 2023, Online
Sawood Alam (Internet Archive)
Mark Graham (Internet Archive)
Kritika Garg (Old Dominion University)
Michele C. Weigle (Old Dominion University)
Michael L. Nelson (Old Dominion University)
Dietrich Ayala (Protocol Labs)
@WebSciDL @ProtocolLabs
Supported in part by Protocol Labs and Filecoin Foundation
TrendMachine: Temporal Resilience of Web Pages | IIPC WAC 2023 | Sawood Alam <@ibnesayeed> 2
Research Question
How healthy has a web page been throughout its lifetime?
Temporal and Spatial Landscape of Archival Analysis
Long Duration, Single Webpage:
● TMVis
● Wayback Machine Changes
● TrendMachine
Long Duration, Webpage Collection:
● MementoMap
● CDX Summary
● Archives Unleashed Toolkit
Short Duration, Single Webpage:
● Memento Damage
● Archival ACID Test
● Reconstructive
Short Duration, Webpage Collection:
● Warrick
● Wayback Machine Downloader
● Video Archiving Insights
Modeling Web Page Health: Linear vs. S-Curve
Sigmoid Function for Web Page Resilience
Spread: How far up or down the value can go from its starting position?
Shift: How soon any significant change in the value can begin?
Slope: How quickly the value reaches close to the maximum change?
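The three parameters can be pictured with a generic logistic curve. The following is a minimal sketch, not TrendMachine's actual formula: the function name `sigmoid_health`, the `start` parameter, the default values, and the exact way Spread, Shift, and Slope enter the expression are assumptions made for illustration.

```python
import math


def sigmoid_health(x, start=0.5, spread=0.5, shift=5.0, slope=1.0):
    """Generic logistic curve with the three parameters named on the slide.

    start  -- value before any significant change (assumed parameter)
    spread -- how far the value can move from `start`
              (a negative spread moves the value down instead of up)
    shift  -- the x position at which half of the maximum change is reached,
              so a larger shift delays the onset of significant change
    slope  -- steepness, i.e. how quickly the value approaches start + spread
    """
    return start + spread / (1.0 + math.exp(-slope * (x - shift)))


if __name__ == "__main__":
    # Print a few points of the curve to see the S shape.
    for x in range(0, 11, 2):
        print(x, round(sigmoid_health(x), 4))
```

With these defaults the value stays near `start` for small `x`, passes the halfway point `start + spread / 2` at `x == shift`, and flattens out just below `start + spread`.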
TrendMachine: Composite Sigmoid Parameters of Resilience
TrendMachine: Overview
Code: https://github.com/internetarchive/trendmachine
Demo: https://trendmachine.sawood-dev.us.archive.org/
TrendMachine: Temporal Distribution of Archiving Activities
The page is archived as few as one or zero times and as many as tens of thousands of times in a single day.
Specimen Selection Algorithm
PRIORITY = ["2xx", "4xx", "5xx", "3xx"]
FOREACH st OF PRIORITY
    IF st IN statuses(day)
        specimen = statuses(day).match(st)[0]
        BREAK
DAY1 DAY2 DAY3 DAY4
4xx 3xx 5xx 3xx
3xx 3xx 3xx 5xx
2xx 3xx 5xx 3xx
5xx 4xx 5xx
2xx 4xx
A 3xx specimen usually suggests that the URL is redirecting to somewhere other than a variation of the same URL.
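The selection pseudocode on this slide can be sketched in Python as follows. This is a minimal illustration rather than the project's code: the helper names `status_class` and `select_specimen`, the use of integer status codes, and the sample days are assumptions.

```python
# Status classes in the priority order given on the slide.
PRIORITY = ["2xx", "4xx", "5xx", "3xx"]


def status_class(code):
    """Map a numeric HTTP status code such as 404 to its class label '4xx'."""
    return f"{str(code)[0]}xx"


def select_specimen(day_statuses):
    """Return the first capture of the day whose class ranks highest in PRIORITY.

    `day_statuses` lists the status codes of one day's captures in capture
    order. Returns None when the day has no captures.
    """
    for wanted in PRIORITY:
        for code in day_statuses:
            if status_class(code) == wanted:
                return code
    return None


if __name__ == "__main__":
    # Hypothetical days, not the ones shown on the slide.
    print(select_specimen([404, 301, 200, 503, 200]))
    print(select_specimen([301, 301, 404]))
    print(select_specimen([301, 302]))
```

Because 3xx is last in the priority list, a day yields a 3xx specimen only when every capture on that day was a redirect, which is why same-URL redirects (HTTP to HTTPS, WWW to Apex) tend to drop out whenever the redirect target was also captured that day.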
Filling Missing Observations
Policy DAY1 DAY2 DAY3 DAY4 DAY5 DAY6
Identical 2xx 2xx 2xx 4xx 2xx
Closest 2xx 2xx 2xx 4xx 4xx 2xx
Forward 2xx 2xx 2xx 2xx 4xx 2xx
Backward 2xx 2xx 4xx 4xx 4xx 2xx
ANY 2xx 2xx
Do not fill the gap if the status codes before and after are not identical.
Do not fill the gap if it is larger than a configured threshold.
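The four named policies and the gap threshold can be sketched as below. This is an illustrative sketch, not TrendMachine's implementation: the function name `fill_gaps`, the `max_gap` parameter, the tie-breaking rule for the closest policy (ties go to the earlier neighbor), and the choice to leave leading and trailing gaps unfilled are assumptions. The sample sequence, with DAY3 and DAY4 missing, is inferred from the Forward, Backward, and Closest rows of the table.

```python
POLICIES = ("identical", "closest", "forward", "backward")


def fill_gaps(days, policy="closest", max_gap=2):
    """Fill runs of missing days (None) between two observed days.

    identical -- fill only if the observations before and after are equal
    closest   -- copy the nearer neighbor (ties go to the earlier one)
    forward   -- copy the observation before the gap
    backward  -- copy the observation after the gap
    Gaps longer than `max_gap` days are never filled.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    filled = list(days)
    i = 0
    while i < len(days):
        if days[i] is not None:
            i += 1
            continue
        start = i
        while i < len(days) and days[i] is None:
            i += 1
        end = i  # index of the first observed day after the gap
        if start == 0 or end == len(days) or end - start > max_gap:
            continue  # no neighbor on one side, or gap above threshold
        before, after = days[start - 1], days[end]
        for j in range(start, end):
            if policy == "identical":
                filled[j] = before if before == after else None
            elif policy == "forward":
                filled[j] = before
            elif policy == "backward":
                filled[j] = after
            else:  # closest
                filled[j] = before if j - (start - 1) <= end - j else after
    return filled


if __name__ == "__main__":
    observed = ["2xx", "2xx", None, None, "4xx", "2xx"]
    for name in POLICIES:
        print(name, fill_gaps(observed, name))
```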
TrendMachine: TimeMap Status Codes vs. Daily Specimens
Most of the self-redirect 3xx observations (HTTP/HTTPS or WWW/Apex domain) are eliminated in daily specimens.
About one third of the days since the first observation have no captures, of which some are filled using a filling policy.
TrendMachine: Resilience
● The Resilience score is calculated by applying a Sigmoid function to the status codes of daily specimens
● The score starts at 0.5 and is normalized between 0 and 1
● After the first few observations, the Wayback Machine did not archive the page (the Wikipedia home page) for several months in 2002
● Towards the end of 2002, the Resilience score went up slowly due to infrequent archiving
● In 2003, “wikipedia.org” started to redirect to “en.wikipedia.org”
● After 2005, the Resilience of the Wikipedia home page has mostly been stable and high
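One way to picture how a sigmoid turns a sequence of daily specimens into a curve is the toy score below. It is only a sketch under stated assumptions, not TrendMachine's implementation: the function name `resilience_curve`, the step of one unit per day, the `slope` default, and the choice to count every non-2xx specimen as unhealthy are all hypothetical. It does reproduce two properties stated on the slide: the score starts from 0.5 and stays between 0 and 1.

```python
import math


def resilience_curve(specimens, slope=0.5):
    """Toy running health score over daily specimens.

    Each day moves a position along the x axis of a logistic curve:
    +1 for a 2xx specimen, -1 for any other status class (assumption),
    and no movement for an unfilled day (None). The score for the day
    is the logistic function of that position, so position 0 maps to 0.5.
    """
    position = 0
    scores = []
    for status in specimens:
        if status == "2xx":
            position += 1
        elif status is not None:
            position -= 1
        scores.append(1.0 / (1.0 + math.exp(-slope * position)))
    return scores


if __name__ == "__main__":
    # Hypothetical specimen sequence, not taken from the Wikipedia example.
    days = ["2xx", "2xx", None, "3xx", "3xx", "3xx", "2xx"]
    print([round(score, 3) for score in resilience_curve(days)])
```

Under this toy model a long healthy stretch saturates near 1, and a single bad day afterwards barely dents the score, which is the S-curve behavior the deck contrasts with a linear model.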
TrendMachine: Fixity
● Fixity score (normalized) is calculated using Sigmoid function on content digests of daily specimens
● Content digest reported in CDX can be sensitive to Content-Encoding, resulting in false alarms, even when the underlying content remains unchanged
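The Content-Encoding sensitivity can be demonstrated with a few lines of standard-library Python. The sketch assumes the digest is a Base32-encoded SHA-1 of the payload bytes as stored; the helper name `payload_digest` and the sample body are hypothetical.

```python
import base64
import gzip
import hashlib


def payload_digest(payload: bytes) -> str:
    """Base32-encoded SHA-1 of the given bytes (assumed digest form)."""
    return base64.b32encode(hashlib.sha1(payload).digest()).decode("ascii")


# The same page body, once as identity bytes and once gzip-compressed,
# as a server might deliver it with "Content-Encoding: gzip".
body = b"<html><body>Hello, archive!</body></html>"
compressed = gzip.compress(body)

if __name__ == "__main__":
    print("identity:", payload_digest(body))
    print("gzip:    ", payload_digest(compressed))
    print("decoded: ", payload_digest(gzip.decompress(compressed)))
```

The first two digests differ even though the decoded content is byte-for-byte the same, so a digest computed over encoded bytes can signal a change where none occurred.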
TrendMachine: Chaos
● Chaos score (normalized) is calculated using a Run-Length Encoding inspired technique on all status codes of the CDX data in which consecutive duplicates are removed in the numerator
● An alternate sliding-window calculation is performed on the last N observations as the score becomes insensitive to recent changes on large TimeMaps
● A high Chaos along with a high Resilience is often an indication of canonical redirects (e.g., adoption of HTTPS and/or consolidation of WWW and Apex domain)
$$\text{Chaos} = \frac{|\,\text{2xx}, \text{3xx}, \text{2xx}\,|}{|\,\text{2xx}, \text{2xx}, \text{2xx}, \text{3xx}, \text{3xx}, \text{2xx}\,|} = \frac{3}{6} = 0.5$$
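The worked example on this slide (six status codes that collapse to three runs, giving 0.5) can be reproduced with `itertools.groupby`. This is a minimal sketch: the function name `chaos` and the `window` parameter are assumptions, and any further normalization the tool applies is not shown on the slide and is not modeled here.

```python
from itertools import groupby


def chaos(statuses, window=None):
    """Ratio of status-code runs to total observations.

    Consecutive duplicates collapse into a single run, in the spirit of
    Run-Length Encoding. If `window` is a positive integer, only the last
    `window` observations are considered (the sliding-window variant).
    """
    if window:
        statuses = statuses[-window:]
    if not statuses:
        return 0.0
    runs = sum(1 for _ in groupby(statuses))
    return runs / len(statuses)


if __name__ == "__main__":
    codes = ["2xx", "2xx", "2xx", "3xx", "3xx", "2xx"]
    print(chaos(codes))            # whole sequence
    print(chaos(codes, window=3))  # last three observations only
```

On a TimeMap with many thousands of stable captures, a handful of recent flips barely changes the whole-sequence ratio, whereas the windowed ratio reacts to them, which is the motivation the slide gives for the sliding-window variant.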
TrendMachine: Status Code Transitions
● Large numbers along the major diagonal indicate status code stability for extended periods of time
● Large numbers in non-diagonal cells suggest frequent changes in Resilience curve
● Web pages with high Resilience score for extended periods usually exhibit large numbers in the top-left cell (2xx -> 2xx)
● A large number in the 3xx -> 3xx cell usually indicates extended periods of redirection to other URLs (e.g., URL restructuring, login wall, domain change, and parked domain)
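A transition matrix of this kind can be built by counting adjacent pairs of status classes. The sketch below is illustrative only: the function name `transition_matrix` and the row and column order beyond the top-left 2xx -> 2xx cell are assumptions.

```python
from collections import Counter

# Assumed row/column order; the slide only fixes 2xx -> 2xx as the top-left cell.
CLASSES = ["2xx", "3xx", "4xx", "5xx"]


def transition_matrix(statuses):
    """Count transitions between consecutive status classes.

    Cell [i][j] holds how often CLASSES[i] was immediately followed
    by CLASSES[j] in the sequence.
    """
    counts = Counter(zip(statuses, statuses[1:]))
    return [[counts[(src, dst)] for dst in CLASSES] for src in CLASSES]


if __name__ == "__main__":
    codes = ["2xx", "2xx", "2xx", "3xx", "3xx", "2xx"]
    for label, row in zip(CLASSES, transition_matrix(codes)):
        print(label, row)
```

Diagonal cells count days on which the status class stayed the same, so a stable, healthy page concentrates its counts in the top-left cell.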
TrendMachine: Compare First and Last Mementos
TrendMachine: Live Web Page With Headers
Potential Use Cases
● Detect points of interest in a large TimeMap
● Sample captures/mementos from TimeMaps for visual summarization
● Detect archival sinks (like login pages, paywalls, and misconfigured redirects)
● Detect poor-quality pages like Soft-404 and parked domains
● Detect potential link rot (and fix the affected links when possible, as on a wiki page)
● Optimize crawl jobs by minimizing wasteful downloads and maximizing coverage
● Archival quality assurance
● Cluster pages of a large archival collection in different categories
Future Work
● Report heuristics-based archival summary by combining various scores
● Report/embed captures/mementos that can be points of interest
● Calculate Fixity using less-sensitive digests (e.g., SimHash)
● Calculate Chaos after applying convolutions to smooth out alternating changes
● Allow alternate web page health models (not just Sigmoid functions)
● Deploy in production by integrating with Wayback Machine
Summary
Code: https://github.com/internetarchive/trendmachine
Demo: https://trendmachine.sawood-dev.us.archive.org/
● A mathematical model to quantify temporal health of a web page
● Resilience, Fixity, Chaos, Distributions, Transitions, etc. reports
● An interactive portal with configuration options for experiments
● An evolving open-source codebase and demo deployment
