Flickr and PHP
Cal Henderson
What’s Flickr?
  • Photo sharing
  • Open APIs
Logical Architecture
  [Diagram: Users, 3rd Party Apps, Flickr Apps, Flickr.com, Email, API Endpoints, Templates, Page Logic, Application Logic, Database, Photo Storage, Node Service]
Physical Architecture
  [Diagram: Users, Web Servers, Database Servers, Node Servers, Static Servers]
Where is PHP?
  [Diagram: the Logical Architecture components again: Users, 3rd Party Apps, Flickr Apps, Flickr.com, Email, API Endpoints, Templates, Page Logic, Application Logic, Database, Photo Storage, Node Service]
Other than PHP?
  • Smarty for templating
  • PEAR for XML and email parsing
  • Perl for controlling…
  • ImageMagick, for image processing
  • MySQL (4.0 / InnoDB)
  • Java, for the node service
  • Apache 2, Red Hat, etc. etc.
Big Application?
  • One programmer, one designer, etc.
  • ~60,000 lines of PHP code
  • ~60,000 lines of templates
  • ~70 custom Smarty functions/modifiers
  • ~25,000 DB transactions/second at peak
  • ~1000 pages per second at peak
Thinking outside the web app
  • Services
  • Atom/RSS/RDF feeds
  • APIs: SOAP, XML-RPC, REST
  • PEAR::XML::Tree
More cool stuff
  • Email interface: Postfix, PHP, PEAR::Mail::mimeDecode
  • FTP
  • Uploading API
  • Authentication API
  • Unicode
Even more stuff
  • Real time application
  • Cool Flash apps
  • Blogging: Blogger API (1 & 2), Metaweblog API, Atom, LiveJournal
APIs are simple!
  • Modeled on XML-RPC (sort of)
  • Method calls with XML responses
  • SOAP, XML-RPC and REST are just transports
  • PHP endpoints mean we can use the same application logic as the website
XML isn’t simple :(
  • PHP 4 doesn’t have a good XML parser
  • Expat is cool though (PEAR::XML::Parser)
  • Why doesn’t PEAR have XPath?
  • Because PEAR is stupid!
  • PHP 4 sucks!
I love XPath

    // Without XPath: walk the tree by hand.
    if ($tree->root->name == 'methodResponse'){
        if (($tree->root->children[0]->name == 'params')
          && ($tree->root->children[0]->children[0]->name == 'param')
          && ($tree->root->children[0]->children[0]->children[0]->name == 'value')
          && ($tree->root->children[0]->children[0]->children[0]->children[0]->name == 'array')
          && ($tree->root->children[0]->children[0]->children[0]->children[0]->children[0]->name == 'data')){
            $rsp = $tree->root->children[0]->children[0]->children[0]->children[0]->children[0];
        }
        if ($tree->root->children[0]->name == 'fault'){
            $fault = $tree->root->children[0];
            return $fault;
        }
    }

    // With XPath: one expression does the same job.
    $nodes = $tree->select_nodes('/methodResponse/params/param[1]/value[1]/array[1]/data[1]/text()');
    if (count($nodes)){
        $rsp = array_pop($nodes);
    }else{
        list($fault) = $tree->select_nodes('/methodResponse/fault');
        return $fault;
    }
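
For comparison, here is a minimal sketch of the same lookup using the DOM extension that ships with PHP 5 and later. The slide's code uses a PHP 4 tree class with a select_nodes() method instead. The helper name parse_method_response() and its return shape are assumptions made for illustration, and the expression stops at the data element rather than its text() children.

```php
<?php
// Hypothetical helper: returns array('ok', DOMNode) when the XML-RPC
// response carries a data array, or array('fault', DOMNode|null) otherwise.
function parse_method_response($xml)
{
    $doc = new DOMDocument();
    if (!@$doc->loadXML($xml)) {
        // Not well-formed XML: treat it as a fault with no node.
        return array('fault', null);
    }

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('/methodResponse/params/param[1]/value[1]/array[1]/data[1]');
    if ($nodes->length > 0) {
        return array('ok', $nodes->item(0));
    }

    $faults = $xpath->query('/methodResponse/fault');
    return array('fault', $faults->length > 0 ? $faults->item(0) : null);
}
```

Called on a response body, it gives back a status string and the matching node, so the caller never has to index into children by position.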
Creating API methods
  • Stateless method-call APIs are easy to extend
  • Adding a method requires no knowledge of the transport
  • Adding a method once makes it available to all the interfaces
  • Self documenting
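
One way to picture a transport-independent, self-documenting method table is sketched below. Every name in it (the registry layout, api_call(), render_rest(), api_docs(), the test.echo method and the rsp envelope) is an assumption for illustration, not Flickr's actual code.

```php
<?php
// Registry of API methods: name => handler plus a description used for docs.
$API_METHODS = array(
    'test.echo' => array(
        'handler' => 'api_test_echo',
        'doc'     => 'Returns the arguments it was called with.',
    ),
);

// Example method: knows nothing about SOAP, XML-RPC or REST.
function api_test_echo(array $args)
{
    return $args;
}

// Transport-independent dispatch: every endpoint would call this.
function api_call(array $methods, $name, array $args)
{
    if (!isset($methods[$name])) {
        return array('stat' => 'fail', 'message' => 'Method not found');
    }
    $result = call_user_func($methods[$name]['handler'], $args);
    return array('stat' => 'ok', 'result' => $result);
}

// One possible REST-style serialisation; keys are assumed to be valid XML names.
function render_rest(array $rsp)
{
    $xml = '<rsp stat="' . $rsp['stat'] . '">';
    if ($rsp['stat'] === 'ok') {
        foreach ($rsp['result'] as $key => $value) {
            $xml .= '<' . $key . '>' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '</' . $key . '>';
        }
    } else {
        $xml .= '<err msg="' . htmlspecialchars($rsp['message'], ENT_QUOTES, 'UTF-8') . '"/>';
    }
    return $xml . '</rsp>';
}

// Self-documentation: list every registered method with its description.
function api_docs(array $methods)
{
    $lines = array();
    foreach ($methods as $name => $info) {
        $lines[] = $name . ': ' . $info['doc'];
    }
    return $lines;
}
```

Adding a method means adding one registry entry and one function. A second transport would only need its own render function over the same api_call() result.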
Red Hot Unicode Action
  • UTF-8 pages
  • CJKV support
  • It’s really cool
 
Unicode for all
  • It’s really easy
  • Don’t need PHP support
  • Don’t need MySQL support
  • Just need the right headers
  • UTF-8 is 7-bit transparent
  • (Just don’t mess with high characters)
  • Don’t use htmlentities()!
  • But bear in mind…
  • JavaScript has patchy Unicode support
  • People using your APIs might be stupid
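
A minimal sketch of the "right headers, don't touch high characters" advice follows. The function names are hypothetical, and the explanation of why htmlentities() is dangerous (an encoder that assumes a single-byte charset rewrites each byte of a UTF-8 sequence separately) is an interpretation of the slide, not a quote from it.

```php
<?php
// Declare UTF-8 in the Content-Type header (guarded so it is a no-op
// once output has already started).
function send_utf8_header()
{
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=utf-8');
    }
}

// Escape only the characters HTML treats specially; multi-byte UTF-8
// sequences pass through untouched.
function escape_html($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// What goes wrong when an entity encoder assumes Latin-1 input:
// each byte of a UTF-8 character becomes its own entity.
function escape_html_wrong($text)
{
    return htmlentities($text, ENT_QUOTES, 'ISO-8859-1');
}
```

With these helpers, an "é" stored as the two bytes C3 A9 survives escape_html() unchanged, while escape_html_wrong() turns it into two unrelated entities.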
Scaling the beast
  • Why PHP is great
  • MySQL scaling
  • Search scaling
  • Horizontal scaling
Why PHP is great
  • Stateless
  • We can bounce people around servers
  • Everything is stored in the database
  • Even the Smarty cache
  • “Shared nothing”
  • (so long as we avoid PHP sessions)
MySQL Scaling
  • Our database server started to slow
  • Load of 200
  • Replication!
MySQL Replication
  • But it only gives you more SELECTs
  • Else you need to partition vertically
  • Re-architecting sucks :(
Looking at usage
  • Snapshot of db1.flickr.com
      SELECTs  44,220,588
      INSERTs   1,349,234
      UPDATEs   1,755,503
      DELETEs     318,439
  • 13 SELECTs per I/U/D
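
The "13 SELECTs per I/U/D" figure follows from the snapshot counts. A quick check of the arithmetic, using only the numbers on the slide:

```php
<?php
// Statement counts from the db1.flickr.com snapshot on the slide.
$counts = array(
    'SELECT' => 44220588,
    'INSERT' => 1349234,
    'UPDATE' => 1755503,
    'DELETE' => 318439,
);

// Writes are everything that is not a SELECT.
$writes = $counts['INSERT'] + $counts['UPDATE'] + $counts['DELETE'];
$ratio  = $counts['SELECT'] / $writes;

printf("%d writes, %.1f SELECTs per write\n", $writes, $ratio);
// prints "3423176 writes, 12.9 SELECTs per write"
```

Rounded to the nearest whole number, that is the 13:1 read-to-write ratio quoted above.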
Replication is really cool
  • A bunch of slave servers handle all the SELECTs
  • A single master just handles I/U/Ds
  • It can scale horizontally, at least for a while.
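
In application code, the master/slave split comes down to choosing a host per statement. The sketch below is one hedged way to do that. The helper names and the host names passed to them are hypothetical, and the slides do not show how Flickr actually routes queries.

```php
<?php
// Decide which pool a statement belongs to by looking at its first word.
function db_pool_for($sql)
{
    $verb = strtoupper((string) strtok(ltrim($sql), " \t\r\n"));
    return $verb === 'SELECT' ? 'slave' : 'master';
}

// Pick a host: writes always go to the master, reads go to a random slave.
// Falls back to the master when no slaves are configured.
function pick_host($sql, $master, array $slaves)
{
    if (db_pool_for($sql) === 'master' || count($slaves) === 0) {
        return $master;
    }
    return $slaves[array_rand($slaves)];
}
```

Because replication is asynchronous, a read that must see a write made moments earlier may still need to go to the master. This sketch does not handle that case.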
Searching
  • A simple text search
  • We were using RLIKE
  • Then switched to LIKE
  • Then disabled it altogether
FULLTEXT Indexes
  • MySQL saves the day!
  • But they’re only supported by MyISAM tables
  • We use InnoDB, because it’s a lot faster
  • We’re doomed
But wait!
  • Partial replication saves the day
  • Replicate the portion of the database we want to search.
  • But change the table types on the slave to MyISAM
  • It can keep up because it’s only handling I/U/Ds on a couple of tables
  • And we can reduce the I/U/Ds with a little bit of vertical partitioning
JOINs are slow
  • Normalised data is for sissies
  • Keep multiple copies of data around
  • Makes searching faster
  • Have to ensure consistency in the application logic
Our current setup
  [Diagram labels: DB1, Master, I/U/Ds; DB2, Main Slave, Slave Farm, SELECTs; DB3, Main Search Slave, Search Slave Farm, Search SELECTs]
Horizontal scaling
  • At the core of our design
  • Just add hardware!
  • Inexpensive
  • Not exponential
  • Avoid redesigns
Talking to the Node Service
  • Everyone speaks XML (badly)
  • Just TCP/IP - fsockopen()
  • We’re issuing commands, not requesting data, so we don’t bother to parse the response
  • Just substring search for state="ok"
  • Don’t rely on it!
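
The fsockopen() plus substring-search approach can be sketched as below. The host, port, command format and function names are assumptions; the slides give only the state="ok" marker and the fact that the response is not parsed.

```php
<?php
// Cheap success check: look for state="ok" anywhere in the raw response.
function node_response_ok($response)
{
    return strpos($response, 'state="ok"') !== false;
}

// Send one command over a plain TCP socket and report whether the
// service answered with state="ok". Returns false on connection failure.
function node_command($host, $port, $command, $timeout = 2)
{
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if (!$fp) {
        return false;
    }
    fwrite($fp, $command . "\n");

    // Read until the service closes the connection or stops sending.
    $response = '';
    while (!feof($fp)) {
        $chunk = fread($fp, 8192);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $response .= $chunk;
    }
    fclose($fp);

    return node_response_ok($response);
}
```

As the slide warns, a substring match like this is fragile. It suits fire-and-forget commands where a false negative is cheap.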
RSS / Atom / RDF
  • Different formats
  • All quite bad
  • We’re generating a lot of different feeds
  • Abstract the difference away using templates
  • No good way to do private feeds. Why is nobody working on this? (WSSE maybe?)
Receiving email
  • Want users to be able to email photos to Flickr
  • Just get Postfix to pipe each mail to a PHP script
  • Parse the mail and find any photos
  • Cellular phone companies hate you
  • Lots of mailers are broken
  • Photos as text/plain attachments :/
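
Finding the photos in a piped message amounts to walking the decoded MIME tree. The sketch below assumes a structure shaped like the one PEAR's Mail_mimeDecode returns (objects with ctype_primary, parts and d_parameters properties). The function name and the filename-extension fallback for mislabelled text/plain attachments are illustrative assumptions.

```php
<?php
// Recursively collect the leaf parts of a decoded message that look like
// photos: either declared as image/*, or carrying an image-like filename
// even though the mailer labelled them as something else.
function find_photo_parts($part)
{
    $found = array();

    // Multipart container: descend into the children.
    if (!empty($part->parts)) {
        foreach ($part->parts as $child) {
            $found = array_merge($found, find_photo_parts($child));
        }
        return $found;
    }

    $primary  = isset($part->ctype_primary) ? strtolower($part->ctype_primary) : '';
    $filename = isset($part->d_parameters['filename']) ? $part->d_parameters['filename'] : '';
    $looks_like_image = (bool) preg_match('/\.(jpe?g|gif|png)$/i', $filename);

    if ($primary === 'image' || $looks_like_image) {
        $found[] = $part;
    }
    return $found;
}
```

In a script fed by Postfix, the raw message would be read from php://stdin and decoded first, for example with Mail_mimeDecode's decode() and its include_bodies and decode_bodies options, before the result is passed to find_photo_parts().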
Upload via FTP
  • PHP isn’t so great at being a daemon
  • Leaks memory like a sieve
  • No threads
  • Java to the rescue
  • Java just acts as an FTPd and passes all uploaded files to PHP for processing
  • (This isn’t actually public)
  • Bricolage does this I think. Maybe Zope?
Blogs
  • Why does everyone love blogs so much?
  • Only a few APIs really: Blogger, Metaweblog, Blogger2, Movable Type, Atom, LiveJournal
It’s all broken
  • Lots of blog software has broken interfaces
  • It’s a support nightmare
  • Manila is tricky
  • But it all works, more or less
  • Abstracted in the application logic
  • We just call blogs_post_message();
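
The slides name blogs_post_message() but not its signature, so the following is only a guess at the shape of such an abstraction. The _sketch suffix, the argument list, the per-API handler names and their stubbed return values are all hypothetical.

```php
<?php
// Stub handlers standing in for the real remote calls to each blog API.
function blog_post_blogger(array $blog, $title, $body)
{
    return 'blogger:' . $title;
}

function blog_post_metaweblog(array $blog, $title, $body)
{
    return 'metaweblog:' . $title;
}

function blog_post_atom(array $blog, $title, $body)
{
    return 'atom:' . $title;
}

// Single entry point: callers never need to know which API a blog speaks.
// Returns false when the blog's API type is not supported.
function blogs_post_message_sketch(array $blog, $title, $body)
{
    $handlers = array(
        'blogger'    => 'blog_post_blogger',
        'metaweblog' => 'blog_post_metaweblog',
        'atom'       => 'blog_post_atom',
    );
    if (!isset($blog['api']) || !isset($handlers[$blog['api']])) {
        return false;
    }
    return call_user_func($handlers[$blog['api']], $blog, $title, $body);
}
```

Quirks of individual blog packages would then live inside the matching handler, leaving the page logic with one call.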
Back to those APIs
  • We opened up the Flickr APIs a few weeks ago
  • Programmers mainly build tools for other programmers
  • We have Perl, Python, PHP, ActionScript, XMLHTTP and .NET interface libraries
  • But also a few actual applications
Flickr Rainbow
Tag Wallpaper
So what next?
  • Much more scaling
  • PHP 5?
  • MySQL 5?
  • Taking over the world
Flickr and PHP
Cal Henderson
Any Questions?

