The Web Giants
Culture – Practices – Architecture
Augmented edition

Table of Contents
Foreword........................................................................................................................6
Introduction..................................................................................................................9
Culture..........................................................................................................................11
The Obsession with Performance Measurement......................13
Build vs Buy.....................................................................................................19
Enhancing User Experience...................................................................27
Code crafters.................................................................................................33
Open Source Contribution....................................................................41
Sharing Economy platforms..................................................................47
Organization.............................................................................................................57
Pizza Teams.....................................................................................................59
Feature Teams...............................................................................................65
DevOps.............................................................................................................71
Practices.......................................................................................................................85
Lean Startup...................................................................................................87
Minimum Viable Product.........................................................................95
Continuous Deployment.......................................................................105
Feature Flipping.........................................................................................113
A/B Testing....................................................................................123
Design Thinking.........................................................................................129
Device Agnostic.........................................................................................143
Perpetual beta.............................................................................................151
Architecture............................................................................................................157
Cloud First.....................................................................................................159
Commodity Hardware............................................................................167
Sharding..........................................................................................................179
TP vs. BI: the new NoSQL approach..............................................193
Big Data Architecture..............................................................................201
Data Science................................................................................................211
Design for Failure......................................................................................221
The Reactive Revolution........................................................................227
Open API ......................................................................................................235
About OCTO Technology..............................................................................243
Authors......................................................................................................................245
Foreword
It has become such a cliché to start a book, a talk or a preface by stating that the rate of change is accelerating. However, it is true: the world is changing faster, both because of the exponential rate of technology evolution and because of the central role of the user in today's economy. It is a change Marc Andreessen characterized in his famous blog post as "software is eating the world". Not only is software at the core of the digital economy, the way software is produced is changing dramatically too. This is not a topic for Web companies alone; it is a revolution that touches all companies. To cope with this changing environment, they need to reinvent themselves as software companies, with new ways of working, of organizing themselves and of producing digital experiences for their customers.
This is why I am so pleased to write the preface to "The Web Giants". I have been using this book intensively since the first French edition came onto the market. I have given copies to colleagues both at Bouygues Telecom and at AXA, and I have made it a permanent reference in my own blogs, talks and writing. Why? Because it is the simplest, most pragmatic and most convincing set of answers to the question raised above: what to do in this software-infused, technology-enabled, customer-centric, fast-changing 21st century?
This is not a conceptual book, a book about why you should do this or that. This is a
beautifully written story about how software and service development is organized in
some of the best-run companies of the world. First, this is a book about practices. The
best way to grow change in a complex world is to adopt practices. It is the only way to
learn, by doing. These practices are sorted into three categories: culture, organization
and architecture; but there is a common logic and a systemic reinforcement. Practices are easier to pick up and less intimidating than methodologies or concepts. However, strong will and perseverance are required. I will not spoil your reading by summarizing what OCTO found when they looked at the most common practices of the most successful software companies of the world. I will rather try to convince you that reading this book is an urgent task for almost everyone, based on four ideas.
The first and foremost idea is that software systems must be built to change constantly. This is equally true for information systems, support systems, and embedded, web or mobile software. What we could define as customer engagement platforms are no longer complex systems that one designs and builds, but continuously evolving systems that are grown. This new generation of software systems is the core of the Web Giants. Constant evolution is mandatory to cope with exponential technology change, and it is the only way to co-construct engagement platforms through customer feedback. The unpredictability of usage, especially social usage, means that digital experiences are software processes that can only be crafted through measurement and continuous improvement. This
critical change, from software being designed to software being grown, means that all
companies that provide digital experiences to their customers must become software
companies. A stable software support system could be outsourced, delegated or bought, but a constantly evolving, self-adaptive system becomes a core capability. This capability is deeply intertwined with the business, and its delivery processes and practitioners must be valued and respected.
The second key idea is that there exists a new way of building such software systems.
We are facing two tremendous challenges: to churn out innovations at the rate that is
expected by the market, and to constantly integrate new features while factoring out
older ones, to avoid the suffocation by constant growth that plagued previous generations
of software systems. The solution is a combination of open innovation - there are clearly
more smart developers outside any company than inside – together with source-level
“white box“ integration and minimalist “platform“ design principles. When all your
code needs to be constantly updated to follow the environment change, the less you own
the better. It is also time to bring source code back from the dark depths of “black box
integration“. Open source culture is both about leveraging the treasure trove of what
may be found in larger development communities and about mashing up composite
applications by weaving source code that one may be proud of. Follow in the footsteps of the Web Giants: code that changes constantly is worth being well written, structured, documented, tested and reviewed by as many eyeballs as possible.
The third idea is another way of saying that "software is eating the world": this book is not about software, it is about a new way of thinking about your company, whichever business you are in. Not surprisingly, many "known" practices such as agile development, lean startup, the obsession with measurement or with saving the customer's time - the most precious commodity of the digital age - have found their way into OCTO's list. By reading the practical testimonies from the Web Giants, a new kind of customer-focused organization will emerge. Thus, this is a book for everyone, not for geeks only. This is of the utmost importance, since many of the change levers lie in the hands of stakeholders other than the software developers themselves. For instance, a key requirement for agility is to switch from specifying solutions to specifying problems, allowing the solution to be co-developed by cross-functional teams as well as users.
The last idea I would propose is that there is a price to pay for this transformation. There are technologies, tools and practices that you must acquire and learn. DevOps practices, such as continuous delivery or managing infrastructure as code, require mastering a set of tools and building skills; there is no "free lunch". A key set of benefits of the Web Giants' way of working comes from massive automation. This book also
shows some of the top recent technology patterns in the architecture section. Since this list is by nature evolving, the most important lesson is to create an environment where "doers" may continuously experiment with the tools of the future, such as massively parallel cloud programming, big data or artificial intelligence. A key consequence is that there is a true efficiency and competitiveness gap between those who master the said set of tools and skills and those who do not. In the world of technology, we often use the word "Barbarians" to talk about newcomers who leverage their software and technology skills to displace incumbents in older industries. This is not a question of mindset (taking on legacy companies head-on is an age-old strategy for newcomers) but a matter of capabilities!
As stated earlier, there would be other, more conceptual ways to introduce the key ideas and practices that are pictured in this book. One could point to the best sources on motivation and collaborative work, such as Daniel Pink: these Web Giants practices reflect the state of the art of managing intrinsic motivation. The same could be said about the best books on lean management and self-organization; the reference to Lean Startup is one of many subtle references to the influence of the Toyota Way on the modern 21st-century forms of organization. Similarly, it would be tempting to invoke complex systems theory - see Jurgen Appelo and his book "Management 3.0" for instance - to explain why the practices observed and selected by OCTO are the natural answer to the challenges of the increasingly changing and complex world that we live in. From a technology perspective, it is striking to see the similarity with the culture and organizational traits described by Salim Ismail, Michael Malone and Yuri van Geest in their book "Exponential Organizations". The beauty of this pragmatic approach is that you have almost all you need to know in a much shorter package, one that is fun and engaging to read.
To conclude this preface, I would advise you to read this book carefully, to share it with
your colleagues, your friends and your children - when it’s time to think about what it
means to do something that matters in this new world. It tells a story about the new way
of working that you cannot afford to miss. Some of the messages (measuring everything, learning by doing, loving your code, respecting those who build things) may make the most seasoned manager smile, but times are changing. This is no longer a set of suggested, "nice-to-have" practices, as it might have been ten years ago. It is the
standard of web-age software development, and de facto the only way for any company
to succeed in the digital world.
Yves Caseau - National Academy of Technologies of France,
President of the ICT commission.
Head of Digital of AXA Group
Introduction
Something extraordinary is happening at this very moment; a sort of revolution is
underway. Across the Atlantic, as well as in other parts of the world such as France,
people are reinventing how to work with information technology.
They are Amazon, Facebook, Google, Netflix and LinkedIn, to name but the most
famous. This new generation of players has managed to shed old dogmas to examine
afresh the issues at hand by coming up with new, radical and efficient solutions for
long-standing IT problems.
Computer scientists are well aware of the fact that when IT tools are introduced to a
trade, the benefits of computerization can only be reaped if business processes are
re-thought in light of the new potential offered by technology.
One trade, however, has thus far mostly managed to avoid upheavals in its own processes: Information Technology itself. Many continued – and still do – to build information systems the way one would build highways or bridges.
There is a tendency to forget that the matter being handled on a daily basis is extremely
volatile. By dint of hearing tell of Moore's law,[1] its true meaning is forgotten: what couldn't be done last year is possible today; what cannot be done today will be possible tomorrow.
The beliefs and habits of the ecosystem we live in must be challenged at regular intervals.
This thought is both terrifying and wonderful.
Now that the pioneers have paved the way, it is important to revisit business processes. The new approaches laid out here offer significant gains in efficiency, proactivity and the capacity for innovation, to be harnessed before the competition pulls the rug out from under your feet.
The good news is that the Web Giants are not only paving the way; they espouse the
vision of an IT community.
They are committed to the Open Source principle, openly communicate their practices to appeal to potential recruits, and work in close collaboration with the research community. Their work methods are public knowledge and very accessible to those who care to delve.
The aim of this book is to provide a synthesis of practices, technological solutions and
the most salient traits of IT culture. Our hope is that it will inspire readers to make
contributions to an information age capable of reshaping our world.
This book is designed for both linear and thematic reading. Those who opt for the
former may find some repetition.
[1] empirical law which states that computing power roughly doubles in capacity at a fixed
price every 18 months.
Culture
The obsession with performance measurement................................. 13
Build vs Buy..................................................................................... 19
Enhancing the user experience......................................................... 27
Code crafters................................................................................... 33
Open Source Contribution................................................. 41
The Obsession with Performance Measurement
Description
In IT, we are all familiar with quotes reminding us of the importance of performance measurement:
"That which cannot be measured cannot be improved; without measurement, it is all opinion."
Web Giants have taken this idea to the extreme, and most have
developed a strong culture of performance measurement. The
structure of their activities leads them in this direction.
These activities often share three characteristics:
		 For these companies, IT is their means of production. Their costs
are therefore directly correlated to the optimal use of equipment and
software. Improvements in the number of concurrent users or CPU
usage result in rapid ROI.
		 Revenues are directly correlated to the efficiency of the service
provided. As a result, improvements in conversion rates lead to rapid
ROI.
		 They are surrounded by computers! And computers are excellent
measurement instruments, so they may as well get the most out of
them!
Most Web Giants have made a habit of measuring everything: response times, the most visited web pages, the articles (content or sales pages) that work best, the time spent on individual pages...
In short, nothing unusual – at first glance.
But that's not all! They also measure the heat generated by a given CPU, the energy consumption of a transformer, and the average time between two hard disk failures (MTBF, Mean Time Between Failures).[1] This motivates them to build infrastructure that maximizes the energy efficiency of their installations, as these players closely monitor PUE, or Power Usage Effectiveness.
Most importantly, they have learned to base their action plans on this
wealth of metrics.
[1] http://storagemojo.com/2007/02/19/googles-disk-failure-experience
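To make these two indicators concrete, here is a minimal sketch, in Python purely for illustration, of how PUE and MTBF reduce to simple ratios over monitoring data; the function names and figures are invented for the example, not taken from any Web Giant's tooling.

# Illustrative only: how PUE and MTBF are derived from monitoring data.
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total energy drawn by the data center
    divided by the energy consumed by IT equipment alone (1.0 is the ideal)."""
    return total_facility_kwh / it_equipment_kwh

def mtbf(total_operating_hours: float, failure_count: int) -> float:
    """Mean Time Between Failures: cumulative operating hours of a disk fleet
    divided by the number of failures observed over the period."""
    return total_operating_hours / failure_count

print(pue(1_450_000, 1_000_000))     # 1.45: 45% overhead (cooling, power losses...)
print(mtbf(10_000 * 24 * 365, 250))  # a fleet of 10,000 disks, 250 failures in a year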
Part of this trend is A/B testing (see "A/B Testing" on p. 123 for further information), which consists of testing different versions of an application on different client groups. Does A work better than B? The best way to find out remains objective measurement: it yields concrete data that often defy intuition and reveal the limits of armchair expertise, as demonstrated by the www.abtests.com website, which references A/B testing results.
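As a minimal sketch of the mechanics, assuming a simple two-variant split and a standard two-proportion z-test (the assignment scheme, figures and function names below are illustrative, not taken from any Web Giant's tooling):

# Minimal A/B testing sketch: stable variant assignment plus a two-proportion z-test.
import hashlib
from math import sqrt

def assign_variant(user_id: str) -> str:
    """Deterministic assignment: the same user always sees the same variant."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 2
    return "A" if bucket == 0 else "B"

def z_score(conv_a: int, visitors_a: int, conv_b: int, visitors_b: int) -> float:
    """Is the difference in conversion rate between A and B statistically significant?"""
    p_a, p_b = conv_a / visitors_a, conv_b / visitors_b
    p = (conv_a + conv_b) / (visitors_a + visitors_b)
    se = sqrt(p * (1 - p) * (1 / visitors_a + 1 / visitors_b))
    return (p_b - p_a) / se

# 10,000 visitors per variant; B converts at 5.6% versus 5.0% for A.
print(z_score(500, 10_000, 560, 10_000))  # ~1.9: just under the usual 1.96 threshold

Only measurements collected this way, over comparable groups and periods, allow the "does A work better than B" question to be settled.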
In an interview, Yassine Hinnach – then Senior Engineer Manager at LinkedIn –
spoke of how LinkedIn teams were encouraged to quickly put any technology
designed to boost site performance to the test. Thus decisions to adopt a
given technology are made on the basis of observed metrics.
HighScalability.com has published an article presenting Amazon’s recipes
for success, based on interviews with its CTO. Among the more interesting
quotes, the following caught our attention:
Everyone must be able to experiment, learn, and iterate.
Position, obedience, and tradition should hold no power.
For innovation to flourish, measurement must rule.[2]
As another example of this approach, here is what Timothy B. Lee, a
journalist for Wired and the New York Times, had to say about Google’s
culture of performance measurement:
Rather than having intimate knowledge of what their
subordinates are doing, Google executives rely on
quantitative measurements to evaluate the company’s
performance. The company keeps statistics on everything—
page load times, downtime rates, click-through rates,
etc—and works obsessively to improve these figures. The
obsession with data-driven management extends even to
the famous free snacks, which are chosen based on careful
analysis of usage patterns and survey results.“[3]
[2] http://highscalability.com/amazon-architecture
[3] http://arstechnica.com/apple/news/2011/06/fourth-times-a-charm-why-icloud-faces-long-odds.ars
The consequences of this modus operandi run deep. A number of pure players display in their offices the motto "In God we trust. Everything else, we test". This is more than just a nod to Deming;[4] it is a profoundly pragmatic approach to the issues at hand.
An extreme example of this trend, verging on caricature, is Google's 'Project Oxygen': a team of internal statisticians combed through HR data collected from within (annual performance reviews, feedback surveys, nominations for top-manager awards) and distilled the essence of what makes a good manager down to 8 rules. Reading through them, any manager worthy of the name would be struck by how jaw-droppingly obvious it all seems. However, they backed their claims with hard, cold data,[5] and that made all the difference!
What about me?
The French are fond of modeling, and are often less pragmatic than their
English-speaking counterparts.
Indeed, we believe that this constant and quick feedback loop of hypothesis, measurement and decision should be an almost systematic reflex in IT departments, one that can be put into effect at a moment's notice.
The author of these lines still has painful memories of two four-hour meetings with ten people, organized to find out whether moving calls to the service layer over HTTP would have a "significant" impact on performance. Ten working days of a developer's time would have largely sufficed to find out, at a much lower cost.
OCTO consultants have also discovered, several times over, that applications performed better once the cache that had been added to improve performance was removed! The cure was worse than the disease, and its alleged efficacy had never actually been measured.
Management runs the risk of falling into the trap of believing that analysis
by “hard data“ is a done deal. It may be a good idea to regularly check
that this is indeed the case, and especially that the information gathered
is put to use in decision-making.
[4] “In God we trust; all others must bring data“, W. Edward Deming.
[5] Adam BRYANT, Google's Quest to Build a Better Boss, The New York Times Company, March 12, 2011: http://www.nytimes.com/2011/03/13/business/13hire.html
Nevertheless, it cannot be emphasized enough that an ecosystem which fosters acting on this information is part of the Web Giants' recipe for success.
Two other practices support the culture of performance metrics:
		 Automated tests: it’s either red or green, no one can argue with
that. As a result, this ensures that it is always the same thing being
measured.
		 Short cycles. To measure – and especially to interpret – the data, one must be able to compare options "all other things being equal". This is crucial. We recently audited steps taken to improve the performance of an application, but about a dozen other optimizations had been made in the same release. How then can effective optimizations be distinguished from counter-productive ones?
Build vs Buy
Description
One striking difference between the strategy of the Web Giants and that of more traditional IT departments lies in their trade-offs between Build and Buy.
The issue is as old as computers themselves: is it better to invest in designing software that best fits your needs, or to use a software package that comes with the accumulated experience and R&D of a vendor (or a community) that has had all the time needed to master the technology and the business domain?
Most major firms have gone for the second option and have enshrined maximal use of software packages among their guiding principles, on the view that IT is not one of their core businesses and is therefore better left to professionals.
The major Web companies have tended to do the exact opposite. This makes sense given that IT is precisely their core business, and as such is too sensitive to be left in the hands of outsiders. The resulting divergence is thus perfectly consistent.
Nonetheless, it is useful to push the analysis one step further because Web
Giants have other motives too: first, being in control of the development
process to ensure it is perfectly adjusted to meet their needs, and second,
the cost of scaling up! These are concerns found in other IT departments,
meaning that it can be a good idea to look very closely into your software
package decisions.
Finding balanced solutions
On the first point, one of the built-in flaws of software packages is that they are designed by and for the needs most frequently expressed by the vendor's clients.[1] Your needs are thus only a small subset of what the software package is built to do. Adopting a software package by definition entails overkill, i.e. an overly complex solution not optimized for your
[1] We will not insist here on the fact that you should not stray too far from the standard
out-of-the-box software package as this can be (very) expensive in the long term,
especially when there are new releases.
needs; and which has a price both in terms of execution and complexity,
offsetting any savings made by not investing in the design and development
of a complete application.
This is particularly striking in the software package's data model. Much of the model's complexity stems from the fact that the package is optimized for interoperability (a highly standardized conceptual data model, extension tables, low model expressiveness since it is a meta-model...). The abstractions and the "hyper-genericity" this leads to in the software design have an impact on processing performance.[2]
Moreover, Web Giants have constraints in terms of volumes, transaction speed and number of simultaneous users that push the envelope of traditional architectures and therefore require fine-tuned optimizations based on observed access patterns. Read-intensive transactions must not be optimized in the same way as those whose performance is determined by write I/O.
In short, to attain such results, you have to pop the hood and poke around
in the engine, which is not something you will be able to do with a software
package (all guarantees are revoked from the moment you fiddle with the
innards).
Because performance is an obsession for Web Giants, the overhead costs
and low possibilities for adjustments to the software package make the
latter quite simply unacceptable.
Costs
The second particularly critical point is of course the cost of scaling up. When the number of processors and servers increases, the costs rise very quickly, and not always linearly, which makes some budget items suddenly very visible. And this is true of both business software packages and hardware.
That is precisely one of the arguments which led LinkedIn to gradually replace their Oracle database with an in-house solution, Voldemort.[3]
In a similar vein, in 2010 we carried out a study on the main e-commerce
[2] When it is not a case of a cumbersome interface.
[3] Yassine Hinnach, Évolution de l'architecture de LinkedIn, enjeux techniques et organisationnels, USI 2011: http://www.usievents.com/fr/conferences/8-paris-usi-2011/sessions/1007
sites in France: at the time, eight of the ten largest sites (in terms of annual turnover) ran on platforms developed in-house and two used e-commerce software packages.
Web Giants thus prefer Build to Buy. But that is not all: they also make massive use of open source solutions (cf. "Open Source Contribution", p. 41). Linux and MySQL reign supreme in many firms. Development languages and technologies are almost all open source: very little .NET for example, but instead Java, Ruby, PHP, C(++), Python, Scala... And they do not hesitate to fork existing projects: Google for example uses a largely modified Linux kernel.[4]
This is also the case for one of the main
worldwide Global Distribution Systems.
Most technologies making a stir today in the world of high-performance architecture are the result of developments carried out by Web Giants and then opened to the community: Cassandra, developed by Facebook; Hadoop and HBase, inspired by Google and developed by Yahoo!; Voldemort, by LinkedIn...
This is, in fact, a way of combining the advantages of software perfectly tailored to your needs with the improvements contributed by the development community, and, as an added bonus, a market of developers already trained in the technologies you use.
Coming back to the example of LinkedIn, many of their technologies are
grounded in open source solutions:
		 Zoie, a real time indexing and search system based on Lucene.
		 Bobo, a faceted search library based on Lucene.
		 Azkaban, a batch workflow job scheduler to manage Hadoop job
dependencies.
		 GLU, a deployment framework.
[4] http://lwn.net/Articles/357658
How can I make it work for me?
Does this mean you should do away with software packages in your IT choices? Of course not, not for everything. Software packages can be the best solution: no one today would dream of redeveloping a payroll system from scratch. However, custom development should be considered in certain cases, namely when the IT tool is key to the success of your business. Figure 1 lays out strategic guidelines.
[Figure 1: Build vs Buy strategy. Unique, differentiating capabilities, perceived as commercial assets (innovations and strategic assets): specific development, chosen for speed. Capabilities common to organizations in the same industry, perceived as production assets: software package. Capabilities common to all organizations, perceived as resources: BPO (Business Process Outsourcing), chosen for lower cost.]
The other context where specific developments can be the right choice is
that of high performance: with companies turning to “full web solutions“,
very few business software packages have the architecture to support
the traffic intensity of some websites.
As for infrastructure solutions, open source has become the norm: operating systems and application servers foremost, and often databases and message buses as well. Open source solutions are ideally adapted to running the platforms of the Web Giants; there is no doubt as to their performance and stability.
One hurdle remains: reluctance on the part of CIOs to forgo the support that comes with commercial software. And yet, when problems arise on a commercial technical platform, it is rarely the publisher's support, handsomely paid for, that provides the solution, but rather networks of specialists and help forums
on the Internet. For application platforms of the database or message bus type, the answer is less clear-cut, because some commercial solutions include functionalities that you will not find in open source alternatives. However, if you are pushing Oracle into territory where MySQL cannot follow, it means you have very sophisticated needs... which is not the case in 80% of the contexts we encounter!
Enhancing User Experience
Description
Performance: a must
One conviction shared by the Web Giants is that users' perception of performance is crucial. Performance is directly linked to visitor retention and loyalty: how users feel about a particular service is tied to the speed with which the graphic interface is displayed. Most people have no interest in software architecture, server power, or the network latency behind web-based services. All that matters is the impression of seamlessness.
User-friendliness is no longer negotiable
Web Giants have fully grasped this and speak of metrics in terms of "the blink of an eye". In other words, it is a matter of fractions of a second. Their measurements, carried out notably through A/B testing (cf. "A/B Testing", p. 123), are very clear:
		 Amazon: a 100 ms increase in latency means a 1% loss in sales.
		 Google: a page taking more than 500 ms to load loses 20% of its traffic (pages visited).
		 Yahoo!: more than 400 ms to load means 5 to 9% more visitors abandoning the page.
		 Bing: over one second to load means a loss of 2.8% in advertising revenue.
How is such performance attained?
In keeping with the Device Agnostic pattern (cf. “Device Agnostic“,
p. 143), Web Giants develop native interfaces, or Web interfaces,
to always offer the best possible user experience. In both cases,
performance as perceived by the user must be maximized.
25
THE WEB GIANTS
Native applications
With the iPhone, Apple reintroduced applications developed for a specific device (stopping short of assembler, however) to maximize perceived performance. Thus Java and Flash technologies are banished from the iPhone. The platform also uses visual artifacts: when an app is launched, it first displays a snapshot of the view as it appeared the last time the app was used, to strengthen the impression of instantaneous start-up, while the actual app loads in the background. On Android, Java applications are executed on a virtual machine optimized for the platform. They can also be written in C to maximize performance.
Generally speaking, there is a consensus around native development, especially on mobile platforms: it must be as tightly linked as possible to the device. Multi-platform technologies such as Java ME, Flash and Silverlight do not directly enhance the user experience and are therefore put aside.
Web applications
Fully loading a Web page usually takes between 4 and 10 seconds (including graphics, JavaScript, Flash, etc.). Perceived slowness in display is generally attributable for about 5% to server processing and for 95% to browser-side processing. Web Giants have therefore taken considerable care to optimize the display of Web pages.
As an illustration, here is a list of the main good practices which most agree optimize perceived performance:
		 It is crucial to cache all static resources (graphics, CSS style sheets, JavaScript scripts, Flash animations, etc.) whenever possible. There are various HTTP cache mechanisms for this, and it is important to become skillful at managing the life-cycle of the resources in the cache (see the sketch after this list).
		 It is also advisable to use a cache network, or Content Delivery
Network (CDN) to bring the resources as close as possible to the
end user to reduce network latency. We highly recommend that you
have cache servers in the countries where the majority of your users
live.
26
CULTURE / ENHANCING THE USER EXPERIENCE
		 Downloading in the background is a way of masking sluggishness in the display of the various elements on the page.
		 One thing many do is to use sprites: the principle is to aggregate images in a single file to limit the amount of data to be loaded; the right image is then selected on the fly by the browser (see the Gmail example below).
		 Having recourse to multiple domain names is a way to maximize parallel loading of resources by the browser. One must bear in mind that browsers limit the number of simultaneous requests to a single domain. Yahoo.fr, for example, loads its images from l.yimg.com.
		 Placing JavaScript resources at the very end of the page to ensure
that graphics appear as quickly as possible.
		 Using minification tools, i.e. removing from the code (JavaScript, HTML, etc.) all the characters (line breaks, comments, etc.) that help humans read the code but are not needed to execute it, and shortening function names as much as possible.
		 Concatenating the various source files, such as JavaScript, into a single file whenever possible.
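As a hedged illustration of the first practice above (caching static resources and managing their life-cycle), here is a minimal sketch using only the Python standard library; the one-year lifetime, the file extensions and the assumption that assets are versioned are choices made for the example, not recommendations from the book.

# Sketch: serve static resources with far-future cache headers so that browsers
# and CDNs can keep them; other responses are revalidated on each visit.
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

CACHEABLE = (".css", ".js", ".png", ".jpg", ".gif", ".woff2")

class StaticCacheHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        # Static assets are assumed to be versioned (e.g. app.3f2a1c.js),
        # so they can safely be cached for a year.
        if self.path.endswith(CACHEABLE):
            self.send_header("Cache-Control", "public, max-age=31536000, immutable")
        else:
            self.send_header("Cache-Control", "no-cache")
        super().end_headers()

if __name__ == "__main__":
    ThreadingHTTPServer(("0.0.0.0", 8000), StaticCacheHandler).serve_forever()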
Who makes it work for them?
There are many examples of such practices among Web Giants, e.g.
Google, Gmail, Viadeo, Github, Amazon, Yahoo!...
References among Web Giants
Google has the most extensive distributed cache network of all Web
Giants: the search giant is said to have machines in all major cities, and
even a private global network, although corroboration is difficult to come
by.
Google Search pushes the real-time user experience to the limits with its
“Instant Search“ which loads search results as you type your query. This
function stems from formidable technical skill and has aroused the interest
of much of the architect community.
Gmail's images are reduced to a strict minimum (two sprite images, shown in Figure 1), and the site makes intensive use of caching and loads JavaScript in the background.
Figure 1: Gmail sprite images.
France
Sites using or having used the content delivery network Akamai:
		 cite-sciences.fr
		 lemonde.fr
		 allocine.com
		 urbandive.com
How can I make it work for me?
The consequences of display latency are the same for in-house applications in any IT department: users get fed up with the application and stop using it. This is to say that this pattern applies perfectly to your own business.
Sources
• Eric Daspet, “Performance des applications Web, quoi faire et
pourquoi ?“ USI 2011 (French only):
> http://www.usievents.com/fr/conferences/10-casablanca-usi-2011/sessions/997-performance-des-applications-web-quoi-faire-et-pourquoi
• Articles on Google Instant Search:
> http://highscalability.com/blog/2010/9/9/how-did-google-instant-become-faster-with-5-7x-more-results.html
> http://googleblog.blogspot.com/2010/09/google-instant-behind-scenes.html
Editor's note: sprites are by definition designed for screen display; we are unable to provide a sharper version for the printed example. Thank you for your understanding.
Code Crafters
Description
Today the Web Giants are there to remind us that a career as a developer can be just as prestigious as that of a manager or consultant. Indeed, some of the most striking successes of Silicon Valley have originated with one or several visionary geeks who are passionate about quality code.
When these companies' products gain in visibility, satisfying an increasing number of users means sticking to a virtuous circle of development quality, without which success can vanish as quickly as it came.
This is why a software development culture is so important to the Web Giants, based on a few key principles:
		 attracting and recruiting the best programmers,
		 investing in developer training and allowing them more
independence,
		 gaining their loyalty through workplace attractiveness and payscale,
		 being intransigent as to the quality of software development -
because quality is non-negotiable.
Implementation
The first challenge the Giants face is thus recruiting the best
programmers. They have become masters at the art, which is trickier than
it might at first appear.
One test often used by the majors is to have candidates write code. A test Facebook uses is FizzBuzz. This exercise, inspired by a drinking game some of you might recognize, consists in displaying the numbers from 1 to 100, except that multiples of 3 must be replaced by "Fizz", multiples of 5 by "Buzz", and multiples of both 3 and 5 by "FizzBuzz". This little programming exercise weeds out 99.5% of the candidates. Similarly, to be hired by Google, between four and nine technical interviews are necessary.
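A minimal solution, shown here in Python purely for illustration (candidates are normally free to choose their language), fits in a dozen lines; what the test reveals is simply whether a candidate can turn a trivial specification into working code.

# The classic FizzBuzz exercise: numbers 1 to 100, with multiples of 3 replaced
# by "Fizz", multiples of 5 by "Buzz", and multiples of both by "FizzBuzz".
for n in range(1, 101):
    if n % 15 == 0:
        print("FizzBuzz")
    elif n % 3 == 0:
        print("Fizz")
    elif n % 5 == 0:
        print("Buzz")
    else:
        print(n)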
Salary is obviously to be taken into account. To have very good developers,
you have to be ready to pay the price. At Facebook, Senior Software
Engineers are among the best paid employees.
Once programmers have joined your firm, the second challenge is to
favor their development, fulfillment, and to enrich their skills. In such
companies, programmers are not considered code laborers to be
watched over by a manager but instead as key players. The Google
model, which encourages developers to devote 20% of their time to
R&D projects, is often cited as an example. This practice can give rise
to contributions to open-source projects, which provide many benefits
to the company (cf. "Open Source Contribution", p. 41). On the Netflix blog, for example, they mention their numerous open source initiatives, notably on Zookeeper and Cassandra. The benefit to Netflix is twofold: its developers gain recognition outside the company, while at the same time developing the Netflix platform.
Another key element in developer loyalty is the working conditions. The
internet provides ample descriptions of the extent to which Web Giants
are willing to go to provide a pleasant workplace. The conditions are
strikingly different from what one finds in most Tech companies. But that
is not all! Netflix, again, has built a culture which strongly focuses on its
employees' autonomy and responsibility. More recently, Valve, a video game publisher, sparked a buzz among developers when they published their Handbook, which describes a work culture that is highly demanding but also propitious to personal fulfillment. 37signals, lastly, in their book Getting Real, lay out their very open practices, often the opposite of what one generally finds in such organizations.
In addition to efforts deployed in recruiting and holding on to
programmers, there is also a strong culture of code and software quality.
It is this culture that creates the foundations for moving and adapting
quickly, all while managing mammoth technological platforms where
performance and robustness are crucial. Web Giants are very close to
the Software Craftsmanship[1]
movement, which promotes a set of
values and practices aiming to guarantee top-quality software and to
provide as much value as possible to end-users. Within this movement,
Google and GitHub have not hesitated to share their coding guidelines[2]
.
[1] http://manifesto.softwarecraftsmanship.org
[2] http://code.google.com/p/google-styleguide/ and https://github.com/styleguide
How can I make it work for me?
Recruiting: It is important to implement a very solid recruitment process when hiring your programmers. After a first interview to get a sense of the person you wish to recruit, it is essential to have them code. You can propose a few technical exercises to assess the candidate's expertise, but it is even more interesting to have them pair-program with one of your developers, to see whether the chemistry is right. You can also ask programmers to show their own code, especially what they are most proud of - or most ashamed of. More than the code itself, the discussion around it will bring a wealth of information on the candidate. Also, have they put their code on GitHub? Do they take part in open source projects? If so, you will have representative samples of the code they can produce.
Quality: Offer your developers a context that allows them to keep producing top-quality software (since that is non-negotiable). Leave them time to write unit tests, to set up the build infrastructure you will need for Continuous Deployment (cf. "Continuous Deployment", p. 105), to work in pairs, to hold design workshops on their business domain, and to prototype. The practice known to have the most impact on quality is peer code review. It happens all too rarely in our industry.
R&D: Giving your developers the chance to participate in R&D projects in
addition to their work is a practice which can be highly profitable. It can
generate innovation, contribute to project improvement and, in the case
of Open Source, increase your company’s attractiveness for developers. It
is also simply a source of motivation for this often neglected group. More and more firms are adopting Hackathons, popularized by Facebook, where the principle is to produce working software in one or two days of coding.
Training: Training can be externalized but you can also profit from
knowledge sharing among in-house developers by e.g. organizing group
programming workshops, commonly called “Dojo“.[3]
Developers can
gather for half a day, around a video projector, to share knowledge and
together learn about specific technical issues. It is also a way to share
developer practices and, within a team, to align with programming
standards. Lastly, working on open source projects is also a way of learning
about new technologies.
Workplace: Where and how you work are important! Allowing independence, promoting openness and transparency, treating mistakes as part of learning and keeping a sustainable rhythm are all practices that pay off in the long term.
Associated patterns
Pattern “Pizza Teams“, p. 59.
Pattern "DevOps", p. 71.
Pattern “Continuous Deployment“, p. 105.
Sources
•	 Company culture at Netflix:
> http://www.slideshare.net/reed2001/culture-1798664
•	 What every good programmer should know:
> http://www.slideshare.net/petegoodliffe/becoming-a-better-programmer
•	 List of all the programmer positions currently open at Facebook:
> http://www.facebook.com/careers/teams/engineering
•	 The highest salary at Facebook? Senior Software Engineer:
> http://www.businessinsider.com/the-highest-paying-jobs-at-facebook-ranked-2012-5?op=1
[3] http://codingdojo.org/cgi-bin/wiki.pl?WhatIsCodingDojo
•	 GitHub programming guidelines:
> https://github.com/styleguide
•	 How GitHub grows:
> http://zachholman.com/talk/scaling-github
•	 Open source contributions from Netflix:
> http://techblog.netflix.com/2012/07/open-source-at-netflix-by-ruslan.html
•	 The FizzBuzz test:
> http://c2.com/cgi/wiki?FizzBuzzTest
•	 Getting Real:
> http://gettingreal.37signals.com/GR_fra.php
•	 The Software Craftsmanship manifesto:
> http://manifesto.softwarecraftsmanship.org
•	 The Google blog on tests:
> http://googletesting.blogspot.fr
•	 The Happy Manifesto:
> http://www.happy.co.uk/wp-content/uploads/Happy-Manifesto1.pdf
Open Source Contribution
Description
Why is it that Web Giants such as Facebook, Google and Twitter do so much to develop Open Source?
A technological edge is a key to conquering the Web. Whether it be to
stand out from the competition by launching new services (remember
when Gmail came out with all its storage space at a time when Hotmail
was lording it?) or more practically to overcome inherent constraints such
as the growth challenge linked to the expansion of their user base. On
numerous occasions, Web Giants have pulled through by inventing new
technologies.
Given this, one would think that their technological mastery, and the asset that their code represents, would be carefully shielded from prying eyes. In fact, the widely shared pattern is the opposite: Web Giants are not only major consumers of open source technology, they are also major contributors.
The pattern “developing open source“ consists of making public a software
tool (library, framework...) developed and used in-house. The code is
made available on a public server such as GitHub, with a free license of
the Apache type for example, authorizing its use and adaptation by other
companies. In this way, the code is potentially open to development by
the entire world. Moreover, open source applications are traditionally
accompanied by much publicity on the web and during programming
conferences.
Who makes it work for them?
There are many examples. Among the most representative is Facebook
and its Cassandra database, built to manage massive quantities of data
distributed over several servers. It is interesting to note that among current
users of Cassandra, one finds other Web Giants, e.g. Twitter and Digg,
whereas Facebook has abandoned Cassandra in favor of another open
source storage solution - HBase - launched by the company Powerset.
With the NoSQL movement, the new foundations of the Web are today
massively based on the technologies of the Giants.
Facebook has furthermore opened several frameworks up to the community, such as its HipHop engine, which compiles PHP into C++, Thrift, a multi-language services development framework, and Open Compute, an open hardware initiative which aims to optimize how datacenters function. But Facebook is not alone.
Google has done the same with its user interface framework GWT, used notably in AdWords. Another example is the Tesseract Optical Character Recognition (OCR) tool, initially developed by HP and then by Google, which opened it up to the community a few years later. Lastly, one cannot mention Google without citing Android, its open source operating system for mobile devices, not to mention their numerous scientific publications on storing and processing massive quantities of data, most notably the papers on BigTable and MapReduce which inspired the Hadoop project.
The list could go on and on, so we will end with Twitter and its very trendy responsive-design CSS framework, Bootstrap, and the excellent Ruby on Rails, extracted from the Basecamp project management software and opened up to the community by 37signals.
Why does it work?
Putting aside ideological considerations, we propose to explore the various advantages to be drawn from developing open source software.
Open and free does not necessarily equate with price and profit wars. In fact, from one angle, opening up software is a way of nipping competition in the bud for specific technologies. Contributing to Open Source is a way of redefining a given technology sector while ensuring sway over the best available solution. For a long time, Google was the main sponsor of the Mozilla Foundation and its flagship project Firefox, to the tune of 80% of its funding: a way of diversifying to counter Microsoft. Let us now turn to the three main benefits.
Promoting the brand
By opening cutting-edge technology up to the community, Web Giants position themselves as leaders, as pioneers. It implicitly communicates the spirit of innovation reigning in their halls, a constant quest for improvement. They show themselves as being able to solve big problems, masters of technological prowess. Delivering a successful Open Source framework says that you solved a common problem faster or better than anyone else, and that, in a way, the problem is now behind you. Done and gone, you are already moving on to the next one. One step ahead of the game.
To share a framework is to make a strong statement, to reinforce the brand. It is a way to communicate an implicit and primal message: "We are the best, don't you worry."
And then, to avoid being seen as the new Big Brother, one cannot help but feel that the other implied message is:
"We're open, we're good guys, fear not."[2]
Attracting - and keeping - the best
This is an essential aspect which can be fostered by an open source approach. "Displaying your code" means showing part of your DNA, your way of thinking, of solving problems - show me your code and I will tell you who you are. It is the natural way of publicizing what exactly goes on in your company: the expertise of your programmers, your quality standards, what your teams work on day by day... A good means to attract "compatible" coders who will already have been following your company's projects.
Developing Open Source thus helps you to spot the most dedicated, competent and motivated programmers, and when you hire them you can already be sure they will fit easily into your ecosystem. In a manner of speaking, Open Source is like a huge trial period, open to all.
[2] Google’s motto: “Don’t be evil“
Attracting the best geeks is one thing; hanging on to them is another. On this point, Open Source can be a great way to offer your company's best programmers a showcase open to the whole world. That way they can show their brilliance, within their company and beyond. Promoting Open Source bolsters your programmers' resumes. It takes into account the personal branding needs of your staff, while keeping them happy at work. All programmers want to work in a place where programming matters, within an environment which offers a career path for software engineers. Take it from a programmer.
Improving quality
Simply “thinking open source“ is already a leap forward in quality:
opening up code - a framework - to the community first entails defining its
contours, naming it, describing the framework and its aim. That alone is a
significant step towards improving the quality of your software because it
inevitably leads to breaking it up into modules, giving it structure. It also
makes it easier to reuse the code in-house. It defines accountability within
the code and even within teams.
It goes without saying that programmers who are aware that their code
will be checked (not to mention read by programmers the world over) will
think twice before committing an untested method or a hastily assembled
piece of code. Beyond making programmers more responsible, feedback
from peers outside the company is always useful.
How can I make it work for me?
When properly used, Open Source can be an intelligent way not only to structure your R&D but also to assess programmer performance.
The goal of this article was to explore the various advantages offered by opening up certain technologies. If you are not quite ready to make the jump culturally speaking, or if your IS is not ready yet, it can nonetheless be useful to test the waters with a few simple-to-implement actions.
Depending on the size of your company, launching your very first Open Source project can unfortunately be met with general indifference. We do not all have Facebook's power of communication. Beginning by
contributing to Open Source projects already underway can be a good
initial step for testing the culture within your teams.
Following the example of Google and GitHub, another action which works towards the three advantages laid out here is to write down your programming guidelines and publish them on the web. Another possibility is to encourage your programmers to open a development blog where they can discuss the main issues they have come up against. The Instagram Engineering blog on Tumblr can be a very good source of inspiration.
Sources
• The Facebook developer portal, Open Source projects:
> http://developers.facebook.com/opensource
• Open Source projects released by Google:
> http://code.google.com/opensource/projects.html
• The Twitter developer portal, Open Source projects:
> http://dev.twitter.com/opensource/projects
• Instagram Engineering Blog:
> http://instagram-engineering.tumblr.com
• The rules for writing GitHub code:
> http://github.com/styleguide
• A question on Quora: "Why would a big company do open-source projects?":
> http://www.quora.com/Open-Source/Why-would-a-big-company-do-open-source-projects
Sharing Economy Platforms
Description
The principles at work in the platforms of the sharing economy (exponential business platforms) are one of the keys to the success of the Web Giants and of other startups valued at over $1 billion ("unicorns"), such as BlaBlaCar, Cloudera and Social Finance, or at over $10 billion ("decacorns"), such as Uber, AirBnB, Snapchat and Flipkart (see the list and valuation of the uni/decacorns).
The latter are disrupting existing ecosystems, inventing new ones, wiping out others. And yet "businesses never die, only business models evolve" (to learn more, see Philippe Silberzahn, "Relevez le défi de l'innovation de rupture").
Concerns over the risks of disintermediation are legitimate, given that digital technology has led to the development of numerous highly successful "exponential business platforms" (see the article by Maurice Lévy, "Se faire ubériser").
The article below begins with a recap of what these platforms have in common and then explores the main fundamentals required to build, or become, an exponential business platform.
The wonderful world
of the “Sharing economy“
Thereisacontinuousstreamofnewcomersknockingatthedoor,progressively
transforming many sectors of the economy, driving them towards a so-
called “collaborative“ economy. Among other goals, this approach strives
to develop a new type of relation: Consumer-to-Consumer (C2C). This is
true e.g. in the world of consumer loans, where the company LendingHome
(Presentation of LendingHome) is based on peer-2-peer lending. Another
area of interest is blockchain technology such as decentralisation and the
“peer-2-peer'isation“ of money through the Bitcoin!
What is most striking is that this type of relation can have an impact in
unexpected places such as personalised urban car services (e.g. Luxe and
Drop Don't Park), and movers (Lugg as an “Uber/Lyft for moving“).
Business platforms such as these favor peer-2-peer relations. They have
achieved exponential growth by leveraging the multitudes (For further
information, see: Nicolas Colin & Henri Verdier, L'âge de la multitude:
Entreprendre et gouverner après la révolution numérique). Such models
make it possible for very small structures to grow very quickly by generating
revenues per employee which can be from 100 to 1000 times higher than
in businesses working in the same sector but which are much larger. The
fundamental question is then to know what has enabled some of them to
become hits and to grow their popularity, in terms of both community and
revenues. What are the ingredients in the mix, and how does one become so
rapidly successful?
At this stage, the contextual elements and common ground we discern are:
	 An often highly regulated market where these platforms appear
and then develop by providing new solutions which break away
from regulations (for example the obligation for hotels to make at
least 10% of their rooms disability friendly, which does not apply
to individuals using the AirBnB system).
	 An as yet unmet need in supply and demand can make it possible
to earn a living or to generate additional revenue for a better
quality of life (Cf. AirBnB's 2015 communication campaign on the
subject) or at the least to share costs (Blablacar). This point in
particular raises crucial questions as to the very notion of work, its
regulation and the taxation of platforms.
	 There is strong friction around the client and citizen experience, to
which the market has yet to provide a response (such as valet parking
services in large cities around the world where parking is completely
saturated).
	 A deliberate strategy to not invest in material assets but rather to
efficiently embrace the business of creating links between people.
Given this understanding of the context, the 5 main principles we propose
to become an exponential business platform are:
	 Develop your “network lock-in effect“.
	 Pair up algorithms with the user experience.
	 Develop trust.
	 Think user and be rigorous in execution.
	 Carefully choose your target when you launch platform
experiments.
“Network lock-in effect“
The more supply and demand grow and come together, the more
indispensable your platform becomes. Indispensable because in the end
that is where the best offers are to be found, the best deals, where your
friends are.
There is an inflection point where the network of suppliers and users
becomes the main asset, the central pillar. Attracting new users is no
longer the principal preoccupation. This asset makes it possible to
become the reference platform for your segment. This growth can provide
a monopoly over its use case, especially if there are exclusive deals that
can be obtained through offers valid on your platform only.
It can then extend to offers which follow upon the first (for example Uber's
position as an urban mobility platform has led them to diversify into a
meal delivery service for restaurants). This is one of the elements which
were very quickly theorised in the Lean Startup approach: the virality
coefficient.
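The virality coefficient can be made concrete with the standard Lean Startup formula (this is the generic definition, not data from any particular platform; the figures below are invented for illustration): each user sends a number of invitations, a fraction of which convert into new users.

```python
# Generic virality (K-factor) illustration - invented figures, not platform data.
invitations_per_user = 5           # average invitations sent by each new user
invitation_conversion_rate = 0.25  # share of invitations that become new users

k_factor = invitations_per_user * invitation_conversion_rate
print(f"K = {k_factor}")  # 1.25: each user brings in more than one new user,
                          # so the user base grows by itself (exponential growth)
```

As long as K stays above 1, the network effect feeds itself; below 1, growth has to be bought through acquisition.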
The perfect match:
User eXperience & Algorithms
What is crucial in the platform is setting up the perfect match between
supply and demand: speed in making connections in time and/or space,
lower prices compared with traditional systems, and even services that
were not possible before. For some, matching algorithms are the core of
their operations, delivering on their daily promise of offering relevant
suggestions and connections within a few microseconds.
The perfect match is a fine-tuned mix between stellar research into the
user experience (all the way to swipe!), often using a mobile-first approach
to explore and offer services, based on advanced algorithms to expose
relevant associations. A telling example is the use of the “swipe“ as a
uniquely tailored user experience for fast browsing, as in the dating app
Tinder.
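As a deliberately naive sketch of the kind of matching logic hiding behind such user experiences (the offers, fields and scoring weights are invented; real platforms use far richer signals), the core idea is simply to score the available offers against a request and surface the most relevant ones:

```python
# Naive matching sketch: invented data and weights, for illustration only.
from dataclasses import dataclass

@dataclass
class Offer:
    name: str
    distance_km: float  # distance between supplier and requester
    rating: float       # community rating, 0 to 5

def score(offer: Offer) -> float:
    # Closer and better-rated offers rank higher; the weights are arbitrary here.
    return offer.rating * 2.0 - offer.distance_km * 0.5

def best_matches(offers, top_n=3):
    return sorted(offers, key=score, reverse=True)[:top_n]

if __name__ == "__main__":
    offers = [
        Offer("driver A", distance_km=1.2, rating=4.8),
        Offer("driver B", distance_km=0.4, rating=3.9),
        Offer("driver C", distance_km=5.0, rating=4.9),
    ]
    for offer in best_matches(offers):
        print(offer.name, round(score(offer), 2))
```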
Trust & security
To get beyond the early adopters and reach the market majority, two
elements are critical to the client experience: trust in the platform, and
trust towards the other platform users (both consumers and providers).
Who has not experienced stress when reserving one's first AirBnB? Who
has not wondered whether Uber would actually be there?
The level of trust conveyed by the platform and its users is so important
that it has been one of the main leverage points, as for the BlaBlaCar
ride-sharing platform, which thrived once transactions were handled by
the platform itself.
What happens to the confidential data provided to the platform?
You may remember the recent hack of personal data on the “Ashley
Madison“ site, affecting 37 million platform users who wanted total
discretion (revelations around the hacking of the Ashley Madison site).
Security is thus key to protecting platform transactions, guaranteeing
private data and reassuring users.
Think user & excel in execution
Above all it is about realising that what the market and what the clients
want is not to be found in marketing plans, sales forecasts and key
functionalities. The main questions to ask revolve around the triad Client
/ Problem / Solution: Do I really have a problem that is worth solving? Is
my solution the right one for my client? Will my client buy it? For how
much? Use whatever you can to check your hypotheses: interviews, market
studies, prototypes...
To succeed, these platforms aim to reach production very quickly, iterating
and improving while their competition is still exploring their business plan.
It is then a ferocious race between pioneers and copycats, because in
this type of race “winner takes all“ (For further reading, see The Second
Machine Age, Erik Brynjolfsson & Andrew McAfee).
Then excellence in execution becomes the other pillar. This operational
excellence covers:
	 the platform itself and the users it “hosts“: active users, quality of
the goods offered, quality of ratings with numerous well-assessed
offers...
	 the offers mediated by the platform (comments, satisfaction
surveys...)
One may note in particular the example of AirBnB on the theme of
excellence in execution beyond software, where the quality of the lodging
descriptions as well as beautiful photos were a strong differentiator
compared with the competition of the time, Craigslist (a few words on
the quality of the photos at AirBnB).
Critical market size
Critical market size is one of the elements which make it possible to rapidly
reach a sufficiently strong network effect (speed in reaching a critical size is
fundamental to not being overrun by copycats).
Critical market size is made up of two aspects:
	 Selecting the primary territories for deployment, most often in
cities or mega-cities,
	 Ensuring deployment in other cities in the area, when possible in
standardized regulatory contexts.
You must therefore choose cities particularly concerned by your platform's
value proposition, where the number of early adopters is high enough to
quickly draw lessons. Mega-cities in the Americas, Europe and Asia are
therefore prime targets for experimental deployments.
Lastly, during the generalisation phase, it is no surprise to see stakeholders
deploying massively in the USA (a market which represents 350 million
inhabitants, with standardised tax and regulatory environments, despite
state and federal differences) or in China (where the Web giants are among
the most impressive players, such as: Alibaba, Tencent and Weibo) as well
as Russia.
In Europe, cities such as Paris, Barcelona, London, Berlin, etc. are often
prime choices for businesses.
What makes it work for them?
As examined above, there are many ingredients for exponentially
scalable organizations and business models built on the platform model:
strong possibilities for employees to self-organise, the user experience,
continuous experimentation, algorithms (notably intelligent matching),
and leveraging one's community.
What about me?
For IT and marketing departments, you can begin your thinking by
exploring digital innovations (looking for new uses) that fit in with your
business culture (based e.g. on Design thinking).
In certain domains, this approach can give you access to new markets or to
disruption before the competition. A recent example is that of Accor which
has entered the market of independent hotels through its acquisition of
Fastbooking (Accor gets its hands on Fastbooking).
Still in the area of self-disruption, two main strategies are coming to the
fore. The first consists in getting back into the game without shouldering
all of the risk, through partnerships or capital investments via incubators.
The other strategy, more ambitious and therefore riskier, is to take
inspiration from these new approaches in order to transform from within.
It is then important to examine whether some of these processes can be
opened up to transform them into an open platform, thereby leveraging the
multitudes.
In the distribution sector for example, the question of positioning and
opening up various strategic processes is raised: is it a good idea to turn
your supply chain into a peer-2-peer platform so that SMEs can become
consumers and not only providers in the supply chain? Are pharmacies the
next on the list of programmed uberisations through stakeholders such
as 1001pharmacie.com? In the medical domain, Doctolib.com has just
raised €18 million to fund its development (Doctolib raises funds)...
Associated patterns
Enhancing the user experience
A/B Testing
Feature Flipping
Lean Startup
Sources
•	List of unicorns:
 https://www.cbinsights.com/research-unicorn-companies
•	Philippe Silberzahn, “Relevez le défi de l’innovation de rupture“, édition Pearson
•	Article by Maurice Levy, “Tout le monde a peur de se faire ubériser“:
 http://www.latribune.fr/technos-medias/20141217tribd1e82ceae/tout-le-monde-a-peur-de-se-faire-uberiser-maurice-levy.html
•	LendingHome presented on “C’est pas mon idée“:
 http://cestpasmonidee.blogspot.fr/2015/09/lendinghome-part-lassaut-du-credit.html
•	Nicolas Colin & Henri Verdier, “L’âge de la multitude“, 2nde édition
•	Ashley Madison hacking:
 http://www.slate.fr/story/104559/ashley-madison-site-rencontres-extraconjugales-hack-adultere
•	Erik Brynjolfsson & Andrew McAfee, “The Second Machine Age“
•	Quality of AirBnB photos:
 https://growthhackers.com/growth-studies/airbnb
•	Accor met la main sur Fastbooking:
 http://www.lesechos.fr/17/04/2015/lesechos.fr/02115417027_accor-met-la-main-sur-fastbooking.htm
•	Doctolib raises €18M:
 http://www.zdnet.fr/actualites/doctolib-nouvelle-levee-de-fonds-a-18-millions-d-euros-39826390.htm
Organization
Pizza Teams..................................................................................... 59
Feature Teams................................................................................. 65
DevOps........................................................................................... 71
Pizza
Teams
Description
What is the right size for a team to develop great software?
Organizational studies have been investigating the issue of team size
for several years now. Although answers differ and seem to depend on
various criteria such as the nature of tasks to be carried out, the average
level, and team diversity, there is consensus on a size of between 5 and
15 members.[1][5]
Any fewer than 5 and the team is vulnerable to outside
events and lacks creativity. Any more than 12 and communication is less
efficient, coherency is lost, there is an increase in free-riding and in power
struggles, and the team’s performance drops rapidly the more members
there are.
This is obviously also true in IT. The firm Quantitative Software
Management, specialized in the preservation and analysis of metrics from
IT projects, has published some interesting statistics. If you like numbers,
I highly recommend their website: it is chock-full of information! Based
on a sample of 491 projects, QSM measured a loss of productivity
and heightened variability with an increase in team size, with a quite
clear break once one reaches 7 people. In correlation, average project
duration increases and development efforts skyrocket once one goes
beyond 15.[6]
In a nutshell: if you want speed and quality, cut your team size!
Why are we mentioning such matters in this work devoted to Web Giants?
Very simply because they are particularly aware of the importance of team
size for project success, and daily deploy techniques to keep size down.
[1] http://knowledge.wharton.upenn.edu/article.cfm?articleid=1501
[2] http://www.projectsatwork.com/article.cfm?ID=227526
[3] http://www.teambuildingportal.com/articles/systems/teamperformance-teamsize
[4] http://math.arizona.edu/~lega/485-585/Group_Dynamics_RV.pdf
[5] http://www.articlesnatch.com/Article/What-Project-Team-Size-Is-Best-/589717
[6] http://www.qsm.com/process_improvement_01.html
In fact the title of this chapter is inspired by the name Amazon gave to
this practice:[7]
if your team can’t be fed with two pizzas, then reduce its size.
Admittedly these are American-sized pizzas, but that still means about 8 people.
Werner Vogels (Amazon VP and CTO) drove the point home with the
following quote which could almost be by Nietzsche:
Small teams are holy.
But Amazon is not alone, far from it.
To illustrate the importance that team dynamics have for Web Giants:
Google hired Evan Wittenberg to be manager of Global Leadership
Development; the former academic was known, in part, for his work on
team size.
The same discipline is applied at Yahoo! which limits its product teams in
the first year to between 5 and 10 people.
As for Viadeo, they have adopted the French pizza-size approach, with
teams of 5-6 people.
In the field of startups, Instagram, Dropbox, Evernote.... are known for
having kept their development teams as small as possible for as long as
possible.
How can I make it work for me?
A small, agile team will always be more efficient than a big lazy team; such
is the conclusion which could be drawn from the accumulated literature
on team size.
In the end, you only need to remember it to apply it... and to steer away
from linear logic such as: “to go twice as fast, all you need is double the
people!“ Nothing could be more wrong!
According to these studies, a team exceeding 15 people should set
alarm bells ringing.[8][10]
[7] http://www.fastcompany.com/magazine/85/bezos_4.html
[8] https://speakerdeck.com/u/searls/p/the-mythical-team-month
[9] http://www.3circlepartners.com/news/team-size-matters
[10] http://37signals.com/svn/posts/995-if-youre-working-in-a-big-group-youre-fighting-human-nature
You then have two options:
		 Fight tooth and nail to prevent the team from growing and, if that
fails, adopt the second solution;
		 Split the team up into smaller teams. But think very carefully before
you do so, and bear in mind that a team is a group of people
motivated around a common goal. This is the subject of the
following chapter, “Feature Teams“.
Feature
Teams
Description
In the preceding chapter, we saw that Web Giants pay careful
attention to the size of their teams. That is not all they pay attention
to concerning teams however: they also often organize their teams
around functionalities, known as “feature teams“.
A small and versatile team is a key to moving swiftly, and most Web
Giants resist multiplying the number of teams devoted to a single
product as much as possible.
However, when a product is a hit, a dozen people no longer suffice for
the scale up. Even in such a case, team size must remain small to ensure
coherence, therefore it is the number of teams which must be increased.
This raises the question of how to delimit the perimeters of each.
There are two main options:[1]
		 Segmenting into “technological“ layers.
		 Segmenting according to “functionality thread“.
By “functionality thread“ we mean being in a position to deliver
independent functionalities from beginning to end, to provide a service
to the end user.
In contrast, one can also divide teams along technological layers, with one
team per type of technology: typically, the presentation layer, business
layer, horizontal foundations, database...
This is generally the organization structure adopted in Information
Departments, each group working within its own specialty.
However, whenever Time To Market becomes crucial, organization into
technological layers, also known as Component Teams, begins to show
its limitations. This is because Time To Market crunches often necessitate
Agile or Lean approaches. This means specification, development, and
production with the shortest possible cycles, if not on the fly.
[1] There are in truth other possible groupings, e.g. by release, geographic area, user segment
or product family. But that would be beyond the scope of the work here; some of the options
are dead ends, others can be assimilated to functionality thread divisions.
The trouble with Component Teams is you often find yourself with
bottlenecks.
Let us take the example laid out in Figure 1.
Figure 1: functionalities 1 to 5 cut across component teams organized by technological layer (Front, Back, Exchange and Base).
The red arrows indicate the first problem. The most important functionalities
(functionality 1) are swamping the Front team. The other teams are left
producing marginal elements for these functionalities. But nothing can be
released until Team 1 has finished. There is not much the other teams can
do to help (not sharing the same specialty as Team 1), so are left twiddling
their thumbs or stocking less important functionalities (and don’t forget
that in Lean, stocks are bad...).
There’s worse. Functionality 4 needs all four teams to work together.
The trouble is that, in Agile mode, each team individually carries out the
detailed analysis. Whereas here, what is needed is the detailed impact
analysis on the 4 teams. This means that the detailed analysis has to take
place upstream, which is precisely what Agile strives to avoid. Similarly,
downstream, the work of the 4 teams has to be synchronized for testing,
which means waiting for laggards. To limit the impact, task priorities have to
be defined for each team in a centralized manner. And little by little, you
find yourselves with a scheduling department striving to best synchronize
all the work but leaving no room for team autonomy.
In short, you have a waterfall effect upstream in analysis and planning and
a waterfall effect downstream in testing and deploying to production. This
type of dynamics is very well described in the work of Craig Larman and
Bas Vodde, Scaling Lean & Agile Development.
Feature teams can correct these errors: with each team working on a
coherent functional subset - and doing so without having to think about
the technology - they are capable of delivering value to the end client
at any moment, with little need to call on other teams. This entails
having all necessary skills for producing functionalities in each team,
which can mean (among others) an architect, an interface specialist, a
Web developer, a Java developer, a database expert, and, yes, even
someone to run it... because when taken to the extreme, you end up
with the DevOps “you build it, you run it“, as described in the next
chapter (cf. “DevOps“, p. 71).
But then how do you ensure the technological coherence of the product,
if each Java expert in each feature team takes the decisions within their
perimeter? This issue is addressed by the principle of community of
practice. Peers from each type of specialty get together at regular intervals
to exchange on their practices and to agree on technological strategies
for the product being produced.
Feature Teams have the added advantage that teams quickly progress
in the business, this in turn fosters implication of the developers in the
quality of the final product.
In practice, the method is of course messier than what we’ve laid out here:
defining perimeters is no easy task, team dynamics can be complicated,
communities of practice must be fostered... Despite the challenges, this
organization method brings true benefits as compared to hierarchical
structures, and is much more effective and agile.
To come back to our Web Giants, this is the type of organization they tend
to favor. Facebook in particular, which communicates a lot around the
culture, focuses on teams which bring together all the necessary talents to
create a functionality.[2]
[2] http://www.time.com/time/specials/packages/article/0,28804,2036683_2037109_2037111,00.html
It is also the type of structure that Viadeo, Yahoo! and Microsoft[3]
have
chosen to develop their products.
How can I make it work for me?
Web Giants are not alone in applying the principles of Feature Teams. It is
an approach also often adopted by software publishers.
Moreover, Agile is spreading throughout our Information Departments and
is starting to be applied to bigger and bigger projects. Once your project
reaches a certain size (3-4 teams), Feature Teams are the most effective
answer, to the point where some Information Departments naturally turn
to that type of pattern.[4]
[3] Michael A. Cusumano and Richard W. Selby. 1997. How Microsoft builds software.
Commun. ACM 40, 6 (June 1997), 53-61: http://doi.acm.org/10.1145/255656.255698
[4] http://blog.octo.com/compte-rendu-du-petit-dejeuner-organise-par-octo-et-strator-retour-dexperience-lagilite-a-grande-echelle (French only).
DevOps
Description
The “DevOps“ method is a call to rethink the divisions common in
our organizations, separating development on one hand, i.e. those
who write application codes (“Devs“) and operations on the other,
i.e. those who deploy and implement the applications (“Ops“).
Such thoughts are certainly as old as IT departments, but they have found
renewed life thanks notably to two groups. First there are the agilists, who
have minimized constraints on the development side and are now capable
of providing highly valued software to their clients on a much more frequent
basis. Then there are the infrastructure experts and “Prod“ managers at the
Web Giants (Amazon, Facebook, LinkedIn...), who have shared their
experience of how they have managed the Dev vs. Ops divide.
Beyond the intellectual beauty of the exercise, DevOps is mainly (if not
entirely) geared towards reducing the Time To Market (TTM). Obviously,
there are other positive effects, but when all is said and done, the main
priority is TTM (hardly surprising in the Web industry).
Dev & Ops: differing local concerns but a common goal
Organizational divides notwithstanding, the preoccupations of
Development and Operations are indeed distinct and equally laudable:
Figure 1: the “wall of confusion“ between Dev and Ops. Each side has its own local targets and culture: Dev seeks to innovate and deliver new functionalities (of quality), with a product culture (software); Ops seeks to rationalize and guarantee that applications run (stability), with a service culture (archiving, supervision, etc.).
Software development seeks heightened responsiveness (under
pressure notably from their industry and the market): they have to move
fast, add new functionalities, reorient work, refactor, upgrade frameworks,
test deployment across all environments... The very nature of software is
to be flexible and adaptable.
In contrast, Operations need stability and standardization.
Stability, because it is often difficult to anticipate what the impacts of
a given modification to the code, architecture or infrastructure will be.
Switching from a local disk to networked storage can impact response
times, and a change in the code can heavily impact CPU activity, leading
to difficulties in capacity planning.
Standardization, because Operations seek to ensure that certain rules
(equipment configuration, software versions, network security, log file
configuration...) are uniformly followed to ensure the quality of service of
the infrastructure.
And yet both groups, Devs and Ops, have a shared objective: to make
the system work for the client.
DevOps: capitalizing on Agility
Agility became a buzzword somewhat over ten years ago, its main
objective being to reduce constraints in development processes.
The Agile method introduced the notions of “short cycle“, “user
feedback“, “Product owner“, i.e. a person in charge of managing the
roadmap, setting priorities, etc.
Agility also shook up traditional management structures by including
cross-silo teams (developers and operators) and played havoc with
administrative departments.
Today, when those barriers are removed, software development is most
often carried out with one to two-week frequencies. Business sees the
software evolve during the construction phase.
It is now time to bring people from operations into the following phases:
		 Provisioning / spinning up environments: in most firms, deploying
to an environment can take between one and four months (even
though environments are now virtualized). This is surprisingly long,
especially when the challengers are Amazon or Google.
		 Deployment: this is without doubt the phase where problems come
to a head, as it creates the most instability; agile teams sometimes
limit themselves to one deployment per quarter to limit the impacts
on production. In order to guarantee system stability, these
deployments are often carried out manually, are therefore lengthy,
and can introduce errors. In short, they are risky.
		 Incident resolution and meeting non-functional needs: Production
is the other software user. Diagnosis must be fast, the problems and
resilience stakes must be explained, and robustness must be taken
into account.
DevOps is organized around 3 pillars: infrastructure as code
(IaC), continuous delivery, and a culture of cooperation
1. “Infrastructure as Code“ or how to reduce provisioning and environment
deployment delays
One of the most visible friction points is in the lack of collaboration
between Dev and Ops in deployment phases. Furthermore this is the
activity which consumes the most resources: half of production time is
thus taken up by deployment and incident management.
Figure 2.
Source: Study by Deepak Patil (Microsoft Global Foundation Services) in 2006, via a
presentation modified by James Hamilton (Amazon Web Services):
http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_POA20090226.pdf
[Figure 3 classifies the main tools by layer: VM instantiation / OS installation (OpenStack, VMWare vCloud, OpenNebula); System Configuration, i.e. deploying and configuring the services required for application execution such as JVMs and application servers (Chef, Puppet, CFEngine); Application Service Orchestration and command and control, i.e. deploying application code (war, php source, ruby...) and databases (Capistrano, custom shell or python scripts); with a CMDB that must reflect both the target configuration and the real-world system configuration.]
And although it is difficult to establish general rules, it is highly likely that
part of this cost (the 31% segment) could be reduced by automating
deployment.
There are many reliable tools available today to automate provisioning
and deployment to new environments, ranging from setting up Virtual
Machines to software deployment and system configuration.
Figure 3. Classification of the main tools (October 2012).
These tools (each in its own language) can be used to code infrastructure:
to install and deploy an HTTP service for server applications, to create
directories for the log files... The range of services and associated gains
are many:
		 Guaranteeing replicable and reliable processes (no user interaction,
thus removing a source of errors) namely through their capacity to
manage versions and rollback operations.
		 Productivity. One-click deployment rather than a set of manual
tasks, thus saving time.
		 Traceability to quickly understand and explain any failures.
		 Reducing Time To Recovery: In a worst case scenario, the
infrastructure can be recreated from scratch. In terms of recovery
this is highly useful. In keeping with ideas stemming from Recovery
Oriented Architecture, resilience can be addressed either by
attempting to prevent systems from failing by working on the MTBF
- Mean Time Between Failures, or by accelerating repairs by working
on the MTTR - Mean Time To Recovery. The second approach,
although not always possible to implement, is the least costly. It is
also useful in organizations where many environments are necessary.
In such organizations, the numerous environments are essentially
kept available and little used because configuration takes too long.
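As an illustration only (plain Python rather than actual Puppet, Chef or CFEngine syntax; the package name, paths and commands are hypothetical Debian examples), the “infrastructure as code“ idea boils down to describing the desired state of a machine in a script that is versioned and can be replayed, identically and idempotently, on any environment:

```python
# Minimal "infrastructure as code" sketch in plain Python (hypothetical example,
# not Puppet/Chef/CFEngine syntax). Describes a desired state and converges to it.
import subprocess
from pathlib import Path

DESIRED_PACKAGES = ["nginx"]  # hypothetical: the HTTP service for the application
APP_CONF = Path("/etc/nginx/conf.d/app.conf")
APP_CONF_CONTENT = """server {
    listen 80;
    location / { proxy_pass http://127.0.0.1:8080; }
}
"""

def package_installed(name: str) -> bool:
    """Idempotence: check whether the package is already there."""
    return subprocess.run(["dpkg", "-s", name], capture_output=True).returncode == 0

def ensure_packages() -> None:
    """Install only what is missing."""
    for name in DESIRED_PACKAGES:
        if not package_installed(name):
            subprocess.run(["apt-get", "install", "-y", name], check=True)

def ensure_config() -> None:
    """Rewrite the configuration only if it differs from the desired state."""
    if not APP_CONF.exists() or APP_CONF.read_text() != APP_CONF_CONTENT:
        APP_CONF.write_text(APP_CONF_CONTENT)
        subprocess.run(["systemctl", "reload", "nginx"], check=True)

if __name__ == "__main__":
    ensure_packages()
    ensure_config()
```

Because the script is idempotent, running it twice changes nothing the second time; because it lives in source control, the same environment can be recreated from scratch, which is exactly the MTTR argument made above.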
Automation is furthermore a way of initializing a change in collaboration
culture between Dev and Ops. This is because automation increases the
possibilities for self-service for Dev teams, at the very least over the ante-
production environments.
2. Continuous Delivery
Traditionally, in our organizations, the split between Dev and Ops comes
to a head during deployment phases, when development delivers or
shuffles off their code, which then continues on its long way through the
production process.
The following quote from Mary and Tom Poppendieck[1]
puts the problem
in a nutshell:
How long would it take your organization to deploy a
change
that involves just one single line of code?
The answer is of course not obvious, but in the end it is here that
differences in objectives diverge the most. Development seeks control
over part of the infrastructure, for rapid deployment, on demand, to all
environments. In contrast, production must see to making environments
available, rationalizing costs, allocating resources (bandwidth, CPU...)
[1] Mary and Tom Poppendieck, Implementing Lean Software Development: From Concept
to Cash, Addison-Wesley, 2006.
Also ironic is the fact that the less one deploys, the more the TTR (Time To
Repair) increases, therefore reducing the quality of service to the end client.
Figure 4: change size vs. change frequency - huge changesets deployed rarely lead to high TTR; tiny changesets deployed often lead to low TTR.
Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
In other words, the more changes there are between releases (i.e. the
higher the number of changes to the code), the lower the capacity to
rapidly fix bugs following deployment, thus increasing TTR - this is the
instability ever-dreaded by Ops.
Here again, addressing such waste can reduce the time taken up by
Incident Management as shown in Figure 2.
Figure 5: “Size of Deploy vs Incident TTR“ (Flickr data) - Sev 1 and Sev 2 TTR in minutes plotted against the units of changed code (lines changed per deploy).
Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
To finish, Figure 5, taken from a Flickr study, shows how TTR (and therefore
the seriousness of incidents) correlates with the amount of code deployed
(and therefore the number of changes to the code).
However, continuous deployment is not easy and requires:
		 Automation of the deployment and provisioning processes:
Infrastructure as Code.
		 Automation of the software construction and deployment processes.
Build automation becomes the construction chain which carries the
software from source management to the various environments where
it will be deployed. Thus a new build system is necessary, including
environment management, workflow management for more quickly
compiling source code into binary code, creating documentation and
release notes to swiftly understand and fix any failures, the capacity to
distribute testing across agents to reduce delays, and always
guaranteeing short cycle times.
		 Taking these factors into account at the architecture level and, above
all, respecting the following principle: decouple functionality deployment
from code deployment, using patterns such as Feature Flipping
(cf. Feature Flipping, p. 113) and dark launches (see the sketch after
this list). This of course entails a new level of complexity but offers the
necessary flexibility for this type of continuous deployment.
		 A culture of measurement with user-oriented metrics. This is not only
about measuring CPU consumption, it is also about correlating busi-
ness and application metrics to understand and anticipate system
behavior.
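The decoupling principle announced in the third point above (feature flipping / dark launch) can be illustrated in a few lines. This is only a sketch: the flag store here is a JSON file and the function names are invented; real systems typically use a configuration service or database so that flags can be flipped at runtime.

```python
# Feature flipping sketch: code is deployed, but the functionality is only
# activated when the flag is switched on. Flag store and names are illustrative.
import json
from pathlib import Path

FLAGS_FILE = Path("feature_flags.json")  # e.g. {"new_search": false}

def flag_enabled(name: str, default: bool = False) -> bool:
    """Read the flag at call time so it can be flipped without redeploying."""
    try:
        flags = json.loads(FLAGS_FILE.read_text())
    except FileNotFoundError:
        return default
    return bool(flags.get(name, default))

def legacy_search(query: str):
    return [f"legacy result for {query}"]

def new_search_engine(query: str):
    return [f"new result for {query}"]

def search(query: str):
    # The new engine ships with the release but stays dormant until activated.
    if flag_enabled("new_search"):
        return new_search_engine(query)
    return legacy_search(query)

if __name__ == "__main__":
    print(search("devops"))
```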
3. A culture of collaboration if not an organizational model
These two practices, Infrastructure as Code and Continuous Delivery, can
be implemented in traditional organizations (with Infrastructure as Code
at Ops and Continuous Delivery at Dev). However, once development and
production reach their local optimum and a good level of maturity, the
latter will always be hampered by the organizational division.
70
THE WEB GIANTS
This is where the third pillar comes into its own: a culture of collaboration,
nay cooperation, in which all teams become more independent rather than
throwing problems at each other across the production process. This can
mean, for example, giving Dev access to machine logs, providing them with
the previous day's production data so that they can set up the integration
environments themselves, or opening up the metrics and monitoring tools
(or even displaying the metrics in open spaces)... This brings that much more
flexibility to Dev and shares responsibility and information on “what happens
in Prod“, while relieving Ops of just so many low-added-value tasks.
The main cultural elements around DevOps could be summarized as
follows:
		 Sharing both technical metrics (response times, number of
backups...) as well as business metrics (changes in generated
profits...)
		 Ops is also the software client. This can mean making changes
to the software architecture and developments to more easily
integrate monitoring tools, to have relevant and useful log files,
to help diagnosis (and reduce the TTD, Time To Diagnose). To go
further, certain Ops needs should be expressed as user stories in the
backlog.
		 A lean approach [http://blog.octo.com/tag/lean/] and post-mortems
which focus on the root causes (the 5 whys) and on implementing
countermeasures (French only).
It remains however that in this model, the existing zones of responsibility
(especially development, software monitoring, datacenter operation and
support) are somewhat modified.
Traditional firms give the project team priority. In this model, deployment
processes, software monitoring and datacenter management are spread
out across several organizations.
Figure 6: Project teams
Inversely, some stakeholders (especially Amazon) have taken this model
very far by proposing multidisciplinary teams in charge of ensuring the
service functions - from the client’s perspective (cf. Feature Teams, p. 65).
You build it, you run it. In other words, each team is responsible for the
business, from Dev to Ops.
Figure 7: Product team – You build it, you run it.
[Figure 6, “Project Teams“: the software production flow runs from the business through project teams (build), application management, technical management and the service desk (run) to the users. Figure 7, “Product team - you build it, you run it“: product/service teams own both build and run, with production providing the service desk and infrastructure to the users. Source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified.]
Moreover it is within this type of organization that the notion of self-
service takes on a different and fundamental meaning. One then sees
one team managing the software and its use and another team in
charge of datacenters. The dividing line is farther “upstream“ than is
usual, which allows scaling up and ensuring a balance between agility and
cost rationalization (e.g. linked to the datacenter architecture). The AWS
Cloud is probably the result of this... It is something else altogether, but
imagine an organization with product teams and production teams who
would jointly offer services (in the sense of ITIL) such as AWS or Google
App Engine...
Conclusion
DevOps is thus nothing more than a set of practices to leverage
improvements around:
	 Tools to industrialize the infrastructure and reassure production
as to how the infrastructure is used by development. Self service
is a concept hardwired into the Cloud. Public Cloud offers are
mature on the subject but some offers (for example VMWare) aim
to reproduce the same methods internally. Without necessarily
reaching such levels of maturity however, one can imagine using
tools like Puppet, Chef or CFEngine...
	 Architecture which makes it possible to decouple deployment
cycles, to deploy code without deploying all functionalities… (cf.
Feature flipping, p. 113 and Continuous Deployment, p.105).
	 Organizational methods, leading to implementation of Amazon’s
“Pizza teams“ patterns (cf. Pizza Teams, p. 59) and You build it, you
run it.
	 Processes and methodologies to render all these exchanges more
fluid. How to deploy more often? How to limit risks when deploying
progressively? How to apply the “flow“ lessons from Kanban to
production? How to rethink the communication and coordination
mechanisms at work along the development/operations divide?
In sum, these four strands make it possible to reach the DevOps
goals: improve collaboration, trust and objective alignment between
development and operations, giving priority to addressing the stickiest
issues, summarized in Figure 8.
Figure 8: the three pillars (Infrastructure as Code, Continuous Delivery and a culture of collaboration) lead to faster provisioning, increased deployment reliability, faster incident resolution (MTTR), operational efficiency, improved quality of service and continuous improvement, and ultimately to an improved TTM.
Sources
• White paper on the DevOps Revolution:
 http://www.cutter.com/offers/devopsrevolution.html
• Wikipedia article:
 http://en.wikipedia.org/wiki/DevOps
• Flickr presentation at the Velocity 2009 conference:
 http://velocityconference.blip.tv/file/2284377/
• Definition of DevOps by Damon Edwards:
 http://dev2ops.org/blog/2010/2/22/what-is-devops.html
• Article by John Allspaw on DevOps:
 http://www.kitchensoap.com/2009/12/12/devops-cooperation-doesnt-just-happen-with-deployment/
• Article on the share of deployment activities in Operations:
 http://dev2ops.org/blog/2010/4/7/why-so-many-devopsconversations-focus-on-deployment.html
• USI 2009 (French only):
 http://www.usievents.com/fr/conferences/4-usi-2009/sessions/797-quelques-idees-issues-des-grands-du-web-pour-remettre-en-cause-vos-reflexes-d-architectes#webcast_autoplay
Practices
Lean Startup.................................................................................... 87
Minimum Viable Product.................................................................. 95
Continuous Deployment................................................................ 105
Feature Flipping............................................................................. 113
Test A/B......................................................................................... 123
Design Thinking............................................................................. 129
Device Agnostic............................................................................. 143
Perpetual beta............................................................................... 151
Lean
Startup
Description
Creating a product is a very perilous undertaking. Figures show that 95%
of all products and startups perish from want of clients. Lean Startup is an
approach to product creation designed to reduce risks and the impact of
failures by, in parallel, tackling organizational, business and technical aspects,
and through aggressive iterations. It was formalized by Eric Ries and was
strongly inspired by Steve Blank’s Customer Development methodology.
Build – Measure – Learn
All products and functionalities start with a hypothesis. The hypothesis can
stem from data collection on the ground or a simple intuition. Whatever
the underlying reason, the Lean Startup approach aims to:
		 Consider all ideas as hypotheses, it doesn’t matter whether they
concern marketing or functionalities,
		 validate all hypotheses as quickly as possible on the ground.
This last point is at the core of the Lean Startup approach. Each hypothesis
- whether it concerns business, systems administration or development -
must be validated, qualitatively as well as through metrics. Such an
approach makes it possible to implement a learning loop for both the
product and the client.
refuses the approach which consists of developing a product for over a
year only to discover that the choices made (in marketing, functionalities,
sales) threaten the entire organization. Testing is of the essence.
Figure 1: the Build-Measure-Learn loop - ideas feed Build, which produces the Product; Measure produces Data, from which the team Learns and generates new ideas.
Experiment to validate
Part of the approach is based on the notion of Minimum Viable Product
(MVP) (cf. “Minimum Viable Product“, p. 95). At what minimum can I
validate my hypotheses?
We’re not necessarily speaking here of code and products in their
technical senses, but rather of any effort that leads to progress on a
hypothesis. Anything can be used to test market appetite: a Google Docs
questionnaire, a mailing list or a fake functionality. Experimentation,
with its attendant lessons, is an invaluable asset in piloting a product and
justifies implementing a learning loop.
The measurement obsession
Obviously experiments must be systematically monitored through full
and reliable metrics (cf. “The obsession with performance measurement“,
p. 13).
A client-centered approach – Go out of the building
Checking metrics and validating quality very often means
“leaving the building“, as Bob Dorf puts it, co-author of the
famous “4 Steps to the Epiphany“.
“Go out of the building“ (GOOB) is at the heart of the preoccupations of
Product Managers who practice Lean Startup. Until a hypothesis has been
confronted with reality, it remains a supposition. And therefore presents
risks for the organization.
“No plan survives first contact with customers“ (Steve Blank) is thus one of
the mottoes of Product teams:
		 Build only the minimum necessary for validating a hypothesis.
		 GOOB (from face-to-face interviews to continuous deployment).
		 Learn.
		 Build, etc.
This approach also allows constant contact with the client, in other words,
constant validation of business hypotheses. Zappos, a giant in online shoe
sales in the US, is an example of MVP being put into users’ hands at a very
early stage. To confront reality and validate that users would be willing to
buy shoes online, the future CEO of Zappos took snapshots of the shoes
in local stores, thereby creating the inventory for an e-commerce site from
scratch. In doing so, and without building cathedrals, he quickly validated
that demand was there and that producing the product would be viable.
Piloting with data
Naturally, to grasp user behavior during GOOB sessions, Product
Managers meticulously gather data which will help them make the right
decision. They also set up tools and processes to collect such data.
The most widely used are well known to all: interviews and analytics
solutions.
The Lean Startup method makes ferocious use of these indicators to truly
pilot the product strategy. On ChooseYourBoss.com,[1]
we postulated that users would choose LinkedIn or Viadeo to connect,
both to spare users from setting up yet another account and to save us
the trouble of developing a login system. We therefore built the minimum
needed to validate or invalidate this hypothesis: three sign-up options,
LinkedIn, Viadeo, or opening a ChooseYourBoss account. The first two
actually worked, while the third simply indicated that the ChooseYourBoss
account was not yet available. Result: users not wishing to use these
networks to sign in represented 11% of visitors to our site. We will
therefore abstain for the time being from implementing accounts outside
of social networks. We went from “informed by data“ to “piloted by data“.
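Purely as an illustration (this is not the ChooseYourBoss code; the loop below simply replays proportions similar to the 11% result quoted above), “piloting with data“ can start with something as small as counting which sign-up option visitors actually choose:

```python
# Illustrative event counting - in practice an analytics tool records these events.
from collections import Counter

signup_clicks = Counter()

def record_signup_choice(option: str) -> None:
    """Called whenever a visitor clicks one of the sign-up options."""
    signup_clicks[option] += 1

def share(option: str) -> float:
    total = sum(signup_clicks.values())
    return signup_clicks[option] / total if total else 0.0

# Replay of invented traffic, roughly matching the proportions quoted above.
for choice in ["linkedin"] * 52 + ["viadeo"] * 37 + ["own_account"] * 11:
    record_signup_choice(choice)

print(f"Visitors wanting a standalone account: {share('own_account'):.0%}")
```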
Who makes it work for them?
IMVU, Dropbox, Heroku, Votizen and Zappos are a few examples of Web
products that managed to integrate user feedback at a very early stage in
product design. Dropbox for example completely overhauled its way of
doing things by drastically simplifying management of synchronized files.
Heroku went from a development platform in the Cloud to a Cloud server
solution. Examples abound, each more ingenious than the previous one.
[1] A site for connecting candidates and recruiters.
What about me?
Lean Startup is not a dogma. Above all it is about realizing that what the
market and the clients want is not to be found in architecture, marketing
plans, sales forecasts and key functionalities.
Once you’ve come to that realization, you will start seeing hypotheses
everywhere. It all consists in setting up processes for validating hypotheses,
without losing sight of the principle of validating minimum functionalities
at any given instant t.
Before writing any code, the main questions to ask revolve around the
triad Client / Problem / Solution:
		 Do I really have a problem that deserves to be resolved?
		 Is my solution the right one for my client?
		 Will my client buy it? How much?
Use whatever you can to check your hypotheses: interviews, market
studies, prototypes...
The next step is to know whether the model you are testing on a small
scale is replicable and expandable.
How can you get clients to acquire a product they’ve never heard of?
Will they be in a position to understand, use, and profit from your product?
The third and fourth steps revolve around growth: how do you attract
clients and how do you build a company capable of taking on your product
and moving it forward?
Contrary to what one might think after reading this chapter, Lean Startup
is not an approach reserved for mainstream websites. Innovation
through validating hypotheses as quickly as possible and limiting financial
investment is obviously logic which can be transposed to any type of
information systems project, even in-house. We are convinced that this
approach deserves wider deployment to avoid Titanic-type projects which
can swallow colossal sums despite providing very little value for users.
For more information, you can also consult the sessions on Lean Startup at
USI which present the first two stages (www.usievents.com).
Sources
• Running Lean – Ash Maurya
• 4 Steps to the Epiphany – Steve Blank & Bob Dorf:
 http://www.stevenblank.com/books.html
• Blog of the Startup Genome Project:
 http://blog.startupcompass.co/
• The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses – Eric Ries:
 http://www.amazon.com/The-Lean-Startup-Entrepreneurs-Continuous/dp/0307887898
• The Startup Owner’s Manual – Steve Blank & Bob Dorf:
 http://www.amazon.com/The-Startup-Owners-Manual-Step-By-Step/dp/0984999302
Minimum
Viable Product
Description
A Minimum Viable Product (MVP) is a strategy for product development.
Eric Ries, the creator of Lean Startup, who strongly contributed to the
elaboration of this approach, gives the following definition:
The minimum viable product is that version of a new
product which allows a team to collect the maximum
amount of validated learning about customers with the least
effort.[1]
In sum, it is a way to quickly develop a minimal product prototype to
establish whether the need for it is there, to identify possible markets, and
to validate business hypotheses on e.g. income generation.
The interest of the approach is obvious: to more quickly design a product
that truly meets market needs, by keeping costs down in two ways:
		 by reducing TTM:[2]
faster means less human effort, therefore less
outlay - all else being equal,
		 and by reducing the functional perimeter: less effort spent on
functionalities which have not yet proven their worth to the end user.
In the case of startups, funds usually run low. It is therefore best to test
your business plan hypotheses as rapidly as possible - and this is where a
MVP shows its worth.
The advantages are well illustrated by Eric Ries’s experience at IMVU.
com, an online chatting and 3D avatar website: it took them only six
months to create their first MVP, whereas in a previous startup experience
it took them almost five years to release their first product - which was
questionably viable!
[1] http://www.startuplessonslearned.com/2009/08/minimum-viable-product-guide.html
[2] Time To Market
Today, 6 months is considered a relatively long delay, and MVPs are often
deployed in less.
This is because designing an MVP does not necessarily mean producing
code or a sophisticated website, quite the contrary. The goal is to get a
feel for the market very early on in the project so as to validate your plans
for developing your product or service. This is what is known as a
Fail Fast approach.
MVPs allow you to quickly validate your client needs hypotheses and
therefore to reorient your product or service accordingly, very early on
in your design process. This is known as a “pivot“ in the Lean Startup
jargon. Or, if your hypotheses are validated by the MVP run, you must then
move on to the next step: implementing the functionality you simulated,
creating a proper web site, or simply a marketing page.
An MVP is not only useful for launching a new product: the principle
is perfectly applicable for adding new functionalities to a product that
already exists. The approach can also be more direct: for example you
can ask for user feedback on what functionalities people would like (see
Figure 1), at the same time gathering information on how they use your
product.
MVPs are particularly relevant when you have no or little knowledge
of your market and clients, nor any well defined product vision.
Implementation
An MVP can be extremely simple. For example, Babak Nivi states that
“The Minimum Viable Product (MVP) is often an ad on Google. Or a
PowerPoint slide. Or a dialog box. Or a landing page. You can often build it in a
day or a week.“[3]
The most minimalist approach is called a Smoke Test,
in reference to electronic component testing to check that a component
functions properly before moving on to the next stage
[3] http://venturehacks.com/articles/minimum-viable-product
of testing (stress tests, etc.) and the fact that in case of failure there is often
a great deal of smoke!
The most minimal form of a Smoke Test consists of an advertisement in a
major search engine for example, promoting the qualities of the product
you hope to develop. Clicking on the ad will send the person to a generally
static web page with minimal information, but e.g. suggesting links, the
goal being to gather click information, indicative of how interested the
client is in the proposed service, and willingness to buy it. That is to say
that the functionalities laid out in the links do not have to be operational at
this stage! The strict minimum is the ad, as this is the first step in gathering
information.
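As a sketch of how little is needed (assuming the Flask micro-framework is installed; the copy, routes and log file are placeholders, and the advertised feature does not exist), a smoke-test landing page can boil down to a static pitch plus a route that only records the visitor's interest:

```python
# Smoke-test landing page sketch - the product behind the link does not exist yet.
import logging
from flask import Flask

logging.basicConfig(filename="smoke_test.log", level=logging.INFO)
app = Flask(__name__)

@app.route("/")
def landing():
    # Static pitch page reached from the search-engine ad.
    return ("<h1>Ship your code in one click!</h1>"
            '<p><a href="/interested">Sign me up for the beta</a></p>')

@app.route("/interested")
def interested():
    # We only record the intent: this click is the data we came for.
    logging.info("visitor expressed interest")
    return "<p>Thanks! We will let you know when the beta opens.</p>"

if __name__ == "__main__":
    app.run(port=8000)
```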
An early version of the website theleanstartup.com, which applies the
principles it preaches (the EYODF pattern),[4]
proposed, at the very
bottom of its home page (the MVP of theleanstartup.com), a very simple
dialog box for collecting user needs. There were only two fields to be filled
in: e-mail address and suggestion for a new functionality, as well as the
invitation: What would you like to see on future versions of this website?
Figure 1. Form for collecting user information
on the website theleanstartup.com once the fields are filled in.
In terms of tooling, services such as Google Analytics, Xiti, etc. which
track all user actions and browsing characteristics on a given website, are
indispensable allies. For example, in the case of a new website functionality
to be implemented, it is very simple to add a new tab, menu option,
advertisement, and to track user actions with this type of tool.
[The form shown in Figure 1 reads “THIS IS OUR MINIMUM VIABLE PRODUCT - What do you want to see in future releases?“, with an e-mail field, a Send button and the confirmation message “Success! We’ve received your feedback.“]
[4] Eat Your Own Dog Food, i.e. be the consumer of your own services.
Risks...
Beware, the MVP can generate ambiguous results, including false
negatives. In fact, if an MVP is not sufficiently well thought-out, or is badly
presented, it can trigger a negative reaction on the targeted clients’ part.
It can seem to indicate that the planned product isn’t viable whereas in
fact it is only a question of iterating to perfect the process to better meet
client needs. The point is to not stop at the first whiff of failure: a single
step is all it can take to go from non-viable to viable, i.e. to the MVP itself.
Henry Ford put it very aptly:
“If I had asked people what they wanted, they would have said faster
horses.“ Having a product vision can be more than just an option.
Who makes it work for them?
Once again we will mention IMVU (see above), one of the pioneers of
Lean Startup where Eric Ries  Co. tested the MVP concept, more
particularly in the field of 3D avatar design. Their website, imvu.com is
an online social media for 3D avatars, chat rooms, gaming, and has the
world’s largest catalog of virtual goods, most of which are created by the
users themselves.
Let us also return to the example of Dropbox, an online file storage service
which has seen its growth skyrocket, all based on an MVP which was a
fake demonstration video: the product didn’t yet exist. Following
the posting of the video, a tidal wave of subscribers brought the beta
list sign-ups from 5,000 to 75,000 people in one night, confirming that
Dropbox’s product vision was indeed solid.
How can I make it work for me?
With the prevalence of e-commerce and social media, the web is now
at the heart of economic development strategies for businesses. The
MVP strategy can be activated as is for a wide range of projects, whether
stemming from the IT department or Marketing, but don’t forget that it
can also be applied outside the web.
It can even be applied to purely personal projects. In his reference work
Running Lean, Ash Maurya gives the example of applying an MVP (and
Lean Startup) to the publication of that self-same book.
Auditing Information Systems is a major part of our work at OCTO
and we are often faced with innovation projects (community platforms,
e-services, online shopping...) that encounter difficulties in the production
process, say every six months, and where the release, delayed by one or
two years, is often a flop, because the value delivered to users does not
correspond to market demand... In the interval, millions of euros will have
been swallowed up, for a project that will finally end up in the waste bin
of the web.
An MVP type approach reduces such risks and associated costs. On the
web, delays of that length to release a product cannot be sustained, and
competition is not only ferocious but also swift!
Within a business information system, it is hard to see how one could carry
out Smoke Tests with advertisements. And yet there too one often finds
applications and functionalities which took months to develop, without
necessarily being adopted by users in the end... The virtue of Lean Startup
and the MVP approach is to center attention on the value added for users,
and to better understand their true needs.
In such cases, an MVP can serve to prioritize, with end users, the functionalities to be developed in future versions of your application.
Sources
• Eric Ries, Minimum Viable Product: a guide, Lessons Learned, 3 August, 2009:
 http://www.startuplessonslearned.com/2009/08/minimum-viable-product-guide.html
• Eric Ries, Minimum Viable Product, Startup Lessons Learned conference:
 http://www.slideshare.net/startuplessonslearned/minimum-viable-product
• Eric Ries, Venture Hacks interview: “What is the minimum viable product?“:
 http://www.startuplessonslearned.com/2009/03/minimum-viable-product.html
• Eric Ries, How DropBox Started As A Minimal Viable Product, 19 October, 2011:
 http://techcrunch.com/2011/10/19/dropbox-minimal-viable-product
• Wikipedia, Minimum viable product:
 http://en.wikipedia.org/wiki/Minimum_viable_product
• Timothy Fitz, Continuous Deployment at IMVU: Doing the impossible fifty times a day, 10 February, 2009:
 http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day
• Benoît Guillou, Vincent Coste, Lean Start-up, 29 June, 2011, Université du S.I. 2011, Paris:
 http://www.universite-du-si.com/fr/conferences/8-paris-usi-2011/sessions/1012-lean-start-up (French only)
• Nivi Babak, What is the minimum viable product?, 23 March, 2009:
 http://venturehacks.com/articles/minimum-viable-product
• Geoffrey A. Moore, Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers, 1991 (revised 1996), HarperBusiness, ISBN 0066620022
• Silicon Valley Product Group (SVPG), Minimum Viable Product, 24 August, 2011:
 http://www.svpg.com/minimum-viable-product
• Thomas Lissajoux, Mathieu Gandin, Fast and Furious Enough, Définissez et testez rapidement votre premier MVP en utilisant des pratiques issues de Lean Startup, Conference Paris Web, 15 October, 2011:
 http://www.slideshare.net/Mgandin/lean-startup03-slideshare (French only)
• Ash Maurya, Running Lean:
 http://www.runningleanhq.com/
Continuous Deployment
Description
In the chapter “Perpetual beta“, p. 151, we will see that Web Giants
improve their products continuously. How do they manage to deliver
improvements so frequently while in some IT departments the least
change can take several weeks to be deployed in production?
In most cases, they have implemented a continuous deployment process,
which can be done in two ways:
	 Either entirely automatically: modifications to the code are automatically tested and, if validated, deployed to production.
	 Or semi-automatically: at any time one can deploy the latest stable code to production in one go. This is known as “one-click deployment“.
Obviously, setting up this pattern entails a certain number of prerequisites.
Why deploy continuously?
The primary motivation behind continuous deployment is to shorten
the Time To Market, but it is also a means to test hypotheses, to validate
them and, ultimately, to improve the product.
Let us imagine a team which deploys to production on the 1st of every
month (which is already a lot for many IT departments):
	 I have an idea on the 1st.
	 With a little luck, the developers will be able to implement it in the
remaining 30 days.
	 As planned, it is deployed to production in the monthly release plan
on the 1st of the following month.
	 Data are collected over the next month and indicate that the basic
idea needs improvement.
	 But it will be a month before the new improvement can be
implemented, which is to say it takes three months to reach a
stabilized functionality.
In this example, it is not development that is slowing things down but in
fact the delivery process and the release plan.
Thus continuous deployment shortens the Time To Market but is also a
way to accelerate product-improvement cycles.
This improves the famous Lean Startup cycle (cf. “Lean Startup“, p. 87):
Figure 1
A few definitions
Many people use “Continuous Delivery“ and “Continuous Deployment“
interchangeably. To avoid any errors in interpretation, here is our definition:
With each commit (or time interval), the code is:
	 Compiled, tested, deployed to an integration environment = Continuous Integration.
	 Compiled, tested, delivered to the next team (Tests, Qualification, Production, Ops) = Continuous Delivery.
	 Compiled, tested, deployed to production = Continuous Deployment.
[Figure 1: the Lean Startup loop - Ideas, Code and Data, linked by “code fast“, “measure fast“ and “learn fast“.]
The point here is not to say that Continuous Delivery and Continuous
Integration are a waste of time. Quite the contrary, they are essential steps:
Continuous Deployment is simply the natural extension of Continuous
Delivery, itself the natural extension of Continuous Integration.
What about quality?
One frequent objection to Continuous Deployment is the lack of quality
and the fear of delivering an imperfect product, of delivering bugs.
Just as with Continuous Integration, Continuous Deployment is only
fully useful if you are in a position to be sure of your code at all times.
This entails a full array of tests (unit, integration, performance, etc.).
Beyond the indispensable unit tests, there is a wide range of automated
tests such as:
		 Integration tests (Fitnesse, Greenpepper, etc.)
		 GUI tests (Selenium, etc.)
		 Performance tests (Gatling, OpenSTA, etc.)
Test automation can seem costly, but when the goal is to execute them
several times a day (IMVU launches 1 million tests per day), return on
investment grows rapidly. Some, such as Etsy, do not hesitate to create
and share tools to best meet their testing and automation needs.[1]
Furthermore, when you deploy every day, the size of the deployments is
obviously much smaller than when you deploy once a month. In addition,
the smaller the deployment, the shorter the Time To Repair, as can be
seen in Figure 2.
[1] https://github.com/etsy/deployinator
Figure 2 (modified): huge changesets deployed rarely mean a high Time To Repair (TTR), tiny changesets deployed often a low TTR. Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
Etsy well illustrates the trust one can have in code and in the possibility
of repairing any errors quickly. This is because they don’t bother with
planning for rollbacks: “We don’t roll back code, we fix it“. According to
one of their employees, the longest time span it has taken them to fix a
critical bug was four minutes.
Big changes lead to big problems, little changes lead to little problems.
Who does things this way?
Many of the Web Giants have successfully implemented Continuous Deployment; here are a few of the most representative numbers:
		 Facebook, very aggressive on test automation, deploys twice a day.
		 Flickr makes massive use of Feature Flipping (cf. “Feature Flipping“,
p. 113) to avoid development branches and deploys over ten times
daily. A page displays the details of the last deployment: http://code.flickr.com
	 Etsy (an e-commerce company) has invested hugely in automated tests and deployment tooling, and deploys more than 25 times a day.
		 IMVU (an online gaming and 3D avatar site), performs over a million
tests a day and deploys approximately 50 times.
What about me?
Start by estimating (or even better, by measuring!) the time it takes you
and your team to deliver a simple line of code through to production,
respecting the standard process, of course.
Setting up Continuous Deployment
Creating a “Development Build“ is the first step towards Continuous
Deployment.
To move on, you have to ensure that the tests you run cover most of the
software. While some don’t hesitate to code their own test tools (Netflix initiated the “Chaos Monkey“ project, which shuts down servers at random to test resilience), there are also ready-made frameworks available, such as JUnit, Gatling and Selenium. To reduce testing time, IMVU distributes its tests
over no fewer than 30 machines. Others use Cloud services such as AWS
to instantiate test environments on the fly and carry out parallel testing.
Once the development build produces sufficiently tested artifacts, it
can be expanded to deliver the artifacts to the teams who will deploy
the software across the various environments. At this stage, you are
already in Continuous Delivery.
The last team can now enrich the build to include deployment tasks.
This obviously entails automating various tasks, such as configuring the
environments, deploying the artifacts which constitute the application,
migrating the database schemas and much more. Be very careful with
your deployment scripts! It is code and, like all code, must meet quality
standards (use of a SCM, testing, etc.).
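To make this concrete, here is a minimal sketch of what such a scripted, “one-click“ deployment chain could look like, written as a Rake task in Ruby. The task names, the app-ctl command and the example.com hosts are purely illustrative assumptions, not the tooling of any particular Web Giant:

# Rakefile - a minimal "one-click deployment" sketch (illustrative helpers only)
APP_VERSION = ENV.fetch("APP_VERSION") { Time.now.strftime("%Y%m%d%H%M%S") }

task :test do
  sh "bundle exec rspec"                  # the build is only promoted if the whole suite passes
end

task :package => :test do
  # produce a versioned, immutable artifact (here a simple tarball)
  sh "tar czf build/app-#{APP_VERSION}.tar.gz --exclude=build ."
end

task :deploy, [:environment] => :package do |_t, args|
  env = args[:environment] || "staging"
  # deployment scripts are code too: versioned, reviewed and tested like the rest
  sh "scp build/app-#{APP_VERSION}.tar.gz deploy@#{env}.example.com:/var/apps/"
  sh "ssh deploy@#{env}.example.com 'app-ctl switch-to #{APP_VERSION}'"   # app-ctl is hypothetical
end

A single rake deploy[production] then promotes the latest tested artifact, which is the essence of one-click deployment.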
Forcing Continuous Deployment
A more radical but highly interesting solution is to force the rhythm of
release, making it weekly for example, to stir up change.
Associated patterns
When you implement Continuous Deployment this is necessarily accompanied
by several patterns, including:
		 Zero Downtime Deployment, because while an hour of system
shut-down isn’t a problem if you release once a month, it can
become one if you release every week or every day.
		 Feature Flipping (see the next chapter, “Feature Flipping“),
because regular releases unavoidably entail delivering unfinished
functionalities or errors, you must therefore have a way of deactivating
problematic functionalities instantaneously or upstream.
		 DevOps obviously, because Continuous Deployment is one of its
pillars (cf. “DevOps“, p. 71).
Sources
• Chuck Rossi, Ship early and ship twice as often, 3 August, 2012:
 https://www.facebook.com/notes/facebook-engineering/ship-early-and-ship-twice-as-often/10150985860363920
• Ross Harmess, Flipping out, Flickr Developer Blog, 2 December, 2009:
 http://code.flickr.com/blog/2009/12/02/flipping-out
• Chad Dickerson, How does Etsy manage development and operations?, 4 February, 2011:
 http://codeascraft.etsy.com/2011/02/04/how-does-etsy-manage-development-and-operations
• Timothy Fitz, Continuous Deployment at IMVU: Doing the impossible fifty times a day, 10 February, 2009:
 http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day
• Jez Humble, Four Principles of Low-Risk Software Releases, 16 February, 2012:
 http://www.informit.com/articles/article.aspx?p=1833567
• Fred Wilson, Continuous Deployment, 12 February, 2011:
 http://www.avc.com/a_vc/2011/02/continuous-de
Feature Flipping
Description
The “Feature Flipping“ pattern allows you to activate or deactivate
functionalities directly in production, without having to release new
code.
Several terms are used by Web Giants: Flickr and Etsy use “feature
flags“, Facebook “gatekeepers“, Forrst “feature buckets“, Lyris Inc.
“feature bits“, while Martin Fowler opted for “feature toggles“.
In short, everyone names and implements the pattern in their own way,
and yet all of these techniques strive to reach the same goal. In this chapter we will use the term “feature flipping“. Successfully implemented in our enterprise app store Appaloosa,[1] this technique has brought many advantages with just a few drawbacks.
[1] cf. appaloosa-store.com
Implementation
It is a very simple mechanism, you simply have to condition execution of
the code for a given functionality in the following way:
if Feature.is_enabled('new_feature')
	# do something new
else
	# do same as before
end
The implementation of the function “is_enabled“ will e.g. query a
configuration file or database to know whether the functionality is
activated or not.
You then need an administration console to configure the state of the
various flags on the different environments.
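To make this more concrete, here is a minimal sketch of what such a Feature module could look like, backed by a shared key-value store. The Redis backing and the enable/disable method names are assumptions made for the illustration; they are not the actual Appaloosa implementation:

# feature.rb - a minimal feature flipping sketch (illustrative, not production code)
require "redis"

module Feature
  STORE = Redis.new                    # flags shared by all application servers

  def self.is_enabled(name)
    STORE.get("feature:#{name}") == "on"
  end

  # the administration console simply calls these two methods
  def self.enable(name)
    STORE.set("feature:#{name}", "on")
  end

  def self.disable(name)
    STORE.set("feature:#{name}", "off")
  end
end

The administration console is then nothing more than a page listing the known flags, per environment, with a button calling enable or disable for each.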
Continuous deployment
One of the first advantages in being able to hot-switch functionalities
on or off is to be able to continuously deliver the application being
produced. Indeed, one of the first problems faced by organizations implementing continuous delivery is: how can one commit regularly to the source repository while guaranteeing
application stability and constant production readiness? In the case of
functionality developments which cannot be finished in less than a day,
only committing the functionality once it’s done (after a few days) is
contrary to development best practices in continuous integration.
The truth is that the farther apart your commits, the more complicated and risky the merges become, with only limited possibilities for cross-cutting refactoring. Given these constraints, there are two choices: “feature branching“ or “feature flipping“. In other words, creating a branch via the configuration management tool or in the code. Each has its fervent partisans; you can find some of the heated debates at: http://jamesmckay.net/2011/07/why-does-martin-fowler-not-understand-feature-branches
Feature Flipping makes it possible for developers to code inside their
“ifs“, and to thus commit unfinished, non-functional code, as long as the
code compiles and the tests are passed. Other developers can obtain
the modifications without difficulty as long as they do not activate the
functionalities being developed. Thus the code can be deployed to
production since, again, the functionality will not be activated. That is
where the interest lies: deployment of code to production no longer
depends on completing all the functionalities under development.
Once the functionality is finished, it can be activated by simply changing
the status of the flag on the administration console.
This has an added benefit in that the functionality can be activated to
coincide e.g. with an advertising campaign; it is a way of avoiding mishaps
on the day of the release.
Mastering deployment
One of the major gains brought by this pattern is that you are in
control of deployment, because it allows you to activate a functionality
with a simple click, and to deactivate it just as easily, thus avoiding drawn-
out and problem-prone rollback processes to bring the system back to its
N-1 release.
Thus you can very quickly cancel the activation of a functionality if
production tests are inconclusive or user feedback is negative.
Unfortunately, things are not quite that simple: you must be very careful
with your data and ensure that the model will work with or without the
functionality being activated (see the paragraph “Limits and constraints & major modifications“ below).
Experiment to improve the product
A natural off-shoot of feature flipping is that it enables you to activate or
deactivate functionalities for specific sub-populations. You can thus test a
functionality on a user group and, depending on their response, activate it
for all users or scrap it. In which case the code will look something like this:
if Feature.is_enabled_for('new_feature', current_user)
	# do something new
else
	# do same as before
end
You can then use the mechanism to test a functionality’s performance by
modifying one variable in its implementation for several sub-populations.
Result metrics will help you determine which implementation performs
best. In other words, feature flipping is an ideal tool for carrying out
A/B testing (cf. “A/B Testing“, p. 123).
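A common way to implement is_enabled_for is deterministic bucketing: the user's identifier is hashed so that a given user always falls in the same group, which also guarantees a consistent experience across visits. Here is a minimal sketch; the percentage-based rollout table and its values are illustrative assumptions:

# deterministic per-user activation - an illustrative sketch
require "digest"

module Feature
  ROLLOUT = { "new_checkout" => 10, "new_homepage" => 50 }   # % of users exposed

  def self.is_enabled_for(name, user)
    percentage = ROLLOUT.fetch(name, 0)
    # hash the (feature, user) pair so a given user always lands in the same bucket
    bucket = Digest::MD5.hexdigest("#{name}:#{user.id}").to_i(16) % 100
    bucket < percentage
  end
end

Because the assignment depends only on the feature name and the user id, a returning user keeps seeing the same variant, and the exposed population can be widened by simply raising the percentage.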
Provide custom-made products
In some cases, it can be interesting to let clients choose for themselves. Let us take the example of attachments in Gmail: by default, the
interface proposes a number of advanced functionalities (drag and drop,
multiple uploads) which can be deactivated by the user with a simple click
in case of dysfunction.
Conversely, you can offer users an “enhanced“ mode: Gmail’s “Labs“ are a telling example of feature flipping implementation.
To do so, all you have to do is to propose an interface where users can
control the activation/deactivation of certain functionalities (self service).
Managing billable functionalities
Activating billable functionalities with various levels of service can be complicated to implement, and entails conditional code of the following type:
if current_user.current_plan == 'enterprise' ||
   current_user.current_plan == 'advanced'
	# grant access to the premium functionality
end
Exceptions then quickly pile up:
	 Let us say that some “special“ firms are paying for the basic plan but you want to give them access to all functionalities.
	 A given functionality was included in the “advanced“ plan two months before, but marketing has decided that it should only be included in the “enterprise“ plan... except for those who subscribed more than two months earlier.
You can use feature flipping to avoid having to manage such exceptions in the code. You just need to condition activation of the features when a client subscribes: when users subscribe to the enterprise plan, the functionalities X, Y and Z are activated. You can then very easily manage exceptions in the administration interface.
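A minimal sketch of this idea, where the features granted are recorded per account at subscription time; the PLAN_FEATURES mapping, the enable_for helper and the user object are illustrative assumptions, not an actual billing implementation:

# subscription-time activation - an illustrative sketch
PLAN_FEATURES = {
  "basic"      => ["feature_x"],
  "advanced"   => ["feature_x", "feature_y"],
  "enterprise" => ["feature_x", "feature_y", "feature_z"]
}

def subscribe(user, plan)
  user.plan = plan
  # copy the plan's features onto the account: later marketing decisions
  # no longer affect existing subscribers
  PLAN_FEATURES.fetch(plan, []).each { |f| Feature.enable_for(f, user) }
end

# application code then only checks the per-user flag:
#   if Feature.is_enabled_for('feature_y', current_user) ...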
Graceful degradation
Some functionalities are more crucial to business than others. When
scaling up it is a good idea to favor certain functionalities over others.
Unfortunately, it is difficult to ask your software or server to give priority to
anything to do with billing over displaying synthesis graphs... unless the
graph display functionality is feature flipped.
We have already mentioned the importance of metrics (cf. “The obsession
with performance measurement“, p. 13). Once your metrics are set up, it
becomes trivial to flip functions accordingly. For example: “If the average
response time for displaying the graph exceeds 10 seconds over a period
of 3 minutes, then deactivate the feature“.
This allows you to progressively degrade website features in order
to maintain a satisfying experience for the users of the core business
functionalities. This is akin to the “circuit breaker“ pattern (described in the book “Release It!“ by Michael Nygard) which makes it possible to short-circuit a functionality if an external service is down.
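A minimal sketch of such a metric-driven flip, assuming a Metrics.average helper that exposes your monitoring data (both the helper and the thresholds are illustrative assumptions):

# automatic graceful degradation - an illustrative sketch
DEGRADABLE_FEATURES = {
  # feature name          => [metric watched,            threshold in seconds]
  "synthesis_graphs"      => ["graph_render_time",       10],
  "product_suggestions"   => ["suggestion_service_time",  2]
}

def degrade_if_needed
  DEGRADABLE_FEATURES.each do |feature, (metric, threshold)|
    if Metrics.average(metric, 3 * 60) > threshold
      Feature.disable(feature)      # switch the non-critical feature off
    else
      Feature.enable(feature)       # conditions back to normal: switch it on again
    end
  end
end

Run periodically (every minute, for example, from a scheduler), this keeps the core business functionalities responsive at the expense of the less critical ones.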
Limits and constraints
As noted above, all you need to implement feature flipping is an “if“.
However, like with any development, this can easily become a new source
of complexity if you do not take the necessary precautions.
1. 1 “if“ = 2 tests.
Automated tests are still the best way to check that your software is
working as it should. In the case of feature flipping, you will need at least 2 tests: one with the feature flipped ON (activated) and one with the feature flipped OFF (deactivated).
In development, one often forgets to test the feature OFF even though
this is what your clients will see unless it is ON. Therefore, once more,
applying TDD[2]
is a good solution: tests written in the initial development
phases guarantee testing of OFF functionalities.
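For instance, a pair of tests of this kind makes the two paths explicit (RSpec-style; render_checkout and current_user are hypothetical helpers of the application under test):

# both paths of the flag are tested - an illustrative RSpec-style sketch
describe "checkout page" do
  context "when 'new_checkout' is OFF" do
    before { Feature.disable("new_checkout") }

    it "still shows the legacy checkout" do
      expect(render_checkout(current_user)).to include("legacy-checkout")
    end
  end

  context "when 'new_checkout' is ON" do
    before { Feature.enable("new_checkout") }

    it "shows the new checkout" do
      expect(render_checkout(current_user)).to include("new-checkout")
    end
  end
end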
2. Clean up!
Extensive use of feature flipping can lead to an accumulation of “ifs“,
making it more and more difficult to manage the code. Remember that
for some functionalities, flipping is only useful for ensuring continuous
deployment.
For all functionalities that should never again need to be deactivated (free/
optional functionalities which will never be degraded as they are critical
from a functional perspective), it is important to delete the “ifs“ to lighten
the code and keep it serviceable.
You should therefore set aside some time following deployment to
production to “clean up“. Like all code refactoring tasks, it is all the
easier the more regularly you do it.
[2] Test Driven Development
3. Major modifications (e.g. changing your relational model)
Some functionalities entail major changes in the code and data model.
Let us take the example of a Person table containing an Address field. To
meet new needs, you decide to divide the tables as follows: the Person table (ID, Last name, First name, Address) becomes a Person table (ID, Last name, First name) plus a separate Address table (ID, Person_ID, Street, Post_Code, Town, Country).
To manage cases like this, here is a strategy you can implement:
		 Add the table Address (so that the base contains both the column
Address AND the table Address). For applications nothing has
changed, they continue querying the old columns.
		 You then modify your existing applications so that they use the new
tables.
		 You migrate the data you have and delete all unused columns.
		 At this point, most often the application will have changed little for
the user, but calls upon a new data model.
		 You can then start developing new functionalities based on your
new data model, using feature flipping.
The strategy is relatively simple and entails downtime for the various releases (phases 2 and 4); a sketch of the first, “expansion“ step is given below.
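As an illustration of the “expansion“ phase, here is what the first migration could look like, assuming a Rails-style ActiveRecord setup with conventional people/addresses table names (the backfill is deliberately naive and only illustrative):

# step 1 ("expansion"): add the new Address table and backfill it,
# while keeping the legacy Person.address column untouched for now
class AddAddressTable < ActiveRecord::Migration[6.1]
  def up
    create_table :addresses do |t|
      t.references :person
      t.string :street
      t.string :post_code
      t.string :town
      t.string :country
    end

    # backfill from the legacy column; existing applications keep reading
    # Person.address until they are migrated (phase 2)
    execute <<~SQL
      INSERT INTO addresses (person_id, street)
      SELECT id, address FROM people
    SQL
  end

  def down
    drop_table :addresses
  end
end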
Other techniques can be used to manage several versions of your data model in parallel, in keeping with the “zero downtime deployment“ pattern, allowing you to update your relational schema without impacting the availability of the applications using it, based on various types of scripts (expansion and contraction scripts), triggers to synchronize the data, or even views to expose the data to the applications through an abstraction layer.
Changes to one’s relational model are much less frequent than changes to
code, but they are complex and have to be planned well in advance and
managed very carefully.
NoSQL (Not Only SQL) databases are much more flexible as concerns the data model, and can therefore also be an interesting option.
Who makes it work for them?
It works for us, even though we are not (yet!) Web Giants.
		 In the framework of our Appaloosa project we successfully
implemented the various patterns described in this article.
		 For Web Giants, their size, constraints due to deployment to several
sites, big data migrations, leave them no choice but to implement
such mechanisms. Among the most famous are Facebook, Flickr and
Lyris Inc. Closer to home are Meetic, the Bibliothèque Nationale de France and Viadeo, with the latter being particularly insistent on code clean-up and only leaving feature flags in production for a few days.
	 And anyone who practices continuous deployment (cf. “Continuous
Deployment“, p. 105) applies, in one way or another, the feature
flipping pattern.
How can I make it work for me?
There are various ready-made implementations in different languages
such as the gem rollout in Ruby and the feature flipper in Grails, but it is
so easy that we recommend you design your own implementation tailored
to your specific needs.
There are multiple benefits and possible uses, so if you need to
progressively deploy functionalities, or carry out user group tests, or
deploy continuously, then get started!
Sources
• Flickr Developer Blog:
 http://code.flickr.com/blog/2009/12/02/flipping-out
• Summary of the Flickr session at Velocity 2010:
 http://theagileadmin.com/2010/06/24/velocity-2010-always-ship-trunk
• Quora question on Facebook:
 http://www.quora.com/Facebook-Engineering/How-does-Facebooks-Gatekeeper-service-work
• Forrst Engineering Blog:
 http://blog.forrst.com/post/782356699/how-we-deploy-new-features-on-forrst
• Slideshare, Lyris Inc.:
 http://www.slideshare.net/eriksowa/feature-bits-at-devopsdays-2010-us
• Talk by Lyris Inc. at Devopsdays 2010:
 http://www.leanssc.org/files/201004/videos/20100421_Sowa_EnabilingFlowWithinAndAcrossTeams/20100421_Sowa_EnabilingFlowWithinAndAcrossTeams.html
• Whitepaper, Lyris Inc.:
 http://atlanta2010.leanssc.org/wp-content/uploads/2010/04/Lean_SSC_2010_Proceedings.pdf
• Interview with Ryan King from Twitter:
 http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king
• Blog post by Martin Fowler:
 http://martinfowler.com/bliki/FeatureToggle.html
• Blog 99designs:
 http://99designs.com/tech-blog/blog/2012/03/01/feature-flipping
Test A/B
Description
A/B Testing is a product development method to test a given
functionality’s effectiveness. You can thus test e.g. a marketing
campaign via e-mail, a home page, an advertising insert or a payment
method.
This test strategy allows you to compare several versions of an object that differ by a single variable: the subject line of an e-mail or the contents of a web page.
Like any test designed to measure performance, A/B Testing can only be
carried out in an environment capable of measuring an action’s success.
Let us take the example of a subject heading in an email. The test must
bear on how many times it was opened to determine which contents
were most compelling. For web pages, you look at click-through rates; for
payments, conversion rates.
Implementation
The method itself is relatively simple. You have variants of an object which
you want to test on various user subsets. Once you have determined
the best variant, you open it to all users.
A piece of cake? Not quite.
The first question must be the nature of the variation: where do you set
your cursor between micro-optimization and major overhaul? All depends
on where you are on the learning curve. If you’re in the client exploration
phase (cf. “Minimum Viable Product“, p. 95, “Lean Startup“, p. 87), A/B
Testing can completely change the version tested. For example, you can
set up two home pages with different marketing messages, different
layouts and graphics, to see user reactions to both. If you are farther along
in your project, where the variation of a conversion goal of 1% makes a
difference, variations can be more subtle (size, color, placement, etc.).
The second question is your segmentation. How will you define the
various sub-sets? There is no magic recipe, but there is a fundamental
rule: the segmentation criteria must have no influence on the
experience results (A/B Testing = a single variable). You can take a very
basic feature such as subscription date, alphabetical order, as long as it
does not affect the results.
The third question is when to stop. How do you know when you have
enough responses to generalize the results of the experiment? It
all depends on how much traffic you are able to generate, on how
complex your experiment is and the difference in performance across
your various samplings. In other words, if traffic is low and results are
very similar, the test will have to run for longer. The main tools available on the market (Google Website Optimizer, Omniture Test&Target, Optimizely) include methods for determining if your tests are significant.
If you manage your tests manually, you should brush up on statistics and
sampling principles. There are also websites to calculate significance
levels for you.[1]
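If you do run the numbers yourself, the classic approach is a two-proportion z-test on the conversion rates of variants A and B. A minimal sketch, with made-up figures:

# two-proportion z-test for an A/B test - an illustrative sketch
def z_score(conversions_a, visitors_a, conversions_b, visitors_b)
  p_a = conversions_a.to_f / visitors_a
  p_b = conversions_b.to_f / visitors_b
  # pooled conversion rate under the "no difference" hypothesis
  p_pool = (conversions_a + conversions_b).to_f / (visitors_a + visitors_b)
  standard_error = Math.sqrt(p_pool * (1 - p_pool) * (1.0 / visitors_a + 1.0 / visitors_b))
  (p_b - p_a) / standard_error
end

# made-up figures: 200 conversions out of 5,000 visitors for A, 250 out of 5,000 for B
z = z_score(200, 5_000, 250, 5_000)
puts z.round(2)    # |z| above 1.96 roughly corresponds to 95% confidence (two-sided)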
Let us now turn to two pitfalls to be avoided when you start A/B Testing.
First, looking at performance tests from the perspective of a single
goal can be misleading. Given that the test changes the user experience,
you must also monitor your other business objectives. By changing the
homepage of a web site for example, you will naturally monitor your
subscription rate, without forgetting to look at payment performance.
The other pitfall is to offer a different experience to a single group over
time. The solution you implement must be absolutely consistent for
the duration of the experiment: returning users must be presented with
the same experimentation version, both for the relevance of your results
and the user experience. Once you have established the best solution,
you will then obviously deploy it for all.
Who makes it work for them?
We cannot fail to mention the pioneer of A/B Testing: Amazon. Web players on
the whole show a tendency to share their experiments. On the Internet
you will have no trouble finding examples from Google, Microsoft, Netflix,
Zynga, Flickr, eBay, and many others, with at times surprising results. The
site www.abtests.com lists various experiments.
[1] http://visualwebsiteoptimizer.com/ab-split-significance-calculator
How can I make it work for me?
A/B Testing is above all a right to experiment. Adopting a learning
stance, with results hypotheses from the outset and a modus operandi, is
a source of motivation for product teams. Linking the tests to performance
is a way to set up product management driven by data.
It is relatively simple to set up A/B Testing (although you do need to
respect a certain hygiene in your practices). Google Website Optimizer, to mention but one, is a tool directly hooked up to Google Analytics. For a reasonable outlay, you can give your teams the means to objectively assess their actions in relation to the end product.
Sources
• 37signals, A/B Testing on the signup page:
 http://37signals.com/svn/posts/1525-writing-decisions-headline-tests-on-the-highrise-signup-page
• Tim Ferriss:
 http://www.fourhourworkweek.com/blog/2009/08/12/google-website-optimizer-case-study
• Wikipedia:
 http://en.wikipedia.org/wiki/A/B_testing
Design Thinking
Description
In their daily quest for more connection with users, businesses
are beginning to realise that these “users“, “clients“, and other
“collaborators“ are first and foremost human beings. Emerging
behaviour patterns, spawned by new possibilities opened up by
technology, are changing consumer needs and their brand loyalties.
The web giants were among the first to adopt an approach based on the
relevance of all stakeholders involved in the creation of a product, and
therefore concerned by the user experience provided by a given service.
Here, the way Designers have appropriated the work tools is ideal for
qualifying an innovative need.
Reconsidering Design has become a key issue. It is essential for any
Organization that wishes to change and innovate, to question the business
culture, to dare go as far as disruption.
Born in the 1950s and more recently formalised by the agency IDEO,[1] Design Thinking was developed at Stanford University in the USA as well as the University of Toronto in Canada, before making a significant impact on Silicon Valley, to the extent that it has become an approach assimilated by all major web businesses and startups. It then spread to the rest of the English-speaking world, and then to all of Europe.
Design thinking is a human-centered approach
to innovation that draws from the designer’s toolkit
to integrate the needs of people, the possibilities of
technology, and the requirements for business success.
Tim Brown, IDEO
[1] http://www.wired.com/insights/2014/04/origins-design-thinking/
A new vision of Design
Emergence of a strategic asset
First of all one must reconsider the word Design itself, to understand
its deeper, almost etymological, meaning. And therefore recognise that
when you speak of Design, it means that you want to give significance to
something, whether a product, a service or an Organization.
In fact, Design is whenever you want to “give meaning“ to something.
A far cry from the simple representation, aesthetic or merely practical, of
a product.
“Great design is not something anybody has traditionally
expected from Google“ – TheVerge
Several web giants became aware of the strategic relevance of “operational“ Design before more fully implementing Design Thinking.[2] This is the case for Google which, in 2011,[3] published a strong strategic vision for Design, going beyond “Full metrics“ alone (systematic A/B Testing, incremental feedback without direct user involvement...).
[2] http://www.forbes.com/sites/darden/2012/05/01/designing-for-growth-apple-does-it-so-can-you/
[3] http://www.theverge.com/2013/1/24/3904134/google-redesign-how-larry-page-engineered-beautiful-revolution
[Figure: Design = meaning (the why) + conception (the how).]
Today, there are even Designers behind the creation of various web
giants, such as AirBnB.[4]
And some who go so far as to consider Design as
the main asset in their global business strategy (Pinterest, various Design
Disruptors).
The first step to implementing a strategic Design is to create an
environment which fosters the expression of different opinions around
the role of Design within the company. This is how you avoid conflation
between operational, cognitive and strategic aspects.
[4] http://www.centrodeinnovacionbbva.com/en/news/airbnb-design-thinking-success-story
[Figure: the four steps of the Design Ladder - Step 1: companies that do not use design; Step 2: companies that use design for styling and appearance; Step 3: companies that integrate design into the development process; Step 4: companies that consider design a key strategic element - together with the progression Emotional Design / Interaction Design / Strategic Design and Meaningful / Usable / Delightful.]
Designing the experience: a dialog between users and professionals
“Design is human. it’s not about “is it pretty,“ but about
the connection it creates between a product and our lives.“
– Jenny Arden, Design Manager AirBnB
A strong bond is established through Design between the user and the
designer.
This is a context where the designer offers a service, promises an
experience, after which the user qualifies the experience through feedback
- negative or positive - which can lead to designer loyalty.
It is this relationship that leads to strong business value.
Such commitment can be seen towards social networks (LinkedIn, Facebook, Pinterest, Twitter…) and therefore largely among the web giants and, by extension, towards all desirable digital services.
It is the Design process which materialises this relationship; the shared
history between the brand, its product, or the service behind the product
and users.
	
“When people can build an identity with your product over
time, they form a natural loyalty to it.“
Soleio Cuervo, Head of Design, Dropbox
Then come specialists of this precious relationship, in the form of labs or other types of specialised Organizations (Google Ventures, IBM), working to optimise this new balance.
Design thinking
The working hypothesis
Design thinking entails understanding needs, and makes it possible
to create tailored and adequate solutions for any problem that comes
up. This means taking an interest in fellow humans in the most open,
compassionate way possible.
Innovation appears in the balance between the following factors:
	 What is viable from a business perspective, in line with the business model.
	 What is technologically feasible, neither obsolete nor too far ahead of its time.
	 And, lastly, what is desirable: the human factor and takeaways.
The specificity of the process lies in its ability to address a problem
through unprecedented collaboration between all stakeholders: from
the “creators“ (those who drive the business strategy, for example the
company) to the “users“ whoever they may be (in-house and external,
direct and indirect).
[Figure: innovation sits at the intersection of what is viable (business), what is doable (technology) and what is desirable (humans).]
Methodological approach
The methodological translation of the Design Thinking approach is a
series of steps where the goal is to provide structure for innovation by
optimising the analytical and intuitive aspects of a problem.
[Figure: “Bridging the Fundamental Predilection Gap“ (Rotman) - Design Thinking as a 50/50 mix between 100% reliability and 100% validity.]
The approach unfolds in three main phases:
	 Inspiration or Discovery: learning to examine a problem or request.
Understanding and observing people and their habits. Getting a feel
for emerging wishes and needs.
	Ideating or Defining: making sense of the discoveries through a concept or vision. Establishing the business and technology
possibilities and prototyping the target innovations as quickly as
possible.
	Implementing or Delivering: materialising and testing to maximise
feedback on the innovation so as to swiftly make adjustments.
More precisely, these phases are often broken down into several steps
to anchor the methodology. The number and nature of the steps vary
depending on who is implementing them. Below are the 5+1 steps
suggested by the Stanford Institute of Design and adopted by IDEO:[5]
Empathy: Begin by understanding the people who will be impacted by
your product or service. This has to do with contacts, interviews, relations.
It is the choice of rediscovering the demand environment. The mandate is
openness, curiosity, and not formalisation.
Definition: It is the formalisation of a concept bearing on all the elements
discovered during the first step. It is based on real needs, driven by
potential clients rather than the company's context.
Ideation: This is the step where ideas are generated. This phase of optimism encourages all possible ideas emerging from the previously discovered
concepts. Exercises and Design workshops can serve to focus on specific
aspects to see what intentions are possible. Little by little, ideas are
grouped together, refined, completed, and given more specific meaning.
Prototyping: Then comes the moment for materialisation, for moving on
to the “how“. Here the problems are represented more concretely, to
draw out potential. Speed is of the essence, especially in making mistakes
so as to quickly reposition. Simple materials are used such as cardboard,
putty...
Testing: It is then time to test the prototype, with potential users, to ensure
its feasibility and check that it is a cultural fit for your brand. Sparked
interest is proof that the prototype is a solution in tune with a user need.
Lastly, let us add evolution: The results from the preceding phases should
be a new starting point for researching the best way to create value
around a given need. One thus understands that the implementation of
the Design approach does not end once the process has started, because
it forces you to systematically evolve what you already have.
[5] https://dschool.stanford.edu/sandbox/groups/designresources/wiki/36873/attachments/74b3d/ModeGuideBOOTCAMP2010L.pdf?sessionID=c2bb722c7c1ad51462291013c0eeb6c47f33e564
[Figure: the five steps - Empathize, Define, Ideate, Prototype, Test.]
Some of the steps can be repeated, adjusted, refined, added to. New
ideas are born out of tests: following prototyping for example, other
types of potential clients can emerge... And this happens in a context
of iteration, co-creation, sometimes without any hierarchy, and with a
sufficiently optimistic mindset to accept any failures.
Design vs. Tech[6]
Design is currently such a major driver for the web giants that questions
arise concerning technology as a crucial strategic element.
Choices are made in the front-end of everything – Scott
Belsky Behance
One effectively observes that the beneficial effects of Moore's law are
diminishing while, at the same time, users are gaining in maturity, to the
point where they are increasingly involved in defining the perfect interface
for them.
[6] http://www.kpcb.com/blog/design-in-tech-report-2015
[Figure: “Why are Tech Companies Acquiring Design Agencies?“ (adapted from the Design in Tech presentation, John Maeda, KPCB partner). The old way of thinking: (1) the solution to every new problem in tech has been simple: more tech; (2) a better experience was made with a faster CPU or more memory. The new way of thinking: (3) Moore's law no longer cuts it as the key path to a happier customer.]
Moreover, the new generations of users no longer consider possibilities
driven by technology as innovation breakthroughs but rather as basic
expectations (it is normal for technology to open up new possibilities).
Thus it is Design which makes the difference in what clients buy and the
brands they are loyal to.
Noting this trend, many web giants started buying up companies
specialised in Design in 2010.
[Figure: #DesignInTech M&A activity (adapted from the Design in Tech presentation, John Maeda, KPCB partner) - the number of designer co-founded tech companies grows sharply from 2005 to the present (Flickr, Android, YouTube, Vimeo, Tumblr, Instagram, Behance, Mailbox, Slideshare, about.me, Beats, Simple and others), with acquisition call-outs at $1.65B and $1.0B; mobile was the inflection point for #DesignInTech.]
How can I make it work for me?
Which way to implement it?
The crucial step is to evolve your company into a Design-centric Organization. The strategy is to promote full integration of Design Thinking in your company:[7]
	 Design-centric leaders, who consider Design as a structural cultural edge both within their company and in the expression of their values (products, services, expert advice, quality of product code...).
	 Embracing the Design culture: the development of the business culture is systematically informed by values of empathy rather than organic growth, the user experience (UX) is the most important benchmark, and the goal is to provide high quality client experiences (CX) with true value.
	 The Design thought process: Design Thinking and its implementation are a given in the company mindset, and therefore teams concentrate on the opportunities found in problems rather than on project opportunities.
Several implementation vectors can serve to promote this mindset:
	 The acquisition of talent, i.e. incorporating designers (IBM).
	 Calling upon consultants for help with issues which go beyond methodology.
	 Assimilation, by integrating Design studios and coaches.[8]
	 A structure built around Design: companies are organised around attracting talent, with co-leaders for each position encouraging each other to create initiatives, an integral part of their responsibilities.
Globally speaking, 10% of the 125 richest companies in the USA have Top
Managers or CEOs from Design. Alongside the web giants, one notes that
the CEO of Nike is a Designer. Apple is the only company to have an SVP for Design.
Adapting will take time, especially as most have yet to realise the relevance
of Design. Getting help from Designers or UX specialists familiar with the
approach is necessary for sharing these new tools and then putting them
into operation.
[7] https://hbr.org/2015/09/design-as-strategy
[8] http://www.wired.com/2015/05/consulting-giant-mckinsey-bought-top-design-firm/
[Figure: evolution between 2003 and 2007 of the share of companies at each step of the Design Ladder, from “companies that do not use design“ to “companies that consider design a key strategic element“.]
Who makes it work for them?
While since 2010, GAFAM, NATU and other web giants have been
following this strategy, today all sectors refer, directly or indirectly, to
Design Thinking in their quest for an optimal client experience.[9]
Among concrete examples, we will mention the following:
	 On the point of disappearing after multiple failures, AirBnB managed to turn themselves around thanks to Design Thinking.[10]
	 Exploration of aggregated services, proposed and tested by Uber in partnership with Google.[11]
	 Still at Uber, the Design Thinking approach underlies the entire internal structure of the company.[12]
	 With the same goal, IBM restructured its Organization through a Design transition.[13]
	 At Dropbox, Design Thinking is ubiquitous, both in terms of its products and internal structure.[14][15]
More precisely, one can describe strong involvement in Strategic Design as:
	 Implementation in several stages (from visual Design to strategic Design) at: Google, Apple, Facebook, Dropbox, Twitter, Netflix, Salesforce, Amazon…
	 An overarching Design-centric strategy at: Pinterest, AirBnB, Google Ventures, Coursera, Etsy, Uber, most FinTechs.
[9] http://blog.invisionapp.com/product-design-documentary-design-disruptors/
[10] https://www.youtube.com/watch?v=RUEjYswwWPY
[11] http://www.happinessmakers.com/knowledge/2015/11/29/inside-ubers-design-thinking
[12] http://talks.ui-patterns.com/videos/applying-design-thinking-at-the-organizational-level-uber-amritha-prasad
[13] http://www.fastcodesign.com/3028271/ibm-invests-100-million-to-expand-design-business
[14] https://twitter.com/intercom/status/614537634833137664
[15] http://designerfund.com/bridge/day-in-the-life-rasmus-andersson/
Associated patterns
		 Pattern “Enhancing the user experience“ p. 27
		 Pattern “Lean Startup“ p. 87
Sources
• Evolution of Design Thinking: special issue of the Harvard Business Review:
 https://hbr.org/archive-toc/BR1509?cm_sp=Magazine%20Archive-_-Links-_-Previous%20Issues
 http://stanfordbusiness.tumblr.com/post/129579353544/how-design-thinking-can-help-drive-relevancy-in
• The example of AirBnB:
 https://growthhackers.com/growth-studies/airbnb
 https://www.youtube.com/watch?v=RUEjYswwWPY
• Methodology:
 https://www.ideo.com/images/uploads/thoughts/IDEO_HBR_Design_Thinking.pdf
 https://www.rotman.utoronto.ca/Connect/RotmanAdvantage/CreativeMethodology.aspx
 http://www.gv.com/sprint/
• Design Value:
 http://www.dmi.org/default.asp?page=DesignDrivesValue#.VW6gfEycSdQ.twitter
 Design-Driven Innovation: Why it Matters for SME Competitiveness, White Paper, Circa Group
• Design in Tech:
 http://www.kpcb.com/blog/design-in-tech-report-2015
Device Agnostic
Description
For Web Giants, user-friendliness is no longer open to debate: it is non-negotiable.
As early as 2003, the Web 2.0 manifesto pleaded in favor of the “Rich
User Experience“, and today, anyone working in the world of the Web
knows the importance of providing the best possible user interface. It
is held to be a crucial factor in winning market shares.
In addition to demanding high quality user experience, people want to
access their applications anywhere, anytime, in all contexts of their daily
lives. Thus a distinction is generally made between situations where one
is sitting (e.g. at the office), nomadic (e.g. waiting in an airport terminal) or
mobile (e.g. walking down the street).
These situations are currently linked to various types of equipment, or
devices. Simply put, one can distinguish between:
		 Desktop computers for sedentary use.
		 Laptops and tablets for nomadic use.
		 Smartphones for mobile use.
The Device Agnostic pattern means doing one’s utmost to offer the
best user experience possible whatever the situation and device.
One of the first companies to develop this type of pattern was Apple with
its iTunes ecosystem. In fact, Apple first made music accessible on PC/
Mac and iPod, then on the iPhone and iPad. Thus they have covered the
three use situations. In contrast, Apple does not fully apply the pattern as
their music is not accessible on Android or Windows Phone.
To implement this approach, it can be necessary to offer as many
interfaces as there are use situations. Indeed, a generic interface of the
one-size-fits-all type does not allow for optimal use on computers, tablets,
smartphones, etc.
The solution adopted by many of Web Giants is to invest in developing
numerous interfaces, applying the pattern API first (cf. “Open API“,
p. 235). Here the principle is for the application architecture to be based on
a generic API, with the various interfaces then being directly developed by
the company, or indirectly through the developer and partner ecosystem
based on the API.
To get the most out of each device, it is becoming ever more difficult to use
only Web interfaces. This is because they do not manage functionalities
specific to a given device (push, photo-video capture, accelerometer,
etc.). Users also get an impression of lag because the process entails
frontloading the entire contents,[1]
whereas native applications need no
loading or only a few XML or JSON resources.
I’d love to build one version of our App that could work
everywhere. Instead, we develop separate native versions
for Windows, Mac, Desktop Web, iOS, Android, BlackBerry,
HP WebOS and (coming soon) Windows Phone 7.
We do it because the results are better and, frankly,
that’s all-important.
We could probably save 70% of our development budget
by switching to a single, cross-platform client, but we would
probably lose 80% of our users.
Phil Libin, CEO Evernote (January, 2011)
However things are changing with HTML5 which functions in offline
mode and provides resources for many applications not needing GPS
or an accelerometer. In sum, there are two approaches adopted by Web
companies: those who use only native applications such as Evernote, and
those who take a hybrid approach, using HTML5 content embedded in the native application, which then becomes a simple empty shell, capable
only of receiving push notifications. This is in particular the case of Gmail,
Google+ and Facebook for iPhone. One of the benefits of this approach is
to enhance visibility in the AppStores where users go for their applications.
The hybrid pattern is thus a good compromise: companies can use the
HTML5 code on a variety of devices and still install the application via
an App Store with Apple, Android, Windows Phone, and, soon, Mac and
Windows.
[1] This frontloading can be optimized (cf. “Enhancing the user experience“, p. 27) but there
are no miracles…
Who makes it work for them?
There are many examples of the Device Agnostic pattern being
implemented among Web Giants. Among others:
		 In the category of exclusively native applications: Evernote, Twitter,
Dropbox, Skype, Amazon, Facebook.
		 In the category of hybrid applications: Gmail, Google+.
References among Web Giants
Facebook proposes:
		 A Web interface for PC/Mac: www.facebook.com.
		 A Web interface for Smartphones: m.facebook.com.
		 Native mobile interfaces for iPad, iPhone, Android, Windows
Phone, Blackberry, PalmOS.
		 A text message interface to update one’s status and receive
notifications of friend updates.
		 An email interface to update one’s status.
In addition, there are several native interfaces for Mac and PC offered
by third parties such as Seesmic and Twhirl.
Twitter stands out from the other Web Giants in that it is their ecosystem
which does the implementing for them (cf. “Open API“, p. 235). Many
of the Twitter graphic interfaces were in fact created by third parties
such as TweetDeck, Tweetie for Mac and PC, Twitterrific, Twidroid for
smartphones... To the extent that, for a time, Twitter’s web interface was
considered user-unfriendly and many preferred to use the interfaces
generated by the ecosystem instead. Twitter is currently overhauling the
interfaces.
In France
One finds the Device Agnostic pattern among major media groups. For
example Le Monde proposes:
		 A Web interface for PC/Mac: www.lemonde.fr
		 A Web interface for Smartphones: mobile.lemonde.fr
		 Hybrid mobile interfaces for iPhone, Android, Windows Phone,
Blackberry, PalmOS, Nokia OVI, Bada
		 An interface for iPad
It is also found in services with high consultation rates such as banking. For
example, the Crédit Mutuel proposes:
		 A Web interface for PC/Mac: www.creditmutuel.fr
		 A redirect service for all types of device: m.cmut.fr
		 A Web interface for Smartphones: mobi.cmut.fr
		 A Web interface for tablets: mini.cmut.fr
		 A WAP interface: wap.cmut.fr
		 A simplified Java interface for low technology phones
		 Native mobile interfaces for iPad, iPhone, Android, Windows
Phone, Blackberry
		 An interface for iPad.
How can I make it work for me?
The pattern is useful for any B2C service where access anywhere, anytime
is important.
If your budget is limited, you can implement the mobile application most
used by your target clients, and propose an open API in the hopes that
others will develop the interface for additional devices.
Associated patterns
		 The Open API or open ecosystem pattern, p. 235.
		 The Enhancing the User Experience pattern, p. 27.
Exception!
As mentioned earlier, this pattern is only limited by the budget required
for its implementation.
Sources
• Rich User Experiences, Web 2.0 Manifesto, Tim O’Reilly:
 http://oreilly.com/Web2/archive/what-is-Web-20.html
• Four Lessons From Evernote’s First Week On The Mac App Store, Phil Libin:
 http://techcrunch.com/2011/01/19/evernote-mac-app-store
Perpetual beta
Description
Before introducing perpetual beta, we must revisit a classic pattern in the world of open source software:
Release early,
release often.
The principle behind this pattern consists of regularly releasing code to the
community to get continuous feedback on your product from programmers,
testers, and users. This practice is described in Eric Steven Raymond’s 1999
work “The Cathedral and the Bazaar“. It is in keeping with the short iteration
principle in agile methods.
The principle of perpetual beta was introduced in the Web 2.0 manifesto
written by Tim O’Reilly where he writes:
Users must be treated as co-developers, in a reflection of
open source development practices (...).
The open source dictum, ‘release early and release often’,
in fact has morphed into an even more radical position,
‘the perpetual beta’, in which the product is developed in
the open, with new features slipstreamed in on a monthly,
weekly, or even daily basis.
The term “perpetual beta“ refers to the fact that an application is never
finalized but is constantly evolving: there are no real releases of new
versions. Working this way is obviously in line with the logic of “Continuous
Delivery“ (cf. “Continuous Deployment“, p. 105).
This constant evolution is possible because it is a case here of services on
line rather than software:
		 In the case of software, version management usually follows a
roadmap with publication benchmarks: releases. These releases are
spread out over time for two reasons: the time it takes to deploy
the versions to the users, and the need to ensure maintenance
and support for the various versions released to users. Monitoring
support, security updates and ongoing maintenance on several
versions of a single program is a nightmare, and a costly one. Let
us take the example of Microsoft: the Redmond-based publisher
had to manage at one point the changes to Windows XP, Vista and
Seven. One imagines three engineering teams all working on the
same software: a terrible waste of energy and a major crisis for any
company lacking Microsoft’s resources. This syndrome is known as
“version perversion“.
		 In the context of online services, only one version of the application
needs to be managed. Furthermore, since it is Web Giants
themselves who upload and host their applications, users benefit
from updates without having to manage the software deployment.
New functionalities appear on the fly where they are “happily“
discovered by the users. In this way one learns to use new functions in
applications progressively. Generally speaking, backward compatibility is well managed (with a few exceptions, such as support
in disconnected mode in Gmail, when they gave up Google Gears). This
model is widely applied by the stakeholders of Cloud Computing.
The “customer driven roadmap“ is a complementary and virtuous
feature of the perpetual beta (cf. “Lean Startup“, p. 87). Since Web Giants
manage the production platform, they can also finely measure the use of their software, thereby measuring the success of each new functionality. As
mentioned previously, the Giants follow metrics very closely. So closely in
fact that we have devoted a chapter to the subject (cf. “The obsession with
performance measurement“, p. 13).
More classically, running the production platform provides opportunities
to launch surveys among various target populations to get user feedback.
To apply the perpetual beta pattern, you must have the means to carry out
regular deployments. The prerequisites are:
		 implementing automatic software builds,
		 practicing Continuous Delivery,
		 ensuring you can roll back in case of trouble...
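As an illustration, here is a minimal deployment script with a rollback step. The build, release and health-check commands are placeholders for whatever tooling you already have (build server, configuration management, monitoring); this is a sketch of the principle, not a prescribed implementation.

```python
# Minimal sketch of an automated deployment with rollback.
# ./build.sh, ./release.sh and ./healthcheck.sh are hypothetical scripts
# standing in for your own build, delivery and monitoring tooling.
import subprocess
import sys

def run(cmd: list[str]) -> bool:
    """Run a shell command and report whether it succeeded."""
    return subprocess.run(cmd).returncode == 0

def deploy(new_version: str, previous_version: str) -> None:
    # 1. Build and package automatically (prerequisite: automated builds).
    if not run(["./build.sh", new_version]):
        sys.exit("build failed, nothing deployed")

    # 2. Push the new version to production (prerequisite: continuous delivery).
    if not run(["./release.sh", new_version]):
        sys.exit("release failed")

    # 3. Check the platform; roll back if anything is wrong (prerequisite: rollback).
    if not run(["./healthcheck.sh"]):
        run(["./release.sh", previous_version])  # redeploy the last known-good version
        sys.exit(f"health check failed, rolled back to {previous_version}")

if __name__ == "__main__":
    deploy(new_version="2024-05-17.3", previous_version="2024-05-17.2")
```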
There is some controversy around the perpetual beta: some clients equate
beta with an unfinished product and believe that services following this
pattern are not reliable enough to count on. This has led some service
operators to remove the “beta“ label from their site, albeit without
changing their practices.
Who makes it work for them?
The best-known example was Gmail, which sported the beta label until 2009 (with
a tongue-in-cheek “back to beta“ option added later).
It is a practice implemented by many Web Giants. Facebook, Amazon,
Twitter, Flickr, Delicious.com, etc.
A good illustration of perpetual beta is provided by Gmail Labs: small,
self-contained features which users can decide to activate or not.
Depending on the rate of adoption, Google then decides to integrate
them in the standard version of their service or not (cf. “Feature Flipping“,
p. 113).
In France, the following services display, or have displayed, the beta logo
on their home page:
		 urbandive.com: a navigation service with street view, by Pages
Jaunes,
		 sen.se: a service for storing and analyzing personal data.
Associated patterns
		 Pattern “Continuous Deployment“, p. 105.
		 Pattern “Test A/B“, p. 123.
		 Pattern “The obsession with performance measurement“, p. 13.
Exception!
Some Web Giants still choose to keep multiple versions up and running
simultaneously. Maintaining several versions of an API is particularly
relevant as it saves developers from being forced into updating their
code every time a new version of the API is released. (cf. “Open API“,
p.235.)
The Amazon Web Services API is a good example.
Sources
• Tim O’Reilly, What Is Web 2.0 ?, 30 September, 2005:
 http://oreilly.com/pub/a/web2/archive/what-is-web-20.html
• Eric Steven Raymond, The Cathedral and the Bazaar:
 http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-
bazaar/
Architecture
Cloud First..................................................................................... 159
Commodity Hardware.................................................................... 167
Sharding........................................................................................ 179
TP vs. BI: the new NoSQL approach................................................ 193
Big Data Architecture..................................................................... 201
Data Science.................................................................................. 211
Design for Failure........................................................................... 221
The Reactive Revolution................................................................. 227
Open API ...................................................................................... 235
Cloud First
Description
As we saw in the description of the pattern “Build vs. Buy“ (cf. “Build vs.
Buy“, p. 19): Web Giants favor specific developments so as to control
their tools from end to end, whereas many companies instead use
software packages, considering that software tools are commodities.[1]
Although Web Giants, like startups, prefer to develop critical applications
in-house, they do at times have recourse to third-party commodities. In
this case, they apply the commodity logic to the fullest by choosing to
completely outsource the service in the Cloud.
By favoring services in the Cloud, Web Giants, again like startups, take
a very pragmatic stance: profiting from the best innovations by their
peers, speedily and with an easy-to-use purchase model, to focus their
efforts on their business strengths. This model can be inspiring for all
companies wishing to move fast and to reduce investment costs to win
market shares.
Why favor the Cloud for commodities? The table below lays out the
advantages.
The Cloud approach can be divided into three main strands:
		 Using APIs and Mashups: Web Giants massively call upon services
developed by Cloud companies (Google Maps, user identification
on Facebook, payment with PayPal, statistics with Google Analytics,
etc.) and integrate them in their own pages via the mashup principle.
		 Outsourcing functional commodities: Web majors often
externalize their commodities to SaaS services (e.g. Google Apps
for collaboration, Salesforce for managing the sales force, etc.)
	 Outsourcing technical commodities: Web players also regularly use
IaaS and PaaS platforms to host their services (Netflix and Heroku for
example use Amazon Web Services).
For each analysis axis, in-house management versus the Cloud:
		 Cost: in-house, an initial outlay for licenses, equipment and staff; in the Cloud, pay-per-use, with neither investment nor commitment.
		 Time to Market: in-house, license purchase then deployment by the company within a few weeks; in the Cloud, a self-service subscription automatically provisioned within minutes.
		 Roadmap / new functionalities: in-house, designed in the mid term by publishers following feedback from user groups; in the Cloud, implemented in the short term depending on what users do with the service.
		 Rhythm of change: in-house, often one major release per year; in the Cloud, new functionalities on the fly.
		 Support and updates: in-house, an additional yearly cost; in the Cloud, included in the subscription.
		 Hosting and operating: in-house, building and operating a datacenter run by experts; in the Cloud, delegated to the Cloud operator.
		 Physical safety of data: in-house, data integrity is the responsibility of the company; in the Cloud, the major operators ensure the safety of data in accordance with the ISO 27001[1] and SSAE 16[2] standards.
[1] ISO 27001 : http://en.wikipedia.org/wiki/ISO_27001
[2] SSAE 16 (replacing the Type 2 SAS 70) : http://www.ssae-16.com
Housing technical commodities in the Cloud is particularly interesting
for Web companies. With the pay-as-you-go model, they can launch
online activities with next to no hosting costs. Charges increase
progressively as the number of users grows, alongside revenues, so all
is well. The Cloud has thus radically changed their launch schedules.
The Amazon Web Services IaaS platform is massively used
by Web Giants such as Dropbox, 37signals, Netflix, Heroku...
During the CloudForce 2009 conference in Paris,
a Vice-President of Salesforce affirmed that the company
did not use an IaaS platform because such solutions did not
exist when the company was created, but that if it were to be
done today they would certainly choose IaaS.
Who makes it work for them?
The eligibility of the Cloud varies depending both on the type of data you
manipulate and on regulatory constraints. Thus:
	 Banks in Luxembourg are forbidden from storing their data elsewhere
than in certified organizations.
	 Companies working with sensitive data, industrial secrets or
patents are reluctant to store them in the Cloud. The Patriot Act[3]
in particular pushes companies away from the Cloud: it forces
companies registered in the United States to make their databases
available upon request by government authorities.
	 Companies which work with personal data can also be forced to
restrict their recourse to the Cloud because of the CNIL regulations,
the respect of which varies from one Cloud platform to the next
(variable implementation of Safe Harbor Privacy Principles).[4]
[3] http://en.wikipedia.org/wiki/PATRIOT_Act
[4] http://en.wikipedia.org/wiki/International_Safe_Harbor_Privacy_Principle
When there are no such constraints, using the Cloud is possible. And many
companies of all sizes and from all sectors have migrated to the Cloud, in
the USA as well as in Europe.
Let us describe a case that well illustrates the potential of the Cloud:
In 2011, Vivek Kundra, then the US federal CIO,
announced the program “Cloud First“ which stipulated
that all US administrations had to use the Cloud first and
foremost for IT.
This decision should be put in context: in the USA there is
the “GovCloud“, i.e. Cloud offers suited to administrations,
with full respect for their constraints, located on American
soil, and isolated from other clients.
Such services are offered by Amazon, Google and other
providers.[5]
In some companies, it is the mindset itself that stands against storing data in
the Cloud. This reluctance is due to the factors presented above, but also
to a lack of confidence (Cloud providers have not yet reached the levels
of trust enjoyed by banks) and possibly to an unwillingness to change. Web Giants
are less affected by these last two impediments: they are already well
acquainted with the Cloud providers and are open to change.
Cloud addiction?
One should also be careful not to depend too fully on a single Cloud
platform to house critical applications. These platforms are not fail-proof,
as shown by recent failures: Microsoft Azure (February, 2012), Salesforce
(June, 2012), Amazon Web Services (April and July, 2012).
The AWS failures highlighted some users’ lack of maturity in their use of the
Cloud:
		 Pinterest, Instagram and Heroku, which depended on a single
Amazon datacenter, were strongly impacted,
[5] Federal Cloud Computing Strategy, Vivek Kundra, 2011:
http://www.forbes.com/sites/microsoft/2011/02/15/kundra-outlines-cloud-first-policy-for-u-
s-government
		 Netflix used several Amazon datacenters and was thus less affected[6]
(cf. “Design for Failure“, p. 221).
One should note however that such failures create media hype whereas
very little is known about the robustness of corporate datacenters. It is
therefore difficult to measure the true impact on users.
Here are a few Service Level Agreements that you can compare with
those of your companies:
		 Amazon EC2: 99.95% availability per year.
		 Google Apps: 99.9% availability per year.
References among Web Giants
A few examples of recourse to the Cloud by Web Giants:
		 using Amazon Web Services: Heroku, Dropbox, 37Signals, Netflix,
Etsy, Foursquare, Voyages SNCF. In fact, Amazon represents 1% of
all traffic on the Web;
		 using Salesforce: Google, LinkedIn;
		 using Google Apps: Box.net.
In France
A few examples of Cloud use in France:
		 In industry: Valeo, Treves use Google Apps.
		 In insurance: Malakoff Méderic uses Google Apps.
[6] Feedback from Netflix on AWS failures:
http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.htm
		 In the banking sector: most use Salesforce for at least part of their
activities.
		 In the Internet sector: PagesJaunes uses Amazon Web Services.
		 In the public sector: La Poste uses Google Apps for their mail
delivery staff.
How can I make it work for me?
If you are an SME or a VSE, you would probably benefit from externalizing
your commodities in the Cloud, for the same reasons as the Web Giants. All the
more so as regulatory issues, such as the protection of industrial secrets,
should be resolved with the emergence of French and European
Clouds such as Andromède.
If you are a large company, already well endowed with hardware and IT
teams, the benefits of the Cloud can be offset by the cost of change. It can
nevertheless be relevant to study the question. In any case, you can profit
from the Cloud’s agility and pay-as-you-go approach for:
		 innovative projects: pilot projects, Proof of Concept, project
incubation, etc.
		 Environments with limited life spans (development, testing, design,
etc.).
Related Pattern
		 Pattern “Build vs. Buy“, p. 19.	
Exception!
As stated earlier, regulatory constraints can cut off access to the Cloud.
In some cases, re-internalization is the best solution: when data and user
volumes increase spectacularly, it can be cheaper to repatriate applications
and build a datacenter on totally optimized architecture.
This type of optimization does however typically require highly-qualified staff.
Commodity Hardware
Description
Although invisible behind your web browser, millions of servers run day
and night to make the Web available 24/7. There are very few leaks as
to numbers, but it is clear that major Web companies run tens or
even hundreds of thousands of machines, as EC2 does,[1] and it is even surmised
that Google has somewhere around a million.[2]
Managing so many
machines is not only a technical challenge, it is above all an economic
one.
Most major players have circumvented the problem by using mass
produced equipment, also called “commodity hardware“, which is the
term we will use from now on.
This is one of the reasons which has led the Web Giants to interconnect
a large number of mass-produced machines rather than using a single
large system. A single service to a client, a single application, can run
on hundreds of machines. Managing hardware this way is known as
Warehouse Scale Computing,[3]
with hundreds of machines replacing a
single server.
Business needs
Web Giants share certain practices, described in various other chapters of
this book:[4]
		 A business model tied to the analysis of massive quantities of data
- for example indexing web pages (i.e. approximately 50 billion
pages).
		 One of the most important performance issues is to ensure that
query response times stay low.
[1] Source SGI.
[2] Here again it is hard to make estimates.
[3] This concept is laid out in great detail in the very long paper The Data Center as
a Computer, we only mention a few of their concepts here. The full text can be found at:
http://www.morganclaypool.com/doi/pdfplus/10.2200/S00516ED2V01Y201306CAC024
[4] cf. in particular “Sharding“, p. 179.
		 Income, from advertising for example, is not linked to the number of queries;
per-query income is actually very low.[5]
Comparatively speaking, the
cost per unit using traditional large servers remains too elevated.
The incentive to find the architecture with the lowest transaction
costs is thus very high.
Lastly, the orders of magnitude of processing carried out by the Giants are
far removed from traditional computer processing management, where
until now the number of users was limited by the number of employees.
No machine, however big, is capable of meeting their needs.
In short, these players need scalability (a constant marginal cost per
transaction), and that marginal cost must stay low.
Mass-produced machines vs. high-end servers
When scalability is at issue, there are two main alternatives:
		 Scale-up or vertical growth consists in using a better performing
machine. This is the alternative that has most often been chosen in
the past because it is very simple to implement. Moreover Moore’s
law means that manufacturers regularly offer more powerful machines at
constant prices.
		 Scale-out or horizontal scaling consists in pooling the resources of
several machines which individually can be much less powerful. This
removes all limits as to the size of the machine.
Furthermore, PC components, technologies and architectures show
a highly advantageous performance/cost ratio. Their relatively weak
processing capacity as compared to more efficient architectures such
as RISC is compensated for by lower costs obtained through mass
production. A study based on the results of the TPC-C[6]
shows that the
relative cost per transaction is three times lower with a low-end server than
with a top of the line one.
[5] “Early on, there was an emphasis on the dollar per (search) query,“ [Urs] Hoelzle said. “We
were forced to focus. Revenue per query is very low.“ http://news.cnet.com/8301-1001_3-
10209580-92.html
[6] Ibid, [3] preceding page.
At the scales implemented by Web Giants - thousands of machines
coordinated to execute a single function - other costs become highly
prominent: electric power, cooling, space, etc. The cost per transaction
must take these various factors into account.
Realizing that has led the Giants to favor horizontal expansion (scale-out)
based on commodity hardware.
Who makes it work for them?
Just about all of the Web Giants. Google, Amazon, Facebook, LinkedIn… all
currently use x86-type servers and commodity hardware. However, using
such components introduces other constraints, and having a Data Center
as a Computer entails scaling constraints which differ widely from what
most of us think of as datacenters. Let us therefore go into more detail.
Material characteristics which impact programming
Traditional server architecture strives, to the extent allowed by the
hardware, to provide developers with a “theoretical architecture“,
including a processor, a central memory containing the program and data,
and a file system.[7]
Familiar programming based on variables, function calls, threads and
processes makes this approach necessary.
Large systems come as close to this “theoretical
architecture“ as a set of machines in a datacenter is far removed from it.
Machines of the SMP (Symmetric Multi Processor) type, used for scaling-
up, now make it possible to use standard programming, with access to the
entire memory and all disks in a uniform manner.
[7] This architecture is known as the Von Neumann architecture.
Figure 1 (modified). Source RedPaper 4640, page 34.
As the figures on the diagram show, great efforts are made to ensure
that bandwidth and latency are nearly identical between a processor, its
memory and disks, whether they are connected directly, connected to
the same processor book[9] or to different ones. If any NUMA (Non Uniform
or different ones. If any NUMA (Non Uniform
Memory Access - accessing a nearby memory is faster than accessing
memory in a different part of the system) characteristics are retained, they
are concentrated on the central memory, with latency and bandwidth
differences in a 1 to 2 ratio.
[9] A processor book is a compartment which contains processors, memory and in and out
connectors, at the first level it is comparable to a main computer board. Major SMP systems
are made up of a set of compartments of this sort interconnected through a second board:
the midplane.
[The Figure 1 diagram shows a large SMP server: whether memory and disks hang off a single processor, a processor book or the whole server, latency and bandwidth stay in the same order of magnitude. One processor: RAM 256 GB, 100 ns, 76.5 GB/s. One processor book: RAM 1 TB, 133 ns, 46.6 GB/s. One server: RAM 8 TB, 39.4 GB/s. In every case, disks: 304 TB, 10 ms, up to 50 GB/s.]
Operating systems and middleware like Oracle can take charge of such
disparities.
From a scale-out perspective, the program no longer runs on a single
large system but is instead managed by a program which distributes it
over a set of machines. This manner of connecting commodity hardware
machines gives the developer a very different vision from that of the
“theoretical architecture“.
Figure 2. Source The Data Center As A Computer page 8
[L1$ and L2$ denote the level 1 and level 2 caches. The diagram compares the levels of a warehouse-scale computer: one server - DRAM 16 GB, 100 ns, 20 GB/s; disk 2 TB, 10 ms, 200 MB/s. Local rack (80 servers) - DRAM 1 TB, 300 µs, 100 MB/s; disk 160 TB, 11 ms, 100 MB/s. Cluster (30 racks) - DRAM 30 TB, 500 µs, 10 MB/s; disk 4.80 PB, 12 ms, 10 MB/s.]
Whenever you use the network to access data on another server, access
time increases and bandwidth is divided by up to 1,000. In addition, it is the network
equipment feeding into the datacenter that is the limiting factor in terms
of the aggregated bandwidth of all machines.
In consequence, to optimize access time and speed within the datacenter,
the data and processing must be well distributed across servers (especially
to avoid distributing data often accessed together over several machines).
However, operating systems and the traditional middleware layers are not
designed for functioning this way. The solution is for processing to take
place at the application level. This is precisely where sharding[10]
strategies
come into play.
Service front elements, serving Web pages, easily support such constraints
given that versioning is not an issue and it is easy to distribute HTTP requests
over several machines. It will however be up to the other applications to
explicitly manage network exchanges or to anchor themselves in new
specific middleware layers. Storage solutions on this type of hardware are
also deployed among Web Giants by using sharding techniques.
Implementing failure resistance
The second significant difference between large systems and Warehouse
Scale Computers lies in failure tolerance. For decades, large systems
have been coming up with advanced hardware mechanisms to maximally
reduce failures (RAID, changing equipment live, replication at the SAN
level, error correction and failover at the memory and I/O level, etc.). A
Warehouse Scale Computer has the opposite features for two reasons:
		 Commodity hardware components are less reliable;
		 the global availability of a system simultaneously deploying to
several machines is the product of the availability of each server.[11]
[10] cf. “Sharding“, p. 179.
[11] Thus if each machine has an annual downtime of 9 hours, the availability of 100 servers will
be at best 0.999^100 ≈ 0.90, i.e. 90%, which means about 36 days of unavailability per year!
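A quick calculation makes the second point concrete (this is simply the arithmetic of footnote [11], with no data about any particular platform):

```python
# If one machine is available 99.9% of the time (about 9 hours of downtime
# a year), a service that needs all 100 machines at once is only available
# roughly 90% of the time.
single_machine_availability = 1 - 9 / (24 * 365)      # ≈ 0.999
cluster_availability = single_machine_availability ** 100

print(f"single machine: {single_machine_availability:.4f}")
print(f"100 machines, all required: {cluster_availability:.2%}")
print(f"≈ {(1 - cluster_availability) * 365:.0f} days of unavailability per year")
```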
[12] SGI is the result of a merger between Silicon Graphics, Cray and above all of Rackable who
had expertise in the field of x86 servers.
[13] http://www.youtube.com/watch?v=Ho1GEyftpmQ
Because of this, Web Giants consider that the system must be able to
function continuously even when some components have failed. Once
again, the application layer is responsible for ensuring this tolerance for
failure (cf. “Design for Failure“, p. 221).
On what criteria are the machines chosen?
That being said, the machines chosen by the Giants do not always
resemble what we think of as PCs or even the x86 servers of majors such
as HP or IBM. Google is certainly the most striking example as it builds its
own machines. Other majors such as Amazon work with more specialized
suppliers such as SGI.[12]
The top priority in choosing their servers is, of course, the bottom line.
Whittling components down to meet their precise needs and the quantity
of servers purchased give Web Giants a strong negotiating position.
Although verified data is lacking, it is estimated that the cost of a server
for them can go as low as $500.
The second priority is electric power consumption. Given the sheer
magnitude of servers deployed, power consumption has become a major
expense item. Google recently stated that their average consumption was
about 260 million watts, amounting to a bill of approximately $30,000
per hour. The choice of components as well as a capacity to configure the
consumption of each component very precisely can also engender huge
savings.
In sum, even though they contain the same parts you would find in your
desktop, the server configurations are a far cry from it. With the exception
of a few initiatives such as OpenCompute from Facebook, the finer details
are a secret that the Giants keep fiercely. The most one can discover is
that Google replaced its centralized UPS units with 12V batteries directly
connected to the servers.[13]
Exception!
There are almost no examples of Web Giants communicating about any
technology other than x86. If we went back in time, we would probably
find a “Powered by Sun“ logo at Salesforce.[14]
How can I make it work for me?
Downsizing, i.e. replacing central servers by smaller machines peaked in
the 1990s. We are not giving a sales pitch for commodity hardware, even
if one does get the feeling that x86 has taken over the business. The
choice of commodity hardware at scale goes further, as it transfers the
responsibility for scalability and failure resistance to the applications.
For Warehouse Scale Computing, like for the Web Giants, when the costs
of electricity and investment become crucial, it is the only viable solution.
For existing software which can run on the resources of a single
multiprocessor server, the cost of (re-)developing it as a distributed system
and the cost of the hardware can balance each other out within the Information System.
The decision to use commodity hardware in your company must be made
in the framework of your global architecture: as much as possible, either grow
what you already have on better-quality machines, or adapt it to migrate
(completely) to commodity hardware. In practice, applications designed
for distribution such as front Web services will migrate easily. In contrast,
highly integrated applications such as software packages necessarily entail
specific infrastructure with disk redundancy, which is hardly compatible
with a commodity hardware datacenter such as used by Web Giants.
[14]  http://techcrunch.com/2008/07/14/salesforce-ditch-remainder-of-sun-hardware
Associated patterns
Distributed computing is essential to using commodity hardware. Patterns
such as sharding (cf. “Sharding“, p. 179) need to be implemented in the
code to be able to migrate to commodity hardware for data storage.
Using a large number of machines also complicates server administration,
and patterns such as DevOps need to be adopted (cf. “DevOps“, p. 71).
Lastly, the propensity shown by Web Giants to design computers, or rather
datacenters, adapted to their needs is obviously linked to their preference
for build vs. buy (cf. “Build vs. Buy“, p. 19).
Sharding
Description
For any information system, data are an important asset which must be
captured, stored and processed reliably and efficiently. While central
servers often play the role of data custodian, most Web Giants have
adopted a different strategy: sharding, or data distribution.[1]
Sharding describes a set of techniques for distributing data over
several machines to ensure architecture scalability.
Business needs
Before detailing implementation, let us say a few words about the needs
driving the process. Among Web Giants there are several shared concerns
which most are familiar with: storing and analyzing massive quantities
of data,[2]
strong performance stakes to keep latency minimal,
scalability,[3] and even elasticity needs linked to traffic peaks.[4]
Let us stress one specificity of the players facing the issues
mentioned above. For Web Giants, revenues are often independent of
the quantity of data processed and stem instead from advertising and
user subscriptions.[5]
They therefore need to keep unit costs per transaction
very low. In traditional IT departments, transactions can easily be linked
to physical flows (sales, inventory). Such flows make it easy to bill services
depending on the number of transactions (conceptually speaking through
a sort of tax). However with e-commerce sites for example, browsing the
catalog or adding items to a cart does not necessarily entail revenues
because the user can quit the site just before confirming payment.
[1] According to Wikipedia, a database shard is a horizontal partition of data in a database or
search server. (http://en.wikipedia.org/wiki/Shard_(database_architecture)
[2] Heightened by Information Systems being opened to the Internet (user behavior analysis,
links to social media...).
[3] Scalability is of course tied to a system’s capacity to absorb a bigger load, but more
important still is the cost. In other words, a system is scalable if it can handle the additional
query without taking more time and if the additional query costs the same amount as the
preceding ones (i.e. underlying infrastructure costs must not skyrocket).
[4] Beyond scalability, elasticity is linked to the capacity to have only variable costs unrelated to
the load. Which is to say that a system is elastic if, whatever the traffic (10 queries per second
or 1000 queries per second), the query price per unit remains the same.
[5] For example, no size limit to e-mail accounts.
In sum, the Information Systems of Web Giants must ensure scalability at
extremely low marginal costs to uphold their business model.
Sharding to cut costs
As yet, most databases are organized centrally: a single server, possibly
with redundancy in active/passive mode for availability. The usual solution
for increasing the transaction load is vertical scalability or scale-up, i.e.
buying a more powerful machine (more I/O, more CPUs, more RAM...).
There are limits however to this approach: a single machine, no matter
how powerful, cannot alone index the entire Web for example. Moreover
there is the all-important question of costs leading to the search for other
approaches.
Remember from the last chapter:
A study[6]
carried out by engineers at Google shows that as soon as the
load exceeds the capacities of a large system, the unit cost for large
systems is much higher than with mass-produced machines.[7]
Although calculating per transaction costs is no easy matter and is open
to controversy - architecture complexification, network load to be figured
into the costs - the majority of Web Giants have opted for commodity
hardware.
Sharding is one of the key elements in implementing horizontal scaling
(scale-out).
[6] The study http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905
CAC006 is also summarized in the OCTO blog article: http://blog.octo.com/ datacenter-as-a-
computer-une-plongee-dans-les-datacenters-des-acteurs-du-cloud.
[7] This is another way of saying “commodity hardware“: the machines are not necessarily low-
end, but the performance/cost ratio is the highest possible for a given system
[Figure 1 compares a centralized database with vertical partitioning (clients and contracts stored in separate databases) and horizontal partitioning (clients 1 and 2, with their contracts A and B, spread across several servers).]
How to shard
In fact there are two ways of partitioning - or sharding - data: vertically or
horizontally.
Vertical sharding is the most widely used and consists in separating the
data by concept, table by table. For example, deciding to store client lists
in one database and their contracts in another.
Horizontal sharding is where the rows of a table are divided and
distributed across multiple servers. For example, storing client lists from A
to M on one machine and from N to Z on another. Horizontal sharding is
based on a distribution key - the first letter of the name in the example above.[8]
Web Giants have mostly implemented horizontal sharding. It notably has the
advantage of not being limited by the number of concepts, as is
the case with vertical sharding.
[8] In fact, partitioning is a function of the probability of names to begin with a given letter.
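To make the idea concrete, here is a minimal sketch of a routing function for the horizontal example above; the shard names and hosts are invented for the illustration.

```python
# Horizontal sharding sketch: the first letter of the client's name is the
# distribution key. The shard names and hosts are purely illustrative.
SHARDS = {
    "shard-am": "db-host-1",   # clients A..M and their contracts
    "shard-nz": "db-host-2",   # clients N..Z and their contracts
}

def shard_for(client_name: str) -> str:
    """Return the shard that owns this client (and the data accessed with it)."""
    first_letter = client_name.strip().upper()[0]
    return "shard-am" if first_letter <= "M" else "shard-nz"

# All queries about a given client are routed to a single shard,
# which avoids costly cross-shard joins.
print(shard_for("Dupont"))   # -> shard-am
print(shard_for("Smith"))    # -> shard-nz
```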
Techniques linked to sharding
Based on their choice of horizontal scaling, Web Giants have developed
specific solutions (grouped under the acronym NoSQL -
Not Only SQL) to meet these challenges, with the following
characteristics:
		 implementation using mass-produced machines,
		 data sharding managed at the software level.
While sharding makes it possible to overcome the issues mentioned
above, it also entails implementing new techniques.
		 Managing availability is much more complex. In a centralized sys-
tem, or one used as such, the system is either available or not, and
only the rate of unavailability will be measured. In a sharded system,
some data servers can be available and others not. If the failure of a
single server makes the entire system unavailable, the overall availability
is equal to the product of the availability of each of the data
servers. Availability thus drops sharply: if 100 machines are each
down 1 day per year, the system would be unavailable for
nearly 3 months per year.[9]
Since a distributed system can remain available
despite the failure of one of the data servers, albeit in downgraded
mode, availability must be measured through two figures: yield, i.e.
the availability rate defined above; and harvest, i.e. the completeness
of the response, which measures, so to speak, partial
unavailability.[10]
		 Distribution of the load is usually tailored to data use. A product
reference (massively accessed in read mode) won’t raise the same
performance issues as a virtual shopping cart (massively accessed in
write mode). The replication rate, for example, will be different.
[9] (364/365)^100 ≈ 76% ≈ 277/365, i.e. about 88 days of unavailability per year.
[10] Thus if when a server fails, the others ignore the modifications made to that server and
then resolve the various modifications once the server reconnects to the cluster, the harvest
is smaller. The response is incomplete because it has not integrated the latest changes,
but maintains the yield. The NoSQL solutions developed by the Giants integrate various
mechanisms to manage this: data replication over several servers, vector clock algorithms to
resolve competing updates when the server reconnects to the cluster. Further details may be
found in the following article: http://radlab.cs.berkeley.edu/people/fox/ static/pubs/pdf/c18.
pdf
Lastly, managing the addition of new servers and the data partitioning
problems this poses (recalibrating the cluster) are novel issues specific
to sharding. FourSquare, for example, was down for 11 hours in October
2010[11] following the overload of one of its servers and then trouble when
the back-up server was connected, which in the end caused the entire
site to crash. To overcome these problems, data distribution algorithms such as
consistent hashing[12] limit the amount of data to be moved when servers
are removed or added.
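The sketch below illustrates the idea behind consistent hashing (a bare-bones ring without virtual nodes, for illustration only): when a fourth server is added, only a fraction of the keys change servers, instead of the vast majority with a naive hash-modulo scheme.

```python
# Bare-bones consistent-hashing ring (no virtual nodes), illustrative only.
import bisect
import hashlib

def _hash(value: str) -> int:
    # Position on the ring: an integer derived from an MD5 digest.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self._points = sorted((_hash(n), n) for n in nodes)
        self._hashes = [h for h, _ in self._points]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first node placed after the key's position.
        i = bisect.bisect(self._hashes, _hash(key)) % len(self._points)
        return self._points[i][1]

ring3 = Ring(["server-1", "server-2", "server-3"])
ring4 = Ring(["server-1", "server-2", "server-3", "server-4"])
keys = [f"user:{i}" for i in range(10_000)]
moved = sum(ring3.node_for(k) != ring4.node_for(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys change server")
# typically around a quarter of the keys, versus ~75% with hash(key) % n
```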
Sharding also means adapting your application architecture:
		 Queries have to be adapted to take distribution into account so as
to avoid inter-shard queries, because the cost of accessing several
remote servers is prohibitive. Thus the APIs of such systems limit
query possibilities to data in the same shard.
		 Whether one is using relational databases or NoSQL stores, data
models are upended: modeling in such systems is largely limited
to key/value, key/document or column-family structures,
in which the key or row index serves as the basis for partitioning.
		 Atomicity (the A in ACID) is often restricted so as to avoid atomic
updates affecting several shards, and therefore transactions distributed
over several machines at a high performance cost.
Who makes it work for them?
The implementation of these techniques varies across companies. Some
have simply adapted their databases to facilitate sharding. Others have
hand-written ad hoc NoSQL solutions. Following the path from SQL to
NoSQL, here are a few representative implementations:
[11] For more details on the FourSquare incident:
http://blog.foursquare.com/2010/10/05/so-that-was-a-bummer/ and the analysis of another
blog http://highscalability.com/blog/2010/10/15/troubles-with-sharding-what-can-we-learn-
from- the-foursquare.html
[12] Further details in the following article:
http://blog.octo.com/consistent-hashing-ou-l%E2%80%99art-de-distribuer-les-donnees/
Wikipedia
This famous collaborative encyclopedia rests on many instances of
distributed MySQL and a MemCached memory cache. It is thus an
example of sharding implementation with run-of-the-mill components.
Figure 2
The architecture uses master-slave replication to divide the load between
reads and writes, and partitions the data by wiki
and use case. The article text is also offloaded to dedicated instances.
They thus use MySQL instances holding between 200 and 300 GB of data.
[The diagram shows reads (consultation) served from MemCached and the read slaves, writes (edition) going to the write masters, data partitioned per wiki (Wiki A, Wiki B), article text held on dedicated storage instances, and everything kept in sync through MySQL replication.]
Flickr
The architecture of this photo sharing site is also based on several master
and slave MySQL instances (the shards), here organized in a replication
ring which makes it easier to add data servers.
Figure 3
An identifier serves as the partitioning key (usually the photo owner’s ID)
which distributes the data over the various servers. When a server fails,
entries are redirected to the next server in the loop. Each instance on the
loop is also replicated on two slave servers to function in read-only mode
if their master server is down.
[The diagram shows the master shards forming a ring (the first holding ids 1 to N/4), each master replicating to its slaves and pointing to the next master; reads and writes are routed by ID, with MemCached holding the metadata.]
Facebook
The Facebook architecture is interesting in that it shows the transition
from a relational database to an entirely distributed model.
Facebook started out using MySQL, a highly efficient open source solution.
They then implemented a number of extensions to partition the data.
Figure 4
Today, the Facebook architecture has banished all central data storage.
Centralized access is managed by the cache (MemCached) or a dedicated
service. In their architecture, MySQL serves to feed data to MemCached
in the form of key-value pairs and is no longer queried in SQL. The MySQL
replication system, suitably extended, is also used to replicate the shards
across several datacenters. That being said, its use has very little to do
with relational databases. Data are accessed only through their key.
At this level there are no joins. Lastly, the structure of the data is taken into
account to co-locate data used simultaneously.
[The diagram shows each datacenter with MemCached answering key-value (clé, valeur) lookups fed by the MySQL shards, the shards being replicated asynchronously between datacenter #1 and datacenter #2 via MySQL replication.]
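This reading path can be summed up by the classic cache-aside pattern, sketched below in a deliberately simplified form; the cache and the shard lookup are stand-ins, not the actual Facebook code.

```python
# Simplified cache-aside sketch: the application asks the cache first and
# only falls back to the shard owning the row, always addressing the data
# as key-value. The "cache" dict and fetch function are stand-ins, not a
# real MemCached or MySQL driver API.
cache: dict[str, dict] = {}          # stand-in for MemCached

def fetch_from_shard(key: str) -> dict:
    """Stand-in for a primary-key lookup on the MySQL shard owning this key."""
    return {"id": key, "name": "…"}

def get(key: str) -> dict:
    value = cache.get(key)
    if value is None:                # cache miss: read the shard, then populate the cache
        value = fetch_from_shard(key)
        cache[key] = value
    return value
```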
Amazon
The Amazon architecture stands out for Dynamo’s more advanced handling of
the loss of one or more datacenters.
Amazon started out in the 1990s with a single Web server and an Oracle
database. They then set up a set of business services in 2001 with dedicated
storage. Alongside databases, two systems use sharding: S3 and Dynamo.
S3 is an online storage service for blobs identified by URLs. Dynamo (first used
in-house, but recently made available to the public through Amazon Web
Services) is a distributed key-value storage system designed to ensure
high availability and very fast responses.
In order to enhance availability on Dynamo, several versions of the same
piece of data can coexist, using the principle of eventual consistency.[13]
Figure 5
[13] There are quorum mechanisms (http://en.wikipedia.org/wiki/Quorum_(distributed_
computing) to arbitrate between availability and consistency.
[The diagram shows that after a write (edition), two versions of the same item, e.g. Foo(Bar=«1», Version=1) and Foo(Bar=«2», Version=2), may coexist; they are propagated asynchronously and reconciled at read time (consultation).]
In read mode, an algorithm such as the vector clock[14]
or, as a last resort, the
client application, will have to resolve any conflicts. There is thus a balance
to be found in how much is replicated to choose the best compromise
between resistance to data center failure on the one hand and system
performance on the other.
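As an illustration of how such conflicts are detected, here is a minimal vector-clock comparison; the server names and counters are invented, and real systems add pruning and reconciliation logic on top of this.

```python
# Minimal vector-clock comparison used to order concurrent versions of the
# same piece of data (illustrative only).
def descends(a: dict, b: dict) -> bool:
    """True if version `a` has seen every update recorded in version `b`."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())

v1 = {"server-A": 2, "server-B": 1}
v2 = {"server-A": 1, "server-B": 1}
v3 = {"server-A": 1, "server-B": 2}

print(descends(v1, v2))                      # True: v1 supersedes v2, keep v1
print(descends(v1, v3) or descends(v3, v1))  # False: concurrent versions, the
                                             # conflict goes back to the application
```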
LinkedIn
LinkedIn’s background is similar to Amazon’s: they started in 2003 with a
single database approach, then partitioned the data by business domain and
implemented a distributed system similar to Dynamo: Voldemort.
Unlike Dynamo, however, it is open source. One should also note that
indexes and social graphs have always been stored separately by LinkedIn.
Google
Google was the first to publish information on their distributed storage
system. Rather than having its roots in databases, it takes its inspiration from file systems.
In the paper[15]
on the Google File System (GFS), the authors mention
that their choice of commodity hardware was instrumental, given the
weaknesses noted in a previous chapter (cf. “Commodity Hardware“,
p. 167). This distributed file system is used, directly and indirectly, to store
Google’s data (search index, emails).
Figure 6
Its architecture is based on a centralized metadata server (to guide client
applications) and a very large number of data storage systems. The
degree of data consistency is lower than that guaranteed by a traditional
[14] The Vector Clock algorithm provides the order in which a given distributed dataset was
modified.
[15] http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google. com/
fr//papers/gfs-sosp2003.pdf
[The diagram shows a GFS client asking the master for metadata, then reading and writing the numbered chunks (1 to 6) directly on the chunk servers, where each chunk is replicated.]
file system, but this topic alone deserves an entire article. In production,
Google uses clusters of several hundred machines, enabling them to store
petabytes of data to index.
Exception!
It is however undeniable that a great many sites are grounded in relational
database technologies without sharding (or without mentioning sharding):
StackOverflow, SalesForce, Voyages-SNCF, vente-privee.com… It is
difficult to draw up an exhaustive list, one way or another.
We nonetheless believe that sharding has become the standard
strategy on data-intensive web sites. Indeed, the architecture of
Salesforce is based on an Oracle database, but it uses that database
very differently from usual practice in our IT departments: tables with multiple
untyped columns with generic names (col1, col2), a query engine upstream
from Oracle to handle these specificities, etc. Such optimizations
show the limits of a purely relational architecture.
In our view, the most striking exception is StackOverflow, where the
architecture is based on a single relational SQL Server instance. This site chose
to implement an architecture based purely on vertical scalability, with its
initial architecture, inspired by Wikipedia, then evolving to conform to
this strategy. Moreover, one must also note that the scalability needs of
StackOverflow are not necessarily comparable to those of other sites,
because its target community (IT engineers) is narrow and the
model favors the quality of contributions over their quantity. Furthermore,
choosing a platform under Microsoft license gives them an efficient tool,
but one whose costs would certainly become prohibitive in the case of
horizontal scaling.
How can I make it work for me?
Data distribution is one of the keys that enabled Web Giants to reach their
current size and to provide services that no other architecture is capable of
supporting. But make no mistake, it is no easy task: issues which are easy
to resolve in a relational world (joins, data integrity) demand mastering
new tools and methods. Areas which are data-intensive but with
limited consistency stakes - as is e.g. the case with data which can be
partitioned - are those where distributed data will be most beneficial.
Offers compatible with Hadoop use these principles and are relevant
to BI, more particularly in analyzing non-structured data. Concerning
transactions, consistency issues are more important. Constraints around
access APIs are also a limiting factor, but new offers such as SQLFire by
VMWare or NuoDB attempt to combine sharding and an SQL interface.
They are thus something to keep an eye on.
In short, you need to ask yourself which data belong to the same use
case (what partitions are possible?) and, for each, what the consequences
of loss of data integrity would be. Depending on the answers, you can
identify the main architecture features that would enable you, above and
beyond sharding, to choose the tool to best meet your needs. More than
a magic fix, data partitioning must be considered as a strategy to reach
scale-up levels which would be impossible without it.
Associated patterns
Whether you use open source or in-house products depends on your
use of data partitioning as it entails a great deal of fine tuning. The
ACID transactional model is also affected by data sharding. The pattern
Eventually Consistent offers another vision and solution to meet user needs
despite the impacts due to sharding. Again, mastering this pattern is very
useful for implementing distributed data. Lastly, and more importantly,
sharding cannot be dissociated from the commodity hardware choice
implemented by Web Giants.
Sources
• Olivier Mallassi, Datacenter as a Computer : une plongée dans les
datacenters des acteurs du cloud, 6 June, 2011 (French only) :
 http://blog.octo.com/datacenter-as-a-computer-une-plongee-dans-les-
datacenters-des-acteurs-du-cloud/
• The size of the World Wide Web (The Internet), Daily estimated size
of the World Wide Web:
 http://www.worldwidewebsize.com/
• Wikipedia:
 http://en.wikipedia.org/wiki/Shard_(database_architecture)
 http://en.wikipedia.org/wiki/Partition_%28database%29
 http://www.codefutures.com/weblog/database-sharding/2008/06/
wikipedias-scalability-architecture.html
• eBay:
 http://www.codefutures.com/weblog/database-sharding/2008/05/
database-sharding-at-ebay.html
• Friendster and Flickr:
 http://www.codefutures.com/weblog/database-sharding/2007/09/
database-sharding-at-friendster-and.html
• HighScalability:
 http://highscalability.com/
• Amazon:
 http://www.allthingsdistributed.com/
TP vs. BI: the new NoSQL approach
Description
In traditional ISs, structured data processing architectures are generally
split across two domains. Both of course are grounded in relational
databases, but each with their own models and constraints.
On the one hand, Transactional Processing (TP), based on ACID
transactions,
and on the other Business Intelligence (BI), grounded in fact tables
and dimensions.
Web Giants have both developed new tools and come up with new
ways of organizing processing to meet these two needs. Distributed
storage and processing is widely used in both cases.
Business needs
One recurrent specificity of Web Giants is their need to process data which
are only partially structured, or not structured at all, unlike the usual data
tables used in management information systems: Web pages for Google,
social graphs for Facebook and LinkedIn. A relational model based on
two-dimensional tables where one of the dimensions is stable (the number
and type of columns) is ill-adapted to this type of need.
Moreover, as we saw in the chapter on sharding (cf. “Sharding“, p. 179),
constraints on data volumes and transaction amounts often push Web
Giants to partition their data. This overturns the traditional vision of TP
where the data are always consistent.
BI solutions, lastly, are usually driven by internal IT decisions. For Web
Giants, BI is often the foundation for new services which can be used
directly by clients: LinkedIn’s People You May Know, new music releases
suggested by sites such as Last.fm,[1]
 Amazon recommendations, are all
services which entail manipulating vast quantities of data to provide
recommendations to users as quickly as possible.
[1] Hadoop, The Definitive Guide, O’Reilly, June 2009.
Who makes it work for them?
The new approach of Web Giants on the level of TP (Transaction
Processing) and BI (Business Intelligence) lies in generic storage
and deferred processing whenever possible. The main goal in the
underlying storage is only to absorb huge volumes of queries both
redundantly and reliably. We call it ‘generic’ because it is poorer in
terms of indexing, data organization and consistency than traditional
databases. Processing and analyzing data for queries and consistency
management are offloaded to the software level. The following
strategies are implemented.
TP: the ACID constraints limited to what is strictly
necessary
The sharding pattern highly complicates the traditional vision of a single
consistent database used for TP. Major players such as Facebook and
Amazon have thus adapted their view of transactional data. As specified
by the CAP theorem,[2]
within a given system one cannot at the same time
achieve consistency, availability and partition tolerance. First of all, data
consistency is no longer permanent but only provided when the user
reads the data.
This is known as eventual consistency: it is when the information is read
that its integrity is checked, and any differing versions in the data servers
are resolved.
Amazon fostered this approach when they designed their distributed
storage system Dynamo.[3]
On a set of N machines, the data are replicated
on W of them, in addition to version stamping. For queries, N-W+1
machines are searched, thereby ensurin that the user has the latest
version.[4]
The e-commerce giant chose to reduce data consistency in favor
of gains in the availability of its distributed system.
[2] http://en.wikipedia.org/wiki/CAP_theorem
[3] http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
[4] In this way one is always certain of reading the data on at least one of the W machines
where the freshest data have been written. For further information, see http://www.
allthingsdistributed.com/2007/10/amazons_dynamo.html
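A back-of-the-envelope check of that quorum rule, with purely illustrative figures:

```python
# If a write is acknowledged by W of the N replicas and a read queries
# R = N - W + 1 of them, the two sets necessarily overlap, so the read
# always sees at least one copy of the latest (versioned) value.
N, W = 3, 2
R = N - W + 1                       # here R = 2

# Worst case: the read picks the R replicas overlapping least with the write set.
overlap_in_worst_case = W + R - N
assert overlap_in_worst_case >= 1   # at least one replica holds the freshest value
print(f"N={N}, W={W}, R={R}: any read intersects the write set "
      f"on at least {overlap_in_worst_case} replica(s)")
```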
Furthermore, to meet their performance goals, data freshness criteria
are no longer global, but set per category. Facebook and LinkedIn
guarantee real-time freshness for users’ own updates: those modifications
must be immediately visible to the user who made them, to ensure trust in the system. In contrast,
global consistency is reduced: when users sign up for a Facebook group
for example, they immediately see the information appear but other
group members may experience some delay in being notified.[5]
At LinkedIn, services are also categorized. For non-critical services such as
retweets, the information is propagated asynchronously,[6]
whereas users’ modifications to their own data are immediately
propagated so as to be instantly visible to them.
Asynchronous processing is what makes it possible for Web Giants
to best manage the heavy traffic loads they face. In sum, to guarantee
performance and availability, Web Giants tailor their storage systems so
that data consistency depends on usage. The goal is not to be consistent
at all times, but rather to provide eventual consistency.
BI: the indexation mechanism behind all searches
To provide information on vast quantities of data, Web Giants also tend to
pre-calculate indexes, which is to say data structures specifically designed
to answer user questions. To better understand this point, let us look at
the indexes that Google has designed for its search engine. Google is
foremost in the arena due to the volume of its indexing: the entire Web.
[5] http://www.infoq.com/presentations/Facebook-Software-Stack
[6] Interview with Yassine Hinnach, Architect at LinkedIn.
At the implementation level, Google uses sharding to store raw data
(the BigTable column-oriented database, built on the distributed Google File
System).[7]
Indexes based on keywords are then produced asynchronously, and are
used to answer user queries. The raw data are analyzed with a distributed
algorithm, based on the programming model MapReduce.
The process can be divided into two main phases: map, which, in parallel,
identically processes each piece of data; and reduce, which aggregates the
various results into a single final result. The map phase is easily distributed
by processing each piece of data on a different machine,
as can be seen in Figure 1.
Figure 1
[7] cf. “Sharding“, p. 179.
[The diagram shows the map phase processing inputs 0 to 7 in parallel on several machines, then the reduce phase aggregating the intermediate results into a single final result.]
This technique is highly scalable[8]
and makes it possible for example for
a web crawler to consume all web pages visited, to establish for each the
list of outgoing links, then to aggregate them during the reduce phase
to obtain a list of the most referenced pages. Google has implemented
a sequence of MapReduce tasks to generate the indexes for its search
engine.[9]
This allows them to process huge quantities of data in batch mode.
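To give a feel for the programming model, here is a toy, single-machine version of the link-counting example above; a framework such as Hadoop distributes exactly these two phases, with the map calls running close to the data.

```python
# Toy MapReduce: map emits one record per outgoing link, reduce aggregates
# the counts to find the most referenced pages. The sample pages are invented.
from collections import defaultdict
from itertools import chain

pages = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "d.html": ["c.html", "b.html"],
}

def map_phase(page: str, outgoing_links: list[str]):
    # Each (page, links) pair can be processed independently, on any machine.
    for target in outgoing_links:
        yield (target, 1)

def reduce_phase(pairs):
    # Aggregate all intermediate (target, 1) records into a single result.
    counts: dict[str, int] = defaultdict(int)
    for target, one in pairs:
        counts[target] += one
    return dict(counts)

intermediate = chain.from_iterable(map_phase(p, links) for p, links in pages.items())
print(reduce_phase(intermediate))   # {'b.html': 2, 'c.html': 3}
```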
The technique has been widely copied, notably through the Apache
Foundation open source project Hadoop.[10]
Hadoop provides both a distributed file system and a framework
implementing the MapReduce programming model, directly inspired by
Google’s research paper. It was then adopted by Yahoo! for indexing, by
LinkedIn to prepare its email campaigns, and by Facebook to analyze the
various logs generated by their servers... Many firms, including several
other Web Giants (eBay, Twitter) use it.[11]
In 2010, Google set up a new indexation process based on event
mechanisms.[12]
Updates do not happen in real time, contrary to database
triggers, but latency (the time between page publication and the possibility
to search it) is greatly reduced as compared to a batch system based on
the MapReduce programming model.
Exception!
All of these examples have one thing in common: they target a fairly specific
set of needs. Many key Web players also use relational databases for
other applications.
The “one size fits all“ approach of these databases means they are easier
to use but also more limited, notably in terms of scalability. The processes
and distributed storage systems described above are only implemented
for the services most frequently used by these key players.
[8] Or scalable, i.e. capable of processing more data if the system is enlarged.
[9] http://research.google.com/archive/mapreduce.html
[10] http://hadoop.apache.org
[11] http://wiki.apache.org/hadoop/PoweredBy
[12] Google Percolator: http://research.google.com/pubs/pub36726.html
How can I make it work for me?
It is certainly in indexation solutions and BI on Big Data that the market
is most mature. With Hadoop, a reliable open source implementation, a
large number of support solutions, related tools, re-implementations and
commercial repackaging have been developed, based on the same APIs.
Projects based on the indexation of large quantities of data, whether
semi-structured or unstructured, are the primary candidates for adoption of this
type of method. The main advantage is that data can be preserved thanks
to much lower storage costs. Information is no longer lost through over-
hasty aggregations.
In this way the data analysis algorithms producing indexes or reports can
also be more easily adjusted over time since they are constantly processing
all available data rather than pre-filtered subsets. A switch from relational
databases in TP will probably take more time. Various distributed solutions
inspired by Web Giants’ technologies have come out under the label
NoSQL (Cassandra, Redis).
Other distributed solutions, closer to the crossroads between relational databases and data grids in terms of consistency and APIs, have come out under the name NewSQL (SQLFire, VoltDB). Architectural patterns such as Event Sourcing and CQRS[13] can also help bridge the gap between the two worlds: they make it possible to model transactional data as a flow of events that are both uncorrelated and semi-structured, while a comprehensive and consistent view of the data is rebuilt afterwards, when the data are disseminated. The Web Giants’ models cannot be directly transposed to meet the general TP needs of businesses, and there are many other approaches on the market to overcome traditional database limits.
Associated patterns
This pattern is mainly linked to the sharding pattern (cf. “Sharding“,
p. 179), because, through distributed algorithms, it makes it possible to
work on this new type of storage.
One should also note here the influence of the pattern Build vs. Buy (cf.
“Build vs. Buy“, p. 19) which has led Web Giants to adopt highly specialized
tools to meet their needs.
[13] Command Query Responsibility Segregation.
Big Data Architecture
To better meet their users' needs, the Web Giants do everything they can
to reduce their Time to Market. Data in all forms are key to this strategy.
They not only serve for technical analyses, but are also business drivers.
They are what make it possible to personalise the user experience, more
and more often in real time, and above all inform decision making. The
Web giants have long understood the importance of data and use them
unabashedly. At Google for example, all ideas must come with metrics,
all arguments must be based on data, or you will not be heard in the
meeting.[1]
Everyone speaks of Big Data, but the Web Giants were the first players involved, or at the very least closely associated with it. Behind the buzzword are new challenges, including an especially complicated one: how do you store and process the exponential volume of data being generated? There are already more connected objects than humans on the planet, and Cisco forecasts over 50 billion sensors by 2020.[2] How do you use all that information?
Time to Action
As shown in the preceding chapter, NoSQL architecture can process and
query ever larger amounts of data.
Big Data is usually described by 3 main characteristics, often called the
3Vs:[3]
	 Volume, the capacity to process terabytes, petabytes, and even
exabytes of extracted data
	Variety, the capacity to process all data formats, whether structured
or not
	Velocity, the capacity to process events in real time, or at least as
quickly as possible
With architectures of the NoSQL/NewSQL type, as described previously,
only the components Variety and Volume were highlighted. Let us now
look at how the Web Giants also embrace the third component: Velocity.
[1] http://googlesystem.blogspot.com.au/2005/12/google-ten-golden-rules.html
[2] https://www.cisco.com/web/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf
[3] https://en.wikipedia.org/wiki/Big_data
Making data available
We will talk here about double-headed architectures capable of storing and
querying data in all forms, processed in batches or in real time. But before
broaching this complex subject, let us first take a look at the characteristics
and Big Data architecture patterns the Web Giants implement.
A data lake for data
In an information system, the data are distributed over dozens, or even
hundreds, of components. The data are spread out across various sources, some on site, others held by third-party vendors or locked away in proprietary software. Having the data on hand is not enough: they must also be
instantly accessible. If you don't have the data, it is unlikely you will think
of playing around with them. Isolated data is underexploited data: the
Allen Curve[5]
also applies to data!
That is why the Web giants centralise their data in a scalable system where
they can be easily queried without any presumptions about how they will be
used. Perhaps most of them will not even be used, but that does not matter:
the important thing is to have them nearby just in case a new idea emerges.
This type of system, usually based on the Hadoop framework, is commonly
called a “data lake“.[6A]
It is a storage and distributed processing platform
capable of handling ever increasing amounts of data, whatever their
nature. On paper, it can be scaled to infinity,[7]
both in terms of storage
and processing, and can manage numerous competing jobs and tasks
linearly thanks to the size of the infrastructure.
An aside
Some also speak of 4Vs or even 5Vs,[4]
adding components to the
3Vs mentioned above such as:
	 Veracity, the capacity to manage inconsistencies and ambiguities
	 Value, the capacity to apply differential processing to data
depending on the value attributed to them
The latter is without doubt the most debatable, since the main benefit
of this type of architecture is that there are no presuppositions as to
how the data will be analysed, and therefore no pre-established values.
[4] https://www.linkedin.com/pulse/20140306073407-64875646-big-data-the-5-vs-everyone-must-know
[5] https://en.wikipedia.org/wiki/Allen_curve
[6A] https://en.wikipedia.org/wiki/Data_lake
[7] Even if nothing is infinitely scalable: https://www.youtube.com/watch?v=modXC5IWTJI
Immutable data
A data lake can store all types of data; it is up to the user to decide what to use them for. Of all the data it can hold, raw data are particularly interesting.
Available without changes or alterations, they can be modelled depending
on user needs.
Immutability drastically reduces manipulation errors:
	 the data are entered without any transformation, limiting the risk of
losing the context or errors in interpretation
	 the data are stored only once and are never updated, thus limiting
manipulation errors and keeping a full record.
Immutable, they can also theoretically[6b]
be reused an infinite number of
times. The data are not “consumed“ but “used“. In case of errors, bugs
or code updates, the processing simply needs to be relaunched to obtain
the latest results.
When they are timestamped and sufficiently individualised, such immutable
data are also known as “events“.
Schema on read
Another highly interesting characteristic is in interpreting the data. For
a “traditional“ BI ingestion, the data are cleaned up, formatted, and
normalised before being ingested. The Web Giants consider that each
time data is transformed, part of the context is altered. By storing raw
data, it is up to users to decide how to transform them.
Let us take the example of Twitter. Each tweet contains a multitude of
information: text, images, videos, links, hashtags. They are timestamped,
geographically located, shared, liked... Each system using the data must be able to transform them, focusing on the aspect most relevant to it. An application to map the most recent tweets
will probably not have the same angle of approach as one looking for the
most shared content.
[6B] In practice, Google uses its data over a period of 30 days, for both volumetric and legal
reasons.
This pattern, Schema on read, has several advantages:
	 It maximally simplifies ingestion, avoiding all data loss and making
it much less expensive to add data to a data lake.
	 It gives clients flexibility by allowing personalised extraction and
transformation depending on needs.
This pattern, combined with the preceding ones, becomes a driver of
innovation. It does away with technical barriers to data processing, making
it possible to develop new prototypes more and more quickly. The best
way to find value in your data is to play around with them!
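As a purely illustrative sketch of schema on read (the tweet fields below are simplified and hypothetical), the raw records are stored untouched and each consumer applies its own transformation at read time:

```python
import json

# Raw events are appended to the lake untouched (immutable, schema-free).
raw_tweets = [
    json.dumps({"text": "hello", "geo": [48.85, 2.35], "shares": 12, "ts": 1}),
    json.dumps({"text": "web giants", "geo": None, "shares": 250, "ts": 2}),
]

# Consumer A: a mapping application only cares about recent geolocated tweets.
def recent_geolocated(raw_records, since_ts):
    for record in raw_records:
        tweet = json.loads(record)          # schema applied at read time
        if tweet["geo"] and tweet["ts"] >= since_ts:
            yield tweet["geo"]

# Consumer B: a trending application only cares about the most shared content.
def most_shared(raw_records, top_n=1):
    tweets = [json.loads(r) for r in raw_records]
    return sorted(tweets, key=lambda t: t["shares"], reverse=True)[:top_n]

print(list(recent_geolocated(raw_tweets, since_ts=1)))
print(most_shared(raw_tweets))
```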
From Big Data to Fast Data
The Web Giants strive to give value to their clients as quickly as possible.
Sometimes, and more and more often, offline processing is no longer
sufficient for user needs.
In that case, the best way to get value from your data is to interact with them as soon as they are ingested. The data lake as described above, however, only lets you process data in batch mode: between two batch runs, freshly gathered data sit unused. Not only do you fail to get the full benefit from them, but worse, some data may already be outdated before they are even used. The fresher the data, the greater their potential value.
[Figure: a data lake architecture - raw files, application logs, messages & events and external data (Open APIs) are ingested into non-structured, semi-structured (NoSQL) and structured storage; analytical batches, machine learning and flow management run on the lake; results are published to the enterprise DWH, databases, transactional systems, reporting and interactive requests.]
To process millions or even billions of events per second, two types of
technology are used:
	 Event distributors and collectors such as Flume and Kafka
	 Tools to process the events in near real time, such as Spark and Storm
More than just consumers, the Web Giants take part in creating and sharing these building blocks:
	 Kafka is a high-speed distributed message queue developed by LinkedIn[8]
	 Storm, originally developed by Twitter, makes it possible to process millions of messages per second[9]
The goal is not to replace the batch processing building block already included
in the data lake, but instead to add real time features. This layer is often
referred to as the Fast Layer, and the capacity to leverage Big Data for real
time processing is known as Fast Data.[10]
Real time reduces the Time to
Action, so prized by the Web Giants.[11]
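As a minimal sketch of such a fast layer (assuming a local Kafka broker and a hypothetical "events" topic, and using the kafka-python client, one option among others), a consumer can keep a real-time view continuously up to date as events arrive:

```python
import json
from collections import Counter

from kafka import KafkaConsumer  # kafka-python client (assumed dependency)

# Real-time view, continuously updated as events arrive.
view = Counter()

consumer = KafkaConsumer(
    "events",                               # hypothetical topic name
    bootstrap_servers="localhost:9092",     # assumed local broker
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:                    # blocks, yielding events as they arrive
    event = message.value                   # e.g. {"page": "/home", "user": "42"}
    view[event["page"]] += 1                # incremental update of the real-time view
    # The view can be queried at any time with the freshest data, without
    # waiting for the next batch run over the data lake.
```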
[Figure: the data lake augmented with a real time (fast) layer - high volume data, log files and applications feed batch ingestion into distributed file storage, while high velocity data feed real time ingestion into resilient storage; stateless and stateful processing power real time publication, alongside interactive and batch processing, a sandbox, APIs and the enterprise DWH.]
[8] http://kafka.apache.org/
[9] http://storm.apache.org/
[10] http://www.infoworld.com/article/2608040/big-data/fast-data--the-next-step-after-big-data.html
[11] http://www.datasciencecentral.com/profiles/blogs/time-to-insight-versus-time-to-action
Should the two channels, batch and real time, be treated as distinct or, on
the contrary, be unified?
In theory, the ideal is to be able to process the entire dataset, but that is
not so simple. There are numerous initiatives, but you are unlikely to need them in your own ecosystem: most use cases can do without. The Web
giants advise batch oriented architecture if you have no strong latency
constraints, or instead fully real time architecture, but rarely both at once.
Lambda architecture
Lambda architecture is undoubtedly the most widespread response to the
need to unify the two approaches. The principle is to process the data in
two layers, batch and real time, carrying out the same processes in both
channels, then consolidating the results in a third, dedicated layer:
	The batch layer precalculates the results based on the complete
dataset. It processes raw data and can be regenerated on demand.
	 The speed layer serves to overcome batch latency by generating
real time views which undergo the same processing as in the batch
layer. These real time views are continuously updated and the underlying events are overwritten in the process, so the views can only be rebuilt by replaying the batch layer.
	The serving layer then indexes both views, batch and real time, and
displays them in the form of consolidated output.
Since the raw data are always available in the batch layer, if there are any
errors, the output can be regenerated.
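The three layers can be summed up in a deliberately simplified, in-memory sketch (the page-view events are invented; real implementations rely on a distributed batch engine and a stream processor):

```python
from collections import Counter

# Immutable master dataset (batch layer input): all raw events ever ingested.
master_dataset = [{"page": "/home"}, {"page": "/cart"}, {"page": "/home"}]

# Events that arrived after the last batch run (speed layer input).
recent_events = [{"page": "/home"}, {"page": "/checkout"}]

def batch_view(events):
    """Batch layer: precompute the view from the complete dataset (replayable)."""
    return Counter(e["page"] for e in events)

def speed_view(events):
    """Speed layer: incrementally maintain a view of the most recent events."""
    return Counter(e["page"] for e in events)

def serving_layer(batch, speed):
    """Serving layer: merge both views into one consolidated result."""
    return batch + speed

consolidated = serving_layer(batch_view(master_dataset), speed_view(recent_events))
print(consolidated)   # page views computed over both old and fresh data
```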
[Figure: Lambda architecture - incoming data (IoT, mobile, social) pass through a distribution layer; the batch layer recomputes batch views over all data while the speed layer incrementally maintains real time views; the serving layer merges both for visualization. Adapted from: Marz, N. & Warren, J. (2013) Big Data. Manning.]
However, few use cases are truly adapted to this type of architecture. It
has not yet reached maturity, even among the Web Giants, and is highly
complex to implement. More specifically, it entails developing the same
processing twice on two types of very different technologies. Doing it once
is already difficult enough without having to double the task, especially
given that it must all be synchronised.
As an alternative to Lambda architecture, Twitter offers, through
Summingbird,[12]
an abstraction layer where you can integrate computation
in both layers within a single framework. What you gain in simplicity you
lose in flexibility however: the number of usable features is reduced at the
intersection of both modes.
Kappa Architecture
LinkedIn has released another variant of this model: Kappa Architecture.[13]
Their approach is based on processing all data, old and new, in a single layer, the fast layer, thereby simplifying the equation.
It is a way of better dividing the streams into small independent steps,
easier to debug, with each step serving as a checkpoint to replay
unitary processing in case of error. Reprocessing data is one of the
most complicated challenges with this type of architecture and must be
thoroughly thought through from the outset. Because code, formats and
data constantly change, processing must be able to integrate the changes
continuously, and that is no small matter.
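As a toy sketch of the Kappa idea (a hypothetical in-memory log standing in for a distributed log such as Kafka), the same processing step both updates the view incrementally and rebuilds it by replaying the whole log after a code change:

```python
from collections import Counter

# The log is the single source of truth: an ordered, append-only list of events.
event_log = [{"page": "/home"}, {"page": "/cart"}, {"page": "/home"}]

def process(view, event):
    """One small, independent processing step; its output acts as a checkpoint."""
    view[event["page"]] += 1
    return view

# Normal operation: the view is updated as each new event arrives.
view = Counter()
for event in event_log:
    view = process(view, event)

# After a bug fix or a change of format, simply replay the whole log
# through the (corrected) processing code to rebuild the view from scratch.
def replay(log):
    rebuilt = Counter()
    for event in log:
        rebuilt = process(rebuilt, event)
    return rebuilt

assert replay(event_log) == view
```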
[Figure: Kappa architecture - incoming data (IoT, mobile, social) are processed as a stream in the speed layer; real time views feed the data service and visualization, and all data can be replayed through the same layer. Adapted from: Marz, N. & Warren, J. (2013) Big Data. Manning.]
[12] https://github.com/twitter/summingbird
[13] http://radar.oreilly.com/2014/07/questioning-the-lambda-architecture.html
How can I make it work for me?
Whether you have already invested in Business Intelligence or not,
leveraging your data is no longer an option. A data lake type solution
has become almost inevitable. More flexible than a data warehouse, a data lake makes it possible to process unstructured data and create models on demand. It does not (yet) replace traditional BI but opens up new vistas
and possibilities.
Based on open source solutions, mostly around Hadoop and its ecosystem,
this central business reference is a staunch ally to make data accessible,
whatever their type: managing unstructured data, storing and processing
large volumes, all with commodity hardware, which is to say low outlay.
Whatever your business line, the use cases are numerous and varied:
from log analysis and safety audits to optimising the buying journey, not
forgetting data science of course, data lakes are a key component to
intelligent user experience design.
To go beyond the offline processing of your data, add online features to
your data lake. Although we do not necessarily recommend implementing
e.g. Lambda or Kappa architectures, which are too complex for most use
cases and not always mature, this does not take away from the advantages
to be reaped from real time schemas which truly open new perspectives.
Stay simple!
Data Science
Data science now provides technology which is both low cost
and methodologically reliable to better use data in information
systems. Data science drives business intelligence even deeper by
automating data analysis and processing in order to e.g. predict
events, behavior patterns, trends or to generate new insights.
In what follows we provide an overview of data science, with
illustrations taken from some of its most groundbreaking and surprising
applications.
Data science is used to extract information from more or
less structured data, based on methodologies and expertise
developed at the crossroads of IT, statistics, and all
business lines involving data.[1] [2]
Practically speaking, solving a data science problem translates as
projecting into the future patterns grounded in data from the past.
One speaks of supervised learning when the main issue is forecasting
for a specific target. When the target has not been specified or data are
lacking, detecting patterns is said to be unsupervised.
One should note that data science also includes building atemporal
patterns and then visualizing their various facets.
Taking the classic example of purchasing histories and pricing in online retail, data science can determine whether a client will buy a new product, or what price they would be willing to pay for it; these are two examples of supervised learning, in the respective areas of classification and regression. Carving out marketing segments based on behavior variables, in contrast, is an example of unsupervised learning.
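As an illustration (the behaviour features and targets below are entirely made up), scikit-learn makes the distinction concrete: classification and regression are supervised, clustering is unsupervised:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.cluster import KMeans

# Hypothetical behaviour features per client: [visits, past purchases, basket value]
X = np.array([[12, 3, 80.0], [2, 0, 15.0], [30, 9, 200.0], [5, 1, 40.0]])

# Supervised learning: a labelled target is available.
bought = np.array([1, 0, 1, 0])                   # classification target
paid_price = np.array([79.0, 10.0, 190.0, 35.0])  # regression target

clf = LogisticRegression().fit(X, bought)
reg = LinearRegression().fit(X, paid_price)
print(clf.predict([[10, 2, 60.0]]), reg.predict([[10, 2, 60.0]]))

# Unsupervised learning: no target, the algorithm carves out segments on its own.
segments = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(segments)
```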
More broadly, data science covers all technology and algorithms used to
model, implement and visualize an issue using available data, but also to
better understand problems by examining them from several viewpoints
to potentially solve them in the future. Machine learning is defined as the
algorithmic aspect of data science.
[1] Dhar V. 2013. “Data science and prediction“. Communications of the ACM
[2] Cleveland WS. 2001. “Data science: an action plan for expanding the technical area of the
field of statistics“. Bell Labs Statistics Research Report
Enthusiasm for the discipline is such that today's data scientists must
constantly monitor the field to remain on top. Let us seize the occasion
to note that in the second half of 2015, OCTO published a Hadoop white paper and a book on data science (in French, English translation
forthcoming).[3] [4]
Web Giants
Among the Web Giants, there is strong movement towards unstructured
data (e.g. video and sound). These have traditionally been ignored by
analytics due to volume constraints and technical barriers to extracting
the information. However they are back in fashion with a combination
of breakthroughs in neural network science (including the field currently
known as deep learning); in technology, with ever more affordable and
powerful machines; and lastly with the wide media coverage of a number
of futuristic applications.
Groundbreaking work has been going on over the last few years, namely
in images and natural language processing, both sound and text.
	In December, 2014 Microsoft announced the launch of Skype
Translator, a real time translation tool for 5 languages, to break down
language barriers.[5]
	With DeepFace, Facebook announced, in June, 2014 a giant step
forward in facial recognition, reaching a precision level of 97%, close
to human performance for a similar task.[6]
	Google presents similar results with FaceNet in an article dated
June, 2015 on facial recognition and clustering.[7]
[3] http://bit.ly/WP-Hadoop2015 (French)
[4] data-science-fondamentaux-et-etudes-de-cas
[5] skype-translator-unveils-the-magic-to-more-people-around-the-world
[6] deepface-closing-the-gap-to-human-level-performance-in-face-verification
[7] http://arxiv.org/pdf/1503.03832.pdf
Such developments in unstructured data processing show that it is now
possible to extract value from data hitherto considered out of reach.
The key lies in structuring the data:
	A raw image is transformed into a face, and then linked to a person.
The image's context can also be described in a sentence.[8]
The
patterns extracted from the images can be reproduced with slight
modifications, or blended with other images, such as a famous
painting to produce artistic motifs.[9]
	Speech can be transcribed as text, and music as notes on a score.
Patterns extracted from music make it possible to a certain extent to
reproduce a composer or musical genre.
	Masses of unstructured texts are transformed into meaning using
semantic vectors. Processing natural language becomes a question
of algebraic manipulations, facilitating its use by the algorithms of
data science.[10]
The mainstreaming of bots and personal assistants such as Apple's Siri, Google's Now and Facebook's M builds on our ability to carry out ever more detailed semantic analyses of unstructured text.
	The study of brain activity provides clues to identifying signs of
illness such as epilepsy or to determining which cerebral patterns
correspond to moving one's arm.[11]
	Some problems requiring cutting edge expertise are now being
handled using data science approaches, including detecting the Higgs boson and searching for dark matter using sky imaging.[12] [13]
Such use cases, often tightly linked to challenges launched by academic
circles, have largely contributed to the media frenzy around data science.
Moreover, for the Web Giants, data science has become not only a way to
continuously improve internal processes, but also an integral part of the
business model. Google products are free because the data generated
by the user has value for advertising targeting. Twitter draws a share of its
revenue from the combination of advertising and analytics products. Uber is a
perfect example of a data-driven company which, in serving as intermediary
between the client and the driver, has nothing to sell other than the intelligence of the matchmaking between them.[14]
Intermediation services can easily be copied by the
competition, but not the intelligence behind the services.
[9] inceptionism-going-deeper-into-neural
[10] learning-meaning-behind-words
[11] grasp-and-lift-eeg-detection
[12] kaggle.com/c/higgs-boson
[13] kaggle.com/c/DarkWorlds/data
[14] data-science-disruptors
A flourishing ecosystem and accessible tools
The standardization of data science came about through the contribution
of many tools from the open source world such as the multiple machine
learning and data handling libraries in languages such as R and Python[15] [16]
and from the world of Big Data. These open source ecosystems and their
dynamic communities have facilitated access to data science for many an
IT engineer or statistician wishing to become a data scientist.
In parallel, tools for data analysis by major publishers, whether oriented
statistics or IT, have also evolved towards integrating open source tools or
developing their own implementations of machine learning algorithms.[17]
Both the open source and proprietary ecosystems are flourishing, mature,
and more and more accessible in terms of training and documentation.
Open source is used as much to attract major talent from data science
as to provide tools for the community. This strategy is picking up speed
as illustrated by the buzz generated by TensorFlow, an open source framework for deep learning and numerical computation published by Google in November, 2015.[18]
Thanks to highly permissive licensing, these tools are
absorbed and improved by the community, transforming them into de
facto standards.
We have completely lost track of the number of tools from the Hadoop
ecosystem which were internally developed by the Web Giants (such as
Hive and Presto at Facebook, Pig at Yahoo, Storm and Summingbird at
Twitter...) and then took on a second life in the open source world.
Platforms for online competitions in data science (such as the most
well known kaggle.com or datascience.net in France) have given new,
vibrant visibility to the potential of data science. Various Web Giants
such as Facebook and major players in distribution and industry quickly
understood that this could help them attract the best talent.[19]
Many data
science competitions propose job interviews as the top prize, in addition
to financial awards and certain glory.
[15] four-main-languages-analytics-data-mining-data-science
[16] kdnuggets.com/2015/05/r-vs-python-data-science
[17] Why-is-SAS-insufficient-to-become-a-data-scientist-Why-need-to-learn-Python-or-R
[18] tensorflow-googles-latest-machine_9
[19] kaggle.com/competitions
The Web Giants swiftly organized to recruit the best data scientists, thus
anticipating the value added by interdisciplinary teams specialized in
capitalizing on data.[20]
Many, e.g. Google, Facebook and Baidu, have also hired top specialists in
machine learning such as Geoffrey Hinton, Yann LeCun and Andrew Ng.[21] [22] [23]
Current challenges in data science
One of the most crucial steps in any data science project is called feature
engineering. This consists of extracting the relevant numeric variables
to characterize one or several facets of the phenomenon under study.
For example, numerically describing user behavior on a web site by
calculating how often a given page is accessed, or characterizing an
image by the number of contours it contains. Feature engineering is also considered one of the most tedious tasks a data scientist has to carry out.
For unstructured data such as images, deep learning has made it
possible to automate the procedure, placing the use cases mentioned
above within reach.
For structured data, the creation and selection of new features to improve
prediction remain strongly specific to each particular business. This is
an essential component of the alchemy of a good data scientist. Feature
engineering is still largely implemented manually by the world's best data
scientists for structured data.[24]
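As an illustrative sketch (the web-log records are hypothetical), feature engineering on structured data often amounts to turning raw events into per-entity numeric variables, here a few per-user indicators derived from page accesses:

```python
from collections import Counter, defaultdict

# Hypothetical raw web logs: (user_id, page) pairs extracted upstream.
logs = [("u1", "/home"), ("u1", "/pricing"), ("u1", "/home"),
        ("u2", "/home"), ("u2", "/docs")]

def user_features(raw_logs):
    """Turn raw events into numeric variables usable by a learning algorithm."""
    per_user = defaultdict(Counter)
    for user, page in raw_logs:
        per_user[user][page] += 1
    features = {}
    for user, pages in per_user.items():
        total = sum(pages.values())
        features[user] = {
            "total_hits": total,                         # overall activity
            "pricing_ratio": pages["/pricing"] / total,  # interest in pricing pages
            "distinct_pages": len(pages),                # breadth of navigation
        }
    return features

print(user_features(logs))
```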
How can I make it work for me?
Are all the data you produce stored and then readily accessible? What
percentage of the data is in fact processed and analyzed? How often? To
what extent do you use the available data to measure your processes and
orient your actions? How much importance do you attach to recruiting
data scientists, data engineers and data architects?
Data science contributes more broadly to the best practices of data driven
companies, i.e. those that use the available data both qualitatively and
quantitatively to improve all their processes. Answering the few questions
above allows you to measure your maturity as concerns data.
[20] the-state-of-data-science
[21] wired.com/2013/03/google_hinton/
[22] facebook.com/yann.lecun/posts/10151728212367143
[23] chinese-search-giant-baidu-hires-man-behind-the-google-brain
[24] http://blog.kaggle.com/2014/08/01/learning-from-the-best/
You have perhaps already used predictive methods based on linear
algorithms such as logistic regression traditionally found when establishing
marketing scores. Today, the rigorous implementation of the data science
methodology gives you control over the inherent complexity in using
non linear algorithms. The underlying compromise in giving up linear
algorithms is the loss of capacity to understand and explain predictions in
exchange for more realistic, and therefore more useful, predictions.
How do I get started?
Depending on the nature of your business, you may have unstructured data that deserve a fresh look:
	 Call center recordings to be transcribed and semanticized to better understand your customer relations.
	 Written texts supplied by clients, or emails sent by staff, to be used to categorize complaints and requests and to detect fads and trends.
The takeaway is that, in the use cases of most of our clients and in international competitions, the vast majority concern structured or semi-structured data:
	 Mapping links between customers and timestamped transactions can bring to light potential fraud, by processing volumes far beyond what is possible manually.
	 Web logs, captured as far upstream as possible, characterize the customer journeys which lead to a strategic target such as shopping cart abandonment.
	 Temporal series produced by industrial sensors help prevent problems on assembly lines.
	 Server logs identify warning signs before a machine breaks down.
	 Relational data on clients, sales and products form a set of characteristics, including identity, geographic location, behavior patterns and social networks, which are systematically integrated in the 360° models of the examples described above.
Better yet, personalizing your client segments, predicting component
failures, improving the performance of your production units, gaining
customer loyalty, forecasting increases in demand and reducing churn,
are all possible use cases.[25]
Data science has become a strategic business
asset that you can no longer do without.
[25] kaggle.com/wiki/DataScienceUseCases
Sources
[1] Dhar V. 2013. “Data science and prediction“. Communications of the ACM
[2] Cleveland WS. 2001. “Data science: an action plan for expanding the technical area of the
field of statistics“. Bell Labs Statistics Research Report
[3] http://bit.ly/WP-Hadoop2015 (French)
[4] data-science-fondamentaux-et-etudes-de-cas
[5] skype-translator-unveils-the-magic-to-more-people-around-the-world
[6] deepface-closing-the-gap-to-human-level-performance-in-face-verification
[7] http://arxiv.org/pdf/1503.03832.pdf
[8] google-stanford-build-hybrid-neural-networks-that-can-explain-photos
[9] inceptionism-going-deeper-into-neural
[10] learning-meaning-behind-words
[11] grasp-and-lift-eeg-detection
[12] kaggle.com/c/higgs-boson
[13] kaggle.com/c/DarkWorlds/data
[14] data-science-disruptors
[15] four-main-languages-analytics-data-mining-data-science
[16] kdnuggets.com/2015/05/r-vs-python-data-science
[17] Why-is-SAS-insufficient-to-become-a-data-scientist-Why-need-to-learn-Python-or-R
[18] tensorflow-googles-latest-machine_9
[19] kaggle.com/competitions
[20] the-state-of-data-science
[21] wired.com/2013/03/google_hinton/
[22] facebook.com/yann.lecun/posts/10151728212367143
[23] chinese-search-giant-baidu-hires-man-behind-the-google-brain
[24] http://blog.kaggle.com/2014/08/01/learning-from-the-best/
[25] kaggle.com/wiki/DataScienceUseCases
Design for Failure
Description of the pattern
“Everything fails all the time“ is a famous aphorism by Werner Vogels, CTO
of Amazon: indeed it is impossible to plan for all the ways a system can
crash, in any layer - an inconsistent administration rule, system resources
that are not released following a transaction, hardware failure, etc.
It is on this simple principle that the architecture of the Web Giants is based; it is known as the Design for Failure pattern: software must be able to overcome the failure of any underlying component or piece of infrastructure.
Hardware is never 100% reliable; it is therefore crucial to isolate components and applications (data grids, HDFS...) to guarantee permanent service availability.
At Amazon for example, it is estimated that 30 hard drives are changed every
day per data center. The cost is justified by the nearly constant availability
of the site amazon.fr (less than 0.3 s. of outage per year), where one must
remember that each minute of outage costs over 50,000 euros in lost sales.
A distinction is generally made between the traditional continuity of
service management model and the design for failure model which is
characterized by five stages of redundancy:
		 Stage 1: physical redundancy (network, disk, data center). That is
where the traditional model stops.
		 Stage 2: virtual redundancy. An application is distributed over
several identical virtual machines within a VM cluster.
		 Stage 3: redundancy of the VM clusters (or Availability Zone on
AWS). These clusters are organized into clusters of clusters.
		 Stage 4: redundancy of the clusters of clusters (or Region on AWS).
A single supplier manages these regions.
		 Stage 5: redundancy across hosting providers (e.g. AWS and Rackspace), in the highly unlikely event of AWS being completely down.
Of course, you will have understood that the higher the redundancy level, the more the deployment and switchover mechanisms must be automated.
Applications created with Design for Failure continue to function despite crashes of the system or of connected applications, even if it means degrading functionality, for the most recently connected users or for all users, in order to keep providing an acceptable level of service.
This entails including design for failure in the application engineering,
based for example on:
		 Eventual consistency: instead of systematically seeking consistency
with each transaction with often costly mechanisms of the XA[1]
type, consistency is ensured at the end (eventually) when the failed
services are once again available.
		 Graceful degradation (not to be confused with the Web User
Interface of the same name): when there are sharp spikes in load,
performance-costly functionalities are deactivated live.
At Netflix, the streaming service is never interrupted, even when the recommendation system is down, failing or slow: the films are still there, no matter what the failure.
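As a minimal sketch of graceful degradation (not Netflix's actual code; the service names and timeout are invented), a call to a non-essential dependency can be wrapped so that its failure or slowness never interrupts the main service:

```python
import concurrent.futures

def personalised_recommendations(user_id):
    """Non-essential dependency: may be down, failing or slow."""
    raise TimeoutError("recommendation backend unavailable")

def popular_titles():
    """Cheap, always-available fallback."""
    return ["Title A", "Title B", "Title C"]

def recommendations_with_fallback(user_id, timeout_s=0.2):
    """Degrade gracefully: if the call fails or exceeds the timeout, fall back."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(personalised_recommendations, user_id)
        try:
            return future.result(timeout=timeout_s)
        except Exception:
            return popular_titles()   # degraded but acceptable level of service

print(recommendations_with_fallback("user-42"))  # streaming keeps working
```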
Moreover, to reach that continuity of service, Netflix uses automated testing
tools such as ChaosMonkey (recently open-sourced), LatencyMonkey and
ChaosGorilla, which check that applications continue to run correctly
despite, respectively, random failures of one or several VMs, induced network latency, and the loss of an Availability Zone.
Netflix thus lives up to its motto: “The best way to avoid failure is to fail
constantly“.
Who makes it work for them?
Obviously Amazon, who furnishes the basic AWS building blocks.
Obviously Google and Facebook who communicate frequently on these
topics.
But also Netflix, SmugMug, Twilio, Etsy, etc.
In France, although some sites have very high availability rates, very few
comment on their processes and, to the best of our knowledge, very few
are capable of expanding their redundancy beyond stage 1 (physical)
[1] Distributed transaction, 2-phase commit.
or 2 (virtual machines). Let us nonetheless mention Criteo, Amadeus,
Viadeo, the main telephone operators (SFR, Bouygues, Orange) for their
real-time need coverage.
What about me?
Physical redundancy, rollback plans, Disaster Recovery Plan sites, etc. are
not Design for Failure patterns but rather redundancy stages.
Design for Failure entails a change in paradigm, going from “preventing
all failures“ to “failure is part of the game“, going from “fear of crashing“
to “analyzing and improving“.
In fact, applications built along the lines of Design for Failure no longer
generate such feelings of panic because all failures are naturally mastered;
this leaves time for post-mortem analysis and for improvement following the PDCA[2] cycle.
It is, to borrow a term from Improv Theater, “taking emergencies easy“.
This entails taking action on both a technical and a human level. First of all
in application engineering:
		 The components of an application or application set must be
decentralized and made redundant by VM, by Zone, and by Region (in the Cloud; the same principle applies if you host your own IS), without any
shared failure zones. The most complex issue is synchronizing
databases.
		 All components must be resilient to underlying infrastructure failures.
		 Applications must support communication breaks and high network
latency.
		 The entire production workflow for these applications has to be
automated.
Then, for the organization:
		 Get out of the A-Team culture (remember: “the last chance at the
last moment“) and automate processes to overcome systems failure.
At Google, there is 1 systems administrator for over 3000 machines.
[2] Plan-Do-Check-Act, a method for continuous improvement, known as the “Deming Wheel“.
		 Analyze and fix failures upstream with the Failure Mode and Effects
Analysis (FMEA) method, and downstream with post-mortems and
PDCA.
Related patterns
		 Pattern “Cloud First“ , p. 159.
		 Pattern “Commodity Hardware“, p. 167.
		 Pattern “DevOps“, p. 71.
Exceptions
For totally disconnected applications, with few users or low business stakes, redundancy should remain simple or even non-existent.
The choice between redundancy levels is then made using ROI criteria (costs and complexity vs. estimated losses during outages).
Sources
• Don MacAskill, How SmugMug survived the Amazonpocalypse, 24 April,
2011:
 http://don.blogs.smugmug.com/2011/04/24/how-smugmug-survived-the-amazonpocalypse
• Scott Gilbertson, Lessons From a Cloud Failure: It’s Not Amazon, It’s You, 25
April, 2011:
 http://www.wired.com/business/2011/04/lessons-amazon-cloud-failure
• Krishnan Subramanian, Designing For Failure: Some Key Facts, 26 April, 2011:
http://www.cloudave.com/11973/designing-for-failure-some-key-facts
The Reactive Revolution
For many years now, concurrent processes have been executed in different threads. A program is basically a sequence of instructions that runs linearly
in a thread. To perform all the requested tasks, a server will generate
several threads. But these threads will spend most of their time waiting for
the result of a network call, a disk read or a database query.
Web giants have moved on to a new model to eliminate such time loss and
to increase the number of users per server by reducing latency, improving
performance globally and managing peak loads more simply.
The reactive manifesto defines a reactive application around four
interrelated pillars: event-driven, responsive, scalable and resilient.
A reactive application is event-driven, which lets it make better use of the available computing power and tolerate errors and failures better, hence its scalability and resilience, and ultimately provide an optimal user experience. But the most powerful concept here is the event-driven orientation: everything else can be seen through this prism.
The reactive model is a development model driven by events.
It is called by a variety of names. It's all a matter of perspective:
	 event-driven, driven by events
	 reactive, that reacts to events
	 push-based application, where data is pushed as it becomes available
	 or even the Hollywood principle, summarised by the famous “don’t call us, we’ll call you“
Use cases: when latency matters
This architectural model is very relevant for applications interacting with
users in real time.
This includes several use cases like:
	 Social networks, shared documents and direct communication tools
	 Financial analysis, pooled information like traffic congestion or
public transport, pollution...
	 Multiplayer games
	 Multi-channel approaches, mobile application synchronisation
	 Open or private APIs, when usage is impossible to predict
	 IoT and index management
	 Massive user influx such as sport events, sales, TV ads...
	 And more generally when effectively managing complex
algorithms is the issue, e.g. for ticket booking, graph management,
the semantic web
One of the crucial elements in all these applications is latency handling.
For an application to be responsive and thus usable, users must experience
the lowest possible latency.
It’s all about the threading strategy
To put it simply, there are two types of thread:
	 Hard-threads: real concurrent processes executed by the different processor cores
	 Soft-threads: simulated concurrent processes, each receiving slices of the CPU in turn
Fortunately, the soft-threads allow machines to simultaneously run many
more threads than they have cores.
The reactive model aims to remove as many soft-threads as possible and
only use hard-threads, making more efficient use of modern processors.
To reduce the number of threads, the CPU must not be shared on a time
basis, but instead on an event basis. Each call involves processing a
piece of code. It must never be blocked, to release the CPU as quickly as
possible to process the next event.
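As an illustrative sketch of this event-based sharing of the CPU (using Python's asyncio, which is our own choice rather than a tool named in this chapter), three slow I/O calls run interleaved on a single thread instead of blocking one thread each:

```python
import asyncio

async def fetch(name, delay_s):
    """Stand-in for a network call, disk read or database query."""
    await asyncio.sleep(delay_s)      # non-blocking wait: the CPU is released
    return f"{name} done"

async def main():
    # The three calls are interleaved on one thread by the event loop.
    results = await asyncio.gather(
        fetch("network call", 0.3),
        fetch("disk read", 0.2),
        fetch("database query", 0.1),
    )
    print(results)   # total time ~0.3s, not 0.6s

asyncio.run(main())
```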
Implementing this model means acting on all software layers: from operating systems to development languages, by way of frameworks, hardware drivers and databases.
A data structure that eliminates locks is beyond doubt an important lever
for system performance. New functional data models then become the
best allies for reactive models.
Among new software making the most buzz, many use an internal reactive
model. To name but a few: Redis, Node.js, Storm, Play, Vertx, Axom, and
Scala.
The reactive model is more likely to respond well to load peaks: it removes the cap on the number of simultaneous users otherwise set by an arbitrary, fixed thread-pool size. Most of the Web Giants have published feedback on their migration to this model: Coursera,[1] Gilt, Groupon, Klout, LinkedIn,[2] Netflix,[3] Paypal, Twitter,[4] WalMart[5] and Yahoo.
Their voices are unanimous: reactive architectures make it possible to offer
the best user experience with the highest scalability.
Why now?
“Software gets slower faster than hardware gets faster. “
Niklaus Wirth – 1995
The reactive model is not new. It has been used in all user interface
frameworks since the invention of the mouse. Each click or keystroke
generates an event.
Even client-side JavaScript uses this model. There is no thread in this
language, yet it is possible to have multiple simultaneous AJAX requests.
Everything works using call-backs and events.
[1] http://downloads.typesafe.com/website/casestudies/Coursera-Case-Study.pdf
[2] http://engineering.linkedin.com/play/play-framework-async-io-without-thread-pool-and-callback-hell
[3] https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
[4] http://venturebeat.com/2012/01/24/why-walmart-is-using-node-js/
[5] http://www.infoq.com/presentations/netflix-reactive-rest
Current development architectures are the result of a succession of steps
and evolutions. Some strong concepts have been introduced and used
extensively before being replaced by new ideas. The environment is also
changing. The way we respond to it has changed.
User experience has been the driving force of this change: today, who is
willing to fill in a form, wait for the page to reload to provide feedback
(failure/success) and wait again for the confirmation email? Why not get such information immediately rather than asynchronously?
Have we reached the limits of our systems? Is there still space to be
conquered? Performance gains to discover?
In our systems, there is a huge untapped power reservoir. To double the number of users, adding a server will do the trick. But since the advent of mobile, companies have to handle about 20x more requests: is it reasonable to multiply the number of servers in proportion? And is it sufficient? Certainly not. Beyond a certain point, it makes more sense to review the architecture to harness the power that is already available: there are many more processor cycles left to optimise. When programs spend
significant amounts of time waiting for disks, networks or databases, they
don’t harness server potential.
From this point forward, this paradigm is becoming accessible to everyone as it is built into modern development languages. These new
development patterns integrate latency and performance management at
the beginning of all projects. It is no longer a challenge to overcome when
it is too late to change the application architecture.
Applications based on the request/response model (HTTP / SOAP / REST)
can tolerate a thread model. In contrast, applications based on flows like
JMS or WebSocket will have everything to gain from working off a model
based on events and soft threads.
Unless your application is mostly devoted to calculations, you should start
thinking about implementing the reactive approach. The paradigm is
compatible with all languages.
Things are moving fast: new frameworks now offer asynchronous APIs and mostly use non-blocking APIs internally; language libraries are also changing, now providing classes that make it simpler to react to events; and, lastly, the languages themselves are evolving to make it easier to write short pieces of code (closures) or to generate asynchronous code from synchronous code.
In addition, patterns can be set up to manage threadless multitasking
scripts:
	 a generator, which produces elements and pauses for each
iteration, until the next invocation
	 continuation, a closure which becomes a procedure to be
executed once it has been processed
	 coroutine, which makes it possible to pause processing
	 composition, which makes it possible to sequence processing in
the pipeline
	 Async/Await, to distribute processing over several cores
In other words, the reactive revolution is underway!
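As a tiny sketch of the first of these patterns, a Python generator produces one element per invocation and pauses until the next one:

```python
def events():
    """Generator: produces one element per invocation, then pauses."""
    for i in range(3):
        yield f"event-{i}"   # execution stops here until next() is called again

stream = events()
print(next(stream))   # "event-0": the generator runs up to the first yield
print(next(stream))   # "event-1": it resumes exactly where it paused
```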
How can I make it work for me?
Reactive architecture is to architecture what NoSQL is to relational
databases: a very good alternative when you have reached your limits.
It is all a question of latency and concurrent access: for real time applications, whether embedded or not, choosing a reactive architecture is justified as soon as you face a significant increase in volume. So no reactive corporate website, but instead real time processing and display of IoT data (a fleet's position, for example).
The same goes for APIs, whose back ends must be designed accordingly: if your volume is under control, a reactive architecture is overkill; for APIs open to partners, or even fully open, it appears necessary to design a non-blocking architecture from the outset.
Lastly, on one hand, wisely using the cloud can help you overcome many
of these limits (Amazon's Lambdas for example), and, on the other hand,
many software publishers have demonstrated their willingness to produce
highly scalable architecture. When choosing a SaaS software package or
one hosted on premises for these use cases, companies must now turn to vendors who have proven they master such architecture.
All of these technologies have physical limits. Disk volumes are increasing,
but not access time. There are more cores in processors, but frequency
has not increased. Memory is increasing, beyond the capacity of garbage
collectors. If you are nearing these limits, or will do so in the next few
years, reactive architecture is definitely made for you.
Open API
Description
The principle behind Open API is to develop and offer services which
can be used by a third party without any preconceived ideas as to how
they will be used.
Development is thus mainly devoted to the application logic and to system persistence. The interface and the usage logic are developed by others, often more specialized in interface technologies and ergonomics, or with other specific skills.[1]
The application engine therefore exposes an API,[2]
which is to say a bundle
of services. The end application is based on service packages, which can
include services provided by third parties. This is the case for example for
HousingMaps.com, a service for visualizing advertisements on CraigsList
using Google Maps.
The pattern belongs to the broader principles of SOA:[3]
decoupling and
composition possibilities. For a while, there was a divide between the
architecture of Web Giants, generally of the REST[4]
type and corporate
SOA, mostly based on SOAP.[5]
There has been a lot of controversy among
bloggers on this opposition between the two architectures. What we
believe is that the REST API exposed by Web Giants is just one form of
SOA among others.
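As a purely illustrative sketch of exposing such an API (Flask is our own choice here, and the resource and fields are invented), the application engine simply publishes its services with no assumption about who will consume them:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory store standing in for the application engine's persistence layer.
ads = {1: {"title": "2-room flat", "lat": 48.85, "lng": 2.35}}

@app.route("/api/ads", methods=["GET"])
def list_ads():
    """Expose the service without presuming how it will be used."""
    return jsonify(list(ads.values()))

@app.route("/api/ads", methods=["POST"])
def create_ad():
    ad = request.get_json()
    ad_id = max(ads) + 1
    ads[ad_id] = ad
    return jsonify({"id": ad_id}), 201

if __name__ == "__main__":
    app.run(port=5000)   # a mapping front end, a mobile app or a partner can all consume it
```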
Web Giants publicly expose their API, thus creating open ecosystems.
What this strategy does for them is to:
		 Generate direct income, by billing the service. Example: Google
Maps charges for their service beyond 25,000 transactions per day.
		 Expand the community, thereby recruiting users. Example: thanks
to the apps derived from its platform, Twitter has reached 140 million
active users (and 500 million subscribers).
[1] http://www.slideshare.net/kmakice/maturation-of-the-twitter-ecosystem
[2] Application Programming Interface.
[3] Service Oriented Architecture.
[4] Representational State Transfer. [5] Simple Object Access Protocol.
		 Foster the emergence of new uses for their platform, thus developing their income model. Example: in 2009, Apple noted that application
developers wanted to sell not only their applications, but also
content for them. The AppStore model was changed to include that
possibility.
		 At times, externalize R&D, then acquire the most talented startups.
That is what Salesforce did with Financialforce.com.
Marc Andreessen, creator of Netscape, divides open platforms into three
types:
		 Level 1 - Access API: these platforms allow users to access business
applications without providing the user interface. Examples: book
searches on Amazon, geocoding on Mappy.
		 Level 2 - Plug-in API: These platforms integrate applications in
the supplier’s user interface. Examples: Facebook apps, Netvibes
Widgets.
		 Level 3 - Runtime Environment: These platforms provide not only
the API and the interface, but also the execution environment.
Example: AppExchange applications in the Salesforce or iPhone
ecosystem.
It is also good to know that Web Giants APIs are accessible in self-service,
i.e. you can subscribe directly on the web site without any commercial
relations with the provider.
At level 3, you must design a multi-tenant system. The principle is to
manage the applications of several businesses in isolation, finding a balance between resource sharing and self-containment.
The pattern API First is derived from the Open API pattern: its approach
is to begin by building an API, then to consume it to build applications
for your end users. The idea is to be on the same level as the ecosystem
users, which means applying the same architecture principles you are
offering your clients to yourself, which is to say the pattern Eat Your Own Dog Food (EYODF). Some architects working for Web Giants consider it
the best way to build a new platform.
In practice, the API First pattern is an ideal which is not always reached:
in recent history, it would seem that it has been applied for Google Maps
and Google Wave, two services developed by Lars Rasmussen. And yet it
was not applied for Google+, stirring the wrath of many a blogger.
Who makes it work for them?
Pretty much everyone, actually...
References among Web Giants
The Google Maps API is a celebrity: according to programmableWeb.com,
alongside Twitter it is one of those most used by websites. It has become
the de facto standard for showing objects on a map. It uses authentication
processes (client IDs) to measure consumption of a given application, so
as to be able to bill the service beyond a certain quota.
Twitter’s API is widely used: it offers sophisticated services to access subscriber data, both in read and in write mode. One can even use streaming to receive tweet updates in real time. All of the site’s functionalities are
accessible via their API. The API also makes it possible to delegate the
authorization process (using the OAuth protocol), thereby allowing a third
party application to tweet in your name.
In France
The mapping service Mappy offers APIs for geocoding, calculating itineraries, etc., available at api.mappy.com
With api.orange.com, Orange offers the possibility to send text messages,
to geolocalize subscribers, etc.
What about me?
You should consider Open API whenever you want to create an ecosystem
open to partners or clients, in-house or externally. Such an ecosystem can
be open on the Internet or restricted to a single organization. A relatively classic scenario in a business is exposing the employee directory so that identities can be integrated into other applications.
Another familiar case is integrating services exposed by other suppliers
(for example a bank consuming the services of an insurance company).
Lastly, a less traditional use is to open a platform for your end clients:
		 A bank could allow its users to access all of their transactions: see
the examples of the AXA Banque and CAStore APIs.
		 A telephone or energy provider could give their clients access to their current consumption.
Related Pattern
Pattern “Device Agnostic“ p. 143
Exception!
		 Anything requiring a complex workflow.
		 Real-time IT (aircraft, car, machine tool): in this case service
composition can pose performance issues.
		 Data manipulation posing regulatory issues: channelling critical data
between platforms is best avoided.
Sources
• REST (Representational State Transfer) style:
 http://en.wikipedia.org/wiki/Representational_State_Transfer
• SOA
 http://en.wikipedia.org/wiki/Service-oriented_architecture
• Book “SOA, Le guide de l’architecte d’un SI agile“ (French only):
 http://www.dunod.com/informatique-multimedia/fondements-de-lin-formatique/architectures-logicielles/ouvrages-professionnel/soa-0
• Open platforms according to Marc Andreessen:
 http://highscalability.com/scalability-perspectives-3-marc-andreessen-internet-platforms
• Mathieu Lorber, Stéphen Périn, What strategy for your web API? USI
2012 (French only):
 http://www.usievents.com/fr/sessions/1052-what-strategy-for-your-web-api?conference_id=11-paris-usi-2012
About OCTO Technology
“We believe that IT transforms our societies. We are fully convinced that
major breakthroughs are the result of sharing knowledge and the pleasure
of working with others. We are constantly in quest of improvements.
THERE IS A BETTER WAY !“
– OCTO Technology Manifest
OCTO Technology specializes in consulting and ICT project creation.
Since 1998, we have been helping our clients build their Information Systems
and create the software to transform their firms. We provide expertise on
technology, methodology, and Business Intelligence.
At OCTO our clients are accompanied by teams who are passionate about
maximizing technology and creativity to rapidly transform their ideas into
value: Adeo, Altadis, Asip Santé, Ag2r, Allianz, Amadeus, Axa, Banco Fibra,
BNP Fortis, Bouygues, Canal+, Cdiscount, Carrefour, Cetelem, CNRS, Corsair
Fly, Danone, DCNS, Generali, GEFCO, ING, Itaú, Legal & General, La Poste,
Maroc Telecom, MMA, Orange, Pages jaunes, Parkeon, Société Générale,
Viadeo, TF1, Thales, etc.
We have grown into an international group with four subsidiaries:
Morocco, Switzerland, Brazil and, more recently, Australia.
Since 2007, OCTO Technology has been granted the status of “innovative
firm“ by OSEO Innovation.
For four years, from 2011 to 2015, OCTO was awarded 1st or 2nd prize in the Great Place to Work contest for firms with fewer than 500 employees.
Authors
Erwan Alliaume
David Alia
Philippe Benmoussa
Marc Bojoly
Renaud Castaing
Ludovic Cinquin
Vincent Coste
Mathieu Gandin
Benoît Guillou
Rudy Krol
Benoît Lafontaine
Olivier Malassi
Éric Pantera
Stéphen Périn
Guillaume Plouin
Phillipe Prados
Translated
from the French
by Margaret Dunham & Natalie Schmitz
Copyright © November 2012 by OCTO Technology,
All rights reserved.
Illustrations
The drawings are by Tonu in collaboration with Luc de Brabandere.
They are both active on www.cartoonbase.com, located in Belgium.
CartoonBase works mostly with businesses and works to promote the
use of cartoons and to encourage greater creativity in graphic art and
illustrations of all kinds.
Graphics and design by OCTO Technology,
with the support of Studio CPCR
and toumine@gmail.com
ISBN 13 : 978-2-9525895-4-3
Price: AUD $32
The Web Giants
Culture – Practices – Architecture
In the US and elsewhere around the world, people are reinventing the way IT is
done. These revolutionaries most famously include Amazon, Facebook, Google,
Netflix, and LinkedIn. We call them the Web Giants.
This new generation has freed itself from tenets of the past to provide a different
approach and radically efficient solutions to old IT problems.
Now that these pioneers have shown us the way, we cannot simply maintain
the status quo. The Web Giant way of working combines firepower, efficiency,
responsiveness, and a capacity for innovation that our competitors will go after
if we don’t first.
In your hands is a compilation and structural outline of the Web Giants’
practices, technological solutions, and most salient cultural traits (obsession
with measurement, pizza teams, DevOps, open ecosystems, open software, big
data and feature flipping).
Written by a consortium of experts from the OCTO community, this book is for
anyone looking to understand Web Giant culture. While some of the practices
are fairly technical, most require no IT expertise and can be put to use
by marketing and product teams, managers, and geeks alike.
We hope this will inspire you to be an active part of IT, that driving force that
transforms our societies.
THE OBSESSION WITH MEASUREMENT • FLUIDITY OF THE USER
EXPERIENCE • ARTISAN CODERS • BUILD VERSUS BUY • CONTRIBUTING
TO FREE SOFTWARE • DEVOPS • PIZZA TEAMS • MINIMUM VIABLE
PRODUCT • PERPETUAL BETA • A/B TESTING • DEVICE AGNOSTIC
• OPEN API AND OPEN ECOSYSTEMS • FEATURE FLIPPING • SHARDING •
COMMODITY HARDWARE • TP VERSUS BI: THE NEW NOSQL APPROACH
• CLOUD FIRST • DATA SCIENCE • REACTIVE PROGRAMMING •
DESIGN THINKING • BIG DATA ARCHITECTURE • BUSINESS PLATFORM
OCTO designs, develops, and implements
tailor-made IT solutions and strategic apps
...Differently.
WE WORK WITH startups, public
administrations, AND large corporations
FOR WHOM IT IS a powerful engine for change.
octo.com - blog.octo.com - web-giants.com

  • 1. The Web GiantsCulture – Practices – Architecture AUGMENTED
  • 2. Foreword........................................................................................................................6 Introduction..................................................................................................................9 Culture..........................................................................................................................11 The Obsession with Performance Measurement......................13 Build vs Buy.....................................................................................................19 Enhancing User Experience...................................................................27 Code crafters.................................................................................................33 Open Source Contribution....................................................................41 Sharing Economy platforms..................................................................47 Organization.............................................................................................................57 Pizza Teams.....................................................................................................59 Feature Teams...............................................................................................65 DevOps.............................................................................................................71 Practices.......................................................................................................................85 Lean Startup...................................................................................................87 Minimum Viable Product.........................................................................95 Continuous Deployment.......................................................................105 Feature Flipping.........................................................................................113 Test A/B...........................................................................................................123 Design Thinking.........................................................................................129 Device Agnostic.........................................................................................143 Perpetual beta.............................................................................................151 Architecture............................................................................................................157 Cloud First.....................................................................................................159 Commodity Hardware............................................................................167 Sharding..........................................................................................................179 TP vs. 
BI: the new NoSQL approach..............................................193 Big Data Architecture..............................................................................201 Data Science................................................................................................211 Design for Failure......................................................................................221 The Reactive Revolution........................................................................227 Open API ......................................................................................................235 About OCTO Technology..............................................................................243 Authors......................................................................................................................245 Table of Contents
  • 3. 3 THE WEB GIANTS It has become such a cliché to start a book, a talk or a preface by stating that the rate of change is accelerating. However, it is true: the world is changing faster both because of the exponential rate of technology evolution and the central role of the user in today’s economy. It is also a change characterized by Marc Andreessen in his famous blog post as “software is eating the world“. Not only is software at the core of the digital economy, but producing software is changing dramatically too. This is not a topic for Web companies, this is a revolution that touches all companies. To cope with their environment’s change, they need to reinvent themselves into software companies, with new ways of working, organizing themselves and producing digital experiences for their customers. This is why I am so pleased to write the preface to “The Web’s Giants“. I have been using this book intensely since the first French edition was on the market. I have given copies to colleagues both at Bouygues Telecom and at AXA, I have made it a permanent reference in my own blogs, talks and writing. Why? It is the simplest, most pragmatic and convincing set of answers to the previous questions: what to do in this software- infused, technology-enabled, customer-centric fast changing 21st century? This is not a conceptual book, a book about why you should do this or that. This is a beautifully written story about how software and service development is organized in some of the best-run companies of the world. First, this is a book about practices. The best way to grow change in a complex world is to adopt practices. It is the only way to learn, by doing. These practices are sorted into three categories: culture, organization and architecture; but there is a common logic and a systemic reinforcement. Practices are easier to pick and they are less intimidating than methodologies or concepts. However, strong will and perseverance are required. I will not spoil your reading by summarizing what OCTO found when they look at the most common practices of the most successful software companies of the world. I will rather try to convince you that reading this book is an urgent task for almost everyone, based on four ideas. The first and foremost idea is that software systems must be built to change constantly.  This is equally true for information systems, support systems, embedded, web or mobile software. What we could define as customer engagement platforms are no longer complex systems that one designs and builds, but continuously evolving systems that are grown. This new generation of software systems is the core of the Web Giants. Constant evolution is mandatory to cope with exponential technology changes, as well as the only way to co-construct engagement platforms through customer feedbacks. The unpredictability of usage, especially social usage, means that digital experiences software processes that can only be crafted through measure and continuous improvement. This Foreword
  • 4. 4 THE WEB GIANTS FOREWORD critical change, from software being designed to software being grown, means that all companies that provide digital experiences to their customers must become software companies.  A stable software support system could be outsourced, delegated or bought, but a constantly evolving self-adaptive system becomes a core capability. This capability is deeply mixed with business and its delivery processes and agents are to be valued and respected. The second key idea is that there exists a new way of building such software systems. We are facing two tremendous challenges: to churn out innovations at the rate that is expected by the market, and to constantly integrate new features while factoring out olderones,toavoidthesuffocationbyconstantgrowththatplaguedpreviousgenerations of software systems. The solution is a combination of open innovation - there are clearly more smart developers outside any company than inside – together with source-level “white box“ integration and minimalist “platform“ design principles. When all your code needs to be constantly updated to follow the environment change, the less you own the better. It is also time to bring source code back from the dark depths of “black box integration“. Open source culture is both about leveraging the treasure trove of what may be found in larger development communities and about mashing up composite applications by weaving source code that one may be proud of. Follow the footsteps of the Web Giants: code that changes constantly is worth being well-written, structured, documented and test-viewed by as many eyeballs as possible. The third idea is another way of saying that “software is eating the world“, this book is not about software, it is about a new way of thinking about your company, whichever businessyouarein.Notsurprisingly,many“known“practicessuchasagiledevelopment, lean startup, measure obsession or obsession about saving customer’s time - the most precious commodity of the digital age -, have found their way into Octo’s list. By reading the practical testimonies from the Web Giants, a new kind of customer-focused organization will emerge. Thus, this is a book for everyone, not for geeks only. This is of the utmost importance since many of the change levers lay in other stakeholders’ hands than software developers themselves. For instance, a key requirement for agility is to switch from solution requirement to problem requirement, allowing the solution to be co-developed by cross-functional teams as well as users. The last idea I would propose is that there is a price to pay for this transformation. There are technologies, tools and practices that you must acquire and learn. Devops practices, such as continuous delivery or managing infrastructure as code, require to master a set of tools and to build skills, there is no “free lunch“. A key set of benefits from the Web Giants way of working comes from massive automation. This book also
  • 5. 5 shows some of the top recent technology patterns in the architecture section. Since this list is evolving by nature, the most important lesson is to create an environment where “doers“ may continuously experience the tools of the future, such as massively parallel cloud programming, big data or artificial intelligence. A key consequence is that there is a true efficiency and competitiveness difference between those who do and those who don’t master the said set of tools and skills. In the world of technology, we often use the world “Barbarians“ to talk about newcomers who leverage their software/technology skills to displace incumbents in older industries. This is not a question of mindset (trying to take legacy companies head-front is an age-old strategy for newcomers) but a matter of capabilities! As stated earlier, there would be other, more conceptual, ways to introduce the key ideas and practices that are pictured in this book. One could tell about the best sources on motivation and collaborative work, such as Daniel Pink for instance. These Web Giants practices reflect the state of the art of managing intrinsic motivation. The same could be said about the best books on lean management and self-organization. The reference to Lean Startup is one from many subtle references to the influence of the Toyota Way in the modern 21st century forms of organization. Similarly, it would be tempting to convoke complex system theory - see Jurgen Apello and his “Management 3.0“ book for instance - to explain why the practices observed and selected by Octo are the natural answer to the challenges of the increasingly changing and complex world that we live in. From a technology perspective, it is striking to see the similarity with the culture & organizational traits described by Salim Ismael, Michael Malone and Yuri van Geest in their book “Exponential organizations“. The beauty of this pragmatic approach is that you have almost all what you need to know in a much shorter package, which is fun and engaging to read. To conclude this preface, I would advise you to read this book carefully, to share it with your colleagues, your friends and your children - when it’s time to think about what it means to do something that matters in this new world. It tells a story about the new way of working that you cannot afford to miss. Some of the messages: measuring everything, learning by doing, loving your code and respecting those who build things, may make the most seasoned manager smile, but times are changing. This is no longer a set of suggested, “nice-to-have“ practices, as it might have been ten years ago. It is the standard of web-age software development, and de facto the only way for any company to succeed in the digital world. Yves Caseau - National Academy of Technologies of France, President of the ICT commission. Head of Digital of AXA Group THE WEB GIANTS
  • 6. 6 THE WEB GIANTS INTRODUCTION Introduction Something extraordinary is happening at this very moment; a sort of revolution is underway. Across the Atlantic, as well as in other parts of the world such as France, people are reinventing how to work with information technology. They are Amazon, Facebook, Google, Netflix and LinkedIn, to name but the most famous. This new generation of players has managed to shed old dogmas to examine afresh the issues at hand by coming up with new, radical and efficient solutions for long-standing IT problems. Computer scientists are well aware of the fact that when IT tools are introduced to a trade, the benefits of computerization can only be reaped if business processes are re-thought in light of the new potential offered by technology. One trade, however, has mostly managed thus far to avoid upheavals in their processes: Information Technology itself. Many continued – and still do – to build information systems the way one would build highways or bridges. There is a tendency to forget that the matter being handled on a daily basis is extremely volatile. By dint of hearing tell of Moore’s law,[1] its true meaning is forgotten: what couldn’t be done last year is possible today; what cannot be done today will be possible tomorrow. The beliefs and habits of the ecosystem we live in must be challenged at regular intervals. This thought is both terrifying and wonderful. Now that the pioneers have paved the way, it is important to re-visit business processes. The new approaches laid out here offer significant increases in through efficiency, proactivity, and the capacity for innovation, to be harnessed before the competition pulls the rug out from under your feet. The good news is that the Web Giants are not only paving the way; they espouse the vision of an IT community. They are committed to the Open Source principle, openly communicating their practices to appeal to potential recruits, and work in close collaboration with the research community. Their work methods are public knowledge and very accessible to those who care to delve. The aim of this book is to provide a synthesis of practices, technological solutions and the most salient traits of IT culture. Our hope is that it will inspire readers to make contributions to an information age capable of reshaping our world. This book is designed for both linear and thematic reading. Those who opt for the former may find some repetition. [1] empirical law which states that computing power roughly doubles in capacity at a fixed price every 18 months.
  • 8. 8 The obsession with performance measurement................................. 13 Build vs Buy..................................................................................... 19 Enhancing the user experience......................................................... 27 Code crafters................................................................................... 33 Developing Open Source................................................................. 41 THE WEB GIANTS
  • 9. THE WEB GIANTS The obsession with performance measurement
  • 10. 10 THE WEB GIANTS CULTURE / L’OBSESSION DE LA MESURE
  • 11. 11 THE WEB GIANTSCULTURE / THE OBSESSION WITH PERFORMANCE MEASUREMENT Description In IT, we are all familiar with quotes reminding us of the importance of performance measurement: That which cannot be measured cannot be improved; without measurement, it is all opinion. Web Giants have taken this idea to the extreme, and most have developed a strong culture of performance measurement. The structure of their activities leads them in this direction. These activities often share three characteristics: For these companies, IT is their means of production. Their costs are therefore directly correlated to the optimal use of equipment and software. Improvements in the number of concurrent users or CPU usage result in rapid ROI. Revenues are directly correlated to the efficiency of the service provided. As a result, improvements in conversion rates lead to rapid ROI. They are surrounded by computers! And computers are excellent measurement instruments, so they may as well get the most out of them! Most Web Giants have made a habit of measuring everything, response times, most visited web pages or the articles (content or sales pages) that work best, the time spent on individual pages... In short, nothing unusual – at first glance. But that’s not all! – They also measure the heat generated by a given CPU, or the energy consumption of a transformer, as well as the average time between two hard disk failures (MTBF, Mean Time Between Failure).[1] This motivates them to build infrastructure that maximizes the energy efficiency of their installations, as these players closely monitor PUE, or Power Usage Effectiveness. Most importantly, they have learned to base their action plans on this wealth of metrics. [1] http://storagemojo.com/2007/02/19/googles-disk-failure-experience
  • 12. 12 THE WEB GIANTS Part of this trend is A/B testing (see “A/B Testing“ on p. 123 for further information), which consists of testing different versions of an application on different client groups. Does A work better than B? The best way to find out remains objective measurement: it results in concrete data that defy common sense and reveal the limits of armchair expertise, as demonstrated by the www.abtests.com website, which references A/B testing results. In an interview, Yassine Hinnach – then Senior Engineer Manager at LinkedIn – spoke of how LinkedIn teams were encouraged to quickly put any technology designed to boost site performance to the test. Thus decisions to adopt a given technology are made on the basis of observed metrics. HighScalability.com has published an article presenting Amazon’s recipes for success, based on interviews with its CTO. Among the more interesting quotes, the following caught our attention: Everyone must be able to experiment, learn, and iterate. Position, obedience, and tradition should hold no power. For innovation to flourish, measurement must rule.[2] As another example of this approach, here is what Timothy B. Lee, a journalist for Wired and the New York Times, had to say about Google’s culture of performance measurement: Rather than having intimate knowledge of what their subordinates are doing, Google executives rely on quantitative measurements to evaluate the company’s performance. The company keeps statistics on everything— page load times, downtime rates, click-through rates, etc—and works obsessively to improve these figures. The obsession with data-driven management extends even to the famous free snacks, which are chosen based on careful analysis of usage patterns and survey results.“[3] [2] http://highscalability.com/amazon-architecture [3] http://arstechnica.com/apple/news/2011/06/fourth-times-a-charm-why-icloud-faces-long- odds.ars
  • 13. 13 THE WEB GIANTSCULTURE / THE OBSESSION WITH PERFORMANCE MEASUREMENT The consequences of this modus operandi run deep. A number of pure players display in their offices the motto “In God we trust. Everything else, we test“. This is more than just a nod to Deming;[4] it is a profoundly pragmatic approach to the issues at hand. An extreme example of this trend, verging on caricature, is Google’s ‘Project Oxygen’: a team of internal statisticians combed through HR data collected from within – annual performance reviews, feedback surveys, nominations for top-manager awards. They distilled the essence of what makes a good manager down to 8 rules. Reading through them, any manager worthy of the name would be struck by how jaw-droppingly obvious it all seems. However, they backed their claims with hard, cold data,[5] and that made all the difference! What about me? The French are fond of modeling, and are often less pragmatic than their English-speaking counterparts. Indeed, we believe that this constant and quick feedback loop “hypothesis measurement decision“ should be an almost systematic reflex in the ISD world, and can be put into effect at a moment’s notice. The author of these lines still has painful memories of two four-hour meetings with ten people organized to find out if shifting requests to the service layer to http would have a “significant“ impact on performance. Ten working days would have largely sufficed for a developer to figure that out, at a much lower cost. OCTO consultants have also had the experience, several times over, of discovering that applications performed better when the cache that was used to improve performance was removed! The cure was therefore worse than the disease and its alleged efficacy never actually measured. Management runs the risk of falling into the trap of believing that analysis by “hard data“ is a done deal. It may be a good idea to regularly check that this is indeed the case, and especially that the information gathered is put to use in decision-making. [4] “In God we trust; all others must bring data“, W. Edward Deming. [5] Adam BRYANT, Google’s Quest to Build a Better Boss, The New York Times Company, March 12, 2011 : http://www.nytimes.com/2011/03/13/business/13hire.html 13
  • 14. 14 THE WEB GIANTS Nevertheless, it cannot be emphasized enough that an ecosystem fostering the application of said information makes up part of the recipe for success of Web Giants. Two other practices support the culture of performance metrics: Automated tests: it’s either red or green, no one can argue with that. As a result, this ensures that it is always the same thing being measured. Short cycles. To measure – and especially interpret – the data, one must be able to compare options, “all other things being equal“. This is crucial. We recently diagnosed the steps undertaken to improve the performance of an application. But about a dozen other optimizations were made to the next release. How then can efficient optimizations be distinguished from those that are counter- productive?
  • 17. 17 THE WEB GIANTS CULTURE / BUILD VS BUY Description One striking difference in the strategy of Web Giants as compared to more usual IT departments lies in their arbitrations around Build vs. Buy. The issue is as old as computers themselves: is it better to invest in designing software to best fit your needs or to use a software package complete with the capitalization and R&D of a publisher (or community) having had all necessary leisure to master the technology and business points? Most major firms have gone for the second option and have enshrined maximal software packaging among their guiding principles, based on the view that IT is not one of their pillar businesses so is better left to professionals. The major Web companies have tended to do the exact reverse. This makes sense given that IT is precisely their core business, and as such is too sensitive to be left in the hands of outsiders. The resulting divergences are thus coherent. Nonetheless, it is useful to push the analysis one step further because Web Giants have other motives too: first, being in control of the development process to ensure it is perfectly adjusted to meet their needs, and second, the cost of scaling up! These are concerns found in other IT departments, meaning that it can be a good idea to look very closely into your software package decisions. Finding balanced solutions On the first point, one of the built-in flaws of software packages is that they are designed for and by the needs which most arise for the publisher’s clients.[1] Your needs are thus only a small subset of what the software package is built to do. Adopting a software package by definition entails overkill, i.e. an overly complex solution not optimized for your [1] We will not insist here on the fact that you should not stray too far from the standard out-of-the-box software package as this can be (very) expensive in the long term, especially when there are new releases.
  • 18. 18 THE WEB GIANTS needs; and which has a price both in terms of execution and complexity, offsetting any savings made by not investing in the design and development of a complete application. This is particularly striking in the software package data model. Much of the model’s complexity stems from the fact that the package is optimized for interoperability (a highly standardized Conceptual Data Model, extension tables, low model expressiveness as it is a meta-model...). However the abstractions and the “hyper-genericity“ that this leads to in software design has an impact on processing performance.[2] Moreover, Web Giants have constraints in terms of volumes, transaction speed and the number of simultaneous users which push the envelopes of traditional architecture and which, in consequence, require fine-tuned optimizations determined by observed access-patterns. Such read- intensive transactions must not be optimized in the same way as others, where the stakes will be determined by I/O writing metrics. In short, to attain such results, you have to pop the hood and poke around in the engine, which is not something you will be able to do with a software package (all guarantees are revoked from the moment you fiddle with the innards). Because performance is an obsession for Web Giants, the overhead costs and low possibilities for adjustments to the software package make the latter quite simply unacceptable. Costs The second particularly critical point is of course the cost when scaling up. When the number of processors and servers increases, the costs rise very quickly, but not always in linear fashion, making some items more visible. And this is true of both business software packages and hardware. That is precisely one of the arguments which led LinkedIn to gradually replace their Oracle database by an in-house solution, Voldemort.[3] . In a similar vein, in 2010 we carried out a study on the main e-commerce [2] When it is not a case of a cumbersome interface. [3] Yassine Hinnach, Évolution de l’architecture de LinkedIn, enjeux techniques et Organizationnels, USI 2011: http://www.usievents.com/fr/conferences/8-paris-usi-2011/sessions/1007
  • 19. 19 THE WEB GIANTS CULTURE / BUILD VS BUY sites in France: at the time, eight of the ten largest sites (in terms of annual turnover) ran on platforms developed in-house and 2 used e-commerce software packages. Web Giants thus prefer Build to Buy. But not only. They also massively have recourse to Open source solutions (cf. “Developing open source“, p. 41). Linux and MySQL reign supreme in many firms. Development languages and technologies are almost all open source: very little .NET for example, but instead Java, Ruby, PHP, C(++), Python, Scala... And they do not hesitate to fork off from other projects: Google for example uses a largely modified Linux kernel.[4] This is also the case for one of the main worldwide Global Distribution Systems. Most technologies making a stir today in the world of high performance architecture are the result of developments carried out by Web Giants and then opened to the community. Cassandra, developed by Facebook, Hadoop and HBase inspired by Google and developed by Yahoo!, Voldemort by LinkedIn... A way, in fact, of combining the advantages of software perfectly tailored to your needs but nonetheless enhanced by improvements contributed by the development community, with, as an added bonus, a market trained to use the technologies you use. Coming back to the example of LinkedIn, many of their technologies are grounded in open source solutions: Zoie, a real time indexing and search system based on Lucene. Bobo, a faceted search library based on Lucene. Azkaban, a batch workflow job scheduler to manage Hadoop job dependencies. GLU, a deployment framework. [4] http://lwn.net/Articles/357658
  • 20. 20 THE WEB GIANTS How can I make it work for me? Does this mean I have to do away with software packages in my IT choices? Of course not, not for everything. Software packages can be the best solution, no one today would dream of reengineering a payroll system. However, ad hoc developments should be considered in certain cases: when the IT tool is key to the success of your business. Figure 1 lays out orientations in terms of strategy. The other context where specific developments can be the right choice is that of high performance: with companies turning to “full web solutions“, very few business software packages have the architecture to support the traffic intensity of some websites. As for infrastructure solutions, open source has become the norm: OSs and application servers foremost. Often also databases and message buses. Open source are ideally adapted to run the solutions of Web Giants. There is no doubt as to their capacity for performance and stability. One hurdle remains: reluctance on the part of CIOs to forgo the support found in software packages. And yet, when you look at what actually happens, when there are problems with the commercial technical platform, it is rarely support from the publisher, handsomely paid for, which provides the solution, but rather networks of specialists and help fora Unique, differentiating. Perceived as a commercial asset. Innovations and strategic assets Faster SPECIFIC SOFTWARE PACKAGE BPO[5] Resources Cheaper Common to all industry organizations. Perceived as a production asset. Common to all organizations. Perceived as a ressource. [5] Business Process Outsourcing.
  • 21. 21 THE WEB GIANTS CULTURE / BUILD VS BUY on the Internet. For application platforms of the database or message bus type, the answer is less clearcut because some commercial solutions include functionalities that you do not find in open source alternatives. However if you are sending an Oracle into regions where MySQL will not be able to follow, that means that you have very sophisticated needs... which is not the case for 80% of the contexts we encounter !
  • 24. 24 THE WEB GIANTS CULTURE / ENHANCING THE USER EXPERIENCE Description Performance: a must One conviction shared by Web Giants is that users’ judgment of performance is crucial. Performance is directly linked to visitor retention and loyalty. How users feel about a particular service is linked to the speed with which the graphic interface is displayed. Most people have no interest in software architecture, server power, or network latency due to web based services. All that matters is the impression of seamlessness. User-friendliness is no longer negotiable Web Giants have fully grasped this and speak of metrics in terms of “the bat of an eyelash“. In other words, it is a matter of fractions of seconds. Their measurements, carried out namely through A/B testing (cf. “A/B Testing“, p. 123), are very clear: Amazon : a 100ms. increase in latency means a 1% loss in sales. Google : a page taking more than 500ms to load loses 20% of traffic (pages visited). Yahoo! : more than 400ms to load means + 5 to 9 % abandons. Bing : over 1 second to load means a loss of 2.8% in advertising income. How are these performances attained? In keeping with the Device Agnostic pattern (cf. “Device Agnostic“, p. 143), Web Giants develop native interfaces, or Web interfaces, to always offer the best possible user experience. In both cases, performance as perceived by the user must be maximized.
  • 25. 25 THE WEB GIANTS Native applications With the iPhone, Apple reintroduced applications developed for a specific device (stopping short of the assembler however) to maximize perceived performance. Thus Java and Flash technologies are banished from the iPhone. The platform also uses visual artifacts: when an app is launched, it displays the view as seen when it was last charged by the system to strengthen the impression that it is instantaneous, with the actual app being loaded in the background. On Android, Java applications are executed on a virtual machine optimized for the platform. They can also be written in C to maximize performance. Generally speaking, there is a consensus around native development, especially on mobile platform: it must be as tightly linked as possible to the device. Multi-platform technologies such as Java ME, Flash and Silverlight do not directly enhance the user experience and are therefore put aside. Web applications Fully loading a Web page usually takes between 4 and 10 seconds (including graphics, JavaScript, Flash, etc.). It would seem that perceived slowness in display is generally linked for 5% to server processing, and for 95% to browser processing. Web Giants have therefore taken considerable care to optimize the display of Web pages. As illustration, here is a list of the main good practices which most agree optimize user perception: It is crucial to cache all static resources (graphics, CSS style sheets, JavaScript scripts, Flash animations, etc.) whenever possible. There are various HTTP cache technologies for this. It is important to become skillful at optimizing the life-cycle of the resources in the cache. It is also advisable to use a cache network, or Content Delivery Network (CDN) to bring the resources as close as possible to the end user to reduce network latency. We highly recommend that you have cache servers in the countries where the majority of your users live.
  • 26. 26 CULTURE / ENHANCING THE USER EXPERIENCE Downloading in background is a way of masking sluggishness in the display of various elements on the page. One thing many do is to use sprites: the principle is to aggregate images in a single file to limit the amount of data to be loaded; they can then be selected on the fly by the navigator (see the Gmail example below). Having recourse to multiple domain names is a way to maximize parallelization in simultaneous resource loading by the navigator. One must bear in mind that navigators are subjected to a maximum number of simultaneous queries for a same domain. Yahoo.fr for example loads their images from l.yimg.com. Placing JavaScript resources at the very end of the page to ensure that graphics appear as quickly as possible. Using tools to minimize, i.e. removing from the code (JavaScript, HTML, etc.) all characters (enter, comments, etc.) serving to read the code but not to execute it, and to shorten as much as possible function names. Compacting the various source code files such as JavaScript in a single file whenever possible. Who makes it work for them? There are many examples of such practices among Web Giants, e.g. Google, Gmail, Viadeo, Github, Amazon, Yahoo!... References among Web Giants Google has the most extensive distributed cache network of all Web Giants: the search giant is said to have machines in all major cities, and even a private global network, although corroboration is difficult to come by. Google Search pushes the real-time user experience to the limits with its “Instant Search“ which loads search results as you type your query. This function stems from formidable technical skill and has aroused the interest of much of the architect community.
  • 27. 27 THE WEB GIANTS Gmail images are reduced to a strict minimum (two sprite images shown on Figure 1), and the site makes intensive cache use and loads JavaScript in the background Figure 1: Gmail sprite images. France Sites using or having used the content delivery network Akamai: cite-sciences.fr lemonde.fr allocine.com urbandive.com How can I make it work for me? The consequences of display latency are the same with in-house applications within any IT department: users who get fed up with the application and stop using it. This to say that this is a pattern which perfectly applies to your own business Sources • Eric Daspet, “Performance des applications Web, quoi faire et pourquoi ?“ USI 2011 (French only): > http://www.usievents.com/fr/conferences/10-casablanca-usi-2011/ sessions/997-performance-des-applications-web-quoi-faire-et-pourquoi • Articles on Google Instant Search: > http://highscalability.com/blog/2010/9/9/how-did-google-instant- become-faster-with-5-7x-more-results.html > http://googleblog.blogspot.com/2010/09/google-instant-behind- scenes.html Editor’s note: By definition, sprites are designed for screen display, we are unable to provide any better definition for the printing of this example. Thank you for your understanding.
  • 29. 29 THE WEB GIANTS CULTURE / CODE CRAFTERS Description Today Web Giants are there to remind us that a career as a developer can be just as prestigious as manager or consultant. Indeed, some of the most striking successes of Silicon Valley have originated with one or several visionary geeks who are passionate about quality code. When these companies’ products gain in visibility, satisfying an increasing number of users means hugging the virtuous cycle in development quality, without which success can vanish as quickly as it came. Which is why a software development culture is so important to Web Giants, based on a few key principles: attracting and recruiting the best programmers, investing in developer training and allowing them more independence, gaining their loyalty through workplace attractiveness and payscale, being intransigent as to the quality of software development - because quality is non-negotiable. Implementation The first challenge the Giants face is thus recruiting the best programmers. They have become masters at the art, which is trickier than it might at first appear. One test which is often used by the majors is to have the candidates write code. A test Facebook uses is the FizzBuzz. This exercise, inspired by a drinking game which some of you might recognize, consists in displaying the first 1000 prime numbers, except for multiples of 3 or 5, where “Fizz“ or “Buzz“ respectively must be displayed, and except for multiples of 3 and 5, where “FizzBuzz“ must be displayed. This little programming exercise weeds out 99.5% of the candidates. Similarly, to be hired by Google, between four and nine technical interviews are necessary.
  • 30. 30 THE WEB GIANTS Salary is obviously to be taken into account. To have very good developers, you have to be ready to pay the price. At Facebook, Senior Software Engineers are among the best paid employees. Once programmers have joined your firm, the second challenge is to favor their development, fulfillment, and to enrich their skills. In such companies, programmers are not considered code laborers to be watched over by a manager but instead as key players. The Google model, which encourages developers to devote 20% of their time to R&D projects, is often cited as an example. This practice can give rise to contributions to open-source projects, which provide many benefits to the company (cf. “Open Source Contribution“, p. 41). On the Netflix blog for example, they mention their numerous open source initiatives, namely on Zookeeper and Cassandra. The benefit to Netflix is twofold: its developers gain in notoriety outside the company, while at the same time developing the Netflix platform. Another key element in developer loyalty is the working conditions. The internet provides ample descriptions of the extent to which Web Giants are willing to go to provide a pleasant workplace. The conditions are strikingly different from what one finds in most Tech companies. But that is not all! Netflix, again, has built a culture which strongly focuses on its employees’ autonomy and responsibility. More recently, Valve, a video game publisher, sparked a buzz among developers when they published their Handbook, which describes a work culture which is highly demanding but also propitious to personal fulfillment. 37 signals, lastly, with their book Getting Real, lays out their very open practices, often the opposite of what one generally finds in such organizations. In addition to efforts deployed in recruiting and holding on to programmers, there is also a strong culture of code and software quality. It is this culture that creates the foundations for moving and adapting quickly, all while managing mammoth technological platforms where performance and robustness are crucial. Web Giants are very close to the Software Craftsmanship[1] movement, which promotes a set of values and practices aiming to guarantee top-quality software and to provide as much value as possible to end-users. Within this movement, Google and GitHub have not hesitated to share their coding guidelines[2] . [1] http://manifesto.softwarecraftsmanship.org [2] http://code.google.com/p/google-styleguide/ and https://github.com/styleguide
  • 31. 31 THE WEB GIANTS How can I make it work for me? Recruiting It is important to implement very solid recruitment processes when hiring your programmers. After a first interview to get a sense of the person you wish to recruit, it is essential to have the person code. You can propose a few technical exercises to assess the candidate’s expertise, but it is even more interesting to have them code as a pair with one of your developers, to see whether there is good feeling around the project. You can also ask programmers to show their own code, especially what they are most proud of - or most ashamed of. More than the code itself, discussions around coding will bring in a wealth of information on the candidate. Also, did they put their code on GitHub? Do they take part in open source projects? If so, you will have representative samples of the code they can produce. Quality: Offer your developers the context which will allow them to continue producing top-quality software (since that is non-negotiable). Leave them time to write unit tests, to set up the development build you will need for Continuous Deployment (cf. “Continuous Deployment“, p. 105), to work in pairs, to hold design workshops in their business domain, to prototype. The practice which is known to have the most impact on quality is peer code reviewing. This happens all too rarely in our sector. R&D: Giving your developers the chance to participate in R&D projects in addition to their work is a practice which can be highly profitable. It can generate innovation, contribute to project improvement and, in the case of Open Source, increase your company’s attractiveness for developers. It is also simply a source of motivation for this often neglected group. More and more firms are adopting the principles of Hackathons, popularized by Facebook, where the principle consists in coding, in one or two days, working software. CULTURE / CODE CRAFTERS
  • 32. 32 THE WEB GIANTS Training: Training can be externalized but you can also profit from knowledge sharing among in-house developers by e.g. organizing group programming workshops, commonly called “Dojo“.[3] Developers can gather for half a day, around a video projector, to share knowledge and together learn about specific technical issues. It is also a way to share developer practices and, within a team, to align with programming standards. Lastly, working on open source projects is also a way of learning about new technologies. Workplace: Where and how you work are important! Allowing independence, promoting openness and transparency, hailing mistakes and keeping a manageable rhythm are all paying practices in the long term. Associated patterns Pattern “Pizza Teams“, p. 59. Pattern “DevOps“, p. 65. Pattern “Continuous Deployment“, p. 105. Sources • Company culture at Netflix: > http://www.slideshare.net/reed2001/culture-1798664 • What every good programmer should know: > http://www.slideshare.net/petegoodliffe/becoming-a-better- programmer • List of all the programmer positions currently open at Facebook: http://www.facebook.com/careers/teams/engineering • The highest salary at Facebook? Senior Software Engineer: http://www.businessinsider.com/the-highest-paying-jobs-at-facebook- ranked-2012-5?op=1 [3] http://codingdojo.org/cgi-bin/wiki.pl?WhatIsCodingDojo
  • 33. 33 THE WEB GIANTS CULTURE / CODE CRAFTERS • GitHub programming guidelines: https://github.com/styleguide • How GitHub grows: http://zachholman.com/talk/scaling-github • Open source contributions from Netflix: http://techblog.netflix.com/2012/07/open-source-at-netflix-by-ruslan. html • The FizzBuzz test: http://c2.com/cgi/wiki?FizzBuzzTest • Getting Real: http://gettingreal.37signals.com/GR_fra.php • The Software Craftsmanship manifesto: http://manifesto.softwarecraftsmanship.org • The Google blog on tests: http://googletesting.blogspot.fr • The Happy Manifesto: http://www.happy.co.uk/wp-content/uploads/Happy-Manifesto1.pdf
  • 34. 34 THE WEB GIANTS Open Source Contribution
  • 35. 35 THE WEB GIANTS Description Why is it Web Giants such as Facebook, Google and Twitter do so much to develop Open Source? A technological edge is a key to conquering the Web. Whether it be to stand out from the competition by launching new services (remember when Gmail came out with all its storage space at a time when Hotmail was lording it?) or more practically to overcome inherent constraints such as the growth challenge linked to the expansion of their user base. On numerous occasions, Web Giants have pulled through by inventing new technologies. If so, one would think that their technological mastery, and the asset which is the code, would be carefully shielded from prying eyes, whereas in fact the widely shared pattern one finds is that Web Giants are not only major consumers of open source technology, they are also the main contributors. The pattern “developing open source“ consists of making public a software tool (library, framework...) developed and used in-house. The code is made available on a public server such as GitHub, with a free license of the Apache type for example, authorizing its use and adaptation by other companies. In this way, the code is potentially open to development by the entire world. Moreover, open source applications are traditionally accompanied by much publicity on the web and during programming conferences. Who makes it work for them? There are many examples. Among the most representative is Facebook and its Cassandra database, built to manage massive quantities of data distributed over several servers. It is interesting to note that among current users of Cassandra, one finds other Web Giants, e.g. Twitter and Digg, whereas Facebook has abandoned Cassandra in favor of another open source storage solution - HBase - launched by the company Powerset. With the NoSQL movement, the new foundations of the Web are today massively based on the technologies of the Giants.
  • 36. 36 THE WEB GIANTS Facebook has furthermore opened several frameworks up to the community, such as its HipHop engine which compiles PHP in C++, Thrift, a multilanguage development service, and Open Compute, an Open hardware initiative which aims to optimize how datacenters function. But Facebook is not alone. Google has done the same with its user interface framework GWT, used namely in Adword. Another example is the Tesseract Optical Character Recognition (OCR) tool initially developed by HP and then by Google, which opened it up to the community a few years later. Lastly, one cannot name Google without citing Android, its open source operating system for mobile devices, not to mention their numerous scientific publications on storing and processing massive quantities of data. We are referring more particularly to their papers on Big Table and Map Reduce which inspired the Hadoop project. The list could go on and on, so we will end with first Twitter and its CSS framework and very trendy responsive design, called Bootstrap, and the excellent Ruby On Rails extracted from the Basecamp project management software opened up to the community by 37signals. Why does it work? Putting aside ideological considerations, we propose to explore various advantages to be drawn from developing open software. Open and free does not necessarily equate with price and profit wars. In fact, from one angle, opening up software is a way of cutting competition off in the bud for specific technologies. Contributing to Open Source is a way of redefining a given technology sector while ensuring sway over the best available solution. For a long time, Google was the main sponsor of the Mozilla Foundation and its flagship project Firefox, to the tune of 80%. A way to diversify to counter Microsoft. Let us come back to our analysis of the three advantages. [1] Interface Homme Machine. CULTURE / OPEN SOURCE CONTRIBUTION
  • 37. 37 THE WEB GIANTS Promoting the brand By opening cutting-edge technology up to the community, Web Giants position themselves as leaders, pioneers. It implicitly communicates a spiritofinnovationreigningintheirhalls,aconstantquestforimprovements. They show themselves as being able to solve big problems, masters of technological prowess. Delivering a successful Open Source framework says that you solved a common problem faster or better than anyone else. And that, in a way, the problem is now behind you. Done and gone, you’re already moving onto the next. One step ahead of the game. To share a framework is to make a strong statement, to reinforce the brand. It is a way to communicate an implicit and primal message: “We are the best, don’t you worry“ And then, to avoid being seen as the new Big Brother, one can’t but help feeling that the message also implied is: “We’re open, we’re good guys, fear not“.[2] Attracting - and keeping - the best This is an essential aspect which can be fostered by an open source approach. Because “displaying your code“ means showing part of your DNA, your way of thinking, of solving problems - show me your code and I will tell you who you are. It is the natural way of publicizing what exactly goes on in your company: the expertise of your programmers, your quality standards, what your teams work on day by day... A good means to attract “compatible“ coders who would have already been following the projects led by your company. Developing Open Source thus helps you to spot the most dedicated, competent and motivated programmers, and when you hire them you are already sure they will easily integrate your ecosystem. In a manner of speaking, Open Source is like a huge trial period, open to all. [2] Google’s motto: “Don’t be evil“
  • 38. 38 THE WEB GIANTS Attracting the best geeks is one thing, hanging on to them is another. On this point, Open Source can be a great way to offer your company's best programmers a showcase open to the whole world. That way they can show their brilliance, within their company and beyond. Promoting Open Source bolsters your programmers' resumes. It takes into account the Personal Branding needs of your staff, while keeping them happy at work. All programmers want to work in a place where programming is important, within an environment which offers a career path for software engineers. Spoken as a programmer. Improving quality Simply "thinking open source" is already a leap forward in quality: opening up code - a framework - to the community first entails defining its contours, naming it, describing the framework and its aim. That alone is a significant step towards improving the quality of your software because it inevitably leads to breaking it up into modules, giving it structure. It also makes it easier to reuse the code in-house. It defines accountability within the code and even within teams. It goes without saying that programmers who are aware that their code will be checked (not to mention read by programmers the world over) will think twice before committing an untested method or a hastily assembled piece of code. Beyond making programmers more responsible, feedback from peers outside the company is always useful. How can I make it work for me? When properly used, Open Source can be an intelligent way not only to structure your R&D but also to assess programmer performance. The goal of this paper was to explore the various advantages offered by opening up certain technologies. If you are not quite up to making the jump culturally speaking, or if your IS is not ready yet, it can nonetheless be useful to test the idea by taking a few simple-to-implement actions. Depending on the size of your company, launching your very first Open Source project can unfortunately be met with general indifference. We do not all have the powers of communication of Facebook. Beginning by CULTURE / OPEN SOURCE CONTRIBUTION
  • 39. 39 THE WEB GIANTS contributing to Open Source projects already underway can be a good initial step for testing the culture within your teams. Another action which works towards the three advantages laid out here is to formalize your programming guidelines and publish them on the web, as Google and GitHub have done. Another possibility is to encourage your programmers to open a development blog where they can discuss the main issues they have come up against. The Instagram Engineering blog, hosted on Tumblr, can be a very good source of inspiration. Sources • The Facebook developer portal, Open Source projects: http://developers.facebook.com/opensource • Open Source projects released by Google: http://code.google.com/opensource/projects.html • The Twitter developer portal, Open Source projects: http://dev.twitter.com/opensource/projects • Instagram Engineering Blog: http://instagram-engineering.tumblr.com • The rules for writing GitHub code: http://github.com/styleguide • A question on Quora, "Why would a big company do open-source projects?": http://www.quora.com/Open-Source/Why-would-a-big-company-do-open-source-projects
  • 42. 42 THE WEB GIANTS CULTURE / SHARING ECONOMY PLATFORMS Description The principles at work in the platforms of the sharing economy (exponential business platforms) are one of the keys to the successes of the web giants and other startups valued at $1 billion ("unicorns") such as BlablaCar, Cloudera, Social Finance, or over $10 billion ("decacorns") such as Uber, AirBnB, Snapchat, Flipkart (List and valuation of the Uni/Deca-corns). These newcomers are disrupting existing ecosystems, inventing new ones, wiping out others. And yet "Businesses never die, only business models evolve" (To learn more, see: Philippe Silberzahn, "Relevez le défi de l'innovation de rupture"). Concerns over the risks of disintermediation are legitimate given that digital technology has led to the development of numerous highly successful "exponential business platforms" (see the article by Maurice Levy, "Se faire ubériser"). The article below begins with a recap of what is common to these platforms and then explores the main fundamentals necessary for building or becoming an exponential business platform. The wonderful world of the "Sharing economy" There is a continuous stream of newcomers knocking at the door, progressively transforming many sectors of the economy, driving them towards a so-called "collaborative" economy. Among other goals, this approach strives to develop a new type of relation: Consumer-to-Consumer (C2C). This is true e.g. in the world of consumer loans, where the company LendingHome (Presentation of LendingHome) is based on peer-2-peer lending. Another area of interest is blockchain technology, with the decentralisation and "peer-2-peer-isation" of money through Bitcoin! What is most striking is that this type of relation can have an impact in unexpected places such as personalised urban car services (e.g. Luxe and Drop Don't Park) and movers (Lugg as an "Uber/Lyft for moving"). Business platforms such as these favor peer-2-peer relations. They have achieved exponential growth by leveraging the multitudes (For further information, see: Nicolas Colin and Henri Verdier, L'âge de la multitude: Entreprendre et gouverner après la révolution numérique). Such models make it possible for very small structures to grow very quickly by generating revenues per employee which can be from 100 to 1000 times higher than
  • 43. 43 THE WEB GIANTS in businesses working in the same sector but which are much larger. The fundamental question is then to know what has enabled some of them to become hits and to grow their popularity, in terms of both community and revenues. What are the ingredients in the mix, and how does one become so rapidly successful? At this stage, the contextual elements and common ground we discern are: An often highly regulated market, where these platforms appear and then develop by providing new solutions which break away from regulations (for example the obligation for hotels to make at least 10% of their rooms disability friendly, which does not apply to individuals using the AirBnB system). An as-yet unmet need, on the supply or demand side, which makes it possible to earn a living or to generate additional revenue for a better quality of life (cf. AirBnB's 2015 communication campaign on the subject), or at the least to share costs (Blablacar). This point in particular raises crucial questions as to the very notion of work, its regulation and the taxation of platforms. Strong friction around the client and citizen experience, where the market has yet to provide a response (such as valet parking services in large cities around the world where parking is completely saturated). A deliberate strategy of not investing in material assets but rather efficiently embracing the business of creating links between people. Given this understanding of the context, the 5 main principles we propose to become an exponential business platform are: Develop your "network lock-in effect". Pair up algorithms with the user experience. Develop trust. Think user and be rigorous in execution. Carefully choose your target when you launch platform experiments.
  • 44. 44 THE WEB GIANTS "Network lock-in effect" The more supply and demand grow and come together, the more indispensable your platform becomes. Indispensable because in the end that is where the best offers are to be found, the best deals, where your friends are. There is an inflection point where the network of suppliers and users becomes the main asset, the central pillar. Attracting new users is no longer the principal preoccupation. This asset makes it possible to become the reference platform for your segment. This growth can provide a monopoly over your use case, especially if there are exclusive deals that can be obtained through offers valid on your platform only. It can then extend to offers which build upon the first (for example Uber's position as an urban mobility platform has led them to diversify into a meal delivery service for restaurants). This is one of the elements which were very quickly theorised in the Lean Startup approach: the virality coefficient. The perfect match: User eXperience & Algorithms What is crucial in the platform is setting up the perfect relation between supply and demand, speed in making connections in time and/or space, lower prices as compared to traditional systems, and even providing services that weren't possible before. For some, matching algorithms are the core of their operations, delivering on their daily promise of offering suggestions and possibilities for relevant connections within a few microseconds. The perfect match is a fine-tuned mix between stellar research into the user experience (all the way to the swipe!), often using a mobile-first approach to explore and offer services, based on advanced algorithms to expose relevant associations. A telling example is the use of the "swipe" as a uniquely tailored user experience for fast browsing, as in the dating app Tinder. CULTURE / SHARING ECONOMY PLATFORMS
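To make the idea of a matching algorithm more concrete, here is a minimal, purely illustrative sketch in Python: it ranks supply-side offers for a single demand-side request by combining proximity, reputation and price into one score. The Offer fields, the weights and the match_score formula are assumptions made up for this example, not the algorithm of any actual platform, which would rely on far more signals (and usually machine learning).

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str
    distance_km: float   # how far the provider is from the requester
    rating: float        # community rating, 0..5
    price: float         # quoted price

def match_score(offer: Offer, max_price: float) -> float:
    """Blend proximity, reputation and price into one relevance score.
    Weights are illustrative only."""
    proximity = 1.0 / (1.0 + offer.distance_km)            # closer is better
    reputation = offer.rating / 5.0                        # normalise to 0..1
    affordability = max(0.0, 1.0 - offer.price / max_price)
    return 0.5 * proximity + 0.3 * reputation + 0.2 * affordability

def best_matches(offers: list[Offer], max_price: float, top_n: int = 3) -> list[Offer]:
    """Rank the supply side for one demand-side request."""
    return sorted(offers, key=lambda o: match_score(o, max_price), reverse=True)[:top_n]

if __name__ == "__main__":
    offers = [
        Offer("driver_a", distance_km=0.4, rating=4.8, price=12.0),
        Offer("driver_b", distance_km=2.5, rating=4.9, price=10.0),
        Offer("driver_c", distance_km=0.2, rating=3.9, price=15.0),
    ]
    for o in best_matches(offers, max_price=20.0):
        print(o.provider, round(match_score(o, 20.0), 3))
```

The point is less the formula than the fact that such scoring has to run almost instantly and be paired with a frictionless interface; that combination is what the text above calls the perfect match.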
  • 45. 45 THE WEB GIANTS Trust & security To get beyond the early adopters and reach the market majority, two elements are critical to the client experience: trust in the platform, and trust towards the other platform users (both consumers and providers). Who has not experienced stress when reserving one's first AirBnB? Who has not wondered whether Uber would actually be there? This level of trust conveyed by the platform and platform users is so important that it has been one of the key levers, as for the BlablaCar ride-sharing platform, which thrived once transactions were handled by the platform itself. What happens to the confidential data provided to the platform? You may remember the recent hack of personal data on the "Ashley Madison" site, affecting 37 million platform users who wanted total discretion (Revelations around the hacking of the Ashley Madison sites). Security is thus key to protecting platform transactions, safeguarding private data and reassuring users. Think user & excel in execution Above all it is about realising that what the market and the clients want is not to be found in marketing plans, sales forecasts and key functionalities. The main questions to ask revolve around the triplet Client / Problem / Solution: Do I really have a problem that is worth solving? Is my solution the right one for my client? Will my client buy it? For how much? Use whatever you can to check your hypotheses: interviews, market studies, prototypes... To succeed, these platforms aim to reach production very quickly, iterating and improving while their competition is still exploring their business plan. It is then a ferocious race between pioneers and copycats, because in this type of race "winner takes all" (For further reading, see The Second Machine Age, Erik Brynjolfsson and Andrew McAfee).
  • 46. 46 THE WEB GIANTS Then excellence in execution becomes the other pillar. This operational excellence covers the platform itself and the users it "hosts": active users, the quality of the goods offered, the quality of ratings with numerous well-assessed offers, and offers which are mediated by the platform (comments, satisfaction surveys...). One may note in particular the example of AirBnB on the theme of excellence in execution beyond software, where the quality of the lodging descriptions as well as beautiful photos were a strong differentiator compared to the competition of the time, Craigslist (A few words on the quality of the photos at AirBnB). Critical market size Critical market size is one of the elements which make it possible to rapidly reach a sufficiently strong network effect (speed in reaching a critical size is fundamental to not being overrun by copycats). Critical market size is made up of two aspects: Selecting the primary territories for deployment, most often in cities or mega-cities, Ensuring deployment in other cities in the area, when possible in standardized regulatory contexts. You must therefore choose cities particularly concerned by your platform's value proposition, where the number of early adopters is high enough to quickly gather learnings. Mega-cities in the Americas, Europe and Asia are therefore choice targets for experimental deployments. Lastly, during the generalisation phase, it is no surprise to see stakeholders deploying massively in the USA (a market which represents 350 million inhabitants, with standardised tax and regulatory environments, despite state and federal differences) or in China (where the Web giants are among the most impressive players, such as Alibaba, Tencent and Weibo), as well as Russia. CULTURE / SHARING ECONOMY PLATFORMS
  • 47. 47 THE WEB GIANTS In Europe, cities such as Paris, Barcelona, London, Berlin, etc. are often prime choices for businesses. What makes it work for them? As examined above, there are many ingredients for exponentially scalable organizations and business models built on the platform model: strong possibilities for employees to self-organise, the User eXperience, continuous experimentation... algorithms (notably intelligent matchmaking), and leveraging one's community. What about me? For IT and marketing departments, you can begin your thinking by exploring digital innovations (looking for new uses) that fit in with your business culture (based e.g. on Design Thinking). In certain domains, this approach can give you access to new markets or to disruption before the competition. A recent example is that of Accor, which has entered the market of independent hotels through its acquisition of Fastbooking (Accor gets its hands on Fastbooking). Still in the area of self-disruption, two main strategies are coming to the fore. The first consists in coming back into the game without shouldering all of the risk, through partnerships or capital investments via incubators. The other strategy, more ambitious and therefore riskier, is to take inspiration from these new approaches to transform from within. It is then important to examine whether some of these processes can be opened up to transform them into an open platform, thereby leveraging the multitudes. In the distribution sector for example, the question of positioning and opening up various strategic processes is raised: is it a good idea to turn your supply chain into a peer-2-peer platform so that SMEs can become consumers and not only providers in the supply chain? Are pharmacies next on the list of programmed uberisations through stakeholders such as 1001pharmacie.com? In the medical domain, Doctolib.com has just raised €18 million to fund its development (Doctolib raises funds)...
  • 48. 48 THE WEB GIANTS Associated patterns Enhancing the user experience A/B Testing Feature Flipping Lean Startup Sources • List of unicorns: https://www.cbinsights.com/research-unicorn-companies • Philippe Silberzahn, "Relevez le défi de l'innovation de rupture", Pearson • Article by Maurice Levy, "Tout le monde a peur de se faire ubériser": http://www.latribune.fr/technos-medias/20141217tribd1e82ceae/tout-le-monde-a-peur-de-se-faire-uberiser-maurice-levy.html • LendingHome presented on "C'est pas mon idée": http://cestpasmonidee.blogspot.fr/2015/09/lendinghome-part-lassaut-du-credit.html • Nicolas Colin and Henri Verdier, "L'âge de la multitude", 2nd edition • Ashley Madison hacking: http://www.slate.fr/story/104559/ashley-madison-site-rencontres-extraconjugales-hack-adultere • The Second Machine Age, Erik Brynjolfsson and Andrew McAfee • Quality of AirBnB photos: https://growthhackers.com/growth-studies/airbnb • Accor met la main sur Fastbooking: http://www.lesechos.fr/17/04/2015/lesechos.fr/02115417027_accor-met-la-main-sur-fastbooking.htm • Doctolib raises €18M: http://www.zdnet.fr/actualites/doctolib-nouvelle-levee-de-fonds-a-18-millions-d-euros-39826390.htm CULTURE / SHARING ECONOMY PLATFORMS
  • 53. 53 THE WEB GIANTS ORGANIZATION / PIZZA TEAMS Description What is the right size for a team to develop great software? Organizational studies have been investigating the issue of team size for several years now. Although answers differ and seem to depend on various criteria such as the nature of the tasks to be carried out, the average skill level and team diversity, there is consensus on a size of between 5 and 15 members.[1]-[5] Any fewer than 5 and the team is vulnerable to outside events and lacks creativity. Any more than 12 and communication is less efficient, coherency is lost, there is an increase in free-riding and in power struggles, and the team's performance drops rapidly the more members there are. This is obviously also true in IT. The firm Quantitative Software Management, specialized in the preservation and analysis of metrics from IT projects, has published some interesting statistics. If you like numbers, I highly recommend their website; it is chock-full of information! Based on a sample of 491 projects, QSM measured a loss of productivity and heightened variability with an increase in team size, with a quite clear break once one reaches 7 people. In correlation, average project duration increases and development effort skyrockets once one goes beyond 15.[6] In a nutshell: if you want speed and quality, cut your team size! Why are we mentioning such matters in this work devoted to Web Giants? Very simply because they are particularly aware of the importance of team size for project success, and deploy techniques daily to keep size down. [1] http://knowledge.wharton.upenn.edu/article.cfm?articleid=1501 [2] http://www.projectsatwork.com/article.cfm?ID=227526 [3] http://www.teambuildingportal.com/articles/systems/teamperformance-teamsize [4] http://math.arizona.edu/~lega/485-585/Group_Dynamics_RV.pdf [5] http://www.articlesnatch.com/Article/What-Project-Team-Size-Is-Best-/589717 [6] http://www.qsm.com/process_improvement_01.html
  • 54. 54 THE WEB GIANTS In fact the title of this chapter is inspired by the name Amazon gave to this practice:[7] if your team can't be fed on two pizzas, then cut people. Granted, these are American-size pizzas, but that still means about 8 people. Werner Vogels (Amazon VP and CTO) drove the point home with the following quote, which could almost be by Nietzsche: Small teams are holy. But Amazon is not alone, far from it. To illustrate the importance that team dynamics have for Web Giants: Google hired Evan Wittenberg as manager of Global Leadership Development; the former academic was known, in part, for his work on team size. The same discipline is applied at Yahoo!, which limits its product teams to between 5 and 10 people in their first year. As for Viadeo, they have adopted the French pizza-size approach with teams of 5-6 people. In the field of startups, Instagram, Dropbox, Evernote... are known for having kept their development teams as small as possible for as long as possible. How can I make it work for me? A small, agile team will always be more efficient than a big lazy team; such is the conclusion which could be drawn from the accumulated literature on team size. In the end, you only need to remember it to apply it... and to steer away from linear logic such as: "to go twice as fast, all you need is double the people!" Nothing could be more wrong! According to these studies, a team exceeding 15 people should set alarm bells ringing.[8]-[10] [7] http://www.fastcompany.com/magazine/85/bezos_4.html [8] https://speakerdeck.com/u/searls/p/the-mythical-team-month [9] http://www.3circlepartners.com/news/team-size-matters [10] http://37signals.com/svn/posts/995-if-youre-working-in-a-big-group-youre-fighting-human-nature
  • 55. 55 THE WEB GIANTS ORGANIZATION / PIZZA TEAMS You then have two options: fight tooth and nail to prevent the team from growing and, if that fails, adopt the second solution: split the team up into smaller teams. But think very carefully before you do so and bear in mind that a team is a group of people motivated around a common goal. Which is the subject of the following chapter, "Feature Teams".
  • 57. 57 THE WEB GIANTS ORGANIZATION / FEATURE TEAMS Description In the preceding chapter, we saw that Web Giants pay careful attention to the size of their teams. That is not all they pay attention to concerning teams however: they also often organize their teams around functionalities, known as “feature teams“. A small and versatile team is a key to moving swiftly, and most Web Giants resist multiplying the number of teams devoted to a single product as much as possible. However, when a product is a hit, a dozen people no longer suffice for the scale up. Even in such a case, team size must remain small to ensure coherence, therefore it is the number of teams which must be increased. This raises the question of how to delimit the perimeters of each. There are two main options:[1] Segmenting into “technological“ layers. Segmenting according to “functionality thread“. By “functionality thread“ we mean being in a position to deliver independent functionalities from beginning to end, to provide a service to the end user. In contrast, one can also divide teams along technological layers, with one team per type of technology: typically, the presentation layer, business layer, horizontal foundations, database... This is generally the organization structure adopted in Information Departments, each group working within its own specialty. However, whenever Time To Market becomes crucial, organization into technological layers, also known as Component Teams, begins to show its limitations. This is because Time To Market crunches often necessitate Agile or Lean approaches. This means specification, development, and production with the shortest possible cycles, if not on the fly. [1] There are in truth other possible groupings, e.g. by release, geographic area, user segment or product family. But that would be beyond the scope of the work here; some of the options are dead ends, others can be assimilated to functionality thread divisions.
  • 58. 58 THE WEB GIANTS The trouble with Component Teams is that you often find yourself with bottlenecks. Let us take the example laid out in Figure 1. Figure 1. Four component teams, one per technological layer (Front, Back, Exchange, Base), each receiving work for functionalities 1, 2, 4 and 5. The red arrows indicate the first problem. The most important functionalities (functionality 1) are swamping the Front team. The other teams are left producing marginal elements for these functionalities. But nothing can be released until Team 1 has finished. There is not much the other teams can do to help (not sharing the same specialty as Team 1), so they are left twiddling their thumbs or stocking less important functionalities (and don't forget that in Lean, stocks are bad...). There's worse. Functionality 4 needs all four teams to work together. The trouble is that, in Agile mode, each team individually carries out the detailed analysis. Whereas here, what is needed is a detailed impact analysis across the 4 teams. This means that the detailed analysis has to take place upstream, which is precisely what Agile strives to avoid. Similarly, downstream, the work of the 4 teams has to be synchronized for testing, which means waiting for laggards. To limit the impact, task priorities have to be defined for each team in a centralized manner. And little by little, you find yourself with a scheduling department striving to best synchronize all the work but leaving no room for team autonomy.
  • 59. 59 THE WEB GIANTS ORGANIZATION / FEATURE TEAMS In short, you have a waterfall effect upstream in analysis and planning and a waterfall effect downstream in testing and deploying to production. This type of dynamic is very well described in the work of Craig Larman and Bas Vodde, Scaling Lean and Agile. Feature teams correct these problems: with each team working on a coherent functional subset - and doing so without having to worry about the technological layers - they are capable of delivering value to the end client at any moment, with little need to call on other teams. This entails having all the skills necessary for producing functionalities in each team, which can mean (among others) an architect, an interface specialist, a Web developer, a Java developer, a database expert, and, yes, even someone to run it... because when taken to the extreme, you end up with the DevOps "you build it, you run it", as described in the next chapter (cf. "DevOps", p. 71). But then how do you ensure the technological coherence of the product, if each Java expert in each feature team takes the decisions within their perimeter? This issue is addressed by the principle of the community of practice. Peers from each specialty get together at regular intervals to exchange views on their practices and to agree on technological strategies for the product being produced. Feature Teams have the added advantage that teams quickly build up business knowledge, which in turn fosters developer involvement in the quality of the final product. Practicing the method is of course messier than what we've laid out here: defining perimeters is no easy task, team dynamics can be complicated, communities of practice must be fostered... Despite the challenges, this organization method brings true benefits as compared to hierarchical structures, and is much more effective and agile. To come back to our Web Giants, this is the type of organization they tend to favor. Facebook in particular, which communicates a lot around its culture, focuses on teams which bring together all the necessary talents to create a functionality.[2] [2] http://www.time.com/time/specials/packages/article/0,28804,2036683_2037109_2037111,00.html
  • 60. 60 THE WEB GIANTS It is also the type of structure that Viadeo, Yahoo! and Microsoft[3] have chosen to develop their products. How can I make it work for me? Web Giants are not alone in applying the principles of Feature Teams. It is an approach also often adopted by software publishers. Moreover, Agile is spreading throughout our Information Departments and is starting to be applied to bigger and bigger projects. Once your project reaches a certain size (3-4 teams), Feature Teams are the most effective answer, to the point where some Information Departments naturally turn to that type of pattern.[4] [3] Michael A. Cusumano and Richard W. Selby. 1997. How Microsoft builds software. Commun. ACM 40, 6 (June 1997), 53-61: http://doi.acm.org/10.1145/255656.255698 [4] http://blog.octo.com/compte-rendu-du-petit-dejeuner-organise-par-octo-et-strator-retour-dexperience-lagilite-a-grande-echelle (French only).
  • 63. 63 THE WEB GIANTS ORGANIZATION / DEVOPS Description The "DevOps" method is a call to rethink the divisions common in our organizations, separating development on one hand, i.e. those who write application code ("Devs"), and operations on the other, i.e. those who deploy and operate the applications ("Ops"). Such thoughts are certainly as old as CIOs but find renewed life thanks notably to two groups. First there are the agilists, who have minimized constraints on the development side and are now capable of providing highly valued software to their clients on a much more frequent basis. Then there are the operations experts or "Prod" managers, notably at the Web Giants (Amazon, Facebook, LinkedIn...), who have shared their experiences of how they have managed the Dev vs. Ops divide. Beyond the intellectual beauty of the exercise, DevOps is mainly (if not entirely) geared towards reducing the Time To Market (TTM). Obviously, there are other positive effects, but the main priority, all things considered, is this TTM (hardly surprising in the Web industry). Dev & Ops: differing local concerns but a common goal Organizational divides notwithstanding, the preoccupations of Development and Operations are indeed distinct and equally laudable: Figure 1. The Dev/Ops "wall of confusion": each side pursues its own local targets with its own culture. Development seeks to innovate and deliver new functionalities (of quality), in a product (software) culture; Operations seek to rationalize and guarantee that applications run (stability), in a service culture (archiving, supervision, etc.).
  • 64. 64 THE WEB GIANTS Software development seeks heightened responsiveness (under pressure notably from their industry and the market): they have to move fast, add new functionalities, reorient work, refactor, upgrade frameworks, test deployment across all environments... The very nature of software is to be flexible and adaptable. In contrast, Operations need stability and standardization. Stability, because it is often difficult to anticipate what the impacts of a given modification to the code, architecture or infrastructure will be. Moving from a local disk to a networked one can impact response times, a change in code can heavily impact CPU activity, leading to difficulties in capacity planning. Standardization, because Operations seek to ensure that certain rules (equipment configuration, software versions, network security, log file configuration...) are uniformly followed to ensure the quality of service of the infrastructure. And yet both groups, Devs and Ops, have a shared objective: to make the system work for the client. DevOps: capitalizing on Agility Agility became a buzzword somewhat over ten years ago, its main objective being to reduce constraints in development processes. The Agile method introduced the notions of "short cycle", "user feedback" and "Product Owner", i.e. a person in charge of managing the roadmap, setting priorities, etc. Agility also shook up traditional management structures by introducing cross-silo teams (developers and operators) and played havoc with administrative departments. Today, where those barriers have been removed, software development is most often carried out in one- to two-week cycles. The business sees the software evolve during the construction phase. It is now time to bring people from operations into the following phases:
  • 65. 65 THE WEB GIANTS Provisioning / spinning up environments: in most firms, making an environment available can take from one to four months (even though environments are now virtualized). This is surprisingly long, especially when the challengers are Amazon or Google. Deployment: this is without doubt the phase where problems come to a head, as it creates the most instability; agile teams sometimes limit themselves to one deployment per quarter to limit the impacts on production. In order to guarantee system stability, these deployments are often carried out manually, are therefore lengthy, and can introduce errors. In short, they are risky. Incident resolution and meeting non-functional needs: Production is the other user of the software. Diagnosis must be fast, the problems and resilience stakes must be explained, and robustness must be taken into account. DevOps is organized around 3 pillars: Infrastructure as Code (IaC), Continuous Delivery, and a culture of cooperation. 1. "Infrastructure as Code", or how to reduce provisioning and environment deployment delays One of the most visible friction points is the lack of collaboration between Dev and Ops in deployment phases. Furthermore, this is the activity which consumes the most resources: half of production time is taken up by deployment and incident management. Figure 2. Breakdown of operations activities. Source: study by Deepak Patil (Microsoft Global Foundation Services) in 2006, via a presentation modified by James Hamilton (Amazon Web Services), http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_POA20090226.pdf ORGANIZATION / DEVOPS
  • 66. 66 THE WEB GIANTS And although it is difficult to establish general rules, it is highly likely that part of this cost (the 31% segment) could be reduced by automating deployment. There are many reliable tools available today to automate the provisioning and deployment of new environments, ranging from setting up Virtual Machines to software deployment and system configuration. Figure 3. Classification of the main tools (October 2012): VM instantiation / OS installation (OpenStack, VMware vCloud, OpenNebula); command and control / application service orchestration, i.e. deploying application code - war files, PHP source, Ruby... - and RDBMS deployments (Capistrano, custom shell or Python scripts); and system configuration, i.e. deploying, installing and configuring the services required for application execution - JVM, application servers, logs, ports, rights, etc. (Chef, Puppet, CFEngine); the whole fed by a CMDB which must reflect both the target configuration and the real-world system configuration. These tools (each in its own language) can be used to code infrastructure: to install and deploy an HTTP service for server applications, to create repositories for the log files... The range of services and associated gains are many: Guaranteeing replicable and reliable processes (no user interaction, thus removing a source of errors), namely through their capacity to manage versions and rollback operations. Productivity: one-click deployment rather than a set of manual tasks, thus saving time. Traceability, to quickly understand and explain any failures.
  • 67. 67 THE WEB GIANTS ORGANIZATION / DEVOPS Reducing Time To Recovery: in a worst-case scenario, the infrastructure can be recreated from scratch. In terms of recovery this is highly useful. In keeping with ideas stemming from Recovery Oriented Architecture, resilience can be addressed either by attempting to prevent systems from failing, by working on the MTBF - Mean Time Between Failures - or by accelerating repairs, by working on the MTTR - Mean Time To Recovery. The second approach, although not always possible to implement, is the least costly. Automation is also useful in organizations where many environments are necessary; in such organizations the numerous environments tend to be kept permanently available yet little used, because configuring them takes too long. Automation is furthermore a way of initiating a change in collaboration culture between Dev and Ops, because it increases the possibilities for self-service for Dev teams, at the very least over the pre-production environments. 2. Continuous Delivery Traditionally, in our organizations, the split between Dev and Ops comes to a head during deployment phases, when development delivers or shuffles off its code, which then continues on its long way through the production process. The following quote from Mary and Tom Poppendieck[1] puts the problem in a nutshell: How long would it take your organization to deploy a change
that involves just one single line of code? The answer is of course not obvious, but in the end it is here that the objectives of the two sides diverge the most. Development seeks control over part of the infrastructure, for rapid deployment, on demand, to all environments. In contrast, production must see to making environments available, rationalizing costs, and allocating resources (bandwidth, CPU...). [1] Mary and Tom Poppendieck, Implementing Lean Software Development: From Concept to Cash, Addison-Wesley, 2006.
  • 68. 68 THE WEB GIANTS Also ironic is the fact that the less one deploys, the more the TTR (Time To Repair) increases, therefore reducing the quality of service to the end client. Figure 4. Change size versus change frequency: huge changesets deployed rarely lead to a high TTR, tiny changesets deployed often to a low TTR. Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108 In other words, the more changes there are between releases (i.e. the higher the number of changes to the code), the lower the capacity to rapidly fix bugs following deployment, thus increasing TTR - this is the instability ever-dreaded by Ops. Here again, addressing such waste can reduce the time taken up by Incident Management as shown in Figure 2. Figure 5. Size of deploy versus incident TTR at Flickr: TTR in minutes (for severity 1 and severity 2 incidents) plotted against the units of changed code (lines changed per deploy). Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
  • 69. 69 THE WEB GIANTS ORGANIZATION / DEVOPS To finish, Figure 5, taken from a Flickr study, shows the correlation between TTR (and therefore the seriousness of incidents) and the amount of code deployed (and therefore the number of changes to the code). However, continuous deployment is not easy and requires: Automation of the deployment and provisioning processes: Infrastructure as Code. Automation of the software construction and deployment processes. Build automation becomes the construction chain which carries the software from source control to the various environments where it will be deployed. Thus a new build system is necessary, including environment management, workflow management for more quickly compiling source code into binary code, creating documentation and release notes to swiftly understand and fix any failures, the capacity to distribute testing across agents to reduce delays, and always guaranteeing short cycle times. Taking these factors into account at the architecture level and above all respecting the following principle: decouple functionality deployment and code deployment, using patterns such as Feature flipping (cf. Feature flipping, p. 113) or dark launch (a minimal sketch follows this list). This of course entails a new level of complexity but offers the necessary flexibility for this type of continuous deployment. A culture of measurement with user-oriented metrics. This is not only about measuring CPU consumption, it is also about correlating business and application metrics to understand and anticipate system behavior.
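As an illustration of that decoupling principle, here is a bare-bones feature-flipping sketch in Python: the code for a new functionality is deployed, but it is only exposed to users once its flag is switched on. The flag store, function names and the 5% discount are invented for the example; this is not the implementation of any particular Web Giant, and real systems externalize flags (database, configuration service...) so that they can be flipped without redeploying.

```python
# Minimal feature-flipping sketch: flags decouple deploying code
# from activating the corresponding functionality.
FEATURE_FLAGS = {
    "new_checkout": False,       # code is live in production, feature stays dark
    "dark_launch_search": True,  # new code runs silently alongside the old path
}

def is_enabled(feature: str) -> bool:
    # In a real setup this would read an external store so that a flag
    # can be flipped at runtime, without a new deployment.
    return FEATURE_FLAGS.get(feature, False)

def legacy_checkout(cart: list[float]) -> float:
    return sum(cart)

def new_checkout(cart: list[float]) -> float:
    # New pricing logic, deployed ahead of its activation date.
    return round(sum(cart) * 0.95, 2)

def checkout(cart: list[float]) -> float:
    if is_enabled("new_checkout"):
        return new_checkout(cart)   # freshly deployed code path
    return legacy_checkout(cart)    # the behavior users currently see

if __name__ == "__main__":
    print(checkout([10.0, 20.0]))   # 30.0 while the flag stays off
```

Flipping "new_checkout" to True activates the functionality for users without shipping a new release, which is precisely what keeps deployments small and frequent.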
  • 70. 70 THE WEB GIANTS 3. A culture of collaboration, if not an organizational model These two practices, Infrastructure as Code and Continuous Delivery, can be implemented in traditional organizations (with Infrastructure as Code at Ops and Continuous Delivery at Dev). However, once development and production reach their local optimum and a good level of maturity, both will remain hampered by the organizational divide. This is where the third pillar comes into its own: a culture of collaboration, nay cooperation, with all teams becoming more autonomous rather than throwing problems at each other across the production process. This can mean, for example, giving Dev access to machine logs, providing them with the previous day's production data so that they can roll out the integration environments themselves, or opening up the metrics and monitoring tools (or even displaying the metrics in open spaces)... Bringing that much more flexibility to Dev, and sharing responsibility and information on "what happens in Prod", are actually just so many low-added-value tasks that Ops no longer has to shoulder. The main cultural elements around DevOps could be summarized as follows: Sharing technical metrics (response times, number of backups...) as well as business metrics (changes in generated profits...). Ops is also the software's client. This can mean making changes to the software architecture and developments to more easily integrate monitoring tools, to have relevant and useful log files, and to help diagnosis (and reduce the TTD, Time To Diagnose). To go further, certain Ops needs should be expressed as user stories in the backlog. A lean approach (http://blog.octo.com/tag/lean/, French only) and post-mortems which focus on the root causes (the 5 whys) and on implementing countermeasures. It remains, however, that in this model the existing zones of responsibility (especially development, software monitoring, datacenter operation and support) are somewhat modified. Traditional firms give priority to the project team. In this model, deployment processes, software monitoring and datacenter management are spread out across several organizations.
  • 71. 71 THE WEB GIANTS ORGANIZATION / DEVOPS Figure 6: Project teams - the software production flow runs from the project teams through application management, technical management and the service desk to the users (source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified). Conversely, some stakeholders (especially Amazon) have taken this model very far by proposing multidisciplinary teams in charge of ensuring that the service functions - from the client's perspective (cf. Feature Teams, p. 65). You build it, you run it. In other words, each team is responsible for the business, from Dev to Ops. Figure 7: Product team – You build it, you run it - business, the software production flow, monitoring (build) and production (run) are handled by the product team, supported by the service desk and the infrastructure (source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified).
  • 72. 72 THE WEB GIANTS Moreover it is within this type of organization that the notion of self-service takes on a different and fundamental meaning. One then sees one team managing the software and its use, and another team in charge of datacenters. The dividing line is farther "upstream" than is usual, which allows scaling up while ensuring a balance between agility and cost rationalization (e.g. linked to the datacenter architecture). The AWS Cloud is probably the result of this... It is something else altogether, but imagine an organization with product teams and production teams who would jointly offer services (in the sense of ITIL) such as AWS or Google App Engine... Conclusion DevOps is thus nothing more than a set of practices to leverage improvements around: Tools to industrialize the infrastructure and reassure production as to how the infrastructure is used by development. Self-service is a concept hardwired into the Cloud. Public Cloud offers are mature on the subject, but some offers (for example VMWare) aim to reproduce the same methods internally. Without necessarily reaching such levels of maturity, however, one can imagine using tools like Puppet, Chef or CFEngine (a minimal sketch of the idea follows this list)... Architecture which makes it possible to decouple deployment cycles, to deploy code without deploying all functionalities... (cf. Feature Flipping, p. 113 and Continuous Deployment, p. 105). Organizational methods, leading to implementation of Amazon's "Pizza Teams" pattern (cf. Pizza Teams, p. 59) and You build it, you run it. Processes and methodologies to render all these exchanges more fluid. How to deploy more often? How to limit risks when deploying progressively? How to apply the "flow" lessons from Kanban to production? How to rethink the communication and coordination mechanisms at work along the development/operations divide?
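Coming back to the first strand - industrializing the infrastructure - tools like Puppet, Chef or CFEngine all rest on the same principle: the desired state of a machine is described as code, and the tool converges the machine towards it, so running the same description twice changes nothing. The Python fragment below is a purely illustrative sketch of that idempotent style, not the syntax of any of those tools; the package name, file path and Debian-specific commands are assumptions made for the example.

```python
import subprocess
from pathlib import Path

# Desired state, described as data (hypothetical example values).
DESIRED_PACKAGES = ["nginx"]
DESIRED_FILES = {
    "/etc/nginx/conf.d/app.conf": "server { listen 8080; }\n",
}

def package_installed(name: str) -> bool:
    """Check the current state before acting (assumes a Debian-like system)."""
    return subprocess.run(["dpkg", "-s", name], capture_output=True).returncode == 0

def ensure_package(name: str) -> None:
    if not package_installed(name):
        subprocess.run(["apt-get", "install", "-y", name], check=True)

def ensure_file(path: str, content: str) -> None:
    """Only rewrite the file if it differs from the target configuration."""
    p = Path(path)
    if not p.exists() or p.read_text() != content:
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content)

def converge() -> None:
    for pkg in DESIRED_PACKAGES:
        ensure_package(pkg)
    for path, content in DESIRED_FILES.items():
        ensure_file(path, content)

if __name__ == "__main__":
    converge()  # safe to run repeatedly: a no-op once the state is reached
```

Because the target state is ordinary source code, it can be versioned, reviewed and rolled back like the application itself - which is exactly the property the replicability, traceability and time-to-recovery gains described earlier rely on.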
  • 73. 73 THE WEB GIANTS ORGANIZATION / DEVOPS In sum, these four strands make it possible to reach the DevOps goals: improve collaboration, trust and objective alignment between development and operations, giving priority to addressing the stickiest issues, summarized in Figure 8. Figure 8. The three pillars - Infrastructure as Code, Continuous Delivery and a culture of collaboration - drive faster provisioning, increased deployment reliability, faster incident resolution (MTTR), operational efficiency and improved quality of service, feeding continuous improvement and an improved TTM.
  • 74. 74 Sources • White paper on the DevOps Revolution: http://www.cutter.com/offers/devopsrevolution.html • Wikipedia article: http://en.wikipedia.org/wiki/DevOps • Flickr presentation at the Velocity 2009 conference: http://velocityconference.blip.tv/file/2284377/ • Definition of DevOps by Damon Edwards: http://dev2ops.org/blog/2010/2/22/what-is-devops.html • Article by John Allspaw on DevOps: http://www.kitchensoap.com/2009/12/12/devops-cooperation-doesnt-just-happen-with-deployment/ • Article on the share of deployment activities in Operations: http://dev2ops.org/blog/2010/4/7/why-so-many-devops-conversations-focus-on-deployment.html • USI 2009 (French only): http://www.usievents.com/fr/conferences/4-usi-2009/sessions/797-quelques-idees-issues-des-grands-du-web-pour-remettre-en-cause-vos-reflexes-d-architectes#webcast_autoplay THE WEB GIANTS
  • 78. 78 THE WEB GIANTS PRACTICES / LEAN STARTUP Description Creating a product is a very perilous undertaking. Figures show that 95% of all products and startups perish for want of clients. Lean Startup is an approach to product creation designed to reduce risks and the impact of failures by tackling organizational, business and technical aspects in parallel, and through aggressive iterations. It was formalized by Eric Ries and was strongly inspired by Steve Blank's Customer Development. Build – Measure – Learn All products and functionalities start with a hypothesis. The hypothesis can stem from data collected on the ground or from a simple intuition. Whatever the underlying reason, the Lean Startup approach aims to: consider all ideas as hypotheses, whether they concern marketing or functionalities, and validate all hypotheses as quickly as possible on the ground. This last point is at the core of the Lean Startup approach. Each hypothesis - whether it concerns business, operations or development - must be validated, qualitatively as well as through metrics. Such an approach makes it possible to implement a learning loop for both the product and the client. Lean Startup refuses the approach which consists of developing a product for over a year only to discover that the choices made (in marketing, functionalities, sales) threaten the entire organization. Testing is of the essence. Figure 1. The Build – Measure – Learn loop: ideas are built into a product, measuring the product yields data, and learning from the data generates new ideas.
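The loop can be read almost literally as code. The sketch below is a deliberately naive Python illustration of the discipline it imposes: every idea is written down as a hypothesis with a metric and a success threshold before anything is built. The statements, numbers and thresholds are invented for the example, and the "measurement" is hard-coded; in reality it would come from the field.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    statement: str            # what we believe
    metric: str               # what we measure to (in)validate it
    success_threshold: float  # below this, the hypothesis is not validated

def build(hypothesis: Hypothesis) -> str:
    """Build the smallest experiment able to test the hypothesis
    (a fake functionality, a landing page, a questionnaire...)."""
    return f"experiment for: {hypothesis.statement}"

def measure(experiment: str) -> float:
    """Collect the metric from the field; hard-coded here for illustration."""
    return 0.04   # e.g. 4% of visitors clicked the fake feature

def learn(hypothesis: Hypothesis, observed: float) -> str:
    """Turn data into a decision that feeds the next idea."""
    return "persevere" if observed >= hypothesis.success_threshold else "pivot"

if __name__ == "__main__":
    h = Hypothesis(
        statement="visitors will pay for feature X",      # hypothetical example
        metric="click-through on the fake 'Buy' button",
        success_threshold=0.03,
    )
    print(learn(h, measure(build(h))))   # closes one turn of the loop
```

The value is not in the code but in the constraint it makes explicit: no hypothesis enters the loop without a metric and a threshold decided in advance.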
  • 79. 79 THE WEB GIANTS Experiment to validate Part of the approach is based on the notion of the Minimum Viable Product (MVP) (cf. "Minimum Viable Product", p. 95). What is the minimum with which I can validate my hypotheses? We are not necessarily speaking here of code and products in their technical senses, but rather of any effort that leads to progress on a hypothesis. Anything can be used to test market appetite - a Google Docs questionnaire, a mailing list or a fake functionality. Experimentation, with its attendant lessons, is an invaluable asset in piloting a product, and it justifies implementing a learning loop. The measurement obsession Obviously, experiments must be systematically monitored through full and reliable metrics (cf. "The obsession with performance measurement", p. 13). A client-centered approach – Go out of the building Checking metrics and validating quality very often means "leaving the building", as Bob Dorf, co-author of the famous "4 Steps to the Epiphany", puts it. "Go out of the building" (GOOB) is at the heart of the preoccupations of Product Managers who practice Lean Startup. Until a hypothesis has been confronted with reality, it remains a supposition, and therefore presents risks for the organization. "No plan survives first contact with customers" (Steve Blank) is thus one of the mottoes of Product teams: Build only the minimum necessary for validating a hypothesis. GOOB (from face-to-face interviews to continuous deployment). Learn. Build, etc.
  • 80. 80 THE WEB GIANTS PRACTICES / LEAN STARTUP This approach also allows constant contact with the client, in other words, constant validation of business hypotheses. Zappos, a giant in online shoe sales in the US, is an example of an MVP being put into users' hands at a very early stage. To confront reality and validate that users would be willing to buy shoes online, the future CEO of Zappos took snapshots of the shoes in local stores, thereby creating the inventory for an e-commerce site from scratch. In doing so, and without building cathedrals, he quickly validated that demand was there and that producing the product would be viable. Piloting with data Naturally, to grasp user behavior during GOOB sessions, Product Managers meticulously gather data which will help them make the right decision. They also set up tools and processes to collect such data. The most used are well known to all: interviews and analytics solutions. The Lean Startup method makes rigorous use of these indicators to truly pilot the product strategy. On ChooseYourBoss.com,[1] we postulated that users would choose LinkedIn or Viadeo to connect, to avoid users having to set up accounts and to save us the trouble of developing a login system. We therefore built the minimum needed to validate or invalidate this hypothesis, giving people three options to sign up: LinkedIn, Viadeo, or opening a ChooseYourBoss account. The first two worked fully, while the third simply indicated that the ChooseYourBoss account was not yet available. Results: users not wishing to use these networks to sign in represented 11% of visitors to our site. We will therefore abstain for the time being from implementing accounts outside of social networks. We went from "informed by data" to "piloted by data". Who makes it work for them? IMVU, Dropbox, Heroku, Votizen and Zappos are a few examples of Web products that managed to integrate user feedback at a very early stage in product design. Dropbox, for example, completely overhauled its way of doing things by drastically simplifying the management of synchronized files. Heroku went from a development platform in the Cloud to a Cloud server solution. Examples abound, each more ingenious than the previous one. [1] A site for connecting candidates and recruiters.
  • 81. 81 THE WEB GIANTS What about me? Lean Startup is not a dogma. Above all it is about realizing that what the market and the clients want is not to be found in architecture, marketing plans, sales forecasts and key functionalities. Once you've come to that realization, you will start seeing hypotheses everywhere. It all consists in setting up processes for validating hypotheses, without losing sight of the principle of validating minimum functionalities at any given moment. Before writing any code, the main questions to ask revolve around the triad Client / Problem / Solution: Do I really have a problem that deserves to be solved? Is my solution the right one for my client? Will my client buy it? For how much? Use whatever you can to check your hypotheses: interviews, market studies, prototypes... The next step is to know whether the model you are testing on a small scale is replicable and scalable. How can you get clients to acquire a product they've never heard of? Will they be in a position to understand, use, and profit from your product? The third and fourth steps revolve around growth: how do you attract clients, and how do you build a company capable of taking on your product and moving it forward? Contrary to what one might think after reading this chapter, Lean Startup is not an approach reserved for mainstream websites. Innovation through validating hypotheses as quickly as possible while limiting financial investment is obviously a logic which can be transposed to any type of information systems project, even in-house. We are convinced that this approach deserves wider deployment to avoid Titanic-type projects which can swallow colossal sums despite providing very little value for users. For more information, you can also consult the sessions on Lean Startup at USI, which present the first two stages (www.usievents.com).
  • 82. 82 THE WEB GIANTS PRACTICES / LEAN STARTUP Sources • Running Lean – Ash Maurya • 4 Steps to the Epiphany – Steve Blank and Bob Dorf: http://www.steveblank.com/books.html • Startup Genome Project blog: http://blog.startupcompass.co/ • The Lean Startup: How Today's Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses – Eric Ries: http://www.amazon.com/The-Lean-Startup-Entrepreneurs-Continuous/dp/0307887898 • The Startup Owner's Manual – Steve Blank and Bob Dorf: http://www.amazon.com/The-Startup-Owners-Manual-Step-By-Step/dp/0984999302
  • 85. 85 THE WEB GIANTS PRACTICES / MINIMUM VIABLE PRODUCT Description A Minimum Viable Product (MVP) is a strategy for product development. Lean Startup creator Eric Ries, who strongly contributed to the elaboration of this approach, gives the following definition: The minimum viable product is that version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort.[1] In sum, it is a way to quickly develop a minimal product prototype to establish whether the need for it is there, to identify possible markets, and to validate business hypotheses on e.g. income generation. The interest of the approach is obvious: to more quickly design a product that truly meets market needs, by keeping costs down in two ways: by reducing TTM:[2] faster means less human effort, therefore less outlay - all else being equal; and by reducing the functional perimeter: less effort spent on functionalities which have not yet proven their worth to the end user. In the case of startups, funds usually run low. It is therefore best to test your business plan hypotheses as rapidly as possible - and this is where an MVP shows its worth. The advantages are well illustrated by Eric Ries's experience at IMVU.com, an online chatting and 3D avatar website: it took them only six months to create their first MVP, whereas in a previous startup experience it took them almost five years to release their first product - which was questionably viable! [1] http://www.startuplessonslearned.com/2009/08/minimum-viable-product-guide.html [2] Time To Market
  • 86. 86 THE WEB GIANTS Today, 6 months is considered a relatively long delay, and MVPs are often deployed in less. This is because designing an MVP does not necessarily mean producing code or a sophisticated website, quite the contrary. The goal is to get a feel for the market very early on in the project so as to validate your plans for developing your product or service. This is what is known as a Fail Fast approach. MVPs allow you to quickly validate your hypotheses about client needs and therefore to reorient your product or service accordingly, very early on in your design process. This is known as a "pivot" in Lean Startup jargon. Or, if your hypotheses are validated by the MVP run, you must then move on to the next step: implementing the functionality you simulated, creating a proper web site, or simply a marketing page. An MVP is not only useful for launching a new product: the principle is perfectly applicable for adding new functionalities to a product that already exists. The approach can also be more direct: for example you can ask for user feedback on what functionalities people would like (see Figure 1), at the same time gathering information on how they use your product. MVPs are particularly relevant when you have little or no knowledge of your market and clients, nor any well-defined product vision. Implementation An MVP can be extremely simple. For example, Babak Nivi states that "The Minimum Viable Product (MVP) is often an ad on Google. Or a PowerPoint slide. Or a dialog box. Or a landing page. You can often build it in a day or a week."[3] The most minimalist approach is called a Smoke Test, in reference to electronic component testing to check that a component functions properly before moving on to the next stage [3] http://venturehacks.com/articles/minimum-viable-product
  • 87. 87 THE WEB GIANTS PRACTICES / MINIMUM VIABLE PRODUCT of testing (stress tests, etc.) and the fact that in case of failure there is often a great deal of smoke! The most minimal form of a Smoke Test consists of an advertisement in a major search engine, for example, promoting the qualities of the product you hope to develop. Clicking on the ad sends the person to a generally static web page with minimal information but, for instance, a few suggested links; the goal is to gather click data, indicative of how interested the visitor is in the proposed service and of their willingness to buy it. The functionalities laid out behind the links do not have to be operational at this stage! The strict minimum is the ad itself, as this is the first step in gathering information. An early version of the website theleanstartup.com, which applies the principles it preaches (the EYODF pattern),[4] proposed at the very bottom of its home page (the MVP of theleanstartup.com) a very simple dialog box for collecting user needs. There were only two fields to fill in (e-mail address and suggestion for a new functionality), together with the invitation: What would you like to see on future versions of this website? Figure 1. Form for collecting user information on the website theleanstartup.com ("What do you want to see in future releases? This is our Minimum Viable Product"), shown once the fields are filled in: "Success! We've received your feedback." In terms of tooling, services such as Google Analytics, Xiti, etc., which track all user actions and browsing characteristics on a given website, are indispensable allies. For example, in the case of a new website functionality to be implemented, it is very simple to add a new tab, menu option or advertisement, and to track user actions with this type of tool. [4] Eat Your Own Dog Food, i.e. be the first consumer of your own services.
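To give an idea of how little code a Smoke Test really needs, here is a minimal sketch of such a landing page written with Sinatra; the routes, the page template and the storage file are hypothetical, and recording the feedback is, at this stage, the whole "product":

require 'sinatra'
require 'csv'

# Static pitch page: the search-engine ad points here.
get '/' do
  erb :landing   # page promoting the product, ending with the feedback form
end

# The entire "product": store the visitor's input and thank them.
post '/feedback' do
  CSV.open('feedback.csv', 'a') do |csv|
    csv << [Time.now, params['email'], params['suggestion']]
  end
  "Success! We've received your feedback."
end

The click-through rate on the ad and the number of lines in the feedback file are already two metrics with which to confront your hypotheses.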
  • 88. 88 THE WEB GIANTS Risks... Beware, the MVP can generate ambiguous results, including false negatives. In fact, if an MVP is not sufficiently well thought-out, or is badly presented, it can trigger a negative reaction on the targeted clients’ part. It can seem to indicate that the planned product isn’t viable whereas in fact it is only a question of iterating to perfect the process to better meet client needs. The point is to not stop at the first whiff of failure: a single step is all it can take to go from non-viable to viable, i.e. to the MVP itself. Henry Ford put it very aptly: “If I had asked people what they wanted, they would have said faster horses.“ Having a product vision can be more than just an option. Who makes it work for them? Once again we will mention IMVU (see above), one of the pioneers of Lean Startup where Eric Ries Co. tested the MVP concept, more particularly in the field of 3D avatar design. Their website, imvu.com is an online social media for 3D avatars, chat rooms, gaming, and has the world’s largest catalog of virtual goods, most of which are created by the users themselves. Let us also return to the example of Dropbox, an online file storage service which has seen its growth skyrocket, all based on an MVP which was a fake showcase demonstration, the product didn’t yet exist. Following the posting of the video, a tidal wave of subscribers brought the beta list sign-ups from 5,000 to 75,000 people in one night, confirming that Dropbox’s product vision was indeed solid.
  • 89. 89 THE WEB GIANTS PRACTICES / MINIMUM VIABLE PRODUCT How can I make it work for me? With the prevalence of e-commerce in the social media, the web is now at the heart of economic development strategies for businesses. The MVP strategy can be activated as is for a wide range of projects, whether stemming from the IT department or Marketing, but don’t forget that it can also be applied outside the web. It can even be applied to purely personal projects. In his reference work Running Lean, Ash Maurya gives the example of applying an MVP (and Lean Startup) to the publication of that self-same book. Auditing Information Systems is a major part of our work at OCTO and we are often faced with innovation projects (community platforms, e-services, online shopping...) that encounter difficulties in the production process, say every six months, and where the release, delayed by one or two years, is often a flop, because the value delivered to users does not correspond to market demand... In the interval, millions of euros will have been swallowed up, for a project that will finally end up in the waste bin of the web. An MVP type approach reduces such risks and associated costs. On the web, delays of that length to release a product cannot be sustained, and competition is not only ferocious but also swift! Within a business information system, it is hard to see how one could carry out Smoke Tests with advertisements. And yet there too one often finds applications and functionalities which took months to develop, without necessarily being adopted by users in the end... The virtue of Lean Startup and the MVP approach is to center attention on the value added for users, and to better understand their true needs. In such cases, an MVP can serve to prioritize the end users of the functionalities to be developed in future versions of your application.
  • 90. 90 THE WEB GIANTS Sources • Eric Ries, Minimum Viable Product: a guide, Lessons Learned, 3 August, 2009 http://www.startuplessonslearned.com/2009/08/minimum-viable- product-guide.html • Eric Ries, Minimum Viable Product, StartupLessonLearned conference http://www.slideshare.net/startuplessonslearned/minimum-viable- product • Eric Ries, Venture Hacks interview: “What is the minimum viable product? “ http://www.startuplessonslearned.com/2009/03/minimum-viable- product.html • Eric Ries, How DropBox Started As A Minimal Viable Product, 19 October, 2011 http://techcrunch.com/2011/10/19/dropbox-minimal-viable-product • Wikipedia, Minimum viable product http://en.wikipedia.org/wiki/Minimum_viable_product • Timothy Fitz, Continuous Deployment at IMVU: Doing the impossible fifty times a day, 10 February, 2009 http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment- at-imvu-doing-the-impossible-fifty-times-a-day • Benoît Guillou, Vincent Coste, Lean Start-up, 29 June, 2011, Université du S.I. 2011, Paris http://www.universite-du-si.com/fr/conferences/8-paris-usi-2011/ sessions/1012-lean-start-up (French only) • Nivi Babak, What is the minimum viable product?, 23 March, 2009 http://venturehacks.com/articles/minimum-viable-product • Geoffrey A. Moore, Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers, 1991 (revised 1996), HarperBusiness, ISBN 0066620022
  • 91. 91 THE WEB GIANTS PRACTICES / MINIMUM VIABLE PRODUCT • Silicon Valley Product Group (SVPG), Minimum Viable Product, 24 August, 2011 http://www.svpg.com/minimum-viable-product • Thomas Lissajoux, Mathieu Gandin, Fast and Furious Enough, Définissez et testez rapidement votre premier MVP en utilisant des pratiques issues de Lean Startup, Conference Paris Web, 15 October, 2011 http://www.slideshare.net/Mgandin/lean-startup03-slideshare (French only) • Ash Maurya, Running Lean http://www.runningleanhq.com/
  • 93. 93 THE WEB GIANTS PRACTICES / CONTINUOUS DEPLOYMENT Description In the chapter "Perpetual beta", p. 151, we will see that Web Giants improve their products continuously. How do they manage to deliver improvements so frequently while in some IT departments the least change can take several weeks to be deployed in production? In most cases, they have implemented a continuous deployment process, which can be done in two ways: Either entirely automatically: modifications to the code are automatically tested and, if validated, deployed to production. Or semi-automatically: at any time one can deploy the latest stable code to production in one go. This is known as "one-click deployment". Obviously, setting up this pattern entails a certain number of prerequisites. Why deploy continuously? The primary motivation behind continuous deployment is to shorten the Time To Market, but it is also a means to test hypotheses, to validate them and, ultimately, to improve the product. Let us imagine a team which deploys to production on the 1st of every month (which is already a lot for many IT departments): I have an idea on the 1st. With a little luck, the developers will be able to implement it in the remaining 30 days. As planned, it is deployed to production in the monthly release plan on the 1st of the following month. Data are collected over the next month and indicate that the basic idea needs improvement. But it will be another month before the improvement can be implemented, which is to say it takes three months to reach a stabilized functionality.
  • 94. 94 THE WEB GIANTS In this example, it is not development that is slowing things down but in fact the delivery process and the release plan. Thus continuous deployment shortens the Time To Market but is also a way to accelerate product-improvement cycles. This improves the famous Lean Startup cycle (cf. "Lean Startup", p. 87): Figure 1. The Lean Startup loop: Ideas, Code, Data; code fast, measure fast, learn fast. A few definitions Many people use "Continuous Delivery" and "Continuous Deployment" interchangeably. To avoid any errors in interpretation, here is our definition. With each commit (or time interval), the code is: Compiled, tested, deployed to an integration environment = Continuous Integration. Compiled, tested, delivered to the next team (Tests, Qualification, Production, Ops) = Continuous Delivery. Compiled, tested, deployed to production = Continuous Deployment.
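One way to picture these three levels is as three targets of one and the same automated build. The Rakefile-style sketch below is only an illustration: the scripts it calls (deploy.sh, build_artifact.sh, publish_artifact.sh) and the environment names are hypothetical, and any build tool would do just as well.

# Rakefile sketch: one build, three possible destinations.
task :test do
  sh 'bundle exec rspec'                 # unit and integration tests
end

task :package => :test do
  sh './scripts/build_artifact.sh'       # produce the deployable artifact
end

# Continuous Integration: every commit at least reaches this target.
task :integration => :package do
  sh './scripts/deploy.sh integration'
end

# Continuous Delivery: the tested artifact is handed over, ready to deploy.
task :deliver => :package do
  sh './scripts/publish_artifact.sh'
end

# Continuous Deployment: the same artifact goes straight to production.
task :production => :deliver do
  sh './scripts/deploy.sh production'
end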
  • 95. 95 THE WEB GIANTS PRACTICES / CONTINUOUS DEPLOYMENT The point here is not to say that Continuous Delivery and Continuous Integration are a waste of time. Quite the contrary, they are essential steps: Continuous Deployment is simply the natural extension of Continuous Delivery, itself the natural extension of Continuous Integration. What about quality? One frequent objection to Continuous Deployment is the lack of quality and the fear of delivering an imperfect product, of delivering bugs. Just as with Continuous Integration, Continuous Deployment is only fully useful if you are in a position to be sure of your code at all times. This entails a full array of tests (on units, integration, performance, etc.). Beyond the indispensable unit tests, there is a wide range of automated tests such as: Integration tests (Fitnesse, Greenpepper, etc.) GUI tests (Selenium, etc.) Performance tests (Gatling, OpenSTA, etc.) Test automation can seem costly, but when the goal is to execute them several times a day (IMVU launches 1 million tests per day), return on investment grows rapidly. Some, such as Etsy, do not hesitate to create and share tools to best meet their testing and automation needs.[1] Furthermore, when you deploy every day, the size of the deployments is obviously much smaller than when you deploy once a month. In addition, the smaller the deployment, the shorter the Time To Repair, as can be seen in Figure 2. [1] https://github.com/etsy/deployinator
  • 96. 96 THE WEB GIANTS Figure 2 (modified): change size versus change frequency; huge changesets deployed rarely mean a high Time To Repair (TTR), tiny changesets deployed often mean a low one. Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108 Etsy well illustrates the trust one can have in code and in the possibility of repairing any errors quickly. This is because they don't bother with planning for rollbacks: "We don't roll back code, we fix it". According to one of their employees, the longest it has ever taken them to fix a critical bug was four minutes. Big changes lead to big problems, little changes lead to little problems. Who does things this way? Many of the Web Giants have successfully implemented Continuous Deployment; here are a few of the most representative figures: Facebook, very aggressive on test automation, deploys twice a day. Flickr makes massive use of Feature Flipping (cf. "Feature Flipping", p. 113) to avoid development branches and deploys over ten times daily. A page displays the details of the last deployment: http://code.flickr.com Etsy (an e-commerce company), hugely invested in automated tests and deployment tooling, deploys more than 25 times a day.
  • 97. 97 THE WEB GIANTS PRACTICES / CONTINUOUS DEPLOYMENT IMVU (an online gaming and 3D avatar site), performs over a million tests a day and deploys approximately 50 times. What about me? Start by estimating (or even better, by measuring!) the time it takes you and your team to deliver a simple line of code through to production, respecting the standard process, of course. Setting up Continuous Deployment Creating a “Development Build“ is the first step towards Continuous Deployment. To move on, you have to ensure that the tests you run cover most of the software. While some don’t hesitate to code their own test frameworks (Netflix initiated the “Chaos Monkey“ project which shuts down servers at random), there are also ready made frameworks available, such as JUnit, Gatling and Selenium. To reduce testing time, IMVU distributes its tests over no fewer than 30 machines. Others use Cloud services such as AWS to instantiate test environments on the fly and carry out parallel testing. Once the development build produces sufficiently tested artifacts, it can be expanded to deliver the artifacts to the teams who will deploy the software across the various environments. At this stage, you are already in Continuous Delivery. The last team can now enrich the build to include deployment tasks. This obviously entails automating various tasks, such as configuring the environments, deploying the artifacts which constitute the application, migrating the database diagrams and much more. Be very careful with your deployment scripts! It is code and, like all code, must meet quality standards (use of a SCM, testing, etc.). Forcing Continuous Deployment A more radical but highly interesting solution is to force the rhythm of release, making it weekly for example, to stir up change.
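To make the earlier point about ready-made test frameworks (JUnit, Gatling, Selenium) a little more concrete, here is a minimal GUI check using the Ruby binding of Selenium WebDriver; the URL, the form fields and the expected message are of course hypothetical, and a real suite would contain hundreds of such scenarios:

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :firefox
begin
  driver.navigate.to 'https://www.example.com/signup'
  driver.find_element(name: 'email').send_keys('smoky@test.com')
  driver.find_element(css: 'button[type=submit]').click

  # The build is only promoted further if this check (and many others) passes.
  raise 'signup page is broken' unless driver.page_source.include?('Thank you')
ensure
  driver.quit
end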
  • 98. 98 THE WEB GIANTS Associated patterns When you implement Continuous Delivery this is necessarily accompanied by several patterns, including: Zero Downtime Deployment, because while an hour of system shut-down isn’t a problem if you release once a month, it can become one if you release every week or every day. Feature Flipping (see the next chapter, “Feature Flipping“), because regular releases unavoidably entail delivering unfinished functionalities or errors, you must therefore have a way of deactivating problematic functionalities instantaneously or upstream. DevOps obviously, because Continuous Deployment is one of its pillars (cf. “DevOps“, p. 71). Sources • Chuck Rossi, Ship early and ship twice as often, 3 August, 2012: https://www.facebook.com/notes/facebook-engineering/ship-early- and-ship-twice-as-often/10150985860363920 • Ross Harmess, Flipping out, Flickr Developer Blog, 2 December, 2009: http://code.flickr.com/blog/2009/12/02/flipping-out • Chad Dickerson, How does Etsy manage development and operations? 4 February, 2011: http://codeascraft.etsy.com/2011/02/04/how-does-etsy-manage- development-and-operations • Timothy Fitz, Continuous Deployment at IMVU: Doing the impossible fifty times a day, 10 February, 2009: http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment- at-imvu-doing-the-impossible-fifty-times-a-day • Jez Humble, Four Principles of Low-Risk Software Releases, 16 February, 2012: http://www.informit.com/articles/article.aspx?p=1833567 • Fred Wilson, Continuous Deployment, 12 February, 2011: http://www.avc.com/a_vc/2011/02/continuous-de
  • 100. 100 THE WEB GIANTS PRACTICES / FEATURE FLIPPING Description The "Feature Flipping" pattern allows you to activate or deactivate functionalities directly in production, without having to release new code. Several terms are used by Web Giants: Flickr and Etsy use "feature flags", Facebook "gatekeepers", Forrst "feature buckets", Lyris Inc. "feature bits", while Martin Fowler opted for "feature toggles". In short, everyone names and implements the pattern in their own way, and yet all of these techniques strive to reach the same goal. In this article we will use the term "feature flipping". Successfully implemented in our enterprise app store Appaloosa,[1] this technique has brought many advantages with just a few drawbacks. Implementation It is a very simple mechanism: you simply have to condition execution of the code for a given functionality in the following way:
if Feature.is_enabled('new_feature')
  # do something new
else
  # do same as before
end
The implementation of the function "is_enabled" will, for example, query a configuration file or database to know whether the functionality is activated or not. You then need an administration console to configure the state of the various flags on the different environments. Continuous deployment One of the first advantages of being able to hot-switch functionalities on or off is to be able to continuously deliver the application being developed. Indeed, one of the first problems faced by organizations implementing continuous delivery is:
[1] cf. appaloosa-store.com
  • 101. 101 THE WEB GIANTS how can one regularly commit to the source repository while guaranteeing application stability and constant production readiness? In the case of functionality developments which cannot be finished in less than a day, only committing the functionality once it is done (after a few days) is contrary to development best practices in continuous integration. The truth is that the farther apart your commits, the more complicated and risky the merges, with only limited possibilities for transversal refactoring. Given these constraints, there are two choices: "feature branching" or "feature flipping". In other words, creating a branch via the configuration management tool or in the code. Each has its fervent partisans, and you can find some of the heated debates at: http://jamesmckay.net/2011/07/why-does-martin-fowler-not-understand-feature-branches Feature Flipping makes it possible for developers to code inside their "ifs", and thus to commit unfinished, non-functional code, as long as the code compiles and the tests pass. Other developers can pull the modifications without difficulty as long as they do not activate the functionalities being developed. Likewise, the code can be deployed to production since, again, the functionality will not be activated. That is where the interest lies: deployment of code to production no longer depends on completing all the functionalities under development. Once a functionality is finished, it can be activated by simply changing the status of the flag on the administration console. This has an added benefit in that the functionality can be activated to coincide, for example, with an advertising campaign; it is a way of avoiding mishaps on the day of the release. Mastering deployment One of the major gains brought by this pattern is that you are in control of deployment, because it allows you to activate a functionality with a simple click, and to deactivate it just as easily, thus avoiding drawn-out and problem-prone rollback processes to bring the system back to its N-1 release. Thus you can very quickly cancel the activation of a functionality if production tests are inconclusive or user feedback is negative.
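To make the is_enabled snippet shown earlier a little more concrete, here is a minimal sketch of what the Feature module could look like, assuming the flag states are kept in a simple YAML file that the administration console edits (the file location and format are hypothetical):

require 'yaml'

module Feature
  FLAGS_FILE = 'config/features.yml'   # hypothetical location, e.g. "new_feature: true"

  def self.flags
    @flags ||= begin
      YAML.load_file(FLAGS_FILE) || {}
    rescue Errno::ENOENT
      {}                               # no file yet: every feature is off
    end
  end

  # Called after the administration console changes a flag,
  # so that running processes pick up the new state.
  def self.reload!
    @flags = nil
  end

  def self.is_enabled(name)
    flags.fetch(name.to_s, false)
  end
end

In production you would more likely query a database or a shared configuration service so that all servers see the same state, but the calling code stays exactly the same.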
  • 102. 102 THE WEB GIANTS PRACTICES / FEATURE FLIPPING Unfortunately, things are not quite that simple: you must be very careful with your data and ensure that the model will work with or without the functionality being activated (see the section "Limits and constraints", major modifications). Experiment to improve the product A natural offshoot of feature flipping is that it enables you to activate or deactivate functionalities for specific sub-populations. You can thus test a functionality on a user group and, depending on their response, activate it for all users or scrap it. In this case the code will look something like this:
if Feature.is_enabled_for('new_feature', current_user)
  # do something new
else
  # do same as before
end
You can then use the mechanism to test a functionality's performance by modifying one variable in its implementation for several sub-populations. Result metrics will help you determine which implementation performs best. In other words, feature flipping is an ideal tool for carrying out A/B testing (cf. "A/B Testing", p. 123). Provide custom-made products In some cases, it can be interesting to let clients choose for themselves whether a functionality is active. Let us take the example of attachments in Gmail: by default, the interface proposes a number of advanced functionalities (drag and drop, multiple uploads) which can be deactivated by the user with a simple click in case of dysfunction. Conversely, you can offer users an "enhanced" mode; the "labs" (Gmail) are telling examples of feature flipping implementation. To do so, all you have to do is propose an interface where users can control the activation/deactivation of certain functionalities (self-service).
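As an illustration, is_enabled_for can combine group membership with a percentage-based rollout. The sketch below is assumption-laden: the ROLLOUT rules, the user's beta_tester? flag and the reuse of the Feature module sketched earlier are all hypothetical. The hash-based bucketing guarantees that a given user always sees the same version, which also matters for A/B testing.

require 'zlib'

module Feature
  # Hypothetical per-feature rollout rules, edited from the administration console.
  ROLLOUT = {
    'new_feature' => { groups: [:beta_testers], percentage: 10 }
  }

  def self.is_enabled_for(name, user)
    rule = ROLLOUT[name.to_s]
    return is_enabled(name) if rule.nil?    # no rule: fall back to the global flag

    # Explicitly targeted population, e.g. beta testers.
    return true if rule[:groups].include?(:beta_testers) && user.beta_tester?

    # Deterministic bucketing: the same user always lands in the same bucket,
    # so the experience stays consistent from one visit to the next.
    bucket = Zlib.crc32("#{name}:#{user.id}") % 100
    bucket < rule[:percentage]
  end
end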
  • 103. 103 THE WEB GIANTS Managing billable functionalities Activating paying functionalities with various levels of service can be complicated to implement, and entails conditional code of the following type:
if current_user.current_plan == 'enterprise' || current_user.current_plan == 'advanced'
Let us say that some "special" firms are paying for the basic plan but you want to give them access to all functionalities. A given functionality was included in the "advanced" plan two months before, but marketing has decided that it should only be included in the "enterprise" plan... except for those who subscribed more than two months earlier. You can use feature flipping to avoid having to manage such exceptions in the code. You just need to condition activation of the features at the moment a client subscribes: when users subscribe to the enterprise plan, the functionalities X, Y and Z are activated. You can then very easily manage exceptions in the administration interface. Graceful degradation Some functionalities are more crucial to the business than others. When scaling up it is a good idea to favor certain functionalities over others. Unfortunately, it is difficult to ask your software or server to give priority to anything to do with billing over displaying synthesis graphs... unless the graph display functionality is feature flipped. We have already mentioned the importance of metrics (cf. "The obsession with performance measurement", p. 13). Once your metrics are set up, it becomes trivial to flip functions accordingly. For example: "If the average response time for displaying the graph exceeds 10 seconds over a period of 3 minutes, then deactivate the feature". This allows you to progressively degrade website features in order to maintain a satisfying experience for the users of the core business functionalities. This is akin to the "circuit breaker" pattern (described in the book "Release It!" by Michael Nygard) which makes it possible to short-circuit a functionality if an external service is down.
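The 10-second rule above could be automated by a small watchdog run every minute. This is only a sketch: Metrics.average_response_time, Feature.enable and Feature.disable are hypothetical helpers (the last two would simply write the flag back to whatever store the Feature module reads).

# Watchdog sketch, scheduled every minute (cron, background job, etc.).
GRAPH_THRESHOLD_SECONDS = 10

def adjust_graph_feature
  # Hypothetical helper: average response time of the page, in seconds,
  # over the last 3 minutes.
  too_slow = Metrics.average_response_time('dashboard_graph', window: 180) > GRAPH_THRESHOLD_SECONDS

  if too_slow
    Feature.disable('dashboard_graph')   # degrade: core business features keep priority
  else
    Feature.enable('dashboard_graph')    # restore once response times are back to normal
  end
end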
  • 104. 104 THE WEB GIANTS PRACTICES / FEATURE FLIPPING Limits and constraints As noted above, all you need to implement feature flipping is an "if". However, as with any development, this can easily become a new source of complexity if you do not take the necessary precautions. 1. 1 "if" = 2 tests. Automated tests are still the best way to check that your software is working as it should. In the case of feature flipping, you will need at least 2 tests: one with the feature flipped ON (activated) and one with the feature flipped OFF (deactivated). In development, one often forgets to test the feature OFF even though this is what your clients will see as long as it is not ON. Therefore, once more, applying TDD[2] is a good solution: tests written in the initial development phases guarantee that OFF functionalities are tested (a test sketch is given below). 2. Clean up! Extensive use of feature flipping can lead to an accumulation of "ifs", making it more and more difficult to manage the code. Remember that for some functionalities, flipping is only useful for ensuring continuous deployment. For all functionalities that should never again need to be deactivated (free/optional functionalities which will never be degraded as they are critical from a functional perspective), it is important to delete the "ifs" to lighten the code and keep it serviceable. You should therefore set aside some time following deployment to production to "clean up". Like all code refactoring tasks, it is all the easier the more regularly you do it. [2] Test Driven Development
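By way of illustration, the "1 if = 2 tests" rule could translate into a pair of Minitest cases such as the sketch below; Feature.enable, Feature.disable and the checkout_page helper are hypothetical.

require 'minitest/autorun'

class NewCheckoutTest < Minitest::Test
  def test_checkout_with_feature_on
    Feature.enable('new_checkout')        # the new behavior being rolled out
    assert_includes checkout_page, 'Pay in one click'
  end

  def test_checkout_with_feature_off
    Feature.disable('new_checkout')       # what most clients still see today
    assert_includes checkout_page, 'Proceed to payment'
  end
end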
  • 105. 105 THE WEB GIANTS 3. Major modifications (i.e. changing your relational model). Some functionalities entail major changes in the code and data model. Let us take the example of a Person table containing an Address field. To meet new needs, you decide to divide the tables as follows: before, Person (ID, Last name, First name, Address); after, Person (ID, Last name, First name) plus a separate Address table (ID, Person_ID, Street, Post_Code, Town, Country). To manage cases like this, here is a strategy you can implement: Add the table Address (so that the base contains both the column Address AND the table Address). For applications nothing has changed, they continue querying the old columns. You then modify your existing applications so that they use the new tables. You migrate the data you have and delete all unused columns. At this point, the application will most often have changed little for the user, but it calls upon a new data model. You can then start developing new functionalities based on your new data model, using feature flipping. The strategy is relatively simple and entails downtime for the various releases (phases 2 and 4). Other techniques can be used to manage several versions of your data model in parallel, in keeping with the "zero downtime deployment" pattern, allowing you to update your relational schema without impacting the availability of the application using it, based on various types of scripts (expansion and contraction scripts), triggers to synchronize the data, or even views to expose the data to the applications through an abstraction layer.
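Assuming a Rails-style stack (as in our Appaloosa example), the expansion phase of this strategy could look like the ActiveRecord migration sketched below; the table and column names follow the Person/Address example above, and the mechanics are deliberately simplified.

# Expansion phase sketch: the Address table is added and populated while
# the old Person.address column is kept, so existing code keeps working
# until it has been switched over. The column is only dropped later, in a
# separate "contraction" migration.
class ExpandPersonAddress < ActiveRecord::Migration[6.1]
  def up
    create_table :addresses do |t|
      t.references :person
      t.string :street, :post_code, :town, :country
    end

    # Simplified: the old single address string is copied into "street".
    execute <<~SQL
      INSERT INTO addresses (person_id, street)
      SELECT id, address FROM people
    SQL
  end

  def down
    drop_table :addresses
  end
end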
  • 106. 106 THE WEB GIANTS PRACTICES / FEATURE FLIPPING Changes to one's relational model are much less frequent than changes to code, but they are complex and have to be planned well in advance and managed very carefully. NoSQL (Not Only SQL) databases are much more flexible where the data model is concerned and can therefore also be an interesting option. Who makes it work for them? It works for us, even though we are not (yet!) Web Giants. In the framework of our Appaloosa project we successfully implemented the various patterns described in this article. For the Web Giants, their size, the constraints of deployment across several sites and big data migrations leave them no choice but to implement such mechanisms. Among the most famous are Facebook, Flickr and Lyris Inc. Closer to home are Meetic, the Bibliothèque nationale de France and Viadeo, with the latter being particularly insistent on code clean-up and only leaving flippers in production for a few days. And anyone who practices continuous deployment (cf. "Continuous Deployment", p. 105) applies, in one way or another, the feature flipping pattern. How can I make it work for me? There are various ready-made implementations in different languages, such as the rollout gem in Ruby and the feature flipper in Grails, but the pattern is so easy that we recommend you design your own implementation tailored to your specific needs. There are multiple benefits and possible uses, so if you need to progressively deploy functionalities, carry out user group tests or deploy continuously, then get started!
  • 107. 107 THE WEB GIANTS Sources • Flickr Developer Blog: http://code.flickr.com/blog/2009/12/02/flipping-out • Summary of the Flickr session at Velocity 2010: http://theagileadmin.com/2010/06/24/velocity-2010-always-ship-trunk • Quora Questions on Facebook: http://www.quora.com/Facebook-Engineering/How-does-Facebooks- Gatekeeper-service-work • Forrst Engineering Blog: http://blog.forrst.com/post/782356699/how-we-deploy-new-features- on-forrst • Slideshare Lyrics Inc. : http://www.slideshare.net/eriksowa/feature-bits-at-devopsdays-2010-us • Talk Lyrics Inc. at Devopsdays 2010: http://www.leanssc.org/files/201004/videos/20100421_ Sowa_EnabilingFlowWithinAndAcrossTeams/20100421_Sowa_ EnabilingFlowWithinAndAcrossTeams.html • Whitepaper Lyrics Inc. : http://atlanta2010.leanssc.org/wp-content/uploads/2010/04/Lean_ SSC_2010_Proceedings.pdf • Interview with Ryan King from Twitter: http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an- interview-with-ryan-king • Blog post by Martin Fowler: http://martinfowler.com/bliki/FeatureToggle.html • Blog 99designs: http://99designs.com/tech-blog/blog/2012/03/01/feature-flipping
  • 110. 110 THE WEB GIANTS PRACTICES / A/B TEST Description A/B Testing is a product development method to test a given functionality’s effectiveness. You can thus test e.g. a marketing campaign via e-mail, a home page, an advertising insert or a payment method. This test strategy allows you to validate various object releases for a single variable: the subject line of an e-mail or the contents of a web page. Like any test designed to measure performance, A/B Testing can only be carried out in an environment capable of measuring an action’s success. Let us take the example of a subject heading in an email. The test must bear on how many times it was opened to determine which contents were most compelling. For web pages, you look at click-through rates; for payments, conversion rates. Implementation The method itself is relatively simple. You have variants of an object which you want to test on various user subsets. Once you have determined the best variant, you open it to all users. A piece of cake? Not quite. The first question must be the nature of the variation: where do you set your cursor between micro-optimization and major overhaul? All depends on where you are on the learning curve. If you’re in the client exploration phase (cf. “Minimum Viable Product“, p. 95, “Lean Startup“, p. 87), A/B Testing can completely change the version tested. For example, you can set up two home pages with different marketing messages, different layouts and graphics, to see user reactions to both. If you are farther along in your project, where the variation of a conversion goal of 1% makes a difference, variations can be more subtle (size, color, placement, etc.).
  • 111. 111 THE WEB GIANTS The second question is your segmentation. How will you define the various sub-sets? There is no magic recipe, but there is a fundamental rule: the segmentation criteria must have no influence on the experience results (A/B Testing = a single variable). You can take a very basic feature such as subscription date, alphabetical order, as long as it does not affect the results. The third question is when to stop. How do you know when you have enough responses to generalize the results of the experiment? It all depends on how much traffic you are able to generate, on how complex your experiment is and the difference in performance across your various samplings. In other words, if traffic is low and results are very similar, the test will have to run for longer. The main tools available on the market (Google Website Optimizer, Omniture TestTarget, Optimizely) include methods for determining if your tests are significant. If you manage your tests manually, you should brush up on statistics and sampling principles. There are also websites to calculate significance levels for you.[1] Let us now turn to two pitfalls to be avoided when you start A/B Testing. First, looking at performance tests from the perspective of a single goal can be misleading. Given that the test changes the user experience, you must also monitor your other business objectives. By changing the homepage of a web site for example, you will naturally monitor your subscription rate, without forgetting to look at payment performance. The other pitfall is to offer a different experience to a single group over time. The solution you implement must be absolutely consistent for the duration of the experiment: returning users must be presented with the same experimentation version, both for the relevance of your results and the user experience. Once you have established the best solution, you will then obviously deploy it for all. Who makes it work for them? We cannot not cite the pioneer of A/B Testing: Amazon. Web players on the whole show a tendency to share their experiments. On the Internet you will have no trouble finding examples from Google, Microsoft, Netflix, Zynga, Flickr, eBay, and many others, with at times surprising results. The site www.abtests.com lists various experiments. [1] http://visualwebsiteoptimizer.com/ab-split-significance-calculator
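If you do run the numbers yourself, the classical test for a conversion-rate experiment is the two-proportion z-test. Here is a small sketch in Ruby (the visitor and conversion counts in the example are made up); as a rule of thumb, a z-score above roughly 1.96 in absolute value corresponds to 95% confidence that the difference between the two variants is not due to chance.

# Two-proportion z-test: z-score of the difference between the conversion
# rates of variant A and variant B, using the pooled standard error.
def ab_z_score(conversions_a, visitors_a, conversions_b, visitors_b)
  rate_a = conversions_a.to_f / visitors_a
  rate_b = conversions_b.to_f / visitors_b
  pooled = (conversions_a + conversions_b).to_f / (visitors_a + visitors_b)
  std_error = Math.sqrt(pooled * (1 - pooled) * (1.0 / visitors_a + 1.0 / visitors_b))
  (rate_b - rate_a) / std_error
end

# Example: 420 sign-ups out of 10,000 visitors versus 465 out of 10,000.
puts ab_z_score(420, 10_000, 465, 10_000)   # ~1.55: not yet significant, keep the test running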
  • 112. 112 THE WEB GIANTS How can I make it work for me? A/B Testing is above all a right to experiment. Adopting a learning stance, with results hypotheses from the outset and a modus operandi, is a source of motivation for product teams. Linking the tests to performance is a way to set up product management driven by data. It is relatively simple to set up A/B Testing (although you do need to respect a certain hygiene in your practices). Google Web Site Optimizer, to mention but one, implements a tool which is directly hooked up to Google Analytics. For a reasonable outlay, you can give your teams the means to objectivize their actions in relation to the end-product. Sources • 37Ssignals, A/B Testing on the signup page: http://37signals.com/svn/posts/1525-writing-decisions-headline-tests- on-the-highrise-signup-page • Tim Ferris: http://www.fourhourworkweek.com/blog/2009/08/12/google-website- optimizer-case-study • Wikipedia: http://en.wikipedia.org/wiki/A/B_testing PRACTICES / A/B TEST
  • 115. 115 THE WEB GIANTS CULTURE / DESIGN THINKING Description In their daily quest for more connection with users, businesses are beginning to realise that these “users“, “clients“, and other “collaborators“ are first and foremost human beings. Emerging behaviour patterns, spawned by new possibilities opened up by technology, are changing consumer needs and their brand loyalties. The web giants were among the first to adopt an approach based on the relevance of all stakeholders involved in the creation of a product, and therefore concerned by the user experience provided by a given service. Here, the way Designers have appropriated the work tools is ideal for qualifying an innovative need. Reconsidering Design has become a key issue. It is essential for any Organization that wishes to change and innovate, to question the business culture, to dare go as far as disruption. Born in the 1950s and more recently formalised by the Agency IDEO[1] Design Thinking was developed at Stanford University in the USA as well as the University of Toronto in Canada, before making a significant impact on Silicon Valley, to the extent that it is becoming an approach assimilated by all major web businesses and startups. It then spread to the rest of the English speaking world, and then all of Europe. Design thinking is a human-centered approach to innovation that draws from the designer’s toolkit to integrate the needs of people, the possibilities of technology, and the requirements for business success Tim Brown IDEO [1] http://www.wired.com/insights/2014/04/origins-design-thinking/
  • 116. 116 THE WEB GIANTS A new vision of Design Emergence of a strategic asset First of all one must reconsider the word Design itself, to understand its deeper, almost etymological, meaning, and therefore recognise that when you speak of Design, it means that you want to give significance to something, whether a product, a service or an Organization. In fact, there is Design whenever you want to "give meaning" to something: a far cry from the simple representation, aesthetic or merely practical, of a product. "Great design is not something anybody has traditionally expected from Google" – TheVerge Several web giants became aware of the strategic relevance of "operational" Design before more fully implementing Design Thinking.[2] This is the case for Google which, in 2011,[3] published a strong strategic vision for Design, namely offering an additional option beyond pure "full metrics" (systematic A/B Testing, incremental feedback without direct user feedback...). [2] http://www.forbes.com/sites/darden/2012/05/01/designing-for-growth-apple-does-it- so-can-you/ [3] http://www.theverge.com/2013/1/24/3904134/google-redesign-how-larry-page- engineered-beautiful-revolution Figure: Design joins Meaning (the why) and Conception (the how).
  • 117. 117 THE WEB GIANTS CULTURE / DESIGN THINKING Today, there are even Designers behind the creation of various web giants, such as AirBnB.[4] And some go so far as to consider Design the main asset in their global business strategy (Pinterest, various Design Disruptors). The first step to implementing a strategic Design is to create an environment which fosters the expression of different opinions around the role of Design within the company. This is how you avoid conflation between operational, cognitive and strategic aspects. [4] http://www.centrodeinnovacionbbva.com/en/news/airbnb-design-thinking-success-story Figure: the four steps of Design maturity. Step 01: companies that do not use design. Step 02: companies that use design for styling and appearance. Step 03: companies that integrate design into the development process. Step 04: companies that consider design a key strategic element. Also shown: Emotional Design, Interaction Design, Strategic Design; Meaningful, Usable, Delightful.
  • 118. 118 THE WEB GIANTS Designing the experience a dialog between users and professionals “Design is human. it’s not about “is it pretty,“ but about the connection it creates between a product and our lives.“ – Jenny Arden, Design Manager AirBnB A strong bond is established through Design between the user and the designer. This is a context where the designer offers a service, promises an experience, after which the user qualifies the experience through feedback - negative or positive - which can lead to designer loyalty. It is this relationship that leads to strong business value. Such commitments are to be seen towards social networks (LinkedIn, Facebook, Pinterest, Twitter…) and therefore largely among the web giants and, by extension, towards all desirable digital services. It is the Design process which materialises this relationship; the shared history between the brand, its product, or the service behind the product and users. “When people can build an identity with your product over time, they form a natural loyalty to it.“ Soleio Cuervo, Head of Design, Dropbox Then come specialists of this precious relationship, in the form of labs or other types of specialised Organizations (Google Venture, IBM), working to optimise this new balance.
  • 119. 119 THE WEB GIANTS CULTURE / DESIGN THINKING Design thinking The working hypothesis Design thinking entails understanding needs, and makes it possible to create tailored and adequate solutions for any problem that comes up. This means taking an interest in fellow humans in the most open, compassionate way possible. Innovation appears in the balance between the following factors: What is viable from a business perspective, in line with the business model. What is technologically feasible, neither obsolete nor too far ahead of its time. And, lastly, what is desirable: the human factor and takeaways. The specificity of the process lies in its ability to address a problem through unprecedented collaboration between all stakeholders: from the "creators" (those who drive the business strategy, for example the company) to the "users", whoever they may be (in-house and external, direct and indirect). Figure: innovation at the intersection of Business (viable), Technology (doable) and Humans (desirable).
  • 120. 120 THE WEB GIANTS Methodological approach The methodological translation of the Design Thinking approach is a series of steps where the goal is to provide structure for innovation by optimising the analytical and intuitive aspects of a problem. Figure (Rotman): bridging the fundamental predilection gap; Design Thinking as a 50/50 mix between 100% reliability and 100% validity. The approach unfolds in three main phases: Inspiration or Discovery: learning to examine a problem or request. Understanding and observing people and their habits. Getting a feel for emerging wishes and needs. Ideating or Defining: making sense of the discoveries triggered by a concept or vision. Establishing the business and technology possibilities and prototyping the target innovations as quickly as possible. Implementing or Delivering: materialising and testing to maximise feedback on the innovation so as to swiftly make adjustments.
  • 121. 121 THE WEB GIANTS CULTURE / DESIGN THINKING More precisely, these phases are often broken down into several steps to anchor the methodology. The number and nature of the steps vary depending on who is implementing them. Below are the 5+1 steps suggested by the Stanford Institute of Design and adopted by IDEO:5 Empathy: Begin by understanding the people who will be impacted by your product or service. This has to do with contacts, interviews, relations. It is the choice of rediscovering the demand environment. The mandate is openness, curiosity, and not formalisation. Definition: It is the formalisation of a concept bearing on all the elements discovered during the first step. It is based on real needs, driven by potential clients rather than the company's context. Ideation: This is the step where ideas are generated. This optimism phase encourages all possible ideas emerging from the previously discovered concepts. Exercises and Design workshops can serve to focus on specific aspects to see what intentions are possible. Little by little, ideas are grouped together, refined, completed, and given more specific meaning. Prototyping: Then comes the moment for materialisation, for moving on to the “how“. Here the problems are represented more concretely, to draw out potential. Speed is of the essence, especially in making mistakes so as to quickly reposition. Simple materials are used such as cardboard, putty... Testing: It is then time to test the prototype, with potential users, to ensure its feasibility and check that it is a cultural fit for your brand. Sparked interest is proof that the prototype is a solution in tune with a user need. Lastly, let us add evolution: The results from the preceding phases should be a new starting point for researching the best way to create value around a given need. One thus understands that the implementation of the Design approach does not end once the process has started, because it forces you to systematically evolve what you already have. [5] https://dschool.stanford.edu/sandbox/groups/designresources/wiki/36873/ attachments/74b3d/ModeGuideBOOTCAMP2010L.pdf?sessionID=c2bb722c7c1ad51 462291013c0eeb6c47f33e564
  • 122. 122 THE WEB GIANTS Figure: the iterative cycle of the five steps, Empathize, Define, Ideate, Prototype, Test. Some of the steps can be repeated, adjusted, refined, added to. New ideas are born out of tests: following prototyping for example, other types of potential clients can emerge... And this happens in a context of iteration, co-creation, sometimes without any hierarchy, and with a sufficiently optimistic mindset to accept any failures. Design vs. Tech[6] Design is currently such a major driver for the web giants that questions arise concerning technology as a crucial strategic element. Choices are made in the front-end of everything – Scott Belsky Behance One effectively observes that the beneficial effects of Moore's law are diminishing while, at the same time, users are gaining in maturity, to the point where they are increasingly involved in defining the perfect interface for them. [6] http://www.kpcb.com/blog/design-in-tech-report-2015
  • 123. 123 THE WEB GIANTS CULTURE / DESIGN THINKING Figure (modified from the Design in Tech presentation by John Maeda, KPCB): "Why are tech companies acquiring design agencies?" The old way of thinking: the solution to every new problem in tech has been simple, more tech; a better experience was made with a faster CPU or more memory. The new way of thinking: Moore's Law no longer cuts it as the key path to a happier customer. Moreover, the new generations of users no longer consider possibilities driven by technology as innovation breakthroughs but rather as basic expectations (it is normal for technology to open up new possibilities). Thus it is Design which makes the difference in what clients buy and the brands they are loyal to. Noting this trend, many web giants started buying up companies specialised in Design in 2010. Figure (same source): #DesignInTech M&A activity; the number of designer co-founded tech companies from 2005 to the present (Flickr, Android, YouTube, Vimeo, Tumblr, Slideshare, Instagram, Behance, Mailbox, Beats, about.me, Songza, Mint, among others, with acquisitions of up to $1.65B); mobile was the inflection point for #DesignInTech.
  • 124. 124 THE WEB GIANTS How can I make it work for me? Which way implementing The crucial step is to evolve your company into a Design-centric Organization: The strategy is to promote full integration of Design Thinking in your company:[7] Design-centric leaders, who consider Design as a structural cultural edge both within their company and in the expression of their values (Products, services, expert advice, quality of product code...). Embracing the Design culture: The development of the business culture is systematically informed by values of empathy rather than organic growth, the user experience (UX) is the most important benchmark, and the goal is to provide high quality client experiences (CX) with true value. The Design thought process Design thinking and its implementation are a given in the company mindset, and therefore teams concentrate on opportunities in problematics rather than on project opportunities. Several implementation vectors can serve to promote this mindset: The acquisition of talent, i.e. incorporating designers (IBM) Callinguponconsultantsforhelpwithissueswhichgobeyondmethodology Assimilation, by integrating Design studios and coaches[8] A structure built around Design. Companies are organised around attracting talent and co-leaders for each position to encourage each other to create initiatives, an integral part of their responsibilities. Globally speaking, 10% of the 125 richest companies in the USA have Top Managers or CEOs from Design. Alongside the web giants, one notes that the CEO of Nike is a Designer. Apple is the only company to have a SVP for Design. Adapting will take time, especially as most have yet to realise the relevance of Design. Getting help from Designers or UX specialists familiar with the approach is necessary for sharing these new tools and then putting them into operation. [7] https://hbr.org/2015/09/design-as-strategy [8] http://www.wired.com/2015/05/consulting-giant-mckinsey-bought-top-design-firm/
  • 125. 125 THE WEB GIANTS CULTURE / DESIGN THINKING Figure: between 2003 and 2007, the shift in the share of companies that do not use design, that use design for styling and appearance, that integrate design into the development process, and that consider design a key strategic element. Who makes it work for them? While GAFAM, NATU and other web giants have been following this strategy since 2010, today all sectors refer, directly or indirectly, to Design Thinking in their quest for an optimal client experience.[9] Among concrete examples, we will mention the following: On the point of disappearing after multiple failures, AirBnB managed to turn themselves around thanks to Design Thinking.[10] Exploration of aggregated services, proposed and tested by Uber in partnership with Google.[11] Still at Uber, the Design Thinking approach underlies the entire internal structure of the company.[12] With the same goal, IBM restructured its Organization through a Design transition.[13] At Dropbox, Design Thinking is ubiquitous, both in terms of its products and its internal structure.[14] [15] More precisely, strong involvement in Strategic Design can be described as: Implementation in several stages (from visual Design to strategic Design) at: Google, Apple, Facebook, Dropbox, Twitter, Netflix, Salesforce, Amazon… An overarching Design-centric strategy at: Pinterest, AirBnB, Google Ventures, Coursera, Etsy, Uber, most FinTechs. [9] http://blog.invisionapp.com/product-design-documentary-design-disruptors/ [10] https://www.youtube.com/watch?v=RUEjYswwWPY [11] http://www.happinessmakers.com/knowledge/2015/11/29/inside-ubers-design- thinking [12] http://talks.ui-patterns.com/videos/applying-design-thinking-at-the-organizational- level-uber-amritha-prasad [13] http://www.fastcodesign.com/3028271/ibm-invests-100-million-to-expand-design- business [14] https://twitter.com/intercom/status/614537634833137664 [15] http://designerfund.com/bridge/day-in-the-life-rasmus-andersson/
  • 126. 126 THE WEB GIANTS Associated patterns Pattern “Enhancing the user experience“ p. 27 Pattern “Lean Startup“ p. 87 Sources • Evolution of Design Thinking: Special issue of the Harvard Business Review: https://hbr.org/archive-toc/BR1509?cm_sp=Magazine%20Archive-_- Links-_-Previous%20Issues http://stanfordbusiness.tumblr.com/post/129579353544/how-design- thinking-can-help-drive-relevancy-in • The example of AirBnB: https://growthhackers.com/growth-studies/airbnb https://www.youtube.com/watch?v=RUEjYswwWPY • Methodology: https://www.ideo.com/images/uploads/thoughts/IDEO_HBR_Design_ Thinking.pdf https://www.rotman.utoronto.ca/Connect/RotmanAdvantage/ CreativeMethodology.aspx http://www.gv.com/sprint/ • Design Value: http://www.dmi.org/default.asp?page=DesignDrivesValue#. VW6gfEycSdQ.twitter Design-Driven Innovation-Why it Matters for SME Competitiveness White Paper – Circa Group • Design in Tech: http://www.kpcb.com/blog/design-in-tech-report-2015
  • 129. 129 THE WEB GIANTS PRATICES / DEVICE AGNOSTIC Description For Web Giants, user-friendliness is no longer open to debate: it is non negotiable. As early as 2003, the Web 2.0 manifesto pleaded in favor of the “Rich User Experience“, and today, anyone working in the world of the Web knows the importance of providing the best possible user interface. It is held to be a crucial factor in winning market shares. In addition to demanding high quality user experience, people want to access their applications anywhere, anytime, in all contexts of their daily lives. Thus a distinction is generally made between situations where one is sitting (e.g. at the office), nomadic (e.g. waiting in an airport terminal) or mobile (e.g. walking down the street). These situations are currently linked to various types of equipment, or devices. Simply put, one can distinguish between: Desktop computers for sedentary use. Laptops and tablets for nomadic use. Smartphones for mobile use. The Device Agnostic pattern means doing one’s utmost to offer the best user experience possible whatever the situation and device. One of the first companies to develop this type of pattern was Apple with its iTunes ecosystem. In fact, Apple first made music accessible on PC/ Mac and iPod, then on the iPhone and iPad. Thus they have covered the three use situations. In contrast, Apple does not fully apply the pattern as their music is not accessible on Android or Windows Phone. To implement this approach, it can be necessary to offer as many interfaces as there are use situations. Indeed, a generic interface of the one-size-fits-all type does not allow for optimal use on computers, tablets, smartphones, etc.
  • 130. 130 THE WEB GIANTS The solution adopted by many of Web Giants is to invest in developing numerous interfaces, applying the pattern API first (cf. “Open API“, p. 235). Here the principle is for the application architecture to be based on a generic API, with the various interfaces then being directly developed by the company, or indirectly through the developer and partner ecosystem based on the API. To get the most out of each device, it is becoming ever more difficult to use only Web interfaces. This is because they do not manage functionalities specific to a given device (push, photo-video capture, accelerometer, etc.). Users also get an impression of lag because the process entails frontloading the entire contents,[1] whereas native applications need no loading or only a few XML or JSON resources. I’d love to build one version of our App that could work everywhere. Instead, we develop separate native versions for Windows, Mac, Desktop Web, iOS, Android, BlackBerry, HP WebOS and (coming soon) Windows Phone 7. We do it because the results are better and, frankly, that’s all-important. We could probably save 70% of our development budget by switching to a single, cross-platform client, but we would probably lose 80% of our users. Phil Libin, CEO Evernote (January, 2011) However things are changing with HTML5 which functions in offline mode and provides resources for many applications not needing GPS or an accelerometer. In sum, there are two approaches adopted by Web companies: those who use only native applications such as Evernote, and those who take a hybrid approach using HTML5 contents embarked in the native application which then becomes a simple empty shell, capable only of receiving push notifications. This is in particular the case of Gmail, Google+ and Facebook for iPhone. One of the benefits of this approach is to enhance visibility in the AppStores where users go for their applications. The hybrid pattern is thus a good compromise: companies can use the HTML5 code on a variety of devices and still install the application via an App Store with Apple, Android, Windows Phone, and, soon, Mac and Windows. [1] This frontloading can be optimized (cf. “Enhancing the user experience“, p. 27) but there are no miracles…
  • 131. 131 THE WEB GIANTS PRATICES / DEVICE AGNOSTIC Who makes it work for them? There are many examples of the Device Agnostic pattern being implemented among Web Giants. Among others: In the category of exclusively native applications: Evernote, Twitter, Dropbox, Skype, Amazon, Facebook. In the category of hybrid applications: Gmail, Google+. References among Web Giants Facebook proposes: A Web interface for PC/Mac: www.facebook.com. A Web interface for Smartphones: m.facebook.com. Embarked mobile interfaces for iPad, iPhone, Android, Windows Phone, Blackberry, PalmOS. A text message interface to update one’s status and receive notifications of friend updates. An email interface to update one’s status. In addition, there are several embarked interfaces for Mac and PC offered by third parties such as Seesmic and Twhirl. Twitter stands out from the other Web Giants in that it is their ecosystem which does the implementing for them (cf. “Open API“, p. 235). Many of the Twitter graphic interfaces were in fact created by third parties such as TweetDeck, Tweetie for Mac and PC, Twitterrific, Twidroid for smartphones... To the extent that, for a time, Twitter’s web interface was considered unuser friendly and many preferred to use the interfaces generated by the ecosystem instead. Twitter is currently overhauling the interfaces.
  • 132. 132 THE WEB GIANTS In France One finds the Device Agnostic pattern among major media groups. For example Le Monde proposes: A Web interface for PC/Mac: www.lemonde.fr A Web interface for Smartphones: mobile.lemonde.fr Hybrid mobile interfaces for iPhone, Android, Windows Phone, Blackberry, PalmOS, Nokia OVI, Bada An interface for iPad It is also found in services with high consultation rates such as banking. For example, the Crédit Mutuel proposes: A Web interface for PC/Mac: www.creditmutuel.fr A redirect service for all types of device: m.cmut.fr A Web interface for Smartphones: mobi.cmut.fr A Web interface for tablets: mini.cmut.fr A WAP interface: wap.cmut.fr A simplified Java interface for low technology phones Embarked mobile interfaces for iPad, iPhone, Android, Windows Phone, Blackberry An interface for iPad.
• 133. 133 THE WEB GIANTS PRACTICES / DEVICE AGNOSTIC How can I make it work for me? The pattern is useful for any B2C service where access anywhere, anytime is important. If your budget is limited, you can implement the mobile application most used by your target clients, and propose an open API in the hope that others will develop the interfaces for additional devices. Associated patterns The Open API or open ecosystem pattern, p. 235. The Enhancing the User Experience pattern, p. 27. Exception! As mentioned earlier, this pattern is only limited by the budget required for its implementation. Sources • Rich User Experiences, Web 2.0 Manifesto, Tim O’Reilly: http://oreilly.com/Web2/archive/what-is-Web-20.html • Four Lessons From Evernote’s First Week On The Mac App Store, Phil Libin: http://techcrunch.com/2011/01/19/evernote-mac-app-store
• 136. 136 THE WEB GIANTS PRACTICES / PERPETUAL BETA Description Before introducing perpetual beta, we must revisit a classic pattern in the world of open source software: Release early, release often. The principle behind this pattern consists of regularly releasing code to the community to get continuous feedback on your product from programmers, testers, and users. This practice is described in Eric Steven Raymond’s 1999 work “The Cathedral and the Bazaar“. It is in keeping with the short iteration principle in agile methods. The principle of perpetual beta was introduced in the Web 2.0 manifesto written by Tim O’Reilly, where he writes: Users must be treated as co-developers, in a reflection of open source development practices (...). The open source dictum, ‘release early and release often’, in fact has morphed into an even more radical position, ‘the perpetual beta’, in which the product is developed in the open, with new features slipstreamed in on a monthly, weekly, or even daily basis. The term “perpetual beta“ refers to the fact that an application is never finalized but is constantly evolving: there are no real releases of new versions. Working this way is obviously in line with the logic of “Continuous Delivery“ (cf. “Continuous Deployment“, p. 105). This constant evolution is possible because these are online services rather than shipped software: In the case of software, version management usually follows a roadmap with publication milestones: releases. These releases are spread out over time for two reasons: the time it takes to deploy the versions to the users, and the need to ensure maintenance and support for the various versions released to users. Monitoring
• 137. 137 THE WEB GIANTS support, security updates and ongoing maintenance on several versions of a single program is a nightmare, and a costly one. Let us take the example of Microsoft: the Redmond-based publisher had, at one point, to manage changes to Windows XP, Vista and Seven. One can imagine three engineering teams all working on the same software: a terrible waste of energy and a major crisis for any company lacking Microsoft’s resources. This syndrome is known as “version perversion“. In the context of online services, only one version of the application needs to be managed. Furthermore, since it is the Web Giants themselves who upload and host their applications, users benefit from updates without having to manage the software deployment. New functionalities appear on the fly, where they are “happily“ discovered by the users. In this way one learns to use new functions in applications progressively. Generally speaking, backward compatibility is well managed (with a few exceptions, such as support for disconnected mode in Gmail, which was dropped when Google Gears was abandoned). This model is widely applied by the stakeholders of Cloud Computing. The “customer driven roadmap“ is a complementary and virtuous feature of the perpetual beta (cf. “Lean Startup“, p. 87). Since Web Giants manage the production platform, they can also finely measure how their software is used, and thereby the success of each new functionality. As mentioned previously, the Giants follow metrics very closely. So closely in fact that we have devoted a chapter to the subject (cf. “The obsession with performance measurement“, p. 13). More classically, running the production platform provides opportunities to launch surveys among various target populations to get user feedback. To apply the perpetual beta pattern, you must have the means to carry out regular deployments. The prerequisites are: implementing automatic software builds, practicing Continuous Delivery, ensuring you can roll back in case of trouble...
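As an illustration of the last prerequisite, here is a minimal, hypothetical sketch of a release layout in which each deployment lands in its own directory and a “current“ symlink is switched atomically; rolling back then amounts to re-pointing the link. The paths and layout are invented for the example and are not a description of any particular Giant’s tooling.

```python
# Minimal sketch of a deploy-with-rollback step: each release is kept in its own
# directory and a "current" symlink is switched atomically, so rolling back is
# just pointing the symlink at the previous release. Paths are hypothetical.
import os
import time

RELEASES_DIR = "/srv/app/releases"
CURRENT_LINK = "/srv/app/current"

def switch(release: str) -> None:
    tmp = CURRENT_LINK + ".tmp"
    os.symlink(release, tmp)
    os.replace(tmp, CURRENT_LINK)   # atomic switch of the "current" pointer

def deploy(build_artifact_dir: str) -> str:
    release = os.path.join(RELEASES_DIR, time.strftime("%Y%m%d%H%M%S"))
    os.makedirs(release)
    # In a real pipeline the build artifact would be copied or extracted here.
    os.symlink(build_artifact_dir, os.path.join(release, "app"))
    switch(release)
    return release

def rollback(previous_release: str) -> None:
    switch(previous_release)        # going back is just re-pointing the link
```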
• 138. 138 THE WEB GIANTS PRACTICES / PERPETUAL BETA There is some controversy around the perpetual beta: some clients equate beta with an unfinished product and believe that services following this pattern are not reliable enough to count on. This has led some service operators to remove the mention “beta“ from their site, albeit without changing their practices. Who makes it work for them? The reference was Gmail, which sported the beta label until 2009 (with the tongue-in-cheek “back to beta“ option being added later). It is a practice implemented by many Web Giants: Facebook, Amazon, Twitter, Flickr, Delicious.com, etc. A good illustration of perpetual beta is provided by Gmail Labs: these are small standalone features which users can decide to activate or not. Depending on the rate of adoption, Google then decides whether or not to integrate them in the standard version of the service (cf. “Feature Flipping“, p. 113). In France, the following services display, or have displayed, the beta logo on their home page: urbandive.com: a navigation service with street view by Pages Jaunes; sen.se: a service for storing and analyzing personal data. Associated patterns Pattern “Continuous Deployment“, p. 105. Pattern “Test A/B“, p. 123. Pattern “The obsession with performance measurement“, p. 13.
• 139. 139 THE WEB GIANTS Exception! Some Web Giants still choose to keep multiple versions up and running simultaneously. Maintaining several versions of an API is particularly relevant as it saves developers from being forced into updating their code every time a new version of the API is released (cf. “Open API“, p. 235). The Amazon Web Services API is a good example. Sources • Tim O’Reilly, What Is Web 2.0?, 30 September 2005: http://oreilly.com/pub/a/web2/archive/what-is-web-20.html • Eric Steven Raymond, The Cathedral and the Bazaar: http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/
  • 141. 141 Cloud First..................................................................................... 159 Commodity Hardware.................................................................... 167 Sharding........................................................................................ 179 TP vs. BI: the new NoSQL approach................................................ 193 Big Data Architecture..................................................................... 201 Data Science.................................................................................. 211 Design for Failure........................................................................... 219 The Reactive Revolution................................................................. 225 Open API ...................................................................................... 233
• 144. 144 THE WEB GIANTS ARCHITECTURE / CLOUD FIRST Description As we saw in the description of the pattern “Build vs. Buy“ (cf. “Build vs. Buy“, p. 19): Web Giants favor specific developments so as to control their tools from end to end, whereas many companies instead use software packages, considering that software tools are commodities.[1] Although Web Giants, like startups, prefer to develop critical applications in-house, they do at times have recourse to third-party commodities. In this case, they apply the commodity logic to the fullest by choosing to completely outsource the service in the Cloud. By favoring services in the Cloud, Web Giants, again like startups, take a very pragmatic stance: profiting from the best innovations by their peers, speedily and with an easy-to-use purchase model, to focus their efforts on their business strengths. This model can be inspiring for all companies wishing to move fast and to reduce investment costs to win market share. Why favor the Cloud in the commodity framework? The table on the following page lays out the advantages. The Cloud approach can be divided into three main strands: Using APIs and Mashups: Web Giants massively call upon services developed by Cloud companies (Google Maps, user identification on Facebook, payment with PayPal, statistics with Google Analytics, etc.) and integrate them in their own pages via the mashup principle. Outsourcing functional commodities: Web majors often externalize their commodities to SaaS services (e.g. Google Apps for collaborating, Salesforce for managing sales personnel, etc.) Outsourcing technical commodities: Web players also regularly use IaaS and PaaS platforms to host their services (Netflix and Heroku for example use Amazon Web Services).
• 145. 145 THE WEB GIANTS The comparison between the two models, analysis axis by analysis axis (in-house management vs. Cloud):
Cost — In-house management: initial outlay for licenses, equipment and staff. Cloud: pay-per-use, with neither investment nor commitment.
Time to Market — In-house management: license purchase, then deployment by the company within a few weeks. Cloud: self-service subscription, automatically provisioned within minutes.
Roadmap / new functionalities — In-house management: designed in the mid term by publishers following feedback from user groups. Cloud: implemented in the short term depending on what users do with the service.
Rhythm of change — In-house management: often one major release per year. Cloud: new functionalities on the fly.
Support and updates — In-house management: additional yearly cost. Cloud: included in the subscription.
Hosting and operating — In-house management: entails building and operating a datacenter by experts. Cloud: delegated to the Cloud operator.
The physical safety of data — In-house management: data integrity is the responsibility of the company. Cloud: the major Cloud operators ensure the safety of data in accordance with the ISO 27001[1] and SSAE 16[2] standards.
[1] ISO 27001: http://en.wikipedia.org/wiki/ISO_27001 [2] SSAE 16 (replacing the Type 2 SAS 70): http://www.ssae-16.com
  • 146. 146 THE WEB GIANTS ARCHITECTURE / CLOUD FIRST Housing technical commodities in the Cloud is particularly interesting for Web companies. With the pay-as-you-go model, they can launch online activities with next to no hosting costs. Charges increase progressively as the number of users grows, alongside revenues, so all is well. The Cloud has thus radically changed their launch schedules. The Amazon Web Services platform IaaS is massively used by Web Giants such as Dropbox, 37signals, Netflix, Heroku... During the CloudForce 2009 conference in Paris, a Vice-President of Salesforce affirmed that the company did not use an IaaS platform because such solutions did not exist when the company was created, but that if it were to be done today they would certainly choose IaaS. Who makes it work for them? The eligibility of the Cloud varies depending both on the type of data you manipulate and regulatory constraints. Thus: Banks in Luxembourg are forbidden from storing their data elsewhere than in certified organizations. Companies working with sensitive data, industrial secrets or patents are reluctant to store them in the Cloud. The Patriot Act[3] in particular pushes companies away from the Cloud: it forces companies registered in the United States to make their databases available upon request by government authorities. Companies which work with personal data can also be forced to restrict their recourse to the Cloud because of the CNIL regulations, the respect of which varies from one Cloud platform to the next (variable implementation of Safe Harbor Privacy Principles).[4] [3] http://en.wikipedia.org/wiki/PATRIOT_Act [4] http://en.wikipedia.org/wiki/International_Safe_Harbor_Privacy_Principle
• 147. 147 THE WEB GIANTS When there are no such constraints, using the Cloud is possible. And many companies of all sizes and from all sectors have migrated to the Cloud, in the USA as well as in Europe. Let us describe a case that nicely illustrates the potential of the Cloud: In 2011, Vivek Kundra, former CIO at the White House, announced the “Cloud First“ program, which stipulated that all US administrations had to consider the Cloud first and foremost for their IT. This decision should be put in context: in the USA there is the “GovCloud“, i.e. Cloud offers suited to administrations, with full respect for their constraints, located on American soil, and isolated from other clients. Such services are offered by Amazon, Google and other providers.[5] In some companies, it is the prevailing mindset that is dead set against storing data in the Cloud. This reluctance is due to the factors presented above, but also to a lack of confidence (Cloud providers have not yet reached the levels of trust of banks) and possibly to an unwillingness to change. Web Giants are less affected by these two latter impediments: they are already well acquainted with the Cloud providers and are open to change. Cloud addiction? One should also be careful not to depend too fully on a single Cloud platform to house critical applications. These platforms are not fail-proof, as shown by recent failures: Microsoft Azure (February, 2012), Salesforce (June, 2012), Amazon Web Services (April and July, 2012). The failures at AWS highlighted the lack of maturity of some of its customers in their use of the Cloud: Pinterest, Instagram and Heroku, which depended on a single Amazon datacenter, were strongly impacted, [5] Federal Cloud Computing Strategy, Vivek Kundra, 2011: http://www.forbes.com/sites/microsoft/2011/02/15/kundra-outlines-cloud-first-policy-for-u-s-government
  • 148. 148 THE WEB GIANTS ARCHITECTURE / CLOUD FIRST Netflix used several Amazon datacenters and was thus less affected[6] (cf. “Design for Failure“, p. 221). One should note however that such failures create media hype whereas very little is known about the robustness of corporate datacenters. It is therefore difficult to measure the true impact on users. Here are a few Service Level Agreements that you can compare with those of your companies: Amazon EC2: 99.95% availability per year. Google Apps: 99.9% availability per year. References among Web Giants A few examples of recourse to the Cloud by Web Giants: using Amazon Web Services: Heroku, Dropbox, 37Signals, Netflix, Etsy, Foursquare, Voyages SNCF. In fact, Amazon represents 1% of all traffic on the Web; using Salesforce: Google, LinkedIn; using Google Apps: Box.net. In France A few examples of Cloud use in France: In industry: Valeo, Treves use Google Apps. In insurance: Malakoff Méderic uses Google Apps. [6] Feedback from Netflix on AWS failures: http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.htm
• 149. 149 THE WEB GIANTS In the banking sector: most use Salesforce for at least part of their activities. In the Internet sector: PagesJaunes uses Amazon Web Services. In the public sector: La Poste uses Google Apps for their mail delivery staff. How can I make it work for me? If you are an SME or a VSE, you would probably benefit from externalizing your commodities in the Cloud, for the same reasons as the Web Giants. All the more so as regulatory issues, such as the protection of industrial secrets, should be resolved with the emergence of French and European Clouds such as Andromède. If you are a large company, already well endowed with hardware and IT teams, the benefits of the Cloud can be offset by the cost of change. It can nevertheless be worth studying the question. In any case, you can profit from the Cloud’s agility and pay-as-you-go approach for: innovative projects (pilot projects, Proofs of Concept, project incubation, etc.) and environments with limited life spans (development, testing, design, etc.). Related Pattern Pattern “Build vs. Buy“, p. 19. Exception! As stated earlier, regulatory constraints can cut off access to the Cloud. In some cases, re-internalization is the best solution: when data and user volumetrics increase spectacularly, it can be cheaper to repatriate applications and build a datacenter on totally optimized architecture. This type of optimization does however typically require highly-qualified staff.
• 152. 152 THE WEB GIANTS ARCHITECTURE / COMMODITY HARDWARE Description Although invisible behind your web browser, millions of servers run day and night to make the Web available 24/7. There are very few leaks as to numbers, but it is clear that major Web companies have dozens or even hundreds of thousands of machines, like EC2;[1] it is even surmised that Google has somewhere around a million.[2] Managing so many machines is not only a technical challenge, it is above all an economic one. Most major players have circumvented the problem by using mass-produced equipment, also called “commodity hardware“, which is the term we will use from now on. This is one of the reasons which has led the Web Giants to interconnect a large number of mass-produced machines rather than using a single large system. A single service to a client - a single application - can run on hundreds of machines. Managing hardware this way is known as Warehouse Scale Computing,[3] with hundreds of machines replacing a single server. Business needs Web Giants share certain practices, described in various other chapters of this book:[4] A business model tied to the analysis of massive quantities of data - for example indexing web pages (i.e. approximately 50 billion pages). One of the most important performance issues is to ensure that query response times stay low. [1] Source SGI. [2] Here again it is hard to make estimates. [3] This concept is laid out in great detail in the very long paper The Data Center as a Computer; we only mention a few of its concepts here. The full text can be found at: http://www.morganclaypool.com/doi/pdfplus/10.2200/S00516ED2V01Y201306CAC024 [4] cf. in particular “Sharding“, p. 179.
• 153. 153 THE WEB GIANTS Income from e.g. advertising is not linked to the number of queries; per-query income is actually very low.[5] Comparatively speaking, the cost per unit using traditional large servers remains too high. The incentive to find the architecture with the lowest transaction costs is thus very high. Lastly, the orders of magnitude of the processing carried out by the Giants are far removed from those of traditional corporate IT, where until now the number of users was limited by the number of employees. No machine, however big, is capable of meeting their needs. In short, these players need scalability (a constant marginal cost per transaction), and the marginal cost must stay low. Mass-produced machines vs. high-end servers When scalability is at issue, there are two main alternatives: Scale-up or vertical growth consists in using a better performing machine. This is the alternative that has most often been chosen in the past because it is very simple to implement. Moreover, Moore’s law means that manufacturers regularly offer more powerful machines at constant prices. Scale-out or horizontal scaling consists in pooling the resources of several machines which individually can be much less powerful. This removes all limits as to the size of the machine. Furthermore, PC components, technologies and architectures show a highly advantageous performance/cost ratio. Their relatively weak processing capacity as compared to more efficient architectures such as RISC is compensated for by lower costs obtained through mass production. A study based on the results of the TPC-C[6] shows that the relative cost per transaction is three times lower with a low-end server than with a top-of-the-line one. [5] “Early on, there was an emphasis on the dollar per (search) query,“ [Urs] Hoelzle said. “We were forced to focus. Revenue per query is very low.“ http://news.cnet.com/8301-1001_3-10209580-92.html [6] Ibid, [3] preceding page
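The reasoning can be illustrated with a back-of-the-envelope computation. The figures below are invented for illustration (they are not taken from the TPC-C study); only the structure of the comparison mirrors the argument above.

```python
# Back-of-the-envelope cost-per-transaction comparison between one large server
# (scale-up) and a fleet of commodity machines (scale-out). All figures are
# invented for illustration; only the reasoning mirrors the text.
def cost_per_tps(server_cost, servers, tps_per_server):
    total_tps = servers * tps_per_server
    return (server_cost * servers) / total_tps

high_end = cost_per_tps(server_cost=500_000, servers=1, tps_per_server=10_000)
commodity = cost_per_tps(server_cost=2_000, servers=100, tps_per_server=500)

print(f"high-end server : {high_end:.2f} $ per transaction/s of capacity")
print(f"commodity fleet : {commodity:.2f} $ per transaction/s of capacity")
# With these (made-up) numbers the commodity fleet is ~12x cheaper per unit of
# throughput, before even counting power, cooling and floor-space costs.
```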
• 154. 154 THE WEB GIANTS ARCHITECTURE / COMMODITY HARDWARE At the scales implemented by Web Giants - thousands of machines coordinated to execute a single function - other costs become highly prominent: electric power, cooling, space, etc. The cost per transaction must take these various factors into account. Realizing that has led the Giants to favor horizontal expansion (scale-out) based on commodity hardware. Who makes it work for them? Just about all of the Web Giants: Google, Amazon, Facebook, LinkedIn… all currently use x86-type servers and commodity hardware. However, using such components introduces other constraints, and running a Data Center as a Computer entails constraints which differ widely from those of what most of us think of as datacenters. Let us therefore go into more detail. Hardware characteristics which impact programming Traditional server architecture strives, to the extent allowed by the hardware, to provide developers with a “theoretical architecture“, including a processor, a central memory containing the program and data, and a file system.[7] Familiar programming based on variables, function calls, threads and processes makes this approach necessary. Large-system architectures are as close to this “theoretical architecture“ as a set of machines in a datacenter is far from it. Machines of the SMP (Symmetric Multi Processor) type, used for scaling up, now make it possible to use standard programming, with access to the entire memory and all disks in a uniform manner. [7] This architecture is known as the Von Neumann architecture.
• 155. 155 THE WEB GIANTS Figure 1 (modified). Source RedPaper 4640, page 34. [Diagram of a large SMP server, with its key figures: one server: RAM 8 TB, 39.4 GB/s; disk 304 TB, 10 ms, up to 50 GB/s - one processor book:[9] RAM 1 TB, 133 ns, 46.6 GB/s; disk 304 TB, 10 ms, up to 50 GB/s - one processor: RAM 256 GB, 100 ns, 76.5 GB/s; disk 304 TB, 10 ms, up to 50 GB/s.] As the figures on the diagram show, great efforts are made to ensure that speed and latency are nearly identical between a processor, its memory and disks, whether they are connected directly, connected to a same processor book[9] or different ones. If any NUMA (Non Uniform Memory Access - accessing a nearby memory is faster than accessing memory in a different part of the system) characteristics are retained, they are concentrated on the central memory, with latency and bandwidth differences in a 1 to 2 ratio. [9] A processor book is a compartment which contains processors, memory and in and out connectors, at the first level it is comparable to a main computer board. Major SMP systems are made up of a set of compartments of this sort interconnected through a second board: the midplane.
• 156. 156 THE WEB GIANTS ARCHITECTURE / COMMODITY HARDWARE Operating systems and middleware like Oracle can take charge of such disparities. From a scale-out perspective, the program no longer runs on a single large system but is instead managed by a program which distributes it over a set of machines. This manner of connecting machines in commodity hardware gives a very different vision from that of the “theoretical architecture“ for the developer. Figure 2. Source The Data Center As A Computer page 8 [Diagram of a commodity-hardware datacenter, with its key figures: one server: DRAM 16 GB, 100 ns, 20 GB/s; disk 2 TB, 10 ms, 200 MB/s - local rack (80 servers): DRAM 1 TB, 300 µs, 100 MB/s; disk 160 TB, 11 ms, 100 MB/s - cluster (30 racks): DRAM 30 TB, 500 µs, 10 MB/s; disk 4.80 PB, 12 ms, 10 MB/s. L1$/L2$: level 1 and level 2 caches; P: processor.]
• 157. 157 THE WEB GIANTS Whenever you use the network to access data on another server, access time increases and throughput is divided by 1,000. In addition, it is the network equipment feeding into the datacenter that is the limiting factor in terms of the aggregated bandwidth of all machines. In consequence, to optimize access time and speed within the datacenter, the data and processing must be well distributed across servers (especially to avoid distributing data often accessed together over several machines). However, operating systems and the traditional middleware layers are not designed for functioning this way. The solution is for processing to take place at the application level. This is precisely where sharding[10] strategies come into play. Front-end services serving Web pages easily accommodate such constraints, given that versioning is not an issue and it is easy to distribute HTTP requests over several machines. It will however be up to the other applications to explicitly manage network exchanges or to rely on new, specific middleware layers. Storage solutions for this type of hardware are also deployed among Web Giants by using sharding techniques. Implementing failure resistance The second significant difference between large systems and Warehouse Scale Computers lies in failure tolerance. For decades, large systems have provided advanced hardware mechanisms to reduce failures as much as possible (RAID, hot-swapping of equipment, replication at the SAN level, error correction and failover at the memory and I/O level, etc.). A Warehouse Scale Computer has the opposite features for two reasons: Commodity hardware components are less reliable; the global availability of a system simultaneously deploying to several machines is the product of the availability of each server.[11] [10] cf. “Sharding“, p. 179. [11] Thus if each machine has an annual downtime of 9 hours, the availability of 100 servers will be at best 0.999^100 ≈ 0.90, i.e. 36 days of unavailability per year!
• 158. 158 THE WEB GIANTS ARCHITECTURE / COMMODITY HARDWARE Because of this, Web Giants consider that the system must be able to function continuously even when some components have failed. Once again, the application layer is responsible for ensuring this tolerance for failure (cf. “Design for Failure“, p. 221). On what criteria are the machines chosen? That being said, the machines chosen by the Giants do not always resemble what we think of as PCs or even the x86 servers of majors such as HP or IBM. Google is certainly the most striking example as it builds its own machines. Other majors such as Amazon work with more specialized suppliers such as SGI.[12] The top priority in choosing their servers is, of course, the bottom line. Whittling components down to meet their precise needs and the quantity of servers purchased give Web Giants a strong negotiating position. Although verified data is lacking, it is estimated that the cost of a server for them can go as low as $500. The second priority is electric power consumption. Given the sheer number of servers deployed, power consumption has become a major expense item. Google recently stated that their average consumption was about 260 million watts, amounting to a bill of approximately $30,000 per hour. The choice of components, as well as the capacity to configure the consumption of each component very precisely, can also yield huge savings. In sum, even though they contain the same parts you would find in your desktop, the server configurations are a far cry from it. With the exception of a few initiatives such as Open Compute from Facebook, the finer details are a secret that the Giants keep fiercely. The most one can discover is that Google replaced its centralized UPS units with 12V batteries directly connected to the servers.[13] [12] SGI is the result of a merger between Silicon Graphics, Cray and, above all, Rackable, which had expertise in the field of x86 servers. [13] http://www.youtube.com/watch?v=Ho1GEyftpmQ
• 159. 159 THE WEB GIANTS Exception! There are almost no examples of Web Giants communicating about using any other technology besides x86. If we went back in time, we would probably find a “Powered by Sun“ logo at Salesforce.[14] How can I make it work for me? Downsizing, i.e. replacing central servers with smaller machines, peaked in the 1990s. We are not giving a sales pitch for commodity hardware, even if one does get the feeling that x86 has taken over the business. The wholesale choice of commodity hardware goes further than that, as it transfers the responsibility for scalability and failure resistance to the applications. For Warehouse Scale Computing, as for the Web Giants, when electricity and investment costs become crucial, it is the only viable solution. For existing software which can run on the resources of a single multiprocessor server, the cost of (re-)developing it as a distributed system and the cost of the hardware must be weighed against each other within the Information System. The decision to use commodity hardware in your company must be made within the framework of your overall architecture: as much as possible, run what you already have on better-quality machines, or adapt it to migrate (completely) to commodity hardware. In practice, applications designed for distribution, such as front-end Web services, will migrate easily. In contrast, highly integrated applications such as software packages necessarily entail specific infrastructure with disk redundancy, which is hardly compatible with a commodity-hardware datacenter such as those used by Web Giants. [14] http://techcrunch.com/2008/07/14/salesforce-ditch-remainder-of-sun-hardware
  • 160. 160 THE WEB GIANTS ARCHITECTURE / COMMODITY HARDWARE Associated patterns Distributed computing is essential to using commodity hardware. Patterns such as sharding (cf. “Sharding“, p. 179) need to be implemented in the code to be able to migrate to commodity hardware for data storage. Using a large number of machines also complicates server administration, and patterns such as DevOps need to be adopted (cf. “DevOps“, p. 71). Lastly, the propensity shown by Web Giants to design computers, or rather datacenters, adapted to their needs is obviously linked to their preference for build vs. buy (cf. “Build vs. Buy“, p. 19).
• 162. 162 THE WEB GIANTS Description For any information system, data are an important asset which must be captured, stored and processed reliably and efficiently. While central servers often play the role of data custodian, most Web Giants have adopted a different strategy: sharding, or data distribution.[1] Sharding describes a set of techniques for distributing data over several machines to ensure architecture scalability. Business needs Before detailing implementation, let us say a few words about the needs driving the process. Among Web Giants there are several shared concerns which most are familiar with: storing and analyzing massive quantities of data,[2] strong performance stakes to keep response times minimal, scalability[3] and even elasticity needs linked to consultation peaks.[4] Let us stress one specificity of the players facing the issues mentioned above. For Web Giants, revenues are often independent of the quantity of data processed and stem instead from advertising and user subscriptions.[5] They therefore need to keep unit costs per transaction very low. In traditional IT departments, transactions can easily be linked to physical flows (sales, inventory). Such flows make it easy to bill services depending on the number of transactions (conceptually speaking through a sort of tax). However, with e-commerce sites for example, browsing the catalog or adding items to a cart does not necessarily entail revenues because the user can quit the site just before confirming payment. [1] According to Wikipedia, a database shard is a horizontal partition of data in a database or search server. (http://en.wikipedia.org/wiki/Shard_(database_architecture) [2] Heightened by Information Systems being opened to the Internet (user behavior analysis, links to social media...). [3] Scalability is of course tied to a system’s capacity to absorb a bigger load, but more important still is the cost. In other words, a system is scalable if it can handle the additional query without taking more time and if the additional query costs the same amount as the preceding ones (i.e. underlying infrastructure costs must not skyrocket). [4] Beyond scalability, elasticity is linked to the capacity to have only variable costs unrelated to the load. Which is to say that a system is elastic if, whatever the traffic (10 queries per second or 1000 queries per second), the query price per unit remains the same. [5] For example, no size limit to e-mail accounts.
  • 163. 163 THE WEB GIANTS ARCHITECTURE / SHARDING In sum, the Information Systems of Web Giants must ensure scalability at extremely low marginal costs to uphold their business model. Sharding to cut costs As yet, most databases are organized centrally: a single server, possibly with redundancy in active/passive mode for availability. The usual solution for increasing the transaction load is vertical scalability or scale-up, i.e. buying a more powerful machine (more I/O, more CPUs, more RAM...). There are limits however to this approach: a single machine, no matter how powerful, cannot alone index the entire Web for example. Moreover there is the all-important question of costs leading to the search for other approaches. Remember from the last chapter: A study[6] carried out by engineers at Google shows that as soon as the load exceeds the capacities of a large system, the unit cost for large systems is much higher than with mass-produced machines.[7] Although calculating per transaction costs is no easy matter and is open to controversy - architecture complexification, network load to be figured into the costs - the majority of Web Giants have opted for commodity hardware. Sharding is one of the key elements in implementing horizontal scale- up. [6] The study http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905 CAC006 is also summarized in the OCTO blog article: http://blog.octo.com/ datacenter-as-a- computer-une-plongee-dans-les-datacenters-des-acteurs-du-cloud. [7] This is another way of saying “commodity hardware“: the machines are not necessarily low- end, but the performance/cost ratio is the highest possible for a given system
• 164. 164 THE WEB GIANTS How to shard In fact there are two ways of partitioning - or sharding - data: vertically or horizontally. Vertical sharding is the most widely used and consists of splitting the database by concept, table by table. For example, deciding to store client lists in one database and their contracts in another. Horizontal sharding is where the database tables themselves are divided and distributed across multiple servers. For example, storing client lists from A to M on one machine and from N to Z on another. Horizontal sharding is based on a distribution key - the first letter of the name in the example above.[8] Web Giants have mostly implemented horizontal sharding. It notably has the advantage of not being limited by the number of concepts, as is the case with vertical sharding. Figure 1 [Diagram comparing a centralized database, vertical partitioning (clients and contracts stored in separate databases) and horizontal partitioning (clients and their contracts split across servers).] [8] In fact, partitioning is a function of the probability of a name beginning with a given letter.
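As an illustration of horizontal sharding, here is a minimal sketch of the routing logic for the example above, with the first letter of the client’s name as the distribution key. The “servers“ are plain dictionaries standing in for database connections; everything here is illustrative.

```python
# Minimal sketch of horizontal sharding as described above: clients A-M on one
# server, N-Z on another. The "servers" are plain dictionaries for illustration.
SHARDS = {
    "shard_a_m": {},   # would be a connection to the first database server
    "shard_n_z": {},   # would be a connection to the second one
}

def shard_for(client_name: str) -> dict:
    # The distribution key is the first letter of the client's name.
    return SHARDS["shard_a_m"] if client_name[0].upper() <= "M" else SHARDS["shard_n_z"]

def save_client(client_name: str, record: dict) -> None:
    shard_for(client_name)[client_name] = record

def load_client(client_name: str) -> dict:
    return shard_for(client_name)[client_name]

save_client("Durand", {"city": "Lyon"})
save_client("Petit", {"city": "Nantes"})
assert "Durand" in SHARDS["shard_a_m"] and "Petit" in SHARDS["shard_n_z"]
```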
• 165. 165 THE WEB GIANTS ARCHITECTURE / SHARDING Techniques linked to sharding Based on their choice of horizontal scaling, Web Giants have developed specific solutions (grouped under the acronym NoSQL - Not Only SQL) to meet these challenges, with the following characteristics: implementation using mass-produced machines, data sharding managed at the software level. While sharding makes it possible to overcome the issues mentioned above, it also entails implementing new techniques. Managing availability is much more complex. In a centralized system, or one used as such, the system is either available or not, and only the rate of unavailability will be measured. In a sharded system, some data servers can be available and others not. If the failure of a single server makes the entire system unavailable, the overall availability is equal to the product of the availability of each of the data servers, and it thus drops sharply: if 100 machines are each down 1 day per year, the system would show nearly 3 months of unavailability.[9] Since a distributed system can remain available despite the failure of one of the data servers, albeit in downgraded mode, availability must be measured through two figures: yield, i.e. the availability rate defined above; and harvest, i.e. the completeness of the response, which measures, so to speak, the share of the data that remains reachable.[10] Distribution of the load is usually tailored to data use. A product reference (massively accessed in read mode) won’t raise the same performance issues as a virtual shopping cart (massively accessed in write mode). The replication rate, for example, will be different. [9] (364/365)^100 ≈ 76%, i.e. about 277 days of availability out of 365, which amounts to some 88 days of unavailability. [10] Thus if, when a server fails, the others ignore the modifications made to that server and then resolve the various modifications once the server reconnects to the cluster, the harvest is smaller. The response is incomplete because it has not integrated the latest changes, but the yield is maintained. The NoSQL solutions developed by the Giants integrate various mechanisms to manage this: data replication over several servers, vector clock algorithms to resolve competing updates when the server reconnects to the cluster. Further details may be found in the following article: http://radlab.cs.berkeley.edu/people/fox/static/pubs/pdf/c18.pdf
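The availability arithmetic from the footnotes can be written out as follows - a small illustrative computation using the same figures as footnote [9]:

```python
# The availability arithmetic from the footnotes, written out. If each of the
# 100 servers is down one day per year and the system needs all of them,
# overall availability is the product of the individual availabilities.
per_server_availability = 364 / 365
servers = 100

all_or_nothing = per_server_availability ** servers      # ≈ 0.76
downtime_days = (1 - all_or_nothing) * 365               # ≈ 88 days, close to 3 months
print(f"availability: {all_or_nothing:.2%}, i.e. ~{downtime_days:.0f} days down per year")

# A sharded system that keeps answering with the shards still up keeps a high
# yield (fraction of queries answered) but a reduced harvest (completeness of
# each answer): with one server out of 100 down, roughly 99% of the data
# remains reachable.
harvest_with_one_server_down = (servers - 1) / servers
print(f"harvest with one server down: {harvest_with_one_server_down:.0%}")
```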
• 166. 166 THE WEB GIANTS Lastly, managing the addition of new servers and the data partitioning problems this poses (rebalancing the cluster) are novel issues specific to sharding. FourSquare, for example, was down for 11 hours in October 2010[11] following the overload of one of its servers and then trouble when the back-up server was connected, which in the end caused the entire site to crash. Data distribution algorithms such as consistent hashing[12] overcome these problems by limiting data redistribution costs when servers are removed or added. Sharding also means adapting your application architecture: Queries have to be adapted to take distribution into account so as to avoid any inter-shard queries, because the cost of accessing several remote servers is prohibitive. Thus the APIs of such systems limit query possibilities to data in the same shard. Whether one is using relational databases or NoSQL-type stores, models are upended: modeling in such systems is largely limited to key/value, key/document or column-family structures, in which the key or row index serves as the basis for partitioning. Atomicity (the A in ACID) is often restricted so as to avoid atomic updates affecting several shards, and therefore transactions distributed over several machines at high performance cost. Who makes it work for them? The implementation of these techniques varies across companies. Some have simply adapted their databases to facilitate sharding. Others have hand-written ad hoc NoSQL solutions. Following the path from SQL to NoSQL, here are a few representative implementations: [11] For more details on the FourSquare incident: http://blog.foursquare.com/2010/10/05/so-that-was-a-bummer/ and the analysis of another blog http://highscalability.com/blog/2010/10/15/troubles-with-sharding-what-can-we-learn-from-the-foursquare.html [12] Further details in the following article: http://blog.octo.com/consistent-hashing-ou-l%E2%80%99art-de-distribuer-les-donnees/
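For illustration, here is a minimal consistent-hashing sketch in the spirit of the algorithm referenced above: servers and keys are placed on the same hash ring and a key belongs to the first server found clockwise from it, so adding or removing a server only moves the keys of the neighbouring arc. Real implementations typically add “virtual nodes“ to smooth the distribution; this sketch omits them for brevity.

```python
# Minimal consistent-hashing sketch: servers and keys are placed on the same
# hash ring, and a key belongs to the first server found clockwise from it.
import bisect
import hashlib

class HashRing:
    def __init__(self, servers):
        self.ring = sorted((self._hash(s), s) for s in servers)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def server_for(self, key: str) -> str:
        h = self._hash(key)
        points = [p for p, _ in self.ring]
        i = bisect.bisect(points, h) % len(self.ring)   # wrap around the ring
        return self.ring[i][1]

ring = HashRing(["node-1", "node-2", "node-3"])
print(ring.server_for("photo:42"), ring.server_for("photo:43"))
```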
• 167. 167 THE WEB GIANTS ARCHITECTURE / SHARDING Wikipedia This famous collaborative encyclopedia rests on many distributed MySQL instances and a MemCached memory cache. It is thus an example of sharding implemented with run-of-the-mill components. Figure 2 [Diagram: consultation requests go through MemCached and read slaves, while edits go to the master for writes; metadata and article text are stored on separate MySQL instances, replicated per wiki (Wiki A, Wiki B).] The architecture uses master-slave replication to divide the load between read and write functions on the one hand, and partitions the data by wiki and use case on the other. The article text is also offloaded to dedicated instances. They thus use MySQL instances holding between 200 and 300 GB of data.
• 168. 168 THE WEB GIANTS Flickr The architecture of this photo sharing site is also based on several master and slave MySQL instances (the shards), but here organized as a replication ring, making it easier to add data servers. Figure 3 [Diagram: masters holding id ranges (e.g. ids 1 to N/4) arranged in a MySQL replication ring, each with its slaves and a MemCached metadata cache; reads and writes move to the next master when one fails.] An identifier serves as the partitioning key (usually the photo owner’s ID) and distributes the data over the various servers. When a server fails, entries are redirected to the next server in the loop. Each instance on the loop is also replicated on two slave servers so as to keep functioning in read-only mode if their master server is down.
• 169. 169 THE WEB GIANTS ARCHITECTURE / SHARDING Facebook The Facebook architecture is interesting in that it shows the transition from a relational database to an entirely distributed model. Facebook started out using MySQL, a highly efficient open source solution. They then implemented a number of extensions to partition the data. Figure 4 [Diagram: each datacenter holds MemCached caches fed by sharded MySQL instances storing key-value pairs; MySQL replication propagates the shards asynchronously from datacenter #1 to datacenter #2.] Today, the Facebook architecture has banished all central data storage. Centralized access is managed by the cache (MemCached) or a dedicated service. In their architecture, MySQL serves to feed data to MemCached in the form of key-value pairs and is no longer queried in SQL. The MySQL replication system, suitably extended, is also used to replicate the shards across several datacenters. That being said, its use has very little to do with relational databases: data are accessed only through their key, and at this level there are no joins. Lastly, the structure of the data is taken into account so as to co-locate data used simultaneously.
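The read path described above - cache first, authoritative key-value store on a miss, then repopulate the cache - is a classic look-aside caching scheme. Here is a minimal sketch, with plain dictionaries standing in for MemCached and for MySQL used as a key-value store; it illustrates the pattern, not Facebook’s actual code.

```python
# Sketch of look-aside caching: reads hit the cache first, fall back to the
# key-value store, then repopulate the cache. Both stores are plain
# dictionaries here, as stand-ins for MemCached and for MySQL accessed by key.
CACHE = {}       # stand-in for MemCached
KV_STORE = {}    # stand-in for MySQL used purely as a key-value store

def write(key, value):
    KV_STORE[key] = value
    CACHE.pop(key, None)          # invalidate rather than update, to avoid races

def read(key):
    if key in CACHE:
        return CACHE[key]         # cache hit: no database round trip
    value = KV_STORE.get(key)     # cache miss: read the authoritative store
    if value is not None:
        CACHE[key] = value        # repopulate for the next readers
    return value

write("user:42:name", "Alice")
assert read("user:42:name") == "Alice"   # miss, then cached
assert "user:42:name" in CACHE
```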
• 170. 170 THE WEB GIANTS Amazon The Amazon architecture stands out for its more advanced handling, with Dynamo, of the loss of one or more datacenters. Amazon started out in the 1990s with a single Web server and an Oracle database. They then set up a set of business services in 2001, each with dedicated storage. Alongside databases, two systems use sharding: S3 and Dynamo. S3 is an online blob store in which each object is identified by a URL. Dynamo (first used in-house, but recently made available to the public through Amazon Web Services) is a distributed key-value storage system designed to ensure high availability and very fast responses. In order to enhance availability, several versions of the same data item can coexist in Dynamo, following the principle of eventual consistency.[13] Figure 5 [Diagram: a write creates Foo (Bar = «1», Version = 1) while a concurrent update produces Foo (Bar = «2», Version = 2); the versions are propagated asynchronously between the replicas used for consultation and edition.] [13] There are quorum mechanisms (http://en.wikipedia.org/wiki/Quorum_(distributed_computing) to arbitrate between availability and consistency.
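The coexistence of several versions of the same item, as in Figure 5, is typically detected with vector clocks, which the next paragraph mentions. Here is a minimal sketch of the idea; it is illustrative only and much simpler than Dynamo’s actual implementation.

```python
# Minimal vector-clock sketch: each replica increments its own counter when it
# writes. Comparing two clocks tells us whether one version descends from the
# other or whether they are concurrent and must be reconciled (as with the two
# versions of "Foo" in Figure 5).
def increment(clock: dict, replica: str) -> dict:
    clock = dict(clock)
    clock[replica] = clock.get(replica, 0) + 1
    return clock

def descends(a: dict, b: dict) -> bool:
    """True if version a already includes everything seen by version b."""
    return all(a.get(r, 0) >= n for r, n in b.items())

v1 = increment({}, "replica-A")          # Foo written on replica A
v2 = increment(v1, "replica-A")          # updated again on A: v2 descends from v1
concurrent = increment(v1, "replica-B")  # meanwhile updated on B, starting from v1

assert descends(v2, v1)
assert not descends(v2, concurrent) and not descends(concurrent, v2)  # conflict
```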
• 171. 171 THE WEB GIANTS ARCHITECTURE / SHARDING In read mode, an algorithm such as the vector clock[14] or, as a last resort, the client application, will have to resolve any conflicts. There is thus a balance to be found in how much is replicated, to choose the best compromise between resistance to datacenter failure on the one hand and system performance on the other. LinkedIn LinkedIn’s background is similar to Amazon’s: they started in 2003 with a single database approach, then partitioned for specific businesses and implemented a distributed system similar to Dynamo: Voldemort. Contrary to Dynamo, however, it is open source. One should also note that indexes and social graphs have always been stored separately by LinkedIn. Google Google was the first to publish information on its distributed storage system. Rather than having its roots in databases, it emulates a file system. In the paper[15] on the Google File System (GFS), the authors mention that their choice of commodity hardware was a determining factor, given the weaknesses noted in a previous chapter (cf. “Commodity Hardware“, p. 167). This distributed file system is used, directly and indirectly, to store Google’s data (search index, emails). Figure 6 [Diagram: a client asks the master for metadata, then reads and writes numbered chunks (1 to 6) directly on the chunk servers, where each chunk is replicated across several servers.] Its architecture is based on a centralized metadata server (to guide client applications) and a very large number of data storage servers. The degree of data consistency is lower than that guaranteed by a traditional [14] The Vector Clock algorithm provides the order in which a given distributed dataset was modified. [15] http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/fr//papers/gfs-sosp2003.pdf
• 172. 172 THE WEB GIANTS file system, but this topic alone deserves an entire article. In production, Google uses clusters of several hundred machines, enabling them to store petabytes of data to index. Exception! It is however undeniable that a great many sites are grounded in relational database technologies without sharding (or without mentioning sharding): StackOverflow, Salesforce, Voyages-SNCF, vente-privee.com… It is difficult to draw up an exhaustive list, one way or another. We nonetheless believe that sharding has become the standard strategy on data-intensive web sites. Indeed, the architecture of Salesforce is based on an Oracle database, but it uses that database very differently from the practices in our usual ITs: tables with multiple untyped columns with generic names (col1, col2), a query engine upstream from Oracle to take these specificities into account, etc. Such optimizations show the limits of a purely relational architecture. In our view, the most striking exception is StackOverflow, whose architecture is based on a single relational SQL Server instance. This site chose to implement an architecture based purely on vertical scalability, with its initial architecture, inspired by Wikipedia, then evolving to conform to this strategy. Moreover, one must also note that the scalability needs of StackOverflow are not necessarily comparable to those of other sites, because its target community (IT engineers) is narrow and the model favors the quality of contributions over their quantity. Furthermore, choosing a platform under Microsoft license gives them an efficient tool, but one whose costs would certainly become prohibitive in the case of horizontal scaling. How can I make it work for me? Data distribution is one of the keys that enabled Web Giants to reach their current size and to provide services that no other architecture is capable of supporting. But make no mistake, it is no easy task: issues which are easy to resolve in a relational world (joins, data integrity) demand mastering new tools and methods. Areas which are data intensive but with limited consistency stakes - as is e.g. the case with data which can be partitioned - are those where distributed data will be most beneficial.
• 173. 173 THE WEB GIANTS ARCHITECTURE / SHARDING Offers compatible with Hadoop use these principles and are relevant to BI, more particularly in analyzing non-structured data. Concerning transactions, consistency issues are more important. Constraints around access APIs are also a limiting factor, but new offers such as SQLFire by VMWare or NuoDB attempt to combine sharding and an SQL interface. They are thus something to keep an eye on. In short, you need to ask yourself which data belong to the same use case (what partitions are possible?) and, for each, what the consequences of loss of data integrity would be. Depending on the answers, you can identify the main architecture features that would enable you, above and beyond sharding, to choose the tool that best meets your needs. More than a magic fix, data partitioning must be considered as a strategy to reach scalability levels which would be impossible without it. Associated patterns Whether to use open source or in-house products is closely tied to your use of data partitioning, as it entails a great deal of fine-tuning. The ACID transactional model is also affected by data sharding. The pattern Eventually Consistent offers another vision and solution to meet user needs despite the impacts due to sharding. Again, mastering this pattern is very useful for implementing distributed data. Lastly, and more importantly, sharding cannot be dissociated from the commodity hardware choice made by Web Giants. Sources • Olivier Mallassi, Datacenter as a Computer: une plongée dans les datacenters des acteurs du cloud, 6 June 2011 (French only): http://blog.octo.com/datacenter-as-a-computer-une-plongee-dans-les-datacenters-des-acteurs-du-cloud/ • The size of the World Wide Web (The Internet), Daily estimated size of the World Wide Web: http://www.worldwidewebsize.com/
• 174. THE WEB GIANTS 174 • Wikipedia: http://en.wikipedia.org/wiki/Shard_(database_architecture) http://en.wikipedia.org/wiki/Partition_%28database%29 http://www.codefutures.com/weblog/database-sharding/2008/06/wikipedias-scalability-architecture.html • eBay: http://www.codefutures.com/weblog/database-sharding/2008/05/database-sharding-at-ebay.html • Friendster and Flickr: http://www.codefutures.com/weblog/database-sharding/2007/09/database-sharding-at-friendster-and.html • HighScalability: http://highscalability.com/ • Amazon: http://www.allthingsdistributed.com/
  • 175. 175 THE WEB GIANTS TP vs. BI: the new NoSQL approach
  • 176. 176 THE WEB GIANTS ARCHITECTURE / TP VS. BI: THE NEW NOSQL APPROACH Description In traditional ISs, structured data processing architectures are generally split across two domains. Both of course are grounded in relational databases, but each with their own models and constraints. On the one hand, Transactional Processing (TP), based on ACID transactions, and on the other Business Intelligence (BI), grounded in fact tables and dimensions. Web Giants have both developed new tools and come up with new ways of organizing processing to meet these two needs. Distributed storage and processing is widely used in both cases. Business needs One recurrent specificity of Web Giants is their need to process data which are only partially structured, or not at all, different from the usual data tables used in management information systems: Web pages for Google, social graphs for Facebook and LinkedIn. A relational model based on two-dimensional tables where one of the dimensions is stable (the number and type of columns) is ill-adapted to this type of need. Moreover, as we saw in the chapter on sharding (cf. “Sharding“, p. 179), constraints on data volumes and transaction amounts often push Web Giants to partition their data. This overturns the traditional vision of TP where the data are always consistent. BI solutions, lastly, are usually driven by internal IT decisions. For Web Giants, BI is often the foundation for new services which can be used directly by clients: LinkedIn’s People You May Know, new music releases suggested by sites such as Last.fm,[1] Amazon recommendations, are all services which entail [1] Hadoop, The Definitive Guide O’Reilly, June, 2009.
  • 177. 177 THE WEB GIANTS manipulating vast quantities of data to provide recommendations to users as quickly as possible. Who makes it work for them? The new approach of Web Giants on the level of TP (Transaction Processing) and BI (Business Intelligence) lies in generic storage and deferred processing whenever possible. The main goal in the underlying storage is only to absorb huge volumes of queries both redundantly and reliably. We call it ‘generic’ because it is poorer in terms of indexing, data organization and consistency than traditional databases. Processing and analyzing data for queries and consistency management are deported to the software level. The following strategies are implemented. TP: the ACID constraints limited to what is strictly necessary The sharding pattern highly complicates the traditional vision of a single consistent database used for TP. Major players such as Facebook and Amazon have thus adapted their view of transactional data. As specified by the CAP theorem,[2] within a given system one cannot at the same time achieve consistency, availability and partition tolerance. First of all, data consistency is no longer permanent but only provided when the user reads the data. This is known as eventual consistency: it is when the information is read that its integrity is checked, and any differing versions in the data servers are resolved. Amazon fostered this approach when they designed their distributed storage system Dynamo.[3] On a set of N machines, the data are replicated on W of them, in addition to version stamping. For queries, N-W+1 machines are searched, thereby ensurin that the user has the latest version.[4] The e-commerce giant chose to reduce data consistency in favor of gains in the availability of its distributed system. [2] http://en.wikipedia.org/wiki/CAP_theorem [3] http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html [4] In this way one is always certain of reading the data on at least one of the W machines where the freshest data have been written. For further information, see http://www. allthingsdistributed.com/2007/10/amazons_dynamo.html
• 178. 178 THE WEB GIANTS ARCHITECTURE / TP VS. BI: THE NEW NOSQL APPROACH Furthermore, to meet their performance goals, data freshness requirements are no longer uniform, but categorized. Facebook and LinkedIn guarantee real-time freshness for data updated by the user: modifications must be immediately visible to ensure user trust in the system. In contrast, global consistency is reduced: when users sign up for a Facebook group, for example, they immediately see the information appear, but other group members may experience some delay in being notified.[5] At LinkedIn, services are also categorized. For non-critical services such as retweets, the information is propagated asynchronously.[6] In contrast, any modifications users make to their own data are immediately propagated so as to be instantly visible to them. Asynchronous processing is what makes it possible for Web Giants to best manage the heavy traffic loads they face. In sum, to guarantee performance and availability, Web Giants tailor their storage systems so that data consistency depends on usage. The goal is not to be consistent at all times, but rather to provide eventual consistency. BI: the indexation mechanism behind all searches To provide information on vast quantities of data, Web Giants also tend to pre-calculate indexes, which is to say data structures specifically designed to answer user questions. To better understand this point, let us look at the indexes that Google has designed for its search engine. Google is foremost in the arena due to the volume of its indexing: the entire Web. [5] http://www.infoq.com/presentations/Facebook-Software-Stack [6] Interview with Yassine Hinnach, Architect at LinkedIn.
  • 179. 179 THE WEB GIANTS At the implementation level, Google uses sharding to store raw data (the BigTable column database, built on top of the distributed Google File System).[7] Indexes based on keywords are then produced asynchronously, and are used to answer user queries. The raw data are analyzed with a distributed algorithm, based on the programming model MapReduce. The process can be divided into two main phases: map, which, in parallel, identically processes each piece of data; and reduce, which aggregates the various results in a single final result. The map phase is easily distributed, each machine processing its own share of the data, as can be seen in Figure 1. Figure 1: [diagram: the map phase processes the data items (0 to 7) in parallel on several machines; the reduce phase then aggregates the partial results into a single final result.] [7] cf. “Sharding“, p. 179.
  • 180. 180 THE WEB GIANTS ARCHITECTURE / TP VS. BI: THE NEW NOSQL APPROACH This technique is highly scalable[8] and makes it possible for example for a web crawler to consume all web pages visited, to establish for each the list of outgoing links, then to aggregate them during the reduce phase to obtain a list of the most referenced pages. Google has implemented a sequence of MapReduce tasks to generate the indexes for its search engine.[9] This allows them to process huge quantities of data in batch mode. The technique has been widely copied, notably through the Apache Foundation open source project Hadoop.[10] Hadoop provides both a distributed file system and a framework implementing the MapReduce programming model, directly inspired by Google’s research paper. It was then adopted by Yahoo! for indexing, by LinkedIn to prepare its email campaigns, and by Facebook to analyze the various logs generated by their servers... Many firms, including several other Web Giants (eBay, Twitter) use it.[11] In 2010, Google set up a new indexation process based on event mechanisms.[12] Updates do not happen in real time, contrary to database triggers, but latency (the time between page publication and the moment it can be searched) is greatly reduced as compared to a batch system based on the MapReduce programming model. Exception! All of these examples share a commonality: they target a pretty specific set of needs. Many key Web players also use relational databases for other applications. The “one size fits all“ approach of these databases means they are easier to use but also more limited, notably in terms of scalability. The processes and distributed storage systems described above are only implemented for the services most frequently used by these key players. [8] Scalable, i.e. capable of processing more data if the system is enlarged. [9] http://research.google.com/archive/mapreduce.html [10] http://hadoop.apache.org [11] http://wiki.apache.org/hadoop/PoweredBy [12] Google Percolator: http://research.google.com/pubs/pub36726.html
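As an illustration of the two phases described above, here is a small, self-contained Python sketch of the crawler example: the map step emits the outgoing links of each crawled page, and the reduce step aggregates the counts to find the most referenced pages. It is a toy, single-machine version of what Hadoop distributes across a cluster; the sample data are invented.

```python
# Toy, single-machine illustration of the MapReduce idea (Hadoop distributes
# the same two phases across many machines). Sample data are invented.
from collections import defaultdict

# Crawled pages and the outgoing links found in each of them.
crawled_pages = {
    "page_a": ["page_b", "page_c"],
    "page_b": ["page_c"],
    "page_c": ["page_a"],
    "page_d": ["page_c", "page_b"],
}

def map_phase(page, outgoing_links):
    """Map: for each crawled page, emit (target_page, 1) per outgoing link."""
    return [(target, 1) for target in outgoing_links]

def reduce_phase(target, counts):
    """Reduce: aggregate all the counts emitted for the same target page."""
    return target, sum(counts)

# Shuffle: group the intermediate pairs by key, as the framework would do.
intermediate = defaultdict(list)
for page, links in crawled_pages.items():
    for target, count in map_phase(page, links):
        intermediate[target].append(count)

results = dict(reduce_phase(t, c) for t, c in intermediate.items())
print(sorted(results.items(), key=lambda kv: kv[1], reverse=True))
# [('page_c', 3), ('page_b', 2), ('page_a', 1)] -> page_c is the most referenced
```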
  • 181. 181 THE WEB GIANTS How can I make it work for me? It is certainly in indexation solutions and BI on Big Data that the market is most mature. With Hadoop, a reliable open source implementation, a large number of support solutions, related tools, re-implementations and commercial repackaging have been developed, based on the same APIs. Projects based on the indexation of large quantities of data, or of semi- or non-structured data, are the primary candidates for adoption of this type of method. The main advantage is that data can be preserved thanks to much lower storage costs. Information is no longer lost through over-hasty aggregations. In this way the data analysis algorithms producing indexes or reports can also be more easily adjusted over time since they are constantly processing all available data rather than pre-filtered subsets. A switch from relational databases in TP will probably take more time. Various distributed solutions inspired by Web Giants’ technologies have come out under the label NoSQL (Cassandra, Redis). Other distributed solutions, more at the crossroads of relational databases and data grids in terms of consistency and APIs, have come out under the name NewSQL (SQLFire, VoltDB). Architectural patterns such as Event Sourcing and CQRS[13] can also help bridge the gap between the two worlds. In fact, their contributions make it possible to model transactional data as a flow of events which are both decoupled and semi-structured. A comprehensive and consistent view of the data is then built downstream, when the data are published. Web Giants’ models cannot be directly transposed to meet the general TP needs of businesses, and there are many other approaches to be found on the market to overcome traditional database limits. Associated patterns This pattern is mainly linked to the sharding pattern (cf. “Sharding“, p. 179), because, through distributed algorithms, it makes it possible to work on this new type of storage. One should also note here the influence of the pattern Build vs. Buy (cf. “Build vs. Buy“, p. 19) which has led Web Giants to adopt highly specialized tools to meet their needs. [13] Command Query Responsibility Segregation.
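To give a feel for the Event Sourcing pattern mentioned above, here is a minimal Python sketch: transactional state is never updated in place; every change is appended to a flow of events, and the current view is rebuilt by replaying that flow. It is a simplified illustration of the principle, not a production implementation.

```python
# Minimal Event Sourcing sketch: state is derived by replaying an append-only
# flow of events, never updated in place (simplified illustration only).

events = []  # the append-only event log

def record(event_type, **payload):
    """Append an event; previously recorded events are never modified."""
    events.append({"type": event_type, **payload})

def current_balance(account):
    """Rebuild the account balance by replaying the whole event flow."""
    balance = 0
    for e in events:
        if e.get("account") != account:
            continue
        if e["type"] == "deposited":
            balance += e["amount"]
        elif e["type"] == "withdrawn":
            balance -= e["amount"]
    return balance

record("deposited", account="A-42", amount=100)
record("withdrawn", account="A-42", amount=30)
record("deposited", account="A-42", amount=15)
print(current_balance("A-42"))  # 85, computed from the event flow
```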
  • 182. 182 THE WEB GIANTS Big Data Architecture
  • 183. 183 THE WEB GIANTS To better meet their users' needs, the Web Giants do everything they can to reduce their Time to Market. Data in all forms are key to this strategy. They not only serve for technical analyses, but are also business drivers. They are what make it possible to personalise the user experience, more and more often in real time, and above all inform decision making. The Web giants have long understood the importance of data and use them unabashedly. At Google for example, all ideas must come with metrics, all arguments must be based on data, or you will not be heard in the meeting.[1] Everyone speaks of Big Data, but the Web Giants were the first players involved, or at the very least closely associated with it. Behind the buzzword are new challenges, including an especially complicated one: how do you store and process the exponential volume of data generated? There are more connected objects than humans on the planet, and Cisco forecasts that by 2020 there will be over 50 billion sensors.[2] How do you use all that information? Time to Action As shown in the preceding chapter, NoSQL architecture can process and query ever larger amounts of data. Big Data is usually described by 3 main characteristics, often called the 3Vs:[3] Volume, the capacity to process terabytes, petabytes, and even exabytes of extracted data; Variety, the capacity to process all data formats, whether structured or not; Velocity, the capacity to process events in real time, or at least as quickly as possible. With architectures of the NoSQL/NewSQL type, as described previously, only the components Variety and Volume were highlighted. Let us now look at how the Web Giants also embrace the third component: Velocity. [1] http://googlesystem.blogspot.com.au/2005/12/google-ten-golden-rules.html [2] https://www.cisco.com/web/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf [3] https://en.wikipedia.org/wiki/Big_data
  • 184. 184 THE WEB GIANTS ARCHITECTURE / BIG DATA ARCHITECTURE Making data available We will talk here about double-headed architectures capable of storing and querying data in all forms, processed in batches or in real time. But before broaching this complex subject, let us first take a look at the characteristics and Big Data architecture patterns the Web Giants implement. A data lake for data In an information system, the data are distributed over dozens, or even hundreds, of components. The data are spread out in various sources, some on site but others with third-party vendors or locked away in proprietary software. Having the data on hand is not enough; they must also be instantly accessible. If you don't have the data, it is unlikely you will think of playing around with them. Isolated data is underexploited data: the Allen Curve[5] also applies to data! That is why the Web giants centralise their data in a scalable system where they can be easily queried without any presumptions about how they will be used. Perhaps most of them will not even be used, but that does not matter: the important thing is to have them nearby just in case a new idea emerges. This type of system, usually based on the Hadoop framework, is commonly called a “data lake“.[6A] It is a storage and distributed processing platform capable of handling ever increasing amounts of data, whatever their nature. On paper, it can be scaled to infinity,[7] both in terms of storage and processing, and can manage numerous concurrent jobs and tasks, scaling linearly thanks to the size of the infrastructure. An aside Some also speak of 4Vs or even 5Vs,[4] adding components to the 3Vs mentioned above such as: Veracity, the capacity to manage inconsistencies and ambiguities; Value, the capacity to apply differential processing to data depending on the value attributed to them. The latter is without doubt the most debatable, since the main benefit of this type of architecture is that there are no presuppositions as to how the data will be analysed, and therefore no pre-established values. [4] https://www.linkedin.com/pulse/20140306073407-64875646-big-data-the-5-vs-everyone-must-know [5] https://en.wikipedia.org/wiki/Allen_curve [6A] https://en.wikipedia.org/wiki/Data_lake [7] even if nothing is infinitely scalable https://www.youtube.com/watch?v=modXC5IWTJI
  • 185. 185 THE WEB GIANTS Immutable data A data lake can store all types of data; it is up to the user to decide what to use it for. Of all the data it can hold, raw data are particularly interesting. Available without changes or alterations, they can be modelled depending on user needs. Immutability drastically reduces manipulation errors: the data are entered without any transformation, limiting the risk of losing context or of errors in interpretation; the data are stored only once and are never updated, thus limiting manipulation errors and keeping a full record. Immutable, they can also theoretically[6B] be reused an infinite number of times. The data are not “consumed“ but “used“. In case of errors, bugs or code updates, the processing simply needs to be relaunched to obtain the latest results. When they are timestamped and sufficiently individualised, such immutable data are also known as “events“. Schema on read Another highly interesting characteristic is in interpreting the data. For a “traditional“ BI ingestion, the data are cleaned up, formatted, and normalised before being ingested. The Web Giants consider that each time data is transformed, part of the context is altered. By storing raw data, it is up to users to decide how to transform them. Let us take the example of Twitter. Each tweet contains a multitude of information: text, images, videos, links, hashtags. They are timestamped, geographically located, shared, liked... Each system consuming the data transforms them by focusing on the aspect most relevant to it. An application to map the most recent tweets will probably not have the same angle of approach as one looking for the most shared content. [6B] In practice, Google uses its data over a period of 30 days, for both volumetric and legal reasons.
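A hedged sketch of schema on read, using the Twitter example just given: the same raw records (here simple Python dictionaries standing in for the raw JSON stored in the data lake) are interpreted differently by two consumers, one interested in recent geolocated tweets, the other in the most shared content. The field names are illustrative, not Twitter's actual schema.

```python
# Schema on read: raw records are stored untouched; each consumer applies
# its own interpretation at query time. Field names are illustrative only.

raw_tweets = [  # stand-ins for the raw JSON documents kept in the data lake
    {"text": "hello", "ts": "2015-11-02T10:00", "geo": (48.86, 2.35), "shares": 12},
    {"text": "big data!", "ts": "2015-11-02T10:05", "geo": None, "shares": 250},
    {"text": "lunch", "ts": "2015-11-02T11:30", "geo": (45.76, 4.83), "shares": 3},
]

def map_view(tweets):
    """Consumer 1: only cares about position and time of geolocated tweets."""
    return [(t["geo"], t["ts"]) for t in tweets if t["geo"] is not None]

def viral_view(tweets, top=2):
    """Consumer 2: only cares about the most shared content."""
    return sorted(tweets, key=lambda t: t["shares"], reverse=True)[:top]

print(map_view(raw_tweets))    # a map-oriented reading of the raw data
print(viral_view(raw_tweets))  # a popularity-oriented reading of the same data
```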
  • 186. 186 THE WEB GIANTS ARCHITECTURE / BIG DATA ARCHITECTURE This pattern, Schema on read, has several advantages: It maximally simplifies ingestion, avoiding all data loss and making it much less expensive to add data to a data lake. It gives clients flexibility by allowing personalised extraction and transformation depending on needs. This pattern, joined with the preceding ones, becomes a driver of innovation. It does away with technical barriers to data processing, making it possible to develop new prototypes more and more quickly. The best way to find value in your data is to play around with them! From Big Data to Fast Data The Web Giants strive to give value to their clients as quickly as possible. Sometimes, and more and more often, offline processing is no longer sufficient for user needs. In that case, the best way to get value from your data is to interact with them as soon as they are ingested. [Figure: the data lake. Messages, events, raw files, application logs and external data (Open APIs) are ingested into non-structured, semi-structured (NoSQL) and structured storage; analytical batches, machine learning, flow management and interactive requests publish the results towards the enterprise DWH, databases, transactional systems and reporting.]
  • 187. 187 THE WEB GIANTS The data lake as described above, however, only allows you to process data in batch mode. Between two batch runs, freshly gathered data are not used. Not only do you not get full benefit from them, but worse, some data may be outdated before they're even used. The fresher the data, the greater their potential interest. To process millions or even billions of events per second, two types of technology are used: event distributors and collectors such as Flume and Kafka; and tools to process the events in near real time, such as Spark and Storm. More than being just customers, the Web giants partake in creating and sharing these bricks: Kafka is a high-speed distributed message queue developed by LinkedIn;[8] Storm, originally developed by Twitter, makes it possible to process millions of messages per second.[9] The goal is not to replace the batch processing brick already included in the data lake, but instead to add real time features. This layer is often referred to as the Fast Layer, and the capacity to leverage Big Data for real time processing is known as Fast Data.[10] Real time reduces the Time to Action, so prized by the Web Giants.[11] [Figure: a data lake augmented with a real-time channel. High-volume data (log files, applications) are imported into distributed, resilient file storage for interactive and batch processing, while high-velocity data go through real-time ingestion, stateless and stateful processing, and real-time publication towards APIs and the enterprise DWH.] [8] http://kafka.apache.org/ [9] http://storm.apache.org/ [10] http://www.infoworld.com/article/2608040/big-data/fast-data--the-next-step-after-big-data.html [11] http://www.datasciencecentral.com/profiles/blogs/time-to-insight-versus-time-to-action
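As a simplified illustration of what the fast layer does (tools such as Storm or Spark Streaming provide the distributed, fault-tolerant version), here is a pure-Python sketch that consumes events one by one as they are ingested and keeps an aggregate continuously up to date, instead of waiting for the next batch run. The event format is invented.

```python
# Simplified, single-process illustration of a fast layer: events are processed
# as they arrive and an aggregate stays up to date between batch runs.
# Storm or Spark Streaming provide the distributed, fault-tolerant equivalent.
from collections import Counter

page_view_counts = Counter()  # continuously updated real-time view

def on_event(event):
    """Called for each incoming event, e.g. as delivered by a Kafka topic."""
    page_view_counts[event["page"]] += 1

# Simulated stream of freshly ingested events (invented format).
incoming_events = [
    {"page": "/home"}, {"page": "/product/42"}, {"page": "/home"},
    {"page": "/checkout"}, {"page": "/home"},
]

for event in incoming_events:
    on_event(event)
    # the view is usable immediately, without waiting for the nightly batch
    print(event["page"], "->", dict(page_view_counts))
```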
  • 188. 188 THE WEB GIANTS ARCHITECTURE / BIG DATA ARCHITECTURE Should the two channels, batch and real time, be treated as distinct or, on the contrary, be unified? In theory, the ideal is to be able to process the entire dataset, but that is not so simple. There are numerous initiatives but you are unlikely to need any for your ecosystem, where most use cases can do without. The Web giants advise batch-oriented architecture if you have no strong latency constraints, or instead fully real time architecture, but rarely both at once. Lambda architecture Lambda architecture is undoubtedly the most widespread response to the need to unify the two approaches. The principle is to process the data in two layers, batch and real time, carrying out the same processes in both channels, then consolidating the results in a third, dedicated layer: The batch layer precalculates the results based on the complete dataset. It processes raw data and can be regenerated on demand. The speed layer serves to overcome batch latency by generating real time views which undergo the same processing as in the batch layer. These real time views are continuously updated and the events are overwritten in the process, therefore the views can only be replayed by the batch layer. The serving layer then indexes both views, batch and real time, and displays them in the form of consolidated output. Since the raw data are always available in the batch layer, if there are any errors, the output can be regenerated. [Figure: Lambda architecture. Incoming data (mobile, social, IoT sensors) feed both a batch layer, which recomputes precomputed batch views from all the data, and a speed layer, which maintains incremented real-time views; the serving layer merges both views for visualization. Adapted from: Marz, N. & Warren, J. (2013) Big Data. Manning.]
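A minimal sketch of the Lambda idea, under the assumption of a single metric (page views per URL): the batch layer recomputes a complete view from all the raw data, the speed layer increments a real-time view with the events received since the last batch run, and the serving layer merges the two when answering a query. It is an illustration of the principle only, far from a production implementation.

```python
# Minimal Lambda-architecture sketch for one metric (page views per URL):
# a batch view recomputed from all raw data, a real-time view incremented
# with events since the last batch, and a serving layer that merges both.
from collections import Counter

raw_events = [  # the immutable master dataset (input of the batch layer)
    {"page": "/home"}, {"page": "/home"}, {"page": "/product/42"},
]

def batch_view(events):
    """Batch layer: recompute the complete view from scratch, on demand."""
    return Counter(e["page"] for e in events)

realtime_view = Counter()  # speed layer: events not yet absorbed by a batch run

def on_new_event(event):
    realtime_view[event["page"]] += 1

def serve(page, batch):
    """Serving layer: merge the precomputed and real-time views."""
    return batch[page] + realtime_view[page]

batch = batch_view(raw_events)        # e.g. the nightly batch run
on_new_event({"page": "/home"})       # an event arriving after that run
print(serve("/home", batch))          # 3 = 2 (batch) + 1 (real time)
```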
  • 189. 189 THE WEB GIANTS However, few use cases are truly adapted to this type of architecture. It has not yet reached maturity, even among the Web Giants, and is highly complex to implement. More specifically, it entails developing the same processing twice on two types of very different technologies. Doing it once is already difficult enough without having to double the task, especially given that it must all be synchronised. As an alternative to Lambda architecture, Twitter offers, through Summingbird,[12] an abstraction layer where you can integrate computation in both layers within a single framework. What you gain in simplicity you lose in flexibility however: the number of usable features is reduced to the intersection of both modes. Kappa Architecture LinkedIn has released another variant of this model: Kappa Architecture.[13] Their approach is based on processing all data, old and new, in a single layer: the fast layer, thus simplifying the equation. It is a way of better dividing the streams into small independent steps, easier to debug, with each step serving as a checkpoint to replay individual processing steps in case of error. Reprocessing data is one of the most complicated challenges with this type of architecture and must be thoroughly thought through from the outset. Because code, formats and data constantly change, processing must be able to integrate the changes continuously, and that is no small matter. [Figure: Kappa architecture. All data, old and new, are processed as a stream in a single speed layer, with the ability to replay the log; the serving layer exposes the resulting real-time views for visualization. Adapted from: Marz, N. & Warren, J. (2013) Big Data. Manning.] [12] https://github.com/twitter/summingbird [13] http://radar.oreilly.com/2014/07/questioning-the-lambda-architecture.html
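To illustrate the Kappa approach in the same spirit, here is a hypothetical sketch in which a single stream-processing function serves both live events and full reprocessing: when the logic changes, the same code is simply replayed over the retained event log (which is what Kafka's log retention makes possible), instead of maintaining a separate batch implementation.

```python
# Kappa-style sketch: one stream-processing function handles both live events
# and full reprocessing; changing the logic just means replaying the log.
event_log = [  # append-only log of events (what Kafka retains), oldest first
    {"page": "/home"}, {"page": "/product/42"}, {"page": "/home"},
]

def build_view(events, count_product_pages_only=False):
    """The single processing layer; replayed from offset 0 after each change."""
    view = {}
    for e in events:
        if count_product_pages_only and not e["page"].startswith("/product"):
            continue
        view[e["page"]] = view.get(e["page"], 0) + 1
    return view

print(build_view(event_log))                                 # current logic
print(build_view(event_log, count_product_pages_only=True))  # new logic, same log replayed
```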
  • 190. 190 THE WEB GIANTS ARCHITECTURE / BIG DATA ARCHITECTURE How can I make it work for me? Whether you have already invested in Business Intelligence or not, leveraging your data is no longer an option. A data-lake-type solution has become almost inevitable. More flexible than a data warehouse, it makes it possible to process unstructured data and create models on demand. It does not (yet) replace traditional BI but opens up new vistas and possibilities. Based on open source solutions, mostly around Hadoop and its ecosystem, this central business reference is a staunch ally to make data accessible, whatever their type: managing unstructured data, storing and processing large volumes, all with commodity hardware, which is to say low outlay. Whatever your business line, the use cases are numerous and varied: from log analysis and security audits to optimising the buying journey, not forgetting data science of course; data lakes are a key component of intelligent user experience design. To go beyond the offline processing of your data, add online features to your data lake. Although we do not necessarily recommend implementing e.g. Lambda or Kappa architectures, which are too complex for most use cases and not always mature, this does not take away from the advantages to be reaped from real-time schemes, which truly open new perspectives. Stay simple!
  • 193. 193 THE WEB GIANTS DATA SCIENCE Data science now provides technology which is both low-cost and methodologically reliable to better use data in information systems. Data science drives business intelligence even deeper by automating data analysis and processing in order to e.g. predict events, behavior patterns, trends or to generate new insights. In what follows we provide an overview of data science, with illustrations taken from some of its most groundbreaking and surprising applications. Data science is used to extract information from more or less structured data, based on methodologies and expertise developed at the crossroads of IT, statistics, and all business lines involving data.[1] [2] Practically speaking, solving a data science problem translates as projecting into the future patterns grounded in data from the past. One speaks of supervised learning when the main issue is forecasting for a specific target. When the target has not been specified or data are lacking, detecting patterns is said to be unsupervised. One should note that data science also includes building atemporal patterns and then visualizing their various facets. Taking the classic example of purchasing histories and pricing in online retail, data science serves to determine whether a client will buy a new product, or what price they would be willing to pay for it; these are two examples of supervised learning in the respective areas of classification and regression. Carving out marketing segments based on behavior variables, in contrast, is an example of unsupervised learning. More broadly, data science covers all technology and algorithms used to model, implement and visualize an issue using available data, but also to better understand problems by examining them from several viewpoints to potentially solve them in the future. Machine learning is defined as the algorithmic aspect of data science. [1] Dhar V. 2013. “Data science and prediction“. Communications of the ACM [2] Cleveland WS. 2001. “Data science: an action plan for expanding the technical area of the field of statistics“. Bell Labs Statistics Research Report
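The distinction between supervised and unsupervised learning can be made concrete with a small scikit-learn sketch: a classifier predicts whether a client will buy (classification), a regressor estimates the price they would accept (regression), and a clustering algorithm carves out segments without any target variable (unsupervised). The data are invented purely for illustration.

```python
# Supervised vs unsupervised learning on toy, invented purchase data.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.cluster import KMeans

# Two behaviour variables per client: [number of visits, past purchases].
X = np.array([[1, 0], [2, 1], [8, 5], [9, 7], [3, 1], [10, 6]])
bought = np.array([0, 0, 1, 1, 0, 1])            # classification target
price_paid = np.array([0, 5, 40, 55, 8, 60])     # regression target

# Supervised: will this new client buy, and at what price?
clf = LogisticRegression().fit(X, bought)
reg = LinearRegression().fit(X, price_paid)
new_client = np.array([[7, 4]])
print("will buy:", clf.predict(new_client)[0])
print("expected price:", round(float(reg.predict(new_client)[0]), 1))

# Unsupervised: carve out marketing segments without any target variable.
segments = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print("segments:", segments)
```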
  • 194. 194 THE WEB GIANTS Enthusiasm for the discipline is such that today's data scientists must constantly monitor the field to remain on top. Let us seize the occasion to note that in the second half of 2015, OCTO published a Hadoop white paper and a book on data science (in French, English translation forthcoming).[3] [4] Web Giants Among the Web Giants, there is strong movement towards unstructured data (e.g. video and sound). These have traditionally been ignored by analytics due to volume constraints and technical barriers to extracting the information. However they are back in fashion with a combination of breakthroughs in neural network science (including the field currently known as deep learning); in technology, with ever more affordable and powerful machines; and lastly with the wide media coverage of a number of futuristic applications. Groundbreaking work has been going on over the last few years, notably in image and natural language processing, both sound and text. In December 2014, Microsoft announced the launch of Skype Translator, a real-time translation tool for 5 languages, to break down language barriers.[5] With DeepFace, Facebook announced, in June 2014, a giant step forward in facial recognition, reaching a precision level of 97%, close to human performance for a similar task.[6] Google presents similar results with FaceNet in an article dated June 2015 on facial recognition and clustering.[7] [3] http://bit.ly/WP-Hadoop2015 (French) [4] data-science-fondamentaux-et-etudes-de-cas [5] skype-translator-unveils-the-magic-to-more-people-around-the-world [6] deepface-closing-the-gap-to-human-level-performance-in-face-verification [7] http://arxiv.org/pdf/1503.03832.pdf
  • 195. 195 THE WEB GIANTS Such developments in unstructured data processing show that it is now possible to extract value from data hitherto considered out of reach. The key lies in structuring the data: A raw image is transformed into a face, and then linked to a person. The image's context can also be described in a sentence.[8] The patterns extracted from the images can be reproduced with slight modifications, or blended with other images, such as a famous painting, to produce artistic motifs.[9] Speech can be transcribed as text, and music as notes on a score. Patterns extracted from music make it possible to a certain extent to reproduce a composer or musical genre. Masses of unstructured texts are transformed into meaning using semantic vectors. Processing natural language becomes a question of algebraic manipulations, facilitating its use by the algorithms of data science.[10] The mainstreaming of bots and personal assistants such as Apple's Siri, Google's Now and Facebook's M reflects our ability to carry out more and more detailed semantic analyses on unstructured text. The study of brain activity provides clues to identifying signs of illness such as epilepsy or to determining which cerebral patterns correspond to moving one's arm.[11] Some problems requiring cutting-edge expertise are now being handled using data science approaches, including detecting the Higgs boson and searching for dark matter using sky imaging.[12] [13] Such use cases, often tightly linked to challenges launched by academic circles, have largely contributed to the media frenzy around data science. Moreover, for the Web Giants, data science has become not only a way to continuously improve internal processes, but also an integral part of the business model. Google products are free because the data generated by the user has value for advertising targeting. Twitter draws a share of its revenue from the combination of advertising and analytics products. Uber is a perfect example of a data-driven company which, in serving as intermediary between the client and the driver, has nothing to sell other than intelligence in creating links.[14] Intermediation services can easily be copied by the competition, but not the intelligence behind the services. [9] inceptionism-going-deeper-into-neural [10] learning-meaning-behind-words [11] grasp-and-lift-eeg-detection [12] kaggle.com/c/higgs-boson [13] kaggle.com/c/DarkWorlds/data [14] data-science-disruptors DATA SCIENCE
  • 196. 196 THE WEB GIANTS A flourishing ecosystem and accessible tools The standardization of data science came about through the contribution of many tools from the open source world, such as the multiple machine learning and data handling libraries in languages such as R and Python,[15] [16] and from the world of Big Data. These open source ecosystems and their dynamic communities have facilitated access to data science for many an IT engineer or statistician wishing to become a data scientist. In parallel, data analysis tools from major software vendors, whether statistics- or IT-oriented, have also evolved towards integrating open source tools or developing their own implementations of machine learning algorithms.[17] Both the open source and proprietary ecosystems are flourishing, mature, and more and more accessible in terms of training and documentation. Open source is used as much to attract major talent from data science as to provide tools for the community. This strategy is picking up speed as illustrated by the buzz generated by TensorFlow, an open source framework for numerical computation and deep learning published by Google in November 2015.[18] Thanks to highly permissive licensing, these tools are absorbed and improved by the community, transforming them into de facto standards. We have completely lost track of the number of tools from the Hadoop ecosystem which were internally developed by the Web Giants (such as Hive and Presto at Facebook, Pig at Yahoo, Storm and Summingbird at Twitter...) and then took on a second life in the open source world. Platforms for online competitions in data science (such as the best known, kaggle.com, or datascience.net in France) have given new, vibrant visibility to the potential of data science. Various Web Giants such as Facebook and major players in distribution and industry quickly understood that this could help them attract the best talent.[19] Many data science competitions propose job interviews as the top prize, in addition to financial awards and certain glory. [15] four-main-languages-analytics-data-mining-data-science [16] kdnuggets.com/2015/05/r-vs-python-data-science [17] Why-is-SAS-insufficient-to-become-a-data-scientist-Why-need-to-learn-Python-or-R [18] tensorflow-googles-latest-machine_9 [19] kaggle.com/competitions
  • 197. 197 THE WEB GIANTS The Web Giants swiftly organized to recruit the best data scientists, thus anticipating the value added by interdisciplinary teams specialized in capitalizing on data.[20] Many, e.g. Google, Facebook and Baidu, have also hired top specialists in machine learning such as Geoffrey Hinton, Yann LeCun and Andrew Ng.[21] [22] [23] Current challenges in data science One of the most crucial steps in any data science project is called feature engineering. This consists of extracting the relevant numeric variables to characterize one or several facets of the phenomenon under study. For example, numerically describing user behavior on a web site by calculating how often a given page is accessed, or characterizing an image by the number of contours it contains. Feature engineering is also considered one of the most tedious tasks a data scientist has to carry out. For unstructured data such as images, deep learning has made it possible to automate the procedure, placing the use cases mentioned above within reach. For structured data, the creation and selection of new features to improve prediction remain strongly specific to each particular business. This is an essential component of the alchemy of a good data scientist. Feature engineering is still largely implemented manually by the world's best data scientists for structured data.[24] How can I make it work for me? Are all the data you produce stored and then readily accessible? What percentage of the data is in fact processed and analyzed? How often? To what extent do you use the available data to measure your processes and orient your actions? How much importance do you attach to recruiting data scientists, data engineers and data architects? Data science contributes more broadly to the best practices of data driven companies, i.e. those that use the available data both qualitatively and quantitatively to improve all their processes. Answering the few questions above allows you to measure your maturity as concerns data. [20] the-state-of-data-science [21] wired.com/2013/03/google_hinton/ [22] facebook.com/yann.lecun/posts/10151728212367143 [23] chinese-search-giant-baidu-hires-man-behind-the-google-brain [24] http://blog.kaggle.com/2014/08/01/learning-from-the-best/
  • 198. 198 THE WEB GIANTS You have perhaps already used predictive methods based on linear algorithms such as logistic regression traditionally found when establishing marketing scores. Today, the rigorous implementation of the data science methodology gives you control over the inherent complexity in using non-linear algorithms. The underlying compromise in giving up linear algorithms is the loss of capacity to understand and explain predictions in exchange for more realistic, and therefore more useful, predictions. How do I get started? Depending on the nature of your business, you may have unstructured data that deserve a fresh look: Call center recordings to be transcribed and semanticized to better understand your customer relations. Written texts supplied by clients or emails sent by staff to be used to categorize complaints and requests, to detect fads and trends. The takeaway is that in the use cases of most of our clients and in international competitions, the vast majority concern structured or semi-structured data: Mapping links between customers and timestamped transactions can bring to light potential fraud by processing volumes far beyond what is possible manually. Web logs, captured as far upstream as possible, characterize the customer journeys which lead to a strategic target such as shopping cart abandonment. Time series produced by industrial sensors help prevent problems on assembly lines. Server logs identify warning signs before a machine breaks down. Relational data on clients, sales and products form a set of characteristics including identity, geographic location, behavior patterns and social networks which are systematically integrated in the 360° models of the examples described above. Better yet, personalizing your client segments, predicting component failures, improving the performance of your production units, gaining customer loyalty, forecasting increases in demand and reducing churn, are all possible use cases.[25] Data science has become a strategic business asset that you can no longer do without. [25] kaggle.com/wiki/DataScienceUseCases
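To illustrate the feature engineering step on structured data such as the web logs mentioned above, here is a small pandas sketch that turns raw log lines into per-visitor numeric variables (number of page views, number of cart views, whether checkout was reached); such variables could then feed a cart-abandonment model. The column names and data are invented.

```python
# Feature engineering sketch: turning raw web logs into per-visitor numeric
# variables for a cart-abandonment model. Columns and data are invented.
import pandas as pd

logs = pd.DataFrame([
    {"visitor": "v1", "page": "/home"},
    {"visitor": "v1", "page": "/product/42"},
    {"visitor": "v1", "page": "/cart"},
    {"visitor": "v2", "page": "/home"},
    {"visitor": "v2", "page": "/cart"},
    {"visitor": "v2", "page": "/checkout"},
])

features = logs.groupby("visitor").agg(
    page_views=("page", "size"),
    cart_views=("page", lambda p: int((p == "/cart").sum())),
    reached_checkout=("page", lambda p: int((p == "/checkout").any())),
)
print(features)
#          page_views  cart_views  reached_checkout
# visitor
# v1                3           1                 0
# v2                3           1                 1
```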
  • 199. 199 THE WEB GIANTS Sources [1] Dhar V. 2013. “Data science and prediction“. Communications of the ACM [2] Cleveland WS. 2001. “Data science: an action plan for expanding the technical area of the field of statistics“. Bell Labs Statistics Research Report [3] http://bit.ly/WP-Hadoop2015 (French) [4] data-science-fondamentaux-et-etudes-de-cas [5] skype-translator-unveils-the-magic-to-more-people-around-the-world [6] deepface-closing-the-gap-to-human-level-performance-in-face-verification [7] http://arxiv.org/pdf/1503.03832.pdf [8] google-stanford-build-hybrid-neural-networks-that-can-explain-photos [9] inceptionism-going-deeper-into-neural [10] learning-meaning-behind-words [11] grasp-and-lift-eeg-detection [12] kaggle.com/c/higgs-boson [13] kaggle.com/c/DarkWorlds/data [14] data-science-disruptors [15] four-main-languages-analytics-data-mining-data-science [16] kdnuggets.com/2015/05/r-vs-python-data-science [17] Why-is-SAS-insufficient-to-become-a-data-scientist-Why-need-to-learn-Python-or-R [18] tensorflow-googles-latest-machine_9 [19] kaggle.com/competitions [20] the-state-of-data-science [21] wired.com/2013/03/google_hinton/ [22] facebook.com/yann.lecun/posts/10151728212367143 [23] chinese-search-giant-baidu-hires-man-behind-the-google-brain [24] http://blog.kaggle.com/2014/08/01/learning-from-the-best/ [25] kaggle.com/wiki/DataScienceUseCases DATA SCIENCE
  • 201. 201 THE WEB GIANTS Description of the pattern “Everything fails all the time“ is a famous aphorism by Werner Vogels, CTO of Amazon: indeed it is impossible to plan for all the ways a system can crash, in any layer - an inconsistent administration rule, system resources that are not released following a transaction, hardware failure, etc. It is on this simple principle that the architecture of Web Giants is based; it is known as the Design for Failure pattern: computer software must be able to overcome the failure of any underlying component and infrastructure. Hardware is never 100% reliable; it is therefore crucial to isolate components and applications (data grids, HDFS...) to guarantee permanent service availability. At Amazon for example, it is estimated that 30 hard drives are changed every day per data center. The cost is justified by the nearly constant availability of the site amazon.fr (less than 0.3 s. of outage per year), where one must remember that each minute of outage costs over 50,000 euros in lost sales. A distinction is generally made between the traditional continuity of service management model and the design for failure model which is characterized by five stages of redundancy: Stage 1: physical redundancy (network, disk, data center). That is where the traditional model stops. Stage 2: virtual redundancy. An application is distributed over several identical virtual machines within a VM cluster. Stage 3: redundancy of the VM clusters (or Availability Zone on AWS). These clusters are organized into clusters of clusters. Stage 4: redundancy of the clusters of clusters (or Region on AWS). A single supplier manages these regions. Stage 5: redundancy of cloud providers (e.g. AWS and Rackspace) in the highly unlikely event of AWS being completely down. Of course, you will have understood that the higher the redundancy level, the more the deployment and switch mechanisms are automated.
  • 202. 202 THE WEB GIANTS ARCHITECTURE / DESIGN FOR FAILURE Applications built according to Design for Failure continue to function despite system or connected application crashes, even if it means, to continue providing an acceptable level of service, downgrading functions for the most recently connected users or all users. This entails including design for failure in the application engineering, based for example on: Eventual consistency: instead of systematically seeking consistency with each transaction with often costly mechanisms of the XA[1] type, consistency is ensured at the end (eventually) when the failed services are once again available. Graceful degradation (not to be confused with the Web User Interface of the same name): when there are sharp spikes in load, performance-costly functionalities are deactivated on the fly. At Netflix, the streaming service is never interrupted, even when their recommendation system is down, failing or slow: the service is there, no matter what the failure. Moreover, to reach that continuity of service, Netflix uses automated testing tools such as ChaosMonkey (recently open-sourced), LatencyMonkey and ChaosGorilla, which check that applications continue to run correctly despite random failures in, respectively, one or several VMs, network latency, or an Availability Zone. Netflix thus lives up to its motto: “The best way to avoid failure is to fail constantly“. Who makes it work for them? Obviously Amazon, who furnishes the basic AWS building blocks. Obviously Google and Facebook who communicate frequently on these topics. But also Netflix, SmugMug, Twilio, Etsy, etc. In France, although some sites have very high availability rates, very few comment on their processes and, to the best of our knowledge, very few are capable of expanding their redundancy beyond stage 1 (physical) [1] Distributed transaction, 2-phase commit.
  • 203. 203 THE WEB GIANTS or 2 (virtual machines). Let us nonetheless mention Criteo, Amadeus, Viadeo and the main telephone operators (SFR, Bouygues, Orange) for the way they cover real-time needs. What about me? Physical redundancy, rollback plans, Disaster Recovery Plan sites, etc. are not Design for Failure patterns but rather redundancy stages. Design for Failure entails a change in paradigm, going from “preventing all failures“ to “failure is part of the game“, going from “fear of crashing“ to “analyzing and improving“. In fact, applications built along the lines of Design for Failure no longer generate such feelings of panic because all failures are naturally mastered; this leaves time for post-mortem analysis and improvements via the PDCA[2] cycle. It is, to borrow a term from Improv Theater, “taking emergencies easy“. This entails taking action on both a technical and a human level. First of all in application engineering: The components of an application or application set must be decentralized and made redundant by VM, by Zone and by Region (in the Cloud; the same principle applies if you host your own IS), without any shared failure zones. The most complex issue is synchronizing databases. All components must be resilient to underlying infrastructure failures. Applications must support communication breaks and high network latency. The entire production workflow for these applications has to be automated. Then, for the organization: Get out of the A-Team culture (remember: “the last chance at the last moment“) and automate processes to overcome system failures. At Google, there is 1 systems administrator for over 3000 machines. [2] Plan-Do-Check-Act, a method for continuous improvement, known as the “Deming Wheel“.
  • 204. 204 THE WEB GIANTS Analyze and fix failures upstream with the Failure Mode and Effects Analysis (FMEA) method, and downstream with post-mortems and PDCA. Related patterns Pattern “Cloud First“, p. 159. Pattern “Commodity Hardware“, p. 167. Pattern “DevOps“, p. 71. Exceptions For totally disconnected applications, with few users or few business challenges, redundancy can remain simple or even non-existent. Arbitration between redundancy levels is then carried out using ROI criteria (costs and complexity vs. estimated losses during outages). Sources • Don MacAskill, How SmugMug survived the Amazonpocalypse, 24 April, 2011: http://don.blogs.smugmug.com/2011/04/24/how-smugmug-survived-the-amazonpocalypse • Scott Gilbertson, Lessons From a Cloud Failure: It’s Not Amazon, It’s You, 25 April, 2011: http://www.wired.com/business/2011/04/lessons-amazon-cloud-failure • Krishnan Subramanian, Designing For Failure: Some Key Facts, 26 April, 2011: http://www.cloudave.com/11973/designing-for-failure-some-key-facts ARCHITECTURE / DESIGN FOR FAILURE
  • 205. 205 THE WEB GIANTS The Reactive Revolution
  • 206. 206 THE WEB GIANTS For many years now, concurrent processes have been executed in different threads. A program is basically a sequence of instructions that run linearly in a thread. To perform all the requested tasks, a server will generate several threads. But these threads will spend most of their time waiting for the result of a network call, a disk read or a database query. Web giants have moved on to a new model to eliminate such time loss and to increase the number of users per server by reducing latency, improving performance globally and managing peak loads more simply. The reactive manifesto defines a reactive application around four interrelated pillars: event-driven, responsive, scalable and resilient. A responsive application is event-driven: it provides an optimal user experience by making better use of the available computing power and by tolerating errors and failures better, hence its scalability and resilience. But the most powerful concept here is the event-driven orientation; everything else can be seen through this prism. The reactive model is a development model driven by events. It is called by a variety of names; it is all a matter of perspective: event-driven, driven by events; reactive, that reacts to events; push-based, where the data is pushed as it becomes available. Even better: Hollywood, summarised by the famous “don’t call us, we’ll call you“. Use cases: when latency matters This architectural model is very relevant for applications interacting with users in real time. This includes several use cases like: Social networks, shared documents and direct communication tools
  • 207. 207 THE WEB GIANTS Financial analysis; pooled information like traffic congestion, public transport or pollution...; Multiplayer games; Multi-channel approaches, mobile application synchronisation; Open or private APIs, when usage is impossible to predict; IoT and index management; Massive user influx such as sport events, sales, TV ads...; And more generally when effectively managing complex algorithms is the issue, e.g. for ticket booking, graph management, the semantic web. One of the crucial elements in all these applications is latency handling. For an application to be responsive and thus usable, users must experience the lowest possible latency. It’s all about the threading strategy To put it simply, there are two types of thread: Hard-threads: these are real concurrent processes that are executed by the different processor cores; Soft-threads: these are simulations of concurrent processes that dedicate portions of the CPU to each process, alternately. Fortunately, the soft-threads allow machines to simultaneously run many more threads than they have cores. The reactive model aims to remove as many soft-threads as possible and only use hard-threads, making more efficient use of modern processors. To reduce the number of threads, the CPU must not be shared on a time basis, but instead on an event basis. Each call triggers the processing of a piece of code; this code must never block, so as to release the CPU as quickly as possible to process the next event. ARCHITECTURE / THE REACTIVE REVOLUTION
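To illustrate the event-based use of the CPU just described, here is a small Python sketch using asyncio: three simulated slow calls (network, disk, database) are awaited concurrently on a single thread, the event loop switching to the next piece of work whenever one of them would otherwise block. It is a language-level illustration of the principle under simplified assumptions, not the internal implementation of any particular framework.

```python
# Single-threaded, event-driven handling of three "slow" calls with asyncio:
# while one call waits, the event loop runs the others instead of blocking.
import asyncio

async def fetch(source, delay):
    await asyncio.sleep(delay)          # stands in for a network/disk/DB wait
    return f"{source}: done after {delay}s"

async def handle_request():
    # The three waits overlap: total time is ~0.3s rather than 0.6s.
    results = await asyncio.gather(
        fetch("network call", 0.3),
        fetch("disk read", 0.2),
        fetch("database query", 0.1),
    )
    for line in results:
        print(line)

asyncio.run(handle_request())
```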
  • 208. 208 THE WEB GIANTS Implementing this model means operating in all software layers: from operating systems to development languages, via frameworks, hardware drivers and databases. A data structure that eliminates locks is beyond doubt an important lever for system performance. New functional data models then become the best allies of reactive models. Among new software making the most buzz, many use an internal reactive model. To name but a few: Redis, Node.js, Storm, Play, Vertx, Axom, and Scala. The reactive model is more likely to respond well to load peaks. It removes the limit on the number of simultaneous users imposed by an arbitrary, fixed thread pool size. Most of the Web giants have published their experience feedback on their migration to this model: Coursera,[1] Gilt, Groupon, Klout, LinkedIn,[2] Netflix,[3] PayPal, Twitter,[4] Walmart[5] and Yahoo. Their voices are unanimous: reactive architectures make it possible to offer the best user experience with the highest scalability. Why now? “Software gets slower faster than hardware gets faster.“ Niklaus Wirth – 1995 The reactive model is not new. It has been used in all user interface frameworks since the invention of the mouse. Each click or keystroke generates an event. Even client-side JavaScript uses this model. There is no thread in this language, yet it is possible to have multiple simultaneous AJAX requests. Everything works using callbacks and events. [1] http://downloads.typesafe.com/website/casestudies/Coursera-Case-Study.pdf [2] http://engineering.linkedin.com/play/play-framework-async-io-without-thread-pool-and-callback-hell [3] http://www.infoq.com/presentations/netflix-reactive-rest [4] https://blog.twitter.com/2013/new-tweets-per-second-record-and-how [5] http://venturebeat.com/2012/01/24/why-walmart-is-using-node-js/
  • 209. 209 THE WEB GIANTS Current development architectures are the result of a succession of steps and evolutions. Some strong concepts have been introduced and used extensively before being replaced by new ideas. The environment is also changing. The way we respond to it has changed. User experience has been the driving force of this change: today, who is willing to fill in a form, wait for the page to reload to provide feedback (failure/success) and wait again for the confirmation email? Why not get such information immediately rather than asynchronously? Have we reached the limits of our systems? Is there still space to be conquered? Performance gains to discover? In our systems, there is a huge untapped power reservoir. To double the number of users, adding a server will do the trick. But since the advent of mobile, companies have to handle about 20x more requests: is it reasonable to multiply the number of servers in proportion? And is it sufficient? Certainly not. Up to a point, it makes more sense to review the architecture to harness the power that’s already available: there are many more processor cycles left to optimise. And when programs spend significant amounts of time waiting for disks, networks or databases, they don’t harness server potential. From now on, this paradigm becomes accessible to everyone as it is built into modern development languages. These new development patterns integrate latency and performance management at the beginning of all projects. It is no longer a challenge to overcome when it is too late to change the application architecture. Applications based on the request/response model (HTTP / SOAP / REST) can tolerate a thread model. In contrast, applications based on flows like JMS or WebSocket will have everything to gain from working off a model based on events and soft threads. Unless your application is mostly devoted to calculations, you should start thinking about implementing the reactive approach. The paradigm is compatible with all languages. ARCHITECTURE / THE REACTIVE REVOLUTION
  • 210. 210 THE WEB GIANTS Things are moving fast: new frameworks now offer asynchronous APIs and, in-house, mostly use non-blocking APIs; language libraries are also changing, now providing classes which make it possible to react to events more simply; and, lastly, the languages themselves are changing to make it easier to write concise code (closures) or generate asynchronous code from synchronous code. In addition, patterns can be set up to manage threadless multitasking: a generator, which produces elements and pauses at each iteration, until the next invocation; a continuation, a closure which becomes a procedure to be executed once processing is done; a coroutine, which makes it possible to pause processing; composition, which makes it possible to sequence processing in the pipeline; Async/Await, to distribute processing over several cores. In other words, the reactive revolution is underway! How can I make it work for me? Reactive architecture is to architecture what NoSQL is to relational databases: a very good alternative when you have reached your limits. It is all a question of latency and access contention: for real time applications, whether embedded or not, choosing reactive architecture is justified as soon as you have a significant increase in volume. So no reactive corporate website, but instead real time processing and display of IoT data (a fleet's position for example). The same goes for APIs, so the back-ends must be designed accordingly: if your volume is under control, reactive architecture is overkill. When opening to partners or even providing an open API, it appears necessary to design non-blocking architecture from the outset.
  • 211. 211 THE WEB GIANTS Lastly, on one hand, wisely using the cloud can help you overcome many of these limits (AWS Lambda, for example), and, on the other hand, many software vendors have demonstrated their willingness to produce highly scalable architecture. When choosing a SaaS software package or one hosted on the premises for these use cases, companies must now turn to vendors who have proven they master such architecture. All of these technologies have physical limits. Disk capacities are increasing, but access times are not. There are more cores in processors, but frequency has not increased. Memory is increasing, beyond the capacity of garbage collectors. If you are nearing these limits, or will do so in the next few years, reactive architecture is definitely made for you. ARCHITECTURE / THE REACTIVE REVOLUTION
  • 214. 214 THE WEB GIANTS ARCHITECTURE / OPEN API Description The principle behind Open API is to develop and offer services which can be used by a third party without any preconceived ideas as to how they will be used. Development is thus mainly devoted to application logic and persistence. The interface and business logic are developed by others, often more specialized in interface technologies and ergonomics, or having other specificities.[1] The application engine therefore exposes an API,[2] which is to say a bundle of services. The end application is based on a composition of services, which can include services provided by third parties. This is the case for example for HousingMaps.com, a service for visualizing advertisements from Craigslist using Google Maps. The pattern belongs to the broader principles of SOA:[3] decoupling and composition possibilities. For a while, there was a divide between the architecture of Web Giants, generally of the REST[4] type, and corporate SOA, mostly based on SOAP.[5] There has been a lot of controversy among bloggers on this opposition between the two architectures. What we believe is that the REST APIs exposed by Web Giants are just one form of SOA among others. Web Giants publicly expose their APIs, thus creating open ecosystems. What this strategy does for them is to: Generate direct income, by billing the service. Example: Google Maps charges for their service beyond 25,000 transactions per day. Expand the community, thereby recruiting users. Example: thanks to the apps derived from its platform, Twitter has reached 140 million active users (and 500 million subscribers). [1] http://www.slideshare.net/kmakice/maturation-of-the-twitter-ecosystem [2] Application Programming Interface. [3] Service Oriented Architecture. [4] Representational State Transfer. [5] Simple Object Access Protocol.
  • 215. 215 THE WEB GIANTS Foster the emergence of new uses for their platform, thus developing their income model. Example: in 2009, Apple noted that application developers wanted to sell not only their applications, but also content for them. The AppStore model was changed to include that possibility. At times, externalize R&D, then acquire the most talented startups. That is what Salesforce did with Financialforce.com. Marc Andreessen, creator of Netscape, divides open platforms into three types: Level 1 - Access API: these platforms allow users to access business applications without providing the user interface. Examples: book searches on Amazon, geocoding on Mappy. Level 2 - Plug-in API: These platforms integrate applications in the supplier’s user interface. Examples: Facebook apps, Netvibes Widgets. Level 3 - Runtime Environment: These platforms provide not only the API and the interface, but also the execution environment. Example: AppExchange applications in the Salesforce or iPhone ecosystem. It is also good to know that Web Giants’ APIs are accessible in self-service, i.e. you can subscribe directly on the web site without any commercial relations with the provider. At level 3, you must design a multi-tenant system. The principle is to manage the applications of several businesses in isolation, finding a balance between resource pooling and self-containment. The pattern API First is derived from the Open API pattern: its approach is to begin by building an API, then to consume it to build applications for your end users. The idea is to be on the same level as the ecosystem users, which means applying to yourself the same architecture principles you are offering your clients, which is to say the pattern Eat Your Own Dog Food (EYODF). Some architects working for Web Giants consider it the best way to build a new platform.
  • 216. 216 THE WEB GIANTS ARCHITECTURE / OPEN API In practice, the API First pattern is an ideal which is not always reached: in recent history, it would seem that it has been applied for Google Maps and Google Wave, two services developed by Lars Rasmussen. And yet it was not applied for Google+, stirring the wrath of many a blogger. Who makes it work for them? Pretty much everyone, actually... References among Web Giants The Google Maps API is a celebrity: according to programmableWeb.com, alongside Twitter’s it is one of the most widely used by websites. It has become the de facto standard for showing objects on a map. It uses authentication processes (client IDs) to measure the consumption of a given application, so as to be able to bill the service beyond a certain quota. Twitter’s API is widely used: it offers sophisticated services to access subscriber data, in read and write versions. One can even use streaming to receive tweet updates in real time. All of the site’s functionalities are accessible via their API. The API also makes it possible to delegate the authorization process (using the OAuth protocol), thereby allowing a third party application to tweet in your name. In France The mapping service Mappy offers APIs for geocoding, calculating itineraries, etc., available at api.mappy.com With api.orange.com, Orange offers the possibility to send text messages, to geolocate subscribers, etc. What about me? You should consider Open API whenever you want to create an ecosystem open to partners or clients, in-house or externally. Such an ecosystem can be open on the Internet or restricted to a single organization. A relatively classic scenario in a business is exposing the company’s employee directory so that other applications can integrate their identities.
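As a hedged sketch of the self-service, quota-controlled APIs described above, here is a minimal Flask service exposing such an employee directory: each call must carry an API key, consumption is counted per key so that usage beyond a quota could be billed, and the data are returned as JSON for any consuming application. The endpoint, keys, quota and data are all invented for illustration.

```python
# Minimal Open API sketch with Flask: an employee-directory service protected
# by an API key, with per-key call counting (endpoint, keys and quota invented).
from flask import Flask, jsonify, request

app = Flask(__name__)

DIRECTORY = [
    {"id": 1, "name": "Ada Lovelace", "department": "Engineering"},
    {"id": 2, "name": "Grace Hopper", "department": "Engineering"},
]
API_KEYS = {"partner-key-123": {"calls": 0, "quota": 25000}}  # daily quota

@app.route("/api/v1/employees")
def list_employees():
    key = request.headers.get("X-Api-Key")
    account = API_KEYS.get(key)
    if account is None:
        return jsonify(error="unknown API key"), 401
    account["calls"] += 1
    if account["calls"] > account["quota"]:
        return jsonify(error="quota exceeded, extra calls are billed"), 429
    return jsonify(employees=DIRECTORY)

if __name__ == "__main__":
    app.run(port=5000)
```

A client would then call, for example, GET /api/v1/employees with the header X-Api-Key: partner-key-123 and receive the directory as JSON.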
  • 217. 217 THE WEB GIANTS Another familiar case is integrating services exposed by other suppliers (for example a bank consuming the services of an insurance company). Lastly, a less traditional use is to open a platform for your end clients: A bank could allow its users to access all of their transactions: see the examples of the AXA Banque and CAStore APIs. A telephone or energy provider could give their clients access to their current consumption rate. Related Pattern Pattern “Device Agnostic“ p. 143 Exception! Anything requiring a complex workflow. Real-time IT (aircraft, car, machine tool): in this case service composition can pose performance issues. Data manipulation posing regulatory issues: channelling critical data between platforms is best avoided. Sources • REST (Representational State Transfer) style: http://en.wikipedia.org/wiki/Representational_State_Transfer • SOA http://en.wikipedia.org/wiki/Service-oriented_architecture • Book “SOA, Le guide de l’architecte d’un SI agile“ (French only): http://www.dunod.com/informatique-multimedia/fondements-de-linformatique/architectures-logicielles/ouvrages-professionnel/soa-0
• Open platforms according to Marc Andreessen:
http://highscalability.com/scalability-perspectives-3-marc-andreessen-internet-platforms

• Mathieu Lorber, Stéphen Périn, "What strategy for your web API?", USI 2012 (French only):
http://www.usievents.com/fr/sessions/1052-what-strategy-for-your-web-api?conference_id=11-paris-usi-2012
About OCTO Technology

"We believe that IT transforms our societies. We are fully convinced that major breakthroughs come from sharing knowledge and from the pleasure of working with others. We are constantly looking for better ways of doing things. THERE IS A BETTER WAY!" – OCTO Technology Manifesto

OCTO Technology specializes in consulting and in the delivery of ICT projects. Since 1998, we have been helping our clients build their information systems and create the software that transforms their firms. We provide expertise in technology, methodology, and Business Intelligence.

At OCTO, our clients are supported by teams who are passionate about using technology and creativity to rapidly transform ideas into value: Adeo, Altadis, Asip Santé, Ag2r, Allianz, Amadeus, Axa, Banco Fibra, BNP Fortis, Bouygues, Canal+, Cdiscount, Carrefour, Cetelem, CNRS, Corsair Fly, Danone, DCNS, Generali, GEFCO, ING, Itaú, LegalGeneral, La Poste, Maroc Telecom, MMA, Orange, Pages jaunes, Parkeon, Société Générale, Viadeo, TF1, Thales, etc.

We have grown into an international group with four subsidiaries: Morocco, Switzerland, Brazil and, more recently, Australia.

Since 2007, OCTO Technology has been granted the status of "innovative firm" by OSEO Innovation. For four years between 2011 and 2015, OCTO was awarded 1st or 2nd prize in the Great Place to Work contest for firms with fewer than 500 employees.
Authors

Erwan Alliaume, David Alia, Philippe Benmoussa, Marc Bojoly, Renaud Castaing, Ludovic Cinquin, Vincent Coste, Mathieu Gandin, Benoît Guillou, Rudy Krol, Benoît Lafontaine, Olivier Malassi, Éric Pantera, Stéphen Périn, Guillaume Plouin, Phillipe Prados.

Translated from the French by Margaret Dunham and Natalie Schmitz.

Copyright © November 2012 by OCTO Technology. All rights reserved.
Illustrations

The drawings are by Tonu, in collaboration with Luc de Brabandere. Both are active at CartoonBase (www.cartoonbase.com), based in Belgium. CartoonBase works mostly with businesses, promoting the use of cartoons and encouraging greater creativity in graphic art and illustration of all kinds.

Graphics and design by OCTO Technology, with the support of Studio CPCR and toumine@gmail.com.
ISBN-13: 978-2-9525895-4-3
Price: AUD $32

The Web Giants
Culture – Practices – Architecture

In the US and elsewhere around the world, people are reinventing the way IT is done. These revolutionaries most famously include Amazon, Facebook, Google, Netflix, and LinkedIn. We call them the Web Giants. This new generation has freed itself from the tenets of the past to offer a different approach and radically efficient solutions to old IT problems.

Now that these pioneers have shown us the way, we cannot simply maintain the status quo. The Web Giants' way of working combines firepower, efficiency, responsiveness, and a capacity for innovation that our competitors will go after if we don't get there first.

In your hands is a compilation and structured outline of the Web Giants' practices, technological solutions, and most salient cultural traits (obsession with measurement, pizza teams, DevOps, open ecosystems, open software, big data, feature flipping).

Written by a consortium of experts from the OCTO community, this book is for anyone looking to understand Web Giant culture. While some of the practices are fairly technical, most of them require no IT expertise and can be taken up by marketing and product teams, managers, and geeks alike. We hope this will inspire you to take an active part in IT, that driving force which transforms our societies.

THE OBSESSION WITH MEASUREMENT • FLUIDITY OF THE USER EXPERIENCE • ARTISAN CODERS • BUILD VERSUS BUY • CONTRIBUTING TO FREE SOFTWARE • DEVOPS • PIZZA TEAMS • MINIMUM VIABLE PRODUCT • PERPETUAL BETA • A/B TESTING • DEVICE AGNOSTIC • OPEN API AND OPEN ECOSYSTEMS • FEATURE FLIPPING • SHARDING • COMMODITY HARDWARE • TP VERSUS BI: THE NEW NOSQL APPROACH • CLOUD FIRST • DATA SCIENCE • REACTIVE PROGRAMMING • DESIGN THINKING • BIG DATA ARCHITECTURE • BUSINESS PLATFORM

OCTO designs, develops, and implements tailor-made IT solutions and strategic apps ...Differently.

WE WORK WITH startups, public administrations, AND large corporations FOR WHOM IT IS a powerful engine for change.

octo.com - blog.octo.com - web-giants.com