A TALE OF TWO SYSTEMS: INSIGHTS FROM SOFTWARE ARCHITECTURE
DAVID MAX
Senior Software Engineer
ABOUT LINKEDIN NEW YORK CITY
● Located in the Empire State Building.
● Approximately 90 engineers out of about 1000 employees total.
● Multiple teams: front end, back end, and data science.
#nwd2018
WHAT “TWO SYSTEMS”?
System 1
● A working system that is nearing the limits of its capacity.
System 2
● The replacement system designed to address the capacity issues.
○ Solves the capacity problem…
○ …but utterly fails in other ways.
ANTI-PATTERN
“A common response to a recurring
problem that is usually ineffective and
risks being highly counterproductive.”
– Wikipedia
“An antipattern is just like a pattern,
except that instead of a solution it gives
something that looks superficially like a
solution but isn’t one.”
– Andrew Koenig
COACH VS. ROOKIE
More powerful conceptual
models help us better make
sense of what we see.
WHAT THE COACH HAS IS...
“...a set of mental abstractions that allow him to convert his
perceptions of raw phenomena, such as a ball being passed, into a
condensed and integrated understanding of what is happening,
such as the success of an offensive strategy.
The coach watches the same game that the rookie does, but he
understands it better.”
– George Fairbanks, Just Enough Software Architecture
THINKING LIKE A COACH - CONCEPTUAL MODELS
What is Software Architecture?
“Software Architecture refers to the high
level structures of a software system, the
discipline of creating such structures,
and the documentation of these
structures. These structures are needed
to reason about the software system.”
– Wikipedia
“Software architecture is the set of design
decisions which, if made incorrectly, may
cause your project to be cancelled.”
– Eoin Woods
ARCHITECTURALLY SIGNIFICANT REQUIREMENTS (ASRs)
Constraints - Unchangeable design decisions, usually given, sometimes chosen.
Quality Attributes - Externally visible properties that characterize how the system operates in a specific context.
Influential Functional Requirements - Features and functions that require special attention in the architecture.
Other Influencers - Time, knowledge, experience, skills, office politics, your own geeky biases, and all the other stuff that sways your decision making.
– Michael Keeling, Design It!
QUALITY ATTRIBUTES - STANDARD BLENDER
Pros:
● Powerful motor (550 Watts)
● Sits well on kitchen counter
● Dishwasher safe
Cons:
● Must be plugged in
● Limited portability
(example from Design It! by Michael Keeling)
CORDLESS RECHARGEABLE HAND BLENDER
Pros:
● Small, very portable
● Doesn’t need electric outlet to operate
● Very easy to clean
Cons:
● Less powerful (2.5 Watts)
● Needs to be recharged after 20 minutes
● Must hold in hand to operate
CHAINSAW BLENDER
Pros:
● Portable, doesn’t need electric outlet
● Powerful! (37cc gas-powered engine)
Cons:
● Tad loud
● Emits exhaust unsafe for indoor use
● Not suitable for kitchen countertop use
TAKEAWAYS
● Three solutions for accomplishing the same task
● Each solution promotes a different set of quality attributes
● Quality attributes often trade off against each other
● The “best” design depends on which properties are most highly valued
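The last takeaway can be made concrete with a small weighted-scoring sketch. The 1-5 scores and the three sets of priorities below are hypothetical values chosen for illustration; they are not measurements from the talk or from Design It!. The point is only that the same three designs rank differently once the weights change.

```python
# Hypothetical 1-5 scores per quality attribute (illustrative, not measured).
SCORES = {
    "standard blender": {"power": 4, "portability": 1, "kitchen_friendly": 5},
    "cordless hand blender": {"power": 1, "portability": 5, "kitchen_friendly": 4},
    "chainsaw blender": {"power": 5, "portability": 4, "kitchen_friendly": 1},
}


def best_design(weights):
    """Return the design with the highest weighted score for these priorities."""
    def total(design):
        return sum(weights[attr] * score for attr, score in SCORES[design].items())
    return max(SCORES, key=total)


# Three hypothetical stakeholders who value the attributes differently.
home_cook = {"power": 1, "portability": 1, "kitchen_friendly": 3}
camper = {"power": 1, "portability": 3, "kitchen_friendly": 1}
tailgate = {"power": 3, "portability": 2, "kitchen_friendly": 0}

for name, weights in [("home cook", home_cook), ("camper", camper), ("tailgate", tailgate)]:
    print(name, "->", best_design(weights))
```

No design wins for every stakeholder, which is the sense in which the “best” design depends on which properties are most highly valued.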
AGGREGATION
[Figure labels: Input files, Processing, Output file]
OLD SYSTEM FLOW
PROBLEMS
● Aggregator terminates with an out-of-memory error on the largest inputs.
● Task Manager shows there’s plenty of memory left.
● A single memory allocation requests well over 500 MB at once, and fails.
Who needs 500 MB at once?
If there is plenty of memory left, why is it failing?
WIN32 PROCESS ADDRESS SPACE
80000000 - FFFFFFFF (2 GB): System virtual address space. Reserved for use by system.
00000000 - 7FFFFFFF (2 GB): Per-process virtual address space. Available for use by applications.
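As a quick check of the numbers on this slide, the default 32-bit Windows layout can be worked out directly: a 32-bit pointer can name 4 GiB of addresses, and the default split gives half to the system and half to each process. The constant names below are made up for this sketch.

```python
GIB = 2**30
ADDRESS_SPACE = 2**32  # every address a 32-bit pointer can name

# Default split on 32-bit Windows, as shown on the slide.
USER_START, USER_END = 0x00000000, 0x7FFFFFFF
SYSTEM_START, SYSTEM_END = 0x80000000, 0xFFFFFFFF

user_size = USER_END - USER_START + 1
system_size = SYSTEM_END - SYSTEM_START + 1

print(user_size // GIB, "GiB per process;", system_size // GIB, "GiB for the system")
```

However much physical memory the machine has, a single process in this layout has at most 2 GiB of addresses to hand out.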
MEMORY MAPPED FILE
ADDRESS SPACE FRAGMENTATION
Even with plenty of memory available, fragmentation of the address space means there’s not enough contiguous address space to fit the new block.
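The effect can be reproduced with a toy model of the 2 GB per-process address space. The layout below is hypothetical (one 64 MiB block every 400 MiB, standing in for DLLs, heaps, or earlier mappings) and is not the real process from the talk; it only shows how total free space can be large while no single gap is.

```python
MIB = 2**20
USER_SPACE = 2048 * MIB  # 2 GiB of per-process address space


def free_gaps(allocations, space=USER_SPACE):
    """Return the sizes of the free gaps between (start, size) allocations."""
    gaps = []
    cursor = 0
    for start, size in sorted(allocations):
        if start > cursor:
            gaps.append(start - cursor)
        cursor = max(cursor, start + size)
    if cursor < space:
        gaps.append(space - cursor)
    return gaps


def can_allocate(allocations, request, space=USER_SPACE):
    """A request succeeds only if one contiguous gap is large enough."""
    return any(gap >= request for gap in free_gaps(allocations, space))


# Hypothetical layout: a 64 MiB block at 300, 700, 1100, 1500 and 1900 MiB.
allocations = [(offset * MIB, 64 * MIB) for offset in range(300, 2048, 400)]

gaps = free_gaps(allocations)
print("total free MiB:", sum(gaps) // MIB)
print("largest contiguous gap MiB:", max(gaps) // MIB)
print("500 MiB request fits:", can_allocate(allocations, 500 * MIB))
```

In this layout well over 1.5 GiB is free in total, yet a 500 MiB request fails because the largest single gap is smaller than that.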
COACHABLE MOMENT
● Don’t wait until your system is already blowing up.
● Some scaling problems can’t be solved by buying a bigger computer.
LET’S FIX IT!
Symptom: Aggregator is failing with an out-of-memory error.
Reason: Output file is too large to fit in a Win32 memory mapped file.
Analysis: Current implementation can’t scale beyond a certain size output.
Conclusion: We have a scalability problem.
Solution: Replace aggregation data store with a more scalable solution.
IN-MEMORY DISTRIBUTED DATA CACHE
NEW ARCHITECTURE HAS NICE NEW ATTRIBUTES
NEW ARCHITECTURE OFFERS NEW SCALABILITY OPTIONS
Increasing Scalability
OLD SYSTEM FLOW
NEW SYSTEM FLOW
RUN TIME PERFORMANCE (NIGHTLY BATCH)
ROOKIE MISTAKES
● Include all constraints
○ Fixated on scalability.
○ Forgot that we also had an important time constraint!
● Quality Attributes
○ Worried mainly about scalability, time to implement, and reducing changes to other parts of the system.
○ Forgot that quality attributes trade off against each other, and did not analyze to what extent scalability is an ASR.
● Other differences
○ Single-process memory mapped files have different performance characteristics from in-memory distributed data caches.
SIGNIFICANT DIFFERENCES
Scenario - Lots of workers writing to same record.
Memory Mapped File - Best performance because the memory page is most likely to be in memory. Less likely to need to swap to disk.
[Figure labels: Worker (x5), CPU Cache, Memory Page, Mapped Address Range, File on Disk]
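A minimal sketch of the memory-mapped-file scenario, assuming a hypothetical output file of fixed-size 64-bit counters (the real aggregator's record format is not given in the talk). Five updates to the same record all touch the same few bytes of the mapped range, so they stay on one memory page. The updates here run sequentially in one process; real concurrent workers would also need locking, which this sketch leaves out.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<q")  # one signed 64-bit counter per record (assumed format)
NUM_RECORDS = 1024


def add_to_record(mapped, index, amount):
    """Read-modify-write one counter in place inside the mapped range."""
    offset = index * RECORD.size
    (current,) = RECORD.unpack_from(mapped, offset)
    RECORD.pack_into(mapped, offset, current + amount)


def aggregate(path, updates):
    """Apply (record_index, amount) updates through a memory-mapped view of the file."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mapped:
        for index, amount in updates:
            add_to_record(mapped, index, amount)
        mapped.flush()


def read_record(path, index):
    """Read one counter back with ordinary file I/O to confirm it reached the file."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))[0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.bin")
    with open(path, "wb") as f:
        f.write(bytes(NUM_RECORDS * RECORD.size))  # preallocate zeroed records
    # Five "workers" all update record 7, so every write lands on the same page.
    aggregate(path, [(7, 1)] * 5)
    hot_total = read_record(path, 7)

print("record 7 =", hot_total)
```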
IN-MEMORY DISTRIBUTED CACHE
Scenario - Lots of workers writing to same record.
Worst performance when workers write to the same record on different machines because of node-to-node synchronization.
[Figure labels: Node (x6), Worker (x4)]
IN-MEMORY DISTRIBUTED CACHE
Scenario - Lots of workers writing to same node.
Poor performance because unable to distribute load.
[Figure labels: Node (x6), Worker (x8)]
MEMORY MAPPED FILE
Scenario - Every worker writes to a different record.
Worse performance, because fewer cache hits, more page faults, and more disk I/O.
[Figure labels: Worker (x5), CPU Cache, Memory Page (x2), Page Fault, Mapped Address Range, File on Disk]
IN-MEMORY DISTRIBUTED CACHE
Scenario - Records associated with particular nodes. Load distributed over nodes.
Best performance. Record locality minimizes node-to-node synchronization.
Distributing connections over the cluster promotes better scaling.
[Figure labels: Node (x6), Worker (x9)]
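One common way to get the record locality described on this slide is to give every record a single owning node chosen by a stable hash of its key. The sketch below is an assumption about how that could look, not the cache product or routing scheme used in the talk; the node names and record keys are invented.

```python
import zlib
from collections import Counter

# Hypothetical six-node cluster, matching the six nodes drawn on the slide.
NODES = ["node-0", "node-1", "node-2", "node-3", "node-4", "node-5"]


def owner(record_key, nodes=NODES):
    """Map a record key to one owning node using a stable hash."""
    return nodes[zlib.crc32(record_key.encode("utf-8")) % len(nodes)]


def route(updates):
    """Group updates by owning node so each record is only written on one node."""
    batches = {}
    for key, amount in updates:
        batches.setdefault(owner(key), []).append((key, amount))
    return batches


# 1000 updates spread over 100 hypothetical record keys.
updates = [(f"record-{i % 100}", 1) for i in range(1000)]
batches = route(updates)
load = Counter({node: len(batch) for node, batch in batches.items()})

for node in NODES:
    print(node, load.get(node, 0))
```

Because a given key always maps to the same node, no two nodes ever write the same record, and the per-node counts show how the load spreads across the cluster. A skewed key distribution would still overload one node, which is the "same node" scenario above.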
CONCLUSION
● Thinking about the architecture helps us better understand how what
we are building addresses the important requirements.
● Promoting one quality attribute usually involves some kind of tradeoff.
Software Engineering is the discipline of balancing tradeoffs.
● The architecture is the hardest thing to change after the fact, so it pays
to invest some time up front analyzing the ASRs.
● Don’t wait until your system is falling over to make needed changes.
Less time spent on the architecture up front often means more time
spent doing avoidable rework later.
Thank You!
linkedin.com/in/davidpmax
