THE JOURNEY OF ASYNCIO ADOPTION IN INSTAGRAM
Jimmy Lai
PyCon TW 2018
OUTLINE
1. What's asyncio?
2. Asyncio Adoption in Instagram
3. Q&A
ABOUT ME - JIMMY LAI
• Software Engineer in Instagram Infrastructure
• I like Python
• Recent interests: Python efficiency
  • profiling
  • Cython
  • asyncio
INSTAGRAM BACKEND
• Python + Django
• Serving with uwsgi
• Data fetching from backends
• Number of processes > number of CPUs
[Diagram: one server runs uwsgi with many Django processes that share the CPU and shared memory; the processes fetch data from memcached, cassandra, thrift services, ...]
https://instagram-engineering.com/
BLOCKING I/O PROBLEMS
• Slow API: the API takes longer to finish, which is a bad user experience.
• CPU idle: context switches between processes come with overhead.
• Harakiri: uwsgi terminates a process whose request runs too long (uwsgi Harakiri). Restarting a process has high overhead.
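WHAT'S ASYNCIO
• Asynchronous I/O
• Running I/O concurrently
• Simultaneous Exhibition
[Photo: https://rarehistoricalphotos.com/samuel-reshevsky-age-8-france-1920/]
• Blocking IO mode
• Async IO mode
[Timeline figure: in blocking IO mode the CPU and I/O phases of each task follow one another along the time axis; in async IO mode the I/O phases of several tasks overlap.]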
ASYNCIO AS SOLUTION
• Slow API: the API runs faster and users get a better experience.
• CPU idle: in-thread context switch instead of process context switch.
• Harakiri: just cancel the pending async call. No need to kill the process.
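The Harakiri point above can be illustrated with asyncio.wait_for, which cancels a pending awaitable once a timeout expires. This is only a minimal sketch and not Instagram's implementation: slow_backend_call, handle_request and the 0.05 s budget are hypothetical names and values chosen for illustration, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio

async def slow_backend_call():
    # Hypothetical stand-in for a backend request that takes too long
    await asyncio.sleep(1)
    return "data"

async def handle_request(budget_sec):
    try:
        return await asyncio.wait_for(slow_backend_call(), timeout=budget_sec)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the pending call; the process keeps running
        return "timed out"

outcome = asyncio.run(handle_request(0.05))
print(outcome)
```

Only the slow call is abandoned here; the worker process and its event loop stay alive for the next request.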
MYTHS ABOUT ASYNCIO
1. Myth: asyncio is multi-process or parallel computing. In fact, it is single-threaded.
  • Only one function can execute at a time.
  • Only I/O can run concurrently.
2. Myth: asyncio is always faster in terms of CPU and latency.
  • The overhead of the event loop and context switches can be significant.
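A small experiment makes the first myth concrete. The sketch below (function names and the 0.1 s durations are illustrative assumptions) gathers three coroutines twice: once waiting with asyncio.sleep, which yields to the event loop, and once with time.sleep, which holds the single thread.

```python
import asyncio
import time

async def cooperative(sec):
    await asyncio.sleep(sec)   # yields to the event loop while waiting

async def blocking(sec):
    time.sleep(sec)            # holds the only thread; nothing else can run

async def timed(make_coro):
    start = time.perf_counter()
    await asyncio.gather(make_coro(0.1), make_coro(0.1), make_coro(0.1))
    return time.perf_counter() - start

async def main():
    return await timed(cooperative), await timed(blocking)

concurrent_sec, serialized_sec = asyncio.run(main())
print(f"asyncio.sleep: {concurrent_sec:.2f}s, time.sleep: {serialized_sec:.2f}s")
```

The cooperative run should take roughly as long as one sleep, while the blocking run takes roughly the sum of all three, because only awaited I/O overlaps.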
CPYTHON ASYNCIO
• The asyncio module became available starting in CPython 3.4
• Instagram used Python 2.7 for a long time and migrated to Python 3.5 in 2017
ASYNC SYNTAX
• async def, await, coroutine

    In [1]: import asyncio

    In [2]: async def sleep_and_return(sec):
       ...:     await asyncio.sleep(sec)
       ...:     return sec
       ...:

    In [3]: sleep_and_return(1)  # calling only creates a coroutine object; nothing runs yet
    Out[3]: <coroutine object sleep_and_return at 0x10556ae60>

• run async function in event loop

    In [4]: asyncio.get_event_loop().run_until_complete(sleep_and_return(1))
    Out[4]: 1

• gather async functions to run IO concurrently

    In [5]: async def run():
       ...:     results = await asyncio.gather(
       ...:         sleep_and_return(1),
       ...:         sleep_and_return(1),
       ...:         sleep_and_return(2),
       ...:     )
       ...:     print(results)
       ...:

    In [6]: %timeit -r 1 asyncio.get_event_loop().run_until_complete(run())
    [1, 1, 2]
    [1, 1, 2]
    2 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

gather() is the key to the latency win: the three sleeps add up to 4 s of waiting, but run concurrently they finish in about 2 s, the length of the longest one.
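On Python 3.7 and later, asyncio.run() is the usual entry point for a script and can stand in for the get_event_loop().run_until_complete() pair used in the session above. The sketch below repeats the gather example as a plain script; the durations are shortened to fractions of a second purely to keep it quick.

```python
import asyncio
import time

async def sleep_and_return(sec):
    await asyncio.sleep(sec)
    return sec

async def run():
    # The three sleeps overlap, so the total wait is close to the longest one
    return await asyncio.gather(
        sleep_and_return(0.1),
        sleep_and_return(0.1),
        sleep_and_return(0.2),
    )

start = time.perf_counter()
results = asyncio.run(run())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

gather() returns results in the order the awaitables were passed in, regardless of which one finishes first.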
HOW DOES ASYNCIO WORK?
• nonblocking I/O mode: socket.setblocking(False)
• register the socket with a selector (e.g. EpollSelector) and wait in select() until the I/O is ready
Source code is simplified for explanation purposes.

    class BaseSelectorEventLoop:
        async def sock_recv(self, sock, n):
            """Receive data from the socket."""
            fut = self.create_future()
            fd = sock.fileno()
            # Wrap the callback that performs the actual read
            # (args stands for the arguments passed on to _sock_recv)
            handle = events.Handle(
                self._sock_recv, args, self, None
            )
            # Ask the selector to report when fd becomes readable
            self._selector.register(
                fd, selectors.EVENT_READ, (handle, None)
            )
            return await fut

        def _sock_recv(self, fut, registered_fd, sock, n):
            try:
                data = sock.recv(n)
            except (BlockingIOError, InterruptedError):
                ...

        def run_until_complete(self, future):
            """Run until the Future is done."""
            self.run_forever()

        def run_forever(self):
            """Run until stop() is called."""
            while True:
                self._run_once()
                if self._stopping:
                    break

        def _run_once(self):
            """Run one full iteration of the event loop."""
            # Block until at least one registered file descriptor is ready
            event_list = self._selector.select(None)
            self._process_events(event_list)
            # Run every callback that became ready in this iteration
            ntodo = len(self._ready)
            for i in range(ntodo):
                handle = self._ready.popleft()
                handle._run()
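The two bullets above (non-blocking sockets plus a selector) can be tried without asyncio at all. The following is a minimal sketch built on the standard selectors and socket modules; on_readable and the socketpair setup are illustrative assumptions, not code from the talk or from CPython.

```python
import selectors
import socket

selector = selectors.DefaultSelector()   # picks epoll, kqueue, or select per platform
reader, writer = socket.socketpair()
reader.setblocking(False)

received = []

def on_readable(sock):
    # Called only after the selector reports the socket as ready
    received.append(sock.recv(1024))

# Store the callback as the registration's data, similar to the handle in the slide
selector.register(reader, selectors.EVENT_READ, on_readable)

try:
    reader.recv(1024)                    # nothing has been sent yet
except BlockingIOError:
    print("no data yet; a blocking socket would have stalled here")

writer.send(b"hello")

# One iteration of a tiny event loop: wait for readiness, then run the callbacks
for key, _events in selector.select(timeout=1):
    key.data(key.fileobj)

selector.unregister(reader)
selector.close()
reader.close()
writer.close()
print(received)
```

An event loop repeats that last select-and-dispatch step forever, which is what _run_once() does in the simplified source.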
ASYNCIO ADOPTION IN INSTAGRAM
ASYNCIO ADOPTION IN INSTAGRAM IS JUST LIKE
decorating some trees in a forest
• Instagram started using Django and launched in 2010.
• Large repo and many developers.
ASYNCIO ADOPTION CHALLENGES
• scale: collaboration in a large code repo with a lot of developers
• usability: asyncio utilities and bug fixes
• prioritization: too many blocking calls to migrate
• automation: reduce repeated manual effort
• efficiency: asyncio CPU overhead is very high
BACKEND CLIENT LIBRARIES ASYNCIO SUPPORT
• Thrift
  • fbthrift py3 and py.asyncio namespaces
• HTTP
  • aiohttp replaces requests
• Other backends
  • https://github.com/aio-libs
MAKE ASYNCIO EASIER
• wait_for: run a coroutine to completion from synchronous code

    import asyncio

    def wait_for(coro):
        loop = asyncio.get_event_loop()
        return loop.run_until_complete(coro)

    result = wait_for(async_func())

• async_test: decorator that lets unittest run async test methods

    import unittest

    def async_test(func):
        def inner(*args, **kwargs):
            # Drive the coroutine returned by the async test method
            return wait_for(
                func(*args, **kwargs)
            )
        return inner

    class TestAsyncMethods(unittest.TestCase):
        @async_test
        async def test_async_method(self):
            obj = Cls()
            self.assertTrue(await obj.async_func())
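For readers on Python 3.8 or later, the standard library's unittest.IsolatedAsyncioTestCase covers much of what the async_test decorator does. It postdates this talk, so the sketch below is an alternative rather than the approach described here; fetch_flag is a hypothetical coroutine standing in for obj.async_func().

```python
import asyncio
import unittest

async def fetch_flag():
    # Hypothetical coroutine under test
    await asyncio.sleep(0)
    return True

class TestAsyncMethods(unittest.IsolatedAsyncioTestCase):
    async def test_async_method(self):
        # The test case runs this coroutine on its own event loop
        self.assertTrue(await fetch_flag())

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAsyncMethods)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```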
ASYNC STACK MIGRATION

    def func():
        blocking_thrift_call()

    # after migrating to async

    async def func():
        await async_thrift_call()
IDENTIFY BLOCKING CALLS
Blocking Call Finder
• Figure out the blocking call stacks and prioritize among tons of stacks
• Prioritize stacks by latency and call count
• Implementation:
  • use profile to collect runtime stack traces
  • use pygraphviz to render a graph view

    def f():
        blocking_thrift_call()

    def g():
        h()

    def h():
        blocking_http_call()

    def api():
        f()
        g()

[Call graph: api calls f and g; f calls blocking_thrift_call; g calls h; h calls blocking_http_call. Each node is annotated with a latency and a call count. The annotations on the slide are 20ms/50k calls, 10ms/10k calls, 9ms/9k calls, 9ms/9k calls and 20ms/50k calls; which node each one belongs to is not recoverable from the extracted text.]
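The idea behind the Blocking Call Finder can be sketched with the standard cProfile and pstats modules: profile a request, read the caller/callee pairs out of the collected stats, and emit them as a graph. This is an assumption-laden sketch and not the tool from the talk. The blocking calls are simulated with time.sleep, and the graph is written as Graphviz DOT text rather than rendered with pygraphviz.

```python
import cProfile
import pstats
import time

def blocking_thrift_call():
    time.sleep(0.002)          # hypothetical stand-in for a blocking Thrift request

def blocking_http_call():
    time.sleep(0.001)          # hypothetical stand-in for a blocking HTTP request

def f():
    blocking_thrift_call()

def h():
    blocking_http_call()

def g():
    h()

def api():
    f()
    g()

profiler = cProfile.Profile()
profiler.enable()
api()
profiler.disable()

# pstats keys are (filename, lineno, funcname); the last item of each value maps
# every caller to (call count, primitive calls, own time, cumulative time)
stats = pstats.Stats(profiler).stats
edges = {}
for callee_key, (_, _, _, _, callers) in stats.items():
    for caller_key, (ncalls, _, _, cumtime) in callers.items():
        edges[(caller_key[2], callee_key[2])] = (ncalls, cumtime)

# Emit the call graph as DOT text, labelled with latency and call count
lines = ["digraph blocking_calls {"]
for (caller, callee), (ncalls, cumtime) in sorted(edges.items()):
    label = f"{cumtime * 1000:.1f}ms, {ncalls} calls"
    lines.append(f'  "{caller}" -> "{callee}" [label="{label}"];')
lines.append("}")
dot_source = "\n".join(lines)
print(dot_source)
```

Sorting the edges by cumulative time or call count gives the kind of prioritization the slide describes.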
WHEN THERE ARE TOO MANY DEPENDENCIES IN THE STACK
• Use a sync wrapper
SYNC
func = sync(async_func)
• Provides both async and non-async versions of a given function.
• Supports classmethod, staticmethod, etc.
• Remove the sync wrapper line after all call sites are migrated to async.

    import asyncio
    import functools

    def sync(async_func):
        # Unwrap classmethod/staticmethod objects to reach the underlying function
        is_classmethod = False
        if isinstance(async_func, classmethod):
            async_func = async_func.__func__
            is_classmethod = True
        elif isinstance(async_func, staticmethod):
            async_func = async_func.__func__
        if not asyncio.iscoroutinefunction(async_func):
            # asyncio.coroutine was removed in Python 3.11
            async_func = asyncio.coroutine(async_func)

        @functools.wraps(async_func)
        def _no_profile_sync(*args, **kwargs):
            # Run the coroutine to completion with the wait_for helper
            return wait_for(async_func(*args, **kwargs))

        if is_classmethod:
            return classmethod(_no_profile_sync)
        else:
            return _no_profile_sync

    func = sync(async_func)
NESTED EVENT LOOP
RuntimeError: This event loop is already running
[Call stack: run_until_complete() -> async def f() -> def g() -> def h() -> run_until_complete() -> async def i()]
• Use a new event loop when the current loop is already running.
• Keep a loop pool for reusing event loops.
• Set the current event loop and the running loop when the loop is already running.
• Restore the event loop after run_until_complete finishes.

    import asyncio
    from contextlib import contextmanager

    def wait_for(coro):
        with get_event_loop() as loop:
            return loop.run_until_complete(coro)

    @contextmanager
    def get_event_loop():
        loop = asyncio.get_event_loop()
        if not loop.is_running():
            yield loop
        else:
            # loop_pool: pool of reusable event loops (not shown on the slide)
            new_loop = loop_pool.borrow_loop()
            asyncio.set_event_loop(new_loop)
            # Temporarily clear the running loop so the new loop is allowed to run
            running_loop = asyncio.events._get_running_loop()
            asyncio.events._set_running_loop(None)
            try:
                yield new_loop
            finally:
                # Return the borrowed loop and restore the previous state
                loop_pool.return_loop(new_loop)
                asyncio.set_event_loop(loop)
                asyncio.events._set_running_loop(running_loop)
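The error this slide works around is easy to reproduce with the stock asyncio event loop. In the sketch below, synchronous code called from a coroutine tries to drive another coroutine on the loop that is already running; inner, sync_helper and outer are hypothetical names, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio

async def inner():
    return 42

def sync_helper():
    # Synchronous code that tries to run a coroutine on the already running loop
    loop = asyncio.get_running_loop()
    coro = inner()
    try:
        return loop.run_until_complete(coro)
    finally:
        coro.close()           # avoid a "never awaited" warning for the unused coroutine

async def outer():
    try:
        sync_helper()
    except RuntimeError as exc:
        return str(exc)

message = asyncio.run(outer())
print(message)
```

This is the situation that arises when a sync wrapper sits in the middle of an otherwise async call stack.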
RUNTIME ERROR: EVENT LOOP STOPPED BEFORE FUTURE COMPLETED.

    def test_run_until_complete_loop_orphan_future_close_loop(self):
        class ShowStopper(BaseException):
            pass

        async def foo(delay):
            # The loop argument of asyncio.sleep was removed in Python 3.10
            await asyncio.sleep(delay, loop=self.loop)

        def throw():
            raise ShowStopper

        self.loop._process_events = mock.Mock()
        self.loop.call_soon(throw)
        try:
            self.loop.run_until_complete(foo(0.1))
        except ShowStopper:
            pass

        # This call fails if run_until_complete does not clean up
        # done-callback for the previous future.
        self.loop.run_until_complete(foo(0.2))

Fix in run_until_complete():
https://github.com/python/cpython/pull/1688
GLOBAL VARIABLE ISSUE
• Execution order is not guaranteed. A shared mutable global variable may cause unexpected results.

    var = Container()

    async def f():
        var.val = await read_from_db1()
        await write_to_db1(var)

    async def g():
        var.val = await read_from_db2()
        await write_to_db2(var)

    async def run():
        await asyncio.gather(f(), g())

• Context variables (contextvars) were added in Python 3.7. Each task gets its own value:

    import contextvars
    var = contextvars.ContextVar('var')

    async def f():
        var.set(await read_from_db1())
        await write_to_db1(var.get())

    async def g():
        var.set(await read_from_db2())
        await write_to_db2(var.get())

    async def run():
        await asyncio.gather(f(), g())
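The difference between the two versions above can be observed in a small runnable sketch. The names shared, current, racy and isolated are illustrative assumptions, and the database calls are replaced by asyncio.sleep so the example has no external dependencies.

```python
import asyncio
import contextvars

shared = {}                                        # mutable global, visible to every task
current = contextvars.ContextVar("current", default=None)

async def racy(name, delay):
    shared["val"] = name
    await asyncio.sleep(delay)                     # another task may overwrite shared["val"] here
    return shared["val"]

async def isolated(name, delay):
    current.set(name)                              # set only in this task's copy of the context
    await asyncio.sleep(delay)
    return current.get()

async def main():
    shared_results = await asyncio.gather(racy("a", 0.02), racy("b", 0.01))
    isolated_results = await asyncio.gather(isolated("a", 0.02), isolated("b", 0.01))
    return shared_results, isolated_results

shared_results, isolated_results = asyncio.run(main())
print(shared_results)    # ['b', 'b']
print(isolated_results)  # ['a', 'b']
```

gather() wraps each coroutine in a task, and every task runs in a copy of the current context, which is why the ContextVar values do not leak between the two calls.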
GATHER DESIGN PATTERN
• To achieve the maximum concurrency, gather the independent awaitables:

    async def identity(value):
        return value

    async def run():
        # identity(None) is a placeholder so the result positions stay fixed
        awaitables = [
            f(),
            g() if a is True else identity(None),
            h() if b is True else identity(None),
        ]
        _, var1, var2 = await asyncio.gather(*awaitables)

• Instead of awaiting them one after another:

    async def run():
        await f()
        var1 = None
        if a is True:
            var1 = await g()

        var2 = None
        if b is True:
            var2 = await h()
LINT
Provide guidance for writing better asyncio code
• Rules:
  1. async functions should be named with an async_ prefix
    • e.g. async_func() vs func()
  2. gather awaits that sit in a loop
  3. warn when new blocking calls are added
• Implemented with ast + flake8

    for data in data_list:
        await async_func(data)

    # use gather to run faster
    await asyncio.gather(*[async_func(data) for data in data_list])
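Rule 2 lends itself to a short illustration with the standard ast module. The checker below is a hedged sketch of how such a rule might look, not Instagram's lint rule and not wired into flake8; AwaitInLoopFinder, find_awaits_in_loops and SAMPLE are hypothetical names. It only handles for loops and reports the line of every await found in a loop body.

```python
import ast

class AwaitInLoopFinder(ast.NodeVisitor):
    """Collect line numbers of await expressions inside for-loop bodies."""

    def __init__(self):
        self.hits = []
        self._loop_depth = 0

    def _visit_loop(self, node):
        self.visit(node.iter)              # the iterable is evaluated once, outside the body
        self._loop_depth += 1
        for stmt in node.body:
            self.visit(stmt)
        self._loop_depth -= 1
        for stmt in node.orelse:
            self.visit(stmt)

    visit_For = visit_AsyncFor = _visit_loop

    def _visit_function(self, node):
        # A function defined inside a loop starts with a fresh loop depth
        saved, self._loop_depth = self._loop_depth, 0
        self.generic_visit(node)
        self._loop_depth = saved

    visit_FunctionDef = visit_AsyncFunctionDef = _visit_function

    def visit_Await(self, node):
        if self._loop_depth:
            self.hits.append(node.lineno)
        self.generic_visit(node)

def find_awaits_in_loops(source):
    finder = AwaitInLoopFinder()
    finder.visit(ast.parse(source))
    return finder.hits

SAMPLE = """\
async def slow(data_list):
    for data in data_list:
        await async_func(data)

async def fast(data_list):
    await asyncio.gather(*[async_func(d) for d in data_list])
"""

hits = find_awaits_in_loops(SAMPLE)
print(hits)  # [3]
```

A real rule would also need to recognize loops whose iterations depend on each other, where gathering is not a safe rewrite.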
AUTOMATION
• Many asyncio changes are simple and repetitive
• Smart code modifier for asyncio adoption:
  • collect caller-callee pairs from runtime profiling and offline pyan static analysis
  • modify the source code AST
    • change blocking calls to async calls
    • add await
  • auto-format the code using isort and black
Pipeline: source code -> ast -> code modifier -> change set -> pull request
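The "modify the source code AST" step can be sketched with ast.NodeTransformer. This is a toy illustration under stated assumptions, not the code modifier from the talk: the RENAMES mapping and the AsyncRewriter class are hypothetical, callers of the rewritten function are not updated, and ast.unparse requires Python 3.9 or later.

```python
import ast

# Hypothetical mapping from blocking calls to their async replacements
RENAMES = {"blocking_thrift_call": "async_thrift_call"}

class AsyncRewriter(ast.NodeTransformer):
    def __init__(self):
        self._rewrote_call = False

    def visit_Call(self, node):
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id in RENAMES:
            node.func.id = RENAMES[node.func.id]
            self._rewrote_call = True
            return ast.Await(value=node)           # wrap the renamed call in await
        return node

    def visit_FunctionDef(self, node):
        outer, self._rewrote_call = self._rewrote_call, False
        self.generic_visit(node)
        rewrote, self._rewrote_call = self._rewrote_call, outer
        if not rewrote:
            return node
        # A function that now contains await must itself become async def
        fields = {name: getattr(node, name, None) for name in node._fields}
        return ast.copy_location(ast.AsyncFunctionDef(**fields), node)

SOURCE = "def func():\n    blocking_thrift_call()\n"

tree = AsyncRewriter().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
migrated = ast.unparse(tree)
print(migrated)
```

The printed result is an async def func whose body awaits async_thrift_call(). Because ast.unparse discards comments and original formatting, a formatting pass such as the isort and black step listed above would still be needed.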
CPU OVERHEAD
• Adopting asyncio could cost ~20% more CPU instructions on Instagram servers.
• CPython asyncio was slow because the event loop and helpers were implemented in Python.
• Optimization strategies:
  • simplify the code and remove redundant computation
  • Cython
  • C API
• Available optimizations:
  • uvloop: libuv + Cython binding for the event loop
  • CPython 3.6 implements Future and Task in C
  • CPython 3.7 implements get_event_loop() in C. Future and gather() also become faster.
CUSTOM OPTIMIZATION
• Example: gather() -> ensure_future() -> isfuture/iscoroutine/isawaitable
  • Reorder: check iscoroutine first
  • gather() deduplicates coroutines using a dict. Remove that assumption.
  • Implement all helper functions with the C API
• Optimization result: reduced the overall asyncio CPU overhead by 2x (10%)
CURRENT RESULTS
• API latency became 30% faster on the server side
• Better user engagement
  • more media views
  • more time spent
• Next steps
  • 100% asyncio
  • concurrent request handling
Q&A
jimmylai@instagram.com

