PyCon APAC 2015
Global Interpreter Lock
Episode I - Break the Seal
Tzung-Bi Shih
<penvirus@gmail.com>
Introduction
• Global Interpreter Lock[1]
  • giant lock[2]
• GIL in CPython[5] protects:
  • interpreter state, thread state, ...
  • reference count
  • “a guarantee”: some CPython features and extensions depend on the agreement
• other implementations
  • fine-grained lock[3]
  • lock-free[4]
GIL over Multi-Processor[6]
We want to produce efficient programs.
To achieve higher throughput, we usually divide a program into several independent logical segments and execute them simultaneously on a multi-processor (MP) architecture by using multi-threading.
Unfortunately, only one of the threads executes at a time if they compete for the same GIL.
Some people are working on removing the giant lock, which is a difficult job[7][8][9]. Until that wonderful world arrives, we need to learn how to live with the GIL.
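The effect can be observed with a small timing experiment. The following is a minimal sketch in Python 3 (the slides themselves use Python 2); the names `count_down`, `run_sequential`, `run_threaded` and the workload size `N` are illustrative assumptions, not code from the talk. On a CPython build with the GIL, the threaded run is typically no faster than the sequential one, though exact timings depend on the machine.

```python
import threading
import time

N = 200_000  # illustrative workload size (assumption)

def count_down(n, results, slot):
    """Pure-Python CPU-bound loop; the thread needs the GIL to run it."""
    total = 0
    while n > 0:
        total += n
        n -= 1
    results[slot] = total

def run_sequential():
    """Run the workload twice, one after the other, in the calling thread."""
    results = [None, None]
    begin = time.perf_counter()
    count_down(N, results, 0)
    count_down(N, results, 1)
    return results, time.perf_counter() - begin

def run_threaded():
    """Run the same two workloads in two threads that compete for the GIL."""
    results = [None, None]
    threads = [threading.Thread(target=count_down, args=(N, results, i))
               for i in range(2)]
    begin = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - begin

if __name__ == '__main__':
    print('sequential: %.6f' % run_sequential()[1])
    print('threaded:   %.6f' % run_threaded()[1])
```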
Brainless Solution
multi-process
• Embarrassingly parallel[10]
  • no dependency between those parallel tasks
• IPC[11]-required parallel task
  • share states with other peers
• Examples:
  • multiprocessing[12], pp[13], pyCSP[14]
Example[15]
multiprocessing: process pool

import os
from multiprocessing import Pool

def worker(i):
    # executed in a pool worker process
    print 'pid=%d ppid=%d i=%d' % (os.getpid(), os.getppid(), i)

print 'pid=%d' % os.getpid()
pool = Pool(processes=4)
pool.map(worker, xrange(10))
pool.terminate()

Round 1:
pid=11326
pid=11327 ppid=11326 i=0
pid=11328 ppid=11326 i=1
pid=11328 ppid=11326 i=3
pid=11329 ppid=11326 i=2
pid=11329 ppid=11326 i=5
pid=11329 ppid=11326 i=6
pid=11329 ppid=11326 i=7
pid=11329 ppid=11326 i=8
pid=11327 ppid=11326 i=4
pid=11328 ppid=11326 i=9
nondeterministic[16]: the same input, but different output from run to run
Round 2:
pid=11372
pid=11373 ppid=11372 i=0
pid=11373 ppid=11372 i=2
pid=11374 ppid=11372 i=1
pid=11376 ppid=11372 i=3
pid=11374 ppid=11372 i=4
pid=11374 ppid=11372 i=7
pid=11373 ppid=11372 i=6
pid=11376 ppid=11372 i=8
pid=11375 ppid=11372 i=5
pid=11375 ppid=11372 i=9
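Although the order in which workers pick up tasks varies between runs, `Pool.map` returns its results in input order. The following is a minimal sketch in Python 3 under the assumption of a POSIX system where the "fork" start method is available; `run_pool` and the squaring workload are illustrative names and not part of the talk.

```python
import multiprocessing
import os

def worker(i):
    # Return the data instead of printing it, so the parent can collect it.
    return (os.getpid(), i * i)

def run_pool(n_tasks=10, n_procs=4):
    # Assumption: "fork" is available (POSIX), matching the forking
    # behaviour the slides describe.
    ctx = multiprocessing.get_context('fork')
    with ctx.Pool(processes=n_procs) as pool:
        # map() blocks until every task is done and keeps input order.
        return pool.map(worker, range(n_tasks))

if __name__ == '__main__':
    for pid, square in run_pool():
        print('pid=%d square=%d' % (pid, square))
```

Which pid handles which task may still differ from run to run; only the order of the returned list is fixed.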
Example
multiprocessing: further observations (1/2)
=> What if I create the target function after the pool is initialized?

import os
from multiprocessing import Pool

print 'pid=%d' % os.getpid()
pool = Pool(processes=4)

# the target function is defined only after the workers were forked
def worker(i):
    print 'pid=%d ppid=%d i=%d' % (os.getpid(), os.getppid(), i)

pool.map(worker, xrange(10))
pool.terminate()

• Uses an unnamed pipe to handle IPC
• Workers are forked when the pool is initialized
  • so workers can only “see” a target function that already exists at that moment (each worker gets a copy of the parent's memory at fork time)
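The two points above (an unnamed pipe for IPC, and a child that only sees what existed at fork time) can be sketched without multiprocessing at all. This is a minimal illustration in Python 3 for POSIX systems; `fork_snapshot_demo` and the list `names` are hypothetical names chosen for the example, and the code is not how `multiprocessing` is implemented internally.

```python
import os

def fork_snapshot_demo():
    """Fork a child and compare what child and parent see afterwards."""
    names = ['defined_before_fork']
    read_fd, write_fd = os.pipe()          # unnamed pipe: child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of `names`, taken at fork time.
        os.close(read_fd)
        os.write(write_fd, ','.join(names).encode())
        os.close(write_fd)
        os._exit(0)                        # leave without running parent code

    # Parent: this change happens after the fork, so the child never sees it.
    os.close(write_fd)
    names.append('defined_after_fork')
    with os.fdopen(read_fd, 'rb') as reader:
        child_view = reader.read().decode().split(',')
    os.waitpid(pid, 0)
    return child_view, names

if __name__ == '__main__':
    child_view, parent_view = fork_snapshot_demo()
    print('child sees: ', child_view)
    print('parent sees:', parent_view)
```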
Example
multiprocessing: further observations (2/2)
Output:
pid=12093
Process PoolWorker-1:
Process PoolWorker-2:
Traceback (most recent call last):
Process PoolWorker-3:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
...ignored...
AttributeError: 'module' object has no attribute 'worker'
...ignored...
pid=12101 ppid=12093 i=4
pid=12101 ppid=12093 i=5
pid=12101 ppid=12093 i=6
pid=12101 ppid=12093 i=7
pid=12101 ppid=12093 i=8
pid=12101 ppid=12093 i=9
^CProcess PoolWorker-6:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 374, in get
racquire()
KeyboardInterrupt
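Notes:
• i=0~3 are lost: workers #1~4 were terminated due to the exception
• the following workers (for example worker #6) are forked afterwards, so they can see the target function
• the process hangs until ctrl+c is pressed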
Example
overhead of IPC and GIL battle[17]: comparison

import time
from multiprocessing import Process
from threading import Thread
from multiprocessing import Queue as MPQ
from Queue import Queue

MAX = 1000000

def test_(w_class, q_class):
    # producer: runs in a separate process or thread
    def worker(queue):
        for i in xrange(MAX):
            queue.put(i)

    q = q_class()
    w = w_class(target=worker, args=(q,))

    begin = time.time()
    w.start()
    # consumer: runs in the main thread
    for i in xrange(MAX):
        q.get()
    w.join()
    end = time.time()

    return end - begin

def test_sthread():
    # baseline: producer and consumer in one thread
    q = Queue()

    begin = time.time()
    for i in xrange(MAX):
        q.put(i)
        q.get()
    end = time.time()

    return end - begin

print 'mprocess: %.6f' % test_(Process, MPQ)
print 'mthread: %.6f' % test_(Thread, Queue)
print 'sthread: %.6f' % test_sthread()

Output:
mprocess: 14.225408
mthread: 7.759567
sthread: 2.743325
Notes:
• API of multiprocessing is similar to threading[18]
• mprocess: IPC is the most costly
• mthread vs. sthread: overhead of the GIL battle
Example
pp remote node
Server:
$ ppserver.py -w 1 -p 10000 &
[1] 16512
$ ppserver.py -w 1 -p 10001 &
[2] 16514
$ ppserver.py -w 1 -p 10002 &
[3] 16516
$ netstat -nlp
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 16512/python
tcp 0 0 0.0.0.0:10001 0.0.0.0:* LISTEN 16514/python
tcp 0 0 0.0.0.0:10002 0.0.0.0:* LISTEN 16516/python
$ pstree -p $$
bash(11971)-+-ppserver.py(16512)---python(16513)
            |-ppserver.py(16514)---python(16515)
            |-ppserver.py(16516)---python(16517)
            `-pstree(16547)
Notes:
• -w 1: # of workers
• each ppserver.py listens on its port to wait for remote jobs
• the python child processes are the workers
Example
pp local node
Output:
pid=16633
pid=16634 ppid=16633 i=0
pid=16513 ppid=16512 i=1
pid=16517 ppid=16516 i=2
pid=16515 ppid=16514 i=3
pid=16513 ppid=16512 i=4
pid=16517 ppid=16516 i=5
pid=16515 ppid=16514 i=6
pid=16634 ppid=16633 i=7
pid=16517 ppid=16516 i=8
pid=16513 ppid=16512 i=9
import os
import pp
import time
import random

print 'pid=%d' % os.getpid()

def worker(i):
    print 'pid=%d ppid=%d i=%d' % (os.getpid(), os.getppid(), i)
    time.sleep(random.randint(1, 3))

servers = ('127.0.0.1:10000', '127.0.0.1:10001', '127.0.0.1:10002')
# first argument: # of local workers
job_server = pp.Server(1, ppservers=servers)

jobs = list()
for i in xrange(10):
    job = job_server.submit(worker, args=(i,), modules=('time', 'random'))
    jobs.append(job)

# calling the jobs in submission order determines the result order (deterministic)
for job in jobs:
    job()

Notes:
• a pp worker collects stdout; it is accumulative, so beware of the RSIZE of the remote node
• A pp local node is an execution node too. It dispatches jobs to itself first.
• in the output, the lines with ppid=16633 were computed by the local node
Example
ppserver.py gives some exceptions
Exception:
Exception in thread client_socket:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/bin/ppserver.py", line 176, in crun
ctype = mysocket.receive()
File "/usr/local/lib/python2.7/dist-packages/pptransport.py", line 196, in receive
raise RuntimeError("Socket connection is broken")
RuntimeError: Socket connection is broken
Don’t worry. This is expected.
Release the GIL
• Especially suitable for processor-bound tasks
• Examples:
  • ctypes[19]
  • Python/C extension[20][21]
  • Cython[22]
  • Pyrex[23]
Example
ctypes (1/2)

Python side:

import threading

duration = 10

def internal_busy():
    import time

    count = 0
    begin = time.time()
    while True:
        if time.time() - begin > duration:
            break
        count += 1

    print 'internal_busy(): count = %u' % count

def external_busy():
    from ctypes import CDLL
    from ctypes import c_uint

    libbusy = CDLL('./busy.so')
    busy_wait = libbusy.busy_wait
    # specify input/output types (strongly recommended)
    busy_wait.argtypes = [c_uint]
    busy_wait.restype = None  # busy_wait() returns void

    busy_wait(duration)

print 'two internal busy threads, CPU utilization cannot over 100%'
t1 = threading.Thread(target=internal_busy)
t1.start()
t2 = threading.Thread(target=internal_busy)
t2.start()
t1.join()
t2.join()

print 'with one external busy thread, CPU utilization gains to 200%'
t1 = threading.Thread(target=internal_busy)
t1.start()
t2 = threading.Thread(target=external_busy)
t2.start()
t1.join()
t2.join()

C side (built into busy.so):

/* headers needed by this function */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* consume CPU resource for about `duration` seconds */
void busy_wait(unsigned int duration)
{
    uint64_t count = 0;
    time_t begin = time(NULL);

    while(1) {
        if(time(NULL) - begin > duration)
            break;
        count++;
    }

    printf("busy_wait(): count = %" PRIu64 "\n", count);
}

Example
ctypes (2/2)
Output (two internal busy threads):
two internal busy threads, CPU utilization cannot over 100%
internal_busy(): count = 12911610
internal_busy(): count = 16578663
Atop Display:
CPU | sys 46% | user 72% | irq 0% | idle 82% | wait 0% |
cpu | sys 26% | user 39% | irq 1% | idle 35% | cpu001 w 0% |
cpu | sys 20% | user 33% | irq 0% | idle 46% | cpu000 w 1% |
Output (one internal and one external busy thread):
with one external busy thread, CPU utilization gains to 200%
internal_busy(): count = 45320393
busy_wait(): count = 3075909775
Atop Display:
CPU | sys 1% | user 199% | irq 0% | idle 0% | wait 0% |
cpu | sys 1% | user 99% | irq 0% | idle 0% | cpu000 w 0% |
cpu | sys 0% | user 100% | irq 0% | idle 0% | cpu001 w 0% |
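The same behaviour can be checked without building `busy.so`. According to the ctypes documentation, functions called through `CDLL` run with the GIL released. The sketch below (Python 3, POSIX only) assumes that `ctypes.CDLL(None)` exposes the C library's `usleep`; the function name `count_while_foreign_call_runs` is made up for this example. While one thread is inside the foreign call, the main thread keeps executing Python code, so the counter grows to a large value.

```python
import ctypes
import threading

def count_while_foreign_call_runs(microseconds=200000):
    """Count main-thread loop iterations while another thread sits in C code."""
    # Assumption: POSIX; CDLL(None) gives access to the C library symbols.
    libc = ctypes.CDLL(None)
    libc.usleep.argtypes = [ctypes.c_uint]
    libc.usleep.restype = ctypes.c_int

    # The thread spends its time inside usleep(), a foreign function.
    t = threading.Thread(target=libc.usleep, args=(microseconds,))
    t.start()

    count = 0
    while t.is_alive():
        # This loop only makes progress if the other thread does not hold the GIL.
        count += 1
    t.join()
    return count

if __name__ == '__main__':
    print('iterations during the foreign call: %d' % count_while_foreign_call_runs())
```

`usleep` sleeps rather than burning CPU, so this shows only that the GIL is released during the call, not the 200% utilization from the slides.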
Example
Python/C extension (1/3)

/* excerpt: requires <Python.h> and the busy_wait() function from the ctypes example */

static PyObject *with_lock(PyObject *self, PyObject *args)
{
    unsigned int duration;

    /* "I": require an unsigned integer argument (busy duration) */
    if(!PyArg_ParseTuple(args, "I", &duration))
        return NULL;

    busy_wait(duration);

    /* return None */
    Py_INCREF(Py_None);
    return Py_None;
}

static PyObject *without_lock(PyObject *self, PyObject *args)
{
    unsigned int duration;

    if(!PyArg_ParseTuple(args, "I", &duration))
        return NULL;

    /* release the GIL before being busy, re-acquire it afterwards */
    PyThreadState *_save;
    _save = PyEval_SaveThread();
    busy_wait(duration);
    PyEval_RestoreThread(_save);

    Py_INCREF(Py_None);
    return Py_None;
}

/* METH_VARARGS: accept positional args. */
static PyMethodDef busy_methods[] = {
    {"with_lock", with_lock, METH_VARARGS, "Busy wait for a given duration with GIL"},
    {"without_lock", without_lock, METH_VARARGS, "Busy wait for a given duration without GIL"},
    {NULL, NULL, 0, NULL}
};

/* initbusy: exported symbol name; "busy": module name */
PyMODINIT_FUNC initbusy(void)
{
    if(Py_InitModule("busy", busy_methods) == NULL)
        PyErr_SetString(PyExc_RuntimeError, "failed to Py_InitModule");
}

Compilation:
$ cat Makefile
busy.so: busy.c
	$(CC) -o $@ -fPIC -shared -I/usr/include/python2.7 busy.c
$ make
Example
Python/C extension (2/3)

import threading

duration = 10

def internal_busy():
    import time

    count = 0
    begin = time.time()
    while True:
        if time.time() - begin > duration:
            break
        count += 1

    print 'internal_busy(): count = %u' % count

def external_busy_with_lock():
    # linking to the busy.so extension
    from busy import with_lock

    with_lock(duration)

def external_busy_without_lock():
    from busy import without_lock

    without_lock(duration)

print 'two busy threads compete for GIL, CPU utilization cannot over 100%'
t1 = threading.Thread(target=internal_busy)
t1.start()
t2 = threading.Thread(target=external_busy_with_lock)
t2.start()
t1.join()
t2.join()

print 'with one busy thread released GIL, CPU utilization gains to 200%'
t1 = threading.Thread(target=internal_busy)
t1.start()
t2 = threading.Thread(target=external_busy_without_lock)
t2.start()
t1.join()
t2.join()

Example
Python/C extension (3/3)
Output (both threads compete for the GIL):
two busy threads compete for GIL, CPU utilization cannot over 100%
busy_wait(): count = 3257960533
internal_busy(): count = 45524
Atop Display:
CPU | sys 2% | user 100% | irq 0% | idle 99% | wait 0% |
cpu | sys 0% | user 100% | irq 0% | idle 0% | cpu001 w 0% |
cpu | sys 1% | user 0% | irq 0% | idle 99% | cpu000 w 0% |
Output (one thread releases the GIL):
with one busy thread released GIL, CPU utilization gains to 200%
internal_busy(): count = 48049276
busy_wait(): count = 3271300229
Atop Display:
CPU | sys 2% | user 198% | irq 0% | idle 0% | wait 0% |
cpu | sys 0% | user 100% | irq 0% | idle 0% | cpu000 w 0% |
cpu | sys 1% | user 98% | irq 0% | idle 0% | cpu001 w 0% |
Cooperative Multitasking
• Only applicable to IO-bound tasks
• Single process, single thread
  • no other thread, no GIL battle
• Executes code exactly when it is needed
• Examples:
  • generator[24]
  • pyev[25]
  • gevent[26]
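As an illustration of the generator approach, here is a minimal round-robin scheduler sketch in Python 3. The names `task` and `run_round_robin` are hypothetical and chosen for this example. Each task runs until its next `yield` and then voluntarily hands control back, all within a single thread.

```python
from collections import deque

def task(name, steps, log):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append('%s:%d' % (name, i))
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    """Resume each task in turn until all of them are finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # task finished; do not reschedule it
        queue.append(current)      # task still has work; put it at the back

if __name__ == '__main__':
    log = []
    run_round_robin([task('a', 2, log), task('b', 3, log)])
    print(log)
```

Because a task only gives up control at `yield`, one task that blocks or computes for a long time stalls all the others, which is why the approach suits IO-bound work.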
Example
pyev

import pyev
import signal
import sys

def alarm_handler(watcher, revents):
    sys.stdout.write('.')
    sys.stdout.flush()

def timeout_handler(watcher, revents):
    loop = watcher.loop
    loop.stop()

def int_handler(watcher, revents):
    loop = watcher.loop
    loop.stop()

if __name__ == '__main__':
    loop = pyev.Loop()

    # timer(after, repeat, callback): print a dot every second
    alarm = loop.timer(0.0, 1.0, alarm_handler)
    alarm.start()

    # stop the loop after 10 seconds
    timeout = loop.timer(10.0, 0.0, timeout_handler)
    timeout.start()

    # stop the loop on ctrl+c
    sigint = loop.signal(signal.SIGINT, int_handler)
    sigint.start()

    loop.start()

Case 1 Output (11 dots):
...........
Case 2 Output:
..^C
libev Timer:
(after)|(repeat)|(repeat)|(repeat)|...
an interval event is raised at each "|"
In the example:
• after 0.0 second, raise
• every 1.0 second, raise
• raises 11 times in total
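The count of 11 follows from simple arithmetic on the `after` and `repeat` values. The helper below is a hypothetical sketch in Python 3, not part of pyev or libev. It assumes that a timer due exactly at the stop time still fires, which is consistent with the 11 dots in Case 1.

```python
def timer_fire_times(after, repeat, stop_at):
    """Return the times at which a timer fires, up to and including stop_at.

    after:   delay before the first event
    repeat:  interval between later events (0 means a one-shot timer)
    """
    if after > stop_at:
        return []
    if repeat <= 0:
        return [after]             # one-shot timer fires exactly once
    times = []
    n = 0
    while after + n * repeat <= stop_at:
        times.append(after + n * repeat)
        n += 1
    return times

if __name__ == '__main__':
    alarm = timer_fire_times(0.0, 1.0, 10.0)
    print('%d events at %s' % (len(alarm), alarm))
```

With `after=0.0`, `repeat=1.0` and a stop at 10.0 seconds, the events fall on t = 0, 1, ..., 10.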
Example
pyev: further observations

# the watcher returned by loop.timer() is not bound to any name
loop.timer(0.0, 1.0, alarm_handler).start()

loop.start()
Output:
Exception SystemError: 'null argument to internal routine' in Segmentation fault (core dumped)

# `timeout` is rebound, so the first watcher is no longer referenced
timeout = loop.timer(0.0, 1.0, alarm_handler)
timeout.start()

timeout = loop.timer(10.0, 0.0, timeout_handler)
timeout.start()

loop.start()

# `sigint` is rebound, so the 10-second timer watcher is no longer referenced
alarm = loop.timer(0.0, 1.0, alarm_handler)
alarm.start()
sigint = loop.timer(10.0, 0.0, timeout_handler)
sigint.start()
sigint = loop.signal(signal.SIGINT, int_handler)
sigint.start()
loop.start()
Output:
...........Exception SystemError: 'null argument to internal routine' in Segmentation fault (core dumped)

manual of ev[27]:
you are responsible for allocating the memory for your watcher structures
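The lifetime issue can be pictured with a pure-Python analogy. The `Loop` and `Watcher` classes below are toy stand-ins written for this sketch (Python 3); they are not pyev's classes, and the assumption that the loop does not own its watchers is made only to mirror the quoted manual text. The sketch relies on CPython's reference counting, which frees an object as soon as its last reference disappears.

```python
import weakref

class Watcher:
    """Stand-in for an event watcher (not pyev's real class)."""
    def __init__(self, name):
        self.name = name

class Loop:
    """Toy loop that, by assumption, does NOT keep its watchers alive."""
    def __init__(self):
        self._watchers = []

    def timer(self, name):
        watcher = Watcher(name)
        # Only a weak reference is stored; the caller owns the watcher.
        self._watchers.append(weakref.ref(watcher))
        return watcher

    def alive(self):
        """Names of the watchers that still exist."""
        names = []
        for ref in self._watchers:
            watcher = ref()
            if watcher is not None:
                names.append(watcher.name)
        return names

if __name__ == '__main__':
    loop = Loop()
    loop.timer('lost')             # result discarded: the watcher is freed
    kept = loop.timer('kept')      # bound to a name: the watcher survives
    print(loop.alive())
```

In this toy version a lost watcher is simply skipped; the slides show that with the real library the consequence is a crash, so every watcher should stay bound to a name for as long as the loop runs.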
Example
gevent

import gevent
from gevent import signal
import signal as o_signal
import sys

if __name__ == '__main__':
    ctx = dict(stop_flag=False)

    # set the stop flag on ctrl+c
    def int_handler():
        ctx['stop_flag'] = True
    gevent.signal(o_signal.SIGINT, int_handler)

    count = 0
    while not ctx['stop_flag']:
        sys.stdout.write('.')
        sys.stdout.flush()

        # yield to other greenlets for one second
        gevent.sleep(1)

        count += 1
        if count > 10:
            break

Case 1 Output:
...........
Case 2 Output:
..^C
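For readers without gevent installed, the same loop can be approximated with the standard library's asyncio in Python 3. This is a swapped-in technique and not the approach shown in the talk. In this sketch `ticker` and `run` are hypothetical names, an `asyncio.Event` stands in for the SIGINT handler's stop flag, and the interval is shortened so the example finishes quickly.

```python
import asyncio

async def ticker(interval, max_ticks, stop_event, out):
    """Emit a dot, then sleep cooperatively, until stopped or done."""
    ticks = 0
    while not stop_event.is_set() and ticks < max_ticks:
        out.append('.')
        await asyncio.sleep(interval)   # yields control to the event loop
        ticks += 1

async def run(interval=0.01, max_ticks=11, stop_after=None):
    """Run the ticker; optionally raise the stop flag after `stop_after` seconds."""
    stop_event = asyncio.Event()        # stand-in for the SIGINT stop flag
    out = []
    task = asyncio.create_task(ticker(interval, max_ticks, stop_event, out))
    if stop_after is not None:
        await asyncio.sleep(stop_after)
        stop_event.set()
    await task
    return ''.join(out)

if __name__ == '__main__':
    print(asyncio.run(run()))                                # runs to completion
    print(asyncio.run(run(interval=0.05, stop_after=0.12)))  # stopped early
```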
Interpreter as an Instance
• Rough idea, not a concrete solution yet
• C program, single process, multi-thread
  • can still share states with relatively low penalty
• Allocate memory space for the interpreter context
  • that is, Py_Initialize() would accept an address at which to put the instance context
Conclusion
• How to live with the GIL?
  • Multi-process
  • Release the GIL
  • Cooperative Multitasking
  • Perhaps, Interpreter as an Instance
References
[1]: http://en.wikipedia.org/wiki/Global_Interpreter_Lock
[2]: http://en.wikipedia.org/wiki/Giant_lock
[3]: http://en.wikipedia.org/wiki/Fine-grained_locking
[4]: http://en.wikipedia.org/wiki/Non-blocking_algorithm
[5]: https://wiki.python.org/moin/GlobalInterpreterLock
[6]: http://en.wikipedia.org/wiki/Multiprocessing
[7]: https://docs.python.org/2/faq/library.html#can-t-we-get-rid-of-the-global-interpreter-lock
[8]: http://www.artima.com/weblogs/viewpost.jsp?thread=214235
[9]: http://dabeaz.blogspot.tw/2011/08/inside-look-at-gil-removal-patch-of.html
[10]: http://en.wikipedia.org/wiki/Embarrassingly_parallel
[11]: http://en.wikipedia.org/wiki/Inter-process_communication
[12]: https://docs.python.org/2/library/multiprocessing.html
[13]: http://www.parallelpython.com/
[14]: https://code.google.com/p/pycsp/
[15]: https://github.com/penvirus/gil1
[16]: http://en.wikipedia.org/wiki/Nondeterministic_algorithm
[17]: http://www.dabeaz.com/python/GIL.pdf
[18]: https://docs.python.org/2/library/threading.html
[19]: https://docs.python.org/2/library/ctypes.html
[20]: https://docs.python.org/2/c-api/
[21]: https://docs.python.org/2/c-api/init.html#releasing-the-gil-from-extension-code
[22]: http://cython.org/
[23]: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/
[24]: http://www.dabeaz.com/coroutines/Coroutines.pdf
[25]: http://pythonhosted.org/pyev/
[26]: http://www.gevent.org/
[27]: http://linux.die.net/man/3/ev

More Related Content

PDF
Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;
PDF
Zone IDA Proc
PDF
The origin: Init (compact version)
PDF
Feldo: Function Event Listing and Dynamic Observing for Detecting and Prevent...
PDF
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
PDF
Functional Reactive Programming on Android
PDF
TensorFlow XLA RPC
PDF
Bridge TensorFlow to run on Intel nGraph backends (v0.4)
Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;
Zone IDA Proc
The origin: Init (compact version)
Feldo: Function Event Listing and Dynamic Observing for Detecting and Prevent...
PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime...
Functional Reactive Programming on Android
TensorFlow XLA RPC
Bridge TensorFlow to run on Intel nGraph backends (v0.4)

What's hot (20)

PDF
Job Queue in Golang
PDF
用 Go 語言打造多台機器 Scale 架構
PDF
Bridge TensorFlow to run on Intel nGraph backends (v0.5)
PDF
How to make a large C++-code base manageable
PDF
Profiling and optimizing go programs
PDF
TVM VTA (TSIM)
PPTX
C++17 now
PDF
Facebook Glow Compiler のソースコードをグダグダ語る会
PDF
Multithreading done right
PDF
Hiveminder - Everything but the Secret Sauce
PDF
Joel Falcou, Boost.SIMD
PDF
An Embedded Error Recovery and Debugging Mechanism for Scripting Language Ext...
PDF
Csw2016 gong pwn_a_nexus_device_with_a_single_vulnerability
PDF
Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2 「エッジAIモダン計測制御の世界」オ...
ODP
Linux kernel tracing superpowers in the cloud
PPT
Virtual platform
PDF
GPU Programming on CPU - Using C++AMP
PDF
Работа с реляционными базами данных в C++
PDF
History & Practices for UniRx(EN)
PDF
Антон Наумович, Система автоматической крэш-аналитики своими средствами
Job Queue in Golang
用 Go 語言打造多台機器 Scale 架構
Bridge TensorFlow to run on Intel nGraph backends (v0.5)
How to make a large C++-code base manageable
Profiling and optimizing go programs
TVM VTA (TSIM)
C++17 now
Facebook Glow Compiler のソースコードをグダグダ語る会
Multithreading done right
Hiveminder - Everything but the Secret Sauce
Joel Falcou, Boost.SIMD
An Embedded Error Recovery and Debugging Mechanism for Scripting Language Ext...
Csw2016 gong pwn_a_nexus_device_with_a_single_vulnerability
Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2 「エッジAIモダン計測制御の世界」オ...
Linux kernel tracing superpowers in the cloud
Virtual platform
GPU Programming on CPU - Using C++AMP
Работа с реляционными базами данных в C++
History & Practices for UniRx(EN)
Антон Наумович, Система автоматической крэш-аналитики своими средствами
Ad

Viewers also liked (20)

PPTX
Nuotolinis mokymas(-si) organizacijoje
PPTX
Windows of Change: How Connected Educators Are Driving Real Reform
PPT
Pat Kane: Advocating play - a strategy for playworkers
DOCX
David Koehler (3)
PDF
The Insiders Guide to Passive Candidates
PPT
Using Social Media for ADC Collaboration and Recruitment
PPT
Informe junín
PDF
Building Data Moments in the Midst of Your Student Affairs Work
PDF
Seven secrets every developer should know before getting into manager or lead...
PDF
CBR 2016 issue 4
PPTX
Improvement and change ppt
PDF
Deep Fried Convnets
PDF
Digital Communications: The Basics
PDF
5 major opportunities awaiting manufacturers and their CFOs
PPTX
WD 2015_Spin off & the city_Michael De Blauwe (KULeuven Research & Development)
PPTX
Presentación Universidad MercadoLibre - Bogotá 2013
PDF
Analytics for Social Media
PPS
REDON, Odilon,Featured Paintings in Detail (1)
PDF
Mercadolibre reloaded - Karen Bruck
PDF
기획자
Nuotolinis mokymas(-si) organizacijoje
Windows of Change: How Connected Educators Are Driving Real Reform
Pat Kane: Advocating play - a strategy for playworkers
David Koehler (3)
The Insiders Guide to Passive Candidates
Using Social Media for ADC Collaboration and Recruitment
Informe junín
Building Data Moments in the Midst of Your Student Affairs Work
Seven secrets every developer should know before getting into manager or lead...
CBR 2016 issue 4
Improvement and change ppt
Deep Fried Convnets
Digital Communications: The Basics
5 major opportunities awaiting manufacturers and their CFOs
WD 2015_Spin off & the city_Michael De Blauwe (KULeuven Research & Development)
Presentación Universidad MercadoLibre - Bogotá 2013
Analytics for Social Media
REDON, Odilon,Featured Paintings in Detail (1)
Mercadolibre reloaded - Karen Bruck
기획자
Ad

Similar to Global Interpreter Lock: Episode I - Break the Seal (20)

PDF
JVM Mechanics: When Does the JVM JIT & Deoptimize?
PPTX
Jdk 7 4-forkjoin
PDF
LSFMM 2019 BPF Observability
PDF
Silicon Valley JUG: JVM Mechanics
PPTX
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
PDF
May2010 hex-core-opt
PDF
Debugging Hung Python Processes With GDB
PPT
Euro python2011 High Performance Python
PDF
Os lab final
PDF
The journey of asyncio adoption in instagram
PDF
Доклад Антона Поварова "Go in Badoo" с Golang Meetup
PDF
Lab report 201001067_201001104
PDF
Lab report 201001067_201001104
PDF
Lab report 201001067_201001104
PDF
PyHEP 2018: Tools to bind to Python
PDF
1032 cs208 g operation system ip camera case share.v0.2
PPTX
Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015
PDF
Oleksandr Smoktal "Parallel Seismic Data Processing Using OpenMP"
PPTX
Improving go-git performance
PDF
Runtime Code Generation and Data Management for Heterogeneous Computing in Java
JVM Mechanics: When Does the JVM JIT & Deoptimize?
Jdk 7 4-forkjoin
LSFMM 2019 BPF Observability
Silicon Valley JUG: JVM Mechanics
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
May2010 hex-core-opt
Debugging Hung Python Processes With GDB
Euro python2011 High Performance Python
Os lab final
The journey of asyncio adoption in instagram
Доклад Антона Поварова "Go in Badoo" с Golang Meetup
Lab report 201001067_201001104
Lab report 201001067_201001104
Lab report 201001067_201001104
PyHEP 2018: Tools to bind to Python
1032 cs208 g operation system ip camera case share.v0.2
Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015
Oleksandr Smoktal "Parallel Seismic Data Processing Using OpenMP"
Improving go-git performance
Runtime Code Generation and Data Management for Heterogeneous Computing in Java

Recently uploaded (20)

PDF
Digital Strategies for Manufacturing Companies
PDF
Why TechBuilder is the Future of Pickup and Delivery App Development (1).pdf
PDF
Upgrade and Innovation Strategies for SAP ERP Customers
PPTX
Online Work Permit System for Fast Permit Processing
PDF
Design an Analysis of Algorithms I-SECS-1021-03
PDF
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
PDF
medical staffing services at VALiNTRY
PPTX
Odoo POS Development Services by CandidRoot Solutions
PDF
System and Network Administraation Chapter 3
PDF
Internet Downloader Manager (IDM) Crack 6.42 Build 41
PDF
top salesforce developer skills in 2025.pdf
PDF
Design an Analysis of Algorithms II-SECS-1021-03
PDF
Softaken Excel to vCard Converter Software.pdf
PDF
Flood Susceptibility Mapping Using Image-Based 2D-CNN Deep Learnin. Overview ...
PDF
Raksha Bandhan Grocery Pricing Trends in India 2025.pdf
PPT
Introduction Database Management System for Course Database
PDF
SAP S4 Hana Brochure 3 (PTS SYSTEMS AND SOLUTIONS)
PPTX
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
PPTX
Operating system designcfffgfgggggggvggggggggg
PDF
Addressing The Cult of Project Management Tools-Why Disconnected Work is Hold...
Digital Strategies for Manufacturing Companies
Why TechBuilder is the Future of Pickup and Delivery App Development (1).pdf
Upgrade and Innovation Strategies for SAP ERP Customers
Online Work Permit System for Fast Permit Processing
Design an Analysis of Algorithms I-SECS-1021-03
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
medical staffing services at VALiNTRY
Odoo POS Development Services by CandidRoot Solutions
System and Network Administraation Chapter 3
Internet Downloader Manager (IDM) Crack 6.42 Build 41
top salesforce developer skills in 2025.pdf
Design an Analysis of Algorithms II-SECS-1021-03
Softaken Excel to vCard Converter Software.pdf
Flood Susceptibility Mapping Using Image-Based 2D-CNN Deep Learnin. Overview ...
Raksha Bandhan Grocery Pricing Trends in India 2025.pdf
Introduction Database Management System for Course Database
SAP S4 Hana Brochure 3 (PTS SYSTEMS AND SOLUTIONS)
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
Operating system designcfffgfgggggggvggggggggg
Addressing The Cult of Project Management Tools-Why Disconnected Work is Hold...

Global Interpreter Lock: Episode I - Break the Seal

  • 1. PyCon APAC 2015 Global Interpreter Lock Episode I - Break the Seal Tzung-Bi Shih <penvirus@gmail.com>
  • 2. PyCon APAC 2015 Introduction • Global Interpreter Lock[1] • giant lock[2] • GIL in CPython[5] protects: • interpreter state, thread state, ... • reference count • “a guarantee” 2 • other implementations • fine-grained lock[3] • lock-free[4] some CPython features and extensions depend on the agreement
  • 3. PyCon APAC 2015 GIL over Multi-Processor[6] We want to produce efficient program. To achieve higher throughputs, we usually divide a program into several independent logic segments and execute them simultaneously over MP architecture by leveraging multi- threading technology. Unfortunately, only one of the threads gets executed at a time if they compete for a same GIL. Some people are working on how to remove the giant lock which shall be a difficult job[7][8][9]. Before the wonderful world comes, we will need to learn how to live along with GIL well. 3
  • 4. PyCon APAC 2015 Brainless Solution multi-process • Embarrassingly parallel[10] • no dependency between those parallel tasks • IPC[11]-required parallel task • share states with other peers • Examples: • multiprocessing[12], pp[13], pyCSP[14] 4
  • 5. PyCon APAC 2015 Example[15] multiprocessing: process pool 5 1 import os 2 from multiprocessing import Pool 3 4 def worker(i): 5 print 'pid=%d ppid=%d i=%d' % (os.getpid(), os.getppid(), i) 6 7 print 'pid=%d' % os.getpid() 8 pool = Pool(processes=4) 9 pool.map(worker, xrange(10)) 10 pool.terminate() Round 1: pid=11326 pid=11327 ppid=11326 i=0 pid=11328 ppid=11326 i=1 pid=11328 ppid=11326 i=3 pid=11329 ppid=11326 i=2 pid=11329 ppid=11326 i=5 pid=11329 ppid=11326 i=6 pid=11329 ppid=11326 i=7 pid=11329 ppid=11326 i=8 pid=11327 ppid=11326 i=4 pid=11328 ppid=11326 i=9 nondeterministic[16]: the same input, different output Round 2: pid=11372 pid=11373 ppid=11372 i=0 pid=11373 ppid=11372 i=2 pid=11374 ppid=11372 i=1 pid=11376 ppid=11372 i=3 pid=11374 ppid=11372 i=4 pid=11374 ppid=11372 i=7 pid=11373 ppid=11372 i=6 pid=11376 ppid=11372 i=8 pid=11375 ppid=11372 i=5 pid=11375 ppid=11372 i=9
  • 6. PyCon APAC 2015 Example multiprocessing: further observations (1/2) 6 => What if I create the target function after the pool initialized? 1 import os 2 from multiprocessing import Pool 3 4 print 'pid=%d' % os.getpid() 5 pool = Pool(processes=4) 6 7 def worker(i): 8 print 'pid=%d ppid=%d i=%d' % (os.getpid(), os.getppid(), i) 9 10 pool.map(worker, xrange(10)) 11 pool.terminate() • Adopts un-named pipe to handle IPC • Workers are forked when initializing the pool • so that workers can “see” the target function (they will share the same memory copy)
  • 7. PyCon APAC 2015 Example multiprocessing: further observations (2/2) 7 Output: pid=12093 Process PoolWorker-1: Process PoolWorker-2: Traceback (most recent call last): Process PoolWorker-3: Traceback (most recent call last): File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap Traceback (most recent call last): File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap ...ignored... AttributeError: 'module' object has no attribute 'worker' ...ignored... pid=12101 ppid=12093 i=4 pid=12101 ppid=12093 i=5 pid=12101 ppid=12093 i=6 pid=12101 ppid=12093 i=7 pid=12101 ppid=12093 i=8 pid=12101 ppid=12093 i=9 ^CProcess PoolWorker-6: Traceback (most recent call last): File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap self.run() File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run self._target(*self._args, **self._kwargs) File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker task = get() File "/usr/lib/python2.7/multiprocessing/queues.py", line 374, in get racquire() KeyboardInterrupt lost 0~3 process hanging ctrl+c pressed worker #6 #1~4 were terminated due to the exception following workers will be forked
  • 8. PyCon APAC 2015 Example overhead of IPC and GIL battle[17] comparison 8 1 import time 2 from multiprocessing import Process 3 from threading import Thread 4 from multiprocessing import Queue as MPQ 5 from Queue import Queue 6 7 MAX = 1000000 8 9 def test_(w_class, q_class): 10 def worker(queue): 11 for i in xrange(MAX): 12 queue.put(i) 13 14 q = q_class() 15 w = w_class(target=worker, args=(q,)) 16 17 begin = time.time() 18 w.start() 19 for i in xrange(MAX): 20 q.get() 21 w.join() 22 end = time.time() 23 24 return end - begin 26 def test_sthread(): 27 q = Queue() 28 29 begin = time.time() 30 for i in xrange(MAX): 31 q.put(i) 32 q.get() 33 end = time.time() 34 35 return end - begin 36 37 print 'mprocess: %.6f' % test_(Process, MPQ) 38 print 'mthread: %.6f' % test_(Thread, Queue) 39 print 'sthread: %.6f' % test_sthread() Output: mprocess: 14.225408 mthread: 7.759567 sthread: 2.743325 API of multiprocessing is similar to threading[18] IPC is the most costly overhead of the GIL battle
  • 9. PyCon APAC 2015 Example pp remote node 9 Server: $ ppserver.py -w 1 -p 10000 & [1] 16512 $ ppserver.py -w 1 -p 10001 & [2] 16514 $ ppserver.py -w 1 -p 10002 & [3] 16516 $ netstat -nlp Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 16512/python tcp 0 0 0.0.0.0:10001 0.0.0.0:* LISTEN 16514/python tcp 0 0 0.0.0.0:10002 0.0.0.0:* LISTEN 16516/python $ pstree -p $$ bash(11971)-+-ppserver.py(16512)---python(16513) |-ppserver.py(16514)---python(16515) |-ppserver.py(16516)---python(16517) `-pstree(16547) # of workers listen to wait remote jobs workers
  • 10. PyCon APAC 2015 Example pp local node 10 Output: pid=16633 pid=16634 ppid=16633 i=0 pid=16513 ppid=16512 i=1 pid=16517 ppid=16516 i=2 pid=16515 ppid=16514 i=3 pid=16513 ppid=16512 i=4 pid=16517 ppid=16516 i=5 pid=16515 ppid=16514 i=6 pid=16634 ppid=16633 i=7 pid=16517 ppid=16516 i=8 pid=16513 ppid=16512 i=9 1 import os 2 import pp 3 import time 4 import random 5 6 print 'pid=%d' % os.getpid() 7 8 def worker(i): 9 print 'pid=%d ppid=%d i=%d' % (os.getpid(), os.getppid(), i) 10 time.sleep(random.randint(1, 3)) 11 12 servers = ('127.0.0.1:10000', '127.0.0.1:10001', '127.0.0.1:10002') 13 job_server = pp.Server(1, ppservers=servers) 14 15 jobs = list() 16 for i in xrange(10): 17 job = job_server.submit(worker, args=(i,), modules=('time', 'random')) 18 jobs.append(job) 19 20 for job in jobs: 21 job() # of workerspp worker collects stdout determine the result order (deterministic) accumulative, beware of RSIZE of remote node A pp local node is an execution node too. It dispatches jobs to itself first. computed by local node
  • 11. Example: ppserver.py gives some exceptions

    Exception:

        Exception in thread client_socket:
        Traceback (most recent call last):
          File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
            self.run()
          File "/usr/lib/python2.7/threading.py", line 763, in run
            self.__target(*self.__args, **self.__kwargs)
          File "/usr/local/bin/ppserver.py", line 176, in crun
            ctype = mysocket.receive()
          File "/usr/local/lib/python2.7/dist-packages/pptransport.py", line 196, in receive
            raise RuntimeError("Socket connection is broken")
        RuntimeError: Socket connection is broken

    Don't worry. This is expected.
  • 12. Release the GIL
      • Especially suitable for processor-bound tasks
      • Examples:
          • ctypes[19]
          • Python/C extension[20][21]
          • Cython[22]
          • Pyrex[23]
  • 13. Example: ctypes (1/2)

    Python:

        import threading

        duration = 10

        def internal_busy():
            import time

            # consume CPU resource
            count = 0
            begin = time.time()
            while True:
                if time.time() - begin > duration:
                    break
                count += 1

            print 'internal_busy(): count = %u' % count

        def external_busy():
            from ctypes import CDLL
            from ctypes import c_uint

            libbusy = CDLL('./busy.so')
            busy_wait = libbusy.busy_wait
            # specify input/output types (strongly recommended)
            busy_wait.argtypes = [c_uint]
            busy_wait.restype = None  # busy_wait() returns void

            busy_wait(duration)

        print 'two internal busy threads, CPU utilization cannot over 100%'
        t1 = threading.Thread(target=internal_busy)
        t1.start()
        t2 = threading.Thread(target=internal_busy)
        t2.start()
        t1.join()
        t2.join()

        print 'with one external busy thread, CPU utilization gains to 200%'
        t1 = threading.Thread(target=internal_busy)
        t1.start()
        t2 = threading.Thread(target=external_busy)
        t2.start()
        t1.join()
        t2.join()

    C (built into ./busy.so):

        #include <inttypes.h>
        #include <stdint.h>
        #include <stdio.h>
        #include <time.h>

        void busy_wait(unsigned int duration)
        {
            uint64_t count = 0;
            time_t begin = time(NULL);

            while(1) {
                if(time(NULL) - begin > duration)
                    break;
                count++;
            }

            printf("busy_wait(): count = %" PRIu64 "\n", count);
        }
  • 14. Example: ctypes (2/2)

    Output:

        two internal busy threads, CPU utilization cannot over 100%
        internal_busy(): count = 12911610
        internal_busy(): count = 16578663
        with one external busy thread, CPU utilization gains to 200%
        internal_busy(): count = 45320393
        busy_wait(): count = 3075909775

    Atop display (two internal busy threads):

        CPU | sys 46% | user  72% | irq 0% | idle 82% | wait     0% |
        cpu | sys 26% | user  39% | irq 1% | idle 35% | cpu001 w 0% |
        cpu | sys 20% | user  33% | irq 0% | idle 46% | cpu000 w 1% |

    Atop display (with one external busy thread):

        CPU | sys  1% | user 199% | irq 0% | idle  0% | wait     0% |
        cpu | sys  1% | user  99% | irq 0% | idle  0% | cpu000 w 0% |
        cpu | sys  0% | user 100% | irq 0% | idle  0% | cpu001 w 0% |
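The GIL release that ctypes performs around foreign calls can also be observed without compiling busy.so. The sketch below is not from the slides: it assumes Python 3 on a POSIX system where `ctypes.CDLL(None)` exposes the libc symbol `usleep`, and it replaces the busy loop with a sleep, so it shows overlapping calls rather than 200% CPU utilization. The names `external_sleep`, `run_two_threads` and `SLEEP_US` are made up for illustration.

```python
import ctypes
import threading
import time

# Assumption: POSIX; CDLL(None) opens the running process, which includes libc.
libc = ctypes.CDLL(None)
libc.usleep.argtypes = [ctypes.c_uint]
libc.usleep.restype = ctypes.c_int

SLEEP_US = 300000  # 0.3 second per call


def external_sleep():
    # Functions loaded through CDLL run with the GIL released, so a
    # second Python thread can be inside usleep() at the same time.
    libc.usleep(SLEEP_US)


def run_two_threads():
    threads = [threading.Thread(target=external_sleep) for _ in range(2)]
    begin = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - begin


elapsed = run_two_threads()
print('two overlapping calls took %.2f s' % elapsed)
```

If the two calls were serialized by the GIL, the elapsed time would be close to 0.6 second; because they overlap, it should stay close to 0.3 second. Loading the library with `ctypes.PyDLL` instead of `CDLL` keeps the GIL held during the call.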
  • 15. Example: Python/C extension (1/3)

        #include <Python.h>

        /* The slide omits the lines that precede this excerpt; they define
           busy_wait(), presumably the same function as in the ctypes example. */

        static PyObject *with_lock(PyObject *self, PyObject *args)
        {
            unsigned int duration;

            /* require an unsigned integer argument (busy duration) */
            if(!PyArg_ParseTuple(args, "I", &duration))
                return NULL;

            busy_wait(duration);

            /* return None */
            Py_INCREF(Py_None);
            return Py_None;
        }

        static PyObject *without_lock(PyObject *self, PyObject *args)
        {
            unsigned int duration;

            if(!PyArg_ParseTuple(args, "I", &duration))
                return NULL;

            /* release the GIL before being busy */
            PyThreadState *_save;
            _save = PyEval_SaveThread();
            busy_wait(duration);
            PyEval_RestoreThread(_save);

            Py_INCREF(Py_None);
            return Py_None;
        }

        /* exported symbol name, C function, METH_VARARGS (accept positional args), docstring */
        static PyMethodDef busy_methods[] = {
            {"with_lock", with_lock, METH_VARARGS, "Busy wait for a given duration with GIL"},
            {"without_lock", without_lock, METH_VARARGS, "Busy wait for a given duration without GIL"},
            {NULL, NULL, 0, NULL}
        };

        PyMODINIT_FUNC initbusy(void)
        {
            /* "busy" is the module name */
            if(Py_InitModule("busy", busy_methods) == NULL)
                PyErr_SetString(PyExc_RuntimeError, "failed to Py_InitModule");
        }

    Compilation:

        $ cat Makefile
        busy.so: busy.c
            $(CC) -o $@ -fPIC -shared -I/usr/include/python2.7 busy.c
        $ make
  • 16. Example: Python/C extension (2/3)

        import threading

        duration = 10

        def internal_busy():
            import time

            count = 0
            begin = time.time()
            while True:
                if time.time() - begin > duration:
                    break
                count += 1

            print 'internal_busy(): count = %u' % count

        def external_busy_with_lock():
            # linking to the busy.so extension
            from busy import with_lock

            with_lock(duration)

        def external_busy_without_lock():
            from busy import without_lock

            without_lock(duration)

        print 'two busy threads compete for GIL, CPU utilization cannot over 100%'
        t1 = threading.Thread(target=internal_busy)
        t1.start()
        t2 = threading.Thread(target=external_busy_with_lock)
        t2.start()
        t1.join()
        t2.join()

        print 'with one busy thread released GIL, CPU utilization gains to 200%'
        t1 = threading.Thread(target=internal_busy)
        t1.start()
        t2 = threading.Thread(target=external_busy_without_lock)
        t2.start()
        t1.join()
        t2.join()
  • 17. Example: Python/C extension (3/3)

    Output:

        two busy threads compete for GIL, CPU utilization cannot over 100%
        busy_wait(): count = 3257960533
        internal_busy(): count = 45524
        with one busy thread released GIL, CPU utilization gains to 200%
        internal_busy(): count = 48049276
        busy_wait(): count = 3271300229

    Atop display (two busy threads compete for the GIL):

        CPU | sys 2% | user 100% | irq 0% | idle 99% | wait     0% |
        cpu | sys 0% | user 100% | irq 0% | idle  0% | cpu001 w 0% |
        cpu | sys 1% | user   0% | irq 0% | idle 99% | cpu000 w 0% |

    Atop display (with one busy thread that released the GIL):

        CPU | sys 2% | user 198% | irq 0% | idle  0% | wait     0% |
        cpu | sys 0% | user 100% | irq 0% | idle  0% | cpu000 w 0% |
        cpu | sys 1% | user  98% | irq 0% | idle  0% | cpu001 w 0% |
  • 18. Cooperative Multitasking
      • Only applicable to IO-bound tasks
      • Single process, single thread
          • no other thread, no GIL battle
      • Executes the code exactly when needed
      • Examples:
          • generator[24]
          • pyev[25]
          • gevent[26]
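The slides list generators among the examples but do not show one. A minimal sketch of generator-based cooperative multitasking, assuming nothing beyond the standard library, could look like this. The names `ticker` and `run_round_robin` are made up for illustration, and the scheduler is a plain round-robin loop rather than an event loop driven by IO readiness.

```python
from collections import deque


def ticker(name, count, log):
    """A cooperative task: do one step of work, then yield control."""
    for i in range(count):
        log.append('%s:%d' % (name, i))
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in a single thread until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # the task has finished; drop it
        ready.append(task)      # not finished yet; queue it for another turn


log = []
run_round_robin([ticker('a', 3, log), ticker('b', 2, log)])
print(log)
```

The two tasks interleave their steps although only one thread exists, so there is no GIL battle: a task runs until it yields voluntarily, and the scheduler decides which task resumes next.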
  • 19. Example: pyev

        import pyev
        import signal
        import sys

        def alarm_handler(watcher, revents):
            sys.stdout.write('.')
            sys.stdout.flush()

        def timeout_handler(watcher, revents):
            loop = watcher.loop
            loop.stop()

        def int_handler(watcher, revents):
            loop = watcher.loop
            loop.stop()

        if __name__ == '__main__':
            loop = pyev.Loop()

            alarm = loop.timer(0.0, 1.0, alarm_handler)
            alarm.start()

            timeout = loop.timer(10.0, 0.0, timeout_handler)
            timeout.start()

            sigint = loop.signal(signal.SIGINT, int_handler)
            sigint.start()

            loop.start()

    Case 1 output (11 dots):

        ...........

    Case 2 output:

        ..^C

    libev timer: (after)|(repeat)|(repeat)|(repeat)|... where each parenthesized item is an interval and each | is a raised event. In the example, the alarm timer raises after 0.0 second and then every 1.0 second, so it raises 11 times in total before the 10.0-second timeout stops the loop.
  • 20. Example: pyev, further observations

    Snippet 1:

        loop.timer(0.0, 1.0, alarm_handler).start()

        loop.start()

    Output:

        Exception SystemError: 'null argument to internal routine' in Segmentation fault (core dumped)

    Snippet 2:

        timeout = loop.timer(0.0, 1.0, alarm_handler)
        timeout.start()

        timeout = loop.timer(10.0, 0.0, timeout_handler)
        timeout.start()

        loop.start()

    Snippet 3:

        alarm = loop.timer(0.0, 1.0, alarm_handler)
        alarm.start()
        sigint = loop.timer(10.0, 0.0, timeout_handler)
        sigint.start()
        sigint = loop.signal(signal.SIGINT, int_handler)
        sigint.start()
        loop.start()

    Output:

        ...........Exception SystemError: 'null argument to internal routine' in Segmentation fault (core dumped)

    In each snippet, at least one started watcher is no longer referenced by any Python name when the loop runs: it is never assigned (snippet 1) or its name is rebound to another watcher (snippets 2 and 3).

    Manual of ev[27]: "you are responsible for allocating the memory for your watcher structures"
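Why a rebound name matters can be shown without pyev. The sketch below uses a made-up `Watcher` class as a stand-in (it is not the pyev API) and a weak reference to observe whether the first object is still alive. It illustrates only the Python side of the problem, namely that an object disappears once its last reference is dropped; it does not reproduce the crash.

```python
import gc
import weakref


class Watcher(object):
    """Hypothetical stand-in for a watcher object; this is not the pyev API."""

    def __init__(self, name):
        self.name = name


# Pitfall: rebinding the only name drops the last reference to the first object.
timeout = Watcher('alarm')
probe = weakref.ref(timeout)   # observes the object without keeping it alive
timeout = Watcher('timeout')   # the 'alarm' watcher is now unreferenced
gc.collect()                   # not needed on CPython; helps other interpreters
print(probe() is None)

# Remedy: give every watcher its own long-lived reference, for example in a list.
watchers = [Watcher('alarm'), Watcher('timeout')]
probes = [weakref.ref(w) for w in watchers]
gc.collect()
print(all(p() is not None for p in probes))
```

On CPython both lines print True: the first watcher is gone after the rebinding, while the watchers held in the list stay alive. Applied to the snippets above, this suggests keeping each started watcher under its own name, or in a container, for as long as the loop runs.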
  • 21. Example: gevent

        import gevent
        from gevent import signal
        import signal as o_signal
        import sys

        if __name__ == '__main__':
            ctx = dict(stop_flag=False)

            def int_handler():
                ctx['stop_flag'] = True
            gevent.signal(o_signal.SIGINT, int_handler)

            count = 0
            while not ctx['stop_flag']:
                sys.stdout.write('.')
                sys.stdout.flush()

                gevent.sleep(1)

                count += 1
                if count > 10:
                    break

    Case 1 output:

        ...........

    Case 2 output:

        ..^C
  • 22. Interpreter as an Instance
      • A rough idea, not a concrete solution yet
      • C program, single process, multi-thread
          • threads can still share states with relatively low penalty
      • Allocate memory space for the interpreter context
          • that is, let Py_Initialize() accept an address at which to put the instance context
  • 23. Conclusion
      • How to live with the GIL?
          • Multi-process
          • Release the GIL
          • Cooperative Multitasking
          • Perhaps, Interpreter as an Instance
  • 24. References
    [1]: http://en.wikipedia.org/wiki/Global_Interpreter_Lock
    [2]: http://en.wikipedia.org/wiki/Giant_lock
    [3]: http://en.wikipedia.org/wiki/Fine-grained_locking
    [4]: http://en.wikipedia.org/wiki/Non-blocking_algorithm
    [5]: https://wiki.python.org/moin/GlobalInterpreterLock
    [6]: http://en.wikipedia.org/wiki/Multiprocessing
    [7]: https://docs.python.org/2/faq/library.html#can-t-we-get-rid-of-the-global-interpreter-lock
    [8]: http://www.artima.com/weblogs/viewpost.jsp?thread=214235
    [9]: http://dabeaz.blogspot.tw/2011/08/inside-look-at-gil-removal-patch-of.html
    [10]: http://en.wikipedia.org/wiki/Embarrassingly_parallel
    [11]: http://en.wikipedia.org/wiki/Inter-process_communication
    [12]: https://docs.python.org/2/library/multiprocessing.html
    [13]: http://www.parallelpython.com/
    [14]: https://code.google.com/p/pycsp/
    [15]: https://github.com/penvirus/gil1
    [16]: http://en.wikipedia.org/wiki/Nondeterministic_algorithm
    [17]: http://www.dabeaz.com/python/GIL.pdf
    [18]: https://docs.python.org/2/library/threading.html
    [19]: https://docs.python.org/2/library/ctypes.html
    [20]: https://docs.python.org/2/c-api/
    [21]: https://docs.python.org/2/c-api/init.html#releasing-the-gil-from-extension-code
    [22]: http://cython.org/
    [23]: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/
    [24]: http://www.dabeaz.com/coroutines/Coroutines.pdf
    [25]: http://pythonhosted.org/pyev/
    [26]: http://www.gevent.org/
    [27]: http://linux.die.net/man/3/ev