Example with GCoptimization
Kevin Keraudren
Imperial College London

October 31st, 2013
Cython overview
Syntax halfway between Python and C (keyword cdef)
C/C++ code is generated automatically, then compiled into a Python module
For a speed gain, variables must be declared with C types
C++ templates must be instantiated with concrete types (compiled code)
Choose between wrapping the low-level C++ API or calling C++ as a black box
Documentation: docs.cython.org
Learn from examples: scikit-learn, scikit-image, github.com/amueller

How to?

1. Organize your C/C++ code
2. Write module.pyx
3. Write setup.py
4. Build

Example 1
Interface the whole API

Graphcut (Boykov & Kolmogorov)

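$$
V(p, q) =
\begin{cases}
\dfrac{1}{\|p - q\|_2} & \text{if } I_p \geq I_q \\[2ex]
\dfrac{1}{\|p - q\|_2} \, e^{-(I_p - I_q)^2 / 2\sigma^2} & \text{if } I_p < I_q
\end{cases}
$$

σ: noise estimate
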
GCoptimization (Boykov & Kolmogorov)
template <typename captype,
          typename tcaptype,
          typename flowtype> class Graph {
public:
    ...
    Graph( int node_num_max, int edge_num_max,
           void (*err_function)(const char *) = NULL);
    void add_edge( node_id i, node_id j,
                   captype cap, captype rev_cap);
    void add_tweights( node_id i,
                       tcaptype cap_source, tcaptype cap_sink);
    flowtype maxflow( bool reuse_trees = false,
                      Block<node_id>* changed_list = NULL);
    termtype what_segment( node_id i,
                           termtype default_segm = SOURCE);
    ...
};

Begin your module.pyx

import numpy as np
cimport numpy as np
np.import_array()
ctypedef double captype
ctypedef double tcaptype
ctypedef double flowtype

Declare what you need from C++

cdef extern from "graph.h":
    cdef cppclass Graph[captype,tcaptype,flowtype]:
        Graph( size_t, size_t )
        size_t add_node(size_t)
        void add_edge(size_t,size_t,captype,captype)
        void add_tweights(size_t,tcaptype,tcaptype)
        flowtype maxflow()
        int what_segment(size_t)

Create your Python class

cdef class PyGraph:
    # hold a C++ instance which we are wrapping
    cdef Graph[captype,tcaptype,flowtype] *thisptr
    def __cinit__(self, size_t nb_nodes, size_t nb_edges):
        self.thisptr = new Graph[captype,
                                 tcaptype, flowtype](nb_nodes,nb_edges)
    def __dealloc__(self):
        del self.thisptr

Create your Python class (continued)

    # methods of PyGraph: each one forwards to the wrapped C++ instance
    def add_node(self, size_t nb_nodes=1):
        self.thisptr.add_node(nb_nodes)
    def add_edge(self, size_t i, size_t j,
                 captype cap, captype rev_cap):
        self.thisptr.add_edge(i,j,cap,rev_cap)
    def add_tweights(self, size_t i,
                     tcaptype cap_source, tcaptype cap_sink):
        self.thisptr.add_tweights(i,cap_source,cap_sink)
    def maxflow(self):
        return self.thisptr.maxflow()
    def what_segment(self, size_t i):
        return self.thisptr.what_segment(i)

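To get a feel for what these wrapped methods compute without building the extension, the sketch below mimics the PyGraph interface in pure Python. It is only an illustration: TinyGraph and its helpers are hypothetical names, it uses a simple Edmonds-Karp search rather than the Boykov-Kolmogorov algorithm of the C++ library, and the values SOURCE = 0 and SINK = 1 are assumed to mirror the termtype enum in graph.h.

```python
from collections import deque

SOURCE, SINK = 0, 1  # assumption: same values as termtype in graph.h


class TinyGraph:
    """Pure-Python stand-in for PyGraph (Edmonds-Karp, tiny graphs only)."""

    def __init__(self):
        self.nb_nodes = 0
        self.cap = {}  # residual capacities, cap[u][v]

    def _arc(self, u, v, c):
        # Add capacity c on arc u -> v and make sure the reverse arc exists.
        self.cap.setdefault(u, {})
        self.cap.setdefault(v, {})
        self.cap[u][v] = self.cap[u].get(v, 0.0) + c
        self.cap[v].setdefault(u, 0.0)

    def add_node(self, nb_nodes=1):
        self.nb_nodes += nb_nodes

    def add_edge(self, i, j, cap, rev_cap):
        self._arc(i, j, cap)
        self._arc(j, i, rev_cap)

    def add_tweights(self, i, cap_source, cap_sink):
        # Terminal links: source -> i and i -> sink.
        self._arc("s", i, cap_source)
        self._arc(i, "t", cap_sink)

    def _reachable(self):
        # Breadth-first search from the source over arcs with spare capacity.
        parent = {"s": None}
        queue = deque(["s"])
        while queue:
            u = queue.popleft()
            for v, c in self.cap.get(u, {}).items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    def maxflow(self):
        flow = 0.0
        while True:
            parent = self._reachable()
            if "t" not in parent:
                return flow
            # Walk back from the sink to recover the augmenting path.
            path, v = [], "t"
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            push = min(self.cap[u][w] for u, w in path)
            for u, w in path:
                self.cap[u][w] -= push
                self.cap[w][u] += push
            flow += push

    def what_segment(self, i):
        # Nodes still reachable from the source lie on the source side of the cut.
        return SOURCE if i in self._reachable() else SINK


# Two pixels: node 0 prefers the source, node 1 prefers the sink.
g = TinyGraph()
g.add_node(2)
g.add_tweights(0, 5.0, 1.0)
g.add_tweights(1, 1.0, 5.0)
g.add_edge(0, 1, 2.0, 2.0)
print(g.maxflow())                            # 4.0
print(g.what_segment(0), g.what_segment(1))   # 0 1
```

The call sequence (add_node, add_tweights, add_edge, maxflow, what_segment) is the same one a script would follow with the compiled PyGraph.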
Write setup.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
from numpy.distutils.misc_util import get_numpy_include_dirs

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [
        Extension( "graphcut",
                   [ "graphcut.pyx",
                     "../maxflow-v3.02.src/graph.cpp",
                     "../maxflow-v3.02.src/maxflow.cpp" ],
                   language="c++",
                   include_dirs=get_numpy_include_dirs()+["../maxflow-v3.02.src"],
                 )
    ]
)

And build:

python setup.py build_ext --build-temp tmp \
                         --build-lib lib \
                         --pyrex-c-in-temp

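The setup.py above relies on distutils, which was removed from the standard library in Python 3.12. A rough setuptools-based equivalent might look like the sketch below. It is not from the slides: extension_spec and main are hypothetical helper names, and the paths are simply those used above.

```python
# Sketch of a setuptools-based setup.py; paths are the ones used on the slide.
MAXFLOW_DIR = "../maxflow-v3.02.src"


def extension_spec():
    """Keyword arguments for setuptools.Extension, kept apart for easy inspection."""
    return {
        "name": "graphcut",
        "sources": [
            "graphcut.pyx",
            MAXFLOW_DIR + "/graph.cpp",
            MAXFLOW_DIR + "/maxflow.cpp",
        ],
        "language": "c++",
        "include_dirs": [MAXFLOW_DIR],
    }


def main():
    # Imported lazily so the spec can be inspected without a build toolchain.
    import numpy
    from setuptools import Extension, setup
    from Cython.Build import cythonize

    spec = extension_spec()
    spec["include_dirs"].append(numpy.get_include())
    setup(ext_modules=cythonize([Extension(**spec)]))


# In a real setup.py, end the file with a call to main()
# and build with: python setup.py build_ext --inplace
```

With --inplace the compiled module lands next to the sources, so it would be imported with a plain "import graphcut" rather than "from lib import graphcut".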
And use it!
from lib import graphcut

G = graphcut.PyGraph(nb_pixels, nb_pixels*(8+2))
G.add_node(nb_pixels)
...
print("building graph...")
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        for a,b in neighbourhood:
            if ( 0 <= i+a < img.shape[0]
                 and 0 <= j+b < img.shape[1] ):
                dist = np.sqrt( a**2 + b**2 )
                if img[i,j] < img[i+a,j+b]:
                    w = 1.0/dist
                else:
                    # Gaussian fall-off in the intensity difference, scaled by distance
                    w = np.exp(-(img[i,j] - img[i+a,j+b])**2
                               / (2.0 * std**2)) / dist
                G.add_edge( index(i,j,img),
                            index(i+a,j+b,img),
                            w, 0 )

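The loop above uses index and neighbourhood without defining them on the slide. One plausible reading, sketched here as an assumption rather than the author's code, is a row-major pixel numbering and an 8-connected neighbourhood (the 8 in the edge budget nb_pixels*(8+2) suggests this, with the extra 2 presumably reserved for the two terminal links per pixel).

```python
from types import SimpleNamespace

# 8-connected neighbourhood: every (row, col) offset around a pixel except (0, 0).
neighbourhood = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)
                 if (a, b) != (0, 0)]


def index(i, j, img):
    """Row-major node id of pixel (i, j); img only needs a 2D .shape."""
    return i * img.shape[1] + j


# Stand-in for a 3x4 NumPy image: only .shape is used here.
img = SimpleNamespace(shape=(3, 4))
nb_pixels = img.shape[0] * img.shape[1]

print(len(neighbourhood))                  # 8
print(index(1, 2, img))                    # 6
print(index(2, 3, img) == nb_pixels - 1)   # True
```

Any numbering works as long as it maps each pixel to a distinct id in range(nb_pixels), since those ids are what add_node allocated.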
Result

Example 2
Use C++ as a blackbox

Declare C++ function

cdef extern from "_graphcut.h":
    void _graphcut( voxel_t*,
                    int, int,
                    double,
                    unsigned char*,
                    unsigned char* )

And use it!
def graphcut( np.ndarray[voxel_t, ndim=2, mode="c"] img,
              np.ndarray[unsigned char, ndim=2, mode="c"] mask,
              double std ):
    # output segmentation, allocated in Python and filled in by the C++ code
    cdef np.ndarray[unsigned char,
                    ndim=2,
                    mode="c"] seg = np.zeros( (img.shape[0],
                                               img.shape[1]),
                                              dtype='uint8')
    print("starting graphcut...")
    _graphcut( <voxel_t*> img.data,
               img.shape[0], img.shape[1],
               std,
               <unsigned char*> mask.data,
               <unsigned char*> seg.data )
    return seg

Result

Timing

Example 1: 18.01 s
Example 2: 0.37 s
Nearly 50 times faster (18.01 / 0.37 ≈ 48.7)

Another result...

Conclusion

Huge speedup for a small amount of code
Ideal if the C++ code already exists
Make sure your Python code is already optimised (good use of numpy) before turning to Cython

Slides and code are on GitHub:
github.com/kevin-keraudren/talk-cython

Thanks!


Introduction to cython: example of GCoptimization
