Automatically Fusing Functions on CuPy

Akifumi Imanishi

What's CuPy
• An implementation of a NumPy-compatible multi-dimensional array on CUDA
• CuPy enables us to write Python code for running on GPU.
• Two basic operations
  • elementwise
    • Applying the function to each element
  • reduction
    • Reducing elements
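
The two operation types can be illustrated with a minimal sketch. It assumes CuPy is installed with a usable GPU; otherwise it falls back to NumPy, which exposes the same array API for these calls. The fallback and the variable names are assumptions of this example, not part of the original slides.

```python
import numpy as np

try:
    import cupy as xp          # GPU path
    xp.zeros(1)                # raises if no usable CUDA device is available
except Exception:
    xp = np                    # assumption for this sketch: CPU fallback via NumPy

x = xp.arange(6, dtype=xp.float32).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Elementwise: the function is applied to each element independently.
squared = xp.square(x)

# Reduction: elements are combined along an axis into fewer values.
row_sums = xp.sum(x, axis=1)

print(squared.tolist())    # [[0.0, 1.0, 4.0], [9.0, 16.0, 25.0]]
print(row_sums.tolist())   # [3.0, 12.0]
```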

Problems of CuPy
• Small functions are called many times.
• Communication time between CPU and GPU is a bottleneck.
• A mechanism for fusing functions is needed to resolve it.
• ex.): x * y + z * 3 + 5
  • There are 4 kernel calls in total.
  • We want to calculate the expression in 1 kernel call.
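
To see where the 4 kernel calls come from, here is a toy sketch in plain Python. `CountingArray` is a hypothetical stand-in for a GPU array (it is not a CuPy class): each arithmetic operator counts as one kernel launch, which mirrors how unfused array expressions are evaluated one operator at a time.

```python
class CountingArray:
    """Toy stand-in for a GPU array: every operator counts as one kernel launch."""

    launches = 0

    def __init__(self, value):
        self.value = value

    def _launch(self, result):
        # Each operator produces a temporary result and costs one launch.
        CountingArray.launches += 1
        return CountingArray(result)

    @staticmethod
    def _unwrap(other):
        return other.value if isinstance(other, CountingArray) else other

    def __add__(self, other):
        return self._launch(self.value + self._unwrap(other))

    def __mul__(self, other):
        return self._launch(self.value * self._unwrap(other))


x, y, z = CountingArray(2.0), CountingArray(3.0), CountingArray(4.0)
result = x * y + z * 3 + 5

print(CountingArray.launches)  # 4: x * y, z * 3, the sum of both, and "+ 5"
print(result.value)            # 23.0
```

A fused version would evaluate the same expression while paying the launch cost only once.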

UI for elementwise kernel
• Converting a Python function to an elementwise kernel.
• ex.)
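
The example from the slide is not preserved in this transcript. As a hedged sketch, the following uses the `cupy.fuse` decorator as it exists in current CuPy releases; the interface shown in the original talk may have differed. The NumPy fallback, the no-op `fuse` replacement, and the function name `scaled_sum` are assumptions added so the sketch also runs without a GPU.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)                 # raises if no usable CUDA device is available
    fuse = xp.fuse
except Exception:
    xp = np                     # assumption: CPU fallback so the sketch still runs

    def fuse(*args, **kwargs):  # hypothetical no-op replacement for cupy.fuse
        return lambda func: func


@fuse()
def scaled_sum(x, y, z):
    # With CuPy, these four operations are fused into a single kernel
    # instead of launching one kernel per operator.
    return x * y + z * 3 + 5


x = xp.arange(4, dtype=xp.float32)       # [0, 1, 2, 3]
y = xp.full(4, 2, dtype=xp.float32)      # [2, 2, 2, 2]
z = xp.ones(4, dtype=xp.float32)         # [1, 1, 1, 1]
out = scaled_sum(x, y, z)

print(out.tolist())  # [8.0, 10.0, 12.0, 14.0]
```

The function body stays ordinary Python, so the same code can be called with or without the decorator when debugging.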

Constructing a Data Structure
• Expression tree for x * y + z * 3 + 5:

            +
          /   \
        +       5
      /   \
    *       *
   / \     / \
  x   y   z   3
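
One way to build such a tree is to call the Python function on placeholder objects that record operations instead of computing them. The sketch below is an illustration of that idea only, not CuPy's actual implementation; `Node`, `trace`, and `emit` are hypothetical names.

```python
class Node:
    """A node of the expression tree: a variable, a constant, or an operator."""

    def __init__(self, op, children=(), label=None):
        self.op = op              # "var", "const", "+", or "*"
        self.children = children  # operand nodes (empty for leaves)
        self.label = label        # variable name or constant value

    @staticmethod
    def wrap(value):
        # Python scalars such as 3 and 5 become constant leaves.
        return value if isinstance(value, Node) else Node("const", label=value)

    def __add__(self, other):
        return Node("+", (self, Node.wrap(other)))

    def __mul__(self, other):
        return Node("*", (self, Node.wrap(other)))

    def emit(self):
        """Render the whole tree as one expression for a single kernel body."""
        if self.op in ("var", "const"):
            return str(self.label)
        left, right = (child.emit() for child in self.children)
        return f"({left} {self.op} {right})"


def trace(func, *names):
    """Call func on placeholder variables to record its operations as a tree."""
    return func(*(Node("var", label=name) for name in names))


tree = trace(lambda x, y, z: x * y + z * 3 + 5, "x", "y", "z")
kernel_body = "out = " + tree.emit()

print(kernel_body)  # out = (((x * y) + (z * 3)) + 5)
```

Because the whole expression is known before anything runs, it can be emitted as the body of one kernel rather than four.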

UI for reduction kernel
• Converting a Python function to a ReductionKernel.
• ex.)
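
The slide's example is not preserved here. As a hedged sketch, current CuPy releases allow a reduction such as `sum` inside a function decorated with `cupy.fuse`; the decorator arguments used in the original talk may have been different. The NumPy fallback, the no-op `fuse` replacement, and the function name are assumptions of this example.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)                 # raises if no usable CUDA device is available
    fuse = xp.fuse
except Exception:
    xp = np                     # assumption: CPU fallback so the sketch still runs

    def fuse(*args, **kwargs):  # hypothetical no-op replacement for cupy.fuse
        return lambda func: func


@fuse()
def sum_of_squared_diff(x, y):
    # Elementwise part (subtract, multiply) followed by a reduction (sum).
    diff = x - y
    return xp.sum(diff * diff, axis=1)


x = xp.array([[1, 2, 3], [4, 5, 6]], dtype=xp.float32)
y = xp.ones((2, 3), dtype=xp.float32)
out = sum_of_squared_diff(x, y)

print(out.tolist())  # [5.0, 50.0]
```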

Rewrite adam.py by using "fuse"
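
The rewritten code shown on the slide is not preserved in this transcript. The following is a sketch of what an Adam update expressed as a fused function could look like, using the standard Adam update equations and `cupy.fuse` from current CuPy releases. The function name `adam_step`, its argument order, the returned tuple, and the NumPy fallback are assumptions of this example, not Chainer's actual code.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)                 # raises if no usable CUDA device is available
    fuse = xp.fuse
except Exception:
    xp = np                     # assumption: CPU fallback so the sketch still runs

    def fuse(*args, **kwargs):  # hypothetical no-op replacement for cupy.fuse
        return lambda func: func


@fuse()
def adam_step(grad, param, m, v, lr, one_minus_beta1, one_minus_beta2, eps):
    # First and second moment estimates, then the parameter update.
    m_new = m + one_minus_beta1 * (grad - m)
    v_new = v + one_minus_beta2 * (grad * grad - v)
    param_new = param - lr * m_new / (xp.sqrt(v_new) + eps)
    return param_new, m_new, v_new


grad = xp.ones(2, dtype=xp.float32)
param = xp.zeros(2, dtype=xp.float32)
m = xp.zeros(2, dtype=xp.float32)
v = xp.zeros(2, dtype=xp.float32)

# Hyperparameters are illustrative values chosen for this sketch.
param, m, v = adam_step(grad, param, m, v, 0.1, 0.1, 0.001, 1e-8)
```

Written this way, the update reads like the ufunc version but avoids one kernel launch per operator.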

Results
• chainer/optimizers/adam.py (update_one_gpu)
• chainer/example/mnist/train_mnist.py

             Running time   Memory usage (MiB)
Ufunc        78.656         225
Elementwise  62.430         211
Fusion       62.874         211
