TensorFlow Internal
Hyunghun Cho
(webofthink@snu.ac.kr)
Overview
■ Dataflow-like model
■ Runs on a wide variety of hardware platforms
※ Source: tensorflow.org
※ Source: github.com/zer0n/deepframeworks
Basic concepts
■ Tensor
– general definition: an array with more than two axes
– in T/F: a typed array of arbitrary dimensionality
■ Directed graph describes T/F computation
– node: instantiation of an Operation
■ Operation
– an abstract computation
– has attribute(s)
■ Kernel
– particular implementation of an Operation
– runs on a particular type of device (e.g. CPU, GPU)
■ Variable
– special Operation that returns a handle to a persistent, mutable Tensor
■ Session
– created by a client program to interact with the T/F system
[Figure: a graph node with 0…* input edges and 0…* output edges]
※ Source: T/F white paper
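The concepts above can be illustrated with a toy model in plain Python. This is a minimal sketch and not the TensorFlow API: the names `Node`, `KERNELS`, and `Session` are hypothetical, scalars stand in for tensors, and there is only a single device.

```python
# Toy model of the concepts listed above; NOT the TensorFlow API.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One instantiation of an Operation in the directed graph."""
    name: str
    op: str                                              # which abstract computation
    inputs: List["Node"] = field(default_factory=list)   # 0..* input edges
    attrs: Dict[str, float] = field(default_factory=dict)


def const_kernel(node, args, state):
    return node.attrs["value"]


def add_kernel(node, args, state):
    return args[0] + args[1]


def mul_kernel(node, args, state):
    return args[0] * args[1]


def variable_kernel(node, args, state):
    # The value lives in the session state, so it survives across run() calls.
    return state.setdefault(node.name, node.attrs["initial"])


def assign_kernel(node, args, state):
    # inputs[0] is the Variable node; args[1] is the new value.
    state[node.inputs[0].name] = args[1]
    return args[1]


# One concrete implementation (kernel) per Operation name.
KERNELS = {
    "Const": const_kernel,
    "Add": add_kernel,
    "Mul": mul_kernel,
    "Variable": variable_kernel,
    "Assign": assign_kernel,
}


class Session:
    """Client handle that executes parts of the graph and owns variable state."""

    def __init__(self):
        self.state = {}

    def run(self, fetch):
        cache = {}  # each node is evaluated at most once per run() call

        def evaluate(node):
            if node.name not in cache:
                args = [evaluate(i) for i in node.inputs]
                cache[node.name] = KERNELS[node.op](node, args, self.state)
            return cache[node.name]

        return evaluate(fetch)


x = Node("x", "Const", attrs={"value": 3.0})
one = Node("one", "Const", attrs={"value": 1.0})
w = Node("w", "Variable", attrs={"initial": 2.0})
y = Node("y", "Mul", inputs=[w, x])
w_plus_one = Node("w_plus_one", "Add", inputs=[w, one])
update = Node("update", "Assign", inputs=[w, w_plus_one])

sess = Session()
first = sess.run(y)    # uses the initial value of w
sess.run(update)       # mutates w inside the session
second = sess.run(y)   # sees the updated w
```

Running `y` before and after `update` gives different results because the Variable's value is kept by the session rather than by the graph.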
Programming Model
■ Example T/F code and corresponding computation graph
■ Single machine and distributed system architecture
※ Source: T/F white paper
Previous work
■ DistBelief
– Downpour SGD
– Sandblaster L-BFGS
■ Related to
– Project Adam (MSR)
– Parameter Server project
※ Source: Large Scale Distributed Deep Networks
※ Source: parameter server architecture github wiki
※ Source: Project Adam paper
Feature Comparison
Feature                      TensorFlow  Theano  Torch  Caffe  Chainer  CNTK
Run on single machine        O           O       O      O      O        O
Run on distributed machines  O           X       X      X      X        O
Symbolic differentiation     O           O       X      X      O        X
Implemented in C++           O           X       X      O      X        X
(O: yes, X: no)
※ Source: T/F white paper
■ For details, refer to Wikipedia
Execution Mode
■ Single Device
■ Multi Device
– Node placement
– Cross-Device Communication
■ Distributed
– Fault Tolerance
• Error detection in the communication between a Send and Receive node pair
• Periodic health checks from the master to every worker process
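Cross-device communication can be sketched as a graph rewrite: once nodes are placed, every edge that crosses a device boundary is replaced by a Send node on the source device and a Receive node on the destination device. The following is a minimal sketch under stated assumptions; the function name `insert_send_recv`, the node names, and the device labels `cpu0` and `gpu0` are hypothetical, and no real data transfer takes place.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (producer node, consumer node)


def insert_send_recv(edges: List[Edge], placement: Dict[str, str]):
    """Replace each cross-device edge with producer -> Send -> Recv -> consumer."""
    new_edges: List[Edge] = []
    new_placement = dict(placement)
    for src, dst in edges:
        src_dev, dst_dev = placement[src], placement[dst]
        if src_dev == dst_dev:
            new_edges.append((src, dst))  # same device: keep the edge as-is
            continue
        # One Send/Recv pair per (tensor, destination device), shared by all
        # consumers on that device, so the value is transferred only once.
        send = f"send[{src}->{dst_dev}]"
        recv = f"recv[{src}->{dst_dev}]"
        new_placement[send] = src_dev
        new_placement[recv] = dst_dev
        for edge in ((src, send), (send, recv), (recv, dst)):
            if edge not in new_edges:
                new_edges.append(edge)
    return new_edges, new_placement


edges = [("w", "matmul"), ("x", "matmul"), ("w", "norm"), ("matmul", "relu")]
placement = {"w": "cpu0", "x": "gpu0", "matmul": "gpu0", "norm": "gpu0", "relu": "gpu0"}
new_edges, new_placement = insert_send_recv(edges, placement)
```

After the rewrite, the only edge whose endpoints sit on different devices is the Send to Receive edge itself, which is where a runtime would detect communication errors.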
Programming Idioms
■ Programming Idioms
– Data Parallel Training
• sequential SGD
– Model Parallel Training
• Recurrent deep LSTM
– Concurrent Steps
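The data parallel idiom can be illustrated with a small sketch: a mini-batch is split across replicas, each replica computes a gradient on its shard, and the averaged gradient is applied once. With equal-sized shards this matches one sequential SGD step on the whole batch. The model (a one-parameter linear fit), the function names, and the data below are illustrative assumptions; replicas are simulated by a loop rather than by separate devices.

```python
import math


def gradient(w, batch):
    """Gradient of the mean squared error of y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def sequential_step(w, batch, lr):
    """One plain SGD step on the whole mini-batch."""
    return w - lr * gradient(w, batch)


def data_parallel_step(w, batch, num_replicas, lr):
    """One synchronous data-parallel step with equal-sized shards."""
    assert len(batch) % num_replicas == 0, "sketch assumes equal-sized shards"
    size = len(batch) // num_replicas
    shards = [batch[i * size:(i + 1) * size] for i in range(num_replicas)]
    # In a real system each replica would compute its gradient on its own device.
    grads = [gradient(w, shard) for shard in shards]
    return w - lr * sum(grads) / num_replicas


batch = [(float(x), 3.0 * x) for x in range(1, 9)]  # samples of y = 3x
w_seq = sequential_step(0.0, batch, lr=0.01)
w_par = data_parallel_step(0.0, batch, num_replicas=4, lr=0.01)
```

Both steps move `w` by the same amount, which is why the synchronous variant behaves like sequential SGD with a larger batch.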
Code Metrics
■ Source
– https://github.com/tensorflow/tensorflow
■ Code Summary
– Total 114MB
• 3373 files including C/C++, Python, HTML, …
– Top 5 languages for implementation
• C++ and Python are the major languages
• Protocol Buffers: provides a mechanism for serializing structured data
Language          Files  Blank  Comment    Code
C++                1092  46473    43399  276160
C/C++ Header        779  23457    44727   86274
Python              641  27622    46660   97570
Protocol Buffers    179   2217     7294    8724
Java                167   8296    17325   49374
C#                  116   4285     8653   34347
How it works
■ Python-C++ connection with SWIG wrapper
[Figure: code excerpts from tensorflow.i, py_func.i, py_func.h, and py_func.cc]
Code Structure
■ C++ implementation under /core folder
Folder                                      C/C++ Header     C++  Protocol Buffers   Total
./tensorflow/core/client/                              -     511                 -     511
./tensorflow/core/common_runtime/                   1384    8526                 -    9910
./tensorflow/core/common_runtime/gpu/                644    3674                 -    4318
./tensorflow/core/distributed_runtime/               581    2579                 -    3160
./tensorflow/core/distributed_runtime/rpc/           434    2759                 -    3193
./tensorflow/core/example/                           116     209                45     370
./tensorflow/core/framework/                        3539   14022               451   18012
./tensorflow/core/graph/                             952    5586                 -    6538
./tensorflow/core/kernels/                          9180   42188                11   51379
./tensorflow/core/lib/core/                          573    1240                25    1838
./tensorflow/core/lib/gtl/                          1452    1943                 -    3395
./tensorflow/core/lib/hash/                           36     400                 -     436
./tensorflow/core/lib/histogram/                      60     324                 -     384
./tensorflow/core/lib/io/                            340    2134                 -    2474
./tensorflow/core/lib/jpeg/                           78     767                 -     845
./tensorflow/core/lib/png/                            37     311                 -     348
./tensorflow/core/lib/random/                        690     856                 -    1546
./tensorflow/core/lib/strings/                       532    3111                 -    3643
./tensorflow/core/lib/wav/                            13     166                 -     179
./tensorflow/core/ops/                                 -    9346                 -    9346
./tensorflow/core/ops/compat/                         25     204                 -     229
./tensorflow/core/platform/                          805     738                 -    1543
./tensorflow/core/platform/default/                  349     290                 -     639
./tensorflow/core/platform/posix/                     31     656                 -     687
./tensorflow/core/protobuf/                            -       -               333     333
./tensorflow/core/public/                            202       -                 -     202
./tensorflow/core/user_ops/                            -      20                 -      20
./tensorflow/core/util/                             1354    4426               170    5950
./tensorflow/core/util/ctc/                          600     298                 -     898
./tensorflow/core/util/sparse/                       504     498                 -    1002
Total                                              24511  107782              1035  133328
C++ framework
■ Key classes
C++ kernels
■ Inherit from OpKernel
■ A kernel is implemented separately for CPU and GPU [How to]
– The GPU version uses the CUDA library
[Figure: code excerpts from constant_op.h, constant_op.cc, and constant_op_gpu.cu.cc]
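The per-device kernel idea can be sketched as a registry keyed by Operation name and device type, with every kernel deriving from a common base class. This is a Python analogy only, loosely modeled on how the C++ code registers kernels per device type; `register_kernel`, `find_kernel`, and the two `Add` kernels are hypothetical names, and the "GPU" kernel is a stand-in that runs on the CPU.

```python
from typing import Dict, Tuple, Type


class OpKernel:
    """Base class: one concrete implementation of an Operation."""

    def compute(self, *inputs):
        raise NotImplementedError


# (operation name, device type) -> kernel class
KERNEL_REGISTRY: Dict[Tuple[str, str], Type[OpKernel]] = {}


def register_kernel(op_name: str, device_type: str):
    """Class decorator that records a kernel for one op on one device type."""
    def decorator(cls):
        KERNEL_REGISTRY[(op_name, device_type)] = cls
        return cls
    return decorator


@register_kernel("Add", "CPU")
class AddCpuKernel(OpKernel):
    def compute(self, a, b):
        return [x + y for x, y in zip(a, b)]


@register_kernel("Add", "GPU")
class AddGpuKernel(OpKernel):
    # Stand-in only: a real GPU kernel would launch CUDA code instead.
    def compute(self, a, b):
        return list(map(sum, zip(a, b)))


def find_kernel(op_name: str, device_type: str) -> OpKernel:
    """Instantiate the kernel registered for this op on this device type."""
    try:
        return KERNEL_REGISTRY[(op_name, device_type)]()
    except KeyError:
        raise LookupError(
            f"no {device_type} kernel registered for op {op_name!r}"
        ) from None


cpu_result = find_kernel("Add", "CPU").compute([1, 2], [3, 4])
gpu_result = find_kernel("Add", "GPU").compute([1, 2], [3, 4])
```

The same Operation name resolves to a different implementation depending on the device the node was placed on, and a missing registration fails at lookup time.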
Code Structure
■ Python implementation under /python folder
Folder                                  C/C++ Header    C++  Protocol Buffers  Python   Total
./tensorflow/python/                               -      -                 -     168     168
./tensorflow/python/client/                       33    475                 -    2031    2539
./tensorflow/python/framework/                    13    686                 -    7097    7796
./tensorflow/python/kernel_tests/                  -      -                 -   25391   25391
./tensorflow/python/lib/core/                     26    316                 -       -     342
./tensorflow/python/lib/io/                       52     75                 -      31     158
./tensorflow/python/ops/                           -      -                 -   14995   14995
./tensorflow/python/platform/                      -      -                 -     888     888
./tensorflow/python/platform/default/              -      -                 -     389     389
./tensorflow/python/summary/                       -      -                 -    1168    1168
./tensorflow/python/summary/impl/                  -      -                 -     693     693
./tensorflow/python/tools/                         -      -                 -     280     280
./tensorflow/python/training/                      -      -                 6    7732    7738
./tensorflow/python/user_ops/                      -      -                 -       7       7
./tensorflow/python/util/                          -      -                 -      51      51
Total                                            124   1552                 6   60921   62603
Python Implementation
■ Operations
■ Training
Code Summary
■ The Python part
– Various operations and training routines
– API:
• the most complete and the easiest to use
■ The C++ part
– Framework and kernel functions
– API:
• offers some performance advantages
• supports deployment to small devices such as Android
Meta Framework
■ Keras
■ TensorFlow Slim
– a lightweight library for defining, training and evaluating models
■ Skflow
– provides a scikit-learn style API
■ PrettyTensor
– supports a chainable object syntax to quickly define neural networks
■ TFLearn
– a modular and transparent deep learning library
