Programming for Data Analysis
Week 2
Dr. Ferdin Joe John Joseph
Faculty of Information Technology
Thai – Nichi Institute of Technology, Bangkok
Today’s lesson
• Merging
• Concatenating
• Reshaping
• Laboratory
Merging
• Used in pandas to combine data from two or more sources
• Sources can be in the same format or in different formats
• For example csv with csv, csv with json, json with xml, or any mix of these
• Similar to joining NumPy arrays, but pandas also aligns the data on index and column labels
Function Used
concat()
Syntax of concat()
• objs : a sequence or mapping of Series or DataFrame objects. If a dict is passed, its keys are used as the keys argument unless keys is passed explicitly, in which case only the values for those keys are selected (see below). Any None objects are dropped silently; if all of them are None, a ValueError is raised.
• axis : {0, 1, …}, default 0. The axis to concatenate along.
• join : {‘inner’, ‘outer’}, default ‘outer’. How to handle indexes on the other axis (or axes): ‘outer’ takes the union, ‘inner’ the intersection.
• ignore_index : boolean, default False. If True, do not use the index values on the concatenation axis; the resulting axis is labeled 0, …, n - 1. This is useful when the concatenation axis carries no meaningful index information. Index values on the other axes are still respected in the join.
• keys : sequence, default None. Construct a hierarchical index using the passed keys as the outermost level. If multiple levels are passed, keys should contain tuples.
• levels : list of sequences, default None. Specific levels (unique values) to use for constructing a MultiIndex. Otherwise they are inferred from the keys.
• names : list, default None. Names for the levels in the resulting hierarchical index.
• verify_integrity : boolean, default False. Check whether the new concatenated axis contains duplicates. This can be very expensive relative to the actual data concatenation.
• copy : boolean, default True. If False, do not copy data unnecessarily.
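The parameters listed above can be seen working together in a short call. This is a minimal sketch, assuming two small frames named `left` and `right` with made-up values; neither the names nor the data come from the slides.

```python
import pandas as pd

# Two small illustrative frames (hypothetical data, not from the slides).
left = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
right = pd.DataFrame({"A": [5, 6], "C": [7, 8]})

# objs is the list of frames; axis=0 stacks them row-wise.
# join="inner" keeps only the columns common to both frames ("A").
# ignore_index=True relabels the rows 0..n-1.
stacked = pd.concat([left, right], axis=0, join="inner", ignore_index=True)
print(stacked)

# keys adds an outer index level; names labels the index levels.
# verify_integrity=True raises ValueError if the new index has duplicates.
labelled = pd.concat(
    [left, right],
    keys=["left", "right"],
    names=["source", "row"],
    verify_integrity=True,
)
print(labelled)
```

The second call keeps the default outer join, so columns that exist in only one frame are filled with NaN for the other.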
Example
• Available DataFrames: df1, df2 and df3
Creation of arrays
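The construction code on these slides was shown as images and is not part of the extracted text. As a rough sketch of what building df1, df2 and df3 could look like, assume each frame has columns A to D and four consecutive row labels; the helper `make_frame` and the cell values are illustrative assumptions.

```python
import pandas as pd


def make_frame(start):
    """Build a 4x4 frame of labels such as 'A0'; `start` is the first row label.

    Hypothetical helper for illustration only.
    """
    rows = range(start, start + 4)
    return pd.DataFrame(
        {col: [f"{col}{i}" for i in rows] for col in "ABCD"},
        index=list(rows),
    )


df1 = make_frame(0)  # row labels 0 to 3
df2 = make_frame(4)  # row labels 4 to 7
df3 = make_frame(8)  # row labels 8 to 11
print(df1)
```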
Frames
Concatenation
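A basic row-wise concatenation might look like the following sketch. The frames are small stand-ins with assumed values; only the names df1, df2 and df3 come from the slides.

```python
import pandas as pd

# Illustrative frames (assumed data) that share the same columns.
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]}, index=[2, 3])
df3 = pd.DataFrame({"A": ["A4", "A5"], "B": ["B4", "B5"]}, index=[4, 5])

frames = [df1, df2, df3]
result = pd.concat(frames)  # default axis=0: rows are stacked in list order
print(result)
```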
Concatenation views
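The slide content here was an image, so what follows is an assumption about its topic: one common way to view the pieces of a concatenated result is to tag each source with the keys argument and then select by that tag. The frame contents and the key labels "x" and "y" are illustrative.

```python
import pandas as pd

# Illustrative frames (assumed data).
df1 = pd.DataFrame({"A": ["A0", "A1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"]}, index=[2, 3])

# Each key becomes the outer level of a MultiIndex on the rows.
result = pd.concat([df1, df2], keys=["x", "y"])
print(result)

# Select only the rows that came from the second frame.
second_block = result.loc["y"]
print(second_block)
```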
Setting other axes
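Concatenating along the other axis places frames side by side instead of stacking them. A minimal sketch, assuming a second frame named df4 whose row labels only partly overlap those of df1 (both frames hold made-up values):

```python
import pandas as pd

# Illustrative frames (assumed data) with partly overlapping row labels.
df1 = pd.DataFrame(
    {"A": ["A0", "A1", "A2"], "B": ["B0", "B1", "B2"]}, index=[0, 1, 2]
)
df4 = pd.DataFrame({"D": ["D1", "D2", "D3"]}, index=[1, 2, 3])

# axis=1 puts the columns next to each other; rows are aligned on the index.
# With the default join="outer", labels missing from one frame become NaN.
side_by_side = pd.concat([df1, df4], axis=1)
print(side_by_side)
```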
Inner Join
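With join="inner", only labels present in every input survive on the non-concatenation axis. The sketch below uses two hypothetical frames, `scores` and `ages`, that share two of their three row labels.

```python
import pandas as pd

# Illustrative frames (assumed data); row labels 1 and 2 are shared.
scores = pd.DataFrame({"score": [70, 80, 90]}, index=[0, 1, 2])
ages = pd.DataFrame({"age": [21, 22, 23]}, index=[1, 2, 3])

# Side-by-side concatenation that keeps only the shared row labels.
inner = pd.concat([scores, ages], axis=1, join="inner")
print(inner)
```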
Outer Join
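With join="outer" (the default), the result keeps the union of labels and fills the gaps with NaN. A small sketch, assuming two hypothetical frames `first` and `second` that share only the column "item":

```python
import pandas as pd

# Illustrative frames (assumed data) with one shared column.
first = pd.DataFrame({"item": ["pen", "ink"], "price": [10, 25]})
second = pd.DataFrame({"item": ["pad"], "stock": [7]})

# Rows are stacked; the columns are the union of both column sets.
outer = pd.concat([first, second], join="outer", ignore_index=True)
print(outer)
```

Because missing cells are stored as NaN, the integer columns in this example come back as floating point.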
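Append()
• Alternative to concat() in older pandas: DataFrame.append() was deprecated in pandas 1.4 and removed in 2.0
• Combines DataFrames along the first axis (rows, axis=0) only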
Append
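DataFrame.append() is only available in pandas versions before 2.0, so the sketch below shows the legacy call as a comment and runs the concat() equivalent instead. The frames hold assumed values.

```python
import pandas as pd

# Illustrative frames (assumed data).
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]})
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]})

# Legacy form, pandas < 2.0 only:
#     result = df1.append(df2)
# Equivalent call that also works on current pandas:
result = pd.concat([df1, df2])
print(result)
```

Both forms keep the original row labels, so the labels 0 and 1 each appear twice here; passing ignore_index=True renumbers them.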
Sort
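The slide body was an image; assuming it covered the sort argument, the sketch below shows its effect on the column order when the inputs have different columns. The frames and their values are illustrative.

```python
import pandas as pd

# Illustrative frames (assumed data) whose columns differ and are not alphabetical.
df1 = pd.DataFrame({"B": [1, 2], "A": [3, 4]})
df2 = pd.DataFrame({"C": [5, 6], "A": [7, 8]})

# sort=False keeps columns in order of first appearance.
unsorted_cols = pd.concat([df1, df2], sort=False, ignore_index=True)
# sort=True orders the non-concatenation axis (here the columns).
sorted_cols = pd.concat([df1, df2], sort=True, ignore_index=True)

print(list(unsorted_cols.columns))
print(list(sorted_cols.columns))
```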
Append multiple dataframes
Varying dimension concatenation
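Objects of different dimension, such as a DataFrame and a Series, can be combined in one call. A minimal sketch, assuming a Series named `s1` with the name "X" (names and values are illustrative):

```python
import pandas as pd

# Illustrative two-dimensional frame and one-dimensional Series (assumed data).
df1 = pd.DataFrame({"A": ["A0", "A1", "A2"], "B": ["B0", "B1", "B2"]})
s1 = pd.Series(["X0", "X1", "X2"], name="X")

# The Series is treated as a single column; its name becomes the column label.
result = pd.concat([df1, s1], axis=1)
print(result)
```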
Appending rows to a dataframe
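One way to add a single row without DataFrame.append() is to turn the row into a one-row frame and concatenate it. This is a sketch under that assumption; `new_row` and its values are hypothetical.

```python
import pandas as pd

# Illustrative frame (assumed data).
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]})

# The new row as a Series whose index matches the column names.
new_row = pd.Series({"A": "A2", "B": "B2"})

# to_frame().T converts the Series into a one-row DataFrame.
result = pd.concat([df1, new_row.to_frame().T], ignore_index=True)
print(result)
```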
How it works with csv, json and xml
• Convert these files to pandas DataFrame objects
• Combine them with concat (or append on pandas older than 2.0)
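The two steps above can be sketched as follows. In-memory strings stand in for real files so the example is self-contained; the column names and values are invented. XML files can be loaded in a similar way with pd.read_xml (available from pandas 1.3), which is left out here because it needs an XML parser to be installed.

```python
import io

import pandas as pd

# In-memory stand-ins for files on disk (hypothetical contents).
csv_text = "name,score\nAnn,90\nBob,85\n"
json_text = '[{"name": "Cho", "score": 78}, {"name": "Dee", "score": 88}]'

# Step 1: convert each source to a DataFrame.
csv_df = pd.read_csv(io.StringIO(csv_text))
json_df = pd.read_json(io.StringIO(json_text))

# Step 2: combine the frames; ignore_index=True renumbers the rows.
combined = pd.concat([csv_df, json_df], ignore_index=True)
print(combined)
```

With real files, the io.StringIO wrappers would be replaced by file paths.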
DSA 207 - Merging
• Create two arrays A1 and A2 and convert them into pandas DataFrames. Merge the DataFrames and store the result in A2. Display A2 before and after merging.
• Merge the given csv files using pandas and display the first 10 rows and the last 15 rows.
