Python Basic for
Data Analysis
Chetan Khanzode
GUI
• Anaconda-navigator 1.8.7
– Use either of the following IDEs:
– Jupyter Notebook: It is an interactive computational environment, in which you
can combine code execution, rich text, mathematics, plots and rich media.
– Spyder is an open source cross-platform integrated development environment
for scientific programming in the Python language
• Download and install
– https://anaconda.org/anaconda/anaconda-navigator
Why Python
• A simple language
• Free and open-source
• Ease of Portability
• Extensible and Embeddable with other
languages
• A high-level, interpreted language
• Standard libraries
• Object-oriented
Keywords
Keywords are case-sensitive
Help on function
History and path setup
Run the history command (%history in IPython) to see previously executed
commands and to copy executed code.
Set up a path on the local machine so that you can refer to a script there and
run it with the %run command. You need to create
MyHelloWorld.py in that local path first.
Comments
Single-line comments start with #; everything after it on the line is ignored.
Multi-line (block) comments are commonly written as a string literal that starts with ''' and ends with '''
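A minimal sketch of both comment styles. The variable names are illustrative only and do not come from the slides.

```python
# A single-line comment: Python ignores everything after the #.
greeting = "Hello"  # a comment can also follow code on the same line

'''
A triple-quoted string that is not assigned to anything.
Python evaluates it and discards the result, so it is
commonly used as a block comment.
'''
message = greeting + ", World"
print(message)
```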
Data structures
• Data Types
– Integer, Floating point number, Strings,
Boolean Values, Date and Timestamp
• Advanced Data Types
– Tuples, Lists, Dictionaries
Integer, float, string
String
Python has very strong string processing capabilities. Subsets of a string can be
taken using the slice operator ( [ ] and [ : ] ), with indexes starting at 0 at the
beginning of the string and counting backwards from -1 at the end. Strings in Python
are immutable.
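The slicing and immutability rules can be sketched as follows. The sample string is an assumption chosen for illustration.

```python
text = "Data Analysis"

first_char = text[0]     # index 0 is the first character
last_char = text[-1]     # index -1 is the last character
first_word = text[:4]    # characters 0..3; the end index is excluded
second_word = text[5:]   # from index 5 to the end

# Strings are immutable: assigning to an index raises TypeError.
try:
    text[0] = "d"
    was_modified = True
except TypeError:
    was_modified = False

# Methods such as upper() return a new string instead of changing the original.
shouted = text.upper()
```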
Date Time
Python has a built-in datetime module for working with dates and times
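A short sketch of common datetime operations: building a timestamp, date arithmetic, formatting, and parsing. The dates are made up for illustration.

```python
from datetime import datetime, timedelta

start = datetime(2018, 7, 15, 9, 30)      # year, month, day, hour, minute
deadline = start + timedelta(days=10)     # date arithmetic with timedelta

# Format a datetime as text, and parse text back into a datetime.
formatted = deadline.strftime("%Y-%m-%d %H:%M")
parsed = datetime.strptime("2018-07-25", "%Y-%m-%d")

# Subtracting two datetimes gives a timedelta; .days is the whole-day part.
days_between = (parsed - start).days
```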
List
A list contains items separated by commas and enclosed within square brackets ([]).
The items in a list can be of different data types, and a list can be changed after it is created.
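A minimal sketch of a list holding mixed types, with a few everyday operations. The values are illustrative.

```python
mixed = [1, 2.5, "three", True]   # items of different data types

mixed.append("four")    # lists are mutable: add an item at the end
mixed[0] = 100          # replace an item by index

subset = mixed[1:3]     # slicing works as it does for strings
length = len(mixed)
```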
List (continued)
Dictionary
Python's dictionaries are a kind of hash table: associative arrays made up of key-value pairs.
Dictionaries are enclosed by curly braces ( { } ).
Values are assigned and accessed by key using square brackets ( [] ).
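A small sketch of creating, updating, and reading a dictionary. The keys and values are hypothetical.

```python
employee = {"name": "Asha", "dept": "Finance"}

employee["salary"] = 52000      # assign a new key-value pair
employee["dept"] = "Sales"      # overwrite an existing value

dept = employee["dept"]         # access a value by key
keys = list(employee.keys())    # keys keep insertion order (Python 3.7+)

# get() returns a default instead of raising KeyError for a missing key.
age = employee.get("age", "unknown")
```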
Tuple
A tuple is an immutable collection of items separated by commas, usually written inside parentheses
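A brief sketch showing tuple creation, unpacking, and what happens on an attempted modification. The values are illustrative.

```python
point = (3, 4)            # parentheses are conventional
record = 5, "five"        # the commas are what make a tuple

x, y = point              # unpack a tuple into separate variables

# Tuples are immutable: item assignment raises TypeError.
try:
    point[0] = 10
    changed = True
except TypeError:
    changed = False
```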
Data conversion
To convert an integer to a float, use the float() function.
To convert a float to an integer, use the int() function; it drops the fractional part.
To convert a value to a string, use the str() function.
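The conversions can be sketched like this. Note that int() truncates toward zero rather than rounding.

```python
as_float = float(7)         # integer to float
as_int = int(7.9)           # float to integer: the fraction is dropped
negative_int = int(-7.9)    # truncation is toward zero, not downward
as_text = str(42)           # number to string

# The same functions also parse numeric text.
parsed_int = int("15")
parsed_float = float("3.5")
```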
Operators
Import conventions
• Commonly used modules:
– import numpy as np
– import pandas as pd
– import matplotlib.pyplot as plt
Conditional if statements
Conditional – while and if in list
For loop
Function
• A function begins with the keyword def, followed by the function name and
parentheses ( ( ) ).
• Any input parameters or arguments are placed within these
parentheses, separated by commas.
• Parameters can be given default values, so that callers may omit them and
the function still behaves predictably.
• The code block within every function starts with a colon (:) and is indented.
• Python also allows you to pass any number of arguments (*argv), so you do not have
to specify the number when writing the function. This feature
is useful when dealing with lists or input data where
you do not know the number of observations beforehand.
• The statement return [expression] exits a function, optionally passing back
an expression to the caller.
• A return statement with no arguments is the same as return None.
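The points above can be sketched with three small functions. The function names are hypothetical and chosen only for illustration.

```python
def describe(name, greeting="Hello"):
    """greeting has a default value, so callers may omit it."""
    return f"{greeting}, {name}!"


def total(*argv):
    """Accept any number of positional arguments and add them up."""
    result = 0
    for value in argv:
        result += value
    return result


def log(message):
    """A bare return hands None back to the caller."""
    print(message)
    return


readings = [4, 5, 6]
readings_sum = total(*readings)   # unpack a list of unknown length
```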
Function
Exception
• An exception is an interruption that happens during the execution of a program. When such an error
occurs, Python raises an exception, which can be handled so that the program does not stop.
• We can handle exceptions using the try...except statement.
• We put our usual statements within the try block
and put all our error handlers in the except block.
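A minimal sketch of try...except, assuming a hypothetical safe_divide helper that returns None instead of stopping the program.

```python
def safe_divide(numerator, denominator):
    """Return the quotient, or None if the division cannot be done."""
    try:
        # The usual statements go in the try block.
        return numerator / denominator
    except ZeroDivisionError:
        # Handler for division by zero.
        return None
    except TypeError:
        # Handler for operands that are not numbers.
        return None


ok = safe_divide(10, 2)
by_zero = safe_divide(1, 0)
bad_type = safe_divide("a", 2)
```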
Exception
File Operation
Classes
Python is an object-oriented programming language, and a class acts as an object constructor: a blueprint for creating objects.
The __init__() Function
All classes have a function called __init__(),
which is always executed when the class is being instantiated.
Use the __init__() function to assign values to object properties.
The __init__() function is called automatically every time the class is used
to create a new object
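A short sketch of a class with __init__(). The Student class and its attributes are assumptions made for illustration.

```python
class Student:
    def __init__(self, name, age):
        # Runs automatically each time Student(...) creates an object.
        self.name = name
        self.age = age

    def describe(self):
        return f"{self.name} is {self.age} years old"


# Each call creates a separate object with its own property values.
first = Student("Ravi", 21)
second = Student("Meera", 23)
summary = first.describe()
```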
NumPy
• NumPy, short for Numerical Python, is the foundational package for
scientific computing in Python.
– NumPy provides basic numerical functions, especially for multi-dimensional
arrays and mathematical computation.
– SciPy builds on NumPy to provide features for scientific applications.
– A powerful N-dimensional array object
– Sophisticated (broadcasting) functions for performing element-wise
computations with arrays or mathematical operations between arrays
– Tools for integrating C/C++ and Fortran code
– Useful linear algebra, Fourier transform, and random number capabilities
N-dimensional array
• import numpy as np
– np is the conventional alias for numpy
– use np.array() to create an array
– use np.arange() to create an arithmetic progression array
• ndarray: The N-dimensional array
– Use the np.array() constructor to create an
array with any number of dimensions.
– np.array(object, dtype=None)
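A minimal sketch of np.array() and np.arange(), followed by a simple two-dimensional slice. The array contents are illustrative.

```python
import numpy as np

vector = np.array([1, 2, 3])                             # one dimension
matrix = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)   # two dimensions, explicit dtype

# Arithmetic progression: start, stop (excluded), step.
progression = np.arange(0, 10, 2)

# Two-dimensional slicing: row 0, columns 1 onward.
row_tail = matrix[0, 1:]

# All rows, column 0.
first_column = matrix[:, 0]
```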
N-dimensional array
Two dimension array slicing
Three Dimensional array slicing
Arithmetic progression
Array attributes
• array.ndim: The number of dimensions of this
array.
• array.shape: A tuple of the array's dimensions.
• array.dtype: The array's data type.
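The three attributes can be inspected on a small example array. The default integer dtype depends on the platform, so the sketch only checks that it is an integer type.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

dimensions = grid.ndim    # number of dimensions
shape = grid.shape        # tuple: (rows, columns) for a 2-D array
data_type = grid.dtype    # an integer dtype; the exact width is platform-dependent

print(dimensions, shape, data_type)
```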
Array Method
<Arrayname>.astype(dtype) returns a copy of the array converted to the specified data type
<Arrayname>.mean() returns the mean of the values in the array
<Arrayname>.var() returns the variance of the values in the array
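A short sketch of these three methods on made-up values. By default var() computes the population variance (it divides by N).

```python
import numpy as np

scores = np.array([1.5, 2.5, 3.5, 4.5])

as_int = scores.astype(int)   # new array; fractions are truncated
mean = scores.mean()
variance = scores.var()       # population variance (ddof=0 by default)
```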
Array Method
Copy and resize methods
Reshape Array Matrix
Array functions
Mathematical Operations
• The usual mathematical operators (+ - * /) generalize to NumPy
arrays
• For two vectors vector1 and vector2 of the same length, the “+”
operator gives an element-by-element sum; broadcasting extends this to
operands of different but compatible shapes
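A minimal sketch of element-wise arithmetic and broadcasting. The names vector1 and vector2 follow the slide; the values are illustrative.

```python
import numpy as np

vector1 = np.array([1, 2, 3])
vector2 = np.array([10, 20, 30])

elementwise_sum = vector1 + vector2       # same length: element by element
elementwise_product = vector1 * vector2

# Broadcasting: the scalar is applied to every element.
doubled = vector1 * 2

# Broadcasting: the 1-D vector is added to each row of the 2-D array.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
shifted = matrix + vector1
```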
N-dimensional array
Pandas
• import pandas as pd
• import numpy as np
• path = 'C:/BigData/Python'
• A Series is a one-dimensional array-like object containing an array of data and an associated
array of data labels, called its index.
• s1 = pd.Series([1,2,4,5,6,7])
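A brief sketch of the Series from the slide, plus a second Series with a custom index. The labels "a", "b", "c" are hypothetical.

```python
import pandas as pd

s1 = pd.Series([1, 2, 4, 5, 6, 7])    # default index: 0, 1, 2, ...
first_value = s1.iloc[0]              # access by position

# A custom index lets you access values by label.
s2 = pd.Series([10, 20, 30], index=["a", "b", "c"])
by_label = s2["b"]
labels = list(s2.index)
```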
Pandas
• Create Custom index and access series
Pandas
• Adding
Data frames
Null values
Replace null with Mean of column
Companydf['Fin_Department'] = Companydf['Fin_Department'].fillna(np.mean(Companydf.Fin_Department))
Companydf['Pur_Department'] = Companydf['Pur_Department'].fillna(np.mean(Companydf.Pur_Department))
Companydf['Sales_Department'] = Companydf['Sales_Department'].fillna(np.mean(Companydf.Sales_Department))
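The three assignments above can be run end to end on a small frame. The data below is made up for illustration, since the original Companydf contents are not shown in the slides; the loop is just a compact way of writing the same fill for each column.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each column has one missing value.
Companydf = pd.DataFrame({
    "Fin_Department": [10.0, np.nan, 30.0],
    "Pur_Department": [np.nan, 4.0, 6.0],
    "Sales_Department": [7.0, 9.0, np.nan],
})

# Replace the nulls in each column with that column's mean.
# Series.mean() skips NaN values by default.
for column in ["Fin_Department", "Pur_Department", "Sales_Department"]:
    Companydf[column] = Companydf[column].fillna(Companydf[column].mean())
```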
Data frame Missing values
Data frame missing values
Replace missing age value with mean of student age in class
Data frame missing values
• Fill null values with text
Data frame missing values
Replace missing gender values with the most frequent value
Data frame missing values
Replace missing test scores with the mean score of each gender group
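One possible way to do both fills, sketched on made-up student data. The column names and values are assumptions, and the slides may have used a different approach.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing gender and score values.
students = pd.DataFrame({
    "gender": ["F", "M", None, "F", "F", "M"],
    "score": [80.0, 60.0, 70.0, np.nan, 90.0, np.nan],
})

# Categorical column: fill with the most frequent value (the mode).
most_frequent = students["gender"].mode()[0]
students["gender"] = students["gender"].fillna(most_frequent)

# Numeric column: fill each missing score with the mean of its gender group.
group_mean = students.groupby("gender")["score"].transform("mean")
students["score"] = students["score"].fillna(group_mean)
```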
Read Excel File
Worksheet details
Column Name List
Read excel worksheet
Clean the data
Drop NaN column with axis=1
Set index ‘Year’
Charts
Charts
Read the Workbook Table1
Rows and Columns access
Rows and Columns access
Conditional Selection
Add new columns in DataFrame
Mathematical operations
Updating values in a DataFrame
Sequence creation
Remove column
Int conversion
Show Dataframe Values
Clean data and Charts
Dataframe from series
You can pass a number of data structures to DataFrame, such as an ndarray, lists,
a dict, Series, and another DataFrame
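A minimal sketch that builds a DataFrame from each kind of input. The column names and values are illustrative only.

```python
import numpy as np
import pandas as pd

year = pd.Series([2015, 2016, 2017])
sales = pd.Series([100, 120, 150])

# From a dict of Series: each Series becomes a column.
from_series = pd.DataFrame({"Year": year, "Sales": sales})

# From a dict of lists.
from_dict = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# From a two-dimensional ndarray, with column names supplied.
from_array = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["x", "y", "z"])

# From another DataFrame.
from_frame = pd.DataFrame(from_dict)
```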
Transform dataframe
Reindex
Forward fill missing values
Backfill missing values
Random seed
lambda
lambda
lambda
Sum and cumsum
describe
Pandas Datareader - quandl
Pandas Datareader
Charts
Read CSV file
Clean Data
Charts

Editor's Notes

  • #3: Anaconda Navigator is a desktop graphical user interface included in Anaconda that allows you to launch applications and easily manage conda packages, environments and channels without the need to use command line commands.
  • #4: A simple language that is easier to learn: Python has a very simple and elegant syntax. It's much easier to read and write Python programs compared to other languages like C++, Java, and C#. Python makes programming fun and allows you to focus on the solution rather than the syntax.
Free and open-source: You can freely use and distribute Python, even for commercial use. Not only can you use and distribute software written in it, you can even make changes to Python's source code. Python has a large community constantly improving it in each iteration.
Portability: You can move Python programs from one platform to another and run them without any changes. Python runs seamlessly on almost all platforms, including Windows, Mac OS X and Linux.
Extensible and embeddable: Suppose an application requires high performance. You can easily combine pieces of C/C++ or other languages with Python code. This gives your application high performance as well as scripting capabilities that other languages may not provide out of the box.
A high-level, interpreted language: Unlike in C/C++, you don't have to worry about daunting tasks like memory management, garbage collection and so on. Likewise, when you run Python code, it is automatically converted to the language your computer understands. You don't need to worry about any lower-level operations.
Large libraries to solve common tasks: Python has a number of standard libraries, which make a programmer's life much easier since you don't have to write all the code yourself, and many third-party packages are available as well. For example, to connect to a MySQL database on a web server, you can use the MySQLdb library (a third-party package) with import MySQLdb. The standard libraries are well tested and widely used, so they are unlikely to break your application.
Object-oriented: Everything in Python is an object. Object-oriented programming (OOP) helps you solve a complex problem intuitively. With OOP, you can divide complex problems into smaller sets by creating objects.
Applications of Python
Web applications: You can create scalable web apps using frameworks and CMSs (content management systems) that are built on Python. Some of the popular platforms for creating web apps are Django, Flask, Pyramid, Plone, and Django CMS. Sites like Mozilla, Reddit, Instagram and PBS are written in Python.
Scientific and numeric computing: There are numerous libraries available in Python for scientific and numeric computing. Libraries like SciPy and NumPy are used in general-purpose computing, and there are specific libraries like EarthPy for earth science, AstroPy for astronomy and so on. The language is also heavily used in machine learning, data mining and deep learning.
Creating software prototypes: Python is slow compared to compiled languages like C++ and Java. It might not be a good choice if resources are limited and efficiency is a must. However, Python is a great language for creating prototypes. For example, you can use Pygame (a library for creating games) to create your game's prototype first. If you like the prototype, you can use a language like C++ to create the actual game.
A good language for teaching programming: Python is used by many companies to teach programming to kids and newcomers. It is a good language with a lot of features and capabilities, yet it is one of the easiest languages to learn because of its simple, easy-to-use syntax.
  • #15: One of the most important built-in data structures. Python's dictionaries are a kind of hash table. They work like associative arrays and consist of key-value pairs. A dictionary key can be almost any Python type, but keys are usually numbers or strings. Values, on the other hand, can be any arbitrary Python object. Dictionaries are enclosed by curly braces ( { } ), and values can be assigned and accessed using square brackets ( [] ).
  • #42: Library highlights: A fast and efficient DataFrame object for data manipulation with integrated indexing; Tools for reading and writing data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format; Intelligent data alignment and integrated handling of missing data: gain automatic label-based alignment in computations and easily manipulate messy data into an orderly form; Flexible reshaping and pivoting of data sets; Intelligent label-based slicing, fancy indexing, and subsetting of large data sets; Columns can be inserted and deleted from data structures for size mutability; Aggregating or transforming data with a powerful group by engine allowing split-apply-combine operations on data sets; High performance merging and joining of data sets; Hierarchical axis indexing provides an intuitive way of working with high-dimensional data in a lower-dimensional data structure; Time series functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging. Even create domain-specific time offsets and join time series without losing data; Highly optimized for performance, with critical code paths written in Cython or C.
Python with pandas is in use in a wide variety of academic and commercial domains, including Finance, Neuroscience, Economics, Statistics, Advertising, Web Analytics, and more.