Python Learning Outcomes
04 May 2021
Python packages
• Import package to use: import package_name_here as shortened_name_here
Example: import numpy as np
• Packages required:
• numpy: perform numerical functions
• pandas: reading, importing, creating DataFrames
• matplotlib.pyplot: data visualisation
• seaborn: data visualisation for stats
• talib (TA-Lib): technical analysis
• bt: backtest trading strategy
The basics
• List: first item has an index value of 0
• Slicing: includes the start and goes up to (but does not include) the end; can slice with a step (must be an integer)
• my_list[start_at:end_before:step]
• Methods: .sort(), .append(), .extend(), .index(x); built-in functions: min(my_list), max(my_list)
• Array: faster in reading, storing, calculating items
• Create array: np.array()
• Attributes and functions: .shape, .size, .mean(), np.std(), np.arange(start, stop, step), np.transpose()
• Can subset array using Boolean array
• Visualisation: use matplotlib.pyplot or seaborn packages
• Boxplot for quantiles and outliers: sns.boxplot(x= ,y= , data= )
• Line plot: plt.plot()
• Scatter plot: plt.scatter()
• Histogram: plt.hist(x= , bins= ); density=True to plot a normalised density instead of raw counts (this replaces the older normed=1 argument)
• plt.show() to show graphs, plt.legend() to show legends
• Miscellaneous: color= , linestyle=' ', legend= , subplots=True, plt.xlabel(' '), plt.ylabel(' ')
• Add a vertical line on chart: ax.axvline()
• Other plot types (kind=' '): bar, barh, hist, box, kde, density, area, pie, scatter, hexbin
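The list slicing and Boolean array subsetting rules from this slide can be sketched as follows. The price values and variable names are made up for illustration.

```python
import numpy as np

prices = [10, 12, 11, 15, 14, 18]

# Slicing includes the start index and stops before the end index.
first_three = prices[0:3]       # [10, 12, 11]
every_second = prices[0:6:2]    # [10, 11, 14]

# min() and max() are built-in functions, not list methods.
lowest, highest = min(prices), max(prices)

# Arrays support vectorised maths and Boolean subsetting.
arr = np.array(prices)
above_mean = arr[arr > arr.mean()]   # keeps the values above the mean (about 13.33)
```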
Intermediate Python
• Representing time: use datetime package
• Convert datetime to string (.strftime()) and string to datetime (datetime.strptime())
• Formatting time: consult materials. Example: %A, %B %d, %Y
• datetime.now(), datetime(year, month, day, hour, minute)
• Attributes: .year, .month, .day, .hour, …
• Time delta: how much time has passed between 2 timestamps
• Create relative datetime using timedelta()
• Dictionary: store and lookup values using keys
• Create dictionary: {'key 1': 'value 1', 'key 2': 'value 2', 'key 3': 'value 3'}
• Add new keys: dictionary['key'] = 'value'
• Access values: dictionary['key'] or use the .get() method
• Delete: del dictionary['key']
• Comparison operators: ==, !=, >, <, <=, >=
• Boolean operators: and, or, not
• If statements:
• if <expression/control statement>:
    statement 1
    statement 2
    statement 3
• else: execute code when the control statement is False
• elif: only execute when the initial statement is False and the elif statement is satisfied
• Loops:
• for <variable> in <sequence>:
    statement
• while <expression>:
    statement
• Skipping an iteration: if <expression>:
    continue
• Stopping the loop: if <expression>:
    break
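A minimal sketch of the datetime, dictionary and loop-control points from this slide, using only the standard library. The tickers and prices are made up for illustration.

```python
from datetime import datetime, timedelta

# String -> datetime with strptime, datetime -> string with strftime.
trade_time = datetime.strptime("2021-05-04 09:30", "%Y-%m-%d %H:%M")
label = trade_time.strftime("%A, %B %d, %Y")    # "Tuesday, May 04, 2021"

# timedelta creates a datetime relative to another one.
one_week_later = trade_time + timedelta(days=7)
elapsed = one_week_later - trade_time            # a timedelta of 7 days

# [] raises KeyError for a missing key, .get() returns a default instead.
closes = {"AAA": 101.5, "BBB": 98.0, "CCC": None}
closes["DDD"] = 120.0
missing = closes.get("ZZZ", 0.0)

# continue skips one iteration, break stops the whole loop.
seen = []
for ticker, close in closes.items():
    if close is None:
        continue           # skip tickers without a price
    if close > 110:
        break              # stop at the first price above 110
    seen.append(ticker)
```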
Intermediate Python
• DataFrames: using pandas package, similar to
spreadsheets or tables
• Can create DataFrames from dictionary or list of lists
• Reading data: pd.read_<file type>('file name or path to file')
• File type: excel, csv, json, html, pickle, sql
• Access column:
• Use [] brackets or dot notation
• Or a list of names for multiple columns
• Access rows:
• Use slicing []
• List of booleans
• Access columns and rows in small dataset:
• iloc (by integer position)
• loc (by label/name)
• Methods: .count(), .min(), .max(), .first(), .last(), .sum(), .prod(), .mean(), .median(), .std(), .var(), .quantile()
• Note: by default (axis=0) a method aggregates down the rows and returns one result per column; with axis=1 it aggregates across the columns and returns one result per row
• Manipulating data:
• Remove column: .drop(columns=[ ], inplace=True)
• Remove row: .drop() removes rows by default
• Add multiple rows: pd.concat() (DataFrame.append() was removed in pandas 2.0)
• Operations on DataFrames:
• apply directly to column
• .map: apply the defined operation to each value of the selected column
• .apply: apply a function across rows or columns
• Checking data:
• .info(): to view data structure
• .head(): display first 5 rows
• .tail(): display last 5 rows
• .describe(): summary stats
• include= …
• percentiles=[.1, .5, .9]
• exclude= …
• Filtering data:
• Apply a comparison expression to the selected column; the result is a Boolean value for each row in that column
• .loc[boolean_result] to filter the rows that satisfy the comparison expression
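The difference between loc and iloc, the axis argument, and Boolean filtering can be sketched on a tiny DataFrame. The column names and values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"close": [10.0, 12.0, 9.0], "volume": [100, 250, 80]},
    index=["Mon", "Tue", "Wed"],
)

by_label = df.loc["Tue", "close"]    # label-based: row "Tue", column "close"
by_position = df.iloc[1, 0]          # position-based: second row, first column

# A comparison on a column yields one Boolean per row ...
mask = df["close"] > 9.5
# ... and .loc keeps only the rows where the mask is True.
filtered = df.loc[mask]

col_means = df.mean()          # axis=0 (default): one result per column
row_sums = df.sum(axis=1)      # axis=1: one result per row
```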
Importing and managing financial data
• Import and inspecting data:
• CSV: pd.read_csv('file_name.csv', na_values='n/a', parse_dates=['label of the column containing date info'])
• Excel: pd.read_excel()
• Import an entire workbook (sheet_name=None) or just 1 or 2 sheets (sheet_name= )
• Combine data from multiple worksheets:
pd.concat()
• Combine vertically and combine data based on columns
• Note: a reference column is needed
• Google Finance:
• 1st step is importing datetime functionality, then define start and end dates using date()
• Data source: 'google' (no longer supported by recent pandas-datareader releases)
• E.g. stock_data = DataReader(ticker, data_source, start, end)
• Federal Reserve (FRED):
• Series code: available on the website
• E.g. data = DataReader(series_code, data_source, start)
• Dealing with missing values:
• Drop rows with missing values: .dropna(inplace=True)
• Replace missing values with the mean: .fillna()
• Useful methods:
• .sort_values (‘column’, ascending = False)
• .set_index(): use a different column or set of values as the index
• .idxmax(): find index of max value
• .unique(): unique values as numpy array
• .div(): divide the whole column
• .nlargest(n= ): find n largest values
• .index.tolist(): convert index to list
• panel.to_frame(): convert panel to frame (Panel has since been removed from pandas)
• Why? A 2D MultiIndex is easier to work with than a panel
• .unstack(): unstack data, move from a long format to a wide format
• Methods for categorical data
• .nunique(): count the unique values or categories
• .value_counts(): how many times each value occurs
• .groupby(): group data
• .agg(): pass a list with names of stats metric
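A minimal sketch of the missing-value handling and categorical methods listed on this slide. The sector names and prices are made up for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {
        "sector": ["tech", "tech", "energy", "energy"],
        "price": [10.0, np.nan, 30.0, 50.0],
    }
)

dropped = data.dropna()                                  # rows with NaN removed
filled = data.fillna({"price": data["price"].mean()})    # NaN replaced by the column mean (30.0)

counts = data["sector"].value_counts()    # occurrences of each category
n_sectors = data["sector"].nunique()      # number of distinct categories

# One row per sector, one column per statistic passed to .agg().
summary = filled.groupby("sector")["price"].agg(["mean", "max"])
```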
Financial Trading
• Packages needed: ta-lib and bt
• Plot interactive candle sticks:
• Use plotly.graph_objects package
• go.Candlestick(x=, open=, high= , low= ,close=)
• Resample data with .resample(): hourly to daily, daily to weekly
• Important calculations:
• Daily return: .pct_change() * 100 (calculates the % change from the preceding row by default)
• SMA: .rolling(window=n).mean() or talib.SMA(data, timeperiod= )
• EMA: talib.EMA(data, timeperiod= )
• ADX: talib.ADX(high, low, close, timeperiod= )
• RSI: talib.RSI(data, timeperiod= )
• Bollinger Bands: talib.BBANDS(data, nbdevup= , nbdevdn= , timeperiod= )
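The daily return and SMA calculations above can be sketched with pandas alone. This example leaves talib out, and the closing prices are made up for illustration.

```python
import pandas as pd

close = pd.Series([100.0, 110.0, 99.0, 108.9])

# Percentage change from the preceding row; the first row has no predecessor, so it is NaN.
daily_return = close.pct_change() * 100

# Simple moving average over a 2-row window; the first window is incomplete, so it is NaN.
sma_2 = close.rolling(window=2).mean()
```

With these values the returns come out at roughly +10 %, -10 % and +10 %, which shows that a gain and a loss of the same percentage do not cancel out.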
Construct trading signal:
1. Get historical price: bt.get('ticker', start= , end= )
2. Calculate indicators
3. Create signal DataFrame
    signal = indicator_long.copy()
    signal[indicator_long.isnull()] = 0
Define strategy:
    signal[condition 1] = 1 (long signal)
    signal[condition 2] = -1 (short signal)
Plot signal, prices and indicators: create a combined DataFrame using bt.merge
4. Define signal-based strategy
    bt_strategy = bt.Strategy('strategy_name',
        [bt.algos.SelectWhere(condition),
         bt.algos.WeighEqually(),
         bt.algos.Rebalance()])
Or
    bt_strategy = bt.Strategy('strategy_name',
        [bt.algos.WeighTarget(signal),
         bt.algos.Rebalance()])
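Step 3 (building the signal DataFrame) can be sketched with pandas alone. The example assumes a short/long SMA crossover as the two conditions, which the slide does not specify; the ticker name and prices are made up, and the bt steps are left out.

```python
import pandas as pd

price = pd.DataFrame({"AAA": [10.0, 11.0, 12.0, 11.0, 9.0, 8.0]})

# Assumed indicators: a 2-row and a 3-row simple moving average.
sma_short = price.rolling(window=2).mean()
sma_long = price.rolling(window=3).mean()

# Start from a copy of the long indicator so the signal has the same shape and index.
signal = sma_long.copy()
signal[sma_long.isnull()] = 0         # no position while the indicator is undefined
signal[sma_short > sma_long] = 1      # long signal
signal[sma_short < sma_long] = -1     # short signal
```

If the two averages are ever exactly equal, that row keeps the copied indicator value, so it may be safer to initialise the signal to 0 everywhere before applying the conditions.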
Financial Trading
Backtest
bt_backtest = bt.Backtest(bt_strategy, price_data)
bt_result = bt.run(bt_backtest)
Plot the backtest PnL:
bt_result.plot(title= )
Strategy optimization: try a range of input parameter values
Define a function (saves time, no need to repeat code)
def signal_strategy(ticker, period, name, start= , end= ):
    <get historical values, calculate indicators, define signal, define signal-based strategy>
    return bt.Backtest(bt_strategy, price_data)
Call this function several times and run the backtests to find the optimal input
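The idea of wrapping a strategy in a function and trying several parameter values can be sketched without bt. Here `toy_sma_backtest` is a hypothetical pandas-only stand-in for the function that returns a bt.Backtest; the trading rule (hold when the close is above its SMA) and the prices are assumptions for illustration.

```python
import pandas as pd


def toy_sma_backtest(close: pd.Series, period: int) -> float:
    """Total return of a long-only 'close above SMA' rule (hypothetical helper, not bt)."""
    sma = close.rolling(window=period).mean()
    # Hold the asset on the row after the close is above its SMA (shift avoids look-ahead).
    position = (close > sma).astype(int).shift(1).fillna(0)
    strategy_returns = position * close.pct_change().fillna(0)
    return float((1 + strategy_returns).prod() - 1)


close = pd.Series([10.0, 11.0, 12.0, 11.0, 12.0, 13.0, 12.0, 14.0])

# Try a range of input values and keep the best-performing one.
results = {period: toy_sma_backtest(close, period) for period in (2, 3, 4)}
best_period = max(results, key=results.get)
```

Picking the best parameter on the same data used for testing tends to overfit, so the chosen value is better checked on a separate period.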
Benchmarking: can compare an active trading strategy with a buy-and-hold strategy
def buy_and_hold(ticker, name, start= , end= ):
    <get historical data>
    bt_strategy = bt.Strategy(name,
        [bt.algos.RunOnce(),
         bt.algos.SelectAll(),
         bt.algos.WeighEqually(),
         bt.algos.Rebalance()])
    return bt.Backtest(bt_strategy, price_data)
Then run backtests on the strategies and the benchmark and compare
Strategy return analysis:
• Backtest stats: resInfo = bt_result.stats
• View all stats index: print(resInfo.index)
• Stats: rate of return, CAGR, max drawdown, Calmar ratio, Sharpe ratio, Sortino ratio (yearly, monthly, daily data)
E.g. print('Compound annual growth rate: %.4f' % resInfo.loc['cagr'])
• Compare multiple strategy returns:
lookback_results = bt_result.display_lookback_returns()
print(lookback_results)
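To make two of the stats above concrete, here is a minimal sketch that computes CAGR and max drawdown by hand from an equity curve. It does not use bt, the 365.25-day year is an assumption that may differ from bt's own convention, and the equity values are made up.

```python
import pandas as pd

equity = pd.Series(
    [100.0, 120.0, 90.0, 121.0],
    index=pd.to_datetime(["2019-01-01", "2019-07-01", "2020-01-01", "2021-01-01"]),
)

total_return = equity.iloc[-1] / equity.iloc[0] - 1     # 121 / 100 - 1 = 0.21

# CAGR: annualised growth rate over the elapsed time (365.25-day years assumed).
years = (equity.index[-1] - equity.index[0]).days / 365.25
cagr = (equity.iloc[-1] / equity.iloc[0]) ** (1 / years) - 1

# Max drawdown: the worst fall from a running peak.
drawdown = equity / equity.cummax() - 1
max_drawdown = drawdown.min()                            # 90 / 120 - 1 = -0.25

print("Compound annual growth rate: %.4f" % cagr)
```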
TO DO
• Convert Unix timestamp to GMT+7 (stack overflow)
• Calculate MA (course on DataCamp)
• Find sources to import crypto data
• Find sources to import liquidation data

