Statistical functions and methods from NumPy, pandas, and SciPy

Here is a list of commonly used statistical functions and methods from NumPy, pandas, and SciPy, each with a brief description, followed by Matplotlib functions for visualization. These keywords cover many of the statistical operations and data visualization tasks you might need to perform in a Jupyter notebook.

NumPy Keywords

```python

import numpy as np

# Creating Arrays

np.array() # Create an array

# Basic Statistics

np.mean() # Compute the arithmetic mean

np.median() # Compute the median

np.std() # Compute the standard deviation (population, ddof=0, by default)

np.var() # Compute the variance (population, ddof=0, by default)

np.min() # Find the minimum value

np.max() # Find the maximum value

np.ptp() # Compute the range (peak to peak)

np.percentile() # Compute the q-th percentile

# Sum and Product

np.sum() # Compute the sum of array elements

np.prod() # Compute the product of array elements

# Cumulative Sum and Product

np.cumsum() # Compute the cumulative sum

np.cumprod() # Compute the cumulative product

# Correlation and Covariance

np.corrcoef() # Compute the correlation coefficient matrix

np.cov() # Compute the covariance matrix

```
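As a rough illustration of how several of the NumPy functions above fit together, here is a minimal sketch on a small made-up array. The data and variable names are assumptions chosen for the example. Note that `np.std()` and `np.var()` divide by `n` unless `ddof=1` is passed.

```python
import numpy as np

# Small illustrative data set (made up for this example)
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(x)                    # arithmetic mean
median = np.median(x)                # middle value
pop_std = np.std(x)                  # population standard deviation (ddof=0)
sample_std = np.std(x, ddof=1)       # sample standard deviation (ddof=1)
value_range = np.ptp(x)              # max - min
q1, q3 = np.percentile(x, [25, 75])  # first and third quartiles
running_total = np.cumsum(x)         # cumulative sum

# Correlation between x and an exact linear function of x
y = 2 * x + 1
r = np.corrcoef(x, y)[0, 1]

print(mean, median, pop_std, sample_std, value_range)
print(q1, q3, running_total[-1], r)
```

Passing `ddof=1` matters when comparing results with pandas, whose `std()` and `var()` use the sample formula by default.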

pandas Keywords

```python

import pandas as pd

# Creating DataFrame

pd.DataFrame() # Create a DataFrame

# Basic Statistics

df.mean() # Compute the mean of values

df.median() # Compute the median of values

df.mode() # Compute the mode of values

df.std() # Compute the standard deviation (sample, ddof=1, by default)

df.var() # Compute the variance (sample, ddof=1, by default)

df.min() # Find the minimum value

df.max() # Find the maximum value

df.sum() # Compute the sum of values

df.prod() # Compute the product of values

df.cumsum() # Compute the cumulative sum

df.cumprod() # Compute the cumulative product

# Descriptive Statistics

df.describe() # Generate descriptive statistics summary

df.count() # Count the number of non-NA/null values

df.quantile() # Compute the quantile

# df.mad() was removed in pandas 2.0; use (df - df.mean()).abs().mean() for the mean absolute deviation

# Correlation and Covariance

df.corr() # Compute pairwise correlation

df.cov() # Compute pairwise covariance

# Skewness and Kurtosis

df.skew() # Compute the skewness

df.kurt() # Compute the excess kurtosis (Fisher's definition; normal == 0.0)

# Grouping and Aggregation

df.groupby() # Group DataFrame using a mapper or by a Series of columns

df.aggregate() # Aggregate using one or more operations over the specified axis

df.agg() # Alias for df.aggregate()

df.transform() # Call a function on the DataFrame, returning a result with the same axis shape as the input

df.apply() # Apply a function along an axis of the DataFrame

# Missing Data Handling

df.isnull() # Detect missing values

df.notnull() # Detect non-missing values

df.fillna() # Fill NA/NaN values with a given value

df.dropna() # Remove missing values

# Ranking and Sorting

df.rank() # Compute numerical data ranks

df.sort_values() # Sort by the values along either axis

df.sort_index() # Sort DataFrame by index

```
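The pandas methods above are usually combined rather than used one at a time. The following is a minimal sketch, assuming a tiny made-up DataFrame with a `group` column and a `score` column containing one missing value; the column names and data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups, one missing score
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "score": [1.0, 3.0, 2.0, np.nan, 6.0],
})

# Missing data handling: count() ignores NaN, fillna() replaces it
n_valid = df["score"].count()
filled = df["score"].fillna(df["score"].mean())

# Grouping and aggregation: one row per group
summary = df.groupby("group")["score"].agg(["mean", "count"])

# transform() keeps the original shape: here, each score minus its group mean
centered = df.groupby("group")["score"].transform(lambda s: s - s.mean())

# Mean absolute deviation, computed by hand because df.mad() no longer exists in pandas 2.x
mad = (df["score"] - df["score"].mean()).abs().mean()

print(summary)
print(n_valid, mad)
```

Most pandas reductions skip missing values by default (`skipna=True`), which is why the mean used in `fillna()` is computed from the four valid scores only.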

SciPy Keywords

```python

from scipy import stats

# Descriptive Statistics

stats.describe() # Compute descriptive statistics

stats.skew() # Compute the skewness

stats.kurtosis() # Compute the kurtosis

stats.mode() # Compute the mode

stats.trim_mean() # Compute the trimmed mean

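# Probability Distributions
# (these are distribution objects; calling one with its parameters, e.g. stats.t(df=10), returns a frozen distribution)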

stats.norm() # Normal Distribution

stats.t() # T-Distribution

stats.chi2() # Chi-Square Distribution

stats.f() # F-Distribution

stats.expon() # Exponential Distribution

stats.binom() # Binomial Distribution

stats.poisson() # Poisson Distribution

# Statistical Tests

stats.ttest_1samp() # One-sample t-test

stats.ttest_ind() # Two-sample t-test for independent samples

stats.ttest_rel() # Paired (related samples) t-test

stats.chi2_contingency() # Chi-square test of independence for a contingency table

stats.pearsonr() # Pearson Correlation Coefficient

stats.spearmanr() # Spearman Correlation Coefficient

stats.linregress() # Linear Regression

stats.ansari() # Ansari-Bradley Test

stats.bartlett() # Bartlett Test

stats.fligner() # Fligner-Killeen Test

stats.levene() # Levene Test

stats.mannwhitneyu() # Mann-Whitney U Test

stats.kruskal() # Kruskal-Wallis Test

stats.wilcoxon() # Wilcoxon Signed-Rank Test

stats.f_oneway() # One-Way ANOVA

stats.shapiro() # Shapiro-Wilk Test

stats.ks_2samp() # Two-sample Kolmogorov-Smirnov Test

stats.anderson() # Anderson-Darling Test

stats.kstest() # Kolmogorov-Smirnov Test for Goodness of Fit (one-sample or two-sample)

```
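To show how the distribution objects and tests listed above are typically called, here is a minimal sketch using simulated data. The sample sizes, means, and random seed are assumptions made for the example, and Welch's variant of the t-test (`equal_var=False`) is one possible choice, not the only one.

```python
import numpy as np
from scipy import stats

# Simulated samples (seed and parameters are arbitrary choices for this sketch)
rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, scale=1.0, size=200)
b = rng.normal(loc=0.5, scale=1.0, size=200)

# Frozen standard normal distribution: cdf() and ppf() are inverses of each other
std_normal = stats.norm(loc=0, scale=1)
p_below = std_normal.cdf(1.96)   # probability of a value below 1.96
crit = std_normal.ppf(0.975)     # 97.5th percentile

# Two-sample t-test without assuming equal variances (Welch's t-test)
t_res = stats.ttest_ind(a, b, equal_var=False)

# Simple linear regression on exactly linear data
x = np.arange(10.0)
fit = stats.linregress(x, 3 * x + 2)

print(p_below, crit)
print(t_res.statistic, t_res.pvalue)
print(fit.slope, fit.intercept)
```

Most test functions in `scipy.stats` return a result object exposing `statistic` and `pvalue`, so the same access pattern applies to `mannwhitneyu()`, `kruskal()`, `shapiro()`, and others.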

Matplotlib Keywords (for visualization)

```python

import matplotlib.pyplot as plt

# Plotting

plt.plot() # Plot y versus x as lines and/or markers

plt.scatter() # Make a scatter plot of x vs y

plt.bar() # Make a bar plot

plt.hist() # Plot a histogram

plt.boxplot() # Make a box and whisker plot

plt.pie() # Make a pie chart

plt.errorbar() # Plot x versus y with error bars

plt.stem() # Create a stem plot

plt.step() # Make a step plot

plt.fill() # Plot filled polygons

plt.fill_between() # Fill between two horizontal curves

# Customization

plt.title() # Set a title for the axes

plt.xlabel() # Set the label for the x-axis

plt.ylabel() # Set the label for the y-axis

plt.legend() # Place a legend on the axes

plt.grid() # Configure the grid lines

plt.xlim() # Get or set the x limits of the current axes

plt.ylim() # Get or set the y limits of the current axes

# Display

plt.show() # Display a figure

# Saving Figures

plt.savefig() # Save the current figure

```
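The plotting and customization calls above are normally chained on a single figure. Below is a minimal sketch that draws a histogram of simulated data, labels it, and saves it to an in-memory buffer. The data, labels, and bin count are made up; the `matplotlib.use("Agg")` line is only there so the sketch runs without a display and can be dropped in a Jupyter notebook.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Simulated data (seed and size are arbitrary choices for this sketch)
rng = np.random.default_rng(0)
data = rng.normal(size=500)

plt.figure()
counts, bin_edges, _ = plt.hist(data, bins=20)  # histogram with 20 bins
plt.title("Histogram of simulated data")
plt.xlabel("Value")
plt.ylabel("Count")
plt.grid(True)

# Save as PNG into a buffer; pass a filename instead to write to disk
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
plt.close()
```

In a notebook, `plt.show()` would typically replace the `savefig()` call when the figure only needs to be displayed.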

Naresh Maddela

