Statistical functions and methods from NumPy, pandas, and SciPy

This is a list of statistical functions and methods from NumPy, pandas, and SciPy, plus the most common Matplotlib plotting calls, each with a brief description. Together these keywords cover most of the statistical operations and data visualization tasks you might need to perform in a Jupyter notebook.

NumPy Keywords

```python

import numpy as np

# Creating Arrays

np.array() # Create an array

# Basic Statistics

np.mean() # Compute the arithmetic mean

np.median() # Compute the median

np.std() # Compute the standard deviation

np.var() # Compute the variance

np.min() # Find the minimum value

np.max() # Find the maximum value

np.ptp() # Compute the range (peak to peak)

np.percentile() # Compute the q-th percentile

# Sum and Product

np.sum() # Compute the sum of array elements

np.prod() # Compute the product of array elements

# Cumulative Sum and Product

np.cumsum() # Compute the cumulative sum

np.cumprod() # Compute the cumulative product

# Correlation and Covariance

np.corrcoef() # Compute the correlation coefficient matrix

np.cov() # Compute the covariance matrix

```
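
As a minimal sketch of how a few of these calls fit together, the example below runs them on a small array. The sample values are hypothetical and chosen only so the results are easy to check by hand. Note that `np.std()` and `np.var()` default to the population formulas (`ddof=0`); pass `ddof=1` for the sample versions.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(data)               # arithmetic mean
median = np.median(data)           # middle value of the sorted data
std_pop = np.std(data)             # population standard deviation (ddof=0)
std_sample = np.std(data, ddof=1)  # sample standard deviation (ddof=1)
value_range = np.ptp(data)         # max minus min
q75 = np.percentile(data, 75)      # 75th percentile (linear interpolation)

# Correlation between the data and a linear transform of itself
y = 2 * data + 1
r = np.corrcoef(data, y)[0, 1]     # off-diagonal entry of the 2x2 matrix

print(mean, median, std_pop, value_range, q75, r)
```

For this data the mean is 5.0, the median is 4.5, the population standard deviation is 2.0, and the correlation with the linear transform is 1 up to floating-point rounding.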

pandas Keywords

```python

import pandas as pd

# Creating DataFrame

pd.DataFrame() # Create a DataFrame

# Basic Statistics

df.mean() # Compute the mean of values

df.median() # Compute the median of values

df.mode() # Compute the mode of values

df.std() # Compute the standard deviation

df.var() # Compute the variance

df.min() # Find the minimum value

df.max() # Find the maximum value

df.sum() # Compute the sum of values

df.prod() # Compute the product of values

df.cumsum() # Compute the cumulative sum

df.cumprod() # Compute the cumulative product

# Descriptive Statistics

df.describe() # Generate descriptive statistics summary

df.count() # Count the number of non-NA/null values

df.quantile() # Compute the quantile

(df - df.mean()).abs().mean() # Compute the mean absolute deviation (df.mad() was removed in pandas 2.0)

# Correlation and Covariance

df.corr() # Compute pairwise correlation

df.cov() # Compute pairwise covariance

# Skewness and Kurtosis

df.skew() # Compute the skewness

df.kurt() # Compute the kurtosis

# Grouping and Aggregation

df.groupby() # Group DataFrame using a mapper or by a Series of columns

df.aggregate() # Aggregate using one or more operations over the specified axis

df.agg() # Aggregate using one or more operations over the specified axis

df.transform() # Apply a function and return a result with the same shape as the input

df.apply() # Apply a function along an axis of the DataFrame

# Missing Data Handling

df.isnull() # Detect missing values

df.notnull() # Detect non-missing values

df.fillna() # Fill NA/NaN values using the specified method

df.dropna() # Remove missing values

# Ranking and Sorting

df.rank() # Compute numerical data ranks

df.sort_values() # Sort by the values along either axis

df.sort_index() # Sort DataFrame by index

```
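
The listing above uses `df` without defining it. The sketch below builds a small DataFrame and applies a few of the methods to it. The column names (`group`, `score`) and the values are hypothetical, picked only for illustration. pandas reductions such as `mean()` skip missing values by default.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "score": [1.0, 3.0, 2.0, np.nan, 6.0],
})

col_mean = df["score"].mean()   # NaN is skipped: (1 + 3 + 2 + 6) / 4
n_valid = df["score"].count()   # number of non-missing values

# Per-group mean and count of non-missing scores
summary = df.groupby("group")["score"].agg(["mean", "count"])

# Replace the missing score with the column mean
filled = df["score"].fillna(col_mean)

print(summary)
print(filled)
```

Unlike NumPy, `df.std()` and `df.var()` default to the sample formulas (`ddof=1`), so the two libraries can give different results on the same data unless `ddof` is set explicitly.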

SciPy Keywords

```python

from scipy import stats

# Descriptive Statistics

stats.describe() # Compute descriptive statistics

stats.skew() # Compute the skewness

stats.kurtosis() # Compute the kurtosis

stats.mode() # Compute the mode

stats.trim_mean() # Compute the trimmed mean

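# Probability Distributions (distribution objects; call with parameters to get a frozen distribution)

stats.norm # Normal distribution, e.g. stats.norm(loc=0, scale=1)

stats.t # Student's t-distribution, e.g. stats.t(df=10)

stats.chi2 # Chi-square distribution, e.g. stats.chi2(df=2)

stats.f # F-distribution, e.g. stats.f(dfn=5, dfd=20)

stats.expon # Exponential distribution, e.g. stats.expon(scale=1.0)

stats.binom # Binomial distribution, e.g. stats.binom(n=10, p=0.5)

stats.poisson # Poisson distribution, e.g. stats.poisson(mu=3)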

# Statistical Tests

stats.ttest_1samp() # One-sample t-test

stats.ttest_ind() # Independent two-sample t-test

stats.ttest_rel() # Paired (related-samples) t-test

stats.chi2_contingency() # Chi-square test of independence on a contingency table

stats.pearsonr() # Pearson Correlation Coefficient

stats.spearmanr() # Spearman Correlation Coefficient

stats.linregress() # Linear Regression

stats.ansari() # Ansari-Bradley test for equal scale parameters

stats.bartlett() # Bartlett's test for equal variances

stats.fligner() # Fligner-Killeen test for equal variances

stats.levene() # Levene test for equal variances

stats.mannwhitneyu() # Mann-Whitney U Test

stats.kruskal() # Kruskal-Wallis Test

stats.wilcoxon() # Wilcoxon Signed-Rank Test

stats.f_oneway() # One-Way ANOVA

stats.shapiro() # Shapiro-Wilk test for normality

stats.ks_2samp() # Two-sample Kolmogorov-Smirnov test

stats.anderson() # Anderson-Darling Test

stats.kstest() # Kolmogorov-Smirnov Test for Goodness of Fit

```
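
The following sketch shows two typical uses of `scipy.stats`: querying a frozen distribution and running a two-sample t-test. The sample sizes, means, and random seed are assumptions made for illustration, and the data is synthetic.

```python
import numpy as np
from scipy import stats

# Frozen standard normal distribution
std_normal = stats.norm(loc=0, scale=1)
p_below_zero = std_normal.cdf(0.0)   # probability mass below 0
z_975 = std_normal.ppf(0.975)        # 97.5% quantile, roughly 1.96

# Synthetic samples whose true means differ by 1 (seed fixed for repeatability)
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=0.0, scale=1.0, size=200)
sample_b = rng.normal(loc=1.0, scale=1.0, size=200)

# Independent two-sample t-test; the result exposes .statistic and .pvalue
result = stats.ttest_ind(sample_a, sample_b)

print(p_below_zero, z_975)
print(result.statistic, result.pvalue)
```

A small p-value here suggests that the two samples have different means, which matches how the synthetic data was generated.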

Matplotlib Keywords (for visualization)

```python

import matplotlib.pyplot as plt

# Plotting

plt.plot() # Plot y versus x as lines and/or markers

plt.scatter() # Make a scatter plot of x vs y

plt.bar() # Make a bar plot

plt.hist() # Plot a histogram

plt.boxplot() # Make a box and whisker plot

plt.pie() # Make a pie chart

plt.errorbar() # Plot y versus x with error bars

plt.stem() # Create a stem plot

plt.step() # Make a step plot

plt.fill() # Plot filled polygons

plt.fill_between() # Fill the area between two horizontal curves

# Customization

plt.title() # Set a title for the axes

plt.xlabel() # Set the label for the x-axis

plt.ylabel() # Set the label for the y-axis

plt.legend() # Place a legend on the axes

plt.grid() # Configure the grid lines

plt.xlim() # Get or set the x limits of the current axes

plt.ylim() # Get or set the y limits of the current axes

# Display

plt.show() # Display a figure

# Saving Figures

plt.savefig() # Save the current figure

```
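
A short sketch that combines several of these calls into one labeled histogram. The data is synthetic, and the output path (a file named `histogram.png` in the system temp directory) is a hypothetical choice for illustration.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: 500 draws from a standard normal distribution
rng = np.random.default_rng(0)
values = rng.normal(loc=0.0, scale=1.0, size=500)

plt.figure()
counts, bin_edges, _ = plt.hist(values, bins=20, label="samples")
plt.title("Histogram of synthetic data")
plt.xlabel("Value")
plt.ylabel("Count")
plt.legend()
plt.grid(True)

# Save before closing; the file name is hypothetical
out_path = os.path.join(tempfile.gettempdir(), "histogram.png")
plt.savefig(out_path)
plt.close()
```

In a Jupyter notebook, calling `plt.show()` in place of `plt.close()` displays the figure inline. Call `plt.savefig()` first, since the figure may be cleared once it has been shown.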

