Statistical functions and methods from NumPy, pandas, and SciPy
This is a reference list of statistical functions and methods from NumPy, pandas, and SciPy, plus the Matplotlib plotting calls commonly used alongside them, each with a brief description. Together these keywords cover most of the statistical operations and data visualization tasks you are likely to need in a Jupyter notebook.
NumPy Keywords
```python
import numpy as np
# Creating Arrays
np.array() # Create an array
# Basic Statistics
np.mean() # Compute the arithmetic mean
np.median() # Compute the median
np.std() # Compute the standard deviation (population by default, ddof=0)
np.var() # Compute the variance (population by default, ddof=0)
np.min() # Find the minimum value
np.max() # Find the maximum value
np.ptp() # Compute the range (peak to peak)
np.percentile() # Compute the q-th percentile
# Sum and Product
np.sum() # Compute the sum of array elements
np.prod() # Compute the product of array elements
# Cumulative Sum and Product
np.cumsum() # Compute the cumulative sum
np.cumprod() # Compute the cumulative product
# Correlation and Covariance
np.corrcoef() # Compute the correlation coefficient matrix
np.cov() # Compute the covariance matrix
```
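As a minimal sketch of how a few of these calls fit together, the example below runs them on a small made-up array. The variable names and the data are illustrative assumptions, not part of the list above.

```python
import numpy as np

# Hypothetical sample data, chosen so the results are easy to check by hand
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(data)                 # 5.0
median = np.median(data)             # 4.5
pop_std = np.std(data)               # population standard deviation (ddof=0): 2.0
sample_std = np.std(data, ddof=1)    # sample standard deviation (divides by n - 1)
value_range = np.ptp(data)           # max - min: 7.0
q75 = np.percentile(data, 75)        # 75th percentile with linear interpolation: 5.5
running_total = np.cumsum(data)      # cumulative sum, last element equals np.sum(data)

# Correlation between the data and an exact linear transform of it
y = 2.0 * data + 1.0
r = np.corrcoef(data, y)[0, 1]       # off-diagonal entry of the 2x2 matrix, close to 1.0
```

Note that `np.std()` and `np.var()` divide by `n` unless `ddof=1` is passed, whereas the pandas equivalents divide by `n - 1` by default, so the two libraries can give different numbers for the same data.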
pandas Keywords
```python
import pandas as pd
# Creating DataFrame
pd.DataFrame() # Create a DataFrame
# Basic Statistics
df.mean() # Compute the mean of values
df.median() # Compute the median of values
df.mode() # Compute the mode of values
df.std() # Compute the standard deviation (sample by default, ddof=1)
df.var() # Compute the variance (sample by default, ddof=1)
df.min() # Find the minimum value
df.max() # Find the maximum value
df.sum() # Compute the sum of values
df.prod() # Compute the product of values
df.cumsum() # Compute the cumulative sum
df.cumprod() # Compute the cumulative product
# Descriptive Statistics
df.describe() # Generate descriptive statistics summary
df.count() # Count the number of non-NA/null values
df.quantile() # Compute the quantile
df.mad() # Compute the mean absolute deviation (deprecated in pandas 1.5, removed in 2.0; use (df - df.mean()).abs().mean())
# Correlation and Covariance
df.corr() # Compute pairwise correlation
df.cov() # Compute pairwise covariance
# Skewness and Kurtosis
df.skew() # Compute the skewness
df.kurt() # Compute the excess kurtosis (Fisher's definition, normal = 0)
# Grouping and Aggregation
df.groupby() # Group DataFrame using a mapper or by a Series of columns
df.aggregate() # Aggregate using one or more operations over the specified axis
df.agg() # Alias of df.aggregate()
df.transform() # Apply a function and return a result with the same shape as the input
df.apply() # Apply a function along an axis of the DataFrame
# Missing Data Handling
df.isnull() # Detect missing values
df.notnull() # Detect non-missing values
df.fillna() # Fill NA/NaN values with a specified value
df.dropna() # Remove missing values
# Ranking and Sorting
df.rank() # Compute numerical data ranks
df.sort_values() # Sort by the values along either axis
df.sort_index() # Sort DataFrame by index
```
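The following sketch shows how several of these methods combine on a small DataFrame with one missing value. The column names and values are made up for illustration, and the mean absolute deviation is computed by hand because `df.mad()` is not available in pandas 2.0 and later.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing score
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "score": [1.0, 3.0, 2.0, np.nan, 6.0],
})

n_valid = df["score"].count()        # non-missing values only: 4
mean_score = df["score"].mean()      # NaN is skipped by default: 3.0

# Per-group mean (a: 2.0, b: 4.0)
group_means = df.groupby("group")["score"].mean()

# Same per-group mean, broadcast back to one value per original row
group_mean_per_row = df.groupby("group")["score"].transform("mean")

# Replace the missing value with the overall mean
filled = df["score"].fillna(mean_score)

# Mean absolute deviation computed manually: 1.5
mad = (df["score"] - mean_score).abs().mean()
```

`groupby(...).mean()` returns one row per group, while `groupby(...).transform("mean")` keeps the original index, which is convenient for adding a group-level statistic as a new column.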
SciPy Keywords
```python
from scipy import stats
# Descriptive Statistics
stats.describe() # Compute descriptive statistics
stats.skew() # Compute the skewness
stats.kurtosis() # Compute the kurtosis
stats.mode() # Compute the mode
stats.trim_mean() # Compute the trimmed mean
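# Probability Distributions (objects with methods such as .pdf/.pmf, .cdf, .ppf, .rvs)
stats.norm # Normal Distribution
stats.t # Student's T-Distribution (shape parameter: df)
stats.chi2 # Chi-Square Distribution (shape parameter: df)
stats.f # F-Distribution (shape parameters: dfn, dfd)
stats.expon # Exponential Distribution
stats.binom # Binomial Distribution (shape parameters: n, p)
stats.poisson # Poisson Distribution (shape parameter: mu)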
# Statistical Tests
stats.ttest_1samp() # One-sample t-test
stats.ttest_ind() # Independent Two-Sample t-test
stats.ttest_rel() # Paired (Related-Samples) t-test
stats.chi2_contingency() # Chi-Square Test of Independence for a Contingency Table
stats.pearsonr() # Pearson Correlation Coefficient
stats.spearmanr() # Spearman Correlation Coefficient
stats.linregress() # Linear Regression
stats.ansari() # Ansari-Bradley Test
stats.bartlett() # Bartlett Test
stats.fligner() # Fligner-Killeen Test
stats.levene() # Levene Test
stats.mannwhitneyu() # Mann-Whitney U Test
stats.kruskal() # Kruskal-Wallis Test
stats.wilcoxon() # Wilcoxon Signed-Rank Test
stats.f_oneway() # One-Way ANOVA
stats.shapiro() # Shapiro-Wilk Test
stats.ks_2samp() # Two-Sample Kolmogorov-Smirnov Test
stats.anderson() # Anderson-Darling Test
stats.kstest() # Kolmogorov-Smirnov Test for Goodness of Fit
```
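The sketch below illustrates three common patterns from this list: freezing a distribution with its parameters, running a two-sample test, and fitting a simple regression. The simulated samples, the seed, and the choice of Welch's variant (`equal_var=False`) are assumptions made for the example.

```python
import numpy as np
from scipy import stats

# Freeze a standard normal distribution, then query it
std_normal = stats.norm(loc=0.0, scale=1.0)
p_below_zero = std_normal.cdf(0.0)   # 0.5
z_975 = std_normal.ppf(0.975)        # approximately 1.96

# Distributions with shape parameters need them when frozen
t_dist = stats.t(df=10)
t_crit = t_dist.ppf(0.975)           # two-sided 5% critical value for 10 degrees of freedom

# Two simulated samples with different means (hypothetical data)
rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, scale=1.0, size=50)
b = rng.normal(loc=0.5, scale=1.0, size=50)

# Welch's t-test: does not assume equal variances
result = stats.ttest_ind(a, b, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue

# Linear regression on exactly linear data: slope 2, intercept 1
x = np.arange(5.0)
fit = stats.linregress(x, 2.0 * x + 1.0)
```

The test functions return result objects whose fields (here `statistic` and `pvalue`) can be read by name, which tends to be clearer than unpacking by position.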
Matplotlib Keywords (for visualization)
```python
import matplotlib.pyplot as plt
# Plotting
plt.plot() # Plot y versus x as lines and/or markers
plt.scatter() # Make a scatter plot of x vs y
plt.bar() # Make a bar plot
plt.hist() # Plot a histogram
plt.boxplot() # Make a box and whisker plot
plt.pie() # Make a pie chart
plt.errorbar() # Plot y versus x with error bars
plt.stem() # Create a stem plot
plt.step() # Make a step plot
plt.fill() # Plot filled polygons
plt.fill_between() # Fill between two horizontal curves
# Customization
plt.title() # Set a title for the axes
plt.xlabel() # Set the label for the x-axis
plt.ylabel() # Set the label for the y-axis
plt.legend() # Place a legend on the axes
plt.grid() # Configure the grid lines
plt.xlim() # Get or set the x limits of the current axes
plt.ylim() # Get or set the y limits of the current axes
# Display
plt.show() # Display a figure
# Saving Figures
plt.savefig() # Save the current figure
```
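A minimal sketch of a typical plotting sequence follows: draw, customize, then save. The simulated data and labels are made up, and the `Agg` backend line is an assumption so the script runs without a display; in a Jupyter notebook that line can be dropped and `plt.show()` used instead of closing the figure.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 200 draws from a standard normal distribution
rng = np.random.default_rng(0)
values = rng.normal(size=200)

plt.figure()
plt.hist(values, bins=20, label="sample")   # histogram with 20 bins
plt.title("Histogram of simulated values")
plt.xlabel("value")
plt.ylabel("count")
plt.legend()
plt.grid(True)

# Save the current figure to an in-memory buffer instead of a file
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
plt.close()
png_bytes = buffer.getvalue()
```

Passing a filename such as `"histogram.png"` to `plt.savefig()` writes the image to disk instead. Saving before calling `plt.show()` is the safer order, since some backends clear the figure once it has been shown.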