More than Linear Interpolation in Python

In the previous article, I presented a data extraction tool and the strategy I generally use to extract data for interpolation. I also discussed some linear interpolation methods implemented in the Python libraries NumPy and SciPy.

Python offers other great interpolation methods besides the linear ones. In this article, we are going to explore more of them using the SciPy library.

Importing Libs and Loading Data

Before starting the interpolation, we need to import the required Python libs and load the dataset used in this study.
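A minimal sketch of this step could look like the code below. The real dataset extracted from the graph is not reproduced here, so the `load_sample_data` helper and its formula are hypothetical stand-ins that only produce a smooth curve to interpolate.

```python
import numpy as np


def load_sample_data(n_points=41):
    """Return (x, y) arrays for a smooth synthetic curve.

    Hypothetical stand-in for the data extracted from the original graph.
    """
    x = np.linspace(0.0, 10.0, n_points)
    y = 1.0 + np.sin(0.5 * x) + 0.05 * x**2
    return x, y


x_full, y_full = load_sample_data()
print(x_full.shape, y_full.shape)
```

With real data, the helper would be replaced by whatever reads the extracted points, as long as the X values end up sorted in increasing order.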

This step is similar to the one performed in the previous article. Let's take a look at the data by plotting it.

This simple plot shows that the data we have here is exactly the same data we used before. In the previous article, we chose to discretize the graph finely to obtain a good linear interpolation, as the plot shows.

However, in this article, the main goal is to evaluate the other interpolation methods. Thus, we will select only a subset of the points used before.

In summary, instead of using all the data extracted from the original graph, we will use only part of it as the "input" of the interpolation function. The points that were not selected will work as control points to check the accuracy of each method.
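The selection can be sketched as follows. This is only an illustration: the synthetic curve and the spacing `step` are assumptions, and the article's actual choice of points may differ.

```python
import numpy as np

# Hypothetical stand-in for the full extracted dataset
x_full = np.linspace(0.0, 10.0, 41)
y_full = 1.0 + np.sin(0.5 * x_full) + 0.05 * x_full**2

step = 5  # assumed spacing between selected input points

# Mark every `step`-th point as input; always keep the last point so that
# every control point stays inside the interpolation range.
input_mask = np.zeros(x_full.size, dtype=bool)
input_mask[::step] = True
input_mask[-1] = True

x_in, y_in = x_full[input_mask], y_full[input_mask]        # interpolation input
x_ctrl, y_ctrl = x_full[~input_mask], y_full[~input_mask]  # control points

print(x_in.size, "input points,", x_ctrl.size, "control points")
```

Keeping both end points in the input set means that checking the control points never requires extrapolation.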

Now that we have a smaller set of points to characterize the original curve, the goal is to fit a curve that best represents this dataset using different interpolation methods.

Quadratic Interpolation

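Quadratic interpolation approximates a curve with second-order polynomial functions. Note that SciPy's interp1d with kind="quadratic" builds a second-order spline, that is, quadratic pieces joined smoothly at the data points, rather than a single polynomial for the whole dataset.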

𝑓(𝑥)=𝑎𝑥²+𝑏𝑥+𝑐

In the last article, we saw that the SciPy interpolation tool works as a pipeline with the following steps:

1 - load data

2 - train the interp1d function

3 - predict the desired values

The quadratic interpolation follows the same pipeline, which we will apply here. The idea is to compare this method with the previous one (linear interpolation) using the selected points.
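As a hedged sketch of that pipeline, the code below builds a linear and a quadratic `interp1d` function from the same input points. The synthetic input data is an assumption, and so is `fill_value="extrapolate"`, which is used here only so the functions can also be evaluated outside the input range.

```python
import numpy as np
from scipy.interpolate import interp1d

# Step 1: load data (hypothetical input points on a synthetic curve)
x_in = np.linspace(0.0, 10.0, 9)
y_in = 1.0 + np.sin(0.5 * x_in) + 0.05 * x_in**2

# Step 2: build one interpolation function per method
f_linear = interp1d(x_in, y_in, kind="linear", fill_value="extrapolate")
f_quadratic = interp1d(x_in, y_in, kind="quadratic", fill_value="extrapolate")

# Step 3: evaluate both functions at the desired X values
x_new = np.linspace(0.0, 10.0, 101)
y_linear = f_linear(x_new)
y_quadratic = f_quadratic(x_new)

print(y_linear[:3], y_quadratic[:3])
```

Both functions pass exactly through the input points; they differ only in how they fill the gaps between them.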

Now that the interpolation functions are created, we can plot their results and compare them.

From the plots above, it is possible to make some observations:

  • The interpolated data fits the input data well within its range
  • The SciPy quadratic interpolation gives a good approximation of the original data
  • The extrapolated results are extremely dependent on the input data and differ significantly from those observed in the previous article for linear extrapolation.

Given all these observations, it is recommended to avoid extrapolating data.
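One way to guard against accidental extrapolation is to rely on how `interp1d` treats out-of-range values. The sketch below, on assumed synthetic input points, shows the default behaviour (an error is raised) and an alternative that returns NaN instead.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical input points on a synthetic curve
x_in = np.linspace(0.0, 10.0, 9)
y_in = 1.0 + np.sin(0.5 * x_in) + 0.05 * x_in**2

# Default behaviour: evaluating outside [x_in[0], x_in[-1]] raises ValueError
f_strict = interp1d(x_in, y_in, kind="quadratic")
try:
    f_strict(12.0)
    raised = False
except ValueError:
    raised = True

# Alternative: return NaN outside the input range instead of raising
f_nan = interp1d(x_in, y_in, kind="quadratic",
                 bounds_error=False, fill_value=np.nan)
y_outside = f_nan(12.0)

print(raised, y_outside)
```

Either option makes out-of-range requests visible instead of silently returning an extrapolated value.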

In the previous article we also used a NumPy linear interpolation function but, as discussed there, NumPy doesn't provide a built-in quadratic interpolation method.

Cubic Interpolation

As an alternative to the linear and quadratic methods, it is possible to perform a cubic interpolation. The cubic interpolation method approximates a curve with third-order polynomial functions.

𝑓(𝑥)=𝑎𝑥³+𝑏𝑥²+𝑐𝑥+𝑑

In this section we will only present the SciPy cubic interpolation. 
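A minimal sketch of the cubic case, again on assumed synthetic data, only changes the `kind` argument of `interp1d`. The deviation printed at the end is an illustrative check against the full dataset, not the error metric used later in the article.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical full dataset and a sparser set of input points
x_full = np.linspace(0.0, 10.0, 41)
y_full = 1.0 + np.sin(0.5 * x_full) + 0.05 * x_full**2
x_in, y_in = x_full[::5], y_full[::5]  # every 5th point, both ends included

# Build the cubic interpolation function and evaluate it on the full X grid
f_cubic = interp1d(x_in, y_in, kind="cubic", fill_value="extrapolate")
y_cubic = f_cubic(x_full)

# Largest absolute deviation from the full dataset
max_dev = float(np.max(np.abs(y_cubic - y_full)))
print(f"largest deviation from the full dataset: {max_dev:.4f}")
```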

As we can see, the quadratic and cubic interpolation methods represented the original data slightly better than the linear method for this dataset.

Besides that, each method produced a different representation of the out-of-bounds region.

Spline methods - Cubic

Spline methods are commonly used in CAD tools and engineering software nowadays.

Spline methods work by applying a series of low-degree polynomial functions to small subsets of the data, instead of applying a single high-degree polynomial function to the whole set.

The cubic spline is the most common spline method. It avoids interpolation problems such as the oscillations that a single high-degree polynomial can produce between points, and it provides a smoother function with smaller errors.

As a great mathematical lib, SciPy implements this method.

From now on, we will not extrapolate values, even though this method allows it, since we don't have enough data to compare against.

Now, it is necessary to follow the pipeline described before.
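The sketch below applies that pipeline with `CubicSpline` on assumed synthetic input points. Passing `extrapolate=False` is a choice made here to match the decision not to extrapolate; it is not necessarily what the original code did.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical input points on a synthetic curve
x_in = np.linspace(0.0, 10.0, 9)
y_in = 1.0 + np.sin(0.5 * x_in) + 0.05 * x_in**2

# Build the spline; with extrapolate=False, out-of-range values become NaN
cs = CubicSpline(x_in, y_in, extrapolate=False)

# One cubic polynomial per interval: 4 coefficients for each of the
# len(x_in) - 1 segments between consecutive input points
n_coeffs, n_segments = cs.c.shape

# Evaluate inside the input range, and once outside it
x_new = np.linspace(0.0, 10.0, 101)
y_spline = cs(x_new)
y_outside = cs(12.0)

print(n_coeffs, n_segments, y_outside)
```

The shape of `cs.c` makes the piecewise idea concrete: the curve is stored as a separate set of cubic coefficients for each segment, not as one global polynomial.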

Unlike the other interpolation methods, the spline curve built from the smaller set did not reproduce the complete dataset well. However, with additional points (the full set), the spline method fit the original curve well.

Akima Interpolation

Continuing the exploration, I present the last method we will discuss here: the Akima interpolation method (BTW, thanks to Mr. Daniel P. Raymer for the suggestion).

The Akima interpolation, as described in its SciPy reference, uses a continuously differentiable sub-spline built from piecewise cubic polynomials. As a result, the curve is smooth and looks natural.

As usual, we will follow the same pipeline and, after that, plot the results for comparison.
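A minimal sketch with `Akima1DInterpolator`, assuming the same kind of synthetic input points used for illustration elsewhere, follows the same three steps. The curve is evaluated only inside the input range, consistent with the decision not to extrapolate.

```python
import numpy as np
from scipy.interpolate import Akima1DInterpolator

# Hypothetical input points on a synthetic curve
x_in = np.linspace(0.0, 10.0, 9)
y_in = 1.0 + np.sin(0.5 * x_in) + 0.05 * x_in**2

# Build the Akima interpolator from the input points
akima = Akima1DInterpolator(x_in, y_in)

# Evaluate only inside the range covered by the input points
x_new = np.linspace(x_in[0], x_in[-1], 101)
y_akima = akima(x_new)

print(y_akima[:3])
```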

Again, with a good number of points to fit the curve, the Akima method also fit the original curve well. Moreover, with fewer points to perform the interpolation, the Akima method performed as well as (or even better than) the other methods.

Interpolation comparison

Before finishing this article, someone may ask:

"Wait! Which method is the best one?"

Don't worry about that now. Save your worries for when you use your own data and application. Any answer presented here would be valid only for this dataset.

To answer this question, we would need to extract more points than those available and use an input dataset of a size similar to the current full dataset.

So ... DON'T PANIC!

We made an extra data extraction from this plot! This new data is much denser than the earlier one.

Because of that, it was possible to select more points. With these points, we can build each method under conditions closer to its best.

Once we have the data, let's split it into "training" and testing data to compare the methods for this case.

Now, it's interpolation time!!!

For each method presented, we will interpolate the data at the original X values and calculate the error.

Once the interpolation is calculated, we can finally determine the error. For this study, we will calculate the square root of the squared difference between the original and interpolated data, divided by the original data and multiplied by 100. This gives the error as a percentage.
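The whole comparison can be sketched as below. Everything about the data is an assumption (a synthetic curve, with every 4th point used for "training"), and the helper `percent_error` is a hypothetical name. It takes the absolute value of the original data in the denominator so the percentage stays non-negative, which matches the description above whenever the original data is positive.

```python
import numpy as np
from scipy.interpolate import Akima1DInterpolator, CubicSpline, interp1d


def percent_error(y_true, y_pred):
    """Point-wise error in percent: sqrt((y_true - y_pred)^2) / |y_true| * 100."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt((y_true - y_pred) ** 2) / np.abs(y_true) * 100.0


# Hypothetical dense dataset standing in for the extra extraction
x_full = np.linspace(0.0, 10.0, 81)
y_full = 1.0 + np.sin(0.5 * x_full) + 0.05 * x_full**2

# "Training" points: every 4th point (both ends included); the rest are for testing
train = np.zeros(x_full.size, dtype=bool)
train[::4] = True
x_train, y_train = x_full[train], y_full[train]
x_test, y_test = x_full[~train], y_full[~train]

# Build every method from the same training points
methods = {
    "linear": interp1d(x_train, y_train, kind="linear"),
    "quadratic": interp1d(x_train, y_train, kind="quadratic"),
    "cubic": interp1d(x_train, y_train, kind="cubic"),
    "cubic spline": CubicSpline(x_train, y_train),
    "akima": Akima1DInterpolator(x_train, y_train),
}

# Mean percent error of each method on the test points
mean_errors = {
    name: float(np.mean(percent_error(y_test, f(x_test))))
    for name, f in methods.items()
}

for name, err in mean_errors.items():
    print(f"{name:>12s}: {err:.4f} %")
```

The ranking printed by this sketch applies only to the synthetic curve; with other data the order of the methods may change.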

As we can see, the error difference between these methods for this dataset is small. Finally, we can check the graphical difference between them.

As we can see, for this dataset, the interpolation method with the smallest error was the quadratic interpolation. But this observation may change if the dataset changes.

Conclusion

SciPy is an amazing Python scientific lib that provides several built-in interpolation methods (not only linear interpolation) that can be used in almost any study that demands interpolation.

The method must be selected carefully to avoid undesirable extrapolation, bad fitting, or data misrepresentation. Because of that, it is always important to check whether the chosen interpolation method is representative of the analyzed data.

Bibliography

- Linear interpolation in Python

- Wikipedia - Spline Interpolation

- Wikiversity - Cubic Spline Interpolation

- SciPy - interpolate

- SciPy - interp1d

- SciPy - CubicSpline

- SciPy - Akima1DInterpolator

- AGARD 264


Great walkthrough! Thanks for posting. Nice to see the Akima method too!

Doug Greenwell

Consulting Aerodynamicist

3y

I know it's curve-fitting, rather than interpolation, but I think this cartoon from Randall Munroe has some useful messages for those of us who spend a lot of time trying to fit a numerical model to experimental data (especially when you have what my old boss at RR called a 'starry starry night' plot) https://xkcd.com/2048/

stefano destefanis

MSc mechanical engineer | vibroacoustics | structural dynamics | space cae | hardware qualification

3y

If it can be helpful, I have used for quite some time the constrained spline interpolation method, which worked really well (for me, at least). Visually, the resulting interpolation is very similar (no overshoot at interpolated points). More information on the approach (and a VBA implementation) can be found here: https://pages.uoregon.edu/dgavin/software/spline.pdf

Daiane Klein

Senior Data Analyst | Analytics Engineer | Data Scientist | BI Specialist

3y

Sensational! 👏🏻

Rodrigo Gosling

Data Analyst | Business Intelligence | Power BI | Tableau | SQL | Python

3y

Once again exceeding expectations. Congratulations, Álvaro Carnielo e Silva
