Skip to content Skip to sidebar Skip to footer

Polynomial Regression Values Generated Too Far From The Coordinates

As per the the below code for Polynomial Regression coefficients value, when I calculate the regression value at any x point. Value obtained is way more away from the equivalent y

Solution 1:

The polynomial fit performs as expected. There is no error here, just a great deviation in your data. You might want to rescale your data though. If you add the parameter full=True to np.polyfit, you will receive additional information, including the residuals which essentially is the sum of the square fit errors. See this other SO post for more details.

import matplotlib.pyplot as plt
import  numpy as np

x = [0,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100]
y = [0,885,3517,5935,8137,11897,10125,13455,14797,15925,16837,17535,18017,18285,18328,18914,19432,19879,20249,20539,20746]

m = max(y)
y = [p/m for p in y] # rescaled y such that max(y)=1, and dimensionless

z, residuals, rank, sing_vals, cond_thres = np.polyfit(x,y,3,full=True)

print("Z: ",z) # [ 9.23914285e-07 -2.67082878e-04  2.79497972e-02 -5.39544708e-02]print("resi:", residuals) # 0.02188 : quite decent, depending on WHAT you're measuring ..

Z = [z[3] + q*z[2] +  q*q*z[1] + q*q*q*z[0] for q in x]

fig = plt.figure()
ax = fig.add_subplot(111)

ax.scatter(x,y)
ax.plot(x,Z,'r')
plt.show()

enter image description here

Solution 2:

After I reviewed the answer of @Magnus, I reduced the limits used for the data in a 3rd order polynomial. As you can see, the points within my crudely drawn red circle cannot both lie on a smooth line with the nearby data. While I could fit smooth lines such as a Hill sigmoidal equation through the data, the data variance (noise) itself appears to be the limiting factor in achieving a peak absolute error of 150 with this data set.

plot

Post a Comment for "Polynomial Regression Values Generated Too Far From The Coordinates"