python - Why is my SGD so far off from my linear regression model?
I'm trying to compare linear regression (normal equation) with SGD, but it looks like SGD is far off. Am I doing something wrong?
Here's my code:
import numpy as np
from scipy import stats

x = np.random.randint(100, size=1000)
y = x * 0.10
slope, intercept, r_value, p_value, std_err = stats.linregress(x=x, y=y)
print("slope %f , intercept %s" % (slope, intercept))
# slope 0.100000 , intercept 1.61435309565e-11
And here's the SGD version:
from sklearn import linear_model

x = x.reshape(1000, 1)
clf = linear_model.SGDRegressor()
clf.fit(x, y, coef_init=0, intercept_init=0)
print(clf.intercept_)
print(clf.coef_)
# [ 1.46746270e+10]
# [ 3.14999003e+10]
I would have thought coef_ and intercept_ would come out the same, since the data is perfectly linear.
When I tried to run your code, I got an overflow error. I suspect you're having the same problem, but for some reason it's not throwing an error.
If you scale down the features, everything works as expected. Using scipy.stats.linregress:
>>> x = np.random.random(1000) * 10
>>> y = x * 0.10
>>> slope, intercept, r_value, p_value, std_err = stats.linregress(x=x, y=y)
>>> print("slope %f , intercept %s" % (slope, intercept))
slope 0.100000 , intercept -2.22044604925e-15
Using linear_model.SGDRegressor:
>>> clf = linear_model.SGDRegressor()
>>> clf.fit(x[:, None], y)
SGDRegressor(alpha=0.0001, epsilon=0.1, eta0=0.01, fit_intercept=True,
       l1_ratio=0.15, learning_rate='invscaling', loss='squared_loss',
       n_iter=5, penalty='l2', power_t=0.25, random_state=None,
       shuffle=False, verbose=0, warm_start=False)
>>> print("slope %f , intercept %s" % (clf.coef_, clf.intercept_[0]))
slope 0.099763 , intercept 0.00163353754797
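For completeness, here's a sketch of how you could keep your original randint features and still get a sane fit by standardizing them first. (StandardScaler and the mapping back to the original units are my additions, not part of your code; exact numbers will vary run to run.)

import numpy as np
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler

x = np.random.randint(100, size=1000).reshape(1000, 1).astype(float)
y = (x * 0.10).ravel()

# Standardize to zero mean and unit variance so SGD's updates stay stable.
scaler = StandardScaler()
x_scaled = scaler.fit_transform(x)

clf = linear_model.SGDRegressor()
clf.fit(x_scaled, y)

# Undo the scaling to recover the slope/intercept in the original units:
# y ~ w * (x - mean) / scale + b  =>  slope = w / scale, intercept = b - slope * mean.
slope = clf.coef_[0] / scaler.scale_[0]
intercept = clf.intercept_[0] - slope * scaler.mean_[0]
print("slope %f , intercept %f" % (slope, intercept))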
The value of the slope is a little lower, but I'd guess that's because of the regularization.
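If the regularization really is what's pulling the slope down, turning it off should close the gap. A minimal sketch to check that guess (alpha=0.0 effectively disables the default L2 penalty; the data setup mirrors the scaled example above):

import numpy as np
from sklearn import linear_model

x = (np.random.random(1000) * 10).reshape(-1, 1)
y = (x * 0.10).ravel()

# alpha=0.0 zeroes out the penalty term, so SGD fits the unregularized
# least-squares objective and the slope should land closer to 0.10.
clf = linear_model.SGDRegressor(alpha=0.0)
clf.fit(x, y)
print("slope %f , intercept %f" % (clf.coef_[0], clf.intercept_[0]))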