Simple Linear Regression without Gradient Descent Optimization Jan 1, 2019
Prose
Tech
Python
AI/ML Linear Regression in a different way Trying to implement linear regression for a simple dataset without using existing regression library functions or the Gradient Decent Technique
Theory Regression A method of establishing relationship between 1 or more independant variables against one dependant variable
Linear Regression Establishing a Linear relationship
Let’s try implementing the same:
Import the Google Drive to access your dataset
1
2
3
from google.colab import drive
drive. mount('/content/drive' )
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
1
2
import numpy as np
import pandas as pd
Reading the data
1
df = pd. read_csv('/content/drive/My Drive/slr_data.csv' )
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
1
2
import matplotlib.pyplot as plt
import seaborn as sns
1
2
3
4
5
plt. figure(figsize = (20 ,10 ))
sns. relplot(x = 'x' , y = 'y' , data = df)
plt. show()
<Figure size 1440x720 with 0 Axes>
Linear Regression is all about finding a straight line that has the least Root Mean Square Value for the given data points => the line is to be existing between the extreme Y values to make sure the line is between the points in a given dataset
105.5918375
-3.4678837889999996
1
y_mid = (y_max + y_min) / 2
51.0619768555
87
55
1
y2 = df. y[df. x. idxmax()]
1
y1 = df. y[df. x. idxmin()]
105.5918375
-1.040114209
1
slope_m = (y2 - y1) / (x2 - x1)
1.06631951709
1
2
def lin_equ (x):
return slope_m* (x - (df. x[df. y. idxmax()] + df. x[df. y. idxmin()]) / 2 ) + (df. y[df. y. idxmax()] + df. y[df. y. idxmin()]) / 2
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
79.85260381693
20.13871085989
21.20503037698
19.0723913428
36.13350361624
1
df['y_pred' ] = pd. Series(map(lin_equ, df. x. values))
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
1
2
3
4
5
plt. figure(figsize = (20 ,15 ))
sns. relplot(x = 'x' , y = 'y_pred' , data = df)
plt. show()
<Figure size 1440x1080 with 0 Axes>
1
2
3
4
5
6
7
plt. figure(figsize = (20 ,15 ))
sns. relplot(x = 'x' , y = 'y' , data = df)
sns. lineplot(x = 'x' , y = 'y_pred' , data = df)
plt. show()
<Figure size 1440x1080 with 0 Axes>
1
neg_err_sum = sum([i* i for i in df. y_pred- df. y if i < 0 ])
1
pos_err_sum = sum([i* i for i in df. y_pred- df. y if i >= 0 ])
1228.4737143905872
2396.9837025596657