Simple Linear Regression without Gradient Descent Optimization

Jan 1, 2019 Prose Tech Python AI/ML

Linear Regression in a different way

Trying to implement linear regression for a simple dataset without using existing regression library functions or the Gradient Decent Technique

Theory

Regression

A method of establishing relationship between 1 or more independant variables against one dependant variable

Linear Regression

Establishing a Linear relationship

Let’s try implementing the same:

Import the Google Drive to access your dataset

1
2
3
from google.colab import drive

drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

1
2
import numpy as np
import pandas as pd

Reading the data

1
df = pd.read_csv('/content/drive/My Drive/slr_data.csv')

1
df.head()

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

1
df.describe()

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

1
2
import matplotlib.pyplot as plt
import seaborn as sns

1
2
3
4
5
plt.figure(figsize = (20,10))

sns.relplot(x = 'x', y = 'y', data = df)

plt.show()

<Figure size 1440x720 with 0 Axes>

png

Linear Regression is all about finding a straight line that has the least Root Mean Square Value for the given data points => the line is to be existing between the extreme Y values to make sure the line is between the points in a given dataset

1
y_max = df.y.max()

1
y_min = df.y.min()

1
y_max

105.5918375

1
y_min

-3.4678837889999996

1
y_mid = (y_max + y_min) / 2

1
y_mid

51.0619768555

1
x2 = df.x.max()

1
x1 = df.x.min()

1
df.x.idxmax()

1
df.x.idxmin()

1
y2 = df.y[df.x.idxmax()]

1
y1 = df.y[df.x.idxmin()]

1
y2

105.5918375

1
y1

-1.040114209

1
slope_m = (y2 - y1) / (x2 - x1)

1
slope_m

1.06631951709

1
2
def lin_equ(x):
    return slope_m*(x - (df.x[df.y.idxmax()] + df.x[df.y.idxmin()]) / 2) + (df.y[df.y.idxmax()] + df.y[df.y.idxmin()]) / 2

1
df.head()

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

1
lin_equ(77)

79.85260381693

1
lin_equ(21)

20.13871085989

1
lin_equ(22)

21.20503037698

1
lin_equ(20)

19.0723913428

1
lin_equ(36)

36.13350361624

1
df['y_pred'] = pd.Series(map(lin_equ, df.x.values))

1
df.head()

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

1
2
3
4
5
plt.figure(figsize = (20,15))

sns.relplot(x = 'x', y = 'y_pred', data = df)

plt.show()

<Figure size 1440x1080 with 0 Axes>

png

1
2
3
4
5
6
7
plt.figure(figsize = (20,15))

sns.relplot(x = 'x', y = 'y', data = df)

sns.lineplot(x = 'x', y = 'y_pred', data = df)

plt.show()

<Figure size 1440x1080 with 0 Axes>

png

1
neg_err_sum = sum([i*i for i in df.y_pred-df.y if i < 0])

1
pos_err_sum = sum([i*i for i in df.y_pred-df.y if i >= 0])

1
neg_err_sum

1228.4737143905872

1
pos_err_sum

2396.9837025596657

Gautham's

Posts

Tags

About

Resume

Simple Linear Regression without Gradient Descent Optimization

Linear Regression in a different way

Theory

Regression

Linear Regression

Wanna reach out? Email me!