A bike rental store has provided an Excel file that records the number of rentals together with a set of weather features. The dataset contains 3 features and 732 samples. The features are temperature, humidity, and windspeed. More bike rentals are expected when the temperature is moderate (neither cold nor hot), humidity is low, and windspeed is low.
Requirements:
In this assignment you will develop a linear regression model with multiple features. You may use the scikit-learn API. Make sure to split your data into training and test sets randomly, so that the split does not introduce bias. Here is a scikit-learn example of how to split your data into training and test sets:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1234)
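As a minimal sketch of the split above, the snippet below builds a synthetic stand-in for the dataset, since the Excel file itself is not shown here. The column names humidity and windspeed, the value ranges, and the linear formula used to generate rentals are assumptions for illustration only. In the real assignment the DataFrame would be loaded from the provided Excel file (for example with pandas.read_excel).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic placeholder data (NOT the real bike rental file).
# Ranges and the generating formula are assumptions for illustration.
rng = np.random.default_rng(0)
n_samples = 732
bikerentals = pd.DataFrame({
    "temperature": rng.uniform(0, 40, n_samples),
    "humidity": rng.uniform(20, 100, n_samples),
    "windspeed": rng.uniform(0, 40, n_samples),
})
bikerentals["rentals"] = (
    500
    + 10 * bikerentals["temperature"]
    - 3 * bikerentals["humidity"]
    - 5 * bikerentals["windspeed"]
    + rng.normal(0, 20, n_samples)
)

# Feature matrix x (3 columns) and target vector y.
x = bikerentals[["temperature", "humidity", "windspeed"]]
y = bikerentals["rentals"]

# Random split; a fixed random_state makes the split reproducible.
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1234)
print(x_train.shape, x_test.shape)
```

With the default test_size of 0.25, the 732 samples are divided into 549 training rows and 183 test rows. Passing test_size explicitly is a reasonable alternative if a different ratio is wanted.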
Visualize your data with matplotlib scatter plots, for example through the pandas plotting interface:
bikerentals.plot(kind='scatter', x='temperature', y='rentals')
Plot temperature on the x axis and rentals on the y axis.
Plot humidity on the x axis and rentals on the y axis.
Plot windspeed on the x axis and rentals on the y axis.
From these visualizations, explain whether you see any correlation between each individual feature and rentals. Explain clearly for each feature.
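The three scatter plots can be sketched as follows, together with a Pearson correlation per feature as a numeric cross-check on what the plots suggest. The data here is a synthetic placeholder, and the column names humidity and windspeed, the output file name rentals_scatter.png, and the generating formula are all assumptions. Note that Pearson correlation only measures linear association, so a feature such as temperature, where moderate values are expected to be best, may show a weak coefficient on real data even if the plot reveals a clear pattern.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic placeholder data (NOT the real bike rental file).
rng = np.random.default_rng(0)
n_samples = 732
bikerentals = pd.DataFrame({
    "temperature": rng.uniform(0, 40, n_samples),
    "humidity": rng.uniform(20, 100, n_samples),
    "windspeed": rng.uniform(0, 40, n_samples),
})
bikerentals["rentals"] = (
    500
    + 10 * bikerentals["temperature"]
    - 3 * bikerentals["humidity"]
    - 5 * bikerentals["windspeed"]
    + rng.normal(0, 20, n_samples)
)

features = ["temperature", "humidity", "windspeed"]

# One scatter plot per feature, each with rentals on the y axis.
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, feature in zip(axes, features):
    bikerentals.plot(kind="scatter", x=feature, y="rentals", ax=ax)
    ax.set_title(f"rentals vs. {feature}")
fig.tight_layout()
fig.savefig("rentals_scatter.png")  # hypothetical output file name

# Pearson correlation of each feature with rentals.
correlations = bikerentals[features].corrwith(bikerentals["rentals"])
print(correlations)
```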
After training your linear regression model on the training set, evaluate it on the test set.
Report your results and the accuracy of your results.
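One possible way to train and evaluate the model is sketched below. For a regression task, "accuracy" is usually reported as the R^2 score together with an error measure such as RMSE; choosing these two metrics is an assumption about what the assignment expects. The data is again a synthetic placeholder with assumed column names, so the printed numbers say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data (NOT the real bike rental file).
rng = np.random.default_rng(0)
n_samples = 732
bikerentals = pd.DataFrame({
    "temperature": rng.uniform(0, 40, n_samples),
    "humidity": rng.uniform(20, 100, n_samples),
    "windspeed": rng.uniform(0, 40, n_samples),
})
bikerentals["rentals"] = (
    500
    + 10 * bikerentals["temperature"]
    - 3 * bikerentals["humidity"]
    - 5 * bikerentals["windspeed"]
    + rng.normal(0, 20, n_samples)
)

x = bikerentals[["temperature", "humidity", "windspeed"]]
y = bikerentals["rentals"]
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1234)

# Fit on the training set only.
model = LinearRegression()
model.fit(x_train, y_train)

# Evaluate on the held-out test set.
y_pred = model.predict(x_test)
r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Test R^2:  {r2:.3f}")
print(f"Test RMSE: {rmse:.3f}")
```

model.score(x_test, y_test) returns the same R^2 value and can be used as a shorter alternative.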
From the results you obtained from the scikit-learn linear regression model, write your final hypothesis function using the fitted coefficients (the parameters, called Thetas in class). Note: in scikit-learn, your model's intercept_ attribute is Theta0, and the coef_ attribute is an array of the other Theta parameters (coefficients), e.g. Theta1, Theta2, etc. Based on these values, clearly write your hypothesis function.
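With three features the hypothesis has the form h(x) = Theta0 + Theta1*temperature + Theta2*humidity + Theta3*windspeed, where the order of Theta1 to Theta3 follows the column order of the feature matrix passed to fit. The sketch below reads the parameters from a fitted model, prints the hypothesis, and checks that evaluating it by hand matches model.predict. It uses synthetic placeholder data with assumed column names, so the printed values are not the answer for the real file.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic placeholder data (NOT the real bike rental file).
rng = np.random.default_rng(0)
n_samples = 732
bikerentals = pd.DataFrame({
    "temperature": rng.uniform(0, 40, n_samples),
    "humidity": rng.uniform(20, 100, n_samples),
    "windspeed": rng.uniform(0, 40, n_samples),
})
bikerentals["rentals"] = (
    500
    + 10 * bikerentals["temperature"]
    - 3 * bikerentals["humidity"]
    - 5 * bikerentals["windspeed"]
    + rng.normal(0, 20, n_samples)
)

x = bikerentals[["temperature", "humidity", "windspeed"]]
y = bikerentals["rentals"]
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1234)

model = LinearRegression().fit(x_train, y_train)

# Theta0 is the intercept; Theta1..Theta3 follow the column order of x.
theta0 = model.intercept_
theta1, theta2, theta3 = model.coef_

print(
    f"h(x) = {theta0:.3f} {theta1:+.3f}*temperature "
    f"{theta2:+.3f}*humidity {theta3:+.3f}*windspeed"
)

# Evaluate the hypothesis by hand and compare with the model's predictions.
manual_pred = theta0 + x_test.to_numpy() @ model.coef_
print(np.allclose(manual_pred, model.predict(x_test)))
```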
Deliverables:
Your Python code. Name your file: YourFullName_HW3_CPSC4350.py
A Word or PDF document with your answers to the questions, including the 3 visualization plots.
Name the file: YourFullName_HW3_CPSC4350.docx or
YourFullName_HW3_CPSC4350.pdf
CPSC 4370 Artificial Intelligence
Please answer as soon as possible and correctly.