Logistic regression is the appropriate regression analysis to conduct when the dependent variable is dichotomous (binary). Like all regression analyses, logistic regression is a predictive analysis. It is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval, or ratio-level independent variables.
Since logistic regression predicts a binary outcome, we replaced the finishing column with the forvet column, which only takes the values 0 and 1.

Then, we applied a 70/30 train-test split to our data and fitted our model.
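The split-and-fit step above can be sketched as follows. This is a minimal illustration, not the actual project code: the feature matrix and labels here are randomly generated stand-ins for the player attributes and the forvet column.

```python
# Hypothetical sketch of the 70/30 split and logistic regression fit.
# X and y below are synthetic placeholders for the real player data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                          # placeholder player attributes
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)   # stand-in forvet labels (0/1)

# 70/30 train-test split, as described in the text
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)   # fraction of correct predictions
```

`model.score` on the held-out 30% is where an accuracy figure like the ones reported below would come from.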
Version #1
You can see the details of dataset used for Version #1 in “Data Manipulation” part.
Results are below:

Accuracy Score: 95%
Let's interpret this classification report.
The precision is the ratio tp / (tp + fp), where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative. So we achieved a very good precision of 95 percent.
The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.
The support is the number of occurrences of each class in y_true.
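The precision and recall formulas above can be verified directly against scikit-learn on a tiny made-up example (the labels here are illustrative, not from our dataset):

```python
# Computing precision and recall by hand from the tp/fp/fn counts,
# then checking against scikit-learn's metrics.
from sklearn.metrics import precision_score, recall_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0]   # illustrative true labels
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]   # illustrative predictions

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)   # ability not to label negatives as positive
recall = tp / (tp + fn)      # ability to find all the positive samples
```

Both hand-computed values match `precision_score` and `recall_score`, which is what `classification_report` prints per class.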


According to our confusion matrix, there were a total of 45 + 4 = 49 players whose forvet value was 0, and our model correctly classified 45 of them as 0. There were also a total of 41 + 0 = 41 players whose forvet value was 1, and our model correctly classified all 41 of them as 1.
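Reading the matrix with the numbers quoted above (rows are true classes, columns are predicted classes, following the scikit-learn convention), the diagonal holds the correct classifications and the overall accuracy falls out directly:

```python
# The confusion matrix described in the text, written out explicitly.
import numpy as np

cm = np.array([[45, 4],    # true forvet = 0: 45 classified correctly, 4 as 1
               [0, 41]])   # true forvet = 1: all 41 classified correctly

correct = np.trace(cm)     # sum of the diagonal: 45 + 41
total = cm.sum()           # all 90 players in the test set
accuracy = correct / total
```

This gives roughly 0.956, consistent with the reported 95% accuracy score.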

Version #2
You can see the details of dataset used for Version #2 in “Data Manipulation” part.
Since having too many features can cause overfitting, we also wanted to try the model with fewer features.
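Trimming the feature set typically amounts to selecting a subset of columns before refitting. The sketch below uses hypothetical column names (the actual columns are described in the "Data Manipulation" part):

```python
# Sketch of refitting on a reduced feature set; column names are placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({
    "pace":     [70, 85, 60, 90, 65, 88, 72, 80],
    "shooting": [60, 90, 50, 92, 55, 89, 58, 85],
    "passing":  [75, 70, 80, 68, 78, 72, 74, 69],
    "forvet":   [0, 1, 0, 1, 0, 1, 0, 1],
})

reduced_features = ["pace", "shooting"]   # keep only a smaller subset of columns
X = df[reduced_features]
y = df["forvet"]

model = LogisticRegression()
model.fit(X, y)   # same pipeline as before, just with fewer predictors
```

The rest of the pipeline (train-test split, accuracy scoring) stays the same; only the input columns change between versions.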

Accuracy Score: 90%

Version #3
You can see the details of dataset used for Version #3 in “Data Manipulation” part.

Accuracy Score: 97%

