Results & Conclusions

[Figure: Accuracy scores of each ML technique]

* Version #1: A shuffled version of Version #3 that contains only the first 300 rows.

* Version #2: Same as Version #1, but with 5 columns instead of 13.

* Version #3: Contains all 17,994 rows of the raw dataset, but only 13 columns.
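As a rough illustration of how the three versions relate to each other, here is a minimal sketch in plain Python. The column names, the toy data, the seed, and the helper `make_versions` are all made up for the example; they are not the actual dataset or the code we ran.

```python
import random

def make_versions(raw_rows, cols_13, cols_5, n_rows=300, seed=0):
    """Build the three dataset versions from a list of row dicts.

    cols_13 and cols_5 are hypothetical column selections; cols_5 is
    assumed to be a subset of cols_13.
    """
    # Version #3: every row, restricted to the 13 selected columns.
    v3 = [{c: row[c] for c in cols_13} for row in raw_rows]

    # Version #1: shuffle a copy of Version #3, keep the first n_rows rows.
    shuffled = v3[:]
    random.Random(seed).shuffle(shuffled)
    v1 = shuffled[:n_rows]

    # Version #2: same rows as Version #1, but only 5 columns
    # (4 inputs and 1 output).
    v2 = [{c: row[c] for c in cols_5} for row in v1]
    return v1, v2, v3

# Toy stand-in for the raw data: 1000 rows with 20 made-up columns.
all_cols = [f"col_{i}" for i in range(20)]
raw = [{c: r * 100 + i for i, c in enumerate(all_cols)} for r in range(1000)]

v1, v2, v3 = make_versions(raw, all_cols[:13], all_cols[:5])
print(len(v1), len(v2), len(v3))  # 300 300 1000
```

Shuffling before taking the first 300 rows keeps the small versions from inheriting whatever ordering the raw file happens to have.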

Why did we choose these three algorithms/techniques?

  • All of them can handle both linearly and non-linearly related data. Our dataset contained both kinds of relationship.
  • All of them can handle multiple inputs.
  • Logistic Regression has a categorical output, so it directly answers our initial question: should a manager hire this player as a striker? Yes (1) or No (0).
  • A Decision Tree can be visualized very clearly. You can see why each decision was made, so the steps to the final classification are transparent.
  • The Decision Tree allowed us to weigh possible choices against one another based on their information gain.
  • The big advantage of a Neural Network (NN) is that it can learn by itself. We counted on the NN to find relations, and so come up with a model, that the other two could not.
  • With the NN, however, we cannot see the steps that lead to the end result as we can with the Decision Tree.
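To make the information gain idea concrete, here is a small sketch that scores two candidate splits of the same labels using entropy. The labels and splits are invented for illustration, and the helper names `entropy` and `information_gain` are ours; a library's decision tree may use a different criterion (for example Gini impurity) by default.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(labels, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

# Toy labels: 1 = hire as striker, 0 = do not hire.
labels = [1, 1, 1, 0, 0, 0]

# Split A separates the classes perfectly; split B barely helps.
gain_a = information_gain(labels, [1, 1, 1], [0, 0, 0])
gain_b = information_gain(labels, [1, 1, 0], [1, 0, 0])

print(round(gain_a, 3), round(gain_b, 3))
```

A tree built on this criterion would choose split A at this node because it has the higher gain.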

Some observations from the accuracy results:

  • Generally, all of them did a really good job at prediction, and their accuracy scores are very close to each other.
  • The worst accuracy scores came from the version with fewer features (4 inputs, 1 output). The NN in particular gave a much worse result than the others.
  • We think the accuracy scores for Version #2 are poor because of underfitting. With only 4 features, there was probably not enough information to build a strong model.
  • Versions #1 and #3 have very close results, but across both of them Logistic Regression appears to give the highest accuracy.
  • We also applied Linear Regression to our data. We do not count it among our ML techniques because it oversimplifies the problem. It was especially unsuitable for our data since it took only one input, and it requires the input and output to be linearly related, which did not apply to all of our data.
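For reference, an accuracy score of the kind compared above is simply the fraction of predictions that match the true labels. The sketch below assumes plain classification accuracy; the labels and predictions are made up and are not our results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Toy example: 1 = hire as striker, 0 = do not hire.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy(y_true, y_pred))  # 0.8
```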