Goal or No Goal? A Goal Scoring Prediction for Shots in Hockey.

Roman Nagy
6 min read · Dec 16, 2021


Hockey is an amazing game: fast, exciting, and, normally, with a lot of goals. Players fight and give their best to score. But is their best really enough? Are a team's tactics and a player's shooting skills a guarantee to score? This article presents a project that uses game data to predict whether a shot results in a goal.

Several very interesting datasets provide detailed information about hockey games and players. You can find data describing a game in a structured way, event by event: every shot, every hit, every penalty is recorded. Besides this, there are extensive player statistics. This project used two main datasets for the prediction: NHL Game Data and NHL Player Salaries from Kaggle.

Predicting a goal is a binary classification task. To perform it, the Logistic Regression classifier from the scikit-learn library has been used. It takes multiple numerical values as predictive features and the binary value goal as the target. The classifier predicts the probability of a goal, a value from the interval [0, 1], which is converted to a binary prediction of 0 or 1 based on a given threshold (0.5 by default). If the predicted probability is higher than the threshold, the prediction is interpreted as 1; if it is lower, it is interpreted as 0.

The first version of the model used predictive features related to the shot circumstances (type, location):

features = ['st_x', 'st_y', 'secondaryType_Deflected', 'secondaryType_SlapShot',
            'secondaryType_SnapShot', 'secondaryType_TipIn',
            'secondaryType_Wraparound', 'secondaryType_WristShot']
target = 'goal'
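A minimal sketch of how such a model could be trained and evaluated with these columns (df, the train/test split, and the variable names are illustrative assumptions, not necessarily the project's exact code):

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# df is assumed to hold one row per shot with the one-hot encoded shot types and the goal column
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df[target], test_size=0.3, random_state=42)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# predicted probability of a goal, converted to 0/1 with the default 0.5 threshold
proba = clf.predict_proba(X_test)[:, 1]
pred = (proba > 0.5).astype(int)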

The performance of the model has been evaluated using the confusion matrix and the resulting F1 Score. These can be considered standard metrics for evaluating binary classification. The goal here is to quantify how many data points in the test set have been predicted correctly (as 0 or 1), how many have been predicted as False Positives (Type I error), and how many as False Negatives (Type II error). False Positives are values predicted as Positive (1) even though they should have been predicted as Negative (0); False Negatives are values predicted as Negative (0) even though they should have been predicted as Positive (1). Based on these counts, Precision, Recall, and finally the F1 Score can be calculated as follows.
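With TP, FP, and FN denoting the counts of true positives, false positives, and false negatives, the standard definitions are:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$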

The first version of the model achieved the following values of defined metrics:

True Positive = 18 399
True Negative = 14 671
False Positive = 11 589
False Negative = 7 654
========================
Overall F1 Score = 0.66

The value of F1 Score=0.66 is a good starting point, but there is still room for improvement. Next, a new feature has been engineered: the distance between the shot and the goal. After adding the distance as a predictive feature and retraining the model, a significant improvement was achieved.
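One possible way to engineer this feature is a plain Euclidean distance from the shot coordinates to the goal. The sketch below assumes NHL rink coordinates with the net centered at roughly x = 89, y = 0; this is an assumption, not necessarily the project's exact computation:

import numpy as np

# assumed NHL rink coordinates: goal line roughly 89 ft from center ice, net centered at y = 0
GOAL_X, GOAL_Y = 89.0, 0.0

# st_x/st_y are the shot coordinates from the dataset; abs() assumes shots are
# mirrored so that all attacks go towards the same net
df['distance'] = np.sqrt((GOAL_X - df['st_x'].abs()) ** 2 + (GOAL_Y - df['st_y']) ** 2)

# extend the feature list and rebuild the train/test split before retraining
features.append('distance')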

The F1 Score=0.70 is definitely an improvement. To give some intuition on how to interpret this value and how well the model fits the data points in the dataset, let’s have a closer look at logistic regression. As already mentioned, it predicts, for each data point, the probability of the outcome being 1. A threshold (0.5 by default) then decides whether a particular outcome is classified as 0 or 1. To produce this probability, logistic regression uses the sigmoid function:
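In standard notation, with $z$ the linear combination of the predictive features:

$$\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad z = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n$$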

Let’s consider a logistic regression model using just one predictive feature, which is the distance to the goal. The sigmoid function is:
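With the distance $d$ as the only feature, and with the intercept $\beta_0$ and the coefficient $\beta_1$ learned during training, the predicted probability of a goal becomes:

$$P(\mathrm{goal} \mid d) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 d)}}$$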

During training, the logistic regression classifier fits its sigmoid function according to this formula. To visualize how well it fits the data points, we can extract the coefficient coef and the intercept directly from a model trained using the distance feature only:
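A minimal sketch of how this could look (X_train, X_test, y_train, y_test are assumed to come from a split that already includes the distance column; the logging format simply mirrors the output below):

import time
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# logistic regression trained on the distance feature only
model_dist = LogisticRegression()

start = time.time()
model_dist.fit(X_train[['distance']], y_train)
print('*' * 49)
print(f'Training duration: {time.time() - start:.5f} seconds')

pred_dist = model_dist.predict(X_test[['distance']])
print(f'Score of the model is {model_dist.score(X_test[["distance"]], y_test):.4f}')
print(f'F1-Score of the model is {f1_score(y_test, pred_dist):.4f}')
print('-' * 37)
print(f'Coef of the model is {model_dist.coef_}')
print(f'Intercept of the model is {model_dist.intercept_}')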

*************************************************
Training duration: 0.08207 seconds
Score of the model is 0.6490
F1-Score of the model is 0.6833
-------------------------------------
Coef of the model is [[-0.04341287]]
Intercept of the model is [1.61758467]

Having the values of coef and intercept, we can visualize the sigmoid function generated by our logistic regression model and see how well this function fits the data points from the dataset. The left plot shows the sigmoid curve (green) produced by the trained model; the right plot shows the same curve together with data points representing all shots from a distance between 10 and 70, with their empirical probability of resulting in a goal. As you can see, the curve fits the data points amazingly well, and even a model with one feature achieving an F1 Score of 0.68 can definitely make meaningful predictions:
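A sketch of how such a plot could be produced from the fitted coefficient and intercept (the empirical probabilities are approximated here as the goal rate per rounded distance; the plotting details are an assumption, not the original plotting code):

import matplotlib.pyplot as plt
import numpy as np

b0 = model_dist.intercept_[0]    # ~1.618
b1 = model_dist.coef_[0][0]      # ~-0.043

# sigmoid curve produced by the trained one-feature model
d = np.linspace(10, 70, 200)
p_model = 1 / (1 + np.exp(-(b0 + b1 * d)))

# empirical goal probability per (rounded) shot distance
empirical = df.assign(dist_bin=df['distance'].round()).groupby('dist_bin')['goal'].mean()
empirical = empirical.loc[(empirical.index >= 10) & (empirical.index <= 70)]

plt.plot(d, p_model, color='green', label='model (sigmoid)')
plt.scatter(empirical.index, empirical.values, s=10, label='empirical goal rate')
plt.xlabel('distance')
plt.ylabel('P(goal)')
plt.legend()
plt.show()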

After some follow-up experiments with different classifiers (LogisticRegression, KNeighborsClassifier, LGBMClassifier), adding several new predictive features (angle, player, goalie, …), and fine-tuning using cross-validation, the final choice was the LightGBM classifier (a gradient boosting framework) with 20 features.
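A minimal sketch of this final setup (the exact 20 features and the tuned hyperparameter values are not listed here; the grid below is purely illustrative):

from lightgbm import LGBMClassifier
from sklearn.model_selection import GridSearchCV

# illustrative hyperparameter grid, not the values actually tuned in the project
param_grid = {
    'n_estimators': [100, 300],
    'num_leaves': [31, 63],
    'learning_rate': [0.05, 0.1],
}

# cross-validated fine-tuning, optimizing the F1 Score
search = GridSearchCV(LGBMClassifier(), param_grid, scoring='f1', cv=5)
search.fit(X_train, y_train)   # X_train rebuilt with the final 20 features

best_model = search.best_estimator_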

This model achieved an F1 Score=0.76. A very important and interesting finding from these experiments was how the model dealt with the features and how it evaluated their importance. As you can observe in the following plots, the features selected as most important were distance, periodTime, angle, skater_id, and savePercentage. This makes absolute sense and is very intuitive. The skater is the player taking the shot, and his skills are a very important factor in whether the shot lands behind the goalie (whose savePercentage is the fifth most important feature). Distance, angle, and periodTime are important factors as well:
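The importance ranking shown in these plots can be read directly from the trained classifier; LightGBM also provides a plotting helper (a minimal sketch, assuming best_model is the fitted LGBMClassifier from above):

import matplotlib.pyplot as plt
from lightgbm import plot_importance

# feature importances as evaluated by the trained LightGBM model
plot_importance(best_model, max_num_features=20)
plt.show()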

Comparing the two plots, showing the features' correlation with the goal in the training set (on the left) and the LightGBM feature importance (on the right), you will notice that the classifier rated some of the predictive features as very important even though they are not strongly correlated with the target goal. This applies, for example, to skater_id and goalie_id, which were rated as important features despite their low correlation with the target goal.
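The correlation view on the left can be reproduced directly from the training data (a sketch; features is the list of predictive columns used above):

import matplotlib.pyplot as plt

# Pearson correlation of each predictive feature with the target 'goal'
corr_with_goal = df[features + ['goal']].corr()['goal'].drop('goal').sort_values()
corr_with_goal.plot(kind='barh')
plt.xlabel('correlation with goal')
plt.show()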

The value of F1 Score=0.76 was achieved for all shots in the final dataset. The model performed even better for short-distance shots: for shots with a distance < 30, it achieved an F1 Score of 0.81, while for long-distance shots the F1 Score was 0.64. This finding reflects the fact that shots from long distance are much less predictable in general. They are very often more or less a matter of luck, of how the player hits the puck, and of other circumstances. Short-distance shots are influenced much more by the skills of the shooting player.
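Evaluating the same model separately for short- and long-distance shots is a simple split of the test set (a sketch, reusing the names from above; the distance threshold of 30 matches the text):

from sklearn.metrics import f1_score

pred = best_model.predict(X_test)
short = (X_test['distance'] < 30).to_numpy()

print('F1, all shots:     ', round(f1_score(y_test, pred), 2))
print('F1, distance < 30: ', round(f1_score(y_test[short], pred[short]), 2))
print('F1, distance >= 30:', round(f1_score(y_test[~short], pred[~short]), 2))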

Conclusion

Even with predictable goals, hockey is still an amazing game. The LightGBM classifier was able to predict goals based on shot-related parameters, achieving an F1 Score of 0.76 for all shots and 0.81 for short-distance shots.

The ideal value of F1 Score=1.0 was not achieved. It would only be achievable with data that fully explains and covers all aspects influencing the chance of scoring a goal after a shot, together with a model able to fit that data perfectly. Beyond all the data available in the used dataset, there may still be other factors with an impact on the goal/no-goal result. Defending players and their actions? Ice temperature? All of those might be the missing pieces of the mosaic.

Link to Github repo with the source code used in this project and all links to the used datasets: https://github.com/rmnng/dsnanocapstone
