PGA Tour Analytics: Accuracy Off the Tee

Screen Shot 2020-02-26 at 7.41.25 PM.png

 

We Looked at Distance

In the previous analysis of distance, we found a sharp increase in driving distance off the tee after the introduction of the Pro V1 golf ball. In the plot below, the red line marks the Pro V1’s introduction in October 2000.

Screen Shot 2020-02-26 at 7.20.46 PM.png

Now Enter Accuracy

Screen Shot 2020-02-26 at 7.43.25 PM.png

But what has happened to accuracy off the tee during this same period? Again using data derived from Shotlink and scraped from the PGA Tour’s public-facing website with Python, we collected tournament-week-level information for every player who has made the cut since 1980. This dataset adds another piece to the puzzle of an eventual model that can help us determine the most important features of the modern Tour player’s success. (Note: the year 2005 is missing from this dataset due to issues scraping that particular year. All calculations impute the missing 2005 values from the years 2004 and 2006.)
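For readers curious how a missing season might be filled in, here is a minimal sketch. The article only says the 2005 values are imputed from 2004 and 2006; averaging the two neighboring years is an assumption made here for illustration, and the function name and numbers are hypothetical rather than taken from the repository.

```python
from statistics import mean


def impute_missing_year(yearly, missing=2005):
    """Fill one missing season with the mean of the seasons on either side.

    Assumption: a simple average of the neighbors; the article does not
    state the exact imputation rule.
    """
    before, after = yearly[missing - 1], yearly[missing + 1]
    filled = dict(yearly)  # copy so the caller's data is left untouched
    filled[missing] = mean([before, after])
    return filled


# Illustrative fairway percentages, not the scraped values.
fairway_pct = {2004: 64.0, 2006: 63.0}
filled = impute_missing_year(fairway_pct)
print(filled[2005])
```

A linear interpolation like this keeps a yearly trend line from breaking at the gap, but it also smooths over whatever actually happened in 2005.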

Leading up to 2000, technology helped Tour players find the fairway off the tee. In fact, the trend from 1980 to 1999 is a story of increased accuracy off the tee. The sharp decrease came right as golf balls started flying farther.

Screen Shot 2020-02-26 at 7.38.59 PM.png

The narrative of distance over accuracy becomes apparent when we view distance and accuracy off the tee together. On average, Tour players got longer at the cost of accuracy. The relationship holds, and is even stronger, for the players who won during the week. Tour winners show a more exaggerated drop in the percentage of fairways hit, while driving the ball farther than the average Tour player.

Screen Shot 2020-02-26 at 7.22.12 PM.png

Forming Relationships

Let’s now take a look at the correlation between driving distance and accuracy. Taking each player that has made the cut in a Tour-sanctioned event since 1980, approximately 112,619 observations, we can plot distance against accuracy. Each blue dot represents one player observation, and the margins on the far right and top show the distributions of accuracy and distance. More importantly, this combined scatterplot lets us see the relationship between the two. The Pearson coefficient measures the linear co-movement of these two variables on a scale from -1 to 1. Simply put, how closely do they move together, and in which direction? As distance goes up, does accuracy tend to go down?

And for the stats nerds out there, the equation for your enjoyment.

Screen Shot 2020-02-26 at 7.26.06 PM.png
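For those who prefer code to notation, the sketch below computes a Pearson coefficient from scratch and, next to it, the slope of a least-squares line. The two are related but not the same: the coefficient is unitless and bounded between -1 and 1, while the slope carries units (percentage points of accuracy per yard). The function names and sample numbers are illustrative assumptions, not values from the scraped dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ols_slope(xs, ys):
    """Least-squares slope: change in y per one-unit change in x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x


# Made-up driving distances (yards) and fairway percentages.
distance = [270, 280, 290, 300, 310]
accuracy = [68, 66, 67, 61, 60]

r = pearson_r(distance, accuracy)
slope = ols_slope(distance, accuracy)
print(f"r = {r:.3f}, slope = {slope:.3f} pct points per yard")
```

On this toy data the two numbers differ noticeably, which is why a per-yard cost is better read off the slope than off the coefficient.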

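The following scatterplot highlights the overall correlation coefficient between 1980 and 2020 of -0.27, a modest negative linear association: players who hit it longer tend to hit fewer fairways. Strictly speaking, the coefficient is unitless and describes the strength of the relationship, not a per-yard cost. The yards-for-accuracy exchange rate comes from the slope of a fitted line, which equals the coefficient scaled by the ratio of the standard deviation of accuracy to the standard deviation of distance.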

Screen Shot 2020-02-26 at 7.29.48 PM.png

Now, this is all on average, and it is hard to assume that 1980 looks like 2020. When running the numbers for 1980, the Pearson coefficient was -0.24, while for 2020 it was -0.33. What would be interesting is to see these coefficients over time.

In the scatterplot below, a Pearson coefficient was calculated for each year and plotted over time. A linear trend line shows that while there were fluctuations between years, the overall story is that players have been giving up accuracy as they get longer.
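The per-year calculation can be sketched roughly as follows: group the observations by year, compute one coefficient per year, then fit a straight line through the (year, coefficient) points. This is a plain-Python illustration under assumed inputs; the row layout, function names, and numbers are hypothetical and the actual notebook may do this differently.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def yearly_correlations(rows):
    """rows: iterable of (year, distance_yards, fairway_pct) tuples."""
    by_year = defaultdict(lambda: ([], []))
    for year, dist, acc in rows:
        by_year[year][0].append(dist)
        by_year[year][1].append(acc)
    return {year: pearson_r(d, a) for year, (d, a) in sorted(by_year.items())}


def trend_line(points):
    """Least-squares slope and intercept through a list of (x, y) pairs."""
    xs, ys = zip(*points)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in points) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Toy observations; real seasons have thousands of rows each.
rows = [
    (1980, 250, 70), (1980, 260, 69), (1980, 270, 69),
    (2000, 270, 68), (2000, 280, 66), (2000, 290, 65),
    (2020, 290, 62), (2020, 300, 60), (2020, 310, 57),
]
coeffs = yearly_correlations(rows)
slope, intercept = trend_line(list(coeffs.items()))
print(coeffs)
print(f"trend: {slope:.5f} per year")
```

A negative trend slope here means the yearly coefficients are drifting further below zero, which is the pattern the trend line in the plot is meant to show.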

Screen Shot 2020-02-26 at 7.32.37 PM.png

For example, the coefficient for 1980 is weaker than the coefficient for 2020: the link between extra length and missed fairways has tightened over four decades. This makes sense when we think about it for a moment. Players hit it longer, and a 5 degree miss with the driver will be further offline at 300 yards out than it is at 250 yards out. This is simple geometry: for the same angular miss, the distance from the target line grows in proportion to how far the ball travels.
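To put rough numbers on the geometry, here is a small sketch. It assumes the ball flies in a straight line at a fixed angle off the target line (no curve, wind, or roll), so the offline distance is simply the distance traveled times the sine of the miss angle. The function name is made up for illustration.

```python
import math


def lateral_miss(distance_yards, miss_degrees):
    """Yards offline for a straight shot that starts miss_degrees off line."""
    return distance_yards * math.sin(math.radians(miss_degrees))


for distance in (250, 300):
    offline = lateral_miss(distance, 5)
    print(f"{distance} yd drive, 5 degree miss: {offline:.1f} yd offline")
```

Under these assumptions the same 5 degree miss lands about 22 yards offline at 250 yards and about 26 yards offline at 300 yards, a difference that can easily separate fairway from rough.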

While none of these findings are earth-shattering, my hope is that through iterations of exploring these PGA statistics, a meager contribution to the golf analytics community can be made.

As always, the code used in this analysis is available at the author’s GitHub repository: https://github.com/nbeaudoin/PGA-Tour-Analytics. The author can be found on LinkedIn at https://www.linkedin.com/in/nicholas-beaudoin-805ba738/

 

Image sources:

https://www.wallstreetmojo.com/pearson-correlation-coefficient/

https://www.golfdiscount.com/blog/fun-facts/2018-pga-tour-driving-statistics/#prettyPhoto/0/

https://www.pgatour.com/news/2019/05/11/nine-things-to-know-pga-championship-bethpage-black.html

 

 


PGA Tour Driving Distance Analytics

Distance Debate

The USGA’s February 2020 distance report has riveted the professional golf community. Calls by top players to bifurcate the golf ball, with one ball at the tour level and another for amateurs, have created a rift that will most likely play out in court as ball manufacturers fight to protect their patented technology. The debate about distance centers on courses becoming unplayable for the modern PGA Tour professional due to increasing distance off the tee. Holes such as the 13th at Augusta National have gone from an iconic par 5 to one that is merely a drive, pitch and putt, which has necessitated lengthening courses and purchasing additional property to accommodate the added length. Both environmental concerns and the degradation of strategy on courses such as Riviera, host of the 2028 Olympics, are hotly debated. The following analysis helps paint the narrative of how we got here.

Screen Shot 2020-02-19 at 6.43.35 PM.png

 

Enter Pro V1: King of the Golf Balls

October 11, 2000, was a day that rocked the golf world. On this day, the Pro V1 entered its first round of tournament play in Las Vegas at the Invensys Classic. The “Professional Veneer One,” a.k.a. the Pro V1, was a solid-core ball that displaced the traditional wound golf ball. Tour players immediately began putting the Pro V1 in play for its distance benefits. Not wanting to fall behind the curve, Nike developed its own urethane golf ball, the Nike Tour Accuracy, which Tiger Woods immediately put into rotation. Distance gains throughout the early 2000s are said to have been dramatic. But just how much did distance increase during this period?

Screen Shot 2020-02-19 at 6.46.03 PM.png

The following analysis uses data derived from the PGA Tour’s public-facing website, scraped from the site’s HTML with Python. It provides a unique glimpse into 2,022 PGA Tour players and how the PGA Tour derives its data from Shotlink (see my former article on Shotlink analytics). Since Shotlink no longer authorizes academic usage of its data, it has become more challenging for researchers of the game to investigate pressing questions at the top of the game. My hope is that the following analysis contributes to a more analytical narrative of golf’s top performers.

 

The Numbers

When looking at the average driving distance from 1980 through February 2020, it is obvious that the mean distance has increased. In fact, since the introduction of the Pro V1, the average distance on tour has increased by 18.17 yards.

Screen Shot 2020-02-19 at 6.49.09 PM.png

When looking at the year-over-year change after the introduction of the Pro V1, we see a phenomenal spike in distance. 2000 and 2002 saw the biggest jumps in distance off the tee. As other companies followed Titleist’s Pro V1 with their own versions, it can be hypothesized that the dramatic rise in PGA Tour players’ performance off the tee was due to this technology change.
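The year-over-year figure is just each season’s average minus the previous season’s average. A minimal sketch, with a hypothetical function name and made-up averages rather than the scraped Tour numbers:

```python
def year_over_year(averages):
    """Return {year: change from the previous year} for consecutive years."""
    years = sorted(averages)
    return {
        curr: averages[curr] - averages[prev]
        for prev, curr in zip(years, years[1:])
        if curr - prev == 1  # skip gaps such as a missing season
    }


# Illustrative average driving distances in yards, not real Tour data.
avg_distance = {1999: 270.0, 2000: 273.0, 2001: 275.0, 2003: 280.0}
changes = year_over_year(avg_distance)
print(changes)
```

Skipping non-consecutive years is a design choice made here so that a missing season does not show up as an artificially large two-year jump.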

Screen Shot 2020-02-19 at 6.49.39 PM.png

To look deeper at the spike in driving distance, we took a peek at the distribution of distance off the tee between 1999 and 2003. The assumption is that after the Pro V1 was introduced in late 2000, competing ball manufacturers adopted the urethane cover and had their Tour players game the same technology. This four-year time span is long enough to cover technology adoption across Tour members, and its large sample size should be enough to show a statistically significant difference.
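One way to check whether two seasons’ distance distributions really differ is a two-sample test. The article does not say which test was used, so the sketch below uses Welch’s t statistic (which does not assume equal variances) purely as an illustrative choice, with made-up samples standing in for the 1999 and 2003 observations.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for b vs a."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(b) - mean(a)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up driving distances in yards for an early and a late season.
early = [268, 272, 270, 275, 265]
late = [284, 290, 281, 288, 287]

t_stat, dof = welch_t(early, late)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The t statistic would then be compared against a t distribution with the computed degrees of freedom. With tens of thousands of real observations, even small differences in mean distance can come out statistically significant, so the size of the gap matters as much as the p-value.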

Screen Shot 2020-02-19 at 6.53.32 PM.png

 

What’s Next?

While this analysis merely provides a summary glimpse into how driving distance has changed over the past 20 years, it is important to keep in mind that before the Pro V1 was introduced, driving distance was already increasing at an unprecedented rate. The debate rages on over whether to scale back the current ball played on tour. Further analysis is warranted to determine how much of an impact the modern solid-core ball had on player performance. That calls for an impact evaluation leveraging multivariate regression, coupled with robustness checks such as parallel trends before and after the Pro V1’s introduction and difference-in-differences estimation strategies.

 

Methodology and Code

Data analytics code and notebooks created are available at the author’s GitHub:

https://github.com/nbeaudoin/PGA-Tour-Analytics

 

Data sources: 

http://www.pgatour.com

Images sources:

2015 Titleist Pro V1 ball review

https://www.brandsoftheworld.com/logo/pga-tour-4

 

 

 
