Nicking Distributions and Comparative Usefulness
True Nicks edges E-Nicks in Keeneland September analysis.
Most descriptive statistics aimed at categorizing a population will form a bell-shaped curve of distribution, meaning most values gravitate towards the center (or average), while fringe values that deviate from the average will migrate in both directions from the more populous center. Seldom are there perfectly bell-shaped distributions, but equally seldom do we find gross distortions of the bell curve if the intervals are properly assigned.
For years, skeptics of the nicking sciences have suspected that the distribution of assigned letter grades is anything but a standard bell curve. Casual observation led most of us to believe that computerized nicking systems had skewed the grades to cast most hypothetical matings in a favorable light. The obvious benefit to the purveyors of nicking would be the ability to claim responsibility for accurately predicting a large number of successful matings. Though we had our suspicions, there hadn’t been any independent research to support our opinion. With the debut of True Nicks in 2008, we became even more curious about how their distribution compared with Jack Werk’s E-Nick system. True Nicks has claimed in their advertising that their competitors engage in grade inflation ("If everyone gets an A, is it really an A?"). True Nicks has also attempted to distinguish itself by analyzing the entire population. E-Nicks has never claimed to study the dynamics of all live foals, just varying subsets of the stakes population.
Oftentimes, an empirical project begins and ends with very different objectives, as was the case here. Originally, we set out to go back in time (namely, to the 2005 Keeneland September Yearling Sale) and identify yearlings to which one nicking system had given a significantly higher rating than the opposing system, and vice versa. From there, we intended to fast-forward and see which system had predicted racetrack success with better accuracy.
First, we had to define our methodology. We obviously had to limit our sample yearlings to those sired by stallions that were currently available on both companies’ web sites. We also needed to stabilize as many variables as possible without eliminating a large portion of the population. In order to avoid stacking one side with a disproportionate number of sires that eventually failed and were exiled from central Kentucky, we narrowed our search to sires that later proved to be at least useful enough to maintain their Kentucky residency.
It was also important to avoid having one system overly loaded with yearlings out of high-end stakes-producing mares, or on the other end of the spectrum, unplaced racemares and/or underachieving producers. For these reasons, we limited our sample groups to yearlings out of winning or unraced mares with no foals to race as of catalog printing. While it’s impossible to achieve total stabilization of variables, we felt that this would at least prevent one nicking system from getting a random, unfair advantage. Nick ratings were pulled from each company’s web site between May 15th and 21st of 2009. From Books 1-4, we found 152 yearlings that matched the criteria above:
|Rum and Earl||A||C+||1||0||0||7,429||0.13|
|Dattts Our Girl||A||C+||1||1||0||110,224||2.02|
|Goodbye Norma Jean||A++||A++||1||1||1||268,070||4.53|
|North of Fortynine||A||B||1||0||0||7,011||0|
|Time for the Check||A||A||0||0||0||0||0|
|Sixty Six Hundred||A+||A||0||0||0||0||0|
|Point Me the Way||A++||A++||1||1||1||51,056||1.03|
|Five for a Nickel||A||A||0||0||0||0||0|
|Spies in the Midst||C+||D||0||0||0||0||0|
|Mr. Frankie C||B+||B+||1||1||0||25,634||0.60|
|Oh So Classic||A+||B+||1||0||0||2,623||0|
|Tale of the Kitty||A||D||1||0||0||1,770||0.16|
|Big Red Tate||A++||A++||1||1||0||85,200||2.12|
|I Got Game||A||B||1||1||0||20,440||0.47|
|State Your Case||A++||A++||0||0||0||0||0|
|Bridle Way Bay||A||A++||1||1||0||41,778||0.89|
|You Know It Is||A||B+||1||1||0||66,952||1.13|
|Puff N Smoke||B||B+||1||1||0||29,760||0.64|
|Grand Slam Girl||A++||A||1||0||0||6,660||0.47|
|Shot of Whiskey||A++||A++||1||0||0||5,395||0.13|
|That Girl is Mine||B+||A||1||1||1||81,241||2.36|
|Cat From Heaven||A+||B+||1||1||0||92,536||1.35|
As mentioned earlier, our original intent was to identify a significant sample size of yearlings that carried a favorable rating from one nicking model (in this case, a B or higher) and an unfavorable rating from the other model (a C+ or lower). But as is often the case, reliable sample sizes can be elusive even when minimal attempts to stabilize variables are made. After screening for proven sires and unraced/winning dams with no foals to race at catalog printing, we could only muster 21 starters with favorable Werk ratings and unfavorable True Nicks ratings, and 9 starters with inverse ratings:
|Avg. earnings per starter||$41,063||$84,217|
|Avg. purchase price||$82,833||$57,250|
We have to emphasize here that our sample size is far too small to be statistically reliable. However, in instances where Werk stands alone with a favorable rating, the .90 SSI and the lack of a single stakes horse have to be viewed as a disappointment, especially when this information is coupled with the heavy skewing we’ll see from both companies and, in particular, from Werk’s system.
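The split-rating screen described above (favorable from one model, unfavorable from the other) can be sketched in a few lines. This is only an illustration: the function names are ours, the grade ladder below assumes no grades other than those listed, and the sample rows simply take the two grade columns of the yearling table in the order they appear.

```python
# Grade ladder from best to worst. The exact set of grades is an assumption;
# the article's excerpt shows A++ through D and mentions F.
GRADE_ORDER = ["A++", "A+", "A", "B+", "B", "C+", "C", "D", "F"]
RANK = {grade: position for position, grade in enumerate(GRADE_ORDER)}


def is_favorable(grade):
    """Favorable means B or higher; C+ or lower is unfavorable."""
    return RANK[grade] <= RANK["B"]


def split_ratings(rows):
    """Keep rows where exactly one of the two grades is favorable."""
    return [
        (name, first, second)
        for name, first, second in rows
        if is_favorable(first) != is_favorable(second)
    ]


# A few rows from the yearling table: name, first grade column, second grade column.
sample = [
    ("Rum and Earl", "A", "C+"),
    ("Goodbye Norma Jean", "A++", "A++"),
    ("Tale of the Kitty", "A", "D"),
    ("Puff N Smoke", "B", "B+"),
]

for name, first, second in split_ratings(sample):
    print(name, first, second)
```

Of the four sample rows, only Rum and Earl and Tale of the Kitty carry one favorable and one unfavorable grade, so only those two survive the screen.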
So while this information cannot be used as a definitive method for determining which system best predicts success, it can be used to assess each system’s distribution tendencies. And unfortunately for those who may be placing a disproportionate emphasis on nicking, neither system appears to be overly discriminating, nor does either even loosely resemble a typical distribution.
In a typical distribution, the majority of the values would lie near the center, or in this case, a C grade. As illustrated below, less than 12% of the values fall within the C range in Werk’s system (as opposed to 66.8% that are graded an A or higher). True Nicks places less than 18% within the C range (as opposed to 36.7% in the A category). Using the definition of a favorable rating (B or higher), Werk assigned a favorable rating to 80.7% of the subject yearlings, while True Nicks gave favorable ratings to 69.3% of the same group. Most problematic is that, between the two companies, just one of the subject yearlings was given an F rating.
|Share of subject yearlings||Werk E-Nicks||True Nicks|
|% rated A or higher||66.8%||36.7%|
|% rated B or higher||80.7%||69.3%|
|% rated C+ or lower||19.2%||30.6%|
While both companies appear to have problems developing a discriminating system, True Nicks does seem to have at least taken a step in the right direction: it gave unfavorable ratings to over 30% of the subject yearlings and handed out A ratings at a rate roughly 45% lower than Werk’s.
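The 45% figure above is a relative comparison of the two A-or-higher shares, not a percentage-point gap. A minimal sketch of the arithmetic, using the shares reported in the table; the function name is ours:

```python
def relative_reduction(baseline_pct, other_pct):
    """Percent by which other_pct falls short of baseline_pct."""
    return (baseline_pct - other_pct) / baseline_pct * 100


werk_a_share = 66.8        # % rated A or higher by Werk E-Nicks
true_nicks_a_share = 36.7  # % rated A or higher by True Nicks

# Gap in percentage points, then the same gap relative to Werk's share.
point_gap = werk_a_share - true_nicks_a_share
relative_gap = relative_reduction(werk_a_share, true_nicks_a_share)

print(round(point_gap, 1))     # 30.1
print(round(relative_gap, 1))  # 45.1
```

The 30.1-point gap is about 45% of Werk's 66.8% share, which is where the comparison comes from.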
Additionally, when the letter grades are averaged on a standard academic scale (A++ = 4.6, A+ = 4.3, and so on), Werk’s ratings come in at 3.57, while True Nicks assigns an average point value of 3.07. Both systems skew to the positive end of the spectrum, but Werk’s 3.57 is a clear indicator of just how inflated, and consequently how useless, his system has become.
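For readers who want to repeat the averaging on their own set of ratings, here is a rough sketch. Only the A++ and A+ values come from the text; the remaining point values are our assumption about what "and so on" means on a conventional academic scale, and the sample grades are hypothetical rather than drawn from the study.

```python
# A++ and A+ are given in the article; every other value is an assumption
# based on a conventional 4.0 academic scale.
GRADE_POINTS = {
    "A++": 4.6, "A+": 4.3, "A": 4.0,
    "B+": 3.3, "B": 3.0,
    "C+": 2.3, "C": 2.0,
    "D": 1.0, "F": 0.0,
}


def average_grade_points(grades):
    """Mean point value of a list of letter grades."""
    if not grades:
        raise ValueError("need at least one grade")
    return sum(GRADE_POINTS[grade] for grade in grades) / len(grades)


# Hypothetical ratings, for illustration only.
hypothetical_grades = ["A++", "A", "B+", "C+"]
print(round(average_grade_points(hypothetical_grades), 2))  # 3.55
```

Because the lower part of the scale is assumed, an average computed this way may not match the 3.57 and 3.07 reported here exactly.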
We also examined how many generations back each system reached in computing its ratings, as many skeptics point to ratings based on sires found four, five, or even six generations deep in the hypothetical pedigree. As an example, if a rating was based on the paternal great grandsire (three generations removed) and the maternal grandsire (two generations removed), the rating was counted as a total of five generations removed. In this case, there was virtually no difference between the two, with E-Nicks coming in at an average of 4.43 generations removed compared to True Nicks’ 4.41.
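The generations-removed tally is simply the depth on the sire's side plus the depth on the dam's side, averaged across ratings. A small sketch with hypothetical depths (the function names and the sample pairs are ours, not the study data):

```python
def generations_removed(paternal_depth, maternal_depth):
    """Total generations removed for one rating (sire side plus dam side)."""
    return paternal_depth + maternal_depth


def average_generations_removed(depth_pairs):
    """Mean total across (paternal_depth, maternal_depth) pairs."""
    totals = [generations_removed(p, m) for p, m in depth_pairs]
    return sum(totals) / len(totals)


# The article's example: paternal great grandsire (3) and maternal grandsire (2).
print(generations_removed(3, 2))  # 5

# Hypothetical ratings for illustration: totals of 5, 4 and 3.
print(average_generations_removed([(3, 2), (2, 2), (1, 2)]))  # 4.0
```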