Mathematicians predict 2005 Cy Young winners

November 07, 2005

Pitchers Chris Carpenter of the St. Louis Cardinals and Mariano Rivera of the New York Yankees will win the 2005 Major League Baseball Cy Young awards, predicts a pair of mathematicians from Rhode Island College. The actual winners, intended to represent the most outstanding American League and National League pitchers during the regular season, will be announced November 8 (AL) and 10 (NL) by the Baseball Writers' Association of America, whose members vote on the award.

Mathematicians Rebecca Sparks and David Abrahamson, a husband-and-wife team who teach at Rhode Island College, have developed a formula that predicts which pitchers will place first through third in Cy Young voting. The researchers structured their formula to predict the voting results for starting pitchers, who almost always win the award, rather than relief pitchers, who are rarely the recipients. However, their formula reveals a lack of standout American League starting pitchers this year, suggesting that the AL award will go to relief pitcher Mariano Rivera for his extraordinary 2005 season.

Sparks and Abrahamson presented their model in the April 2005 issue of Math Horizons, a magazine published by the Mathematical Association of America (MAA). Abrahamson will discuss the model in a talk about math and sports at a regional MAA meeting to take place at the University of New Hampshire on November 18 and 19, 2005.

Every season, the baseball writers' association selects two sportswriters from every city in the major leagues to vote for a first-, second- and third-place choice. The ballots are due right after the regular season ends. "The identities of the voters change frequently," Sparks and Abrahamson write in their Math Horizons article, "but we will see that their voting results follow a predictable course."

The pair took an extremely pragmatic approach in developing a method to forecast Cy Young winners. They did not consider which pitchers should win the award, or which qualities were most important in a pitcher. They simply aimed to develop a mathematical formula that would best match the voting results.

Their formula computes a score for each pitcher on a scale from roughly 0 to 10. For their formula to be successful, it must yield the top score in a particular season to the pitcher who places first in Cy Young voting, the next-highest score to the player who places second, and the third-highest score to the player who places third.
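
This success criterion can be sketched in a few lines of Python. It is only an illustration: the function names and the sample pitchers and scores are hypothetical and do not come from the researchers' article.

```python
def top_three(scores):
    """Return the three highest-scoring pitchers, best first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:3]


def matches_voting(scores, actual_top_three):
    """True if the model's top three, in order, equal the actual voting order."""
    return top_three(scores) == list(actual_top_three)


# Hypothetical scores on the roughly 0-to-10 scale (illustrative only).
example_scores = {
    "Pitcher A": 6.9,
    "Pitcher B": 6.1,
    "Pitcher C": 5.8,
    "Pitcher D": 4.7,
}

print(matches_voting(example_scores, ["Pitcher A", "Pitcher B", "Pitcher C"]))
```

A season counts as a success only if all three places come out in the right order, which is a stricter test than naming the winner alone.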

To calculate the scores, they first chose four key pitching statistics: wins, losses, strikeouts, and ERA (earned run average, the average number of earned runs a pitcher gives up per nine innings). They also included a fifth statistic, the winning percentage of the pitcher's team, because they suspected that it influences the voting results.

But the main question, according to the two researchers, is how much importance the voters placed on each of those five categories. Do voters, consciously or unconsciously, generally value a pitcher's number of wins more than his number of strikeouts? Does a pitcher on a first-place team really have a better chance of winning the award than a pitcher with slightly better stats on a last-place team?

The tools of mathematics can answer this seemingly subjective question. First, the researchers looked up the statistics in those five categories for starting pitchers between 1993 and 2002 and compared them to the Cy Young voting results for those years.

Then, to determine the relative importance of each of the five categories in the voting results, they turned to a mathematical method, dating to the 1940s, called linear programming. The method was first developed by economists (who won the Nobel Prize for work that employed it) and mathematician George Dantzig. The idea is to find the missing numbers (in this case, the relative importance or "weight" of each pitching category in the voting) that satisfy certain constraints (here, that the formula correctly yield the first- through third-place results for Cy Young balloting).
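
One way such a problem could be written down is sketched below. This is an assumed formulation for illustration, not the one published by the researchers: the article does not state their objective function or their exact constraints. Here w is the vector of five unknown weights, x_{s,k} is the vector of five statistics for the pitcher who placed k-th in league season s, x_{s,q} is the same vector for any other starter q in that season, and delta is a hypothetical margin by which each higher finisher must outscore the next.

```latex
\begin{aligned}
\max_{w,\ \delta}\quad & \delta \\
\text{subject to}\quad
& w \cdot x_{s,1} - w \cdot x_{s,2} \ge \delta && \text{for every league season } s,\\
& w \cdot x_{s,2} - w \cdot x_{s,3} \ge \delta && \text{for every league season } s,\\
& w \cdot x_{s,3} - w \cdot x_{s,q} \ge \delta && \text{for every league season } s \text{ and every other starter } q,\\
& -1 \le w_i \le 1 && i = 1, \dots, 5.
\end{aligned}
```

Every constraint and the objective are linear in the unknowns w and delta, which is what makes this a linear program. The bounds on the weights are an added assumption that keeps the margin from growing without limit.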

Analyzing the 1993 to 2002 data, they concluded that a pitcher's number of wins carried almost three times as much weight in the voting as his earned run average. ERA, in turn, was about one-and-a-half times more important than strikeouts, and about twice as important as the winning percentage of the pitcher's team. Almost completely insignificant, according to the model, is a pitcher's number of losses; they seemed to have very little bearing on the voting results.
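
The effect of these relative weights can be illustrated with a small Python sketch. The weights below only encode the approximate ratios quoted above, with ERA as the unit; they are not the researchers' actual coefficients. The statistics are hypothetical and assumed to be rescaled to a 0-to-1 range in which higher is better, so the resulting numbers are not on the article's 0-to-10 scale.

```python
# Relative weights built from the approximate ratios quoted in the article,
# with ERA as the unit. The exact values are assumptions for illustration.
WEIGHTS = {
    "wins": 3.0,            # "almost three times" the weight of ERA
    "era": 1.0,             # reference unit
    "strikeouts": 1 / 1.5,  # ERA read as about 1.5 times as important
    "team_win_pct": 0.5,    # ERA about twice as important
    "losses": 0.05,         # nearly insignificant (placeholder value)
}


def weighted_score(stats):
    """Weighted sum of stats already rescaled to 0..1, higher meaning better."""
    return sum(WEIGHTS[name] * stats[name] for name in WEIGHTS)


# Two hypothetical pitchers: one with more wins, one with a better ERA.
more_wins = {"wins": 0.9, "era": 0.6, "strikeouts": 0.5,
             "team_win_pct": 0.6, "losses": 0.5}
better_era = {"wins": 0.7, "era": 0.9, "strikeouts": 0.5,
              "team_win_pct": 0.6, "losses": 0.5}

print(round(weighted_score(more_wins), 3), round(weighted_score(better_era), 3))
```

Under these assumed weights, the pitcher with more wins comes out ahead even though the other has a clearly better ERA, which is the kind of voter preference the model is meant to capture.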

By taking each pitcher's statistics in these five categories and adjusting their values according to these relative weights, the researchers' formula correctly yielded all but one of the first-, second- and third-place vote-getters in each league from 1993 to 2002. Recently, they incorporated the data for the 2003 and 2004 seasons into the model, and correctly predicted three out of four Cy Young winners (the fourth was a reliever). By looking at the 2003 and 2004 statistics, they again found that the relative weights of the five categories were almost exactly the same as in the earlier data.

Using their formula, the researchers come up with the following predictions for the first three places in the 2005 National League voting:
• Chris Carpenter, St. Louis (6.4257 points)
• Dontrelle Willis, Florida (6.3420)
• Roy Oswalt, Houston (5.9064)
According to Abrahamson, it is possible that voters may drift away from their past behavior by voting for Roger Clemens or Andy Pettitte ahead of Roy Oswalt this year. Clemens and Pettitte are better-known veterans who may have a somewhat higher profile in the news media than Oswalt.

In the American League, the top starters in their model are, in order:
• Bartolo Colon, LA/Anaheim (5.8074)
• Johan Santana, Minnesota (5.3671)
• Jon Garland, Chicago (5.0730)
The model shows that there is no standout starter in the American League this year. Bartolo Colon, the top starter according to their model, has a total score of less than 6, a far cry from many AL Cy Young award winners in years past, such as Barry Zito (6.75, 2002) and Pedro Martinez (7.54, 1999).

"Our model quantifies the fact that there is no AL pitcher who will knock the voters' socks off," says Abrahamson. Therefore, Sparks says the two are "very confident" that the AL Cy Young Award will go to Mariano Rivera, a relief pitcher who had a particularly outstanding year. A Cy Young for Rivera, they say, would also serve as a kind of "lifetime achievement award," as Rivera, who has never earned the award, is likely nearing the end of a very distinguished career.

The researchers think that their mathematical approach, known generally as "constrained optimization," might work for other sports awards, such as the most valuable player in various leagues. It also might help provide insights into how magazines rank corporations or top colleges. But the point of their approach, they say, is to show how the methods of mathematics can apply in many unexpected everyday situations.

"The moral is always the same for the mathematical modeler," they write in their Math Horizons article. "More often than we may know, there is a pattern out there. We just have to keep thinking creatively, and we have got a good chance of finding it."
-end-
Reference:
Rebecca L. Sparks and David L. Abrahamson, "A Mathematical Model to Predict Award Winners," in Math Horizons, April 2005.

American Institute of Physics
