next up previous
Next: About this document Up: My Home Page

Computer Science 101 - Computer and Information Technology
Prof. Steven Skiena
Fall 1996

Project 3 - Exceling at Baseball
Due Monday, November 18, 1996

Spreadsheet programs such as Microsoft Excel make it easy to manipulate large amounts of data, to help recognize important and interesting quantities. This project will concern analyzing the baseball statistics from the just-completed 1996 season using Excel spreadsheet.

This assignment has three parts:

  1. Recompute Existing Statistics - The data files contain statistics which can be computed from other data in the file. Thus you can compare to see whether your formulas are correct - although you should expect certain minor differences due to transcription errors and formula variations. Write spreadsheet functions to compute the following quantities:

    • Batting average = hits/atbats
    • On-base average = (hits+bb)/(atbats+bb)
    • 1B = hits - 2B - 3B - HR
    • Total bases = 1B + 2*2B + 3*3B + 4*HR
    • Slugging percentage = TotalBases / atbats

    Turn in a printout of your spreadsheet having computed all of these quantities on the 1996 New York Yankees.

  2. Compute new quantities - There are many ways to evaluate players and teams. Compute the following quantities

    • Home Run Percentage = HR / atbats. What was the maximum HR percentage of any player, and who had it?
    • Power-Speed Number = tex2html_wrap_inline43 . Have the spreadsheet put a star next to any player with power-speed number greater than 15.
    • Batting champion - The batting champ is the player with the best batting average among any player with at least 502 plate appearences, where plate appearences are the atbats plus the bb'ss (walks). Have the spreadsheet put the word `champ' next to the player who is the batting champ of his team.
    • Team statistics - For each statistical field, compute the team's totals. Note that for the average or percentage fields, you must compute them from the raw totals as above. The average of averages is not the average.
    • Total statistics - Build a pie chart showing how the total number of stolen-bases were divided up among the members of the team.

    Turn in a printout of your spreadsheet having computed all of these quantities on the 1996 New York Mets and the 1996 New York Yankees.

  3. Player Analysis - You are charged with using batting statistics to identify the 10 best hitters in baseball. You may use whatever statistical criteria you think are most appropriate. For non-baseball fans, the most important numbers tend to be batting average, home runs, and RBIs.

    Write a 2-3 page analysis in Microsoft Word justifying your choices of the ten best. Any set of choices will receive full credit, provided you give data justifying it. Include at least one graph or chart in your analysis.

Rules of the Game

  1. All three of the data files will be available from ftp my WWW page.
  2. The grade for parts 1 and 2 will be a depend upon obtaining the correct answers for each quantity. The grade for part 3 will be based on the quality of the final report - your analysis should reveal that you played with the complete data enough to make some intelligent decisions about it. Also, the analysis document must be presentable.
  3. As a hint, first make sure that all your basic spreadsheet functions work correctly with one player's worth of data before extending them to the entire data set. You should be able to copy the formal cells to avoid typing the formula repeatedly.
  4. The parts are ordered by difficulty. Do part 1 before part 2 before part 3.
  5. This is not intended to test you knowledge of baseball. Please feel very free to ask me to explain any of the baseball terminology as often as you wish!
  6. Parts 1 and 2 are designed to give you practice with manipulating spreadsheets. Part 3 is designed to give you experience working with large amounts of data on spreadsheets.
  7. If you are interested, you can try the same kinds of things with pitching statistics. You will have to obtain the data yourself and format it for the Excel Spreadsheet from http://www.cloud9.net/ jcg/.
  8. This project can be completed during your lab section, during open lab hours in either Engineering 106 or the new lab 318 Harriman, or on any other computer running Word that you have access to.




next up previous
Next: About this document Up: My Home Page

Steve Skiena
Mon Oct 28 14:16:38 EST 1996