Project 3: Finding relationships in baseball.

Background

When you hear the word “relationship,” what is the first thing that comes to mind? Probably not baseball. But a relationship is simply a way to describe how two or more objects are connected. There are many relationships in baseball, such as those between teams and managers, players and salaries, and even stadiums and concession prices. The graphs on Data Visualizations from Best Tickets show many other relationships that exist in baseball.

For this project, your client would like you to develop SQL queries that they can use to retrieve data for their website without needing Python. They would also like to see example Altair charts.

Data

Data Connection: lahmansbaseballdb
Connection Instructions: See SQL for Data Science

Readings

Optional References

Questions and Tasks

  1. Write an SQL query to create a new dataframe about baseball players who attended BYU-Idaho. The new table should contain five columns: playerID, schoolID, salary, and the yearID/teamID associated with each salary. Order the table by salary (highest to lowest) and print out the table in your report.

  2. This three-part question requires you to calculate batting average (the number of hits divided by the number of at-bats).

    1. Write an SQL query that provides playerID, yearID, and batting average for players with at least 1 at-bat that year. Sort the table from highest batting average to lowest, and then by playerID alphabetically. Show the top 5 results in your report.
    2. Use the same query as above, but only include players with at least 10 at-bats that year. Print the top 5 results.
    3. Now calculate the batting average for players over their entire careers (all years combined). Only include players with at least 100 at-bats, and print the top 5 results.
  3. Pick any two baseball teams and compare them using a metric of your choice (average salary, home runs, number of wins, etc.). Write an SQL query to get the data you need, then make a graph in Altair to visualize the comparison. What do you learn?
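
The batting-average calculation in question 2 can be sketched on a toy table before running it against the real database. The example below is only an illustration under stated assumptions: it uses an in-memory SQLite database, a made-up `batting` table with columns named `playerID`, `yearID`, `H` (hits), and `AB` (at-bats), and invented rows. Check the column names and SQL dialect of your own connection before reusing any of it. One thing it shows is that SQLite divides two integers as integers, so the hits are multiplied by `1.0` to get a decimal average.

```python
import sqlite3

# Toy stand-in for a batting table; the schema and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batting (playerID TEXT, yearID INTEGER, H INTEGER, AB INTEGER)"
)
conn.executemany(
    "INSERT INTO batting VALUES (?, ?, ?, ?)",
    [
        ("playera01", 2001, 1, 1),
        ("playerb01", 2001, 30, 100),
        ("playerb01", 2002, 45, 120),
        ("playerc01", 2001, 4, 10),
        ("playerc01", 2002, 0, 0),  # no at-bats: must be filtered out before dividing
    ],
)

# Per-season batting average. Multiplying by 1.0 avoids SQLite's integer division.
season_sql = """
    SELECT playerID, yearID, 1.0 * H / AB AS batting_avg
    FROM batting
    WHERE AB >= 1
    ORDER BY batting_avg DESC, playerID ASC
    LIMIT 5
"""

# Career batting average: total the hits and at-bats per player first, then divide.
# HAVING filters on the aggregated at-bats, which WHERE cannot do.
career_sql = """
    SELECT playerID, 1.0 * SUM(H) / SUM(AB) AS batting_avg
    FROM batting
    GROUP BY playerID
    HAVING SUM(AB) >= 100
    ORDER BY batting_avg DESC, playerID ASC
    LIMIT 5
"""

season_rows = conn.execute(season_sql).fetchall()
career_rows = conn.execute(career_sql).fetchall()

for row in season_rows:
    print(row)
for row in career_rows:
    print(row)
```

In the toy data, the player with a single at-bat tops the per-season list with a perfect average, which is why the later parts of the question raise the minimum number of at-bats. Note also that the career average divides total hits by total at-bats rather than averaging the yearly averages.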

Deliverables

Use this template to submit your Client Report. The template has three sections (for additional details please see the instructional template):

  1. A short summary that highlights the key insights from the project's metrics and the tools you used (think “elevator pitch”).
  2. Answers to the grand questions. Each answer should include a written description of your results, code snippets, charts, and tables.