SQL for Data Science

There are many flavors of SQL but most flavors have the same base commands. SQL queries are typed in the following pattern;

SELECT -- <columns> and <column calculations>
FROM -- <table name>
  JOIN -- <table name>
  ON -- <columns to join>
WHERE -- <filter condition on rows>
GROUP BY -- <subsets for column calculations>
HAVING -- <filter conditions on groups>
ORDER BY -- <how the output is returned in sequence>
LIMIT -- <number of rows to return>
import pandas as pd 
import altair as alt
import numpy as np
import sqlite3

# %%
# careful to list your path to the file.
sqlite_file = 'lahmansbaseballdb.sqlite'
con = sqlite3.connect(sqlite_file)

results = pd.read_sql_query( 
    'SELECT * FROM allstarfull LIMIT 5',
    con)

results

You can see the list of tables available in the database;

table = pd.read_sql_query(
    "SELECT * FROM sqlite_master WHERE type='table'",
    con)
print(table.filter(['name']))
print('\n\n')
# 8 is collegeplaying
print(table.sql[8])