Project 3 WorkBook

Tutoring Lab Info

The data science lab is a resource you can use in person, online, and in Slack.

SQL Query and Data Retrieval

1. `sqlite3.connect()`

Connect to Database

The sqlite3.connect() function opens a connection to an SQLite database file, in this example ‘lahmansbaseballdb.sqlite’. Use a relative path: when you give only the file name, Python looks for the file in the current working directory, which is normally the folder that holds the notebook or script you are working in. Keep the .sqlite file in that folder so the connection finds it.

Code Snippet

import sqlite3

con = sqlite3.connect('lahmansbaseballdb.sqlite')
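One thing worth knowing about the step above: if the path does not point to an existing file, sqlite3.connect() creates a new, empty database there instead of raising an error, and the first query then fails with "no such table". The sketch below guards against that. The helper name connect_existing is hypothetical and is not part of sqlite3.

```python
import sqlite3
from pathlib import Path


def connect_existing(db_path):
    """Connect to an SQLite file, refusing to create a new empty one.

    Hypothetical helper for illustration; not part of the sqlite3 module.
    """
    path = Path(db_path)
    if not path.is_file():
        # Without this check, sqlite3.connect() would quietly create
        # an empty database file at this location.
        raise FileNotFoundError(f"No database file at {path.resolve()}")
    return sqlite3.connect(str(path))
```

With the database file in the working directory, `con = connect_existing('lahmansbaseballdb.sqlite')` behaves like the snippet above; with a mistyped name it stops right away with a message showing the full path that was tried.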

2. `SELECT/FROM`

SELECT/FROM Statement Explanation

The SELECT statement is used to retrieve data from one or more tables in a database. It specifies which columns to include in the result set. The FROM clause specifies the table or tables from which to retrieve data. It forms the foundation of the SELECT statement.

Code Snippet


p = """

SELECT column1, column2
FROM table_name

"""

pd.read_sql_query(p, con)
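To try the SELECT/FROM pattern without the Lahman file, here is a minimal sketch on an in-memory database. The table batting_demo, its columns, and its rows are made up for illustration. It uses sqlite3's own cursor so it runs without pandas; the same query string could be passed to pd.read_sql_query(p, con) instead.

```python
import sqlite3

# Hypothetical toy table with made-up rows, standing in for a real table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE batting_demo (playerID TEXT, teamID TEXT, HR INTEGER)")
con.executemany(
    "INSERT INTO batting_demo VALUES (?, ?, ?)",
    [("p1", "RED", 30), ("p2", "RED", 12), ("p3", "BLU", 25),
     ("p4", "BLU", 5), ("p5", "GRN", 18)],
)

# SELECT names the columns to return; FROM names the table they come from.
p = """
SELECT playerID, HR
FROM batting_demo
"""

rows = con.execute(p).fetchall()  # list of (playerID, HR) tuples
for row in rows:
    print(row)
```

Only the two selected columns come back; teamID is left out because it is not listed after SELECT.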

3. `WHERE`

WHERE Clause Explanation

In the WHERE clause, comparison operators specify which rows to keep based on a column’s value. The snippets below use = (equal to), <> (not equal to), > (greater than), and < (less than). Put text values in single quotes; write numeric values without quotes.

Code Snippet


p = """

SELECT column1, column2
FROM table_name
WHERE column1 = 'value'


"""

pd.read_sql_query(p, con)

Code Snippet


p = """

SELECT column1, column2
FROM table_name
WHERE column1 <> 'value'


"""

pd.read_sql_query(p, con)

Code Snippet


p = """

SELECT column1, column2
FROM table_name
WHERE column1 > value



"""

pd.read_sql_query(p, con)

Code Snippet


p = """

SELECT column1, column2
FROM table_name
WHERE column1 < value



"""

pd.read_sql_query(p, con)
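The four comparisons above can be checked side by side on a small in-memory table. This is a sketch under assumptions: batting_demo and its rows are made up, and the helper where() is hypothetical. The helper builds the query with an f-string, which is acceptable only for fixed demo strings like these, not for user input.

```python
import sqlite3

# Hypothetical toy table with made-up rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE batting_demo (playerID TEXT, teamID TEXT, HR INTEGER)")
con.executemany(
    "INSERT INTO batting_demo VALUES (?, ?, ?)",
    [("p1", "RED", 30), ("p2", "RED", 12), ("p3", "BLU", 25),
     ("p4", "BLU", 5), ("p5", "GRN", 18)],
)


def where(condition):
    """Return the playerIDs that satisfy a WHERE condition (demo strings only)."""
    p = f"SELECT playerID FROM batting_demo WHERE {condition} ORDER BY playerID"
    return [row[0] for row in con.execute(p)]


equal = where("teamID = 'RED'")       # text value in single quotes
not_equal = where("teamID <> 'RED'")  # every team except RED
greater = where("HR > 18")            # numeric value, no quotes
less = where("HR < 18")

print(equal, not_equal, greater, less)
```

The row with HR equal to 18 appears in neither the > 18 nor the < 18 result, because both comparisons are strict.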

5. `ORDER BY`

ORDER BY Clause Explanation

The ORDER BY clause sorts the result set by one or more columns. The default order is ascending (ASC), which sorts from the lowest value to the highest. DESC sorts from the highest value to the lowest.

Code Snippet


p = """

SELECT column1, column2
FROM table_name
ORDER BY column1 DESC

"""

pd.read_sql_query(p, con)

Code Snippet


p = """

SELECT column1, column2
FROM table_name
ORDER BY column1 ASC

"""

pd.read_sql_query(p, con)
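A short sketch of both sort directions on an in-memory table. The table batting_demo and its rows are made up for illustration; the same query strings could be passed to pd.read_sql_query.

```python
import sqlite3

# Hypothetical toy table with made-up rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE batting_demo (playerID TEXT, teamID TEXT, HR INTEGER)")
con.executemany(
    "INSERT INTO batting_demo VALUES (?, ?, ?)",
    [("p1", "RED", 30), ("p2", "RED", 12), ("p3", "BLU", 25),
     ("p4", "BLU", 5), ("p5", "GRN", 18)],
)

# Highest HR first.
descending = con.execute(
    "SELECT playerID, HR FROM batting_demo ORDER BY HR DESC"
).fetchall()

# No keyword given, so the default (ASC) applies: lowest HR first.
ascending = con.execute(
    "SELECT playerID, HR FROM batting_demo ORDER BY HR"
).fetchall()

print(descending)
print(ascending)
```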

7. `ALIAS`

Alias Explanation

An alias is a temporary name assigned to a table or column in a SQL query with the AS keyword. It exists only for that query and can be used to make the output more readable or to shorten lengthy column names.

Code Snippet


p = """

SELECT column1 AS alias_name
FROM table_name

"""

pd.read_sql_query(p, con)
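The sketch below shows a column alias and a table alias in one query, and reads the resulting column name back from the cursor. The table batting_demo, the alias names, and the rows are made up for illustration.

```python
import sqlite3

# Hypothetical toy table with made-up rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE batting_demo (playerID TEXT, teamID TEXT, HR INTEGER)")
con.executemany(
    "INSERT INTO batting_demo VALUES (?, ?, ?)",
    [("p1", "RED", 30), ("p2", "RED", 12)],
)

# b is a table alias; home_runs is a column alias.
p = """
SELECT b.playerID, b.HR AS home_runs
FROM batting_demo AS b
"""

cur = con.execute(p)
# cursor.description holds one entry per result column; the name comes first.
column_names = [col[0] for col in cur.description]
print(column_names)
```

The alias only renames the column in the result; the table itself still has a column called HR.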

6. `CAST`

CAST Function Explanation

The CAST function is used to convert data from one data type to another. It is particularly useful for performing calculations or comparisons on data of different types.

Code Snippet


p = """

SELECT CAST(column_name AS new_data_type) AS new_column_name
FROM table_name


"""

pd.read_sql_query(p, con)
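A common reason to reach for CAST is a number stored as text, which then sorts alphabetically rather than numerically. The sketch below shows the difference; the table salaries_demo and its values are made up for illustration.

```python
import sqlite3

# Hypothetical table whose numbers were stored as text.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE salaries_demo (amount TEXT)")
con.executemany(
    "INSERT INTO salaries_demo VALUES (?)",
    [("9",), ("10",), ("100",)],
)

# Sorting the raw text compares character by character.
as_text = [row[0] for row in con.execute(
    "SELECT amount FROM salaries_demo ORDER BY amount"
)]

# CAST converts each value to an integer before sorting.
as_int = [row[0] for row in con.execute(
    """
    SELECT CAST(amount AS INTEGER) AS amount_int
    FROM salaries_demo
    ORDER BY amount_int
    """
)]

print(as_text)
print(as_int)
```

As text, '9' sorts after '100' because the character '9' is greater than '1'; after the cast the values sort as numbers.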

8. `HAVING`

HAVING Clause Explanation

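The HAVING clause filters the groups produced by GROUP BY, based on a condition that usually involves an aggregate function. It is similar to the WHERE clause, but WHERE filters individual rows before they are grouped, while HAVING filters whole groups after the aggregate has been calculated.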

Code Snippet



p = """

SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1
HAVING aggregate_function(column2) > value

"""


pd.read_sql_query(p, con)
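To make the difference from WHERE concrete, the sketch below applies the same threshold once with HAVING and once with WHERE. The table batting_demo and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical toy table with made-up rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE batting_demo (playerID TEXT, teamID TEXT, HR INTEGER)")
con.executemany(
    "INSERT INTO batting_demo VALUES (?, ?, ?)",
    [("p1", "RED", 30), ("p2", "RED", 12), ("p3", "BLU", 25),
     ("p4", "BLU", 5), ("p5", "GRN", 18)],
)

# HAVING: total every team first, then keep teams whose total exceeds 20.
with_having = con.execute("""
SELECT teamID, SUM(HR) AS total_hr
FROM batting_demo
GROUP BY teamID
HAVING SUM(HR) > 20
ORDER BY teamID
""").fetchall()

# WHERE: drop rows with HR of 20 or less first, then total what remains.
with_where = con.execute("""
SELECT teamID, SUM(HR) AS total_hr
FROM batting_demo
WHERE HR > 20
GROUP BY teamID
ORDER BY teamID
""").fetchall()

print(with_having)
print(with_where)
```

Both queries return the same two teams here, but with different totals, because WHERE removed the low-HR rows before they could be summed.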

9. `LIMIT`

LIMIT Clause Explanation

The LIMIT clause is used to limit the number of rows returned in a result set. It is often used in combination with the ORDER BY clause to retrieve a specific number of top or bottom records.

Code Snippet


p = """

SELECT column1, column2
FROM table_name
LIMIT 10

"""

pd.read_sql_query(p, con)
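A minimal sketch of the "top N" pattern, combining ORDER BY with LIMIT. The table batting_demo and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical toy table with made-up rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE batting_demo (playerID TEXT, teamID TEXT, HR INTEGER)")
con.executemany(
    "INSERT INTO batting_demo VALUES (?, ?, ?)",
    [("p1", "RED", 30), ("p2", "RED", 12), ("p3", "BLU", 25),
     ("p4", "BLU", 5), ("p5", "GRN", 18)],
)

# Sort from highest to lowest HR, then keep only the first two rows.
p = """
SELECT playerID, HR
FROM batting_demo
ORDER BY HR DESC
LIMIT 2
"""

top_two = con.execute(p).fetchall()
print(top_two)
```

Without ORDER BY, LIMIT still returns at most that many rows, but which rows you get is not guaranteed.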

10. `GROUP BY`

GROUP BY Clause Explanation

The GROUP BY clause is used to group rows in a result set based on one or more columns. It is typically used with aggregate functions to perform calculations on grouped data.

Code Snippet


q = """

SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1


"""

pd.read_sql_query(q, con)
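The sketch below groups a small in-memory table by team and applies three aggregate functions to each group. The table batting_demo, the alias names, and the rows are made up for illustration.

```python
import sqlite3

# Hypothetical toy table with made-up rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE batting_demo (playerID TEXT, teamID TEXT, HR INTEGER)")
con.executemany(
    "INSERT INTO batting_demo VALUES (?, ?, ?)",
    [("p1", "RED", 30), ("p2", "RED", 12), ("p3", "BLU", 25),
     ("p4", "BLU", 5), ("p5", "GRN", 18)],
)

# One output row per teamID; each aggregate is computed within that team.
q = """
SELECT teamID,
       COUNT(*) AS players,
       SUM(HR)  AS total_hr,
       AVG(HR)  AS avg_hr
FROM batting_demo
GROUP BY teamID
ORDER BY teamID
"""

per_team = con.execute(q).fetchall()
for row in per_team:
    print(row)
```

Five input rows collapse into three output rows, one for each distinct teamID.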
