10:00
Cameron Marsden and Jeff Pannell
We are Cameron Marsden and Jeff Pannell. We are senior data analysts at OSPI. We work on the Report Card. Combined we have 13 years of SQL experience.
Introduce yourself!
What is your name?
What do you do?
What is your experience with SQL?
How do you work with data?
If you find that you’re finishing the exercises early, try helping your neighbor (if asked). Trying to explain why what you did worked can help you learn.
If you want to learn more:
W3 Schools online tutorials (https://www.w3schools.com/)
Geeks For Geeks (https://www.geeksforgeeks.org)
Example Google Search “W3Schools SQL how to execute based off a condition.”
Centralized data storage solution
Data source for reports
Compute calculations on data and store for easy and convenient access
Find statistics on your data without external tools
Conduct data exploration to better understand your data
Other Advanced uses (Partitions, Conditional Statements, Running Tetris … seriously you can. But don’t)
Tables
Views
10:00
Download DBeaver Community from: https://dbeaver.io/download/
Open it on your computer.
Only install it for your account OR just unzip it.
Download the wera-db.duckdb database file
Open DBeaver.
In the upper left corner, click on the blue plug with the green plus side.
In the popup that appears, type in “duck”, and click on the only option. Then click “next”
Click on “Open” and navigate to where you downloaded “wera-db.duckdb” and select that file. Then click Finish
You may need to download the JDBC driver. If prompted, please do so.
SELECT
or select
or even sElEcT
(please don’t). The convention is to use all capital letters for key words./* Here's another way
to leave a
comment over multiple lines */
SELECT *
FROM Table1
SELECT SchoolYear
,graduationrate
FROM graduation
SELECT DISTINCT
SchoolYear
FROM "wera-db".main.graduation
spaces and new lines don’t matter for making the code work. The sql code can all be on one line.
Important
Remember that you can get data from a table using a combination of a SELECT statement and a FROM clause.
05:00
It would be a real shame if somebody left a handy select syntax reference link here.
05:00
Wait, this is the same select syntax reference link as the last slide!
05:00
This link seems DISTINCT from the other two select syntax reference links.
-- Question 1
SELECT *
FROM "wera-db".main.enrollment;
-- Question 2
SELECT DISTINCT SchoolYear
FROM "wera-db".main.enrollment;
-- Question 3
SELECT DISTINCT
DistrictName
,SchoolName
FROM "wera-db".main.enrollment;
Don’t worry too much about how your code looks. As long as it runs and you can read it later, it’s fine. If you want very well organized code that can take some time and effort. Jeff recommends giving it to Cameron to write.
WHERE statements tell the query what filter conditions you want. WHERE clauses always come after the FROM Clause.
SELECT *
FROM "wera-db".main.graduation AS g -- The AS keyword is an alias.
-- You can rename something if you
-- don't want to have to type it
-- out all the time.
WHERE g.GraduationRate < 0.1
The ‘%’ character is a wildcard. It means “anything can go where this character is”.
Important
Remember that WHERE clauses FILTER data.
05:00
This is WHERE somebody left a handy reference link to WHERE clauses again.
05:00
Find all the schools in the Seattle School District.
Find all schools in the Seattle School District that have “Elementary” in their name.
WHERE could a person find a helpful reference for questions LIKE we have above?
05:00
Find all the schools in the Seattle School District.
Find all schools in the Seattle School District that have “Elementary” in their name.
From that list of schools, make the largest student count be on top. (Extra credit! LIMIT the results to 5)
All these references are completely out of ORDER?
-- Question 1
SELECT DISTINCT
DistrictName
,SchoolName
FROM "wera-db".main.enrollment
WHERE DistrictName = 'Seattle School District No. 1';
-- Question 2
SELECT DISTINCT
DistrictName
,SchoolName
FROM "wera-db".main.enrollment
WHERE DistrictName = 'Seattle School District No. 1'
AND SchoolName LIKE '%Elementary%';
-- Question 3
SELECT DiSTINCT
DistrictName,
SchoolName,
student_count
FROM "wera-db".main.enrollment
WHERE DistrictName = 'Seattle School District No. 1'
AND SchoolName LIKE '%Elementary%'
AND student_group = 'All Students'
AND GradeLevel = 'All Grades'
ORDER BY student_count DESC;
-- Extra Credit! Use this as the final line on Question 3
LIMIT 5;
SELECT
COUNT(SchoolName) as school_count
--This will count all non-null values from SchoolName in the table
FROM main.enrollment
Important
Remember that if you want to AGGREGATE, you will need to GROUP BY the other columns.
05:00
05:00
We can COUNT on this reference being GREATER THAN the last one.
05:00
This is a tricky question, but there has to be SUM operator we can use
-- Question 1
SELECT COUNT(*) as DistrictCount
FROM "wera-db".main.enrollment
WHERE student_count BETWEEN 900 AND 1100
AND OrganizationLevel = 'District'
AND student_group = 'All Students'
AND GradeLevel = 'All Grades';
-- Question 2
SELECT DistrictName, COUNT(*) as BigSchoolCount
FROM "wera-db".main.enrollment
WHERE student_count > 1000
AND OrganizationLevel = 'School'
AND student_group = 'All Students'
AND GradeLevel = 'All Grades'
GROUP BY DistrictName
ORDER BY BigSchoolCount DESC;
-- Question 3
SELECT County
,SUM(student_count) as Student_Count
FROM "wera-db".main.enrollment
WHERE OrganizationLevel = 'District'
AND student_group = 'All Students'
AND GradeLevel = 'All Grades'
GROUP BY County
ORDER BY Student_Count DESC;
Image references may expire, unlike Kit Kat bars.
10:00
LEFT JOIN INNER JOIN
05:00
-- Example Syntax from 4.2
SELECT DistrictName, COUNT(*) as BigSchoolCount
FROM "wera-db".main.enrollment
WHERE student_count > 1000
AND OrganizationLevel = 'School'
AND student_group = 'All Students'
AND GradeLevel = 'All Grades'
GROUP BY DistrictName
ORDER BY BigSchoolCount DESC;
Fun fact! People who learn SQL are never lonely at lunch because they know how to JOIN tables.
05:00
Fun fact! People who learn SQL are never lonely at lunch because they know how to JOIN tables.
05:00
-- Question 1
SELECT E.DistrictName,
COUNT(*) as BigSchoolCount,
PD.Administrator,
FROM "wera-db".main.enrollment E
LEFT JOIN "wera-db".main.public_directory PD
ON E.DistrictOrganizationId = PD.DistrictOrganizationId
AND PD.OrganizationLevel = 'District'
WHERE student_count > 1000 AND
E.OrganizationLevel = 'School' AND
student_group = 'All Students' AND
GradeLevel = 'All Grades'
GROUP BY E.DistrictName, PD.Administrator
ORDER BY BigSchoolCount DESC;
-- Question 2
SELECT DISTINCT
e.SchoolYear
,e.SchoolName
,g.GraduationRate
FROM "wera-db".main.enrollment e
LEFT JOIN "wera-db".main.graduation g
ON g.SchoolOrganizationId = e.SchoolOrganizationID
WHERE e.student_group = 'All Students'
AND e.GradeLevel = 'All Grades'
AND e.SchoolName != 'District Total'
AND g.GraduationRate IS NOT NULL
AND g.Cohort = 'Four Year'
AND g.StudentGroupType LIKE '%All%'
AND e.student_count > 1000
ORDER BY g.GraduationRate DESC
-- Question 3
SELECT DISTINCT
e.SchoolYear
,e.SchoolName
,g.GraduationRate
FROM "wera-db".main.enrollment e
LEFT JOIN "wera-db".main.graduation g
ON g.SchoolOrganizationId = e.SchoolOrganizationID
WHERE e.student_group = 'All Students'
AND e.GradeLevel = 'All Grades'
AND e.SchoolName != 'District Total'
AND g.GraduationRate IS NULL;
WITH step1 AS (
SELECT SchoolYear,
SUM(graduate)/SUM(finalcohort) AS grad_rate,
SUM(continuing)/SUM(finalcohort) AS continuing_rate
FROM "wera-db".main.graduation
WHERE studentgrouptype = 'All'
AND cohort = 'Four Year'
AND organizationlevel = 'State'
GROUP BY SchoolYear)
SELECT SUM(grad_rate + continuing_rate) AS combined_rate
FROM step1
15:00
Some people really like the CTE, but I prefer the mountains.
-- Question 1
WITH big_schools AS ( SELECT E.DistrictName,
COUNT(*) as BigSchoolCount,
PD.Administrator,
FROM "wera-db".main.enrollment E
LEFT JOIN "wera-db".main.public_directory PD
ON E.DistrictOrganizationId = PD.DistrictOrganizationId
AND PD.OrganizationLevel = 'District'
WHERE student_count > 1000 AND
E.OrganizationLevel = 'School' AND
student_group = 'All Students' AND
GradeLevel = 'All Grades'
GROUP BY E.DistrictName, PD.Administrator)
SELECT *
FROM big_schools
If you want to learn more:
W3 Schools online tutorials (https://www.w3schools.com/)
Geeks For Geeks (https://www.geeksforgeeks.org)
Example Google Search “W3Schools SQL how to execute based off a condition.”
What data questions are do you encounter in your job?
How do you think you can answer these questions, or questions like these using SQL now?
For example, what types of functions would you use to:
Here is the link to a form where you can tell us what we did well and how we can improve. https://forms.office.com/r/gDm9wXPyrF
Similar to WHERE statements, but they are used if you are filtering based off an aggregate function.
Comes after the GROUP BY clause and before the ORDER BY clause.
If you need to filter on a row-level criteria, use the WHERE clause.
The list below contains all the possible keywords. We will get to all of them in this workshop.
This is one of those areas where SQL languages will differ.
Most languages will say “Put the LIMIT keyword after everything else and put the number of rows you want.
Microsoft SQL Products will have you put the TOP keyword after the SELECT Statement.