SQL

CIS 112, Dr. Ladd

spacebar to go to the next slide, esc/menu to navigate

Structured Query Language

creates rules for defining and manipulating databases.

SQL works in Postgres

but it’s not the same as Postgres.

Postgres is the program, SQL is the language.

Postgres is an object-relational database management system (DBMS). It…

  • is open-source, i.e. it’s free for all to use, redistribute, and modify
  • supports SQL (short for PostgreSQL)
  • maintains referential integrity according to Codd’s relational model
  • allows developers to implement their own “atomic” data types

Data types are part of a single “object”

Using a relational approach, data “objects” can inherit values from other objects.

Data Definition Language (DDL)

Consider the groceries example…

date item price quantity
8-11-2023 Apple $3.99 2
8-14-2023 CokeZero $4.99 3
8-14-2023 Apple $3.99 5

… and its ERD.

You can CREATE individual schemas.

CREATE SCHEMA "groceries";

Always use quotes for attributes and names!

You can SET that schema as your search_path.

You should do this at the beginning of every SQL script in NotePad++. (Save files with .sql extension.)

SET search_path TO "groceries";

You can CREATE an entire TABLE with one command.

CREATE TABLE "cost" (
    "item" TEXT PRIMARY KEY,
    "price" FLOAT(3) NOT NULL
);

Make sure you include data types, constraints, and primary keys.

The Postgres documentation is a huge help!

Sometimes your table will have a composite PRIMARY KEY.

CREATE TABLE "purchase" (
    "date" DATE NOT NULL,
    "item" TEXT NOT NULL,
    "quantity" INTEGER NOT NULL,
    PRIMARY KEY ("date", "item")
);

Delete tables with DROP and CASCADE.

DROP TABLE removes the table and the CASCADE argument removes objects that depend on the table.

DROP TABLE "purchase" CASCADE;

Add CONSTRAINTs to individual columns or the whole table.

Use the Postgres Documentation for a complete guide.

You’ve already done this with PRIMARY KEY and NOT NULL!

You can make one or many columns UNIQUE.

CREATE TABLE "cost" (
    "item" TEXT UNIQUE,
    "price" FLOAT(3)
);

or

CREATE TABLE "cost" (
    "item" TEXT,
    "price" FLOAT(3),
    UNIQUE ("item")
);

Add foreign keys with the REFERENCES constraint.

  • Avoid errors with appropriate flow: i.e. you must reference a table that already exists!
  • If you don’t specify the columns, SQL will automatically use the reference table’s primary key.

Examples of the REFERENCES constraint

CREATE TABLE "purchase" ( --Option 1: Column Constraint
    "date" DATE,
    "item" TEXT REFERENCES "cost" ("item"),
    "quantity" INTEGER NOT NULL,
    PRIMARY KEY ("date", "item")
);
CREATE TABLE "purchase" ( --Option 2: Table Constraint
    "date" DATE,
    "item" TEXT,
    "quantity" INTEGER NOT NULL,
    PRIMARY KEY ("date", "item"),
    FOREIGN KEY ("item") REFERENCES "cost" ("item")
);

Create custom constraints with CHECK.

CREATE TABLE "cost" (
    "item" TEXT PRIMARY KEY,
    "price" FLOAT(3) CHECK ("price" > 0)
);

You try it!

  1. Completely delete the groceries schema (but save your .sql files).
  2. Rewrite your DDL script with all original constraints.
  3. Add a constraint for the correct foreign key in the “purchase” table.
  4. Add a constraint to prevent “price” from being negative.
  5. Re-run your INSERT commands from last time to repopulate the database.
  6. Look at the schemas and data (with SELECT) to see if it worked.

Change tables with ALTER.

A few examples:

-- Add a column
ALTER TABLE "cost" ADD COLUMN "weight" FLOAT(4);

-- Drop a column
ALTER TABLE "purchase" DROP COLUMN "quantity";

-- Rename a column
ALTER TABLE "purchase" RENAME COLUMN "date" TO "day";

-- Add (or remove) a constraint
ALTER TABLE "purchase" ADD FOREIGN KEY ("item") REFERENCES "cost";

More in the Postgres documentation.

Data Manipulation Language (DML)

INSERT data into your tables.

INSERT INTO "cost" VALUES ('Apple', 3.99);

Or INSERT into specific columns.

INSERT INTO "cost" (item) VALUES ('CokeZero');

This will only work if the other columns can be NULL!

You try it!

Enter the command for inserting data into the purchase table.

HINT: You may need to look at the Postgres documentation for data types.

Check if your data is there with SELECT.

SELECT * from "cost";
SELECT * from "purchase";

We’ll learn a lot more about SELECT later on…

DELETE records from your table.

DELETE FROM "purchase" WHERE "item"='Apple';
  • Delete takes a condition with WHERE (not IF)
  • Type of quotation marks is important! Tables and column names in double; values in single.
  • DELETE FROM "purchase"; will delete ALL the data without warning.

You try it!

Delete CokeZero from the cost table.

Copy an entire table from a CSV file.

\copy "Musical Instrument" FROM 'musical_instruments.csv' WITH CSV HEADER
  • The column names MUST be in the same order.
  • The CSV must be in the same directory as your psql login.

Change records with UPDATE.

-- Change one or many rows
UPDATE "purchase" SET "quantity" = 6 WHERE "quantity" = 5;

-- Change all rows based on original value
UPDATE "cost" SET "price" = "price" * 100;

-- Change more than one column
UPDATE "cost" SET "item" = 'Banana', "price" = 599 WHERE "item" = 'Apple';

All this and more in the Postgres documentation.

Get data while modifying rows with RETURNING.

INSERT INTO "cost" VALUES ('CokeZero', 4.99) RETURNING "item";

UPDATE "cost" SET "price" = "price" * 100 RETURNING "price";

As always, more in the Postgres documentation.

Queries (DML)

ERDs are essential for queries.

You need them for column names, but also for capitalization, combining different tables, etc.

SELECT is the command for all queries.

In its simplest form, you can use it to do simple arithmetic or to get all rows from a table.

SELECT 5 * 9;
SELECT * FROM "purchase";
SELECT "item", "quantity" FROM "purchase";

Perform arithmetic on columns and give them new names with AS.

SELECT "item", "quantity", "quantity" + 5 AS "updated_quantity" FROM "purchase";

If the column isn’t numeric (i.e. money or a date), you may have to type cast to avoid errors:

SELECT "item", "quantity", "quantity"::NUMERIC + 5 AS "updated_quantity" FROM "purchase";

Other forms of type casting:

"quantity"::NUMERIC::INTEGER
CAST("quantity" AS NUMERIC)
CAST(CAST("quantity" AS NUMERIC) AS INTEGER))

Sort results with ORDER BY.

SELECT "item", "quantity" FROM "purchase" ORDER BY "item";

ORDER BY can also be used with arithmetic expressions and newly created columns!

Get specific rows with WHERE conditions.

SELECT "item", "date" FROM "purchase" WHERE "item" = 'Apple';

Get a set number of rows with LIMIT.

SELECT * FROM "purchase" LIMIT 2;

Use OFFSET to start somewhere other than the beginning (which is row 0):

SELECT * FROM "purchase" LIMIT 2 OFFSET 1;

It’s recommended to use LIMIT with ORDER BY always!

Remember, you can always refer back to the documentation:

SELECT queries: https://www.postgresql.org/docs/14/queries-overview.html

ORDER BY and AS: https://www.postgresql.org/docs/14/queries-order.html

LIMIT and OFFSET: https://www.postgresql.org/docs/14/queries-limit.html

You Try It!

  1. Finish adding tables and data to the Music schema.
  2. Write queries to answer the following questions:
  • What pitches were used in files where a clarinet was played?
  • What would each instrument cost if I got a 10% discount?
  • Which tone colors are visible?
  • What are the last three note pitches?
  1. Save all of these queries in a separate .sql file, with comments that explain what each one does.
  2. When you finish, return to the other practice data sets.

Aggregate functions let you calculate based on columns.

SELECT COUNT("item") FROM "purchase";

SELECT AVG("price"::NUMERIC) FROM "cost";

Remember to add type casts as needed!

Key aggregate functions:

  • COUNT()
  • AVG()
  • SUM()
  • MIN()
  • MAX()

You can use the full list from the Postgres documentation!

Use DISTINCT to work with unique values.

SELECT COUNT(DISTINCT "item") FROM "purchase";

GROUP rows to summarize your table.

SELECT "item", AVG("quantity") FROM "purchase" GROUP BY "item";

Use HAVING to limit groups.

https://www.postgresql.org/docs/14/queries-table-expressions.html#QUERIES-GROUP

Subqueries define connections between tables.

There are several expressions for subqueries in the documentation.

Use foreign keys to guide you! This is ideal for optional relations.

-- This should all be on one line in your .sql file
SELECT AVG("price"::NUMERIC) FROM "cost" WHERE "item" 
    IN (SELECT "item" FROM "purchase" WHERE "quantity" < 3);

You try it!

Write queries to answer the following for the “music” schema:

  1. What is the average cost of all instruments?
  2. What is the average cost of each instrument (since some have multiple manufacturers)?
  3. What is the maximum frequency of pitch colors with a WvMin (for color) of less than 500?
  4. How many individual instrument families are there?

Combine tables with JOIN

A JOIN combines results from 2 or more tables.

The basics of JOIN syntax:

https://www.postgresql.org/docs/14/tutorial-join.html

Details on different JOIN statements:

https://www.postgresql.org/docs/14/queries-table-expressions.html#QUERIES-FROM

INNER JOINs only return rows that exist in both tables.

Consider the inner part of a venn diagram. Only the overlapping part is returned.

Let’s try to make venn diagrams for LEFT, RIGHT, and FULL OUTER JOINs.

INNER JOIN is the default.

SELECT * FROM "cost" JOIN "purchase" ON "cost"."item" = "purchase"."item";

Match foreign keys with ON or with USING.

--Using needs parentheses and eliminates duplicates
SELECT * FROM "cost" JOIN "purchase" USING ("item");

Use Aliases to simplify your statements.

SELECT c."item", c."price", p."date"
    FROM "cost" AS c JOIN "purchase" AS p ON c."item" = p."item";

You can do this without AS, too.

SELECT c."item", c."price", p."date"
    FROM "cost" c JOIN "purchase" p ON c."item" = p."item";

You Try It!

  1. Find the dynamic, MIDI, and frequency for every pitch.
  2. Get the average cost for each family of instrument.
  3. Get WvMin and WvMax values for both pitch and color.

OUTER JOIN resources

You Try It: Practicing JOINS

  1. Make a list of all instruments, their families, and their cost (even if there’s no cost listed).
  2. Get the average frequency played by each instrument.
  3. Create a table showing all colors with their pitch, their visibility, and whether they were played on piano. The table should show pitches that don’t have colors and vice versa.
  4. Show the files and pitches in which pitches were played with a frequency greater than 210, and show any of those pitches that were not played in files.

Practicing JOINS (cont.)

  1. Get the names, ages, games, and star ratings for every spy who playtested a game and every spy who didn’t playtest.
  2. List the platform for every game played by Jason Bourne.
  3. Which spies played their games for more than one hour?
  4. Create a table of all games, the year they came out, all spies, and their ages, whether or not a spy playtested a game or a game was playtested at all.