A Beginner’s Guide to SQL
SQL is one of the core skills of a data engineer and data scientist. This mini-tutorial explains the four fundamental SQL functions: Create, Read, Update, and Delete using a fun example of movie quotes database.
Next step is filling up the table with our movie quotes data. There are many available tools with graphic interfaces (GUI) for managing tables and data in the database. But often it is quicker to write an SQL script, which is basically a collection of SQL commands that will be executed sequentially. And this is especially handy when you need to fill up your table with lots of data. SQL command for inserting or adding a row of data into the table is INSERT. The format is as follows:
INSERT INTO table_name (column_1, column_2, ... column_n) VALUES (value_1, value_2, ... value_n);
To insert a row of data into table, `INSERT` keyword is followed with keyword `INTO` and a name of the table. Following column names in the parenthesis and separated by commas is optional, but is a good practice to define the columns to be sure that the proper data will be inserted into proper columns. The last part is defining what data will be inserted with a keyword `VALUES`, followed by the list of values enclosed in parenthesis. Please note that strings should be enclosed in single quotes, and numbers should not. Our example SQL script that will fill up our `Movie_quotes` table looks like this:
INSERT INTO Movie_quotes (Q_TEXT, Q_CHARACTER, Q_MOVIE, Q_YEAR) VALUES ('I find your lack of faith disturbing.', 'Darth Vader', 'Star Wars', 1977); INSERT INTO Movie_quotes (Q_TEXT, Q_CHARACTER, Q_MOVIE, Q_YEAR) VALUES ('It’s a trap!', 'Admiral Ackbar', 'Star Wars', 1983); INSERT INTO Movie_quotes (Q_TEXT, Q_CHARACTER, Q_MOVIE, Q_YEAR) VALUES ('Never tell me the odds.', 'Han Solo', 'Star Wars', 1980); INSERT INTO Movie_quotes (Q_TEXT, Q_CHARACTER, Q_MOVIE, Q_YEAR) VALUES ('Do. Or do not. There is no try.', 'Yoda', 'Star Wars', 1980); INSERT INTO Movie_quotes (Q_TEXT, Q_CHARACTER, Q_MOVIE, Q_YEAR) VALUES ('Stupid is as stupid does.', 'Forrest Gump', 'Forrest Gump', 1994); INSERT INTO Movie_quotes (Q_TEXT, Q_CHARACTER, Q_MOVIE, Q_YEAR) VALUES ('My mama always said: Life was like a box of chocolates. You never know what you’re gonna get.', 'Forrest Gump', 'Forrest Gump', 1994); INSERT INTO Movie_quotes (Q_TEXT, Q_CHARACTER, Q_MOVIE, Q_YEAR) VALUES ('Run, Forrest! Run!', 'Jenny Curran', 'Forrest Gump', 1994); INSERT INTO Movie_quotes (Q_TEXT, Q_CHARACTER, Q_MOVIE, Q_YEAR) VALUES ('I’ll be back', 'The Terminator', 'The Terminator', 1984);
With the data saved in the database, we can now query that data and see what is stored in our table, and we can filter and sort that data in different ways. The `SELECT` statement is used to query, or select, the data that we want returned from the database. We will start with very simple queries, but the SELECT statement has many different options and extensions, which provide the great flexibility that we may ultimately need. The basic `SELECT` statement syntax is as follows:
SELECT column_1, column_1, ... column_n FROM table_name;
Declared column names determine which columns will be returned in the results, and in which order. If we want to select all the columns, or we just don’t know which exactly columns are in the table, we can use a wildcard asterisk character, `*`, that will select everything from the database:
SELECT * FROM table_name;
Query that will show all the data from our example table would look like this:
SELECT * FROM Movie_quotes;
And a query that will show only the list of quotes followed with a year of the movie, would look like this:
SELECT Q_TEXT, Q_YEAR FROM Movie_quotes;
Sometimes we don’t want to get all the data from the table. This may be the case when there is a lot of data in the table, or when we are searching for specific data that match certain criteria. For this we can use `WHERE` SQL clause.
The `WHERE` clause will filter records and limit which records are retrieved from the database to those that meet specified defined criteria:
SELECT column_1, column_1, ... column_n FROM table_name WHERE column_name operator value;
As mentioned, the `WHERE` clause is optional, but if we decide to use it, the following operators are available:
- `=` – Equal to.
- `>` – Greater than.
- `<` – Less than.
- `>=` – Greater than or equal to.
- `<=` – Less than or equal to.
- `<>` – Not equal to.
- `BETWEEN` – Between two values.
- `LIKE` – Search for a pattern.
- `IN` – Multiple possible values for a column.
Mathematical operators are self-explanatory. The `BETWEEN` operator looks for values that are between two declared values, including equality on both ends. The `LIKE` pattern matching operator is a very powerful operator that allows us to select only rows that are similar to what we specify. The percent sign `%` can be used as a wildcard to match any possible character that might appear before or after the specified characters. For example, to get only movie quotes from the movie Star Wars we would write:
SELECT * FROM Movie_quotes WHERE Q_MOVIE = ‘Star Wars’;
Please note that the `WHERE` clause is case sensitive, and that the following SQL command will return no result:
SELECT * FROM Movie_quotes WHERE Q_MOVIE = ‘STAR WARS’;
There can also be more than one `WHERE` clause, combined with logical operators `AND` and `OR`. If we end up using more than one `AND` logical operator on the same column, we should consider using the `IN` clause instead. Again, to show this with an example, let’s return all the movie quotes from the movies Star Wars and The Terminator:
SELECT * FROM Movie_quotes WHERE Q_MOVIE = ‘Star Wars’ AND Q_MOVIE = ‘The Terminator’;
Better written above example would be by using `IN` clause instead:
SELECT * FROM Movie_quotes WHERE Q_MOVIE IN (‘Star Wars’, ‘The Terminator’);