Manipulating Schema with SQL
Altering Tables
To rename a table, type RENAME TABLE, then the table name, the TO keyword, then the new name. Remember, you can use more than one statement, so long as you have a semi colon at the end of each.
However, you can write that in one statement by comma separating them.
To delete a table, use the DROP keyword, then TABLE, then its name. You can optionally add IF EXISTS to make it so that if the table doesn’t exist it just throws a warning rather than an error.
To remove a table without deleting it forever, you can use the TRUNCATE keyword, then the table name (you don’t need the TABLE keyword). Note that this differs from DML, it’s still DDL. It actually deletes the table, then makes a new one with the same name but no rows in it.
Altering Columns
To add a column, write ALTER TABLE, then the table name, then ADD, then the optional keyword COLUMN (to specify what we’re adding), then its name and data type.
To add multiple columns, simply wrap them in parentheses and comma separate them.
To change the names of preexisting columns, replace ADD with CHANGE. Once again COLUMN is optional. Then, write it’s old name, a space, then its new one, then its data type definition. Unlike ADD, CHANGE cannot accept multiple columns. You’d need a second statement.
Now, for our year column in the movies table, year is actually a MySQL keyword, so its better to rename it to something else.
To remove a column, you’d just use DROP.
Removing Databases
You used to be able to rename a db, but they removed it because it was dangerous. It’s better to back one up, then move that data into a new one.
Deleting a db deletes the schema and the data. To delete a db, you’d write DROP DATABASE (or SCHEMA), an optional IF EXISTS, then the name of the db you want to delete.
Joining Relational Data Between Tables in SQL
Normalization
SQL_SAFE_UPDATES is there to make sure we’ve modeled our data in a safe way.
So, we’d make a separate table with the different movie genres listed, each with an id, that we could then reference in another table.
Keys and Auto-Incrementing Values
There are different kinds of keys.
We’ll now use the treehouse db, and create a new table called genres, with the needed column definitions. We’ll add PRIMARY KEY for the id one, to specify it as such. Before that we’ll add AUTO_INCREMENT, which increases the value by one with each new row. The name is a unique key, so we’ll add UNIQUE, with an optional KEY. Since unique keys can be null, but we don’t want that, we’ll add NOT NULL, where specified by the documentation.
Now, lets add a genre to our table. If we try adding that same one a second time, it will give us an error, due to that fact that that column is for unique keys.
Next we’ll add a id column to our movies table, which will auto increment and be a primary key. We’ll also make it the first column in our table by adding FIRST at the end.
Now we’ll add an genre_id, which we don’t want to increment (since we’ll use the same one for multiple rows), nor be primary or the first column. It can be null so we’ll add NULL. We want a constraint so only certain values can be put into this column, in this case the ids that are listed in the genres table. To do that we write ADD CONSTRAINT, then FOREIGN KEY, then the column name to be affected by it in parentheses, then REFERENCES, then the table name, and its column in brackets. Note that even though I broke it onto different lines it’s still one statement.
Now we’ll add the genre id of sci fi to the aliens movies. In our genres table, sci fi is 1.
Joining Tables and Aliasing
To join our two tables, we’ll do SELECT * FROM, then JOIN between their names, then ON for the condition where we set two columns equal to each other, in this case the foreign key on the movies table, genre_id, with the primary key on the genres table, id.
The JOIN here is known as and can also be written as an INNER JOIN, which is the data between the two that overlaps in both. If you want to pull in data outside of the overlapping data, write LEFT OUTER instead.
The left here means that we get all the info from the left join, the movies table, plus the inner join columns. If we wrote RIGHT we’d get this.
Let’s say we just wanted to return the title from the movie table and the genre from the genre table. As we’ve seen, to select a column in a table you write table_name.column_name. So, we can replace the star with the needed columns.
But for the name column, it’s not very clear that it’s the genre name. So, we can use aliasing, which lets us rename that. To do that, after the column name simply add AS, then the new name.
Note that these alias’s can’t be queried on later in the string, you’d have to the table_name.column_name format.
If the column name is unique, and only in one table, you can just write its name, but if it’s not and is in both tables, like id, it won’t work.
SQL Calculating, Aggregating and Other Functions
Numeric and Aggregate Functions
We have a new table for reviews, containing columns for id, movie_id (a foreign key) and a score. We’ll check out our table where movie_id = 1, which if we check our movies table we can see is the movie alien.
To see how many reviews there are, use the COUNT function, which takes an argument, in this case * for all the columns.
Once again, we can use an alias to make the column name more readable.
To check for the minimum value, we can use MIN, and pass it the column we want to check, in this case score. MAX works the same way.
To get the average we could use SUM() and divide it by the COUNT(), but it’s better to use the AVG() function.
Keep in mind that at this time these values are not being stored anywhere.
Grouping, Joining and Cleaning Up
If we were to remove our WHERE clause, it would average for all movies. Even if we selected the column for movie.id, it would still only return one row.
So, we can add GROUP BY at the end, then what we want to group the returned values by, in this case movie_id.
But just returning the id doesn’t make much sense, so we’ll use a INNER JOIN, which will return the subset of info that is common between the two.
But what if a movie doesn’t have any reviews yet? We can use a LEFT OUTER JOIN and see Tron has no reviews.
We can use IFNULL, which takes two arguments, the thing to check for, and what to put if it’s null.
To filter for results with an average above 3, use the keyword HAVING, then the condition. You use HAVING after grouping and aggregating, and since we used a GROUP BY, we’ll use that here. If we wanted to use a WHERE clause we’d have to do so before it the GROUP BY.
String Functions
Here we have our users table.
Note that the email column has some upperspace characters. We can use the LOWER function to lowercase all of them. UPPER makes thing uppercase.
LENGTH() gives you the length of a string. Lets find people with a username less than 19 characters.
CONCAT() lets you join strings together. If you want a space, make sure to add one.
The SUBSTRING() method takes three arguments, the string to truncate, the the character to start from, and the one to end from. Here, we shorten the emails. Note that in SQL, they don’t start with 0, but with 1. We can also use CONCAT to add some dots to the end.
There are lots of functions. Learn more about them in the documentation.
Securing and Maintaining a MySQL Database
Indexing Columns in MySQL
If you’re filtering over a certain column frequently, you should add an index to improve performance. It’s like having an address book for your data. Without one, the db would have to search through the whole column for a name, but with one it can make things much faster.
Keep in mind though that having an index means that when you write to a db, it also has to write to an index.
Let’s look through our users table for people with the last name Chalkley. When we run the command, the db is going through each row and checking if the last_name column equals Chalkley.
We can use the keyword EXPLAIN at the start of our statement to get some ideas on whats happening when we run this command. Under type, we see that it’s running against all rows, under rows we can see its running through all three rows, and under Extra we can see its using the WHERE clause.
We’ll make an index for last name to improve this search. Start with CREATE INDEX, then the name you want to give it, then ON, then the name of the table, with the name of the column you want it on in that table in parentheses.
Let’s look at our explained query again. We can see now that the type is ref, and the keys it’s looking on is the last_name_idx. For rows, it’s just going to one, and under Extra we can see that it’s using the index condition.
We’ll now add a new user to the database with the same last name.
If we explain our search again, we’ll see that it now brings up two rows, so Henry was added to the index.
If we make it more specific and search for a first name as well, we can see that it still searches for 2 rows based on the index, but is also using the WHERE clause.
Setting up Multiple Users
It’s good to think about what permissions you want to give users, to keep them from messing with your database in ways you don’t want them to. To make a user, start with the GRANT keyword, then the permissions you want to give them. The first user will only have read permissions, which we’d write as SELECT. Then we’d write ON, then the name of the db we’re referring to. Following the db name with a .* means it includes all the tables in the db.
Then, you write TO, then the name of the user, then @’%’, which means the user is at a particular host name address. So, a user with this username and the password we specify can connect. If this were for a certain website, you may want to instead specify your web server. Finally, you write IDENTIFIED BY, then the password in single quotes.
To be able to connect as user1 though, you must first do FLUSH PRIVILEGES, which resets and reloads all the privileges of all the users for the database.
Now, at this point you’re supposed to be able to create a new connection for that user, using their username and password, but this didn’t work for me at all and unsurprisingly treehouse provided no help.
Signed in as the user, you’ll find you can SELECT * FROM to view tables, but cannot edit them in any way. You’ll get an error if you try.
To make a user with read and write permissions, you’d simply add more keywords – INSERT, UPDATE and DELETE. Make sure to flush the privileges when you’re done.
If you create a connection for this user and use it, you’ll find you have access to all the DML CRUD operations. However, if you try to drop a table, a DDL command, it will be denied.
To create a user with DDL permissions, use the keywords ALTER, CREATE, and DROP.
Backing Up and Restoring Databases
If you load up your connection and choose management on the left hand nav, you’ll see the Data Export option. You can choose what db you want to back up, and even which tables within it. You can then choose whether to export into separate files, or all into one. Choose your file path and make sure you use a .sql extension, and click Start Export.
If there’s a discrepancy between versions, you may get an error. To fix this, go to Preferences > Admin > Path to MySQL dump tool > /usr/local/mysql/bin/mysqldump.
So, keep in mind that the GUI interface is just running commands for you. If you typed it yourself in a command line, it would start with the path to the dump, then some parameters for the hostname, user and path to the db.
Once you have the file, you can open it with a text editor to view it.
Notice it starts by creating the db, if it doesn’t already exist, then the next line is to USE it. Then it creates the tables, first by dropping them if there’s already one with that name, then creating one and inserting the values. If you wanted, you could modify this and rename the db to something like treehouse_movie_db_clone.
To restore a db, select the Data Import/Restore option from the left nav > Import from Self-Contained file > choose path to file. Under Default Target Schema you can choose which one you’d like to use for the queries to be run in. However, this is only used if the dump file doesn’t contain its schema (our first two lines to create and use), and is otherwise ignored. So, it will actually be ignored here. Click Start Import > Ok.
Now after refreshing our schemas, we see our clone. If we hadn’t changed the name, it would have overwritten our previous db.