Introduction to Data, Databases and SQL
Introduction to Data, Databases and SQL
A database is something that lets you store data, and then create, read, update and delete that data. It acts as the memory for a website. Ruby and other languages can abstract the finer details, but this course we’ll cover them as they are often the key to solving problems and speeding up performance. There are two main components to databases. The first is schema, which is the organization and relationships within the database. It is the different sections, and how they should store data. This is also known as modeling. A model is a plan/blueprint for building a database. This involves choosing what you’re going to store for each thing. If you were storing movies, you may choose the title and release year, and not include the director.
The second main component is data, whose formatting is described by the schema. There are different kinds of databases out there, but in this case we’ll be covering relational ones, which include MySQL, SQLite and Postgres, mainly MySQL. Databases contain collections of info called tables, which are spreadsheet like structures with rows (sometimes called tuples) and columns (which is a property of the entity you want to store). Depending on each database there is usually a naming convention for them.
To get data in and out of the database we use Structured Query Language (SQL). It is written in keywords and phrases to make a sentence, which is sometimes known as a statement. Here’s an example. select is a keyword, and the * means every column in the table. From is the clause to say where it should be retrieved from, and movies is the table.
Introduction to Data Types
Columns can have data types, which are like a type of formatting strictly enforced for that column. The three main kinds are strings, numeric and date/time types. There are two kinds of strings in MySQL. varchar contain short strings of characters, ideal for things like product names and movie titles. text are for longer strings like descriptions.
For numeric, there’s integers (whole numbers), fixed point (where the number of decimal places is fixed, like for currency), and float point (where the number of decimals can vary). How they are stored by the db (database) can vary though, if you put in 2.3 it may be stored as 2.2999.
For date and time types, which can contain a date, a time, or a combo of both, which is known as datetime. The types you choose are important, especially when it comes to sorting the columns later. Strings take up more space than integers, so it’d be silly to store a whole number in one.
A basic SQL statement to create a table would look like this:
The CREATE keyword tells the db what to do, and the TABLE keyword tells it what to create. Then, you put the table name (actors). Then, in the parentheses you have the column definitions, in this case we only have one. You put the column name, then the data type, and in this case we’re setting the character length with 50. To add more columns, simply separate them with commas.
Here’s a basic SQL statement for inserting some info into a table.
INSERT INTO says you want to add info to a table. The VALUES keyword sets up the values, which appear in quotes. Make sure to enter your info here in the same order as your columns.
Installing MySQL Server and MySQL Workbench
Introduction to MySQL Server and MySQL Workbench
The client server model is when you use a GUI interface or programming language to interacting with a database on a server. For this lesson, the client will be the MySQL workbench, and the server will be the MySQL community server.
Installing MySQL Server and MySQL Workbench on OSX
NOTE: TREEHOUSE’S LESSON ON THIS IS EXTREMELY OUT OF DATE, FOLLOW THE DIRECTIONS BELOW ALONG WITH THE VIDEO AND FOLLOW MINE WHERE THEY DIFFER.
1. Go to MySQL.com > downloads > community > MySQL community server > download the file for your version of OSX. There’s two options, compressed TAR and DMG, select DMG. > No thanks, just start my download.
2. Download and Install it.
3. Download the MySQL Workbench GUI tool under MySQL.com > downloads > community > MySQL Workbench > download the one for OSX. Note that in this lesson he’s using an older version of this so some things may look/work differently.
4. Download and install it. Under System Pref > MySQL > Start MySQL Server
5. MySQL requires a special config file in a folder called ETC. Open Terminal > enter this: sudo cp /usr/local/mysql/support-files/my-default.cnf /etc/my.cnf
6. Open MySQL workbench. Click the little + next to MySQL Connection. For connection name use localhost, for connection method standard TCP/IP, and for hostname localhost. Port should be 3306 and username should be root. Click Configure Server Management in the lower left hand corner.
7. Click Continue. It should now test your connection. If you’re getting an error, make sure that under System Preferences > MySQL you have the server running. Click Continue.
8. For Operating System is should be MacOS X and for the MySQL Installation type it should be MacOS X(MySQL Package). Click Continue.
9. Now it can test the connection, and if it passes should look like this. Click Continue, then Continue again.
10. You can now start and stop the server by clicking on it’s panel in the main window.
11. If you click Server Status on the left you can confirm its running.
Running a Script and Using MySQL Workbench
He gave us a file to download. Open the connection then go to File > Open SQL Script > select it > Open. You should now see the scripts for the db. Click the lightning at the top to run the script. You’ll see a series of action outputs at the bottom, if there are green check marks it’s all good.
On the left, you’ll now see a section called schemas. Right click > Refresh all and you should now see one called treehouse_movie_db. In there, open tables to see the names of each, each of which can be expanded to show the columns. So, this GUI lets us see a graphical representation of the db. Click on the treehouse_movie_db to select it, and it will turn bold.
Close the movie_db import script tab, and in the query 1 tab type SELECT * FROM movies and you’ll see it returns the data. If you type movie you’ll get an error because that table doesn’t exist.
Reading Data from Databases with SQL
Introduction to CRUD
CRUD stands for the four operations you’ll want to perform in the db – Create Read Update and Delete. Each has a SQL keyword associated with it. For Read, it’s SELECT.
Retrieving a Result Set
NOTE: the shortcut for execute is Command + Return. Before we had SELECT * FROM movies; for all movies. But, you can also write SELECT movies.title, movies.year FROM movies;.
You can change the order of the columns returns by switching the column names. You can also write it without the movies before each column name as comma separated values. Note you don’t need a period here.
If you just want data from only one column, write just that columns name.
A clause lets you filter a result set. The where clause makes it so it only returns the data matching that condition. You right it at the end. Here, we test for movies where the year column equals 1999. Note that = is not an assignment operator here, but rather a comparison one.
Use != to find the ones that aren’t from 1999.
You can also use >, <, >=, <= as well. You can combine conditions using AND and OR. Here, we look for movies from 1999 with the title the matrix.
To search between values, write the column name, then BETWEEN, then the number to start, the keyword AND, and then the ending number.
You can search a string using the keyword LIKE with a combo of wildcards. If we were to search for “godfather” we’d get no results since the movie starts with “the”. The wildcard is a percent sign. Placing that before the g makes it wildcard, so anything that starts with something and then has godfather in it will come up. Note that LIKE is case insensitive.
If we place a wildcard at the end of it as well, then it will bring up any godfather matches that also have something after them.
Ordering the Result Set
Use the keyword combo ORDER BY to order results. You add it at the end, then what you want to order things by. See that is gives you an ascending order, and that null values come at the top.
To have them descend, ass the keyword DESC at the end. If you needed to explicitly ascend them, add the keyword ASC to the end.
You can chain ordering together by comma separating them. The first one listed is what the table sorts by, then the second sorts within the first column specified. So, here we have the table sorting by date, then for each year the titles sorting descending.
Limiting the Result Set
If you don’t need all the results from the table, you can use the LIMIT keyword followed by the number of elements you want returned.
To still pull in 10 but start later in your database, use OFFSET after that and a number, which is the number of rows you want to offset by.
If you type OFFSET 0, it will start from the first row, so like an array the database starts counting the rows at 0. So, if you wanted to start at the 11th row, you’d offset by ten. You can also write this with just LIMIT, then two comma separated numbers, the first being the offset, and the second being the limit number.
Note that if you don’t specify a limit, it will auto specify one for you at 0, 1000. You’ll find that some clients add one. If you were using a language like php or a command line tool though, these limits would not be there.
Dealing with NULL
NULL is a value used to indicate data isn’t present in the database. To find the movies where the year is null, you’d simply write WHERE year IS null; Note that you can’t use = here, you need to use the IS keyword.
We can combine this with our ORDER BY keyword to sort our movies and not include ones with a NULL year. In this case we’ll use IS NOT.
Manipulating Data with SQL
Manipulating Schema and Data
The SQL syntax is split into two parts: DDL (Data Definition Language), which has to do with the keywords that create and modify tables and databases, aka schema, and DML (Data Manipulation Language), which deals with the CRUD operations in a database, aka the data.
Creating a Database
Think of a database as a container that keeps groups of tables separate from other groups of tables. To make a new one, in your workbench click the little icon that looks like a cylinder cut into three pieces.
This makes a new schema, which is a database. Give it a name, and ignore default collection for now > Apply. Then, a screen will pop up showing you the keywords you’d use to do this by typing > apply > close > you’ll now see it on the left under schemas.
To select our new db from the schema list on the left, rather than clicking on it we’ll type USE movie_db_1;
Note that if we try to make a new db with the same name we’ll get an error. The documentation on creating databases gives us more info. First it has the keyword needed, CREATE, then in curly braces it has the database and schema separated by a pipe, which means or. The words in the square brackets are optional.
If we add the IF NOT EXISTS to our statement, we now get a warning rather than an error. This is good because when a script errors, it stops running, but with a warning it will continue on. Whether or not you’d want it to stop or not depends on the situation.
Note that in the documentation database and schema are interchangeable, so we can use that to make a db as well. Make sure to right click > refresh your schema list after to see it there.
You have the option to set your character set when making your db. To see the options, use the GUI to add a new schema, and select one from Default Collection. UTF-8 is the standard for the www. When creating it, it shows the nomenclature, which is DEFAULT CHARACTER SET utf-8;
However, the documentation says the default is optional, and that we can add an equals sign. This isn’t necessary, but shows we can modify it.
Creating a Table
Here’s how we’d make some tables. CREATE TABLE, then the name, then the columns, commas separated in parentheses.
The documentation on creating tables shows us how to do that. We have the option to make a temporary table, which is visible only in the current session, and is dropped automatically when closed. You can add IF NOT EXISTS like before. Then, you can create definitions, which would be the column name and its definition, which is its data type. You see that for the data type you have the option between not null, which means there has to be something set for it, or null, which means it can be empty. This would be good for our actor name, so we’ll add that in, and run the command to make the table.
Now we’ll create one for movies. We always want the title, and don’t always need the year, so we’ll add not null and null respectively. Note that if we didn’t specify not null or null, it would default to null, so we don’t need to write it for year.
Another way to make a table is to add table option, in this case ENGINE, which is a specific way to store data in the db. If we open a new tab, and type SHOW ENGINES, it will show us the engines available. InnoDB is the default and the one you’ll like want to use, but to specify one, simply add ENGINE then the name of the one you want at the end.
Inserting, Updating and Deleting Rows
So all of that was DDL, this will cover the DML, aka the CRUD stuff. Here’s a table of the keywords you need for each part.
Here we’ll insert a movie. Start with INSERT INTO, then the table you want to use, then VALUES, then in parentheses the values for the corresponding tables.
If you didn’t know the ordering of the columns, you could list the column names between the table name and VALUES, and the values would correspond to how you wrote them.
Another way to write this is to insert multiple rows at once, by adding a section set of values in a second set of parentheses.
You can also use the SET keyword, then the column name equal to the value, separated by commas for each.
If we SELECT * FROM movies, we can see the rows we’ve created.
To update, use the UPDATE keyword, then with SET and a clause to only do this for movies where the title equals Avatar 2 (if we didn’t do this it would set it for all movies). Note that for this to work you have to turn off safe updates under Preferences > SQL Editor > uncheck Safe Updates, or you can write SET SQL_SAFE_UPDATES = 0; in a new tab.
To update multiple values, simply comma separate them.
To delete, us the DELETE FROM keywords, and a WHERE clause to ensure you don’t delete everything in the table.