|
[code:1]One of the most elementary statements in MySQL is the SELECT statement, used for querying a database. And yet, I regularly see questions posted on Usenet forums from people looking for assistance with some aspect of it. Since this is one of the first articles in this MySQL series, exploring the possibilities of SELECT should prove useful. I'll briefly cover the basics and then quickly move to intricate ways to employ this essential command. I will assume that the reader has MySQL installed, knows how to log in, and has at least one database set up for testing. If not, the beginning of my last article on MySQL covers creating a database and tables. For the examples here, I will use a database of a fictitious bookstore.
Basic Selection
For good form, let's quickly get basic SELECT statements out of the way. My bookstore database has a primary table called, books. This table has several fields or columns: item_no, isbn, title, author, publisher, pub_date, and lit_form. To query the books table for a list of all records with all columns, one would enter this statement into the mysql client:
SELECT * FROM books;
Although MySQL is fairly case insensitive, by convention all MySQL keywords are presented in upper-case letters and by my convention table and column names are in lower-case.
To get a list of only a few and not all columns, the asterisk above is replaced with a comma separated list of columns:
SELECT item_id, title, author FROM books;
If a more aesthetically pleasing display is preferred, then aliases can be used to relabel column headings. After each column name for which the output heading is to be changed, AS and the new heading in quotes is inserted:
SELECT title AS 'Book Title',
author AS 'Author' FROM books;
Notice that this SQL statement is broken up over two lines. MySQL will wait for a semi-colon followed by [Enter] before acting on a statement.
To make the output more presentable, there is the CONCAT function to concatenate text and columns together:
SELECT CONCAT('"', title, '" by ', author)
FROM books;
This SQL statement would display data like this: "The Dubliners" by James Joyce.
Setting Limits
In addition to limiting columns, one can also limit the number of records or rows displayed with the LIMIT clause. To see only a sampling of five records, this statement would do the trick:
SELECT item_no, title, author
FROM books LIMIT 5;
Suppose that one wants to get the next eight rows, but not repeat the first five (i.e., rows six through thirteen). MySQL would need the number of rows to skip (5) and the number to select (8):
SELECT item_no, title, author
FROM books LIMIT 5,8;
Bringing about Order
Being able to sort data in a more meaningful way than order of entry is usually preferred. Hence, the ORDER BY clause will sort rows by a named column:
SELECT item_no, title, author
FROM books
ORDER BY author LIMIT 5;
Incidentally, all of the SQL statements presented are in a specific order; the syntax is not arbitrary on my part. One can add or leave out some components, but not mix them in any order.
To select a limited number of rows starting from the end of the table ordered by a column, one would need to include DESC to specify descending order. Ascending is noted by ASC, but it's usually not necessary to specify it since it's assumed.
SELECT item_no, title, author
FROM books
ORDER BY author DESC
LIMIT 5;
In my database I get five titles by William Shakespeare since his name starts with "W". I probably should store author information by last name first or put the last names in separate columns. Nevertheless, the results are not in alphabetical order by title and they're not the last five titles in reverse order. MySQL selected all of the rows of the last author listed alphabetically to determine a result set--if there had been less than five Shakespeare titles, it would have selected more authors to fulfill the limit. It then gave me a sub-set containing the first five records of the result set based on the order that they were found in the database. Below is the result of the statement above with a limit of ten rows, which is all my database has for the author:
+---------+--------------------------+---------------------+
| item_no | title | author |
+---------+--------------------------+---------------------+
| 1115 | Romeo & Juliet | William Shakespeare |
| 1116 | Henry V | William Shakespeare |
| 1117 | Hamlet | William Shakespeare |
| 1119 | King Lear | William Shakespeare |
| 1120 | MacBeth | William Shakespeare |
| 1121 | Anthony & Cleopatra | William Shakespeare |
| 1122 | Coriolanus | William Shakespeare |
| 1123 | Othello | William Shakespeare |
| 2113 | Love's Labour's Lost | William Shakespeare |
| 2114 | Measure for Measure | William Shakespeare |
+----------+-------------------------+---------------------+
These titles aren't in alphabetical order. With a limit of five rows, I get the first five rows above. So the previous statement needs another sort condition; the results follow:
SELECT item_no, title, author
FROM books
ORDER BY author DESC, title DESC
LIMIT 5;
+--------+---------------------+---------------------+
|item_no | title | author |
+--------+---------------------+---------------------+
| 1121 | Anthony & Cleopatra | William Shakespeare |
| 1122 | Coriolanus | William Shakespeare |
| 1117 | Hamlet | William Shakespeare |
| 1116 | Henry V | William Shakespeare |
| 1119 | King Lear | William Shakespeare |
+--------+---------------------+---------------------+
Getting Specific
So far all we've done is browse data. To select records that match specific conditions, the WHERE clause is required. For example, to get a list of all books written by Henry James, one would use this statement:
SELECT isbn, title, pub_date
FROM books
WHERE author='Henry James';
To get a list of Anton Chekov plays, but not his novels or short stories, a compound WHERE clause would be used:
SELECT isbn, title, pub_date
FROM books
WHERE author='Anton Chekov'
AND lit_form='play';
To get all of the Chekov titles but the plays, the <> or the != operative could be used:
SELECT isbn, title, pub_date
FROM books
WHERE author='Anton Chekov'
AND lit_form!='play';
In the books table there are rows with null values in the isbn column. To exclude these records, another condition to the WHERE clause is needed:
SELECT isbn, title, publisher
FROM books
WHERE author='Italo Calvino'
AND isbn IS NOT NULL;
To list only rows with an isbn that is null, IS NULL would be used. If one wants to keep rows without an ISBN, an IF function can be wrapped around the isbn selection to display something besides NULL:
SELECT title, author,
IF(isbn IS NULL, 'No ISBN', isbn)
FROM books
WHERE author='P.G. Wodehouse';
Within the parenthesis of the IF function, the test is given first, then the outcome if it fails, followed by the outcome if it succeeds. One could trim this IF function down to: IFNULL(isbn, 'No ISBN'). Here, the test is implied by the keyword IFNULL. The column to test is given first--it will be displayed if it succeeds. The next element will be shown if it fails.
If one wants a list of playwrights only, the clause WHERE lit_form='play' could be used. However, some authors will be listed many times--once for each play. The DISTINCT caveat can limit output to one record per author:
SELECT DISTINCT author
FROM books
WHERE lit_form='play';
To know how many plays the database has for each author, one would use the COUNT function. However, DISTINCT won't work with it since DISTINCT would only take note of each distinctive occurrence--actually, it will cause an error. Instead the GROUP BY clause is required:
SELECT author, COUNT(title)
FROM books
WHERE lit_form='play'
GROUP BY author;
This prints one line for each author and a count of the number of titles for each author.
Searching
Sometimes users will know the core name of a book, but not its complete name. In my database, I have a record for The Epic of Gilgamesh. However, if I didn't know the whole title and were to query the database with just Gilgamesh, nothing would be found. If I were to browse alphabetically based on title and visually scan in the G's, I wouldn't see it because it would be in the T's. I couldn't search on author, because the author is anonymous and the book's authorship is attributed to the translator. For this situation, I would use the LIKE expression:
SELECT title, author, publisher
FROM books
WHERE title LIKE '%Gilgamesh%';
Notice here that a wildcard (%) is used at the beginning and again at the end of the value Gilgamesh. This way, any title with Gilgamesh in it will match. As a result, the books The Epic of Gilgamesh and Mesopotamia: Gilgamesh & Other Babylonian Tales will both be displayed.
Select Relations
Since MySQL is a relational database, relationships can be formed between tables when querying a database to get merged results. In the bookstore database, suppose there's another table called inventory with the columns item_no, quantity, and location. This item_no column can match the item_no in the books table. Unlike books, in inventory there are duplicate item numbers. This is because location contains the warehouse or store identification number where books are located. Since some are carried in multiple locations, there are multiple records in inventory. To list all of the Thomas Hardy books and their quantities and locations, this statement would be entered:
SELECT books.item_no, title,
location, quantity
FROM books, inventory
WHERE author LIKE '%Hardy%'
AND books.item_no=inventory.item_no;
Notice that both tables are named in the FROM clause, separated by a comma. Also, the linking point (i.e., item_no) between the two tables is given. Although a human would understand simply saying, item_no=item_no, MySQL would find that to be ambiguous. So, the table name followed by period, followed by the column name is required for each connecting point. This format is also used to stipulate which item_no is to be selected.
The query above will only show data for matching records. In the books table there are some titles that don't have a matching record in the inventory table (i.e., there's none in inventory). Suppose there is going to be a Samuel Becket festival in the area. To prepare, the store's buyer wants a list of all Becket plays and their quantities at each location to decide on what he needs to order. He also wants titles for which there is no inventory. The LEFT JOIN operator of the FROM clause will work. The statement below will list at least one record for each now found in books, regardless of whether there is a matching record in inventory:
SELECT title,
IFNULL(location, '') AS location,
IFNULL(quantity, 0) AS quantity
FROM books
LEFT JOIN inventory
ON books.item_no=inventory.item_no
WHERE author LIKE '%Becket'
AND lit_form='play';
Notice that the FROM format has changed. Also, the linking point is now part of the FROM clause and not the WHERE clause.
Let's add another table to our scenario: A couple of the stores are located next to small colleges that don't have campus bookstores. Instead, these colleges have arrangements with our neighboring bookstores to carry their text books. Text books require different information to be tracked. Therefore, there is a separate table for text books (i.e., text_books). Suppose the buyer wants to get a listing of all C.S. Lewis books from both books and text_books. C.S. Lewis has written popular fiction books, as well as literary criticisms and other texts used in college courses. To accomplish the request, UNION (available starting in MySQL version 4.0.) could be used to combine the result sets of two SELECT statements:
SELECT item_no, title, course_no
FROM text_books
WHERE author='C.S. Lewis'
UNION
SELECT item_no, title, 'Non-Text Book'
FROM books
WHERE author='C.S. Lewis'
ORDER BY title;
Since the books table doesn't have a course_no column, alternative text is provided to plug-in the column and to indicate why it's not applicable. The column counts have to equal. Also, the ORDER clause can only refer to a column named in the first SELECT statement. Basically, the combined result set will be sorted based on the column named.
Selective Reproduction
Sometimes one needs to populate a table from another table. This can occur when a developer realizes that a table should be split into two separate tables. This also can happen when one has a temporary table containing imported data and only some columns need to be disbursed into existing tables. For situations like these, an INSERT INTO...SELECT statement works rather nicely.
Suppose a publisher has given the bookstore a file listing books that they've recently published. Importing from an external file is beyond this article. Therefore, assuming it was successfully done, the temporary table has some columns to be copied over into the books table. The SQL statement below will make quick work of extracting the data:
INSERT INTO books
(isbn, title, author, lit_form)
SELECT ISBN_Nbr, Title,
CONCAT(Author_First, ' ', Author_Last), GENRE
FROM tmp_newbooks;
Notice that the SELECT statement overlaps the INSERT statement to accomplish the task.
Conclusion
The possibilities for SELECT are pretty extensive. I do a good bit of Perl programming with MySQL. In so doing, I have found that better SQL statements can lead to tighter Perl code. I've eliminated many lines of Perl code in scripts by just being more selective with MySQL statements. Learning all that can be done with SELECT is the first step to more powerful use of MySQL.
[/code:1] |
|