Database Queries to One Table (Part 2)
This is an excerpt from the Mondial database that is widely used for practice exercises in this article.
|Table 12: Extract from the MONDIAL Database|
|France||F||Paris||Ile de France||547030||58317450||Europe||100|
|United States||USA||Washington DC||Distr. Columbia||9372610||266476278||America||100|
Sort Query Output (ORDER BY)
Typically, the results of an SQL query are output in the order in which they are read from the database. However, SQL also offers the option of sorting the result set of a query according to defined criteria. The keyword ORDER BY is used for this purpose. On the one hand, presorted results are easier for the user to process—on the other hand, when using DBMS sorting functions, the implementation of a separate sorting function within the application can be dispensed with if necessary.
SELECT Name FROM Country WHERE Continent=’America’ ORDER BY Population DESC;
The keyword ORDER BY initiates the sorting of the query result. In our example, the list of country names is sorted according to the number of inhabitants so that the country with the most inhabitants is named first, followed by all other countries in descending order.
As a selection condition of WHERE, the value of the Continent attribute is checked for equality with the character string America. Only records for which the selection condition is fulfilled are included in the result.
When ORDER BY is used, the sort sequence is specified in addition to the column used for sorting. Table 18 shows the sequence indicators that can be used there.
|Table 18: Sequence Indicator|
|ASC||Ascending; A–Z or 0–9|
|DESC||Descending; Z–A and 9–0, respectively|
|– – (not specified)||Default value: Ascending|
Structure of ORDER BY
The ORDER BY keyword can be used as follows:
Sort by a column, ascending, or descending.
ORDER BY column1 sequence;
ORDER BY Population DESC;
Sort by several columns. The first column specified is the first criterion, the second column the second criterion, and so on. All criteria are sorted in the same order, i.e., ascending or descending.
ORDER BY column1, column2, ..., columnN sequence;
ORDER BY Continent, name ASC;
Sort by several columns. For each column, the system explicitly specifies whether it is sorted in ascending or descending order.
ORDER BY column1 sequence1, column2 sequence2, ..., column order;
ORDER BY Continent ASC, Percentage DESC, Name ASC;
Queries with Group Formation (GROUP BY)
The GROUP BY keyword is used to group data. Groupings are usually used together with evaluation functions in SQL statements. In this way, the individual data records can be grouped in the dataset and then individually evaluated for each group. Thus, for example, it is possible to directly query the database about how many customers bought a certain product or how high the average turnover per postal code is.
The following example illustrates the use of GROUP BY. For each continent, the number of countries stored in the database is searched for.
SELECT Continent, COUNT(*) FROM Country GROUP BY Continent;
As many groups are formed for the relation Country as there are different values of Continent. So here, every record with Asia as a value for Continent is sorted into one group, every record with Africa into another group, and so on. For the example table, this grouping by continent resulted in five different groups. The aggregate function COUNT(*) is then used to count the number of data records in each group. The data records of the result set are collected from the grouping criterion (Continent) and the result of the aggregate function executed for each group (COUNT(*)).
Aggregate functions are used to aggregate several values into one value—that is, to summarize or combine them. With these functions, calculations can be carried out very efficiently on the stored data records in the database. Table 19 explains the most important aggregate functions in SQL. These functions can also be used without GROUP BY, in which case they will refer to the entire table or to the result set restricted by WHERE.
Naming Columns in the Query Result with AS
The AS keyword can be used to rename the column names of the result table. This is particularly helpful when using aggregate functions. Examples are shown together with the aggregate functions in Table.
|Table 19: Important Aggregate Functions in SQL|
|COUNT(*)||Counts the number of data records in a group. The example includes an operation together with GROUP BY and an operation without GROUP BY.||SELECT Continent, COUNT(*) ASNumberofCountriesPerContinentFROM Country GROUP BY Continent;SELECT COUNT(*) ASnumber of countries in the worldFROM Country|
|SUM(column)||Totals the values of a column within a group. In the example, the area of all countries of a continent.||SELECT Continent, SUM(Area) ASAreaofContinentsFROM Country GROUP BY Continent;|
|AVG(column)||Calculates the average value of a column within a group.||SELECT Continent, AVG(Area) ASAverageAreaPerCountryFROM Country GROUP BY Continent;|
|MAX(column)||Calculates the maximum value of a column within a group; can be used for numbers, strings, and dates.||SELECT Continent, MAX(Area) ASSizeOfTheLargestCountryFROM Country GROUP BY Continent;|
|MIN(column)||Calculates the minimum value of a column within a group; can be used for numbers, strings, and dates.||SELECT Continent, MIN(Area) ASSizeOfTheSmallestCountryFROM Country GROUP BY Continent;|
Restriction of Groups with HAVING
With the keyword HAVING, it is possible to restrict the output of the grouped results. While WHERE refers to the selection of data records before grouping, HAVING restricts groupings that have already been created. If, for example, you want to determine all articles in an online shop that have been sold more than 500 times, the data records of your sales database are first grouped according to article number, then the number per group is determined, and finally only those groups whose number is greater than 500 are output. The following example illustrates the use of HAVING using the country relation.
We are looking for a list of continents with at least five countries that are completely located on one continent.
SELECT Continent FROM Country WHERE Percentage=100 GROUP BY Continent HAVING COUNT(*)>=5;
In the Country relation, all countries whose Percentage attribute value equals 100 are taken into account first. These records are grouped by the attribute values in the Continent field. The HAVING now refers to the grouping: only those groupings that contain more than five data records are output. The result table contains only the Continent column.