The relational database language SQL (Standard Query Language) is a standardised international language for relational database systems; it has found widespread use and is implemented in most database management systems. SQL can be used to modify data (UPDATE, INSERT, DELETE), manipulate database objects (CREATE, ALTER, DROP) and request information (SELECT). To ensure secure operation of a database, the following basic guidelines should be observed when making database queries:
SQL queries should be formulated as precisely as possible. This applies particularly to SQL queries generated from applications. For example, the SQL statement
SELECT * FROM
WHERE
inevitably leads to errors or even causes the related application to crash if the table scheme has been modified (addition or deletion of fields, or changes in the field sequence).
Fields should always be specified explicitly. This ensures that the data is made available in the awaited sequence, and that only the information which is actually required is selected, for instance.
Example:
A table consists of the following fields (with the related data types):
Article number NUMBER(10)
Net price NUMBER(10,2)
Article designation VARCHAR(30)
Intended use VARCHAR(200)
A new field titled "Order number" and having type NUMBER(8) is added. To optimise memory utilisation, the field is located at the second position of the table, not at the end. The new table then appears as follows:
Article number NUMBER(10)
Order number NUMBER(8)
Net price NUMBER(10,2)
Article designation VARCHAR(30)
Intended use VARCHAR(200)
In the worst case, a SELECT-* statement issued from an application would now lead to a crash, as the data is selected automatically in the specified field sequence. The example shown above not only poses the problem of the table having been extended by one field (whose data are selected additionally), but also of the field sequence having been modified, so that the data is no longer selected in the original order. This leads inevitably to type conflicts and, possibly, to program failure.
The sequence of the specified selection conditions is of great importance for restrictive database queries (WHERE clause). The WHERE clause should be formulated so that the first condition selects the smallest possible result set, whilst the last condition selects the largest result set. This optimises the performance of the database system through an efficient arrangement of the selection conditions, which notably accelerates search operations. The same applies to database queries formulated for several tables (so-called joins).
It must be noted here that database management systems often optimise database queries automatically. In fact, many database management systems offer several optimisation strategies which can be selected via various parameters. If a database management system does make use of such Optimisers however, it might be possible that carefully formulated database queries are internally not processed as expected.
In this respect, some database management systems allow the processing of database queries to be monitored (e.g. with EXPLAIN in Oracle or SETOEP in Ingres). It is also possible to use HINTS to explicitly define the processing of database queries, thus eliminating the need for optimisers. However, this option should only be used when absolutely necessary.
The optimisers supported by a database management system as well as their advantages and disadvantages are usually documented in the manuals accompanying the system. If several optimisers are available, the administrator should be requested to specify a suitable one for use.
In the case of Joins, it must also be noted that fields are allocated uniquely to tables.
Example:
TabA
ID
NUMBER(4)
Manufacturer#
NUMBER(6)
TabB
ID
NUMBER(4)
Article#
NUMBER(10)
Price
NUMBER(10,2)
Designation
VARCHAR(30)
SELECT TabA.ID, TabB.Designation, TabB.Price
FROM TabA, TabB
WHERE TabA.ID=TabB.ID
The "ID" field is present in both tables and must therefore be specified explicitly together with the corresponding table name in the database query. Otherwise the uniqueness of the selection is no longer ensured, and the database query is aborted with the issue of a corresponding error message.
All other fields can be allocated uniquely in this case. SQL does not require an explicit specification of the related table name for each field. Nevertheless, the individual fields should be allocated to the tables, as shown above in the "Price" and "Designation" fields of table TabB. Only in this manner can unforeseeable problems be avoided.
In the above example, the addition of a "Designation" field in TabA would not cause any problems. However, it would if the SQL statement did not contain an explicit allocation of the fields to the tables. In this case, it would no longer be possible to clearly determine whether the "Designation" field should be selected from TabA or TabB, as both tables would contain a field of this name following the modification to TabA. The SQL statement would then be aborted with the issue of a corresponding error message.
If views exist, they must be used for the formulation of database queries.
All database transactions must be confirmed explicitly with a COMMIT. If the database management system in use supports an automatic COMMIT, this feature should not be activated, as it might inadvertently lead to inconsistencies in the database.
Example: Several individual modifications are grouped together logically, and confirmed automatically by a COMMIT. If the transaction is now interrupted in an uncontrolled manner, thus leading to a rollback, the operations already completed are confirmed and remain in the database, while the remainder remains unexecuted.
To avoid locking conflicts or even deadlocks, a locking strategy should be specified for every application-specific database (e.g. hierarchical locking or explicit locking of all tables at the beginning of a transaction).
Application developers should check the error status after every SQL statement, to allow the application to respond as quickly as possible to the occurrence of errors.
If the database management system supports certain system-specific commands which permit the logging function to be deactivated or the locking procedure to be modified, for example, users should be denied the right to use such commands. Precise clarification is required in advance as to the system-specific settings / commands which may be changed / invoked by users and application developers.
During development of the applications, all database access operations should be grouped in one module or a particular section of the program code, otherwise the entire program code of the application system would need to be scrutinised in order to test the above-mentioned principles. Grouping the operations together facilitates the maintenance and updating of the application system, e.g. in the event of alterations to the data model.
Additional controls:
Have guidelines for database queries been prepared?
Are the guidelines for database queries known to the application developers?