Difference between revisions of "Portable SQL/Common/Quoting"

From Linuxnetworks
Jump to: navigation, search
(header hierarchy)
(No difference)

Revision as of 16:26, 17 May 2007

Quoting identifiers

Looking for good identifiers for tables, columns or views can often be a serious pain when your application should work with different database servers. There are a lot of reserved words which can't be used and each database vendor constantly extends his list. How can an application developer be sure that there will be no name clashes in the future? Similarly, databases tend to change the character case of identifiers. Some of them convert all to upper case, some to lower case while many preserve the case. If you want to access the value of a column by using the name instead of the position, things get really tricky.

To avoid both situations, the OpenDBX library forces all databases to operate in ANSI standard mode. Thus, they all support double quotes (") for quoting identifiers, e.g.

CREATE TABLE "order" ( "id" INTEGER, "limit" DECIMAL(10,2) )

SELECT o."limit" FROM "order" o WHERE o."id" = 1

Quoted identifiers can't clash anymore with reserved words (like "order" with ORDER BY) when they are quoted and they always keep their case. The disadvantage is that using them unquoted won't work any more in most cases.

The second possibility is to use unquoted names which are unlikely to get in conflict with reserved SQL words like so_limit

CREATE TABLE shop_order ( so_id INTEGER, so_limit DECIMAL(10,2) )

SELECT so.so_limit FROM shop_order so WHERE so.so_id = 1

This also works if other applications will access the tables which aren't using the OpenDBX library and don't adjust the connections to ANSI mode if necessary.

Quoting values

Quoting string values which should be used inside an SQL statement is essential. But what about numerical values like integers, decimal and floating point values? Many database server support this but MS SQL Server and Sybase ASE don't. Single quotes around numerical values can also cause a performance hit (exceptions seems to be older MySQL servers) and therefore they should be avoided. The correct usage is:

SELECT * FROM table WHERE intcol=1 AND strcol='test'

To avoid being vulnerable to SQL injection attacks, special care must be taken for numerical values from untrusted input sources. Untrusted input is everything that you haven't hard-coded into your application which includes values fetched from the database. Passing them to the escape function of the database library doesn't prevent attacks because as long as no single quote appears somewhere in the value provided by the attacker, the statement will still be valid. If e.g. "1 OR intcol=2" is provided as input for the intcol value of the statement above, this will result in

SELECT id FROM table WHERE intcol=1 OR intcol=2 AND strcol='test'

The statement will be executed by the database server correctly but the meaning has totally changed. Therefore, it's important to do some kind of type checking for the input values if they are provided as character strings. The easiest way is to use the strtol(), strtoll() or strtod() functions provided by the C library:

char* temp;
strtol( input, &temp, 10 );
if( *temp != 0 ) { /* attack detected */ }

If the temp pointer doesn't point to the zero character of the end of the input string afterwards, then the application is sure that the input can't be used without risking serious harm.



Back to Overview