Difference between revisions of "Portable SQL/Tips/Identifier Naming"

Latest revision as of 00:39, 19 December 2010

Table prefixes

Many problems arise from the thinking that your tables will be the only ones in the database in every installation worldwide. Unfortunately, this isn't the case and often applications need to be combined with others to create an improved user experience. To be able to access tables from both applications at once and to perform SELECTs on them to retrieve combined data, both applications have to use the same database and therefore have to share the same namespace.

Often, this is the point where conflicts arise. For example, a shop application uses a "categories" table as well as the forum application and both tables have incompatible definitions. In this cases it doesn't work that they share the same user table for login by simply adapting the login mechanism of the forum. These kind of problems often arise with third party products if you don't use prefixes for your tables. Therefore, always use a short string in front of the table name and separate both by an underscore.

Good examples are:

mwiki_bookmarks
shop_products
lw_hosts

The prefix shouldn't be longer than 4-6 characters and it's best to use a short form of your application name.

Column prefixes

In almost all tables contain a "id" column to have a unique reference for further operations. If they are all named short and descriptive "id", you will have to take special care into each query which joins two of these queries. It's important to use the table name or an alias of it each time you reference such an "id" column. Otherwise, the database server conplains about the ambiguous identifier names.

However, there's a much bigger problem. Database libraries and programming languages can provide database records in associative arrays and most of us appreciate and use this feature because it's so convenient. Now things can get ugly. You have to use aliases all the time if you join two tables with columns of the same name:

SELECT
  so.id AS orderid, so.*,
  sp.id AS productid, sp.*
FROM shop_order
JOIN shop_product ON so.productid = sp.id

If you forgot about this and simply select all columns from both tables, you will be struggling with with hard to detect errors. This is because the ID value of the product table will overwrite the value of the order table and you won't get noticed about that behavior.

Also, there's another problem lurking around. The SQL syntax is evolving and the standard group as well as the database vendors extend it constantly. Thus, there will be more reserved words in the future and if you have bad luck, one of the column names you've used will be part of the SQL syntax in the future.

To avoid this, you can prefix the column names by a short string for each table so they will be unique across the database. Prefixes with two letters and an additional underscore are optimal but often you need more to provide unique prefixes. They should be, however, not longer than six characters to keep them readable.

Good examples are:

so_id
sp_id
sop_orderid

It's not a good idea to use the complete table name in front of each column name because the names tend to be extremely long in this case. Using short column prefixes, your query can be now much shorter and more elegant:

SELECT *
FROM shop_order
JOIN shop_product ON so_productid = sp_id

This is one of the more controversial discussed suggestions because there's also a disadvantage. If you ever have to rename a table, you should also rename all the column names to prevent inconsistencies. This can be tedious if you have to adapt this in your code as well. So please keep care of a good table naming.

Also note that if you join a table with itself and retrieve columns with the same name from both table references, you have to use aliases again. Otherwise, your key to access the value in the associated array isn't unique any more and it's likely to get wrong results.

Short and descriptive

The identifiers you use should help other users to understand what's the content of the table or column. It should describe the content as short as possible without sacrificing the descriptiveness. It's important that the name explains the purpose. I prefer column names in singular and table names in plural form as a column stores a single value but tables store multiple rows.

Table examples:

mwiki_bookmarks
shop_products
lw_hosts

Column examples:

bm_name
sp_name
lwh_name

If you follow this suggestions, your database tables will look like this one:

CREATE TABLE shop_products (
  sp_id INTEGER NOT NULL,
  sp_code VARCHAR(16) NOT NULL,
  sp_name VARCHAR(60) NOT NULL,
  sp_amount INTEGER NOT NULL
)

Character case

Identifiers can be in lower case, upper case or mixed case (e.g. CamelCase) and it depends on the database implementation and sometimes the operating system how they are stored (e.g. in MySQL). Some implementations convert all names to lower case, some to upper case. Preserving the character case is only guaranteed by using quotes ("" in SQL standard compliant implementations).

Therefore, the best way is to use all lower case characters without quoting. This does work in all implementations and you avoid problems with different quoting characters required by the databases. Setting every connection to ANSI mode first instead can be a tedious task if it isn't done automatically like in the OpenDBX library

Foreign keys

Foreign keys are columns referencing unique or primary keys in associated tables. A good way to show their inheritance is to concatenate the main part of the associated table name in singular form with 'id' and prefix it according to the rule above:

CREATE TABLE shop_categories (
  sc_id INTEGER NOT NULL PRIMARY KEY,
  sc_name VARCHAR(60) NOT NULL
)

CREATE TABLE shop_products (
  sp_id INTEGER NOT NULL PRIMARY KEY,
  sp_categoryid INTEGER NOT NULL,
  sp_name VARCHAR(60) NOT NULL
)

Thus, it's easy for everybody to see that sp_categoryid is related to shop_categories.id even without looking into the documentation or the entity relationship model.

Difference between revisions of "Portable SQL/Tips/Identifier Naming"

Latest revision as of 00:39, 19 December 2010

Contents

Table prefixes

Column prefixes

Short and descriptive

Character case

Foreign keys

Navigation menu

Personal tools

Namespaces

Variants

Views

Actions

Search

Navigation

Projects

Toolbox

@@ Line 46: / Line 46: @@
 This is one of the more controversial discussed suggestions because there's also a disadvantage. If you ever have to rename a table, you should also rename all the column names to prevent inconsistencies. This can be tedious if you have to adapt this in your code as well. So please keep care of a good table naming.
+Also note that if you join a table with itself and retrieve columns with the same name from both table references, you have to use aliases again. Otherwise, your key to access the value in the associated array isn't unique any more and it's likely to get wrong results.
 == Short and descriptive ==
@@ Line 72: / Line 74: @@
 == Character case ==
-Identifiers can be in lower case, upper case or mixed case (e.g. CamelCase) and it depends on the database implementation and sometimes the operating system how they are stored.
+Identifiers can be in lower case, upper case or mixed case (e.g. CamelCase) and it depends on the database implementation and sometimes the operating system how they are stored (e.g. in MySQL). Some implementations convert all names to lower case, some to upper case. Preserving the character case is only guaranteed by using quotes ("" in SQL standard compliant implementations).
+Therefore, the best way is to use all lower case characters without quoting. This does work in all implementations and you avoid problems with different quoting characters required by the databases. Setting every connection to ANSI mode first instead can be a tedious task if it isn't done automatically like in the [[OpenDBX]] library
+== Foreign keys ==
+Foreign keys are columns referencing unique or primary keys in associated tables. A good way to show their inheritance is to concatenate the main part of the associated table name in singular form with 'id' and prefix it according to the rule above:
+ CREATE TABLE shop_categories (
+   sc_id INTEGER NOT NULL PRIMARY KEY,
+   sc_name VARCHAR(60) NOT NULL
+ )
+ CREATE TABLE shop_products (
+   sp_id INTEGER NOT NULL PRIMARY KEY,
+   sp_categoryid INTEGER NOT NULL,
+   sp_name VARCHAR(60) NOT NULL
+ )
+Thus, it's easy for everybody to see that sp_categoryid is related to shop_categories.id even without looking into the documentation or the entity relationship model.