SQL Best Practices

SQL Best Practices
COMMENTS ()
Tweet

Abstract

Standardizing the T-SQL code across an organization is essential to facilitating effective and efficient code reviews, troubleshooting T-SQL script, supporting joint development efforts and ensuring smooth hand-off of projects from one group to another. Standardization helps keep code quality, readability and performance in check, while making the code easy to maintain.
In this post, we will discuss the following key aspects of SQL Best Practices standardization:

  • SQL Formatting and why it”s so important.
  • Common mistake while designing SQL code.
  • How to write efficient, effective and tuned SQL queries.

SQL Formatting

Proper formatting is extremely important for code readability (code review, hand-off & joint development). It is just as vital for code migration between different Databases as well as code maintainability and troubleshooting.

Where should the focus be (decide your own standard and follow)

Text Case

One of the primary focus areas in SQL formatting is text case. Common areas of focus here and are keywords, data types, user-defined objects, etc. One must decide which components of the script / query will be in CAPS, and which will be in lower case. Should all keywords be uppercase, and data types all lower case, while user-defined objects are in camel case (compound words where the first letter of an identifier is lowercase and the first letter of each subsequent concatenated word is capitalized). Such as table names for example?

 

The following is an example of such a text case standardization.

 

IF OBJECT_ID (“ProductDocs”,“U”)IS NOT NULL

DROP TABLE ProductDocs

GO

CREATE TABLE ProductDocs

(

DocID intNOT NULL IDENTITY,

DocTitle nvarchar (50) NOT NULL,

DocFileName nvarchar (400) NOT NULL,

CONSTRAINT PK_ProductDocs_DocID PRIMARYKEYCLUSTERED (DocID ASC)

)

GO

Commas

The placement of commas is also very important in SQL as they can cause syntax errors or alternatively increase readability, depending on where they are used. Should commas be used to end the line or begin the line? Should spaces or tabs ever be used before or after commas? This is what needs to be decided. Regardless of whichever approach is used, consistency in their usage must be maintained throughout the code. For example:

 

SELECT FirstName

,MiddleName

,LastName

,City

,StateProvinceName

FROM HumanResources.vEmployee

WHERE JobTitle LIKE“Production Technician%”

ORDERBY StateProvinceName, City

 

This is by no means limited to just SELECT lists and ORDER BY clauses, commas can also be used in DDL statements, as depicted below:

 

IF OBJECT_ID (“ProductDocs”,“U”)IS NOT NULL

DROP TABLE ProductDocs

GO

CREATE TABLE ProductDocs

(

DocID int NOT NULL IDENTITY,

DocTitle nvarchar (50) NOT NULL,

DocFileName nvarchar (400) NOT NULL,

CONSTRAINT PK_ProductDocs_DocID PRIMARY KEY CLUSTERED (DocID ASC)

)

GO

Spacing and Alignment and linefeeds

Another formatting consideration is deciding when to use Space and linefeed. For example, should they be used after each column, for each Reserve Word or for each SQL clause, etc.? The practice of indentation must also be standardized, whether it”s from the left or right; indentation must be consistent throughout the code to enhance readability. A good example for this is the sample given below.

 

SELECT FirstName

,MiddleName

,LastName

,City

,StateProvinceName

FROM HumanResources.vEmployee

WHERE JobTitle LIKE “Production Technician%”

ORDERBY StateProvinceName, City

Comments

Comments are also very helpful for query readability and aid in understanding complex logic when reading thorough hundreds of lines of code. One must make it a point to use comments in queries.

 

A good approach, is to keep multiple line comments in /*….*/ and one line comment in –, as depicted in the code sample below.

 

/*

Retrieves employee data for Production Technicians

Orders data by state, then city

*/

SELECT FirstName

,MiddleName + LastName –retrieve full name

,City

,dbo.proper_case(StateProvinceName) — User Defined Function

FROM
HumanResources.vEmployee

— Pull Production Technicians only

WHERE JobTitle LIKE “Production Technician%”

ORDER BY StateProvinceName,
City –order by state, then city

/*

CONFIDENTIAL

*/

Aliases

Aliases should also be used as much as possible, as they are not only helpful in query maintainability and avoiding referencing conflict, but also for performance because when you prefix each object with an alias, the optimizer does not need extra cost to look into the data dictionary for that particular object”s reference.

 

SELECT

(cnt.FirstName + ” “ + cnt.LastName) FullName

, emp.LoginID

, emp.Title

FROM
HumanResources.Employee emp

INNER JOIN Person.Contact cnt

ON
emp.ContactID = cnt.ContactID

ORDER BY cnt.LastName

JOIN

It’s also a good practice to use ANSI standard joins instead of old style joins, as this enables you to migrate code easily between different databases. For example:

 

–old style join

SELECT a.Au_id,

t.Title

FROM TITLES t,

AUTHORS a,

TITLEAUTHOR ta

WHERE a.Au_id = ta.Au_id

AND ta.Title_id = t.Title_id

ANDt.Title LIKE %Computer%

–ANSI join syntax

SELECT a.Au_id,

t.Title

FROM AUTHORS a

INNERJOIN TITLEAUTHOR ta

ON a.Au_id = ta.Au_id

INNERJOIN TITLES t

ON ta.Title_id = t.Title_id

WHERE t.Title LIKE %Computer%

Common Mistakes When Writing SQL Scripts

Avoid SELECT *

Avoid using Select * in queries. Since the more data you choose to read from tables, the slower your query will become. Large data size increases the time it takes for disk operations. Also, in cases where the database server is separate from the web server, fetching large data chunks in queries will result in longer network delays, as the data has to be transferred between the servers. For example:

 

Use:

SELECT customer_id, last_name, first_name, street, city FROM customer

 

Rather than:

SELECT * FROM customer;

Sample MySQL Code

// do I have any users from Alabama?

// what NOT to do:

$r = mysql_query(“SELECT
* FROM user WHERE state = “Alabama””);

if (mysql_num_rows($r) >
0) {

// …

}

// much better:

$r = mysql_query(“SELECT
1 FROM user WHERE state = “Alabama” LIMIT 1″);

if (mysql_num_rows($r) >
0) {

// …

}

Use UNION ALL Instead Of UNION unless really needed:

UNIONs perform a DISTINCT operation which requires extra cost. This must be avoided in order to facilitate faster query execution. For example:

 

Use:

SELECT employeeID, firstname, lastname

FROM names WHERE
dept = “prod”

UNION ALL

SELECT employeeID, firstname, IF EXISTS
(SELECT org_idastname

FROM names WHERE
city = “Orlando”

 

Rather than:

SELECT employeeID, firstname, lastname
FROM names WHERE
dept = “prod”

UNION

SELECT employeeID,
firstname, lastname FROM
names WHERE city = “Orlando”

Use EXISTS over COUNT (*)

Another best practice is to use Exists when you want to check the existence of records, rather than COUNT (*). For example:

 

Use:

IF EXISTS(SELECT org_id

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID = 1)

PRINT “Record Exists”

 

Rather than:

DECLARE @n INT

SELECT @n= Count(org_id)

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID = 1

IF @n > 0

PRINT “Record Exists”

Use Table Aliases

Always use table aliases in column names, in order to avoid situations where the optimizer has to search for columns in the data dictionary. Using this approach will result in faster query execution. For example:

 

Use:

SELECT

(cnt.FirstName
+ ” “ + cnt.LastName) AS FullName,

emp.LoginID,

emp.Title

FROM HumanResources.Employee AS emp

INNERJOIN Person.ContactAS cnt

ON emp.ContactID = cnt.Contact_ID

 

Rather than:

SELECT

(FirstName + ” “ + LastName) AS FullName,

LoginID,

Title

FROM HumanResources.Employee

INNER JOIN Person.Contact

ON ContactID = Contact_ID

BETWEEN vs. IN

Using the BETWEEN function is much more efficient than using IN. Therefore when searching for records in sequence, it”s always preferable to use BETWEEN rather than IN, as depicted below.

 

Use:

SELECT org.*

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID
BETWEEN 12654 AND 12657

 

Rather than:

SELECT org.*

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID
IN(12654,12655,12656,12657)

Do not use column numbers in the ORDER BY clause

Using column numbers in Order By clauses incurs additional CPU cost, so it”s better not to use them in conjunction. The following is a good example of this scenario:

 

Use:

SELECT OrderID, OrderDate

FROM Orders

ORDER BY
OrderDate

 

Rather than:

SELECT OrderID, OrderDate

FROM Orders

ORDER BY 2

AVOID DISTINCT For Unique Rows

If you are already working on unique rows don”t use the DISTINCT keyword, as this too incurs additional CPU cost. For example:

 

Use:

SELECT d.dept_id, d.dept

FROM dept;

 

Rather than:

SELECT DISTINCT
d.dept_id, d.dept

FROM dept;

How to Write Efficient, Effective and Tuned SQL Queries

Using the Index Effectively

Its always best to use sargable conditions in Where clause instead of non-sargable search conditions, as non-sargable search conditions have a time cost associated with them. Therefor a best practice of query optimization is to use sargable conditions wherever possible. For example:

 

Use:

=, >, >=, <, =<, BETWEEN, LIKE ‘literal%’

 

Rather Than:

<>, !=, !>, !<, NOT EXISTS, NOT IN, NOT LIKE, LIKE ‘%literal%’

 

Use:

SELECT org.Name

FROM dbo.[PatReq.Organization] AS org

WHERE org.Name LIKE “SAMA%”

 

Rather Than:

SELECT org.Name

FROM dbo.[PatReq.Organization] AS org

WHERE org.Name LIKE “%SAMA%”

Avoid Arithmetic Operators in WHERE

For proper index usage it is best to avoid using arithmetic operators on Where clauses. For example:

 

Use:

SELECT org.ID, org.ProcessedDate

FROM dbo.[PatReq.Organization] AS org

WHERE org.ProcessedDate = “2002-04-04” + 2

 

Rather Than:

SELECT org.ID, org.ProcessedDate

FROM dbo.[PatReq.Organization] AS org

WHERE org.ProcessedDate + 2 = “2002-04-06”

Avoid FUNCTION on WHERE

It”s also best to avoid using Function on WHERE clauses, as doing so will not utilize the index properly. The following code sample depicts the best practice to use in this situation.

 

Use:

SELECT org.ID, org.ProcessedDate

FROM dbo.[PatReq.Organization]AS org

WHERE org.ProcessedDate >= “2002-04-01”

AND org.ProcessedDate < “2002-05-01

 

Rather Than:

SELECT org.ID, org.ProcessedDate

FROM dbo.[PatReq.Organization] AS org

WHERE DATEPART(yy,org.ProcessedDate) = “2002”

AND DATEPART(mm,org.ProcessedDate) = “4”

Operate On small Result Set

Replace NOT IN with LEFT OUTER JOIN

Sometimes we can avoid an extra row scan by replacing NOT IN with LEFT OUTER JOIN. For example, if we

 

Use:

SELECT org.*

FROM dbo.[PatReq.Organization] AS org

LEFT OUTER
JOIN [dbo].[PatReq.Document] doc

ON org.DocumentId = doc.Idextension

WHERE doc.Idextension IS NULL

 

Rather Than:

SELECT org.*

FROM dbo.[PatReq.Organization] AS org

WHERE org.DocumentId
NOT IN

(SELECT doc.Idextension


FROM [dbo].[PatReq.Document] doc)

Remove Unnecessary JOINS

If Derive has fewer rows than the Join table and only one column in Select, then it is best to avoid using JOIN and use a sub-query instead, as depicted in the sample below:

 

Use:

SELECT pd.Idextension

,(SELECT TOP 1 PO.Name

FROM [dbo].[PatReq.Organization] PO

WHERE PO.DocumentId = PD.IdExtension)
AS RepresentedOrganization

FROM [PatReq.Document] PD

 

Rather Than:

SELECT pd.Idextension

, PO.Name

FROM [PatReq.Document] PD

INNER JOIN [dbo].[PatReq.Organization] PO

ON PO.DocumentId = PD.IdExtension

Use TOP/LIMIT Keyword

IF you have duplicate rows and want to fetch just one row, its best practice to use TOP or LIMIT, as depicted below:

 

SELECT TOP 1 org_id

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID = 1

Conclusion

As a Big Data solution provider, Folio3 has extensive expertise in query optimization and performance management solutions. For more information about our Big Data services, please visit here or contact us for details.

CALL

USA408 365 4638

VISIT

1301 Shoreway Road, Suite 160,

Belmont, CA 94002

Contact us

Whether you are a large enterprise looking to augment your teams with experts resources or an SME looking to scale your business or a startup looking to build something.
We are your digital growth partner.

Tel: +1 408 365 4638
Support: +1 (408) 512 1812