SQL Best Practices

Abstract

Standardizing the T-SQL code across an organization is essential to facilitating effective and efficient code reviews, troubleshooting T-SQL script, supporting joint development efforts and ensuring smooth hand-off of projects from one group to another. Standardization helps keep code quality, readability and performance in check, while making the code easy to maintain.
In this post, we will discuss the following key aspects of SQL Best Practices standardization:

SQL Formatting and why it”s so important.
Common mistake while designing SQL code.
How to write efficient, effective and tuned SQL queries.

SQL Formatting

Proper formatting is extremely important for code readability (code review, hand-off & joint development). It is just as vital for code migration between different Databases as well as code maintainability and troubleshooting.

Where should the focus be (decide your own standard and follow)

Text Case

One of the primary focus areas in SQL formatting is text case. Common areas of focus here and are keywords, data types, user-defined objects, etc. One must decide which components of the script / query will be in CAPS, and which will be in lower case. Should all keywords be uppercase, and data types all lower case, while user-defined objects are in camel case (compound words where the first letter of an identifier is lowercase and the first letter of each subsequent concatenated word is capitalized). Such as table names for example?

The following is an example of such a text case standardization.

IF OBJECT_ID (“ProductDocs”,“U”)IS NOT NULL

DROP TABLE ProductDocs

CREATE TABLE ProductDocs

(

DocID intNOT NULL IDENTITY,

DocTitle nvarchar (50) NOT NULL,

DocFileName nvarchar (400) NOT NULL,

CONSTRAINT PK_ProductDocs_DocID PRIMARYKEYCLUSTERED (DocID ASC)

)

Commas

The placement of commas is also very important in SQL as they can cause syntax errors or alternatively increase readability, depending on where they are used. Should commas be used to end the line or begin the line? Should spaces or tabs ever be used before or after commas? This is what needs to be decided. Regardless of whichever approach is used, consistency in their usage must be maintained throughout the code. For example:

SELECT FirstName

,MiddleName

,LastName

,City

,StateProvinceName

FROM HumanResources.vEmployee

WHERE JobTitle LIKE“Production Technician%”

ORDERBY StateProvinceName, City

This is by no means limited to just SELECT lists and ORDER BY clauses, commas can also be used in DDL statements, as depicted below:

IF OBJECT_ID (“ProductDocs”,“U”)IS NOT NULL

DROP TABLE ProductDocs

CREATE TABLE ProductDocs

(

DocID int NOT NULL IDENTITY,

DocTitle nvarchar (50) NOT NULL,

DocFileName nvarchar (400) NOT NULL,

CONSTRAINT PK_ProductDocs_DocID PRIMARY KEY CLUSTERED (DocID ASC)

)

Spacing and Alignment and linefeeds

Another formatting consideration is deciding when to use Space and linefeed. For example, should they be used after each column, for each Reserve Word or for each SQL clause, etc.? The practice of indentation must also be standardized, whether it”s from the left or right; indentation must be consistent throughout the code to enhance readability. A good example for this is the sample given below.

SELECT FirstName

,MiddleName

,LastName

,City

,StateProvinceName

FROM HumanResources.vEmployee

WHERE JobTitle LIKE “Production Technician%”

ORDERBY StateProvinceName, City

Comments

Comments are also very helpful for query readability and aid in understanding complex logic when reading thorough hundreds of lines of code. One must make it a point to use comments in queries.

A good approach, is to keep multiple line comments in /*….*/ and one line comment in –, as depicted in the code sample below.

Retrieves employee data for Production Technicians

Orders data by state, then city

SELECT FirstName

,MiddleName + LastName –retrieve full name

,City

,dbo.proper_case(StateProvinceName) — User Defined Function

FROM
HumanResources.vEmployee

— Pull Production Technicians only

WHERE JobTitle LIKE “Production Technician%”

ORDER BY StateProvinceName,
City –order by state, then city

CONFIDENTIAL

Aliases

Aliases should also be used as much as possible, as they are not only helpful in query maintainability and avoiding referencing conflict, but also for performance because when you prefix each object with an alias, the optimizer does not need extra cost to look into the data dictionary for that particular object”s reference.

SELECT

(cnt.FirstName + ” “ + cnt.LastName) FullName

, emp.LoginID

, emp.Title

FROM
HumanResources.Employee emp

INNER JOIN Person.Contact cnt

ON
emp.ContactID = cnt.ContactID

ORDER BY cnt.LastName

JOIN

It’s also a good practice to use ANSI standard joins instead of old style joins, as this enables you to migrate code easily between different databases. For example:

–old style join

SELECT a.Au_id,

t.Title

FROM TITLES t,

AUTHORS a,

TITLEAUTHOR ta

WHERE a.Au_id = ta.Au_id

AND ta.Title_id = t.Title_id

ANDt.Title LIKE “%Computer%“

–ANSI join syntax

SELECT a.Au_id,

t.Title

FROM AUTHORS a

INNERJOIN TITLEAUTHOR ta

ON a.Au_id = ta.Au_id

INNERJOIN TITLES t

ON ta.Title_id = t.Title_id

WHERE t.Title LIKE “%Computer%“

Common Mistakes When Writing SQL Scripts

Avoid SELECT *

Avoid using Select * in queries. Since the more data you choose to read from tables, the slower your query will become. Large data size increases the time it takes for disk operations. Also, in cases where the database server is separate from the web server, fetching large data chunks in queries will result in longer network delays, as the data has to be transferred between the servers. For example:

Use:

SELECT customer_id, last_name, first_name, street, city FROM customer

Rather than:

SELECT * FROM customer;

Sample MySQL Code

// do I have any users from Alabama?

// what NOT to do:

$r = mysql_query(“SELECT
* FROM user WHERE state = “Alabama””);

if (mysql_num_rows($r) >
0) {

// …

}

// much better:

$r = mysql_query(“SELECT
1 FROM user WHERE state = “Alabama” LIMIT 1″);

if (mysql_num_rows($r) >
0) {

// …

}

Use UNION ALL Instead Of UNION unless really needed:

UNIONs perform a DISTINCT operation which requires extra cost. This must be avoided in order to facilitate faster query execution. For example:

Use:

SELECT employeeID, firstname, lastname

FROM names WHERE
dept = “prod”

UNION ALL

SELECT employeeID, firstname, IF EXISTS
(SELECT org_idastname

FROM names WHERE
city = “Orlando”

Rather than:

SELECT employeeID, firstname, lastname
FROM names WHERE
dept = “prod”

UNION

SELECT employeeID,
firstname, lastname FROM
names WHERE city = “Orlando”

Use EXISTS over COUNT (*)

Another best practice is to use Exists when you want to check the existence of records, rather than COUNT (*). For example:

Use:

IF EXISTS(SELECT org_id

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID = 1)

PRINT “Record Exists”

Rather than:

DECLARE @n INT

SELECT @n= Count(org_id)

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID = 1

IF @n > 0

PRINT “Record Exists”

Use Table Aliases

Always use table aliases in column names, in order to avoid situations where the optimizer has to search for columns in the data dictionary. Using this approach will result in faster query execution. For example:

Use:

SELECT

(cnt.FirstName
+ ” “ + cnt.LastName) AS FullName,

emp.LoginID,

emp.Title

FROM HumanResources.Employee AS emp

INNERJOIN Person.ContactAS cnt

ON emp.ContactID = cnt.Contact_ID

Rather than:

SELECT

(FirstName + ” “ + LastName) AS FullName,

LoginID,

Title

FROM HumanResources.Employee

INNER JOIN Person.Contact

ON ContactID = Contact_ID

BETWEEN vs. IN

Using the BETWEEN function is much more efficient than using IN. Therefore when searching for records in sequence, it”s always preferable to use BETWEEN rather than IN, as depicted below.

Use:

SELECT org.*

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID
BETWEEN 12654 AND 12657

Rather than:

SELECT org.*

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID
IN(12654,12655,12656,12657)

Do not use column numbers in the ORDER BY clause

Using column numbers in Order By clauses incurs additional CPU cost, so it”s better not to use them in conjunction. The following is a good example of this scenario:

Use:

SELECT OrderID, OrderDate

FROM Orders

ORDER BY
OrderDate

Rather than:

SELECT OrderID, OrderDate

FROM Orders

ORDER BY 2

AVOID DISTINCT For Unique Rows

If you are already working on unique rows don”t use the DISTINCT keyword, as this too incurs additional CPU cost. For example:

Use:

SELECT d.dept_id, d.dept

FROM dept;

Rather than:

SELECT DISTINCT
d.dept_id, d.dept

FROM dept;

How to Write Efficient, Effective and Tuned SQL Queries

Using the Index Effectively

Its always best to use sargable conditions in Where clause instead of non-sargable search conditions, as non-sargable search conditions have a time cost associated with them. Therefor a best practice of query optimization is to use sargable conditions wherever possible. For example:

Use:

=, >, >=, <, =<, BETWEEN, LIKE ‘literal%’

Rather Than:

<>, !=, !>, !<, NOT EXISTS, NOT IN, NOT LIKE, LIKE ‘%literal%’

Use:

SELECT org.Name

FROM dbo.[PatReq.Organization] AS org

WHERE org.Name LIKE “SAMA%”

Rather Than:

SELECT org.Name

FROM dbo.[PatReq.Organization] AS org

WHERE org.Name LIKE “%SAMA%”

Avoid Arithmetic Operators in WHERE

For proper index usage it is best to avoid using arithmetic operators on Where clauses. For example:

Use:

SELECT org.ID, org.ProcessedDate

FROM dbo.[PatReq.Organization] AS org

WHERE org.ProcessedDate = “2002-04-04” + 2

Rather Than:

SELECT org.ID, org.ProcessedDate

FROM dbo.[PatReq.Organization] AS org

WHERE org.ProcessedDate + 2 = “2002-04-06”

Avoid FUNCTION on WHERE

It”s also best to avoid using Function on WHERE clauses, as doing so will not utilize the index properly. The following code sample depicts the best practice to use in this situation.

Use:

SELECT org.ID, org.ProcessedDate

FROM dbo.[PatReq.Organization]AS org

WHERE org.ProcessedDate >= “2002-04-01”

AND org.ProcessedDate < “2002-05-01

Rather Than:

SELECT org.ID, org.ProcessedDate

FROM dbo.[PatReq.Organization] AS org

WHERE DATEPART(yy,org.ProcessedDate) = “2002”

AND DATEPART(mm,org.ProcessedDate) = “4”

Operate On small Result Set

Replace NOT IN with LEFT OUTER JOIN

Sometimes we can avoid an extra row scan by replacing NOT IN with LEFT OUTER JOIN. For example, if we

Use:

SELECT org.*

FROM dbo.[PatReq.Organization] AS org

LEFT OUTER
JOIN [dbo].[PatReq.Document] doc

ON org.DocumentId = doc.Idextension

WHERE doc.Idextension IS NULL

Rather Than:

SELECT org.*

FROM dbo.[PatReq.Organization] AS org

WHERE org.DocumentId
NOT IN

(SELECT doc.Idextension

FROM [dbo].[PatReq.Document] doc)

Remove Unnecessary JOINS

If Derive has fewer rows than the Join table and only one column in Select, then it is best to avoid using JOIN and use a sub-query instead, as depicted in the sample below:

Use:

SELECT pd.Idextension

,(SELECT TOP 1 PO.Name

FROM [dbo].[PatReq.Organization] PO

WHERE PO.DocumentId = PD.IdExtension)
AS RepresentedOrganization

FROM [PatReq.Document] PD

Rather Than:

SELECT pd.Idextension

, PO.Name

FROM [PatReq.Document] PD

INNER JOIN [dbo].[PatReq.Organization] PO

ON PO.DocumentId = PD.IdExtension

Use TOP/LIMIT Keyword

IF you have duplicate rows and want to fetch just one row, its best practice to use TOP or LIMIT, as depicted below:

SELECT TOP 1 org_id

FROM dbo.[PatReq.Organization] AS org

WHERE org.ID = 1

Conclusion

As a Big Data solution provider, Folio3 has extensive expertise in query optimization and performance management solutions. For more information about our Big Data services, please visit here or contact us for details.

Products

Animal Management

Solutions

Animal Management

Crop Management

Agriculture Ecosystems

Animal Management

Crop Management

Services

Products

Food Safety Suite

Food ERP

Solutions

Food App Development

Integrations

Industries

AI Product Development in Healthcare

Healthcare Data Analytics

Healthcare Interoperability

Virtual Care Solutions

Wearable & Device Integration

Patient Data Privacy & Compliance

Healthcare Product Design & Strategy

Healthcare ERP

FemTech / Women Health

Resource Augmentation

Medical Billing & Claims

Specialized Healthcare Services

SQL Best Practices

Posted by: Blog Post May 17, 2013

COMMENTS ()

Abstract

SQL Formatting

Text Case

Commas

Spacing and Alignment and linefeeds

Comments

Aliases

JOIN

Common Mistakes When Writing SQL Scripts

Avoid SELECT *

Use UNION ALL Instead Of UNION unless really needed:

Use EXISTS over COUNT (*)

Use Table Aliases

BETWEEN vs. IN

Do not use column numbers in the ORDER BY clause

AVOID DISTINCT For Unique Rows

How to Write Efficient, Effective and Tuned SQL Queries

Using the Index Effectively

Avoid Arithmetic Operators in WHERE

Avoid FUNCTION on WHERE

Operate On small Result Set

Replace NOT IN with LEFT OUTER JOIN

Remove Unnecessary JOINS

Use TOP/LIMIT Keyword

Conclusion

CALL

EMAIL

VISIT

Contact us