Home
Articles
Blog
Professional Services
Search
Newsletter
Write for us
Career Centre
Contact Us
Questions
Links
Free Stuff
Terms & Conditions
Disclaimer

Subscribe To This Site
XML RSS
Add to Google
Add to My Yahoo!
Add to My MSN
Subscribe with Bloglines

SQL Performance

SQL Performance

In my article http://www.gre-sqlserver-solutions.com/sqlserverperformance.html I gave some useful suggestions for initial troubleshooting and performance analysing for DBAs.

I know what you’re thinking; sql performance tuning is done by a DBA. But in practice, the DBA is not going to have time to scrutinise every change made to a stored procedure. Learning to do basic tuning as a developer might save you from reworking code late in the game.

Having recent acquired the responsibility for a small team of Web developers, I have realised the importance of ensuring that SQL Server-based web applications are also performing adequately. I often get queries from developers asking me to investigate potential performance hogs, it’s never the application, it's the network, it's the server hardware, it's SQL Server, it's the users themselves – but more often than not they will claim it's the database! In reality the issues are hidden in either the coding or the actual configuration of the application in relation to the underlying SQL database.Whilst we as DBAs often find ourselves forever tuning databases etc, developers too need to consider the impact they could have on the overall performance of the database/application. In order for SQL Server-based applications to run at their full potential, developers & DBAs alike must begin considering performance issues during the early stages of the application's development.

For any application, SQL Server or otherwise, scalability and performance have to be built in from the very beginning. Once the application is rolled out, it is very difficult, time-consuming, and in the long run, expensive to resolve most scalability and performance issues.In this article I am to highlight some useful tips on areas such as designing, coding, and implementing performance optimized SQL Server applications. I do not profess to be a developer in any shape or form, however, after recent conversations with my peers, I have come across some very useful advice and hopefully I’ll be able to pass that on to you here.

My disclaimer here is doing these won’t guarantee absolute, headache-free speedy performance, but it won’t be slow either!There are some pretty basic areas to consider from the onset, for example ensure you choose the correct DB for your objectives. Does SQL Server actually give you everything you are going to need? Many developers unknowingly choose and pay for a particular database without planning or considering the later requirements of their database, by which stage it’s too late! As well as the choice of actual DBMS, the developer will have a variety of different languages to choose from. It is crucial for developers to use the proper language for the job.

Coding is of paramount importance – ensuring it is efficient, compliant and secure are fundamentals of ensuring it performs well. Most developers are comfortable with coding, more comfortable than us DBAs that’s for sure. When developing, make sure you choose the proper data type for the column. Choosing the appropriate data types can affect how quickly SQL Server can SELECT, INSERT, UPDATE, and DELETE data, and choosing the most optimum data type is not always obvious.

(See Books Online for all the available Data Types – it would take me an age to list them here as well as outlining the pros and cons of each)It may seem common sense to use the appropriate data types that but this becomes especially important when indexing is used on the database. Therefore, if you're using indexing, you'll probably be perplexed as to why your query is taking forever, when it should be a simple index scan. This is the reason that developers need to pay special attention to data types and their applications.Now is probably a good time to mention indexes.

If you didn’t know - a database index is a data structure that improves the speed of operations on a database table.

Some people think that all you have to do is index everything, and then all of your performance issues will go away. It doesn't work that way. Just as an index can speed data access, it can also degrade access if it is inappropriately selected. The problem with extra indexes is that SQL Server must maintain them every time that a record is INSERTED, UPDATED, or DELETED from a table. While maintaining one or two indexes on a table is not too much overhead for SQL Server to deal with, if you have four, five, or more indexes, they can be a large performance burden on tables. Ideally, you want to have as few as indexes as you can. It is often a balancing act to select the ideal number of indexes for a table in order to find optimal performance.

As a general rule of thumb, don't automatically add indexes to a table because it seems like the right thing to do. Only add indexes if you know that they will be used by the queries run against the table. If you don't know what queries will be run against your table, and then don't add any indexes until you know for sure. It is too easy to make a guess on what queries will be run, create indexes, and then later find out your guesses were wrong. You must know the type of queries that will be run against your data, and then these need to be analyzed to determine the most appropriate indexes, and then the indexes must be created and tested to see if they really help or not.

Indexing is incredibly important when working with large tables, however, occasionally smaller tables should be indexed, if they are expected to grow. Small tables that will remain small, however, should not be indexed.

Many developers write their code during a test phase on tables with 10, or 100 rows, and are satisfied when their code performs adequately. However, as the table grows to 10,000, or 1,000,000 rows, the code will slow to a snail's pace, and then you’ll find yourself sitting impatiently at your screen waiting for results to be returned.

In brief, create a primary key on each table you create and unless you are really knowledgeable enough to figure out a better plan, make it the clustered index. Create an index on any column that is a foreign key. If you know it will be unique, set the flag to force the index to be unique. Don’t index anything else (yet).

It is important to remember that indexes need to be rebuilt occasionally, as data is added to the table (as discussed in http://www.gre-sqlserver-solutions.com/sqlserverperformance.html). Additionally, while indexes increase data access performance, it slows the modification of data. Because of this, most DBMS have an option to temporarily disable an index to facilitate mass data modification, and then allow it to be re-enabled and rebuilt later.

Efficient coding usually involves the inclusion of joins. I doubt very much that you can get away with writing thousands of single table select statements (although that would be a lot easier!).

Table joins can be a big contributor of performance problems, especially if the joins include more than two tables, or if the tables are very large. Unfortunately, joins are a fact of life in relational databases. Because they are so common, you will need to take extra time to help ensure that your joins are as optimal as possible. Understanding joins became the bain of my life in my early days as an inspiring DBA, the subject can be a complex one to those unfamiliar with SQL code but I’ll try to summarise the different types below:-

Inner joins return matching records whereas outer joins do not require each record to have a matching record. There are different sub-types of outer joins too:

  • Left outer join: A left outer join of tables A and B will return all matching records of A and B, as well as any non-matched records from the left table, in this case, A.
  • Right outer join: A right outer join of tables A and B will return all matching records of A and B, as well as any non-matched records from the right table, in this case, B.
  • Full outer join: A full outer join of tables A and B will return all matching records of A and B, as well as any non-matched records from both tables.
They are also something called a self Joins which is a join from a table to itself (surprisingly!).Implementing joins correctly can make your SQL code extremely powerful. But you will need to make sure you use the proper join for your given situation, otherwise performance will suffer.

Normalising the data is also something developers will need to consider. The benefits of normalisation in sql performanceare clear - as the data progresses through the normalization forms, it becomes cleaner, better organized, and faster. The main purpose of normalizing a table is to eliminate the redundant data and to split the tables to give more clarity. The less data there is, the less work SQL Server has to perform, enhancing its performance.

As your database grows, the effects of normalisation will become much more apparent with regards to speed and maintaining data integrity. However, conversely there are some situations in which normalisation doesn't make sense, such as when normalising the data will create excessively complex queries required to return the data.

As for making it compliant, all programming languages have standards which web developers should be aware, and SQL isn't any different. SQL was standardized by ANSI and then ISO, with new revisions to the language being occasionally submitted. The latest revision is SQL: 2008, although the most important revision that developers should be aware of is SQL:1999. The 1999 revision introduced recursive queries, triggers, support for PL/SQL and T-SQL, and a few newer features. It also defined that the JOIN statements be done in the FROM clause, as opposed to the WHERE clause.

When writing code, it is important to keep in mind why standards-compliant code is useful. There are two primary reasons why standards are used. The first is maintainability, and the second is cross-platform standardisation. As with desktop applications, it is assumed that websites will have long lifespans, and will go through various updates to add new functionality and repair problems. When a different programmer accesses your code in 2, 5 or 10 years, will they still be able to understand what your code is doing? Standards and comments are designed to promote maintainability.

The other reason is cross-platform functionality. With CSS, there is currently an ongoing standards battle between Firefox, Internet Explorer and other browsers about the interpretation of code. The reason for the SQL standards is to prevent a similar situation between Oracle (boo!), Microsoft and other SQL variants such as MySQL.

So you’ve now ensured your code is efficient and compliant – it also needs to be secure.Databases store a plethora of valuable information and are often prime targets for attack. Many developers are unaware that their code has critical security vulnerabilities, which is a very scary fact not only for clients, but also for you. Currently, developers can be held legally accountable if their own personal negligence results in a database security risk that is then exploited – might be worth informing your developers of that if didn’t already know!It’s not a straightforward task of writing secure code really. Developers need to understand the DBMS with which they are working and the inherent security issues.

SQL injection is the current popular method of attack but isn't the only security vulnerability for databases and developers to worry about. It is important to test your code and be familiar with the latest security issues for your DBMS in order to protect against attacks. So, what else can developers do to ensure optimal performance?

One of the age old arguments is whether the use of stored procedures is of more benefit?

Virtually all of the Transact-SQL used in your SQL Server-based applications should be encapsulated in stored procedures, not run as dynamic SQL or scripts. This not only reduces network traffic but it speeds up the T-SQL because the code in the stored procedure residing on the server is already pre-compiled. Here are a couple of things to keep in mind when writing stored procedures for optimal performance.

When a stored procedure is first executed, it is optimised and a query plan is compiled and cached in SQL Server's memory. If the same stored procedure is called again, it will use the cached query plan instead of creating a new one, saving time and boosting performance. This may or may not be what you want. If the query in the stored procedure is the same each time, then this is a good thing. But if the query is dynamic (the WHERE clauses changes substantially from one execution of the stored procedure to the next), then this is a bad thing, as the query will not be optimised when it is run, and the performance of the query can suffer.

If you know that your query will vary each time it is run from the stored procedure, you will want to add the WITH RECOMPILE option when you create the stored procedure. This will force the stored procedure to be re-compiled each time it is run, ensuring the query is optimised each time it is run.Another minor tip is to include the statement, "SET NOCOUNT ON" in your stored procedures. If you don't turn this feature on, then every time a SQL statement is executed, SQL Server will send a response to the client indicating the number of rows affected by the statement. It is rare that the client will ever need this information. Using this statement will reduce network traffic between the server and the client if you don't need this feature.Note here, any stored procedures prefixed with 'sp_' are first searched for in the Master database rather than the one it is created in. This will cause a delay in the stored procedure being executed.

Plus, when calling a stored procedure you should include the owner name in the call, i.e. use ‘exec dbo.spJustinsStoredProc’ instead of ‘exec spJustinsStoredProc’. Prefixing it with the owner when executing it will stop SQL Server from placing a COMPILE lock on the procedure while it determines if all objects referenced in the code have the same owners as the objects in the current cached procedure plan. I found this to be the case on the majority of our SPs!

One aspect to note when using Stored Procedures is the possibility of “deadlocking”. This can occur within a stored procedure when two user processes have locks on separate objects and each process is trying to acquire a lock on the object that the other process has. When this happens, SQL Server ends the deadlock by automatically choosing one and aborting the process, allowing the other process to continue. The aborted transaction is rolled back and an error message is sent to the user of the aborted process.

To help avoid deadlocking in your SQL Server application, try to design your application using these suggestions: 1) have the application access server objects in the same order each time; 2) during transactions, don't allow any user input. Collect it before the transaction begins; 3) keep transactions short and within a single batch, and 4) if appropriate, use as low of an isolation level as possible for the user connection running the transaction.Other quick suggestions include:

Never underestimating the use of a WHERE clause in your SELECT statements. I know this sounds obvious and you’ve heard it a million times, but only return the columns and the rows you need. But also check the order of WHERE clauses when using AND!Also, cursors have their advantages but ideally, for high-performing SQL Server-based applications, cursors should be avoided. While in some cases they are unavoidable, often there are ways to remove them from your code. Quicker alternatives include derived tables, sub-queries, multiple queries and to some degree temporary tables. (But even temporary tables need to be used with caution).

Unfortunately, these are not always possible, and you have to use a cursor. If you find it impossible to avoid using cursors in your applications. There’s plenty of information on cursors in Books Online, if you have to include them, use them efficiently and effectively.As with cursors, triggers also have their benefits. But you need to use them cautiously.

Triggers can be a powerful tool in Transact-SQL, but they can produce a lot of overhead. Here are some tips on how to optimize trigger performance. Obviously the more code you have in a trigger the slower it will run and try to avoid rolling back triggers because of the overhead involved. Instead of letting the trigger find a problem and rollback a transaction, catch the error before it can get to the trigger (easier said than done I know!).There are many other little performance enhancing tips out there, things like avoiding excessive string concatenation and using the more efficient operators, sometimes there is an alternative way to re-write your SQL statement. The order of performance (with the most efficient first) is =, >, >=, <, <=, LIKE, <>.

This article has a lot of useful information, but on the other hand, it has barely touched the surface when it comes to performance tuning – (I would lose what’s left of my sanity if I were to continue!). Reading this article has been your first step. You will have to research particular areas of concern, there’s ample amount of literature out there.Please send us your thoughts/comments, whether you think this article was worthwhile, whether we left anything ‘essential’ out etc. (http://www.gre-sqlserver-solutions.com/ContactUS.html)

At the very least, if you take advantage of the advice and information in this article, you will find that performance tuning your SQL Server-related applications is not a big as mystery as you might think. So good luck!


footer for sql performance page