SQL Performance Tuning
SQL tuning is the process of enhancing SQL queries to speed up the performance of your server. Its main goal is to shorten the time it takes for a user to receive a response after sending a query and to utilize fewer resources in the process. The idea is that users can occasionally produce the same intended result set with a faster-running query.
SQL performance tuning is speeding up queries against a relational database.
There is no single tool or method for optimizing SQL speed. Instead, it’s a set of practices that draw on a variety of methods and procedures. Let’s talk about some of the major factors that influence how many computations the database must perform and how long it takes for your query to run:
- Table size: Performance may be impacted if your query hits one or more tables with millions of rows or more.
- Joins: Your query is likely to be slow if it joins two tables in a way that significantly raises the number of rows in the return set.
- Aggregations: Adding several rows together to create a single result needs more processing than just retrieving those values individually.
- Other users executing queries: The more queries a database has open at once, the more it must process at once, and the slower it will all be. It can be particularly problematic if other users are using a lot of resources to perform queries that meet some of the aforementioned requirements.
Ways to Find Slow SQL Queries in SQL Server to Measure SQL Server Performance:
1. Create an Execution Plan: Being able to create execution plans, which you can do with SQL Server Management Studio, is essential for diagnosing slow queries. Actual execution plans are generated after the queries run. To create an execution plan:
- Start by selecting “Database Engine Query” from the toolbar of SQL Server Management Studio.
- Enter the query, and then select “Include Actual Execution Plan” from the Query menu.
- It’s time to run your query at this point. You can do that by pressing F5 or the “Execute” toolbar button.
- The execution plan will then be shown in the results pane, under the “Execution plan” tab, in SQL Server Management Studio.
2. Monitor Resource Usage: The performance of a SQL database is greatly influenced by resource use. Monitoring resource use is important since you can’t improve what you don’t measure. Use the System Monitor tool on Windows to evaluate SQL Server’s performance. You may view SQL Server objects, performance counters, and other object activity with it. Simultaneously watch Windows and SQL Server counters with System Monitor to see if there is any correlation between the two services’ performance.
3. Use SQL DMVs to Find Slow Queries: The abundance of dynamic management views (DMVs) that SQL Server includes is one of its best features. There are many of them, and they may offer a lot of knowledge on a variety of subjects.
Various DMVs are available that offer information on query stats, execution plans, recent queries, and much more. These can be combined to offer some incredible insights.
Optimizing a Query for SQL:
The health of the server and database is crucial. If a query is inefficient or contains errors, it will eat up the production database’s resources and slow down or disconnect other users. You must optimize your queries to have the least possible negative impact on database performance. The following techniques can be used to optimize SQL queries:
1. SELECT fields instead of using SELECT *:
Many SQL developers use the SELECT * shortcut to query all of the data in a table while executing exploratory queries. However, if a table contains many fields and rows, fetching superfluous data taxes the database. By naming fields in the SELECT statement, you direct the database to query only the data you actually need to meet your business requirements. For example:
Inefficient:
SELECT * FROM GeeksTable;
Efficient:
SELECT FirstName, LastName, Address, City, State, Zip FROM GeeksTable;
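The difference in data volume can be sketched with a small, self-contained script. It assumes an in-memory SQLite database (via Python's sqlite3 module) in place of SQL Server, and the Notes and Photo columns and the sample row are hypothetical, added only to stand in for wide fields a report does not need.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption; the
# article targets SQL Server, but these two SELECT statements are portable).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GeeksTable ("
    "FirstName TEXT, LastName TEXT, Address TEXT, City TEXT, "
    "State TEXT, Zip TEXT, Notes TEXT, Photo BLOB)"
)
# Hypothetical sample row: Notes and Photo are wide columns we do not need.
conn.execute(
    "INSERT INTO GeeksTable VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("Asha", "Gupta", "12 MG Road", "Noida", "UP", "201301",
     "x" * 10_000, b"\x00" * 100_000),
)


def payload_size(sql):
    """Approximate size of everything the query fetches.

    Counts characters for text and bytes for blobs, which is close enough
    to compare the two queries (the sample text is plain ASCII).
    """
    total = 0
    for row in conn.execute(sql):
        for value in row:
            total += len(value)
    return total


all_columns = payload_size("SELECT * FROM GeeksTable")
needed_columns = payload_size(
    "SELECT FirstName, LastName, Address, City, State, Zip FROM GeeksTable"
)
print("SELECT * fetched:", all_columns)
print("Named columns fetched:", needed_columns)
```

With a single row the gap is already large; it grows linearly with the number of rows returned.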
2. Avoid SELECT DISTINCT:
SELECT DISTINCT is a handy way to remove duplicates from a query. It works by grouping on every field in the query to produce distinct results. However, a lot of computing power is needed to achieve this goal. Furthermore, data may be grouped to the point of being inaccurate: two different people who share a first name, last name, and state collapse into a single row. To avoid SELECT DISTINCT, select more fields so that the results are distinct on their own.
Inefficient and inaccurate:
SELECT DISTINCT FirstName, LastName, State FROM GeeksTable;
Efficient and accurate:
SELECT FirstName, LastName, Address, City, State, Zip FROM GeeksTable;
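The accuracy problem is easy to reproduce. The sketch below assumes an in-memory SQLite database rather than SQL Server, and the three sample people are invented for illustration: two of them share a name and a state but live at different addresses.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GeeksTable ("
    "FirstName TEXT, LastName TEXT, Address TEXT, "
    "City TEXT, State TEXT, Zip TEXT)"
)
people = [
    # Two different people who happen to share name and state.
    ("Amit", "Gupta", "4 Park St", "Noida", "UP", "201301"),
    ("Amit", "Gupta", "9 Lake Rd", "Lucknow", "UP", "226001"),
    ("Riya", "Shah", "2 Hill Ave", "Surat", "GJ", "395003"),
]
conn.executemany("INSERT INTO GeeksTable VALUES (?, ?, ?, ?, ?, ?)", people)

# DISTINCT on three fields merges the two Amit Guptas into one row.
distinct_rows = conn.execute(
    "SELECT DISTINCT FirstName, LastName, State FROM GeeksTable"
).fetchall()

# Selecting enough fields keeps every person without needing DISTINCT.
full_rows = conn.execute(
    "SELECT FirstName, LastName, Address, City, State, Zip FROM GeeksTable"
).fetchall()

print(len(distinct_rows), "rows with SELECT DISTINCT")
print(len(full_rows), "rows with the wider SELECT")
```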
3. Create queries with INNER JOIN (not WHERE or cross join):
WHERE clause joins are preferred by some SQL developers, as in the example below:
SELECT GFG1.CustomerID, GFG1.Name, GFG1.LastSaleDate FROM GFG1, GFG2 WHERE GFG1.CustomerID = GFG2.CustomerID
Logically, this kind of join describes a Cartesian join, also known as a Cartesian product or a CROSS JOIN. A Cartesian join creates every possible combination of rows from the two tables. If we had 1,000 customers and 1,000 total sales in this example, the query would first produce 1,000,000 results before filtering for the 1,000 entries where CustomerID is correctly joined. The database has then performed 1,000 times more work than was necessary, which is a wasteful use of its resources. Some DBMS systems recognize WHERE joins and run them as INNER JOINs automatically, but in those that do not, Cartesian joins pose a particular problem for large-scale databases, where they can produce billions or trillions of results.
To prevent creating a Cartesian Join, use INNER JOIN instead:
SELECT GFG1.CustomerID, GFG1.Name, GFG1.LastSaleDate FROM GFG1 INNER JOIN GFG2 ON GFG1.CustomerID = GFG2.CustomerID
The database would then generate only the desired records where CustomerID is equal.
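The row counts in this example can be checked directly. The sketch below assumes an in-memory SQLite database and a hypothetical layout in which GFG1 holds 1,000 customers and GFG2 holds 1,000 sales, one per customer; the SaleID column is an invented helper.

```python
import sqlite3

# In-memory SQLite database; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GFG1 (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE GFG2 (SaleID INTEGER PRIMARY KEY, CustomerID INTEGER)"
)
conn.executemany(
    "INSERT INTO GFG1 VALUES (?, ?)",
    [(i, f"Customer {i}") for i in range(1, 1001)],
)
conn.executemany(
    "INSERT INTO GFG2 VALUES (?, ?)",
    [(i, i) for i in range(1, 1001)],
)

# Every customer paired with every sale: the unfiltered Cartesian product.
cartesian_rows = conn.execute(
    "SELECT COUNT(*) FROM GFG1 CROSS JOIN GFG2"
).fetchone()[0]

# Only the pairs whose CustomerID values match.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM GFG1 "
    "INNER JOIN GFG2 ON GFG1.CustomerID = GFG2.CustomerID"
).fetchone()[0]

ratio = cartesian_rows // joined_rows
print(cartesian_rows, joined_rows, ratio)
```

The ratio between the two counts is the amount of wasted work an engine would do if it really materialized the product before filtering.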
4. Use WHERE instead of HAVING to define filters:
An efficient query retrieves only the necessary records from the database. HAVING statements are computed after WHERE statements in accordance with the SQL order of operations. A WHERE statement is more efficient if the goal is to filter a query based on row conditions.
For instance, assume 500 sales were made in 2019, and we want to find how many sales were made per customer that year.
SELECT GFG1.CustomerID, GFG1.Name, COUNT(*) AS SalesCount FROM GFG1 INNER JOIN GFG2 ON GFG1.CustomerID = GFG2.CustomerID GROUP BY GFG1.CustomerID, GFG1.Name HAVING GFG2.LastSaleDate BETWEEN '2019-01-01' AND '2019-12-31'
This query would retrieve 1,000 sales records from the GFG2 table, then filter them down to the 500 entries produced in 2019, and only then count the dataset. (Many DBMS systems also reject a HAVING filter on a column that is neither grouped nor aggregated, which is one more reason to filter rows with WHERE.)
WHERE clauses, in contrast, set a limit on how many records can be retrieved:
SELECT GFG1.CustomerID, GFG1.Name, COUNT(*) AS SalesCount FROM GFG1 INNER JOIN GFG2 ON GFG1.CustomerID = GFG2.CustomerID WHERE GFG2.LastSaleDate BETWEEN '2019-01-01' AND '2019-12-31' GROUP BY GFG1.CustomerID, GFG1.Name
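A small sketch of the same idea, assuming an in-memory SQLite database and invented sample rows. It uses COUNT(*) so that the query actually counts sales per customer (an assumption about the intent of the example), and it shows where HAVING does belong: filtering on the aggregate itself.

```python
import sqlite3

# In-memory SQLite database; customers and sales are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GFG1 (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE GFG2 ("
    "SaleID INTEGER PRIMARY KEY, CustomerID INTEGER, LastSaleDate TEXT)"
)
conn.executemany(
    "INSERT INTO GFG1 VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi"), (3, "Meera")],
)
conn.executemany(
    "INSERT INTO GFG2 VALUES (?, ?, ?)",
    [
        (1, 1, "2019-03-10"),
        (2, 1, "2019-07-22"),
        (3, 1, "2018-11-05"),
        (4, 2, "2019-01-15"),
        (5, 2, "2018-06-30"),
        (6, 3, "2018-02-14"),
    ],
)

JOIN = "FROM GFG1 INNER JOIN GFG2 ON GFG1.CustomerID = GFG2.CustomerID "
IN_2019 = "WHERE GFG2.LastSaleDate BETWEEN '2019-01-01' AND '2019-12-31' "

# Rows that reach GROUP BY without and with the WHERE filter.
rows_without_where = conn.execute("SELECT COUNT(*) " + JOIN).fetchone()[0]
rows_with_where = conn.execute(
    "SELECT COUNT(*) " + JOIN + IN_2019
).fetchone()[0]

# WHERE trims rows first; GROUP BY then counts only the 2019 sales.
sales_2019 = conn.execute(
    "SELECT GFG1.CustomerID, GFG1.Name, COUNT(*) AS SalesCount "
    + JOIN + IN_2019
    + "GROUP BY GFG1.CustomerID, GFG1.Name ORDER BY GFG1.CustomerID"
).fetchall()

# HAVING is the right tool for a condition on the aggregate itself.
repeat_buyers_2019 = conn.execute(
    "SELECT GFG1.CustomerID, GFG1.Name, COUNT(*) AS SalesCount "
    + JOIN + IN_2019
    + "GROUP BY GFG1.CustomerID, GFG1.Name HAVING COUNT(*) >= 2"
).fetchall()

print(rows_without_where, rows_with_where)
print(sales_2019)
print(repeat_buyers_2019)
```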
5. Use wildcards at the end of a phrase only:
Wildcards enable the broadest search when searching plaintext data, such as names or cities. However, the broadest search is also the least efficient. When a leading wildcard is used, particularly in combination with an ending wildcard, the database must search every record for a match anywhere inside the chosen field.
Consider this query, which is meant to find cities that start with “No”:
SELECT City FROM GeekTable WHERE City LIKE '%No%'
This query returns the expected results, Noida and Noopur. It also returns unexpected results: any city that contains “No” anywhere in its name.
A more efficient query would be:
SELECT City FROM GeekTable WHERE City LIKE 'No%'
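The difference in results can be seen in a short sketch. It assumes an in-memory SQLite database, where LIKE is case-insensitive for ASCII letters by default (other systems and collations may differ), and the city list is invented for illustration.

```python
import sqlite3

# In-memory SQLite database with a hypothetical list of cities.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GeekTable (City TEXT)")
conn.executemany(
    "INSERT INTO GeekTable VALUES (?)",
    [("Noida",), ("Noopur",), ("Lucknow",), ("Hanover",), ("Pune",)],
)

# Leading and trailing wildcard: matches "no" anywhere in the name.
contains_no = [
    row[0]
    for row in conn.execute(
        "SELECT City FROM GeekTable WHERE City LIKE '%No%' ORDER BY City"
    )
]

# Trailing wildcard only: matches names that start with "No".
starts_with_no = [
    row[0]
    for row in conn.execute(
        "SELECT City FROM GeekTable WHERE City LIKE 'No%' ORDER BY City"
    )
]

print(contains_no)
print(starts_with_no)
```

Besides returning only the intended rows, a pattern anchored at the start of the string gives the database a chance to use an index on the column, which a leading wildcard generally rules out.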
6. Use LIMIT to sample query results:
Before executing a query for the first time, use a LIMIT statement to check whether the results will be desirable and meaningful. (Some DBMS systems use TOP instead of LIMIT; SQL Server, for example, uses TOP.) The LIMIT statement returns only the given number of records. By using a LIMIT statement, you can avoid stressing the production database with a big query only to discover that it needs to be edited or improved.
We will look at a maximum of 10 rows in the 2019 sales query from above:
SELECT GFG1.CustomerID, GFG1.Name, COUNT(*) AS SalesCount FROM GFG1 INNER JOIN GFG2 ON GFG1.CustomerID = GFG2.CustomerID WHERE GFG2.LastSaleDate BETWEEN '2019-01-01' AND '2019-12-31' GROUP BY GFG1.CustomerID, GFG1.Name LIMIT 10
We can see by the sample whether we have a usable data set or not.
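As a rough illustration, the sketch below samples ten rows from a table of 1,000 hypothetical sales. It assumes an in-memory SQLite database, which supports LIMIT; on SQL Server the equivalent would be written with TOP.

```python
import sqlite3

# In-memory SQLite database; the 1,000 sales rows are generated test data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GFG2 ("
    "SaleID INTEGER PRIMARY KEY, CustomerID INTEGER, LastSaleDate TEXT)"
)
conn.executemany(
    "INSERT INTO GFG2 (CustomerID, LastSaleDate) VALUES (?, ?)",
    [
        (i % 50, f"2019-{i % 12 + 1:02d}-{i % 28 + 1:02d}")
        for i in range(1000)
    ],
)

full_count = conn.execute("SELECT COUNT(*) FROM GFG2").fetchone()[0]

# Sample at most 10 rows to sanity-check the shape of the result
# before running the full query.
sample = conn.execute(
    "SELECT CustomerID, LastSaleDate FROM GFG2 "
    "WHERE LastSaleDate BETWEEN '2019-01-01' AND '2019-12-31' "
    "LIMIT 10"
).fetchall()

print(full_count, "rows in the table;", len(sample), "rows sampled")
for row in sample:
    print(row)
```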
7. Run your query during off-peak hours:
To reduce the impact of your analytical queries on the database, plan to run any query at a time when the database won’t be as busy. The query should be executed when the number of concurrent users is at its lowest, which is often overnight.
The more of the following a query involves, the more it benefits from being run off-peak:
- Selecting from large tables (>1,000,000 records)
- Cartesian Joins or CROSS JOINs
- Looping statements
- SELECT DISTINCT statements
- Nested subqueries
- Wildcard searches in long text or memo fields
- Multiple schema queries
Index Tuning:
Index tuning is the part of database tuning that deals with choosing and building indexes. The objective of index tuning is to speed up query processing. It can be challenging to choose indexes in advance in dynamic environments with numerous ad-hoc queries. In automated approaches, the queries that rely on indexes are analyzed and the indexes are generated as needed, so users of the database do not need to take any specific action to tune the indexes.
Indexes can be used to increase a database’s speed and query performance. Index tuning is the process of improving the index selection.
Advantages of Index Tuning:
The performance of queries and databases can be improved by using the Index Tuning Wizard (succeeded by the Database Engine Tuning Advisor in later versions of SQL Server). It accomplishes this using the following methods:
- It makes recommendations for the best index usage based on the analysis of queries performed by the query optimizer in relation to workload.
- The changes in query distribution, index utilization, and performance are examined to determine the impact. Additionally, it suggests strategies to fine-tune the database for a select group of problematic queries.
- SQL Server Profiler is used to record activity traces and to improve performance. The trace can be run over an extended period to capture a broad range of activity.
One of the best ways to boost performance in a database application is through effective indexes. Without an index, the SQL Server engine is like a reader who looks through each page of a book to find a word. A reader can finish the task considerably faster by using the index in the back of the book. A table scan occurs in a database when a query cannot be assisted by an index. When doing a table scan, SQL Server looks at each row in the table to fulfil the query’s requirements. Table scans are sometimes unavoidable, but they have a significant negative impact on performance when applied to big tables.
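The switch from a table scan to an index lookup can be observed in a query plan. The sketch below assumes an in-memory SQLite database and its EXPLAIN QUERY PLAN statement instead of SQL Server's execution plans; the Sales table, its data, and the index name idx_sales_customer are hypothetical.

```python
import sqlite3

# In-memory SQLite database with 10,000 generated sales rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales ("
    "SaleID INTEGER PRIMARY KEY, CustomerID INTEGER, Amount REAL)"
)
conn.executemany(
    "INSERT INTO Sales (CustomerID, Amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)


def plan(sql):
    """Return the human-readable steps of SQLite's plan for the query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT * FROM Sales WHERE CustomerID = 42"

# No index on CustomerID yet: every row must be examined.
plan_without_index = plan(query)

# With an index, the engine can seek straight to the matching entries.
conn.execute("CREATE INDEX idx_sales_customer ON Sales (CustomerID)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, but the first plan reports a scan of the table and the second a search that uses the new index.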
- Consequences of Creating Index: When it comes time to edit records, indexes slow down performance. The indexes on the data must be updated whenever a query alters the data in a table. You will need to test and monitor your database to find the right balance before settling on the proper number of indexes. Static systems, where databases are used heavily for reporting, can afford more indexes to support the read-only queries. A database with a large volume of data modification transactions will need fewer indexes to allow higher throughput. Indexes also require disk space. The precise size depends on the number of records in the table, the number and size of the index’s columns, and other factors.
- Creating The Best Index: There are a few rules to follow while creating the best indexes for your application. When choosing the indexes for your tables, take into account the following factors, from the columns you choose to the data values inside them:
- Short Indexes: Short indexes are advantageous for two reasons. First, database work is disk-intensive by nature. Larger index keys require the database to read from disk more often, reducing throughput. Second, index entries are frequently used in comparisons, and smaller entries are simpler to compare. Since an integer is compact and simple for the database to compare, a single integer column makes the best index possible. Character strings, on the other hand, demand character-by-character comparison and careful attention to collation settings.
- Distinct Indexes: The indexes with the fewest duplicated values are the most efficient. As an analogy, imagine a phone book for a town where practically everyone has the last name Gupta. In such a town, a phone book arranged by last name is not particularly helpful, because looking for a Gupta eliminates only a small number of records.
- A distinct index is one that has a significant portion of unique values. A unique index is obviously extremely selective because there are no duplicate entries. In order to know how selective an index is, many databases keep statistics about each index in their records. When creating an execution plan for a query, the database takes this information into account.
- Clustered Indexes: Many databases have one special index per table that contains all of the data from a row. SQL Server calls this index a clustered index. A clustered index is efficient because each index entry has all the information you need, and there are no references to follow to pick up extra data values.
Every non-trivial table should, as a general rule, have a clustered index. If you construct only one index for a table, make it a clustered index. When you define a primary key and no clustered index exists yet, SQL Server will by default construct one using the primary key column as the index key. Clustered indexes are the most efficient indexes because, when used, they always cover a query. Clustered indexes are also used in many databases to help the database efficiently manage the space needed to hold the table.
Make sure to select a column with static data when selecting the column(s) for a clustered index. The database may need to shift the index entry if you modify a record and alter the value of a column in a clustered index (to keep the entries in sorted order). It’s important to keep in mind that index entries for clustered indexes contain all of the column data; therefore, shifting an item is equivalent to running a DELETE statement followed by an INSERT, which can easily lead to performance issues if done frequently. Because of this, primary key columns or foreign key columns frequently have clustered indexes.
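The selectivity idea from the “Distinct Indexes” guideline above can be estimated with a simple ratio of distinct values to total rows. This is a sketch under stated assumptions: an in-memory SQLite database, a hypothetical Customers table in which 90 of 100 people share the last name Gupta, and a helper named selectivity that is invented for illustration. Real databases maintain their own index statistics rather than computing this on demand.

```python
import sqlite3

# In-memory SQLite database; the Customers table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, LastName TEXT)"
)
# 90 customers named Gupta plus 10 customers with unique last names.
last_names = ["Gupta"] * 90 + [f"Name{i}" for i in range(10)]
conn.executemany(
    "INSERT INTO Customers (LastName) VALUES (?)",
    [(name,) for name in last_names],
)


def selectivity(column):
    """Fraction of distinct values in a column (1.0 means all unique).

    The column name is interpolated into the SQL text, so pass only
    trusted, hard-coded identifiers.
    """
    distinct_count, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM Customers"
    ).fetchone()
    return distinct_count / total


last_name_selectivity = selectivity("LastName")
customer_id_selectivity = selectivity("CustomerID")
print("LastName:", last_name_selectivity)
print("CustomerID:", customer_id_selectivity)
```

A column whose ratio is close to 1.0 is a strong index candidate, while a column dominated by one value, like LastName here, lets an index eliminate very few rows.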
Finding the correct index to use when creating an execution plan is one of the database’s most crucial tasks. The majority of large databases come with tools that can be used to optimize and tune indexes as well as display the execution plan for a query. When building and altering indexes for any database, you should keep in mind the helpful rules mentioned above.