Write down the 3 biggest pain points
- "Is" it the database that's slowing things down, or is it environmental? Is perfmon the answer?
- Deadlocks / locking. Should we be running in a different isolation mode?
- Audit solution - triggers vs in-app, etc.
Gather SQL Server stats
- Gather wait statistics (see the query sketch after this list)
- Batch Requests per Second (measure of workload)
- "SQL Query Stress" tool
- rollback will give you a % complete status while 'kill' will not
- rollback only ever executes on a single thread, regardless of DOP
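A quick way to watch an in-flight rollback's progress via the DMVs (the filter is just illustrative):

```sql
-- percent_complete is populated for ROLLBACK commands
SELECT session_id, command, percent_complete, estimated_completion_time
FROM sys.dm_exec_requests
WHERE command LIKE '%ROLLBACK%';
```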
sp_WhoIsActive
- only shows you the last command in the transaction in the session
- make sure you're using the @get_locks parameter to see the locks held by the session
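For example, to include the lock detail (returned as an XML column per session):

```sql
EXEC sp_WhoIsActive @get_locks = 1;
```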
sp_AskBrent
- will diagnose whether or not there's really a problem with the SQL Server
- SOS_Scheduler_Yield waits point to a CPU issue
- Page IO Latch waits should return in less than 100 ms, otherwise there may be a storage issue

exec sp_configure 'remote admin connections', 1
- this will enable the Dedicated Admin Connection (DAC)
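A minimal sketch of turning the remote DAC on and reaching it (server name is a placeholder):

```sql
EXEC sp_configure 'remote admin connections', 1;
RECONFIGURE;
-- Connect to the DAC with the ADMIN: prefix, e.g.:  sqlcmd -S ADMIN:MyServer -E
```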
sp_BlitzCache
- cpu weight is a measure of cpu load
- focus on read weight to see who is doing the most work: exec sp_BlitzCache @top = 10, @sort_order = 'reads'
- also run it sorted by 'executions'
- Run sp_BlitzCache once per week and review to get a feel for what the top queries should be

Triage tools, in order:
- Monitoring software (flatline)
- sp_WhoIsActive
- sp_AskBrent
- sp_BlitzCache
- sp_BlitzIndex
- sp_Blitz
- set statistics io on
- when using profiler, set 75 milliseconds as a threshold to filter out the small statements
- Extended Events - uses Event Tracing for Windows; lighter weight than a server-side trace
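A sketch of an Extended Events session that captures only statements over the 75 ms threshold mentioned above (session and file names are placeholders; duration is in microseconds):

```sql
CREATE EVENT SESSION [LongStatements] ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.sql_text, sqlserver.database_name)
    WHERE (duration > 75000)   -- 75 ms, expressed in microseconds
)
ADD TARGET package0.event_file (SET filename = N'LongStatements.xel');
GO
ALTER EVENT SESSION [LongStatements] ON SERVER STATE = START;
```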
- Single Statement Table Value Function - SQL Server can optimize and process this much more efficiently because SQL will inline the function
- SQL Server 2014 estimates that 100 rows will come back from a TVF. 2012 and earlier return 1.
- By pushing data into a temp table first, statistics are going to be accurate
- Look for usages of fn_split in our larger procedures. SQL server does a terrible job of estimating when this function is involved
- DON'T JOIN TO TABLE VALUE FUNCTIONS (or table variables for that matter). use temp tables instead
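A sketch of the temp-table pattern, assuming a dbo.fn_split(@list, @delimiter) style function that returns a 'value' column (the function signature and column names are assumptions):

```sql
DECLARE @IdList VARCHAR(MAX) = '1,2,3';

-- Materialize the TVF output first so the optimizer gets real row counts
SELECT value
INTO #SplitIds
FROM dbo.fn_split(@IdList, ',');

-- Now join to the temp table instead of the function
SELECT p.*
FROM dbo.Posts AS p
JOIN #SplitIds AS s ON s.value = p.Id;
```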
- CTEs are executed every time they are referenced.
- Dynamic SQL can result in a large plan cache if you're not using parameters properly
- SARG (search-argument-able). Non-SARGable means it won't use an index; this can happen with optional sproc parameters
- WHERE LEFT(...), UPPER(...), LIKE, YEAR(...), VARCHAR = @NVARCHAR - all of these result in non-sargable queries and will cause full table scans
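A sketch of the difference, using an assumed dbo.Users table with a CreationDate column:

```sql
-- Non-sargable: the function wrapped around the column forces a scan
SELECT Id FROM dbo.Users WHERE YEAR(CreationDate) = 2013;

-- Sargable rewrite: a range on the bare column can seek an index on CreationDate
SELECT Id FROM dbo.Users
WHERE CreationDate >= '20130101' AND CreationDate < '20140101';
```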
- Correlated subquery issues:
- has some of the same problems as scalar functions
- can use windowing functions ('OVER') instead
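A sketch of swapping a correlated subquery for a windowed aggregate (dbo.Posts and OwnerUserId reused from examples elsewhere in these notes):

```sql
-- Correlated subquery: conceptually re-evaluated per outer row
SELECT p.Id,
       (SELECT COUNT(*) FROM dbo.Posts p2
        WHERE p2.OwnerUserId = p.OwnerUserId) AS PostsByOwner
FROM dbo.Posts p;

-- Windowed version: same result from a single pass
SELECT p.Id,
       COUNT(*) OVER (PARTITION BY p.OwnerUserId) AS PostsByOwner
FROM dbo.Posts p;
```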
- Nested Views - can cause performance issues. Be careful
CXPACKET
- means there were inefficiencies in how work was handed out

Cost Threshold for Parallelism
- default of 5 is typically too low these days; a starting point should be 50 or 100 (see the sketch after this list)
- Amdahl's law - adding more workers becomes counterproductive at some point
- CXPACKET should be below 50%, but don't drive it to 0 - usually means there are missing indexes or query tuning opportunities
- this ensures that only big queries go parallel
- this is not fixed by setting max DOP, you need to index and tune queries
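A sketch for raising the threshold (50 is just the starting point suggested above):

```sql
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
```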
- Don't look at %s when looking at wait types, look at clock time
- Monitoring software will measure hours of wait time per hour. Very useful
- "Density Vector" average number of rows for any given value
- If an execution plan gets seeded with a small number of estimated rows, then when the sproc runs again with different parameters it won't necessarily use statistics to find the plan to run
- Look in the actual execution plan XML and you'll see ParameterCompiledValue differences from ParameterRuntimeValue. If it's zero, then it's not using statistics, it's just using the cached item.
- With procedures, look for differences in the avg duration between statements and the procedure. This could be an indication that parameter sniffing is an issue.
- stored procedures use statistics differently from direct statements. This can be a real issue when you're trying to trace down perf issues with stored procs inside of mgmt studio.
- `exec sp_helpstats 'dbo.Posts', 'All'`
- `DBCC SHOW_STATISTICS('Posts', 'kl_posts_OwnerUserId')`
- Steps will show you the number of histogram blocks. 200 is the max, 201 steps with nulls
- If you use a local variable in a sproc then it uses a different set of statistics - it uses density vector calculations instead. Talked with someone else at the conference who said they used this across all their sprocs to avoid parameter sniffing issues. This could be used instead of WITH RECOMPILE
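A sketch of that local-variable approach (proc, table, and column names are assumptions):

```sql
CREATE PROCEDURE dbo.usp_PostsByUser_DensityVector
    @UserId INT
AS
BEGIN
    -- Copying the parameter into a local variable prevents sniffing;
    -- the optimizer falls back to the density vector (average selectivity)
    DECLARE @LocalUserId INT = @UserId;

    SELECT Id, Title
    FROM dbo.Posts
    WHERE OwnerUserId = @LocalUserId;
END;
```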
- You won't be able to reproduce the problem outside of a sproc
- INSTEAD, create a temp sproc to find out what's going on...
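Temporary stored procedures (note the # prefix) compile and sniff parameters like regular procs, so they make a convenient scratch copy for this (names and body are placeholders):

```sql
CREATE PROCEDURE #ReproSlowProc
    @UserId INT
AS
    SELECT Id, Title
    FROM dbo.Posts
    WHERE OwnerUserId = @UserId;
GO

EXEC #ReproSlowProc @UserId = 557499;   -- a sample parameter value
```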
- With the WITH RECOMPILE option, the exec plan will be re-compiled and then not stored in cache
- Note that this only applies to the outermost statement
- Rather than 'WITH RECOMPILE' at the sproc level, you should use 'OPTION (RECOMPILE)' at the statement level within the sproc
- DON'T put RECOMPILE hints in the header of the sproc
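For instance, keeping the hint on just the one parameter-sensitive statement (names assumed):

```sql
CREATE PROCEDURE dbo.usp_PostsForUser
    @UserId INT
AS
BEGIN
    -- Only this statement is recompiled each run; the rest of the proc caches normally
    SELECT Id, Title
    FROM dbo.Posts
    WHERE OwnerUserId = @UserId
    OPTION (RECOMPILE);
END;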
- `OPTION (OPTIMIZE FOR UNKNOWN)` as an alternative to recompile. Forces it to calculate estimated rows based on averages
- Local variables inside a sproc CANNOT be sniffed
- `OPTION (OPTIMIZE FOR (@UserId = 557499))` - this will give you a consistent execution plan
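Both hints side by side, as they'd appear inside a proc that takes @UserId (table and column names assumed):

```sql
-- Build the plan as if @UserId were 557499 (a known representative value)
SELECT Id, Title
FROM dbo.Posts
WHERE OwnerUserId = @UserId
OPTION (OPTIMIZE FOR (@UserId = 557499));

-- Or ignore the sniffed value entirely and use the density-vector average
SELECT Id, Title
FROM dbo.Posts
WHERE OwnerUserId = @UserId
OPTION (OPTIMIZE FOR UNKNOWN);
```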
- Another option would be to have different versions of the sproc for small / large customers
- Prepared statements act just like sprocs
- NonClustered indexes always include the clustered index key
- When testing slow-running sprocs, create a temp sproc, look at the 'compiled for' value from the slow-running instance, and then OPTIMIZE FOR that value
- Updating statistics forces a recompile. This doesn't mean that the stats were bad; it could be that the cached plan was bad
- Read committed is the current default pessimistic isolation level in the boxed version of sql server
- Use monitoring tools or perfmon alerts to identify blocking issues
- Create an alert in SQL Agent on General Statistics:Processes Blocked. You can also set the response to run sp_WhoIsActive when this happens. Make sure you set a soak time (delay between responses) of about 5 minutes to prevent multiple alerts from coming in
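A sketch of that alert, assuming a default instance (so the perfmon object is SQLServer:General Statistics) and a pre-existing Agent job that captures sp_WhoIsActive output:

```sql
EXEC msdb.dbo.sp_add_alert
    @name = N'Blocked processes detected',
    @performance_condition = N'SQLServer:General Statistics|Processes blocked||>|0',
    @delay_between_responses = 300,          -- 5-minute soak time between firings
    @job_name = N'Capture sp_WhoIsActive';   -- hypothetical response job
```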
LCK_M_IX, LCK_M_S, LCK_M_U, LCK_M_IS
- locks are either row or object locks

LCK_M_SCH_S, LCK_M_SCH_M
- index rebuilds will also grab a schema lock when indexing occurs
- ALTER TABLE, DISABLE TRIGGER - nobody can use the table while you're running these
- Blocked process report (look for links in materials) can help you visualize the blocking chains every 5 seconds
- Deadlock monitor runs every 5 seconds. It also reports on blocking, which can be picked up by a trace, which in turn can be picked up by Extended Events.
- 'Lock Escalation' SQL Server will escalate to a larger scoped lock once you cross a certain threshold. If you do any type of clustered scan there's a risk that the lock could escalate
- Look at wait stats to see what types of locking are occurring
- Identify the root of the blocking chain
- Optimistic Concurrency
- Read Committed Snapshot Isolation (RCSI)
- most recent committed version at the statement level
- Snapshot Isolation
- transactionally consistent within a transaction
- Using snapshot isolation for reports can help without blocking the rest of the application. Make the big read-only reports use snapshot isolation. Best to use only on big read queries, not updates
- You can run RCSI and SNAPSHOT isolation together. SNAPSHOT is great for long running reports, just be sure to wrap the logic for the report in a transaction and set SNAPSHOT isolation
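A sketch of setting this up (database name is a placeholder; the report query is illustrative):

```sql
-- Enable the optimistic options at the database level
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In the reporting session: everything inside the transaction sees one consistent snapshot
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
    SELECT OwnerUserId, COUNT(*) AS PostCount
    FROM dbo.Posts
    GROUP BY OwnerUserId;
COMMIT;
```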
- With RCSI and SNAPSHOT, performance of tempdb becomes vital.
- Monitor tempdb perf and look for abandoned transactions (look back through notes on how to do this)