Find last statistics update & outdated and missing stats in Synapse SQL Pools

Disclaimer: Hey Guys, this post contains affiliate link to help our reader to buy best product\service for them. It helps us because we receive compensation for our time and expenses.

InterServer Web Hosting and VPS
Introduction:

Azure Synapse Dedicated SQL Pools is that while they have auto create stats, there is no auto update stats. So statistics will become stale faster. In addition, if you create an empty table, the SQL pool will just assume 1,000 rows and pretty much never update statistics. We call this a missing statistic.

Check last stats update:

We will use  STATS_DATE function to get the date of last update. A statistics could be 10 years old and still valid if the table hasn’t changed, or it could be 24 hours old but a billion rows have been inserted causing skew.

SELECT
    sm.[name] AS [schema_name],
    tb.[name] AS [table_name],
    co.[name] AS [stats_column_name],
    st.[name] AS [stats_name],
    STATS_DATE(st.[object_id],st.[stats_id]) AS [stats_last_updated_date]
FROM
    sys.objects ob
    JOIN sys.stats st
        ON  ob.[object_id] = st.[object_id]
    JOIN sys.stats_columns sc
        ON  st.[stats_id] = sc.[stats_id]
        AND st.[object_id] = sc.[object_id]
    JOIN sys.columns co
        ON  sc.[column_id] = co.[column_id]
        AND sc.[object_id] = co.[object_id]
    JOIN sys.types  ty
        ON  co.[user_type_id] = ty.[user_type_id]
    JOIN sys.tables tb
        ON  co.[object_id] = tb.[object_id]
    JOIN sys.schemas sm
        ON  tb.[schema_id] = sm.[schema_id]
WHERE
    st.[user_created] = 1;
Working on missing and outdated stats:

The answer lies in this sample script from Microsoft which tracks out of date statistics at the partition level using sys.dm_db_partition_stats

https://github.com/Microsoft/sql-data-warehouse-samples/blob/main/samples/sqlops/MonitoringScripts/ImpactedTables

I’ve taken that concept, cleaned it up into a view, added by partition support and the first sample is here in a view called vTableStats

https://github.com/ProdataSQL/SynapseTools/blob/main/SqlPools/Maintenance/vTableStats.sql

Demo of dbo.vTableStats

Running this view on my sample EDW in a SQL Pool I can now locate all the missing and out date statistics

SELECT *  FROM [dbo].[vTableStats] 
WHERE recommend_update =1

We can see one table which was created by using a CREATE TABLE and INSERT INTO (rather than CTAS), which the statistics think has 1,000 rows but actually has over 61 million rows and over 40 other potential statistic issues.

How to update stats:
UPDATE STATISTICS [Schema_Name].[Table_Name] ([stats_name])

Thank you!