Hey guys! How are you? Excited for another post???

Today we're going to dive into a very common topic for those who work with data integration in the SQL Server ecosystem: monitoring executions in SSISDB. If you use the SQL Server Integration Services (SSIS) catalog to store packages, you know that it's not always easy to extract what really matters quickly.

What is the SSISDB Catalog?

For those just arriving, SSISDB is the central database that stores all projects, packages, parameters, environments and, most importantly for this post, execution history. When you run a package, SQL Server records each step, each error and each performance metric within this database.
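For a quick programmatic look at what the catalog stores, the public catalog views list the folders, projects and packages. This is a read-only sketch using the standard SSISDB views:

SELECT
    [f].[name]  AS [folder_name],
    [p].[name]  AS [project_name],
    [pk].[name] AS [package_name]
FROM [SSISDB].[catalog].[folders]  AS [f]
JOIN [SSISDB].[catalog].[projects] AS [p]  ON [p].[folder_id]   = [f].[folder_id]
JOIN [SSISDB].[catalog].[packages] AS [pk] ON [pk].[project_id] = [p].[project_id]
ORDER BY [f].[name], [p].[name], [pk].[name];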

Note: By default, SSISDB has an automatic cleanup routine (the SSIS Server Maintenance Job). If you can't find old data, check the Retention Period (RETENTION_WINDOW) property in the catalog settings. The default is 365 days, but in high-load environments this retention can make the database grow enormously.
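The current retention setting can be read straight from the catalog properties view and adjusted with the catalog.configure_catalog procedure. A sketch (the value of 60 days is just an example; test retention changes outside production first, since the cleanup job will purge older history):

-- Read the current retention window (in days)
SELECT [property_name], [property_value]
FROM [SSISDB].[catalog].[catalog_properties]
WHERE [property_name] = N'RETENTION_WINDOW';

-- Example: reduce retention to 60 days
EXEC [SSISDB].[catalog].[configure_catalog]
     @property_name  = N'RETENTION_WINDOW',
     @property_value = 60;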

Monitoring via the Interface (SSMS)

The most common way to see what happened to your packages is through the native reports in SQL Server Management Studio (SSMS).

  • In Object Explorer, expand the Integration Services Catalogs node.
  • Right-click the SSISDB folder.
  • Go to Reports > Standard Reports > All Executions.

While these reports are visually pleasing, they have some severe limitations for those who need to do this frequently:

  • Analyze dozens or hundreds of runs: SSMS is not designed for mass analysis. Each execution needs to be opened individually, without a consolidated view by package, project, server or period, making it practically impossible to identify failure patterns, recurring problematic packages or performance degradation over time.
  • Slow: In catalogs with thousands of executions, the native reports run heavy, poorly indexed queries against SSISDB that can take several minutes to load a simple screen, making operational use in mission-critical environments unfeasible.
  • Filter Difficulty: Filters are extremely limited: it is not possible to filter by error message, error code, specific task, pipeline component, message substring, nor combine multiple criteria (for example: packages that failed more than X times in the last month).
  • Troubleshooting Errors: To find the actual error of an execution, the user needs to navigate through several layers of screens (All Executions > Overview > Messages > Internal Messages), often finding only generic messages before reaching the actual error in the pipeline.
  • Create reports: SSMS reports cannot be integrated with Power BI, SSRS or any other BI tool, making it impossible to create dashboards with load SLA, ranking of packages with the most failures, average execution time, historical trend and operational indicators.
  • Automate analytics: There is no way to automate alerts, correlations or advanced analyses (e.g.: “notify me if a package fails 3 times in 1 hour”, “identify loads running above the P95 duration”, “correlate failures with time of day and server”), requiring manual monitoring.
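The first of those alert scenarios can be sketched directly against the public catalog views. A hypothetical check for packages that failed three or more times in the last hour (status 4 = Failed), which could be scheduled through a SQL Server Agent job:

-- Packages with 3+ failed executions in the last hour
SELECT
    [e].[folder_name],
    [e].[project_name],
    [e].[package_name],
    COUNT(*) AS [failure_count]
FROM [SSISDB].[catalog].[executions] AS [e]
WHERE [e].[status] = 4 -- Failed
  AND [e].[start_time] >= DATEADD(HOUR, -1, SYSDATETIMEOFFSET())
GROUP BY [e].[folder_name], [e].[project_name], [e].[package_name]
HAVING COUNT(*) >= 3;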

Optimizing the Query with T-SQL

To gain agility, nothing beats a well-structured script that queries the catalog tables directly. The script below brings back the last 100 executions, already translating the status codes and including the error message when the package failed.

SELECT DISTINCT TOP ( 100 )
    [A].[folder_name]                                    AS [Nm_Folder],
    [A].[project_name]                                   AS [Nm_Project],
    [A].[package_name]                                   AS [Nm_Package],
    [A].[use32bitruntime]                                AS [Fl_32Bit],
    ( CASE [A].[status]
          WHEN 1 THEN '1 - Created'
          WHEN 2 THEN '2 - Running'
          WHEN 3 THEN '3 - Canceled'
          WHEN 4 THEN '4 - Failed'
          WHEN 5 THEN '5 - Pending'
          WHEN 6 THEN '6 - Ended unexpectedly'
          WHEN 7 THEN '7 - Succeeded'
          WHEN 8 THEN '8 - Stopping'
          WHEN 9 THEN '9 - Completed'
      END
    )                                                    AS [Ds_Status],
    [A].[start_time]                                     AS [Dt_Inicio],
    [A].[end_time]                                       AS [Dt_Fim],
    DATEDIFF( SECOND, [A].[start_time], [A].[end_time] ) AS [Nr_Duracao_Segundos],
    [A].[executed_as_name]                               AS [Nm_Executor],
    [A].[caller_name]                                    AS [Nm_Chamador],
    [A].[stopped_by_name]                                AS [Nm_Parado_Por],
    [A].[server_name]                                    AS [Nm_Servidor],
    [C].[message_type]                                   AS [Nr_Tipo_Mensagem],
    [C].[message]                                        AS [Ds_Mensagem_Erro]
FROM
    [SSISDB].[internal].[execution_info]               AS [A]
    LEFT JOIN [SSISDB].[internal].[operations]         AS [B] ON [A].[execution_id] = [B].[operation_id]
    LEFT JOIN [SSISDB].[internal].[operation_messages] AS [C] ON [C].[operation_id] = [B].[operation_id]
                                                                 AND [C].[message_type] = 120 -- 120 = error message
    LEFT JOIN [SSISDB].[internal].[event_messages]     AS [D] ON [D].[operation_id] = [B].[operation_id]
                                                                 AND [D].[event_name] = 'OnError'
                                                                 AND ISNULL( [D].[subcomponent_name], '' ) <> 'SSIS.Pipeline'
-- WHERE
--     [A].[execution_id] = 11501
ORDER BY
    [A].[start_time] DESC;

In this script, we access the internal views of SSISDB. Note the LEFT JOIN to [operation_messages] filtered on message_type = 120 (error messages). This is essential: a successful execution has no error messages, and we don't want those rows to disappear from the report, which an INNER JOIN would cause.
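The commented-out WHERE clause is the natural place to add custom filters. A couple of hypothetical fragments, assuming the same aliases as the script above (the project name is just an illustration):

-- Only failed executions from the last 7 days
WHERE [A].[status] = 4
  AND [A].[start_time] >= DATEADD(DAY, -7, SYSDATETIMEOFFSET())

-- Or: executions of a specific project (hypothetical name)
WHERE [A].[project_name] = N'MyEtlProject'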

If you want a version that returns a single row per execution, with only its most recent error message, you can use the script below:

SELECT TOP (100)
    [e].[folder_name]                                                AS [Nm_Folder],
    [e].[project_name]                                               AS [Nm_Project],
    [e].[package_name]                                               AS [Nm_Package],
    [e].[use32bitruntime]                                            AS [Fl_32Bit],
    CASE [e].[status]
        WHEN 1 THEN '1 - Created'
        WHEN 2 THEN '2 - Running'
        WHEN 3 THEN '3 - Canceled'
        WHEN 4 THEN '4 - Failed'
        WHEN 5 THEN '5 - Pending'
        WHEN 6 THEN '6 - Ended unexpectedly'
        WHEN 7 THEN '7 - Succeeded'
        WHEN 8 THEN '8 - Stopping'
        WHEN 9 THEN '9 - Completed'
    END                                                              AS [Ds_Status],
    [e].[start_time]                                                 AS [Dt_Inicio],
    [e].[end_time]                                                   AS [Dt_Fim],
    DATEDIFF(SECOND, [e].[start_time], COALESCE([e].[end_time], SYSDATETIME())) AS [Nr_Duracao_Segundos],
    [e].[executed_as_name]                                           AS [Nm_Executor],
    [e].[caller_name]                                                AS [Nm_Chamador],
    [e].[stopped_by_name]                                            AS [Nm_Parado_Por],
    [e].[server_name]                                                AS [Nm_Servidor],
    [m].[message_type]                                               AS [Nr_Tipo_Mensagem],
    [m].[message]                                                    AS [Ds_Mensagem_Erro],
    [m].[message_time]                                               AS [Dt_Mensagem]
FROM
    [SSISDB].[catalog].[executions] AS [e]
    OUTER APPLY
    (
        SELECT TOP (1)
            [om].[message_type],
            [om].[message],
            [om].[message_time]
        FROM
            [SSISDB].[catalog].[operation_messages] AS [om]
        WHERE
            [om].[operation_id] = [e].[execution_id]
            AND [om].[message_type] = 120
        ORDER BY
            [om].[message_time] DESC
    ) AS [m]
-- WHERE [e].[execution_id] = 11501
ORDER BY
    [e].[start_time] DESC;

Why use a Script instead of the Interface?

Feature        | SSMS Interface                       | T-SQL Script
---------------|--------------------------------------|------------------------------------------
Speed          | Low (graphical interface loading)    | High (direct execution in the engine)
Customization  | Limited to fixed filters             | Full (filters by date, project, error)
Automation     | Not possible                         | Can feed dashboards (Power BI/Grafana)
Error history  | Requires navigating sub-reports      | Returned on the same row as the execution

Monitoring SSISDB is not just about checking whether packages ran. When the catalog grows large, log queries can start to generate I/O contention and heavy locks.

Attention: If you notice that SSISDB queries are slow, check the Wait Types. It is very common to find LCK_M_S or PAGEIOLATCH_SH if the cleanup job is running concurrently with log queries.
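One quick way to spot those waits is to look at the requests currently running against SSISDB. A sketch using the standard DMVs (requires the VIEW SERVER STATE permission):

-- Current requests touching SSISDB and what they are waiting on
SELECT
    [r].[session_id],
    [r].[status],
    [r].[wait_type],
    [r].[wait_time] AS [wait_time_ms],
    [t].[text]      AS [sql_text]
FROM [sys].[dm_exec_requests] AS [r]
CROSS APPLY [sys].[dm_exec_sql_text]([r].[sql_handle]) AS [t]
WHERE [r].[database_id] = DB_ID(N'SSISDB');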

To improve the reading performance of this script in critical environments:

  • Indexes: SSISDB does not ship with all the ideal indexes for custom queries. If you run this type of query often, consider creating indexes on date columns such as start_time.
  • Isolation: If you are going to use this script in a real-time monitoring dashboard, consider the WITH (NOLOCK) hint so that reading the logs does not block running packages from writing new ones. Keep in mind that NOLOCK allows dirty reads, so in-flight rows may look inconsistent.
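As an illustration of the index suggestion, a hypothetical nonclustered index on the start_time column of the internal operations table. Note that modifying SSISDB internal tables is not officially supported by Microsoft and may be undone or broken by upgrades, so treat this strictly as a sketch to adapt and test:

USE [SSISDB];
GO
-- Hypothetical index to speed up date-ordered history queries
CREATE NONCLUSTERED INDEX [IX_operations_start_time]
    ON [internal].[operations] ([start_time] DESC)
    INCLUDE ([status], [operation_type]);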

With this script in hand, you can quickly identify which package failed, why it failed, and who executed it, all without depending on the slowness of the standard SSMS reports.

I hope you liked this tip, a big hug and see you next time!