May 10, 2024
Calculating the dollars wasted on idle time in Snowflake
You might be surprised with the number.
Sahil Singla
Co-founder of Baselit
In the current economic climate, it's critical for data teams to extract maximum value from every dollar spent on Snowflake. Optimizing queries and processing less data are effective ways to reduce costs in Snowflake. However, these tasks have limited returns on investment because they require continuous effort and bandwidth.
The activity with the highest ROI involves minimizing money spent on idle time in Snowflake. This is money that you are spending but not getting anything in return. Moreover, this is a 1-time activity where you can write a piece of code that runs in the background to handle this, and once it's set up, it doesn't require further intervention.
Why do my warehouses run idle?
Each Snowflake warehouse has an AUTO_SUSPEND parameter, typically set to 60 seconds. For example, if a query runs for 90 seconds, the warehouse will continue running for another 60 seconds after the query completes.
Total time billed = 150 seconds
Actual time spent processing data = 90 seconds
Percentage of money spent on idle time = 60/150 = 40%
Note: Although the proportion of money spent on idle time decreases with heavier workloads, it remains significant.
Here is a warehouse timeline diagram that depict $$$ being wasted on idle time.
Multi-cluster Warehouses: The idle time for multi-cluster warehouses tends to be more significant due to the conservative manner in which Snowflake spins down the clusters.
The following is an image from the official documentation that illustrates the spin down process for multi-cluster warehouses. It's important to note that in terms of cost, a cluster is equivalent to a warehouse of the same size.
When accumulated across multiple warehouses, the amount spent on idle time becomes significant and should be reduced.
Rather than just estimating these costs, use the following SQL query to accurately calculate the credits expended solely on idle time.
-- startDate and endDate represents period for which idle time is being calculated
SET startDate = (SELECT current_date - 7);
SET endDate = (SELECT current_date);
WITH warehouse_credits_map AS (
SELECT * FROM (
VALUES
('X-Small', 'STANDARD', 1),
('Small', 'STANDARD', 2),
('Medium', 'STANDARD', 4),
('Large', 'STANDARD', 8),
('X-Large', 'STANDARD', 16),
('2X-Large', 'STANDARD', 32),
('3X-Large', 'STANDARD', 64),
('4X-Large', 'STANDARD', 128),
('5X-Large', 'STANDARD', 256),
('6X-Large', 'STANDARD', 512),
('Medium', 'SNOWPARK-OPTIMIZED', 6),
('Large', 'SNOWPARK-OPTIMIZED', 12),
('X-Large', 'SNOWPARK-OPTIMIZED', 24),
('2X-Large', 'SNOWPARK-OPTIMIZED', 48),
('3X-Large', 'SNOWPARK-OPTIMIZED', 96),
('4X-Large', 'SNOWPARK-OPTIMIZED', 192),
('5X-Large', 'SNOWPARK-OPTIMIZED', 384),
('6X-Large', 'SNOWPARK-OPTIMIZED', 768)
) AS t (warehouse_size, warehouse_type, credits_per_hour)
),
stop_threshold AS (
SELECT MAX(timestamp) AS timestamp
FROM snowflake.account_usage.warehouse_events_history
),
-- warehouse_cluster_status stores warehouse events timeline
warehouse_cluster_status_base AS (
SELECT
warehouse_id,
warehouse_name,
cluster_number,
timestamp AS valid_from,
LEAD(timestamp) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY timestamp ASC) AS valid_to,
event_name = 'RESUME_CLUSTER' AS is_active,
event_reason,
user_name,
role_name,
query_id
FROM snowflake.account_usage.warehouse_events_history
WHERE
event_name IN ('RESUME_CLUSTER', 'SUSPEND_CLUSTER')
AND event_state = 'COMPLETED'
AND valid_from::date BETWEEN $startDate AND $endDate
),
warehouse_cluster_status AS (
SELECT
warehouse_cluster_status_base.warehouse_id,
warehouse_cluster_status_base.warehouse_name,
warehouse_cluster_status_base.cluster_number,
warehouse_cluster_status_base.is_active,
warehouse_cluster_status_base.valid_from,
COALESCE(warehouse_cluster_status_base.valid_to, stop_threshold.timestamp) AS valid_to,
warehouse_cluster_status_base.event_reason,
warehouse_cluster_status_base.user_name,
warehouse_cluster_status_base.role_name,
warehouse_cluster_status_base.query_id
FROM warehouse_cluster_status_base
CROSS JOIN stop_threshold
),
warehouse_cluster_status_enriched AS (
SELECT *,
LAG(valid_from, 1) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY valid_from) AS previous_valid_from
FROM warehouse_cluster_status
WHERE valid_from::date BETWEEN $startDate AND $endDate
),
query_history_enriched AS (
SELECT
query_id,
TIMEADD('milliseconds', queued_overload_time + compilation_time + queued_provisioning_time + queued_repair_time + list_external_files_time, start_time) AS act_start_time,
execution_time,
TIMEADD('milliseconds', execution_time, act_start_time) AS act_end_time,
warehouse_id,
warehouse_name,
cluster_number,
warehouse_type,
warehouse_size
FROM snowflake.account_usage.query_history
WHERE
warehouse_size IS NOT NULL AND warehouse_name IS NOT NULL
AND act_start_time::date BETWEEN $startDate AND $endDate
),
ranked_queries AS (
SELECT
query_id,
act_start_time,
act_end_time,
warehouse_id,
cluster_number,
warehouse_size,
warehouse_type,
MAX(act_end_time) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY act_start_time ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS latest_end_time_before_start,
LAG(warehouse_size, 1) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY act_end_time) AS previous_warehouse_size,
LAG(warehouse_type, 1) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY act_end_time) AS previous_warehouse_type
FROM query_history_enriched
),
-- It is checking the current name for the corresponding warehouse id, needed as warehouse can be renamed
current_warehouse_name AS (
SELECT
warehouse_id,
MAX_BY(warehouse_name, act_start_time) AS warehouse_name
FROM query_history_enriched
WHERE
warehouse_size IS NOT NULL
AND warehouse_name IS NOT NULL
GROUP BY warehouse_id
),
idle_time_info AS (
SELECT
ranked_queries.query_id,
act_start_time,
act_end_time,
ranked_queries.warehouse_id,
ranked_queries.cluster_number,
latest_end_time_before_start,
previous_warehouse_size,
previous_warehouse_type,
is_active,
valid_from,
previous_valid_from,
valid_to,
event_reason,
user_name,
role_name
FROM ranked_queries
JOIN warehouse_cluster_status_enriched ON warehouse_cluster_status_enriched.warehouse_id = ranked_queries.warehouse_id
AND warehouse_cluster_status_enriched.cluster_number = ranked_queries.cluster_number
AND (warehouse_cluster_status_enriched.valid_from BETWEEN ranked_queries.latest_end_time_before_start AND ranked_queries.act_start_time)
WHERE
latest_end_time_before_start < act_start_time
AND is_active = FALSE
AND valid_from::date BETWEEN $startDate AND $endDate
),
enriched_idle_time_info AS (
SELECT
idle_time_info.*,
warehouse_credits_map.credits_per_hour / 3600 AS credits_per_second,
current_warehouse_name.warehouse_name
FROM idle_time_info
INNER JOIN warehouse_credits_map ON idle_time_info.previous_warehouse_size = warehouse_credits_map.warehouse_size
AND idle_time_info.previous_warehouse_type = warehouse_credits_map.warehouse_type
INNER JOIN current_warehouse_name ON current_warehouse_name.warehouse_id = idle_time_info.warehouse_id
),
idle_time_data AS (
SELECT
valid_from::date AS DATE,
sum(credits_per_second * greatest(TIMEDIFF('seconds', greatest(previous_valid_from, latest_end_time_before_start), valid_from), 0)) AS idle_time_credits
FROM enriched_idle_time_info
GROUP BY 1
)
SELECT SUM(idle_time_credits) AS idle_credits_weekly, idle_credits_weekly * 365/(DATEDIFF('day', $startDate, $endDate)+1) AS idle_credits_annually FROM
-- startDate and endDate represents period for which idle time is being calculated
SET startDate = (SELECT current_date - 7);
SET endDate = (SELECT current_date);
WITH warehouse_credits_map AS (
SELECT * FROM (
VALUES
('X-Small', 'STANDARD', 1),
('Small', 'STANDARD', 2),
('Medium', 'STANDARD', 4),
('Large', 'STANDARD', 8),
('X-Large', 'STANDARD', 16),
('2X-Large', 'STANDARD', 32),
('3X-Large', 'STANDARD', 64),
('4X-Large', 'STANDARD', 128),
('5X-Large', 'STANDARD', 256),
('6X-Large', 'STANDARD', 512),
('Medium', 'SNOWPARK-OPTIMIZED', 6),
('Large', 'SNOWPARK-OPTIMIZED', 12),
('X-Large', 'SNOWPARK-OPTIMIZED', 24),
('2X-Large', 'SNOWPARK-OPTIMIZED', 48),
('3X-Large', 'SNOWPARK-OPTIMIZED', 96),
('4X-Large', 'SNOWPARK-OPTIMIZED', 192),
('5X-Large', 'SNOWPARK-OPTIMIZED', 384),
('6X-Large', 'SNOWPARK-OPTIMIZED', 768)
) AS t (warehouse_size, warehouse_type, credits_per_hour)
),
stop_threshold AS (
SELECT MAX(timestamp) AS timestamp
FROM snowflake.account_usage.warehouse_events_history
),
-- warehouse_cluster_status stores warehouse events timeline
warehouse_cluster_status_base AS (
SELECT
warehouse_id,
warehouse_name,
cluster_number,
timestamp AS valid_from,
LEAD(timestamp) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY timestamp ASC) AS valid_to,
event_name = 'RESUME_CLUSTER' AS is_active,
event_reason,
user_name,
role_name,
query_id
FROM snowflake.account_usage.warehouse_events_history
WHERE
event_name IN ('RESUME_CLUSTER', 'SUSPEND_CLUSTER')
AND event_state = 'COMPLETED'
AND valid_from::date BETWEEN $startDate AND $endDate
),
warehouse_cluster_status AS (
SELECT
warehouse_cluster_status_base.warehouse_id,
warehouse_cluster_status_base.warehouse_name,
warehouse_cluster_status_base.cluster_number,
warehouse_cluster_status_base.is_active,
warehouse_cluster_status_base.valid_from,
COALESCE(warehouse_cluster_status_base.valid_to, stop_threshold.timestamp) AS valid_to,
warehouse_cluster_status_base.event_reason,
warehouse_cluster_status_base.user_name,
warehouse_cluster_status_base.role_name,
warehouse_cluster_status_base.query_id
FROM warehouse_cluster_status_base
CROSS JOIN stop_threshold
),
warehouse_cluster_status_enriched AS (
SELECT *,
LAG(valid_from, 1) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY valid_from) AS previous_valid_from
FROM warehouse_cluster_status
WHERE valid_from::date BETWEEN $startDate AND $endDate
),
query_history_enriched AS (
SELECT
query_id,
TIMEADD('milliseconds', queued_overload_time + compilation_time + queued_provisioning_time + queued_repair_time + list_external_files_time, start_time) AS act_start_time,
execution_time,
TIMEADD('milliseconds', execution_time, act_start_time) AS act_end_time,
warehouse_id,
warehouse_name,
cluster_number,
warehouse_type,
warehouse_size
FROM snowflake.account_usage.query_history
WHERE
warehouse_size IS NOT NULL AND warehouse_name IS NOT NULL
AND act_start_time::date BETWEEN $startDate AND $endDate
),
ranked_queries AS (
SELECT
query_id,
act_start_time,
act_end_time,
warehouse_id,
cluster_number,
warehouse_size,
warehouse_type,
MAX(act_end_time) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY act_start_time ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS latest_end_time_before_start,
LAG(warehouse_size, 1) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY act_end_time) AS previous_warehouse_size,
LAG(warehouse_type, 1) OVER (PARTITION BY warehouse_id, cluster_number ORDER BY act_end_time) AS previous_warehouse_type
FROM query_history_enriched
),
-- It is checking the current name for the corresponding warehouse id, needed as warehouse can be renamed
current_warehouse_name AS (
SELECT
warehouse_id,
MAX_BY(warehouse_name, act_start_time) AS warehouse_name
FROM query_history_enriched
WHERE
warehouse_size IS NOT NULL
AND warehouse_name IS NOT NULL
GROUP BY warehouse_id
),
idle_time_info AS (
SELECT
ranked_queries.query_id,
act_start_time,
act_end_time,
ranked_queries.warehouse_id,
ranked_queries.cluster_number,
latest_end_time_before_start,
previous_warehouse_size,
previous_warehouse_type,
is_active,
valid_from,
previous_valid_from,
valid_to,
event_reason,
user_name,
role_name
FROM ranked_queries
JOIN warehouse_cluster_status_enriched ON warehouse_cluster_status_enriched.warehouse_id = ranked_queries.warehouse_id
AND warehouse_cluster_status_enriched.cluster_number = ranked_queries.cluster_number
AND (warehouse_cluster_status_enriched.valid_from BETWEEN ranked_queries.latest_end_time_before_start AND ranked_queries.act_start_time)
WHERE
latest_end_time_before_start < act_start_time
AND is_active = FALSE
AND valid_from::date BETWEEN $startDate AND $endDate
),
enriched_idle_time_info AS (
SELECT
idle_time_info.*,
warehouse_credits_map.credits_per_hour / 3600 AS credits_per_second,
current_warehouse_name.warehouse_name
FROM idle_time_info
INNER JOIN warehouse_credits_map ON idle_time_info.previous_warehouse_size = warehouse_credits_map.warehouse_size
AND idle_time_info.previous_warehouse_type = warehouse_credits_map.warehouse_type
INNER JOIN current_warehouse_name ON current_warehouse_name.warehouse_id = idle_time_info.warehouse_id
),
idle_time_data AS (
SELECT
valid_from::date AS DATE,
sum(credits_per_second * greatest(TIMEDIFF('seconds', greatest(previous_valid_from, latest_end_time_before_start), valid_from), 0)) AS idle_time_credits
FROM enriched_idle_time_info
GROUP BY 1
)
SELECT SUM(idle_time_credits) AS idle_credits_weekly, idle_credits_weekly * 365/(DATEDIFF('day', $startDate, $endDate)+1) AS idle_credits_annually FROM
Want to hear the latest on Snowflake optimization?
Sign up for our newsletter that we send out (roughly) every two weeks.
Want to hear the latest on Snowflake optimization?
Sign up for our newsletter that we send out (roughly) every two weeks.
Stay updated with strategies to optimize Snowflake.
Resources
Legal
Stay updated with strategies to optimize Snowflake.
Resources
Legal