12 Steps to Reduce the Database Size

We produce and receive data literally every second of our lives, and databases grow relentlessly as a result. No matter how modern your database architecture is, hardware is finite and storage is expensive, so it is always a good idea to check on your database and look for ways to reduce its size.

The following tips are not new, but they still work on SQL Server 2017, and I’m fairly confident they also apply to newer SQL Server versions.

Here’s a step-by-step guide on how to reduce the size of a SQL Server database, along with some SQL code to help you along the way.

Step 1: Identify the cause of the large database size

Before you can start reducing the size of your database, it’s important to understand why it has grown so large. There are a number of factors that can contribute to a large database size, including:

  • Accumulation of old or outdated data
  • Inefficient or redundant indexes
  • Large transaction logs
  • Unused or redundant database objects
  • Lack of maintenance and regular cleanup

To identify the cause of the large database size, you can use the following SQL query to retrieve information about the size of each table in the database:

SELECT 
    t.NAME AS TableName,
    p.rows AS RowCounts,
    CAST(SUM(a.total_pages) * 8 / 1024.0 / 1024.0 AS DECIMAL(18,2)) AS TotalSizeGB
FROM 
    sys.tables t
INNER JOIN 
    sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN 
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN 
    sys.allocation_units a ON p.partition_id = a.container_id
WHERE 
    t.NAME NOT LIKE 'sys%' 
    AND t.is_ms_shipped = 0
    AND i.OBJECT_ID > 255
GROUP BY 
    t.Name, p.Rows
ORDER BY 
    TotalSizeGB DESC;

This query retrieves information from the system tables in SQL Server to calculate the size of each table in the database, and groups the results by table name and row count. The size is displayed in gigabytes (GB) using the CAST function to convert the result to a decimal with two decimal places.

Use the results of this query to identify which tables are taking up the most space in the database. Once you have identified the largest tables, you can start optimizing them to reduce the overall size of the database.

Step 2: Optimize table size

The following techniques can help you optimize the size of individual tables in your database:

1. Remove old or outdated data

One of the most common causes of a large database size is the accumulation of old or outdated data. You can use the following SQL query to delete old data from a table:

DELETE FROM TableName
WHERE DateColumn < 'YYYY-MM-DD';

Replace TableName with the name of the table you want to delete data from, and DateColumn with the name of the column that contains the date information for each row. Replace YYYY-MM-DD with the date before which you want to delete data.

Be careful when deleting data, as it is permanent and cannot be undone. Make sure to take a backup of your database before deleting data, and test your queries on a non-production system before running them on your live database.
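
For large tables, deleting millions of rows in one statement can bloat the transaction log and hold locks for a long time. Here is a minimal sketch of a batched delete, using the same hypothetical TableName and DateColumn placeholders:

-- Delete old rows in small batches to keep each transaction (and the log) small
DECLARE @BatchSize INT = 10000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM TableName
    WHERE DateColumn < 'YYYY-MM-DD';

    IF @@ROWCOUNT < @BatchSize
        BREAK;
END;

Each iteration commits separately, so under the SIMPLE recovery model the log space can be reused between batches; under FULL, schedule log backups alongside the loop.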

2. Reorganize or rebuild indexes

Inefficient or redundant indexes can contribute to a large database size and slow performance. You can use the following SQL query to identify unused or redundant indexes in your database:

SELECT 
    OBJECT_NAME(s.object_id) AS TableName,
    i.name AS IndexName,
    s.user_seeks,
    s.user_scans,
    s.user_lookups,
    s.user_updates
FROM 
    sys.dm_db_index_usage_stats s
INNER JOIN 
    sys.indexes i ON i.object_id = s.object_id AND i.index_id = s.index_id
WHERE 
    OBJECTPROPERTY(s.object_id,'IsUserTable') = 1 
    AND s.database_id = DB_ID()
    AND (s.user_seeks = 0 AND s.user_scans = 0 AND s.user_lookups = 0)
ORDER BY 
    TableName, IndexName;

This query retrieves information about index usage from the system tables in SQL Server, and identifies indexes that have not been used for any seeks, scans, or lookups. Unused indexes can be safely removed using the DROP INDEX statement.
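
Keep in mind that sys.dm_db_index_usage_stats is reset when the instance restarts, so confirm an index has stayed unused over a full business cycle before dropping it. Dropping an unused nonclustered index then looks like this (hypothetical names):

-- Drop an index that shows no seeks, scans, or lookups
DROP INDEX IndexName ON TableName;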

Alternatively, you can use the ALTER INDEX statement to reorganize or rebuild indexes. Reorganizing an index is a lightweight online operation that defragments the leaf level of the index, while rebuilding an index drops and recreates it, removing fragmentation entirely and updating its statistics. Here is the basic syntax for both:

-- Reorganize an index
ALTER INDEX IndexName ON TableName REORGANIZE;

-- Rebuild an index
ALTER INDEX IndexName ON TableName REBUILD;

3. Use compression

SQL Server offers several compression options that can help reduce the size of your database. You can use the following SQL query to check the compression level of each table in your database:

SELECT 
    t.name AS TableName,
    p.rows AS RowCounts,
    CAST(SUM(a.total_pages) * 8 / 1024.0 / 1024.0 AS DECIMAL(18,2)) AS TotalSizeGB,
    CASE WHEN p.data_compression = 0 THEN 'NONE'
         WHEN p.data_compression = 1 THEN 'ROW'
         WHEN p.data_compression = 2 THEN 'PAGE'
         ELSE CAST(p.data_compression AS VARCHAR(20))
    END AS CompressionType
FROM 
    sys.tables t
INNER JOIN 
    sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN 
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN 
    sys.allocation_units a ON p.partition_id = a.container_id
WHERE 
    t.NAME NOT LIKE 'sys%' 
    AND t.is_ms_shipped = 0
    AND i.OBJECT_ID > 255
GROUP BY 
    t.Name, p.Rows, p.data_compression
ORDER BY 
    TotalSizeGB DESC;

This query retrieves information about the size and compression level of each table in your database. If a table is not compressed, you can use the ALTER TABLE statement to compress it:

-- Compress a table with PAGE compression
ALTER TABLE TableName REBUILD WITH (DATA_COMPRESSION = PAGE);

Compression can help reduce the size of your database, but it can also increase CPU usage and slow down queries. Test the impact of compression on your system before using it in production.
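
Before compressing a large table, you can estimate how much space you would actually save with the built-in sp_estimate_data_compression_savings procedure; the schema and table names below are placeholders:

-- Estimate the effect of PAGE compression before applying it
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'TableName',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';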

Step 3: Shrink the database (optional)

Shrinking a database should only be done as a last resort, and only if Step 2 did not significantly reduce the size of your database. Shrinking is a risky operation: it causes heavy index fragmentation, it can block other activity while it runs, and the freed space is often consumed again as the database grows back. It is still important to back up your database before attempting to shrink it.

To shrink a database, you can use the following SQL query:

DBCC SHRINKDATABASE (DatabaseName);

This command will shrink the database to its smallest possible size by releasing unused space in the data and log files. You can also pass a second argument specifying the percentage of free space to leave in the files after shrinking, like this:

DBCC SHRINKDATABASE (DatabaseName, TargetPercent);

However, shrinking a database causes index fragmentation and slows down performance. To remove that fragmentation afterwards, use the ALTER INDEX command with the REORGANIZE option, which reorganizes the leaf-level pages of an index, or with the REBUILD option, which recreates the index from scratch (the older DBCC INDEXDEFRAG command still works but is deprecated).

Before shrinking a database, consider the following:

  • Shrink data files only if you have deleted a large amount of data or if you have migrated data to another database. Shrinking data files regularly can lead to file fragmentation and slow performance.
  • Shrink log files only after backing up the transaction log and only if the log has grown far beyond its normal working size. Shrinking log files regularly just forces them to grow again, which hurts performance.
  • Do not shrink a database if it is near its maximum size or if it is used heavily. Shrinking a database can take a long time and can impact database performance.
  • Monitor your database after shrinking it to ensure that it is performing as expected.
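
If you do decide to reclaim space, shrinking a single file with DBCC SHRINKFILE is usually more controlled than shrinking the whole database. A minimal sketch, assuming a hypothetical logical file name and a target size of 1024 MB:

-- Find the logical file names and current sizes of the current database
SELECT name, type_desc, size * 8 / 1024 AS SizeMB
FROM sys.database_files;

-- Shrink one data file down to roughly 1024 MB
DBCC SHRINKFILE (DataFileLogicalName, 1024);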

In conclusion, reducing the size of a SQL Server database can help improve performance and reduce storage costs. To reduce database size, you can optimize table size by removing old or outdated data, reorganizing or rebuilding indexes, and using compression. If these methods do not significantly reduce the size of your database, you can consider shrinking it, but be cautious and back up your database before attempting to do so.

In addition to the steps outlined above, there are a few more strategies that can be used to reduce the size of a SQL Server database. Here are some additional steps to consider:

Step 4: Monitor and Optimize the Transaction Log

The transaction log is an important component of a SQL Server database, as it records all changes made to the database. However, if the transaction log is not managed properly, it can grow very large and consume a significant amount of disk space.

To optimize the transaction log, you can perform regular backups and set the database recovery model to simple or bulk-logged. You can also use the DBCC SQLPERF(LOGSPACE) command to monitor the size of the transaction log and identify any issues.
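
Here is a minimal sketch of both checks, assuming a hypothetical DatabaseName; switch to SIMPLE only if you do not need point-in-time restore:

-- Show log size and the percentage of log space in use for every database
DBCC SQLPERF(LOGSPACE);

-- Switch to the SIMPLE recovery model (no log backups, no point-in-time restore)
ALTER DATABASE DatabaseName SET RECOVERY SIMPLE;

-- In the FULL recovery model, regular log backups keep the log from growing
BACKUP LOG DatabaseName TO DISK = 'D:\Backups\DatabaseName_log.trn';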

Step 5: Monitor Disk Usage

It’s important to monitor disk usage to ensure that your database is not using more storage than necessary. You can use the sp_spaceused system stored procedure to check the amount of disk space used by a table or by the whole database.
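
For example (the table name is a placeholder):

-- Space used by the whole database
EXEC sp_spaceused;

-- Space used by a single table
EXEC sp_spaceused 'dbo.TableName';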

Additionally, you can use tools such as SQL Server Management Studio, Performance Monitor, or third-party solutions to monitor disk usage and identify any potential issues.

Step 6: Consider Database Partitioning

Database partitioning involves dividing a large table or index into smaller, more manageable parts. Partitioning can help to improve query performance and reduce storage requirements.

To partition a table or index, you can use the CREATE PARTITION FUNCTION and CREATE PARTITION SCHEME commands. Partitioning can be complex and may require significant changes to your database schema and applications, so it should be carefully planned and tested before implementation.
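
A rough sketch of a table partitioned by year on a date column; all object and column names here are hypothetical:

-- Partition function: one partition per year
CREATE PARTITION FUNCTION pfOrderDate (date)
AS RANGE RIGHT FOR VALUES ('2021-01-01', '2022-01-01', '2023-01-01');

-- Partition scheme: map every partition to the PRIMARY filegroup
CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY]);

-- The table is created on the scheme and split by OrderDate
CREATE TABLE dbo.Orders
(
    OrderID   int           NOT NULL,
    OrderDate date          NOT NULL,
    Amount    decimal(18,2) NULL
) ON psOrderDate (OrderDate);

Old partitions can then be switched out or truncated individually, which is far cheaper than deleting rows one by one.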

Step 7: Consider Using Columnstore Indexes

If you have large tables with many columns, consider using columnstore indexes to reduce the size of your database and improve query performance. Columnstore indexes store data in a column-wise format, which can reduce the amount of disk space required to store data and improve query performance.

To create a columnstore index, you can use the CREATE CLUSTERED COLUMNSTORE INDEX command. Columnstore indexes can be used for both analytic and transactional workloads and can significantly reduce query execution time for large tables.
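
A minimal sketch, assuming a hypothetical large fact table stored as a heap:

-- Replace the table's rowstore storage with a clustered columnstore index
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales
ON dbo.FactSales;

If the table already has a clustered rowstore index, drop it first or rebuild it as a columnstore index with the DROP_EXISTING = ON option.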

Step 8: Optimize Database Maintenance

Regular database maintenance can help to reduce the size of your database and improve performance. This includes tasks such as backing up the database, performing regular maintenance tasks (such as updating statistics and rebuilding indexes), and monitoring database performance.

To optimize database maintenance, consider using SQL Server Agent to schedule maintenance tasks and automate routine tasks such as backups and index maintenance.
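
A minimal sketch of the statements such a scheduled job might run (the table name is a placeholder):

-- Refresh out-of-date statistics for all tables in the current database
EXEC sp_updatestats;

-- Rebuild all indexes on a specific table
ALTER INDEX ALL ON dbo.TableName REBUILD;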

Step 9: Consider Using a Cloud Solution

If you are running out of disk space or experiencing performance issues with your on-premises database, consider using a cloud-based solution such as Azure SQL Database or Amazon RDS. Cloud solutions can provide elastic storage, automatic backup and recovery, and can scale to meet changing demand.

To migrate to a cloud-based solution, you can use the Database Migration Service or other migration tools provided by your cloud provider.

Step 10: Implement Row-Level Compression

Row-level compression is a feature in SQL Server that can help to reduce the size of your database by compressing data at the row level. Row-level compression can be used for both tables and indexes and can significantly reduce the amount of disk space required to store data.

To implement row compression, use the ALTER TABLE or ALTER INDEX command with the REBUILD option and DATA_COMPRESSION = ROW (PAGE compression builds on top of row compression and compresses more aggressively).
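
A minimal sketch, using hypothetical table and index names:

-- Apply ROW compression to a table
ALTER TABLE dbo.TableName REBUILD WITH (DATA_COMPRESSION = ROW);

-- Apply ROW compression to a single index
ALTER INDEX IX_TableName_Column ON dbo.TableName REBUILD WITH (DATA_COMPRESSION = ROW);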

Step 11: Consider Using In-Memory OLTP

In-Memory OLTP is a feature in SQL Server that allows you to store and process data entirely in memory. It can significantly improve transaction throughput and query performance, especially for OLTP workloads, although durable memory-optimized tables still require disk space for their checkpoint files.

To use In-Memory OLTP, you can create memory-optimized tables and indexes and migrate data from disk-based tables to memory-optimized tables. In-Memory OLTP requires careful planning and testing, as it may require significant changes to your database schema and applications.
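
A rough sketch, assuming hypothetical database, path, and table names; the database first needs a filegroup for memory-optimized data:

-- Add a memory-optimized filegroup and a container for it
ALTER DATABASE DatabaseName
ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;

ALTER DATABASE DatabaseName
ADD FILE (NAME = 'imoltp_data', FILENAME = 'C:\Data\imoltp_data')
TO FILEGROUP imoltp_fg;

-- A durable memory-optimized table with a hash index on the primary key
CREATE TABLE dbo.SessionCache
(
    SessionID int NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    Payload   nvarchar(4000) NULL
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);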

Step 12: Consider Using External Tables

External tables are a SQL Server feature (provided through PolyBase) that lets you query data stored outside of your database, in external files or another data source. They are a good fit for large or infrequently accessed data, which keeps that data out of your database files and reduces the database size.

To use external tables, you can create an external table definition using the CREATE EXTERNAL TABLE command and specify the location and format of the external data source.
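
A rough sketch, assuming PolyBase is installed, the archived files live in an Azure Blob Storage container, and a database-scoped credential named ArchiveCredential already exists; all names are hypothetical and the exact options vary by SQL Server version:

-- External data source pointing at blob storage
CREATE EXTERNAL DATA SOURCE ArchiveStorage
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://archive@mystorageaccount.blob.core.windows.net',
      CREDENTIAL = ArchiveCredential);

-- File format describing the archived CSV files
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ','));

-- The external table: its data stays in blob storage, not in the database files
CREATE EXTERNAL TABLE dbo.OrdersArchive
(
    OrderID   int,
    OrderDate date,
    Amount    decimal(18,2)
)
WITH (LOCATION = '/orders/',
      DATA_SOURCE = ArchiveStorage,
      FILE_FORMAT = CsvFormat);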

Conclusion

Reducing the size of a SQL Server database requires a comprehensive approach that includes optimizing table size, reorganizing or rebuilding indexes, using compression, archiving data, optimizing the transaction log, monitoring disk usage, considering database partitioning, using columnstore indexes, optimizing database maintenance, considering a cloud-based solution, implementing row-level compression, using In-Memory OLTP, and using external tables. By following these steps, you can improve database performance, reduce storage costs, and optimize database management. Remember to always back up your database before making any changes and to monitor your database after implementing any optimizations.

