How I Learned to Stop Worrying and Love NUL

After the merger of Woodgrove Bank & Finance with Contoso Bank, each Division had its own silo of information.  Each bank (and several of their prior acquisitions) had slightly different ways of doing things.  Often, the same information was represented differently (Client vs. ClientID vs. ClientNumber) and sometimes one bank would use pieces of data that the other one didn’t.  Month-end financial reporting for the newly combined behemoth was nearly impossible with all of the legacy systems.  “Let’s build a data warehouse from scratch,” they said.  “We’ll stage data from the different systems in different staging databases and standardize the data as it is fed into a central data warehouse.”

They gathered requirements and carefully considered different designs and kept meetings frequent but mercifully short.  Soon they had a data warehouse built on SQL Server 2014 AlwaysOn Availability Groups.  All of the servers were virtual machines in beefy new hosts that used 100% SSD storage.  It was an engineering marvel.  And then it went live.  The staging databases were ginormous and several of the daily feeds were over 150 GB.  The system was able to keep up with all of the data and all of the reporting demands, but the network share for database backups could not handle the sheer size of transaction log backups.  After adding a TB a day for several days, Systems was begging for some relief.

“The flat files from all of the daily feeds take up less space than the databases to begin with, not to mention compressing much, much better after data deduplication,” they said.  “Let’s just set the databases to Simple Recovery, take nightly full backups, and reimport the flat files if we ever need to restore.  Easy peasy lemon squeezy.”

While it still seems to me a valid requirement to have a highly available database that does not require point in time recovery, Microsoft disagrees.

Checklist: Requirements (Availability Databases)
https://msdn.microsoft.com/en-us/library/ff878487(v=sql.120).aspx#PrerequisitesForDbs

If you attempt to set an Availability database to Simple recovery, this is what you will see:

AOAG_Nul

Enter the Good-ish Really Bad Idea™.

Readers familiar with Unix may have used /dev/null.  Output ‘written’ to /dev/null simply ceases to exist as all of those ones and zeros pass into the ether, but the write operation is reported to have succeeded.  In DOS/Windows, this black hole is NUL (one ‘L’) and is the magic behind the Recycle Bin.  SQL Server can backup to this device, but it is almost always a Really Bad Idea™.  As often happens with screwy edge cases, a Really Bad Idea™ may be the last, best option at one’s disposal.

BACKUP LOG [ODS_Contoso] TO DISK = N'NUL'

SQL does what you tell it to do.  It does not care nor will it prompt you if you write your full backups to NUL.  A single transaction log backup written to NUL will break the log chain, so if you really do need point-in-time recovery, you’ll be out of luck until the next full backup.  And don’t forget, this merely discards inactive records in the log file; it doesn’t shrink the file.

This is such a bad idea that the previously documented way of doing this (BACKUP LOG WITH TRUNCATE_ONLY) was officially removed by Microsoft with the release of SQL 2008.  Now, Microsoft insists that if you don’t want to back up the transaction log that you put the database in Simple recovery mode.  Except that you can’t when the database is part of an Availability Group.

Advertisements

#alwayson-availability-groups, #sql-server