
Monday, 7 February 2022

SQL Server – Backing up the Tail of the Log

When a database becomes corrupt or a failure occurs, it is recommended to create a tail-log backup of the log records that haven't been backed up yet before restoring the database from your existing backups. This helps restore the database to the exact point at which it failed, preventing data loss.

Read on to learn about the situations in which you need to back up the tail of the transaction log. Also, understand how to take a tail-log backup and restore it to recover the data you risk losing in a crisis.

Why and When Should You Back Up The Tail of the Log?

A tail-log backup captures the log records that haven't yet been backed up (the tail of the log), even when the database is offline, damaged, or its data files are missing.

Reasons Why You Need To Back Up the Tail of the Transaction Log

  • The database is corrupted, or a data file is corrupted or deleted.
  • The database is offline and doesn't start. You may want to recover the database as quickly as possible, but before you begin recovery, first take a tail-log backup.
  • The database is online and you plan to restore it; start by backing up the tail of the log.
  • You are migrating the database from one server to another.

Example Demonstrating the Need to Take Tail-log Backup

Let's say you run DBCC CHECKDB to check for corruption in the database. It returns consistency errors, and you decide to restore your previously taken backups: first the full backup, then the differential and all the transaction log backups. But you don't want to lose the log records that haven't been captured in any transaction log backup. To avoid losing those records (i.e., the tail of the log) and keep the log chain intact, you need to take a tail-log backup.

Let’s consider a scenario.

Assume you take a full database backup at 8:00 AM and a transaction log backup every hour after that:

Time       Event
--------   -----------------------------
8:00 AM    Create a full database backup
9:00 AM    Take a transaction log backup
10:00 AM   Take a transaction log backup
11:00 AM   Take a transaction log backup
11:30 AM   Failure occurs


You can restore the database starting from the full backup (taken at 8 AM) and then restore the three transaction log backups (taken at 9 AM, 10 AM, and 11 AM). But there are no backups covering 11:00 AM to 11:30 AM, so the transactions from that window would be lost.

So, how do you recover the database without losing the data from 11:00 to 11:30 AM?

Take a tail-log backup by executing the BACKUP LOG command with the NO_TRUNCATE option. This creates a tail-log backup file. Restore this file after the last transaction log backup (taken at 11 AM) and then recover the database to bring back the lost data.

USE master;
GO
BACKUP LOG [Database]
TO DISK = 'C:\ProgramFiles\MSSQLServer\Data\Tail_Log1.LOG'
WITH NO_TRUNCATE;
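With the tail-log backup in hand, the full restore sequence for this scenario would look like the following sketch (the backup file names and paths are illustrative):

```sql
-- Restore the 8 AM full backup, leaving the database in the restoring state
RESTORE DATABASE [Database]
FROM DISK = 'C:\Backups\Full_8AM.bak'
WITH NORECOVERY;

-- Restore each hourly log backup in sequence (9 AM, 10 AM, 11 AM)
RESTORE LOG [Database]
FROM DISK = 'C:\Backups\Log_11AM.trn'
WITH NORECOVERY;

-- Finally, restore the tail-log backup and bring the database online
RESTORE LOG [Database]
FROM DISK = 'C:\ProgramFiles\MSSQLServer\Data\Tail_Log1.LOG'
WITH RECOVERY;
```

Only the last restore uses WITH RECOVERY, which rolls back any uncommitted transactions and brings the database online.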

How to Back up and Restore Tail of the Log?

Before we discuss the process to back up the tail of the transaction log and restore it, it’s important to know the clauses you need for creating a t-log backup.

  • NORECOVERY: Using this clause leaves the database in the restoring state. This ensures that the database won't change after the tail-log backup.
  • NO_TRUNCATE: Use this clause only when the database is damaged.

  • CONTINUE_AFTER_ERROR: If the database is damaged and a normal t-log backup fails, back up the tail of the log using CONTINUE_AFTER_ERROR.
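For instance, a tail-log backup taken on an online database just before a planned restore would use the NORECOVERY clause (the database name and path below are illustrative):

```sql
-- Back up the tail of the log and leave the database in the restoring state,
-- so no new transactions can slip in before the restore begins
BACKUP LOG [Database]
TO DISK = 'C:\Backups\Database_Tail.trn'
WITH NORECOVERY;
```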

Demo

  • Create a new database

CREATE DATABASE Tail_LogDB;
GO
USE Tail_LogDB;
GO


  • Create a new table and insert some data into it.

CREATE TABLE Employee (
    EmployeeID int IDENTITY(1000,1) PRIMARY KEY NOT NULL,
    EmployeeAge int
);
GO

This T-SQL query will create a table named Employee with columns ‘EmployeeID’ and ‘EmployeeAge’. 


  • Create a stored procedure to add more records to the table.

CREATE PROCEDURE InsertEmployee
AS
DECLARE @i int = 100
WHILE @i > 0
BEGIN
    INSERT Employee (EmployeeAge) VALUES (@i)
    SET @i -= 1
END
GO

EXECUTE InsertEmployee;
GO

SELECT * FROM Employee;
GO

Executing this T-SQL query creates the ‘InsertEmployee’ stored procedure, which loops to insert 100 records into the Employee table. Then we select from the Employee table to verify that everything works.


  • Create a full backup of the Tail_LogDB

BACKUP DATABASE Tail_LogDB
TO DISK = 'C:\TempDB\Tail_LogDB_FULL.bak';

This command creates a full database backup containing the 100 records we added to the table in the previous step. The backup is saved in the ‘TempDB’ folder we created.

  • Insert some more records into the table

EXECUTE InsertEmployee;
GO
SELECT * FROM Employee;
GO

After executing this T-SQL query, we will have 200 records in the database table. 



  • Simulate a database failure

If you're keeping your data and log files on different physical drives, then it's entirely possible that drive failure takes out the data file and leaves you only with the transaction log. We can simulate this simply by deleting the mdf file from the hard drive. Here's how:
  • Right-click on Tail_LogDB > Tasks > Take Offline.
  • Select the ‘Drop all active connections’ checkbox and press OK.
  • Now refresh the database, and you can see that the database is OFFLINE.
  • Next, go to the folder where the backup is stored (i.e., the TempDB folder), and you can see the backup file we just created.
  • Now go to the location where the .mdf and .ldf files for the Tail_LogDB database are saved. Delete the .mdf file.


Now let's head back to SSMS and understand how we can recover from this disaster.

Bring Database Back Online

  • Right-click on Tail_LogDB > Tasks > Bring Online.
  • A dialog box appears with errors; click Close.
  • Refresh the database.

As you can see, the database status has changed to Recovery Pending. Before attempting the restore operation, be sure to back up the tail of the log to capture the second batch of 100 records we added to the database.
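You can also confirm the database state from T-SQL by querying the sys.databases catalog view:

```sql
-- state_desc will show RECOVERY_PENDING for Tail_LogDB at this point
SELECT name, state_desc
FROM sys.databases
WHERE name = 'Tail_LogDB';
```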

Now, let’s take the tail of the log. 

Switch to the master database and execute the BACKUP LOG statement with the CONTINUE_AFTER_ERROR option. This option ensures the tail-log backup is performed even if an error occurs.

USE master;
GO
BACKUP LOG Tail_LogDB
TO DISK = 'C:\TempDB\Tail_LogDB.log'
WITH CONTINUE_AFTER_ERROR;
GO

Restore the t-log backup

Let's initiate the restore process by restoring the full database backup with the NORECOVERY option. This option specifies that the restore should not attempt to undo or roll back any uncommitted transactions. This is important because if a modification to the data had begun but not finished when the failure occurred, there would be a record of it in the transaction log. By default, SQL Server rolls back any such partially completed changes at the end of a restore, and we don't want that to happen yet, because more backups remain to be restored.

USE master;
GO
RESTORE DATABASE Tail_LogDB
FROM DISK = 'C:\TempDB\Tail_LogDB_FULL.bak'
WITH NORECOVERY;
GO

This restores the backup of the first 100 records. 


To complete restoring the entire record set, let's restore the log file as well.

RESTORE LOG Tail_LogDB
FROM DISK = 'C:\TempDB\Tail_LogDB.log';
GO


  • Verify the results

USE Tail_LogDB;
GO
SELECT * FROM Employee;
GO


So, as you can see, all 200 records are now restored.

Conclusion: Key Take-Away Points

  • A tail-log backup helps avoid losing data when a database is damaged or corrupted. However, backing up the tail of a damaged database's log may fail. In that case, execute the BACKUP LOG statement with the CONTINUE_AFTER_ERROR option to take the tail-log backup.
  • You must also take a tail-log backup before restoring a database that is ONLINE. If the database is OFFLINE and doesn't start, back up the tail of the transaction log WITH NORECOVERY before performing the restore.
  • It is also recommended to take a tail-log backup when migrating a large database from one server to another.
  • Remember, you can take tail-log backups only if the transaction log file is accessible. In other words, you cannot perform a tail-log backup on a database whose log file is corrupt or inaccessible.

Wednesday, 24 November 2021

3-2-1 Backup Rule for Data Protection in SQL – What Is It & Is It Relevant?

In the event of a disaster leading to database unavailability, restoring backups is the first step to ensure business continuity. However, situations may arise in which attempts to restore the database fail. For instance, the tape drive used for storing backups may get damaged, rendering the backup data corrupt and unusable. In fact, tape media failure is one of the most common reasons restores fail. To prevent this and ensure the database can be restored with uncorrupted data, SQL users implement the 3-2-1 backup rule for data protection. This article discusses this data protection rule and whether it is still relevant.

The 3-2-1 Backup Rule for Data Protection – What Is It?

The 3-2-1 rule of data protection ensures that a database can be restored with uncorrupted data. The idea behind the rule is to have:

3 copies of your data: You must have at least three copies: your production data plus two backups. The more copies you have, the lower the risk of losing data.

2 copies on different media: Store the two backup copies on different media types. This is important because any backup medium can fail. When you split your backups across different media and one device fails, you'll have another to fall back on.

1 copy offsite: Store one of the two backups offsite. Doing so ensures that if anything happens to one backup copy, it (hopefully) won't affect the other.
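As a minimal sketch (the database name and paths below are hypothetical), the two on-site backup copies of the 3-2-1 rule can be created by backing up to two different drives; the offsite copy is then typically made by shipping one of the files elsewhere:

```sql
-- Copy 1: the production data lives in the database itself.
-- Copy 2: a full backup on a local drive.
BACKUP DATABASE SalesDB TO DISK = 'D:\Backups\SalesDB_Full.bak';
-- Copy 3: a second full backup on a different drive/media.
BACKUP DATABASE SalesDB TO DISK = 'E:\Backups\SalesDB_Full.bak';
-- The offsite copy is usually created by copying one of these files
-- to tape or cloud storage (e.g., BACKUP ... TO URL on supported versions).
```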



Is the 3-2-1 Backup Rule of Data Protection Relevant for SQL Users?

The 3-2-1 backup rule is a good starting point for devising any disaster recovery plan, particularly for SQL users who aren't backing up at all. But the rule has certain shortcomings.

Data Can Be Compromised

Maintaining three copies of data is fine, as more copies make recovery possible after a disaster. But keeping two copies on different media types has limitations. Having two copies on two storage media or devices should mean quicker access to a backup if the primary fails; however, this is not always the case.

What happens if ransomware infects your secondary storage while the primary is already down? You may lose all the data unless you pay a ransom. And with many organizations replacing tape backup with cloud storage, an ever-increasing number of databases are becoming vulnerable to ransomware attacks.

According to a report by Imperva, “46% of all on-premises databases are vulnerable to attack.” Imperva predicts that data breaches will continue to grow, as nearly one in two on-premises databases is vulnerable to attacks. You therefore need more comprehensive, stronger data protection strategies than ever before.

Faulty Interpretation of the 3-2-1 Backup Rule

Backing up to tape drives is more expensive than backing up to the cloud, and as the demand for storage space grows, so does the storage cost. Tape is still used, but due to its slow recovery times and high cost, users are moving data to offsite locations such as the cloud. That’s where the problem starts.

Because cloud-based services do not necessarily store backups at a different storage facility, points “2” and “1” of the backup rule end up being ignored. In other words, moving data offsite to the cloud can fulfil the purpose of point “2” – the cloud can hold a backup copy used for recovery if the first copy is affected. But this way you’ll have only a single backup copy, which doesn’t offer the protection you need from ransomware or other cyber threats.

Air Gap Protection is Lost

Though tape-based storage can slow down your recovery due to bandwidth constraints, it provides air gap to prevent ransomware from affecting your backup copies. However, air gap protection is missing in the 3-2-1 rule.

Air gap is basically a way of protecting a backup copy by storing it on a network that is physically separate from the primary data.

It was easy to provide an air gap when using tape backups. You can place tape backups in a box and transport them to off-site locations, creating an air gap between your backup and primary data copy. This makes it harder for hackers to attack a database, as they cannot attack both primary and backup storage devices.

How to Overcome 3-2-1 Backup Rule Shortcomings?

Backup strategies like 3-2-1-1-0 or 4-3-2 offer additional protection against ransomware attacks. Let’s briefly discuss these two strategies:

3-2-1-1-0 Backup Rule

Like the ‘3-2-1’ backup rule, the 3-2-1-1-0 data protection strategy requires maintaining at least three copies of data, storing them on at least two different storage media, and keeping one backup copy offsite. In addition, it requires two more steps:

  • Keep one backup copy offline or air-gapped – for example, tape backups stored off-site, or cloud backups stored with immutability so the data cannot be modified or changed.
  • Maintain zero errors in backups by monitoring them and correcting any errors found.

Essentially, the 3-2-1-1-0 backup rule ensures that you have an error-free offline backup copy you can use to recover data in case of a system or cloud failure.

4-3-2 Backup Rule

Developed by IT security partners Continuity Centers, the 4-3-2 rule states that four copies of data are stored in three different locations. Two of these locations are offsite: the third copy is stored in the cloud, and the fourth backup goes to another cloud storage service.

The 4-3-2 backup strategy ensures that duplicate backup copies are created and stored at geographically distant locations to avoid data loss in the event of a natural disaster.

Concluding Thoughts

Preventing data loss in the event of a disaster is crucial for business continuity. So you must take backups regularly and review them to ensure they can actually restore your SQL databases. The 3-2-1 backup rule is a good starting point for data security, but you need a more extensive backup strategy to protect your data against the growing number of digital threats. Upgrading the 3-2-1 rule to a 3-2-1-1-0 or 4-3-2 backup strategy provides an additional layer of security and helps you recover ransomware-affected databases.

Thursday, 23 September 2021

How to Design Backups in SQL Server?

The Recovery Point Objective (RPO) and Recovery Time Objective (RTO) are basic concepts describing how much data can be recovered and how long recovery will take. In a recovery design plan, it is essential to define the time required to restore the data and how much data we can afford to lose in the recovery.

The RPO defines the objective related to the recovery point. In some cases we can afford to lose one hour of data; in others, an entire day. Therefore, defining the RPO at the beginning is critical to designing the plan. The same applies to the RTO.

In this article, we will talk about both RPO and RTO, and also discuss how to design the right backup and restore strategy.

Recovery Point Objective (RPO)

Recovery Point Objective (RPO) is a measure of how often backups must be taken. If a disaster happens between backups, can you afford to lose ten minutes of data updates? Ten hours? An entire day? RPO describes how fresh the recovered data will be: it indicates the amount of data (updated or created) that will be lost or need to be re-entered after an outage.

Recovery Time Objective (RTO)

Recovery Time Objective (RTO) isn't just the length of time between the disaster and recovery. It also covers the steps the IT team must take to restore the application and its data. If the IT team has invested in failover services for high-priority applications, it can safely express the RTO in seconds.

How to Design the Right Backup and Restore Strategy to Meet Business Goals?

To reduce the RTO, consider combining transaction log backups and differential backups in your strategy. If you only take full backups, the time to restore the data may be too high; using differential backups can reduce the recovery time.
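For example, with a weekly full backup, a daily differential, and hourly log backups, a restore only needs the last full backup, the latest differential, and the log backups taken after it (the database and file names below are hypothetical):

```sql
-- Last full backup, then the latest differential, then the logs taken after it
RESTORE DATABASE SalesDB FROM DISK = 'D:\Backups\SalesDB_Full.bak' WITH NORECOVERY;
RESTORE DATABASE SalesDB FROM DISK = 'D:\Backups\SalesDB_Diff.bak' WITH NORECOVERY;
RESTORE LOG SalesDB FROM DISK = 'D:\Backups\SalesDB_Log1.trn' WITH NORECOVERY;
RESTORE LOG SalesDB FROM DISK = 'D:\Backups\SalesDB_Log2.trn' WITH RECOVERY;
```

The differential replaces all the log backups taken before it, which is what shortens the recovery time.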

Here are some tips to consider when doing a backup:

  • Make sure to store the backup on a different server. This prevents losing the backups if the database server is damaged.
  • Use fast hard disks. Backup and restore operations are faster on hardware with good performance, which helps you meet the RTO.
  • Use differential backups to reduce the number of transaction log files needed for a restore. Differential backups let you restore the data faster and meet the RTO.
  • Estimate how often the data changes. If your data does not change much during the day, one differential backup per day plus a weekly full backup may be enough. On the other hand, if your database handles several transactions per minute, you will need to combine full, differential, and transaction log backups to recover all the data within the defined RPO.
  • Always test your backups and make sure the RPO and RTO are met in your tests, and that the integrity of the data is intact. It is recommended to have both a production environment and a testing environment in which to verify the backups.
  • Automate the backup and restore strategy using jobs. The backups should run on a schedule that matches the designed RTO and RPO.
  • If your database is big, the MAXTRANSFERSIZE and BUFFERCOUNT arguments may help. They let you increase the transfer size and the number of I/O buffers, so you can back up and restore the database faster.
  • Use backup compression to increase the speed of the backup.
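Putting the last two tips together, a tuned backup command might look like this (the values shown are examples; test them against your own hardware):

```sql
BACKUP DATABASE SalesDB
TO DISK = 'D:\Backups\SalesDB_Full.bak'
WITH COMPRESSION,
     MAXTRANSFERSIZE = 4194304,  -- 4 MB per transfer unit (the maximum allowed)
     BUFFERCOUNT = 50;           -- number of I/O buffers used by the backup
```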

Use Stellar Repair for MS SQL for Faster Recovery

There is another way to recover information quickly and securely. Stellar Repair for MS SQL is a well-known third-party software tool that helps recover data from a corrupt or damaged database. It can also repair damaged databases.

Wednesday, 7 July 2021

How to Fix SQL Database Error 8961?

Running DBCC CHECKDB or DBCC CHECKTABLE on a database may report error 8961. This situation can occur after changing the data type of a table column from ntext to nvarchar(max) and then updating the table with over 4000 records.

Essentially, updating the database table (in SQL Server 2012, 2014, or 2016) with 4000+ records leads to corruption. As a result, you get the following error message:

Msg 8961, Level 16, State 1, LineNumber

Table error: Object ID, index ID, partition ID, alloc unit ID (type LOB data). The off-row data node at page (PageID), slot 0, text ID does not match its reference from page (PageID), slot 0.

The complete error message looks similar to:


What Causes SQL Error 8961?

This is a corruption bug in the SQL Server engine. The corruption occurs within the Large Object (LOB) column.

How to Fix SQL Error 8961?

Microsoft has released the following cumulative updates to fix the database corruption error 8961:

Cumulative Update 5 for SQL Server 2016 RTM

Cumulative Update 2 for SQL Server 2016 SP1

Cumulative Update 4 for SQL Server 2014 SP2

Cumulative Update 11 for SQL Server 2014 SP1

Cumulative Update 7 for SQL Server 2012 Service Pack 3

Apply the cumulative update based on the SQL Server version you are using.

What If the Error Persists?

If applying the cumulative updates doesn't resolve the issue, try the following workarounds:

Note: Before trying the workarounds below, investigate the hardware (e.g., the drivers and I/O subsystem) to check whether the corruption is due to a hardware problem. If the hardware is faulty, contact your vendor or hardware manufacturer for further assistance.

  • Set the “large value types out of row” Option to 1

After changing the data type, also change the “large value types out of row” option to 1 by executing the query:

ALTER TABLE tbl_Name ALTER COLUMN COLUMN_NAME nvarchar(max) NOT NULL
GO
EXEC sp_tableoption 'tbl_Name', 'large value types out of row', '1'

  • Restore Records from Backup

If the above workaround fails to fix the error, try restoring the database from a known good backup. If you don't have a valid backup, skip to the next method.

  • Run DBCC CHECKDB with Repair Option

As a last resort, you may try running the DBCC CHECKDB command using the minimum repair level, i.e., “REPAIR_ALLOW_DATA_LOSS”.

DBCC CHECKDB ('db_name', REPAIR_ALLOW_DATA_LOSS)

But beware: this repair option may delete rows from the table, resulting in data loss.
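Note that running DBCC CHECKDB with a repair option requires the database to be in single-user mode, so the full sequence looks like this:

```sql
-- Switch to single-user mode, disconnecting other sessions
ALTER DATABASE db_name SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
-- Run the repair at the minimum level that fixes the reported errors
DBCC CHECKDB ('db_name', REPAIR_ALLOW_DATA_LOSS);
-- Return the database to normal multi-user access
ALTER DATABASE db_name SET MULTI_USER;
```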

  • Use SQL Database Repair Tool

To repair a severely corrupt SQL database and retrieve all the records, try using a third-party SQL database repair tool. Stellar Repair for MS SQL is one such tool, built to safely scan and fix a corrupted SQL Server database. It supports repairing databases on SQL Server 2019, 2017, 2016, and earlier versions. The software repairs database files (.mdf and .ndf) and recovers all objects, such as tables, deleted records, stored procedures, etc.

Conclusion

DBCC CHECKDB may report database corruption error 8961 when the data type of a table column is changed and the table is then updated with 4000+ rows. The issue occurs due to a corruption bug in the SQL Server engine. To fix it, try installing the latest cumulative updates released by Microsoft. If this doesn't work, try the workarounds discussed above to fix the 8961 error. If you need to repair the database, a better alternative is to use a SQL database repair tool, which can help fix the error with no added risk of data loss.