SQL Server on Linux – Log Shipping – part 4

Preamble

Log shipping (LS) is a proven, legacy high availability (HA) feature, the oldest in SQL Server, based on backup and restore. Its Recovery Point Objective (RPO) and Recovery Time Objective (RTO) are more relaxed than those of other HA options, so if your database is not highly critical you might consider it as well.

Log shipping can be combined with Availability Groups or Failover Cluster Instances, but on its own it remains the simplest option to implement where budget or skills are limited.

Based on SQL Server native backups, transaction log backups are sent from the primary server to the secondary server, where they are applied. SQL Server Agent manages the automatic flow of those transaction logs:

sql_server_linux_log_shipping01

Microsoft claims that one of the biggest advantages of LS is the ability to delay the log apply, so you can react to a human error before the mistake reaches the secondary database. They also claim that LS is more tolerant of slow networks.
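
That delay is controlled by the @restore_delay parameter of sp_add_log_shipping_secondary_database, which is used later in this post. A minimal sketch with an illustrative value of 30 minutes (the other parameters mirror the configuration script below):

-- Illustrative only: wait 30 minutes before applying each copied log backup,
-- leaving a window to react to a human error before it reaches the secondary
EXECUTE master.dbo.sp_add_log_shipping_secondary_database
  @secondary_database = N'testdb'
  ,@primary_server = N'192.168.56.101'
  ,@primary_database = N'testdb'
  ,@restore_delay = 30
  ,@restore_mode = 0
  ,@disconnect_users = 0
  ,@restore_threshold = 45
  ,@threshold_alert_enabled = 1
  ,@history_retention_period = 5760
  ,@overwrite = 1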

In an LS architecture you can mix Linux and Windows hosts. For my trial I have chosen to use two virtual machines running Oracle Linux Server release 8.5. SQL Server is 2019, build 15.0.4188 exactly:

  • server1.domain.com with IP address 192.168.56.101, my primary server
  • server2.domain.com with IP address 192.168.56.102, my secondary server

To share storage between my two servers I have decided to follow the official documentation and try Samba for the first time, with the objective of learning something.

Samba installation and configuration

On primary server

[root@server1 ~]# dnf -y install samba
[root@server1 ~]# mkdir /mssql/tlogs
[root@server1 ~]# chown mssql:mssql /mssql/tlogs
[root@server1 ~]# tail -n 8 /etc/samba/smb.conf
[tlogs]
        comment = Transaction logs directory for Log Shipping
        path=/mssql/tlogs
        available=yes
        read only=yes
        browsable=yes
        public=yes
        writable=no
[root@server1 ~]# testparm
Load smb config files from /etc/samba/smb.conf
Loaded services file OK.
Weak crypto is allowed
 
Server role: ROLE_STANDALONE
 
Press enter to see a dump of your service definitions
[root@server1 ~]# smbpasswd -a mssql
New SMB password:
Retype new SMB password:
Added user mssql.
[root@server1 ~]# systemctl restart smb nmb

On secondary server

[root@server2 ~]# smbclient --user=mssql -L //192.168.56.101
Enter SAMBA\mssql's password:
 
        Sharename       Type      Comment
        ---------       ----      -------
        print$          Disk      Printer Drivers
        tlogs           Disk      Transaction logs directory for Log Shipping
        IPC$            IPC       IPC Service (Samba 4.14.5)
        mssql           Disk      Home Directories
SMB1 disabled -- no workgroup available
[root@server2 ~]# dnf -y install cifs-utils
[root@server2 ~]# mkdir -p /mssql/tlogs
[root@server2 ~]# vi ~mssql/.tlogscreds
[root@server2 ~]# cat  ~mssql/.tlogscreds
username=mssql
password=secure_password
domain=workgroup
[root@server2 ~]# chmod 660 ~mssql/.tlogscreds
[root@server2 ~]# ll  ~mssql/.tlogscreds
-rw-rw---- 1 root root 52 Dec 16 17:14 /var/opt/mssql/.tlogscreds
[root@server2 ~]# grep cifs /etc/fstab
//192.168.56.101/tlogs /mssql/tlogs cifs credentials=/var/opt/mssql/.tlogscreds,ro,uid=mssql,gid=mssql 0 0
[root@server2 ~]# mount -a
[root@server2 ~]# df -h /mssql/tlogs
Filesystem              Size  Used Avail Use% Mounted on
//192.168.56.101/tlogs 1014M  123M  892M  13% /mssql/tlogs

You can create a test file in /mssql/tlogs on your primary server and access it in read-only mode on the secondary server: magic!

Log shipping configuration

Enable SQL Server Agent on all servers (for SQL Server 2019 and SQL Server 2017 CU4 and higher there is no package to install, just a configuration change):

[root@server1 mssql]# /opt/mssql/bin/mssql-conf set sqlagent.enabled true
SQL Server needs to be restarted in order to apply this setting. Please run
'systemctl restart mssql-server.service'.
[root@server1 mssql]# systemctl restart mssql-server
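
As a quick sanity check (not part of the original procedure, and assuming sys.dm_server_services lists the Agent service on Linux as it does on Windows), you can verify the service status from the instance itself:

-- SQL Server Agent should report as running once sqlagent.enabled is true
SELECT servicename, status_desc
FROM sys.dm_server_services;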

On primary server

Create a test database with a test table:

1> create database testdb
2> go
1> use testdb
2> create table test01(id int, descr varchar(20))
3> insert into test01 values(1,'One')
4> go
Changed database context to 'testdb'.
 
(1 rows affected)

And back it up to the LS shared directory:

1> backup database testdb to disk = '/mssql/tlogs/testdb.bak'
2> go
Processed 328 pages for database 'testdb', file 'testdb' on file 1.
Processed 2 pages for database 'testdb', file 'testdb_log' on file 1.
BACKUP DATABASE successfully processed 330 pages in 0.080 seconds (32.177 MB/sec).

I was expecting a not-so-complex command, but I was wrong. The script is provided directly by Microsoft and I have only customized it for my environment. It is built around sp_add_log_shipping_primary_database and creates SQL Server Agent jobs to handle the process (use the msdb.dbo.sysjobs_view view and the msdb.dbo.sp_help_job procedure to see them, as shown after the script):

DECLARE @LS_BackupJobId	AS uniqueidentifier
DECLARE @LS_PrimaryId	AS uniqueidentifier
DECLARE @SP_Add_RetCode	As int
EXECUTE @SP_Add_RetCode = master.dbo.sp_add_log_shipping_primary_database
  @database = N'testdb'
  ,@backup_directory = N'/mssql/tlogs'
  ,@backup_share = N'/mssql/tlogs'
  ,@backup_job_name = N'LSBackup_testdb'
  ,@backup_retention_period = 4320
  ,@backup_compression = 2
  ,@backup_threshold = 60
  ,@threshold_alert_enabled = 1
  ,@history_retention_period = 5760
  ,@backup_job_id = @LS_BackupJobId OUTPUT
  ,@primary_id = @LS_PrimaryId OUTPUT
  ,@overwrite = 1
 
IF (@@ERROR = 0 AND @SP_Add_RetCode = 0)
BEGIN
 
DECLARE @LS_BackUpScheduleUID	As uniqueidentifier
DECLARE @LS_BackUpScheduleID	AS int
 
EXECUTE msdb.dbo.sp_add_schedule
  @schedule_name =N'LSBackupSchedule'
  ,@enabled = 1
  ,@freq_type = 4
  ,@freq_interval = 1
  ,@freq_subday_type = 4
  ,@freq_subday_interval = 15
  ,@freq_recurrence_factor = 0
  ,@active_start_date = 20170418
  ,@active_end_date = 99991231
  ,@active_start_time = 0
  ,@active_end_time = 235900
  ,@schedule_uid = @LS_BackUpScheduleUID OUTPUT
  ,@schedule_id = @LS_BackUpScheduleID OUTPUT
 
EXECUTE msdb.dbo.sp_attach_schedule
  @job_id = @LS_BackupJobId
  ,@schedule_id = @LS_BackUpScheduleID
 
EXECUTE msdb.dbo.sp_update_job
  @job_id = @LS_BackupJobId
  ,@enabled = 1
 
END
 
EXECUTE master.dbo.sp_add_log_shipping_alert_job
 
EXECUTE master.dbo.sp_add_log_shipping_primary_secondary
  @primary_database = N'testdb'
  ,@secondary_server = N'192.168.56.102'
  ,@secondary_database = N'testdb'
  ,@overwrite = 1
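
The jobs created by this script can then be listed through the view and procedure mentioned above; a minimal sketch:

-- List the Log Shipping jobs created on the primary (backup job, alert job)
SELECT name, enabled, date_created
FROM msdb.dbo.sysjobs_view
WHERE name LIKE 'LS%';

-- Or ask SQL Server Agent for full job details
EXECUTE msdb.dbo.sp_help_job;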

On secondary server

Since you have access to the shared folder, start by restoring the database you plan to handle with LS:

1> restore database testdb from disk = '/mssql/tlogs/testdb.bak' with norecovery
2> go
Processed 328 pages for database 'testdb', file 'testdb' on file 1.
Processed 2 pages for database 'testdb', file 'testdb_log' on file 1.
RESTORE DATABASE successfully processed 330 pages in 0.101 seconds (25.487 MB/sec).

Same as for the primary database, the script is provided by Microsoft and is not so easy to follow. The key procedure for the secondary is sp_add_log_shipping_secondary_primary. It creates jobs to handle the process (again visible through msdb.dbo.sysjobs_view and the msdb.dbo.sp_help_job procedure; a verification query follows the script):

DECLARE @LS_Secondary__CopyJobId	AS uniqueidentifier
DECLARE @LS_Secondary__RestoreJobId	AS uniqueidentifier
DECLARE @LS_Secondary__SecondaryId	AS uniqueidentifier
DECLARE @LS_Add_RetCode	As int
 
EXECUTE @LS_Add_RetCode = master.dbo.sp_add_log_shipping_secondary_primary
  @primary_server = N'192.168.56.101'
  ,@primary_database = N'testdb'
  ,@backup_source_directory = N'/mssql/tlogs/'
  ,@backup_destination_directory = N'/mssql/tlogs/'
  ,@copy_job_name = N'LSCopy_testdb'
  ,@restore_job_name = N'LSRestore_testdb'
  ,@file_retention_period = 4320
  ,@overwrite = 1
  ,@copy_job_id = @LS_Secondary__CopyJobId OUTPUT
  ,@restore_job_id = @LS_Secondary__RestoreJobId OUTPUT
  ,@secondary_id = @LS_Secondary__SecondaryId OUTPUT
 
IF (@@ERROR = 0 AND @LS_Add_RetCode = 0)
BEGIN
 
DECLARE @LS_SecondaryCopyJobScheduleUID	As uniqueidentifier
DECLARE @LS_SecondaryCopyJobScheduleID	AS int
 
EXECUTE msdb.dbo.sp_add_schedule
  @schedule_name =N'DefaultCopyJobSchedule'
  ,@enabled = 1
  ,@freq_type = 4
  ,@freq_interval = 1
  ,@freq_subday_type = 4
  ,@freq_subday_interval = 15
  ,@freq_recurrence_factor = 0
  ,@active_start_date = 20170418
  ,@active_end_date = 99991231
  ,@active_start_time = 0
  ,@active_end_time = 235900
  ,@schedule_uid = @LS_SecondaryCopyJobScheduleUID OUTPUT
  ,@schedule_id = @LS_SecondaryCopyJobScheduleID OUTPUT
 
EXECUTE msdb.dbo.sp_attach_schedule
  @job_id = @LS_Secondary__CopyJobId
  ,@schedule_id = @LS_SecondaryCopyJobScheduleID
 
DECLARE @LS_SecondaryRestoreJobScheduleUID	As uniqueidentifier
DECLARE @LS_SecondaryRestoreJobScheduleID	AS int
 
EXECUTE msdb.dbo.sp_add_schedule
  @schedule_name =N'DefaultRestoreJobSchedule'
  ,@enabled = 1
  ,@freq_type = 4
  ,@freq_interval = 1
  ,@freq_subday_type = 4
  ,@freq_subday_interval = 15
  ,@freq_recurrence_factor = 0
  ,@active_start_date = 20170418
  ,@active_end_date = 99991231
  ,@active_start_time = 0
  ,@active_end_time = 235900
  ,@schedule_uid = @LS_SecondaryRestoreJobScheduleUID OUTPUT
  ,@schedule_id = @LS_SecondaryRestoreJobScheduleID OUTPUT
 
EXECUTE msdb.dbo.sp_attach_schedule
  @job_id = @LS_Secondary__RestoreJobId
  ,@schedule_id = @LS_SecondaryRestoreJobScheduleID
 
END
DECLARE @LS_Add_RetCode2	As int
IF (@@ERROR = 0 AND @LS_Add_RetCode = 0)
BEGIN
 
EXECUTE @LS_Add_RetCode2 = master.dbo.sp_add_log_shipping_secondary_database
  @secondary_database = N'testdb'
  ,@primary_server = N'192.168.56.101'
  ,@primary_database = N'testdb'
  ,@restore_delay = 0
  ,@restore_mode = 0
  ,@disconnect_users	= 0
  ,@restore_threshold = 45
  ,@threshold_alert_enabled = 1
  ,@history_retention_period	= 5760
  ,@overwrite = 1
 
END
 
IF (@@error = 0 AND @LS_Add_RetCode = 0)
BEGIN
 
EXECUTE msdb.dbo.sp_update_job
  @job_id = @LS_Secondary__CopyJobId
  ,@enabled = 1
 
EXECUTE msdb.dbo.sp_update_job
  @job_id = @LS_Secondary__RestoreJobId
  ,@enabled = 1
 
END
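
As on the primary, you can check what the script configured; a sketch querying the created jobs plus the log shipping tables stored in msdb:

-- Copy and restore jobs created by the script
SELECT name, enabled
FROM msdb.dbo.sysjobs_view
WHERE name LIKE 'LS%';

-- Secondary configuration stored in msdb
SELECT primary_server, primary_database, backup_source_directory, backup_destination_directory
FROM msdb.dbo.log_shipping_secondary;

SELECT secondary_database, restore_delay, restore_mode
FROM msdb.dbo.log_shipping_secondary_databases;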

Start the log shipping configuration

On primary server start the log shipping job:

1> use msdb
2> execute dbo.sp_start_job N'LSBackup_testdb'
3> go
Changed database context to 'msdb'.
Job 'LSBackup_testdb' started successfully.

On secondary server start the log shipping jobs:

1> USE msdb
2> execute dbo.sp_start_job N'LSCopy_testdb'
3> execute dbo.sp_start_job N'LSRestore_testdb'
4> go
Changed database context to 'msdb'.
Job 'LSCopy_testdb' started successfully.
Job 'LSRestore_testdb' started successfully.
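
You can then check that the first runs completed successfully through the Agent job history; a hedged sketch:

-- Recent executions of the restore job (run similarly for LSCopy_testdb and LSBackup_testdb)
EXECUTE msdb.dbo.sp_help_jobhistory @job_name = N'LSRestore_testdb';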

Log shipping failover testing

The log-shipped database is not accessible, not even in read-only mode:

1> use testdb
2> go
Msg 927, Level 14, State 2, Server server2, Line 1
Database 'testdb' cannot be opened. It is in the middle of a restore.
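
This is the expected behaviour with @restore_mode = 0 (restore WITH NORECOVERY) used in the configuration script. If you need a readable secondary between restores, the log backups can instead be applied WITH STANDBY (@restore_mode = 1). A hedged sketch of the manual equivalent, with purely illustrative file names:

-- Manual equivalent of a STANDBY restore: the database stays accessible
-- in read-only mode between log restores thanks to the undo file
RESTORE LOG testdb
FROM DISK = '/mssql/tlogs/testdb_20231216.trn'        -- illustrative log backup
WITH STANDBY = '/var/opt/mssql/data/testdb_undo.tuf'; -- illustrative undo file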

To access it you must break the log shipping process. To check that my test table rows are shipped, I first insert a new row on the primary:

1> insert into test01 values(2,'Two');
2> select * from test01
3> go
 
(1 rows affected)
id          descr
----------- --------------------
          1 One
          2 Two

Break the LS process on the secondary server with:

1> restore database testdb with recovery
2> go
RESTORE DATABASE successfully processed 0 pages in 0.441 seconds (0.000 MB/sec).
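
Note that this is a simple break of the chain. In a planned failover where the primary is still reachable, you would normally back up the tail of the log first so that nothing is lost; a minimal sketch, with an illustrative file name:

-- On the primary: back up the tail of the log and leave the database restoring
BACKUP LOG testdb TO DISK = '/mssql/tlogs/testdb_tail.trn' WITH NORECOVERY;

-- On the secondary: apply the tail, then bring the database online
RESTORE LOG testdb FROM DISK = '/mssql/tlogs/testdb_tail.trn' WITH NORECOVERY;
RESTORE DATABASE testdb WITH RECOVERY;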

Then you can access the database and confirm that everything has been replicated:

1> use testdb
2> select * from test01
3> go
Changed database context to 'testdb'.
id          descr
----------- --------------------
          1 One
          2 Two
 
(2 rows affected)

If I'm not mistaken, to re-establish the log shipping configuration you have to restart from a full backup of the original database, which might be an issue if your primary is really big…

On the primary server take a full database backup:

1> backup database testdb to disk = '/mssql/tlogs/testdb.bak'
2> go
Processed 328 pages for database 'testdb', file 'testdb' on file 3.
Processed 2 pages for database 'testdb', file 'testdb_log' on file 3.
BACKUP DATABASE successfully processed 330 pages in 0.065 seconds (39.603 MB/sec).

Restore this backup on the secondary server with the REPLACE option to overwrite the existing database:

1> restore database testdb from disk = '/mssql/tlogs/testdb.bak' with norecovery
2> go
Msg 3159, Level 16, State 1, Server server2, Line 1
The tail of the log for the database "testdb" has not been backed up. Use BACKUP LOG WITH NORECOVERY to backup the log if it contains work you do not want to lose. Use the WITH REPLACE or WITH STOPAT clause of the RESTORE statement to just overwrite the contents of the log.
Msg 3013, Level 16, State 1, Server server2, Line 1
RESTORE DATABASE is terminating abnormally.
1> restore database testdb from disk = '/mssql/tlogs/testdb.bak' with norecovery, replace
2> go
Processed 328 pages for database 'testdb', file 'testdb' on file 1.
Processed 2 pages for database 'testdb', file 'testdb_log' on file 1.
RESTORE DATABASE successfully processed 330 pages in 0.065 seconds (39.603 MB/sec).

On the primary server restart the LS backup job:

1> exec msdb.dbo.sp_start_job N'LSBackup_testdb'
2> go
Job 'LSBackup_testdb' started successfully.

On the secondary server restart the two LS jobs:

1> exec msdb.dbo.sp_start_job N'LSCopy_testdb'
2> exec msdb.dbo.sp_start_job N'LSRestore_testdb'
3> go
Job 'LSCopy_testdb' started successfully.
Job 'LSRestore_testdb' started successfully.

And you are back to the original situation…

Log shipping monitoring

You have plenty of views and procedures (https://docs.microsoft.com/en-us/sql/database-engine/log-shipping/monitor-log-shipping-transact-sql?view=sql-server-ver15), but again the easiest way is to use SQL Server Management Studio (SSMS): right-click on the instance, then Reports / Standard Reports / Transaction Log Shipping Status (a query-based alternative is sketched after the screenshot):

sql_server_linux_log_shipping02
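
If you prefer Transact-SQL, the msdb monitor tables referenced in the link above give the same information; a minimal sketch:

-- On the primary: last transaction log backup taken by the LSBackup job
SELECT primary_database, last_backup_file, last_backup_date
FROM msdb.dbo.log_shipping_monitor_primary;

-- On the secondary: last copied and last restored log backup, plus latency
SELECT secondary_database, last_copied_date, last_restored_date, last_restored_latency
FROM msdb.dbo.log_shipping_monitor_secondary;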
