Sunday 26 April 2020

Stairway to Microsoft Azure SQL Database Part 1 : Deployment Options

Recently I started working with Azure SQL Database. It took some time for me to understand the concepts of the Azure SQL database. Earlier I have experience in AWS RDS for SQL server, but in Azure it works in a little complex way. Once you understand the different offerings, security, deployment model, etc, it is is easy to provision the resources as per the requirements.

Through a set of articles, I will try to explain the Azure SQL database in a very concise way. All this information is available in Azure documentation. For the granular level of information, always refer to Azure documentation.   

Microsoft Azure database is a fully managed platform as a service (PaaS). Azure provides three different deployment options to cater to the different needs of the customer.

  • Managed Instance 
  • Single Database 
  • Elastic Pool
Let us look into the details of each deployment model :

Managed Instance: Managed instance is one of the most suitable deployment option for those who are moving to cloud from an on-premise server. This deployment model supports most of our on-premise database features.
  • Managed instance does not provide access to the Operating system.
  • It does not allow you to specify the version of the SQL Server. Managed instance always runs the latest stable SQL engine version, which is equal to or higher than the latest available RTM version of SQL Server. 
  • Supports 
    • SQL Agent service 
    • DB mail with external SMTP server
    • Service broker
    • SQL Agent service 
    • Transactional replication
    • Change Data Capture (CDC)  
    • CLR creation from binary (Not using the assembly file)
    • linked server 
    • Managed instance supports SQL Server Integration Services (SSIS) and can host SSIS catalog (SSISDB) that stores SSIS packages, but they are executed on a managed Azure-SSIS Integration Runtime (IR) in Azure Data Factory (ADF)
    • To choose the time zone
  • Not yet supporting the file stream or file table.
On top of that, managed instance support all PaaS features like :
  • No separate purchase of infrastructure and managing the underlying hardware of the database server.
  • Automated patching and version upgrade. There is no option to control the maintenance window of patching or upgrade. To handle the connection error during the maintenance window, you need to implement the retry mechanism in your code.
  • No headache of scheduling and monitoring of backups. Automated backups are available 
Migration to the managed instance is much easier as this supports restoring from the native backup created from the on-premises server. To restore the backup in the managed instance, the backup should be available in Azure storage account and should use RESTORE DATABASE FROM URL. Managed instances also allow the customer to take COPY_ONLY backups which do not break the azure automated backup chain. 

Managed instance is placed inside the Azure virtual network and in a dedicated subnet. This provides :
  • Secure private IP address.
  • The ability to connect an on-premises network to a managed instance.
  • The ability to connect a managed instance or another on-premises database server through a linked server.
Managed instances also provide public endpoints. Public endpoint provides the ability to connect to the Managed Instance from the Internet without using a VPN. Access is disabled by default unless explicitly allowed. We need to explicitly whitelist the IP address to access through the public IP address.

managed instance provides two flavors of compute and storage :
  • General Purpose: This supports most of the production workload. Support up to 80 vCore and 8TB fast storage
  • Business Critical: For IO intensive and compute-intensive workload.Support up to 80 vCore and 4 TB super fast storage 
Managed instance also support to bring your own license with software assurance (SA). Considering all these, managed instances are the best option for lift and shift of your existing workload to the cloud. Note that, if you need direct access to OS/filesystem or dependent on a specific version of SQL server or required specific features that are not supported in the Azure SQL server then Azure Paas database service is not suitable unless you resolve those dependencies.

Single Database:  A single database is a fully-managed database as a service (DbaaS) under the Azure PaaS offering. This deployment option is most suited for modern application development like microservices. The single database deployment option creates a database in Azure with its own set of resources and is managed via a SQL database server. With a single database, each database resources are isolated from each other. A SQL database server is completely different from the SQL server that we used to have in the on-premise installation. In this context of azure DbaaS, an SQL database server is a logical construct that acts as a central administrative point for one or more single databases.
  • Single database does not provide access to the Operating system.
  • It does not allow you to specify the version of the SQL Server. Single database always runs the latest stable SQL engine version, which is equal to or higher than the latest available RTM version of SQL Server. 
  • Does not Supports 
    • SQL Agent service 
    • DB mail service
    • Service broker
    • Replication (can be push subscriber )
    • Change Data Capture (CDC)  
    • CLR creation 
    • linked server 
    • To choose the time zone
    • File Stream and File table
In short, Single database support only database-level features and does not support server level features.  To migrate your existing workload to a single database,  there are three primary methods.
  • Export to a BACPAC file and import the BACPAC file into azure single database. (BACPAC file is a ZIP file with an extension of BACPAC containing the metadata and data from a SQL Server database. A BACPAC file can be stored in Azure Blob storage or in local storage in an on-premises location and later imported back into Azure SQL Database or into a SQL Server on-premises installation). This is a time-consuming process and requires downtime. The time required to complete the migration depends on the size of the database.
  • Using transactional replication: Azure single database can be configured as a subscriber of your on-premise database publication. There is no UI available.  Need to configure through T-SQL. Once the complete data synched with the source server, on migration day cut down the traffic to your on-premise database and after synchronizing the data, point your application to the new Azure database 
  • Use the Azure DMS (Data Migration service ).
Azure database server does not allow you to host inside the Vnet. By default, the access is through the public endpoint.  All the traffic through this endpoint is blocked and you need to explicitly whitelist the required IP address to connect to the database server through this endpoint. There is no option to disable the private endpoint but as mentioned earlier, no one can connect through public endpoint unless you provide the access by whitelisting the IP addresses. You can enable the private endpoint by integrating virtual network private endpoints. Private endpoint helps to connect from the same VNet or peered VNet in the same/cross-region or from on-premises using a VPN.  Note that virtual network private endpoint will be billed separately 

Azure Database provides two types of purchase :
  • DTU base
  • vCore base
Users can select the option based on their workload. In the vCore model, azure allows you to bring your own license. We will discuss these options in another post.

Single database support PaaS features like :

  • No separate purchase of infrastructure and managing the underlying hardware of the database server.
  • Automated patching and version upgrade. There is no option to control the maintenance window of patching or upgrade. To handle the connection error during the maintenance window, you need to implement the retry mechanism in your code.
  • No headache of scheduling and monitoring of backups. Automated backups are available 

Elastic Pool: Elastic Pool Is a set of single databases with a shared resource pool. In a single database, each database will have its own dedicated resource. In the elastic pool, resources are configured in the pool level and each single database connected with that pool shares the resource of elastic pool. SQL Database elastic pools are a simple solution for managing and scaling multiple databases that have varying and unpredictable usage demands. For example, let us assume you are providing some SaaS solutions to multiple clients in different time zone. Each customer data are stored in independent databases. In this scenario, the resource can be utilized in a much better way by adding those databases to the elastic pool instead of allocating dedicated resources to each individual customer database. Apart from this, the rest of the characteristics are the same as the Single database mentioned in the above section.

Hope the below pictorial representation helps you to understand different deployment option available in Azure DbaaS. 

Sunday 19 April 2020

Disk Performance Metrics in AWS and Azure

In the last two posts, we have discussed the different types of storage disks available in the AWS and Azure cloud platforms. We also discussed the characteristics that affect the performance of the disk. You can read those article using the below links:

In this post, let us discuss about performance counters that can be used to measure the performance of your disk. These counters can be captured using the perfmon tool available in windows. Aws and azure provides a graphical representation of some of these counters in their portal.

Disk Reads/sec  and Disk Writes/sec (IOPS)
Measures the rate of read/write operations from/to the disk. In other words, these counters measure the number of read/write operations completed by the disk. You can use disk operations/Sec (IOPS) counter in the Azure console to measure the same. On the other side AWS provides VolumeReadOps and  VolumeWriteOps metrics to measure the same. This sum of these two counters gives us IOPS utilization of the disk.

Disk Read Bytes/sec  and Disk Write Bytes/sec (Throughput)
Measure the number of bytes read from the disk and write to the disk per second.  Disk bytes/sec counter provides the throughput utilization of the disk. Azure provides Disk Bytes/Sec to measure the throughput in the dashboard. In AWS, you can use the VolumeReadBytes and VolumeWriteByte cloud watch metrics to measure the throughput.

Average Disk sec/Read and Average disk sec/Write (Latency)
The Avg. Disk sec/Read performance counter indicates the average time, in seconds, of a read of data from the disk. The average value of the Avg. Disk sec/Read performance counter should be under 10 milliseconds. In other words, the number of seconds to complete one read/write operation.

Average Disk Bytes/Read and Average Disk Bytes/Write (Size of the IO)
This counter shows the average size of the IO operation of the disk workload.

Avg. Disk Queue Length and Current Disk Queue Length
Avg. Disk Queue Length is the average number of (read and write) requests that are queued for the disk operations during the sample interval. Current Disk Queue Length is the number of IO requests pending on the disk at that point in time. Higher value of this counter results in high disk latency. Azure provides Disk QD (Queue Depth) to measure the average disk queue length. Whereas AWS provides the cloud watch metrics  VolumeQueueLength to measure the same. 

I hope this article helps you to troubleshoot the database performance issues in the cloud virtual machines. It may be much easier to identify the CPU and memory bottleneck of your database server. Detecting the IO bottleneck requires more effort and time.

Sunday 5 April 2020

Disks performance in Azure

In our earliest post, we have discussed in detail about the factors that affect the disk performance and different types of disk offerings in the AWS platform. In this blog post, let us go through different types of disk available in the Azure Platform.

Azure offers three types of disk :
  • Standard SSD
  • Premium SSD 
  • Ultra Disk
Apart from this, Azure provides a standard HDD. These disks are low-cost disk support for VMs running low latency workload. Standard HDD offers write latencies under 10ms and read latencies under 20ms.

Before describing the characteristic of each disk type, let us go through some common facts:
  • I/O operation less than or equal to 256 KiB of size is considered as a single IO operation. AWS supports the IO size of 128 KiB for each IOPS. In azure, a disk with 1000 IOPS can attain maximum throughput of 1000X256=256000KB=250MiB/sec, but Azure put max throughput limit for each disk type and that will be less than or equal to the possible max throughput (IOPS X 256 KB)
  • Azure provides predefined disk size except for ultra disk. Users are allowed to create custom size disks but will be charged at the rate of next available size. For example, if the user creates a disk with 200 GB, he will be billed for the next available disk size offered by Azure. This is not applicable to Ultra Disk. With ultra disk, users can select the custom size and charge for that size.
  • Apart from the disk level maximum limit of  IOPS and throughput, Azure implements throttling at the VM level for IOPS and throughput. This cap works on the aggregate value of  IOPS and throughput of all disk attached to the VM.

Standard SSD: Standard SSD disks are low-cost SSD provides better latency than the standard HDD. Standard SSD disks are best suitable for OS disk, web servers, and low IOPS application servers. Azure offers fourteen different disk sizes ranging from 4GB to 32 TB with IOPS ranging from 500 IOPS to 6000 IOPS and throughput range between 25MiB/sec to 750MiB/sec. Actual IOPS and throughput may vary depending on the traffic pattern. As far as I understood from the documentation, these are the maximum IOPS and throughput but not guaranteed always. For exact details of size and  IOPS refer to azure page 

Premium SSD : Azure premium SSD disks are high performing and low-latency disks designed to support IO-intensive workloads. Premium SSDs can only be used with the VM series that are compatible with premium SSD disks. Azure offers fourteen different disk sizes ranging from 4GB to 32 TB with IOPS ranging from 120 IOPS to 20000 IOPS and throughput range between 25MiB/sec to 900MiB/sec. The major difference from the standard SSD, premium SSD provides more consistent performance in terms of IOPS and throughput. For exact details of size and  IOPS refer to azure page 

Apart from this, premium SSD supports bursting for disk size up to 512 GB. These disks can burst up to 3500 IOPS and throughput up to 170 MiB/sec. Bursting is based on a credit system. Credits are automatically accumulated when disk traffic is below the provisioned performance and credits are automatically consumed when traffic bursts beyond the provisioned performance. The max burst limit defines the ceiling of disk IOPS and throughput even if you have accumulated burst credits. This helps to support random unpredictable spikes of the IO workload for disk up to 512 GB in size.

For example, a disk with a size of 128 GB has provisioned 500 IOPS and 100MiB/sec throughput. This disk supports the burstable performance of 3500 IOPS and 170 MiB/sec for maximum of 30 minutes.  The burstable limit is the same for all the disk size. When the disk workload goes below the provisioned IOPS and throughput ( 500 IOPS and 100MiB/sec), it starts accumulating the burstable credits and the same will be used when the workload goes beyond the provisioned IOPS and throughput. For example, if the actual workload is 200 IOPS and 80MiB/sec in the past one-second interval, then the unused 300 IOPS and 20MB are accumulated to the burst bucket of the disk.

Disk bursting will never go beyond VM throttling on  IOPS and throughput. Irrespective of burstable credits available, the disk performance never goes beyond the burstable limit (1700 IOPS and 170MiB/sec). For example, even if the disk has 10,000 IOPS  in the burstable credit bucket, the disk can not support more than the max burst limit of 3500 IOPS. In the same way, if the disk is attached to a VM with an IOPS limit of 1000 IOPS and throughput limit of 150 MiB/sec, the disk can't burst beyond the VM limit.

When a VM starts with the burstable disk, it starts with a full burst credit bucket. This helps the booting speed of the VM if the boot disk is a burstable disk . A burstable disk  credit bucket can be in any of the three states :
  • Accruing: When the disk traffic is less than the provisioned IOPS and throughput. 
  • Declining: when the disk traffic is using more than the provisioned IOPS and throughput. The burst traffic will independently consume credits from IOPS and throughput.
  • Remaining Constant: When the disk performance is exactly at the provisioned IOPS and throughput level.

Ultra Disk : Azure ultra disks provide high throughput, high IOPS and consistent low latency disk storage for IO intensive mission-critical application. Ultra disk can only be used as a data disk(not as OS disk). Ultra disk supports the disk size ranging from 4 GB to 64 TB. When we configure the ultra disk, we can customize the size and performance variables (IOPS and throughput) 
  • Ultra disk support 300 IOPS per GB with a maximum limit of 160000 IOPS per disk. For example, an 8GB disk can be configured to support a maximum of 2400 IOPS. All disk above the size of 533 GB, support only  160K IOPS. (533*300=159900)
  • The minimum configurable IOPS for an ultra disk is 2 IOPS per GB with a baseline of 100 IOPS per disk. for example, a 32 GB disks can be configured minimum of 100 IOPS instead of 64 IOPS. A 1000 GB disk minimum configurable value for IOPS is 2000
  • To get the benefit of the provisioned IOPS, ensure that the VM IOPS limit is higher than the IOPS configured for the disk. 
  • The throughput limit is 256KiB/sec for each provisioned IOPS with a maximum of 2000 MiB/sec per disk. The minimum throughput per disk is 4KiB/sec for each provisioned IOPS with a baseline of 1MB/sec.
  • Ultra disk supports adjusting the performance variables (Throughput and IOPS) at runtime without detaching disk or restarting the VM. After changing the performance variable, it will take a maximum of one hour to get it effective.
There are lots of restrictions in using the ultra disk. Refer to the azure website for up to date information. 

Disk Stripping for better performance: Use multiple disks and stripes them together to get a combined higher IOPS and throughput. For example, in premium storage , P40 (2TB) and P50(4TB) provide the same  IOPS (7500) and throughput (250MiB/sec). Instead of going for P50 disk, we can allocate two P40 disks and stripes them in OS level to get combined IOPS of 15000 and 500MiB/sec throughput. 

Hope this explains well about the different types of disk available in the Azure portal and keep these points in mind while allocating disk for better performance. In the next blog, we will go through different parameters to measure the performance of the disk.