Setup SQL Server Connection Resiliency with Entity Framework

Tested on:
Microsoft SQL Server 13.00.5292
Entity Framework 6.2.0

When an application execute too many operations against SQL Server, a transient fault could occur.
Transient fault in SQL Server could occurred in self-hosted or cloud SQL Server. It could caused by many reasons and extremely (if not impossible) to debug.
You might encountered error messages like:

“Database * on server * is not currently available. Please retry the connection later. If the problem persists, contact customer support, and provide them the session tracing ID of *.”

Other similar messages also point to transient fault errors.

Microsoft suggested solution to transient fault is to retry the executed operation, often dubbed as database connection resiliency. To set this up in Entity Framework, follow this step:

  1. Set EF DbConfiguration using code-based configuration, see this post.

  2. In the constructor, add following code

    SetExecutionStrategy("System.Data.SqlClient", () => new SqlAzureExecutionStrategy(10, TimeSpan.FromSeconds(50)));
    

  3. The first parameter of SqlAzureExecutionStrategy is the maximum number of retry and second parameter is the delay in milliseconds between retries.

Once set, EF will retry the faulted operation automatically. Consider performance when deciding how many tries and delay time to configure. The longer the try or delay time will increase operation time.

References:
Microsoft
Sql Server Backup Academy

Advertisements

Parse JSON in SQL Server

I have following rows in database.

2019-07-17 17_33_37-SQLQuery45.sql - (local)_.Infinity (ENERGY_sutet (53))_ - Microsoft SQL Server M

Here is the full json:

{
    "FirstName": "Paul",
    "LastName": "Goodman",
    "Occupations": [
        {
            "Title": "Senior Manager",
            "Company": "Purple Ocean",
            "YearOfService": 9
        },
        {
            "Title": "Manager",
            "Company": "Blue Ocean",
            "YearOfService": 4
        }
    ]
}
{
    "FirstName": "Tom",
    "LastName": "Badman",
    "Occupations": [
        {
            "Title": "Supervisor",
            "Company": "Red Sky",
            "YearOfService": 1
        },
        {
            "Title": "Janitor",
            "Company": "Yellow Sky",
            "YearOfService": 7
        }
    ]
}

Here is how I can use JSON_VALUE to parse `Data` column.

2019-07-17 17_34_04-SQLQuery45.sql - (local)_.Infinity (ENERGY_sutet (53))_ - Microsoft SQL Server M

I can also use JSON_VALUE in WHERE clause.

2019-07-17 17_34_38-SQLQuery45.sql - (local)_.Infinity (ENERGY_sutet (53))_ - Microsoft SQL Server M

Comparing Linq Join and GroupJoin

Tested on:
.Net 4.5
Entity Framework 6 – Github

This is a post comparing Linq Join and GroupJoin for querying database.

I created simple database consist of 2 tables, `Instructor` and `Class`, which has 1 to many relationship.

SQL instructor table

SQL class table

Entity Framework diagram looks like this.
EF diagram

The console app using Linq to join these 2 tables look like this.

using System.Linq;

namespace LinqJoin
{
    class Program
    {
        static void Main(string[] args)
        {
            using (var db = new TestEntities())
            {
                var join = db.Instructor
                    .Join(
                        db.Class,
                        i => i.InstructorId,
                        c => c.InstructorId,
                        (i, c) => new
                        {
                            Instructor = i,
                            Class = c
                        })
                    .ToList();

                var groupJoin = db.Instructor
                    .GroupJoin(
                        db.Class,
                        i => i.InstructorId,
                        c => c.InstructorId,
                        (i, c) => new
                        {
                            Instructor = i,
                            Class = c
                        })
                    .ToList();

                var flattenGroupJoin = db.Instructor
                    .GroupJoin(
                        db.Class,
                        i => i.InstructorId,
                        c => c.InstructorId,
                        (i, c) => new
                        {
                            Instructor = i,
                            Class = c
                        })
                    .SelectMany(r => r.Class, (r, c) => new
                    {
                        Instructor = r.Instructor,
                        Class = c
                    })
                    .ToList();
            }
        }
    }
}

Following are the result.

** Join **

Visual Studio Debug
VS join result

SQL Profiler

SELECT 
    [Extent1].[InstructorId] AS [InstructorId], 
    [Extent1].[FirstName] AS [FirstName], 
    [Extent1].[LastName] AS [LastName], 
    [Extent2].[ClassId] AS [ClassId], 
    [Extent2].[Name] AS [Name], 
    [Extent2].[ClassNumber] AS [ClassNumber], 
    [Extent2].[InstructorId] AS [InstructorId1]
    FROM  [dbo].[Instructor] AS [Extent1]
    INNER JOIN [dbo].[Class] AS [Extent2] ON [Extent1].[InstructorId] = [Extent2].[InstructorId]

** Group Join **

`GroupJoin` result in grouped data by specified entity. In this case I grouped by `Instructor`.

Visual Studio Debug
VS groupjoin result

SQL Profiler

SELECT 
    [Project1].[InstructorId] AS [InstructorId], 
    [Project1].[FirstName] AS [FirstName], 
    [Project1].[LastName] AS [LastName], 
    [Project1].[C1] AS [C1], 
    [Project1].[ClassId] AS [ClassId], 
    [Project1].[Name] AS [Name], 
    [Project1].[ClassNumber] AS [ClassNumber], 
    [Project1].[InstructorId1] AS [InstructorId1]
    FROM ( SELECT 
        [Extent1].[InstructorId] AS [InstructorId], 
        [Extent1].[FirstName] AS [FirstName], 
        [Extent1].[LastName] AS [LastName], 
        [Extent2].[ClassId] AS [ClassId], 
        [Extent2].[Name] AS [Name], 
        [Extent2].[ClassNumber] AS [ClassNumber], 
        [Extent2].[InstructorId] AS [InstructorId1], 
        CASE WHEN ([Extent2].[ClassId] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
        FROM  [dbo].[Instructor] AS [Extent1]
        LEFT OUTER JOIN [dbo].[Class] AS [Extent2] ON [Extent1].[InstructorId] = [Extent2].[InstructorId]
    )  AS [Project1]
    ORDER BY [Project1].[InstructorId] ASC, [Project1].[C1] ASC

** Flatten Group Join **

Visual Studio Debug
VS flatten groupjoin result

SQL Profiler

SELECT 
    [Extent1].[InstructorId] AS [InstructorId], 
    [Extent1].[FirstName] AS [FirstName], 
    [Extent1].[LastName] AS [LastName], 
    [Extent2].[ClassId] AS [ClassId], 
    [Extent2].[Name] AS [Name], 
    [Extent2].[ClassNumber] AS [ClassNumber], 
    [Extent2].[InstructorId] AS [InstructorId1]
    FROM  [dbo].[Instructor] AS [Extent1]
    INNER JOIN [dbo].[Class] AS [Extent2] ON [Extent1].[InstructorId] = [Extent2].[InstructorId]

Adding and Reference a Database Name in Data-Tier Application SQL Server

This apply to SQL Server Database Project.

Continuing from this post.

When adding data-tier application to SQL Server project, you can specify how to reference a database.

1. Right click on `References` under your project, select `Add Database Reference…`.
1

2. A dialog box will show up. Select your file data-tier application file with `Browse…` button.

3. Once data-tier application is selected, you can specify how you would like to reference a database. The default is show on screen show below.
3

The `Example usage` show how you can reference a database. For example, in your View, default way to reference a database is:

CREATE VIEW dbo.AccountView
	AS

SELECT *
FROM [$(test)].dbo.Account

4. (Optional) To change default way to reference a database, remove `Database variable` or set to anything you like. If `Database variable` is removed, you can reference a database the “normal” way.
2

CREATE VIEW dbo.AccountView
	AS

SELECT *
FROM test.dbo.Account

Data Warehouse Solutions in Azure

Date Warehousing Solutions at a Glance

With today’s big data requirements where data could be structured, unstructured, batch, stream and come in many other forms and size, traditional data warehouse is not going to cut it.

Typically, there are 4 types of data stage:

  • Ingest
  • Store
  • Processing
  • Consuming

Different technology is required at different stage. This also depends heavily on size and form of data and the 4 Vs: Volume, Variety, Velocity, Veracity.

Consideration for the solutions sometime also depends on:

  • Ease of management
  • Team skill sets
  • Language
  • Cost
  • Specification / requirements
  • Integration with existing / others system.

Azure Services

Azure offers many services for data warehouse solutions. Traditionally, data warehouse has been ETL process + relational database storage like SQL Data Warehouse. Today, that may not always be the case.

Some of Azure services for data warehousing:

  • Azure HDInsight
    Azure offers various cluster types that comes with HDInsight, fully managed by Microsoft, but still require management from users. Also supports Data Lake Storage. More about HDInsight. HDInsight sits on “Processing” data stage.
  • Azure Databricks
    Its support for machine learning, AI, analytics and stream / graph processing makes it a go-to solution for data processing. It’s also fully integrated with Power BI and other source / destination tools. Notebooks in Databricks allows collaboration between data engineers, data scientist and business users. Compare to HDInsight.
  • Azure Data Factory
    The “Ingest” part of data stage. Its function is to bring data in and move them around different system. Azure Data Factory supports different pipelines across Azure services to connect the data and even on-premise data. Azure Data Factory can be used to control the flow of data.
  • Azure SQL Data Warehouse
    Typically the end destination of data and to be consumed by business users. SQL DW is platform as a service, require less management from users and great for team who already familiar with TSQL and SSMS (SQL Management Studio). You can also scale it dynamically, pause / resume the compute. SQL DW uses internal storage to store data and include the compute component. SQL Data Warehouse sits on “Consuming” stage.
  • Database services (RDBMS, Cosmos, etc)
    SQL database, or other relational database system, Cosmos are part of the storage solutions offered in Azure Services. This is typically more expensive than Azure Storage, but also offer other features. Database services are part of “Storage” stage.
  • Azure Data Lake Storage
    Build on top of Azure Storage, ADLS offers unlimited storage and file system based on HDFS, allowing optimization for analytics purpose, like Hadoop or HDInsight. ADLS is part of “Storage” stage.
  • Azure Data Lake Analytics
    ADLA is a high-level abstraction of HDInsight. Users will not need to worry about scaling and management of the clusters at all, it’s an instant scale per job. However, this also comes with some limitations. ADLA support USQL, a SQL-like language that allows custom user defined function in C#. The tooling is also what developers are already familiar with, Visual Studio.
  • Azure Storage
  • Azure Analysis Services
  • Power BI

Which one to use?

There’s no right or wrong answer. The right solution depends on many others things, technical and non-technical as well as the considerations mentioned above.

Simon Lidberg and Benjamin Wright Jones have a really good presentation around this topic. See the link at reference for their full talk. But, basically, the flowchart to make decision looks like this:

data-warehouse-solutions-in-azure

Reference

https://myignite.techcommunity.microsoft.com/sessions/66581

What is SQL Server “Included Columns” ?

SQL Server organized indexes in a table as B-tree structure, which look like this.

B-tree-structure

A table that has `clustered index` store actual data rows at leaf level.
A table that has `non-clustered index`, only store (at leaf level) the value from the indexed column and a `row locator` point to actual data rows. It does not store actual data rows.

Specifying Included Columns in `non-clustered index` tell SQL Server you want to store actual data rows at leaf level.

For example:

CREATE NONCLUSTERED INDEX IX_Employee_Department
ON Employee(EmployeeFirstName)
INCLUDE (EmployeeAddress, EmployeeDOB)
GO

This example will create `non-clustered index` and its leaf level will store value from indexed column (EmployeeFirstName) and data rows from EmployeeAddress and EmployeeDOB columns.

SQL Server Configuration Manager: “Cannot connect to WMI provider. You do not have permission”

This caused by uninstalling certain version of SQL Server and re-installing different version.

See this for fixes.

Note: Run command prompt as Administrator.