List to Read

The emerging big data architectural pattern

An Azure post about big data architecture. If you are about to start a big data project, this is a good introduction. It mainly focuses on the Lambda architecture (batch and speed layers) using Azure resources.

Roadmap for learning the JavaScript language

Have you just started learning JavaScript? Are you planning to master it? This post is for you. Learn JavaScript from its early versions and see how it evolved into the current version.

Choosing the right frontend database for a single page application

If you are now serious about PWAs, check out this article on choosing the ‘right’ frontend database for your app. It is especially useful when you are starting a new project.

Turn your Angular App into a PWA in 4 Easy Steps

Now that you know what a PWA is, you can turn your existing Angular app into one. It looks pretty straightforward, honestly. But the devil is always in the details, right?

Progressive Web Apps & Electron

You can’t talk about PWAs without mentioning Electron apps. But what’s the difference? Check this post out to find the answer. In short, they are like siblings: similar but different.

Programmatically Configure EF DbConfiguration

Tested on:
Entity Framework 6.2.0

There are a few ways to set EF configuration; this post describes using code-based DbConfiguration.

  1. Create a class that inherits from DbConfiguration.
    using System.Data.Entity;
    using System.Data.Entity.Infrastructure;
    namespace InfinityDataModel
    {
        public class InfinityConfiguration : DbConfiguration
        {
            public InfinityConfiguration()
            {
                // Sample configuration; the argument is the LocalDB version or instance name
                SetDefaultConnectionFactory(new LocalDbConnectionFactory("InfinityDb"));
            }
        }
    }

  2. The class must be in the same assembly as your Entity Framework data model.

  3. In the constructor, set the configuration you want. For all possible configurations, see here.
  4. There is no need to do anything else. EF will load your configuration class when initialized.
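
Step 2 requires the configuration class to share an assembly with the data model. When that is not possible, EF 6 also lets the context point at its configuration with the `DbConfigurationType` attribute. The fragment below is a minimal sketch of that option; `InfinityContext` is a hypothetical context name used only for illustration, and the fragment assumes the `InfinityConfiguration` class from step 1 and a reference to the EntityFramework package.

```csharp
using System.Data.Entity;

namespace InfinityDataModel
{
    // Hypothetical context class, named here only for illustration.
    // The attribute tells EF which DbConfiguration type to load for this context.
    [DbConfigurationType(typeof(InfinityConfiguration))]
    public class InfinityContext : DbContext
    {
    }
}
```

With the attribute in place, the configuration is still applied once per application domain, so it should be treated as application-wide rather than per-context.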


Adding and Referencing a Database Name in a SQL Server Data-Tier Application

This applies to SQL Server Database Projects.

Continuing from this post.

When adding a data-tier application to a SQL Server project, you can specify how to reference its database.

1. Right-click `References` under your project and select `Add Database Reference…`.

2. A dialog box will show up. Select your data-tier application file with the `Browse…` button.

3. Once the data-tier application is selected, you can specify how you would like to reference the database. The default is shown in the screenshot below.

The `Example usage` field shows how you can reference the database. For example, in a view, the default way to reference the database is:

CREATE VIEW dbo.AccountView
AS
SELECT *
FROM [$(test)].dbo.Account

4. (Optional) To change the default way of referencing the database, remove the `Database variable` value or set it to anything you like. If `Database variable` is removed, you can reference the database the “normal” way.

CREATE VIEW dbo.AccountView
AS
SELECT *
FROM test.dbo.Account

Data Warehouse Solutions in Azure

Data Warehousing Solutions at a Glance

With today’s big data requirements, where data can be structured or unstructured, arrive in batches or streams, and come in many other forms and sizes, a traditional data warehouse is not going to cut it.

Typically, data moves through 4 stages:

  • Ingest
  • Store
  • Processing
  • Consuming

Different technologies are required at different stages. The choice also depends heavily on the size and form of the data and the 4 Vs: Volume, Variety, Velocity, Veracity.

The choice of solution sometimes also depends on:

  • Ease of management
  • Team skill sets
  • Language
  • Cost
  • Specification / requirements
  • Integration with existing / other systems

Azure Services

Azure offers many services for data warehouse solutions. Traditionally, a data warehouse has been an ETL process plus relational database storage such as SQL Data Warehouse. Today, that may not always be the case.

Some of the Azure services for data warehousing:

  • Azure HDInsight
    Azure offers various cluster types with HDInsight. They are fully managed by Microsoft but still require some management from users. HDInsight also supports Data Lake Storage. More about HDInsight. HDInsight sits in the “Processing” data stage.
  • Azure Databricks
    Its support for machine learning, AI, analytics and stream / graph processing makes it a go-to solution for data processing. It is also fully integrated with Power BI and other source / destination tools. Notebooks in Databricks allow collaboration between data engineers, data scientists and business users. Compare to HDInsight.
  • Azure Data Factory
    The “Ingest” part of the data stages. Its function is to bring data in and move it between different systems. Azure Data Factory supports pipelines that connect data across Azure services and even on-premises data, and it can be used to control the flow of data.
  • Azure SQL Data Warehouse
    Typically the end destination of the data, where it is consumed by business users. SQL DW is a platform as a service: it requires less management from users and is great for teams who are already familiar with T-SQL and SSMS (SQL Server Management Studio). You can also scale it dynamically and pause / resume the compute. SQL DW uses internal storage to store data and includes the compute component. SQL Data Warehouse sits in the “Consuming” stage.
  • Database services (RDBMS, Cosmos, etc)
    SQL Database, other relational database systems and Cosmos DB are part of the storage solutions offered in Azure. They are typically more expensive than Azure Storage, but also offer other features. Database services are part of the “Store” stage.
  • Azure Data Lake Storage
    Built on top of Azure Storage, ADLS offers unlimited storage and a file system based on HDFS, which allows optimization for analytics workloads such as Hadoop or HDInsight. ADLS is part of the “Store” stage.
  • Azure Data Lake Analytics
    ADLA is a high-level abstraction of HDInsight. Users do not need to worry about scaling or managing clusters at all; it scales instantly per job. However, this also comes with some limitations. ADLA supports U-SQL, a SQL-like language that allows custom user-defined functions in C#. The tooling is also what developers are already familiar with: Visual Studio.
  • Azure Storage
  • Azure Analysis Services
  • Power BI

Which one to use?

There’s no right or wrong answer. The right solution depends on many other things, technical and non-technical, as well as the considerations mentioned above.

Simon Lidberg and Benjamin Wright Jones have a really good presentation on this topic. See the link in the references for their full talk. Basically, the flowchart for making the decision looks like this:



Entity Framework, .Net and SQL Server Table Valued Parameter

This is a step-by-step setup for using a SQL Server TVP (Table-Valued Parameter) in a .NET application with EF (Entity Framework). In this example, I use SQL Server 2016 (SP2-CU3), .NET 4.5.1 and EF 6.2.0.

1. Create a table to store data.

CREATE TABLE [dbo].[Something] (
    [Id]            INT IDENTITY(1,1)   NOT NULL,
    [Name]          VARCHAR(150)        NOT NULL,
    [Status]        CHAR(1)             NOT NULL
);

2. Create a `User Defined Table Type` in SQL Server. For simplicity, in this example the type’s columns are the same as those of the table created in step 1 (minus the identity column). In the real world, the type’s columns could be significantly different from the table where we store the data; the type might even be used to join with other tables.

CREATE TYPE [dbo].[udt_Something] AS TABLE (
    [Name]      VARCHAR(150)    NOT NULL,
    [Status]    CHAR(1)         NOT NULL
);

3. Create a stored procedure that takes a parameter (of the `User Defined Table Type` we created earlier) and performs the necessary work to persist our data.

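CREATE PROCEDURE [dbo].[sp_SaveSomething]
    @udt_Something [dbo].[udt_Something] READONLY
AS
BEGIN
    -- List the columns explicitly so the insert does not rely on SELECT *
    INSERT INTO [dbo].[Something] ([Name], [Status])
        SELECT [Name], [Status]
        FROM @udt_Something;
END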

4. Create an extension method to convert an `IEnumerable<T>` object to a `DataTable` object. To use a SQL TVP, we have to pass our parameter as a `DataTable`, and this method converts our data to that type.

using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

namespace QC
{
    public static class Helper
    {
        public static DataTable ToDataTable<T>(this IEnumerable<T> enumerable, IEnumerable<string> orderedColumnNames)
        {
            var dataTable = new DataTable();

            // Get all public, readable instance properties of the object
            PropertyInfo[] properties = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);
            PropertyInfo[] readableProperties = properties.Where(w => w.CanRead).ToArray();

            // Use the caller's column order, or fall back to every readable property
            var columnNames = (orderedColumnNames ?? readableProperties.Select(s => s.Name)).ToArray();

            // Add columns to data table
            foreach (string name in columnNames)
            {
                dataTable.Columns.Add(name, readableProperties.Single(s => s.Name.Equals(name)).PropertyType);
            }

            // Add rows to data table from object
            foreach (T obj in enumerable)
            {
                dataTable.Rows.Add(columnNames.Select(s => readableProperties.Single(s2 => s2.Name.Equals(s)).GetValue(obj)).ToArray());
            }

            return dataTable;
        }
    }
}

5. For the purpose of this example, let’s say we want to save a collection of objects. This is our object definition.

namespace QC
{
    public class Something
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Status { get; set; }
    }
}

6. Using EF, call the stored procedure we created and pass in a `SqlParameter` whose value is the collection of objects converted to a `DataTable`. Don’t forget to set the parameter’s type name to the `User Defined Table Type`.

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;

namespace QC
{
    public class DataAccess
    {
        public void Save(IEnumerable<Something> data)
        {
            // Columns for ordering, the order of the columns must be the same as user defined table type
            var orderedCols = new[] { "Name", "Status" };

            // SQL parameter to pass to stored procedure
            var param = new SqlParameter("@udt_Something", SqlDbType.Structured);
            param.Value = data.ToDataTable(orderedCols);
            param.TypeName = "dbo.udt_Something";

            // QCDB is our EF entities
            using (var db = new QCDB())
            {
                // Call stored procedure and pass in table valued parameter
                db.Database.ExecuteSqlCommand("EXEC dbo.sp_SaveSomething @udt_Something", param);
            }
        }
    }
}

7. Example of usage.

using System.Collections.Generic;
using System.Web.Http;

namespace QC
{
    public class OpsController : ApiController
    {
        public void SaveSomething()
        {
            // Id is not sent to the TVP; the table generates it as an identity column
            var data = new List<Something>();
            data.Add(new Something { Id = 1, Name = "Chap", Status = "A" });
            data.Add(new Something { Id = 2, Name = "Stick", Status = "D" });

            var dataAccess = new DataAccess();
            dataAccess.Save(data);
        }
    }
}
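
The comment in step 6 notes that the column order of the `DataTable` must match the user-defined table type. The sketch below shows one way to check that before calling the stored procedure. `EnsureColumnOrder` is a hypothetical helper, not part of the original code or of any library, and it only assumes the `Name`, `Status` order of `dbo.udt_Something` from step 2.

```csharp
using System;
using System.Data;
using System.Linq;

public static class TvpColumnOrderCheck
{
    // Hypothetical guard: throws when the DataTable columns are not in the
    // same order as the columns of the user-defined table type.
    public static void EnsureColumnOrder(DataTable table, params string[] expectedColumns)
    {
        string[] actual = table.Columns
            .Cast<DataColumn>()
            .Select(c => c.ColumnName)
            .ToArray();

        if (!actual.SequenceEqual(expectedColumns, StringComparer.OrdinalIgnoreCase))
        {
            throw new InvalidOperationException(
                "Expected columns [" + string.Join(", ", expectedColumns) +
                "] but found [" + string.Join(", ", actual) + "].");
        }
    }

    public static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Status", typeof(string));
        table.Rows.Add("Chap", "A");

        // Passes: same order as dbo.udt_Something (Name, Status).
        EnsureColumnOrder(table, "Name", "Status");

        try
        {
            // Throws: the expected order is reversed.
            EnsureColumnOrder(table, "Status", "Name");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

A guard like this could run on the result of `ToDataTable` inside `Save`, so that a renamed or reordered property fails fast in .NET instead of writing values into the wrong columns.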

Extract (Create) SQL Data-Tier Application

SQL Data-Tier allows database administrators to package SQL Server objects into a portable artifact.
There are 2 types:
1. DACPAC – contains the database schema only.
2. BACPAC – contains the database schema and data.

You can create a Data-Tier application in SSMS (SQL Server Management Studio) or Visual Studio Data Tools. I prefer Visual Studio Data Tools as it gives me more options to configure.

To extract a SQL Data-Tier application from a database in Visual Studio:
1. Open SQL Server Object Explorer.

2. Right-click the database you want to extract (assuming you have configured the database connection).
3. Select “Extract Data-Tier Application…”

4. The Extract Data-tier Application dialog is displayed.

Here you can configure different options. Some to consider:

  • Include user login mappings – whether to include objects under Security (users, roles, etc.).
  • Verify extraction – whether Visual Studio verifies the references in every object. For example, if you have a view that joins a table from another database, VS will verify the connection to the other database.

To deploy a Data-Tier application to the database engine (in VS):
1. Right click on “Databases”
2. Select “Publish Data-Tier Application…”

What is SQL Server “Included Columns”?

SQL Server organizes the indexes on a table as B-tree structures, which look like this.


A table that has a `clustered index` stores the actual data rows at the leaf level of that index.
A `non-clustered index` stores (at leaf level) only the values of the indexed columns and a `row locator` that points to the actual data rows. It does not store the data rows themselves.

Specifying Included Columns in a `non-clustered index` tells SQL Server to also store the values of those columns at the leaf level, so a query that only needs those columns can be answered from the index without following the row locator.

For example:

-- The index name is illustrative
CREATE NONCLUSTERED INDEX IX_Employee_EmployeeFirstName
ON Employee(EmployeeFirstName)
INCLUDE (EmployeeAddress, EmployeeDOB)

This example creates a `non-clustered index` whose leaf level stores the values of the indexed column (EmployeeFirstName) together with the values of the EmployeeAddress and EmployeeDOB columns.
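
The difference between following a row locator and reading an included column can be pictured with a toy in-memory model. This is only a sketch under loud assumptions: dictionaries stand in for the B-trees, first names are treated as unique, and every class and method name here is made up for illustration. It says nothing about how SQL Server implements indexes internally.

```csharp
using System;
using System.Collections.Generic;

public sealed class Employee
{
    public int EmployeeId { get; set; }
    public string EmployeeFirstName { get; set; }
    public string EmployeeAddress { get; set; }
    public DateTime EmployeeDOB { get; set; }
}

public static class IncludedColumnsSketch
{
    // Stand-in for the clustered index: the key leads to the full data row.
    static readonly Dictionary<int, Employee> Clustered = new Dictionary<int, Employee>();

    // Stand-in for a plain non-clustered index: key value plus a row locator.
    static readonly Dictionary<string, int> PlainIndex = new Dictionary<string, int>();

    // Stand-in for the index with INCLUDE: key value plus copies of the included columns.
    static readonly Dictionary<string, Tuple<string, DateTime>> CoveringIndex =
        new Dictionary<string, Tuple<string, DateTime>>();

    public static void Add(Employee e)
    {
        Clustered[e.EmployeeId] = e;
        PlainIndex[e.EmployeeFirstName] = e.EmployeeId;
        CoveringIndex[e.EmployeeFirstName] = Tuple.Create(e.EmployeeAddress, e.EmployeeDOB);
    }

    // Two steps: find the row locator in the index, then fetch the data row.
    public static string AddressViaPlainIndex(string firstName)
    {
        int rowLocator = PlainIndex[firstName];
        return Clustered[rowLocator].EmployeeAddress;
    }

    // One step: the index entry already carries the address.
    public static string AddressViaCoveringIndex(string firstName)
    {
        return CoveringIndex[firstName].Item1;
    }

    public static void Main()
    {
        Add(new Employee
        {
            EmployeeId = 1,
            EmployeeFirstName = "Ada",
            EmployeeAddress = "1 Main St",
            EmployeeDOB = new DateTime(1990, 1, 1)
        });

        Console.WriteLine(AddressViaPlainIndex("Ada"));
        Console.WriteLine(AddressViaCoveringIndex("Ada"));
    }
}
```

Both calls return the same address; the second one never touches the data rows, which is the saving that included columns aim for. The trade-off visible even in this toy model is that the included values are stored twice and must be updated in both places.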

SQL Server Configuration Manager: “Cannot connect to WMI provider. You do not have permission”

This is caused by uninstalling one version of SQL Server and then installing a different version.

See this for fixes.

Note: Run command prompt as Administrator.

Fix Linq Generate ‘ESCAPE’ Keyword on Contains/StartsWith/EndsWith Predicate and Cause Performance Hit

When using LINQ to Entities (or LINQ to SQL) with these predicates, LINQ generates an ‘ESCAPE’ keyword in the SQL statement, which can cause a performance hit.

This code:

public IEnumerable<User> GetUser(string filter)
{
    using (var db = new SomeEntities())
    {
        var result = db.User
            .Where(u => u.DisplayName.Contains(filter));
        return result.ToList();
    }
}

Will generate this SQL statement:

SELECT [Extent1].[DisplayName] AS [DisplayName]
FROM   (SELECT [User].[DisplayName] AS [DisplayName]
        FROM   [dbo].[User] AS [User]) AS [Extent1]
WHERE  [Extent1].[DisplayName] LIKE '%garlan%' /* @p__linq__0 */ ESCAPE '~'

The `ESCAPE ‘~’` could potentially cause performance degradation.

The solution is to pass a constant into the Contains predicate, which is easier said than done.
If we have a method like `GetUser` above, which takes a ‘filter’ parameter, the value cannot simply be converted to a constant.
To overcome this, we have to create a method that returns a predicate with the constant embedded.

This solution is proposed here.

In summary, create an extension class.

using System;
using System.Linq.Expressions;

namespace Blahblahblah
{
    public static class PredicateConstantCreator
    {
        // Replaces the second lambda parameter with a constant, leaving a one-parameter predicate
        public static Expression<Func<TElement, bool>> EmbedConstant<TElement, TConstant>(
            this Expression<Func<TElement, TConstant, bool>> expression, TConstant constant)
        {
            var body = expression.Body.Replace(expression.Parameters[1], Expression.Constant(constant, typeof(TConstant)));

            return Expression.Lambda<Func<TElement, bool>>(body, expression.Parameters[0]);
        }

        private static Expression Replace(this Expression expression, Expression searchEx, Expression replaceEx)
        {
            return new ReplaceVisitor(searchEx, replaceEx).Visit(expression);
        }

        internal class ReplaceVisitor : ExpressionVisitor
        {
            private readonly Expression from;
            private readonly Expression to;

            public ReplaceVisitor(Expression from, Expression to)
            {
                this.from = from;
                this.to = to;
            }

            public override Expression Visit(Expression node)
            {
                // Swap the searched node for its replacement; otherwise keep walking the tree
                return node == this.from ? this.to : base.Visit(node);
            }
        }
    }
}

And use it like this:

Expression<Func<User, string, bool>> predicate = (item, filterTerm) => item.DisplayName.Contains(filterTerm);
var result = db.User
    .Where(predicate.EmbedConstant(filter))
    .ToList();

LINQ will then generate a SQL statement like this:

SELECT [Extent1].[DisplayName] AS [DisplayName]
FROM   (SELECT [User].[DisplayName] AS [DisplayName]
        FROM   [dbo].[User] AS [User]) AS [Extent1]
WHERE  [Extent1].[DisplayName] LIKE '%garlan%' /* @p__linq__0 */

An alternative solution is to add a function to the generated XML file in the EDMX. That solution is laid out here.