Improve the performance of T-SQL statements that reference encrypted columns.

We can use database-level encryption to protect sensitive information. A common approach is to encrypt only the columns that contain sensitive data, such as credit card numbers or dates of birth. Encrypting and decrypting data in the database is very CPU-intensive. If an encrypted column is used as a primary key or appears in comparison clauses (WHERE clauses, JOIN conditions), the database has to decrypt the whole column to perform those operations.

We faced a similar performance issue at a previous company. SELECTs based on SSN were very slow because the database had to decrypt the whole column to perform the comparison. If you look at the information below (under Current Problem), the database performed 4413 logical reads and 4406 read-ahead reads, and took 302 milliseconds to perform the operation.

To improve the performance of these queries, we decided to store the last four digits of the SSN in clear text and index that column with a non-clustered index. Filtering the rows first on the last four digits of the SSN and then on the full SSN led to a dramatic improvement in performance. If you look at the information below (under Improved Solution), the database performed 27 logical reads, 10 physical reads and 8 read-ahead reads, and took 71 milliseconds to perform the operation.
All in all, a huge performance improvement.
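The supporting schema change can be sketched roughly as follows. The table and column names mirror the queries below; the backfill step is an assumption about how the clear-text column was populated:

```sql
-- Add a clear-text column for the last four digits of the SSN.
-- (Table and column names are assumed to match the queries below.)
ALTER TABLE PersonalIdentifierInformation
    ADD SSNLast4 CHAR(4) NULL;

-- One-time backfill from the decrypted value; RTRIM strips the
-- padding that CONVERT(CHAR(32), ...) adds.
UPDATE PersonalIdentifierInformation
SET SSNLast4 = RIGHT(RTRIM(CONVERT(CHAR(32),
        DECRYPTBYKEYAUTOCERT(CERT_ID('CertificateName'), NULL, Identifier))), 4);

-- Non-clustered index so the optimizer can seek on the clear-text
-- suffix and decrypt only the handful of matching rows.
CREATE NONCLUSTERED INDEX IX_PersonalIdentifierInformation_SSNLast4
    ON PersonalIdentifierInformation (SSNLast4);
```

With this in place, the query can filter on SSNLast4 (an index seek) before paying the decryption cost on the few remaining rows.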

Current Problem:

begin
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

declare @ssn varchar(max) = '123123123'
with PersonalIdentifierInformationCTE as
(
select CONVERT (CHAR(32), DECRYPTBYKEYAUTOCERT(CERT_ID('CertificateName'), NULL, Identifier)) as Identifier
from PersonalIdentifierInformation
)
select *
from PersonalIdentifierInformationCTE where Identifier is not null
and Identifier=@ssn
end
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(8 row(s) affected)
Table 'PersonalIdentifierInformation'. Scan count 1, logical reads 4413, 
physical reads 3, read-ahead reads 4406, lob logical reads 0, 
lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 250 ms, elapsed time = 302 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

Improved Solution:

begin
declare @ssn varchar(max) = '123123123'

DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

with PersonalIdentifierInformationCTE as 
(
select CONVERT (CHAR(32), DECRYPTBYKEYAUTOCERT(CERT_ID('CertificateName'), NULL, Identifier)) as Identifier, SSNLast4
from PersonalIdentifierInformation
)
select *
from PersonalIdentifierInformationCTE 
where SSNLast4 <> 0 and SSNLast4 = RIGHT(@ssn, 4)
and Identifier is not null
and Identifier = @ssn
end
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 13 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 2 ms.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
 SQL Server Execution Times:
   CPU time = 16 ms,  elapsed time = 16 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
    (8 row(s) affected)
Table 'PersonalIdentifierInformation'. Scan count 1, logical reads 27, 
physical reads 10, read-ahead reads 8, lob logical reads 0, 
lob physical reads 0, lob read-ahead reads 0.
    (1 row(s) affected)
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 71 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

Continue Reading

Use ProtoBuf.Net to improve serialization performance

Our application relied on data that did not change often. It therefore cached this data as XML files, written to the web server on the first request and read back on subsequent requests.
Some of the problems that arose from this were:
a. Since the application was load balanced, the web server that serviced one request was not guaranteed to serve the next. This led to the data on the various web servers quickly getting out of sync.
b. Our application was very slow. Profiling showed that reading and writing (i.e. serializing and de-serializing) the XML cache files was one of the biggest bottlenecks in the application.
XML serialization is one of the slowest serialization/de-serialization formats in the .Net framework. See the excellent article where Maxim Novak compares the various types of serialization in the .Net framework.

We therefore decided to change the serialization format from XML to protobuf-net. Protobuf-net is a contract-based serializer for .Net code that writes data in the “protocol buffers” serialization format engineered by Google. The library is written by Marc Gravell, a well-known .Net expert.
We also decided to write the cache files to a single file server and to generate them on a background thread. I will cover that in a separate article.
Changing the application over to protobuf-net was pretty straightforward.
a. Use NuGet to install the protobuf-net package.

Install-Package protobuf-net

b. Add a [ProtoContract] attribute to each class that needs to be serialized, and a [ProtoMember] attribute with a unique integer to identify each member that should be serialized.

using ProtoBuf;

[ProtoContract]
class Person 
{
    [ProtoMember(1)]
    public int Id { get; set; }
    [ProtoMember(2)]
    public string Name { get; set; }
}

c. Serialize and deserialize your data

// Serialize to a binary file.
using (var file = File.Create("person.bin")) 
{
    Serializer.Serialize(file, person);
}

// Deserialize it back.
Person person;
using (var file = File.OpenRead("person.bin")) 
{
    person = Serializer.Deserialize<Person>(file);
}

As you can see from the charts below, the combination of
a. switching from XML to protobuf-net serialization,
b. caching the files on a single file server instead of on each web server, and
c. offloading the generation of the cache files to a background thread
was a huge success.

ProtoBuf.NetPerformanceImprovement-Part1

 

ProtoBuf.NetPerformanceImprovement-Part2

Using ProtoBuf.Net serialization: the .Net profiler shows that the “hot spots” due to serialization and de-serialization are much reduced. File sizes are also much smaller.

ProtoBufNetSerialization1

ProtoBufNetSerialization2

Using XML serialization: the .Net profiler shows high CPU usage, i.e. “hot spots”, due to XML serialization and de-serialization. File sizes are also much larger.

XMLSerialization1

XMLSerialization2

Continue Reading

Tips to increase database performance

Here are some of the points that you should consider when you write queries:

  • Limit the number of rows and columns returned.
  • Use sargable (“search-argument-able”) conditions in the “where” clause. They help the query optimizer use the index defined on the column(s) effectively, and they have a higher chance of producing an index seek rather than an index or table scan.
    Sargable operators are:
    =, >, >=, <, <=, BETWEEN, LIKE (only those LIKE conditions that have a wildcard character as a suffix, e.g. FirstName LIKE 'A%')
  • Do not use arithmetic operators directly on the “column name” in the “where” clause. Instead, move the arithmetic to the other side of the comparison to keep the condition sargable. e.g.:
    instead of

    WHERE HourlyRate * 8 <= 100

    use

    WHERE HourlyRate <= 100.0 / 8

    (note the 100.0: plain 100/8 would be evaluated with integer division and give 12 instead of 12.5)
  • Do not use functions on columns in the “where” clause. e.g.:
    a. Instead of

    WHERE DATEPART(YYYY, CreateDate) = 2016

    use

    WHERE CreateDate >= '2016-01-01' AND CreateDate < '2017-01-01'

    b. Instead of

    WHERE Left(LastName, 1) = 'A'

    use

    WHERE LastName LIKE 'A%'
  • Use proper primary key and foreign key constraints. They help the query optimizer select the best-suited execution plan for the query.
Here are some of the points that you should consider when designing indexes:

  • Try to keep to at most 4-5 indexes per table. If the table is read-only, the number of indexes may be higher.
  • Keep your indexes as narrow as possible. This reduces the size of the index and reduces the number of reads required to read the index.
  • Try to create indexes on columns that have integer values rather than character values.
  • If you create a composite (multi-column) index, the order of the columns in the key is very important. Order the columns to maximize selectivity, with the most selective columns leftmost in the key.
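To illustrate the last point, here is a hypothetical sketch (the Orders table and index name are made up for illustration):

```sql
-- Hypothetical table: queries filter on CustomerId (highly selective)
-- and then restrict by OrderDate, so CustomerId goes leftmost in the key.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_OrderDate
    ON Orders (CustomerId, OrderDate);

-- This index supports an index seek for:
--   WHERE CustomerId = @id AND OrderDate >= @from
-- An index keyed (OrderDate, CustomerId) could not seek on CustomerId alone.
```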
Continue Reading

ASP.Net session size in SQL Server session database

Problem:
We need to calculate the ASP.Net session size in a SQL Server session database.

Solution:

Use the following query (shamelessly stolen from asp.net: checking session size in SQL Server ASPState DB) to check the size of the session state data.

SELECT 
    a.SessionId, 
    SUBSTRING(a.SessionId, 25, 8) AS AppIDHex, 
    b.AppId AS AppIDDec, 
    b.AppName, 
    DATALENGTH(a.SessionItemLong) AS SessionSize 
FROM 
    dbo.ASPStateTempSessions AS a 
    LEFT OUTER JOIN 
    dbo.ASPStateTempApplications AS b 
    ON 
    SUBSTRING(a.SessionId, 25, 8) = 
    SUBSTRING(sys.fn_varbintohexstr(CONVERT(VarBinary, b.AppId)), 3, 8) 
WHERE 
    (DATALENGTH(a.SessionItemLong) > 0) 
ORDER BY SessionSize DESC
    
Continue Reading

Performance improvement for a very important client

This data is for all of the clients that use our application. Based on the data below, there is a significant improvement in application performance in the latest release, which took place in March 2016. Please see the highlighted area below.

March2016PerformanceImprovement

One of our biggest clients was threatening to leave us for another provider. Its users were experiencing very long load times (more than a minute in some cases), timeouts, etc. I profiled the application using JetBrains dotTrace. All the hotspots in the profiler trace pointed to a very complex stored procedure being called again and again. Since the stored procedure was very complex and used by several other applications, we decided not to change it. Instead, we stored the results of this stored procedure in session state, which led to the dramatic performance improvement shown above.

using System.Web;

public class RentableItemsStrategy
{
    public RentableTypes Get(int unitId, string dateNeeded, int leaseId)
    {
        // Session state is unavailable outside a web request.
        if (HttpContext.Current == null || HttpContext.Current.Session == null)
        {
            return null;
        }

        // Return the cached result if we have already fetched it this session.
        var key = Key(unitId, dateNeeded, leaseId);
        var cached = HttpContext.Current.Session[key] as RentableTypes;
        if (cached != null)
        {
            return cached;
        }

        // Cache miss: call the expensive stored procedure once and cache the result.
        var xmlNode = new ServiceAgent.ResidentSvc().GetRentableItems(unitId, dateNeeded, leaseId);
        var rentableTypes = RentableTypes.FromXml(xmlNode);
        HttpContext.Current.Session[key] = rentableTypes;
        return rentableTypes;
    }

    private static string Key(int unitId, string dateNeeded, int leaseId)
    {
        return string.Format("RentableTypes-{0}-{1}-{2}", unitId, dateNeeded, leaseId);
    }
}
    
Continue Reading