
DYNAMICALLY RECLAIM SPACE FROM THE LARGEST NDFs IN A DATABASE

A quick Google search will reveal several articles recommending that you do not shrink your database files. At a high level, this is because the space you are freeing up is likely to be needed again in the future, the shrink causes heavy index fragmentation, and the cost of running the operation is usually not worth the gain. You can read more about shrinking databases here and here. Adding complexity to this is that shrinking database files which contain LOB data can take a very long time.
There are, however, fringe cases where it makes sense to shrink, as described below. To address these, I have developed a process to shrink database files automatically and dynamically, based on the largest files in a database with the most free space.

The approach is suitable where the following conditions exist:

* Large amount of space available for reclamation
* Predicted future DB size is significantly less than current size
* Large number of data files exist and free space amount is evolving
* Data files being shrunk do not contain LOB data
* A data archiving routine is in place
* A regular maintenance window is available

Note that before developing the method below, I looked at other options for reclaiming database space, such as rebuilding the largest tables. I considered two methods:

1) Rebuilding the clustered index on a new filegroup using the CREATE INDEX… WITH DROP_EXISTING clause. This works well but does not move LOB data from the old filegroup. The LOB data filegroup can only be specified at the initial table creation stage using the TEXTIMAGE_ON syntax.

2) Making a copy of the table that we can populate with data behind the scenes, then performing a sync of the data, an object rename, and a drop of the old table, all in a serialized transaction. This is my preferred method; however, it introduces a small amount of risk along with some administrative overhead. The benefit is that we can switch over to the new, smaller table with minimal impact to the application, because everything is performed side by side. This is likely to be your only option if you are reclaiming large amounts of LOB data and do not have the luxury of an outage window. It is also useful for making table-level DDL changes on a 24/7 system.

Below is the automation script. It utilizes two SQL agent jobs to achieve asynchronous execution, and an allowable run day/time can be configured to prevent unplanned performance impacts against your database. The first SQL agent job runs the shrink operation. The second SQL agent job is kicked off automatically and will monitor and kill the shrink process when your configurable window is finished. Take some time to familiarize yourself with the script as it needs to be configured for your environment, and don’t forget to make sure you are rebuilding indexes after you shrink.

ShrinkData
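The attachment above contains the full implementation; as a very rough sketch of the core step it automates (object names are generic, and the run-window, monitoring, and kill logic are omitted), the idea is to pick the data file with the most free space and shrink it:

-- Illustrative sketch only: shrink the data file with the most free space
-- in the current database. The attached script adds the agent jobs,
-- run-window checks, and kill/monitor logic described above.
DECLARE @FileName sysname,
        @Sql      nvarchar(max);

SELECT TOP (1) @FileName = name
FROM sys.database_files
WHERE type_desc = 'ROWS'                                -- data files only, skip the log
ORDER BY size - FILEPROPERTY(name, 'SpaceUsed') DESC;   -- most free pages first

SET @Sql = N'DBCC SHRINKFILE (' + QUOTENAME(@FileName) + N', 0);';   -- 0 = release all free space
EXEC sp_executesql @Sql;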

Here is a script which covers the creation of both SQL agent jobs and the shrink process all in one. This illustrates the end-to-end process in a little more detail. For example, the agent job step 'on failure' workflows required a bit of care to handle the killed process.

ShrinkData(wSQLJobs)

If you want to take things further, you could modify the shrink command so that instead of targeting zero (which means release all available free space), you target a specific size that leaves only a few GB of free space, or a percentage. This allows you to see results sooner, because if you target zero, no space will be released until the entire operation has completed.
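For example, a shrink that leaves some headroom rather than releasing everything might look like the line below (the file name and target size are made up; the target is specified in MB):

-- Hypothetical example: leave the file at roughly 45 GB instead of shrinking to 0.
DBCC SHRINKFILE (N'MyDatabase_Data2', 46080);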

Finally, I have provided some example scripts which show the other two methods that were mentioned.

RecreateTableMethods

Enjoy and let me know if you have any questions / feedback.

Check data distribution in partitioned table

In an ideal world, all partitioned tables would maintain themselves through a sliding window implementation, archiving process, or other automated routine which performs a split/switch/merge of partitions as-needed. Occasionally however, you may come across a partitioned table which requires manual intervention.

This may happen when the table has outgrown the predefined partition ranges, or when the usage type/profile of the table has changed over time. Below is a procedure I developed to provide an overview of the partitions within your table.

The script finds the partitioning key column, and then returns the median, standard deviation, and distribution of data within the partitions. It also suggests where action may be required!

From here you can determine your course of action; this might be a new partition, a revised and more suitable function/scheme, or something automated.

Note that this particular stored procedure is designed to look for an even distribution across partition ranges. This is because I was targeting a rapidly growing OLTP table with high read/write concurrency, which is not being archived.

If your table has a different usage pattern, then feel free to go ahead and modify the procedure to meet your own needs.


/* ---------- Start of Script ---------- */

if object_id('dbo.CheckPartitioning') is not null
drop procedure dbo.CheckPartitioning
go

create procedure dbo.CheckPartitioning

@Partitioned_Table nvarchar(255)

AS

/* declare variables */
declare @Partitioning_Key nvarchar(255),
@Partitioned_RowCount bigint,
@Partition_Range bigint,
@Sql nvarchar(max),
@Retval nvarchar(255),
@StdDev bigint,
@Median bigint

/* get partitioning key column. */
select @Partitioning_Key = sc.name from sys.tables st INNER JOIN sys.indexes si on si.object_id = st.object_id
INNER JOIN sys.index_columns ic on ic.index_id = si.index_id AND ic.object_id = st.object_id
INNER JOIN sys.columns sc on sc.object_id = ic.object_id AND sc.column_id = ic.column_id
where st.object_id = object_id(@Partitioned_Table) AND ic.partition_ordinal = 1

/* get max value from partitioned table */
set @Retval = N'@RetValOUT bigint OUTPUT';
set @Sql = 'select @RetValOUT = (select top 1 ' + @Partitioning_Key + ' from ' + @Partitioned_Table + ' with (nolock) order by 1 desc)'
exec sp_executesql @Sql, @Retval, @retvalOUT=@Partitioned_RowCount output;

/* get max range specified for the partition function */
set @Partition_Range = convert(bigint,(select top 1 pv.[value] from sys.partition_range_values pv inner join sys.partition_functions pf on pv.function_id = pf.function_id
inner join sys.partition_schemes ps on pf.function_id = ps.function_id
inner join sys.indexes si on si.data_space_id = ps.data_space_id
where si.object_id = object_id(@Partitioned_Table) order by pv.boundary_id desc))

/* determine if action is required */
select @Partitioned_Table as 'table', @Partition_Range as 'last range in partition function', @Partitioned_RowCount as 'number of rows', (@Partition_Range-@Partitioned_RowCount) as 'rows until last range reached',
case when @Partitioned_RowCount < @Partition_Range then 'no action required' else 'rows are in the rightmost range, a new filegroup and boundary point will be needed.' end as 'result'

/* grab the spread of rows within each partition */
if object_id('tempdb.dbo.##temp_partition_data') is not null
drop table ##temp_partition_data
select distinct p.partition_number, fg.name, p.rows
into ##temp_partition_data
from sys.partitions p
INNER JOIN sys.allocation_units au on au.container_id = p.hobt_id
INNER JOIN sys.filegroups fg on fg.data_space_id = au.data_space_id
where p.object_id = OBJECT_ID(@Partitioned_Table)

/* report on whether variation is greater than 10% of the median */
SET @StdDev = (
select stdev(rows) [stddev]
from ##temp_partition_data where rows > 0)

SET @Median = (
select avg(rows) [avgrows]
from ( select rows,
row_number() over( order by rows asc, partition_number asc) as rowsAsc,
row_number() over( order by rows desc, partition_number desc) as rowsDesc
from ##temp_partition_data where rows > 0
) x
where rowsAsc in (rowsDesc, rowsDesc - 1, rowsDesc + 1))

select @Median as 'median rowcount in partition', @StdDev as 'standard deviation'

if @StdDev > @Median / 10
begin
select 'The deviation of rows within each partition is greater than 10 percent of the median. Suggest reviewing the partition function ranges.' as [status]
select * from ##temp_partition_data
end
else
begin
select * from ##temp_partition_data
end

/* cleanup */
if object_id('tempdb.dbo.##temp_partition_data') is not null
drop table ##temp_partition_data

GO

/* ---------- End of Script ---------- */

Finally, a quick acknowledgment to Adam Machanic, whose post I referenced here to efficiently return the median value in this query.

If you have any questions or feedback, don't hesitate to send me an email, or reach out on Twitter.

Dynamic Snapshot Creation Script

Recently I was testing a database release against a VLDB, and I needed the ability to revert the database to its original state, without restoring from backup. So the obvious approach here is to use SQL Server’s Database Snapshot technology, right!?

#SQLFunFact – Database snapshot creation has no GUI in SQL Server and needs to be done through T-SQL. No worries, the code looks like this…

CREATE DATABASE db_snapshot_name
ON
(
NAME = logical_file_name, FILENAME = 'os_file_name'
) [ ,…n ]
AS SNAPSHOT OF source_db_name

In order to complete this script, you will need 3 things.

1) A unique name to give your snapshot which identifies it as being a snapshot, and optionally a date/time stamp to indicate when it was created.

2) A list of each logical filename from your source database, and a newly allocated physical filename. (This will be the SPARSE file that changed extents will be written to.) This can be retrieved from sys.master_files

3) A decision on which volume(s) will host your sparse files. Consider the following: how much data is going to change while the snapshot is active, and which drive has enough space to accommodate this new file growth? Additionally, you may want to consider the contention (and therefore performance) effects of placing your sparse file on a drive that hosts your T-Log, TempDB, or heavily used filegroups.

Now, some databases can have a LOT of files underneath… and these can be going to any number of underlying volumes, too! You don’t necessarily want to be typing all of these out…

So with that in mind, I knocked up a script which will (for a given DB) retrieve the underlying data file information and augment it in such a way that a 'create database snapshot' statement is generated. For simplicity's sake, I designed it so that the sparse file will be hosted in the same volume and directory as its NDF counterpart.

The benefit of this is that an administrator should already have an idea of what the growth patterns are on each volume, so the snapshot size against the underlying volumes would be somewhat predictable.
You will find the script below. Simply add in your DB name, set your query result output to text, and then execute against your target server.

The output from the query will be a script which you can run in another window to create your snapshot!

One of the things I did with this script was to store the dynamic query definition in a table variable instead of the typical VARCHAR(MAX).
Adding an identity column to the table variable definition gives us the ability to iterate through the dynamically generated query, effectively giving each line a number which can be referenced as a unique row.

Below is the query definition. If you have a lot of data files this may be of some use.

Alternatively, you can take the script and modify it further to suit your own environment.

/* --------------- CREATE SNAPSHOT SCRIPT --------------- */

DECLARE @dbname varchar(255)
SET @dbname = 'MyDatabase' --<<<<<< SET SOURCE DATABASE NAME HERE >>>>>>

SET QUOTED_IDENTIFIER ON
SET NOCOUNT ON

DECLARE @SnapshotSuffix varchar(50)
DECLARE @sql table (id int identity(1,1), data varchar(max))
DECLARE @FileIDExceptionList table (id int)

SET @SnapshotSuffix = '_snapshot_' + CONVERT(varchar(12),GETDATE(),112)
INSERT INTO @FileIDExceptionList (id) values (2) -- default ldf id = 2 <<<<<< Add additional File Exceptions here >>>>>>

IF EXISTS (SELECT 1 FROM sysfiles WHERE filename like '%ldf%' and fileid not in (SELECT id FROM @FileIDExceptionList))
RAISERROR ('You have non-default MDF or LDF file extensions that will be excluded from this script. Please add them to the file ID exception list.', 16, 1)

IF (SELECT name FROM sys.databases WHERE name = @dbname) IS NULL
RAISERROR ('Cannot find the database specified. Please check that it exists and try again.', 16, 1)
ELSE
BEGIN
INSERT INTO @sql values ('use master'),('GO'),('CREATE DATABASE ' + @dbname + @SnapshotSuffix + ' ON')
INSERT INTO @sql
SELECT '( NAME = ' + name + ' , FILENAME = ''' + REPLACE(physical_name, '.ndf', '.ss') + '''),' FROM sys.master_files
WHERE database_id = DB_ID(@dbname) AND physical_name like '%.ndf%'
UNION
SELECT '( NAME = ' + name + ' , FILENAME = ''' + REPLACE(physical_name, '.mdf', '.ss') + '''),' FROM sys.master_files
WHERE database_id = DB_ID(@dbname) AND physical_name like '%.mdf%'
INSERT INTO @sql values ('AS SNAPSHOT OF ' + @dbname + ';'),('GO')

UPDATE @sql
SET data = REPLACE(data,'),',')') WHERE id in (SELECT MAX(id) - 2 FROM @sql)
SELECT data AS '-- Generated Script --' FROM @sql ORDER BY id
END

/* --------------- End of Script --------------- */

SQL Saturday #365 Speaker Idol!

I delivered my first-ever session at SQL Saturday #365 in Melbourne over the weekend, and I was pleased to find out that I had received the SQL Saturday Speaker Idol accolade with my brief talk on Microsoft Azure!

Overall it was a daunting but humbling experience. I am grateful for the opportunity that the organizers of SQL Saturday have afforded me, and I look forward to doing a full length presentation in the future.

A big shout out to the other first-time speakers and, of course, the regular presenters, who do a fantastic job of not only entertaining but also educating the attendees.

This was my submission:

“SQL Server 2012 gave us some extra cloud integration when it was shipped, and SQL Server 2014 has expanded on these features. This technology will slowly but surely become more integrated into the life of a DBA as our IT landscape evolves in support of cloud-based computing. In this brief session we look at the deployment of a simple backend solution to Azure, as well as how to package and deploy database changes from the comfort of your SSMS client.”

Thank you to all of the organizers and sponsors of SQL Saturday for what is always such a great event!

Don't know what SQL Saturdays are? Read this: https://www.sqlsaturday.com/

Synchronous DB Mirroring and running in an ‘exposed’ state

A few people I have spoken to as of late have been quite surprised to find that their ‘zero data loss’ Synchronous Mirroring solution can actually allow data loss!

In fact, data loss is ALWAYS a possibility, regardless of your budget and HA solution in place. It’s just that if you spend enough money it is very, very, unlikely.

I fear that some businesses are running under the assumption that they will not have any data loss if they have a Synchronous DB Mirroring solution in place. So I want to demonstrate here that it is possible for Mirroring to open up a window of vulnerability during which it is running exposed and you are no longer safeguarded against data loss.

Consider the following example:

A basic Synchronous DB Mirroring configuration from Site A to Site B, with manual failover. This scenario does NOT include the use of a WITNESS server, and was previously known as “High Protection” mode.

This could be represented by the following simple diagram.

Mirroring

Now let's take a brief refresher on Synchronous Mirroring so that we can set the stage… when an application submits a transaction to the database in Synchronous mode, the transaction is hardened to the transaction log at the Mirror, and committed at the Principal, before the ACK is finally given back to the application. Therefore we have two copies of our data hardened to the transaction log on two different servers, at two different sites.

This obviously comes at the cost of latency, so a DB Mirror in Synchronous mode should not have a large geographical separation from its Principal.

So let’s look at a couple of potential disaster scenarios:

  1. The server hosting the Principal database is lost. The application starts throwing errors and users cannot connect. You can redirect your application to the Mirror database without a single lost transaction, albeit with some small downtime (less if you use a witness!).
  2. The server hosting the Mirror database is lost (or even just the connection to the Mirror). The application cannot commit at both databases, so it will wait for the period of time specified in the PARTNER_TIMEOUT database Mirroring property (default 10 seconds) before committing any transactions.

At this point we have no scope for data loss within the Mirroring scenario. But what happens after that 10 seconds is where things get interesting…

After the amount of time specified in the DB Mirroring PARTNER_TIMEOUT (10 seconds by default, remember), the Principal database will start accepting transactions again, but it is now effectively operating in an ASYNCHRONOUS state!

It is stockpiling all of the transactions on the Principal database so that it can apply them to the Mirror when it comes back online. Until such time, it is no longer keeping two copies of your data.

You can change your PARTNER_TIMEOUT value using the following TSQL:

Partner_Timeout
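The attachment shows the command; for reference, it boils down to a one-liner run on the principal (the database name here is illustrative, and the value is in seconds):

ALTER DATABASE MyMirroredDB SET PARTNER TIMEOUT 30;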

Do consider, however, that there is generally not a good reason to change this value. For example, if you do not want data loss, then whether you revert to an asynchronous state after 10 seconds or 30 seconds is irrelevant.

Note that a value of less than 5 seconds cannot be specified (this would too easily cause false failovers when using a Witness), and a value greater than 30 seconds is likely to cause clients to receive timeouts.

So what are the implications of PARTNER_TIMEOUT? Well it means that if we lose the Mirror, after 10 seconds we are running ‘exposed’. Or in other words, if we lose the server hosting the Principal database during this time we can no longer guarantee zero data loss.

This makes sense by design: DB Mirroring is a High Availability technology and is designed to keep our service available and online.

But what if the risk of experiencing data loss was catastrophic to the business!? Perhaps to some businesses an outage is the preferred result over running the risk of losing a single transaction…

So when your Mirror is unavailable, after investigating the problem you may decide to:

  • Take no action and allow the Mirror to catch up if, for example, it was a temporary network issue. (Running exposed.)
  • Completely re-establish the Mirroring session to the same or another server if it was completely lost. (Running exposed.)
  • Stop the application until such time as you have another Mirror. (No data loss.)

In most cases it is unlikely that you would stop the application and prevent users from connecting just because of ‘potential’ data loss, but this is something to keep in mind if you are in a shop where the target RPO is zero.

Note that the above is just an example scenario. If you have a REAL need to recover to a point-in-time, then you will want to couple your DB Mirroring with a SAN and other technologies such as Failover Clustering, or AlwaysOn availability groups.

 

Archive Data with Table Partitioning

In my last post, Out in Space, I looked at ways of freeing up some space on your database server in a hurry; this got me thinking about longer-term solutions, like archiving your data.

Let's run through a demo of how this can be achieved with table partitioning. One of the benefits of this method, as well as being very fast, is that moving data off our transactional table is a metadata-only change and will therefore not adversely affect concurrency.

There are lots of ways to implement this, and what you should do depends on your business requirements, so get creative and use a solution that works best for you!

In this example we are concerned with only the last 12 months' worth of data; anything older than that will be moved to another table, and from there it can be archived. I have chosen to create the partitioning key over an integer column storing the month and day only (e.g. mmdd), so that it can be recycled regardless of the year.

To get started, we create the following objects:

1) a database with 13 Filegroups.

2) at least one data file in each Filegroup.

3) a Partition Function and a Partition Scheme.

The script below can be used to create our Partition Function and Scheme. (Note the boundary points and filegroups specified.)

SQL1
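The attached script has the exact definitions; a minimal sketch of what they might look like, assuming RANGE RIGHT with a boundary at the first of each month and illustrative filegroup names (12 boundaries give 13 partitions, one per filegroup):

-- Partition function on the int mmdd key, one boundary per month start.
CREATE PARTITION FUNCTION pf_MonthDay (int)
AS RANGE RIGHT FOR VALUES (101, 201, 301, 401, 501, 601,
                           701, 801, 901, 1001, 1101, 1201);
GO

-- Partition scheme mapping the 13 partitions to the 13 filegroups.
CREATE PARTITION SCHEME ps_MonthDay
AS PARTITION pf_MonthDay
TO (FG01, FG02, FG03, FG04, FG05, FG06, FG07,
    FG08, FG09, FG10, FG11, FG12, FG13);
GO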

Now that we have defined our storage, let’s go ahead and create two tables (one for our transactional data, and one for our archive data.)

SQL2
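Again, the attachment holds the real definitions; a cut-down sketch of the two tables, both created on the same partition scheme and keyed on the mmdd column (table and column names are illustrative):

CREATE TABLE dbo.SalesTransactions
(
    TransactionId bigint NOT NULL,
    ModTimeId     int    NOT NULL,   -- partitioning key in mmdd form
    Amount        money  NOT NULL,
    CONSTRAINT PK_SalesTransactions
        PRIMARY KEY CLUSTERED (TransactionId, ModTimeId)
) ON ps_MonthDay (ModTimeId);

CREATE TABLE dbo.SalesTransactions_Archive
(
    TransactionId bigint NOT NULL,
    ModTimeId     int    NOT NULL,
    Amount        money  NOT NULL,
    CONSTRAINT PK_SalesTransactions_Archive
        PRIMARY KEY CLUSTERED (TransactionId, ModTimeId)
) ON ps_MonthDay (ModTimeId);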

Next, we want to simulate an application, so chuck some data in!

SQL3

Now we have two partitioned tables, one of them populated with data, which is distributed across its partitions evenly using the partitioning key (ModTimeId). Next we will perform the SWITCH operation. But before we do that, let’s verify our data placement.

Execute a select statement on our newly created tables and you will see that we have no data in our archive table.

partition4

You can use the following query to analyze the spread of data in each partition (Note the date ranges within each partition!)

SQL4
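If you do not have the screenshot handy, a query along these lines (using the illustrative table name from the sketches above) returns the row count and upper boundary for each partition:

SELECT  p.partition_number,
        p.rows,
        prv.value AS upper_boundary    -- exclusive upper bound under RANGE RIGHT; NULL for the last partition
FROM    sys.partitions p
JOIN    sys.indexes i
          ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN    sys.partition_schemes ps
          ON ps.data_space_id = i.data_space_id
LEFT JOIN sys.partition_range_values prv
          ON prv.function_id = ps.function_id
         AND prv.boundary_id = p.partition_number
WHERE   p.object_id = OBJECT_ID(N'dbo.SalesTransactions')
  AND   p.index_id IN (0, 1)
ORDER BY p.partition_number;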

partition6

Let’s go ahead and create a procedure that will SWITCH data into the archive table.

SQL5
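The attached script is the full version; a minimal sketch of the idea, again using the illustrative table names from above (the SWITCH requires the target partition to be empty, and real code should add transaction and error handling):

CREATE PROCEDURE dbo.ArchiveOldestPartition
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @PartitionNumber int;

    -- Lowest-numbered partition that still holds rows.
    SELECT TOP (1) @PartitionNumber = p.partition_number
    FROM sys.partitions p
    WHERE p.object_id = OBJECT_ID(N'dbo.SalesTransactions')
      AND p.index_id IN (0, 1)
      AND p.rows > 0
    ORDER BY p.partition_number;

    IF @PartitionNumber IS NOT NULL
        -- Metadata-only move of the whole partition into the archive table.
        ALTER TABLE dbo.SalesTransactions
            SWITCH PARTITION @PartitionNumber
            TO dbo.SalesTransactions_Archive PARTITION @PartitionNumber;
END
GO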

Now, execute this procedure. The results are shown below…

partition8

What you can see is that all the data which was in the oldest partition has been (almost) instantaneously relocated to the archive table! (This is possible because the data itself is not moved; only the metadata about its location is changed.)

Now that we have our archive table populated, we can take the final steps in our process. This could be deleting data from the archive table where the date is older than a given threshold, moving it to a slower (and cheaper) disk system, or moving it to offline/near-line storage.

In terms of completing the solution, you can schedule this proc to run via an SQL agent job once per month, or at your desired frequency. As always, make sure you add some transaction and error handling to ensure your logic is bulletproof and you are moving the correct data.

So there we have a basic example of how you can use table partitioning to archive data. Please note that archiving itself is a broad topic which is beyond the scope of this post. It has many considerations including both business and legal compliance.

Out in Space

Lately, I have come across a few DB servers that have .NDF files approaching 1TB in size, and while this is a far cry from being a VLDB, it does start to become a problem when you are running out of disk space… and in some cases due to the growth rate, you are running out FAST!

Example Scenario: So it’s Friday afternoon, you have been to the pub at lunch to bond with your colleagues, and now you are working on those proactive tasks that will bring recognition, value and glory to the team. Then, all of a sudden you get an alert, and it looks like one of your servers is not going to last the weekend!!! (Disk space, remember?)

A couple questions come to mind: Shouldn’t that alert have notified us earlier? And where was the capacity planning that would have predicted this growth? These will need to be addressed… but first and foremost we need to relieve the issue with disk pressure!

As DBAs, addressing a disk that is filling up fast becomes our issue when we cannot get approval to throw more disk resource at a server. If it is a high-profile or valuable system, then you will usually get your volume expanded. But sometimes the business case will not justify the expenditure, or perhaps the work cannot be carried out in time…

If you can prove that an increase in storage is justifiable, then it’s your job to provide this business case to management and get the ball rolling. Failing that, we are going to have to take action!

This means considering options such as using compression (if you can wear the performance trade-off), purging unneeded data (which is backed up elsewhere right!?), finding alternate storage (another server/UNC, another Data Center, offline storage), or maybe you can reclaim some space by removing unneeded files from the volume. If you need to get slightly more invasive, then we can look into removing some unneeded indexes…

All these options can reclaim some much-needed space, but not all of them are an option every time.

In our example scenario, it turns out that the largest tables were great candidates for implementing Page-Level compression, and in about one hour, we had plenty of space for the weekend to keep us out of trouble and the business happy. The below script can be used to enable Page-Level compression on a table.

Compress Table
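The attachment has the exact statement; in general terms it is the standard rebuild-with-compression syntax (the table name here is illustrative), and you can estimate the savings first:

-- Enable page compression by rebuilding the table (heap or clustered index).
ALTER TABLE dbo.LargeTable
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Optional: estimate the space savings before committing to the rebuild.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'LargeTable',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';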

So there you have it, space is no longer an immediate issue. However, we are going to need a long-term solution… and this got me thinking about ARCHIVING…

A full discussion on archiving is beyond the scope of this post, but I knocked up a quick demo on how we can achieve an elegant solution using the SWITCH statement and some PARTITIONING.

Stay tuned for the next post where this will be demonstrated…

Suitably Certifiable?

Last week, I had the pleasure of sitting my 6th ever SQL exam and I did it in the heart of London city. I don’t think I would even be in London, if I had not done my first… In this post I’ll share the process that I go through taking exams. But first, a few words on certification…

Exams can be a great way to constantly develop your knowledge while keeping up with the latest changes in your chosen field. As a bonus exams will demonstrate value to your employer, and a commitment to your career development. Additionally, a lot of job ads in the market these days will cite certification as a requirement.

But not everyone thinks that exams are the way to go, and they are a controversial topic. For instance there are a lot of database professionals who have made it to the top of their game without having sat any exams. And in fact without experience, exams won’t mean much at all!

Whether you find exams useful or not will be up to you, but exams can provide a lot of value and are a means by which to keep learning. Let’s look at the exam process…

There are currently two paths when pursuing a certification in SQL Server 2012: Data Platform and Business Intelligence. Jump on the MS site and have a look at what best suits your career intentions. http://www.microsoft.com/learning/en-us/sql-certification.aspx

How long you can spend on each exam will depend on how much time you wish to invest and how much you have to learn. I usually spend about 2-3 months on each. From the link above, choose your path and figure out which exam you need to do first.

The next thing you will want to do is gather some resources… I personally rely on e-books, Google, BOL and just using the software itself. I recommend MS Press training material if available as I find it covers the material well and also comes with practice exams which are great for learning the multiple choice format.

Do note however that newer exams may not have training material available. Look at the ‘Preparation Options’ for each exam. e.g. http://www.microsoft.com/learning/en-us/exam-70-461.aspx If nothing is available yet, you will have to search for books and material online. I suggest using Amazon and having a good look at the reviews.

Next up, unless you have an environment at work, then you are going to have to either download an evaluation version of SQL server, or purchase a Developer / Technet edition. http://www.microsoft.com/en-gb/download/details.aspx?id=29066

What about the method? How you study will largely be up to you but try to spend at least 30 minutes per day. I take a lot of notes as I make my way through each book. Anything I did not know or I think is important, I will write down using my own words. These notes can be reviewed later and also help commit what you have learned to memory.

Try to go back and revisit your notes every week or two, and basically go through your book front to back. You can set yourself up examples and test anything you’re not sure on.

After that it is time to commit… Logon to Prometric, find yourself a testing centre and set yourself a date and time! https://www.prometric.com/en-us/Pages/home.aspx

Plan to give yourself at least two weeks of pure revision after you finish the book and before sitting the exam. Use this time to review all your notes again, make a list of the things you don't feel strong on, and give yourself some demos to solidify your knowledge.
I usually book around one month in advance as I’m finishing up the book. If you’re a morning person, book the exam for earlier in the day. If you work best later in the day, book the appointment accordingly.

When exam day comes, give yourself plenty of rest the night before, make sure you get some healthy food into you, and prepare yourself mentally by accepting in advance that you might fail. Finally, get to your exam early and make sure you bring identification to show who you are.

Exams can take up to 3 hours so make sure you have the right fuel beforehand.

brain_food

Now get in there and do it!!! If you come across features you were unaware of, commit them to memory so that you can learn them later. (The bonus here is that if you fail, you will have an idea what items you need to study in order to pass the next time!)

If you’re lucky and you have been diligent in your studies, chances are your hard work will pay off and you will have earned one of these…
Success Beer! 😀

If you have any questions on certification, feel free to send me an email.

Good luck!!

SQL 2014 ColumnStore Indexes

When SQL 2012 was released it came with a feature called ColumnStore indexes. What are they you ask? In a nutshell, the traditional b-tree structure that we know and love stores pages of data which contain the ‘rows’ of a table. An index page will contain one or more columns from each row. This is efficient for most OLTP scenarios where you are providing predicates to filter data during most queries, and where data access is typically more varied.

But in a data warehouse OLAP scenario, the data access requirements differ slightly. With aggregations often made over vast numbers of rows, but only a few carefully chosen columns, there can be an overhead associated with querying unneeded columns.

Enter ColumnStore indexes. The data is still stored in separate pages, although each page is filled with all the values from one or more columns! With that method, a query referencing only a select few columns can utilize an index which contains only those columns, thereby reducing IO, as only the data that is needed gets cached.

Additionally, given that we are now storing data of the same type in a given page, and that there is a high level of redundancy, compression becomes much more effective.

As well as reducing IO because more data can fit into memory, ColumnStore indexes are also memory optimized and are processed in chunks of columns called batches, which reduces CPU usage.

ColumnStore indexes were introduced in SQL 2012, were non-clustered only, and could not be updated. This meant that you needed a workaround for updating your data. One method could be to disable or drop your index, load the data, and then rebuild. Or, if you utilize partitioning, you could possibly perform a SWITCH operation.

If you would like to go into more details about ColumnStore indexes you can follow the link here. http://technet.microsoft.com/en-gb/library/gg492088.aspx

But for now, we will explore the fact that in SQL 2014, Microsoft has allowed us not only to create clustered ColumnStore indexes, but also to update them!

Let's run through a quick example to see the benefits of a ColumnStore index, and finally we will prove that they can now be updated. As stated, ColumnStore indexes have their real benefits in a DW situation, so for this example I will use the AdventureWorksDW2012 database, and we will be using the dbo.FactResellerSales table.

First of all, we will take a copy of the FactResellerSales table, and then create a clustered ColumnStore index on that copy. This will give us two tables which, apart from the ColumnStore index, are identical to test against.

query1
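The screenshot contains the exact statements; they amount to something along these lines (the name of the copy is illustrative):

USE AdventureWorksDW2012;
GO

-- Copy the fact table (this creates a heap).
SELECT *
INTO   dbo.FactResellerSales_CS
FROM   dbo.FactResellerSales;
GO

-- Convert the copy to a clustered ColumnStore index (SQL Server 2014).
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactResellerSales_CS
    ON dbo.FactResellerSales_CS;
GO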

Now let's query both tables in a single batch, and then compare the results. We will turn on the graphical execution plan and also generate some time/IO metrics.

query1
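The exact query from the screenshot is not reproduced here, but an aggregation along these lines exercises both tables in one batch and produces comparable metrics:

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Row store (standard clustered index).
SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM   dbo.FactResellerSales
GROUP BY ProductKey;

-- Clustered ColumnStore copy.
SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM   dbo.FactResellerSales_CS
GROUP BY ProductKey;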

Now let’s look at the results…

graphical_execution_plan

As you can see, the use of a clustered ColumnStore index to satisfy this query had an overall cost of 25% relative to the batch, whereas the query utilizing the standard clustered index cost the remaining 75%! Additionally, if we hover over the ColumnStore index operator, we can see that it has an estimated IO cost of 0.003125, compared to 2.19868 for the row-based index.

operator_properties1operator_properties2

Let's go further and look at the statistics for Time and IO!

query_messages

From this output you can see that the first query, with the normal clustered index, has performed 2982 logical reads. And the second query, which uses the ColumnStore index, has performed only 77 logical reads as well as 60 read-ahead reads. A significant difference!

A brief aside on reads… A ‘logical read’ occurs every time SQL server requests a data page from the buffer cache. If the page is not in memory, then this needs to be read from the disk and that constitutes a ‘physical read’. ‘read-ahead reads’ are an optimization method whereby the pages are brought into the buffer cache before they are referenced by the query. Put simply, SQL server was able to determine the pages that it needed in advance.

As stated previously, in SQL 2012 ColumnStore indexes could not be updated. And if you tried to update a ColumnStore index, you would get the following error.

success_message

Now in SQL 2014 ColumnStore indexes can be clustered, and they can also be updated, which means you have time to make another coffee, because you will see this screen instead!

success_message

So there we have it – a little example where you can see that the benefits of using a ColumnStore index are substantial. Please do consider however that this test was performed against the AdventureworksDW database and this particular fact table had only 60,000 rows. In a real life DW scenario it is likely that you will have hundreds of millions of rows, and beyond. And this is where you could truly see a significant gain by using ColumnStore indexes.

SQL2014 governs IO!

The release of SQL 2014 CTP1 has brought IO to the list of resources it can govern! (Yes, previously it was memory and CPU only.) Let's get down to business and run through a quick demo so we can get a sneak peek of SQL Server 2014 metering IO!

For the sake of simplicity in this demo, we will do 3 things:

1) Create two resource pools, each with an associated workload group, and configure the MAX_IOPS_PER_VOLUME value only. No throttling of memory or CPU will take place.

2) Create a login for each of the two users, and under the context of those logins, generate some IO against the system volume (C:).

3) Review some DMOs and analyze the results using perfmon.

First create two new resource pools and set the max_iops_per_volume value. Note, there is no GUI for this in SQL 2014 CTP, so you will have to use TSQL. I have shown the code below. Just repeat this code twice for your resource pools and change out the max iops value.

resourcepool
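For reference, the statements boil down to something like this (pool names and values match the demo):

CREATE RESOURCE POOL [users]
    WITH (MAX_IOPS_PER_VOLUME = 100);

CREATE RESOURCE POOL [developers]
    WITH (MAX_IOPS_PER_VOLUME = 50);

ALTER RESOURCE GOVERNOR RECONFIGURE;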

I have arbitrarily chosen low numbers so that we can quickly see the throttling take place. As you can see, my pool name is 'users' and I have set MAX_IOPS_PER_VOLUME to 100. My second resource pool is named 'developers' and I have given it a MAX_IOPS_PER_VOLUME value of 50.

Next, we need to create two workload groups which will be assigned to these resource pools. These can be created through the GUI or with scripts.

workloadgroup
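The group names below are illustrative; each group simply needs to reference the matching pool:

CREATE WORKLOAD GROUP [wg_users]      USING [users];
CREATE WORKLOAD GROUP [wg_developers] USING [developers];

ALTER RESOURCE GOVERNOR RECONFIGURE;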

Then we create and assign a classifier function, and the logins. (I won't show this here as nothing has changed since the last version of SQL Server. Please email me if you would like a copy of the scripts that were used.)

resource governor gui

Now we need to open two new query windows and log into each using the ‘users’ and ‘developers’ credentials. In each window we will run a DBCC CHECKDB on two copies of the Adventureworks2012 database in order to generate some IO.

workload
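For example, with two illustrative copies of the database restored, each session simply runs its own check:

-- Run under the 'users' login in one window and the 'developers' login in the other.
DBCC CHECKDB (N'AdventureWorks2012_A') WITH NO_INFOMSGS;
DBCC CHECKDB (N'AdventureWorks2012_B') WITH NO_INFOMSGS;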

Open Performance monitor to see how the IO has been throttled. The results look like this. (Note the performance counters that we are using)

perfmon

perfmon

The first thing that I notice is that throttling is clearly taking place. The second thing that I notice, however, is that the MAX_IOPS values I have set are not being reached consistently. I suspect that a lot of the CHECKDB work happens in memory after the initial process begins, as AdventureWorks is quite small; a more IO-intensive workload and some tuning of the MAX_IOPS values could produce a more consistent result set here.

Let's have a look at a couple of DMOs. Microsoft has kindly added several new fields to sys.dm_resource_governor_resource_pools. You can see here the reads_total versus the reads_throttled. (Note that this system view is not populated until you restart your instance, and that this is only a subset of the newly available columns.)

dmv

Let's also check out sys.dm_resource_governor_resource_pool_volumes!

dmv

Here you can see the volumes that are in use by our resource pools, and also a taste of some of the other new columns added. Notice that the ‘developers’ login experienced much more throttling. (This DMV is populated once a user connection is made through Resource Governor.)

So there we have it, a first look at Resource Governor's new IO capability. I have shown a very basic example here, and if you would like to test further, you could scale up by increasing the workload, adjusting concurrency, and playing with different values for MIN and MAX IOPS.