Galin Iliev's blog

Software Architecture & Development

Databases tips

I have been dealing with MS SQL Server 2005 and Microsoft SQL Server 2005 Analysis Services as well as Reporting services to display results to end users.

It was amazing how many issues are in the fields of databases. I developed projects on my dev box with snapshot of database (biggest table contains aprox 200,000 rows) and I though it was done. Unfortunately I was not able to deploy in production because of following reasons:

  • cube and dimensions processing was extremely slow - this was solved with playing with indecies
  • Memory error: The operation cannot be completed because the memory quota estimate (1678MB) exceeds the available system memory  - this was on 4xCPU and 4GB RAM on 32bit Windows 2003 std. This was solved with limiting SQL Server serive memory to 2GB and set /3GB startup switch in windows.
  • then came:  Arithmetic overflow error converting expression to data type int. This was tough also because I had to wait 2:30 hours before I get the error. then I make changes and try again... this was very, very long loop. For fully understand it check SQL Server Books and as resolution check Barnabas Kendall's blog entry.

there are  some good points in Barnabas Kendall's blog entry like:

I don’t enjoy reading documentation (lack of plot and character development),   :):)

  • Create the view with the SCHEMABINDING option.
  • Define the view to pre-aggregate the data (this is where the performance enhancement comes from). For example, I have a view that splits up the year, month, day, and hour of a click, groups by client, and also returns a total. I can also reuse this view to get totals by hour, day, month, year, or all time.
  • Don’t use COUNT, use COUNT_BIG.
  • Don’t use AVG; use SUM and COUNT_BIG. Beware of integer overflows in your final query; SUM always wants to make an INT. Why isn’t there a SUM_BIG? You can get around this by using CAST or CONVERT in your final query.
  • Make a clustered unique index on the view. This forces the database engine to persist the values to disk rather than calculating them on the fly, which is what leads to five-minute queries. I am surprised to learn that you don’t need to put all the values of the table in the index, just the ones necessary to make a clustered unique index.
  • When querying the view, use the NOEXPAND hint.

I would like to add: All aggregate functions return result depending on input parameter's type- if input column is of type int and result would be if type int:

select sum(numbers) from dbo.TestTable

And if you use SUM of multiple rows all that contains values like 2,000,000,000 you will end up with Arithmetic overflow error

In order to change result type this:

select sum(cast(numbers as bigint)) from dbo.TestTable

I employed some of these techniques and started the process again. I am keeping my fingers crossed...

Comments (1) -

  • Stephan Zahariev

    5/8/2007 12:06:26 PM | Reply

    Nice post. I always like to read resolution to real-world problems.