The Chronicles of Nojo: January 2010

Tuesday, January 26, 2010

Visual Studio Shortcuts

We all make typogarphical mistakes. Oops. Put the caret between the 'a' and 'r' and hit Ctrl+T.

This shortcut allows you to move a single letter one position to the right every time you use it. More precisely, it swaps the characters on the left and right of the caret, then advances the caret one character to the right. The advancement of the caret means you can keep hitting Ctrl+T to effectively move a single letter right through a word.

Sometimes I'll type in a class name without declaring the corresponding import statement at he top of my C# source file. Visual Studio adds a visual clue (small red box under the final character of the type name) and if you move your hands off the keyboard and wiggle the mouse around a bit, Visual Studio will display a context menu to choose the namespace. If you want to keep your hands on the keyboard, hit Ctrl+Shift+F10 and then hit the up and down arrows to select the appropriate namespace from the list.

Some other shortcuts can be found here: http://www.microsoft.com/downloads/details.aspx?FamilyID=e5f902a8-5bb5-4cc6-907e-472809749973&displaylang=en

Tuesday, January 12, 2010

Split Range - Partitions to Filegroups

So I asked the questions: How do I know which filegroup any given partition is mapped onto? How do I know which is the new partition when a range is split on a new boundary. And which partition keeps the data and which is removed when two partitions are merged? The first question is answered by looking at the sys.destination_data_spaces table. The second is answered in this post, and the third will be the subject of a later post.

We can set up a partition for this exercise with the following SQL:

CREATE PARTITION FUNCTION pfLeft(INT) AS RANGE LEFT FOR VALUES()
CREATE PARTITION SCHEME psLeft AS PARTITION pfLeft TO (FG1)

Our diagnostic SQL, which we run repeatedly, is:

SELECT destination_data_spaces.*
FROM sys.destination_data_spaces 
INNER JOIN sys.partition_schemes ON sys.destination_data_spaces.partition_scheme_id = sys.partition_schemes.data_space_id
WHERE sys.partition_schemes.name = 'psLeft'
ORDER BY destination_id ASC

For a partition function with no boundaries (i.e. all rows fit in one partition) we see that the first (one and only) destination_id maps to the data_space_id of FG1.

partition_scheme_id destination_id data_space_id
------------------- -------------- -------------
65601               1              2

If we set the next used filegroup to FG2 and then split the partition on the boundary of 0 we see:

partition_scheme_id destination_id data_space_id
------------------- -------------- -------------
65601               1              3
65601               2              2

The interesting bit is that FG2 is now used for the first destination_id. When the existing partition was divided, the LEFT sub-partition (including value 0) was assigned to the new filegroup. Let's try it again by setting the next used filegroup to FG3 and splitting on 100:

partition_scheme_id destination_id data_space_id
------------------- -------------- -------------
65601               1              3
65601               2              4
65601               3              2

Again, it's the LEFT sub-partition (including value 100) of the partition that was being divided that is assigned to the new filegroup. If we try this one more time with a split on -100, I'd hope to see FG4 take up the LEFT sub-partition of our left-most partition.

partition_scheme_id destination_id data_space_id
------------------- -------------- -------------
65601               1              5
65601               2              3
65601               3              4
65601               4              2

Lets do everything from the start, but this time we'll use:

CREATE PARTITION FUNCTION pfRight(INT) AS RANGE RIGHT FOR VALUES()
CREATE PARTITION SCHEME psRight AS PARTITION pfRight TO (FG1)

We'll also need to change the diagnostic code to show the psRight partition scheme:

SELECT destination_data_spaces.*
FROM sys.destination_data_spaces 
INNER JOIN sys.partition_schemes ON sys.destination_data_spaces.partition_scheme_id = sys.partition_schemes.data_space_id
WHERE sys.partition_schemes.name = 'psRight'
ORDER BY destination_id ASC

Then we run all the splits in order: 0 -> FG2, 100 -> FG3, -100 -> FG4

destination_id data_space_id
-------------- -------------
1              2
2              5
3              3
4              4

The latest results are consistent with what we've seen previously, but now it's the RIGHT segment (including the boundary value) of the newly divided partition that gets moved onto the new filegroup.

In summary, if your partition function is defined as RANGE LEFT, then a SPLIT RANGE (x) alteration will split an existing partition and the LEFT segment will become the new partition on the NEXT USED filegroup; the other segment stays on its original filegroup. Conversely, if your partition function is defined as RANGE RIGHT, then the RIGHT segment becomes the new partition on the NEXT USED filegroup.

Wednesday, January 06, 2010

Splitting a PDF

I discovered a novel way of splitting a large PDF file into more manageable chunks: print out the range you want to the "print to file" printer in Ubuntu, making sure to choose the PDF output!

Tuesday, January 05, 2010

Horizontal Partitioning 1

When you want to speed up the physical reads on a SQL Server, you have two options: faster disks, or more disks. But only once you've determined that physical reads are a (potential) bottleneck do you get to make the choice.

If using faster disks is not enough to get your system back to an acceptable level of performance, then there's still the option of adding more disks. Merely adding disks won't - in itself - speed anything up. To realize performance gains you need to restructure your data. SQL Server allows you to partition tables, and also indexes. But as usual, the devil resides in the details.

Partition functions can only take one parameter. This means that the partition in which each row resides is determined by the value of just one column value in that row.

If your original table had a clustered index, you'll probably want to keep it. However, this has a big consequence: you will need to make the partition function congruent with the clustered index. SQL Server will complain if you leave the partition column out of the clustered index "Partition columns for a unique index must be a subset of the index key". It gets worse if you want a composite clustered index - you should be aware that in some cases SQL Server appears to store data internally sorted first by the partition column, and then by any other columns in the composite clustered index. If your original table was sorted by Col_1,Col_2 and you choose Col_2 as your partition column, then your table may be sorted internally by Col_2,Col_1. Actually, it's not this straightforward: I need some time to figure this out; it will be the subject of a later post.

The Chronicles of Nojo