Thursday, March 05, 2009

Goodput - Part 1

I've been interested recently in maximizing the application-level throughput of data across networks. The word I didn't know I was looking for - but found anyway - was goodput.

At first I tried streaming across a large DivX file. When I realised that I might be measuring disk seek time (doubtful, at the pitiful data rates I was achieving) I transitioned my tests to stream meaningless large byte arrays directly from memory (being careful not to invoke the wrath of the garbage collector, or any of the slow operating system-level memory functions).

What I noticed was that the application-level control of the stream was a big factor in slowing down the effective data transfer rate. In short, if my design of the logical/physical stream protocol was "bad", so would be the goodput.

Throughout the test, data was transmitted in chunks of varying sizes (from 4k to 128k). Firstly, just to see what it was like, I tried establishing a new System.Net.Socket connections for each chunk. Not good. This is why database connection pooling has really gained ground. It's expensive to create new connections. Next I tried a single connection where the client explicitly requested the next chunk. Also really bad. It was chatty like an office secretary, and got less done. So I tried a design I thought would be pretty good, where 1 request resulted in many chunks being returned. For some reason, I thought that prepending each chunk with its size would be a good idea. It was 360% better than the previous incarnations, but the extra size information was just repeating data that wasn't at all useful to the protocol I had devised: it was wasting bits and adding extra CPU load, and giving nothing in return; it had to go. Stripping these items from the stream resulted in an extra 3.6% of throughput.

Interestingly, I noticed that the choice of buffer size could drastically affect the goodput, especially when it was ((1024*128)+4) bytes. I expect this was something to do with alignment. It would be cool to do some more tests, looking for optimal buffer sizes.

No comments: