The Chronicles of Nojo: 2008

Wednesday, December 03, 2008

C# Verbatim Identifier

I've known for a long time that C# identifiers could have an optional @ sign in front of them, but until recently I thought that the character became part of the identifier.

int foo = 3;
int @foo = 4; //<-- error here: identifier 'foo' already in scope

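It doesn't: foo and @foo are the same identifier, and the @ prefix just tells the compiler to treat what follows as an identifier even when it would otherwise be a keyword. So, it's really a way to call into code that may have been written in another CLR language that has identifiers that clash with C#'s reserved words and keywords.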

By way of example, to call the following VB.NET function

Public Class VerbatimIdentifierTest
    Public Shared Function int(ByVal value As Double) As Integer
        Return Convert.ToInt32(value)
    End Function
End Class

from C# you'd invoke:

VerbatimIdentifierTest.@int(1.0);
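
A quick way to convince yourself that the @ never becomes part of the name is to look the member up by reflection. This is only a minimal sketch: the VerbatimDemo class is made up for illustration and stands in for the VB.NET class above.

```csharp
using System;

public static class VerbatimDemo
{
    // 'int' is a C# keyword, so the declaration needs the @ prefix.
    public static int @int(double value)
    {
        return Convert.ToInt32(value);
    }

    public static void Main()
    {
        // The call site needs the prefix too...
        Console.WriteLine(VerbatimDemo.@int(1.0));

        // ...but the name stored in metadata is plain "int", without the @.
        Console.WriteLine(typeof(VerbatimDemo).GetMethod("int").Name);
    }
}
```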

Saturday, October 25, 2008

Amazon EC2

Amazon's Elastic Compute Cloud (EC2) is a nice idea, but it's important not to overlook the "elastic" in the name. If there's an obvious temporal pattern to your service's usage (examples: 1. it could easily consume 100% CPU of 5 machines between 7 am and 11 am, but slow to a dawdle for the rest of the day, or 2. a large system could lie unused on weekends and holidays), and you design your application with parallelism in mind, this could be a cost-effective strategy for hosting the service. As they bill you per "instance-hour" (loosely defined as an "instance" available for all - or part of - an hour), an available _idle_ instance will cost the same in a month as an available _fully-utilised_ instance. Costs start to fall as soon as you turn off your under-utilised instances dynamically, which is something you can't yet do when renting physical space in a data center (AFAIK).

For a small business without the need for elasticity (note: does that spell "unsuccessful"?), I'm not convinced it would be as cost-effective as renting some tin in a data center and running multiple virtualized instances on some technology like Hyper-V, ESX or Xen.

Wednesday, October 22, 2008

Assert.That(actual, Is.EqualTo(expected).Within(tolerance))

New to NUnit 2.4 back in March 2007:

using NUnit.Framework.SyntaxHelpers;

I really like this idea as it can make some unit test code much more legible!

Saturday, October 18, 2008

Language Optimization Observations

Programming languages are proliferating. Most recently the trend of domain specific languages has emerged. On any given project, multiple languages will be employed (with varying degrees of success). Their design can be influenced by many factors. Some - if not most - languages enable one to succinctly and elegantly define a solution to a problem, speeding up initial development and subsequent modifications. However, as the languages get more abstract they appear to get slower and slower to run. Maybe it's that they're solving bigger problems? Maybe it's that their runtime environments have been designed for the general case, and literally "karpullions" of CPU cycles are being wasted maintaining state in which we [for a given application] are not interested. The smart money is on over-generalized and sometimes poorly implemented runtime environments.

Thursday, October 16, 2008

Duct Typing

defn: A portmanteau of duck-typing and duct-taping. Used to describe the effects of defining .NET extension methods in your own code to give existing objects "new" functionality.
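
For anyone who hasn't met extension methods yet, here is roughly what a strip of duct tape looks like. The names StringDuctTape and Shout are invented for this sketch.

```csharp
using System;

public static class StringDuctTape
{
    // The 'this' modifier lets Shout be called as if it were an instance method of string.
    public static string Shout(this string value)
    {
        return value.ToUpperInvariant() + "!";
    }
}

public static class DuctTypingDemo
{
    public static void Main()
    {
        // string itself is unchanged; the compiler rewrites this to StringDuctTape.Shout("quack").
        Console.WriteLine("quack".Shout()); // prints QUACK!
    }
}
```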

Closures in C#

If I'm using the wrong terminology: sorry; you lose. The C# compiler is a wonderful thing. Among its many gifts to us are iterator blocks (think: yield return) and anonymous methods (think: delegate literals). Neither of these constructs has a direct parallel in MSIL, the common intermediate language to which all C# code is compiled. Instead the C# compiler defines new types to encapsulate their behaviours.

Iterator Blocks

An iterator block becomes a state machine; a disposable enumerator. Code defined in C# as this:


public static IEnumerable<int> PublicEnumerable()
{
    yield return 1;
    yield return 2;
    yield return 3;
}

is compiled down to the MSIL equivalent of this:


public static IEnumerable<int> PublicEnumerable()
{
    return new d__0(-2);
}

The class d__0 is a closure, and if we take a peek in Reflector we see the following definition:


[CompilerGenerated]
private sealed class d__0 : IEnumerable<int>, IEnumerable, IEnumerator<int>, IEnumerator, IDisposable
{
    // Fields
    private int <>1__state;
    private int <>2__current;

    // Methods
    [DebuggerHidden]
    public d__0(int <>1__state);
    private bool MoveNext();
    [DebuggerHidden]
    IEnumerator<int> IEnumerable<int>.GetEnumerator();
    [DebuggerHidden]
    IEnumerator IEnumerable.GetEnumerator();
    [DebuggerHidden]
    void IEnumerator.Reset();
    void IDisposable.Dispose();

    // Properties
    int IEnumerator<int>.Current { [DebuggerHidden] get; }
    object IEnumerator.Current { [DebuggerHidden] get; }
}

The object instance can be returned to the caller of PublicEnumerable where it will behave like the iterator block we know and love.
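
Reflector's summary hides the interesting part, which is what ends up inside MoveNext. The following is a hand-written approximation, not the compiler's actual output: the class name OneTwoThree and the state numbering are my own, and this version always hands out a fresh enumerator.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-rolled stand-in for the compiler-generated enumerator.
public sealed class OneTwoThree : IEnumerable<int>, IEnumerator<int>
{
    private int state;   // which 'yield return' has been reached so far
    private int current; // the value most recently yielded

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        // Each case corresponds to one 'yield return' in the iterator block.
        switch (state)
        {
            case 0: current = 1; state = 1; return true;
            case 1: current = 2; state = 2; return true;
            case 2: current = 3; state = 3; return true;
            default: return false; // fell off the end of the block
        }
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }

    public IEnumerator<int> GetEnumerator() { return new OneTwoThree(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public static void Main()
    {
        foreach (int i in new OneTwoThree())
        {
            Console.WriteLine(i);
        }
    }
}
```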

Anonymous Methods

Anonymous methods can access local variables defined in the same scope. When a delegate literal contains a reference to a method local variable (see: i in the following example):


public static void DelegateLiteralTest()
{
    Int32 i = 5;
    Action<Int32> add = delegate(Int32 value) { i += value; };
    Console.WriteLine(i);
    add(1);
    Console.WriteLine(i);
    add(2);
    Console.WriteLine(i);
}

the C# compiler generates a new type (closure) that resembles the following:


[CompilerGenerated]
private sealed class <>c__DisplayClass1
{
    // Fields
    public int i;

    // Methods
    public <>c__DisplayClass1();
    public void b__0(int value);
}

Notice how the local variable is now scoped to the instance of the closure, and how it's an instance of this closure that's used inside the method:


public static void DelegateLiteralTest()
{
    <>c__DisplayClass1 __closure = new <>c__DisplayClass1();
    __closure.i = 5;
    Console.WriteLine(__closure.i);
    __closure.b__0(1);
    Console.WriteLine(__closure.i);
    __closure.b__0(2);
    Console.WriteLine(__closure.i);
}

I think this is pretty darn cool.

Sunday, October 12, 2008

Singleton Pattern vs. C# Static Class

At the most basic level, both constructs are useful in limiting the number of instances of an object that can be created. There are numerous subtle differences at the implementation level, but the single biggest difference is at an object-oriented level: static classes do not support polymorphism. This stems from a combination of factors at the language level; in short, a static class cannot inherit from a base class (abstract or concrete) nor can it implement any interface methods. Conversely, you can neither define static virtual methods on a class, nor can you define them on an interface. So, if you're ever using dependency injection - a technique that relies heavily on polymorphism - you will likely find that static classes will be inadequate, limiting, and plain frustrating.

There are a number of other runtime level differences, but these are all secondary to the polymorphism issue. The simplest is that a static class can never be instantiated, and all methods are executed against the type instance. Contrast this with the singleton, of which exactly one instance is created, and where methods are executed against that instance.

Next time you're involved in object design and you come across a potential application for the singleton pattern, don't forget this posting!
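
To make the polymorphism point concrete, here is a small sketch. Everything in it (IClock, SystemClock, FixedClock, Greeter) is invented for illustration. The singleton can be passed around as an IClock and swapped for a test double; a static class could do neither.

```csharp
using System;

public interface IClock
{
    DateTime Now { get; }
}

// Singleton: exactly one instance, and that instance can implement an interface.
public sealed class SystemClock : IClock
{
    public static readonly SystemClock Instance = new SystemClock();
    private SystemClock() { }
    public DateTime Now { get { return DateTime.UtcNow; } }
}

// A test double. There would be no way to substitute this for a static class.
public sealed class FixedClock : IClock
{
    private readonly DateTime value;
    public FixedClock(DateTime value) { this.value = value; }
    public DateTime Now { get { return value; } }
}

public class Greeter
{
    private readonly IClock clock;

    // The dependency is injected, so Greeter never knows which clock it has.
    public Greeter(IClock clock) { this.clock = clock; }

    public string Greet()
    {
        return clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}

public static class SingletonDemo
{
    public static void Main()
    {
        Console.WriteLine(new Greeter(SystemClock.Instance).Greet());
        Console.WriteLine(new Greeter(new FixedClock(new DateTime(2008, 10, 12, 9, 0, 0))).Greet());
    }
}
```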

Monday, October 06, 2008

Delegate vs. Event

What's the difference between a delegate and an event in C#, you ask? An event provides better encapsulation (read: a more object-oriented approach) while a delegate is "merely" an immutable multicast type-safe function pointer. In other words, an event is an encapsulated delegate, where only the += and -= functions are non-private. That means that although we can add new subscribers to the event from another class, we cannot invoke it. The add and remove accessors are the only members to retain the event's declared access (e.g. public).

It's easier to explain delegates first.

Delegates

Function pointer: a delegate acts as a layer of indirection allowing references to methods to be passed between objects. The target methods are invoked when the delegate is invoked.

Type-safe: unlike function pointers in C++, a delegate knows the type of the target object, and the signature of the target method.

Multicast: a delegate maintains a list of methods; it isn't limited to invoking just one.

Immutable: delegate instances - like strings - cannot be changed. When you add/remove another target method to/from a delegate instance, you're really creating a completely new delegate instance with a superset/subset of target methods.
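
The immutability is easy to see by keeping a second reference to a delegate before adding to it. A minimal sketch, with a throwaway class name of my own choosing:

```csharp
using System;

public static class DelegateImmutabilityDemo
{
    public delegate void Function(string value);

    public static void WriteOut(string value) { Console.Out.WriteLine(value); }
    public static void WriteError(string value) { Console.Error.WriteLine(value); }

    public static void Main()
    {
        Function first = WriteOut;
        Function snapshot = first; // a second reference to the same delegate instance

        first += WriteError;       // builds a brand new delegate; 'snapshot' is untouched

        Console.WriteLine(snapshot.GetInvocationList().Length); // 1
        Console.WriteLine(first.GetInvocationList().Length);    // 2
        Console.WriteLine(ReferenceEquals(first, snapshot));    // False
    }
}
```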

A delegate example

Define the delegate type, or use one of the many predefined types shipped with the FCL.
public delegate void Function(string value);
Define the target methods that we can call. These will have to conform to the signature of the chosen delegate type.


        static void WriteOut(string value)
        {
            Console.Out.WriteLine(value);
        }

        static void WriteError(object value)
        {
            Console.Error.WriteLine(value);
        }

Define a class with a public delegate field (bad object-oriented encapsulation, but it's just for the purpose of example).


class DF
{
    public Function function;
}

Construct a new instance of this class, add a couple of target methods and invoke them through the delegate:


DF df = new DF();
df.function += new Function(WriteOut);
df.function += new Function(WriteError);
df.function("Delegate");

Events

An event is a construct that is functionally similar to a delegate, but provides more encapsulation. Specifically, adding/removing target methods is given the accessibility of the event (e.g. public/protected/internal). Everything else is given the accessibility modifier of "private". This allows subscribers to be notified of an event, but not to trigger it themselves.

An event example

Again, pick a delegate type from the FCL or define your own. For this example, I will reuse the Function delegate type, and the two Function methods defined above.
Define a class with an Event:


class EF
{
    public event Function function;
}

Construct a new instance of this class, add a couple of target methods:


EF ef = new EF();
ef.function += new Function(WriteOut);
ef.function += new Function(WriteError);

Because of the encapsulation, we cannot invoke the event from outside the declaring type:


//ef.function("Event"); // ... can only appear on the left hand side of += or -= (unless used from within the event's declaring type)
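
So how does anything ever get raised? The declaring type does it, usually through a method it chooses to expose. In this sketch the Raise method is my own addition to the EF class above; the local copy guards against the last subscriber leaving between the null check and the call.

```csharp
using System;

public delegate void Function(string value);

public class EF
{
    public event Function function;

    // Only code inside EF may invoke the event, so EF decides when (and whether) to fire it.
    public void Raise(string value)
    {
        Function handler = function; // null when nobody has subscribed
        if (handler != null)
        {
            handler(value);
        }
    }
}

public static class EventDemo
{
    public static void Main()
    {
        EF ef = new EF();
        ef.function += delegate(string value) { Console.WriteLine("got " + value); };
        ef.Raise("Event"); // prints: got Event
    }
}
```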

A good technical resource:
Difference between a delegate and an event.

Wednesday, October 01, 2008

Indispensable Software Engineering Tools for .NET

Requirements
=
JIRA: rudimentary but ok, not so great for managing project plans
-
Source Control
=
Subversion: all check-ins must have a reference back to the original requirement.
Everything included in the release must be held in the versioned repository.
Everything means everything... including environment-specific config files.
(Don't store usernames and passwords in source control, but that's ok because you didn't hard code them into your configuration anyway, did you!?)
-
Build
=
CruiseControl.NET:
> get latest from source control, tag so that the build can be repeated.
> ensure compiled assemblies are versioned so they can be traced back to the build.
> build, run unit tests and other code metric tools.
Code Metric Tools:
>FxCop
>NUnit
>NCover
>anything for cyclomatic complexity?
Post Build
=
FishEye: great for seeing all metrics, committed changes etc. Useful when you need to see what changes went into a build.
-

Friday, September 26, 2008

Pulsed Threads

It happened quite a while back, but I thought I'd put a note up here as a reminder: a pulsed thread is still idle and another thread could acquire the lock. This blog entry is not about how best to re-write the producer/consumer problem as an event-driven model, it's to show how naive multi-threaded operations are - among other things - low hanging fruit when it comes to finding bugs...

Imagine a group of producer threads [P0..Pn] and a group of consumer threads [C0..Cn] writing to and reading from a queue of arbitrary size. Access to the queue needs to be thread safe, but the consumer threads could conceivably take a long time to process the items they consume from the queue. It is an error to attempt to read from an empty queue. For this example, we do not consider throttling the producers.

For a first (naive) attempt in C#:


01: // consumer
02: while (running)
03: {
04:     Int64 value;
05:     lock (queue)
06:     {
07:         if (queue.Count == 0)
08:         {
09:             Monitor.Wait(queue);
10:         }
11:         value = queue.Dequeue();
12:     }
13:     Console.WriteLine("{0}:{1}", threadId, value);
14: }

01: // producer
02: while (running)
03: {
04:     lock (queue)
05:     {
06:         Int64 item = Interlocked.Increment(ref seed);
07:         queue.Enqueue(item);
08:         Console.WriteLine("{0}:{1}", threadId, item);
09:         Monitor.Pulse(queue);
10:     }
11: }

The result? After several thousand iterations we get a "Queue empty" exception on line 11 of the consumer. Puzzled, we look a little closer: it's not immediately obvious. One of the 5 consumer threads (call it C1) has acquired the lock, tested the size of the queue and put itself in a wait state until a producer thread (call it P1) has added an item. P1 pulses the queue to signal an item has been added. An extract from MSDN says:

When the thread that invoked Pulse releases the lock, the next thread in the ready queue (which is not necessarily the thread that was pulsed) acquires the lock.

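So, while we were expecting C1 (waiting on the monitor) to acquire the lock, we were wrong. C0 was next in the queue for the lock: it found one item, dequeued it and released the lock. When C1 finally re-acquired the lock it carried straight on to the Dequeue call without re-testing the count, and by then the queue was empty again.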

To get the application running correctly, we'd need to change line 7 of the consumer to:


07: while (queue.Count == 0)
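
For completeness, here is a self-contained version with the while loop in place. It is a sketch rather than the original program: the Run method, the 'done' flag and the PulseAll shutdown are my additions so that the demo terminates and can report a count.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class PulseDemo
{
    // Starts the given number of producers and consumers; returns how many items were consumed.
    public static int Run(int producers, int consumers, int itemsPerProducer)
    {
        Queue<long> queue = new Queue<long>();
        long seed = 0;
        int consumed = 0;
        bool done = false; // only read or written while holding the lock on 'queue'

        List<Thread> consumerThreads = new List<Thread>();
        for (int c = 0; c < consumers; c++)
        {
            Thread consumer = new Thread(delegate()
            {
                while (true)
                {
                    lock (queue)
                    {
                        // Re-test after every wake-up: another consumer may have taken
                        // the item between the Pulse and this thread re-acquiring the lock.
                        while (queue.Count == 0 && !done)
                        {
                            Monitor.Wait(queue);
                        }
                        if (queue.Count == 0)
                        {
                            return; // producers have finished and nothing is left
                        }
                        queue.Dequeue();
                    }
                    Interlocked.Increment(ref consumed);
                }
            });
            consumer.Start();
            consumerThreads.Add(consumer);
        }

        List<Thread> producerThreads = new List<Thread>();
        for (int p = 0; p < producers; p++)
        {
            Thread producer = new Thread(delegate()
            {
                for (int i = 0; i < itemsPerProducer; i++)
                {
                    lock (queue)
                    {
                        queue.Enqueue(Interlocked.Increment(ref seed));
                        Monitor.Pulse(queue);
                    }
                }
            });
            producer.Start();
            producerThreads.Add(producer);
        }

        foreach (Thread producer in producerThreads) { producer.Join(); }

        lock (queue)
        {
            done = true;
            Monitor.PulseAll(queue); // wake every waiting consumer so that it can exit
        }
        foreach (Thread consumer in consumerThreads) { consumer.Join(); }

        return consumed;
    }

    public static void Main()
    {
        Console.WriteLine(Run(3, 5, 10000)); // 30000: every item consumed exactly once
    }
}
```

Swap the inner while back to an if and the "Queue empty" exception should come back after enough iterations.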

Kids, always read the label carefully!

Sunday, September 14, 2008

Coupling: Interfaces and Abstract Classes

When "loosely coupled" is la mode du jour, nobody really likes an abstract class. Well, nobody on my team. I suspect there's more going on here than the simple increase in coupling (e.g. the number of assumptions a caller must make) over calling an interface method.

Deriving from an abstract class:
1) forces you to inherit from an object hierarchy, and
2) forces you to provide an implementation for a set of abstract methods, and
3) usually means that some behaviour is "thrown in for free" in the base class(es).

Implementing an interface:
1) forces you to provide an implementation for a set of method signatures.

So... use more interfaces; achieve a lower level of coupling, and you'll also implicitly be choosing composition over inheritance.

Friday, August 08, 2008

Exceptions

If only the System.Exception class were abstract or only contained protected constructors... This would force people to define their own custom exceptions (a good thing), but still wouldn't solve humankind's insatiable appetite for the catch-all exception handler.

Wednesday, August 06, 2008

System.OutOfMemoryException Part 2

We've been seeing loads of these recently, not as a result of memory leaks (the usual suspects) but the result of trying to do too much at one time. Now that 64-bit operating systems are here, couldn't we take advantage of the extra virtual address space? Microsoft have a comparison of 32-bit and 64-bit memory architecture for 64-bit editions of Windows XP and Windows Server 2003 outlining the changes between the two memory architectures. Next step is to see whether the 64-bit operating systems are approved.

UnhandledException

From .NET 2.0 onwards, unhandled exceptions on any thread cause the process to terminate. This is - according to a number of experts - the "correct" behaviour, and is different to how they were handled back in .NET 1.1.

The AppDomain.UnhandledException event allows you to perform some kind of diagnostics (such as logging the exception and stack trace, or performing a mini-dump) after such an exception is thrown, but the CLR is going to exit - whether you like it or not - just as soon as all event handlers have run.

So... if you are a responsible application developer who is spawning new threads (or queueing user work items to the thread pool) please please please ensure that _if_ you can handle the exception, that it is caught and not rethrown. Even if it means storing the exception and letting another thread handle it, as in the case of the asynchronous programming model (APM).
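
One way to follow that advice is to catch at the thread boundary, park the exception, and let the thread that started the work decide what to do with it. A minimal sketch; RunAndCapture is a helper name I made up, and a real application would probably not block on Join straight away.

```csharp
using System;
using System.Threading;

public static class WorkerExceptionDemo
{
    // Runs 'work' on a new thread and returns the exception it threw, or null if it succeeded.
    public static Exception RunAndCapture(ThreadStart work)
    {
        Exception captured = null;
        Thread worker = new Thread(delegate()
        {
            try
            {
                work();
            }
            catch (Exception e)
            {
                // Store rather than rethrow: an exception escaping here would end the process.
                captured = e;
            }
        });
        worker.Start();
        worker.Join(); // once the worker has finished, its write to 'captured' is visible here
        return captured;
    }

    public static void Main()
    {
        Exception error = RunAndCapture(delegate() { throw new InvalidOperationException("boom"); });
        Console.WriteLine(error == null ? "ok" : "worker failed: " + error.Message);
    }
}
```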

Wednesday, June 25, 2008

Goodbye to Awkward Dictionaries in .NET

I take enjoyment in analytic work, and often find myself writing code to empirically test a bunch of XML files to see if I've got the "implied schema" right. For example, I might have 500 files of similar structure, and I'd want to see the possible values (and count) of the "/Reports/Globals/@Client" node's value (using XPath). Usually this would entail using a Dictionary and every access would have to test if the dictionary already contains the key ... until now. After seeing that Ruby's designers allow you to specify a default value for a Hash, I took another step forward, allowing a user of the class to provide a default function, allowing more complex defaults such as empty lists.

using System;
using System.Collections.Generic;

namespace ConsoleApplication6
{
    class Program
    {
        public delegate T DefaultFunction<T>();

        public class RubyDictionary<TKey, TValue>
        {
            private IDictionary<TKey, TValue> inner = new Dictionary<TKey, TValue>();
            private DefaultFunction<TValue> defaultFunction;

            private static TValue Default()
            {
                return default(TValue);
            }

            public RubyDictionary()
            {
                defaultFunction = Default;
            }

            public RubyDictionary(TValue defaultValue)
            {
                defaultFunction = delegate()
                {
                    return defaultValue;
                };
            }

            public RubyDictionary(DefaultFunction<TValue> defaultFunction)
            {
                this.defaultFunction = defaultFunction;
            }

            public TValue this[TKey key]
            {
                get
                {
                    if (!inner.ContainsKey(key))
                    {
                        inner.Add(key, defaultFunction());
                    }
                    return inner[key];
                }
                set
                {
                    inner[key] = value;
                }
            }
        }

        static void Main(string[] args)
        {
            try
            {
                RubyDictionary<string, int> dict = new RubyDictionary<string, int>();
                Console.WriteLine(dict["9"]);
                dict["8"] += 1;
                Console.WriteLine(dict["8"]);
                dict["7"] = 5;
                Console.WriteLine(dict["7"]);
                dict["6"]++;
                Console.WriteLine(dict["6"]);

                dict["5"]--; dict["5"]--;
                Console.WriteLine(dict["5"]);

                RubyDictionary<string, IList<string>> lists = new RubyDictionary<string, IList<string>>(delegate() { return new List<string>(); });
                lists["shopping"].Add("bread");
                lists["shopping"].Add("milk");
                lists["todo"].Remove("write this dictionary class");

                foreach (string item in lists["shopping"])
                {
                    Console.WriteLine(item);
                }

                RubyDictionary<int, int> foo = new RubyDictionary<int, int>(777);
                Console.WriteLine(foo[8] - 111);
            }
            catch (Exception exception)
            {
                Console.Error.WriteLine(exception);
            }
            finally
            {
                if (System.Diagnostics.Debugger.IsAttached)
                {
                    Console.WriteLine("Press enter to continue...");
                    Console.ReadLine();
                }
            }
        }
    }
}

Saturday, May 31, 2008

IEnumerable vs IEnumerator

The IEnumerator<T> generic interface defines "cursor" methods for iterating over a collection:
T Current { get; }
bool MoveNext();
void Reset();
void Dispose();

The IEnumerable<T> generic interface defines just one method:
IEnumerator<T> GetEnumerator();

C# supports iterator blocks. These are blocks of code (i.e. method or property getter blocks) in which you "yield" a return value at various points in the code. The compiler uses hidden magic to create a nested type that implements both IEnumerable<T> and IEnumerator<T>; the body of your original method is stubbed out to call into this new object, and all your original logic is implemented in the IEnumerator<T>'s MoveNext method.

The C# foreach keyword is quite flexible and operates on any object that exposes a public GetEnumerator() method (the 'collection' pattern) - the object doesn't have to implement IEnumerable or IEnumerable<T>.
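
Here's the 'collection' pattern in action. Neither class below implements IEnumerable or IEnumerator (Countdown and Cursor are names I picked for the sketch), yet foreach is perfectly happy because it finds a public GetEnumerator() whose return type has MoveNext() and Current.

```csharp
using System;

// Implements neither IEnumerable nor IEnumerable<T>.
public class Countdown
{
    private readonly int start;
    public Countdown(int start) { this.start = start; }

    public Cursor GetEnumerator() { return new Cursor(start); }

    // Implements neither IEnumerator nor IEnumerator<T>.
    public class Cursor
    {
        private int next;
        private int current;

        public Cursor(int start) { next = start; }

        public int Current { get { return current; } }

        public bool MoveNext()
        {
            if (next <= 0) return false;
            current = next--;
            return true;
        }
    }
}

public static class ForeachPatternDemo
{
    public static void Main()
    {
        foreach (int i in new Countdown(3))
        {
            Console.WriteLine(i); // 3, then 2, then 1
        }
    }
}
```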

Thursday, May 29, 2008

Experiment in the Fidelity of UTC DateTime

// make sure your local clock is not displaying GMT

DateTime a = new DateTime(2008, 7, 6, 4, 3, 2, DateTimeKind.Utc);
DateTime b = new DateTime(2008, 7, 6, 4, 3, 2, DateTimeKind.Local);
DateTime c = new DateTime(2008, 7, 6, 4, 3, 2, DateTimeKind.Unspecified);

// no surprises here...
Console.WriteLine(a);
Console.WriteLine(b);
Console.WriteLine(c);

// and again, no surprises. this makes it that much more explicit.
Console.WriteLine("-");
Console.WriteLine(a.ToUniversalTime());
Console.WriteLine(b.ToUniversalTime());
Console.WriteLine(c.ToUniversalTime());

// notice the Z(ulu) for UTC, the time zone for local, and the (nothing) for unspecified
Console.WriteLine("-");
XmlSerialize(a);
XmlSerialize(b);
XmlSerialize(c);

// ooh! our first loss of fidelity
Console.WriteLine("-");
DataSet dataSet = new DataSet();
DataTable dataTable = dataSet.Tables.Add();
DataColumn dataColumn = dataTable.Columns.Add("DateTime", typeof(DateTime));
dataTable.Rows.Add(a);
dataTable.Rows.Add(b);
dataTable.Rows.Add(c);
Console.WriteLine(dataSet.GetXml());

// those .net framework guys fight back with the DateTimeMode property (new in 2.0)
// understandably, weird stuff starts happening when not all input dates are of the same kind
Console.WriteLine("-");
DataSet dataSet2 = new DataSet();
DataTable dataTable2 = dataSet2.Tables.Add();
DataColumn dataColumn2 = dataTable2.Columns.Add("DateTime", typeof(DateTime));
dataColumn2.DateTimeMode = DataSetDateTime.Utc;
dataTable2.Rows.Add(a);
dataTable2.Rows.Add(b);
dataTable2.Rows.Add(c);
Console.WriteLine(dataSet2.GetXml());

//you can't just get the bytes for a DateTime, it's a struct.
//Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(a)));
//Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(b)));
//Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(c)));

Console.ReadKey();
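
The XmlSerialize helper isn't shown above. If you just want to see the Z / offset / nothing distinction without it, XmlConvert will do; this is a stand-in I'm assuming behaves the same way for the purposes of the experiment, not the original helper.

```csharp
using System;
using System.Xml;

public static class DateTimeKindDemo
{
    public static string ToXml(DateTime value)
    {
        // RoundtripKind preserves the Kind in the text form.
        return XmlConvert.ToString(value, XmlDateTimeSerializationMode.RoundtripKind);
    }

    public static void Main()
    {
        // Utc: the string ends in Z.
        Console.WriteLine(ToXml(new DateTime(2008, 7, 6, 4, 3, 2, DateTimeKind.Utc)));
        // Local: the string ends in the machine's UTC offset.
        Console.WriteLine(ToXml(new DateTime(2008, 7, 6, 4, 3, 2, DateTimeKind.Local)));
        // Unspecified: no suffix at all.
        Console.WriteLine(ToXml(new DateTime(2008, 7, 6, 4, 3, 2, DateTimeKind.Unspecified)));
    }
}
```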

Tuesday, May 27, 2008

Flex interaction with JavaScript (and ASP.NET)

It just doesn't quite work out of the box.

First, a Flash object within an HTML form element fails when the Flash runtime tries to register the callbacks you defined in your Flex app. For example, a callback added in ActionScript using ExternalInterface.addCallback("someFunction", someFunctionPointer) is evaluated in the browser as __flash__addCallback(objectId, "someFunction"). A hack to get around this would be to assign your Flash object to window.objectId as soon as you've created it: hey presto! you've worked around one limitation.

Second, there appears to be a race condition if you need to initialize your Flex app with JavaScript data as soon as both are ready. At first, I thought I had a choice... push from JavaScript or pull from Flex. To push, we need to know that the callback has been added in the browser: either by polling or by registering an event. I refuse to poll, and I can't find a way to register for the event. The remaining option is to pull. The question is, will the source data (JavaScript) have been assigned by the time my Flex Application's applicationComplete event has fired? No idea, really.

Friday, April 25, 2008

Notes On the :: Operator in C#

Have you seen this construct before in auto-generated partial classes (e.g. code-behind .designer.cs for a web control)?

protected global::System.Web.UI.WebControls.TextBox MyTextBox;

Basically it's there to ensure that the System namespace is the same System namespace that's declared in the global namespace. So, why would they need this? Consider that you might define your new UserControl-derived class in a namespace like this:

namespace MyNamespace.System { ... }

Visual Studio's code-generator would create a partial class in the MyNamespace.System namespace, in which it would define a field MyTextBox of type System.Web.UI.WebControls.TextBox. However, when the time came for the compiler to put the puzzle together, the namespace resolution rules would match the "System" part of the member's type to the [unintended] "MyNamespace.System" namespace. Not cool.

Enter the global namespace alias, and the namespace alias qualifier operator. (Wow, that sentence is a mouthful).

To force the compiler to use the intended "System" namespace (which resides in the global namespace, not the MyNamespace namespace), the MyTextBox member is declared with the global:: namespace alias qualifier.
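
You can reproduce the clash without Visual Studio's designer. In this sketch (Widget and GlobalAliasDemo are placeholder names, and I've used StringBuilder rather than a web control so it compiles anywhere) any code inside MyNamespace sees MyNamespace.System first, so the global:: qualifier is the only way back to the real one.

```csharp
namespace MyNamespace.System
{
    // The mere existence of this namespace shadows the real System for code inside MyNamespace.
    public class Widget { }
}

namespace MyNamespace
{
    public class GlobalAliasDemo
    {
        // A bare "System.Text.StringBuilder" would not compile here, because "System"
        // binds to MyNamespace.System, which has no Text namespace.
        public static global::System.Text.StringBuilder Build()
        {
            return new global::System.Text.StringBuilder("resolved from the global namespace");
        }

        public static void Main()
        {
            global::System.Console.WriteLine(Build());
        }
    }
}
```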

Obviously, things NOT to do include:
1) using global = System.Diagnostics;
Generates the compiler warning: Defining an alias named 'global' is ill-advised since 'global::' always references the global namespace and not an alias.
2) namespace System { class MyClass { } }
No warning from the compiler (Java doesn't let you do the equivalent).

Further usage patterns:

using sys = global::System; //<-- nice and recursive :-)
using diag = System.Diagnostics;
using dbg = System.Diagnostics.Debug;

sys::Object o = new System.Object();
diag::Debug.Assert(true); // <-- OK
dbg.Assert(true); // <-- OK, but important to make the distinction that it's a "type" alias and not a "namespace" alias.

Wednesday, March 19, 2008

More HTTP Compression in IIS 6.0

Further to my previous post, I've come across more issues with HTTP compression and IIS 6.0. IIS allows the compression of dynamic files (e.g. files generated on the fly by ASP.NET handlers), but Internet Explorer doesn't appear to like it when the dynamic file is a zip itself. Unfortunately, compression is controlled at the "Service" level (right-click on Web Sites, rather than right clicking on an individual web site) in IIS manager snap-in, and using shared infrastructure can mean that your web site breaks immediately for no apparent reason. Lucky for you though, the metabase can be tweaked to enable/disable compression at the "Site" level... you just have to be able to figure out the site number.

adsutil.vbs SET W3SVC/687245286/root/DoDynamicCompression False
adsutil.vbs SET W3SVC/687245286/root/DoStaticCompression True

IIS from the dark ages

Back in IIS 5.1 you can easily run multiple sites, just not all at once. Run the following scripts to enumerate the sites already installed, and copy an existing site to a new site!
[code]
C:\inetpub\AdminScripts>CScript.exe .\adsutil.vbs COPY W3SVC/3 W3SVC/4
C:\inetpub\AdminScripts>CScript.exe .\adsutil.vbs ENUM W3SVC /P
[/code]

Wednesday, March 12, 2008

(De)bugger

A neat trick to assist debugging in C# or JavaScript, when you have no (or little) control over how or when the process is started: use the System.Diagnostics.Debugger.Launch() method (C#) or the debugger statement (JavaScript).

Monday, February 25, 2008

Visual Studio Dependencies

You're building a project named A.exe and you've included a file reference to the assembly B.dll (the value of the "CopyLocal" option is set to "true"). Assembly B references C.dll and D.dll, one of which is installed to the GAC. When you hit the build button, a copy of B is made in the project's output directory. Now comes the tricky part: without the ability to specify (or query) the "CopyLocal" option for the dependent assemblies, does msbuild.exe copy C.dll and D.dll? The answer: if the dependent assembly cannot be found in the GAC, then it's copied; otherwise it isn't.

One more trick here: don't use Windows Explorer to try and view the GAC on another machine; you can't. Something about a shell extension? You can see the GAC with a command prompt and a mapped network drive.

To disable the shell extension, you can set the following registry value:
HKLM\Software\Microsoft\Fusion\DisableCacheViewer [DWORD] to 1. This will enable you to view the GAC of a remote machine.