The Chronicles of Nojo

Tuesday, April 22, 2025

Decay

Smoke detectors are fascinating. They contain a tiny radioactive sample of Americium-241, the atoms of which are continuously decaying into Neptunium-237. Alpha decay is when the nucleus of an atom emits an alpha particle, which is made of 2 protons and 2 neutrons. The half-life of $Am$ is approximately 432.2 years, which means that if you start off with $N_0$ atoms at $t=0$ then at $t=1$ you'll have $\frac{1}{2}N_0$ and at $t=2$ you'll have $\frac{1}{4}N_0$ left. Generally:

$$N_t = N_0(\frac{1}{2})^t\tag{1}$$

This is an exponential function, so if you wanted to, you could rewrite it as:

$$N_t = N_0e^{t\ln\frac{1}{2}}\tag{2}$$

$\ln\frac{1}{2}$ is an interesting constant that will reappear later. It's roughly equivalent to $-0.69314718$.

If you plotted the above graph, you might note that time is on the x-axis and the number of $Am$ atoms is on the y-axis. This number is always decreasing as time increases. For a given time interval $t=a, t=b$, the decrease in $N$ from $N_{a}$ to $N_{b}$ is a direct consequence of the activity.

Activity, measured in Becquerels $Bq$, is the number of decays per second, but as our graph's time axis is currently measured in half-lives we can't just read the activity directly from the slope.

Approach #1

$$432.2 years\times365.25\times24\times3600 = 13639194720 secs\tag{3}$$

If we already had $N_0$ we could plug $\frac{1}{13639194720}$ into our first equation and see $N$ one second later. This would be the number of Becquerels.

Approach #2

Modify the original equation so that $t$ is no longer measured in half-lives but instead in seconds:

$$N_t = N_0(\frac{1}{2})^\frac{t}{13639194720}\tag{4}$$

So actually, both approaches are identical, but the second approach is more reusable: we've taken the general equation $1$ and altered it to be specific to the half-life of Americium-241.

Let's find a typical value of $N_0$ for our smoke detector example, where a Google search reveals about 0.3μg is common. How do we find the number of Americium-241 atoms in 0.3μg? Avogadro's number: $6.02214076\times10^{23}$ is the number of particles in one mole.

$$N_0 = 0.3\times10^{-6} / 241 \times 6.02214076\times10^{23} = 7.49644\times10^{14}\tag{5}$$

Let's plug this number into equation $4$. Unsurprisingly, if you follow along in Excel, you'll see the same number for $N_0$ as $N_1$.

$$N_1 = 7.49644\times10^{14}\tag{6}$$

However, if you rearrange it, you'll find that Excel didn't throw away the bits, it just hid them away with some formatting that prioritises the most significant digits:

$$N_0 - N_1 = 38097.125\tag{7}$$

Here you have it. In a smoke detector with just 0.3μg of Americium 241, there are approximately 38k decays per second. Alpha particles are being emitted at 38kHz! Note that in equation $7$, $N_1 - N_0$ would give us a negative slope of $-38097.125$ indicating that our sample contained fewer atoms at $t = 1$.

Bonus section:

The slope of the exponential function $1$ is always directly proportional to the value on the y-axis! In this case, the $N$ value is scaled by $ln\frac{1}{2}$ while exponential function $4$ is scaled by a different number $\frac{\ln\frac{1}{2}}{13639194720}$. Intuitively this should make sense: when the amount of radioactive material is halved, so is the activity.

Sunday, May 17, 2020

Heterodyne

Frequency modulation (FM) is one of the techniques used to transmit audio signals at radio frequencies. If we "multiply" a sinusoidal signal wave with a sinusoidal carrier wave, the result is two waves at frequencies just above and just below the original carrier.

By way of an example let's try and transmit a 440 Hz tone using a 10 kHz carrier.

$signal(t) * carrier(t) = mixed(t)$

The mixed result will then be the superposition of a 10,000 Hz - 440 Hz and a 10,000 Hz + 440 Hz wave.

We can un-mix the mixed wave by applying the same function to our pair of tones at 9,560 Hz and 10,440 Hz; doing so will create another pair of tones: one for each input. Two of those tones will be very high frequency and almost inaudible; the other two will be centered around 0 Hz. Originally, the tone at 9,560 Hz would be a kind of "mirror image" of our original tone (though this distinction isn't readily apparent for a single frequency), but after the second multiplication, it's now the "mirror image of a mirror image" so it's the right way around. The tone that was at 10,440 Hz when modulated will now be at -440 Hz and inaudible.

As a second example, let's say the signal is an audio file containing many frequencies between 20 Hz and 20 kHz. We know from the superposition property of waves that we can decompose our song into 19,980 different signals (yes, there will be some rounding errors if we only allow integer values, but hear me out). And let's chose a slightly higher frequency carrier of 100 kHz. Our lowest frequency of 20 Hz will now be mixed to 99,980 Hz and 100,020 Hz while our highest frequency will be mixed to 80,000 Hz and 120,000 Hz. The signal frequencies in between the lowest and highest will then fall into the two bands when mixed. One band will appear to have the "mirror image" of our original signal: that is to say the higher signal frequencies will now occupy lower mixed frequencies and the lower signal frequencies will now occupy higher mixed frequencies. The other band will contain the same information as the signal but simply shifted to a higher base frequency.

Saturday, April 25, 2020

Binary vs Decimal, Part 2

Microsoft's .NET framework gives developers options for representing floating point numbers as well as integers. The same is true of Microsoft's SQL Server, though the options differ subtly.

Remembering from my earlier post that we can convert between base 10 (decimal) and base 2 (binary) integers with no loss of information, it shouldn't surprise you that .NET and SQL Server offer similar integer types with varying number of bits. Even though we're storing numbers that we entered as a sequence of digits (i.e. literals) in C# source or SQL script, they are converted into base 2 integers. However, this post is not about integers.

If, instead of integers, we wish to store fractions (e.g. negative powers of our chosen bases) we must ask a few questions, the first being:

Are we trying to represent a fractional number that has a base 10 representation? If so, we already know that the conversion from base 10 to base 2 cannot necessarily be done without losing information. We can very easily represent the decimal number 0.625 in binary because it's simply the sum of 0.500 (or 2^-1) and 0.125 (or 2^-3), but we cannot exactly represent the decimal number 0.99 because the number of base 2 powers required to add up to 0.99 would quickly exceed our chosen precision (the number of bits available to represent the number.) Instead, if wanted to store our decimal number in a binary field with a fixed number of bits, we would need to approximate the decimal.

0.989999949932098388671875
0.9900000095367431640625
0.990000069141387939453125

The three numbers above are actually sums of powers of 2 and can be stored in a 24 bit binary field without losing information^[1]. Notice that the middle number is closest to 0.99.

But what if we wanted to represent the price of our chocolate: 34.99? In order to fit into a field of the same width we would need to adjust the largest exponent and make a corresponding adjustment to
smallest exponent. Our closest "sum of powers of 2" approximation might be:

34.990001678466796875

In binary this same number would look like

100010.111111010111000011

2⁵	2⁴	2³	2²	2¹	2⁰	2^-1	2^-2	2^-3	2^-4	2^-5	2^-6	2^-7	2^-8	2^-9	2^-10	2^-11	2^-12	2^-13	2^-14	2^-15	2^-16	2^-17	2^-18
1	0	0	0	1	0	1	1	1	1	1	1	0	1	0	1	1	1	0	0	0	0	1	1

We're ready to ask ourselves the second question: can our application tolerate the approximation in converting from base 10 to base 2? In applications where the inputs, intermediate values and outputs are not exact numbers to start with, (e.g. in the fields of science and statistics, to name a few) the answer is probably yes, but certain applications with exact inputs (e.g. in the field of finance) might strictly require us not to approximate the values. The next section presents a way to circumvent the approximation.

In this series' first post, we learned that we can convert numbers between all bases without loss of information, but only when the lowest power for both bases is b^0 or 1. Some smart engineers realised that if we pre-scale our number by some chosen scale so that it becomes an integer, then we can post-scale it when we're done operating on it. Going back to our example number of 0.99, we could pre-scale it by 10^2 (or 100) giving us 99, which converts exactly into the binary number 0110 0011. Now we're able to store the exact value in 7 bits when previously we could only store the approximation in 24 bits.

At this point SQL Server and .NET diverge:

SQL Server offers the DECIMAL data type, which has a fixed precision and scale. In other words, when you define a value of this type you explicitly provide the precision and scale (e.g. DECIMAL(26, 2). If we wanted to store the number 31.42 in this field, SQL Server would pre-scale it by 10^2, convert the number 3142 to binary, and store the binary representation.

.NET, on the other hand, offers the System.Decimal struct type which is a floating point number. The scale that's chosen to pre-scale any given decimal number is packed and stored along with the binary integer representation, and can be accessed to post-scale the integer back into a fraction when needed. The struct is 128 bits wide, and stores the integer part in 96 of those bits. 2^96 is approximately 7.92282E+28. We can create a System.Decimal with anywhere between 0 and 27 (or 28) digits and we choose where the floating decimal point will go.

It might be worth emphasizing that SQL Server's DECIMAL is not floating point. The distinction between fixed and floating point is probably best made by looking at where the scale value is stored.

Converting between SQL Server's DECIMAL and .NET's System.Decimal can take a little forethought.

[1] https://www.h-schmidt.net/FloatConverter/IEEE754.html
[2] https://docs.microsoft.com/en-us/sql/t-sql/data-types/decimal-and-numeric-transact-sql

Binary vs Decimal, Part 1

We commonly represent numbers in base 10, or decimal. The number three thousand, one hundred and forty-two, for example, can then be represented by the composite symbol 3142.

Power	10³	10²	10¹	10⁰
Digit	3	1	4	2
Product	3000	100	40	2

The arithmetical sum of products (of powers and digits) gives us back our number: 3142.

Computers represent numbers in base 2, or binary. The same number three thousand, one hundred and forty-two has a binary representation of 1100 0100 0110 (I added some spaces to make the long sequence of bits easier to process visually.)

Power	2¹¹	2¹⁰	2⁹	2⁸	2⁷	2⁶	2⁵	2⁴	2³	2²	2¹	2⁰
Bit	1	1	0	0	0	1	0	0	0	1	1	0
Product	2048	1024	0	0	0	64	0	0	0	4	2	0

Again, the arithmetical sum of products (of powers and bits) gives us back our number: 3142.

Notice the similarity of the mechanical operations between the two tables above: a pattern (or algorithm) should emerge. Once it does, you may continue reading.

With representations like these, it's instructive to notice that we can represent any number that is a multiple of the rightmost power, up to a maximum that is determined by the leftmost power. Once we choose (fix) the rightmost and leftmost power, we restrict our representation to only those real numbers that can be represented subject to those constraints.

Consider a real world example: the price of a bar of chocolate at your local grocery store is R 34.99. If the shop has one of those electronic price displays on the shelf, that display is likely to be physically limited to displaying only 6 digits.

To enable us to represent any number that is a multiple of 1 cent (R 0.01) we must set the rightmost exponent to -2. Consequently, because we also defined the number of digits as 6, we are also constrained to represent numbers less than or equal to 9999.99. So, we can encode any number from 0.01 to 9999.99 inclusive, but we cannot encode smaller numbers (such as 0.001) or larger numbers (such as 10000.00) as they exceed our chosen constraints.

Power	10³	10²	10¹	10⁰	10^-1	10^-2
Digit	0	0	3	4	9	9
Product	0	0	30	4	0.9	0.09

Once we restrict our representations to a limited number of digits (in the case of decimal) or bits (in the case of binary) we essentially restrict the set of numbers that we can represent. Remember: every number must be some multiple of the rightmost power. It's not coincidental that I chose an exponent of 0 for the first two examples; b⁰ is always 1 whatever base b is. That has the consequence that we can represent all integers in both decimal and binary (as long as we increase the number of digits or bits sufficiently.) As long as our smallest representable value is common to both schemes, we can represent the same numbers in each of them. If the following equation holds true then we can convert between two schemes d and b, with integer exponents m and n respectively, without loss of information:

${d}^{m} = {b}^{n}$

When d is 10 and b is 2 the only solution is where m and n are both zero. We can only convert integers (i.e. multiples of b⁰, or 1) between base 10 and base 2 without losing information.

If we chose a different pair of bases 16 (hexadecimal) and 2 (binary, again) then we see an additional repeating solution each time the larger base power is an integer multiple of the smaller base power, but we still wouldn't be able to represent the other powers of 2.

Base 16

...

16¹

16⁰

16^-1

16^-2

...

Base 2

2⁴

2³

2²

2¹

2⁰

2^-1

2^-2

2^-3

2^-4

2^-5

2^-6

2^-7

2^-8

Multiples of

0.0625

0.00390625

Sunday, January 13, 2019

Wake On Lan (WOL)

Wake On Lan (WOL) is possible on the ASUS M3A78-CM board, but it requires a level of finesse to get it working. First, the cryptically named BIOS option "Power > APM Configuration > Power on From S5 by PMEs" must be enabled. Then, it's important to note that WOL won't work when the power switch is off (i.e. SB_PWR LED is still lit, but the computer has been shut down). Instead, the machine needs to be merely suspended in order to pick up the "magic" WOL ethernet packet. You can suspend it with the command:

sudo systemctl suspend

You'll also need to decide what to do with the BIOS setting "Power > APM Configuration > Restore on AC Power Loss"... [Last State] won't bring it back to suspended after an AC power loss, but it will bring it back to running if the server was running when the power went off. If the power goes out when you're suspended, then you're up a certain creek without a paddle.

On waking, some network drivers need to be reloaded. Check my github's private ubuntu/WOL repository for the script that does this, or follow the steps in this accepted answer: ubuntu 18.04 - ethernet disconnected after suspend

To send a WOL packet, install and run etherwake from some other machine in the same subnet:

sudo apt install etherwake
sudo etherwake -i enp1s0 00:22:15:DE:31:42

Monday, November 19, 2018

Hypothesis Testing Errors

Hypothesis testing allows us to quantify relationships between known samples and the unknown population from which they were taken. What do I mean by that? Let's say I am interested in the returns from investing in a particular stock. The daily return might be calculated as the ratio of today's closing price over yesterday's closing price. Whether I was to take a sample of 30 returns (over 31 days) or 60 returns (over 61 days), I still couldn't know the population mean return, but I could hypothesize about it... so I do.

I choose a "null hypothesis" that the population mean is 4.5%. Given that my sample mean was 5% and there was a 2% standard deviation, 30 observations would produce a test statistic of $\frac{0.05 - 0.045}{\frac{0.02}{\sqrt{30}}} = 1.37$ standard deviations. In other words, the p-values would be 8.5% and 91.5%. For an 80% confidence two-tailed test (10% in the left tail and 10% in the right tail) we would reject the hypothesis, but at 90% confidence we would accept it. Note how we've already accepted or rejected the hypothesis regardless of its truth.

Now, imagine an all-seeing and all-knowing supernatural bystander watching the same events unfurl... they could know the population mean... and even though they wouldn't be obliged to share the exact value with me, let's say that they'd at least tell me if my hypothesis was true or false; that is to say: if I hypothesized that the population mean was 4.5% and it actually was 4.5% then the hypothesis would be true, otherwise if would be false (it could be 4.2% or 4.3% or even -5% or 63%; the point is we don't know).

If we take our two test results and combine them with the two possible truth values, it produces this 2X2 matrix of outcomes.

	Accept	Reject
True	Correct	Type 1 Error
False	Type 2 Error	Correct

True/False: does the actual population mean match the hypothesized mean?
Reject/Accept: does our statistic fall outside/inside the confidence interval?
Correct/Incorrect: did we accept a true (or reject a false) hypothesis or did we commit an error?

Let's ask the bystander if our hypothesis was indeed true or if it was false.

Yes, it's true:
- At 90% we accepted it
- At 80% we rejected it (Type 1 Error)
No, it's false:
- At 90% we accepted it (Type 2 Error)
- At 80% we rejected it.

This last pair of possibilities deserves more analysis: when the bystander tells us our hypothesis was false, it doesn't seem to matter why we were correct to reject the hypothesis; all that matters is that we did.

This video tries to explain it but I am not confident the author is correct. I'd prefer to side with authors of the CFA curriculum who say - in short - "it's complicated".

Wednesday, November 07, 2018

Fourier Transform in C#

I am fascinated by the Fourier transform; specifically: its ability to decompose a wave into a series of fundamental waves. Mathematicians might prefer to see it described it more formally, but here I'll be trying to make it understandable.

Obviously, for a wave to have motion, time needs to pass. If we freeze time then our wave cannot oscillate. To say a wave has frequency, we need first to observe a period in which the wave has motion. The longer we observe its movement the more we can mathematically extract from it, but for now let's simply say that we need to acquire a range of samples measured as the wave oscillates. From this point, armed with our array of floating point numerical samples, we menacingly approach the rabbit hole.

The remainder of this post will focus on a single practical aspect of the Fourier transform: how to interpret the result, especially its magnitude. I'm using the MathNet.Numerics package from Nuget, which provides the handy method Fourier.Forward(samples, FourierOptions.NoScaling).
Note: you need to specifically pass FourierOptions.NoScaling if you want control of the output.

Since I grew up in an age when Compact Discs were seen as technological marvels, I am going to use the standard 44.1 kHz as my sample rate for the following example. A full second's worth of samples (44100) might seem a reasonable place to start, but as we'll see later, it's by no means a requirement. The chosen sample rate theoretically limits the upper frequency to 22050 Hz, and even though we pass in 44100 samples to Fourier.Forward we get back a symmetrical result with the first 22050 data points carrying the frequencies up to 22050 Hz, and the second 22050 carrying them all in again reverse. We can safely ignore the second half of the output for this post. If our input signal (encoded in the 44100 samples) was a pure sinusoidal signal (of any frequency up to 22050 Hz) oscillating between +1 and -1, it will contribute 22050 units to the frequency bin associated with the signal. Those units can be found in the Complex32.Magnitude property. Multiplying the amplitude of the input signal by any constant A will scale the magnitude to A x 22050. However, if we pass in a fraction of the full second's worth of samples - let's say $\frac{4}{5}$, which is 35280 instead of 44100 - then we see the number of units fall to $\sqrt{\frac{4}{5}}A\frac{samplerate}{2}$. And yes, the number would rise if the fraction was greater than one.

It's probably important to note that we haven't changed the sample rate: that's still 44100 and it still means we cannot get any higher frequencies than 22050 Hz. What we've done is alter the resolution. With $\frac{4}{5}$ of the original resolution each frequency bin would be $\frac{5}{4}$ as wide so that we could still accommodate all 22050 Hz, but in fewer distinct bins. It's not a net loss, though; it's a trade off: we've lost resolution in the frequency domain but we've gained resolution in the time domain (there will be $\frac{5}{4}$ as many Fourier transforms in any given time period). Of course, this now means we cannot simply read off the frequency from the bin number; now we need to scale the bin number by our resolution factor. Staying with the $\frac{4}{5}$ example: 800 Hz would now be found in bin 640, which is $\frac{4}{5}$ of 800.

So... why did we choose to use the Complex32.Magnitude property, and what does it mean? Each of the data points we get as output from Fourier.Forward is a complex number, having both a Complex32.Real and an Complex32.Imaginary part. Besides these two orthogonal vectors pointing in the real and imaginary axes, another way to interpret them is as a single vector that has two properties: Complex32.Magnitude and Complex32.Phase. Together, these allow us to imagine the frequency bin as a clock with a single spinning hand of variable length. If the hand is long: there's more signal; if it's short: there is less. And the angle of the hand shows us the offset of the signal (relative to convention).

Input Shift	Re	Im	Mag	Phase
+0	0	-22050	22050	$-\frac{π}{2}$
+$\frac{π}{2}$	22050	0	22050	0
+π	0	22050	22050	$\frac{π}{2}$
+$\frac{3π}{2}$	-22050	0	22050	π

Finally, how did we get 22050 for the maximum Complex32.Magnitude given a maximum amplitude of 1 (and minimum amplitude of -1) for our input signal? Well, that's how the Fourier transform works: it essentially adds up the 22050 vector results of pairing the input signal with the known pure sinusoidal signal.

Sunday, October 28, 2018

Demand and Supply: Part 2

In the previous post, there was a table of coefficients for average cost (AC), total cost (TC) and marginal cost (MC). Let's decompose TC into two components: one that doesn't vary as quantity (Q) varies and another that does. We'll call the components total fixed cost (TFC) and total variable cost (TVC). We can also decompose AC into average fixed cost (AFC) and average variable cost (AVC)

	AC	AFC	AVC	TC	TFC	TVC	MC
Q^-1	a	a	0				0
Q⁰	b	0	b	a	a	0	b
Q¹	c	0	c	b	0	b	2c
Q²	d	0	d	c	0	c	3d
Q³				d	0	d

AFC: $p = \frac{a}{q}$
AVC: $p = b + cq + dq^2$
TFC: $p = a$
TVC: $p = bq + cq^2 + dq^3$

Note how $AC = AFC + AVC$ and $TC = TFC + TVC$
Note too how MR is unrelated to TFC, but has the same relation to both TC and TVC.

Demand and Supply: Part 1

Let's use a little bit of mathematics to assist our understanding of demand and supply, focusing here on the slightly more complicated concept of supply (demand functions identically, but we don't often decompose revenue into fixed and variable components). In the study of economics we often encounter three "curves" that describe how the price of an item is related to the quantity (Q) a firm is willing to supply: average cost (AC), total cost (TC) and marginal cost (MC). These three functions are mathematically related by two equations: $AC = \frac{TC}{Q}$ (which also means that $TC = ACxQ$) and $\frac{dTC}{dQ} = MC$ (which also means that $TC_q = \int_0^q MC dQ$). It's too soon to explain, but these are not quantities the firm is willing to supply, but rather the quantities the firm is able to supply.

A third degree polynomial function should be sufficient for this demonstration. In the table below, I've displayed the coefficients of each power of Q, highlighting the "fixed" component of TC (at Q⁰) and AC (at Q^-1) noting that MC has no such "fixed" component.

	AC	TC	MC
Q^-1	a		0
Q⁰	b	a	b
Q¹	c	b	2c
Q²	d	c	3d
Q³		d

AC: $p = \frac{a}{q} + b + cq + dq^2$
TC: $p = a + bq + cq^2 + dq^3$
MC: $p = b + 2cq + 3dq^2$

If the firm has any fixed costs (e.g. monthly rental payments on commercial property), they would be represented by the coefficient $a$; if it has variable costs (e.g. monthly wages payable to labour) they would be represented by the coefficients $b$,$c$ and $d$ (these are the costs that vary as Q is varied.