Sigmas and for loops
Thomas Ward, February 03, 2021
Short summary
Sums and products of sequences, represented with "∑" and "∏",
respectively, can be thought of as `for` loops, and vice versa.
For those of us who program more than we read/write math, mentally
translating them as such can make them faster and easier to understand.
A table to translate between the two is below:
Math notation | For loop |
---|---|
∑ or ∏ | for |
i = 1 | i = 1 on the for line |
n | i <= n on the for line |
body of ∑ or ∏ | body of for loop |
This means the sum of a sequence:
$$\sum_{i=1}^{n} \text{body}$$
is equivalent to `for (i = 1; i <= n; i++) { sum += body }`
and the product of a sequence:
$$\prod_{i=1}^{n} \text{body}$$
is equivalent to `for (i = 1; i <= n; i++) { product *= body }`
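The table can be exercised directly. Below is a minimal Python sketch, assuming a hypothetical `body(i)` function stands in for whatever expression sits to the right of the ∑ or ∏. The names `sigma_sum` and `pi_product` are illustrative only and do not come from any library.

```python
def sigma_sum(body, n):
    """Sum body(i) for i = 1..n, written as an explicit for loop."""
    total = 0  # a sum starts from 0
    for i in range(1, n + 1):  # i = 1 below the ∑, n above it
        total += body(i)
    return total


def pi_product(body, n):
    """Multiply body(i) for i = 1..n, written as an explicit for loop."""
    product = 1  # a product starts from 1, not 0
    for i in range(1, n + 1):
        product *= body(i)
    return product


# Sum of the first 4 squares: 1 + 4 + 9 + 16
sum_of_squares = sigma_sum(lambda i: i * i, 4)
# Product of 1..5, i.e. 5 factorial
factorial_5 = pi_product(lambda i: i, 5)

print(sum_of_squares)  # 30
print(factorial_5)     # 120
```

The only differences between the two loops are the operator (`+=` versus `*=`) and the starting value of the accumulator (0 versus 1).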
Sigma notation
Math in manuscripts is often daunting and, for some of us, causes
our eyes to glaze over. It is really, though, just a concise way
to communicate information. Sums and products of sequences,
represented by ∑ and ∏, often take me longer than I would like to
admit to process and understand, particularly if there are multiple
nested ones. Once I had the (obvious in retrospect) "shower thought"
that ∑ and ∏ were equivalent to "for" loops in programming, they
became much easier to read. Below are some brief examples that will
hopefully help others make the same mental translation. If you see
a ∏ instead of ∑ in the literature, remember that it is the same
translation; you just need to use multiplication (`*=`) rather than
summation (`+=`) in the `for` loop.
Single sigma
Take the log-probability score of a model, S(q), which is calculated by summing the log of the model's probability, qᵢ, for each observation, i:
$$S(q) = \sum_{i=1}^{n} \log(q_i)$$
You can translate this to a simple for loop as follows (assuming 1-based array indexing):
- S(q) becomes `log_prob_score`
- ∑ becomes `for`
- `i = 1` under the ∑ becomes the `i = 1` in the `for` line
- `n` over the ∑ becomes `i <= n` in the `for` line
- The "body" of the ∑, log(qᵢ) in this case, is the body of the `for` loop.
And that's it! See an example written in awk below:
# get_data() and get_model_predicted_prob() are placeholders
y = get_data();
n = length(y);
q = get_model_predicted_prob(y);
log_prob_score = 0;
for (i = 1; i <= n; i++) {
    log_prob_score += log(q[i]);
}
print "The log prob score is", log_prob_score;
A functional Python version, whose structure is remarkably similar to the math notation, would be:
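A minimal sketch of such a functional version, assuming `q` is a plain list of predicted probabilities (the values here are made up for illustration):

```python
import math

# Hypothetical model probabilities, one per observation
q = [0.5, 0.25, 0.125]

# sum(...) plays the role of ∑, and the generator expression is its body
log_prob_score = sum(math.log(q_i) for q_i in q)

print("The log prob score is", log_prob_score)
```

Here the loop variable and its bounds are implied by iterating over `q` directly, so there is no explicit `i = 1` or `i <= n`.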
Nested sigma
Nested sigmas can similarly be translated into nested `for` loops.
Take log-pointwise-predictive-density (lppd), the Bayesian version of the
log-probability score. It sums, over each data observation, the log of
that observation's probability averaged over each sample of the posterior.
Two "each"es in the previous sentence mean we will need two sigmas, or
`for` loops. It is mathematically represented as:
$$\mathrm{lppd}(y, \Theta) = \sum_{i=1}^{n} \log \left( \frac{1}{S} \sum_{s=1}^{S} p(y_i \mid \Theta_s) \right)$$
which can translate to two `for` loops:
- lppd(...) becomes `lppd`
- First ∑ becomes outer `for`
- `i = 1` under the first ∑ becomes the `i = 1` in the outer `for` line
- `n` on top of the first ∑ becomes `i <= n` in the outer `for` line
- The "body" of the first ∑, log(...) in this case, is the body of the outer `for` loop.
- Second ∑ becomes inner `for` (and the sum will be accumulated with `sum_sample_prob`)
- `s = 1` under the second ∑ becomes the `s = 1` in the inner `for` line
- `S` on top of the second ∑ becomes `s <= S` in the inner `for` line
- The "body" of the second ∑, p(...) in this case, is the body of the inner `for` loop.
This translates to the following awk code:
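# get_data(), get_posterior_samples() and prob() are placeholders
y = get_data();
n = length(y);
theta = get_posterior_samples();
S = length(theta);
lppd = 0;
for (i = 1; i <= n; i++) {
    # inner sum: accumulate the probability of y[i] over all samples
    sum_sample_prob = 0;
    for (s = 1; s <= S; s++) {
        sum_sample_prob += prob(y[i], theta[s]);
    }
    # outer sum: add the log of the average probability
    lppd += log(sum_sample_prob / S);
}
print "The lppd is", lppd;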
And a "make-your-eyes-bleed", not-so-great "functional" representation in Python that is closer to the math representation:
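One possible sketch of such a one-liner, assuming a hypothetical `prob(y_i, theta_s)` that returns a normal density with mean `theta_s` and unit standard deviation. The observations and posterior samples are made up for illustration.

```python
import math


def prob(y_i, theta_s):
    """Hypothetical likelihood: normal density with mean theta_s and sd 1."""
    return math.exp(-0.5 * (y_i - theta_s) ** 2) / math.sqrt(2 * math.pi)


y = [0.1, -0.4, 1.3]           # made-up observations
theta = [0.0, 0.2, 0.5, -0.1]  # made-up posterior samples of the mean

# Outer sum(...) is the first ∑, inner sum(...) is the second ∑
lppd = sum(math.log(sum(prob(y_i, theta_s) for theta_s in theta) / len(theta)) for y_i in y)
```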
Forget that line of code above ever existed.
Comments, questions, input, concerns?
I hope this article helps! Please contact me with any questions or input on the article using any of the methods on my contact page.