# Programming and a school algebra formula finally deciphered

### Explanation is due

I guess you also could have had the same feeling when you learnt algebra at school. Some formulas were clear and understandable, but some were cryptic and it was unclear how would anyone derive them. And then the only way to master it is to memorize it. For example, there is this known formula for a difference of squares:

a2 – b2 = (a – b) * (a + b) = a2 + ab – ba – b2 = a2 – b2 , (1)

Then, there was a little bit more cryptic formula for a difference of cubes, which is not that obvious for a regular student:

a3 – b3 = (a – b) * (a2 + ab + b2) = a3 + a2b + ab2 – ba2 – ab2 – b3 = a3 – b3, (2)

So, I think you get it and the next formula is for a4 – b4 ,

a4 – b4 = (a – b) * (a3 + a2b + b2a + b3) = a4 + a3b + a2b2 + ab3 – ba3 – a2b2 – b3a – b4 = a4 – b4 , (3)

And finally, we get to the most cryptic formula that could be frustrating in a school algebra lesson, the formula for a difference of two positive whole numbers (integers) of power of n

an – bn = (a – b) * (an−1 + an−2b + an-3b2 + … + a2bn-3 + abn−2 + bn−1) , (4)

Now, the last formula seems frightening, and most interestingly one could ask, how did in the world anyone derive it? Also, how do you use it correctly?

### Take it slow

Let’s look at it in a slow motion. If we look at how we get from formula (1) to formula (4) we can notice that there is some symmetry in the numbers in the second braces in each of the formula.

So, the second braces in formula (1) have

(a + b)

the second braces in formula (2) have

(a2 + ab + b2)

the second braces in formula (3) have

(a3 + a2b + ab2 + b3)

the second braces in formula (4) have

(an−1 + an−2b + an-3b2 + … + a2bn-3 + abn−2 + bn−1)

Do you see it? When there is a2 on the left side there is a corresponding b2 on the right, when there is a2b on the left, there is a corresponding b2a on the right side, etc. So this is the symmetry I am talking about. The general formula is actually a factorization of a polynomial formula. But we can look at it in a different manner, just to understand how to use it properly. The derivation of the general formula is a little bit more complex and can be found here.

One interesting thing to notice is that the sum of powers of each a, b or there multiplication ab in the second braces is always n – 1.

(a + b) = a1 + b1 , i.e. the powers are 1, 1

(a2 + ab + b2) = a2 + a1b1 + b2 , i.e. the powers are 2, 1 + 1, 2

(a3 + a2b + b2a + b3) = (a3 + a2b1 + a1b2 + b3) , i.e. the powers are 3, 2 + 1, 1 + 2, 3

(an−1+ an−2b + an-3b2 + … + a2bn-3 + abn−2 + bn−1), i.e. the powers are n – 1, n – 2 + 1 = n – 1 , n – 3 + 2 = n – 1, 2 + n – 3 = n – 1, etc.

Now, also let’s pay attention that we can treat 1 as 1 = a0 or 1 = b0, and let’s look again at the expressions above

(a + b) = a1b0 + a0b1 ,

(a2 + ab + b2) = a2b0 + a1b1 + a0b2 ,

(a3 + a2b + b2a + b3) = (a3b0 + a2b1 + a1b2 + a0b3),

(an−1+ an−2b + an-3b2 + … + a2bn-3 + abn−2 + bn−1)

= (an−1b0+ an−2b1 + an-3b2 + … + a2bn-3 + a1bn−2 + a0bn−1),

I hope you can see that there is a systematic pattern which is going on here.

### Rules of the game

Rule 1: The number of members in the second braces is always as the power of the initial expression, say two for a2 – b2; three for a3 – b3 etc.

Rule 2: The sum of the powers of each member in the second braces is n – 1, which was already shown in the previous examples.

Pay attention that this can be also proven by mathematical induction. But I leave it as an exercise for you.

### How to use this formula and how to zip it

Now that we’ve noticed there is a pattern this pattern show us how to use the formula in a simple way without the need in rote memorization or blindly using someone else derivation.

The only thing is to remember that the first braces always have (a – b) and in the second braces the sum of the powers of each member is n -1. Let’s look at the concrete example of a8 – b8.

Let’s start from the second braces, and write each member without powers in accordance to Rule 1. We know there should be n, i.e. 8 such members.

(ab + ab + ab + ab + ab + ab + ab + ab)

Now, let’s use the Rule 2 and add powers to each member in the second braces, remembering that for a‘s, powers start from n – 1 and decrement by 1 for each consecutive a, and for b‘s powers start from 0 power and increment by 1 for each b until n – 1. Applied to our example,

for a‘s: a7, a6, a5, a4, a3, a2, a1, a0

and b‘s: b0, b1, b2, b3, b4, b5, b6, b7

Now, putting these together in the formula we get,

a8 – b8 = (a – b) * (a7b0 + a6b1 + a5b2 + a4b3 + a3b4 + a2b5 + a1b6 + a0b7)

= (a – b) * (a7 + a6b + a5b2 + a4b3 + a3b4 + a2b5 + ab6 + b7).

### Zip it

So, now we ready to zip this formula using the math notation for the sum:

$a^n - b^n=(a - b)\sum^{n - 1}_{k = 0} a^{(n - 1) - k}{b^k}$

where k increments from 0 to n – 1, i.e. 0, 1 , 2, …, n – 1.

### An interesting turn of events

What is nice about this formula is the fact that it’s actually a concise description of an algorithm that checks whether a certain string is a palindrome.

The main idea is to take a sequence of letters (an array of characters in programming speak), and then start comparing

1. First vs. last letter
2. Second vs. one before last
3. etc
4. For each such case above check whether letters are the same. If there is at least one instance when they are not the same, then it’s not a palindrome.

In Java programming language this algorithm could be implemented as follows (run this code in online Java compiler)

public class Main {
public static void main (String[]args){
String word = "TENET";
System.out.println (isPalindrome(word));
}

static boolean isPalindrome (String word){
char[] charArray = word.toCharArray();
int n = charArray.length;

for (int k = 0; k <= n - 1; k++){
if (charArray[k] != charArray[(n - 1) - k]){
return false;
}
}
return true;
}
}


# Get Back for More Than 50 years. Thoughts on The Beatles: Get Back documentary

### Back to the… past

Have you heard about a new documentary called The Beatles: Get Back. It was created from more than 150 hours of audio and 60 hours of video while The Beatles were working on their new live performance back in January 1969. I header about this film while checking news on the BBC website. The trailer of the Get Back got me curious. The video was crisp as though it was filmed just a couple of days ago. The sound quality was good and the footage was interesting.

It turned out that the documentary was available only at Disney+ channel, so I revived my membership there just to watch this series. The Get Back consists of three episodes, each more than 2 hours long. So be ready to dedicate a good chunk of your day or even two days to be able to watch it in full, unless you intend to skim through it.

### Viewpoints

What is interesting about this series of films that you can watch it from a number of viewpoints. First, of all there is a historical viewpoint, since it was filmed in January 1969 which is almost 53 years away from now. Back in 1969, Apollo 11 landed on the Moon, Vietnam war was in its height, the Hippie movement was on the rise and solid state transistors started to make their way into electronic music equipment. So it was a hectic time to say the least. The Beatles at this period found themselves at uncertain times when the band’s future was questionable. It is also interesting to notice the fashion, cultural norms, food and other things that changed since then. For example, it is no longer acceptable to smoke cigarettes in closed public places.

At this backdrop the Get Back documentary is unwinding in its full beauty.

### Episode 1

First part of the documentary starts at the Twickenham Film Studios were the Beatles is trying to come up with more than a dozen new songs for the upcoming live performance, possibly a TV show. Some of these songs ended to be the songs for the Let It Be album. But the band finds very quickly that the place is not that good for creative process. What I find interesting, that most of the band members and their spouses smoked cigarettes inside the pavilion, threw cigarette buds on the floor which we can hardly imagine happing these days.

Looking at internal dynamics inside the band and how the creative process unfolded it was interesting to see interactions between Paul McCartney and John Lennon, some tension between McCartney and George Harrison and a calmness and wisdom of Ringo Star. What I liked to see is how the band worked together on lyrics for the songs. What I find surprising is how Ringo the drummer is able to come up with grooves on the fly without the band members telling him what to play. If we look around we see that during the rehearsal sessions musicians ate breed and butter with honey and drink tea with milk, beers and vine. What you wouldn’t notice our nowadays omnipresent mobile phones.

Speaking from the musical viewpoint, more exactly, Ringo’s drum kit I find it to be very basic 5 piece kit. It had a snare, two rack toms, one floor tom. Hi-hat, crash and ride cymbals. At the very end of episode one it can be seen that Ringo also used a splash cymbal. As for his throne he had a simple chair with a back rest. Since, apparently, there were no Drumtacks mufflers back then he used some kind of fabric that he covered the snare and the floor tom with to muffle the sound. As for the mics the kit was miced with a kick drum mic, there was also a microphone on the right side and one microphone above the kit.

### Episode 2

In the part two the musicians moved to the basement studio in the Apple building were they spent the rest of the documentary.

Turning to the Public Address (PA) equipment it is interesting to notice the vertical Fender Solid State sound speakers.

### Episode 3

I find the third part most interesting since it is in this part that the band decides it will be fun to have a live performance on the Apple Studio’s roof top. And as it turned out this decision brought with it a lot of fun for most of the people on London streets, while driving some people mad, including police officers. Also in this episode you may find how the Octopus’s Garden song came to life.

All in all, if you interested in The Beatles’ history and also want to feel how a real creative process is happening within the band during rehearsals then this documentary is mandatory for you.

# When math powers algorithms it’s entertaining

I think that I already wrote previously that a couple of years ago I bought the Elements of Programming book by Alexander Stepanov and Paul McJones. The issue was that the book content was hard for me to grasp at the time. I can hardly say that I now understand it better, but now I got where the rationale for that book came from and why it was written the way it was. It turns out the Alexander Stepanov as a mathematician was influenced deeply by Abstract Algebra, Group Theory and Number Theory. The elements of these fields of mathematics can be traced in the Elements of Programming clearly. For example, chapter 5 is called Ordered Algebraic Structures and it mentions among other things semigroup, monoid and group, which are elements in Group Theory. Overall, the book is structured somewhat like Euclid’s Elements, since the book starts from definitions, that are later used to build gradually upon in other chapters of the book.

Which brings me to the main topic of this post. By the way, the post is about a different book Alexander Stepanov wrote with Daniel Rose and that book was created by refining the notes for the Four Algorithmic Journeys course that Stepanov taught in 2012 at A9 company (subsidiary of Amazon). The course is available in YouTube and it consists of three parts each having a number of videos and the Epilogue part.

I highly recommend to watch it to anyone who is curious about programming, mathematics and science in general. The course is entertaining and it talks about how programming, or more exactly algorithms that are used in programming, are based on algorithms that were already known thousands of years ago in Egypt, Babylon etc. Alexander Stepanov has a peculiar way of lecturing and I find this way of presentation funny. The slides for the course and the notes that were aggregated in the Three Algorithmic Journeys book draft are freely available at Alexander Stepanov’s site.

So the book which I want to mention is From Mathematics to Generic Programming which was published in 2014 and is a reworked version of the Three Algorithmic Journeys draft. This is how Daniel Rose describes this in the Authors’ Note of the book.

The book you are about to read is based on notes from an “Algorithmic Journeys” course taught by Alex Stepanov at A9.com during 2012. But as Alex and I worked together to transform the material into book form, we realized that there was a stronger story we could tell, one that centered on generic programming and its mathematical foundations. This led to a major reorganization of the topics, and removal of the entire section on set theory and logic, which did not seem to be part of the same story. At the same time, we added and removed details to create a more coherent reading experience and to make the material more accessible to less mathematically advanced readers.

## My verdict

As authors mentioned the book is geared towards Generic Programming, but I recommend to read both of them in parallel, since each one complements the other. I think that the Three Algorithmic Journeys is even better than the From Mathematics to Generic Programming (FM2GP). First, it’s free and second, ironically, it’s more generic than the FM2GP book.

# Unboxing inventions and innovations

It seems like there is hardly a person who didn’t hear the phrase “Thinking outside of the box”. As Wikipedia entry says it’s “a metaphor that means to think differently, unconventionally, or from a new perspective.” While it sounds good in theory, it is unclear what one should do to think unconventionally, differently, creatively etc. Only demanding from someone to think outside of the box, doesn’t provide clear guidance on how to achieve this goal.

The same issue happens in education, when a student is taught any subject that requires thinking beyond what was taught in a lesson or a lecture. There are people who can do better than others in such situations and we tend to label them as creative, smart and sometimes genius. But the psychological research into what makes experts experts, for example done by Anders K. Ericsson et al, shows that this has to do more with the way an expert practiced, and not the innate cognitive abilities.

So what makes us creative and can it be taught and learned? The short answer is yes and the rest of this post will try to justify this answer. The question of creative thinking is relevant in most fields of daily life where problems arise and when there is no obvious way of how to solve them. Here we go into realm of innovation and invention. There are many definitions of these two terms, so let me quote one from Merriam-Webster on the difference between invention and innovation

What is the difference between innovation and invention?
The words innovation and invention overlap semantically but are really quite distinct.

Invention can refer to a type of musical composition, a falsehood, a discovery, or any product of the imagination. The sense of invention most likely to be confused with innovation is “a device, contrivance, or process originated after study and experiment,” usually something which has not previously been in existence.

Innovation, for its part, can refer to something new or to a change made to an existing product, idea, or field. One might say that the first telephone was an invention, the first cellular telephone either an invention or an innovation, and the first smartphone an innovation.

Chuck Swoboda, in his The Innovator’s Spirit book also provides detentions for an innovation and an invention that will be discussed in this post and they are

An invention, by definition, is something new—something that’s never been seen before. An innovation, on the other hand, especially a disruptive one, is something new that also creates enormous value by addressing an important problem.

While I do not have any objection to his definition of an innovation, I don’t agree with the definition of an invention. Saying that invention “is something new that’s never been seen before” is too vague a definition to be practical. It takes a quick look into submitted patents to see that there are lots of similar, if not outright identical patents issued for inventions. Which means the definition of invention being something never seen before fails to capture this. Also by the same token invention “being something new” fails too.

But it turns out there is quite precise definition, that exits since 1956, of a technical invention, which was provided by Genrich Altshuller and Rafael Schapiro in a paper About the Psychology of Inventive Creativity (available in Russian) published in Psychology Issues, No. 6, 1956. – p. 37-49. In the paper they mentioned that as a technical system evolves there could arise contradictory requirements between parts of the system. For example, lots of people use mobile phones to browse the internet. To be able to comfortably see the content on the screen of the phone, the screen should be as big as possible, but this requirement clashes (contradicts) with the size of the mobile phone, which should be small enough to be able to hold it comfortably in a hand or carry it in a pocket.

Altshuller and Shapiro defined the invention as a resolution of the contradictory requirements between parts of the system, without having to trade off requirements to achieve the solution. This definition of invention allows to talk precisely about what can be thought as invention and what can’t. Generally speaking, contradictory requirements can be resolved in space, time or structure. For example, returning to the mobile phone example, to resolve the contradiction in structure of the phone, between the size of the screen and the size of the phone there is a functionality that was introduced in mobile phones that allows to screencast the video and audio from a phone to a TV screen using Wi-Fi radio signal. YouTube application on Android phones supports this functionality.

Altshuller wrote a number of books on the subject of creative thinking, particularly books that developed the Theory of Inventive Problem Solving (abbreviated as TRIZ in Russian). In these books the ideas about a contradiction, an invention and an algorithmic approach (ARIZ) to how to invent by solving contradictions in technical problems are elaborated. To name just a few books in chronological order, written by Altshuller

• How to learn to invent (“Как научиться изобретать”), 1961
• Algorithm of Invention(“АЛГОРИТМ изобретения”), 1969
• Creativity as an Exact Science: Theory of Inventive Problem Solving (“ТВОРЧЕСТВО как точная наука: Теория решения изобретательских задач”), 1979

What is important to mention about the books is that they contain systematic, detailed and step by step explanations of how to invent using an algorithm. Lots of examples and exercises for self-study included in them. The books by Altshuller somewhat resemble in their content and in a way of presenting the material books written by George Polya.

Polya being a productive mathematician was also interested in how to convey his ideas in a way that could be easily understood by other people. To this end he wrote a number of books directed to pupils, students, teachers and general audience.

For example, his book How To Solve it first published in 1945 is a step by step instruction set on how to approach mathematical problems in a systematic way, using heuristics that mathematicians accumulated doing math for thousands of years. It very much resembles to me the structure and approach taken in Altshuller’s How to learn to event. Later, Polya wrote two additional books on how mathematicians think and how they arrive to mathematical theories. Each of the books consist of two volumes and they are

What is interesting to mention is that the books written by Polya and Altshuller more than fifty years ago contained very insightful ideas and heuristics to tackle math and inventive problems. But today it’s still difficult to find a widespread adoption of these ideas in education, industry or elsewhere. For example, The Princeton Companion to Applied Mathematics book from 2015 mentions only a rudimentary number of math Tricks and Techniques in the chapter I, Introduction to Applied Mathematics, on pages 39-40, out of 1031 pages.

As well as the general ideas and principles described in
their own bags of tricks and techniques, which
they bring into play when experience suggests they
might be useful. Some will work only on very specific
problems. Others might be nonrigorous but able to give
useful insight. George Pólya is quoted as saying, “A
trick used three times becomes a standard technique.”
Here are a few examples of tricks and techniques that
prove useful on many different occasions, along with a
very simple example in each case.

– Use symmetry…
– Add and subtract a term, or multiply and divide by a term….
– Consider special cases…
– Transform the problem…
– Going into the complex plane…

As a summary, if you are curious whether it’s possible to learn how to be more creative, inventive or, in general, approach problems in a systematic way, then check the books by Genrich Altshuller and George Polya. They may provide you with just the tools that you were looking for, but didn’t know where to find.

# Levels of understanding or how good explanations matter

In this post I want to talk about why providing good and detailed explanations can be a key in deep understanding of things in different fields of life. Particularly, I want to talk about detailed proofs in mathematics that have good step by step explanations of how the proof was constructed. Recently, I’ve started to read the books on math that are piling on my table just as the image above depicts.

I want to point your attention to the second book at the top of the pile, which is Prime Numbers and the Riemann Hypothesis book from 2016 written by Barry Mazur and William Stein. This book talks about Riemann Hypothesis by starting from ‘simple’ math and gradually moving to details about Riemann Hypothesis that require more advanced math background.

So what levels of understanding and well explained proofs have to do with the content of the book. Well, you see in the first part of the book, which authors claim requires some minimal math background a read sees this

Here are two exercises that you might try to do, if this is your first encounter with primes that differ from a power of 2 by 1:

1. Show that if a number of the form M = 2^n – 1 is prime, then the exponent n is also prime. [Hint: This is equivalent to proving that if n is composite, then 2^n -1 is also composite.] For example: 2^2 – 1 = 3, 2^3 -1 = 7 are primes, but 2^4 – 1 = 15 is not. So Mersenne primes are numbers that are

– of the form 2^ prime number – 1, and

– are themselves prime numbers

### Some context for the quote

Here comes a little bit of a context about this quote. The quote comes from, part 1, chapter 3: ‘”Named” Prime Numbers’, on page 11. The chapter describes what are Mersenne Primes, which are prime numbers that are one less than a power of two:

M = 2^n – 1

Also, it’s good to know that a prime number is a whole number (positive integer) that can be divided only by itself or 1. For example,

2, 3, 5, 7, 11 … are prime numbers since they can only be divided by themselves and 1.

Now, that we know what prime numbers are, I want to draw your attention to the point in the quote where it says, that the exercises are good for ‘your first encounter with primes that differ from a power of 2 by 1‘. Well, in my opinion these exercises are good only for readers who have at least a BSc in Math or possibly an engineering degree. In a couple of sentences we’ll see why I think so. Since I consider myself as a person who is interested in math and have a BSc degree in Electronics, I think the first part of the book which is indented for a layman person is just for me. Were you able to arrive at the proof for that simple exercises above?

Frankly, I wasn’t able to prove it. So, I went and looked for a proof on the internet. As always, Wikipedia is at the top of the Google search when it comes to math topics. Lo and behold there is a page in the wiki about Mersenne primes, what they are and a number of proofs related to them. One of the proofs was exactly the solution to the exercise in the quote above. Here it comes:

Note: Since, I had no time to learn how to use LaTeX properly in the WordPress, and believe me WordPress doesn’t make blogger’s life easy I write the proofs and picture them.

Does this proof looks like a piece of cake to you? Is it intuitive and easy to grasp? In my opinion, it’s not and also this proof shows why Wikipedia gets a good portion of criticism about its content.

What I find not so obvious about that proof is how the right hand side of the equation came about. Especially, the part that has powers of a * (b – 1) , a* (b – 2), … , a, 1. And then the statement that says ‘By contrapositive, if 2^p -1 is prime then p prime‘. In this particular statement, if you had no courses on mathematical proof, logic or Abstract Algebra, then the words composite and contrapositive can be a little bit mysterious.

I thought to myself, well, the proof looks kind of unclear to me, so I searched better using the internet and also checked the books I own. I was able to find a couple of proofs that were easier to understand and also was able to find a proof in the Book of Proof by Richard Hammack that you can see in the title image for the post. It’s the blue one and it comes third from the top of the pile.

Let’s start with the proof from the Book of Proof, which sounds like a good place to begin with. The proof is a solution to the exercise 25 from Chapter 5 in the book.

If we look carefully at this proof it look almost exactly as the proof in the Wikipedia. It even look less clear. But one important thing to notice is that this proof shows where the ‘1’ comes from on the right hand side of the equation, in comparison to the proof in the wiki. I think so, since 2^(ab-ab) which equals 1, provides more information than a simple number 1. This is because it provides some clues on how the proof was constructed. But for a reader who does not remember math, both of the proofs are not that helpful. Also, we need to take into consideration, that the Book of Proof intended audience is undergraduate students of exact sciences. So the book presupposes some math background.

### Some math background

We already mentioned that a prime number is a positive integer that can be only divided by itself or number 1. All other integers, are composite, since any of them can be composed by multiplying prime numbers. This is where composite comes from. As for contrapositive the Merriam-Webster site provides this definition

a proposition or theorem formed by contradicting both the subject and predicate or both the hypothesis and conclusion of a given proposition or theorem and interchanging them

if not-B then not-A ” is the contrapositive of “if A then B ”

For example,

If it was raining then it is wet.

Then contrapositive statement would be

If it is not wet then it wasn’t raining.

Returning back to the original exercise from the Prime Numbers and the Riemann Hypothesis book, now, it becomes a little bit clear, what the hint [Hint: This is equivalent to proving that if n is composite, then 2^n -1 is also composite.] there was for. So the proofs, went to prove that

If n is composite then 2^n – 1 number is composite

and by contrapositive

If 2^n – 1 number is not composite (i.e. prime) then n is not composite (i.e. prime)

### Polynomial factorization

Now, that we know what contrapositive proof is that’s turn to the right hand side part of the equation which is

2^n – 1 = (2^a – 1) * ( 2^a*(b-1) + 2^a*(b-2) + … + 2^a + 1).

It turns out that to derive it there is need to remember what polynomial factorization is, or remember how to divide one polynomial by another, or to know what Polynomial remainder theorem is. Also it’s good to know for a start what is a polynomial.

Since this post is becoming to long I need to make it shorter, which defeats the point of providing detailed explanations 😦

But I’ll provide some hints on how the right hand side was derived.

There is a known math formula to compute the following expression a^n – b^n

It turns out that the equation in the proofs that were mentioned in this post has the same structure as the formula to derive a^n – b^n. But the most interesting part is this. If you look at the last equation the Roman numeral I stands for the initial composite number 2^n – 1, and it is composed by multiplying Part II by Part III.

Part I = Part II * Part III

A = B * C

is that B and C are factors of A, or alternatively B is a divisor and C is a quotient.

So to summarize the initial number 2^n -1 is composite since it is a product of two other numbers.

# Prime time for Riemann Hypothesis

## Books that make you think

I already had a post where I mentioned Reimann Hypothesis after reading The Music of The Primes by Marcus du Sautoy. As far as I recall, I liked the book a lot. It was written for a wide audience and was an easy read. Later, I accidentally found another book on the subject that was intended for more mathematically inclined readers, namely, Prime Obsession by John Derbyshire. Having been fascinated by the subject of prime numbers, the prime number theorem it was a short way to other similar books, such as Prime Number and the Riemann Hypothesis by Barry Mazur and William Stein. Then smoothly transitioning to A Study of Bernhard Riemann’s 1895 Paper by Terrence P. Murphy. Just to conclude with H.M. Edwards Riemann Zeta Function. By the way, the order in which I mentioned the books more or less conveys the mastery of mathematics required to be able to understand what’s going on in them. Which means that two last books require substantial background in calculus and complex analysis. But it’s doable if you have time and prime obsession.

## Easy to not-so-easy books

I’d like to provide more details about the books above which I personally read end-to-end and also about ones that I bought, but haven’t finished yet, or only skimmed through.

Actually, I’d rather start with a short description of what the Reimann Hypothesis is by citing the Millennium Problems web site that describes a number of 21st century math problems that can bring you 1,000,000 USD for solving any of them.

So the Riemann Hypothesis is

Source: Millennium Problems

Some numbers have the special property that they cannot be expressed as the product of two smaller numbers, e.g., 2, 3, 5, 7, etc. Such numbers are called prime numbers, and they play an important role, both in pure mathematics and its applications. The distribution of such prime numbers among all natural numbers does not follow any regular pattern.  However, the German mathematician G.F.B. Riemann (1826 – 1866) observed that the frequency of prime numbers is very closely related to the behavior of an elaborate function
ζ(s) = 1 + 1/2s + 1/3s + 1/4s + …
called the Riemann Zeta function. The Riemann hypothesis asserts that all interesting solutions of the equation
ζ(s) = 0
lie on a certain vertical straight line.
This has been checked for the first 10,000,000,000,000 solutions. A proof that it is true for every interesting solution would shed light on many of the mysteries surrounding the distribution of prime numbers.

Having said that now let’s look at the books.

## The Music of The Primes

The book was written by Marcus du Sautoy in 2003. As I mentioned, the book does not require a degree in mathematics to be able to understands what it’s talking about. The material in it is interesting and engaging. In addition to covering, The Prime Number Theorem and Reimann Hypothesis it also covers other topics related to prime numbers usage, like cryptography. It can be a good starting point into a long journey with prime numbers.

## Prime Numbers and the Riemann Hypothesis

The book was written by Barry Mazur and William Stein in 2016. It has four parts, where first part intended for a wide audience, and each consecutive part presuppose gradually increasing knowledge of math to be able to grasp the content. What’s interesting about this book that it sheds light on some interesting connections between Riemann Hypothesis and Fourie Transform, which electrical engineers can relate to. Also the book is quite short.

## Prime Obsession

The book is written by John Derbyshire in 2003 (same year when Marcus du Sautoy wrote his book). This book has two parts: The Prime Number Theorem and The Riemann Hypothesis, but it goes into nitty gritty details of both of them and don’t allow a reader relax too much. Following the content of the book could require some math background and at times some calculations to be sure that one gets proper understanding of what’s going on. Personally, out of all the books I mention in this post I find this one the most engaging.

## A Study of Bernhard Riemann’s 1859 Paper

The book is written by Terrence P. Murthy in 2020. It is one of the two most technical books on the subject that requires substantial background in mathematics. The book provides Riemann’s 1859 paper in full in English and then systematically goes and provide proofs for all relevant parts of Riemann’s paper in subsequent chapters (except for the Riemann Hypothesis itself :). I think Terrence Murphy summarizes who this book is intended for in his own words the best:

Who Is This Book For?
If you are reading this, chances are you have developed a keen interest in the Reimann Hypothesis. Maybe you read John Derbyshire’s excellent book Prime Obsession. Or perhaps you read that the Riemann Hypothesis is one of the seven Millennium Prize Problems, with a \$1 million prize for its proof.
To advance your knowledge substantially beyond Derbyshire’s book, you must have (or develop) a good understanding of the field of complex analysis (we will describe that as knowledge at the “hobbyist” level). So, this book is probably not for you unless you are at least at the hobbyist level.

## Riemann Zeta Function

The book was written by H.M. Edwards in 1974. I’d rather describe it by continuing the citation from the previous book by Terrence P. Murthy:

After developing an interest in the Riemann Hypothesis, the first stopping point for many is Edwards’ excellent book Riemann’s Zeta Function. The Edwards book provides a wealth of information and insight on the zeta function, the Prime Number Theorem and the Riemann Hypothesis. And that brings us to the next group of people who do not need this book. If you eat, sleep and breath complex analysis, we will say you are at the “guru” level. In that case, the Edwards book will be easy reading and will provide you with the information you need to substantially advance your knowledge of Riemann’s Paper and the Riemann Hypothesis.

As you may tell, “guru” level in math is required to fully digest this book. So it want be easy to say the least.

### A good introductory paper on the subject

If you are interested in a short, but engaging introduction into what are Prime Number Theorem and the Riemann Hypothesis I recommend to read Don Zagier’s The First 50 Millions Prime Numbers paper, published in New Mathematical Intelligencer (1977) 1-19.

If you know Russian you can read the same paper that was published in Russian only in 1984 in the Uspekhi Matematicheskikh Nauk journal.

## Parting words

All in all, these five books can take a good chunk of a full year to work through or possibly even more, especially the last two. So what are you waiting for? Life is too short to waste it on watching TV series or YouTube nonsense. The treasures of math and deeper understanding of the world are awaiting for ones who know where to look for.

# Errata for the Thinking Better book and some commentary

This post is a continuation of the previous one about Thinking Better book (ISBN-13 ‏: ‎978-1541600362) by Marcus du Sautoy published in North America by Basic Books.

It seems like I was too eager to praise the book after reading just a few dozens of pages. Even though, on average the book is interesting to read, there were a number of things that could make the content of the Thinking Better even better. For example, having more diagrams accompanying the explanations for various concepts could be helpful. Having footnotes to provide more details or sources for cited papers could be helpful too etc.

## Errata

Having found a number of possible mistakes in the book I was sure that notifying Basic Books publishing about them would be valuable and they’d be happy to review and, if required, correct the content of the book. But since I’ve sent an email to the customer support there, I didn’t hear back. Below comes the table of potential issues that I was able to find while reading the book.

## Some other suggestions

The suggestions below are based on my experience with reading dozens of popular-science books on mathematics, physics neuroscience and biology.

### Diagram and Figures

The book has a number of diagrams in each chapter. Though most of them are helpful, some are not. The main issue I see with the diagrams in the book is that even though they are numbered, just like in the table above, that number isn’t referenced in the body of the book. This makes it hard to related the diagram to the content where it was mentioned.

There are places that I would add diagrams to clarify the content, since without having a diagram it is difficult to imagine what the text represent. Or it takes quite some time to understand author’s intent. For example, Figure 3.2. Six pyramids make a cuboid on page 84, is very confusing to say the least (no diagram is shown in my post due to copyright issues).

One additional example is when the sieve of Eratosthenes was mentioned on page 106. Usually, this method is visualized by a diagram, which helps a lot in understanding it. For a good visual example refer to the chapter 7 in the Prime Obsession book by John Derbyshire. Also check Wikipedia article about the sieve of Eratosthenes.

### Missing footnotes or notes

I agree that not all popular science books have footnotes or notes, but this particular book mentions a number of other books, papers and authors. Having footnotes or notes at the end of the book could have been beneficial to a curious reader. One of the papers cited in the book, had the names of the authors incorrect. For example, in the chapter 9, on page 261 it is written

Two mathematicians Duncan Watts and Steve Strogatz, discovered the secret, which they published in a paper in Nature in 1998.

The paper was Collective dynamics of ‘small-world’ networksNature 393, 440–442 (1998). And the authors were Duncan Watts and Steven Strogatz.

### Missing bibliography

There are no references in the book. Bibliography is also not a mandatory part of popular science books, but in most of the ones I read it was there and helped find similar books on the subject or get more details about specific topics mentioned in the book.

## Too harsh a criticism?

All in all, despite the drawbacks I mentioned above the book was worth reading. I think my criticism has to do with that fact that the Music of The Primes book, also written by Marucs du Satouy in 2003 didn’t have most of the issue I brought in this post.

# Reading by doing. ‘Thinking Better’ the book by Marcus du Sautoy

Have you heard about a new book Thinking Better: The Art of the Shortcut in Math and Life by Marcus du Sautoy? If not, you may consider given it a try, since this book is very engaging. The book is about various topics in mathematics and how they provide shortcuts that can be used in different fields of daily life to find solutions to seemingly daunting problems. These shortcuts speed up the computations and free time for you to do other things. Each chapter of the book starts with a puzzle that the reader is asked to solve. Then the main content of the chapter is related to the puzzle, while the solution is provided in the end.

I’ve already read The Music of the Primes that was also written by Marcus and thoroughly enjoyed it. What makes this book to stand out in comparison to The Music of The Primes is that it’s focused on practical advise that could be actioned by readers. But what I find most compelling about this book is that reading it is not enough to get the most out of the content, there is also a need to play with the content of each chapter, carefully examining it. Otherwise the subtle details and insight could be lost and not understood properly.

Marcus tries to present his ideas in a way that is accessible to a reader who is not supposed to be a math expert. That is why it seems his approach is to use as little of math terminology and formulas as possible. Even the word ‘formula’ doesn’t show up until page 20 into the book. This approach to writing popular science books is not new and it tries to achieve a trade off between a number of readers who may be frightened by the mathematical notation, and the number of readers who could be disappointed by the lack of it. Which brings me to the main point of this post. Even though Marcus doesn’t shy away from writing down some equations to exactly convey his ideas, there are places in the book where additional mathematical details could clarify his point even better. I also suggest to name the math objects as they are used in mathematics. There is no harm in doing that just like Steven Strogatz did in The Joy of x or John Derbyshire in the Unknown Quantity books. Proper math notation doesn’t frighten, but could actually help readers understand concepts better.

For example, in the first chapter of the book, on page 20 the following sequence is introduced

1, 3, 6, 10, 15, 21 …

Which as Marcus explains is a sequence of triangular numbers and the general formula to get the nth triangular number is

1/2 * n * (n + 1)

What I think is missing here is the fact that Marcus stops short of explaining that there is a general formula to find the sum of members of a finite arithmetic sequence. This is important, since each nth number in the triangular numbers sequence is a sum of the nth numbers of the arithmetic sequence that Marcus mentions on page 2, talking about Gauss’s school lesson

1, 2, 3, 4, 5, 6 …

Notice that this sequence is arithmetic, since the difference between each consecutive member and the previous one is constant. In this particular case it equals to 1.

For example, that general formula is

Sumn = 1/2 * n * (a1 + an), where a1 is the first member, which in the case of natural number sequence is 1; an stands for the last number until which we want to sum and n stands for the number of members starting from the first to the last we want to sum. Then substituting these values into the general formula we arrive at the formula Marcus mentioned in the book

nthtriangular number = 1/2 * n * (1 + n)

What this gives a reader is that now it’s possible to use this general formula to calculate sums of other arithmetic sequences, for example the sequence of odd numbers.

1, 3, 5, 7, 9, 11, 13, 15 …

Let’s calculate the sum of first 8 members of this sequence, a1 = 1, an = 15, n = 8, then

Sumn = 1/2 * n * (a1 + an) = 1/2 * 8 * (1 + 15) = 64

One additional, example where this general formula could have been mentioned by Marcus was on page 28, where he explains how to calculate the number of handshakes in a population of a city having N people. It goes like this. Imagine these N people standing in a line. Then the first person can make N – 1 handshakes ( minus one, since he cannot shake his own hands). Then the second one can make N – 2 handshakes, continuing in similar fashion until the last person who can make no handshakes, since everyone already handshaked him. Then Marcus mentions that the sum of handshakes is the sum from 1 to N – 1, which is the sum that Gauss was asked to perform in his math class:

1/2 * (N – 1) * N

It seems to me a reader can be very confused at this point, since it’s not that clear that this is the sum Gauss was tasked to perform. This is because Gauss was asked to calculate the sum of the first nth natural numbers,

1, 2, 3, 4, 5, 6 …

As was mentioned earlier the general formula for this sum is Sumn = 1/2 * n * (a1 + an) and in this particular case it’s

Sumn = 1/2 * n * (1 + n), Then, one may ask, where does the 1/2 * (N – 1) * N comes from?

To get out of the confusion state, two things could be helpful. First, it’s a diagram of people standing in line waiting to be handshaked, and a little bit more details on how the 1/2 * (N – 1) * N formula was derived.

Looking at the diagram with four people in line we can see that the total number of handshakes is 3 + 2 + 1 + 0 = 6. Now, if we take N people in line as in the book’s example, then the first person in line is a1 = N – 1 , and the last person in line an = N – N, and the total number of people in line is N. Then substituting these numbers into the general formula for the sum of arithmetic sequence we got:

Sumn = 1/2 * n * (a1 + an) = 1/2 * (N – 1 + N – N) * N = 1/2 * (N – 1) * N

This is how the formula in the book was derived.

This post is not the last one, since I have not finished reading the book yet. There are more to come.

# What the hack is Rule 30? Cellular Automata Explained

## Introduction

If you ask the same question, what is this Rule 30, then you are not alone. This rule is related to one-dimensional Cellular Automata introduced by Stephen Wolfram in 1983 and later described in his A New Kind Of Science book. Even though, I skimmed through this book previously, I was never able to understand what was it all about. It had some neat diagrams, but I didn’t try to understand how the diagrams where generated. This changed a day ago when I watched an interesting interview that Lex Fridman had with Stephen Wolfram, where they discussed among other things Rule 30, which was the one that Wolfram described first. This rule is capable of generating a complex behavior even though the rule itself is very simple to define.

## Elementary Cellular Automaton

The Elementary Cellular Automaton consists of a one-dimensional array of cells that can be in just two sates, namely, black or white color. The cellular automaton starts from initial state, and then transition to the next state based on a certain function. The next state of the automaton is calculated based on the color of current cell and its left and right neighbors’ colors. By the way, that transition function, gives its name to the cellular automaton rule, just like Rule 30.

### Rule 30 transition function

Where, for example, the binary digits ‘111’ in the first column indicate the black color of the left, current and right cells, and the values, ‘0’ or ‘1’ indicate what will be the color of the current cell in the next state (iteration) of the automaton. If we write down all the the binary values of the next state together as a single 8 digit binary number, which is 00011110 and convert it to a decimal number we get 30, and hence the name of the Rule 30. In a Boolean form this transition function is

(left, current, right) -> left XOR (current OR right)

We’ll use this function later in the C++ implementation of the Rule 30.

This is how the output from Rule 30 looks like after 20 and 100 steps respectively.

## How to generate Rule 30?

It is easy to implement the Rule 30 using a 64 bit integer in C++. The only drawback is that we can get only 32 steps with the implementation below taken from Wikipedia.

#### Note: If this is a little bit to much for you now, then skip to the next part where we’ll use WolframAlpha to program it for us.

In the code below a 64 bit integer is used to generate an initial state using a bit mask in line 6. Then the outer for loop generates 32 steps of the Rule 30 by applying the transition function in line 19. The inner loop generates the black (using character ‘1’) and white (using character ‘-‘) colors using state bit mask.

#include <stdint.h>
#include <iostream>

int main() {
// This is our bit mask with the 32 bit set to '1' for initial state
uint64_t state = 1u << 31;

for (int i = 0 ; i < 32 ; i++) {

for (int j = sizeof(uint64_t) * 8 - 1 ; j  >=  0 ; j--) {
// Here we decide what should be the color of the current cell based on the current state bit mask.
// Bitwise operator is used to accomplish this efficiently
std::cout << char(state  >>  j & 1 ? '1' : '-');
}
std::cout << '\n';

// This is just the (left, current, right) -> left XOR (current OR right) functioned mentioned previously
// Bitwise operators  are used to accomplish this efficiently
state = (state >> 1) ^ (state | state << 1);
}
}


It is possible to run this code in the online C++ compile. Just click on the link and then click on the green Run button at the top of the screen. The result looks similar to below.

The full output for 32 steps looks like this

--------------------------------1-------------------------------
-------------------------------111------------------------------
------------------------------11--1-----------------------------
-----------------------------11-1111----------------------------
----------------------------11--1---1---------------------------
---------------------------11-1111-111--------------------------
--------------------------11--1----1--1-------------------------
-------------------------11-1111--111111------------------------
------------------------11--1---111-----1-----------------------
-----------------------11-1111-11--1---111----------------------
----------------------11--1----1-1111-11--1---------------------
---------------------11-1111--11-1----1-1111--------------------
--------------------11--1---111--11--11-1---1-------------------
-------------------11-1111-11--111-111--11-111------------------
------------------11--1----1-111---1--111--1--1-----------------
-----------------11-1111--11-1--1-11111--1111111----------------
----------------11--1---111--1111-1----111------1---------------
---------------11-1111-11--111----11--11--1----111--------------
--------------11--1----1-111--1--11-111-1111--11--1-------------
-------------11-1111--11-1--111111--1---1---111-1111------------
------------11--1---111--1111-----1111-111-11---1---1-----------
-----------11-1111-11--111---1---11----1---1-1-111-111----------
----------11--1----1-111--1-111-11-1--111-11-1-1---1--1---------
---------11-1111--11-1--111-1---1--1111---1--1-11-111111--------
--------11--1---111--1111---11-11111---1-11111-1--1-----1-------
-------11-1111-11--111---1-11--1----1-11-1-----11111---111------
------11--1----1-111--1-11-1-1111--11-1--11---11----1-11--1-----
-----11-1111--11-1--111-1--1-1---111--1111-1-11-1--11-1-1111----
----11--1---111--1111---1111-11-11--111----1-1--1111--1-1---1---
---11-1111-11--111---1-11----1--1-111--1--11-1111---111-11-111--
--11--1----1-111--1-11-1-1--11111-1--111111--1---1-11---1--1--1-
-11-1111--11-1--111-1--1-1111-----1111-----1111-11-1-1-111111111



Compare the image above to the one generated using WolframAlpha symbolic language in the Wolfram Notebook.

## Using WolframAlpha to generate Rule 30

WolframAlpha is a computational knowledge engine developed by Wolfram Research. It is embedded in Mathematica symbolic language and it has a declarative programming style. Which means that you specify what you want to be done instead of how it should be done.

For example, to generate Rule 30 and visualize it one simple writes

ArrayPlot[CellularAutomaton[30, {{1},0}, 64]]


where CellularAutomaton function generates 64 states of the automaton, starting with a one black cell in accordance with the Rule 30 generation function, and ArrayPlot function prints the result.

And the output is

Please follow the this link to the Wolfram Notebook where the various Elementary Cellular Automata Rules are generated.

## Summary

Playing with cellular automat rules seems like an interesting game to play, plus doing it using WolframAlpha language is a piece of cake.

# Better understanding with Optimization for Machine Learning

## Long awaited book from Machine Learning Mastery

Recently, I’ve been reading the new Optimization for Machine Learning book from the Machine Learning Mastery written by Jason Brownlee. It just so happened that I read it fully from start to end, since I was one of the technical reviewers of the book. The book was interesting to read thanks to a number of ingredients.

As always Jason was able to write an engaging book with practical advice that can be actioned right away using open source software on Linux, Windows or MacOS. Apart from this the book has just enough clearly explained theoretical material so that even beginning machine learning practitioners can play with optimization algorithms described in the book.

## What I liked and what surprised me in the book

Personally, I think it was while reading and working with this book, that I truly understood what an optimizations is. How it is used in machine learning. What is an optimization algorithm, like gradient descent and how to implement one from scratch. I also very much enjoyed the chapter about Global Optimization with various types of Evolution Algorithms. What was funny, that about two weeks after I finished reading the book I came across Donald Hoffman’s The Interface Theory of Perception with relation to consciousness which is based on The Theory of Evolution by Natural Selection. For example, one of his papers written with colleagues namely Does evolution favor true perception? provides an example of Genetic Algorithm (GA) which very much resembles the GA in Chapter 17 of the book. It is highly recommended reading for anyone interested in how consciousness arises in the mind. By the way, does it?

## Overall

The Optimization for Machine Learning book is what you come to expect from Machine Learning Mastery books. It’s interesting, it’s practical and it makes you understand what is that you are doing in Machine Learning. As always, each chapter has extensive references to tutorials, technical papers and books on machine learning. So don’t wait and start reading it, maybe you’ll come up with a new theory of how consciousness emerges in the mind.

## References

Donald D. Hoffman, Manish Singh, Justin Mark, “Does evolution favor true perceptions?”, Proceedings Volume 8651, Human Vision and Electronic Imaging XVIII; 865104 (2013)