When math powers algorithms it’s entertaining

I think I have already mentioned that a couple of years ago I bought the Elements of Programming book by Alexander Stepanov and Paul McJones. The issue was that the book’s content was hard for me to grasp at the time. I can hardly say that I understand it better now, but I now see where the rationale for the book came from and why it was written the way it was. It turns out that Alexander Stepanov, as a mathematician, was deeply influenced by abstract algebra, group theory and number theory. Elements of these fields of mathematics can be clearly traced in Elements of Programming. For example, chapter 5 is called Ordered Algebraic Structures and it mentions, among other things, semigroups, monoids and groups, which are basic structures of abstract algebra. Overall, the book is structured somewhat like Euclid’s Elements: it starts from definitions, which later chapters gradually build upon.

Which brings me to the main topic of this post. The post is about a different book, which Alexander Stepanov wrote with Daniel Rose. That book was created by refining the notes for the Four Algorithmic Journeys course that Stepanov taught in 2012 at A9 (a subsidiary of Amazon). The course is available on YouTube and consists of three parts, each with a number of videos, plus an Epilogue.

I highly recommend watching it to anyone who is curious about programming, mathematics and science in general. The course is entertaining, and it shows how the algorithms used in programming are based on algorithms that were already known thousands of years ago in Egypt, Babylon and elsewhere. Alexander Stepanov has a peculiar way of lecturing, and I find his style of presentation funny. The slides for the course and the notes that were collected in the Three Algorithmic Journeys book draft are freely available at Alexander Stepanov’s site.

So the book which I want to mention is From Mathematics to Generic Programming which was published in 2014 and is a reworked version of the Three Algorithmic Journeys draft. This is how Daniel Rose describes this in the Authors’ Note of the book.

The book you are about to read is based on notes from an “Algorithmic Journeys” course taught by Alex Stepanov at A9.com during 2012. But as Alex and I worked together to transform the material into book form, we realized that there was a stronger story we could tell, one that centered on generic programming and its mathematical foundations. This led to a major reorganization of the topics, and removal of the entire section on set theory and logic, which did not seem to be part of the same story. At the same time, we added and removed details to create a more coherent reading experience and to make the material more accessible to less mathematically advanced readers.

My verdict

As the authors mention, the book is geared towards generic programming, but I recommend reading both the book and the draft in parallel, since each complements the other. I think that Three Algorithmic Journeys is even better than From Mathematics to Generic Programming (FM2GP). First, it’s free and second, ironically, it’s more generic than the FM2GP book.

Unboxing inventions and innovations


It seems like there is hardly a person who hasn’t heard the phrase “thinking outside of the box”. As the Wikipedia entry says, it’s “a metaphor that means to think differently, unconventionally, or from a new perspective.” While it sounds good in theory, it is unclear what one should actually do to think unconventionally, differently or creatively. Simply demanding that someone think outside of the box doesn’t provide clear guidance on how to achieve this goal.

The same issue happens in education, when a student is taught any subject that requires thinking beyond what was taught in a lesson or a lecture. There are people who do better than others in such situations, and we tend to label them as creative, smart and sometimes genius. But the psychological research into what makes experts experts, for example the work of K. Anders Ericsson et al., shows that this has more to do with the way an expert practiced than with innate cognitive abilities.

So what makes us creative, and can it be taught and learned? The short answer is yes, and the rest of this post will try to justify this answer. The question of creative thinking is relevant in most fields of daily life where problems arise and there is no obvious way to solve them. Here we enter the realm of innovation and invention. There are many definitions of these two terms, so let me quote one from Merriam-Webster on the difference between invention and innovation:

What is the difference between innovation and invention?
The words innovation and invention overlap semantically but are really quite distinct.

Invention can refer to a type of musical composition, a falsehood, a discovery, or any product of the imagination. The sense of invention most likely to be confused with innovation is “a device, contrivance, or process originated after study and experiment,” usually something which has not previously been in existence.

Innovation, for its part, can refer to something new or to a change made to an existing product, idea, or field. One might say that the first telephone was an invention, the first cellular telephone either an invention or an innovation, and the first smartphone an innovation.

Chuck Swoboda, in his book The Innovator’s Spirit, also provides definitions of an innovation and an invention that will be discussed in this post:

An invention, by definition, is something new—something that’s never been seen before. An innovation, on the other hand, especially a disruptive one, is something new that also creates enormous value by addressing an important problem.

While I do not have any objection to his definition of an innovation, I don’t agree with his definition of an invention. Saying that an invention “is something new that’s never been seen before” is too vague to be practical. A quick look at issued patents shows that there are lots of similar, if not outright identical, patents for inventions, so defining an invention as something never seen before fails to capture this. By the same token, defining an invention as “something new” fails too.

But it turns out there is a quite precise definition of a technical invention, which has existed since 1956. It was provided by Genrich Altshuller and Rafael Shapiro in the paper About the Psychology of Inventive Creativity (available in Russian), published in Psychology Issues, No. 6, 1956, pp. 37-49. In the paper they mention that as a technical system evolves, contradictory requirements can arise between parts of the system. For example, lots of people use mobile phones to browse the internet. To see the content comfortably, the screen of the phone should be as big as possible, but this requirement clashes with (contradicts) the size of the phone itself, which should be small enough to hold comfortably in a hand or carry in a pocket.

Altshuller and Shapiro defined an invention as a resolution of the contradictory requirements between parts of the system, without having to trade off requirements to achieve the solution. This definition makes it possible to talk precisely about what can be considered an invention and what can’t. Generally speaking, contradictory requirements can be resolved in space, time or structure. Returning to the mobile phone example, the contradiction between the size of the screen and the size of the phone can be resolved in structure: mobile phones gained functionality that allows casting video and audio from a phone to a TV screen over Wi-Fi. The YouTube application on Android phones supports this functionality.

Altshuller wrote a number of books on the subject of creative thinking, particularly books that developed the Theory of Inventive Problem Solving (abbreviated as TRIZ in Russian). These books elaborate the ideas of a contradiction and an invention, and an algorithmic approach (ARIZ) to inventing by resolving contradictions in technical problems. To name just a few of Altshuller’s books, in chronological order:

  • How to learn to invent (“Как научиться изобретать”), 1961
  • Algorithm of Invention (“АЛГОРИТМ изобретения”), 1969
  • Creativity as an Exact Science: Theory of Inventive Problem Solving (“ТВОРЧЕСТВО как точная наука: Теория решения изобретательских задач”), 1979

What is important to mention about the books is that they contain systematic, detailed, step-by-step explanations of how to invent using an algorithm. Lots of examples and exercises for self-study are included in them. In their content and in the way they present the material, the books by Altshuller somewhat resemble the books written by George Polya.

Polya, besides being a productive mathematician, was also interested in how to convey his ideas in a way that could be easily understood by other people. To this end he wrote a number of books directed at pupils, students, teachers and a general audience.

For example, his book How to Solve It, first published in 1945, is a step-by-step set of instructions on how to approach mathematical problems in a systematic way, using heuristics that mathematicians have accumulated while doing math for thousands of years. To me, it very much resembles the structure and approach taken in Altshuller’s How to learn to invent. Later, Polya wrote two additional books on how mathematicians think and how they arrive at mathematical theories. Each of the books consists of two volumes: Mathematics and Plausible Reasoning, and Mathematical Discovery.

What is interesting to mention is that the books written by Polya and Altshuller more than fifty years ago contain very insightful ideas and heuristics for tackling math and inventive problems. But today it’s still difficult to find widespread adoption of these ideas in education, industry or elsewhere. For example, The Princeton Companion to Applied Mathematics from 2015 mentions only a handful of math Tricks and Techniques in chapter I, Introduction to Applied Mathematics, on pages 39-40, out of 1031 pages.

As well as the general ideas and principles described in this article, applied mathematicians have at their disposal their own bags of tricks and techniques, which they bring into play when experience suggests they might be useful. Some will work only on very specific problems. Others might be nonrigorous but able to give useful insight. George Pólya is quoted as saying, “A trick used three times becomes a standard technique.” Here are a few examples of tricks and techniques that prove useful on many different occasions, along with a very simple example in each case.

– Use symmetry…
– Add and subtract a term, or multiply and divide by a term….
– Consider special cases…
– Transform the problem…
– Proof by contradiction…
– Going into the complex plane…

As a summary, if you are curious whether it’s possible to learn how to be more creative, inventive or, in general, approach problems in a systematic way, then check the books by Genrich Altshuller and George Polya. They may provide you with just the tools that you were looking for, but didn’t know where to find.

Levels of understanding or how good explanations matter

In this post I want to talk about why providing good and detailed explanations can be a key to deep understanding of things in different fields of life. In particular, I want to talk about detailed proofs in mathematics that come with good step-by-step explanations of how the proof was constructed. Recently, I’ve started to read the math books that are piling up on my table, just as the image above depicts.

I want to draw your attention to the second book from the top of the pile, which is Prime Numbers and the Riemann Hypothesis from 2016, written by Barry Mazur and William Stein. This book talks about the Riemann Hypothesis by starting from ‘simple’ math and gradually moving to details that require a more advanced math background.

So what do levels of understanding and well-explained proofs have to do with the content of the book? Well, in the first part of the book, which the authors claim requires only a minimal math background, a reader sees this:

Here are two exercises that you might try to do, if this is your first encounter with primes that differ from a power of 2 by 1:

1. Show that if a number of the form M = 2^n – 1 is prime, then the exponent n is also prime. [Hint: This is equivalent to proving that if n is composite, then 2^n -1 is also composite.] For example: 2^2 – 1 = 3, 2^3 -1 = 7 are primes, but 2^4 – 1 = 15 is not. So Mersenne primes are numbers that are

– of the form 2^ prime number – 1, and

– are themselves prime numbers

Some context for the quote

Here is a little bit of context for this quote. The quote comes from part 1, chapter 3, ‘“Named” Prime Numbers’, on page 11. The chapter describes Mersenne primes, which are prime numbers that are one less than a power of two:

M = 2^n – 1

Also, it’s good to know that a prime number is a whole number greater than 1 that can be divided evenly only by itself and 1. For example,

2, 3, 5, 7, 11 … are prime numbers since they can only be divided by themselves and 1.
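To make the definition concrete, here is a minimal Python sketch (not taken from any of the books; the helper name `is_prime` is my own) that tests primality by trial division and then checks which of the numbers 2^n – 1 for small n are prime:

```python
def is_prime(n: int) -> bool:
    """Trial division: n is prime if n > 1 and no d with 2 <= d*d <= n divides it."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Candidates of the form 2^n - 1 for n = 2 .. 11.
mersenne_candidates = {n: 2**n - 1 for n in range(2, 12)}

# Keep only those candidates that are themselves prime.
mersenne_primes = {n: m for n, m in mersenne_candidates.items() if is_prime(m)}
print(mersenne_primes)
```

Note that n = 11 is prime while 2^11 – 1 = 2047 = 23 * 89 is not, so a prime exponent is necessary but not sufficient for 2^n – 1 to be prime.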

Now that we know what prime numbers are, I want to draw your attention to the point in the quote where it says that the exercises are good for ‘your first encounter with primes that differ from a power of 2 by 1’. Well, in my opinion these exercises are good only for readers who have at least a BSc in Math or possibly an engineering degree. In a couple of sentences we’ll see why I think so. Since I consider myself a person who is interested in math and I have a BSc degree in Electronics, I thought the first part of the book, which is intended for a layperson, was just for me. Were you able to arrive at a proof for the simple exercise above?

Frankly, I wasn’t able to prove it. So, I went and looked for a proof on the internet. As always, Wikipedia is at the top of the Google search when it comes to math topics. Lo and behold there is a page in the wiki about Mersenne primes, what they are and a number of proofs related to them. One of the proofs was exactly the solution to the exercise in the quote above. Here it comes:

Note: Since I had no time to learn how to use LaTeX properly in WordPress (and believe me, WordPress doesn’t make a blogger’s life easy), I wrote the proofs out and included them as pictures.

Does this proof look like a piece of cake to you? Is it intuitive and easy to grasp? In my opinion it’s not, and this proof also shows why Wikipedia gets a good portion of criticism about its content.

What I find not so obvious about that proof is how the right hand side of the equation came about, especially the part that has the exponents a * (b – 1), a * (b – 2), … , a and the final 1. And then there is the statement that says ‘By contrapositive, if 2^p -1 is prime then p prime‘. If you have had no courses on mathematical proof, logic or abstract algebra, then the words composite and contrapositive in this statement can be a little bit mysterious.

I thought to myself, well, the proof looks kind of unclear to me, so I searched the internet more thoroughly and also checked the books I own. I was able to find a couple of proofs that were easier to understand, and I also found a proof in the Book of Proof by Richard Hammack, which you can see in the title image for the post. It’s the blue one and it comes third from the top of the pile.

Let’s start with the proof from the Book of Proof, which sounds like a good place to begin with. The proof is a solution to the exercise 25 from Chapter 5 in the book.

If we look carefully at this proof, it looks almost exactly like the proof in Wikipedia. It even looks less clear. But one important thing to notice is that, in comparison to the proof in the wiki, this proof shows where the ‘1’ on the right hand side of the equation comes from. I think so because 2^(ab-ab), which equals 1, provides more information than a plain number 1: it gives some clues about how the proof was constructed. But for a reader who does not remember math, neither of the proofs is that helpful. Also, we need to take into consideration that the intended audience of the Book of Proof is undergraduate students of exact sciences, so the book presupposes some math background.

Some math background

We already mentioned that a prime number is a positive integer greater than 1 that can be divided evenly only by itself and by 1. All other integers greater than 1 are composite, since each of them can be composed by multiplying prime numbers. This is where the word composite comes from. As for contrapositive, the Merriam-Webster site provides this definition:

 a proposition or theorem formed by contradicting both the subject and predicate or both the hypothesis and conclusion of a given proposition or theorem and interchanging them

“if not-B then not-A” is the contrapositive of “if A then B”

For example,

If it was raining then it is wet.

Then contrapositive statement would be

If it is not wet then it wasn’t raining.
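The claim that a statement and its contrapositive always agree can be checked mechanically. The following is a small Python sketch (the `implies` helper is my own naming, not something from the dictionary or the books) that goes over all four combinations of truth values:

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q

truth_values = list(product([False, True], repeat=2))

# "if A then B" versus its contrapositive "if not-B then not-A".
contrapositive_equivalent = all(
    implies(a, b) == implies(not b, not a) for a, b in truth_values
)

# For comparison: the converse "if B then A" is NOT equivalent in general.
converse_equivalent = all(
    implies(a, b) == implies(b, a) for a, b in truth_values
)

print(contrapositive_equivalent, converse_equivalent)
```

The first check succeeds and the second fails, which is why proving the contrapositive is a legitimate way to prove the original statement, while proving the converse is not.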

Returning to the original exercise from the Prime Numbers and the Riemann Hypothesis book, it now becomes a little bit clearer what the hint [Hint: This is equivalent to proving that if n is composite, then 2^n -1 is also composite.] was for. So the proofs set out to prove that

If n is composite then 2^n – 1 number is composite

and by contrapositive

If 2^n – 1 number is not composite (i.e. prime) then n is not composite (i.e. prime)

Polynomial factorization

Now that we know what a proof by contrapositive is, let’s turn to the right hand side of the equation, which is

2^n – 1 = (2^a – 1) * (2^(a*(b-1)) + 2^(a*(b-2)) + … + 2^a + 1), where n = a * b.

It turns out that to derive it one needs to remember what polynomial factorization is, or how to divide one polynomial by another, or to know what the polynomial remainder theorem is. For a start, it’s also good to know what a polynomial is.

Since this post is becoming too long I need to make it shorter, which defeats the point of providing detailed explanations 😦

But I’ll provide some hints on how the right hand side was derived.

There is a well-known formula for factoring the expression a^n – b^n.
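The identity meant here is presumably the standard factorization of a difference of powers. As a sketch, it is written below with x, y and k instead of a, b and n, to avoid a clash with the a and b that are the factors of n in the proofs; the labels I, II and III are my assumption about how the parts of the equation were marked:

```latex
% General identity (difference of k-th powers):
x^k - y^k = (x - y)\left(x^{k-1} + x^{k-2}y + \dots + x\,y^{k-2} + y^{k-1}\right)

% Substitute x = 2^a, y = 1, k = b, with n = ab:
\underbrace{2^{ab} - 1}_{\text{I}}
  = \underbrace{\left(2^{a} - 1\right)}_{\text{II}}
    \cdot
    \underbrace{\left(2^{a(b-1)} + 2^{a(b-2)} + \dots + 2^{a} + 1\right)}_{\text{III}}
```

Multiplying out the right hand side shows why it works: every term except the first and the last cancels in pairs, leaving 2^{ab} – 1.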

It turns out that the equation in the proofs mentioned in this post has the same structure as the formula for a^n – b^n. But the most interesting part is this: if you look at the last equation, the Roman numeral I stands for the initial composite number 2^n – 1, and it is composed by multiplying Part II by Part III.

Part I = Part II * Part III

What is interesting about this composite number (and let me use A, B and C instead of I, II and III)

A = B * C

is that B and C are factors of A, or alternatively B is a divisor and C is a quotient.

So, to summarize, when n = a * b with both a and b greater than 1, the initial number 2^n – 1 is composite, since it is a product of two numbers that are both greater than 1.
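The factorization is easy to check numerically. Here is a minimal Python sketch (the function name `mersenne_factors` is mine, chosen for illustration) that builds the two factors B and C for a composite exponent n = a * b and confirms that their product is 2^n – 1:

```python
def mersenne_factors(a: int, b: int) -> tuple[int, int]:
    """Return (B, C) such that 2**(a*b) - 1 == B * C, for a, b > 1."""
    part_b = 2**a - 1
    # C = 2^(a(b-1)) + 2^(a(b-2)) + ... + 2^a + 1
    part_c = sum(2 ** (a * k) for k in range(b))
    return part_b, part_c

checks = []
for a in range(2, 6):
    for b in range(2, 6):
        n = a * b
        part_b, part_c = mersenne_factors(a, b)
        # Both factors must exceed 1 for 2^n - 1 to be genuinely composite.
        checks.append(part_b * part_c == 2**n - 1 and part_b > 1 and part_c > 1)

print(all(checks))
print(mersenne_factors(2, 2))
```

For a = b = 2 this gives 3 and 5, matching the example 2^4 – 1 = 15 from the exercise.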

Prime time for Riemann Hypothesis

Books that make you think

I already had a post where I mentioned the Riemann Hypothesis after reading The Music of The Primes by Marcus du Sautoy. As far as I recall, I liked the book a lot. It was written for a wide audience and was an easy read. Later, I accidentally found another book on the subject that was intended for more mathematically inclined readers, namely Prime Obsession by John Derbyshire. Once I had become fascinated by prime numbers and the prime number theorem, it was a short way to other similar books, such as Prime Numbers and the Riemann Hypothesis by Barry Mazur and William Stein. From there I smoothly transitioned to A Study of Bernhard Riemann’s 1859 Paper by Terrence P. Murphy, and concluded with H.M. Edwards’ Riemann’s Zeta Function. By the way, the order in which I mentioned the books more or less conveys the mastery of mathematics required to understand what’s going on in them. This means that the last two books require a substantial background in calculus and complex analysis. But it’s doable if you have time and a prime obsession.

Easy to not-so-easy books

I’d like to provide more details about the books above which I personally read end-to-end and also about ones that I bought, but haven’t finished yet, or only skimmed through.

Actually, I’d rather start with a short description of what the Riemann Hypothesis is, by citing the Millennium Problems web site, which describes a number of 21st century math problems, each of which can bring you 1,000,000 USD for solving it.

So the Riemann Hypothesis is

Source: Millennium Problems

Some numbers have the special property that they cannot be expressed as the product of two smaller numbers, e.g., 2, 3, 5, 7, etc. Such numbers are called prime numbers, and they play an important role, both in pure mathematics and its applications. The distribution of such prime numbers among all natural numbers does not follow any regular pattern.  However, the German mathematician G.F.B. Riemann (1826 – 1866) observed that the frequency of prime numbers is very closely related to the behavior of an elaborate function
    ζ(s) = 1 + 1/2^s + 1/3^s + 1/4^s + …
called the Riemann Zeta function. The Riemann hypothesis asserts that all interesting solutions of the equation
    ζ(s) = 0
lie on a certain vertical straight line.
This has been checked for the first 10,000,000,000,000 solutions. A proof that it is true for every interesting solution would shed light on many of the mysteries surrounding the distribution of prime numbers.
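To get a feel for the series in the quote, here is a small Python sketch (my own illustration, not from the Millennium Problems site) that sums its first terms for a real s. Note the assumption: the series as written converges only for s greater than 1, while the “interesting solutions” lie in a region where ζ(s) is defined by analytic continuation, so this sketch says nothing about the zeros themselves. It only compares a partial sum for s = 2 with the known value π^2 / 6.

```python
import math

def zeta_partial_sum(s: float, terms: int) -> float:
    """Sum the first `terms` terms of 1 + 1/2^s + 1/3^s + ... for real s > 1."""
    return sum(1.0 / n**s for n in range(1, terms + 1))

approx = zeta_partial_sum(2.0, 100_000)
exact = math.pi**2 / 6  # known closed form of zeta(2)
print(approx, exact)
```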

Having said that now let’s look at the books.

The Music of The Primes

The book was written by Marcus du Sautoy in 2003. As I mentioned, the book does not require a degree in mathematics to understand what it’s talking about. The material in it is interesting and engaging. In addition to covering the Prime Number Theorem and the Riemann Hypothesis, it also covers other topics related to the use of prime numbers, like cryptography. It can be a good starting point for a long journey with prime numbers.

Prime Numbers and the Riemann Hypothesis

The book was written by Barry Mazur and William Stein in 2016. It has four parts, where the first part is intended for a wide audience and each subsequent part presupposes gradually increasing knowledge of math. What’s interesting about this book is that it sheds light on some connections between the Riemann Hypothesis and the Fourier transform, which electrical engineers can relate to. Also, the book is quite short.

Prime Obsession

The book was written by John Derbyshire in 2003 (the same year Marcus du Sautoy wrote his book). This book has two parts, The Prime Number Theorem and The Riemann Hypothesis, but it goes into the nitty-gritty details of both of them and doesn’t allow a reader to relax too much. Following the content of the book may require some math background and, at times, some calculations to be sure that one properly understands what’s going on. Personally, out of all the books I mention in this post, I find this one the most engaging.

A Study of Bernhard Riemann’s 1859 Paper

The book was written by Terrence P. Murphy in 2020. It is one of the two most technical books on the subject and requires a substantial background in mathematics. The book provides Riemann’s 1859 paper in full in English and then, in subsequent chapters, systematically provides proofs for all relevant parts of Riemann’s paper (except for the Riemann Hypothesis itself :). I think Terrence Murphy summarizes best, in his own words, who this book is intended for:

Who Is This Book For?
If you are reading this, chances are you have developed a keen interest in the Riemann Hypothesis. Maybe you read John Derbyshire’s excellent book Prime Obsession. Or perhaps you read that the Riemann Hypothesis is one of the seven Millennium Prize Problems, with a $1 million prize for its proof.
To advance your knowledge substantially beyond Derbyshire’s book, you must have (or develop) a good understanding of the field of complex analysis (we will describe that as knowledge at the “hobbyist” level). So, this book is probably not for you unless you are at least at the hobbyist level.

Riemann’s Zeta Function

The book was written by H.M. Edwards in 1974. I’d rather describe it by continuing the citation from the previous book by Terrence P. Murphy:

After developing an interest in the Riemann Hypothesis, the first stopping point for many is Edwards’ excellent book Riemann’s Zeta Function. The Edwards book provides a wealth of information and insight on the zeta function, the Prime Number Theorem and the Riemann Hypothesis. And that brings us to the next group of people who do not need this book. If you eat, sleep and breath complex analysis, we will say you are at the “guru” level. In that case, the Edwards book will be easy reading and will provide you with the information you need to substantially advance your knowledge of Riemann’s Paper and the Riemann Hypothesis.

As you can tell, the “guru” level in math is required to fully digest this book. So it won’t be easy, to say the least.

A good introductory paper on the subject

If you are interested in a short but engaging introduction to the Prime Number Theorem and the Riemann Hypothesis, I recommend reading Don Zagier’s paper The First 50 Million Prime Numbers, published in New Mathematical Intelligencer (1977) 1-19.

If you know Russian you can read the same paper that was published in Russian only in 1984 in the Uspekhi Matematicheskikh Nauk journal.

Parting words

All in all, these five books can take a good chunk of a full year to work through, or possibly even more, especially the last two. So what are you waiting for? Life is too short to waste it on watching TV series or YouTube nonsense. The treasures of math and a deeper understanding of the world await those who know where to look.

Errata for the Thinking Better book and some commentary

This post is a continuation of the previous one about the Thinking Better book (ISBN-13: 978-1541600362) by Marcus du Sautoy, published in North America by Basic Books.

It seems like I was too eager to praise the book after reading just a few dozen pages. Even though on average the book is interesting to read, a number of things could make the content of Thinking Better even better. For example, more diagrams accompanying the explanations of various concepts would be helpful. Footnotes providing more details or sources for the cited papers would be helpful too.

Errata

Having found a number of possible mistakes in the book, I was sure that notifying the publisher, Basic Books, about them would be valuable and that they’d be happy to review and, if required, correct the content of the book. But since I sent an email to their customer support, I haven’t heard back. Below is the table of potential issues that I was able to find while reading the book.

| Page # | Actual content | Suggested content | Comments |
|---|---|---|---|
| p. 107 | However, I’m going to give Eratosthenes high marks for his calculation for the circumference of the earth because it is inspired. | However, I’m going to give Eratosthenes high marks for his calculation of the circumference of the earth because it is inspiring. | This seems like a spelling mistake. |
| p. 129 | It will also be to do with the nature of the rock, if the rock is very non-friable and firm. | It will also have to do with the nature of the rock, if the rock is very non-friable and firm. | This seems like a spelling mistake. |
| p. 149 | Figure 5.9. Feynman diagram of the interaction between an electron and a positron | The right diagram of electron-positron scattering can be found at the link below: Bhabha scattering. | You can tell it by the sign above ‘e’. Electron has ‘e-’, while positron ‘e+’. The diagram in the book is for interaction between an electron and an electron, namely Møller scattering. |

Some other suggestions

The suggestions below are based on my experience of reading dozens of popular-science books on mathematics, physics, neuroscience and biology.

Diagram and Figures

The book has a number of diagrams in each chapter. Though most of them are helpful, some are not. The main issue I see with the diagrams is that even though they are numbered (like Figure 5.9 in the table above), that number isn’t referenced in the body of the book. This makes it hard to relate a diagram to the text where it is discussed.

There are places where I would add diagrams to clarify the content, since without a diagram it is difficult to imagine what the text represents, or it takes quite some time to understand the author’s intent. For example, Figure 3.2. Six pyramids make a cuboid on page 84, is very confusing to say the least (no diagram is shown in my post due to copyright issues).

One additional example is the mention of the sieve of Eratosthenes on page 106. Usually, this method is visualized by a diagram, which helps a lot in understanding it. For a good visual example, refer to chapter 7 in the Prime Obsession book by John Derbyshire. Also check the Wikipedia article about the sieve of Eratosthenes.
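For readers who prefer code to diagrams, here is a minimal Python sketch of the sieve (my own illustration, not taken from Thinking Better or Prime Obsession). It starts with all numbers up to a limit marked as candidates and repeatedly crosses out the multiples of each prime it finds:

```python
def sieve_of_eratosthenes(limit: int) -> list[int]:
    """Return all primes <= limit by crossing out multiples of each prime."""
    if limit < 2:
        return []
    is_candidate = [True] * (limit + 1)
    is_candidate[0] = is_candidate[1] = False  # 0 and 1 are not prime
    p = 2
    while p * p <= limit:
        if is_candidate[p]:
            # Smaller multiples of p were already crossed out by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_candidate[multiple] = False
        p += 1
    return [n for n, keep in enumerate(is_candidate) if keep]

primes_up_to_30 = sieve_of_eratosthenes(30)
print(primes_up_to_30)
```

Whatever is left uncrossed at the end is prime, which is exactly what the usual grid diagram of the sieve shows step by step.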

Missing footnotes or notes

I agree that not all popular science books have footnotes or notes, but this particular book mentions a number of other books, papers and authors. Having footnotes, or notes at the end of the book, would have been beneficial to a curious reader. In one case, the book gives the name of an author of a cited paper inaccurately. In chapter 9, on page 261, it is written

Two mathematicians Duncan Watts and Steve Strogatz, discovered the secret, which they published in a paper in Nature in 1998.

The paper was Collective dynamics of ‘small-world’ networks, Nature 393, 440–442 (1998). And the authors were Duncan Watts and Steven Strogatz.

Missing bibliography

There are no references in the book. Bibliography is also not a mandatory part of popular science books, but in most of the ones I read it was there and helped find similar books on the subject or get more details about specific topics mentioned in the book.

Too harsh a criticism?

All in all, despite the drawbacks I mentioned above, the book was worth reading. I think my criticism has to do with the fact that The Music of The Primes, also written by Marcus du Sautoy, in 2003, didn’t have most of the issues I raised in this post.