Read as if you edit: Editing Imbalanced Classification with Python

Reading can be hard, but rewarding

I like to read books. They give me the opportunity to discover new worlds and learn new things. Unlike other sources, e.g. YouTube tutorials, which I find a little distracting, books don’t seduce you into clicking on them; instead they lie on a flat surface and don’t care. In addition, it’s quite hard to jump from one physical book to another in a haphazard way, reading them in parallel. But when you find an interesting book, say a novel, it can draw your attention and hold you captive until you finish reading it. And there are books that are interesting and at the same time require a certain amount of concentration and work from the reader to get the most out of them.

I call such reading a workout. It’s similar to physical exercise, which can be unpleasant at times but rewards you with a deeper understanding and grasp of concepts. It also resembles editing a book; call it testing, even alpha testing if you have a software background. By reading a book as if you were editing it, paying attention to details and working through it from A to Z, you are bound to better understand the information the book tries to convey.

Read as if you edit

With the recent wave of high interest in Machine and Deep Learning, a lot of books have been published on the subject to satisfy hungry readers. They range from popular explanations for a general audience to technical books that teach readers how to apply Machine Learning to day-to-day practical applications. The Machine Learning Mastery web site provides a number of such books, written with a hands-on, experience-first approach. This makes them perfect candidates for the read-as-if-you-edit approach, since the best way to get actual practical experience in Machine Learning is to apply the examples from each chapter. For readers who aren’t familiar with Machine Learning Mastery books, all of them are structured in a similar way: each chapter has just enough theory to get you started using practical code samples.

It is possible to only read through the books without running a single code sample, feeling that you understand how things work and being happy with yourself. The issue is that this approach brings almost zero value and provides you with no real experience. Instead, think of yourself as an editor or a tester tasked with finding mistakes, omissions, unclear explanations, or broken code samples. Doing this will help you get the most out of the book, since it forces you to actually run the code and play with it by adjusting it. It also helps you gain a better understanding of the material by cross-referencing unclear points on the internet or in other books.

Don’t think that the read-as-if-you-edit approach is only applicable to Machine Learning books. I find it just as useful for books on mathematics, physics, and engineering. In fact, it can be applied to any source of written information; it then becomes a critical reading approach, where you don’t blindly trust what you read, but instead analyze and verify the information.

So how was it, editing the Imbalanced Classification with Python book?

I very much liked editing this recent book, since it had enough theory, math, and new machine learning concepts to keep me excited to work through it from start to finish. The book has about 450 pages of actual content, and it took me about three hours a day for nine days to finish. I can’t say it was smooth and easy. The content, at least for me, required cross-checking against other sources. More than once, the code samples required referencing the documentation of Python libraries and quick dives into sources on imbalanced classification, statistics, and information theory.

All in all, reading the book from A to Z made me realize the importance of knowing that your data can be imbalanced, as in the case of anomaly detection, and that you cannot train a model assuming an equal distribution between the positive and negative classes, since such a model will tend to misclassify the rare class in practice.
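The point can be illustrated with a short scikit-learn sketch. This is my own illustration, not a code sample from the book: on synthetic data with a roughly 1% positive class, a plain classifier tends to favor the majority class, while telling the model about the imbalance (here via class weighting) recovers more of the rare class.

```python
# Illustration (not from the book): effect of class imbalance on a classifier.
# Assumes scikit-learn is installed.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Generate a dataset where only ~1% of samples are positive,
# as in many anomaly-detection settings.
X, y = make_classification(n_samples=10000, n_features=10,
                           weights=[0.99, 0.01], flip_y=0,
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# A plain model implicitly assumes the classes matter equally often...
plain = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# ...while class_weight="balanced" penalizes mistakes on the rare class more.
weighted = LogisticRegression(max_iter=1000,
                              class_weight="balanced").fit(X_tr, y_tr)

# Recall on the rare (positive) class shows the difference:
recall_plain = recall_score(y_te, plain.predict(X_te))
recall_weighted = recall_score(y_te, weighted.predict(X_te))
print(f"plain recall:    {recall_plain:.2f}")
print(f"weighted recall: {recall_weighted:.2f}")
```

Class weighting is only one of the techniques the book covers; resampling methods such as SMOTE (from the imbalanced-learn library) address the same problem from the data side.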

 
