Second Language Acquisition. A Possible Link to Deep Learning. Part two.


Why is it important to understand how second language acquisition works?

Nowadays we live in a world that is more interconnected than ever before. The Internet, and social media in particular, have made communication nearly instant, which in turn opened up opportunities to communicate with people who speak different languages. The slight complication is that in order to speak with someone who knows a different language than you, you first need to learn that language. This process is called second language acquisition (SLA). Of course, technology found a workaround for this problem: machine translation. Early machine translation systems were rule-based, and the results were not that good. Then came statistical, phrase-based methods. Finally, in November 2016, Google launched end-to-end Neural Machine Translation based on artificial neural networks, an approach now known as Deep Learning. The results of this new approach were quite impressive compared to the struggling previous ones. If you haven’t used this service until now, try it and see for yourself. Since I am a native Russian speaker, I provide below an example of English-to-Russian translation that I can discuss.

English to Russian

The link to this specific translation is here. I think any Russian speaker would agree with me that this machine translation is grammatically correct and sounds fine.

Then back to the subject of SLA. It seems to me that looking into how the Deep Learning techniques and models now used for Natural Language Processing (NLP) work, such as the word embeddings introduced by Tomas Mikolov and Long Short-Term Memory (LSTM) networks, a special case of Recurrent Neural Networks, may be very useful in tackling SLA. These and other approaches employed in machine translation may shed light on some aspects of first and second language acquisition in humans. Even though currently used neural networks are based largely on an oversimplified and superficial model of a neuron, dating back to the Perceptron introduced in the late 1950s, the successes of such methods cannot be easily dismissed. Why is that?
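As a quick aside, the intuition behind word embeddings can be shown with a toy example. The vectors below are made up purely for illustration (real embeddings such as Mikolov’s word2vec are learned from large corpora and have hundreds of dimensions), but they show how arithmetic on word vectors can capture relations between words:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dimensional embeddings, hand-crafted for this sketch.
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1, 0.2]),
    "queen": np.array([0.9, 0.1, 0.8, 0.2]),
    "man":   np.array([0.1, 0.9, 0.1, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9, 0.1]),
}

# The famous analogy: king - man + woman should land near queen.
target = embeddings["king"] - embeddings["man"] + embeddings["woman"]
best = max(embeddings, key=lambda w: cosine_similarity(embeddings[w], target))
print(best)  # queen
```

With learned embeddings the same arithmetic works over tens of thousands of words, which is part of what makes them so useful for translation.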

The notion “probability of a sentence” is an entirely useless one, under any known interpretation of this term. (Chomsky, 1969)

In recent years, thanks to advances in Graphics Processing Unit (GPU) capabilities and the introduction of new architectures and methods in artificial neural networks, now known as Deep Learning, one of the long-standing challenges, machine translation, seems to have finally yielded. The quote above belongs to Noam Chomsky, the founder of Generative Grammar and Generative Linguistics, and it may finally be proclaimed wrong; the successes of statistical machine translation based on Recurrent Neural Networks may be seen as evidence against Generative Grammar theory. But if Generative Grammar is not that useful for modeling first or second language acquisition in humans, what else is? In the following parts I’ll offer my suggestions for which approaches may be more efficient in modeling natural language. As it isn’t hard to guess, Deep Learning may provide at least a partial answer.
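Incidentally, the “probability of a sentence” that Chomsky dismissed is exactly what statistical language models compute. Here is a minimal sketch of a bigram language model over a made-up toy corpus; real systems train on billions of words, and neural models replace the raw counts, but the idea is the same:

```python
from collections import Counter

# A tiny toy corpus, invented for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat saw the dog",
]

tokens = []
for sentence in corpus:
    tokens += ["<s>"] + sentence.split() + ["</s>"]

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))

def sentence_probability(sentence):
    """P(sentence) under a bigram model with add-one smoothing."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    vocab = len(unigrams)
    p = 1.0
    for prev, word in zip(words, words[1:]):
        p *= (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)
    return p

# A grammatical sentence from the corpus scores higher than a shuffled one.
print(sentence_probability("the cat sat on the mat"))
print(sentence_probability("mat the on sat cat the"))
```

Even this crude model assigns higher probability to the well-formed word order, which is the property machine translation systems exploit at scale.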







Second Language Acquisition. What do we know? Part one.


I hope this post will be the first in a series of posts I want to write on the topic of second language acquisition, abbreviated in linguistics as SLA. SLA refers to a language that a person learns as a second language (L2) after having acquired their first, native language (L1). Research into the subject shows that first and second language acquisition are interconnected and may affect each other, so it makes sense to discuss first language acquisition too.

Why am I interested in this topic?

Since childhood I have been interested in how we learn languages. As my life progressed from childhood to where I am now, I happened to acquire two languages with a very high level of proficiency and learned a number of others to some extent. As a native Russian speaker growing up in Ukraine, I learned Ukrainian as a second language at school, but my knowledge of it is quite superficial, though I understand it well when I hear it. Then I learned and spoke Hebrew for about 19 years. Even though I also studied English back in Ukraine, I never knew it well before I started learning it, mostly by reading magazines, back in 1999. So I would say my real experience with English spans about 19 years too, though one important point to make is that I have only used it for spoken communication for about 2 years now. In addition, back at Tel Aviv University I studied Japanese for a year, but my diminishing knowledge of it is rudimentary.

To summarize the above, I would rate my knowledge of these languages as follows, where by knowledge I mean speaking, reading and writing.

  1. Russian 
  2. Hebrew
  3. English
  4. Ukrainian
  5. Japanese

I hope that this background explains a little why I might be interested in understanding how we learn a new language, be it a second, third or Nth language.


It is very strange that we know so little about how we learn first or second languages, taking into consideration the advances in neuroscience since the early 2000s and in artificial neural networks (now also known as Deep Learning) since 2012. I first heard about the subject of SLA back in 2004, when I studied Generative Linguistics at Tel Aviv University. When looking into the state of the art of the research back then, I heard only about Noam Chomsky’s and Stephen Krashen’s work on the subject. Now, almost 15 years later, the state of the art seems frozen in the same place. But my intuition tells me that by incorporating approaches from supervised machine learning, including Recurrent Neural Networks such as LSTMs and Convolutional Neural Networks with attention mechanisms, along with the very promising research done at the Numenta company and other approaches, it is possible to make significant progress in the field of first and second language acquisition.

A more detailed description of what I propose will follow in further parts.

Deep Learning for Time Series book


Is it for you?

Are you struggling to find easy-to-digest, easy-to-implement material on Deep Learning for time series? Then look no further and try the newest book by Jason Brownlee from Machine Learning Mastery: ‘Deep Learning for Time Series Forecasting’.

What’s inside?

The book will help you apply both classical and deep learning methods to time series forecasting. It is no exception to what you expect from Machine Learning Mastery books: hands-on and practical, with plenty of real-world examples and, most importantly, working and tested code samples that may form the basis for your own experiments.

You may very much like the real application of deep learning to the Household Energy Consumption dataset, which is used to train CNN, CNN-LSTM and ConvLSTM networks with good accuracy.
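To give a flavour of the kind of data preparation such forecasting models need, here is a minimal sketch of framing a univariate series as a supervised learning problem. The function name and the numbers are my own illustration, not code from the book:

```python
import numpy as np

def split_sequence(sequence, n_steps):
    """Frame a univariate series as supervised pairs: input window -> next value."""
    X, y = [], []
    for i in range(len(sequence) - n_steps):
        X.append(sequence[i:i + n_steps])
        y.append(sequence[i + n_steps])
    return np.array(X), np.array(y)

series = [10, 20, 30, 40, 50, 60, 70]
X, y = split_sequence(series, n_steps=3)
print(X.shape, y.shape)  # (4, 3) (4,)
print(X[0], y[0])        # [10 20 30] 40
```

Once a series is in this shape, it can be fed to a CNN or LSTM just like any other supervised dataset.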

What’s so special about the book?

I personally was fascinated by the Time Series Classification chapter, which applies Deep Learning to the Human Activity Recognition (HAR) dataset with quite accurate predictions. What I liked most about HAR is the fact that raw gyroscope and accelerometer measurements from a cell phone are used to train the DL models without any feature engineering. A video of the dataset preparation is shown here.

What’s next?

In the next post I’ll take one of the Human Activity Recognition examples from the book and try to expand it using the Extensions part of the chapter.

If you are able to do it before me, please feel free to provide your feedback in the comments section.


Did you know that Google’s Colaboratory gives you the opportunity to use GPUs for free while working on your own deep net implementations? More than that, you can easily share these Jupyter notebooks with your peers.


Statistical Methods for Machine Learning. Is it for me?


Statistical Methods for Machine Learning?

In this age of flourishing Deep Learning frameworks that let you train and run a model in a matter of minutes (or more), practitioners tend to underestimate why they need statistical methods in their toolbox. It turns out that machine learning, and Deep Learning as a sub-field of it, use statistical methods extensively throughout the training-inference pipeline, from data preparation to model performance validation. So yes, if you are not aware of how those methods may be helpful, it is time to have a look at the new Statistical Methods for Machine Learning book by Dr. Jason Brownlee from Machine Learning Mastery. This book explains in simple terms, with practical examples, what statistical methods are and how one can incorporate them into day-to-day settings.

What is there for me?


Have you ever studied the normal (Gaussian) distribution at college, university or elsewhere, but never really understood how to apply it in a real situation? Have you ever wondered what a p-value is, and whether there is a better way, such as estimation statistics, which quantifies the size of an effect or the amount of uncertainty around a specific outcome, rather than only whether there was a difference between samples? In addition, the book clarifies the difference between the Law of Large Numbers and the Central Limit Theorem, which are frequently confused with one another.
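To illustrate that distinction with a quick simulation (my own sketch, not an example from the book): the Law of Large Numbers is about the mean of one ever-larger sample, while the Central Limit Theorem is about the distribution of means of many samples.

```python
import random
import statistics

random.seed(42)

def die_rolls(n):
    return [random.randint(1, 6) for _ in range(n)]

# Law of Large Numbers: the mean of one large sample approaches
# the expected value (3.5 for a fair die).
big_sample_mean = statistics.mean(die_rolls(100_000))
print(big_sample_mean)

# Central Limit Theorem: the distribution of sample means of many
# small samples approaches a normal distribution centred on 3.5,
# even though a single die roll is uniformly distributed.
means = [statistics.mean(die_rolls(30)) for _ in range(2_000)]
print(statistics.mean(means), statistics.stdev(means))
```

The spread of those sample means shrinks roughly as 1/√n, which is exactly what makes the CLT the workhorse behind confidence intervals and significance tests.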

The book also includes hands-on code examples that are tested and work correctly, and that may be a good starting point for your own machine learning projects.

Is it worth buying?

The book is worth buying if you intend to be a more productive machine learning practitioner: one who not only runs code from tutorials but also understands how to prepare and analyse data for an algorithm efficiently, one who strives to get better results from models by evaluating them with statistical methods, and one who values code snippets that he or she can build upon in their own machine learning projects.

Parting Words

All in all, Statistical Methods for Machine Learning has all the merits of Machine Learning Mastery books: it is easy to grasp, brings immediate practical value applicable from the start, and is a joy to read.


How often to post on a blog?

[Update 2018-03-28]

Only today I posted on this blog that there had been no breakthrough in the Deep Learning field so far in 2018. Boy, was I wrong. Welcome this exciting paper born out of a collaboration between David Ha (Google Brain) and Jürgen Schmidhuber (one of the creators of the LSTM recurrent neural network).

This paper finally implements what Yann LeCun mentions in all his recent talks: an agent that acts on its internal model of the world.

World Models

John Sonmez advises posting each week for a blog to gain momentum and grow. I surely agree with this statement, since I have seen it actually work. But as it happens, I haven’t posted anything for about two months now. There were a couple of topics I wanted to write posts about, but never did. In the upcoming days I’ll try to write on topics that spark my curiosity and that may be of interest to the readers of this blog.

It seems drumming will be one of the topics, then physics, such as how cloaking devices may work. There may be a piece on aviation with regard to stealth aircraft. Certainly, programming is also one of the topics that I like. Deep Purple, sorry, Deep Learning is progressing steadily, but no huge breakthroughs are visible despite optimistic forecasts made by various commentators in the field.

In addition, reviews of science fiction movies and stories may be a possible topic for a blog post, or even a sci-fi story written by me. Recently I saw a number of movies that had an interesting sci-fi idea at their core, but in my opinion the idea wasn’t elaborated as well as it could have been. I mean movies such as Downsizing, which missed the point completely, and the more successful, but nevertheless under-delivering, Annihilation.

That’s it for today. Stay tuned, and feel free to suggest topics you want me to report on within the fields mentioned above.

What do you think?
What is the right frequency of posts in a blog? 

Linear Algebra for Machine Learning


Who is this book for?

If you are striving to get deeper insights while reading Deep Learning or Machine Learning papers, but are a little bit rusty with linear algebra, then Basics of Linear Algebra for Machine Learning by Jason Brownlee from Machine Learning Mastery is just for you!

What is this book all about?

  • This book is a gentle introduction to Linear Algebra for people interested in machine learning.
  • As with all books written by Jason, it features a hands-on, practical Python approach.
  • It comes with a number of exercises for each chapter.
  • It has extensive references for each chapter too.
  • It feels like a good tool for practitioners new to Deep or Machine Learning.
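As a small taste of why linear algebra matters for machine learning, here is a sketch (my own, not from the book) of fitting a linear regression with the normal equations; the data is made up from y = 2x + 1 so the correct answer is known:

```python
import numpy as np

# Design matrix: a bias column of ones plus the feature x = 0, 1, 2, 3.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # generated from y = 2x + 1

# Normal equations: solve (X^T X) w = X^T y for the weights w.
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)  # close to [1. 2.] — intercept 1, slope 2
```

Matrix notation like this is the bread and butter of every machine learning paper, which is exactly the fluency the book aims to build.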

Additional resources that may come in handy

I personally liked two books that Jason mentions in his latest book:

  • No Bullshit Guide To Linear Algebra, which is the best book of its kind in my opinion, with examples from quantum physics and more.
  • Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville. This book is more or less the current Bible of Deep Learning.

What are you waiting for?

Grab one of the books and get amazed by applied math and Deep Learning.


How to start with Deep Learning?

What is Deep Learning in a nutshell?

Deep Learning is a hot topic these days and draws a lot of attention from people around the globe. The technology is applicable to various fields, such as image recognition and classification, speech recognition and generation, self-driving cars, etc. There are a number of definitions of what Deep Learning actually is. I find this definition by Lex Fridman from MIT, as he puts it in his latest arXiv paper on the subject of self-driving cars, quite simple:

Deep Learning can be defined as a branch of machine learning that seeks to form hierarchies of data representation with minimum input from a human being on the actual composition of the hierarchy.
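To make that hierarchy of data representations concrete, here is a toy sketch of a forward pass through a small network. The weights are random, so it only illustrates the layered structure, not a trained model:

```python
import numpy as np

np.random.seed(0)

def relu(x):
    return np.maximum(0.0, x)

# Each layer re-represents its input, forming the hierarchy the
# definition refers to; training would tune these random weights.
x = np.random.randn(4)                  # raw input features
h1 = relu(np.random.randn(8, 4) @ x)    # first, low-level representation
h2 = relu(np.random.randn(8, 8) @ h1)   # second, more abstract representation
out = np.random.randn(2, 8) @ h2        # task-specific output (e.g. 2 classes)
print(out.shape)
```

The point is that no human decided what h1 or h2 should encode; with enough data, the training procedure discovers the hierarchy on its own.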

The best way to start with Deep Learning

If you are interested in getting to know what Deep Learning is and how it can be applied in practice, the best way is to try to apply it yourself. Don’t worry, there is no need to enroll in a PhD program in machine learning anymore: the state of Deep Learning technology is such that with a dozen lines of code, leveraging existing machine and deep learning libraries along with pre-trained models, it is possible to implement exciting applications of Deep Learning, such as image classification, image caption generation and more.

All you need is a practical end-to-end working example

To jump-start into Deep Learning (DL) right away, I propose you have a look at the Machine Learning Mastery site, and specifically at the latest book there, which is related to DL and is called ‘Deep Learning for Natural Language Processing’.

This book is composed of a number of self-contained tutorials concerned with applying DL techniques to natural language processing tasks, such as sentiment analysis, image caption generation and language translation. What is nice about it is that it shows you how to apply these techniques all the way from installing the required machine learning libraries to implementing a DL pipeline from start to finish. All the code samples mentioned in the book work and do the job; you can take them as a starting point and expand them with your creativity.

Although the tutorials are quite independent, they are arranged so that the complexity of the applications grows from simple to more advanced.

The book engages you to try extensions and enjoy coding in Python

The book uses Python and its rich ecosystem of machine and deep learning libraries, such as Keras, to make your life easier and more enjoyable. What is different about this book is that each chapter provides references to all the papers and books relevant to that chapter, so you don’t waste time looking them up yourself. In addition, and this is the best part in my opinion, each chapter provides a number of extensions to think about and implement for the application described, such as playing with different model architectures, tuning hyper-parameters, etc.

So why are you still reading this post?

Try this book by executing every example in it, play with the examples by expanding them, and I am sure you’ll get a feeling for what Deep Learning is and how it produces quite fascinating outcomes when the model you trained predicts something like this:

This is what Deep Learning network trained to translate from German to English thinks about Canadians.

src=[wir sind kanadier], target=[we’re canadians], predicted=[we’re unusual]