AI Everywhere All At Once
How is it that all these different AI flavors appeared at once? Usually, when a new technology is this unexpected and counterintuitive (AI? That’s just a cheap, repackaged search engine), the pioneering company has about five years before competitors catch up. The iPod was introduced in 2001, and it wasn’t until 2006 that Microsoft countered with the Zune. Amazon launched its now-dominant web services in 2006, but it wasn’t until 2010 that Microsoft countered with Azure, followed by Google Compute Engine in 2012.
Before 2017, advanced AI felt like a faraway prospect. Natural Language Processing (NLP) was dominated by Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks. The philosophy back then was to have the machine read sentences much like we do. To understand a sentence, the system had to process all the prior words in order to make sense of the current one. This proved unwieldy because the computer would often "lose track" of these preceding words as the text grew longer. It was also computationally inefficient; since each new word required the algorithm to process the preceding chain of words, the speed was slow and scaling was severely limited.
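The bottleneck described above can be sketched in a few lines. This is a toy recurrent loop, not the actual LSTM equations: the point is only that step t cannot begin until step t-1 has finished, and the whole sentence must be squeezed into one fixed-size state vector.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                  # toy hidden size
W_h = rng.normal(size=(d, d)) * 0.1    # recurrent weights (random, for illustration)
W_x = rng.normal(size=(d, d)) * 0.1    # input weights

def rnn_read(embeddings):
    """Read word vectors one at a time, left to right."""
    h = np.zeros(d)                    # the network's entire "memory" of the text so far
    for x in embeddings:               # strictly sequential: each step waits on the last
        h = np.tanh(W_h @ h + W_x @ x)
    return h                           # one small vector must summarize everything read

sentence = rng.normal(size=(10, d))    # ten toy "word" vectors
summary = rnn_read(sentence)
```

Because the loop carries `h` forward step by step, it cannot be parallelized across words, and early words fade from `h` as the sequence grows: exactly the "losing track" and slow-scaling problems described above.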
Then, in June 2017, a new paper called “Attention Is All You Need” changed the world. We know that all words are not equal; some are more central to a sentence's meaning than others. The new approach assigned each word an "attention score" reflecting its relevance to every other word. (In the preceding sentence, for example, these scores would tell the computer that the word “word” relates to “sentence,” not “score.”) Because this replaced slow, sequential calculations with a more direct approach, large swathes of text could finally be processed at once. This allowed for parallel processing, enabling a level of scaling limited only by the infrastructure itself (i.e., servers and electricity).
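The parallelism described above can be made concrete with a minimal sketch of scaled dot-product attention, the core operation from the paper (stripped of the multi-head machinery and learned projections a real Transformer would use). Note that there is no loop over words: every word's relevance to every other word is computed in one matrix multiplication.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: all words attend to all words at once."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # pairwise relevance of every word to every other
    scores -= scores.max(axis=-1, keepdims=True)        # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax: each row sums to 1
    return weights @ V                 # each output is a weighted mix of all the words

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 8))           # ten toy word vectors of dimension 8
out = attention(X, X, X)               # self-attention: Q, K, V all come from the text
```

The whole computation is a handful of matrix products, which GPUs execute in parallel across the entire sequence; this is what removes the word-by-word bottleneck of the recurrent approach.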
Armed with this breakthrough, the eight scientists behind the paper left Google to seek fame and fortune. Like a plant releasing its pollen into the air, Google’s researchers distributed their knowledge far and wide across Silicon Valley. Consequently, companies like Essential AI, OpenAI, Cohere, and others could now perfect this new architecture, called the Transformer. The core technology thus became an open secret, preventing any single firm from dominating the start of the revolution.
But this was 2017. Why did we have to wait until late 2022 for ChatGPT to appear? That will have to wait until next time. See you then.