[{"content":" What led us here? # AI is currently experiencing the biggest wave of investment and excitement that it has ever seen. It has become accessible and mainstream, but where did it come from? As it turns out, the development of Artificial Intelligence is a pretty natural consequence of the upward trends in the quantity of digital information, and availability of computational power. What follows is a brief exploration of these trends, and how they relate to AI.\nThroughout history there have been multiple paradigm shifts in how information can spread. From cave paintings, to the invention of alphabets and languages, to the eventual creation of the printing press. Followed later by the digital age of radio, television, computers and the internet. The trend is clear. Over time we have developed newer and better ways to spread information to an ever-increasing number of people. Now, a large majority of the world\u0026rsquo;s population have access to near-infinite amounts of information.\nOf course, it is the computers and the internet which are particularly relevant to this discussion, and there are a number of important points to highlight. The first of which is [Moore\u0026rsquo;s law][https://www.asml.com/en/technology/all-about-microchips/moores-law], which states that the number of transistors on a microchip doubles approximately every two years. This pattern has largely held up over the last fifty years (hence why it became known as a law). This exponential rise in compute power led to the democratisation of computers; they became smaller, cheaper and more accessible. As computers became more mainstream so did the internet, and more and more people were establishing an online presence. New digital-native services began popping up, providing convenience which was not possible otherwise. The writing was on the wall. This was the future.\nEnterprises began adapting to this new online environment and quickly identified the golden goose. Personalisation. 
Dynamically customising the content which users see and interact with based on what generates the most revenue. There is only one issue - to accurately recommend things to users you need to know things about them. You need data. The more data you have the better decisions your systems can make. Enter [Big Data][https://cloud.google.com/learn/what-is-big-data] - large and diverse datasets which rapidly grow over time.\nBig Data comes with its own unique problems. As these datasets grow massively in size, storing them becomes a major challenge. There is also the issue of dynamic scaling. Online user traffic is not consistent. There are periods when it spikes overnight due to external events such as holidays, product announcements, or concert tickets going on sale. Permanently maintaining the infrastructure required to deal with these traffic spikes is extremely wasteful, as throughout most of the year it is not required. The solution to these problems is now known as [cloud computing][https://aws.amazon.com/what-is-cloud-computing/]. In reality there is no \u0026ldquo;cloud\u0026rdquo;; it is just somebody else\u0026rsquo;s computer. Large corporations with spare hardware began renting it out to those who needed it. As the business model proved successful, new datacenters were built specifically with this intent. This removed the responsibility of managing your own infrastructure. You no longer had to worry about storage or compute restrictions; you simply paid for what you needed, when you needed it. If you could afford it, compute power was effectively limitless.\nThis brings us to where we are today. Access to massive amounts of data coupled with the ability to process it presented an exciting new possibility: to build systems which analyse this data, find patterns, and make decisions by themselves. Artificial Intelligence. Research in the area is receiving huge investment at the moment, and there is a lot of competition to find the next breakthrough. 
We are amidst a rat race. The question is - what is the cheese? Is it solving humanity\u0026rsquo;s biggest problems? Or creating an infinite stream of revenue?\nA brief history of AI # While AI is the current hot topic, we need to acknowledge that in reality it is not a new concept. We must understand the history and recognise the giants whose shoulders we stand on today.\nAll the way back in 1950, Alan Turing published his paper [\u0026ldquo;Computing Machinery and Intelligence\u0026rdquo;][https://courses.cs.umbc.edu/471/papers/turing.pdf] where he posed the question of whether machines can exhibit intelligent behavior which is indistinguishable from that of humans. The birth of interest in the field of AI can largely be attributed to this paper. It sparked a plethora of research and resulted in numerous advancements in the following decades, many of which are still relevant today.\n1951 - The first artificial neural network was built - [SNARC][https://en.wikipedia.org/wiki/Stochastic_Neural_Analog_Reinforcement_Calculator]. 1955 - The phrase \u0026lsquo;[Artificial Intelligence][https://www-formal.stanford.edu/jmc/history/dartmouth/dartmouth.html]\u0026rsquo; was coined by John McCarthy, Marvin Minsky, Nathaniel Rochester and Claude Shannon. 1957 - Frank Rosenblatt developed the [Perceptron][https://bpb-us-e2.wpmucdn.com/websites.umass.edu/dist/a/27637/files/2016/03/rosenblatt-1957.pdf], an early artificial neural network. 1962 - Arthur Samuel pioneered the concept of machine learning by developing a [computer program that improved its performance at checkers over time][https://www.ibm.com/history/early-games]. The program beat Robert Nealey in a publicised match. 1966 - Joseph Weizenbaum developed [ELIZA][https://dl.acm.org/doi/10.1145/365153.365168], known as the first \u0026ldquo;intelligent\u0026rdquo; chatbot despite being rule based. The capabilities of the bot raised a lot of ethical questions around human-computer interactions. 
1969 - The concept of [backpropagation][https://books.google.com/books?id=P4TKxn7qW5kC\u0026amp;printsec=frontcover] was first introduced by Arthur Bryson and Yu-Chi Ho. The 1970s, however, were not quite as promising. They brought on the first major \u0026ldquo;[AI Winter][https://en.wikipedia.org/wiki/AI_winter]\u0026rdquo; - a period of decreased funding and interest in AI research. This is largely attributed to [the Lighthill report][https://www.chilton-computing.org.uk/inf/literature/reports/lighthill_report/p001.htm] which claimed that the field had not produced the significant breakthroughs which were promised. This led to a considerable loss of support from companies and governments.\nAlthough the excitement and funding returned in the 1980s, it did not stay for long. In 1984 at an annual meeting of the American Association for Artificial Intelligence (AAAI) the world was cautioned of another impending AI Winter (in fact this is where the phrase was originally coined). It was predicted that history was going to repeat itself and, as in the 1970s, investment and research would collapse due to inflated expectations and underwhelming results. Within just three years this became reality and the second AI Winter had arrived.\nA pattern was emerging: an initial wave of hype and excitement led to overblown expectations which could not be met, and in turn to disillusionment. As we reflect on where we are today we must keep the past in mind and ride the wave of excitement with cynicism at heart, so as not to repeat the same mistakes.\nNonetheless, over time the research returned and throughout the next decades many more significant advancements occurred, particularly after we entered the 21st century. Some of these are:\n1988 - Rollo Carpenter developed the [Jabberwacky][https://en.wikipedia.org/wiki/Jabberwacky] chatbot which learned from human interactions, rather than being rule based. 
1989 - Handwritten [ZIP code images were recognised by a Convolutional Neural Network][https://ieeexplore.ieee.org/document/6795724] at Bell Labs. 1997 - [Deep Blue][https://www.ibm.com/history/deep-blue] beat Garry Kasparov at chess. 2009 - Andrew Ng et al. published research which recognised the [advantage of GPUs for AI workloads][https://robotics.stanford.edu/~ang/papers/icml09-LargeScaleUnsupervisedDeepLearningGPU.pdf]. 2011 - Apple launched [Siri][https://www.apple.com/newsroom/2011/10/04Apple-Launches-iPhone-4S-iOS-5-iCloud/]. 2012 - Andrew Ng et al. trained a [neural network which learned to recognise images of cats][https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/38115.pdf]. 2016 - [AlphaGo][https://deepmind.google/research/alphago/] beat Lee Sedol at Go. 2017 - Facebook\u0026rsquo;s [chatbots developed their own language][https://engineering.fb.com/2017/06/14/ml-applications/deal-or-no-deal-training-ai-bots-to-negotiate/] in communication between each other. It is 2026 and AI has once again taken center stage. Many believe that we are in the middle of an AI Spring - but what exactly happened between 2017 and now?\nThe transformer goes boom # Despite the title of this section, it will not discuss the Transformers movie franchise. Instead the focus will be on the transformer model architecture which reshaped the landscape of AI in the last ten years.\nThe idea of a model which can understand natural language and interact with a human has been around for decades - as noted above with ELIZA, Jabberwacky and many others which came along the way. However, there has been a recent revolution in the space with the arrival of the transformer architecture in 2017; introduced in the paper - [Attention Is All You Need][https://arxiv.org/pdf/1706.03762] by Ashish Vaswani et al.\nThe transformer model is a type of neural network architecture based on attention (hence the title of the paper). 
This concept allows transformers to determine the relationships between each part of the input, and focus on what is most important about a specific sequence of data. It tells the AI model what to pay attention to. For example, it could interpret words within a specific context (where a word may have multiple meanings), or assess the significance of a word within a sentence. While this is very simple for humans, it was a huge problem for computers. Attention changed the game.\nConsider the text below and think about how the context changes the meaning of individual words. This example has been taken from [IBM\u0026rsquo;s page about transformers][https://www.ibm.com/think/topics/transformer-model].\n\u0026ldquo;On Friday, the judge issued a sentence.\u0026rdquo;\nThe preceding word “the” suggests that “judge” is acting as a noun: a person presiding over a legal trial, rather than a verb meaning to appraise or form an opinion. That context for the word “judge” suggests that “sentence” probably refers to a legal penalty, rather than a grammatical “sentence.” The word “issued” further implies that “sentence” refers to the legal concept, not the grammatical concept. Therefore, when interpreting the word “sentence”, the model should pay close attention to “judge” and “issued”. It should also pay some attention to the word “the”. It can more or less ignore the other words.\nHistorically, neural networks ingested input data sequentially, one element at a time, and in a particular order. This approach faced issues with performance and with keeping track of relationships between distant data points, particularly with large inputs. Attention mechanisms do not face these limitations. Not only do they process the entire sequence simultaneously (improving the ability to understand long-range dependencies), but this quality also enables the computations to be done in parallel rather than one at a time. 
This allows transformers to take full advantage of the capabilities of the GPU ([which greatly outperforms the CPU in parallel tasks][https://robotics.stanford.edu/~ang/papers/icml09-LargeScaleUnsupervisedDeepLearningGPU.pdf]), and solve many of the performance issues faced in older neural network solutions.\nOne of the first majorly successful applications of transformers was [BERT][https://blog.google/products/search/search-language-understanding-bert/] - Bidirectional Encoder Representations from Transformers - introduced by Google in 2018. It could predict masked words and classify input text. The model quickly became ubiquitous, and in 2019 Google began using it to better understand Search queries. Despite its fame though, BERT was soon overtaken in popularity by [OpenAI\u0026rsquo;s GPT models][https://developers.openai.com/api/docs/models/all]. GPT stands for Generative Pre-trained Transformer. GPTs generate new data by applying the patterns they identified in pretraining as a response to user input. During pretraining, they are fed unlabeled data and forced to make sense of it on their own. After the pretraining stage, the model can be fine-tuned towards a specific task with the use of labeled data. Fine-tuning can also be combined with a \u0026lsquo;reinforcement learning from human feedback\u0026rsquo;-based objective. By finding patterns in these datasets, GPTs can then draw similar conclusions when exposed to new inputs such as a user\u0026rsquo;s prompt, and generate a response.\nWith the release of ChatGPT in 2022, the general public was introduced to the world of Generative AI, which has taken the world by storm. AI has never been so mainstream and accessible before. 
This raises the question - are we ready for it?\n","date":"3 July 2026","externalUrl":null,"permalink":"/blog/ai/what-led-us-here/","section":"Blog","summary":"This is a summary","title":"What led us here?","type":"blog"},{"content":"","date":"3 July 2026","externalUrl":null,"permalink":"/tags/ai/","section":"tags","summary":"","title":"ai","type":"tags"},{"content":"Some technologies can be harmful to humanity. Artificial Intelligence may very well be one of them; but it does not have to be.\nWe must remember that it is us, humans, who are responsible for how this technology is developed. We must then also accept and deal with the consequences of our actions, whether they were accounted for or not.\nThe problem statement is simple. How can we ensure we make the right decisions so that AI does not turn out to be harmful? Unfortunately, the answer is much more complex.\nTo meaningfully participate in this discourse, one must first understand the technology. Where did it come from? How does it work? What is it capable of? The goal of this series is to provide this understanding, and establish a baseline from which you can form your own conclusions.\n","date":"3 July 2026","externalUrl":null,"permalink":"/blog/ai/","section":"Blog","summary":"A blog series on artificial intelligence — its history, its present, and what it means for the rest of us.","title":"AI and Its Consequences","type":"blog"},{"content":"","date":"3 July 2026","externalUrl":null,"permalink":"/series/ai-and-its-consequences/","section":"series","summary":"","title":"AI and Its Consequences","type":"series"},{"content":"","date":"3 July 2026","externalUrl":null,"permalink":"/blog/","section":"Blog","summary":"","title":"Blog","type":"blog"},{"content":" You can just do things. 
","date":"3 July 2026","externalUrl":null,"permalink":"/","section":"Home","summary":"","title":"Home","type":"page"},{"content":"","date":"3 July 2026","externalUrl":null,"permalink":"/series/","section":"series","summary":"","title":"series","type":"series"},{"content":"","date":"3 July 2026","externalUrl":null,"permalink":"/tags/","section":"tags","summary":"","title":"tags","type":"tags"},{"content":" Utils Excalidraw - Draw on a whiteboard and create diagrams Monkeytype - Minimalistic typing test / practise RemovePaywall - Read paywalled articles Blogs The Best Viewpoints - Blog about summiting the tallest peaks in Europe Travel Water At Airports - Where to find water refill stations at airports Tech Open Source Guide - Resources to grow open source projects Papers Tail at Scale - Latency of systems at high scale Bigtable - Distributed storage system for structured data Deepseek R1 - Reinforcement learning and distillation Bitcoin - Peer-to-Peer electronic cash system Game Dev Doom - Behind the Music - GDC talk by Mick Gordon about the music of DOOM 2016 Overwatch - Animations in First Person - GDC talk about the animations in Overwatch Doom - Binary Space Partitioning - Rendering techniques created by John Carmack ","externalUrl":null,"permalink":"/bookmarks/","section":"Home","summary":"","title":"Bookmarks","type":"page"},{"content":"","externalUrl":null,"permalink":"/categories/","section":"categories","summary":"","title":"categories","type":"categories"},{"content":" ","externalUrl":null,"permalink":"/gallery/","section":"Home","summary":"","title":"Gallery","type":"page"}]