Good afternoon, my name is Paula Aguiar and I would like to know the reasons behind the decision to limit the GPT chat with information up to the year 2021. And if there is any forecast to allow the instant updating of information. Thank you.
Safety and accuracy.
To train a Large Language Model AI, you first need access to a HUGE amount of documents it can read, that are suitable, that are screened for quality, because everything it will learn about language, about facts, about how words connect, and about how the words relate to each other and any kind of meaning, will come entirely from those documents. Both volume and quality of that ‘training corpus’ are critical to the outcome.
Any mistake in what it learns often means scrapping the entire AI and starting over, and the cost of each attempt is in the tens of millions of dollars.
Obviously, it takes time to simply assemble and then ‘clean up’ such a massive corpus of training material. Often years, just to assemble all that data. And because of the clean-up need, there will always be a lag between the documents added, and the final decision that the corpus is ready.
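To make that lag a little more concrete, here is a toy sketch of what a clean-up pass with a cutoff date could look like. It is only an illustration under my own assumptions: the function name, the document fields, the end-of-2021 cutoff, and the very crude quality checks are all made up for the example, and a real pipeline is far more elaborate.

```python
from datetime import date

# Assumed cutoff for illustration only; the real date is not given above.
CUTOFF = date(2021, 12, 31)

def clean_corpus(documents, cutoff=CUTOFF, min_words=5):
    """Keep documents dated on or before the cutoff that pass toy quality checks."""
    seen = set()
    kept = []
    for doc in documents:
        text = " ".join(doc["text"].split())  # normalise whitespace
        if doc["date"] > cutoff:
            continue  # too recent: not yet screened, so left out
        if len(text.split()) < min_words:
            continue  # too short to be useful
        key = text.lower()
        if key in seen:
            continue  # duplicate of a document already kept
        seen.add(key)
        kept.append({"date": doc["date"], "text": text})
    return kept

docs = [
    {"date": date(2020, 5, 1), "text": "Iron is a metal used in many tools."},
    {"date": date(2020, 6, 1), "text": "iron is a  metal used in many tools."},
    {"date": date(2022, 1, 1), "text": "A newer document about events after the cutoff date."},
    {"date": date(2019, 1, 1), "text": "Too short."},
]
cleaned = clean_corpus(docs)
```

Anything dated after the cutoff never reaches the model at all, which is why it simply does not know about later events.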
Then you have to train the AI. They use Neural Networks, which roughly attempt to model the way that natural brains function, all about the connections and how one thing connects to another. Like word association. Or like the way a certain scent of flowers might remind you of some relative’s garden from your youth, or rainy walks through a city you visited a decade ago, and how that in turn will remind you of colours, sights, people, places, and specific memories.
Take a word such as ‘iron’ as an example. It can be a raw material, the metal itself, or the mineral we need in our diet for healthy blood, or the device we use to press clothes after laundry, or weights at the gym when we are ‘pumping iron’. There are lesser paths too, such as ‘shooting iron’ as slang for a gun, and so forth. Your brain has all those connections just to that one word… And that’s what the AI tries to do.
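As a very rough illustration of the ‘iron’ example, the sketch below counts which words turn up in the same sentence as each other. To be clear, this is my own toy stand-in: a real LLM learns weighted connections inside a neural network rather than keeping simple counts, and the function name and sample sentences are invented for the example.

```python
from collections import Counter, defaultdict

def build_associations(sentences):
    """Count how often each pair of words appears in the same sentence."""
    assoc = defaultdict(Counter)
    for sentence in sentences:
        words = set(sentence.lower().split())
        for word in words:
            for other in words:
                if other != word:
                    assoc[word][other] += 1
    return assoc

# Made-up sentences covering a few senses of 'iron'.
sentences = [
    "iron ore is a raw metal",
    "iron in the diet keeps blood healthy",
    "use the iron to press clothes",
    "pumping iron at the gym",
]
assoc = build_associations(sentences)

# Words linked to 'iron', most frequent first.
print(assoc["iron"].most_common())
```

Even with four sentences, ‘iron’ ends up linked to ‘metal’, ‘blood’, ‘clothes’ and ‘gym’, which is the same idea of many paths leading from one word, just at a tiny scale.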
This training of an AI, often following a human supervised preliminary training just so it knows how to learn, takes a LOT of hardware and processing power. To do a training at a very modest (comparatively) speed and cost, they might need 100 of the most extremely powerful computers with top of the line graphics cards. They use these because a lot of AI work needs memory and cores, and while a normal CPU might have maybe 20-30 cores, GPUs have thousands upon thousands of cores.
Often that extreme need for a massive server farm means that the AI company needs to turn to someone like Google Cloud, Amazon AWS, or in the case of OpenAI to Microsoft’s Azure, and ‘rent’ all the necessary processing power for the necessary time. That’s why it costs tens of millions of dollars.
Did you ever hear about Tay, the earlier chatbot from Microsoft? See ‘Tay (chatbot)’ on Wikipedia.
It cost millions of dollars, and years of time investment to create, and it ‘broke’ within hours, permanently, because it was able and allowed to continue learning from people. In the case of Tay it could learn directly from what people would say to it. But even if it had only been allowed to learn from other documents online, that were not vetted, you can be absolutely certain that some people would prompt it to read hate sites, far right extremism (or far left), and even try to create documents specifically to ‘reprogram’ the AI to promote their cause, or simply to prove they could and how clever they were.
You can learn a lot more about the concerns and considerations that went into many of the decisions about ChatGPT in the post ‘Sam Altman (OpenAI CEO) talks about promises, risks, and fears’, where we share the really interesting interview that talks about exactly this.
Interesting, that concern is pertinent. However, it is also worrying and abstract when it comes to hate speech and extremism. Regulation regarding this issue is of utmost importance, but care must be taken when attributing matters of opinions and political biases. The discussion should be broad and inclusive for all political and social classes. How are these conversations going and who is investing in this technology? After all, this economic group can clearly interfere and bring legal and democratic instability to the entire world. The power that this technology has to rewrite history is enormous when it comes to adding and subtracting information, and that does generate some fear in me. I hope that authorities are treating this issue with great responsibility and that society also has a voice in defining the long-awaited neutrality in the face of this technological leap. Besides private investments, there will also be government investments in the near future - if they are not already happening - which, in turn, come from the population in general. I am excited for AI to move beyond this two-year window and be in the present. This can be a unique opportunity for us to grow as a society.
Sadly, the authorities and governments of most of the world are avidly and desperately avoiding any and all of the issues of AI. Politicians (and the agencies that depend on political budgets) really don’t like very complex issues that are not going to win them any votes. They’ll wait until either massive unemployment, or tax evasion, make it something they can see advantage to getting involved in. This is despite many, many, many senior AI scientists and concern groups begging the governments and other authorities to get involved and help avert problems rather than hope to capitalize on those problems far too late.
Italy is the only country that has taken any sort of real stance, and they were largely ridiculed and attacked for it. Admittedly, they took entirely the wrong approach, seeking to ban access to ChatGPT, rather than start an enquiry and debate that might advocate a more correct and considered action.
In my personal opinion, the right time for Governments to become involved and concerned (not worried, but having an actual interest and discussion about potential outcomes and implications) was years ago, when scientists around the world were first asking, and long before the release of ChatGPT.
You see, the history of AI goes back far, far further than ChatGPT and OpenAI. In fact, OpenAI was only formed by a group of investors after they saw how much all of the ‘big-tech’ companies (Google, Amazon, Facebook, Apple and Microsoft included) were investing, buying out every promising startup with intellectual property, and fighting to hire the best scientists in the field. Elon Musk and his friends saw an easy tech to invest in with a high exit price value.
If you go to what is almost certainly the absolute leader in the field of AI, DeepMind, the company acquired by Google and now owned by its parent company Alphabet, you can find a long history of press releases, and talks about their significant accomplishments for many years before ChatGPT came along. Please do take the time to look it all up and get to understand it all. After all, for better or worse, AI is in our lives now, all of us, and if we don’t invest some hours in really getting to understand it, not just tinkering and playing, that lack may hold us back for years to come against those who did invest that small effort.
The thing is, most of the AI scientists were more cautious and concerned than OpenAI was (OpenAI being the only company in its class not actually led by a true AI scientist, but rather a businessman). Google developed dozens of LLMs before ChatGPT, including PaLM and Med-PaLM (a fascinating one worth looking up: it can perform diagnosis almost as well as human doctors and with fewer harmful misdiagnoses than human doctors), but they kept all of these to small test groups, and tightly controlled limited releases where they could ensure how it was being used, and ensure that safety was paramount.
Most other AI companies and scientists followed this same route - safety first, limited, controlled, and monitored releases. But OpenAI broke ranks. Sam Altman implied in the interview I linked to in my prior reply that he did this because the proper debate wasn’t happening, and to force it, to put it all out there so that people had to have the meaningful discussions, and to decide what laws, limits, and guidance were necessary…
That may be, but it is also true that before that point, OpenAI were suffering a severe lack of funding, had no business model (thus my belief that they were always a ‘build to sell’ investment), and that releasing ChatGPT got them huge press and forced Microsoft to either make its (until then) extremely token investment significant, or risk OpenAI attracting a rival buyer. Those are a lot of very significant benefits to Sam Altman personally that I’m more than sure he was smart enough to see, so not entirely altruistic.
In turn, Microsoft did exactly what OpenAI hoped. They made a very significant investment (they now own 49% of OpenAI but with a special deal that gives them 100% of the profits until their investment is repaid). This gave Microsoft something to battle Google with in terms of publicity, helped their share value, gave a massive boost of interest to their Bing search engine (which was in such a position market-share wise that they had literally nothing to lose by taking any risk), and on top of all of that, the integration of GPT into Microsoft Office products alone made the whole investment worthwhile.
A few months back, hundreds of top people involved in AI, both scientists and tech investors, made a public campaign to ask for a halt on further AI development, just for 6 months, to really investigate the potential, the risks, the fallout… But of course, that would have meant giving OpenAI a complete monopoly for that entire 6 months, and, of course, the public were at the height of their excitement about ChatGPT, with no clue at all of the long-term or wider repercussions that are inevitable. The campaign failed.
And that is where we are at right now.
OpenAI are working full-speed on GPT-5 to capitalize on the lead they gained by surprising the other AI companies (the bold public release the others didn’t expect), and Google, Facebook and others are all working hard to release their answer to the surprise ‘ambush’ by ChatGPT. All terribly exciting in terms of the sheer exhilaration of discovery and development, but also now absolutely certain to change societies and economics forever, world-wide.
The Agricultural Revolution made huge numbers of agricultural labourers both unemployed and homeless, and changed the economy of nations. The Industrial Revolution was largely built on the sudden mass of unemployed people available to work in workshops and factories, and an increasingly urbanized population. The Industrial Revolution involved child labour, left men unemployed because female factory workers could do the same work for less money, and led to an even greater social and economic change in all nations.
I genuinely see little reason to believe that the AI Revolution is likely to be less tumultuous, painful, and far-reaching, and perhaps only our grandchildren will be able to look back at it as an entirely positive step, just as with those earlier revolutions.