What Are Autonomous Agents? And why are they the next AI wave after ChatGPT?

This will affect both how companies are created and how services are used, since agents could take over work that employees do today. Soon "solo startup" will have a different meaning.

May 07, 2023

Long story short:

Autonomous agents are AI-powered programs able to create tasks by themselves, finish tasks, reprioritize, and repeat this process until they achieve their goal. They will most likely have a considerable impact as the next AI wave. This will open up new ways of working, change what we consider an employee, and therefore create many opportunities.

Some of us are already fed up with SomethingSomething-GPT, while the rest of the world is still in the middle of the hype. The truth is that the GPT wave has already generated $100M opportunities and countless companies stemming from it, mostly built around question answering over documents and videos.

So, what’s next? ChatGPT-like solutions are closer to automation and a friendlier Google than to something that can learn and act on its own. The next step is autonomy: autonomous agents, which are expected to have a huge impact beyond the current fad around ChatGPT.

Autonomous agents are AI-powered programs able to create tasks by themselves, finish tasks, reprioritize, and repeat this process until they achieve their goal.

The most popular autonomous agent projects currently are Auto-GPT and BabyAGI. Both are independently developed open-source Python projects. They show the viability of using existing LLM APIs (GPT-3, GPT-4, or any of the alternatives) together with reasoning and tool-selection prompt patterns in an infinite loop, doing potentially endless, iterative work to achieve a high-level goal set by a human user. They do not train computationally expensive machine learning models of their own, but rather use existing ones. Please pay attention to the expression “high level”, as it is not just decoration.
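To make the loop concrete, here is a minimal sketch of the create / finish / reprioritize cycle described above. It is not the actual Auto-GPT or BabyAGI code: `run_agent`, `fake_llm`, and the `EXECUTE` / `CREATE` / `PRIORITIZE` prompt keywords are hypothetical names chosen for illustration, and the toy `fake_llm` stands in for a real LLM API so that the example runs offline.

```python
from collections import deque
from typing import Callable, Deque, List

# Assumption: an "LLM" is any function from prompt text to reply text.
LLM = Callable[[str], str]


def run_agent(goal: str, llm: LLM, max_steps: int = 10) -> List[str]:
    """Execute, create, and reprioritize tasks until none remain or max_steps is hit."""
    tasks: Deque[str] = deque([f"Make a plan for: {goal}"])
    results: List[str] = []
    for _ in range(max_steps):
        if not tasks:
            break  # nothing left to do: treat the goal as reached
        task = tasks.popleft()

        # 1. Finish the current task.
        result = llm(f"EXECUTE\nGoal: {goal}\nTask: {task}")
        results.append(f"{task} -> {result}")

        # 2. Let the model create follow-up tasks (one per line, or NONE).
        reply = llm(f"CREATE\nGoal: {goal}\nLast result: {result}")
        for line in reply.splitlines():
            new_task = line.strip()
            if new_task and new_task != "NONE" and new_task not in tasks:
                tasks.append(new_task)

        # 3. Reprioritize the queue; keep any task the model forgot to list.
        if len(tasks) > 1:
            reply = llm(f"PRIORITIZE\nGoal: {goal}\n" + "\n".join(tasks))
            ranked = [t.strip() for t in reply.splitlines() if t.strip() in tasks]
            tasks = deque(dict.fromkeys(ranked + list(tasks)))
    return results


def fake_llm(prompt: str) -> str:
    """Toy replacement for a real model (hypothetical behavior, for the demo only)."""
    if prompt.startswith("CREATE"):
        # Propose follow-up tasks only right after the planning step.
        if "plan drafted" in prompt:
            return "write landing page\nresearch competitors"
        return "NONE"
    if prompt.startswith("PRIORITIZE"):
        # Pretend the model ranks tasks alphabetically.
        return "\n".join(sorted(prompt.splitlines()[2:]))
    return "plan drafted" if "Make a plan" in prompt else "done"


if __name__ == "__main__":
    for line in run_agent("start and grow a mobile AI startup", fake_llm):
        print(line)
```

The `max_steps` cap is a design choice of this sketch: a truly infinite loop against a paid API can run up costs quickly, so a bound (or a human check-in) is a sensible default. Swapping `fake_llm` for a function that calls a real model is the only change needed to experiment further.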

“High level” means getting closer to general, goal-directed cognition rather than plain language-model automation. It also refers, indirectly, to the idea of self-learning.

Autonomous agents can perform a variety of tasks by following general guidelines related to planning, and can learn on their own (that is, add something they were not previously trained on). Examples range from managing a social media account to investing in the stock market to coming up with the best ice cream (including designing the campaign and writing the business plan), all from a single goal.

If you have 2 minutes, watch the live demo in which Yohei Nakajima sets the goal “start and grow a mobile AI startup”. Yohei on Twitter: "Watch it learn... this time, on how to build a "mobile AI startup" (2 min video) Still really rough, but it's fascinating to watch https://t.co/7QYuXFMpKO" (Please be patient: this is the full video, recorded in real time.)

Comparing the two, Auto-GPT is more sophisticated than BabyAGI by design. Auto-GPT can clone GitHub repos, start other agents, screen social media, etc. The feature list is public, so you can check for yourself:

Auto-GPT/prompt.py at ecf2ba12db11ff19bce359b842f810f0e2d09d6a · Significant-Gravitas/Auto-GPT · GitHub

For an overview of the idea, see the summary by Yohei Nakajima below:

Since their recent release, both projects have spawned a series of applications and web interfaces (such as Agent-GPT). In a research project from Stanford and Google (Park et al. 2023), an entire city full of agents, each with its own goals and interacting with the others, was set up:

It's pretty clear that soon our employees won’t be only humans. You will also be able to hire autonomous agents in your company, or run an entire one-human company with many agents.

Jim Fan, an AI scientist at Nvidia, says:

The recursive self cloning capacity excites me the most. To complete the work, the AI agent can duplicate itself, transmit task instructions, and begin communicating with its own sibling. These autonomous agents will be used for any work imaginable and in every industry.

The key features that make agents so unstoppable are:

1. Metacognition, the ability to talk (think?) about thinking.

The fact that a bot can talk about anything does not mean it understands what it is saying. Metacognition, if agents are endowed with it, bridges this gap somewhat. One attempt in this direction is long chain-of-thought prompting (see Wei et al. 2022).

2. External memory, such as access to Wikipedia, social media, recent movies and TV shows, external books, etc.

Indeed, they use ChatGPT, but not only ChatGPT.

3. Deployment automation

One of the criticisms of ChatGPT is that it often produces well-written code that is nonetheless full of bugs. Actually deploying the code from GitHub repositories might address this by giving the bot immediate feedback.

A few startups have popped up with the promise of deploying and managing personal data. Adept raised $350 million in its Series B round to build AI that learns how to use software for you. Everything is still based on their ACT-1 Action Transformer.

4. Self-learning

Deployment automation is almost nullified when new libraries appear, code versions are updated, or new languages are adopted, because everything will then crash. Self-learning is meant to address this, as you can see in this tweet about the Self-Learning Agent for Performing APIs (SLAPA):

And there are already companies up and running: Fixie.ai — Build on LLMs
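The feedback idea from point 3 (run the generated code and hand any error straight back to the bot) can be sketched in a few lines. This is only an illustration under stated assumptions: `run_python`, `fix_until_it_runs`, and the toy `fake_llm` are hypothetical names, not part of Auto-GPT, Adept, or Fixie, and a real system would execute untrusted code inside a sandbox rather than directly on the host.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, Optional, Tuple


def run_python(source: str, timeout: float = 10.0) -> Tuple[bool, str]:
    """Run source in a fresh interpreter and return (succeeded, stderr text)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(source)
        path = handle.name
    try:
        proc = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=timeout
        )
        return proc.returncode == 0, proc.stderr
    except subprocess.TimeoutExpired:
        return False, "timed out"
    finally:
        os.unlink(path)  # always clean up the temporary script


def fix_until_it_runs(
    task: str, llm: Callable[[str], str], max_attempts: int = 3
) -> Optional[str]:
    """Ask for code, run it, and feed the error back until it runs cleanly."""
    feedback = ""
    for _ in range(max_attempts):
        source = llm(f"Task: {task}\nPrevious error:\n{feedback}")
        ok, stderr = run_python(source)
        if ok:
            return source
        feedback = stderr  # immediate feedback for the next attempt
    return None  # give up after max_attempts failed runs


def fake_llm(prompt: str) -> str:
    """Toy model: writes buggy code first, then fixes it once it sees the error."""
    if "ZeroDivisionError" in prompt:
        return "print(1 / 1)"
    return "print(1 / 0)"


if __name__ == "__main__":
    print(fix_until_it_runs("print the number one", fake_llm))
```

Note that "runs without an exception" is a weak success criterion; a fuller version of this sketch would run a test suite instead of only checking the exit code.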

Overall, it seems pretty exciting and scary at the same time. The previous fear, “AI will steal our jobs”, can be replaced by “That entire company is made of AI agents” 😊. Jokes aside, rather than focusing on replacing people's work, we should focus on augmenting what people can do, or on giving them an agent as a buddy co-worker.

Here is a conversation we had about this topic in our lab: