Agent clouds
Isolated environments for running LLM tasks locally and in the cloud.
Today, intellectual work happens when you sit down at your computer to write software, edit a presentation, or manage the books. But this is highly inefficient. You have ideas about how something should be, yet doing every job the vision demands is more than any single person can do. This is how Steve Jobs was able to bring a product like the iPhone to life. It was not a matter of having a vision and suddenly it existed, no. Behind the scenes there was surgical business engineering that allowed such a product to be born.
Conventional businesses today employ people to fill roles. Contrarian businesses spawn agents in the cloud, because productivity can only compound once the system is completely scalable.
I believe many roles are no longer needed. Accounting can be sandboxed in the cloud, marketing can be engineered, products can be built with agents. That is what it looks like on the surface. But don't be fooled: such orchestration requires careful deliberation and the ability to keep many threads in mind continuously. The primitive AI tools we use today will take fuller form in the future, but even now you can build virtually anything; the orchestrator just needs sufficient skill and taste to make something great.
The probabilistic nature of large language models is paradoxically a strength. It makes the models flexible enough to execute tasks in whatever direction you give them. The distinction that will differentiate your outputs from everyone else's is the direction you set. Like a conductor, your focus must be on guiding these intelligent systems and becoming knit into how their minds work.
Agent intelligence
Social intelligence is the matter of knowing how other humans think, feel, and why they act in certain ways. Agent intelligence is the matter of how well the conductor can manage and guide his system of machines to do what he desires. Like Napoleon in the history books, who understood his men and knew how to activate their spirits, you now have the power to direct near-limitless brainpower with your instructions. The potential you sit on is enormous.
Take time to deliberate, but when the time for action comes, stop thinking and go in.
Let's look at how you can make agent clouds obey your orders:
a) Your agents will perfectly execute your imperfect orders. See, any issue guiding these advanced systems is your own skill issue. Imperfect instructions will be executed perfectly, and what gets done will be nothing like your implicit expectations. The first step is to get your mind completely clear on what you expect the agent to do. Like Napoleon's precise strategy, there should be no surprises a subordinate must deal with that you, the conductor, have not instructed them to deal with. This entails the ability to think across many separate fields and hold those thoughts deeply in mind. Think through the moves that must be made, and create a detailed map from which an agent can execute the attack for you, repeatedly, for the rest of time.
b) Any large language model is, at its core, a set of weights, and those weights make the model probabilistic at a fine granularity. You can use this knowledge and design your orchestration with code as checks and balances on the model. Code is a perfectly deterministic system you can build your agents inside of. This means all of your agents must return structured data for all of your tasks. Three formats LLMs have proven themselves capable of emitting reliably are JSON (JavaScript Object Notation), XML (Extensible Markup Language), and Python. When using models, think about the intentions of the labs that created them: these models are trained to be highly capable of running tools with code. This is how you instruct agents to come into contact with the laws of nature, iteratively, through the code tools you design to validate their work.
Most meaningful creative work cannot be boxed in and validated perfectly with code, but your validations can focus on the data integrity of every iteration, forcing iterative self-healing loops instead of first-try failure. This is common engineering practice and must be included in your systems.
c) As the orchestrator, you are continuously on the lookout for areas where agents are underperforming. You know something is off when you feel a certain distaste toward whatever you are working on. It is incredibly hard to get agents to hit your golden-nugget definition of quality. It is not fundamentally impossible, though: it is completely within technological feasibility to build insanely great products and architectures with today's agents. You must, however, work with these models for a tedious amount of time before things start to click. There is as much skill to working with LLMs as to any other type of engineering. A helpful mental model is that you are working with probabilistic machines; any LLM-based component of a system obeys fewer deterministic rules.
Getting models to work for you is the key to all of this. You must build up a sufficient skillset and intuition for how these models operate inside environments, and how they operate tools.
Think like an engineer
Any engineer today should spend time finding out how models fit into their flow to increase the intellectual ability they own. When you work with these models, when you train them, when you integrate them into your apps, they become extra layers of intelligence you own. Learning to work a model's intelligence to the max is therefore a high-leverage skill for any engineer. As an engineer, founder, or similar high-level thinker, you possess the greatest ability to conduct intelligence beyond your physical confines. This is a superhuman ability, and it is hard to learn.
When approaching building anything today, I think about which parts of the system are high entropy. Where will novelty enter the system? Where will deterministic code be unable to solve the problem? Those modules are where you want to design LLMs into your app. Deterministic code still wins for the majority of modules.
Speed of model inference is an additional factor to consider. If you are working with a lot of data, use code. If you are working with smaller chunks of data, LLMs work better. Here is an example:
A system that calculates insurance agents' commissions. Inputs: Excel files and manual UI CRUD actions. Outputs: PDF files and clean reports for agents and leadership to close the books and report information for a given period.
This is a perfect example of how not to use LLMs:
Here are the source files, go calculate all of the reports for me.
This is terrible for a number of reasons. First, you have no idea how large a commission Excel file can become; in the worst case, a huge file saturates the LLM's context window immediately, rendering the entire system useless. Second, you are relying on luck to calculate the commissions. Remember, LLMs are fundamentally probabilistic, even when they supposedly use tools to check their work with Python. Third, the results will not replicate perfectly; the probability of an exact replication is not high enough to design an entire system around.
But let's step back a moment. There are incredible abilities within these models which we cannot simply sweep under the rug. If you find LLMs work perfectly for something, double down. Because of scaling, and the AI labs working hard on your behalf, any LLM-based system gets free upgrades simply through the release of new models. This is a powerful wave you can surf. Keep it in mind.
Back to the example. The perfect design, in my opinion, is the following:
Build comprehensive models in code so that files can be parsed directly. If, or when, parsing fails at the code level, fall back to an agent sandbox in the cloud. A close analogy is the engineer at a firm or startup who plays the role of firefighter: somebody has to fix things (put out the fire), and fix them fast. By building in a fallback to an LLM sandbox, perhaps using the Claude Agent SDK, you can build an anti-fragile system, one that gains strength (new models) from errors ("aha, found it!"). Finally, wrap this inside a nice-looking UI, and the user never notices.
This is a key mental model. Most people obsess over how LLMs will be used in the front end. But what if the front end stays simple, clean, perhaps even free of an LLM, and the backend is where the LLMs are put to work? In the backend, models can function as bridges wherever deterministic code has a high probability of hitting an error of some sort. A model can serve as a self-healing property inside a piece of code.
Self-healing code with LLMs
Setting up LLMs in the stack to function as self-healing modules is a very, very interesting idea. It increases the value of any piece of code written, because the code can adapt to the world in a malleable manner. Deterministic code fails at the first sight of discrepancy, the first thing that is different. But the world is not static. This is what we will discuss now.
Philosophically, the moment you integrate an LLM module into a piece of code, you give away the deterministic nature of that code. You no longer know whether the code will replicate your instructions perfectly. What I wish to engineer instead is code that uses LLMs as problem solvers, directly in the code behind the UI, to satisfy the user's demand.
How do we operationalize LLMs inside a backend? Using sandboxes, we can containerize them, protecting our system against the chance of severe errors made by an LLM. Sandboxes are isolated virtual machines running in the cloud: we ship the code we want to run into the cloud, then collect the output from the sandbox. The output is validated inside the virtual machine; nothing leaves the sandbox without being validated against some schema or requirement. This is a key design function of a sandbox, and why great companies are being built on this technology (such as E2B).
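The validate-before-exit idea can be shown without any vendor SDK. This sketch assumes a hypothetical entrypoint shipped into the VM along with the agent's task; the `status`/`result` schema and every name here are illustrative, not any real sandbox API:

```python
import json

class SchemaError(Exception):
    """Raised inside the VM; invalid output never crosses the boundary."""

def validate_inside_sandbox(raw_output: str, required_keys: set[str]) -> str:
    """Runs *inside* the sandbox as the last step before returning.
    Only output matching the schema is allowed to leave."""
    data = json.loads(raw_output)  # malformed output dies here, in the VM
    if set(data) != required_keys:
        raise SchemaError(f"expected keys {required_keys}, got {set(data)}")
    return raw_output  # only validated bytes cross the sandbox boundary

def sandbox_entrypoint(agent_task) -> str:
    """Hypothetical entrypoint: `agent_task` is whatever produces the
    agent's raw output inside the VM (a model run, a tool loop, etc.)."""
    raw = agent_task()
    return validate_inside_sandbox(raw, {"status", "result"})
```

Because the check lives inside the isolation boundary, a misbehaving agent can corrupt only its own disposable VM; the host system only ever sees schema-conforming data.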