AI Ranch podcast with Evalds Urtans

Mostly talking about AI, Pactum and selling to enterprises. It was recorded at the end of August, so some things are already likely stale!

Find the podcast on Youtube or your favourite podcast app.

Excerpts:

I’m not very surprised or disappointed by GPT-5, and that’s because the trajectory is actually quite straightforward. The scaling laws tell us that if you put 10x more compute and data in, the quality only improves logarithmically. So to get something that feels like a linear jump in intelligence, you have to keep putting exponentially more money into training. That works when you’re scaling from a million to ten million, but once you’re at billion-dollar runs, it becomes a capital question, not a research question.

According to Karpathy, the progress in 2025 mostly came from post-training in verifiable environments like math and code -- from his excellent 2025 LLM Year in Review.

And talking in more detail about the MIT AI negotiation competition:

I went into the MIT negotiation competition thinking I’d be building an agent in code—defining logic, guardrails, strategies. Then I discovered it was basically a prompting competition. You only had access to a text box, and your prompt was glued into a much larger system prompt you couldn’t see. So instead of building the best negotiator, I started exploring how fragile open-ended LLM negotiations really are, and ended up ‘negotiating’ by hacking the prompt—getting the other AI to reveal its limits or even accept deals it was never supposed to.

And:

This competition made it very obvious why we don’t do open-ended LLM negotiations in production. When everything is free text, you barely have control over behavior, and small quirks in formatting or interpretation can completely break the system. It’s not that the model is malicious—it’s that it’s not thinking like a human. It’s just following probabilities, and that makes these kinds of agent setups extremely fragile.

Since the competition (January 2025) the models have become better at avoiding these kinds of attacks.

Evalds also brought up a concern from boardrooms regarding building on top of something that is lossmaking in the billions. Here's my response:

Well, the money has to come from somewhere, but that doesn’t necessarily mean it comes from the consumer. It could also be that companies go bankrupt or just take the loss. If you look at the late 90s and early 2000s, when undersea optical cables were laid between continents, there was massive overbuilding. Way too much capacity. What happened wasn’t that internet traffic suddenly became extremely expensive. Instead, the companies that overbuilt went bankrupt or absorbed the losses.

I think what’s actually loss-making is the training, not the inference. Inference is not necessarily unit-negative. Once a model is trained — even if the original company shuts down — someone else can still run inference on it. Especially if it’s open source. The model doesn’t disappear.

And even if you build on a specific provider and they shut down or change pricing drastically, the market is extremely competitive right now. The frontier models are very close to each other. So I’m not worried about dependency in that sense. If we build on Gemini, for example, and Google raises prices 100x or shuts it down, I know I can find an alternative that’s close enough — including open-source models.