
Beyond Prompts

Is prompt engineering, hailed as “the number one job of the future,” really the next big thing?

Before Growth is a weekly newsletter about startups and their builders before product–market fit, by 3x founder and programmer Kamil Nicieja.
  • Despite its current hype, the significance of prompt engineering might be short-lived.
  • We touch on a new wave of health apps that leverage psychology to increase user retention and yield better results.
  • Can real-time collaboration become a commodity feature like search? This startup seems to believe so.

I don’t like prompts.

It’s true that the basics are straightforward and open to everyone. However, for tasks with edge cases, tasks where vague preferences have to be expressed clearly, or tasks that demand a precise understanding of LLM behavior, writing a good prompt and choosing the right examples for it can be difficult. Here’s an example:

Three brilliant minds are using the “tree of thoughts” methodology to collaboratively tackle a question. Each expert will thoughtfully build upon the prior contributions of others, acknowledge any mistakes, and enhance the collective understanding. They will iterate on each other’s insights and give due recognition. This iterative process will continue until they arrive at a definitive solution. The entire discussion will be organized in a markdown table for easy reference. The question under consideration is…

This technique is known as tree-of-thought prompting. It’s particularly useful for complex problems requiring foresight or exploratory reasoning, where standard prompting methods usually aren’t effective.

Tree-of-thought prompting organizes the thought process into a branching structure, with each node representing a step or a logical progression towards the solution. In mathematics, for example, each node could be an equation. We discussed this and other advanced prompting methods, such as chain-of-thought, reflection, and chain-of-density, in No Work Is Ever Wasted.
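
To make the branching idea concrete, here is a minimal sketch in Python. It is not how any particular model or library implements tree-of-thought; it only shows the shape of the search. The names `tree_of_thought`, `propose_steps`, and `closeness` are hypothetical, and the two toy functions stand in for what would be LLM calls (one to suggest next steps, one to judge them) in a real system.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]  # a partial solution: the steps chosen so far


def tree_of_thought(
    root: State,
    propose: Callable[[State], List[State]],
    score: Callable[[State], float],
    beam_width: int = 2,
    max_depth: int = 4,
) -> State:
    """Breadth-first search over partial solutions, keeping the best few per level."""
    frontier = [root]
    best = root
    for _ in range(max_depth):
        # Expand every node on the frontier into its child "thoughts".
        children = [child for node in frontier for child in propose(node)]
        if not children:
            break
        # Keep only the most promising branches and prune the rest.
        children.sort(key=score, reverse=True)
        frontier = children[:beam_width]
        if score(frontier[0]) > score(best):
            best = frontier[0]
    return best


# Toy stand-ins for the model (assumptions for illustration only).
TARGET = 7


def propose_steps(state: State) -> List[State]:
    # Each node branches into three possible next steps.
    return [state + (step,) for step in (1, 2, 3)]


def closeness(state: State) -> float:
    # Higher is better; 0 means the steps add up to the target exactly.
    return -abs(TARGET - sum(state))


solution = tree_of_thought((), propose_steps, closeness)
print(solution, sum(solution))
```

The prompt quoted earlier asks a single model to simulate this whole process in one response. Running the search explicitly, as in the sketch, trades simplicity for control over how many branches are explored and when they are pruned.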


What about images, though? I found a prompt for a generated image on my Playground feed:

Digital realistic art of a Plymouth Road Runner 426 Hemi, tinted windows, speeding through the city streets of Miami, the sun shining on the bright orange body and chrome parts. Professional digital painting made with alcohol inks and acrylic, in the style of WLOP, RHADS, APK, vibrant colors, sharp focus, vanishing point, three-point perspective. High quality.

The author uses very specific phrases to guide the model to their desired outcome. I wonder how much experimentation it took to achieve that. I doubt I could produce something similar without further training, even though I’ve been working with generative AI for the past year.

We discussed this very issue in a previous edition of Before Growth:

To start, users need to be eloquent enough to craft the necessary textual prompts effectively. […] Seeking more advanced answers from these models, which they’re capable of providing, also requires more detailed prompts. [A prompt] leaked through a […] prompt injection attack for DALL-E 3 spans 932 words and 5,704 characters. I doubt everyday users will face this problem much, but I think people in the business world will eventually have to grapple with these arcane commands. Prominent UX researcher Jakob Nielsen refers to this issue as the “articulation barrier.”
The Articulation Barrier
Some people can’t write or dislike it. Yet, products that rely on generative AI require them to do just that.

It seems likely that prompts will become less important over time. This trend is already evident in cutting-edge tools. OpenAI’s DALL-E 3 integrates natively with ChatGPT, allowing you to use ChatGPT as a collaborator to refine your prompts. You can describe what you want to see in the chat, using anything from a simple sentence to a detailed paragraph.

However, ChatGPT doesn’t directly use this description for the image prompt. Instead, it creates its own command—a case of two AI systems interacting independently.

Let me show you an example. This is an image prompt created from a very simple command I gave to ChatGPT:

Kamil to ChatGPT: Draw a cat, please.
ChatGPT to DALL-E 3: A realistic image of a domestic cat sitting peacefully. The cat has a fluffy coat, with distinct tabby markings in shades of grey and black. Its eyes are bright green, and it is gazing curiously at something out of frame. The background is simple, with a soft focus to keep the emphasis on the cat. There’s a hint of a comfortable living room setting, with a cozy couch and a warm, soft light that enhances the cat’s fur texture.

Research shows that GPT-4 is, in fact, more effective at eliciting user preferences through questioning than users are at conveying their needs through prompts, and that the process requires less mental effort from them.

I’ve crafted a simple prompt demonstrating how to get a similar outcome:

As a language model, your role involves examining and refining prompts for systems similar to yourself. When a user presents a prompt, break down its purpose, considering the user’s objectives and motivations. Reflect on the constraints and optimal strategies for engaging with large language models in scenarios like the user’s. Your task is to reshape the prompt into a directive for another neural network to maximize effectiveness. Recognize that the user may not have expertise in prompt engineering, so your assistance is key in enhancing their interaction.

The prompt is: “Write a short story about knights.”
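
In code, this pattern is a two-stage pipeline: one model call rewrites the user’s prompt, and a second call executes the rewritten version. The sketch below is a rough illustration under stated assumptions, not a real client integration. `refine_then_run`, `call_model`, and `fake_model` are hypothetical names, `REFINER_INSTRUCTIONS` is a condensed paraphrase of the prompt above, and the fake model exists only so the example runs offline.

```python
from typing import Callable

# Condensed paraphrase of the refiner prompt above (illustrative wording).
REFINER_INSTRUCTIONS = (
    "You examine and refine prompts for large language models. "
    "Work out the user's objective, then rewrite the prompt as a directive "
    "for another model. Return only the rewritten prompt."
)


def refine_then_run(user_prompt: str, call_model: Callable[[str, str], str]) -> str:
    """Two-stage pipeline: the first call rewrites the prompt, the second executes it."""
    refined = call_model(REFINER_INSTRUCTIONS, f'The prompt is: "{user_prompt}"')
    # The second call has no special instructions; it just receives the refined prompt.
    return call_model("", refined)


# Hypothetical stand-in for a model client, taking (system, user) text.
def fake_model(system: str, user: str) -> str:
    if system == REFINER_INSTRUCTIONS:
        return f"Detailed brief based on: {user}"
    return f"[story written from] {user}"


result = refine_then_run("Write a short story about knights.", fake_model)
print(result)
```

Passing the model in as a function keeps the pipeline independent of any one provider, and it makes the refinement step easy to test on its own.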

I believe this approach will become the norm: experts in various fields will create task elicitation systems specific to their domains. These systems will then help casual users engage with generative AI more easily and achieve better results, all without needing to learn prompt engineering.

As models start to take over the simpler aspects of prompt engineering before the field fully matures, AI engineers will focus on the more complex tasks. These include versioning, testing, fine-tuning, and deploying prompts and models. They will use a variety of advanced skills required to develop applications with this new generation of models.
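
As a rough idea of what versioning and testing prompts could look like in practice, here is a small sketch. Everything in it is an assumption made for illustration: `PromptVersion`, `run_regression`, the content-hash versioning scheme, and the `fake_model` stand-in are hypothetical and not taken from any specific tool.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PromptVersion:
    name: str
    template: str

    @property
    def version(self) -> str:
        # Content hash: any edit to the template yields a new version id.
        return hashlib.sha256(self.template.encode("utf-8")).hexdigest()[:8]

    def render(self, **fields: str) -> str:
        return self.template.format(**fields)


def run_regression(
    prompt: PromptVersion,
    cases: List[Dict],
    call_model: Callable[[str], str],
) -> List[str]:
    """Run each test case through the model and collect human-readable failures."""
    failures = []
    for case in cases:
        output = call_model(prompt.render(**case["fields"]))
        if case["must_contain"].lower() not in output.lower():
            failures.append(
                f"{prompt.name}@{prompt.version}: missing {case['must_contain']!r}"
            )
    return failures


# Hypothetical stand-in for a model call so the sketch runs offline.
def fake_model(prompt_text: str) -> str:
    return f"Story: {prompt_text}"


v1 = PromptVersion("story", "Write a short story about {topic}.")
v2 = PromptVersion("story", "Write a vivid short story about {topic}.")
cases = [{"fields": {"topic": "knights"}, "must_contain": "knights"}]

print(v1.version != v2.version)
print(run_regression(v2, cases, fake_model))
```

Tying the version id to the template text means a test result can always be traced back to the exact wording that produced it.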


Your questions, answered…

I understand that There Are No Experts, There Is Only Us is written in good faith, but I feel like screaming at the screen because I’ve worked with CEOs who had an answer for everything, even when they didn’t, and it’s… not a pleasant experience.
—Mike

Certainly. The line between confidence and arrogance is thin. I’m not suggesting leaders should dismiss all feedback, as opinions vary and ultimately, you have to trust your own judgment. In situations where there are definitive answers, ignoring your team’s suggestions would be unwise. My point was more about not giving too much weight to negative feedback, especially from those who aren’t genuinely invested or from those who are involved but lack a clear consensus on the way forward.

As a CEO, your team definitely has skin in the game. They’ve committed their time to your vision, and if you’ve granted them equity, they have a financial stake as well. Therefore, I believe their feedback should be held in higher regard than that of the outside world, with the possible exception of potential customers.

However, imagine a scenario where your team is divided on how to tackle an issue. Some team members might advocate for one approach, while others support a different one. Your investors might even weigh in with their own suggestions, as they sometimes do. As a leader, this puts you in a lonely position. If you make the right decision, people might not express gratitude, considering it part of your job. On the other hand, if the decision proves wrong, you’re likely to hear “I told you so.” This is just part of the cost that comes with leadership.

This should not be seen as an excuse. Don’t act like a dick and be contrarian just for the sake of feeling like the smartest person ever born.

🗓️
Let’s dive into this week’s recap.

1-hour introduction to LLMs

If you’re looking for a beginner-friendly overview of generative AI’s technical aspects, Andrej Karpathy has a fantastic one-hour introduction on YouTube.

He released it this week. It’s aimed at a general audience and is highly accessible and well presented.

Discover ways to improve your sleep, with psychology

Finding non-AI startups to discuss seems challenging lately. You may have observed my efforts to maintain a balance. Whenever my weekly essay is about AI, I try to focus the weekly summary on different aspects of early-stage companies. To this end, I’ve picked a couple of refreshing companies that aren’t working on LLMs, computer vision, or AGI.

The first company I want to highlight is Stellar Sleep. They’ve developed a health app specifically for those struggling with insomnia. While at first glance this might not seem extraordinary, given the plethora of sleep trackers available, what sets Stellar Sleep apart is its foundation in established behavioral therapy techniques for insomnia. This method might ring a bell if you’re familiar with the health market. It’s a strategy similar to Noom’s, which combines scientific principles and personalization to help with weight loss and its long-term maintenance.

