TL;DR
- Generative AI is a remarkable tool you and your team members should use every day to work faster.
- Generative AI is inherently inaccurate. Beware.
- Generative AI does not think or reason. That’s on you. Draw the line at decision-making and critical thinking.
Checking in on generative AI
As the AI hype train continues to roll down every track in tech these days, I thought it would be a good time for a quick, pundit-free progress report on how generative AI is working out for those of us on the ground building digital products and services.
Please note that I’m scoping my observations to generative AI (“genAI”). That means large language models that generate and summarize text like ChatGPT, text-to-image models like MidJourney, and text-to-music/video models. I’m not including other kinds of AI, like the flavor that drives cars, or AGI — artificial general intelligence — which will never/maybe/probably/certainly destroy humanity.
I use genAI every day. And so do my colleagues. It’s a remarkable tool that, net-net, helps us move faster.
Here’s my current genAI stack and how I use each bit:
- Perplexity.ai for question-based internet searches, like “what is the most current and popular Ruby gem for reading Outlook calendar events?”
- GitHub Copilot for autocompleting boilerplate code, like making an API call and parsing a JSON response
- MidJourney in places where I would usually stick in stock photos, like blog posts and newsletters
- ChatGPT for summarizing things, finding synonyms, and general writing tasks like suggesting grammatical changes
- ChatGPT for basic coding tasks, like writing devops command-line scripts or modernizing legacy JavaScript code
- OpenAI APIs to summarize live goal updates and check-ins in Steady (we’re trying other APIs for this too as the tech gets commoditized)
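To make the Copilot item concrete, here is the kind of API-call-and-parse boilerplate it typically autocompletes. This is a hedged sketch, not code from Steady: the endpoint URL and the `events`/`title` field names are invented for illustration, and the parsing step is split out so it can be exercised without a live server.

```ruby
require "net/http"
require "json"
require "uri"

# Fetch a URL and parse the JSON body, raising on non-2xx responses.
def fetch_json(url)
  response = Net::HTTP.get_response(URI(url))
  raise "HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  JSON.parse(response.body)
end

# Pull the event titles out of a (hypothetical) calendar API payload.
# Kept separate from the HTTP call so it is testable offline.
def event_titles(payload)
  payload.fetch("events", []).map { |event| event["title"] }
end

# Usage, against a made-up endpoint:
#   titles = event_titles(fetch_json("https://api.example.com/v1/events"))
```

Tedious to type, trivial to verify — exactly the sweet spot for autocomplete.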
Without question, each of these components replaces otherwise longer or tedious workflows. They have most certainly made work better for me and my team in these areas.
But there are pitfalls. Two biggies in particular:
First, genAI results can be wildly inaccurate but are presented with the confidence of a numeric answer on a pocket calculator. And that problem is not going away anytime soon.
The underlying problem with this “hallucination” issue is mistrust. I know that some answers will be wrong, but I don’t know which ones. So I have to proceed like all of them are wrong. And that realization eats a giant chunk out of the efficiency gained by using genAI in the first place for many tasks.
For code generation, that means scrutinizing output line-by-line for syntax errors or, worse, security issues. For summarization tasks, that means reading everything over and checking for inaccuracies.
Every once in a while, it just would have been faster to make the thing from scratch.
Second, because of the aforementioned hype train, every product manager seems to be under a mandate to drop a sparkly ✨ button into every flow of every app. It’s awfully tempting to click away and have it fill some text in a place where critical thinking is required.
But thinking and reasoning are not what genAI does. The large language models that drive genAI are token (word) generators. They generate responses to a prompt by putting words together based on how words went together in the data they were trained on. They are shockingly awesome at that, and it feels magical. But it’s not thinking, it’s tokenizing. It’s building up words into sentences based on probabilities from the training data — stuff that humans already put into the world.
The last thing I want is for me or anyone else on the team to skip the thinking step for any kind of knowledge base article, strategy document, roadmap work, or customer-facing communication. If we do and reach for genAI instead, the result is, at best, guaranteed not to be original thinking. At worst, it’s jargony gobbledygook.
For example, the value of a design brief at Steady is rooted in the fact that a designer like Adam wrote it. He thought through the design problem, leveraged a long career of product experience, and injected his first-hand knowledge of our customers, their needs, and our overall strategy. The result is novel and bespoke. It would be detrimental to our customers and business (and Adam!) if he skipped the thinking step by tabbing through the “Ask AI to write …” prompt Notion has frustratingly added to every interaction.
Put another way, what value does a knowledge base have to your business if it consists of re-hashed material written by an LLM? For my team and probably yours, thinking is the currency, no?
Again, net-net, genAI is an incredible tool that speeds up workflows. Just beware that it’s wrong a lot, and never substitute genAI text for context, reasoning, decisions, or strategy.