Teaching prompt engineering without the jargon

The problem

AI-powered large language models (LLMs) arrived on the tech scene almost overnight, catching BigCommerce a bit flat-footed. We assumed (rightly, as time would tell) that all of our competitors in ecommerce were working on similar tools to try to tap into the AI craze. Our goal was to take advantage of this new technology in a way that made sense without coming off as an egregious attempt at cashing in on the hot new thing.

To that end, we wanted to create a tool that would let customers create quick, well-written product descriptions for their online store. The tool, we decided, should feel like having a copywriter who can knock out quick, eye-catching product copy while staying true to the customer’s brand. We called it BigAI Copywriter.

User archetype

I came up with the following user archetype to guide us through the design process:

Informational needs

How can AI help me run my business?
How is BigCommerce integrating AI technology into their product?

Jobs to be done

I need to write product descriptions for each of the products in my store
As a new customer, I need to add my entire product catalog to my online store
Show me that AI is more than a passing fad

Psychological profile

Progressive disclosure: Start with minimal options for new users while giving experienced users the ability to play with advanced options
Familiarity bias: Couch the BigAI Copywriter in the product creation process users are already familiar with
IKEA Effect: Give users enough control over the Copywriter so they feel like they’re the ones who “built” the end result, not the AI

Ideation and development

Partnering with the lead designer on the project, I started with wireframes showing how this tool could work, which we presented to the product manager.

For my part, I performed a deep dive into the LLM and figured out what our prompts needed to look like in order to create good outputs.

A good prompt, a little internet research revealed, must include the following 4 characteristics:

The request: What are you asking the AI to deliver?
Formatting requirements: How should the output be organized?
References: To previous answers, an internal dataset or other external sources
Framing: Any additional context to help the AI, often an explanation of the request itself
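
To make the four parts concrete, here is a minimal Python sketch of how they might be assembled into a single prompt. The function name, the labels, and the example wording are all hypothetical illustrations, not the Copywriter's actual code.

```python
def build_prompt(request: str, formatting: str = "", references: str = "", framing: str = "") -> str:
    """Join the non-empty prompt parts, one labeled line per part."""
    if not request.strip():
        # The request is the one part a prompt cannot do without.
        raise ValueError("a prompt needs at least a request")
    parts = [
        ("Request", request),
        ("Formatting", formatting),
        ("References", references),
        ("Framing", framing),
    ]
    # Skip any part the user left empty.
    return "\n".join(f"{label}: {text.strip()}" for label, text in parts if text.strip())


# Hypothetical example input; the references part is left empty on purpose.
prompt = build_prompt(
    request="Write a product description for a stainless steel water bottle.",
    formatting="One short paragraph followed by three bullet points.",
    framing="The store sells outdoor gear to weekend hikers.",
)
print(prompt)
```

Leaving a part empty simply drops it from the prompt, which mirrors the idea that users can give as much or as little instruction as they like.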

With this research in hand, I suggested corresponding input fields for the Copywriter, along with tooltips for each field that showed users how the field affected the resulting output. In effect, I was teaching users how prompt engineering works without overwhelming them with a lot of fussy, nerdy stuff.

I also identified and wrote all possible status messages (error, success, confirmation, etc.).

An early mockup of the Copywriter on the left vs. the final version on the right. Note the evolution of the tooltip and CTA.

Resolving feedback

While collecting feedback from stakeholders, the product manager suggested some copy that wasn’t in accordance with BigCommerce content guidelines and, frankly, wasn’t very good. To make my case, I used a content heuristics scorecard that I had created previously just for instances like this. With the scorecard, content is evaluated according to 8 usability criteria, based on content design best practices, and given a score. Whatever copy scores the highest is most aligned with the goals of the user and BigCommerce.
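
As a rough illustration of how a scorecard like this picks a winner, here is a minimal Python sketch. The criterion names, the point values, and the two copy options are placeholders made up for the example; the real scorecard's criteria and scale are not reproduced here.

```python
# Placeholder names for the 8 criteria (hypothetical, not the real scorecard's).
CRITERIA = [f"criterion_{i}" for i in range(1, 9)]


def total_score(ratings: dict[str, int]) -> int:
    """Add up the points a piece of copy earned across all criteria."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(ratings[c] for c in CRITERIA)


def best_option(options: dict[str, dict[str, int]]) -> str:
    """Return the name of the copy option with the highest total score."""
    return max(options, key=lambda name: total_score(options[name]))


# Made-up ratings for two competing pieces of copy.
options = {
    "option_a": {c: 2 for c in CRITERIA},
    "option_b": {c: 1 for c in CRITERIA},
}
winner = best_option(options)
print(winner, total_score(options[winner]))
```

Summing equally weighted criteria is an assumption; a scorecard could just as easily weight some criteria more heavily than others.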

The content heuristics scorecard I created for BigCommerce, used for evaluating copy and making content decisions when there are multiple proposals.

I used the heuristics scorecard to advocate for the tooltip seen here. The copy gives users examples of what they can do when instructing the Copywriter in a fun and illustrative way.

Further refinement

We included a field in the Copywriter where users could enter a list of keywords for the LLM to include in the output for branding or SEO purposes. Early test results showed the AI inserting keywords into responses in a heavy-handed way, often cramming them all into one sentence.

Based on my experience with the LLM, I suggested this was a problem with our prompt. To fix the problem, I added this line to the prompt (in the code, not visible to users):

Additional keywords (insert the given keywords separately and naturally into the description, rather than as a single phrase, ensuring they are used appropriately within the text)
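
Here is a minimal Python sketch of how a line like this might be attached to a prompt only when the user has entered keywords. The function name and the way the keywords are joined onto the instruction (a colon followed by a comma-separated list) are assumptions for illustration; only the instruction text itself comes from the project.

```python
KEYWORD_INSTRUCTION = (
    "Additional keywords (insert the given keywords separately and naturally "
    "into the description, rather than as a single phrase, ensuring they are "
    "used appropriately within the text)"
)


def add_keywords(base_prompt: str, keywords: list[str]) -> str:
    """Append the keyword instruction, or return the prompt unchanged if no keywords were given."""
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        return base_prompt
    return f"{base_prompt}\n{KEYWORD_INSTRUCTION}: {', '.join(cleaned)}"


# Hypothetical base prompt and keywords.
with_keywords = add_keywords(
    "Write a product description for a stainless steel water bottle.",
    ["insulated", " leak-proof ", ""],
)
print(with_keywords)
```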

Another look at the BigAI Copywriter in action, including the keywords field that caused problems early on.

Model tuning

We started this project using the standard LLM from Google Vertex AI (called “text-bison”). In the interest of quality, I suggested we set up a Pendo survey to capture ratings from customers who were already using the tool in early access. If a user created or edited a product in their store, that would trigger a survey prompt asking, “Did you use the BigAI Copywriter?” If so, we would then ask them to rate the quality of the product description the Copywriter created, from 1 star to 5 stars. An early draft of the survey read “Looks like you’re using the BigAI Copywriter…” which I changed because it sounded too passive-aggressive and reminded me a lot of Microsoft’s Clippy.

Next, I asked our developers to export all product descriptions with a 4-star rating and above. I curated all the best examples from the export and gave those back to our developers to feed back into the LLM via a custom dataset.
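
The export step amounts to a simple filter on the survey ratings. Here is a minimal Python sketch of that idea; the record shape, field names, and sample rows are hypothetical, and the manual curation that followed is not something code like this replaces.

```python
from dataclasses import dataclass


@dataclass
class RatedDescription:
    product_id: str   # hypothetical identifier for the product
    description: str  # text the Copywriter generated
    stars: int        # survey rating, 1 to 5


def export_top_rated(rows: list[RatedDescription], min_stars: int = 4) -> list[RatedDescription]:
    """Keep only the descriptions rated at or above the threshold."""
    return [row for row in rows if row.stars >= min_stars]


# Made-up sample rows.
rows = [
    RatedDescription("sku-1", "A sturdy bottle for long hikes.", 5),
    RatedDescription("sku-2", "Bottle. Steel. Buy now.", 2),
    RatedDescription("sku-3", "Keeps drinks cold all day.", 4),
]
candidates = export_top_rated(rows)
print([row.product_id for row in candidates])
```

The filtered rows are only candidates. A person still has to read them and pick the best examples before anything goes into a tuning dataset.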

During curation, I kept results that:

Matched user-provided specs for the product
Sounded clever without being cloying
Organized product features in a commonsense, easily digested format
Incorporated keywords in a natural-sounding way
Sounded like a human being

Validation and results

The BigAI Copywriter project came together very quickly: just 4 months from start to finish. When the Copywriter went to general availability, we had a custom-tuned LLM designed specifically for writing product descriptions for ecommerce sites.

Users liked the way the Copywriter worked with as much or as little instruction as the user felt like entering into the prompt. Advanced users had lots of levers to pull when fine-tuning their brand voice, while more casual users could leave the inputs largely empty and try their luck.

The Copywriter is available now on the BigCommerce App Marketplace. It launched in October of 2023.