The dummy and the chatbot - LexLab at UC Law SF

The dummy and the chatbot: an unlikely journey of love, laughs, and prompt engineering.

By Colt Chryst Watkiss

Chapter 1, Humble Beginnings

I’m not a tech person. I have a Bachelor of Fine Arts in acting, but when ChatGPT had an unprecedented rise into the zeitgeist, even I noticed things had changed. Perhaps this is because generative AI is slated to disrupt sectors of the economy that had previously avoided the impacts of automation. “Jobs with a high level of exposure to AI [and hence greater possibility of disruption] tend to be in higher-paying fields where a college education and analytical skills can be a plus,” says a Pew Research report from 2022. As an aspiring lawyer, the advent of this generative AI looms large as many predict that the technology will revolutionize the practice of law. When I saw that ChatGPT had passed the bar exam, and had done so in only a matter of minutes, I figured the end was near. I was prepared to take the diploma I intended to get, scrawl some last wishes on the back, and go lay down on the first set of train tracks I could find. But, I tend to catastrophize, and AI has the potential to open doors to new careers, equalize access to justice, and produce an adorable photo of my wife and my dog riding a dragon in the style of Vincent Van Gogh–more on that later. Suffice it to say, AI is a tool, and like all tools, the best results come to those who learn how to use it.

I was really introduced to generative AI by UC Law San Francisco Professors Alice Armitage and Drew Amerson during their fall 2023 course, the “In-House Counsel Tool Kit: Skills & Strategies.” Among a wide variety of topics on the challenges of working in-house, a throughline of the semester was how AI was going to impact everything. With that in mind, one weekly assignment of note asked us to think about how law schools could better teach students how to work with generative AI. What struck me was the number of us who came to the same conclusion: we need to learn prompt engineering.

While prompt engineering sounds intimidating like its overachieving cousins, chemical and electrical engineering, it is a straightforward concept. Prompt engineering is figuring out what to ask or tell a generative AI to get your desired outcome. For example, if my desired outcome is a delicious shortbread cookie, I might masterfully engineer the prompt “How do I make a delicious shortbread cookie?” As luck would have it, that was a fairly well-engineered prompt. Claude by Anthropic–one of the better free Large Language Models (LLMs) on the market–gave me this output:

However, not every desired outcome is either as simple or delicious as a recipe for shortbread cookies, and a few simple tweaks to one’s prompts can help improve outcomes when things get more complex.

Tell the AI who it is.

As Aristotle once wrote, “Knowing yourself is the beginning of all wisdom.” While it is unclear if Aristotle intended his words to help artificial intelligence find greater meaning in life, the sentiment applies equally to Claude as it does any wayward soul seeking comfort in the teachings of ancient wisdom. See if you can spot the difference between these two prompts and outcomes:

Give AI clear instructions about what you want done.

The more context you give, the more effective the AI can be. Providing context means providing relevant background to ensure the model understands the task or question. A helpful mnemonic device for what context to provide is R.A.F.T.

Role: What is the chatbot’s role? What is your role working with the chatbot? What is the assignment you are giving it?
Audience: Who is the intended audience of the final product or outcome?
Format: What format does the final product require? (Do you want a 5-paragraph essay, a sonnet, or a Bowie ballad?)
Tone: Formal or informal, simple or complex, approachable or authoritative?

Take a gander at these prompts and responses:

Iterate, iterate, iterate.

The regenerate button may become your best friend. Because of how these LLMs are designed and work, you are certain to never get the exact same answer twice. A reductive explanation of the genius of generative AI is that they are essentially highly advanced at predicting the next word in a phrase. So ask the same question a million different ways. Ask it short, ask it long, ask it in French. Ask, ask, ask, if for no other reason than it’s fun. I recently had Claude give me the same cover letter in the style of myself, famed comedian Mike Birbiglia, and the Emmy-award-winning television show Succession. That wonderful few minutes yielded my new intro: “Not inflating my own merits here, but helium itself holds less hot air than the savvy encouragement and discretion I provide. You asked for gracious finesse—I proffer unhealthy enmeshment, gift wrapped in Hermès.”

Give feedback!

Let the AI know where its job performance stands. Tell it what you liked, what you didn’t like, and why it’ll never live up to your last assistant who left years ago for a job with upward mobility. For example:

Are either of those jokes funny? No. Maybe? Comedy is subjective. The important thing is, look at the quick learner we have here. I got Claude to write me two jokes with only 34 well-chosen words.

Tell it to think “Step-by-Step”

This one speaks for itself, and no animals were harmed in the making of this prompting...

Enjoy Yourself!

As I sit here writing this essay, we humans are not yet battling in gladiator-style death matches to entertain our artificial overlords. So, let’s enjoy it while it lasts. As I mentioned in the intro, among the many wonderful things you can do with an LLM, the greatest of all may be producing an adorable photo of my wife and dog riding a dragon in the style of Vincent Van Gogh.

That masterpiece resulted from 33 words of prompting: “I want a picture of a beautiful, red-headed woman and her dog that is a mix of a blue heeler and a German shepherd, riding a dragon in the style of Van Gogh.” Is it perfect? No, my wife and dog are way cuter. But making that image was so fun that my wife forgave me for eating the leftovers without asking her. In fact, it was so fun, that I asked for another painting of my wife and dog riding a dragon into battle against our robot overlords. I might not be able to stop ChatGPT and Claude from eventually enslaving the human race, but I can make sure that they know we will not go quietly into the night.

If there is one takeaway from this essay, it’s this: prompt engineering is much more an art than a science. No single technique magically works 100% of the time, but persistence, creativity, and a desire to learn will help unlock the potential of these truly amazing models. Am I a bit of a fanboy? Of course I am. I just wrote over 1000 words on the wonders of prompt engineering. But I also thought that goats were male sheep until I was 19 years old, so if I can do this, anyone can do this. Go forth and prompt engineer the future.

Chapter 2, Growing Sophistication – Problem Formulation

You have no idea how badly I wish I could tell you that prompt engineering was the end of the story. I yearn to write, “Learn to prompt and the secrets of the universe will reveal themselves to you.” But sadly, it’s never that simple. A recent article from the Harvard Business Review prophesied not only that prompt engineering wasn’t the end of the story, but that the skill of prompt engineering could be obsolete before I even finish writing and editing this masterpiece. Instead, the article argues that the AI skill of the future is “problem formulation — the ability to identify, analyze, and delineate problems.”

I’m not sure I entirely agree with the conclusion that prompt engineering is a dying skill, but before you ask: “Who is this know-nothing theater major from arguably the dumbest state in America (shout out Arizona public education), and where does he get off disagreeing with Harvard,” I think there is some real validity to the conclusion that problem formulation is an AI skill of the future. “[P]roblem formulation necessitates a comprehensive understanding of the problem domain and ability to distill real-world issues.” Essentially, problem formulation is taking the time to understand the nuances of an issue. When problem formulation is employed with skillful prompt engineering, the results can be quite impressive.

We will return to illustrating this beautiful marriage in a second, but quickly, let’s take a detour of the explanatory backstory that is necessary to appreciate the example to come. The impetus for these essays comes from the class AI and the Business of Law, a prescient course helmed by Professor Alice Armitage who is an innovative leader in the worlds of technology and the law. The class, among other things, is a skills-based course designed to help law students, such as myself, explore the potential of generative AI through hands-on experience. Much of the substance of these essays comes from our work in that course.

It was during this course that the inadequacy of prompt engineering without problem formulation occurred to our entire class. Our midterm assignment asked us to act as a first-year associate at a law firm, who was recently contacted by a partner with an immensely important assignment. The firm wanted to win the business of a legal tech start-up trying to build an LLM for legal work, and the partner knew that the start-up’s CEO was terrified of the looming extinction event posed by copyright claims against companies like OpenAI. So the task was to use an LLM (in my case I used ChatGPT-4) to craft an informative email for the partner, looking at the copyright issues the CEO was facing and advising on a method for training her legal LLM to best minimize copyright infringement risks.

With that set of facts, and armed with several articles on prompt engineering, each of us feverishly began typing the most clever prompts we could, all in the hopes of being the genius who generated the perfect email–the kind of perfect email that allows you to arrogantly sit in the middle of the classroom, basking in your superiority over your peers. Predictably, most of us began with a prompt somewhere along the lines of:

“You are a first-year associate at a law firm. You have been asked by a partner to draft an email for an upcoming meeting with the CEO of a tech start-up trying to build an LLM for legal work. The email should identify the copyright risks associated with training a legal LLM and advise on the best method of training to avoid copyright liability.”

That’s a darn fine prompt, and it utilizes several of the prompting tips in chapter 1 of this essay series including telling the LLM who it is and giving it context. What resulted, however, was a bunch of emails with the same big-picture analysis that used a lot of words to say very little. But then came a stroke of genius.

A colleague and friend named Ashley Shafer just so happened to have a little subject matter expertise. She had recently written a paper analyzing a very similar legal issue, and she asked the simple yet profound question: “What does ‘method of training’ mean?”

One final apology as I digress into another quick bit of background information. The genius of Ashley’s question is this: there are various methods for training artificial intelligence including supervised learning, unsupervised learning, and reinforcement learning. Each method is distinct and useful for different tasks, but they all share one common feature, they all require analyzing tons and tons of data.

What Ashley understood was that there were two distinct aspects encompassed in the training method: there was the actual method of analyzing the data–supervised, unsupervised, etc.–and then there was the data itself. That’s problem formulation. More specifically, that is one of the four key components of problem formulation identified in the Harvard Business Review article–problem decomposition, “breaking down complex problems into smaller, manageable sub-problems.”

We started with the very big task of limiting copyright liability while an entire LLM is built, but much like Rome, LLMs are not built in a day. Ashley broke that problem of training into two smaller, manageable sub-problems–the problem of training method and the problem of training data.

As soon as Ashley finished her insightful query, there was a thunderous crash, and the entire classroom went dark. Suddenly, two golden tablets appeared hovering before us, and etched in their flawless facades were two prompts so perfect, so clear, so unquestionably written in the fabric of the universe itself that our very understanding of reality would be shaken from that moment on: “What kind of training data makes copyright infringement less likely?” and “What kind of AI training methods make copyright infringement less likely?”

Okay, that might have been slightly altered for dramatic effect, but Ashley’s problem decomposition was a pivotal turning point in our research and helped us achieve a deeper level of understanding necessary for when real clients are paying real money for real solutions. For example, the more specific prompt, “What kind of training data makes copyright infringement less likely?” yielded some wonderful suggestions like using public domain data and “Synthetic Data”–data you create yourself–to train the LLM instead of using copyrighted data. Similarly, the prompt, “What kind of AI training methods make copyright infringement less likely?” resulted in an exciting and unexpected explanation that using unsupervised learning would make it less likely than supervised learning that the LLM would regurgitate a word-for-word copy of another work.

With these deeper insights, we collectively had one final stroke of prompting genius: “Based on this entire discussion, revise your recommendation email with the best course of action you recommend the CEO take for developing her legal LLM?” The substance of that email was remarkably more specific, actionable, and therefore useful than our first few drafts, though the exact substance is not the point of this essay. The point here is a lesson about problem formulation and its deeply passionate relationship with prompt engineering. No single prompt was sufficient for achieving a final email that even approximated a product worthy of the billable rates of lawyers, but as we broke the problem down, we were able to write new prompts and get new and relevant information. Each time we got new and relevant information, we then incorporated this information and revised the email, and each time we revised the email, we got closer and closer to hearing the partner say, “Hey whats-your-name, good job."

The dummy and the chatbot: an unlikely journey of love, laughs, and prompt engineering.