Unsolicited Advice for Leveraging a GenAI LLM

At this point, you’re probably pretty familiar with the AI hype out there. You’ve likely read that GenAI (like DALL-E or ChatGPT) is great for generating both visual and text-based content, and that AI more broadly can be good at identifying patterns, particularly in large data sets, and at providing recommendations (to a certain degree).

But you may also be familiar with the myriad ways GenAI has gone sideways in recent months (ex: Intuit’s AI tax guidance debacle, New York City’s law-breaking chatbot, the Air Canada lawsuit, and so many more). That doesn’t mean you need to stop experimenting with it, of course. But it does mean that the folks warning about it not being ready quite yet have some valid points worth listening to. 

Having built several AI solutions, including a recent GenAI LLM (large language model) solution, here’s some unsolicited advice to consider when leveraging a GenAI LLM. 

Don’t use GenAI for situations where you need a defined answer.


As evidenced in all the examples above, GenAI chatbots can – and often do – make information up. (These are called hallucinations within the industry, and they’re a big obstacle facing LLM creators.) The thing is, this is a feature, not a bug. Creating unique, natural-sounding sentences is precisely what this technology is intended to do, and fighting against it is – at least with the current technology – pointless. 

There are some technical guardrails that can be set up (like pointing the system to first pull from specific piles of data, and crafting some back-end prompts to tell it not to make things up) yet still, eventually, our bot friends will find their way to inventing an answer that sounds reasonable but is not, in fact, accurate. That is what they are meant to do. 
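
To make that concrete, here’s a minimal sketch of the guardrail pattern described above – retrieve from a specific pile of data first, then wrap the question in a restrictive back-end prompt. The function names (`search_knowledge_base`, `call_llm`) are illustrative placeholders rather than any particular vendor’s API, and the sketch only narrows where answers come from; it doesn’t eliminate hallucinations.

```python
# Minimal guardrail sketch: ground the model in a specific data set, then add a
# restrictive back-end prompt. All names are illustrative placeholders.

SYSTEM_PROMPT = (
    "Answer only from the provided context. "
    "If the context does not contain the answer, say you don't know."
)

def guarded_answer(question: str, search_knowledge_base, call_llm) -> str:
    # 1. Point the system at a specific pile of data first (e.g. top search hits).
    passages = search_knowledge_base(question)
    context = "\n\n".join(passages)

    # 2. Craft a back-end prompt telling it not to make things up.
    user_prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return call_llm(system=SYSTEM_PROMPT, user=user_prompt)
```

Even with both guardrails in place, the model can still paraphrase that context into something plausible but wrong – which is exactly the failure mode described above.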

In situations where you need defined, reliable pathways, you’re better off creating a hardcoded (read: not GenAI) conversation pathway that allows for more freeform conversation from the user while responding with precise information. (For the technically minded: we took a hybrid GenAI + NLU approach for our latest automation and found it quite useful for ensuring that something like following a company-specific process for resetting a password was accurate and efficient – and importantly, in that use case, also more secure.)
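
As a rough illustration of that hybrid idea, the sketch below routes high-stakes intents (like a password reset) to a hardcoded, pre-approved response and lets everything else fall through to the generative model. The intent labels, the `classify_intent` NLU call, and the `generate_reply` fallback are hypothetical stand-ins, not the actual stack we used.

```python
# Hybrid routing sketch: deterministic flows where a defined answer is required,
# GenAI only for open-ended conversation. All names are illustrative.

# Hardcoded, pre-approved responses for intents that need a precise answer.
SCRIPTED_FLOWS = {
    "password_reset": (
        "To reset your password, open the company portal, choose 'Forgot password', "
        "and follow the verification steps. IT will never ask for your current password."
    ),
    "report_security_incident": "Call the security hotline listed on the intranet home page.",
}

def handle_message(user_message: str, classify_intent, generate_reply) -> str:
    # classify_intent: an NLU classifier returning (intent_label, confidence)
    intent, confidence = classify_intent(user_message)

    if intent in SCRIPTED_FLOWS and confidence >= 0.8:
        # Defined pathway: precise, reviewed, and auditable.
        return SCRIPTED_FLOWS[intent]

    # Everything else falls through to the generative model.
    return generate_reply(user_message)
```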

Know thy data—and ensure it’s right.


I know it’s been said a million times over, but a pile of inaccurate, poorly-written data will produce inaccurate, poorly-written responses. GenAI cannot magically update your data to be clean and accurate. It can, over time, generate new information based on existing information and its style (which should still be checked for accuracy), but asking it to provide correct information when it’s hunting for the answer through incorrect information is an impossible task. It cannot decipher what is “right” or “wrong” – only what it gets trained to understand is right and wrong. 

It’s important, then, to know what the data you’re starting with looks like and to do your best to ensure it’s quality data – accurate, standardized, understandable, etc. Barring the time to properly train a model on your data (a serious time commitment, but well worth it for anyone wanting proprietary or custom answers), starting with a clean data set is your best bet. 
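
One low-effort way to “know thy data” is a quick programmatic audit before any of it reaches a model. The sketch below flags a few common problems – empty entries, duplicates, and unexplained acronyms – in a list of records; the field name and checks are illustrative assumptions, not a complete data-quality process.

```python
# Quick data-quality audit sketch. The "text" field and the checks are illustrative.
import re
from collections import Counter

def audit_records(records: list[dict]) -> dict:
    """Flag empty entries, duplicate entries, and unexplained acronyms."""
    report = {"empty": 0, "duplicates": 0, "acronyms": Counter()}
    seen = set()

    for record in records:
        text = (record.get("text") or "").strip()
        if not text:
            report["empty"] += 1
            continue
        if text in seen:
            report["duplicates"] += 1
        seen.add(text)
        # All-caps tokens are often shorthand a model can't reliably decode.
        report["acronyms"].update(re.findall(r"\b[A-Z]{2,5}\b", text))

    return report

# Example: audit_records([{"text": "User asked about SSO"}, {"text": ""}])
```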

Bring the experts in early.


When people have been experimenting with the technology and a potential solution for a while, there’s pressure to “get it done already” by the time the experts roll in – pressure that doesn’t allow for the necessary exploration and guardrail-setting, particularly in an enterprise setting where there are plenty of Legal, Compliance, Security, and even Marketing hurdles to clear. 

From both personal and collected experience, it’s worth noting that initial in-house experimentation often focuses on the technical aspects without considering the user experience – or even whether GenAI is the right solution at all. Sorting that out takes a little time. So it’s worth bringing in design and/or research experts, whether in-house or consultants, to do some UX discovery alongside the initial technical exploration, so the entire sussing-out process happens in tandem. This can provide a clear picture of the business case for pursuing this particular solution. 

To help out, the Grand Studio team created a free, human-centered AI framework for an ideal AI design & implementation process.

Interested in knowing how to start a GenAI project of your own? Drop us a line! 

Leveraging AI in User Research

Grand Studio has a long history of working with various AI technologies and tools (including a chatbot for the underbanked and using AI to help scale the quick-service restaurant industry). We’ve created our own Human-Centered AI Framework to guide our work and our clients to design a future that is AI-powered and human-led and that builds on human knowledge and skills to make organizations run better and unlock greater capabilities for people. When ChatGPT hit the scene, we started experimenting right away with how it could improve our processes and make our work both more efficient and more robust. 

Given our experience with what AI is good at doing (and what it’s not), we knew we could use ChatGPT to help us distill and synthesize a large amount of qualitative data in a recent large-scale discovery and ideation project for a global client. 

Here are some takeaways for teams hoping to do something similar: 

1. Don’t skip the clean-up. As they say: garbage in, garbage out. Generative AI (GenAI) tools can only make sense of what you give them – they can’t necessarily decipher acronyms, shorthand, typos, or other research input errors. Spend the time to clean up your data, and your algorithmic synthesis buddy will thank you. Clean-up can also mean standardized formats, so if you think you may want to go this route, consider how you can standardize note-taking in your upfront research prep.
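
For example, one way to standardize note-taking before synthesis is to push every raw note through the same template and expand known shorthand. The acronym map and note fields below are invented for illustration – the point is consistency, not these particular fields.

```python
# Sketch of standardizing raw research notes before handing them to a GenAI tool.
# The acronym map and template fields are illustrative assumptions.

ACRONYMS = {"PM": "product manager", "QSR": "quick-service restaurant"}

def standardize_note(raw: str, participant_id: str, question_id: str) -> str:
    text = " ".join(raw.split())  # collapse stray whitespace and line breaks
    for short, full in ACRONYMS.items():
        text = text.replace(short, full)  # expand shorthand the model may not know
    # Every note lands in the same template so the model sees consistent structure.
    return f"Participant: {participant_id}\nQuestion: {question_id}\nNote: {text}"

print(standardize_note("PM liked the  dashboard concept", "P07", "Q3"))
```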

2. Protect your – and your client’s – data. While ChatGPT doesn’t currently claim any ownership or copyright over the information you put in, it will train on your data unless you make a specific privacy request. If you’re working with sensitive or private company data, do your due diligence and make sure you’ve removed or masked sensitive and easily identifiable information first. Data safety should always be your top priority.
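
Alongside any opt-out settings, it’s worth scrubbing obviously identifying details before anything leaves your machine. The sketch below is a very blunt regex pass over emails, phone-like numbers, and a list of names you supply – a first line of defense, not a substitute for your organization’s data-handling policy.

```python
# Blunt redaction sketch: strip obvious identifiers before sending notes to a model.
# The patterns and the names list are illustrative and far from exhaustive.
import re

def redact(text: str, known_names: list[str]) -> str:
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)    # email addresses
    text = re.sub(r"\+?\d[\d\s().-]{7,}\d", "[PHONE]", text)      # phone-like numbers
    for name in known_names:                                      # participant / client names
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    return text

print(redact("Call Dana at 312-555-0184 or dana@example.com", ["Dana"]))
```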

3. Be specific with what you need to know. ChatGPT can only do so much. If you don’t know what your research goals are, ChatGPT isn’t going to be a silver bullet that uncovers the secrets of your data for you. In our experience, it works best with specific prompts that give it clear guidelines and output parameters. For example, you can ask something like: 

“Please synthesize the following data and create three takeaways that surface what users thought of these ideas in plain language. Use only the data set provided to create your answers. Highlight the most important things users thought regarding what they liked and didn’t like, and why. Please return your response as a bulleted list, with one bullet for each key takeaway, with sub-bullets underneath those for what they liked and didn’t like, and why.” 

Doing the upfront human-researcher work of creating high-quality research plans will help you focus on the important questions at this stage.
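
In practice, a prompt like that gets paired with one batch of cleaned-up notes and sent as a single request. Here’s a minimal sketch of that step using the OpenAI Python SDK – the model name is a placeholder, and the prompt is simply the one quoted above.

```python
# Sketch: send the synthesis prompt plus one batch of cleaned notes to a chat model.
# Assumes the OpenAI Python SDK (openai >= 1.0) and an OPENAI_API_KEY in the
# environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

SYNTHESIS_PROMPT = (
    "Please synthesize the following data and create three takeaways that surface "
    "what users thought of these ideas in plain language. Use only the data set "
    "provided to create your answers. Highlight the most important things users "
    "thought regarding what they liked and didn't like, and why. Please return your "
    "response as a bulleted list, with one bullet for each key takeaway, with "
    "sub-bullets underneath those for what they liked and didn't like, and why."
)

def synthesize(notes: str, model: str = "gpt-4o") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{SYNTHESIS_PROMPT}\n\nData:\n{notes}"}],
    )
    return response.choices[0].message.content
```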

4. It’s true, ChatGPT gets tired. As with any new technology, ChatGPT is always changing. That being said, the 4.0 version of ChatGPT that we worked with demonstrated diminishing returns the longer we used it. Even though the prompts were exactly the same from question to question, with fresh data sources as input each time, ChatGPT’s answers got shorter and less complete. Prompts asking for three synthesized takeaways would be answered with one or two, with fewer and fewer connections to the data sets. By the end, its answers were straight up wrong – leading us to our final takeaway:
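
One workaround worth trying – our reading of the problem, not anything the vendor documents – is to send each data set as its own independent request rather than one long running chat, and to flag answers that come back thinner than the prompt asked for. A crude sketch, reusing a synthesize() call like the one sketched above:

```python
# Sketch: run each data set as an independent request (no shared chat history) and
# flag thin answers. The "three takeaways" check is a crude heuristic tied to the
# prompt above, not a general rule; synthesize is passed in from the earlier sketch.

def synthesize_batches(batches: list[str], synthesize) -> list[str]:
    results = []
    for i, notes in enumerate(batches, start=1):
        answer = synthesize(notes)  # fresh request per batch, no shared history
        # Count top-level bullets only (lines starting at column 0 with a bullet).
        top_level = sum(1 for line in answer.splitlines() if line.startswith(("- ", "* ")))
        if top_level < 3:
            print(f"Batch {i}: answer looks thin ({top_level} takeaways) - review or re-run.")
        results.append(answer)
    return results
```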

5. Always do an audit of the answers! Large language models like ChatGPT aren’t able to discern whether the answers they provide are accurate or what you were hoping to receive. They’re also incredibly confident when providing those answers, even when they’re wrong. This means you can’t blindly rely on the model to give you an accurate answer. You have to go back, sift through the original data, and make sure that the answers it gives you line up with what you, the researcher, also see. Unfortunately, this means the process will take longer than you were probably hoping for, but the alternative is incomplete or incorrect answers – which defeats the purpose of synthesis in the first place and could cause the client to lose trust in you. 
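
A human read-through of the source data is the real audit, but a small script can help triage where to look first – for example, by checking whether phrases the model put in quotes actually appear anywhere in the notes. The sketch below is that kind of blunt spot-check, a starting point for the manual review rather than a replacement for it.

```python
# Blunt spot-check sketch: flag quoted snippets in the model's takeaways that don't
# appear anywhere in the source notes. Anything returned is a place to start reading
# the raw data more closely; it is not a verification of accuracy.
import re

def unsupported_quotes(takeaways: str, source_notes: str) -> list[str]:
    notes_lower = source_notes.lower()
    quoted = re.findall(r'"([^"]+)"', takeaways)  # snippets the model wrapped in quotes
    return [q for q in quoted if q.lower() not in notes_lower]
```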

Outcome: Did using ChatGPT speed up our synthesis significantly? Absolutely. Could we fully rely on ChatGPT’s synthesis output without any sort of audit or gut check? Not at all. We’ll keep experimenting with ways to incorporate emerging technologies like Generative AI into our workstreams, but always with research integrity and humans at our center. 

Interested in how GenAI might work for your organization? Drop us a line – we’d love to chat!