Unsolicited Advice for Leveraging a GenAI LLM

At this point, you’re probably pretty familiar with the AI hype out there. You’ve likely read that GenAI (like DALL-E or ChatGPT) is great for generating both visual and text-based content, and AI overall can be good for identifying patterns, particularly in large data sets, and providing recommendations (to a certain degree).

But you may also be familiar with the myriad ways GenAI has gone sideways in recent months (e.g., Intuit’s AI tax guidance debacle, New York City’s law-breaking chatbot, the Air Canada lawsuit, and so many more). That doesn’t mean you need to stop experimenting with it, of course. But it does mean that the folks warning it isn’t quite ready yet have some valid points worth listening to.

Having built several AI solutions, including a recent GenAI LLM (large language model) solution, here’s some unsolicited advice to consider when leveraging a GenAI LLM. 

Don’t use GenAI for situations where you need a defined answer.


As evidenced in all the examples above, GenAI chatbots can – and often do – make information up. (These are called hallucinations within the industry, and they’re a big obstacle facing LLM creators.) The thing is, this is a feature, not a bug. Creating unique, natural-sounding sentences is precisely what this technology is intended to do, and fighting against it is – at least with the current technology – pointless.

There are some technical guardrails that can be set up (like pointing the system to pull from specific piles of vetted data first, and crafting back-end prompts that tell it not to make things up), yet still, eventually, our bot friends will find their way to inventing an answer that sounds reasonable but is not, in fact, accurate. That is what they are meant to do.
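To make that concrete, here’s a minimal sketch of what those two guardrails can look like in practice – grounding answers in retrieved, vetted documents (often called retrieval-augmented generation) plus a back-end prompt against invention. It’s a rough illustration under assumptions, not a production recipe: it uses the OpenAI Python SDK, and the tiny keyword “retriever” is a hypothetical stand-in for whatever document store or vector search you’d actually use.

```python
# Minimal sketch of the two guardrails described above: pull from a vetted pile of
# data first (retrieval-augmented generation), and use a back-end prompt that tells
# the model not to make things up. The keyword "retriever" is a hypothetical stand-in
# for a real document store or vector search.
from openai import OpenAI

client = OpenAI()

VETTED_DOCS = [
    "Password resets require identity verification through the company SSO portal.",
    "Support hours are 8am-6pm CT, Monday through Friday.",
]

def retrieve(question: str, top_k: int = 2) -> list[str]:
    # Deliberately naive: rank documents by how many words they share with the question.
    words = set(question.lower().split())
    return sorted(VETTED_DOCS, key=lambda d: -len(words & set(d.lower().split())))[:top_k]

def answer_with_guardrails(question: str) -> str:
    context = "\n".join(retrieve(question))
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer using ONLY the provided context. If the context does not "
                    "contain the answer, say you don't know. Do not invent information."
                ),
            },
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        temperature=0,  # lowers, but does not eliminate, the odds of invention
    )
    return response.choices[0].message.content
```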

In situations where you need defined, reliable pathways, you’re better off creating a hardcoded (read: not GenAI) conversation pathway that still allows for freeform conversation from the user while responding with precise information. (For the technically-minded, we used a hybrid of GenAI + NLU (natural language understanding) for our latest automation and found it quite useful for ensuring that something like following a company-specific process for resetting a password was accurate and efficient – and importantly, in that use case, also more secure.)
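As a rough illustration of that hybrid pattern (a sketch under assumptions, not our production code): a lightweight intent classifier – the NLU piece – routes known, high-stakes requests like password resets to a hardcoded flow, and only open-ended questions fall through to the generative model (for example, the guardrailed call sketched above). The keyword matcher below is a toy; in practice the classifier would be a trained NLU model.

```python
PASSWORD_RESET_STEPS = (
    "1. Open the company SSO portal.\n"
    "2. Choose 'Forgot password'.\n"
    "3. Verify your identity with MFA, then set a new password."
)

def classify_intent(utterance: str) -> str:
    # Toy intent classifier; a real NLU model would be trained on labeled utterances.
    text = utterance.lower()
    if "password" in text and ("reset" in text or "forgot" in text):
        return "password_reset"
    return "open_ended"

def handle_message(utterance: str) -> str:
    if classify_intent(utterance) == "password_reset":
        # Hardcoded pathway: precise, auditable, and never invented on the fly.
        return PASSWORD_RESET_STEPS
    # Everything else goes to the generative model (e.g., answer_with_guardrails above).
    return answer_with_guardrails(utterance)
```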

Know thy data—and ensure it’s right.


I know it’s been said a million times over, but a pile of inaccurate, poorly written data will produce inaccurate, poorly written responses. GenAI cannot magically make your data clean and accurate. It can, over time, generate new information based on existing information and its style (which should still be checked for accuracy), but asking it to provide correct information when it’s hunting for the answer through incorrect information is an impossible task. It cannot decipher what is “right” or “wrong” – only what it gets trained to understand as right and wrong.

It’s important, then, to know what the data you’re starting with looks like and to do your best to ensure it’s quality data – accurate, standardized, understandable, etc. Barring the time to properly train a model on your data (a serious commitment, but well worth it for anyone wanting proprietary or custom answers), starting with a clean data set is your best bet.

Bring the experts in early.


When people have been experimenting with the technology and a potential solution for a while, there is pressure to “get it done already” by the time the experts roll in. That pressure doesn’t allow for the necessary exploration and guardrail-setting, particularly in an enterprise setting where there are plenty of Legal, Compliance, Security, and even Marketing hurdles to clear.

From both personal and collected experience, the initial in-house experimentation often focuses on the technical aspects without considering the user experience – or even whether GenAI is the right solution at all. Sorting that out takes a little time. So it’s worth bringing in design and/or research experts, whether in-house or consultants, to do UX discovery in tandem with the initial technical exploration. This can provide a clear picture of the business case for pursuing this particular solution.

To help out, the Grand Studio team created a free, human-centered AI framework for an ideal AI design & implementation process.

Interested in knowing how to start a GenAI project of your own? Drop us a line! 

Leveraging AI in User Research

Grand Studio has a long history of working with various AI technologies and tools (including a chatbot for the underbanked and using AI to help scale the quick-service restaurant industry). We’ve created our own Human-Centered AI Framework to guide our work and our clients to design a future that is AI-powered and human-led and that builds on human knowledge and skills to make organizations run better and unlock greater capabilities for people. When ChatGPT hit the scene, we started experimenting right away with how it could improve our processes and make our work both more efficient and more robust. 

Given our experience with what AI is good at doing (and what it’s not), we knew we could use ChatGPT to help us distill and synthesize a large amount of qualitative data in a recent large-scale discovery and ideation project for a global client. 

Here are some takeaways for teams hoping to do something similar: 

1. Don’t skip the clean-up. As they say: garbage in, garbage out. Generative AI (GenAI) tools can only make sense of what you give them – they can’t necessarily decipher acronyms, shorthand, typos, or other research input errors. Spend the time to clean up your data, and your algorithmic synthesis buddy will thank you. Clean-up can also include standardizing formats, so if you think you may want to go this route, consider how you can standardize note-taking in your upfront research prep.

2. Protect your – and your client’s – data. While ChatGPT doesn’t currently claim any ownership or copyright over the information you put in, it will train on your data unless you make a specific privacy request. If you’re working with sensitive or private company data, do your due diligence and make sure you’ve removed or anonymized anything important or easily identifiable first. Data safety should always be your top priority.

3. Be specific with what you need to know. ChatGPT can only do so much. If you don’t know what your research goals are, ChatGPT isn’t going to be a silver bullet that uncovers the secrets of your data for you. In our experience, it works best with specific prompts that give it clear guidelines and output parameters. For example, you can ask something like: 

“Please synthesize the following data and create three takeaways that surface what users thought of these ideas in plain language. Use only the data set provided to create your answers. Highlight the most important things users thought regarding what they liked and didn’t like, and why. Please return your response as a bulleted list, with one bullet for each key takeaway, with sub-bullets underneath those for what they liked and didn’t like, and why.” 

Doing the upfront human-researcher work of creating high-quality research plans will help you focus on the important questions at this stage. (For teams working programmatically, a sketch of sending this kind of prompt through an API follows this list.)

4. It’s true, ChatGPT gets tired. As with any new technology, ChatGPT is always changing. That being said, the 4.0 version of ChatGPT that we worked with demonstrated diminishing returns the longer we used it. Even though the prompts were exactly the same from question to question, with fresh data sources input each time, ChatGPT’s answers got shorter and less complete. Prompts asking for three synthesized takeaways would be answered with one or two, with fewer and fewer connections to the data sets. By the end, its answers were straight-up wrong. That leads us to our final takeaway:

5. Always audit the answers! Large language models like ChatGPT aren’t able to discern whether the answers they provide are accurate or what you were hoping to receive. They’re also incredibly confident when providing answers, even wrong ones. This means you can’t blindly rely on them to give you an accurate answer. You have to go back, sift through the original data, and make sure that the answers line up with what you, the researcher, also see. Unfortunately, this means the process will take longer than you were probably hoping for, but the alternative is incomplete or incorrect answers – which defeat the purpose of synthesis in the first place and could cause the client to lose trust in you.
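As promised above, here’s a minimal, hypothetical sketch of what sending a synthesis prompt like the one in takeaway #3 through the OpenAI API could look like. It’s illustrative only – not the exact workflow we used – and it assumes your notes have already been cleaned and de-identified per takeaways #1 and #2.

```python
# Hypothetical sketch: sending a specific, parameterized synthesis prompt through the
# OpenAI API instead of pasting it into the chat interface.
from openai import OpenAI

client = OpenAI()

SYNTHESIS_PROMPT = (
    "Please synthesize the following data and create three takeaways that surface "
    "what users thought of these ideas in plain language. Use only the data set "
    "provided. Return a bulleted list with one bullet per takeaway, and sub-bullets "
    "for what users liked and didn't like, and why."
)

def synthesize(notes: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"{SYNTHESIS_PROMPT}\n\nData:\n{notes}"}],
    )
    # Takeaway #5 still applies: audit this output against the raw notes.
    return response.choices[0].message.content
```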

Outcome: Did using ChatGPT speed up our synthesis significantly? Absolutely. Could we fully rely on ChatGPT’s synthesis output without any sort of audit or gut check? Not at all. We’ll keep experimenting with ways to incorporate emerging technologies like Generative AI into our workstreams, but always with research integrity and humans at our center. 

Interested in how GenAI might work for your organization? Drop us a line – we’d love to chat!

Human-Centered AI: The Successful Business Approach to AI

If AI wasn’t already the belle of the tech ball, the advanced generative AI tools surfacing left and right have certainly secured its title. Organizations are understandably in a rush to get in on the action — not just for AI’s potential utility to their business, but also because, more and more, demonstrating use of AI feels like a marketing imperative for any business that wants to appear “cutting edge,” or even simply “with the times.”

Sometimes, rapid technology integrations can be a boon to the business. But other times, this kind of urgency can lead to poor, short-sighted decision-making around implementation. If the technology doesn’t actually solve a real problem (or sometimes even when it does), many people won’t change their process to use it. All this to say: a bitter first taste of AI within an organization can also harm its chances of success the next time around, even if the strategy has improved.

At Grand Studio, we’ve had the privilege of working alongside major organizations taking their first high-stakes steps into AI. We know the positive impact the right kind of AI strategy can have on a business. But we’ve also seen the ways in which pressure to adopt AI can lead to rushed decision-making that leaves organizations worse off. 

Our top-level advice to businesses looking to implement AI: don’t lose sight of human-centered design principles. AI may be among the most sophisticated tools we use, but it is still just that — a tool. As such, it must always operate in service of the humans who use it.

A human lens on artificial intelligence

When implementing AI, it is tempting to start with the technology itself — what can the technology do exceptionally well? Where might its merits be of service to your organization? While these may be helpful brainstorming questions, no AI strategy is complete until it closely analyzes how AI’s merits would operate in conjunction with the humans you rely on, whether those are your employees or your customers.

CASE IN POINT 

In our work supporting a major financial organization, we designed an AI-based tool for bond traders. Originally, they imagined using AI to tag particular bonds with certain characteristics, making them easier for the traders to pull up. It seemed like a great use of technology, and a service that would speed up and optimize the traders’ workflow. But once we got on the ground and started talking to traders, it turned out that pulling up bonds based on tags was not actually their biggest problem. AI may be a golden hammer, but the proposed project wasn’t a nail — it only looked like one from far away.

As we got more clarity on the true needs of these traders, we realized that what they actually needed was background information to help them make decisions around pricing the bonds. And they wanted the information displayed in a particular way that gave them not just a suggestion, but the data that led them there. In this way, they’d be able to incorporate their own expertise into the AI’s output. 

If we had designed a product based on the original assumptions, it likely would have flopped. To be useful, the AI needed to be particularly configured to the humans at the center of the problem.

The linkage points between human and AI are crucial

We all know that bad blood among employees can spell doom for an organization. Mistrust and negative energy are surefire ways to sink a ship. In many ways, integrating AI can feel a lot like hiring a slew of new employees. If your existing employees aren’t appropriately trained on what to expect and how to work with the new crowd, it can ruin even the best-laid plans.

Once you’ve identified where AI fits into your organization, we recommend paying extremely close attention to the linkage points between human and AI. Where must these parties cooperate? What trust needs to be built? What suspicion needs to be mitigated? How can each benefit the other in the best way possible?

CASE IN POINT

Recently, we worked with a financial services technology provider to develop AI that could spot fraud and inaccuracies in trading. We conducted in-depth research into the needs of the surveillance teams who’d be using the software to understand their role and also their expectations for how they’d use such a tool. This allowed us to thoughtfully build a visual interface on top of the AI that could maximally meet the surveillance team’s needs, including helping them with task management.

Taking the time to understand the precise nature of this potential human-AI collaboration helped us use resources wisely and prevent the mistrust and resistance that can cause even the best tools to fail. 

AI integrations require trust and understanding

Your AI also can’t be a “black box.” While not everyone at your organization needs to be an expert on its functionality, simply dropping an unfamiliar tool into a work environment and expecting people to trust whatever it spits out is very likely misguided. This is especially true when AI is helping experts do their jobs better. These roles are defined by the deep training that goes into them — how are they supposed to give an open-arms welcome to a new “employee” whose training they can’t see or understand?

For example, a doctor trained in reviewing mammograms may well benefit from AI software that can review 500 scans and whittle them down to only the 20 that need human assessment. But you can imagine a physician’s resistance to simply taking those 20 images without understanding how and why the software weeded out the other 480. They rely on their expertise to save lives, and need to trust that whatever tools are helping them are supported by similar training and values.

AI has the power to make big change. But if we don’t center humans in our implementations, the change we make may not be the good kind. 

Contemplating your early steps into AI? We’d love to work with you to help make your leap into the future a smart one. 

Using AI in Enterprise

AI is everywhere these days. There’s no escaping it, whether it’s a boardroom conversation, a conference, or a meme on social media. And with good reason. As Bryan Goodman of Ford Motor Company said recently at the 2023 Reuters Momentum conference on AI, “I can’t imagine anyone being competitive without AI.” The resounding perspective across industries agrees with him.

The question among many people, particularly those in larger enterprise organizations with less scrappy flexibility and more risk, is: how do we use AI in a way that is responsible to our business units, our shareholders, and humanity at large?

Download our whitepaper and read more about the best use cases and challenges to consider for enterprise AI usage.