Unsolicited Advice for Leveraging a GenAI LLM

At this point, you’re probably pretty familiar with the AI hype out there. You’ve likely read that GenAI (like DALL-E or ChatGPT) is great for generating both visual and text-based content, and that AI more broadly can be good at identifying patterns, particularly in large data sets, and at providing recommendations (to a certain degree).

But you may also be familiar with the myriad ways GenAI has gone sideways in recent months (ex: Intuit’s AI tax guidance debacle, New York City’s law-breaking chatbot, the Air Canada lawsuit, and so many more). That doesn’t mean you need to stop experimenting with it, of course. But it does mean that the folks warning about it not being ready quite yet have some valid points worth listening to. 

Having built several AI solutions, including a recent GenAI LLM (large language model) solution, here’s some unsolicited advice to consider when leveraging a GenAI LLM. 

Don’t use GenAI for situations where you need a defined answer.


As evidenced in all the examples above, GenAI chatbots can – and often do – make information up. (These are called hallucinations within the industry, and they’re a big obstacle facing LLM creators.) The thing is, this is a feature, not a bug. Creating unique, natural-sounding sentences is precisely what this technology is intended to do, and fighting against it is – at least with the current technology – pointless. 

There are some technical guardrails that can be set up (like pointing the system to first pull from specific piles of data, and crafting some back-end prompts to tell it not to make things up) yet still, eventually, our bot friends will find their way to inventing an answer that sounds reasonable but is not, in fact, accurate. That is what they are meant to do. 
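
To make that concrete, here’s a minimal sketch of the guardrail pattern described above – retrieve from a specific pile of data first, then wrap the question in a restrictive back-end prompt. The function names (`search_knowledge_base`, `call_llm`) are illustrative placeholders rather than any particular vendor’s API, and the sketch only narrows where answers come from; it doesn’t eliminate hallucinations.

```python
# Minimal guardrail sketch: ground the model in a specific data set, then add a
# restrictive back-end prompt. All names are illustrative placeholders.

SYSTEM_PROMPT = (
    "Answer only from the provided context. "
    "If the context does not contain the answer, say you don't know."
)

def guarded_answer(question: str, search_knowledge_base, call_llm) -> str:
    # 1. Point the system at a specific pile of data first (e.g. top search hits).
    passages = search_knowledge_base(question)
    context = "\n\n".join(passages)

    # 2. Craft a back-end prompt telling it not to make things up.
    user_prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return call_llm(system=SYSTEM_PROMPT, user=user_prompt)
```

Even with both guardrails in place, the model can still paraphrase that context into something plausible but wrong – which is exactly the failure mode described above.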

In situations where you need defined, reliable pathways, you’re better off creating a hardcoded (read: not GenAI) conversation pathway that allows for more freeform conversation from the user while responding with precise information. (For the technically minded: we took a hybrid GenAI + NLU approach for our latest automation and found it quite useful for ensuring that something like following a company-specific process for resetting a password was accurate and efficient – and importantly, in that use case, also more secure.)
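
As a rough illustration of that hybrid idea, the sketch below routes high-stakes intents (like a password reset) to a hardcoded, pre-approved response and lets everything else fall through to the generative model. The intent labels, the `classify_intent` NLU call, and the `generate_reply` fallback are hypothetical stand-ins, not the actual stack we used.

```python
# Hybrid routing sketch: deterministic flows where a defined answer is required,
# GenAI only for open-ended conversation. All names are illustrative.

# Hardcoded, pre-approved responses for intents that need a precise answer.
SCRIPTED_FLOWS = {
    "password_reset": (
        "To reset your password, open the company portal, choose 'Forgot password', "
        "and follow the verification steps. IT will never ask for your current password."
    ),
    "report_security_incident": "Call the security hotline listed on the intranet home page.",
}

def handle_message(user_message: str, classify_intent, generate_reply) -> str:
    # classify_intent: an NLU classifier returning (intent_label, confidence)
    intent, confidence = classify_intent(user_message)

    if intent in SCRIPTED_FLOWS and confidence >= 0.8:
        # Defined pathway: precise, reviewed, and auditable.
        return SCRIPTED_FLOWS[intent]

    # Everything else falls through to the generative model.
    return generate_reply(user_message)
```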

Know thy data—and ensure it’s right.


I know it’s been said a million times over, but a pile of inaccurate, poorly-written data will produce inaccurate, poorly-written responses. GenAI cannot magically update your data to be clean and accurate. It can, over time, generate new information based on existing information and its style (which should still be checked for accuracy), but asking it to provide correct information when it’s hunting for the answer through incorrect information is an impossible task. It cannot decipher what is “right” or “wrong” – only what it gets trained to understand is right and wrong. 

It’s important, then, to know what the data you’re starting with looks like and to do your best to ensure it’s quality data – accurate, standardized, understandable, etc. Barring the time to properly train a model on your data (a serious time commitment, but well worth it for anyone wanting proprietary or custom answers), starting with a clean data set is your best bet. 
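
One low-effort way to “know thy data” is a quick programmatic audit before any of it reaches a model. The sketch below flags a few common problems – empty entries, duplicates, and unexplained acronyms – in a list of records; the field name and checks are illustrative assumptions, not a complete data-quality process.

```python
# Quick data-quality audit sketch. The "text" field and the checks are illustrative.
import re
from collections import Counter

def audit_records(records: list[dict]) -> dict:
    """Flag empty entries, duplicate entries, and unexplained acronyms."""
    report = {"empty": 0, "duplicates": 0, "acronyms": Counter()}
    seen = set()

    for record in records:
        text = (record.get("text") or "").strip()
        if not text:
            report["empty"] += 1
            continue
        if text in seen:
            report["duplicates"] += 1
        seen.add(text)
        # All-caps tokens are often shorthand a model can't reliably decode.
        report["acronyms"].update(re.findall(r"\b[A-Z]{2,5}\b", text))

    return report

# Example: audit_records([{"text": "User asked about SSO"}, {"text": ""}])
```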

Bring the experts in early.


When people have been experimenting with the technology and a potential solution for a while, there’s pressure to “get it done already” by the time the experts roll in – pressure that doesn’t allow for the necessary exploration and guardrail-setting, particularly in an enterprise setting where there are plenty of Legal, Compliance, Security, and even Marketing hurdles to clear. 

From both personal and collected experience, it’s worth noting that initial in-house experimentation often focuses on the technical aspects without considering the user experience – or even whether GenAI is the right solution at all. Sorting that out takes a little time. So it’s worth bringing in design and/or research experts, whether in-house or consultants, to do some UX discovery alongside the initial technical exploration, so the entire sussing-out process happens in tandem. This can provide a clear picture of the business case for pursuing this particular solution. 

To help out, the Grand Studio team created a free, human-centered AI framework for an ideal AI design & implementation process.

Interested in knowing how to start a GenAI project of your own? Drop us a line! 

Leveraging AI in User Research

Grand Studio has a long history of working with various AI technologies and tools (including a chatbot for the underbanked and using AI to help scale the quick-service restaurant industry). We’ve created our own Human-Centered AI Framework to guide our work and our clients to design a future that is AI-powered and human-led and that builds on human knowledge and skills to make organizations run better and unlock greater capabilities for people. When ChatGPT hit the scene, we started experimenting right away with how it could improve our processes and make our work both more efficient and more robust. 

Given our experience with what AI is good at doing (and what it’s not), we knew we could use ChatGPT to help us distill and synthesize a large amount of qualitative data in a recent large-scale discovery and ideation project for a global client. 

Here are some takeaways for teams hoping to do something similar: 

1. Don’t skip the clean-up. As they say: garbage in, garbage out. Generative AI (GenAI) tools can only make sense of what you give them – they can’t necessarily decipher acronyms, shorthand, typos, or other research input errors. Spend the time to clean up your data, and your algorithmic synthesis buddy will thank you. Clean-up can also mean standardized formats, so if you think you may want to go this route, consider how you can standardize note-taking in your upfront research prep.
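
For example, one way to standardize note-taking before synthesis is to push every raw note through the same template and expand known shorthand. The acronym map and note fields below are invented for illustration – the point is consistency, not these particular fields.

```python
# Sketch of standardizing raw research notes before handing them to a GenAI tool.
# The acronym map and template fields are illustrative assumptions.

ACRONYMS = {"PM": "product manager", "QSR": "quick-service restaurant"}

def standardize_note(raw: str, participant_id: str, question_id: str) -> str:
    text = " ".join(raw.split())  # collapse stray whitespace and line breaks
    for short, full in ACRONYMS.items():
        text = text.replace(short, full)  # expand shorthand the model may not know
    # Every note lands in the same template so the model sees consistent structure.
    return f"Participant: {participant_id}\nQuestion: {question_id}\nNote: {text}"

print(standardize_note("PM liked the  dashboard concept", "P07", "Q3"))
```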

2. Protect your – and your client’s – data. While ChatGPT doesn’t currently claim any ownership or copyright over the information you put in, it will train on your data unless you make a specific privacy request. If you’re working with sensitive or private company data, do your due diligence and make sure you’ve removed or masked sensitive and easily identifiable information first. Data safety should always be your top priority.
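
Alongside any opt-out settings, it’s worth scrubbing obviously identifying details before anything leaves your machine. The sketch below is a very blunt regex pass over emails, phone-like numbers, and a list of names you supply – a first line of defense, not a substitute for your organization’s data-handling policy.

```python
# Blunt redaction sketch: strip obvious identifiers before sending notes to a model.
# The patterns and the names list are illustrative and far from exhaustive.
import re

def redact(text: str, known_names: list[str]) -> str:
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)    # email addresses
    text = re.sub(r"\+?\d[\d\s().-]{7,}\d", "[PHONE]", text)      # phone-like numbers
    for name in known_names:                                      # participant / client names
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    return text

print(redact("Call Dana at 312-555-0184 or dana@example.com", ["Dana"]))
```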

3. Be specific with what you need to know. ChatGPT can only do so much. If you don’t know what your research goals are, ChatGPT isn’t going to be a silver bullet that uncovers the secrets of your data for you. In our experience, it works best with specific prompts that give it clear guidelines and output parameters. For example, you can ask something like: 

“Please synthesize the following data and create three takeaways that surface what users thought of these ideas in plain language. Use only the data set provided to create your answers. Highlight the most important things users thought regarding what they liked and didn’t like, and why. Please return your response as a bulleted list, with one bullet for each key takeaway, with sub-bullets underneath those for what they liked and didn’t like, and why.” 

Doing the upfront human-researcher work of creating high-quality research plans will help you focus on the important questions at this stage.
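
In practice, a prompt like that gets paired with one batch of cleaned-up notes and sent as a single request. Here’s a minimal sketch of that step using the OpenAI Python SDK – the model name is a placeholder, and the prompt is simply the one quoted above.

```python
# Sketch: send the synthesis prompt plus one batch of cleaned notes to a chat model.
# Assumes the OpenAI Python SDK (openai >= 1.0) and an OPENAI_API_KEY in the
# environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

SYNTHESIS_PROMPT = (
    "Please synthesize the following data and create three takeaways that surface "
    "what users thought of these ideas in plain language. Use only the data set "
    "provided to create your answers. Highlight the most important things users "
    "thought regarding what they liked and didn't like, and why. Please return your "
    "response as a bulleted list, with one bullet for each key takeaway, with "
    "sub-bullets underneath those for what they liked and didn't like, and why."
)

def synthesize(notes: str, model: str = "gpt-4o") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{SYNTHESIS_PROMPT}\n\nData:\n{notes}"}],
    )
    return response.choices[0].message.content
```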

4. It’s true, ChatGPT gets tired. As with any new technology, ChatGPT is always changing. That being said, the 4.0 version of ChatGPT that we worked with demonstrated diminishing returns the longer we used it. Even though the prompts were exactly the same from question to question, with fresh data sources as input each time, ChatGPT’s answers got shorter and less complete. Prompts asking for three synthesized takeaways would be answered with one or two, with fewer and fewer connections to the data sets. By the end, its answers were straight up wrong – leading us to our final takeaway:
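
One workaround worth trying – our reading of the problem, not anything the vendor documents – is to send each data set as its own independent request rather than one long running chat, and to flag answers that come back thinner than the prompt asked for. A crude sketch, reusing a synthesize() call like the one sketched above:

```python
# Sketch: run each data set as an independent request (no shared chat history) and
# flag thin answers. The "three takeaways" check is a crude heuristic tied to the
# prompt above, not a general rule; synthesize is passed in from the earlier sketch.

def synthesize_batches(batches: list[str], synthesize) -> list[str]:
    results = []
    for i, notes in enumerate(batches, start=1):
        answer = synthesize(notes)  # fresh request per batch, no shared history
        # Count top-level bullets only (lines starting at column 0 with a bullet).
        top_level = sum(1 for line in answer.splitlines() if line.startswith(("- ", "* ")))
        if top_level < 3:
            print(f"Batch {i}: answer looks thin ({top_level} takeaways) - review or re-run.")
        results.append(answer)
    return results
```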

5. Always do an audit of the answers! Large language models like ChatGPT aren’t able to discern whether the answers they provide are accurate or what you were hoping to receive. They’re also incredibly confident when providing those answers, even when they’re wrong. This means you can’t blindly rely on the model to give you an accurate answer. You have to go back, sift through the original data, and make sure that the answers it gives you line up with what you, the researcher, also see. Unfortunately, this means the process will take longer than you were probably hoping for, but the alternative is incomplete or incorrect answers – which defeats the purpose of synthesis in the first place and could cause the client to lose trust in you. 
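
A human read-through of the source data is the real audit, but a small script can help triage where to look first – for example, by checking whether phrases the model put in quotes actually appear anywhere in the notes. The sketch below is that kind of blunt spot-check, a starting point for the manual review rather than a replacement for it.

```python
# Blunt spot-check sketch: flag quoted snippets in the model's takeaways that don't
# appear anywhere in the source notes. Anything returned is a place to start reading
# the raw data more closely; it is not a verification of accuracy.
import re

def unsupported_quotes(takeaways: str, source_notes: str) -> list[str]:
    notes_lower = source_notes.lower()
    quoted = re.findall(r'"([^"]+)"', takeaways)  # snippets the model wrapped in quotes
    return [q for q in quoted if q.lower() not in notes_lower]
```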

Outcome: Did using ChatGPT speed up our synthesis significantly? Absolutely. Could we fully rely on ChatGPT’s synthesis output without any sort of audit or gut check? Not at all. We’ll keep experimenting with ways to incorporate emerging technologies like Generative AI into our workstreams, but always with research integrity and humans at our center. 

Interested in how GenAI might work for your organization? Drop us a line – we’d love to chat!