Generative AI (“Gen AI”) is having one of the biggest impacts on technology in decades. It is truly a game changer, and everyone is talking about it.
ChatGPT has become a household name. Along with similar tools like Claude and Bard, it is bringing Generative AI to consumers and letting people see the benefits of AI firsthand.
Generative AI is top of mind at enterprises as they figure out their AI strategy and how to leverage Gen AI to develop new, or enhance existing, products, services, and processes.
As enterprises explore and develop strategies around Gen AI, new concerns and challenges arise. We are still in the early days.
I had the opportunity to discuss the impact of Generative AI in the enterprise with a group of industry experts: Austin Arensberg of Okta, Roger Kibbe of Samsung, Tracy Rubin of Cooley, and Andrea Friio of AWS.
Read the summary and watch the video below.
While there is a lot of enthusiasm around Generative AI, enterprises are taking a cautious approach. Most are still experimenting or developing internal tools rather than building consumer-facing public ones.
As Austin Arensberg of Okta puts it, we are seeing a lot of “fits and starts.” Enterprises get excited about experimenting, hit a roadblock, and immediately pull back. In some cases this has even led enterprises to bar employees from using tools like ChatGPT.
One related challenge in a large enterprise is having a coherent Gen AI strategy. As Austin explains, there are often multiple teams working in parallel trying to figure out how to use Gen AI and which tools to use. Even if they decide to move forward, will their company even allow them to use the tools?
Another challenge is that we are in such early days that some Large Language Model (LLM) providers do not yet have the sales and service teams in place to support the influx of enterprise requests. As Austin explains, those companies have their own long queues to work through.
Internal uses of Gen AI are already having a significant impact, though. Our panelists provided great examples in marketing, software development, and productivity.
For marketers, Gen AI tools reduce costs and time significantly. As Roger Kibbe of Samsung puts it, marketers were “dancing in the streets” when they started using Gen AI. Where a team may previously have had 40 people working on emails, writing and segmenting content, it can now do the same work much faster and in a more targeted manner.
In software development, Gen AI is so useful and time saving that if a developer is not using it, you have to ask why. As Roger explains, the tools not only write code, they help debug and fix it - no more pulling one’s hair out in frustration. Tracy Rubin of Cooley adds that teams are seeing 30% savings on engineering time, especially on uncomplicated, time-consuming, or repetitive tasks.
Tracy is also seeing great uses of Gen AI internally in meetings. Gen AI can record, transcribe, and summarize the meetings.
While most enterprises may be experimenting, some are deploying public-facing applications of Gen AI. As Andrea Friio of AWS points out, with today’s technology there is a fine line between experimentation / proof-of-concept and pilot / production. Amazon, for instance, is using Gen AI to summarize reviews on product listing pages. Summarization is a great use case for Generative AI: it is relatively low risk and can have a big impact. As Andrea explains, in the contact center space, summarization at scale can save 30% of an agent’s time. That is a big win!
The opportunity Gen AI provides is driving the push to production. As Andrea explains, Gartner estimates a 30% productivity increase among knowledge workers using Gen AI tools. This is why there is a rush to get to production. Andrea recommends starting small and nimble with something lower risk, then expanding.
Data security and copyright ownership are key enterprise concerns.
When it comes to data security, enterprises need to be careful about what data is being input, how it is being used, and who will have access to it. Enterprises do not want their sensitive data getting into the hands of potential competitors.
As Tracy states, anything put into a Gen AI tool needs to be treated as a disclosure of confidential information. The more people you disclose confidential data to, the more places it lives. Even if a Gen AI tool offers an opt-out from having the data used for training, the data may still be sitting on the provider’s servers. It is not 100% risk free.
Roger echoes this sentiment for publicly available Gen AI tools. He indicates one should assume any data input into the model, or provided as part of a Retrieval Augmented Generation (RAG) document, will be exposed. Anything put in, and anything received back, is going to be available to anyone else using the model. This is partly why Samsung banned the use of ChatGPT. It is also a reason some companies prefer to run their own models in the cloud, keeping the information private within the company.
Data security is so important that it needs to be examined up front. Andrea recommends asking the LLM provider who will have access to the data and how it will be used. If you cannot get a clear answer, do not use that provider.
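As a practical guardrail, some teams scrub obvious sensitive values before a prompt ever leaves their network. Here is a minimal sketch of that idea in Python; the redaction patterns are illustrative placeholders, not a complete data loss prevention solution:

```python
import re

# Hypothetical pre-submission guardrail: scrub obvious sensitive values
# before a prompt leaves the company network. The patterns below are
# illustrative only.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}

def redact(text: str) -> str:
    """Replace matches of each pattern with a labeled placeholder."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

prompt = "Draft a reply to jane.doe@example.com; our key is sk-abc123def456ghi789."
safe_prompt = redact(prompt)
# Only the redacted version is ever sent to the external provider.
print(safe_prompt)
# -> Draft a reply to [EMAIL REDACTED]; our key is [API_KEY REDACTED].
```

A filter like this does not remove the need for contractual protections with the provider, but it shrinks what can leak in the first place.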
Copyright ownership is a big concern - not just in claiming rights, but in infringing on others’ rights. In the US, there is no ownership of anything AI-generated. As Tracy explains, this complicates copyrighting and patenting code when an engineer uses Gen AI code assistants. What code did the engineer write, and what did the tool write? The tool potentially taints all the code, making it questionable whether it can be patented or copyrighted.
Enterprises need to be careful about potential copyright infringement too. For example, when it comes to code generated by LLMs, which licenses apply? Is the code under the MIT license or something else? Is the company infringing on the IP or shielded by the license? Is the code considered a derivative work? As Andrea explains, we sometimes have no insight into how the LLMs were trained and where the data came from. This can be an even bigger challenge for international companies; the European Union (EU) recently announced the Artificial Intelligence Act to address Gen AI concerns.
In the case of coding, there is an added potential concern - security vulnerabilities in the generated code itself.
To help offset these concerns, Tracy indicates enterprises are building in guardrails and safeguards so they can be more comfortable that their sensitive data will be protected, that they will not be sued for copyright infringement, and that they will not lose the opportunity to protect the information they want to own.
Tracy adds that most companies are not turning a blind eye to unsanctioned use of Gen AI. It may not be practical for every company to take a “walled garden” approach with in-house versions. Instead, some companies negotiate enterprise terms with a provider, while others set up guardrails and guidelines for their teams.
Along with data security and copyright, cost is a key factor to consider early in Gen AI decision making. As Andrea indicates, not all LLMs are equal in cost or performance. The cost differences between Jurassic or Claude vs OpenAI can be astonishing, and usage of the models can be quite expensive. As Andrea states, the costs could bankrupt a company.
It is important to choose the right model for the right application. There is a tendency to highlight how many billions of parameters an LLM has, but it does not make sense to pay higher costs if the use case does not justify a higher-parameter LLM. As Roger recommends, choose the model that is just enough for what you want, as illustrated below. A general purpose chatbot may need an enormous model, but if the application is purpose built for specific use cases, a much smaller model will be faster and cheaper.
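To see why model choice matters, here is a back-of-the-envelope cost comparison in Python. The per-token prices are made-up illustrations, not actual vendor rates, but they show how quickly the gap compounds at scale:

```python
# Hypothetical prices in USD per 1K tokens (input + output combined).
# These are illustrations, not real vendor rates.
PRICE_PER_1K_TOKENS = {
    "large-general-model": 0.06,
    "small-task-model": 0.002,
}

requests_per_day = 100_000
tokens_per_request = 1_500  # prompt + completion

for model, price in PRICE_PER_1K_TOKENS.items():
    daily_cost = requests_per_day * tokens_per_request / 1_000 * price
    print(f"{model}: ${daily_cost:,.0f}/day (~${daily_cost * 30:,.0f}/month)")
```

At 100,000 requests a day, these hypothetical numbers work out to roughly $9,000 per day for the large model versus $300 for the small one - the kind of gap that justifies Roger’s “just enough” advice.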
Hallucinations, i.e., when an LLM responds with inaccurate information, are a real concern. They are one of the reasons enterprises are taking a cautious approach to rolling out Gen AI solutions to the public. Hallucinations may be cute, fun, and entertaining when playing around in ChatGPT, but as Roger indicates, they become a serious problem when you start building applications. And while it is one thing for them to occur in internal applications, it is a much bigger problem in public-facing ones.
Hallucinations occur when the LLM does not know the answer but still tries to answer, making something up. Part of the problem, as Roger explains, is that unless you explicitly tell the LLM it may do so, it will never say, “I do not know.”
Hallucinations are a new kind of problem we are not used to seeing in technology. It is not just the hallucination itself but that the responses are stochastic, as Roger explains: the same input can generate different outputs. This is not what engineers are used to. How can a QA person perform testing when the output changes for the same input? One common coping strategy is sketched below.
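In practice, QA teams often pin the sampling parameters and assert on properties of the answer rather than on exact text. A minimal sketch, where the generate() wrapper and the expected fact are both hypothetical:

```python
def generate(prompt: str, temperature: float = 0.0, seed: int = 7) -> str:
    # Hypothetical wrapper around your model provider. Most chat APIs
    # expose a temperature parameter, and some accept a seed; pinning
    # both reduces, but does not fully eliminate, run-to-run variation.
    raise NotImplementedError("wire up your provider here")

def test_refund_policy_answer():
    answer = generate("What is our refund window?", temperature=0.0, seed=7)
    # Assert on a property of the answer, not its exact wording,
    # since even low-temperature outputs can vary in phrasing.
    assert "30 days" in answer  # hypothetical expected fact
```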
There are different approaches to help mitigate hallucinations. One approach that can work well is feeding the response back into the LLM and asking it to self-evaluate (see the sketch below). Prompt engineering can also help; in some cases the right prompts can increase accuracy to 98-99%. Fine-tuning models is another approach. The space is still emerging, and Roger sees these approaches as band-aids for a problem we need to solve. Andrea agrees that there is currently no way to eliminate hallucinations, but that the problem needs to be solved. This is an opportunity for new tooling to address the issue.
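Here is a minimal sketch of the self-evaluation approach Roger describes, assuming a hypothetical call_llm() wrapper around your provider. The prompts are illustrative, not a tested recipe:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical wrapper; swap in your provider's API call.
    raise NotImplementedError

def answer_with_self_check(question: str, context: str) -> str:
    # First pass: answer strictly from the provided context, and give the
    # model explicit permission to decline - otherwise it rarely will.
    answer = call_llm(
        "Answer using ONLY the context below. If the context does not "
        "contain the answer, say 'I do not know.'\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # Second pass: feed the answer back and ask the model to judge it.
    verdict = call_llm(
        f"Context:\n{context}\n\nProposed answer:\n{answer}\n\n"
        "Is every claim in the proposed answer supported by the context? "
        "Reply YES or NO."
    )
    return answer if verdict.strip().upper().startswith("YES") else "I do not know."
```

The check costs an extra model call per request, but for public-facing applications that trade-off is often worth it.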
The space is moving very fast, with large cloud providers and startups launching new models, tools, and functionality at an incredibly rapid pace.
The panelists are optimistic about the future of Generative AI and shared their thoughts and predictions, as well as areas of improvement.
Austin is bullish on artificial social networks, wherein people can interact with virtual AI versions of themselves. As he explains, there is an epidemic of loneliness today, and the lack of human interaction and communication can, scarily, be supplemented by a tool like ChatGPT. With Gen AI, things about a person that only a friend would know can be more easily mined and made part of a digital agent. Once you start thinking of these LLM-based digital agents not as an index or search engine like Google but as a person, it is natural, Austin argues, that people will have virtual friends. He sees this as too big an opportunity for someone not to solve, and it will happen sooner than we imagine.
Andrea would like to see the current iteration of chatbots disappear in favor of new Gen AI based experiences. Users are not just trying to get information that is available on a website; they are trying to solve a complex issue, and the chatbot gets in the way of getting an answer. With Gen AI, we have the technology to build agents that are much better at solving problems and completing tasks.
He would also like to see LLMs advance into General Augmented Intelligence, with the ability to reason and respond in real time.
Roger believes we will see more edge LLMs in the near future. LLMs are expensive to run in the cloud, and there are privacy and latency concerns too. Currently, edge LLMs are more of an academic discussion about smaller LLMs at the edge interacting with a larger one in a data center (a pattern sketched below). There is an opportunity to save costs and reduce latency by embedding an LLM on the phone, which can make a voice assistant, app, or appliance better, faster, and cheaper.
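The edge/cloud split Roger describes might look something like this in practice. Both model calls and the complexity heuristic below are hypothetical placeholders:

```python
def on_device_model(prompt: str) -> str:
    # Hypothetical small model running locally (fast, private, cheap).
    raise NotImplementedError

def cloud_model(prompt: str) -> str:
    # Hypothetical large model behind a data center API.
    raise NotImplementedError

def looks_simple(prompt: str) -> bool:
    # Toy heuristic; a real router might use a classifier or the edge
    # model's own confidence score.
    return len(prompt.split()) < 20

def route(prompt: str) -> str:
    # Keep latency-sensitive, simple requests on the device and
    # escalate harder ones to the larger cloud model.
    return on_device_model(prompt) if looks_simple(prompt) else cloud_model(prompt)
```

In production, the heuristic could be replaced with a learned router or a confidence threshold from the edge model itself.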
Roger would also like to see the education space embrace Gen AI instead of fighting it. There is a great opportunity to leverage Gen AI not to write a paper, but to provide feedback, like a writing tutor or a brainstormer. Tracy agrees that AI can help level the playing field by giving access to a custom tutor to people who normally would not be able to afford one.
Tracy believes we will see many more fine-tuned and custom-built models for specific use cases. These models will help reduce concerns around security risks, hallucinations, and not getting an appropriate answer for a particular use case. In the legal profession, for example, an AI assistant that can actually help with case law research and return real cases, or help with due diligence on a transaction, can be very useful. As she explains, general purpose tools do not give the right results. You need a tool trained on real cases that only gives back real cases - more finely tuned models that will give you accurate results, not just a response that sounds human.
Generative AI is clearly having a significant impact on the enterprise, even if enterprises are taking a cautious approach to rolling out solutions to the public.
While 2023 may have been the year of experimentation and internal tools, I believe 2024 will be the year of production deployments in public-facing applications.
I look forward to the advancements in Gen AI and seeing the new applications of the technology being released!
Reconify is a cross-platform analytics and optimization solution for Generative AI that enables enterprises to analyze, optimize, and take action on prompts, responses, and models to improve response effectiveness and customer satisfaction.