In this blog post, we will explore the growing significance of chatbots for e-commerce, their limitations, and how GPT-3 can enhance their capabilities. We'll discuss how GPT-3 can improve intent detection, entity extraction, and conversational flow building, and thus make chatbots more robust and efficient.

So, what makes chatbots a business imperative?

Chatbots offer a quick and convenient way for businesses to interact with customers and provide support 24/7. If your customer has a question about a product or needs help with an order, they can ask the chatbot, and it will answer them immediately.

Chatbots can also boost sales by recommending products based on your customers' browsing history and purchase behavior.

But that’s not all.

Chatbots can take care of repetitive tasks, such as answering common questions about shipping and returns policies. This means customer service representatives can devote their time to more complex inquiries.

So, chatbots are incredibly important for e-commerce websites. 

However, chatbots, as we know them today, have a few shortcomings.

Limited conversational skills: While chatbots have improved significantly over the years, their understanding of natural language is far from perfect. They have not mastered context and nuance, which sometimes results in unhelpful responses.

Lack of emotional intelligence: A large part of what drives customer service is emotional intelligence and empathy. In the absence of these key components, responses from the chatbot may sound robotic, leading to a poor customer experience.

Limited problem-solving abilities: Chatbots are programmed to handle specific tasks and queries, which means that they may struggle to solve complex problems. In such cases, customers may need to be directed to a human representative, which can be time-consuming and frustrating.

Lots of training data: Chatbots require many training phrases to understand the different ways in which customers might raise a query and to respond accordingly. Developers have to generate this data manually or annotate the data they already have.

So how do we overcome these problems?

GPT to the Rescue!

Many of the limitations of the older generation of chatbots are being addressed with new advancements in artificial intelligence and machine learning.

One such breakthrough is the development of GPT, an auto-regressive transformer-based model developed by OpenAI. The model is trained on a large corpus of data from the web, spanning different knowledge domains, and it can be fine-tuned for specific tasks.

More powerful GPT-3 models followed, owing to advances in GPU hardware, the availability of large datasets, and training methodologies such as reinforcement learning from human feedback (RLHF). These models can be used for a variety of tasks, including classification, text conversion, summarization, content generation, and translation. OpenAI exposes these models, which vary in size and are trained on different types of data, through APIs. The best thing about these models is that you can simply describe in plain text what you want them to accomplish, along with a few examples, a method called prompt engineering.

How Chatbots Work

Here is a high-level view of how chatbots work:

  1. Intents: The developer defines the types of questions and requests the chatbot can handle. This involves specifying the keywords and phrases that trigger the chatbot's responses, as well as the expected user input types and contexts.
  2. Knowledge base: Some chatbot engines can store information, such as an FAQ, from which answers to customer queries are fetched.
  3. Fulfillment: Once the chatbot model identifies the intent, it comes up with the appropriate response.
  4. Conversational flows: The developer creates conversational flows, which are paths a user can take through the chatbot's responses. The conversational flow defines how the chatbot responds to user inputs and guides the user toward a successful resolution of their request.
  5. Training: Once the conversational flow is defined, the developer trains the chatbot to recognize user input and respond appropriately. This process involves feeding the chatbot sample user inputs and expected responses so that it can learn to recognize and respond to new inputs.
  6. Testing and deployment: After the chatbot is trained, the developer tests it to ensure that it is working as expected and deploys it on the website.
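
To make the steps above concrete, here is a minimal sketch of a conventional, keyword-driven chatbot. The intent names, trigger phrases, and canned responses are hypothetical and chosen only for illustration; real chatbot engines use trained classifiers rather than plain substring matching.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Intent:
    name: str
    trigger_phrases: List[str]  # keywords/phrases defined by the developer
    response: str               # canned fulfillment text


# Hypothetical intents for a small shop (illustrative values only).
INTENTS = [
    Intent("shipping", ["shipping", "delivery", "arrive"],
           "Standard shipping takes 3-5 business days."),
    Intent("returns", ["return", "refund"],
           "You can return items within 30 days."),
]

FALLBACK = "Sorry, I did not understand that. Let me connect you to an agent."


def detect_intent(user_input: str) -> Optional[Intent]:
    """Return the first intent whose trigger phrase appears in the input."""
    text = user_input.lower()
    for intent in INTENTS:
        if any(phrase in text for phrase in intent.trigger_phrases):
            return intent
    return None


def respond(user_input: str) -> str:
    """Fulfillment step: map the detected intent to its response."""
    intent = detect_intent(user_input)
    return intent.response if intent else FALLBACK
```

Anything that does not contain a trigger phrase falls through to the fallback, which is exactly the brittleness the following sections try to address.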

Let's Make the Chatbot Robust!

Now that we have a high-level idea of how chatbots work, let's see how we can use the GPT model at each step and overcome the limitations of current chatbots.

For this experiment, I have used “text-davinci-003”. You can also go with other models, depending on the complexity of your chatbot.

Intent Detection

To understand the context of a conversation, the chatbot should be able to categorize the conversation into a predetermined set of intents, such as greetings, inquiries, searches, requests, or endings.

Current limitations: Typically, chatbots utilize small classifiers, which have been pre-trained on general intent-related data, to detect intents. To refine the accuracy of these classifiers, a sufficient amount of training data is required for each intent. However, even with fine-tuning, it is difficult to accurately classify complex statements.

GPT-3 solution: Because the GPT-3 model is trained on a vast amount of data, it has a superior understanding of most intents. It can function as a more effective classifier without requiring additional data for fine-tuning (otherwise known as zero-shot classification). For complex examples where the model does not work as expected, a small amount of additional high-quality data is enough to make it work (otherwise known as few-shot classification).

Fig 1. Classification prompt
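
As a rough text equivalent of such a prompt, the sketch below builds a zero-shot or few-shot classification prompt in Python. The prompt wording and the function names are assumptions, not the exact prompt from the figure, and `complete` is a hypothetical callable standing in for whatever function sends a prompt to the model and returns its text reply.

```python
from typing import Callable, List, Optional, Tuple

# Intent labels taken from the list in the text above.
INTENT_LABELS = ["greeting", "inquiry", "search", "request", "ending"]


def build_intent_prompt(message: str,
                        examples: Optional[List[Tuple[str, str]]] = None) -> str:
    """Build a classification prompt; with no examples it is zero-shot."""
    lines = [
        "Classify the customer message into one of these intents: "
        + ", ".join(INTENT_LABELS) + ".",
        "",
    ]
    # Few-shot: prepend labelled examples in the same layout as the query.
    for text, label in examples or []:
        lines += [f"Message: {text}", f"Intent: {label}", ""]
    lines += [f"Message: {message}", "Intent:"]
    return "\n".join(lines)


def classify_intent(message: str,
                    complete: Callable[[str], str],
                    examples: Optional[List[Tuple[str, str]]] = None) -> str:
    """Ask the model for a label and reject anything outside the label set."""
    raw = complete(build_intent_prompt(message, examples)).strip().lower()
    return raw if raw in INTENT_LABELS else "unknown"
```

Checking the reply against the known label set keeps an unexpected completion from being treated as a valid intent.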

Entity Extraction

Once the intent of a statement has been determined, the next step is to extract all the entities mentioned in it. This involves identifying the crucial elements in the text and understanding the individuals and objects being referred to. 

Current limitations: Chatbots include a basic entity extraction engine, which extracts regular entities like name, location, date, number, etc. Most e-commerce chatbots require a more diverse list of custom entities for which training phrases have to be created and manually annotated. 

GPT-3 solution: When given a list of entities to extract along with the user query or chat, the model can identify and map them correctly, even when multiple entities are present within a single statement.

Fig 2. Entity extraction prompt
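
One possible way to wire this up in Python is sketched below. The prompt text, the choice of a JSON reply, and the entity names in the usage note are assumptions made for illustration; `complete` is again a hypothetical callable that sends a prompt to the model and returns its reply.

```python
import json
from typing import Any, Callable, Dict, List


def build_entity_prompt(message: str, entity_names: List[str]) -> str:
    """Ask for the listed entities as a JSON object so the reply is parseable."""
    return (
        "Extract the following entities from the customer message and "
        "reply with a JSON object that uses exactly these keys: "
        + ", ".join(entity_names)
        + ". Use null for entities that are not mentioned.\n\n"
        + f"Message: {message}\nJSON:"
    )


def extract_entities(message: str,
                     entity_names: List[str],
                     complete: Callable[[str], str]) -> Dict[str, Any]:
    """Return one value per requested entity; missing ones map to None."""
    raw = complete(build_entity_prompt(message, entity_names))
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = {}  # the model did not return valid JSON
    if not isinstance(parsed, dict):
        parsed = {}
    return {name: parsed.get(name) for name in entity_names}
```

For a message such as "I need red sneakers" with the hypothetical entities `product`, `color`, and `size`, a well-behaved reply would fill the first two and leave `size` as `None`.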

Knowledge base

This is a large database of textual information that can include detailed information on products, return policies, etc. Any queries for which answers are contained in this database will be fetched by the chatbot.

Current limitations: Because the chatbot uses a similarity check to find related content in the text database, it returns the entire matching text instead of the exact information requested.
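
The limitation can be illustrated with a minimal sketch of similarity-based lookup. The Jaccard word-overlap score used here is a stand-in (an assumption, since chatbot engines differ in the measure they use), and the policy passages in the usage note are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity: shared tokens divided by all tokens."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str, passages: List[str]) -> str:
    """Return the whole passage that is most similar to the query."""
    query_tokens = tokenize(query)
    return max(passages, key=lambda p: jaccard(query_tokens, tokenize(p)))
```

Asked "How many days do I have for returns?", this lookup would hand back an entire returns-policy passage, leaving the customer to find the number of days inside it.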

GPT-3 solution: We can fine-tune the model on the knowledge base, first as a new-word-generation task and then as a question-and-answer task. After this, asking a related question returns the exact information rather than the whole passage. Further, if we don't want the model to answer anything out of context, we can include negative sample questions with "out-of-context" as the answer during training.

Fig 3. Negative samples for training the model
Fig 4. Question and answer prompt
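
A minimal sketch of how such a training file might be assembled is shown below. The prompt/completion JSONL layout and the "Question:/Answer:" wording are assumptions about the fine-tuning data format, and the sample questions in the usage note are hypothetical.

```python
import json
from typing import Dict, List, Tuple

OUT_OF_CONTEXT = "out-of-context"


def build_training_records(qa_pairs: List[Tuple[str, str]],
                           negative_questions: List[str]) -> List[Dict[str, str]]:
    """Combine real Q&A pairs with negative samples that teach refusal."""
    records = [
        {"prompt": f"Question: {q}\nAnswer:", "completion": f" {a}"}
        for q, a in qa_pairs
    ]
    # Negative samples: unrelated questions all map to the same fixed answer.
    records += [
        {"prompt": f"Question: {q}\nAnswer:", "completion": f" {OUT_OF_CONTEXT}"}
        for q in negative_questions
    ]
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize the records as one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)
```

For example, a pair such as ("What is the return window?", "30 days") would sit alongside a negative sample like "Who won the match last night?", whose completion is the fixed "out-of-context" answer.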

Recommendation engine

Recommendation engines are a powerful tool for e-commerce platforms. They help personalize the customer experience, increase sales by suggesting relevant and complementary products, and improve customer engagement.

Current limitations: Conventional chatbots lack the recommendation feature. They are rule-based and can only provide pre-programmed responses to specific questions.

GPT-3 solution: Thanks to GPT-3's vast knowledge and advanced language processing capabilities, customers can ask for recommendations and receive personalized responses based on their requirements. The model can provide recommendations with the relevant entities. These entities can then be extracted and the relevant product list can be shared with customers. 

Fig 5. Recommendation prompt
Fig 6. Entity extraction prompt
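
The two-step flow described above (recommend, then extract entities and look up products) could be sketched as follows. The prompt wording, the catalog layout with a `type` field, and the `complete` callable that sends a prompt to the model are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def build_recommendation_prompt(request: str) -> str:
    """Step 1 prompt: ask the model for a free-text recommendation."""
    return (
        "You are a shopping assistant. Recommend products for the "
        f"customer request below.\n\nRequest: {request}\nRecommendation:"
    )


def build_extraction_prompt(recommendation: str) -> str:
    """Step 2 prompt: pull product types out of the recommendation text."""
    return (
        "List the product types mentioned in the text below, one per line.\n\n"
        f"Text: {recommendation}\nProduct types:"
    )


def parse_product_types(reply: str) -> List[str]:
    """Turn a line-per-item reply into lowercase product types."""
    items = []
    for line in reply.splitlines():
        # Drop leading bullets or numbering such as "- " or "1. ".
        cleaned = line.strip().lstrip("-*0123456789. ").strip().lower()
        if cleaned:
            items.append(cleaned)
    return items


def recommend(request: str,
              catalog: List[Dict[str, str]],
              complete: Callable[[str], str]) -> List[Dict[str, str]]:
    """Recommend, extract product types, then filter the catalog by type."""
    recommendation = complete(build_recommendation_prompt(request))
    wanted = parse_product_types(complete(build_extraction_prompt(recommendation)))
    return [product for product in catalog if product["type"] in wanted]
```

Matching on an exact `type` string keeps the sketch short; a production catalog lookup would likely need fuzzier matching between the model's wording and the store's product taxonomy.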

Conclusion

Implementing GPT-powered chatbots can revolutionize your e-commerce business because they can provide personalized and efficient customer service. These chatbots can handle a wide range of queries, offer product recommendations, and even engage customers in natural language conversations. By harnessing the power of GPT technology, you can enhance the user experience on your website, increase customer satisfaction, and ultimately drive more sales.

Don't miss out on the opportunity to transform your e-commerce business with GPT-powered chatbots.

Start exploring the possibilities today!