Wednesday 4 April 2018

Clustering Questions Part 2: Intentions

Once you have divided your questions into Topics the next step is to divide them into Intents. This is how I would find the intents inside a topic

An Intent is a purpose or goal expressed by a user’s input such as finding contact information or booking a trip.

Imagine you had an airline booking chatbot. And you had these questions in the Topic booking

There is a dataset of travel questions here I will take some questions from there and invent some myself

A booking topic could have

Question Intent
I'd like to book a trip to Atlantis from Caprica on May 13 BookTicket
I'd like to book a trip from Chicago to San Diego between Aug 26th and Sept 5th BookTicket
i wanna go to Kobe whats available? BookTicket
Can I get information for a trip from Toluca to Paris on August 25th? BookTicket
I'd like to book a trip to Tel Aviv from Tijuana. I was wondering if there are any packages from August 23rd to 26th BookTicket
I want to know how far in advance I can book a flight BookFuture
When do bookings open for 6 months time BookFuture
I want to get a ticket for my christmas flight home BookFuture
Can I check my booking? BookCheck
can i check my booking status BookCheck
can i check the status of my booking BookCheck
how do i check the status of my booking BookCheck
i need to check my booking status BookCheck
Let me know the status of my Booking BookCheck
Can I book now pay later BookPay
How can I pay for a booking? BookPay

On this the verbs Check and Pay each seem to form an intent. There is one intent on When bookings can happen. And a few unknowns that might make more sense when we have more questions later.

At this stage realise you are going to make mistakes and have to go back over your intents as you learn by doing. Fixing your intents once you have had a first cut at defining I will come back to.

One could way to find intents in a topic is to look for verbs. Unrelated actions tend to have different verbs. In this case the topic Booking is already a verb and a noun. This is common enough. Dual meanings like this can be a nightmare with Entities but that is another blogpost.

In this topic something like 'cancel a booking' is likely to be an intention. Here Cancel is the verb and booking as the object of the sentence.

Other clues to the intention are the Lexical Answer Type, the subject and the object The LAT is the type of question. Who questions have different types of answers to When questions. In practise I don't find you commonly use the LAT to define intentions.

One possible exception to this is definitional questions where users ask "What is a..." for a domain term to be explained. If more than 5% of your questions are definitional you may not have collected representitive questions as manufactured questions by non real users or real users forced to ask questions tend to be definitional. When someone runs out of real questions they will ask 'What is a booking'.

The Subject of the sentence is also rarely useful. Sometimes who is doing an action changes the answer but usually there is a set scheme to buy, book, cancel etc and who is doing it doesn't matter.

The Object of the sentence is more often useful. Frequently an intention is a combination of the verb and what it is being done to. Whichever one isn't the Topic is usually the intent. Booking might be a topic and various things you do with a booking would be intents.

In summary go through each topic. If there are verbs shared across questions they might go together in an intent. But you have to use the domain experts knowledge of what questions have the same intention this step cannot be automated.

No comments:

Post a Comment