    Jul 11

    AI Needs to Explain Itself, but Not for the Reasons You Think

    Artificial Intelligence is a black box. You give it an input and you get an output. Data scientists know how this works in principle, but we don’t know exactly what the AI is doing while it generates a response. This doesn’t mean we’re on the verge of Terminator or that we lack control over AI agents, but it does make it harder to ensure outputs are aligned with what we intended. There’s plenty of research going into looking inside the “brain”, but that isn’t what we’re diving into here. Here we’re discussing the importance of allowing the AI to explain itself… even when we don’t care about the explanation at all.

    The Problem

    It can be difficult to build applications like CloseBot that take humans out of the loop. When you don’t have a human in the loop, you have to trust that the AI won’t make mistakes that compound without human correction. Subtle changes in input can have drastic consequences for the accuracy of the output. The worst part is that a lot of these input mistakes aren’t intuitive, and in some cases the fixes are downright counterintuitive.

    Some have attempted to understand why the AI is making a decision by asking it to explain its work. However, this doesn’t work the way people think it does. Anthropic has published a blog post explaining that you can’t believe the thought process the AI reports to you. It’s not that the AI is trying to lie to you; it’s just that the explanation doesn’t always reveal what’s actually driving its behavior.

    An Example

    Since CloseBot’s superpower is that it dynamically builds prompts for you, we see a lot of prompts (over 500,000 per day). We truly are prompt experts… not just some guru pushing a course.

    This is an actual prompt that was used by CloseBot in conjunction with OpenAI’s gpt-4.1 model to determine when we should follow up with a contact:

     

    Given a conversation you need to come up with a datetime that defines when we should send a message to the contact.

    REMEMBER THE CURRENT DATE AND TIME RIGHT NOW IN UTC IS: 07/06/2025 08:44:03

    ONLY RESPOND WITH THE DATETIME OF WHEN WE SHOULD SEND A MESSAGE BACK TO THE CONTACT AND DO NOT INCLUDE ANY EXPLANATION

    ### Conversation (oldest to newest)
    1: [CONTACT] This is Sophia
    2: [US] Hi Sophia When should we remind you – 2 days, 1 week, 1 month?
    3: [CONTACT] Good morning, see you in a month. thanks and best regards, Sophia
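
    For anyone who wants to reproduce this, here is a minimal sketch of how a prompt like the one above might be sent to gpt-4.1 using the OpenAI Python SDK. The client setup and variable names are illustrative assumptions, not CloseBot’s actual implementation:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The full follow-up prompt shown above, passed as a single user message
    prompt = "...the full follow-up prompt text shown above..."

    response = client.chat.completions.create(
        model="gpt-4.1",
        messages=[{"role": "user", "content": prompt}],
    )

    print(response.choices[0].message.content)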

    It’s easy for us humans to see that the follow-up should occur one month from today. So the correct answer is 08/06/2025. However, this was the output:

    ASSISTANT: 07/06/2025 08:44:03

    Bummer… is AI really that bad at even the simplest of tasks? Yes, it is. It takes a thorough understanding of prompting to resolve issues like this, which on the surface look like a pure limitation of AI’s capabilities. Let’s take a look at how the team resolved this:

    Given a conversation you need to come up with a datetime that defines when we should send a message to the contact.

    REMEMBER THE CURRENT DATE AND TIME RIGHT NOW IN UTC IS: 07/06/2025 08:44:03
    ONLY RESPOND WITH THE DATETIME OF WHEN WE SHOULD SEND A MESSAGE BACK TO THE CONTACT AND INCLUDE A SHORT EXPLANATION

    ### Conversation (oldest to newest)
    1: [CONTACT] This is Sophia
    2: [US] Hi Sophia When should we remind you – 2 days, 1 week, 1 month?
    3: [CONTACT] Good morning, see you in a month. thanks and best regards, Sophia

    To the untrained eye, this looks like it shouldn’t change anything. After all, we don’t care about the explanation; we simply intend to grab the output and use our code to schedule a follow-up. But let’s see how this simple change affects the output:

    ASSISTANT: 2025-08-06 08:44:03 UTC

    Explanation: Sophia said “see you in a month,” so we should send the message exactly one month from today’s date and time (07/06/2025 08:44:03 UTC).
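
    Since the datetime always comes first and the explanation trails behind it, the scheduling code only needs the first line and can discard the rest. Here is a rough sketch of that parsing step, assuming the output format shown above (the variable names are our own, not CloseBot’s):

    from datetime import datetime

    raw = ('2025-08-06 08:44:03 UTC\n\n'
           'Explanation: Sophia said "see you in a month," so we should send '
           'the message exactly one month from today.')

    # Keep only the first line, drop the trailing "UTC", and parse the datetime
    first_line = raw.strip().splitlines()[0].replace(" UTC", "").strip()
    follow_up_at = datetime.strptime(first_line, "%Y-%m-%d %H:%M:%S")

    print(follow_up_at)  # 2025-08-06 08:44:03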

    Key Takeaways

    Even when you don’t care about an explanation, it helps to allow your AI to explain itself. Keep in mind that you can’t always trust the explanation itself, but asking for one has been shown to improve accuracy. And if you’re trying to avoid paying extra for output you don’t care about, we’ve found that you can cut the explanation off by limiting output tokens and still keep the accuracy benefit.

    For example, if we limit the output tokens to 16 (the minimum for gpt-4.1) in the above example, this is what we get:

    ASSISTANT: 2025-08-06 08:44:03
    Explanation: Sophia
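
    If you want to try this yourself, the cap is just a request parameter. Reusing the client and prompt from the earlier sketch (again illustrative, and note that the exact parameter name can vary with SDK version and model):

    # Same call as before, but capped so the explanation gets cut off
    # after it has already nudged the model toward the correct datetime.
    response = client.chat.completions.create(
        model="gpt-4.1",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=16,  # enough for the datetime, not the full explanation
    )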

    For CloseBot Users

    If you’re a CloseBot user, you don’t have to worry about any of this! Know that CloseBot is doing the heavy lifting for you. We are the true prompt experts, testing and iterating to improve performance while reducing costs. If you’re not a CloseBot user and need something to automate your lead qualification and appointment booking efforts, grab your free account now!

    What you should do now

    1. See CloseBot’s powerful AI agents in action.
      Sign up for a free trial.
    2. Read more articles from our blog.
    3. If you know someone who would enjoy this article, share it with them via email, LinkedIn, Twitter/X, or Facebook.