Announcing Zep's Entity Extractor!

Today we're launching Zep's EntityExtractor, a Named Entity Recognition (NER) tool built on spaCy, a state-of-the-art NLP toolkit. With the EntityExtractor, developers can build sophisticated features that:

  • Trigger custom prompts or agent branching.
  • Annotate the chat history, enhancing the experience for users with links to additional information, services, or products.
  • Analyze human and agent messages to extract dates, currencies, people's names, place names, and other entities.
  • And much more.
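
As a rough illustration of the first bullet, the sketch below picks a prompt based on the entity labels found on a message. It assumes messages are plain dicts with entities stored under metadata.system.entities, each carrying a Label field (the shape shown in the memory dump later in this post). The prompt texts and the labels_in and choose_prompt helpers are hypothetical and not part of Zep.

```python
# Hypothetical mapping from entity label to a custom prompt.
# Earlier entries take priority when a message carries several labels.
PROMPTS_BY_LABEL = {
    "WORK_OF_ART": "The user mentioned a book or film. Offer related titles.",
    "PERSON": "The user mentioned a person. Offer a short biography.",
}
DEFAULT_PROMPT = "Answer the user's question."


def labels_in(message):
    """Return the set of entity labels attached to a message dict."""
    system = (message.get("metadata") or {}).get("system") or {}
    return {entity["Label"] for entity in system.get("entities") or []}


def choose_prompt(message):
    """Pick the first prompt whose label appears on the message."""
    found = labels_in(message)
    for label, prompt in PROMPTS_BY_LABEL.items():
        if label in found:
            return prompt
    return DEFAULT_PROMPT


# Sample message mirroring the assumed shape (illustrative data only)
sample = {
    "content": "Dune by Frank Herbert",
    "metadata": {"system": {"entities": [{"Label": "PERSON", "Name": "Frank Herbert"}]}},
}
print(choose_prompt(sample))
```

The same label check could just as easily select a tool or an agent branch instead of a prompt.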

Zep's EntityExtractor runs entirely locally and does not depend on access to an LLM. Like Zep's other extractors, the EntityExtractor runs asynchronously to the chat loop and operates with very low latency, ensuring your user experience is unaffected.

The following example builds an agent on LangChain's ZepChatMessageHistory class. The setup is shown below.

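# Imports reflect LangChain's module layout at the time of writing
from langchain.agents import AgentType, initialize_agent
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory
from langchain.memory.chat_message_histories import ZepChatMessageHistory

# session_id, ZEP_API_URL, and tools are assumed to be defined elsewhere

# Set up Zep Chat History
zep_chat_history = ZepChatMessageHistory(
    session_id=session_id,
    url=ZEP_API_URL,
)

# Use a standard ConversationBufferMemory
# to encapsulate the Zep chat history
memory = ConversationBufferMemory(
    memory_key="chat_history", chat_memory=zep_chat_history
)

# Initialize the agent
llm = OpenAI(temperature=0)
agent_chain = initialize_agent(
    tools,
    llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
)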

The agent has access to a Wikipedia tool. Let's ask the agent to recommend books similar to those written by Octavia Butler, my favorite science fiction author.

agent_chain.run(
    input="""Recommend science fiction books I 
    should read that are similar to Octavia Butler's 
    Parable of the Sower. Research online."""
)
> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: Search
Action Input: Recommend science fiction books similar to Octavia Butler's Parable of the Sower
Observation: Page: Dune (novel)
Summary: Dune is a 1965 epic science fiction novel by American author Frank Herbert, originally published as two separate serials in Analog magazine

<snip />

Thought: Do I need to use a tool? No
AI: Based on your request, some recommended science fiction books similar to Octavia Butler's Parable of the Sower include The Left Hand of Darkness by Ursula K. Le Guin, The Dispossessed by Ursula K. Le Guin, The Female Man by Joanna Russ, Dune by Frank Herbert, and The Forever War by Joe Haldeman.

> Finished chain.

Looking at the contents of the agent's memory, you'll note that the EntityExtractor does a good job of identifying the books and authors. (I've removed some entities for brevity's sake.)

Alongside each entity's label and name, the character offsets of every match are provided. In the output below, Start is zero-based and End is exclusive: "The Left Hand of Darkness" spans offsets 119 to 144 of the message content. This makes it simple to build chat annotations into your app!

Other Zep artifacts are also present: token counts, uuids, and timestamp metadata.

import pprint

pp = pprint.PrettyPrinter(indent=4)

pp.pprint(zep_chat_history.zep_messages[-1].dict())
{
  "content": "Based on your request, some recommended science fiction books similar to Octavia Butler's Parable of the Sower include The Left Hand of Darkness by Ursula K. Le Guin, The Dispossessed by Ursula K. Le Guin, The Female Man by Joanna Russ, Dune by Frank Herbert, and The Forever War by Joe Haldeman.",
  "created_at": "2023-05-26T17:31:36.433237Z",
  "metadata": {
    "system": {
      "entities": [
        {
          "Label": "WORK_OF_ART",
          "Matches": [
            {
              "End": 144,
              "Start": 119,
              "Text": "The Left Hand of Darkness"
            }
          ],
          "Name": "The Left Hand of Darkness"
        },
        {
          "Label": "PERSON",
          "Matches": [
            {
              "End": 165,
              "Start": 148,
              "Text": "Ursula K. Le Guin"
            }
          ],
          "Name": "Ursula K. Le Guin"
        },
        {
          "Label": "PERSON",
          "Matches": [
            {
              "End": 235,
              "Start": 224,
              "Text": "Joanna Russ"
            }
          ],
          "Name": "Joanna Russ"
        },
        ...
        {
          "Label": "WORK_OF_ART",
          "Matches": [
            {
              "End": 279,
              "Start": 264,
              "Text": "The Forever War"
            }
          ],
          "Name": "The Forever War"
        },
        {
          "Label": "PERSON",
          "Matches": [
            {
              "End": 295,
              "Start": 283,
              "Text": "Joe Haldeman"
            }
          ],
          "Name": "Joe Haldeman"
        }
      ]
    }
  },
  "role": "ai",
  "token_count": 75,
  "uuid": "9e5c34bb-63fb-472e-847f-dc2ccd3144fb"
}
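
The offsets above lend themselves to a simple annotation pass. The following is a minimal sketch, assuming the message is a plain dict shaped like the output above and that Start/End are zero-based with an exclusive End (which matches the sample). The annotate helper, the `<mark>` tag, and the data-label attribute are illustrative choices, not part of Zep.

```python
from html import escape


def annotate(content, entities):
    """Wrap every entity match in a <mark> tag using its Start/End offsets."""
    # Flatten the entities into sorted (start, end, label) spans
    spans = sorted(
        (match["Start"], match["End"], entity["Label"])
        for entity in entities
        for match in entity["Matches"]
    )
    parts, cursor = [], 0
    for start, end, label in spans:
        if start < cursor:
            continue  # skip a match that overlaps one already emitted
        parts.append(escape(content[cursor:start]))
        parts.append(
            f'<mark data-label="{escape(label)}">{escape(content[start:end])}</mark>'
        )
        cursor = end
    parts.append(escape(content[cursor:]))
    return "".join(parts)


# Small sample mirroring the structure above (illustrative data only)
message = {
    "content": "Dune by Frank Herbert",
    "metadata": {"system": {"entities": [
        {"Label": "WORK_OF_ART", "Name": "Dune",
         "Matches": [{"Start": 0, "End": 4, "Text": "Dune"}]},
        {"Label": "PERSON", "Name": "Frank Herbert",
         "Matches": [{"Start": 8, "End": 21, "Text": "Frank Herbert"}]},
    ]}},
}
annotated = annotate(message["content"], message["metadata"]["system"]["entities"])
print(annotated)
```

In an app, the `<mark>` wrapper could be swapped for a link to a product page or a search query built from the entity's Name.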

We're currently using spaCy's smallest English language model, en_core_web_sm, for entity extraction.

The EntityExtractor is available in our Docker Compose install and our Render.com deployment.

Want to try this out yourself? Take a look at our notebook exploring the above in more detail.

💡
Want to get started using Zep?

Follow the Zep Quick Start Guide.