The Diffbot API, FalkorDB, and LangChain combine well for building applications that can understand and answer questions from unstructured data.
The Diffbot API extracts structured data from unstructured documents such as web pages, PDFs, or emails. With it, you can create a knowledge graph that represents the entities and relationships in your documents and store that graph in FalkorDB. You can then use LangChain to query the knowledge graph in natural language: it turns each question into a graph query and composes an answer from the results.
1. Installing LangChain
First, install LangChain and the other dependencies used in this walkthrough on your machine. You can follow the instructions on the official website or use the command line:
pip install langchain langchain-experimental openai redis wikipedia
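Before moving on, it can help to confirm that the packages are importable. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical, and the import names in `REQUIRED` are assumed to match the pip packages above (for example, `langchain-experimental` is imported as `langchain_experimental`).

```python
import importlib.util

def missing_modules(names):
    """Return the module names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names assumed to correspond to the packages installed above.
REQUIRED = ["langchain", "langchain_experimental", "openai", "redis", "wikipedia"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing modules:", missing or "none")
```

An empty result means every listed module was found. It does not check versions.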
2. Starting FalkorDB server locally
Starting a local FalkorDB server is as simple as running a Docker container. The documentation describes other ways to run it.
> docker run -p 6379:6379 -it --rm falkordb/falkordb:latest
6:C 26 Aug 2023 08:36:26.297 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
6:C 26 Aug 2023 08:36:26.297 # Redis version=7.2.1, bits=64, commit=00000000, modified=0, pid=6, just started
...
...
6:M 26 Aug 2023 08:36:26.322 * <graph> Starting up FalkorDB version 99.99.99.
6:M 26 Aug 2023 08:36:26.324 * <graph> Thread pool created, using 8 threads.
6:M 26 Aug 2023 08:36:26.324 * <graph> Maximum number of OpenMP threads set to 8
6:M 26 Aug 2023 08:36:26.324 * <graph> Query backlog size: 1000
6:M 26 Aug 2023 08:36:26.324 * Module 'graph' loaded from /FalkorDB/bin/linux-x64-release/src/falkordb.so
6:M 26 Aug 2023 08:36:26.324 * Ready to accept connections
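Once the log shows "Ready to accept connections", you can check from Python that the port published by the docker command is reachable. This is a minimal sketch with the standard library. `is_port_open` is a hypothetical helper, and it only verifies that something accepts TCP connections on the port, not that the graph module is loaded.

```python
import socket

def is_port_open(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

if __name__ == "__main__":
    print("FalkorDB port reachable:", is_port_open())
```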
Running the demo
The rest of this post covers the simple steps you can take to get started. You can also try the Google Colab notebook.
3. Create a Knowledge Graph
Now, let’s create a demo knowledge graph about Warren Buffett using Wikipedia:
from langchain_experimental.graph_transformers.diffbot import DiffbotGraphTransformer
from langchain.document_loaders import WikipediaLoader
# Replace the placeholder with your own Diffbot API key.
diffbot_api_key = "DIFFBOT_API_KEY"
diffbot_nlp = DiffbotGraphTransformer(diffbot_api_key=diffbot_api_key)
# Load the Wikipedia articles that match the query.
query = "Warren Buffett"
raw_documents = WikipediaLoader(query=query).load()
# Extract entities and relationships from the articles with Diffbot.
graph_documents = diffbot_nlp.convert_to_graph_documents(raw_documents)
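To get a feel for what Diffbot extracted, you can tally the node and relationship types. The sketch below assumes each graph document exposes `nodes` and `relationships` lists whose items carry a `type` attribute. `summarize_graph_documents` is a hypothetical helper, and the stand-in objects in the demo are made up for illustration, not real Diffbot output.

```python
from collections import Counter
from types import SimpleNamespace

def summarize_graph_documents(graph_documents):
    """Count node types and relationship types across all graph documents."""
    node_types = Counter()
    relationship_types = Counter()
    for doc in graph_documents:
        node_types.update(node.type for node in doc.nodes)
        relationship_types.update(rel.type for rel in doc.relationships)
    return node_types, relationship_types

# Stand-in data shaped like a graph document (illustrative only).
demo_doc = SimpleNamespace(
    nodes=[SimpleNamespace(type="Person"), SimpleNamespace(type="Organization")],
    relationships=[SimpleNamespace(type="EDUCATED_AT")],
)

if __name__ == "__main__":
    nodes, rels = summarize_graph_documents([demo_doc])
    print(dict(nodes), dict(rels))
```

With the real data, call `summarize_graph_documents(graph_documents)` instead of passing the stand-in.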
4. Storing the Knowledge Graph in FalkorDB
The last step in building the graph is to store it in FalkorDB:
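from langchain.graphs import FalkorDBGraph
# Open the graph named "falkordb" on the FalkorDB server started in step 2.
graph = FalkorDBGraph(
    "falkordb",
)
# Write the extracted nodes and relationships to the graph.
graph.add_graph_documents(graph_documents)
# Reload the schema so that later queries see the new labels and relationship types.
graph.refresh_schema()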
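A quick way to confirm that the graph was written is to count what is stored. This sketch assumes `graph.query` accepts a Cypher string and returns a list of rows, which is the shape of the "Full Context" output shown later in this post. `count_nodes_and_edges` and the `FakeGraph` stand-in are hypothetical names used only for illustration.

```python
def count_nodes_and_edges(graph):
    """Return (node_count, relationship_count) using two Cypher count queries."""
    node_rows = graph.query("MATCH (n) RETURN count(n)")
    edge_rows = graph.query("MATCH ()-[r]->() RETURN count(r)")
    return node_rows[0][0], edge_rows[0][0]

class FakeGraph:
    """Stand-in with canned results so the sketch runs without a server."""
    def query(self, cypher):
        return [[3]] if "count(n)" in cypher else [[2]]

if __name__ == "__main__":
    print(count_nodes_and_edges(FakeGraph()))
```

Against the real server, pass the `graph` object created above. Non-zero counts suggest that the documents were stored.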
5. Querying the Graph
You are all set and can start querying the knowledge graph. Let’s try a couple of questions.
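# Notebook magic: sets the OpenAI API key for this session (replace the placeholder with your own key).
%env OPENAI_API_KEY=OPENAI_API_KEY
from langchain.chains import GraphCypherQAChain
from langchain.chat_models import ChatOpenAI
# One model generates the Cypher query, the other phrases the final answer.
chain = GraphCypherQAChain.from_llm(
    cypher_llm=ChatOpenAI(temperature=0, model_name="gpt-4"),
    qa_llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"),
    graph=graph, verbose=True,
)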
chain.run("Which university did Warren Buffett attend?")
> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (p:Person {name: "Warren Buffett"})-[:EDUCATED_AT]->(o:Organization)
RETURN o.name
Full Context:
[['Woodrow Wilson High School'], ['Alice Deal Junior High School'], ['Columbia Business School'], ['New York Institute of Finance']]
> Finished chain.
'Warren Buffett attended Columbia Business School.'
chain.run("Who is or was working at Berkshire Hathaway?")
> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (p:Person)-[r:EMPLOYEE_OR_MEMBER_OF]->(o:Organization) WHERE o.name = 'Berkshire Hathaway' RETURN p.name
Full Context:
[['Warren Buffett'], ['Charlie Munger'], ['Howard Buffett'], ['Susan Buffett'], ['Howard'], ['Oliver Chace']]
> Finished chain.
'Warren Buffett, Charlie Munger, Howard Buffett, Susan Buffett, Howard, and Oliver Chace are or were working at Berkshire Hathaway.'
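Because the final answer is composed by a language model, it is worth checking it against the raw query results. In the first example above, the context lists four institutions while the answer names only one. The sketch below re-runs a Cypher query and flattens the single-column rows into a plain list. It assumes `graph.query` returns rows in the same list-of-lists shape as the "Full Context" output, and `single_column` and `FakeGraph` are hypothetical names used for illustration.

```python
def single_column(graph, cypher):
    """Run `cypher` and return the first value of every row as a flat list."""
    return [row[0] for row in graph.query(cypher)]

class FakeGraph:
    """Stand-in that replays the "Full Context" rows from the second example."""
    def query(self, cypher):
        return [["Warren Buffett"], ["Charlie Munger"], ["Howard Buffett"],
                ["Susan Buffett"], ["Howard"], ["Oliver Chace"]]

# The Cypher generated by the chain for the Berkshire Hathaway question.
CYPHER = (
    "MATCH (p:Person)-[r:EMPLOYEE_OR_MEMBER_OF]->(o:Organization) "
    "WHERE o.name = 'Berkshire Hathaway' RETURN p.name"
)

if __name__ == "__main__":
    print(single_column(FakeGraph(), CYPHER))
```

Running the same helper with the real `graph` object makes it easier to spot entries such as the bare 'Howard', which may be a partial duplicate of another node.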