How to Connect Wikipedia with ChatGPT and LangChain

ChatGPT integration with Wikipedia using LangChain

George Pipis

ChatGPT’s knowledge is limited to its training data, which has a cutoff in 2021. This means it cannot answer questions about events that occurred after the cutoff. However, we can work around this by integrating Wikipedia with ChatGPT. Let’s go straight to an example: our goal is to extract information about Juancho Hernangomez, the new star of Panathinaikos BC.

The wikipedia Python Package

We need to install the wikipedia Python package by running:

pip install wikipedia
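To confirm that the installation worked before going further, a quick check like the one below can help. This is a minimal sketch using only the standard library; the helper name is_installed is made up for this illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by the import system."""
    # find_spec returns None when no top-level module of that name exists.
    return importlib.util.find_spec(module_name) is not None


print("wikipedia installed:", is_installed("wikipedia"))
```

If this prints False, the package was probably installed into a different Python environment than the one running the script.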

LangChain's WikipediaLoader, which relies on the wikipedia package, takes the following arguments:

  • query: your query to Wikipedia.
  • optional lang: the language of the Wikipedia edition to search; the default is "en".
  • optional load_max_docs: default=100. The maximum number of documents to download.
  • optional load_all_available_meta: default=False. By default, only the most important fields are downloaded, such as the publication date of the Wikipedia document, its title, and a summary.
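The arguments above can be sketched as a small wrapper. The function name make_wikipedia_loader and its loader_cls parameter are hypothetical, introduced only so the sketch can be tried without network access. When loader_cls is omitted, the function imports the WikipediaLoader class from langchain.document_loaders, the same import path used later in this post.

```python
from typing import Any, Callable, Optional


def make_wikipedia_loader(
    query: str,
    lang: str = "en",
    load_max_docs: int = 2,
    load_all_available_meta: bool = False,
    loader_cls: Optional[Callable[..., Any]] = None,
) -> Any:
    """Build a loader from the four arguments described above.

    `loader_cls` is a hypothetical hook for this sketch; when it is None,
    the real LangChain WikipediaLoader is imported and used.
    """
    if loader_cls is None:
        # Imported lazily so this snippet loads even without LangChain.
        from langchain.document_loaders import WikipediaLoader

        loader_cls = WikipediaLoader
    return loader_cls(
        query=query,
        lang=lang,
        load_max_docs=load_max_docs,
        load_all_available_meta=load_all_available_meta,
    )
```

With LangChain installed and network access available, make_wikipedia_loader("Juancho Hernangomez").load() should return a list of documents. A small load_max_docs keeps experiments fast.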

Integrate Wikipedia with ChatGPT

Let’s start by loading the required libraries.

from langchain.document_loaders import WikipediaLoader

from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain.prompts.chat import (
    PromptTemplate,
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    AIMessagePromptTemplate,
)

Our query to Wikipedia will be “Juancho Hernangomez”. Let’s see how we can get the related Wikipedia articles as plain text. We will limit the number of loaded documents to 2.

# The maximum number of documents to load
n = 2

# The loader
loader = WikipediaLoader(query='Juancho Hernangomez', load_max_docs=n)

# Concatenate the text to variable called context_text
context_text = ''
for d in range(len(loader.load())):
    context_text = context_text + ' ' +…
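As a self-contained illustration of the concatenation step, here is a minimal sketch. It assumes the goal is to join the text of every loaded document, and that each document exposes its text through a page_content attribute, as LangChain Document objects do. FakeDocument and build_context are hypothetical names used only here, and the sample strings are placeholders rather than real Wikipedia content.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass
class FakeDocument:
    """Stand-in for a LangChain Document (hypothetical, for illustration)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_context(docs: Iterable[Any]) -> str:
    """Join the page_content of each document into one space-separated string."""
    return " ".join(doc.page_content for doc in docs)


# Placeholder documents; in practice these would come from a single
# loader.load() call stored in a variable.
docs = [FakeDocument("First article text."), FakeDocument("Second article text.")]
context_text = build_context(docs)
print(context_text)
```

Storing the result of load() in a variable once and iterating over that list avoids fetching the same articles from Wikipedia more than once.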

George Pipis

Sr. Director, Data Scientist @ Persado | Co-founder of the Data Science blog: https://predictivehacks.com/