How to Connect External Data with GPT-3 Using LlamaIndex

A gentle introduction to connecting custom external data with an LLM using LlamaIndex

George Pipis


Image generated by DALL-E

In this tutorial, we will show you how to connect external data with OpenAI GPT-3 using LlamaIndex. As an example, we will use the book Alice’s Adventures in Wonderland, by Lewis Carroll. Once the book is connected to GPT-3, we will be able to ask questions and receive answers based on its content.

Installation and set-up

Using pip you can install the LlamaIndex library as follows:

pip install llama-index

You will also need to provide your OpenAI API key, either as an environment variable called OPENAI_API_KEY or by setting it from Python:

# My OpenAI Key
import os
os.environ['OPENAI_API_KEY'] = "INSERT OPENAI KEY"
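
A missing or placeholder key usually only shows up later as an authentication error, so it can help to fail early. The snippet below is a minimal sketch of such a check. The helper name require_openai_key is hypothetical and is not part of LlamaIndex or the OpenAI client.

```python
import os


def require_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for this tutorial; not part of any library.
    """
    key = os.environ.get(env_var, "").strip()
    # Treat the tutorial's placeholder string as "not set" as well.
    if not key or key == "INSERT OPENAI KEY":
        raise RuntimeError(f"{env_var} is not set; set it before running the tutorial.")
    return key
```

Calling require_openai_key() at the top of a script stops it with a clear message before any request is made.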

LlamaIndex Usage Pattern

The general usage pattern of LlamaIndex is as follows:

  1. Load in documents (either manually, or through a data loader)
  2. Construct Index (from Nodes or Documents)
  3. Query the index
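
To make the three steps concrete, here is a toy sketch of the same load, index, query pattern in plain Python. It is only an illustration and is not how LlamaIndex works internally. It assumes simple word-count vectors and cosine similarity in place of real embeddings, and all function names are made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(documents):
    """Step 2: pair each document with its bag-of-words vector."""
    return [(doc, Counter(tokenize(doc))) for doc in documents]


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def query(index, question):
    """Step 3: return the stored document most similar to the question."""
    q = Counter(tokenize(question))
    return max(index, key=lambda item: cosine(q, item[1]))[0]


# Step 1: "load" documents (hard-coded here; a data loader would read files).
documents = [
    "Alice follows the White Rabbit down the rabbit hole.",
    "The Queen of Hearts plays croquet with flamingos.",
]
index = build_index(documents)
best = query(index, "Who plays croquet?")
print(best)
```

In the real library, the retrieved text is then passed to the language model to produce an answer. This toy version stops at retrieval.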

The first task is to load the document. The book is a plain-text file called alice_in_wonderland.txt, stored in the data folder.

We can load the document by running:

from llama_index import SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
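
Before loading, it can be worth confirming that the data folder contains the file we expect. The snippet below is a small sketch using only the standard library. The helper list_text_files is hypothetical, and the folder name data is taken from the example above.

```python
from pathlib import Path


def list_text_files(folder):
    """Return the sorted .txt files directly inside `folder`."""
    path = Path(folder)
    if not path.is_dir():
        raise FileNotFoundError(f"Data folder not found: {path}")
    # Keep only regular files with a .txt extension, in a stable order.
    return sorted(p for p in path.iterdir() if p.is_file() and p.suffix == ".txt")
```

For example, [p.name for p in list_text_files('data')] should include alice_in_wonderland.txt if the folder is set up as described.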

Index Construction

We can construct an index over this document as follows:

from llama_index import GPTVectorStoreIndex

index = GPTVectorStoreIndex.from_documents(documents)

Save and Load the Index

The index is held in memory. To save it to disk and load it back later, we can run:

from llama_index import StorageContext, load_index_from_storage

# save the index to disk
index.storage_context.persist(persist_dir='./storage')

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir='./storage')

# load index
index =…
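
Building the index calls the OpenAI API, so in practice we may want to rebuild it only when nothing has been saved yet. The snippet below is a minimal sketch of that check. The helper has_persisted_index is hypothetical, and the ./storage path is the one used above. It only inspects the folder and does not call LlamaIndex.

```python
from pathlib import Path


def has_persisted_index(persist_dir="./storage"):
    """Return True if `persist_dir` exists and contains at least one file."""
    path = Path(persist_dir)
    return path.is_dir() and any(p.is_file() for p in path.iterdir())
```

A script could then build and persist the index when has_persisted_index() is False, and load it from storage otherwise.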


George Pipis

Sr. Director, Data Scientist @ Persado | Co-founder of the Data Science blog: https://predictivehacks.com/