How to Generate Structured JSON Outputs with Lists and Dictionaries in LangChain

How to generate complicated structured outputs with LangChain

George Pipis

LLMs return plain text. However, we often want structured responses so that we can analyze them more easily. The LangChain library contains several output parser classes that can structure the responses of the LLMs. The two main methods of the output parser classes are:

  • “Get format instructions” (get_format_instructions): A method that returns a string with instructions about the format of the LLM output
  • “Parse” (parse): A method that parses the unstructured response from the LLM into a structured format
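
To make the two-method contract concrete, here is a minimal sketch that does not use LangChain at all. The class name SimpleJsonParser, its constructor, and the wording of the instruction string are assumptions made purely for illustration; this is not how LangChain implements its parsers. The reply string is written by hand and stands in for an LLM response.

```python
import json


class SimpleJsonParser:
    """Hypothetical stand-in for an output parser (not a LangChain class)."""

    def __init__(self, keys):
        # Keys that the parsed JSON object is required to contain.
        self.keys = list(keys)

    def get_format_instructions(self) -> str:
        # Text meant to be appended to the prompt so the model knows
        # which format its answer should follow.
        return (
            "Return only a JSON object with exactly these keys: "
            + ", ".join(self.keys)
        )

    def parse(self, text: str) -> dict:
        # Turn the raw reply into a dict and check the required keys.
        data = json.loads(text)
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        missing = [k for k in self.keys if k not in data]
        if missing:
            raise ValueError(f"missing keys: {missing}")
        return data


parser = SimpleJsonParser(["year", "location"])
instructions = parser.get_format_instructions()

# Hand-written string standing in for an LLM reply.
reply = '{"year": 2016, "location": "Rio de Janeiro"}'
parsed = parser.parse(reply)
```

The instruction string goes into the prompt and the parser is applied to whatever text comes back, so both methods describe the same format from two sides.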

You can find an explanation of the output parsers, with examples, in the LangChain documentation. In this tutorial, we will show you something that is not covered there: how to generate a list of different objects as structured output.

Example of Structured Outputs of Lists and Dictionaries

Let’s say that I would like to get the following information:

  • The year of the Olympics
  • The location of the Olympics
  • The top-3 countries in terms of gold medals
  • The gold medals of the top-3 countries

We would like the output of the LLM to be a JSON object whose keys are the required outputs, such as year, location and so on, and whose values are either lists (for year and location) or dictionaries (for the top-3 countries and their corresponding medals).
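
As a rough illustration of such a target, the sketch below describes one possible shape with a TypedDict and checks a sample against it by hand. The key names (year, location, top3_gold_medals), the choice of one dictionary per Olympics mapping country to gold medals, and the helper matches_shape are all assumptions for this example, not the structure the tutorial necessarily ends up with. The sample JSON is written by hand, not produced by a model.

```python
import json
from typing import Dict, List, TypedDict


class OlympicsSummary(TypedDict):
    # Hypothetical shape: parallel lists with one entry per Olympics.
    year: List[int]
    location: List[str]
    # One dict per Olympics, mapping country name to gold medals.
    top3_gold_medals: List[Dict[str, int]]


def matches_shape(data: dict) -> bool:
    # TypedDict is not enforced at runtime, so the structure is checked by hand.
    if set(data) != set(OlympicsSummary.__annotations__):
        return False
    years = data["year"]
    places = data["location"]
    medals = data["top3_gold_medals"]
    if not all(isinstance(x, list) for x in (years, places, medals)):
        return False
    if not (len(years) == len(places) == len(medals)):
        return False
    return (
        all(isinstance(y, int) for y in years)
        and all(isinstance(p, str) for p in places)
        and all(
            isinstance(m, dict)
            and len(m) == 3
            and all(isinstance(k, str) and isinstance(v, int) for k, v in m.items())
            for m in medals
        )
    )


# Hand-written sample in the assumed shape.
sample_json = """
{
  "year": [2016, 2020],
  "location": ["Rio de Janeiro", "Tokyo"],
  "top3_gold_medals": [
    {"United States": 46, "Great Britain": 27, "China": 26},
    {"United States": 39, "China": 38, "Japan": 27}
  ]
}
"""
sample = json.loads(sample_json)
is_valid = matches_shape(sample)
```

A schema-aware parser would normally do this validation for us; the manual check only makes explicit what "lists and dictionaries inside one JSON object" means here.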

Let’s start coding by loading the required libraries:

from langchain.prompts import (
    PromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI

from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from typing import List, Dict, TypedDict

chat_model =…

George Pipis

Sr. Director, Data Scientist @ Persado | Co-founder of the Data Science blog: https://predictivehacks.com/