Member-only story

How to Work with Images using ChatGPT Python API

A practical example of how to use the vision capabilities of ChatGPT

3 min readJun 11, 2024

The GPT-4o model has vision capabilities that enable us to answer questions about the images. Using the ChatGPT interface, you can upload the image and ask a question:

Python SKD

That’s cool, but it would be great if we could do the same task programmatically using the Python API. For this example, we assume that the user gets the images locally.

from openai import OpenAI

import base64
import requests
import os
# OpenAI API Key
api_key = os.environ['OPENAI_API_KEY']

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "images/dogs/image_1.jpg"

# Getting the base64 string
base64_image = encode_image(image_path)

headers = {
  "Content-Type": "application/json",
  "Authorization"…

How to Work with Images using ChatGPT Python API

A practical example of how to use the vision capabilities of ChatGPT

Python SKD

Written by George Pipis

No responses yet