How to Work with Images using ChatGPT Python API

A practical example of how to use the vision capabilities of ChatGPT

George Pipis
3 min readJun 11, 2024

The GPT-4o model has vision capabilities that enable us to answer questions about the images. Using the ChatGPT interface, you can upload the image and ask a question:

Python SKD

That’s cool, but it would be great if we could do the same task programmatically using the Python API. For this example, we assume that the user gets the images locally.

from openai import OpenAI

import base64
import requests
import os
# OpenAI API Key
api_key = os.environ['OPENAI_API_KEY']

# Function to encode the image
def encode_image(image_path):
with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "images/dogs/image_1.jpg"

# Getting the base64 string
base64_image = encode_image(image_path)

headers = {
"Content-Type": "application/json",
"Authorization"…

--

--

George Pipis
George Pipis

Written by George Pipis

Sr. Director, Data Scientist @ Persado | Co-founder of the Data Science blog: https://predictivehacks.com/

No responses yet