vision4R: Vision API Function using OpenAI's Vision API

View source: R/vision4R.R

vision4R    R Documentation

Vision API Function using OpenAI's Vision API

Description

This function sends a local image, along with a text prompt, to OpenAI's Vision API. The function encodes the image in Base64 format and constructs a JSON payload in which the user's message contains both text and an image URL (a data URI), following the multimodal message format of OpenAI's Chat Completions API.
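
For illustration, a minimal sketch of how such a request can be constructed (an assumption-based outline, not the package's exact implementation; it assumes the base64enc and httr packages are available):

  # Encode the local image as Base64 and embed it as a data URI
  img_b64 <- base64enc::base64encode("path/to/image.png")
  payload <- list(
    model = "gpt-4o-mini",
    temperature = 1,
    messages = list(list(
      role = "user",
      content = list(
        list(type = "text", text = "What is depicted in this image?"),
        list(type = "image_url",
             image_url = list(url = paste0("data:image/png;base64,", img_b64)))
      )
    ))
  )
  # POST the JSON payload to the Chat Completions endpoint
  res <- httr::POST(
    "https://api.openai.com/v1/chat/completions",
    httr::add_headers(Authorization = paste("Bearer", Sys.getenv("OPENAI_API_KEY"))),
    body = payload,
    encode = "json"
  )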

Usage

vision4R(
  image_path,
  user_prompt = "What is depicted in this image?",
  Model = "gpt-4o-mini",
  temperature = 1,
  api_key = Sys.getenv("OPENAI_API_KEY")
)

Arguments

image_path

A string specifying the path to the image file. The image format should be PNG or JPEG.

user_prompt

A string containing the text prompt. Default: "What is depicted in this image?".

Model

The model to use. Defaults to "gpt-4o-mini". Allowed values: "gpt-4-turbo", "gpt-4o-mini".

temperature

A numeric value controlling the randomness of the output (default: 1).

api_key

Your OpenAI API key. Defaults to the value of the environment variable 'OPENAI_API_KEY'.

Value

A data frame containing the model's response.

Author(s)

Satoshi Kume

Examples

## Not run: 
  # Example usage of the function
  api_key <- "YOUR_OPENAI_API_KEY"
  file_path <- "path/to/your/image_file.png"
  vision4R(image_path = file_path, api_key = api_key)
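
  # Alternative usage sketch (assumes the API key has been set in the
  # environment, e.g. via Sys.setenv(OPENAI_API_KEY = "YOUR_OPENAI_API_KEY")):
  result <- vision4R(image_path = file_path)
  print(result)  # data frame containing the model's response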

## End(Not run)
