API Reference

Message Parts

class kani.ext.vision.ImagePart

Base class for all image message parts.

Generally, you shouldn’t construct this directly. Instead, use one of the classmethods to initialize the image from a file path, binary data, a Pillow image, or a URL.

static from_path(fp: str | bytes | PathLike)

Load an image from a path on the local filesystem.

static from_bytes(data: bytes)

Load an image from binary data in memory.

static from_image(image: Image)

Create an image part from an existing PIL.Image.Image.

async classmethod from_url(url: str, remote: bool = True)

Create an image part from a URL.

If remote is True, this will not download the image - it will be up to the engine to do so!

Attention

Note that this classmethod is asynchronous, unlike the other classmethods!

This is because we need to check the image headers and metadata before returning a valid image part.
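As a minimal sketch of the difference, the helper below mixes a synchronous constructor with the asynchronous from_url, using only the classmethods and properties documented above. The function names, file path, and URL are placeholders chosen for illustration, and the import is deferred into the function so the snippet can be loaded even where the extension is not installed.

```python
import asyncio


async def build_image_parts(path: str, url: str) -> list:
    """Build one local and one remote image part from placeholder inputs."""
    # Deferred import: requires the kani vision extension to be installed.
    from kani.ext.vision import ImagePart

    # from_path is an ordinary synchronous call.
    local_part = ImagePart.from_path(path)
    # from_url is a coroutine, so it has to be awaited.
    remote_part = await ImagePart.from_url(url, remote=True)
    return [local_part, remote_part]


def main() -> None:
    # Hypothetical inputs: replace with a real file and a reachable image URL.
    parts = asyncio.run(
        build_image_parts("path/to/image.png", "https://example.com/image.png")
    )
    for part in parts:
        print(type(part).__name__, part.mime, part.size)
```

Calling main() from synchronous code drives the coroutine with asyncio.run; inside an already running event loop, await build_image_parts(...) directly instead.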

property image: Image

Get a PIL.Image.Image representing the image.

property bytes: bytes

The binary image data.

property b64: str

The binary image data encoded in a base64 string.

Note that this is not a web-suitable data:image/... string; it is only the base64 encoding of the raw image bytes. Use b64_uri for a web-suitable string.

property b64_uri: str

Get the binary image data encoded in a web-suitable base64 string.
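The distinction between the two base64 properties can be illustrated with the standard library alone. This is a sketch of the general idea, not the library's implementation: the helper names to_b64 and to_data_uri are hypothetical, and the exact string produced by b64_uri is assumed to follow the usual data URI layout.

```python
import base64


def to_b64(data: bytes) -> str:
    """Plain base64 text of the raw bytes (analogous to the b64 property)."""
    return base64.b64encode(data).decode("ascii")


def to_data_uri(data: bytes, mime: str) -> str:
    """Base64 wrapped in a data URI (assumed shape of the b64_uri property)."""
    return f"data:{mime};base64,{to_b64(data)}"


# The 8-byte PNG file signature stands in for real image data here.
png_signature = b"\x89PNG\r\n\x1a\n"
plain = to_b64(png_signature)
uri = to_data_uri(png_signature, "image/png")
print(plain)
print(uri)
```

The plain form is what an API expecting a bare base64 payload would take, while the data URI form can be dropped into contexts that expect a URL, such as an HTML img src attribute.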

property size: tuple[int, int]

Get the size of the image, in pixels.

property mime: str

Get the MIME filetype of the image.

The following classes are the types constructed by the ImagePart methods.

class kani.ext.vision.parts.FileImagePart(*, path: Path)

An image whose data lives at the given file path.

Use ImagePart.from_path() to construct.

class kani.ext.vision.parts.BytesImagePart(*, data: bytes)

An image whose data lives in memory.

Use ImagePart.from_bytes() to construct.

class kani.ext.vision.parts.PillowImagePart(*, pil_image: Image)

An image represented by a Pillow Image.

Use ImagePart.from_image() to construct.

class kani.ext.vision.parts.RemoteURLImagePart(*, url: str, size_: tuple[int, int], mime_: str)

A reference to a remote image stored at the given URL.

Use ImagePart.from_url() to construct.
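A remote part carries a size and a MIME type without holding the image data itself, so both have to be derived from metadata. The sketch below shows one way such values can be read from the first bytes of a file using only the standard library. It is an illustration under stated assumptions, not how the library does it: the function names png_size and sniff_mime are hypothetical, and only PNG dimensions are handled.

```python
import struct
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def sniff_mime(header: bytes) -> Optional[str]:
    """Guess a MIME type from well-known magic bytes, or None if unknown."""
    if header.startswith(PNG_SIGNATURE):
        return "image/png"
    if header.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "image/gif"
    return None


def png_size(header: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Width and height are big-endian unsigned 32-bit integers at bytes 16..23.
    width, height = struct.unpack(">II", header[16:24])
    return width, height
```

Only the first 24 bytes of a PNG are needed for this, which is why a header check can be much cheaper than downloading the whole image.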

Engines

class kani.ext.vision.engines.openai.OpenAIVisionEngine(api_key: str | None = None, model='gpt-4-vision-preview', *args, **kwargs)

Bases: OpenAIEngine

Engine for using vision models on the OpenAI API.

This engine supports all vision-language models, chat-based models, and fine-tunes. It is a superset of the base OpenAIEngine.

Parameters:

api_key – Your OpenAI API key. By default, the API key will be read from the OPENAI_API_KEY environment variable.
model – The id of the model to use (e.g. “gpt-3.5-turbo”, “ft:gpt-3.5-turbo:my-org:custom_suffix:id”).
max_context_size – The maximum number of tokens allowed in the chat prompt. If None, uses the given model’s full context size.
organization – The OpenAI organization to use in requests (defaults to the API key’s default org).
retry – How many times the engine should retry failed HTTP calls with exponential backoff (default 5).
api_base – The base URL of the OpenAI API to use.
headers – A dict of HTTP headers to include with each request.
client – An instance of OpenAIClient (for reusing the same client in multiple engines). You must specify exactly one of (api_key, client). If this is passed, the organization, retry, api_base, and headers params will be ignored.
hyperparams – Any additional parameters to pass to OpenAIClient.create_chat_completion().
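The following sketch shows how the engine might be wired into a kani and sent one image, assuming kani and the vision extension are installed. The function names and the image path are placeholders, max_tokens is assumed to be accepted as one of the additional hyperparameters, and passing a list of strings and image parts to chat_round_str reflects kani's support for message parts. Imports are deferred into the function so the snippet loads without those packages.

```python
import asyncio
import os


async def describe_image(path: str, prompt: str = "Please describe this image:") -> str:
    """Send one local image plus a text prompt and return the model's reply."""
    # Deferred imports: require kani and the kani vision extension.
    from kani import Kani
    from kani.ext.vision import ImagePart
    from kani.ext.vision.engines.openai import OpenAIVisionEngine

    engine = OpenAIVisionEngine(
        api_key=os.environ["OPENAI_API_KEY"],
        model="gpt-4-vision-preview",
        max_tokens=512,  # assumption: forwarded to the API as a hyperparameter
    )
    ai = Kani(engine)
    try:
        # A chat round can take a list mixing plain text and image parts.
        return await ai.chat_round_str([prompt, ImagePart.from_path(path)])
    finally:
        # Release the engine's HTTP resources when done.
        await engine.close()


def main() -> None:
    # Hypothetical path: replace with a real image file.
    print(asyncio.run(describe_image("path/to/image.png")))
```

Running main() requires a valid OPENAI_API_KEY in the environment and network access; the engine could equally be created once and shared across several kani instances.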

message_len(message: ChatMessage) → int

Return the length, in tokens, of the given chat message.