API Reference
Message Parts
- class kani.ext.vision.ImagePart
Base class for all image message parts.
Generally, you shouldn’t construct this directly - instead, use one of the classmethods to initialize the image from a file path, binary, or Pillow image.
- static from_path(fp: str | bytes | PathLike)
Load an image from a path on the local filesystem.
- static from_image(image: Image)
Create an image part from an existing PIL.Image.Image.
- async classmethod from_url(url: str, remote: bool = True)
Create an image part from a URL.
If remote is True, this will not download the image - it will be up to the engine to do so!
Attention
Note that this classmethod is asynchronous, unlike the other classmethods!
This is because we need to check the image headers and metadata before returning a valid image part.
- property image: Image
Get a PIL.Image.Image representing the image.
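As a minimal sketch of how the constructors above might be combined, the helper below builds one part from a local file and one from a URL, then reads the `image` property. The function name `load_image_parts` and both arguments are hypothetical placeholders; only `from_path`, `from_url`, and `image` come from this reference.

```python
import asyncio


async def load_image_parts(local_path: str, url: str) -> list:
    # Imported inside the function so this sketch can be loaded even where
    # kani-ext-vision is not installed; calling it does require the package.
    from kani.ext.vision import ImagePart

    # from_path is a plain (synchronous) static method.
    local_part = ImagePart.from_path(local_path)

    # from_url is asynchronous and must be awaited. With the default
    # remote=True, the image itself is not downloaded here.
    remote_part = await ImagePart.from_url(url)

    # The image property exposes a PIL.Image.Image; size is (width, height).
    width, height = local_part.image.size
    print(f"local image is {width}x{height}")

    return [local_part, remote_part]


# Example call (placeholder arguments):
# asyncio.run(load_image_parts("path/to/image.png", "https://example.com/image.png"))
```

Because `from_url` is a coroutine, it has to be called from async code or driven by `asyncio.run`, whereas `from_path` and `from_image` can be called anywhere.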
The following classes are the types constructed by the ImagePart methods.
- class kani.ext.vision.parts.FileImagePart(*, path: Path)
An image whose data lives at the given file path.
Use ImagePart.from_path() to construct.
- class kani.ext.vision.parts.BytesImagePart(*, data: bytes)
An image whose data lives in memory.
Use ImagePart.from_bytes() to construct.
- class kani.ext.vision.parts.PillowImagePart(*, pil_image: Image)
An image represented by a Pillow Image.
Use ImagePart.from_image() to construct.
Engines
- class kani.ext.vision.engines.openai.OpenAIVisionEngine(api_key: str | None = None, model='gpt-4-vision-preview', *args, **kwargs)
Bases: OpenAIEngine
Engine for using vision models on the OpenAI API.
This engine supports all vision-language models, chat-based models, and fine-tunes. It is a superset of the base OpenAIEngine.
- Parameters:
api_key – Your OpenAI API key. By default, the API key will be read from the OPENAI_API_KEY environment variable.
model – The id of the model to use (e.g. “gpt-3.5-turbo”, “ft:gpt-3.5-turbo:my-org:custom_suffix:id”).
max_context_size – The maximum number of tokens allowed in the chat prompt. If None, uses the given model’s full context size.
organization – The OpenAI organization to use in requests (defaults to the API key’s default org).
retry – How many times the engine should retry failed HTTP calls with exponential backoff (default 5).
api_base – The base URL of the OpenAI API to use.
headers – A dict of HTTP headers to include with each request.
client – An instance of OpenAIClient (for reusing the same client in multiple engines). You must specify exactly one of (api_key, client). If this is passed, the organization, retry, api_base, and headers params will be ignored.
hyperparams – Any additional parameters to pass to OpenAIClient.create_chat_completion().
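The parameters above can be illustrated with a small, hedged sketch of constructing the engine. The helper name `make_vision_engine` and the choice of `retry=3` are illustrative assumptions; the keyword names themselves are taken from the parameter list.

```python
from typing import Optional


def make_vision_engine(api_key: Optional[str] = None):
    # Imported inside the function so this sketch can be loaded even where
    # kani-ext-vision is not installed; calling it does require the package.
    from kani.ext.vision.engines.openai import OpenAIVisionEngine

    return OpenAIVisionEngine(
        # Leaving api_key as None relies on the documented fallback to the
        # OPENAI_API_KEY environment variable.
        api_key=api_key,
        model="gpt-4-vision-preview",
        # Retry failed HTTP calls up to 3 times (the documented default is 5).
        retry=3,
    )
```

Since `client` and `api_key` are mutually exclusive, this sketch passes only `api_key`; an engine that reuses an existing `OpenAIClient` would pass `client=` instead and omit `retry`.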
- message_len(message: ChatMessage) → int
Return the length, in tokens, of the given chat message.
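As a rough sketch of how `message_len` might be used, the helper below measures a user message that mixes text with an image part. The function name `count_image_message_tokens` is hypothetical, and the sketch assumes that `ChatMessage.user` from the core kani package accepts a list of strings and message parts as its content.

```python
def count_image_message_tokens(engine, text: str, image_path: str) -> int:
    # Imported inside the function so this sketch can be loaded even where
    # kani and kani-ext-vision are not installed.
    from kani import ChatMessage
    from kani.ext.vision import ImagePart

    # Build a user message whose content is text followed by an image part
    # (assumption: ChatMessage.user accepts a list of str and message parts).
    message = ChatMessage.user([text, ImagePart.from_path(image_path)])

    # Ask the engine for the token length of the whole message.
    return engine.message_len(message)
```

Here `engine` is expected to be an `OpenAIVisionEngine` instance, so the returned count reflects that engine's own accounting for the message.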