Summarize Images
- class sycamore.transforms.summarize_images.OpenAIImageSummarizer(openai_model: OpenAI | None = None, client_wrapper: OpenAIClientWrapper | None = None, prompt: str | None = None, include_context: bool = True)
Bases: object
Image Summarizer that uses OpenAI GPT-4 Turbo to summarize the specified image.
The image is passed to OpenAI along with a text prompt and optionally the text elements immediately preceding and following the image.
- Parameters:
openai_model -- The OpenAI instance to use. If not set, one will be created.
client_wrapper -- The OpenAIClientWrapper to use when creating an OpenAI instance. Not used if openai_model is set.
prompt -- The prompt to pass to the model, as a string.
include_context -- Whether to include the immediately preceding and following text elements as context.
Example
The following code demonstrates how to partition a PDF DocSet and summarize the images it contains. This version uses the default prompt and does not pass the surrounding text elements as context.
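import sycamore
from sycamore.transforms.summarize_images import OpenAIImageSummarizer, SummarizeImages

context = sycamore.init()
doc = (
    context.read.binary(paths=paths, binary_format="pdf")
    .partition(partitioner=SycamorePartitioner(extract_images=True))
    # Pass the transform class plus its keyword arguments, as in the SummarizeImages example.
    .transform(SummarizeImages, summarizer=OpenAIImageSummarizer(include_context=False))
    .show()
)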
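The effect of include_context can be illustrated with a small, library-independent sketch. It is not Sycamore's implementation: the helper names (gather_context, build_prompt), the dict-based element layout, and the wording added to the prompt are all assumptions made for illustration. It only shows the idea of attaching the text elements immediately before and after an image to the base prompt.

```python
from typing import Optional


def gather_context(elements: list[dict], image_index: int) -> dict[str, Optional[str]]:
    """Return the text of the elements immediately before and after an image.

    Hypothetical helper: elements are plain dicts with "type" and "text" keys,
    not Sycamore Element objects.
    """
    def text_at(i: int) -> Optional[str]:
        # Use a neighbor only if it exists and is not itself an image.
        if 0 <= i < len(elements) and elements[i].get("type") != "Image":
            return elements[i].get("text")
        return None

    return {
        "preceding": text_at(image_index - 1),
        "following": text_at(image_index + 1),
    }


def build_prompt(prompt: str, elements: list[dict], image_index: int,
                 include_context: bool = True) -> str:
    """Combine the base prompt with optional neighboring text (assumed format)."""
    if not include_context:
        return prompt
    ctx = gather_context(elements, image_index)
    parts = [prompt]
    if ctx["preceding"]:
        parts.append(f"Text before the image: {ctx['preceding']}")
    if ctx["following"]:
        parts.append(f"Text after the image: {ctx['following']}")
    return "\n".join(parts)


# Toy document: an image sandwiched between two text elements.
elements = [
    {"type": "Text", "text": "Figure 1 shows quarterly revenue."},
    {"type": "Image", "text": None},
    {"type": "Text", "text": "Revenue grew in every quarter."},
]
```

With include_context=False the prompt is returned unchanged, which mirrors the example above; with the default, both neighboring text elements are appended to it.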
- class sycamore.transforms.summarize_images.SummarizeImages(child: sycamore.plan_nodes.Node, summarizer=<sycamore.transforms.summarize_images.OpenAIImageSummarizer object>, **resource_args)
Bases: Map
SummarizeImages is a transform that summarizes images into text using an LLM.
- Parameters:
child -- The source node for the transform.
summarizer -- The summarizer instance to use. The default is an OpenAIImageSummarizer, which uses OpenAI GPT-4 Turbo.
resource_args -- Additional resource-related arguments that can be passed to the underlying runtime.
Example
context = sycamore.init()
doc = (
    context.read.binary(paths=paths, binary_format="pdf")
    .partition(partitioner=SycamorePartitioner(extract_images=True))
    .transform(SummarizeImages)
    .show()
)
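Because SummarizeImages derives from Map, it can be thought of as applying a summarizer to each document independently. The following is a minimal sketch of that pattern in plain Python, under stated assumptions: documents are modeled as dicts with an "elements" list, the summarizer is any callable taking an image element, and the "summary" key is a made-up name. None of this reflects Sycamore's actual interfaces or property names.

```python
from typing import Callable


def summarize_images(doc: dict, summarizer: Callable[[dict], str]) -> dict:
    """Return a copy of doc in which every image element carries a summary."""
    new_elements = []
    for element in doc["elements"]:
        element = dict(element)  # copy so the input document is not mutated
        if element.get("type") == "Image":
            # "summary" is a hypothetical key chosen for this sketch.
            element["summary"] = summarizer(element)
        new_elements.append(element)
    return {**doc, "elements": new_elements}


def stub_summarizer(element: dict) -> str:
    # Stand-in for an LLM call; a real summarizer would send the image to a model.
    return f"image of {element['width']}x{element['height']} pixels"


# Map-style application: each document is processed on its own.
docs = [
    {"elements": [{"type": "Text", "text": "Intro"},
                  {"type": "Image", "width": 640, "height": 480}]},
    {"elements": [{"type": "Text", "text": "No images here"}]},
]
summarized = [summarize_images(d, stub_summarizer) for d in docs]
```

Swapping stub_summarizer for another callable changes how images are described without touching the per-document loop, which is the role the summarizer parameter plays in the transform above.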