Summarize Images

class sycamore.transforms.summarize_images.OpenAIImageSummarizer(openai_model: OpenAI | None = None, client_wrapper: OpenAIClientWrapper | None = None, prompt: str | None = None, include_context: bool = True)

Bases: LLMImageSummarizer

Implementation of the LLMImageSummarizer for OpenAI models.

Parameters:
  • openai_model -- The OpenAI instance to use. If not set, one will be created.

  • client_wrapper -- The OpenAIClientWrapper to use when creating an OpenAI instance. Not used if openai_model is set.

  • prompt -- The prompt to pass to the model, as a string.

  • include_context -- Whether to include the immediately preceding and following text elements as context.
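To make the include_context option concrete, the sketch below shows one way to collect the text of the elements immediately before and after an image in a flat element list. It is not Sycamore's implementation: the Element class, the surrounding_text helper, and the "Image" type string are hypothetical names chosen for illustration, and the rule that only adjacent non-image elements with text count as context is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Element:
    # Hypothetical stand-in for a partitioned document element.
    type: str
    text: Optional[str] = None


def _text_of(element: Element) -> Optional[str]:
    # Assumption: only non-image elements with non-empty text count as context.
    if element.type != "Image" and element.text:
        return element.text
    return None


def surrounding_text(elements: List[Element], image_index: int) -> Tuple[Optional[str], Optional[str]]:
    """Return (preceding_text, following_text) for the image at image_index."""
    before = _text_of(elements[image_index - 1]) if image_index > 0 else None
    after = _text_of(elements[image_index + 1]) if image_index + 1 < len(elements) else None
    return before, after
```

With include_context disabled, a summarizer built along these lines would skip this lookup and send only the image and the prompt.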

class sycamore.transforms.summarize_images.SummarizeImages(child: sycamore.plan_nodes.Node, summarizer=<sycamore.transforms.summarize_images.OpenAIImageSummarizer object>, **resource_args)

Bases: Map

SummarizeImages is a transform for summarizing images into text using an LLM.

Parameters:
  • child -- The source node for the transform.

  • summarizer -- The summarizer instance to use. The default is an OpenAIImageSummarizer that uses OpenAI gpt-4-turbo.

  • resource_args -- Additional resource-related arguments that can be passed to the underlying runtime.
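Because SummarizeImages derives from Map, it can be thought of as applying one function to each document independently. The sketch below imitates that shape in plain Python with a stub summarizer, so it runs without an OpenAI key. It is an illustration only: Element, Document, summarize_images_in_document, fake_summarizer, and the "summary" property key are all hypothetical names, not Sycamore's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Element:
    # Hypothetical stand-in for a document element.
    type: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Document:
    # Hypothetical stand-in for a document holding a list of elements.
    elements: List[Element] = field(default_factory=list)


def summarize_images_in_document(doc: Document, summarizer: Callable[[Element], str]) -> Document:
    """Attach a summary to every image element; leave other elements untouched."""
    for element in doc.elements:
        if element.type == "Image":
            # Assumption: the summary is stored under a "summary" property.
            element.properties["summary"] = summarizer(element)
    return doc


def fake_summarizer(element: Element) -> str:
    # Stub standing in for an LLM call.
    return "placeholder summary"


docs = [Document([Element("Text"), Element("Image")]), Document([Element("Image")])]
# Map-style application: each document is processed on its own.
summarized = [summarize_images_in_document(d, fake_summarizer) for d in docs]
```

Swapping fake_summarizer for a different callable mirrors what the summarizer parameter is for: the transform stays the same while the model behind it changes.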

Example

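import sycamore
from sycamore.transforms.partition import SycamorePartitioner
from sycamore.transforms.summarize_images import SummarizeImages

context = sycamore.init()
doc = (
    context.read.binary(paths=paths, binary_format="pdf")
    .partition(partitioner=SycamorePartitioner(extract_images=True))
    .transform(SummarizeImages)
)
doc.show()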