Summarize

class sycamore.transforms.summarize.LLMElementTextSummarizer(llm: LLM, element_operator: Callable[[Element], bool] | None = None)

Bases: Summarizer

LLMElementTextSummarizer uses a specified LLM to summarize text data within elements of a document.

Parameters:
  • llm -- An instance of an LLM class to use for text summarization.

  • element_operator -- A callable that takes an Element and returns a bool indicating whether that element should be summarized. Default is None.
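A selection function matching the Callable[[Element], bool] signature could look like the sketch below. The Element dataclass here is a stand-in so the snippet runs without sycamore installed; the field names type and text_representation, the function long_text_only, and the 40-character threshold are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Stand-in for sycamore's Element (assumed fields; the real class has more)."""
    type: Optional[str] = None
    text_representation: Optional[str] = None


def long_text_only(element: Element, min_chars: int = 40) -> bool:
    """Return True for elements whose text is at least min_chars characters long."""
    text = element.text_representation
    return text is not None and len(text) >= min_chars


elements = [
    Element(type="Title", text_representation="Quarterly report"),
    Element(
        type="NarrativeText",
        text_representation="Revenue grew steadily across all regions during the quarter.",
    ),
    Element(type="Image"),  # no text at all
]

# Keep only the element types the predicate accepts.
selected = [e.type for e in elements if long_text_only(e)]
print(selected)
```

A function of this shape can be passed as element_operator so that short titles and text-free elements are skipped.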

Example

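import sycamore
from sycamore.transforms.summarize import LLMElementTextSummarizer

# OpenAILanguageModel, UnstructuredPdfPartitioner, my_element_selector, and paths are assumed to be defined elsewhere.
llm_model = OpenAILanguageModel("gpt-3.5-turbo")
element_operator = my_element_selector  # A custom element selection function
summarizer = LLMElementTextSummarizer(llm_model, element_operator)

context = sycamore.init()
pdf_docset = (
    context.read.binary(paths, binary_format="pdf")
    .partition(partitioner=UnstructuredPdfPartitioner())
    .summarize(summarizer=summarizer)
)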

class sycamore.transforms.summarize.Summarize(child: Node, summarizer: Summarizer, **kwargs)

Bases: NonCPUUser, NonGPUUser, Map

The summarize transform generates summaries of documents or elements.
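The flow of a map-style summarize step can be sketched with plain Python, as below. This is a toy illustration, not sycamore's implementation. The Element and Document dataclasses, the first_sentence placeholder used in place of an LLM call, and the choice to store the result under a "summary" property are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Element:
    # Stand-in for a document element; not sycamore's Element class.
    text: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Document:
    # Stand-in for a document holding a list of elements.
    elements: List[Element]


def first_sentence(text: str) -> str:
    # Placeholder for an LLM call: keep only the first sentence.
    head, sep, _ = text.partition(". ")
    return head + "." if sep else text


def summarize_document(
    doc: Document,
    summarize_text: Callable[[str], str],
    element_operator: Optional[Callable[[Element], bool]] = None,
) -> Document:
    # Summarize every element, or only those the predicate accepts.
    for element in doc.elements:
        if element_operator is None or element_operator(element):
            element.properties["summary"] = summarize_text(element.text)
    return doc


docs = [
    Document([Element("Sales rose in Q3. Costs were flat."), Element("Appendix")]),
]

# Apply the per-document function to each document, as a map-style transform would.
summarized = [
    summarize_document(d, first_sentence, lambda e: len(e.text) > 10) for d in docs
]
print([e.properties for e in summarized[0].elements])
```

The short "Appendix" element is left without a summary because the predicate rejects it, which mirrors the role element_operator plays in the example above.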