contentdiff

package
v0.9.0 Latest
Published: Jan 18, 2026 License: AGPL-3.0 Imports: 7 Imported by: 0

Documentation

Overview

Package contentdiff provides HTML comparison and difference detection functionality.

This package compares HTML documents to identify changes between different versions of a bookmarked web page. It extracts and compares:

  • Text content (rendered text from the page)
  • Links (href and anchor text)
  • Multimedia elements (images, videos)

The comparison uses the Myers diff algorithm (via sergi/go-diff) to compute differences in text content. For links and multimedia, it performs set-based comparison to identify additions and removals.
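
The set-based comparison described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `diffLinkSets` is a hypothetical helper, the `Link` and `LinkDiff` types only mirror the structs documented below, and the `"added"` and `"removed"` values for `Type` are assumptions.

```go
package main

import "fmt"

// Link mirrors the Link struct documented in this package.
type Link struct {
	Href string
	Text string
}

// LinkDiff mirrors the LinkDiff struct documented in this package.
type LinkDiff struct {
	Link Link
	Type string
}

// diffLinkSets is a hypothetical helper (not the package's DiffLink).
// Links found only in oldLinks are reported as "removed" and links found
// only in newLinks as "added". The Type strings are assumed values.
// Duplicate links in an input are reported once per occurrence.
func diffLinkSets(oldLinks, newLinks []Link) []LinkDiff {
	oldSet := make(map[Link]bool, len(oldLinks))
	for _, l := range oldLinks {
		oldSet[l] = true
	}
	newSet := make(map[Link]bool, len(newLinks))
	for _, l := range newLinks {
		newSet[l] = true
	}

	var diffs []LinkDiff
	for _, l := range oldLinks {
		if !newSet[l] {
			diffs = append(diffs, LinkDiff{Link: l, Type: "removed"})
		}
	}
	for _, l := range newLinks {
		if !oldSet[l] {
			diffs = append(diffs, LinkDiff{Link: l, Type: "added"})
		}
	}
	return diffs
}

func main() {
	oldLinks := []Link{{"/a", "A"}, {"/b", "B"}}
	newLinks := []Link{{"/b", "B"}, {"/c", "C"}}
	for _, d := range diffLinkSets(oldLinks, newLinks) {
		fmt.Printf("[%s] %s (%s)\n", d.Type, d.Link.Href, d.Link.Text)
	}
}
```

Because a link is keyed on both `Href` and `Text`, a link whose anchor text changes shows up in this sketch as one removal plus one addition.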

The package is used to show users what has changed on a bookmarked page between snapshots, making it easy to track content updates, new links, or removed sections.

Example usage:

reader1 := strings.NewReader(oldHTML)
reader2 := strings.NewReader(newHTML)
diffs, err := contentdiff.DiffHTML(reader1, reader2)
if err != nil {
    return err
}

// Display text changes
for _, d := range diffs.Text {
    fmt.Printf("[%s] %s\n", d.Type, d.Text)
}

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Diffs

type Diffs struct {
	Text       TextDiffs `json:"text"`
	Multimedia TextDiffs `json:"multimedia"`
	Link       LinkDiffs `json:"link"`
}

Diffs contains all types of differences between two HTML documents.
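
The struct tags indicate how a Diffs value would serialize to JSON. The sketch below mirrors the documented types locally (using plain slices in place of TextDiffs and LinkDiffs) so that it runs without importing the package. `encodeDiffs` is a hypothetical helper, and the `"added"` value for `Type` is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of the documented types, with the same JSON tags.
type TextDiff struct {
	Text string `json:"text"`
	Type string `json:"type"`
}

type Link struct {
	Href string `json:"href"`
	Text string `json:"text"`
}

type LinkDiff struct {
	Link Link   `json:"link"`
	Type string `json:"type"`
}

type Diffs struct {
	Text       []TextDiff `json:"text"`
	Multimedia []TextDiff `json:"multimedia"`
	Link       []LinkDiff `json:"link"`
}

// encodeDiffs is a hypothetical helper that marshals a Diffs value to JSON.
func encodeDiffs(d Diffs) (string, error) {
	b, err := json.Marshal(d)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	d := Diffs{
		Text: []TextDiff{{Text: "Hello", Type: "added"}}, // assumed Type value
		Link: []LinkDiff{{Link: Link{Href: "/c", Text: "C"}, Type: "added"}},
	}
	out, err := encodeDiffs(d)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(out)
}
```

In this sketch the unset Multimedia slice is nil, so encoding/json emits it as `null` rather than `[]`. Consumers of the JSON may need to handle both forms.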

func DiffHTML

func DiffHTML(r1, r2 io.Reader) (*Diffs, error)

DiffHTML compares two HTML documents and returns their differences.

type HTMLContent

type HTMLContent struct {
	Links      []Link   `json:"links"`
	Multimedia []string `json:"multimedia"`
	Text       string   `json:"text"`
}

HTMLContent represents extracted content from an HTML document.

func ExtractHTMLContent

func ExtractHTMLContent(r io.Reader) *HTMLContent

ExtractHTMLContent extracts text, links, and multimedia from an HTML document.

type Link

type Link struct {
	Href string `json:"href"`
	Text string `json:"text"`
}

Link represents a hyperlink in HTML content.

type LinkDiff

type LinkDiff struct {
	Link Link   `json:"link"`
	Type string `json:"type"`
}

LinkDiff represents a link difference.

type LinkDiffs added in v0.8.0

type LinkDiffs []LinkDiff

LinkDiffs is a list of LinkDiff items.

func DiffLink

func DiffLink(l1, l2 []Link) LinkDiffs

DiffLink compares two sets of links and returns their differences.

func (LinkDiffs) String added in v0.8.0

func (lds LinkDiffs) String() string

type TextDiff

type TextDiff struct {
	Text string `json:"text"`
	Type string `json:"type"`
}

TextDiff represents a text difference.

type TextDiffs added in v0.8.0

type TextDiffs []TextDiff

TextDiffs is a list of TextDiff items.

func DiffList

func DiffList(l1, l2 []string) TextDiffs

DiffList compares two string lists and returns their differences.

func DiffText

func DiffText(t1, t2 string) TextDiffs

DiffText compares two text strings and returns their differences.
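
To illustrate the kind of result a text diff produces, here is a small word-level sketch. It is deliberately not the Myers algorithm from sergi/go-diff that the package uses: it substitutes a simpler longest-common-subsequence table, using only the standard library. `diffWords` is a hypothetical helper, the `TextDiff` type only mirrors the documented struct, and the `"equal"`, `"removed"` and `"added"` values for `Type` are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// TextDiff mirrors the TextDiff struct documented in this package.
type TextDiff struct {
	Text string
	Type string
}

// diffWords is a hypothetical word-level diff built on a longest common
// subsequence (LCS) table. It stands in for, and is simpler than, the
// Myers diff used by the package. The Type strings are assumed values.
func diffWords(t1, t2 string) []TextDiff {
	a, b := strings.Fields(t1), strings.Fields(t2)

	// lcs[i][j] holds the LCS length of a[i:] and b[j:].
	lcs := make([][]int, len(a)+1)
	for i := range lcs {
		lcs[i] = make([]int, len(b)+1)
	}
	for i := len(a) - 1; i >= 0; i-- {
		for j := len(b) - 1; j >= 0; j-- {
			switch {
			case a[i] == b[j]:
				lcs[i][j] = lcs[i+1][j+1] + 1
			case lcs[i+1][j] >= lcs[i][j+1]:
				lcs[i][j] = lcs[i+1][j]
			default:
				lcs[i][j] = lcs[i][j+1]
			}
		}
	}

	// Walk the table from the start, emitting one entry per word.
	var diffs []TextDiff
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			diffs = append(diffs, TextDiff{Text: a[i], Type: "equal"})
			i++
			j++
		case lcs[i+1][j] >= lcs[i][j+1]:
			diffs = append(diffs, TextDiff{Text: a[i], Type: "removed"})
			i++
		default:
			diffs = append(diffs, TextDiff{Text: b[j], Type: "added"})
			j++
		}
	}
	for ; i < len(a); i++ {
		diffs = append(diffs, TextDiff{Text: a[i], Type: "removed"})
	}
	for ; j < len(b); j++ {
		diffs = append(diffs, TextDiff{Text: b[j], Type: "added"})
	}
	return diffs
}

func main() {
	for _, d := range diffWords("the quick fox", "the slow fox") {
		fmt.Printf("[%s] %s\n", d.Type, d.Text)
	}
}
```

The LCS table needs time and memory proportional to the product of the two word counts, which is one reason a production diff would typically prefer Myers for long documents.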

func (TextDiffs) String added in v0.8.0

func (tds TextDiffs) String() string
