iblt

package module
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 30, 2025 License: MIT Imports: 5 Imported by: 0

README

go-iblite

GitHub Tag Build Status Go benchmarks MIT License Docs

Overview

This is an implementation of an Invertible Bloom Lookup Table in Go. The name comes from IBLT lite; in both sense of simple and efficient. This package implements two variants of IBLT: key-only and key-value.

Invertible Bloom Lookup Tables, introduced by Michael T. Goodrich, Michael Mitzenmacher are a probabilistic data structure that compactly represents a set and can be decoded to recover its elements, provided the structure is not too full.

Unlike a standard Bloom filter, an IBLT is invertible: it stores enough aggregate information to reconstruct individual entries via a peeling process.

IBLTs are commonly used for set reconciliation between two parties. By building an IBLT from each set and subtracting them, the result encodes only the set difference, which can be decoded efficiently. This allows large sets with small differences to be synchronized with communication proportional to the difference size, not the total set size.

For a more detailed explanation of the algorithm, see b5's great explanation video.

Drawbacks

IBLTs are great but still require to dimension them beforehand, proportionally to the set difference, which can be a challenge in some cases. A great extension of that idea are Rateless IBLTs, which are not implemented here.

Example

// Let's perform a set reconciliation between Alice and Bob.
// Each of them has a set of keys, and we want to find the keys that the other party doesn't
// have, so they can synchronize the differences. This is a common problem in distributed
// systems, databases, etc. In particular, it's useful in scenarios where the difference between
// two sets is small, but the total size of the sets is large.

// Each creates an IBLT large enough to hold the expected **difference** between the two sets.
alice := iblt.NewKTable(20, 4)
bob := iblt.NewKTable(20, 4)

// Each inserts the keys they have into their respective IBLT.
// Here, we insert 1 million keys, far, far more than the IBLT can hold without saturating.
// Bob will have some keys missing and some extra keys compared to Alice.
for i := uint64(0); i < 1_000_000; i++ {
    alice.Insert(i)
}
for i := uint64(5); i < 1_000_005; i++ {
    bob.Insert(i)
}

// Bob transmits his IBLT to Alice, and Alice subtracts it from her own.
bobBytes := bob.ToBytes()
received, err := iblt.KTableFromBytes(bobBytes)
if err != nil {
    panic(err)
}

// Just to illustrate, we'll print the size of a million keys, and the size of the IBLT.
fmt.Printf("1 million keys: %d bytes\n", 1_000_000*8)
fmt.Printf("IBLT size: %d bytes\n", len(bobBytes))

// Now the magic trick:
// Alice subtracts the received IBLT from her own, and peel (decode) the missing keys.
alice.Subtract(received)

fmt.Println()
fmt.Println("Keys that Alice doesn't have:")
misses := alice.Copy()
for key := range misses.PeelMisses() {
fmt.Println(key)
}
fmt.Println("Peeling completed:", misses.Empty())
fmt.Println()
fmt.Println("Keys that Bob doesn't have:")
has := bob.Copy()
for key := range alice.Copy().PeelHas() {
fmt.Println(key)
}
fmt.Println("Peeling completed:", has.Empty())

// Output:
// 1 million keys: 8000000 bytes
// IBLT size: 484 bytes
//
// Keys that Alice doesn't have:
// 1000000
// 1000004
// 1000003
// 1000002
// 1000001
// Peeling completed: true
//
// Keys that Bob doesn't have:
// 3
// 0
// 4
// 1
// 2
// Peeling completed: false

License

MIT

Documentation

Overview

Example
package main

import (
	"fmt"

	iblt "github.com/MichaelMure/go-iblite"
)

func main() {
	// Let's perform a set reconciliation between Alice and Bob.
	// Each of them has a set of keys, and we want to find the keys that the other party doesn't have, so they can
	// synchronize the differences. This is a common problem in distributed systems, databases, etc.
	// In particular, it's useful in scenarios where the difference between two sets is small, but the total size of
	// the sets is large.

	// Each creates an IBLT large enough to hold the expected **difference** between the two sets.
	alice := iblt.NewKTable(20, 4)
	bob := iblt.NewKTable(20, 4)

	// Each inserts the keys they have into their respective IBLT.
	// Here, we insert 1 million keys, far, far more than the IBLT can hold without saturating.
	// Bob will have some keys missing and some extra keys compared to Alice.
	for i := uint64(0); i < 1_000_000; i++ {
		alice.Insert(i)
	}
	for i := uint64(5); i < 1_000_005; i++ {
		bob.Insert(i)
	}

	// Bob transmits his IBLT to Alice, and Alice subtracts it from her own.
	bobBytes := bob.ToBytes()
	received, err := iblt.KTableFromBytes(bobBytes)
	if err != nil {
		panic(err)
	}

	// Just to illustrate, we'll print the size of a million keys, and the size of the serialized IBLT.
	fmt.Printf("1 million keys: %d bytes\n", 1_000_000*8)
	fmt.Printf("IBLT size: %d bytes\n", len(bobBytes))

	// Now the magic trick: Alice subtracts the received IBLT from her own, and peel (decode) the missing keys.
	alice.Subtract(received)

	fmt.Println()
	fmt.Println("Keys that Alice doesn't have:")
	misses := alice.Copy()
	for key := range misses.PeelMisses() {
		fmt.Println(key)
	}
	fmt.Println("Peeling completed:", misses.Empty())
	fmt.Println()
	fmt.Println("Keys that Bob doesn't have:")
	has := bob.Copy()
	for key := range alice.Copy().PeelHas() {
		fmt.Println(key)
	}
	fmt.Println("Peeling completed:", has.Empty())

}
Output:

1 million keys: 8000000 bytes
IBLT size: 484 bytes

Keys that Alice doesn't have:
1000000
1000004
1000003
1000002
1000001
Peeling completed: true

Keys that Bob doesn't have:
3
0
4
1
2
Peeling completed: false

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type KTable

type KTable struct {
	// contains filtered or unexported fields
}

KTable is an Invertible Bloom Lookup Table, which only holds keys, not values. It's a probabilistic concise data structure for set representation that supports a peeling operation as the recovery of the elements in the represented set. It can be used in particular for set reconciliation, see the example. KTable is NOT thread safe.

func KTableFromBytes

func KTableFromBytes(data []byte) (*KTable, error)

func KTableFromReader

func KTableFromReader(r io.Reader) (*KTable, error)

func NewKTable

func NewKTable(bucketCount, hashCount int) *KTable

NewKTable creates a new key-only IBLT with the given parameters. - hashCount parameter is the number of hashes applied to each key. - bucketCount parameter is the number of buckets. Those parameters directly affect the memory usage of the IBLT, as well as the probability to succeed in a peeling operation.

func (*KTable) CompatibleWith

func (t *KTable) CompatibleWith(other *KTable) bool

CompatibleWith returns true if the two IBLTs have the same parameters (hashCount and bucketCount).

func (*KTable) Copy

func (t *KTable) Copy() *KTable

Copy returns a deep copy of the IBLT.

func (*KTable) Count

func (t *KTable) Count() int

Count returns the number of keys in the IBLT. IMPORTANT: this number is only accurate if: - no Subtract has been done - no Delete with a key not in the IBLT has been done

func (*KTable) Delete

func (t *KTable) Delete(key uint64)

Delete removes the given key from the IBLT.

func (*KTable) Empty

func (t *KTable) Empty() bool

Empty returns true if the IBLT is empty.

func (*KTable) Equals

func (t *KTable) Equals(other *KTable) bool

Equals returns true if the two IBLTs are identical.

func (*KTable) Insert

func (t *KTable) Insert(key uint64)

Insert adds the given key to the IBLT.

func (*KTable) Peel

func (t *KTable) Peel() iter.Seq2[uint64, int64]

Peel returns a sequence of (key, count) pairs. This function removes the key from the IBLT, so consider Copy() if you want to keep the original IBLT. The count is 1 if the key is present in the IBLT, -1 if it is missing (following a Subtract). This allows doing set reconciliation with two IBLTs (A and B), A.Subtract(B), then A.Peel(): - a count of 1 means that the key is in A, but not in B - a count of -1 means that the key is in B, but not in A IMPORTANT: the peeling process can fail if the IBLT is saturated. This function, however, will NOT return an error in that case. Instead, you NEED to call Empty() to check if the process completed successfully.

func (*KTable) PeelHas

func (t *KTable) PeelHas() iter.Seq[uint64]

PeelHas is the same as Peel, but only returns keys that are present in the IBLT (count==1).

func (*KTable) PeelMisses

func (t *KTable) PeelMisses() iter.Seq[uint64]

PeelMisses is the same as Peel, but only returns keys that are missing from the IBLT (count==-1).

func (*KTable) Subtract

func (t *KTable) Subtract(other *KTable)

Subtract subtracts the given IBLT from this one. This function panics if the two IBLTs are not compatible.

func (*KTable) ToBytes

func (t *KTable) ToBytes() []byte

ToBytes returns the serialized representation of the key-only IBLT.

func (*KTable) ToWriter

func (t *KTable) ToWriter(w io.Writer) error

ToWriter serializes the IBLT to the given writer.

type KV

type KV struct {
	Key   uint64
	Value []byte
}

type KVTable

type KVTable struct {
	// contains filtered or unexported fields
}

KVTable is an Invertible Bloom Lookup Table, which holds keys and values. It's a probabilistic concise data structure for set representation that supports a peeling operation as the recovery of the elements in the represented set. It can be used in particular for set reconciliation, see the example. KVTable is NOT thread safe.

func KVTableFromBytes

func KVTableFromBytes(data []byte) (*KVTable, error)

func KVTableFromReader

func KVTableFromReader(r io.Reader) (*KVTable, error)

func NewKVTable

func NewKVTable(bucketCount, hashCount, valueMaxLen int) *KVTable

NewKVTable creates a new key-only IBLT with the given parameters. - hashCount parameter is the number of hashes applied to each key. - bucketCount parameter is the number of buckets. - valueMaxLen parameter is the maximum length in byte of the stored values. Be aware that all buckets will accommodate for that maximum size. Those parameters directly affect the memory usage of the IBLT, as well as the probability to succeed in a peeling operation.

func (*KVTable) CompatibleWith

func (t *KVTable) CompatibleWith(other *KVTable) bool

CompatibleWith returns true if the two IBLTs have the same parameters (hashCount and bucketCount).

func (*KVTable) Copy

func (t *KVTable) Copy() *KVTable

Copy returns a deep copy of the IBLT.

func (*KVTable) Count

func (t *KVTable) Count() int

Count returns the number of keys in the IBLT. IMPORTANT: this number is only accurate if: - no Subtract has been done - no Delete with a key not in the IBLT has been done

func (*KVTable) Delete

func (t *KVTable) Delete(key uint64, value []byte)

Delete removes the given key/value from the IBLT.

func (*KVTable) Empty

func (t *KVTable) Empty() bool

Empty returns true if the IBLT is empty.

func (*KVTable) Equals

func (t *KVTable) Equals(other *KVTable) bool

Equals returns true if the two IBLTs are identical.

func (*KVTable) Insert

func (t *KVTable) Insert(key uint64, value []byte)

Insert adds the given key/value to the IBLT.

func (*KVTable) Peel

func (t *KVTable) Peel() iter.Seq2[KV, int64]

Peel returns a sequence of ([key,value], count) pairs. This function removes the key from the IBLT, so consider Copy() if you want to keep the original IBLT. The count is 1 if the key is present in the IBLT, -1 if it is missing (following a Subtract). This allows doing set reconciliation with two IBLTs (A and B), A.Subtract(B), then A.Peel(): - a count of 1 means that the key is in A, but not in B - a count of -1 means that the key is in B, but not in A IMPORTANT: the peeling process can fail if the IBLT is saturated. This function, however, will NOT return an error in that case. Instead, you NEED to call Empty() to check if the process completed successfully. IMPORTANT: the IBLT doesn't conserve the value length, so the returned values will always be of length t.valueMaxLen.

func (*KVTable) PeelHas

func (t *KVTable) PeelHas() iter.Seq2[uint64, []byte]

PeelHas is the same as Peel, but only returns key/values that are present in the IBLT (count==1).

func (*KVTable) PeelMisses

func (t *KVTable) PeelMisses() iter.Seq2[uint64, []byte]

PeelMisses is the same as Peel, but only returns key/values that are missing from the IBLT (count==-1).

func (*KVTable) Subtract

func (t *KVTable) Subtract(other *KVTable)

Subtract subtracts the given IBLT from this one. This function panics if the two IBLTs are not compatible.

func (*KVTable) ToBytes

func (t *KVTable) ToBytes() []byte

ToBytes returns the serialized representation of the key-only IBLT.

func (*KVTable) ToWriter

func (t *KVTable) ToWriter(w io.Writer) error

ToWriter serializes the IBLT to the given writer.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL