iblt

package module

v0.1.0 Latest Latest Go to latest Published: Dec 30, 2025 License: MIT Imports: 5 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/MichaelMure/go-iblite

Links

Open Source Insights

README ¶

go-iblite

Overview

This is an implementation of an Invertible Bloom Lookup Table in Go. The name comes from IBLT lite; in both sense of simple and efficient. This package implements two variants of IBLT: key-only and key-value.

Invertible Bloom Lookup Tables, introduced by Michael T. Goodrich, Michael Mitzenmacher are a probabilistic data structure that compactly represents a set and can be decoded to recover its elements, provided the structure is not too full.

Unlike a standard Bloom filter, an IBLT is invertible: it stores enough aggregate information to reconstruct individual entries via a peeling process.

IBLTs are commonly used for set reconciliation between two parties. By building an IBLT from each set and subtracting them, the result encodes only the set difference, which can be decoded efficiently. This allows large sets with small differences to be synchronized with communication proportional to the difference size, not the total set size.

For a more detailed explanation of the algorithm, see b5's great explanation video.

Drawbacks

IBLTs are great but still require to dimension them beforehand, proportionally to the set difference, which can be a challenge in some cases. A great extension of that idea are Rateless IBLTs, which are not implemented here.

Example

// Let's perform a set reconciliation between Alice and Bob.
// Each of them has a set of keys, and we want to find the keys that the other party doesn't
// have, so they can synchronize the differences. This is a common problem in distributed
// systems, databases, etc. In particular, it's useful in scenarios where the difference between
// two sets is small, but the total size of the sets is large.

// Each creates an IBLT large enough to hold the expected **difference** between the two sets.
alice := iblt.NewKTable(20, 4)
bob := iblt.NewKTable(20, 4)

// Each inserts the keys they have into their respective IBLT.
// Here, we insert 1 million keys, far, far more than the IBLT can hold without saturating.
// Bob will have some keys missing and some extra keys compared to Alice.
for i := uint64(0); i < 1_000_000; i++ {
    alice.Insert(i)
}
for i := uint64(5); i < 1_000_005; i++ {
    bob.Insert(i)
}

// Bob transmits his IBLT to Alice, and Alice subtracts it from her own.
bobBytes := bob.ToBytes()
received, err := iblt.KTableFromBytes(bobBytes)
if err != nil {
    panic(err)
}

// Just to illustrate, we'll print the size of a million keys, and the size of the IBLT.
fmt.Printf("1 million keys: %d bytes\n", 1_000_000*8)
fmt.Printf("IBLT size: %d bytes\n", len(bobBytes))

// Now the magic trick:
// Alice subtracts the received IBLT from her own, and peel (decode) the missing keys.
alice.Subtract(received)

fmt.Println()
fmt.Println("Keys that Alice doesn't have:")
misses := alice.Copy()
for key := range misses.PeelMisses() {
fmt.Println(key)
}
fmt.Println("Peeling completed:", misses.Empty())
fmt.Println()
fmt.Println("Keys that Bob doesn't have:")
has := bob.Copy()
for key := range alice.Copy().PeelHas() {
fmt.Println(key)
}
fmt.Println("Peeling completed:", has.Empty())

// Output:
// 1 million keys: 8000000 bytes
// IBLT size: 484 bytes
//
// Keys that Alice doesn't have:
// 1000000
// 1000004
// 1000003
// 1000002
// 1000001
// Peeling completed: true
//
// Keys that Bob doesn't have:
// 3
// 0
// 4
// 1
// 2
// Peeling completed: false

License

MIT

Documentation ¶

Overview ¶

Example ¶

package main

import (
	"fmt"

	iblt "github.com/MichaelMure/go-iblite"
)

func main() {
	// Let's perform a set reconciliation between Alice and Bob.
	// Each of them has a set of keys, and we want to find the keys that the other party doesn't have, so they can
	// synchronize the differences. This is a common problem in distributed systems, databases, etc.
	// In particular, it's useful in scenarios where the difference between two sets is small, but the total size of
	// the sets is large.

	// Each creates an IBLT large enough to hold the expected **difference** between the two sets.
	alice := iblt.NewKTable(20, 4)
	bob := iblt.NewKTable(20, 4)

	// Each inserts the keys they have into their respective IBLT.
	// Here, we insert 1 million keys, far, far more than the IBLT can hold without saturating.
	// Bob will have some keys missing and some extra keys compared to Alice.
	for i := uint64(0); i < 1_000_000; i++ {
		alice.Insert(i)
	}
	for i := uint64(5); i < 1_000_005; i++ {
		bob.Insert(i)
	}

	// Bob transmits his IBLT to Alice, and Alice subtracts it from her own.
	bobBytes := bob.ToBytes()
	received, err := iblt.KTableFromBytes(bobBytes)
	if err != nil {
		panic(err)
	}

	// Just to illustrate, we'll print the size of a million keys, and the size of the serialized IBLT.
	fmt.Printf("1 million keys: %d bytes\n", 1_000_000*8)
	fmt.Printf("IBLT size: %d bytes\n", len(bobBytes))

	// Now the magic trick: Alice subtracts the received IBLT from her own, and peel (decode) the missing keys.
	alice.Subtract(received)

	fmt.Println()
	fmt.Println("Keys that Alice doesn't have:")
	misses := alice.Copy()
	for key := range misses.PeelMisses() {
		fmt.Println(key)
	}
	fmt.Println("Peeling completed:", misses.Empty())
	fmt.Println()
	fmt.Println("Keys that Bob doesn't have:")
	has := bob.Copy()
	for key := range alice.Copy().PeelHas() {
		fmt.Println(key)
	}
	fmt.Println("Peeling completed:", has.Empty())

}

Output:

1 million keys: 8000000 bytes
IBLT size: 484 bytes

Keys that Alice doesn't have:
1000000
1000004
1000003
1000002
1000001
Peeling completed: true

Keys that Bob doesn't have:
3
0
4
1
2
Peeling completed: false

Examples ¶

Package

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type KTable ¶

type KTable struct {
	// contains filtered or unexported fields
}

KTable is an Invertible Bloom Lookup Table, which only holds keys, not values. It's a probabilistic concise data structure for set representation that supports a peeling operation as the recovery of the elements in the represented set. It can be used in particular for set reconciliation, see the example. KTable is NOT thread safe.

func KTableFromBytes ¶

func KTableFromBytes(data []byte) (*KTable, error)

func KTableFromReader ¶

func KTableFromReader(r io.Reader) (*KTable, error)

func NewKTable ¶

func NewKTable(bucketCount, hashCount int) *KTable

NewKTable creates a new key-only IBLT with the given parameters. - hashCount parameter is the number of hashes applied to each key. - bucketCount parameter is the number of buckets. Those parameters directly affect the memory usage of the IBLT, as well as the probability to succeed in a peeling operation.

func (*KTable) CompatibleWith ¶

func (t *KTable) CompatibleWith(other *KTable) bool

CompatibleWith returns true if the two IBLTs have the same parameters (hashCount and bucketCount).

func (*KTable) Copy ¶

func (t *KTable) Copy() *KTable

Copy returns a deep copy of the IBLT.

func (*KTable) Count ¶

func (t *KTable) Count() int

Count returns the number of keys in the IBLT. IMPORTANT: this number is only accurate if: - no Subtract has been done - no Delete with a key not in the IBLT has been done

func (*KTable) Delete ¶

func (t *KTable) Delete(key uint64)

Delete removes the given key from the IBLT.

func (*KTable) Empty ¶

func (t *KTable) Empty() bool

Empty returns true if the IBLT is empty.

func (*KTable) Equals ¶

func (t *KTable) Equals(other *KTable) bool

Equals returns true if the two IBLTs are identical.

func (*KTable) Insert ¶

func (t *KTable) Insert(key uint64)

Insert adds the given key to the IBLT.

func (*KTable) Peel ¶

func (t *KTable) Peel() iter.Seq2[uint64, int64]

Peel returns a sequence of (key, count) pairs. This function removes the key from the IBLT, so consider Copy() if you want to keep the original IBLT. The count is 1 if the key is present in the IBLT, -1 if it is missing (following a Subtract). This allows doing set reconciliation with two IBLTs (A and B), A.Subtract(B), then A.Peel(): - a count of 1 means that the key is in A, but not in B - a count of -1 means that the key is in B, but not in A IMPORTANT: the peeling process can fail if the IBLT is saturated. This function, however, will NOT return an error in that case. Instead, you NEED to call Empty() to check if the process completed successfully.

func (*KTable) PeelHas ¶

func (t *KTable) PeelHas() iter.Seq[uint64]

PeelHas is the same as Peel, but only returns keys that are present in the IBLT (count==1).

func (*KTable) PeelMisses ¶

func (t *KTable) PeelMisses() iter.Seq[uint64]

PeelMisses is the same as Peel, but only returns keys that are missing from the IBLT (count==-1).

func (*KTable) Subtract ¶

func (t *KTable) Subtract(other *KTable)

Subtract subtracts the given IBLT from this one. This function panics if the two IBLTs are not compatible.

func (*KTable) ToBytes ¶

func (t *KTable) ToBytes() []byte

ToBytes returns the serialized representation of the key-only IBLT.

func (*KTable) ToWriter ¶

func (t *KTable) ToWriter(w io.Writer) error

ToWriter serializes the IBLT to the given writer.

type KV ¶

type KV struct {
	Key   uint64
	Value []byte
}

type KVTable ¶

type KVTable struct {
	// contains filtered or unexported fields
}

KVTable is an Invertible Bloom Lookup Table, which holds keys and values. It's a probabilistic concise data structure for set representation that supports a peeling operation as the recovery of the elements in the represented set. It can be used in particular for set reconciliation, see the example. KVTable is NOT thread safe.

func KVTableFromBytes ¶

func KVTableFromBytes(data []byte) (*KVTable, error)

func KVTableFromReader ¶

func KVTableFromReader(r io.Reader) (*KVTable, error)

func NewKVTable ¶

func NewKVTable(bucketCount, hashCount, valueMaxLen int) *KVTable

NewKVTable creates a new key-only IBLT with the given parameters. - hashCount parameter is the number of hashes applied to each key. - bucketCount parameter is the number of buckets. - valueMaxLen parameter is the maximum length in byte of the stored values. Be aware that all buckets will accommodate for that maximum size. Those parameters directly affect the memory usage of the IBLT, as well as the probability to succeed in a peeling operation.

func (*KVTable) CompatibleWith ¶

func (t *KVTable) CompatibleWith(other *KVTable) bool

CompatibleWith returns true if the two IBLTs have the same parameters (hashCount and bucketCount).

func (*KVTable) Copy ¶

func (t *KVTable) Copy() *KVTable

Copy returns a deep copy of the IBLT.

func (*KVTable) Count ¶

func (t *KVTable) Count() int

Count returns the number of keys in the IBLT. IMPORTANT: this number is only accurate if: - no Subtract has been done - no Delete with a key not in the IBLT has been done

func (*KVTable) Delete ¶

func (t *KVTable) Delete(key uint64, value []byte)

Delete removes the given key/value from the IBLT.

func (*KVTable) Empty ¶

func (t *KVTable) Empty() bool

Empty returns true if the IBLT is empty.

func (*KVTable) Equals ¶

func (t *KVTable) Equals(other *KVTable) bool

Equals returns true if the two IBLTs are identical.

func (*KVTable) Insert ¶

func (t *KVTable) Insert(key uint64, value []byte)

Insert adds the given key/value to the IBLT.

func (*KVTable) Peel ¶

func (t *KVTable) Peel() iter.Seq2[KV, int64]

Peel returns a sequence of ([key,value], count) pairs. This function removes the key from the IBLT, so consider Copy() if you want to keep the original IBLT. The count is 1 if the key is present in the IBLT, -1 if it is missing (following a Subtract). This allows doing set reconciliation with two IBLTs (A and B), A.Subtract(B), then A.Peel(): - a count of 1 means that the key is in A, but not in B - a count of -1 means that the key is in B, but not in A IMPORTANT: the peeling process can fail if the IBLT is saturated. This function, however, will NOT return an error in that case. Instead, you NEED to call Empty() to check if the process completed successfully. IMPORTANT: the IBLT doesn't conserve the value length, so the returned values will always be of length t.valueMaxLen.

func (*KVTable) PeelHas ¶

func (t *KVTable) PeelHas() iter.Seq2[uint64, []byte]

PeelHas is the same as Peel, but only returns key/values that are present in the IBLT (count==1).

func (*KVTable) PeelMisses ¶

func (t *KVTable) PeelMisses() iter.Seq2[uint64, []byte]

PeelMisses is the same as Peel, but only returns key/values that are missing from the IBLT (count==-1).

func (*KVTable) Subtract ¶

func (t *KVTable) Subtract(other *KVTable)

Subtract subtracts the given IBLT from this one. This function panics if the two IBLTs are not compatible.

func (*KVTable) ToBytes ¶

func (t *KVTable) ToBytes() []byte

ToBytes returns the serialized representation of the key-only IBLT.

func (*KVTable) ToWriter ¶

func (t *KVTable) ToWriter(w io.Writer) error

ToWriter serializes the IBLT to the given writer.

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL