ipfs-embed
A small, fast and reliable ipfs implementation designed for embedding into complex p2p applications.
- node discovery via mdns
- provider discovery via kademlia
- exchange blocks via bitswap
- lru eviction policy
- aliases, an abstraction of recursively named pins
- temporary recursive pins for building dags, preventing races with the garbage collector
- efficiently syncing large dags of blocks
Some compatibility with go-ipfs can be enabled with the compat feature flag.
Getting started
use ;
use DagCbor;
use Store;
async
Below are some notes on the history of ipfs-embed. The information is no longer accurate for the current implementation.
What is ipfs?
Ipfs is a p2p network for locating and providing chunks of content addressed data called blocks.
Content addressing means that the data is located via its hash, as opposed to location addressing, where it is located via the place it is stored.
Unsurprisingly this is done using a distributed hash table. To avoid storing large amounts of data in the dht, the dht stores which peers have a block. After determining the peers that are providing a block, the block is requested from those peers.
To verify that the peer is sending the requested block and not an infinite stream of garbage, blocks need to have a finite size. In practice we'll assume a maximum block size of 1MB.
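The check described above can be sketched as follows. This is a minimal illustration, not the ipfs-embed implementation: `Cid` is a stand-in `u64`, `cid_of` and `verify_block` are hypothetical names, and `DefaultHasher` replaces the cryptographic multihash a real cid would use, purely to keep the sketch free of dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in content identifier (assumption: real cids wrap a cryptographic multihash).
pub type Cid = u64;

/// Maximum accepted block size of 1MB, as assumed in the text.
pub const MAX_BLOCK_SIZE: usize = 1024 * 1024;

/// Derives the identifier of a block from its bytes.
pub fn cid_of(data: &[u8]) -> Cid {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Accepts a received block only if it respects the size limit
/// and hashes to the cid that was requested.
pub fn verify_block(requested: Cid, data: &[u8]) -> bool {
    data.len() <= MAX_BLOCK_SIZE && cid_of(data) == requested
}
```

Because the size limit is checked first, a receiver can stop reading from a peer as soon as more than `MAX_BLOCK_SIZE` bytes arrive, without ever hashing the oversized payload.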
Encoding arbitrary data into 1MB blocks imposes two requirements on the codec. It needs to have a canonical representation, to ensure that the same data results in the same hash, and it needs to support linking to other content addressed blocks. Codecs having these two properties are called ipld codecs.
A property that follows from content addressing (representing edges as hashes of the node) is that arbitrary graphs of blocks are not possible. A graph of blocks is guaranteed to be directed and acyclic.
Block storage
Let's start with a naive model of a persistent block store.
Since content addressed blocks form a directed acyclic graph, blocks can't simply be deleted. A block may be referenced by multiple nodes, so some form of reference counting and garbage collection is required to determine when a block can safely be deleted. In the interest of being a good peer on the p2p network, we may want to keep old blocks around that other peers may want. So thinking of it as a reference counted cache may be a more appropriate model. We end up with something like this:
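A minimal in-memory sketch of such a model is given here for illustration. All names (`BlockStore`, `Entry`, `evict`) are hypothetical, `Cid` is a stand-in `u64`, and liveness is computed with pin counts plus a reachability sweep, which is one possible way to realize the idea rather than the actual ipfs-embed interface.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in content identifier (assumption).
pub type Cid = u64;

struct Entry {
    data: Vec<u8>,
    refs: Vec<Cid>, // cids this block links to
    pins: u32,      // number of outstanding pins on this block
}

#[derive(Default)]
pub struct BlockStore {
    blocks: HashMap<Cid, Entry>,
}

impl BlockStore {
    /// Inserts a block together with the cids it references.
    pub fn insert(&mut self, cid: Cid, data: Vec<u8>, refs: Vec<Cid>) {
        self.blocks.entry(cid).or_insert(Entry { data, refs, pins: 0 });
    }

    pub fn get(&self, cid: Cid) -> Option<&[u8]> {
        self.blocks.get(&cid).map(|e| e.data.as_slice())
    }

    /// Increments the pin count; returns false if the block is unknown.
    pub fn pin(&mut self, cid: Cid) -> bool {
        match self.blocks.get_mut(&cid) {
            Some(e) => { e.pins += 1; true }
            None => false,
        }
    }

    /// Decrements the pin count; returns false if the block was not pinned.
    pub fn unpin(&mut self, cid: Cid) -> bool {
        match self.blocks.get_mut(&cid) {
            Some(e) if e.pins > 0 => { e.pins -= 1; true }
            _ => false,
        }
    }

    /// Collects every cid reachable from a pinned block.
    fn live(&self) -> HashSet<Cid> {
        let mut live = HashSet::new();
        let mut stack: Vec<Cid> = self.blocks.iter()
            .filter(|(_, e)| e.pins > 0)
            .map(|(cid, _)| *cid)
            .collect();
        while let Some(cid) = stack.pop() {
            if live.insert(cid) {
                if let Some(e) = self.blocks.get(&cid) {
                    stack.extend(e.refs.iter().copied());
                }
            }
        }
        live
    }

    /// Removes all blocks not reachable from a pin; returns how many were removed.
    pub fn evict(&mut self) -> usize {
        let live = self.live();
        let before = self.blocks.len();
        self.blocks.retain(|cid, _| live.contains(cid));
        before - self.blocks.len()
    }
}
```

Treating the store as a cache means `evict` does not have to run eagerly: unpinned blocks can stay around for other peers until space is actually needed.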
To mutate a block we need to perform three steps: get the block, modify it and insert the modified block, and finally remove the old one. We also need a map from keys to cids, so even more steps are required. Any of these steps can fail, leaving the block store in an inconsistent state and leaking data. To prevent this, every api consumer would have to implement a write-ahead-log. To resolve these issues we extend the store with named pins called aliases.
Assuming that each operation is atomic and durable, we have the minimal set of operations required to store dags of content addressed blocks.
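To illustrate why aliases remove the need for a consumer-side write-ahead-log, here is a minimal in-memory sketch. The names (`AliasStore`, `alias`, `resolve`, `evict`) and the stand-in `Cid` type are assumptions made for this example. The point is that repointing an alias is a single operation that both updates the key-to-cid mapping and changes which blocks are kept alive.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in content identifier (assumption).
pub type Cid = u64;

#[derive(Default)]
pub struct AliasStore {
    blocks: HashMap<Cid, (Vec<u8>, Vec<Cid>)>, // cid -> (data, references)
    aliases: HashMap<String, Cid>,             // name -> root of a recursive pin
}

impl AliasStore {
    pub fn insert(&mut self, cid: Cid, data: Vec<u8>, refs: Vec<Cid>) {
        self.blocks.insert(cid, (data, refs));
    }

    pub fn contains(&self, cid: Cid) -> bool {
        self.blocks.contains_key(&cid)
    }

    /// Points `name` at `cid`, or removes the alias when `cid` is `None`.
    pub fn alias(&mut self, name: &str, cid: Option<Cid>) {
        match cid {
            Some(cid) => { self.aliases.insert(name.to_string(), cid); }
            None => { self.aliases.remove(name); }
        }
    }

    pub fn resolve(&self, name: &str) -> Option<Cid> {
        self.aliases.get(name).copied()
    }

    /// Removes every block not reachable from an alias; returns how many were removed.
    pub fn evict(&mut self) -> usize {
        let mut live = HashSet::new();
        let mut stack: Vec<Cid> = self.aliases.values().copied().collect();
        while let Some(cid) = stack.pop() {
            if live.insert(cid) {
                if let Some((_, refs)) = self.blocks.get(&cid) {
                    stack.extend(refs.iter().copied());
                }
            }
        }
        let before = self.blocks.len();
        self.blocks.retain(|cid, _| live.contains(cid));
        before - self.blocks.len()
    }
}
```

A mutation then becomes: insert the new block, call `alias` with the new cid, and let eviction reclaim the old block later. If the process stops between the two calls, the alias still points at the old, complete data.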
Networked block storage - the ipfs-embed api
Design patterns - ipfs-embed in action
We'll be looking at some patterns used in the chain example. The chain example uses ipfs-embed to store a chain of blocks. A block is defined as:
Atomicity
We have two different dbs in this example: the one managed by ipfs-embed that stores blocks and aliases, and one specific to the example that maps the block index to the block cid, so that we can look up blocks quickly without having to traverse the entire chain. To guarantee atomicity we define two aliases and perform the syncing in two steps. This ensures that the synced chain always has its blocks indexed.
const TMP_ROOT: &str = alias!;
const ROOT: &str = alias!;
ipfs.alias.await?;
for _ in prev_root_id..new_root_id
ipfs.alias.await?;
Dagification
The recursive syncing algorithm will perform worst when it is syncing a chain, as every block needs to be synced one after the other, without being able to take advantage of any parallelism. To resolve this issue we increase the linking of the chain by including loopbacks, to increase the branching of the dag.
An algorithm was proposed by @rklaehn for this purpose:
Selectors
Syncing can take a long time and doesn't allow selecting the subset of data that is needed. For this purpose there is an experimental alias_with_syncer api that allows customizing the syncing behaviour. In the chain example it is used to provide block validation, to ensure that the blocks are valid. This api is likely to change in the future.
Efficient block storage implementation - ipfs-embed internals
ipfs-embed uses SQLite, a performant embeddable SQL database, to implement the block store.
type Id = u64;
type Atime = u64;
Given the description of operations and how it's structured in terms of trees, these operations are straightforward to implement.
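As an illustration of how an access time can drive the lru eviction policy, here is a minimal in-memory sketch. It is not the SQLite schema used by ipfs-embed: `LruIndex`, `touch` and `evict` are hypothetical names, the access time is assumed to be a logical counter, and the `is_pinned` callback stands in for whatever mechanism marks blocks as aliased.

```rust
use std::collections::{BTreeMap, HashMap};

pub type Id = u64;
pub type Atime = u64;

#[derive(Default)]
pub struct LruIndex {
    clock: Atime,                   // logical clock, bumped on every access
    atime_of: HashMap<Id, Atime>,   // block id -> last access time
    by_atime: BTreeMap<Atime, Id>,  // last access time -> block id, oldest first
}

impl LruIndex {
    /// Records an access and moves the block to the most recent position.
    pub fn touch(&mut self, id: Id) {
        self.clock += 1;
        if let Some(old) = self.atime_of.insert(id, self.clock) {
            self.by_atime.remove(&old);
        }
        self.by_atime.insert(self.clock, id);
    }

    pub fn len(&self) -> usize {
        self.atime_of.len()
    }

    /// Evicts least recently used blocks until at most `capacity` remain.
    /// Blocks for which `is_pinned` returns true are never evicted.
    /// Returns the evicted ids, oldest first.
    pub fn evict(&mut self, capacity: usize, is_pinned: impl Fn(Id) -> bool) -> Vec<Id> {
        let candidates: Vec<(Atime, Id)> =
            self.by_atime.iter().map(|(atime, id)| (*atime, *id)).collect();
        let mut evicted = Vec::new();
        for (atime, id) in candidates {
            if self.atime_of.len() <= capacity {
                break;
            }
            if is_pinned(id) {
                continue;
            }
            self.by_atime.remove(&atime);
            self.atime_of.remove(&id);
            evicted.push(id);
        }
        evicted
    }
}
```

In a SQL-backed store the same ordering would typically come from an index on the access time column, so that picking eviction candidates does not require scanning every block.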
Efficiently syncing dags of blocks - libp2p-bitswap internals
Bitswap is a very simple protocol. It was adapted and simplified for ipfs-embed. The message format can be represented by the following enums.
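A simplified message format along these lines might look like the following sketch. The type and variant names are assumptions for illustration and should be checked against the libp2p-bitswap source. `Cid` is a stand-in `u64`, and `respond` is a hypothetical helper showing how a peer could answer each request type.

```rust
use std::collections::HashMap;

/// Stand-in content identifier (assumption).
pub type Cid = u64;

/// What the requester wants to know about a cid.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestType {
    /// Ask whether the peer has the block.
    Have,
    /// Ask the peer to send the block.
    Block,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BitswapRequest {
    pub ty: RequestType,
    pub cid: Cid,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BitswapResponse {
    /// Answer to a have request, or a negative answer to a block request.
    Have(bool),
    /// The requested block data.
    Block(Vec<u8>),
}

/// Hypothetical responder: answers a request from a local map of blocks.
pub fn respond(blocks: &HashMap<Cid, Vec<u8>>, req: BitswapRequest) -> BitswapResponse {
    match req.ty {
        RequestType::Have => BitswapResponse::Have(blocks.contains_key(&req.cid)),
        RequestType::Block => match blocks.get(&req.cid) {
            Some(data) => BitswapResponse::Block(data.clone()),
            // Assumption: a missing block is reported as "don't have".
            None => BitswapResponse::Have(false),
        },
    }
}
```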
The mechanism for locating providers can be abstracted. A dht can be plugged in or a centralized db query. The bitswap api looks as follows:
So what happens when you create a get request? First all the providers in the initial set are queried with the have request. As an optimization, in every batch of queries a block request is sent instead of a have request. If the get query finds a block it returns a query complete. If the block wasn't found in the initial set, a GetProviders(Cid) event is emitted. This is where the bitswap consumer tries to locate providers, for example by performing a dht lookup. These providers are registered by calling the add_provider method. Once the locating of providers completes, this is signaled by calling complete_get_providers. The query manager then performs bitswap requests using the new provider set, which results in the block being found or a block not found error.
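The flow of a get query can be sketched synchronously against a mock network. This is a simplification under stated assumptions: `Network`, `get` and `query_set` are hypothetical names, the `locate` closure stands in for the GetProviders event followed by add_provider and complete_get_providers, and the batching of have and block requests is not modelled.

```rust
use std::collections::HashMap;

/// Stand-in identifiers (assumptions).
pub type Cid = u64;
pub type PeerId = u32;

/// Mock network: the blocks each peer holds.
pub type Network = HashMap<PeerId, HashMap<Cid, Vec<u8>>>;

/// Queries a set of peers for `cid`. Returns the block and the peers that had it.
fn query_set(net: &Network, peers: &[PeerId], cid: Cid) -> Option<(Vec<u8>, Vec<PeerId>)> {
    let mut data = None;
    let mut have = Vec::new();
    for peer in peers {
        if let Some(block) = net.get(peer).and_then(|blocks| blocks.get(&cid)) {
            have.push(*peer);
            if data.is_none() {
                // Models the single block request; other peers only confirm "have".
                data = Some(block.clone());
            }
        }
    }
    data.map(|d| (d, have))
}

/// Runs a get query: first the initial provider set, then the located providers.
/// `None` models the block not found error.
pub fn get(
    net: &Network,
    cid: Cid,
    initial: &[PeerId],
    locate: impl FnOnce(Cid) -> Vec<PeerId>,
) -> Option<(Vec<u8>, Vec<PeerId>)> {
    query_set(net, initial, cid).or_else(|| query_set(net, &locate(cid), cid))
}
```

Returning the peers that had the block alongside the data is deliberate: that set is a natural starting point when the caller goes on to fetch the references of the block.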
Often we want to sync an entire dag of blocks. We can efficiently sync dags of blocks by adding a sync query that runs get queries in parallel for all the references of a block. The set of providers that had a block is used as the initial set in a reference query. For this we extend our api with the following calls.
/// Bitswap sync trait for customizing the syncing behaviour.
Note that we can customize the syncing behaviour arbitrarily by selecting a subset of blocks we want to sync. See design patterns for more information.
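A sync query with a customizable selection can be sketched as a level-by-level traversal. This is an illustrative model only: `Block`, `sync` and the `select` callback are hypothetical, the remote side is a plain map rather than a set of peers, and each level stands for one round of get queries that could run in parallel.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in content identifier (assumption).
pub type Cid = u64;

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Block {
    pub data: Vec<u8>,
    pub refs: Vec<Cid>, // cids this block links to
}

/// Syncs the dag rooted at `root` from `remote` into `local`.
/// Only references accepted by `select` are followed.
/// Returns the number of rounds that fetched something, or the first cid
/// that could not be found.
pub fn sync(
    local: &mut HashMap<Cid, Block>,
    remote: &HashMap<Cid, Block>,
    root: Cid,
    select: impl Fn(Cid) -> bool,
) -> Result<usize, Cid> {
    let mut seen = HashSet::new();
    let mut frontier = vec![root];
    let mut rounds = 0;
    while !frontier.is_empty() {
        let mut next = Vec::new();
        let mut fetched = false;
        // All cids in one frontier are independent and could be fetched in parallel.
        for cid in frontier {
            if !seen.insert(cid) {
                continue; // already handled via another parent
            }
            if !local.contains_key(&cid) {
                let block = remote.get(&cid).ok_or(cid)?.clone();
                local.insert(cid, block);
                fetched = true;
            }
            next.extend(local[&cid].refs.iter().copied().filter(|c| select(*c)));
        }
        if fetched {
            rounds += 1;
        }
        frontier = next;
    }
    Ok(rounds)
}
```

For a dag where block 1 links to 2 and 3, and both link to 4, the traversal needs three rounds instead of the four a plain chain of the same size would take. Passing a `select` that rejects a cid skips that block and everything reachable only through it.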
License
MIT OR Apache-2.0