Blobs
A blob is an object that represents the contents of a file. To allow shared storage of data between similar files, blobs are trees consisting of leaf and branch objects. A leaf object is just bytes, and a branch object is an array of branches or leaves.
type Blob = Leaf | Branch;

type Leaf = {
  bytes: Uint8Array;
};

type Branch = {
  children: Array<Blob>;
};
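To make the tree structure concrete, here is a minimal sketch of how the bytes of a file could be recovered from such a tree, and how two similar files can share a leaf. It is an illustration only, not Tangram's implementation: the union type is renamed `TreeBlob` here to avoid clashing with the global `Blob` type in browsers and Node, and `isLeaf`, `blobSize`, and `blobBytes` are hypothetical helpers.

```typescript
type TreeBlob = Leaf | Branch;
type Leaf = { bytes: Uint8Array };
type Branch = { children: Array<TreeBlob> };

// Distinguish the two variants by which field is present.
function isLeaf(blob: TreeBlob): blob is Leaf {
  return "bytes" in blob;
}

// Total number of bytes in the file that the tree represents.
function blobSize(blob: TreeBlob): number {
  if (isLeaf(blob)) return blob.bytes.length;
  return blob.children.reduce((sum, child) => sum + blobSize(child), 0);
}

// Concatenate the leaves in depth-first, left-to-right order.
function blobBytes(blob: TreeBlob): Uint8Array {
  const out = new Uint8Array(blobSize(blob));
  let offset = 0;
  const visit = (node: TreeBlob): void => {
    if (isLeaf(node)) {
      out.set(node.bytes, offset);
      offset += node.bytes.length;
    } else {
      node.children.forEach((child) => visit(child));
    }
  };
  visit(blob);
  return out;
}

// Two similar files that share the same leaf for their common prefix.
const encoder = new TextEncoder();
const header: Leaf = { bytes: encoder.encode("# shared header\n") };
const fileA: Branch = { children: [header, { bytes: encoder.encode("body A\n") }] };
const fileB: Branch = { children: [header, { bytes: encoder.encode("body B\n") }] };
const textA = new TextDecoder().decode(blobBytes(fileA));
```

Because `fileA` and `fileB` reference the same `header` leaf, that data only needs to be stored once, which is the point of representing blobs as trees.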
You can create a blob on the command line:
$ echo "Hello, World\!" | tg blob create
lef_01c8sr6jyef0bxp7j03f7emawb8q0p008nt9dn343c0qq345bba710
This blob is small, so a single leaf is enough to represent it. Try creating a larger one:
$ for i in $(seq 0 999999); do echo $i; done | tg blob create
bch_01bn61ywpt8vsqkerfzpkzk7qgg3gsqx54dasvv4hnzwwx76qvq8g0
You can view the object tree using tg view:
$ tg view bch_01bn61ywpt8vsqkerfzpkzk7qgg3gsqx54dasvv4hnzwwx76qvq8g0
You can also manipulate blobs directly in Tangram TypeScript:
let blob = tg.blob("Hello, ", "World!");
This creates a blob that is a branch with two children, each of which is a leaf.
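In terms of the types defined earlier, the resulting value would look roughly like the output of the sketch below. This only models the shape described above in plain TypeScript; `blobFromStrings` is a hypothetical stand-in, not the `tg.blob` implementation, and the union type is named `TreeBlob` to avoid clashing with the global `Blob` type.

```typescript
type TreeBlob = Leaf | Branch;
type Leaf = { bytes: Uint8Array };
type Branch = { children: Array<TreeBlob> };

// Hypothetical helper: wraps each string argument in its own leaf and
// groups the leaves under a single branch.
function blobFromStrings(...parts: string[]): { children: Leaf[] } {
  const encoder = new TextEncoder();
  return { children: parts.map((part) => ({ bytes: encoder.encode(part) })) };
}

// The result is structurally a Branch with two Leaf children,
// holding 7 and 6 bytes respectively.
const blob: Branch & { children: Leaf[] } = blobFromStrings("Hello, ", "World!");
```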
Content-Defined Chunking
To minimize the amount of data stored on disk and transferred over the network, blobs with similar content should produce similar object trees. Tangram does this with a technique called content-defined chunking. As the bytes are read, a rolling hash is computed. When the hash matches a fixed value, a chunk is emitted. Because chunk boundaries depend on the content rather than on byte offsets, if you make a small edit in the middle of a large file, most of the objects in the tree will be unchanged.
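The idea can be sketched in a few lines. The example below is an assumption-laden illustration, not Tangram's chunker: it uses a simple gear-style rolling hash (shift left by one bit, then add a per-byte random value), and the names `makeGearTable`, `chunk`, `BOUNDARY_BITS`, and `TARGET` as well as the tiny average chunk size are all chosen for the demonstration. The source does not say which rolling hash or parameters Tangram uses.

```typescript
// Gear table: one pseudo-random 32-bit value per possible byte, generated
// deterministically with xorshift32 so that results are reproducible.
function makeGearTable(seed: number): Uint32Array {
  const table = new Uint32Array(256);
  let state = seed | 0;
  for (let i = 0; i < table.length; i++) {
    state ^= state << 13;
    state ^= state >>> 17;
    state ^= state << 5;
    table[i] = state; // stored modulo 2^32
  }
  return table;
}

const GEAR = makeGearTable(0x9e3779b9);
const BOUNDARY_BITS = 6; // illustrative: average chunk of roughly 2^6 = 64 bytes
const TARGET = 0; // the fixed value the top bits of the hash must match

function chunk(data: Uint8Array): Uint8Array[] {
  const chunks: Uint8Array[] = [];
  let hash = 0;
  let start = 0;
  for (let i = 0; i < data.length; i++) {
    // Each shift pushes older bytes toward the top of the 32-bit hash, and
    // they fall out after 32 steps, so the hash covers a sliding window.
    hash = ((hash << 1) + GEAR[data[i]]) >>> 0;
    if ((hash >>> (32 - BOUNDARY_BITS)) === TARGET) {
      chunks.push(data.subarray(start, i + 1));
      start = i + 1;
    }
  }
  if (start < data.length) chunks.push(data.subarray(start));
  return chunks;
}

// Deterministic pseudo-random test input.
function pseudoRandomBytes(length: number, seed: number): Uint8Array {
  const bytes = new Uint8Array(length);
  let state = seed | 0;
  for (let i = 0; i < length; i++) {
    state ^= state << 13;
    state ^= state >>> 17;
    state ^= state << 5;
    bytes[i] = (state >>> 8) & 0xff;
  }
  return bytes;
}

// Insert a single byte in the middle and count how many chunks are reused.
const original = pseudoRandomBytes(20000, 1);
const edited = new Uint8Array(original.length + 1);
edited.set(original.subarray(0, 10000), 0);
edited[10000] = 0xff;
edited.set(original.subarray(10000), 10001);

const known = new Set(chunk(original).map((c) => c.join(",")));
const editedChunks = chunk(edited);
const reused = editedChunks.filter((c) => known.has(c.join(","))).length;
console.log(`${reused} of ${editedChunks.length} chunks reused after the edit`);
```

With fixed-size chunks, the inserted byte would shift every later boundary and invalidate every chunk after the edit. Here the hash depends only on the most recent bytes, so boundaries realign shortly after the edit and only the chunks near it differ. A production chunker would also enforce minimum and maximum chunk sizes, which this sketch omits.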