Blob comparison utility

blob-compare is a small library that provides methods to compare two blobs in the browser by size, type, magic numbers or byte-to-byte content.

The tool relies on native browser buffer implementations and will likely work in any modern browser. It has been tested and benchmarked on Chrome, Firefox and Edge.

To provide better performance, blob-compare automatically relies on web workers if available when performing operations on blobs.

Installation

You can install it with npm or yarn:

npm install blob-compare

yarn add blob-compare

Then simply require or import it:

const blobCompare = require('blob-compare').default;
// or
import blobCompare from 'blob-compare';

For direct use in the browser, blob-compare can be loaded as a script from any CDN mirroring npm or GitHub, for instance:

<script src="https://cdn.jsdelivr.net/npm/blob-compare@latest"></script>

<script src="https://unpkg.com/blob-compare@latest"></script>

<script src="https://cdn.jsdelivr.net/gh/liqueurdetoile/blob-compare@latest/dist/index.min.js"></script>

A global blobCompare is set automatically once the script has loaded.

Quick reference

See documentation for a full reference.

Conversion tools

All conversions are run asynchronously.

Method | Description
blobCompare::toArrayBuffer | Converts a blob to an ArrayBuffer. The conversion can optionally be chunked and assigned to a web worker.
blobCompare::toBinaryString | Converts a blob to a binary string. The conversion can optionally be chunked and assigned to a web worker.
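
To make the two target formats concrete, here is a minimal sketch built only on native APIs. It is not blob-compare's implementation: the helper names blobToArrayBuffer and blobToBinaryString are hypothetical, and chunking and web workers are left out. It assumes an environment that provides Blob.prototype.arrayBuffer().

```javascript
// Sketch only: native conversions, not blob-compare's internals.

// Blob -> ArrayBuffer, using the standard Blob.prototype.arrayBuffer().
async function blobToArrayBuffer(blob) {
  return blob.arrayBuffer();
}

// Blob -> binary string: one character per byte, char code = byte value.
async function blobToBinaryString(blob) {
  const bytes = new Uint8Array(await blob.arrayBuffer());
  let result = '';
  for (const byte of bytes) {
    result += String.fromCharCode(byte);
  }
  return result;
}
```

A binary string holds one UTF-16 code unit per byte, so it can take noticeably more memory than the ArrayBuffer form of the same blob.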

Comparison tools

Method | Description | Sync/Async
blobCompare::sizeEqual | Compares the sizes of two blobs. | sync
blobCompare::typeEqual | Compares the types of two blobs. Types are not fully reliable, as any type can be set when creating a blob. | sync
blobCompare::magicNumbersEqual | Compares the magic numbers of two blobs. Only a quick comparison is done, so unusual data types may not be compared with 100% accuracy. In that case, simply clone the repo and override this function to fit your needs. | async
blobCompare::bytesEqualWithArrayBuffer | Converts blobs or blob chunks to ArrayBuffers and performs a byte-to-byte comparison. | async
blobCompare::bytesEqualWithBinaryString | Converts blobs or blob chunks to binary strings and performs a byte-to-byte comparison. | async
blobCompare::isEqual | The Swiss Army knife that bundles several of the comparison methods above into a single call. | async
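
For readers wondering what a byte-to-byte comparison on ArrayBuffers amounts to, the sketch below shows the general idea with native APIs only. The function names buffersEqual and blobsBytesEqual are hypothetical, and the library's own methods may well work differently (for example with chunks or workers).

```javascript
// Sketch only: a plain byte-to-byte comparison, not blob-compare's code.

// Compare two ArrayBuffers byte by byte through Uint8Array views.
function buffersEqual(a, b) {
  if (a.byteLength !== b.byteLength) return false;
  const viewA = new Uint8Array(a);
  const viewB = new Uint8Array(b);
  for (let i = 0; i < viewA.length; i++) {
    if (viewA[i] !== viewB[i]) return false; // stop at the first mismatch
  }
  return true;
}

// Read both blobs concurrently, then compare their contents.
async function blobsBytesEqual(blobA, blobB) {
  const [a, b] = await Promise.all([blobA.arrayBuffer(), blobB.arrayBuffer()]);
  return buffersEqual(a, b);
}
```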

Usage examples

// assuming img1 and img2 are two Blob variables

/**
 * Fully compare two blobs with the default methods configuration
 */
blobCompare.isEqual(img1, img2).then(res => console.log(res));

/**
 * Compare only file types
 */
blobCompare.isEqual(img1, img2, {methods: ['magic']}).then(res => console.log(res));
// or
blobCompare.magicNumbersEqual(img1, img2).then(res => console.log(res));

/**
 * Compare file types AND the last 100 bytes of blobs
 * Never found a use case ^^
 */
blobCompare.isEqual(img1, img2, {
  methods: ['magic', 'bytes'],
  sizes: [-100]
}).then(res => console.log(res));

/**
 * Compare file types OR the last 100 bytes of blobs
 */
blobCompare.isEqual(img1, img2, {
  methods: ['magic', 'bytes'],
  sizes: [-100],
  partial: true
}).then(res => console.log(res));

To speed things up, isEqual with its default configuration first checks whether sizes are equal, then types, then magic numbers, and finally performs a byte-to-byte comparison to ensure the blobs are equal.
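
The cheapest-check-first ordering described above can be sketched with native APIs as follows. This is an illustration, not the library's code: isEqualSketch and sameBytes are hypothetical names, and treating the first 4 bytes as the "magic number" is an assumption made for the example.

```javascript
// Sketch only: cheapest checks first, full byte comparison last.

// Assumption: the first 4 bytes stand in for the "magic number".
const MAGIC_LENGTH = 4;

// Read both blobs and compare their bytes.
async function sameBytes(blobA, blobB) {
  const [a, b] = await Promise.all([blobA.arrayBuffer(), blobB.arrayBuffer()]);
  const viewA = new Uint8Array(a);
  const viewB = new Uint8Array(b);
  return viewA.length === viewB.length &&
    viewA.every((byte, i) => byte === viewB[i]);
}

async function isEqualSketch(blobA, blobB) {
  if (blobA.size !== blobB.size) return false; // 1. size: sync, no read
  if (blobA.type !== blobB.type) return false; // 2. type: sync, no read
  const magicOk = await sameBytes(
    blobA.slice(0, MAGIC_LENGTH),
    blobB.slice(0, MAGIC_LENGTH)
  );
  if (!magicOk) return false;                  // 3. magic: reads a few bytes
  return sameBytes(blobA, blobB);              // 4. bytes: reads everything
}
```

Each step returns early, so the expensive full read only happens when every cheaper check has already passed.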

All methods that compare bytes are asynchronous, use web workers by default if available, and work very well with async/await syntax.
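
As an illustration of the async/await style, here is a small sketch. The wrapper describeComparison is hypothetical, blobCompare is passed in as a parameter so the snippet does not depend on how the library was loaded, and it assumes that isEqual resolves to a truthy value when the blobs match.

```javascript
// Sketch only: awaiting isEqual instead of chaining .then().
// Assumption: isEqual resolves to a truthy value when the blobs match.
async function describeComparison(blobCompare, img1, img2) {
  try {
    const equal = await blobCompare.isEqual(img1, img2);
    return equal ? 'blobs are equal' : 'blobs differ';
  } catch (err) {
    // A rejected promise surfaces here as a regular exception.
    return `comparison failed: ${err.message}`;
  }
}
```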

About performance

Comparing blobs can be tricky, though the only real pitfall is most likely running out of memory in the VM. There's not much to do about it except work on smaller data chunks and use device storage such as IndexedDB to buffer the unprocessed chunks.
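
Working on smaller chunks could look roughly like the sketch below, which reads matching slices of both blobs one pair at a time so that only two chunks are held in memory at once. The function name chunkedEqual and the 1 MiB default chunk size are assumptions for the example, and buffering through IndexedDB is not covered.

```javascript
// Sketch only: compare two blobs slice by slice to limit memory use.
// Assumption: 1 MiB is a reasonable default chunk size.
async function chunkedEqual(blobA, blobB, chunkSize = 1024 * 1024) {
  if (blobA.size !== blobB.size) return false;
  for (let offset = 0; offset < blobA.size; offset += chunkSize) {
    // Blob.slice does not read data; arrayBuffer() loads only this chunk.
    const [a, b] = await Promise.all([
      blobA.slice(offset, offset + chunkSize).arrayBuffer(),
      blobB.slice(offset, offset + chunkSize).arrayBuffer(),
    ]);
    const viewA = new Uint8Array(a);
    const viewB = new Uint8Array(b);
    for (let i = 0; i < viewA.length; i++) {
      if (viewA[i] !== viewB[i]) return false; // stop at the first mismatch
    }
  }
  return true;
}
```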

Web workers

Another caveat is the device CPU consumed by operations on blobs. Web workers can be very helpful in this case. blob-compare enables web workers by default for two major reasons:

  1. A worker is constructed each time a blob needs to be converted to raw binary data or an array buffer. On a multi-threaded system, this allows efficient concurrency
  2. Operations on huge blobs won't freeze the main thread

The downside is that processing will be slower due to the copy operation. A workaround could be to work directly with ArrayBuffers and the blobCompare.compareBuffers method, which takes advantage of the transferable interface of an ArrayBuffer.
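
To show what "transferable" means in practice, the sketch below moves an ArrayBuffer through a MessageChannel instead of copying it. The channel only stands in for a web worker so that the snippet is self-contained; Worker.postMessage takes the same message and transfer list arguments. The helper name transferBuffer is hypothetical, and this does not show how blobCompare.compareBuffers is implemented or called.

```javascript
// Sketch only: a MessageChannel stands in for a web worker here.
// Worker.postMessage accepts the same (message, transferList) arguments.
function transferBuffer(buffer) {
  return new Promise((resolve) => {
    const { port1, port2 } = new MessageChannel();
    port2.onmessage = (event) => {
      // Close both ports so nothing keeps listening afterwards.
      port1.close();
      port2.close();
      resolve(event.data);
    };
    // Listing the buffer as transferable moves it instead of copying it.
    port1.postMessage(buffer, [buffer]);
  });
}
```

After the transfer, the sender's buffer is detached (its byteLength becomes 0), so the data can only be used on the receiving side.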

Disabling web workers can also help prevent memory issues in some cases.

Benchmarking

The repo is quite heavy due to fixtures. I've tried to implement some automated benchmarks around karma and benchmark.js, but I quickly ran into trouble with larger blobs, and even with small blobs on Edge.

I'm not sure I'm doing my benchmarks right _Oo_

If I find some time, I may try on jsPerf.

Anyway, after cloning and installing this repository, you can play with the fixtures and benchmarks (they are removed from the npm version).

Just run npm run bench:all to run them in Chrome, Firefox and Edge. You can also make ChromeHeadless accessible and use npm run bench.

Latest results for Chrome, Firefox and Edge are stored in results.json

Documentation

Methods are fully documented and the docs are available on GitHub Pages.

Issues and PRs

Any bugs and issues can be filed on the GitHub repository.

You are free and very welcome to fork the project and submit any PR to fix or improve blob-compare.

Changelog