Eugene Lazutkin edited this page Apr 14, 2023 · 6 revisions

This utility is a Transform stream that operates in object mode. It accepts items and packs them into an array. As soon as the array reaches a configurable size, it is emitted as output. Every emitted array has exactly the specified size except the last one, which can be smaller but is never empty.

Introduction

Example:

const Batch = require('stream-json/utils/Batch');

const StreamArray = require('stream-json/streamers/StreamArray');
const {chain} = require('stream-chain');
const fs = require('fs');

const pipeline = chain([
  fs.createReadStream('sample.json'),
  StreamArray.withParser(),
  new Batch({batchSize: 100})
]);

// count all odd values from a huge array

let oddCounter = 0;
pipeline.on('data', data => {
  console.log('Batch size:', data.length);
  data.forEach(pair => {
    if (pair.value % 2) ++oddCounter;
  });
});
pipeline.on('end', () => console.log('Odd numbers:', oddCounter));

API

The module returns the constructor of Batch. Being a Transform stream, Batch doesn't have any special interfaces. The only requirement is to configure it during construction.

constructor(options)

options is an optional object described in detail in the Node.js Stream documentation. Additionally, the following custom options are recognized:

  • batchSize is a positive integer that defines how many items are placed into an array before it is emitted. Default: 1000.
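
To make the effect of batchSize concrete, here is a minimal sketch of the documented behavior built only on Node.js core streams. SketchBatch and its private fields are hypothetical names used for illustration; this is not the stream-json source code, and the real Batch may differ in how it validates options.

```javascript
const {Transform, Readable} = require('stream');

// Hypothetical stand-in for Batch, written only to illustrate the documented
// behavior. It is NOT the stream-json implementation.
class SketchBatch extends Transform {
  constructor(options) {
    // Object mode on both sides: items come in, arrays go out.
    super(Object.assign({}, options, {writableObjectMode: true, readableObjectMode: true}));
    // Assumption: anything that is not a positive integer falls back to
    // the documented default of 1000.
    const size = options && options.batchSize;
    this._batchSize = Number.isInteger(size) && size > 0 ? size : 1000;
    this._accumulator = [];
  }

  _transform(chunk, _encoding, callback) {
    this._accumulator.push(chunk);
    // Emit a full batch as soon as it reaches the configured size.
    if (this._accumulator.length >= this._batchSize) {
      this.push(this._accumulator);
      this._accumulator = [];
    }
    callback(null);
  }

  _flush(callback) {
    // Emit the final, possibly smaller batch, but never an empty one.
    if (this._accumulator.length) {
      this.push(this._accumulator);
      this._accumulator = [];
    }
    callback(null);
  }
}

// Seven items with batchSize 3 produce the batches [1, 2, 3], [4, 5, 6], [7].
Readable.from([1, 2, 3, 4, 5, 6, 7])
  .pipe(new SketchBatch({batchSize: 3}))
  .on('data', batch => console.log('Batch:', batch));
```

With seven items and a batchSize of 3, the last array holds the single leftover item. With six items, no trailing array would be emitted at all.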

Static methods and properties

batch(options) and make(options)

make() and batch() are two aliases of the factory function. It takes the options described above and returns a new instance of Batch. batch() helps to reduce boilerplate when creating data processing pipelines.

The example above could be rewritten as:

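const {batch} = require('stream-json/utils/Batch');
const StreamArray = require('stream-json/streamers/StreamArray');
// ...

const pipeline = chain([
  fs.createReadStream('sample.json'),
  // the file is raw text, so the streamer needs a parser in front of it
  StreamArray.withParser(),
  batch({batchSize: 100})
]);
// ...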

make.Constructor

The Constructor property of make() (and batch()) is set to Batch. It can be used for the indirect creation of Batch instances or for metaprogramming if needed.
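
As a rough illustration of what such metaprogramming might look like, the sketch below mirrors the documented shape (a class, a make() factory, a batch alias, and a Constructor back-reference) with a hypothetical stand-in class, so it runs without stream-json installed. StandInBatch, describeFactory and isProductOf are invented names, not part of the library.

```javascript
const {Transform} = require('stream');

// Hypothetical stand-in that mirrors the documented shape of the module:
// a class with make() and batch() aliases and a Constructor back-reference.
class StandInBatch extends Transform {
  static make(options) {
    return new StandInBatch(options);
  }
  constructor(options) {
    super(Object.assign({}, options, {objectMode: true}));
  }
  _transform(chunk, _encoding, callback) {
    callback(null, chunk); // batching logic omitted for brevity
  }
}
StandInBatch.batch = StandInBatch.make;
StandInBatch.make.Constructor = StandInBatch;

// Hypothetical helper: names the class behind a factory without calling it.
const describeFactory = factory =>
  typeof factory.Constructor === 'function' ? factory.Constructor.name : 'unknown';

// Hypothetical helper: checks whether a stream was produced by a given factory.
const isProductOf = (stream, factory) =>
  typeof factory.Constructor === 'function' && stream instanceof factory.Constructor;

const {batch} = StandInBatch;

// Indirect creation: only the factory is in scope, yet the class is reachable.
const viaConstructor = new batch.Constructor({});
console.log(describeFactory(batch), isProductOf(viaConstructor, batch));
```

The same helpers should work with any factory that exposes a Constructor property in this way, which is the main reason to keep the back-reference on the function itself.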
