
Reducing and Aggregating

So far you have learnt how to transform one collection into another. Often, however, you want to summarize a collection by reducing it to a single value (a "scalar"). For example, you might need the sum or mean of an array of numbers, or the number of values in a collection that satisfy a certain property.

Other times you need to group data together by a certain key, and summarize or aggregate the results. If you have a list of people and which country they live in, you may want to group the people by country. Or you may want a concise summary such as the number of people living in each country.

Below you will learn how to perform these tasks in Elara with East expressions.

Summarize via reduction

The final East expression for collections to cover is Reduce. At first you may find this to be a somewhat complex expression to use.

A Reduce expression behaves a lot like a for loop in an imperative language. You initialize some intermediate data, iterate over the collection while updating that data, and return the result. Here is a JavaScript loop that counts the number of items in a collection:

function count(collection) {
  let n = 0;
  // for...of visits the values of any iterable (arrays, sets, maps)
  for (const value of collection) {
    n = n + 1;
  }
  return n;
}

You could write similar code to sum the collection, find the maximum value, calculate the mean, etc. The Reduce operation represents the fundamental structure of the calculation above, in a functional-programming style.

The first parameter is the collection to reduce over.

The second parameter is an expression representing the action to perform within the loop. East expressions are a functional language; you cannot mutate values or reassign variables within an expression. Rather you take the output of the previous iteration and construct a new value.

The third is the initial value. This will be used as the "previous" value on the first iteration. It will be returned directly whenever the collection is empty. The East type of this value controls the type of the variable injected into the expression above, and the expected type of that expression.

Reduce(
  collection,
  (n, value) => Add(n, 1),
  0,
)

The above reads as "start with zero, then add 1 for each entry in the collection".
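To connect the three parameters back to the loop shown earlier, here is a minimal sketch in plain JavaScript. It is an analogy only, not East code: reduceLike is a hypothetical helper written for this illustration, and it simply mirrors the (collection, expression, initial value) order described above.

```javascript
// Plain JavaScript analogy, NOT East. `reduceLike` is a hypothetical helper
// that mirrors the (collection, step, initial) argument order described above.
function reduceLike(collection, step, initial) {
  // The initial value acts as the "previous" value on the first iteration.
  let previous = initial;
  for (const value of collection) {
    // `step` builds a new value from the previous one; it does not mutate it.
    previous = step(previous, value);
  }
  // For an empty collection the loop never runs, so `initial` is returned.
  return previous;
}

const countOfThree = reduceLike(["a", "b", "c"], (n, value) => n + 1, 0);
const countOfEmpty = reduceLike([], (n, value) => n + 1, 0);
console.log(countOfThree, countOfEmpty);
```

The only reassignment happens inside the helper; the step function itself only ever returns a new value, which is the part that corresponds to the expression passed to Reduce.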

| East function | Description | Example usage | Result |
| --- | --- | --- | --- |
| Reduce | Add all the values in an array | Reduce([1n, 2n, 3n], (previous, value, key) => Add(previous, value), 0n) | 6 |

You can use Reduce to iterate over arrays, sets or dictionaries. When you need multiple pieces of intermediate data, use a struct to hold the various values. For example the mean might require both the sum and the count, and you can divide the two after the reduction. Once you get used to the pattern, Reduce expressions will become second nature to write.
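The mean calculation mentioned above can be sketched in plain JavaScript as follows. This is an analogy, not East code: a plain object stands in for the struct, meanOf is a hypothetical helper name, and returning null for an empty input is an assumption made for this sketch.

```javascript
// Plain JavaScript analogy, NOT East. A plain object stands in for the struct
// that holds the two pieces of intermediate data (sum and count).
function meanOf(numbers) {
  const totals = numbers.reduce(
    // Return a fresh object each iteration instead of mutating the previous one.
    (previous, value) => ({
      sum: previous.sum + value,
      count: previous.count + 1,
    }),
    { sum: 0, count: 0 }, // initial value
  );
  // Divide after the reduction; null for empty input is an assumption of this sketch.
  return totals.count === 0 ? null : totals.sum / totals.count;
}

const mean = meanOf([2, 4, 9]);
console.log(mean);
```

Keeping the division outside the reduction means each iteration only has to update the two running totals.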

Group via aggregation

East does not have an extra expression to support grouping. Instead, the ToDict expression has a powerful form that lets you group and aggregate data.

By default, when ToDict encounters a key more than once, the last value is kept. Alternatively, ToDict can be configured to keep a running summary of the values that map to each key. It works a bit like performing Reduce on each key, enabling all sorts of grouping and aggregating workflows.

| East function | Description | Example usage | Result |
| --- | --- | --- | --- |
| ToDict | Count the number of times distinct strings appear in an array | ToDict(["a", "b", "a"], (value, key, previous) => Add(previous, 1), (value, key) => value, 0n) | {"a": 2, "b": 1} |

Note how, as with Reduce, the "initial" parameter is provided last. However, the order of arguments in the "value" expression differs from Reduce.
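As a rough illustration of the grouping behaviour, the sketch below counts people per country in plain JavaScript. It is an analogy, not East code: toDictLike is a hypothetical helper, its argument order copies the ToDict example above, and using the initial value as "previous" the first time a key is seen is an assumption based on the comparison with Reduce. The sample people are made up for the example.

```javascript
// Plain JavaScript analogy, NOT East. `toDictLike` is a hypothetical helper whose
// argument order copies the ToDict example:
// (collection, valueFn(value, key, previous), keyFn(value, key), initial)
function toDictLike(collection, valueFn, keyFn, initial) {
  const result = new Map();
  collection.forEach((value, key) => {
    const groupKey = keyFn(value, key);
    // Assumption: the first time a key is seen, `initial` plays the role of "previous".
    const previous = result.has(groupKey) ? result.get(groupKey) : initial;
    result.set(groupKey, valueFn(value, key, previous));
  });
  return result;
}

// Illustrative data only.
const people = [
  { name: "Ana", country: "Portugal" },
  { name: "Ben", country: "Australia" },
  { name: "Caio", country: "Portugal" },
];

const peoplePerCountry = toDictLike(
  people,
  (person, index, previous) => previous + 1, // running count per country
  (person, index) => person.country, // group by country
  0,
);
console.log(peoplePerCountry);
```

Swapping the value function for one that appends the person's name to a list would give the people grouped by country rather than a count.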

Next steps

Continue to the next section to understand how to use binary blob data and files in ELARA.