Reducing and Aggregating
So far you have learnt how to transform one collection into another. Often, however, you want to summarize a collection instead, transforming a collection of things into a single thing (a "scalar"). For example, you might need the sum or mean of an array of numbers, or the number of values in a collection that satisfy a certain property.
Other times you need to group data together by a certain key, and summarize or aggregate the results. If you have a list of people and which country they live in, you may want to group the people by country. Or you may want a concise summary such as the number of people living in each country.
Below you will learn how to perform these tasks in ELARA with East expressions.
Summarize via reduction
The final East expression for collections to cover is Reduce.
At first you may find this to be a somewhat complex expression to use.
A Reduce expression behaves a lot like a for loop in an imperative language.
You get to initialize some intermediate data, iterate over the collection updating the data, and return the outputs.
Here is a loop that counts the number of items in a collection in JavaScript:
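function count(collection) {
  // Running count, updated once per item
  let n = 0;
  // for...of visits the values of any iterable (for...in would visit property keys)
  for (const value of collection) {
    n = n + 1;
  }
  return n;
}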
You could write similar code to sum the collection, find the maximum value, calculate the mean, etc.
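For instance, the same loop shape gives a sum or a maximum; only the initial value and the update line change. The following is a plain TypeScript sketch, not East code, and the function names sumOf and maxOf are illustrative:

```typescript
// Sum: start at 0 and add each value to the running total.
function sumOf(collection: readonly number[]): number {
  let total = 0;
  for (const value of collection) {
    total = total + value;
  }
  return total;
}

// Maximum: start at -Infinity and keep the larger of the running
// maximum and each value. An empty input returns -Infinity.
function maxOf(collection: readonly number[]): number {
  let largest = -Infinity;
  for (const value of collection) {
    largest = value > largest ? value : largest;
  }
  return largest;
}
```

In both cases there is some intermediate data, an update step per item, and a final result, which is the structure a reduction captures.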
The Reduce operation represents the fundamental structure of the calculation above, in a functional-programming style.
The first parameter is the collection to reduce over.
The second parameter is an expression representing the action to perform within the loop. East expressions are a functional language; you cannot mutate values or reassign variables within an expression. Rather you take the output of the previous iteration and construct a new value.
The third is the initial value. This will be used as the "previous" value on the first iteration. It will be returned directly whenever the collection is empty. The East type of this value controls the type of the variable injected into the expression above, and the expected type of that expression.
Reduce(
  collection,
  (n, value) => Add(n, 1),
  0,
)
The above reads as "start with zero, add 1 for each entry in the collection".
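To connect this back to the loop, here is a minimal sketch in plain TypeScript of what a reduction with this parameter order does. The helper name reduceLike is hypothetical and this is not East's implementation; it only mirrors the three parameters described above, for an array input.

```typescript
// Hypothetical helper mirroring the parameter order described above:
// the collection first, then the step function, then the initial value.
function reduceLike<T, A>(
  collection: readonly T[],
  step: (previous: A, value: T) => A,
  initial: A,
): A {
  let previous = initial; // used as "previous" on the first iteration
  for (const value of collection) {
    // The step builds a new value from the previous one; it does not mutate it.
    previous = step(previous, value);
  }
  return previous; // still the initial value when the collection is empty
}

// Counting: start with zero, add 1 for each entry.
const entryCount = reduceLike(["a", "b", "c"], (n, _value) => n + 1, 0);

// Summing: start with zero, add each value.
const valueTotal = reduceLike([1, 2, 3], (previous, value) => previous + value, 0);
```

The only mutable variable lives inside the helper; the step function itself is a pure function of the previous result and the current value, which is the part you write as an expression.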
East function | Description | Example usage | Result |
---|---|---|---|
Reduce | Add all the values in an array | Reduce([1n, 2n, 3n], (previous, value, key) => Add(previous, value), 0n) | 6 |
You can use Reduce to iterate over arrays, sets or dictionaries.
When you need multiple pieces of intermediate data, use a struct to hold the various values.
For example, computing the mean requires both the sum and the count; you divide the two after the reduction.
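As an illustration of the struct pattern, the sketch below computes a mean in plain TypeScript using the built-in Array.prototype.reduce, which takes the callback first and the initial value second. The accumulator shape { sum, count } and the function name meanOf are illustrative choices, not East API.

```typescript
// Intermediate data held together in one struct-like object.
interface SumAndCount {
  sum: number;
  count: number;
}

function meanOf(values: readonly number[]): number {
  const result = values.reduce<SumAndCount>(
    // Return a new object on each iteration instead of mutating the previous one.
    (previous, value) => ({
      sum: previous.sum + value,
      count: previous.count + 1,
    }),
    { sum: 0, count: 0 },
  );
  // Divide after the reduction. An empty input gives 0 / 0, which is NaN.
  return result.sum / result.count;
}
```

Depending on your use case, you may prefer to handle the empty collection explicitly rather than return NaN.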
Once you get used to the pattern, Reduce expressions will become second nature to write.
Group via aggregation
East does not have a separate expression dedicated to grouping.
Instead, the ToDict expression has a powerful form that lets you group and aggregate data.
By default, when ToDict encounters a key more than once, the last value is kept.
Alternatively, ToDict can be configured to keep a running summary of the values that map to each key.
It works a bit like performing Reduce on each key, enabling all sorts of grouping and aggregating workflows.
East function | Description | Example usage | Result |
---|---|---|---|
ToDict | Count the number of times distinct strings appear in an array | ToDict(["a", "b", "a"], (value, key, previous) => Add(previous, 1), (value, key) => value, 0n) | {"a": 2, "b": 1} |
Note how, as with Reduce, the "initial" parameter is provided last.
However, the order of arguments in the "value" expression differs: ToDict passes (value, key, previous), whereas Reduce passes (previous, value, key).
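The grouping behaviour can be sketched in plain TypeScript with an ordinary object as the dictionary. The helper toDictLike below is hypothetical and is not East's implementation; it only mirrors the argument order shown in the table (collection, value expression, key expression, initial value) for an array input. Passing the array index as the key argument is an assumption made for this sketch, and the sample people are invented for illustration.

```typescript
// Hypothetical helper: group items by a computed string key and keep a
// running summary per key, starting each new key from the initial value.
function toDictLike<T, A>(
  collection: readonly T[],
  valueFn: (value: T, key: number, previous: A) => A,
  keyFn: (value: T, key: number) => string,
  initial: A,
): Record<string, A> {
  const result: Record<string, A> = {};
  collection.forEach((value, index) => {
    const outputKey = keyFn(value, index);
    // The first time a key is seen, "previous" is the initial value.
    const seen = Object.prototype.hasOwnProperty.call(result, outputKey);
    const previous = seen ? result[outputKey] : initial;
    result[outputKey] = valueFn(value, index, previous);
  });
  return result;
}

// Count how often each string appears.
const letterCounts = toDictLike(
  ["a", "b", "a"],
  (_value, _key, previous) => previous + 1,
  (value) => value,
  0,
);

// Count people per country (sample data, invented for illustration).
const samplePeople = [
  { name: "Ana", country: "PT" },
  { name: "Ben", country: "AU" },
  { name: "Caio", country: "PT" },
];
const peoplePerCountry = toDictLike(
  samplePeople,
  (_person, _key, previous) => previous + 1,
  (person) => person.country,
  0,
);
```

In this sketch, a value function that ignores previous and simply returns the current value reproduces the "last value is kept" behaviour, while one that builds on previous produces a per-key aggregate.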
Next steps
Continue to the next section to understand how to use binary blob data and files in ELARA.