Select a collection

In this tutorial, you will apply a row-based selection to a collection datastream. You will:

  • use the select() method of PipelineBuilder() to select certain values from a collection datastream, and
  • launch a solution and observe the results.

This lesson assumes that you have an empty project and asset, which you can deploy to a workspace named 03_06_05_select_a_collection with the following command:

edk template deploy -ycw 03_06_05_select_a_collection

Define and deploy a template

To use the select() method you will first perform the following steps:

  1. define a datasource using a similar pattern to your previous My Source definition
  2. set the value of the datasource to a Map<string, { value: bigint }> value
  3. add the datasource to a template

In an asset, perform the above steps to create the resulting TypeScript code:

import { SourceBuilder, Template } from "@elaraai/core"

// a collection datasource: each key maps to a row with a single bigint field
const my_source = new SourceBuilder("My Source")
    .value({
        value: new Map([
            ["0", { value: 0n }],
            ["1", { value: 15n }],
            ["2", { value: 25n }],
            ["3", { value: 55n }],
        ])
    })

export default Template(my_source)

Define a selection

You can define a per-row selection to be output with the select() method of PipelineBuilder(), by taking the following steps:

  1. add a new pipeline "My Pipeline"
  2. add a select() operation
  3. define a single selection of description based on value
  4. add the new pipeline to the template

Add the above changes to the asset to define My Pipeline:

import { SourceBuilder, PipelineBuilder, Template, StringJoin } from "@elaraai/core"

const my_source = new SourceBuilder("My Source")
    .value({
        value: new Map([
            ["0", { value: 0n }],
            ["1", { value: 15n }],
            ["2", { value: 25n }],
            ["3", { value: 55n }],
        ])
    })

const my_pipeline = new PipelineBuilder("My Pipeline")
    .from(my_source.outputStream())
    .select({
        // output only the fields listed in selections
        keep_all: false,
        selections: {
            // build a description string for each row from its value field
            description: fields => StringJoin`Got value ${fields.value}`
        }
    })

export default Template(my_source, my_pipeline)

Observe the selection

Once deployed, you can test your select() by observing the value of the Pipeline.My Pipeline datastream:

edk stream get "Pipeline.My Pipeline" -w 03_06_05_select_a_collection

This will produce the value below:

▹▹▹▹▹ Attempting to stream Pipeline.My Pipeline to stdout
[{"key":"0","value":{"description":"Got value 0"}},
{"key":"1","value":{"description":"Got value 15"}},
{"key":"2","value":{"description":"Got value 25"}},
{"key":"3","value":{"description":"Got value 55"}}]
✔ Download complete

You can observe that each row of Pipeline.My Pipeline now contains only the description field; the original value field is no longer present.
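To make the per-row behaviour concrete, the following is a minimal sketch in plain TypeScript that models what the selection does to each row. It does not use the Elara API: selectRows is a hypothetical helper written only for this illustration, and it assumes a selection behaves like a function applied to every row of the Map, with the keys left unchanged.

```typescript
// Plain TypeScript model of a per-row selection. This is NOT the Elara API;
// selectRows is a hypothetical helper used only to illustrate the idea.
type Row = { value: bigint };

// Apply `selection` to every row and keep the row keys unchanged.
function selectRows<In, Out>(
    rows: Map<string, In>,
    selection: (fields: In) => Out,
): Map<string, Out> {
    const result = new Map<string, Out>();
    rows.forEach((fields, key) => result.set(key, selection(fields)));
    return result;
}

// The same data as My Source.
const source = new Map<string, Row>([
    ["0", { value: 0n }],
    ["1", { value: 15n }],
    ["2", { value: 25n }],
    ["3", { value: 55n }],
]);

// Only `description` is returned, so `value` is dropped from each output row.
const selected = selectRows(source, fields => ({
    description: `Got value ${fields.value}`,
}));

// Convert to the key/value row layout shown in the stream output above.
const printable: { key: string; value: { description: string } }[] = [];
selected.forEach((value, key) => printable.push({ key, value }));

console.log(JSON.stringify(printable));
```

Running this sketch prints the same four rows shown in the stream output, each containing only a description. In the real pipeline the selection is expressed with Elara expressions such as StringJoin rather than an ordinary TypeScript function.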

Example solution

The code for this tutorial is available below:

Next steps

In the next tutorial, you will use the disaggregate() operation to 'unwrap' a child collection value from a collection datastream.