Navigating and Accessing Structured Data
Given Nushell's strong support for structured data, some of the more common tasks involve navigating and accessing that data.
Index to this Section
- Background and Definitions
- Cell-paths
- With Records
- With Lists
- With Tables
- Sample Data
- Example - Access a Table Row
- Example - Access a Table Column
- With Nested Data
- Using get and select
- Example - get vs. select with a Table Row
- Example - select with multiple rows and columns
- Handling missing data using the optional operator
- Key/Column names with spaces
- Other commands for navigating structured data
Background
For the examples and descriptions below, keep in mind several definitions regarding structured data:
- List: Lists contain a series of zero or more values of any type. A list with zero values is known as an "empty list".
- Record: Records contain zero or more pairs of named keys and their corresponding value. The data in a record's value can also be of any type. A record with zero key-value pairs is known as an "empty record".
- Table: Tables are a list of records.
- Nested Data: The values contained in a list, record, or table can be either of a basic type or structured data themselves. This means that data can be nested multiple levels and in multiple forms:
- List values can contain tables, records, and even other lists
- Record values can contain tables, lists, and other records
- This means that the records of a table can also contain nested tables, lists, and other records
Tips
Because a table is a list of records, any command or syntax that works on a list will also work on a table. The converse is not necessarily the case; there are some commands and syntax that work on tables but not lists.
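As a minimal sketch of this tip, the snippet below builds a small table as a list of records and runs a list command on it. The variable name `fruit` and its contents are hypothetical sample data, not part of the examples used elsewhere in this section.

```nu
# Hypothetical sample data: a table written as a list of records
let fruit = [{name: apple, qty: 3} {name: pear, qty: 5}]

# `describe` reports the value as a table
$fruit | describe
# => table<name: string, qty: int>

# `length` is a list command, and it works on the table as well
$fruit | length
# => 2
```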
Cell-paths
A cell-path is the primary way to access values inside structured data. This path is based on a concept similar to that of a spreadsheet, where columns have names and rows have numbers. Cell-path names and indices are separated by dots.
Records
For a record, the cell-path specifies the name of a key, which is a string.
Example - Access a Record Value:
let my_record = {
a: 5
b: 42
}
$my_record.b + 5
# => 47
Lists
For a list, the cell-path specifies the position (index) of the value in the list. This is an int:
Example - Access a List Value:
Remember, list indices are 0-based.
let scoobies_list = [ Velma Fred Daphne Shaggy Scooby ]
$scoobies_list.2
# => Daphne
Tables
- To access a column, a cell-path uses the name of the column, which is a string
- To access a row, it uses the index number of the row, which is an int
- To access a single cell, it uses a combination of the column name with the row index.
The next few examples will use the following table:
let data = [
[date temps condition ];
[2022-02-01T14:30:00+05:00, [38.24, 38.50, 37.99, 37.98, 39.10], 'sunny' ],
[2022-02-02T14:30:00+05:00, [35.24, 35.94, 34.91, 35.24, 36.65], 'sunny' ],
[2022-02-03T14:30:00+05:00, [35.17, 36.67, 34.42, 35.76, 36.52], 'cloudy' ],
[2022-02-04T14:30:00+05:00, [39.24, 40.94, 39.21, 38.99, 38.80], 'rain' ]
]
A visual representation of this data:
╭───┬─────────────┬───────────────┬───────────╮
│ # │ date │ temps │ condition │
├───┼─────────────┼───────────────┼───────────┤
│ 0 │ 2 years ago │ ╭───┬───────╮ │ sunny │
│ │ │ │ 0 │ 38.24 │ │ │
│ │ │ │ 1 │ 38.50 │ │ │
│ │ │ │ 2 │ 37.99 │ │ │
│ │ │ │ 3 │ 37.98 │ │ │
│ │ │ │ 4 │ 39.10 │ │ │
│ │ │ ╰───┴───────╯ │ │
│ 1 │ 2 years ago │ ╭───┬───────╮ │ sunny │
│ │ │ │ 0 │ 35.24 │ │ │
│ │ │ │ 1 │ 35.94 │ │ │
│ │ │ │ 2 │ 34.91 │ │ │
│ │ │ │ 3 │ 35.24 │ │ │
│ │ │ │ 4 │ 36.65 │ │ │
│ │ │ ╰───┴───────╯ │ │
│ 2 │ 2 years ago │ ╭───┬───────╮ │ cloudy │
│ │ │ │ 0 │ 35.17 │ │ │
│ │ │ │ 1 │ 36.67 │ │ │
│ │ │ │ 2 │ 34.42 │ │ │
│ │ │ │ 3 │ 35.76 │ │ │
│ │ │ │ 4 │ 36.52 │ │ │
│ │ │ ╰───┴───────╯ │ │
│ 3 │ 2 years ago │ ╭───┬───────╮ │ rain │
│ │ │ │ 0 │ 39.24 │ │ │
│ │ │ │ 1 │ 40.94 │ │ │
│ │ │ │ 2 │ 39.21 │ │ │
│ │ │ │ 3 │ 38.99 │ │ │
│ │ │ │ 4 │ 38.80 │ │ │
│ │ │ ╰───┴───────╯ │ │
╰───┴─────────────┴───────────────┴───────────╯
This represents weather data in the form of a table with three columns:
- date: A Nushell date for each day
- temps: A Nushell list of 5 float values representing temperature readings at different weather stations in the area
- condition: A Nushell string for each day's weather condition for the area
Example - Access a Table Row (Record)
Access the second day's data as a record:
$data.1
# => ╭───────────┬───────────────╮
# => │ date │ 2 years ago │
# => │ │ ╭───┬───────╮ │
# => │ temps │ │ 0 │ 35.24 │ │
# => │ │ │ 1 │ 35.94 │ │
# => │ │ │ 2 │ 34.91 │ │
# => │ │ │ 3 │ 35.24 │ │
# => │ │ │ 4 │ 36.65 │ │
# => │ │ ╰───┴───────╯ │
# => │ condition │ sunny │
# => ╰───────────┴───────────────╯
Example - Access a Table Column (List)
$data.condition
# => ╭───┬────────╮
# => │ 0 │ sunny │
# => │ 1 │ sunny │
# => │ 2 │ cloudy │
# => │ 3 │ rain │
# => ╰───┴────────╯
Example - Access a Table Cell (Value)
The condition for the fourth day:
$data.condition.3
# => rain
Nested Data
Since data can be nested, a cell-path can contain references to multiple names or indices.
Example - Accessing Nested Table Data
To obtain the temperature at the second weather station on the third day:
$data.temps.2.1
# => 36.67
The first index 2 accesses the third day, then the next index 1 accesses the second weather station's temperature reading.
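The same nested value can usually be reached with the row index before the column name. The sketch below uses a smaller, hypothetical table named `readings` (not the weather data above) to show both orderings.

```nu
# Hypothetical sample data: two days, two temperature readings each
let readings = [[day temps]; [mon [10.5 11.0]] [tue [12.5 13.0]]]

# Column first: the temps column, then row 1, then the first reading
$readings.temps.1.0
# => 12.5

# Row first: row 1 (a record), then its temps key, then the first reading
$readings.1.temps.0
# => 12.5
```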
Using get and select
In addition to the cell-path literal syntax used above, Nushell also provides several commands that utilize cell-paths. The most important of these are:
- get is equivalent to using a cell-path literal but with support for variable names and expressions. get, like the cell-path examples above, returns the value indicated by the cell-path.
- select is subtly, but critically, different. It returns the specified data structure itself, rather than just its value.
- Using select on a table will return a table of equal or lesser size
- Using select on a list will return a list of equal or lesser size
- Using select on a record will return a record of equal or lesser size
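To illustrate the point about variable names, here is a small sketch in which the column name is held in a variable and passed to both commands. The table `stock` and the variable `column` are hypothetical names introduced only for this illustration.

```nu
# Hypothetical sample data
let stock = [[name qty]; [apple 3] [pear 5]]
let column = "qty"

# get returns the column's values (a list), which can be summed
$stock | get $column | math sum
# => 8

# select keeps the table structure, narrowed to the one column
$stock | select $column | describe
# => table<qty: int>
```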
Continuing with the sample table above:
Example - get vs. select a table row
$data | get 1
# => ╭───────────┬───────────────╮
# => │ date │ 2 years ago │
# => │ │ ╭───┬───────╮ │
# => │ temps │ │ 0 │ 35.24 │ │
# => │ │ │ 1 │ 35.94 │ │
# => │ │ │ 2 │ 34.91 │ │
# => │ │ │ 3 │ 35.24 │ │
# => │ │ │ 4 │ 36.65 │ │
# => │ │ ╰───┴───────╯ │
# => │ condition │ sunny │
# => ╰───────────┴───────────────╯
$data | select 1
# => ╭───┬─────────────┬───────────────┬───────────╮
# => │ # │ date │ temps │ condition │
# => ├───┼─────────────┼───────────────┼───────────┤
# => │ 0 │ 2 years ago │ ╭───┬───────╮ │ sunny │
# => │ │ │ │ 0 │ 35.24 │ │ │
# => │ │ │ │ 1 │ 35.94 │ │ │
# => │ │ │ │ 2 │ 34.91 │ │ │
# => │ │ │ │ 3 │ 35.24 │ │ │
# => │ │ │ │ 4 │ 36.65 │ │ │
# => │ │ │ ╰───┴───────╯ │ │
# => ╰───┴─────────────┴───────────────┴───────────╯
Notice that:
- get returns the same record as the $data.1 example above
- select returns a new, single-row table, including column names and row indices
Tips
The row indices of the table resulting from select are not the same as those of the original. The new table has its own 0-based index.
To obtain the original index, you can use the enumerate command. For example:
$data | enumerate | select 1
Example - select with multiple rows and columns
Because select results in a new table, it's possible to specify multiple column names, row indices, or even both. This example creates a new table containing the date and condition columns of the first and second rows:
$data | select date condition 0 1
# => ╭───┬─────────────┬───────────╮
# => │ # │ date │ condition │
# => ├───┼─────────────┼───────────┤
# => │ 0 │ 2 years ago │ sunny │
# => │ 1 │ 2 years ago │ sunny │
# => ╰───┴─────────────┴───────────╯
Key/Column names with spaces
If a key name or column name contains spaces or other characters that prevent it from being accessible as a bare-word string, then the key name may be quoted.
Example:
let record_example = {
"key x":12
"key y":4
}
$record_example."key x"
# => 12
# or
$record_example | get "key x"
# => 12
Quotes are also required when a key name may be confused for a numeric value.
Example:
let record_example = {
"1": foo
"2": baz
"3": far
}
$record_example."1"
# => foo
Do not confuse the key name with a row index in this case. Here, the first item is assigned the key name 1 (a string). If converted to a table using the transpose command, key 1 (string) would be at row-index 0 (an integer).
Handling Missing Data
The Optional Operator
By default, cell path access will fail if it can't access the requested row or column. To suppress these errors, you can add a ? to a cell path member to mark it as optional:
Example - The Optional Operator
Using the temp data from above:
let cp: cell-path = $.temps?.1 # only get the 2nd location from the temps column
# Oops, we've removed the temps column
$data | reject temps | get $cp
By default, missing cells will be replaced by null when accessed via the optional operator.
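A simpler sketch of the same behavior on a record may help. The record `settings` and its keys are hypothetical. Accessing a missing key with ? yields null (type nothing) instead of an error, and that null can then be replaced with the default command.

```nu
# Hypothetical sample data: a record with no "font" key
let settings = {theme: dark}

# The optional member returns null rather than raising an error
$settings.font? | describe
# => nothing

# The null can then be replaced with a fallback value
$settings.font? | default "monospace"
# => monospace
```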
Assigning a default for missing or null data
The default command can be used to apply a default value to a missing or null column result.
let missing_value = [{a:1 b:2} {b:1}]
$missing_value
# => ╭───┬────┬───╮
# => │ # │ a │ b │
# => ├───┼────┼───┤
# => │ 0 │ 1 │ 2 │
# => │ 1 │ ❎ │ 1 │
# => ╰───┴────┴───╯
let with_default_value = ($missing_value | default 'n/a' a)
$with_default_value
# => ╭───┬─────┬───╮
# => │ # │ a │ b │
# => ├───┼─────┼───┤
# => │ 0 │ 1 │ 2 │
# => │ 1 │ n/a │ 1 │
# => ╰───┴─────┴───╯
$with_default_value.1.a
# => n/a