Nushell 0.75

Nushell, or Nu for short, is a new shell that takes a modern, structured approach to your command line. It works seamlessly with the data from your filesystem, operating system, and a growing number of file formats to make it easy to build powerful command line pipelines.

Today, we're releasing version 0.75 of Nu. This release extends our Unicode support, renames some important HTTP-related commands, and improves our module system. It also contains a good amount of polish and refactoring behind the scenes.

Where to get it

Nu 0.75 is available as pre-built binaries or from crates.io. If you have Rust installed you can install it using cargo install nu.

NOTE: The optional dataframe functionality is available by cargo install nu --features=dataframe.

As part of this release, we also publish a set of optional plugins you can install and use with Nu. To install, use cargo install nu_plugin_<plugin name>.

Themes of this release / New features

Changed Unicode escapes in strings (bobhy)

Warning

Breaking Change: You need to update escapes like "\u0043" to "\u{0043}"

New format:

〉echo "AB\u{43}\u{044}"
ABCD
〉echo "Gabriel, blow your \u{1f3BA}"
Gabriel, blow your 🎺

Instead of:

〉echo "AB\u0043"
ABC

This format allows you to insert any Unicode code point into a string by specifying its value as 1 through 6 hex digits (with or without leading zeros, upper or lower case). The maximum value is \u{10ffff}, which is the largest Unicode code point defined.

We've simply dropped support for the old format since we're pre-1.0 and didn't want to carry forward redundant syntax. You will have to change any Unicode escapes in your scripts to the new format.

Why change? The old 4-digit syntax could not natively address recent extensions to the Unicode standard, such as emoji, CJK extensions and traditional scripts. Surrogate pairs offer a workaround, but it is cumbersome and not intuitive.

Why this change? The new format allows you to specify any Unicode code point with a single, predictable syntax. Rust and ECMAScript 6 support the same syntax. (Full disclosure: C++, Python and Java don't.)
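As a rough migration sketch (the strings are illustrative, not taken from a real script), the lines below rewrite an old four-digit escape into the braced form and check the claim that leading zeros and hex digit case make no difference. They are meant to be entered at the prompt one at a time:

```nu
# Pre-0.75 form, no longer accepted:  "caf\u00e9"
# 0.75 form: braces around 1 to 6 hex digits
echo "caf\u{e9}"

# Leading zeros and upper/lower case are interchangeable,
# so both escapes name the same code point (U+00E9).
"\u{e9}" == "\u{00E9}"    # true
```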

`-g` grapheme cluster flags for `str length`, `str substring`, `str index-of`, `split words` and `split chars` (webbedspace)

As you know, str length, str substring, str index-of and split words measure the length of strings and substrings in UTF-8 bytes, which is often very unintuitive - all non-ASCII characters are of length 2 or more, and splitting a non-ASCII character can create garbage characters as a result. A much better alternative is to measure the length in extended grapheme clusters. In Unicode, a "grapheme cluster" tries to map as closely as possible to a single visible character. This means, among many other things:

Non-ASCII characters, such as ん, are considered single units of length 1, no matter how many UTF-8 bytes they use.
Combined characters, such as e and ◌́ being combined to produce é, are considered single units of length 1.
Emojis, including combined emojis such as 🇯🇵, which is made of the two regional indicator symbols 🇯 and 🇵, are considered single units of length 1. (This is a property of "extended" grapheme clusters.)
"\r\n" is considered a single unit of length 1.

The new --graphemes/-g flag can be used with str length, str substring, str index-of and split words to enable these length/indexing measurements:

〉'🇯🇵ほげ ふが ぴよ'   | str substring 4..6 -g
ふが
〉'🇯🇵ほげ ふが ぴよ'   | str length -g
9
〉'🇯🇵ほげ ふが ぴよ'   | str index-of 'ふが' -g
4
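To make the difference from the default measurement concrete, here is a small sketch that runs str length with and without -g on the cases listed above. The byte counts in the comments follow from the UTF-8 encoding of each string and assume the 0.75 default of counting UTF-8 bytes:

```nu
# Default (UTF-8 bytes): ん is encoded as three bytes
'ん' | str length            # 3
# With -g: one grapheme cluster
'ん' | str length -g         # 1

# e followed by the combining acute accent U+0301
"e\u{301}" | str length      # 3 (1 byte + 2 bytes)
"e\u{301}" | str length -g   # 1

# A Windows line ending counts as one grapheme cluster
"\r\n" | str length -g       # 1
```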

In addition, the flag has been added to split chars. Notably, this command splits on Unicode code points rather than UTF-8 bytes, so it doesn't have the issue of turning non-ASCII characters into garbage characters. However, combining emoji and combining characters do not correspond to single code points, and are split by split chars. The -g flag keeps those characters intact:

〉'🇯🇵ほげ' | split chars -g | to nuon
[🇯🇵, ほ, げ]
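The same contrast can be sketched for a combining character. The example string, e followed by U+0301, is an assumption chosen for illustration; counting the pieces with length shows whether the accent was split off:

```nu
# Code points (default): the base letter and the combining accent are separated
"e\u{301}" | split chars | length      # 2
# Grapheme clusters: the accented letter stays in one piece
"e\u{301}" | split chars -g | length   # 1
```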

These commands also have --utf-8-bytes/-b flags which enable the legacy behavior (and split chars has --code-points/-c). These currently do not do anything and need not strictly be used, since UTF-8 byte lengths are still the default behaviour. However, if this default someday changes, then these flags will guarantee that the legacy behaviour is used.

Tips

It is currently being debated whether or not grapheme clusters should be used as the default string length measurement in Nushell (making the -g flag the default behaviour for these commands), due to their intuitiveness, consistency across non-Latin scripts, and better correspondence to the length of strings displayed in the terminal (after stripping ANSI codes). Currently, the Nushell developers are uncertain what the long-term impact of such a change would be, whether existing scripts would be non-trivially harmed by it, or whether it would hinder interoperability with external programs. If you have any personal insight or opinion about this change, or about which behaviour you'd prefer not to require a flag, your input is desired!

New `enumerate` command (Sophia)

The new enumerate command enumerates its input, wrapping each item in a record with index and item fields. The index is the position of the item in the input stream, and item is the original value of the item.

> ls | enumerate | get 14
╭───────┬────────────────────────────╮
│ index │ 14                         │
│       │ ╭──────────┬─────────────╮ │
│ item  │ │ name     │ crates      │ │
│       │ │ type     │ dir         │ │
│       │ │ size     │ 832 B       │ │
│       │ │ modified │ 2 weeks ago │ │
│       │ ╰──────────┴─────────────╯ │
╰───────┴────────────────────────────╯

Rather than relying on the --numbered flags of commands like each, the enumerate command takes a more modular and composable approach than hard-coding flags into our commands. (Note: The --numbered flags have not been removed yet.)
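As a minimal sketch of that composition (the list contents are made up for illustration), enumerate can feed each so the closure sees both fields of the record:

```nu
# Prefix every item with its position in the stream.
# $it is the record produced by enumerate: { index: ..., item: ... }
[apple banana cherry] | enumerate | each { |it| $"($it.index): ($it.item)" }
```

This yields a list of three strings, starting with 0: apple.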

We decided to move some of the important commands for interacting with HTTP under their own http subcommands for better discoverability. The common fetch command is now http get.

Old name	New name beginning with `0.75`
`fetch`	`http get`
`post`	`http post`
`to url`	`url build-query`
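A migration sketch for the renames in the table. The example.com URLs and the request body are placeholders, not real endpoints, so the first two commands will not return meaningful data as written:

```nu
# fetch -> http get
# Before 0.75:  fetch https://example.com/api/items
http get https://example.com/api/items

# post -> http post (URL first, then the body)
http post https://example.com/api/items "name=widget"

# to url -> url build-query: turn a record into a query string
{ mode: normal, userid: 31415 } | url build-query    # mode=normal&userid=31415
```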

`main` command exported from module defines top-level module command (kubouch)

Defining and exporting a main command from a module allows creating a command with the same name as the module. Consider this example:

# command.nu
export def main [] { 'This is a command' }

export def subcommand [] { 'This is a subcommand' }

Then:

> use command.nu

> command
This is a command

> command subcommand
This is a subcommand

The same thing works with overlay use as well. Note that the main command continues to work the same way as before when running a script:

> nu command.nu
This is a command

Combined with a recent bugfix, this feature allows for a nicer way of defining known externals and custom completions.

Before:

# cargo.nu

export extern cargo [--version, --color: string@cargo-color-complete]

export extern `cargo check` [--quiet]

def cargo-color-complete [] {
    [ auto, always, never ]
}

After:

# cargo.nu

export extern main [--version, --color: string@cargo-color-complete]

export extern check [--quiet]

def cargo-color-complete [] {
    [ auto, always, never ]
}

It is also a stepping stone towards being able to handle directories, which in turn is a stepping stone towards having proper Nushell packages.

Progress bar for `save` command (Xoffio)

To watch the progress when saving large files you can now pass the --progress flag to save. It gives information about the throughput and an interactive progress bar if available.
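A minimal usage sketch, assuming a large local file is copied through a raw byte stream. The names big.iso and copy.iso are placeholders:

```nu
# Stream the file as raw bytes and show progress while writing the copy
open --raw big.iso | save --progress copy.iso
```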

Progress for saving a file

Breaking changes

Unicode escapes in strings now use an extended format \u{X...}. Any scripts using the old syntax \uXXXX will have to be updated. See also #7883.
The to url command has been renamed and moved to url build-query, as this better reflects its role as a Nushell-specific url command rather than a conversion. (#7702)
fetch has been renamed to http get (#7796)
post has been renamed to http post (#7796)
Quotes are trimmed when escaping to cmd.exe (#7740)
parse -r now uses zero-indexed rows and uncapitalized columns (#7897)
last, skip, drop, take until, take while, skip until, skip while, where, reverse, shuffle, append, prepend and sort-by raise an error when given non-lists (#7623)
to csv and to tsv now throw an error on unsupported inputs (#7850)