split words for strings

Split a string's words into separate rows.

Signature

> split words {flags}

Flags

  • --min-word-length, -l {int}: the minimum length a word must have to be kept
  • --grapheme-clusters, -g: measure word length in grapheme clusters (requires -l)
  • --utf-8-bytes, -b: measure word length in UTF-8 bytes (default; requires -l; each non-ASCII character counts as 2 or more)

Input/output types:

input          output
list<string>   list<list<string>>
string         list<string>

Examples

Split the string's words into separate rows

> 'hello world' | split words
╭───┬───────╮
│ 0 │ hello │
│ 1 │ world │
╰───┴───────╯

Split the string into words of at least 3 characters, one per row

> 'hello to the world' | split words --min-word-length 3
╭───┬───────╮
│ 0 │ hello │
│ 1 │ the   │
│ 2 │ world │
╰───┴───────╯
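
Compare the two ways of measuring word length

The -g and -b flags only give different results for words that contain non-ASCII characters. The following is a rough sketch, not taken from the original examples: the input string is made up for illustration, and no output is shown because it was not verified here. The word 'wörld' is 5 grapheme clusters but 6 UTF-8 bytes, so with a minimum length of 6 it should be kept only when measuring in bytes.

```nu
# Measure in UTF-8 bytes (the default): 'wörld' counts as 6
> 'hello to the wörld' | split words --min-word-length 6 --utf-8-bytes

# Measure in grapheme clusters: 'wörld' counts as 5
> 'hello to the wörld' | split words --min-word-length 6 --grapheme-clusters
```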

A real-world example of splitting words

> http get https://www.gutenberg.org/files/11/11-0.txt | str downcase | split words --min-word-length 2 | uniq --count | sort-by count --reverse | first 10
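
An offline variant of the same word-frequency pipeline

The pipeline above needs network access. As a minimal sketch, the same steps can be tried on a literal string; the sentence and the `first 3` limit are illustrative choices, not part of the original example. Each stage does one job: `str downcase` normalizes case, `split words` produces one word per row, `uniq --count` tallies duplicates, and `sort-by count --reverse` puts the most frequent words first.

```nu
# Count the most frequent words of at least 2 characters in a short sentence
> 'The quick brown fox jumps over the lazy dog and the cat' | str downcase | split words --min-word-length 2 | uniq --count | sort-by count --reverse | first 3
```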