XUTools: eXtended UNIX Text-Processing Tools

XUTools was designed so that practitioners can process files in terms of the language constructs appropriate to the problem at hand, since many of these languages lie beyond regular expressions. We extended traditional UNIX tools because many modern structured-text formats break the assumptions those tools make.

Traditional UNIX tools operate on sequences of characters, bytes, fields, lines, and files. However, practitioners often want to manipulate files in terms of a variety of language-specific constructs--C functions, Cisco IOS interface blocks, and XML elements, to name a few.

We designed and built text-processing tools for practitioners to extract (xugrep(1)), count (xuwc(1)), and compare (xudiff(1)) texts in terms of language-specific structures.

  • xugrep(1): Traditional UNIX grep(1) extracts all lines in a file that contain strings in the language of a regular expression. Our xugrep(1) generalizes the class of languages that we can practically extract on the UNIX command line from regular to context-free. Patterns to extract may be expressed as regular or context-free grammars.
  • xuwc(1): Traditional wc(1) counts the number of words, lines, characters, or bytes contained in each input file or standard input. Our xuwc(1) generalizes wc(1) to count strings in context-free languages and to report those counts relative to language-specific contexts.
  • xudiff(1): Traditional UNIX diff(1) computes an edit script between the sequences of lines in two files. Our xudiff(1) generalizes diff(1) to compare two files in terms of their respective parse trees generated by a context-free or regular grammar.
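
To make the three operations above concrete, here is a minimal Python sketch of the underlying idea: extract, count, and compare text per language construct rather than per line. It is not the XUTools implementation and does not use the XUTools command-line syntax. The sample configuration, the function names, and the indentation-based scanner (standing in for a real grammar-driven parser) are all assumptions made for illustration.

```python
import difflib

# Hypothetical sample input: a tiny Cisco IOS-style configuration.
CONFIG_V1 = """\
hostname router1
interface Loopback0
 ip address 10.0.0.1 255.255.255.255
!
interface GigabitEthernet0/1
 description uplink
 ip address 192.0.2.1 255.255.255.0
!
line vty 0 4
 login
"""

# Second version: one line changed inside one interface block.
CONFIG_V2 = CONFIG_V1.replace("description uplink", "description core uplink")


def extract_interfaces(text):
    """Extraction step: return {interface name: [lines of its block]}.

    Assumption: a block starts at an unindented 'interface <name>' line and
    continues through the indented lines that follow it. A real tool would
    derive blocks from a grammar instead of this hand-written scanner.
    """
    blocks = {}
    current = None
    for line in text.splitlines():
        if line.startswith("interface "):
            current = line.split(None, 1)[1]
            blocks[current] = [line]
        elif current is not None and line.startswith(" "):
            blocks[current].append(line)
        else:
            current = None  # any other line ends the current block
    return blocks


def count_lines_per_interface(text):
    """Counting step: report line counts relative to each interface block."""
    return {name: len(lines) for name, lines in extract_interfaces(text).items()}


def changed_interfaces(old, new):
    """Comparison step: diff the two files block by block, not line by line."""
    old_blocks, new_blocks = extract_interfaces(old), extract_interfaces(new)
    changes = {}
    for name in sorted(set(old_blocks) | set(new_blocks)):
        a = old_blocks.get(name, [])
        b = new_blocks.get(name, [])
        if a != b:
            changes[name] = list(difflib.unified_diff(a, b, lineterm="", n=0))
    return changes


if __name__ == "__main__":
    print(count_lines_per_interface(CONFIG_V1))
    for name, diff in changed_interfaces(CONFIG_V1, CONFIG_V2).items():
        print(name)
        print("\n".join(diff))
```

The point of the sketch is the unit of reporting: counts and differences are attributed to a named interface block, which a purely line-oriented grep(1), wc(1), or diff(1) cannot do on its own.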

For more information, or to use XUTools, visit https://github.com/gabriel-weaver/xutools/wiki.

Research Demo:

XUTools: Demo of XUTools, usable management tools for the smart grid's data avalanche. Presented by Gabriel Weaver during the TCIPG Industry Workshop, October 2012.

TCIPG Seminar Series - Jan 4, 2013
Invited Presentation by Gabriel Weaver:

How Extended Unix Tools Can Measure the Changing Security Posture of Power-Control Networks
