5 Customized command-line processing

Part of the intent of gema is that it can be used as a means of implementing more specialized tools. A utility program is defined by the command line arguments that it uses as well as by how it processes its input files. Therefore, gema provides a way to customize the handling of command line arguments.

The main program of gema just does some initialization, and then processes the command line arguments by translating them with a set of built-in patterns. The rules that define the command line arguments belong to a domain named ``ARGV''. The user is free to add rules to this domain, thereby implementing new command line options, or even to undefine existing rules. In the input stream that is translated by the ARGV domain, the command line arguments are separated by newline characters.[Footnote 1] The actions for the ARGV rules are expected to do all their work through side effects and not to return any value. Any value that is returned by the translation (except for the delimiting newlines) will be reported by the main program as undefined arguments.
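As a rough illustration of this flow, the following Python sketch models the idea: the arguments are joined into a newline-delimited stream, rules are tried at the start of each line and act only through side effects, and any line that no rule consumes is left over to be reported as an undefined argument. This is a simplified model, not gema's actual implementation. The names translate_argv, make_rules, process, and settings are hypothetical, and the sketch tries rules in list order, whereas gema chooses among competing rules by its own matching priorities.

```python
import re


def make_rules(settings):
    """Build two hypothetical rules that record options in `settings`."""

    def set_odir(match):
        # Models a rule like "-odir" followed by its value on the next line.
        settings["odir"] = match.group(1)

    def set_switch(match):
        # Models a single-letter switch such as "-v".
        settings[match.group(1)] = True

    return [
        (re.compile(r"-odir\n(.*)\n"), set_odir),
        (re.compile(r"-([A-Za-z])\n"), set_switch),
    ]


def translate_argv(args, rules):
    """Translate a newline-delimited argument stream; return unmatched lines."""
    stream = "".join(arg + "\n" for arg in args)
    leftovers = []
    pos = 0
    while pos < len(stream):
        for pattern, action in rules:
            match = pattern.match(stream, pos)  # anchored at the line start
            if match:
                action(match)  # side effect only; nothing is emitted
                pos = match.end()
                break
        else:
            # No rule matched: the line passes through as output text,
            # which the caller treats as an undefined argument.
            end = stream.index("\n", pos)
            leftovers.append(stream[pos:end])
            pos = end + 1
    return leftovers


def process(args):
    """Run the model and return (settings, undefined_arguments)."""
    settings = {}
    leftovers = translate_argv(args, make_rules(settings))
    return settings, leftovers
```

For example, process(["-odir", "out", "-v", "stray"]) records the two options in the settings dictionary and returns "stray" as the only undefined argument.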

The complete set of built-in ARGV rules can be seen by looking at the source file ``gema.c'' in the variable argv_rules. Here are a few representative examples:

  ARGV:\N-idchars\n*\n=@set-parm{idchars;$1}
  ARGV:\N-literal\n*\n=@set-syntax{L;$1}
  ARGV:\N-p\n*\n=@define{*}
  ARGV:\N\L*\=*\n=@define{$0}
  ARGV:\N-odir\n*\n=@set{.ODIR;*}
  ARGV:\N-<L1>\n=@set-switch{$1;1}
  ARGV:\N-*\n=@err{Unrecognized option\:\ "-*"\n}@exit-status{3}

For an example of extending the command line options, suppose you wanted to emulate a C pre-processor by accepting ``-D'' options to define macros. That could be done by defining rules such as:

  ARGV:\N-D<I>\=*\n=@define{\\I$1\\I\=@quote{$2}}
  ARGV:\N-D<I>\n=@define{\\I$1\\I\=1}
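To make the intended effect of these two rules concrete, here is a small Python model of the behavior they aim for. It is a sketch under stated assumptions rather than a description of how gema evaluates the rules: the function names parse_defines and expand are hypothetical, identifiers are assumed to consist of letters, digits, and underscores, and each macro value is substituted literally without being rescanned.

```python
import re

# Assumed identifier shape: a letter or underscore, then letters/digits/underscores.
_DEFINE = re.compile(r"-D([A-Za-z_][A-Za-z0-9_]*)(?:=(.*))?", re.DOTALL)


def parse_defines(args):
    """Collect -DNAME=VALUE and -DNAME (which defaults to "1") into a dict."""
    macros = {}
    for arg in args:
        match = _DEFINE.fullmatch(arg)
        if match:
            value = match.group(2)
            macros[match.group(1)] = "1" if value is None else value
    return macros


def expand(text, macros):
    """Replace whole identifiers that name a macro with the macro's value."""
    if not macros:
        return text
    names = sorted(macros, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])("
        + "|".join(re.escape(name) for name in names)
        + r")(?![A-Za-z0-9_])"
    )
    # A function replacement inserts the value literally (no backslash handling).
    return pattern.sub(lambda m: macros[m.group(1)], text)
```

The lookaround assertions play the part of the identifier delimiters in the rules above: a macro named SIZE is replaced where it stands alone, but not inside a longer identifier such as RESIZE.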

Instead of adding to the built-in rules, it is also possible to suppress the built-in rules and define your own rules from scratch. To do this, start the program with a command line like:

  gema -prim pattern-file ...

The -prim (``primitive mode'') option suppresses loading of the built-in rules and reads patterns from the specified file. Then the remainder of the command line is processed according to whatever ARGV rules were defined in that file. Note that even the default behavior of reading from standard input and writing to standard output is implemented by the ARGV rules. (The -prim option is the only one that is hard-coded instead of being implemented by patterns.)

6 Exit codes

When the program terminates, it will return one of the following status codes to the operating system (unless overridden by the use of function @exit-status):
  0   nothing wrong
  2   failed match signaled by @fail or @abort
  3   undefined command line argument
  4   syntax error in pattern definitions
  5   use of undefined name during translation (domain, variable, switch, parameter, syntax type, or locale)
  6   invalid numeric operand
  7   can't execute shell command for @shell function
  8   I/O error on input file
  9   I/O error on output file
  10  out of memory
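
A wrapper script that invokes the program may want to turn these status codes into readable messages. The following Python sketch shows one way to do that. The table is copied from the list above; the names EXIT_STATUS, describe_status, and run_and_describe are hypothetical, and the command passed to run_and_describe is whatever command line the caller supplies. A code that is not in the table may have been set with @exit-status, so the sketch reports it as unlisted rather than guessing.

```python
import subprocess

# Status codes as documented above.
EXIT_STATUS = {
    0: "nothing wrong",
    2: "failed match signaled by @fail or @abort",
    3: "undefined command line argument",
    4: "syntax error in pattern definitions",
    5: "use of undefined name during translation",
    6: "invalid numeric operand",
    7: "can't execute shell command for @shell function",
    8: "I/O error on input file",
    9: "I/O error on output file",
    10: "out of memory",
}


def describe_status(code):
    """Return the documented meaning of a status code, if there is one."""
    return EXIT_STATUS.get(code, "unlisted status (possibly set by @exit-status)")


def run_and_describe(command):
    """Run `command` (a list of strings) and return (status, description)."""
    status = subprocess.run(command).returncode
    return status, describe_status(status)
```

A caller would pass the full command line as a list of strings and inspect the returned pair.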

7 Status and Future development

This program is now at the stage where, in an ideal world, it would be regarded as a completed prototype, and it would be time to start designing the real program to replace it. However, as usually happens in the real world, we ship the prototype because there isn't time to do any more. There is room for improvement in at least the areas of consistency, ease of use, and performance. Also, this documentation was written rather hurriedly and is not nearly as polished as I would like.

Since this was developed by one person as a spare-time hobby, it has not had very extensive testing, so there are likely to be bugs. The -w and -t options are the most recently added functionality, and hence the most likely to have inadequacies.

I don't know at this time whether I will be spending any more effort on further development, but I am interested in hearing about any bugs found or other suggestions.

Following, in no particular order, are some assorted ideas for enhancements which remain for the future:

8 Acknowledgments

This program was conceived as an extension of the concepts embodied in W. M. Waite's ``STAGE2'' processor [Footnote 2], as implemented by Roger Hall.[Footnote 3]

This program has some similarities to awk, but they are generally due more to similarity of purpose than to any deliberate copying. I did copy the $0 notation and adopt the term ``action''.

This program was designed and coded by myself, David N. Gray, except for the regular expression processor, which utilizes public domain code written by Ozan S. Yigit and updated by Craig Durland and Harlan Sexton.