Document Processing with groff and mom

By Peter Schaffter

I see a schoolmarm at a blackboard scrawling out the title of this article with squeaky chalk. A murmur rises from the desks behind her.

Billy, the Linux ultra-newbie:
Document processing? Is that anything like word processing?

Suzie, the relative newbie:
"groff"? Isn't that the program that lets me read manpages?

Todd, the old hand:
mom? Never heard of it.

Okay, class--quiet down. One question at a time.

Document Processing

No, Billy, word processing and document processing aren't the same.

When you use a modern word processor, the computer monitor shows you a persistent representation of what you're writing in its final printed form. Whenever you want to change a font, or increase the size of type, or tighten a line, you typically highlight a portion of text, point your mouse at a menu item, select the kind of operation to perform, then specify the change from yet another menu. The alteration is immediately visible.

Document processing differs from word processing in that when you write, you fire up a text editor, a program that provides powerful tools for editing text itself--tools well beyond the scope of word processors--but does not show you a representation of the printed document. Formatting and typesetting are achieved not by point-and-click, but by embedding written commands in the text. When you finish writing, you run the text through a formatter such as groff, then preview the result with a small program (like gv, or ghostview) whose sole function is to show you what the printed version looks like.

groff

Very good, Suzie, groff is the program that lets you read manpages. But it does much more than that.

For many Linuxers--programmers and end-users--groff begins and ends with manpages. It comes as a surprise, then, to discover that groff is actually a powerful formatting and typesetting engine capable of producing PostScript, TeX DVI and HTML output in addition to formatted terminal copy (i.e., manpages).

Groff has a very long history, dating back to the earliest days of Unix. By comparison, TeX--the other big player in Linux document processing--is a relative newcomer. TeX and groff are both monumental achievements, with considerable overlap in what they do, but they have a major difference: size. Even a minimal installation of TeX is huge in comparison to groff.

Many people put up with TeX's size because they mistakenly believe that TeX produces typesetting of a quality superior to groff's. While that may have been true once, it is simply not the case now, and hasn't been for some time. As a typesetting engine, groff is superb.

Groff does have a liability, though: it's incredibly geeky. Owing to its long history, it--and its power users--seem stuck in a time warp. Groff's classical macro sets (macro sets are the end-user's primary interface to groff) still look as they did in those decades when memory was exorbitantly expensive, and every byte mattered. Their terse, two-letter commands tend to scare people off, as does the amount of knowledge about groff itself required to use them effectively.

That's where mom comes in.

mom

Gee, Todd, what version of groff are you running? Have you actually checked man groff_tmac recently?

To be fair to Todd, mom is the new kid on the block. She's only been around for about two years--the first major new macro set to come down the pike in quite some time.

mom's mandate is simple: to put typesetting and document processing with groff within easy reach of everyone, old hands and newbies alike. "Easy" has been accomplished -

Tutorial--creating a document with mom

mom is actually two groff macro sets in one. For the typographer, she provides a suite of tools modeled on the commands used by "dedicated" phototypesetting machines. For the writer, she provides document processing "tags" that automatically generate beautifully formatted heads, subheads, paragraphs, cited matter, footnotes, endnotes, tables of contents, and much more. In this tutorial, we'll be setting up a university essay, so the emphasis is on document processing, not typesetting.

First of all, the "rules":

You begin a mom document by entering some reference information: title, subtitle, author(s) and so on. mom uses this information to create cover pages (if you want them), set document titles and generate page headers or footers.

.TITLE    "Stretched to the Breaking Point"
.SUBTITLE "Cadential Ambiguity in Wagner, Mahler and Strauss"
.AUTHOR   "Jane Dearborne"

Next, you tell mom what type of document you're creating, whether this is a draft or a final copy, and whether you want the document typeset or "typewritten."

.DOCTYPE    DEFAULT
.COPYSTYLE  FINAL
.PRINTSTYLE TYPESET

.DOCTYPE DEFAULT and .COPYSTYLE FINAL are optional (because they're mom's defaults). However, .PRINTSTYLE TYPESET is not. All mom documents that are to be formatted with the document processing tags must contain a .PRINTSTYLE directive.
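For comparison, here is a sketch of how the same three directives might look for a draft in "typewritten" style. The argument names DRAFT and TYPEWRITE are given as I understand mom's documentation; they are not shown elsewhere in this article, so check them against the documentation for your version.

```groff
.\" Sketch only: DRAFT and TYPEWRITE are assumed argument names.
.DOCTYPE    DEFAULT
.COPYSTYLE  DRAFT
.PRINTSTYLE TYPEWRITE
```

As with the typeset version, only the .PRINTSTYLE line is strictly required.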

Next, you initiate document processing with the single, required macro

.START

Now you're on your way. Begin each paragraph with .PP, on a line by itself, followed by the text of the paragraph, like this:

.PP
Lorem ipsum dolor sit amet...

When you need a main head, type .HEAD, followed by the text of the head, on the same line and surrounded by double-quotes.

.HEAD "Wagner: Lohengrin to The Ring"

Subheads are accomplished similarly:

.SUBHEAD "The Pull Toward Flat Six"

If you need to insert a passage cited from another author's work, simply surround the passage with the .BLOCKQUOTE macro.

.BLOCKQUOTE
At vero eos et accusam et justo duo dolores...
.BLOCKQUOTE OFF

If you require footnotes, embed them in the body of the document, like this:

In 1890, Alma\c
.FOOTNOTE
Mahler's wife; later married to Walter Gropius of Bauhaus fame,
then again to writer, Franz Werfel.
.FOOTNOTE OFF
is reported to have...

Note the use of \c in the first line, above. Footnotes (and endnotes) require \c in order to attach markers (asterisks, daggers, superscript numbers, etc.) to the ends of words.

Carry on in this way until the end of the document, which, if you use endnotes, is terminated by the single macro

.ENDNOTES
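Put together, the fragments above make a complete, if tiny, source file. The following is only a sketch: the ordering of the body elements is my own choice for illustration, and because it uses a footnote rather than endnotes, it does not close with .ENDNOTES. Lines beginning with .\" are groff comments.

```groff
.\" Reference information
.TITLE    "Stretched to the Breaking Point"
.SUBTITLE "Cadential Ambiguity in Wagner, Mahler and Strauss"
.AUTHOR   "Jane Dearborne"
.\" Optional, since these are mom's defaults
.DOCTYPE    DEFAULT
.COPYSTYLE  FINAL
.\" Required
.PRINTSTYLE TYPESET
.\" Begin document processing
.START
.HEAD "Wagner: Lohengrin to The Ring"
.PP
Lorem ipsum dolor sit amet...
.SUBHEAD "The Pull Toward Flat Six"
.PP
In 1890, Alma\c
.FOOTNOTE
Mahler's wife; later married to Walter Gropius of Bauhaus fame,
then again to writer, Franz Werfel.
.FOOTNOTE OFF
is reported to have...
.BLOCKQUOTE
At vero eos et accusam et justo duo dolores...
.BLOCKQUOTE OFF
```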

mom is designed to produce PostScript output (for sending directly to a printer or saving as a .ps file), and groff's default "device" is PostScript, so you'd process the file, at the command line, with

groff -mom -l <filename>

or

groff -mom <filename> | lpr

to send the file to a printer, or

groff -mom <filename> > <filename>.ps

to save it to a file. Either way, you end up with a professional-looking 8.5x11 inch document, typeset justified in Times Roman at 12.5 on 16 (mom's default).
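If you save to a file often, a small shell wrapper can spare you typing the name twice. This is a sketch, not part of mom or groff: the function names ps_name and mom2ps are hypothetical, and so is the ".mom" file extension, which nothing in mom requires.

```shell
#!/bin/sh
# Assumption: source files end in ".mom". A name without that
# suffix simply has ".ps" appended.

# Hypothetical helper: derive the PostScript file name from the source name.
ps_name() {
    printf '%s.ps\n' "${1%.mom}"
}

# Hypothetical wrapper around the "save it to a file" command above.
mom2ps() {
    groff -mom "$1" > "$(ps_name "$1")"
}
```

With these defined, running mom2ps essay.mom (essay.mom being a stand-in file name) would write essay.ps, which you could then open in a previewer such as gv.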

What this little tutorial doesn't demonstrate is the degree of control mom permits over the design of documents. All the document processing tags have global "control" macros that allow you, at a minimum, to change the family, font, point size and color of any tag. Where appropriate, mom provides additional control macros for things like quad direction, line spacing, underlining, capitalization, indent, numbering style, and so on. Used in conjunction with mom's typesetting macros, the control macros let you design virtually every part of a document to precise specifications and taste.
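To give a rough idea of what control macros look like, here is a sketch that restyles main heads. The names HEAD_FAMILY, HEAD_SIZE and HEAD_QUAD follow the tag-plus-attribute pattern as I recall it from the mom 1.x documentation; treat both the names and the arguments as assumptions and confirm them against the documentation for your version. Control macros like these go before .START.

```groff
.\" Sketch only: macro names and arguments assumed from the mom 1.x docs.
.PRINTSTYLE TYPESET
.\" Set main heads in Helvetica instead of the document family
.HEAD_FAMILY H
.\" Make heads two points larger than running text
.HEAD_SIZE   +2
.\" Set heads flush left
.HEAD_QUAD   LEFT
.START
.HEAD "Wagner: Lohengrin to The Ring"
```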

OK, I'm intrigued: How do I get my hands on mom?

mom has been part of groff for the past two years, so if you have a recent version of groff (1.18 or later), you already have a mom. :-)

However, mom is being developed independently of groff, so you'll probably want a more mature version than the one you got the last time you updated groff.

There are two ways to get an up-to-date mom: either go directly to mom's homepage and download the latest release, or check out the latest groff from the groff CVS repository and build groff from source. Patches and improvements to mom are always applied to the groff repository before a new release, so either method gets you the latest version. At the time of writing, that's 1.2. To check the version number of your current mom, use locate to find the macro file, om.tmac, then page through it to line 26.
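If you'd rather not page through the file, the version check can be sketched in the shell. The function name show_line is hypothetical, and line 26 is where the version sits in the release described here; other releases may keep it elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print line N (second argument) of FILE (first argument).
show_line() {
    sed -n "${2}p" "$1"
}

# Usage, substituting whatever path locate reports on your system:
#   locate om.tmac
#   show_line path/to/om.tmac 26
```

Note that locate matches substrings, so the listing may include other files whose names contain "om.tmac".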

Please note that mom currently requires groff version 1.18 at the very least, and works best with groff version 1.19.2 or higher.

 


[BIO] (the words "groff" or "mom" must appear in the subject line of any email sent to this address, otherwise the email will get nuked)

Peter Schaffter is a classical pianist, country songwriter and professional typographer turned writer whose novel, The Schumann Proof (pub. RendezVous Press, Canada), will be on the shelves in the fall of 2004.

An ardent champion of Free Software, he is also the creator of the "mom" macro set for groff. Mom is Peter's way of saying "thank you" to the community of open source developers who made it possible, despite his perpetually impoverished state, to get his hands on some of the most powerful computing tools on the planet. Mom also reflects his interest in software documentation, a subject he considers of primary importance in open source development. In a reversal of normal development procedure, he wrote much of the documentation for mom before implementing the code. "A user's first exposure to a program is usually the documentation," he says, "so why not get it right first? Besides, making a program conform to pre-written docs is a great way to ensure it behaves as advertised."

Copyright © 2004, Peter Schaffter. Released under the Open Publication License.

Published in Issue 107 of Linux Gazette, October 2004