Date: Fri, 30 Aug 1996 13:20:07 -0500
References: <199608301611.LAA16276@ux6.cso.uiuc.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-online-newspapers@marketplace.com
Precedence: bulk
Status: RO
X-Status:
At the risk of prolonging the discussion, I'd like to point out that
there is more history ... and more future ... involved here than has
been brought to light thus far.
Newspaper production systems are designed with a specific purpose in
mind. When Harris, Atex and similar systems were created, the
specification was narrowly defined. These machines were never expected
to deal with character sets that were not in common use in the newspaper
industry.
First, the history:
The character set commonly used in newspaper production was defined by
the needs of teletypesetting, a system developed by Walter Morey with
Frank Gannett's backing and first demonstrated in Rochester, N.Y., in
1928. Teletypesetting, or TTS, involved
the creation of punched paper tape to drive Mergenthaler Linotype
hot-metal typesetters. Teletypes of that era used a five-bit Baudot
encoding and were "all caps." (TTY/TDD services for the deaf continue to
use Baudot to this day.)
To accommodate teletypesetting, a TTS code structure was developed,
based on Baudot. It was a six-bit code; the extra code space was used to
provide for typographical instructions that mapped to Linotype
functions: Shift in, Shift out, Rail change, En, Em, Em-dash, etc. The
TTS code structure eventually was adopted by most wire services, even
for wires that did not produce hyphenated/justified tape for directly
driving linecasters.
In the 1970s it became apparent that the six-bit code structure was not
sufficient for a computer-based editing world. However, the 7-bit ASCII
character set commonly used in the minicomputer (non-IBM) world also was
deficient; it lacked the necessary typographical codes. As a result, the
ANPA Research Institute developed a new code structure for the so-called
"high-speed" (1,200 bps) wire. It took the ASCII positions that were not
commonly used in newspaper TTS type fonts and reassigned them to the
necessary codes from the TTS set. As it happened, the "@" sign fell by
the wayside.
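To make the consequence concrete, here is a minimal sketch in Python of
what happens when one 7-bit position is reassigned. The assignment shown
(position 0x40 becoming an em-space function) and the names REASSIGNED
and wire_render are invented for illustration; they are not the actual
ANPA code table.

```python
# Hypothetical reassignment: pretend the wire code reuses the ASCII
# position of "@" (0x40) for a typographic function. This is NOT the
# real ANPA table, only an illustration of the mechanism.
REASSIGNED = {0x40: "<EM SPACE>"}


def wire_render(data: bytes) -> str:
    """Interpret 7-bit bytes the way a receiver of the reassigned code would."""
    out = []
    for b in data:
        if b > 0x7F:
            raise ValueError(f"not a 7-bit value: {b:#x}")
        # A reassigned position yields the typographic function;
        # everything else is read as plain ASCII.
        out.append(REASSIGNED.get(b, chr(b)))
    return "".join(out)


if __name__ == "__main__":
    # An e-mail address sent as plain ASCII loses its "@" on the way.
    print(wire_render(b"steve@example.com"))
```

Under that assumption, any sender who transmits plain ASCII has no way
to get an "@" through to the receiving system.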
So here we are, contemplating the arrival of the 21st Century, saddled
with baggage from Ottmar Mergenthaler's 19th-century invention.
Now, the future:
Comite International des Telecommunications de Presse (apologies for the
lack of accents) and the Newspaper Association of America have developed
a new "Universal Text Format" standard. It was announced two years ago
at the Connections IX conference in Las Vegas and has been in
development since that time.
This News Industry Text Format (NITF) deals with formatting issues --
and other "metadata" issues -- by using SGML, the ISO standard for
defining markup languages.
As an SGML document type, NITF is strongly influenced by HTML, and a
great deal of effort has been made to ensure compatibility. It is my
understanding that the NITF currently is considered a draft and that
ratification is scheduled for November.
There are no newswires currently using the NITF standard, but we might
begin to see it put into production toward the end of this year or
perhaps early next year. In my opinion, sooner is better. The current
ANPA 89-3 standard is barely adequate for print production and
absolutely inadequate for the digital world.
It is not simply a matter of lost characters such as the "@" symbol. The
current standard entangles data and metadata in an undifferentiated
glop; anyone who has contemplated putting sports agate online will
quickly see what I mean. Those of us who have built wire feeds into
online databases rely on brute-force methods to parse headlines,
bylines, etc., out of the text. Those methods occasionally fail, and
our automated processes then place garbage online. The new standard
will let us avoid that sort of problem, and
I am looking forward to seeing it implemented.
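For anyone who has not written one of these parsers, the following
Python sketch shows the kind of failure I mean, next to the same story
carried in explicit markup. Everything here is an assumption made for
illustration: the heuristic is a generic one, not any site's production
code, and the tag names (story, headline, byline, p) are made up rather
than taken from the NITF draft. The markup is written as well-formed
XML so the standard library can read it.

```python
import re
import xml.etree.ElementTree as ET


def guess_fields(raw: str) -> dict:
    """Brute force: the first line is the headline, and a second line
    starting with 'By ' is assumed to be the byline."""
    lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
    headline = lines[0] if lines else ""
    byline = ""
    body_start = 1
    if len(lines) > 1 and re.match(r"(?i)by\s+\S", lines[1]):
        byline = lines[1]
        body_start = 2
    return {
        "headline": headline,
        "byline": byline,
        "body": " ".join(lines[body_start:]),
    }


def read_marked_up(doc: str) -> dict:
    """Read the same fields from explicit (hypothetical) markup."""
    root = ET.fromstring(doc)
    return {
        "headline": root.findtext("headline", default=""),
        "byline": root.findtext("byline", default=""),
        "body": " ".join(p.text or "" for p in root.iter("p")),
    }


# A story with no byline whose first sentence happens to start with "By".
WIRE_TEXT = "Council vote delayed\nBy midnight the council had still not voted.\n"

MARKED_UP = (
    "<story><headline>Council vote delayed</headline>"
    "<body><p>By midnight the council had still not voted.</p></body>"
    "</story>"
)

if __name__ == "__main__":
    print(guess_fields(WIRE_TEXT))    # first sentence mistaken for a byline
    print(read_marked_up(MARKED_UP))  # byline empty, body intact
```

The heuristic files the first sentence under "byline" and leaves the
body empty. With the markup there is nothing to guess, because the
sender has already said which piece is which.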
NITF will not work with currently available newspaper front-end systems,
but it is extensively documented and a competent coder should be able to
turn out a translator for driving ancient Atex boxes without too much
difficulty. The real gain will be seen by sites with next-generation
database-driven production systems based on standard platforms.
--
Steve Yelvington
Editor/Manager, Star Tribune Online, Minneapolis-St. Paul
http://www.startribune.com/
From owner-online-newspapers@marketplace.com Fri Aug 30 17:17:24 1996
Received: from marketplace.com (majordom@marketplace.com [206.168.5.232]) by cnj.digex.net (8.6.12/8.6.12) with ESMTP id RAA19536 ; for ; Fri, 30 Aug 1996 17:17:23 -0400
Received: (from majordom@localhost) by marketplace.com (8.6.12/8.6.12) id MAA00778 for online-newspapers-outgoing; Fri, 30 Aug 1996 12:16:56 -0600
Received: from server.indra.com (server.indra.com [204.144.142.2]) by marketplace.com (8.6.12/8.6.12) with ESMTP id MAA00773 for ; Fri, 30 Aug 1996 12:16:51 -0600
Received: from indra.com by server.indra.com (8.7.4/Spike-8-1.0)
id MAA27307; Fri, 30 Aug 1996 12:06:17 -0600 (MDT)
Received: from ux6.cso.uiuc.edu by indra.com (8.7.4/Spike-8-1.0)
id MAA06706; Fri, 30 Aug 1996 12:06:16 -0600 (MDT)
Received: from edmonton-4.slip.uiuc.edu (edmonton-4.slip.uiuc.edu [128.174.23.178]) by ux6.cso.uiuc.edu (8.6.12/8.6.12) with SMTP id NAA07672 for ; Fri, 30 Aug 1996 13:06:09 -0500
Message-Id: <199608301806.NAA07672@ux6.cso.uiuc.edu>
Comments: Authenticated sender is