Subject: standards and Re: HTML/SGML/PDF Theology From: R Ballard Date: Fri, 24 Mar 1995 16:17:13 -0500 (EST)
How the Web Was Won
In-Reply-To: <199503161535.AA02112@access4.digex.net>
Message-ID: 
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII




On Thu, 16 Mar 1995, Steven J. Vaughan-Nichols wrote:

> > >In any case, I sometimes wonder if squabbling over HTML is missing the
> > >point that there are alternatives, like SGML and PDF, that do have
> > >solid pros and cons for becoming the Web's standard page description
> > >language. All in all, I'm beginning to find all this HTML sound and
> > >fury remindful of Windows v. OS/2 wars that ignore Unix, et al.
> > >Only, in this case, I don't believe that marketshare has solidified
> > >to the point that you can argue that other alternatives aren't
> > >really important.

People seem to forget that HTML became a standard in the first place
because of the user-contributed software and supporting documentation
that was provided in the form of an RFC.  The Andrew Editor (EZ) was
handling SGML and HTML as of its last release.

> I don't think you understand the SGML purist point of view. In a recent 
> letter to me about one of my recent Byte stories, Bob Goldstein of the 
> University of Illinois at Chicago, wrote: "Make *all* browsers accept 
> FULL SGML. This means all browsers will render *all* tag sets, including 
> all varieties of HTML, including incompatible HTML versions, including tag 
> sets that haven't been developed yet, such as HTML 137.0."

Actually, this would be a pretty good idea.  The problem is that 
News"Paper" editors like to "play" with the layout, which means that
they may end up having to send new DTDs and presentation pages with
every story.  A bit like sending the full prolog with every PostScript
or PDF story.  The other problem is that now every indexing engine
would have to try to sort out whether each field could be indexed and
treated as an independent field.  Even with HTML we find lots of 
opportunities to look up meaningful values for shorthand HTML tags.

It's ironic that PostScript was a descendant of FORTH, and FORTH was
relegated to running car engines because defining words made the
language too powerful for printer users to understand.  Now everyone
uses "defining words" to tag their PostScript.

Another example is the use of ASN.1 vs XDR/RPC.  RPC was a simpler
method of producing the desired result (getting a request from
one program to another and getting a response back), even though
it was less "extensible".

> For him, and others in the SGML camp, the issue isn't making HTML 
> conform to SGML, it's getting browsers SGML compliant to the point 
> that issues like what tag does what in a Web document is an 
> irrelevant issue. That's because a fully-qualified SGML browser will be able 
> to make sense of any document with an SGML DTD, regardless of what's 
> in the DTD. As you might guess, people in this camp find the current 
> uproar about HTML and NetScape to be entirely beside the point. 
> NetScape's additions can be and are being DTDed, therefore SGML 
> browsers can handle them. 

There are some new requirements, however.  You now have to manage DTDs and 
style sheets.  When the user gets a document, you now have to determine
whether he has the style sheet and DTD.  If you change the DTD, what
will happen to those who think they have the correct DTD?  Do you
really want to send the same DTD with each of 300 related documents?
Then there's the fight over the style sheet.  When I send you a style
sheet, is that the final word on layout, or can you make up your own?
If I don't have adobe-gothic fonts on my X-Server, can I pick an 
appropriate substitute?
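
To make the DTD bookkeeping concrete, here is a minimal sketch in Python
of a client-side cache that decides whether a DTD must be fetched again.
Everything in it is an assumption made for illustration: the DtdCache
class, the "newspaper" DTD name, and the idea that the server advertises
a content fingerprint alongside each document.  No existing browser is
known to work this way.

```python
import hashlib


def fingerprint(dtd_text: str) -> str:
    """Return a short content hash so that a changed DTD gets a new identity."""
    return hashlib.sha256(dtd_text.encode("utf-8")).hexdigest()[:12]


class DtdCache:
    """Hypothetical client-side cache of DTDs, keyed by DTD name."""

    def __init__(self):
        self._store = {}  # name -> (fingerprint, dtd_text)

    def needs_fetch(self, name: str, advertised_fp: str) -> bool:
        """True if we lack this DTD or hold a copy that no longer matches."""
        cached = self._store.get(name)
        return cached is None or cached[0] != advertised_fp

    def put(self, name: str, dtd_text: str) -> None:
        """Store a DTD under its name together with its fingerprint."""
        self._store[name] = (fingerprint(dtd_text), dtd_text)


# Hypothetical DTD fragments: the second adds a byline element.
v1 = "<!ELEMENT story (headline, body)>"
v2 = "<!ELEMENT story (headline, byline, body)>"

cache = DtdCache()
first_time = cache.needs_fetch("newspaper", fingerprint(v1))   # nothing cached yet
cache.put("newspaper", v1)
unchanged = cache.needs_fetch("newspaper", fingerprint(v1))    # same DTD again
changed = cache.needs_fetch("newspaper", fingerprint(v2))      # publisher edited the DTD
```

With a scheme like this the DTD travels once for the 300 related
documents, and a reader holding the old copy finds out the moment the
fingerprint stops matching.  It says nothing about who wins the style
sheet fight.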

HTML was accepted as a nice compromise between something that was
fully extensible, like SGML, and something that was version-
and prolog-dependent, like PostScript.

TCP/IP hid the details of the 2000 possible link layer configurations from
the end users, making it possible for people to use TCP/IP across 
everything from ATM and FDDI to ham radio equipment.  Carriers have been
working very carefully to make sure that their customers' IP traffic
doesn't end up going to Hawaii via Aloha Net (ham radio, unencrypted).

> > People are confusing HTML with a page description language - it's a
> > structural language, as all SGML applications are, and its rendering is
> > dependent on the HTML browser's built-in style sheet.  A big focus of the WWW
> > development community right now is developing a system for arbitrary style
> > sheets, so that rendering information like font size and type, color,
> > alignment, etc. are controlled in a style sheet that _doesn't_ overburden HTML.

There is such a range of preferences here. It's like trying to decide 
which Microsoft Word ".dot" template to use.  The point is that in order
to read the document as originally provided, I need Microsoft Word.  
That's great if I would rather have 200 copies of Word at $400/year 
instead of 2 senior P/As, or even 1 copy instead of that 1-gigabyte drive.

When I'm publishing an "electronic newspaper", do I want to reach only
the users with the $400 "to spare", or do I aim for a product which
provides free browsers, or browsers at very low cost?

> And, your last comment is why everyone's not in the SGML camp. I 
> might add, though, that SGML already includes the stylesheet idea. 
> Again, to me, it seems that much of the HTML debate is trying to 
> reinvent the wheel of online, hypermedia page description. 

Look at the billions that were spent trying to develop a "proprietary
open" (yes, that is an oxymoron :-) version of TCP/IP.  They added
so many bells and whistles that the full OSI compliant product would
do everything but tuck you in at night.  Unfortunately, the $50,000
for documentation and $100,000 for reference source code made the
costs prohibitive.  There is a lot of bad press about the Internet
protocol, but any community of 60 million human beings is going to have
its problems.  About 60 million users are on the Internet in some
form, relying on TCP/IP delivery systems.

> Here, we've hit on one of the core disagreements between the 
> NetScape approach and that of others. NetScape believes that the 
> higher, DTD level, tags are a more appropriate way to deal with the 
> presentation issue, while you and others, see these issues that 
> should be dealt with on the stylesheet level. In passing, I'll note 
> that the PDF camp would say that there AI exercises aren't that 
> difficult to do and that the result is a more faithful image of the 
> publication designer's page vision. 

The missed point is that DTDs provide much more than just presentation
information.  They also point out fields that can be indexed, used
to search, sort, and filter documents.  The Newspaper mentality
wants you to look at every page.  The system administrator, responsible
for archiving 300 gigabytes of copy each year, must try and deal
with the data tags.  The reader, bombarded with 20 megabytes/day, must
choose which 20 pages he wants to read.
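
As a rough illustration of tags doubling as index fields, the following
Python sketch pulls the text out of a few tagged fields and builds a
lookup from (field, value) to the documents that contain it.  The tag
names (headline, byline, dateline) and the sample stories are
hypothetical, and the standard library's lenient html.parser stands in
for a real DTD-driven SGML parser, which this is not.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical tag names that a newspaper DTD might mark as indexable.
INDEXED_FIELDS = {"headline", "byline", "dateline"}


class FieldExtractor(HTMLParser):
    """Collect the text found inside each indexable tag."""

    def __init__(self):
        super().__init__()
        self.fields = defaultdict(list)
        self._current = None  # indexable tag we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        if tag in INDEXED_FIELDS:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current].append(data.strip())


def build_index(docs):
    """Map (field, lowercased value) to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, markup in docs.items():
        parser = FieldExtractor()
        parser.feed(markup)
        parser.close()
        for field, values in parser.fields.items():
            for value in values:
                index[(field, value.lower())].add(doc_id)
    return index


docs = {
    "a": "<story><headline>Web Wars</headline><byline>R Ballard</byline>"
         "<body>text</body></story>",
    "b": "<story><headline>Web Wars</headline><body>more</body></story>",
}
index = build_index(docs)
by_headline = index[("headline", "web wars")]   # documents sharing a headline
```

The body text is deliberately left out of the index here; only the
fields the tag set singles out are searchable, which is the whole
point of having the tags.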

Advertisers may be a bit reluctant to put advertising into a system that
can index their prices, locations, and product lines and compare them
with 2000 other merchants of the same product, knowing that the user
will probably look for the lowest price.

Of course there's also the problem of clients.  We've almost ironed out
the GIF/JPEG/PCX/BMP/FAX/... graphics wars (haven't we? :-), so what
about the other server extensions like PDF, PCL, PS, WAIS, SQL, and
query language of the week?

> > >There's also been, to my mind, a lot of naive discussions of the
> > >standard making process. Making standards in any business is a messy,
> > >political matter. Sometimes the conflicts move at glacial speeds, as
> > >the ISO X.400 and X.500 standard making process has shown and
> > >sometimes the conflicts are fast and bloody, the period we're now
> > >entering with HTML. In either case, being first out of the gate does

There are two ways of creating standards.  De Jure standards such as ISO
X.400 and X.500 are attempts by a limited number of vendors to form
an Oligopoly, to break the hold of a Monopoly (IBM/SNA), or to get
better profit margins than they would get from a Panopoly (TCP/IP).

De Facto standards arise when someone contributes a specification
and some sample implementations and makes them available for public
use.  The General Public License has created standards such as
TCP/IP, HTML, SGML, JPEG, MPEG, GIF, X11, Unix (originally user-
contributed software from AT&T).

The problem with TCP/IP was that ANYONE could get it.  They still
don't know how many people have TCP/IP capability.  The top selling
network operating system is NetWare, but IPX is eclipsed by IP in terms
of raw numbers of users.  The reason is not that IP is superior, but
that users were downloading KA9Q and Trumpet and passing them around
to friends.  Corporations even considered the modest $5/user/year
quantity 100+ fee very reasonable; they even bought extras.  Royalties
on one NetWare server would pay for 1000 TCP/IP (Trumpet) users.

The same is likely to happen with HTML.  When Netscape and company
try to get their lock on the market by adding proprietary hooks and
extensions that are supported through products exclusively supported
by members of the cartel, the user community will yawn and go back
to HTML 2.x, or switch to SGML.

> > >count for a lot as does being aggressive about pushing your agenda. It
> > >would be nice if standards were settled for the best of all involved
> > >and in a gentlemanly like fashion, but that's not how the process
> > >works. I foresee a Web fractionalized by formatting. I suspect
> > >that most of it will be using NetScape/HTML 3.0 since NetScape is
> > >combining their energetic approach with paying more than lip-service
> > >to the W3C and IETF HTML 3.0 efforts.
> > 
> > Your point about the difficulty of the standards process is unfortunately
> > true.  But using the same logic, however, you should give up on limiting
> > your intake of red meat, because you're going to die some day anyway.
> 
> Sorry, that's not what my logic tells me. By the way, pass me that 
> steak. ;-)
> > 
> > Consider what the e-mail world would have been like without an agreement on
> > a common standard: RFC822. Because of it, you can send email to anyone else
> > on the Internet and feel pretty confident in the knowledge that they'll be
> > able to read it, whether they're on a Sun or a PC or a Mac or a Vax or an
> > Amiga.  Now imagine what the world would have been like if MSMail had set
> > the standards for Internet email transmissions, and you'll see why
> > standards are crucial to networked environments.
> 
> I've never argued that standards aren't vital. The success of RFC822 
> over X.400 and the proprietary standards was, and is, vital to the 
> explosive growth of the Internet and online services. Without RFC822, 
> we might very well not have an Internet. 
> > 
> > >I foresee a Web fractionalized by formatting. 
> > 
> > A fractionalized web would be quite tragic, particularly when the
> > mechanisms for content negotiation have always been there.
> 
> I quite agree. 
> > 
> > and added lots of silly non-HTML-3.0 attributes.  
> 
> and these are? Other than, shudder, BLINK. 
> 
> > When they've implemented a
> > browser that does stylesheets, maths, figures (a more powerful version of
> > IMG) and every other part of HTML 3.0 faithfully, 
> 
> No one does, or can, do that yet. HTML 3.0 is still a standard in 
> movement. 
> 
> > _and_ when they've put
> > the one line of code in their browsers that will tell servers "hey, I can
> > accept HTML 3.0" so content-negotiating servers can give HTML 2.0 to
> > 2.0-only browsers and HTML 3.0 to 3.0 browsers (or Hey! maybe even just
> > Netscrape-modified-HTML to netscrape browsers), only then will they be
> > giving more than lip service to the standards process, 
> 
> I'm not aware of any browsers that do this at this time. In any case, 
> I don't follow how this functionality makes NetScape more compliant with 
> the standards process. A NetScape browser, like any current browser, does 
> the best it can with the document that it's presented with. Isn't the 
> issue that NetScape is promoting non-standard HTML tags the one that 
> you take exception to?
> 
> Thanks for addressing the issues of Netscape HTML tags and other Web page 
> description languages instead of writing to the 'NetScape's bad' level 
> of rhetoric. I've found this discussion enlightening and I'm sure 
> that others who must deal with the nuts and bolts of the Web will 
> find it so as well.
> 
> Steven
> 
> Steven J. Vaughan-Nichols  |QOTD: When encryption is 
> sjvn@access.digex.net      |outlawed, 7ejblm34aq0!
> Infobahn Traffic Reporter  |
> 
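
The content negotiation mentioned in the quoted exchange, where a
server gives HTML 2.0 to 2.0-only browsers and HTML 3.0 to 3.0
browsers, can be sketched in a few lines of Python.  This is only an
illustration under stated assumptions: it supposes the browser
advertises a "level" parameter on text/html in its Accept header, and
the function names and file names (story.html2, story.html3) are made
up for the example.  Quality values and wildcards are ignored.

```python
def parse_accept(header: str):
    """Parse an Accept header into a list of (media_type, params) pairs."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        params = {}
        for piece in pieces[1:]:
            if "=" in piece:
                key, value = piece.split("=", 1)
                params[key.strip().lower()] = value.strip()
        entries.append((media_type, params))
    return entries


def best_html_level(header: str, default: int = 2) -> int:
    """Highest text/html level the client claims; the default if unstated."""
    levels = []
    for media_type, params in parse_accept(header):
        if media_type == "text/html":
            try:
                levels.append(int(params.get("level", default)))
            except ValueError:
                levels.append(default)
    return max(levels) if levels else default


def choose_variant(header: str, variants: dict) -> str:
    """Pick the richest variant that does not exceed the client's level."""
    level = best_html_level(header)
    usable = [lvl for lvl in variants if lvl <= level]
    return variants[max(usable)] if usable else variants[min(variants)]


# Hypothetical stored variants of one story, keyed by HTML level.
variants = {2: "story.html2", 3: "story.html3"}
for_new_browser = choose_variant("text/html; level=3, image/gif", variants)
for_old_browser = choose_variant("text/html, image/gif", variants)
```

A browser that says nothing about its level is treated as a 2.0
browser, so the older clients keep working without sending anything
new.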

From rballard@cnj.digex.net Mon Apr  3 12:35:46 1995