
I am preserving a manuscript written by my father. The data files are on 5.25" floppies. I have successfully read the files off the disks, but neither of us knows which markup language they use. It is one used by book publishers back in the early 1990s.

Here are a few lines from one of the files.

\m\m<ps;3><l>\
<ep>\
{cn}1 \
{ct}Rethinking Universality:<qa>\
Six Cases<lrh;;1>Rethinking Universality: Six
Cases<xlrh><rrh;;2>Rethinking Universality: Six Cases<xrrh> \
{t1}In 1983 the anthropological community was convulsed by reactions to
Derek Freeman\'s <ital>Margaret Mead and Samoa: The Making and Unmaking
of a Myth.<med> Remarkably, two books with a very similar message but
by different authors attacking different myths were published within a
year of Freeman\'s. One was Melford Spiro\'s <ital>Oedipus in the
Trobriands<med> (1982); the other was Ekkehart Malotki\'s <ital>Hopi
Time<med> (1983). Each of these books refutes or questions one of the
centerpieces of anthropological relativism.\
%In <ital>Coming of Age in Samoa<med> (1928) Margaret Mead argued that
adolescence among Samoans was not the time of storm and stress that it
is in the West and, hence, that the Western conception of adolescence
is strictly cultural\Msomething that we could change. Freeman shows
that adolescence was just as stressful in Samoa as in the West and that
in other ways Samoa was not so different from Western societies as Mead
had led us to believe.\

Can anyone identify the markup language being used here?

I have both these original files and the published book. It wouldn't be very difficult for me to figure out what the codes mean. But if the format could be identified, some kind of automatic translator into something more recent (RTF, XML, etc.) might be available.

Edit

It's like a mystery to be solved! Here's how a table is begun:

<begtab;tbl2;1p><setnc;2><setctr;5p><tblwidth;15p><setbgut;rsidbox;0q>

I did some Googling around for terms like begtab and setbgut. The latter turned up a PDF document that seems to have a "typo" in it, but the typo is a setbgut tag almost exactly like the one from my files.

http://sfmb.ulb.ac.be/pdf/J_Biol_Chem_1999_274_22_15510.pdf (search for setbgut)

It would appear this research paper was laid out using the same software. I brought it into Acrobat and the properties say it was generated by Xyvision Parlance Publisher (XPP). Here's the best I've found about them so far: http://www.isgmlug.org/n2-1/n2-1-49.htm

Edit 2

OK, I see now. XPP uses an SGML-like markup language. In fact, XPP sales literature advertises how "easy" it is to take an existing SGML document and add their proprietary tags. Unfortunately, XPP was sold to General Dynamics some years ago. Automatically translating the document into, say, HTML is difficult without the DTD. However, as others have pointed out, most of the tags are easy to figure out. Some of them, such as the bibliographic references and the values after the semicolon in tags such as <rrh;1> and <lp;&-1q>, still elude me. I'll have to compare the file to the physical hard copy side-by-side to decode it all.
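
One way to start that side-by-side decoding is to list every tag in a file together with the values that follow its semicolons. Below is a minimal Python sketch. It assumes, based only on the sample lines above, that tags look like `<name;arg;arg>` or `{name}`; the function names `tokenize` and `tag_inventory` are made up for illustration, and this is not an XPP tool or a description of the real XPP grammar.

```python
import re
from collections import defaultdict

# Assumed syntax, inferred only from the sample: <name;arg;arg> and {name}.
TAG_RE = re.compile(r"<([^<>]*)>|\{([^{}]*)\}")


def tokenize(text):
    """Yield ("tag", name, args) and ("text", chunk, None) tuples in order."""
    pos = 0
    for m in TAG_RE.finditer(text):
        if m.start() > pos:
            yield ("text", text[pos:m.start()], None)
        if m.group(1) is not None:
            # Angle-bracket tag: the first field is the name, the rest are arguments.
            name, *args = m.group(1).split(";")
        else:
            # Brace tag: no arguments were seen in the sample.
            name, args = m.group(2), []
        yield ("tag", name, args)
        pos = m.end()
    if pos < len(text):
        yield ("text", text[pos:], None)


def tag_inventory(text):
    """Map each tag name to the set of argument tuples seen with it."""
    seen = defaultdict(set)
    for kind, name, args in tokenize(text):
        if kind == "tag":
            seen[name].add(tuple(args))
    return dict(seen)


sample = "<begtab;tbl2;1p><setnc;2><rrh;;2>Six Cases<xrrh>{cn}1"
for name, variants in sorted(tag_inventory(sample).items()):
    print(name, sorted(variants))
```

Running this over each chapter file gives a short list of distinct tags and argument patterns to look up in the printed book, instead of reading the raw files line by line.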

Barry Brown
  • Well, it's not TeX, RTF, or SGML. Do you know whether it was hand-written or generated by a word processor? – Kevin Reid Sep 26 '11 at 00:51
  • Probably hand-written. The markup is too concise to be computer-generated. It might have been written by a computer program that allows the user very specific control over the formatting codes, however -- which would be just one step away from being hand-coded. – Barry Brown Sep 26 '11 at 01:00
  • The extension of the file name would help a lot. – Joel Coehoorn Sep 26 '11 at 01:23
  • A quick poke around rules out [XyWrite](http://en.wikipedia.org/wiki/XyWrite) and WordStar (which uses dot codes). Maybe it's WordPerfect? – Journeyman Geek Sep 26 '11 at 01:33
  • Are the very first few lines in the file what you have written here? If not, please post them. – Hydaral Sep 26 '11 at 01:34
  • `lrh` and `rrh` are running heads, left and right respectively. `xlrh` and `xrrh` exit the running head mode. `cn` centers a number, whereas `ct` centers a title. `ital` puts you in italic mode, while `med` puts you back in medium mode. `%` starts a new paragraph, whereas a backslash just continues the paragraph. I can't figure out the others, and I tried quite a few search terms, but none revealed the format... – Tamara Wijsman Sep 26 '11 at 01:36
  • WordPerfect had **a lot** of control characters embedded in the document. It would look far messier in text. – surfasb Sep 26 '11 at 02:24
  • @JoelCoehoorn The files have no extension. – Barry Brown Sep 26 '11 at 03:32
  • @Hydaral Those are the first 20-ish lines of one of the files. Other files start differently. For example, the first line of CHAP02 is `{cn}2`. I don't believe the `\m\m` in the file I posted is a magic number. – Barry Brown Sep 26 '11 at 03:34
  • Since we're now down to decoding the markup itself, maybe this would be a better question for Stack Overflow? – Barry Brown Sep 26 '11 at 03:39
  • I don't think it's a good fit for Stack Overflow unless you can get either a complete set of markup commands to code to or have existing code you need help with. – Joel Coehoorn Sep 26 '11 at 03:50
  • Some other ideas (that I can't verify one way or the other) that have been used in publishing: PageMaker, Quark, QuarkXPress, MS Publisher. I had high hopes for WordStar, but it seems that was eliminated already. You might find this Wikipedia article useful: http://en.wikipedia.org/wiki/Desktop_publishing – Joel Coehoorn Sep 26 '11 at 03:59
  • Added some additional material to the original question. – Barry Brown Sep 26 '11 at 05:18
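
The tag readings suggested in the comments above can be turned into a rough converter. This is only a sketch under stated assumptions, not a real XPP translator: `ital`/`med` are treated as italic on/off and `%` at the start of a line as a new paragraph (both from the comments), while reading `\'` as an apostrophe, `\M` as an em dash, and a backslash at the end of a line as something that can simply be dropped are guesses from the sample text. Every other tag is discarded. The function name `to_html` is hypothetical.

```python
import html
import re

# Splits text on <...> and {...} tags, keeping the tags in the result.
TAG_SPLIT = re.compile(r"(<[^<>]*>|\{[^{}]*\})")


def to_html(raw):
    """Return a list of <p>...</p> strings for the guessed-at markup."""
    # Guesses from the sample: \' is an apostrophe, \M is an em dash.
    text = raw.replace("\\'", "'").replace("\\M", "\u2014")
    # Guess: a backslash at the end of a line carries no visible text.
    text = re.sub(r"\\[ \t]*$", "", text, flags=re.MULTILINE)
    paragraphs = []
    # From the comments: "%" at the start of a line begins a new paragraph.
    for para in re.split(r"^%", text, flags=re.MULTILINE):
        out, italic = [], False
        for i, part in enumerate(TAG_SPLIT.split(para)):
            if i % 2 == 0:
                # Even positions are plain text between tags.
                out.append(html.escape(part, quote=False))
                continue
            name = part[1:-1].split(";")[0]
            if name == "ital" and not italic:
                out.append("<i>")
                italic = True
            elif name == "med" and italic:
                out.append("</i>")
                italic = False
            # Any other tag is dropped; its meaning is unknown here.
        if italic:
            out.append("</i>")
        body = " ".join("".join(out).split())  # collapse line wrapping
        if body:
            paragraphs.append("<p>" + body + "</p>")
    return paragraphs


raw = (
    "{t1}Derek Freeman\\'s <ital>Margaret Mead\nand Samoa.<med> Remarkably.\\\n"
    "%In <ital>Coming of Age<med> (1928) cultural\\Msomething.\\\n"
)
for p in to_html(raw):
    print(p)
```

Dropping unknown tags loses the chapter number, title, and running heads, so those would need their own handling once their meaning is confirmed against the printed book.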

2 Answers


I found this PDF on Xyvision Production Publisher (which is likely what was used). Note under FinalPages it lists HTML as an output format.

If you could somehow get a copy of some version of this software running, you may be able to get it to spit out some HTML. That may or may not be harder than reverse-engineering the document markup. There's a bit more info on HTML exports at the very bottom of this page.


John Lyon

It is possible that the product used to generate your files has evolved into Portalyx SDL XPP. It might be worth contacting them.

See XPP Personal Edition

Identified via a dead link on a Wikipedia page which mentioned XyVision XPP.

RedGrittyBrick