html-latex.flt
Usage: html-latex.flt < file.html > file.tex
html-latex.flt is a filter to
map a (possibly multi-file) html document
into a \LaTeX\ document.
The filter recognizes only a subset of html.
It does, however, recognize additional tags (which do not
interfere with the display of the html documents
in any of the common browsers, e.g. netscape)
that allow for the inclusion of \LaTeX\ style options,
the inclusion of (collections of) files,
and the selective incorporation of text.
Limited support is also provided for the ISO character set
allowed by html.
Document delimiters
Document delimiters are needed to allow for
the declaration of the appropriate \LaTeX\ style.
Simple document delimiters are:
-
<document> --
beginning of (part of) document
-
</document> -- end of document
The default is the standard article style, as shown
in the translation of the document tags below:
-
<document>
\documentstyle{article}\begin{document}
-
</document>
\end{document}
When the user wishes to employ another style or
to include style options, the more elaborate forms can
be used:
-
<document{style}> -- with style
-
<document[opt]{style}> --
with optional style(s)
When other files are included, only the top-level
document delimiters are taken into account.
The delimiters in the included files are simply ignored.
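The delimiter forms above can be sketched in Python.
This is only an illustration, not the filter's own code:
the names open_document and close_document are hypothetical,
and the output for the forms with a style or options is an
assumption extrapolated from the default translation
(the text above only shows the article default).

```python
import re

# Matches <document>, <document{style}> and <document[opt]{style}>,
# in lower or upper case.
DOC_OPEN = re.compile(r"<document(?:\[([^\]]*)\])?(?:\{([^}]*)\})?>", re.IGNORECASE)


def open_document(tag):
    """Return the LaTeX preamble for an opening document delimiter."""
    match = DOC_OPEN.fullmatch(tag)
    if match is None:
        raise ValueError("not a document delimiter: " + tag)
    options = match.group(1)
    # Default style is article, as stated in the text.
    style = match.group(2) or "article"
    # Assumption: options are passed through as \documentstyle[opt]{style}.
    opt = "[" + options + "]" if options else ""
    return "\\documentstyle" + opt + "{" + style + "}\\begin{document}"


def close_document():
    """Return the LaTeX text for </document>."""
    return "\\end{document}"
```

For example, open_document("<document[11pt]{report}>") would, under
these assumptions, yield \documentstyle[11pt]{report}\begin{document}.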
File inclusion
File inclusion provides a convenient mechanism
to handle the complexity of large documents.
In particular, it allows an ordinary (linear) document
to be produced from a collection of html files.
Other html files are included with the source tag,
which is abbreviated as so:
-
<so file1 [file2 ...]> -- to include files (with .html extension)
Multiple files may be included.
The html extension is implicit for each file.
In a similar way, directories may be included by employing
the directory tag, which is abbreviated as cd:
-
<cd dir1 [dir2 ...]> -- to include directories
To be able to use the directory tag, the directory
must contain a file of the same name with an html
extension.
To allow for the inclusion of arbitrary files, the
include tag is provided:
-
<include file1 [file2 ...]> -- to include files
For this tag no implicit file extension is assumed.
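The file name rules for the three inclusion tags can be sketched
as follows. The function name resolve_inclusion is hypothetical,
and the reading of the cd rule as dir/dir.html is an assumption
based on the requirement that the directory contain a file of the
same name; the filter itself may resolve paths differently.

```python
import posixpath
import re

# Matches <so ...>, <cd ...> and <include ...>, in lower or upper case.
INCLUSION = re.compile(r"<(so|cd|include)\s+([^>]*)>", re.IGNORECASE)


def resolve_inclusion(tag):
    """Return the list of file paths named by an inclusion tag."""
    match = INCLUSION.fullmatch(tag)
    if match is None:
        raise ValueError("not an inclusion tag: " + tag)
    name = match.group(1).lower()
    args = match.group(2).split()
    if name == "so":
        # The html extension is implicit for each file.
        return [arg + ".html" for arg in args]
    if name == "cd":
        # Assumption: the file lives inside the directory and shares its name.
        return [
            posixpath.join(arg, posixpath.basename(arg.rstrip("/")) + ".html")
            for arg in args
        ]
    # include: no implicit extension.
    return args
```

With these rules, <so intro methods> names intro.html and methods.html,
while <include macros.tex> names macros.tex unchanged.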
Additional tags
To control the output of the filter, some additional tags are needed.
One is the control tag, which marks portions of
the text that must not
appear in the \LaTeX\ output:
-
<control> -- text will not appear in \LaTeX document
-
</control> -- end of control mode
Another tag is the nonumber tag, which may be used
to suppress the numbering of chapters and sections:
-
<nonumber> -- to avoid numbering of sections
-
</nonumber> -- to allow numbering of sections
With the nonumber tag active, the filter outputs,
for example, \chapter*{ instead
of \chapter{ when translating the h1
tag.
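The effect of these two tags can be sketched as below.
The names strip_control and heading_command are hypothetical;
the starred forms for h2 to h4 are an assumption by analogy with
the \chapter*{ example, which is the only case the text spells out.

```python
import re

# Text between <control> and </control> is dropped from the output.
CONTROL = re.compile(r"<control>.*?</control>", re.IGNORECASE | re.DOTALL)

# Heading tags and the LaTeX commands they translate to.
HEADINGS = {"h1": "chapter", "h2": "section", "h3": "subsection", "h4": "paragraph"}


def strip_control(text):
    """Remove every control region, including its delimiters."""
    return CONTROL.sub("", text)


def heading_command(tag, nonumber=False):
    """Return the opening LaTeX command for a heading tag name such as h1."""
    command = HEADINGS[tag.lower()]
    # Inside <nonumber> ... </nonumber> the starred form is used.
    star = "*" if nonumber else ""
    return "\\" + command + star + "{"
```

Here heading_command("h1", nonumber=True) gives \chapter*{, matching
the example in the text.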
Translation of html tags
Only a subset of html is supported.
All tags (including the ones described above)
may be written in either lower case or upper case.
When using tags that are not in this list,
it is wise to employ the control
tags and include the \LaTeX\ text surrounded by
html comments.
-
<!-- no output
-
--> no output
-
<a ...> no output
-
</a ...> no output
-
<b> {\bf
-
</b> }
-
<br> newline
-
<code> \verb?
-
</code> ?
-
<dl>
\begin{description}
-
</dl>
\end{description}
-
<dd> ]
-
<dt> \item[
-
<em> {\em
-
</em> }
-
<h1> \chapter{
-
<h2> \section{
-
<h3> \subsection{
-
<h4> \paragraph{
-
</hX>
} for X = 1..4
-
<hr> no output
-
<p> double newline
-
<i> {\it
-
</i> }
-
<li> \item
-
<menu>
\begin{itemize}
-
</menu>
\end{itemize}
-
<ol>
\begin{enumerate}
-
</ol>
\end{enumerate}
-
<pre>
\begin{verbatim}
-
</pre>
\end{verbatim}
-
<tt> {\tt
-
</tt> }
-
<ul>
\begin{itemize}
-
</ul>
\end{itemize}
Anything in between tags is output, unless
the control mode is active.
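A table-driven translation along these lines can be sketched in Python.
This is not the filter's implementation: the name translate is
hypothetical, and the sketch makes several assumptions that the list
above does not settle (a space after font commands such as {\bf,
a newline after \begin and \end lines, and unknown tags passed
through unchanged). Control mode and file inclusion are left out.

```python
import re

# Tag name (lower case, with a leading / for closing tags) to LaTeX output.
TABLE = {
    "a": "", "/a": "",
    "b": "{\\bf ", "/b": "}",
    "br": "\n",
    "code": "\\verb?", "/code": "?",
    "dl": "\\begin{description}\n", "/dl": "\\end{description}\n",
    "dd": "]", "dt": "\\item[",
    "em": "{\\em ", "/em": "}",
    "h1": "\\chapter{", "h2": "\\section{",
    "h3": "\\subsection{", "h4": "\\paragraph{",
    "/h1": "}", "/h2": "}", "/h3": "}", "/h4": "}",
    "hr": "",
    "p": "\n\n",
    "i": "{\\it ", "/i": "}",
    "li": "\\item ",
    "menu": "\\begin{itemize}\n", "/menu": "\\end{itemize}\n",
    "ol": "\\begin{enumerate}\n", "/ol": "\\end{enumerate}\n",
    "pre": "\\begin{verbatim}\n", "/pre": "\\end{verbatim}\n",
    "tt": "{\\tt ", "/tt": "}",
    "ul": "\\begin{itemize}\n", "/ul": "\\end{itemize}\n",
}

# Comment delimiters, or a tag whose name is captured in group 1.
TOKEN = re.compile(r"<!--|-->|<(/?[a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def translate(html):
    """Replace each known tag by its LaTeX text; keep everything else."""
    def replace(match):
        name = match.group(1)
        if name is None:
            # <!-- and --> produce no output; the text between them is kept.
            return ""
        # Tags may be written in lower or upper case.
        # Assumption: tags not in the table are left untouched.
        return TABLE.get(name.lower(), match.group(0))
    return TOKEN.sub(replace, html)
```

Because comment delimiters produce no output while their contents pass
through, \LaTeX\ text placed inside an html comment reaches the output
unchanged, which is what the advice on unsupported tags relies on.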
ISO characters
Only a subset of the full ISO character set is supported: