HOWTO HOWTO
Mark F. Komarinski
v0.12, 2 September 1999

Getting a new LDP author up and running with tools, ideas, and
conventions used by the LDP
______________________________________________________________________

Table of Contents

1. Introduction
   1.1 History
   1.2 New versions
      1.2.1 Version History
   1.3 Copyrights and Trademarks
   1.4 Acknowledgements and Thanks

2. Background on the LDP and SGML
   2.1 The LDP
   2.2 SGML
      2.2.1 Why SGML instead of HTML or other formats?
   2.3 The tools
      2.3.1 sgmltools
      2.3.2 TeX
      2.3.3 LyX

3. Getting Started
   3.1 Mailing lists
   3.2 Downloading and installing the tools
      3.2.1 sgmltools
   3.3 Writing SGML by hand
      3.3.1 Starting out
      3.3.2 Header information
      3.3.3 Sections
      3.3.4 Normal paragraphs
      3.3.5 Enhanced Text
      3.3.6 Lists
      3.3.7 Verbatim text
      3.3.8 URLs
      3.3.9 References
      3.3.10 Special characters
   3.4 Writing SGML using other tools
      3.4.1 LyX
      3.4.2 Emacs
      3.4.3 Other SGML tools
   3.5 CVS basics
   3.6 Distributing your documentation
      3.6.1 Before you distribute
      3.6.2 Submission to LDP

4. Style guides

5. FAQs about the LDP
   5.1 I want to help the LDP. How can I do this?
   5.2 I want to publish a collection of LDP documents in a book. How
       is the LDP content licensed?
   5.3 I found an error in an LDP document. Can I fix it?
______________________________________________________________________

1. Introduction

1.1. History

This document was started on Aug 26, 1999 by Mark F. Komarinski
markk@cgipc.com after two days' worth of frustration getting tools to
work. If even one LDP author is helped by this, then I did my job.

1.2. New versions

The newest version of this document can be found on my homepage
http://www.cgipc.com/~markk in its SGML source. Other versions may be
found in different formats at the LDP homepage
http://www.linuxdoc.org/ .

1.2.1. Version History

v0.12 (Sep 2, 1999)

·  Completed most sections

·  Integrated changes from ldp-discuss list

v0.10 (Aug 27, 1999)

·  Got up to section 3.4 written

·  Added to the outline some

·  Changed location of LDP mailing list to lists.debian.org from
   thepuffingroup.com.

v0.01 (Aug 27, 1999)

·  First pass, got web page up, simple outline written.

·  Take some of what I wrote with a grain of salt. Some things need
   to be verified.

1.3. Copyrights and Trademarks

(c) 1999 Mark F. Komarinski

This document may be distributed under the terms set forth in the LDP
license at http://www.linuxdoc.org/COPYRIGHT.html .

1.4. Acknowledgements and Thanks

Thanks to everyone who gave comments as I was writing this. This
includes Deb Richardson and Daniel Barlow and other members of the
ldp-discuss list. Some sections I got from the HOWTO Index (available
at many LDP locations) and the sgmltools documentation. There are
pointers to sgmltools and the LDP elsewhere in this document.

2. Background on the LDP and SGML

2.1. The LDP

The Linux Documentation Project (LDP) was started to provide new
users a way of getting information quickly about a particular
subject. It not only contains a series of books on administration,
networking, and programming, but also has a large number of smaller
works on individual subjects, written by those who have used it. If
you want to find out about printing, you get the Printing HOWTO. If
you want to do some networking, grab the Ethernet HOWTO, and so on.

At first, many of these works were in text or HTML. As time went on,
a better way of managing these documents was needed: one that would
let you read them from a web page, a text file on a CD-ROM, or even
your handheld PDA. The answer, as it turns out, is SGML.

2.2. SGML

The Standard Generalized Markup Language (SGML) is a language that is
based on marking up text. In this way, it's similar to TeX, groff, or
HTML.
The power of SGML is that, unlike WYSIWYG (What You See Is What You
Get) tools, you don't define things like colors, font sizes, or even
some kinds of formatting. Instead, you define elements (paragraph,
section, numbered list) and let the SGML processor and the end
program worry about placement, colors, fonts, and so on. HTML works
the same way, and is in fact an application of SGML.

SGML is really made up of two parts. First is the Structure, which is
commonly called the DTD, or Document Type Definition. The DTD defines
the relationship between each of the elements. The LinuxDoc DTD, used
to create this document, is an example. The DTD gives a common look
and feel to every document created with it.

Second is the Content, which is what gets rendered by the SGML
processor and is eventually seen by the user. This paragraph is
content, but so is a graphic image, table, numbered list, and so on.
Content is surrounded by tags to separate each element from the
others.

Over time, the LDP is going to change over from the LinuxDoc DTD to
the DocBook DTD, which is used by other projects and will give LDP
documents a look and feel consistent with other SGML documentation.
As this happens, we'll keep you updated via this HOWTO or on the
mailing lists. The biggest difference between LinuxDoc and DocBook is
that DocBook assigns tags to different types of content (such as
commands, file names, directories, and so on) while LinuxDoc assigns
tags based on the way the text should look (you can mark text as
emphasized or typewriter, for example).

2.2.1. Why SGML instead of HTML or other formats?

SGML provides for more than just formatting. You can automatically
build indexes, tables of contents, and links within the document or
to outside documents. The sgmltools package also lets you export
(I'll call it render from here on) SGML to LaTeX, info, text, HTML,
and RTF. From these basic formats, you can then create other formats
(DOC, PostScript, and so on).
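
As a rough illustration of the split between structure and content
described above, here is a minimal sketch of what a LinuxDoc source
file might look like. The title, author, date, and text are
placeholders made up for this example and do not come from any real
HOWTO. The individual tags are covered in the section on writing SGML
by hand.

```sgml
<!doctype linuxdoc system>
<!-- Hypothetical example: every name and sentence below is a placeholder -->

<article>

<title>Example HOWTO
<author>A. N. Author
<date>v0.1, 1 January 2000
<abstract>
A one-line summary of what this example document covers.
</abstract>

<!-- The processor builds the table of contents here -->
<toc>

<sect>Introduction
<p>
This paragraph is content. Nothing here says which font, size,
or color to use.

<sect1>A subsection
<p>
<enum>
<item>First point
<item>Second point
</enum>

</article>
```

Nothing in this file mentions fonts, colors, or page layout. The same
source can be rendered to HTML, text, or RTF, and the processor
decides how each element looks in each format.
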
SGML doesn't suffer from some of the bloating seen in HTML of late. I
don't think you'll be seeing a tag in SGML anytime soon. This makes
the code that comes out not only easy to render, but easy to write as
well. Programs like LyX (right now my WYSIWYM editor of choice) allow
you to write in TeX format, then export it as SGML and render from
SGML to whatever you choose.

In the end, SGML is more concerned with the way elements work than
with the way they look. This is a big distinction, and one that will
let you write faster, since you don't have to worry about placement
of paragraphs, font sizes, font types, and so on.

2.3. The tools

In this section, I'll go over some of the tools that you'll need or
want to use to create your own LDP documentation. I'll describe them
here, and define them in more detail later on, along with how to
install them. If you use some other tool to assist in writing LDP
documents, please let me know and I'll add a blurb here for it.

2.3.1. sgmltools

Required

The sgmltools package contains the SGML tools needed to render SGML
as any of the file formats listed above. It also contains the
LinuxDoc DTD, needed to make LDP documentation. To create only SGML
documentation, this is all you need. If you want to render to formats
like TeX, you'll need to get those packages as well.

The sgmltools package is available either with your distribution of
choice, or via http://www.sgmltools.com/

2.3.2. TeX

Optional

TeX (rhymes with blech!) is the markup language of choice for many,
including those in the mathematics world. I still remember many
Calculus exams that were actually written in TeX. It is also one of
the earliest markup languages still in use (the other being the *roff
formats used in man pages). TeX actually follows some of the same
concepts that SGML does. However, TeX renders its files into DVI
(Device Independent) format, which can then be rendered into another
format.
Unfortunately, DVI can't be easily converted into anything other than
printer languages (PostScript, PCL), making it hard to use for
generating HTML.

TeX is installed by default or available as a package with most Linux
distributions, usually as LaTeX or teTeX. Either should work for you.

2.3.3. LyX

Optional

LyX is a graphical WYSIWYM (What You See Is What You Mean) editor. It
provides a much-needed link between an easy-to-use graphical
application and renderer and the sometimes-complex rules of SGML. LyX
was originally designed for writing TeX documents, and many of the
TeX rules apply in LyX. For example, while sections are automatically
numbered, you can't easily insert whitespace (spaces and tabs). It's
against what TeX was designed to do. As it happens, SGML often
ignores the same whitespace.

LyX can read the LinuxDoc DTD and provide a template document, so you
can write (or edit) your LDP documentation in a way that you're
familiar with, without having to use vi and remember what the tags
are for itemizing a list.

LyX is available at http://www.lyx.org/ .

3. Getting Started

This section shows how to get involved in writing your own LDP
documentation. It covers getting and setting up the tools, making
contact with the LDP in general, and distributing what you know to
all the Linux users out there.

3.1. Mailing lists

There are a few mailing lists to subscribe to so you can take part in
how the LDP works. First is ldp-discuss@lists.linuxdoc.org , which is
the main discussion group of the LDP. To subscribe, send a message
with the subject "subscribe" to
ldp-discuss-request@lists.linuxdoc.org . To unsubscribe, send an
e-mail with the subject "unsubscribe" to the same address,
ldp-discuss-request@lists.linuxdoc.org .

3.2. Downloading and installing the tools

3.2.1. sgmltools

Download the sgmltools package from http://www.sgmltools.org/ , or
directly from your distribution.
The files available from sgmltools.org are in source code format, so
you will have to compile them for your machine. Using a pre-built
package for your distribution is easier, as you don't have to compile
anything and potentially run into compilation issues (that is, if
you're not a coder).

With Red Hat, sgmltools is included with the distribution. If it is
not installed, you can download it from ftp.redhat.com or any of its
mirrors as part of the main distribution.

If you're using Debian, it too has sgmltools in the standard
distribution. If you don't have the package installed, you can use
the apt-get command to download and install the package for you:

______________________________________________________________________
# apt-get install sgml-tools
______________________________________________________________________

For more information on the Debian package, you can look at
http://www.debian.org/Packages/stable/text/sgml-tools.html

If compiling from source, all you need to do is:

     # tar -zxvf sgmltools-x.x.x.tar.gz
     # cd sgmltools-x.x.x
     # ./configure
     # make
     # make install

Replace sgmltools-x.x.x with the actual version of the sgmltools
package you're using. The current version as of this writing that
supports LinuxDoc is 1.0.9. The version that supports DocBook is
2.0.2. Both are available at the above web site.

Once the tools are installed, you have a number of commands available
to you.

sgmlcheck file.sgml
     Checks the syntax of a given document.
sgml2html file.sgml
     Converts an SGML file into HTML. Creates a file.html file that
     contains the Table Of Contents, then creates file-x.html files
     where x is the section number.
sgml2rtf file.sgml
     Converts an SGML file into Rich Text Format (RTF). Creates two
     files: file.rtf, which contains the TOC, and file-0.rtf, which
     contains all the sections.
sgml2txt file.sgml
     Converts an SGML file into ASCII text. The TOC and all sections
     are put into file.txt.
sgml2info file.sgml
     Converts an SGML file into INFO format, used by the info
     command. All output is sent to file.info.
sgml2latex file.sgml
     Converts an SGML file into LaTeX.
sgml2lyx file.sgml
     Converts an SGML file into a file for the LyX graphical editor.
     This is great if you have pre-generated SGML files and want to
     convert them for use in LyX.

3.3. Writing SGML by hand

Much like HTML, you can write SGML by hand, once you know all the
markup codes you want to use. This section will go over as many of
these codes as possible, along with practical examples of each. A
nice place to start would be the SGML source for this document, which
is available at the web site listed in the ``Introduction''.

As the SGML may be processed differently depending on the output
format you render to, I'll try to list some things to watch for as
you're writing.

3.3.1. Starting out

To start a new document, create a new file in your favorite ASCII
editor and start with this:

     <!doctype linuxdoc system>

This defines the document type (LinuxDoc in our case) that the SGML
processor will use when it renders the file in an output format.
Nothing is rendered from this tag.

Next you need to enclose the rest of your work in <article> and
</article> tags. This signifies the start of the content (or article,
eh?). If you're familiar with HTML, this is similar to enclosing all
your content with <html> and </html>.

3.3.2. Header information

The first part of the content should contain general information
about the rest of the content. This is similar to the first few pages
of a book, where you have a title page (title of the work, author,
date of publication, table of contents, and so on).

The title of the content is enclosed in <title> and </title> tags.
The author is specified in <author> and </author> tags. The date uses
<date> and </date>.

The two remaining parts are the <abstract> and </abstract> tags,
which provide an executive summary of what the content is about, and
the <toc> tag, which specifies the location of the table of contents.
The TOC is automatically generated by the SGML processor. We'll get
into sections later on.

Now, how does it all look together? Taking a nice bit of SGML code
(that is, what was used to create this document) you'll see:
     <title>HOWTO HOWTO</title>
     <author>Mark F. Komarinski</author>
     <date>Aug 27, 1999</date>
     <abstract>
     Getting a new LDP author up and running with tools, ideas, and
     conventions used by the LDP
     </abstract>
     <toc>

This bit of content created the main page you see when you look at
this document in RTF or HTML format, listing all the information on
one page.

3.3.3. Sections

In order to build the Table of Contents, you need to have something
to build it with. Sections in SGML are the equivalent of chapters in
traditional publishing. You have multiple sections, and each section
can have a subsection, and each of those can have a subsection, and
so on.

Starting your document with sections is great as it lets you create
an outline of the major topics you want to cover. You can then break
down these major sections into gradually smaller sections, until you
have a nugget of information you can write about in a few short
paragraphs. In writing this document, I actually started this way.

Sections are one of the few sets of SGML tags that don't need to be
closed. That is, there is no </sect> tag. Nor do you have to worry
about numbering. The SGML processor will handle it all when you
render the SGML into something else.

Sections are started with <sect> tags. A new section is started with
each <sect> tag. The first section is numbered 1. Subsections (like
1.1) are created with the <sect1> tag, and their numbering also
starts with 1. Subsubsections (1.1.1) are created with the <sect2>
tag, and also start with 1.

When the SGML processor comes across the <toc> tag, it runs through
the rest of the document and builds the Table Of Contents based on
the section tags within it. Sections are numbered and listed in the
TOC and then used in the rest of the document. Subsubsections (1.1.1)
do not show up in the TOC, but are put in emphasized text if
possible.

3.3.4. Normal paragraphs

Writing paragraphs of content is just like in HTML. Use a <p> tag to
start a new paragraph, and start writing. SGML will ignore whitespace
such as tabs, multiple spaces, and newlines. When the SGML processor
comes across a <p> tag, it starts a new paragraph. Proper SGML has
you put in a </p> to end the paragraph.

3.3.5. Enhanced Text

Every now and then you need a bit of text to stand out from the rest,
either to highlight code or to list a command name. The first
(emphasizing text) is done with <em> and </em> tags. Typewriter text
(the second example) is done with <tt> and </tt> tags.

3.3.6. Lists

There are several ways of doing lists under SGML. First is an
enumerated list, where each item in the list is numbered (like
sections) starting with 1.

  1. This is the first entry in the enumerated list.

  2. This is the second.

  3. Third.

The code for the above list looks like this:

     <enum>
     <item>This is the first entry in the enumerated list.
     <item>This is the second.
     <item>Third.
     </enum>

The <enum> tag specifies that the following items are going to be
enumerated. The other method of writing lists is itemized, where each
item merely has a star, or circle, or dot, or some other marker in
front of it.

·  This is the first entry in the itemized list

·  This is the second

·  Third

The above list looks like this in raw SGML:

     <itemize>
     <item>This is the first entry in the itemized list
     <item>This is the second.
     <item>Third.
     </itemize>

As you can see, the <item> tag is the same for enumerated and
itemized lists.

A third form of list is the description list. This has a term being
described, and the phrase that describes it.

LDP
     The Linux Documentation Project

SGML
     Standard Generalized Markup Language

The code to create the above descriptions is:

     <descrip>
     <tag>LDP</tag>The Linux Documentation Project
     <tag>SGML</tag>Standard Generalized Markup Language
     </descrip>

This isn't quite the same as itemized or enumerated lists. The entire
list is still surrounded by a tag pair (<descrip> and </descrip>),
but each word being defined is enclosed in <tag> and </tag>. The
remainder of the line is taken to be the definition of the word.

3.3.7. Verbatim text

Sometimes you just need to print some text exactly the way you wrote
it. For this, you can use the <verb> and </verb> tags to enclose a
paragraph in verbatim mode. Spaces, carriage returns, and other
literal text (including special characters) are preserved until the
</verb>.

     The following is verbatim text.

3.3.8. URLs

SGML can also handle Uniform Resource Locators (URLs) of any kind.
Note that the link is only live when the document is rendered to
HTML, but you'll get some use out of this tag in other formats (does
RTF use it too?). A URL doesn't have an end tag, but puts its
information within the tag itself.

Here is a URL that points to the LDP homepage:
http://www.linuxdoc.org/ . And here's the code to create it:

     <url url="http://www.linuxdoc.org/" name="http://www.linuxdoc.org/">

The url="http://www.linuxdoc.org/" tells the browser where to go,
while the contents of the name="http://www.linuxdoc.org/" tells the
browser what to print out to the screen. In this case, the two are
the same, but I could create a URL tag that looks like this:

     <url url="http://www.linuxdoc.org/" name="LDP">

And then looks on the page like this: LDP .

3.3.9. References

While URLs are great for linking to content outside the LDP document
you're working on, they're not that great for linking within the
content itself. For this, you use the