The Modern Technical Communication + Publishing Process: Principles and Best Practices

Technical Writing Principles / Best Practices

Create Content, Not Documents

In the age of complex hypermedia and ever-changing document formats and delivery platforms, you cannot afford to create a set of documents and do it all again when the next wave of technical progress makes your selected formats obsolete. Create reusable content.

Consistency

Avoid variations that may be helpful and entertaining in prose and journalism, but could lead to confusion in product literature. Always use the same terms for the same concepts and user interface elements.

Style

See the world (and your product) from the user’s point of view.
Write in a style appropriate for your target audience.
Don’t simplify things more than necessary – don’t talk down to users.
Don’t make things more complex than necessary – avoid jargon and overly complicated sentence structures.
Use active voice.

Terminology

Use consistent terminology. Avoid synonyms and ambiguity.
Use established terminology. Avoid neologisms.

Reusability / Granularity

Find the appropriate way of “chunking” your content, based on audience, technical platform, topic, etc.

If your content chunks are too small, too much user interaction is required.
If your content chunks are too big, users will find it hard to navigate and return to relevant content.

Granular content can be reused in a system that supports transclusion, i.e. including one chunk in another by reference rather than by copying it.
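
As a minimal sketch of how transclusion works, assume a hypothetical `{{include:chunk-id}}` placeholder syntax and an in-memory dictionary of chunks (neither is taken from a particular tool). Each placeholder is replaced by the chunk it names, and chunks may include other chunks:

```python
import re

# Hypothetical placeholder syntax: {{include:chunk-id}}
INCLUDE = re.compile(r"\{\{include:([\w-]+)\}\}")


def transclude(text, chunks, _seen=()):
    """Replace each placeholder with the (recursively resolved) chunk it names."""

    def resolve(match):
        chunk_id = match.group(1)
        if chunk_id in _seen:
            # A chunk that includes itself, directly or indirectly.
            raise ValueError(f"circular include: {chunk_id}")
        return transclude(chunks[chunk_id], chunks, _seen + (chunk_id,))

    return INCLUDE.sub(resolve, text)


# Illustrative content only.
chunks = {
    "warning": "Disconnect the power supply first.",
    "battery": "{{include:warning}} Then open the battery cover.",
}
manual = "Replacing the battery: {{include:battery}}"
print(transclude(manual, chunks))
```

The warning exists once in the chunk store, so a correction to it reaches every document that includes it.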

Topic-Based Authoring

A content chunk should preferably be complete (describing one concept, topic or task), with context and navigation items leading to related information. Topic-based writing is one of the core principles of modern technical writing.

Completeness

The product or service should be described in full. If this is not possible, provide links to related information.

Topicality

Make sure your content is up-to-date.
Provide metadata (release date of documentation / software / firmware described) that allows users to verify if they are using the correct version of a document.

Relevance

Do not omit relevant information.
Do not provide irrelevant information, unless it serves marketing purposes (information about other products and services, setting the mood).

Semantic markup

Do not format documents visually (by applying typographic markup as such); mark them up semantically. That is, use markup that conveys the meaning of a passage, for example whether it is:
- instructive
- descriptive

Technical Communication Workflows

Books have been written about the technical writing process. In general, it’s not that different from journalism and scientific writing:

Material Collection / Assessment

Research your topic. Talk to subject matter experts if you aren’t one yourself. Collect existing information about the product from designers, developers, engineers.

Structure

Outline your project. You will probably be able to benefit from existing work: templates, existing documentation for earlier products from your own company or from competitors.

Draft

Write a draft. Don’t get lost in details; broad strokes help you get an idea of the project scope.

Add media

Depending on the target format(s), you may be able to work with illustrations, photos, embedded or linked video, audio examples, image maps, interactive elements.

Expand/complete your document

Get feedback from subject-matter experts, product managers, the marketing team and – most importantly – end-users.

Typesetting

Bring your project into a presentable shape. If you are using a structured/semantic approach (and the tools that support it), you may have very little work to do here: proper style sheets will transform structured content into beautiful documents, and usually only a little post-editing will be required.

Review

If possible, have people review your document who haven’t read earlier versions and who don’t know too much about the product itself. Add, remove and rewrite content as necessary. Create a “bin” with reusable material.

Proofreading

After the review phase (which should focus on the quality and consistency of the content), make sure that there aren’t embarrassing errors in the document. Have a professional writer or proofreader check your document for grammatical issues, spelling errors, bad typography etc.

Publishing

Have your technical writing software / publishing tool / content management system render your content in its final forms – typically, PDF or web pages.

Technical Communication Tools

Off-the-shelf vs. custom solution

If you download a manual in PDF format for a consumer or professional product in 2017, it is very likely that it was created from either an InDesign or a Word file. Most companies “play it safe”, using tools that everyone has and knows. Major companies that have full writing and translation departments can afford to build their own workflows, often based on an XML format such as DITA or DocBook (usually customised), or even on their own “from scratch” format.

The advantages and disadvantages of both approaches are the same as with most other software(-based) projects:

For popular standard tools, there is usually a rich knowledge base: people who know the features and limitations of these tools, third-party extensions and other useful resources. On the other hand, building custom solutions on top of them can be hard or impossible.
Custom formats and workflows can be heavily adapted to a company’s requirements, and there is no need to live with all the compromises and limitations of a “one size fits everybody” approach. On the other hand, you either have to create everything yourself or support standard APIs and file exchange formats that allow you to work with third-party systems.

Content Creation Tools

Help Authoring Software

Help Authoring Software Features

A Help Authoring Tool (or HAT) is a software program used by technical writers to create online help systems. Common features include:

File import from third-party tools (e.g. general-purpose word processors, raw text, HTML, XML)
Help output in various formats such as WinHelp, PDF, XML, HTML
Content editing (standard text processor functions)
Generation of navigation aids (links, cross-references, index, table of contents etc.)
Image editing, image hotspot editing (image maps)

Help Authoring Software – relevant products

MadCap Flare – Professional, commercial Windows software
EC Software Help and Manual – Professional, commercial Windows software
Paligo – web-based content management system for technical writing

Outliners + Text Editors

Outliners and word processors with outlining modes allow users to create and navigate hierarchical structures (outlines). They often provide features for collapsing, expanding, moving and filtering document sections. Outliners are excellent tools for creating the “skeleton” of documents that will later be expanded and formatted in other tools.

OmniGroup OmniOutliner – Professional macOS outliner
MindNode – macOS mind mapping tool with outlining mode; imports and exports OPML

Typesetting software / engines

LaTeX

Single Source Publishing

Single-source publishing, also known as single-sourcing publishing, is a content management method which allows the same source content to be used across different forms of media and more than one time. The labor-intensive and expensive work of editing need only be carried out once, on only one document; that source document can then be stored in one place and reused. This reduces the potential for error, as corrections are only made one time in the source document.
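
The idea can be sketched in a few lines: one source, several renderers. The block kinds (`title`, `para`) and both output functions below are illustrative assumptions, not a real schema or tool.

```python
from html import escape

# One source document: a list of (kind, text) blocks.
# The block kinds are made up for this sketch.
SOURCE = [
    ("title", "Quick Start"),
    ("para", "Connect the device & switch it on."),
]


def to_html(blocks):
    """Render the source for the web."""
    tags = {"title": "h1", "para": "p"}
    return "\n".join(f"<{tags[k]}>{escape(t)}</{tags[k]}>" for k, t in blocks)


def to_text(blocks):
    """Render the same source as plain text, e.g. for a README."""
    lines = [t.upper() if k == "title" else t for k, t in blocks]
    return "\n\n".join(lines)


print(to_html(SOURCE))
print(to_text(SOURCE))
```

A correction made in `SOURCE` shows up in both outputs the next time they are rendered, which is the error-reduction benefit described above.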

For more information, see “Single Source Publishing” at Wikipedia

InDesign

Adobe InDesign is a desktop publishing software application produced by Adobe Systems. It can be used to create works such as posters, flyers, brochures, magazines, newspapers, presentations, books and ebooks. InDesign can also publish content suitable for tablet devices in conjunction with Adobe Digital Publishing Suite. Graphic designers and production artists are the principal users, creating and laying out periodical publications, posters, and print media. It also supports export to EPUB and SWF formats to create e-books and digital publications, including digital magazines, and content suitable for consumption on tablet computers. In addition, InDesign supports XML, style sheets, and other coding markup, making it suitable for exporting tagged text content for use in other digital and online formats. The Adobe InCopy word processor uses the same formatting engine as InDesign.

FrameMaker

Adobe FrameMaker is a document processor designed for writing and editing large or complex documents, including structured documents. It is produced by Adobe. Users can work with unstructured and structured content in the same documentation. Content can be published as responsive HTML, PDF, EPUB, and other formats. FrameMaker supports DITA.

Word Processors

General-purpose word processors aren’t the greatest tools for creating complex, long-form documents such as reference manuals, but if no other tools are available or affordable, they will do. In that case, the writer should ensure that the software’s file format is supported by other tools “downstream”, especially the translation software or system. For example, Apple’s Pages and Scrivener are low-cost, easy-to-learn applications, but most commercial translation tools don’t support their native file formats.

Apple Pages
Google Docs
Literature & Latte Scrivener
Microsoft Word
OpenOffice / LibreOffice

Document Management Systems

A document management system is software used to track, manage and store documents and to reduce paper use. Most document management systems keep records of the various versions created and modified by different users (history tracking). The term has some overlap with the concept of Content Management Systems. A document management system is often viewed as a component of Enterprise Content Management and related to Digital Asset Management.

Publishing Tools + Platforms

Web servers

Content Management Systems

Wiki Engines

PDF Renderers

PDF renderers convert HTML and XML source files into PDFs, typically using Cascading Style Sheets (CSS) for output formatting. They also process CSS instructions for paged media, such as headers, footers and cross-references to page numbers, that are usually not handled by web user agents (browsers).

Lightweight Markup Languages and Converters

For an introduction to LML, see Lightweight Markup Languages.

Markdown – the most popular / successful lightweight markup language, but with no new releases or features since the original version
MultiMarkdown – builds on Markdown, extending it with many modern features (table of contents, tables, citations, cross-references, smart typography etc.)
Pandoc – a Swiss army knife for lightweight markup languages that converts dozens of formats into each other (including HTML, Office formats, EPUB, OPML, LaTeX, MediaWiki markup etc.). It also extends basic Markdown. The feature set is impressive, but it’s a command line tool and has a certain learning curve.

For differences between MultiMarkdown and Pandoc, see “A comparison of Pandoc and Multimarkdown”

Static Site Generators

Static site generators read a source (usually text files written in one of the popular lightweight markup languages), template files and other assets and render them into full web pages. There are some fundamental differences from traditional, “dynamic” web sites. On the plus side, static sites are served very quickly (as there is very little server-side processing), and there are fewer security concerns, as there is very little technology on the server side that can be hacked or attacked. Apart from the web server itself, the live site needs no packages, libraries, modules, frameworks or database engines, and so has no runtime dependencies.

On the other hand, personalising the user experience in any way is harder than with content management systems, as all pages are essentially “frozen”. However, interactivity can still be added in the form of third-party modules such as forum, survey and voting components.
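
The build step itself can be sketched roughly as follows. This is a toy, not how Jekyll or Hugo work internally: the one-title-line-plus-paragraphs source format, the `PAGE` template and the `build_site` function are all assumptions made for illustration.

```python
import tempfile
from html import escape
from pathlib import Path
from string import Template

# Hypothetical page template; real generators use richer template languages.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1>$body</body></html>"
)


def build_site(source_dir, output_dir):
    """Render every .txt file in source_dir to a static .html file."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(source_dir.glob("*.txt")):
        # Assumed source format: first line is the title, then paragraphs
        # separated by blank lines.
        title, _, rest = src.read_text(encoding="utf-8").partition("\n")
        paragraphs = [p.strip() for p in rest.split("\n\n") if p.strip()]
        body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
        target = output_dir / (src.stem + ".html")
        target.write_text(
            PAGE.substitute(title=escape(title), body=body), encoding="utf-8"
        )
        written.append(target.name)
    return written


with tempfile.TemporaryDirectory() as tmp:
    src, out = Path(tmp, "content"), Path(tmp, "public")
    src.mkdir()
    (src / "index.txt").write_text(
        "Welcome\nFirst paragraph.\n\nSecond paragraph.", encoding="utf-8"
    )
    print(build_site(src, out))
    print((out / "index.html").read_text(encoding="utf-8"))
```

Everything happens at build time. The web server only has to deliver the finished files, which is where the speed and security advantages come from.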

For an introduction to the concept of static site generators and some popular products, read “An Introduction to Static Site Generators” by David Walsh.

Jekyll
Hugo

Analysis

Creating great documentation is one thing; learning if and how people have actually used it is another.

If your documentation is in a downloadable, self-contained format such as PDF, you’ll receive very little information about its use. You’re basically stuck with your web server software’s log files, which will tell you how often “manual.pdf” was downloaded.
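
As a rough sketch of that kind of analysis, the following counts successful PDF downloads. It assumes the server writes Common Log Format lines; the sample lines, paths and the `count_downloads` helper are made up for illustration.

```python
import re
from collections import Counter

# Matches the request and status fields of a Common Log Format line.
REQUEST = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)


def count_downloads(log_lines, suffix=".pdf"):
    """Count successful GET requests per path for files ending in suffix."""
    counts = Counter()
    for line in log_lines:
        m = REQUEST.search(line)
        if (m and m["method"] == "GET" and m["status"] == "200"
                and m["path"].endswith(suffix)):
            counts[m["path"]] += 1
    return counts


# Invented sample data.
LOG = [
    '203.0.113.5 - - [10/Oct/2017:13:55:36 +0000] "GET /docs/manual.pdf HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2017:14:02:11 +0000] "GET /docs/manual.pdf HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2017:14:03:40 +0000] "GET /docs/old.pdf HTTP/1.1" 404 153',
    '203.0.113.7 - - [10/Oct/2017:14:05:02 +0000] "GET /index.html HTTP/1.1" 200 5120',
]
print(count_downloads(LOG))
```

A download count says nothing about whether the file was opened or which pages were read, which is the limitation described above.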

If your manual is part of your website (either in the form of static HTML pages or as regular web content), you can learn about access to specific manuals and (depending on its granularity) even sections or pages as you would with any other page.

Web Server Log File Analyzers

MadCap Analyzer

Creation / Formatting Models

WYSIWYG

WYSIWYG (“what you see is what you get”) is an approach to document editing where the presentation of a document on screen closely resembles the appearance when printed or displayed as a finished product, such as a printed document, a web page, or slide presentation. Accordingly, WYSIWYG applications will usually present editing tools that refer to the visual appearance of a document (e.g. for font sizes, style etc.)

WYSIWYG-based formatting usually results in documents that contain very little structural information, making them hard for machines and translation systems to parse and less accessible to users with disabilities.

Structured Authoring

One possible definition for Structured Authoring is “a methodological approach to the creation of content incorporating information types, systematic use of metadata, XML-based semantic mark-up, modular, topic-based information architecture, a constrained writing environment with software-enforced rules, content re-use, and the separation of content and form.” (from the Oxygen XML documentation)

Usually, Structured Authoring tools will enforce the use of allowed markup elements based on an XML DTD.
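
To give an idea of what “enforcing allowed markup elements” means, here is a deliberately simplified stand-in for DTD validation. The `ALLOWED` content model and the element names are hypothetical. A real DTD also constrains element order and cardinality, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: element name -> allowed child element names.
ALLOWED = {
    "task": {"title", "steps"},
    "steps": {"step"},
    "step": set(),
    "title": set(),
}


def validate(element, path=""):
    """Return a list of violations found in element and its descendants."""
    here = f"{path}/{element.tag}"
    if element.tag not in ALLOWED:
        return [f"{here}: unknown element"]
    errors = []
    for child in element:
        if child.tag not in ALLOWED[element.tag]:
            errors.append(f"{here}: <{child.tag}> not allowed here")
        else:
            errors.extend(validate(child, here))
    return errors


doc = ET.fromstring(
    "<task><title>Reset</title>"
    "<steps><step>Hold the button.</step><b>Now!</b></steps></task>"
)
print(validate(doc))
```

A structured editor typically runs such checks while the author types and refuses, or flags, the purely visual `<b>` element inside `<steps>`.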

In structured authoring, separation of content and form is a core principle: the way a piece of text looks during authoring is irrelevant. Formatting and presentation are post-authoring considerations, and possibly not performed by a technical writer at all.

Semantic Publishing

In theory, Structured Authoring tools and models could be used to enforce any kind of content/structure model. But usually, they will be used for semantic markup, where markers referring to the content itself are used.

From the Wikipedia article about Semantic HTML:

“As an example, recent HTML standards discourage use of the tag <i> (italic, a typeface) in preference of more accurate tags such as <em> (emphasis). A CSS stylesheet should then specify whether emphasis is denoted by an italic font, a bold font, underlining, slower or louder audible speech etc. This is because italics are used for purposes other than emphasis, such as citing a source (for this, HTML provides the tag <cite>).”

Conditional Formatting

Conditional formatting is used to mark up sections of a document that should only be used for a given:

output format (e.g. print or web)
product version
platform (e.g. the Windows or macOS version of an application)
market
audience

Conditional formatting is a very powerful concept for creating multiple documents from one source document.

Working with

semantic markup
transclusion
conditional formatting

can reduce the amount of redundancy in a given document set to a minimum.
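
Conditional filtering can be sketched as follows. The `platform` and `output` attribute names and the sample manual are assumptions for illustration; real tools each have their own way of declaring conditions.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: "platform" and "output" attributes mark conditional
# sections; elements without them are unconditional.
SOURCE = """<manual>
  <p>Open the Preferences dialog.</p>
  <p platform="windows">Press Ctrl+Comma.</p>
  <p platform="macos">Press Command+Comma.</p>
  <p output="print">See page 12.</p>
</manual>"""


def filter_conditions(element, **active):
    """Remove descendants whose condition attributes do not match `active`."""
    for child in list(element):
        mismatch = any(
            name in child.attrib and child.attrib[name] != value
            for name, value in active.items()
        )
        if mismatch:
            element.remove(child)
        else:
            filter_conditions(child, **active)
    return element


web_mac = filter_conditions(ET.fromstring(SOURCE), platform="macos", output="web")
print([p.text for p in web_mac.iter("p")])
```

Calling the same function with `platform="windows", output="print"` yields a different document from the same source.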

Standards and Formats for Technical Communication

DocBook

DocBook is a semantic markup language for technical documentation. It was originally intended for writing technical documents related to computer hardware and software but it can be used for any other sort of documentation.

As a semantic language, DocBook enables its users to create document content in a presentation-neutral form that captures the logical structure of the content; that content can then be published in a variety of formats, including HTML, XHTML, EPUB, PDF, man pages, Web help and HTML Help, without requiring users to make any changes to the source. In other words: a document written in DocBook format is easily portable to other formats. The reformatting problem is solved by writing the content once, using XML tags.

LaTeX

LaTeX is a document preparation system. When writing, the writer uses plain text as opposed to the formatted text found in WYSIWYG word processors. The writer uses markup tagging conventions to define the general structure of a document (such as article, book, and letter), to stylise text throughout a document (such as bold and italics), and to add citations and cross-references. A TeX distribution such as TeX Live is used to produce an output file (such as PDF or DVI) suitable for printing or digital distribution.

LaTeX can be used as a standalone document preparation system or as an intermediate format. In the latter role, for example, it is sometimes used as part of a pipeline for translating DocBook and other XML-based formats to PDF.

For more information, see the official LaTeX site.

DITA

The Darwin Information Typing Architecture or Document Information Typing Architecture (DITA) is an XML data model for authoring and publishing. It is an open standard that is defined and maintained by the OASIS DITA Technical Committee.

The name derives from the following components:

Darwin: DITA uses the principles of specialisation and inheritance, which are in some ways analogous to the naturalist Charles Darwin’s concept of evolutionary adaptation.
Information typing: Each topic has a defined primary objective (procedure, glossary entry, troubleshooting information) and structure.
Architecture: DITA is an extensible set of structures.

The DITA core principles are:

Content reuse: Topics can be reused across multiple publications. Fragments of content within topics can be reused through the use of content references (conref or conkeyref), a transclusion mechanism.
Information typing: DITA 1.3 includes five specialised topic types: Task, Concept, Reference, Glossary Entry, and Troubleshooting. Each of these five topic types is a specialisation of a generic Topic type, which contains a title element, a prolog element for metadata, and a body element. The body element contains paragraph, table, and list elements, similar to HTML.
Maps: A DITA map is a container for topics used to transform a collection of content into a publication. It gives the topics sequence and structure. A map can include relationship tables that define hyperlinks between topics. Maps can be nested. Maps can reference topics or other maps, and can contain a variety of content types and metadata.
Metadata: DITA includes extensive metadata elements and attributes, both at topic level and within elements. Conditional text allows filtering or styling content based on attributes for audience, platform, product, and other properties. The conditional processing profile (.ditaval file) is used to identify which values are to be used for conditional processing.
Specialisation: DITA allows adding new elements and attributes through specialisation of base DITA elements and attributes. Through specialisation, DITA can accommodate new topic types, element types, and attributes as needed for specific industries or companies. Specialisations of DITA for specific industries, such as the semiconductor industry, are standardised through OASIS technical committees or subcommittees. Many organisations using DITA also develop their own specialisations.
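
To illustrate how a map gives topics “sequence and structure”, the sketch below walks the nested `topicref` elements of a tiny map and lists the topics in reading order. The file names are made up, and the code is only an illustration: it ignores references to other maps, keys and relationship tables, and is not how a DITA processor is implemented.

```python
import xml.etree.ElementTree as ET

# A tiny map with nested <topicref> elements; the file names are invented.
DITAMAP = """<map>
  <title>Widget User Guide</title>
  <topicref href="intro.dita">
    <topicref href="install.dita"/>
    <topicref href="configure.dita"/>
  </topicref>
  <topicref href="troubleshooting.dita"/>
</map>"""


def reading_order(element, depth=0):
    """Yield (depth, href) for each topicref in document order."""
    for ref in element.findall("topicref"):
        yield depth, ref.get("href")
        yield from reading_order(ref, depth + 1)


for depth, href in reading_order(ET.fromstring(DITAMAP)):
    print("  " * depth + href)
```

The topics themselves stay independent files. Only the map decides which of them appear in a publication, in what order and at what nesting level.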

For more information, see

the official DITA site
Wikipedia article on the Darwin Information Typing Architecture

Lightweight Markup Languages

A lightweight markup language is a markup language with simple, unobtrusive syntax. It is designed to be easy to create using any generic text editor, as well as easy to read in its raw form. Lightweight markup languages are used in applications where it may be necessary to read the raw document as well as the final rendered output.

For instance, a person downloading a software library might prefer to read the documentation in a text editor rather than a web browser. Another application for such languages is to provide for data entry in web-based publishing, such as weblogs and wikis, where the input interface is a simple text box. The server software then converts the input into a common document markup language like HTML.
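
The conversion step can be sketched with a toy converter for a tiny, Markdown-like subset (one heading level, `*emphasis*`, paragraphs). The `render_html` function is an assumption for illustration and not an implementation of Markdown or of any other lightweight markup language.

```python
import re
from html import escape


def render_html(source):
    """Convert a tiny Markdown-like subset to HTML."""
    blocks = []
    # Blocks are separated by blank lines.
    for block in source.strip().split("\n\n"):
        text = escape(" ".join(block.split()))
        # *text* becomes semantic emphasis.
        text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
        if text.startswith("# "):
            blocks.append(f"<h1>{text[2:]}</h1>")
        else:
            blocks.append(f"<p>{text}</p>")
    return "\n".join(blocks)


print(render_html("# Setup\n\nPress *Start* to begin."))
```

The raw input stays readable in any text editor, and the markup is produced only when the text is rendered.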

For more information, see the lightweight markup languages article at Wikipedia

Popular Lightweight Markup Languages

Markdown – the original Markdown version created by IT journalist John Gruber
MultiMarkdown – an extension of Markdown adding many advanced features
AsciiDoc – a lightweight markup language supporting many target formats (PDF, HTML, EPUB, HTML-based presentations in Slidy).

Asciidoctor claims that AsciiDoc is the right tool for the technical writer’s job: “AsciiDoc belongs to the family of lightweight markup languages, the most renowned of which is Markdown. AsciiDoc stands out from this group because it supports all the structural elements necessary for drafting articles, technical manuals, books, presentations and prose. In fact, it’s capable of meeting even the most advanced publishing requirements and technical semantics.” Unfortunately, the range of software for AsciiDoc-based workflows isn’t as rich as that of other general-purpose LMLs, and it seems that development of the original AsciiDoc implementation stopped in 2013.

Intermediate Formats

These are formats that are usually neither created directly by end users (as would be the case with lightweight markup languages) nor used natively by popular applications.

OPML

OPML (Outline Processor Markup Language) is an XML format for outlines (defined as “a tree, where each node contains a set of named attributes with string values”). It has since been adopted for many uses, the most common being to exchange lists of web feeds between web feed aggregators. Many outlining and mind mapping applications support it as an import and export format.
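
A document skeleton can be written out as OPML with a few lines of code. In this sketch, the nested `(text, children)` tuples and the helper names are assumptions. The element names (`opml`, `head`, `title`, `body`, `outline`) and the `text` attribute follow the OPML format.

```python
import xml.etree.ElementTree as ET


def add_outline(parent, node):
    """node is (text, [children]); OPML stores the label in a `text` attribute."""
    text, children = node
    element = ET.SubElement(parent, "outline", text=text)
    for child in children:
        add_outline(element, child)


def to_opml(title, nodes):
    """Serialise a nested outline to an OPML string."""
    root = ET.Element("opml", version="2.0")
    ET.SubElement(ET.SubElement(root, "head"), "title").text = title
    body = ET.SubElement(root, "body")
    for node in nodes:
        add_outline(body, node)
    return ET.tostring(root, encoding="unicode")


# Illustrative outline only.
outline = [
    ("Installation", [("Requirements", []), ("Setup", [])]),
    ("Usage", []),
]
print(to_opml("Manual skeleton", outline))
```

The resulting file should open in outliners and mind mapping tools that import OPML, which makes it a convenient hand-over format for a document skeleton.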

RTF

One word: avoid.

Delivery formats

PDF

The Portable Document Format (PDF) is a file format used to present documents in a manner independent of application software, hardware, and operating systems. Each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics, and other information needed to display it. PDF was developed in the early 1990s as a way to share computer documents, including text formatting and inline images.

For many years, PDF has been the format of choice for all documents that users might want to download, save, print and annotate. But in a world that is moving from “mobile first” to “mobile only” for ever more users, a page-based format looks increasingly outdated.

While it is possible to create accessible PDF files for users with disabilities (e.g. tagged for screen readers), doing so is neither trivial nor an “out of the box” feature in standard workflows.

HTML + CSS

HTML is the lingua franca of the World Wide Web. Whatever you might want to know about it… is available on the web.

Examples for Web/HTML-Based Product Manuals

Adobe: web manuals (“user guides”) for Photoshop, InDesign and Lightroom.
Apple: web manuals for the iPhone, the Apple Watch and Logic Pro X.

EPUB

EPUB is an e-book file format that can be downloaded and read on devices like smartphones, tablets, computers, or e-readers. It is a technical standard published by the International Digital Publishing Forum (IDPF). The Book Industry Study Group endorses EPUB 3 as the format of choice for packaging content and has stated that the global book publishing industry should rally around a single standard. EPUB is the most widely supported vendor-independent XML-based (as opposed to PDF) e-book format; that is, it is supported by the largest number of hardware readers, though not by Amazon’s Kindle (which uses its own proprietary format).

EPUB home on the International Digital Publishing Forum site

Navigation tools

Find

For every document longer than a few pages, a find feature is essential and expected by users.

Web Searches

Documentation made available on a website – either in the web’s native HTML format or common file formats such as PDF, Word, RTF etc. – can be discovered and indexed by search engines.

Depending on the information provider’s search engine optimisation and other factors, the full text of the manual or links to it will probably show up as a top result for “[PRODUCTNAME] manual” searches.

Searches in Current Document

Searches for strings within the currently displayed document are usually a standard feature of the user agent (software):

HTML documents: Web browser – Find feature
PDFs: Acrobat Reader – Find feature

Searches in Current Document Set

Search capabilities over a set of documents (either native HTML or popular file formats) have to be provided by the site owner’s content management system or by using a domain-specific search with a general-purpose search engine.
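
At its core, a search over a document set relies on an index that maps words to the documents containing them. The following is a minimal sketch of such an inverted index. The document names and texts are invented, and real search components add stemming, ranking and support for non-HTML formats.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))


# Invented sample document set.
docs = {
    "install.html": "Install the battery before first use.",
    "faq.html": "How do I replace the battery?",
    "specs.html": "Weight and dimensions.",
}
index = build_index(docs)
print(sorted(search(index, "battery")))
print(sorted(search(index, "replace battery")))
```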

Custom Search for Google Developers

Structure and Markup

Headlines
Lists
Tables
Emphasis (light and strong emphasis)

Navigation Aids

In-document Navigation Aids

Next Page / Previous Page links / buttons
Alternative Page Sequences, Guided Tours (playlist-like sets of sub-documents or topics for certain audiences)

Out-of-document Navigation Aids

Table of contents
Index
- Traditional Index (Indexed Word Lists)
- Table Lists
- Image Lists
Hypertext links / Cross-references
- Unidirectional links
- Bidirectional links
- Multiple target links
- Related document links (A Keywords)

StretchText

StretchText is a hypertext feature that has not gained mass adoption in systems like the World Wide Web, but gives more control to the reader in determining what level of detail to read at. Authors write content to several levels of detail in a work.

StretchText is similar to outlining. However, instead of drilling down lists to greater detail (a process called hoisting), activating/clicking a StretchText node will replace it with another node (which usually provides additional content). This “stretching” to increase the amount of visible content (or contracting to decrease it) gives the feature its name. This is analogous to zooming in to get more detail.

Ted Nelson coined the term in 1967.

Static (predefined) StretchText

Static (predefined) StretchText consists of text and/or multimedia elements that are part of a given document, but hidden by default. This extra content can be revealed by user interaction (usually clicking a “Read more” type link).
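
The behaviour can be modelled in a few lines. In this sketch, the `Stretch` class, its field names and the sample sentence are assumptions. A real implementation would live in the page’s markup and scripts rather than in Python.

```python
from dataclasses import dataclass


@dataclass
class Stretch:
    """A node with a short form and a longer form shown once it is expanded."""
    key: str
    summary: str
    detail: str


def render(parts, expanded=frozenset()):
    """Join plain strings and Stretch nodes, expanding only the requested keys."""
    out = []
    for part in parts:
        if isinstance(part, Stretch):
            out.append(part.detail if part.key in expanded else part.summary)
        else:
            out.append(part)
    return "".join(out)


# Invented sample content.
doc = [
    "Press the reset button",
    Stretch("where", ".", " on the rear panel, next to the power socket."),
    " The LED turns green.",
]
print(render(doc))             # collapsed: the short form
print(render(doc, {"where"}))  # expanded after the reader activates the node
```

Both levels of detail are written in advance and shipped with the document. The reader’s interaction only decides which one is shown.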

Interactive (dynamic query) StretchText

Interactive StretchText is based on a link/query that is part of a document. Activating that link / running the query will fetch additional information from a remote source and include it in the current document. An example of a similar feature is MediaWiki’s “hover boxes” for in-wiki links.
