The Modern Technical Communication + Publishing Process: Principles and Best Practices
Technical Writing Principles / Best Practices
Create Content, Not Documents
In the age of complex hypermedia and ever-changing document formats and delivery platforms, you cannot afford to create a set of documents and do it all again when the next wave of technical progress makes your selected formats obsolete. Create reusable content.
Consistency
Avoid variations that may be helpful and entertaining in prose and journalism, but could lead to confusion in product literature. Always use the same terms for the same concepts and user interface elements.
Style
- See the world (and your product) from the user’s point of view.
- Write in a style appropriate for your target audience.
- Don’t simplify things more than necessary – don’t talk down to users.
- Don’t make things more complex than necessary – avoid jargon and overly complicated sentence structures.
- Use active voice.
Terminology
- Use consistent terminology. Avoid synonyms and ambiguity.
- Use established terminology. Avoid neologisms.
Reusability / Granularity
Find the appropriate way of “chunking” your content, based on audience, technical platform, topic etc.
- If your content chunks are too small, too much user interaction is required.
- If your content chunks are too big, users will find it hard to navigate and return to relevant content.
Granular content can be reused in a system supporting Transclusion.
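As a rough illustration of transclusion, the sketch below expands include markers into reusable chunks. The `{{include:name}}` syntax, the chunk names and the `transclude` function are assumptions made up for this example; real systems each have their own syntax.

```python
import re

# Hypothetical chunk store: chunk name -> reusable text.
CHUNKS = {
    "safety-note": "Disconnect the power supply before opening the case.",
    "battery-swap": "{{include:safety-note}} Then replace the battery.",
}

# Hypothetical include marker, e.g. {{include:safety-note}}
INCLUDE = re.compile(r"\{\{include:([a-z0-9-]+)\}\}")


def transclude(text, chunks, depth=0, max_depth=10):
    """Replace every {{include:name}} marker with the named chunk."""
    if depth > max_depth:
        raise RecursionError("include nesting too deep (circular include?)")

    def expand(match):
        name = match.group(1)
        if name not in chunks:
            raise KeyError(f"unknown chunk: {name}")
        # Included chunks may themselves include other chunks.
        return transclude(chunks[name], chunks, depth + 1, max_depth)

    return INCLUDE.sub(expand, text)


manual_page = "Battery replacement: {{include:battery-swap}}"
print(transclude(manual_page, CHUNKS))
```

The safety note is written once and appears wherever it is included, so a correction to the chunk reaches every document that uses it.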
Topic-Based Authoring
- A content chunk should preferably be complete (describing one concept, topic or task) with context and navigation items leading to related information. Topic-based writing is one of the core principles of modern technical writing.
Completeness
- The product or service should be described in full. If this is not possible, provide links to related information.
Topicality
- Make sure your content is up-to-date.
- Provide metadata (release date of documentation / software / firmware described) that allows users to verify if they are using the correct version of a document.
Relevance
- Do not omit relevant information.
- Do not provide irrelevant information, unless it serves marketing purposes (information about other products and services, setting the mood).
Semantic markup
- Do not format documents visually (by applying typographic markup as such), but semantically. I.e., use markup that conveys meaning:
- instructive
- descriptive
Technical Communication Workflows
Books have been written about the technical writing process. In general, it’s not that different from journalism and scientific writing:
Material Collection / Assessment
Research your topic. Talk to subject matter experts if you aren’t one yourself. Collect existing information about the product from designers, developers, engineers.
Structure
Outline your project. You will probably be able to benefit from existing work: templates, existing documentation for earlier products from your own company or from competitors.
Draft
Write a draft. Don’t get lost in details; broad strokes help you get an idea of the project scope.
Add media
Depending on the target format(s), you may be able to work with illustrations, photos, embedded or linked video, audio examples, image maps, interactive elements.
Expand/complete your document
Get feedback from subject-matter experts, product managers, the marketing team and – most importantly – end-users.
Typesetting
Bring your project into a presentable shape. If you are using a structured/semantic approach (and the tools that support it), you may have very little work to do here: Proper style sheets will transform structured content into beautiful documents, and usually, only a little post-editing will be required.
Review
If possible, have people review your document who haven’t read earlier versions and who don’t know too much about the product itself. Add, remove and rewrite content as necessary. Create a “bin” with reusable material.
Proofreading
After the review phase (which should focus on the quality and consistency of the content), make sure that there aren’t embarrassing errors in the document. Have a professional writer or proofreader check your document for grammatical issues, spelling errors, bad typography etc.
Publishing
Have your technical writing software / publishing tool / content management system render your content in its final forms – typically, PDF or web pages.
Technical Communication Tools
Off-the-shelf vs. custom solution
If you download a manual in PDF format for a consumer or professional product in 2017, it is very likely that it was created from either an InDesign or a Word file. Most companies “play it safe”, using tools that everyone has and knows. Major companies that have full writing and translation departments can afford to build their own workflows, often based on an XML-based format such as DITA or DocBook (usually customised) or even on their own “from scratch” format.
The advantages and disadvantages for both approaches are the same as with most other software(-based) projects:
- For popular standard tools, there is usually a rich knowledge base: people who know the features and limitations of these tools, third-party extensions and other useful resources. On the other hand, adapting these tools to custom requirements can be hard or impossible.
- Custom formats and workflows can be heavily adapted to a company’s requirements, and there is no need to live with all the compromises and limitations of a “one size fits everybody” approach. On the other hand, you either have to create everything yourself or support standard APIs and file exchange formats that allow you to work with third-party systems.
Content Creation Tools
Help Authoring Software
Help Authoring Software Features
A Help Authoring Tool (or HAT) is a software program used by technical writers to create online help systems. Common features include:
- File import from third-party tools (e.g. general-purpose word processors, raw text, HTML, XML)
- Help output in various formats such as WinHelp, PDF, XML, HTML
- Content editing (standard text processor functions)
- Generation of navigation aids (links, cross-references, index, table of contents etc.)
- Image editing, image hotspot editing (image maps)
Help Authoring Software – relevant products
- MadCap Flare – Professional, commercial Windows software
- EC Software Help and Manual – Professional, commercial Windows software
- Paligo – web-based content management system for technical writing
Outliners + Text Editors
Outliners and word processors with outlining modes allow users to create and navigate hierarchical structures (outlines). They often provide features for collapsing, expanding, moving and filtering document sections. Outliners are excellent tools for creating the “skeleton” of documents that will later be expanded and formatted in other tools.
- OmniGroup OmniOutliner – Professional macOS outliner
- MindNode – macOS mind mapping tool with outlining mode; imports and exports OPML
Typesetting software / engines
LaTeX
Single Source Publishing
Single-source publishing, also known as single-sourcing publishing, is a content management method which allows the same source content to be used across different forms of media and more than one time. The labor-intensive and expensive work of editing need only be carried out once, on only one document; that source document can then be stored in one place and reused. This reduces the potential for error, as corrections are only made one time in the source document.
For more information, see “Single Source Publishing” at Wikipedia
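A minimal sketch of the single-sourcing idea: one source structure is rendered into two delivery formats. The roles (`title`, `para`, `note`) and both renderer functions are assumptions invented for this illustration, not part of any particular publishing tool.

```python
from html import escape

# One source document: a list of (role, text) pairs. The roles are hypothetical.
SOURCE = [
    ("title", "Replacing the battery"),
    ("para", "Open the cover & remove the old battery."),
    ("note", "Dispose of old batteries properly."),
]


def to_html(source):
    """Render the source as an HTML fragment."""
    tags = {"title": "h1", "para": "p", "note": "aside"}
    parts = []
    for role, text in source:
        tag = tags[role]
        parts.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(parts)


def to_text(source):
    """Render the same source as plain text."""
    lines = []
    for role, text in source:
        if role == "title":
            lines += [text.upper(), "=" * len(text)]
        elif role == "note":
            lines.append(f"NOTE: {text}")
        else:
            lines.append(text)
    return "\n".join(lines)


print(to_html(SOURCE))
print(to_text(SOURCE))
```

A correction made in `SOURCE` shows up in both outputs, which is the main point of the method.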
InDesign
Adobe InDesign is a desktop publishing software application produced by Adobe Systems. It can be used to create works such as posters, flyers, brochures, magazines, newspapers, presentations, books and ebooks. InDesign can also publish content suitable for tablet devices in conjunction with Adobe Digital Publishing Suite. Graphic designers and production artists are the principal users, creating and laying out periodical publications, posters, and print media. It also supports export to EPUB and SWF formats to create e-books and digital publications, including digital magazines, and content suitable for consumption on tablet computers. In addition, InDesign supports XML, style sheets, and other coding markup, making it suitable for exporting tagged text content for use in other digital and online formats. The Adobe InCopy word processor uses the same formatting engine as InDesign.
For more information, see
FrameMaker
Adobe FrameMaker is a document processor designed for writing and editing large or complex documents, including structured documents. It is produced by Adobe. Users can work with unstructured and structured content in the same documentation. Content can be published as responsive HTML, PDF, EPUB, and other formats. FrameMaker supports DITA.
Word Processors
General-purpose word processors aren’t the greatest tools for creating complex, long-form documents such as reference manuals, but they will do if no other tools are available or affordable. However, the writer should ensure that the file format of the software is supported by other tools “downstream” – especially the translation software / system. For example, Apple’s Pages and Scrivener are low-cost, easy-to-learn applications, but most commercial translation tools don’t support their native file formats.
- Apple Pages
- Google Docs
- Literature & Latte Scrivener
- Microsoft Word
- OpenOffice / LibreOffice
Document Management Systems
A document management system is software used to track, manage and store documents and reduce paper. Most document management systems keep records of the various versions created and modified by different users (history tracking). The term has some overlap with the concept of Content Management Systems. A document management system is often viewed as a component of Enterprise Content Management and related to Digital Asset Management.
Publishing Tools + Platforms
Web servers
Content Management Systems
Wiki Engines
PDF Renderers
PDF renderers convert HTML and XML source files into PDFs, typically using Cascading Style Sheets (CSS) for output formatting. They also process CSS instructions for paged media such as headers, footers, cross-references to page numbers etc. that are usually not handled by web user agents (browsers).
Lightweight Markup Languages and Converters
For an introduction to LML, see Lightweight Markup Languages.
- Markdown – the most popular / successful lightweight markup language, but with no new releases or features since the original version
- MultiMarkdown – builds on Markdown, extending it with many modern features (table of contents, tables, citations, cross-references, smart typography etc.)
- Pandoc – a Swiss army knife for lightweight markup languages that will convert dozens of formats into each other (including HTML, Office formats, EPUB, OPML, LaTeX, MediaWiki markup etc.). It also extends basic Markdown. The feature set is impressive, but it’s a command line tool and has a certain learning curve.
For differences between MultiMarkdown and Pandoc, see “A comparison of Pandoc and Multimarkdown”
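To give an idea of what working with a command line converter looks like, here is a small sketch that builds a Pandoc call from a script. The helper `pandoc_command` and the file names are assumptions for illustration; `-s` (standalone document), `-o` (output file) and `--toc` (table of contents) are regular Pandoc options, and Pandoc infers the formats from the file extensions.

```python
import os
import shutil
import subprocess


def pandoc_command(source, target, toc=False):
    """Build a pandoc command line that converts source into target."""
    command = ["pandoc", source, "-s", "-o", target]
    if toc:
        command.append("--toc")
    return command


command = pandoc_command("manual.md", "manual.html", toc=True)
print(" ".join(command))

# Only run the conversion if pandoc is installed and the
# (hypothetical) source file actually exists.
if shutil.which("pandoc") and os.path.exists("manual.md"):
    subprocess.run(command, check=True)
```

Changing the target to `manual.epub` or `manual.docx` is enough to produce another output format from the same source.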
Static Site Generators
Static site generators will read a source (usually text files written in one of the popular lightweight markup languages), template files and other assets and render them into full web pages. There are some fundamental differences to traditional, “dynamic” web sites. On the plus side, static sites will render very quickly (as there is very little server-side processing), and there are fewer security concerns as there is very little technology on the server side that can be hacked/attacked: the finished site needs no packages, libraries, modules, frameworks or database engines on the server.
On the other hand, personalising the user experience in any way is harder than with content management systems, as all pages are essentially “frozen”. However, interactivity can still be added in the form of third-party modules such as forum, survey and voting components.
For an introduction to the concept of static site generators and some popular products, read “An Introduction to Static Site Generators” by David Walsh.
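The core of the idea fits in a few lines. The following sketch, with a made-up template and made-up page data, renders every source into a finished page ahead of time; real generators add Markdown conversion, navigation, asset handling and much more.

```python
from string import Template

# Hypothetical page template; $title and $body are filled in per page.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)

# Hypothetical sources; a real generator would read these from text files.
SOURCES = {
    "index": {"title": "Manual", "body": "Welcome to the manual."},
    "setup": {"title": "Setup", "body": "Connect the device."},
}


def build_site(sources, template):
    """Render every source into a finished page, keyed by output file name."""
    return {
        f"{slug}.html": template.substitute(fields)
        for slug, fields in sources.items()
    }


site = build_site(SOURCES, PAGE_TEMPLATE)
for filename in sorted(site):
    print(filename, len(site[filename]), "characters")
```

Everything happens at build time; the web server only has to deliver the resulting files.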
Analysis
Creating great documentation is one thing; learning if and how people have actually used it is another.
If your documentation is published in a downloadable, self-contained format such as PDF, you’ll receive very little information about its use. You’re basically stuck with your web server software’s log files, which will tell you how often “manual.pdf” was downloaded.
If your manual is part of your website (either in the form of static HTML pages or as regular web content), you can learn about access to specific manuals and (depending on its granularity) even sections or pages as you would with any other page.
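As a sketch of what such an analysis boils down to, the snippet below counts successful requests per path. The log lines are invented samples in the widely used Common Log Format; the paths and the `count_hits` function are assumptions for this example.

```python
import re
from collections import Counter

# Matches the request and status part of a Common Log Format line.
REQUEST = re.compile(r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3})')

# Invented sample lines; a real script would read the server's log file.
LOG_LINES = [
    '203.0.113.5 - - [10/Oct/2017:13:55:36 +0000] "GET /docs/manual.pdf HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2017:14:02:11 +0000] "GET /docs/setup.html HTTP/1.1" 200 5120',
    '203.0.113.5 - - [10/Oct/2017:14:05:40 +0000] "GET /docs/manual.pdf HTTP/1.1" 200 2326',
    '203.0.113.7 - - [10/Oct/2017:14:09:02 +0000] "GET /docs/missing.html HTTP/1.1" 404 512',
]


def count_hits(lines):
    """Count successful (status 200) GET requests per path."""
    hits = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


for path, count in count_hits(LOG_LINES).most_common():
    print(count, path)
```

For a PDF, the download count is all you get; for HTML pages, the same count is available per page or section.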
Web Server Log File Analyzers
MadCap Analyzer
Creation / Formatting Models
WYSIWYG
WYSIWYG (“what you see is what you get”) is an approach to document editing where the presentation of a document on screen closely resembles the appearance when printed or displayed as a finished product, such as a printed document, a web page, or slide presentation. Accordingly, WYSIWYG applications will usually present editing tools that refer to the visual appearance of a document (e.g. for font sizes, style etc.)
WYSIWYG-based formatting will usually result in documents that will contain very little structural information, making them hard to parse for machines, translation systems and disabled users.
Structured Authoring
One possible definition for Structured Authoring is “a methodological approach to the creation of content incorporating information types, systematic use of metadata, XML-based semantic mark-up, modular, topic-based information architecture, a constrained writing environment with software-enforced rules, content re-use, and the separation of content and form.” (from the Oxygen XML documentation)
Usually, Structured Authoring tools will enforce the use of allowed markup elements based on an XML DTD.
In structured authoring, separation of content and form is a core principle: The way a piece of text looks during authoring is irrelevant. The formatting and presentation are post-authoring considerations, and activities possibly not performed by a technical writer.
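The kind of rule enforcement described above can be sketched as follows. The content model in `ALLOWED_CHILDREN` is a simplified, made-up stand-in for a DTD (a real DTD also constrains order, cardinality and attributes), and the element names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: element name -> child elements it may contain.
ALLOWED_CHILDREN = {
    "task": {"title", "steps"},
    "title": set(),
    "steps": {"step"},
    "step": set(),
}


def validate(element, path=""):
    """Return a list of rule violations found in the element tree."""
    here = f"{path}/{element.tag}"
    if element.tag not in ALLOWED_CHILDREN:
        return [f"{here}: unknown element"]
    errors = []
    for child in element:
        if child.tag not in ALLOWED_CHILDREN[element.tag]:
            errors.append(f"{here}: <{child.tag}> is not allowed here")
        else:
            errors.extend(validate(child, here))
    return errors


good = ET.fromstring(
    "<task><title>Reset</title><steps><step>Press R.</step></steps></task>"
)
bad = ET.fromstring("<task><title>Reset</title><b>Press R.</b></task>")
print(validate(good))
print(validate(bad))
```

The second document is rejected because `<b>` is visual markup that the content model does not know; an authoring tool would not offer that element in the first place.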
Semantic Publishing
In theory, Structured Authoring tools and models could be used to enforce any kind of content/structure model. But usually, they will be used for semantic markup, where markers referring to the content itself are used.
From the Wikipedia article about Semantic HTML:
“As an example, recent HTML standards discourage use of the tag <i> (italic, a typeface) in preference of more accurate tags such as <em> (emphasis). A CSS stylesheet should then specify whether emphasis is denoted by an italic font, a bold font, underlining, slower or louder audible speech etc. This is because italics are used for purposes other than emphasis, such as citing a source (for this, HTML provides the tag <cite>).”
Conditional Formatting
Conditional formatting is used to mark up sections of a document that should only be used for a given:
- output format (e.g. print or web)
- product version
- platform (e.g. the Windows or macOS version of an application)
- market
- audience
Conditional formatting is a very powerful concept for creating multiple documents from one source document.
Working with
- semantic markup
- transclusion
- conditional formatting
can reduce the amount of redundancy in a given document set to a minimum.
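A small sketch of conditional filtering: each section carries optional conditions, and a build profile decides which sections end up in the output. The data layout, the condition names and the `build` function are assumptions made for this illustration.

```python
# Hypothetical source: each section lists the values it is valid for.
# An empty condition set means "always include".
SECTIONS = [
    {"text": "Install the application.", "conditions": {}},
    {"text": "Open the Start menu.", "conditions": {"platform": {"windows"}}},
    {"text": "Open Launchpad.", "conditions": {"platform": {"macos"}}},
    {"text": "See page 12 for details.", "conditions": {"output": {"print"}}},
]


def build(sections, profile):
    """Keep a section unless one of its conditions excludes the profile."""
    kept = []
    for section in sections:
        conditions = section["conditions"]
        if all(profile.get(key) in allowed for key, allowed in conditions.items()):
            kept.append(section["text"])
    return kept


print(build(SECTIONS, {"platform": "macos", "output": "web"}))
print(build(SECTIONS, {"platform": "windows", "output": "print"}))
```

Two different manuals come out of one source; only the profile changes between builds.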
Standards and Formats for Technical Communication
DocBook
DocBook is a semantic markup language for technical documentation. It was originally intended for writing technical documents related to computer hardware and software but it can be used for any other sort of documentation.
As a semantic language, DocBook enables its users to create document content in a presentation-neutral form that captures the logical structure of the content; that content can then be published in a variety of formats, including HTML, XHTML, EPUB, PDF, man pages, Web help and HTML Help, without requiring users to make any changes to the source. In other words: a document written in DocBook format is easily portable into other formats. The content is written once, using XML tags, and does not have to be reformatted for each output format.
For more information, see
LaTeX
LaTeX is a document preparation system. When writing, the writer uses plain text as opposed to the formatted text found in WYSIWYG word processors. The writer uses markup tagging conventions to define the general structure of a document (such as article, book, and letter), to stylise text throughout a document (such as bold and italics), and to add citations and cross-references. A TeX distribution such as TeX Live is used to produce an output file (such as PDF or DVI) suitable for printing or digital distribution.
LaTeX can be used as a standalone document preparation system or as an intermediate format. In the latter role, for example, it is sometimes used as part of a pipeline for translating DocBook and other XML-based formats to PDF.
For more information, see the official LaTeX site.
DITA
The Darwin Information Typing Architecture or Document Information Typing Architecture (DITA) is an XML data model for authoring and publishing. It is an open standard that is defined and maintained by the OASIS DITA Technical Committee.
The name derives from the following components:
- Darwin: DITA uses the principles of specialisation and inheritance, which are in some ways analogous to the naturalist Charles Darwin’s concept of evolutionary adaptation.
- Information typing: Each topic has a defined primary objective (procedure, glossary entry, troubleshooting information) and structure.
- Architecture: DITA is an extensible set of structures.
The DITA core principles are:
- Content reuse: Topics can be reused across multiple publications. Fragments of content within topics can be reused through the use of content references (conref or conkeyref), a transclusion mechanism.
- Information typing: DITA 1.3 includes five specialised topic types: Task, Concept, Reference, Glossary Entry, and Troubleshooting. Each of these five topic types is a specialisation of a generic Topic type, which contains a title element, a prolog element for metadata, and a body element. The body element contains paragraph, table, and list elements, similar to HTML.
- Maps: A DITA map is a container for topics used to transform a collection of content into a publication. It gives the topics sequence and structure. A map can include relationship tables that define hyperlinks between topics. Maps can be nested. Maps can reference topics or other maps, and can contain a variety of content types and metadata.
- Metadata: DITA includes extensive metadata elements and attributes, both at topic level and within elements. Conditional text allows filtering or styling content based on attributes for audience, platform, product, and other properties. The conditional processing profile (.ditaval file) is used to identify which values are to be used for conditional processing.
- Specialisation: DITA allows adding new elements and attributes through specialisation of base DITA elements and attributes. Through specialisation, DITA can accommodate new topic types, element types, and attributes as needed for specific industries or companies. Specialisations of DITA for specific industries, such as the semiconductor industry, are standardised through OASIS technical committees or subcommittees. Many organisations using DITA also develop their own specialisations.
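To make the content reference mechanism more tangible, here is a strongly simplified sketch of how a conref could be resolved. The file names, topic contents and the `resolve_conrefs` function are assumptions for this example; a real DITA processor applies many more rules (attribute handling, nested references, keys, validation).

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical in-memory "files"; a real project would read .dita files from disk.
FILES = {
    "shared.dita": (
        '<topic id="shared"><title>Shared</title><body>'
        '<p id="warning">Unplug the device first.</p></body></topic>'
    ),
    "task.dita": (
        '<topic id="clean"><title>Cleaning</title><body>'
        '<p conref="shared.dita#shared/warning"/>'
        '<p>Wipe the case with a dry cloth.</p></body></topic>'
    ),
}


def resolve_conrefs(filename, files):
    """Replace elements carrying a conref with a copy of the referenced element."""
    root = ET.fromstring(files[filename])
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            target = child.get("conref")
            if target is None:
                continue
            # A conref has the form file#topicid/elementid.
            target_file, _, fragment = target.partition("#")
            topic_id, _, element_id = fragment.partition("/")
            source = ET.fromstring(files[target_file])
            found = source.find(f".//*[@id='{element_id}']")
            if source.get("id") != topic_id or found is None:
                raise LookupError(f"cannot resolve conref: {target}")
            replacement = copy.deepcopy(found)
            replacement.tail = child.tail
            parent[index] = replacement
    return root


resolved = resolve_conrefs("task.dita", FILES)
print(ET.tostring(resolved, encoding="unicode"))
```

The warning paragraph lives in one shared topic and is pulled into the cleaning task at processing time.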
For more information, see
Lightweight Markup Languages
A lightweight markup language is a markup language with simple, unobtrusive syntax. It is designed to be easy to create using any generic text editor, as well as easy to read in its raw form. Lightweight markup languages are used in applications where it may be necessary to read the raw document as well as the final rendered output.
For instance, a person downloading a software library might prefer to read the documentation in a text editor rather than a web browser. Another application for such languages is to provide for data entry in web-based publishing, such as weblogs and wikis, where the input interface is a simple text box. The server software then converts the input into a common document markup language like HTML.
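The conversion step can be sketched with a deliberately tiny, Markdown-like subset (one heading level, emphasis, paragraphs). This `render` function is an illustration written for this section, not an implementation of Markdown or of any other real lightweight markup language.

```python
import re
from html import escape


def render(text):
    """Convert a tiny markup subset to HTML: '# ' headings, *emphasis*, paragraphs."""
    blocks = []
    # Blank lines separate blocks.
    for block in re.split(r"\n\s*\n", text.strip()):
        line = escape(" ".join(block.split()))
        line = re.sub(r"\*([^*]+)\*", r"<em>\1</em>", line)
        if line.startswith("# "):
            blocks.append(f"<h1>{line[2:]}</h1>")
        else:
            blocks.append(f"<p>{line}</p>")
    return "\n".join(blocks)


raw = "# Setup\n\nConnect the *power* cable.\n\nUse cable A & B."
print(render(raw))
```

The raw input stays readable in any text editor, while the rendered output is regular HTML.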
For more information, see the lightweight markup languages article at Wikipedia
Popular Lightweight Markup Languages
- Markdown – the original Markdown version created by IT journalist John Gruber
- MultiMarkdown – an extension of Markdown adding many advanced features
- AsciiDoc – a lightweight markup language supporting many target formats (PDF, HTML, EPUB, HTML-based presentations in Slidy).
Asciidoctor claims that AsciiDoc is the right tool for the technical writer’s job: “AsciiDoc belongs to the family of lightweight markup languages, the most renowned of which is Markdown. AsciiDoc stands out from this group because it supports all the structural elements necessary for drafting articles, technical manuals, books, presentations and prose. In fact, it’s capable of meeting even the most advanced publishing requirements and technical semantics.” Unfortunately, the software palette for AsciiDoc-based workflows isn’t as rich as those of other general-purpose LMLs, and it seems that development of the original AsciiDoc implementation stalled in 2013.
Intermediate Formats
These are formats that are usually not created directly by end users (as would be the case with Lightweight Markup Languages) or used natively by popular applications.
OPML
OPML (Outline Processor Markup Language) is an XML format for outlines (defined as “a tree, where each node contains a set of named attributes with string values”). It has since been adopted for many uses, the most common being to exchange lists of web feeds between web feed aggregators. Many outlining and mind mapping applications support it as an import and export format.
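An OPML file is easy to process with standard XML tools. The sketch below reads an invented sample outline and prints it as an indented list; the outline content and the `outline_lines` function are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Invented sample: nested <outline> elements carry their label in "text".
OPML = """<opml version="2.0">
  <head><title>User Manual</title></head>
  <body>
    <outline text="Getting started">
      <outline text="Unpacking"/>
      <outline text="First start"/>
    </outline>
    <outline text="Troubleshooting"/>
  </body>
</opml>"""


def outline_lines(node, depth=0):
    """Walk nested <outline> elements and return one indented line per node."""
    lines = []
    for child in node.findall("outline"):
        lines.append("  " * depth + child.get("text", ""))
        lines.extend(outline_lines(child, depth + 1))
    return lines


body = ET.fromstring(OPML).find("body")
print("\n".join(outline_lines(body)))
```

This is roughly what an outliner does on import before it lets you expand the “skeleton” into a full document.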
For more information, see
RTF
One word: avoid.
Delivery formats
PDF
The Portable Document Format (PDF) is a file format used to present documents in a manner independent of application software, hardware, and operating systems. Each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics, and other information needed to display it. PDF was developed in the early 1990s as a way to share computer documents, including text formatting and inline images.
For many years, PDF has been the format of choice for all documents that users might want to download, save, print and annotate. But in a world that is moving from “mobile first” to “mobile only” for ever more users, a page-based format looks increasingly outdated.
While it is possible to create accessible PDF files for users with disabilities (e.g. tagged for screen readers), doing so is neither trivial nor an “out of the box” feature in standard workflows.
HTML + CSS
HTML is the lingua franca of the World Wide Web. Whatever you might want to know about it… is available on the web.
Examples for Web/HTML-Based Product Manuals
- Adobe: web manuals (“user guides”) for Photoshop, InDesign and Lightroom.
- Apple: web manuals for the iPhone, the Apple Watch and Logic Pro X.
EPUB
EPUB is an e-book file format that can be downloaded and read on devices like smartphones, tablets, computers, or e-readers. It is a technical standard published by the International Digital Publishing Forum (IDPF). The Book Industry Study Group endorses EPUB 3 as the format of choice for packaging content and has stated that the global book publishing industry should rally around a single standard. EPUB is the most widely supported vendor-independent XML-based (as opposed to PDF) e-book format; that is, it is supported by the largest number of hardware readers – however not on Amazon’s Kindle (which uses its own, proprietary format).
EPUB home on the International Digital Publishing Forum site
Navigation tools
Find
For every document longer than a few pages, a find feature is essential and expected by users.
Web Searches
Documentation made available on a website – either in the web’s native HTML format or common file formats such as PDF, Word, RTF etc. – can be discovered and indexed by search engines.
Depending on the information provider’s search engine optimisation and other factors, the full text of the manual or links to it will probably show up as a top result for “[PRODUCTNAME] manual” searches.
Searches in Current Document
Searches for strings within the currently displayed document are usually a standard feature of the user agent (software):
- HTML documents: Web browser – Find feature
- PDFs: Acrobat Reader – Find feature
Searches in Current Document Set
Search capabilities over a set of documents (either native HTML or popular file formats) have to be provided by the site owner’s content management system or using a domain-specific search with a general-purpose search engine.
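At its simplest, a search over a document set is an inverted index: a mapping from each word to the documents that contain it. The sketch below uses invented page names and texts; `build_index` and `search` are assumptions for this example, and real search engines add ranking, stemming and much more.

```python
import re
from collections import defaultdict

# Invented sample document set: file name -> plain text content.
DOCUMENTS = {
    "setup.html": "Connect the power cable and switch on the device.",
    "battery.html": "Replace the battery when the power indicator blinks.",
    "warranty.html": "The warranty covers the device for two years.",
}


def build_index(documents):
    """Map every word to the set of documents that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


index = build_index(DOCUMENTS)
print(search(index, "power"))
print(search(index, "power device"))
```

The index is built once per document set, so each query only needs a few set lookups.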
Structure and Markup
- Headlines
- Lists
- Tables
- Emphasis (light and strong emphasis)
Navigation Aids
In-document Navigation Aids
- Next Page / Previous Page links / buttons
- Alternative Page Sequences, Guided Tours (playlist-like sets of sub-documents or topics for certain audiences)
Out-of-document Navigation Aids
- Table of contents
- Index
- Traditional Index (Indexed Word Lists)
- Table Lists
- Image Lists
- Hypertext links / Cross-references
- Unidirectional links
- Bidirectional links
- Multiple target links
- Related document links (A Keywords)
StretchText
StretchText is a hypertext feature that has not gained mass adoption in systems like the World Wide Web, but gives more control to the reader in determining what level of detail to read at. Authors write content to several levels of detail in a work.
StretchText is similar to outlining. However, instead of drilling down lists to greater detail (a process called hoisting), activating/clicking a StretchText node will replace it with another node (which usually provides additional content). This “stretching” to increase the amount of visible content (or contracting to decrease it) gives the feature its name. This is analogous to zooming in to get more detail.
Ted Nelson coined the term in 1967.
Static (predefined) StretchText
Static (predefined) StretchText consists of text and/or multimedia elements that are part of a given document, but hidden by default. This extra content can be revealed by user interaction (usually clicking a “Read more” type link).
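On the web, one way to approximate static StretchText is the HTML `<details>` element, which browsers keep collapsed until the reader expands it. The helper below is a sketch with made-up sample text; it expands content in place rather than replacing a node, so it is an approximation of the concept, not a full implementation.

```python
from html import escape


def stretch_text(summary, detail):
    """Wrap extra detail in an HTML <details> element (collapsed by default)."""
    return (
        "<details>"
        f"<summary>{escape(summary)}</summary>"
        f"<p>{escape(detail)}</p>"
        "</details>"
    )


snippet = stretch_text(
    "Read more",
    "The battery lasts about ten hours under normal use.",
)
print(snippet)
```

The extra content is part of the document from the start; only its visibility changes when the reader interacts with it.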
Interactive (dynamic query) StretchText
Interactive StretchText is based on a link/query that is part of a document. Activating that link / running the query will fetch additional information from a remote source and include it in the current document. An example of a similar feature is MediaWiki’s “Hover boxes” for in-Wiki links.