Document base opcode and new braille indicator opcodes
diff --git a/doc/liblouis.texi b/doc/liblouis.texi
index dc4c4bf..aed6513 100644
--- a/doc/liblouis.texi
+++ b/doc/liblouis.texi
@@ -522,13 +522,12 @@
variants of the same letter with different accents, which may be
represented in your Braille code by the same dot pattern. This is a
very common practice for accented letters that are foreign to the
-Braille code. In the following example using the @opcoderef{uplow}
-opcode, both e acute (@samp{é}) and e grave (@samp{è}) are defined as
-dot 4 followed by dots 1 and 5.
+Braille code. In the following example, both e acute (@samp{é}) and e
+grave (@samp{è}) are defined as dot 4 followed by dots 1 and 5.
@example
-uplow \x00c9\x00e9 4-15 # E acute
-uplow \x00c8\x00e8 4-15 # E grave
+lowercase \x00e9 4-15 # E acute
+lowercase \x00e8 4-15 # E grave
@end example
In this example, the dot pattern would always back-translate to e
@@ -564,35 +563,6 @@
digit 0 356 NAB computer braille
@end example
-@opcode{uplow, characters dots}
-The characters operand must be a pair of letters, of which the first
-is uppercase and the second is the corresponding lowercase. The dots
-suboperand indicates the dot pattern for both letters. It may have
-more than one cell. This opcode is needed because not all languages
-follow a consistent pattern in assigning Unicode codes to upper and
-lower case letters. For example:
-
-@example
-uplow Aa 1
-@end example
-
-@opcode{grouping, name characters dots @comma{}dots}
-This opcode is used to indicate pairs of grouping symbols used in
-processing mathematical expressions. These symbols are usually
-generated by the MathML interpreter in liblouisutdml. They are used in
-multipass opcodes. The name operand must contain only letters (a-z and
-A-Z). The letters may be upper or lower-case but the case matters. The
-characters operand must contain exactly two Unicode characters. The
-dots operand must contain exactly two braille cells, separated by a
-comma. Note that grouping dot patterns also need to be declared with
-the @opcoderef{exactdots}. The characters may need to be declared with
-the @opcoderef{math}.
-
-@example
-grouping mrow \x0001\x0002 1e,2e
-grouping mfrac \x0003\x0004 3e,4e
-@end example
-
@opcode{letter, character dots}
Associates a letter in the language with a braille representation and
defines the character as a letter. This is intended for letters which
@@ -606,9 +576,7 @@
@opcode{uppercase, character dots}
Associates a character with a dot pattern and defines the character as
an uppercase letter. Both the character and the dot pattern have the
-attributes uppercase and letter. @code{lowercase} and @code{uppercase}
-should be used when a letter has only one case. Otherwise use the
-@opcoderef{uplow}.
+attributes uppercase and letter.
@opcode{litdigit, digit dots}
Associates a digit with the dot pattern which should be used to
@@ -639,6 +607,43 @@
math + 346 plus
@end example
+@opcode{grouping, name characters dots @comma{}dots}
+This opcode is different from the previous ones in that it defines two
+characters in one rule, and associates them with each other. The
+opcode is used to indicate pairs of grouping symbols used in
+processing mathematical expressions. These symbols are usually
+generated by the MathML interpreter in liblouisutdml. They are used in
+multipass opcodes. The name operand must contain only letters (a-z and
+A-Z). The letters may be upper or lower-case but the case matters. The
+characters operand must contain exactly two Unicode characters. The
+dots operand must contain exactly two braille cells, separated by a
+comma.
+
+@example
+grouping mrow \x0001\x0002 1e,2e
+grouping mfrac \x0003\x0004 3e,4e
+@end example
+
+@opcode{base, attribute character character}
+
+Unlike the opcodes above, this one does not associate a character with
+a dot pattern but with another, already defined character. The
+derived character inherits the dot pattern of
+the base character, and braille indicators (@pxref{Braille Indicator
+Opcodes}) are used to distinguish them. The attribute operand refers
+to the character class (@pxref{Character-Class Opcodes}) the character
+should be added to. By defining braille indicator rules associated
+with this character class, you can determine the braille indicators to
+be inserted. The character operands are the derived character and the
+base character, respectively. A typical use of this opcode is for
+defining a pair of letters, a lowercase and the corresponding
+uppercase. For example:
+
+@example
+lowercase a 1
+base uppercase A a
+@end example
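+
+As a minimal sketch of how the derived character is then told apart
+in braille, the rules above can be combined with an indicator rule for
+the @code{uppercase} class. The dot 6 indicator is an assumption
+borrowed from the @code{capsletter} example in this manual; with it,
+@samp{A} would be expected to translate to dot 6 followed by dot 1:
+
+@example
+lowercase a 1
+base uppercase A a
+capsletter 6
+@end example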
+
@end table
@node Braille Indicator Opcodes
@@ -650,29 +655,46 @@
by a dot pattern, which may be one or more cells.
@table @code
+@opcode{modeletter, attribute dots}
@opcode{capsletter, dots}
-The dot pattern which indicates capitalization of a single letter. In
-English, this is dot 6. For example:
+The dot pattern which indicates that a certain mode is entered and
+ends after a single character. A ``mode'' is a state in which dot
+patterns must be interpreted a certain way. For example, in uppercase
+mode dots @samp{1} is to be interpreted as a capital ``A'' and not a
+small ``a''. In numeric mode dots @samp{1} is to be interpreted as a
+``1''. The attribute operand identifies the mode and corresponds with
+the name of the character class that determines when the mode must be
+entered and exited.
+
+@code{modeletter} is also used to mark every character when a mode
+must last for several characters but there is no @code{begmode}
+definition, or when the sequence occurs in the middle of a word and
+@code{begmodeword} is defined but @code{endmodeword} is not
+(@pxopref{begmode}, @opref{begmodeword} and @opref{endmodeword}).
+
+@code{capsletter} is an alias for @code{modeletter uppercase}. The
+following two examples are equivalent:
@example
capsletter 6
@end example
-In addition, @code{capsletter} is used to mark every letter in a
-sequence of uppercase letters when there is no @code{begcaps}
-definition, or when the sequence happens in the middle of a word and
-@code{begcapsword} is defined but no @code{endcapsword}
-(@pxopref{begcaps}, @opref{begcapsword} and @opref{endcapsword})
+@example
+modeletter uppercase 6
+@end example
+@code{emphletter} (@pxopref{emphletter}) is the counterpart of
+@code{modeletter} for indication of emphasis.
+
+@opcode{begmodeword, attribute dots}
@opcode{begcapsword, dots}
-The dot pattern which begins a block of capital letters at the
-beginning or within a word. The block is automatically terminated
-by any character that is not a capital letter, e.g. small letters,
-punctuation, numbers etc.
+The dot pattern which indicates that a certain mode is entered for the
+following word or remainder of the current word. The mode is
+automatically terminated by the first character that is not a letter.
-Apart from capital letters, you can define a list of characters that
-can appear within a word in capitals without terminating the block.
-Do this by using the @opcoderef{capsmodechars}.
+For uppercase mode, you can define a list of characters that can
+appear within a word in capitals without terminating the block. Do
+this by using the @opcoderef{capsmodechars}.
Example:
@@ -680,11 +702,15 @@
begcapsword 6-6
@end example
+@code{begemphword} (@pxopref{begemphword}) is the counterpart of
+@code{begmodeword} for indication of emphasis.
+
+@opcode{endmodeword, attribute dots}
@opcode{endcapsword, dots}
-The dot pattern which ends a block of capital letters within a word.
-It is used in cases where the block is not terminated automatically
-by a word boundary, a number or punctuation. A common case is when
-an uppercase block is followed directly by a lowercase letter.
+The dot pattern which terminates a mode within a word. It is used in
+cases where the mode is not terminated automatically by a word
+boundary, a number or punctuation. A common case is when an uppercase
+block is followed directly by a lowercase letter.
For example:
@@ -692,9 +718,11 @@
endcapsword 6-3
@end example
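+
+The same pair of word-level indicators might also be written in the
+general form, naming the character class explicitly. This is only a
+sketch: it assumes that @code{begcapsword} and @code{endcapsword}
+relate to @code{begmodeword uppercase} and
+@code{endmodeword uppercase} in the same way that @code{capsletter}
+relates to @code{modeletter uppercase}, and it reuses the dot
+patterns from the examples in this section:
+
+@example
+begmodeword uppercase 6-6
+endmodeword uppercase 6-3
+@end example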
-@opcode{capsmodechars, characters}
+@code{endemphword} (@pxopref{endemphword}) is the counterpart of
+@code{endmodeword} for indication of emphasis.
-Normally, any character other than a capital letter will cancel the
+@opcode{capsmodechars, characters}
+Normally, any character other than a letter will automatically cancel the
@code{begcapsword} indicator. However, by using the
@code{capsmodechars} opcode, you can specify a list of characters that
are legal within a capitalized word. In some Braille codes, this might
@@ -706,13 +734,13 @@
capsmodechars -
@end example
+@opcode{begmode, attribute dots}
@opcode{begcaps, dots}
-The dot pattern which begins a block of capital letters. It is used
-in some Braille codes to mark a whole sentence or several words as
-capital letters. The block can contain capital letters as well as
-none-alphabetic characters, punctuation, numbers etc. The
-block is terminated when a small letter is encountered or at the end
-of the input string.
+The dot pattern which indicates that a mode is entered until it is
+terminated by an @code{endmode} indicator. It is used in some Braille
+codes to mark a whole sentence or several words as capital
+letters. The block can contain capital letters as well as
+non-alphabetic characters, punctuation, numbers etc.
This is the most general opening mark, i.e. it can be used for opening
at any position.
@@ -723,17 +751,20 @@
begcaps 6-6-6
@end example
+@code{begemph} (@pxopref{begemph}) is the counterpart of
+@code{begmode} for indication of emphasis.
+
+@opcode{endmode, attribute dots}
@opcode{endcaps, dots}
-The dot pattern which ends a block of capital letters. If the
-capital letters stop at the end of a word, this indicator will be
-placed immediately before the space to the next word. If the capital
-letters stop in the middle of a word, the indicator will be placed
-immediately before the first occurring small letter.
+The dot pattern which terminates a mode.
@example
endcaps 6-3
@end example
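+
+As a sketch, the block-level indicators can be written with an
+explicit character class. This assumes that @code{begcaps} and
+@code{endcaps} are shorthand for @code{begmode uppercase} and
+@code{endmode uppercase}, as @code{capsletter} is for
+@code{modeletter uppercase}; the dot patterns are those of the
+examples in this section:
+
+@example
+begmode uppercase 6-6-6
+endmode uppercase 6-3
+@end example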
+@code{endemph} (@pxopref{endemph}) is the counterpart of
+@code{endmode} for indication of emphasis.
+
@opcode{letsign, dots}
This indicator is needed in Grade 2 to show that a single letter is
not a contraction. It is also used when an abbreviation happens to be
@@ -1429,8 +1460,7 @@
@opcode{always, characters dots}
Replace the characters with the dot pattern no matter where they
appear. Do @emph{NOT} use an entry such as @code{always a 1}. Use the
-@code{uplow}, @code{letter}, etc. character definition opcodes
-instead. For example:
+character definition opcodes instead. For example:
@example
always world 456-2456 unconditional translation
@@ -1639,9 +1669,7 @@
character may belong to more than one class.
The basic character classes correspond to the character definition
-opcodes, with the exception of the @opcoderef{uplow}, which defines
-characters belonging to the two classes @code{uppercase} and
-@code{lowercase}. These classes are:
+opcodes. These classes are:
@table @code
@item space
@@ -4043,7 +4071,7 @@
@c LocalWords: compileString logFile logPrint checkyaml findTable
@c LocalWords: getTable checkTable readCharFromFile itemx charSize
@c LocalWords: README liblouisxml pindex samp kbd opcodes opcoderef numsign
-@c LocalWords: FIXME ctb nemeth filename multipass suboperand uplow litdigit
+@c LocalWords: FIXME ctb nemeth filename multipass suboperand litdigit
@c LocalWords: begcaps endcaps letsign noletsign largesign typeform
@c LocalWords: noletsignbefore noletsignafter compbrl firstwordital
@c LocalWords: lenitalphrase doubleOpcode lastworditalbefore firstletterital