Andreas Rejbrand’s Website

News

Rejbrand Text Editor 3.1: Unicode

One of the most distinguishing features of Rejbrand Text Editor is its focus on Unicode characters beyond those available on a standard PC keyboard. Indeed, technical documents – as well as non-technical documents that are typographically beautiful – often require quite a few characters not found on your typical keyboard. Rejbrand Text Editor offers a set of features aimed at quickly inserting and investigating such characters.

AutoReplace

The simplest of these features is called “AutoReplace”. It lets you enter a Unicode character by typing its “insertion code” in the editor. For example, typing \deg followed by punctuation or whitespace will enter the degree sign (°: U+00B0: DEGREE SIGN). By default, Rejbrand Text Editor is installed with a list of 367 such codes, but you may add codes of your own (or remove codes that bother you).
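The replacement rule described above can be illustrated with a short Python sketch. This is not the editor's actual implementation: the `CODES` table is a small hypothetical subset of the default list, and the exact set of terminating punctuation characters is an assumption.

```python
import re

# Hypothetical subset of the insertion-code table (the real list has 367 entries).
CODES = {
    "deg": "\u00b0",    # DEGREE SIGN
    "alpha": "\u03b1",  # GREEK SMALL LETTER ALPHA
    "ne": "\u2260",     # NOT EQUAL TO
    "le": "\u2264",     # LESS-THAN OR EQUAL TO
}

# A backslash, a run of letters/digits, and a lookahead requiring whitespace
# or punctuation. The terminator set is an assumption made for this sketch.
PATTERN = re.compile(r"\\([A-Za-z0-9]+)(?=[\s.,;:!?()])")


def auto_replace(text):
    """Replace every terminated, known insertion code with its character."""
    def substitute(match):
        # Unknown codes are left exactly as typed.
        return CODES.get(match.group(1), match.group(0))
    return PATTERN.sub(substitute, text)


print(auto_replace(r"The angle is 90\deg. "))
```

Because the terminator is only checked in a lookahead, it stays in the text after the replacement, and a code that has not yet been followed by whitespace or punctuation is left untouched.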

Here are some of the default codes:

Typography and misc.
Code Character Description
\en – U+2013: EN DASH
\em — U+2014: EM DASH
\para ¶ U+00B6: PILCROW SIGN
\asterism ⁂ U+2042: ASTERISM
Greek letters
Code Character Description
\Alpha Α U+0391: GREEK CAPITAL LETTER ALPHA
\alpha α U+03B1: GREEK SMALL LETTER ALPHA
\Beta Β U+0392: GREEK CAPITAL LETTER BETA
\beta β U+03B2: GREEK SMALL LETTER BETA
\Gamma Γ U+0393: GREEK CAPITAL LETTER GAMMA
\gamma γ U+03B3: GREEK SMALL LETTER GAMMA
\Delta Δ U+0394: GREEK CAPITAL LETTER DELTA
\delta δ U+03B4: GREEK SMALL LETTER DELTA
\Epsilon Ε U+0395: GREEK CAPITAL LETTER EPSILON
\epsilon ε U+03B5: GREEK SMALL LETTER EPSILON
\Zeta Ζ U+0396: GREEK CAPITAL LETTER ZETA
\zeta ζ U+03B6: GREEK SMALL LETTER ZETA
\Eta Η U+0397: GREEK CAPITAL LETTER ETA
\eta η U+03B7: GREEK SMALL LETTER ETA
\Theta Θ U+0398: GREEK CAPITAL LETTER THETA
\theta θ U+03B8: GREEK SMALL LETTER THETA
\Iota Ι U+0399: GREEK CAPITAL LETTER IOTA
\iota ι U+03B9: GREEK SMALL LETTER IOTA
\Kappa Κ U+039A: GREEK CAPITAL LETTER KAPPA
\kappa κ U+03BA: GREEK SMALL LETTER KAPPA
\Lambda Λ U+039B: GREEK CAPITAL LETTER LAMDA
\lambda λ U+03BB: GREEK SMALL LETTER LAMDA
\Mu Μ U+039C: GREEK CAPITAL LETTER MU
\mu μ U+03BC: GREEK SMALL LETTER MU
\Nu Ν U+039D: GREEK CAPITAL LETTER NU
\nu ν U+03BD: GREEK SMALL LETTER NU
\Xi Ξ U+039E: GREEK CAPITAL LETTER XI
\xi ξ U+03BE: GREEK SMALL LETTER XI
\Omicron Ο U+039F: GREEK CAPITAL LETTER OMICRON
\omicron ο U+03BF: GREEK SMALL LETTER OMICRON
\Pi Π U+03A0: GREEK CAPITAL LETTER PI
\pi π U+03C0: GREEK SMALL LETTER PI
\Rho Ρ U+03A1: GREEK CAPITAL LETTER RHO
\rho ρ U+03C1: GREEK SMALL LETTER RHO
\Sigma Σ U+03A3: GREEK CAPITAL LETTER SIGMA
\sigma σ U+03C3: GREEK SMALL LETTER SIGMA
\Tau Τ U+03A4: GREEK CAPITAL LETTER TAU
\tau τ U+03C4: GREEK SMALL LETTER TAU
\Upsilon Υ U+03A5: GREEK CAPITAL LETTER UPSILON
\upsilon υ U+03C5: GREEK SMALL LETTER UPSILON
\Phi Φ U+03A6: GREEK CAPITAL LETTER PHI
\phi φ U+03C6: GREEK SMALL LETTER PHI
\Chi Χ U+03A7: GREEK CAPITAL LETTER CHI
\chi χ U+03C7: GREEK SMALL LETTER CHI
\Psi Ψ U+03A8: GREEK CAPITAL LETTER PSI
\psi ψ U+03C8: GREEK SMALL LETTER PSI
\Omega Ω U+03A9: GREEK CAPITAL LETTER OMEGA
\omega ω U+03C9: GREEK SMALL LETTER OMEGA
Mathematical symbols
Code Character Description
\ne ≠ U+2260: NOT EQUAL TO
\approx ≈ U+2248: ALMOST EQUAL TO
\ge ≥ U+2265: GREATER-THAN OR EQUAL TO
\le ≤ U+2264: LESS-THAN OR EQUAL TO
\minus − U+2212: MINUS SIGN
\plusminus, \pm ± U+00B1: PLUS-MINUS SIGN
\cdot ⋅ U+22C5: DOT OPERATOR
\cross × U+00D7: MULTIPLICATION SIGN
\deg ° U+00B0: DEGREE SIGN
\proportionalto, \proportional, \prop ∝ U+221D: PROPORTIONAL TO
\sqrt √ U+221A: SQUARE ROOT
\divides ∣ U+2223: DIVIDES
\ndivides ∤ U+2224: DOES NOT DIVIDE
\parallel ∥ U+2225: PARALLEL TO
\ortho ⟂ U+27C2: PERPENDICULAR
\N ℕ U+2115: DOUBLE-STRUCK CAPITAL N
\Z ℤ U+2124: DOUBLE-STRUCK CAPITAL Z
\Q ℚ U+211A: DOUBLE-STRUCK CAPITAL Q
\R ℝ U+211D: DOUBLE-STRUCK CAPITAL R
\C ℂ U+2102: DOUBLE-STRUCK CAPITAL C
\H ℍ U+210D: DOUBLE-STRUCK CAPITAL H
\union, \cup ∪ U+222A: UNION
\intersection, \cap ∩ U+2229: INTERSECTION
\setminus ∖ U+2216: SET MINUS
\subset ⊂ U+2282: SUBSET OF
\nsubset ⊄ U+2284: NOT A SUBSET OF
\subseteq ⊆ U+2286: SUBSET OF OR EQUAL TO
\subsetneq ⊊ U+228A: SUBSET OF WITH NOT EQUAL TO
\superset ⊃ U+2283: SUPERSET OF
\nsuperset ⊅ U+2285: NOT A SUPERSET OF
\superseteq ⊇ U+2287: SUPERSET OF OR EQUAL TO
\supersetneq ⊋ U+228B: SUPERSET OF WITH NOT EQUAL TO
\complement ∁ U+2201: COMPLEMENT
\in ∈ U+2208: ELEMENT OF
\contains ∋ U+220B: CONTAINS AS MEMBER
\nin ∉ U+2209: NOT AN ELEMENT OF
\ncontains ∌ U+220C: DOES NOT CONTAIN AS MEMBER
\and ∧ U+2227: LOGICAL AND
\or ∨ U+2228: LOGICAL OR
\not ¬ U+00AC: NOT SIGN
\xor ⊻ U+22BB: XOR
\forall ∀ U+2200: FOR ALL
\exists ∃ U+2203: THERE EXISTS
\sum ∑ U+2211: N-ARY SUMMATION
\product ∏ U+220F: N-ARY PRODUCT
\coproduct ∐ U+2210: N-ARY COPRODUCT
\infinity, \infty ∞ U+221E: INFINITY
\nabla ∇ U+2207: NABLA
\partial ∂ U+2202: PARTIAL DIFFERENTIAL
\int ∫ U+222B: INTEGRAL
\iint ∬ U+222C: DOUBLE INTEGRAL
\iiint ∭ U+222D: TRIPLE INTEGRAL
\oint ∮ U+222E: CONTOUR INTEGRAL
\oiint ∯ U+222F: SURFACE INTEGRAL
\oiiint ∰ U+2230: VOLUME INTEGRAL
\cwint ∱ U+2231: CLOCKWISE INTEGRAL
\therefore ∴ U+2234: THEREFORE
\because ∵ U+2235: BECAUSE
\qed ∎ U+220E: END OF PROOF
\to → U+2192: RIGHTWARDS ARROW
\To ↦ U+21A6: RIGHTWARDS ARROW FROM BAR
\cdots ⋯ U+22EF: MIDLINE HORIZONTAL ELLIPSIS
\dots … U+2026: HORIZONTAL ELLIPSIS
\alef ℵ U+2135: ALEF SYMBOL
Arrows
Code Character Description
\rarr → U+2192: RIGHTWARDS ARROW
\Rarr ⇒ U+21D2: RIGHTWARDS DOUBLE ARROW
\larr ← U+2190: LEFTWARDS ARROW
\Larr ⇐ U+21D0: LEFTWARDS DOUBLE ARROW
\uarr ↑ U+2191: UPWARDS ARROW
\Uarr ⇑ U+21D1: UPWARDS DOUBLE ARROW
\darr ↓ U+2193: DOWNWARDS ARROW
\Darr ⇓ U+21D3: DOWNWARDS DOUBLE ARROW
\lrarr ↔ U+2194: LEFT RIGHT ARROW
\LRarr (\eq) ⇔ U+21D4: LEFT RIGHT DOUBLE ARROW
Superscript
Code Character Description
\sup1 ¹ U+00B9: SUPERSCRIPT ONE
\sup2 ² U+00B2: SUPERSCRIPT TWO
\sup3 ³ U+00B3: SUPERSCRIPT THREE
\sup4 ⁴ U+2074: SUPERSCRIPT FOUR
\sup5 ⁵ U+2075: SUPERSCRIPT FIVE
\sup6 ⁶ U+2076: SUPERSCRIPT SIX
\sup7 ⁷ U+2077: SUPERSCRIPT SEVEN
\sup8 ⁸ U+2078: SUPERSCRIPT EIGHT
\sup9 ⁹ U+2079: SUPERSCRIPT NINE
Miscellaneous symbols
Code Character Description
\backspace ⌫ U+232B: ERASE TO THE LEFT
\return ⏎ U+23CE: RETURN SYMBOL
\warning ⚠ U+26A0: WARNING SIGN
\recycling ♲ U+2672: UNIVERSAL RECYCLING SYMBOL
\floralheart ❦ U+2766: FLORAL HEART
\bullet • U+2022: BULLET
\radioactive ☢ U+2622: RADIOACTIVE SIGN
\peace ☮ U+262E: PEACE SYMBOL
\placeofinterest ⌘ U+2318: PLACE OF INTEREST SIGN
\anchor ⚓ U+2693: ANCHOR
\atom ⚛ U+269B: ATOM SYMBOL
\thunderstorm ☈ U+2608: THUNDERSTORM
\sinewave ∿ U+223F: SINE WAVE
\airplane ✈ U+2708: AIRPLANE
\snowman ☃ U+2603: SNOWMAN

In Rejbrand Text Editor, you can press Shift+F1 to display the list of all recognised “insertion codes”:

Screenshot of Rejbrand Text Editor displaying the list of all AutoReplace-codes.

To edit this list, use the “AutoReplace Editor” found on your Start menu. This is an external application because it affects all software that uses the Rejbrand Text Editor control. For instance, the next version of my mathematical software AlgoSim will use this control, and will therefore recognise the same list of insertion codes.

MultiInput

A related feature is MultiInput. It gives each of several keys on the keyboard access to a set of related characters.

For instance, if you type an asterisk (*), you may then press F9 to display a dialog box with related characters, like various kinds of multiplication signs and typographical bullets:

Screenshot of Rejbrand Text Editor displaying the asterisk/bullet/multiplication sign MultiInput dialog.

If you type the hyphen-minus character and press F9, you will similarly see various hyphens, dashes, and the minus sign:

Screenshot of Rejbrand Text Editor displaying the hyphen/dash/minus sign MultiInput dialog.

Finally, if you enter a single or double quotation mark and press F9, you see a list of single or double quotation marks of various kinds (as used in English, German, or Swedish, for instance):

Screenshot of Rejbrand Text Editor displaying the double quotation mark MultiInput dialog.

In each case, simply select the desired character using the keyboard (arrow keys or characters) and press Enter to replace the newly inserted character with the desired character.
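As a rough illustration of the idea, the following Python sketch maps a few typed characters to lists of related characters and replaces the most recently typed character with the selected one. The `RELATED` table and the function name are hypothetical; the characters the editor actually offers are those shown in its dialogs, not necessarily these.

```python
# Hypothetical table of related characters, keyed by the typed character.
RELATED = {
    "*": ["\u2022", "\u00d7", "\u22c5", "\u2217"],            # bullet, multiplication signs, asterisk operator
    "-": ["\u2010", "\u2013", "\u2014", "\u2212"],            # hyphen, en dash, em dash, minus sign
    '"': ["\u201c", "\u201d", "\u201e", "\u00ab", "\u00bb"],  # various double quotation marks
}


def multi_input(text, choice):
    """Replace the last character of text with the chosen related character.

    `choice` is the index the user would pick in the dialog. Text that does
    not end in a key with related characters is returned unchanged.
    """
    if not text or text[-1] not in RELATED:
        return text
    options = RELATED[text[-1]]
    return text[:-1] + options[choice]


# Typing "3 -" and picking the fourth entry yields a true minus sign.
print(multi_input("3 -", 3))
```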

Entering a codepoint

If you know the Unicode codepoint of the character you want to insert, you need only type the codepoint (in hexadecimal) and press Ctrl+U to insert it:

Screenshot of Rejbrand Text Editor displaying the replacement of a codepoint (U+222b) with the corresponding Unicode character (integral sign).
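The conversion itself is simple to reproduce. Here is a minimal Python sketch of turning a hexadecimal codepoint into the corresponding character; the function name is made up for the example, and accepting an optional "U+" prefix is a convenience of this sketch, not a documented behaviour of the editor.

```python
def codepoint_to_char(hex_text):
    """Convert a hexadecimal codepoint such as '222b' to its character."""
    cleaned = hex_text.strip()
    # Tolerate the conventional "U+" prefix (an assumption of this sketch).
    if cleaned.upper().startswith("U+"):
        cleaned = cleaned[2:]
    value = int(cleaned, 16)  # raises ValueError if the text is not hexadecimal
    return chr(value)         # raises ValueError if value is above 0x10FFFF


print(codepoint_to_char("222b"))  # the integral sign, U+222B
```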

Searching the Unicode database

If you don’t know the Unicode codepoint or AutoReplace “insertion code” (if such a code even exists) of the character you’d like to insert, you may search or browse the Unicode database from within Rejbrand Text Editor.

Press F7 to open the “Character Information” window. Here you may press the “Search for string” button (F8) to search the Unicode database for characters with descriptions that contain a given string:

Screenshot of Rejbrand Text Editor displaying the Character Browser dialog box with the search results for string 'integral'.
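A comparable search can be sketched in Python with the standard `unicodedata` module, which exposes the character names of the Unicode database bundled with the interpreter. The function name and the `limit` parameter are assumptions made for this example, and the results depend on the Unicode version your Python ships with.

```python
import sys
import unicodedata


def search_names(needle, limit=20):
    """Return (codepoint, character, name) for names containing `needle`."""
    needle = needle.upper()  # Unicode character names are upper case
    hits = []
    for cp in range(sys.maxunicode + 1):
        # Unassigned codepoints and surrogates have no name; use "" for them.
        name = unicodedata.name(chr(cp), "")
        if needle in name:
            hits.append((f"U+{cp:04X}", chr(cp), name))
            if len(hits) == limit:
                break
    return hits


for codepoint, character, name in search_names("integral"):
    print(codepoint, character, name)
```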

Alternatively, you may select the “Browse block” item from the drop-down menu to browse the Unicode database by block:

Screenshot of Rejbrand Text Editor displaying the Character Browser dialog box with various Unicode groups.

What’s the character?

Very often I want to know exactly what character I see on the screen. Fortunately, this is easy with Rejbrand Text Editor, since the character to the right of the caret, the selected character, or the recently typed character, is displayed in the status bar:

Screenshot of Rejbrand Text Editor displaying Unicode character information in the status bar.

The “Character Information” window also displays the block of the character:

Screenshot of Rejbrand Text Editor displaying Unicode character information in the Character Information window.

Using these features, it becomes easy to make sure that you always use the right character. After all, some similar-looking characters are frequently confused: the degree sign (°) and the masculine ordinal indicator (º), the Greek small letter beta (β) and the Latin small letter sharp S (ß), the micro sign (µ) and the Greek small letter mu (μ), the typographic apostrophe (’) and the acute accent (´), just to mention a few of the most commonly confused pairs.
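Outside the editor, the same kind of check can be made with Python's standard `unicodedata` module. The sketch below prints the codepoint and name of each character in the confusable pairs mentioned above; the `describe` helper is a name invented for this example.

```python
import unicodedata


def describe(ch):
    """Return 'U+XXXX: NAME' for a single character."""
    return f"U+{ord(ch):04X}: {unicodedata.name(ch, '<unnamed>')}"


# The similar-looking pairs mentioned in the text, written as escapes so
# that the source code itself is unambiguous.
PAIRS = [
    ("\u00b0", "\u00ba"),  # degree sign / masculine ordinal indicator
    ("\u03b2", "\u00df"),  # Greek beta / Latin sharp S
    ("\u00b5", "\u03bc"),  # micro sign / Greek mu
    ("\u2019", "\u00b4"),  # typographic apostrophe / acute accent
]

for first, second in PAIRS:
    print(f"{first}  {describe(first)}")
    print(f"{second}  {describe(second)}")
```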

Characters in the current document

Using the Tools/Advanced statistics feature, you can obtain a summary of the Unicode characters found in the current document. The summary contains the number of characters found in the text from each Unicode block:

Screenshot of Rejbrand Text Editor displaying the Advanced Statistics window.
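A per-block summary of this kind could be computed roughly as follows. Python's standard library does not expose Unicode block names, so this sketch hard-codes a handful of block ranges; the `BLOCKS` table is deliberately incomplete, and anything outside it is counted as "(other)". How the editor itself computes its statistics is not stated, so treat this purely as an illustration.

```python
from collections import Counter

# A small, incomplete excerpt of Unicode block ranges (first, last, name).
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x2000, 0x206F, "General Punctuation"),
    (0x2190, 0x21FF, "Arrows"),
    (0x2200, 0x22FF, "Mathematical Operators"),
]


def block_of(ch):
    """Return the block name of a character, or '(other)' if not listed."""
    cp = ord(ch)
    for first, last, name in BLOCKS:
        if first <= cp <= last:
            return name
    return "(other)"


def block_statistics(text):
    """Count the characters of `text` per Unicode block."""
    return Counter(block_of(ch) for ch in text)


for block, count in block_statistics("\u03b1 \u2264 90\u00b0").most_common():
    print(f"{block}: {count}")
```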

If you wonder whether or not the current text is ASCII-only, you can try Encoding/Check if file is ASCII only. If the file contains at least one non-ASCII character, the first such character will be highlighted:

Screenshot of Rejbrand Text Editor displaying the ASCII check message box.
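The check amounts to finding the first character above U+007F. A minimal Python sketch, with a function name chosen for this example:

```python
def first_non_ascii(text):
    """Return (index, character) of the first non-ASCII character, or None."""
    for index, ch in enumerate(text):
        if ord(ch) > 0x7F:  # ASCII covers U+0000 to U+007F
            return index, ch
    return None


hit = first_non_ascii("The angle is 90\u00b0.")
if hit is None:
    print("The text is ASCII only.")
else:
    index, ch = hit
    print(f"First non-ASCII character: U+{ord(ch):04X} at index {index}")
```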

Finally, the Find/Character search feature lets you search the current file for Unicode characters belonging to a particular block or type:

Screenshot of Rejbrand Text Editor displaying the Character Search dialog box.

When you press OK, the matching characters will be highlighted (like any other search match) and can be navigated using F2 and F3:

Screenshot of Rejbrand Text Editor highlighting matching Unicode characters.
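A search by character type can be approximated in Python using the Unicode general category, available through `unicodedata.category`. Interpreting "type" as the general category is an assumption of this sketch, and the function name is invented for the example.

```python
import unicodedata


def find_by_category(text, prefix):
    """Return (index, character) for characters whose general category
    starts with `prefix`, e.g. 'Sm' for math symbols or 'S' for all symbols."""
    return [
        (index, ch)
        for index, ch in enumerate(text)
        if unicodedata.category(ch).startswith(prefix)
    ]


sample = "x\u2264y\u00b0"
print(find_by_category(sample, "Sm"))  # mathematical symbols only
print(find_by_category(sample, "S"))   # symbols of any kind
```

The returned indices play the role of the highlighted matches: a caller could step through them in order, much like navigating between search hits.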

Summary

In this article, some of the Unicode-related features of Rejbrand Text Editor 3.1 have been showcased. In the next article in the Rejbrand Text Editor series, we will have a closer look at the text transformation features of the application.

