Notes on unicode editing in Coptic

Here are a couple of notes on how I’m editing Unicode Coptic in Microsoft Word 2007.

I’m using Wazu Japan’s Comprehensive Unicode Test Page for Coptic a lot.  This allows me to identify characters and unicode character sets.

I find I can enter any character in Word by just typing its four-character hex code and hitting Alt-X.  So if I type 0307 after a Coptic character and hit Alt-X, I get a diacritical dot above the character.  Wazu’s page tells me what the codes are!  What I have actually done is record a macro, so I move to the character and hit Alt-1, which runs a macro that types 0307 and hits Alt-X.  It saves keystrokes.
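For anyone who wants to see outside Word what that Alt-X trick produces, here is a minimal Python sketch. It is not part of my Word setup, and the sample letter (U+2C89) and the function name are just picked for illustration. It puts the combining code 0307 after a Coptic letter and prints the code and Unicode name of each character.

```python
import unicodedata

COMBINING_DOT_ABOVE = "\u0307"  # the same code I type before Alt-X in Word


def add_dot_above(letter: str) -> str:
    """Return the letter followed by the combining dot above."""
    return letter + COMBINING_DOT_ABOVE


# A Coptic small letter, chosen only as an example.
sample = "\u2C89"
dotted = add_dot_above(sample)

# The dotted letter is really two characters: the base letter and the dot.
for ch in dotted:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The point to notice is that the dot is a separate character sitting after the letter, which matters for searching later on.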

OK; I’ve manually replaced Unicode grave accents (code 0300) with dots on a couple of fragments, and I’m getting fed up.  Can I do a global replace?  I think so.  This Microsoft page (I had to use the Google cache version, as Microsoft tried to divert me to some useless registration process) seems to tell you how.  You can search for any Unicode character using this:

 ^Unnnn where nnnn is the character code

Let’s try it: ^U0300 in the Find box… and it doesn’t work.  ^U is not allowed.  I try ^u, lower case, and that is allowed but finds nothing.  Rats.  It seems I am not the first to discover this.  Not merely must it be lower-case; it must be decimal, not the hexadecimal (base-16) codes supplied by charmap or the Wazu page. 

OK, let’s try.  A hex converter is here.  Hex 0300 is decimal 0768, it seems.  Let’s try ^u0768.  And … nope.  That doesn’t work either.
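If you would rather not hunt for an online converter, the hex-to-decimal step is easy to check in Python. This is only a sketch of the arithmetic; the function names are my own invention, and it says nothing about whether Word will accept the result.

```python
def hex_to_decimal(code: str) -> int:
    """Convert a hex code such as '0300' to its decimal value."""
    return int(code, 16)


def decimal_to_hex(value: int) -> str:
    """Convert a decimal value back to a four-digit hex code."""
    return f"{value:04X}"


print(hex_to_decimal("0300"))  # prints 768
print(decimal_to_hex(768))     # prints 0300
```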

Boy, this wastes a lot of time!  Thanks Microsoft.

UPDATE: Persistence pays off.  Well, I have a workaround.  You cannot search for and replace Unicode combining characters like dots and accents on their own.  But … you can replace the character and the dot together.  I have just copied an e+accent into Find What (it looks like garbage when it arrives – but no matter) and copied an e+dot into Replace With, and it worked.  It replaced 462 instances, indeed.  So… I can do a lot of these that way.
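For comparison, the same swap can be done on plain text outside Word. The sketch below assumes the text has been saved out of Word as Unicode text first, which is not what I actually did; the function name and the sample string are made up. In Python the combining accent can be replaced on its own, whatever letter it sits on.

```python
from typing import Tuple

COMBINING_GRAVE = "\u0300"      # the accent I want rid of
COMBINING_DOT_ABOVE = "\u0307"  # the dot I want instead


def replace_accents_with_dots(text: str) -> Tuple[str, int]:
    """Swap every combining grave accent for a combining dot above.

    Returns the new text and the number of accents replaced.
    """
    count = text.count(COMBINING_GRAVE)
    return text.replace(COMBINING_GRAVE, COMBINING_DOT_ABOVE), count


# Made-up sample: two accented letters and one plain one.
sample = "\u2C89\u0300 \u2C81\u0300 \u2C83"
fixed, replaced = replace_accents_with_dots(sample)
print(replaced)  # prints 2
```

One caveat: this only catches accents stored as separate combining characters. A precomposed letter such as U+00E8 would need to be split first with `unicodedata.normalize("NFD", text)`.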

Still annoyed that Word doesn’t deal with it properly, though.
