The Encyclopaedia Aethiopica and the “Nine Saints”

A kind colleague sent me the article on Abba Garima from the Encyclopaedia Aethiopica.  This article, by Denis Nosnitsin, was really very good, very detailed, and well referenced.  Perfect for the newcomer interested in the subject.

This led me to search out copies of the Encyclopaedia Aethiopica.  It’s in five volumes, compiled at Hamburg and published between 2003 and 2014, although I was only able to locate four of the volumes.  The Wikipedia article for the Encyclopaedia describes it as a “basic encyclopedia”, which it is not.  The same article suggests that the transcription system used is different to that generally in use, which, if so, is a curse.

Garima was one of the “Nine Saints”, I knew.  So I thought that I would look up the article on these gentlemen, and see what it said.  Had their hagiographical “Lives” even been printed, I wondered?

The article was duly located, in volume 3, and I began to read.  Again I found a detailed, well-referenced piece of work, which answered my question, and many more.

Everybody knows that Axum was evangelised in the 4th century after a group of Greek sailors were shipwrecked on the coast of Ethiopia.  But the Nine Saints belong to the 5th-6th century, and formed a second wave of evangelisation of the country.  They are identified as “Roman”, that is, as coming from the Byzantine empire, according to Ethiopian hagiography.  Here they are, with the date of the text of each Life and its publication details:

  • Alef – 19-20th c., unpublished at the time.
  • Aftse – an early, now lost text, edited by Rossini, “La leggenda di Abba Afse in Etiopia”, Melanges syriens offerts à M. R. Dussaud, (1939), p.151-6; and a different modern one, ed. Schneider, “Les Actes d’Abba Afse de Yeha”, AE 13 (1985), 105-18; Sergew Hable Sellasse, “New Historical Elements in the ‘Gadla Afse'”, JSS 9.1 (1964), pp. 200-03.
  • Aragawi  – 16th c., edited Guidi, “Il Gadla Aragawi”, RRALm ser. 5a, 2 (1896), 54-96; van den Oudenrijn (ed., tr.), La vie de saint Mikael Aragawi (1939)
  • Garima (aka Isaac or Yeshaq) – 15th c., edited Rossini 1897, which we have been looking at in previous posts.
  • Guba – 19-20th c., unpublished at the time.
  • Liqanos (i.e. Lucianos) – 19-20th c., unpublished at the time.
  • Pantalewon (a Greek name) – 14th c., edited Rossini, Acta Yared et Pantalewon (1904), reprinted CSCO 26, 27 (SAe 9, 10), 1961.  Includes Latin translation.
  • Tsahma – 19-20th c., unpublished at the time.
  • Yemata – 19-20th c., unpublished at the time.

I noted in the bibliography a publication by Bruno Ducati (tr.), La grande impresa di Amda Sion re d’Etiopia…. I miracoli di Abba Garima omelia del XV secolo, Milano (1939), 97-154.  This sounds like it might contain a translation of Rossini’s Homily on Garima.

So their Vitae or gadlat are not ancient, and date to the 14th century onwards.  At the time of printing of the Encyclopaedia Aethiopica, the unpublished texts were in preparation, ed. Antonella Brita, who wrote this article in the EA.  I have not looked to see if these have appeared anywhere, but a cursory search suggests not.

That’s a fine haul of information for those of us looking for texts and translations.


Experiments with Amharic and technology (part 5)

Eighteen months ago, purely for fun, I made my first attempt at seeing whether AI could produce an English translation of a text in Ethiopic, the otherwise untranslated Homily of Yohannan, bishop of Axum, on Abba Garima, printed by Rossini.  It could not.  A week or so back, I tried again and got a load of rather decent looking results, and I’ve been working on this since.  The output from DeepSeek aligned quite well with Rossini’s summary.

Last week I accidentally learned that there was a French translation by Gerard Colin.[1]  I got hold of that on Friday, and I have compared the first two paragraphs with the AI-generated output from the Ge`ez text.

The results are not positive.  The DeepSeek output is really not good enough.  Here’s Rossini’s text, followed by Colin’s French.  I’ve put the full-stops in red.  Ge`ez uses word separators, which I have kept.

ወይቤ ስምፁ ወልብዉ ኦአኀውየ፡ ፍቁራንየ ዘእነግረክሙ ርኢኩ ብእሲተ እንዘ ይዘብጥዋ ዕራቃ ፡ወእንዘ ይፀርፉ ላዕሌሃ ወላዕለ እግዝእትነ ማርያም እንዘ ይብሉ፡ በእንተ ወልዳ ክርስቶስ እምብእሲት፡ ኪያሁ ኢተወልደ ይብሉ እላ ኢየአምኑ በክርስቶስ ወኮንኩ እንዘ እረውጽ ወአኀዝኩ ፡እስዐም ታሕተ እገሪሃ ለይእቲ ብእሲት እንዘ ትብል እወ በዝ አንቀጽ ወፅአ ንጉሠ ሰማያት ወምድር ወሶበ ትብል ከመዝ ወረደ ላዕሌየ አስራበ መንፈስ ቅዱስ

Ecoutez et méditez, mes frères bien-aimés, ce que je vais vous raconter. J’ai vu une femme, stérile, que l’on fustigeait et injuriait ainsi que Notre Dame Marie, en parlant à propos du Fils de celle-ci, le Christ. « Il n’est pas né d’une femme », disaient ceux qui ne croyaient pas au Christ. Je me mis à courir et entrepris d’embrasser les pieds de cette femme en disant : « Oui, par cette porte est sorti le roi des cieux et de la terre ! » Quand j’eus parlé ainsi, les trombes de l’Esprit Saint descendirent sur moi.

A quick run of Google Translate on Colin gives this:

Listen and meditate, my beloved brothers, on what I am about to tell you. I saw a barren woman being flogged and insulted, along with Our Lady Mary, speaking of her Son, Christ. “He was not born of woman,” said those who did not believe in Christ. I ran and began to embrace the woman’s feet, saying, “Yes, through this gate has come out the king of heaven and of the earth!” When I had spoken these words, the cloudbursts of the Holy Spirit descended upon me.

Now here’s the DeepSeek output.  I’ve highlighted important differences:

Hear and understand, O my beloved brethren, what I declare to you. I saw a woman while they were stoning her and mocking her, and also mocking our Lady Mary, saying concerning her Son Christ, ‘He was not born of a woman.’ These said, ‘We do not believe in Christ.’ And I stood trembling and took refuge beneath the feet of that woman as she said, ‘Yes, by this decree, the King of Heaven and Earth shall come forth.’ And when she spoke thus, the Holy Spirit descended upon me fourteenfold.

These are not small differences.  They obscure what the text is actually saying.  And as ever with AI, the output is not stable.  That was the output a week ago.  Today the same text produced this:

And he said: Listen and understand, O my beloved brothers, what I will tell you.  I saw a woman being accused of adultery, and while they were judging her, they also judged our Lady Mary, saying concerning her Son, Christ: “He was not born of a woman’s womb.” Those who say this do not believe in Christ. And as I stood there, I became angry and struck the ground beneath that woman’s feet while she was saying, “Yes, by this argument, let the King of Heaven and Earth come forth!” And when she spoke thus, the Holy Spirit descended upon me.

These are not small differences either.  I did venture to ask DeepSeek for details, and the grammar and meaning of individual words.  This output also varied uncomfortably.  Which basically means that it is not useful.
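The instability between two runs can be made visible mechanically rather than by eye.  Here is a minimal sketch in Python, using the standard `difflib` module to compare two English outputs word by word.  The function names `similarity` and `changed_spans` are my own, invented for illustration, and the score only measures how much wording is shared, not which version is correct.

```python
import difflib


def similarity(first, second):
    """Return a 0..1 score for how much wording two translations share."""
    matcher = difflib.SequenceMatcher(None, first.split(), second.split())
    return matcher.ratio()


def changed_spans(first, second):
    """List the word spans that differ, as (kind, old words, new words)."""
    a, b = first.split(), second.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    spans = []
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if tag != "equal":
            spans.append((tag, " ".join(a[a1:a2]), " ".join(b[b1:b2])))
    return spans


if __name__ == "__main__":
    run_one = "I saw a woman while they were stoning her and mocking her"
    run_two = "I saw a woman being accused of adultery"
    print(round(similarity(run_one, run_two), 2))
    for span in changed_spans(run_one, run_two):
        print(span)
```

A low score between two runs on the same paragraph is a warning that neither can be trusted without a check against something else, such as Colin’s French.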

In conclusion, AI is still not good enough to translate Ge`ez.

Perhaps I shall come back to this in a year, and see how things look then.

  1. [1]Gérard Colin, Saints Fondateurs du Christianisme Éthiopien: Frumentius, Garimā, Takla-Hāymānot, Ēwostātēwos, Paris: Les Belles Lettres (2017), pp.6-37

From my diary

I’ve now trudged all the way through the remainder of the Life of Abba Garima, taking each paragraph and getting Gemini AI to translate it into English.  Why Gemini?  Well it was there on my list of AI sites, and I used it at random.  For the immense chapter 7, I broke this down into smaller paragraphs, and this worked.
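Breaking a long chapter into pieces can be done mechanically.  Here is a minimal sketch in Python, assuming that the text uses the Ge`ez full stop (።) to end sentences.  The function name `chunk_sentences` and the 600-character limit are my own assumptions for illustration, not anything that Gemini requires.

```python
import re

FULL_STOP = "\u1362"  # the Ethiopic full stop, which looks like a double colon


def chunk_sentences(text, max_chars=600):
    """Group whole sentences into chunks of roughly max_chars characters.

    A sentence is never cut in half; a single sentence longer than
    max_chars simply becomes a chunk of its own.
    """
    # Split after every full stop, keeping the full stop with its sentence.
    sentences = [s for s in re.split("(?<=" + FULL_STOP + ")", text) if s.strip()]
    chunks = []
    current = ""
    for sentence in sentences:
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += sentence
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Each chunk can then be pasted into the AI tool in turn, which keeps every request short while never splitting a sentence.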

I tend to believe that what I am getting out of Gemini is indeed what the Ethiopic text says, more or less, although I have no way to be certain.  At one point the name of a monarch came up, and I asked for a transcription into Roman letters of the passage.  The result verified the presence of the proper name.  That’s encouraging.

Gemini was often reluctant to translate the text, claiming “I’m just a poor LLM, I only know a subset of languages.”  I responded, “Try again” and invariably it then decided that it did know Ge`ez after all.  What this means, stripped of the fake “AI” stuff, is that the search simply failed first time, and worked the second time.

This seems to be what AI is.  It’s just a search engine.  Nothing more.  The talky bit on the front is just a chat engine, such as banks use on websites to demoralise customers who need help and make them give up.  The LLM is a big flat file containing the database that it is searching.  That database is composed of a vast amount of data, including a large quantity of pirated books.  The whole “AI” stuff is just nonsense, to camouflage that it’s engaged in massive copyright violation.  The search brings back stuff from pirated grammars etc, and assembles them into an English narrative.  As with every search engine, sometimes it guesses wrong, or gets unreliable data.  But “AI” has to choose something in order to pretend to be human.  So you get bogus stuff sometimes.  It would be interesting to know how well AI works in other languages, like French, German, etc, where there may be much less pirated data in the database.

Back to Abba Garima.  I realise that Rossini appended notes to his edition, explaining various points in the text.  The philological notes are of no use to me.  But I have now decided to OCR his Italian and see what else he says about the text.  I think it may be of interest.  We’ll see.


Experiments with Amharic and technology (part 4)

I now have a corrected electronic text of the homily of Yohanan, bishop of Axum, in honour of St Garima.  I’ve numbered the paragraphs as an aid to myself, since I cannot even read the Ethiopian alphabet.  It probably needs work.  But here it is:

Today I started plugging paragraphs into DeepSeek.  I also tried asking it to interleave the sentences, Ethiopic, then English.  This did work, although curiously DeepSeek had trouble with the Ethiopic alphabet.

The first six chapters went fairly well.  Here they are:

In the name of the Father, the Son, and the Holy Spirit, one God. I have placed my trust in Him and believed in Him forever and ever. Amen.

1. The Homily delivered by Saint John, Bishop and Archbishop of Aksum, concerning the greatness and glory of Saint Isaac. And he said: Listen and understand, O my beloved brothers, what I will tell you. I saw a woman being mocked by a crowd, and as they were laughing at her and at our Lady Mary, they were saying concerning her Son, Christ: “He was not born of a woman; His birth is impossible.” These people did not believe in Christ. And as I stood there, I grew angry and stomped my foot upon the ground where that woman was standing while she spoke. And behold, at that very moment, the King of Heaven and Earth came forth. And when she spoke thus, the Holy Spirit descended upon me.

2. I tell you, I saw a man and a woman who came to me with wealth that my Lord had given me for the nourishment of your souls, the greatness and struggle of this righteous man. There was a man whose name was Mesfiyanos, king of Rome, and his wife’s name was Kefngya. For she was barren and childless, having nothing. And the two of them lived grieving for twelve years. These were righteous people who gave alms to the poor and needy and to churches. One day she went to offer sacrifice and entered the church where there were images of Peter and Paul and our Lady Mary. She stood between the three images and prayed before Mary, saying: “Give me a son who will glorify you and glorify your Son, and if he does not glorify your Son, let my womb be cursed.” And that image responded as if saying “Oh!” She offered her sacrifice and returned home. From that time she conceived and bore a son and gave thanks to God. They named him Isaac; Isaac, the joy of his mother, the splendor of his father. When he was forty days old, they took him to the church and baptized him according to Christian tradition. All the people gathered in the church of Peter and Paul. The chief bishop stood before the Ark of the Covenant and blessed him, saying: “May the God of Abraham, Isaac, and Jacob who blessed our holy forefathers bless you.” When the chief bishop blessed the child, a great light shone upon him, bright as the sun, moon, and stars. All who saw this marveled and glorified God, then offered sacrifices in their churches. The queen returned home with her child, and they celebrated a feast with great joy. When he turned twelve years old, they brought him to a teacher who instructed him in the Law, the Prophets, the Apostolic books, and their commentaries. They ordained him as a deacon. He further learned church canons, admonitions, plants, and animals. 
His parents consulted one another, saying: “Let us arrange a marriage for our son while we still live, that we may rejoice in his offspring.” As they were making these plans, an angel of God appeared to Mesfiyanos in a dream saying: “Many souls perish—will you not save them? A great church shall be built through him.” The king awoke and went to his wife, saying: “Last night an angel of God came to me and told me that through him miracles and wonders shall be performed.” The angel said to the queen: “I have come to command you to build and establish a church, that you may rejoice in the Kingdom of Heaven.”

3. When his father died, the nobles and dignitaries of Rome gathered and mourned for Isaac as he wept bitter tears. They enthroned him in the royal seat, and he reigned for seven years. During his days there was joy and peace. Afterwards, Saint Pantaleon came to him and said: “O my beloved Isaac, your earthly kingdom is temporary, but the kingdom of heaven is everlasting. The earthly kingdom will perish and pass away, but the heavenly kingdom will not pass away. Have you not heard what our Lord said in the Gospel: ‘Heaven and earth will pass away, but my words will not pass away’? Or what Paul said about this world: ‘It is passing away, and those who weep should live as though they were not weeping, and those who rejoice as though they were not rejoicing, and those who buy as though they had no possessions, and those who use the world as though they were not using it, for the form of this world is passing away’? My son Isaac, do not neglect the love of God.”

When the message arrived, Isaac saw it and wept bitterly. He rose at night and prayed, saying: “O my Lord Jesus Christ, I beg and implore You to hear my prayer and grant the request of Your servant Isaac. Show me the straight path that I should walk.” He rose at night from his chamber and went to a secret place outside the palace, departing by night. The angel Gabriel appeared to him, prepared a chariot and canopy for him with shining wings, and brought him from Rome to Aksum in ninety days, traveling by night on the back of an eagle, arriving at the third hour.

4. When the messengers of Father Pantaleon arrived after ten months and four days, they told the man of God, who marveled and trembled, saying: “O my son Isaac, have you not heard what David the prophet said: ‘Your works and my soul will be satisfied when I awake with Your likeness’? What I have done secretly, do not hide from me under the earth. What I have seen with your eyes, write it all in your book. Listen, my son, to the greatness of God who has called us from all things, just as He called you under the wing of the eagle.” As they spoke of God’s greatness, they remained until daybreak.

5. Saint Isaac said to Father Pantaleon: “Father, clothe me in the monastic garb.” Father Pantaleon replied: “Do you know, my son, what the monastic garb means?” Saint Isaac answered: “Yes, I know, father. Have you not heard what our Lord said in the Gospel: ‘What is exalted among men is an abomination before God’? As Paul said: ‘What will separate us from the love of Christ? Will hardship, distress, persecution, famine, nakedness, peril, or sword? Neither death nor life, nor things present nor things to come, nor powers, nor anything else in all creation will be able to separate us from the love of God in Christ Jesus.’ I desire to be bound to Christ and to be bound tightly.”

When Father Pantaleon heard these words from Saint Isaac’s mouth, he prostrated himself and blessed God, saying to Isaac: “May God strengthen you in keeping His commandments.” Then he took the monastic garb and clothed him, blessing him with these words: “May our Lord Jesus Christ, who blessed Anthony and placed upon his head the crown of monasticism, sanctify your garb. May God who blessed Father Macarius and Father Synoda bless your garb. May He who gave Elijah’s mantle to Elisha when he ascended to heaven and brought down the cloak for him—and with that cloak Elisha struck the waters and crossed over, raised the dead, and cleansed lepers—likewise sanctify your garb. May God who sanctified the priesthood of Melchizedek sanctify your priesthood.” They remained together for one year.

6. When the other saints heard that Saint Isaac had renounced the kingdom and embraced monastic life, they came to him. Among them were Abba Likanos from Qusṭinṭinya (Constantinople), Abba Yima’ata from Qosya, Abba Ṣəḥma from Antioch, Abba Guba from Cilicia, Abba Afṣe from Isya, Abba Maṭa‘ from Rome, and Abba ‘Oṣe from Caesarea. When the saints gathered, they met with Abba Pantaleon and Saint Isaac, and they were united in spiritual love. They lived together in one house, devoted to prayer. Among them, there was no one who sought personal gain, nor anyone who desired material possessions or the pleasures of the world. They remained for a period of one year.

The first 5 chapters are from DeepSeek.  I then asked it to do chapter 6, and the monstrously long chapter 7.  The results were weird, and then DeepSeek crashed.  I then tried ChatGPT on chapter 6, which is what you have above.  The DeepSeek version was about the same.

The ChatGPT version of chapter 7 was quite different in every way from that produced by DeepSeek.  Here’s DeepSeek, the opening bits:

7.  While they lived in this way, a governor of Aksum came to them and said: “There is a great beast, a king of the land of Ethiopia, to whom all the nobles bow down and offer gifts. They bring him a beautiful maiden of good appearance, and they perform sacrifices and offerings. When she is brought before this beast, he defiles her and then devours her. This beast has reigned in this manner for fifteen years.”

This beast was 170 cubits long and 4 cubits wide, with one cubit for its tail. Its eyes were like fiery coals, and the pupils of its eyes were like black pots. Its entire body was like a serpent and a dragon. When it ate, it would consume 17 oxen, and they would bring before it 10 sheep, 10 goats, 1000 chickens, 100 lambs, and countless birds. Upon its head were horns measuring three cubits. When it moved, its voice could be heard from a distance of eighty days’ journey.

When the brothers heard this, they marveled and were astonished. They trembled and said to the governor: “Do you truly speak the truth?” He replied: “Yes, truly. If you do not believe me, let one of you come and see.” Father Pantaleon said to Saint Isaac: “My son, what shall we do?” Saint Isaac replied: “Let my brother Oz go and observe this beast. If it is as he says, then you too must rise and pray.” …

Here’s ChatGPT:

7.  And when they came and reached him, a certain nobleman from Aksum spoke to them and said: “Hail, great king, the ruler of the nation of Ethiopia!” And they all bowed before him, the nobles, and presented to him the honorable and righteous virgin of the Lord. And they performed hymns and praises, and she came before him, the king, and prostrated herself. And she continued to remain pure. And the king was astonished.

And they stayed there for fifteen years. And the king stood in honor of her for one hundred and eighty days. And he prepared four feasts in her honor. And one of these feasts was the greatest. And his eyes shone like flames of fire, and his pupils were as bright as polished metal. His appearance was as dazzling as the sun and lightning. And when he spoke, seventeen scribes recorded his words. And they made a proclamation across the land.

There were ten thousand warriors, and one hundred thousand men armed with shields and spears. And they gathered before him in great numbers. And his voice was heard from afar, resounding like thunder. And the land trembled at his voice. And his presence was mighty, and all the nobles fell before him.

The DeepSeek text is at least plausible.  The ChatGPT is not.

What I will try doing is getting each to translate it, sentence by sentence, interleaved with the Ethiopic.  I’ll see what that does.


From my diary

I’m working away on this Ethiopian homily of John, bishop of Axum, on St. Garima.  It was printed in 1898 by C. Conti Rossini,[1] but without translation.

Well, I don’t know any Ethiopian at all, and I don’t even know the alphabet.  There are 31 consonants, each of which has seven variants, I gather.

But I knew that it was possible to get Google to turn images into electronic text, and a couple of experiments with ChatGPT and DeepSeek quickly showed that the resulting output file could be understood by AI and produce English text.

So I need to get a decent electronic text.

My first step was to take the PDF, extract the pages with the Ethiopian text on them, and pull them into Finereader.  Finereader does NOT support Amharic, but it has useful image editing tools.  I trimmed the 24 pages down to the bare text – no footnotes, no headings, and exported them as images to a directory.

I then bundled these images up into a PDF using my incredibly elderly Adobe Acrobat Pro 9.0.  I then went into Google Drive and uploaded the PDF.  Then I right-clicked on it in Google Drive, and opened it in Google Docs.  This caused Google to OCR it, thereby creating an electronic text.  I then downloaded this in Word format.

I’ve checked the results into a local Git repository – so that I can always go back if I screw up the file.

And now, page by page, I am going through what Google has given me, removing obvious crud and irrelevant line breaks.  It seems to insert a small amount of garbage between pages.
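Some of the crud can be found automatically, because it is usually not in the Ethiopic script at all.  Here is a minimal sketch in Python, assuming the OCR output has been saved as plain text and that everything legitimate falls in the Unicode Ethiopic block (U+1200 to U+137F).  The function name `suspicious_characters` is my own, invented for illustration.

```python
ETHIOPIC_START = "\u1200"
ETHIOPIC_END = "\u137f"


def suspicious_characters(text):
    """Return (line, column, character) for every non-Ethiopic character.

    Whitespace is ignored.  Anything else outside the Ethiopic block,
    such as a Latin letter, an ASCII digit or a stray symbol, is reported
    so that it can be checked by eye against the printed page.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch.isspace():
                continue
            if not (ETHIOPIC_START <= ch <= ETHIOPIC_END):
                hits.append((line_no, col, ch))
    return hits
```

This only points at places to look.  A Roman colon standing in for a word separator will be reported, but a wrongly recognised Ethiopic letter will not, so the page still has to be read through.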

Wish me luck!

There are other free Amharic OCR websites online, and these seem to do a reasonable job too.  But I’ve stuck so far with the Google Docs output.

Incidentally DeepSeek offered the opinion that the text is not in Amharic, as I had expected, but in Ge`ez, Classical Ethiopian.  Luckily it doesn’t care.

  1. [1]C. Conti Rossini, “L’omilia di Yohannes vescovo di Aksum in onore di Garima,” Actes du Congrès International des Orientalistes, Section Sémitique (Paris, 1898). Online via here.

Experiments with Amharic and technology (part 3)

Back in August 2023, I found that it was possible to scan in some of an Ethiopian text, part of the Life of Saint Garima.  Unfortunately the results from machine translation were disappointing.  ChatGPT was no better.

This evening I was writing my last post, and remembered these efforts.  So I pasted those chunks of text into ChatGPT.  Here’s the first:

In the name of the Father, the Son, and the Holy Spirit, one God above all, I have renounced everything and believed in Him until the end of the world. Amen.

The sermon delivered by Saint John, the Bishop of Aksum, concerning the greatness and honor of Saint Isaac:

He said, “Listen and understand, my beloved ones, what I will tell you. I saw a woman who was suffering greatly, trembling and in distress. And she called upon Our Lady Mary, saying: ‘For the sake of the Son, Christ, this woman is not yet born.’ They said to her, ‘Whoever does not believe in Christ and remains doubtful, they shall be humbled and brought low under her feet.’

Then, at that moment, the King of Heaven and Earth came forth. And she said, ‘The one who is greater than Moses shall come.’

Plausible.  Here’s the second chunk:

And it happened on one night that a book was opened, read, and written down. And he arose for prayer in the church. And the angels wrote for him the Gospel at the fourth hour and interpreted it for him. And the angels of God descended and stood before him, and our Lord Christ comforted him. And his message was heard throughout all the land. His prayer and blessing remained with us.

Also plausible.

Perhaps I should have another go at this!


The perils of AI translation

Rather excited by the discoveries that AI would translate medieval Greek, I thought I’d try another attempt at that Ge`ez text that I put into Google Translate some time back.  That is a homily on St Garima by a certain bishop John.  I found the text on my disk, and put a paragraph into Bard AI.  Nope.  It wouldn’t play.  Then I tried ChatGPT 3.5.  That churned out the Nicene Creed, as a supposed translation.

You can’t trust AI.  It can and will generate garbage.  You have to be able to check.


More experiments with Amharic and technology

In my last post I found that it was possible to turn a PDF full of images of Amharic text into recognised electronic text using Google Drive, and then get some translation of the results into English using Google Translate.

There were some extremely interesting comments made on the post, which I have been reading.  I have also prepared a PDF of the whole text of the Life of Garima by Yohannes, and run that through the Google Drive process.

Where we started was in trying to read a passage of this text, in which – supposedly – God stopped the sun so that St Garima could copy the bible in one day.  The summary of the work given by Rossini (instead of a proper translation, drat him) indicates that this was on lines 356-60 of his text, which turns out to be the last line of p.161 and the first three of p.162.  Here they are:

The output from the OCR is good, but you still have to compare the characters carefully.  Errors can often be picked up just by dumping the raw scan output into Google Translate, which shows things like numerals.

Here we have a character that is plainly wrong, and coming out as a numeral “4”.  It looks like an “o” with a hat and two dots under.  The two dots under are legs in another copy of Rossini.

I’m guessing that it’s a “ge” character, from looking at the Wikipedia article, but I can’t be sure.  The script isn’t an alphabet, but a syllabary.  Each character is a consonant followed by a vowel, which makes for a lot more characters.  There’s a table of the characters in the Wikipedia article, consonants down the left, vowels across the top.  I’ve not really looked at this.
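A doubtful character can also be looked up by machine, because the same table layout is mirrored in Unicode: the Ethiopic syllables start at U+1200 and each consonant occupies a row of eight code points.  Here is a minimal sketch in Python, assuming that layout applies to the character in question (the labialised rows do not follow the plain seven-vowel pattern).  The function name `describe_syllable` is my own, invented for illustration.

```python
import unicodedata

FIRST_SYLLABLE = 0x1200
LAST_SYLLABLE = 0x135A


def describe_syllable(ch):
    """Say which row and column of the syllable table a character sits in.

    Returns None for anything outside the Ethiopic syllable range,
    which includes the Ethiopic numerals and punctuation.
    """
    cp = ord(ch)
    if not FIRST_SYLLABLE <= cp <= LAST_SYLLABLE:
        return None
    column = (cp - FIRST_SYLLABLE) % 8  # 0 is the first vowel order
    return {
        "char": ch,
        "codepoint": "U+%04X" % cp,
        "unicode_name": unicodedata.name(ch, "UNASSIGNED"),
        "row_start": chr(cp - column),  # first character of the consonant row
        "order": column + 1,
    }


if __name__ == "__main__":
    # The first three characters of the row beginning at U+1308.
    for ch in "\u1308\u1309\u130a":
        print(describe_syllable(ch))
```

The Unicode name spells out the consonant and vowel in Roman letters, which helps when comparing an OCR guess against the printed page.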

The Google Translate output is also interesting because of the choice of “detected language” – Tigrinya, rather than Amharic.  If you force it to Amharic, you get a lot less meaning.

One awkward part of using Google Drive to do the OCR is that it doesn’t preserve the line breaks.  That makes comparing the lines more awkward.   So you have to manually do this:

፬ ፡ ወኮነ ፡ በአሐቲ ፡ ዕላት ፡ ወነሥአ ፡ መጽሐፈ ፡ ወቀለመ ፡ ወወጠነ፡
ይጽሐፍ ። ወተንሥአ ፡ ለጸሎት በሰርክ ። ወጸሐፉ ፡ ሎቱ : መላእክት ፡ ወንጌ ለ ፡
በ፬ ፡ ሰዓት ፡ ወትርጓሜሁ ። ወመላእክተ ፡ እግዚአብሔር ፡ ወትረ ፡ ይት ለአክዎ ፡
ወእግዚእነሂ ፡ ክርስቶስ ፡ ያንሶሱ ፡ ምስሌሁ ። ወተሰምዐ ፡ ዜናሁ :
ውስተ ፡ ኵሉ ፡ ሀገር ። ጸሎቱ ፡ ወበረከቱ ፡ የሀሉ ፡ ምስሌነ ።

The Wikipedia article mentioned earlier gave me a list of punctuation marks.  There are two sorts of punctuation visible in here.  The colon mark is actually word division, which means that some words above go over two lines.  I’ve chosen not to split words above.  The double colon mark “::” is the full stop.  Interestingly Google Translate gives different results if you remove the spaces!

Going through the electronic text, removing spaces, I notice that sometimes the word-separator isn’t detected by the OCR.  So I added that in.  Sometimes it put a Roman colon instead, so I replaced that.  Finally I split on sentence:

፬፡ወኮነ፡በአሐቲ፡ዕላት፡ወነሥአ፡መጽሐፈ፡ወቀለመ፡ወወጠነ፡ይጽሐፍ።
ወተንሥአ፡ለጸሎት፡በሰርክ።
ወጸሐፉ፡ሎቱ፡መላእክት፡ወንጌ ለ፡በ፬፡ሰዓት፡ወትርጓሜሁ።
ወመላእክተ፡እግዚአብሔር፡ወትረ፡ይትለአክዎ፡ወእግዚእነሂ፡ክርስቶስ፡ያንሶሱ፡ምስሌሁ።
ወተሰምዐ፡ዜናሁ፡ውስተ፡ኵሉ፡ሀገር።
ጸሎቱ፡ወበረከቱ፡የሀሉ፡ምስሌነ።
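The clean-up just described can be sketched in Python.  This is a minimal sketch under my own assumptions: every Roman colon in the text is a misread word separator, and a bare space between two Ethiopic letters is left alone, because it may be either a missing separator or a word broken across a line, and only a look at the printed page can decide.  The function names are invented for illustration.

```python
import re

WORD_SEPARATOR = "\u1361"  # the Ethiopic colon-like mark between words
FULL_STOP = "\u1362"       # the Ethiopic full stop


def normalise(text):
    """Tidy OCR output: fix Roman colons and remove spaces around marks."""
    text = text.replace(":", WORD_SEPARATOR)
    # Drop any whitespace, including line breaks, around the two marks.
    text = re.sub(r"\s*([\u1361\u1362])\s*", r"\1", text)
    # Collapse whatever whitespace remains into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def ambiguous_gaps(text):
    """Return positions of bare spaces between two Ethiopic letters."""
    pattern = r"(?<=[\u1200-\u135A]) (?=[\u1200-\u135A])"
    return [m.start() for m in re.finditer(pattern, text)]


def split_sentences(text):
    """Split after each full stop, keeping the full stop with its sentence."""
    parts = re.split("(?<=" + FULL_STOP + ")", text)
    return [p.strip() for p in parts if p.strip()]


if __name__ == "__main__":
    raw = "\u12c8\u12ae\u1290 \u1361 \u1260\u12a0\u1210\u1272 : \u12d5\u120b\u1275 \u1362"
    clean = normalise(raw)
    print(clean)
    print(ambiguous_gaps(clean))
    print(split_sentences(clean))
```

Running `ambiguous_gaps` on the normalised text gives a short list of places to check by hand, which is where the judgement calls about joining or separating words have to be made.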

And run it again and I get this:

But this still is not good enough to do much with.  If we didn’t have an idea what the text said, this would not tell us.

All this fiddling about would certainly get you into contact with the language, and start you on a journey to learning it.  But it’s not a good enough translation for other purposes, although intriguing.

One suggestion that was made in the comments to the last article was that ChatGPT gave better results.  The output quoted was indeed produced, and was very smooth and seemed to be a series of liturgical prayers.  But… I don’t think that this is actually the content.  These AI tools are really only an improved version of the text prediction tools you get on messaging on a mobile phone.  So it was pumping out garbage.

Anyway I tried it on this passage, and it crashed GPT very effectively!  At the moment I can’t get any reply of any sort, not even to “hello”.

I don’t think that I will do more here.  Clearly the technology is almost, but not quite good enough to be useful.


Is it possible to read editions of Amharic texts? An experiment

In my last post I mentioned how the Life of St Garima in Ethiopian was printed by Rossini, but without a translation.  In fact it has never been translated into any modern language, to my knowledge.  I don’t know any Ethiopian, and I doubt that I ever will.

But we live in an age of wonders, when it comes to unfamiliar languages.

So… is it possible to work with Ethiopian language editions, even if you know no Ethiopian?  What about Google Translate?  Ethiopian is in this heavy unfamiliar script.  Is there OCR for this?  If you can scan Rossini’s edition, can you pop it into Google Translate and get the English?

There are two sorts of Ethiopian out there, I know.  There is Ge`ez, or classical Ethiopian; and there is Amharic, the modern dialect.  Rossini printed his text from a 19th century manuscript.  So it seems likely that this is in Amharic.

A quick Google confirmed; Google Translate knows Amharic!  A bit of googling found me an Amharic news website online, here.  I’m using Chrome, so all I had to do was right-click anywhere and select “Translate to English” and the whole website was rendered into some sort of English.  And… it worked!!  Yay me!  It’s obviously not 100%, but it’s way better than 0%!

So what about OCR?  I was sad to see that Abbyy Finereader apparently doesn’t support Amharic.  That’s a blow.  It was developed originally to handle Cyrillic, so it certainly has the capability.  But it’s not offered.  Drat.

A bit of googling brought me to a dubious-looking website here, claiming to offer a selection of tools which could do Amharic OCR.  The prose felt a bit machine-generated, so I worried that it was bunk, or worse, a malicious site.  But the first option was… Google Drive.

I never knew this, but it seems that, if you upload a PDF containing an image of text, and then open it in Drive as a Google Docs document, it OCRs the content.

Well, I thought, let’s give it a try.  So I extracted the first page of Rossini’s edition, using Adobe Acrobat Pro 9 – no flashy latest-edition stuff going on here!  Here’s a pic:

Then I uploaded it, and opened as a Google document.  And … it just treated the Amharic as an image.  Dang!  But I noticed that it did indeed OCR the Italian at the top of the page!

This is supposed to work.  So I thought maybe I should work over the image a bit.  I imported the one-page PDF into Abbyy Finereader 15, and chopped off the Italian at the top, and the critical apparatus at the bottom.  I then used the image editor in Finereader to “whiten the background”.  This can be flaky, but this time it worked fine, and I got a pure white background.   And I got this:

(I’ve just seen the marginal notes, which I need to chop off as well, so I’ll have to go round the loop again)

I exported the image as a PNG, and I used Acrobat again to create a PDF from the image.  Then I uploaded the new PDF to Google Drive, and opened it as a Google Docs document.  And… it worked!  Sort of…

በስመ : አብ : ወወልድ ‘ ወመንፈስ ፡ ቅዱስ ፡ ፩ ፡ አምላከ ፡ ላዕሌሁ ፡ ተወ ከልኩ፡ ወቦቱ ፡ አመንኩ ፡ እስከ ፡ ላዓለመ ፡ ዓለም ፡ አሜን ።

ድርሳን ፡ ዘደረሰ ፡ ቅዱስ ፡ ዮሐንስ ፡ ኤጲስ ፡ ቆጶስ ፡ ዘአክሱም o ፡ በእንተ ዕበዩ ፡ ወክብሩ ፡ ለቅዱስ ፡ ይስሓቅ = ወይቤ ፤ ስምዑ ‘ ወልብዉ ፡ ኦአኀውየ 5 ፍቁራንየ ፡ ዘእነግረከሙ ። ርኢኩ ፡ ብእሲተ ፡ እንዘ ፡ ይዘብጥዋ ፡ ዕራቃ ወእንዘ ፡ ይሀርፉ ፡ ላዕሌሃ ፡ ወላዕለ ፡ እግዝእትነ ፡ ማርያም ፡ እንዘ ፡ ይብሉ በእንተ ፡ ወልዳ ፡ ክርስቶስ ፤ እምብእሲት ፡ ኪያሁ : ኢተወልደ ፣ ይብሉ ፡ እላ ፡ ኢየአምኑ ፡ በክርስቶስ = ወኮንኩ ፡ እንዘ ፡ እረውጽ ፡ ወአኀዝኩ እስዐም ፡ ታሕተ ፡ እገሪሃ ፡ ለይእቲ ፡ ብእሲት ፡ እንዘ ፡ ትብል ፤ እወ ▪በዝ ፡ አንቀጽ ፡ ወፅአ ፡ ንጉሠ ፡ ሰማያት ፡ ወምድር ። ወሶበ ፡ ትብል፡ ከሙዝ ፡ ወ

That’s… rather astonishing.  No idea what all that is, but it looks sort of right.  Let’s bear in mind that Rossini printed his edition in 1897.  This is not a modern typeface.  So this is rather good.

Next step was to paste it into Google Translate.  I set it to auto-detect the language, and pasted in the first bit.  And… it worked.  In fact it gave a really useful transcription into Roman letters as well, which makes it a LOT easier to manipulate the text.

OK, I’m cheating slightly.  The first time I uploaded, the translation ended at “Spirit”.  But this is a Google Translate bug – it sometimes omits the remainder of a sentence.  If you split the text with a line feed, you often get the rest.  And that’s what I did.  I worked out by experiment where I needed to be, and then I got the above.

I don’t quite believe the translation of the second sentence either.  I suspect I need to play with this a bit to work out what each word is.

I notice all those colons between every word.  It might help if I actually looked up the script online!

But I think you’ll agree that this is quite marvellous – I, who know absolutely nothing about the language, am getting something useful out!

Magic!
