Another AI Translation Experiment: Old Slavonic

This post at Three Pillars Blog came to my attention yesterday.  Scott Cooper is experimenting with Google Translate and ChatGPT AI to see whether we can get anything useful out of Old Church Slavonic.

As you can see, Google’s language detection isn’t entirely useless with OCS. Apparently early Cyrillic and its modern Bulgarian equivalent, as well as some of the vocabulary, are similar enough that Google “detects” Bulgarian and renders both individual words and some complete sentences. In a pinch, it seems there are OCS dictionaries one could slog through, cobbling together what Google can’t. That sounds rather miserable. ChatGPT, as we’ll see, was able to use my description of the text as Old Church Slavonic, and produce a full translation. What follows is information on the source text, an outline of the results, and a comparison to an actual human translation.

He’s picked an Old Slavonic text which is online, and indeed already has an English translation online, and run ChatGPT against it.

Obviously we have to ask – is that translation already in the ChatGPT database?  If so, the AI translation will not in fact be doing anything much.  What we need is some untranslated Old Slavonic.

But very interesting!


Ephrem Graecus – Published English translations coming soon

Ephrem the Syrian is the most famous of the Syriac writers; but there is a mass of material in Greek attributed to him.  Some of it is translations of the Syriac, but most is clearly not.  It looks as if there was a fashion for writing in his style at one point.

Unfortunately this large splodge of unexplored material has never been critically edited.  Instead it was collected by Assemani in the 18th century from manuscripts, and more or less printed as he found it.  Some of the texts are clearly excerpts from others of the texts.  Assemani gave a Latin translation.

Since then we’ve had Phrantzolas reprint the text in 1988-98 in 7 volumes, with a modern Greek translation.  That was a step forward.

But now I read that a complete English translation is in progress!  An English translation is indeed the obvious next step.  Making it easier to dip into the texts will cause more young scholars to start doing so, and in turn to start creating scholarship about it.  Little by little Ephrem Graecus is being opened up!

Via the St Ephrem: The Greek Corpus site:

Published Translations of Ephrem Graecus Coming Soon

That’s it. That’s the news. A translation of the Greek writings attributed to St Ephrem the Syrian is currently underway with St Vladimir’s Seminary Press. They are currently working on vol. 1 (of seven). Look for it probably in 2024.

The site owner told me:

A friend of mine here in the US is working on it. I had planned to do a volume of the most (historically) “important” texts, but he was inspired to do the whole collection, so I yielded to him. It will be very good to finally put out there

This is massively good news.  Wonderful!


Latin translations of the Greek fathers in Dark Ages monastic manuscript inventories

How widely known were the Greek fathers in the Latin world during the Dark Ages?  How accessible were they?

One possible source of information is the surviving inventories of medieval libraries.  A collection of these was printed by G. Becker in 1885 as Catalogi Bibliothecarum Antiqui, and it makes interesting reading indeed.  In fact if you want to get an idea of what a medieval library looked like, this is the best thing you can read.  Catalogue after catalogue, monastery after monastery.

If we do a search on “Origen”, we start finding results almost at once.  The seventh catalogue, from Fontanelle, ca. 823-33 AD, has four volumes of his homilies as entries 78-81.  The next catalogue (8), from Reichenau, at much the same time, is better still:

Homilies of Chrysostom on Matthew; Origen on Genesis, on Romans; and books from the Clementine Recognitions.  All of these are, of course, Latin translations.  It raises the question of just what the Latin world of that period had access to.

Back in 2021 an interesting article appeared in the Downside Review by Scott G. Bruce: “Veterum vestigia patrum: The Greek Patriarchs in the Manuscript Culture of Early Medieval Europe”.[1]  The abstract is worth quoting:

This article draws attention to the availability of Latin translations of Greek patristic literature in western reading communities before the year 800 through a survey of the contents of hundreds of surviving manuscripts from the Merovingian and Carolingian periods. An examination of the presence of the translated works of eastern church fathers in the 8th-century florilegium known as The Book of Sparks (Liber scintillarum) and monastic library catalogs from the early 9th century corroborates the impression left by the manuscript evidence. Taken together, these sources allow us to gauge the popularity of particular eastern authors among Latin readers in early medieval Europe and to weigh the influence and importance of Greek patristics in the western monastic tradition.

But the abstract is too modest: the author has surveyed nearly 1,800 Latin manuscripts created before 800 AD – a massive task.  His conclusion:

In conclusion, the legacy of the ancient fathers, in particular those of Greek origin, was an important aspect of the intellectual history of early medieval monasticism that has received little attention in modern scholarship. This article has laid the foundation for the study of the reception of the Greek fathers in the medieval Latin tradition. Its survey of the nearly 1800 Latin manuscripts created before or around the year 800 has shown that doctrinal, devotional, and historical works attributed to eastern Christian authors survived in relative abundance in western monastic libraries. Latin reading communities favored especially the biblical commentaries of Origen, the salvation history of Eusebius, and the homilies and sermons of John Chrysostom, but other Christian Greek authors like Basil of Caesarea, Theodore of Mopsuestia, Ephrem the Syrian, and Gregory of Nazianzos informed their thinking as well. An examination of the early 8th-century Book of Sparks and Carolingian book inventories from the first decades of the 9th century corroborated the evidence of the manuscripts, and also uncovered the presence of lesser known works of Eastern origin that attracted a western audience, including a spiritual guide by Evagrius of Pontus. … The Christological controversies of the late 8th century raised the currency of the Greek fathers even higher among Latin readers like Alcuin, who looked back to the 5th-century east for a language of authority with which to defend traditional Christian doctrine against the misguided interpretation of Christ’s nature put forward by the Adoptionists.

The article is very readable, and is recommended.

Note the presence of Ephraim the Syrian?  This is CPG 4080 = CPL 1143, De die judicii, on the Day of Judgement, found in the catalogue of St Riquier (Becker 11, p.27).  There’s an English translation online here.

Looking in Becker, I find mention of a Discourses to Monks in Whitby ca. 1180 (Becker 109, p.226), but all is not as it seems, for Becker gives his source:

Edwards. Memoirs of libraries. (London 1859.) p. 109-111 excerpsit ex Young History of Whitby and Streoneshald Abbey {1817} p.918-920.

and the latter is accessible online.  On p.919 here we find that the entry is merely “Effrem” – the rest is speculation by the 19th century editor.  The work supposed here is CPG 3942 Exhortation to the Monks of Egypt (Sermones paraenetici ad monachos Aegypti), the first ten of which are online in English here.  (The translation site has gone, and is now preserved only at Archive.org.)

Later in Becker there is yet another “liber, qui vocatur Ephrem” – a book which is called ‘Ephrem’ – as entry 37 of Stederburg (Becker 124, p.253, 12th c.).

It’s very useful to know just what was available in Dark Ages Europe.

  1. [1] Vol. 139, pp. 6-23.  DOI: 10.1177/0012580621994704

Cyril of Alexandria on posthumous anathemas

Letter 72 of Cyril of Alexandria, To Proclus, Bishop of Constantinople, is an interesting item.  The Greek is in PG77, column 344, and there is an English translation by John McEnerny in FOC77, p.72 f.  Nestorius has been deposed and exiled, and the hapless Proclus installed in his stead as bishop of Constantinople.  Cyril has won his civil war, and is now concerned to solidify his victory.

But throughout the East some were exceedingly vexed at this, not only of the laity but also of those assigned to the sacred ministry. … Yet by the grace of God either in pretense or in truth they speak and preach one Christ and anathematize the impious verbiage of Nestorius. In the meanwhile things there are in much tranquillity and they run toward what is steadfast in the faith day by day, even those who once were tottering.

But victory in civil war is always arrogant.  Some of Cyril’s supporters have arrived in Constantinople and asked the emperor to use his power to condemn the long-deceased scholar Theodore of Mopsuestia as well.

But his name in the East is great and his writings are admired exceedingly. As they say, all are bearing it hard that a distinguished man, one who died in communion with the churches, now is being anathematized.

Of course Nestorius was indeed following the ideas of Theodore, but Cyril is nothing if not a politician.  An exposition written by Theodore had been condemned at an Eastern synod; but the name of Theodore was not mentioned:

But while condemning those who think in this way, in prudence the synod did not mention the man, nor did it subject him to an anathema by name, through prudence, in order that some by paying heed to the opinion of the man might not cast themselves out of the churches.

The translation is somewhat awkward, or, more likely, Cyril’s prose is itself convoluted.  But what Cyril means here is that, if the synod had understood that Theodore was being condemned, they might have refused to go along with Cyril’s plans.  Cyril calls disagreement with himself “casting yourself out of the church.”  He adds:

Prudence in these matters is the best thing and a wise one.

Then to the matter:

(4) If he were still among the living and was a fellow-warrior with the blasphemies of Nestorius, or desired to agree with what he wrote, he would have suffered the anathema also in his own person. But since he has gone to God, it is enough, as I think, that what he wrote absurdly be rejected by those who hold the true doctrines, since by his books being around the chance to go further sometimes begets pretexts for disturbances.

The second sentence is hard to understand, so I took a look at the PG.

I couldn’t make anything of that either, unfortunately.  The parallel Latin is somewhat obscure also:

Quoniam vero ad Deum abiit, sufficit, ut ego puto, ea quae absurde ab ipso scripta sunt rejici ab iis qui recte sentiunt, cum iis, qui in ipsius libros incidunt, etiam ulterius progredi tumultuum occasiones nonnunquam pariat.

But since he has gone to God, it is sufficient, as I think, for the things which are absurdly written by him to be rejected by those who think rightly; to go still further with these things, which they meet with in his books, may sometimes create the occasions for disturbances.

Not sure about the Latin of the last bit – shout if you can see it better!

And in another way since the blasphemies of Nestorius have been anathematized and rejected, there have been rejected along with them those teachings of Theodore which have the closest connection to those of Nestorius. Therefore, if some of those in the East would do this unhesitatingly, and there was no disturbance expected from it, I would have said that grief at this makes no demands on them now and I would have told them in writing.

I have read this several times.  I think Cyril is saying that, if the Easterners were happy about rejecting Theodore, then nothing need be done about him; and that Cyril would be happy to say so in writing to them; a writing that could be held in evidence against him, in the putrid politics of the time.

(5) But if, as my lord, the most holy Bishop of Antioch, John, writes, they would choose rather to be burned in a fire than do any such thing, for what purpose do we rekindle the flame that has quieted down and stir up inopportunely the disturbances which have ceased lest perhaps somehow the last may be found to be worse than the first?

This seems to mean that he has heard from “the most holy John”, his political foe, that the Easterners are NOT happy about rejecting Theodore, and if they have to do so, may reject Cyril’s settlement entirely, and go back to supporting Nestorius.

And I say these things although violently objecting to the things which Theodore, already mentioned, has written and although suspecting the disturbances which will be on the part of some because of the action, lest somehow some may begin to grieve for the teachings of Nestorius as a contrivance in the fashion of that spoken of by the poet among the Greeks, “They mourned in semblance for Patroclus but each one mourned her own sorrows.”

He thinks the Easterners will rally around the name of Theodore, while meaning Nestorius.

(6) If, therefore, these words please your holiness, deign to indicate it, in order that it may be settled by a letter from both of us. It is possible even for those who ask these things to explain the prudence of the matter and persuade them to choose to be quiet rather and not to become an occasion of scandal to the churches.

So, he continues, please tell my partisans from me to shut up, stop rocking the boat, and let the Easterners get used to the idea of rejecting Nestorius.

It is obviously unfair to condemn a man who died in the peace of the church for saying things that were later turned into a big argument.  Cyril says something of the sort, but I’m not sure that this is the thrust of his argument.  His appeal is instead to politics and prudence.  Principles are for free men, and the world that he lived in was not such a society.

On the other hand it’s easy to be unfair to Cyril.  He was effectively the political leader of Egypt, as his predecessor had been, and as his successors were to be.  His life was entirely a matter of politics.  Politics is the art of the possible.  Cyril did not think that condemning Theodore at this time was possible, even though he would have liked to.


Some thoughts about interpolation in patristic texts

The term “Theotokos” (“Mother of God”) became the subject of fierce controversy in the 5th century AD.  The dispute was perhaps more political than religious – Constantinople versus Alexandria – but was fought with great ferocity, and lavish bribery, and ended in the victory of Cyril of Alexandria and the exile of Nestorius and indeed a great number of others.  Failure to use the term for Mary was a sign of Nestorianism, which could be fatally bad for you.  The term is still held to with passion by Eastern Orthodox even today.

Therefore, when searching the TLG for the earliest usages of this word, it was something of a surprise to find it in Greek patristic texts from 300 onwards.  It appears in Athanasius, but also before.  Of course there is no reason why the word might not be used, and it need not imply any of the doctrines associated with it in the 5th century.  But all the same it seems odd.

Could these usages be later interpolations?  How could we tell?

I am very much opposed to alleging interpolation as a way to dispose of inconvenient evidence.  In general the texts that have reached us from antiquity do so in a very reasonable state, as far as we can tell.  The main reason for this is, of course, the prosaic one.  Anybody who put himself to the considerable trouble of copying a literary text did so precisely because he wanted a copy of that text.

But once politics and bigotry appear, then the incentive to forgery appears.  Cyril of Alexandria himself refers, in letters 39 and 40, to tampering with a letter of Athanasius:

8.  But when some of those accustomed “to pervert what is right” turn my words aside into what seems best to them, let your holiness not wonder at this, knowing that those involved in every heresy collect from the divinely inspired Scripture as pretexts of their own deviation whatever was spoken truly through the Holy Spirit, corrupting it by their own evil ideas, and pouring unquenchable fire upon their very own heads. But since we have learned that some have published a corrupt text of the letter of our all-glorious father, Athanasius, to the blessed Epictetus, a letter which is itself orthodox, so that many are done harm from it, thinking that for this reason it would be something useful and necessary for our brothers, we have sent to your holiness copies of it made from the ancient copy which is with us and is genuine. – Letter 39 (FOC 76 translation), p.152

and:

25. … For the most God-fearing Bishop of Emesa, Paul, came to me and then, after a discussion had been started concerning the true and blameless faith, questioned me rather earnestly if I approved the letter from our thrice-blessed father of famous memory, Athanasius, to Epictetus, the Bishop of Corinth. I said that, “if the document is preserved with you incorrupt,” for many things in it have been falsified by the enemies of the truth, I would approve it by all means and in every way. But he said in answer to this that he himself had the letter and that he wished to be fully assured from the copies with us and to learn whether their copies have been corrupted or not. And taking the ancient copies and comparing them with those which he brought, he found that the latter have been corrupted; and he begged that we make copies of the texts with us and send them to the Church of Antioch. And this has been done. – Letter 40 (FOC 76 translation), p.166-7.

Much later, at the Council of Florence, the Greeks and the Latins arguing over the filioque found examples on both sides of interpolation.

This is human nature.  Once a behaviour is incentivised, through advantage or fear, then it will appear.

We know something of “forced speech” in these days.  If you look at a job advertisement from most official or academic sources, each and every one will include some reference to “diversity”.  The word is pretty much meaningless of itself; but we all know that it is a code-word, indicating loyalty to a particular political agenda.  A job advertisement that did not contain it might be dangerous!  It might leave the clerks open to an accusation of failure to endorse this policy or that.  Far safer to murmur the code-words.

In the 5th century, failure to use “theotokos” might carry the same risks for any writer.  Once certain views are obligatory, and failure to conform is dangerous, then it becomes important to use the code-words.  “Theotokos” was most certainly a code-word.

A little while ago I was looking at the catena fragments which preserve bits of Origen.  These use the word “theotokos”, but I gather that scholars do not think this part of Origen’s text.  This is not unreasonable.  A catena is a literary work of itself, composed of chains of quotations from the fathers, adapted to form a continuous commentary on a passage of scripture.  I really do not see why a writer would not introduce “theotokos” when composing his catena.  It wouldn’t be wrong, or misrepresentation.  Rather it would be a case of adapting the older writer to contemporary needs.

Likewise a copyist of an integral work might add “theotokos” in the margin, as a note.  Because omissions were also written in the margins, this could easily be mistaken for a copyist omission, and become part of the text when next copied.

But all of this is speculation.  We need to ask whether there is any evidence that this actually happened.  Did later copyists introduce “theotokos” into 4th century texts?  How can we tell?

One obvious way to assess this is to find copies of the patristic texts prior to 400 AD, and look.

This leads to the next question: do we have any copies of the writings of patristic writers like Athanasius prior to 400?  How could we find out?

I’m not sure that this is a very easy question to answer.  For Latin texts we have E.A.Lowe’s Codices Latini Antiquiores.  But to the best of my knowledge this is safely offline and inaccessible.  And anyway we need Greek.  There might be papyri.  These might be safely dated; or not.  But how do we find out?  A critical edition of a specific work ought to tell us at least something.  Probably that’s the way to go.

But I wouldn’t be a bit surprised if we had no 4th century manuscripts of 4th century fathers.  Surviving 4th century manuscripts are few.

So how can we detect any such process of interpolation of “code-words” into patristic texts?

At the moment, I suspect, all we can do is be cautious in this area.


16 page lost section of ancient “Julian Romance” text discovered in Vatican manuscript

A pair of researchers have discovered and published a lost ancient text in the Vatican library.  It’s the long-lost opening portion of a text usually dated to the early 6th century, and known as the “Julian Romance.” This is a novelisation of the reign of Julian the Apostate, who reigned 361-363 AD, and his persecution of the church.  The work was composed in Syriac, but widely translated in antiquity into other nearby languages including Greek.

The publication is Marianna Mazzola & Peter Van Nuffelen, “The Julian Romance: A Full Text and a New Date”, in: Journal of Late Antiquity 16 (2023) pp.324-377. (Paywalled here; first page here).  This prints the Syriac text, with an English translation, and a thorough study.

Here’s the abstract:

The Syriac Julian Romance, a tripartite fictional account of the reign of the Emperor Julian, was hitherto only partially known from two manuscripts. This article publishes the missing first section from Vat. Sir. 37, a section that narrates the death of Constantius II. The complete text allows us to demonstrate that the narrative was composed by a single author and that the tripartite structure does not reflect three older, separate texts. Further, we identify the Miscellaneous Chronicle of 640 as the source for most of the historical information in the Romance. This implies a new date in the first half of the seventh century, which is supported by other chronological indications in the Romance.

The majority of the text of the Julian Romance was already known, and can be found in British Library Additional MS 14,641.  But this copy was obviously missing a large chunk at the start.  A small part of the beginning was later found in Paris BNF Syr. 378.  But there was still, obviously, a large amount missing.

Marianna Mazzola, one of the two scholars, describes how the discovery came about:

I was checking the historiographical excerpts contained in Syriac doctrinal florilegia for a project I have been collaborating with at Ghent University and stumbled on this text mistakenly cataloged by J. Assemani as an excerpt from Michael the Syrian’s Chronicle on the Death of Constantius II.

I did not remember such a passage in Michael’s Chronicle so I started to translate it and realised that the style was not at all the plain, dry style of Syriac chroniclers. Gradually, I realised that it could be the Romance of Julian and finally when on the last page my text overlapped with that of MS Add. 14641, I no longer had any doubts.

The article is written with Peter van Nuffelen in which we also propose a new date on the basis of the new textual evidences. Looking forward to hear any remarks! We are aware this is a much debated text that has always sparkled much scholarly discussion.

In response to a query, she added:

I worked on the on-line manuscript. Sadly, it was still COVID time when I worked on it, and it was impossible to travel to the Vatican Library. Certainly further study of the manuscript would be an important addition.

The manuscript is indeed online, and may be found at the Vatican site here.  The article lists the contents of the manuscript.  The new text is on folio 168v-173r.  Here’s the opening:

ܐܝܟܙ ܐܟܠܡ ܣܘܢܝܛܢܛܣܘܩ ܪܒ ܣܝܛܢܛܣܘܩܕ ܗܢܩܦܡ ܠܥܕ ܐܬܝܥܫܬ ܒܘܬ
.ܝܗ̈ܘܗܒܐ ܠܥ ܦܣܘܬܬܐܘ ܗܡܥ ܬܘܠ ܫܢܟܬܐܘ ܐܒܪ ܣܘܢܝܛܢܛܣܘܩܕ ܗܬ̈ܡܘܝ ܘܡܠܫ ܕܟ ܠܥܒ ܝܗܘܬܝܐܕ ܗܪܟܘܒ ܢܝܕ ܣܘܢܝܛܢܛܣܘܩ .ܝܗ̈ܘܢܒ ܐܬܠܬ ܗܪܬܒ ܢܡ ܐܬܘܟܠܡ ܘܕܚܐܘ ܐܬܘܝܘܐ ܐܕܚܒ ܢܘܗܬܢܝܒ ܐܘܗ ܬܝܐ ܐܡܠܫܘ .ܣܘܛܣܘܩܘ .ܣܝܛܢܛܣܘܩܘ .ܗܡܫ ܬܝܡ ܆ܬܠ̈ܬ ܢܝ̈ܢܫ ܟܝܐ ܐܬܘܟܠܡ ܘܪܲܒܕ ܕܟܘ .ܢܘܗܬܘܢܪܒܕܡܒ ̇ܗܣܟܛܒ ܐܝܕܪܕ .ܐܬܢܝܫܡ ܣܘܛܣܘܩ ܒܘܬ ܕܟܘ .ܝܗ̈ܘܚܐ ܕܝܨ ܐܬܘܟܠܡ ̇ܬܟܪܲܫܘ ܉ܐܫܝܫܩ ܢܘܗܘܚܐ ܣܘܢܝܛܢܛܣܘܩ ܐܬܘܟܠܡ ̇ܗܠܟܒ ܪܚܬܫܐܘ .ܐܡܠܥ ܢܡ ܕܼܢܥ ܘܼ ܗ ܦܐ ܆ܢܝܬܪ̈ܬ ܢܝ̈ܢܫ ܐܬܘܟܠܡܒ ܕܼܒܥ ܛܠܲܬܫܐܘ ܐܝܡܘܪ̈ܕ ̇ܗܠܟ ܐܬܘܟܠܡܠ ܕܼܚܐܘ .ܢܘܗܘܚܐ ܣܝܛܢܛܣܘܩ ܼ ܘܗ ܐܬܘܢܪܒܕܡܘ .ܐܝܢ̈ܘܝܕ ܐܢܝܢܡܒ ܥܒܪ̈ܐܘ ܢܝܫܡܚܘ ܐܐ̈ܡܬܫ ܬܢܫܒ .ܐܬܘܟܠܡ ܝܗܘ̈ܕܝܐܒ ̇ܬܢܩܬܘ .ܢܘܗܝܠܥ

[168v] History of the death of Constantius, son of Constantine the victorious king.

(1) When the days of Constantine the Great ended, he was gathered to his people and joined his fathers, and his three sons reigned after him: Constantine, his first-born who was named after him, Constantius, and Constans, and there was peace with one pacific consent between them, current in their government. After they had ruled for around three years, Constantine the oldest brother died, and the rule remained with his brothers. After Constans had reigned for two years, he also died, and Constantius, their brother, was left [in control of] the entire realm and the governance. He took the entire realm of the Romans and ruled over them. The realm was established under his control in the year 654 of the era of the Greeks…. (etc)

The new material is 16 pages in translation, so not a small discovery.  It renders obsolete much of the existing scholarship.  The authors discuss the date of the Julian Romance.  They demonstrate a word-for-word connection with the Miscellaneous Chronicle of 640, which therefore moves the date of composition forward from the early 6th century well into the 7th, and locates events around the reign of Heraclius.

It’s a fine article, and a wonderful discovery for 2023.  It goes to show that there is still stuff out there!  Never assume that even a major and well-studied collection knows everything that is on its shelves.  The age of discovery is not over.  It just requires effort, and a bit of luck.

The discovery also shows the huge value of digitisation of manuscripts.  The Vatican have the best programme for mass digitisation known to me.  But isn’t it time that some other major manuscript libraries did the same?


From my diary

My apologies for the silence.  Unfortunately I caught Covid at the start of the month, and I have been out of action ever since.  The symptoms are no worse than a cold, but I’m still testing positive even now.  I gather that rushing back to work afterwards is one of the primary causes of Long Covid, which I am not anxious to acquire.  So I shall come back slowly!  Be careful out there.


More experiments with Amharic and technology

In my last post I found that it was possible to turn a PDF full of images of Amharic text into recognised electronic text using Google Drive, and then get some translation of the results into English using Google Translate.

There were some extremely interesting comments made on the post, which I have been reading.  I have also prepared a PDF of the whole text of the Life of Garima by Yohannes, and run that through the Google Drive process.

Where we started was in trying to read a passage of this text, in which – supposedly – God stopped the sun so that St Garima could copy the bible in one day.  The summary of the work given by Rossini (instead of a proper translation, drat him) indicates that this was on lines 356-60 of his text, which turns out to be the last line of p.161 and the first three of p.162.  Here they are:

The output from the OCR is good, but you still have to compare the characters carefully.  Errors can often be picked up just by dumping the raw scan output into Google Translate, which shows things like numerals.

Here we have a character that is plainly wrong, and coming out as a numeral “4”.  It looks like an “o” with a hat and two dots under.  The two dots under are legs in another copy of Rossini.

I’m guessing that it’s a “ge” character, from looking at the Wikipedia article, but I can’t be sure. The script isn’t an alphabet, but a syllabary, based on syllables.  Each character is a consonant followed by a vowel, which makes for a lot more characters.  There’s a table of the characters on the Wikipedia article, consonants down the left, vowels across the top.  I’ve not really looked at this.
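A rough way to catch this sort of thing without going through Google Translate, assuming Python is to hand, is to scan the OCR output for any character that Unicode classes as a number.  This is only a sketch: the function name `flag_numerals` is invented for the illustration, and a hit is merely a candidate for checking against the page image, since Ethiopic texts do contain genuine numerals.

```python
import unicodedata


def flag_numerals(text):
    """Return (position, character, Unicode name) for every numeric character.

    Unicode general categories beginning with "N" cover Western digits
    as well as the Ethiopic numerals, so both kinds are reported.
    """
    hits = []
    for position, ch in enumerate(text):
        if unicodedata.category(ch).startswith("N"):
            hits.append((position, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


if __name__ == "__main__":
    # First few characters of the OCR output quoted in this post.
    for hit in flag_numerals("፬ ፡ ወኮነ ፡ በአሐቲ"):
        print(hit)
```

Each reported position can then be compared by eye with the printed page, which is still the only way to decide whether the numeral is real or a misread letter.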

The Google Translate output is also interesting because of the choice of “detected language” – Tigrinya, rather than Amharic.  If you force it to Amharic, you get a lot less meaning.

One awkward part of using Google Drive to do the OCR is that it doesn’t preserve the line breaks.  That makes comparing the lines more awkward.   So you have to manually do this:

፬ ፡ ወኮነ ፡ በአሐቲ ፡ ዕላት ፡ ወነሥአ ፡ መጽሐፈ ፡ ወቀለመ ፡ ወወጠነ፡
ይጽሐፍ ። ወተንሥአ ፡ ለጸሎት በሰርክ ። ወጸሐፉ ፡ ሎቱ : መላእክት ፡ ወንጌ ለ ፡
በ፬ ፡ ሰዓት ፡ ወትርጓሜሁ ። ወመላእክተ ፡ እግዚአብሔር ፡ ወትረ ፡ ይት ለአክዎ ፡
ወእግዚእነሂ ፡ ክርስቶስ ፡ ያንሶሱ ፡ ምስሌሁ ። ወተሰምዐ ፡ ዜናሁ :
ውስተ ፡ ኵሉ ፡ ሀገር ። ጸሎቱ ፡ ወበረከቱ ፡ የሀሉ ፡ ምስሌነ ።

The Wikipedia article mentioned earlier gave me a list of punctuation marks.  There are two sorts of punctuation visible in here.  The colon mark is actually word division, which means that some words above go over two lines.  I’ve chosen not to split words above.  The double colon mark “::” is the full stop.  Interestingly Google Translate gives different results if you remove the spaces!

Going through the electronic text, removing spaces, I notice that sometimes the word-separator isn’t detected by the OCR.  So I added that in.  Sometimes it put a Roman colon instead, so I replaced that.  Finally I split on sentence:

፬፡ወኮነ፡በአሐቲ፡ዕላት፡ወነሥአ፡መጽሐፈ፡ወቀለመ፡ወወጠነ፡ይጽሐፍ።
ወተንሥአ፡ለጸሎት፡በሰርክ።
ወጸሐፉ፡ሎቱ፡መላእክት፡ወንጌለ፡በ፬፡ሰዓት፡ወትርጓሜሁ።
ወመላእክተ፡እግዚአብሔር፡ወትረ፡ይትለአክዎ፡ወእግዚእነሂ፡ክርስቶስ፡ያንሶሱ፡ምስሌሁ።
ወተሰምዐ፡ዜናሁ፡ውስተ፡ኵሉ፡ሀገር።
ጸሎቱ፡ወበረከቱ፡የሀሉ፡ምስሌነ።
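The mechanical part of this clean-up could be scripted.  What follows is a minimal sketch in Python, not what I actually did (I did it by hand); the function name `clean_ocr` is made up for the purpose.  It joins the OCR lines, swaps any Roman colon for the Ethiopic word-separator, strips the spaces, and splits after each full stop.  It cannot tell a space that hides a missing word-separator from a space in the middle of a broken word, so those still have to be fixed by eye.

```python
import re

WORDSPACE = "\u1361"  # Ethiopic word-separator, the colon-like mark
FULL_STOP = "\u1362"  # Ethiopic full stop, the double-colon mark


def clean_ocr(text):
    """Normalise OCR output and return a list of sentences."""
    # The OCR line breaks carry no meaning, so treat them as spaces.
    text = text.replace("\n", " ")
    # The OCR sometimes emits a Roman colon instead of the word-separator.
    text = text.replace(":", WORDSPACE)
    # Remove every remaining space.
    text = re.sub(r"\s+", "", text)
    # One sentence per item, keeping the full stop where there is one.
    return re.findall(f"[^{FULL_STOP}]+{FULL_STOP}?", text)


if __name__ == "__main__":
    sample = "ወተሰምዐ ፡ ዜናሁ :\nውስተ ፡ ኵሉ ፡ ሀገር ። ጸሎቱ ፡ ወበረከቱ ፡ የሀሉ ፡ ምስሌነ ።"
    for sentence in clean_ocr(sample):
        print(sentence)
```

Any sentence that still looks odd after this pass is probably one where the OCR dropped a word-separator altogether.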

Then I ran the cleaned text through Google Translate again and got this:

But this still is not good enough to do much with.  If we didn’t have an idea what the text said, this would not tell us.

All this fiddling about would certainly get you into contact with the language, and start you on a journey to learning it.  But it’s not good enough a translation for other purposes, although intriguing.

One suggestion that was made in the comments to the last article was that ChatGPT gave better results.  The output quoted was indeed produced, and was very smooth and seemed to be a series of liturgical prayers.  But… I don’t think that this is actually the content.  These AI tools are really only an improved version of the text prediction tools you get on messaging on a mobile phone.  So it was pumping out garbage.

Anyway I tried it on this passage, and it crashed GPT very effectively!  At the moment I can’t get any reply of any sort, not even to “hello”.

I don’t think that I will do more here.  Clearly the technology is almost, but not quite, good enough to be useful.


Listening to Hard Rock helps Egyptologist make Middle-Kingdom Papyrus Discovery

In 2010 a doctoral student at Johns Hopkins University in Baltimore named Marina Escolano-Poveda was present at a conference of Egyptologists in Mallorca. While there she visited the small and obscure local museum, where she discovered some papyrus fragments written in hieratic.  Over time she studied these and found them to belong to the early Middle Kingdom.  In the end, her repeated efforts to find out what they were bore fruit.  She was working on her doctorate, on a different period, so work on the fragments had to take place at night. She recalls the moment:

I remember perfectly the moment when I had this revelation. It must have been 3 o’clock in the morning. I was listening to the song ‘Salir’ by the group Extremoduro. I then realised that the papyrus I was looking at was by the same author as that of La Dispute![1]

The fragments belonged to a famous text, the Dialogue of a Man with his Ba, preserved as part of a set of 4 papyrus rolls in Berlin.  These came from a notorious 19th century dealer at the court of the khedive Ismail.  His real name is unknown, but he called himself Jean d’Anastasie.  He offered them for sale at Sotheby’s in March 1837, claiming that they originated at Thebes/Luxor, as no doubt they did.  This was the same dealer who also uncovered the library of Greek magical papyri.  The four rolls were purchased in 1842 by Lepsius, who lodged them in the Berlin museum. The roll containing the Dialogue is today P. Berlin 3024.

There was some damage to the start of the roll, as is not uncommon.  So it is likely that some portions split off, and were subsequently sold separately by the enterprising dealer.  Some have since been identified in the Amherst collection. Nothing is known about how they came to Mallorca.

The fragments are P. Mallorca I and II.  I found a photograph on this paywalled site:

Papyrus Mallorca I – fragment of P. Berlin 3024, Dialogue of a man with his Ba.

The find was published, and thankfully this is both in English, and open-access: Marina Escolano-Poveda, “New Fragments of Papyrus Berlin 3024,” in: Zeitschrift für Ägyptische Sprache und Altertumskunde, 144 (2017) pp. 16-54 (accessible here).

There’s stuff out there, people.  If you look, you find it.  It doesn’t matter who you are, so long as you persist.

  1. [1] There are various tabloid accounts online.  This one is from Euronews, “Alicante Egyptologist Solves Enigma Of 4,000-Year-Old Manuscript”.