“Scriptor Syrus”, the scholiast on Dionysius bar Salibi: oft-quoted, but from where?

Something that comes around every year at this time is a quotation from a certain “Scriptor Syrus,” supposedly about the origins of Christmas.  Often it is said to be 4th century.  This is the usual wording:

It was a custom of the pagans to celebrate on the same Dec. 25 the birthday of the sun, at which they kindled lights in token of festivity …Accordingly, when the church authorities perceived that the Christians had a leaning to this festival, they took counsel and resolved that the true Nativity should be solemnized on that day.

There is an excellent post at Andrew McGowan’s blog here about this “quote”, and the many errors and falsehoods involved, and a mention by Tom Holland.  It is, in fact, a marginal note by an unknown Syrian writer (= “scriptor syrus”) in a manuscript of the works of Dionysius bar Salibi, a 12th century Syriac author.

There is a somewhat fuller translation by Ramsay MacMullen, Christianity and Paganism in the Fourth to Eighth Centuries, Yale (1997), p.155:

A twelfth-century Syrian bishop explained,

“The reason, then, why the fathers of the church moved the January 6th celebration [of Epiphany] to December 25th was this, they say: it was the custom of the pagans to celebrate on this same December 25th the birthday of the Sun, and they lit lights then to exalt the day, and invited and admitted the Christians to these rites. When, therefore, the teachers of the church saw that Christians inclined to this custom, figuring out a strategy, they set the celebration of the true Sunrise on this day, and ordered Epiphany to be celebrated on January 6th; and this usage they maintain to the present day along with the lighting of lights.”[8]

p.244, 8.  Dionysius Bar-Salibi, bishop of Amida, whom I quote from the Latin of G. S. Assemani, Bibliotheca orientalis Clementino-Vaticanae 2 (Rome 1721) 164; and compare such other festivals as that of the Natale Petri of February, particularly in Fevrier (1977) 515, who protests against apologetic arguments to insulate the choice of date from any pagan antecedents or competition.

The overt polemical purpose of the modern author needs no discussion. But the reference is a useful entry-point to try to find the actual source.

What work are we talking about?  What manuscript?

Assemani was an Eastern Christian who published a whole series of extracts from eastern authors, in the original language, in his Bibliotheca orientalis Clementino-Vaticanae, with commentary and translation in Latin.  These are now online, and volume 2, page 164 may be found at Google books here.  The text is in two columns.  The original language is given, a text in italics is the translation, and Assemani’s own words are in normal text.

Page 164 from Bibliotheca Orientalis Clementino-Vaticanae, vol. 2 (1721)

Assemani introduces our scholiast thus (Google translate follows):

Hunc tamen Armenorum ritum, quem hic rejicit Bar-Salibaeus, anonymus nescio quis Syrus probare contendit in margine apud eundem Bar-Salibaeum fol. 43. a tergo, his verbis:

However an anonymous Syrian, I don’t know who, tries to prove this Armenian rite, which Bar-Salibaeus here rejects, in the margin in the same Bar-Salibaeus fol. 43. on the back, in these words:

Then follows the Syriac text, and then the Latin translation prepared by Assemani:

Mense Januario natus est Dominus eodem die quo Epiphaniam celebramus, quia veteres uno eodemque die festum Nativitatis & Epiphaniae peragebant, quoniam eadem die natus & baptizatus est. Quare hodie etiam ab Armenis uno die ambae festivitates celebrantur. Quibus adstipulantur Doctores, qui de utroque festo simul loquuntur. Causam porro, cur a Patribus praedicta solemnitas a die 6. Januarii ad 25. Decembris translata fuit, hanc fuisse ferunt. Solemne erat ethnicis hac ipsa die 25. Decembris festum ortus solis celebrare; ad augendam porro diei celebritatem, ignes accendere solebant: ad quos ritus populum etiam Christianum invitare & admittere consueverant. Quum ergo animadverterent Doctores ad eum morem Christianos propendere, excogitato consilio eo die festum veri Ortus constituerunt; die vero 6. Januarii Epiphaniam celebrari jussere. Hunc itaque morem ad hodiernum usque diem cum ritu accendendi ignis retinuerunt. Et quoniam sol duodecim gradus ascendit Dominus natus est hac die tertiadecima, & sicut S. Ephram docet, Solis justitiae & duodecim Apostolorum ejus mysteria repraesentat. Numerus, inquit S. Doctor, denarius perfectus est. Die decima Martii uterum intravit. Numerus item senarius perfectus est. Die 6. Januarii utramque partem nativitas ejus reconciliavit.

In the month of January, the Lord was born on the same day on which we celebrate the Epiphany, because in the olden days the festival of Nativity and Epiphany was held on the same day, since he was born and baptized on the same day. Therefore, even today, both festivals are celebrated on one day by the Armenians. The Doctors [of the Church] support this, who speak of both festivals at the same time. Furthermore, the reason why the aforesaid solemnity was transferred by the Fathers from the 6th of January to the 25th of December, they say was this. It was traditional for the pagans to celebrate the birth of the sun on this very day, the 25th of December; to further enhance the celebration of the day, they used to light fires: to which rites they were accustomed to invite and admit even Christian people. When, therefore, the Doctors noticed that the Christians were inclined to that custom, they devised a plan and established on that day the feast of the true Rising [i.e. Sunrise]; but on the 6th of January they ordered that the Epiphany be celebrated. So they have kept this custom to this day with the ritual of lighting fires. And since the sun has risen twelve degrees, the Lord was born on this thirteenth day, and as St. Ephraim teaches, he represents the mysteries of the sun of justice and his twelve apostles. The number, says the Holy Doctor, is a perfect denarius. On the tenth of March he entered the womb. The same number is perfect. On the 6th of January his birth reconciled both parties.

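I didn’t understand the bit about “denarius” at first, but it is not the coin.  “Numerus denarius” is simply “the number ten” (hence the tenth of March), and “numerus senarius” is “the number six” (hence the 6th of January), so Google Translate has garbled both sentences.  But it doesn’t matter for our purposes.  Assemani then continues his work by introducing a different extract from fol. 125 concerning Caiaphas, of no relevance here.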

So these words, by the anonymous “Syrian writer”, are on folio 43v of the manuscript used by Assemani.

But what is this a manuscript *of*?  What text?

Looking up to page 161, I see that Assemani is quoting material from folio 37v of this manuscript of a work by Dionysius bar Salibi, about the “progenitores” of Christ, from Luke’s gospel:

Quos Lucas refert Christi progenitores, eos ex Africano, Eusebio, Nazianzeno, Sarugensi, Graecisque & Syriacis Codicibus sic enumerat fol. 37. a tergo:

He enumerates those whom Luke gives as progenitors of Christ, from Africanus, Eusebius, Nazianzen, [Jacob of] Sarug, from Greek and Syriac manuscripts, on fol. 37v:

He then continues with a passage from folio 161, on the nativity of Christ, before adding the material above from the scholiast.  It’s odd that this jumps about like this.

On pp.157-8, it all becomes clear.  Assemani is giving extracts from the Commentary on the Four Gospels by Dionysius bar Salibi, and he is extracting this material from a Vatican manuscript:

Commentaria in Testamentum Vetus & Novum. Et quidem expositio in quatuor Evangelia exstat in Cod. Syr. Vatic. 11. & in Cod. Syr. Clem. Vat. 16. a fol. 27. usque ad fol. 263. ejusque duo exemplaria in Bibliotheca Colbertina haberi testatur Renaudotius tom. 2. Liturg. Orient. pag. 454.

Commentaries on the Old and New Testaments. And indeed the exposition on the four Gospels exists in Cod. Syr. Vatic. 11 and in Cod. Syr. Clem. Vat. 16, from fol. 27 up to fol. 263. Renaudot testifies (Liturg. Orient., vol. 2, page 454) that two copies of it are held in the Bibliotheca Colbertina [i.e. now in the French National Library].

So… let’s take it further.  A lot of Vatican manuscripts are online.  But when I use the excellent Wiglaf guide to Vatican mss, and look at Vat. Syr. 11 and Vat. Syr. 16 (I don’t think there is a “Clementine” subdivision of Syriac manuscripts), I find that neither has scholia on fol. 43v.  Someone has messed up the numbering of the manuscripts since!  It turns out that Assemani and his son did so, later in life, in the 1750s.  The marvellous Syri.ac website tells me of a concordance by Hyvernat, “Vatican Syriac Mss Old And New Press Marks” (1903), online here.

But this too is useless.  The old “Vat. Syr. 1” became Vat. Syr. 19, online here, but there is still no marginal note on folio 43v.  Hyvernat does not explain the “Clem.” collection at all.

Thankfully Hyvernat tells us about a catalogue composed by Assemani and son, and Syri.ac gives links to text-searchable PDFs!

Looking at these, if we do a text search for “Salib”, we find that manuscript 156 contains Dionysius bar Salibi.  But… no scholion on fol. 43v.  In fact the manuscript has been divided into two parts, and part 2 is also online here.

The catalogue for Vat. Syr. 156 says the Luke portion begins on fol. 188, which doesn’t sound right.  But at the end it says “see ms 155, fol. 161v”.  And when I look at the catalogue entry for Vat. Syr. 155, it too contains Dionysius bar Salibi!  The text search had missed it.  Are these two, perhaps, the two manuscripts that Assemani used, now placed side by side?  Hyvernat says to look at the start of the catalogue entry, where the old shelfmark may be given.  And…

CLV. Codex in fol. bombycinus, foliis constans 294. Syriacis recentioribus literis exaratus, inter Syriacos Codices, a nobis in Vaticanam Bibliothecam inlatos, olim Decimus sextus: quo continentur:

155.  Folio manuscript on cotton-paper, consisting of 294 leaves, written in modern Syriac letters, one of the Syriac manuscripts brought by us into the Vatican Library, once the Sixteenth: which contains:

So this is indeed the one-time manuscript Vat. Syr. 16!  Hyvernat expresses himself bitterly toward the authors of the catalogue (“of no practical use”), and, after more than two hours working on this, I too am less than chuffed with them.  The manuscript was never simply “Vat. Syr. 16”; prior to the reorganisation it was, in fact, Vat. Syr. Assemani 16; and the other manuscript, 156, was Vat. Syr. Assemani 46.  Aaargh!
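For anyone repeating this hunt, the two fiddly bits (reading the Roman-numeral entry numbers in the catalogue, and mapping an old “Assemani” shelfmark to the modern one) can be sketched in a few lines of Python.  This is only an illustration: the function names are made up, and the concordance holds just the two pairs established in this post, not Hyvernat’s full list.

```python
# Values of the Roman numeral letters used in the catalogue's entry numbers.
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    """Convert a Roman numeral such as 'CLV.' to an integer."""
    total = 0
    prev = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for ch in reversed(numeral.upper().rstrip(".")):
        value = ROMAN[ch]
        if value < prev:
            total -= value
        else:
            total += value
        prev = value
    return total

# Hypothetical mini-concordance: only the two pairs worked out in this post.
OLD_TO_NEW = {"Assemani 16": 155, "Assemani 46": 156}

def new_shelfmark(old):
    """Return the modern Vat. Syr. shelfmark for an old one, or None if unknown."""
    number = OLD_TO_NEW.get(old)
    return f"Vat. Syr. {number}" if number is not None else None

print(roman_to_int("CLV."))          # 155
print(new_shelfmark("Assemani 16"))  # Vat. Syr. 155
```

Anything not in the little table simply comes back as None, which is the honest answer until the full concordance has been consulted.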

But … viewing Vat. Syr. 155 on folio 43v – there is a long scholion!  We’re there!  It matches!

Vatican Syr. 155, folio 43v – the scholion on Dionysius bar Salibi, Commentary on Luke, discussing the date of Christmas

One last wrinkle.  The catalogue (part 3, p.297) tells us that Luke is on fol. 160v onwards.  That is item 23 in this manuscript, which contains various texts.  So what is fol. 43v part of?  Well, item 21 is the commentary on Matthew, starting on folio 32, and continuing to fol. 148v.  Not Luke, as anyone would infer from the original in the Bibliotheca Orientalis, unless they were very careful.

So this passage by “Scriptor Syrus” is, in fact, a scholion by some unknown person, on a passage in the Vatican Syr. 155 copy of Dionysius bar Salibi’s Commentary on Matthew.

It would be most useful to know exactly which passage of Dionysius bar Salibi is so annotated.  But there we must leave this.

Update: 24 Dec. 2023.  A useful comment from Syriacist Grigory Kessel is that Dionysius bar Salibi’s commentary on the gospels was printed in the CSCO series, with a Latin translation; and that the annotation above is against Dionysius’ comments on Matthew 2:1 (“Now when Jesus was born in Bethlehem of Judea in the days of Herod the king, behold, wise men from the East came to Jerusalem, saying,…”), and the relevant passage is here.  I imagine it relates to the paragraph on p.67, l.12 onwards, where 25 December is specified.  Thank you!

Working with Bauer’s 1783 translation of Bar Hebraeus’ “History of the Dynasties”

Following my last post, I’ve started to look at the PDFs of Bauer’s 1783-5 German translation of Bar Hebraeus’ History of the Dynasties.

It must be said that the Fraktur print is not pleasant to deal with.  But it could be very much worse!  I’ve seen much worse.  Here’s the version from Google Books:

And here is the same page from the MDZ library:

I’ve tried running both through Abbyy Finereader 15 Pro.  Curiously the results are better, on the whole, from the higher resolution MDZ version.  I had expected that the bleed-through from the reverse might cause problems – and it may yet!  Even more oddly, the OCR on the “Plain Text” version of Google Books is better still.

But there is a problem with using Google Books in plain text mode.  There is no way to start part way through the book.  You will always be placed at the very start, and you can only navigate by clicking “Next page” or whatever it is.  This is not good news if you have 100 pages to click through before you get to where you want to be.

The opening portion of these world chronicles is always a version of the biblical narrative about the creation, followed by material from the Old Testament, combined with apocryphal material.  I may be alone here, but I have always found these parts of the narratives unreadable.  When I translated Agapius, I started with the time of Jesus, part way through.  I did the same with Eutychius. I only did the opening chapters at the end, after I had translated all the way from Jesus to the end of the book first.  I recall that it felt like wading through glue. I might have given up, except that I had already invested so much time in the project.

Starting in the time of Jesus immediately introduces us to familiar figures.  On page 88 of volume 1, the “Sixth Dynasty” starts, with Alexander the Great.  It ends on page 98 with Cleopatra.  Each section starts with a familiar name, one of the Ptolemies in most cases.

On page 99, dynasty 7 begins, after an introduction, with Augustus.  The dynasty ends on p.139 with Justinian.  Each ruler gets a paragraph, often only a few sentences.

It’s all do-able, clearly.  I’m not sure that I want to get into working on this book seriously, with the St Nicholas project still in mid-air.  But it’s not hard work, which is something!

An adventurer in Arab Christian Studies – Prof. Bartolomeo Pirone

None of the histories of Arabic Christian literature – Agapius, Eutychius, Yahya ibn Said al-Antaki, Al-Makin, Bar Hebraeus – exist in English translation.  This site has made some modest efforts to remedy this, by turning the French translation of Agapius and the Italian translation of Eutychius into English, and posting them online.  Judging from queries received, the effort has been worthwhile, and has drawn attention to both.  It was difficult to obtain a copy of the Italian translation, but eventually I located  and purchased one over the web from the Franciscan bookshop in Jerusalem, where it had plainly sat and gathered dust for many years.  The translator was a certain Bartolomeo Pirone, of whom I knew nothing.

Indeed how many of us are that aware of material in Italian?  Even though Google Translate handles Italian very well these days, few of us have any idea what is out there.  Yet there are invaluable translations of otherwise inaccessible patristic material.

A few days ago I became aware of a series of translations into Italian of Arabic Christian literature, the PCAC series.  This includes 30-odd texts from the literature of the Christians in the Near East, such as Theodore Abu Qurrah.  The region was occupied by Islam in the 7th century, and they were obliged to write in Arabic from the 9th century onwards, as the cultural pressure became irresistible.  But it is, at that period, a branch of Byzantine literature, and full of interest.

Much to my surprise, I discovered that the series was edited by none other than the same Dr Bartolomeo Pirone.  Now retired but still active, he was a full professor at the University of Naples L’Orientale, and lectured in Cairo and Beirut.  Judging from a google search, he has dedicated a portion of his life to making this literature known, in the most obvious way possible; by translating it into the vernacular, and gathering other scholars to do likewise. Indeed I have at this very instant just discovered that he also made a translation of Agapius into Italian![1]  But this does not exhaust his work, which also includes Muslim literature, and the interaction between Christianity and Islam.

Much of his work was published by the Franciscan Province of the Holy Land, known as the “Custody of the Holy Land”.  This in turn explains why a copy of his standalone translation of Eutychius was available in their bookshop in Jerusalem.  There is an article from 2018 at the Franciscan website here, celebrating his 40 years of research.

Prof. Bartolomeo Pirone

I would imagine that very few people in the English-speaking world have ever heard of Dr Pirone and his immensely valuable work on an area of literature known to very few.  But if you are at all interested in Arabic Christian literature, and especially if you – like myself – do not know any Arabic, then you need to know about his work.

[1] Agapio di Gerapoli, Storia universale, Terra Sancta (2013), ISBN 9788862401647.

Getting manuscript reproductions in the UK – important and useful court judgement?

Via Dr Bendor Grosvenor on Twitter, I learn of an interesting court case about “image fees”.  According to Dr. G, this is very good news for manuscript researchers, and historians in general, and also for those who want to download and post online images of out-of-copyright material.  Here’s his thread:

Those of us who’ve had to pay image fees will know the system relies on museums claiming copyright in their photos – irrespective of whether the art they’re photographing is itself in copyright. (In the UK, copyright lasts for 70 years after the death of the artist).  In other words, a painting by John Constable may be long out of copyright, but taking a photo of it creates a new copyright in that photo. By restricting the taking or sharing of other photos, museums force us to use their own photos for publication, and thus charge large sums.

Copyright is the glue which holds the system together, otherwise, we’d be able to either take a photo from the museum’s website, or use a photo someone else has already paid for. The ‘copyright licence’ we buy prevents us from sharing the image for wider re-use.

In the UK, this copyright claim has for long been contentious. For example, under the 2019 EU Copyright Directive (Article 14), it is not possible to claim copyright in a straightforward reproduction of a work of art which is itself out of copyright (older than 70 years).  The relevant bit of Art. 14: “when the term of protection of a work of visual art has expired, any material resulting from an act of reproduction of that work is not subject to copyright or related rights unless the material resulting from that act of reproduction is original in the sense that it is the author’s own intellectual creation.”

In other words, take a straightforward photo of the Constable painting = no new copyright in your photo. But pose something in front of it, add an extra cow in Photoshop = new copyright in your photo.

For many of us, that EU Directive looked like the end to image fees in the UK – but Brexit happened just before ratification was required in member states.

In the UK, museums and image libraries relied on the UK’s Copyright, Designs and Patents Act 1988, which appeared to give copyright to your photo of the Constable simply because of the effort you took in taking it. This was called the ‘sweat of the brow’ concept.  In other words, you did not need to demonstrate any creative effort, or add any personal touch, to claim your copyright. BUT, since 1988, various EU and UK judgements have eroded the ‘sweat of the brow’ concept.

But the situation was still not entirely clear, until now, with an Appeal Court judgement this November (THJ v Sheridan [2023] EWCA Civ 1354). Here’s the full judgement.

(And here (to which I am indebted) is Prof. Eleonora Rosati @eLAWnora  commentary on the judgement.)

https://ipkitten.blogspot.com/2023/11/originality-in-copyright-law-objective.html

Para 16 rules that, for copyright to pertain: ‘What is required is that the author was able to express their creative abilities in the production of the work by making free and creative choices so as to stamp the work created with their personal touch.’

So, taking a straightforward photo does not count, nor does getting the lighting right or other labour of a ‘technical’ kind.

What does this mean for the image fee system which strangles so much art historical scholarship, prevents the public learning about the art they own, and acts as a tax on knowledge? In the UK, it means it’s over.  In fact, because in THJ v Sheridan, the judges said the ‘skill and labour’ test has not been valid *since 2004*, it suggests that all those ‘image licences’ which have been sold relying on copyright have been invalid, and (I suspect?) mis-sold.

Those of us who’ve been campaigning against image fees have been arguing (with hard evidence) that the system doesn’t raise meaningful revenue for museums (and in many cases, costs them money).  But to little avail, as far as museums are concerned. They just carried on charging, insisting they had copyright, which encouraged publishers to insist we kept buying ‘licences’. And now we know that for historic, 2D artworks it’s basically been a scam.

What do we do now? I suppose museums can carry on restricting the availability of decent photos. That’s why Tate’s website only lets us see low-res photos (of the art we own).  But without the glue of copyright, the system must collapse, because there’s nothing to stop images being re-used.  So, if you’re able to take a tolerably good photo of a historic artwork from online for your publication, do so.  Don’t let publishers and journals bully you into buying ‘licences’. Don’t agree to label photos (C) when no copyright exists.  And if you’re a museum director or trustee, think hard about your museum mis-selling licences for the last two decades.

Note that this is clearly downstream of the EU ruling.  This now leaves the USA behind, at least until some public-spirited person clarifies the law there.

The actual court case was about whether a GUI could be copyrighted, so it isn’t really the same thing.  But the case is about “originality” in copyright, and this is what lies behind the claim of museums that a photograph is an “original work” and therefore in copyright. There is discussion of the case on these sites:

UK Court of Appeal rules on copyright in GUIs

Originality in copyright – a review of THJ v Sheridan

Let us hope that the judgement does indeed mean what Dr G. says that it does, and frees up public domain material for the use of us all.  I suspect the foot-dragging will be immense, tho.

More experiments with Amharic and technology

In my last post I found that it was possible to turn a PDF full of images of Amharic text into recognised electronic text using Google Drive, and then get some translation of the results into English using Google Translate.

There were some extremely interesting comments made on the post, which I have been reading.  I have also prepared a PDF of the whole text of the Life of Garima by Yohannes, and run that through the Google Drive process.

Where we started was in trying to read a passage of this text, in which – supposedly – God stopped the sun so that St Garima could copy the bible in one day.  The summary of the work  given by Rossini (instead of a proper translation, drat him), indicates that this was on lines 356-60 of his text, which turns out to be the last line of p.161 and the first three of p.162.  Here they are:

The output from the OCR is good, but you still have to compare the characters carefully.  Errors can often be picked up just by dumping the raw scan output into Google Translate, which shows things like numerals.

Here we have a character that is plainly wrong, and coming out as a numeral “4”.  It looks like an “o” with a hat and two dots under.  The two dots under are legs in another copy of Rossini.

I’m guessing that it’s a “ge” character, from looking at the Wikipedia article, but I can’t be sure. The script isn’t an alphabet, but a syllabary, based on syllables.  Each character is a consonant followed by a  vowel, which makes for a lot more characters.  There’s a table of the characters on the Wikipedia article, consonants down the left, vowels across the top.  I’ve not really looked at this.
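One way to check what the OCR has actually produced, rather than guessing from the shape, is to ask Python for the Unicode name of each character.  What follows is only a minimal sketch using the standard unicodedata module; the describe function and the sample string are my own illustration, not part of any tool mentioned here.  For the character ፬, which appears in the OCR output further down, it reports ETHIOPIC DIGIT FOUR.  That suggests the “4” may be a genuine Ethiopic numeral rather than a misread letter, although I can’t confirm this against the printed page.

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name) for each non-space character."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        rows.append((ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return rows

# A few characters copied from the OCR output: a numeral, the word
# separator, one word, and the full stop.
sample = "፬ ፡ ወኮነ ።"
for ch, code, name in describe(sample):
    print(ch, code, name)
```

The names of the syllable characters also give the consonant and vowel (for example “ETHIOPIC SYLLABLE WA”), which is a quicker way into the Wikipedia table than matching shapes by eye.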

The Google Translate output is also interesting because of the choice of “detected language”: Tigrinya, rather than Amharic.  If you force it to Amharic, you get a lot less meaning.

One awkward part of using Google Drive to do the OCR is that it doesn’t preserve the line breaks.  That makes comparing the lines more awkward.   So you have to manually do this:

፬ ፡ ወኮነ ፡ በአሐቲ ፡ ዕላት ፡ ወነሥአ ፡ መጽሐፈ ፡ ወቀለመ ፡ ወወጠነ፡
ይጽሐፍ ። ወተንሥአ ፡ ለጸሎት በሰርክ ። ወጸሐፉ ፡ ሎቱ : መላእክት ፡ ወንጌ ለ ፡
በ፬ ፡ ሰዓት ፡ ወትርጓሜሁ ። ወመላእክተ ፡ እግዚአብሔር ፡ ወትረ ፡ ይት ለአክዎ ፡
ወእግዚእነሂ ፡ ክርስቶስ ፡ ያንሶሱ ፡ ምስሌሁ ። ወተሰምዐ ፡ ዜናሁ :
ውስተ ፡ ኵሉ ፡ ሀገር ። ጸሎቱ ፡ ወበረከቱ ፡ የሀሉ ፡ ምስሌነ ።

The Wikipedia article mentioned earlier gave me a list of punctuation marks.  There are two sorts of punctuation visible in here.  The colon mark is actually word division, which means that some words above go over two lines.  I’ve chosen not to split words above.  The double colon mark “::” is the full stop.  Interestingly Google Translate gives different results if you remove the spaces!

Going through the electronic text, removing spaces, I notice that sometimes the word-separator isn’t detected by the OCR.  So I added that in.  Sometimes it put a Roman colon instead, so I replaced that.  Finally I split on sentence:

፬፡ወኮነ፡በአሐቲ፡ዕላት፡ወነሥአ፡መጽሐፈ፡ወቀለመ፡ወወጠነ፡ይጽሐፍ።
ወተንሥአ፡ለጸሎት፡በሰርክ።
ወጸሐፉ፡ሎቱ፡መላእክት፡ወንጌለ፡በ፬፡ሰዓት፡ወትርጓሜሁ።
ወመላእክተ፡እግዚአብሔር፡ወትረ፡ይትለአክዎ፡ወእግዚእነሂ፡ክርስቶስ፡ያንሶሱ፡ምስሌሁ።
ወተሰምዐ፡ዜናሁ፡ውስተ፡ኵሉ፡ሀገር።
ጸሎቱ፡ወበረከቱ፡የሀሉ፡ምስሌነ።
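The clean-up steps described above can be roughly sketched in Python.  This is an illustration only, with made-up function names, and it assumes that any Roman colon in the OCR output is a misread word separator.  It cannot decide whether a leftover space is a word that the OCR split in two or a separator that it missed, so it merely flags those sentences for checking by eye against the printed page.

```python
import re

WORDSPACE = "\u1361"  # the Ethiopic word separator (looks like a colon)
FULL_STOP = "\u1362"  # the Ethiopic full stop (looks like a double colon)

def tidy_ocr(raw):
    """Normalise raw OCR output and return it as a list of sentences."""
    # Assumption: a Roman colon is always an OCR slip for the word separator.
    text = raw.replace(":", WORDSPACE)
    # The line breaks from the OCR are not meaningful, so join everything up.
    text = " ".join(text.split())
    # Drop the spaces that sit either side of the two punctuation marks.
    text = re.sub(rf"\s*([{WORDSPACE}{FULL_STOP}])\s*", r"\1", text)
    # Split after each full stop, keeping the mark with its sentence.
    return [s for s in re.split(rf"(?<={FULL_STOP})", text) if s]

def needs_review(sentence):
    """A leftover space means a split word or a missing separator."""
    return " " in sentence

raw = "ወተንሥአ ፡ ለጸሎት በሰርክ ። ወጸሐፉ ፡ ሎቱ : መላእክት ፡"
for sentence in tidy_ocr(raw):
    print(sentence, "<- check by eye" if needs_review(sentence) else "")
```

In this sample the first sentence is flagged, because the OCR missed the separator between the last two words; the second comes out clean once the Roman colon has been replaced.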

And run it again and I get this:

But this still is not good enough to do much with.  If we didn’t have an idea what the text said, this would not tell us.

All this fiddling about would certainly get you into contact with the language, and start you on a journey to learning it.  But it’s not a good enough translation for other purposes, although intriguing.

One suggestion that was made in the comments to the last article was that ChatGPT gave better results.  The output quoted was indeed produced, and was very smooth and seemed to be a series of liturgical prayers.  But… I don’t think that this is actually the content.  These AI tools are really only an improved version of the text prediction tools you get on messaging on a mobile phone.  So it was pumping out garbage.

Anyway I tried it on this passage, and it crashed GPT very effectively!  At the moment I can’t get any reply of any sort, not even to “hello”.

I don’t think that I will do more here.  Clearly the technology is almost, but not quite good enough to be useful.

Is it possible to read editions of Amharic texts? An experiment

In my last post I mentioned how the Life of St Garima in Ethiopian was printed by Rossini, but without a translation.  In fact it has never been translated into any modern language, to my knowledge.  I don’t know any Ethiopian, and I doubt that I ever will.

But we live in an age of wonders, when it comes to unfamiliar languages.

So… is it possible to work with Ethiopian language editions, even if you know no Ethiopian?  What about Google Translate?  Ethiopian is in this heavy unfamiliar script.  Is there OCR for this?  If you can scan Rossini’s edition, can you pop it into Google Translate and get the English?

There are two sorts of Ethiopian out there, I know.  There is Ge`ez, or classical Ethiopian; and there is Amharic, the modern dialect.  Rossini printed his text from a 19th century manuscript.  So it seems likely that this is in Amharic.

A quick Google confirmed; Google Translate knows Amharic!  A bit of googling found me an Amharic news website online, here.  I’m using Chrome, so all I had to do was right-click anywhere and select “Translate to English” and the whole website was rendered into some sort of English.  And… it worked!!  Yay me!  It’s obviously not 100%, but it’s way better than 0%!

So what about OCR?  I was sad to see that Abbyy Finereader apparently doesn’t support Amharic.  That’s a blow.  It was developed originally to handle Cyrillic, so it certainly has the capability.  But it’s not offered.  Drat.

A bit of googling brought me to a dubious-looking website here, claiming to offer a selection of tools which could do Amharic OCR.  The prose felt a bit machine-generated, so I worried that it was bunk, or worse, a malicious site.  But the first option was… Google Drive.

I never knew this, but it seems that, if you upload a PDF containing an image of text, and then open it in Drive as a Google Docs document, it OCRs the content.

Well, I thought, let’s give it a try.  So I extracted the first page of Rossini’s edition, using Adobe Acrobat Pro 9 – no flashy latest-edition stuff going on here!  Here’s a pic:

Then I uploaded it, and opened as a Google document.  And … it just treated the Amharic as an image.  Dang!  But I noticed that it did indeed OCR the Italian at the top of the page!

This is supposed to work.  So I thought maybe I should work over the image a bit.  I imported the one-page PDF into Abbyy Finereader 15, and chopped off the Italian at the top, and the critical apparatus at the bottom.  I then used the image editor in Finereader to “whiten the background”.  This can be flaky, but this time it worked fine, and I got a pure white background.   And I got this:

(I’ve just seen the marginal notes, which I need to chop off as well, so I’ll have to go round the loop again)

I exported the image as a PNG, and I used Acrobat again to create a PDF from the image.  Then I uploaded the new PDF to Google Drive, and opened it as a Google Docs document.  And… it worked!  Sort of…

በስመ : አብ : ወወልድ ‘ ወመንፈስ ፡ ቅዱስ ፡ ፩ ፡ አምላከ ፡ ላዕሌሁ ፡ ተወ ከልኩ፡ ወቦቱ ፡ አመንኩ ፡ እስከ ፡ ላዓለመ ፡ ዓለም ፡ አሜን ።

ድርሳን ፡ ዘደረሰ ፡ ቅዱስ ፡ ዮሐንስ ፡ ኤጲስ ፡ ቆጶስ ፡ ዘአክሱም o ፡ በእንተ ዕበዩ ፡ ወክብሩ ፡ ለቅዱስ ፡ ይስሓቅ = ወይቤ ፤ ስምዑ ‘ ወልብዉ ፡ ኦአኀውየ 5 ፍቁራንየ ፡ ዘእነግረከሙ ። ርኢኩ ፡ ብእሲተ ፡ እንዘ ፡ ይዘብጥዋ ፡ ዕራቃ ወእንዘ ፡ ይሀርፉ ፡ ላዕሌሃ ፡ ወላዕለ ፡ እግዝእትነ ፡ ማርያም ፡ እንዘ ፡ ይብሉ በእንተ ፡ ወልዳ ፡ ክርስቶስ ፤ እምብእሲት ፡ ኪያሁ : ኢተወልደ ፣ ይብሉ ፡ እላ ፡ ኢየአምኑ ፡ በክርስቶስ = ወኮንኩ ፡ እንዘ ፡ እረውጽ ፡ ወአኀዝኩ እስዐም ፡ ታሕተ ፡ እገሪሃ ፡ ለይእቲ ፡ ብእሲት ፡ እንዘ ፡ ትብል ፤ እወ ▪በዝ ፡ አንቀጽ ፡ ወፅአ ፡ ንጉሠ ፡ ሰማያት ፡ ወምድር ። ወሶበ ፡ ትብል፡ ከሙዝ ፡ ወ

That’s… rather astonishing.  No idea what all that is, but it looks sort of right.  Let’s bear in mind that Rossini printed his edition in 1897.  This is not a modern typeface.  So this is rather good.

Next step was to paste it into Google Translate.  I set it to auto-detect the language, and pasted in the first bit.  And… it worked.  In fact it gave a really useful transcription into Roman letters as well, which makes it a LOT easier to manipulate the text.

OK, I’m cheating slightly.  The first time I uploaded, the translation ended at “Spirit”.  But this is a Google Translate bug – it sometimes omits the remainder of a sentence.  If you split the text with a line feed, you often get the rest.  And that’s what I did.  I worked out by experiment where I needed to be, and then I got the above.

I don’t quite believe the translation of the second sentence either.  I suspect I need to play with this a bit to work out what each word is.

I notice all those colons between every word.  It might help if I actually looked up the script online!

But I think you’ll agree that this is quite marvellous – I, who know absolutely nothing about the language, am getting something useful out!

Magic!


Why we should use Latin spellings of Greek names

A twitter thread by @EzhmaarSul from June 11, 2023, made some interesting points about the use in English of spellings like “Nikaia” rather than “Nicaea”. Few will have seen it, and I’ve never seen another public discussion of the subject.  So let’s give it a bit more visibility.

It went as follows:

Something I really hate about modern amateur historians (and which will leak into the professional class as these amateurs achieve doctorates) is the mixing of Greek and Latin spellings of Greek names. I’ve fallen prey to the same because of the ubiquity of amateur historians.

It starts with people wanting to use phonetic spellings of closer to the original Greek.

This urge comes from a deranged, nerdish desire to “well actually” people through text. Not as malicious as BCE, but coming from an adjacent place in petty souls.

“It’s Nikaia, not Nicaea!”

Not only does this look ugly and wrong in modern English, which is based on Latin rules of spelling and grammar, but it betrays a certain philistinism.

Greeks don’t use our alphabet! You’re broadcasting to us, “I don’t know how to pronounce this unless I spell it wrong.”

This also screws up scholarship. We have centuries of scholarship referring to Alexius Comnenus and John Palaeologus. Then along comes some redditor-turned-PHD writing about “Alexius Komnenos” and “Ioannes Palaiologos.”

And they invariably f**k it up.

“Oh Theodoros is obviously Theodore, so I’ll call him Theodore Laskaris in my paper… but Ioannes is exotic! I’ll call him Ioannes even though everyone recognizes it’s the Greek version of ‘John.’”

Don’t get me started on “Constantine.”

Just stick with the Latin and Anglophone spellings, you buffoons.

I think the author has a point. It does look hideous.  It does create a barrier.  It makes Greek history look barbarous.

There is a definite tendency among elites to create barriers for others in order to advance themselves, to order others around while feeling smug.  How else did we end up with printed Latin texts where the useful modern separation of consonant and vowel, of “i”/“j” and “u”/“v”, was actually and deliberately abandoned?  So… I rather agree.


Archivo Pertzii??

Here’s a reference guaranteed to waste the time of a researcher.  It’s from the Bibliotheca Hagiographica Latina:

This is some miracle material associated with the abbey of Brauweiler.  But… what is “Archivo Pertzii”?? I did find out, but it was enough work that I thought I’d put up a blog post, in case I forget and need to google it again.

At the moment, a google search points you right back to the BHL, seemingly the only publication in all history to know of this source.

The italics on Archivo are the key; clearly it’s the abbreviated title of a journal.  I know that German publications often referred to the editor of a journal, especially if he was someone famous, so “Pertzii” is probably the editor.  But you may search for “Archivo” as long as you like.  It’s bad enough with Google.  Imagine the bafflement of a 20th century researcher without it!

I tried “Pertzius”, and kept reading results, and this gave me what I needed.  Apparently this is Georg Heinrich Pertz, whoever he might have been.

I found that he edited the last volume, volume 12, of the “Archiv der Gesellschaft für Ältere Deutsche Geschichtkunde zur Beförderung einer Gesammtausgabe der Quellenschriften deutscher Geschichten des Mittelalters”.  So the “Archivo” is just a Latin ablative of the real name “Archiv”.  The BHL, in translating it into Latin, managed to obscure the sense completely.

Just to make it better, Pertz only edited some volumes.  It wasn’t his “Archiv” anyway.

The journal is online at Digizeitschriften.de, which has a useful page for the whole serial here, but a rather awkward interface to download any of it.  I ended up downloading the 9 pages individually and combining them locally. Then I found there was a button for “PDFs for individual items”, which I struggled with, but which finally gave me the same chunk in one file.  Why you can’t just download the volume I can’t imagine.  But I think these are teething troubles.  The site otherwise seemed well organised.

There is another copy online, at digitale-sammlungen.de, here.  But I only found this after more effort.


A new use for the parallel Latin translations in the Patrologia Graeca

Now that we have a very effective Latin translation in Google Translate, it occurs to me that we can also use this to read a great deal of patristic Greek.  For as we all know, the Greek fathers were all translated into Latin at the Renaissance and after, and were nearly always printed with parallel Latin translation, right the way down to the 19th century.

The obvious example of this is Migne’s Patrologia Graeca, our standard reference collection of texts.  It’s never been worth transcribing the Latin side.  But maybe now it is, just as a reading aid for those of us without fluent Greek?

This isn’t a new situation, in a way.  Indeed the reason why all these Latin translations even exist at all, is that knowledge of Greek was always rarer than fluency in Latin.  The translations are not always reliable; but something is better than nothing.

On the other hand it won’t be all that easy to OCR the Latin of Migne…

An excerpt from PG volume 78, column 226, a letter of Isidore of Pelusium in the Migne edition.

The low quality of Migne’s printing is something that we have all struggled with.

But there are workarounds.  The last time that I needed to OCR the Latin of Migne, I went and found the edition that he was reprinting on Google Books.  This, needless to say, was far better printed, and created many fewer errors in Finereader 15.

So it is possible, and it’s worth bearing in mind if we need to work with a large patristic text for which no modern translation exists.  Spend some time creating an electronic text of the Latin translation, and push it through Google Translate!

Update (5 Aug 2023): Note that it is actually possible to copy the OCR’d text from Google Books itself, for both the Greek and Latin sides in the PG.  Go to the page in question.  Hit the cut-and-paste icon so it goes dark grey, then drag a rectangle over the area that you want to copy the text from. As you release the mouse, a dialogue will pop up, and the text is in the top box. It looks as if it’s monotonic for Greek. The results are quite respectable.
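If you want to check whether a copied chunk of Greek really is monotonic, rather than judging by eye, something like the following Python sketch should do it.  It decomposes the text with the standard `unicodedata` module and looks for the accents and breathings that only polytonic Greek uses.  The function name `is_polytonic` and the choice of which marks to count are my own; treat it as a rough test, not a definitive one.

```python
import unicodedata

# Combining marks used in polytonic but not monotonic Greek:
# grave, perispomeni (circumflex), smooth breathing,
# rough breathing, and iota subscript.
POLYTONIC_MARKS = {"\u0300", "\u0342", "\u0313", "\u0314", "\u0345"}


def is_polytonic(text: str) -> bool:
    """Return True if text contains any polytonic-only diacritic.

    NFD normalisation splits each precomposed letter into a base
    letter plus combining marks, so the marks can be inspected.
    The plain acute (tonos) and the diaeresis are ignored, since
    monotonic Greek uses those too.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return any(ch in POLYTONIC_MARKS for ch in decomposed)


if __name__ == "__main__":
    print(is_polytonic("ἐν ἀρχῇ ἦν ὁ λόγος"))
    print(is_polytonic("εν αρχή ην ο λόγος"))
```

A result of `False` on a passage from the PG would suggest that the breathings and circumflexes were lost in the OCR, which matters if you intend to quote the Greek rather than just read it.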


A new BHL-type database of Latin hagiographical texts and manuscripts at the IRHT?

It seems that something is going on at the IRHT (= L’Institut de recherche et d’histoire des textes).  For those who do not know, the IRHT is the French manuscripts people.  They do all sorts of very useful things.  But there is no announcement, nor much online.  It’s a new database, designed to allow you to look up a particular Latin hagiographical text, and find what manuscripts it appears in.

We already have something of the sort.  A couple of decades ago, the Bollandist “BHL” index (= Bibliotheca Hagiographica Latina) was turned into the BHLms database.  I use it intensely.  You look up a text by the BHL reference and it will tell you the libraries that have copies in manuscript, the manuscript reference number (= shelfmark), and the pages or folio numbers.  It’s great. But it is obviously old.

Last night I saw a job advert.  It’s for 6 months, starting in September.  It’s actually in various places, but I saw it here.  Excerpts (via Google Translate):

Job offer – Study engineer (M/F). Feeding a database devoted to Latin hagiographic manuscripts

As part of the “Latin Legends” project – the result of a partnership between the IRHT, the Société des Bollandistes (Brussels) and the University of Namur – the IRHT offers a 6-month fixed-term contract to contribute to the data load of a new database dedicated to Latin hagiographic manuscripts.

This database, still in the process of being run in, is intended to list all the manuscript witnesses (medieval and modern) of Latin hagiographic texts identified by the “Bibliotheca Hagiographica Latina” (“BHL”). Once published, it will work in conjunction with the other databases hosted by the IRHT (Medium, Pinakes, Biblissima).

Sounds interesting!

– he will contribute to integrating the data from the Bollandist files (approximately 9000 handwritten files, now digitized on the IRHT servers). If necessary, he will complete the information, and, in case of doubt, will verify its accuracy.  …

The recruited research engineer will work under the scientific direction of Cécile Lanéry-Ouvrard (IRHT Researcher), within the Latin section of the IRHT (Aubervilliers site). He will also be in contact with the other researchers and technicians involved in the “Latin Legendaries” project (in particular Cyril Masset, from the IT department of the IRHT, and Fernand Peloux, researcher at the CNRS in Toulouse). On occasion, he may be required to correspond with the Société des Bollandistes de Bruxelles, or with other researchers specializing in hagiography. 

Part of the work may be carried out remotely, with this restriction, however, that he has permanent access to the tools necessary for the proper performance of his activities (access to IRHT servers, access to bibliographic tools, etc.)

The list of skills is formidable, but this isn’t really an IT role, as one bit amusingly makes plain:

some skills or previous experience in the field of databases and their operation would be welcome, to facilitate exchanges with the IT department of the IRHT, responsible for maintaining the database.

So it basically requires manuscripts skills, rather than hard-core SQL.  I wonder what the database actually is.

A bit of googling reveals another worker on the project, Antoine Charrié-Benoist, who is listed as doing data load for the BHLms database, and gave a paper about the project.

It would be good to know more.  But clearly whatever database is planned will be immensely useful.  Excellent news.
