
syntactical databases





This is a serious critique of the charge of subjectivity with regard to syntactical databases. I'm not quite sure what to think of it. I was wondering if you guys might want to weigh in.


There is obviously a level of subjectivity in syntax. Neither during the "shootout" nor any of our demonstrations did any of our staff appeal to the subjectivity claim as a reason why we do not (currently) offer syntax databases of the GNT/MT. While Heiser believes that this is the cutting edge of Biblical language research, I can say that I only had one person ask about anything syntax related during the many demos I gave during SBL (and this one person was kind of an exception since she is particularly keen on this subject). This is not to say that it is unimportant, but that it represents a largely uncharted area in computer-assisted Biblical research (at least with relation to retail software). Furthermore, if we felt that it was subjective and unimportant we would not be developing these databases!


I could say more about Heiser's impressions on the shootout, and my own reflections, but that would stray beyond the topic of this thread. Stay tuned to the Accordance blog for a post on it from David.


Danny, sorry you couldn't make it out. I seriously doubt we will post anything online regarding this feature


As for Michael Heiser's posting, I'd love to offer a response, but as of yet can't figure out how to leave one. Noticing that no one else has left a comment, it makes me wonder if something's not right with his blog or if he just doesn't want comments. But I'm still trying.


Some blog owners turn off comments rather than deal with spam and other uglies of the internet. Some turn it off and don't realise that they did. Which it is in this case, I am not sure, but it's clear you can't leave a comment on the blog.


I have read the post by Michael Heiser, too, and I have mixed feelings.

There are some points that he makes that I agree with. And there is something else that I am not impressed with.


I agree with his point that subjective choices are implied not only in syntactical databases, but also in morphological databases. And I also agree that an important limitation of morphological databases is that they are not aware of clause boundaries.


What I don't agree with is his representation of users of other software packages:

Despite the critical importance of syntax, since I
Edited by Helen Brown
minor edits by permission

I wish to add something to the discussion about morphological vs. syntactical databases.


The blogger quoted above maintains that

The charge is that syntactical tagging is “subjective” since it gets into interpretive decisions.


He then proceeds to explain why:


First, identifying and labeling syntactical constructions are often not subjective exercises. There is nothing subjective about identifying (and tagging) a wayyiqtol followed by an expressed subject with a following accusative marker with noun. There are dozens of other such features marked in a syntax database that are not subjective. The construction is what it is, and is often crystal clear. So, in one respect, a syntax database does what a morphological database does when it identifies things. Morph databases identify words; syntax databases identify clusters of words.


In my previous post, I didn't deal with this first reason. I tend to agree with it, but I need to add that the same decisions about crystal clear constructions are often already included in morphological databases.


Bear with me, all you Hebrew scholars: I will take my examples from Greek texts. Greek morphology has many endings that are homographs but require different tags. An -A ending may be Accusative Masculine Singular (ASM), Accusative Neuter Plural (APN), or Nominative Neuter Plural (NPN). When a morphological database chooses among the three, the decision is based not on the word in itself but on careful observation of a cluster of words. The clause syntax needs to be taken into account.


So, in this sense, there already is much syntax within morphological databases. Even if in some cases decisions are subjective, in most cases the reasons are crystal clear. When the syntactical reasons are crystal clear, a failure to distinguish between ASM, APN and NPN is a mistake. It is not a subjective choice, just a plain mistake.
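To make the point concrete, here is a minimal sketch of how a neighbouring word narrows the choice among homographic tags. The mini-lexicon and the function name are assumptions made up for illustration, not part of any Accordance or Logos database. It uses the participle νομισθέντα, which on its own could be ASM, APN or NPN.

```python
# Hypothetical mini-lexicon: the tags each form could carry in isolation.
# Tag labels follow the ones used in this thread (ASM, APN, NPN).
CANDIDATES = {
    "τὰ": {"NPN", "APN"},                  # article, neuter plural only
    "τὸν": {"ASM"},                        # article, masculine singular only
    "νομισθέντα": {"ASM", "NPN", "APN"},   # ambiguous -A ending
}

def narrow_by_article(article, word, lexicon=CANDIDATES):
    """Keep only the tags of `word` that agree with its article.

    An article and the word it governs share case, number and gender,
    so the surviving tags are the intersection of the two candidate sets.
    """
    return lexicon[word] & lexicon[article]

neuter = narrow_by_article("τὰ", "νομισθέντα")      # ASM is ruled out
masculine = narrow_by_article("τὸν", "νομισθέντα")  # only ASM survives
```

Agreement with the article rules out ASM after τὰ, but it cannot choose between APN and NPN. That last step depends on the role the phrase plays in its clause, which is exactly where clause syntax comes in.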


I have tagged Greek and Latin texts for Accordance, and I know that I can make such mistakes. This is why I carefully check my work, to remove as many mistakes as possible. Some still remain, and I am ready to correct them as soon as a user finds them. So I am in no position to be intolerant of such mistakes. Whenever they occur, it is because the person who tagged the text did not think enough about the syntactical context in which a word occurs.


Now, what I find hard to swallow is the notion that Logos is far ahead of the competition because of its awareness of syntax. It is, I grant, inasmuch as a syntactical module is not available in Accordance right now.

It is not, if we look back at the tagged texts that have already been published as modules for both Accordance and Logos.


Consider the works of Philo in Greek as published by Logos. They offer a sample on the web.



Conveniently enough, it represents the start of the first work of Philo, On the Creation of the World. Until they take it down, I can offer a few comments. Let's look at -A endings.


τὰ νομισθέντα



ASM is clearly wrong. It was tagged without taking syntax into account.





Not so. It's APN, for both syntactical and morphological reasons.


If required, I could move to the second paragraph and examine a few -ON endings, like ἄσκεπτον καὶ ἀταλαίπωρον, but I don't want to take the fun away from those who are experts in Greek, and I don't want to bother those who are more interested in Hebrew.


Why is Logos missing so badly on syntax? It might be easy to blame others: the database was tagged by the Philo Concordance Project. Now, the goal of that project was to assign to each word the tag that is statistically most frequent for that word form. If in most cases -A is APN, they automatically set all instances to APN. Even so, the database is very valuable. It has gone as far as one can go if tagging is automated, that is to say, if tagging decisions ignore the syntax.
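The kind of automated tagging described here can be sketched as a most-frequent-tag baseline. This is only an illustration under stated assumptions: the function names and the tiny hand-tagged sample are hypothetical, and this is not the Philo Concordance Project's actual code.

```python
from collections import Counter, defaultdict

def train_most_frequent_tag(tagged_tokens):
    """Map each word form to the tag it carries most often in the sample."""
    counts = defaultdict(Counter)
    for form, tag in tagged_tokens:
        counts[form][tag] += 1
    return {form: tags.most_common(1)[0][0] for form, tags in counts.items()}

def tag_by_frequency(forms, model, unknown="?"):
    """Tag every token by form alone; the surrounding words are never consulted."""
    return [(form, model.get(form, unknown)) for form in forms]

# Hypothetical hand-tagged sample: the article is APN twice and NPN once.
sample = [("τὰ", "APN"), ("τὰ", "APN"), ("τὰ", "NPN"), ("καὶ", "CONJ")]
model = train_most_frequent_tag(sample)

# Every later occurrence of the article now receives APN, including
# any that are really NPN in their clause.
tagged = tag_by_frequency(["τὰ", "καὶ", "τὰ"], model)
```

Because the lookup is keyed on the word form only, each minority reading is tagged wrongly by design. Those are the cases that have to be found and corrected by hand afterwards.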


Now, the Philo Concordance Project people want the users to be aware of this, and they say so on their webpage:

In this concordance the words were organised mechanically, on the basis of the Greek alphabetic order of the text-forms ("tokens"). Two copies were printed in 1974, typed with Greek letters.


Later, some words were completely lemmatised and tagged in context and all words were automatically organised based upon this initial lemmatisation and tagging by Roald Skarsten.


When distributing this database, it is important for the user to be aware of what it includes and what it doesn't.


Later, the database was distributed by Accordance, BibleWorks and Logos. Actually, they name Logos first and add a picture. Then they mention Accordance (Mac) and BibleWorks. And they add:

These publishers also intend to complete the morphological analysis of the texts.


Doing that is hard work. I have personally helped to refine the database for Accordance. Is that work completely finished? No. Philo has more than 438,000 words, and, once TA has received the overall tag of APN, it is hard to catch every NPN instance and correct its tag. But we have made many thousands of corrections.


Logos also informs its users:

Note: The morphological analysis contained in this edition is under revision. Forms that are ambiguous; particularly conjunctions, particles, pronouns and adverbs, are in the process of being revised. A rebuilt resource with the updated form of this information will be made available at no cost via download upon completion of the extended analysis.


Apart from conjunctions, particles, pronouns and adverbs, they should have added articles, nouns and verbs. I guess that interjections are all right.


So, they are honest. As I am not a Logos user, I don't know how many free upgrades they have already distributed. Preparing an upgrade takes a lot of time. It is very hard work, as I said, and work that is neither rewarding nor easy to market. I find it as important to fix existing texts as to move on to new projects. I am sure that scholars appreciate both careful revisions and new ideas.


I am aware of the efforts of Logos with syntax and I appreciate them. However, I find that there is no need to try to give the impression that only Logos is aware of the complexities of syntax, for that picture is not accurate.


[Edited: after checking, I corrected the number of words in Philo. I first entered the number from memory.]

Edited by Marco
minor edits by permission
