Finding number of lemmata in a selection

February 1, 2022

Is there a quick way to find out how many unique lemmata are in a largish selection of text?

I could theoretically make a concordance and go through and count up all the headwords individually, but is there a faster way?

February 1, 2022

The COUNT command should do the trick.

February 1, 2022

I think he is looking for all the lexical forms (regardless of how often each occurs) in a passage. So he would probably have to use the RANGE command and the Analysis analytics to get a list of lemmata, right?

Edited February 1, 2022 by Brian W. Davidson

February 2, 2022

@Tyler Smith could you elaborate on exactly what you're trying to find?

February 2, 2022

Thanks everybody! The COUNT command was what I needed. To get every word in my selection, I just had to set the upper limit nice and high [COUNT 1-100000] [RANGE ...], then use the "Analysis" pane to see the results.
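For anyone who wants to check the same number outside the program, here is a minimal Python sketch of the idea: count each lemma once, then read off how many distinct lemmata there are. It assumes you have already exported the selection as (inflected form, lemma) pairs. The sample data and the helper names `lemma_counts` and `lemmata_in_count_range` are hypothetical illustrations, not anything the software provides.

```python
from collections import Counter

# Hypothetical sample data: (inflected form, lemma) pairs for a short selection.
tagged_tokens = [
    ("ἐν", "ἐν"),
    ("ἀρχῇ", "ἀρχή"),
    ("ἦν", "εἰμί"),
    ("ὁ", "ὁ"),
    ("λόγος", "λόγος"),
    ("καὶ", "καί"),
    ("ὁ", "ὁ"),
    ("λόγος", "λόγος"),
    ("ἦν", "εἰμί"),
    ("πρὸς", "πρός"),
    ("τὸν", "ὁ"),
    ("θεόν", "θεός"),
]


def lemma_counts(tokens):
    """Return a Counter mapping each lemma to its number of occurrences."""
    return Counter(lemma for _form, lemma in tokens)


def lemmata_in_count_range(counts, low, high):
    """Return the lemmata whose occurrence count lies in [low, high]."""
    return sorted(lemma for lemma, n in counts.items() if low <= n <= high)


counts = lemma_counts(tagged_tokens)

# Number of unique lemmata in the selection.
unique_lemmata = len(counts)

# A very wide range, in the spirit of 1-100000, keeps every lemma.
all_lemmata = lemmata_in_count_range(counts, 1, 100000)

print(unique_lemmata)
for lemma, n in counts.most_common():
    print(lemma, n)
```

A narrower range, e.g. `lemmata_in_count_range(counts, 2, 3)`, keeps only the lemmata that occur two or three times in the sample.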

February 2, 2022

Since we are on the topic of the COUNT command:

If I set COUNT to, e.g., 50-100, it still returns results used less than 50x. I.e., all the counts of 1-49x. I get the same thing if I set the COUNT to +50. Of course I can sort my results by count down, but I'm wondering why I can't get just the results I want.

Bigger question, how can I get the search to ignore capitalization and accents? E.g., I get separate results for Καὶ, καὶ, and καί.

Thanks!

February 2, 2022

17 minutes ago, mgvh said:

Since we are on the topic of the COUNT command:

If I set COUNT to, e.g., 50-100, it still returns results used less than 50x. I.e., all the counts of 1-49x. I get the same thing if I set the COUNT to +50. Of course I can sort my results by count down, but I'm wondering why I can't get just the results I want.

Bigger question, how can I get the search to ignore capitalization and accents? E.g., I get separate results for Καὶ, καὶ, and καί.

Thanks!

1. It looks like a search for [COUNT +50] is also finding some crasis forms that occur fewer than 50 times. I'll report that as a bug.
2. When I run the same command, I don't see separate results for Καὶ, καὶ, and καί. Do you see those separate results when you're performing a simple search like this: [COUNT +50]?

February 3, 2022

7 hours ago, Mark Allison said:

2. When I run the same command, I don't see separate results for Καὶ, καὶ, and καί. Do you see those separate results when you're performing a simple search like this: [COUNT +50]

I should have been clearer. I am searching for Inflected forms (not lexemes). I.e., I am customizing the analysis display to only show INFLECTED and not LEX.

February 3, 2022

11 hours ago, mgvh said:

I should have been clearer. I am searching for Inflected forms (not lexemes). I.e., I am customizing the analysis display to only show INFLECTED and not LEX.

We should certainly have an option to ignore capitalization. However, if you ignore accents, you wouldn't be able to distinguish between lexical forms like εἷς and εἰς.
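The trade-off can be illustrated with a short Python sketch using the standard `unicodedata` module. This is only an illustration of the general technique, not how the program itself handles text, and the function names are hypothetical. Folding case alone merges Καὶ and καὶ but still separates καὶ from καί, while stripping all diacritics merges the three forms of καί and also collapses εἷς and εἰς.

```python
import unicodedata


def fold_case(text):
    """Lowercase the text, keeping accents and breathing marks."""
    return unicodedata.normalize("NFC", text).lower()


def strip_diacritics(text):
    """Remove every combining mark (accents, breathings, iota subscript)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def fold_case_and_diacritics(text):
    """Lowercase the text and remove every combining mark."""
    return strip_diacritics(text).lower()


forms = ["Καὶ", "καὶ", "καί", "εἷς", "εἰς"]
for form in forms:
    print(form, fold_case(form), fold_case_and_diacritics(form))
```

A middle path, not shown here, would be to treat only the grave and acute accents as equivalent while keeping breathing marks, which would merge καὶ and καί without merging εἷς and εἰς.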

Posts above, in order, are by: Tyler Smith, Mark Allison, Brian W. Davidson, Mark Allison, Tyler Smith, mgvh, Mark Allison, mgvh, Mark Allison.
