Finding number of lemmata in a selection


Tyler Smith

Is there a quick way to find out how many unique lemmata are in a largish selection of text? 

 

I could theoretically make a concordance and go through and count up all the headwords individually, but is there a faster way?


The COUNT command should do the trick.

 

 

Screen Shot 2022-02-01 at 12.25.33 PM.png


I think he is looking for every lexical form in a passage, regardless of how often each one occurs. So he would most likely have to use the RANGE command and the Analysis analytics to get a list of lemmata, right?

Edited by Brian W. Davidson

Thanks everybody! The COUNT command was what I needed. To get every word in my selection, I just had to set the upper limit nice and high [COUNT 1-100000] [RANGE ...], then use the "Analysis" pane to see the results.
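
For anyone who wants to see the underlying tally outside the software, here is a minimal Python sketch. It is not how Accordance implements COUNT or the Analysis pane. It assumes a hypothetical list of (inflected form, lemma) pairs standing in for the tagged selection. The function name `lemma_counts` and the sample data are made up for illustration, and it simply counts lemmata within the selection and optionally keeps only those inside a frequency range.

```python
from collections import Counter

def lemma_counts(tagged_words, min_count=1, max_count=100000):
    """Tally lemmata and keep those whose frequency is within [min_count, max_count]."""
    counts = Counter(lemma for _form, lemma in tagged_words)
    return {lemma: n for lemma, n in counts.items() if min_count <= n <= max_count}

# Hypothetical sample data: (inflected form, lemma) pairs for a selection.
LOGOS, KAI, THEOS = "λόγος", "καί", "θεός"
sample = [
    ("λόγος", LOGOS),
    ("λόγον", LOGOS),
    ("καὶ", KAI),
    ("καί", KAI),
    ("θεὸς", THEOS),
]

all_lemmata = lemma_counts(sample)            # every lemma in the selection
frequent = lemma_counts(sample, min_count=2)  # only lemmata used at least twice

print(len(all_lemmata), "unique lemmata")
print(frequent)
```

The number of unique lemmata is just the number of keys in the result, which is what the original question was after.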


Since we are on the topic of the COUNT command:

  • If I set COUNT to, e.g., 50-100, it still returns results used fewer than 50 times, i.e., all the counts from 1 to 49. I get the same thing if I set COUNT to +50. Of course I can sort my results by descending count, but I'm wondering why I can't get just the results I want.
  • Bigger question: how can I get the search to ignore capitalization and accents? E.g., I get separate results for Καὶ, καὶ, and καί.

Thanks!
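
To illustrate what "ignoring capitalization and accents" would mean for a tally like this, here is a rough Python sketch using only the standard library. It is an illustration and not an Accordance feature. The helper name `fold` is hypothetical, and it assumes that "ignoring accents" means dropping every combining mark, breathings included.

```python
import unicodedata
from collections import Counter

def fold(word):
    """Strip all combining marks, then case-fold, so variants compare equal."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() lowercases and also maps final sigma (ς) to σ.
    return unicodedata.normalize("NFC", stripped.casefold())

# The three variants from the question, with one repeated.
forms = ["Καὶ", "καὶ", "καί", "καὶ"]
counts = Counter(fold(w) for w in forms)

print(counts)
```

With this folding, all three spellings land under a single key instead of three separate entries.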



1. It looks like it's finding some crasis forms that occur fewer than 50 times in a search for [COUNT +50]. I'll report that as a bug.
2. When I run the same command, I don't see separate results for Καὶ, καὶ, and καί. Do you see those separate results when you're performing a simple search like this: [COUNT +50]?


7 hours ago, Mark Allison said:

 

2. When I run the same command, I don't see separate results for Καὶ, καὶ, and καί. Do you see those separate results when you're performing a simple search like this: [COUNT +50]?

I should have been clearer. I am searching for Inflected forms (not lexemes). I.e., I am customizing the analysis display to only show INFLECTED and not LEX.


11 hours ago, mgvh said:

I should have been clearer. I am searching for Inflected forms (not lexemes). I.e., I am customizing the analysis display to only show INFLECTED and not LEX.

 

We should certainly have an option to ignore capitalization. However, if you ignore accents, you won't be able to distinguish between lexical forms like εἷς and εἰς.
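
The collision is easy to reproduce in a small Python sketch, assuming that "ignoring accents" means dropping every diacritic. The helper `strip_marks` and its `only_accents` option are hypothetical names for illustration and do not describe anything in the software. The option shows a narrower variant that removes only the accent marks (grave, acute, circumflex) and keeps breathing marks.

```python
import unicodedata

# Combining grave, acute, and Greek perispomeni (circumflex).
ACCENTS = {"\u0300", "\u0301", "\u0342"}

def strip_marks(word, only_accents=False):
    """Remove diacritics: all combining marks, or only the three accent marks."""
    decomposed = unicodedata.normalize("NFD", word)
    if only_accents:
        kept = (ch for ch in decomposed if ch not in ACCENTS)
    else:
        kept = (ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", "".join(kept))

one = "\u03b5\u1f37\u03c2"   # εἷς (rough breathing + circumflex)
into = "\u03b5\u1f30\u03c2"  # εἰς (smooth breathing, no accent)

# Dropping every diacritic merges the two words.
print(strip_marks(one) == strip_marks(into))
# Dropping only accents keeps this pair apart, because the breathings differ.
print(strip_marks(one, only_accents=True) == strip_marks(into, only_accents=True))
```

Keeping breathings separates this particular pair, but pairs that differ only by accent, such as τίς and τις, would still merge under either setting.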

