The Educational Greek Corpus (EGC) is the extended version of the Hellenic National Corpus (HNC) version 2.0. It comprises:
The general corpus contains more than 34,000,000 words of written texts and is a part of the Hellenic National Corpus developed by the Institute for Language and Speech Processing / R.C. "Athena".
Texts in the HNC represent modern Greek language use, and most of them were written after 1990. To cover different types of language, texts have been selected from several media, belonging to different genres and dealing with various topics. Texts written in highly idiomatic language have been excluded from the corpus. Most texts have been selected for their wide readership (high-circulation newspapers, best-selling books, etc.).
The textbooks corpus contains 2,250,000 words from the textbooks used in Greek public schools.
The educators' corpus comprises texts selected and uploaded by the teachers themselves.
For all EGC texts, users can access the following information:
How can one use EGC?
EGC offers the environment and the tools to:
1. Retrieve authentic examples of use of the Greek language by searching for:
2. Study the results of your query in concordances.
3. Look for specific word and lemma frequencies in the EGC corpora.
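For readers unfamiliar with the terms, the ideas behind a concordance and a word frequency list can be sketched in a few lines of Python. This is only an illustration on a made-up sample sentence, not the way EGC itself is implemented: the function names `tokenize` and `concordance`, the `width` parameter and the sample text are all assumptions introduced here. Lemma frequencies would additionally require a Greek lemmatizer, which is not shown.

```python
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (hypothetical helper)."""
    # Normalize to NFC so accented Greek letters are single characters
    # that the \w pattern can match.
    text = unicodedata.normalize("NFC", text).lower()
    return re.findall(r"\w+", text)


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    keyword = unicodedata.normalize("NFC", keyword).lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Made-up sample text, used only for illustration.
sample = "Η γλώσσα αλλάζει. Η ελληνική γλώσσα έχει μακρά ιστορία."
tokens = tokenize(sample)

# Keyword-in-context view: each hit with two words on either side.
hits = concordance(tokens, "γλώσσα", width=2)
for left, word, right in hits:
    print(f"{left:>20} [{word}] {right}")

# Word-form frequencies (not lemma frequencies).
freqs = Counter(tokens)
print(freqs.most_common(3))
```

Each concordance line shows the search word together with the words around it, which is what makes it possible to study how a word is actually used; the frequency list simply counts how often each word form occurs.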