Together with #AI, we are leveling up the Word Sketch tool with automated word sense identification. It groups collocations by word sense using advanced language models and word embeddings, and it works in both preloaded and user corpora.
sketchengine.eu/news/word-sens
We’re constantly seeking ways to improve our services. Today brings multi-word term extraction for Greek 🇬🇷, which is also available in 30+ other languages. Upload your documents or create corpora from your data and extract #terminology. https://t.co/lUm8DBN7h5 #corpuslinguistics #termextraction pic.twitter.com/iwtlAoAsxu
— Sketch Engine (@SketchEngine) January 10, 2024
Parlez-vous français? The French Web 2023 corpus with 23.8 billion words is now available! The texts were carefully cleaned and classified into genres (blog, news, …) and topics (arts, health, …). https://t.co/0HaqiQkhpm #corpuslinguistics #digitalhumanities #TextClassification pic.twitter.com/mtHv1lSodT
— Sketch Engine (@SketchEngine) January 16, 2024
Multiword sketches analyze the collocations of multi-word phrases. Just type two or more lemmas without articles, conjunctions, or other determiners, e.g. “tell truth” (instead of “telling the truth”). https://t.co/p60HpQViee #collocations #corpuslinguistics #textanalysis pic.twitter.com/lQCXeMfGS9
— Sketch Engine (@SketchEngine) January 5, 2024
Check out the latest addition to our corpus list! Apart from multi-billion-word corpora, we also create small domain-specific collections. Explore the Polish language of the 1960s. https://t.co/N0wHuAt9Gh Thanks to @OgrMaciej and the other authors for their work! #corpuslinguistics pic.twitter.com/6gnoBnhtIf
— Sketch Engine (@SketchEngine) January 26, 2024