(Now looking for a generic interface to split compounds, or split off suffixes etc., or even sentences into words, for various languages. Even in English, I'd prefer to split off parts like -ed and -s and add them separately to the board)
Quick #NLP question
Where can I find a list of most frequent words, not just ranked but with (rough) frequencies?
(Without downloading a huge corpus and compiling it myself?)
edit: fully answered - see replies :D
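This is an example of loading Kkma 2.0, a Korean morphological analyzer library packaged as a JAR, and open-korean-text, published on Maven, from C# without setting up a project.
With just a C# code file and an XML configuration file, you can easily pull in Maven and JAR packages and code against them in C#. Since you only need to look at a single C# file, you can also lean on a code assistant as much as you like, which makes it even better!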
Hi everyone! I'm Jodie and I love all things #data!
I'm currently working as a developer advocate in #datascience at #jetbrains, and in my previous life I worked as a #datascientist, mostly in #nlp. I write tutorials and blog posts, present at conferences and host webinars about a range of #machinelearning, data science and #dataviz topics. 📊
I'm hoping we can build up an amazing machine learning and data science community here on Mastodon!
Getting PostgreSQL full-text search running with minimal effort by morphologically analyzing Japanese with Lindera
https://dev.classmethod.jp/articles/lindera-python-japanese-tokenization-postgresql-fulltext-search/
#introduction
I'm a researcher at the Institut Urban Landscape at the Zürich University of Applied Sciences (ZHAW). I'm working on the governance and narratives of sustainable digitalization and urban sustainability transformations.
Here for friendly exchanges and critical takes on urban digitalization and sustainability, governance and #polsci, #rstats + 🐍, #datascience, #nlp, Bayesian stats, #OpenScience and finding little nuggets of inspiration from your research lives. Also cycling.
This is an #Introduction to my interests. Professionally I'm interested in #therapy, #trauma, #complextrauma, #attachment, #eft / #tapping, #nlp, #pkm, #obsidian, #compassion
Personally I'm into #photography, #Buddhism, #Scotland, #Wales, #camping, #hiking
That should do it for now.
Gemini Deep Think learns math, wins gold medal at International Math Olympiad
https://arstechnica.com/ai/2025/07/google-deepmind-earns-gold-in-international-math-olympiad-with-new-gemini-ai/
#AI #artificialintelligence #Google #DeepMind #DeepThink #mathematics #NLP
We're the LIPN, a joint #ComputerScience Laboratory between the #CNRS and the University Sorbonne Paris Nord.
We're approximately 150 researchers across five research teams:
- Machine Learning #ML
- Combinatorial #Optimization and High Performance Computing #HPC
- Design and analysis of combinatorial models at the interface of physics, geometry and algorithmic #Combinatorics
- #Logic and #Verification
- Automatic natural language processing and knowledge representation #NLP.
I might as well do another #introduction specifically for the #academic side of this here fediverse:
Coming from #theoreticalCS (with applications in #NLP) to doing #digitalhumanities (computational #musicology), I've now landed in #ResponsibleAI. Specifically, I'm interested in exploring #AntiCapitalistAI, both sharpening existing critiques of current AI practice by confronting capital and exploring inherent politics of technologies, and finding better ones for a socialist world.
Browser-Native Translation and Language Detection APIs Coming Soon

洪 民憙 (Hong Minhee) @hongminhee@hackers.pub
Just reviewed the W3C draft for the Translator and Language Detector APIs. This is a genuinely exciting development for web developers.
The proposal would add native browser support for:
- Text translation between languages
- Language detection of arbitrary text
- Both with streaming capabilities
No more relying on third-party translation services or embedding external APIs for basic language operations. All processing happens locally in the browser.
The API design is clean and straightforward:
// Translation example
const translator = await Translator.create({
  sourceLanguage: "en",
  targetLanguage: "fr"
});
const translatedText = await translator.translate("Hello world");
// Language detection example
const detector = await LanguageDetector.create();
const results = await detector.detect("Hello world");
// Returns array of detected languages with confidence scores
This will be a game-changer for multilingual sites and applications. The browser handles downloading appropriate language models and manages usage quotas.
The spec is still in draft form but shows promising progress toward standardizing these capabilities across browsers. Looking forward to seeing this implemented.
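Since detection returns an array of candidates with confidence scores, a caller usually has to reduce it to a single answer. Here is a minimal sketch of that step. The helper `pickLanguage`, its `minConfidence` threshold and the mocked `sample` array are hypothetical, and the entry shape `{ detectedLanguage, confidence }` is an assumption based on the draft, which may still change.

```javascript
// Hypothetical helper: pick the most likely language from a detection result.
// Assumes each entry looks like { detectedLanguage, confidence } (draft shape).
function pickLanguage(results, minConfidence = 0.5) {
  let best = null;
  for (const r of results) {
    // Track the highest-confidence entry without relying on array order.
    if (best === null || r.confidence > best.confidence) best = r;
  }
  // Return null when nothing is confident enough to act on.
  return best !== null && best.confidence >= minConfidence
    ? best.detectedLanguage
    : null;
}

// Mocked result for illustration; a real one would come from detector.detect(text).
const sample = [
  { detectedLanguage: "en", confidence: 0.92 },
  { detectedLanguage: "nl", confidence: 0.05 },
];
console.log(pickLanguage(sample)); // prints "en"
```

Returning `null` below the threshold lets the calling code fall back to asking the user rather than translating from a wrongly guessed language.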
I am a professor of linguistics at the University of Washington, where I run our Master of Science in Computational Linguistics. I work on computational approaches to syntax and the syntax-semantics interface, the role of #linguistics in #NLP, and on the societal impacts of language technology. On social media, I spend a lot of time debunking #AIhype, for which I find linguistics very useful!
Course "Declarative programming (2006-07)". https://jaalonso.github.io/cursos/pd-06 #LogicProgramming #Prolog #AI #NLP #Constraints #Logic
Course "Logic programming (2004-05)". https://jaalonso.github.io/cursos/d-pl-04 #LogicProgramming #Prolog #AI #NLP #Constraints #MachineLearning