Technology & Software
Over the past 14 years, I have worked for academic and commercial organizations, building specialized, curated search engines and discovery platforms for educational content and historical texts as well as for researchers, journalists, and job seekers.
I turn state-of-the-art technologies in fields like Natural Language Processing (NLP), Machine Learning, and Deep Learning into practical applications with a strong emphasis on the human perspective and scientific methodology.
🔍 Information Retrieval & Search
Keyword Search: Which documents are most relevant for a search term?
Semantic Search: Which documents cover the most similar topics?
Reranking: What is the best order of the results?
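To make the first two questions concrete, here is a minimal Python sketch that ranks a toy corpus for a search term and then finds the document most similar to a given one. The corpus, the function names, and the smoothed TF-IDF weighting are assumptions chosen for illustration. Real systems typically use a dedicated search engine for keyword search and learned embeddings for semantic similarity, so the TF-IDF vectors below are only a stand-in.

```python
import math
from collections import Counter

# Toy corpus; the documents are invented for illustration only.
DOCS = {
    "d1": "search engines rank documents for a query",
    "d2": "neural networks learn vector representations of text",
    "d3": "job seekers search for open positions",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for doc_id, toks in tokens.items():
        counts = Counter(toks)
        vectors[doc_id] = {
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }
    return vectors


def keyword_search(query, vectors):
    """Keyword search: sum the weights of the query terms in each document."""
    terms = tokenize(query)
    scores = {
        doc_id: sum(vec.get(term, 0.0) for term in terms)
        for doc_id, vec in vectors.items()
    }
    hits = [(doc_id, score) for doc_id, score in scores.items() if score > 0]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(doc_id, vectors):
    """Similarity search: rank all other documents by cosine similarity."""
    others = [
        (other_id, cosine(vectors[doc_id], vec))
        for other_id, vec in vectors.items()
        if other_id != doc_id
    ]
    return sorted(others, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    vectors = tfidf_vectors(DOCS)
    print(keyword_search("search", vectors))
    print(most_similar("d1", vectors))
```

Reranking would then be a second pass that reorders the top hits with a more expensive scoring function, which is omitted here to keep the sketch short.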
🧠 Language Modelling & Generation
Encoding: Machine-readable vector representations of text using BERT and similar models
Generation: Creating human-readable texts and summaries
RAG: Retrieval-augmented generation for knowledge-grounded responses
🏷️ Classification & Labeling
Document Classification: Subject, age category, content type
Spam Detection: Identifying unwanted content
Named Entity Extraction: Persons, places, organizations
📊 Topic Modelling & Clustering
Unsupervised Learning: Discovering patterns in data
Topic Discovery: Finding thematic clusters in document collections
Dimensionality Reduction: Making data interpretable
Programming Languages & Frameworks
Python
Primary language for research and data science. Deep expertise with the scientific computing ecosystem.
Machine Learning & Deep Learning
Building and deploying modern neural networks and ML models.
Java & JVM Languages
Enterprise software development and backend systems.
Search & Data Technologies
Building scalable search infrastructure and data pipelines.
DevOps & Cloud
Containerization, orchestration, and cloud infrastructure.
CI/CD & Agile
Modern development practices and continuous integration.
Key Principles
Grounded Evaluation
Determining which solution works best in a specific context through careful measurement and qualitative analysis.
Data Curation
Relying on manual annotation and qualitative evaluation, rather than generic benchmarks alone, to ensure quality.
Efficient Resources
Building sustainable solutions that respect computational constraints and environmental impact.
User-Centered Design
Understanding the purpose and requirements of users rather than providing generic solutions.