
About Us
From the Department of Computer Science and Engineering, University of Moratuwa
Aaivu is an open-source organization founded by Dr. Uthayasanker Thayasivam at the Department of Computer Science and Engineering, University of Moratuwa, to enrich the lives of the communities that benefit from natural language processing (NLP).
Our objectives
- We take NLP projects originally built for languages such as English and adapt them to local languages such as Tamil and Sinhala.
- We use techniques such as word embeddings to understand and analyse these languages and put the results to practical use.
- We do data mining to support these efforts.
Projects
Tesseract Quality Checker API
This API checks the quality of a captured image before it is passed to Tesseract OCR. Tesseract works best when the foreground text is clearly segmented from the background. It is easier to prompt the user to capture a suitable image than to preprocess a poor one.
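As a rough illustration of the kind of check such an API might perform, the sketch below scores how cleanly a grayscale image splits into foreground and background using Otsu's separability measure. This is an assumption made for illustration, not the project's actual implementation: the function names `separability` and `is_ocr_ready` and the 0.9 threshold are hypothetical.

```python
def separability(pixels):
    """Return Otsu's separability in [0, 1] for a flat list of 0-255 gray values.

    Values near 1 mean the pixels fall into two well-separated groups
    (for example dark text on a light page).
    """
    n = len(pixels)
    if n == 0:
        return 0.0

    # Build a 256-bin histogram of gray levels.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total_sum = sum(i * h for i, h in enumerate(hist))
    mean_total = total_sum / n
    var_total = sum(h * (i - mean_total) ** 2 for i, h in enumerate(hist)) / n
    if var_total == 0:
        return 0.0  # A flat image has no foreground/background split.

    best_between = 0.0
    w0 = 0        # number of pixels at or below the candidate threshold
    sum0 = 0.0    # sum of gray values at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = n - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (total_sum - sum0) / w1
        between = (w0 / n) * (w1 / n) * (m0 - m1) ** 2
        best_between = max(best_between, between)

    return best_between / var_total


def is_ocr_ready(pixels, min_separability=0.9):
    """Hypothetical pass/fail check; the threshold is an illustrative guess."""
    return separability(pixels) >= min_separability
```

A caller could run `is_ocr_ready` on the captured frame and ask the user to retake the picture when it returns `False`.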
English to Sinhala Neural Machine Translation
This research develops an NMT system based on the Transformer architecture for the under-resourced, domain-specific English-to-Sinhala translation task. Translation quality is improved by exploring effective ways of incorporating Part-of-Speech (POS) information and subword techniques.
Dialogue policy optimization in a low-resource setting
Dialogue policy optimization for task-oriented conversational agents in a low-resource setting is an open research problem. We have developed a novel approach to dialogue policy optimization using Reinforcement Learning. The methodology is based on self-play and a novel sampling technique that prioritizes failed dialogues over successful ones.
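The idea of prioritizing failed dialogues can be sketched as weighted sampling from a pool of self-play dialogues. This is a minimal sketch under assumptions, not the project's actual sampling technique: the `success` field, the `sample_dialogues` function, and the `failure_weight` value are all hypothetical.

```python
import random


def sample_dialogues(dialogues, batch_size, failure_weight=3.0, rng=None):
    """Sample a training batch (with replacement) that favours failed dialogues.

    Each dialogue is a dict with a boolean "success" key (an assumed format).
    A failed dialogue is `failure_weight` times as likely to be drawn as a
    successful one.
    """
    rng = rng or random.Random()
    weights = [1.0 if d["success"] else failure_weight for d in dialogues]
    return rng.choices(dialogues, weights=weights, k=batch_size)


# Usage: a pool with equal numbers of successful and failed dialogues.
pool = [{"id": i, "success": i % 2 == 0} for i in range(20)]
batch = sample_dialogues(pool, batch_size=8, rng=random.Random(0))
```

With `failure_weight=3.0` and a balanced pool, roughly three quarters of each batch is expected to come from failed dialogues, so the policy sees its mistakes more often than its successes.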
Semantic Table Interpretation
Semantic Table Interpretation is the use of external knowledge bases or ontologies to provide context to tabular data sources. Data usually loses its context when converted into tabular structures, and mapping it back into its original context is non-trivial. Since tables are one of the most widespread data structures in use, the information loss is significant. In this project, we introduce two novel algorithms, ReleX and STEM, to derive meaningful relationships between table columns and to identify table entities with context using web ontologies.
Tamizhi-Net OCR
Tamizhi-Net OCR is a tool that extracts text from scanned PDF files and images. The system covers the Tamil, Sinhala, and English languages.
How to contribute?
We welcome all open-source contributors, and we hope you stay with us throughout our journey.

Introduction
Visit our GitHub Introduction page to learn how to contribute and pick up some interesting projects.
Mail us regarding any issues at hello@aaivu.org.
Talk forum
If you have any community-related questions, please ask in the Aaivu Talk forum.
Help Desk
Please use this Help Desk form to request edit access to the documentation or write access on GitHub.
Our Pioneers

Suthagar Kailayapathy
Co-Founder & Software Engineer at Sysco Labs
Suthagar is a dedicated full-stack software engineer with a passion for developing innovative programs that improve organizational efficiency and effectiveness. He has experience in software development, dev testing, cloud engineering, continuous integration and continuous delivery, and team management, and he believes in pursuing goals with hard work, commitment, and persistence.

Dr. Uthayasanker Thayasivam
Founder & Senior Lecturer
Dr. Uthayasanker is an experienced engineer, currently lecturing at the Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka. He obtained his doctoral degree in semantic computing from the University of Georgia, USA. His special interest and practice in advanced machine learning and advanced data science make him an expert in this ever-evolving field. Drawing on industrial experience at globally reputed information retrieval organizations (Google, Microsoft, Ask.com, and Glassdoor) and teaching experience at the University of Moratuwa, Uthaya has much to offer the data-driven leaders of the corporate world.
Frequently Asked Questions
- How does Aaivu impact its domain?
Aaivu helps to eliminate the challenges that arise when a technique is adapted to a local language. To get a better understanding, read this article about the Tamil open-source landscape and the opportunities and challenges in this domain.
- What innovations are made through Aaivu?
Aaivu addresses the discrepancies that arise when NLP projects are applied to morphologically rich languages, which require new techniques to be invented and adapted.