Twitter - LinkedIn - GitHub

I am a PhD candidate and Lecturer (≈ Assistant Professor) at the University of Cape Town, working under Jan Buys in the Computer Science Department. My project is about text generation (language modelling, machine translation, data-to-text) for low-resource, morphologically complex languages. I am developing subword-driven neural architectures to model the agglutinative morphology of the Nguni languages of South Africa.

My broader research interests are in linguistically informed NLP interpretability. I am particularly interested in investigating whether deep neural networks can learn and use concepts like morphology, syntax, and compositionality.

Previously, I completed my master's in AI at the University of Amsterdam, supervised by Martha Lewis. Before that, I obtained my undergraduate degrees in Computer Science and Mathematical Statistics at Stellenbosch University in South Africa.

Reviewing (2022-2024): ACL ARR, EMNLP, EACL, ICLR, NeurIPS, COLM, BlackBoxNLP, AfricaNLP

Teaching (2024): CSC3022 Machine Learning (3rd year), CSC1016 Java Programming (1st year), and part of the NLP course in the AIMS AI for Science Masters

News

June 2024 I attended NAACL in Mexico City to present our paper A Systematic Analysis of Subwords and Cross-Lingual Transfer in Multilingual Translation.

May 2024 I attended LREC-COLING in Turin to give a talk on T2X and present a poster on NGLUEni.

May 2024 Our paper NGLUEni: Benchmarking and Adapting Pretrained Language Models for Nguni Languages won a best paper award at the AfricaNLP workshop co-located with ICLR 2024.

March 2024 Our paper A Systematic Analysis of Subwords and Cross-Lingual Transfer in Multilingual Translation has been accepted at Findings of NAACL. I will be in Mexico City in June to present the work.

February 2024 I have two papers accepted at LREC-COLING and will be in Turin, Italy to present the work in person.

December 2023 I am attending EMNLP in Singapore to present a poster about our work on Morphological Compositional Generalisation at the GenBench workshop.

September 2023 I have been awarded an Amazon travel scholarship to attend the GenBench workshop at EMNLP 2023.

August 2023 I gave an invited talk at the NLP for Southern African Languages Workshop co-located with COMPASS 2023.

May - August 2023 I spent 3 months in Kyoto, Japan for a research internship at the NICT Advanced Translation Technology Laboratory.

May 2023 Our paper Subword Segmental Machine Translation: Unifying Segmentation and Target Sentence Generation has been accepted at Findings of ACL.

October 2022 Our paper Subword Segmental Language Modelling for Nguni Languages has been accepted at Findings of EMNLP.

September 2022 Our submission to the WMT22 Shared Task: Large-Scale Machine Translation Evaluation for African Languages is a multilingual translation model for 8 South African languages.

December 2021 I attended SACAIR 2021 to present my winning submission to the Nguni languages POS tagging shared task.