Accepting PhD Students

PhD projects

Large language models, LLM agents, multilinguality, machine translation, robust and scalable evaluation, AI for finance, and AI for programming.

2020-2025

Research activity per year

  • An expanded massive multilingual dataset for High-Performance Language Technologies (HPLT)

    Burchell, L., de Gibert, O., Arefyev, N., Aulamo, M., Bañón, M., Chen, P., Fedorova, M., Guillou, L., Haddow, B., Hajič, J., Helcl, J., Henriksson, E., Klimaszewski, M., Komulainen, V., Kutuzov, A., Kytöniemi, J., Laippala, V., Mæhlum, P., Malik, B., Mehryary, F., Mikhailov, V., Moghe, N., Myntti, A., O'Brien, D., Oepen, S., Pal, P., Piha, J., Pyysalo, S., Ramírez-Sánchez, G., Samuel, D., Stepachev, P., Tiedemann, J., Variš, D., Vojtěchová, T. & Zaragoza-Bernabeu, J., 27 Jul 2025, Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Che, W., Nabende, J., Shutova, E. & Pilehvar, M. T. (eds.). Vienna, Austria: Association for Computational Linguistics, p. 17452-17485 34 p.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Open Access
  • AveniBench: Accessible and Versatile Evaluation of Finance Intelligence

    Klimaszewski, M., Chen, P., Guillou, L., Papaioannou, I., Haddow, B. & Birch, A., 19 Jan 2025, Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal). Chen, C.-C., Moreno-Sandoval, A., Huang, J., Xie, Q., Ananiadou, S. & Chen, H.-H. (eds.). Abu Dhabi, UAE: Association for Computational Linguistics, p. 111-117 7 p.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Open Access
  • Crossmodal ASR error correction with discrete speech units

    Li, Y., Chen, P., Bell, P. & Lai, C., 16 Jan 2025, 2024 IEEE Spoken Language Technology Workshop (SLT): Proceedings. IEEE, p. 431-438 8 p.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    6 Citations (Scopus)
  • DocHPLT: a massively multilingual document-Level translation dataset

    O'Brien, D., Malik, B., de Gibert, O., Chen, P., Haddow, B. & Tiedemann, J., 01 Nov 2025, Proceedings of the Tenth Conference on Machine Translation. Haddow, B., Kocmi, T., Koehn, P. & Monz, C. (eds.). Suzhou, China: Association for Computational Linguistics, p. 286-300 15 p.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Open Access
  • Findings of the WMT25 multilingual instruction shared task: persistent hurdles in reasoning, generation, and evaluation

    Kocmi, T., Artemova, E., Avramidis, E., Briakou, E., Chen, P., Fadaee, M., Freitag, M., Grundkiewicz, R., Hou, Y., Koehn, P., Kreutzer, J., Mansour, S., Perrella, S., Proietti, L., Riley, P., Sánchez, E., Schmidtová, P., Shmatova, M. & Zouhar, V., 01 Nov 2025, Proceedings of the Tenth Conference on Machine Translation. Haddow, B., Kocmi, T., Koehn, P. & Monz, C. (eds.). Suzhou, China: Association for Computational Linguistics, p. 414-435 22 p.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Open Access