Aykut Erdem

The broad goal of my research is to explore better ways to understand, interpret and manipulate visual data. My research interests span a diverse set of topics, ranging from image editing to visual saliency estimation, and to multimodal learning for integrated vision and language.

I am an Associate Professor of Computer Engineering at Koç University. I’m also affiliated with the KUIS AI Center, and a proud ELLIS member. Before joining Koç University, I was with Hacettepe University, where I was one of the directors of the Computer Vision Lab. I received my PhD degree from the Computer Engineering Department of METU in 2008. During my PhD studies, I spent a summer at Virginia Tech as a visiting researcher, and a semester at MIT as a visiting scholar. I did my postdoc at Ca’ Foscari University of Venice in Italy.

I'm looking for motivated MSc/PhD students. Also, don't hesitate to contact me if you're an undergrad who wants to get into AI and has a desire to learn!

Highlights and News


Mar 2026	Our project on multimodal LLMs for whole-slide pathology image analysis has been selected for the NVIDIA Academic Grant Program.
Feb 2026	Our work on language-assisted motion planning for controllable video generation has been accepted to CVPR 2026.
Aug 2025	Our work on audio-visual saliency prediction in 360 degree videos will be published in IEEE Transactions on Pattern Analysis and Machine Intelligence.
Jun 2025	Our work on efficient representation of videos got accepted to ICCV 2025.
Nov 2024	Awarded funding from the TUBITAK 2247-A National Outstanding Researchers Program on generative AI approaches to visual data generation.
Sep 2024	Our work on diffusion-based object removal from images accepted to NeurIPS 2024 as a poster presentation.
Sep 2024	Our work on a GAN-based unified framework for domain adaptation, image synthesis and manipulation accepted to SIGGRAPH Asia 2024.
Mar 2024	Our work on sequential compositional generalization in multimodal models accepted to NAACL 2024 as an oral presentation.
Jan 2024	Our work on evaluating zero-shot linguistic and temporal understanding capabilities of video-language models accepted to ICLR 2024.
Aug 2023	Our work on 360-degree video saliency prediction accepted to BMVC 2023.
Jul 2023	Our paper on video editing was accepted to ICCV 2023: "VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs". Read more about our paper at our project website.
Jul 2023	Our work on StyleGAN and CLIP-based text-guided image manipulation has been accepted for publication in ACM Transactions on Graphics. We will be presenting our work at SIGGRAPH Asia 2023 in Sydney.
Jun 2023	I received a gift fund from Adobe Research to work on text-guided image synthesis and editing. I will be collaborating with Duygu Ceylan of Adobe Research and Erkut Erdem on this exciting project. Thanks, Adobe!
May 2023	We are organizing the first Workshop on Multimodal, Multilingual NLG (MM-NLG) at INLG/SIGDial 2023 in September 2023.
Apr 2023	The BIG-bench paper has been accepted for publication in Transactions on Machine Learning Research.
Feb 2023	Our work on omnidirectional image quality assessment got accepted to ICASSP 2023.
Nov 2022	We ranked 2nd in the euphemism detection shared task organized by the Figurative Language Processing workshop at EMNLP 2022.
Sep 2022	Our work on language-guided video manipulation accepted to BMVC 2022.
Jun 2022	Our work on language-guided image analysis received the best paper award at the 5th Multimodal Learning and Applications Workshop.
Feb 2022	I was appointed as an Associate Editor of IEEE Transactions on Image Processing (IEEE-TIP).

older items

Funding

Principal Investigator	TUBITAK 1001 The Support Program for Scientific and Technological Research Projects Award #120E501, 2021-2024 · project page
	TUBITAK 1003 Primary Subjects R&D Funding Program Award #116E685, 2017-2020 · project page
	TUBITAK 3501 Career Development Program Award #113E497, 2014-2017 · project page
Co-Investigator	TUBITAK 1007 Public Institutions Research Funding Program Award #114G028, 2016-2019 · project page
	TUBITAK 1001 The Support Program for Scientific and Technological Research Projects Award #113E116 and European Union under European Cooperation in Science and Technology (COST) Programme: ICT COST IC1037 Action, 2014-2017 · project page
	TUBITAK 3501 Career Development Program Award #112E146, 2012-2015 · project page
Donations	NVIDIA Academic Grant, 2026
	Adobe Research Gift, 2025
	Adobe Research Gift, 2023
	Adobe Research Gift, 2020
	NVIDIA Hardware Donation Grant (Tesla K40 GPU), 2016

Selected Publications

LAMP: Language-Assisted Motion Planning for Controllable Video Generation
Muhammed Burak Kizil, Enes Sanli, Niloy J. Mitra, Erkut Erdem, Aykut Erdem, Duygu Ceylan
CVPR 2026.
pdf · project page (with code)

Spherical Vision Transformers for Audio-Visual Saliency Prediction in 360° Videos
Mert Cokelek, Halit Ozsoy, Nevrez Imamoglu, Cagri Ozcinar, Inci Ayhan, Erkut Erdem, Aykut Erdem
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 48, Issue 1, pp. 329 - 345, January 2026.
pdf · project page (with code)

GaussianVideo: Efficient Video Representation via Hierarchical Gaussian Splatting
Andrew Bond, Jui-Hsien Wang, Long Mai, Erkut Erdem, Aykut Erdem
ICCV 2025.
pdf · project page (with code)

HyperGAN-CLIP: A Unified Framework for Domain Adaptation, Image Synthesis and Manipulation
Abdul Basit Anees, Ahmet Canberk Baykal, Muhammed Burak Kizil, Duygu Ceylan, Erkut Erdem, Aykut Erdem
SIGGRAPH Asia 2024 Conference Papers.
pdf · project page (with code)

CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models
Yigit Ekin, Ahmet Burak Yildirim, Erdem Eren Caglar, Aykut Erdem, Erkut Erdem, Aysegul Dundar
NeurIPS 2024.
pdf · project page (with code)

Sequential Compositional Generalization in Multimodal Models
Semih Yagcioglu, Osman Batur Ince, Aykut Erdem, Erkut Erdem, Desmond Elliott, Deniz Yuret
NAACL 2024.
pdf · project page (with code)

HyperE2VID: Improving Event-Based Video Reconstruction via Hypernetworks
Burak Ercan, Onur Eker, Canberk Saglam, Aykut Erdem, Erkut Erdem
IEEE Transactions on Image Processing, March 2024.
pdf · project page (with code)

ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models
Ilker Kesen, Andrea Pedrotti, Mustafa Dogan, Michele Cafagna, Emre Can Acikgoz, Letitia Parcalabescu, Iacer Calixto, Anette Frank, Albert Gatt, Aykut Erdem, Erkut Erdem
ICLR 2024.
pdf · project page (with code)

Harnessing Dataset Cartography for Improved Compositional Generalization in Transformers
Osman İnce, Tanin Zeraati, Semih Yagcioglu, Yadollah Yaghoobzadeh, Erkut Erdem, and Aykut Erdem
Findings of EMNLP 2023, Singapore, December 2023.
pdf · project page (with code)

VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs
Moayed Haji Ali, Andrew Bond, Tolga Birdal, Duygu Ceylan, Levent Karacan, Erkut Erdem, Aykut Erdem
ICCV 2023.
pdf · project page (with code)

CLIP-guided StyleGAN Inversion for Text-driven Real Image Editing
Ahmet Canberk Baykal, Abdul Basit Anees, Duygu Ceylan, Erkut Erdem, Aykut Erdem, Deniz Yuret
ACM Transactions on Graphics, Vol. 42, Issue 5, Article 172, August 2023.
pdf · project page (with data and code)

Burst Photography for Learning to Enhance Extremely Dark Images
Ahmet Serdar Karadeniz, Erkut Erdem, Aykut Erdem
IEEE Transactions on Image Processing, November 2021.
pdf · project page (with code)

SLAMP: Stochastic Latent Appearance and Motion Prediction
Adil Kaan Akan, Erkut Erdem, Aykut Erdem, Fatma Guney
ICCV 2021.
pdf · project page (with code)

mustGAN: multi-stream Generative Adversarial Networks for MR Image Synthesis
Mahmut Yurt, Salman UH Dar, Aykut Erdem, Erkut Erdem, Kader K. Oguz, Tolga Cukur
Medical Image Analysis, Vol. 70, May 2021.
pdf

Cross-lingual Visual Pre-training for Multimodal Machine Translation
Ozan Caglayan, Menekse Kuyu, Mustafa Sercan Amac, Pranava Madhyastha, Erkut Erdem, Aykut Erdem, Lucia Specia
EACL 2021.
pdf · project page (with data and code)

Belief Regulated Dual Propagation Nets for Learning Action Effects on Articulated Multi-Part Objects
Ahmet E. Tekden, Aykut Erdem, Erkut Erdem, Mert Imre, M. Yunus Seker and Emre Ugur
ICRA 2020.
pdf · video

Manipulating Attributes of Natural Scenes via Hallucination
Levent Karacan, Zeynep Akata, Aykut Erdem, Erkut Erdem
ACM Transactions on Graphics, Vol. 39, Issue 1, Article 7, February 2020.
pdf · project page (with data and code) · two minute papers

Procedural Reasoning Networks for Understanding Multimodal Procedures
Mustafa Sercan Amac, Semih Yagcioglu, Aykut Erdem, and Erkut Erdem
CoNLL 2019
pdf · project pages (with code)

Image Synthesis in Multi-Contrast MRI with Conditional Generative Adversarial Networks
Salman Ul Hassan Dar, Mahmut Yurt, Levent Karacan, Aykut Erdem, Erkut Erdem, Tolga Çukur
IEEE Trans. Med. Imag., Vol. 38, Issue 10, pp. 2375-2388, October 2019.
pdf

RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes
Semih Yagcioglu, Aykut Erdem, Erkut Erdem, and Nazli Ikizler-Cinbis
EMNLP 2018
pdf · project pages (with data and leaderboard)

Spatio-Temporal Saliency Networks for Dynamic Saliency Prediction
Cagdas Bak, Aysun Kocak, Erkut Erdem, Aykut Erdem
IEEE Trans. Multimed., Vol. 20, Issue 7, pp. 1688-1698, July 2018.
pdf

Re-evaluating Automatic Metrics for Image Captioning
Mert Kilickaya, Aykut Erdem, Nazli Ikizler-Cinbis, and Erkut Erdem
EACL 2017
pdf · slides

Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts
Levent Karacan, Zeynep Akata, Aykut Erdem, Erkut Erdem
arXiv preprint arXiv:1612.00215
pdf

Deformable Part-based Tracking by Coupled Global and Local Correlation Filters
Osman Akin, Erkut Erdem, Aykut Erdem, Krystian Mikolajczyk
J Vis Commun Image Represent 2016
pdf

TasvirEt: A Benchmark Dataset for Automatic Turkish Description Generation from Images
Mesut Erhan Unal, Begum Citamak, Semih Yagcioglu, Aykut Erdem, Erkut Erdem, Nazli Ikizler Cinbis, Ruket Cakici
SIU 2016
pdf (in Turkish) · project page · Turkish captions for Flickr8K dataset

An Objective Deghosting Quality Metric for HDR Images
Okan Tarhan Tursun, Ahmet Oguz Akyuz, Aykut Erdem, Erkut Erdem
EG 2016
pdf · project page (with code) · experiment page

Image Matting with KL-Divergence Based Sparse Sampling
Levent Karacan, Aykut Erdem, Erkut Erdem
ICCV 2015
pdf · supplementary material · project page (with code)

A Distributed Representation Based Query Expansion Approach for Image Captioning
Semih Yagcioglu, Erkut Erdem, Aykut Erdem and Ruket Cakici
ACL 2015
pdf · supplementary material · project page (with code)

Predicting Memorability of Images Using Attention-driven Spatial Pooling and Image Semantics
Bora Celikkale, Aykut Erdem and Erkut Erdem
IMAVIS 2015
pdf · project page

The State of the Art in HDR Deghosting: A Survey and Evaluation
Okan Tarhan Tursun, Ahmet Oguz Akyuz, Aykut Erdem and Erkut Erdem
EG STAR 2015
pdf · project page

Top down saliency estimation via superpixel-based discriminative dictionaries
Aysun Kocak, Kemal Cizmeciler, Aykut Erdem and Erkut Erdem
BMVC 2014
pdf · extended abstract · supplementary material · project page (with code)

Structure-Preserving Image Smoothing via Region Covariances
Levent Karacan, Erkut Erdem and Aykut Erdem
SIGGRAPH Asia 2013
pdf · project page (with code)

Visual saliency estimation by nonlinearly integrating features using region covariances
Erkut Erdem and Aykut Erdem
JOV 2013
pdf · project page (with code)

Graph Transduction as a Non-Cooperative Game
Aykut Erdem and Marcello Pelillo
Neural Comput 2012
pdf · code

all publications

My Erdős number is 3 (via 1. Arun Kumar Jagota, 2. Marcello Pelillo)

Associate Professor
KUIS AI Center
Department of Computer Engineering
Koç University

aerdem@ku.edu.tr
profile
@aykuterdemml

College of Engineering ENG 103
Koç University
Rumelifeneri Yolu
Istanbul, Turkey TR-34450
+90 (212) 338-0916

Office hours: Tue 10:00-11:00

Curriculum vitae · all publications

Teaching (2025-2026)

Fall	COMP201 · COMP541
Spring	COMP201 · COMP547

previous courses

Talks and travel

12 Dec	SIGGRAPH Asia 2023
7 May	ICLR 2024
3 Dec	SIGGRAPH Asia 2024
10 Dec	NeurIPS 2024
20 Dec	ROYAL Boun

a curated list of resources for grad students

PhD students
Deniz Bilge Akkoc
Andrew Yong Xern Bond
Hakan Capuk
Ismail Cetin
Muhammed Burak Kizil
Ali Kerem Ozturk
Enes Sanli

MSc students
Yusuf Bayindir
Doga Kukul
Merve Vural

Former Students
Burak Ercan (PhD) 2024
Aysun Kocak (PhD) 2023
Semih Yagcioglu (PhD) 2023
Kemal Cizmeciler (PhD) 2022
Bora Celikkale (PhD) 2020
Levent Karacan (PhD) 2019
Yasin Kavak (PhD) 2017
Osman Akin (PhD) 2016

Suleyman Yildirim (MSc) 2025
Emre Can Acikgoz (MSc) 2024
Osman Batur Ince (MSc) 2024
Burak Can Biner (MSc) 2023
Nafiseh Jabbari Tofighi (MSc) 2023
Ahmed Imam Shah (MSc) 2023
Mert Cokelek (MSc) 2023
Abdul Basit Anees (MSc) 2023
Ahmet Canberk Baykal (MSc) 2022
Tayfun Ates (MSc) 2020
Ahmet Serdar Karadeniz (MSc) 2020
Emre Boran (MSc) 2020
Menekse Kuyu (MSc) 2020
Begum Citamak (MSc) 2020
Mehmet Gunel (MSc) 2018
Goksu Erdogan (MSc) 2018
Burcak Asal (MSc) 2017
Cagdas Bak (MSc) 2016
Mert Kilickaya (MSc) 2016
Tugrul Erdogan (MSc) 2015

Our research is made possible through generous support by The Scientific and Technological Research Council of Turkey (TUBITAK) and NVIDIA Corporation.

Some books I've read and enjoyed

Page design courtesy of Michael Bernstein
