Rianne M. Schouten

Other important links: Google Scholar, LinkedIn, GitHub

Introduction

I develop data mining methods that discover and describe differences between individuals. People differ, and my methods aim to reveal exceptional, coherent and interpretable subgroups in a population of persons. For instance, in the medical context, we aim to discover subgroups of patients with exceptional responses to treatments. And in the educational domain, we discover subgroups of students with exceptional learning behaviour.

My methodological contributions stand out because I develop generic methods that have real-world impact.

Methods: Exceptional Model Mining | Local Pattern Mining | Subgroup Discovery | Subgroup Analysis

Application domains: EdTech | Personalized Learning Analytics | e-Health | Medicine

In sum

I obtained a personal NWO take-off grant, and work together with Turku Research Institute for Learning Analytics and AlgebraKit to investigate whether our data mining technology can be implemented in digital learning platforms.

Together with Erasmus Medical Center and Catharina Hospital Eindhoven, I unravel population heterogeneity and solve issues with missing values in longitudinal data. In the past, I worked with Netherlands Institute of Mental Health and Addiction, Hospital Twente and Utrecht Medical Center.

Before joining TU/e, I worked as a Missing Data Researcher and visited Prof. Andrew Gelman at the Department of Statistics at Columbia University in 2016. More recently, in 2025, I visited Prof. Barbara Hammer at Bielefeld University. Furthermore, in 2024, I served as Proceedings Chair at ECML PKDD and was recognized as an Excellent Reviewer.

I love interacting with students. I have supervised over 20 individual students and 15 groups of students, taught in 4 Master-level courses, and received two awards for Excellent Course Evaluations. I currently supervise 2 PhD candidates.

Affiliations

2025 - present: Post-doc at TU/e.

2020 - 2024: PhD candidate at TU/e.

Supervisors: Prof. Mykola Pechenizkiy, Dr. Wouter Duivesteijn.

Topic: Exceptional Model Mining for Hierarchical Data.

2017 - 2019: Missing Data Researcher at Utrecht University.

Supervisors: Prof. Stef van Buuren, Dr. Gerko Vink.

2018 - 2019: Developer Data & Analytics at a Youth-Care Organization.

Database engineering, Stakeholder communication, Small team lead.

2017: Data Science Consultant at DPA Professionals.

2016: Staff Associate of Professor Andrew Gelman at Columbia University in the City of New York, US. Duration: 2 months.

Awards and recognition

Recognition as Excellent reviewer Research Track ECML PKDD 2024
Award for Excellent course evaluation Research Topics in Data Mining 2022/2023
Award for Excellent course evaluation Research Topics in Data Mining 2021/2022
In top-3 Best Poster during NWO Commit2Data Day in 2021

Grants and funding

Year	Call	Type	Title	Together with
2024	Take-off Phase 1 NWO	€40k Feasibility study	Integration of Local Pattern Mining in Digital Assessment Tools	Prof. Mykola Pechenizkiy
2022	AI for Health EWUU Alliance	€45k seed money	Better Imputation by Generative Adversarial NeTworks (BIGANT)	Prof. Stef van Buuren, Dr. Gerko Vink, Hanne Oberman, Prof. Mykola Pechenizkiy, Prof. Cassio de Campos, Rianne Schouten, Daniel Oberski, Dr. Thomas Debray, Prof. Fred van Eeuwijk

Domain collaborations

Year	Organization	Type of collaboration
2025-now	AlgebraKit	Answering domain questions using EMM
2022-now	Turku Research Institute for Learning Analytics	Answering domain questions using EMM
2022-now	Dutch south west Psoriatic Arthritis Registry, Erasmus MC	Solving MD problems + Answering domain questions using EMM
2020-now	Biomedical Systems and Signals Group UTwente and Hospital Twente	Discovering exceptional blood glucose fluctuations
2021-2024	Netherlands Institute of Mental Health and Addiction	Answering domain questions using EMM
2017-2019	University Medical Center Utrecht	Statistics consultant

Research visits

Year	Where?	Who?	Why?	Duration
2025	Leuven University	Prof. Hendrik Blockeel, Prof. Jesse Davis	To discuss Pattern mining and Personalized recommendations	3 days, in August
2025	Ghent University	Prof. Tijl de Bie, Prof. Jefrey Lijffijt	To discuss Pattern mining and fairness	2 days, in August
2025	Bielefeld University	Prof. Barbara Hammer	To discuss Concept drift, XAI and User interaction	1 week
2025	University of Twente	Prof. Monique Tabak	To discuss E-health and Personalized recommendations	1 day
2024	Radboud University Nijmegen	Dr. Marcos L.P. Bueno	To provide a guest-lecture in the AI for Healthcare course	1 day
2016	Columbia University	Prof. Andrew Gelman	To collaborate on Evaluating missing data methods, together with Dr. Gerko Vink	2 months

Publications

See Google Scholar: >20 publications, >400 citations and h-index: 7.

van den Biggelaar, L., Schouten, R.M., de Bie, A., Bouwman, A., & Duivesteijn, W. (2025) Characterizing the Risk of Atrial Fibrillation in Cardiac Patients with Exceptional Electrocardiogram Phenotypes. Accepted for publication at KDD25.
Schouten, R.M. (2025) Exceptional Model Mining for Hierarchical Data. PhD thesis.
Schouten, R.M. (2024) On the role of prognostic factors and effect modifiers in structural causal models. Accepted for presentation at Causal Representation Learning Workshop NeurIPS. See here for the paper!
Van den Berg, N. T., Broekgaarden, B. O., Mahieu Dionysia, P., Martens, J. G., Niederle, J., Schouten, R.M., & Duivesteijn, W. (2024) Generating MNAR missingness in image data, with additional evaluation of MisGAN. Accepted for presentation at BNAIC/BeNeLearn 2024. See here for the presentation.
Schouten, R.M., Stevens, G.W.J.M., van Dorsselaer, S.A.F.M., Duinhof, E.L., Monshouwer, K., Pechenizkiy, M. & Duivesteijn, W. (2024) Analyzing the interplay between societal trends and socio-demographic variables with local pattern mining: Discovering exceptional trends in adolescent alcohol use in the Netherlands. To appear in post-proceedings BNAIC/BeNeLearn 2024.
Schouten, R.M., Duivesteijn, W., Rasanen, P., Paul, J.M., & Pechenizkiy, M. (2024) Exceptional Subitizing Patterns: Exploring Mathematical Abilities of Finnish Primary School Children with Piecewise Linear Regression. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), pp. 66-82.
Schouten, R.M., Tascau, V., Ziegler, G.G., Casano, D., Ardizonne, M., & Erotokritou M.A. (2023) Dropping incomplete records is (not so) straightforward. In: Proceedings of the 21st International Symposium on Intelligent Data Analysis (IDA), pp. 379-391.
Verhaegh, R.F.A., Kiezebrink, J.J.E., Nusteling, F., Rio, A.W.A, Bendicsek, M.B., Duivesteijn, W. & Schouten, R.M. (2022) A Clustering-inspired Quality Measure for Exceptional Preferences Mining — Design Choices and Consequences. In: Proceedings of the International Conference on Discovery Science (DS), pp. 429–444.
Schouten, R.M., Zamanzadeh, D. & Singh, P. (2022) Pyampute: a Python library for data amputation. Zenodo. https://doi.org/10.25080/majora-212e5952-03e.
Van der Haar, J.F., Nagelkerken, S.C., Smit, I.G., van Straaten, K., Tack, J.A., Schouten, R.M. & Duivesteijn, W. (2022) Efficient Subgroup Discovery Through Auto-Encoding. In: Proceedings of the 20th International Symposium on Intelligent Data Analysis (IDA), pp. 327-340.
Schouten, R.M., Duivesteijn, W. & Pechenizkiy, M. (2022) Exceptional Model Mining for Repeated Cross-Sectional Data (EMM-RCS). In: Proceedings of SIAM International Conference on Data Mining (SDM), pp. 585-593.
Schouten, R.M., Bueno, M.L.P., Duivesteijn, W. & Pechenizkiy, M. (2022) Mining Sequences with Exceptional Transition Behaviour of Varying Order using Quality Measures based on Information-Theoretic Scoring Functions. Data Mining and Knowledge Discovery (DAMI), 36: 379-413.
IJsselhof R, Duchateau S, Schouten R.M., Slieker M, Hazekamp M & Schoof P. (2020) Long-Term Follow-Up of Pericardium for the Ventricular Component in Atrioventricular Septal Defect Repair. World Journal for Pediatric and Congenital Heart Surgery, 11(6): 742-747.
IJsselhof R.J., Duchateau S.D.R., Schouten R.M., Freund, M.W., Heuser, J., Fejzic, Z., Haas, F., Schoof, P.H. & Slieker, M.G. (2019) Follow-up After Biventricular Repair of the Hypoplastic Left Heart Complex. European Journal of Cardiothoracic Surgery, 57(4): 644-651.
Schouten R.M., Lugtig, P. & Vink, G. (2018) Generating missing values for simulation purposes: A multivariate amputation procedure. Journal of Statistical Computation and Simulation, 88(15): 1909-1930.
Schouten, R.M. and Vink, G. (2021) The dance of the mechanisms: How observed information influences the validity of missingness assumptions. Sociological Methods & Research, 50(3): 1243-1258.
Kappen, I.F.P.M., Bittermann, G.K.P., Schouten, R.M., Bittermann, D., Etty, E., Koole, R., Kon, M., Van der Molen, M. & Breugem, C.C. (2017) Long-term mid-facial growth of patients with a unilateral complete cleft of lip, alveolus and palate treated by two-stage palatoplasty: cephalometric analysis. Clinical Oral Investigations, 21: 1801-1810.
de Vries, C.P., Schouten, R.M., Van der Kuur, J., Gottardi, L., & Akamatsu, H. (2016) Microcalorimeter pulse analysis by means of principle component decomposition. In: Proceedings SPIE 9905, Space Telescopes and Instrumentation 2016: Ultraviolet to Gamma Ray, 99055v. DOI: 10.1117/12.2231627

Teaching

At Eindhoven University of Technology, my teaching track record is:

Year	Course	Level	Activities
24/25	Research Topics in Data Mining	MSc	Responsible for track: Empirical Challenges in Data Mining
22/23	Research Topics in Data Mining	MSc	Taught 2 lectures about Missing Data, supervised 4 groups of students during research project
21/22	Research Topics in Data Mining	MSc	Taught 2 lectures about Missing Data, supervised 7 groups of students during research project
20/21	Foundations of Data Mining	MSc	Taught 2 lectures about Missing Data, provided answers in weekly Q&A, developed practice questions for exam, administrative activities

“I took a lot of courses last year, but I like your instructions the most. It is not only because of your professional knowledge, but also because of your personality of being kind, patient, responsible.” (Jin Ouyang, Master student, 2022)

At Utrecht University, my teaching track record is:

Year	Course	Level	Activities
2016	Survey Research: Statistical Analysis and Estimation	Advanced MSc	Organization, Tutoring exercise classes
2016	Survey Research: Design, Implementation and Data Processing	Advanced MSc	Organization, Tutoring exercise classes
2015	Survey Research: Statistical Analysis and Estimation	Advanced MSc	Organization, Tutoring exercise classes
2015	Survey Research: Design, Implementation and Data Processing	Advanced MSc	Organization, Tutoring exercise classes

“Rianne was a first class assistant at our summer school courses. Not only was all material prepared extremely punctual and without errors, she also got very high student evaluations. I can wholeheartedly recommend Rianne!” (Prof. dr. Edith de Leeuw, 2016)

Supervision

Current students:

Start	Student	Type	Topic
May 2025	Abdullahi Farah	MSc Thesis	Numerical Optimization for EMM
Aug 2024	Lieke van den Biggelaar	PhD Research	Exceptional Model Mining with Time Series Data
Nov 2023	Emmanuel C. Chukwu	PhD Research	Counterfactual Explanations in Time Series Classification

Alumni:

Year	Student	Type	Topic
24/25	Haoqi Guo	MSc Thesis	Improving Diversity and Feasibility of Counterfactual Explanations, supervision together with prof.dr. Mykola Pechenizkiy
23/24	Lieke van den Biggelaar	MSc Thesis with Catharina Hospital	Discovering subgroups of patients with exceptional Atrium Fibrillation based on ECGs, supervision together with dr. Wouter Duivesteijn
22/23	Victoria Tascau	MSc Thesis with DEPAR/Erasmus Medical Centrum	Handling Missing Values in Longitudinal Medical Data, supervision together with dr. Wouter Duivesteijn
21/22	Mika van Loon	MSc Thesis	Bootstrap Hypothesis Tests for Evaluating Subgroup Descriptions in Exceptional Model Mining, supervision together with dr. Wouter Duivesteijn
21/22	Varun Kamat	Internship at Signify	Recommendation Tool for Component Database
21/22	Isabel van den Heuvel	Research Proposal	Equivalence Testing for Developing Fair Machine Learning Algorithms, supervision together with Hilde Weerts
21/22	Mats Verbraak	Research Proposal	Handling Missing Data in the Prediction Domain using Multiple Imputation
20/21	Bart van Dooren	MSc Thesis with Philips	Predicting Cardiovascular Risk with Objective Physical Activity Measurements, supervision together with prof.dr. Mykola Pechenizkiy

Community service

I reviewed >15 papers for top-level data mining conferences and statistical journals (DAMI, ECML PKDD, EWAF, JRSSB, SiM, BimJ). I was recognized as an Excellent reviewer at ECML PKDD 2024. I served as Proceedings Chair at ECML PKDD 2024. At EWAF 2025, I was a Session chair.

Software development

1. `ampute` in R-package mice

library(mice)
?ampute

R-function ampute is the implementation of a multivariate amputation procedure: a method for generating missing data in complete datasets. With ampute, it is straightforward to generate missing values in multiple variables, with different missing data proportions and varying underlying missingness mechanisms. Read the article or the vignette to learn more.
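To give a feel for what amputation means, here is a minimal Python sketch of a single MAR (missing at random) mechanism. It is a simplified illustration and not the implementation in mice: the function name `ampute_mar`, its parameters and the logistic scoring are assumptions made for this example only. Values in one column are removed with a probability that increases with the value of another, fully observed column.

```python
import math
import random


def ampute_mar(rows, target, predictor, prop=0.3, seed=0):
    """Return a copy of rows in which rows[i][target] is set to None with a
    probability that rises with rows[i][predictor] (hypothetical helper)."""
    rng = random.Random(seed)
    values = [row[predictor] for row in rows]
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0

    # Logistic function of the standardized predictor: higher values of the
    # predictor give a higher raw probability of becoming missing.
    raw = [1.0 / (1.0 + math.exp(-(v - mean) / sd)) for v in values]

    # Rescale so that the expected proportion of missing values is about prop.
    scale = prop * len(raw) / sum(raw)

    amputed_rows = []
    for row, p in zip(rows, raw):
        new_row = list(row)  # leave the complete data untouched
        if rng.random() < min(1.0, p * scale):
            new_row[target] = None
        amputed_rows.append(new_row)
    return amputed_rows


# Simulated complete dataset with two variables.
data_rng = random.Random(42)
data = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(2000)]

# Make column 1 incomplete, driven by the observed values in column 0.
amputed = ampute_mar(data, target=1, predictor=0, prop=0.3, seed=1)
```

Because the missingness depends only on a variable that stays observed, this is a MAR mechanism. Letting the probability depend on the target column itself would turn it into MNAR.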

2. `parlMICE`

For large datasets, or when you want a large number of imputations, multiple imputation with function mice in R-package mice may have a long run time. As a solution, Gerko Vink and I created the wrapper function parlMICE, which runs mice in parallel.

The function is now part of package mice under the name parlmice.

library(mice)
?parlmice

All information can be found in the GitHub repo or in the vignette.
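The general idea of running imputations in parallel can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not how parlmice is implemented: the helper names are hypothetical, the imputation is a simple random draw from the observed values instead of mice, and threads are used only to keep the sketch self-contained (CPU-bound work would normally be spread over processes or cores).

```python
import random
from concurrent.futures import ThreadPoolExecutor


def impute_once(values, seed):
    """Fill each None with a random draw from the observed values."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(observed) for v in values]


def impute_chunk(values, seeds):
    """One worker's share of the job: one completed dataset per seed."""
    return [impute_once(values, s) for s in seeds]


def parallel_impute(values, m=8, n_workers=4, base_seed=123):
    """Split m imputations over n_workers and merge the results."""
    # Every imputation gets its own seed so that the workers produce
    # different, yet reproducible, completed datasets.
    seeds = [base_seed + i for i in range(m)]
    chunks = [seeds[w::n_workers] for w in range(n_workers)]
    chunks = [c for c in chunks if c]  # drop empty shares when m < n_workers
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda c: impute_chunk(values, c), chunks))
    # Flatten the per-worker lists into one list of m completed datasets.
    return [dataset for chunk in results for dataset in chunk]


incomplete = [1.0, None, 3.0, 4.0, None, 6.0]
completed = parallel_impute(incomplete, m=8, n_workers=4)
```

The point to take away is the bookkeeping: the requested number of imputations is divided over the workers, each worker uses its own seeds, and the partial results are combined into one set of completed datasets.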

3. `pyampute`: the first Python library for data amputation

Library pyampute brings the multivariate amputation methodology to the Python community, and it does more. It has improved default settings and allows for a combined MAR+MNAR mechanism and for custom probability functions. Because it is compatible with scikit-learn’s fit and transform paradigm, it integrates seamlessly into data processing pipelines.

Find plenty of examples in pyampute’s documentation. Davina’s presentation at SciPy22 can be found here.
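The sketch below illustrates, in plain Python, what a combined MAR+MNAR mechanism with a custom probability function could look like behind a fit and transform interface. It is a hypothetical stand-in and not pyampute's API: the class `SimpleAmputer`, its parameters and the step-shaped probability function are assumptions made for this example. Please refer to pyampute's documentation for the actual classes and arguments.

```python
import math
import random


def sigmoid(z):
    """Default mapping from a weighted sum score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


class SimpleAmputer:
    """Hypothetical amputer with a scikit-learn style fit/transform interface."""

    def __init__(self, target, weights, prob_func=sigmoid, seed=0):
        self.target = target        # index of the column that becomes incomplete
        self.weights = weights      # {column index: weight in the sum score}
        self.prob_func = prob_func  # maps a sum score to a probability
        self.seed = seed

    def fit(self, X):
        # Learn mean and standard deviation of every weighted column,
        # so that the sum scores are computed on standardized values.
        self.stats_ = {}
        for col in self.weights:
            vals = [row[col] for row in X]
            mean = sum(vals) / len(vals)
            sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals)) or 1.0
            self.stats_[col] = (mean, sd)
        return self

    def transform(self, X):
        rng = random.Random(self.seed)
        amputed_rows = []
        for row in X:
            score = sum(
                w * (row[c] - self.stats_[c][0]) / self.stats_[c][1]
                for c, w in self.weights.items()
            )
            new_row = list(row)  # do not modify the input data
            if rng.random() < self.prob_func(score):
                new_row[self.target] = None
            amputed_rows.append(new_row)
        return amputed_rows

    def fit_transform(self, X):
        return self.fit(X).transform(X)


# Simulated complete dataset with two variables.
data_rng = random.Random(7)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(1000)]

# Column 1 becomes incomplete. Its missingness depends on column 0 (MAR part)
# and on column 1 itself (MNAR part), through a custom step function.
amputer = SimpleAmputer(
    target=1,
    weights={0: 1.0, 1: 1.0},
    prob_func=lambda z: 0.8 if z > 0 else 0.1,
    seed=1,
)
X_incomplete = amputer.fit_transform(X)
```

Separating `fit` (learning the standardization from the data) from `transform` (applying the mechanism) is what lets such an object be placed as a step in a data processing pipeline.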

Install using pip or from source:

pip install pyampute

git clone https://github.com/RianneSchouten/pyampute.git
pip install ./pyampute

Invited presentations

Year	Occasion	Type	Topic	Link to materials
2025	Research visit @ Bielefeld University	Colloquium talk	Exceptional Model Mining for Hierarchical Data
2025	Info-topic about my PhD research	News	Exceptional Model Mining for Hierarchical Data	link
2024	AI for Healthcare	Invited guest-lecture at Radboud University Nijmegen	Exceptional Model Mining
2021	EAISI Eindhoven	Invited presentation	Towards a better understanding of exceptional lifestyle behaviour
2019	Workshop R-Ladies Amsterdam	Invited presentation	Developed and presented a workshop about analysis of missing values, evaluation and implementation of missing data methods	link
2018	ICT Open	Contribution to 1-day conference	Handling Missing Data in Data Science	link
2018	European Women in Technology	Masterclass at conference	Dealing with missing data in `R`: Amputation or Imputation?	presentation and exercises
2018	sat-R-Day	Contribution to 1-day conference	Missing data	link
2018	Data Science Hackathon	By invitation	Developed and led a missing data challenge	link
2017	Amst-R-Dam	By invitation	How to use R-function `ampute` to generate missing values in complete datasets	article and documentation
2017	UseR!2017	Contribution to conference	Introduction to multivariate amputation with `ampute`