See here for complete Curriculum Vitae

Google Scholar

LinkedIn

GitHub

E-mail: r.m.schouten@tue.nl

Please contact me for my research and teaching statements.


Introduction

My research revolves around developing Local Pattern Mining methods (LPMs) that extract societally relevant, interpretable patterns from data. In particular, I focus on methods that extract reliable and robust patterns from data that do not follow the conventional row-by-column, flat-table format, such as sequential and hierarchical data.

I enjoy working in inter- and multidisciplinary teams. In some collaborations, domain experts consult me to help make their analyses statistically sound and trustworthy; their problems generally relate to missing data. In other collaborations, I further develop pattern mining techniques that enable domain experts to analyze variation in human behavior. For instance, together with medical experts, I discovered subgroups of patients with deviating blood glucose fluctuations, and together with policy advisers, I discovered subgroups of adolescents with deviating trends in alcohol use.

In my teaching and supervision, I aim to support students in becoming independent learners. My style connects well with the challenge-based learning paradigm, where the task of the teacher is to guide students in taking a structured approach to problem solving and in thinking critically.


Affiliations

2020 - Present: Ph.D. Candidate at Eindhoven University of Technology, the Netherlands. Under supervision of prof. dr. Mykola Pechenizkiy and dr. Wouter Duivesteijn. Dissertation is submitted. Ph.D. defense scheduled for 16 Jan. 2025. Topic: Exceptional Model Mining for Hierarchical Data.

2017 - 2019: Researcher at Utrecht University, the Netherlands. Supervised by prof. dr. Stef van Buuren and dr. Gerko Vink.

Autumn 2016: Staff Associate of Professor Andrew Gelman at Columbia University in the City of New York, US.

Spring 2015: Intern at SRON, Dutch Institute of Space Research, the Netherlands.


Publications

See Google Scholar for number of citations.


Awards and recognition

  • Recognition as Excellent reviewer Research Track ECML PKDD 2024

  • Award for Excellent course evaluation Research Topics in Data Mining 2022/2023

  • Award for Excellent course evaluation Research Topics in Data Mining 2021/2022

  • Performance bonus for 2021

“Beside your overall excellent performance in your PhD research and EDIC project execution, you did an excellent job in project management of EDIC, and in setting up new successful collaborations. You helped a lot with the Research Topics in Data Mining course, and supervision of students.” (Prof. dr. Mykola Pechenizkiy)


Grants and funding

Year | Call | Type | Title | Together with
2022 | AI for Health EWUU Alliance | €45k seed money | Better Imputation by Generative Adversarial NeTworks (BIGANT) | prof. dr. Stef van Buuren, dr. Gerko Vink, Hanne Oberman, prof. dr. Mykola Pechenizkiy, prof. dr. Cassio de Campos, Rianne Schouten, prof. dr. Daniel Oberski, dr. Thomas Debray, prof. dr. Fred van Eeuwijk
2024 | ECML PKDD | €500 reimbursement of entrance ticket | Proceedings Chair | dr. Wouter Duivesteijn

Ongoing: reached interview phase, scheduled for Nov. 2024:

Year | Call | Type | Title | Together with
2024 | Take-off Phase 1 NWO | €40k Feasibility study | Integration of Local Pattern Mining in Digital Assessment Tools | prof. dr. Mykola Pechenizkiy

Teaching

To further develop myself as a teacher, I am participating in the University Teaching Qualification (UTQ) Training Program.

At Eindhoven University of Technology, my teaching track record is:

Year | Course | Level | Activities
20/21 | Foundations of Data Mining | MSc | Taught 2 lectures about Missing Data, provided answers in weekly Q&A, developed practice questions for exam, administrative activities
21/22 | Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 7 groups of students during research project
22/23 | Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 4 groups of students during research project
24/25 | Research Topics in Data Mining | MSc | Responsible for track: Empirical Challenges in Data Mining

“I took a lot of courses last year, but I like your instructions the most. It is not only because of your professional knowledge, but also because of your personality of being kind, patient, responsible.” (Jin Ouyang, Master student, 2022)

At Utrecht University, my teaching track record is:

Year | Course | Level | Activities
2015 | Survey Research: Design, Implementation and Data Processing | Advanced MSc | Organization, tutoring exercise classes
2015 | Survey Research: Statistical Analysis and Estimation | Advanced MSc | Organization, tutoring exercise classes
2016 | Survey Research: Design, Implementation and Data Processing | Advanced MSc | Organization, tutoring exercise classes
2016 | Survey Research: Statistical Analysis and Estimation | Advanced MSc | Organization, tutoring exercise classes

“Rianne was a first class assistant at our summer school courses. Not only was all material prepared extremely punctual and without errors, she also got very high student evaluations. I can wholeheartedly recommend Rianne!” (Prof. dr. Edith de Leeuw, 2016)


Supervision

These students completed their projects, (partly) under my supervision, many with high grades.

Year | Student | Type | Topic
20/21 | Bart van Dooren | MSc Thesis with Philips | Predicting Cardiovascular Risk with Objective Physical Activity Measurements, supervision together with prof. dr. Mykola Pechenizkiy
21/22 | Mats Verbraak | Research Proposal | Handling Missing Data in the Prediction Domain using Multiple Imputation
21/22 | Isabel van den Heuvel | Research Proposal | Equivalence Testing for Developing Fair Machine Learning Algorithms, supervision together with Hilde Weerts
21/22 | Varun Kamat | Internship at Signify | Recommendation Tool for Component Database
21/22 | Mika van Loon | MSc Thesis | Bootstrap Hypothesis Tests for Evaluating Subgroup Descriptions in Exceptional Model Mining, supervision together with dr. Wouter Duivesteijn
22/23 | Victoria Tascau | MSc Thesis with DEPAR/Erasmus Medical Centrum | Handling Missing Values in Longitudinal Medical Data, supervision together with dr. Wouter Duivesteijn
23/24 | Lieke van den Biggelaar | MSc Thesis with Catharina Hospital | Discovering subgroups of patients with exceptional Atrium Fibrillation based on ECGs, supervision together with dr. Wouter Duivesteijn

These students are currently working on their projects (partly) under my supervision:

Year | Student | Type | Topic
23/24 | Bart Slenders | MSc Thesis | Beam Pollution in Exceptional Model Mining, supervision together with dr. Wouter Duivesteijn
24/25 | Haoqi Guo | MSc Thesis | Visualization of Counterfactual Explanations, supervision together with prof. dr. Mykola Pechenizkiy

Reviewing activities

Reviewing for journals and conferences such as Data Mining and Knowledge Discovery (DAMI), Machine Learning (ML), ECML PKDD, Journal of Statistical Society, and more.

I was recognized as an excellent reviewer at ECML PKDD 2024.


Software development

1. ampute in R-package mice

library(mice)
?ampute

R-function ampute is the implementation of a multivariate amputation procedure: a method for generating missing data in complete datasets. With ampute, it is straightforward to generate missing values in multiple variables, with different missing data proportions and varying underlying missingness mechanisms. Read the article or the vignette to learn more.

2. parlMICE

For large datasets, or when you want a large number of imputations, multiple imputation with function mice in R-package mice may take a long time to run. As a solution, Gerko Vink and I created the wrapper function parlMICE, which runs mice in parallel.

The function is now part of package mice under the name parlmice.

library(mice)
?parlmice

All information can be found in the GitHub repo or in the vignette.

3. pyampute: the first Python library for data amputation

Library pyampute provides the multivariate amputation methodology to the Python community, and it does more. It has improved default settings and allows for a combined MAR+MNAR mechanism and for custom probability functions. Because it is compatible with scikit-learn’s fit and transform paradigm, it integrates seamlessly into data processing pipelines.

Find plenty of examples in pyampute’s documentation. Davina’s presentation at SciPy22 can be found here.

Install using pip or from source:

pip install pyampute
git clone https://github.com/RianneSchouten/pyampute.git
pip install ./pyampute
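
To give a feeling for what amputation means, the sketch below generates missing values in one column of a complete dataset with plain Python, without any dependencies. It is only an illustration of a single MAR mechanism, not how ampute or pyampute are implemented: the function name ampute_mar, its parameters, and the simulated data are all assumptions made for this example.

```python
import math
import random


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ampute_mar(rows, target, predictor, prop=0.5, seed=0):
    """Return a copy of `rows` in which values in column `target` are set to None.

    The probability of becoming missing increases with the standardized value
    in column `predictor`, which itself stays fully observed (a MAR mechanism).
    Hypothetical helper for illustration; not part of mice or pyampute.
    """
    rng = random.Random(seed)
    values = [row[predictor] for row in rows]
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
    scores = [(v - mean) / sd for v in values]

    # Bisection on the shift of the logistic curve, so that the average
    # missingness probability is approximately `prop`.
    low, high = -20.0, 20.0
    for _ in range(60):
        shift = (low + high) / 2.0
        if sum(_sigmoid(s + shift) for s in scores) / len(scores) < prop:
            low = shift
        else:
            high = shift

    amputed = []
    for row, score in zip(rows, scores):
        new_row = list(row)  # copy, so the complete data stay untouched
        if rng.random() < _sigmoid(score + shift):
            new_row[target] = None
        amputed.append(new_row)
    return amputed


# Simulated complete dataset (assumption): 1000 rows, two standard normal columns.
data_rng = random.Random(42)
complete = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(1000)]
incomplete = ampute_mar(complete, target=0, predictor=1, prop=0.3)

n_missing = sum(row[0] is None for row in incomplete)
print(f"{n_missing} of {len(incomplete)} values in column 0 were made missing")
```

Rows with a high value in column 1 are more likely to lose their value in column 0, so an analysis of the observed values only would be based on a selective subset. The libraries generalize this idea to multiple missing data patterns, weighted combinations of variables, and other mechanisms.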

Presentations (other than conference presentations)

Year | Occasion | Type | Topic | Link to materials
2024 | Course AI for Health, at Radboud University Nijmegen | Invited guest-lecture | Exceptional Model Mining | to appear
2021 | Neglected Assumptions in Causal Inference (NACI) workshop at ICML | Contribution | Understanding the Role of Prognostic Factors and Effect Modifiers in Heterogeneity of Treatment Effect using a Within-Subjects Analysis of Variance | link
2021 | EAISI Eindhoven | Contribution to seminar | Towards a better understanding of exceptional lifestyle behaviour |
2021 | ECML PKDD | Poster at conference | Mining Sequences with Exceptional Transition Behaviour of Varying Order | link
2019 | Workshop R-Ladies Amsterdam | Invited presentation at seminar | Developed and presented a workshop about analysis of missing values, evaluation and implementation of missing data methods | link
2018 | ICT Open | Contribution to 1-day conference | Handling Missing Data in Data Science | link
2018 | European Women in Technology | Masterclass at conference | Dealing with missing data in R: Amputation or Imputation? | presentation and exercises
2018 | sat-R-Day | Contribution to 1-day conference | Missing data | link
2018 | Data Science Hackathon | By invitation | Developed and led a missing data challenge | link
2017 | Amst-R-Dam | Contribution to seminar | How to use R-function ampute to generate missing values in complete datasets | article and documentation
2017 | UseR!2017 | Contribution to conference | Introduction to multivariate amputation with ampute |