Introduction and affiliations

My name is Rianne and I develop data analysis methods. I am a PhD Candidate at Eindhoven University of Technology and study the Exceptional Model Mining framework. We work on developing solutions for heterogeneous datasets with time-dependent attributes. My supervisors are dr. Wouter Duivesteijn and prof. Mykola Pechenizkiy.

Additionally, I work on Missing Data methods, particularly the process of generating missing values in complete datasets. Under supervision of prof. Stef van Buuren and dr. Gerko Vink, I developed a multivariate amputation procedure and implemented the method in an R-function (ampute in package mice) and a Python module (pyampute).

I find it important that my work can be used for real-world problems (e.g. in health care) and enjoy collaborating on projects. Please contact me if you have questions or seek opportunities to work together.


Publications


Awards and recognition

  • Recognition as Excellent reviewer Research Track ECML PKDD 2024

  • Award for Excellent course evaluation Research Topics in Data Mining 2022/2023

  • Award for Excellent course evaluation Research Topics in Data Mining 2021/2022

  • Performance bonus for 2021

“Beside your overall excellent performance in your PhD research and EDIC project execution, you did an excellent job in project management of EDIC, and in setting up new successful collaborations. You helped a lot with the Research Topics in Data Mining course, and supervision of students.”


Teaching activities

I teach or have taught in the following courses:

Year | Course | Level | Activities
20/21 | 2AMM20 Foundations of Data Mining | MSc | Taught 2 lectures about Missing Data, provided answers in weekly Q&A, developed practice questions for exam, administrative activities
21/22 | 2AMM20 Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 7 groups of students during research project
22/23 | 2AMM20 Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 4 groups of students during research project
24/25 | 2AMM20 Research Topics in Data Mining | MSc | Responsible for track: Empirical Challenges in Data Mining

During my time at UU, I taught lectures in two Summer School courses in 2015 and 2016. Both courses are at an Advanced Master level: Survey Research: Design, Implementation and Data Processing, and Survey Research: Statistical Analysis and Estimation.


Supervision activities

Year | Student | Type | Topic
20/21 | Bart van Dooren | MSc Thesis with Philips | Predicting Cardiovascular Risk with Objective Physical Activity Measurements, supervision together with prof.dr. Mykola Pechenizkiy
21/22 | Mats Verbraak | Research Proposal | Handling Missing Data in the Prediction Domain using Multiple Imputation
21/22 | Isabel van den Heuvel | Research Proposal | Equivalence Testing for Developing Fair Machine Learning Algorithms, supervision together with Hilde Weerts
21/22 | Varun Kamat | Internship at Signify | Recommendation Tool for Component Database
21/22 | Mika van Loon | MSc Thesis | Bootstrap Hypothesis Tests for Evaluating Subgroup Descriptions in Exceptional Model Mining, supervision together with dr. Wouter Duivesteijn
22/23 | Victoria Tascau | MSc Thesis with DEPAR/Erasmus Medical Centrum | Handling Missing Values in Longitudinal Medical Data, supervision together with dr. Wouter Duivesteijn
23/24 | Lieke van den Biggelaar | MSc Thesis with Catharina Hospital | Discovering subgroups of patients with exceptional Atrium Fibrillation based on ECGs, supervision together with dr. Wouter Duivesteijn
23/24 | Bart Slenders | MSc Thesis | Beam Pollution in Exceptional Model Mining, supervision together with dr. Wouter Duivesteijn
24/25 | Haoqi Guo | MSc Thesis | Visualization of Counterfactual Explanations, supervision together with prof.dr. Mykola Pechenizkiy

Grants and funding

Year | Call | Type | Title | Together with
2022 | AI for Health EWUU Alliance | EUR 45k seed money | Better Imputation by Generative Adversarial NeTworks (BIGANT) | prof.dr. Stef van Buuren, dr. Gerko Vink, Hanne Oberman, prof.dr. Mykola Pechenizkiy, prof.dr. Cassio de Campos, Rianne Schouten, prof.dr. Daniel Oberski, dr. Thomas Debray, prof.dr. Fred van Eeuwijk
2024 | ECML PKDD | EUR 500 reimbursement of entrance ticket | Proceedings Chair | dr. Wouter Duivesteijn

Reviewing activities

I review for journals and conferences such as Data Mining and Knowledge Discovery (DAMI), Machine Learning (ML), ECML PKDD, Journal of Statistical Society, and more.

I was recognized as an excellent reviewer at ECML PKDD 2024.


Software development

1. ampute in R-package mice

library(mice)
?ampute

R-function ampute is the implementation of a multivariate amputation procedure: a method for generating missing data in complete datasets. With ampute, it is straightforward to generate missing values in multiple variables, with different missing data proportions and varying underlying missingness mechanisms. Read the article or the vignette to learn more.
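To give a feel for what a missingness mechanism does, here is a minimal NumPy sketch of a Missing At Random (MAR) amputation. It is an illustration under simplifying assumptions, not the ampute implementation: the helper name `ampute_mar`, the equal weights, the logistic probability function and the bisection step used to hit the target proportion are all choices made for this example.

```python
import numpy as np


def ampute_mar(X, incomplete_cols, prop=0.3, seed=0):
    """Toy MAR amputation (hypothetical helper, not the ampute/pyampute code).

    Rows with a high weighted sum score on the observed columns are more
    likely to lose their values in `incomplete_cols`.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    observed_cols = [j for j in range(X.shape[1]) if j not in incomplete_cols]

    # Weighted sum score: equal weights on the standardized observed columns.
    obs = X[:, observed_cols]
    scores = ((obs - obs.mean(axis=0)) / obs.std(axis=0)).sum(axis=1)
    scores = (scores - scores.mean()) / scores.std()

    # Logistic probabilities; bisect on a shift so that their mean equals prop.
    lo, hi = -20.0, 20.0
    for _ in range(60):
        shift = (lo + hi) / 2
        probs = 1.0 / (1.0 + np.exp(-(scores + shift)))
        if probs.mean() < prop:
            lo = shift
        else:
            hi = shift

    # Draw which rows become incomplete and blank out the chosen columns.
    rows = np.flatnonzero(rng.random(X.shape[0]) < probs)
    X_amp = X.copy()
    X_amp[np.ix_(rows, incomplete_cols)] = np.nan
    return X_amp


# Example: make column 0 incomplete, depending on columns 1 and 2.
X = np.random.default_rng(42).normal(size=(2000, 3))
X_amp = ampute_mar(X, incomplete_cols=[0], prop=0.3)
print("proportion of incomplete rows:", np.isnan(X_amp[:, 0]).mean())
```

Because the missingness probability depends only on columns that stay observed, the mechanism is MAR. Letting the score depend on the amputated column itself would give an MNAR mechanism instead.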

2. parlMICE

For large datasets or when you want to impute with a large number of imputations, multiple imputation with mice in R-package mice may have a long run time. As a solution, Gerko Vink and I created wrapper function parlMICE, which allows for a parallel run of mice.

The function is now part of package mice under the name parlmice.

library(mice)
?parlmice

All information can be found in the github repo or in the vignette.

3. pyampute: the first Python library for data amputation

Library pyampute brings the multivariate amputation methodology to the Python community and extends it. It has improved default settings and allows for a combined MAR+MNAR mechanism and for custom probability functions. Because it is compatible with scikit-learn’s fit and transform paradigm, it integrates easily into data processing pipelines.

Find plenty of examples in pyampute’s documentation. Davina’s presentation at SciPy22 can be found here.

Install using pip or from source:

pip install pyampute
git clone https://github.com/RianneSchouten/pyampute.git
pip install ./pyampute

Presentations

Year | Occasion | Type | Topic | Link to materials
2024 | Course AI for Health, at Radboud University Nijmegen | Invited guest-lecture | Exceptional Model Mining | to appear
2021 | Neglected Assumptions in Causal Inference (NACI) workshop at ICML | Contribution | Understanding the Role of Prognostic Factors and Effect Modifiers in Heterogeneity of Treatment Effect using a Within-Subjects Analysis of Variance | link
2021 | EAISI Eindhoven | Contribution to seminar | Towards a better understanding of exceptional lifestyle behaviour |
2021 | ECMLPKDD | Poster at conference | Mining Sequences with Exceptional Transition Behaviour of Varying Order | link
2019 | Workshop R-Ladies Amsterdam | Invited presentation at seminar | Developed and presented a workshop about analysis of missing values, evaluation and implementation of missing data methods | link
2018 | ICT Open | Contribution to 1-day conference | Handling Missing Data in Data Science | link
2018 | European Women in Technology | Masterclass at conference | Dealing with missing data in R: Amputation or Imputation? | presentation and exercises
2018 | sat-R-Day | Contribution to 1-day conference | Missing data | link
2018 | Data Science Hackathon | By invitation | Developed and led a missing data challenge | link
2017 | Amst-R-Dam | Contribution to seminar | How to use R-function ampute to generate missing values in complete datasets | article and documentation
2017 | UseR!2017 | Contribution to conference | Introduction to multivariate amputation with ampute |

Personal development

To develop myself as a teacher, I am participating in the University Teaching Qualification (UTQ) Training Program.

Other courses that I participated in are Project management (PROOF TU/e, 2020), Introduction to Process Mining (Fluxicon, 2020), Learning and Reasoning (SIKS, 2021) and Communication Styles (PROOF TU/e, 2021).


Contact details