Introduction and affiliations

My name is Rianne and I develop data analysis methods. I am a PhD Candidate at Eindhoven University of Technology and study the Exceptional Model Mining framework. We work on developing solutions for heterogeneous datasets with time-dependent attributes. My supervisors are dr. Wouter Duivesteijn and prof. Mykola Pechenizkiy.

Additionally, I work on missing data methods, particularly the process of generating missing values in complete datasets. Under the supervision of prof. Stef van Buuren and dr. Gerko Vink, I developed a multivariate amputation procedure and implemented it as an R function (ampute in package mice) and a Python module (pyampute).

I find it important that my work can be used for real-world problems (e.g. in health care) and enjoy collaborating on projects. Please contact me if you have questions or seek opportunities to work together.

Below you will find overviews of and links to Publications, Awards, Software development, Teaching activities, Supervision activities, Review activities, Grants, Presentations, Personal development, and Contact details.


Publications


Awards

  1. Performance bonus for 2021

“Beside your overall excellent performance in your PhD research and EDIC project execution, you did an excellent job in project management of EDIC, and in setting up new successful collaborations. You helped a lot with the Research Topics in Data Mining course, and supervision of students.”

  2. Pluim for Excellent course evaluation Research Topics in Data Mining 2021/2022

  3. Pluim for Excellent course evaluation Research Topics in Data Mining 2022/2023


Software development

1. ampute in R-package mice

library(mice)
?ampute

The R function ampute implements a multivariate amputation procedure: a method for generating missing data in complete datasets. With ampute, it is straightforward to generate missing values in multiple variables, with different missing data proportions and varying underlying missingness mechanisms. Read the article or the vignette to learn more.
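To make the idea of amputation concrete, here is a minimal sketch in plain Python. It is not the ampute algorithm (which handles multiple missingness patterns and weighted sum scores); it only illustrates, under simplifying assumptions, how a missingness proportion and a mechanism can be controlled for a single variable. The function name `ampute_column` and its parameters are hypothetical and chosen for this illustration only.

```python
import math
import random


def ampute_column(rows, target, predictor, prop=0.5, mechanism="MAR", seed=0):
    """Return a copy of `rows` with roughly `prop` of column `target` set to None.

    MCAR: every row has the same probability `prop` of becoming incomplete.
    MAR:  the probability grows with the observed value in column `predictor`.
    (Hypothetical helper for illustration; not part of mice or pyampute.)
    """
    rng = random.Random(seed)
    n = len(rows)
    if mechanism == "MCAR":
        probs = [prop] * n
    elif mechanism == "MAR":
        values = [row[predictor] for row in rows]
        mean = sum(values) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n) or 1.0
        # Logistic function of the standardized predictor value.
        raw = [1 / (1 + math.exp(-(v - mean) / sd)) for v in values]
        # Rescale so that the average probability is approximately `prop`.
        scale = prop / (sum(raw) / n)
        probs = [min(1.0, p * scale) for p in raw]
    else:
        raise ValueError("mechanism must be 'MCAR' or 'MAR'")

    amputed = []
    for row, p in zip(rows, probs):
        new_row = list(row)  # copy, so the complete data stay untouched
        if rng.random() < p:
            new_row[target] = None
        amputed.append(new_row)
    return amputed


# Example: two standard normal variables, 30% MAR missingness in column 1.
data_rng = random.Random(42)
complete = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(2000)]
incomplete = ampute_column(complete, target=1, predictor=0, prop=0.3, mechanism="MAR")
```

Under the MAR setting, rows with a high value in the predictor column are more likely to lose their target value, so the missingness depends only on observed data.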

2. parlMICE

For large datasets, or when you want a large number of imputations, multiple imputation with the function mice in R-package mice may have a long run time. As a solution, Gerko Vink and I created the wrapper function parlMICE, which runs mice in parallel.

The function is now part of package mice under the name parlmice.

library(mice)
?parlmice

All information can be found in the GitHub repo or in the vignette.
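The general idea behind such a parallel wrapper can be sketched in a few lines of Python. This is not how parlmice is implemented: a deliberately simple imputation (a random draw from the observed values) stands in for mice, and the helper names `impute_once` and `parallel_impute` are hypothetical. The sketch only shows how m imputations can be distributed over workers, each with its own seed.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def impute_once(column, seed):
    """Fill each None with a random draw from the observed values.

    A stand-in for a real imputation method (hypothetical helper).
    """
    rng = random.Random(seed)
    observed = [v for v in column if v is not None]
    return [v if v is not None else rng.choice(observed) for v in column]


def parallel_impute(column, m=5, n_workers=2, seed=123):
    """Create m imputed copies of `column`, spreading the work over n_workers."""
    seeds = [seed + i for i in range(m)]  # one distinct seed per imputation
    # Threads keep the sketch simple; for CPU-bound work in Python a process
    # pool (ProcessPoolExecutor) would be needed to get an actual speed-up.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in the order of `seeds`, so runs are reproducible.
        return list(pool.map(lambda s: impute_once(column, s), seeds))


column = [1.0, None, 3.0, 4.0, None, 6.0]
imputations = parallel_impute(column, m=4, n_workers=2)
```

Giving every imputation its own seed matters: if all workers shared one seed, the imputed datasets would be identical and the between-imputation variance would be lost.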

3. pyampute: the first Python library for data amputation

The pyampute library brings the multivariate amputation methodology to the Python community, and it does more. It has improved default settings and allows for a combined MAR+MNAR mechanism and for custom probability functions. Because it is compatible with scikit-learn’s fit and transform paradigm, it integrates seamlessly into data processing pipelines.

Find plenty of examples in pyampute’s documentation. Davina’s presentation at SciPy22 can be found here.

Install using pip:

pip install pyampute

or from source:

git clone https://github.com/RianneSchouten/pyampute.git
pip install ./pyampute

Teaching activities

During my time at TU/e, I have taught in the following courses.

Year | Course | Level | Activities
20/21 | 2AMM20 Foundations of Data Mining | MSc | Taught 2 lectures about Missing Data, provided answers in weekly Q&A, developed practice questions for exam, administrative activities
21/22 | 2AMM20 Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 7 groups of students during research project
22/23 | 2AMM20 Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 4 groups of students during research project

During my time at UU, I taught lectures in two summer school courses in 2015 and 2016. The courses are at an Advanced Master level: Survey Research: Design, Implementation and Data Processing and Survey Research: Statistical Analysis and Estimation.


Supervision activities

During my time at TU/e, I have supervised the following students.

Year | Student | Type | Topic
20/21 | Bart van Dooren | MSc Thesis with Philips | Predicting Cardiovascular Risk with Objective Physical Activity Measurements
21/22 | Mats Verbraak | Research Proposal | Handling Missing Data in the Prediction Domain using Multiple Imputation
21/22 | Isabel van den Heuvel | Research Proposal | Equivalence Testing for Developing Fair Machine Learning Algorithms
21/22 | Varun Kamat | Internship at Signify | Recommendation Tool for Component Database
21/22 | Mika van Loon | MSc Thesis | Bootstrap Hypothesis Tests for Evaluating Subgroup Descriptions in Exceptional Model Mining
22/23 | Victoria Tascau | MSc Thesis with DEPAR/Erasmus MC | Handling Missing Values in Longitudinal Medical Data

Review activities

With help from my supervisors, I have reviewed for Data Mining and Knowledge Discovery (DAMI) and the journal track of ECMLPKDD 2020 (Machine Learning).


Grants

Year | Call | Type | Title | Together with
2022 | AI for Health EWUU Alliance | EUR 45k seed money | Better Imputation by Generative Adversarial NeTworks (BIGANT) | Prof.dr. Stef van Buuren, dr. Gerko Vink, Hanne Oberman, Prof.dr. Mykola Pechenizkiy, Prof.dr. Cassio de Campos, Rianne Schouten, Prof.dr. Daniel Oberski, dr. Thomas Debray, prof.dr. Fred van Eeuwijk

Presentations

Year | Occasion | Type | Topic | Link to materials
2021 | Neglected Assumptions in Causal Inference (NACI) workshop at ICML | Contribution | Understanding the Role of Prognostic Factors and Effect Modifiers in Heterogeneity of Treatment Effect using a Within-Subjects Analysis of Variance | link
2021 | EAISI Eindhoven | Contribution to seminar | Towards a better understanding of exceptional lifestyle behaviour |
2021 | ECMLPKDD | Poster at conference | Mining Sequences with Exceptional Transition Behaviour of Varying Order | link
2019 | Workshop R-Ladies Amsterdam | Invited presentation at seminar | Developed and presented a workshop about analysis of missing values, evaluation and implementation of missing data methods | link
2018 | ICT Open | Contribution to 1-day conference | Handling Missing Data in Data Science | link
2018 | European Women in Technology | Masterclass at conference | Dealing with missing data in R: Amputation or Imputation? | presentation and exercises
2018 | sat-R-Day | Contribution to 1-day conference | Missing data | link
2018 | Data Science Hackathon | By invitation | Developed and led a missing data challenge | link
2017 | Amst-R-Dam | Contribution to seminar | How to use R-function ampute to generate missing values in complete datasets | article and documentation
2017 | UseR!2017 | Contribution to conference | Introduction to multivariate amputation with ampute |

Personal development

To develop myself as a teacher, I am participating in the University Teaching Qualification (UTQ) Training Program.

Other courses that I participated in are Project management (PROOF TU/e, 2020), Introduction to Process Mining (Fluxicon, 2020), Learning and Reasoning (SIKS, 2021) and Communication Styles (PROOF TU/e, 2021).


Contact details