Introduction and affiliations

My name is Rianne and I develop data analysis methods.

I am a PhD Candidate at Eindhoven University of Technology and study the Exceptional Model Mining framework. We work on developing solutions for heterogeneous datasets with time-dependent attributes. My supervisors are dr. Wouter Duivesteijn and prof. Mykola Pechenizkiy.

Additionally, I work on Missing Data methods, particularly the process of generating missing values in complete datasets. Under supervision of prof. Stef van Buuren and dr. Gerko Vink, I developed a multivariate amputation procedure and implemented the method in an R function (ampute in package mice) and a Python package (pyampute).

I find it important that my work can be used for real-world problems (e.g. in health care), and I enjoy collaborating on projects. Please contact me if you have questions or are looking for opportunities to work together.

Below you will find overviews of and links to Publications, Awards, Software development, Teaching activities, Supervision activities, Review activities, Grants, Presentations, Personal development, and Contact details.


Publications


Awards

  1. Performance bonus for 2021

“Beside your overall excellent performance in your PhD research and EDIC project execution, you did an excellent job in project management of EDIC, and in setting up new successful collaborations. You helped a lot with the Research Topics in Data Mining course, and supervision of students.”

  2. Pluim for excellent course evaluation, Research Topics in Data Mining 2021/2022

Software development

1. ampute in R-package mice

library(mice)
?ampute

The R function ampute implements a multivariate amputation procedure: a method for generating missing data in complete datasets. With ampute, it is straightforward to generate missing values in multiple variables, with different missing data proportions and varying underlying missingness mechanisms. Read the article or the vignette to learn more.

2. parlMICE

For large datasets, or when you want a large number of imputations, multiple imputation with the mice function in R package mice may have a long run time. As a solution, Gerko Vink and I created the wrapper function parlMICE, which runs mice in parallel.

The function is now part of package mice under the name parlmice.

library(mice)
?parlmice

All information can be found in the GitHub repo or in the vignette.

3. pyampute: the first Python library for data amputation

The pyampute library brings the multivariate amputation methodology to the Python community, and it does more. It has improved default settings and supports both a combined MAR+MNAR mechanism and custom probability functions. Because it is compatible with scikit-learn’s fit and transform paradigm, it integrates easily into data processing pipelines.
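To make the idea of a MAR mechanism concrete, here is a minimal pure-Python sketch. It is not pyampute's implementation and does not use its API: the function name ampute_mar, its parameters, and the choice of a logistic probability function are assumptions made purely for illustration. Values in one column are removed with a probability that grows with the observed value in another column.

```python
import math
import random


def ampute_mar(rows, target, driver, prop=0.5, seed=0):
    """Illustrative MAR amputation (hypothetical helper, not pyampute's API).

    Returns a copy of `rows` in which values in column `target` are
    replaced by None. The probability of removal increases with the
    value in column `driver`, so missingness depends only on observed
    data. `prop` must lie strictly between 0 and 1.
    """
    rng = random.Random(seed)

    # Standardize the driver column so the logistic curve is scale-free.
    values = [row[driver] for row in rows]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0

    # Shift the curve so that a row with an average driver value is
    # amputed with probability `prop`. The realised overall proportion
    # is therefore only approximately `prop`.
    shift = math.log(prop / (1 - prop))

    amputed = []
    for row in rows:
        z = (row[driver] - mean) / std
        p_missing = 1 / (1 + math.exp(-(z + shift)))
        new_row = list(row)  # copy, so the complete data stay untouched
        if rng.random() < p_missing:
            new_row[target] = None
        amputed.append(new_row)
    return amputed


if __name__ == "__main__":
    complete = [[float(i), float(2 * i)] for i in range(10)]
    for row in ampute_mar(complete, target=1, driver=0, prop=0.5, seed=42):
        print(row)
```

Rows with a high value in the driver column lose their target value more often than rows with a low value, which is the defining property of this kind of MAR mechanism.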

Find plenty of examples in pyampute’s documentation. Davina’s presentation at SciPy22 can be found here.

Install using pip or from source:

pip install pyampute
git clone https://github.com/RianneSchouten/pyampute.git
pip install ./pyampute

Teaching activities

During my time at TU/e, I have taught in the following courses.

| Year | Course | Level | Activities |
| --- | --- | --- | --- |
| 20/21 | 2AMM20 Foundations of Data Mining | MSc | Taught 2 lectures about Missing Data, provided answers in weekly Q&A, developed practice questions for exam, administrative activities |
| 21/22 | 2AMM20 Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 7 groups of students during research project |
| 22/23 | 2AMM20 Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 4 groups of students during research project |

During my time at UU, I lectured in two summer school courses in 2015 and 2016. Both courses are at an Advanced Master level: "Survey Research: Design, Implementation and Data Processing" and "Survey Research: Statistical Analysis and Estimation".


Supervision activities

During my time at TU/e, I have supervised the following students.

| Year | Student | Type | Topic |
| --- | --- | --- | --- |
| 20/21 | Bart van Dooren | MSc Thesis with Philips | Predicting Cardiovascular Risk with Objective Physical Activity Measurements |
| 21/22 | Mats Verbraak | Research Proposal | Handling Missing Data in the Prediction Domain using Multiple Imputation |
| 21/22 | Isabel van den Heuvel | Research Proposal | Equivalence Testing for Developing Fair Machine Learning Algorithms |