My name is Rianne and I develop data analysis methods.
I am a PhD Candidate at Eindhoven University of Technology and study the Exceptional Model Mining framework. We work on developing solutions for heterogeneous datasets with time-dependent attributes. My supervisors are dr. Wouter Duivesteijn and prof. Mykola Pechenizkiy.
Additionally, I work on Missing Data methods, particularly the process of generating missing values in complete datasets. Under supervision of prof. Stef van Buuren and dr. Gerko Vink, I developed a multivariate amputation procedure and implemented the method in an R-function (ampute in package mice) and Python module (pyampute).
I find it important that my work can be used for real-world problems (e.g. in health care) and enjoy collaborating on projects. Please contact me if you have questions or seek opportunities to work together.
Below you will find overviews of and links to Publications, Awards, Software development, Teaching activities, Supervision activities, Review activities, Grants, Presentations, Personal development, and Contact details.
Schouten, Rianne M, Victoria Tascau, Gabriel G. Ziegler, Davide Casano, Marco Ardizzone, and Michael-Angelos Erotokritou (2023) Dropping incomplete records is (not so) straightforward. In: Proceedings of the 21st International Symposium on Intelligent Data Analysis (IDA), pp. 379-391.
Verhaegh, Ruben F.A., Jacco J.E. Kiezebrink, Frank Nusteling, Arnaud W.A. Rio, Marton B. Bendicsek, Wouter Duivesteijn & Rianne M. Schouten (2022) A Clustering-inspired Quality Measure for Exceptional Preferences Mining — Design Choices and Consequences. In: Proceedings of the International Conference on Discovery Science (DS), pp. 429–444.
Schouten, Rianne M., Zamanzadeh, Davina & Singh, Prabhant (2022) Pyampute: a Python library for data amputation. Zenodo. https://doi.org/10.25080/majora-212e5952-03e.
Van der Haar, J.F., Nagelkerken, S.C., Smit, I.G., van Straaten, K., Tack, J.A., Schouten, R.M. & Duivesteijn, W. (2022) Efficient Subgroup Discovery Through Auto-Encoding. In: Proceedings of the 20th International Symposium on Intelligent Data Analysis (IDA), pp. 327-340.
Schouten, R.M., Duivesteijn, W. & Pechenizkiy, M. (2022) Exceptional Model Mining for Repeated Cross-Sectional Data (EMM-RCS). In: Proceedings of SIAM International Conference on Data Mining (SDM), pp. 585-593.
Schouten, R.M., Bueno, M.L.P., Duivesteijn, W. & Pechenizkiy, M. (2022) Mining Sequences with Exceptional Transition Behaviour of Varying Order using Quality Measures based on Information-Theoretic Scoring Functions. Data Mining and Knowledge Discovery (DAMI), 36: 379-413.
IJsselhof R, Duchateau S, Schouten R.M., Slieker M, Hazekamp M & Schoof P. Long-Term Follow-Up of Pericardium for the Ventricular Component in Atrioventricular Septal Defect Repair. World Journal for Pediatric and Congenital Heart Surgery, 11(6): 742-747.
IJsselhof R.J., Duchateau S.D.R., Schouten R.M., Freund, M.W., Heuser, J., Fejzic, Z., Haas, F., Schoof, P.H. & Slieker, M.G. (2019) Follow-up After Biventricular Repair of the Hypoplastic Left Heart Complex. European Journal of Cardiothoracic Surgery, 57(4): 644-651.
Schouten R.M., Lugtig, P. & Vink, G. (2018) Generating missing values for simulation purposes: A multivariate amputation procedure. Journal of Statistical Computation and Simulation, 88(15): 1909-1930.
Schouten, R.M. and Vink, G. (2021) The dance of the mechanisms: How observed information influences the validity of missingness assumptions. Sociological Methods & Research, 50(3): 1243-1258.
Kappen, I.F.P.M., Bittermann, G.K.P., Schouten, R.M., Bittermann, D., Etty, E., Koole, R., Kon, M., Van der Molen, M. & Breugem, C.C. (2017) Long-term mid-facial growth of patients with a unilateral complete cleft of lip, alveolus and palate treated by two-stage palatoplasty: cephalometric analysis. Clinical Oral Investigations, 21: 1801-1810.
de Vries, C.P., Schouten, R.M., Van der Kuur, J., Gottardi, L., & Akamatsu, H. (2016) Microcalorimeter pulse analysis by means of principle component decomposition. Proceedings SPIE 9905, Space Telescopes and Instrumentation 2016: Ultraviolet to Gamma Ray, 99055V. DOI: 10.1117/12.2231627
“Beside your overall excellent performance in your PhD research and EDIC project execution, you did an excellent job in project management of EDIC, and in setting up new successful collaborations. You helped a lot with the Research Topics in Data Mining course, and supervision of students.”
ampute in R-package mice
R-function ampute is the implementation of a multivariate amputation procedure: a method for generating missing data in complete datasets. With ampute, it is straightforward to generate missing values in multiple variables, with different missing data proportions and varying underlying missingness mechanisms. Read the article or the vignette to learn more.
parlMICE
For large datasets or when you want to impute with a large number of imputations, multiple imputation with mice in R-package mice may have a long run time. As a solution, Gerko Vink and I created the wrapper function parlMICE, which allows for a parallel run of mice. The function is now part of package mice under the name parlmice.
All information can be found in the GitHub repo or in the vignette.
pyampute: the first Python library for data amputation
Library pyampute provides the multivariate amputation methodology to the Python community, and it does more. It has improved default settings and allows for a combined MAR+MNAR mechanism and for custom probability functions. Because it is compatible with scikit-learn’s fit and transform paradigm, it integrates easily into data processing pipelines.
Find plenty of examples in pyampute’s documentation. Davina’s presentation at SciPy22 can be found here.
Install using pip or from source:
pip install pyampute
git clone https://github.com/RianneSchouten/pyampute.git
pip install ./pyampute
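To give a feel for what amputation means, here is a minimal standard-library sketch of a single MAR mechanism: values in one column are removed with a probability that grows with the value of another, fully observed column. This is a simplified illustration and not the multivariate amputation procedure as implemented in ampute or pyampute. The function name `ampute_mar` and all of its parameters are hypothetical and do not belong to either package.

```python
import math
import random
import statistics


def ampute_mar(rows, target, predictor, prop=0.3, seed=0):
    """Hypothetical helper: return a copy of `rows` in which the value in
    column `target` is set to None with a probability that increases with
    the value in column `predictor` (a simple MAR mechanism)."""
    rng = random.Random(seed)
    values = [row[predictor] for row in rows]
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values) or 1.0  # guard against a constant column

    # Logistic transform of the standardized predictor: higher values
    # receive a higher raw probability of becoming missing.
    raw = [1.0 / (1.0 + math.exp(-(v - mean) / sd)) for v in values]

    # Rescale so that the average probability equals the requested
    # proportion of missing values (clipped at 1 for safety).
    scale = prop / statistics.fmean(raw)
    probs = [min(1.0, scale * r) for r in raw]

    amputed = []
    for row, p in zip(rows, probs):
        new_row = list(row)
        if rng.random() < p:
            new_row[target] = None
        amputed.append(new_row)
    return amputed


# Usage: a complete two-column dataset with standard normal values.
data_rng = random.Random(42)
complete = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(2000)]
incomplete = ampute_mar(complete, target=1, predictor=0, prop=0.3, seed=1)

missing_rate = sum(row[1] is None for row in incomplete) / len(incomplete)
print(f"proportion missing in column 1: {missing_rate:.2f}")
```

Because missingness in column 1 depends only on the observed column 0, the mechanism is MAR. Making the probability depend on column 1 itself would turn the same sketch into an MNAR mechanism.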
During my time at TU/e, I have taught in the following courses.
Year | Course | Level | Activities |
---|---|---|---|
20/21 | 2AMM20 Foundations of Data Mining | MSc | Taught 2 lectures about Missing Data, provided answers in weekly Q&A, developed practice questions for exam, administrative activities |
21/22 | 2AMM20 Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 7 groups of students during research project |
22/23 | 2AMM20 Research Topics in Data Mining | MSc | Taught 2 lectures about Missing Data, supervised 4 groups of students during research project |
During my time at UU, I taught lectures in 2 summer school courses in 2015 and 2016. The courses are at an Advanced Master level: Survey Research: Design, Implementation and Data Processing and Survey Research: Statistical Analysis and Estimation.
During my time at TU/e, I have supervised the following students.
Year | Student | Type | Topic |
---|---|---|---|
20/21 | Bart van Dooren | MSc Thesis with Philips | Predicting Cardiovascular Risk with Objective Physical Activity Measurements |
21/22 | Mats Verbraak | Research Proposal | Handling Missing Data in the Prediction Domain using Multiple Imputation |
21/22 | Isabel van den Heuvel | Research Proposal | Equivalence Testing for Developing Fair Machine Learning Algorithms |