AutoPrep

AutoPrep is a Python package for automated data preprocessing and analysis that generates comprehensive LaTeX reports. It handles common preprocessing tasks, creates visualizations, and documents the entire process in a PDF report. It focuses on tabular data and supports numerous explainable AI models. Emphasizing interpretability and ease of use, the report includes a subsection for each model that explains its strengths and weaknesses and provides usage examples.
The pipeline automatically detects the task type (binary classification, multiclass classification, or regression), generates an array of possible preprocessing pipelines, scores them, trains models, tunes hyperparameters, and generates a well-structured report.
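The task-type detection step could look roughly like the sketch below. It is an illustrative heuristic, not AutoPrep's actual implementation: the function name `detect_task_type` and the `max_classes` threshold are assumptions made for this example.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_numeric_dtype


def detect_task_type(target: pd.Series, max_classes: int = 20) -> str:
    """Guess the task type from the target column (hypothetical heuristic)."""
    values = target.dropna()
    n_unique = values.nunique()

    # Exactly two distinct target values: binary classification.
    if n_unique == 2:
        return "binary classification"

    # Non-numeric targets, or non-float numeric targets with few distinct
    # values, are treated as class labels. max_classes is an assumed cutoff.
    if not is_numeric_dtype(values) or (
        not is_float_dtype(values) and n_unique <= max_classes
    ):
        return "multiclass classification"

    # Everything else (float or high-cardinality numeric) is regression.
    return "regression"
```

A heuristic like this keeps the entry point simple: the user passes only the data and the target column, and the rest of the pipeline branches on the returned label.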
Technologies: Python, poetry, Pandas, NumPy, scikit-learn, seaborn, matplotlib
Co-authors: Kruk Julia, Pozorski Paweł, Rogalska Katarzyna
Hyperparameter Tunability

This project focused on reproducing results from the publication:
Philipp Probst, Anne-Laure Boulesteix, and Bernd Bischl. Tunability: Importance of hyperparameters of machine learning algorithms. Journal of Machine Learning Research, 2019.
The experiments were conducted on five datasets from OpenML with three models (XGBoost, logistic regression, and a k-nearest neighbours classifier). Two research questions were answered:
- Do the AUC scores differ significantly between models optimized with Random Search and Bayes Search?
- Does Random Search converge significantly faster than Bayes Search?
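A question like the first one can be approached with a paired non-parametric test. The sketch below uses the Wilcoxon signed-rank test from scipy.stats as one possible choice; the project description does not state which test was actually used, and the function name `compare_auc` and the `alpha` default are assumptions made for this example.

```python
import numpy as np
from scipy.stats import wilcoxon


def compare_auc(random_auc, bayes_auc, alpha: float = 0.05):
    """Paired Wilcoxon signed-rank test on two sets of AUC scores.

    Both inputs must have the same length, with element i of each
    coming from the same dataset/model combination.
    Returns (statistic, p_value, significant).
    """
    random_auc = np.asarray(random_auc, dtype=float)
    bayes_auc = np.asarray(bayes_auc, dtype=float)
    if random_auc.shape != bayes_auc.shape:
        raise ValueError("Both score arrays must have the same shape.")

    # Two-sided test: the null hypothesis is that the paired differences
    # are symmetrically distributed around zero.
    result = wilcoxon(random_auc, bayes_auc)
    return result.statistic, result.pvalue, bool(result.pvalue < alpha)
```

A paired test fits here because both search strategies are evaluated on the same datasets and models, so the scores are not independent samples.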
Technologies: Python, Pandas, NumPy, scikit-learn, seaborn, matplotlib, scipy.stats
Co-author: Kruk Julia
Other projects
Other projects include implementations in
- Python (e.g. Auctions site, a Django project),
- R and
- Java.