Senior software developer at Wärtsilä. I write Python and build backend pipelines for machine learning (ML) and data analysis (data science).
Skilled in architecture design and programming in general (design patterns, OOP, SOLID, TDD, Git) and in ML systems in particular. Experienced in R&D, with more than 30 publications.
My tech stack: Python, FastAPI, scikit-learn, NetworkX, PyTorch, MLflow, Dask, SQLAlchemy, Alembic, Docker, PostgreSQL, MongoDB, GDAL, QGIS, LaTeX, etc.
Some open-source projects I have been or am involved in:
- AutoML framework: FEDOT
- Time series benchmark: pytsbe
- ETL (Extract, Transform, Load) library: wiredflow
- Library for spatial data fusion and processing for real estate objects: estaty
- Module for filling gaps in matrices (applied mostly to remote sensing data) based on machine learning and spatial relations: SSGP-toolbox
- QGIS plugin for stream ordering of rivers (or any linear vector network): Lines Ranking
Together with my colleagues, I write posts on Towards Data Science, Medium, and Habr.
Towards Data Science:
- Stream Ordering: How And Why a Geo-Scientist Sometimes Needed to Rank Rivers on a Map (eng)
- A Data Science Course Project About Crop Yield and Price Prediction I’m Still Not Ashamed Of (eng)
- Almost Everything You Want to Know About Partition Size of Dask Dataframes (eng)
- Proximity Analysis to Find the Nearest Bar Using Python (eng)
- Combining Open Street Map and Landsat open data to verify areas of green zones (eng)
- What to Do If a Time Series Is Growing (But Not in Length) (eng) - NSS Lab post
- Hyperparameters Tuning for Machine Learning Model Ensembles (eng) - NSS Lab post
- Clean AutoML for “Dirty” Data: How and why to automate preprocessing of tables in machine learning (eng) - NSS Lab post
- One more approach to optimize neural networks (eng)
- Winning a Flood-Forecasting Hackathon with Hydrology and AutoML (eng) - NSS Lab post
- AutoML for time series: advanced approaches with FEDOT framework (eng) - NSS Lab post
- AutoML for time series: definitely a good idea (eng) - NSS Lab post
Medium:
Habr:
- All Rivers in Order: How and Why Watercourses Are Ranked in the Geographical Sciences (rus)
- Data Scientist in Helsinki. My Small Study of Job Hunting in Finland in 2024 (rus)
- “So How Much Farther to the Store?” Or a Few Words About Geospatial Analysis with Python (rus)
- Combining Open Street Map and Landsat Open Data to Refine the Areas of Green Zones (rus)
- What to Do If Your Time Series Is Growing in Width (rus) - NSS Lab post
- On Hyperparameter Tuning for Ensembles of Machine Learning Models (rus) - NSS Lab post
- Clean AutoML for “Dirty” Data: How and Why to Automate Table Preprocessing in Machine Learning (rus) - NSS Lab post
- How We “Reversed the Rivers” at Emergency DataHack 2021 by Combining Hydrology and AutoML (rus) - NSS Lab post
- Time Series Forecasting with AutoML (rus) - NSS Lab post
- A Graph-Based Algorithm for Ranking River Network Segments for Geospatial Analysis (rus)
My accounts on other platforms: Google Scholar, Kaggle, DrivenData