Hello 👋

I’m a Data Science student 🎓 at Aalto University expecting to graduate in Sep 2024. My journey in Data Science 📊 started 4 years ago, when I initially worked on Computer Vision models for biomedicine 💉, and later specialized in Online Learning algorithms ⌚️. I’m always curious and very open to learning new Data Science technologies to broaden my experience in this field 🤓.

Projects

  • 2024

  •      

    LightRiver - Master Thesis

    Online Machine Learning in Rust

    • Ported an Online Machine Learning model from Python to Rust, achieving a 97% reduction in execution time and attaining a processing capability of 10,000 records per second for IoT devices.
    • Reduced memory usage to 20 MB, facilitating deployment on robots with stringent memory limits and resulting in a 55% improvement in positioning accuracy.
    • Specialized the implementation of Mondrian Forests, streamlining the model for real-time learning in performance-sensitive applications.
  •      

     NYCTaxis

    Big Data project with Apache Spark

    • Analyzed taxi tipping behavior with Apache Spark on a large-scale dataset (67 GB of Yellow Taxi trip data), assessing the impact of ride area, time, distance, and traffic on tipping.
    • Employed Spark for data aggregation and analysis, generating insights into spatial and temporal patterns of taxi rides and their correlation with tipping trends.
    • Visualized findings using Python libraries (Matplotlib and Seaborn), creating heatmaps and time-series graphs to depict variations in tipping across different NYC zones and times of the day.
  • 2023

  •      

     MT-SMAC

    Multi-Target Hyperparameter Optimization

    • Implemented improvements to a multi-target Hyperparameter Optimization model, extending and enhancing the work of a PhD project.
    • The project, known as MT-SMAC, focuses on the Sequential Model-based Algorithm Configuration (SMAC) for optimizing highly parameterized algorithms.
    • Conducted empirical analysis using the YAHPO gym for benchmarking, providing insights into the efficiency and robustness of the model.
  •      

     CourseMatch

    NLP Project with Transformer Models

    • Developed a Course Recommender System in a team of 3 students using a Transformer-based Large Language Model (LLM).
    • Utilized the BERT model from the HuggingFace library to generate embeddings for both the query and the corpus.
    • Conducted comprehensive experiments on a custom-built database and evaluated the models using various similarity techniques.
    • Recognized as the Best Project in the 2023 Natural Language Processing Course, surpassing 26 other projects.
  • 2022

  •      

     SmartElevator

    AI system to analyze and predict faults in your elevator

    • Built a fault prevention system for elevators in Python, improving upon simple Scikit-learn methods.
    • Ranked 3rd for the best project out of 10 teams.
    • Graded 29/30 in the AI for Innovation master's course.
  •      

    Medical Imaging

    FBK Research Project on Bioimaging

    • Engaged in a medical image segmentation project utilizing advanced bioimaging techniques.
    • Analyzed 3-D structures to extract information about the progression of specific illnesses in individual patients.
    • Implemented the project using a deep neural network in PyTorch, with data analysis performed using pandas and Scikit-learn.
  •      

    Nowcasting - Bachelor Thesis

    Comparative Analysis of Prediction Models for Short-Term Forecasting

    • Trained, together with a researcher, an image-based time-series forecasting system that predicts extreme precipitation to alert the Italian Civil Protection.
    • Implemented large-scale data collection in PostgreSQL, developed using Python Telegram APIs.
    • Deployed on Azure Functions with Docker containerization and tested with GitLab CI.

  • 2021

  •      

     Calendar Analyzer

    Interactive dashboard for Google Calendar Event Analysis

    • Analyzed time-series data gathered from Google Calendar using Python and pandas for data frame manipulation.
    • Implemented a Streamlit dashboard in Python using the Altair data visualization framework.
    • Deployed in a Docker container on a Raspberry Pi running Linux, with continuous deployment via GitHub Actions.

  •      

     DeepGalaxies

    ML Course Project on Galaxy Classification

    • Designed and implemented an image recognition system for galaxy classification.
    • Conducted comparisons with state-of-the-art methods using feature extraction and dimensionality reduction techniques.
    • Implemented the network using the PyTorch framework, with NumPy for data manipulation and seaborn for data visualization.
  • 2020

  •      

     UniBuk

    E-commerce site built in React to sell, buy, and search for used books, notes, and resources

  •      

     Swrace

    Path finding algorithm in C

  •      

     Disk usage analyzer

    Efficient analyzer of file sizes in Linux environments, written in C

  •      

     MPM Bot

    Discord Bot for Clash of Clans tournament management

  •      

     WebValley

    Summer school website built in Django

  • 2019

  •      

     Docker Tools

    Tools for Raspberry Pi installed through Docker

  •      

     Unitn Google Calendar

    Always updated lecture schedule for Google Calendar

  • 2018

  •        

     Home Automation

    System to control and update IoT devices remotely via a website

  •      

     Web Uploader

    Web page to upload files to a server

  • 2017

  •      

     DiFra Memory

    Memory game app made with Android Studio