About
Data Scientist turning into a full-fledged MLOps Engineer. My experience in Data Science and Software Engineering positions me well for this new area of expertise, and I am steadily adding DevOps skills. Hire me if you need someone to operationalize Machine Learning in your company and let the Data Scientists focus on their modelling tasks: I'll handle the operations.
Skills

MLOps

CI/CD

Python

Kubernetes

Scikit-Learn

Grafana

AWS

Experience

MLOps Engineer

Cardo AI cardoai.com

Feb 2023 to Present ⋅ Remote

Machine Learning Engineer with AWS and Kubernetes

  • Created a Kubernetes CronJob connected to a Slack bot that switches off Kubeflow notebooks, cutting AWS costs by releasing notebook-bound EC2 resources.
  • Developed an Argo Workflow for monitoring models in production, using NannyML, Kubernetes, AWS SQS and S3 Event Notifications, providing observability of model performance and feature drift.
  • Enhanced data processing workflows in Airflow and added unit tests to Airflow DAGs.
  • Deployed and managed Grafana, Prometheus and Loki for general monitoring of K8s resources.
  • Added model pre-deployment tests in CI/CD pipelines to ensure model, training and data consistency.

MLOps Engineer (Contractor)

MRIcons Ltd. mricons.eu

Oct 2021 to Present ⋅ Remote
  • Developed Docker Swarm stacks for deployment of MLOps services, including a model registry (MLflow), a reverse‐proxy (Traefik) with an authentication service, and a workflow orchestrator (Prefect).
  • Designed an on‐prem MLOps platform, based on open‐source software, providing end‐to‐end services for frictionless training, validation, deployment and monitoring of ML models, including a CI/CD component for ML.
  • Designed a DVC-based workflow template that lets Data Scientists develop datasets and models with confidence, with unit tests integrated into a GitLab CI/CD pipeline.
  • Wrote tests and GitLab CI pipelines that deploy models only after their data and performance have been thoroughly tested. Wrote GitLab CI pipelines for automated deployment of Docker stacks.
  • Optimized data queries and transformations using the Apache Arrow in-memory format.

Research Fellow

Champalimaud Foundation [Computational Clinical Imaging Group]

Jan 2020 to Jul 2021 ⋅ Lisbon, Portugal
  • Deployed an MLflow server for experiment tracking and experiment data collection.
  • Conducted research on the performance of feature selection methods applied to radiomics for clinical decision‐making.
  • Implemented a Target Shuffling robustness check and a Nested Cross‐Validation procedure for use by other team members.

Data Scientist

Madeira Interactive Technologies Institute

Jun 2018 to Feb 2020
  • Implemented an object detection viewer and trained a marine‐species object detector via transfer learning.
  • Automated dataset processing, model training and TensorFlow model format conversion to TensorFlow.js and TensorFlow Lite using Docker, Python and Bash.
  • Provided a user‐friendly CLI for pulling, configuring and training models from TensorFlow’s Object Detection API.
  • Implemented a process for cooperative collection, annotation and augmentation of training images.
  • Analyzed social media activity levels per location, including profiling points of interest on Madeira island via topic modelling of TripAdvisor reviews with Latent Dirichlet Allocation.
  • Performed data wrangling and analysis to assess performance of low‐cost air quality sensors.

Software Engineering Intern

Eyeware eyeware.tech

Nov 2017 to Feb 2018 ⋅ Lisbon, Portugal
  • Developed a raycasting prototype to produce an attention heatmap on a 3D object using NumPy and VTK in Python.
Languages

Portuguese ⋅ Fluent

English ⋅ Fluent

Education

Nova Information Management School (IMS) novaims.unl.pt

MSc, Data Science and Advanced Analytics

Jan 2015 to 2017

Católica-Lisbon School of Business and Economics

BSc, Economics

2011 to 2014

Open Source Activity

Public Repos: 42
Pull Requests: 23
Contributed Repos: 10
Starred Repos: 388
Watched Repos: 15
Organizations: 0
Public Gists: 25