👋🏻 Hi, I'm Hamza

Data Science & Engineering Student

> Junior Data Engineer 👨‍🔧

About Me

« The world is one big data problem. »

This quote captures the passion that has driven me to dive deep into the world of data and AI. With a background in applied mathematics, I am currently pursuing a master's degree in data science and engineering. I'm driven to streamline data workflows, ensure data quality, and make data accessible for insightful analysis. My expertise spans Python, R, SQL, machine learning, data analysis, and visualization, as well as designing and implementing data pipelines with tools such as Terraform, Airflow, and Spark on cloud platforms including GCP, AWS, and Azure.

If you are seeking a dedicated data engineer passionate about leveraging data's power, let's connect and explore how we can make a difference together, one problem at a time.

Skills

Programming Languages

Python

Java

C

R

Data & Machine Learning

TensorFlow

scikit-learn

Pandas

NumPy

Selenium

Airflow

Kafka

Spark

Power BI

Cloud

AWS

Google Cloud

Azure

Firebase

Databases

MySQL

SQLite

PostgreSQL

MongoDB

Oracle

Web Development

HTML

CSS

Tailwind CSS

JavaScript

TypeScript

React

Node.js

Next.js

Django

FastAPI

Streamlit

Other Tools

Docker

Terraform

Linux

Git

Jira

Qualifications

Education
Experience

MS. Ingénierie des Données et Protection (IDP)

Faculty of Sciences and Engineering (FSI), Toulouse
2024-2026

Skills: Cloud & DevOps · Big Data · AI Projects · Data Governance · Data Project Architecture · Agile Project Management

MS. Data Science and Engineering

Faculty of Sciences of Rabat (FSR), Rabat
2022-2024

Skills: Python · Java · Extract, Transform, Load (ETL) · Web Development · Oracle Database Administration · Cybersecurity · Datamining · Business Intelligence · Data Science · Deep Learning · Big Data

BASC. Applied Mathematics

Faculty of Sciences and Techniques (FSTM), Mohammedia
2019-2022

Skills: SQL · C · Calculus · Data Structures · Probability · Statistics · Linear Algebra · Relational Databases

Baccalaureate in Physical Sciences

Newton High School, Mohammedia
2019

Experience

Data Engineer | Internship

Blent.ai · Paris, France · Remote
Feb 2024 - Jul 2024

• Implemented automated ELT pipelines that feed the Customer Data Platform in the data warehouse from sources such as Hubspot, Retool, Zoom, and MongoDB.
• Developed Reverse ETLs for marketing tools such as Hubspot and CustomerIO.
• Constructed a sales dashboard to analyze customer data (clients and leads), and a marketing dashboard to analyze marketing strategies.

Consulting for DGAC:
• Automated the extraction of complex information from PDF documents describing aerodromes in France, using an LLM.

Educational content:
• Wrote notebooks and tutorials about Analytics Engineering (dbt and Airflow) for the Data Engineering course.
• Created SQL and Python Coding Challenges.

Tools: Airbyte, Mage, BigQuery, dbt, Looker Studio, Open Metadata, CustomerIO API, Langchain, PDFplumber, OpenAI API, ChromaDB, Streamlit

Full Stack Developer | Internship

3D Smart Factory · Mohammedia, Morocco · Hybrid
Jul 2023 - Sep 2023

• Deployed the MeshSegNet deep learning model, which performs precise 3D dental scan segmentation, behind a RESTful API built with FastAPI. Hosted the API on AWS using Docker, AWS ECR, Lambda, and API Gateway, ensuring scalability and reliability. Reduced the usage cost from $85 per month to $1.3 per 1,000 requests.

• Created an intuitive user interface using Vite, React.js, Three.js, and VTK.js, allowing dental professionals to seamlessly upload, segment, and analyze intraoral scans in 3D.

• Integrated Firebase Auth so users can sign in with a Google or GitHub account. Used the OpenAI API to create an AI assistant chatbot that guides users through the website's features.

• Deployed the web application using Vercel.

🖱️ Live Demo
💻 Github Repository
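
The cost figures quoted above ($85 per month versus $1.3 per 1,000 requests) can be compared with a short calculation. This is only an illustrative sketch: it assumes the $85 was a flat monthly charge and that the per-request price is the only cost in the serverless setup, neither of which is spelled out above. The function names are hypothetical.

```python
# Figures quoted in the project description above.
FLAT_MONTHLY_COST_USD = 85.0
COST_PER_1000_REQUESTS_USD = 1.3


def pay_per_request_cost(requests_per_month: int) -> float:
    """Monthly cost in USD under the per-request pricing."""
    return requests_per_month / 1000 * COST_PER_1000_REQUESTS_USD


def break_even_requests() -> float:
    """Monthly request volume at which both pricing models cost the same.

    Assumption: the flat fee does not depend on traffic at all.
    """
    return FLAT_MONTHLY_COST_USD / COST_PER_1000_REQUESTS_USD * 1000


if __name__ == "__main__":
    for volume in (1_000, 10_000, 50_000):
        print(f"{volume:>6} requests/month: ${pay_per_request_cost(volume):.2f}")
    print(f"Break-even at about {break_even_requests():,.0f} requests/month")
```

Under these assumptions, per-request pricing stays cheaper as long as monthly traffic is below the break-even volume.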

Front End Task Lead | Internship

Omdena · Île-de-France, France · Remote
May 2023 - Jun 2023

• Led a team in developing an AI-driven transportation chatbot application for Île-de-France. Used Streamlit and Folium for the front end, integrated web scraping of the RATP website for real-time traffic status, and managed data uploads to an AWS S3 bucket with an ETL job running on AWS Lambda.

• Created an algorithm that calculated the nearest available transportation station based on user preferences and location, ultimately reducing the impact of transportation disruptions in Île-de-France and offering personalized alternative routes and directions.

• Implemented a version using the Navitia API, ensuring accurate and up-to-date transportation data retrieval by communicating directly with RATP's non-public API, enhancing the chatbot's reliability.

• Enhanced user interaction by integrating a dynamic chatbot UI alongside the Folium map, improving user engagement and accessibility to personalized alternative transportation information during strikes.
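
The nearest-available-station lookup mentioned above can be sketched roughly as follows. This is not the project's actual code: it assumes stations are plain dictionaries with hypothetical `name`, `mode`, `lat`, `lon`, and `available` keys, uses great-circle (haversine) distance rather than walking distance, and the sample coordinates are approximate and purely illustrative.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_available_station(user_lat, user_lon, stations, allowed_modes=None):
    """Return the closest station that is running and matches the user's modes.

    `allowed_modes` is an optional set such as {"metro", "rer"}; None means
    any mode is acceptable. Returns None when no station qualifies.
    """
    candidates = [
        s for s in stations
        if s["available"] and (allowed_modes is None or s["mode"] in allowed_modes)
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: haversine_km(user_lat, user_lon, s["lat"], s["lon"]),
    )


# Illustrative data only: approximate coordinates, made-up availability.
SAMPLE_STATIONS = [
    {"name": "Châtelet", "mode": "metro", "lat": 48.8586, "lon": 2.3470, "available": False},
    {"name": "Gare de Lyon", "mode": "rer", "lat": 48.8443, "lon": 2.3743, "available": True},
    {"name": "Nation", "mode": "metro", "lat": 48.8484, "lon": 2.3959, "available": True},
]

if __name__ == "__main__":
    station = nearest_available_station(48.8530, 2.3499, SAMPLE_STATIONS)
    print(station["name"] if station else "No station available")
```

Filtering before taking the minimum keeps disrupted stations out of the result, even when they are the closest ones.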

End of Studies Bachelor Internship Project

Faculty of Sciences and Techniques of Mohammedia (FSTM) · Mohammedia, Morocco · On-site
Apr 2022 - Jun 2022

• Developed a predictive model for Airbnb hosts to optimize rental prices using supervised machine learning techniques, including K-Nearest Neighbors (KNN).

• Utilized a range of technologies, including Python, Pandas, NumPy, Scikit-learn, Matplotlib, and Seaborn, to analyze data and fine-tune pricing strategies.

• Explored the integration of the pricing model into Airbnb's backend, simplifying the process for hosts to set competitive rental prices based on property characteristics.
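
The K-Nearest Neighbors idea behind the pricing model can be illustrated with a small pure-Python sketch. It is a simplified stand-in, not the project's scikit-learn code: the feature set (bedrooms, bathrooms, guests), the prices, and the function name `knn_predict_price` are all made up for illustration.

```python
from math import dist


def knn_predict_price(listings, query, k=3):
    """Predict a nightly price as the mean price of the k most similar listings.

    `listings` is a list of (features, price) pairs and `query` is a feature
    tuple of the same length. Similarity is plain Euclidean distance, so in
    practice the features would need to be scaled first.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    nearest = sorted(listings, key=lambda item: dist(item[0], query))[:k]
    return sum(price for _, price in nearest) / len(nearest)


# Made-up training data: (bedrooms, bathrooms, guests) -> nightly price.
SAMPLE_LISTINGS = [
    ((1, 1, 2), 50.0),
    ((1, 1, 3), 60.0),
    ((2, 1, 4), 80.0),
    ((3, 2, 6), 150.0),
    ((4, 3, 8), 220.0),
]

if __name__ == "__main__":
    suggested = knn_predict_price(SAMPLE_LISTINGS, (2, 1, 3), k=3)
    print(f"Suggested nightly price: {suggested:.2f}")
```

A small `k` follows local price differences closely, while a larger `k` smooths the estimate across more listings.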

Projects

Data & ML Projects

Data Engineering Jobs

Analyzing the Data Engineers Job Market in the USA and Predicting Salaries

Demo Repository

👨‍🔧 Online Retail Data Pipeline

Analysing Data of an Online Retail Store using an ETL with Airflow, GCP BigQuery, dbt, Soda, and Looker

Demo Repository

🔧 Analyzing Sales of AdventureWorks 🔌

On-prem SQL Server to Azure Cloud Pipeline with Data Factory, Lake Storage, Spark, Databricks, Power BI

Repository

NYC Taxi Data Analysis

Analysing NYC Taxi Records data using Data Modeling, GCP Storage, Mage ETL, BigQuery, and Looker Studio.

Demo Repository

Decathlon Chat

A Customer Service Chatbot Trained on Decathlon Morocco Data

Demo Repository

AtliQ Hospitality Dashboard

A Data Analysis Project in the Hospitality Domain with Power BI

Repository

Web Projects

TeethSeg

AI-Based Web App for Automated Teeth Segmentation using MeshSegNet

Demo Repository

BrandGenie

AI Branding Assistant Web App powered by OpenAI GPT-3 that helps business owners generate brand names, slogans, keywords, and ad copy.

Demo Repository

GPTube: ChatGPT For YouTube

YouTube Video Summarizer and Question Answering App Using Whisper, Langchain and Streamlit

Demo Repository

Studynotes

A Cornell Method Note-Taking Web App built with HTML, TailwindCSS, JS, Django and the YouTube API.

Demo Repository

DE Zoomcamp UI

An Interactive UI for the Data Engineering Zoomcamp Course provided by DataTalksClub

Demo Repository

Blog

Extracting Information From YouTube Video Using Whisper and Langchain

Build a YouTube Question Answering App Using Whisper, Langchain, and Streamlit

OpenAI

Whisper

Langchain

Streamlit

Escaping Tutorial Hell for Data Scientists

How I Built a Unique Data Science & Engineering Project Using 4 Different YouTube Tutorials

YouTube

Data Science

Data Engineering

Projects

Get In Touch

Thank you for visiting my portfolio 🙏🏼! Feel free to contact me.

Email

hamza.lbelghiti@gmail.com

Location

Mohammedia, Morocco