Clay Ndugga

Software Engineer

I'm a Mathematics & Engineering graduate from Queen's University. I have extensive experience with Machine Learning and Backend System Design.

resume.pdf

I enjoy using math and programming to tackle complex challenges and find innovative solutions.

Experience

Software Engineer

Longview Systems   ·   May 2021 - Sept 2021   ·   Calgary, AB

  • Architected an end-to-end IoT Azure cloud solution to optimize building energy efficiency using real-time machine learning for decision-making.
  • Created a robust data pipeline to stream real-time unstructured data from IoT temperature sensors. Scheduled jobs to clean and aggregate the data for future ML modeling.
  • Used Random Forest Regression for time series forecasting to determine optimal setpoints 24h into the future.
  • Applied MLOps best practices by automating monthly retraining via MLflow to adapt to seasonal patterns, increasing forecasting accuracy.
  • Technologies: Azure, Databricks, Python, MLOps, MLflow, SQL

Machine Learning Engineer

Boardwalk Real Estate   ·   May 2022 - Sept 2022   ·   Calgary, AB

  • Built an ML model using TensorFlow to predict tenant lease renewals (binary classification), achieving an F1-score of 0.85 and reducing profit lost to vacant properties.
  • Created data pipelines to ingest and preprocess 1M+ records from SQL databases, surveys, and property reports using Pandas and SQL.
  • Tuned hyperparameters to favor precision in the precision-recall trade-off, prioritizing fewer false positives to minimize vacancy risk.
  • Periodically communicated technical concepts to non-technical stakeholders (the Board of Directors).
  • Technologies: Python, SQL, Jupyter, Pandas, NumPy, Scikit-Learn, TensorFlow

Software Engineer

Longview Systems   ·   May 2023 - Sept 2023   ·   Calgary, AB

  • Built a stateful REST API data pipeline (ETL) with queuing to ingest 1M+ employee records, reducing ingestion times by 70%. Implemented idempotent retries, pagination tracking, and checkpointing to handle API rate limits, enabling 24/7 reliability on spot instances. Later adopted as the team's standard ingestion template.
  • Migrated 20k+ lines of legacy SQL stored procedures to a modern Databricks/PySpark architecture, drastically cutting run times.
  • Automated Azure wiki documentation for 300+ functions using the OpenAI API (LLMs), eliminating 15+ hours of manual effort. Designed a structured prompting process to reduce hallucinations, achieving 100% accuracy.
  • Technologies: Azure, Databricks, Python, PySpark, SQL

Thesis

Deep Learning for Point Cloud Compression

Efficient compression is essential for the storage and transmission of data. While traditional methods like JPEG for images and MPEG for videos are effective for 2D media, they struggle with 3D data formats like point clouds. This project aims to improve lossy point cloud compression by designing a Non-Linear Transform Coder (NTC), which can better handle the complexities of 3D data. The architecture uses neural networks to capture spatial patterns in point cloud data and eliminate redundancy, leading to efficient compression.

Projects

Chat with PDF

I deployed a web application that enables interactive PDF chat with clickable references. The backend is a scalable REST API built with Node.js/Express, containerized with Docker, and deployed on Google Cloud Run with autoscaling. The system generates embeddings using OpenAI and enhances query accuracy through dynamic re-ranking using Pinecone Vector DB.

YTC Landing page Diagram

YouTube Comment Finder

I developed a serverless web application that integrates YouTube and Spotify, allowing users to search YouTube comments for song mentions. The backend leverages AWS Lambda and DynamoDB to efficiently manage job state between functions, ensuring scalability and reliability. Following serverless best practices, the architecture incorporates queues for loose coupling, idempotent function design, and caching mechanisms to enhance performance and resilience.

Serverless Architecture Diagram

LLM Research Paper Abstract Generator

Given a research paper title and a list of key findings, this project generates an abstract for the paper. It demonstrates a serverless architecture built around Large Language Models (LLMs) and uses a Retrieval-Augmented Generation (RAG) approach, combining LLM prompt engineering with queries to a vector database for contextual relevance and accuracy. The project relies on practices and tools common to modern, scalable applications.

Queen's Housing Full Stack Web Application

Queen's Housing is a full stack web application with a Node.js RESTful API backend and a frontend built with Tailwind CSS. It gives students a place to share and discuss their rental experiences, and is currently deployed on Heroku. The project had technical requirements similar to those of a social media app, so I learned about user authentication with browser sessions, designing a database structure to fit specific requirements, routing, RESTful API development, and frontend UI design. Note: The project is fully functional but contains mock data.

PID controller tuning using the Genetic Algorithm

Using signal processing, control systems, and machine learning, I tuned a PID controller with the Genetic Algorithm on an unknown black-box system. The Genetic Algorithm is a global optimization technique that mimics natural selection to solve minimization problems.

React Frontend for VideoGame API

A React frontend for a video game API. The entire project was built in TypeScript to replicate the needs of a large-scale production environment. The frontend lets users filter and search for games; each search sends a GET request with the relevant parameters to the API to retrieve the matching games. Chakra UI was used to ensure consistent and aesthetically pleasing visuals.

NYC Taxi Fare Prediction

Analyzed NYC taxi trip data and built multiple models to predict trip fare. Created an ensemble model that combines the outputs of multiple predictive models to achieve higher prediction accuracy.

Education

Applied Mathematics & Engineering

Systems and Robotics Minor   ·   2019 - 2024   ·   Kingston, ON

  • Similar to Software Engineering with a focus on advanced mathematics