Workbench
Projects worth bookmarking.
A blend of production deployments, community contributions, and joyful experiments. Each card includes the stack, context, and where to dive deeper.
Maya
Dec 2024
As part of the Cohere For AI community, trained a novel vision-language model on a multilingual instruction dataset curated by the team. The paper is published on arXiv.
HuggingFace 🤗 Contributions
2024 – Present
Contributed notebooks and documentation for knowledge distillation in computer vision and PII detection for LLM gateways as part of community initiatives.
Morph Chess
2025
A distributed chess system that runs multiple chess agents on morph cloud instances with real-time visualization.
Topic Auto-label
Nov 2024
Released a pip package that uses LLMs to automatically label text, image, and video data with topics. Supports local LLMs via Ollama and uses pydantic for structured output.
Manifest Climate
2023
Led a University of Waterloo Data Science Club team to build a labeling tool, create custom datasets, and fine-tune DistilBERT for climate disclosures, unlocking 16 new signals and cutting LLM API costs by ~99.9%.
Text2SQL
2023
Fine-tuned a Llama-based model on synthetic data to answer natural language queries about an SQLite database by generating SQL and interpreting the results. Achieved 86% accuracy on held-out tasks.
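For readers curious about the shape of such a pipeline, here is a minimal sketch of the question-to-SQL-to-result loop. It is not the project's code: `generate_sql` is a hypothetical stand-in that returns a canned query where the fine-tuned model would be called, and the `orders` table and its rows are invented for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory SQLite database with a small, made-up table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, total) VALUES (?, ?)",
        [("ada", 20.0), ("ada", 5.5), ("linus", 12.0)],
    )
    conn.commit()
    return conn


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for the model: maps a question to a canned query."""
    canned = {
        "How many orders did ada place?":
            "SELECT COUNT(*) FROM orders WHERE customer = 'ada'",
    }
    return canned[question]


def answer(conn: sqlite3.Connection, question: str):
    """Generate SQL for the question, run it, and return the query and rows."""
    sql = generate_sql(question)
    # Assumption for this sketch: only read-only SELECT statements are executed.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT queries are executed")
    rows = conn.execute(sql).fetchall()
    return sql, rows


if __name__ == "__main__":
    conn = build_demo_db()
    print(answer(conn, "How many orders did ada place?"))
```

In a fuller version, the returned rows would be passed back to the model so it can phrase the final answer in natural language.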
DotaLLM
2024
Trained a YOLO model for enemy detection and used the detections to prompt Cohere’s Command-R+ for movement and combat actions.
Dreambella
2023
A Dreambooth fine-tune of Stable Diffusion on a very important subject: my dog, Bella.
Titanic Challenge in Production
2020
Created synthetic data with simulated drift for an introductory lesson covering TensorFlow Extended, drift monitoring, and CTGAN-based tabular generation.
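As a rough illustration of what "simulated drift" can mean, the sketch below generates a reference sample and a shifted sample of a single numeric feature and measures how far the mean has moved. It uses only the Python standard library and plain Gaussian sampling, not CTGAN or TensorFlow Extended; the feature (passenger age), the parameters, and the function names are assumptions made for this example.

```python
import random
import statistics


def make_ages(n: int, mean: float, stdev: float, seed: int) -> list[float]:
    """Draw n synthetic 'age' values from a Gaussian with a fixed seed."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stdev) for _ in range(n)]


def mean_shift_in_stdevs(reference: list[float], current: list[float]) -> float:
    """Absolute shift of the mean, in units of the reference standard deviation."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.stdev(reference)
    return abs(statistics.fmean(current) - ref_mean) / ref_std


# Training-time data, a fresh batch from the same distribution,
# and a batch whose mean has been deliberately shifted by 10 years.
reference = make_ages(2000, mean=30.0, stdev=10.0, seed=0)
same = make_ages(2000, mean=30.0, stdev=10.0, seed=1)
drifted = make_ages(2000, mean=40.0, stdev=10.0, seed=2)

if __name__ == "__main__":
    print("no drift:", round(mean_shift_in_stdevs(reference, same), 3))
    print("drifted: ", round(mean_shift_in_stdevs(reference, drifted), 3))
```

A monitoring job could flag a batch when this score crosses a chosen threshold; production tools typically compare full distributions rather than only the mean.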