Engineering and Infrastructure Work
🖥️ Software Engineering
Custom Experimental Software for Research
Designed, built, and deployed custom software for stimulus presentation, replacing an industry-standard tool. The software improves on that tool in two important ways: (1) it is easier to set up across computers and research sites, and (2) it is more flexible, allowing event markers to be integrated directly into the protocol, which reduces post-processing time. It is currently in use at our offices in New York, as well as in Stellenbosch, South Africa, and São Paulo, Brazil.
GitHub Repository 🐙
The Question
How can we bypass the UI constraints and high timing variance of off-the-shelf testing tools to capture millisecond-precise behavioral data without degrading the end-user experience?
What I Did
Architected a scriptable desktop application featuring custom rendering logic to act as a standardized, responsive testing console with automated interaction logging.
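As an illustration of the automated interaction logging described above, here is a minimal sketch in Python. It is not taken from the actual application: the `EventLog` class, its method names, and the marker labels are hypothetical, and it assumes event markers are timestamped against a monotonic clock (`time.perf_counter`) so that latencies are not affected by wall-clock adjustments.

```python
import time


class EventLog:
    """Hypothetical helper: records timestamped event markers for one session."""

    def __init__(self, clock=time.perf_counter):
        # A monotonic, high-resolution clock; injectable so it can be faked in tests.
        self._clock = clock
        self._start = clock()
        self.events = []  # list of (label, milliseconds since session start)

    def mark(self, label):
        """Store a marker and return its time in ms relative to session start."""
        elapsed_ms = (self._clock() - self._start) * 1000.0
        self.events.append((label, elapsed_ms))
        return elapsed_ms

    def latency_ms(self, onset_label, response_label):
        """Time between two markers; if a label repeats, the last one is used."""
        times = dict(self.events)
        return times[response_label] - times[onset_label]
```

In a session script, `log.mark("stimulus_onset")` would be called when the stimulus is drawn and `log.mark("response")` when input arrives. Because the markers are written at presentation time, the latency is available without a separate alignment step in post-processing.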
Impact & Metrics
Phased out the legacy tools across active operations. Delivered an optimized, high-fidelity user experience that raised participant engagement, removed confounding system noise, and achieved millisecond-level timing precision.
💾 Database Engineering
Centralized SQL Data Architecture & Systems Optimization
Engineered a centralized data infrastructure to break down operational data silos across multi-stream research initiatives. By designing and deploying a centralized relational database, this project unified fragmented data environments into a secure, single source of truth for analytics. The optimized schema eliminated manual, error-prone data aggregation workflows. The work draws on data architecture, database administration, and operational systems design, with the aim of increasing research velocity and enabling rapid, data-driven decisions.
The Question
How can we eliminate data-access bottlenecks across disparate, isolated project databases and shorten the analytics lifecycle?
What I Did
Designed, built, and deployed a centralized relational database from scratch, establishing standardized data schemas, cleaning pipelines, and access protocols to ingest multi-source behavioral data.
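To make the idea of a standardized schema concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The actual database engine, table names, and columns are not stated here, so everything in this example (`project`, `participant`, `trial`, `reaction_time_ms`, and both helper functions) is an assumption chosen for illustration.

```python
import sqlite3

# Hypothetical schema: one shared set of tables instead of one database per project.
SCHEMA = """
CREATE TABLE project (
    project_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE participant (
    participant_id INTEGER PRIMARY KEY,
    project_id     INTEGER NOT NULL REFERENCES project(project_id),
    site           TEXT NOT NULL
);
CREATE TABLE trial (
    trial_id         INTEGER PRIMARY KEY,
    participant_id   INTEGER NOT NULL REFERENCES participant(participant_id),
    task             TEXT NOT NULL,
    reaction_time_ms REAL CHECK (reaction_time_ms >= 0)
);
"""


def build_database(path=":memory:"):
    """Create the tables and turn on foreign-key enforcement for this connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def mean_rt_by_project(conn):
    """Cross-project summary that would otherwise need manual aggregation."""
    rows = conn.execute(
        """
        SELECT p.name, AVG(t.reaction_time_ms)
        FROM trial AS t
        JOIN participant AS pa ON pa.participant_id = t.participant_id
        JOIN project AS p ON p.project_id = pa.project_id
        GROUP BY p.name
        ORDER BY p.name
        """
    ).fetchall()
    return dict(rows)
```

With every project stored under the same keys and constraints, a cross-project question becomes a single join rather than a merge of separately formatted files, and the constraints reject malformed rows at insert time.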
Impact & Metrics
Harmonized data streams across 6 distinct projects encompassing 3,000+ participants worldwide. Reduced analytical turnaround time by 75%, cutting processing cycles from 8 weeks to 2 weeks.
⚙️ Data Pipelines (ETL)
Automated ETL Pipeline & Behavioral Data Governance System
Architected and deployed an automated data engineering pipeline in Python to transform unstructured, high-dimensional human-behavior data into analytics-ready, standardized datasets. By implementing programmatic validation checkpoints directly in the data ingestion layer, this pipeline eliminated the latency of retrospective data auditing. The system programmatically flags operational anomalies and experimental protocol drift as they happen, so that data is validated before it reaches downstream analysis. This project exemplifies a strong capability in Research Ops and Data Engineering: building the foundational software layers that deliver clean, dependable behavioral telemetry at scale.
The Question
How can we systematically ingest, clean, and validate complex, erratic human-behavior data at scale while catching procedural errors early enough to salvage data collections?
What I Did
Built an end-to-end algorithmic pipeline that automatically parsed raw data inputs, mapped them to a rigorous schema, and continuously ran real-time exception-handling tests to catch protocol deviations.
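The parse, map-to-schema, and check steps can be sketched roughly as follows. This is not the production pipeline: the field names, the plausible reaction-time range, and the `Flag`, `parse_row`, and `validate` names are all hypothetical, and the sketch assumes raw input arrives as one dictionary of strings per trial.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical schema: field name -> converter applied to the raw string value.
SCHEMA = {
    "participant_id": str,
    "trial": int,
    "reaction_time_ms": float,
}


@dataclass
class Flag:
    row: Optional[int]  # index of the offending row, or None for session-level issues
    reason: str


def parse_row(raw):
    """Map one raw record onto the schema; raises if a field is missing or malformed."""
    return {name: cast(raw[name]) for name, cast in SCHEMA.items()}


def validate(rows, expected_trials, rt_range=(100.0, 5000.0)):
    """Return (clean_records, flags) so problems surface at ingestion time."""
    clean, flags = [], []
    for i, raw in enumerate(rows):
        try:
            record = parse_row(raw)
        except (KeyError, ValueError, TypeError) as exc:
            flags.append(Flag(i, f"schema error: {exc!r}"))
            continue
        if not rt_range[0] <= record["reaction_time_ms"] <= rt_range[1]:
            flags.append(Flag(i, "reaction time out of plausible range"))
            continue
        clean.append(record)
    # Protocol check: every expected trial number should have one valid record.
    seen = {record["trial"] for record in clean}
    missing = sorted(set(range(1, expected_trials + 1)) - seen)
    if missing:
        flags.append(Flag(None, f"missing trials: {missing}"))
    return clean, flags
```

Running `validate` as each session file arrives, rather than at the end of a study, is what allows a deviation to be caught while the collection can still be corrected.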
Impact & Metrics
Delivered validated, production-grade datasets supporting a project ecosystem of 500+ active users. Automated drift detection enabled near-real-time workflow corrections, reducing data-loss risk and protecting data quality during collection.