Senior Data Engineer | Healthcare AI | Trustworthy Data Systems
I design and build reliable data and AI systems for healthcare, research, and other high-stakes domains where trust matters.
Professionally, I’m a Senior Data Engineer with deep experience in data ingestion, analytics infrastructure, orchestration, and platform reliability. Independently, I’m building projects at the boundary of healthcare data engineering, applied AI, privacy-preserving systems, and human-centered tooling.
The common thread in my work is simple: complex systems should be understandable, reproducible, and useful to the people who depend on them.
I care about systems that:
- work reliably at scale,
- protect sensitive data,
- make uncertainty visible,
- and remain understandable to both technical and non-technical users.
I’m currently building independent projects around:
- Trustworthy AI for healthcare and public health
- Privacy-preserving data engineering
- Edge-ready and local-first AI systems
- Clinical and biomedical data workflows
- Explainable, reproducible ML pipelines
- Agentic tooling for data platforms and research workflows
I’m especially interested in the practical middle ground between production engineering and applied research: turning messy data, fragile workflows, and vague questions into systems that can be tested, explained, and improved.
A long-term thread across this work is Careful Intelligence, my personal effort to explore trustworthy, privacy-aware AI systems that are useful in the real world, not just impressive in demos.
Languages: Python, SQL, Bash, Java, TypeScript, JavaScript (comfortable learning new languages when the system demands it)
Data Engineering & Orchestration: Dagster, Apache Spark, dbt, Pandas, Hadoop ecosystem
Cloud & Storage: Google Cloud Platform, GCS, BigQuery, Datastore, Azure SQL, SQL Server, Teradata, Informix
Analytics, Visualization & Geospatial: Tableau, Vega-Lite, Matplotlib, GeoPandas, OpenRefine
AI / ML Areas I Work Around: LLM evaluation, model calibration, quantization, prompt stability, de-identification, NER, reproducible experiment workflows
Practices I Care About: Idempotent pipelines · data validation · observability · schema evolution · reproducibility · de-identification · operational clarity · human review loops
Cityblock Health | Feb 2024 – Present
- Design and evolve multi-source, multi-format ingestion pipelines for HIPAA-protected healthcare data
- Lead and support migration toward Dagster-based orchestration, improving reliability, extensibility, and developer velocity
- Build validation, profiling, quarantine, and review workflows that balance automation with human oversight
- Collaborate across analytics, clinical, implementation, and platform teams to make complex data systems more usable and trustworthy
- Work on operational visibility for ingestion workflows, including review paths for unknown, anomalous, or unconfigured inbound files
Cityblock Health | Feb 2022 – Feb 2024
- Built robust, idempotent ETL pipelines supporting analytics, operations, and clinical workflows
- Helped improve data quality, reliability, and maintainability across shared data infrastructure
- Supported ingestion and transformation patterns for healthcare data from multiple external sources
Walmart Global Tech | Dec 2019 – Feb 2022
- Developed large-scale analytics systems for pharmacy operations
- Worked with big-data platforms and production streaming/batch pipelines
- Built data-driven applications and dashboards supporting operational and analytical decision-making
Independent continuation of graduate research exploring how quantization affects biomedical language model reliability. Focus areas include calibration error, prompt stability, macro-F1, and reproducible evaluation workflows for resource-constrained AI deployment.
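
To make "calibration error" concrete, here is a minimal sketch of expected calibration error (ECE) using equal-width confidence bins. It is an illustration only, not code from the project: the function name, the binning scheme, and the default of 10 bins are assumptions chosen for the example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin.

    confidences: predicted probabilities in [0, 1] for the chosen label.
    correct:     booleans, True where the prediction matched the gold label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece
```

Comparing this value for a full-precision model and its quantized counterpart on the same evaluation set is one simple way to see whether compression made the model's confidence less trustworthy.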
Spark-based ingestion framework using NER-driven masking and de-identification for secure handling of clinical text and structured healthcare data.
Dagster-orchestrated ingestion system with automated validation, quarantine paths, profiling, and review workflows for unconfigured or anomalous data.
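
The validate-then-quarantine idea can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the project's code and not Dagster's API: `KNOWN_FEEDS`, `RoutingDecision`, and `route_file` are hypothetical names, and the required-column check stands in for whatever validation a real feed would need.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingDecision:
    destination: str                      # "load" or "quarantine"
    reasons: list = field(default_factory=list)


# Hypothetical configuration: required columns for each known feed.
KNOWN_FEEDS = {
    "claims": {"member_id", "claim_id", "service_date"},
}


def route_file(feed_name, rows):
    """Decide whether a parsed file can be loaded or needs human review."""
    reasons = []
    required = KNOWN_FEEDS.get(feed_name)
    if required is None:
        # Unconfigured inbound file: never guess, send it to review.
        reasons.append(f"unconfigured feed: {feed_name}")
    elif not rows:
        reasons.append("file has no rows")
    else:
        for i, row in enumerate(rows):
            missing = sorted(c for c in required if row.get(c) in (None, ""))
            if missing:
                reasons.append(f"row {i}: missing {', '.join(missing)}")
    return RoutingDecision("quarantine" if reasons else "load", reasons)
```

Keeping the reasons alongside the decision gives a reviewer something specific to act on instead of a bare failure.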
Independent research-oriented work exploring how healthcare data, social determinants of health, environmental signals, and explainable AI can be combined into practical, reproducible analysis workflows.
Interactive dashboards and data applications supporting pharmacy stakeholders with operational and analytical decision-making.
Python CLI for tracking what is actually new in UAP/UFO-related releases and coverage. It pulls from official and news sources, applies rule-based source trust logic, uses LLMs for summaries and novelty scoring, and caches results locally in SQLite to reduce repeat API spend.
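
The local caching idea can be sketched as follows. This is an assumption-laden illustration rather than the tool's actual code: the table layout, the `open_cache` and `cached_summary` names, and the `summarize` callable (a stand-in for any paid LLM call) are all hypothetical.

```python
import hashlib
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a SQLite cache keyed by a hash of the source text."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS summaries ("
        "key TEXT PRIMARY KEY, summary TEXT NOT NULL)"
    )
    return conn


def cached_summary(conn, text, summarize):
    """Return a cached summary if present; otherwise call summarize once and store it."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT summary FROM summaries WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]                     # cache hit: no API call
    summary = summarize(text)             # cache miss: the only paid call
    conn.execute(
        "INSERT OR REPLACE INTO summaries (key, summary) VALUES (?, ?)",
        (key, summary),
    )
    conn.commit()
    return summary
```

Hashing the text means an article that reappears unchanged across sources or runs is summarized only once.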
Local-first vector stroke recorder for scientific drawings. Designed to record stylus strokes as structured vector data rather than pixels, then replay/export high-resolution transparent timelapses for use in video editing workflows. Currently in early capture-core development.
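
As a rough illustration of "strokes as structured vector data", the sketch below stores timestamped points and selects what should be visible at a given replay time. The project is still early, so this is not its data model: `StrokePoint`, `Stroke`, `visible_at`, and the millisecond timestamps are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StrokePoint:
    x: float
    y: float
    t_ms: int                 # capture time in milliseconds since recording start
    pressure: float = 1.0     # stylus pressure, if the device reports it


@dataclass
class Stroke:
    points: list = field(default_factory=list)


def visible_at(strokes, t_ms):
    """Return, per stroke, the points captured at or before t_ms.

    Strokes that have not started yet are omitted, so rendering the result
    for increasing t_ms values replays the drawing in order.
    """
    frames = []
    for stroke in strokes:
        pts = [p for p in stroke.points if p.t_ms <= t_ms]
        if pts:
            frames.append(pts)
    return frames
```

Because the points are resolution-independent, the same recording can be rendered at any output size when exporting a timelapse.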
M.S. in Computer Science, Old Dominion University
Research and project interests:
- trustworthy AI infrastructure
- healthcare data systems
- privacy-preserving ML
- edge / local-first AI
- public health analytics
- biomedical LLM evaluation
- reproducible applied research workflows
I’m no longer focused on coursework. My attention now is on building a body of independent work: practical systems, research-informed prototypes, and technical writing that connect my healthcare data engineering background with the next generation of trustworthy AI tools.
- Local-first and edge-ready AI deployment
- Model compression, quantization, calibration, and inference efficiency
- Agentic workflows for data engineering and research
- Public health analytics using explainable and reproducible methods
- Hardware tinkering: Arduino, sensors, instrumentation, and small systems
- Algorithms, systems fundamentals, and infrastructure design
- Security and privacy as design constraints, not afterthoughts
- Former Persian Linguist, U.S. Army 🇺🇸
- Lifelong reader; happiest in libraries and used bookstores 📚
- Drums, guitar, piano
- Pinball, live music, comedy, and learning things the hard way
If you’re interested in:
- healthcare data platforms,
- trustworthy AI,
- privacy-preserving systems,
- biomedical or public health data workflows,
- or building tools that people can actually understand and depend on,
feel free to explore my repositories or reach out.




