sagarsahu27/README.md
> whoami

class SagarSahu:
    role      = "Senior Software Engineer"
    exp       = "10+ years shipping production systems"
    focus     = ["Heterogeneous Computing", "Edge AI", "Systems Programming", "Cloud & Security"]
    languages = ["Python", "C++", "C", "Rust", "Shell", "TypeScript"]
    currently = "Programming AMD GPUs & NPUs for accelerated AI inference"

I build software that runs close to the metal — from GPU kernels and NPU inference pipelines to cloud-scale security agents deployed on millions of endpoints. I care about performance, correctness, and writing code that other engineers enjoy reading.


🛠️ Tech Stack

Languages

Python, C++, C, Rust, Shell, TypeScript

AI / ML / Accelerated Computing

PyTorch, ONNX, OpenCL, AMD, Databricks

Cloud & Infrastructure

Azure, Linux, Docker, GitHub Actions, Git


📌 Featured Projects

| Project | What it does | Tech |
| --- | --- | --- |
| gemem-AMD | 9-session tutorial series on heterogeneous computing on AMD GPU (RDNA 3.5) + NPU (XDNA 2), from OpenCL kernels to INT8 quantization | Python, OpenCL, ONNX Runtime, PyTorch |
| YoutubeSummarizer | AI-powered YouTube video summarizer with periodic updates | Python, AI/ML |
| sample-app-aoai-chatGPT | Web chat interface powered by Azure OpenAI ChatGPT | TypeScript, Azure OpenAI |
| distributed_computing | Distributed systems fundamentals: consensus, replication, fault tolerance | C, MPI |

🧠 What I'm Working On

  • 🔥 Heterogeneous Computing — Writing GPU/NPU kernels for AMD Ryzen AI (RDNA 3.5 + XDNA 2)
  • 🧮 Performance Engineering — Amdahl's Law → Roofline Model → Kernel Fusion → real benchmarks
  • 🤖 Edge AI — INT8 quantization, ONNX Runtime execution providers, on-device inference
  • ☁️ Cloud + AI — Azure OpenAI, Databricks, building production ML pipelines

🤝 Connect

LinkedIn GitHub Email



"Make it work, make it right, make it fast." — Kent Beck

Pinned

  1. Coursera_CNN (public archive)

    Assignments submitted for the Convolutional Neural Networks course by deeplearning.ai.

    Jupyter Notebook

  2. deep-learning (public archive)

    Forked from udacity/deep-learning

    Repo for the Deep Learning Nanodegree Foundations program.

    Jupyter Notebook

  3. GAEAssignmentApp (public archive)

    A Google App Engine sample app written for my M.Tech cloud computing assignment.

    JavaScript

  4. grpc_example (public archive)

    Self-learning project for gRPC.

    Go

  5. network_programming (public archive)

    Coursework for the network programming course, BITS Pilani M.Tech, 2019.

    C

  6. protobuf_project (public archive)

    A self-learning project for understanding protobuf. Its examples are based on the tutorials in Complete Introduction to Protocol Buffers 3.

    Go