Le Duc Anh Tuan
Senior AI Engineer
Hi, I’m Charles! Interested in LLM Systems, with experience spanning AI Research (ML, CV, Speech) to Applications (LLMs), and across the stack - from low-level GPU kernel optimization (CUDA, HIP) to high-level LLM system design, training to serving.
Industry:
- Implemented OpenAI’s MoE GPT-OSS from scratch in pure HIP C++ on AMD GPUs; optimized model loading, continuous batching, multi-streaming, multi-GPU communication, MoE scheduling, CPU-GPU-SRAM memory access, FlashAttention, and MFMA GEMM kernels; achieved 30K TPS (20B) and 10K TPS (120B) on a single node with 8× AMD MI250 GPUs; featured on r/LocalLLaMA.
- Architected Leo, an end-to-end LLMOps system integrating offline and online pipelines for data, training, and inference; built offline pipelines for GET data, ETL, SFT, feature generation, and hybrid RAG ingestion enhanced by a contextual agent, with all steps managed by an orchestrator; implemented an agentic RAG-based online system served via API with a React FE and a proxy-connected BE, supporting scalable BE services, replica database nodes, and a vLLM inference engine; incorporated Redis prompt caching, short- and long-term memory modules, CI/CD RAG evaluation triggers (retrieval & generation), and real-time token streaming via SSE.
- Developed a multimodal, multi-agent conversational recommendation system with vision and speech-to-speech interaction; integrated AdaptiveICL, synthetic data generation, and retrieval-ranking pipelines; achieved Top 1 in the DSAI track at Viettel Digital Talent 2024.
Research:
- Researched Speechless (an LLM in the Ichigo family), which aims to generate synthetic semantic audio representations from multimodal inputs and is trained on semantic tokens generated by Ichigo Whisper; the resulting paper was accepted at Interspeech 2025 (CORE Rank A conference in Speech Processing).
- Proposed “Efficient Continual Detection Transformer”, leveraging pseudo-labeling, knowledge distillation, and LoRA; achieved high performance with only 3% trainable parameters compared to RT-DETR on the COCO dataset.
Community:
- Released and maintained open-source projects for developers, including gpt-oss-amd (150 ⭐️), nvims (100 ⭐️), leo, and gemino; contributed to leading projects like Ichigo (2.4k ⭐️) and WhisperSpeech (4.5k ⭐️).
- Organized multiple technical events for the developer community, such as VinAI Day, Google I/O Extended, Google DevFest, International Women’s Day x Flutter Forward Extended, and Google Build with AI; spoke at Google DevFest 2022 on “Detecting Cheating in Examinations” (highlighted on VTV24); certified by Google’s Global Headquarters.
News 📰
| Oct 10, 2025 | Introducing gpt-oss-amd - a pure HIP C++ implementation from scratch (no rocBLAS/hipBLAS) of OpenAI’s MoE GPT-OSS, achieving over 30k TPS (20B) and 10k TPS (120B) on a single node with 8× AMD MI250 GPUs, featured on r/LocalLLaMA. |
|---|---|
| Jun 22, 2025 | Releasing nvims — a lightning-fast, AI-powered editor with a beautiful UI and VSCode vibes in the terminal, featured on J2Team, MLCB, MiAI, and LinkedIn. |
| May 20, 2025 | Our paper, “Speech Instruction Training Without Speech for Low-Resource Languages”, was accepted at Interspeech, a CORE Rank A conference in speech processing. |
| Dec 23, 2024 | Trained a speech tokenizer to support multiple Asian languages, achieving SOTA results on viVoice and LibriTTS-R |
| Oct 28, 2024 | Joining Homebrew (Singapore) as an LLM Researcher |
| Oct 10, 2024 | Top 1 in the DSAI track of Viettel Digital Talent 2024 |
| Sep 28, 2024 | Graduated from Hanoi University of Science and Technology (HUST) |
| Jul 26, 2024 | Onboarding as a Data Scientist at the Data Analytics Center, Viettel Telecom |
| Jul 6, 2024 | Thrilled to be ranked among the top candidates in the DSAI track and in the overall Top 50 within Viettel Group at VDT 2024 |
| Jun 21, 2024 | Delighted to announce that my graduation thesis, ECOD, has achieved SOTA results on the COCO dataset with few trainable parameters |
| Jun 20, 2024 | Honored to receive the Best Presentation Award in Round 1 of the Viettel Digital Talent program |
| Apr 4, 2024 | Rejoining Viettel Group as a Digital Talent 2024 |
| Feb 17, 2024 | Honored to receive the physical GDG organizer certificate from Google’s Global Headquarters |
| Nov 11, 2023 | Honored to be an Ambassador for MT Leader Nestlé |
| Oct 3, 2023 | Thrilled to share that my SOTA “3DNeRV” paper has advanced to phase 2 in AAAI 2024 |
| Oct 1, 2023 | Proud to be an AI Mentor at the SheCodes Hackathon 2023 |
| Sep 29, 2023 | Embarking on an exciting journey as an Applied Scientist with VinBrain |
| Mar 3, 2023 | Proudly stepping into the role of Computer Vision Researcher at the Camera Center, Viettel High Tech |