Deep dive into optimizing LLM inference on CPU hardware using Numba JIT compilation and custom memory management techniques.
Read More →Enes Altıparmak
AI/ML Developer & Student
Passionate about AI, machine learning, and CPU inference optimization. Focused on Turkish language models, low-level performance engineering, and building practical AI tools.
Featured Projects
Sixfinger API
⚡ Ultra-Fast AI Platform - 10-20x faster than popular AI services with 13 powerful models including Meta Llama 3.3 70B, Qwen3 32B, and DeepSeek-R1. Achieves 100-150 tokens/sec throughput through advanced optimizations including model quantization (8-bit/4-bit), intelligent request batching, KV cache optimization, and speculative decoding. Features OpenAI-compatible API, real-time SSE streaming, async processing with FastAPI, and comprehensive model selection from 1B to 70B parameters. Perfect for developers needing fast, cost-effective AI inference without sacrificing quality.
TurboTensors
🚀 High-performance CPU Inference Engine for Large Language Models achieving 15-20x speedup over standard PyTorch CPU implementations. Implements custom Numba-JIT compiled kernels for matrix operations, SIMD vectorization (AVX2/AVX-512), asymmetric quantization with per-channel scaling, and intelligent memory pooling. Specifically optimized for Turkish language models with specialized embedding lookup optimizations. Enables running 3B parameter models at interactive speeds (< 100ms/token) on commodity CPU hardware, making AI accessible without expensive GPU infrastructure.
Genesis
🧬 Evolutionary Code Optimization Engine using genetic algorithms to automatically improve Python function performance. Operates on Abstract Syntax Trees (AST) to perform intelligent mutations including operator replacement, loop restructuring, algorithmic substitutions, and data structure optimizations. Includes multi-dimensional fitness evaluation (execution time, memory usage, code complexity) with mandatory correctness testing. Successfully achieved 1000x speedup on recursive functions through automatic memoization discovery and 50x improvement on data pipelines by discovering NumPy vectorization opportunities.
Areas of Interest
Recent Blog Posts
Exploring how evolutionary algorithms can automatically improve code performance through intelligent mutations and fitness evaluation.
Read More →The unique challenges and solutions in developing NLP tools for the Turkish language, an agglutinative language with complex morphology.
Read More →Get In Touch
Interested in collaboration or have questions? Feel free to reach out!