#DevOps

61 articles

Container Image Size: From 1.2 GB to 80 MB (Real Recipes)
Containers

Cut Docker image size by 80-90% with multi-stage builds, distroless, and scratch base images. Copy-paste recipes for Node, Go, Python, Java, Rust, and .NET.

10 min read
KV Cache Quantization: When Q8 Beats FP16 (and When It Doesn't)
AI/ML Engineering

A Q8 KV cache roughly halves cache VRAM relative to FP16 with under 0.1% perplexity cost. A Q4 K-cache is OK, but a Q4 V-cache hurts. Asymmetric Q4-K + Q8-V is the magic combo.

10 min read
RTX 5090 for Local LLMs: 32B Models with Headroom (2026)
AI/ML Engineering

The RTX 5090 unlocks Qwen 3.5 32B at Q5_K_M with 16K context. Native NVFP4 gives a 60-80% inference speedup over the RTX 4090. Real benchmarks and build guide.

12 min read
Self-Hosted LLM Cost: Hardware vs Cloud GPU vs API (2026)
AI/ML Engineering

Below 3M tokens/day, the API wins. From 3M to 30M tokens/day, cloud GPU wins. Above 30M sustained, hardware pays back in 18-24 months. Real 2026 numbers.

12 min read
Qwen 3.5 vs DeepSeek V4 vs GLM-5.1: Local Coding Showdown
AI/ML Engineering

Three frontier open-weight models compared for coding in April 2026. Qwen wins on consumer GPUs, GLM-5.1 leads SWE-Bench Pro, DeepSeek V4 has 1M context.

13 min read
Qwen 3.5 GGUF Quantization: Q4_K_M vs Q5_K_M vs Q8 Guide
AI/ML Engineering

Q5_K_M is the sweet spot for Qwen 3.5 GGUF. Full perplexity table, K-quants vs IQ-quants, NVFP4 on Blackwell, and picks by VRAM tier with framework flags.

16 min read
Bash Scripting Best Practices for DevOps Engineers
Linux

Write reliable Bash scripts with set -euo pipefail, proper quoting, [[ ]] tests, idempotent patterns, cleanup traps, and ShellCheck, and learn when to switch to Python.

10 min read
Microservices vs Monolith: The Decision Framework Engineers Actually Use
Architecture

The microservices vs monolith decision depends on team size, deployment needs, and organizational structure. Learn the five-question decision framework, the Strangler Fig migration pattern, and why most teams should start with a well-structured monolith.

10 min read
Semantic Versioning and Automated Releases with Conventional Commits
CI/CD

Version numbers should encode compatibility, not vibes. Learn semantic versioning, the Conventional Commits spec, commitlint enforcement, and fully automated releases with semantic-release and Release Please.

9 min read
