#DevOps

61 articles

Container Image Size: From 1.2 GB to 80 MB (Real Recipes)

Cut Docker image size by 80-90% with multi-stage builds, distroless, and scratch base images. Copy-paste recipes for Node, Go, Python, Java, Rust, and .NET.

10 min read·Jan 20, 2026

AI/ML Engineering

KV Cache Quantization: When Q8 Beats FP16 (and When It Doesn't)

Q8 KV cache halves VRAM with under 0.1% perplexity cost. Q4 K-cache is OK, Q4 V-cache hurts. Asymmetric Q4-K + Q8-V is the magic combo.

10 min read·Jan 11, 2026

AI/ML Engineering

RTX 5090 for Local LLMs: 32B Models with Headroom (2026)

RTX 5090 unlocks Qwen 3.5 32B at Q5_K_M with 16K context. Native NVFP4 support gives a 60-80% inference speedup over the RTX 4090. Real benchmarks and build guide.

12 min read·Jan 8, 2026

AI/ML Engineering

Self-Hosted LLM Cost: Hardware vs Cloud GPU vs API (2026)

Below 3M tokens/day, the API wins. From 3M to 30M tokens/day, cloud GPU wins. Above 30M tokens/day sustained, hardware pays back in 18-24 months. Real 2026 numbers.

12 min read·Jan 5, 2026

AI/ML Engineering

Qwen 3.5 vs DeepSeek V4 vs GLM-5.1: Local Coding Showdown

Three frontier open-weight models compared for coding in April 2026. Qwen wins on consumer GPUs, GLM-5.1 leads SWE-Bench Pro, DeepSeek V4 has 1M context.

13 min read·Jan 2, 2026

AI/ML Engineering

Qwen 3.5 GGUF Quantization: Q4_K_M vs Q5_K_M vs Q8 Guide

Q5_K_M is the sweet spot for Qwen 3.5 GGUF. Full perplexity table, K-quants vs IQ-quants, NVFP4 on Blackwell, and picks by VRAM tier with framework flags.

16 min read·Dec 30, 2025

Linux

Bash Scripting Best Practices for DevOps Engineers

Write reliable bash scripts with set -euo pipefail, proper quoting, [[ ]] tests, idempotent patterns, cleanup traps, ShellCheck, and knowing when to switch to Python.

10 min read·Dec 23, 2025

Architecture

Microservices vs Monolith: The Decision Framework Engineers Actually Use

The microservices vs monolith decision depends on team size, deployment needs, and organizational structure. Learn the five-question decision framework, the Strangler Fig migration pattern, and why most teams should start with a well-structured monolith.

10 min read·Nov 5, 2025

CI/CD

Semantic Versioning and Automated Releases with Conventional Commits

Version numbers should encode compatibility, not vibes. Learn semantic versioning, the Conventional Commits spec, commitlint enforcement, and fully automated releases with semantic-release and Release Please.

9 min read·Oct 30, 2025

