AI/ML Engineering

Practical AI and machine learning engineering. LLM inference, tokenization, RAG pipelines, model deployment, vector databases, and the infrastructure behind modern AI applications.

54 articles

GLM-5.1 vs Claude Opus 4.6: How Zhipu AI Caught Up on Coding

Zhipu AI's GLM-5.1 beat Claude Opus 4.6 on SWE-Bench Pro at 7x lower API cost. Where the headline holds (batch coding, cost-sensitive loops) and where Opus still wins (subjective quality, agentic tool use, latency).

9 min read

DeepSeek V4 Explained: 1T-Param MoE, Engram Memory, 1M Context

DeepSeek V4's 1T-parameter MoE architecture, the Engram learned-memory layer behind its 1M-token context window, real benchmarks vs Claude Opus 4.7 and GPT-5.4, API pricing, and the honest case for when to pick V4.

9 min read

Vibe Coding in 2026: What Production Teams Actually Do

An honest look at where vibe coding works in production (greenfield prototypes, glue code, refactors), where it fails (payments, auth, hot paths), and the team norms that make it viable.

12 min read

AI Coding Agent Pricing in 2026: Per-Seat vs Per-Task vs Self-Hosted Math

Real annual cost math for Claude Code, Cursor, Copilot, Codex, and self-hosted Aider across a 10-engineer team. The token-volume thresholds that flip the answer.

10 min read

Self-Hosted AI Coding Agents: Aider vs Continue vs OpenHands

Aider for CLI, git-native work; Continue for IDE-native, bring-your-own-model setups; OpenHands for autonomous multi-step tasks. Real SWE-Bench scores with a locally run Qwen 3.5 32B.

10 min read

Run Qwen 3.5 9B on 64GB RAM: Complete Setup Guide

Step-by-step guide to running Qwen 3.5 9B on local hardware. Covers system requirements, optimization techniques, quantization, inference speed, and practical limitations for developers.

11 min read

Cursor Background Agents: Parallel Coding Tasks Explained

Cursor's parallel-agent UI launched in April 2026. It wins on narrow multi-file refactors and misfires on cross-cutting changes. Real workflow patterns and token cost math.

9 min read

AI Coding Assistants Compared: Claude Code vs Cursor vs GitHub Copilot (2026)

Copilot is an IDE extension, Cursor is a forked IDE, Claude Code is a CLI agent. Compare them across writing code, debugging, refactoring, test generation, context windows, pricing, and privacy.

12 min read

Claude Code Subagents and Skills: Building Real Workflows

Layer subagents, skills, hooks, slash commands, and MCP servers into autonomous workflows. Real config patterns for 5-engineer backend teams.

11 min read
