AI/ML Engineering

Practical AI and machine learning engineering. LLM inference, tokenization, RAG pipelines, model deployment, vector databases, and the infrastructure behind modern AI applications.

54 articles

GLM-5.1 vs Claude Opus 4.6: How Zhipu AI Caught Up on Coding

Zhipu AI's GLM-5.1 beat Claude Opus 4.6 on SWE-Bench Pro at 7x lower API cost. Where the headline holds (batch coding, cost-sensitive loops) and where Opus still wins (subjective quality, agentic tool use, latency).

9 min read·Mar 3, 2026

DeepSeek V4 Explained: 1T-Param MoE, Engram Memory, 1M Context

DeepSeek V4's 1T-parameter MoE architecture, the Engram learned-memory layer behind its 1M-token context window, real benchmarks vs Claude Opus 4.7 and GPT-5.4, API pricing, and the honest case for when to pick V4.

9 min read·Feb 28, 2026

Vibe Coding in 2026: What Production Teams Actually Do

An honest look at where vibe coding works in production (greenfield prototypes, glue code, refactors), where it fails (payments, auth, hot paths), and the team norms that make it viable.

12 min read·Feb 25, 2026

AI Coding Agent Pricing in 2026: Per-Seat vs Per-Task vs Self-Hosted Math

Real annual cost math for Claude Code, Cursor, Copilot, Codex, and self-hosted Aider across a 10-engineer team. The token-volume thresholds that flip the answer.

10 min read·Feb 22, 2026

Self-Hosted AI Coding Agents: Aider vs Continue vs OpenHands

Aider for CLI / git-native, Continue for IDE-native BYO-model, OpenHands for autonomous multi-step tasks. Real SWE-Bench scores with Qwen 3.5 32B local.

10 min read·Feb 19, 2026

Run Qwen 3.5 9B on 64GB RAM: Complete Setup Guide

Step-by-step guide to running Qwen 3.5 9B on local hardware. Covers system requirements, optimization techniques, quantization, inference speed, and practical limitations for developers.

11 min read·Feb 17, 2026

Cursor Background Agents: Parallel Coding Tasks Explained

Cursor parallel-agent UI launched April 2026. Wins on narrow multi-file refactors; misfires on cross-cutting changes. Real workflow patterns and token cost math.

9 min read·Feb 16, 2026

AI Coding Assistants Compared: Claude Code vs Cursor vs GitHub Copilot (2026)

Copilot is an IDE extension, Cursor is a forked IDE, Claude Code is a CLI agent. Compare them across writing code, debugging, refactoring, test generation, context windows, pricing, and privacy.

12 min read·Feb 13, 2026

Claude Code Subagents and Skills: Building Real Workflows

Layer subagents, skills, hooks, slash commands, and MCP servers into autonomous workflows. Real config patterns for 5-engineer backend teams.

11 min read·Feb 13, 2026
